
The Pliant web engine is implemented as a proxy between a web browser and an instance of a headless UI client started on the server.
web browser --> http proxy --> UI server
  client         server-A      server-B
In more detail, all the application business logic runs on the UI server and is delivered to the outside world through a data protocol called PML, a structured binary format similar in concept to XML but more efficient for network transmission. The transmitted data contains the content of the user interface widgets to be rendered on the client.
The UI client is the intended receiver of the PML data stream. It is graphical terminal software installed on the end user's computer. It knows how to draw buttons and drop-down boxes, render text, and notify the UI server about events such as mouse clicks, keyboard typing, and drop-down selection changes.
The UI HTTP Proxy acts both as a web server and as a UI client. It runs on a server that may not be the same as the one running the UI service. Its job is to convert the PML protocol into an HTML/AJAX/DOM framework.
web browser <---> http proxy <---> UI server
            ajax             pml
This website is driven entirely by the http proxy. See demos.
Web crawlers (also known as robots) can put a significant load on a website, so it is important to separate them from human users. Human users are detected by having JavaScript store a special code in their cookie; crawlers normally do not run JavaScript, so they never receive it.
More specifically, there are three types of users: web crawlers, browser users, and active users. Active users are the ones who have started using the widgets, clicking buttons and filling in forms. Active users get the freshest content, while the rest can be served from cache.
The values of the special codes are controlled by the "browser_key" and "active_key" parameters, which are fixed random strings that should be customized per installation.
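The three user types can be illustrated with a short Python sketch. It is not the proxy's actual code: the function name is hypothetical, the cookie name SID is taken from the sample Apache configuration on this page, and the exact-match comparison is a simplification of whatever matching the proxy really performs.

```python
from http.cookies import SimpleCookie


def classify_user(cookie_header: str, browser_key: str, active_key: str) -> str:
    """Return 'active', 'browser', or 'crawler' for a request's Cookie header.

    Assumption: the special code lives in a cookie named SID and is compared
    for exact equality with the configured keys.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    sid = cookie["SID"].value if "SID" in cookie else ""
    if sid == active_key:
        return "active"
    if sid == browser_key:
        return "browser"
    # No recognised code: JavaScript never stored one, so treat as a robot.
    return "crawler"
```

For example, a request carrying "SID=" followed by the browser key would be classified as a browser user, while a request without any cookie would be classified as a crawler.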
Robots and non-active browser users are served from cache. There are two cache areas, one for web-browser users (level 1) and one for robots (level 2). Updates to level 1 cache are automatically applied to level 2 cache.
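The relationship between the two cache areas can be modelled as follows. This is only an in-memory sketch with hypothetical names; the real cache lives on disk, and the way the proxy stores pages is not described here.

```python
class TwoLevelCache:
    """Toy model: level 1 serves browser users, level 2 serves robots."""

    def __init__(self):
        self.level1 = {}  # pages for non-active browser users
        self.level2 = {}  # pages for robots

    def update_level1(self, url, html):
        # An update to level 1 is automatically applied to level 2 as well.
        self.level1[url] = html
        self.level2[url] = html

    def update_level2(self, url, html):
        # An update to level 2 leaves level 1 untouched.
        self.level2[url] = html

    def get(self, url, user_type):
        """Return the cached page, or None when nothing is cached."""
        if user_type == "browser":
            return self.level1.get(url)
        if user_type == "crawler":
            return self.level2.get(url)
        return None  # active sessions are never served from cache
```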
The cache has no notion of expiry, but you can force cache updates by deleting the cached files externally, or by requesting pages with wget and adding ?refresh or ?browser_refresh to the URLs. Adding ?active to a URL forces an active session, which is never cached.
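A refresh script needs to build such URLs. The helper below is a minimal sketch with a hypothetical name; it places the flag at the start of the query string, which is an assumption based on the sample Apache rules testing for a query string that begins with "active".

```python
from urllib.parse import urlsplit, urlunsplit


def with_flag(url: str, flag: str) -> str:
    """Return url with a bare flag such as 'refresh' leading the query string."""
    parts = urlsplit(url)
    # Keep any existing parameters after the flag.
    query = f"{flag}&{parts.query}" if parts.query else flag
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))
```

The resulting URL can then be fetched with wget or any HTTP client to trigger the update.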
Several proxy instances can run side by side to improve performance and stability. All of them can share the same active_key, browser_key, and disk cache areas.
Please note that once a user starts an active session, all of that user's requests should go through the same proxy.
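One way to keep an active user on the same proxy is to derive the proxy from a stable per-user value. The sketch below hashes a session identifier; this technique, the function name, and the idea that a session identifier is available to the balancer are all assumptions, not a description of how Pliant does it.

```python
import hashlib
from typing import Sequence


def pick_proxy(session_id: str, proxies: Sequence[str]) -> str:
    """Map a session id to one proxy, always the same one for the same id."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer index.
    index = int.from_bytes(digest[:8], "big") % len(proxies)
    return proxies[index]
```

The mapping stays stable only while the list of proxies is unchanged; adding or removing a proxy would move some sessions.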
Here is a sample Apache configuration. This block of mod_rewrite code is the minimal setup that can work for most Pliant-driven websites:
# Default target: the main http proxy, looked up in the pliant_config map.
RewriteRule .* - [E=proxy:http://${pliant_config:proxy}]
RewriteRule .* - [E=browser_key:http://${pliant_config:browser_key}]
RewriteRule .* - [E=active_key:http://${pliant_config:active_key}]
# No browser or active code in the SID cookie, no active code in the query
# string or in a /_/ path, and no ?active flag: switch to the cache proxy.
RewriteCond %{env:browser_key},,,%{HTTP_COOKIE} !(.+?),,,.*SID=\1 [NC]
RewriteCond %{env:active_key},,,%{HTTP_COOKIE} !(.+?),,,.*SID=\1 [NC]
RewriteCond %{env:active_key},%{QUERY_STRING} !(.+),sid=\1
RewriteCond %{env:active_key},,,%{REQUEST_URI} !(.+),,,/_/.*\1
RewriteCond %{QUERY_STRING} !^active
RewriteRule .* - [E=proxy:http://${pliant_config:cache_proxy|%{env:proxy}}]
# A query string starting with "inactive" also selects the cache proxy.
RewriteCond %{QUERY_STRING} ^inactive
RewriteRule .* - [E=proxy:http://${pliant_config:cache_proxy|%{env:proxy}}]
# Requests under /_/ are proxied to whichever proxy was selected above.
RewriteCond %{REQUEST_URI} ^/_/
RewriteRule ^(.*)$ %{env:proxy}$1 [P,L,QSA]
Note that "pliant_config" is an Apache rewrite map that can be either a static file or a dynamic program. In the dynamic case, it can return different values for the purpose of load balancing (see the Apache documentation).
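A dynamic map is an external program that Apache starts once and then feeds lookup keys, one per line, on standard input; for each key it must print one line with the value, or the string NULL when there is no match. The sketch below shows that loop in Python. The settings dictionary and its address are placeholders, and a real load-balancing map would compute its answers instead of reading them from a fixed table.

```python
import sys

# Placeholder values: a real installation would supply its own settings.
SETTINGS = {
    "proxy": "127.0.0.1:8080",
}


def lookup(key: str, settings: dict) -> str:
    """Return the value for key, or 'NULL' so Apache uses the default."""
    return settings.get(key, "NULL")


def serve(stdin=sys.stdin, stdout=sys.stdout, settings=SETTINGS) -> None:
    """Answer one lookup per input line until Apache closes the pipe."""
    for line in stdin:
        stdout.write(lookup(line.rstrip("\n"), settings) + "\n")
        # Apache waits for each answer, so output must not sit in a buffer.
        stdout.flush()
```

Calling serve() with no arguments at the end of the script turns it into a map program that Apache can run.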
Once the rules above have run, "%{env:proxy}" holds the URL of the http proxy that must serve the request. Here is the setup for pliantcode.com, which uses a rewrite map to match all URLs that should be forwarded to the proxy:
RewriteMap ui_path txt:/pliant/sites/pliantcode.com/etc/ui_path.txt
RewriteCond ${ui_path:%{REQUEST_URI}|undef} !^undef$
RewriteRule ^(.*)$ %{env:proxy}/sites/pliantcode.com/${ui_path:$1} [P,L,QSA]
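The effect of these three lines can be mimicked in Python to make the lookup logic explicit. A text map holds one "key value" pair per line, with lines starting with # treated as comments. The function names are hypothetical and the actual contents of ui_path.txt are not shown on this page, so any entries used with this sketch are made up.

```python
def parse_txt_map(text: str) -> dict:
    """Parse the body of a text rewrite map into a dictionary."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2:
            # Keep the first entry when a key is listed more than once.
            mapping.setdefault(fields[0], fields[1])
    return mapping


def forward_target(mapping: dict, uri: str, site_root: str = "/sites/pliantcode.com/"):
    """Return the path to request from the proxy, or None if uri is not mapped."""
    value = mapping.get(uri, "undef")  # mirrors ${ui_path:...|undef}
    if value == "undef":
        return None  # the RewriteCond fails and the request is not forwarded
    return site_root + value
```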
The recommended minimal setup for high performance uses Apache to direct users between several http proxies.
There are three http proxies in total. Two are used for web crawlers, and the third for web-browser and active users. All proxies share the same cache root, for example /var/pub, and all of them use the same active_key and browser_key.
Every 20 minutes or so, a cron job restarts one of the web-crawler proxies, but only after all crawler requests have been redirected to the other proxy. This redirection is handled by the pliant_config rewrite map.
The rewrite map and the cron job that kills the proxy are synchronized through a configuration file that lists the currently active crawler proxy.
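The switch-over step can be sketched as follows. The function name, the one-line file format, and the atomic rename are assumptions made for illustration; the sample scripts mentioned on this page may work differently.

```python
import os
import tempfile


def rotate_crawler_proxy(state_file: str, proxies: tuple) -> str:
    """Point state_file at the other crawler proxy.

    Returns the proxy that was active before the switch, which is the one
    the cron job may restart once its pending requests have finished.
    """
    with open(state_file, encoding="utf-8") as fh:
        current = fh.read().strip()
    standby = proxies[1] if current == proxies[0] else proxies[0]
    # Write to a temporary file and rename it, so the rewrite map never
    # reads a half-written configuration file.
    directory = os.path.dirname(os.path.abspath(state_file))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(standby + "\n")
    os.replace(tmp_path, state_file)
    return current
```

The sketch only flips the configuration file; waiting for in-flight crawler requests and restarting the idle proxy are left to the calling script.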
The restart is needed because the http proxy code is still buggy under a heavy load of bots.
Sample scripts facilitating this setup are provided in the archive area.
The engine detects robots and serves them pre-cached versions of pages.
Sessions are opened not on page load, but on the first AJAX request. New users are served pre-cached content until they start actively using widgets.
UI widgets such as buttons and input fields work without a wrapper <form> tag. This means they do not clash with third-party forms you may want to embed in your page.
Copyright © 2008 Pliant Software Solutions | All Rights Reserved
Website powered by Pliant. Contact: