(Repost plus Rasmus feedback -- as the original got chucked into the bin -- ezmlm issues)

Hi,

I have a particular interest in the performance optimisation of the infrastructure stacks used in shared hosting services, and this topic is raised regularly in various application forums as well as on lists such as this. The latest was the thread "APC and CGI" [1], and I've CCed Marten and Rasmus who had this discussion. The nub of it was the points raised (i) by Marten: there is a genuine need for performance acceleration of the LAMP stacks used in shared hosting offerings; and (ii) by Rasmus: there are technical difficulties in doing this, so the work involved would be non-trivial, and "if you are using a fork-per-request CGI model you are obviously not that concerned about performance".

Can I offer an alternative perspective: developers and implementers use shared hosting offerings for a number of reasons, and perhaps the two major ones are that (i) most such implementers simply don't have the administration skills to build and administer a dedicated/VM host running a LAMP stack, and (ii) yes, shared services are cheaper. However, the main discriminant in the choice between shared and dedicated hosting architectures is request *volume*, not responsiveness. I have yet to meet an admin or implementer who isn't concerned about the responsiveness of their application or service. Also, despite the fall in the relative price of VM offerings, the number of shared hosting accounts offered by hosting providers still exceeds the number of VM and dedicated accounts by *more than an order of magnitude*, and such providers now routinely offer "one-click" installation of heavyweight applications such as WordPress, MediaWiki and phpBB.

Such applications usually have poor responsiveness on a typical shared service. The main reason for this is NOT the php-cgi image activation time (<100 mSec on a current server) [2], but the infrastructure architecture adopted by most hosts to achieve scaling. Scaling is normally achieved by using a farm of servers with separate dedicated tiers providing web services, back-end D/Bs, and user storage. Modern servers arranged in such a farm architecture can deliver ample MIPS to support such shared hosting solutions, so processing delays aren't usually a major factor in responsiveness.

The main responsiveness killer is I/O delay. A single web request to one of the MW-class applications can require loading roughly 100 PHP modules and compiling perhaps 500K lines of source. The roughly 1 CPU second needed to compile such a script set is far from trivial, but the issue driving script response is normally the I/O delay associated with the RPCs to the back-end NAS infrastructure needed to read the source files. (The NAS filesystems are usually NFS-mounted with a short acregmin, say 15s, so the in-server VFS attribute cache is usually flushed between requests, and this I/O can therefore generate ~300 off-server RPCs as well as cache-miss physical I/O within the NAS itself.)
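To put rough numbers on the above (the ~3 RPCs per cold file and the 1-5 mSec NAS round-trip are my illustrative assumptions; the ~100 modules and ~300 RPCs are the figures quoted above):

```python
# Back-of-envelope estimate of the serialised NFS round-trip cost per
# request. Assumed: ~100 PHP source files per request, ~3 RPCs per cold
# file (LOOKUP, GETATTR, READ) and a 1-5 ms round-trip to the NAS tier.
modules = 100
rpcs_per_file = 3
total_rpcs = modules * rpcs_per_file        # ~300, as quoted above

delays = {rtt_ms: total_rpcs * rtt_ms / 1000.0 for rtt_ms in (1.0, 5.0)}
for rtt_ms, delay_s in delays.items():
    print(f"RTT {rtt_ms} ms -> ~{delay_s:.1f} s of I/O wait per request")
```

Even at the optimistic end of that range, the I/O wait alone is comparable to the entire CPU cost of the request.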

This I/O and compilation delay is largely avoidable by using a per-script, cdb-style, file-based opcode cache. This would in effect replace the compilation overhead and the assembly/input of the ~100 PHP modules with a largely serial read of a single (compressed) opcode file, and as NFSv4 effectively does bulk read-ahead for serial access to files, the I/O overheads can be reduced by well over an order of magnitude.
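The mechanism is language-neutral, so here is an illustrative analogy in Python (using its marshal-able code objects as a stand-in for Zend opcode arrays; all names and the file layout are invented for the sketch, not the proposed design):

```python
# Analogy of a per-script single-file opcode cache: compile each source
# file once, marshal all the code objects into one cache file, and on
# later runs replace ~100 per-module stats/reads/compiles with a single
# serial read. The proposed extension would do the equivalent in C for
# PHP opcodes.
import marshal, os, tempfile

def build_cache(sources, cache_path):
    blob = {name: marshal.dumps(compile(src, name, "exec"))
            for name, src in sources.items()}
    with open(cache_path, "wb") as f:
        marshal.dump(blob, f)               # one serial write

def load_cache(cache_path):
    with open(cache_path, "rb") as f:       # one serial read, no per-module stat
        blob = marshal.load(f)
    return {name: marshal.loads(code) for name, code in blob.items()}

cache = os.path.join(tempfile.gettempdir(), "lpc-demo.cache")
build_cache({"common.inc": "x = 1", "index.main": "y = x + 1"}, cache)

ns = {}
for code in load_cache(cache).values():     # modules execute in stored order
    exec(code, ns)
print(ns["y"])                              # -> 2
```

The point of the single-file layout is that the read is sequential, so the NFS client's read-ahead does the work that would otherwise be hundreds of synchronous round trips.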

I've demonstrated that this can be achieved at the application level with phpBB by processing script hierarchies to marshal them into condensed glob sets [3], and even this gives a ~3x speed-up, *but* such processing is application-specific and deeply unpopular with the application maintainers, whose usual answer is "use a dedicated LAMP VM and APC or XCache". This really needs to be supported at the PHP extension level. So what I am proposing is:

1) We develop an APC-lite extension offering code caching (but not variable caching) in a manner that is transparent to the applications sitting over it.

2) The main use case is to deliver performance acceleration for complex applications such as MediaWiki, WordPress and phpBB running in a shared hosting environment; it is targeted specifically at php-cgi/cli environments.

3) Any cache strategy should be thread- and process-safe, UID-specific and normally SCRIPT_NAME-specific.

4) The file caches are piece-wise constant and are extended when necessary using an incremental approach that overwrites the existing cache with the extended version. No LRU or variant cache-pruning schemes are supported; the only refresh/prune option is to invalidate an existing cache so that it is recreated on a subsequent script execution.

5) The default mode of operation is "stat=0": that is, the cache associated with the SCRIPT_NAME is opened if it exists, and no source files are statted or read. This means that a single cache file can execute an application such as MW or WP.

6) The starting point for this Lite Program Cache (LPC) is a set of specific and relevant modules from the existing APC code base (albeit stripped of all code redundant in this use case).
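To make the stat=0 request flow in points (3)-(5) concrete, here is a minimal sketch in Python (the real extension would of course be C hooking the Zend compile phase; all names, the cache directory, and the atomic-rename detail are my illustrative assumptions, not the LPC design):

```python
# Sketch of the stat=0 request flow, with the cache keyed per
# UID + SCRIPT_NAME (point 3). All names are invented for illustration.
import hashlib, os

CACHE_DIR = "/tmp/lpc-cache"

def cache_path(uid, script_name):
    # UID-specific and SCRIPT_NAME-specific key (point 3)
    key = hashlib.sha1(f"{uid}:{script_name}".encode()).hexdigest()
    return os.path.join(CACHE_DIR, f"{key}.cache")

def request(uid, script_name, compile_all):
    path = cache_path(uid, script_name)
    if os.path.exists(path):           # stat=0: open cache, never stat sources
        with open(path, "rb") as f:
            return f.read()
    blob = compile_all()               # cache miss: compile once...
    os.makedirs(CACHE_DIR, exist_ok=True)
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:         # ...and write via a temp file so
        f.write(blob)                  # concurrent requests never see a
    os.replace(tmp, path)              # partial cache (process-safety, point 3)
    return blob

def invalidate(uid, script_name):
    # the only refresh/prune option (point 4): drop the cache so that it
    # is recreated on the next execution
    try:
        os.remove(cache_path(uid, script_name))
    except FileNotFoundError:
        pass
```

Note that on the hot path (cache hit) the only filesystem operations are one existence check and one serial read of the cache file itself.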

I am proposing that *I* do this development, at least to a working proof-of-concept extension. (I am not asking the current APC maintainers to do any material work here, since this is a fork, though I would very much welcome access and feedback for review of design docs and code, and possibly advice on specific issues.) I've already implemented a suitable DBA(cdb) replacement extension, since cdb performs poorly in this use case, but I'll release the two extensions as a set, and I am largely through reengineering the APC design from its code base. It will take me a couple of months to complete this first release. I can also push the code base from my local git repository to GitHub in a few weeks if anyone wants review access.

This is really just an FYI to the community, but any comments and feedback would be welcome. Failing that, I will post back here when I have a working extension and hard performance data.

Regards, Terry Ellison

PS. Since this is my first post to this DL, a short intro on myself: I am an ex-IT dev with ~10 yrs of C/C++ development experience -- mostly infrastructure and realtime -- as well as some more senior roles such as systems architect. I am now early-retired, a "gentleman contributor" keeping my hand in through code contributions to a few FLOSS projects. I was the sysadmin and maintainer of the OpenOffice.org user forums and wiki for over 5 years, and lately a member of the Apache Infrastructure team (though this has now lapsed, as I was unhappy with the shift from the previously more friendly Sun project ethos, so I am now looking for another project to get my teeth into). I also answer Qs on StackOverflow on PHP, Apache, using shared hosting services, etc. [3]
T

[1] http://news.php.net/php.pecl.dev/start/9807
[2] http://blog.ellisons.org.uk/article-44; see also http://blog.ellisons.org.uk/search-PHP for background on this and related articles
[3] http://stackoverflow.com/users/1142045/terrye


On 27/09/12 16:44, Rasmus Lerdorf wrote:
That was quite a long message. I still don't believe in CGI-based
setups. It will be extremely painful to build what you propose. I think
a better option would be to look at php-fpm and figure out what is
missing from that to use it in a shared hosting environment. It already
supports APC and per-user process pools. The only missing piece is
probably related to spinning up pools on demand as opposed to starting
them all on server start.

-Rasmus
