(Repost plus Rasmus feedback -- as the original got chucked into the bin
-- ezmlm issues)
Hi,
I have a particular interest in performance optimisation of the
infrastructure stack used in shared hosting services, and this topic is
raised regularly at various application forums as well as on lists such
as this one. The last instance was the thread "APC and CGI" [1], and
I've CCed Marten and Rasmus, who had this discussion. The nub of it was
the points raised (i) by Marten, that there is a genuine need for
performance acceleration of LAMP stacks used in shared hosting
offerings; and (ii) by Rasmus, that there are technical difficulties
with doing this, so the work involved would be non-trivial, and "if you
are using a fork-per-request CGI model you are obviously not that
concerned about performance".
Can I offer an alternative perspective: developers and implementers use
shared hosting offerings for a number of reasons, and perhaps the two
major ones are that (i) most such implementers simply don't have the
administration skills to build and administer a dedicated/VM host
offering a LAMP stack, and (ii) yes, shared services are cheaper.
However, the main discriminant in choosing between shared and dedicated
hosting service architectures is request *volume*, not responsiveness.
I have yet to meet an admin or implementer who isn't concerned about
their application's or service's responsiveness. Also, despite the fall
in the relative price of VM offerings, the number of shared hosting
accounts offered by hosting providers exceeds the number of VM and
dedicated accounts by *more than an order of magnitude*, and such
providers now routinely offer "one-click" installation of heavyweight
applications such as WordPress, MediaWiki and phpBB.
Such applications usually have poor responsiveness on a typical shared
service. The main reason for this is NOT the php-cgi image activation
time (<100 msec on a current server) [2], but the infrastructure
architecture adopted by most hosts to achieve scaling. This is normally
achieved by using a farm of servers with separate dedicated tiers
providing web services, back-end D/Bs, and user storage. Modern servers
arranged in such a farm architecture can deliver ample MIPS to support
shared hosting solutions, so processing delays aren't usually a major
factor in responsiveness.
The main responsiveness killer is I/O delay. A single web request to
one of the MW-class applications can require loading roughly 100 PHP
modules and compiling perhaps 500K lines of source. The roughly 1 CPU
second needed to compile such a script set is non-trivial, but the
issue driving script response is normally the I/O delay associated with
the RPCs to the back-end NAS infrastructure needed to read the source
files. (The NAS filesystems are usually NFS-mounted with a short
acregmin, say 15s, so the in-server VFS attribute cache is usually
flushed between requests, and this I/O can therefore generate ~300
off-server RPCs as well as cache-miss physical I/O within the NAS
itself.)
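To put a rough number on that RPC cost, here is a back-of-envelope
sketch (the per-module RPC count and round-trip latency are my own
assumptions for illustration, not measurements from the thread):

```python
# Back-of-envelope for the off-server RPC cost described above.
# Assumptions (mine): ~3 RPCs per module (lookup + getattr + read)
# and a ~0.5 ms NAS round-trip on a typical hosting LAN.
modules = 100            # PHP modules loaded per request
rpcs_per_module = 3      # lookup + getattr + read, roughly
rtt_ms = 0.5             # assumed per-RPC round-trip time

total_rpcs = modules * rpcs_per_module       # ~300, as above
io_wait_ms = total_rpcs * rtt_ms             # serialised I/O wait
print(total_rpcs, io_wait_ms)
```

Even with these optimistic figures, the serialised attribute and read
traffic alone adds well over 100 ms before any compilation starts.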
This I/O and compilation delay is largely avoidable by using a
per-script, cdb-style, file-based opcode cache. This would in effect
replace the compilation overhead and the assembly/input of the ~100 PHP
modules with a largely serial read of a single (compressed) opcode
file, and since NFSv4 effectively does bulk read-ahead for serial
access to files, the I/O overheads can be reduced by well over an order
of magnitude.
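The shape of the idea can be sketched in a few lines (this is an
illustration of the single-file principle only -- the function names and
the pickle/zlib format are mine, not the proposed cache format):

```python
import os
import pickle
import zlib

# Hypothetical sketch: pack the compiled form of many modules into one
# compressed cache file, so a later request does a single serial read
# (which NFS read-ahead handles well) instead of ~100 per-module reads.

def write_cache(cache_path, modules):
    """modules: dict of module name -> (stand-in) opcode bytes."""
    blob = zlib.compress(pickle.dumps(modules))
    tmp = cache_path + ".tmp"
    with open(tmp, "wb") as f:      # write the whole cache in one pass
        f.write(blob)
    os.replace(tmp, cache_path)     # atomic swap keeps readers safe

def read_cache(cache_path):
    """One serial read + decompress replaces many small file opens."""
    with open(cache_path, "rb") as f:
        return pickle.loads(zlib.decompress(f.read()))
```

The write-to-temp-then-rename step matters on shared storage: a reader
never sees a half-written cache, only the old file or the new one.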
I've demonstrated that this can be achieved at an application level
with phpBB by processing script hierarchies to marshal them into
condensed glob sets [3], and even this gives a ~3x speed-up, *but* such
processing is application-specific and deeply unpopular with the
application maintainers, whose usual answer is "use a dedicated LAMP VM
and APC or Xcache". This really needs to be supported at the PHP
extension level.
So what I am proposing is:
1) We develop an APC-lite extension offering code caching (but not
variable caching) in a manner that is transparent to the applications
sitting over it.
2) The main use case is to deliver performance acceleration for complex
applications such as MediaWiki, WordPress and phpBB implemented in a
shared hosting environment; it is targeted specifically at php-cgi/cli
environments.
3) Any cache strategy should be thread and process-safe, UID-specific
and normally SCRIPT_NAME-specific.
4) The file caches are piece-wise constant and are extended when
necessary using an incremental approach that overwrites the existing
cache with the extended version. No LRU or other cache-pruning schemes
are supported; the only refresh/prune option is to invalidate an
existing cache so that it is recreated on a subsequent script
execution.
5) The default mode of operation is "stat=0": the cache associated with
the SCRIPT_NAME is opened if it exists, and no source files are statted
or read. This means that a single cache file can serve an application
such as MW or WP.
6) The starting point for this Lite Program Cache (LPC) is a set of
specific and relevant modules from the existing APC code base (albeit
stripped of all code redundant in this use case).
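The request path implied by points 3 and 5 can be sketched as follows
(again purely illustrative -- the function names, key scheme and
callbacks are mine, not the extension's API):

```python
import hashlib
import os

# Sketch of the "stat=0" fast path: the cache is keyed per UID and
# SCRIPT_NAME (point 3), and when a cache file exists it is used
# without statting or reading any source file (point 5).

def cache_path(cache_dir, uid, script_name):
    # UID- and SCRIPT_NAME-specific key; sha1 here is just a stand-in
    key = hashlib.sha1(f"{uid}:{script_name}".encode()).hexdigest()
    return os.path.join(cache_dir, f"{uid}-{key}.lpc")

def activate(cache_dir, uid, script_name, compile_and_cache, run_cached):
    path = cache_path(cache_dir, uid, script_name)
    if os.path.exists(path):        # stat=0: one existence check only
        return run_cached(path)
    # Miss: fall back to a normal compile, then create the cache so
    # the next request for this SCRIPT_NAME takes the fast path.
    compile_and_cache(path)
    return run_cached(path)
```

Note that only the first request for a script pays the compile cost;
every subsequent request touches the single cache file and nothing
else, which is what makes the scheme NFS-friendly.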
I am proposing that *I* do this development, at least to a working
proof-of-concept extension. (I am not asking the current APC
maintainers to do any material work here, since this is a fork, though
I would very much welcome access and feedback for review of design docs
and code, and possibly advice on specific issues.) I've already
implemented a suitable DBA(cdb) replacement extension, since cdb
performs poorly in this use case, but I'll release the two extensions
as a set, and I am largely through reengineering the APC design from
its code base. It will take me a couple of months to complete this
first release. I can also push the code base from my local git
repository to GitHub in a few weeks if anyone wants review access.
This is really just an FYI to the community, but any comments and
feedback would be welcome. Failing this, I will post back here when I
have a working extension and hard performance data.
Regards Terry Ellison
PS. Since this is my first post to this DL, a short intro on myself: I
am an ex-IT dev with ~10 yrs of C/C++ dev experience -- mostly
infrastructure and realtime, as well as some more senior roles such as
systems architect. I am now early-retired, a "gentleman contributor"
keeping my hand in with code contributions to a few FLOSS projects. I
was also the sysadmin and maintainer of the OpenOffice.org user forums
and wiki for over 5 years, and lately a member of the Apache
Infrastructure team (though this has now lapsed, as I was unhappy with
the shift from the previous, more friendly Sun project ethos, so I am
now looking for another project to get my teeth into). I also answer Qs
on StackOverflow on PHP, Apache, using shared hosting services, etc. [3]
T
[1] http://news.php.net/php.pecl.dev/start/9807
[2] http://blog.ellisons.org.uk/article-44 also
http://blog.ellisons.org.uk/search-PHP for b/g this and related articles
[3] http://stackoverflow.com/users/1142045/terrye
On 27/09/12 16:44, Rasmus Lerdorf wrote:
That was quite a long message. I still don't believe in CGI-based
setups. It will be extremely painful to build what you propose. I think
a better option would be to look at php-fpm and figure out what is
missing from that to use it in a shared hosting environment. It already
supports APC and per-user process pools. The only missing piece is
probably related to spinning up pools on demand as opposed to starting
them all on server start.
-Rasmus