Ronald Chmara wrote:
On Feb 21, 2009, at 10:55 PM, shire wrote:
Hi Ronald,
Ronald Chmara wrote:
Wait... so if I understand this right, let's envision a code base where,
per some random page load, 70 functions are actually called, but, oh,
7,000, or even 700,000, are being included for whatever reason?
The speed optimization is in *not* copying a massive amount of things
that weren't even needed, or used, in the first place?
Essentially, yes, this is probably best summed up by the 80/20 rule
where we only use 20% of the code etc...
Well, I can see 80% actually *used* code, with 20% in there by
accident.... but 80% unused code? eep! ack! Call the villagers and get
the torches and pitchforks!...
...but environments vary, of course. ;)
However, there's still the horribly massive speed hit of semi-loading,
and marking, a fairly large amount of unused, un-needed, functions, as
available?
I don't agree with describing this as a "horribly
massive speed hit" at least in comparison with what was happening
without lazy loading.
Fair enough. Before the patch, for example, I might describe it (80%
unused, 20% used code) as an "insanely awful, horribly massive speed
hit", and after the patch, as being reduced to a much lesser "horribly
massive speed hit", but these are just rhetorical, and qualitative,
language devices that I used to characterize code issues.
In both cases, a large amount of CPU is spent on (effectively) doing
nothing, but your patch (as I understand its design) reduces the amount
of CPU waste... doing nothing.
Also, like I said, there are further iterations I plan to make here, one
of these being to improve the performance of marking functions as
available.
One thing I see as quite a beneficial future outcome of your work is the
ability to further profile code, and be able to seek out code that marks
massive amounts of functions as "available".... without actually ever
using them.
I do see the benefit of lazy loading, I'm just not very comfortable with
enabling a philosophy of loading up a massive amount of CPU and RAM with
"just in case they're wanted" features and code in the first place.
Well I am assuming that this is what a large amount of code does
already, except that without lazy loading the situation is
significantly worse.
Different code bases and philosophies vary.
Since much of what I do (enterprise PHP tuning) involves (among many
other things) finding, and eliminating, such code, I can say with great
confidence that there certainly are bloat-bases out there that load
metric hogs-heads of libraries to show a single web email form, but
there are also code bases which do *not* rely on endless libraries,
frameworks, additional template abstractions and end user libraries, or
other pre-determined architectures... to complete the simple task of
showing a web email form.
To frame the issue another way, you are trying to make huge,
complicated, code sets less painful to use, and I am arguing that huge,
complicated, code sets are a major part of the problem..... but since
neither of us can wave a magic wand and reduce the problem to simple,
elegant, code sets, you're reducing the magnitude of pain involved.
Kudos to you.
Your point that we should be sure this does not encourage poor coding
practices is well taken, but it's been my experience that code tends
to take this form regardless so I'm hoping to make the best of the
situation ;-).
There will always be bad code, yes. ;-)
I'm trying to raise a token flag of discussion/resistance to making bad
code practices less painful, as it still enables bad code practices.
Also keep in mind that there are cases where you may not know in
advance which functions you will/will not call, but it's probably fair
to say that the 80/20 rule still holds, so including all the functions
you may need is not particularly a misuse of the language, but rather
a necessity of a dynamic application and language.
It all depends on the use, and environment, I suppose.
It certainly can boost an APC code set such as facebook, where many of
those files and functions *will* likely be used in the next 20 minutes
or so, but I also fear that it will encourage programmers to load
everything they have, every time, just in case they need it.... and 2GB
apache processes (and APC space) can be.... ugly.
I'm not entirely clear on where code being used in the next 20 minutes
comes into play. What difference does 100 milliseconds vs. 20 minutes
make in APC/lazy loading?
FB seems to have a fair bit of traffic, with a semi-patrolled code set,
so it's likely that any single APC-loaded function will be invoked
*eventually*, within an hour or so.
Contrast this with 1,000 different sites hosted on a box, using a less
patrolled, fairly unregulated, 1,000 different batches of PHP codesets,
where "myTotallyCustomDateTime()" can have 1,000 different variants,
some of which are only actually used once every 3-5 weeks or so.
Lazy loading would (as I understand it) speed up both, but lazy loading
would also encourage not just one code set, but all code sets, to assume
that the *language authors*, rather than the *developers*, were
responsible for making sure CPU was being managed efficiently.
It's actually likely that only a fraction of the code at Facebook
will be used in a request, hence the need for lazy loading.
Ouch. Seriously.
I can't tell you how to build your code, but I think you might seriously
benefit from:
a) Lazy Loading (as you've done, great idea)
b) Using Lazy Loading to find out which apps/code are sucking up massive
CPU, and taking action as needed to help tune, or remove, the offending code?
c) Breaking your Lazy Loading targets out, to where (a hypothetical) the
mytzyplk_scramble() function/class method is only included and thus
loaded as needed, rather than (as a guess) a function group (or class
method group) auto-loaded which may/may not be needed for a given page
load?
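A rough sketch of what (c) could look like in userland PHP, purely as an illustration and not how the patch itself works. The temp file, the $lazyMap array, the call_lazy() helper, and the strrev() body are all hypothetical stand-ins; only the mytzyplk_scramble() name comes from the example above.

```php
<?php
// Sketch only: file name, map, and helper are hypothetical.

// Stand-in for a small library file that defines exactly one function.
// The strrev() body is a placeholder, not a real implementation.
$file = sys_get_temp_dir() . '/mytzyplk_scramble.php';
file_put_contents(
    $file,
    '<?php function mytzyplk_scramble(string $s): string { return strrev($s); }'
);

// Map each function name to the single file that defines it.
$lazyMap = ['mytzyplk_scramble' => $file];

// Include the defining file only on first use, then call the function.
function call_lazy(array $map, string $name, ...$args)
{
    if (!function_exists($name)) {
        require_once $map[$name];
    }
    return $name(...$args);
}

var_dump(function_exists('mytzyplk_scramble')); // not loaded yet
echo call_lazy($lazyMap, 'mytzyplk_scramble', 'abc'), "\n";
var_dump(function_exists('mytzyplk_scramble')); // loaded after first call
```

A page that never calls mytzyplk_scramble() never pays for including or compiling its file, which is the per-function granularity being suggested.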
Does that make sense? Or did you try it already? :)
-Bop
Hi,
Ronald, I think you are overreacting a little bit. It may be that properly
written code would get no benefit from this patch since it would not
load unneeded code and this patch ends up speeding up environments where
such "correct" loading isn't done. I don't think that's a reason to
disqualify a feature that brings benefits with no significant drawbacks.
For the average PHP programmer, the language will simply "get faster".
That can't be bad in any way. It doesn't encourage you to write bad
code, it just doesn't kick you in the nuts when you do.
Regards
Rodrigo Saboya