On Sun, Jan 12, 2020, at 6:45 PM, Mike Schinkel wrote:
> > On Jan 12, 2020, at 1:57 PM, Larry Garfield wrote:
> >
> > Most notably, *not all code will be run in a preload context*.
>
> Can you give some concrete examples here?
>
> > Language features that only sometimes work scare me greatly.
>
> Do you have some examples of language features, from PHP or another
> language, that only work sometimes and that are known to be
> problematic, and why they are problematic?
To use the example from the OP:

static_run {
    $link = mysqli_connect("127.0.0.1", "my_user", "my_password", "my_db");
    $res = mysqli_query($link, 'select * from sometable');
    $array = [];
    while ($row = mysqli_fetch_assoc($res)) {
        $array[] = $row;
    }
    define('CONST_ARRAY', $array);
}

I can see the use of that, sure. Now, what happens when the code is not
preloaded? Does that block not get run, and thus CONST_ARRAY is not defined?
Does it run on all requests if not preloaded? How does that interact with a
file that gets read multiple times?
What happens if the code does more than set a constant? Can it define new
functions? What happens to those functions in a non-preload situation?
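One way to see what "no behavioral difference" would demand of that first example: in today's PHP, the portable equivalent is lazy, idempotent initialization that gives the same answer whether or not anything ran ahead of time. A minimal sketch, assuming a hypothetical load_rows() that stands in for the mysqli query above (neither function name comes from the proposal):

```php
<?php
// Hypothetical stand-in for the mysqli query in the static_run example.
function load_rows(): array
{
    return [['id' => 1, 'name' => 'a'], ['id' => 2, 'name' => 'b']];
}

// Returns the same rows in any context: the first call in a request does
// the work, and later calls in that request reuse the cached result.
function const_array(): array
{
    static $rows = null;
    if ($rows === null) {
        $rows = load_rows();
    }
    return $rows;
}
```

The cost is that the work is repeated once per request, which is exactly the overhead the static_run idea is trying to remove.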
To use the other example:

static_run {
    on_file_load(function($file_ast) {
        // Do something with the ast of the file
        return $file_ast;
    });
}

AST manipulation from user-space opens up a lot of possibilities for
optimization. However, it's also a huge foot-gun. When you start messing with
the AST I can't imagine it's hard to end up introducing subtle behavioral
changes without intending to. Or, maybe you are intending to. So then what
happens if the code runs in a context where preloading doesn't happen? Does
the AST then get re-manipulated on every request instead? What's the
performance impact of that? Net negative?
I don't have answers to these questions. It's possible that we could come up
with a set of answers that would address the core issue, but I am skeptical.
My core point here is that I am fully in favor of leveraging preloading to
improve performance, BUT ensuring that there is zero behavioral difference
between preloaded and non-preloaded code, only performance differences, is
paramount. IMO that is more important than any flexibility, power, or
performance benefit that user-space hooks could offer. We should consider
exposing preloading to user space *only* if we can be pretty damned sure that
it's not going to introduce weird-and-subtle behavioral bugs that end up making
preloaded and non-preloaded code behave differently.
As an example, preloading seems like a great place to do something like tail
recursion flattening. That's a logically safe thing to do, as long as the call
is properly tail-recursive, and would make writing tail-recursive algorithms
more practical. (They're often easier to read and maintain but performance
makes them less practical.)
However! Doing so means the preloaded version doesn't have an issue with
blowing out the stack. The non-preload version does. That means the
non-preload version has a built-in limit on how long a list it can operate on
(PHP itself is bounded only by available memory, but with Xdebug loaded,
xdebug.max_nesting_level caps the call depth, historically at 100 by default,
minus however many stack calls have already been made) while the preloaded
version doesn't. That can have ugly implications if you're running code that
was working in preload in a non-preload context, and suddenly your 105-element
array is causing a fatal error when it didn't before.
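To illustrate the transformation being discussed, here is a rough sketch of a tail-recursive function next to the loop a flattening pass could plausibly produce. Both function names are hypothetical, and nothing here is something PHP does today; it only shows why the two forms are logically equivalent yet differ in stack use.

```php
<?php
// Tail-recursive form: the recursive call is the last thing the function does,
// so the call depth grows with the length of the list.
function sum_recursive(array $items, int $i = 0, int $acc = 0): int
{
    if ($i >= count($items)) {
        return $acc;
    }
    return sum_recursive($items, $i + 1, $acc + $items[$i]);
}

// Flattened form: same parameters and result, but the call depth stays at one.
function sum_flattened(array $items, int $i = 0, int $acc = 0): int
{
    while ($i < count($items)) {
        $acc += $items[$i];
        $i++;
    }
    return $acc;
}
```

For a short list the two are indistinguishable. Only the recursive form can hit a depth or memory limit as the list grows, and that is the point at which preloaded and non-preloaded behavior would diverge.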
That's the sort of subtlety that, frankly, I am a lot more confident in Engine
developers remembering to think about than user-land developers. Myself
included. Not because user-land developers are less capable but because 99% of
the time PHP doesn't force you to think about such questions, so most
developers won't think to think about them. And 99% of the time that's a good
thing. This is the other 1%. :-)
What I very much want to avoid, for as long as possible at least, is "this
library only works if preloaded" type situations. That's how we end up with a
division in the language, and not just between people who own their own servers
and those who don't: it also ties the hands of admins and framework authors in
deciding what to preload. What a "good" preload strategy is depends on
context, and we've only had a month or two of experience with it to even know
what to recommend to people.
And that's in addition to the development challenges of developing such code in
the first place:
> > "I changed one character and now I have to restart my webserver to see if
> > it did anything" is a bad place for PHP to be.
>
> As I envision it preloaded code of this nature would not be handled on
> server reboot, but when the files have had their time stamps updated.
> If I am not mistaken, PHP already does this (but I could be mistaken as
> I don't have expertise in PHP OpCodes.)
The opcache does that, yes. The preloader, however, is a one-shot deal and
requires restarting FPM to have it re-run.
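For context, a minimal sketch of what a 7.4 preload script looks like. The opcache.preload ini setting names one PHP file that runs once at server startup; the src/ directory and the preload_directory() helper are assumptions for illustration.

```php
<?php
// preload.php, referenced from php.ini via the opcache.preload setting.
// It runs once when the server starts, not once per request.
function preload_directory(string $dir): int
{
    $count = 0;
    foreach (glob($dir . '/*.php') ?: [] as $file) {
        // opcache_compile_file() compiles and caches a file without executing
        // it. The function_exists() guard lets the sketch run without opcache.
        if (function_exists('opcache_compile_file') && opcache_compile_file($file)) {
            $count++;
        }
    }
    return $count;
}

// Hypothetical project layout: preload every file directly under src/.
preload_directory(__DIR__ . '/src');
```

Editing a file under src/ afterwards has no effect on the preloaded copy until the server is restarted and this script runs again.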
Thinking about it, I suspect there would be far more benefit in practice not
from allowing AST manipulation but from being able to "checkpoint" a running
script; that is, allow it to not just pre-load code (which we can do now in
7.4) but set up variables that are already initialized from one request to the
next.
I'm thinking here of things like bootstrapping a dependency injection
container, declaring closed functions, and other semi-global stuff that right
now makes a PHP application's bootstrap process more expensive than most other
languages. (In the area of milliseconds, sure, but still slower.) Allowing
that sort of execution to happen once and get persisted would reduce the need
for all the precompiling and such that many frameworks do today at the cost of
a great deal of complexity.
That may be just as much of a pipe dream to do safely, I don't know, but in
practice it seems like a more promising direction for userland developers to
leverage directly.
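For comparison, the closest userland approximation available today is the kind of precompiling mentioned above: build the data once, dump it as a plain PHP file, and load that file on later requests. A minimal sketch, where build_definitions() and load_definitions() are hypothetical names and the array stands in for a real container definition:

```php
<?php
// Hypothetical bootstrap step; stands in for building a DI container definition.
function build_definitions(): array
{
    return ['logger' => ['class' => 'FileLogger', 'args' => ['/tmp/app.log']]];
}

// Build once, dump as a plain PHP file, and load that file on later requests.
// The dumped file is ordinary PHP code, so the opcache can cache it.
function load_definitions(string $cacheFile): array
{
    if (!is_file($cacheFile)) {
        $code = '<?php return ' . var_export(build_definitions(), true) . ';';
        file_put_contents($cacheFile, $code, LOCK_EX);
    }
    return require $cacheFile;
}
```

This persists only plain data, not live objects or open resources, and that gap is what a real checkpoint feature would have to close.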
--Larry Garfield
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php