Re: [PHP-DEV] Introducing compile time code execution to PHP preloading

Larry Garfield Mon, 13 Jan 2020 10:32:27 -0800

On Sun, Jan 12, 2020, at 6:45 PM, Mike Schinkel wrote:
> > On Jan 12, 2020, at 1:57 PM, Larry Garfield wrote:
> > 
> > Most notably, *not all code will be run in a preload context*.  
> 
> Can you give some concrete examples here?
> 
> > Language features that only sometimes work scare me greatly.  
> 
> Do you have some examples of language features, from PHP or another 
> language, that only work sometimes and that are known to be 
> problematic, and why they are problematic?


To use the example from the OP:

static_run {
    $link = mysqli_connect("127.0.0.1", "my_user", "my_password", "my_db");
    $res = mysqli_query ($link, 'select * from sometable');

    $array = [];
    while($row = mysqli_fetch_assoc($res)) {
        $array[]= $row;
    }

    define('CONST_ARRAY', $array);
}

I can see the use of that, sure.  Now, what happens when the code is not 
preloaded?  Does that block not get run, and thus CONST_ARRAY is not defined?  
Does it run on all requests if not preloaded?  How does that interact with a 
file that gets read multiple times?
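
A minimal sketch of the defensive code this would force on callers, assuming the 
block is simply skipped when the file is not preloaded.  load_rows_from_db() is 
a hypothetical stand-in for the mysqli query above, not a real function:

```php
<?php
// Hypothetical stand-in for the database query in the static_run example.
function load_rows_from_db(): array {
    return [['id' => 1, 'name' => 'example']];
}

// Without preloading the constant may never have been defined, so every
// consumer needs a guard like this (and pays the query cost per request).
if (!defined('CONST_ARRAY')) {
    define('CONST_ARRAY', load_rows_from_db());
}

echo count(CONST_ARRAY), " row(s) available\n";
```

The guard keeps behavior the same in both contexts, but only by duplicating the 
work the preload block was meant to do once.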

What happens if the code does more than set a constant?  Can it define new 
functions?  What happens to those functions in a non-preload situation?

To use the other example:

static_run {
    on_file_load(function($file_ast){

        // Do something with the ast of the file

        return $file_ast;
    });
}

AST manipulation from user-space opens up a lot of possibilities for 
optimization.  However, it's also a huge foot-gun.  When you start messing with 
the AST I can't imagine it's hard to end up introducing subtle behavioral 
changes without intending to.  Or, maybe you are intending to.  So then what 
happens if the code runs in a context when that doesn't happen?  Does the AST 
then get re-manipulated on every request instead?  What's the performance 
impact of that?  Net negative?

I don't have answers to these questions.  It's possible that we could come up 
with a set of answers that would address the core issue, but I am skeptical.

My core point here is that I am fully in favor of leveraging preloading to 
improve performance, BUT ensuring that there is zero behavioral difference 
between preloaded and non-preloaded code, only performance differences, is 
paramount, and IMO is more important than any flexibility, power, or 
performance benefits it could offer.  We should consider exposing that to user 
space *only* if we can be pretty damned sure that it's not going to introduce 
weird-and-subtle behavioral bugs that end up making preloaded and non-preloaded 
code behave differently.

As an example, preloading seems like a great place to do something like tail 
recursion flattening.  That's a logically safe thing to do, as long as the call 
is properly tail-recursive, and would make writing tail-recursive algorithms 
more practical.  (They're often easier to read and maintain but performance 
makes them less practical.)

However!  Doing so means the preloaded version doesn't have an issue with 
blowing out the stack.  The non-preload version does.  That means the 
non-preload version has a built in limit on how long of a list it can operate 
on (100 by default, minus however many stack calls have already been made) 
while the preloaded version doesn't.  That can have ugly implications if you're 
running code that was working in preload in a non-preload context, and suddenly 
your 105 element array is causing a fatal error when it didn't before.
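
To make the difference concrete, here is a rough sketch of a tail-recursive 
function next to the loop a flattening pass might turn it into.  The function 
names are made up for illustration, and the depth at which the recursive form 
actually fails depends on the environment:

```php
<?php
// Tail-recursive form: each element processed adds one more stack frame.
function sum_recursive(array $items, int $index = 0, int $acc = 0): int {
    if ($index >= count($items)) {
        return $acc;
    }
    return sum_recursive($items, $index + 1, $acc + $items[$index]);
}

// What flattening could produce: same result, constant stack depth.
function sum_flattened(array $items, int $index = 0, int $acc = 0): int {
    while ($index < count($items)) {
        $acc += $items[$index];
        $index++;
    }
    return $acc;
}

$list = range(1, 105);
echo sum_flattened($list), "\n";
```

Both forms return the same value for any input the recursive one can finish; 
only the flattened one is indifferent to how deep the call stack may go.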

That's the sort of subtlety that, frankly, I am a lot more confident in Engine 
developers remembering to think about than user-land developers.  Myself 
included.  Not because they're less capable developers but because 99% of the 
time PHP doesn't force you to think about such questions, so most developers 
won't think to think about them.  And 99% of the time that's a good thing.  
This is the other 1%. :-)

What I very much want to avoid, for as long as possible at least, is "this 
library only works if preloaded" type situations.  That's how we end up with a 
division in the language; not just between people who own their own servers and 
those that don't, but it ties the hands of admins and framework authors in 
deciding what to preload.  What a "good" preload strategy is depends on 
context, and we've only had a month or two experience with it to even know what 
to recommend to people.

And that's in addition to the development challenges of developing such code in 
the first place:

> > "I changed one character and now I have to restart my webserver to see if 
> > it did anything" is a bad place for PHP to be.
> 
> As I envision it preloaded code of this nature would not be handled on 
> server reboot, but when the files have had their time stamps updated. 
> If I am not mistaken, PHP already does this (but I could be mistaken as 
> I don't have expertise in PHP OpCodes.) 

The opcache does that, yes.  The preloader, however, is a one-shot deal and 
requires restarting FPM to have it re-run.



Thinking about it, I suspect there would be far more benefit in practice not 
from allowing AST manipulation but being able to "Checkpoint" a running script; 
that is, allow it to not just pre-load code (which we can do now in 7.4) but 
set up variables that are already initialized from one request to the next.  
I'm thinking here of things like bootstrapping a dependency injection 
container, declaring closed functions, and other semi-global stuff that right 
now makes a PHP application's bootstrap process more expensive than most other 
languages.  (In the area of milliseconds, sure, but still slower.)  Allowing 
that sort of execution to happen once and get persisted would reduce the need 
to do all the precompiling and such that many frameworks do today, at the cost 
of a great deal of complexity.
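
As a sketch of the kind of bootstrap work in question, assuming a deliberately 
tiny, hypothetical container (not any real framework's API): everything in 
bootstrap() below is re-run on every request today, and is what a checkpoint 
would persist instead.

```php
<?php
// Hypothetical minimal container, for illustration only.
final class Container {
    private array $factories = [];
    private array $instances = [];

    public function set(string $id, callable $factory): void {
        $this->factories[$id] = $factory;
    }

    public function get(string $id) {
        // Build each service on first use, then reuse it.
        return $this->instances[$id] ??= ($this->factories[$id])($this);
    }
}

// The per-request setup that a checkpoint would run once and keep.
function bootstrap(): Container {
    $c = new Container();
    $c->set('config', fn() => ['dsn' => 'sqlite::memory:']);
    $c->set('greeter', fn(Container $c) => fn(string $name) => "Hello, $name");
    return $c;
}

$container = bootstrap();
echo ($container->get('greeter'))('PHP'), "\n";
```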

That may be just as much of a pipe dream to do safely; I don't know.  But in 
practice that seems like a more promising direction for userland developers to 
leverage themselves.

--Larry Garfield

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
