Hi Anatol,

Anatol Belski wrote:

> This looks like an extremely fragile topic because it depends on how
> much stack is available to an executable. A custom JIT stack can
> behave more stable but cannot be resized. And the main issue is that
> the JIT stack size, machine stack size and ext/pcre cache size are
> completely unrelated terms. For example, a binary can have not enough
> stack, but the custom JIT stack using mmap/VirtualAlloc could even
> succeed, but then pcre_exec will be executed and overflow the machine
> stack. We can never know which one is exhausted first - the one for
> the JIT compilation or the other one for the execution, or vice
> versa.
> 
> Generally, moving the JIT compilation away from the machine stack and
> increasing the PCRE cache size should be more stable against this .
> However it's an edge case. IMHO we should not do it just to fix some
> crazy usage. Users who need it might just turn off JIT. Normal usage
> seems not to be affected, say loading some sane functional script,
> which FE is done by any benchmark with WP, Symfony, etc. But moving
> JIT compilation away from the machine stack wil lpossibly affect it.

Beforehand, I'm not suggesting to change anything regarding our PCRE
cache (PCRE_G(pcre_cache)); this seems to be fine as it is, and is
indeed not related to this topic.

Now please consider the following simple expression:

  preg_match('/^(foo)+$/', str_repeat('foo', $n))

This will fail (i.e. yield FALSE) independently of pcre.jit for large
enough $n.  However, a user can change pcre.recursion_limit what will
affect the $n limit (the expression will fail for smaller or larger $n),
if pcre.jit=0.  If pcre.jit=1 the user can't influence this boundary in
any way, currently.

And maybe even worse, with pcre.jit=0 the boundary is 50,000, but with
pcre.jit=1 it is only 1,366.  Of course, one can argue that this is a
contrived example, and that such usage is crazy, but why do we have a
default pcre.recursion_limit of 100,000 then?  A recursion_limit of
2,734 would be sufficient to have a boundary of $n == 1,366.

All in all, as this example already suggests, classic execution of
matching is done by recursive calls (using normal stack frames), while
JIT execution of matching is iterative, using a special JIT stack.[1]  I
don't think it is justified to give users a setting to adjust for the
former, but not for the latter (except to disable JIT, albeit JIT might
bring quite some boost especially for such cases).

As we're pretty late in the game for PHP 7.0, it might be best to
postpone a new ini setting or other changes to PHP 7.1, but at the very
least I would introduce a new error constant, say
PHP_PCRE_JIT_STACKLIMIT_ERROR[2], so users get a more meaningful result
when calling preg_last_error() than PHP_PCRE_INTERNAL_ERROR.  And it
seems to be appropriate to add a note to UPGRADING that pcre.jit=1 may
cause some preg_*() to fail which would work with pcre.jit=0.

[1] <http://www.pcre.org/original/doc/html/pcrestack.html>
[2]
<https://github.com/cmb69/php-src/commit/1546e3025403bb7ebed233c71fbc299fd584c9e3>

-- 
Christoph M. Becker

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to