On Wed, Mar 1, 2017 at 4:35 PM, Rasmus Schultz <ras...@mindplay.dk> wrote:

> Hey internals,
>
> I was wondering whether or how PCRE regular expressions get parsed and
> cached, and I found this answer on Stack Overflow:
>
> http://stackoverflow.com/questions/209906/compile-regex-in-php
>
> Do I understand this correctly, that:
>
> 1. All regular expressions are hashed and the compiled expression is cached
> internally between calls.
>

Correct.

> 2. The /S modifier applies more optimizations during compile, but caching
> works the same way.
>

Yes. Additionally, if PCRE JIT is enabled (which it usually is on PHP 7) we
always study, independently of whether /S was specified.


> 3. Compiled expressions are not cached between requests.
>

Compiled expressions are cached between requests. However, they are not
shared between processes (I'm not even sure if that's possible.)

The cache eviction strategy is FIFO. More specifically, whenever the
cache fills up, we discard the first 1/8 of the cached regular expressions.
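
To make the eviction behaviour concrete, here is a rough sketch in Python
(not the actual C code in ext/pcre). The class name RegexCache and the
capacity parameter are made up for illustration, and Python's re.compile
stands in for the PCRE compile step. It only models the two points above:
entries are keyed by the pattern string, and when the cache is full the
first 1/8 of the entries, in insertion order, are dropped.

```python
import re
from collections import OrderedDict


class RegexCache:
    """Toy per-process cache of compiled patterns (hypothetical, for illustration)."""

    def __init__(self, capacity):
        self.capacity = capacity          # assumed fixed upper bound on entries
        self._entries = OrderedDict()     # insertion order doubles as FIFO order

    def __len__(self):
        return len(self._entries)

    def __contains__(self, pattern):
        return pattern in self._entries

    def get(self, pattern):
        compiled = self._entries.get(pattern)
        if compiled is not None:
            # Hit: reuse the compiled pattern. The order is left untouched,
            # which is what makes this FIFO rather than LRU.
            return compiled
        if len(self._entries) >= self.capacity:
            # Full: drop the first 1/8 of the entries (at least one).
            for _ in range(max(1, self.capacity // 8)):
                self._entries.popitem(last=False)
        compiled = re.compile(pattern)
        self._entries[pattern] = compiled
        return compiled
```

In this model, a long-running process that keeps generating dynamic
patterns never grows the cache past its capacity. The cost is that a
frequently used static pattern can be evicted along with the rest of the
first 1/8 and has to be compiled again on its next use.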

Nikita


> If so, this seems far from optimal.
>
> Every unique regular expression needs to be compiled during every request,
> right?
>
> So with FPM, or with long-running apps, we're missing an opportunity to
> optimize by caching between requests.
>
> And with long-running apps, we're caching every dynamic regular expression,
> which could harm (memory overhead) more than help.
>
> Ideally, shouldn't we have (like some engines/languages) a switch to enable
> caching?
>
> The run-time can't know if a given regular expression is dynamic or static,
> can it? It's just a string either way - so without a switch, you're either
> committing compiled dynamic expressions to the cache unnecessarily, and/or
> missing an opportunity to cache between requests in long-running apps or
> under FPM.
>
> I think most apps use quite a lot of regular expressions for validation etc.
> so maybe there's a missed optimization opportunity here?
>
> Cheers,
>   Rasmus Schultz
>
