Hi,

the pattern is always compiled to byte code first, and JIT then converts the
repeated byte code back into an iterator, so using JIT alone does not help.
The reason for not using an iterator in the interpreter is practical: the PCRE
interpreter uses stack recursion, and you cannot easily share variable data
across function calls. This is not a problem for single-character iterators,
but matching brackets would require inspecting the machine stack. Finding the
previous call of an iterator on the stack chain and getting local data from it
is difficult (in C at least). Instead, the byte code of a subpattern is
repeated, so there is no need to track the iterator count. JIT does not use
the machine stack for recursion, and it has an infrastructure for sharing
iterator data, so this is not an issue there.
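
To give a rough feel for the size trade-off, here is a small sketch in C. It
is not PCRE code: GROUP_BYTES and LOOP_OVERHEAD are made-up numbers, and the
"looped" form is a hypothetical counted-loop encoding that the interpreter
does not have. The only PCRE-related assumption is that a link of N bytes can
hold offsets up to 2^(8N) - 1.

```c
#include <stdbool.h>

/* Assumed size in bytes of one compiled copy of the subpattern.
   Illustrative only; real PCRE byte code sizes differ. */
#define GROUP_BYTES 10ULL

/* Assumed extra bytes for a hypothetical counted-loop opcode. */
#define LOOP_OVERHEAD 6ULL

/* Largest offset representable in a link of link_size bytes (2, 3 or 4). */
static unsigned long long max_offset(unsigned link_size)
{
    return (1ULL << (8 * link_size)) - 1;
}

/* Byte code size when the subpattern is copied once per repetition. */
static unsigned long long unrolled_size(unsigned long long count)
{
    return GROUP_BYTES * count;
}

/* Byte code size if a single copy were wrapped in a counted loop. */
static unsigned long long looped_size(unsigned long long count)
{
    (void)count; /* the size would not depend on the repeat count */
    return GROUP_BYTES + LOOP_OVERHEAD;
}

/* True if byte code of the given size can be addressed with this link size. */
static bool fits(unsigned long long size, unsigned link_size)
{
    return size <= max_offset(link_size);
}
```

With these assumed numbers, fits(unrolled_size(9999), 2) is false while
fits(unrolled_size(9999), 3) is true, which is why a larger link size helps.
A looped encoding would fit easily, but it would need the shared counter that
is hard to provide in the recursive interpreter.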

Regards,
Zoltan

Jean-Christophe Deschamps <[email protected]> wrote:
>
>   At 18:30 25/01/2015, you wrote:
>   ´¯¯¯
>
>     I think the issue is that the byte code of the pattern is too big.
>     It is basically (?:\d+=) 9999 times. It was easier to implement the
>     interpreter this way (JIT converts back the byte code into an
>     iterator again, because of the code size).
>     To make this work, increase the link size to 3 or 4
>     (--with-link-size=4) when compiling PCRE.
>
>   `---
>   So if I understand you correctly, the only options are to either use a
>   larger link size or use JIT, neither of which is under my control since
>   I'm using a script language interpreter embedding PCRE in linked form.
>   While I regard PCRE as a superior engine and feel obliged by the work
>   of the dev team I find unfortunate the choice to not implement an
>   internal loop structure for fixed repetition of subpatterns.
>   Thank you for your insight anyway.
>
>   --
>   [1][email protected]
>
>References
>
>   1. mailto:[email protected]
>-- 
>## List details at https://lists.exim.org/mailman/listinfo/pcre-dev 
>

