On Sat, 12 Feb 2011, ND wrote: > Hi, Philip! > > When a possessive quantifier applyed to expression for example > (?:<EXPR>)*+ > then recursion occurs and its value is equal to the number of sequential > <EXPR> in input string. > Why recurtion is so big? It seems there is no need to keep all > intermediate backtrack points because PCRE have no possibility to return > to them. Perhaps PCRE need to keep only one last point for backtracking to > it if following peace of input string don't match <EXPR>. > > May be this is a way for optimization?
I've looked at this; unfortunately, I can't see how to optimize. A pattern such as (abc)*+ is compiled as if it was (?>(abc)*) and then two issues are raised: (1) The code does not know that (abc)* is the only thing inside the (?> parentheses; for example, it must cope with (?>(abc)*xyz) so it must be able to backtrack within the (?> parentheses. (2) Even if there is nothing else, it needs to back track. Consider the pattern (?>(abc)*) and the subject string "abcabcabx". It has to remember the start point, then try (abc); when that succeeds it remembers that point and tries (abc) again; when that succeeds it remembers the point and tries a third time. This time (abc) fails to match, so it has to back up to the remembered point and then continue out of the (?> parentheses. I suppose that theoretically it should be possible to remember only the last back-up point, but because the saving is done by recursive calls, I can't see how to do that. Philip -- Philip Hazel -- ## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
