On Mon, 23 Jan 2012, ND wrote:

> No. If PCRE can not calculate the length of the longest lookbehind in pattern
> then main application must know that string returned for a partial match may
> be not long enough and may be more symbols needed to keep.
>
> If PCRE can calculate the length of the longest lookbehind in pattern then it
> can simply returns it. Value 0 means that no lookbehinds present in pattern.

PCRE can easily calculate the length of the longest lookbehind. That is
not a problem. It can return it to the application via a PCRE_INFO_xxx
call. I think a negative value should mean no lookbehinds, because a
lookbehind of length 0 is permitted.

I think we have not yet got this fully understood. If zero-length
partial matches are allowed whenever there is a lookbehind, then just
adding a lookbehind in another branch of the pattern will change its
behaviour. You can always add (?<!)| at the start of a pattern without
having any effect ... the lookbehind always fails, so matching just
carries on with the rest of the pattern.

Something like /abc/ matched against \P\Pxyz gives no match. I do not
think it would be useful to make /(?<!)|abc/ give instead a partial
match, just because there is a lookbehind somewhere in the pattern.

Another idea: the problems arise from lookaheads at the end of the
subject string when no previous characters have been inspected. Perhaps
a zero-length partial match should be allowed if it arises within a
lookahead? This means that (?!b) could cause such a match, but [^b]
would not.

I need to work through a lot more examples to see what might make sense
here.

Philip

-- 
Philip Hazel

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
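
A small illustration of the (?!b) versus [^b] distinction at the end of
the subject. This is only a sketch using Python's re module, which has
no partial matching, so it shows ordinary matching behaviour and not
what PCRE does or would do. The helper name match_at is made up for the
example.

```python
import re
from typing import Optional


def match_at(pattern: str, subject: str, pos: int) -> Optional[str]:
    """Try to match pattern exactly at offset pos in subject.

    Returns the matched text (possibly empty) or None if there is no match.
    Hypothetical helper for illustration; not part of any PCRE API.
    """
    m = re.compile(pattern).match(subject, pos)
    return None if m is None else m.group(0)


# At the very end of "xyz" no character is left to inspect.
# The lookahead succeeds with an empty match; the class needs a character.
print(repr(match_at(r"(?!b)", "xyz", 3)))   # ''
print(repr(match_at(r"[^b]", "xyz", 3)))    # None

# If more data arrives, the lookahead's answer can change.
print(repr(match_at(r"(?!b)", "xyzb", 3)))  # None
print(repr(match_at(r"[^b]", "xyzc", 3)))   # 'c'
```

The lookahead gives an answer at the end of the subject that later data
could overturn, and it does so without consuming anything, which is the
case where a zero-length partial match might be worth reporting.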
