On 2019-07-01 10:28, ph10 wrote:
> On Sun, 30 Jun 2019, ND via Pcre-dev wrote:
>> PCRE2 version 10.33 2019-04-16
>> /\A(?:.|..)(*THEN)c/
>> abc
>> No match
>>
>> Perl is match "abc".
>> I suppose "next innermost alternative" is interpreted differently by PCRE and
>> Perl.
>> If so, may be PCRE should go Perl way in this matter?
> I think this is a bug in Perl and I will report it as such.


After reading this post, https://rt.perl.org/Public/Bug/Display.html?id=92898#txn-1227153,
I'm not sure that this is a Perl bug.
I suppose that two branches start at "(?:.|..)". Each of these branches continues with the common tail through to the end of the pattern. Here are the two branches:
1) .(*THEN)c
2) ..(*THEN)c

Let's look at the Perl debug output:


Matching REx "\A(?:.|..)(*THEN)c" against "abcd"
Intuit: trying to determine minimum start position...
  doing 'check' fbm scan, [1..3] gave 2
  Found floating substr "c" at offset 2 (rx_origin now 0)...
  (multiline anchor test skipped)
Intuit: Successfully guessed: match at offset 0
   0 <> <abcd>               |   0| 1:SBOL /\A/(2)
   0 <> <abcd>               |   0| 2:BRANCH(4)
   0 <> <abcd>               |   1|  3:REG_ANY(8)
   1 <a> <bcd>               |   1|  8:CUTGROUP(10)
   1 <a> <bcd>               |   2|   10:EXACT <c>(12)
                             |   2|   failed...
                             |   1|  failed...
   0 <> <abcd>               |   0| 4:BRANCH(7)
   0 <> <abcd>               |   1|  5:REG_ANY(6)
   1 <a> <bcd>               |   1|  6:REG_ANY(8)
   2 <ab> <cd>               |   1|  8:CUTGROUP(10)
   2 <ab> <cd>               |   2|   10:EXACT <c>(12)
   3 <abc> <d>               |   2|   12:END(0)
Match successful!


So backtracking onto (*THEN) (shown as CUTGROUP in the trace) in BRANCH(4) caused that branch to fail immediately, and matching jumped to BRANCH(7).
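
To make the two readings concrete, here is a toy Python model of this one pattern. It is only a sketch: it is not PCRE2 or Perl code, and the function name match_at_start and the then_scope parameter are made up for illustration. With then_scope="branch", a failure after (*THEN) abandons only the current alternative together with its tail, which is what the Perl trace above appears to show. With then_scope="pattern", (*THEN) is treated as lying outside the group, so a failure after it ends the whole attempt, which gives the PCRE2 10.33 result.

```python
# Toy model of /\A(?:.|..)(*THEN)c/ under two readings of (*THEN).
# Not real PCRE2 or Perl code; all names here are hypothetical.

ALTERNATIVE_WIDTHS = (1, 2)  # "." consumes one character, ".." consumes two


def match_at_start(subject, then_scope):
    """Try the anchored pattern at offset 0 only; return the match or None.

    then_scope="branch":  backtracking onto (*THEN) fails the current
                          alternative plus its tail, then tries the next one.
    then_scope="pattern": backtracking onto (*THEN) fails the whole attempt.
    """
    if then_scope not in ("branch", "pattern"):
        raise ValueError("then_scope must be 'branch' or 'pattern'")
    for width in ALTERNATIVE_WIDTHS:
        # "." needs enough input and does not match a newline by default.
        if len(subject) < width or "\n" in subject[:width]:
            continue
        # We have now passed (*THEN); the tail is the literal "c".
        if subject.startswith("c", width):
            return subject[: width + 1]
        # The tail failed, so we backtrack onto (*THEN).
        if then_scope == "pattern":
            return None  # no later alternative is tried
        # then_scope == "branch": fall through to the next alternative
    return None


print(match_at_start("abc", "branch"))   # abc
print(match_at_start("abc", "pattern"))  # None
```

Both readings agree on a subject such as "ac", where the first alternative already succeeds; they differ only when the first alternative's tail fails.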

--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
