Branch: refs/heads/yves/fix_21534_skip_with_find_byclass_in_match
  Home:   https://github.com/Perl/perl5
  Commit: 9715a3bac67ea2cc00929e81b436ae441ef1dcde
      
https://github.com/Perl/perl5/commit/9715a3bac67ea2cc00929e81b436ae441ef1dcde
  Author: Yves Orton <demer...@gmail.com>
  Date:   2023-09-30 (Sat, 30 Sep 2023)

  Changed paths:
    M regexec.c
    M t/re/pat_advanced.t

  Log Message:
  -----------
  regexec.c - make find_byclass() work with (*SKIP) and friends

This fixes https://github.com/Perl/perl5/issues/21534

We have an optimisation that applies to patterns starting with a PLUS
style pattern, so that things like /A+B/ do not try to match at every A
when B does not match after the first attempt. E.g.,
"AAAAAABBBBAAAAC"=~/[Aa]+[CD]/ should not try to match at every 'A' in
the first sequence of 'A's. This optimisation is signalled by the
presence of a PREGf_SKIP flag.
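
As a rough illustration of what the PLUS optimisation saves, here is a
toy Python model. It is not the regexec.c code: the function name and
the skip_runs parameter are hypothetical, and it assumes the two
character classes are disjoint, so backtracking into the run can never
help. It only records which start offsets get tried.

```python
def start_positions(text, first, second, skip_runs):
    """Toy scan for a pattern shaped like /[first]+[second]/.

    Returns (match_span_or_None, start_offsets_tried).
    Assumes `first` and `second` share no characters.
    """
    tried = []
    i, n = 0, len(text)
    while i < n:
        if text[i] not in first:
            i += 1          # cannot start a match here
            continue
        tried.append(i)
        j = i
        while j < n and text[j] in first:
            j += 1          # consume the whole run greedily
        if j < n and text[j] in second:
            return (i, j + 1), tried
        # The attempt failed. With skip_runs we jump past the run we
        # just consumed instead of retrying at every offset inside it.
        i = j if skip_runs else i + 1
    return None, tried


naive = start_positions("AAAAAABBBBAAAAC", "Aa", "CD", skip_runs=False)
fast = start_positions("AAAAAABBBBAAAAC", "Aa", "CD", skip_runs=True)
```

With the string from the example above, the naive scan tries offsets
0 through 5 and then 10, while the skipping scan tries only 0 and 10.
Both find the same match.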

The PLUS optimisation did not play nicely with patterns which were doing
a similar task using the (*SKIP) operator. Essentially we need to
disable the former when the latter has been used, or it can get
confused. Consider the case of

    "AAAAAABBBBBBBAAAAAC"=~/[Aa]+(?:[Bb]+(*SKIP)(*FAIL)|[CD])/.

The idea is to signal to the regex engine that once "AAAAAABBBBBBB" is
matched it can continue from after the final 'B'. Because of the way the
PLUS optimisation is implemented, this advancing of the pointer to the
last B confused things, and it failed to match the final AAAAAC sequence.
This patch is somewhat of a bodge; it shouldn't be necessary to inspect
inside of reginfo() after a call to regtry(), and it is a bit
counter-intuitive to do so. This patch wraps the check in a macro so
that at least it is somewhat self-documenting what it is doing.
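
The intended behaviour of the (*SKIP) example can be sketched with
another toy Python model. This is only an assumption-laden model of the
semantics described above for that one pattern, not the engine's code
and not the macro added by this patch; the function name is
hypothetical.

```python
def match_with_skip(text):
    """Toy model of /[Aa]+(?:[Bb]+(*SKIP)(*FAIL)|[CD])/.

    Returns (matched_text_or_None, start_offsets_tried).
    """
    n = len(text)
    start = 0
    tried = []
    while start < n:
        tried.append(start)
        i = start
        while i < n and text[i] in "Aa":
            i += 1                      # [Aa]+
        if i == start:
            start += 1                  # no 'A' here, move on
            continue
        j = i
        while j < n and text[j] in "Bb":
            j += 1                      # [Bb]+
        if j > i:
            # (*SKIP)(*FAIL): this attempt fails, and the next
            # attempt starts where (*SKIP) was reached.
            start = j
            continue
        if i < n and text[i] in "CD":
            return text[start:i + 1], tried
        start += 1
    return None, tried


result = match_with_skip("AAAAAABBBBBBBAAAAAC")
```

In this model the first attempt at offset 0 consumes the 'A's and 'B's
and hits (*SKIP), so the next attempt starts at offset 13, just after
the final 'B', and matches "AAAAAC".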

