Well, then I have a few questions about matching and capturing
groups.

1. "ab" -> "^(a*)*(.)"
So, from your test case I can assume that:
regs[0] = [0, 2)
regs[1] = [0, 1)
regs[2] = [1, 2)

But if we add a backreference at the end:
2. "ab" -> "^(a*)*(.)\1"
check_matching matches the whole string "ab". This means that the
first group consumed 'a' but must in fact be empty, otherwise the
backreference could not match later on.
What is the correct match here? Is check_matching wrong, and
should it match only "a" with the 2nd group (as it would with
"^(a*)(.)\1")? Or should set_regs check for this and shrink the
match?

Next,
3. "aaba" -> "^(a*)*(.)\1"
Again check_matching matches "aaba", so the first group
is "a"; but then where does the 2nd 'a' go?

PCRE2 saves an empty string for optional groups like
"(a*)*", and I assume this is because a capturing group saves its
last match, and the last iteration matches the empty string. So in
this case it would match only "aab".
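
For comparison, here is a small sketch that runs the three cases
through Python's re module, which is another backtracking engine and
not glibc's matcher, so it only shows what one such engine does and
says nothing about what POSIX requires. The helper names describe and
backref_consistent are made up for this example. The second helper
checks the property I am asking about: whatever is matched after
group 2 has to be exactly the text saved in group 1.

```python
import re

# Subjects and patterns taken from the three cases above.
CASES = [
    ("ab", r"^(a*)*(.)"),
    ("ab", r"^(a*)*(.)\1"),
    ("aaba", r"^(a*)*(.)\1"),
]

def describe(subject, pattern):
    """Return [span of whole match, span of group 1, span of group 2],
    or None if the pattern does not match at the start of subject."""
    m = re.match(pattern, subject)
    if m is None:
        return None
    return [m.span(i) for i in range(m.re.groups + 1)]

def backref_consistent(subject, pattern):
    """True if the text matched after group 2 equals the text saved in
    group 1, i.e. the reported registers agree with the \\1 backref."""
    m = re.match(pattern, subject)
    if m is None or m.group(1) is None:
        return False
    return subject[m.end(2):m.end()] == m.group(1)

if __name__ == "__main__":
    for subject, pattern in CASES:
        print(subject, pattern, describe(subject, pattern),
              backref_consistent(subject, pattern))
```

A result like the one described in case 2 (whole match "ab" with
group 1 = "a") would fail this consistency check, which is why I
think either check_matching or set_regs has to change.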

So please tell me how all 3 cases should match; this will
help me fix the initial issue with backrefs and implement the
correct matching.

Thanks.

--
Egor

