On Thu, 31 May 2018, ND via Pcre-dev wrote:

> PCRE2 version 10.31 2018-02-12
> /(?<=\G.)/g,replace=-
>     abc
>  2: a-bc-
>
> Logically expected result: a-b-c-
>
> PCRE advances by one character between zero-length matches. But it seems it
> should not in this case.
It is *pcre2test* that is doing the advancing, not the PCRE2 library, which
only ever does one match at a time.

The use of \G in lookbehinds, like \K, can cause a lot of confusion. \G is
true when the current matching point is at the start of the subject plus the
starting offset (compare ^, which is only true at the start of the subject in
single-line mode). Consider the match without the replace, and some added
parentheses:

  /(?<=\G(.))/g
      abc
   0:
   1: a
   0:
   1: c

The /g operation in pcre2test starts by calling pcre2_match() with a starting
offset of 0. This defines where \G will match. However, the lookbehind fails
at offset 0, because there are no earlier characters, so the match moves on
(within the PCRE2 library) to try again from a new starting position, but this
does NOT change the original starting offset. This second attempt succeeds.

Normally, the next call to pcre2_match() from pcre2test would pass the subject
with a new starting offset, set to the end of the previous match. However, the
match was for an empty string, so pcre2test moves on one character, and the
starting offset is now 2. This time the lookbehind has a character to look
back at, but when it does, the matching point is not at the starting offset,
so once again the match moves on within the PCRE2 library, finding a match one
character later.

In both cases, it finds a match one character later than the starting offset.

Philip

-- 
Philip Hazel

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
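The two loops described above can be sketched as a small Python model. This is only an illustration of the reasoning, not PCRE2 or pcre2test code: the function names are hypothetical, the pattern /(?<=\G(.))/ is hard-coded rather than compiled, and the caller's handling of empty matches is reduced to "advance one character".

```python
def match_once(subject, start_offset):
    # Models a single pcre2_match() call for /(?<=\G(.))/ (hypothetical helper).
    # The inner loop is the library's own "move on and try again"; it never
    # changes start_offset, so \G stays pinned to the original offset.
    for pos in range(start_offset, len(subject) + 1):
        behind = pos - 1  # the lookbehind steps back one character
        if behind >= 0 and behind == start_offset:  # \G is true only here
            return pos, subject[behind]  # empty match at pos, capture group 1
    return None


def global_matches(subject):
    # Models the /g loop in the caller (simplified assumption: every match
    # here is empty, so the next starting offset is match end plus one).
    results = []
    offset = 0
    while offset <= len(subject):
        found = match_once(subject, offset)
        if found is None:
            break
        pos, captured = found
        results.append((pos, captured))
        offset = pos + 1
    return results


def substitute(subject, replacement):
    # Inserts the replacement at each empty match position.
    out = []
    last = 0
    for pos, _ in global_matches(subject):
        out.append(subject[last:pos])
        out.append(replacement)
        last = pos
    out.append(subject[last:])
    return "".join(out)


print(global_matches("abc"))   # [(1, 'a'), (3, 'c')]
print(substitute("abc", "-"))  # a-bc-
```

With starting offsets 0 and then 2, the model finds empty matches at offsets 1 and 3, capturing "a" and "c", which agrees with the pcre2test output above and with the reported a-bc- result.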
