https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85472
--- Comment #9 from Tim Shen <timshen at gcc dot gnu.org> ---
Ah, with the example it's clear, thanks!

> The last line gives for #1 the sub-string "z" , and for #2 "aacbbbcac".

This is not what ECMAScript produces either. For capture #2, ECMAScript
produces "ac", the last match of the loop. Think about the difference between
(z)((a+)?(b+)?(c))*
and
(z)((?:(a+)?(b+)?(c))*)

Your #2 seems to capture the second case, which is different.

> For
> #3 "a", and for #5 "c". But #4 is missing, indication there is no match. So
> there might be problem here, as there are earlier matches:
>
> Perhaps the intent is that it should be implemented as a loop, only
> retaining the last #4,

That's what the implementations (Boost, libstdc++, Python) actually do. That's
not ECMAScript's intention. ECMAScript's intention is to leave #4 undefined
(*not* retaining the last non-empty #4), because in the last iteration of the
loop, #4 (b+)? doesn't match any sub-string.

> but the problem is that that is not what the
> underlying theory says.

I'm not sure if there is any theory around capturing groups. If we are about to
create one, be aware that there are multiple plausible definitions.
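
For reference, the difference between the two patterns can be checked with Python's re module, one of the implementations named above. This is only a sketch: the input string "zaacbbbcac" is assumed from the quoted result for #2, and the helper name capture_groups is made up for illustration. It shows Python's behaviour, not ECMAScript's.

```python
import re

# Assumed input, taken from the quoted "#2" result in this comment.
TEXT = "zaacbbbcac"

# Group 2 is inside the loop: it is overwritten on every iteration.
LOOP_CAPTURED = r"(z)((a+)?(b+)?(c))*"
# Group 2 wraps the whole loop: it spans every iteration at once.
WHOLE_LOOP_CAPTURED = r"(z)((?:(a+)?(b+)?(c))*)"


def capture_groups(pattern, text):
    """Return the tuple of capture groups #1..#n for a match at the start of text."""
    match = re.match(pattern, text)
    if match is None:
        raise ValueError(f"{pattern!r} does not match {text!r}")
    return match.groups()


if __name__ == "__main__":
    for pattern in (LOOP_CAPTURED, WHOLE_LOOP_CAPTURED):
        print(pattern, capture_groups(pattern, TEXT))
```

With the first pattern #2 is "ac" (the last iteration only); with the second it is "aacbbbcac". In both cases Python reports #4 as "bbb", carried over from an earlier iteration, which is the retaining behaviour described above and not the undefined value ECMAScript intends.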