https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85472
--- Comment #8 from Hans Åberg <haberg-1 at telia dot com> --- (In reply to Tim Shen from comment #5) > I'm not following the meaning of "action number" and "the partial reverse > NFA is recorded". > > How many actions numbers are recorded? for regex_match(s, regex(re)), is it > on the order of O(s.size()), or O(re.size())? The goal is to allocate > O(re.size()) of memory exactly once. Let's check the OP example, to see if it is useful, which will also explain the action numbers. So for regex '(z)((a+)?(b+)?(c))*', I used '(z)(a*b*(c))*' with action numbers same as in the C++ library numbering. Here they are written out with # in the NFA: ex = ({0}, {0, 3}, {0: {z ↦ {1, ∅}}[#1]; 1: {a ↦ {1, 2, 3}}[#2#3]; 2: {b ↦ {2, 3}}[#2#4]; 3: {c ↦ {3, 2, 1, ∅}}[#2#5]}, {}) with start set {0}, and set of last (pointing to a final state ∅) states {0, 3}. (The last {} is the not yet computed DFA.) Then lexing: > zaacbbbcac Full string is a valid match! Matches: {zaacbbbcac, [@0@3@7@9]; [#1] [#1][#2#3][#2#3][#2#5] [#1][#2#3][#2#3][#2#5][#2#4][#2#4][#2#4][#2#5] [#1][#2#3][#2#3][#2#5][#2#4][#2#4][#2#4][#2#5][#2#3][#2#5] } Time: 0.000555 s. Each line gives the action numbers for the string positions for the successive strings z, zaac, zaacbbbc, zaacbbbcac with final string positions @0, @3, @7, @9. The last line gives for #1 the sub-string "z" , and for #2 "aacbbbcac". For #3 "a", and for #5 "c". But #4 is missing, indication there is no match. So there might be problem here, as there are earlier matches: Perhaps the intent is that it should be implemented as a loop, only retaining the last #4, but the problem is that that is not what the underlying theory says.