When the number of transition tables reaches 1024, build_state() clears all of them in order to avoid running out of memory. However, the table for the initial state should not be cleared, because it is always used.
BTW, this patch also makes it possible to revert the earlier patch "grep: do not count newline before the start of buffer", because the code it guarded is no longer reached at the first character of the text.
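For readers who do not have dfa.c at hand, the clearing rule described above can be sketched roughly as follows. This is only an illustration, not the real code: struct toy_dfa, toy_trim_tables and TABLE_LIMIT are hypothetical names, the struct is a simplified stand-in for the table-related fields of struct dfa, and the sketch assumes that slot 0 belongs to the initial state and already holds a table.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the table fields of struct dfa.  */
struct toy_dfa
{
  int **trans;     /* trans[i]: transition table for state i, or NULL */
  int **fails;     /* fails[i]: failure table for state i, or NULL */
  size_t tralloc;  /* number of slots in trans and fails */
  size_t trcount;  /* number of tables currently built */
};

/* Same arbitrary limit as in the patch; the name is an assumption.  */
enum { TABLE_LIMIT = 1024 };

/* Once the limit is reached, free every table except the one for the
   initial state (slot 0), which is needed on every search.  */
static void
toy_trim_tables (struct toy_dfa *d)
{
  size_t i;

  assert (d != NULL);
  if (d->trcount < TABLE_LIMIT)
    return;

  /* Start at 1 so that slot 0 is left untouched.  */
  for (i = 1; i < d->tralloc; ++i)
    {
      free (d->trans[i]);
      free (d->fails[i]);
      d->trans[i] = d->fails[i] = NULL;
    }

  /* Only the initial state's table is still counted.  */
  d->trcount = 1;
}
```

Starting the loop at 1 and resetting the count to 1 instead of 0 are the only two differences from clearing everything, which matches the two changed lines in the first patch.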
From 2a31249560845dd65ffcd20f6991b522af39821b Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <[email protected]>
Date: Sat, 24 May 2014 09:30:07 +0900
Subject: [PATCH 1/2] dfa: avoid to clear a transition table for initial state

If number of DFA states reaches at 1024, all transition tables are
cleared in build_state() in order to avoid out-of-memory.  However,
for initial state that shouldn't be done, because it's always used.

* src/dfa.c (build_state): Transition and failure tables for initial
state isn't cleared.
---
 src/dfa.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/dfa.c b/src/dfa.c
index 70dc046..792edf3 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -2850,16 +2850,18 @@ build_state (state_num s, struct dfa *d)
   /* Set an upper limit on the number of transition tables that will ever
      exist at once.  1024 is arbitrary.  The idea is that the frequently
      used transition tables will be quickly rebuilt, whereas the ones that
-     were only needed once or twice will be cleared away.  */
+     were only needed once or twice will be cleared away.  By the way,
+     transition table for initial state isn't cleared, because it's always
+     used.  */
   if (d->trcount >= 1024)
     {
-      for (i = 0; i < d->tralloc; ++i)
+      for (i = 1; i < d->tralloc; ++i)
         {
           free (d->trans[i]);
           free (d->fails[i]);
           d->trans[i] = d->fails[i] = NULL;
         }
-      d->trcount = 0;
+      d->trcount = 1;
     }
 
   ++d->trcount;
-- 
1.9.3

From b28bdf6f42229dabe3cbf0d649097230f729d1b8 Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <[email protected]>
Date: Tue, 27 May 2014 08:35:26 +0900
Subject: [PATCH 2/2] dfa: revert "grep: do not count newline before the start
 of buffer"

This reverts commit 5dc3af2806d21455b818be3f9da26c372e4a7f8d.

No longer it's needed by previous patch.
---
 src/dfa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/dfa.c b/src/dfa.c
index 5962374..6d43345 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -3398,7 +3398,7 @@ dfaexec (struct dfa *d, char const *begin, char *end,
           /* If the previous character was a newline, count it, and skip
              checking of multibyte character boundary until here.  */
-          if (p[-1] == eol && (char *) p != begin)
+          if (p[-1] == eol)
             {
               nlcount++;
               mbp = p;
-- 
1.9.3
