Norihiro,

Thank you for the patch.  It looks correct, but induces what looks like
unnecessary duplication.  Did you consider the attached variant?

Also, the affected code path seems not to be covered by any test.
Can you construct a test case that exercises this change?
From 1c8b562ef15a667eeca7ca5c11d26ba83888961e Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <[email protected]>
Date: Wed, 26 Mar 2014 08:56:50 -0700
Subject: [PATCH] grep: perform the kwset-helping DFA match in narrower range

When kwsexec gives us the offset of a potential match, we compute
line begin/end and then run the DFA matcher to see if there really
is a match on that line.  When the beginning of the line, BEG, is
not on a multibyte character boundary, advance BEG until it is on such
a boundary, before running the DFA search.
* src/dfasearch.c (EGexecute): As above.  Add a comment.
---
 src/dfasearch.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/dfasearch.c b/src/dfasearch.c
index 0b56960..2fa09fa 100644
--- a/src/dfasearch.c
+++ b/src/dfasearch.c
@@ -247,6 +247,11 @@ EGexecute (char const *buf, size_t size, size_t *match_size,
                       || !is_mb_middle (&mb_start, match, buflim,
                                         kwsm.size[0]))
                     goto success;
+
+                  /* The matched line starts in the middle of a multibyte
+                     character.  Advance BEG so that we start searching
+                     from the beginning of the next character.  */
+                  beg = mb_start;
                 }
               if (dfaexec (dfa, beg, (char *) end, 0, NULL, &backref) == NULL)
                 continue;
-- 
1.9.0.258.g00eda23
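
To illustrate the idea in the commit message (advance a pointer that
sits inside a multibyte character until it reaches a character
boundary), here is a minimal sketch.  It assumes UTF-8 input only and
is not the code grep uses: is_mb_middle is meant to work with the
current locale's encoding, whereas this sketch relies on the UTF-8
property that continuation bytes have the form 10xxxxxx.  The name
skip_to_utf8_boundary is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not part of grep.  Return the first UTF-8
   character boundary at or after P, never going past LIM.
   Assumes the buffer holds valid UTF-8.  */
static char const *
skip_to_utf8_boundary (char const *p, char const *lim)
{
  /* UTF-8 continuation bytes match the bit pattern 10xxxxxx.  */
  while (p < lim && ((unsigned char) *p & 0xC0) == 0x80)
    p++;
  return p;
}
```

Under that assumption, a caller in the position of EGexecute would
pass the candidate line start and the buffer limit, and hand the
returned pointer to the matcher in place of the original start.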
