On Thu, Oct 31, 2013 at 8:36 PM, Jim Meyering <[email protected]> wrote:
> On Thu, Oct 31, 2013 at 2:55 PM, Jim Meyering <[email protected]> wrote:
>> On Thu, Oct 31, 2013 at 10:46 AM, Mirraz Mirraz <[email protected]> wrote:
>>>
>>> After updating from 2.14 to 2.15 grep has started to fail to match patterns
>>> that contain '\s*' or '\s\+'
>>> For example:
>>>
>>> (grep-2.14)
>>> $ echo '[ ]' | grep '\s*'
>>> [ ]
>>> $
>>>
>>> (grep-2.15)
>>> $ echo '[ ]' | grep '\s*'
>>> $
>>
>> Thank you for the report.
>> That is clearly a regression.  That is now the most compelling (of 3)
>> reasons to make a new release.
>
> Here's a preliminary patch.
> I'm about to write the test suite additions to accompany it:

And here's a proper patch, including NEWS and test suite additions:
From 424cfed90e59013962cf2c1df2e4c1c97c6f9b5c Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Thu, 31 Oct 2013 20:20:30 -0700
Subject: [PATCH] grep: fix regression involving \s and \S

Commit v2.14-40-g01ec90b made \s and \S work with multibyte
characters, but it made it so any use like \s*, \s+, \s?, \s{3}
would malfunction.
* src/dfa.c (lex): Also reset laststart.
* tests/backslash-s-and-repetition-operators: New file.
* tests/Makefile.am (TESTS): Add it.
* NEWS (Bug fixes): Mention it.
Reported by Mirraz Mirraz in http://bugs.gnu.org/15773.
---
 NEWS                                       |  5 +++++
 src/dfa.c                                  |  1 +
 tests/Makefile.am                          |  1 +
 tests/backslash-s-and-repetition-operators | 28 ++++++++++++++++++++++++++++
 4 files changed, 35 insertions(+)
 create mode 100755 tests/backslash-s-and-repetition-operators

diff --git a/NEWS b/NEWS
index 9a8293b..5dd8796 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,11 @@ GNU grep NEWS                                    -*- outline -*-
   procedure resulted in a grep-2.15 tarball that would lead to a grep
   binary whose --version-reported version number was 2.14.51...

+  The fix to make \s and \S work with multi-byte white space broke
+  the use of each shortcut whenever followed by a repetition operator.
+  For example, \s*, \s+, \s? and \s{3} would all malfunction.
+  [bug introduced in grep-2.15]
+

 * Noteworthy changes in release 2.15 (2013-10-26) [stable]

diff --git a/src/dfa.c b/src/dfa.c
index de6c671..92c410e 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -1473,6 +1473,7 @@ lex (void)

           POP_LEX_STATE ();

+          laststart = 0;
           return lasttok;

         case 'w':
diff --git a/tests/Makefile.am b/tests/Makefile.am
index a64a2d2..970a9de 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -36,6 +36,7 @@ TESTS =                                               \
   backref                                      \
   backref-multibyte-slow                       \
   backref-word                                 \
+  backslash-s-and-repetition-operators         \
   backslash-s-vs-invalid-multitype             \
   big-hole                                     \
   big-match                                    \
diff --git a/tests/backslash-s-and-repetition-operators b/tests/backslash-s-and-repetition-operators
new file mode 100755
index 0000000..562646d
--- /dev/null
+++ b/tests/backslash-s-and-repetition-operators
@@ -0,0 +1,28 @@
+#! /bin/sh
+# Ensure that \s and \S work with repetition operators.
+#
+# Copyright (C) 2013 Free Software Foundation, Inc.
+#
+# Copying and distribution of this file, with or without modification,
+# are permitted in any medium without royalty provided the copyright
+# notice and this notice are preserved.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+printf ' \n' > in || framework_failure_
+
+fail=0
+
+for re in '\s\+' '\s*' '\s\?' '\s\{1\}'; do
+  grep "^$re\$" in > out || fail=1
+  compare in out || fail=1
+done
+
+printf 'X\n' > in || framework_failure_
+
+for re in '\S\+' '\S*' '\S\?' '\S\{1\}'; do
+  grep "^$re\$" in > out || fail=1
+  compare in out || fail=1
+done
+
+Exit $fail
-- 
1.8.4.2.564.g0d6cf24
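
To check an installed grep without the test-suite framework (init.sh, compare, framework_failure_), something like the following rough sketch may be enough. It is not part of the patch: the function name check_backslash_s is made up for illustration, and it covers only the \s half of the new test, using the same four patterns against a line holding a single space.

```shell
#!/bin/sh
# Hypothetical stand-alone check, not part of the patch above.
# Feeds a line containing one space to the grep found on PATH and tries
# \s followed by each repetition operator exercised by the new test.
check_backslash_s () {
  status=0
  for re in '\s\+' '\s*' '\s\?' '\s\{1\}'; do
    # Anchor the pattern so the whole line must match, as the test does.
    if printf ' \n' | grep -q "^$re\$"; then
      printf '%s\n' "ok:   $re"
    else
      printf '%s\n' "FAIL: $re"
      status=1
    fi
  done
  return $status
}

check_backslash_s
```

An affected grep should report FAIL for at least some of the patterns, while a grep with the fix should report ok for all four. Swapping the input for 'X' and \s for \S gives the other half of the test.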
