bug#19061: [PATCH] dfa: building superset, access to unallocated memory

2014-11-16 Thread Norihiro Tanaka
On Sun, 16 Nov 2014 07:48:34 -0800
Jim Meyering  wrote:
> Does some code assume that V->charclasses != NULL implies
> 0 < V->calloc? I would argue that such code is incorrect.  I.e.,
> in the degenerate case (calloc == 0), the code should not
> distinguish between a NULL charclasses member and one
> that points to a malloc'd buffer of length 0.

Thanks for the pushing.  I understood that you said.






bug#19061: [PATCH] dfa: building superset, access to unallocated memory

2014-11-16 Thread Jim Meyering
I have pushed that change and am preparing another snapshot.





bug#19061: [PATCH] dfa: building superset, access to unallocated memory

2014-11-16 Thread Jim Meyering
On Sun, Nov 16, 2014 at 1:18 AM, Norihiro Tanaka  wrote:
> On Sat, 15 Nov 2014 22:27:50 -0800
> Jim Meyering  wrote:
>> Thanks for confirming.
>> In that case, since I see no harm in calling xnmalloc with N = 0, I
>> will use a more conservative change: guard only the undefined use of
>> memcpy.
>> I've left your name on this amended patch.
>
> Thanks for the ajustment.  You are right, but the purpose of the code
> is to make a clone of original DFA.  If we do not guard xnmalloc, when
> calloc is 0, charclasses is NULL in original DFA, and it is *NOT* NULL
> in the superset.  I think that it is not right logically.

Does some code assume that V->charclasses != NULL implies
0 < V->calloc? I would argue that such code is incorrect.  I.e.,
in the degenerate case (calloc == 0), the code should not
distinguish between a NULL charclasses member and one
that points to a malloc'd buffer of length 0.





bug#19061: [PATCH] dfa: building superset, access to unallocated memory

2014-11-16 Thread Norihiro Tanaka
On Sat, 15 Nov 2014 22:27:50 -0800
Jim Meyering  wrote:
> Thanks for confirming.
> In that case, since I see no harm in calling xnmalloc with N = 0, I
> will use a more conservative change: guard only the undefined use of
> memcpy.
> I've left your name on this amended patch.

Thanks for the ajustment.  You are right, but the purpose of the code
is to make a clone of original DFA.  If we do not guard xnmalloc, when
calloc is 0, charclasses is NULL in original DFA, and it is *NOT* NULL
in the superset.  I think that it is not right logically.






bug#19061: [PATCH] dfa: building superset, access to unallocated memory

2014-11-15 Thread Jim Meyering
On Sat, Nov 15, 2014 at 5:06 PM, Norihiro Tanaka  wrote:
> On Sat, 15 Nov 2014 10:00:49 -0800
> Jim Meyering  wrote:
>> Thank you for the patch.
>> That seems like a fine change, but so far, I cannot see how
>> it avoids accessing uninitialized memory.
>> I do see that it fixes an error whereby memcpy was being
>> called with its 2nd argument NULL, though in each case,
>> the third argument is always 0.  Passing a NULL pointer as
>> the 2nd argument to memcpy is officially "undefined
>> behavior", and I confirmed that building with gcc and its
>> "undefined behavior sanitizer", the problem was exposed,
>> and that your patch fixes it.
>>
>> Do you know of a way to make grep crash, as stated in your
>> proposed NEWS entry?  If so, please give details.
>>
>> It is UB after all.  Perhaps you found a system whose memcpy
>> dereferences the source pointer even when the size is 0?
>
> Thanks for the review.
>
> I ran accross this problem when I made next improvement.  If size is 0,
> when dfa_charclass_index has been called, the crash was caused.  And If
> I fixed it, the crash was not caused.  So I think that it is a bug.
>
> However, I deleted the branch as the improvement was bad.  And I cannot
> see cause of the bug in the source code.  I seem that the code has no bug.
> Further more, I could not reproduce it, though I re-wrote a similar code
> to the branch.
>
> Possibly other changes which I made are bad, and it might cause a
> buffer-overrun and override memory range for characlasses in the branch.

Thanks for confirming.
In that case, since I see no harm in calling xnmalloc with N = 0, I
will use a more conservative change: guard only the undefined use of
memcpy.
I've left your name on this amended patch.


0001-dfa-avoid-undefined-behavior.patch
Description: Binary data


bug#19061: [PATCH] dfa: building superset, access to unallocated memory

2014-11-15 Thread Norihiro Tanaka
On Sat, 15 Nov 2014 10:00:49 -0800
Jim Meyering  wrote:
> Thank you for the patch.
> That seems like a fine change, but so far, I cannot see how
> it avoids accessing uninitialized memory.
> I do see that it fixes an error whereby memcpy was being
> called with its 2nd argument NULL, though in each case,
> the third argument is always 0.  Passing a NULL pointer as
> the 2nd argument to memcpy is officially "undefined
> behavior", and I confirmed that building with gcc and its
> "undefined behavior sanitizer", the problem was exposed,
> and that your patch fixes it.
> 
> Do you know of a way to make grep crash, as stated in your
> proposed NEWS entry?  If so, please give details.
> 
> It is UB after all.  Perhaps you found a system whose memcpy
> dereferences the source pointer even when the size is 0?

Thanks for the review.

I ran accross this problem when I made next improvement.  If size is 0,
when dfa_charclass_index has been called, the crash was caused.  And If
I fixed it, the crash was not caused.  So I think that it is a bug.

However, I deleted the branch as the improvement was bad.  And I cannot
see cause of the bug in the source code.  I seem that the code has no bug.
Further more, I could not reproduce it, though I re-wrote a similar code
to the branch.

Possibly other changes which I made are bad, and it might cause a
buffer-overrun and override memory range for characlasses in the branch.






bug#19061: [PATCH] dfa: building superset, access to unallocated memory

2014-11-15 Thread Paul Eggert

Jim Meyering wrote:

Perhaps you found a system whose memcpy
dereferences the source pointer even when the size is 0?


That seems pretty unlikely, on a flat-address-space machine.  And grep already 
assumes a flat address space, since gnulib does.






bug#19061: [PATCH] dfa: building superset, access to unallocated memory

2014-11-15 Thread Jim Meyering
On Sat, Nov 15, 2014 at 1:11 AM, Norihiro Tanaka  wrote:
> If original DFA does not have any CSETs, no memory allocated for CSET.
> Even then DFA try to copy CSET from original DFA to the superset.  As
> a result, it is caused to access to unallocated memory.  We have no test
> case so that it is very difficult that we always reproduce this bug, as
> CSET may be added only one in building superset.

Thank you for the patch.
That seems like a fine change, but so far, I cannot see how
it avoids accessing uninitialized memory.
I do see that it fixes an error whereby memcpy was being
called with its 2nd argument NULL, though in each case,
the third argument is always 0.  Passing a NULL pointer as
the 2nd argument to memcpy is officially "undefined
behavior", and I confirmed that building with gcc and its
"undefined behavior sanitizer", the problem was exposed,
and that your patch fixes it.

Do you know of a way to make grep crash, as stated in your
proposed NEWS entry?  If so, please give details.

It is UB after all.  Perhaps you found a system whose memcpy
dereferences the source pointer even when the size is 0?





bug#19061: [PATCH] dfa: building superset, access to unallocated memory

2014-11-15 Thread Norihiro Tanaka
If original DFA does not have any CSETs, no memory allocated for CSET.
Even then DFA try to copy CSET from original DFA to the superset.  As
a result, it is caused to access to unallocated memory.  We have no test
case so that it is very difficult that we always reproduce this bug, as
CSET may be added only one in building superset.
From 6b99a4abd6f969ef17710b0f3ea16e4b0e4ef273 Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka 
Date: Sat, 15 Nov 2014 17:13:10 +0900
Subject: [PATCH] dfa: building superset, access to unallocated memory

If original DFA does not have any CSETs, no memory allocated for CSET.
Even then DFA try to copy CSET from original DFA to the superset.  As
a result, it is caused to access to unallocated memory.  We have no test
case so that it is very difficult that we always reproduce this bug, as
CSET may be added only one in building superset.

* src/dfa.c (dfassbuild): Change so that when orignal DFA does not have
any CSETs, do not copy it.
* NEWS (Bug fixes): Mention it.
---
 NEWS  | 4 
 src/dfa.c | 9 ++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/NEWS b/NEWS
index c465162..ffe0f44 100644
--- a/NEWS
+++ b/NEWS
@@ -45,6 +45,10 @@ GNU grep NEWS-*- outline 
-*-
   of a multibyte character when using a '^'-anchored alternate in a pattern,
   leading it to print non-matching lines.  [bug present since "the beginning"]
 
+  grep no longer crashes for patterns that contain period, bracket expression,
+  back reference, etc.
+  [bug introduced in grep-2.19]
+
   grep -E rejected unmatched ')', instead of treating it like '\)'.
   [bug present since "the beginning"]
 
diff --git a/src/dfa.c b/src/dfa.c
index e0fc120..d9ef652 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -3659,9 +3659,12 @@ dfassbuild (struct dfa *d)
   sup->newlines = NULL;
   sup->musts = NULL;
 
-  sup->charclasses = xnmalloc (sup->calloc, sizeof *sup->charclasses);
-  memcpy (sup->charclasses, d->charclasses,
-  d->cindex * sizeof *sup->charclasses);
+  if (sup->calloc > 0)
+{
+  sup->charclasses = xnmalloc (sup->calloc, sizeof *sup->charclasses);
+  memcpy (sup->charclasses, d->charclasses,
+  d->cindex * sizeof *sup->charclasses);
+}
 
   sup->tokens = xnmalloc (d->tindex, 2 * sizeof *sup->tokens);
   sup->talloc = d->tindex * 2;
-- 
2.1.3