On Thu, Apr 17, 2014 at 10:59 PM, Paul Eggert <[email protected]> wrote: > I merged that patch into the grep trunk and am closing this bug report. > While merging, I removed one part of the patch that had a small but > measurable performance degradation, namely, the introduction of struct > dfatab. I found some other minor simplifications and threw them in. There > seems to be no significant change in CPU performance. The patch's causes > the dfa module's text size to shrink by 8% on my platform (x86-64 Fedora > 20), which I suppose is a small win, but the main point of the patch is > simplifying the source. I'm attaching what I installed, broken into 12 > little patches to ease review.
Running the tests with ASAN-enabled grep exposed a buffer overrun that was introduced by the "simplify range char allocation" one. I've fixed it with the attached patch:
From 9ee100985c55da7d2568b06492ed3bee025b2bb9 Mon Sep 17 00:00:00 2001 From: Jim Meyering <[email protected]> Date: Thu, 24 Apr 2014 09:03:39 -0700 Subject: [PATCH] grep: fix new heap write buffer overrun * src/dfa.c (parse_bracket_exp): Fix off-by-one allocation error. Exposed by running the tests with an ASAN-enabled binary (i.e., created using gcc's -fsanitize=address option). Here's the stack trace you get with each abort: ==74010==ERROR: AddressSanitizer: heap-buffer-overflow on address\ 0x60200000c898 at pc 0x106e2cbe9 bp 0x7fff58ddec40 sp 0x7fff58ddec38 WRITE of size 4 at 0x60200000c898 thread T0 #0 0x106e2cbe8 in parse_bracket_exp dfa.c:1137 #1 0x106e2eb34 in lex dfa.c:1523 #2 0x106e30510 in dfaparse dfa.c:1943 #3 0x106e3e24d in dfacomp dfa.c:3547 #4 0x106e439ee in GEAcompile dfasearch.c:199 #5 0x106e2499a in Gcompile grep.c:1648 #6 0x106e27ca8 in main grep.c:2339 Introduced by commit v2.18-70-gd3d9612, "dfa: simplify range char allocation". --- src/dfa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/dfa.c b/src/dfa.c index 5dc0f09..24b4d5c 100644 --- a/src/dfa.c +++ b/src/dfa.c @@ -1125,7 +1125,7 @@ parse_bracket_exp (void) is wrong in multiple ways, it's never used in practice. FIXME: Remove this (and related) unused code. */ work_mbc->ranges - = maybe_realloc (work_mbc->ranges, work_mbc->nranges + 1, + = maybe_realloc (work_mbc->ranges, work_mbc->nranges + 2, &ranges_al, sizeof *work_mbc->ranges); work_mbc->ranges[work_mbc->nranges].beg = case_fold ? towlower (wc) : wc; -- 1.9.2.459.g68773ac
