On Wed, Jan 29, 2014 at 1:43 AM, Santiago <[email protected]> wrote:
> Package: grep
> Version: 2.16
> Severity: important
>
> Hi there,
>
> I forward this bug from debian's BTS. Last changes in -P brought another
> problem. I've confirmed this behavior on last debian package:
>
> ----- Forwarded message from Vincent Lefevre <[email protected]> -----
>
> [snip]
>
>
> grep -P loops on some files with invalid UTF-8 sequences, e.g.
>
> $ /usr/bin/printf "\xe9\x65\n\xab\n" | grep -P '.e|.?z' | head
> �e
> �e
> �e
> �e
> �e
> �e
> �e
> �e
> �e
> �e
>
> (the infinite loop is interrupted here by a broken pipe due to
> the "head").
>
> It seems that the fix of
>
>   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=730472

Thanks for the heads-up.  That appears to be a problem with pcre.
I've just built grep (git head) against pcre (git head) with gcc's
address sanitizer enabled, and adjusted your example slightly.
Now libpcre segfaults internally:

$ printf "\xe9\n\xab\n" > k; src/grep -P 'e|.?z' k
ASAN:SIGSEGV
=================================================================
==11821==ERROR: AddressSanitizer: SEGV on unknown address 0x62cfffffffff (pc 0x0000004f0743 sp 0x7fff6b32f4a0 bp 0x7fff6b32f760 T0)
    #0 0x4f0742 in match /w/co/pcre/pcre_exec.c:5943
    #1 0x4f26d5 in pcre_exec /w/co/pcre/pcre_exec.c:6941
    #2 0x46f421 in Pexecute /w/co/grep/src/pcresearch.c:178
    #3 0x4717a3 in do_execute /w/co/grep/src/main.c:1075
    #4 0x4717a3 in grepbuf /w/co/grep/src/main.c:1111
    #5 0x472249 in grep /w/co/grep/src/main.c:1222
    #6 0x472249 in grepdesc /w/co/grep/src/main.c:1476
    #7 0x4073ca in main /w/co/grep/src/main.c:2396
    #8 0x7f6f21a53cdc in __libc_start_main (/lib64/libc.so.6+0x1ecdc)
    #9 0x408a54 (/w/u/w/co/grep/src/grep+0x408a54)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /w/co/pcre/pcre_exec.c:5943 match
==11821==ABORTING

Sorry, but I don't have time to debug further.  A quick glance suggests
that match() is backing eptr up too far, apparently to an address below
the start of the subject:

(gdb) b __asan_report_error
Breakpoint 1 at 0x448c40: file ../../.././libsanitizer/asan/asan_report.cc, line 711.
(gdb) r
Starting program: /w/u/w/co/grep/src/grep -P e\|.\?z k
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00000000004f0743 in match (eptr=0x62cfffffffff "", ecode=0x60700000df8a "\035zx",
    mstart=0x62d00000b002 "\253\n", '\276' <repeats 198 times>..., offset_top=2, md=0x7fffffffce30, eptrb=0x0, rdepth=0)
    at pcre_exec.c:5943
5943              BACKCHAR(eptr);
(gdb) l
5938              {
5939              if (eptr == pp) goto TAIL_RECURSE;
5940              RMATCH(eptr, ecode, offset_top, md, eptrb, RM46);
5941              if (rrc != MATCH_NOMATCH) RRETURN(rrc);
5942              eptr--;
5943              BACKCHAR(eptr);
5944              if (ctype == OP_ANYNL && eptr > pp  && UCHAR21(eptr) == CHAR_NL &&
5945                  UCHAR21(eptr - 1) == CHAR_CR) eptr--;
5946              }
5947            }
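
For illustration, BACKCHAR in UTF-8 mode steps left for as long as the
current byte looks like a continuation byte (10xxxxxx).  The following
is a minimal sketch, not PCRE's code: it assumes that loop shape, and
the names IS_UTF8_CONT and backup_offset are made up for this example.
It adds the lower bound that an unchecked loop lacks, so a subject that
begins with a stray continuation byte such as \xab cannot pull the
position below the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* True if b has the form 10xxxxxx, i.e. a UTF-8 continuation byte. */
#define IS_UTF8_CONT(b) (((b) & 0xc0) == 0x80)

/*
 * Hypothetical bounded back-up step (not a PCRE function).
 * Starting at buf[pos], move left over continuation bytes, but never
 * below index 0.  An unchecked version of this loop would keep reading
 * buf[-1], buf[-2], ... whenever the subject starts with a stray
 * continuation byte, which is possible only for invalid UTF-8.
 * Returns the index where the scan stops.
 */
static size_t
backup_offset(const unsigned char *buf, size_t len, size_t pos)
{
    assert(pos < len);              /* caller must pass a valid index */

    while (pos > 0 && IS_UTF8_CONT(buf[pos]))
        pos--;                      /* step left, bounded by the start */

    return pos;
}
```

With the subject "\xab\n" and pos 0, backup_offset returns 0, whereas
an unchecked loop would go on to read the byte before the buffer.  With
the valid input "a\xc3\xa9" and pos 2, it returns 1, the lead byte of
the two-byte character.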


