https://bugs.exim.org/show_bug.cgi?id=1803
--- Comment #22 from Nish Aravamudan <[email protected]> ---
(In reply to Zoltan Herczeg from comment #21)
> Another idea just came to my mind.
>
> It seems that all patterns are compiled by pcre_compile here:
>
> https://github.com/php/php-src/blob/master/ext/pcre/php_pcre.c#L433
>
> Would it be possible to dump all regex compilation to some file after this
> call?
>
> E.g.
>
>     re = pcre_compile(pattern,
>                       coptions,
>                       &error,
>                       &erroffset,
>                       tables);
>
>     FILE *f = fopen("dump_file", "a"); // appending at the end
>     fprintf(f, "/%s/ 0x%x -> %p\n", pattern, coptions, re);
>     fclose(f);

Recompiling PHP7.0 is quite slow, so I tried doing this with gdb...

> It would be easy to find the offending pattern from this list. Just find the
> latest entry which has the same address as pce->re.

I set a breakpoint at:

ext/pcre/php_pcre.c:1720

or so, which is the pce->refcount++ in (zif_)preg_split:

Ignored the first 232 hits of it, and on the last one:

Breakpoint 4, zif_preg_split (execute_data=<optimized out>, return_value=0x7ffff381b240)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1720
1720            pce->refcount++;
(gdb) print pce
$48 = (pcre_cache_entry *) 0x555555d333f0
(gdb) print subject->val@10
$47 = {"\303", "\251", "\303", "\204", "\303", "\237", "\343", "\201", "\224", "a"}
(gdb) print regex->val@14
$58 = {"/", "(", "?", "<", "!", "^", ")", "(", "?", "!", "$", ")", "/", "u"}
(gdb) print &pce->re
$60 = (pcre **) 0x555555d333f0
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
__memcpy_avx_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S:271
271     ../sysdeps/x86_64/multiarch/memcpy-avx-unaligned.S: No such file or directory.
(gdb) up
#1  0x00005555556798d8 in memcpy (__len=18446744073709551614, __src=0x7fffed40b1ac, __dest=0x7fffed7a6348)
    at /usr/include/x86_64-linux-gnu/bits/string3.h:53
53        return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
(gdb) up
#2  zend_string_init (persistent=0, len=18446744073709551614, str=0x7fffed40b1ac "\303\237\343\201\224a")
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/Zend/zend_string.h:159
159             memcpy(ZSTR_VAL(ret), str, len);
(gdb) up
#3  php_pcre_split_impl (pce=0x555555d333f0, subject=0x7fffed40b1a8 "\303\251\303\204\303\237\343\201\224a",
    subject_len=10, return_value=0x7ffff381b240, limit_val=-1, flags=<optimized out>)
    at /build/php7.0-WHFaJZ/php7.0-7.0.3/ext/pcre/php_pcre.c:1808
1808            ZVAL_STRINGL(&tmp, last_match, &subject[offsets[0]]-last_match);
(gdb) print pce
$61 = (pcre_cache_entry *) 0x555555d333f0
(gdb) print &pce->re
$62 = (pcre **) 0x555555d333f0

So the regex in question, I think, is:

/(?<!^)(?!$)/u

which does correspond to the output I got from the above printf in gdb.

Does that help narrow down where the bug might be? Do you still want me to do
the control flow analysis?
