Edit report at http://bugs.php.net/bug.php?id=48153&edit=1
ID: 48153
Comment by: bit2 at freemail dot hu
Reported by: raul dot gigea at directmedia dot de
Summary: preg_replace() crashes in function "match"
Status: Bogus
Type: Bug
Package: PCRE related
Operating System: FreeBSD 7.1-RELEASE-p4
PHP Version: 5.2CVS-2009-05-06 (snap)
New Comment:
I experienced this bug with Debian 5.0.4 (2.6.26-2-686 #1 SMP i686), PHP
5.2.6-1+lenny8, PCRE 7.6 2008-01-28, using mod_fcgid
(libapache2-mod-fcgid 2.2-1) and the default stack size of 8k.
Raul's sample code segfaults for me only once the input reaches about
3300 characters. I've simplified the code a bit further. The following
segfaults for me every time I run it:
$contents = str_repeat('a', 3396);
$contents = preg_replace('/(.)*/', '', $contents);
Playing with stack size or pcre limits (recursion and/or backtrack)
works around the problem, just as Raul described.
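A minimal sketch of the limits workaround, for anyone landing here. It assumes that lowering pcre.recursion_limit makes PCRE give up before the stack runs out, so preg_replace() returns NULL instead of the process crashing. The value 1000 and the helper name replace_or_report() are illustrative assumptions, not taken from this report; whether the limit is hit at all depends on the PCRE build.

```php
<?php
// Hypothetical helper (not part of PHP): runs preg_replace() and reports
// which PCRE limit, if any, stopped the match.
function replace_or_report($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);
    $code = preg_last_error();

    if ($code === PREG_RECURSION_LIMIT_ERROR) {
        $reason = 'pcre.recursion_limit reached';
    } elseif ($code === PREG_BACKTRACK_LIMIT_ERROR) {
        $reason = 'pcre.backtrack_limit reached';
    } elseif ($code !== PREG_NO_ERROR) {
        $reason = 'other PCRE error (code ' . $code . ')';
    } else {
        $reason = null;
    }

    // $result is NULL whenever preg_replace() reported an error.
    return array($result, $reason);
}

// Assumption: 1000 is only an example value; tune it to the real stack size.
ini_set('pcre.recursion_limit', '1000');

$contents = str_repeat('a', 3396);
list($result, $reason) = replace_or_report('/(.)*/', '', $contents);

if ($reason !== null) {
    echo "preg_replace() failed: $reason\n";
} else {
    echo "preg_replace() succeeded, ", strlen($result), " bytes left\n";
}
```

The segfault itself cannot be caught from PHP code, so the point of the sketch is only to turn a crash into a NULL return value that can be checked with preg_last_error().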
I could live with this "limitation" (or bug ... whatever) if PHP didn't
just segfault, but instead raised an error describing what happened. A
bare segfault doesn't help much, and I spent a few hours before I got to
the root of the problem. :-(
PS: if I understand it right, the segfault happens because the 8K stack
fills up. This is probably because PCRE stores every match of a
parenthesized sub-pattern in a new string. In our example every
character of the input string matches that parenthesized pattern, so
a new string is created on the stack for every character (each taking
up 2 bytes). The 8K limit is reached because a few other things were
already on the stack when the PCRE function started.
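If the repeated parenthesized sub-pattern really is what eats the stack, as guessed above, another thing to try is taking the group out of the repeat. The sketch below uses Raul's input from the earlier comments and rewrites /d(a)+b/ as /da+b/. That this avoids the crash on an affected PCRE build is an assumption that would need testing there; the rewrite is only safe here because the replacement string never uses the captured group.

```php
<?php
// Same input as Raul's example from the earlier comments.
$contents = 'sud' . str_repeat('a', 30000) . 'bccess';

// Original pattern: /d(a)+b/  (capturing group repeated once per character).
// Rewritten without the group; the replacement is '' and never refers to
// the capture, so the text that gets removed is the same.
$rewritten = preg_replace('/da+b/', '', $contents);

echo $rewritten, "\n";
```

When the match succeeds, the d, the run of a's and the b are removed, leaving "success".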
Previous Comments:
------------------------------------------------------------------------
[2009-05-06 23:37:24] raul dot gigea at directmedia dot de
By the way, with more than 30000 a's it segfaults with 5.3 too.
$contents = 'sud' . str_repeat('a', 30000) . 'bccess';
$contents = preg_replace('/d(a)+b/', '', $contents);
echo $contents;
------------------------------------------------------------------------
[2009-05-06 23:23:36] raul dot gigea at directmedia dot de
OK, tried it on OS X. It doesn't crash with 1900 a's, but it crashes
with 2900.
Try this code on OS X with 5.2CVS; it crashes on my MacBook, with the
latest libmhash from MacPorts as of today (0.9.9.9_0):
$contents = 'sud' . str_repeat('a', 2900) . 'bccess';
$contents = preg_replace('/d(a)+b/', '', $contents);
echo $contents;
------------------------------------------------------------------------
[2009-05-06 22:54:53] raul dot gigea at directmedia dot de
Just tried it: with the 5.3 snapshot it works.
Compiling 5.2 on OSX right now
------------------------------------------------------------------------
[2009-05-06 22:30:22] [email protected]
The mhash library is gone in 5.3 and replaced with a wrapper around the
hash library.
Can you try a 5.3 snapshot and see if you get the issue?
I should say I can't reproduce this on 5.2 on OS X with the same
configure line that Jani used.
------------------------------------------------------------------------
[2009-05-06 22:18:19] raul dot gigea at directmedia dot de
By the way, the mhash lib version is 0.9.9.
------------------------------------------------------------------------
The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
http://bugs.php.net/bug.php?id=48153