Edit report at http://bugs.php.net/bug.php?id=52929&edit=1

 ID:                 52929
 Updated by:         ahar...@php.net
 Reported by:        neufe...@php.net
 Summary:            Segfault in filter_var with FILTER_VALIDATE_EMAIL with large amount of data
-Status:             Open
+Status:             Assigned
 Type:               Bug
 Package:            Filter related
 PHP Version:        5.3.3
-Assigned To:
+Assigned To:        aharvey
 Block user comment: N

 New Comment:

Fair call; I'll prosecute the argument for NO_RECURSE elsewhere!

The limit on address length is 320 octets per RFC 2821 (64 octet local-part + 1 octet "@" + 255 octet domain), so we may as well set the limit there for now. (If RFC 5336 becomes widespread, that may need to be revisited, but let's cross that bridge when we come to it.) Any system that's so stack constrained for that to be an issue is likely to have other problems anyway. :)

Fix for 5.3 and trunk forthcoming, just as soon as I write a test.

Previous Comments:
------------------------------------------------------------------------
[2010-09-27 07:24:16] ras...@php.net

Perhaps a simple pre-filter before we hit the regex. You can't actually have an 8k email address. There are length limits both before and after the @.

------------------------------------------------------------------------
[2010-09-27 05:21:50] ahar...@php.net

I hate you, Chrome.

Anyway, as I was saying, I'm not terribly comfortable closing this, since it's likely sites will actually be passing user data straight to filter_var(). I mean, that's what it's there for.

Is it worth revisiting the decision to compile our bundled libpcre in its default stack recursive mode? I know NO_RECURSE is slower, but I'm nervous about potential remote crashers.

------------------------------------------------------------------------
[2010-09-27 05:19:58] ahar...@php.net

This is the normal issue with heavily nested regular expressions exhausting the available stack size. I can upload a backtrace if there's a sudden desire to see several thousand recursive invocations of PCRE's match function.
:)

I'm not really comfortable closing this, even though we normally just close preg_replace

------------------------------------------------------------------------
[2010-09-27 02:38:06] neufe...@php.net

Looking at the source at http://svn.php.net/viewvc/php/php-src/trunk/ext/filter/logical_filters.c?view=markup I wonder if the problem itself might be in the pcre-lib used since the email-validation itself is PCRE-based?

Fedora Linux here ships with PCRE 7.8.

------------------------------------------------------------------------
[2010-09-27 02:09:24] neufe...@php.net

Description:
------------
Using the attached test-script with just a large amount of data (e.g. 8kb of just "x") segfaults php. Tried with 5.3.3 (Fedora) and also some 5.3.4-snapshot that I could get hold of.

Crashed for me with around 8kb of data. If it works fine for you, maybe increase that limit to 16kb or so.

Test script:
---------------
<?php
$email = file_get_contents('x.data');
$r = filter_var($email, FILTER_VALIDATE_EMAIL);
var_dump($r);

// and just dump a large number of characters like "x" in x.data
// for a in `seq 1 8000`; do echo -n x>>x.data; done

Expected result:
----------------
bool(false)

Actual result:
--------------
segfault

------------------------------------------------------------------------
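
Until a fixed build is available, the pre-filter idea from the comments above can be approximated in userland. The following is only a sketch under stated assumptions, not the patch that will land in ext/filter: the function name validate_email_bounded() and the three constants are hypothetical, and the limits are simply the figures quoted in the thread (64 octet local-part, 255 octet domain, 320 octets in total). It deliberately avoids type hints so that it also runs on 5.3.

```php
<?php
// Hypothetical constants; values are the limits quoted in the thread.
const MAX_LOCAL_PART = 64;   // octets before the "@"
const MAX_DOMAIN     = 255;  // octets after the "@"
const MAX_ADDRESS    = 320;  // 64 + 1 + 255

// Hypothetical wrapper: reject oversized input before the
// PCRE-based validator ever sees it.
function validate_email_bounded($email)
{
    $len = strlen($email);
    if ($len > MAX_ADDRESS) {
        return false;
    }

    // Split on the last "@" and check each side separately.
    $at = strrpos($email, '@');
    if ($at === false) {
        return false;
    }
    if ($at > MAX_LOCAL_PART || ($len - $at - 1) > MAX_DOMAIN) {
        return false;
    }

    // Input is now at most 320 octets; hand it to the real filter.
    return filter_var($email, FILTER_VALIDATE_EMAIL);
}

// The 8000-character input from the test script is rejected by the
// length check alone, without reaching filter_var().
var_dump(validate_email_bounded(str_repeat('x', 8000))); // bool(false)
```

The point of the design is that strlen() and strrpos() run in constant stack space, so the regex only ever sees input of bounded length. Whether 320 octets keeps the recursion shallow enough on a given system depends on the pattern and the available stack, which is the trade-off already noted in the newest comment.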