Edit report at https://bugs.php.net/bug.php?id=60423&edit=1

 ID:                 60423
 User updated by:    amal dot samally at gmail dot com
 Reported by:        amal dot samally at gmail dot com
 Summary:            Segmentation fault with the UTF-8 check regexp in
                     some cases
-Status:             Feedback
+Status:             Open
 Type:               Bug
 Package:            PCRE related
 Operating System:   Linux
 PHP Version:        5.3.8
 Block user comment: N
 Private report:     N

 New Comment:

gdb output:

(gdb) run test.php
Starting program: /usr/local/bin/php test.php
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x0000000000498948 in match (
    eptr=0x139a566 "1{font-weight:bold}#gbg6.gbgt-
hvr,#gbg6.gbgt:focus{background-color:transparent;background-
image:none}.gbg4a{font-size:0;line-height:0}.gbg4a .gbts{padding:27px 5px 
0;*padding:25px 5px 0}.gbto .gbg4a "..., ecode=0x13d9525 "^", 
    mstart=0x13990f0 "<!doctype html> <head>      <title>docs.pravo.ru - 
Поиск в 
Google</title>   <script>window.google={kEI:\"hTnXTp-
POZDqOabAzMYO\",getEI:function(a){var b;while(a&&!
(a.getAttribute&&(b=a.getAttribute"..., markptr=0x0, offset_top=2, 
md=0x7fffffffb340, ims=0, eptrb=Cannot access memory at address 0x7fffff7feff8
)
    at /tmp/php_build/php-5.3.8/ext/pcre/pcrelib/pcre_exec.c:471
471     {
(gdb) bt
#0  0x0000000000498948 in match (
    eptr=0x139a566 "1{font-weight:bold}#gbg6.gbgt-
hvr,#gbg6.gbgt:focus{background-color:transparent;background-
image:none}.gbg4a{font-size:0;line-height:0}.gbg4a .gbts{padding:27px 5px 
0;*padding:25px 5px 0}.gbto .gbg4a "..., ecode=0x13d9525 "^", 
    mstart=0x13990f0 "<!doctype html> <head>      <title>docs.pravo.ru - 
Поиск в 
Google</title>   <script>window.google={kEI:\"hTnXTp-
POZDqOabAzMYO\",getEI:function(a){var b;while(a&&!
(a.getAttribute&&(b=a.getAttribute"..., markptr=0x0, offset_top=2, 
md=0x7fffffffb340, ims=0, eptrb=Cannot access memory at address 0x7fffff7feff8
)
    at /tmp/php_build/php-5.3.8/ext/pcre/pcrelib/pcre_exec.c:471
#1  0x000000000049b352 in match (
    eptr=0x139a566 "1{font-weight:bold}#gbg6.gbgt-
hvr,#gbg6.gbgt:focus{background-color:transparent;background-
image:none}.gbg4a{font-size:0;line-height:0}.gbg4a .gbts{padding:27px 5px 
0;*padding:25px 5px 0}.gbto .gbg4a "..., ecode=0x13d9748 "V\002#\033U\002,", 
    mstart=0x13990f0 "<!doctype html> <head>      <title>docs.pravo.ru - 
Поиск в 
Google</title>   <script>window.google={kEI:\"hTnXTp-
POZDqOabAzMYO\",getEI:function(a){var b;while(a&&!
(a.getAttribute&&(b=a.getAttribute"..., markptr=0x0, offset_top=2, 
md=0x7fffffffb340, ims=0, eptrb=0x0, flags=0, rdepth=10464)
    at /tmp/php_build/php-5.3.8/ext/pcre/pcrelib/pcre_exec.c:1654
#2  0x00000000004994e0 in match (
    eptr=0x139a565 "s1{font-weight:bold}#gbg6.gbgt-
hvr,#gbg6.gbgt:focus{background-color:transparent;background-
image:none}.gbg4a{font-size:0;line-height:0}.gbg4a .gbts{padding:27px 5px 
0;*padding:25px 5px 0}.gbto .gbg4a"..., ecode=0x13d9525 "^", 
    mstart=0x13990f0 "<!doctype html> <head>      <title>docs.pravo.ru - 
Поиск в 
Google</title>   <script>window.google={kEI:\"hTnXTp-
POZDqOabAzMYO\",getEI:function(a){var b;while(a&&!
(a.getAttribute&&(b=a.getAttribute"..., markptr=0x0, offset_top=2, 
md=0x7fffffffb340, ims=0, eptrb=0x0, flags=0, rdepth=10463)
    at /tmp/php_build/php-5.3.8/ext/pcre/pcrelib/pcre_exec.c:885
#3  0x000000000049b352 in match (
    eptr=0x139a565 "s1{font-weight:bold}#gbg6.gbgt-
hvr,#gbg6.gbgt:focus{background-color:transparent;background-
image:none}.gbg4a{font-size:0;line-height:0}.gbg4a .gbts{padding:27px 5px 
0;*padding:25px 5px 0}.gbto .gbg4a"..., ecode=0x13d9748 "V\002#\033U\002,", 
    mstart=0x13990f0 "<!doctype html> <head>      <title>docs.pravo.ru - 
Поиск в 
Google</title>   <s---Type <return> to continue, or q <return> to quit---


Previous Comments:
------------------------------------------------------------------------
[2011-12-01 14:46:10] larue...@php.net

Thank you for this bug report. To properly diagnose the problem, we
need a backtrace to see what is happening behind the scenes. To
find out how to generate a backtrace, please read
http://bugs.php.net/bugs-generating-backtrace.php for *NIX and
http://bugs.php.net/bugs-generating-backtrace-win32.php for Win32

Once you have generated a backtrace, please submit it to this bug
report and change the status back to "Open". Thank you for helping
us make PHP better.



------------------------------------------------------------------------
[2011-12-01 10:57:33] amal dot samally at gmail dot com

I don't think so.
Also, changing pcre.backtrack_limit / pcre.recursion_limit makes no difference.
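
For reference, a minimal sketch of how these two limits can be changed at runtime and how the PCRE error code can be inspected afterwards. The helper name preg_match_with_limits() and the limit values in the usage note are illustrative assumptions, not the ones tried in this report. A false result together with PREG_RECURSION_LIMIT_ERROR or PREG_BACKTRACK_LIMIT_ERROR would mean a limit was hit before the match finished.

```php
<?php
// Hypothetical helper (not part of the original report): runs preg_match()
// under explicit PCRE limits and returns the result with the PCRE error code.
function preg_match_with_limits($pattern, $subject, $backtrackLimit, $recursionLimit)
{
    // ini_set() returns the previous value as a string, or false on failure.
    $oldBacktrack = ini_set('pcre.backtrack_limit', (string) $backtrackLimit);
    $oldRecursion = ini_set('pcre.recursion_limit', (string) $recursionLimit);

    $result = preg_match($pattern, $subject);
    $error  = preg_last_error();

    // Restore the previous settings so later calls are unaffected.
    if ($oldBacktrack !== false) {
        ini_set('pcre.backtrack_limit', $oldBacktrack);
    }
    if ($oldRecursion !== false) {
        ini_set('pcre.recursion_limit', $oldRecursion);
    }

    return array('result' => $result, 'error' => $error);
}
```

Example call with assumed values: preg_match_with_limits('~^(?:[\x20-\x7E])*$~D', 'hello', 1000000, 1000) should return a result of 1 and PREG_NO_ERROR for such a short subject.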

------------------------------------------------------------------------
[2011-12-01 10:10:52] larue...@php.net

See #41638; this may be the same issue.

------------------------------------------------------------------------
[2011-12-01 09:04:37] amal dot samally at gmail dot com

Description:
------------
I'm using a regexp to test whether a string is valid UTF-8.
In some cases it causes a segmentation fault.

Examples of strings that cause the error:
http://samally.ru/php_pcre_segmentation_fault/test1.txt
http://samally.ru/php_pcre_segmentation_fault/test2.txt

Test script:
---------------
$string = file_get_contents('http://samally.ru/php_pcre_segmentation_fault/test1.txt');
// $string = file_get_contents('http://samally.ru/php_pcre_segmentation_fault/test2.txt');

// Tests whether a string is a valid UTF-8 encoded string.
// @link http://w3.org/International/questions/qa-forms-utf-8.html
$r = preg_match('~^(?:
   [\x09\x0A\x0D\x20-\x7E]            # ASCII without control characters
 | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
 | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
 | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
 | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
 | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
 | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
 | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
)*$~DSXx', $string);
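
A possible workaround sketch, assuming the goal is only to validate the encoding: with the /u modifier PCRE checks the subject for valid UTF-8 before matching, so an empty pattern is enough and no repeated group is needed. The function names is_valid_utf8() and is_valid_utf8_text() are hypothetical. The second function approximates the ASCII restriction of the pattern above (only tab, LF, CR and \x20-\x7E allowed) with a single character class; it may not agree with the W3C pattern in every edge case, depending on the PCRE version.

```php
<?php
// Hypothetical helper: true if $string is valid UTF-8 according to PCRE.
// With /u, preg_match() returns false on invalid UTF-8 in the subject.
function is_valid_utf8($string)
{
    return preg_match('//u', $string) === 1;
}

// Hypothetical helper: valid UTF-8 and no ASCII control characters other
// than tab, LF and CR (roughly what the W3C pattern accepts).
function is_valid_utf8_text($string)
{
    if (!is_valid_utf8($string)) {
        return false;
    }
    // Byte-oriented search for a single disallowed ASCII control byte.
    return preg_match('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', $string) === 0;
}
```

For invalid input, preg_last_error() reports PREG_BAD_UTF8_ERROR after the call in is_valid_utf8().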



------------------------------------------------------------------------


