#50264 [Bgs]: Possible pcre memory leak
ID: 50264
User updated by: laszlo dot janszky at gmail dot com
Reported By: laszlo dot janszky at gmail dot com
Status: Bogus
Bug Type: PCRE related
Operating System: Windows XP
PHP Version: 5.3.1

New Comment:

OK. Finally, here is my backtrack memory tester. If it's not a bug, then bye bye.

1) {
    $c=(int)(($a+$b)/2);
    ini_set('pcre.backtrack_limit', $c);
    if (preg_match_all($pattern,$_POST['test'],$m,PREG_SET_ORDER)) {
        $b=$c;
    } else {
        $a=$c;
    }
}
}
?>
Memory consumed: .kb.
Memory consumption exceeded the allowed quota, or the pattern does not match.
The tested text:

1.){display} 2.){display} 3.){display} 4.){display} 5.){display} 6.){display} 7.){display} 8.){display} 9.){display} 1.){display} 2.){display} 3.){display} 4.){display} 5.){display} 6.){display} 7.){display} 8.){display} 9.){display} {/display} {/display} {/display} {/display} {/display} {/display} {/display} {/display} {/display}

Previous Comments:

[2009-11-28 02:40:16] laszlo dot janszky at gmail dot com

"Did you test it with the command line pcre test tool?" I did not test it.

[2009-11-28 02:37:58] laszlo dot janszky at gmail dot com

Ahm, so you mean, this is a pcre memory handling problem, not a php bug? Okay, then where can I report this issue? (Sorry for the bad English, I'm just med- ...)

[2009-11-27 23:26:10] ras...@php.net

Did you test it with the command line pcre test tool? This is unlikely to have anything to do with php-specific code.

[2009-11-27 23:07:01] laszlo dot janszky at gmail dot com

OK. If it's not a bug, then what is it?

[2009-11-27 17:57:16] j...@php.net

Thank you for taking the time to write to us, but this is not a bug. Please double-check the documentation available at http://www.php.net/manual/ and the instructions on how to report a bug at http://bugs.php.net/how-to-report.php

The remainder of the comments for this report are too long.
To view the rest of the comments, please view the bug report online at http://bugs.php.net/50264 -- Edit this bug report at http://bugs.php.net/?id=50264&edit=1
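The tester in the comment above survives only as a fragment, but the surviving loop looks like a bisection over pcre.backtrack_limit. A minimal sketch of that idea follows. The function name minBacktrackLimit, the default upper bound, and the sample pattern and subject are all made up for illustration and are not taken from the report. It also assumes that a match found at one limit is found at every higher limit, and that the limit is actually enforced (with the PCRE JIT enabled it may not be, hence the pcre.jit line in the usage).

```php
<?php
// Hypothetical helper (not from the report): bisect pcre.backtrack_limit to
// find the smallest value at which preg_match_all() still finds a match.
// Assumes a match found at one limit is also found at every higher limit.
function minBacktrackLimit(string $pattern, string $subject, int $max = 1048576): ?int
{
    $saved = ini_get('pcre.backtrack_limit');
    $low = 0;      // treated as "fails"
    $high = $max;  // verified just below to succeed

    ini_set('pcre.backtrack_limit', (string) $high);
    if (!preg_match_all($pattern, $subject, $m, PREG_SET_ORDER)) {
        ini_set('pcre.backtrack_limit', $saved);
        return null; // no match even at the upper bound
    }

    while ($high - $low > 1) {
        $mid = intdiv($low + $high, 2);
        ini_set('pcre.backtrack_limit', (string) $mid);
        // false (an error such as a hit limit) and 0 (no match) both count as failure.
        if (preg_match_all($pattern, $subject, $m, PREG_SET_ORDER)) {
            $high = $mid;
        } else {
            $low = $mid;
        }
    }

    ini_set('pcre.backtrack_limit', $saved); // restore the caller's setting
    return $high;
}

// Usage with a deliberately simple, made-up pattern and subject.
ini_set('pcre.jit', '0'); // assumption: the limit is easier to observe without JIT
$limit = minBacktrackLimit('/\{(\w+)\}.*?\{\/\1\}/s', '{display} text {/display}');
echo $limit === null ? "no match\n" : "smallest working limit: $limit\n";
```

The returned number is a backtrack count, not bytes, so it is at best a proxy for memory use.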
#50264 [Bgs]: Possible pcre memory leak
ID: 50264
User updated by: laszlo dot janszky at gmail dot com
Reported By: laszlo dot janszky at gmail dot com
Status: Bogus
Bug Type: PCRE related
Operating System: Windows XP
PHP Version: 5.3.1

New Comment:

"Did you test it with the command line pcre test tool?" I did not test it.

Previous Comments:

[2009-11-23 19:38:35] laszlo dot janszky at gmail dot com

In case it is not clear from the test: the 8-token withBlock (M1) test string is:

$test=' {display} {display} {display} {display} {display} {display} {display} {display} {/display} {/display} {/display} {/display} {/display} {/display} {/display} {/display} ';

and the 8-token withoutBlock (M2) test string is:

$test=' {display} {display} {display} {display} {display} {display} {display} {display} ';
#50264 [Bgs]: Possible pcre memory leak
ID: 50264
User updated by: laszlo dot janszky at gmail dot com
Reported By: laszlo dot janszky at gmail dot com
Status: Bogus
Bug Type: PCRE related
Operating System: Windows XP
PHP Version: 5.3.1

New Comment:

Ahm, so you mean, this is a pcre memory handling problem, not a php bug? Okay, then where can I report this issue? (Sorry for the bad English, I'm just med- ...)

Previous Comments:

[2009-11-23 19:21:02] laszlo dot janszky at gmail dot com

The leak is related to http://bugs.php.net/bug.php?id=49333

Here is a simplified example with eight "withoutBlock" tokens:

The basic syntax is: {withBlock}block{/withBlock} or {withoutBlock}

Because the {withBlock} opener has the same structure as {withoutBlock}, the engine starts collecting the string after each {withoutBlock} for backtracking. But for some reason the {withoutBlock} backtracking eats up memory superexponentially, not linearly as in the {withBlock} case.

I measured the memory usage with the simplified example. It was not superexponential, just exponential. I think that is because this example has only two capturing groups, not many as in the original code.

tokens  M1[b]  M2[b]   LN(M2)
1       19     22      3,0910
2       53     115     4,7449
3       87     405     6,0039
4       121    1286    7,1593
5       155    3940    8,2789
6       189    11913   9,3854
7       223    35843   10,4869
8       257    107644  11,6204

M1 = 34 * N - 15                    R^2 = 1
M2 = exp ( 1,1192 * N + 2,6669 )    R^2 = 0, for the 3-8 part

Btw. it's funny memory usage.
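The figures in the table can be rechecked with a few lines of arithmetic. The sketch below recomputes 34 * N - 15 and ln(M2) for every row, then refits a least-squares line to the quoted LN(M2) column for N = 3..8. The helper name fitLine is made up for illustration; the data are copied from the table with decimal commas written as points. Rows 1 to 7 of the LN(M2) column agree with the natural logarithm of M2 to four decimals, while the row 8 value (11,6204) appears not to match ln(107644), so either that M2 or that LN entry may have been mistyped in the report.

```php
<?php
// Rows copied from the table: tokens N => [M1 bytes, M2 bytes, quoted LN(M2)].
$rows = [
    1 => [19, 22, 3.0910],
    2 => [53, 115, 4.7449],
    3 => [87, 405, 6.0039],
    4 => [121, 1286, 7.1593],
    5 => [155, 3940, 8.2789],
    6 => [189, 11913, 9.3854],
    7 => [223, 35843, 10.4869],
    8 => [257, 107644, 11.6204],
];

// Hypothetical helper: ordinary least-squares line y = slope * x + intercept.
function fitLine(array $xs, array $ys): array
{
    $n = count($xs);
    $meanX = array_sum($xs) / $n;
    $meanY = array_sum($ys) / $n;
    $sxy = 0.0;
    $sxx = 0.0;
    foreach ($xs as $i => $x) {
        $sxy += ($x - $meanX) * ($ys[$i] - $meanY);
        $sxx += ($x - $meanX) ** 2;
    }
    $slope = $sxy / $sxx;
    return [$slope, $meanY - $slope * $meanX];
}

// Compare each row with the reported linear model and with log(M2).
foreach ($rows as $n => [$m1, $m2, $ln]) {
    printf(
        "N=%d  M1=%d  34N-15=%d  ln(M2)=%.4f  quoted=%.4f\n",
        $n, $m1, 34 * $n - 15, log($m2), $ln
    );
}

// Refit the exponential model on the quoted LN(M2) column for N = 3..8.
$xs = range(3, 8);
$ys = [];
foreach ($xs as $n) {
    $ys[] = $rows[$n][2];
}
[$slope, $intercept] = fitLine($xs, $ys);
printf("ln(M2) ~ %.4f * N + %.4f\n", $slope, $intercept);
```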
#50264 [Bgs]: Possible pcre memory leak
ID: 50264
Updated by: ras...@php.net
Reported By: laszlo dot janszky at gmail dot com
Status: Bogus
Bug Type: PCRE related
Operating System: Windows XP
PHP Version: 5.3.1

New Comment:

Did you test it with the command line pcre test tool? This is unlikely to have anything to do with php-specific code.

Previous Comments:

[2009-11-22 18:53:14] laszlo dot janszky at gmail dot com

If I remove the recursive part

(?:\\}(?.*?(?:(?0).*?)*?)\\{/(?P=function))?

from the end of the regex, then it works fine...
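When preg_match_all() finds nothing, a no-match and a hit limit look the same unless preg_last_error() is consulted. The sketch below shows one way to tell them apart. The wrapper name matchAllWithDiagnosis and the sample pattern are made up for illustration; the PREG_* constants are the standard ones, and PREG_JIT_STACKLIMIT_ERROR assumes PHP 7.0 or later.

```php
<?php
// Hypothetical helper: run preg_match_all() and name the PCRE error, if any.
function matchAllWithDiagnosis(string $pattern, string $subject): array
{
    $count = preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);
    $code = preg_last_error();

    $names = [
        PREG_NO_ERROR              => 'none',
        PREG_INTERNAL_ERROR        => 'internal error',
        PREG_BACKTRACK_LIMIT_ERROR => 'pcre.backtrack_limit exceeded',
        PREG_RECURSION_LIMIT_ERROR => 'pcre.recursion_limit exceeded',
        PREG_BAD_UTF8_ERROR        => 'malformed UTF-8',
        PREG_BAD_UTF8_OFFSET_ERROR => 'bad UTF-8 offset',
        PREG_JIT_STACKLIMIT_ERROR  => 'JIT stack limit exceeded',
    ];

    return [
        'count'   => $count,                        // int, or false on error
        'error'   => $names[$code] ?? "code $code", // readable error name
        'matches' => $matches ?? [],
    ];
}

// Usage with a made-up pattern: count 0 with error "none" is a plain no-match,
// whereas a limit error points at the regex engine's limits instead.
$result = matchAllWithDiagnosis('/\{(\w+)\}/', '{display} {display}');
printf("matches: %s, error: %s\n", var_export($result['count'], true), $result['error']);
```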
#50264 [Bgs]: Possible pcre memory leak
ID: 50264
User updated by: laszlo dot janszky at gmail dot com
Reported By: laszlo dot janszky at gmail dot com
Status: Bogus
Bug Type: PCRE related
Operating System: Windows XP
PHP Version: 5.3.1

New Comment:

OK. If it's not a bug, then what is it?

Previous Comments:

[2009-11-22 18:47:14] laszlo dot janszky at gmail dot com

Description:

I have a huge recursive regex (about 500 bytes), which needs a lot of memory for backtracking. The regex matches on templates like

{command1 arg1=$arg1 arg2=$arg2|modifier2 arg3="text"|modifier3:modarg31:modarg32}

etc.

If I use the regex with preg_match_all, then the backtrack memory usage depends superexponentially on the number of commands. So:

ln ln M = 0,0787 * N + 1,9304
R^2 = 0,9977 (R^2 for trendline)
[M] = used backtrack memory in bytes
[N] = number of command calls

I don't think that more than 1 MB of memory usage is normal for a 0.0002 MB string. The recursion memory usage is normal (under 1 kB).

I'm pretty disappointed because I can't use my template engine because of a badly written pcre engine.
Reproduce code:
---
$template1=' {display var=$link} {display var=$link} {display var=$link} {display var=$link} {display var=$link} {display var=$link} {display var=$link} {display var=$link} {display var=$link} {display var=$link} ';

$template2=' {display var=$link} {display var=$link} {display var=$link} {display var=$link} test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test test ';

$regex='%\\{(?(?:\\w+))(?:(?\\s(?:[\\w_]+(?:\\s[\\w_]+)*\\s)?(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?)(?:\\|\\w+(?::(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?))*)*(?:\\s[\\w_]+(?:\\s[\\w_]+)*\\s(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?)(?:\\|\\w+(?::(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?))*)*)*(?:\\s[\\w_]+(?:\\s[\\w_]+)*)?)|(?(?:\\s\\w+=(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?)(?:\\|\\w+(?::(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?))*)*)*))(?:\\}(?.*?(?:(?0).*?)*?)\\{/(?P=function))?\\}%usD';

$one_Mb=1024*1024;
$one_kb=1024;
ini_set('pcre.backtrack_limit', $one_Mb);
ini_set('pcre.recursion_limit', $one_kb);
preg_match_all($regex,$templa