 ID:               49333
 User updated by:  laszlo dot janszky at gmail dot com
-Reported By:      inf3rno dot hu at gmail dot com
+Reported By:      laszlo dot janszky at gmail dot com
 Status:           Bogus
 Bug Type:         PCRE related
 Operating System: Windows XP
 PHP Version:      5.2.10
 New Comment:

This bug is related to the memory leak I found:
http://bugs.php.net/bug.php?id=50264
The code works with a raised pcre.backtrack_limit.


Previous Comments:
------------------------------------------------------------------------

[2009-08-25 10:25:16] j...@php.net

You can abuse things to some extent, but there's always a limit. And
this is not a PHP bug anyway (if it is a bug at all) but a PCRE lib one.

------------------------------------------------------------------------

[2009-08-25 10:05:57] inf3rno dot hu at gmail dot com

Original pattern was this:

'%(?<string>.*?)(?:{\\s*(?<function>[a-z0-9_]+)(?:\\s*(?:(?<hash>(?:(?:\\s+[a-z0-9_]+\\s*=\\s*)?(?:\\$[a-z0-9_]+(?:->[a-z0-9_]+|\\.[a-z0-9_]+)*|\\d+(?:\\.\\d+)?|".*?(?:\\\\".*?)*"))+)|(?<chain>(?:(?:\\s+[a-z0-9_]+(?: [a-z0-9_]+)*\\s+)?(?:\\$[a-z0-9_]+(?:->[a-z0-9_]+|\\.[a-z0-9_]+)*|\\d+(?:\\.\\d+)?|".*?(?:\\\\".*?)*"))+)|(?<list>(?:\\$[a-z0-9_]+(?:->[a-z0-9_]+|\\.[a-z0-9_]+)*|\\d+(?:\\.\\d+)?|".*?(?:\\\\".*?)*")(?:\\s*,\\s*(?:\\$[a-z0-9_]+(?:->[a-z0-9_]+|\\.[a-z0-9_]+)*|\\d+(?:\\.\\d+)?|".*?(?:\\\\".*?)*"))*)))?(?:\\s*}(?<block>.*?(?:(?R).*?)*?){\\s*/(?P=function))?\\s*}|{\\s*\\$(?<variable>[a-z0-9_]+(?:->[a-z0-9_]+|\\.[a-z0-9_]+)*)\\s*}|{\\s*\\*(?<comment>.*?)\\*\\s*}|$)%sDu'

This pattern matches tokens similar to the ones Smarty uses. I need the
%string_before(?:function_with_recursive_block|variable|comment|$)%
structure because I have to capture the string before the token too, and
this is the fastest way to do that. I can also do it with offset capture
and a %function_with_recursive_block|variable|comment% structured regex,
but that is the slower way, because I have to call strlen and substr in
a loop. So I need that .*? :-)

But recursive patterns behave strangely. I thought that
'%.*?(?:([a-z])(?:(?R)*?\1)?|$)%sD' would work too, but it didn't.
Logically, the (?R)*? here means
"string+token...+string+end_of_the_recursive_part", but "$" is the end
of the whole string, not the end of the recursive part.
:S

------------------------------------------------------------------------

[2009-08-25 08:47:29] j dot boggiano at seld dot be

I am not entirely sure what you are trying to achieve, so maybe I broke
some functionality, but this pattern at least gives the expected result
with \r\n or \n (of course \n has fewer chars):

$pattern='%\s*(?:([a-z])(?:.*?(?:(?R).*?)*?\1)?|$)%sD';

Is that good enough?

------------------------------------------------------------------------

[2009-08-25 08:35:43] j...@php.net

When $test contains \r\n instead of \n, it fails.

------------------------------------------------------------------------

[2009-08-24 12:21:40] inf3rno dot hu at gmail dot com

I reproduced it on another computer with the latest WAMPServer (Apache
2.2.11, PHP 5.3.0). I copied the code from here.

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/49333
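
The new comment in this report says the code works with a raised pcre.backtrack_limit. A minimal sketch of how that kind of failure might be detected and worked around is shown here. The helper name match_with_limit(), the stand-in pattern, and the subject string are all hypothetical and are not the reporter's code; the sketch also assumes JIT is turned off so the limit applies to the interpreted matcher.

```php
<?php
// Illustrative sketch only. match_with_limit() is a hypothetical helper,
// and the pattern/subject below are stand-ins, not taken from the report.

// Assumption: disable JIT so pcre.backtrack_limit governs the interpreted
// matcher. On PHP versions without pcre.jit this call simply returns false.
@ini_set('pcre.jit', '0');

function match_with_limit($pattern, $subject, $limit)
{
    // Temporarily override pcre.backtrack_limit for this one call.
    $old = ini_set('pcre.backtrack_limit', (string) $limit);

    // preg_match() returns false on error; preg_last_error() says why.
    $result = preg_match($pattern, $subject);
    $error  = preg_last_error();

    // Restore the previous setting if the override succeeded.
    if ($old !== false) {
        ini_set('pcre.backtrack_limit', $old);
    }

    return array(
        'matched'             => $result === 1,
        'hit_backtrack_limit' => $error === PREG_BACKTRACK_LIMIT_ERROR,
    );
}

// A lazy .*? has to step over every character before the final token,
// so a long subject needs a correspondingly generous limit.
$subject = str_repeat('a', 10000) . 'END';
$pattern = '/^.*?END$/s';

var_dump(match_with_limit($pattern, $subject, 100));
var_dump(match_with_limit($pattern, $subject, 1000000));
```

Under these assumptions, the call with the small limit is expected to report hit_backtrack_limit, while the call with the large limit is expected to match. The point of checking preg_last_error() is that a false return from preg_match() is otherwise easy to mistake for "no match".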