Edit report at https://bugs.php.net/bug.php?id=62646&edit=1

 ID:                 62646
 Comment by:         Andreas dot Klauer at metamorpher dot de
 Reported by:        Andreas dot Klauer at metamorpher dot de
 Summary:            Impossible to escape/match delimiters within \Q \E
 Status:             Not a bug
 Type:               Bug
 Package:            PCRE related
 PHP Version:        5.3.15
 Block user comment: N
 Private report:     N

 New Comment:

But even if you escape the delimiter, it's not possible to match a literal /#~ 
when one of those characters is the delimiter. You have to escape it, but once 
you do, the pattern matches the literal \/#~ instead of just /#~.

Perl:
$subject = "foo/#~bar";
$subject =~ s/\Q\/#~\E/baz/;
print $subject;
=> "foobazbar"

PHP:
$subject = "foo/#~bar";
$subject = preg_replace("/\Q\/#~\E/", "baz", $subject);
echo $subject;
=> "foo/#~bar"

PHP tries to match literal \/#~ here.


Previous Comments:
------------------------------------------------------------------------
[2012-07-24 01:55:35] [email protected]

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Hi, the pcretest tool doesn't even handle the \Q..\E stuff the way you 
described. It works just like the [..] one. And using Perl, I did have to 
escape the delimiter inside \Q..\E.

------------------------------------------------------------------------
[2012-07-24 00:33:54] Andreas dot Klauer at metamorpher dot de

Description:
------------
PCRE allows literal matches of strings between \Q and \E. This is also 
documented: \Q.$.\E will match the literal string .$.

However, if that literal string contains the regexp delimiter (/ or # or ~ or 
() or whichever you choose), either the regexp fails to compile, or the match 
fails because it tries to match the escape char used to escape the delimiter.

The problem is php_pcre::pcre_get_compiled_regex_cache(), which parses the 
delimiter without taking \Q \E into account. Delimiters between \Q and \E 
should be treated as literal characters, not delimiters (that's what Perl 
does); alternatively, if delimiters have to be escaped, the escape char should 
be removed from the pattern before it is compiled.
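
The scan described above can be modelled roughly as follows. This is a 
simplified sketch in PHP, not the actual C code; the function name 
extract_pattern_body() is hypothetical, and it assumes identical start and end 
delimiters and a scan that skips any backslash-escaped character without 
knowing about \Q \E.

```php
<?php
// Hypothetical model of a delimiter scan that is unaware of \Q...\E.
// Returns the pattern body that would be handed to the regex engine,
// or null if no closing delimiter is found.
function extract_pattern_body(string $regex): ?string
{
    $delimiter = $regex[0];
    $length = strlen($regex);
    for ($i = 1; $i < $length; $i++) {
        if ($regex[$i] === '\\' && $i + 1 < $length) {
            $i++; // skip the escaped character; the backslash stays in the body
        } elseif ($regex[$i] === $delimiter) {
            return substr($regex, 1, $i - 1);
        }
    }
    return null;
}

// Unescaped delimiter inside \Q...\E: the scan stops at the first "/",
// so everything after it would be read as modifiers.
$unescaped = extract_pattern_body('/\Q/#~\E/');

// Escaped delimiter: the scan reaches the final "/", but the backslash
// remains in the body and is therefore literal between \Q and \E.
$escaped = extract_pattern_body('/\Q\/#~\E/');

var_dump($unescaped, $escaped);
```

Under this model the first call yields the body \Q, which would explain the 
"Unknown modifier '#'" warning, and the second yields \Q\/#~\E with the 
backslash still in place.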

Workaround: Use preg_quote() instead of \Q \E if there's a chance the delimiter 
may appear within \Q \E.
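
A minimal sketch of that workaround, assuming / as the delimiter; the variable 
names and sample values are only illustrative. Passing the delimiter as the 
second argument to preg_quote() makes it escape the delimiter as well.

```php
<?php
// Illustrative values; only preg_quote() and preg_replace() are PHP built-ins.
$delimiter = '/';
$literal   = '/#~';        // text to match literally; contains the delimiter
$subject   = 'foo/#~bar';

// Quote the literal text, including the delimiter, then wrap it in delimiters.
$pattern = $delimiter . preg_quote($literal, $delimiter) . $delimiter;

$result = preg_replace($pattern, 'baz', $subject);
echo $result, "\n"; // foobazbar
```

Because the escaping happens outside \Q \E, the backslash added in front of the 
delimiter is an ordinary regex escape and does not become part of the matched 
text.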

Test script:
---------------
preg_replace("/\Q/#~\E/", ...);
=> Warning: preg_replace(): Unknown modifier '#' in php shell code on line 1

preg_replace("/\Q\/#~\E/", "OK", "/#~");
=> "/#~" (expected "OK")

preg_replace("/\Q\/#~\E/", "FAIL", "\/#~");
=> "FAIL" (expected "\/#~")




------------------------------------------------------------------------


