ID:               41739
 User updated by:  trex0003 at umn dot edu
 Reported By:      trex0003 at umn dot edu
-Status:           Bogus
+Status:           Open
 Bug Type:         Documentation problem
 Operating System: CentOS
 PHP Version:      Irrelevant
 New Comment:

Please explain the comments added to this report and how it was
determined to be "bogus". The comments added by [EMAIL PROTECTED] to the bug
do nothing more than reiterate the use of a single, double and quadruple
backslash. My question is about why a triple backslash is necessary in
this instance.

The case of four backslashes in PHP makes sense -- the first and third
backslashes escape the second and fourth. But when you have three
backslashes in a row, doesn't the third backslash act as an escape
character to whatever follows? In my example of:

  $needle = "/(?<!\\\)'/";

I would expect that the first backslash would escape the second, and
the third would escape the right parenthesis, but that's apparently not
what happens, because this particular code WORKS.
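
A minimal sketch of what appears to be going on, assuming standard PHP double-quoted string rules: "\\" collapses to a single backslash, while "\)" is not a recognised escape sequence, so its backslash is kept. Under that assumption the three-backslash and four-backslash literals produce the same pattern. The variable names and the short $subject string are illustrative only.

```php
<?php
// Step 1: what the PHP string parser hands to PCRE.
$two   = "/(?<!\\)'/";    // PCRE receives: /(?<!\)'/
$three = "/(?<!\\\)'/";   // PCRE receives: /(?<!\\)'/
$four  = "/(?<!\\\\)'/";  // PCRE receives: /(?<!\\)'/

echo $two, "\n", $three, "\n", $four, "\n";
var_dump($three === $four);   // compares the three- and four-backslash forms

// Step 2: what PCRE does with each pattern.
// In $two, PCRE reads "\)" as a literal parenthesis, so the lookbehind
// group is never closed and compilation fails (preg_match returns false).
var_dump(@preg_match($two, "x") === false);

// In $three, PCRE reads "\\" as one literal backslash and ")" closes the
// lookbehind, so the split happens only at quotes not preceded by "\".
$subject = "a 'b \\'c\\' d' e";
print_r(preg_split($three, $subject));
```

So the third backslash does not escape the parenthesis at the PHP level; it survives only because PHP leaves unknown escape sequences untouched, and it then becomes the second half of PCRE's "\\".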


Previous Comments:
------------------------------------------------------------------------

[2007-08-16 14:21:11] [EMAIL PROTECTED]

\        the character to match
\\       the PCRE representation of this character
'\\\\'   the PHP string literal that produces this representation


------------------------------------------------------------------------

[2007-06-19 17:34:33] trex0003 at umn dot edu

Description:
------------
The current documentation of how PCRE regular expressions handle
escaped backslashes is confusing:

http://php.planetmirror.com/manual/en/reference.pcre.pattern.syntax.php

Perhaps this just means an expansion of the note:

"Note:  Single and double quoted PHP strings have special meaning of
backslash. Thus if \ has to be matched with a regular expression \\,
then "\\\\" or '\\\\' must be used in PHP code."
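
As a rough illustration of the note above, the following sketch (assuming standard PHP string-literal rules; the variable names and sample path are made up for the example) shows that both quoting styles reduce four source backslashes to two, which PCRE then reads as one literal backslash.

```php
<?php
// Four backslashes in the source, in either quoting style.
$double = "\\\\";
$single = '\\\\';

var_dump($double === $single);  // compares the two runtime strings
var_dump(strlen($double));      // length after PHP's own unescaping

// Inside delimiters, the two remaining backslashes are PCRE's escape
// for a single literal backslash.
$subject = 'C:\temp';           // single quotes: "\t" stays backslash + t
var_dump(preg_match('/\\\\/', $subject));
```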

While trying to use preg_split to break a string at non-escaped
quotes, I tried using a negative lookbehind assertion (see code below).
With all other escaped characters in the lookbehind assertion, you only
need one backslash before the escaped character (e.g., "(?<!\s)"
works); but to match a literal backslash, you need three backslashes
in total (e.g., "(?<!\\\)" works but "(?<!\\)" does not).

Based on the documentation's statement 'In particular, if you want to
match a backslash, you write "\\".' it seems that (?<!\\) should work,
and the extra information in the Note is too terse to explain why.

Perhaps this is actually a bug, but I can't tell whether it is a bug or
normal behavior that the documentation could explain more clearly.

The closest I could find for bug reports to this issue is here:

http://bugs.php.net/bug.php?id=22315

...but this only implies the difference in behavior without explaining
why.

Reproduce code:
---------------
$string = "This sentence is not quoted. 'But this one is, and it "
        . "contains \'escaped quotes\' within!'";

$needle = "/(?<!\\)'/";
$results = preg_split($needle, $string);

print_r($results);

Expected result:
----------------
Array
(
    [0] => This sentence is not quoted. 
    [1] => But this one is, and it contains \'escaped quotes\' within!
    [2] => 
)

// Adding a third backslash to the $needle gives this result.

Actual result:
--------------
Warning: preg_split() [function.preg-split]: Compilation failed:
missing ) at offset 7 in
/data/domains/lawweb3.law.umn.edu/public/php/regex.php on line 16


------------------------------------------------------------------------

