 ID:               41739
 Updated by:       [EMAIL PROTECTED]
 Reported By:      trex0003 at umn dot edu
-Status:           Open
+Status:           Bogus
 Bug Type:         Documentation problem
 Operating System: CentOS
 PHP Version:      Irrelevant
 New Comment:

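See the strings documentation: "if you try to escape any other
character, the backslash will be printed too". In "\\\)" the first two
backslashes form one escaped backslash, and "\)" is not a recognized
escape sequence, so PHP keeps its backslash. PCRE therefore receives
\\) which is an escaped backslash followed by a closing parenthesis.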
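
To make the quoted rule concrete, here is a minimal sketch of how PHP
parses the three-backslash literal before PCRE ever sees it. The
variable names $three and $four are illustrative only and do not come
from the report.

```php
<?php
// In a double-quoted string, "\\" is a recognized escape (one backslash),
// but "\)" is not, so PHP keeps that backslash unchanged.
$three = "/(?<!\\\)'/";   // \\ becomes \, then \) stays \)
$four  = "/(?<!\\\\)'/";  // \\ becomes \, then \\ becomes \

// Both literals produce the same 10-character pattern: /(?<!\\)'/
var_dump($three === $four);
echo $three, "\n";
```

Writing four backslashes is the less surprising spelling, since it does
not depend on what character happens to follow the third backslash.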


Previous Comments:
------------------------------------------------------------------------

[2007-08-17 17:16:37] trex0003 at umn dot edu

Please explain the comments added to this report and how it was
determined to be "bogus". The comments added by [EMAIL PROTECTED] to the bug
do nothing more than reiterate the use of a single, double and quadruple
backslash. My question is about why a triple backslash is necessary in
this instance.

The case of four backslashes in PHP makes sense -- the first and third
backslashes escape the second and fourth. But when you have three
backslashes in a row, doesn't the third backslash act as an escape
character to whatever follows? In my example of:

  $needle = "/(?<!\\\)'/";

I would expect that the first backslash would escape the second, and
the third would escape the right parenthesis, but that's apparently not
what happens, because this particular code WORKS.

------------------------------------------------------------------------

[2007-08-16 14:21:11] [EMAIL PROTECTED]

\ character
\\ PCRE representation of this character
'\\\\' PHP string to write this representation.
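
The three layers above can be checked directly. A small sketch that
assumes only strlen() and preg_match(); the variable names are
hypothetical.

```php
<?php
$char    = '\\';      // PHP literal for ONE backslash character
$pattern = '/\\\\/';  // PHP literal for the 4-character pattern /\\/

var_dump(strlen($char));                // int(1): the subject is one character
var_dump(strlen($pattern));             // int(4): delimiters plus two backslashes
var_dump(preg_match($pattern, $char));  // int(1): the backslash is matched
```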


------------------------------------------------------------------------

[2007-06-19 17:34:33] trex0003 at umn dot edu

Description:
------------
The current documentation of how PCRE regular expressions treat escaped
backslashes is confusing:

http://php.planetmirror.com/manual/en/reference.pcre.pattern.syntax.php

Perhaps this just means an expansion of the note:

"Note:  Single and double quoted PHP strings have special meaning of
backslash. Thus if \ has to be matched with a regular expression \\,
then "\\\\" or '\\\\' must be used in PHP code."

While trying to use preg_split to break a string at non-escaped string
quotes, I tried using a negative lookbehind assertion (see code below).
With all other escaped characters in the lookbehind assertion, you only
need one backslash before the escaped character (e.g., "(?<!\s)"
works); but if you want to escape a backslash, you need to escape it
twice (e.g., "(?<!\\\)" works but "(?<!\\)" does not).

Based on the documentation's statement 'In particular, if you want to
match a backslash, you write "\\".' it seems that (?<!\\) should work,
and the extra information in the Note is too terse to explain why.

Perhaps this is actually a bug, but I can't tell if it's normal
behavior that could be clarified more in documentation.

The closest I could find for bug reports to this issue is here:

http://bugs.php.net/bug.php?id=22315

...but this only implies the difference in behavior without explaining
why.

Reproduce code:
---------------
$string = "This sentence is not quoted. 'But this one is, and it "
        . "contains \'escaped quotes\' within!'";

$needle = "/(?<!\\)'/";
$results = preg_split($needle, $string);

print_r($results);

Expected result:
----------------
Array
(
    [0] => This sentence is not quoted. 
    [1] => But this one is, and it contains \'escaped quotes\' within!
    [2] => 
)

// Adding a third backslash to the $needle gives this result.

Actual result:
--------------
Warning: preg_split() [function.preg-split]: Compilation failed:
missing ) at offset 7 in
/data/domains/lawweb3.law.umn.edu/public/php/regex.php on line 16
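
The offset in the warning is consistent with the pattern PCRE actually
receives. A hedged sketch: the two-backslash literal collapses to a
single backslash, which then escapes the closing parenthesis. The
variable names and the short subject string are illustrative, and the
@ operator is used only to hide the warning in this demonstration.

```php
<?php
$bad  = "/(?<!\\)'/";    // PCRE receives /(?<!\)'/  where \) is a literal ")"
$good = "/(?<!\\\\)'/";  // PCRE receives /(?<!\\)'/ where \\ is a literal "\"

$string = "'a \\'quoted\\' word'";

// The group in $bad is never closed, so compilation fails and false is returned.
var_dump(@preg_split($bad, $string));

// $good splits only at quotes that are not preceded by a backslash.
print_r(preg_split($good, $string));
```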


------------------------------------------------------------------------

