From:             
Operating system: 
PHP version:      5.4.0
Package:          PCRE related
Bug Type:         Bug
Bug description:  Inconsistent PCRE captures in match results

Description:
------------
Named and unnamed captures in both preg_match and preg_match_all (and
probably preg_replace and the other PCRE functions too, but I haven't tested
them all) can report the wrong set of capture groups in the results array
when the pattern uses alternation or an optional quantifier such as '?' or
'*'.

If the pattern '/(?<b>b)|(?<c>c)|(?<d>d)/' is used to match 'c', both 'b'
and 'c' are set in the results array but 'd' is not. 'b' should not be set
(even to an empty string) because it did not take part in the match. By
contrast, with '/(?<b>b?)(?<c>c)/' (note: optional 'b' AND mandatory 'c'),
'b' _should_ be set to '' because the group did match, and it is allowed to
match a zero-length string. If a group is tried and fails, and a group later
in the pattern succeeds, the failed group should never produce a key in the
results array. It should be OK to leave holes in the numbered sequence
(e.g. keys 0 and 2 but not 1).
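
The ambiguity can be seen by comparing the two patterns directly. The
following is only a minimal sketch, and the variable names are illustrative.
Without PREG_OFFSET_CAPTURE both calls produce the same keys and the same
empty string for 'b', so the results alone do not reveal whether 'b' took
part in the match.

```php
<?php
// Alternation: 'b' is tried and fails, yet it still appears in the results.
preg_match('/(?<b>b)|(?<c>c)|(?<d>d)/', 'c', $alternation);

// Optional 'b': the group takes part in the match and captures ''.
preg_match('/(?<b>b?)(?<c>c)/', 'c', $optional);

// Both arrays have the keys 0, 'b', 1, 'c', 2; 'd' is absent from the first.
var_export(array_keys($alternation));
print("\n");
var_export(array_keys($optional));
print("\n");

// In both cases 'b' is the empty string, so the two cases look identical.
var_export($alternation['b'] === $optional['b']);
print("\n");
```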

Currently, to tell a capture that did not take part in the match from one
that matched an empty string, you need to use PREG_OFFSET_CAPTURE, test
whether the key exists and, if it does, test whether the capture position
is -1. If this bug is fixed, capture positions will never be -1 because the
key won't exist. Alternatively, an additional flag could be added (e.g.
PREG_KEEP_NONMATCHES) to create keys for ALL captures whether used or not
(so, in the first pattern above, keys would be created for 'b', 'c' and 'd'
in all cases, and when matching the string 'c' the offsets for both 'b' and
'd' would be -1).
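
The workaround described above can be sketched as a small filter. The
function name drop_unmatched_captures is hypothetical (it is not part of
PHP), and the sketch assumes the array comes from preg_match() called with
PREG_OFFSET_CAPTURE, so that every entry is array(text, offset).

```php
<?php
/**
 * Hypothetical helper (not part of PHP): removes capture groups that did
 * not take part in the match. Expects a $matches array produced by
 * preg_match() with PREG_OFFSET_CAPTURE, where a group that did not
 * participate is reported with offset -1.
 */
function drop_unmatched_captures(array $matches)
{
    $kept = array();
    foreach ($matches as $key => $capture) {
        // Keep zero-length matches (offset >= 0); drop offset -1 entries.
        if ($capture[1] !== -1) {
            $kept[$key] = $capture;
        }
    }
    return $kept;
}

preg_match('/(?<b>b)|(?<c>c)|(?<d>d)/', 'c', $matches, PREG_OFFSET_CAPTURE);
var_export(array_keys(drop_unmatched_captures($matches)));
print("\n");
```

For the pattern and subject above this leaves only the keys 0, 'c' and 2.
With preg_match_all and PREG_SET_ORDER the same filter could be applied to
each set in turn, for example with array_map.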

In summary, if the pattern '/(?<b>b)|(?<c>c)|(?<d>d)/' is used to match
'c', by default it should only ever create a key for 'c'. If desired, an
additional flag could be added so that it creates keys for all captures:
'b', 'c' and 'd'. The current behaviour where it creates a key for 'b' and
'c' but not 'd' should be considered a bug and fixed.

Test script:
---------------
<?php
print('<pre>');

// Walk the subject with repeated preg_match() calls, resuming each
// search at the end of the previous match.
$offset = 0;
while (preg_match('/(?:(?<b>b)|(?<c>c)|(?<d>d))(?<e>e)?/', 'cdec',
                  $matches, PREG_OFFSET_CAPTURE, $offset)) {
  $offset = $matches[0][1] + strlen($matches[0][0]);
  var_export($matches);
  print("\n\n");
}

print("****************\n\n");

// The same pattern and subject in a single preg_match_all() call.
preg_match_all('/(?:(?<b>b)|(?<c>c)|(?<d>d))(?<e>e)?/', 'cdec', $matches,
               PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
var_export($matches);
print('</pre>');


Expected result:
----------------
array (
  0 => 
  array (
    0 => 'c',
    1 => 0,
  ),
  'c' => 
  array (
    0 => 'c',
    1 => 0,
  ),
  2 => 
  array (
    0 => 'c',
    1 => 0,
  ),
)

array (
  0 => 
  array (
    0 => 'de',
    1 => 1,
  ),
  'd' => 
  array (
    0 => 'd',
    1 => 1,
  ),
  3 => 
  array (
    0 => 'd',
    1 => 1,
  ),
  'e' => 
  array (
    0 => 'e',
    1 => 2,
  ),
  4 => 
  array (
    0 => 'e',
    1 => 2,
  ),
)

array (
  0 => 
  array (
    0 => 'c',
    1 => 3,
  ),
  'c' => 
  array (
    0 => 'c',
    1 => 3,
  ),
  2 => 
  array (
    0 => 'c',
    1 => 3,
  ),
)

****************

array (
  0 => 
  array (
    0 => 
    array (
      0 => 'c',
      1 => 0,
    ),
    'c' => 
    array (
      0 => 'c',
      1 => 0,
    ),
    2 => 
    array (
      0 => 'c',
      1 => 0,
    ),
  ),
  1 => 
  array (
    0 => 
    array (
      0 => 'de',
      1 => 1,
    ),
    'd' => 
    array (
      0 => 'd',
      1 => 1,
    ),
    3 => 
    array (
      0 => 'd',
      1 => 1,
    ),
    'e' => 
    array (
      0 => 'e',
      1 => 2,
    ),
    4 => 
    array (
      0 => 'e',
      1 => 2,
    ),
  ),
  2 => 
  array (
    0 => 
    array (
      0 => 'c',
      1 => 3,
    ),
    'c' => 
    array (
      0 => 'c',
      1 => 3,
    ),
    2 => 
    array (
      0 => 'c',
      1 => 3,
    ),
  ),
)

Actual result:
--------------
array (
  0 => 
  array (
    0 => 'c',
    1 => 0,
  ),
  'b' => 
  array (
    0 => '',
    1 => -1,
  ),
  1 => 
  array (
    0 => '',
    1 => -1,
  ),
  'c' => 
  array (
    0 => 'c',
    1 => 0,
  ),
  2 => 
  array (
    0 => 'c',
    1 => 0,
  ),
)

array (
  0 => 
  array (
    0 => 'de',
    1 => 1,
  ),
  'b' => 
  array (
    0 => '',
    1 => -1,
  ),
  1 => 
  array (
    0 => '',
    1 => -1,
  ),
  'c' => 
  array (
    0 => '',
    1 => -1,
  ),
  2 => 
  array (
    0 => '',
    1 => -1,
  ),
  'd' => 
  array (
    0 => 'd',
    1 => 1,
  ),
  3 => 
  array (
    0 => 'd',
    1 => 1,
  ),
  'e' => 
  array (
    0 => 'e',
    1 => 2,
  ),
  4 => 
  array (
    0 => 'e',
    1 => 2,
  ),
)

array (
  0 => 
  array (
    0 => 'c',
    1 => 3,
  ),
  'b' => 
  array (
    0 => '',
    1 => -1,
  ),
  1 => 
  array (
    0 => '',
    1 => -1,
  ),
  'c' => 
  array (
    0 => 'c',
    1 => 3,
  ),
  2 => 
  array (
    0 => 'c',
    1 => 3,
  ),
)

****************

array (
  0 => 
  array (
    0 => 
    array (
      0 => 'c',
      1 => 0,
    ),
    'b' => 
    array (
      0 => '',
      1 => -1,
    ),
    1 => 
    array (
      0 => '',
      1 => -1,
    ),
    'c' => 
    array (
      0 => 'c',
      1 => 0,
    ),
    2 => 
    array (
      0 => 'c',
      1 => 0,
    ),
  ),
  1 => 
  array (
    0 => 
    array (
      0 => 'de',
      1 => 1,
    ),
    'b' => 
    array (
      0 => '',
      1 => -1,
    ),
    1 => 
    array (
      0 => '',
      1 => -1,
    ),
    'c' => 
    array (
      0 => '',
      1 => -1,
    ),
    2 => 
    array (
      0 => '',
      1 => -1,
    ),
    'd' => 
    array (
      0 => 'd',
      1 => 1,
    ),
    3 => 
    array (
      0 => 'd',
      1 => 1,
    ),
    'e' => 
    array (
      0 => 'e',
      1 => 2,
    ),
    4 => 
    array (
      0 => 'e',
      1 => 2,
    ),
  ),
  2 => 
  array (
    0 => 
    array (
      0 => 'c',
      1 => 3,
    ),
    'b' => 
    array (
      0 => '',
      1 => -1,
    ),
    1 => 
    array (
      0 => '',
      1 => -1,
    ),
    'c' => 
    array (
      0 => 'c',
      1 => 3,
    ),
    2 => 
    array (
      0 => 'c',
      1 => 3,
    ),
  ),
)

-- 
Edit bug report at https://bugs.php.net/bug.php?id=61780&edit=1