From:             strata_ranger at hotmail dot com
Operating system: *
PHP version:      5.2.10
PHP Bug Type:     PCRE related
Bug description:  PREG_BAD_UTF8_ERROR should emit E_NOTICE

Description:
------------
This is not a PHP bug, but a suggestion that would help with
troubleshooting PCRE calls in one's own PHP scripts.

When using the /u modifier in PCRE, if the subject string contains an
invalid Unicode sequence, this generates a PREG_BAD_UTF8_ERROR (which can
be retrieved using preg_last_error() ).  This is expected behavior for
PCRE, but it should also emit an E_NOTICE to the user because it could
indicate an error in their script (the definition of an E_NOTICE).

Specifically, when preg_replace() is used in an assignment context (e.g.
$subject = preg_replace($foo, $bar, $subject)), a PREG_BAD_UTF8_ERROR
causes the subject string to be "erased" (re-assigned NULL) if the script
author didn't take the time to ensure that the subject string was valid
UTF-8 before calling preg_replace().
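
For illustration, one way to check a subject before calling preg_replace()
is sketched below. The helper name is_valid_utf8() is hypothetical and not
part of PHP. It assumes that an empty pattern with the /u modifier matches
any valid UTF-8 string, so any return value other than 1 means the subject
was rejected.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns true when $s is valid
// UTF-8. preg_match() does not return 1 when the subject is rejected as
// malformed UTF-8, so we compare strictly against 1.
function is_valid_utf8($s)
{
    return preg_match('//u', $s) === 1;
}

$subject = "fa\xa0ade"; // Valid in ISO-8859-1, but not UTF-8

if (is_valid_utf8($subject)) {
    $subject = preg_replace('//u', '', $subject);
}
// Otherwise $subject is left untouched; converting it to UTF-8 first is up
// to the script author.

var_dump($subject);
```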

Even though it's the fault of the script author, the preg_* functions
should still at least emit an E_NOTICE about bad UTF-8. It's a pain to hunt
through one's proverbial 'miles of code' to figure out why a variable
suddenly 'disappeared', with no file name or line number to start
troubleshooting from.

Workarounds available in the meantime are:

// As of PHP 5.3 (short ternary).
// Caveat: this also falls back to the original string when the
// replacement yields a falsy result, such as the string '0'.
$string = preg_replace(..., $string) ?: $string;

// Other workaround (any PHP version)
$string = is_string($repl = preg_replace(..., $string)) ? $repl : $string;
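
A further userland option is a small wrapper that raises the notice itself.
The following is only a sketch: the function name preg_replace_checked() is
hypothetical, it assumes a string subject (not an array), and it uses
debug_backtrace() to report the caller's file and line in the message.

```php
<?php
// Hypothetical wrapper (not part of PHP): behaves like preg_replace() for
// string subjects, but on failure raises an E_USER_NOTICE and returns the
// original subject instead of NULL.
function preg_replace_checked($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);

    if ($result === null) {
        // Frame 0 holds the file and line where this wrapper was called.
        $trace = debug_backtrace();
        $where = $trace[0]['file'] . ':' . $trace[0]['line'];

        trigger_error(
            'preg_replace() failed (preg_last_error() = '
                . preg_last_error() . ') at ' . $where,
            E_USER_NOTICE
        );

        return $subject; // Keep the original data
    }

    return $result;
}

$subject = "fa\xa0ade"; // Valid in ISO-8859-1, but not UTF-8
$subject = preg_replace_checked('//u', '', $subject);
```

With this wrapper the failed call leaves $subject unchanged, and the notice
text points at the calling line rather than at the wrapper.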


Reproduce code:
---------------
---
From manual page: reference.pcre.pattern.modifiers
---
error_reporting(-1); // Emit all errors

$subject = "fa\xa0ade"; // Valid in ISO-8859-1 (but not UTF-8!)

// Causes a PREG_BAD_UTF8_ERROR and sets $subject to NULL.
// And we didn't make a copy of the original $subject.  Oops!
$subject = preg_replace('//u', '', $subject);

var_dump($subject); // NULL
var_dump(preg_last_error()); // int(4), i.e. PREG_BAD_UTF8_ERROR

---


Actual result:
--------------
preg_replace() returns NULL; checking preg_last_error() verifies a
PREG_BAD_UTF8_ERROR.  No errors, warnings, or notices of any kind were
generated.
We did, however, immediately assign the preg_replace() result back to
$subject, so $subject is now NULL and has lost whatever data it originally
contained. Even though this was obviously our fault, an E_NOTICE would have
told us about it.

-- 
Edit bug report at http://bugs.php.net/?id=49339&edit=1