From: bertrand dot debaenst at gmx dot net
Operating system: Windows XP
PHP version: 5CVS-2007-01-10 (snap)
PHP Bug Type: PCRE related
Bug description: Bug in preg_replace concerning UTF-8 characters

Description:
------------
When preg_replace is applied with the pattern '\s' to a UTF-8 string containing the character 'à' (hex: c3a0), it changes the second byte of this character (a0 becomes 20). Using the pattern '\t\f\r\n', which is supposed to be the same as \s, works perfectly. I have tried other UTF-8 characters and they seem to work.

Reproduce code:
---------------
<?
$text = utf8_encode("this is a test àt");
echo bin2hex($text)."\r\n";
$text1 = preg_replace("'([\t\f\r\n])+'", " ", $text);
echo bin2hex($text1)."\r\n";
echo $text1."\r\n";
$text2 = preg_replace("'([\s])+'", " ", $text);
echo bin2hex($text2)."\r\n";
echo $text2;
?>

Expected result:
----------------
746869732069732061207465737420c3a074
746869732069732061207465737420c3a074
this is a test ├át
746869732069732061207465737420c3a074
this is a test ├át

Actual result:
--------------
746869732069732061207465737420c3a074
746869732069732061207465737420c3a074
this is a test ├át
746869732069732061207465737420c32074
this is a test ├ t

-- 
Edit bug report at http://bugs.php.net/?id=40090&edit=1
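
A possible workaround, sketched under an assumption this report does not confirm: that without the `u` modifier the pattern is matched byte by byte, so the lone byte a0 can be taken for whitespace by `\s`. With the `u` pattern modifier, PCRE treats both pattern and subject as UTF-8, and `\s` is tested against whole characters instead of single bytes. The variable names below are illustrative, and the 'à' is written as explicit bytes so the script does not depend on the source file encoding or on utf8_encode().

```php
<?php
// "this is a test àt", with 'à' given as its UTF-8 bytes c3 a0.
$text = "this is a test \xC3\xA0t";

// The u modifier switches PCRE to UTF-8 mode: \s is compared against
// complete code points, so the trailing byte of 'à' (a0) cannot be
// matched on its own.
$clean = preg_replace('/\s+/u', ' ', $text);

echo bin2hex($text), "\n";
echo bin2hex($clean), "\n";
```

Both lines should show the same hex string, since the input contains only single spaces and no other whitespace. Note that preg_replace returns NULL in UTF-8 mode if the subject is not valid UTF-8, so the result is worth checking before use.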