From:             troublegum at woltlab dot de
Operating system: Windows 2000 Pro SP4
PHP version:      4CVS-2003-10-13 (stable)
PHP Bug Type:     PCRE related
Bug description:  regular expression on a UTF-8 string breaks this string

Description:
------------
I want to run a regular expression on a string to replace every run of
whitespace and commas with a single space. But if the string is UTF-8
encoded, the resulting string is broken.

I was not able to reproduce this behavior on my Debian Linux /
Apache 1.3.28 / PHP 4.3.3 server. It occurs only on my Windows 2000
machine.

Server: Apache 1.3.28
PHP as Apache module

php.ini settings that differ from php.ini-dist:

display_startup_errors = On
magic_quotes_gpc = Off
doc_root = c:/dev/htdocs
extension_dir = c:/dev/php-4.3.3dev/extensions/
upload_max_filesize = 6M
extension=php_bz2.dll
extension=php_gd2.dll
extension=php_gettext.dll
extension=php_mbstring.dll
extension=php_pdf.dll
extension=php_sockets.dll
session.save_path = c:/winnt/temp/php4_sessions
session.use_trans_sid = 1

Reproduce code:
---------------
see http://webpm.woltlab.info/phpgroup/pcre_utf8.phps

Please also read the comments on the regular expression.
I can provide more failing strings if necessary.

Expected result:
----------------
1) Coeur Déjà Pris
2) Coeur Déjà Pris

Actual result:
--------------
1) Coeur Déjà Pris
2) Coeur Déj? 
Pris

-- 
Edit bug report at http://bugs.php.net/?id=25849&edit=1
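
For readers without access to the linked script, the task described above (collapsing runs of whitespace and commas into one space) might look roughly like the sketch below. It is not the reporter's code: the function name `collapse_separators`, the pattern `/[\s,]+/u`, and the sample input are all assumptions made for illustration. The `u` modifier asks PCRE to treat pattern and subject as UTF-8, so a multi-byte character is matched as a whole; whether this avoids the Windows-only behavior reported here has not been verified.

```php
<?php
// Hypothetical stand-in for the reporter's script; the pattern and the
// input string are assumptions, not taken from pcre_utf8.phps.
function collapse_separators($text)
{
    // [\s,]+ matches any run of whitespace characters and commas.
    // The "u" modifier tells PCRE that pattern and subject are UTF-8.
    return preg_replace('/[\s,]+/u', ' ', $text);
}

// "\xC3\xA9" is "é" and "\xC3\xA0" is "à" in UTF-8; byte escapes keep the
// example independent of the encoding of this source file.
$input = "Coeur,  D\xC3\xA9j\xC3\xA0,\tPris";
echo collapse_separators($input), "\n";
```

Comparing the result with and without the `u` modifier on the affected Windows machine would show whether byte-wise matching is involved.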