From:             sdamir at gmail dot com
Operating system: Linux 2.6.18
PHP version:      5.2.0
PHP Bug Type:     *Regular Expressions
Bug description:  UTF8 support

Description:
------------
I am trying to match all alphabetic UTF-8 characters. I know (tested) that in Perl, if $string is UTF-8 encoded and I use a regex like =~ /\w/, it will match all alphabetic UTF-8 characters (Cyrillic alphabet, Chinese, English etc.). However, this is not the case for PHP. I read that I need to use special patterns like \pL, but this doesn't work for me either: it matches some characters, but Cyrillic letters aren't matched.

I don't know if this is a bug or if I am doing something wrong, but I really searched the hell out of everything and visited tons of IRC support channels, and no one has an answer to this.

Reproduce code:
---------------
<?php
// setlocale(LC_ALL, 'en_US.utf8');
// if I set the locale to en_US, it matches some characters like öåä but not Cyrillic; en_US.utf8 won't match anything.
$str=" Срећа ";
utf8_encode($str);
var_dump($str);
preg_match("/[\w\pL]/u",$str, $r);
var_dump($r);
?>

Expected result:
----------------
string(3) " s "
array(1) {
  [0]=>
  string(1) "С"
}

Actual result:
--------------
string(12) " Срећа "
array(0) {
}

-- 
Edit bug report at http://bugs.php.net/?id=39744&edit=1