 ID:               39744
 User updated by:  sdamir at gmail dot com
 Reported By:      sdamir at gmail dot com
 Status:           Open
 Bug Type:         *Regular Expressions
 Operating System: Linux 2.6.18
 PHP Version:      5.2.0
 New Comment:

I don't know why, but your bug system converted the letters in my PHP code into &#crap; stuff.


Previous Comments:
------------------------------------------------------------------------

[2006-12-05 15:48:49] sdamir at gmail dot com

Description:
------------
I am trying to match all alphabetic UTF-8 characters. I know (tested) that in Perl, if $string is UTF-8 encoded and I use a regex like =~ /\w/, it will match all alphabetic UTF-8 characters (Cyrillic alphabet, Chinese, English, etc.). However, this is not the case for PHP. I read that I need to use special patterns like \pL, but this doesn't work for me either: it matches some characters, but Cyrillic letters aren't matched. I don't know if this is a bug or if I am doing something wrong, but I really searched the hell out of everything and visited tons of IRC support channels, and no one has an answer to this.

Reproduce code:
---------------
<?php
// setlocale(LC_ALL, 'en_US.utf8');
// If I set the locale to en_US, it matches some characters like öåä but not Cyrillic; en_US.utf8 won't match anything.
$str=" Срећа ";
utf8_encode($str);
var_dump($str);
preg_match("/[\w\pL]/u",$str, $r);
var_dump($r);
?>

Expected result:
----------------
string(3) " s "
array(1) {
  [0]=>
  string(1) "С"
}

Actual result:
--------------
string(12) " Срећа "
array(0) {
}


------------------------------------------------------------------------
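
For anyone trying to reproduce this, the test case above can be narrowed down to a minimal sketch. It assumes the script file itself is saved as UTF-8 (the reported string(12) suggests it is) and that PCRE was built with Unicode property support. The utf8_encode() call is left out because it expects ISO-8859-1 input and its return value was discarded in the original anyway. The empty-pattern validity check is an added step, not part of the original report.

```php
<?php
// Assumption: this file is saved as UTF-8, so $str already holds UTF-8 bytes.
$str = " Срећа ";

// With the /u modifier, preg_match() returns false if the subject is not
// valid UTF-8, so an empty pattern works as a cheap encoding check.
if (preg_match('//u', $str) === false) {
    die("input is not valid UTF-8\n");
}

// \p{L} is the Unicode "letter" property. It is only available when the
// PCRE library was compiled with Unicode property support.
$count = preg_match('/\p{L}/u', $str, $r);

var_dump($count);
var_dump($r);
?>
```

If the validity check passes but \p{L} still fails to match a Cyrillic letter, the PCRE build in use becomes the main suspect rather than the input encoding.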