ID:               37775
User updated by:  stronk7 at moodle dot org
Reported By:      stronk7 at moodle dot org
Status:           Bogus
Bug Type:         PCRE related
Operating System: Windows XP
PHP Version:      5.1.4
New Comment:

Sorry, I realise that in my previous post I wrote Ï = C3 AF, and it should be:

Ï = C3 8F (where 8F is a reserved char)

(from http://www.microsoft.com/globaldev/reference/sbcs/1252.mspx)


Previous Comments:
------------------------------------------------------------------------

[2006-06-11 00:17:05] stronk7 at moodle dot org

Hi!

They aren't three machines but three ways to do "the same thing" on the same XP box. If everything were working fine, the first and second ways (both using [[:cntrl:]], one PCRE and the other POSIX) should return the same result, shouldn't they?

I've confirmed that Ï = C3 AF (in utf-8) and AF seems to be a reserved position under win-1252, so your explanation makes sense, assuming that reserved chars = control chars, but it should work the same under both PCRE and POSIX replace, or am I wrong?

------------------------------------------------------------------------

[2006-06-10 23:53:37] [EMAIL PROTECTED]

Such POSIX character classes depend on the current locale. If you use setlocale() on the 3 machines with the same locale, you'll get the same results. (The definition of a control char is taken from the iscntrl() system function.)

------------------------------------------------------------------------

[2006-06-10 23:06:55] stronk7 at moodle dot org

Description:
------------
I was using a simple preg_replace() to clean control characters from strings and, under XP, I found that some utf-8 characters are also modified although they don't contain control characters (\x00-\x1f and \x7f) at all. The same code seems to work properly under MacOS X and Linux.

Please note that the code below is utf-8 and should be pasted with the editor in that mode.

The failing char seems to be the upper-case i with dieresis: Ï

The example includes the non-working case (first) plus two alternatives that work properly under XP.

Ciao :-)

Reproduce code:
---------------
<?php

// 1) PCRE with the POSIX class (fails under XP)
$orig = "IIÏÏïï";
$dest = preg_replace("/[[:cntrl:]]/","",$orig);
echo $dest;
echo "\n<br>\n";

// 2) POSIX regex with the same class (works)
$orig = "IIÏÏïï";
$dest = ereg_replace("[[:cntrl:]]","",$orig);
echo $dest;
echo "\n<br>\n";

// 3) PCRE with an explicit byte range (works)
$orig = "IIÏÏïï";
$dest = preg_replace("/[\x00-\x1f]/","",$orig);
echo $dest;
echo "\n<br>\n";

?>

Expected result:
----------------
Should return IIÏÏïï in the three alternatives.

Actual result:
--------------
This returns:

II??ïï  <--- incorrect
<br>
IIÏÏïï  <--- correct
<br>
IIÏÏïï  <--- correct
<br>

------------------------------------------------------------------------
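
The byte values quoted in the comments can be checked directly, and the third alternative in the reproduce code can be extended to cover \x7f as well. The following is only a minimal sketch, assuming the file is saved as utf-8; strip_ascii_controls() is a hypothetical helper name, not something from the report. Because it lists the byte ranges explicitly instead of using [[:cntrl:]], it should not depend on the locale's iscntrl() tables.

```php
<?php
// Hex dump of the utf-8 encoding of the two characters discussed above.
// Assumes this source file itself is saved as utf-8.
$upper = bin2hex("Ï"); // upper-case i with dieresis
$lower = bin2hex("ï"); // lower-case i with dieresis
echo "upper = $upper, lower = $lower\n";

// Hypothetical helper (not from the report): strip ASCII control bytes
// using explicit ranges rather than the locale-dependent [[:cntrl:]] class.
function strip_ascii_controls($text)
{
    // Single quotes: the \xNN escapes are interpreted by PCRE, not by PHP.
    // Without the /u modifier the pattern is matched byte by byte, and
    // utf-8 multibyte sequences only use bytes >= 0x80, so they are kept.
    return preg_replace('/[\x00-\x1f\x7f]/', '', $text);
}

// A tab and a DEL byte are inserted between the letters as test input.
$clean = strip_ascii_controls("II\tÏÏ\x7fïï");
echo bin2hex($clean), "\n";
?>
```

The hex dump shows C3 8F for the upper-case character and C3 AF for the lower-case one, which matches the correction in the new comment.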