From: gehrig at ishd dot de
Operating system: Windows XP
PHP version: 5.2.6
PHP Bug Type: Strings related
Bug description: strcoll() does not work with UTF-8 strings on Windows
Description:
------------
The strcoll() function, which compares strings in a locale-aware manner,
does not seem to work with UTF-8 encoded strings on Windows, even when the
correct Windows locale with the UTF-8 code page (65001) is set. strcoll()
always returns 2147483647, which makes sorting an array of such strings
(for example) more or less random.
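The value 2147483647 equals INT_MAX, which the Microsoft C runtime defines
as _NLSCMPERROR, the error return of its strcoll(). The following is only a
minimal sketch of how a caller might guard against that value, assuming PHP
passes the C return value through unchanged. The constant STRCOLL_ERROR, the
helper collateOrFallback() and the fallback to strcmp() are hypothetical and
not part of PHP:

```php
<?php
// Assumption: 2147483647 (INT_MAX) is the C runtime's error sentinel
// for strcoll() on Windows, not a real comparison result.
define('STRCOLL_ERROR', 2147483647);

// Hypothetical helper: locale-aware comparison that falls back to a
// byte-wise comparison when strcoll() reports the sentinel.
function collateOrFallback($a, $b) {
    $result = strcoll($a, $b);
    if ($result === STRCOLL_ERROR) {
        // Not locale-aware, but at least consistent for usort().
        return strcmp($a, $b);
    }
    return $result;
}
```

Used as a usort() callback, this keeps the sort order deterministic, although
the fallback order is byte-wise and not the German collation order.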
Running the same snippet with Windows-1252 (a superset of ISO-8859-1)
encoded strings, or on a Linux machine, works as expected.
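Since the Windows-1252 variant works, one possible workaround is to
transcode the strings before comparing them. This is a sketch only, assuming
that the mbstring extension is loaded, that every character involved exists
in Windows-1252, and that LC_COLLATE has been set to a locale using that code
page (for example 'German_Germany.1252', which is an assumption and not taken
from this report). The helper names toCp1252() and strcollViaCp1252() are
hypothetical:

```php
<?php
// Hypothetical helper: convert a UTF-8 string to Windows-1252 bytes.
function toCp1252($s) {
    return mb_convert_encoding($s, 'Windows-1252', 'UTF-8');
}

// Hypothetical helper: compare two UTF-8 strings with strcoll() after
// transcoding both to Windows-1252. The caller is expected to have set
// a matching LC_COLLATE locale beforehand.
function strcollViaCp1252($a, $b) {
    return strcoll(toCp1252($a), toCp1252($b));
}
```

The original UTF-8 strings stay untouched; only the temporary copies passed to
strcoll() are transcoded.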
Please note: to run the reproduce code below, the PHP file must be saved
with UTF-8 encoding!
Reproduce code:
---------------
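<?php
// Comparison callback that prints each strcoll() result before returning it.
function traceStrColl($a, $b) {
    $outValue = strcoll($a, $b);
    echo "$a $b $outValue\r\n";
    return $outValue;
}

// Pick the German UTF-8 locale name for the current platform.
$locale = (defined('PHP_OS') && stristr(PHP_OS, 'win')) ?
    'German_Germany.65001' : 'de_DE.utf8';

$string = "ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜabcdefghijklmnopqrstuvwxyzäöüß";

// Split the string into single (possibly multi-byte) characters.
$array = array();
for ($i = 0; $i < mb_strlen($string, 'UTF-8'); $i++) {
    $array[] = mb_substr($string, $i, 1, 'UTF-8');
}

// Passing "0" returns the current locale without changing it.
$oldLocale = setlocale(LC_COLLATE, "0");
var_dump(setlocale(LC_COLLATE, $locale));
usort($array, 'traceStrColl');
setlocale(LC_COLLATE, $oldLocale);
var_dump($array);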
Expected result:
----------------
string(20) "German_Germany.65001"
a B -1
[...]
array(59) {
[0]=>
string(1) "a"
[1]=>
string(1) "A"
[2]=>
string(2) "ä"
[3]=>
string(2) "Ä"
[4]=>
string(1) "b"
[5]=>
string(1) "B"
[6]=>
string(1) "c"
[7]=>
string(1) "C"
[8]=>
string(1) "d"
[9]=>
string(1) "D"
[10]=>
string(1) "e"
[11]=>
string(1) "E"
[12]=>
string(1) "f"
[13]=>
string(1) "F"
[14]=>
string(1) "g"
[15]=>
string(1) "G"
[16]=>
string(1) "h"
[17]=>
string(1) "H"
[18]=>
string(1) "i"
[19]=>
string(1) "I"
[20]=>
string(1) "j"
[21]=>
string(1) "J"
[22]=>
string(1) "k"
[23]=>
string(1) "K"
[24]=>
string(1) "l"
[25]=>
string(1) "L"
[26]=>
string(1) "m"
[27]=>
string(1) "M"
[28]=>
string(1) "n"
[29]=>
string(1) "N"
[30]=>
string(1) "o"
[31]=>
string(1) "O"
[32]=>
string(2) "ö"
[33]=>
string(2) "Ö"
[34]=>
string(1) "p"
[35]=>
string(1) "P"
[36]=>
string(1) "q"
[37]=>
string(1) "Q"
[38]=>
string(1) "r"
[39]=>
string(1) "R"
[40]=>
string(1) "s"
[41]=>
string(1) "S"
[42]=>
string(2) "ß"
[43]=>
string(1) "t"
[44]=>
string(1) "T"
[45]=>
string(1) "u"
[46]=>
string(1) "U"
[47]=>
string(2) "ü"
[48]=>
string(2) "Ü"
[49]=>
string(1) "v"
[50]=>
string(1) "V"
[51]=>
string(1) "w"
[52]=>
string(1) "W"
[53]=>
string(1) "x"
[54]=>
string(1) "X"
[55]=>
string(1) "y"
[56]=>
string(1) "Y"
[57]=>
string(1) "z"
[58]=>
string(1) "Z"
}
Actual result:
--------------
string(20) "German_Germany.65001"
a B 2147483647
[...]
array(59) {
[0]=>
string(1) "c"
[1]=>
string(1) "B"
[2]=>
string(1) "s"
[3]=>
string(1) "C"
[4]=>
string(1) "k"
[5]=>
string(1) "D"
[6]=>
string(2) "ä"
[7]=>
string(1) "E"
[8]=>
string(1) "g"
[9]=>
string(1) "F"
[10]=>
string(1) "o"
[11]=>
string(1) "G"
[12]=>
string(1) "w"
[13]=>
string(1) "H"
[14]=>
string(1) "A"
[15]=>
string(1) "I"
[16]=>
string(1) "e"
[17]=>
string(1) "J"
[18]=>
string(1) "i"
[19]=>
string(1) "K"
[20]=>
string(1) "m"
[21]=>
string(1) "L"
[22]=>
string(1) "q"
[23]=>
string(1) "M"
[24]=>
string(1) "u"
[25]=>
string(1) "N"
[26]=>
string(1) "y"
[27]=>
string(1) "O"
[28]=>
string(2) "ü"
[29]=>
string(1) "P"
[30]=>
string(1) "b"
[31]=>
string(1) "Q"
[32]=>
string(1) "d"
[33]=>
string(1) "R"
[34]=>
string(1) "f"
[35]=>
string(1) "S"
[36]=>
string(1) "h"
[37]=>
string(1) "T"
[38]=>
string(1) "j"
[39]=>
string(1) "U"
[40]=>
string(1) "l"
[41]=>
string(1) "V"
[42]=>
string(1) "n"
[43]=>
string(1) "W"
[44]=>
string(1) "p"
[45]=>
string(1) "X"
[46]=>
string(1) "r"
[47]=>
string(1) "Y"
[48]=>
string(1) "t"
[49]=>
string(1) "Z"
[50]=>
string(1) "v"
[51]=>
string(2) "Ä"
[52]=>
string(1) "x"
[53]=>
string(2) "Ö"
[54]=>
string(1) "z"
[55]=>
string(2) "Ü"
[56]=>
string(2) "ö"
[57]=>
string(1) "a"
[58]=>
string(2) "ß"
}
--
Edit bug report at http://bugs.php.net/?id=46165&edit=1