 ID:               35300
 Updated by:       [EMAIL PROTECTED]
 Reported By:      dmceo415 at yahoo dot com
-Status:           Open
+Status:           Bogus
-Bug Type:         Directory function related
+Bug Type:         *General Issues
 Operating System: Windows XP
 PHP Version:      5.0.5
 New Comment:

Please do not submit the same bug more than once. An existing bug report already describes this very problem. Even if you feel that your issue is somewhat different, the resolution is likely to be the same.

Thank you for your interest in PHP.

There are already feature requests for this. This is not a bug but a missing feature, and that feature will be in PHP 6.


Previous Comments:
------------------------------------------------------------------------

[2005-11-20 00:50:46] dmceo415 at yahoo dot com

Description:
------------
On an NTFS-formatted drive, Windows XP stores file names as UTF-16LE, in which every character of the Basic Multilingual Plane occupies two bytes.

PHP's "readdir" has no problem reading NTFS file names that consist only of characters included in the ISO-8859-1 repertoire. For these characters, the first UTF-16LE byte is the same as the ISO-8859-1 representation, and the second byte is simply 0x00, the null byte, which causes no damage.

However, for two-byte characters at higher code points (Chinese characters, for example) "readdir" fails.

To illustrate, I created a directory with two files. One of the files is named with Chinese characters; the other is named with ISO-8859-1 compatible characters. The following code:

    if ($dir = opendir('.\TestFiles')) {
        while (false !== ($file = readdir($dir))) {
            echo $file . '<br>';
        }
    }

produces this output:

    .
    ..
    ???????.m4a
    Greensleeves.m4a

Clearly, "readdir" does not handle the two-byte Chinese characters correctly. Instead, it returns question marks.

This is a big problem if one wants to use a PHP script to back up an NTFS-formatted drive.

(And, by the way, calling mb_internal_encoding("UTF-16LE") does not solve the problem; it seems to have no effect on "readdir".)

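The byte layout described above can be checked with a few lines of PHP. The sketch below is an illustration only, not PHP's actual directory code: narrow_utf16le_to_latin1 is a hypothetical helper, and it assumes that the question marks come from a lossy conversion of the UTF-16LE name to a single-byte character set. Plain ISO-8859-1 is used here for simplicity; a real Windows system would use its active ANSI code page.

```php
<?php
// Hypothetical helper, not part of PHP: narrows a UTF-16LE string to
// ISO-8859-1, substituting '?' for any 16-bit code unit above 0xFF.
function narrow_utf16le_to_latin1($utf16le)
{
    $out = '';
    // 'v*' unpacks the string as unsigned 16-bit little-endian integers.
    foreach (unpack('v*', $utf16le) as $unit) {
        $out .= ($unit <= 0xFF) ? chr($unit) : '?';
    }
    return $out;
}

// "G" (U+0047) in UTF-16LE: first byte 0x47, second byte 0x00.
$latin = "\x47\x00";
// U+4E2D (a Chinese character) in UTF-16LE: first byte 0x2D, second byte 0x4E.
$cjk = "\x2D\x4E";

echo bin2hex($latin), ' -> ', narrow_utf16le_to_latin1($latin), "\n";
echo bin2hex($cjk), ' -> ', narrow_utf16le_to_latin1($cjk), "\n";
```

Under these assumptions the ISO-8859-1 compatible character survives the narrowing, while the Chinese character collapses to a single question mark. That would be consistent with the listing above if the file name contains seven such characters.
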
Expected result:
----------------
"readdir" should be able to handle UTF-16 (i.e., two-byte) characters

Actual result:
--------------
"readdir" does not interpret two-byte characters correctly


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=35300&edit=1