ID:               35300
 Updated by:       [EMAIL PROTECTED]
 Reported By:      dmceo415 at yahoo dot com
-Status:           Open
+Status:           Bogus
-Bug Type:         Directory function related
+Bug Type:         *General Issues
 Operating System: Windows XP
 PHP Version:      5.0.5
 New Comment:

Please do not submit the same bug more than once. An existing
bug report already describes this very problem. Even if you feel
that your issue is somewhat different, the resolution is likely
to be the same. 

Thank you for your interest in PHP.

There are already feature requests for this. This is not a bug but a
missing feature, and this missing feature will be in PHP 6.



Previous Comments:
------------------------------------------------------------------------

[2005-11-20 00:50:46] dmceo415 at yahoo dot com

Description:
------------
On an NTFS-formatted drive, Windows XP stores file names as UTF-16LE, so
every character in the Basic Multilingual Plane occupies two bytes.

PHP's "readdir" has no problem reading NTFS file names that consist
only of characters included in the ISO-8859-1 repertoire. For these
characters, the first UTF-16LE byte is the same as the ISO-8859-1
representation, and the second byte is simply 0x00, the null byte,
which causes no damage.

However, for two-byte characters at higher code points - Chinese
characters, for example - "readdir" fails.
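
The byte layout described above can be checked with a minimal sketch that
uses only core PHP functions. The helper name utf16le_unit() is hypothetical
(not a PHP built-in), and U+4E2D is just an arbitrary Chinese character chosen
for illustration; the actual file name in this report is not known.

```php
<?php
// Hypothetical helper: encode one Basic Multilingual Plane code point
// as a single UTF-16LE code unit (16 bits, low byte first).
function utf16le_unit($codePoint)
{
    return pack('v', $codePoint);
}

// 'G' is U+0047; its ISO-8859-1 form is the single byte 0x47.
// In UTF-16LE the same byte comes first, followed by 0x00.
echo bin2hex(utf16le_unit(ord('G'))), "\n";

// U+4E2D has no ISO-8859-1 form; both bytes of the code unit are non-zero.
echo bin2hex(utf16le_unit(0x4E2D)), "\n";
```

The first line printed is the hex for bytes 0x47 0x00, the second for bytes
0x2D 0x4E, which is why only the first kind of character survives a
conversion to a single-byte character set.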

To illustrate, I created a directory with two files. One of the files
is named with Chinese characters; the other is named with ISO-8859-1
compatible characters. The following code:

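// List every entry in the test directory, one per line.
if ($dir = opendir('.\TestFiles')) {
  while (false !== ($file = readdir($dir))) {
    echo $file . '<br>';
  }
  closedir($dir);  // release the directory handle
}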

produces this output:

.
..
???????.m4a
Greensleeves.m4a

Clearly, "readdir" does not handle the two-byte Chinese characters
correctly. Instead, it returns question marks. This is a big problem if
one wants to use a PHP script to back up an NTFS-formatted drive.
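
Until the names can be read correctly, a backup script could at least detect
the affected entries instead of silently mishandling them. The following is a
rough sketch, not a fix: it assumes the question marks are substitutions made
for characters that cannot be represented, and it relies on the fact that "?"
is not a legal character in a Windows file name, so a "?" in a returned name
signals a lossy conversion. The function name has_lossy_name() is
hypothetical, and the check is not valid on file systems where "?" is
allowed in names.

```php
<?php
// Hypothetical helper: true if a name returned by readdir() on Windows
// contains '?', which cannot occur in a real Windows file name.
function has_lossy_name($name)
{
    return strpos($name, '?') !== false;
}

// The entries shown in the output above.
$entries = array('.', '..', '???????.m4a', 'Greensleeves.m4a');

foreach ($entries as $entry) {
    if (has_lossy_name($entry)) {
        // The real name is unknown here, so the file cannot be opened by it.
        echo "Skipping unreadable name: $entry\n";
    }
}
```

With the four entries from this report, only the first .m4a file is flagged.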

(And, by the way, using mb_internal_encoding("UTF-16LE") does not solve
the problem; this seems to have no impact on "readdir".)

Expected result:
----------------
"readdir" should be able to handle UTF-16 (i.e., two-byte) characters.

Actual result:
--------------
"readdir" does not interpret two-byte characters correctly; it returns a
question mark in place of each one.


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=35300&edit=1
