ID:               43148
 Comment by:       carsten_sttgt at gmx dot de
 Reported By:      banu_daniel1 at yahoo dot com
 Status:           Open
 Bug Type:         Filesystem function related
 Operating System: windows xp 32 bits
 PHP Version:      5.2.4
 New Comment:

> but the problem is still there even on windows xp
> so this is the problem filesize function does not
> work with filenames with unicode characters.

Ok, after some more tests, I can reproduce this problem. Just look at
this shell log:
| D:\>cd D:\Apache2.2\htdocs\test\αβγδεζηθ
|
| D:\Apache2.2\htdocs\test\αβγδεζηθ>dir /b
| index.html
| phpinfo.php
|
| D:\Apache2.2\htdocs\test\αβγδεζηθ>type index.html
| <html><body><h1>It works!</h1></body></html>
|
| D:\Apache2.2\htdocs\test\αβγδεζηθ>type phpinfo.php
| <?php phpinfo(); ?>
|
| D:\Apache2.2\htdocs\test\αβγδεζηθ>pear-request http://localhost/test/%ce%b1%ce%b2%ce%b3%ce%b4%ce%b5%ce%b6%ce%b7%ce%b8/index.html
| <html><body><h1>It works!</h1></body></html>
|
| D:\Apache2.2\htdocs\test\αβγδεζηθ>php -r "echo getcwd();"
| D:\Apache2.2\htdocs\test\aß?de???
|
| D:\Apache2.2\htdocs\test\αβγδεζηθ>cd..
|
| D:\Apache2.2\htdocs\test>php -r "var_dump(stat('αβγδεζηθ'));"
|
| Warning: stat(): stat failed for aß?de??? in Command line code on line 1
| bool(false)
|
| D:\Apache2.2\htdocs\test>

As you can see, I can't execute a PHP script in this folder
("αβγδεζηθ") or use the PHP filesystem functions with this path. But I
can access this folder correctly with Apache via HTTP.


> on linux version i don't have this problem.

That's the difference. On Linux, PHP simply passes the UTF-8 bytes of
the name through to the filesystem. But Windows stores names as UTF-16,
and its non-wide API converts them using the current codepage of the
installed locale.
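
To make the difference concrete, here is a minimal sketch (not part of
the original tests) that prints the same folder name in the encodings
involved. It assumes the iconv extension is available; the variable
names are only illustrative.

```php
<?php
// The folder name from the log, written as raw UTF-8 bytes so the
// example does not depend on how this file itself is saved.
$name = "\xce\xb1\xce\xb2\xce\xb3\xce\xb4\xce\xb5\xce\xb6\xce\xb7\xce\xb8";

// UTF-8: two bytes per Greek letter (the bytes PHP hands to the OS).
$utf8Hex = bin2hex($name);

// UTF-16LE: the form the Windows wide-character API works with.
$utf16Hex = bin2hex(iconv('UTF-8', 'UTF-16LE', $name));

// Percent-encoding the UTF-8 bytes gives the form used in the HTTP request.
$urlForm = strtolower(rawurlencode($name));

echo $utf8Hex, "\n", $utf16Hex, "\n", $urlForm, "\n";
```

The third line of output matches the path segment in the pear-request
call above, which suggests Apache treats the URL bytes as UTF-8 and
converts them before opening the folder.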


Just look at this script "test.php" (encoded in UTF-8):
| <?php
| mkdir('αβγδεζηθ');
| var_dump(is_dir('αβγδεζηθ'));
| ?>

and the shell log:
| D:\Apache2.2\htdocs\test>php test.php
| bool(true)
|
| D:\Apache2.2\htdocs\test>dir /b
| test.php
| αβγδεζηθ
| 
| D:\Apache2.2\htdocs\test>

As you can see, you can create and access a path with such a name from
PHP, but only inside PHP. In Windows or Apache you must use another
(wrong) name. In this case PHP is just passing the byte sequence of the
UTF-8 chars, and Windows reads each byte as one Latin1 char.
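
A small sketch of that misreading, assuming the iconv extension and
assuming Latin1 (ISO-8859-1) as the codepage. The real ANSI codepage of
the system is not stated here, and a console using an OEM codepage may
display the name differently again.

```php
<?php
// UTF-8 bytes of the folder name used in test.php.
$utf8Name = "\xce\xb1\xce\xb2\xce\xb3\xce\xb4\xce\xb5\xce\xb6\xce\xb7\xce\xb8";

// Treat every single byte as one Latin1 character and re-encode the
// result as UTF-8 so it can be displayed: 8 letters become 16 chars.
$asSeenByWindows = iconv('ISO-8859-1', 'UTF-8', $utf8Name);
echo $asSeenByWindows, "\n";

// The mapping is reversible, which is why PHP can still find the
// directory it created: it sends the same bytes again.
$recovered = iconv('UTF-8', 'ISO-8859-1', $asSeenByWindows);
var_dump($recovered === $utf8Name);
```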

This can be a quick fix for you, but is indeed not correct.

The problem is that PHP only uses the simple string and filesystem
functions in its C sources, which only work with the codepage of the
current locale. It does not use the wide-char string and filesystem
functions from the Windows SDK, as Apache does.

BTW:
With a current PHP6 snap (full unicode support?), this doesn't work
either.

Regards,
Carsten

BTW:
There is another bug in this bugtracker. You can't use UTF-8 chars in
bug reports: after submitting a comment, UTF-8 chars are replaced with
entities, but all comments are placed between <pre> tags. Thus the
browser shows entities and not the correct chars.

Please open this html page with a browser:
| <html>
| <head>
| <meta http-equiv=content-type content="text/html; charset=UTF-8">
| </head>
| <body>
| &#945;&#946;&#947;&#948;&#949;&#950;&#951;&#952;
| </body>
| </html>
and replace all entities in my comment with the chars you can see in
the browser.


Previous Comments:
------------------------------------------------------------------------

[2007-11-01 22:11:12] banu_daniel1 at yahoo dot com

No, I didn't see that. I removed that " and the result is exactly the
same ( Array ( ) ).
I've tried with other folders (non-UTF) and it works.

------------------------------------------------------------------------

[2007-11-01 21:57:27] carsten_sttgt at gmx dot de

> $dirs = glob('"D:/Downloads/*', GLOB_ONLYDIR);
              --^
Please remove my typo... (you have not seen that?):
| $dirs = glob('D:/Downloads/*', GLOB_ONLYDIR);

Regards,
Carsten

------------------------------------------------------------------------

[2007-11-01 21:42:27] banu_daniel1 at yahoo dot com

$dirs = glob('"D:/Downloads/*', GLOB_ONLYDIR);
print_r($dirs);

result is
Array ( )

------------------------------------------------------------------------

[2007-11-01 20:59:12] carsten_sttgt at gmx dot de

> print_r(glob("D:\Downloads\動之家發佈\ハヤテのごとく01\DVD_VIDEO.MDS"));

Sorry, I was not clear enough... Don't pass the full path to glob().
Just look at what glob() returns:
| $dirs = glob('"D:/Downloads/*', GLOB_ONLYDIR);

And now analyse the array $dirs. The value of one of the keys must be
the directory "D:/Downloads/動之家發佈", and look how this directory is
stored in the array.
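
One possible way to look at the stored names is to dump their raw
bytes. This is only a sketch: dump_dir_names() is a hypothetical helper,
not an existing PHP function, and the pattern is the one from this
report.

```php
<?php
// Hypothetical helper: list the sub-directories matching $pattern and
// return each path together with the hex dump of its raw bytes.
function dump_dir_names($pattern)
{
    $rows = array();
    $dirs = glob($pattern, GLOB_ONLYDIR);
    if ($dirs === false) {
        // glob() can return false on error; treat that as "nothing found".
        return $rows;
    }
    foreach ($dirs as $dir) {
        $rows[$dir] = bin2hex($dir);
    }
    return $rows;
}

// Pattern taken from the report; adjust it to your own folder.
print_r(dump_dir_names('D:/Downloads/*'));
```

A "3f" in the hex dump is a literal "?", which would mean the name was
already damaged when the system handed it to PHP.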

Regards,
Carsten

------------------------------------------------------------------------

[2007-11-01 19:54:39] banu_daniel1 at yahoo dot com

and again all characters are converted to HTML code

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/43148
