php-windows Digest 18 Aug 2013 09:22:23 -0000 Issue 4140

Topics (messages 31133 through 31136):

Re: php can't resolve 8.3 paths to unicode filenames, is that expected ?
        31133 by: Pierre Joye
        31134 by: Pierre Joye
        31135 by: Pierre Joye
        31136 by: lester.lsces.co.uk


----------------------------------------------------------------------
--- Begin Message ---
On Fri, Aug 16, 2013 at 9:40 AM, R. S. <rs...@live.com> wrote:
> Hello Pierre,
>
> Friday, August 16, 2013, 5:38:01 AM, you wrote:
>
>> and  'TESTUN~1.PHP' has unicode, created using CreateFileW with
>> 'testunicode-ßäü123-öâ.php' as path. Fetch the 8.3 path and use it to
>> open the file using php, works. In other words, exactly what your COM
>> script does. Now, why you came to the conclusion that PHP can't open
>> 8.3 paths is a mystery to me right now. It can, and without issue.
>
> It still seems you didn't read my examples, but this time your fresh mind
> actually helped narrow down the problem, and thanks
> for that!
>
> Indeed 'testunicode-ßäü123-öâ.php' works without
> any issue. It seems that only some character ranges
> make PHP unable to get to the file, like Russian or Greek.
>
> If you make files like:
> testunicode-Ελλάδα or testunicode-Россия
>
> then PHP can't get a handle to these files, with errors like:
>
> Warning: fopen(C:\TESTUN~3.PHP): failed to open stream: Invalid argument
> in C:\test.php on line 7



> So the problem is even stranger.

Ok, now I can reproduce it. That's actually weird and I have to look at it :)

> How is the PHP filesystem function ever supposed to
> know that the path contains Unicode letters, since it was given the short path?

Again, it does not. Nothing in PHP deals with Unicode paths.

> It really looks like someone tried to implement Unicode filename handling in
> PHP but stopped in the middle.

No, nobody tried in any active branches, nor in master (not released
yet). And believe me, I know that very well.

Cheers,
-- 
Pierre

@pierrejoye | http://www.libgd.org

--- End Message ---
--- Begin Message ---
On Sat, Aug 17, 2013 at 7:17 AM, Pierre Joye <pierre....@gmail.com> wrote:
> Ok, now I can reproduce it. That's actually weird and I have to look at it :)

Ok, it looks like I found a bug, not in PHP but in the Windows API.
Which PHP version do you use (the newest versions use a different CRT)?

-- 
Pierre

@pierrejoye | http://www.libgd.org

--- End Message ---
--- Begin Message ---
hi!

On Fri, Aug 16, 2013 at 10:45 PM, Lester Caine <les...@lsces.co.uk> wrote:
> R. S. wrote:
>>
>> if you make files like:
>> testunicode-Ελλάδα or testunicode-Россия
>>
>> Then php can't get handle to these files with errors like:
>
>
> RS ... Just out of curiosity, do the problem ranges of alphabets fall
> outside of the limited 'UTF16' range that M$ uses for 'wide strings'? I'm
> busy out at an exhibition this weekend so only have limited access to my
> database, but I do seem to recall that since wide strings are only 16 bit
> based, any character area going into the 24bit region (3 byte) is not
> supported.

PHP does not use the wild char APIs at all. So it is not related to
the wild change encoding/charset support.
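
On the 16-bit range question itself, here is a quick check as a rough
sketch (in Python rather than PHP, and the helper names are made up for
illustration). It suggests that the sample names from this thread each fit
in a single UTF-16 code unit per character, so a limit at 16 bits would not
by itself explain the failure:

```python
# Sample name fragments taken from the thread; the helpers are hypothetical.
SAMPLES = ["ßäü123-öâ", "Ελλάδα", "Россия"]


def fits_in_one_utf16_unit(text: str) -> bool:
    """True if every character is in the Basic Multilingual Plane (<= U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def utf16_units(text: str) -> int:
    """Number of 16-bit code units needed to encode text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


for name in SAMPLES:
    # One code unit per character means no surrogate pairs are involved.
    print(name, fits_in_one_utf16_unit(name), utf16_units(name) == len(name))
```

Only characters above U+FFFF need two code units (a surrogate pair), and
none of the Greek or Russian letters used here are in that range.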

The problem here is that for whatever reason (still to be figured out)
FindFirstFileW is used instead of FindFirstFileA when this kind of
path is given (I was only able to reproduce it with Russian). This is
definitely a bug in the CRT and I have to discuss that with my
colleagues from the VC team.
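
One difference between the names that work and the ones that fail can be
sketched as follows. This is only an illustration in Python, and it assumes
a Western European ANSI code page such as Windows-1252; the thread does not
say which code page the test machine uses, and the helper name is
hypothetical. Under that assumption, the German and French letters are
representable in the code page that the "A" functions work with, while the
Greek and Russian ones are not:

```python
def representable(text: str, codepage: str = "cp1252") -> bool:
    """True if text can be encoded in the given (assumed) ANSI code page."""
    try:
        text.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


# File names taken from the thread.
for name in ("testunicode-ßäü123-öâ.php",
             "testunicode-Ελλάδα",
             "testunicode-Россия"):
    print(name, representable(name))

# With a Cyrillic code page (again an assumption), the Russian name fits.
print(representable("testunicode-Россия", "cp1251"))
```

This does not show where the CRT goes wrong; it only marks which of the
sample names fall outside the assumed code page.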

Cheers,
-- 
Pierre

@pierrejoye | http://www.libgd.org

--- End Message ---
--- Begin Message ---
Did you mean wide character?
I was just flagging up that these have a limited range on both M$ and the
Inprise platforms, and some subsets do get missed.
Just a thought as to a possible source of the problem ....


--- End Message ---
