On Fri, Feb 27, 2009 at 2:35 PM, Shawn Erickson <shaw...@gmail.com> wrote:
> On Fri, Feb 27, 2009 at 2:15 PM, Martin Wierschin <mar...@nisus.com> wrote:
>
>> On 2009.02.27, at 5:58 AM, Michael Ash wrote:
>>
>>> HFS+ only accepts non-UTF-8 by URL-encoding (!) the non-UTF-8 bytes
>>
>> Wow, that's pretty horrific.
>
> It also isn't really correct. HFS+ doesn't use UTF-8; it uses and
> stores Unicode (fully decomposed and in canonical order).

Mostly decomposed :) (i.e. it doesn't use NFD, but it uses something
pretty close to NFD)

> http://developer.apple.com/technotes/tn/tn1150.html#HFSPlusNames
>
> I don't think URL encoding ever comes into play in HFS+ or in the POSIX
> APIs that take UTF-8 (decomposed) paths

The POSIX APIs take UTF-8, regardless of the
composition/decomposition. That is, both of these lines open the same
file:

    fopen("\xC3\xA9", "w");  // é, composed: U+00E9
    fopen("e\xCC\x81", "w"); // é, decomposed: 'e' followed by U+0301

>... not sure what Michael is
> talking about.

On Leopard, invalid bytes will indeed be escaped:

[c...@ccox-macbook:~/temp]% ls
a.out   test.c
[c...@ccox-macbook:~/temp]% cat test.c
#include <stdio.h>

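int main(void) {
    /* File name is three bytes: '"', 0xFF, '"'. 0xFF never occurs in valid UTF-8. */
    FILE *f = fopen("\"\xFF\"", "w");
    if (f) fclose(f);
    return 0;
}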
[c...@ccox-macbook:~/temp]% cc test.c && ./a.out
[c...@ccox-macbook:~/temp]% ls
"%FF"   a.out   test.c
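
For illustration only, here is a user-space sketch of that kind of escaping: bytes that are not part of a well-formed UTF-8 sequence are rewritten as %XX and everything else is copied through. The function names are hypothetical and this is my guess at the general idea, not the actual kernel code; in particular I don't know how a literal '%' is treated, so the sketch leaves it alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Length of the well-formed UTF-8 sequence starting at s (n bytes are
 * available), or 0 if s[0] does not begin one. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    size_t len, i;
    unsigned long cp;

    if (n == 0) return 0;
    if (s[0] < 0x80) return 1;
    if (s[0] >= 0xC2 && s[0] <= 0xDF)      { len = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0)        { len = 3; cp = s[0] & 0x0F; }
    else if (s[0] >= 0xF0 && s[0] <= 0xF4) { len = 4; cp = s[0] & 0x07; }
    else return 0;
    if (n < len) return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;  /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values above U+10FFFF. */
    if (len == 3 && (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF))) return 0;
    if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF)) return 0;
    return len;
}

/* Hypothetical helper: copy src into dst (capacity cap, including the
 * terminating NUL), writing each invalid byte as %XX.
 * Returns 0 on success, -1 if dst is too small. */
int escape_invalid_utf8(const char *src, char *dst, size_t cap)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n, i = 0, o = 0;

    assert(src != NULL && dst != NULL);
    n = strlen(src);
    while (i < n) {
        size_t len = utf8_seq_len(s + i, n - i);
        if (len == 0) {
            if (o + 3 >= cap) return -1;
            snprintf(dst + o, 4, "%%%02X", (unsigned)s[i]);
            o += 3;
            i += 1;
        } else {
            if (o + len >= cap) return -1;
            memcpy(dst + o, s + i, len);
            o += len;
            i += len;
        }
    }
    if (o >= cap) return -1;
    dst[o] = '\0';
    return 0;
}
```

Fed the name from test.c, escape_invalid_utf8("\"\xFF\"", buf, sizeof buf) leaves "%FF" (with the surrounding quotes) in buf, which matches the ls output above, while a valid name such as "\xC3\xA9" comes through unchanged.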

-- 
Clark S. Cox III
clarkc...@gmail.com
_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
