On 24/06/09 14:00, björn wrote:
>
> Hi Eljay,
>
> 2009/6/23 John (Eljay) Love-Jensen:
>>
>>> As far as I can tell (from searching around) HFS+ always uses
>>> normalization form D (NFD) for filenames.
>>
>> HFS+ uses a variant of NFD for filenames. (The HFS+ variant predates
>> standardization of NFD.) This requirement is enforced by the OS.
>>
>> http://developer.apple.com/technotes/tn/tn1150.html
>> http://developer.apple.com/technotes/tn/tn1150table.html
>> http://developer.apple.com/qa/qa2001/qa1235.html
>> http://www.unicode.org/reports/tr15/
>
> Thanks for clarifying that (and for the links!).
>
>> Windows uses NFC for filenames. I'm not sure if the Linux world settled on
>> NFC or NFKC.
>
> I read that Windows uses NFKC. Have you got a reference for the claim
> that NFC is used?
>
>>> So as a workaround for the issue the OP had I now normalize filenames
>>> to compatibility form C (NFKC) before passing the filename on to Vim
>>> and this takes care of the OP's problem.
>>
>> NFC or NFKC? Those are different normalizations.
>>
>> Windows NTFS file system uses NFC. But it isn't enforced by the OS, yet.
>
> I did mean the compatibility form NFKC since I read somewhere that
> NTFS uses NFKC, but I did not research that very carefully.
>
>
>>> However, as I see it this really is a legitimate issue in Vim itself
>>> in that it does not handle NFD properly (the example above should
>>> always render as one glyph, not three as it does now if NFD is used).
>>> Either Vim should ensure that all buffers are normalized to composed
>>> form NFC/NFKC or it needs to be made "NFD aware".
>>
>> I agree with your assessment.
>>
>>> Does anybody on the vim_multibyte list (this mail goes to vim_mac as
>>> well) have any comments on this?
>>
>> The relevant Mac OS X APIs are:
>>
>> CFURLRef url =
>>     CFURLCreateWithFileSystemPath(
>>         kCFAllocatorDefault,
>>         cfstringFullPath,
>>         kCFURLPOSIXPathStyle,
>>         false);
>>
>> char bufferUTF8[32768*4]; // Worst case scenario.
>> // As per Apple documentation, paths can be "up to 30,000 UTF-16
>> // encoding units long", with each component being up to 255 UTF-16
>> // encoding units long. Too bad there isn't an API to specify the
>> // exact buffer size /a priori/.
>>
>> Boolean success =
>>     CFURLGetFileSystemRepresentation(
>>         url,
>>         true,
>>         (UInt8 *)bufferUTF8, // The API takes a UInt8 * buffer.
>>         sizeof bufferUTF8);
>
> Thanks. NSString has a method called fileSystemRepresentation which
> I'm guessing does the same thing(?). I used the NSString method
> precomposedStringWithCompatibilityMapping to convert to NFKC.
>
> Björn
Hm, NFKC and NFKD sometimes fuse slightly different glyphs into a single
"normalized" form. For instance, NFKC(²) = 2, though both are
(different) Latin1 characters (0xB2 and 0x32). IIRC, DOS would have kept
them distinct.
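
To make the difference between the canonical and compatibility forms
concrete, here is a minimal sketch using Python's standard unicodedata
module. This is only an illustration, not what Vim or MacVim does
internally, and the helper name forms() is made up for the example:

```python
import unicodedata

def forms(s):
    """Return s in all four Unicode normalization forms (hypothetical helper)."""
    names = ("NFC", "NFD", "NFKC", "NFKD")
    return {name: unicodedata.normalize(name, s) for name in names}

def codepoints(s):
    """Show a string as a list of U+XXXX code points for easy comparison."""
    return ["U+%04X" % ord(ch) for ch in s]

sup2 = forms("\u00b2")      # SUPERSCRIPT TWO
e_acute = forms("\u00e9")   # LATIN SMALL LETTER E WITH ACUTE

# Canonical forms (NFC/NFD) leave the superscript alone;
# compatibility forms (NFKC/NFKD) fold it into the plain digit.
for name, value in sup2.items():
    print(name, codepoints(value))

# NFD splits the accented letter into base letter + combining accent,
# which is the shape of filename an HFS+ volume hands back.
for name, value in e_acute.items():
    print(name, codepoints(value))
```

So if the only goal is to recompose the decomposed filenames coming from
HFS+, NFC is enough; NFKC additionally rewrites characters like the
superscript above.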
Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
56. You leave the modem speaker on after connecting because you think it
sounds like the ocean wind...the perfect soundtrack for "surfing
the net".