2009/6/21 Andrew Dunbar:
> 2009/6/20 Tony Mechelynck:
>> On 20/06/09 19:58, björn wrote:
>>>
>>> I tried it myself on Linux and had the same problem and realized that
>>> the problem has to do with how you represent 한.  If done as you
>>> suggest with U+D55C it works (both Linux and MacVim), but if
>>> represented by U+1112, U+1161, U+11AB then Vim will render it as three
>>> glyphs but here the Cocoa text system combines these into one glyph
>>> and that is where the problem in MacVim appears.  (By the way: MacVim
>>> defaults to use utf-8 for 'encoding'.)
>>
>> Ah, I see. I entered it in Vim by copy-paste from your previous post in
>> the vim_mac Google Group page in my browser.
>>
>> Vim is obviously unaware of hangul jamo decomposition / recomposition
>> and IIUC will render each of them as one glyph. I'm not sure how to have
>> them be treated as "one spacing + (in this case) 2 composing characters"
>> though IIUC it would be "the right way" to do it.
>
> Hangul jamo (de)composition is part of Unicode normalization. Do we know
> if OS X does Unicode normalization for all characters or just for Korean?
> I suspect it is done for all characters to prevent two identical-looking
> filenames which differ only in Unicode normalization. A good language to
> test this with would be Vietnamese, which uses Latin script with up to
> three "accents" per character.
>
> Unicode normalization might be a feature of the HFS+ filesystem as there is
> a general problem in computing of encodings vs. filesystems.

Hi Andrew,

As far as I can tell (from searching around), HFS+ always stores
filenames in a decomposed form, essentially normalization form D (NFD).
As a workaround for the issue the OP had, I now normalize filenames to
normalization form KC (NFKC, compatibility composition) before passing
them on to Vim, and this takes care of the OP's problem.
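
To illustrate the workaround, here is a minimal sketch in Python (this is
not the actual MacVim code; the helper name normalize_filename is made up
for this example). It takes the three-jamo sequence mentioned above and
composes it with NFKC:

```python
import unicodedata


def normalize_filename(name: str) -> str:
    """Hypothetical helper: compose a (possibly NFD) filename with NFKC."""
    return unicodedata.normalize("NFKC", name)


# The syllable from this thread, written as three conjoining jamo
# (U+1112, U+1161, U+11AB), i.e. the decomposed representation.
decomposed = "\u1112\u1161\u11ab"

composed = normalize_filename(decomposed)

# Three code points go in, one precomposed syllable (U+D55C) comes out.
print(len(decomposed), len(composed))
print([f"U+{ord(ch):04X}" for ch in composed])

# Decomposing again with NFD gives back the original jamo sequence.
print(unicodedata.normalize("NFD", composed) == decomposed)
```

The same call leaves an already composed name unchanged, so it should be
safe to apply to every filename regardless of where it came from.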

However, as I see it, this is a legitimate issue in Vim itself: it does
not handle NFD properly (the example above should always render as one
glyph, not as three, which is what happens now when NFD is used).
Either Vim should ensure that all buffers are normalized to composed
form NFC/NFKC or it needs to be made "NFD aware".  Does anybody on the
vim_multibyte list (this mail goes to vim_mac as well) have any
comments on this?

Björn

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_mac" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---