> I was thinking about this: maybe the NFS server could enforce
> normalization form 'C' so that only the precomposed variant:

Note that normalisation form C is NOT "no combining characters"; it is
"maximally (re)composed according to Unicode 3.0". Combining characters
can remain, and "new" precomposed characters would be decomposed (so
there is no point in allocating such characters, though one such snuck
in for Unicode 3.2).

...
> without duplicate filenames. Hangul would immediately be ok
> without the need of jamo decomposition. And we are also very

Funny you should mention Hangul here. Hangul is the most glaring
example where Unicode normalisation does NOT "normalise away" multiple
representations of the SAME spelling. E.g. <gg><a> and <g><g><a>
represent EXACTLY the same syllable, with not even a hint of a
difference (like width, font fixedness, or anything else), but none of
the Unicode normalisation forms maps them to the same representation:

    NFD, NFKD:  no change to either sequence
    NFC, NFKC:  <gga> and <g><ga>, respectively

This is due to historic events (note: there is NO syllable break
between the two <g>'s), but that does not make non-decomposition
the best way of handling Hangul.

Also of interest here may be that, IIRC, HFS+ and UFS (the Apple file
systems) represent all file names in NFD (and for UFS: in UTF-8).
NFD, not NFC.

		Kind regards
		/kent k

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
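For readers who want to check the two claims above themselves, here is a
minimal sketch using Python's standard unicodedata module. It is not part
of the original message. The helper names (normalize_all, codepoints) and
the choice of "q" plus COMBINING ACUTE ACCENT as a sequence with no
precomposed form are illustrative assumptions. <gg> is taken to be
U+1101 HANGUL CHOSEONG SSANGKIYEOK, <g> to be U+1100 HANGUL CHOSEONG
KIYEOK, and <a> to be U+1161 HANGUL JUNGSEONG A.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def normalize_all(s):
    """Return a dict mapping each normalisation form name to the normalised string."""
    return {form: unicodedata.normalize(form, s) for form in FORMS}

def codepoints(s):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)

# Two spellings of the same syllable (assumed jamo code points, see lead-in).
GG_A = "\u1101\u1161"          # <gg><a>
G_G_A = "\u1100\u1100\u1161"   # <g><g><a>

# A base letter plus combining mark with no precomposed equivalent:
# NFC has to leave the combining character in place.
Q_ACUTE = "q\u0301"

if __name__ == "__main__":
    for label, text in (("<gg><a>", GG_A), ("<g><g><a>", G_G_A)):
        for form, result in normalize_all(text).items():
            print(f"{label:10} {form:5} {codepoints(result)}")
    print("q + acute  NFC  ", codepoints(unicodedata.normalize("NFC", Q_ACUTE)))
```

In every one of the four forms the two Hangul spellings stay distinct, so
a server that enforced NFC would still accept both as different file
names.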