On Tue, 25 Jan 2005 16:34:57 +0100 (CET)
Roman Zippel <[EMAIL PROTECTED]> wrote:
> I'm not quite sure, what problem you're trying to solve here.
I am trying to implement character set conversion for MacHFS. I have some CDs
with Russian file names. Currently they are not displayed properly because
Linux uses the KOI8-R character set for Russian letters, while Macintosh uses
its own character set called Mac-cyrillic, or codepage 10007.
At first I tried to implement the conversion using NLS tables, controlled by
"iocharset" and "codepage" arguments. "iocharset" specified Linux's local
character set and "codepage" specified the HFS character set. So to convert a
character I needed to process it twice: convert from "codepage" to Unicode
and then convert from Unicode to "iocharset".
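
To make the two-step path concrete, here is a rough user-space sketch in
Python. It is not the kernel code: convert_name() is a hypothetical helper, and
Python's "mac_cyrillic" and "koi8_r" codecs are assumed to stand in for the
kernel NLS tables.

```python
# User-space sketch of the two-step path: "codepage" -> Unicode -> "iocharset".
# Assumption: Python's codecs stand in for the kernel NLS tables.

def convert_name(raw: bytes, codepage: str = "mac_cyrillic",
                 iocharset: str = "koi8_r") -> bytes:
    text = raw.decode(codepage)      # step 1: on-disk bytes -> Unicode
    return text.encode(iocharset)    # step 2: Unicode -> local charset


# 0x80 is CYRILLIC CAPITAL LETTER A in Mac Cyrillic; KOI8-R stores it as 0xE1.
print(convert_name(b"\x80").hex())   # prints "e1"
```
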
The problem with this is that some characters are lost during the conversion.
Not all characters from the source ("codepage") charset are present in the
destination ("iocharset") charset table (for example the "Folder" sign). But for
dir.c/hfs_lookup() to work properly we need to be able to convert the name back
from KOI8-R to CP10007, otherwise the search algorithm will fail. As a result
we won't be able to operate on any file whose name contains such characters.
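
The failure can be reproduced in user space with a small Python sketch. The
helper names are hypothetical, Python's codecs are assumed to behave like the
NLS tables, and the dagger at 0xA0 is only a stand-in for a Mac-specific
character that KOI8-R lacks.

```python
# Sketch of the lossy round trip: disk name -> local charset -> disk name.
# Assumption: unmappable characters are replaced with "?", as a lossy
# converter typically does.

def to_local(raw, codepage="mac_cyrillic", iocharset="koi8_r"):
    return raw.decode(codepage).encode(iocharset, errors="replace")

def to_disk(local, codepage="mac_cyrillic", iocharset="koi8_r"):
    return local.decode(iocharset).encode(codepage, errors="replace")

on_disk = b"\x80\xa0"    # Cyrillic A followed by DAGGER (not in KOI8-R)
shown = to_local(on_disk)
back = to_disk(shown)

# The name that comes back differs from the one on disk, so a lookup that
# compares against the on-disk name cannot find the file.
print(back == on_disk)   # prints "False"
```
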
The solution was to use my own conversion table, which ensures that no
characters are lost during conversion in either direction. Every unique source
character is translated to some unique destination character. Of course
Mac-specific characters are not displayed properly, but they're not lost either.
The "codepage" argument was omitted for simplicity because a specific
"iocharset" implies a specific "codepage" (for example, if iocharset is koi8-r
then we can assume that the Macintosh codepage is mac-cyrillic). But some people
said that this patch can't be approved because not using NLS is a bad solution.
So I'd like to talk to you: maybe we'll find a better solution (because you know
HFS better than me), or we can come to the conclusion that there is really no
better solution and push the patch upstream.
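
The idea of a loss-free table can be sketched in Python as follows. This is
not the table from the patch: build_tables() is a hypothetical helper, and
parking unmappable bytes on the first unused destination bytes is an arbitrary
choice made only for illustration.

```python
# Sketch: build a one-to-one byte table between two 8-bit charsets so that
# converting a name and converting it back always yields the original bytes.

def build_tables(src_codec="mac_cyrillic", dst_codec="koi8_r"):
    forward = [None] * 256
    used = set()
    # Pass 1: keep every mapping that both charsets agree on via Unicode.
    for b in range(256):
        try:
            out = bytes([b]).decode(src_codec).encode(dst_codec)
        except UnicodeError:
            continue                      # no counterpart in the destination
        if len(out) == 1 and out[0] not in used:
            forward[b] = out[0]
            used.add(out[0])
    # Pass 2: give each leftover source byte one unused destination byte.
    # These display as the wrong glyph but stay distinct, so nothing is lost.
    spare = [d for d in range(256) if d not in used]
    for b in range(256):
        if forward[b] is None:
            forward[b] = spare.pop(0)
    # The inverse table undoes the forward table exactly.
    inverse = [0] * 256
    for src, dst in enumerate(forward):
        inverse[dst] = src
    return bytes(forward), bytes(inverse)


fwd, inv = build_tables()
name = b"\x80\xa0"                        # Cyrillic A plus a Mac-only character
local = name.translate(fwd)               # what user space would see
print(local.translate(inv) == name)       # prints "True"
```

Because the forward table is a permutation of the 256 byte values, the name
handed back for a lookup always converts to the exact bytes stored on disk.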
> If you want to store unicode characters use HFS+, I plan to implement nls
> support real soon for it (especially because to also fix the missing
> decomposition support).
That would be nice. I also thought about it, but I have no HFS+ disks with
Russian names so I can't test it, and I decided not to do a "blind"
implementation in order not to break the filesystem. Currently my patch also
adds the "iocharset" argument to HFS+ (so that I can specify both filesystems
in one /etc/fstab line, which is useful for CD-ROMs), but it is ignored there.
--
Best regards,
Pavel Fedin,
mailto:[EMAIL PROTECTED]