Re: [zfs-discuss] path-name encodings

Bart Smaalders Tue, 04 Mar 2008 13:25:57 -0800

Marcus Sundman wrote:
> Bart Smaalders <[EMAIL PROTECTED]> wrote:
>> UTF8 is the answer here.  If you care about anything more than simple
>> ascii and you work in more than a single locale/encoding, use UTF8.
>> You may not understand the meaning of a filename, but at least
>> you'll see the same characters as the person who wrote it.
> 
> I think you are a bit confused.
> 
> A) If you meant that _I_ should use UTF-8 then that alone won't help.
> Let's say the person who created the file used ISO-8859-1 and named it
> 'häst', i.e., 0x68e47374. If I then use UTF-8 when displaying the
> filename my program will be faced with the problem of what to do with
> the second byte, 0xe4, which can't be decoded using UTF-8. ("häst" is
> 0x68c3a47374 in UTF-8, in case someone wonders.)
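
The byte values quoted above can be checked with a short Python sketch. The variable names are illustrative and not part of the original exchange; the only assumption is that Python's standard "iso-8859-1" and "utf-8" codecs are used.

```python
# "h\u00e4st" is the name 'häst' from the quoted message, written with an
# escape so the example does not depend on the source file's own encoding.
name = "h\u00e4st"

latin1_bytes = name.encode("iso-8859-1")  # what an ISO-8859-1 user would write
utf8_bytes = name.encode("utf-8")         # what a UTF-8 user would write

print(latin1_bytes.hex())  # 68e47374
print(utf8_bytes.hex())    # 68c3a47374

# Reading the ISO-8859-1 bytes as UTF-8 fails at the second byte (0xe4),
# because 0xe4 announces a multi-byte sequence that 0x73 cannot continue.
try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as err:
    decoded_ok = False
    failing_offset = err.start  # index of the first undecodable byte

print(decoded_ok)
```

A reader that only knows "these bytes are a filename" cannot tell which of the two byte strings it is looking at, which is the problem both posters are describing.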


What I mean is very simple:

The OS has no way of merging your various encodings.  If I create a
directory, and have people from around the world each create a file
in that directory named after themselves in their own character set,
what should I see when I invoke:

% ls -l | less

in that directory?

If you wish to share filenames across locales, I suggest that you and
everyone else writing to that directory use a single encoding that
works across all those locales.  The encoding that works well for this
on Unix systems is UTF-8, since its multi-byte sequences never contain
the bytes for '/' (0x2F) or NUL (0x00).
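
As a rough illustration of the '/' and NUL point, here is a minimal Python sketch. The helper name has_reserved_bytes and the sample names are hypothetical, and UTF-16 is used only as a contrasting example; it is not something proposed in this thread.

```python
def has_reserved_bytes(name: str, encoding: str) -> bool:
    """Return True if the encoded name contains a 0x2F ('/') or 0x00 (NUL) byte."""
    encoded = name.encode(encoding)
    return b"/" in encoded or b"\x00" in encoded

# Sample names with no '/' or NUL characters in them (illustrative only):
# 'häst', a Japanese word, and a Russian word.
names = ["h\u00e4st", "\u65e5\u672c\u8a9e", "\u0420\u0443\u0441\u0441\u043a\u0438\u0439"]

# In UTF-8 every byte of a multi-byte sequence is >= 0x80, so neither
# reserved byte can appear unless the character itself is '/' or NUL.
utf8_is_clean = all(not has_reserved_bytes(n, "utf-8") for n in names)

# In UTF-16 the same names pick up 0x00 bytes, and U+012F even encodes
# to a byte pair that contains 0x2F.
utf16_hits_nul = has_reserved_bytes("h\u00e4st", "utf-16-le")
utf16_hits_slash = b"/" in "\u012f".encode("utf-16-le")

print(utf8_is_clean, utf16_hits_nul, utf16_hits_slash)
```

Because the kernel treats those two byte values specially in path names, an encoding that can produce them inside an ordinary character is awkward to use for filenames.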

- Bart




-- 
Bart Smaalders                  Solaris Kernel Performance
[EMAIL PROTECTED]               http://blogs.sun.com/barts
"You will contribute more with mercurial than with thunderbird."
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
