However, when you recommend to an application author that his application
should treat all filenames as UTF-8, this is not an improvement. It is a
no-op for the UTF-8 users but breaks the world for the EUC-JP and
KOI8-R users.
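(To make the breakage concrete, a small Python sketch; the bytes are a
made-up KOI8-R filename, not from the original post:

    name = b'\xc6\xc1\xca\xcc'       # "файл" ("file") in KOI8-R

    print(name.decode('koi8-r'))     # -> файл
    name.decode('utf-8')             # UnicodeDecodeError: 0xC6 starts a
                                     # two-byte sequence, but 0xC1 is not
                                     # a valid continuation byte

The same bytes are a perfectly good name in one locale and undecodable
garbage in the other.)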


Perhaps that is too conservative.

Any effort spent supporting legacy encodings, or being prepared to perform
charset conversions on input, seems wasteful to me (even effort spent
supporting alternative Unicode encodings). Locales are still useful, but I
think locales should not specify an encoding.
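As a rough Python sketch of the conversion dance I mean (/etc/motd here
just stands in for arbitrary input):

    import locale

    # The per-locale conversion step: ask the locale which encoding
    # incoming bytes are in, then convert everything on input.
    locale.setlocale(locale.LC_ALL, '')    # honour $LANG / $LC_ALL
    enc = locale.getpreferredencoding()    # e.g. 'EUC-JP', 'KOI8-R', 'UTF-8'

    raw = open('/etc/motd', 'rb').read()
    text = raw.decode(enc)                 # works only if the file really
                                           # matches the reader's locale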

There are a lot of benefits to be gained, in the form of simplicity and
interoperability, when applications are free to assume that all text they
might encounter will be UTF-8 encoded. Common protocols and file
formats shouldn't even have to specify what encoding text is in, IMO.
By specifying it, they are allowing for the possibility that it might be
different, and that an application may have to deal with charset
conversion and so on.
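If UTF-8 can simply be assumed, the same read collapses to one line, with
no negotiation at all (a sketch under that assumption):

    # No locale query, no conversion step: bytes are UTF-8, full stop.
    text = open('/etc/motd', encoding='utf-8').read()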

System-wide messages, the login screen, the filesystem, GECOS fields,
.plans, /etc/issue, /etc/motd, etc. are examples where I think a commonly
enforced encoding would be beneficial.

The alternative, such as tagging metadata onto the filesystem layer,
individual inodes, individual file metadata descriptors, etc., seems far
uglier in comparison. (Imagine a file whose name is in one encoding,
metadata in a second, and content in yet a third. :()
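A rough sketch of that mess on a POSIX system, where filenames are just
byte strings (the directory and names are made up for illustration):

    import os

    os.mkdir('demo')
    # Nothing stops one directory from mixing encodings:
    open(os.path.join(b'demo', 'ファイル'.encode('euc-jp')), 'wb').close()
    open(os.path.join(b'demo', 'файл'.encode('utf-8')), 'wb').close()

    for name in os.listdir(b'demo'):        # raw bytes back from the kernel
        try:
            print(name.decode('utf-8'))
        except UnicodeDecodeError:
            print(name, '<- not valid UTF-8')   # the EUC-JP name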

IDN URLs are another good example. It's clearly preferable to have a URL
be both canonical (byte for byte) and readable (i.e. in non-Punycode form).
If a user provides an IDN URI to the system or to another user in an
unexpected encoding, the resource would be unresolvable.
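A sketch with Python's built-in 'idna' codec (IDNA 2003), using a made-up
hostname:

    # Readable form <-> canonical (Punycode) form, assuming UTF-8 input:
    print('bücher.example'.encode('idna'))   # b'xn--bcher-kva.example'

    # If the readable form arrives in an unexpected legacy encoding,
    # the canonical form can no longer be recovered:
    raw = 'bücher.example'.encode('latin-1') # what a Latin-1 sender emits
    raw.decode('utf-8')                      # UnicodeDecodeError -> the
                                             # resource is unresolvable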

