From: Bram Moolenaar <[EMAIL PROTECTED]>
Bruno Haible -
> > GTK+ 2.0 will be taking the policy, that all filenames are in UTF-8
>
> This is a mistake. Users use terminals and 'ls' to see and manipulate
> their files. The file name, as passed from/to the kernel, must be in
> the user's locale encoding.
Yes.
Two users could access the same file, while they have a different locale
set. Does this mean the kernel will do conversion?
No.
There is a potential big problem in this area. If the kernel doesn't do
conversion, will all applications have to do this?
No. A filename is just a sequence of bytes - no conversion required
or desirable.
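
To illustrate: the kernel stores one byte sequence, and each user's locale
only decides how those bytes are shown. A small Python sketch follows. The
sample name and the helper display_as are made up for illustration; the
two encodings are the ones mentioned in this thread.

```python
# Hypothetical illustration: one stored file name, viewed under two locales.
# RAW_NAME stands for the bytes a KOI8-R user's tools would pass to the kernel.
RAW_NAME = "тест".encode("koi8-r")


def display_as(raw: bytes, encoding: str) -> str:
    """Decode the stored bytes the way a terminal in that locale would."""
    return raw.decode(encoding, errors="replace")


if __name__ == "__main__":
    print(RAW_NAME)                            # the bytes on disk, unchanged
    print(display_as(RAW_NAME, "koi8-r"))      # what the KOI8-R user sees
    print(display_as(RAW_NAME, "iso-8859-1"))  # what the ISO 8859-1 user sees
```

Nothing is converted anywhere: both users open the same file with the same
bytes, they merely see different glyphs for it.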
> Tomohiro Kubota writes:
> > I think it is not a good idea, too. You will have trouble when you
> > use other locales (for example, UTF-8 locales).
>
> When a user switches locales, it's easy to rename all files, using a
> combination of 'find', 'ls', 'iconv', 'mv'. This is much easier than
> converting the contents of the files.
I don't think it is either easy or desirable.
(Example: file names occur as text in files, scripts, Makefiles.)
When a user switches locale, nothing should happen to the already stored
files. I switch locale several times a day when testing Vim.
Wouldn't want my files to be renamed then!
Right.
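
For concreteness, the renaming step discussed above could look roughly like
this in Python. This is only a sketch: convert_name and plan_renames are
hypothetical helpers, and it only computes the new names. It does nothing
about file names that occur as text inside scripts and Makefiles, which is
exactly the objection.

```python
from typing import Iterable, List, Tuple


def convert_name(raw: bytes, old_enc: str, new_enc: str) -> bytes:
    """Re-encode one file name from the old locale encoding to the new one."""
    return raw.decode(old_enc).encode(new_enc)


def plan_renames(names: Iterable[bytes], old_enc: str,
                 new_enc: str) -> List[Tuple[bytes, bytes]]:
    """Return (old, new) pairs for every name whose bytes would change."""
    plan = []
    for raw in names:
        new = convert_name(raw, old_enc, new_enc)
        if new != raw:  # pure-ASCII names are identical in both encodings
            plan.append((raw, new))
    return plan
```

The names could come from os.listdir(b"."), which returns bytes when given
a bytes path, and each pair could then be handed to os.rename().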
I suppose the filesystem should have a setting somewhere as to which
encoding is used for the file names. Applications (or the kernel)
should then do conversion. Obviously, the encoding used for the
file system should match with the most often used locale to avoid
too many conversions.
It doesn't work (at present).
Linux is a multi-user system. Different users with different nationalities
use different locales. These Russians all want KOI-8, while the Danes
want ISO 8859-1. Most filesystem types do not store the character set
the filename is supposed to be in, and most users do not know enough
to supply such information.
That is why I agree with Bruno (on the first point) - everybody sets
things right for his own locale, and sees his own filenames as intended.
In the long run we'll maybe all use UTF-8 and the problem disappears.
Andries
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/