Thomas Zander <[EMAIL PROTECTED]> writes:

> Not true; the locale is nothing more than a pointer to the i18n and l10n
> specific for that person using the computer.
> Naturally the l10n implies certain things like currency, date format etc.
> The nl_NL@EURO has a variant to change one aspect of the implied information.

Yes.

> > Allowing the user to override it is
> > a design bug (de_DE@euro with iso-8859-1 isn't going to work ...).
> 
> That is exactly why your assertion that locale == encoding fails.
> If a user decides to read his error messages in german he changes his
> locale, same with date format etc.
> But changing the locale should not imply a change in the encodings since
> changing the locale does not change all your text files and filenames.

Of course your data will not be magically converted.  The encoding
of the data is independent of the encoding the application uses
internally.  When reading or writing files the application has to
convert them if needed.  vim 6.x can do that, (x)emacs can do that,
even kwrite has an encoding menu in the open/save dialogs for exactly
that purpose.  This is even more obvious with mail clients, which
must be able to deal with mails in all sorts of encodings at the same
time.
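
A minimal sketch of that decode-on-read, encode-on-write step, in
Python.  It is only an illustration: the function name `transcode` and
the sample bytes are made up here and do not come from any of the
editors named above.

```python
def transcode(data: bytes, src_encoding: str, dst_encoding: str) -> bytes:
    """Decode bytes stored in src_encoding and re-encode them as dst_encoding."""
    text = data.decode(src_encoding)   # file encoding -> internal text
    return text.encode(dst_encoding)   # internal text -> target encoding


# "ungueltig" with a real u-umlaut, as stored in an ISO-8859-1 file:
# the umlaut is the single byte 0xfc.
latin1_bytes = b"ung\xfcltig"
utf8_bytes = transcode(latin1_bytes, "iso-8859-1", "utf-8")
print(utf8_bytes.hex(" "))
```

The file on disk never changes by itself; only the bytes the
application hands out after the conversion differ.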

Setting the locale (LC_CTYPE to be exact) to de_DE.UTF-8 will make the
system (libc, X11, ...) expect that your applications internally use
UTF-8.  All the wide char functions (wprintf and friends) will start
to print strings in UTF-8.  XmbDrawString() parses the strings as
UTF-8.  gettext will start to give you the translated messages in
UTF-8 encoding, ...

That's exactly the same mechanism which makes them print kanji in
EUC-JP encoding when running in ja_JP locale.
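
Which locale name ends up governing LC_CTYPE follows the usual POSIX
precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.  A small
Python sketch of that lookup; the helper name `effective_ctype` is
hypothetical, and the fallback to "C" when nothing is set is an
assumption that matches glibc.

```python
def effective_ctype(env: dict) -> str:
    """Return the locale name that applies to LC_CTYPE for this environment."""
    # POSIX precedence: LC_ALL beats LC_CTYPE, which beats LANG.
    # Empty values count as unset.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"  # assumed fallback when none of the variables is set


print(effective_ctype({"LANG": "de_DE", "LC_CTYPE": "de_DE.UTF-8"}))
print(effective_ctype({"LANG": "ja_JP.eucJP"}))
```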

> Your reasoning would work if it is impossible to change the locale, but
> (luckily) it is not, otherwise you would have to reinstall your machine to
> be able to see KDE/GNOME (or bash errors) in another language!  As a SuSE
> install recently showed me, the machine locale is a default, nothing more
> than a hint.

You don't have to tell me that, of course it is possible to change the
locale.  You can switch almost anywhere, you can even run different
apps in different locales at the same time in the same X11 session, ...

> >    E199 kraxel ~# LANG=de_DE@euro locale charmap
> >    ISO-8859-15
> >    E199 kraxel ~# LANG=de_DE.UTF-8 locale charmap
> >    UTF-8
> 
> I get no difference in charmap between these last two on my debian SID machine.

Fix your /etc/locale.gen, my woody machine does this just fine.
So does the SuSE box.

   bytesex kraxel ~# grep de_DE /etc/locale.gen
   de_DE ISO-8859-1
   de_DE.UTF-8 UTF-8
   de_DE@euro ISO-8859-15
   [EMAIL PROTECTED] UTF-8
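
The locale.gen lines above are simply "locale name, charmap" pairs.  A
rough Python sketch that reads such lines into a lookup table; the
function name `parse_locale_gen` is made up for illustration, and the
sample text repeats only the three readable entries from the grep
output.

```python
def parse_locale_gen(text: str) -> dict:
    """Map each locale name to its charmap from locale.gen-style lines."""
    charmaps = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) == 2:                  # "<locale> <charmap>"
            name, charmap = fields
            charmaps[name] = charmap
    return charmaps


sample = """\
de_DE ISO-8859-1
de_DE.UTF-8 UTF-8
de_DE@euro ISO-8859-15
"""
for name, charmap in parse_locale_gen(sample).items():
    print(f"{name:12} -> {charmap}")
```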

> The correct solution is that the commands man, ls etc would correctly use utf8
> at which moment the strings they display (send through the ssh port or to an
> xterm) have to be encoded in an encoding the current bash/csh decided on.

Great idea.  Know what?  locales do exactly that.  Set the locale to
de_DE.UTF-8 and see all the commands start printing strings in UTF-8:

bytesex kraxel ~# LANG=de_DE date "foobar" 2>&1 | hex
00000000  64 61 74 65  3a 20 75 6e  67 fc 6c 74  69 67 65 73  date: ung.ltiges
                                       ^^ 'ü' in iso-8859-1 encoding

bytesex kraxel ~# LANG=de_DE.UTF-8 date "foobar" 2>&1 | hex
00000000  64 61 74 65  3a 20 75 6e  67 c3 bc 6c  74 69 67 65  date: ung..ltige
                                       ^^ ^^ 'ü' in utf-8 encoding
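
The byte difference in the two dumps can be reproduced without any
German locale installed.  A short Python sketch; the helper name
`encoded_hex` is made up for illustration.

```python
def encoded_hex(text: str, encoding: str) -> str:
    """Return the bytes of text in the given encoding as spaced hex pairs."""
    return " ".join(f"{byte:02x}" for byte in text.encode(encoding))


word = "ung\u00fcltiges"  # \u00fc is the u-umlaut from the date error message
for encoding in ("iso-8859-1", "utf-8"):
    print(f"{encoding:10}  {encoded_hex(word, encoding)}")
```

The umlaut is one byte (fc) in iso-8859-1 and two bytes (c3 bc) in
utf-8, all other characters in the word are plain ASCII and identical
in both.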

> Creating a shortcut by trying to do the encoding conversions later on will not
> work. But again; you already saw that.

I'm not talking about encoding conversions later on.  xterm doesn't
convert stuff.

> Each and every application that assumes latin1 or other non-utf8 encodings has to
> be adjusted before a system as a whole can be converted.

Yes, that is part of the problem.  Lots of apps exist which don't
look at the locale and therefore don't work correctly in de_DE.UTF-8
(and that's why I think it was a bad idea for RedHat to switch the
default locale to UTF-8 in 8.0).
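
For illustration, this is roughly what "looking at the locale" means
for an application, sketched in Python.  The helper names
`output_encoding` and `encode_message` are hypothetical.  Note that
`locale.getpreferredencoding(False)` reports the encoding Python
derives from the current LC_CTYPE setting, and Python's own UTF-8 mode
can override that.

```python
import locale


def output_encoding() -> str:
    """Return the encoding of the current locale instead of assuming latin1."""
    # False: do not call setlocale() here, just report the encoding that
    # goes with the LC_CTYPE setting already in effect.
    return locale.getpreferredencoding(False)


def encode_message(message: str, encoding: str = "") -> bytes:
    """Encode message for output, replacing characters the encoding lacks."""
    return message.encode(encoding or output_encoding(), errors="replace")


print(output_encoding())
print(encode_message("ung\u00fcltiges Datum"))
```

An app that hard-codes "iso-8859-1" in place of `output_encoding()` is
exactly the kind that breaks in de_DE.UTF-8.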

> Conclusion;
> An xterm should be able to display chinese if man or ls outputs that,

That works just fine if xterm and applications use the same charset.
And that's exactly why xterm should come up in utf-8 mode in UTF-8
locales.

  * If your xterm runs in non-utf8 mode and your locale is de_DE both
    xterm and your terminal apps use iso-8859-1 and everything is fine.

  * If your xterm runs in utf8 mode and your locale is de_DE.UTF-8,
    both xterm and your terminal apps use UTF-8 and everything is fine
    (assuming the apps actually look at the locale, see above).  The
    only difference you may notice is that you will see non-latin1
    characters correctly.  Within mutt for example, if you exchange
    mail with people who have non-latin1 characters in their names.
    Or spam from Korea *grin*.

  * If your xterm runs in non-utf8 mode and your shell runs in
    de_DE.UTF-8 the display will be f*cked up as soon as someone uses
    non-ascii characters.  That is what happens if you ssh from your
    Debian box into a fresh RedHat 8.0 system because the terminal
    encoding isn't passed through to the other side.  It's easy to fix
    though, just set the LC_CTYPE environment variable to something
    that matches the encoding your terminal uses, de_DE for example.
    See?

  * The same problem also exists the other way around of course, i.e.
    xterm in utf8 mode and shell in de_DE locale.
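
The four cases can be simulated in a few lines of Python.  This is a
sketch, not how xterm works internally: the function name `displayed`
is made up, and the terminal is modelled as "decode the incoming bytes
with my own encoding, show U+FFFD for anything invalid".

```python
def displayed(text: str, app_encoding: str, terminal_encoding: str) -> str:
    """Model what a terminal shows when the app writes text in app_encoding."""
    raw = text.encode(app_encoding)  # the bytes the application writes
    # The terminal interprets those bytes with its own encoding; bytes that
    # are invalid there become U+FFFD (the replacement character).
    return raw.decode(terminal_encoding, errors="replace")


word = "ung\u00fcltig"
cases = [
    ("iso-8859-1", "iso-8859-1"),  # de_DE shell, xterm in non-utf8 mode
    ("utf-8", "utf-8"),            # de_DE.UTF-8 shell, xterm in utf8 mode
    ("utf-8", "iso-8859-1"),       # de_DE.UTF-8 shell, xterm in non-utf8 mode
    ("iso-8859-1", "utf-8"),       # de_DE shell, xterm in utf8 mode
]
for app, term in cases:
    shown = displayed(word, app, term)
    status = "ok" if shown == word else "garbled"
    # ascii() escapes non-ASCII characters, so this demo does not itself
    # depend on the encoding of the terminal it runs in.
    print(f"app={app:10} xterm={term:10} {status:8} {ascii(shown)}")
```

Only the two matching combinations come out intact.  UTF-8 bytes read
as iso-8859-1 turn the umlaut into two characters, and an iso-8859-1
umlaut read as UTF-8 is an invalid byte.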

xterm's utf8 mode needs iso10646-1 fonts btw, so you might have to
tweak your app-defaults if you want to play with that.

  Gerd

-- 
Because the late-night discussions aren't even worth the red wine anymore.
                                -- Wacholder in “Melanie”
_______________________________________________
Devel mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/devel
