RE: Linux whining (was Re: xterm and UTF8)
> I'm sure it gives you a hardon to bash Linux, but you're being intellectually dishonest.

For openers, sexual references are inappropriate. This is a list of professionals, and it ill behoves you to not treat us that way. Secondly, I was in *NO WAY* bashing Linux. I am a HUGE fan of open source and have spent a large number of hours contributing to open source projects and trying to convince a not-invented-here company to accept more of it in their products. I was making a commentary on badly written *APPLICATIONS* that are very Linux-centric yet still try to claim portability. That was the sum total of my gripe.

> GNOME and KDE applications are running successfully on Solaris, FreeBSD, and many other platforms. Sun is actively working on a desktop based on these Linux-centric applications.

GNOME and KDE are in *NO WAY* Linux-centric. In fact they are shining examples of how software CAN be portable and how to do it right.

> You're just as bad as some of my friends, ignorantly bashing Microsoft without any real knowledge. Take your ignorance elsewhere.

If I felt you were in any way qualified to make any kind of judgement about my level of ignorance, I might take offence. But since you seem to be a reactionary with an axe to grind or an over-inflated sense of defensiveness (is that a word?) towards Linux, I will take it from whence it comes. In future, I strongly recommend you read what is actually written, make an attempt to discern what was intended if the meaning was ambiguous, and inquire as to that intent before running your mouth.

This is not the place for this discussion so I shall not continue it here further. If you feel the need for a flame war please respond to me privately.

Have a nice timezone.

Kean
___
Devel mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/devel
Re: xterm and UTF8
Juliusz Chroboczek [EMAIL PROTECTED] wrote:

> > And, yes, of course xterm should start up in utf-8 mode if the locale
> > encoding is UTF-8.
>
> MH> Thanks, that was my original conclusion also. I had just
> MH> wondered why it doesn't. Just an xterm bug I guess.
>
> *locale: true
>
> (Credit to Tomohiro Kubota.)

That makes xterm start luit automatically for non-UTF-8 locales.

xterm starts in UTF-8 mode when the locale encoding is UTF-8 even with
"locale: false". XTerm has done that for a long time already; see also

http://mail.nl.linux.org/linux-utf8/2001-05/msg00063.html

--
Mike Fabian [EMAIL PROTECTED] http://www.suse.de/~mfabian
Lack of sleep is the enemy of work.
RE: xterm and UTF8
On Fri, 2003-02-21 at 12:47, Kean Johnston wrote:

> > There's a libcharset that I think comes with libiconv and is also used in GLib that you can use to work around this problem.
>
> Which is fine if you use GNU iconv. For those of us that use the iconv as it was originally invented, libcharset doesn't seem to help very much. Maybe I am missing what it's trying to do, but certainly on Unixware and OpenServer it does nothing of any use.
>
> The way I got around this on OpenServer was (I think) rather sneaky. Our iconv() lets you add in a .so for any given encoding. So I added the ability to define a fallback mechanism in the iconv data file ... basically * * for the from and to fields. Then I simply put the whole of GNU iconv in under that entry. So our system-defined conversions are used first, and all those Linux-centric applications that are so badly written that they really only support one OS now just work.

Note that iconv as originally invented is ENTIRELY BROKEN because no list of standardized encoding names was specified and no set of encodings was required. I don't know what politics or technical oversight was behind this, but it means that using iconv() portably is impossible without extra infrastructure.

The GLib library (used by GTK+, GNOME, etc) works around this by using the libcharset data in both directions ... as well as getting a standardized form of the encodings reported by the operating system, it will use it to convert standardized names into names that are likely to work for iconv() on a particular system.

Regards,
Owen
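The wildcard fallback Kean describes (system conversions first, then a catch-all "* *" entry) can be sketched as a table lookup. This is only a rough Python illustration, not the OpenServer iconv data file format: CONVERTERS, pick_converter, and both handler functions are hypothetical names, and Python's built-in codecs merely stand in for GNU iconv.

```python
def generic_convert(data, src, dst):
    """Catch-all converter; Python's codecs stand in for GNU iconv here."""
    return data.decode(src).encode(dst)


def native_latin1_to_utf8(data, src, dst):
    """Stand-in for a conversion the system ships itself."""
    return data.decode("iso-8859-1").encode("utf-8")


# Specific (from, to) pairs are consulted first; ("*", "*") is the fallback.
CONVERTERS = {
    ("ISO-8859-1", "UTF-8"): native_latin1_to_utf8,
    ("*", "*"): generic_convert,
}


def pick_converter(src, dst, table=CONVERTERS):
    """Return the handler registered for (src, dst), else the wildcard entry."""
    handler = table.get((src, dst)) or table.get(("*", "*"))
    if handler is None:
        raise LookupError(f"no conversion from {src} to {dst}")
    return handler


def convert(data, src, dst, table=CONVERTERS):
    """Convert bytes from src to dst using whichever handler is selected."""
    return pick_converter(src, dst, table)(data, src, dst)
```

With this table, an ISO-8859-1 to UTF-8 request is served by the "system" handler, while any pair the table does not list falls through to the generic one, which is the ordering Kean describes.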
Re: xterm and UTF8
Thomas Zander [EMAIL PROTECTED] writes:

> > You can specify en_US.UTF-8 as your locale. Which implies to me that xterm can recognize, from its environment, the encoding, and act accordingly.
>
> Hope you aren't using the locale standard 'Variation' variable for that; in most European countries that will give problems since they already use it for the EURO variation. Do you know who came up with this idea? Is there a mailing list I can look at? Knowing my locales; encodings should not be in a locale! (Since the user can override them)

The encoding _is_ in the locale. Allowing the user to override it is a design bug (de_DE@euro with iso-8859-1 isn't going to work ...). Use the correct locale instead; there are different ones for different encodings:

  E199 kraxel ~# LANG=POSIX locale charmap
  ANSI_X3.4-1968
  E199 kraxel ~# LANG=de_DE locale charmap
  ISO-8859-1
  E199 kraxel ~# LANG=de_DE@euro locale charmap
  ISO-8859-15
  E199 kraxel ~# LANG=de_DE.UTF-8 locale charmap
  UTF-8
  E199 kraxel ~#

And, yes, of course xterm should start up in utf-8 mode if the locale encoding is UTF-8.

Gerd

--
Because the late-night discussions aren't even worth the red wine any more.
-- Wacholder in “Melanie”
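The locale names in the transcript above follow the common language[_territory][.codeset][@modifier] layout, so the codeset and the @euro modifier sit in separate fields and do not compete for the same slot. A small Python sketch of that split follows; LocaleName and parse_locale_name are hypothetical names, and the parser only looks at the name. For names that carry no codeset (de_DE, de_DE@euro) the charmap comes from the installed locale definition, which this sketch does not consult.

```python
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


def parse_locale_name(name):
    """Split language[_territory][.codeset][@modifier] into its fields.

    Missing fields are returned as None. Nothing is looked up on the
    system; this is pure string handling.
    """
    rest, _, modifier = name.partition("@")
    rest, _, codeset = rest.partition(".")
    language, _, territory = rest.partition("_")
    return LocaleName(language, territory or None, codeset or None, modifier or None)


for example in ("POSIX", "de_DE", "de_DE@euro", "de_DE.UTF-8"):
    print(example, "->", parse_locale_name(example))
```

Only de_DE.UTF-8 names its codeset explicitly; for the other three the answer printed by "locale charmap" is a property of the locale data, not of the name.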
Re: xterm and UTF8
Thomas Zander [EMAIL PROTECTED] writes:

> Not true; the locale is nothing more than a pointer to the i18n and l10n specific for that person using the computer. Naturally the l10n implies certain things like currency, date format etc. The nl_NL@EURO has a variant to change one aspect of the implied information.

Yes.

> > Allowing the user to override it is a design bug (de_DE@euro with iso-8859-1 isn't going to work ...).
>
> That is exactly why your assertion that locale == encoding fails. If a user decides to read his error messages in German he changes his locale, same with date format etc. But changing the locale should not imply a change in the encodings since changing the locale does not change all your text files and filenames.

Of course all your data will not be magically converted. The encoding of the data is independent of the encoding used by the application internally. When reading/writing the files the application has to convert the files if needed. vim 6.x can do that, (x)emacs can do that, even kwrite has an encoding menu in the open/save dialogs for exactly that purpose. This is even more obvious with mail clients, which must be able to deal with mails in all sorts of encodings at the same time.

Setting the locale (LC_CTYPE to be exact) to de_DE.UTF-8 will make the system (libc, X11, ...) expect that your applications internally use UTF-8. All the wide char functions (wprintf and friends) will start to print strings in UTF-8. XmbDrawString() parses the strings as UTF-8. gettext will start to give you the translated messages in UTF-8 encoding, ... That's exactly the same mechanism which makes them print kanji in EUC-JP encoding when running in the ja_JP locale.

> Your reasoning would work if it were impossible to change the locale, but (luckily) it is not, otherwise you would have to reinstall your machine to be able to see KDE/GNOME (or bash errors) in another language! As a SuSE install recently showed me, the machine locale is a default, nothing more than a hint.

You don't have to tell me that, of course it is possible to change the locale. You can switch almost anywhere, you can even run different apps in different locales at the same time in the same X11 session, ...

> > E199 kraxel ~# LANG=de_DE@euro locale charmap
> > ISO-8859-15
> > E199 kraxel ~# LANG=de_DE.UTF-8 locale charmap
> > UTF-8
>
> I get no difference in charmap between these last two on my debian SID machine.

Fix your /etc/locale.gen, my woody machine does this just fine. So does the SuSE box.

  bytesex kraxel ~# grep de_DE /etc/locale.gen
  de_DE ISO-8859-1
  de_DE.UTF-8 UTF-8
  de_DE@euro ISO-8859-15
  [EMAIL PROTECTED] UTF-8

> The correct solution is that the commands man, ls etc would correctly use utf8, at which moment the strings they display (sent through the ssh port or to an xterm) have to be encoded in an encoding the current bash/csh decided on.

Great idea. Know what? Locales do exactly that. Set the locale to de_DE.UTF-8 and see all the commands start printing strings in UTF-8:

  bytesex kraxel ~# LANG=de_DE date foobar 2>&1 | hex
  64 61 74 65 3a 20 75 6e 67 fc 6c 74 69 67 65 73  date: ung.ltiges
                             ^^ 'ü' in iso-8859-1 encoding
  bytesex kraxel ~# LANG=de_DE.UTF-8 date foobar 2>&1 | hex
  64 61 74 65 3a 20 75 6e 67 c3 bc 6c 74 69 67 65  date: ung..ltige
                             ^^ ^^ 'ü' in utf-8 encoding

> Creating a shortcut by trying to do the encoding conversions later on will not work. But again; you already saw that.

I'm not talking about encoding conversions later on. xterm doesn't convert stuff.

> Each and every application that assumes latin1 or other non-utf8 encodings has to be adjusted before a system as a whole can be converted.

Yes, that is part of the problem. Lots of apps exist which don't look at the locale and that's why they don't work correctly in de_DE.UTF-8 (and that's why I think it was a bad idea that RedHat switched the default locale to UTF-8 in 8.0).

> Conclusion; An xterm should be able to display chinese if man or ls outputs that,

That works just fine if xterm and applications use the same charset. And that's exactly why xterm should come up in utf-8 mode in UTF-8 locales.

 * If your xterm runs in non-utf8 mode and your locale is de_DE, both xterm and your terminal apps use iso-8859-1 and everything is fine.

 * If your xterm runs in utf8 mode and your locale is de_DE.UTF-8, both xterm and your terminal apps use UTF-8 and everything is fine (assuming the apps actually look at the locale, see above). The only difference you may notice is that you will see non-latin1 characters correctly. Within mutt for example, if you mail with people who have non-latin1 characters in their name. Or spam from korea *grin*.

 * If your xterm runs in non-utf8 mode and your shell runs in de_DE.UTF-8 the display will be f*cked up as soon as someone uses non-ascii
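The byte values in the hex dumps, and the mismatch cases in the list above, can be reproduced without any particular locale installed. The following Python sketch assumes only that the word in the error message is "ungültiges" (as the dump suggests); shown_by_terminal is a hypothetical helper that models a terminal decoding program output in a given charset.

```python
WORD = "ungültiges"

# The same text as bytes under the two locale encodings from the dumps.
latin1 = WORD.encode("iso-8859-1")
utf8 = WORD.encode("utf-8")

print(latin1.hex(" "))  # one byte (fc) for the u-umlaut
print(utf8.hex(" "))    # two bytes (c3 bc) for the u-umlaut


def shown_by_terminal(output, terminal_encoding):
    """Decode program output the way a terminal in that charset would.

    Bytes that are invalid in the terminal's charset become U+FFFD.
    """
    return output.decode(terminal_encoding, errors="replace")


# Matching charsets: the text survives.
print(shown_by_terminal(latin1, "iso-8859-1"))
print(shown_by_terminal(utf8, "utf-8"))

# Mismatch: UTF-8 output on a Latin-1 terminal shows two characters
# where the umlaut should be; Latin-1 output on a UTF-8 terminal shows
# a replacement character instead.
print(shown_by_terminal(utf8, "iso-8859-1"))
print(shown_by_terminal(latin1, "utf-8"))
```

The first two printed lines correspond to the bytes after "date: " in the two dumps; the last two are the garbled-display case from the third bullet.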
Re: xterm and UTF8
On Thu, 20 Feb 2003, Thomas E. Dickey wrote:

> > > > Would it be sensible and acceptable to have xterm default to using the encoding of the user's locale at startup?
> > >
> > > Seems to be only if you happen to be running redhat 8.x
> >
> > I'd like to know what problems are caused by autodetecting the user's locale, and having xterm use the user's encoding by default.
>
> redhat's policy of setting utf-8 locale globally makes it at best awkward for remote connections.

I can respect your opinion about that, however I don't believe that answers the question I asked.

> > I'd also like to know for what good technical reasons the existing defaults are superior. Other than taking blind potshots at Red Hat Linux, and what appears to be your disapproval of UTF-8, do you have any valid technical comment to contribute as a response to my question perhaps? I'd love to hear actual technical feedback if there is any.
>
> frankly, I'm sure I know more about the topic than you do;

Perhaps you do, but I haven't seen you share any of that knowledge in response to my question yet.

> and am irritated by your snide remarks on this list.

I'm irritated by asking an honest question, in an attempt to find out what the best thing to do is, and getting back an irritating remark about Red Hat Linux using UTF-8. My purpose in asking the question was to hear technical reasoning, not an opinionated attack on Red Hat Linux using UTF-8.

--
Mike A. Harris
Re: xterm and UTF8
On Thu, Feb 20, 2003 at 06:32:56PM +0100, Gerd Knorr wrote:

> Thomas Zander [EMAIL PROTECTED] writes:
>
> > Conclusion; An xterm should be able to display chinese if man or ls outputs that,
>
> That works just fine if xterm and applications use the same charset. And that's exactly why xterm should come up in utf-8 mode in UTF-8 locales.

Ok, thanx for all that info :) (and I agree with your conclusion)

I see that we were not completely talking about the same thing. I was talking about utf8 being used in manpages (the files) etc, and in filenames. You were not. The fact that locales can be changed is not relevant when the only thing being affected is the output stream of an app being displayed.

> * If your xterm runs in non-utf8 mode and your shell runs in de_DE.UTF-8 the display will be f*cked up as soon as someone uses non-ascii characters. That is what happens if you ssh from your Debian box into a fresh RedHat 8.0 system because the terminal encoding isn't passed through to the other side. It's easy to fix though, just set the LC_CTYPE environment variable to something that matches the encoding your terminal uses, de_DE for example.

See? Whatever the solution, it's a bug in ssh, not X. I have the opinion that forwarding does not solve all problems (since one of the machines may not know about utf8) so ssh could better convert the data stream using the different locales at both sides.

Sorry for the off-topic...

In an attempt to get back on topic; I am wondering how xterm should handle different shells (using screen for example). The perfect-world solution would be to ask the bash its env-var at every printf, but that may prove to not really be feasible.. In other words; to start xterm in utf-8 when the locale is utf8 seems to be only half the solution, since the shell started might just run a startup script that sets LANG to something different..

--
Thomas Zander
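Setting LC_CTYPE works as a fix because of the order in which the locale environment variables are consulted: LC_ALL first, then the category variable, then LANG, with empty values treated as unset. A minimal Python sketch of that lookup follows; effective_ctype is a hypothetical name, and falling back to "POSIX" when nothing is set is an assumption (the default is up to the implementation).

```python
def effective_ctype(env):
    """Return the locale name that governs LC_CTYPE for this environment.

    Precedence: LC_ALL, then LC_CTYPE, then LANG. Empty strings count as
    unset. The "POSIX" default is an assumption made for this sketch.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return "POSIX"


# Remote box defaults to a UTF-8 LANG; LC_CTYPE overrides the charset part.
remote_env = {"LANG": "en_US.UTF-8", "LC_CTYPE": "de_DE"}
print(effective_ctype(remote_env))
```

The same function also illustrates the concern raised at the end of the message: if xterm reads its environment at startup and a shell startup script later changes LANG, calling effective_ctype on the two environments gives two different answers, and nothing tells the terminal about the change.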
Re: xterm and UTF8
On Thu, Feb 20, 2003 at 10:30:43PM +0100, Thomas Zander wrote:

> Whatever the solution, it's a bug in ssh, not X. I have the opinion that forwarding does not solve all problems (since one of the machines may not know about utf8) so ssh could better convert the data stream using the different locales at both sides.

Oh, agreed... but at that point it is analogous to an English-speaking individual logging into a box that only has Japanese locales installed. If one of the machines can't translate the locale into something suitably usable, you're SOL ;-)

Jeff
Re: xterm and UTF8
> And, yes, of course xterm should start up in utf-8 mode if the locale encoding is UTF-8.

MH> Thanks, that was my original conclusion also. I had just
MH> wondered why it doesn't. Just an xterm bug I guess.

*locale: true

(Credit to Tomohiro Kubota.)

Juliusz
Re: xterm and UTF8
JG> Though I disagree that ssh should not transmit the current locale: it
JG> _should_, precisely because it could be different coming from Solaris,
JG> or Debian, or Windows, and you want to make sure both sides agree on
JG> the encoding. Of course, this also assumes that there is some suitable
JG> locale conversion that can take place; otherwise the whole exercise is
JG> pointless [such as an English locale user ssh'ing into a Japanese box].

Another solution would be to use UTF-8 as the wire encoding, and have the local and remote ssh implementations do all conversion to the local encodings. Kermit does exactly that (optionally).

If you know that your local terminal groks UTF-8, you can simulate the functionality by typing luit first thing when you log in.

Juliusz
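The wire-encoding idea can be sketched as two small conversions, one on each end of the connection. This is a hedged Python illustration only, not how ssh or Kermit implement it: to_wire and from_wire are hypothetical names, and the sketch converts whole buffers, whereas a real stream would need incremental decoding because a multibyte sequence can be split across reads.

```python
def to_wire(data, local_encoding):
    """Sender side: local charset bytes -> UTF-8 bytes for the wire."""
    return data.decode(local_encoding).encode("utf-8")


def from_wire(data, local_encoding, errors="replace"):
    """Receiver side: UTF-8 bytes from the wire -> local charset bytes.

    Characters the local charset cannot represent are replaced ("?")
    by default rather than raising an error.
    """
    return data.decode("utf-8").encode(local_encoding, errors=errors)


# A Latin-1 host sends to a Latin-9 terminal; neither side needs to know
# the other's charset, only its own.
sent = to_wire(b"ung\xfcltiges", "iso-8859-1")
print(from_wire(sent, "iso-8859-15"))

# Text with no representation on the receiving side degrades to "?".
print(from_wire("日本".encode("utf-8"), "iso-8859-1"))
```

The second call shows the limit Jeff points out: when the receiving charset has no way to express the characters, conversion cannot help.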
Re: xterm and UTF8
On Thu, Feb 20, 2003 at 01:50:40PM -0800, Kean Johnston wrote:

> > You can specify en_US.UTF-8 as your locale. Which implies to me that xterm can recognize, from its environment, the encoding, and act accordingly.
>
> Which only encourages the sort of bugs that many an autoconf script has had, that assumes because iconv() doesn't accept UTF-8 as a valid string that it has no UTF-8 support, when in fact, it does, and it's called utf8 and utf-8. Maybe the encoding shouldn't be case sensitive, but it seems to be.

The problem is that POSIX (or SUS or whatever) standardized iconv() but not the names of the encodings that you can pass to iconv() - thus rendering the standardization of iconv() itself almost completely useless - doh. ;-)

There's a libcharset that I think comes with libiconv and is also used in GLib that you can use to work around this problem.

Havoc
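The configure-script bug Kean mentions comes from testing a single spelling. A more forgiving probe tries several spellings before concluding that an encoding is unsupported. The Python sketch below is illustrative only: first_accepted_name and picky_lookup are hypothetical names, the list of spellings is an assumption rather than libcharset data, and codecs.lookup stands in for a call such as iconv_open().

```python
import codecs

# Candidate spellings to try, in order of preference (illustrative list).
UTF8_SPELLINGS = ("UTF-8", "utf-8", "UTF8", "utf8")


def first_accepted_name(lookup=codecs.lookup, spellings=UTF8_SPELLINGS):
    """Return the first spelling that `lookup` accepts, or None.

    `lookup` must raise LookupError for names it does not know.
    """
    for name in spellings:
        try:
            lookup(name)
        except LookupError:
            continue
        return name
    return None


def picky_lookup(name):
    """Simulates a case-sensitive system that only knows 'utf8'."""
    if name != "utf8":
        raise LookupError(name)
    return name


print(first_accepted_name())              # Python's own codec registry
print(first_accepted_name(picky_lookup))  # the simulated picky system
```

A check that stopped after the first spelling would report no UTF-8 support on the simulated system, which is the false negative described above.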
Re: xterm and UTF8
On Wed, Feb 19, 2003 at 08:53:14PM +0100, Thomas Zander wrote:

> I've heard the same discussion on KDE lists. And as on KDE the point is that the whole system has to be utf-8 to work correctly.

Close -- the way Red Hat 8 is set up, it seems like the whole world needs to be UTF8 :/ This is an exaggeration, but it's wider than just the single system. As I mentioned, ssh'ing into a Red Hat box from another non-RHL8 box shows these encoding annoyances.

> It's news to me that a locale can specify that it's utf-8, since I always thought that locales don't define encodings.

It's news, then :) You can specify en_US.UTF-8 as your locale. Which implies to me that xterm can recognize, from its environment, the encoding, and act accordingly.

Jeff
Re: xterm and UTF8
On Wed, Feb 19, 2003 at 03:48:02PM -0500, Jeff Garzik wrote:

> On Wed, Feb 19, 2003 at 08:53:14PM +0100, Thomas Zander wrote:
>
> > It's news to me that a locale can specify that it's utf-8, since I always thought that locales don't define encodings.
>
> It's news, then :) You can specify en_US.UTF-8 as your locale. Which implies to me that xterm can recognize, from its environment, the encoding, and act accordingly.

Hope you aren't using the locale standard 'Variation' variable for that; in most European countries that will give problems since they already use it for the EURO variation.

Do you know who came up with this idea? Is there a mailing list I can look at? Knowing my locales; encodings should not be in a locale! (Since the user can override them)

Thanx!

--
Thomas Zander