On Mon, Nov 15, 2004 at 11:22:56AM +0100, [EMAIL PROTECTED] wrote:

> I think the descriptions of the options you mention are a little 
> obscure about their scope.
> Together with your following comments (which aren't completely clear 
> either, if I may say that), I seriously hope that any such 
> gcc "invention" applies to the notation L"string" only, not to "string"!

I agree with you, and though I haven't thoroughly read the manpage, I'm
pretty sure that gcc does this. As far as I can see, gcc is the one and only
GNU project that is maintained properly: the developers know where they're
going, and they have systematic testing, release plans, a working
bug-tracking system and so on. I trust that they're not crazy enough to
break everything completely.

Furthermore, for L"" strings it makes sense to specify just one charset, the
source of the conversion, since the destination is always UCS-4 (I don't know
whether it's big-endian, little-endian or arch-dependent; I'd guess
arch-dependent, but it doesn't matter here). For plain "" strings you would
need two charsets (both source and destination) for a conversion to make
sense at all. And even then it would be pointless: if anyone really needed
it, a simple iconv run on the source file would do the job perfectly.

> What I mean here is the following:
> If a user wants to configure an application to use a certain language 
> (e.g. de_DE for gettext) and a certain encoding (e.g. UTF-8) and it 
> happens that a locale that matches EXACTLY the combination 
> "de_DE.UTF-8" is not installed on the system, many libraries and 
> applications will fail to handle the setup properly. This is silly 
> because if there is a locale "de_DE" or "de.utf8" all information to 
> handle the user configuration is available so why does the 
> operating environment have to bully the user with refusing to 
> accept obvious settings, especially as there is no uniform standard 
> for locale names that could be used as an excuse for being picky...

Well, this is really quite a different topic by now, but I don't mind it
at all :-))

I remember the old days when LANG=hu worked perfectly; nowadays it doesn't,
and I have to use LANG=hu_HU instead... I can't understand why the hu_HU.UTF-8
locale consumes much more disk space than the hu_HU locale, and for that
matter why I need both on my system for both to work. Why can't glibc
convert charsets at runtime if necessary? As it is, I can't even use
hu_HU.CP852 or hu_HU.iso-8859-16 if I happened to need them for some strange
reason. And there are plenty of other questions as well. For example, a lot
of people have already asked: why isn't there a C.UTF-8 locale available?
(Does anyone have a patch to glibc to make this work? Preferably built into
glibc so that it doesn't require external gconv or locale-archive files, so
that I can umount /usr from a bash running with this locale, as I currently
can with C but not with hu_HU.)

At least, with a sufficiently recent glibc, the UTF-8 locales are installed
by default. Unfortunately, some distributions do not install all the locales
by default but let the sysadmin choose which ones should be available, which
is IMHO a braindamaged design. Sometimes I have to work on these kinds of
computers, and I really hate having to ask the sysadmins to please, please
install not only hu_HU but also hu_HU.UTF-8 and en_US.UTF-8...



-- 
Egmont

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
