Kaixo!

On Mon, Oct 11, 2004 at 11:56:09AM -0400, Edward H. Trager wrote:
 
> MANDRAKE:
> =========
> And does anyone know the official story about Mandrake?
> I installed Mandrake 10.0 (from a magazine disc) and got
> an ISO-8859-1 locale instead of a UTF-8 locale.

It depends on your installation choices.
In Mandrakelinux some locales use UTF-8 by default (those for languages
that can be supported only in UTF-8, or that don't have any large legacy
corpus in non-UTF-8 encodings). Other locales, which do have a large
legacy corpus in non-UTF-8 encodings, currently default to their legacy
encoding (this will hopefully change sometime in the future).
But there is, under the "advanced" tab, a "use UTF-8" by default
checkbox, so you can force UTF-8 in any case.
UTF-8 is also used if you choose several languages and UTF-8 is the
only encoding they share. For example, if you choose support for French
and German, both with legacy iso-8859-15 by default, UTF-8 won't be
enabled (unless you check the UTF-8 checkbox); but if you choose French
and Greek, as iso-8859-15 and iso-8859-7 are different, then UTF-8 is
used.
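
To make the rule above concrete, here is a minimal sketch in Python. It
is not the installer's actual code: the LEGACY_DEFAULT table and the
pick_encoding() function are hypothetical names invented for
illustration, and only the French/German/Greek entries come from the
example above.

```python
# Hypothetical table of legacy default encodings per language.
# Only these three entries are taken from the example in the text;
# a language missing from the table is assumed to be UTF-8 only.
LEGACY_DEFAULT = {
    "fr": "iso-8859-15",
    "de": "iso-8859-15",
    "el": "iso-8859-7",
}


def pick_encoding(languages, force_utf8=False):
    """Return the encoding to use for the chosen set of languages."""
    if force_utf8:
        # The "use UTF-8" checkbox overrides everything else.
        return "utf-8"
    encodings = {LEGACY_DEFAULT.get(lang, "utf-8") for lang in languages}
    if len(encodings) == 1:
        # All chosen languages share one default encoding: keep it.
        return encodings.pop()
    # No single shared legacy encoding: fall back to UTF-8.
    return "utf-8"
```

With this sketch, pick_encoding(["fr", "de"]) keeps iso-8859-15, while
pick_encoding(["fr", "el"]) falls back to UTF-8.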

> The Mandrake
> locale-setting GUI continued to provide only legacy ISO options,
> as far as I could tell.

The choice has to be made at install time (as using UTF-8 or not
has consequences for how data is stored on hard disks on native Linux
partitions; changing it afterward cannot be 100% automated).

>  In the end I manually set the .i18n
> file to en_US.UTF-8 and everything seems to work to the extent that
> I have tested it.  So why is UTF-8 not the default?  Does anyone know?

Because people complained.
UTF-8 support a year ago was not as good as it is now, and a lot of
people (in particular those using the "en_US" locale :) ) would have
complained about ugly fonts and other problems if UTF-8 had been the
default.

The situation has improved a lot, and nowadays there are very few
problems left. UTF-8 could probably be made the default soon; and maybe
it could already have been made the default if there weren't other more
important issues to spend our time on.

> APACHE:

One of the remaining problems is web pages in cp1252 with unannounced
encoding: when using UTF-8 by default, some browsers display them wrong.
Browsers should do some automatic charset detection to see whether the
page is in UTF-8 or in cp1252 (the only two valid choices for pages
with unannounced encoding, imho).
The same goes for email programs: since I switched to UTF-8 I have
received a lot of messages that display wrongly because they are encoded
in cp1252 but don't announce it properly (in particular in the
Subject/From headers, but also in the body). Here too, some automatic
encoding detection could help a lot.
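
As a rough illustration of the kind of detection I mean, here is a
minimal Python sketch. It is not what any particular browser or mail
program does; decode_unannounced() is a hypothetical helper name. It
relies on the fact that cp1252 text containing non-ASCII characters is
very rarely valid UTF-8, so strict UTF-8 decoding is tried first and
cp1252 is the fallback.

```python
def decode_unannounced(data):
    """Decode bytes whose encoding was not announced.

    Returns a (text, encoding_name) pair. Strict UTF-8 is tried first;
    if the bytes are not valid UTF-8, cp1252 is assumed instead.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # A few byte values are undefined in Python's cp1252 codec,
        # so replace them rather than fail on odd input.
        return data.decode("cp1252", errors="replace"), "cp1252"
```

Pure ASCII input decodes identically either way, so reporting it as
UTF-8 is harmless.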

> The last time I installed Apache 2.0.x, it too defaults to the
> legacy ISO-8859-1 configuration.  One has to manually change the configuration
> file in order to get HTML pages served with the correct headers
> indicating UTF-8.

No, it is up to the individual files to announce their encoding, not
the web server.
I don't have any problem using Apache with HTML files that correctly
announce their encoding; I use a mix of iso-8859-1/iso-8859-15/utf-8,
with some occasional iso-2022-jp pages too.

> Does anyone know if this is still the case?  When is this going to change?
> Apache 2.0.x should really default to UTF-8.  Do people agree with me here?

I disagree :)
The default therefore must not be UTF-8 but simply nothing.
Forcing a single encoding for all the pages of a whole server is
something that can only be done by the manager of the server, after
carefully thinking about it; it is not something that should be blindly
enforced by default.

I do, however, fully agree with you that forcing iso-8859-1 by default
is very wrong; but I think that forcing any encoding by default is wrong.

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://chanae.walon.org/pablo/          PGP Key available, key ID: 0xD9B85466
[you can write me in Walloon, Spanish, French, English, Catalan or Esperanto]
[min povas skribi en valona, esperanta, angla aux latinidaj lingvoj]
