On Thursday 18 October 2007 at 19:17 +0000, Tuomo Valkonen wrote:
> On 2007-10-18, Nicolas Mailhot <[EMAIL PROTECTED]> wrote:
> > You can rave all you want things should be nicely tagged with encodings
> > but they aren't and won't be till an awful lot of otherwise perfectly
> > working code is rewritten.
>
> The big problem is that _new_ code is being written specifically
> for a monoculture; old code simply didn't care.
Old code did care. It just cared in a myriad of incompatible ways, which all more or less assumed the default was what people wanted it to be, and broke horribly otherwise. (I have access to a fun pile of documents that use a custom encoding just because there was no 8-bit encoding with the right mix of symbols, and that rely on a one-of-a-kind font with the same cooked encoding to be user-readable.)

> > So the next best thing is a good universal default. Which UTF-8 is. So
> > live with it (or join unicode.org to get it improved).
>
> That's just another reason why it's pointless to bear with FOSS, as it
> can do nothing better than the commercial OSes.

FOSS needs to exchange data with other systems (the internet, remember?). That means sharing encoding conventions.

> > The single best feature of XML was not making possible to tag stuff with
> > encodings (HTML had it before, as SGML). The single best feature of XML
> > was to select UTF-8 as default encoding, so stuff is internationalised
> > by default.
>
> Most of the XML files I've seen include a specification of the encoding.

Which is usually UTF-8 because people just use the default (just like they used iso-8859-1 in headers because it was the default, and then stuffed something else inside because iso-8859-1 did not have the required character coverage). It's a pity the W3C didn't go all the way and instead allowed other encodings to be specified: non-UTF-8 XML files win you nothing and are a constant source of bugs.

> > So it's fun to shot at UTF-8. UTF-8 is ugly. UTF-8 reeks of compromise.
> > But UTF-8 works which was not the case of all the solutions UTF-8 haters
> > dreamed before and still cling to.
>
> Actually, UTF-8 as an ASCII-compatible mapping from 32-bit numbers to
> 8-bit sequences is beautiful, something that can not be said of most
> other multibyte encodings. However, Unicode or ISO-10646 or whatever
> they want to call the character mapping in the background, is extremely
> ugly.
That's a result of trying to cater to every known human script, which needs to be done to digitise existing material. No one so far has proved it could be done better.

-- 
Nicolas Mailhot
_______________________________________________
wm-spec-list mailing list
wm-spec-list@gnome.org
http://mail.gnome.org/mailman/listinfo/wm-spec-list