-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Steve Langasek <[EMAIL PROTECTED]> writes:
> On Wed, Jan 26, 2005 at 02:53:52PM +0000, Roger Leigh wrote:
>
>> > By this, I'm not talking about enforcing this character code on the
>> > whole Debian system, but see to that: 1) Installing systems with
>> > UTF-8 is easier, also with locales not strictly in need of
>> > this. UTF-8 as default is not necessarily my ultimate goal (as the
>> > title suggests), but having the option of using UTF-8 (or other
>> > encodings) system-wide, no matter what languages are chosen.
>
>> I think the locales package is the place to start this. For etch, I
>> would like the UTF-8 locales to be the default for all languages (with
>> language-specific encodings being offered as alternatives).
>
> Then please begin coordinating with the respective language teams involved
> with the debian installer, to ensure that we have a usable UTF-8 based
> console environment for all languages. (Or hand us a d-i based graphical
> installer sprung fully-formed from your forehead, whichever you find
> easier.)

I wasn't trying to cause offence with my comments. I fully appreciate
this isn't a trivial task.

For the last few weeks, I've been working on just that. I'm slowly
writing a full framebuffer-based terminal emulator which will support
all the bi-di string specifications of ECMA-48, with full separation
between data and presentation components. It will use FreeType (or
maybe even Pango) for the font rendering, and so should provide the
same level of text rendering support (and quality) you get under X,
though I plan for it to be a bit faster than the X terminals by more
intelligent glyph caching.

http://www.whinlatter.ukfsn.org/gtk/uterm-0.1.0.tar.bz2

There's not much to see yet. I've written some of the basic classes,
plus most of the ECMA-35 and -43 support. Over the last week or so
I've become a little side-tracked writing a code table editor, for
charset/element/area mapping/designation/invocation, but I hope to
have something usable within a few months.
Once the basic table parser (input handling) and terminal classes are
done, we can start on the framebuffer driver.

(If anyone out there can provide any examples, either code or simple
explanation, of how the ECMA-48 data component and presentation
components normally interact, that would be of great benefit. This is
required for bidirectional nested string handling, but it's not clear
what the implications are for line wrapping and the mappings between
the two components. I'm also looking to get hold of several ISO
standards documents, but they are rather expensive. If anyone can
help me get hold of any copies of these standards, that would also be
of immense help.)

Once I've got the basics written, I'll be making the arch repo
available. If anyone's interested, feel free to get in touch.

> There's more to providing a working UTF-8 capable second-stage
> installer than just setting "UTF-8" in the locale name, and this is
> a major issue that makes UTF-8 a non-viable default for sarge.

I'm not suggesting this should be done for sarge, which is why I said
I'd like it for etch. I'll be honest: I hadn't actually considered
the implications for the installer; I was rather more interested in
the working system after installation.

>> > 2) See to that all Debian packages handles UTF-8 properly.
>
>> This is a policy issue. Not all packages need to handle it, so this
>> should be a recommendation rather than a requirement. For example,
>> there are specialised packages that only work with certain specific
>> encodings, and these should probably not be a priority to change.
>> Certainly, all general-purpose packages should be UCS-aware, though.
>
> I hope you're just conflating UCS-2 with UTF-8 here. UCS-2 is a crap
> charset, which there's no reason at all for most Unix programs to support.

No. UCS == Universal Character Set, a.k.a. ISO-10646. I wasn't
referring to any specific encoding thereof, hence the lack of any
qualification.
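
To make the character set versus encoding distinction above concrete,
here is a minimal Python sketch. It is only an illustration: the helper
names (encoded_forms, fits_in_ucs2) and the sample characters are
assumptions chosen for the example, not part of any package discussed
in this thread. It shows one UCS code point serialised in several
encoding forms, and why a fixed 16-bit form such as UCS-2 cannot carry
code points above U+FFFF.

```python
# Illustrative sketch: helper names and sample characters are hypothetical.

def encoded_forms(ch):
    """Return several encodings of a single UCS character."""
    return {
        "UTF-8": ch.encode("utf-8"),
        "UTF-16BE": ch.encode("utf-16-be"),
        "UTF-32BE": ch.encode("utf-32-be"),
    }


def fits_in_ucs2(ch):
    """UCS-2 uses one 16-bit unit per character, so it stops at U+FFFF."""
    return ord(ch) <= 0xFFFF


# LATIN CAPITAL A, E WITH ACUTE, EURO SIGN, MUSICAL SYMBOL G CLEF
for ch in ("A", "\u00e9", "\u20ac", "\U0001d11e"):
    forms = {name: data.hex() for name, data in encoded_forms(ch).items()}
    print("U+%04X" % ord(ch), "UCS-2 ok:", fits_in_ucs2(ch), forms)
```

The code point is the same in every row; only the byte sequences
differ. A program that is "UCS-aware" in this sense works with the
code points internally and treats UTF-8 as one possible external form.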
A package's UCS support might involve using wide characters and
streams, particularly for more sophisticated processing and layout.
In this case it's more than just "UTF-8", even if that's what is used
for input and output.

Regards,
Roger

- -- 
Roger Leigh
                Printing on GNU/Linux?  http://gimp-print.sourceforge.net/
                Debian GNU/Linux        http://www.debian.org/
                GPG Public Key: 0x25BFB848.  Please sign and encrypt your mail.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iD8DBQFB+AM2VcFcaSW/uEgRAiHjAKCe7XdTeTLyC/FCIoBFDnZ/DCEJqgCdHYOc
BUgTP63kDQ/K7lKUJkSbDls=
=1w/v
-----END PGP SIGNATURE-----

