Hi Simos,

Sorry, I have probably not given this thread the attention it deserves; I have been short on time and busy at work. Nevertheless, at the risk of repeating what others may have mentioned, here are a few comments:
First, I think the Linux developer community needs to think very *broadly* and include all scripts defined in Unicode 4.1. It is not good enough to handle only Latin, Greek, and Cyrillic, even if one can solve the accented-character problem for those three. Ideally the console would handle CJK, Arabic, Syriac, Devanagari, Bengali, Myanmar, Tibetan, and Mongolian UTF-8 as deftly as it handles Latin. So anyone who understands the issues surrounding the console should also spend enough time to understand input methods and complex text layout, especially for complex-text-layout scripts like Myanmar.

Some months ago I had the idea of trying to fill out the missing parts of the GNU Unifont bitmap font. When one looks at a script like Myanmar, it is not at all obvious how one should "squish" the various glyphs into one cell or two cells. Some characters, like MYANMAR LETTER KA U+1000, clearly look like they should take up two console character cells, just as Han Chinese characters do. Others, like MYANMAR LETTER KHA U+1001, clearly need only one character cell. Letters like MYANMAR LETTER II U+1024 ought to use *THREE CONSOLE CHARACTER CELLS*, and MYANMAR LETTER AU U+102A should have *FOUR CONSOLE CHARACTER CELLS*. Has anyone ever thought about this before?

So, if you ask me, having only the options of "single width", "double width", and "zero width" (i.e., accent marks or other diacritics that combine with a previous character but take up no additional console character cells) is not enough. There has to be a system that allows for zero, one, two, three, and four character cell widths. Maybe even more; I'd have to look more carefully to know the answer. One can envision a similar problem for other Indic and Indic-derived scripts, like Devanagari.
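To make the zero/one/two limitation concrete, here is a minimal sketch of the usual width model, assuming Python's standard unicodedata module as a stand-in for whatever tables a console would really use. The helper name cell_width is hypothetical (it is not a kernel or libc function), and the rule it applies is a simplification of wcwidth()-style logic, not the actual console code.

```python
import unicodedata


def cell_width(ch: str) -> int:
    """Approximate the conventional 0/1/2 cell-width model for one character.

    Hypothetical helper for illustration only; real consoles and libc
    wcwidth() implementations use their own tables.
    """
    # Nonspacing marks, enclosing marks and format characters combine with
    # the previous character and occupy no cell of their own.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide and Fullwidth characters occupy two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else is assumed to fit in a single cell.
    return 1


# Latin letter, combining acute accent, a Han character, then the four
# Myanmar letters discussed above (KA, KHA, II, AU).
samples = ["A", "\u0301", "\u4e2d", "\u1000", "\u1001", "\u1024", "\u102a"]

for ch in samples:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {cell_width(ch)} cell(s)")
```

Under this model every one of the four Myanmar letters comes out as one cell, because nothing in the character properties consulted here can express "three" or "four". Any scheme allowing wider glyphs would need an extra per-character or per-glyph width source, which this sketch does not attempt.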
Now, even if I had a system that allowed me to use up to four character cells for just one glyph (the glyph itself representing one or more Unicode characters: think of all the consonant conjuncts in Devanagari, Myanmar, and other Indic and Indic-derived scripts), I would still need smart input method support and smart text layout support. I doubt that just one person can implement all of this and claim a bounty (if there were a bounty to be claimed).

Just my two cents. The console is not an area I know very much about. If I have time, I'll try to understand more about the actual implementation issues.

-- Ed Trager

On Monday 2005.01.10 19:35:23 +0000, Simon wrote:
>
> Hi All,
> I suppose this is a long established issue that would make quite a few
> people happy
> when it is resolved.
> Currently, one can read Unicode on the Linux console. In addition, one
> can write Unicode
> on the Linux console with the exception of accents; you cannot use dead
> keys.
> (Of course I am referring to languages without combining characters, etc...).
> The issue about dead keys is described at
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=143014
>
> There was a discussion on LKML about this issue with the aim to get the
> patch by
> Chris Heath included
> (http://www.ussg.iu.edu/hypermail/linux/kernel/0412.1/1039.html).
> The reaction was not positive (not complaining), see
> http://www.ussg.iu.edu/hypermail/linux/kernel/0412.1/1575.html
> and there was no more pressing to get it accepted.
>
> However, this issue has to be resolved and at least basic Unicode
> (+accents) should be supported on the console. The ideal
> situation is to be able to do a bit more languages such as Arabic.
> There are quite a few projects to enable the use of old computers in the
> developing world and there is interest
> in using the Linux console
> (http://lists.debian.org/debian-l10n-greek/2004/12/msg00020.html)
>
> The way I see it is to write a description of the task to enable better
> Unicode on the Linux console (use framebuffer?) and
> post it as a "bounty" for whoever is able to implement it.
>
> - Could you have a look at
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=143014
> and post any missing information or corrections?
> - What do you envision should change on the Linux console to support
> better Unicode?
> - Could this be written as a task (so as to post it as a bounty)?
>
> Cheers,
> Simos
>
> --
> Linux-UTF8: i18n of Linux on all levels
> Archive: http://mail.nl.linux.org/linux-utf8/