Regarding the idea of having a kernel module for each and every
possible encoding, I think we don't need to do that; downloading the
width information for the current locale into the kernel should be
sufficient. For instance, for EUC, you only need the byte length and
the screen column width of each of the three supplementary codesets,
i.e., six bytes in total, since no reasonable EUC codeset will have a
byte length or a screen column width too large to fit in one byte.
For multibyte PC codepage based encodings, you only need some range
values that give the byte lengths and screen widths, say, for ASCII,
single column width characters, and double width characters; in most
cases that will be something like 30 bytes or less.
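
To make the six-byte figure concrete, here is a rough sketch of what
such an EUC descriptor and its use might look like. The struct and
function names are made up for illustration and are not an existing
kernel interface; I also assume that the byte length counts the
SS2/SS3 prefix and that every codeset 0 byte takes one column.

```c
#include <stddef.h>

#define EUC_SS2 0x8e  /* single shift 2: introduces codeset 2 */
#define EUC_SS3 0x8f  /* single shift 3: introduces codeset 3 */

/* Hypothetical six-byte descriptor downloaded for the current locale. */
struct euc_width_info {
    unsigned char byte_len[3];   /* total bytes per character, codesets 1..3 */
    unsigned char col_width[3];  /* screen columns per character, codesets 1..3 */
};

/* Map a lead byte to its codeset number (0..3), or -1 if it cannot start a character. */
static int euc_codeset(unsigned char lead)
{
    if (lead < 0x80)
        return 0;
    if (lead == EUC_SS2)
        return 2;
    if (lead == EUC_SS3)
        return 3;
    if (lead >= 0xa1 && lead <= 0xfe)
        return 1;
    return -1;
}

/* Count the screen columns of n bytes of EUC text; -1 on malformed or truncated input. */
static int euc_columns(const struct euc_width_info *info,
                       const unsigned char *s, size_t n)
{
    size_t i = 0;
    int cols = 0;

    while (i < n) {
        int cs = euc_codeset(s[i]);

        if (cs < 0)
            return -1;
        if (cs == 0) {
            /* Simplification: control characters are not treated specially. */
            i += 1;
            cols += 1;
        } else {
            size_t len = info->byte_len[cs - 1];

            if (len == 0 || len > n - i)
                return -1;
            i += len;
            cols += info->col_width[cs - 1];
        }
    }
    return cols;
}
```

For EUC-JP, under these assumptions, the descriptor would hold byte
lengths {2, 2, 3} and column widths {2, 1, 2}. The tty code would use
the column width of the last character to decide how many columns to
erase on backspace.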

For UTF-8 locales, it is quite important to come up with a unified width
definition that hopefully every locale will agree on; if that's the case,
the table can stay in the kernel without having to be downloaded. If not,
the width table has to be downloaded for each UTF-8 locale through
something like ioctl(2).
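
As a rough sketch of what a downloadable UTF-8 width table could look
like, the following uses a sorted array of code point ranges with a
binary search and a default width of one column. The struct, the
function name, and the sample entries are only illustrative
assumptions; the sample is far from a complete table, and the ioctl
request that would carry it is left out since none has been defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: code points first..last occupy `width` columns. */
struct width_range {
    uint32_t first;
    uint32_t last;
    uint8_t  width;
};

/* Illustrative entries only, sorted by code point and non-overlapping. */
static const struct width_range sample_table[] = {
    { 0x0300, 0x036f, 0 },  /* combining diacritical marks */
    { 0x1100, 0x115f, 2 },  /* Hangul Jamo initial consonants */
    { 0x4e00, 0x9fff, 2 },  /* CJK unified ideographs */
    { 0xac00, 0xd7a3, 2 },  /* Hangul syllables */
};

/* Binary search over the range table; anything not listed is one column wide. */
static int table_width(const struct width_range *tab, size_t n, uint32_t cp)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (cp < tab[mid].first)
            hi = mid;
        else if (cp > tab[mid].last)
            lo = mid + 1;
        else
            return tab[mid].width;
    }
    return 1;
}
```

If all UTF-8 locales agree on the widths, a table like this can be
compiled into the kernel once; otherwise the same array layout is what
each locale would download.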

With regards,

Ienup


] libc functions are not available in kernel space. We have a problem
] here. We could make a kernel module for every possible encoding, and
] have any of these modules dynamically loaded when a tty actually is
] put into this encoding.

] Date: Thu, 25 Jan 2001 15:06:51 +0100 (CET)
] From: Bruno Haible <[EMAIL PROTECTED]>
] Subject: Re: kernel tty patches
] To: [EMAIL PROTECTED]
] MIME-version: 1.0
] Content-transfer-encoding: 7bit
] 
] Tomohiro KUBOTA writes:
] 
] > All existing softwares which can deal with doublewidth characters
] > (including terminals and applications) assume that two backspace
] > characters are needed to erase one doublewidth character.  This is a
] > de-facto standard in CJK world, though it is not documented.
] 
] It is actually very well documented, in the X/Open Curses spec
] 
] http://www.opengroup.org/onlinepubs/007908799/xcurses/intov.html#tag_001_004_003
] 
]      "Unless the cursor was already in column 0, <backspace> moves the
]      cursor one column toward the start of the current line and any
]      characters after the <backspace> are added or inserted starting
]      there."
] 
] > Though I am not familiar with kernel programming, can it use the same
] > design as the recent XTerm?  I.e, use Unicode as internal encoding and
] > use iconv() and wcwidth() to support all other encodings such as
] > ISO-8859-*, EUC-*, ISO-2022-*, and so on.
] 
] This is probably overkill. It's easier to have a wcwidth function for
] every encoding. For ISO-8859-* it is trivial, for EUC-* it's easy, for
] UTF-8 it is a 2 KB table, and for GB18030 it is also acceptably small
] (20 KB).
] 
] > Note that Linux is likely to run only with GNU libc and we can
] > expect these functions are always available.
] 
] libc functions are not available in kernel space. We have a problem
] here. We could make a kernel module for every possible encoding, and
] have any of these modules dynamically loaded when a tty actually is
] put into this encoding.
] 
] > Also note that we will need a utility to set tty's locale since LANG
] > variable cannot be used for kernel.
] 
] The program which creates the tty (xterm or fbgetty) can put the tty
] into the proper encoding.
] 
] Bruno
] -
] Linux-UTF8:   i18n of Linux on all levels
] Archive:      http://mail.nl.linux.org/lists/