Jungshik Shin,

>   Could I suggest that you do as Romans do in Rome :-) ? This is
> Linux-UTF8 mailing list. So, wouldn't it be better to use 'encoding'
> in place of 'code page'? Especially, it sounds very strange to hear
> stateful 7bit ISO-2022-based encodings being referred to as 'code pages'
>

Bear with me, I will try to remember.  I have been dealing with code pages
since 1985.  I think of encodings differently.  For example, UTF-8 is a
Unicode encoding.  I also consider 5-bit Baudot, for example, an encoding.
It took me years to stop calling memory "core".  My first IT job was for IBM
do Brasil, developing test equipment for memory cores for one of the first
solid-state computers.

>   BTW, I'm in the same camp as you in not liking 7bit stateful
> encodings, but I'm pretty sure there will be some objections to what you
> wrote about them because there are string manipulation routines written
> for them in use.

There are some things that you can do with ISO-2022.  However, I cannot
figure out how to get something like strtok to work.  Even strchr
implementations are a problem.  Most string manipulation assumes that you
can point anywhere in the string and interpret the bytes you find there,
but with a stateful encoding the meaning of a byte depends on the escape
sequences that precede it.  Another problem I have with ISO-2022 is: where
does it end?  I can see iso-2022-jp, iso-2022-cn and iso-2022-kr, but when
I start seeing French and German escape sequences I begin to think of a
never-ending octopus.

I once wrote a primitive windowing system for IBM EBCDIC Kanji text.
Truncating buffers of text in a window and overlaying text was a processing
nightmare, because the shift state had to come out right.  To make it
worse, it also supported Fujitsu JES.  IBM's shift characters took up
screen space even though they did not display, but the Fujitsu screens had
different alignments.  There is always an answer, but in some cases the
answer may not be sane.
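
For a flavor of the truncation problem, here is a rough sketch in Python,
not the code I wrote back then.  Everything in it is an assumption made for
illustration: truncate_shifted is a made-up name, SO (0x0E) and SI (0x0F)
bracket runs of double-byte characters, each shift byte takes one screen
column, each double-byte character takes two, the input is well formed, and
the other byte values are arbitrary placeholders.  The Fujitsu alignment
rules are not modeled at all.

```python
# Sketch only: truncate_shifted is a hypothetical name, not a real API.
SO, SI = 0x0E, 0x0F   # shift-out / shift-in control bytes


def truncate_shifted(data: bytes, width: int) -> bytes:
    """Cut data to at most `width` columns, keeping SO/SI balanced."""
    out = bytearray()
    used = 0          # screen columns consumed so far
    dbcs = False      # True while inside an SO ... SI run
    i = 0
    while i < len(data):
        b = data[i]
        if b == SO:
            # Only open a run if SO + one character + SI (4 columns) fit.
            if used + 4 > width:
                break
            out.append(b)
            used, dbcs, i = used + 1, True, i + 1
        elif b == SI:
            # Inside a run, one column is always held back for this byte.
            if not dbcs and used + 1 > width:
                break
            out.append(b)
            used, dbcs, i = used + 1, False, i + 1
        elif dbcs:
            # Two columns for the character, one reserved for closing SI.
            if i + 2 > len(data) or used + 3 > width:
                break
            out += data[i:i + 2]
            used, i = used + 2, i + 2
        else:
            if used + 1 > width:
                break
            out.append(b)
            used, i = used + 1, i + 1
    if dbcs:
        out.append(SI)    # close the run that the cut left open
    return bytes(out)


sample = b"AB\x0e\x45\x46\x47\x48\x0fC"   # 2 single, 2 double-byte, 1 single
print(truncate_shifted(sample, 7))
```

Even in this toy form the cut point depends on where the shifts fall: a
window seven columns wide shows only one of the two double-byte characters,
because the closing SI needs a column of its own.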

Carl

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
