Re: Latin-1-characters

mark . a . biggar Tue, 16 Mar 2004 01:35:26 -0800

Another possibility is to use an extended UTF-8 scheme in which values above
0x10FFFF encode temporary code-block swaps.  I.e., some magic value means
that the one-byte UTF-8 codes now denote the Greek block instead of the
ASCII block.  But you would need broad agreement for that to work.  As Dan
said, this really needs a separation between encoding and character set.
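
To make the idea concrete, here is a minimal sketch. It assumes the
out-of-range values are serialized with the original six-byte UTF-8 bit
layout, which reaches 0x7FFFFFFF. The function name encode_utf8_style and
the marker value SWITCH_TO_GREEK are made up for illustration; nothing
standardizes either of them.

```python
# Hypothetical marker: the first value past the Unicode range (assumption).
SWITCH_TO_GREEK = 0x110000

# (exclusive upper limit, byte count, lead-byte prefix) for multi-byte forms
_FORMS = (
    (0x800, 2, 0xC0),
    (0x10000, 3, 0xE0),
    (0x200000, 4, 0xF0),
    (0x4000000, 5, 0xF8),
    (0x80000000, 6, 0xFC),
)


def encode_utf8_style(value: int) -> bytes:
    """Serialize an integer with the UTF-8 bit layout, ignoring the
    0x10FFFF ceiling, so the encoding is independent of any character set."""
    if not 0 <= value <= 0x7FFFFFFF:
        raise ValueError("value does not fit the six-byte layout")
    if value < 0x80:
        return bytes([value])  # one-byte form, same as ASCII
    for limit, length, prefix in _FORMS:
        if value < limit:
            break
    out = bytearray(length)
    # Fill continuation bytes from the end, six payload bits each.
    for i in range(length - 1, 0, -1):
        out[i] = 0x80 | (value & 0x3F)
        value >>= 6
    out[0] = prefix | value  # remaining high bits go in the lead byte
    return bytes(out)


if __name__ == "__main__":
    marker = encode_utf8_style(SWITCH_TO_GREEK)
    print(marker.hex())
    try:
        marker.decode("utf-8")
    except UnicodeDecodeError:
        print("a standard UTF-8 decoder rejects the marker")
```

Inside the Unicode range the output matches ordinary UTF-8, but a
conforming decoder refuses the marker bytes. That is the "broad agreement"
problem in practice: both ends have to understand the extension.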


--
Mark Biggar
[EMAIL PROTECTED]
> At 12:28 AM +0100 3/16/04, Karl Brodowsky wrote:
> >Anyway, it will be necessary to specify the encoding of unicode in
> >some way, which could possibly even allow specifying some
> >non-unicode-charsets.
> 
> While I'll skip diving deeper into the swamp that is character sets 
> and encoding (I'm already up to my neck in it, thanks, and I don't 
> have any long straws handy :) I'll point out that the above statement 
> is meaningless--there *are* no Unicode non-unicode charsets.
> 
> It is possible to use the UTF encodings on non-unicode charsets--you 
> could reasonably use UTF-8 to encode, say, Shift-JIS characters. 
> (where Shift-JIS is both an encoding and a character set, and it can 
> be separated into pieces)
> 
> It's not unwise (and, in practice, at least in implementation quite 
> sensible) to separate the encoding from the character set, but you 
> need to be careful to keep the separation clear, though many of the 
> sets and encodings don't go out of their way to help with that.
> -- 
>                                          Dan
> 
> --------------------------------------"it's like this"-------------------
> Dan Sugalski                          even samurai
> [EMAIL PROTECTED]                         have teddy bears and even
>                                        teddy bears get drunk
