Re: UTF8 vs. Unicode (UTF16) in code

Keld Jørn Simonsen Fri, 09 Mar 2001 11:40:32 -0800

On Fri, Mar 09, 2001 at 10:56:30AM -0800, Yves Arrouye wrote:
> 
> Since the U in UTF stands for Unicode, UTF-32 cannot represent more than
> what Unicode encodes, which is 1+ million code points. Otherwise, you're
> talking about UCS-4. But I thought that one of the latest revs of ISO 10646
> explicitly specified that UCS-4 will never encode more than what Unicode
> can encode, and thus definitely not these 4 billion characters you're
> alluding to.

As far as I know, the U in UTF stands for Universal, not Unicode.
ISO 10646 can encode characters beyond UTF-16, and should retain
this capability. There is a proposal to restrict UTF-8 to
only encompass the same values as UTF-16, but UCS-4 still encodes
the 31-bit code space.
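
To put rough numbers on the ranges being discussed, here is a minimal
Python sketch. It assumes the usual surrogate-pair arithmetic for UTF-16
and the original one-to-six-byte form of UTF-8; the names UTF16_MAX,
UCS4_31BIT_SIZE and utf8_length_31bit are made up for illustration and
do not come from any standard or library.

```python
# Largest value reachable with a UTF-16 surrogate pair: 10 payload bits
# from the high surrogate, 10 from the low one, offset by 0x10000.
UTF16_MAX = 0x10000 + ((0x3FF << 10) | 0x3FF)

# Size of the 31-bit UCS-4 code space.
UCS4_31BIT_SIZE = 1 << 31


def utf8_length_31bit(value):
    """Bytes needed under the original 1-to-6-byte UTF-8 scheme.

    Hypothetical helper for illustration only.
    """
    if not 0 <= value < UCS4_31BIT_SIZE:
        raise ValueError("outside the 31-bit code space")
    # Upper bound of the range covered by each sequence length, 1..6 bytes.
    limits = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]
    for length, limit in enumerate(limits, start=1):
        if value <= limit:
            return length


print(hex(UTF16_MAX))        # top of the range UTF-16 can address
print(UTF16_MAX + 1)         # number of values in that range
print(UCS4_31BIT_SIZE)       # number of values in the 31-bit space
print(utf8_length_31bit(UTF16_MAX))
print(utf8_length_31bit(UCS4_31BIT_SIZE - 1))
```

Under these assumptions, everything UTF-16 can reach fits in four UTF-8
bytes, and the five- and six-byte forms are only needed for the rest of
the 31-bit space.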

Kind regards
Keld
