On Monday, 30 October 2006 23:14, Joost Verburg wrote:
> Georg Baum wrote:
> > So you say Markus Kuhn is wrong? That would be surprising to me, since he
> > is considered to be a Unicode expert.
> 
> His information is outdated. RFC 2279 (the old UTF-8 specification) did 
> include support for a 31-bit code space. Because the Unicode code space 
> was later restricted, the RFC has been updated as RFC 3629 and is 
> restricted to the range 0000-10FFFF. There will never be any characters 
> outside this range. RFC 2279 is obsolete.
> 
> So the _current_ definition of UTF-8 (RFC 3629) does _not_ allow 5 and 6 
> byte sequences. See http://www.faqs.org/rfcs/rfc3629.html

Thanks for the clarification. I believe I really understand it now.
I'll update the conversion facet where 6 is used.
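
To illustrate the RFC 3629 limit discussed above, here is a minimal sketch
(not the actual facet code; the function name utf8_sequence_length is
hypothetical). It maps a lead byte to the length of the sequence it starts
and treats the old 5 and 6 byte lead bytes from RFC 2279 as invalid:

```cpp
// Sketch only: sequence length implied by a UTF-8 lead byte under RFC 3629.
// Returns 0 for bytes that can never start a valid sequence.
// The name utf8_sequence_length is an assumption made for illustration.
int utf8_sequence_length(unsigned char lead)
{
    if (lead <= 0x7F)
        return 1;               // ASCII, U+0000..U+007F
    if (lead >= 0xC2 && lead <= 0xDF)
        return 2;               // U+0080..U+07FF
    if (lead >= 0xE0 && lead <= 0xEF)
        return 3;               // U+0800..U+FFFF
    if (lead >= 0xF0 && lead <= 0xF4)
        return 4;               // U+10000..U+10FFFF
    // 0x80..0xBF are continuation bytes, 0xC0 and 0xC1 would only start
    // overlong encodings, 0xF5..0xF7 would encode values above U+10FFFF,
    // and 0xF8..0xFD are the 5 and 6 byte lead bytes of the obsolete
    // RFC 2279. None of them may start a sequence.
    return 0;
}
```

With this, the maximum length a converter has to expect is 4, not 6. A full
decoder would additionally have to check the continuation bytes and reject
surrogates and overlong forms; that is left out here.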


Georg
