On 11/18/2010 02:31 PM, Marco van de Voort wrote:

> Either you have UTF-8 with surrogates, or you have ASCII (since UTF-8
> without surrogates means that only char 0..127 are valid, which is ASCII)
In another post, surrogate pairs were described as a specialty of a 16-bit encoding (UCS-2), and I did not understand why they were brought into a discussion about UTF-8. I simply assumed that they would somehow leak into UTF-8 as a special (alternate) way to encode certain Unicode characters.

It did not occur to me to call the up to four bytes of a normal UTF-8 "character" "surrogates" (to me these are "codes" or something similar).
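
To make the two meanings concrete, here is a small sketch in Python (not taken from this thread; the function name `describe` is made up for illustration). It prints, for a few characters, the bytes of the UTF-8 encoding next to the 16-bit units of the UTF-16 encoding. Only the last character, which lies outside the 16-bit range, needs a surrogate pair in UTF-16; in UTF-8 it is simply a four-byte sequence.

```python
def describe(ch: str) -> dict:
    """Return the code point, UTF-8 bytes and UTF-16 code units of one character.

    Hypothetical helper for illustration only.
    """
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # big-endian, no byte order mark
    # Split the UTF-16 byte string into 16-bit code units.
    units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
    return {
        "code_point": ord(ch),
        "utf8_bytes": list(utf8),
        "utf16_units": units,
    }


if __name__ == "__main__":
    # 'A', e-acute, euro sign, musical G clef (U+1D11E, outside the 16-bit range)
    for ch in ("A", "\u00e9", "\u20ac", "\U0001D11E"):
        info = describe(ch)
        utf8_hex = " ".join(f"{b:02X}" for b in info["utf8_bytes"])
        utf16_hex = " ".join(f"{u:04X}" for u in info["utf16_units"])
        print(f"U+{info['code_point']:04X}  UTF-8: {utf8_hex:<12} UTF-16: {utf16_hex}")
```

For U+1D11E this gives the UTF-8 bytes F0 9D 84 9E and the UTF-16 units D834 DD1E; the latter two are the surrogate pair, and nothing comparable appears on the UTF-8 side.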

Sorry if I introduced any confusion. :(

-Michael
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
