On Monday, 15 January 2018 at 04:27:15 UTC, Jonathan M Davis
wrote:
On Monday, January 15, 2018 03:14:02 Tony via
Digitalmars-d-learn wrote:
On Monday, 15 January 2018 at 02:09:25 UTC, rikki cattermole
wrote:
> Unicode has three main variants, UTF-8, UTF-16 and UTF-32.
> The size of a code point is 1, 2 or 4 bytes.
I think to be technically correct, 1 (UTF-8), 2 (UTF-16) or 4
(UTF-32) bytes are referred to as "code units" and the size of a
code point varies in UTF-8 and UTF-16.
Yes, for UTF-8, a code unit is 8 bits, and there can be up to 6
of them (IIRC) in a code point.
Nooooooooooo!!! Only 4 maximum for Unicode. Beyond that it's
obsolete crap that is not Unicode since version 2 of Unicode.
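
If anyone wants to check the numbers, here is a rough sketch in Python (only because it is quick to run anywhere; D's char, wchar and dchar are the UTF-8, UTF-16 and UTF-32 code units respectively). The helper names code_units and rejects_five_byte_form are made up for this post. It counts how many code units one code point needs in each encoding, then scans the whole Unicode range for the longest UTF-8 sequence.

```python
def code_units(ch: str) -> dict:
    """Number of code units needed to encode one code point in each UTF form."""
    return {
        "UTF-8": len(ch.encode("utf-8")),            # 8-bit code units
        "UTF-16": len(ch.encode("utf-16-le")) // 2,  # 16-bit code units
        "UTF-32": len(ch.encode("utf-32-le")) // 4,  # 32-bit code units
    }


# A few sample code points, from ASCII up to the last valid one (U+10FFFF).
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600", "\U0010FFFF"):
    print(f"U+{ord(ch):04X}", code_units(ch))

# Longest UTF-8 sequence over every encodable code point. Surrogates
# (U+D800..U+DFFF) are skipped because they cannot be encoded on their own.
longest = max(
    len(chr(cp).encode("utf-8"))
    for cp in range(0x110000)
    if not 0xD800 <= cp <= 0xDFFF
)
print("longest UTF-8 sequence:", longest)


def rejects_five_byte_form() -> bool:
    """True if the decoder refuses an old-style 5-byte sequence."""
    try:
        b"\xf8\x88\x80\x80\x80".decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


print("5-byte form rejected:", rejects_five_byte_form())
```

UTF-32 always comes out as one code unit per code point, UTF-16 as one or two, and UTF-8 as one to four. The 5- and 6-byte patterns from the old ISO 10646 definition are refused by a conforming decoder.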