On Mon, Apr 15, 2013, at 15:16, Strake wrote:
> On 15/04/2013, random...@fastmail.us <random...@fastmail.us> wrote:
> > On Mon, Apr 15, 2013, at 10:58, Martti Kühne wrote:
> >> According to a quick google those chars can become as wide as 6
> >> bytes,
> >
> > No, they can't. I have no idea what your source on this is.
> 
> In UTF-8 the maximum encoded character length is 6 bytes [1]

What on earth does that have to do with using an int to store the code
point *instead of* the raw UTF-8 bytes (which are used _now_)?

Also, this is out of date; Unicode has limited code points to 0x10FFFF
since 2003 at the latest, and RFC 3629 (November 2003) accordingly
limits UTF-8 sequences to four bytes. Unless your manpage is much older
than mine, it states this clearly and you misread it.
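
To make the point concrete, here is a rough sketch of an encoder that
takes a code point stored in an integer and emits UTF-8 under the
RFC 3629 limits. The name utf8_encode and its signature are made up for
illustration and are not taken from any existing code; it shows that
nothing above 0x10FFFF is encodable and that no sequence exceeds four
bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): encode one code point as
 * UTF-8 following RFC 3629. Returns the number of bytes written (1..4),
 * or 0 if cp is a surrogate or lies above 0x10FFFF.
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                       /* 7 bits: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* up to 11 bits */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* up to 16 bits */
        if (cp >= 0xD800 && cp <= 0xDFFF)  /* surrogates are not characters */
            return 0;
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* up to 21 bits, capped by Unicode */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode range */
}
```

Since the largest code point needs only 21 bits, a 32-bit int holds any
of them, regardless of how many bytes the UTF-8 form takes.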
