On Mon, Oct 17, 2011 at 11:54 PM, Tom Lane <[email protected]> wrote:
> Robert Haas <[email protected]> writes:
>> - Why does the second byte need special handling for 0xED and 0xF4?
>
> http://www.faqs.org/rfcs/rfc3629.html
>
> See section 4 in particular. The underlying requirement is to disallow
> multiple representations of the same Unicode code point.

I'm still confused. The input string is already known to be valid
UTF-8, so the second byte (if there is one) must be between 0x80 and
0xBF. Therefore it will be neither 0xED nor 0xF4.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
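
For reference, here is a minimal sketch of how the byte ranges in RFC 3629 section 4 can be read. Under that reading, 0xED and 0xF4 are values of the first byte, and each one narrows the range allowed for the second byte below the usual 0x80..0xBF. The function names are hypothetical and are not taken from the PostgreSQL source; this only illustrates the RFC grammar, not the code under discussion.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: given the first (lead) byte of a UTF-8 sequence,
 * report the range permitted for the second byte per RFC 3629 section 4.
 * Returns false if the lead byte cannot start a multibyte sequence.
 */
bool
utf8_second_byte_range(unsigned char lead, unsigned char *lo, unsigned char *hi)
{
    if (lead < 0xC2 || lead > 0xF4)
        return false;           /* ASCII, continuation byte, or invalid lead */

    /* Default range for a continuation byte. */
    *lo = 0x80;
    *hi = 0xBF;

    switch (lead)
    {
        case 0xE0:
            *lo = 0xA0;         /* excludes overlong 3-byte forms */
            break;
        case 0xED:
            *hi = 0x9F;         /* excludes surrogates U+D800..U+DFFF */
            break;
        case 0xF0:
            *lo = 0x90;         /* excludes overlong 4-byte forms */
            break;
        case 0xF4:
            *hi = 0x8F;         /* excludes code points above U+10FFFF */
            break;
        default:
            break;
    }
    return true;
}

/* Hypothetical wrapper: is `second` a legal second byte after `lead`? */
bool
utf8_second_byte_ok(unsigned char lead, unsigned char second)
{
    unsigned char lo, hi;

    if (!utf8_second_byte_range(lead, &lo, &hi))
        return false;
    return second >= lo && second <= hi;
}
```

With this reading, a second byte of 0xA0 is accepted after a lead byte of 0xE1 but rejected after 0xED, even though 0xA0 lies inside 0x80..0xBF in both cases.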
