On 29/03/2014 23:15, Boris Zbarsky wrote:
> On 3/29/14 6:56 PM, Simon Sapin wrote:
>> Or I guess we could use what I’ll call "evil UTF-8", which is UTF-8
>> without the artificial restriction of not encoding surrogates.
> http://en.wikipedia.org/wiki/CESU-8

CESU-8 is evil too, but it’s not what I had in mind. Its main characteristic is encoding non-BMP characters as surrogate pairs, which does not change the value space.

But http://www.unicode.org/reports/tr26/ is unclear on whether CESU-8 allows unpaired surrogates (which was the issue in the previous message). I suppose it does not, by virtue of valid UTF-16 not allowing them either.
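
To make the difference concrete, here is a minimal Rust sketch. The function names are made up for illustration and are not from any existing library. encode_generalized applies the usual UTF-8 bit layout without rejecting the surrogate range, which is what "evil UTF-8" means above. encode_cesu8 first splits non-BMP code points into a surrogate pair. The sketch passes a lone surrogate through unchanged in both functions, which is only an assumption for CESU-8, given the open question above.

```rust
/// Hypothetical helper: encode one code point with the UTF-8 bit layout,
/// without rejecting surrogates (U+D800..=U+DFFF). Assumes cp <= 0x10FFFF.
fn encode_generalized(cp: u32, out: &mut Vec<u8>) {
    match cp {
        0..=0x7F => out.push(cp as u8),
        0x80..=0x7FF => {
            out.push(0xC0 | (cp >> 6) as u8);
            out.push(0x80 | (cp & 0x3F) as u8);
        }
        0x800..=0xFFFF => {
            // Strict UTF-8 would reject 0xD800..=0xDFFF here.
            out.push(0xE0 | (cp >> 12) as u8);
            out.push(0x80 | ((cp >> 6) & 0x3F) as u8);
            out.push(0x80 | (cp & 0x3F) as u8);
        }
        _ => {
            out.push(0xF0 | (cp >> 18) as u8);
            out.push(0x80 | ((cp >> 12) & 0x3F) as u8);
            out.push(0x80 | ((cp >> 6) & 0x3F) as u8);
            out.push(0x80 | (cp & 0x3F) as u8);
        }
    }
}

/// Hypothetical helper: CESU-8 style encoding. Non-BMP code points are split
/// into a UTF-16 surrogate pair and each half is encoded as three bytes.
/// Lone surrogates are passed through (an assumption, see the text above).
fn encode_cesu8(cp: u32, out: &mut Vec<u8>) {
    if cp > 0xFFFF {
        let v = cp - 0x1_0000;
        let high = 0xD800 + (v >> 10);
        let low = 0xDC00 + (v & 0x3FF);
        encode_generalized(high, out);
        encode_generalized(low, out);
    } else {
        encode_generalized(cp, out);
    }
}
```

For U+1F4A9, encode_generalized produces the same four bytes as ordinary UTF-8, while encode_cesu8 produces six. For the lone surrogate U+D800, both produce ED A0 80, which a strict UTF-8 decoder such as Rust's String::from_utf8 rejects.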

--
Simon Sapin
_______________________________________________
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo
