Use cases for invalid-Unicode atoms

Henri Sivonen Mon, 19 Mar 2018 11:49:58 -0700

It appears that currently we allow atomicizing invalid UTF-16 string,
which are impossible to look up by UTF-8 key and we don't allow
atomicizing invalid UTF-8.


I need to change some things in this area in response to changing
error handling of UTF-8 to UTF-16 XPCOM string conversions to be more
secure, so I want to check if I should change things a bit more.

I can well imagine that the current state is exactly what we want:
Bogosity on the UTF-16 side round-trips and bogus UTF-8 doesn't
normally reach the atom machinery.

Am I correct in assuming we don't want changes here?

(One imaginable change would be replacing invalid sequences in both
UTF-16 and UTF-8 with U+FFFD and then atomicizing the result.)

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform

Use cases for invalid-Unicode atoms

Reply via email to