On Mon, Feb 13, 2023 at 10:53:17 -0500, Jason Merrill wrote:
> On 1/25/23 13:06, Ben Boeckel wrote:
> > Unicode does not support such values because they are unrepresentable in
> > UTF-16.
> >
> > libcpp/
> >
> > 	* charset.cc: Reject encodings of codepoints above 0x10FFFF.
> > 	UTF-16 does not support such codepoints and therefore all
> > 	Unicode rejects such values.
>
> It seems that this causes a bunch of testsuite failures from tests that
> expect this limit to be checked elsewhere with a different diagnostic,
> so I think the easiest thing is to fold this into _cpp_valid_utf8_str
> instead, i.e.:

Since then, `cpp_valid_utf8_p` has appeared and takes care of the
over-long encodings. The new patchset just checks for codepoints beyond
0x10FFFF and rejects them in this function (and the test suite matches
`master` results for me then).

--Ben
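
For readers following along, here is a rough, standalone sketch of the kind of check being discussed: decode one UTF-8 sequence and reject any result above 0x10FFFF. It is not the libcpp code or the patch itself; `decode_utf8_one` and `kMaxCodepoint` are hypothetical names made up for illustration, and the sketch assumes C++17 for `std::optional`. It also rejects over-long forms and surrogates so that it is a complete validity check on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical names for illustration only; none of this is libcpp API.
constexpr std::uint32_t kMaxCodepoint = 0x10FFFF;

// Decode a single UTF-8 sequence starting at s (at most len bytes available).
// Returns the codepoint, or std::nullopt if the sequence is malformed,
// over-long, a surrogate, or encodes a value above 0x10FFFF.
std::optional<std::uint32_t>
decode_utf8_one (const unsigned char *s, std::size_t len)
{
  if (len == 0)
    return std::nullopt;

  const unsigned char b0 = s[0];
  if (b0 < 0x80)
    return static_cast<std::uint32_t> (b0);   // plain ASCII

  std::size_t n;        // total bytes in the sequence
  std::uint32_t cp;     // accumulated codepoint
  std::uint32_t min;    // smallest value this length may encode
  if ((b0 & 0xE0) == 0xC0)      { n = 2; cp = b0 & 0x1F; min = 0x80; }
  else if ((b0 & 0xF0) == 0xE0) { n = 3; cp = b0 & 0x0F; min = 0x800; }
  else if ((b0 & 0xF8) == 0xF0) { n = 4; cp = b0 & 0x07; min = 0x10000; }
  else
    return std::nullopt;        // stray continuation byte or 0xF8..0xFF

  if (len < n)
    return std::nullopt;        // truncated sequence

  for (std::size_t i = 1; i < n; ++i)
    {
      if ((s[i] & 0xC0) != 0x80)
        return std::nullopt;    // not a continuation byte
      cp = (cp << 6) | (s[i] & 0x3F);
    }

  if (cp < min)
    return std::nullopt;        // over-long encoding
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return std::nullopt;        // UTF-16 surrogate, not a scalar value
  if (cp > kMaxCodepoint)
    return std::nullopt;        // beyond the Unicode range

  return cp;
}
```

With this sketch, the bytes F4 8F BF BF decode to 0x10FFFF and are accepted, while F4 90 80 80 would decode to 0x110000 and are rejected by the final range check.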