On Mon, Feb 13, 2023 at 10:53:17 -0500, Jason Merrill wrote:
> On 1/25/23 13:06, Ben Boeckel wrote:
> > Unicode does not support such values because they are unrepresentable in
> > UTF-16.
> > 
> > libcpp/
> > 
> >     * charset.cc: Reject encodings of codepoints above 0x10FFFF.
> >     UTF-16 does not support such codepoints and therefore all
> >     Unicode rejects such values.
> 
> It seems that this causes a bunch of testsuite failures from tests that 
> expect this limit to be checked elsewhere with a different diagnostic, 
> so I think the easiest thing is to fold this into _cpp_valid_utf8_str 
> instead, i.e.:

Since then, `cpp_valid_utf8_p` has been added and takes care of
over-long encodings. The new patchset therefore only checks for
codepoints beyond 0x10FFFF and rejects them in this function; with that
change, the test suite results match `master` for me.
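
For illustration only, here is a minimal standalone sketch of the kind of
range check described above. It is not the libcpp code: the function name
`decode_utf8_checked` and its interface are hypothetical. It also folds in
an over-long check purely to stay self-contained, and it does not reject
surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT the libcpp implementation.  Decode one UTF-8
   sequence from BUF (LEN bytes available) into *CP.  Return the number
   of bytes consumed, or 0 if the sequence is malformed, over-long, or
   encodes a value above 0x10FFFF.  */
size_t
decode_utf8_checked (const unsigned char *buf, size_t len, uint32_t *cp)
{
  size_t n;        /* Expected sequence length in bytes.  */
  uint32_t c;      /* Codepoint accumulated so far.  */
  uint32_t min;    /* Smallest value that needs N bytes.  */

  if (len == 0)
    return 0;

  if (buf[0] < 0x80)
    {
      *cp = buf[0];
      return 1;
    }
  else if ((buf[0] & 0xE0) == 0xC0)
    { n = 2; c = buf[0] & 0x1F; min = 0x80; }
  else if ((buf[0] & 0xF0) == 0xE0)
    { n = 3; c = buf[0] & 0x0F; min = 0x800; }
  else if ((buf[0] & 0xF8) == 0xF0)
    { n = 4; c = buf[0] & 0x07; min = 0x10000; }
  else
    return 0;      /* Stray continuation byte or 5+ byte lead byte.  */

  if (len < n)
    return 0;      /* Truncated sequence.  */

  for (size_t i = 1; i < n; i++)
    {
      if ((buf[i] & 0xC0) != 0x80)
        return 0;  /* Not a continuation byte.  */
      c = (c << 6) | (buf[i] & 0x3F);
    }

  if (c < min)
    return 0;      /* Over-long encoding.  */

  /* The check under discussion: a 4-byte sequence can encode values up
     to 0x1FFFFF, but nothing above 0x10FFFF is a Unicode codepoint.  */
  if (c > 0x10FFFF)
    return 0;

  *cp = c;
  return n;
}
```

With this sketch, the bytes F4 8F BF BF decode to U+10FFFF and are
accepted, while F4 90 80 80 (which would be 0x110000) is rejected.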

--Ben
