On Thu, Oct 20, 2022 at 11:39:25 -0400, Jason Merrill wrote:
> Oops, I was thinking this was in gcc as well. In libcpp there's
> _cpp_valid_utf8 (which calls one_utf8_to_cppchar).

This routine has a lot more logic (including UCN decoding), and `one_utf8_to_cppchar` also accepts out-of-range code points above `0x10FFFF`.

--Ben