* Eric Botcazou:

> the Universal Character Names accepted by the C family of compilers
> are mapped to those of ISO/IEC 10646, which defines the Universal
> Character Set codespace as the range 0-0x10FFFF inclusive. The
> upper bound is already enforced for identifiers but not for
> literals, so the following code is accepted in C99:
>
> #include <stddef.h>
>
> wchar_t a = L'\U00110000';
>
> whereas it is rejected with an error by other compilers (Clang, MSVC).
>
> I'm not sure whether the compiler is really required to issue a diagnostic in
> this case. Moreover a few tests in the testsuite manipulate UCNs outside the
> UCS codespace. That's why I suggest issuing a pedantic warning.
>
> Tested on x86_64-suse-linux, OK for the mainline?

Since this is a pedantic warning … I think this has to depend on the C standard version. Each C standard needs to be read against the edition of ISO 10646 that was current when that standard was approved (the references are sadly not versioned, so the edition is implied). Early editions of ISO 10646 definitely do not have the codespace restriction you mention.
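
To make the suggestion concrete, here is a rough sketch of a range check whose limit depends on which edition of ISO 10646 is assumed. It is not taken from the patch: the names `ucs_limit` and `ucn_outside_codespace` are hypothetical, and the 31-bit bound used for early editions is my assumption about their codespace. Which C standard maps to which edition is deliberately left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper bound of the UCS codespace as quoted above: 0-0x10FFFF inclusive. */
#define UCS_MAX_CURRENT 0x10FFFFu

/* Assumption: early editions of ISO 10646 used a 31-bit codespace. */
#define UCS_MAX_EARLY 0x7FFFFFFFu

/* Hypothetical helper: return the codespace limit to apply, given
   whether the selected C standard is read against an early edition
   of ISO 10646.  */
static uint32_t
ucs_limit (bool early_edition)
{
  return early_edition ? UCS_MAX_EARLY : UCS_MAX_CURRENT;
}

/* Hypothetical helper: true if the code point named by a UCN lies
   outside the codespace, i.e. a pedantic warning would be in order.  */
static bool
ucn_outside_codespace (uint32_t code_point, bool early_edition)
{
  return code_point > ucs_limit (early_edition);
}
```

With this shape, `\U00110000` would draw the warning only in modes read against an edition that has the 0x10FFFF restriction.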