jyknight added a comment.

In D106577#2904960 <https://reviews.llvm.org/D106577#2904960>, @rsmith wrote:
> One benefit we don't get with this approach is providing the right value for
> the macro (without paying the cost of always including `stdc-predefs.h`).

What do you mean by "right value", though? As Aaron pointed out, the value seems to depend only on which characters can fit into a wchar_t, which is independent of which Unicode version the libc supports. If ISO10646 defines a new character, you can store it in a wchar_t and, say, decode/encode it to UTF-8 without a libc update. So the exact value doesn't much matter for a 32-bit wchar_t, so long as ISO10646 doesn't expand the size of a character beyond 32 bits. (Which they won't -- it's stuck at 21 bits effectively permanently.)

> AFAICS, the only possible use for the value of the macro is to detect libc
> support, so having Clang pick a specific value seems wrong to me. In some
> ways I'd be more comfortable with this patch if we defined the macro to `1`
> and documented that we think WG14 was wrong to ask for a version number.

At this point, there are 3 versions of ISO10646 that changed properties relevant to this:

- the initial ISO/IEC 10646-1:1993 allowed characters potentially up through 0x7FFFFFFF, but only defined characters up through 0xFFFF,
- ISO/IEC 10646-2:2001 first actually defined characters beyond 0xFFFF,
- and then ISO/IEC 10646:2012 and later versions cut the maximum character value down to 0x10FFFF.

So it's not true that the version number is without meaning -- only that it doesn't matter much anymore, because things have settled down. Quite possibly when they first defined this, they expected that toolchains with a 16-bit wchar_t might set the define, since the standard -- at that point -- didn't have characters beyond 0xFFFF. But if those characters were indeed defined in a yet-to-be-released standard, then you'd have a problem. (As is the case today.)

And also, I think it'd be valid to `#define __STDC_ISO_10646__ 200009L` for 16-bit wchar_t platforms.
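To make the range argument concrete, here is a minimal sketch in C that checks which of the three ranges a platform's wchar_t can represent. The macro names `UCS_MAX_*`, the enum, and both helper functions are hypothetical names invented for this illustration; they are not part of Clang, GCC, or any libc. Only `WCHAR_MAX` from `<wchar.h>` is standard.

```c
#include <assert.h>
#include <wchar.h> /* WCHAR_MAX */

/* Upper bounds from the three ISO/IEC 10646 milestones listed above.
   These macro names are hypothetical, chosen for this sketch only. */
#define UCS_MAX_1993_ALLOWED 0x7FFFFFFFUL /* 10646-1:1993: permitted range */
#define UCS_MAX_BMP          0xFFFFUL     /* highest defined before 10646-2:2001 */
#define UCS_MAX_2012         0x10FFFFUL   /* 10646:2012 and later: maximum value */

/* Hypothetical classification of what this platform's wchar_t can hold. */
enum wchar_coverage {
    WCHAR_TOO_SMALL,       /* cannot even hold 0xFFFF */
    WCHAR_COVERS_BMP_ONLY, /* holds up to 0xFFFF but not up to 0x10FFFF */
    WCHAR_COVERS_ALL       /* holds every value up to 0x10FFFF */
};

/* Nonzero if wchar_t can represent every value in [0, max_code_point]. */
static int wchar_can_hold(unsigned long max_code_point)
{
    /* No edition of the standard ever allowed values above 0x7FFFFFFF. */
    assert(max_code_point <= UCS_MAX_1993_ALLOWED);
    return (unsigned long)WCHAR_MAX >= max_code_point;
}

/* Classify wchar_t against the BMP and the post-2012 maximum. */
static enum wchar_coverage classify_wchar(void)
{
    if (wchar_can_hold(UCS_MAX_2012))
        return WCHAR_COVERS_ALL;
    if (wchar_can_hold(UCS_MAX_BMP))
        return WCHAR_COVERS_BMP_ONLY;
    return WCHAR_TOO_SMALL;
}
```

A platform with a 32-bit wchar_t would be expected to land in `WCHAR_COVERS_ALL`, and one with a 16-bit wchar_t in `WCHAR_COVERS_BMP_ONLY`, which is the case where only a version predating characters beyond 0xFFFF could be claimed.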
(Not sure if we //should//, but that would appear to be valid).

In D106577#2905027 <https://reviews.llvm.org/D106577#2905027>, @aaron.ballman wrote:

> Yeah, I'm hoping to hear what WG14 has to say on this. My original thinking
> was that this macro is used to tell users and libc what version of Unicode
> wchar_t literal values are encoded in (if any), but seeing that both glibc
> and musl (https://git.musl-libc.org/cgit/musl/tree/include/stdc-predef.h#n4)
> define this macro themselves, I am less certain.

Musl has the define simply because GCC does not. That's not an independent confirmation of anything; it is simply following the status quo set by GCC's initial choice in 2000 not to define the macro itself.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106577/new/

https://reviews.llvm.org/D106577

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits