jyknight added a comment.

Even after the more recent discussion, I still think my initial message was 
incorrect, and that the compiler should be defining this macro itself, as 
proposed in this patch. Note that my confusion was not about whether defining 
the macro depends on libc behavior, only about the precise value it should be 
defined to.

Responding to a couple points:

> I think the point was more about "who is generally responsible for defining 
> this macro, the compiler or the library" as opposed to it being a glibc thing 
> specifically. I notice that musl also defines the macro 
> (https://git.musl-libc.org/cgit/musl/tree/include/stdc-predef.h#n4).

Exactly so. *IF* this macro relates to library behavior, then libraries should 
define it -- and not just glibc. Other systems could/should provide a 
stdc-predef.h file as well. (But per above, I don't think this is the case 
here.)

> This patch is certainly wrong for NetBSD as the wchar_t encoding is up to the 
> specific locale charset and *not* UCS-2 or UCS-4 for certain legacy encodings 
> like the various shift encodings in East Asia.

Yet, the compiler currently always puts UTF-16/UTF-32 in wchar_t string 
literals. If that is inconsistent with the runtime, then the system as a whole 
currently has a serious bug. There is currently no platform for which Clang 
uses a non-UTF encoding for wchar_t. If there were some such platform, it would 
then be correct to not define this macro for that platform. There's no getting 
away from the compiler needing to be aware of the encoding of wchar_t, 
independent from this patch, so there's no point in punting the definition of 
the macro to the libc.
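
To make that concrete, here is a rough sketch (not part of the patch) of what a program can observe today. It assumes the macro under discussion is `__STDC_ISO_10646__`; the function names are made up for illustration. The value baked into the literal is chosen by the compiler at build time, whatever the runtime locale later turns out to be.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Returns 1 if the implementation defines __STDC_ISO_10646__, i.e. it
   promises that wchar_t values are ISO 10646 (Unicode) code points. */
int wchar_is_iso10646(void) {
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Returns the first code unit the compiler emitted for U+00E9.
   Hypothetical helper: with a UTF-16 or UTF-32 wide execution
   encoding this is 0xE9, independent of any locale setting. */
unsigned long first_unit_of_e_acute(void) {
    static const wchar_t s[] = L"\u00e9";
    /* One code unit plus the terminator under either UTF encoding. */
    assert(sizeof s / sizeof s[0] == 2);
    return (unsigned long)s[0];
}
```

If the runtime's idea of wchar_t disagrees with that value for some locale, the mismatch exists whether or not anyone defines the macro.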

Now, maybe FreeBSD should be such a platform that uses a different wchar_t 
encoding...which leads to the question: what //is// the encoding Clang should 
be using here? What *should* `L"\U00100000"` emit? It sounds like wchar_t 
doesn't even have a consistent encoding at runtime, which implies that there's 
no way the compiler can create a correct wchar_t string literal. So maybe it 
should simply throw a compilation error if you try to use `L""` or `L''`?
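
For reference, here is a small sketch of the two answers a UTF-based compiler can give for `L"\U00100000"`, depending on the width of wchar_t. The helper names are hypothetical, and the check at the end only restates the current behavior described above (UTF-16 or UTF-32 in wide literals); it says nothing about what a locale-dependent platform should get.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Encode code point cp as UTF-16 (unit_bits == 16) or UTF-32 (otherwise).
   Writes the code units to out and returns how many were written. */
size_t encode_wide(uint32_t cp, unsigned unit_bits, uint32_t out[2]) {
    assert(cp <= 0x10FFFF);
    if (unit_bits == 16 && cp >= 0x10000) {
        uint32_t v = cp - 0x10000;       /* 20-bit offset into the astral planes */
        out[0] = 0xD800 + (v >> 10);     /* high surrogate */
        out[1] = 0xDC00 + (v & 0x3FF);   /* low surrogate */
        return 2;
    }
    out[0] = cp;
    return 1;
}

/* Returns 1 if the compiler's literal matches the UTF encoding implied
   by the width of wchar_t on this target, 0 otherwise. */
int literal_matches_utf(void) {
    static const wchar_t lit[] = L"\U00100000";
    uint32_t expect[2];
    size_t n = encode_wide(0x100000, (unsigned)(sizeof(wchar_t) * CHAR_BIT), expect);
    if (sizeof lit / sizeof lit[0] - 1 != n)
        return 0;
    for (size_t i = 0; i < n; i++)
        if ((uint32_t)lit[i] != expect[i])
            return 0;
    return 1;
}
```

With a 16-bit wchar_t the literal becomes the surrogate pair 0xDBC0 0xDC00; with a 32-bit wchar_t it is the single unit 0x100000. Neither is meaningful to a runtime whose wchar_t encoding follows a non-Unicode locale charset.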

This same bug also exists for Solaris, per 
https://www.gnu.org/software/libunistring/manual/html_node/The-wchar_005ft-mess.html


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106577/new/

https://reviews.llvm.org/D106577
