[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-21 Thread Aaron Ballman via cfe-commits

AaronBallman wrote:

Thank you both for collaborating to get that solved!

https://github.com/llvm/llvm-project/pull/95802
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-21 Thread Mariya Podchishchaeva via cfe-commits

Fznamznon wrote:

According to the bots that worked!



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Mariya Podchishchaeva via cfe-commits

Fznamznon wrote:

Thank you!




[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Timm Baeder via cfe-commits

tbaederr wrote:

I just pushed https://github.com/llvm/llvm-project/commit/99f5fcb0d1e04125daa404ff14c9cd14b7a2c40b - I don't have time to run _all_ the tests though, so this is a bit of a long 
shot. If that doesn't fix it, then disabling them for now sounds fine to me.



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Mariya Podchishchaeva via cfe-commits

Fznamznon wrote:

I wonder if it would be OK to disable the interpreter tests for now?



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Mariya Podchishchaeva via cfe-commits

Fznamznon wrote:

I can't, because I don't have a big-endian machine to verify with. We can try to push 
it speculatively if it doesn't break existing tests.




[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Timm Baeder via cfe-commits

tbaederr wrote:

Here's a quick patch with the cast inserted: 

```diff
diff --git a/clang/lib/AST/Interp/ByteCodeExprGen.cpp b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
index 731153a6ead9..e7fa1a62c277 100644
--- a/clang/lib/AST/Interp/ByteCodeExprGen.cpp
+++ b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
@@ -1346,13 +1346,30 @@ bool ByteCodeExprGen<Emitter>::visitInitList(ArrayRef<const Expr *> Inits,
   }
 }

-auto Eval = [&](Expr *Init, unsigned ElemIndex) {
-  return visitArrayElemInit(ElemIndex, Init);
-};
-
+E->dump();
 unsigned ElementIndex = 0;
 for (const Expr *Init : Inits) {
-  if (auto *EmbedS = dyn_cast<EmbedExpr>(Init->IgnoreParenImpCasts())) {
+  if (const auto *EmbedS = dyn_cast<EmbedExpr>(Init->IgnoreParenImpCasts())) {
+QualType TargetType = Init->getType();
+PrimType TargetT = classifyPrim(Init->getType());
+TargetType->dump();
+
+
+auto Eval = [&](const Expr *Init, unsigned ElemIndex) {
+  PrimType InitT = classifyPrim(Init->getType());
+  if (!this->visit(Init))
+return false;
+  if (InitT != TargetT) {
+if (!this->emitCast(InitT, TargetT, E))
+  return false;
+  }
+return this->emitInitElem(TargetT, ElemIndex, Init);
+};
+
+
+
+
+
 if (!EmbedS->doForEachDataElement(Eval, ElementIndex))
   return false;
   } else {
```

Can you check if that fixes the problem?



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Mariya Podchishchaeva via cfe-commits

Fznamznon wrote:

I'm trying to insert a cast using emitCast:
```
--- a/clang/lib/AST/Interp/ByteCodeExprGen.cpp
+++ b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
@@ -1347,6 +1347,13 @@ bool ByteCodeExprGen<Emitter>::visitInitList(ArrayRef<const Expr *> Inits,
 }

 auto Eval = [&](Expr *Init, unsigned ElemIndex) {
+  auto ArrayTy = Ctx.getASTContext().getAsConstantArrayType(T);
+  std::optional<PrimType> FromT = classify(Init->getType());
+  std::optional<PrimType> ToT = classify(ArrayTy->getElementType());
+  if (FromT != ToT) {
+if (!this->emitCast(*FromT, *ToT, Init))
+  return false;
+  }
   return visitArrayElemInit(ElemIndex, Init);
 };
```
But it fails with an assertion
```
source/llvm-project/clang/lib/AST/Interp/InterpStack.h:46: T clang::interp::InterpStack::pop() [with T = clang::interp::Integral<8, false>]: Assertion `ItemTypes.back() == toPrimType<T>()' failed.
```



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Mariya Podchishchaeva via cfe-commits

Fznamznon wrote:

Yes, all of the failing bots are big endian. The reproducer is:
```
clang -cc1 %s -fsyntax-only -verify -fexperimental-new-constant-interpreter
constexpr int value(int a, int b) {
  return a + b;
}
constexpr int init_list_expr() {
  int vals[] = {
#embed "jk.txt"
  };
  return value(vals[0], vals[1]);
}
constexpr int ExpectedValue = 'j' + 'k';
static_assert(init_list_expr() == ExpectedValue);
```

The contents of "jk.txt" are simply "jk".



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Timm Baeder via cfe-commits

tbaederr wrote:

Do you have a smaller reproducer? Are all the failing build bots big endian?



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Mariya Podchishchaeva via cfe-commits

Fznamznon wrote:

@tbaederr, I noticed that all of the buildbot failures relate to the run with the new 
constant interpreter. Could you check whether I did something 
wrong? For example, embed by default yields values of type `unsigned char`. 
However, when expanding it in 
[ByteCodeExprGen.cpp](https://github.com/llvm/llvm-project/pull/95802/files#diff-ccb62cc00bada13b706286e9cdc64881c86d81da71f2939bcbb40ee692299d67), 
I did not insert any casts. I think this could matter, since I had to insert 
casts in other places, and from the log (as @AaronBallman noticed):
```
Line 18: in call to 'value(1778384896, 1795162112)'
The first value is 0x6A000000 in hex and the second is 0x6B000000. The ASCII 
value for j is 0x6A and k is 0x6B
```
So the byte values are actually correct; they just end up in the most significant byte of the `int`.
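
To make the arithmetic in that log concrete, here is a small compile-time sketch. It assumes (this is a guess at the mechanism, not something the log confirms) that a one-byte value is written at the start of a four-byte slot which is then read back as a big-endian integer. `readBigEndian` and `Slot` are names made up for the illustration, and ASCII is assumed for `'j'`.

```cpp
#include <cstdint>

// Interpret four bytes as a big-endian 32-bit value (hypothetical helper).
constexpr std::uint32_t readBigEndian(const unsigned char (&B)[4]) {
  return (std::uint32_t(B[0]) << 24) | (std::uint32_t(B[1]) << 16) |
         (std::uint32_t(B[2]) << 8) | std::uint32_t(B[3]);
}

// A one-byte 'j' (0x6A) stored at the start of an otherwise zeroed slot.
constexpr unsigned char Slot[4] = {'j', 0, 0, 0};

// Read back as four bytes, the value is 0x6A000000, which is the first
// number printed in the diagnostic.
static_assert(readBigEndian(Slot) == 0x6A000000, "");
static_assert(0x6A000000 == 1778384896, "");
static_assert(0x6B000000 == 1795162112, "");
```

This compiles with `-fsyntax-only` and needs no input file; it only checks the numbers, it does not reproduce the interpreter bug.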



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Mariya Podchishchaeva via cfe-commits

Fznamznon wrote:

There is a buildbot failure; I'm looking into it: 
https://lab.llvm.org/buildbot/#/builders/176/builds/226 .



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Mariya Podchishchaeva via cfe-commits

https://github.com/Fznamznon closed 
https://github.com/llvm/llvm-project/pull/95802


[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Aaron Ballman via cfe-commits

https://github.com/AaronBallman approved this pull request.

Leak fix LGTM, I think it's ready to re-land and try again.



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Aaron Ballman via cfe-commits


@@ -0,0 +1,98 @@
+// RUN: %clang_cc1 %s -fsyntax-only --embed-dir=%S/Inputs -verify=expected,cxx -Wno-c23-extensions
+// RUN: %clang_cc1 -x c -std=c23 %s -fsyntax-only --embed-dir=%S/Inputs -verify=expected,c
+#embed 
+;
+
+void f (unsigned char x) { (void)x;}
+void g () {}
+void h (unsigned char x, int y) {(void)x; (void)y;}
+int i () {
+   return
+#embed 
+   ;
+}
+
+_Static_assert(
+#embed  suffix(,)
+""
+);
+_Static_assert(
+#embed 
+, ""
+);
+_Static_assert(sizeof(
+#embed 
+) ==
+sizeof(unsigned char)
+, ""
+);
+_Static_assert(sizeof
+#embed 
+, ""
+);
+_Static_assert(sizeof(
+#embed  // expected-warning {{left operand of comma operator has no effect}}
+) ==
+sizeof(unsigned char)
+, ""
+);
+
+#ifdef __cplusplus
+template 
+void j() {
+   static_assert(First == 'j', "");
+   static_assert(Second == 'k', "");
+}
+#endif
+
+void do_stuff() {
+   f(
+#embed 
+   );
+   g(
+#embed 
+   );
+   h(
+#embed 
+   );
+   int r = i();
+   (void)r;
+#ifdef __cplusplus
+   j<
+#embed 
+   >(
+#embed 
+   );
+#endif
+}
+
+// Ensure that we don't accidentally allow you to initialize an unsigned char *
+// from embedded data; the data is modeled as a string literal internally, but
+// is not actually a string literal.
+const unsigned char *ptr =
+#embed  // expected-warning {{left operand of comma operator has no effect}}
+; // c-error@-2 {{incompatible integer to pointer conversion initializing 'const unsigned char *' with an expression of type 'unsigned char'}} \
+ cxx-error@-2 {{cannot initialize a variable of type 'const unsigned char *' with an rvalue of type 'unsigned char'}}
+
+// However, there are some cases where this is fine and should work.
+const unsigned char *null_ptr_1 =
+#embed  if_empty(0)
+;
+
+const unsigned char *null_ptr_2 =
+#embed 
+;
+
+const unsigned char *null_ptr_3 = {
+#embed 
+};
+
+#define FILE_NAME 
+#define LIMIT 1
+#define OFFSET 0
+#define EMPTY_SUFFIX suffix()
+
+constexpr unsigned char ch =
+#embed FILE_NAME limit(LIMIT) clang::offset(OFFSET) EMPTY_SUFFIX
+;
+static_assert(ch == 0);

AaronBallman wrote:

> I have a prototype of injecting tokens that helps. It also removes all the 
> "whack a mole" around template arguments. The only downside is that it now yields 
> `int` instead of `unsigned char`, but I guess that is fine?

Nice! Yes, it's fine to yield an `int`; that's how the feature is defined to 
behave in C and we need the semantics to be the same in C and C++.
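
As a rough illustration of why an `int`-typed expansion is harmless for the common use case, the sketch below substitutes a hand-written macro for the directive. `EMBED_JK` is a made-up stand-in for what `#embed` would produce for a file containing "jk", so no input file is needed; it is an assumption about the expansion, not output captured from clang.

```cpp
#include <type_traits>

// Hypothetical stand-in for the token sequence `#embed` yields for "jk".
#define EMBED_JK 106, 107

// Each element is an ordinary integer literal, so its type is int.
static_assert(std::is_same<decltype(106), int>::value, "");

// Initializing an unsigned char array from those literals is still fine:
// every value is a constant in the range 0..255, so nothing narrows.
constexpr unsigned char Data[] = {EMBED_JK};
static_assert(sizeof(Data) == 2, "");
static_assert(Data[0] == 'j' && Data[1] == 'k', "");
```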

> Should I push it to this PR, or does it make sense to land this first and make a 
> separate PR? NOTE: I'm on vacation next week, so I will not be available.

IMO, it would be easier for reviewers to land the current changes and then push 
fixes and improvements separately. This patch is already really hard to review 
due to size. What do folks think about landing the changes as-is today/tomorrow 
and then doing follow-up work once @Fznamznon is back from vacation?



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread via cfe-commits


cor3ntin wrote:

I think we should land this PR and iterate from there.



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-20 Thread Mariya Podchishchaeva via cfe-commits


Fznamznon wrote:

I have a prototype of injecting tokens that helps. It also removes all the 
"whack a mole" around template arguments. The only downside is that it now yields 
`int` instead of `unsigned char`, but I guess that is fine?

Should I push it to this PR, or does it make sense to land this first and make a 
separate PR? NOTE: I'm on vacation next week, so I will not be available.



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-19 Thread Jakub Jelínek via cfe-commits


jakubjelinek wrote:

Unless you want to special-case it in far too many spots, I think it would be 
far easier to optimize just the inner part of the integer sequence, i.e. 
everything except the first and last sequence elements (perhaps with an exception 
when the last prefix token is `,` or the first suffix token is `,`).
Because one can use arbitrary tokens before and after the #embed, it can be:
```c
const unsigned char a[] = {
-400 + 4 * 
#embed __FILE__
- 27 };
```
(or with tokens from prefix/suffix), and at least the current patchset 
mishandles many such cases. For the inner part of the sequence you know 
there is a `,` before it and a `,` after it, which simplifies a lot of things.
The above is handled correctly by GCC and by clang with -save-temps, but not by 
clang without -save-temps.
And there are tons of other cases like that, e.g. even a designated initializer 
`[26] =` before the sequence, etc.
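
To see why only the first and last elements are affected by surrounding tokens, here is a file-free sketch of the example above. `EMBED_JKL` is a hypothetical stand-in for the expansion of a three-byte file containing "jkl" (the original uses `__FILE__`, whose contents vary), so the concrete numbers are illustrative only.

```cpp
// Hypothetical stand-in for `#embed` of a three-byte file containing "jkl".
#define EMBED_JKL 106, 107, 108

// The tokens before the directive bind to the first element and the tokens
// after it bind to the last element; the inner element is untouched.
constexpr unsigned char A[] = {
-400 + 4 *
EMBED_JKL
- 27 };

static_assert(sizeof(A) == 3, "");
static_assert(A[0] == 24, "");  // -400 + 4 * 106
static_assert(A[1] == 107, ""); // inner element, unchanged
static_assert(A[2] == 81, "");  // 108 - 27
```

An implementation that treats the whole embedded sequence as one opaque blob has to special-case exactly these two boundary elements, which is the point being made above.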



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-18 Thread via cfe-commits


cor3ntin wrote:

> This is one more case that would be solved by injecting tokens back into the 
> stream. Right now it is quite complex to recognize that an embed encountered by 
> `ParseCastExpression` should be expanded in a special way.

+1. I can work on that when I come back from St. Louis if you don't get to it 
first.



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-18 Thread Mariya Podchishchaeva via cfe-commits


Fznamznon wrote:

This is one more case that would be solved by injecting tokens back into the 
stream. Right now it is quite complex to recognize that an embed encountered by 
`ParseCastExpression` should be expanded in a special way.



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-18 Thread Aaron Ballman via cfe-commits


AaronBallman wrote:

Because embed produces tokens from the preprocessor, the first example should 
be the same as:
```
void f(float x, char y, char z);
void g() { f((float)
1, 2, 3
);
}
```
which should be accepted (the cast applies to the first argument). You can run 
into the same with something like:
```

void f(float x, char y, char z);
void g() {
  f(
#embed "three_character_file" prefix((float))
  );
}
```
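
The claim that the cast applies only to the first argument can be checked at compile time without any input file. In the sketch below, `EMBED_JKL` is a made-up stand-in for the expansion of a three-character file (assumed here to contain "jkl"), and `castHitOnlyFirst` is a hypothetical helper that inspects the deduced argument types.

```cpp
#include <type_traits>

// Hypothetical stand-in for `#embed "three_character_file"`.
#define EMBED_JKL 106, 107, 108

// True when the three arguments arrive as (float, int, int).
template <class A, class B, class C>
constexpr bool castHitOnlyFirst(A, B, C) {
  return std::is_same<A, float>::value && std::is_same<B, int>::value &&
         std::is_same<C, int>::value;
}

// Equivalent in spirit to `prefix((float))`: the cast tokens sit directly in
// front of the first element of the expansion and nowhere else.
static_assert(castHitOnlyFirst((float)EMBED_JKL), "");
```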



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-18 Thread Mariya Podchishchaeva via cfe-commits




Fznamznon wrote:

Ok, removed null byte file.



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-18 Thread Mariya Podchishchaeva via cfe-commits


@@ -2422,6 +2422,10 @@ void ExprEngine::Visit(const Stmt *S, ExplodedNode *Pred,
   Bldr.addNodes(Dst);
   break;
 }
+
+case Stmt::EmbedExprClass:
+  llvm_unreachable("Support for EmbedExpr is not implemented.");

Fznamznon wrote:

Used `report_fatal_error` instead.



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-18 Thread Mariya Podchishchaeva via cfe-commits


Fznamznon wrote:

Added cases.



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-18 Thread Mariya Podchishchaeva via cfe-commits


@@ -441,6 +441,7 @@ tok::PPKeywordKind IdentifierInfo::getPPKeywordID() const {
   CASE( 4, 'e', 's', else);
   CASE( 4, 'l', 'n', line);
   CASE( 4, 's', 'c', sccs);
+  CASE(5, 'e', 'b', embed);

Fznamznon wrote:

Ok, done.

https://github.com/llvm/llvm-project/pull/95802


[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-18 Thread Mariya Podchishchaeva via cfe-commits


@@ -0,0 +1,98 @@
+// RUN: %clang_cc1 %s -fsyntax-only --embed-dir=%S/Inputs -verify=expected,cxx -Wno-c23-extensions
+// RUN: %clang_cc1 -x c -std=c23 %s -fsyntax-only --embed-dir=%S/Inputs -verify=expected,c
+#embed 
+;
+
+void f (unsigned char x) { (void)x;}
+void g () {}
+void h (unsigned char x, int y) {(void)x; (void)y;}
+int i () {
+   return
+#embed 
+   ;
+}
+
+_Static_assert(
+#embed  suffix(,)
+""
+);
+_Static_assert(
+#embed 
+, ""
+);
+_Static_assert(sizeof(
+#embed 
+) ==
+sizeof(unsigned char)
+, ""
+);
+_Static_assert(sizeof
+#embed 
+, ""
+);
+_Static_assert(sizeof(
+#embed  // expected-warning {{left operand of comma operator has no effect}}
+) ==
+sizeof(unsigned char)
+, ""
+);
+
+#ifdef __cplusplus
+template 
+void j() {
+   static_assert(First == 'j', "");
+   static_assert(Second == 'k', "");
+}
+#endif
+
+void do_stuff() {
+   f(
+#embed 
+   );
+   g(
+#embed 
+   );
+   h(
+#embed 
+   );
+   int r = i();
+   (void)r;
+#ifdef __cplusplus
+   j<
+#embed 
+   >(
+#embed 
+   );
+#endif
+}
+
+// Ensure that we don't accidentally allow you to initialize an unsigned char *
+// from embedded data; the data is modeled as a string literal internally, but
+// is not actually a string literal.
+const unsigned char *ptr =
+#embed  // expected-warning {{left operand of comma operator has no effect}}
+; // c-error@-2 {{incompatible integer to pointer conversion initializing 'const unsigned char *' with an expression of type 'unsigned char'}} \
+ cxx-error@-2 {{cannot initialize a variable of type 'const unsigned char *' with an rvalue of type 'unsigned char'}}
+
+// However, there are some cases where this is fine and should work.
+const unsigned char *null_ptr_1 =
+#embed  if_empty(0)
+;
+
+const unsigned char *null_ptr_2 =
+#embed 
+;
+
+const unsigned char *null_ptr_3 = {
+#embed 
+};
+
+#define FILE_NAME 
+#define LIMIT 1
+#define OFFSET 0
+#define EMPTY_SUFFIX suffix()
+
+constexpr unsigned char ch =
+#embed FILE_NAME limit(LIMIT) clang::offset(OFFSET) EMPTY_SUFFIX
+;
+static_assert(ch == 0);

Fznamznon wrote:

The first case is interesting: the cast to float makes the #embed be treated as
a comma expression, so there are actually not enough arguments. I think this is
the correct behavior, though. Otherwise it is not clear to me which data
element the cast should apply to.
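
To make the two readings concrete, here is a minimal sketch that uses hand-written values in place of the embedded bytes. The names `first_of_three`, `only_one`, `kA`, `kB` and `kC` are hypothetical and exist only for this illustration; the sketch shows plain comma-operator semantics, not how Clang implements the directive.

```cpp
#include <cassert>

// Hypothetical stand-ins for the three bytes of "three_character_file".
constexpr char kA = 1, kB = 2, kC = 3;

// Reading 1: three separate arguments; the cast applies only to the first.
constexpr float first_of_three(float x, char, char) { return x; }

// Reading 2: one comma expression; the cast applies to its final operand.
constexpr float only_one(float x) { return x; }

float as_three_arguments() {
  return first_of_three((float)kA, kB, kC);
}

float as_comma_expression() {
  // The (void) casts only silence "left operand has no effect" warnings.
  return only_one((float)((void)kA, (void)kB, kC));
}
```

Under the second reading the call supplies a single argument, which is why a three-parameter function would be reported as having too few arguments.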

https://github.com/llvm/llvm-project/pull/95802


[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-18 Thread Donát Nagy via cfe-commits


@@ -441,6 +441,7 @@ tok::PPKeywordKind IdentifierInfo::getPPKeywordID() const {
   CASE( 4, 'e', 's', else);
   CASE( 4, 'l', 'n', line);
   CASE( 4, 's', 'c', sccs);
+  CASE(5, 'e', 'b', embed);

NagyDonat wrote:

```suggestion
  CASE( 5, 'e', 'b', embed);
```
Just drive-by bikeshedding -- consider aligning this like the lines around it.
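
For readers unfamiliar with the table being aligned, each `CASE` entry keys a keyword by its length, first character and third character (`embed` is 5, 'e', 'b'). The following is a rough standalone sketch of that dispatch style. The `bucket` formula, the `PPKeyword` enum and `classify` are assumptions made up for this illustration; the real hash and macro in IdentifierTable.cpp may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for tok::PPKeywordKind; not the real Clang enum.
enum class PPKeyword { NotKeyword, Else, Line, Sccs, Embed };

// Assumed bucket function combining length, first and third characters.
constexpr unsigned bucket(unsigned len, char first, char third) {
  return (len << 6) + (static_cast<unsigned>(first - third) & 63u);
}

PPKeyword classify(const char *name) {
  assert(name != nullptr);
  const std::size_t len = std::strlen(name);
  if (len < 3)
    return PPKeyword::NotKeyword;
// A bucket match is only a candidate; the full spelling is still compared.
#define CASE(LEN, FIRST, THIRD, NAME, KIND)                                   \
  case bucket(LEN, FIRST, THIRD):                                             \
    return std::strcmp(name, NAME) == 0 ? PPKeyword::KIND                     \
                                        : PPKeyword::NotKeyword
  switch (bucket(static_cast<unsigned>(len), name[0], name[2])) {
    CASE(4, 'e', 's', "else", Else);
    CASE(4, 'l', 'n', "line", Line);
    CASE(4, 's', 'c', "sccs", Sccs);
    CASE(5, 'e', 'b', "embed", Embed);
  default:
    return PPKeyword::NotKeyword;
  }
#undef CASE
}
```

An identifier such as `ember` lands in the same bucket as `embed` but fails the spelling comparison, so it is not classified as a keyword.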

https://github.com/llvm/llvm-project/pull/95802


[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-17 Thread Eli Friedman via cfe-commits




efriedma-quic wrote:

Please don't commit binary files if it isn't absolutely necessary.  You can 
generate whatever files you need in a RUN line.

https://github.com/llvm/llvm-project/pull/95802


[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-17 Thread Eli Friedman via cfe-commits


@@ -0,0 +1,98 @@
+// RUN: %clang_cc1 %s -fsyntax-only --embed-dir=%S/Inputs -verify=expected,cxx -Wno-c23-extensions
+// RUN: %clang_cc1 -x c -std=c23 %s -fsyntax-only --embed-dir=%S/Inputs -verify=expected,c
+#embed 
+;
+
+void f (unsigned char x) { (void)x;}
+void g () {}
+void h (unsigned char x, int y) {(void)x; (void)y;}
+int i () {
+   return
+#embed 
+   ;
+}
+
+_Static_assert(
+#embed  suffix(,)
+""
+);
+_Static_assert(
+#embed 
+, ""
+);
+_Static_assert(sizeof(
+#embed 
+) ==
+sizeof(unsigned char)
+, ""
+);
+_Static_assert(sizeof
+#embed 
+, ""
+);
+_Static_assert(sizeof(
+#embed  // expected-warning {{left operand of comma operator has no effect}}
+) ==
+sizeof(unsigned char)
+, ""
+);
+
+#ifdef __cplusplus
+template 
+void j() {
+   static_assert(First == 'j', "");
+   static_assert(Second == 'k', "");
+}
+#endif
+
+void do_stuff() {
+   f(
+#embed 
+   );
+   g(
+#embed 
+   );
+   h(
+#embed 
+   );
+   int r = i();
+   (void)r;
+#ifdef __cplusplus
+   j<
+#embed 
+   >(
+#embed 
+   );
+#endif
+}
+
+// Ensure that we don't accidentally allow you to initialize an unsigned char *
+// from embedded data; the data is modeled as a string literal internally, but
+// is not actually a string literal.
+const unsigned char *ptr =
+#embed  // expected-warning {{left operand of comma operator has no effect}}
+; // c-error@-2 {{incompatible integer to pointer conversion initializing 'const unsigned char *' with an expression of type 'unsigned char'}} \
+ cxx-error@-2 {{cannot initialize a variable of type 'const unsigned char *' with an rvalue of type 'unsigned char'}}
+
+// However, there are some cases where this is fine and should work.
+const unsigned char *null_ptr_1 =
+#embed  if_empty(0)
+;
+
+const unsigned char *null_ptr_2 =
+#embed 
+;
+
+const unsigned char *null_ptr_3 = {
+#embed 
+};
+
+#define FILE_NAME 
+#define LIMIT 1
+#define OFFSET 0
+#define EMPTY_SUFFIX suffix()
+
+constexpr unsigned char ch =
+#embed FILE_NAME limit(LIMIT) clang::offset(OFFSET) EMPTY_SUFFIX
+;
+static_assert(ch == 0);

efriedma-quic wrote:

More weird cases to consider:

```
void f(float x, char y, char z);
void g() { f((float)
#embed "three_character_file"
);
}
```

```
struct S { S(char x); ~S(); };
void f() { 
  S s[] = {
#embed "file"
  };
}
```
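
For the second case, a rough sketch of the behavior one would expect, with the embedded bytes written out by hand so it builds without `#embed` support. The instance counter and `build_array` are hypothetical additions for the illustration: each byte has to go through the converting constructor, so the data cannot simply be treated as one opaque blob.

```cpp
#include <cassert>
#include <cstddef>

struct S {
  static int constructed;  // hypothetical counter, added for illustration
  char value;
  S(char x) : value(x) { ++constructed; }
  ~S() {}
};
int S::constructed = 0;

std::size_t build_array() {
  // Hand-written stand-in for the list that #embed "file" would expand to.
  S s[] = {'a', 'b', 'c'};
  return sizeof(s) / sizeof(s[0]);
}
```

One `S` is constructed per byte, and the array bound is deduced from the number of bytes.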

https://github.com/llvm/llvm-project/pull/95802


[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-17 Thread Eli Friedman via cfe-commits


@@ -2422,6 +2422,10 @@ void ExprEngine::Visit(const Stmt *S, ExplodedNode *Pred,
   Bldr.addNodes(Dst);
   break;
 }
+
+case Stmt::EmbedExprClass:
+  llvm_unreachable("Support for EmbedExpr is not implemented.");

efriedma-quic wrote:

Please don't use llvm_unreachable for things which are actually reachable.  At 
the very least, use report_fatal_error.  Prefer a real diagnostic when possible.
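
As a rough illustration of the preferred direction, here is a standalone sketch in which an unsupported but reachable node produces a diagnostic instead of being marked unreachable. As I understand it, `llvm_unreachable` can become a pure optimizer hint in builds without assertions, so reaching it is undefined behavior there. `StmtKind`, `VisitResult` and `visit` are hypothetical stand-ins and not the static analyzer's API.

```cpp
#include <cassert>
#include <string>

// Hypothetical statement kinds; stand-ins for Stmt::StmtClass values.
enum class StmtKind { IntegerLiteral, EmbedExpr };

// Either the node was handled, or a diagnostic explains why it was not.
struct VisitResult {
  bool handled;
  std::string diagnostic;  // empty when handled
};

VisitResult visit(StmtKind kind) {
  switch (kind) {
  case StmtKind::IntegerLiteral:
    return {true, ""};
  case StmtKind::EmbedExpr:
    // Valid input can reach this case, so report it rather than asserting
    // that it can never happen.
    return {false, "support for EmbedExpr is not implemented"};
  }
  return {false, "unknown statement kind"};
}
```

The caller can then surface the message to the user and continue or stop cleanly.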

https://github.com/llvm/llvm-project/pull/95802


[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-17 Thread via cfe-commits

https://github.com/cor3ntin approved this pull request.

The commit to fix the leak LGTM

https://github.com/llvm/llvm-project/pull/95802


[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-17 Thread Mariya Podchishchaeva via cfe-commits

Fznamznon wrote:

This fixes 
https://github.com/llvm/llvm-project/pull/68620#issuecomment-2163448739 .
A second failure, 
https://github.com/llvm/llvm-project/pull/68620#issuecomment-2163603239 ,
was also reported, but I am not able to access the proper logs. The link points 
to the sanitizer buildbots, so I suppose it might be the same memory leak.
With this patch and a sanitizer build, only 
`Clang-Unit :: Lex/./LexTests/67/120` fails, and it also fails on main.

https://github.com/llvm/llvm-project/pull/95802


[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-17 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-static-analyzer-1

Author: Mariya Podchishchaeva (Fznamznon)


Changes

This commit implements the entirety of the now-accepted [N3017 - Preprocessor 
Embed](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm) and its 
sister C++ paper [p1967](https://wg21.link/p1967). It implements everything in 
the specification and includes a fast path that drastically reduces the time it 
takes to embed data in specific scenarios (the initialization of character-type 
arrays). The fast path is applied under the "as-if" rule. In general, when the 
system cannot detect that it is initializing an array object in a variable 
declaration, it will either generate an EmbedExpr AST node, which is expanded 
by AST consumers (CodeGen or the constant expression evaluators), or expand the 
embed directive as a comma expression.

This reverts commit 
https://github.com/llvm/llvm-project/commit/682d461d5a231cee54d65910e6341769419a67d7.

-

Co-authored-by: The Phantom Derpstorm phdofthehouse@gmail.com
Co-authored-by: Aaron Ballman aaron@aaronballman.com
Co-authored-by: cor3ntin corentinjabot@gmail.com
Co-authored-by: H. Vetinari h.vetinari@gmx.com

---

Patch is 184.69 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/95802.diff


96 Files Affected:

- (modified) clang-tools-extra/test/pp-trace/pp-trace-macro.cpp (+9) 
- (modified) clang/docs/LanguageExtensions.rst (+24) 
- (modified) clang/include/clang/AST/Expr.h (+158) 
- (modified) clang/include/clang/AST/RecursiveASTVisitor.h (+5) 
- (modified) clang/include/clang/AST/TextNodeDumper.h (+1) 
- (modified) clang/include/clang/Basic/DiagnosticCommonKinds.td (+3) 
- (modified) clang/include/clang/Basic/DiagnosticLexKinds.td (+12) 
- (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (-2) 
- (modified) clang/include/clang/Basic/FileManager.h (+7-4) 
- (modified) clang/include/clang/Basic/StmtNodes.td (+1) 
- (modified) clang/include/clang/Basic/TokenKinds.def (+6) 
- (modified) clang/include/clang/Driver/Options.td (+6) 
- (modified) clang/include/clang/Frontend/PreprocessorOutputOptions.h (+3) 
- (modified) clang/include/clang/Lex/PPCallbacks.h (+54) 
- (added) clang/include/clang/Lex/PPDirectiveParameter.h (+33) 
- (added) clang/include/clang/Lex/PPEmbedParameters.h (+94) 
- (modified) clang/include/clang/Lex/Preprocessor.h (+67-2) 
- (modified) clang/include/clang/Lex/PreprocessorOptions.h (+3) 
- (modified) clang/include/clang/Parse/Parser.h (+3) 
- (modified) clang/include/clang/Sema/Sema.h (+4) 
- (modified) clang/include/clang/Serialization/ASTBitCodes.h (+3) 
- (modified) clang/lib/AST/Expr.cpp (+12) 
- (modified) clang/lib/AST/ExprClassification.cpp (+5) 
- (modified) clang/lib/AST/ExprConstant.cpp (+58-5) 
- (modified) clang/lib/AST/Interp/ByteCodeExprGen.cpp (+18-3) 
- (modified) clang/lib/AST/Interp/ByteCodeExprGen.h (+1) 
- (modified) clang/lib/AST/ItaniumMangle.cpp (+1) 
- (modified) clang/lib/AST/StmtPrinter.cpp (+4) 
- (modified) clang/lib/AST/StmtProfile.cpp (+2) 
- (modified) clang/lib/AST/TextNodeDumper.cpp (+5) 
- (modified) clang/lib/Basic/FileManager.cpp (+6-1) 
- (modified) clang/lib/Basic/IdentifierTable.cpp (+3-2) 
- (modified) clang/lib/CodeGen/CGExprAgg.cpp (+32-8) 
- (modified) clang/lib/CodeGen/CGExprConstant.cpp (+93-25) 
- (modified) clang/lib/CodeGen/CGExprScalar.cpp (+7) 
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+5-1) 
- (modified) clang/lib/Frontend/CompilerInvocation.cpp (+8) 
- (modified) clang/lib/Frontend/DependencyFile.cpp (+25) 
- (modified) clang/lib/Frontend/DependencyGraph.cpp (+23-1) 
- (modified) clang/lib/Frontend/InitPreprocessor.cpp (+8) 
- (modified) clang/lib/Frontend/PrintPreprocessedOutput.cpp (+115-7) 
- (modified) clang/lib/Lex/PPDirectives.cpp (+474-2) 
- (modified) clang/lib/Lex/PPExpressions.cpp (+36-13) 
- (modified) clang/lib/Lex/PPMacroExpansion.cpp (+111) 
- (modified) clang/lib/Lex/TokenConcatenation.cpp (+4-1) 
- (modified) clang/lib/Parse/ParseExpr.cpp (+36-1) 
- (modified) clang/lib/Parse/ParseInit.cpp (+30) 
- (modified) clang/lib/Parse/ParseTemplate.cpp (+29-12) 
- (modified) clang/lib/Sema/SemaExceptionSpec.cpp (+1) 
- (modified) clang/lib/Sema/SemaExpr.cpp (+12-3) 
- (modified) clang/lib/Sema/SemaInit.cpp (+100-13) 
- (modified) clang/lib/Sema/TreeTransform.h (+5) 
- (modified) clang/lib/Serialization/ASTReaderStmt.cpp (+14) 
- (modified) clang/lib/Serialization/ASTWriterStmt.cpp (+10) 
- (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+4) 
- (added) clang/test/C/C2x/Inputs/bits.bin (+1) 
- (added) clang/test/C/C2x/Inputs/boop.h (+1) 
- (added) clang/test/C/C2x/Inputs/i.dat (+1) 
- (added) clang/test/C/C2x/Inputs/jump.wav (+1) 
- (added) clang/test/C/C2x/Inputs/s.dat (+1) 
- (added) clang/test/C/C2x/n3017.c (+216) 
- (added) clang/test/Preprocessor/Inputs/jk.txt (+1) 
- (added) clang/test/Preprocessor/Inputs/media/art.txt (+9) 
- (added) clang/test/Preprocessor/Inputs/media/empty 

[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-17 Thread via cfe-commits

llvmbot wrote:



@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Mariya Podchishchaeva (Fznamznon)



[clang] [clang-tools-extra] Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (PR #95802)

2024-06-17 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-driver

Author: Mariya Podchishchaeva (Fznamznon)

