date:20231121

[clang] [Driver] Add support for -export-dynamic which can match GCC behavior. (PR #72781)

2023-11-21 Thread dong jianqiang via cfe-commits


dongjianqiang2 wrote:

> > > > @MaskRay the reason for adding this option is that gcc supports it. 
> > > > please refer to 
> > > > [godbolt.org/z/54sE6zTa1](https://godbolt.org/z/54sE6zTa1)
> > > 
> > > 
> > > This doesn't answer my question. GCC has a lot of options that Clang 
> > > doesn't support. An option supported by GCC does not mean that Clang 
> > > needs to support it. This option has a perfect replacement, which makes 
> > > it even more questionable (since, to the best of my knowledge, the 
> > > `-export-dynamic` driver option is not used). See my previous comment:
> > > > GCC's default spec file for Linux does not say how -export-dynamic 
> > > > translates to ld -export-dynamic.
> > > > I think ld --export-dynamic is exclusively caused by 
> > > > -Wl,--export-dynamic or -rdynamic.
> > > > Do you have any example of gcc -export-dynamic uses?
> > 
> > 
> > This is a historically undocumented option, and yes, it can be replaced by 
> > -rdynamic. See 
> > [PR47390](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47390). The purpose 
> > of this pull request is to ensure consistency between the two compilers. If 
> > you think it's unnecessary to support it, I'm fine with it. : )
> 
> Thank you for digging this up. So `-export-dynamic` is the only special 
> `-exxx` option GCC supports. Except for this gawk issue (2011), which was 
> likely fixed long ago, `gcc -export-dynamic` seems to have no use. This 
> doesn't justify adding a special case for one option to the driver (which 
> we know is not a good thing).
> 
> Note that `clang -e xxx` or `clang -exxx` did not work before 2020-07, so 
> even `-exxx` and `-e xxx` are probably rarely used. We should consider 
> [#72804 
> (comment)](https://github.com/llvm/llvm-project/pull/72804#issuecomment-1820321163)

Currently, we are switching to the LLVM compiler for an embedded system. In some 
scenarios, the -e option is used to specify the program entry address. GCC 
supports both `-e xxx` and `-exxx`, so it may not be a good idea to support 
only `-e xxx`. So far, I've only encountered this particular -export-dynamic 
scenario, where the program got a native exception due to the wrong entry 
address. Looking back at the logs, I found this warning: "cannot find entry 
symbol xport-dynamic; defaulting to ". 
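
The misparse described above can be illustrated with a minimal sketch. This is not Clang's option parser: `entrySymbolFor` is a hypothetical helper that only models an `-e` option accepting its value either joined (`-exxx`) or separate (`-e xxx`), with no special case for `-export-dynamic`.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not a Clang API): returns the entry symbol that a
// naive "-e" option would extract from one argument, or nullopt if the
// argument is not an entry option or carries no value.
std::optional<std::string> entrySymbolFor(std::string_view arg,
                                          std::string_view nextArg = {}) {
  if (arg.substr(0, 2) != "-e")
    return std::nullopt; // not an entry option at all
  if (arg.size() > 2)
    return std::string(arg.substr(2)); // joined form: -exxx
  if (!nextArg.empty())
    return std::string(nextArg); // separate form: -e xxx
  return std::nullopt; // bare "-e" with no value
}
```

Under these assumptions, `entrySymbolFor("-export-dynamic")` yields `xport-dynamic`, which matches the symbol name in the linker warning quoted above.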

https://github.com/llvm/llvm-project/pull/72781
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [RISCV] Use Float type instead of Half type for Fixed RVV vector type mangling (PR #73091)

2023-11-21 Thread Craig Topper via cfe-commits


https://github.com/topperc approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/73091

[clang] [RISCV] Use Float type instead of Half type for Fixed RVV vector type mangling (PR #73091)

2023-11-21 Thread Jianjian Guan via cfe-commits


jacquesguan wrote:

> test?

Done, added a test case.

https://github.com/llvm/llvm-project/pull/73091

[clang] [RISCV] Use Float type instead of Half type for Fixed RVV vector type mangling (PR #73091)

2023-11-21 Thread Jianjian Guan via cfe-commits


https://github.com/jacquesguan updated 
https://github.com/llvm/llvm-project/pull/73091

>From 5712baa1f74acec9a482d110e0a6bf9638006409 Mon Sep 17 00:00:00 2001
From: Jianjian GUAN 
Date: Wed, 22 Nov 2023 14:34:49 +0800
Subject: [PATCH] [RISCV] Use Float type instead of Half type for Fixed RVV
 vector type mangling

---
 clang/lib/AST/ItaniumMangle.cpp   |  2 +-
 .../riscv-mangle-rvv-fixed-vectors.cpp| 85 +++
 2 files changed, 71 insertions(+), 16 deletions(-)

diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp
index 2a62ac0175afb72..b1678479888eb77 100644
--- a/clang/lib/AST/ItaniumMangle.cpp
+++ b/clang/lib/AST/ItaniumMangle.cpp
@@ -4029,7 +4029,7 @@ void CXXNameMangler::mangleRISCVFixedRVVVectorType(const 
VectorType *T) {
   case BuiltinType::ULong:
 TypeNameOS << "uint64";
 break;
-  case BuiltinType::Half:
+  case BuiltinType::Float16:
 TypeNameOS << "float16";
 break;
   case BuiltinType::Float:
diff --git a/clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp 
b/clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp
index 98fb27b704fd81d..32bd49f4ff725db 100644
--- a/clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp
+++ b/clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp
@@ -1,23 +1,23 @@
 // RUN: %clang_cc1 -triple riscv64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:  -target-feature +f -target-feature +d \
-// RUN:  -target-feature +zve64d -mvscale-min=1 -mvscale-max=1 \
-// RUN:  | FileCheck %s --check-prefix=CHECK-64
+// RUN:  -target-feature +f -target-feature +d -target-feature +zfh \
+// RUN:  -target-feature +zve64d -target-feature +zvfh -mvscale-min=1 \
+// RUN:   -mvscale-max=1 | FileCheck %s --check-prefix=CHECK-64
 // RUN: %clang_cc1 -triple riscv64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:  -target-feature +f -target-feature +d \
-// RUN:  -target-feature +zve64d -mvscale-min=2 -mvscale-max=2 \
-// RUN:  | FileCheck %s --check-prefix=CHECK-128
+// RUN:  -target-feature +f -target-feature +d -target-feature +zfh \
+// RUN:  -target-feature +zve64d -target-feature +zvfh -mvscale-min=2 \
+// RUN:  -mvscale-max=2 | FileCheck %s --check-prefix=CHECK-128
 // RUN: %clang_cc1 -triple riscv64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:  -target-feature +f -target-feature +d \
-// RUN:  -target-feature +zve64d -mvscale-min=4 -mvscale-max=4 \
-// RUN:  | FileCheck %s --check-prefix=CHECK-256
+// RUN:  -target-feature +f -target-feature +d -target-feature +zfh \
+// RUN:  -target-feature +zve64d -target-feature +zvfh -mvscale-min=4 \
+// RUN:  -mvscale-max=4 | FileCheck %s --check-prefix=CHECK-256
 // RUN: %clang_cc1 -triple riscv64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:  -target-feature +f -target-feature +d \
-// RUN:  -target-feature +zve64d -mvscale-min=8 -mvscale-max=8 \
-// RUN:  | FileCheck %s --check-prefix=CHECK-512
+// RUN:  -target-feature +f -target-feature +d -target-feature +zfh \
+// RUN:  -target-feature +zve64d -target-feature +zvfh -mvscale-min=8 \
+// RUN:  -mvscale-max=8 | FileCheck %s --check-prefix=CHECK-512
 // RUN: %clang_cc1 -triple riscv64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:  -target-feature +f -target-feature +d \
-// RUN:  -target-feature +zve64d -mvscale-min=16 -mvscale-max=16 \
-// RUN:  | FileCheck %s --check-prefix=CHECK-1024
+// RUN:  -target-feature +f -target-feature +d -target-feature +zfh \
+// RUN:  -target-feature +zve64d -target-feature +zvfh -mvscale-min=16 \
+// RUN:  -mvscale-max=16 | FileCheck %s --check-prefix=CHECK-1024
 
 typedef __rvv_int8mf8_t vint8mf8_t;
 typedef __rvv_uint8mf8_t vuint8mf8_t;
@@ -26,6 +26,7 @@ typedef __rvv_int8mf4_t vint8mf4_t;
 typedef __rvv_uint8mf4_t vuint8mf4_t;
 typedef __rvv_int16mf4_t vint16mf4_t;
 typedef __rvv_uint16mf4_t vuint16mf4_t;
+typedef __rvv_float16mf4_t vfloat16mf4_t;
 
 typedef __rvv_int8mf2_t vint8mf2_t;
 typedef __rvv_uint8mf2_t vuint8mf2_t;
@@ -33,6 +34,7 @@ typedef __rvv_int16mf2_t vint16mf2_t;
 typedef __rvv_uint16mf2_t vuint16mf2_t;
 typedef __rvv_int32mf2_t vint32mf2_t;
 typedef __rvv_uint32mf2_t vuint32mf2_t;
+typedef __rvv_float16mf2_t vfloat16mf2_t;
 typedef __rvv_float32mf2_t vfloat32mf2_t;
 
 typedef __rvv_int8m1_t vint8m1_t;
@@ -43,6 +45,7 @@ typedef __rvv_int32m1_t vint32m1_t;
 typedef __rvv_uint32m1_t vuint32m1_t;
 typedef __rvv_int64m1_t vint64m1_t;
 typedef __rvv_uint64m1_t vuint64m1_t;
+typedef __rvv_float16m1_t vfloat16m1_t;
 typedef __rvv_float32m1_t vfloat32m1_t;
 typedef __rvv_float64m1_t vfloat64m1_t;
 
@@ -54,6 +57,7 @@ typedef __rvv_int32m2_t vint32m2_t;
 typedef __rvv_uint32m2_t vuint32m2_t;
 typedef __rvv_int64m2_t vint64m2_t;
 typedef __rvv_uint64m2_t vuint64m2_t;
+typedef __rvv_float16m2_t vfloat16m2_t;
 typedef __rvv_float32m2_t vfloat32m2_t;
 typedef __rvv_float64m2_t vfloat64m2_t;
 
@@ -65,6 +69,7 @@ typedef __rvv_int32m4_t vint32m4_t;
 typedef __rvv_uint32m4_t vuint32m4_t;
 typedef __rvv_int64m4_t vint64m4_t;
 typedef __rvv_uint64m4_t vuint64m4_t;
+typedef __rv

[clang] [RISCV] Use Float type instead of Half type for Fixed RVV vector type mangling (PR #73091)

2023-11-21 Thread Craig Topper via cfe-commits


topperc wrote:

test?

https://github.com/llvm/llvm-project/pull/73091

[clang] [RISCV] Use Float type instead of Half type for Fixed RVV vector type mangling (PR #73091)

2023-11-21 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Jianjian Guan (jacquesguan)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/73091.diff


1 Files Affected:

- (modified) clang/lib/AST/ItaniumMangle.cpp (+1-1) 


```diff
diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp
index 2a62ac0175afb72..b1678479888eb77 100644
--- a/clang/lib/AST/ItaniumMangle.cpp
+++ b/clang/lib/AST/ItaniumMangle.cpp
@@ -4029,7 +4029,7 @@ void CXXNameMangler::mangleRISCVFixedRVVVectorType(const VectorType *T) {
   case BuiltinType::ULong:
     TypeNameOS << "uint64";
     break;
-  case BuiltinType::Half:
+  case BuiltinType::Float16:
     TypeNameOS << "float16";
     break;
   case BuiltinType::Float:

```




https://github.com/llvm/llvm-project/pull/73091

[clang] [RISCV] Use Float type instead of Half type for Fixed RVV vector type mangling (PR #73091)

2023-11-21 Thread Jianjian Guan via cfe-commits


https://github.com/jacquesguan created 
https://github.com/llvm/llvm-project/pull/73091

None

>From f785a0a175f509dbc72e11c13eb5eb6f6eaebb43 Mon Sep 17 00:00:00 2001
From: Jianjian GUAN 
Date: Wed, 22 Nov 2023 14:34:49 +0800
Subject: [PATCH] [RISCV] Use Float type instead of Half type for Fixed RVV
 vector type mangling

---
 clang/lib/AST/ItaniumMangle.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp
index 2a62ac0175afb72..b1678479888eb77 100644
--- a/clang/lib/AST/ItaniumMangle.cpp
+++ b/clang/lib/AST/ItaniumMangle.cpp
@@ -4029,7 +4029,7 @@ void CXXNameMangler::mangleRISCVFixedRVVVectorType(const 
VectorType *T) {
   case BuiltinType::ULong:
 TypeNameOS << "uint64";
 break;
-  case BuiltinType::Half:
+  case BuiltinType::Float16:
 TypeNameOS << "float16";
 break;
   case BuiltinType::Float:


[clang] [clang][Sema] Add -Wswitch-default warning option (PR #73077)

2023-11-21 Thread dong jianqiang via cfe-commits

dongjianqiang2 wrote:

> There is one clang-tidy check (bugprone-switch-missing-default-case) also for 
> this feature.

Thank you for your reply. It may be more convenient and straightforward if 
this can be identified at compile time. 
On the other hand, it is more compatible with GCC's behavior. : )
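
For illustration, the kind of code such a warning targets can be sketched as follows. The sketch assumes the semantics of GCC's `-Wswitch-default`, which reports a `switch` statement that has no `default` label; the `Color` enum and `rank` function are made-up names.

```cpp
// Made-up example types; only the shape of the switch matters here.
enum class Color { Red, Green, Blue };

// Every enumerator is handled, so a plain -Wswitch check has nothing to
// report. There is still no default label, which is the situation a
// -Wswitch-default style warning points out.
int rank(Color c) {
  switch (c) {
  case Color::Red:
    return 0;
  case Color::Green:
    return 1;
  case Color::Blue:
    return 2;
  }
  // Reached only when c holds a value outside the named enumerators.
  return -1;
}
```

Adding a `default:` label that returns `-1` would silence the warning while keeping the same behavior for out-of-range values.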

https://github.com/llvm/llvm-project/pull/73077

[clang] [Clang][OpenMP] Emit unsupported directive error (PR #70233)

2023-11-21 Thread Raymond Chang via cfe-commits


rkchang wrote:

Added a test case. Thanks for the pointer! Here's the result:

```
~/dev/fork-llvm-project omp_dispatch_unimpl
❯ llvm-lit -vv clang/test/OpenMP/dispatch_unsupported.c
llvm-lit: /home/rkchang/dev/fork-llvm-project/llvm/utils/lit/lit/llvm/config.py:488: note: using clang: /home/rkchang/dev/fork-llvm-project/build/bin/clang
-- Testing: 1 tests, 1 workers --
PASS: Clang :: OpenMP/dispatch_unsupported.c (1 of 1)

Testing Time: 0.07s

Total Discovered Tests: 1
  Passed: 1 (100.00%)
```

https://github.com/llvm/llvm-project/pull/70233

[clang] [Clang][OpenMP] Emit unsupported directive error (PR #70233)

2023-11-21 Thread Raymond Chang via cfe-commits


https://github.com/rkchang updated 
https://github.com/llvm/llvm-project/pull/70233

>From 72c056b825963d0de1dcf3fe3a6de922098d0ad9 Mon Sep 17 00:00:00 2001
From: Raymond Chang 
Date: Thu, 12 Oct 2023 01:51:03 -0400
Subject: [PATCH 1/2] Emit unsupported directive error

---
 clang/lib/CodeGen/CGStmt.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index c719df1bfa05036..4eeaf9645a3eab8 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -407,7 +407,7 @@ void CodeGenFunction::EmitStmt(const Stmt *S, ArrayRef<const Attr *> Attrs) {
     EmitOMPInteropDirective(cast<OMPInteropDirective>(*S));
     break;
   case Stmt::OMPDispatchDirectiveClass:
-    llvm_unreachable("Dispatch directive not supported yet.");
+    CGM.ErrorUnsupported(S, "OpenMP dispatch directive");
     break;
   case Stmt::OMPScopeDirectiveClass:
     llvm_unreachable("scope not supported with FE outlining");

>From 65d61090cf6f66a8b6a236e24a775806b5339df9 Mon Sep 17 00:00:00 2001
From: Raymond Chang 
Date: Wed, 22 Nov 2023 01:03:31 -0500
Subject: [PATCH 2/2] Add test case

---
 clang/test/OpenMP/dispatch_unsupported.c | 7 +++
 1 file changed, 7 insertions(+)
 create mode 100644 clang/test/OpenMP/dispatch_unsupported.c

diff --git a/clang/test/OpenMP/dispatch_unsupported.c 
b/clang/test/OpenMP/dispatch_unsupported.c
new file mode 100644
index 000..ff0815dda6a3fd9
--- /dev/null
+++ b/clang/test/OpenMP/dispatch_unsupported.c
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -emit-llvm -fopenmp -disable-llvm-passes %s -verify=expected
+
+// expected-error@+2 {{cannot compile this OpenMP dispatch directive yet}} 
+void a(){
+#pragma omp dispatch
+a();
+}
\ No newline at end of file


[clang] Reland "[clang][Sema] Use original template pattern when declaring implicit deduction guides for nested template classes" (PR #69676)

2023-11-21 Thread via cfe-commits


antangelo wrote:

Thank you for the reproducers! I have posted a candidate reland that addresses 
both of these issues here: https://github.com/llvm/llvm-project/pull/73087

https://github.com/llvm/llvm-project/pull/69676

[clang] Reland "[clang][Sema] Use original template pattern when declaring implicit deduction guides for nested template classes" (PR #73087)

2023-11-21 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (antangelo)


Changes

Reland of f418319730341e9d41ce8ead6fbfe5603c343985 with proper handling of 
template constructors

When a nested template is instantiated, the template pattern of the inner class 
is not copied into the outer class
ClassTemplateSpecializationDecl. The specialization contains a 
ClassTemplateDecl with an empty record that points to the original template 
pattern instead.

As a result, when looking up the constructors of the inner class, no results 
are returned. This patch finds the original template pattern and uses that for 
the lookup instead.

Based on CWG2471 we must also substitute the known outer template arguments 
when creating deduction guides for the inner class.

Changes from last iteration:

1. In template constructors, arguments are first rewritten to depth - 1 
relative to the constructor as compared to depth 0 originally. These arguments 
are needed for substitution into constraint expressions.
2. Outer arguments are then applied with the template instantiator to produce a 
template argument at depth zero for use in the deduction guide. This 
substitution does not evaluate constraints, which preserves constraint 
arguments at the correct depth for later evaluation.
3. Tests are added that cover template constructors within nested deduction 
guides for all special substitution cases.
4. Computation of the template pattern and outer instantiation arguments are 
pulled into the constructor of `ConvertConstructorToDeductionGuideTransform`.
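
The situation described above can be sketched with a small example. The names `Outer`, `Inner`, and `makeInner` are made up for illustration, and the sketch assumes a compiler that generates implicit deduction guides for nested class templates (C++17 or later).

```cpp
#include <type_traits>

// Made-up nested class template: the inner constructor mentions both the
// outer parameter T and the inner parameter U.
template <typename T>
struct Outer {
  template <typename U>
  struct Inner {
    Inner(T first, U second) : first(first), second(second) {}
    T first;
    U second;
  };
};

// T is already known to be int from Outer<int>, so only U has to be deduced
// from the second constructor argument.
inline auto makeInner() {
  Outer<int>::Inner inner(1, 2.5);
  return inner;
}
```

Here the implicit deduction guide has to come from the constructor in the original `Inner` pattern, with `T = int` substituted first, so the deduced type is expected to be `Outer<int>::Inner<double>`.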

---
Full diff: https://github.com/llvm/llvm-project/pull/73087.diff


4 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+5) 
- (modified) clang/lib/Sema/SemaTemplate.cpp (+60-6) 
- (modified) clang/test/SemaTemplate/nested-deduction-guides.cpp (+5) 
- (added) clang/test/SemaTemplate/nested-implicit-deduction-guides.cpp (+60) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 157afd9e8629152..ad5213aa30b20e9 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -734,6 +734,11 @@ Bug Fixes to C++ Support
   declaration definition. Fixes:
   (`#61763 `_)
 
+- Fix a bug where implicit deduction guides are not correctly generated for 
nested template
+  classes. Fixes:
+  (`#46200 `_)
+  (`#57812 `_)
+
 - Diagnose use of a variable-length array in a coroutine. The design of
   coroutines is such that it is not possible to support VLA use. Fixes:
   (`#65858 `_)
diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp
index c188dd34014a4b3..34d7b8c731e9076 100644
--- a/clang/lib/Sema/SemaTemplate.cpp
+++ b/clang/lib/Sema/SemaTemplate.cpp
@@ -2250,10 +2250,24 @@ class ExtractTypeForDeductionGuide
 struct ConvertConstructorToDeductionGuideTransform {
   ConvertConstructorToDeductionGuideTransform(Sema &S,
   ClassTemplateDecl *Template)
-  : SemaRef(S), Template(Template) {}
+  : SemaRef(S), Template(Template) {
+// If the template is nested, then we need to use the original
+// pattern to iterate over the constructors.
+ClassTemplateDecl *Pattern = Template;
+while (Pattern->getInstantiatedFromMemberTemplate()) {
+  if (Pattern->isMemberSpecialization())
+break;
+  Pattern = Pattern->getInstantiatedFromMemberTemplate();
+  NestedPattern = Pattern;
+}
+
+if (NestedPattern)
+  OuterInstantiationArgs = SemaRef.getTemplateInstantiationArgs(Template);
+  }
 
   Sema &SemaRef;
   ClassTemplateDecl *Template;
+  ClassTemplateDecl *NestedPattern = nullptr;
 
   DeclContext *DC = Template->getDeclContext();
   CXXRecordDecl *Primary = Template->getTemplatedDecl();
@@ -2266,6 +2280,10 @@ struct ConvertConstructorToDeductionGuideTransform {
   // depth-0 template parameters.
   unsigned Depth1IndexAdjustment = Template->getTemplateParameters()->size();
 
+  // Instantiation arguments for the outermost depth-1 templates
+  // when the template is nested
+  MultiLevelTemplateArgumentList OuterInstantiationArgs;
+
   /// Transform a constructor declaration into a deduction guide.
   NamedDecl *transformConstructor(FunctionTemplateDecl *FTD,
   CXXConstructorDecl *CD) {
@@ -2284,21 +2302,43 @@ struct ConvertConstructorToDeductionGuideTransform {
 if (FTD) {
   TemplateParameterList *InnerParams = FTD->getTemplateParameters();
   SmallVector AllParams;
+  SmallVector Depth1Args;
   AllParams.reserve(TemplateParams->size() + InnerParams->size());
   AllParams.insert(AllParams.begin(),
TemplateParams->begin(), TemplateParams->end());
   SubstArgs.reserve(InnerParams->size());
+  Depth1Args.reserve(InnerParams->size())

[clang] Reland "[clang][Sema] Use original template pattern when declaring implicit deduction guides for nested template classes" (PR #73087)

2023-11-21 Thread via cfe-commits


https://github.com/antangelo created 
https://github.com/llvm/llvm-project/pull/73087

Reland of f418319730341e9d41ce8ead6fbfe5603c343985 with proper handling of 
template constructors

When a nested template is instantiated, the template pattern of the inner class 
is not copied into the outer class
ClassTemplateSpecializationDecl. The specialization contains a 
ClassTemplateDecl with an empty record that points to the original template 
pattern instead.

As a result, when looking up the constructors of the inner class, no results 
are returned. This patch finds the original template pattern and uses that for 
the lookup instead.

Based on CWG2471 we must also substitute the known outer template arguments 
when creating deduction guides for the inner class.

Changes from last iteration:

1. In template constructors, arguments are first rewritten to depth - 1 
relative to the constructor as compared to depth 0 originally. These arguments 
are needed for substitution into constraint expressions.
2. Outer arguments are then applied with the template instantiator to produce a 
template argument at depth zero for use in the deduction guide. This 
substitution does not evaluate constraints, which preserves constraint 
arguments at the correct depth for later evaluation.
3. Tests are added that cover template constructors within nested deduction 
guides for all special substitution cases.
4. Computation of the template pattern and outer instantiation arguments are 
pulled into the constructor of `ConvertConstructorToDeductionGuideTransform`.

>From 37166cd7c7b2a0bb0ad5fe917856eacee2f69ff4 Mon Sep 17 00:00:00 2001
From: Antonio Abbatangelo 
Date: Wed, 1 Nov 2023 18:12:58 -0400
Subject: [PATCH] Reland "[clang][Sema] Use original template pattern when
 declaring implicit deduction guides for nested template classes"

Reland of f418319730341e9d41ce8ead6fbfe5603c343985 with proper handling
of FTDs
---
 clang/docs/ReleaseNotes.rst   |  5 ++
 clang/lib/Sema/SemaTemplate.cpp   | 66 +--
 .../SemaTemplate/nested-deduction-guides.cpp  |  5 ++
 .../nested-implicit-deduction-guides.cpp  | 60 +
 4 files changed, 130 insertions(+), 6 deletions(-)
 create mode 100644 clang/test/SemaTemplate/nested-implicit-deduction-guides.cpp

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 157afd9e8629152..ad5213aa30b20e9 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -734,6 +734,11 @@ Bug Fixes to C++ Support
   declaration definition. Fixes:
   (`#61763 `_)
 
+- Fix a bug where implicit deduction guides are not correctly generated for 
nested template
+  classes. Fixes:
+  (`#46200 `_)
+  (`#57812 `_)
+
 - Diagnose use of a variable-length array in a coroutine. The design of
   coroutines is such that it is not possible to support VLA use. Fixes:
   (`#65858 `_)
diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp
index c188dd34014a4b3..34d7b8c731e9076 100644
--- a/clang/lib/Sema/SemaTemplate.cpp
+++ b/clang/lib/Sema/SemaTemplate.cpp
@@ -2250,10 +2250,24 @@ class ExtractTypeForDeductionGuide
 struct ConvertConstructorToDeductionGuideTransform {
   ConvertConstructorToDeductionGuideTransform(Sema &S,
   ClassTemplateDecl *Template)
-  : SemaRef(S), Template(Template) {}
+  : SemaRef(S), Template(Template) {
+// If the template is nested, then we need to use the original
+// pattern to iterate over the constructors.
+ClassTemplateDecl *Pattern = Template;
+while (Pattern->getInstantiatedFromMemberTemplate()) {
+  if (Pattern->isMemberSpecialization())
+break;
+  Pattern = Pattern->getInstantiatedFromMemberTemplate();
+  NestedPattern = Pattern;
+}
+
+if (NestedPattern)
+  OuterInstantiationArgs = SemaRef.getTemplateInstantiationArgs(Template);
+  }
 
   Sema &SemaRef;
   ClassTemplateDecl *Template;
+  ClassTemplateDecl *NestedPattern = nullptr;
 
   DeclContext *DC = Template->getDeclContext();
   CXXRecordDecl *Primary = Template->getTemplatedDecl();
@@ -2266,6 +2280,10 @@ struct ConvertConstructorToDeductionGuideTransform {
   // depth-0 template parameters.
   unsigned Depth1IndexAdjustment = Template->getTemplateParameters()->size();
 
+  // Instantiation arguments for the outermost depth-1 templates
+  // when the template is nested
+  MultiLevelTemplateArgumentList OuterInstantiationArgs;
+
   /// Transform a constructor declaration into a deduction guide.
   NamedDecl *transformConstructor(FunctionTemplateDecl *FTD,
   CXXConstructorDecl *CD) {
@@ -2284,21 +2302,43 @@ struct ConvertConstructorToDeductionGuideTransform {
 if (FTD) {
   TemplateParame

[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 closed 
https://github.com/llvm/llvm-project/pull/73073

[clang] 1fad78b - [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (#73073)

2023-11-21 Thread via cfe-commits


Author: Yuxuan Chen
Date: 2023-11-21T21:21:27-08:00
New Revision: 1fad78b123d20db675d339053e4265aceb07c4af

URL: 
https://github.com/llvm/llvm-project/commit/1fad78b123d20db675d339053e4265aceb07c4af
DIFF: 
https://github.com/llvm/llvm-project/commit/1fad78b123d20db675d339053e4265aceb07c4af.diff

LOG: [Clang][Coroutines] Properly emit EH code for initial suspend 
`await_resume`  (#73073)

This change aims to fix an ICE in issue
https://github.com/llvm/llvm-project/issues/63803

The crash happens in `ExitCXXTryStmt` because `EmitAnyExpr()` adds
additional cleanup to the `EHScopeStack`. This messes up the assumption
in `ExitCXXTryStmt` that the top of the stack should be a
`EHCatchScope`.

However, since we never read a value returned from `await_resume()` of
an init suspend, we can skip the part that builds this `RValue`.

The code here may not be in the best shape. There's another bug that
`memberCallExpressionCanThrow` doesn't work on the current Expr due to
type mismatch. I am preparing a separate PR to address it plus some
refactoring might be beneficial.
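
The stack invariant described in this log can be pictured with a toy model. This is only a sketch: `ToyScopeStack` and its members are hypothetical names and do not mirror Clang's `EHScopeStack` API. It only shows why a cleanup left above the catch scope breaks the "catch scope on top" assumption, and why popping that cleanup first restores it.

```cpp
#include <cassert>
#include <vector>

// Toy model only: hypothetical names, not Clang's EHScopeStack.
enum class ScopeKind { Catch, Cleanup };

class ToyScopeStack {
public:
  void enterTry() { scopes.push_back(ScopeKind::Catch); }
  void pushCleanup() { scopes.push_back(ScopeKind::Cleanup); }
  void popCleanup() {
    assert(!scopes.empty() && scopes.back() == ScopeKind::Cleanup);
    scopes.pop_back();
  }
  // Exiting the try is only valid while the catch scope is on top.
  bool canExitTry() const {
    return !scopes.empty() && scopes.back() == ScopeKind::Catch;
  }

private:
  std::vector<ScopeKind> scopes;
};

// Models the old ordering: the cleanup for the returned temporary is still
// on the stack when the try is exited.
bool oldOrderingIsValid() {
  ToyScopeStack stack;
  stack.enterTry();
  stack.pushCleanup();
  return stack.canExitTry();
}

// Models the new ordering: the body is emitted as a statement, so its
// cleanup is popped before the try is exited.
bool newOrderingIsValid() {
  ToyScopeStack stack;
  stack.enterTry();
  stack.pushCleanup();
  stack.popCleanup();
  return stack.canExitTry();
}
```

In this model `oldOrderingIsValid()` is false and `newOrderingIsValid()` is true, which corresponds to the crash and the fix described above.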

Added: 
clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp

Modified: 
clang/docs/ReleaseNotes.rst
clang/lib/CodeGen/CGCoroutine.cpp

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 157afd9e8629152..b65106b9106d4d7 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -610,6 +610,9 @@ Bug Fixes in This Version
   inside a lambda. (`#61460 
`_)
 - Fix crash during instantiation of some class template specializations within 
class
   templates. Fixes (`#70375 
`_)
+- Fix crash during code generation of C++ coroutine initial suspend when the 
return
+  type of await_resume is not trivially destructible.
+  Fixes (`#63803 `_)
 
 Bug Fixes to Compiler Builtins
 ^^

diff  --git a/clang/lib/CodeGen/CGCoroutine.cpp 
b/clang/lib/CodeGen/CGCoroutine.cpp
index 7e449d5af3423cf..aaf122c0f83bc47 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,6 +245,15 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
  FPOptionsOverride(), Loc, Loc);
 TryStmt = CXXTryStmt::Create(CGF.getContext(), Loc, TryBody, Catch);
 CGF.EnterCXXTryStmt(*TryStmt);
+CGF.EmitStmt(TryBody);
+// We don't use EmitCXXTryStmt here. We need to store to ResumeEHVar that
+// doesn't exist in the body.
+Builder.CreateFlagStore(false, Coro.ResumeEHVar);
+CGF.ExitCXXTryStmt(*TryStmt);
+LValueOrRValue Res;
+// We are not supposed to obtain the value from init suspend 
await_resume().
+Res.RV = RValue::getIgnored();
+return Res;
   }
 
   LValueOrRValue Res;
@@ -253,11 +262,6 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   else
 Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
 
-  if (TryStmt) {
-Builder.CreateFlagStore(false, Coro.ResumeEHVar);
-CGF.ExitCXXTryStmt(*TryStmt);
-  }
-
   return Res;
 }
 

diff  --git 
a/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp 
b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
new file mode 100644
index 000..c4b8da327f5c140
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
@@ -0,0 +1,46 @@
+// RUN: %clang_cc1 -std=c++20 -triple=x86_64-- -emit-llvm -fcxx-exceptions \
+// RUN:-disable-llvm-passes %s -o - | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+struct NontrivialType {
+  ~NontrivialType() {}
+};
+
+struct Task {
+struct promise_type;
+using handle_type = std::coroutine_handle<promise_type>;
+
+struct initial_suspend_awaiter {
+bool await_ready() {
+return false;
+}
+
+void await_suspend(handle_type h) {}
+
+NontrivialType await_resume() { return {}; }
+};
+
+struct promise_type {
+void return_void() {}
+void unhandled_exception() {}
+initial_suspend_awaiter initial_suspend() { return {}; }
+std::suspend_never final_suspend() noexcept { return {}; }
+Task get_return_object() {
+return Task{handle_type::from_promise(*this)};
+}
+};
+
+handle_type handler;
+};
+
+Task coro_create() {
+co_return;
+}
+
+// CHECK-LABEL: define{{.*}} ptr @_Z11coro_createv(
+// CHECK: init.ready:
+// CHECK-NEXT: store i1 true, ptr {{.*}}
+// CHECK-NEXT: call void @_ZN4Task23initial_suspend_awaiter12await_resumeEv(
+// CHECK-NEXT: call void @_ZN14NontrivialTypeD1Ev(
+// CHECK-NEXT: store i1 false, ptr {{.*}}




[clang] [clang][Sema] Add -Wswitch-default warning option (PR #73077)

2023-11-21 Thread Shivam Gupta via cfe-commits


xgupta wrote:

There is one clang-tidy check (bugprone-switch-missing-default-case) also for 
this feature.

https://github.com/llvm/llvm-project/pull/73077

[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 updated 
https://github.com/llvm/llvm-project/pull/73073

>From e7d1ae077d7d301094b663166cc0c14c706d2110 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:06:31 -0800
Subject: [PATCH 1/4] [Clang][coro] Fix crash on emitting init suspend if the
 return type of await_resume() is not trivially destructible

---
 clang/lib/CodeGen/CGCoroutine.cpp | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/clang/lib/CodeGen/CGCoroutine.cpp 
b/clang/lib/CodeGen/CGCoroutine.cpp
index 7e449d5af3423cf..aaf122c0f83bc47 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,6 +245,15 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
  FPOptionsOverride(), Loc, Loc);
 TryStmt = CXXTryStmt::Create(CGF.getContext(), Loc, TryBody, Catch);
 CGF.EnterCXXTryStmt(*TryStmt);
+CGF.EmitStmt(TryBody);
+// We don't use EmitCXXTryStmt here. We need to store to ResumeEHVar that
+// doesn't exist in the body.
+Builder.CreateFlagStore(false, Coro.ResumeEHVar);
+CGF.ExitCXXTryStmt(*TryStmt);
+LValueOrRValue Res;
+// We are not supposed to obtain the value from init suspend 
await_resume().
+Res.RV = RValue::getIgnored();
+return Res;
   }
 
   LValueOrRValue Res;
@@ -253,11 +262,6 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   else
 Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
 
-  if (TryStmt) {
-Builder.CreateFlagStore(false, Coro.ResumeEHVar);
-CGF.ExitCXXTryStmt(*TryStmt);
-  }
-
   return Res;
 }
 

>From e11b48867b2f02a095d043d57cf7830702f47ed6 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:35:10 -0800
Subject: [PATCH 2/4] add test case for the previously crashing case

---
 .../coro-init-await-nontrivial-return.cpp | 46 +++
 1 file changed, 46 insertions(+)
 create mode 100644 
clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp

diff --git a/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp 
b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
new file mode 100644
index 000..c4b8da327f5c140
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
@@ -0,0 +1,46 @@
+// RUN: %clang_cc1 -std=c++20 -triple=x86_64-- -emit-llvm -fcxx-exceptions \
+// RUN:-disable-llvm-passes %s -o - | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+struct NontrivialType {
+  ~NontrivialType() {}
+};
+
+struct Task {
+struct promise_type;
+using handle_type = std::coroutine_handle<promise_type>;
+
+struct initial_suspend_awaiter {
+bool await_ready() {
+return false;
+}
+
+void await_suspend(handle_type h) {}
+
+NontrivialType await_resume() { return {}; }
+};
+
+struct promise_type {
+void return_void() {}
+void unhandled_exception() {}
+initial_suspend_awaiter initial_suspend() { return {}; }
+std::suspend_never final_suspend() noexcept { return {}; }
+Task get_return_object() {
+return Task{handle_type::from_promise(*this)};
+}
+};
+
+handle_type handler;
+};
+
+Task coro_create() {
+co_return;
+}
+
+// CHECK-LABEL: define{{.*}} ptr @_Z11coro_createv(
+// CHECK: init.ready:
+// CHECK-NEXT: store i1 true, ptr {{.*}}
+// CHECK-NEXT: call void @_ZN4Task23initial_suspend_awaiter12await_resumeEv(
+// CHECK-NEXT: call void @_ZN14NontrivialTypeD1Ev(
+// CHECK-NEXT: store i1 false, ptr {{.*}}

>From c88474fa608219bf1c52fa09e91572dc85ddd1f8 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:52:34 -0800
Subject: [PATCH 3/4] Add release note entry

---
 clang/docs/ReleaseNotes.rst   | 3 +++
 clang/lib/CodeGen/CGCoroutine.cpp | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 157afd9e8629152..b65106b9106d4d7 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -610,6 +610,9 @@ Bug Fixes in This Version
   inside a lambda. (`#61460 
`_)
 - Fix crash during instantiation of some class template specializations within 
class
   templates. Fixes (`#70375 
`_)
+- Fix crash during code generation of C++ coroutine initial suspend when the return
+  type of await_resume is not trivially destructible.
+  Fixes (`#63803 <https://github.com/llvm/llvm-project/issues/63803>`_)
 
 Bug Fixes to Compiler Builtins
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/lib/CodeGen/CGCoroutine.cpp 
b/clang/lib/CodeGen/CGCoroutine.cpp
index aaf122c0f83bc47..ec30c974253d95f 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -130,6 +130,8 @@ static SmallString<32>

[clang] [clang][Sema] Add -Wswitch-default warning option (PR #73077)

2023-11-21 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang

Author: dong jianqiang (dongjianqiang2)


Changes

Adds a warning issued by Clang's semantic analysis. The patch warns on 
`switch` statements that have no `default` branch.

This is the counterpart of GCC's `-Wswitch-default`.
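
As a rough illustration of the behavior described above, here is a minimal sketch modeled on the patch's own test. The function names `bumpWithoutDefault`, `bumpWithDefault`, and `checkBumps` are hypothetical and do not appear in the patch. The diagnostic is declared `DefaultIgnore`, so it is only expected when `-Wswitch-default` is passed explicitly.

```cpp
#include <cassert>

// Hypothetical example: this switch has no `default` label, so with
// -Wswitch-default the patch would report "switch missing default case".
int bumpWithoutDefault(int a) {
  switch (a) {
  case 1: a++; break;
  case 2: a += 2; break;
  }
  return a;
}

// Same logic with an explicit `default` label, which the patch does not
// warn about.
int bumpWithDefault(int a) {
  switch (a) {
  case 1: a++; break;
  case 2: a += 2; break;
  default: break;
  }
  return a;
}

// Both variants compute the same values; the warning concerns only the
// missing label, not the runtime behavior.
void checkBumps() {
  assert(bumpWithoutDefault(1) == bumpWithDefault(1));
  assert(bumpWithoutDefault(7) == bumpWithDefault(7));
}
```

Values that match no `case` simply fall through the switch unchanged in both functions.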

---
Full diff: https://github.com/llvm/llvm-project/pull/73077.diff


4 Files Affected:

- (modified) clang/include/clang/Basic/DiagnosticGroups.td (+1-1) 
- (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+2) 
- (modified) clang/lib/Sema/SemaStmt.cpp (+3) 
- (added) clang/test/Sema/switch-default.c (+17) 


```diff
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index ff028bbbf74261e..12b11527b30571a 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -632,7 +632,7 @@ def ShadowAll : DiagGroup<"shadow-all", [Shadow, ShadowFieldInConstructor,
 def Shorten64To32 : DiagGroup<"shorten-64-to-32">;
 def : DiagGroup<"sign-promo">;
 def SignCompare : DiagGroup<"sign-compare">;
-def : DiagGroup<"switch-default">;
+def SwitchDefault  : DiagGroup<"switch-default">;
 def : DiagGroup<"synth">;
 def SizeofArrayArgument : DiagGroup<"sizeof-array-argument">;
 def SizeofArrayDecay : DiagGroup<"sizeof-array-decay">;
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 990692c06d7d3a8..17c9627910bb6ce 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -10044,6 +10044,8 @@ def warn_missing_case : Warning<"%plural{"
   "3:enumeration values %1, %2, and %3 not handled in switch|"
   ":%0 enumeration values not handled in switch: %1, %2, %3...}0">,
   InGroup<Switch>;
+def warn_switch_default : Warning<"switch missing default case">,
+  InGroup<SwitchDefault>, DefaultIgnore;
 
 def warn_unannotated_fallthrough : Warning<
   "unannotated fall-through between switch labels">,
diff --git a/clang/lib/Sema/SemaStmt.cpp b/clang/lib/Sema/SemaStmt.cpp
index 2b45aa5dff7be7c..63348d27a8c94a1 100644
--- a/clang/lib/Sema/SemaStmt.cpp
+++ b/clang/lib/Sema/SemaStmt.cpp
@@ -1327,6 +1327,9 @@ Sema::ActOnFinishSwitchStmt(SourceLocation SwitchLoc, Stmt *Switch,
 }
   }
 
+  if (!TheDefaultStmt)
+Diag(SwitchLoc, diag::warn_switch_default);
+
   if (!HasDependentValue) {
 // If we don't have a default statement, check whether the
 // condition is constant.
diff --git a/clang/test/Sema/switch-default.c b/clang/test/Sema/switch-default.c
new file mode 100644
index 000..3f2e21693303378
--- /dev/null
+++ b/clang/test/Sema/switch-default.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -fsyntax-only -verify -Wswitch-default %s
+
+int f1(int a) {
+  switch (a) {// expected-warning {{switch missing default case}}
+case 1: a++; break;
+case 2: a += 2; break;
+  }
+  return a;
+}
+
+int f2(int a) {
+  switch (a) {// no-warning
+default:
+  ;
+  }
+  return a;
+}

```




https://github.com/llvm/llvm-project/pull/73077
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang][Sema] Add -Wswitch-default warning option (PR #73077)

2023-11-21 Thread dong jianqiang via cfe-commits


https://github.com/dongjianqiang2 created 
https://github.com/llvm/llvm-project/pull/73077

Adds a warning issued by Clang's semantic analysis. The patch warns on 
`switch` statements that have no `default` branch.

This is the counterpart of GCC's `-Wswitch-default`.

>From 7962e1ffc6bb5ab8873f391f5030199ba62c1345 Mon Sep 17 00:00:00 2001
From: dong jianqiang 
Date: Wed, 22 Nov 2023 11:06:00 +0800
Subject: [PATCH] [clang][Sema] Add -Wswitch-default warning option

Adds a warning, issued by the clang semantic analysis. The patch warns on switch
which don't have the default branch.

This is a counterpart of gcc's Wswitch-default.
---
 clang/include/clang/Basic/DiagnosticGroups.td   |  2 +-
 .../include/clang/Basic/DiagnosticSemaKinds.td  |  2 ++
 clang/lib/Sema/SemaStmt.cpp |  3 +++
 clang/test/Sema/switch-default.c| 17 +
 4 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/Sema/switch-default.c

diff --git a/clang/include/clang/Basic/DiagnosticGroups.td 
b/clang/include/clang/Basic/DiagnosticGroups.td
index ff028bbbf74261e..12b11527b30571a 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -632,7 +632,7 @@ def ShadowAll : DiagGroup<"shadow-all", [Shadow, 
ShadowFieldInConstructor,
 def Shorten64To32 : DiagGroup<"shorten-64-to-32">;
 def : DiagGroup<"sign-promo">;
 def SignCompare : DiagGroup<"sign-compare">;
-def : DiagGroup<"switch-default">;
+def SwitchDefault  : DiagGroup<"switch-default">;
 def : DiagGroup<"synth">;
 def SizeofArrayArgument : DiagGroup<"sizeof-array-argument">;
 def SizeofArrayDecay : DiagGroup<"sizeof-array-decay">;
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td 
b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 990692c06d7d3a8..17c9627910bb6ce 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -10044,6 +10044,8 @@ def warn_missing_case : Warning<"%plural{"
   "3:enumeration values %1, %2, and %3 not handled in switch|"
   ":%0 enumeration values not handled in switch: %1, %2, %3...}0">,
   InGroup<Switch>;
+def warn_switch_default : Warning<"switch missing default case">,
+  InGroup<SwitchDefault>, DefaultIgnore;
 
 def warn_unannotated_fallthrough : Warning<
   "unannotated fall-through between switch labels">,
diff --git a/clang/lib/Sema/SemaStmt.cpp b/clang/lib/Sema/SemaStmt.cpp
index 2b45aa5dff7be7c..63348d27a8c94a1 100644
--- a/clang/lib/Sema/SemaStmt.cpp
+++ b/clang/lib/Sema/SemaStmt.cpp
@@ -1327,6 +1327,9 @@ Sema::ActOnFinishSwitchStmt(SourceLocation SwitchLoc, 
Stmt *Switch,
 }
   }
 
+  if (!TheDefaultStmt)
+Diag(SwitchLoc, diag::warn_switch_default);
+
   if (!HasDependentValue) {
 // If we don't have a default statement, check whether the
 // condition is constant.
diff --git a/clang/test/Sema/switch-default.c b/clang/test/Sema/switch-default.c
new file mode 100644
index 000..3f2e21693303378
--- /dev/null
+++ b/clang/test/Sema/switch-default.c
@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -fsyntax-only -verify -Wswitch-default %s
+
+int f1(int a) {
+  switch (a) {// expected-warning {{switch missing default 
case}}
+case 1: a++; break;
+case 2: a += 2; break;
+  }
+  return a;
+}
+
+int f2(int a) {
+  switch (a) {// no-warning
+default:
+  ;
+  }
+  return a;
+}


[clang] [analyzer]:fix valistChecker false negative in windows platform (PR #72951)

2023-11-21 Thread via cfe-commits


mzyKi wrote:

> ..and it turns out that `ValistChecker` already has a function called 
> `ValistChecker::getVAListAsRegion` that handles different representations of 
> `va_list` under various systems. (I was not familiar with it because it was 
> added by @Xazax-hun after the initial commit https://reviews.llvm.org/D15227 
> that contains my contributions.)
> 
> I strongly suspect that extending this function is the "right way" to adapt 
> this checker to the windows environment that you're studying.

Thanks for your review. I wrote `isWinValistType` incorrectly, and I will try 
to find another solution.

https://github.com/llvm/llvm-project/pull/72951

[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 updated 
https://github.com/llvm/llvm-project/pull/73073

>From e7d1ae077d7d301094b663166cc0c14c706d2110 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:06:31 -0800
Subject: [PATCH 1/3] [Clang][coro] Fix crash on emitting init suspend if the
 return type of await_resume() is not trivially destructible

---
 clang/lib/CodeGen/CGCoroutine.cpp | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/clang/lib/CodeGen/CGCoroutine.cpp 
b/clang/lib/CodeGen/CGCoroutine.cpp
index 7e449d5af3423cf..aaf122c0f83bc47 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,6 +245,15 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
  FPOptionsOverride(), Loc, Loc);
 TryStmt = CXXTryStmt::Create(CGF.getContext(), Loc, TryBody, Catch);
 CGF.EnterCXXTryStmt(*TryStmt);
+CGF.EmitStmt(TryBody);
+// We don't use EmitCXXTryStmt here. We need to store to ResumeEHVar that
+// doesn't exist in the body.
+Builder.CreateFlagStore(false, Coro.ResumeEHVar);
+CGF.ExitCXXTryStmt(*TryStmt);
+LValueOrRValue Res;
+// We are not supposed to obtain the value from init suspend 
await_resume().
+Res.RV = RValue::getIgnored();
+return Res;
   }
 
   LValueOrRValue Res;
@@ -253,11 +262,6 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   else
 Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
 
-  if (TryStmt) {
-Builder.CreateFlagStore(false, Coro.ResumeEHVar);
-CGF.ExitCXXTryStmt(*TryStmt);
-  }
-
   return Res;
 }
 

>From e11b48867b2f02a095d043d57cf7830702f47ed6 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:35:10 -0800
Subject: [PATCH 2/3] add test case for the previously crashing case

---
 .../coro-init-await-nontrivial-return.cpp | 46 +++
 1 file changed, 46 insertions(+)
 create mode 100644 
clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp

diff --git a/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp 
b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
new file mode 100644
index 000..c4b8da327f5c140
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
@@ -0,0 +1,46 @@
+// RUN: %clang_cc1 -std=c++20 -triple=x86_64-- -emit-llvm -fcxx-exceptions \
+// RUN:-disable-llvm-passes %s -o - | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+struct NontrivialType {
+  ~NontrivialType() {}
+};
+
+struct Task {
+struct promise_type;
+using handle_type = std::coroutine_handle<promise_type>;
+
+struct initial_suspend_awaiter {
+bool await_ready() {
+return false;
+}
+
+void await_suspend(handle_type h) {}
+
+NontrivialType await_resume() { return {}; }
+};
+
+struct promise_type {
+void return_void() {}
+void unhandled_exception() {}
+initial_suspend_awaiter initial_suspend() { return {}; }
+std::suspend_never final_suspend() noexcept { return {}; }
+Task get_return_object() {
+return Task{handle_type::from_promise(*this)};
+}
+};
+
+handle_type handler;
+};
+
+Task coro_create() {
+co_return;
+}
+
+// CHECK-LABEL: define{{.*}} ptr @_Z11coro_createv(
+// CHECK: init.ready:
+// CHECK-NEXT: store i1 true, ptr {{.*}}
+// CHECK-NEXT: call void @_ZN4Task23initial_suspend_awaiter12await_resumeEv(
+// CHECK-NEXT: call void @_ZN14NontrivialTypeD1Ev(
+// CHECK-NEXT: store i1 false, ptr {{.*}}

>From c88474fa608219bf1c52fa09e91572dc85ddd1f8 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:52:34 -0800
Subject: [PATCH 3/3] Add release note entry

---
 clang/docs/ReleaseNotes.rst   | 3 +++
 clang/lib/CodeGen/CGCoroutine.cpp | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 157afd9e8629152..b65106b9106d4d7 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -610,6 +610,9 @@ Bug Fixes in This Version
   inside a lambda. (`#61460 
`_)
 - Fix crash during instantiation of some class template specializations within 
class
   templates. Fixes (`#70375 
`_)
+- Fix crash during code generation of C++ coroutine initial suspend when the return
+  type of await_resume is not trivially destructible.
+  Fixes (`#63803 <https://github.com/llvm/llvm-project/issues/63803>`_)
 
 Bug Fixes to Compiler Builtins
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/lib/CodeGen/CGCoroutine.cpp 
b/clang/lib/CodeGen/CGCoroutine.cpp
index aaf122c0f83bc47..ec30c974253d95f 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -130,6 +130,8 @@ static SmallString<32>

[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 updated 
https://github.com/llvm/llvm-project/pull/73073

>From e7d1ae077d7d301094b663166cc0c14c706d2110 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:06:31 -0800
Subject: [PATCH 1/3] [Clang][coro] Fix crash on emitting init suspend if the
 return type of await_resume() is not trivially destructible

---
 clang/lib/CodeGen/CGCoroutine.cpp | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/clang/lib/CodeGen/CGCoroutine.cpp 
b/clang/lib/CodeGen/CGCoroutine.cpp
index 7e449d5af3423cf..aaf122c0f83bc47 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,6 +245,15 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
  FPOptionsOverride(), Loc, Loc);
 TryStmt = CXXTryStmt::Create(CGF.getContext(), Loc, TryBody, Catch);
 CGF.EnterCXXTryStmt(*TryStmt);
+CGF.EmitStmt(TryBody);
+// We don't use EmitCXXTryStmt here. We need to store to ResumeEHVar that
+// doesn't exist in the body.
+Builder.CreateFlagStore(false, Coro.ResumeEHVar);
+CGF.ExitCXXTryStmt(*TryStmt);
+LValueOrRValue Res;
+// We are not supposed to obtain the value from init suspend 
await_resume().
+Res.RV = RValue::getIgnored();
+return Res;
   }
 
   LValueOrRValue Res;
@@ -253,11 +262,6 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   else
 Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
 
-  if (TryStmt) {
-Builder.CreateFlagStore(false, Coro.ResumeEHVar);
-CGF.ExitCXXTryStmt(*TryStmt);
-  }
-
   return Res;
 }
 

>From e11b48867b2f02a095d043d57cf7830702f47ed6 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:35:10 -0800
Subject: [PATCH 2/3] add test case for the previously crashing case

---
 .../coro-init-await-nontrivial-return.cpp | 46 +++
 1 file changed, 46 insertions(+)
 create mode 100644 
clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp

diff --git a/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp 
b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
new file mode 100644
index 000..c4b8da327f5c140
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
@@ -0,0 +1,46 @@
+// RUN: %clang_cc1 -std=c++20 -triple=x86_64-- -emit-llvm -fcxx-exceptions \
+// RUN:-disable-llvm-passes %s -o - | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+struct NontrivialType {
+  ~NontrivialType() {}
+};
+
+struct Task {
+struct promise_type;
+using handle_type = std::coroutine_handle<promise_type>;
+
+struct initial_suspend_awaiter {
+bool await_ready() {
+return false;
+}
+
+void await_suspend(handle_type h) {}
+
+NontrivialType await_resume() { return {}; }
+};
+
+struct promise_type {
+void return_void() {}
+void unhandled_exception() {}
+initial_suspend_awaiter initial_suspend() { return {}; }
+std::suspend_never final_suspend() noexcept { return {}; }
+Task get_return_object() {
+return Task{handle_type::from_promise(*this)};
+}
+};
+
+handle_type handler;
+};
+
+Task coro_create() {
+co_return;
+}
+
+// CHECK-LABEL: define{{.*}} ptr @_Z11coro_createv(
+// CHECK: init.ready:
+// CHECK-NEXT: store i1 true, ptr {{.*}}
+// CHECK-NEXT: call void @_ZN4Task23initial_suspend_awaiter12await_resumeEv(
+// CHECK-NEXT: call void @_ZN14NontrivialTypeD1Ev(
+// CHECK-NEXT: store i1 false, ptr {{.*}}

>From 231760360962cf980f06f2b83d1467fd2c9f1078 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:52:34 -0800
Subject: [PATCH 3/3] Add release note entry

---
 clang/docs/ReleaseNotes.rst   | 4 
 clang/lib/CodeGen/CGCoroutine.cpp | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 157afd9e8629152..9f9f83c673b7ba7 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -592,6 +592,7 @@ Bug Fixes in This Version
 - Clang now properly diagnoses use of stand-alone OpenMP directives after a
   label (including ``case`` or ``default`` labels).
 
+
   Before:
 
   .. code-block:: c++
@@ -610,6 +611,9 @@ Bug Fixes in This Version
   inside a lambda. (`#61460 
`_)
 - Fix crash during instantiation of some class template specializations within 
class
   templates. Fixes (`#70375 
`_)
+- Fix crash during code generation of C++ coroutine initial suspend when the 
+  return type of await_resume is not trivially destructible. 
+  Fixes (`#63803 <https://github.com/llvm/llvm-project/issues/63803>`_)
 
 Bug Fixes to Compiler Builtins
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/lib/CodeGen/CGC

[clang] [Clang] Fix ICE when `initial_suspend()`'s `await_resume()` returns a non-trivially destructible type (PR #72935)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 closed 
https://github.com/llvm/llvm-project/pull/72935

[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 updated 
https://github.com/llvm/llvm-project/pull/73073

>From e7d1ae077d7d301094b663166cc0c14c706d2110 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:06:31 -0800
Subject: [PATCH 1/2] [Clang][coro] Fix crash on emitting init suspend if the
 return type of await_resume() is not trivially destructible

---
 clang/lib/CodeGen/CGCoroutine.cpp | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/clang/lib/CodeGen/CGCoroutine.cpp 
b/clang/lib/CodeGen/CGCoroutine.cpp
index 7e449d5af3423cf..aaf122c0f83bc47 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,6 +245,15 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
  FPOptionsOverride(), Loc, Loc);
 TryStmt = CXXTryStmt::Create(CGF.getContext(), Loc, TryBody, Catch);
 CGF.EnterCXXTryStmt(*TryStmt);
+CGF.EmitStmt(TryBody);
+// We don't use EmitCXXTryStmt here. We need to store to ResumeEHVar that
+// doesn't exist in the body.
+Builder.CreateFlagStore(false, Coro.ResumeEHVar);
+CGF.ExitCXXTryStmt(*TryStmt);
+LValueOrRValue Res;
+// We are not supposed to obtain the value from init suspend 
await_resume().
+Res.RV = RValue::getIgnored();
+return Res;
   }
 
   LValueOrRValue Res;
@@ -253,11 +262,6 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   else
 Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
 
-  if (TryStmt) {
-Builder.CreateFlagStore(false, Coro.ResumeEHVar);
-CGF.ExitCXXTryStmt(*TryStmt);
-  }
-
   return Res;
 }
 

>From e11b48867b2f02a095d043d57cf7830702f47ed6 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:35:10 -0800
Subject: [PATCH 2/2] add test case for the previously crashing case

---
 .../coro-init-await-nontrivial-return.cpp | 46 +++
 1 file changed, 46 insertions(+)
 create mode 100644 
clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp

diff --git a/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp 
b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
new file mode 100644
index 000..c4b8da327f5c140
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
@@ -0,0 +1,46 @@
+// RUN: %clang_cc1 -std=c++20 -triple=x86_64-- -emit-llvm -fcxx-exceptions \
+// RUN:-disable-llvm-passes %s -o - | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+struct NontrivialType {
+  ~NontrivialType() {}
+};
+
+struct Task {
+struct promise_type;
+using handle_type = std::coroutine_handle<promise_type>;
+
+struct initial_suspend_awaiter {
+bool await_ready() {
+return false;
+}
+
+void await_suspend(handle_type h) {}
+
+NontrivialType await_resume() { return {}; }
+};
+
+struct promise_type {
+void return_void() {}
+void unhandled_exception() {}
+initial_suspend_awaiter initial_suspend() { return {}; }
+std::suspend_never final_suspend() noexcept { return {}; }
+Task get_return_object() {
+return Task{handle_type::from_promise(*this)};
+}
+};
+
+handle_type handler;
+};
+
+Task coro_create() {
+co_return;
+}
+
+// CHECK-LABEL: define{{.*}} ptr @_Z11coro_createv(
+// CHECK: init.ready:
+// CHECK-NEXT: store i1 true, ptr {{.*}}
+// CHECK-NEXT: call void @_ZN4Task23initial_suspend_awaiter12await_resumeEv(
+// CHECK-NEXT: call void @_ZN14NontrivialTypeD1Ev(
+// CHECK-NEXT: store i1 false, ptr {{.*}}


[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Chuanqi Xu via cfe-commits


https://github.com/ChuanqiXu9 approved this pull request.

LGTM. Thanks.

https://github.com/llvm/llvm-project/pull/73073

[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 edited 
https://github.com/llvm/llvm-project/pull/73073

[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang-codegen

Author: Yuxuan Chen (yuxuanchen1997)


Changes

This change aims to fix an ICE in issue 
https://github.com/llvm/llvm-project/issues/63803

The crash happens in `ExitCXXTryStmt` because `EmitAnyExpr()` pushes an additional 
cleanup onto the `EHScopeStack`. This breaks the assumption in 
`ExitCXXTryStmt` that the top of the stack is an `EHCatchScope`.

However, since the value returned from `await_resume()` of an init 
suspend is never read, we can skip the part that builds this `RValue`.
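
The stack-ordering problem described above can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyScopeStack`, `ScopeKind`, and both helper functions are hypothetical names and not clang's `EHScopeStack` API. It also assumes that emitting the resume expression as a statement pops its destructor cleanup before the try is exited, which is consistent with the destructor call preceding the `store i1 false` in the patch's FileCheck lines.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for the two scope kinds involved (hypothetical names).
enum class ScopeKind { Catch, Cleanup };

class ToyScopeStack {
public:
  void pushCatch() { Scopes.push_back(ScopeKind::Catch); }
  void pushCleanup() { Scopes.push_back(ScopeKind::Cleanup); }

  void popCleanup() {
    assert(!Scopes.empty() && Scopes.back() == ScopeKind::Cleanup);
    Scopes.pop_back();
  }

  // Models the assumption that the innermost scope is the catch scope
  // when the try is exited. Returns false where the real code would crash.
  bool exitTry() {
    if (Scopes.empty() || Scopes.back() != ScopeKind::Catch)
      return false;
    Scopes.pop_back();
    return true;
  }

private:
  std::vector<ScopeKind> Scopes;
};

// Old ordering: the destructor cleanup for the await_resume() result is
// still on the stack when the try is exited.
bool exitAfterPendingCleanup() {
  ToyScopeStack Stack;
  Stack.pushCatch();   // entering the try
  Stack.pushCleanup(); // result with a non-trivial destructor
  return Stack.exitTry();
}

// New ordering: the body is emitted as a statement, so its cleanup is
// popped before the try is exited.
bool exitAfterStatementCleanup() {
  ToyScopeStack Stack;
  Stack.pushCatch();
  Stack.pushCleanup();
  Stack.popCleanup(); // temporary destroyed at the end of the statement
  return Stack.exitTry();
}
```

In this model only the second ordering leaves the catch scope on top when the try is exited.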

---
Full diff: https://github.com/llvm/llvm-project/pull/73073.diff


2 Files Affected:

- (modified) clang/lib/CodeGen/CGCoroutine.cpp (+9-5) 
- (added) clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp 
(+49) 


```diff
diff --git a/clang/lib/CodeGen/CGCoroutine.cpp b/clang/lib/CodeGen/CGCoroutine.cpp
index 7e449d5af3423cf..aaf122c0f83bc47 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,6 +245,15 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
  FPOptionsOverride(), Loc, Loc);
 TryStmt = CXXTryStmt::Create(CGF.getContext(), Loc, TryBody, Catch);
 CGF.EnterCXXTryStmt(*TryStmt);
+CGF.EmitStmt(TryBody);
+// We don't use EmitCXXTryStmt here. We need to store to ResumeEHVar that
+// doesn't exist in the body.
+Builder.CreateFlagStore(false, Coro.ResumeEHVar);
+CGF.ExitCXXTryStmt(*TryStmt);
+LValueOrRValue Res;
+// We are not supposed to obtain the value from init suspend await_resume().
+Res.RV = RValue::getIgnored();
+return Res;
   }
 
   LValueOrRValue Res;
@@ -253,11 +262,6 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   else
 Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
 
-  if (TryStmt) {
-Builder.CreateFlagStore(false, Coro.ResumeEHVar);
-CGF.ExitCXXTryStmt(*TryStmt);
-  }
-
   return Res;
 }
 
diff --git a/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
new file mode 100644
index 000..78fc88a071d5c74
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
@@ -0,0 +1,49 @@
+// RUN: %clang_cc1 -std=c++20 -triple=x86_64-- -emit-llvm -fcxx-exceptions \
+// RUN:-disable-llvm-passes %s -o - | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+struct NontrivialType {
+  ~NontrivialType() {}
+};
+
+struct Task {
+struct promise_type;
+using handle_type = std::coroutine_handle<promise_type>;
+
+struct initial_suspend_awaiter {
+bool await_ready() {
+return false;
+}
+
+void await_suspend(handle_type h) {}
+
+// Nontrivial type caused crash!
+NontrivialType await_resume() noexcept {
+  return {};
+}
+};
+
+struct promise_type {
+void return_void() {}
+void unhandled_exception() {}
+initial_suspend_awaiter initial_suspend() { return {}; }
+std::suspend_never final_suspend() noexcept { return {}; }
+Task get_return_object() {
+return Task{handle_type::from_promise(*this)};
+}
+};
+
+handle_type handler;
+};
+
+Task coro_create() {
+co_return;
+}
+
+// CHECK-LABEL: define{{.*}} ptr @_Z11coro_createv(
+// CHECK: init.ready:
+// CHECK-NEXT: store i1 true, ptr {{.*}}
+// CHECK-NEXT: call void @_ZN4Task23initial_suspend_awaiter12await_resumeEv(
+// CHECK-NEXT: call void @_ZN14NontrivialTypeD1Ev(
+// CHECK-NEXT: store i1 false, ptr {{.*}}

```




https://github.com/llvm/llvm-project/pull/73073

[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 ready_for_review 
https://github.com/llvm/llvm-project/pull/73073

[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 updated 
https://github.com/llvm/llvm-project/pull/73073

>From e7d1ae077d7d301094b663166cc0c14c706d2110 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:06:31 -0800
Subject: [PATCH 1/2] [Clang][coro] Fix crash on emitting init suspend if the
 return type of await_resume() is not trivially destructible

---
 clang/lib/CodeGen/CGCoroutine.cpp | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/clang/lib/CodeGen/CGCoroutine.cpp 
b/clang/lib/CodeGen/CGCoroutine.cpp
index 7e449d5af3423cf..aaf122c0f83bc47 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,6 +245,15 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
  FPOptionsOverride(), Loc, Loc);
 TryStmt = CXXTryStmt::Create(CGF.getContext(), Loc, TryBody, Catch);
 CGF.EnterCXXTryStmt(*TryStmt);
+CGF.EmitStmt(TryBody);
+// We don't use EmitCXXTryStmt here. We need to store to ResumeEHVar that
+// doesn't exist in the body.
+Builder.CreateFlagStore(false, Coro.ResumeEHVar);
+CGF.ExitCXXTryStmt(*TryStmt);
+LValueOrRValue Res;
+// We are not supposed to obtain the value from init suspend 
await_resume().
+Res.RV = RValue::getIgnored();
+return Res;
   }
 
   LValueOrRValue Res;
@@ -253,11 +262,6 @@ static LValueOrRValue 
emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   else
 Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
 
-  if (TryStmt) {
-Builder.CreateFlagStore(false, Coro.ResumeEHVar);
-CGF.ExitCXXTryStmt(*TryStmt);
-  }
-
   return Res;
 }
 

>From f2cba69fc1c06fac2bdea5d3c66ac0b752248646 Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:35:10 -0800
Subject: [PATCH 2/2] add test case for the previously crashing case

---
 .../coro-init-await-nontrivial-return.cpp | 49 +++
 1 file changed, 49 insertions(+)
 create mode 100644 
clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp

diff --git a/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp 
b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
new file mode 100644
index 000..78fc88a071d5c74
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
@@ -0,0 +1,49 @@
+// RUN: %clang_cc1 -std=c++20 -triple=x86_64-- -emit-llvm -fcxx-exceptions \
+// RUN:-disable-llvm-passes %s -o - | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+struct NontrivialType {
+  ~NontrivialType() {}
+};
+
+struct Task {
+struct promise_type;
+using handle_type = std::coroutine_handle<promise_type>;
+
+struct initial_suspend_awaiter {
+bool await_ready() {
+return false;
+}
+
+void await_suspend(handle_type h) {}
+
+// Nontrivial type caused crash!
+NontrivialType await_resume() noexcept {
+  return {};
+}
+};
+
+struct promise_type {
+void return_void() {}
+void unhandled_exception() {}
+initial_suspend_awaiter initial_suspend() { return {}; }
+std::suspend_never final_suspend() noexcept { return {}; }
+Task get_return_object() {
+return Task{handle_type::from_promise(*this)};
+}
+};
+
+handle_type handler;
+};
+
+Task coro_create() {
+co_return;
+}
+
+// CHECK-LABEL: define{{.*}} ptr @_Z11coro_createv(
+// CHECK: init.ready:
+// CHECK-NEXT: store i1 true, ptr {{.*}}
+// CHECK-NEXT: call void @_ZN4Task23initial_suspend_awaiter12await_resumeEv(
+// CHECK-NEXT: call void @_ZN14NontrivialTypeD1Ev(
+// CHECK-NEXT: store i1 false, ptr {{.*}}


[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 updated 
https://github.com/llvm/llvm-project/pull/73073

>From f782f36c42f9bc1246837bf7ff2142919794580b Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:06:31 -0800
Subject: [PATCH 1/2] [Clang][coro] Fix crash on emitting init suspend if the
 return type of await_resume() is not trivially destructible

---
 clang/lib/CodeGen/CGCoroutine.cpp | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/clang/lib/CodeGen/CGCoroutine.cpp b/clang/lib/CodeGen/CGCoroutine.cpp
index 7e449d5af3423cf..aaf122c0f83bc47 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,6 +245,15 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
  FPOptionsOverride(), Loc, Loc);
 TryStmt = CXXTryStmt::Create(CGF.getContext(), Loc, TryBody, Catch);
 CGF.EnterCXXTryStmt(*TryStmt);
+CGF.EmitStmt(TryBody);
+// We don't use EmitCXXTryStmt here. We need to store to ResumeEHVar that
+// doesn't exist in the body.
+Builder.CreateFlagStore(false, Coro.ResumeEHVar);
+CGF.ExitCXXTryStmt(*TryStmt);
+LValueOrRValue Res;
+// We are not supposed to obtain the value from init suspend await_resume().
+Res.RV = RValue::getIgnored();
+return Res;
   }
 
   LValueOrRValue Res;
@@ -253,11 +262,6 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   else
 Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
 
-  if (TryStmt) {
-Builder.CreateFlagStore(false, Coro.ResumeEHVar);
-CGF.ExitCXXTryStmt(*TryStmt);
-  }
-
   return Res;
 }
 

>From 9f98b9d0b812b1027bfc4d963b353feeac36834b Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:35:10 -0800
Subject: [PATCH 2/2] add test case for the previously crashing case

---
 .../coro-init-await-nontrivial-return.cpp | 49 +++
 1 file changed, 49 insertions(+)
 create mode 100644 
clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp

diff --git a/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp 
b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
new file mode 100644
index 000..78fc88a071d5c74
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-init-await-nontrivial-return.cpp
@@ -0,0 +1,49 @@
+// RUN: %clang_cc1 -std=c++20 -triple=x86_64-- -emit-llvm -fcxx-exceptions \
+// RUN:-disable-llvm-passes %s -o - | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+struct NontrivialType {
+  ~NontrivialType() {}
+};
+
+struct Task {
+struct promise_type;
+using handle_type = std::coroutine_handle<promise_type>;
+
+struct initial_suspend_awaiter {
+bool await_ready() {
+return false;
+}
+
+void await_suspend(handle_type h) {}
+
+// Nontrivial type caused crash!
+NontrivialType await_resume() noexcept {
+  return {};
+}
+};
+
+struct promise_type {
+void return_void() {}
+void unhandled_exception() {}
+initial_suspend_awaiter initial_suspend() { return {}; }
+std::suspend_never final_suspend() noexcept { return {}; }
+Task get_return_object() {
+return Task{handle_type::from_promise(*this)};
+}
+};
+
+handle_type handler;
+};
+
+Task coro_create() {
+co_return;
+}
+
+// CHECK-LABEL: define{{.*}} ptr @_Z11coro_createv(
+// CHECK: init.ready:
+// CHECK-NEXT: store i1 true, ptr {{.*}}
+// CHECK-NEXT: call void @_ZN4Task23initial_suspend_awaiter12await_resumeEv(
+// CHECK-NEXT: call void @_ZN14NontrivialTypeD1Ev(
+// CHECK-NEXT: store i1 false, ptr {{.*}}


[libunwind] [libunwind] Remove unnecessary dependencies on fprintf and stdio.h for increased baremetal friendliness (PR #72040)

2023-11-21 Thread Michael Kenzel via cfe-commits


https://github.com/michael-kenzel edited 
https://github.com/llvm/llvm-project/pull/72040

[clang] [Clang][Coroutines] Properly emit EH code for initial suspend `await_resume` (PR #73073)

2023-11-21 Thread Yuxuan Chen via cfe-commits


https://github.com/yuxuanchen1997 created 
https://github.com/llvm/llvm-project/pull/73073

This change aims to fix an ICE in issue 
https://github.com/llvm/llvm-project/issues/63803

The crash happens in `ExitCXXTryStmt` because `EmitAnyExpr()` adds an additional 
cleanup to the `EHScopeStack`. This violates the assumption in 
`ExitCXXTryStmt` that the top of the stack is an `EHCatchScope`. 

However, since we never read the value returned from `await_resume()` of an init 
suspend, we can skip the part that builds this `RValue`.
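
The stack-discipline problem described above can be pictured with a toy model. This is only a sketch: `ToyScopeStack`, `ScopeKind`, and both functions are hypothetical names invented for illustration and are not Clang's classes. It assumes nothing more than what the description states, namely that exiting the try expects the catch scope to be on top of the stack.

```cpp
#include <cassert>
#include <vector>

// Toy model only. These names are hypothetical and are not Clang's classes.
enum class ScopeKind { Catch, Cleanup };

struct ToyScopeStack {
  std::vector<ScopeKind> scopes;

  void enterTry() { scopes.push_back(ScopeKind::Catch); }
  void pushCleanup() { scopes.push_back(ScopeKind::Cleanup); }
  void popCleanup() { scopes.pop_back(); }

  // Mirrors the stated assumption: the innermost scope must be the catch
  // scope. Returns false where the real code would hit an assertion.
  bool exitTry() {
    if (scopes.empty() || scopes.back() != ScopeKind::Catch)
      return false;
    scopes.pop_back();
    return true;
  }
};

// A cleanup (for example, the destructor of a returned temporary) is still
// on the stack when the try is exited.
bool exitWithPendingCleanup() {
  ToyScopeStack s;
  s.enterTry();
  s.pushCleanup();
  return s.exitTry();
}

// The cleanup has been popped before the try is exited.
bool exitAfterCleanupPopped() {
  ToyScopeStack s;
  s.enterTry();
  s.pushCleanup();
  s.popCleanup();
  return s.exitTry();
}
```

In this model `exitWithPendingCleanup()` fails and `exitAfterCleanupPopped()` succeeds; the second ordering is the state the patch appears to aim for.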

>From f782f36c42f9bc1246837bf7ff2142919794580b Mon Sep 17 00:00:00 2001
From: Yuxuan Chen 
Date: Tue, 21 Nov 2023 19:06:31 -0800
Subject: [PATCH] [Clang][coro] Fix crash on emitting init suspend if the
 return type of await_resume() is not trivially destructible

---
 clang/lib/CodeGen/CGCoroutine.cpp | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/clang/lib/CodeGen/CGCoroutine.cpp b/clang/lib/CodeGen/CGCoroutine.cpp
index 7e449d5af3423cf..aaf122c0f83bc47 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -245,6 +245,15 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
  FPOptionsOverride(), Loc, Loc);
 TryStmt = CXXTryStmt::Create(CGF.getContext(), Loc, TryBody, Catch);
 CGF.EnterCXXTryStmt(*TryStmt);
+CGF.EmitStmt(TryBody);
+// We don't use EmitCXXTryStmt here. We need to store to ResumeEHVar that
+// doesn't exist in the body.
+Builder.CreateFlagStore(false, Coro.ResumeEHVar);
+CGF.ExitCXXTryStmt(*TryStmt);
+LValueOrRValue Res;
+// We are not supposed to obtain the value from init suspend await_resume().
+Res.RV = RValue::getIgnored();
+return Res;
   }
 
   LValueOrRValue Res;
@@ -253,11 +262,6 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   else
 Res.RV = CGF.EmitAnyExpr(S.getResumeExpr(), aggSlot, ignoreResult);
 
-  if (TryStmt) {
-Builder.CreateFlagStore(false, Coro.ResumeEHVar);
-CGF.ExitCXXTryStmt(*TryStmt);
-  }
-
   return Res;
 }
 


[clang] [clang][analyzer][NFC] Use `*EofVal` instead of constant `-1` (PR #73072)

2023-11-21 Thread Ben Shi via cfe-commits


benshi001 wrote:

According to line 442 of this file, EOF might have a value other than -1, 
although that is unusual.
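
As a small illustration of the same point in ordinary user code (not in the analyzer): the C standard only requires `EOF` to be a negative `int` constant, so comparing against the macro is more portable than comparing against a literal `-1`. The helper name `isEndOfInput` is hypothetical.

```cpp
#include <cassert>
#include <cstdio>

// The standard guarantees only that EOF is a negative int constant.
static_assert(EOF < 0, "EOF must be negative");

// Hypothetical helper: tests a value returned by fgetc()/getchar() against
// the EOF macro instead of a hard-coded -1.
bool isEndOfInput(int c) { return c == EOF; }
```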


https://github.com/llvm/llvm-project/pull/73072

[clang] [clang][analyzer][NFC] Use `*EofVal` instead of constant `-1` (PR #73072)

2023-11-21 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Ben Shi (benshi001)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/73072.diff


1 Files Affected:

- (modified) clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp (+3-2) 


```diff
diff --git a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
index 1d53e59ca067c27..3d6f54c1b606ac0 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
@@ -952,8 +952,9 @@ void StreamChecker::evalFtell(const FnDescription *Desc, const CallEvent &Call,
   if (!StateNotFailed)
 return;
 
-  ProgramStateRef StateFailed = State->BindExpr(
-  CE, C.getLocationContext(), SVB.makeIntVal(-1, C.getASTContext().LongTy));
+  ProgramStateRef StateFailed =
+  State->BindExpr(CE, C.getLocationContext(),
+  SVB.makeIntVal(*EofVal, C.getASTContext().LongTy));
 
   C.addTransition(StateNotFailed);
   C.addTransition(StateFailed);

```




https://github.com/llvm/llvm-project/pull/73072

[clang] [clang][analyzer][NFC] Use `*EofVal` instead of constant `-1` (PR #73072)

2023-11-21 Thread Ben Shi via cfe-commits


https://github.com/benshi001 created 
https://github.com/llvm/llvm-project/pull/73072

None

>From 1079cdb578a434344ac525e32d9931325e6f3f6c Mon Sep 17 00:00:00 2001
From: Ben Shi 
Date: Wed, 22 Nov 2023 11:00:50 +0800
Subject: [PATCH] [clang][analyzer][NFC] Use '*EofVal' instead of constant '-1'

---
 clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
index 1d53e59ca067c27..3d6f54c1b606ac0 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
@@ -952,8 +952,9 @@ void StreamChecker::evalFtell(const FnDescription *Desc, const CallEvent &Call,
   if (!StateNotFailed)
 return;
 
-  ProgramStateRef StateFailed = State->BindExpr(
-  CE, C.getLocationContext(), SVB.makeIntVal(-1, C.getASTContext().LongTy));
+  ProgramStateRef StateFailed =
+  State->BindExpr(CE, C.getLocationContext(),
+  SVB.makeIntVal(*EofVal, C.getASTContext().LongTy));
 
   C.addTransition(StateNotFailed);
   C.addTransition(StateFailed);


[clang-tools-extra] [clang] [clang-tidy] Added new check to detect redundant inline keyword (PR #73069)

2023-11-21 Thread Félix-Antoine Constantin via cfe-commits


https://github.com/felix642 updated 
https://github.com/llvm/llvm-project/pull/73069

From 89281ccb5354e3d6349d10e6f9446194d2d4 Mon Sep 17 00:00:00 2001
From: Félix-Antoine Constantin
Date: Thu, 16 Nov 2023 22:03:15 -0500
Subject: [PATCH 1/2] [clang-tidy] Added check to detect redundant inline keyword
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This check finds usages of the inline keyword where it is
already implicitly defined by the compiler and suggests its removal.

Fixes #72397
---
 .../clang-tidy/readability/CMakeLists.txt |   1 +
 .../readability/ReadabilityTidyModule.cpp |   3 +
 .../RedundantInlineSpecifierCheck.cpp |  99 
 .../RedundantInlineSpecifierCheck.h   |  36 ++
 clang-tools-extra/docs/ReleaseNotes.rst   |   5 +
 .../docs/clang-tidy/checks/list.rst   |   1 +
 .../redundant-inline-specifier.rst|  34 ++
 .../redundant-inline-specifier.cpp| 110 ++
 clang/include/clang/ASTMatchers/ASTMatchers.h |   2 +-
 9 files changed, 290 insertions(+), 1 deletion(-)
 create mode 100644 
clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
 create mode 100644 
clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.h
 create mode 100644 
clang-tools-extra/docs/clang-tidy/checks/readability/redundant-inline-specifier.rst
 create mode 100644 
clang-tools-extra/test/clang-tidy/checkers/readability/redundant-inline-specifier.cpp

diff --git a/clang-tools-extra/clang-tidy/readability/CMakeLists.txt 
b/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
index 5452c2d48a46173..811310db8c721a6 100644
--- a/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
@@ -20,6 +20,7 @@ add_clang_library(clangTidyReadabilityModule
   IdentifierLengthCheck.cpp
   IdentifierNamingCheck.cpp
   ImplicitBoolConversionCheck.cpp
+  RedundantInlineSpecifierCheck.cpp
   InconsistentDeclarationParameterNameCheck.cpp
   IsolateDeclarationCheck.cpp
   MagicNumbersCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp 
b/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
index b8e6e6414320600..3ce7bfecaecba64 100644
--- a/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
@@ -39,6 +39,7 @@
 #include "RedundantControlFlowCheck.h"
 #include "RedundantDeclarationCheck.h"
 #include "RedundantFunctionPtrDereferenceCheck.h"
+#include "RedundantInlineSpecifierCheck.h"
 #include "RedundantMemberInitCheck.h"
 #include "RedundantPreprocessorCheck.h"
 #include "RedundantSmartptrGetCheck.h"
@@ -93,6 +94,8 @@ class ReadabilityModule : public ClangTidyModule {
 "readability-identifier-naming");
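 CheckFactories.registerCheck<ImplicitBoolConversionCheck>(
 "readability-implicit-bool-conversion");
+CheckFactories.registerCheck<RedundantInlineSpecifierCheck>(
+"readability-redundant-inline-specifier");
 CheckFactories.registerCheck<InconsistentDeclarationParameterNameCheck>(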
 "readability-inconsistent-declaration-parameter-name");
 CheckFactories.registerCheck(
diff --git 
a/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp 
b/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
new file mode 100644
index 000..e73b570df759153
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
@@ -0,0 +1,99 @@
+//===--- RedundantInlineSpecifierCheck.cpp -
+// clang-tidy--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "RedundantInlineSpecifierCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/AST/Decl.h"
+#include "clang/AST/DeclCXX.h"
+#include "clang/AST/DeclTemplate.h"
+#include "clang/AST/ExprCXX.h"
+#include "clang/ASTMatchers/ASTMatchers.h"
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/SourceLocation.h"
+#include "clang/Basic/SourceManager.h"
+
+#include "../utils/LexerUtils.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::readability {
+
+static std::optional<SourceLocation>
+getInlineTokenLocation(SourceRange RangeLocation, const SourceManager &Sources,
+   const LangOptions &LangOpts) {
+  SourceLocation CurrentLocation = RangeLocation.getBegin();
+  Token CurrentToken;
+  while (!Lexer::getRawToken(CurrentLocation, CurrentToken, Sources, LangOpts,
+ true) &&
+ CurrentLocation < RangeLocation.getEnd() &&
+ CurrentToken.isNot(tok::eof)) {
+if (CurrentToken.is(tok::raw_identifier)) {
+  i

[PATCH] D155688: [PATCH] [llvm] [InstCombine] Canonicalise ADD+GEP

2023-11-21 Thread Craig Topper via Phabricator via cfe-commits

craig.topper added a comment.

After this patch, I'm seeing a lot of `invariant.gep` created by LICM. For 
example, in `LBM_performStreamCollide` in 470.lbm there are 65 of them. On 
RISC-V, these all get created in registers outside the loop and get spilled. Is 
ARM seeing anything like this or do you have more addressing modes that allow 
CodeGenPrepare to bring these back into the loop?
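
A rough source-level picture of the pattern in question, written as C++ standing in for IR rather than actual LLVM output. It assumes the canonicalisation lets the loop-invariant part of an address (`base + k`) be split from the varying index and hoisted; both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Address computed inside the loop: the constant offset is folded into the
// index, so only `base`, `i`, and `k` need to stay live.
long sumIndexed(const long *base, std::size_t n, std::size_t k) {
  long sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += base[i + k];
  return sum;
}

// Loop-invariant part hoisted: `invariant` must stay live for the whole
// loop. One such pointer is cheap; dozens of them in one loop compete for
// registers and may be spilled.
long sumHoisted(const long *base, std::size_t n, std::size_t k) {
  const long *invariant = base + k;
  long sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += invariant[i];
  return sum;
}
```

Both functions compute the same result; the difference is only in how many values are kept live across the loop, which is where the register pressure concern comes from.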


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155688/new/

https://reviews.llvm.org/D155688


[clang] [C++20] [Modules] Introduce a tool 'clang-named-modules-querier' and two plugins 'ClangGetUsedFilesFromModulesPlugin' and 'ClangGetDeclsInModulesPlugin' (PR #72956)

2023-11-21 Thread Chuanqi Xu via cfe-commits


ChuanqiXu9 wrote:

> I'm still really hesitant about this direction.
> 
> One starting concern: what happens if someone adds an overload, or other 
> interesting name resolution to the module? The downstream caller hasn't 
> textually changed, but it should be rebuilt because it should be calling a 
> different overload candidate now? (& even if we then track every function 
> with the same name, there's other cases - like adding an implicit conversion 
> operator, operator overload, etc, that might complicate things)

Oh, nice catch. It is a problem for the used-file-based solution if we add the 
overload to a separate, unused file. While the hash-based solution can handle 
the overload case, it'll be a problem if we add an implicit conversion we 
didn't use before.
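
The overload concern can be reproduced in a few lines of plain C++. In this sketch the two namespaces stand in for two versions of the same module; the names `v1`, `v2`, `f`, and `caller` are hypothetical. The text of `caller` is identical in both, yet it resolves to a different function once the extra overload exists.

```cpp
#include <cassert>
#include <string>

namespace v1 {
// Only one candidate: the int argument is converted to long.
std::string f(long) { return "f(long)"; }
std::string caller() { return f(1); }
} // namespace v1

namespace v2 {
std::string f(long) { return "f(long)"; }
// Newly added overload: an exact match for an int argument.
std::string f(int) { return "f(int)"; }
// Textually unchanged caller, now bound to f(int).
std::string caller() { return f(1); }
} // namespace v2
```

Any scheme that decides whether to rebuild the caller has to notice this kind of change even though nothing the caller previously used was modified.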

https://github.com/llvm/llvm-project/pull/72956

[clang] [CUDA][HIP] make trivial ctor/dtor host device (PR #72394)

2023-11-21 Thread Yaxun Liu via cfe-commits


yxsamliu wrote:

> @yxsamliu What's the plan here? This issue is blocking us. If there is no 
> obvious fix very soon, we need to revert this.

I will fix it.

https://github.com/llvm/llvm-project/pull/72394

[clang-tools-extra] [clang] [clang-tidy] Added new check to detect redundant inline keyword (PR #73069)

2023-11-21 Thread Félix-Antoine Constantin via cfe-commits


https://github.com/felix642 updated 
https://github.com/llvm/llvm-project/pull/73069

From 89281ccb5354e3d6349d10e6f9446194d2d4 Mon Sep 17 00:00:00 2001
From: Félix-Antoine Constantin
Date: Thu, 16 Nov 2023 22:03:15 -0500
Subject: [PATCH] [clang-tidy] Added check to detect redundant inline keyword
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This check finds usages of the inline keyword where it is
already implicitly defined by the compiler and suggests its removal.

Fixes #72397
---
 .../clang-tidy/readability/CMakeLists.txt |   1 +
 .../readability/ReadabilityTidyModule.cpp |   3 +
 .../RedundantInlineSpecifierCheck.cpp |  99 
 .../RedundantInlineSpecifierCheck.h   |  36 ++
 clang-tools-extra/docs/ReleaseNotes.rst   |   5 +
 .../docs/clang-tidy/checks/list.rst   |   1 +
 .../redundant-inline-specifier.rst|  34 ++
 .../redundant-inline-specifier.cpp| 110 ++
 clang/include/clang/ASTMatchers/ASTMatchers.h |   2 +-
 9 files changed, 290 insertions(+), 1 deletion(-)
 create mode 100644 
clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
 create mode 100644 
clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.h
 create mode 100644 
clang-tools-extra/docs/clang-tidy/checks/readability/redundant-inline-specifier.rst
 create mode 100644 
clang-tools-extra/test/clang-tidy/checkers/readability/redundant-inline-specifier.cpp

diff --git a/clang-tools-extra/clang-tidy/readability/CMakeLists.txt 
b/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
index 5452c2d48a46173..811310db8c721a6 100644
--- a/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
@@ -20,6 +20,7 @@ add_clang_library(clangTidyReadabilityModule
   IdentifierLengthCheck.cpp
   IdentifierNamingCheck.cpp
   ImplicitBoolConversionCheck.cpp
+  RedundantInlineSpecifierCheck.cpp
   InconsistentDeclarationParameterNameCheck.cpp
   IsolateDeclarationCheck.cpp
   MagicNumbersCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp 
b/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
index b8e6e6414320600..3ce7bfecaecba64 100644
--- a/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
@@ -39,6 +39,7 @@
 #include "RedundantControlFlowCheck.h"
 #include "RedundantDeclarationCheck.h"
 #include "RedundantFunctionPtrDereferenceCheck.h"
+#include "RedundantInlineSpecifierCheck.h"
 #include "RedundantMemberInitCheck.h"
 #include "RedundantPreprocessorCheck.h"
 #include "RedundantSmartptrGetCheck.h"
@@ -93,6 +94,8 @@ class ReadabilityModule : public ClangTidyModule {
 "readability-identifier-naming");
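 CheckFactories.registerCheck<ImplicitBoolConversionCheck>(
 "readability-implicit-bool-conversion");
+CheckFactories.registerCheck<RedundantInlineSpecifierCheck>(
+"readability-redundant-inline-specifier");
 CheckFactories.registerCheck<InconsistentDeclarationParameterNameCheck>(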
 "readability-inconsistent-declaration-parameter-name");
 CheckFactories.registerCheck(
diff --git 
a/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp 
b/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
new file mode 100644
index 000..e73b570df759153
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
@@ -0,0 +1,99 @@
+//===--- RedundantInlineSpecifierCheck.cpp -
+// clang-tidy--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "RedundantInlineSpecifierCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/AST/Decl.h"
+#include "clang/AST/DeclCXX.h"
+#include "clang/AST/DeclTemplate.h"
+#include "clang/AST/ExprCXX.h"
+#include "clang/ASTMatchers/ASTMatchers.h"
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/SourceLocation.h"
+#include "clang/Basic/SourceManager.h"
+
+#include "../utils/LexerUtils.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::readability {
+
+static std::optional<SourceLocation>
+getInlineTokenLocation(SourceRange RangeLocation, const SourceManager &Sources,
+   const LangOptions &LangOpts) {
+  SourceLocation CurrentLocation = RangeLocation.getBegin();
+  Token CurrentToken;
+  while (!Lexer::getRawToken(CurrentLocation, CurrentToken, Sources, LangOpts,
+ true) &&
+ CurrentLocation < RangeLocation.getEnd() &&
+ CurrentToken.isNot(tok::eof)) {
+if (CurrentToken.is(tok::raw_identifier)) {
+  if (C

[clang-tools-extra] [clang] [clang-tidy] Added new check to detect redundant inline keyword (PR #73069)

2023-11-21 Thread via cfe-commits


llvmbot wrote:



@llvm/pr-subscribers-clang-tidy

@llvm/pr-subscribers-clang

Author: Félix-Antoine Constantin (felix642)


Changes

This check finds usages of the inline keyword where it is already implicitly 
defined by the compiler and suggests its removal.

Fixes #72397
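
For readers unfamiliar with the language rule behind the check, here is a short sketch of declarations where `inline` is already implied, next to one where it is not. The names are hypothetical, and which of these cases the check actually reports is an assumption here; the patch's own documentation and tests are the authority on that.

```cpp
#include <cassert>

struct Counter {
  // A member function defined inside the class body is implicitly inline,
  // so the keyword adds nothing here.
  inline int next() { return ++value; }
  int value = 0;
};

// constexpr functions are implicitly inline as well.
constexpr inline int twice(int x) { return 2 * x; }

// Not redundant: a namespace-scope function that is neither constexpr nor a
// template is only inline because of the keyword.
inline int plusOne(int x) { return x + 1; }
```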

---
Full diff: https://github.com/llvm/llvm-project/pull/73069.diff


9 Files Affected:

- (modified) clang-tools-extra/clang-tidy/readability/CMakeLists.txt (+1)
- (modified) clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp (+3)
- (added) clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp (+94)
- (added) clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.h (+36)
- (modified) clang-tools-extra/docs/ReleaseNotes.rst (+5)
- (modified) clang-tools-extra/docs/clang-tidy/checks/list.rst (+1)
- (added) clang-tools-extra/docs/clang-tidy/checks/readability/redundant-inline-specifier.rst (+34)
- (added) clang-tools-extra/test/clang-tidy/checkers/readability/redundant-inline-specifier.cpp (+110)
- (modified) clang/include/clang/ASTMatchers/ASTMatchers.h (+1-1)


``diff
diff --git a/clang-tools-extra/clang-tidy/readability/CMakeLists.txt 
b/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
index 5452c2d48a46173..811310db8c721a6 100644
--- a/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
@@ -20,6 +20,7 @@ add_clang_library(clangTidyReadabilityModule
   IdentifierLengthCheck.cpp
   IdentifierNamingCheck.cpp
   ImplicitBoolConversionCheck.cpp
+  RedundantInlineSpecifierCheck.cpp
   InconsistentDeclarationParameterNameCheck.cpp
   IsolateDeclarationCheck.cpp
   MagicNumbersCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp 
b/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
index b8e6e6414320600..3ce7bfecaecba64 100644
--- a/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
@@ -39,6 +39,7 @@
 #include "RedundantControlFlowCheck.h"
 #include "RedundantDeclarationCheck.h"
 #include "RedundantFunctionPtrDereferenceCheck.h"
+#include "RedundantInlineSpecifierCheck.h"
 #include "RedundantMemberInitCheck.h"
 #include "RedundantPreprocessorCheck.h"
 #include "RedundantSmartptrGetCheck.h"
@@ -93,6 +94,8 @@ class ReadabilityModule : public ClangTidyModule {
 "readability-identifier-naming");
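 CheckFactories.registerCheck<ImplicitBoolConversionCheck>(
 "readability-implicit-bool-conversion");
+CheckFactories.registerCheck<RedundantInlineSpecifierCheck>(
+"readability-redundant-inline-specifier");
 CheckFactories.registerCheck<InconsistentDeclarationParameterNameCheck>(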
 "readability-inconsistent-declaration-parameter-name");
 CheckFactories.registerCheck(
diff --git 
a/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp 
b/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
new file mode 100644
index 000..7b67a7419c708b7
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
@@ -0,0 +1,94 @@
+//===--- RedundantInlineSpecifierCheck.cpp - clang-tidy--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "RedundantInlineSpecifierCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/AST/Decl.h"
+#include "clang/AST/DeclCXX.h"
+#include "clang/AST/DeclTemplate.h"
+#include "clang/AST/ExprCXX.h"
+#include "clang/ASTMatchers/ASTMatchers.h"
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/SourceLocation.h"
+#include "clang/Basic/SourceManager.h"
+
+#include "../utils/LexerUtils.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::readability {
+
+static std::optional<SourceLocation>
+getInlineTokenLocation(SourceRange RangeLocation,
+   const SourceManager &Sources,
+   const LangOptions &LangOpts) {
+  SourceLocation CurrentLocation = RangeLocation.getBegin();
+  Token CurrentToken;
+  while (!Lexer::getRawToken(CurrentLocation, CurrentToken, Sources, LangOpts,
+ true) && CurrentLocation < RangeLocation.getEnd() && CurrentToken.isNot(tok::eof)) {
+if (CurrentToken.is(tok::raw_identifier)) {
+  if (CurrentToken.getRawIdentifier() == "inline") {
+return CurrentToken.getLocation();
+  }
+}
+CurrentLocation = CurrentToken.getEndLoc();
+  }
+  return std::nullopt;
+}
+
+void RedundantInlineSpecifierCheck::registerMatchers(MatchFinder *Finder) {
+  Finder->addMatcher(
+  functionDecl(unless(isExpansionInSystemHeader()), unless(isImplicit()),
+   unless(hasAncestor(lambdaExpr())), isInline(),
+   anyO

[clang] [clang-tools-extra] [clang-tidy] Added new check to detect redundant inline keyword (PR #73069)

2023-11-21 Thread Félix-Antoine Constantin via cfe-commits


https://github.com/felix642 edited 
https://github.com/llvm/llvm-project/pull/73069

[clang] [clang-tools-extra] [clang-tidy] Added check to detect redundant inline keyword (PR #73069)

2023-11-21 Thread Félix-Antoine Constantin via cfe-commits


https://github.com/felix642 created 
https://github.com/llvm/llvm-project/pull/73069

This check finds usages of the inline keyword where it is already implicitly 
defined by the compiler and suggests its removal.

Fixes #72397

From 894c3c837725c75f8ad19a185139d07b10fb2e0a Mon Sep 17 00:00:00 2001
From: Félix-Antoine Constantin
Date: Thu, 16 Nov 2023 22:03:15 -0500
Subject: [PATCH] [clang-tidy] Added check to detect redundant inline keyword
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This check finds usages of the inline keyword where it is
already implicitly defined by the compiler and suggests its removal.

Fixes #72397
---
 .../clang-tidy/readability/CMakeLists.txt |   1 +
 .../readability/ReadabilityTidyModule.cpp |   3 +
 .../RedundantInlineSpecifierCheck.cpp |  94 +++
 .../RedundantInlineSpecifierCheck.h   |  36 ++
 clang-tools-extra/docs/ReleaseNotes.rst   |   5 +
 .../docs/clang-tidy/checks/list.rst   |   1 +
 .../redundant-inline-specifier.rst|  34 ++
 .../redundant-inline-specifier.cpp| 110 ++
 clang/include/clang/ASTMatchers/ASTMatchers.h |   2 +-
 9 files changed, 285 insertions(+), 1 deletion(-)
 create mode 100644 
clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
 create mode 100644 
clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.h
 create mode 100644 
clang-tools-extra/docs/clang-tidy/checks/readability/redundant-inline-specifier.rst
 create mode 100644 
clang-tools-extra/test/clang-tidy/checkers/readability/redundant-inline-specifier.cpp

diff --git a/clang-tools-extra/clang-tidy/readability/CMakeLists.txt 
b/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
index 5452c2d48a46173..811310db8c721a6 100644
--- a/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/readability/CMakeLists.txt
@@ -20,6 +20,7 @@ add_clang_library(clangTidyReadabilityModule
   IdentifierLengthCheck.cpp
   IdentifierNamingCheck.cpp
   ImplicitBoolConversionCheck.cpp
+  RedundantInlineSpecifierCheck.cpp
   InconsistentDeclarationParameterNameCheck.cpp
   IsolateDeclarationCheck.cpp
   MagicNumbersCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp 
b/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
index b8e6e6414320600..3ce7bfecaecba64 100644
--- a/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/readability/ReadabilityTidyModule.cpp
@@ -39,6 +39,7 @@
 #include "RedundantControlFlowCheck.h"
 #include "RedundantDeclarationCheck.h"
 #include "RedundantFunctionPtrDereferenceCheck.h"
+#include "RedundantInlineSpecifierCheck.h"
 #include "RedundantMemberInitCheck.h"
 #include "RedundantPreprocessorCheck.h"
 #include "RedundantSmartptrGetCheck.h"
@@ -93,6 +94,8 @@ class ReadabilityModule : public ClangTidyModule {
 "readability-identifier-naming");
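 CheckFactories.registerCheck<ImplicitBoolConversionCheck>(
 "readability-implicit-bool-conversion");
+CheckFactories.registerCheck<RedundantInlineSpecifierCheck>(
+"readability-redundant-inline-specifier");
 CheckFactories.registerCheck<InconsistentDeclarationParameterNameCheck>(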
 "readability-inconsistent-declaration-parameter-name");
 CheckFactories.registerCheck(
diff --git 
a/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp 
b/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
new file mode 100644
index 000..7b67a7419c708b7
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/readability/RedundantInlineSpecifierCheck.cpp
@@ -0,0 +1,94 @@
+//===--- RedundantInlineSpecifierCheck.cpp - clang-tidy--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "RedundantInlineSpecifierCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/AST/Decl.h"
+#include "clang/AST/DeclCXX.h"
+#include "clang/AST/DeclTemplate.h"
+#include "clang/AST/ExprCXX.h"
+#include "clang/ASTMatchers/ASTMatchers.h"
+#include "clang/Basic/Diagnostic.h"
+#include "clang/Basic/SourceLocation.h"
+#include "clang/Basic/SourceManager.h"
+
+#include "../utils/LexerUtils.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::readability {
+
+static std::optional<SourceLocation>
+getInlineTokenLocation(SourceRange RangeLocation,
+   const SourceManager &Sources,
+   const LangOptions &LangOpts) {
+  SourceLocation CurrentLocation = RangeLocation.getBegin();
+  Token CurrentToken;
+  while (!Lexer::getRawToken(CurrentLocation, CurrentToken, Sources, LangOpts,
+ tr

[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #72607)

2023-11-21 Thread Yuanfang Chen via cfe-commits


yuanfang-chen wrote:

ping?

https://github.com/llvm/llvm-project/pull/72607

[clang] [clang codegen][regression] Add dso_local/hidden/etc. markings to VTT definitions and declarations (PR #72452)

2023-11-21 Thread via cfe-commits


bd1976bris wrote:

Thanks for the reviews 👍 

https://github.com/llvm/llvm-project/pull/72452

[clang] [llvm] Remove experimental from Vector Crypto extensions (PR #69000)

2023-11-21 Thread Eric Biggers via cfe-commits


ebiggers wrote:

At 
https://github.com/ebiggers/llvm-project/tree/remove_experimental_from_vector_crypto
 I've rebased this pull request, squashed the commits, resolved conflicts, 
fixed the Zvkn duplication, and run `git clang-format`.  Note, it was necessary 
to change the type of `enum RVVRequire` from `uint16_t` to `unsigned int` 
because the next flag is now `1 << 16`, and to remove `experimental-` from some 
new files in `llvm/test/Transforms/` where references to the crypto extensions 
had been added.  @4vtomat can you update your pull request accordingly?  Feel 
free to reuse what I have.  Thanks!
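
A minimal sketch of why the underlying type had to be widened, assuming hypothetical enumerator names rather than the real `RVVRequire` members: a flag at bit 16 no longer fits in a 16-bit underlying type.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical flag enum; the enumerator names are illustrative only and are
// not the actual RVVRequire members.
enum class RequireFlags : unsigned int {
  None = 0,
  FlagBit15 = 1u << 15, // highest single-bit flag that fits in uint16_t
  FlagBit16 = 1u << 16, // needs a type wider than 16 bits
};

// Returns true if Value can be stored in a uint16_t without truncation.
constexpr bool fitsInUint16(unsigned int Value) {
  return Value <= std::numeric_limits<std::uint16_t>::max();
}

static_assert(fitsInUint16(1u << 15), "bit 15 fits in 16 bits");
static_assert(!fitsInUint16(1u << 16), "bit 16 does not fit in 16 bits");
```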

https://github.com/llvm/llvm-project/pull/69000

[clang] [Clang] Introduce scoped variants of GNU atomic functions (PR #72280)

2023-11-21 Thread Joseph Huber via cfe-commits

jhuber6 wrote:

> Missing change to clang/docs/LanguageExtensions.rst describing the new 
> builtins.
>
Will do.

> Are there any other projects that we might want to coordinate with here? gcc, 
> maybe?

Unknown, I've never collaborated with anyone outside of LLVM. I know they have 
handling of GPU programming for NVPTX and AMDGPU targets, but I don't know what 
level of support they have for this. I think it's sufficient to keep it as a 
`clang` extension for now. I'm hoping this is relatively simple given it's just 
an extra argument on the GNU versions. 

https://github.com/llvm/llvm-project/pull/72280

[clang] fix: compatible C++ empty record with align UB with gcc (PR #72197)

2023-11-21 Thread via cfe-commits



@@ -307,7 +307,12 @@ AArch64ABIInfo::classifyArgumentType(QualType Ty, bool 
IsVariadic,
 // 0.
 if (IsEmpty && Size == 0)
   return ABIArgInfo::getIgnore();
-return ABIArgInfo::getDirect(llvm::Type::getInt8Ty(getVMContext()));
+// An empty struct can have size greater than one byte if alignment is
+// involved.
+// When size <= 64, we still hold it by i8 in IR and lowering to registers.
+// When Size > 64, just fall through to avoid va_list out of sync.

hstk30-hw wrote:

Forgive my foolish question, but can I just copy this as the comment?

> AAPCS64 does not say that empty records are ignored as arguments,
> but other compilers do so in certain situations, and we copy that behavior.
> Those situations are in fact language-mode-specific, which seems really
> unfortunate, but it's something we just have to accept. If this doesn't
> apply, just fall through to the standard argument-handling path.
> Darwin overrides the psABI here to ignore all empty records in all modes.
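
For context, a small sketch of the kind of type under discussion: an empty C++ record whose size exceeds one byte because of an alignment specifier. The type and helper names are made up for illustration and do not come from the patch.

```cpp
#include <cstddef>

// Hypothetical empty record with a 16-byte alignment requirement.
struct alignas(16) AlignedEmpty {};

// Hypothetical empty record with default alignment.
struct PlainEmpty {};

// Size of a type in bits, assuming 8-bit bytes.
template <typename T> constexpr std::size_t sizeInBits() {
  return sizeof(T) * 8;
}

// sizeof is always a non-zero multiple of the alignment, so the aligned
// empty record occupies at least 16 bytes even though it has no members.
static_assert(alignof(AlignedEmpty) == 16, "alignment follows alignas");
static_assert(sizeof(AlignedEmpty) >= 16, "size is padded up to the alignment");
static_assert(sizeof(PlainEmpty) >= 1, "empty records still have non-zero size");
```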


https://github.com/llvm/llvm-project/pull/72197

[clang] fix: compatible C++ empty record with align UB with gcc (PR #72197)

2023-11-21 Thread via cfe-commits


https://github.com/hstk30-hw edited 
https://github.com/llvm/llvm-project/pull/72197

[clang] [Clang] Introduce scoped variants of GNU atomic functions (PR #72280)

2023-11-21 Thread Eli Friedman via cfe-commits


efriedma-quic wrote:

Missing change to clang/docs/LanguageExtensions.rst describing the new builtins.

Are there any other projects that we might want to coordinate with here?  gcc, 
maybe?

https://github.com/llvm/llvm-project/pull/72280

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #68932)

2023-11-21 Thread Jun Wang via cfe-commits


https://github.com/jwanggit86 updated 
https://github.com/llvm/llvm-project/pull/68932

>From e393477607cb94b45a3b9a5db2aea98fb8af2a86 Mon Sep 17 00:00:00 2001
From: Jun Wang 
Date: Thu, 12 Oct 2023 16:45:59 -0500
Subject: [PATCH 01/10] [AMDGPU] Emit a waitcnt instruction after each memory
 instruction

This patch implements a new command-line option for the backend, namely,
amdgpu-waitcnt-for-all-mem-op. When this option is specified, a "waitcnt 0"
instruction is generated after each memory load/store instruction.
---
 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp   |  30 ++-
 .../CodeGen/AMDGPU/insert_waitcnt_for_all.ll  | 222 ++
 2 files changed, 251 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_all.ll

diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp 
b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index ede4841b8a5fd7d..728be7c61fa2217 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -52,6 +52,10 @@ static cl::opt<bool> ForceEmitZeroFlag(
   cl::desc("Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) 
expcnt(0) lgkmcnt(0)"),
   cl::init(false), cl::Hidden);
 
+static cl::opt<bool> EmitForAllMemOpFlag(
+"amdgpu-waitcnt-for-all-mem-op",
+cl::desc("Emit s_waitcnt 0 after each memory operation"), cl::init(false));
+
 namespace {
 // Class of object that encapsulates latest instruction counter score
 // associated with the operand.  Used for determining whether
@@ -388,6 +392,8 @@ class SIInsertWaitcnts : public MachineFunctionPass {
   // message.
   DenseSet ReleaseVGPRInsts;
 
+  bool insertWaitcntAfterMemOp(MachineFunction &MF);
+
 public:
   static char ID;
 
@@ -1809,6 +1815,23 @@ bool SIInsertWaitcnts::shouldFlushVmCnt(MachineLoop *ML,
   return HasVMemLoad && UsesVgprLoadedOutside;
 }
 
+bool SIInsertWaitcnts::insertWaitcntAfterMemOp(MachineFunction &MF) {
+  bool Modified = false;
+
+  for (auto &MBB : MF) {
+for (auto It = MBB.begin(); It != MBB.end();) {
+  bool IsMemOp = It->mayLoadOrStore();
+  ++It;
+  if (IsMemOp) {
+BuildMI(MBB, It, DebugLoc(), TII->get(AMDGPU::S_WAITCNT)).addImm(0);
+Modified = true;
+  }
+}
+  }
+
+  return Modified;
+}
+
 bool SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {
   ST = &MF.getSubtarget();
   TII = ST->getInstrInfo();
@@ -1819,6 +1842,12 @@ bool 
SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {
   MLI = &getAnalysis();
   PDT = &getAnalysis();
 
+  bool Modified = false;
+
+  if (EmitForAllMemOpFlag) {
+Modified = insertWaitcntAfterMemOp(MF);
+  }
+
   ForceEmitZeroWaitcnts = ForceEmitZeroFlag;
   for (auto T : inst_counter_types())
 ForceEmitWaitcnt[T] = false;
@@ -1847,7 +1876,6 @@ bool 
SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {
 
   TrackedWaitcntSet.clear();
   BlockInfos.clear();
-  bool Modified = false;
 
   if (!MFI->isEntryFunction()) {
 // Wait for any outstanding memory operations that the input registers may
diff --git a/llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_all.ll 
b/llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_all.ll
new file mode 100644
index 000..4580b9074ada3cc
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_all.ll
@@ -0,0 +1,222 @@
+; Testing the -amdgpu-waitcnt-for-all-mem-op option
+; COM: llc -mtriple=amdgcn -mcpu=hawaii -amdgpu-waitcnt-for-all-mem-op 
-verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX7
+; COM: llc -mtriple=amdgcn -mcpu=tonga -amdgpu-waitcnt-for-all-mem-op 
-verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX8
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -amdgpu-waitcnt-for-all-mem-op 
-verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -amdgpu-waitcnt-for-all-mem-op 
-verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX90A
+; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -amdgpu-waitcnt-for-all-mem-op 
-verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX10
+; RUN: llc -mtriple=amdgcn-- -mcpu=gfx900 
-mattr=-flat-for-global,+enable-flat-scratch 
-amdgpu-use-divergent-register-indexing -amdgpu-waitcnt-for-all-mem-op 
-verify-machineinstrs < %s | FileCheck --check-prefixes=GFX9-FLATSCR %s
+
+; from atomicrmw-expand.ll
+; covers flat_load, flat_atomic
+define void @syncscope_workgroup_nortn(ptr %addr, float %val) {
+; GFX90A-LABEL: syncscope_workgroup_nortn:
+; GFX90A:  ; %bb.0:
+; GFX90A: flat_load_dword v5, v[0:1]
+; GFX90A-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX90A:  .LBB0_1: ; %atomicrmw.start
+; GFX90A: flat_atomic_cmpswap v3, v[0:1], v[4:5] glc
+; GFX90A-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+  %res = atomicrmw fadd ptr %addr, float %val syncscope("workgroup") seq_cst
+  ret void
+}
+
+; from atomicrmw-nand.ll
+; covers global_atomic, global_load
+define i32 @atomic_nand_i32_global(ptr addrspace(1) %ptr) nounwind {
+; GFX9-LABEL: atomic_

[clang] [CUDA][HIP] make trivial ctor/dtor host device (PR #72394)

2023-11-21 Thread via cfe-commits


alexfh wrote:

@yxsamliu What's the plan here? This issue is blocking us. If there is no 
obvious fix very soon, we need to revert this.

https://github.com/llvm/llvm-project/pull/72394

[clang] Honor -fno-sanitize-link-runtime for libclang_rt.asan_static (PR #66414)

2023-11-21 Thread Vitaly Buka via cfe-commits


vitalybuka wrote:

LGTM, looks straightforward
Optionally wait for @MaskRay for a day or so.

https://github.com/llvm/llvm-project/pull/66414

[clang] Honor -fno-sanitize-link-runtime for libclang_rt.asan_static (PR #66414)

2023-11-21 Thread Vitaly Buka via cfe-commits


https://github.com/vitalybuka approved this pull request.


https://github.com/llvm/llvm-project/pull/66414

Re: [clang] Honor -fno-sanitize-link-runtime for libclang_rt.asan_static (PR #66414)

2023-11-21 Thread Vitaly Buka via cfe-commits

Will do
There is a button to "re-request" reviews.
Otherwise it does not show up in searches like
https://github.com/llvm/llvm-project/pulls/review-requested/@me

On Tue, 21 Nov 2023 at 10:06, via cfe-commits 
wrote:

>
> pirama-arumuga-nainar wrote:
>
> @vitalybuka can you help review this change?
>
> https://github.com/llvm/llvm-project/pull/66414
>

[clang] [llvm] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits


momchil-velikov wrote:

I only now noticed I had a bunch of comments sitting for a few weeks in 
"Pending" state :/

https://github.com/llvm/llvm-project/pull/66524

[clang] [llvm] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -1757,46 +1826,55 @@ void AArch64FrameLowering::emitPrologue(MachineFunction 
&MF,
 }
   }
 
-  StackOffset AllocateBefore = SVEStackSize, AllocateAfter = {};
+  StackOffset SVECalleeSavedSize = {}, SVELocalsSize = SVEStackSize;
   MachineBasicBlock::iterator CalleeSavesBegin = MBBI, CalleeSavesEnd = MBBI;
 
   // Process the SVE callee-saves to determine what space needs to be
   // allocated.
   if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
+LLVM_DEBUG(dbgs() << "SVECalleeSavedStackSize = " << CalleeSavedSize
+  << "\n");
 // Find callee save instructions in frame.
 CalleeSavesBegin = MBBI;
 assert(IsSVECalleeSave(CalleeSavesBegin) && "Unexpected instruction");
 while (IsSVECalleeSave(MBBI) && MBBI != MBB.getFirstTerminator())
   ++MBBI;
 CalleeSavesEnd = MBBI;
 
-AllocateBefore = StackOffset::getScalable(CalleeSavedSize);
-AllocateAfter = SVEStackSize - AllocateBefore;
+SVECalleeSavedSize = StackOffset::getScalable(CalleeSavedSize);
+SVELocalsSize = SVEStackSize - SVECalleeSavedSize;
+
+// Allocate space for the SVE callee saves.
+if (SVECalleeSavedSize) {
+  allocateSVEStackSpace(
+  MBB, CalleeSavesBegin, SVECalleeSavedSize,
+  StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes),
+  EmitAsyncCFI && !HasFP);
+  if (EmitAsyncCFI)
+emitCalleeSavedSVELocations(MBB, CalleeSavesEnd);
+}
   }
 
-  // Allocate space for the callee saves (if any).
-  emitFrameOffset(
-  MBB, CalleeSavesBegin, DL, AArch64::SP, AArch64::SP, -AllocateBefore, 
TII,
-  MachineInstr::FrameSetup, false, false, nullptr,
-  EmitAsyncCFI && !HasFP && AllocateBefore,
-  StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes));
-
-  if (EmitAsyncCFI)
-emitCalleeSavedSVELocations(MBB, CalleeSavesEnd);
-
-  // Finally allocate remaining SVE stack space.
-  emitFrameOffset(MBB, CalleeSavesEnd, DL, AArch64::SP, AArch64::SP,
-  -AllocateAfter, TII, MachineInstr::FrameSetup, false, false,
-  nullptr, EmitAsyncCFI && !HasFP && AllocateAfter,
-  AllocateBefore + StackOffset::getFixed(
-   (int64_t)MFI.getStackSize() - 
NumBytes));
+  // Allocate stack space for the local SVE objects.
+  if (SVELocalsSize)
+allocateSVEStackSpace(
+MBB, CalleeSavesEnd, SVELocalsSize,
+SVECalleeSavedSize +
+StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes),
+EmitAsyncCFI && !HasFP);
 
   // Allocate space for the rest of the frame.
   if (NumBytes) {
 unsigned scratchSPReg = AArch64::SP;
+bool NeedsStackProbe = TLI.hasInlineStackProbe(MF) &&
+   (NumBytes > AArch64::StackProbeMaxUnprobedStack ||
+MFI.hasVarSizedObjects());
 
 if (NeedsRealignment) {
   scratchSPReg = findScratchNonCalleeSaveRegister(&MBB);
+  NeedsStackProbe |= TLI.hasInlineStackProbe(MF) &&
+ (NumBytes + MFI.getMaxAlign().value()) >

momchil-velikov wrote:

Done

https://github.com/llvm/llvm-project/pull/66524

[llvm] [clang] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -1076,6 +1076,16 @@ void CodeGenModule::Release() {
 "sign-return-address-with-bkey", 1);
   }
 
+  if (Arch == llvm::Triple::aarch64 || Arch == llvm::Triple::aarch64_be) {
+auto *InlineAsm = llvm::MDString::get(TheModule.getContext(), 
"inline-asm");
+if (CodeGenOpts.StackClashProtector)
+  getModule().addModuleFlag(llvm::Module::Override, "probe-stack",
+InlineAsm);

momchil-velikov wrote:

We would like to use a module flag so the stack clash protection is effective 
for functions created by LLVM (e.g. `asan.module_ctor`).
It is not AArch64 specific in principle, but other backends which implement SCP 
still rely on function attributes. When/if other backends adopt this approach 
the condition can be removed.
@serge-sans-paille 

https://github.com/llvm/llvm-project/pull/66524

[llvm] [clang] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -9460,6 +9461,94 @@ bool AArch64InstrInfo::isReallyTriviallyReMaterializable(
   return TargetInstrInfo::isReallyTriviallyReMaterializable(MI);
 }
 
+MachineBasicBlock::iterator
+AArch64InstrInfo::insertStackProbingLoop(MachineBasicBlock::iterator MBBI,
+ Register ScratchReg,
+ Register TargetReg) const {
+  MachineBasicBlock &MBB = *MBBI->getParent();
+  MachineFunction &MF = *MBB.getParent();
+  const AArch64InstrInfo *TII =
+  MF.getSubtarget<AArch64Subtarget>().getInstrInfo();
+  int64_t ProbeSize = MF.getInfo<AArch64FunctionInfo>()->getStackProbeSize();
+  DebugLoc DL = MBB.findDebugLoc(MBBI);
+
+  MachineFunction::iterator MBBInsertPoint = std::next(MBB.getIterator());
+  MachineBasicBlock *LoopTestMBB =
+  MF.CreateMachineBasicBlock(MBB.getBasicBlock());
+  MF.insert(MBBInsertPoint, LoopTestMBB);
+  MachineBasicBlock *LoopBodyMBB =
+  MF.CreateMachineBasicBlock(MBB.getBasicBlock());
+  MF.insert(MBBInsertPoint, LoopBodyMBB);
+  MachineBasicBlock *ExitMBB = MF.CreateMachineBasicBlock(MBB.getBasicBlock());
+  MF.insert(MBBInsertPoint, ExitMBB);
+
+  // LoopTest:
+  //   SUB ScratchReg, ScratchReg, #ProbeSize
+  emitFrameOffset(*LoopTestMBB, LoopTestMBB->end(), DL, ScratchReg, ScratchReg,
+  StackOffset::getFixed(-ProbeSize), TII,
+  MachineInstr::FrameSetup);
+
+  //   CMP ScratchReg, TargetReg
+  AArch64CC::CondCode Cond = AArch64CC::LE;
+  Register Op1 = ScratchReg;
+  Register Op2 = TargetReg;
+  if (Op2 == AArch64::SP) {
+assert(Op1 != AArch64::SP && "At most one of the registers can be SP");
+// CMP TargetReg, ScratchReg
+std::swap(Op1, Op2);
+Cond = AArch64CC::GT;
+  }
+  BuildMI(*LoopTestMBB, LoopTestMBB->end(), DL, TII->get(AArch64::SUBSXrx64),
+  AArch64::XZR)
+  .addReg(Op1)
+  .addReg(Op2)
+  .addImm(AArch64_AM::getArithExtendImm(AArch64_AM::UXTX, 0))
+  .setMIFlags(MachineInstr::FrameSetup);
+
+  //   B. LoopExit
+  BuildMI(*LoopTestMBB, LoopTestMBB->end(), DL, TII->get(AArch64::Bcc))
+  .addImm(Cond)
+  .addMBB(ExitMBB)
+  .setMIFlags(MachineInstr::FrameSetup);
+
+  //   STR XZR, [ScratchReg]
+  BuildMI(*LoopBodyMBB, LoopBodyMBB->end(), DL, TII->get(AArch64::STRXui))
+  .addReg(AArch64::XZR)
+  .addReg(ScratchReg)
+  .addImm(0)
+  .setMIFlags(MachineInstr::FrameSetup);
+
+  //   B loop
+  BuildMI(*LoopBodyMBB, LoopBodyMBB->end(), DL, TII->get(AArch64::B))
+  .addMBB(LoopTestMBB)
+  .setMIFlags(MachineInstr::FrameSetup);
+
+  // LoopExit:
+  //   STR XZR, [TargetReg]
+  BuildMI(*ExitMBB, ExitMBB->begin(), DL, TII->get(AArch64::STRXui))
+  .addReg(AArch64::XZR)
+  .addReg(TargetReg)
+  .addImm(0)
+  .setMIFlags(MachineInstr::FrameSetup);

momchil-velikov wrote:

I have now fixed this issue.

https://github.com/llvm/llvm-project/pull/66524

[llvm] [clang] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -26262,3 +26262,37 @@ bool 
AArch64TargetLowering::preferScalarizeSplat(SDNode *N) const {
   }
   return true;
 }
+
+bool AArch64TargetLowering::hasInlineStackProbe(
+const MachineFunction &MF) const {
+  // If the function specifically requests inline stack probes, emit them.
+  if (MF.getFunction().hasFnAttribute("probe-stack")) {
+if (MF.getFunction().getFnAttribute("probe-stack").getValueAsString() ==
+"inline-asm")
+  return true;
+else
+  llvm_unreachable("Unsupported stack probing method");
+  }
+
+  return false;
+}
+
+unsigned
+AArch64TargetLowering::getStackProbeSize(const MachineFunction &MF) const {
+  const TargetFrameLowering *TFI = Subtarget->getFrameLowering();
+  unsigned StackAlign = TFI->getStackAlignment();
+  assert(StackAlign >= 1 && isPowerOf2_32(StackAlign) &&
+ "Unexpected stack alignment");
+  // The default stack probe size is 4096 if the function has no
+  // stack-probe-size attribute. This is a safe default because it is the
+  // smallest possible guard page size.
+  unsigned StackProbeSize = 4096;
+  const Function &Fn = MF.getFunction();
+  if (Fn.hasFnAttribute("stack-probe-size"))

momchil-velikov wrote:

Rounding to the stack alignment is enough. Rounding down is the safer 
choice, zero is handled, and there is no requirement other than being a 
multiple of the stack alignment. Some choices might not be appropriate for 
certain platforms, e.g. a 5k or 8k probe size with a 4k guard page size, but 
that's not something that can be validated here. Values greater than 64k might 
also be legitimate, e.g. a specific platform allocates 2 guard pages (128k) at 
the top of the stack to limit probing overhead.
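
A minimal sketch of the rounding described here, assuming the stack alignment is a power of two. The helper name is hypothetical and this is not the code from the patch.

```cpp
#include <cstdint>

// Rounds Size down to the nearest multiple of Align.
// Assumes Align is a non-zero power of two (hypothetical helper, for
// illustration only).
constexpr std::uint64_t roundDownToAlign(std::uint64_t Size,
                                         std::uint64_t Align) {
  return Size & ~(Align - 1);
}

// A value that is already a multiple of the alignment is unchanged.
static_assert(roundDownToAlign(4096, 16) == 4096, "already aligned");
// 5000 is not a multiple of 16; the next lower multiple is 4992.
static_assert(roundDownToAlign(5000, 16) == 4992, "rounded down");
// Requests smaller than the alignment round down to zero, which the caller
// then has to treat as a special case.
static_assert(roundDownToAlign(15, 16) == 0, "rounds down to zero");
```

Rounding down rather than up never makes the gap between two probes larger than the requested size.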

https://github.com/llvm/llvm-project/pull/66524

[clang] [llvm] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -1827,12 +1908,36 @@ void AArch64FrameLowering::emitPrologue(MachineFunction 
&MF,
   // FIXME: in the case of dynamic re-alignment, NumBytes doesn't have
   // the correct value here, as NumBytes also includes padding bytes,
   // which shouldn't be counted here.
-  emitFrameOffset(
-  MBB, MBBI, DL, scratchSPReg, AArch64::SP,
-  StackOffset::getFixed(-NumBytes), TII, MachineInstr::FrameSetup,
-  false, NeedsWinCFI, &HasWinCFI, EmitAsyncCFI && !HasFP,
+  StackOffset CFAOffset =
   SVEStackSize +
-  StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes));
+  StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes);
+  if (NeedsStackProbe && !NeedsRealignment) {
+// If we don't need to re-align the stack, we can use a more efficient
+// sequence for stack probing.
+Register ScratchReg = findScratchNonCalleeSaveRegister(&MBB);

momchil-velikov wrote:

Because we may issue a loop when replacing `PROBED_STACKALLOC` and that loop 
uses a scratch register (containing the new `SP` value) in the loop exit 
condition.
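
A minimal sketch of the loop shape described here, written as a plain C++ simulation rather than the LLVM implementation. The function name and parameters are hypothetical, and it assumes the distance between the current SP and the target is an exact multiple of the probe size.

```cpp
#include <cstdint>
#include <vector>

// Simulates a stack probing loop. TargetSP plays the role of the scratch
// register holding the new SP value; it is only read by the exit condition.
// Returns the addresses that would be probed, in order.
// Hypothetical helper for illustration; not an LLVM API.
std::vector<std::uint64_t> simulateProbeLoop(std::uint64_t SP,
                                             std::uint64_t TargetSP,
                                             std::uint64_t ProbeSize) {
  std::vector<std::uint64_t> Probes;
  // Reject inputs for which the loop below would never terminate.
  if (ProbeSize == 0 || SP < TargetSP || (SP - TargetSP) % ProbeSize != 0)
    return Probes;
  while (SP != TargetSP) { // exit test compares SP with the scratch value
    SP -= ProbeSize;       // models: sub sp, sp, #ProbeSize
    Probes.push_back(SP);  // models: str xzr, [sp]
  }
  return Probes;
}
```

Because SP changes on every iteration, the final value has to be kept somewhere else for the comparison, which is why a separate register is needed.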

https://github.com/llvm/llvm-project/pull/66524

[clang] [llvm] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -672,6 +673,74 @@ void AArch64FrameLowering::emitCalleeSavedSVERestores(
   emitCalleeSavedRestores(MBB, MBBI, true);
 }
 
+void AArch64FrameLowering::allocateSVEStackSpace(
+MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+StackOffset AllocSize, StackOffset InitialOffset, bool EmitCFI) const {
+  DebugLoc DL;
+  MachineFunction &MF = *MBB.getParent();
+  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
+  const AArch64RegisterInfo &RegInfo = *Subtarget.getRegisterInfo();
+  const AArch64TargetLowering &TLI = *Subtarget.getTargetLowering();
+  const TargetInstrInfo &TII = *Subtarget.getInstrInfo();
+
+  // If not probing the stack or the (uknown) allocation size is less than the
+  // probe size decrement the stack pointer right away. This avoids having to
+  // emit a probing loop when allocating space for up to 16 SVE registers when
+  // using 4k probes.
+
+  // The bit-length of SVE registers is architecturally limited.
+  const int64_t MAX_BYTES_PER_SCALABLE_BYTE = 16;
+  int64_t ProbeSize = TLI.getStackProbeSize(MF);
+  if (!TLI.hasInlineStackProbe(MF) ||
+  AllocSize.getScalable() * MAX_BYTES_PER_SCALABLE_BYTE +
+  AllocSize.getFixed() <=
+  ProbeSize) {
+emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP, -AllocSize, &TII,
+MachineInstr::FrameSetup, false, false, nullptr, EmitCFI,
+InitialOffset);
+if (TLI.hasInlineStackProbe(MF)) {
+  // Issue a probe at the top of the stack to prepare for subsequent
+  // allocations.
+  // STR XZR, [TargetReg]
+  BuildMI(MBB, MBBI, DL, TII.get(AArch64::STRXui))
+  .addReg(AArch64::XZR)
+  .addReg(AArch64::SP)
+  .addImm(0)
+  .setMIFlags(MachineInstr::FrameSetup);
+}
+return;
+  }
+
+  // If we can't be sure the allocation size if less than the probe size, we
+  // have to emit a stack probing loop.
+  Register ScratchReg = findScratchNonCalleeSaveRegister(&MBB);
+  assert(ScratchReg != AArch64::NoRegister);
+  // Get the new top of the stack into a scratch register.
+  emitFrameOffset(MBB, MBBI, DL, ScratchReg, AArch64::SP, -AllocSize, &TII,
+  MachineInstr::FrameSetup, false, false, nullptr, EmitCFI,
+  InitialOffset);
+  // Arrange to emit a probing loop by decrementing SP until it reaches that
+  // new top of the stack.
+  BuildMI(MBB, MBBI, DL, TII.get(AArch64::PROBED_STACKALLOC_VAR), AArch64::SP)
+  .addReg(ScratchReg);
+  // Set SP to its new value.
+  // MOV SP, Xs
+  BuildMI(MBB, MBBI, DL, TII.get(AArch64::ADDXri), AArch64::SP)
+  .addReg(ScratchReg)
+  .addImm(0)
+  .addImm(AArch64_AM::getShifterImm(AArch64_AM::LSL, 0))
+  .setMIFlags(MachineInstr::FrameSetup);
+  if (EmitCFI) {

momchil-velikov wrote:

What if we have FP?

https://github.com/llvm/llvm-project/pull/66524

[llvm] [clang] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -9460,6 +9461,94 @@ bool AArch64InstrInfo::isReallyTriviallyReMaterializable(
   return TargetInstrInfo::isReallyTriviallyReMaterializable(MI);
 }
 
+MachineBasicBlock::iterator
+AArch64InstrInfo::insertStackProbingLoop(MachineBasicBlock::iterator MBBI,
+ Register ScratchReg,
+ Register TargetReg) const {
+  MachineBasicBlock &MBB = *MBBI->getParent();
+  MachineFunction &MF = *MBB.getParent();
+  const AArch64InstrInfo *TII =
+  MF.getSubtarget<AArch64Subtarget>().getInstrInfo();
+  int64_t ProbeSize = MF.getInfo<AArch64FunctionInfo>()->getStackProbeSize();
+  DebugLoc DL = MBB.findDebugLoc(MBBI);
+
+  MachineFunction::iterator MBBInsertPoint = std::next(MBB.getIterator());
+  MachineBasicBlock *LoopTestMBB =
+  MF.CreateMachineBasicBlock(MBB.getBasicBlock());
+  MF.insert(MBBInsertPoint, LoopTestMBB);
+  MachineBasicBlock *LoopBodyMBB =
+  MF.CreateMachineBasicBlock(MBB.getBasicBlock());
+  MF.insert(MBBInsertPoint, LoopBodyMBB);
+  MachineBasicBlock *ExitMBB = MF.CreateMachineBasicBlock(MBB.getBasicBlock());
+  MF.insert(MBBInsertPoint, ExitMBB);
+
+  // LoopTest:
+  //   SUB ScratchReg, ScratchReg, #ProbeSize
+  emitFrameOffset(*LoopTestMBB, LoopTestMBB->end(), DL, ScratchReg, ScratchReg,
+  StackOffset::getFixed(-ProbeSize), TII,
+  MachineInstr::FrameSetup);
+
+  //   CMP ScratchReg, TargetReg
+  AArch64CC::CondCode Cond = AArch64CC::LE;
+  Register Op1 = ScratchReg;
+  Register Op2 = TargetReg;
+  if (Op2 == AArch64::SP) {
+assert(Op1 != AArch64::SP && "At most one of the registers can be SP");
+// CMP TargetReg, ScratchReg
+std::swap(Op1, Op2);
+Cond = AArch64CC::GT;
+  }
+  BuildMI(*LoopTestMBB, LoopTestMBB->end(), DL, TII->get(AArch64::SUBSXrx64),
+  AArch64::XZR)
+  .addReg(Op1)
+  .addReg(Op2)
+  .addImm(AArch64_AM::getArithExtendImm(AArch64_AM::UXTX, 0))
+  .setMIFlags(MachineInstr::FrameSetup);
+
+  //   B. LoopExit
+  BuildMI(*LoopTestMBB, LoopTestMBB->end(), DL, TII->get(AArch64::Bcc))
+  .addImm(Cond)
+  .addMBB(ExitMBB)
+  .setMIFlags(MachineInstr::FrameSetup);
+
+  //   STR XZR, [ScratchReg]
+  BuildMI(*LoopBodyMBB, LoopBodyMBB->end(), DL, TII->get(AArch64::STRXui))
+  .addReg(AArch64::XZR)
+  .addReg(ScratchReg)
+  .addImm(0)
+  .setMIFlags(MachineInstr::FrameSetup);
+
+  //   B loop
+  BuildMI(*LoopBodyMBB, LoopBodyMBB->end(), DL, TII->get(AArch64::B))
+  .addMBB(LoopTestMBB)
+  .setMIFlags(MachineInstr::FrameSetup);
+
+  // LoopExit:
+  //   STR XZR, [TargetReg]
+  BuildMI(*ExitMBB, ExitMBB->begin(), DL, TII->get(AArch64::STRXui))
+  .addReg(AArch64::XZR)
+  .addReg(TargetReg)
+  .addImm(0)
+  .setMIFlags(MachineInstr::FrameSetup);

momchil-velikov wrote:

> ```
> sub sp, sp, #0x1, lsl #0xc
> cmp sp, x1
> b.le 0x557388
> str xzr, [x1]  {0x0}
> ```
> 
> We are probing the _old_ stack head! `x1` contains `0x7fee80` but `sp` is 
> at `7fde80`! This means that the selection of the `x1` register instead 
> of `sp` is incorrect.

I can't quite see how it is possible to generate this code. This is part of the 
sequence for allocating a compile-time-unknown amount of stack space that is 
done by `AArch64InstrInfo::insertStackProbingLoop`. In this function 
`TargetReg` is the new top of the stack and right now [1] `ScratchReg` is 
always `AArch64::SP`.


Thus first we have
```
  // LoopTest:
  //   SUB ScratchReg, ScratchReg, #ProbeSize
  emitFrameOffset(*LoopTestMBB, LoopTestMBB->end(), DL, ScratchReg, ScratchReg,
  StackOffset::getFixed(-ProbeSize), TII,
  MachineInstr::FrameSetup);
```

This is the code that emits the `sub sp, sp, #0x1, lsl #0xc`. Note, it uses 
`ScratchReg`.

Then we emit the compare
```
  //   CMP ScratchReg, TargetReg

  AArch64CC::CondCode Cond = AArch64CC::LE;
  Register Op1 = ScratchReg;
  Register Op2 = TargetReg;
  if (Op2 == AArch64::SP) { // condition is false here
  // ...
  }

  BuildMI(*LoopTestMBB, LoopTestMBB->end(), DL, TII->get(AArch64::SUBSXrx64),
  AArch64::XZR)
  .addReg(Op1)
  .addReg(Op2)
  .addImm(AArch64_AM::getArithExtendImm(AArch64_AM::UXTX, 0))
  .setMIFlags(MachineInstr::FrameSetup);
```

That is the `cmp sp, x1`. So, `Op2` is `TargetReg` and `TargetReg` is `x1`.

Then we emit the loop exit branch:
```
  //   B. LoopExit
  BuildMI(*LoopTestMBB, LoopTestMBB->end(), DL, TII->get(AArch64::Bcc))
  .addImm(Cond)
  .addMBB(ExitMBB)
  .setMIFlags(MachineInstr::FrameSetup);
```

This is the `b.le 0x557388` above.

Then, still inside the probing loop, we emit a stack probe to `ScratchReg`, 
i.e. to `SP`.

```
  //   STR XZR, [ScratchReg]
  BuildMI(*LoopBodyMBB, LoopBodyMBB->end(), DL, TII->get(AArch64::STRXui))
  .addReg(AArch64::XZR)
  .addReg(ScratchReg)
  .addImm(0)
  .setMIFlags(MachineInstr::FrameSetup);
```

However, i

[llvm] [clang] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -672,6 +673,74 @@ void AArch64FrameLowering::emitCalleeSavedSVERestores(
   emitCalleeSavedRestores(MBB, MBBI, true);
 }
 
+void AArch64FrameLowering::allocateSVEStackSpace(
+MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+StackOffset AllocSize, StackOffset InitialOffset, bool EmitCFI) const {
+  DebugLoc DL;
+  MachineFunction &MF = *MBB.getParent();
+  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
+  const AArch64RegisterInfo &RegInfo = *Subtarget.getRegisterInfo();
+  const AArch64TargetLowering &TLI = *Subtarget.getTargetLowering();
+  const TargetInstrInfo &TII = *Subtarget.getInstrInfo();
+
+  // If not probing the stack or the (uknown) allocation size is less than the
+  // probe size decrement the stack pointer right away. This avoids having to
+  // emit a probing loop when allocating space for up to 16 SVE registers when
+  // using 4k probes.
+
+  // The bit-length of SVE registers is architecturally limited.
+  const int64_t MAX_BYTES_PER_SCALABLE_BYTE = 16;
+  int64_t ProbeSize = TLI.getStackProbeSize(MF);
+  if (!TLI.hasInlineStackProbe(MF) ||
+  AllocSize.getScalable() * MAX_BYTES_PER_SCALABLE_BYTE +
+  AllocSize.getFixed() <=
+  ProbeSize) {
+emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP, -AllocSize, &TII,
+MachineInstr::FrameSetup, false, false, nullptr, EmitCFI,
+InitialOffset);
+if (TLI.hasInlineStackProbe(MF)) {
+  // Issue a probe at the top of the stack to prepare for subsequent
+  // allocations.
+  // STR XZR, [TargetReg]
+  BuildMI(MBB, MBBI, DL, TII.get(AArch64::STRXui))
+  .addReg(AArch64::XZR)
+  .addReg(AArch64::SP)
+  .addImm(0)
+  .setMIFlags(MachineInstr::FrameSetup);
+}
+return;
+  }
+
+  // If we can't be sure the allocation size if less than the probe size, we
+  // have to emit a stack probing loop.
+  Register ScratchReg = findScratchNonCalleeSaveRegister(&MBB);
+  assert(ScratchReg != AArch64::NoRegister);
+  // Get the new top of the stack into a scratch register.
+  emitFrameOffset(MBB, MBBI, DL, ScratchReg, AArch64::SP, -AllocSize, &TII,
+  MachineInstr::FrameSetup, false, false, nullptr, EmitCFI,
+  InitialOffset);
+  // Arrange to emit a probing loop by decrementing SP until it reaches that
+  // new top of the stack.
+  BuildMI(MBB, MBBI, DL, TII.get(AArch64::PROBED_STACKALLOC_VAR), AArch64::SP)
+  .addReg(ScratchReg);
+  // Set SP to its new value.
+  // MOV SP, Xs
+  BuildMI(MBB, MBBI, DL, TII.get(AArch64::ADDXri), AArch64::SP)
+  .addReg(ScratchReg)
+  .addImm(0)
+  .addImm(AArch64_AM::getShifterImm(AArch64_AM::LSL, 0))
+  .setMIFlags(MachineInstr::FrameSetup);
+  if (EmitCFI) {

momchil-velikov wrote:

Taken care of in invocations of `allocateSVEStackSpace`
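
For reference, the early-exit condition in the quoted hunk can be sketched in isolation. This is only a minimal illustration under stated assumptions, not the patch's code: `fitsWithinOneProbe` and `kMaxBytesPerScalableByte` are hypothetical names, and sizes are plain integers rather than `StackOffset` values.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for MAX_BYTES_PER_SCALABLE_BYTE in the quoted hunk:
// each "scalable byte" is assumed to expand to at most 16 real bytes.
constexpr int64_t kMaxBytesPerScalableByte = 16;

// Returns true when the worst-case allocation (scalable part scaled up to its
// architectural maximum, plus the fixed part) does not exceed the probe size,
// in which case a single SP adjustment suffices and no probing loop is needed.
bool fitsWithinOneProbe(int64_t scalableBytes, int64_t fixedBytes,
                        int64_t probeSize) {
  return scalableBytes * kMaxBytesPerScalableByte + fixedBytes <= probeSize;
}
```

With 4096-byte probes, 16 SVE registers (256 scalable bytes) still pass this check, while 17 registers (272 scalable bytes) do not.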

https://github.com/llvm/llvm-project/pull/66524
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -97,14 +97,45 @@ AArch64FunctionInfo::AArch64FunctionInfo(const Function &F,
 if (const auto *BTE = mdconst::extract_or_null(
 F.getParent()->getModuleFlag("branch-target-enforcement")))
   BranchTargetEnforcement = BTE->getZExtValue();
-return;
+  } else {
+const StringRef BTIEnable =
+F.getFnAttribute("branch-target-enforcement").getValueAsString();
+assert(BTIEnable.equals_insensitive("true") ||
+   BTIEnable.equals_insensitive("false"));

momchil-velikov wrote:

This is not a part of this patch series (it just moved a bit). I've created 
https://github.com/llvm/llvm-project/pull/70565

https://github.com/llvm/llvm-project/pull/66524

[llvm] [clang] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -4052,3 +4193,192 @@ void AArch64FrameLowering::orderFrameObjects(
 dbgs() << "\n";
   });
 }
+
+/// Emit a loop to decrement SP until it is equal to TargetReg, with probes at
+/// least every ProbeSize bytes. Returns an iterator of the first instruction
+/// after the loop. The difference between SP and TargetReg must be an exact
+/// multiple of ProbeSize.
+MachineBasicBlock::iterator
+AArch64FrameLowering::inlineStackProbeLoopExactMultiple(
+MachineBasicBlock::iterator MBBI, int64_t ProbeSize,
+Register TargetReg) const {
+  MachineBasicBlock &MBB = *MBBI->getParent();
+  MachineFunction &MF = *MBB.getParent();
+  const AArch64InstrInfo *TII =
+  MF.getSubtarget().getInstrInfo();
+  DebugLoc DL = MBB.findDebugLoc(MBBI);
+
+  MachineFunction::iterator MBBInsertPoint = std::next(MBB.getIterator());
+  MachineBasicBlock *LoopMBB = MF.CreateMachineBasicBlock(MBB.getBasicBlock());
+  MF.insert(MBBInsertPoint, LoopMBB);
+  MachineBasicBlock *ExitMBB = MF.CreateMachineBasicBlock(MBB.getBasicBlock());
+  MF.insert(MBBInsertPoint, ExitMBB);
+
+  // SUB SP, SP, #ProbeSize (or equivalent if ProbeSize is not encodable
+  // in SUB).
+  emitFrameOffset(*LoopMBB, LoopMBB->end(), DL, AArch64::SP, AArch64::SP,
+  StackOffset::getFixed(-ProbeSize), TII,
+  MachineInstr::FrameSetup);
+  // STR XZR, [SP]
+  BuildMI(*LoopMBB, LoopMBB->end(), DL, TII->get(AArch64::STRXui))
+  .addReg(AArch64::XZR)
+  .addReg(AArch64::SP)
+  .addImm(0)
+  .setMIFlags(MachineInstr::FrameSetup);
+  // CMP SP, TargetReg
+  BuildMI(*LoopMBB, LoopMBB->end(), DL, TII->get(AArch64::SUBSXrx64),
+  AArch64::XZR)
+  .addReg(AArch64::SP)
+  .addReg(TargetReg)
+  .addImm(AArch64_AM::getArithExtendImm(AArch64_AM::UXTX, 0))
+  .setMIFlags(MachineInstr::FrameSetup);
+  // B.CC Loop
+  BuildMI(*LoopMBB, LoopMBB->end(), DL, TII->get(AArch64::Bcc))
+  .addImm(AArch64CC::NE)
+  .addMBB(LoopMBB)
+  .setMIFlags(MachineInstr::FrameSetup);
+
+  LoopMBB->addSuccessor(ExitMBB);
+  LoopMBB->addSuccessor(LoopMBB);
+  // Synthesize the exit MBB.
+  ExitMBB->splice(ExitMBB->end(), &MBB, MBBI, MBB.end());
+  ExitMBB->transferSuccessorsAndUpdatePHIs(&MBB);
+  MBB.addSuccessor(LoopMBB);
+  // Update liveins.
+  recomputeLiveIns(*LoopMBB);
+  recomputeLiveIns(*ExitMBB);
+
+  return ExitMBB->begin();
+}
+
+MachineBasicBlock::iterator AArch64FrameLowering::inlineStackProbeFixed(
+MachineBasicBlock::iterator MBBI, Register ScratchReg, int64_t FrameSize,
+StackOffset CFAOffset) const {
+  MachineBasicBlock *MBB = MBBI->getParent();
+  MachineFunction &MF = *MBB->getParent();
+  const AArch64TargetLowering *TLI =
+  MF.getSubtarget().getTargetLowering();
+  const AArch64InstrInfo *TII =
+  MF.getSubtarget().getInstrInfo();
+  AArch64FunctionInfo *AFI = MF.getInfo();
+  bool EmitAsyncCFI = AFI->needsAsyncDwarfUnwindInfo(MF);
+  bool HasFP = hasFP(MF);
+
+  DebugLoc DL;
+  int64_t ProbeSize = TLI->getStackProbeSize(MF);
+  int64_t NumBlocks = FrameSize / ProbeSize;
+  int64_t ResidualSize = FrameSize % ProbeSize;
+
+  LLVM_DEBUG(dbgs() << "Stack probing: total " << FrameSize << " bytes, "
+<< NumBlocks << " blocks of " << ProbeSize
+<< " bytes, plus " << ResidualSize << " bytes\n");
+
+  // Decrement SP by NumBlock * ProbeSize bytes, with either unrolled or
+  // ordinary loop.
+  if (NumBlocks <= AArch64::StackProbeMaxLoopUnroll) {
+for (int i = 0; i < NumBlocks; ++i) {
+  // SUB SP, SP, #FrameSize (or equivalent if FrameSize is not

momchil-velikov wrote:

Wrong comment (code is OK): ProbeSize, not FrameSize.
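
To make the distinction concrete, here is a small sketch of the split that the quoted code performs before the loop. `ProbePlan` and `planProbes` are hypothetical names used only for illustration; the arithmetic mirrors the `NumBlocks` and `ResidualSize` computation in the hunk, and a positive probe size is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: how many full probe-sized steps are taken, and
// how many bytes remain to be allocated afterwards.
struct ProbePlan {
  int64_t numBlocks;
  int64_t residual;
};

// Splits a frame into ProbeSize-sized blocks plus a residual, as in the
// quoted hunk. Each block corresponds to one "SUB SP, SP, #ProbeSize"
// followed by a probe, which is why the comment should say ProbeSize.
ProbePlan planProbes(int64_t frameSize, int64_t probeSize) {
  return {frameSize / probeSize, frameSize % probeSize};
}
```

For a 10000-byte frame with 4096-byte probes this gives two blocks and a residual of 1808 bytes.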

https://github.com/llvm/llvm-project/pull/66524

[llvm] [clang] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -1827,12 +1908,36 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
   // FIXME: in the case of dynamic re-alignment, NumBytes doesn't have
   // the correct value here, as NumBytes also includes padding bytes,
   // which shouldn't be counted here.
-  emitFrameOffset(
-  MBB, MBBI, DL, scratchSPReg, AArch64::SP,
-  StackOffset::getFixed(-NumBytes), TII, MachineInstr::FrameSetup,
-  false, NeedsWinCFI, &HasWinCFI, EmitAsyncCFI && !HasFP,
+  StackOffset CFAOffset =
   SVEStackSize +
-  StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes));
+  StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes);
+  if (NeedsStackProbe && !NeedsRealignment) {
+// If we don't need to re-align the stack, we can use a more efficient
+// sequence for stack probing.
+Register ScratchReg = findScratchNonCalleeSaveRegister(&MBB);

momchil-velikov wrote:

Why do we need scratch reg here?

https://github.com/llvm/llvm-project/pull/66524

[llvm] [clang] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Momchil Velikov via cfe-commits



@@ -26262,3 +26262,37 @@ bool AArch64TargetLowering::preferScalarizeSplat(SDNode *N) const {
   }
   return true;
 }
+
+bool AArch64TargetLowering::hasInlineStackProbe(
+const MachineFunction &MF) const {
+  // If the function specifically requests inline stack probes, emit them.
+  if (MF.getFunction().hasFnAttribute("probe-stack")) {
+if (MF.getFunction().getFnAttribute("probe-stack").getValueAsString() ==
+"inline-asm")
+  return true;
+else
+  llvm_unreachable("Unsupported stack probing method");
+  }
+
+  return false;
+}
+
+unsigned
+AArch64TargetLowering::getStackProbeSize(const MachineFunction &MF) const {
+  const TargetFrameLowering *TFI = Subtarget->getFrameLowering();
+  unsigned StackAlign = TFI->getStackAlignment();
+  assert(StackAlign >= 1 && isPowerOf2_32(StackAlign) &&
+ "Unexpected stack alignment");
+  // The default stack probe size is 4096 if the function has no
+  // stack-probe-size attribute. This is a safe default because it is the
+  // smallest possible guard page size.
+  unsigned StackProbeSize = 4096;
+  const Function &Fn = MF.getFunction();
+  if (Fn.hasFnAttribute("stack-probe-size"))

momchil-velikov wrote:

Some validation of the value would be useful.
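
One possible shape for such validation, as a sketch only: the policy below (fall back to 4096 on a malformed or zero value, round down to the stack alignment, never go below the alignment) is an assumption for illustration and not what the patch does. `validatedProbeSize` and `kDefaultProbeSize` are hypothetical names, and the attribute value is modelled as a plain string.

```cpp
#include <algorithm>
#include <cassert>
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Default taken from the quoted comment: 4096 is treated as the safe default.
constexpr uint64_t kDefaultProbeSize = 4096;

// Parses a "stack-probe-size"-style attribute value. stackAlign is assumed to
// be a power of two, matching the assert in the quoted hunk.
uint64_t validatedProbeSize(std::string_view attr, uint64_t stackAlign) {
  uint64_t value = 0;
  const char *first = attr.data();
  const char *last = attr.data() + attr.size();
  auto [ptr, ec] = std::from_chars(first, last, value);
  // Reject non-numeric input, trailing garbage, and zero.
  if (ec != std::errc() || ptr != last || value == 0)
    return kDefaultProbeSize;
  // Round down to a multiple of the stack alignment, but keep at least one
  // aligned unit so the probe interval is never zero.
  uint64_t aligned = value & ~(stackAlign - 1);
  return std::max(aligned, stackAlign);
}
```

Under this policy `"8192"` is accepted unchanged, `"100"` becomes 96 with a 16-byte alignment, and `"abc"` or `"0"` fall back to 4096.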

https://github.com/llvm/llvm-project/pull/66524

[llvm] [clang] Remove experimental from Vector Crypto extensions (PR #69000)

2023-11-21 Thread Eric Biggers via cfe-commits



@@ -141,6 +141,23 @@ on support follow.
  ``Zve64f``   Supported
  ``Zve64d``   Supported
  ``Zvfh`` Supported
+ ``Zvbb`` Supported
+ ``Zvbc`` Supported
+ ``Zvkb`` Supported
+ ``Zvkg`` Supported
+ ``Zvkn`` Supported
+ ``Zvkn`` Supported

ebiggers wrote:

Zvkn is in this list twice

https://github.com/llvm/llvm-project/pull/69000

[clang] [Driver] Simply some gcc search logic (PR #72558)

2023-11-21 Thread Tom Stellard via cfe-commits


https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/72558

>From 3a0896141cf11c604f28326b3a6eee3762b4f79d Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Thu, 16 Nov 2023 05:54:29 +
Subject: [PATCH 1/2] [Driver] Simply some gcc search logic

---
 clang/lib/Driver/ToolChains/Gnu.cpp | 18 +++---
 clang/lib/Driver/ToolChains/Gnu.h   |  1 -
 2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp b/clang/lib/Driver/ToolChains/Gnu.cpp
index 19dff4ec4d45e08..d92c0f7f8984758 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -2117,14 +2117,17 @@ void Generic_GCC::GCCInstallationDetector::init(
   // The compatible GCC triples for this particular architecture.
   SmallVector CandidateTripleAliases;
   SmallVector CandidateBiarchTripleAliases;
+  // Add some triples that we want to check first.
+  CandidateTripleAliases.push_back(TargetTriple.str());
+  std::string TripleNoVendor = TargetTriple.getArchName().str() + "-" +
+   TargetTriple.getOSAndEnvironmentName().str();
+  if (TargetTriple.getVendor() == llvm::Triple::UnknownVendor) {
+CandidateTripleAliases.push_back(TripleNoVendor);
+  }
   CollectLibDirsAndTriples(TargetTriple, BiarchVariantTriple, CandidateLibDirs,
CandidateTripleAliases, CandidateBiarchLibDirs,
CandidateBiarchTripleAliases);
 
-  TripleNoVendor = TargetTriple.getArchName().str() + "-" +
-   TargetTriple.getOSAndEnvironmentName().str();
-  StringRef TripleNoVendorRef(TripleNoVendor);
-
   // If --gcc-install-dir= is specified, skip filesystem detection.
   if (const Arg *A =
   Args.getLastArg(clang::driver::options::OPT_gcc_install_dir_EQ);
@@ -2204,13 +2207,6 @@ void Generic_GCC::GCCInstallationDetector::init(
   // Maybe filter out /gcc and /gcc-cross.
   bool GCCDirExists = VFS.exists(LibDir + "/gcc");
   bool GCCCrossDirExists = VFS.exists(LibDir + "/gcc-cross");
-  // Try to match the exact target triple first.
-  ScanLibDirForGCCTriple(TargetTriple, Args, LibDir, TargetTriple.str(),
- false, GCCDirExists, GCCCrossDirExists);
-  // If vendor is unknown, let's try triple without vendor.
-  if (TargetTriple.getVendor() == llvm::Triple::UnknownVendor)
-ScanLibDirForGCCTriple(TargetTriple, Args, LibDir, TripleNoVendorRef,
-   false, GCCDirExists, GCCCrossDirExists);
   for (StringRef Candidate : CandidateTripleAliases)
 ScanLibDirForGCCTriple(TargetTriple, Args, LibDir, Candidate, false,
GCCDirExists, GCCCrossDirExists);
diff --git a/clang/lib/Driver/ToolChains/Gnu.h b/clang/lib/Driver/ToolChains/Gnu.h
index dcfc6307cac79e5..0b664a182d75e1c 100644
--- a/clang/lib/Driver/ToolChains/Gnu.h
+++ b/clang/lib/Driver/ToolChains/Gnu.h
@@ -249,7 +249,6 @@ class LLVM_LIBRARY_VISIBILITY Generic_GCC : public ToolChain {
 void print(raw_ostream &OS) const;
 
   private:
-std::string TripleNoVendor;
 static void
 CollectLibDirsAndTriples(const llvm::Triple &TargetTriple,
  const llvm::Triple &BiarchTriple,

>From 994b7846d73b05ff8f5a41c56535e43e72f90220 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Wed, 22 Nov 2023 00:18:57 +
Subject: [PATCH 2/2] Fix formatting

---
 clang/lib/Driver/ToolChains/Gnu.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp b/clang/lib/Driver/ToolChains/Gnu.cpp
index d92c0f7f8984758..0ea2df2ca8d87e2 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -2121,9 +2121,9 @@ void Generic_GCC::GCCInstallationDetector::init(
   CandidateTripleAliases.push_back(TargetTriple.str());
   std::string TripleNoVendor = TargetTriple.getArchName().str() + "-" +
TargetTriple.getOSAndEnvironmentName().str();
-  if (TargetTriple.getVendor() == llvm::Triple::UnknownVendor) {
+  if (TargetTriple.getVendor() == llvm::Triple::UnknownVendor)
 CandidateTripleAliases.push_back(TripleNoVendor);
-  }
+
   CollectLibDirsAndTriples(TargetTriple, BiarchVariantTriple, CandidateLibDirs,
CandidateTripleAliases, CandidateBiarchLibDirs,
CandidateBiarchTripleAliases);


[llvm] [clang] Remove experimental from Vector Crypto extensions (PR #69000)

2023-11-21 Thread Eric Biggers via cfe-commits


ebiggers wrote:

FYI, I tried squashing the commits of this pull request and rebasing onto 
`main`, but there are conflicts in several files

https://github.com/llvm/llvm-project/pull/69000

[llvm] [clang] Remove experimental from Vector Crypto extensions (PR #69000)

2023-11-21 Thread Eric Biggers via cfe-commits


ebiggers wrote:

Hi!  What's the status of this pull request?  Are there any major issues 
remaining?  It would be very useful to have the LLVM assembler officially 
support the vector crypto extensions.  I've been reviewing the [RISC-V vector 
crypto accelerated crypto routines for the Linux 
kernel](https://lore.kernel.org/linux-crypto/20231025183644.8735-1-jerry.s...@sifive.com),
 and currently they're emitting the instructions as bare `.inst`s in order to 
not rely on the assembler, which is quite ugly.  The GNU assembler already 
released the support for this a few months ago, in v2.41.

FYI @JerryShih and @phoebesv

https://github.com/llvm/llvm-project/pull/69000

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #68932)

2023-11-21 Thread Jun Wang via cfe-commits



@@ -388,6 +388,8 @@ class SIInsertWaitcnts : public MachineFunctionPass {
   // message.
   DenseSet ReleaseVGPRInsts;
 
+  // bool insertWaitcntAfterMemOp(MachineFunction &MF);

jwanggit86 wrote:

Done.

https://github.com/llvm/llvm-project/pull/68932

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #68932)

2023-11-21 Thread Jun Wang via cfe-commits



@@ -1708,6 +1710,13 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
 }
 
 ++Iter;
+if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) {
+  auto builder =

jwanggit86 wrote:

Done.

https://github.com/llvm/llvm-project/pull/68932

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #68932)

2023-11-21 Thread Jun Wang via cfe-commits



@@ -1708,6 +1710,13 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
 }
 
 ++Iter;
+if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) {
+  auto builder =
+  BuildMI(Block, Iter, DebugLoc(), TII->get(AMDGPU::S_WAITCNT))
+  .addImm(0);

jwanggit86 wrote:

Done.
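
The quoted hunk inserts the wait only after the iterator has been advanced past the memory instruction. A minimal model of that pattern, assuming a `std::list` in place of a machine basic block and a hypothetical `Inst` record in place of `MachineInstr`, could look like this:

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for a machine instruction.
struct Inst {
  std::string name;
  bool mayLoadOrStore;
};

// Inserts a wait after every memory operation. The iterator is advanced
// first, so the new element lands between the memory op and its successor
// and is never revisited by the loop.
bool insertWaitAfterMemOps(std::list<Inst> &block) {
  bool modified = false;
  for (auto it = block.begin(); it != block.end();) {
    bool isMemOp = it->mayLoadOrStore;
    ++it;
    if (isMemOp) {
      block.insert(it, Inst{"s_waitcnt 0", false}); // inserts before `it`
      modified = true;
    }
  }
  return modified;
}
```

Because the inserted element is not itself a memory operation and sits before the current iterator position, the loop terminates without special-casing the new instructions.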

https://github.com/llvm/llvm-project/pull/68932

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #68932)

2023-11-21 Thread Jun Wang via cfe-commits



@@ -0,0 +1,222 @@
+; Testing the -amdgpu-precise-memory-op option
+; COM: llc -mtriple=amdgcn -mcpu=hawaii -mattr=+amdgpu-precise-memory-op -verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX7

jwanggit86 wrote:

Comment. Some testcases in this file won't run if mcpu=hawaii. In the latest 
commit, the test file has been split into 2.

https://github.com/llvm/llvm-project/pull/68932

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #68932)

2023-11-21 Thread Jun Wang via cfe-commits


https://github.com/jwanggit86 updated 
https://github.com/llvm/llvm-project/pull/68932

>From e393477607cb94b45a3b9a5db2aea98fb8af2a86 Mon Sep 17 00:00:00 2001
From: Jun Wang 
Date: Thu, 12 Oct 2023 16:45:59 -0500
Subject: [PATCH 1/9] [AMDGPU] Emit a waitcnt instruction after each memory
 instruction

This patch implements a new command-line option for the backend, namely,
amdgpu-waitcnt-for-all-mem-op. When this option is specified, a "waitcnt 0"
instruction is generated after each memory load/store instruction.
---
 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp   |  30 ++-
 .../CodeGen/AMDGPU/insert_waitcnt_for_all.ll  | 222 ++
 2 files changed, 251 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_all.ll

diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index ede4841b8a5fd7d..728be7c61fa2217 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -52,6 +52,10 @@ static cl::opt ForceEmitZeroFlag(
   cl::desc("Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)"),
   cl::init(false), cl::Hidden);
 
+static cl::opt EmitForAllMemOpFlag(
+"amdgpu-waitcnt-for-all-mem-op",
+cl::desc("Emit s_waitcnt 0 after each memory operation"), cl::init(false));
+
 namespace {
 // Class of object that encapsulates latest instruction counter score
 // associated with the operand.  Used for determining whether
@@ -388,6 +392,8 @@ class SIInsertWaitcnts : public MachineFunctionPass {
   // message.
   DenseSet ReleaseVGPRInsts;
 
+  bool insertWaitcntAfterMemOp(MachineFunction &MF);
+
 public:
   static char ID;
 
@@ -1809,6 +1815,23 @@ bool SIInsertWaitcnts::shouldFlushVmCnt(MachineLoop *ML,
   return HasVMemLoad && UsesVgprLoadedOutside;
 }
 
+bool SIInsertWaitcnts::insertWaitcntAfterMemOp(MachineFunction &MF) {
+  bool Modified = false;
+
+  for (auto &MBB : MF) {
+for (auto It = MBB.begin(); It != MBB.end();) {
+  bool IsMemOp = It->mayLoadOrStore();
+  ++It;
+  if (IsMemOp) {
+BuildMI(MBB, It, DebugLoc(), TII->get(AMDGPU::S_WAITCNT)).addImm(0);
+Modified = true;
+  }
+}
+  }
+
+  return Modified;
+}
+
 bool SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {
   ST = &MF.getSubtarget();
   TII = ST->getInstrInfo();
@@ -1819,6 +1842,12 @@ bool SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {
   MLI = &getAnalysis();
   PDT = &getAnalysis();
 
+  bool Modified = false;
+
+  if (EmitForAllMemOpFlag) {
+Modified = insertWaitcntAfterMemOp(MF);
+  }
+
   ForceEmitZeroWaitcnts = ForceEmitZeroFlag;
   for (auto T : inst_counter_types())
 ForceEmitWaitcnt[T] = false;
@@ -1847,7 +1876,6 @@ bool SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {
 
   TrackedWaitcntSet.clear();
   BlockInfos.clear();
-  bool Modified = false;
 
   if (!MFI->isEntryFunction()) {
 // Wait for any outstanding memory operations that the input registers may
diff --git a/llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_all.ll b/llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_all.ll
new file mode 100644
index 000..4580b9074ada3cc
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/insert_waitcnt_for_all.ll
@@ -0,0 +1,222 @@
+; Testing the -amdgpu-waitcnt-for-all-mem-op option
+; COM: llc -mtriple=amdgcn -mcpu=hawaii -amdgpu-waitcnt-for-all-mem-op -verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX7
+; COM: llc -mtriple=amdgcn -mcpu=tonga -amdgpu-waitcnt-for-all-mem-op -verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX8
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -amdgpu-waitcnt-for-all-mem-op -verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -amdgpu-waitcnt-for-all-mem-op -verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX90A
+; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -amdgpu-waitcnt-for-all-mem-op -verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX10
+; RUN: llc -mtriple=amdgcn-- -mcpu=gfx900 -mattr=-flat-for-global,+enable-flat-scratch -amdgpu-use-divergent-register-indexing -amdgpu-waitcnt-for-all-mem-op -verify-machineinstrs < %s | FileCheck --check-prefixes=GFX9-FLATSCR %s
+
+; from atomicrmw-expand.ll
+; covers flat_load, flat_atomic
+define void @syncscope_workgroup_nortn(ptr %addr, float %val) {
+; GFX90A-LABEL: syncscope_workgroup_nortn:
+; GFX90A:  ; %bb.0:
+; GFX90A: flat_load_dword v5, v[0:1]
+; GFX90A-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX90A:  .LBB0_1: ; %atomicrmw.start
+; GFX90A: flat_atomic_cmpswap v3, v[0:1], v[4:5] glc
+; GFX90A-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+  %res = atomicrmw fadd ptr %addr, float %val syncscope("workgroup") seq_cst
+  ret void
+}
+
+; from atomicrmw-nand.ll
+; covers global_atomic, global_load
+define i32 @atomic_nand_i32_global(ptr addrspace(1) %ptr) nounwind {
+; GFX9-LABEL: atomic_na

[clang] Supports viewing class member variables in lambda when using the vs debugger (PR #71564)

2023-11-21 Thread David Blaikie via cfe-commits


https://github.com/dwblaikie closed 
https://github.com/llvm/llvm-project/pull/71564

[clang] 7c3c243 - Supports viewing class member variables in lambda when using the vs debugger (#71564)

2023-11-21 Thread via cfe-commits


Author: GkvJwa
Date: 2023-11-21T15:33:16-08:00
New Revision: 7c3c243c9bf80377fcad6c7699dc9aaedd650a18

URL: 
https://github.com/llvm/llvm-project/commit/7c3c243c9bf80377fcad6c7699dc9aaedd650a18
DIFF: 
https://github.com/llvm/llvm-project/commit/7c3c243c9bf80377fcad6c7699dc9aaedd650a18.diff

LOG: Supports viewing class member variables in lambda when using the vs 
debugger (#71564)

Use "__this" in DataMemberRecord so that the VS debugger can parse it normally

Fixes #71562

Added: 
clang/test/CodeGenCXX/debug-info-lambda-this.cpp

Modified: 
clang/lib/CodeGen/CGDebugInfo.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CGDebugInfo.cpp b/clang/lib/CodeGen/CGDebugInfo.cpp
index 0b52d99ad07f164..3b4932cc4a30f6b 100644
--- a/clang/lib/CodeGen/CGDebugInfo.cpp
+++ b/clang/lib/CodeGen/CGDebugInfo.cpp
@@ -1657,8 +1657,10 @@ void CGDebugInfo::CollectRecordLambdaFields(
   FieldDecl *f = *Field;
   llvm::DIFile *VUnit = getOrCreateFile(f->getLocation());
   QualType type = f->getType();
+  StringRef ThisName =
+  CGM.getCodeGenOpts().EmitCodeView ? "__this" : "this";
   llvm::DIType *fieldType = createFieldType(
-  "this", type, f->getLocation(), f->getAccess(),
+  ThisName, type, f->getLocation(), f->getAccess(),
   layout.getFieldOffset(fieldno), VUnit, RecordTy, CXXDecl);
 
   elements.push_back(fieldType);

diff  --git a/clang/test/CodeGenCXX/debug-info-lambda-this.cpp b/clang/test/CodeGenCXX/debug-info-lambda-this.cpp
new file mode 100644
index 000..0a2f08ea4aa6d8e
--- /dev/null
+++ b/clang/test/CodeGenCXX/debug-info-lambda-this.cpp
@@ -0,0 +1,27 @@
+// RUN: %clang_cc1 %s -std=c++11 -triple=x86_64-pc-windows-msvc -debug-info-kind=limited -gcodeview -emit-llvm -o - | FileCheck %s
+
+class Foo {
+ public:
+  void foo() {
+int aa = 2;
+auto f = [=] {
+  int aaa = a + aa;
+};
+f();
+  }
+
+ private:
+  int a = 1;
+};
+
+int main() {
+  Foo f;
+  f.foo();
+
+  return 0;
+}
+
+// CHECK: !{![[FOO_THIS:[0-9]+]], ![[FOO_AA:[0-9]+]], ![[FOO_OPERATOR:[0-9]+]]}
+// CHECK-NEXT: ![[FOO_THIS]] = !DIDerivedType(tag: DW_TAG_member, name: "__this", scope: ![[#]], file: ![[#]], line: [[#]], baseType: ![[#]], size: [[#]])
+// CHECK-NEXT: ![[FOO_AA]] = !DIDerivedType(tag: DW_TAG_member, name: "aa", scope: ![[#]], file: ![[#]], line: [[#]], baseType: ![[#]], size: [[#]], offset: [[#]])
+// CHECK-NEXT: ![[FOO_OPERATOR]] = !DISubprogram(name: "operator()", scope: ![[#]], file: ![[#]], line: [[#]], type: ![[#]], scopeLine: [[#]], flags: DIFlagPublic | DIFlagPrototyped, spFlags: 0)




[clang] [llvm] [AArch64] Stack probing for dynamic allocas in SelectionDAG (PR #66525)

2023-11-21 Thread Eli Friedman via cfe-commits



@@ -0,0 +1,363 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple aarch64-none-eabi < %s -verify-machineinstrs | FileCheck %s
+
+; Dynamically-sized allocation, needs a loop which can handle any size at
+; runtime. The final iteration of the loop will temporarily put SP below the
+; target address, but this doesn't break any of the ABI constraints on the
+; stack, and also doesn't probe below the target SP value.
+define void @dynamic(i64 %size, ptr %out) #0 {
+; CHECK-LABEL: dynamic:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
+; CHECK-NEXT:.cfi_def_cfa_offset 16
+; CHECK-NEXT:mov x29, sp
+; CHECK-NEXT:.cfi_def_cfa w29, 16
+; CHECK-NEXT:.cfi_offset w30, -8
+; CHECK-NEXT:.cfi_offset w29, -16
+; CHECK-NEXT:add x9, x0, #15
+; CHECK-NEXT:mov x8, sp
+; CHECK-NEXT:and x9, x9, #0xfff0
+; CHECK-NEXT:sub x8, x8, x9
+; CHECK-NEXT:  .LBB0_1: // =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:sub sp, sp, #1, lsl #12 // =4096
+; CHECK-NEXT:cmp sp, x8
+; CHECK-NEXT:b.le .LBB0_3
+; CHECK-NEXT:  // %bb.2: // in Loop: Header=BB0_1 Depth=1
+; CHECK-NEXT:str xzr, [sp]
+; CHECK-NEXT:b .LBB0_1
+; CHECK-NEXT:  .LBB0_3:
+; CHECK-NEXT:mov sp, x8
+; CHECK-NEXT:str xzr, [sp]
+; CHECK-NEXT:str x8, [x1]
+; CHECK-NEXT:mov sp, x29
+; CHECK-NEXT:.cfi_def_cfa wsp, 16
+; CHECK-NEXT:ldp x29, x30, [sp], #16 // 16-byte Folded Reload
+; CHECK-NEXT:.cfi_def_cfa_offset 0
+; CHECK-NEXT:.cfi_restore w30
+; CHECK-NEXT:.cfi_restore w29
+; CHECK-NEXT:ret
+  %v = alloca i8, i64 %size, align 1
+  store ptr %v, ptr %out, align 8
+  ret void
+}
+
+; This function has a fixed-size stack slot and a dynamic one. The fixed size
+; slot isn't large enough that we would normally probe it, but we need to do so
+; here otherwise the gap between the CSR save and the first probe of the
+; dynamic allocation could be too far apart when the size of the dynamic
+; allocation is close to the guard size.
+define void @dynamic_fixed(i64 %size, ptr %out1, ptr %out2) #0 {
+; CHECK-LABEL: dynamic_fixed:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
+; CHECK-NEXT:.cfi_def_cfa_offset 16
+; CHECK-NEXT:mov x29, sp
+; CHECK-NEXT:.cfi_def_cfa w29, 16
+; CHECK-NEXT:.cfi_offset w30, -8
+; CHECK-NEXT:.cfi_offset w29, -16
+; CHECK-NEXT:str xzr, [sp, #-64]!
+; CHECK-NEXT:add x9, x0, #15
+; CHECK-NEXT:mov x8, sp
+; CHECK-NEXT:sub x10, x29, #64
+; CHECK-NEXT:and x9, x9, #0xfff0
+; CHECK-NEXT:str x10, [x1]
+; CHECK-NEXT:sub x8, x8, x9
+; CHECK-NEXT:  .LBB1_1: // =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:sub sp, sp, #1, lsl #12 // =4096
+; CHECK-NEXT:cmp sp, x8
+; CHECK-NEXT:b.le .LBB1_3
+; CHECK-NEXT:  // %bb.2: // in Loop: Header=BB1_1 Depth=1
+; CHECK-NEXT:str xzr, [sp]
+; CHECK-NEXT:b .LBB1_1
+; CHECK-NEXT:  .LBB1_3:
+; CHECK-NEXT:mov sp, x8
+; CHECK-NEXT:str xzr, [sp]
+; CHECK-NEXT:str x8, [x2]
+; CHECK-NEXT:mov sp, x29
+; CHECK-NEXT:.cfi_def_cfa wsp, 16
+; CHECK-NEXT:ldp x29, x30, [sp], #16 // 16-byte Folded Reload
+; CHECK-NEXT:.cfi_def_cfa_offset 0
+; CHECK-NEXT:.cfi_restore w30
+; CHECK-NEXT:.cfi_restore w29
+; CHECK-NEXT:ret
+  %v1 = alloca i8, i64 64, align 1
+  store ptr %v1, ptr %out1, align 8
+  %v2 = alloca i8, i64 %size, align 1
+  store ptr %v2, ptr %out2, align 8
+  ret void
+}
+
+; Dynamic allocation, with an alignment requirement greater than the alignment
+; of SP. Done by ANDing the target SP with a constant to align it down, then
+; doing the loop as normal. Note that we also re-align the stack in the prolog,
+; which isn't actually needed because the only aligned allocations are dynamic,
+; this is done even without stack probing.
+define void @dynamic_align_64(i64 %size, ptr %out) #0 {
+; CHECK-LABEL: dynamic_align_64:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:stp x29, x30, [sp, #-32]! // 16-byte Folded Spill
+; CHECK-NEXT:.cfi_def_cfa_offset 32
+; CHECK-NEXT:str x19, [sp, #16] // 8-byte Folded Spill
+; CHECK-NEXT:mov x29, sp
+; CHECK-NEXT:.cfi_def_cfa w29, 32
+; CHECK-NEXT:.cfi_offset w19, -16
+; CHECK-NEXT:.cfi_offset w30, -24
+; CHECK-NEXT:.cfi_offset w29, -32
+; CHECK-NEXT:sub x9, sp, #32
+; CHECK-NEXT:and sp, x9, #0xffc0
+; CHECK-NEXT:add x9, x0, #15
+; CHECK-NEXT:mov x8, sp
+; CHECK-NEXT:str xzr, [sp]
+; CHECK-NEXT:and x9, x9, #0xfff0
+; CHECK-NEXT:mov x19, sp
+; CHECK-NEXT:sub x8, x8, x9
+; CHECK-NEXT:and x8, x8, #0xffc0
+; CHECK-NEXT:  .LBB2_1: // =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:sub sp, sp, #1, lsl #12 // =4096
+; CHECK-NEXT:cmp sp, x8
+; CHECK-NEXT:b.le .LBB2_3
+; CHECK-NEXT:  // %bb.2: // in Loop: Header=BB2_1 Depth=1
+; CHECK-NE

[clang] [clang][AArch64] Pass down stack clash protection options to LLVM/Backend (PR #68993)

2023-11-21 Thread Eli Friedman via cfe-commits



@@ -1076,6 +1076,16 @@ void CodeGenModule::Release() {
 "sign-return-address-with-bkey", 1);
   }
 
+  if (Arch == llvm::Triple::aarch64 || Arch == llvm::Triple::aarch64_be) {

efriedma-quic wrote:

Module-level attributes tend to lead to issues with LTO.  I mean, for 
compiler-generated functions, there's no obvious choice, so we just kind of 
have to pick something.  But for other functions, I think it makes sense to try 
to respect what the user specified for each input module.  That would suggest 
we should emit both a module attribute and a function attribute.

https://github.com/llvm/llvm-project/pull/68993

[clang] [llvm] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Eli Friedman via cfe-commits



@@ -1076,6 +1076,16 @@ void CodeGenModule::Release() {
 "sign-return-address-with-bkey", 1);
   }
 
+  if (Arch == llvm::Triple::aarch64 || Arch == llvm::Triple::aarch64_be) {
+auto *InlineAsm = llvm::MDString::get(TheModule.getContext(), "inline-asm");
+if (CodeGenOpts.StackClashProtector)
+  getModule().addModuleFlag(llvm::Module::Override, "probe-stack",
+InlineAsm);

efriedma-quic wrote:

Any reply here?

https://github.com/llvm/llvm-project/pull/66524

[clang] [llvm] [AArch64] Stack probing for function prologues (PR #66524)

2023-11-21 Thread Eli Friedman via cfe-commits



@@ -0,0 +1,722 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple aarch64-none-eabi < %s -verify-machineinstrs | FileCheck %s
+; RUN: llc -mtriple aarch64-none-eabi < %s -verify-machineinstrs -global-isel -global-isel-abort=2 | FileCheck %s
+
+; Test prolog sequences for stack probing when SVE objects are involved.
+
+; The space for SVE objects needs probing in the general case, because
+; the stack adjustment may happen to be too big (i.e. greater than the
+; probe size) to allocate with a single `addvl`.
+; When we do know that the stack adjustment cannot exceed the probe size
+; we can avoid emitting a probe loop and emit a simple `addvl; str`
+; sequence instead.
+
+define void @sve_1_vector(ptr %out) #0 {
+; CHECK-LABEL: sve_1_vector:
+; CHECK:   // %bb.0: // %entry
+; CHECK-NEXT:str x29, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:.cfi_def_cfa_offset 16
+; CHECK-NEXT:.cfi_offset w29, -16
+; CHECK-NEXT:addvl sp, sp, #-1
+; CHECK-NEXT:.cfi_escape 0x0f, 0x0c, 0x8f, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0x2e, 0x00, 0x1e, 0x22 // sp + 16 + 8 * VG
+; CHECK-NEXT:addvl sp, sp, #1
+; CHECK-NEXT:.cfi_def_cfa wsp, 16
+; CHECK-NEXT:ldr x29, [sp], #16 // 8-byte Folded Reload
+; CHECK-NEXT:.cfi_def_cfa_offset 0
+; CHECK-NEXT:.cfi_restore w29
+; CHECK-NEXT:ret
+entry:
+  %vec = alloca , align 16
+  ret void
+}
+
+; As above, but with 4 SVE vectors of stack space.
+define void @sve_4_vector(ptr %out) #0 {
+; CHECK-LABEL: sve_4_vector:
+; CHECK:   // %bb.0: // %entry
+; CHECK-NEXT:str x29, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:.cfi_def_cfa_offset 16
+; CHECK-NEXT:.cfi_offset w29, -16
+; CHECK-NEXT:addvl sp, sp, #-4
+; CHECK-NEXT:.cfi_escape 0x0f, 0x0c, 0x8f, 0x00, 0x11, 0x10, 0x22, 0x11, 
0x20, 0x92, 0x2e, 0x00, 0x1e, 0x22 // sp + 16 + 32 * VG
+; CHECK-NEXT:addvl sp, sp, #4
+; CHECK-NEXT:.cfi_def_cfa wsp, 16
+; CHECK-NEXT:ldr x29, [sp], #16 // 8-byte Folded Reload
+; CHECK-NEXT:.cfi_def_cfa_offset 0
+; CHECK-NEXT:.cfi_restore w29
+; CHECK-NEXT:ret
+entry:
+  %vec1 = alloca , align 16
+  %vec2 = alloca , align 16
+  %vec3 = alloca , align 16
+  %vec4 = alloca , align 16
+  ret void
+}
+
+; As above, but with 16 SVE vectors of stack space.
+; The stack adjustment is less than or equal to 16 x 256 = 4096, so
+; we can allocate the locals at once.
+define void @sve_16_vector(ptr %out) #0 {
+; CHECK-LABEL: sve_16_vector:
+; CHECK:   // %bb.0: // %entry
+; CHECK-NEXT:str x29, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:.cfi_def_cfa_offset 16
+; CHECK-NEXT:.cfi_offset w29, -16
+; CHECK-NEXT:addvl sp, sp, #-16
+; CHECK-NEXT:.cfi_escape 0x0f, 0x0d, 0x8f, 0x00, 0x11, 0x10, 0x22, 0x11, 
0x80, 0x01, 0x92, 0x2e, 0x00, 0x1e, 0x22 // sp + 16 + 128 * VG
+; CHECK-NEXT:str xzr, [sp]
+; CHECK-NEXT:addvl sp, sp, #16
+; CHECK-NEXT:.cfi_def_cfa wsp, 16
+; CHECK-NEXT:ldr x29, [sp], #16 // 8-byte Folded Reload
+; CHECK-NEXT:.cfi_def_cfa_offset 0
+; CHECK-NEXT:.cfi_restore w29
+; CHECK-NEXT:ret
+entry:
+  %vec1 = alloca , align 16
+  %vec2 = alloca , align 16
+  %vec3 = alloca , align 16
+  %vec4 = alloca , align 16
+  %vec5 = alloca , align 16
+  %vec6 = alloca , align 16
+  %vec7 = alloca , align 16
+  %vec8 = alloca , align 16
+  %vec9 = alloca , align 16
+  %vec10 = alloca , align 16
+  %vec11 = alloca , align 16
+  %vec12 = alloca , align 16
+  %vec13 = alloca , align 16
+  %vec14 = alloca , align 16
+  %vec15 = alloca , align 16
+  %vec16 = alloca , align 16
+  ret void
+}
+
+; As above, but with 17 SVE vectors of stack space. Now we need
+; a probing loop since the stack adjustment may be greater than
+; the probe size (17 x 256 = 4352 bytes)

efriedma-quic wrote:

Maybe worth noting in a comment that in this specific case, we could consider 
splitting the allocation into two allocations that are known to be less than 
4096 bytes (`addvl sp, sp, #-16; str xzr, [sp]; addvl sp, sp, #-1; str xzr, 
[sp]`).
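
The arithmetic behind this suggestion can be sketched as follows. This is a minimal illustration, not code from the patch: it assumes the architectural maximum SVE vector length of 2048 bits (256 bytes per `addvl` unit) and the 4096-byte probe size mentioned in the test comments, and the helper names `maxSVEBytes`, `fitsInOneProbe`, and `numProbedChunks` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumptions for illustration only: the largest architectural SVE vector
// is 2048 bits = 256 bytes, and the stack probe size is 4096 bytes.
constexpr std::int64_t MaxSVEVectorBytes = 256;
constexpr std::int64_t ProbeSize = 4096;

// Worst-case size in bytes of an allocation of NumVectors SVE vectors.
constexpr std::int64_t maxSVEBytes(std::int64_t NumVectors) {
  return NumVectors * MaxSVEVectorBytes;
}

// True if the allocation can never exceed the probe size, so a single
// `addvl; str` pair suffices and no probing loop is needed.
constexpr bool fitsInOneProbe(std::int64_t NumVectors) {
  return maxSVEBytes(NumVectors) <= ProbeSize;
}

// Number of `addvl; str` pairs needed if the allocation is split into
// chunks that each stay within the probe size.
inline std::int64_t numProbedChunks(std::int64_t NumVectors) {
  assert(NumVectors >= 0 && "vector count must be non-negative");
  const std::int64_t VectorsPerChunk = ProbeSize / MaxSVEVectorBytes;
  return (NumVectors + VectorsPerChunk - 1) / VectorsPerChunk;
}

static_assert(fitsInOneProbe(16), "16 x 256 = 4096 fits in one probe");
static_assert(!fitsInOneProbe(17), "17 x 256 = 4352 exceeds the probe size");
```

Under these assumptions, 17 vectors split into a chunk of 16 and a chunk of 1, each of which is known to stay within the probe size, which is the two-step sequence described above.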

https://github.com/llvm/llvm-project/pull/66524

[clang] [clang] Avoid memcopy for small structure with padding under -ftrivial-auto-var-init (PR #71677)

2023-11-21 Thread Eli Friedman via cfe-commits


https://github.com/efriedma-quic approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/71677

[clang] fix: compatible C++ empty record with align UB with gcc (PR #72197)

2023-11-21 Thread Eli Friedman via cfe-commits


https://github.com/efriedma-quic edited 
https://github.com/llvm/llvm-project/pull/72197

[clang] fix: compatible C++ empty record with align UB with gcc (PR #72197)

2023-11-21 Thread Eli Friedman via cfe-commits



@@ -307,7 +307,12 @@ AArch64ABIInfo::classifyArgumentType(QualType Ty, bool 
IsVariadic,
 // 0.
 if (IsEmpty && Size == 0)
   return ABIArgInfo::getIgnore();
-return ABIArgInfo::getDirect(llvm::Type::getInt8Ty(getVMContext()));
+// An empty struct can have size greater than one byte if alignment is
+// involved.
+// When size <= 64, we still hold it by i8 in IR and lowering to registers.
+// When Size > 64, just fall through to avoid va_list out of sync.

efriedma-quic wrote:

"to avoid va_list out of sync" doesn't really make sense; the important thing 
is that this follows the ABI rule.

Maybe take another look at the suggested comment in
https://github.com/llvm/llvm-project/pull/72197#issuecomment-1815284976 .
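
For readers unfamiliar with the case under discussion, here is a small sketch of how an empty C++ record can end up larger than one byte. It is only an illustration of the language rule (size is a multiple of alignment), not of the AArch64 argument-passing rule itself; the type and function names are hypothetical, and the `sizeof(Empty) == 1` check reflects mainstream ABIs rather than a guarantee of the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A plain empty class: size 1 on mainstream ABIs, so that distinct
// objects get distinct addresses.
struct Empty {};

// An empty class with an alignment requirement: its size is rounded up
// to a multiple of the alignment, so it occupies 16 bytes.
struct alignas(16) AlignedEmpty {};

static_assert(sizeof(Empty) == 1, "plain empty class has size 1");
static_assert(alignof(AlignedEmpty) == 16, "alignment follows alignas");
static_assert(sizeof(AlignedEmpty) == 16, "size is a multiple of alignment");

// Returns the size of the aligned empty record after checking that a
// local object of that type really is 16-byte aligned.
inline std::size_t alignedEmptySize() {
  AlignedEmpty Obj;
  assert(reinterpret_cast<std::uintptr_t>(&Obj) % 16 == 0);
  return sizeof(Obj);
}
```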

https://github.com/llvm/llvm-project/pull/72197

[clang] Refactor ASTContext::getDeclAlign() (NFC) (PR #72977)

2023-11-21 Thread Eli Friedman via cfe-commits



@@ -1627,28 +1627,20 @@ const llvm::fltSemantics 
&ASTContext::getFloatTypeSemantics(QualType T) const {
 CharUnits ASTContext::getDeclAlign(const Decl *D, bool ForAlignof) const {
   unsigned Align = Target->getCharWidth();
 
-  bool UseAlignAttrOnly = false;
-  if (unsigned AlignFromAttr = D->getMaxAlignment()) {
+  const unsigned AlignFromAttr = D->getMaxAlignment();
+  if (AlignFromAttr)
     Align = AlignFromAttr;
 
-    // __attribute__((aligned)) can increase or decrease alignment
-    // *except* on a struct or struct member, where it only increases
-    // alignment unless 'packed' is also specified.
-    //
-    // It is an error for alignas to decrease alignment, so we can
-    // ignore that possibility;  Sema should diagnose it.
-    if (isa<FieldDecl>(D)) {
-      UseAlignAttrOnly = D->hasAttr<PackedAttr>() ||
-        cast<FieldDecl>(D)->getParent()->hasAttr<PackedAttr>();
-    } else {
-      UseAlignAttrOnly = true;
-    }
-  }
-  else if (isa<FieldDecl>(D))
-      UseAlignAttrOnly =
-        D->hasAttr<PackedAttr>() ||
-        cast<FieldDecl>(D)->getParent()->hasAttr<PackedAttr>();
-
+  // __attribute__((aligned)) can increase or decrease alignment
+  // *except* on a struct or struct member, where it only increases
+  // alignment unless 'packed' is also specified.
+  //
+  // It is an error for alignas to decrease alignment, so we can
+  // ignore that possibility;  Sema should diagnose it.
+  bool IsPackedField = isa<FieldDecl>(D) &&
+                       (D->hasAttr<PackedAttr>() ||
+                        cast<FieldDecl>(D)->getParent()->hasAttr<PackedAttr>());
+  bool UseAlignAttrOnly = isa<FieldDecl>(D) ? IsPackedField : AlignFromAttr;

efriedma-quic wrote:

That's a lot of `isa<>` and `cast<>`. I think I'd prefer something more like:

```
bool UseAlignAttrOnly;
if (const auto *FD = dyn_cast<FieldDecl>(D)) {
  UseAlignAttrOnly = FD->hasAttr<PackedAttr>() ||
                     FD->getParent()->hasAttr<PackedAttr>();
} else {
  UseAlignAttrOnly = AlignFromAttr != 0;
}
```
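
The same test-and-cast-once pattern can be tried outside of clang with a small self-contained sketch. Everything here is a hypothetical stand-in: the `Decl`, `RecordDecl`, and `FieldDecl` structs only mimic the shape of the real classes, a `Packed` flag replaces the `hasAttr` query, and standard `dynamic_cast` is used in place of LLVM's `dyn_cast`.

```cpp
#include <cassert>

// Hypothetical stand-ins for clang's Decl hierarchy; only what the
// sketch needs. A plain flag replaces the packed-attribute query.
struct Decl {
  bool Packed = false;
  virtual ~Decl() = default;
};
struct RecordDecl : Decl {};
struct FieldDecl : Decl {
  const RecordDecl *Parent = nullptr;
};

// Mirrors the suggested control flow: test and cast once, then use FD
// for every field-specific query instead of repeating isa/cast.
inline bool useAlignAttrOnly(const Decl *D, unsigned AlignFromAttr) {
  assert(D && "declaration must not be null");
  if (const auto *FD = dynamic_cast<const FieldDecl *>(D))
    return FD->Packed || (FD->Parent && FD->Parent->Packed);
  return AlignFromAttr != 0;
}
```

In this sketch a field only reports `true` when it or its parent record is packed, while any other declaration reports `true` exactly when an alignment attribute supplied a nonzero value.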

https://github.com/llvm/llvm-project/pull/72977

[lldb] [compiler-rt] [clang-tools-extra] [clang] [openmp] [llvm] [libcxx] [flang] [OpenMP] Add memory diff dump for kernel record-replay (PR #70667)

2023-11-21 Thread Giorgis Georgakoudis via cfe-commits


https://github.com/ggeorgakoudis approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/70667

[clang] Allow multiple sanitizers on baremetal targets. (PR #72933)

2023-11-21 Thread Evgenii Stepanov via cfe-commits


https://github.com/eugenis closed 
https://github.com/llvm/llvm-project/pull/72933

[clang] fb57f4e - Allow multiple sanitizers on baremetal targets. (#72933)

2023-11-21 Thread via cfe-commits


Author: Evgenii Stepanov
Date: 2023-11-21T13:11:12-08:00
New Revision: fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286

URL: 
https://github.com/llvm/llvm-project/commit/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286
DIFF: 
https://github.com/llvm/llvm-project/commit/fb57f4e0e0b302ec1b3181e952a4bd4b3c57a286.diff

LOG: Allow multiple sanitizers on baremetal targets. (#72933)

Baremetal targets tend to implement their own runtime support for
sanitizers. Clang driver gatekeeping of allowed sanitizer types is
counterproductive.

This change allows anything that does not crash and burn in compilation,
and leaves any potential runtime issues for the user to figure out.
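
The approach described here, where the toolchain advertises a permissive set of sanitizers and the driver rejects only what falls outside it, can be sketched as a bitmask. This is a simplified model, not clang's API: the `SanitizerBit` flags, the `Arch` enum, and the functions `supportedSanitizers` and `isAllowed` are hypothetical, and clang's real `SanitizerMask` covers far more kinds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit flags standing in for a few sanitizer kinds.
enum SanitizerBit : std::uint32_t {
  Address = 1u << 0,
  Thread = 1u << 1,
  HWAddress = 1u << 2,
};

// Hypothetical subset of target architectures.
enum class Arch { X86_64, AArch64, RISCV64, ARM };

// Start from a base mask and OR in everything the compiler can generate
// code for; some kinds are only added on selected 64-bit targets.
inline std::uint32_t supportedSanitizers(Arch A, std::uint32_t Base) {
  std::uint32_t Res = Base;
  Res |= Address;
  Res |= Thread;
  if (A == Arch::X86_64 || A == Arch::AArch64 || A == Arch::RISCV64)
    Res |= HWAddress;
  return Res;
}

// A request is accepted only if every requested bit is in the supported set.
inline bool isAllowed(std::uint32_t Requested, std::uint32_t Supported) {
  return (Requested & ~Supported) == 0;
}
```

With this model, a request for `Address | Thread` is accepted on every architecture, while `HWAddress` is accepted only on the three 64-bit targets.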

Added: 


Modified: 
clang/lib/Driver/ToolChains/BareMetal.cpp
clang/lib/Driver/ToolChains/BareMetal.h
clang/test/Driver/fsanitize.c

Removed: 




diff  --git a/clang/lib/Driver/ToolChains/BareMetal.cpp 
b/clang/lib/Driver/ToolChains/BareMetal.cpp
index 842061c1e1488b0..42c8336e626c7b5 100644
--- a/clang/lib/Driver/ToolChains/BareMetal.cpp
+++ b/clang/lib/Driver/ToolChains/BareMetal.cpp
@@ -491,3 +491,29 @@ void baremetal::Linker::ConstructJob(Compilation &C, const 
JobAction &JA,
   JA, *this, ResponseFileSupport::AtFileCurCP(),
   Args.MakeArgString(TC.GetLinkerPath()), CmdArgs, Inputs, Output));
 }
+
+// BareMetal toolchain allows all sanitizers where the compiler generates valid
+// code, ignoring all runtime library support issues on the assumption that
+// baremetal targets typically implement their own runtime support.
+SanitizerMask BareMetal::getSupportedSanitizers() const {
+  const bool IsX86_64 = getTriple().getArch() == llvm::Triple::x86_64;
+  const bool IsAArch64 = getTriple().getArch() == llvm::Triple::aarch64 ||
+ getTriple().getArch() == llvm::Triple::aarch64_be;
+  const bool IsRISCV64 = getTriple().getArch() == llvm::Triple::riscv64;
+  SanitizerMask Res = ToolChain::getSupportedSanitizers();
+  Res |= SanitizerKind::Address;
+  Res |= SanitizerKind::KernelAddress;
+  Res |= SanitizerKind::PointerCompare;
+  Res |= SanitizerKind::PointerSubtract;
+  Res |= SanitizerKind::Fuzzer;
+  Res |= SanitizerKind::FuzzerNoLink;
+  Res |= SanitizerKind::Vptr;
+  Res |= SanitizerKind::SafeStack;
+  Res |= SanitizerKind::Thread;
+  Res |= SanitizerKind::Scudo;
+  if (IsX86_64 || IsAArch64 || IsRISCV64) {
+Res |= SanitizerKind::HWAddress;
+Res |= SanitizerKind::KernelHWAddress;
+  }
+  return Res;
+}

diff  --git a/clang/lib/Driver/ToolChains/BareMetal.h 
b/clang/lib/Driver/ToolChains/BareMetal.h
index f602ef2be3542fb..67b5aa5998fc3da 100644
--- a/clang/lib/Driver/ToolChains/BareMetal.h
+++ b/clang/lib/Driver/ToolChains/BareMetal.h
@@ -72,6 +72,7 @@ class LLVM_LIBRARY_VISIBILITY BareMetal : public ToolChain {
   void AddLinkRuntimeLib(const llvm::opt::ArgList &Args,
  llvm::opt::ArgStringList &CmdArgs) const;
   std::string computeSysRoot() const override;
+  SanitizerMask getSupportedSanitizers() const override;
 
 private:
   using OrderedMultilibs =

diff  --git a/clang/test/Driver/fsanitize.c b/clang/test/Driver/fsanitize.c
index 9eb800b0d9e2c7f..84a8e2b6b203dd8 100644
--- a/clang/test/Driver/fsanitize.c
+++ b/clang/test/Driver/fsanitize.c
@@ -973,11 +973,58 @@
 // RUN: not %clang --target=x86_64-sie-ps5 -fsanitize=kcfi %s -### 2>&1 | 
FileCheck %s --check-prefix=CHECK-UBSAN-KCFI
 // RUN: not %clang --target=x86_64-sie-ps5 -fsanitize=function -fsanitize=kcfi 
%s -### 2>&1 | FileCheck %s  --check-prefix=CHECK-UBSAN-KCFI 
--check-prefix=CHECK-UBSAN-FUNCTION
 // RUN: %clang --target=x86_64-sie-ps5 -fsanitize=undefined %s -### 2>&1 | 
FileCheck %s --check-prefix=CHECK-UBSAN-UNDEFINED
+// CHECK-UBSAN-UNDEFINED: 
"-fsanitize={{((alignment|array-bounds|bool|builtin|enum|float-cast-overflow|integer-divide-by-zero|nonnull-attribute|null|pointer-overflow|return|returns-nonnull-attribute|shift-base|shift-exponent|signed-integer-overflow|unreachable|vla-bound),?){17}"}}
 
 // RUN: not %clang --target=armv6t2-eabi -mexecute-only -fsanitize=function %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-FUNCTION
 // RUN: not %clang --target=armv6t2-eabi -mexecute-only -fsanitize=kcfi %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-KCFI
-// RUN: %clang --target=armv6t2-eabi -mexecute-only -fsanitize=undefined %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-UNDEFINED
+// RUN: %clang --target=armv6t2-eabi -mexecute-only -fsanitize=undefined %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-UNDEFINED-VPTR
 
 // CHECK-UBSAN-KCFI-DAG: error: invalid argument '-fsanitize=kcfi' not allowed 
with {{('x86_64-sie-ps5'|'armv6t2-unknown-unknown-eabi')}}
 // CHECK-UBSAN-FUNCTION-DAG: error: invalid argument '-fsanitize=function' not 
allowed with {{('x86_64-sie-ps5'|'armv6t2-unknown-unknown-eabi')}}
-// CHECK-UBSAN-UNDEFINED: 
"-fsanitize={{((alignment|array-bounds|bool|builtin|enum|float-ca

[clang] Allow multiple sanitizers on baremetal targets. (PR #72933)

2023-11-21 Thread Evgenii Stepanov via cfe-commits



@@ -973,11 +973,58 @@
 // RUN: not %clang --target=x86_64-sie-ps5 -fsanitize=kcfi %s -### 2>&1 | 
FileCheck %s --check-prefix=CHECK-UBSAN-KCFI
 // RUN: not %clang --target=x86_64-sie-ps5 -fsanitize=function -fsanitize=kcfi 
%s -### 2>&1 | FileCheck %s  --check-prefix=CHECK-UBSAN-KCFI 
--check-prefix=CHECK-UBSAN-FUNCTION
 // RUN: %clang --target=x86_64-sie-ps5 -fsanitize=undefined %s -### 2>&1 | 
FileCheck %s --check-prefix=CHECK-UBSAN-UNDEFINED
+// CHECK-UBSAN-UNDEFINED: 
"-fsanitize={{((alignment|array-bounds|bool|builtin|enum|float-cast-overflow|integer-divide-by-zero|nonnull-attribute|null|pointer-overflow|return|returns-nonnull-attribute|shift-base|shift-exponent|signed-integer-overflow|unreachable|vla-bound),?){17}"}}
 
 // RUN: not %clang --target=armv6t2-eabi -mexecute-only -fsanitize=function %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-FUNCTION
 // RUN: not %clang --target=armv6t2-eabi -mexecute-only -fsanitize=kcfi %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-KCFI
-// RUN: %clang --target=armv6t2-eabi -mexecute-only -fsanitize=undefined %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-UNDEFINED
+// RUN: %clang --target=armv6t2-eabi -mexecute-only -fsanitize=undefined %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-UNDEFINED-VPTR
 
 // CHECK-UBSAN-KCFI-DAG: error: invalid argument '-fsanitize=kcfi' not allowed 
with {{('x86_64-sie-ps5'|'armv6t2-unknown-unknown-eabi')}}
 // CHECK-UBSAN-FUNCTION-DAG: error: invalid argument '-fsanitize=function' not 
allowed with {{('x86_64-sie-ps5'|'armv6t2-unknown-unknown-eabi')}}
-// CHECK-UBSAN-UNDEFINED: 
"-fsanitize={{((alignment|array-bounds|bool|builtin|enum|float-cast-overflow|integer-divide-by-zero|nonnull-attribute|null|pointer-overflow|return|returns-nonnull-attribute|shift-base|shift-exponent|signed-integer-overflow|unreachable|vla-bound),?){17}"}}
+// CHECK-UBSAN-UNDEFINED-VPTR: 
"-fsanitize={{((alignment|array-bounds|bool|builtin|enum|float-cast-overflow|integer-divide-by-zero|nonnull-attribute|null|pointer-overflow|return|returns-nonnull-attribute|shift-base|shift-exponent|signed-integer-overflow|unreachable|vla-bound|vptr),?){18}"}}
+
+// * BareMetal *

eugenis wrote:

done

https://github.com/llvm/llvm-project/pull/72933

[clang] Allow multiple sanitizers on baremetal targets. (PR #72933)

2023-11-21 Thread Evgenii Stepanov via cfe-commits



@@ -491,3 +491,26 @@ void baremetal::Linker::ConstructJob(Compilation &C, const 
JobAction &JA,
   JA, *this, ResponseFileSupport::AtFileCurCP(),
   Args.MakeArgString(TC.GetLinkerPath()), CmdArgs, Inputs, Output));
 }
+
+SanitizerMask BareMetal::getSupportedSanitizers() const {
+  const bool IsX86_64 = getTriple().getArch() == llvm::Triple::x86_64;

eugenis wrote:

thank you, done

https://github.com/llvm/llvm-project/pull/72933

[clang] Allow multiple sanitizers on baremetal targets. (PR #72933)

2023-11-21 Thread Evgenii Stepanov via cfe-commits


https://github.com/eugenis updated 
https://github.com/llvm/llvm-project/pull/72933

>From f665e96f5a941c45591281d66c69f289aa641985 Mon Sep 17 00:00:00 2001
From: Evgenii Stepanov 
Date: Mon, 20 Nov 2023 16:54:24 -0800
Subject: [PATCH 1/2] Allow multiple sanitizers on baremetal targets.

Baremetal targets tend to implement their own runtime support for
sanitizers. Clang driver gatekeeping of allowed sanitizer types is
counterproductive.

This change allows anything that does not crash and burn in compilation,
and leaves any potential runtime issues for the user to figure out.
---
 clang/lib/Driver/ToolChains/BareMetal.cpp | 23 ++
 clang/lib/Driver/ToolChains/BareMetal.h   |  1 +
 clang/test/Driver/fsanitize.c | 51 ++-
 3 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/BareMetal.cpp 
b/clang/lib/Driver/ToolChains/BareMetal.cpp
index 842061c1e1488b0..9d60a2b5c27af2b 100644
--- a/clang/lib/Driver/ToolChains/BareMetal.cpp
+++ b/clang/lib/Driver/ToolChains/BareMetal.cpp
@@ -491,3 +491,26 @@ void baremetal::Linker::ConstructJob(Compilation &C, const 
JobAction &JA,
   JA, *this, ResponseFileSupport::AtFileCurCP(),
   Args.MakeArgString(TC.GetLinkerPath()), CmdArgs, Inputs, Output));
 }
+
+SanitizerMask BareMetal::getSupportedSanitizers() const {
+  const bool IsX86_64 = getTriple().getArch() == llvm::Triple::x86_64;
+  const bool IsAArch64 = getTriple().getArch() == llvm::Triple::aarch64 ||
+ getTriple().getArch() == llvm::Triple::aarch64_be;
+  const bool IsRISCV64 = getTriple().getArch() == llvm::Triple::riscv64;
+  SanitizerMask Res = ToolChain::getSupportedSanitizers();
+  Res |= SanitizerKind::Address;
+  Res |= SanitizerKind::KernelAddress;
+  Res |= SanitizerKind::PointerCompare;
+  Res |= SanitizerKind::PointerSubtract;
+  Res |= SanitizerKind::Fuzzer;
+  Res |= SanitizerKind::FuzzerNoLink;
+  Res |= SanitizerKind::Vptr;
+  Res |= SanitizerKind::SafeStack;
+  Res |= SanitizerKind::Thread;
+  Res |= SanitizerKind::Scudo;
+  if (IsX86_64 || IsAArch64 || IsRISCV64) {
+Res |= SanitizerKind::HWAddress;
+Res |= SanitizerKind::KernelHWAddress;
+  }
+  return Res;
+}
diff --git a/clang/lib/Driver/ToolChains/BareMetal.h 
b/clang/lib/Driver/ToolChains/BareMetal.h
index f602ef2be3542fb..67b5aa5998fc3da 100644
--- a/clang/lib/Driver/ToolChains/BareMetal.h
+++ b/clang/lib/Driver/ToolChains/BareMetal.h
@@ -72,6 +72,7 @@ class LLVM_LIBRARY_VISIBILITY BareMetal : public ToolChain {
   void AddLinkRuntimeLib(const llvm::opt::ArgList &Args,
  llvm::opt::ArgStringList &CmdArgs) const;
   std::string computeSysRoot() const override;
+  SanitizerMask getSupportedSanitizers() const override;
 
 private:
   using OrderedMultilibs =
diff --git a/clang/test/Driver/fsanitize.c b/clang/test/Driver/fsanitize.c
index 9eb800b0d9e2c7f..3cf3e3118a3a932 100644
--- a/clang/test/Driver/fsanitize.c
+++ b/clang/test/Driver/fsanitize.c
@@ -973,11 +973,58 @@
 // RUN: not %clang --target=x86_64-sie-ps5 -fsanitize=kcfi %s -### 2>&1 | 
FileCheck %s --check-prefix=CHECK-UBSAN-KCFI
 // RUN: not %clang --target=x86_64-sie-ps5 -fsanitize=function -fsanitize=kcfi 
%s -### 2>&1 | FileCheck %s  --check-prefix=CHECK-UBSAN-KCFI 
--check-prefix=CHECK-UBSAN-FUNCTION
 // RUN: %clang --target=x86_64-sie-ps5 -fsanitize=undefined %s -### 2>&1 | 
FileCheck %s --check-prefix=CHECK-UBSAN-UNDEFINED
+// CHECK-UBSAN-UNDEFINED: 
"-fsanitize={{((alignment|array-bounds|bool|builtin|enum|float-cast-overflow|integer-divide-by-zero|nonnull-attribute|null|pointer-overflow|return|returns-nonnull-attribute|shift-base|shift-exponent|signed-integer-overflow|unreachable|vla-bound),?){17}"}}
 
 // RUN: not %clang --target=armv6t2-eabi -mexecute-only -fsanitize=function %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-FUNCTION
 // RUN: not %clang --target=armv6t2-eabi -mexecute-only -fsanitize=kcfi %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-KCFI
-// RUN: %clang --target=armv6t2-eabi -mexecute-only -fsanitize=undefined %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-UNDEFINED
+// RUN: %clang --target=armv6t2-eabi -mexecute-only -fsanitize=undefined %s 
-### 2>&1 | FileCheck %s --check-prefix=CHECK-UBSAN-UNDEFINED-VPTR
 
 // CHECK-UBSAN-KCFI-DAG: error: invalid argument '-fsanitize=kcfi' not allowed 
with {{('x86_64-sie-ps5'|'armv6t2-unknown-unknown-eabi')}}
 // CHECK-UBSAN-FUNCTION-DAG: error: invalid argument '-fsanitize=function' not 
allowed with {{('x86_64-sie-ps5'|'armv6t2-unknown-unknown-eabi')}}
-// CHECK-UBSAN-UNDEFINED: 
"-fsanitize={{((alignment|array-bounds|bool|builtin|enum|float-cast-overflow|integer-divide-by-zero|nonnull-attribute|null|pointer-overflow|return|returns-nonnull-attribute|shift-base|shift-exponent|signed-integer-overflow|unreachable|vla-bound),?){17}"}}
+// CHECK-UBSAN-UNDEFINED-VPTR: 
"-fsanitize={{((alignment|array-bounds|bool|builtin|enum|float-cast-overflow|integer-divide-by-z

[clang] [clang][NFC] Reorder Atomic builtins to be consistent. (PR #72718)

2023-11-21 Thread James Y Knight via cfe-commits


https://github.com/jyknight closed 
https://github.com/llvm/llvm-project/pull/72718

[clang] 752c21b - [clang][NFC] Reorder Atomic builtins to be consistent. (#72718)

2023-11-21 Thread via cfe-commits


Author: Logikable
Date: 2023-11-21T16:00:31-05:00
New Revision: 752c21be68613f92e2de16cd380098cf830bc261

URL: 
https://github.com/llvm/llvm-project/commit/752c21be68613f92e2de16cd380098cf830bc261
DIFF: 
https://github.com/llvm/llvm-project/commit/752c21be68613f92e2de16cd380098cf830bc261.diff

LOG: [clang][NFC] Reorder Atomic builtins to be consistent. (#72718)

Added: 


Modified: 
clang/lib/CodeGen/CGAtomic.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CGAtomic.cpp b/clang/lib/CodeGen/CGAtomic.cpp
index f7c597e181b0bd9..6005d5c51c0e1ac 100644
--- a/clang/lib/CodeGen/CGAtomic.cpp
+++ b/clang/lib/CodeGen/CGAtomic.cpp
@@ -861,10 +861,10 @@ RValue CodeGenFunction::EmitAtomicExpr(AtomicExpr *E) {
   case AtomicExpr::AO__opencl_atomic_init:
 llvm_unreachable("Already handled above with EmitAtomicInit!");
 
+  case AtomicExpr::AO__atomic_load_n:
   case AtomicExpr::AO__c11_atomic_load:
   case AtomicExpr::AO__opencl_atomic_load:
   case AtomicExpr::AO__hip_atomic_load:
-  case AtomicExpr::AO__atomic_load_n:
 break;
 
   case AtomicExpr::AO__atomic_load:
@@ -880,14 +880,14 @@ RValue CodeGenFunction::EmitAtomicExpr(AtomicExpr *E) {
 Dest = EmitPointerWithAlignment(E->getVal2());
 break;
 
-  case AtomicExpr::AO__c11_atomic_compare_exchange_strong:
+  case AtomicExpr::AO__atomic_compare_exchange:
+  case AtomicExpr::AO__atomic_compare_exchange_n:
   case AtomicExpr::AO__c11_atomic_compare_exchange_weak:
-  case AtomicExpr::AO__opencl_atomic_compare_exchange_strong:
+  case AtomicExpr::AO__c11_atomic_compare_exchange_strong:
+  case AtomicExpr::AO__hip_atomic_compare_exchange_weak:
   case AtomicExpr::AO__hip_atomic_compare_exchange_strong:
   case AtomicExpr::AO__opencl_atomic_compare_exchange_weak:
-  case AtomicExpr::AO__hip_atomic_compare_exchange_weak:
-  case AtomicExpr::AO__atomic_compare_exchange_n:
-  case AtomicExpr::AO__atomic_compare_exchange:
+  case AtomicExpr::AO__opencl_atomic_compare_exchange_strong:
 Val1 = EmitPointerWithAlignment(E->getVal1());
 if (E->getOp() == AtomicExpr::AO__atomic_compare_exchange)
   Val2 = EmitPointerWithAlignment(E->getVal2());
@@ -938,32 +938,32 @@ RValue CodeGenFunction::EmitAtomicExpr(AtomicExpr *E) {
 ShouldCastToIntPtrTy = !MemTy->isFloatingType();
 [[fallthrough]];
 
-  case AtomicExpr::AO__c11_atomic_store:
-  case AtomicExpr::AO__c11_atomic_exchange:
-  case AtomicExpr::AO__opencl_atomic_store:
-  case AtomicExpr::AO__hip_atomic_store:
-  case AtomicExpr::AO__opencl_atomic_exchange:
-  case AtomicExpr::AO__hip_atomic_exchange:
+  case AtomicExpr::AO__atomic_fetch_and:
+  case AtomicExpr::AO__atomic_fetch_nand:
+  case AtomicExpr::AO__atomic_fetch_or:
+  case AtomicExpr::AO__atomic_fetch_xor:
+  case AtomicExpr::AO__atomic_and_fetch:
+  case AtomicExpr::AO__atomic_nand_fetch:
+  case AtomicExpr::AO__atomic_or_fetch:
+  case AtomicExpr::AO__atomic_xor_fetch:
   case AtomicExpr::AO__atomic_store_n:
   case AtomicExpr::AO__atomic_exchange_n:
   case AtomicExpr::AO__c11_atomic_fetch_and:
+  case AtomicExpr::AO__c11_atomic_fetch_nand:
   case AtomicExpr::AO__c11_atomic_fetch_or:
   case AtomicExpr::AO__c11_atomic_fetch_xor:
-  case AtomicExpr::AO__c11_atomic_fetch_nand:
-  case AtomicExpr::AO__opencl_atomic_fetch_and:
-  case AtomicExpr::AO__opencl_atomic_fetch_or:
-  case AtomicExpr::AO__opencl_atomic_fetch_xor:
-  case AtomicExpr::AO__atomic_fetch_and:
+  case AtomicExpr::AO__c11_atomic_store:
+  case AtomicExpr::AO__c11_atomic_exchange:
   case AtomicExpr::AO__hip_atomic_fetch_and:
-  case AtomicExpr::AO__atomic_fetch_or:
   case AtomicExpr::AO__hip_atomic_fetch_or:
-  case AtomicExpr::AO__atomic_fetch_xor:
   case AtomicExpr::AO__hip_atomic_fetch_xor:
-  case AtomicExpr::AO__atomic_fetch_nand:
-  case AtomicExpr::AO__atomic_and_fetch:
-  case AtomicExpr::AO__atomic_or_fetch:
-  case AtomicExpr::AO__atomic_xor_fetch:
-  case AtomicExpr::AO__atomic_nand_fetch:
+  case AtomicExpr::AO__hip_atomic_store:
+  case AtomicExpr::AO__hip_atomic_exchange:
+  case AtomicExpr::AO__opencl_atomic_fetch_and:
+  case AtomicExpr::AO__opencl_atomic_fetch_or:
+  case AtomicExpr::AO__opencl_atomic_fetch_xor:
+  case AtomicExpr::AO__opencl_atomic_store:
+  case AtomicExpr::AO__opencl_atomic_exchange:
 Val1 = EmitValToTemp(*this, E->getVal1());
 break;
   }
@@ -1002,44 +1002,44 @@ RValue CodeGenFunction::EmitAtomicExpr(AtomicExpr *E) {
 case AtomicExpr::AO__opencl_atomic_init:
   llvm_unreachable("Already handled above with EmitAtomicInit!");
 
-case AtomicExpr::AO__c11_atomic_fetch_add:
-case AtomicExpr::AO__opencl_atomic_fetch_add:
 case AtomicExpr::AO__atomic_fetch_add:
-case AtomicExpr::AO__hip_atomic_fetch_add:
-case AtomicExpr::AO__c11_atomic_fetch_and:
-case AtomicExpr::AO__opencl_atomic_fetch_and:
-case AtomicExpr::AO__hip_atomic_fetch_and:
 case AtomicExpr::AO__atomic_fetch_and:
-case AtomicExpr::

[clang] [llvm] [mlir] [lld] [AMDGPU] Change default AMDHSA Code Object version to 5 (PR #73000)

2023-11-21 Thread Jon Chesterfield via cfe-commits


JonChesterfield wrote:

This is a wild amount of code churn from a trivial change. 10k lines of almost 
all noise. That means the chances of us noticing breakage in a code review tool 
are pretty low.

How about as a first patch we pass `-code-object=v4` or whatever syntax to 
essentially all the tests, then rebase this, so that we can get something 
approximating "this is the functional change, with the codegen change visible 
in these tests"?

In general it seems likely that a lot of tests are checking things they don't 
actually care about, probably because they're frequently generated by the 
python thing. Maybe some of the noise can be removed by tweaking the test 
generator script to emit checks that are insensitive to ABI version?

https://github.com/llvm/llvm-project/pull/73000

[clang] [llvm] [SystemZ][z/OS] This change adds support for the PPA2 section in zOS (PR #68926)

2023-11-21 Thread Yusra Syeda via cfe-commits


https://github.com/ysyeda updated 
https://github.com/llvm/llvm-project/pull/68926

>From 78f82bcf33998de0663f4684a64a240f2e97f8a9 Mon Sep 17 00:00:00 2001
From: Yusra Syeda 
Date: Thu, 12 Oct 2023 16:56:27 -0400
Subject: [PATCH 01/19] This change adds support for the PPA2 section in zOS

---
 clang/lib/Basic/LangStandards.cpp |   6 +
 clang/lib/CodeGen/CodeGenModule.cpp   |  15 ++
 clang/lib/Driver/ToolChains/Clang.cpp |  13 +-
 clang/lib/Driver/ToolChains/Clang.h   |   3 +-
 clang/test/CodeGen/SystemZ/systemz-ppa2.c |  25 +++
 llvm/include/llvm/BinaryFormat/GOFF.h |   1 +
 llvm/include/llvm/MC/MCObjectFileInfo.h   |   4 +
 llvm/lib/MC/MCObjectFileInfo.cpp  |   5 +
 llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp | 197 +-
 llvm/lib/Target/SystemZ/SystemZAsmPrinter.h   |   7 +-
 llvm/test/CodeGen/SystemZ/zos-ppa2.ll |  26 +++
 11 files changed, 297 insertions(+), 5 deletions(-)
 create mode 100644 clang/test/CodeGen/SystemZ/systemz-ppa2.c
 create mode 100644 llvm/test/CodeGen/SystemZ/zos-ppa2.ll

diff --git a/clang/lib/Basic/LangStandards.cpp 
b/clang/lib/Basic/LangStandards.cpp
index ab09c7221dda92f..cfe79ec90f3796b 100644
--- a/clang/lib/Basic/LangStandards.cpp
+++ b/clang/lib/Basic/LangStandards.cpp
@@ -10,10 +10,16 @@
 #include "clang/Config/config.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/TargetParser/Triple.h"
 using namespace clang;
 
 StringRef clang::languageToString(Language L) {
+const char *clang::LanguageToString(Language L) {
+  // I would like to make this function and the definition of Language
+  // in the .h file simply expand the contents of a .def file.
+  // However, in the .h the members of the enum have doxygen annotations
+  // and/or comments which would be lost.
   switch (L) {
   case Language::Unknown:
 return "Unknown";
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index b1a6683a66bd052..9a4763413ea3fbc 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -976,6 +976,21 @@ void CodeGenModule::Release() {
   Context.getTypeSizeInChars(Context.getWideCharType()).getQuantity();
   getModule().addModuleFlag(llvm::Module::Error, "wchar_size", WCharWidth);
 
+  if (getTriple().isOSzOS()) {
+int32_t ProductVersion, ProductRelease, ProductPatch;
+ProductVersion = LLVM_VERSION_MAJOR,
+ProductRelease = LLVM_VERSION_MINOR, ProductPatch = LLVM_VERSION_PATCH;
+getModule().addModuleFlag(llvm::Module::Warning, "Product Major Version", 
ProductVersion);
+getModule().addModuleFlag(llvm::Module::Warning, "Product Minor Version", 
ProductRelease);
+getModule().addModuleFlag(llvm::Module::Warning, "Product Patchlevel", 
ProductPatch);
+
+// Record the language because we need it for the PPA2.
+const char *lang_str = LanguageToString(
+LangStandard::getLangStandardForKind(LangOpts.LangStd).Language);
+getModule().addModuleFlag(llvm::Module::Error, "zos_cu_language",
+  llvm::MDString::get(VMContext, lang_str));
+  }
+
   llvm::Triple::ArchType Arch = Context.getTargetInfo().getTriple().getArch();
   if (   Arch == llvm::Triple::arm
   || Arch == llvm::Triple::armeb
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 43a92adbef64ba8..109699f2ea4a62a 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -1765,7 +1765,7 @@ void Clang::RenderTargetOptions(const llvm::Triple 
&EffectiveTriple,
 break;
 
   case llvm::Triple::systemz:
-AddSystemZTargetArgs(Args, CmdArgs);
+AddSystemZTargetArgs(EffectiveTriple, Args, CmdArgs);
 break;
 
   case llvm::Triple::x86:
@@ -2262,7 +2262,8 @@ void Clang::AddSparcTargetArgs(const ArgList &Args,
   }
 }
 
-void Clang::AddSystemZTargetArgs(const ArgList &Args,
+void Clang::AddSystemZTargetArgs(const llvm::Triple &Triple,
+ const ArgList &Args,
  ArgStringList &CmdArgs) const {
   if (const Arg *A = Args.getLastArg(options::OPT_mtune_EQ)) {
 CmdArgs.push_back("-tune-cpu");
@@ -2294,6 +2295,14 @@ void Clang::AddSystemZTargetArgs(const ArgList &Args,
 CmdArgs.push_back("-mfloat-abi");
 CmdArgs.push_back("soft");
   }
+
+  if (Triple.isOSzOS()) {
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(
+Args.MakeArgString(llvm::Twine("-translation-time=")
+   .concat(llvm::Twine(std::time(nullptr)))
+   .str()));
+  }
 }
 
 void Clang::AddX86TargetArgs(const ArgList &Args,
diff --git a/clang/lib/Driver/ToolChains/Clang.h 
b/clang/lib/Driver/ToolChains/Clang.h
index 0f503c4bd1c4fea..9f065f846b4cf34 100644
--- a/clang/lib/Driver/ToolChains/Clang.h
+++ b/clang/lib/Driver/ToolChains/Clang

[clang] [Clang][NVPTX] Allow passing arguments to the linker while standalone (PR #73030)

2023-11-21 Thread Joseph Huber via cfe-commits


https://github.com/jhuber6 updated 
https://github.com/llvm/llvm-project/pull/73030

>From ee43e8f9ae90bcd70d46b17cfecb854711a4b1ce Mon Sep 17 00:00:00 2001
From: Joseph Huber 
Date: Tue, 21 Nov 2023 13:45:10 -0600
Subject: [PATCH] [Clang][NVPTX] Allow passing arguments to the linker while
 standalone

Summary:
We support standalone compilation for the NVPTX architecture using
'nvlink' as our linker. Because of the special handling required to
transform input files to cubins, as nvlink expects for some reason, we
didn't use the standard AddLinkerInput method. However, this also meant
that we weren't forwarding options passed with -Wl to the linker. Add
this support in for the standalone toolchain path.

Revived from https://reviews.llvm.org/D149978
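
A simplified model of the input handling described in the summary may help when reading the diff below. It is a sketch under stated assumptions, not the driver code: `Input`, `withCubinExtension`, and `buildNvlinkArgs` are hypothetical names, and the temporary-file copy that the real patch performs for direct inputs is left out, so every file is simply given a `.cubin` extension.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's per-input record (not clang's
// InputInfo): either a file name, a list of values forwarded via -Wl, or
// nothing at all.
struct Input {
  enum Kind { Nothing, Filename, LinkerArg } kind;
  std::vector<std::string> values; // one path, or the comma-split -Wl values
};

// Replaces the extension of the final path component with ".cubin".
// A path that already ends in ".cubin" comes back unchanged.
std::string withCubinExtension(const std::string &path) {
  std::size_t slash = path.find_last_of('/');
  std::size_t dot = path.find_last_of('.');
  bool hasExt = dot != std::string::npos &&
                (slash == std::string::npos || dot > slash);
  return (hasExt ? path.substr(0, dot) : path) + ".cubin";
}

// Builds the argument list for the linker: files are renamed to .cubin,
// linker arguments are forwarded as-is, and empty inputs are skipped.
std::vector<std::string> buildNvlinkArgs(const std::vector<Input> &inputs) {
  std::vector<std::string> args;
  for (const Input &in : inputs) {
    if (in.kind == Input::Filename) {
      args.push_back(withCubinExtension(in.values.at(0)));
    } else if (in.kind != Input::Nothing) {
      // Before the patch this branch did not exist, so -Wl values were lost.
      args.insert(args.end(), in.values.begin(), in.values.end());
    }
  }
  return args;
}
```

With inputs modelling `kernel.o -Wl,-v -Wl,a,b`, the sketch yields `kernel.cubin -v a b`, which is the shape the new `LINKER-ARGS` test line checks for.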
---
 clang/lib/Driver/ToolChains/Cuda.cpp      | 43 +--
 clang/test/Driver/cuda-cross-compiling.c  |  8 
 .../ClangLinkerWrapper.cpp                |  4 +-
 3 files changed, 32 insertions(+), 23 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Cuda.cpp b/clang/lib/Driver/ToolChains/Cuda.cpp
index e95ff98e6c940f1..5ef8b4455c23f13 100644
--- a/clang/lib/Driver/ToolChains/Cuda.cpp
+++ b/clang/lib/Driver/ToolChains/Cuda.cpp
@@ -611,35 +611,34 @@ void NVPTX::Linker::ConstructJob(Compilation &C, const JobAction &JA,
       continue;
     }
 
-    // Currently, we only pass the input files to the linker, we do not pass
-    // any libraries that may be valid only for the host.
-    if (!II.isFilename())
-      continue;
-
     // The 'nvlink' application performs RDC-mode linking when given a '.o'
     // file and device linking when given a '.cubin' file. We always want to
     // perform device linking, so just rename any '.o' files.
     // FIXME: This should hopefully be removed if NVIDIA updates their tooling.
-    auto InputFile = getToolChain().getInputFilename(II);
-    if (llvm::sys::path::extension(InputFile) != ".cubin") {
-      // If there are no actions above this one then this is direct input and we
-      // can copy it. Otherwise the input is internal so a `.cubin` file should
-      // exist.
-      if (II.getAction() && II.getAction()->getInputs().size() == 0) {
-        const char *CubinF =
-            Args.MakeArgString(getToolChain().getDriver().GetTemporaryPath(
-                llvm::sys::path::stem(InputFile), "cubin"));
-        if (llvm::sys::fs::copy_file(InputFile, C.addTempFile(CubinF)))
-          continue;
+    if (II.isFilename()) {
+      auto InputFile = getToolChain().getInputFilename(II);
+      if (llvm::sys::path::extension(InputFile) != ".cubin") {
+        // If there are no actions above this one then this is direct input and
+        // we can copy it. Otherwise the input is internal so a `.cubin` file
+        // should exist.
+        if (II.getAction() && II.getAction()->getInputs().size() == 0) {
+          const char *CubinF =
+              Args.MakeArgString(getToolChain().getDriver().GetTemporaryPath(
+                  llvm::sys::path::stem(InputFile), "cubin"));
+          if (llvm::sys::fs::copy_file(InputFile, C.addTempFile(CubinF)))
+            continue;
 
-        CmdArgs.push_back(CubinF);
+          CmdArgs.push_back(CubinF);
+        } else {
+          SmallString<256> Filename(InputFile);
+          llvm::sys::path::replace_extension(Filename, "cubin");
+          CmdArgs.push_back(Args.MakeArgString(Filename));
+        }
       } else {
-        SmallString<256> Filename(InputFile);
-        llvm::sys::path::replace_extension(Filename, "cubin");
-        CmdArgs.push_back(Args.MakeArgString(Filename));
+        CmdArgs.push_back(Args.MakeArgString(InputFile));
       }
-    } else {
-      CmdArgs.push_back(Args.MakeArgString(InputFile));
+    } else if (!II.isNothing()) {
+      II.getInputArg().renderAsInput(Args, CmdArgs);
     }
   }
 
diff --git a/clang/test/Driver/cuda-cross-compiling.c b/clang/test/Driver/cuda-cross-compiling.c
index 12d0af3b45f32f6..5a52496838813ee 100644
--- a/clang/test/Driver/cuda-cross-compiling.c
+++ b/clang/test/Driver/cuda-cross-compiling.c
@@ -77,3 +77,11 @@
 // RUN:   | FileCheck -check-prefix=LOWERING %s
 
 // LOWERING: -cc1" "-triple" "nvptx64-nvidia-cuda" {{.*}} "-mllvm" "--nvptx-lower-global-ctor-dtor"
+
+//
+// Test passing arguments directly to nvlink.
+//
+// RUN: %clang -target nvptx64-nvidia-cuda -Wl,-v -Wl,a,b -### %s 2>&1 \
+// RUN:   | FileCheck -check-prefix=LINKER-ARGS %s
+
+// LINKER-ARGS: nvlink{{.*}}"-v"{{.*}}"a" "b"
diff --git a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
index bafe8ace60d1cea..03fb0a7d64552eb 100644
--- a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -385,9 +385,11 @@ Expected clang(ArrayRef InputFiles, const ArgList &Args) {
   Triple.isAMDGPU() ? Args.MakeArgString("-mcpu=" + Arch)
 : Args.MakeArgString("-march=" + Arc

[llvm] [lld] [mlir] [clang] [AMDGPU] Change default AMDHSA Code Object version to 5 (PR #73000)

2023-11-21 Thread Joseph Huber via cfe-commits



@@ -75,8 +75,8 @@ bb.2:
   store volatile i32 0, ptr addrspace(1) undef
   ret void
 }
-; DEFAULTSIZE: .amdhsa_private_segment_fixed_size 4112
-; DEFAULTSIZE: ; ScratchSize: 4112
+; DEFAULTSIZE: .amdhsa_private_segment_fixed_size 16

jhuber6 wrote:

My understanding is that it's supposed to be `0` if the backend could not
statically determine it, so it's unlikely to be due to that change in COV5.
Maybe the check is not matching the full line, and the actual value is 16384 or
similar?
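
To illustrate the concern: a plain FileCheck pattern succeeds if it occurs anywhere in an output line, unless full-line matching is requested (for example with `--match-full-lines`). The sketch below models that with ordinary string search. `substringCheck` and `fullLineCheck` are hypothetical helper names, and the `16384` line is only the value guessed at in this comment, not observed output.

```cpp
#include <cassert>
#include <string>

// Simplified model of a plain-text CHECK: passes when the pattern occurs
// anywhere inside the output line.
bool substringCheck(const std::string &line, const std::string &pattern) {
  return line.find(pattern) != std::string::npos;
}

// Stricter variant: the whole line must equal the pattern.
bool fullLineCheck(const std::string &line, const std::string &pattern) {
  return line == pattern;
}
```

Under this model, the pattern `.amdhsa_private_segment_fixed_size 16` passes the substring check against a line ending in `16384` and fails the full-line check, which is why a check ending in `16` cannot by itself distinguish the two values.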

https://github.com/llvm/llvm-project/pull/73000
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[llvm] [lld] [mlir] [clang] [AMDGPU] Change default AMDHSA Code Object version to 5 (PR #73000)

2023-11-21 Thread Jon Chesterfield via cfe-commits



@@ -75,8 +75,8 @@ bb.2:
   store volatile i32 0, ptr addrspace(1) undef
   ret void
 }
-; DEFAULTSIZE: .amdhsa_private_segment_fixed_size 4112
-; DEFAULTSIZE: ; ScratchSize: 4112
+; DEFAULTSIZE: .amdhsa_private_segment_fixed_size 16

JonChesterfield wrote:

This seems a bit suspect. It used to be about 4k and is now 16. Are we out by a 
factor of 1024 somewhere?
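
For reference, the arithmetic on the two values in the quoted hunk can be checked at compile time. This is only a sketch of the numbers, with hypothetical constant names, and it does not say anything about the cause of the change.

```cpp
#include <cassert>

constexpr int kOldScratchSize = 4112; // value in the removed CHECK lines
constexpr int kNewScratchSize = 16;   // value in the added CHECK line

// The old value is exactly 4096 bytes (4 KiB) larger than the new one.
static_assert(kOldScratchSize - kNewScratchSize == 4096,
              "difference is 4 KiB");
// The ratio of the two values is 257, not 1024.
static_assert(kOldScratchSize / kNewScratchSize == 257 &&
                  kOldScratchSize % kNewScratchSize == 0,
              "ratio is exactly 257");
// Scaling the new value by 1024 gives 16384, not 4112.
static_assert(kNewScratchSize * 1024 == 16384, "16 * 1024 is 16384");
```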

https://github.com/llvm/llvm-project/pull/73000