date:20231122

[flang] [clang] [flang][Driver] Support -nodefaultlibs, -nostartfiles and -nostdlib (PR #72601)

2023-11-22 Thread Brad Smith via cfe-commits


brad0 wrote:

@MaskRay I could still use some input on the `CLOption, DXCOption` part.
The same applies to 34e4e5eb70818fca90574beb8f5617e27bfac138.

https://github.com/llvm/llvm-project/pull/72601
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang][AST][ASTMerge] prevent AST nodes from being deallocated early (PR #73096)

2023-11-22 Thread Chuanqi Xu via cfe-commits

ChuanqiXu9 wrote:

> Debug the https://github.com/llvm/llvm-project/issues/72783 can prove it. 
> Address interval (local from 0x3a9a00 to 0x3aaa00) allocated by allocator 
> contains a IdentifierInfo variable (local address:0x3aa190) whose address is 
> freed early.

In this case, it looks better to extract the use-after-free variable only 
instead of extracting the whole ASTUnit.

> As system header like stdio.h or math.h can't be put into test, it's hard to 
> add testcase. Could anyone give me some guidance? Thanks in advance!

Generally, we need to reduce them in this case: preprocess the source, then
keep removing unnecessary parts until nothing more can be removed. It is
time-consuming, but it is worth it.
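
The reduction loop described above can be sketched roughly as follows. This is only a minimal illustration, not how any particular tool works: `reduceLines` and the `StillReproduces` predicate are hypothetical names, and the predicate is assumed to re-run the compiler on the candidate input and report whether the use-after-free still shows up. Dedicated tools such as C-Reduce automate the same idea with far smarter transformations.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical predicate: returns true if the candidate input still triggers
// the bug (for example, by compiling it and checking for the crash).
using StillReproduces =
    std::function<bool(const std::vector<std::string> &)>;

// Greedy line-based reduction: try dropping one line at a time and keep the
// smaller input whenever the bug still reproduces. Stops once a full pass
// removes nothing.
std::vector<std::string> reduceLines(std::vector<std::string> Lines,
                                     const StillReproduces &Pred) {
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (std::size_t I = 0; I < Lines.size();) {
      std::vector<std::string> Candidate = Lines;
      Candidate.erase(Candidate.begin() + static_cast<std::ptrdiff_t>(I));
      if (Pred(Candidate)) {
        Lines = std::move(Candidate); // line I was unnecessary
        Changed = true;
      } else {
        ++I; // line I is needed; keep it and move on
      }
    }
  }
  return Lines;
}
```

In practice the input would be the preprocessed translation unit (the output of `-E`), so that system headers such as stdio.h no longer need to be included by the test.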

https://github.com/llvm/llvm-project/pull/73096

[clang] [clang] Avoid memcopy for small structure with padding under -ftrivial-auto-var-init (PR #71677)

2023-11-22 Thread Omair Javaid via cfe-commits


omjavaid wrote:

This change appears to have broken several clang tests on the following buildbots:
https://lab.llvm.org/buildbot/#/builders/245
https://lab.llvm.org/buildbot/#/builders/188
https://lab.llvm.org/buildbot/#/builders/186
https://lab.llvm.org/buildbot/#/builders/183

Please fix the issues if they look trivial, or revert the patch if the fix
needs more time.

https://github.com/llvm/llvm-project/pull/71677

[clang] clang/CodeGen/RISCV: test lowering of math builtins (PR #71399)

2023-11-22 Thread Ramkumar Ramachandra via cfe-commits


https://github.com/artagnon closed 
https://github.com/llvm/llvm-project/pull/71399

[clang] 083a539 - clang/CodeGen/RISCV: test lowering of math builtins (#71399)

2023-11-22 Thread via cfe-commits


Author: Ramkumar Ramachandra
Date: 2023-11-23T07:39:32Z
New Revision: 083a53971758c6f9bbd448eeb9c5d839661e3f68

URL: 
https://github.com/llvm/llvm-project/commit/083a53971758c6f9bbd448eeb9c5d839661e3f68
DIFF: 
https://github.com/llvm/llvm-project/commit/083a53971758c6f9bbd448eeb9c5d839661e3f68.diff

LOG: clang/CodeGen/RISCV: test lowering of math builtins (#71399)

Ever since 98c90a1 (ISel: introduce vector ISD::LRINT, ISD::LLRINT;
custom RISCV lowering) landed, there have been several discussions on
how the lrint and llrint libcalls would lower to LLVM IR via clang on
RV32 and RV64, in an effort to enable vectorization of lrint and llrint
via SLPVectorizer and LoopVectorize. This patch adds a new
math-builtins.c test to the RISC-V target to test the lowering of all
math libcalls, including lrint and llrint.
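
As a small illustration of why RV32 and RV64 are checked separately: `lrint` returns `long`, whose width depends on the target ABI (32 bits under the usual RV32 ILP32 ABIs, 64 bits under RV64 LP64), while `llrint` returns `long long`. The sketch below is not part of the patch; the helper names are made up for the example.

```cpp
#include <climits>
#include <cmath>
#include <type_traits>

// std::lrint yields long and std::llrint yields long long, regardless of the
// floating-point argument type.
static_assert(std::is_same<decltype(std::lrint(0.0)), long>::value,
              "lrint returns long");
static_assert(std::is_same<decltype(std::llrint(0.0f)), long long>::value,
              "llrint returns long long");

// Width in bits of each result type on the target being compiled for.
constexpr int lrintResultBits() {
  return static_cast<int>(sizeof(long) * CHAR_BIT);
}
constexpr int llrintResultBits() {
  return static_cast<int>(sizeof(long long) * CHAR_BIT);
}

// Thin wrappers (hypothetical names) around the two libcalls.
long roundToLong(double X) { return std::lrint(X); }
long long roundToLongLong(double X) { return std::llrint(X); }
```

Compiling this for a 32-bit and a 64-bit triple and comparing `lrintResultBits()` shows the difference in the integer result width that the lowering has to account for.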

Added: 
clang/test/CodeGen/RISCV/math-builtins.c
clang/test/CodeGen/X86/math-builtins.c

Modified: 


Removed: 
clang/test/CodeGen/math-builtins.c



diff --git a/clang/test/CodeGen/RISCV/math-builtins.c b/clang/test/CodeGen/RISCV/math-builtins.c
new file mode 100644
index 000..9630d62f0f48292
--- /dev/null
+++ b/clang/test/CodeGen/RISCV/math-builtins.c
@@ -0,0 +1,459 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 3
+// RUN: %clang_cc1 -triple riscv32 -emit-llvm %s -o - | FileCheck --check-prefix=RV32 %s
+// RUN: %clang_cc1 -triple riscv64 -emit-llvm %s -o - | FileCheck --check-prefix=RV64 %s
+
+float ceilf(float);
+double ceil(double);
+long double ceill(long double);
+float copysignf(float, float);
+double copysign(double, double);
+long double copysignl(long double, long double);
+float cosf(float);
+double cos(double);
+long double cosl(long double);
+float expf(float);
+double exp(double);
+long double expl(long double);
+float exp2f(float);
+double exp2(double);
+long double exp2l(long double);
+float fabsf(float);
+double fabs(double);
+long double fabsl(long double);
+float floorf(float);
+double floor(double);
+long double floorl(long double);
+float fmaxf(float, float);
+double fmax(double, double);
+long double fmaxl(long double, long double);
+float fminf(float, float);
+double fmin(double, double);
+long double fminl(long double, long double);
+float fmodf(float, float);
+double fmod(double, double);
+long double fmodl(long double, long double);
+float logf(float);
+double log(double);
+long double logl(long double);
+float log10f(float);
+double log10(double);
+long double log10l(long double);
+float log2f(float);
+double log2(double);
+long double log2l(long double);
+float nearbyintf(float);
+double nearbyint(double);
+long double nearbyintl(long double);
+float powf(float, float);
+double pow(double, double);
+long double powl(long double, long double);
+float rintf(float);
+double rint(double);
+long double rintl(long double);
+long lrintf(float);
+long lrint(double);
+long lrintl(long double);
+long long llrintf(float);
+long long llrint(double);
+long long llrintl(long double);
+float roundf(float);
+double round(double);
+long double roundl(long double);
+long lroundf(float);
+long lround(double);
+long lroundl(long double);
+long long llroundf(float);
+long long llround(double);
+long long llroundl(long double);
+float roundevenf(float);
+double roundeven(double);
+long double roundevenl(long double);
+float sinf(float);
+double sin(double);
+long double sinl(long double);
+float sqrtf(float);
+double sqrt(double);
+long double sqrtl(long double);
+float truncf(float);
+double trunc(double);
+long double truncl(long double);
+
+// RV32-LABEL: define dso_local void @test(
+// RV32-SAME: float noundef [[FARG:%.*]], double noundef [[DARG:%.*]], fp128 noundef [[LDARG:%.*]]) #[[ATTR0:[0-9]+]] {
+// RV32-NEXT:  entry:
+// RV32-NEXT:[[FARG_ADDR:%.*]] = alloca float, align 4
+// RV32-NEXT:[[DARG_ADDR:%.*]] = alloca double, align 8
+// RV32-NEXT:[[LDARG_ADDR:%.*]] = alloca fp128, align 16
+// RV32-NEXT:store float [[FARG]], ptr [[FARG_ADDR]], align 4
+// RV32-NEXT:store double [[DARG]], ptr [[DARG_ADDR]], align 8
+// RV32-NEXT:store fp128 [[LDARG]], ptr [[LDARG_ADDR]], align 16
+// RV32-NEXT:[[TMP0:%.*]] = load float, ptr [[FARG_ADDR]], align 4
+// RV32-NEXT:[[TMP1:%.*]] = call float @llvm.ceil.f32(float [[TMP0]])
+// RV32-NEXT:[[TMP2:%.*]] = load double, ptr [[DARG_ADDR]], align 8
+// RV32-NEXT:[[TMP3:%.*]] = call double @llvm.ceil.f64(double [[TMP2]])
+// RV32-NEXT:[[TMP4:%.*]] = load fp128, ptr [[LDARG_ADDR]], align 16
+// RV32-NEXT:[[TMP5:%.*]] = call fp128 @llvm.ceil.f128(fp128 [[TMP4]])
+// RV32-NEXT:[[TMP6:%.*]] = load float, ptr [[FARG_ADDR]], align 4
+// RV32-NEXT:[[TMP7:%.*]] = load float, ptr [[FARG_ADDR]], align 4
+// RV32-NEXT:[[TMP8:%.*]] = call float @llvm.copysign.f32(float [[TMP6]], float [[TMP7]])
+// RV32-NEXT:[[TMP9:%.*]] = load double, p

[clang] [clang][ExprConst] allow single element access of vector object to be constant expression (PR #72607)

2023-11-22 Thread Yuanfang Chen via cfe-commits


https://github.com/yuanfang-chen updated 
https://github.com/llvm/llvm-project/pull/72607

From b7d7c5fc70ffb792f67d007ec1bd71bcaed868fc Mon Sep 17 00:00:00 2001
From: Yuanfang Chen 
Date: Fri, 17 Nov 2023 03:16:38 +
Subject: [PATCH] [clang][ExprConst] allow single element access of vector
 object to be constant expression

Supports both v[0] and v.x/v.r/v.s0 syntax.
Selecting multiple elements is left as a future work.
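
For readers unfamiliar with the two spellings, here is a minimal sketch of single-element access on vector types. It evaluates the accesses at run time only, so it does not depend on whether the compiler in use already contains this patch; the patch aims to make the same kinds of access usable inside constant expressions. The type and function names are chosen for the example, and the `ext_vector_type` part is guarded because it is a Clang extension.

```cpp
// GCC-style vector of four ints; subscripting works in both GCC and Clang.
typedef int int4 __attribute__((vector_size(4 * sizeof(int))));

// v[0]-style access to a single element.
int firstElement(const int4 &V) { return V[0]; }
int elementAt(const int4 &V, unsigned Idx) { return V[Idx]; }

#if defined(__clang__)
// Clang's ext_vector_type additionally allows named and numbered components.
typedef int ext_int4 __attribute__((ext_vector_type(4)));

// v.x and v.s0 both name element 0.
int componentX(ext_int4 V) { return V.x; }
int componentS0(ext_int4 V) { return V.s0; }
#endif
```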
---
 clang/lib/AST/ExprConstant.cpp| 104 +-
 clang/lib/AST/Interp/State.h  |   3 +-
 clang/test/CodeGenCXX/temporaries.cpp |  43 
 .../constexpr-vectors-access-elements.cpp |  29 +
 4 files changed, 153 insertions(+), 26 deletions(-)
 create mode 100644 clang/test/SemaCXX/constexpr-vectors-access-elements.cpp

diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 3a41e9718bb5875..d699a3a8fcf3e68 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -221,6 +221,12 @@ namespace {
 ArraySize = 2;
 MostDerivedLength = I + 1;
 IsArray = true;
+  } else if (Type->isVectorType()) {
+const auto *VT = Type->castAs<VectorType>();
+Type = VT->getElementType();
+ArraySize = VT->getNumElements();
+MostDerivedLength = I + 1;
+IsArray = true;
   } else if (const FieldDecl *FD = getAsField(Path[I])) {
 Type = FD->getType();
 ArraySize = 0;
@@ -437,6 +443,16 @@ namespace {
   MostDerivedArraySize = 2;
   MostDerivedPathLength = Entries.size();
 }
+void addVectorUnchecked(QualType EltTy, uint64_t Size, uint64_t Idx) {
+  Entries.push_back(PathEntry::ArrayIndex(Idx));
+
+  // This is technically a most-derived object, though in practice this
+  // is unlikely to matter.
+  MostDerivedType = EltTy;
+  MostDerivedIsArrayElement = true;
+  MostDerivedArraySize = Size;
+  MostDerivedPathLength = Entries.size();
+}
 void diagnoseUnsizedArrayPointerArithmetic(EvalInfo &Info, const Expr *E);
 void diagnosePointerArithmetic(EvalInfo &Info, const Expr *E,
const APSInt &N);
@@ -1715,6 +1731,11 @@ namespace {
   if (checkSubobject(Info, E, Imag ? CSK_Imag : CSK_Real))
 Designator.addComplexUnchecked(EltTy, Imag);
 }
+void addVectorElement(EvalInfo &Info, const Expr *E, QualType EltTy,
+  uint64_t Size, uint64_t Idx) {
+  if (checkSubobject(Info, E, CSK_VectorElement))
+Designator.addVectorUnchecked(EltTy, Size, Idx);
+}
 void clearIsNullPointer() {
   IsNullPtr = false;
 }
@@ -3261,6 +3282,19 @@ static bool HandleLValueComplexElement(EvalInfo &Info, const Expr *E,
   return true;
 }
 
+static bool HandleLValueVectorElement(EvalInfo &Info, const Expr *E,
+  LValue &LVal, QualType EltTy,
+  uint64_t Size, uint64_t Idx) {
+  if (Idx) {
+CharUnits SizeOfElement;
+if (!HandleSizeof(Info, E->getExprLoc(), EltTy, SizeOfElement))
+  return false;
+LVal.Offset += SizeOfElement * Idx;
+  }
+  LVal.addVectorElement(Info, E, EltTy, Size, Idx);
+  return true;
+}
+
 /// Try to evaluate the initializer for a variable declaration.
 ///
 /// \param Info   Information about the ongoing evaluation.
@@ -3806,6 +3840,21 @@ findSubobject(EvalInfo &Info, const Expr *E, const CompleteObject &Obj,
 return handler.found(Index ? O->getComplexFloatImag()
: O->getComplexFloatReal(), ObjType);
   }
+} else if (ObjType->isVectorType()) {
+  uint64_t Index = Sub.Entries[I].getAsArrayIndex();
+  if (Index >= ObjType->castAs<VectorType>()->getNumElements()) {
+if (Info.getLangOpts().CPlusPlus11)
+  Info.FFDiag(E, diag::note_constexpr_access_past_end)
+<< handler.AccessKind;
+else
+  Info.FFDiag(E);
+return handler.failed();
+  }
+
+  ObjType = ObjType->castAs<VectorType>()->getElementType();
+
+  assert(I == N - 1 && "extracting subobject of scalar?");
+  return handler.found(O->getVectorElt(Index), ObjType);
 } else if (const FieldDecl *Field = getAsField(Sub.Entries[I])) {
   if (Field->isMutable() &&
   !Obj.mayAccessMutableMembers(Info, handler.AccessKind)) {
@@ -8408,6 +8457,7 @@ class LValueExprEvaluator
   bool VisitCXXTypeidExpr(const CXXTypeidExpr *E);
   bool VisitCXXUuidofExpr(const CXXUuidofExpr *E);
   bool VisitArraySubscriptExpr(const ArraySubscriptExpr *E);
+  bool VisitExtVectorElementExpr(const ExtVectorElementExpr *E);
   bool VisitUnaryDeref(const UnaryOperator *E);
   bool VisitUnaryReal(const UnaryOperator *E);
   bool VisitUnaryImag(const UnaryOperator *E);
@@ -8721,15 +8771,63 @@ bool LValueExprEvaluator::VisitMemberExpr(const MemberExpr *E) {
   return LValueExprEvaluatorBaseTy::VisitMemberExpr(E);
 }
 
+bool LValueExprEvaluator::VisitExtVectorElement

[clang] clang/CodeGen/RISCV: test lowering of math builtins (PR #71399)

2023-11-22 Thread Craig Topper via cfe-commits


https://github.com/topperc approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/71399

[clang] [Driver] Add the --gcc-triple option (PR #73214)

2023-11-22 Thread via cfe-commits


github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff 7f18f9a28c73490d09938af1fdb1908eb333a62c 72f6f3a611f237f71ce02cfb79620257a9e2d827 -- clang/test/Driver/gcc-prefix.cpp clang/lib/Driver/ToolChains/Gnu.cpp
```

View the diff from clang-format here.

```diff
diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp b/clang/lib/Driver/ToolChains/Gnu.cpp
index 44b92a879a..24c4d0c303 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -2132,7 +2132,8 @@ void Generic_GCC::GCCInstallationDetector::init(
 
   // If --gcc-triple is specified use this instead of trying to
   // auto-detect a triple.
-  if (const Arg *A = Args.getLastArg(clang::driver::options::OPT_gcc_triple_EQ)) {
+  if (const Arg *A =
+  Args.getLastArg(clang::driver::options::OPT_gcc_triple_EQ)) {
 StringRef GCCTriple = A->getValue();
 CandidateTripleAliases.clear();
 CandidateTripleAliases.push_back(GCCTriple);
```


https://github.com/llvm/llvm-project/pull/73214

[clang] [Driver] Add the --gcc-triple option (PR #73214)

2023-11-22 Thread via cfe-commits


llvmbot wrote:



@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Tom Stellard (tstellar)


Changes

When --gcc-triple is used, the driver will search for the 'best' gcc 
installation that has the given triple.  This is useful for distributions that 
want clang to use a specific gcc triple, but do not want to pin to a specific 
version as would be required by using --gcc-install-dir.  Having clang linked 
to a specific gcc version can cause clang to stop working when the version of 
gcc installed on the system gets updated.

---
Full diff: https://github.com/llvm/llvm-project/pull/73214.diff


11 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+2) 
- (modified) clang/lib/Driver/ToolChains/Gnu.cpp (+8) 
- (added) clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtbegin.o ()
- (added) clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtend.o ()
- (added) clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crti.o ()
- (added) clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtn.o ()
- (added) clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtbegin.o ()
- (added) clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtend.o ()
- (added) clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crti.o ()
- (added) clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtn.o ()
- (added) clang/test/Driver/gcc-prefix.cpp (+8) 


diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index b2f2bcb6ac37910..79ced47150b455d 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -773,6 +773,8 @@ def gcc_install_dir_EQ : Joined<["--"], "gcc-install-dir=">,
 def gcc_toolchain : Joined<["--"], "gcc-toolchain=">, Flags<[NoXarchOption]>,
   HelpText<"Specify a directory where Clang can find 'include' and 
'lib{,32,64}/gcc{,-cross}/$triple/$version'. "
   "Clang will use the GCC installation with the largest version">;
+def gcc_triple_EQ : Joined<["--"], "gcc-triple=">,
+  HelpText<"Search for the GCC installation with the specified triple.">;
 def CC : Flag<["-"], "CC">, Visibility<[ClangOption, CC1Option]>,
   Group,
 HelpText<"Include comments from within macros in preprocessed output">,
diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp b/clang/lib/Driver/ToolChains/Gnu.cpp
index 9fb99145d3b909e..44b92a879a3a8c1 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -2130,6 +2130,14 @@ void Generic_GCC::GCCInstallationDetector::init(
 return;
   }
 
+  // If --gcc-triple is specified use this instead of trying to
+  // auto-detect a triple.
+  if (const Arg *A = Args.getLastArg(clang::driver::options::OPT_gcc_triple_EQ)) {
+StringRef GCCTriple = A->getValue();
+CandidateTripleAliases.clear();
+CandidateTripleAliases.push_back(GCCTriple);
+  }
+
   // Compute the set of prefixes for our search.
   SmallVector Prefixes;
   StringRef GCCToolchainDir = getGCCToolchainDir(Args, D.SysRoot);
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtbegin.o
 
b/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtbegin.o
new file mode 100644
index 000..e69de29bb2d1d64
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtend.o
 
b/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtend.o
new file mode 100644
index 000..e69de29bb2d1d64
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crti.o
 
b/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crti.o
new file mode 100644
index 000..e69de29bb2d1d64
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtn.o
 
b/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtn.o
new file mode 100644
index 000..e69de29bb2d1d64
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtbegin.o
 
b/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtbegin.o
new file mode 100644
index 000..e69de29bb2d1d64
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtend.o
 
b/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtend.o
new file mode 100644
index 000..e69de29bb2d1d64
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crti.o
 
b/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crti.o
new file mode 100644
index 000..e69de29bb2d1d64
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-

[clang] [Driver] Add the --gcc-triple option (PR #73214)

2023-11-22 Thread Tom Stellard via cfe-commits


https://github.com/tstellar created 
https://github.com/llvm/llvm-project/pull/73214

When --gcc-triple is used, the driver will search for the 'best' gcc 
installation that has the given triple.  This is useful for distributions that 
want clang to use a specific gcc triple, but do not want to pin to a specific 
version as would be required by using --gcc-install-dir.  Having clang linked 
to a specific gcc version can cause clang to stop working when the version of 
gcc installed on the system gets updated.
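
The behaviour described above can be modelled in a few lines. This is a deliberately simplified sketch, not the driver's actual detection code: `GccInstall`, `candidateTriples` and `selectInstall` are hypothetical names, versions are reduced to a single integer, and "best" is assumed to mean "largest version", following the wording of the `--gcc-toolchain` help text.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of one GCC installation found on disk.
struct GccInstall {
  std::string Triple;
  int MajorVersion;
};

// If an explicit triple is given (non-empty), it replaces the auto-detected
// candidates, mirroring the clear()/push_back() pair in the patch.
std::vector<std::string> candidateTriples(std::vector<std::string> Detected,
                                          const std::string &GccTriple) {
  if (!GccTriple.empty()) {
    Detected.clear();
    Detected.push_back(GccTriple);
  }
  return Detected;
}

// Pick the installation with the largest version among the candidate
// triples; returns nullptr if none matches.
const GccInstall *selectInstall(const std::vector<GccInstall> &Installs,
                                const std::vector<std::string> &Candidates) {
  const GccInstall *Best = nullptr;
  for (const GccInstall &I : Installs)
    for (const std::string &T : Candidates)
      if (I.Triple == T && (!Best || I.MajorVersion > Best->MajorVersion))
        Best = &I;
  return Best;
}
```

With installations for both `x86_64-linux-gnu` and `x86_64-redhat-linux` present, passing `x86_64-redhat-linux` as the triple restricts the search to that directory while still following whichever GCC version is installed.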

From 72f6f3a611f237f71ce02cfb79620257a9e2d827 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Thu, 16 Nov 2023 05:11:04 +
Subject: [PATCH] [Driver] Add the --gcc-triple option

When --gcc-triple is used, the driver will search for the 'best' gcc
installation that has the given triple.  This is useful for
distributions that want clang to use a specific gcc triple, but do
not want to pin to a specific version as would be required by
using --gcc-install-dir.  Having clang linked to a specific gcc version
can cause clang to stop working when the version of gcc installed on
the system gets updated.
---
 clang/include/clang/Driver/Options.td | 2 ++
 clang/lib/Driver/ToolChains/Gnu.cpp   | 8 
 .../usr/lib/gcc/x86_64-linux-gnu/13/crtbegin.o| 0
 .../usr/lib/gcc/x86_64-linux-gnu/13/crtend.o  | 0
 .../fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crti.o | 0
 .../fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtn.o | 0
 .../usr/lib/gcc/x86_64-redhat-linux/13/crtbegin.o | 0
 .../usr/lib/gcc/x86_64-redhat-linux/13/crtend.o   | 0
 .../usr/lib/gcc/x86_64-redhat-linux/13/crti.o | 0
 .../usr/lib/gcc/x86_64-redhat-linux/13/crtn.o | 0
 clang/test/Driver/gcc-prefix.cpp  | 8 
 11 files changed, 18 insertions(+)
 create mode 100644 
clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtbegin.o
 create mode 100644 
clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtend.o
 create mode 100644 
clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crti.o
 create mode 100644 
clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtn.o
 create mode 100644 
clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtbegin.o
 create mode 100644 
clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtend.o
 create mode 100644 
clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crti.o
 create mode 100644 
clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-redhat-linux/13/crtn.o
 create mode 100644 clang/test/Driver/gcc-prefix.cpp

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index b2f2bcb6ac37910..79ced47150b455d 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -773,6 +773,8 @@ def gcc_install_dir_EQ : Joined<["--"], "gcc-install-dir=">,
 def gcc_toolchain : Joined<["--"], "gcc-toolchain=">, Flags<[NoXarchOption]>,
   HelpText<"Specify a directory where Clang can find 'include' and 
'lib{,32,64}/gcc{,-cross}/$triple/$version'. "
   "Clang will use the GCC installation with the largest version">;
+def gcc_triple_EQ : Joined<["--"], "gcc-triple=">,
+  HelpText<"Search for the GCC installation with the specified triple.">;
 def CC : Flag<["-"], "CC">, Visibility<[ClangOption, CC1Option]>,
   Group,
 HelpText<"Include comments from within macros in preprocessed output">,
diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp b/clang/lib/Driver/ToolChains/Gnu.cpp
index 9fb99145d3b909e..44b92a879a3a8c1 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -2130,6 +2130,14 @@ void Generic_GCC::GCCInstallationDetector::init(
 return;
   }
 
+  // If --gcc-triple is specified use this instead of trying to
+  // auto-detect a triple.
+  if (const Arg *A = Args.getLastArg(clang::driver::options::OPT_gcc_triple_EQ)) {
+StringRef GCCTriple = A->getValue();
+CandidateTripleAliases.clear();
+CandidateTripleAliases.push_back(GCCTriple);
+  }
+
   // Compute the set of prefixes for our search.
   SmallVector Prefixes;
   StringRef GCCToolchainDir = getGCCToolchainDir(Args, D.SysRoot);
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtbegin.o
 
b/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtbegin.o
new file mode 100644
index 000..e69de29bb2d1d64
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtend.o
 
b/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crtend.o
new file mode 100644
index 000..e69de29bb2d1d64
diff --git 
a/clang/test/Driver/Inputs/fedora_39_tree/usr/lib/gcc/x86_64-linux-gnu/13/crti.o
 
b/clang/test/Driver/Inputs/fedora_39_tree/us

[clang-tools-extra] [clangd] Add includes from source to non-self-contained headers (PR #72479)

2023-11-22 Thread Nathan Ridge via cfe-commits


HighCommander4 wrote:

A somewhat narrow use case where this approach might be workable is unity 
builds (groups of source files batched together to be compiled as a single 
translation unit for build performance).

That's a situation where the non-self-contained files of interest (the 
individual source files that are included in such a batch) have precisely one 
includer, their incoming preprocessor state can be replicated precisely using 
`-include` flags, and the intervention can be limited to them (it wouldn't 
apply to header files).

Of course, the including file is not likely to be open in this case, so we'd 
want something like an include graph stored in the index (as proposed in #123) 
to consult.
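
To make the `-include` idea concrete, here is a rough sketch of how the flags for one source file in a unity batch might be derived. It assumes the unity file consists solely of an ordered list of `#include`d sources; `includeFlagsFor` is a hypothetical helper, not an existing clangd function.

```cpp
#include <string>
#include <vector>

// Given the ordered list of sources that a unity file includes, build the
// extra flags for compiling Target on its own so that it sees the same
// preceding includes. Returns an empty list if Target is first in the batch
// or is not part of the batch at all.
std::vector<std::string>
includeFlagsFor(const std::vector<std::string> &UnitySources,
                const std::string &Target) {
  std::vector<std::string> Flags;
  for (const std::string &Src : UnitySources) {
    if (Src == Target)
      return Flags; // everything before Target has been replayed
    Flags.push_back("-include");
    Flags.push_back(Src);
  }
  return {}; // Target is not part of this unity batch
}
```

For a batch of `a.cpp`, `b.cpp`, `c.cpp`, the flags for `c.cpp` would be `-include a.cpp -include b.cpp`. Finding the batch that contains a given file is the part that would need the include graph mentioned above.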

https://github.com/llvm/llvm-project/pull/72479

[clang] [clang] Avoid memcopy for small structure with padding under -ftrivial-auto-var-init (PR #71677)

2023-11-22 Thread via cfe-commits


https://github.com/serge-sans-paille closed 
https://github.com/llvm/llvm-project/pull/71677

[clang] 0d2860b - [clang] Avoid memcopy for small structure with padding under -ftrivial-auto-var-init (#71677)

2023-11-22 Thread via cfe-commits


Author: serge-sans-paille
Date: 2023-11-23T05:38:14Z
New Revision: 0d2860b795879f4dd152963b52f969b53b136899

URL: 
https://github.com/llvm/llvm-project/commit/0d2860b795879f4dd152963b52f969b53b136899
DIFF: 
https://github.com/llvm/llvm-project/commit/0d2860b795879f4dd152963b52f969b53b136899.diff

LOG: [clang] Avoid memcopy for small structure with padding under -ftrivial-auto-var-init (#71677)
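
As background for the change below, which addresses each struct element at its layout offset and emits one store per element, the following sketch shows where the padding sits in a struct shaped like `padded` from auto-var-init.cpp. It is only a source-level illustration under the usual ABI assumption that `int` is placed at the first suitably aligned offset; the helper names are invented for the example.

```cpp
#include <cstddef>

// Same shape as the `padded` struct in auto-var-init.cpp.
struct padded { char c; int i; };

// Byte offsets at which per-field stores would land.
constexpr std::size_t offsetOfC() { return offsetof(padded, c); }
constexpr std::size_t offsetOfI() { return offsetof(padded, i); }

// Bytes between the end of `c` and the start of `i` that no field covers.
constexpr std::size_t paddingAfterC() {
  return offsetOfI() - (offsetOfC() + sizeof(char));
}

// Initialise the fields with two separate stores rather than copying a
// whole-struct constant.
void initFields(padded &P, char C, int I) {
  P.c = C;
  P.i = I;
}
```

On a typical target with 4-byte `int`, `offsetOfI()` is 4 and `paddingAfterC()` is 3, so a small padded struct can be initialised with two stores at offsets 0 and 4.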

Added: 


Modified: 
clang/lib/CodeGen/CGDecl.cpp
clang/test/CodeGenCXX/auto-var-init.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index e5795d811c76de7..a5da0aa2965a000 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -1244,29 +1244,24 @@ static void emitStoresForConstant(CodeGenModule &CGM, 
const VarDecl &D,
   // If the initializer is small, use a handful of stores.
   if (shouldSplitConstantStore(CGM, ConstantSize)) {
 if (auto *STy = dyn_cast(Ty)) {
-  // FIXME: handle the case when STy != Loc.getElementType().
-  if (STy == Loc.getElementType()) {
-for (unsigned i = 0; i != constant->getNumOperands(); i++) {
-  Address EltPtr = Builder.CreateStructGEP(Loc, i);
-  emitStoresForConstant(
-  CGM, D, EltPtr, isVolatile, Builder,
-  cast(Builder.CreateExtractValue(constant, i)),
-  IsAutoInit);
-}
-return;
+  const llvm::StructLayout *Layout =
+  CGM.getDataLayout().getStructLayout(STy);
+  for (unsigned i = 0; i != constant->getNumOperands(); i++) {
+CharUnits CurOff = CharUnits::fromQuantity(Layout->getElementOffset(i));
+Address EltPtr = Builder.CreateConstInBoundsByteGEP(
+Loc.withElementType(CGM.Int8Ty), CurOff);
+emitStoresForConstant(CGM, D, EltPtr, isVolatile, Builder,
+  constant->getAggregateElement(i), IsAutoInit);
   }
+  return;
 } else if (auto *ATy = dyn_cast(Ty)) {
-  // FIXME: handle the case when ATy != Loc.getElementType().
-  if (ATy == Loc.getElementType()) {
-for (unsigned i = 0; i != ATy->getNumElements(); i++) {
-  Address EltPtr = Builder.CreateConstArrayGEP(Loc, i);
-  emitStoresForConstant(
-  CGM, D, EltPtr, isVolatile, Builder,
-  cast(Builder.CreateExtractValue(constant, i)),
-  IsAutoInit);
-}
-return;
+  for (unsigned i = 0; i != ATy->getNumElements(); i++) {
+Address EltPtr = Builder.CreateConstGEP(
+Loc.withElementType(ATy->getElementType()), i);
+emitStoresForConstant(CGM, D, EltPtr, isVolatile, Builder,
+  constant->getAggregateElement(i), IsAutoInit);
   }
+  return;
 }
   }
 

diff --git a/clang/test/CodeGenCXX/auto-var-init.cpp b/clang/test/CodeGenCXX/auto-var-init.cpp
index 6cb18528ebadcdf..e5a9d015f22f276 100644
--- a/clang/test/CodeGenCXX/auto-var-init.cpp
+++ b/clang/test/CodeGenCXX/auto-var-init.cpp
@@ -89,22 +89,14 @@ struct padded { char c; int i; };
 // PATTERN-O1-NOT: @__const.test_paddednullinit_custom.custom
 struct paddednullinit { char c = 0; int i = 0; };
 // PATTERN-O0: @__const.test_paddedpacked_uninit.uninit = private unnamed_addr 
constant %struct.paddedpacked <{ i8 [[I8]], i32 [[I32]] }>, align 1
-// PATTERN: @__const.test_paddedpacked_custom.custom = private unnamed_addr 
constant %struct.paddedpacked <{ i8 42, i32 13371337 }>, align 1
-// ZERO: @__const.test_paddedpacked_custom.custom = private unnamed_addr 
constant %struct.paddedpacked <{ i8 42, i32 13371337 }>, align 1
 struct paddedpacked { char c; int i; } __attribute__((packed));
 // PATTERN-O0: @__const.test_paddedpackedarray_uninit.uninit = private 
unnamed_addr constant %struct.paddedpackedarray { [2 x %struct.paddedpacked] 
[%struct.paddedpacked <{ i8 [[I8]], i32 [[I32]] }>, %struct.paddedpacked <{ i8 
[[I8]], i32 [[I32]] }>] }, align 1
-// PATTERN: @__const.test_paddedpackedarray_custom.custom = private 
unnamed_addr constant %struct.paddedpackedarray { [2 x %struct.paddedpacked] 
[%struct.paddedpacked <{ i8 42, i32 13371337 }>, %struct.paddedpacked <{ i8 43, 
i32 13371338 }>] }, align 1
-// ZERO: @__const.test_paddedpackedarray_custom.custom = private unnamed_addr 
constant %struct.paddedpackedarray { [2 x %struct.paddedpacked] 
[%struct.paddedpacked <{ i8 42, i32 13371337 }>, %struct.paddedpacked <{ i8 43, 
i32 13371338 }>] }, align 1
 struct paddedpackedarray { struct paddedpacked p[2]; };
 // PATTERN-O0: @__const.test_unpackedinpacked_uninit.uninit = private 
unnamed_addr constant <{ { i8, [3 x i8], i32 }, i8 }> <{ { i8, [3 x i8], i32 } 
{ i8 [[I8]], [3 x i8] c"\[[IC]]\[[IC]]\[[IC]]", i32 [[I32]] }, i8 [[I8]] }>, 
align 1
 struct unpackedinpacked { padded a; char b; } __attribute__((packed));
 // PATTERN-O0: @__const.test_paddednested_uninit.uninit = private unnamed_addr 
constant { {

[flang] [clang] [flang][Driver] Support -nodefaultlibs, -nostartfiles and -nostdlib (PR #72601)

2023-11-22 Thread Brad Smith via cfe-commits


brad0 wrote:


> It would be helpful if you could clearly distinguish between the two opposite 
> cases that are currently being tested in "dynamic-linker.f90".

Ok, done.

https://github.com/llvm/llvm-project/pull/72601

[flang] [clang] [flang][Driver] Support -nodefaultlibs, -nostartfiles and -nostdlib (PR #72601)

2023-11-22 Thread Brad Smith via cfe-commits


https://github.com/brad0 updated https://github.com/llvm/llvm-project/pull/72601

From da17459071b039e9da0f53ae5e68ab1c3aaa13e4 Mon Sep 17 00:00:00 2001
From: Brad Smith 
Date: Wed, 15 Nov 2023 14:24:11 -0500
Subject: [PATCH] [flang][Driver] Support -nodefaultlibs, -nostartfiles and
 -nostdlib

---
 clang/include/clang/Driver/Options.td |  9 ---
 flang/test/Driver/dynamic-linker.f90  | 35 +++
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index b2f2bcb6ac37910..bc04c53d7c76b9d 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5152,7 +5152,8 @@ def : Flag<["-"], "nocudalib">, Alias;
 def gpulibc : Flag<["-"], "gpulibc">, Visibility<[ClangOption, CC1Option]>,
   HelpText<"Link the LLVM C Library for GPUs">;
 def nogpulibc : Flag<["-"], "nogpulibc">, Visibility<[ClangOption, CC1Option]>;
-def nodefaultlibs : Flag<["-"], "nodefaultlibs">;
+def nodefaultlibs : Flag<["-"], "nodefaultlibs">,
+  Visibility<[ClangOption, CLOption, DXCOption, FlangOption]>;
 def nodriverkitlib : Flag<["-"], "nodriverkitlib">;
 def nofixprebinding : Flag<["-"], "nofixprebinding">;
 def nolibc : Flag<["-"], "nolibc">;
@@ -5162,7 +5163,8 @@ def no_pie : Flag<["-"], "no-pie">, 
Visibility<[ClangOption, FlangOption]>;
 def noprebind : Flag<["-"], "noprebind">;
 def noprofilelib : Flag<["-"], "noprofilelib">;
 def noseglinkedit : Flag<["-"], "noseglinkedit">;
-def nostartfiles : Flag<["-"], "nostartfiles">, Group;
+def nostartfiles : Flag<["-"], "nostartfiles">, Group,
+  Visibility<[ClangOption, CLOption, DXCOption, FlangOption]>;
 def nostdinc : Flag<["-"], "nostdinc">,
   Visibility<[ClangOption, CLOption, DXCOption]>, Group;
 def nostdlibinc : Flag<["-"], "nostdlibinc">, Group;
@@ -5170,7 +5172,8 @@ def nostdincxx : Flag<["-"], "nostdinc++">, Visibility<[ClangOption, CC1Option]>
   Group,
   HelpText<"Disable standard #include directories for the C++ standard library">,
   MarshallingInfoNegativeFlag>;
-def nostdlib : Flag<["-"], "nostdlib">, Group;
+def nostdlib : Flag<["-"], "nostdlib">, Group,
+  Visibility<[ClangOption, CLOption, DXCOption, FlangOption]>;
 def nostdlibxx : Flag<["-"], "nostdlib++">;
 def object : Flag<["-"], "object">;
 def o : JoinedOrSeparate<["-"], "o">,
diff --git a/flang/test/Driver/dynamic-linker.f90 b/flang/test/Driver/dynamic-linker.f90
index df119c22a2ea516..aa90be5ac196e0e 100644
--- a/flang/test/Driver/dynamic-linker.f90
+++ b/flang/test/Driver/dynamic-linker.f90
@@ -18,3 +18,38 @@
 ! MSVC-LINKER-OPTIONS: "{{.*}}link{{(.exe)?}}"
 ! MSVC-LINKER-OPTIONS-SAME: "-dll"
 ! MSVC-LINKER-OPTIONS-SAME: "-rpath" "/path/to/dir"
+
+! Verify that certain linker flags are known to the frontend and are not passed on
+! to the linker.
+
+! RUN: %flang -### --target=x86_64-unknown-freebsd -nostdlib %s 2>&1 | FileCheck \
+! RUN: --check-prefixes=NOSTDLIB %s
+! RUN: %flang -### --target=x86_64-unknown-netbsd -nostdlib %s 2>&1 | FileCheck \
+! RUN: --check-prefixes=NOSTDLIB %s
+! RUN: %flang -### --target=i386-pc-solaris2.11 -nostdlib %s 2>&1 | FileCheck \
+! RUN: --check-prefixes=NOSTDLIB %s
+
+! NOSTDLIB: "{{.*}}ld{{(.exe)?}}"
+! NOSTDLIB-NOT: crt{{[^.]+}}.o
+! NOSTDLIB-NOT: "-lFortran_main" "-lFortranRuntime" "-lFortranDecimal" "-lm"
+
+! RUN: %flang -### --target=x86_64-unknown-freebsd -nodefaultlibs %s 2>&1 | FileCheck \
+! RUN: --check-prefixes=NODEFAULTLIBS %s
+! RUN: %flang -### --target=x86_64-unknown-netbsd -nodefaultlibs %s 2>&1 | FileCheck \
+! RUN: --check-prefixes=NODEFAULTLIBS %s
+! RUN: %flang -### --target=i386-pc-solaris2.11 -nodefaultlibs %s 2>&1 | FileCheck \
+! RUN: --check-prefixes=NODEFAULTLIBS %s
+
+! NODEFAULTLIBS: "{{.*}}ld{{(.exe)?}}"
+! NODEFAULTLIBS-NOT: "-lFortran_main" "-lFortranRuntime" "-lFortranDecimal" "-lm"
+
+! RUN: %flang -### --target=x86_64-unknown-freebsd -nostartfiles %s 2>&1 | FileCheck \
+! RUN: --check-prefixes=NOSTARTFILES %s
+! RUN: %flang -### --target=x86_64-unknown-netbsd -nostartfiles %s 2>&1 | FileCheck \
+! RUN: --check-prefixes=NOSTARTFILES %s
+! RUN: %flang -### --target=i386-pc-solaris2.11 -nostartfiles %s 2>&1 | FileCheck \
+! RUN: --check-prefixes=NOSTARTFILES %s
+
+! NOSTARTFILES: "{{.*}}ld{{(.exe)?}}"
+! NOSTARTFILES-NOT: crt{{[^.]+}}.o
+! NOSTARTFILES: "-lFortran_main" "-lFortranRuntime" "-lFortranDecimal" "-lm"
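
For readers skimming the test, the expectations it encodes can be restated as a small stand-alone model. This is only a sketch, not the driver's code: `LinkFlags`, `buildLinkArgs`, `hasArg`, and the `crt1.o`/`crti.o`/`crtn.o`/`input.o` names are hypothetical, and keeping the start files under `-nodefaultlibs` is an assumption (the test does not check it either way).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the three driver options exercised by the test.
struct LinkFlags {
  bool NoStdlib = false;
  bool NoDefaultLibs = false;
  bool NoStartFiles = false;
};

// Build a simplified linker command line. Object and library names are
// illustrative; a real toolchain picks them per target.
std::vector<std::string> buildLinkArgs(const LinkFlags &F) {
  // -nostdlib suppresses both the start files and the default libraries.
  const bool StartFiles = !F.NoStdlib && !F.NoStartFiles;
  const bool DefaultLibs = !F.NoStdlib && !F.NoDefaultLibs;

  std::vector<std::string> Args = {"ld"};
  if (StartFiles) {
    Args.push_back("crt1.o");
    Args.push_back("crti.o");
  }
  Args.push_back("input.o");
  if (DefaultLibs) {
    for (const char *Lib :
         {"-lFortran_main", "-lFortranRuntime", "-lFortranDecimal", "-lm"})
      Args.push_back(Lib);
  }
  if (StartFiles)
    Args.push_back("crtn.o");
  return Args;
}

// True if Name appears verbatim in Args.
bool hasArg(const std::vector<std::string> &Args, const std::string &Name) {
  return std::find(Args.begin(), Args.end(), Name) != Args.end();
}
```

With `NoStartFiles` set, the model drops the `crt*.o` entries but still emits the four runtime libraries, which is the shape the NOSTARTFILES checks look for.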


[clang] [Clang] CGCoroutines skip emitting try block for value returning `noexcept` init `await_resume` calls (PR #73160)

2023-11-22 Thread Chuanqi Xu via cfe-commits



@@ -38,9 +39,52 @@ Task coro_create() {
 co_return;
 }
 
-// CHECK-LABEL: define{{.*}} ptr @_Z11coro_createv(
+// CHECK-LABEL: define{{.*}} ptr @_ZN9can_throw11coro_createEv(
 // CHECK: init.ready:
 // CHECK-NEXT: store i1 true, ptr {{.*}}
-// CHECK-NEXT: call void @_ZN4Task23initial_suspend_awaiter12await_resumeEv(
-// CHECK-NEXT: call void @_ZN14NontrivialTypeD1Ev(
+// CHECK-NEXT: call void 
@_ZN9can_throw4Task23initial_suspend_awaiter12await_resumeEv(
+// CHECK-NEXT: call void @_ZN9can_throw14NontrivialTypeD1Ev(
 // CHECK-NEXT: store i1 false, ptr {{.*}}
+}
+
+namespace no_throw {
+struct NontrivialType {
+  ~NontrivialType() {}
+};
+
+struct Task {

ChuanqiXu9 wrote:

It looks a little bit confusing. Let's try to rename it to InitNoThrowTask.

https://github.com/llvm/llvm-project/pull/73160

[clang] [Clang] CGCoroutines skip emitting try block for value returning `noexcept` init `await_resume` calls (PR #73160)

2023-11-22 Thread Chuanqi Xu via cfe-commits



@@ -129,7 +130,14 @@ static SmallString<32> buildSuspendPrefixStr(CGCoroData 
&Coro, AwaitKind Kind) {
   return Prefix;
 }
 
-static bool memberCallExpressionCanThrow(const Expr *E) {
+static bool ResumeExprCanThrow(const CoroutineSuspendExpr &S) {
+  const Expr *E = S.getResumeExpr();
+
+  // If the return type of await_resume is not void, get the CXXMemberCallExpr
+  // from its subexpr.
+  if (const auto *BindTempExpr = dyn_cast(E)) {
+E = BindTempExpr->getSubExpr();
+  }

ChuanqiXu9 wrote:

This kind of pattern matching doesn't smell good. How about looking into its
children recursively if we find that `E` is not a `CXXMemberCallExpr`?
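
To make the suggestion concrete, here is a minimal sketch of the recursive search on a toy expression tree. It does not use the Clang AST: `Kind`, `Node`, `findMemberCall`, and `resumeCanThrow` are hypothetical names, and treating "no member call found" as "may throw" is an assumed conservative default.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy node kinds standing in for AST expression classes.
enum class Kind { BindTemporary, ImplicitCast, MemberCall, Other };

struct Node {
  Kind K;
  bool NoThrow; // Only meaningful for MemberCall nodes.
  std::vector<std::unique_ptr<Node>> Children;

  explicit Node(Kind K, bool NoThrow = false) : K(K), NoThrow(NoThrow) {}
};

// Depth-first search for the first member call, however many wrapper
// nodes sit above it.
const Node *findMemberCall(const Node &N) {
  if (N.K == Kind::MemberCall)
    return &N;
  for (const auto &Child : N.Children)
    if (const Node *Found = findMemberCall(*Child))
      return Found;
  return nullptr;
}

// Conservative answer: only a member call known to be noexcept is treated
// as non-throwing.
bool resumeCanThrow(const Node &ResumeExpr) {
  const Node *Call = findMemberCall(ResumeExpr);
  return !Call || !Call->NoThrow;
}
```

The point of the recursion is that the caller no longer needs to know which wrappers (a temporary binding, a cast, and so on) may surround the call.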

https://github.com/llvm/llvm-project/pull/73160

[clang] [Clang] CGCoroutines skip emitting try block for value returning `noexcept` init `await_resume` calls (PR #73160)

2023-11-22 Thread Chuanqi Xu via cfe-commits



@@ -12,9 +12,10 @@
 
 #include "CGCleanup.h"
 #include "CodeGenFunction.h"
-#include "llvm/ADT/ScopeExit.h"
+#include "clang/AST/ExprCXX.h"
 #include "clang/AST/StmtCXX.h"
 #include "clang/AST/StmtVisitor.h"
+#include "llvm/ADT/ScopeExit.h"

ChuanqiXu9 wrote:

Is this change necessary?

https://github.com/llvm/llvm-project/pull/73160

[clang] [clang] Add missing LinkageSpec case to getCursorKindForDecl (PR #72401)

2023-11-22 Thread Shivam Gupta via cfe-commits


xgupta wrote:

Does this fix https://github.com/llvm/llvm-project/issues/56687?

https://github.com/llvm/llvm-project/pull/72401

[clang] [clang] Add partial-inlining options (PR #73210)

2023-11-22 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang-driver

Author: Jolyon (Jolyon0202)


Changes

Add the -fpartial-inlining and -fno-partial-inlining options for compatibility
with GCC.

---
Full diff: https://github.com/llvm/llvm-project/pull/73210.diff


3 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+2) 
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+12) 
- (modified) clang/test/Driver/clang_f_opts.c (+5) 


```diff
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index b2f2bcb6ac37910..8bc1d7991fdd5cb 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3134,6 +3134,8 @@ def fno_inline_functions : Flag<["-"], "fno-inline-functions">, Group;
 def fno_inline : Flag<["-"], "fno-inline">, Group,
   Visibility<[ClangOption, CC1Option]>;
+def fpartial_inlining : Flag<["-"], "fpartial-inlining">, Group, Flags<[CC1Option]>;
+def fno_partial_inlining : Flag<["-"], "fno-partial-inlining">, Group, Flags<[CC1Option]>;
 def fno_global_isel : Flag<["-"], "fno-global-isel">, Group,
   HelpText<"Disables the global instruction selector">;
 def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group,
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 6dec117aed1056b..028204140d1265f 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6923,6 +6923,18 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
 
   Args.AddLastArg(CmdArgs, options::OPT_finline_max_stacksize_EQ);
 
+  // Adaptation of partial-inlining option with GCC.
+  if (Arg *A = Args.getLastArg(options::OPT_fno_partial_inlining,
+   options::OPT_fpartial_inlining)) {
+if (A->getOption().matches(options::OPT_fno_partial_inlining)) {
+  CmdArgs.push_back("-mllvm");
+  CmdArgs.push_back("-disable-partial-inlining");
+} else if (A->getOption().matches(options::OPT_fpartial_inlining)) {
+  CmdArgs.push_back("-mllvm");
+  CmdArgs.push_back("-enable-partial-inlining");
+}
+  }
+
   // FIXME: Find a better way to determine whether we are in C++20.
   bool HaveCxx20 =
   Std &&
diff --git a/clang/test/Driver/clang_f_opts.c b/clang/test/Driver/clang_f_opts.c
index ebe8a0520bf0fca..bab1ea33dd7c941 100644
--- a/clang/test/Driver/clang_f_opts.c
+++ b/clang/test/Driver/clang_f_opts.c
@@ -611,3 +611,8 @@
 // CHECK-INT-OBJEMITTER-NOT: unsupported option '-fintegrated-objemitter' for target
 // RUN: not %clang -### -fno-integrated-objemitter --target=x86_64 %s 2>&1 | FileCheck -check-prefix=CHECK-NOINT-OBJEMITTER %s
 // CHECK-NOINT-OBJEMITTER: unsupported option '-fno-integrated-objemitter' for target
+
+// RUN: %clang -### -S -fpartial-inlining %s 2>&1 | FileCheck -check-prefix=CHECK-PARTIAL-INLINING %s
+// CHECK-PARTIAL-INLINING: "-mllvm" "-enable-partial-inlining"
+// RUN: %clang -### -S -fno-partial-inlining %s 2>&1 | FileCheck -check-prefix=CHECK-NO-PARTIAL-INLINING %s
+// CHECK-NO-PARTIAL-INLINING: "-mllvm" "-disable-partial-inlining"

```
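
The driver hunk in this diff follows a "last flag wins" rule: whichever of the two options appears last on the command line decides what is forwarded. A stand-alone sketch of that rule, using plain strings instead of the driver's option table (the helper name `partialInliningArgs` is hypothetical):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return the arguments to forward, decided by whichever of the two
// partial-inlining flags appears last; nothing is forwarded if neither
// flag is present.
std::vector<std::string>
partialInliningArgs(const std::vector<std::string> &DriverArgs) {
  const std::string *Last = nullptr;
  for (const std::string &Arg : DriverArgs)
    if (Arg == "-fpartial-inlining" || Arg == "-fno-partial-inlining")
      Last = &Arg;

  if (!Last)
    return {};
  if (*Last == "-fno-partial-inlining")
    return {"-mllvm", "-disable-partial-inlining"};
  return {"-mllvm", "-enable-partial-inlining"};
}
```

Passing both flags in either order therefore yields exactly one `-mllvm` pair, matching the later flag.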




https://github.com/llvm/llvm-project/pull/73210

[clang] [clang] Add partial-inlining options (PR #73210)

2023-11-22 Thread via cfe-commits


https://github.com/Jolyon0202 created 
https://github.com/llvm/llvm-project/pull/73210

Add the -fpartial-inlining and -fno-partial-inlining options for compatibility
with GCC.

>From f525387d65a1cdee561f919b3351b528bd44a535 Mon Sep 17 00:00:00 2001
From: Jian Yang 
Date: Thu, 23 Nov 2023 12:54:52 +0800
Subject: [PATCH] [clang] Add partial-inlining options

Add the -fpartial-inlining and -fno-partial-inlining options for compatibility
with GCC.
---
 clang/include/clang/Driver/Options.td |  2 ++
 clang/lib/Driver/ToolChains/Clang.cpp | 12 
 clang/test/Driver/clang_f_opts.c  |  5 +
 3 files changed, 19 insertions(+)

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index b2f2bcb6ac37910..8bc1d7991fdd5cb 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3134,6 +3134,8 @@ def fno_inline_functions : Flag<["-"], 
"fno-inline-functions">, Group;
 def fno_inline : Flag<["-"], "fno-inline">, Group,
   Visibility<[ClangOption, CC1Option]>;
+def fpartial_inlining : Flag<["-"], "fpartial-inlining">, 
Group, Flags<[CC1Option]>;
+def fno_partial_inlining : Flag<["-"], "fno-partial-inlining">, 
Group, Flags<[CC1Option]>;
 def fno_global_isel : Flag<["-"], "fno-global-isel">, Group,
   HelpText<"Disables the global instruction selector">;
 def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, 
Group,
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 6dec117aed1056b..028204140d1265f 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6923,6 +6923,18 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
 
   Args.AddLastArg(CmdArgs, options::OPT_finline_max_stacksize_EQ);
 
+  // Adaptation of partial-inlining option with GCC.
+  if (Arg *A = Args.getLastArg(options::OPT_fno_partial_inlining,
+   options::OPT_fpartial_inlining)) {
+if (A->getOption().matches(options::OPT_fno_partial_inlining)) {
+  CmdArgs.push_back("-mllvm");
+  CmdArgs.push_back("-disable-partial-inlining");
+} else if (A->getOption().matches(options::OPT_fpartial_inlining)) {
+  CmdArgs.push_back("-mllvm");
+  CmdArgs.push_back("-enable-partial-inlining");
+}
+  }
+
   // FIXME: Find a better way to determine whether we are in C++20.
   bool HaveCxx20 =
   Std &&
diff --git a/clang/test/Driver/clang_f_opts.c b/clang/test/Driver/clang_f_opts.c
index ebe8a0520bf0fca..bab1ea33dd7c941 100644
--- a/clang/test/Driver/clang_f_opts.c
+++ b/clang/test/Driver/clang_f_opts.c
@@ -611,3 +611,8 @@
 // CHECK-INT-OBJEMITTER-NOT: unsupported option '-fintegrated-objemitter' for 
target
 // RUN: not %clang -### -fno-integrated-objemitter --target=x86_64 %s 2>&1 | 
FileCheck -check-prefix=CHECK-NOINT-OBJEMITTER %s
 // CHECK-NOINT-OBJEMITTER: unsupported option '-fno-integrated-objemitter' for 
target
+
+// RUN: %clang -### -S -fpartial-inlining %s 2>&1 | FileCheck 
-check-prefix=CHECK-PARTIAL-INLINING %s
+// CHECK-PARTIAL-INLINING: "-mllvm" "-enable-partial-inlining"
+// RUN: %clang -### -S -fno-partial-inlining %s 2>&1 | FileCheck 
-check-prefix=CHECK-NO-PARTIAL-INLINING %s
+// CHECK-NO-PARTIAL-INLINING: "-mllvm" "-disable-partial-inlining"


[llvm] [clang] ms inline asm: Fix {call,jmp} fptr (PR #73207)

2023-11-22 Thread Fangrui Song via cfe-commits


https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/73207

>From f8d61499c92d98e3c29027f0137e9d2f734d39c0 Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Wed, 22 Nov 2023 16:14:14 -0800
Subject: [PATCH] ms inline asm: Fix {call,jmp} fptr

https://reviews.llvm.org/D151863 (2023-05) removed
`BaseReg = BaseReg ? BaseReg : 1` (introduced in commit
175d0aeef3725ce17032e9ef76e018139f2f52f0 (2013)) and caused a
regression: ensuring a non-zero `BaseReg` was to treat
`static void (*fptr)(); __asm { call fptr }` as non-`AbsMem`
`AsmOperandClass` and generate `call $0`/`callq *fptr(%rip)` instead of
`call ${0:P}`/`callq *fptr`
(The asm template argument modifier `P` is for call targets, while
no modifier is used by other instructions like `mov rax, fptr`)

This patch reinstates the BaseReg-setting statement but places it inside
`if (IsGlobalLV)` for clarity.

The special case is unfortunate, but we also have special case in
similar places (https://reviews.llvm.org/D149920).

Fix: #73033
---
 clang/test/CodeGen/ms-inline-asm-64.c |  7 +++-
 .../lib/Target/X86/AsmParser/X86AsmParser.cpp | 38 +++
 2 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/clang/test/CodeGen/ms-inline-asm-64.c 
b/clang/test/CodeGen/ms-inline-asm-64.c
index 313d380e121bce0..c7e4c1b603bd76c 100644
--- a/clang/test/CodeGen/ms-inline-asm-64.c
+++ b/clang/test/CodeGen/ms-inline-asm-64.c
@@ -60,17 +60,22 @@ int t4(void) {
 }
 
 void bar() {}
+static void (*fptr)();
 
 void t5(void) {
   __asm {
 call bar
 jmp bar
+call fptr
+jmp fptr
   }
   // CHECK: t5
   // CHECK: call void asm sideeffect inteldialect
   // CHECK-SAME: call ${0:P}
   // CHECK-SAME: jmp ${1:P}
-  // CHECK-SAME: "*m,*m,~{dirflag},~{fpsr},~{flags}"(ptr elementtype(void 
(...)) @bar, ptr elementtype(void (...)) @bar)
+  // CHECK-SAME: call $2
+  // CHECK-SAME: jmp $3
+  // CHECK-SAME: "*m,*m,*m,*m,~{dirflag},~{fpsr},~{flags}"(ptr 
elementtype(void (...)) @bar, ptr elementtype(void (...)) @bar, ptr 
elementtype(ptr) @fptr, ptr elementtype(ptr) @fptr)
 }
 
 void t47(void) {
diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp 
b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index 008075163b90a8d..a02978c64412cf7 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -1144,8 +1144,8 @@ class X86AsmParser : public MCTargetAsmParser {
   bool ParseIntelMemoryOperandSize(unsigned &Size);
   bool CreateMemForMSInlineAsm(unsigned SegReg, const MCExpr *Disp,
unsigned BaseReg, unsigned IndexReg,
-   unsigned Scale, SMLoc Start, SMLoc End,
-   unsigned Size, StringRef Identifier,
+   unsigned Scale, bool NonAbsMem, SMLoc Start,
+   SMLoc End, unsigned Size, StringRef Identifier,
const InlineAsmIdentifierInfo &Info,
OperandVector &Operands);
 
@@ -1745,10 +1745,13 @@ bool X86AsmParser::parseOperand(OperandVector 
&Operands, StringRef Name) {
   return parseATTOperand(Operands);
 }
 
-bool X86AsmParser::CreateMemForMSInlineAsm(
-unsigned SegReg, const MCExpr *Disp, unsigned BaseReg, unsigned IndexReg,
-unsigned Scale, SMLoc Start, SMLoc End, unsigned Size, StringRef 
Identifier,
-const InlineAsmIdentifierInfo &Info, OperandVector &Operands) {
+bool X86AsmParser::CreateMemForMSInlineAsm(unsigned SegReg, const MCExpr *Disp,
+   unsigned BaseReg, unsigned IndexReg,
+   unsigned Scale, bool NonAbsMem,
+   SMLoc Start, SMLoc End,
+   unsigned Size, StringRef Identifier,
+   const InlineAsmIdentifierInfo &Info,
+   OperandVector &Operands) {
   // If we found a decl other than a VarDecl, then assume it is a FuncDecl or
   // some other label reference.
   if (Info.isKind(InlineAsmIdentifierInfo::IK_Label)) {
@@ -1773,11 +1776,15 @@ bool X86AsmParser::CreateMemForMSInlineAsm(
   }
   // It is widely common for MS InlineAsm to use a global variable and one/two
   // registers in a mmory expression, and though unaccessible via rip/eip.
-  if (IsGlobalLV && (BaseReg || IndexReg)) {
-Operands.push_back(X86Operand::CreateMem(getPointerWidth(), Disp, Start,
- End, Size, Identifier, Decl, 0,
- BaseReg && IndexReg));
-return false;
+  if (IsGlobalLV) {
+if (BaseReg || IndexReg) {
+  Operands.push_back(X86Operand::CreateMem(getPointerWidth(), Disp, Start,
+   End, Size, Identifier, Decl, 0,
+   BaseReg && IndexReg));
+

[llvm] [clang] ms inline asm: Fix {call,jmp} fptr (PR #73207)

2023-11-22 Thread Fangrui Song via cfe-commits


https://github.com/MaskRay edited 
https://github.com/llvm/llvm-project/pull/73207

[llvm] [clang] ms inline asm: Fix {call,jmp} fptr (PR #73207)

2023-11-22 Thread Fangrui Song via cfe-commits


MaskRay wrote:

I should confess that I don't understand the mechanism well. I've tried hard to 
write a good description, but I cannot improve the comments in 
`CreateMemForMSInlineAsm`.

https://github.com/llvm/llvm-project/pull/73207

[llvm] [clang] ms inline asm: Fix {call,jmp} fptr (PR #73207)

2023-11-22 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Fangrui Song (MaskRay)


Changes

https://reviews.llvm.org/D151863 (2023-05) removed
`BaseReg = BaseReg ? BaseReg : 1` (introduced in commit
175d0aeef3725ce17032e9ef76e018139f2f52f0 (2013)) and caused a
regression: ensuring a non-zero `BaseReg` was to treat
`static void (*fptr)(); __asm { call fptr }` as non-`AbsMem`
`AsmOperandClass` and generate `call $0`/`callq *fptr(%rip)` instead of
`call ${0:P}`/`callq *fptr`
(The asm template argument modifier `P` is for call targets, while
no modifier is used by other instructions like `mov rax, fptr`)

This patch reinstates the BaseReg-setting statement but places it inside
`if (IsGlobalLV)` for clarity.

The special case is unfortunate, but we also have special case in
similar places (https://reviews.llvm.org/D149920).
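
As a rough illustration of the description above, the sketch below models only the two facts stated in this PR: a non-zero `BaseReg` makes the operand non-`AbsMem`, and an `AbsMem` operand gets the `P` modifier in the call template while a non-`AbsMem` one does not. `MemOperand`, `makeGlobalOperand`, and `callTemplate` are hypothetical names; the real `isAbsMem()` and the real template rewriting check more than this.

```cpp
#include <cassert>
#include <string>

// Toy memory operand: absolute only when no register is involved.
struct MemOperand {
  unsigned SegReg = 0;
  unsigned BaseReg = 0;
  unsigned IndexReg = 0;

  bool isAbsMem() const {
    return SegReg == 0 && BaseReg == 0 && IndexReg == 0;
  }
};

// Mirrors the patch's "BaseReg = 1; // Make isAbsMem() false" step for a
// global variable operand such as the function pointer `fptr`.
MemOperand makeGlobalOperand(bool NonAbsMem) {
  MemOperand Op;
  if (NonAbsMem)
    Op.BaseReg = 1;
  return Op;
}

// Render the asm template text for operand number N of a call.
std::string callTemplate(const MemOperand &Op, unsigned N) {
  if (Op.isAbsMem())
    return "call ${" + std::to_string(N) + ":P}";
  return "call $" + std::to_string(N);
}
```

In this model, `callTemplate(makeGlobalOperand(true), 2)` gives `call $2`, the form the updated test expects for `call fptr`.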

---
Full diff: https://github.com/llvm/llvm-project/pull/73207.diff


2 Files Affected:

- (modified) clang/test/CodeGen/ms-inline-asm-64.c (+6-1) 
- (modified) llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp (+23-15) 


``diff
diff --git a/clang/test/CodeGen/ms-inline-asm-64.c 
b/clang/test/CodeGen/ms-inline-asm-64.c
index 313d380e121bce0..c7e4c1b603bd76c 100644
--- a/clang/test/CodeGen/ms-inline-asm-64.c
+++ b/clang/test/CodeGen/ms-inline-asm-64.c
@@ -60,17 +60,22 @@ int t4(void) {
 }
 
 void bar() {}
+static void (*fptr)();
 
 void t5(void) {
   __asm {
 call bar
 jmp bar
+call fptr
+jmp fptr
   }
   // CHECK: t5
   // CHECK: call void asm sideeffect inteldialect
   // CHECK-SAME: call ${0:P}
   // CHECK-SAME: jmp ${1:P}
-  // CHECK-SAME: "*m,*m,~{dirflag},~{fpsr},~{flags}"(ptr elementtype(void 
(...)) @bar, ptr elementtype(void (...)) @bar)
+  // CHECK-SAME: call $2
+  // CHECK-SAME: jmp $3
+  // CHECK-SAME: "*m,*m,*m,*m,~{dirflag},~{fpsr},~{flags}"(ptr 
elementtype(void (...)) @bar, ptr elementtype(void (...)) @bar, ptr 
elementtype(ptr) @fptr, ptr elementtype(ptr) @fptr)
 }
 
 void t47(void) {
diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp 
b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index 008075163b90a8d..a02978c64412cf7 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -1144,8 +1144,8 @@ class X86AsmParser : public MCTargetAsmParser {
   bool ParseIntelMemoryOperandSize(unsigned &Size);
   bool CreateMemForMSInlineAsm(unsigned SegReg, const MCExpr *Disp,
unsigned BaseReg, unsigned IndexReg,
-   unsigned Scale, SMLoc Start, SMLoc End,
-   unsigned Size, StringRef Identifier,
+   unsigned Scale, bool NonAbsMem, SMLoc Start,
+   SMLoc End, unsigned Size, StringRef Identifier,
const InlineAsmIdentifierInfo &Info,
OperandVector &Operands);
 
@@ -1745,10 +1745,13 @@ bool X86AsmParser::parseOperand(OperandVector 
&Operands, StringRef Name) {
   return parseATTOperand(Operands);
 }
 
-bool X86AsmParser::CreateMemForMSInlineAsm(
-unsigned SegReg, const MCExpr *Disp, unsigned BaseReg, unsigned IndexReg,
-unsigned Scale, SMLoc Start, SMLoc End, unsigned Size, StringRef 
Identifier,
-const InlineAsmIdentifierInfo &Info, OperandVector &Operands) {
+bool X86AsmParser::CreateMemForMSInlineAsm(unsigned SegReg, const MCExpr *Disp,
+   unsigned BaseReg, unsigned IndexReg,
+   unsigned Scale, bool NonAbsMem,
+   SMLoc Start, SMLoc End,
+   unsigned Size, StringRef Identifier,
+   const InlineAsmIdentifierInfo &Info,
+   OperandVector &Operands) {
   // If we found a decl other than a VarDecl, then assume it is a FuncDecl or
   // some other label reference.
   if (Info.isKind(InlineAsmIdentifierInfo::IK_Label)) {
@@ -1773,11 +1776,15 @@ bool X86AsmParser::CreateMemForMSInlineAsm(
   }
   // It is widely common for MS InlineAsm to use a global variable and one/two
   // registers in a mmory expression, and though unaccessible via rip/eip.
-  if (IsGlobalLV && (BaseReg || IndexReg)) {
-Operands.push_back(X86Operand::CreateMem(getPointerWidth(), Disp, Start,
- End, Size, Identifier, Decl, 0,
- BaseReg && IndexReg));
-return false;
+  if (IsGlobalLV) {
+if (BaseReg || IndexReg) {
+  Operands.push_back(X86Operand::CreateMem(getPointerWidth(), Disp, Start,
+   End, Size, Identifier, Decl, 0,
+   BaseReg && IndexReg));
+  return false;
+}
+if (NonAbsMem)
+  BaseReg = 1; // Make isAbsMem() false
   }
   Operands.push_back(X86Operand::CreateMem(

[llvm] [clang] ms inline asm: Fix {call,jmp} fptr (PR #73207)

2023-11-22 Thread Fangrui Song via cfe-commits


https://github.com/MaskRay edited 
https://github.com/llvm/llvm-project/pull/73207

[llvm] [clang] ms inline asm: Fix {call,jmp} fptr (PR #73207)

2023-11-22 Thread Fangrui Song via cfe-commits


https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/73207

>From 5b74d57faf8ab5fae4d9512517d6ec4d888a6ecd Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Wed, 22 Nov 2023 16:14:14 -0800
Subject: [PATCH] ms inline asm: Fix {call,jmp} fptr

https://reviews.llvm.org/D151863 (2023-05) removed
`BaseReg = BaseReg ? BaseReg : 1` (introduced in commit
175d0aeef3725ce17032e9ef76e018139f2f52f0 (2013)) and caused a
regression: ensuring a non-zero `BaseReg` was to treat
`static void (*fptr)(); __asm { call fptr }` as non-`AbsMem`
`AsmOperandClass` and generate `call $0`/`callq *fptr(%rip)` instead of
`call ${0:P}`/`callq *fptr`
(The asm template argument modifier `P` is for call targets, while
no modifier is used by other instructions like `mov rax, fptr`)

This patch reinstates the BaseReg-setting statement but places it inside
`if (IsGlobalLV)` for clarity.

The special case is unfortunate, but we also have special case in
similar places (https://reviews.llvm.org/D149920).
---
 clang/test/CodeGen/ms-inline-asm-64.c |  7 +++-
 .../lib/Target/X86/AsmParser/X86AsmParser.cpp | 38 +++
 2 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/clang/test/CodeGen/ms-inline-asm-64.c 
b/clang/test/CodeGen/ms-inline-asm-64.c
index 313d380e121bce0..c7e4c1b603bd76c 100644
--- a/clang/test/CodeGen/ms-inline-asm-64.c
+++ b/clang/test/CodeGen/ms-inline-asm-64.c
@@ -60,17 +60,22 @@ int t4(void) {
 }
 
 void bar() {}
+static void (*fptr)();
 
 void t5(void) {
   __asm {
 call bar
 jmp bar
+call fptr
+jmp fptr
   }
   // CHECK: t5
   // CHECK: call void asm sideeffect inteldialect
   // CHECK-SAME: call ${0:P}
   // CHECK-SAME: jmp ${1:P}
-  // CHECK-SAME: "*m,*m,~{dirflag},~{fpsr},~{flags}"(ptr elementtype(void 
(...)) @bar, ptr elementtype(void (...)) @bar)
+  // CHECK-SAME: call $2
+  // CHECK-SAME: jmp $3
+  // CHECK-SAME: "*m,*m,*m,*m,~{dirflag},~{fpsr},~{flags}"(ptr 
elementtype(void (...)) @bar, ptr elementtype(void (...)) @bar, ptr 
elementtype(ptr) @fptr, ptr elementtype(ptr) @fptr)
 }
 
 void t47(void) {
diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp 
b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index 008075163b90a8d..a02978c64412cf7 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -1144,8 +1144,8 @@ class X86AsmParser : public MCTargetAsmParser {
   bool ParseIntelMemoryOperandSize(unsigned &Size);
   bool CreateMemForMSInlineAsm(unsigned SegReg, const MCExpr *Disp,
unsigned BaseReg, unsigned IndexReg,
-   unsigned Scale, SMLoc Start, SMLoc End,
-   unsigned Size, StringRef Identifier,
+   unsigned Scale, bool NonAbsMem, SMLoc Start,
+   SMLoc End, unsigned Size, StringRef Identifier,
const InlineAsmIdentifierInfo &Info,
OperandVector &Operands);
 
@@ -1745,10 +1745,13 @@ bool X86AsmParser::parseOperand(OperandVector 
&Operands, StringRef Name) {
   return parseATTOperand(Operands);
 }
 
-bool X86AsmParser::CreateMemForMSInlineAsm(
-unsigned SegReg, const MCExpr *Disp, unsigned BaseReg, unsigned IndexReg,
-unsigned Scale, SMLoc Start, SMLoc End, unsigned Size, StringRef 
Identifier,
-const InlineAsmIdentifierInfo &Info, OperandVector &Operands) {
+bool X86AsmParser::CreateMemForMSInlineAsm(unsigned SegReg, const MCExpr *Disp,
+   unsigned BaseReg, unsigned IndexReg,
+   unsigned Scale, bool NonAbsMem,
+   SMLoc Start, SMLoc End,
+   unsigned Size, StringRef Identifier,
+   const InlineAsmIdentifierInfo &Info,
+   OperandVector &Operands) {
   // If we found a decl other than a VarDecl, then assume it is a FuncDecl or
   // some other label reference.
   if (Info.isKind(InlineAsmIdentifierInfo::IK_Label)) {
@@ -1773,11 +1776,15 @@ bool X86AsmParser::CreateMemForMSInlineAsm(
   }
   // It is widely common for MS InlineAsm to use a global variable and one/two
   // registers in a mmory expression, and though unaccessible via rip/eip.
-  if (IsGlobalLV && (BaseReg || IndexReg)) {
-Operands.push_back(X86Operand::CreateMem(getPointerWidth(), Disp, Start,
- End, Size, Identifier, Decl, 0,
- BaseReg && IndexReg));
-return false;
+  if (IsGlobalLV) {
+if (BaseReg || IndexReg) {
+  Operands.push_back(X86Operand::CreateMem(getPointerWidth(), Disp, Start,
+   End, Size, Identifier, Decl, 0,
+   BaseReg && IndexReg));
+  return false;

[llvm] [clang] ms inline asm: Fix {call,jmp} fptr (PR #73207)

2023-11-22 Thread Fangrui Song via cfe-commits


https://github.com/MaskRay created 
https://github.com/llvm/llvm-project/pull/73207

https://reviews.llvm.org/D151863 (2023-05) removed
`BaseReg = BaseReg ? BaseReg : 1` (introduced in commit
175d0aeef3725ce17032e9ef76e018139f2f52f0 (2013)) and caused a
regression: ensuring a non-zero `BaseReg` was to treat
`static void (*fptr)(); __asm { call fptr }` as non-`AbsMem`
`AsmOperandClass` and generate `call $0`/`callq *fptr(%rip)` instead of
`call ${0:P}`/`callq *fptr`
(The asm template argument modifier `P` is for call targets, while
no modifier is used by other instructions like `mov rax, fptr`)

This patch reinstates the BaseReg-setting statement but places it inside
`if (IsGlobalLV)` for clarity.

The special case is unfortunate, but we also have special case in
similar places (https://reviews.llvm.org/D149920).


>From a2b0d74d35ea21612e96e322aed336cdc0b282ae Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Wed, 22 Nov 2023 16:14:14 -0800
Subject: [PATCH] ms inline asm: Fix {call,jmp} fptr

https://reviews.llvm.org/D151863 (2023-05) removed
`BaseReg = BaseReg ? BaseReg : 1` (introduced in commit
175d0aeef3725ce17032e9ef76e018139f2f52f0 (2013)) and caused a
regression: ensuring a non-zero `BaseReg` was to treat
`static void (*fptr)(); __asm { call fptr }` as non-`AbsMem`
`AsmOperandClass` and generate `call $0`/`callq *fptr(%rip)` instead of
`call ${0:P}`/`callq *fptr`
(The asm template argument modifier `P` is for call targets, while
no modifier is used by other instructions like `mov rax, fptr`)

This patch reinstates the BaseReg-setting statement but places it inside
`if (IsGlobalLV)` for clarity.

The special case is unfortunate, but we also have special case in
similar places (https://reviews.llvm.org/D149920).
---
 clang/test/CodeGen/ms-inline-asm-64.c |  7 +++-
 .../lib/Target/X86/AsmParser/X86AsmParser.cpp | 38 +++
 2 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/clang/test/CodeGen/ms-inline-asm-64.c 
b/clang/test/CodeGen/ms-inline-asm-64.c
index 313d380e121bce0..c7e4c1b603bd76c 100644
--- a/clang/test/CodeGen/ms-inline-asm-64.c
+++ b/clang/test/CodeGen/ms-inline-asm-64.c
@@ -60,17 +60,22 @@ int t4(void) {
 }
 
 void bar() {}
+static void (*fptr)();
 
 void t5(void) {
   __asm {
 call bar
 jmp bar
+call fptr
+jmp fptr
   }
   // CHECK: t5
   // CHECK: call void asm sideeffect inteldialect
   // CHECK-SAME: call ${0:P}
   // CHECK-SAME: jmp ${1:P}
-  // CHECK-SAME: "*m,*m,~{dirflag},~{fpsr},~{flags}"(ptr elementtype(void 
(...)) @bar, ptr elementtype(void (...)) @bar)
+  // CHECK-SAME: call $2
+  // CHECK-SAME: jmp $3
+  // CHECK-SAME: "*m,*m,*m,*m,~{dirflag},~{fpsr},~{flags}"(ptr 
elementtype(void (...)) @bar, ptr elementtype(void (...)) @bar, ptr 
elementtype(ptr) @fptr, ptr elementtype(ptr) @fptr)
 }
 
 void t47(void) {
diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp 
b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index 008075163b90a8d..a02978c64412cf7 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -1144,8 +1144,8 @@ class X86AsmParser : public MCTargetAsmParser {
   bool ParseIntelMemoryOperandSize(unsigned &Size);
   bool CreateMemForMSInlineAsm(unsigned SegReg, const MCExpr *Disp,
unsigned BaseReg, unsigned IndexReg,
-   unsigned Scale, SMLoc Start, SMLoc End,
-   unsigned Size, StringRef Identifier,
+   unsigned Scale, bool NonAbsMem, SMLoc Start,
+   SMLoc End, unsigned Size, StringRef Identifier,
const InlineAsmIdentifierInfo &Info,
OperandVector &Operands);
 
@@ -1745,10 +1745,13 @@ bool X86AsmParser::parseOperand(OperandVector 
&Operands, StringRef Name) {
   return parseATTOperand(Operands);
 }
 
-bool X86AsmParser::CreateMemForMSInlineAsm(
-unsigned SegReg, const MCExpr *Disp, unsigned BaseReg, unsigned IndexReg,
-unsigned Scale, SMLoc Start, SMLoc End, unsigned Size, StringRef 
Identifier,
-const InlineAsmIdentifierInfo &Info, OperandVector &Operands) {
+bool X86AsmParser::CreateMemForMSInlineAsm(unsigned SegReg, const MCExpr *Disp,
+   unsigned BaseReg, unsigned IndexReg,
+   unsigned Scale, bool NonAbsMem,
+   SMLoc Start, SMLoc End,
+   unsigned Size, StringRef Identifier,
+   const InlineAsmIdentifierInfo &Info,
+   OperandVector &Operands) {
   // If we found a decl other than a VarDecl, then assume it is a FuncDecl or
   // some other label reference.
   if (Info.isKind(InlineAsmIdentifierInfo::IK_Label)) {
@@ -1773,11 +1776,15 @@ bool X86AsmParser::CreateMemForMSInlineAsm(
   }
   // It is

[clang] [RISCV] Use Float type instead of Half type for Fixed RVV vector type mangling (PR #73091)

2023-11-22 Thread Jianjian Guan via cfe-commits


https://github.com/jacquesguan closed 
https://github.com/llvm/llvm-project/pull/73091

[clang] 2eb9c64 - [RISCV] Use Float type instead of Half type for Fixed RVV vector type mangling (#73091)

2023-11-22 Thread via cfe-commits


Author: Jianjian Guan
Date: 2023-11-23T11:08:27+08:00
New Revision: 2eb9c649f0971aaa05404764d74ee7fff15b83ed

URL: 
https://github.com/llvm/llvm-project/commit/2eb9c649f0971aaa05404764d74ee7fff15b83ed
DIFF: 
https://github.com/llvm/llvm-project/commit/2eb9c649f0971aaa05404764d74ee7fff15b83ed.diff

LOG: [RISCV] Use Float type instead of Half type for Fixed RVV vector type 
mangling (#73091)

Added: 


Modified: 
clang/lib/AST/ItaniumMangle.cpp
clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp

Removed: 




diff  --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp
index 2a62ac0175afb72..b1678479888eb77 100644
--- a/clang/lib/AST/ItaniumMangle.cpp
+++ b/clang/lib/AST/ItaniumMangle.cpp
@@ -4029,7 +4029,7 @@ void CXXNameMangler::mangleRISCVFixedRVVVectorType(const 
VectorType *T) {
   case BuiltinType::ULong:
 TypeNameOS << "uint64";
 break;
-  case BuiltinType::Half:
+  case BuiltinType::Float16:
 TypeNameOS << "float16";
 break;
   case BuiltinType::Float:

diff  --git a/clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp 
b/clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp
index 98fb27b704fd81d..32bd49f4ff725db 100644
--- a/clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp
+++ b/clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp
@@ -1,23 +1,23 @@
 // RUN: %clang_cc1 -triple riscv64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:  -target-feature +f -target-feature +d \
-// RUN:  -target-feature +zve64d -mvscale-min=1 -mvscale-max=1 \
-// RUN:  | FileCheck %s --check-prefix=CHECK-64
+// RUN:  -target-feature +f -target-feature +d -target-feature +zfh \
+// RUN:  -target-feature +zve64d -target-feature +zvfh -mvscale-min=1 \
+// RUN:   -mvscale-max=1 | FileCheck %s --check-prefix=CHECK-64
 // RUN: %clang_cc1 -triple riscv64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:  -target-feature +f -target-feature +d \
-// RUN:  -target-feature +zve64d -mvscale-min=2 -mvscale-max=2 \
-// RUN:  | FileCheck %s --check-prefix=CHECK-128
+// RUN:  -target-feature +f -target-feature +d -target-feature +zfh \
+// RUN:  -target-feature +zve64d -target-feature +zvfh -mvscale-min=2 \
+// RUN:  -mvscale-max=2 | FileCheck %s --check-prefix=CHECK-128
 // RUN: %clang_cc1 -triple riscv64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:  -target-feature +f -target-feature +d \
-// RUN:  -target-feature +zve64d -mvscale-min=4 -mvscale-max=4 \
-// RUN:  | FileCheck %s --check-prefix=CHECK-256
+// RUN:  -target-feature +f -target-feature +d -target-feature +zfh \
+// RUN:  -target-feature +zve64d -target-feature +zvfh -mvscale-min=4 \
+// RUN:  -mvscale-max=4 | FileCheck %s --check-prefix=CHECK-256
 // RUN: %clang_cc1 -triple riscv64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:  -target-feature +f -target-feature +d \
-// RUN:  -target-feature +zve64d -mvscale-min=8 -mvscale-max=8 \
-// RUN:  | FileCheck %s --check-prefix=CHECK-512
+// RUN:  -target-feature +f -target-feature +d -target-feature +zfh \
+// RUN:  -target-feature +zve64d -target-feature +zvfh -mvscale-min=8 \
+// RUN:  -mvscale-max=8 | FileCheck %s --check-prefix=CHECK-512
 // RUN: %clang_cc1 -triple riscv64-none-linux-gnu %s -emit-llvm -o - \
-// RUN:  -target-feature +f -target-feature +d \
-// RUN:  -target-feature +zve64d -mvscale-min=16 -mvscale-max=16 \
-// RUN:  | FileCheck %s --check-prefix=CHECK-1024
+// RUN:  -target-feature +f -target-feature +d -target-feature +zfh \
+// RUN:  -target-feature +zve64d -target-feature +zvfh -mvscale-min=16 \
+// RUN:  -mvscale-max=16 | FileCheck %s --check-prefix=CHECK-1024
 
 typedef __rvv_int8mf8_t vint8mf8_t;
 typedef __rvv_uint8mf8_t vuint8mf8_t;
@@ -26,6 +26,7 @@ typedef __rvv_int8mf4_t vint8mf4_t;
 typedef __rvv_uint8mf4_t vuint8mf4_t;
 typedef __rvv_int16mf4_t vint16mf4_t;
 typedef __rvv_uint16mf4_t vuint16mf4_t;
+typedef __rvv_float16mf4_t vfloat16mf4_t;
 
 typedef __rvv_int8mf2_t vint8mf2_t;
 typedef __rvv_uint8mf2_t vuint8mf2_t;
@@ -33,6 +34,7 @@ typedef __rvv_int16mf2_t vint16mf2_t;
 typedef __rvv_uint16mf2_t vuint16mf2_t;
 typedef __rvv_int32mf2_t vint32mf2_t;
 typedef __rvv_uint32mf2_t vuint32mf2_t;
+typedef __rvv_float16mf2_t vfloat16mf2_t;
 typedef __rvv_float32mf2_t vfloat32mf2_t;
 
 typedef __rvv_int8m1_t vint8m1_t;
@@ -43,6 +45,7 @@ typedef __rvv_int32m1_t vint32m1_t;
 typedef __rvv_uint32m1_t vuint32m1_t;
 typedef __rvv_int64m1_t vint64m1_t;
 typedef __rvv_uint64m1_t vuint64m1_t;
+typedef __rvv_float16m1_t vfloat16m1_t;
 typedef __rvv_float32m1_t vfloat32m1_t;
 typedef __rvv_float64m1_t vfloat64m1_t;
 
@@ -54,6 +57,7 @@ typedef __rvv_int32m2_t vint32m2_t;
 typedef __rvv_uint32m2_t vuint32m2_t;
 typedef __rvv_int64m2_t vint64m2_t;
 typedef __rvv_uint64m2_t vuint64m2_t;
+typedef __rvv_float16m2_t vfloat16m2_t;
 typedef __rvv_float32m2_t vfloat32m2_t;
 typedef __rvv_float64m2_t vfloat64m2_t;
 
@@ -65,6 +69,7 @@ typedef __rvv_int32m4_t vint32m4_t;
 typedef __r

[clang] [CUDA][HIP] allow trivial ctor/dtor in device var init (PR #73140)

2023-11-22 Thread via cfe-commits


alexfh wrote:

Thank you!

https://github.com/llvm/llvm-project/pull/73140

[clang] [clang][analyzer] Support `fgetc` in StreamChecker (PR #72627)

2023-11-22 Thread Ben Shi via cfe-commits


https://github.com/benshi001 updated 
https://github.com/llvm/llvm-project/pull/72627

>From 3032cafc2ad43baeeea14de318cd82026b96d035 Mon Sep 17 00:00:00 2001
From: Ben Shi 
Date: Fri, 17 Nov 2023 17:22:10 +0800
Subject: [PATCH] [clang][analyzer] Support `fgetc` in StreamChecker

---
 .../StaticAnalyzer/Checkers/StreamChecker.cpp | 79 ++-
 .../Analysis/Inputs/system-header-simulator.h |  1 +
 clang/test/Analysis/stream-error.c| 71 -
 clang/test/Analysis/stream.c  |  6 ++
 4 files changed, 138 insertions(+), 19 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
index 1d53e59ca067c27..d6651d6f0cd3a31 100644
--- a/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
@@ -250,9 +250,12 @@ class StreamChecker : public Checker PutVal = Call.getArgSVal(0).getAs();
-  if (!PutVal)
-return;
-  ProgramStateRef StateNotFailed =
-  State->BindExpr(CE, C.getLocationContext(), *PutVal);
-  StateNotFailed =
-  StateNotFailed->set(StreamSym, StreamState::getOpened(Desc));
-  C.addTransition(StateNotFailed);
+  // Generate a transition for the success state of fputc.
+  if (!IsRead) {
+std::optional PutVal = Call.getArgSVal(0).getAs();
+if (!PutVal)
+  return;
+ProgramStateRef StateNotFailed =
+State->BindExpr(CE, C.getLocationContext(), *PutVal);
+StateNotFailed =
+StateNotFailed->set(StreamSym, 
StreamState::getOpened(Desc));
+C.addTransition(StateNotFailed);
+  }
+  // Generate a transition for the success state of fgetc.
+  // If we know the state to be FEOF at fgetc, do not add a success state.
+  else if (OldSS->ErrorState != ErrorFEof) {
+NonLoc RetVal = makeRetVal(C, CE).castAs<NonLoc>();
+ProgramStateRef StateNotFailed =
+State->BindExpr(CE, C.getLocationContext(), RetVal);
+SValBuilder &SVB = C.getSValBuilder();
+// The returned 'unsigned char' of `fgetc` is converted to 'int',
+// so we need to check if it is in range [0, 255].
+auto CondLow = SVB.evalBinOp(State, BO_GE, RetVal,
+ SVB.makeZeroVal(C.getASTContext().IntTy),
+ SVB.getConditionType())
+   .getAs();
+auto CondHigh = SVB.evalBinOp(State, BO_LE, RetVal,
+  SVB.makeIntVal(255, C.getASTContext().IntTy),
+  SVB.getConditionType())
+.getAs();
+if (!CondLow || !CondHigh)
+  return;
+StateNotFailed = StateNotFailed->assume(*CondLow, true);
+if (!StateNotFailed)
+  return;
+StateNotFailed = StateNotFailed->assume(*CondHigh, true);
+if (!StateNotFailed)
+  return;
+C.addTransition(StateNotFailed);
+  }
 
   // Add transition for the failed state.
+  ProgramStateRef StateFailed = bindInt(*EofVal, State, C, CE);
+
   // If a (non-EOF) error occurs, the resulting value of the file position
   // indicator for the stream is indeterminate.
-  ProgramStateRef StateFailed = bindInt(*EofVal, State, C, CE);
-  StreamState NewSS = StreamState::getOpened(
-  Desc, ErrorFError, /*IsFilePositionIndeterminate*/ true);
+  StreamErrorState NewES;
+  if (IsRead)
+NewES =
+OldSS->ErrorState == ErrorFEof ? ErrorFEof : ErrorFEof | ErrorFError;
+  else
+NewES = ErrorFError;
+  StreamState NewSS = StreamState::getOpened(Desc, NewES, !NewES.isFEof());
   StateFailed = StateFailed->set(StreamSym, NewSS);
-  C.addTransition(StateFailed);
+  if (IsRead && OldSS->ErrorState != ErrorFEof)
+C.addTransition(StateFailed, constructSetEofNoteTag(C, StreamSym));
+  else
+C.addTransition(StateFailed);
 }
 
 void StreamChecker::preFseek(const FnDescription *Desc, const CallEvent &Call,
diff --git a/clang/test/Analysis/Inputs/system-header-simulator.h 
b/clang/test/Analysis/Inputs/system-header-simulator.h
index 8924103f5046ea2..fc57e8bdc3d30c3 100644
--- a/clang/test/Analysis/Inputs/system-header-simulator.h
+++ b/clang/test/Analysis/Inputs/system-header-simulator.h
@@ -48,6 +48,7 @@ FILE *freopen(const char *restrict pathname, const char 
*restrict mode, FILE *re
 int fclose(FILE *fp);
 size_t fread(void *restrict, size_t, size_t, FILE *restrict);
 size_t fwrite(const void *restrict, size_t, size_t, FILE *restrict);
+int fgetc(FILE *stream);
 int fputc(int ch, FILE *stream);
 int fseek(FILE *__stream, long int __off, int __whence);
 long int ftell(FILE *__stream);
diff --git a/clang/test/Analysis/stream-error.c 
b/clang/test/Analysis/stream-error.c
index 5ebdc32bb1b92ff..8bdd483da7c7c43 100644
--- a/clang/test/Analysis/stream-error.c
+++ b/clang/test/Analysis/stream-error.c
@@ -101,6 +101,30 @@ void error_fwrite(void) {
   Ret = fwrite(0, 1, 10, F); // expected-warning {{Stream might be already 
closed}}
 }
 
+void error_fgetc(void) {
+  FILE *F = tmpfile();
+  if (!F)
+return;
+  int

[clang] [CUDA][HIP] allow trivial ctor/dtor in device var init (PR #73140)

2023-11-22 Thread Yaxun Liu via cfe-commits


yxsamliu wrote:

> Could you first land the two reverts 
> ([511cecf](https://github.com/llvm/llvm-project/commit/511cecff7f76958ebfe713189bc106615763b64a)
>  and 
> [e9a8e90](https://github.com/llvm/llvm-project/commit/e9a8e906d4c14eb4b317a7420b9bba3dc7321ba2))
>  and then have the third commit properly reviewed? @Artem-B may be 
> unavailable for a few more days, but we'd like a fix/revert to land very soon.

reverted

https://github.com/llvm/llvm-project/pull/73140

[clang] [CUDA][HIP] allow trivial ctor/dtor in device var init (PR #73140)

2023-11-22 Thread Yaxun Liu via cfe-commits


https://github.com/yxsamliu updated 
https://github.com/llvm/llvm-project/pull/73140

>From 2dc8bda89483ee655e7a76deac19b8ea9e463c7b Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu" 
Date: Wed, 22 Nov 2023 10:02:59 -0500
Subject: [PATCH] [CUDA][HIP] allow trivial ctor/dtor in device var init

Treat ctors/dtors in device variable initializers as host device
functions so that they can be used to initialize file-scope device
variables, matching nvcc behavior. If they are non-trivial, they
will be diagnosed.

We cannot add implicit host device attrs to non-trivial
ctors/dtors, since determining whether a ctor/dtor is non-trivial
requires knowing whether it has a trivial body and whether the
ctors/dtors of all its members and base classes have trivial bodies,
which is affected by where those bodies are defined or instantiated.

Fixes: #72261

Fixes: SWDEV-432412
---
 clang/lib/Sema/SemaCUDA.cpp  |  9 
 clang/test/SemaCUDA/trivial-ctor-dtor.cu | 57 
 2 files changed, 66 insertions(+)
 create mode 100644 clang/test/SemaCUDA/trivial-ctor-dtor.cu

diff --git a/clang/lib/Sema/SemaCUDA.cpp b/clang/lib/Sema/SemaCUDA.cpp
index 318174f7be8fa95..6a66ecf6f94c178 100644
--- a/clang/lib/Sema/SemaCUDA.cpp
+++ b/clang/lib/Sema/SemaCUDA.cpp
@@ -225,6 +225,15 @@ Sema::CUDAFunctionPreference
 Sema::IdentifyCUDAPreference(const FunctionDecl *Caller,
  const FunctionDecl *Callee) {
   assert(Callee && "Callee must be valid.");
+
+  // Treat ctor/dtor as host device function in device var initializer to allow
+  // trivial ctor/dtor without device attr to be used. Non-trivial ctor/dtor
+  // will be diagnosed by checkAllowedCUDAInitializer.
+  if (Caller == nullptr && CurCUDATargetCtx.Kind == CTCK_InitGlobalVar &&
+  CurCUDATargetCtx.Target == CFT_Device &&
+  (isa<CXXConstructorDecl>(Callee) || isa<CXXDestructorDecl>(Callee)))
+return CFP_HostDevice;
+
   CUDAFunctionTarget CallerTarget = IdentifyCUDATarget(Caller);
   CUDAFunctionTarget CalleeTarget = IdentifyCUDATarget(Callee);
 
diff --git a/clang/test/SemaCUDA/trivial-ctor-dtor.cu 
b/clang/test/SemaCUDA/trivial-ctor-dtor.cu
new file mode 100644
index 000..34142bcc621200f
--- /dev/null
+++ b/clang/test/SemaCUDA/trivial-ctor-dtor.cu
@@ -0,0 +1,57 @@
+// RUN: %clang_cc1 -isystem %S/Inputs  -fsyntax-only -verify %s
+// RUN: %clang_cc1 -isystem %S/Inputs -fcuda-is-device -fsyntax-only -verify %s
+
+#include 
+
+// Check trivial ctor/dtor
+struct A {
+  int x;
+  A() {}
+  ~A() {}
+};
+
+__device__ A a;
+
+// Check trivial ctor/dtor of template class
+template
+struct TA {
+  T x;
+  TA() {}
+  ~TA() {}
+};
+
+__device__ TA ta;
+
+// Check non-trivial ctor/dtor in parent template class
+template
+struct TB {
+  T x;
+  TB() { static int nontrivial_ctor = 1; }
+  ~TB() {}
+};
+
+template
+struct TC : TB {
+  T x;
+  TC() {}
+  ~TC() {}
+};
+
+template class TC;
+
+__device__ TC tc; //expected-error {{dynamic initialization is not 
supported for __device__, __constant__, __shared__, and __managed__ variables}}
+
+// Check trivial ctor specialization
+template 
+struct C {
+explicit C() {};
+};
+
+template <> C::C() {};
+__device__ C ci_d;
+C ci_h;
+
+// Check non-trivial ctor specialization
+template <> C::C() { static int nontrivial_ctor = 1; }
+__device__ C cf_d; //expected-error {{dynamic initialization is not 
supported for __device__, __constant__, __shared__, and __managed__ variables}}
+C cf_h;
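
As a rough illustration of the early return this patch adds to
`Sema::IdentifyCUDAPreference`, here is a standalone C++ model of the rule.
It does not use Clang: `Target`, `ContextKind`, `Preference`, `Callee`,
`TargetContext` and `identifyPreference` are all hypothetical stand-ins, and
the fallback branch is a deliberately simplified placeholder, not Clang's
real preference table.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; none of these are Clang types.
enum class Target { Host, Device, HostDevice };
enum class ContextKind { Unknown, InitGlobalVar };
enum class Preference { Never, Native, HostDevice };

struct Callee {
  bool IsCtorOrDtor;     // is the callee a constructor or destructor?
  Target DeclaredTarget; // target implied by its (missing) attributes
};

struct TargetContext {
  ContextKind Kind; // what is currently being processed
  Target VarTarget; // target of the variable being initialized
};

// HasCaller == false models `Caller == nullptr` in the patch.
Preference identifyPreference(bool HasCaller, Target CallerTarget,
                              TargetContext Ctx, Callee C) {
  // The rule from the patch: a ctor/dtor used in the initializer of a
  // device variable is treated as if it were a host device function.
  if (!HasCaller && Ctx.Kind == ContextKind::InitGlobalVar &&
      Ctx.VarTarget == Target::Device && C.IsCtorOrDtor)
    return Preference::HostDevice;

  // Simplified placeholder (an assumption of this sketch) for the
  // pre-existing caller/callee target comparison.
  if (C.DeclaredTarget == Target::HostDevice)
    return Preference::HostDevice;
  const Target Effective = HasCaller ? CallerTarget : Ctx.VarTarget;
  return C.DeclaredTarget == Effective ? Preference::Native
                                       : Preference::Never;
}
```

In this model an unattributed (host) constructor used to initialize a
`__device__` variable gets `Preference::HostDevice`, while an ordinary host
function in the same context still gets `Preference::Never`. Whether the
constructor is actually trivial is checked elsewhere (the patch comment names
`checkAllowedCUDAInitializer`), which the sketch does not model.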


[clang] [LinkerWrapper] Accept some needed lld-link linker arguments for COFF targets (PR #72889)

2023-11-22 Thread Joseph Huber via cfe-commits


https://github.com/jhuber6 closed 
https://github.com/llvm/llvm-project/pull/72889

[clang] b16f765 - [LinkerWrapper] Accept some needed lld-link linker arguments for COFF targets (#72889)

2023-11-22 Thread via cfe-commits


Author: Joseph Huber
Date: 2023-11-22T20:23:23-06:00
New Revision: b16f765d6fec56a07aecd2056bb1760a9e72d64f

URL: 
https://github.com/llvm/llvm-project/commit/b16f765d6fec56a07aecd2056bb1760a9e72d64f
DIFF: 
https://github.com/llvm/llvm-project/commit/b16f765d6fec56a07aecd2056bb1760a9e72d64f.diff

LOG: [LinkerWrapper] Accept some needed lld-link linker arguments for COFF 
targets (#72889)

Summary:
The linker wrapper is a utility used to create offloading programs from
single-source offloading languages such as OpenMP or CUDA. This is done
by embedding device code into the host object, then feeding it into the
linker wrapper which extracts the accelerator object files, links them,
then wraps them in registration code for the target runtime. This
previously has only worked on Linux / ELF platforms.

This patch attempts to handle Windows / COFF inputs by also accepting COFF
forms of certain linker arguments we use internally. The important
arguments are library search paths, so we can identify libraries which
may contain device code, libraries themselves, and the output name used
for intermediate output.

I am not intimately familiar with the semantics of how a `lib` file is
searched. I am simply treating `foo.lib` as the
GNU equivalent `-l:foo.lib` in the search logic. Similarly, I am
assuming that static libraries will be llvm-ar style libraries. I will
need to investigate the actual deficiencies later, but this should be a
good starting point along with
https://github.com/llvm/llvm-project/pull/72697
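
The lookup rule described above (treating `foo.lib` like the GNU form
`-l:foo.lib`) can be sketched in standalone C++. This is only a model under
stated assumptions, not the linker wrapper's code: `ExistsFn`, `endsWith`
and the simplified `findFromSearchPaths` / `searchLibrary` signatures are
hypothetical, the file system is replaced by a caller-supplied predicate, and
the `lib<name>.a` fallback is a simplification of the base-name search.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical predicate so the sketch never touches the real file system.
using ExistsFn = std::function<bool(const std::string &)>;

static bool endsWith(const std::string &S, const std::string &Suffix) {
  return S.size() >= Suffix.size() &&
         S.compare(S.size() - Suffix.size(), Suffix.size(), Suffix) == 0;
}

// Look for an exact file name in each search directory, in order.
static std::optional<std::string>
findFromSearchPaths(const std::string &File,
                    const std::vector<std::string> &SearchPaths,
                    const ExistsFn &Exists) {
  for (const std::string &Dir : SearchPaths) {
    std::string Candidate = Dir + "/" + File;
    if (Exists(Candidate))
      return Candidate;
  }
  return std::nullopt;
}

std::optional<std::string>
searchLibrary(const std::string &Input,
              const std::vector<std::string> &SearchPaths,
              const ExistsFn &Exists) {
  // GNU `-l:name`: strip the colon and look for `name` verbatim.
  if (!Input.empty() && Input.front() == ':')
    return findFromSearchPaths(Input.substr(1), SearchPaths, Exists);
  // COFF `name.lib`: handled like `-l:name.lib`, i.e. looked up verbatim.
  if (endsWith(Input, ".lib"))
    return findFromSearchPaths(Input, SearchPaths, Exists);
  // GNU `-lname`: assumed here to mean `lib<name>.a` only.
  return findFromSearchPaths("lib" + Input + ".a", SearchPaths, Exists);
}
```

With search paths `.` and `/opt/lib`, and a predicate that reports only
`./foo.lib` and `/opt/lib/libbar.a` as present, both `foo.lib` and `:foo.lib`
resolve to `./foo.lib`, and `bar` resolves to `/opt/lib/libbar.a`.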

Added: 


Modified: 
clang/test/Driver/linker-wrapper.c
clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td

Removed: 




diff  --git a/clang/test/Driver/linker-wrapper.c 
b/clang/test/Driver/linker-wrapper.c
index da7bdc22153ceae..e82febd61823102 100644
--- a/clang/test/Driver/linker-wrapper.c
+++ b/clang/test/Driver/linker-wrapper.c
@@ -140,3 +140,11 @@
 // RUN:   --linker-path=/usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s 
--check-prefix=CLANG-BACKEND
 
 // CLANG-BACKEND: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa 
-mcpu=gfx908 -O2 -Wl,--no-undefined {{.*}}.bc
+
+// RUN: clang-offload-packager -o %t.out \
+// RUN:   
--image=file=%t.elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70
+// RUN: %clang -cc1 %s -triple x86_64-unknown-windows-msvc -emit-obj -o %t.o 
-fembed-offload-object=%t.out
+// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-windows-msvc 
--dry-run \
+// RUN:   --linker-path=/usr/bin/lld-link -- %t.o -libpath:./ -out:a.exe 2>&1 
| FileCheck %s --check-prefix=COFF
+
+// COFF: "/usr/bin/lld-link" {{.*}}.o -libpath:./ -out:a.exe 
{{.*}}openmp.image.wrapper{{.*}}

diff  --git a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp 
b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
index bafe8ace60d1cea..db0ce3e2a190192 100644
--- a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -254,7 +254,7 @@ Error runLinker(ArrayRef Files, const ArgList 
&Args) {
   continue;
 
 Arg->render(Args, NewLinkerArgs);
-if (Arg->getOption().matches(OPT_o))
+if (Arg->getOption().matches(OPT_o) || Arg->getOption().matches(OPT_out))
   llvm::transform(Files, std::back_inserter(NewLinkerArgs),
   [&](StringRef Arg) { return Args.MakeArgString(Arg); });
   }
@@ -1188,7 +1188,7 @@ searchLibraryBaseName(StringRef Name, StringRef Root,
 /// `-lfoo` or `-l:libfoo.a`.
 std::optional searchLibrary(StringRef Input, StringRef Root,
  ArrayRef SearchPaths) {
-  if (Input.startswith(":"))
+  if (Input.startswith(":") || Input.ends_with(".lib"))
 return findFromSearchPaths(Input.drop_front(), Root, SearchPaths);
   return searchLibraryBaseName(Input, Root, SearchPaths);
 }
@@ -1339,7 +1339,7 @@ Expected> getDeviceInput(const 
ArgList &Args) {
 
   StringRef Root = Args.getLastArgValue(OPT_sysroot_EQ);
   SmallVector LibraryPaths;
-  for (const opt::Arg *Arg : Args.filtered(OPT_library_path))
+  for (const opt::Arg *Arg : Args.filtered(OPT_library_path, OPT_libpath))
 LibraryPaths.push_back(Arg->getValue());
 
   BumpPtrAllocator Alloc;
@@ -1348,7 +1348,7 @@ Expected> getDeviceInput(const 
ArgList &Args) {
   // Try to extract device code from the linker input files.
   SmallVector InputFiles;
   DenseMap> Syms;
-  bool WholeArchive = false;
+  bool WholeArchive = Args.hasArg(OPT_wholearchive_flag) ? true : false;
   for (const opt::Arg *Arg : Args.filtered(
OPT_INPUT, OPT_library, OPT_whole_archive, OPT_no_whole_archive)) {
 if (Arg->getOption().matches(OPT_whole_archive) ||
@@ -1474,9 +1474,17 @@ int main(int Argc, char **Argv) {
   Verbose = Args.hasArg(OPT_verbose);
   DryRun = Args.hasArg(OPT_dry_run);
   SaveTemps = Args.hasArg(OPT_save_te

[libunwind] [libunwind][WebAssembly] Don't build libunwind.cpp (PR #73196)

2023-11-22 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-libunwind

Author: Heejin Ahn (aheejin)


Changes

Wasm doesn't use that file: Wasm does not allow access to system registers, and 
that functionality is provided by the VM. Wasm only uses
https://github.com/llvm/llvm-project/blob/main/libunwind/src/Unwind-wasm.c, 
which implements a few interface functions.

---
Full diff: https://github.com/llvm/llvm-project/pull/73196.diff


1 Files Affected:

- (modified) libunwind/src/libunwind.cpp (+3-2) 


```diff
diff --git a/libunwind/src/libunwind.cpp b/libunwind/src/libunwind.cpp
index 1bd18659b7860c0..cd610377b63de8d 100644
--- a/libunwind/src/libunwind.cpp
+++ b/libunwind/src/libunwind.cpp
@@ -26,7 +26,7 @@
 #include 
 #endif
 
-#if !defined(__USING_SJLJ_EXCEPTIONS__)
+#if !defined(__USING_SJLJ_EXCEPTIONS__) || !defined(__USING_WASM_EXCEPTIONS__)
 #include "AddressSpace.hpp"
 #include "UnwindCursor.hpp"
 
@@ -347,7 +347,8 @@ void __unw_remove_dynamic_eh_frame_section(unw_word_t eh_frame_start) {
 }
 
 #endif // defined(_LIBUNWIND_SUPPORT_DWARF_UNWIND)
-#endif // !defined(__USING_SJLJ_EXCEPTIONS__)
+#endif // !defined(__USING_SJLJ_EXCEPTIONS__) ||
+   // !defined(__USING_WASM_EXCEPTIONS__)
 
 #ifdef __APPLE__
 

```




https://github.com/llvm/llvm-project/pull/73196

[libunwind] [libunwind][WebAssembly] Don't build libunwind.cpp (PR #73196)

2023-11-22 Thread Heejin Ahn via cfe-commits


https://github.com/aheejin created 
https://github.com/llvm/llvm-project/pull/73196

Wasm doesn't use that file: Wasm does not allow access to system registers, and 
that functionality is provided by the VM. Wasm only uses
https://github.com/llvm/llvm-project/blob/main/libunwind/src/Unwind-wasm.c, 
which implements a few interface functions.
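
Since the goal is to leave a block out of Wasm builds, it may help to see how
a two-macro guard evaluates. The sketch below is only a model: the `DEMO_*`
macros and the `CompiledWith*` constants are hypothetical names, and it
pretends to be a Wasm-only build by defining just one of the two macros. It
compares a guard joined with `||` against one joined with `&&`.

```cpp
#include <cassert>

// Pretend this is a Wasm build: only the Wasm macro is defined.
// Both macro names are hypothetical stand-ins for this sketch.
#define DEMO_USING_WASM_EXCEPTIONS 1

// Guard joined with `||`: true unless BOTH macros are defined.
#if !defined(DEMO_USING_SJLJ_EXCEPTIONS) || !defined(DEMO_USING_WASM_EXCEPTIONS)
constexpr bool CompiledWithOr = true;
#else
constexpr bool CompiledWithOr = false;
#endif

// Guard joined with `&&`: true only if NEITHER macro is defined.
#if !defined(DEMO_USING_SJLJ_EXCEPTIONS) && !defined(DEMO_USING_WASM_EXCEPTIONS)
constexpr bool CompiledWithAnd = true;
#else
constexpr bool CompiledWithAnd = false;
#endif

static_assert(CompiledWithOr, "the || guard keeps the block in this build");
static_assert(!CompiledWithAnd, "the && guard drops the block in this build");
```

By De Morgan's law, "skip the block if either macro is defined" corresponds
to the `&&` form. With only the Wasm macro defined, the `||` form still
compiles the guarded block.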

>From f26f6ec29e6ce735696b1aeb6b73e7221b008316 Mon Sep 17 00:00:00 2001
From: Heejin Ahn 
Date: Wed, 22 Nov 2023 18:18:19 -0800
Subject: [PATCH] [libunwind][WebAssembly] Don't build libunwind.cpp

Wasm doesn't use that file: Wasm does not allow access to system
registers, and that functionality is provided by the VM. Wasm only
uses
https://github.com/llvm/llvm-project/blob/main/libunwind/src/Unwind-wasm.c,
which implements a few interface functions.
---
 libunwind/src/libunwind.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/libunwind/src/libunwind.cpp b/libunwind/src/libunwind.cpp
index 1bd18659b7860c0..cd610377b63de8d 100644
--- a/libunwind/src/libunwind.cpp
+++ b/libunwind/src/libunwind.cpp
@@ -26,7 +26,7 @@
 #include 
 #endif
 
-#if !defined(__USING_SJLJ_EXCEPTIONS__)
+#if !defined(__USING_SJLJ_EXCEPTIONS__) || !defined(__USING_WASM_EXCEPTIONS__)
 #include "AddressSpace.hpp"
 #include "UnwindCursor.hpp"
 
@@ -347,7 +347,8 @@ void __unw_remove_dynamic_eh_frame_section(unw_word_t 
eh_frame_start) {
 }
 
 #endif // defined(_LIBUNWIND_SUPPORT_DWARF_UNWIND)
-#endif // !defined(__USING_SJLJ_EXCEPTIONS__)
+#endif // !defined(__USING_SJLJ_EXCEPTIONS__) ||
+   // !defined(__USING_WASM_EXCEPTIONS__)
 
 #ifdef __APPLE__
 


[clang] 6b3470b - Revert "[CUDA][HIP] make trivial ctor/dtor host device (#72394)"

2023-11-22 Thread Yaxun Liu via cfe-commits


Author: Yaxun (Sam) Liu
Date: 2023-11-22T21:20:53-05:00
New Revision: 6b3470b4b83195aeeda60b101e8d3bf8800c321c

URL: 
https://github.com/llvm/llvm-project/commit/6b3470b4b83195aeeda60b101e8d3bf8800c321c
DIFF: 
https://github.com/llvm/llvm-project/commit/6b3470b4b83195aeeda60b101e8d3bf8800c321c.diff

LOG: Revert "[CUDA][HIP] make trivial ctor/dtor host device (#72394)"

This reverts commit 27e6e4a4d0e3296cebad8db577ec0469a286795e.

This patch is reverted due to a regression. A test case is:

`template 
struct ptr {
~ptr() { static int x = 1;}
};

template 
struct Abc : ptr {
 public:
  Abc();
  ~Abc() {}
};

template
class Abc;
`

Added: 


Modified: 
clang/include/clang/Sema/Sema.h
clang/lib/Sema/SemaCUDA.cpp
clang/lib/Sema/SemaDecl.cpp
clang/test/SemaCUDA/call-host-fn-from-device.cu
clang/test/SemaCUDA/default-ctor.cu
clang/test/SemaCUDA/implicit-member-target-collision-cxx11.cu
clang/test/SemaCUDA/implicit-member-target-collision.cu
clang/test/SemaCUDA/implicit-member-target-inherited.cu
clang/test/SemaCUDA/implicit-member-target.cu

Removed: 
clang/test/SemaCUDA/trivial-ctor-dtor.cu



diff  --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 59806bcbcbb2dbc..e8914f5fcddf19e 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -13466,10 +13466,6 @@ class Sema final {
   void maybeAddCUDAHostDeviceAttrs(FunctionDecl *FD,
const LookupResult &Previous);
 
-  /// May add implicit CUDAHostAttr and CUDADeviceAttr attributes to a
-  /// trivial cotr/dtor that does not have host and device attributes.
-  void maybeAddCUDAHostDeviceAttrsToTrivialCtorDtor(FunctionDecl *FD);
-
   /// May add implicit CUDAConstantAttr attribute to VD, depending on VD
   /// and current compilation settings.
   void MaybeAddCUDAConstantAttr(VarDecl *VD);

diff  --git a/clang/lib/Sema/SemaCUDA.cpp b/clang/lib/Sema/SemaCUDA.cpp
index b94f448dabe7517..318174f7be8fa95 100644
--- a/clang/lib/Sema/SemaCUDA.cpp
+++ b/clang/lib/Sema/SemaCUDA.cpp
@@ -772,22 +772,6 @@ void Sema::maybeAddCUDAHostDeviceAttrs(FunctionDecl *NewD,
   NewD->addAttr(CUDADeviceAttr::CreateImplicit(Context));
 }
 
-// If a trivial ctor/dtor has no host/device
-// attributes, make it implicitly host device function.
-void Sema::maybeAddCUDAHostDeviceAttrsToTrivialCtorDtor(FunctionDecl *FD) {
-  bool IsTrivialCtor = false;
-  if (auto *CD = dyn_cast(FD))
-IsTrivialCtor = isEmptyCudaConstructor(SourceLocation(), CD);
-  bool IsTrivialDtor = false;
-  if (auto *DD = dyn_cast(FD))
-IsTrivialDtor = isEmptyCudaDestructor(SourceLocation(), DD);
-  if ((IsTrivialCtor || IsTrivialDtor) && !FD->hasAttr() &&
-  !FD->hasAttr()) {
-FD->addAttr(CUDAHostAttr::CreateImplicit(Context));
-FD->addAttr(CUDADeviceAttr::CreateImplicit(Context));
-  }
-}
-
 // TODO: `__constant__` memory may be a limited resource for certain targets.
 // A safeguard may be needed at the end of compilation pipeline if
 // `__constant__` memory usage goes beyond limit.

diff  --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 4e1857b931cc868..23dd8ae15c16583 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -16255,9 +16255,6 @@ Decl *Sema::ActOnFinishFunctionBody(Decl *dcl, Stmt 
*Body,
   if (FD && !FD->isDeleted())
 checkTypeSupport(FD->getType(), FD->getLocation(), FD);
 
-  if (LangOpts.CUDA)
-maybeAddCUDAHostDeviceAttrsToTrivialCtorDtor(FD);
-
   return dcl;
 }
 

diff  --git a/clang/test/SemaCUDA/call-host-fn-from-device.cu 
b/clang/test/SemaCUDA/call-host-fn-from-device.cu
index b62de92db02d6de..acdd291b664579b 100644
--- a/clang/test/SemaCUDA/call-host-fn-from-device.cu
+++ b/clang/test/SemaCUDA/call-host-fn-from-device.cu
@@ -12,7 +12,7 @@ extern "C" void host_fn() {}
 struct Dummy {};
 
 struct S {
-  S() { static int nontrivial_ctor = 1; }
+  S() {}
   // expected-note@-1 2 {{'S' declared here}}
   ~S() { host_fn(); }
   // expected-note@-1 {{'~S' declared here}}

diff  --git a/clang/test/SemaCUDA/default-ctor.cu 
b/clang/test/SemaCUDA/default-ctor.cu
index 31971fe6b3863c7..cbad7a1774c1501 100644
--- a/clang/test/SemaCUDA/default-ctor.cu
+++ b/clang/test/SemaCUDA/default-ctor.cu
@@ -25,7 +25,7 @@ __device__ void fd() {
   InD ind;
   InH inh; // expected-error{{no matching constructor for initialization of 
'InH'}}
   InHD inhd;
-  Out out;
+  Out out; // expected-error{{no matching constructor for initialization of 
'Out'}}
   OutD outd;
   OutH outh; // expected-error{{no matching constructor for initialization of 
'OutH'}}
   OutHD outhd;

diff  --git a/clang/test/SemaCUDA/implicit-member-target-collision-cxx11.cu 
b/clang/test/SemaCUDA/implicit-member-target-collision-cxx11.cu
index edb543f637ccc18..06015ed0d6d8edc 100644
--- a/clang/test/SemaCUDA/implicit-member-target-collision-cxx11.cu
+++ b/clang/

[clang] 22078bd - Revert "[CUDA][HIP] ignore implicit host/device attr for override (#72815)"

2023-11-22 Thread Yaxun Liu via cfe-commits


Author: Yaxun (Sam) Liu
Date: 2023-11-22T21:04:55-05:00
New Revision: 22078bd9f6842411aac2b75196975d68a817a358

URL: 
https://github.com/llvm/llvm-project/commit/22078bd9f6842411aac2b75196975d68a817a358
DIFF: 
https://github.com/llvm/llvm-project/commit/22078bd9f6842411aac2b75196975d68a817a358.diff

LOG: Revert "[CUDA][HIP] ignore implicit host/device attr for override (#72815)"

This reverts commit a1e2c6566305061c115954b048f2957c8d55cb5b.

Revert this patch due to a regression. A test case is:

`template 
class C {
explicit C() {};
};

template <> C::C() {};
`

Added: 


Modified: 
clang/lib/Sema/SemaOverload.cpp
clang/test/SemaCUDA/implicit-member-target-inherited.cu
clang/test/SemaCUDA/trivial-ctor-dtor.cu

Removed: 




diff  --git a/clang/lib/Sema/SemaOverload.cpp b/clang/lib/Sema/SemaOverload.cpp
index 64607e28b8b35e6..9800d7f1c9cfee9 100644
--- a/clang/lib/Sema/SemaOverload.cpp
+++ b/clang/lib/Sema/SemaOverload.cpp
@@ -1491,10 +1491,8 @@ static bool IsOverloadOrOverrideImpl(Sema &SemaRef, 
FunctionDecl *New,
 // Don't allow overloading of destructors.  (In theory we could, but it
 // would be a giant change to clang.)
 if (!isa(New)) {
-  Sema::CUDAFunctionTarget NewTarget = SemaRef.IdentifyCUDATarget(
-   New, isa(New)),
-   OldTarget = SemaRef.IdentifyCUDATarget(
-   Old, isa(New));
+  Sema::CUDAFunctionTarget NewTarget = SemaRef.IdentifyCUDATarget(New),
+   OldTarget = SemaRef.IdentifyCUDATarget(Old);
   if (NewTarget != Sema::CFT_InvalidTarget) {
 assert((OldTarget != Sema::CFT_InvalidTarget) &&
"Unexpected invalid target.");

diff  --git a/clang/test/SemaCUDA/implicit-member-target-inherited.cu 
b/clang/test/SemaCUDA/implicit-member-target-inherited.cu
index ceca0891fc9b03c..781199bba6b5a11 100644
--- a/clang/test/SemaCUDA/implicit-member-target-inherited.cu
+++ b/clang/test/SemaCUDA/implicit-member-target-inherited.cu
@@ -39,7 +39,6 @@ struct A2_with_device_ctor {
 };
 // expected-note@-3 {{candidate constructor (the implicit copy constructor) 
not viable}}
 // expected-note@-4 {{candidate constructor (the implicit move constructor) 
not viable}}
-// expected-note@-4 {{candidate inherited constructor not viable: call to 
__device__ function from __host__ function}}
 
 struct B2_with_implicit_default_ctor : A2_with_device_ctor {
   using A2_with_device_ctor::A2_with_device_ctor;

diff  --git a/clang/test/SemaCUDA/trivial-ctor-dtor.cu 
b/clang/test/SemaCUDA/trivial-ctor-dtor.cu
index 21d698d28492ac3..1df8adc62bab590 100644
--- a/clang/test/SemaCUDA/trivial-ctor-dtor.cu
+++ b/clang/test/SemaCUDA/trivial-ctor-dtor.cu
@@ -38,19 +38,3 @@ struct TC : TB {
 };
 
 __device__ TC tc; //expected-error {{dynamic initialization is not 
supported for __device__, __constant__, __shared__, and __managed__ variables}}
-
-// Check trivial ctor specialization
-template 
-struct C { //expected-note {{candidate constructor (the implicit copy 
constructor) not viable}}
-   //expected-note@-1 {{candidate constructor (the implicit move 
constructor) not viable}}
-explicit C() {};
-};
-
-template <> C::C() {};
-__device__ C ci_d;
-C ci_h;
-
-// Check non-trivial ctor specialization
-template <> C::C() { static int nontrivial_ctor = 1; } //expected-note 
{{candidate constructor not viable: call to __host__ function from __device__ 
function}}
-__device__ C cf_d; //expected-error {{no matching constructor for 
initialization of 'C'}}
-C cf_h;




[clang] [CUDA][HIP] allow trivial ctor/dtor in device var init (PR #73140)

2023-11-22 Thread Yaxun Liu via cfe-commits


yxsamliu wrote:

> Could you first land the two reverts 
> ([511cecf](https://github.com/llvm/llvm-project/commit/511cecff7f76958ebfe713189bc106615763b64a)
>  and 
> [e9a8e90](https://github.com/llvm/llvm-project/commit/e9a8e906d4c14eb4b317a7420b9bba3dc7321ba2))
>  and then have the third commit properly reviewed? @Artem-B may be 
> unavailable for a few more days, but we'd like a fix/revert to land very soon.

sure

https://github.com/llvm/llvm-project/pull/73140

[clang] [CUDA][HIP] allow trivial ctor/dtor in device var init (PR #73140)

2023-11-22 Thread via cfe-commits


alexfh wrote:

Could you first land the two reverts (511cecff7f76958ebfe713189bc106615763b64a 
and e9a8e906d4c14eb4b317a7420b9bba3dc7321ba2) and then have the third commit 
properly reviewed? @Artem-B may be unavailable for a few more days, but we'd 
like a fix/revert to land very soon.

https://github.com/llvm/llvm-project/pull/73140

[clang] [clang] Refactor TBAA Base Info construction (PR #70499)

2023-11-22 Thread Nathan Sidwell via cfe-commits


urnathan wrote:

I'm going to break this apart, as I've realized this has conflated two separate 
problems.

https://github.com/llvm/llvm-project/pull/70499

[clang] [clang-format] Option to ignore macro definitions (PR #70338)

2023-11-22 Thread Owen Pan via cfe-commits


owenca wrote:

After giving this more thought, I think what we really want is 
`SkipMacroDefinitionBody`, which would format the code below:
```
   # define   A   a. b   //comment
   # define   A( x , y )   ( ( x ) + ( y ) )
```
to the following:
```
#define A   a. b // comment
#define A(x, y)   ( ( x ) + ( y ) )
```
That is, everything is formatted except the body of a macro definition 
(trailing comments are still formatted).

https://github.com/llvm/llvm-project/pull/70338

[clang] [clang] fix emitvaarg when struct is null (PR #72624)

2023-11-22 Thread via cfe-commits


Jolyon0202 wrote:

@efriedma-quic

https://github.com/llvm/llvm-project/pull/72624

[clang] [Driver] Default Generic_GCC aarch64_be to -fasynchronous-unwind-tables (PR #72971)

2023-11-22 Thread via cfe-commits


https://github.com/hstk30-hw closed 
https://github.com/llvm/llvm-project/pull/72971

[clang] b14f651 - [Driver] Default Generic_GCC aarch64_be to -fasynchronous-unwind-tables (#72971)

2023-11-22 Thread via cfe-commits


Author: dong jianqiang
Date: 2023-11-23T09:30:51+08:00
New Revision: b14f651caf4bb507753ffc94db8911bb2e2a7995

URL: 
https://github.com/llvm/llvm-project/commit/b14f651caf4bb507753ffc94db8911bb2e2a7995
DIFF: 
https://github.com/llvm/llvm-project/commit/b14f651caf4bb507753ffc94db8911bb2e2a7995.diff

LOG: [Driver] Default Generic_GCC aarch64_be to -fasynchronous-unwind-tables 
(#72971)

This patch defaults Generic_GCC aarch64_be to use
-fasynchronous-unwind-tables and ensures consistent behavior with
aarch64 little endian.

Added: 


Modified: 
clang/lib/Driver/ToolChains/Gnu.cpp
clang/test/Driver/aarch64-features.c

Removed: 




diff  --git a/clang/lib/Driver/ToolChains/Gnu.cpp 
b/clang/lib/Driver/ToolChains/Gnu.cpp
index 9fb99145d3b909e..b8759918445 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -2933,6 +2933,7 @@ ToolChain::UnwindTableLevel
 Generic_GCC::getDefaultUnwindTableLevel(const ArgList &Args) const {
   switch (getArch()) {
   case llvm::Triple::aarch64:
+  case llvm::Triple::aarch64_be:
   case llvm::Triple::ppc:
   case llvm::Triple::ppcle:
   case llvm::Triple::ppc64:

diff  --git a/clang/test/Driver/aarch64-features.c 
b/clang/test/Driver/aarch64-features.c
index a797cc0cf9084c2..d2075c91314a8b2 100644
--- a/clang/test/Driver/aarch64-features.c
+++ b/clang/test/Driver/aarch64-features.c
@@ -1,4 +1,5 @@
 // RUN: %clang --target=aarch64-none-linux-gnu -### %s -fsyntax-only 2>&1 | 
FileCheck %s
+// RUN: %clang --target=aarch64_be-none-linux-gnu -### %s -fsyntax-only 2>&1 | 
FileCheck %s
 // RUN: %clang --target=arm64-none-linux-gnu -### %s -fsyntax-only 2>&1 | 
FileCheck %s
 
 // CHECK: "-funwind-tables=2"




[clang] [clang][analyzer][NFC] Use `*EofVal` instead of constant `-1` (PR #73072)

2023-11-22 Thread Ben Shi via cfe-commits


https://github.com/benshi001 closed 
https://github.com/llvm/llvm-project/pull/73072

[clang] [clang] Fix sorting module headers (PR #73146)

2023-11-22 Thread David Blaikie via cfe-commits

dwblaikie wrote:

> Splitting it wouldn't help with bisect, as we would continue having a broken 
> commit.

Not sure I understand - presumably this bug has existed for a while, separate 
from the qsort issue? So fixing it separately seems good so that patches do one 
thing clearly - makes it easy to review, easy to identify root causes more 
narrowly, easy to revert, etc.

https://github.com/llvm/llvm-project/pull/73146

[clang] [AArch64][SME2] Add multi-vector SEL (x2, x4) ACLE builtins & intrinsics (PR #73188)

2023-11-22 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Dinar Temirbulatov (dtemirbulatov)


Changes

Add multi-vector SEL (x2, x4) ACLE builtins & intrinsics
Patch by: David Sherwood 

---

Patch is 114.68 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/73188.diff


3 Files Affected:

- (modified) clang/include/clang/Basic/arm_sve.td (+4) 
- (added) clang/test/CodeGen/aarch64-sme2-intrinsics/acle_sme2_selx2.c (+384) 
- (added) clang/test/CodeGen/aarch64-sme2-intrinsics/acle_sme2_selx4.c (+576) 


``diff
diff --git a/clang/include/clang/Basic/arm_sve.td 
b/clang/include/clang/Basic/arm_sve.td
index cd4c09a3ad7a81c..a00229a2b30bf7f 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -2105,6 +2105,10 @@ let TargetGuard = "sme2" in {
 // == ADD (vectors) ==
   def SVADD_SINGLE_X2 : SInst<"svadd[_single_{d}_x2]", "22d", "cUcsUsiUilUl", 
MergeNone, "aarch64_sve_add_single_x2", [IsStreaming], []>;
   def SVADD_SINGLE_X4 : SInst<"svadd[_single_{d}_x4]", "44d", "cUcsUsiUilUl", 
MergeNone, "aarch64_sve_add_single_x4", [IsStreaming], []>;
+
+  // 2-way and 4-way selects
+  def SVSEL_X2  : SInst<"svsel[_{d}_x2]", "2}22", "cUcsUsiUilUlbhfd", 
MergeNone, "aarch64_sve_sel_x2", [IsStreaming], []>;
+  def SVSEL_X4  : SInst<"svsel[_{d}_x4]", "4}44", "cUcsUsiUilUlbhfd", 
MergeNone, "aarch64_sve_sel_x4", [IsStreaming], []>;
 }
 
 let TargetGuard = "sve2p1" in {
diff --git a/clang/test/CodeGen/aarch64-sme2-intrinsics/acle_sme2_selx2.c 
b/clang/test/CodeGen/aarch64-sme2-intrinsics/acle_sme2_selx2.c
new file mode 100644
index 000..271f964559ce5ab
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-sme2-intrinsics/acle_sme2_selx2.c
@@ -0,0 +1,384 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: aarch64-registered-target
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | 
opt -S -passes=mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x 
c++ %s | opt -S -passes=mem2reg,instcombine,tailcallelim | FileCheck %s 
-check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall 
-emit-llvm -o - %s | opt -S -passes=mem2reg,instcombine,tailcallelim | 
FileCheck %s
+// RUN: %clang_cc1 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall 
-emit-llvm -o - -x c++ %s | opt -S -passes=mem2reg,instcombine,tailcallelim | 
FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -target-feature -S -disable-O0-optnone -Werror -Wall -o 
/dev/null %s
+#include 
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1,A2_UNUSED) A1
+#else
+#define SVE_ACLE_FUNC(A1,A2) A1##A2
+#endif
+
+// 8-bit ZIPs
+
+// CHECK-LABEL: @test_svsel_s8_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[TMP0:%.*]] = tail call  
@llvm.vector.extract.nxv16i8.nxv32i8( [[ZN:%.*]], i64 0)
+// CHECK-NEXT:[[TMP1:%.*]] = tail call  
@llvm.vector.extract.nxv16i8.nxv32i8( [[ZN]], i64 16)
+// CHECK-NEXT:[[TMP2:%.*]] = tail call  
@llvm.vector.extract.nxv16i8.nxv32i8( [[ZM:%.*]], i64 0)
+// CHECK-NEXT:[[TMP3:%.*]] = tail call  
@llvm.vector.extract.nxv16i8.nxv32i8( [[ZM]], i64 16)
+// CHECK-NEXT:[[TMP4:%.*]] = tail call { ,  } @llvm.aarch64.sve.sel.x2.nxv16i8(target("aarch64.svcount") [[PN:%.*]], 
 [[TMP0]],  [[TMP1]],  
[[TMP2]],  [[TMP3]])
+// CHECK-NEXT:[[TMP5:%.*]] = extractvalue { ,  } [[TMP4]], 0
+// CHECK-NEXT:[[TMP6:%.*]] = tail call  
@llvm.vector.insert.nxv32i8.nxv16i8( poison,  [[TMP5]], i64 0)
+// CHECK-NEXT:[[TMP7:%.*]] = extractvalue { ,  } [[TMP4]], 1
+// CHECK-NEXT:[[TMP8:%.*]] = tail call  
@llvm.vector.insert.nxv32i8.nxv16i8( [[TMP6]],  [[TMP7]], i64 16)
+// CHECK-NEXT:ret  [[TMP8]]
+//
+// CPP-CHECK-LABEL: @_Z16test_svsel_s8_x2u11__SVCount_t10svint8x2_tS0_(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:[[TMP0:%.*]] = tail call  
@llvm.vector.extract.nxv16i8.nxv32i8( [[ZN:%.*]], i64 0)
+// CPP-CHECK-NEXT:[[TMP1:%.*]] = tail call  
@llvm.vector.extract.nxv16i8.nxv32i8( [[ZN]], i64 16)
+// CPP-CHECK-NEXT:[[TMP2:%.*]] = tail call  
@llvm.vector.extract.nxv16i8.nxv32i8( [[ZM:%.*]], i64 0)
+// CPP-CHECK-NEXT:[[TMP3:%.*]] = tail call  
@llvm.vector.extract.nxv16i8.nxv32i8( [[ZM]], i64 16)
+// CPP-CHECK-NEXT:[[TMP4:%.*]] = tail call { ,  } @llvm.aarch64.sve.sel.x2.nxv16i8(target("aarch64.svcount") 
[[PN:%.*]],  [[TMP0]],  [[TMP1]],  [[TMP2]],  [[TMP3]])
+//

[clang] [clang] Fix sorting module headers (PR #73146)

2023-11-22 Thread Tulio Magno Quites Machado Filho via cfe-commits

tuliom wrote:

> Sorry, I don't quite understand this - but I guess the second field ( 
> PathRelativeToRootModuleDirectory ) comparison was to address this bug?

@dwblaikie Yes.

> It'd probably be good to fix that separately, so it can be discussed in more 
> detail, etc.

The commit is already very small. Splitting it wouldn't help with bisect, as we 
would continue having a broken commit. Once both issues are understood, it 
becomes easy to see why both changes are being made.
If you have more questions that would help you understand this commit, I'd be 
glad to answer them.
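
For illustration only (this is not the code from the PR): a minimal sketch of a 
two-key comparison of the kind discussed above, where the second field breaks 
ties left by the first. The struct `HeaderEntry`, the field `NameAsWritten`, and 
the helper names are assumptions made for the example; only 
`PathRelativeToRootModuleDirectory` is named in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a module header entry. Only the second field
// is named in the thread; the first one is an assumption for this sketch.
struct HeaderEntry {
  std::string NameAsWritten;
  std::string PathRelativeToRootModuleDirectory;
};

// Strict weak ordering: compare the first key, and fall back to the second
// key when the first keys are equal, so the result is deterministic.
bool headerLess(const HeaderEntry &A, const HeaderEntry &B) {
  return std::tie(A.NameAsWritten, A.PathRelativeToRootModuleDirectory) <
         std::tie(B.NameAsWritten, B.PathRelativeToRootModuleDirectory);
}

// Sorts the entries in place using the two-key ordering above.
void sortHeaders(std::vector<HeaderEntry> &Headers) {
  std::sort(Headers.begin(), Headers.end(), headerLess);
}
```

With a single-key comparison, two entries sharing a name could end up in either 
order; the tie-breaker removes that ambiguity.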

https://github.com/llvm/llvm-project/pull/73146

[clang] [clang] Fix sorting module headers (PR #73146)

2023-11-22 Thread Tulio Magno Quites Machado Filho via cfe-commits


tuliom wrote:

> Sorry, I don't quite understand this

@dwblaikie Do you have any questions? I'd be glad to answer them.

https://github.com/llvm/llvm-project/pull/73146

[clang] [clang-tools-extra] [llvm] [Clang] Fix linker error for function multiversioning (PR #71706)

2023-11-22 Thread Tom Honermann via cfe-commits



@@ -4282,10 +4300,19 @@ llvm::Constant 
*CodeGenModule::GetOrCreateMultiVersionResolver(GlobalDecl GD) {
   // Holds the name of the resolver, in ifunc mode this is the ifunc (which has
   // a separate resolver).
   std::string ResolverName = MangledName;
-  if (getTarget().supportsIFunc())
-ResolverName += ".ifunc";
-  else if (FD->isTargetMultiVersion())
+  if (getTarget().supportsIFunc()) {
+// In Aarch64, default versions of multiversioned functions are mangled to
+// their 'normal' assembly name. This deviates from other targets which
+// append a '.default' string. As a result we need to continue appending
+// .ifunc in Aarch64.
+// FIXME: Should Aarch64 mangling for 'default' multiversion function and
+// in turn ifunc function match that of other targets?

tahonermann wrote:

@DanielKristofKiss, please ensure this comment and FIXME are addressed/removed 
by your follow-up patch.

https://github.com/llvm/llvm-project/pull/71706

[clang-tools-extra] [clang] [llvm] [Clang] Fix linker error for function multiversioning (PR #71706)

2023-11-22 Thread Tom Honermann via cfe-commits



@@ -4114,8 +4114,26 @@ void CodeGenModule::emitMultiVersionFunctions() {
 }
 
 llvm::Constant *ResolverConstant = GetOrCreateMultiVersionResolver(GD);
-if (auto *IFunc = dyn_cast(ResolverConstant))
+if (auto *IFunc = dyn_cast(ResolverConstant)) {
   ResolverConstant = IFunc->getResolver();
+  // In Aarch64, default versions of multiversioned functions are mangled 
to
+  // their 'normal' assembly name. This deviates from other targets which
+  // append a '.default' string. As a result we need to continue appending
+  // .ifunc in Aarch64.
+  // FIXME: Should Aarch64 mangling for 'default' multiversion function and
+  // in turn ifunc function match that of other targets?
+  if (FD->isTargetClonesMultiVersion() &&
+  !getTarget().getTriple().isAArch64()) {
+const CGFunctionInfo &FI = getTypes().arrangeGlobalDeclaration(GD);
+llvm::FunctionType *DeclTy = getTypes().GetFunctionType(FI);
+std::string MangledName = getMangledNameImpl(
+*this, GD, FD, /*OmitMultiVersionMangling=*/true);
+auto *Alias = llvm::GlobalAlias::create(
+DeclTy, 0, getMultiversionLinkage(*this, GD),
+MangledName + ".ifunc", IFunc, &getModule());
+SetCommonAttributes(FD, Alias);
+  }

tahonermann wrote:

This could use a comment that explains why a `.ifunc` alias is being emitted.

https://github.com/llvm/llvm-project/pull/71706

[clang] [llvm] [clang-tools-extra] [Clang] Fix linker error for function multiversioning (PR #71706)

2023-11-22 Thread Tom Honermann via cfe-commits



@@ -16,13 +16,22 @@
 // LINUX: @__cpu_model = external dso_local global { i32, i32, i32, [1 x i32] }
 // LINUX: @__cpu_features2 = external dso_local global [3 x i32]
 
-// LINUX: @internal.ifunc = internal ifunc i32 (), ptr @internal.resolver
-// LINUX: @foo.ifunc = weak_odr ifunc i32 (), ptr @foo.resolver
-// LINUX: @foo_dupes.ifunc = weak_odr ifunc void (), ptr @foo_dupes.resolver
-// LINUX: @unused.ifunc = weak_odr ifunc void (), ptr @unused.resolver
-// LINUX: @foo_inline.ifunc = weak_odr ifunc i32 (), ptr @foo_inline.resolver
-// LINUX: @foo_inline2.ifunc = weak_odr ifunc i32 (), ptr @foo_inline2.resolver
-// LINUX: @foo_used_no_defn.ifunc = weak_odr ifunc i32 (), ptr 
@foo_used_no_defn.resolver
+// LINUX: @internal.ifunc = internal alias i32 (), ptr @internal
+// LINUX: @foo.ifunc = weak_odr alias i32 (), ptr @foo
+// LINUX: @foo_dupes.ifunc = weak_odr alias void (), ptr @foo_dupes
+// LINUX: @unused.ifunc = weak_odr alias void (), ptr @unused
+// LINUX: @foo_inline.ifunc = weak_odr alias i32 (), ptr @foo_inline
+// LINUX: @foo_inline2.ifunc = weak_odr alias i32 (), ptr @foo_inline2
+// LINUX: @foo_used_no_defn.ifunc = weak_odr alias i32 (), ptr 
@foo_used_no_defn
+// LINUX: @isa_level.ifunc = weak_odr alias i32 (i32), ptr @isa_level
+
+// LINUX: @internal = internal ifunc i32 (), ptr @internal.resolver
+// LINUX: @foo = weak_odr ifunc i32 (), ptr @foo.resolver
+// LINUX: @foo_dupes = weak_odr ifunc void (), ptr @foo_dupes.resolver
+// LINUX: @unused = weak_odr ifunc void (), ptr @unused.resolver
+// LINUX: @foo_inline = weak_odr ifunc i32 (), ptr @foo_inline.resolver
+// LINUX: @foo_inline2 = weak_odr ifunc i32 (), ptr @foo_inline2.resolver
+// LINUX: @foo_used_no_defn = weak_odr ifunc i32 (), ptr 
@foo_used_no_defn.resolver

tahonermann wrote:

Since we're checking for `isa_level.ifunc` above, I think we should also check 
for `isa_level` here.

https://github.com/llvm/llvm-project/pull/71706

[clang-tools-extra] [clang] [llvm] [Clang] Fix linker error for function multiversioning (PR #71706)

2023-11-22 Thread Tom Honermann via cfe-commits



@@ -4114,8 +4114,26 @@ void CodeGenModule::emitMultiVersionFunctions() {
 }
 
 llvm::Constant *ResolverConstant = GetOrCreateMultiVersionResolver(GD);
-if (auto *IFunc = dyn_cast(ResolverConstant))
+if (auto *IFunc = dyn_cast(ResolverConstant)) {
   ResolverConstant = IFunc->getResolver();
+  // In Aarch64, default versions of multiversioned functions are mangled 
to
+  // their 'normal' assembly name. This deviates from other targets which
+  // append a '.default' string. As a result we need to continue appending
+  // .ifunc in Aarch64.
+  // FIXME: Should Aarch64 mangling for 'default' multiversion function and
+  // in turn ifunc function match that of other targets?

tahonermann wrote:

@DanielKristofKiss, please ensure this comment and FIXME are addressed/removed 
by your follow-up patch.

https://github.com/llvm/llvm-project/pull/71706

[clang-tools-extra] [clang] [llvm] [Clang] Fix linker error for function multiversioning (PR #71706)

2023-11-22 Thread Tom Honermann via cfe-commits



@@ -555,6 +555,7 @@ Bug Fixes in This Version
   Fixes (`#67687 `_)
 - Fix crash from constexpr evaluator evaluating uninitialized arrays as rvalue.
   Fixes (`#67317 `_)
+- Fix linker error when using multiversioned function defined in a different 
TU.

tahonermann wrote:

```suggestion
- Fix the name of the ifunc symbol emitted for multiversion functions declared 
with the
  ``target_clones`` attribute. This addresses a linker error that would 
otherwise occur
  when these functions are referenced from other TUs.
```

https://github.com/llvm/llvm-project/pull/71706

[clang] [clang-tools-extra] [llvm] [Clang] Fix linker error for function multiversioning (PR #71706)

2023-11-22 Thread Tom Honermann via cfe-commits


https://github.com/tahonermann requested changes to this pull request.

I requested some minor changes.

Can we document the `.ifunc` symbols as a deprecated feature? With this change, 
they will never be referenced except by code compiled by older compiler 
versions. Maybe plan to deprecate them a year from now?

It looks like there is a related issue in which multiple ifunc symbols are 
emitted for the `cpu_dispatch` attribute. See https://godbolt.org/z/71vr8ceza. 
The relevant symbols emitted are listed below. Note that both 
`_Z12cpu_specificv` and `_Z12cpu_specificv.ifunc` are "i" symbols with the same 
address. The caller in this case calls the `.ifunc` symbol (just as for 
`target_clones` prior to this change). It would be nice if we can fix this 
issue at the same time and likewise deprecate the `.ifunc` symbol for 
`cpu_dispatch`/`cpu_specific`.
```
24c0 i _Z12cpu_specificv
2480 T _Z12cpu_specificv.A
2490 T _Z12cpu_specificv.M
24c0 i _Z12cpu_specificv.ifunc
24c0 W _Z12cpu_specificv.resolver
```

I think the only time a symbol with a `.ifunc` suffix is actually needed is 
when the `target` attribute is used in an overloading context (since in that 
situation, the `target(default)` definition gets the non-suffixed name).
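
For illustration only (not part of the patch): a minimal sketch of a function 
declared with the `target_clones` attribute discussed above. The function name 
`sumTo`, the macro `MULTIVERSIONED`, and the chosen clone list are assumptions 
made for the example. The attribute is applied only where x86-64 Linux with 
glibc and compiler support are detected; elsewhere the function is built as an 
ordinary one. Which resolver and ifunc symbols the compiler emits for it depends 
on the compiler and its version, so `nm` on the resulting object is the way to 
check.

```cpp
#include <cassert>
#include <cstdlib> // pulls in the libc feature macros tested below

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

// Apply the attribute only where ifunc-based dispatch is expected to work.
#if defined(__x86_64__) && defined(__linux__) && defined(__GLIBC__) && \
    __has_attribute(target_clones)
#define MULTIVERSIONED __attribute__((target_clones("avx2", "default")))
#else
#define MULTIVERSIONED
#endif

// One source-level definition; with the attribute active, the compiler
// emits one clone per listed target plus a resolver that picks among them.
MULTIVERSIONED int sumTo(int N) {
  int Total = 0;
  for (int I = 1; I <= N; ++I)
    Total += I;
  return Total;
}
```

Callers simply call `sumTo`; the dispatch is invisible at the source level, 
which is why the choice of emitted symbol name only matters to the linker and 
to code compiled by other compiler versions.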

https://github.com/llvm/llvm-project/pull/71706

[clang-tools-extra] [llvm] [clang] [Clang] Fix linker error for function multiversioning (PR #71706)

2023-11-22 Thread Tom Honermann via cfe-commits


https://github.com/tahonermann edited 
https://github.com/llvm/llvm-project/pull/71706
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[compiler-rt] [clang] [llvm] [lld] [flang] [libc] [libcxx] Fix ISel crash when lowering BUILD_VECTOR (PR #73186)

2023-11-22 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-x86

Author: David Li (david-xl)


Changes

The 512-bit vpbroadcastw instruction is available only with AVX512BW. Avoid 
lowering BUILD_VECTOR into a vbroadcast node when that condition is not met. 
This fixes a crash (see the newly added test).
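
For illustration only: the condition described above can be modelled as a 
standalone predicate. This is a simplified sketch, not the code in 
X86ISelLowering.cpp; the function name `mayLowerAsBroadcast` and its plain 
integer/boolean parameters are assumptions made for the example, and it models 
only this one bail-out, not the other checks the real lowering performs.

```cpp
#include <cassert>

// Returns false for the one case the description rules out: a 16-bit
// scalar broadcast into a vector wider than 256 bits without AVX512BW.
// Every other combination is left to the remaining (unmodelled) checks.
bool mayLowerAsBroadcast(unsigned ScalarSizeInBits, unsigned VectorSizeInBits,
                         bool HasBWI) {
  const bool IsGT256 = VectorSizeInBits > 256;
  if (ScalarSizeInBits == 16 && IsGT256 && !HasBWI)
    return false;
  return true;
}
```

So a 512-bit vector of 16-bit elements is rejected without AVX512BW, while the 
same vector with AVX512BW, or a 256-bit vector, is not affected by this check.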

---
Full diff: https://github.com/llvm/llvm-project/pull/73186.diff


2 Files Affected:

- (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+5) 
- (modified) llvm/test/CodeGen/X86/shuffle-half.ll (+338) 


``diff
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 05a2ab093bb86f9..e238defd3abb00c 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -7236,6 +7236,7 @@ static SDValue 
lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp,
 
   unsigned ScalarSize = Ld.getValueSizeInBits();
   bool IsGE256 = (VT.getSizeInBits() >= 256);
+  bool IsGT256 = (VT.getSizeInBits() > 256);
 
   // When optimizing for size, generate up to 5 extra bytes for a broadcast
   // instruction to save 8 or more bytes of constant pool data.
@@ -7254,6 +7255,10 @@ static SDValue 
lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp,
 EVT CVT = Ld.getValueType();
 assert(!CVT.isVector() && "Must not broadcast a vector type");
 
+// 512 bit vpbroadcastw is only available with AVX512BW
+if (ScalarSize == 16 && IsGT256 && !Subtarget.hasBWI())
+  return SDValue();
+
 // Splat f16, f32, i32, v4f64, v4i64 in all cases with AVX2.
 // For size optimization, also splat v2f64 and v2i64, and for size opt
 // with AVX2, also splat i8 and i16.
diff --git a/llvm/test/CodeGen/X86/shuffle-half.ll 
b/llvm/test/CodeGen/X86/shuffle-half.ll
index 0529ca1a0b82c1d..7d05fd647c09e26 100644
--- a/llvm/test/CodeGen/X86/shuffle-half.ll
+++ b/llvm/test/CodeGen/X86/shuffle-half.ll
@@ -308,4 +308,342 @@ define <32 x half> @dump_vec() {
   ret <32 x half> %1
 }
 
+define <32 x half> @build_vec(ptr %p, <32 x i1> %mask) {
+; CHECK-LABEL: build_vec:
+; CHECK:   # %bb.0:
+; CHECK-NEXT:vpsllw $7, %ymm0, %ymm0
+; CHECK-NEXT:vpmovmskb %ymm0, %eax
+; CHECK-NEXT:testb $1, %al
+; CHECK-NEXT:je .LBB1_1
+; CHECK-NEXT:  # %bb.2: # %cond.load
+; CHECK-NEXT:vpinsrw $0, (%rdi), %xmm0, %xmm0
+; CHECK-NEXT:vpblendw {{.*#+}} xmm0 = xmm0[0],mem[1,2,3,4,5,6,7]
+; CHECK-NEXT:vpbroadcastd {{.*#+}} zmm1 = 
[2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0]
+; CHECK-NEXT:vinserti32x4 $0, %xmm0, %zmm1, %zmm0
+; CHECK-NEXT:testb $2, %al
+; CHECK-NEXT:jne .LBB1_4
+; CHECK-NEXT:jmp .LBB1_5
+; CHECK-NEXT:  .LBB1_1:
+; CHECK-NEXT:vpbroadcastd {{.*#+}} zmm0 = 
[2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0,2.0E+0]
+; CHECK-NEXT:testb $2, %al
+; CHECK-NEXT:je .LBB1_5
+; CHECK-NEXT:  .LBB1_4: # %cond.load1
+; CHECK-NEXT:vpbroadcastw 2(%rdi), %xmm1
+; CHECK-NEXT:vpblendw {{.*#+}} xmm1 = xmm0[0],xmm1[1],xmm0[2,3,4,5,6,7]
+; CHECK-NEXT:vinserti32x4 $0, %xmm1, %zmm0, %zmm0
+; CHECK-NEXT:  .LBB1_5: # %else2
+; CHECK-NEXT:testb $4, %al
+; CHECK-NEXT:jne .LBB1_6
+; CHECK-NEXT:  # %bb.7: # %else5
+; CHECK-NEXT:testb $8, %al
+; CHECK-NEXT:jne .LBB1_8
+; CHECK-NEXT:  .LBB1_9: # %else8
+; CHECK-NEXT:testb $16, %al
+; CHECK-NEXT:jne .LBB1_10
+; CHECK-NEXT:  .LBB1_11: # %else11
+; CHECK-NEXT:testb $32, %al
+; CHECK-NEXT:jne .LBB1_12
+; CHECK-NEXT:  .LBB1_13: # %else14
+; CHECK-NEXT:testb $64, %al
+; CHECK-NEXT:jne .LBB1_14
+; CHECK-NEXT:  .LBB1_15: # %else17
+; CHECK-NEXT:testb %al, %al
+; CHECK-NEXT:js .LBB1_16
+; CHECK-NEXT:  .LBB1_17: # %else20
+; CHECK-NEXT:testl $256, %eax # imm = 0x100
+; CHECK-NEXT:jne .LBB1_18
+; CHECK-NEXT:  .LBB1_19: # %else23
+; CHECK-NEXT:testl $512, %eax # imm = 0x200
+; CHECK-NEXT:jne .LBB1_20
+; CHECK-NEXT:  .LBB1_21: # %else26
+; CHECK-NEXT:testl $1024, %eax # imm = 0x400
+; CHECK-NEXT:jne .LBB1_22
+; CHECK-NEXT:  .LBB1_23: # %else29
+; CHECK-NEXT:testl $2048, %eax # imm = 0x800
+; CHECK-NEXT:jne .LBB1_24
+; CHECK-NEXT:  .LBB1_25: # %else32
+; CHECK-NEXT:testl $4096, %eax # imm = 0x1000
+; CHECK-NEXT:jne .LBB1_26
+; CHECK-NEXT:  .LBB1_27: # %else35
+; CHECK-NEXT:testl $8192, %eax # imm = 0x2000
+; CHECK-NEXT:jne .LBB1_28
+; CHECK-NEXT:  .LBB1_29: # %else38
+; CHECK-NEXT:testl $16384, %eax # imm = 0x4000
+; CHECK-NEXT:jne .LBB1_30
+; CHECK-NEXT:  .LBB1_31: # %else41
+; CHECK-NEXT:testw %ax, %ax
+; CHECK-NEXT:js .LBB1_32
+; CHECK-NEXT:  .LBB1_33: # %else44
+; CHECK-NEXT:testl $65536, %eax # imm = 0x1
+; CHECK-NEXT:jne .LBB1_34
+; CHECK-NEX

[llvm] [clang] [SystemZ][z/OS] This change adds support for the PPA2 section in zOS (PR #68926)

2023-11-22 Thread Yusra Syeda via cfe-commits



@@ -1026,6 +1030,71 @@ void SystemZAsmPrinter::emitADASection() {
   OutStreamer->popSection();
 }
 
+static uint32_t getProductVersion(Module &M) {
+  if (auto *VersionVal = mdconst::extract_or_null(
+  M.getModuleFlag("zos_product_major_version")))
+return VersionVal->getValue().getZExtValue();

ysyeda wrote:

Done

https://github.com/llvm/llvm-project/pull/68926

[clang] [llvm] [SystemZ][z/OS] This change adds support for the PPA2 section in zOS (PR #68926)

2023-11-22 Thread Yusra Syeda via cfe-commits



@@ -1026,6 +1030,71 @@ void SystemZAsmPrinter::emitADASection() {
   OutStreamer->popSection();
 }
 
+static uint32_t getProductVersion(Module &M) {
+  if (auto *VersionVal = mdconst::extract_or_null(
+  M.getModuleFlag("zos_product_major_version")))
+return VersionVal->getValue().getZExtValue();
+  return LLVM_VERSION_MAJOR;
+}
+
+static uint32_t getProductRelease(Module &M) {
+  if (auto *ReleaseVal = cast_or_null(
+  M.getModuleFlag("zos_product_minor_version")))
+return cast(ReleaseVal->getValue())->getZExtValue();
+  return LLVM_VERSION_MINOR;
+}
+
+static uint32_t getProductPatch(Module &M) {
+  if (auto *PatchVal = cast_or_null(
+  M.getModuleFlag("zos_product_patchlevel")))
+return cast(PatchVal->getValue())->getZExtValue();
+  return LLVM_VERSION_PATCH;
+}
+
+static time_t getTranslationTime(Module &M) {
+  std::time_t Time = 0;
+  if (auto *Val = cast_or_null(
+  M.getModuleFlag("zos_translation_time"))) {
+long SecondsSinceEpoch = 
cast(Val->getValue())->getSExtValue();
+Time = static_cast(SecondsSinceEpoch);
+  }
+  return Time;
+}
+
+void SystemZAsmPrinter::emitIDRLSection(Module &M) {
+  OutStreamer->pushSection();
+  OutStreamer->switchSection(getObjFileLowering().getIDRLSection());
+  constexpr unsigned IDRLDataLength = 30;
+  std::time_t Time = getTranslationTime(M);
+
+  uint32_t ProductVersion = getProductVersion(M);
+  uint32_t ProductRelease = getProductRelease(M);
+
+  std::string ProductID;
+  if (auto *MD = M.getModuleFlag("zos_product_id"))
+ProductID = cast(MD)->getString().str();
+
+  if (ProductID.empty()) {
+char ProductIDFormatted[11]; // 10 + null.
+snprintf(ProductIDFormatted, sizeof(ProductIDFormatted), "LLVM");

ysyeda wrote:

Done

https://github.com/llvm/llvm-project/pull/68926

[clang] [llvm] [SystemZ][z/OS] This change adds support for the PPA2 section in zOS (PR #68926)

2023-11-22 Thread Yusra Syeda via cfe-commits


https://github.com/ysyeda updated 
https://github.com/llvm/llvm-project/pull/68926

>From 78f82bcf33998de0663f4684a64a240f2e97f8a9 Mon Sep 17 00:00:00 2001
From: Yusra Syeda 
Date: Thu, 12 Oct 2023 16:56:27 -0400
Subject: [PATCH 01/20] This change adds support for the PPA2 section in zOS

---
 clang/lib/Basic/LangStandards.cpp |   6 +
 clang/lib/CodeGen/CodeGenModule.cpp   |  15 ++
 clang/lib/Driver/ToolChains/Clang.cpp |  13 +-
 clang/lib/Driver/ToolChains/Clang.h   |   3 +-
 clang/test/CodeGen/SystemZ/systemz-ppa2.c |  25 +++
 llvm/include/llvm/BinaryFormat/GOFF.h |   1 +
 llvm/include/llvm/MC/MCObjectFileInfo.h   |   4 +
 llvm/lib/MC/MCObjectFileInfo.cpp  |   5 +
 llvm/lib/Target/SystemZ/SystemZAsmPrinter.cpp | 197 +-
 llvm/lib/Target/SystemZ/SystemZAsmPrinter.h   |   7 +-
 llvm/test/CodeGen/SystemZ/zos-ppa2.ll |  26 +++
 11 files changed, 297 insertions(+), 5 deletions(-)
 create mode 100644 clang/test/CodeGen/SystemZ/systemz-ppa2.c
 create mode 100644 llvm/test/CodeGen/SystemZ/zos-ppa2.ll

diff --git a/clang/lib/Basic/LangStandards.cpp 
b/clang/lib/Basic/LangStandards.cpp
index ab09c7221dda92f..cfe79ec90f3796b 100644
--- a/clang/lib/Basic/LangStandards.cpp
+++ b/clang/lib/Basic/LangStandards.cpp
@@ -10,10 +10,16 @@
 #include "clang/Config/config.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/FormatVariadic.h"
 #include "llvm/TargetParser/Triple.h"
 using namespace clang;
 
 StringRef clang::languageToString(Language L) {
+const char *clang::LanguageToString(Language L) {
+  // I would like to make this function and the definition of Language
+  // in the .h file simply expand the contents of a .def file.
+  // However, in the .h the members of the enum have doxygen annotations
+  // and/or comments which would be lost.
   switch (L) {
   case Language::Unknown:
 return "Unknown";
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index b1a6683a66bd052..9a4763413ea3fbc 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -976,6 +976,21 @@ void CodeGenModule::Release() {
   Context.getTypeSizeInChars(Context.getWideCharType()).getQuantity();
   getModule().addModuleFlag(llvm::Module::Error, "wchar_size", WCharWidth);
 
+  if (getTriple().isOSzOS()) {
+int32_t ProductVersion, ProductRelease, ProductPatch;
+ProductVersion = LLVM_VERSION_MAJOR,
+ProductRelease = LLVM_VERSION_MINOR, ProductPatch = LLVM_VERSION_PATCH;
+getModule().addModuleFlag(llvm::Module::Warning, "Product Major Version", 
ProductVersion);
+getModule().addModuleFlag(llvm::Module::Warning, "Product Minor Version", 
ProductRelease);
+getModule().addModuleFlag(llvm::Module::Warning, "Product Patchlevel", 
ProductPatch);
+
+// Record the language because we need it for the PPA2.
+const char *lang_str = LanguageToString(
+LangStandard::getLangStandardForKind(LangOpts.LangStd).Language);
+getModule().addModuleFlag(llvm::Module::Error, "zos_cu_language",
+  llvm::MDString::get(VMContext, lang_str));
+  }
+
   llvm::Triple::ArchType Arch = Context.getTargetInfo().getTriple().getArch();
   if (   Arch == llvm::Triple::arm
   || Arch == llvm::Triple::armeb
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 43a92adbef64ba8..109699f2ea4a62a 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -1765,7 +1765,7 @@ void Clang::RenderTargetOptions(const llvm::Triple 
&EffectiveTriple,
 break;
 
   case llvm::Triple::systemz:
-AddSystemZTargetArgs(Args, CmdArgs);
+AddSystemZTargetArgs(EffectiveTriple, Args, CmdArgs);
 break;
 
   case llvm::Triple::x86:
@@ -2262,7 +2262,8 @@ void Clang::AddSparcTargetArgs(const ArgList &Args,
   }
 }
 
-void Clang::AddSystemZTargetArgs(const ArgList &Args,
+void Clang::AddSystemZTargetArgs(const llvm::Triple &Triple,
+ const ArgList &Args,
  ArgStringList &CmdArgs) const {
   if (const Arg *A = Args.getLastArg(options::OPT_mtune_EQ)) {
 CmdArgs.push_back("-tune-cpu");
@@ -2294,6 +2295,14 @@ void Clang::AddSystemZTargetArgs(const ArgList &Args,
 CmdArgs.push_back("-mfloat-abi");
 CmdArgs.push_back("soft");
   }
+
+  if (Triple.isOSzOS()) {
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(
+Args.MakeArgString(llvm::Twine("-translation-time=")
+   .concat(llvm::Twine(std::time(nullptr)))
+   .str()));
+  }
 }
 
 void Clang::AddX86TargetArgs(const ArgList &Args,
diff --git a/clang/lib/Driver/ToolChains/Clang.h 
b/clang/lib/Driver/ToolChains/Clang.h
index 0f503c4bd1c4fea..9f065f846b4cf34 100644
--- a/clang/lib/Driver/ToolChains/Clang.h
+++ b/clang/lib/Driver/ToolChains/Clang

[clang] [llvm] [SystemZ][z/OS] This change adds support for the PPA2 section in zOS (PR #68926)

2023-11-22 Thread Yusra Syeda via cfe-commits



@@ -1026,6 +1030,78 @@ void SystemZAsmPrinter::emitADASection() {
   OutStreamer->popSection();
 }
 
+static uint32_t getProductVersion(Module &M) {
+  if (auto *VersionVal = cast_or_null(
+  M.getModuleFlag("zos_product_major_version")))
+return cast(VersionVal->getValue())->getZExtValue();
+  return LLVM_VERSION_MAJOR;
+}
+
+static uint32_t getProductRelease(Module &M) {
+  if (auto *ReleaseVal = cast_or_null(
+  M.getModuleFlag("zos_product_minor_version")))
+return cast(ReleaseVal->getValue())->getZExtValue();
+  return LLVM_VERSION_MINOR;
+}
+
+static uint32_t getProductPatch(Module &M) {
+  if (auto *PatchVal = cast_or_null(
+  M.getModuleFlag("zos_product_patchlevel")))
+return cast(PatchVal->getValue())->getZExtValue();
+  return LLVM_VERSION_PATCH;
+}
+
+static time_t getTranslationTime(Module &M) {
+  std::time_t Time = 0;
+  if (auto *Val = cast_or_null(
+  M.getModuleFlag("zos_translation_time"))) {
+long SecondsSinceEpoch = 
cast(Val->getValue())->getSExtValue();
+Time = static_cast(SecondsSinceEpoch);
+  }
+  return Time;
+}
+
+void SystemZAsmPrinter::emitIDRLSection(Module &M) {
+  OutStreamer->pushSection();
+  OutStreamer->switchSection(getObjFileLowering().getIDRLSection());
+  constexpr unsigned IDRLDataLength = 30;
+  std::time_t Time = getTranslationTime(M);
+
+  uint32_t ProductVersion = getProductVersion(M);
+  uint32_t ProductRelease = getProductRelease(M);
+
+  std::string ProductID;
+  if (auto *MD = M.getModuleFlag("zos_product_id"))
+ProductID = cast(MD)->getString().str();
+
+  if (ProductID.empty()) {
+char ProductIDFormatted[11]; // 10 + null.
+snprintf(ProductIDFormatted, sizeof(ProductIDFormatted), "LLVM  %02d%02d",
+ ProductVersion, ProductRelease);
+ProductID = ProductIDFormatted;
+  }
+
+  // Remove - from Product Id, which makes it consistent with legacy.
+  // The binder expects alphanumeric characters only.
+  std::size_t DashFound = ProductID.find("-");
+  if (DashFound != std::string::npos)
+ProductID.erase(ProductID.begin() + DashFound);

ysyeda wrote:

I tried changing the product ID to an arbitrary string containing multiple 
dashes and non-alphanumeric characters, and it does not cause issues in the 
binder. The original comment is wrong: the only reason to remove the '-' is for 
the C/C++ compiler to match xlc/xlC. So this code can be removed from here.
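
For context, a minimal standalone sketch of what the quoted `find`/`erase` 
pattern does. It is not part of the patch; the function names and the sample 
ID below are hypothetical. The pattern drops only the first '-', whereas the 
erase-remove idiom would drop every one.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Same find/erase pattern as the quoted diff: removes the first '-' only.
std::string eraseFirstDash(std::string ID) {
  std::size_t DashFound = ID.find('-');
  if (DashFound != std::string::npos)
    ID.erase(ID.begin() + DashFound);
  return ID;
}

// Hypothetical alternative (not in the patch): strips every '-'.
std::string eraseAllDashes(std::string ID) {
  ID.erase(std::remove(ID.begin(), ID.end(), '-'), ID.end());
  return ID;
}

// Small self-check; "12-34-56" is a made-up ID used only for illustration.
void checkDashExamples() {
  assert(eraseFirstDash("12-34-56") == "1234-56");
  assert(eraseAllDashes("12-34-56") == "123456");
  assert(eraseFirstDash("LLVM") == "LLVM");
}
```

Calling `checkDashExamples()` from any `main` exercises both helpers.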

https://github.com/llvm/llvm-project/pull/68926

[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #68932)

2023-11-22 Thread Jun Wang via cfe-commits



@@ -1708,6 +1710,19 @@ bool 
SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
 }
 
 ++Iter;
+if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) {
+  auto Builder =
+  BuildMI(Block, Iter, DebugLoc(), TII->get(AMDGPU::S_WAITCNT))
+  .addImm(0);
+  if (IsGFX10Plus) {

jwanggit86 wrote:

My understanding is that the feature request asks for an "s_waitcnt 0" to be 
*blindly* inserted after each and every memory instruction. Enabling the 
feature is at the user's discretion via a clang command-line option (disabled 
by default). The purpose of the feature is to help debug memory problems on 
GPUs that do not support precise memory (although someone, Tony I think, 
mentioned it could go beyond debugging). I'll send you the link for the feature 
request.

Based on that, the implementation doesn't check GPU models, doesn't have 
model-dependent code (except the newly added code for GFX10+), and doesn't 
differentiate loads from stores. I'll work with the requester to get the 
requirements straightened out.
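
The "blind" insertion described above can be sketched as a toy model. This is 
only an illustration under stated assumptions: `Inst`, 
`insertWaitAfterMemoryOps`, and the instruction names are hypothetical and do 
not use the LLVM MachineInstr API or the actual pass.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct Inst {
  std::string Name;
  bool MayLoadOrStore;
};

// Append a wait after every memory instruction, with no check of the GPU
// model and no distinction between loads and stores.
std::vector<Inst> insertWaitAfterMemoryOps(const std::vector<Inst> &Block) {
  std::vector<Inst> Out;
  for (const Inst &I : Block) {
    Out.push_back(I);
    if (I.MayLoadOrStore)
      Out.push_back({"s_waitcnt 0", false});
  }
  return Out;
}

// Small self-check on a made-up three-instruction block.
void checkWaitInsertion() {
  std::vector<Inst> Block = {{"load", true}, {"add", false}, {"store", true}};
  std::vector<Inst> Out = insertWaitAfterMemoryOps(Block);
  assert(Out.size() == 5);
  assert(Out[1].Name == "s_waitcnt 0");
  assert(Out[2].Name == "add");
  assert(Out[4].Name == "s_waitcnt 0");
}
```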

https://github.com/llvm/llvm-project/pull/68932

Re: [clang] 661a73f - Fix typo in DiagnosticSemaKinds.td

2023-11-22 Thread David Blaikie via cfe-commits

Does this diagnostic have test coverage? Could it be expanded to check the
spelling here? (Mostly as a motivation to ensure this diagnostic is
actually tested...)

On Mon, Nov 20, 2023 at 3:04 AM via cfe-commits 
wrote:

>
> Author: Utkarsh Saxena
> Date: 2023-11-20T12:04:32+01:00
> New Revision: 661a73ff712c54d05042eb37d536be4bade307b4
>
> URL:
> https://github.com/llvm/llvm-project/commit/661a73ff712c54d05042eb37d536be4bade307b4
> DIFF:
> https://github.com/llvm/llvm-project/commit/661a73ff712c54d05042eb37d536be4bade307b4.diff
>
> LOG: Fix typo in DiagnosticSemaKinds.td
>
> s/makred/marked
>
> Added:
>
>
> Modified:
> clang/include/clang/Basic/DiagnosticSemaKinds.td
>
> Removed:
>
>
>
>
> 
> diff  --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td
> b/clang/include/clang/Basic/DiagnosticSemaKinds.td
> index 1a6c4d80ccb6ac2..9a7dafa4a298273 100644
> --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
> +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
> @@ -11594,7 +11594,7 @@ def err_coro_invalid_addr_of_label : Error<
>"the GNU address of label extension is not allowed in coroutines."
>  >;
>  def err_coroutine_return_type : Error<
> -  "function returns a type %0 makred with [[clang::coro_return_type]] but
> is neither a coroutine nor a coroutine wrapper; "
> +  "function returns a type %0 marked with [[clang::coro_return_type]] but
> is neither a coroutine nor a coroutine wrapper; "
>"non-coroutines should be marked with [[clang::coro_wrapper]] to allow
> returning coroutine return type"
>  >;
>  } // end of coroutines issue category
>
>
>

[clang] [Driver] Add support for -export-dynamic which can match GCC behavior. (PR #72781)

2023-11-22 Thread Fangrui Song via cfe-commits


MaskRay wrote:

GCC compatibility certainly matters a lot, but I feel that some of my previous 
points are ignored. My previous replies asked us to focus on other aspects: (a) 
whether the behavior makes sense or makes things more error-prone (b) whether 
not supporting it causes us trouble (c) whether the option has an alternative 
(d) ...

The facts that `-exxx`/`-e xxx` had been broken for so long until I fixed it 
in 2020 and that `-Wl,--export-dynamic`/`-rdynamic` are almost exclusively used 
mean a lot. The GCC PR says that (a) `-export-dynamic` was not supported in GCC 
(b) that it worked was more of an accident (c) adding `-export-dynamic` was 
useful around 2011 when they needed to work with the problematic gawk use. If 
you have private software that uses `-export-dynamic`, can you fix it instead?

https://github.com/llvm/llvm-project/pull/72781

[clang] [LinkerWrapper] Accept some needed lld-link linker arguments for COFF targets (PR #72889)

2023-11-22 Thread Michael Kruse via cfe-commits


https://github.com/Meinersbur edited 
https://github.com/llvm/llvm-project/pull/72889

[lld] [clang-tools-extra] [mlir] [clang] [llvm] [flang] [clangtidy] Allow safe suspensions in coroutine-hostile-raii check (PR #72954)

2023-11-22 Thread Björn Svensson via cfe-commits



@@ -52,27 +52,42 @@ AST_MATCHER_P(Stmt, forEachPrevStmt, 
ast_matchers::internal::Matcher,
   }
   return IsHostile;
 }
+
+// Matches the expression awaited by the `co_await`.
+AST_MATCHER_P(CoawaitExpr, awaiatable, ast_matchers::internal::Matcher,

bjosv wrote:

```suggestion
AST_MATCHER_P(CoawaitExpr, awaitable, ast_matchers::internal::Matcher,
```

https://github.com/llvm/llvm-project/pull/72954

[llvm] [lld] [flang] [clang-tools-extra] [clang] [mlir] [clangtidy] Allow safe suspensions in coroutine-hostile-raii check (PR #72954)

2023-11-22 Thread Björn Svensson via cfe-commits


https://github.com/bjosv commented:

Nice. Added a naming comment/nit.

https://github.com/llvm/llvm-project/pull/72954

[mlir] [llvm] [flang] [lld] [clang-tools-extra] [clang] [clangtidy] Allow safe suspensions in coroutine-hostile-raii check (PR #72954)

2023-11-22 Thread Björn Svensson via cfe-commits


https://github.com/bjosv edited https://github.com/llvm/llvm-project/pull/72954

[clang] [clang] Fix sorting module headers (PR #73146)

2023-11-22 Thread David Blaikie via cfe-commits

dwblaikie wrote:

> This commit also fixes commit 
> https://github.com/llvm/llvm-project/commit/d3676d4b666ead794fc58bbc7e07aa406dcf487a
>  that caused all headers to have NameAsWritten set to a 0-length string 
> without adapting compareModuleHeaders() to the new field.

Sorry, I don't quite understand this, but I guess the second field 
(PathRelativeToRootModuleDirectory) comparison was to address this bug? It'd 
probably be good to fix that separately, so it can be discussed in more detail, 
etc.

https://github.com/llvm/llvm-project/pull/73146

[clang] [clang-format] Option to ignore macro definitions (PR #70338)

2023-11-22 Thread Owen Pan via cfe-commits


https://github.com/owenca edited https://github.com/llvm/llvm-project/pull/70338

[clang] [clang-format] Option to ignore macro definitions (PR #70338)

2023-11-22 Thread Owen Pan via cfe-commits



@@ -24153,6 +24153,123 @@ TEST_F(FormatTest, WhitespaceSensitiveMacros) {
   verifyNoChange("FOO(String-ized&Messy+But: :Still=Intentional);", Style);
 }
 
+TEST_F(FormatTest, IgnorePPDefinitions) {
+  FormatStyle Style = getLLVMStyle();
+  Style.IgnorePPDefinitions = true;
+
+  verifyNoChange("#define  A", Style);
+  verifyNoChange("#define A   b", Style);
+  verifyNoChange("#define A  (  args   )", Style);
+  verifyNoChange("#define A  (  args   )  =  func  (  args  )", Style);
+
+  verifyNoChange("#define A x:", Style);
+  verifyNoChange("#define A a. b", Style);
+
+  // Surrounded with formatted code.
+  verifyFormat("int a;\n"
+   "#define  A  a\n"
+   "int a;",
+   "int  a ;\n"
+   "#define  A  a\n"
+   "int  a ;",
+   Style);
+
+  // Columns are not broken when a limit is set.
+  Style.ColumnLimit = 10;
+  verifyNoChange("#define A a a a a", Style);
+
+  Style.ColumnLimit = 15;
+  verifyNoChange("#define A //a very long comment", Style);
+  // in the following examples, since second line will not be formtted, it 
won't
+  // take into considertaion the alignment from the first line. The third line
+  // will follow the second line's alignment.
+  verifyFormat("int aa; // a\n"
+   "#define A // a\n"
+   "int a;// a",
+   "int aa; // a\n"
+   "#define A // a\n"
+   "int a; // a",
+   Style);
+
+  Style.ColumnLimit = 0;
+
+  // Multiline definition.
+  verifyNoChange("#define A \\\n"
+ "Line one with spaces  .  \\\n"
+ " Line two.",
+ Style);
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+  Style.AlignEscapedNewlines = FormatStyle::ENAS_Left;
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+  Style.AlignEscapedNewlines = FormatStyle::ENAS_Right;
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+
+  // Adjust indendations but don't change the definition.
+  Style.IndentPPDirectives = FormatStyle::PPDIS_None;
+  verifyNoChange("#if A\n"
+ "#define A  a\n"
+ "#endif",
+ Style);
+  verifyNoChange("#define UNITY 1\n"
+ "#if A\n"
+ "#  define   A  a\\\n"
+ "  a  a\n"
+ "#endif",
+ Style);
+  verifyFormat("#if A\n"
+   "#define A  a\n"
+   "#endif",
+   "#if A\n"
+   "  #define A  a\n"
+   "#endif",
+   Style);
+  Style.IndentPPDirectives = FormatStyle::PPDIS_AfterHash;
+  verifyFormat("#if A\n"
+   "#  define A  a\n"
+   "#endif",
+   "#if A\n"
+   "  #  define A  a\n"
+   "#endif",
+   Style);
+  Style.IndentPPDirectives = FormatStyle::PPDIS_BeforeHash;
+  verifyNoChange("#if A\n"
+ "  #define A  a\n"
+ "#endif",
+ Style);
+
+  Style.IndentPPDirectives = FormatStyle::PPDIS_None;
+  // IgnorePPDefinitions should not affect other PP directives
+  verifyFormat("#if !defined(A)\n"
+   "# define  A  a\n"

owenca wrote:

Ditto.

https://github.com/llvm/llvm-project/pull/70338

[clang] [clang-format] Option to ignore macro definitions (PR #70338)

2023-11-22 Thread Owen Pan via cfe-commits



@@ -24153,6 +24153,123 @@ TEST_F(FormatTest, WhitespaceSensitiveMacros) {
   verifyNoChange("FOO(String-ized&Messy+But: :Still=Intentional);", Style);
 }
 
+TEST_F(FormatTest, IgnorePPDefinitions) {
+  FormatStyle Style = getLLVMStyle();
+  Style.IgnorePPDefinitions = true;
+
+  verifyNoChange("#define  A", Style);
+  verifyNoChange("#define A   b", Style);
+  verifyNoChange("#define A  (  args   )", Style);
+  verifyNoChange("#define A  (  args   )  =  func  (  args  )", Style);
+
+  verifyNoChange("#define A x:", Style);
+  verifyNoChange("#define A a. b", Style);
+
+  // Surrounded with formatted code.
+  verifyFormat("int a;\n"
+   "#define  A  a\n"
+   "int a;",
+   "int  a ;\n"
+   "#define  A  a\n"
+   "int  a ;",
+   Style);
+
+  // Columns are not broken when a limit is set.
+  Style.ColumnLimit = 10;
+  verifyNoChange("#define A a a a a", Style);
+
+  Style.ColumnLimit = 15;
+  verifyNoChange("#define A //a very long comment", Style);
+  // in the following examples, since second line will not be formtted, it 
won't
+  // take into considertaion the alignment from the first line. The third line
+  // will follow the second line's alignment.
+  verifyFormat("int aa; // a\n"
+   "#define A // a\n"
+   "int a;// a",
+   "int aa; // a\n"
+   "#define A // a\n"
+   "int a; // a",
+   Style);
+
+  Style.ColumnLimit = 0;
+
+  // Multiline definition.
+  verifyNoChange("#define A \\\n"
+ "Line one with spaces  .  \\\n"
+ " Line two.",
+ Style);
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+  Style.AlignEscapedNewlines = FormatStyle::ENAS_Left;
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+  Style.AlignEscapedNewlines = FormatStyle::ENAS_Right;
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+
+  // Adjust indendations but don't change the definition.
+  Style.IndentPPDirectives = FormatStyle::PPDIS_None;
+  verifyNoChange("#if A\n"
+ "#define A  a\n"
+ "#endif",
+ Style);
+  verifyNoChange("#define UNITY 1\n"
+ "#if A\n"
+ "#  define   A  a\\\n"
+ "  a  a\n"
+ "#endif",
+ Style);
+  verifyFormat("#if A\n"
+   "#define A  a\n"
+   "#endif",
+   "#if A\n"
+   "  #define A  a\n"
+   "#endif",
+   Style);
+  Style.IndentPPDirectives = FormatStyle::PPDIS_AfterHash;
+  verifyFormat("#if A\n"
+   "#  define A  a\n"
+   "#endif",
+   "#if A\n"
+   "  #  define A  a\n"
+   "#endif",
+   Style);
+  Style.IndentPPDirectives = FormatStyle::PPDIS_BeforeHash;
+  verifyNoChange("#if A\n"
+ "  #define A  a\n"
+ "#endif",
+ Style);
+
+  Style.IndentPPDirectives = FormatStyle::PPDIS_None;
+  // IgnorePPDefinitions should not affect other PP directives
+  verifyFormat("#if !defined(A)\n"
+   "# define  A  a\n"
+   "#endif",
+   "#if ! defined ( A )\n"
+   "  # define  A  a\n"
+   "#endif",
+   Style);
+
+  // With comments.
+  verifyNoChange("/* */  # define A  a  //  a  a", Style);
+  verifyFormat("int a; // a\n"
+   "#define A  // a\n"
+   "int aaa;   // a",

owenca wrote:

This seems inconsistent with lines 24186-24188 above.

https://github.com/llvm/llvm-project/pull/70338

[clang] [clang-format] Option to ignore macro definitions (PR #70338)

2023-11-22 Thread Owen Pan via cfe-commits



@@ -24153,6 +24153,123 @@ TEST_F(FormatTest, WhitespaceSensitiveMacros) {
   verifyNoChange("FOO(String-ized&Messy+But: :Still=Intentional);", Style);
 }
 
+TEST_F(FormatTest, IgnorePPDefinitions) {
+  FormatStyle Style = getLLVMStyle();
+  Style.IgnorePPDefinitions = true;
+
+  verifyNoChange("#define  A", Style);
+  verifyNoChange("#define A   b", Style);
+  verifyNoChange("#define A  (  args   )", Style);
+  verifyNoChange("#define A  (  args   )  =  func  (  args  )", Style);
+
+  verifyNoChange("#define A x:", Style);
+  verifyNoChange("#define A a. b", Style);
+
+  // Surrounded with formatted code.
+  verifyFormat("int a;\n"
+   "#define  A  a\n"
+   "int a;",
+   "int  a ;\n"
+   "#define  A  a\n"
+   "int  a ;",
+   Style);
+
+  // Columns are not broken when a limit is set.
+  Style.ColumnLimit = 10;
+  verifyNoChange("#define A a a a a", Style);
+
+  Style.ColumnLimit = 15;
+  verifyNoChange("#define A //a very long comment", Style);
+  // in the following examples, since second line will not be formtted, it 
won't
+  // take into considertaion the alignment from the first line. The third line
+  // will follow the second line's alignment.
+  verifyFormat("int aa; // a\n"
+   "#define A // a\n"
+   "int a;// a",
+   "int aa; // a\n"
+   "#define A // a\n"
+   "int a; // a",
+   Style);
+
+  Style.ColumnLimit = 0;
+
+  // Multiline definition.
+  verifyNoChange("#define A \\\n"
+ "Line one with spaces  .  \\\n"
+ " Line two.",
+ Style);
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+  Style.AlignEscapedNewlines = FormatStyle::ENAS_Left;
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+  Style.AlignEscapedNewlines = FormatStyle::ENAS_Right;
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+
+  // Adjust indendations but don't change the definition.
+  Style.IndentPPDirectives = FormatStyle::PPDIS_None;
+  verifyNoChange("#if A\n"
+ "#define A  a\n"
+ "#endif",
+ Style);
+  verifyNoChange("#define UNITY 1\n"
+ "#if A\n"
+ "#  define   A  a\\\n"
+ "  a  a\n"
+ "#endif",
+ Style);
+  verifyFormat("#if A\n"
+   "#define A  a\n"
+   "#endif",
+   "#if A\n"
+   "  #define A  a\n"
+   "#endif",
+   Style);
+  Style.IndentPPDirectives = FormatStyle::PPDIS_AfterHash;
+  verifyFormat("#if A\n"
+   "#  define A  a\n"

owenca wrote:

Ditto.

https://github.com/llvm/llvm-project/pull/70338

[clang] [clang-format] Option to ignore macro definitions (PR #70338)

2023-11-22 Thread Owen Pan via cfe-commits



@@ -24153,6 +24153,123 @@ TEST_F(FormatTest, WhitespaceSensitiveMacros) {
   verifyNoChange("FOO(String-ized&Messy+But: :Still=Intentional);", Style);
 }
 
+TEST_F(FormatTest, IgnorePPDefinitions) {
+  FormatStyle Style = getLLVMStyle();
+  Style.IgnorePPDefinitions = true;
+
+  verifyNoChange("#define  A", Style);
+  verifyNoChange("#define A   b", Style);
+  verifyNoChange("#define A  (  args   )", Style);
+  verifyNoChange("#define A  (  args   )  =  func  (  args  )", Style);
+
+  verifyNoChange("#define A x:", Style);
+  verifyNoChange("#define A a. b", Style);
+
+  // Surrounded with formatted code.
+  verifyFormat("int a;\n"
+   "#define  A  a\n"
+   "int a;",
+   "int  a ;\n"
+   "#define  A  a\n"
+   "int  a ;",
+   Style);
+
+  // Columns are not broken when a limit is set.
+  Style.ColumnLimit = 10;
+  verifyNoChange("#define A a a a a", Style);
+
+  Style.ColumnLimit = 15;
+  verifyNoChange("#define A //a very long comment", Style);
+  // in the following examples, since second line will not be formtted, it 
won't
+  // take into considertaion the alignment from the first line. The third line
+  // will follow the second line's alignment.
+  verifyFormat("int aa; // a\n"
+   "#define A // a\n"
+   "int a;// a",
+   "int aa; // a\n"
+   "#define A // a\n"
+   "int a; // a",
+   Style);
+
+  Style.ColumnLimit = 0;
+
+  // Multiline definition.
+  verifyNoChange("#define A \\\n"
+ "Line one with spaces  .  \\\n"
+ " Line two.",
+ Style);
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+  Style.AlignEscapedNewlines = FormatStyle::ENAS_Left;
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+  Style.AlignEscapedNewlines = FormatStyle::ENAS_Right;
+  verifyNoChange("#define A \\\n"
+ "a a \\\n"
+ "a\\\n"
+ "a",
+ Style);
+
+  // Adjust indendations but don't change the definition.
+  Style.IndentPPDirectives = FormatStyle::PPDIS_None;
+  verifyNoChange("#if A\n"
+ "#define A  a\n"
+ "#endif",
+ Style);
+  verifyNoChange("#define UNITY 1\n"
+ "#if A\n"
+ "#  define   A  a\\\n"
+ "  a  a\n"
+ "#endif",
+ Style);
+  verifyFormat("#if A\n"
+   "#define A  a\n"

owenca wrote:

The indentation should _not_ change.

https://github.com/llvm/llvm-project/pull/70338

[clang] [clang-format] Option to ignore macro definitions (PR #70338)

2023-11-22 Thread Owen Pan via cfe-commits


https://github.com/owenca requested changes to this pull request.


https://github.com/llvm/llvm-project/pull/70338

[llvm] [lld] [mlir] [clang] [AMDGPU] Change default AMDHSA Code Object version to 5 (PR #73000)

2023-11-22 Thread David Blaikie via cfe-commits


dwblaikie wrote:

(part of the point is so that patches can be reverted as needed without having 
to revert/reapply huge patches when they aren't actually dependent)

https://github.com/llvm/llvm-project/pull/73000

[llvm] [lld] [mlir] [clang] [AMDGPU] Change default AMDHSA Code Object version to 5 (PR #73000)

2023-11-22 Thread David Blaikie via cfe-commits


dwblaikie wrote:

Generally when we split things up, they're separate reviews and separate 
commits. Fixups are for necessary changes that need to be applied atomically 
(fixes to the base patch in a pull request, etc).


https://github.com/llvm/llvm-project/pull/73000

[clang] [mlir] [llvm] [lld] [AMDGPU] Change default AMDHSA Code Object version to 5 (PR #73000)

2023-11-22 Thread David Blaikie via cfe-commits


https://github.com/dwblaikie closed 
https://github.com/llvm/llvm-project/pull/73000
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [mlir] [llvm] [lld] [AMDGPU] Change default AMDHSA Code Object version to 5 (PR #73000)

2023-11-22 Thread David Blaikie via cfe-commits



@@ -2402,7 +2402,7 @@ void tools::checkAMDGPUCodeObjectVersion(const Driver &D,
 
 unsigned tools::getAMDGPUCodeObjectVersion(const Driver &D,
const llvm::opt::ArgList &Args) {
-  unsigned CodeObjVer = 4; // default
+  unsigned CodeObjVer = 5; // default

dwblaikie wrote:

Perhaps at least this could be avoided? The flag configuration already says 
what the default value is. Could that be relied on without needing to specify 
a value here too?

https://github.com/llvm/llvm-project/pull/73000

[clang] [mlir] [llvm] [lld] [AMDGPU] Change default AMDHSA Code Object version to 5 (PR #73000)

2023-11-22 Thread David Blaikie via cfe-commits



@@ -4708,12 +4708,12 @@ defm amdgpu_ieee : BoolOption<"m", "amdgpu-ieee",
   NegFlag>, Group;
 
 def mcode_object_version_EQ : Joined<["-"], "mcode-object-version=">, 
Group,
-  HelpText<"Specify code object ABI version. Defaults to 4. (AMDGPU only)">,
+  HelpText<"Specify code object ABI version. Defaults to 5. (AMDGPU only)">,

dwblaikie wrote:

It would be nice if we could avoid having this written in 3 places. Are there 
other techniques used for other flags with default values?
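
One way to read the suggestion, as a plain C++ sketch rather than a change to 
the TableGen option definition: keep the default in a single constant and 
derive both the help text and the fallback value from it. All names below 
(`DefaultCodeObjectVersion`, `codeObjectVersionHelp`, `getCodeObjectVersion`) 
are hypothetical and are not the driver's actual API.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical single definition of the default; not taken from the patch.
constexpr unsigned DefaultCodeObjectVersion = 5;

// Help text built from the constant instead of a hard-coded digit.
std::string codeObjectVersionHelp() {
  return "Specify code object ABI version. Defaults to " +
         std::to_string(DefaultCodeObjectVersion) + ". (AMDGPU only)";
}

// Stand-in for the driver query: an empty optional models "flag not given".
unsigned getCodeObjectVersion(std::optional<unsigned> Requested) {
  return Requested.value_or(DefaultCodeObjectVersion);
}

// Small self-check of the fallback and the generated help string.
void checkCodeObjectDefault() {
  assert(getCodeObjectVersion(std::nullopt) == 5);
  assert(getCodeObjectVersion(4u) == 4);
  assert(codeObjectVersionHelp() ==
         "Specify code object ABI version. Defaults to 5. (AMDGPU only)");
}
```

With this shape, bumping the default means editing one line; whether the 
option definition itself can consume such a constant is left open here.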

https://github.com/llvm/llvm-project/pull/73000

[clang] [clang] Ensure minimal alignment of global vars of incomplete type. (PR #72886)

2023-11-22 Thread Eli Friedman via cfe-commits



@@ -0,0 +1,32 @@
+// RUN: %clang --target=s390x-linux -S -emit-llvm -o - %s | FileCheck %s

efriedma-quic wrote:

Sorry, I missed this when reviewing.

We generally only use Driver tests when we're trying to test some aspect of the 
driver.  For arm-alignment.c, it isn't really trying to test the code 
generation itself; it's trying to test the effect of alignment-related flags.

https://github.com/llvm/llvm-project/pull/72886

[clang] 8840eb3 - [flang][OpenMP] Add semantic check for declare target (#72770)

2023-11-22 Thread via cfe-commits


Author: Shraiysh
Date: 2023-11-22T16:13:14-06:00
New Revision: 8840eb3fb535bc44704bb61515ca90dceaae35a7

URL: 
https://github.com/llvm/llvm-project/commit/8840eb3fb535bc44704bb61515ca90dceaae35a7
DIFF: 
https://github.com/llvm/llvm-project/commit/8840eb3fb535bc44704bb61515ca90dceaae35a7.diff

LOG: [flang][OpenMP] Add semantic check for declare target (#72770)

Added: 

flang/test/Lower/OpenMP/FIR/declare-target-implicit-func-and-subr-cap-enter.f90
flang/test/Lower/OpenMP/declare-target-implicit-func-and-subr-cap-enter.f90

Modified: 
clang/lib/Parse/ParseOpenMP.cpp
clang/test/OpenMP/target_enter_data_nowait_messages.cpp
flang/lib/Lower/OpenMP.cpp
flang/lib/Parser/openmp-parsers.cpp
flang/lib/Semantics/check-omp-structure.cpp
flang/lib/Semantics/check-omp-structure.h
flang/lib/Semantics/resolve-directives.cpp
flang/test/Lower/OpenMP/FIR/declare-target-data.f90
flang/test/Lower/OpenMP/FIR/declare-target-func-and-subr.f90
flang/test/Lower/OpenMP/declare-target-data.f90
flang/test/Lower/OpenMP/declare-target-func-and-subr.f90
flang/test/Lower/OpenMP/declare-target-implicit-tarop-cap.f90
flang/test/Lower/OpenMP/function-filtering-2.f90
flang/test/Lower/OpenMP/function-filtering.f90
flang/test/Parser/OpenMP/declare_target-device_type.f90
flang/test/Semantics/OpenMP/declarative-directive.f90
flang/test/Semantics/OpenMP/declare-target01.f90
flang/test/Semantics/OpenMP/declare-target02.f90
flang/test/Semantics/OpenMP/declare-target06.f90
flang/test/Semantics/OpenMP/requires04.f90
flang/test/Semantics/OpenMP/requires05.f90
llvm/include/llvm/Frontend/OpenMP/OMP.td

Removed: 




diff  --git a/clang/lib/Parse/ParseOpenMP.cpp b/clang/lib/Parse/ParseOpenMP.cpp
index 9b47ec4fbbc51f1..9112c3707579083 100644
--- a/clang/lib/Parse/ParseOpenMP.cpp
+++ b/clang/lib/Parse/ParseOpenMP.cpp
@@ -3369,6 +3369,7 @@ OMPClause *Parser::ParseOpenMPClause(OpenMPDirectiveKind 
DKind,
   case OMPC_exclusive:
   case OMPC_affinity:
   case OMPC_doacross:
+  case OMPC_enter:
 if (getLangOpts().OpenMP >= 52 && DKind == OMPD_ordered &&
 CKind == OMPC_depend)
   Diag(Tok, diag::warn_omp_depend_in_ordered_deprecated);

diff  --git a/clang/test/OpenMP/target_enter_data_nowait_messages.cpp 
b/clang/test/OpenMP/target_enter_data_nowait_messages.cpp
index 3f5dde00b814ca4..ba5eaf1d2214a07 100644
--- a/clang/test/OpenMP/target_enter_data_nowait_messages.cpp
+++ b/clang/test/OpenMP/target_enter_data_nowait_messages.cpp
@@ -6,14 +6,26 @@ int main(int argc, char **argv) {
   int i;
 
   #pragma omp nowait target enter data map(to: i) // expected-error {{expected 
an OpenMP directive}}
-  #pragma omp target nowait enter data map(to: i) // expected-warning {{extra 
tokens at the end of '#pragma omp target' are ignored}}
+  {}
+  #pragma omp target nowait foo data map(to: i) // expected-warning {{extra 
tokens at the end of '#pragma omp target' are ignored}}
+  {}
+  #pragma omp target nowait enter data map(to: i) // expected-warning {{extra 
tokens at the end of '#pragma omp target' are ignored}} expected-error 
{{unexpected OpenMP clause 'enter' in directive '#pragma omp target'}} 
expected-error {{expected '(' after 'enter'}}
+  {}
   #pragma omp target enter nowait data map(to: i) // expected-error {{expected 
an OpenMP directive}}
+  {}
   #pragma omp target enter data nowait() map(to: i) // expected-warning 
{{extra tokens at the end of '#pragma omp target enter data' are ignored}} 
expected-error {{expected at least one 'map' clause for '#pragma omp target 
enter data'}}
+  {}
   #pragma omp target enter data map(to: i) nowait( // expected-warning {{extra 
tokens at the end of '#pragma omp target enter data' are ignored}}
+  {}
   #pragma omp target enter data map(to: i) nowait (argc)) // expected-warning 
{{extra tokens at the end of '#pragma omp target enter data' are ignored}}
+  {}
   #pragma omp target enter data map(to: i) nowait device (-10u)
+  {}
   #pragma omp target enter data map(to: i) nowait (3.14) device (-10u) // 
expected-warning {{extra tokens at the end of '#pragma omp target enter data' 
are ignored}}
+  {}
   #pragma omp target enter data map(to: i) nowait nowait // expected-error 
{{directive '#pragma omp target enter data' cannot contain more than one 
'nowait' clause}}
+  {}
   #pragma omp target enter data nowait map(to: i) nowait // expected-error 
{{directive '#pragma omp target enter data' cannot contain more than one 
'nowait' clause}}
+  {}
   return 0;
 }

diff  --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp
index f6a61ba3a528e32..cc799fdc27be063 100644
--- a/flang/lib/Lower/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP.cpp
@@ -564,6 +564,8 @@ class ClauseProcessor {
   bool processDepend(llvm::SmallVectorImpl 
&dependTypeOperands,
  llvm::SmallVectorImpl &dependOperands) const;
   bool
+  proces

[clang] [llvm] [flang] [flang][OpenMP] Add semantic check for declare target (PR #72770)

2023-11-22 Thread via cfe-commits


https://github.com/shraiysh closed 
https://github.com/llvm/llvm-project/pull/72770

[PATCH] D105909: [clang][CallGraphSection] Add type id metadata to indirect call and targets

2023-11-22 Thread Prabhu Karthikeyan Rajasekaran via Phabricator via cfe-commits

Prabhuk updated this revision to Diff 558152.
Prabhuk added a comment.

Rebased the patchset and addressed the compilation and test failures


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105909/new/

https://reviews.llvm.org/D105909

Files:
  llvm/include/llvm/CodeGen/AsmPrinter.h
  llvm/include/llvm/MC/MCObjectFileInfo.h
  llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
  llvm/lib/MC/MCObjectFileInfo.cpp
  llvm/test/CodeGen/call-graph-section.ll

Index: llvm/test/CodeGen/call-graph-section.ll
===
--- /dev/null
+++ llvm/test/CodeGen/call-graph-section.ll
@@ -0,0 +1,73 @@
+; Tests that we store the type identifiers in .callgraph section of the binary.
+
+; RUN: llc --call-graph-section -filetype=obj -o - < %s | \
+; RUN: llvm-readelf -x .callgraph - | FileCheck %s
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local void @foo() #0 !type !4 {
+entry:
+  ret void
+}
+
+define dso_local i32 @bar(i8 signext %a) #0 !type !5 {
+entry:
+  %a.addr = alloca i8, align 1
+  store i8 %a, i8* %a.addr, align 1
+  ret i32 0
+}
+
+define dso_local i32* @baz(i8* %a) #0 !type !6 {
+entry:
+  %a.addr = alloca i8*, align 8
+  store i8* %a, i8** %a.addr, align 8
+  ret i32* null
+}
+
+define dso_local i32 @main() #0 !type !7 {
+entry:
+  %retval = alloca i32, align 4
+  %fp_foo = alloca void (...)*, align 8
+  %a = alloca i8, align 1
+  %fp_bar = alloca i32 (i8)*, align 8
+  %fp_baz = alloca i32* (i8*)*, align 8
+  store i32 0, i32* %retval, align 4
+  store void (...)* bitcast (void ()* @foo to void (...)*), void (...)** %fp_foo, align 8
+  %0 = load void (...)*, void (...)** %fp_foo, align 8
+  call void (...) %0() [ "type"(metadata !"_ZTSFvE.generalized") ]
+  store i32 (i8)* @bar, i32 (i8)** %fp_bar, align 8
+  %1 = load i32 (i8)*, i32 (i8)** %fp_bar, align 8
+  %2 = load i8, i8* %a, align 1
+  %call = call i32 %1(i8 signext %2) [ "type"(metadata !"_ZTSFicE.generalized") ]
+  store i32* (i8*)* @baz, i32* (i8*)** %fp_baz, align 8
+  %3 = load i32* (i8*)*, i32* (i8*)** %fp_baz, align 8
+  %call1 = call i32* %3(i8* %a) [ "type"(metadata !"_ZTSFPvS_E.generalized") ]
+  call void @foo() [ "type"(metadata !"_ZTSFvE.generalized") ]
+  %4 = load i8, i8* %a, align 1
+  %call2 = call i32 @bar(i8 signext %4) [ "type"(metadata !"_ZTSFicE.generalized") ]
+  %call3 = call i32* @baz(i8* %a) [ "type"(metadata !"_ZTSFPvS_E.generalized") ]
+  ret i32 0
+}
+
+attributes #0 = { noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+
+!llvm.module.flags = !{!0, !1, !2}
+!llvm.ident = !{!3}
+
+; Check that the numeric type id (md5 hash) for the below type ids are emitted
+; to the callgraph section.
+
+; CHECK: Hex dump of section '.callgraph':
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"uwtable", i32 1}
+!2 = !{i32 7, !"frame-pointer", i32 2}
+!3 = !{!"clang version 13.0.0 (g...@github.com:llvm/llvm-project.git 6d35f403b91c2f2c604e23763f699d580370ca96)"}
+; CHECK-DAG: 2444f731 f5eecb3e
+!4 = !{i64 0, !"_ZTSFvE.generalized"}
+; CHECK-DAG: 5486bc59 814b8e30
+!5 = !{i64 0, !"_ZTSFicE.generalized"}
+; CHECK-DAG: 7ade6814 f897fd77
+!6 = !{i64 0, !"_ZTSFPvS_E.generalized"}
+; CHECK-DAG: caaf769a 600968fa
+!7 = !{i64 0, !"_ZTSFiE.generalized"}
Index: llvm/lib/MC/MCObjectFileInfo.cpp
===
--- llvm/lib/MC/MCObjectFileInfo.cpp
+++ llvm/lib/MC/MCObjectFileInfo.cpp
@@ -530,6 +530,8 @@
   EHFrameSection =
   Ctx->getELFSection(".eh_frame", EHSectionType, EHSectionFlags);
 
+  CallGraphSection = Ctx->getELFSection(".callgraph", ELF::SHT_PROGBITS, 0);
+
   StackSizesSection = Ctx->getELFSection(".stack_sizes", ELF::SHT_PROGBITS, 0);
 
   PseudoProbeSection = Ctx->getELFSection(".pseudo_probe", DebugSecType, 0);
@@ -1112,6 +1114,24 @@
   llvm_unreachable("Unknown ObjectFormatType");
 }
 
+MCSection *
+MCObjectFileInfo::getCallGraphSection(const MCSection &TextSec) const {
+  if (Ctx->getObjectFileType() != MCContext::IsELF)
+return CallGraphSection;
+
+  const MCSectionELF &ElfSec = static_cast<const MCSectionELF &>(TextSec);
+  unsigned Flags = ELF::SHF_LINK_ORDER;
+  StringRef GroupName;
+  if (const MCSymbol *Group = ElfSec.getGroup()) {
+GroupName = Group->getName();
+Flags |= ELF::SHF_GROUP;
+  }
+
+  return Ctx->getELFSection(".callgraph", ELF::SHT_PROGBITS, Flags, 0,
+GroupName, true, ElfSec.getUniqueID(),
+cast<MCSymbolELF>(TextSec.getBeginSymbol()));
+}
+
 MCSection *
 MCObjectFileInfo::getStackSizesSection(const MCSection &TextSec) const {
   if ((Ctx->getObjectFileType() != MCContext::IsELF) ||
Index: llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
===
--- llvm/lib/CodeGen/AsmPr

[clang] [clang][CodeGen] Emit atomic IR instead of libcalls for misaligned po… (PR #73176)

2023-11-22 Thread via cfe-commits


https://github.com/Logikable updated 
https://github.com/llvm/llvm-project/pull/73176

>From dd6964d9108cf16fb062c1c809abbd75dae7ff65 Mon Sep 17 00:00:00 2001
From: Sean Luchen 
Date: Fri, 17 Nov 2023 17:29:52 +
Subject: [PATCH] [clang][CodeGen] Emit atomic IR instead of libcalls for
 misaligned pointers.

Calling __atomic_fetch_op_n is undefined for misaligned pointers.

Since the backend can handle atomic IR on misaligned pointers, emit that
instead. To keep things simple, we make this change for all misaligned
operations, not just the integral ones.

There is an additional consequence of this change. Previously, libcalls
were emitted for misaligned, misshapen (size != 2^n), and oversized
objects. Since optimized libcalls only operate on 2^n sized members,
removing the misaligned case means optimized libcalls will never be
emitted, and all relevant codepaths can be cleaned up.

A simple correctness test is to have one thread perform an arithmetic
operation on a misaligned integer, and another thread perform a
non-arithmetic operation (e.g. xchg) on the same value. Such a test
exhibits incorrect behaviour currently.
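
The decision described above can be sketched as a pair of predicates. This is only an illustration of the commit message's wording, not the code in CGAtomic.cpp: the struct, the helper names, and the `maxInlineBytes` parameter are all hypothetical, and sizes and alignments are assumed to be non-zero byte counts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of an atomic access (sizes in bytes, non-zero).
struct AtomicAccess {
  std::uint64_t sizeBytes;
  std::uint64_t alignBytes;
};

constexpr bool isPowerOfTwo(std::uint64_t n) {
  return n != 0 && (n & (n - 1)) == 0;
}

// "Misaligned": the known alignment is not a multiple of the object size.
constexpr bool isMisaligned(AtomicAccess a) {
  return a.alignBytes % a.sizeBytes != 0;
}

// "Misshapen": size != 2^n.
constexpr bool isMisshapen(AtomicAccess a) { return !isPowerOfTwo(a.sizeBytes); }

// "Oversized": larger than the widest inline atomic the target supports.
constexpr bool isOversized(AtomicAccess a, std::uint64_t maxInlineBytes) {
  return a.sizeBytes > maxInlineBytes;
}

// Before the patch, as described: any of the three conditions meant a libcall.
constexpr bool usesLibcallBefore(AtomicAccess a, std::uint64_t maxInlineBytes) {
  return isMisaligned(a) || isMisshapen(a) || isOversized(a, maxInlineBytes);
}

// After the patch, as described: misalignment alone no longer forces a libcall.
constexpr bool usesLibcallAfter(AtomicAccess a, std::uint64_t maxInlineBytes) {
  return isMisshapen(a) || isOversized(a, maxInlineBytes);
}

// Small self-check: a 4-byte integer with 1-byte alignment changes category.
inline void checkMisalignedInt() {
  assert(usesLibcallBefore(AtomicAccess{4, 1}, 8));
  assert(!usesLibcallAfter(AtomicAccess{4, 1}, 8));
}
```

Under these assumptions, only the purely misaligned case changes behaviour; misshapen and oversized accesses are classified the same way by both predicates.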
---
 clang/lib/CodeGen/CGAtomic.cpp| 336 ++
 clang/test/CodeGen/LoongArch/atomics.c|   6 +-
 clang/test/CodeGen/PowerPC/quadword-atomics.c |   2 +-
 clang/test/CodeGen/RISCV/riscv-atomics.c  |  42 +--
 clang/test/CodeGen/arm-atomics-m.c|   8 +-
 clang/test/CodeGen/arm-atomics-m0.c   |  16 +-
 clang/test/CodeGen/atomic-ops-libcall.c   | 119 ---
 clang/test/CodeGen/atomic-ops.c   |  25 +-
 clang/test/CodeGen/atomics-inlining.c |  28 +-
 clang/test/CodeGen/c11atomics.c   |  18 +-
 .../test/CodeGenOpenCL/atomic-ops-libcall.cl  |  54 +--
 11 files changed, 273 insertions(+), 381 deletions(-)

diff --git a/clang/lib/CodeGen/CGAtomic.cpp b/clang/lib/CodeGen/CGAtomic.cpp
index 6005d5c51c0e1ac..c082590e1084df2 100644
--- a/clang/lib/CodeGen/CGAtomic.cpp
+++ b/clang/lib/CodeGen/CGAtomic.cpp
@@ -785,27 +785,75 @@ static void EmitAtomicOp(CodeGenFunction &CGF, AtomicExpr *Expr, Address Dest,
   Builder.SetInsertPoint(ContBB);
 }
 
-static void
-AddDirectArgument(CodeGenFunction &CGF, CallArgList &Args,
-  bool UseOptimizedLibcall, llvm::Value *Val, QualType ValTy,
-  SourceLocation Loc, CharUnits SizeInChars) {
-  if (UseOptimizedLibcall) {
-// Load value and pass it to the function directly.
-CharUnits Align = CGF.getContext().getTypeAlignInChars(ValTy);
-int64_t SizeInBits = CGF.getContext().toBits(SizeInChars);
-ValTy =
-CGF.getContext().getIntTypeForBitwidth(SizeInBits, /*Signed=*/false);
-llvm::Type *ITy = llvm::IntegerType::get(CGF.getLLVMContext(), SizeInBits);
-Address Ptr = Address(Val, ITy, Align);
-Val = CGF.EmitLoadOfScalar(Ptr, false,
-   CGF.getContext().getPointerType(ValTy),
-   Loc);
-// Coerce the value into an appropriately sized integer type.
-Args.add(RValue::get(Val), ValTy);
-  } else {
-// Non-optimized functions always take a reference.
-Args.add(RValue::get(Val), CGF.getContext().VoidPtrTy);
+static bool hasUnoptimizedLibcall(AtomicExpr::AtomicOp op) {
+  switch (op) {
+  case AtomicExpr::AO__c11_atomic_init:
+  case AtomicExpr::AO__opencl_atomic_init:
+  case AtomicExpr::AO__atomic_compare_exchange:
+  case AtomicExpr::AO__atomic_compare_exchange_n:
+  case AtomicExpr::AO__c11_atomic_compare_exchange_weak:
+  case AtomicExpr::AO__c11_atomic_compare_exchange_strong:
+  case AtomicExpr::AO__hip_atomic_compare_exchange_weak:
+  case AtomicExpr::AO__hip_atomic_compare_exchange_strong:
+  case AtomicExpr::AO__opencl_atomic_compare_exchange_weak:
+  case AtomicExpr::AO__opencl_atomic_compare_exchange_strong:
+  case AtomicExpr::AO__atomic_exchange:
+  case AtomicExpr::AO__atomic_exchange_n:
+  case AtomicExpr::AO__c11_atomic_exchange:
+  case AtomicExpr::AO__hip_atomic_exchange:
+  case AtomicExpr::AO__opencl_atomic_exchange:
+  case AtomicExpr::AO__atomic_store:
+  case AtomicExpr::AO__atomic_store_n:
+  case AtomicExpr::AO__c11_atomic_store:
+  case AtomicExpr::AO__hip_atomic_store:
+  case AtomicExpr::AO__opencl_atomic_store:
+  case AtomicExpr::AO__atomic_load:
+  case AtomicExpr::AO__atomic_load_n:
+  case AtomicExpr::AO__c11_atomic_load:
+  case AtomicExpr::AO__hip_atomic_load:
+  case AtomicExpr::AO__opencl_atomic_load:
+return true;
+  case AtomicExpr::AO__atomic_add_fetch:
+  case AtomicExpr::AO__atomic_fetch_add:
+  case AtomicExpr::AO__c11_atomic_fetch_add:
+  case AtomicExpr::AO__hip_atomic_fetch_add:
+  case AtomicExpr::AO__opencl_atomic_fetch_add:
+  case AtomicExpr::AO__atomic_and_fetch:
+  case AtomicExpr::AO__atomic_fetch_and:
+  case AtomicExpr::AO__c11_atomic_fetch_and:
+  case AtomicExpr::AO__hip_atomic_fetch_and:
+  case AtomicExpr::AO__opencl_atomic_fetch_and:
+  case AtomicExpr::AO__atomic_or_fetch:
+  case Atomi

[PATCH] D105907: [CallGraphSection] Add call graph section options and documentation

2023-11-22 Thread Prabhu Karthikeyan Rajasekaran via Phabricator via cfe-commits

Prabhuk updated this revision to Diff 558151.
Prabhuk added a comment.

Rebased the patchset and addressed the compilation failures


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105907/new/

https://reviews.llvm.org/D105907

Files:
  llvm/include/llvm/CodeGen/AsmPrinter.h
  llvm/include/llvm/MC/MCObjectFileInfo.h
  llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
  llvm/lib/MC/MCObjectFileInfo.cpp
  llvm/test/CodeGen/call-graph-section.ll

Index: llvm/test/CodeGen/call-graph-section.ll
===
--- /dev/null
+++ llvm/test/CodeGen/call-graph-section.ll
@@ -0,0 +1,73 @@
+; Tests that we store the type identifiers in .callgraph section of the binary.
+
+; RUN: llc --call-graph-section -filetype=obj -o - < %s | \
+; RUN: llvm-readelf -x .callgraph - | FileCheck %s
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local void @foo() #0 !type !4 {
+entry:
+  ret void
+}
+
+define dso_local i32 @bar(i8 signext %a) #0 !type !5 {
+entry:
+  %a.addr = alloca i8, align 1
+  store i8 %a, i8* %a.addr, align 1
+  ret i32 0
+}
+
+define dso_local i32* @baz(i8* %a) #0 !type !6 {
+entry:
+  %a.addr = alloca i8*, align 8
+  store i8* %a, i8** %a.addr, align 8
+  ret i32* null
+}
+
+define dso_local i32 @main() #0 !type !7 {
+entry:
+  %retval = alloca i32, align 4
+  %fp_foo = alloca void (...)*, align 8
+  %a = alloca i8, align 1
+  %fp_bar = alloca i32 (i8)*, align 8
+  %fp_baz = alloca i32* (i8*)*, align 8
+  store i32 0, i32* %retval, align 4
+  store void (...)* bitcast (void ()* @foo to void (...)*), void (...)** %fp_foo, align 8
+  %0 = load void (...)*, void (...)** %fp_foo, align 8
+  call void (...) %0() [ "type"(metadata !"_ZTSFvE.generalized") ]
+  store i32 (i8)* @bar, i32 (i8)** %fp_bar, align 8
+  %1 = load i32 (i8)*, i32 (i8)** %fp_bar, align 8
+  %2 = load i8, i8* %a, align 1
+  %call = call i32 %1(i8 signext %2) [ "type"(metadata !"_ZTSFicE.generalized") ]
+  store i32* (i8*)* @baz, i32* (i8*)** %fp_baz, align 8
+  %3 = load i32* (i8*)*, i32* (i8*)** %fp_baz, align 8
+  %call1 = call i32* %3(i8* %a) [ "type"(metadata !"_ZTSFPvS_E.generalized") ]
+  call void @foo() [ "type"(metadata !"_ZTSFvE.generalized") ]
+  %4 = load i8, i8* %a, align 1
+  %call2 = call i32 @bar(i8 signext %4) [ "type"(metadata !"_ZTSFicE.generalized") ]
+  %call3 = call i32* @baz(i8* %a) [ "type"(metadata !"_ZTSFPvS_E.generalized") ]
+  ret i32 0
+}
+
+attributes #0 = { noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+
+!llvm.module.flags = !{!0, !1, !2}
+!llvm.ident = !{!3}
+
+; Check that the numeric type id (md5 hash) for the below type ids are emitted
+; to the callgraph section.
+
+; CHECK: Hex dump of section '.callgraph':
+
+!0 = !{i32 1, !"wchar_size", i32 4}
+!1 = !{i32 7, !"uwtable", i32 1}
+!2 = !{i32 7, !"frame-pointer", i32 2}
+!3 = !{!"clang version 13.0.0 (g...@github.com:llvm/llvm-project.git 6d35f403b91c2f2c604e23763f699d580370ca96)"}
+; CHECK-DAG: 2444f731 f5eecb3e
+!4 = !{i64 0, !"_ZTSFvE.generalized"}
+; CHECK-DAG: 5486bc59 814b8e30
+!5 = !{i64 0, !"_ZTSFicE.generalized"}
+; CHECK-DAG: 7ade6814 f897fd77
+!6 = !{i64 0, !"_ZTSFPvS_E.generalized"}
+; CHECK-DAG: caaf769a 600968fa
+!7 = !{i64 0, !"_ZTSFiE.generalized"}
Index: llvm/lib/MC/MCObjectFileInfo.cpp
===
--- llvm/lib/MC/MCObjectFileInfo.cpp
+++ llvm/lib/MC/MCObjectFileInfo.cpp
@@ -530,6 +530,8 @@
   EHFrameSection =
   Ctx->getELFSection(".eh_frame", EHSectionType, EHSectionFlags);
 
+  CallGraphSection = Ctx->getELFSection(".callgraph", ELF::SHT_PROGBITS, 0);
+
   StackSizesSection = Ctx->getELFSection(".stack_sizes", ELF::SHT_PROGBITS, 0);
 
   PseudoProbeSection = Ctx->getELFSection(".pseudo_probe", DebugSecType, 0);
@@ -1112,6 +1114,24 @@
   llvm_unreachable("Unknown ObjectFormatType");
 }
 
+MCSection *
+MCObjectFileInfo::getCallGraphSection(const MCSection &TextSec) const {
+  if (Ctx->getObjectFileType() != MCContext::IsELF)
+return CallGraphSection;
+
+  const MCSectionELF &ElfSec = static_cast<const MCSectionELF &>(TextSec);
+  unsigned Flags = ELF::SHF_LINK_ORDER;
+  StringRef GroupName;
+  if (const MCSymbol *Group = ElfSec.getGroup()) {
+GroupName = Group->getName();
+Flags |= ELF::SHF_GROUP;
+  }
+
+  return Ctx->getELFSection(".callgraph", ELF::SHT_PROGBITS, Flags, 0,
+GroupName, true, ElfSec.getUniqueID(),
+cast<MCSymbolELF>(TextSec.getBeginSymbol()));
+}
+
 MCSection *
 MCObjectFileInfo::getStackSizesSection(const MCSection &TextSec) const {
   if ((Ctx->getObjectFileType() != MCContext::IsELF) ||
Index: llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
===
--- llvm/lib/CodeGen/AsmPrinter/Asm

[clang] [flang] [flang] Enable alias tags pass by default (PR #73111)

2023-11-22 Thread Tom Eccles via cfe-commits


tblah wrote:

> Thank you for the changes, Tom!
> 
> I have one minor comment, but I would like to ask to merge this after US 
> holidays, if possible. Could you please postpone the merging until Monday GMT?

Sure. I'll wait until Monday.

https://github.com/llvm/llvm-project/pull/73111

[clang] [clang][CodeGen] Emit atomic IR instead of libcalls for misaligned po… (PR #73176)

2023-11-22 Thread via cfe-commits


https://github.com/Logikable updated 
https://github.com/llvm/llvm-project/pull/73176

>From d0f164da6b75d5698da45d863f7d955cecd94f20 Mon Sep 17 00:00:00 2001
From: Sean Luchen 
Date: Fri, 17 Nov 2023 17:29:52 +
Subject: [PATCH] [clang][CodeGen] Emit atomic IR instead of libcalls for
 misaligned pointers.

Calling __atomic_fetch_op_n is undefined for misaligned pointers.

Since the backend can handle atomic IR on misaligned pointers, emit that
instead. To keep things simple, we make this change for all misaligned
operations, not just the integral ones.

There is an additional consequence of this change. Previously, libcalls
were emitted for misaligned, misshapen (size != 2^n), and oversized
objects. Since optimized libcalls only operate on 2^n sized members,
removing the misaligned case means optimized libcalls will never be
emitted, and all relevant codepaths can be cleaned up.

A simple correctness test is to have one thread perform an arithmetic
operation on a misaligned integer, and another thread perform a
non-arithmetic operation (e.g. xchg) on the same value. Such a test
exhibits incorrect behaviour currently.
---
 clang/lib/CodeGen/CGAtomic.cpp| 335 ++
 clang/test/CodeGen/LoongArch/atomics.c|   6 +-
 clang/test/CodeGen/PowerPC/quadword-atomics.c |   2 +-
 clang/test/CodeGen/RISCV/riscv-atomics.c  |  42 +--
 clang/test/CodeGen/arm-atomics-m.c|   8 +-
 clang/test/CodeGen/arm-atomics-m0.c   |  16 +-
 clang/test/CodeGen/atomic-ops-libcall.c   | 119 ---
 clang/test/CodeGen/atomic-ops.c   |  25 +-
 clang/test/CodeGen/atomics-inlining.c |  28 +-
 clang/test/CodeGen/c11atomics.c   |  18 +-
 .../test/CodeGenOpenCL/atomic-ops-libcall.cl  |  54 +--
 11 files changed, 272 insertions(+), 381 deletions(-)

diff --git a/clang/lib/CodeGen/CGAtomic.cpp b/clang/lib/CodeGen/CGAtomic.cpp
index 6005d5c51c0e1ac..df378fcec712392 100644
--- a/clang/lib/CodeGen/CGAtomic.cpp
+++ b/clang/lib/CodeGen/CGAtomic.cpp
@@ -785,27 +785,75 @@ static void EmitAtomicOp(CodeGenFunction &CGF, AtomicExpr *Expr, Address Dest,
   Builder.SetInsertPoint(ContBB);
 }
 
-static void
-AddDirectArgument(CodeGenFunction &CGF, CallArgList &Args,
-  bool UseOptimizedLibcall, llvm::Value *Val, QualType ValTy,
-  SourceLocation Loc, CharUnits SizeInChars) {
-  if (UseOptimizedLibcall) {
-// Load value and pass it to the function directly.
-CharUnits Align = CGF.getContext().getTypeAlignInChars(ValTy);
-int64_t SizeInBits = CGF.getContext().toBits(SizeInChars);
-ValTy =
-CGF.getContext().getIntTypeForBitwidth(SizeInBits, /*Signed=*/false);
-llvm::Type *ITy = llvm::IntegerType::get(CGF.getLLVMContext(), SizeInBits);
-Address Ptr = Address(Val, ITy, Align);
-Val = CGF.EmitLoadOfScalar(Ptr, false,
-   CGF.getContext().getPointerType(ValTy),
-   Loc);
-// Coerce the value into an appropriately sized integer type.
-Args.add(RValue::get(Val), ValTy);
-  } else {
-// Non-optimized functions always take a reference.
-Args.add(RValue::get(Val), CGF.getContext().VoidPtrTy);
+static bool isArithmeticOp(AtomicExpr::AtomicOp op) {
+  switch (op) {
+  case AtomicExpr::AO__atomic_add_fetch:
+  case AtomicExpr::AO__atomic_fetch_add:
+  case AtomicExpr::AO__c11_atomic_fetch_add:
+  case AtomicExpr::AO__hip_atomic_fetch_add:
+  case AtomicExpr::AO__opencl_atomic_fetch_add:
+  case AtomicExpr::AO__atomic_and_fetch:
+  case AtomicExpr::AO__atomic_fetch_and:
+  case AtomicExpr::AO__c11_atomic_fetch_and:
+  case AtomicExpr::AO__hip_atomic_fetch_and:
+  case AtomicExpr::AO__opencl_atomic_fetch_and:
+  case AtomicExpr::AO__atomic_or_fetch:
+  case AtomicExpr::AO__atomic_fetch_or:
+  case AtomicExpr::AO__c11_atomic_fetch_or:
+  case AtomicExpr::AO__hip_atomic_fetch_or:
+  case AtomicExpr::AO__opencl_atomic_fetch_or:
+  case AtomicExpr::AO__atomic_sub_fetch:
+  case AtomicExpr::AO__atomic_fetch_sub:
+  case AtomicExpr::AO__c11_atomic_fetch_sub:
+  case AtomicExpr::AO__hip_atomic_fetch_sub:
+  case AtomicExpr::AO__opencl_atomic_fetch_sub:
+  case AtomicExpr::AO__atomic_xor_fetch:
+  case AtomicExpr::AO__atomic_fetch_xor:
+  case AtomicExpr::AO__c11_atomic_fetch_xor:
+  case AtomicExpr::AO__hip_atomic_fetch_xor:
+  case AtomicExpr::AO__opencl_atomic_fetch_xor:
+  case AtomicExpr::AO__atomic_nand_fetch:
+  case AtomicExpr::AO__atomic_fetch_nand:
+  case AtomicExpr::AO__c11_atomic_fetch_nand:
+  case AtomicExpr::AO__atomic_min_fetch:
+  case AtomicExpr::AO__atomic_fetch_min:
+  case AtomicExpr::AO__c11_atomic_fetch_min:
+  case AtomicExpr::AO__hip_atomic_fetch_min:
+  case AtomicExpr::AO__opencl_atomic_fetch_min:
+  case AtomicExpr::AO__atomic_max_fetch:
+  case AtomicExpr::AO__atomic_fetch_max:
+  case AtomicExpr::AO__c11_atomic_fetch_max:
+  case AtomicExpr::AO__hip_atomic_fetch_max:
+  case AtomicExpr::AO__opencl_atomic_fetch_max:
+

[clang] [flang] [flang] Enable alias tags pass by default (PR #73111)

2023-11-22 Thread Slava Zakharin via cfe-commits



@@ -142,6 +142,26 @@ void Flang::addCodegenOptions(const ArgList &Args,
   if (shouldLoopVersion(Args))
 CmdArgs.push_back("-fversion-loops-for-stride");
 
+  Arg *aliasAnalysis = Args.getLastArg(options::OPT_falias_analysis,
+   options::OPT_fno_alias_analysis);
+  Arg *optLevel =
+  Args.getLastArg(options::OPT_Ofast, options::OPT_O, options::OPT_O4);

vzakhari wrote:

It looks like `clang` generates `tbaa` metadata at all opt levels except 
`-O0`. I think this makes sense: the optimizations themselves need to decide 
how to use it, e.g. for improving performance, code size, etc.
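
A rough sketch of the policy suggested here, assuming the explicit `-falias-analysis` / `-fno-alias-analysis` flag wins when present and the default is "on unless `-O0`". The function name and the plain string-vector argument list are hypothetical; this is not the flang driver's actual logic, and treating "no `-O` flag" as `-O0` is also an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: decide whether alias tags should be emitted.
// The last occurrence of each kind of flag wins, as with getLastArg.
bool shouldEmitAliasTags(const std::vector<std::string> &args) {
  bool optimizing = false; // assumption: no -O flag behaves like -O0
  int explicitChoice = -1; // -1: not given, 0: disabled, 1: enabled

  for (const std::string &arg : args) {
    if (arg == "-falias-analysis") {
      explicitChoice = 1;
    } else if (arg == "-fno-alias-analysis") {
      explicitChoice = 0;
    } else if (arg.rfind("-O", 0) == 0) {
      // Covers -O, -O1..-O4, -Ofast; only -O0 counts as "not optimizing".
      optimizing = (arg != "-O0");
    }
  }

  if (explicitChoice != -1)
    return explicitChoice == 1;
  return optimizing;
}

// Small self-check of the default policy.
inline void checkDefaults() {
  assert(shouldEmitAliasTags(std::vector<std::string>{"-O1"}));
  assert(!shouldEmitAliasTags(std::vector<std::string>{"-O0"}));
}
```

With this shape, the metadata is produced at every optimizing level and it is left to the individual passes to decide how to use it.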

https://github.com/llvm/llvm-project/pull/73111

[clang] [flang] [flang] Enable alias tags pass by default (PR #73111)

2023-11-22 Thread Slava Zakharin via cfe-commits


https://github.com/vzakhari commented:

Thank you for the changes, Tom!

I have one minor comment, but I would like to ask to merge this after US 
holidays, if possible.  Could you please postpone the merging until Monday GMT?

https://github.com/llvm/llvm-project/pull/73111

[flang] [clang] [flang] Enable alias tags pass by default (PR #73111)

2023-11-22 Thread Slava Zakharin via cfe-commits


https://github.com/vzakhari edited 
https://github.com/llvm/llvm-project/pull/73111

[clang] [clang][CodeGen] Emit atomic IR instead of libcalls for misaligned po… (PR #73176)

2023-11-22 Thread via cfe-commits


https://github.com/Logikable updated 
https://github.com/llvm/llvm-project/pull/73176

>From cba04d497b09aa932b9f515b353d00794abac011 Mon Sep 17 00:00:00 2001
From: Sean Luchen 
Date: Fri, 17 Nov 2023 17:29:52 +
Subject: [PATCH] [clang][CodeGen] Emit atomic IR instead of libcalls for
 misaligned pointers.

Calling __atomic_fetch_op_n is undefined for misaligned pointers.

Since the backend can handle atomic IR on misaligned pointers, emit that
instead. To keep things simple, we make this change for all misaligned
operations, not just the integral ones.

There is an additional consequence of this change. Previously, libcalls
were emitted for misaligned, misshapen (size != 2^n), and oversized
objects. Since optimized libcalls only operate on 2^n sized members,
removing the misaligned case means optimized libcalls will never be
emitted, and all relevant codepaths can be cleaned up.

A simple correctness test is to have one thread perform an arithmetic
operation on a misaligned integer, and another thread perform a
non-arithmetic operation (e.g. xchg) on the same value. Such a test
exhibits incorrect behaviour currently.
---
 clang/lib/CodeGen/CGAtomic.cpp| 335 ++
 clang/test/CodeGen/LoongArch/atomics.c|   6 +-
 clang/test/CodeGen/PowerPC/quadword-atomics.c |   2 +-
 clang/test/CodeGen/RISCV/riscv-atomics.c  |  42 +--
 clang/test/CodeGen/arm-atomics-m.c|   8 +-
 clang/test/CodeGen/arm-atomics-m0.c   |  16 +-
 clang/test/CodeGen/atomic-ops-libcall.c   | 119 ---
 clang/test/CodeGen/atomic-ops.c   |  25 +-
 clang/test/CodeGen/atomics-inlining.c |  28 +-
 clang/test/CodeGen/c11atomics.c   |  18 +-
 .../test/CodeGenOpenCL/atomic-ops-libcall.cl  |  54 +--
 11 files changed, 272 insertions(+), 381 deletions(-)

diff --git a/clang/lib/CodeGen/CGAtomic.cpp b/clang/lib/CodeGen/CGAtomic.cpp
index 6005d5c51c0e1ac..df378fcec712392 100644
--- a/clang/lib/CodeGen/CGAtomic.cpp
+++ b/clang/lib/CodeGen/CGAtomic.cpp
@@ -785,27 +785,75 @@ static void EmitAtomicOp(CodeGenFunction &CGF, AtomicExpr *Expr, Address Dest,
   Builder.SetInsertPoint(ContBB);
 }
 
-static void
-AddDirectArgument(CodeGenFunction &CGF, CallArgList &Args,
-  bool UseOptimizedLibcall, llvm::Value *Val, QualType ValTy,
-  SourceLocation Loc, CharUnits SizeInChars) {
-  if (UseOptimizedLibcall) {
-// Load value and pass it to the function directly.
-CharUnits Align = CGF.getContext().getTypeAlignInChars(ValTy);
-int64_t SizeInBits = CGF.getContext().toBits(SizeInChars);
-ValTy =
-CGF.getContext().getIntTypeForBitwidth(SizeInBits, /*Signed=*/false);
-llvm::Type *ITy = llvm::IntegerType::get(CGF.getLLVMContext(), SizeInBits);
-Address Ptr = Address(Val, ITy, Align);
-Val = CGF.EmitLoadOfScalar(Ptr, false,
-   CGF.getContext().getPointerType(ValTy),
-   Loc);
-// Coerce the value into an appropriately sized integer type.
-Args.add(RValue::get(Val), ValTy);
-  } else {
-// Non-optimized functions always take a reference.
-Args.add(RValue::get(Val), CGF.getContext().VoidPtrTy);
+static bool isArithmeticOp(AtomicExpr::AtomicOp op) {
+  switch (op) {
+  case AtomicExpr::AO__atomic_add_fetch:
+  case AtomicExpr::AO__atomic_fetch_add:
+  case AtomicExpr::AO__c11_atomic_fetch_add:
+  case AtomicExpr::AO__hip_atomic_fetch_add:
+  case AtomicExpr::AO__opencl_atomic_fetch_add:
+  case AtomicExpr::AO__atomic_and_fetch:
+  case AtomicExpr::AO__atomic_fetch_and:
+  case AtomicExpr::AO__c11_atomic_fetch_and:
+  case AtomicExpr::AO__hip_atomic_fetch_and:
+  case AtomicExpr::AO__opencl_atomic_fetch_and:
+  case AtomicExpr::AO__atomic_or_fetch:
+  case AtomicExpr::AO__atomic_fetch_or:
+  case AtomicExpr::AO__c11_atomic_fetch_or:
+  case AtomicExpr::AO__hip_atomic_fetch_or:
+  case AtomicExpr::AO__opencl_atomic_fetch_or:
+  case AtomicExpr::AO__atomic_sub_fetch:
+  case AtomicExpr::AO__atomic_fetch_sub:
+  case AtomicExpr::AO__c11_atomic_fetch_sub:
+  case AtomicExpr::AO__hip_atomic_fetch_sub:
+  case AtomicExpr::AO__opencl_atomic_fetch_sub:
+  case AtomicExpr::AO__atomic_xor_fetch:
+  case AtomicExpr::AO__atomic_fetch_xor:
+  case AtomicExpr::AO__c11_atomic_fetch_xor:
+  case AtomicExpr::AO__hip_atomic_fetch_xor:
+  case AtomicExpr::AO__opencl_atomic_fetch_xor:
+  case AtomicExpr::AO__atomic_nand_fetch:
+  case AtomicExpr::AO__atomic_fetch_nand:
+  case AtomicExpr::AO__c11_atomic_fetch_nand:
+  case AtomicExpr::AO__atomic_min_fetch:
+  case AtomicExpr::AO__atomic_fetch_min:
+  case AtomicExpr::AO__c11_atomic_fetch_min:
+  case AtomicExpr::AO__hip_atomic_fetch_min:
+  case AtomicExpr::AO__opencl_atomic_fetch_min:
+  case AtomicExpr::AO__atomic_max_fetch:
+  case AtomicExpr::AO__atomic_fetch_max:
+  case AtomicExpr::AO__c11_atomic_fetch_max:
+  case AtomicExpr::AO__hip_atomic_fetch_max:
+  case AtomicExpr::AO__opencl_atomic_fetch_max:
+

[llvm] [clang] [CUDA][HIP] Improve variable registration with the new driver (PR #73177)

2023-11-22 Thread via cfe-commits


llvmbot wrote:



@llvm/pr-subscribers-flang-openmp
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Joseph Huber (jhuber6)


Changes

Summary:
This patch adds support for registering texture / surface variables from
CUDA / HIP. Additionally, we now properly track the `extern` and `const`
flags that are also used in these runtime functions.

This does not implement the `managed` variables yet as those seem to
require some extra handling I'm not familiar with. The issue is that the
current offload entry isn't large enough to carry size and alignment
information along with an extra global.
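
As a rough illustration of how the kind and the `extern` / `const` flags can share one word, here is a minimal sketch. The enumerator values for surface, texture, extern, constant and normalized are copied from the diff in this message; the values 0x0 and 0x1 for the first two kinds, the `KindMask` constant, and all helper functions are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

enum OffloadEntryKindFlag : std::uint32_t {
  OffloadGlobalEntry = 0x0,        // assumed value, not shown in the diff
  OffloadGlobalManagedEntry = 0x1, // assumed value, not shown in the diff
  OffloadGlobalSurfaceEntry = 0x2,
  OffloadGlobalTextureEntry = 0x3,
  OffloadGlobalExtern = 0x4,
  OffloadGlobalConstant = 0x8,
  OffloadGlobalNormalized = 0x16,  // as written in the diff
};

// Hypothetical: the low two bits carry the kind, higher bits are flags.
constexpr std::uint32_t KindMask = 0x3;

constexpr std::uint32_t makeFlags(OffloadEntryKindFlag kind, bool ext, bool cst) {
  return static_cast<std::uint32_t>(kind) |
         (ext ? static_cast<std::uint32_t>(OffloadGlobalExtern) : 0u) |
         (cst ? static_cast<std::uint32_t>(OffloadGlobalConstant) : 0u);
}

constexpr std::uint32_t entryKind(std::uint32_t flags) { return flags & KindMask; }
constexpr bool isExtern(std::uint32_t flags) {
  return (flags & OffloadGlobalExtern) != 0;
}
constexpr bool isConstant(std::uint32_t flags) {
  return (flags & OffloadGlobalConstant) != 0;
}

// Small self-check: an extern texture round-trips through the packed word.
inline void checkRoundTrip() {
  constexpr std::uint32_t f = makeFlags(OffloadGlobalTextureEntry, true, false);
  assert(entryKind(f) == OffloadGlobalTextureEntry);
  assert(isExtern(f) && !isConstant(f));
}
```

One thing this sketch makes visible: read literally, 0x16 is binary 10110, so it overlaps both the extern bit (0x4) and the kind bits (0x2). Whether a single-bit value such as 0x10 was intended is not stated in the patch; this is only an observation from the arithmetic.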


---

Patch is 29.89 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/73177.diff


8 Files Affected:

- (modified) clang/lib/CodeGen/CGCUDANV.cpp (+19-6) 
- (modified) clang/lib/CodeGen/CGCUDARuntime.h (+7-1) 
- (modified) clang/test/CodeGenCUDA/offloading-entries.cu (+60-26) 
- (modified) clang/test/Driver/linker-wrapper-image.c (+48-26) 
- (modified) clang/tools/clang-linker-wrapper/OffloadWrapper.cpp (+53-5) 
- (modified) llvm/include/llvm/Frontend/Offloading/Utility.h (+4-2) 
- (modified) llvm/lib/Frontend/Offloading/Utility.cpp (+2-2) 
- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+1-1) 


``diff
diff --git a/clang/lib/CodeGen/CGCUDANV.cpp b/clang/lib/CodeGen/CGCUDANV.cpp
index 66147f656071f53..eb059080b977872 100644
--- a/clang/lib/CodeGen/CGCUDANV.cpp
+++ b/clang/lib/CodeGen/CGCUDANV.cpp
@@ -1132,26 +1132,39 @@ void CGNVCUDARuntime::createOffloadingEntries() {
   for (KernelInfo &I : EmittedKernels)
 llvm::offloading::emitOffloadingEntry(
 M, KernelHandles[I.Kernel->getName()],
-getDeviceSideName(cast(I.D)), 0,
+getDeviceSideName(cast(I.D)), /*Flags=*/0, /*Data=*/0,
 DeviceVarFlags::OffloadGlobalEntry, Section);
 
   for (VarInfo &I : DeviceVars) {
 uint64_t VarSize =
 CGM.getDataLayout().getTypeAllocSize(I.Var->getValueType());
+int32_t Flags =
+(I.Flags.isExtern()
+ ? static_cast(DeviceVarFlags::OffloadGlobalExtern)
+ : 0) |
+(I.Flags.isConstant()
+ ? static_cast(DeviceVarFlags::OffloadGlobalConstant)
+ : 0) |
+(I.Flags.isNormalized()
+ ? static_cast(DeviceVarFlags::OffloadGlobalNormalized)
+ : 0);
 if (I.Flags.getKind() == DeviceVarFlags::Variable) {
   llvm::offloading::emitOffloadingEntry(
   M, I.Var, getDeviceSideName(I.D), VarSize,
-  I.Flags.isManaged() ? DeviceVarFlags::OffloadGlobalManagedEntry
-  : DeviceVarFlags::OffloadGlobalEntry,
-  Section);
+  (I.Flags.isManaged() ? DeviceVarFlags::OffloadGlobalManagedEntry
+   : DeviceVarFlags::OffloadGlobalEntry) |
+  Flags,
+  /*Data=*/0, Section);
 } else if (I.Flags.getKind() == DeviceVarFlags::Surface) {
   llvm::offloading::emitOffloadingEntry(
   M, I.Var, getDeviceSideName(I.D), VarSize,
-  DeviceVarFlags::OffloadGlobalSurfaceEntry, Section);
+  DeviceVarFlags::OffloadGlobalSurfaceEntry | Flags,
+  I.Flags.getSurfTexType(), Section);
 } else if (I.Flags.getKind() == DeviceVarFlags::Texture) {
   llvm::offloading::emitOffloadingEntry(
   M, I.Var, getDeviceSideName(I.D), VarSize,
-  DeviceVarFlags::OffloadGlobalTextureEntry, Section);
+  DeviceVarFlags::OffloadGlobalTextureEntry | Flags,
+  I.Flags.getSurfTexType(), Section);
 }
   }
 }
diff --git a/clang/lib/CodeGen/CGCUDARuntime.h b/clang/lib/CodeGen/CGCUDARuntime.h
index 9a9c6d26cc63c40..a224cdf0054f952 100644
--- a/clang/lib/CodeGen/CGCUDARuntime.h
+++ b/clang/lib/CodeGen/CGCUDARuntime.h
@@ -52,7 +52,7 @@ class CGCUDARuntime {
   Texture,  // Builtin texture
 };
 
-/// The kind flag for an offloading entry.
+/// The kind bit-field for an offloading entry.
 enum OffloadEntryKindFlag : uint32_t {
   /// Mark the entry as a global entry. This indicates the presense of a
   /// kernel if the size field is zero and a variable otherwise.
@@ -63,6 +63,12 @@ class CGCUDARuntime {
   OffloadGlobalSurfaceEntry = 0x2,
   /// Mark the entry as a texture variable.
   OffloadGlobalTextureEntry = 0x3,
+  /// Mark the entry as being extern.
+  OffloadGlobalExtern = 0x4,
+  /// Mark the entry as being constant.
+  OffloadGlobalConstant = 0x8,
+  /// Mark the entry as being a normalized surface.
+  OffloadGlobalNormalized = 0x16,
 };
 
   private:
diff --git a/clang/test/CodeGenCUDA/offloading-entries.cu b/clang/test/CodeGenCUDA/offloading-entries.cu
index 46235051f1e4f12..4f5cf65ecd0bde6 100644
--- a/clang/test/CodeGenCUDA/offloading-entries.cu
+++ b/clang/test/CodeGenCUDA/offloading-entries.cu
@@ -17,31 +17,47 @@
 //.
 // CUDA: @.omp_offloading.entry_name = internal unnamed_addr constant [8 x i8] c"_Z3foov\00"
 // CUDA: @.o

[llvm] [clang] [CUDA][HIP] Improve variable registration with the new driver (PR #73177)

2023-11-22 Thread via cfe-commits


llvmbot wrote:




@llvm/pr-subscribers-clang-driver

Author: Joseph Huber (jhuber6)


Changes

Summary:
This patch adds support for registering texture / surface variables from
CUDA / HIP. Additionally, we now properly track the `extern` and `const`
flags that are also used in these runtime functions.

This does not implement the `managed` variables yet as those seem to
require some extra handling I'm not familiar with. The issue is that the
current offload entry isn't large enough to carry size and alignment
information along with an extra global.


---

Patch is 29.89 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/73177.diff


8 Files Affected:

- (modified) clang/lib/CodeGen/CGCUDANV.cpp (+19-6) 
- (modified) clang/lib/CodeGen/CGCUDARuntime.h (+7-1) 
- (modified) clang/test/CodeGenCUDA/offloading-entries.cu (+60-26) 
- (modified) clang/test/Driver/linker-wrapper-image.c (+48-26) 
- (modified) clang/tools/clang-linker-wrapper/OffloadWrapper.cpp (+53-5) 
- (modified) llvm/include/llvm/Frontend/Offloading/Utility.h (+4-2) 
- (modified) llvm/lib/Frontend/Offloading/Utility.cpp (+2-2) 
- (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+1-1) 


``diff
diff --git a/clang/lib/CodeGen/CGCUDANV.cpp b/clang/lib/CodeGen/CGCUDANV.cpp
index 66147f656071f53..eb059080b977872 100644
--- a/clang/lib/CodeGen/CGCUDANV.cpp
+++ b/clang/lib/CodeGen/CGCUDANV.cpp
@@ -1132,26 +1132,39 @@ void CGNVCUDARuntime::createOffloadingEntries() {
   for (KernelInfo &I : EmittedKernels)
 llvm::offloading::emitOffloadingEntry(
 M, KernelHandles[I.Kernel->getName()],
-getDeviceSideName(cast<NamedDecl>(I.D)), 0,
+getDeviceSideName(cast<NamedDecl>(I.D)), /*Flags=*/0, /*Data=*/0,
 DeviceVarFlags::OffloadGlobalEntry, Section);
 
   for (VarInfo &I : DeviceVars) {
 uint64_t VarSize =
 CGM.getDataLayout().getTypeAllocSize(I.Var->getValueType());
+int32_t Flags =
+(I.Flags.isExtern()
+ ? static_cast<int32_t>(DeviceVarFlags::OffloadGlobalExtern)
+ : 0) |
+(I.Flags.isConstant()
+ ? static_cast<int32_t>(DeviceVarFlags::OffloadGlobalConstant)
+ : 0) |
+(I.Flags.isNormalized()
+ ? static_cast<int32_t>(DeviceVarFlags::OffloadGlobalNormalized)
+ : 0);
 if (I.Flags.getKind() == DeviceVarFlags::Variable) {
   llvm::offloading::emitOffloadingEntry(
   M, I.Var, getDeviceSideName(I.D), VarSize,
-  I.Flags.isManaged() ? DeviceVarFlags::OffloadGlobalManagedEntry
-  : DeviceVarFlags::OffloadGlobalEntry,
-  Section);
+  (I.Flags.isManaged() ? DeviceVarFlags::OffloadGlobalManagedEntry
+   : DeviceVarFlags::OffloadGlobalEntry) |
+  Flags,
+  /*Data=*/0, Section);
 } else if (I.Flags.getKind() == DeviceVarFlags::Surface) {
   llvm::offloading::emitOffloadingEntry(
   M, I.Var, getDeviceSideName(I.D), VarSize,
-  DeviceVarFlags::OffloadGlobalSurfaceEntry, Section);
+  DeviceVarFlags::OffloadGlobalSurfaceEntry | Flags,
+  I.Flags.getSurfTexType(), Section);
 } else if (I.Flags.getKind() == DeviceVarFlags::Texture) {
   llvm::offloading::emitOffloadingEntry(
   M, I.Var, getDeviceSideName(I.D), VarSize,
-  DeviceVarFlags::OffloadGlobalTextureEntry, Section);
+  DeviceVarFlags::OffloadGlobalTextureEntry | Flags,
+  I.Flags.getSurfTexType(), Section);
 }
   }
 }
diff --git a/clang/lib/CodeGen/CGCUDARuntime.h b/clang/lib/CodeGen/CGCUDARuntime.h
index 9a9c6d26cc63c40..a224cdf0054f952 100644
--- a/clang/lib/CodeGen/CGCUDARuntime.h
+++ b/clang/lib/CodeGen/CGCUDARuntime.h
@@ -52,7 +52,7 @@ class CGCUDARuntime {
   Texture,  // Builtin texture
 };
 
-/// The kind flag for an offloading entry.
+/// The kind bit-field for an offloading entry.
 enum OffloadEntryKindFlag : uint32_t {
   /// Mark the entry as a global entry. This indicates the presense of a
   /// kernel if the size field is zero and a variable otherwise.
@@ -63,6 +63,12 @@ class CGCUDARuntime {
   OffloadGlobalSurfaceEntry = 0x2,
   /// Mark the entry as a texture variable.
   OffloadGlobalTextureEntry = 0x3,
+  /// Mark the entry as being extern.
+  OffloadGlobalExtern = 0x4,
+  /// Mark the entry as being constant.
+  OffloadGlobalConstant = 0x8,
+  /// Mark the entry as being a normalized surface.
+  OffloadGlobalNormalized = 0x16,
 };
 
   private:
diff --git a/clang/test/CodeGenCUDA/offloading-entries.cu b/clang/test/CodeGenCUDA/offloading-entries.cu
index 46235051f1e4f12..4f5cf65ecd0bde6 100644
--- a/clang/test/CodeGenCUDA/offloading-entries.cu
+++ b/clang/test/CodeGenCUDA/offloading-entries.cu
@@ -17,31 +17,47 @@
 //.
 // CUDA: @.omp_offloading.entry_name = internal unnamed_addr constant [8 x i8] c"_Z3foov\00"
 // CUDA: @.omp_offloading.entry._Z3foov = weak constant %struct.__tgt_off

[llvm] [clang] [CUDA][HIP] Improve variable registration with the new driver (PR #73177)

2023-11-22 Thread Joseph Huber via cfe-commits


https://github.com/jhuber6 created 
https://github.com/llvm/llvm-project/pull/73177

Summary:
This patch adds support for registering texture / surface variables from
CUDA / HIP. Additionally, we now properly track the `extern` and `const`
flags that are also used in these runtime functions.

This does not implement the `managed` variables yet as those seem to
require some extra handling I'm not familiar with. The issue is that the
current offload entry isn't large enough to carry size and alignment
information along with an extra global.


From 90e785f9c2bfebfd9db59307f0ad3a5156c4e303 Mon Sep 17 00:00:00 2001
From: Joseph Huber 
Date: Wed, 22 Nov 2023 15:57:22 -0600
Subject: [PATCH] [CUDA][HIP] Improve variable registration with the new driver

Summary:
This patch adds support for registering texture / surface variables from
CUDA / HIP. Additionally, we now properly track the `extern` and `const`
flags that are also used in these runtime functions.

This does not implement the `managed` variables yet as those seem to
require some extra handling I'm not familiar with. The issue is that the
current offload entry isn't large enough to carry size and alignment
information along with an extra global.
---
 clang/lib/CodeGen/CGCUDANV.cpp                      | 25 --
 clang/lib/CodeGen/CGCUDARuntime.h                   |  8 +-
 clang/test/CodeGenCUDA/offloading-entries.cu        | 86 +--
 clang/test/Driver/linker-wrapper-image.c            | 74 ++--
 .../clang-linker-wrapper/OffloadWrapper.cpp         | 58 +++--
 .../llvm/Frontend/Offloading/Utility.h              |  6 +-
 llvm/lib/Frontend/Offloading/Utility.cpp            |  4 +-
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp           |  2 +-
 8 files changed, 194 insertions(+), 69 deletions(-)

diff --git a/clang/lib/CodeGen/CGCUDANV.cpp b/clang/lib/CodeGen/CGCUDANV.cpp
index 66147f656071f53..eb059080b977872 100644
--- a/clang/lib/CodeGen/CGCUDANV.cpp
+++ b/clang/lib/CodeGen/CGCUDANV.cpp
@@ -1132,26 +1132,39 @@ void CGNVCUDARuntime::createOffloadingEntries() {
   for (KernelInfo &I : EmittedKernels)
 llvm::offloading::emitOffloadingEntry(
 M, KernelHandles[I.Kernel->getName()],
-getDeviceSideName(cast<NamedDecl>(I.D)), 0,
+getDeviceSideName(cast<NamedDecl>(I.D)), /*Flags=*/0, /*Data=*/0,
 DeviceVarFlags::OffloadGlobalEntry, Section);
 
   for (VarInfo &I : DeviceVars) {
 uint64_t VarSize =
 CGM.getDataLayout().getTypeAllocSize(I.Var->getValueType());
+int32_t Flags =
+(I.Flags.isExtern()
+ ? static_cast<int32_t>(DeviceVarFlags::OffloadGlobalExtern)
+ : 0) |
+(I.Flags.isConstant()
+ ? static_cast<int32_t>(DeviceVarFlags::OffloadGlobalConstant)
+ : 0) |
+(I.Flags.isNormalized()
+ ? static_cast<int32_t>(DeviceVarFlags::OffloadGlobalNormalized)
+ : 0);
 if (I.Flags.getKind() == DeviceVarFlags::Variable) {
   llvm::offloading::emitOffloadingEntry(
   M, I.Var, getDeviceSideName(I.D), VarSize,
-  I.Flags.isManaged() ? DeviceVarFlags::OffloadGlobalManagedEntry
-  : DeviceVarFlags::OffloadGlobalEntry,
-  Section);
+  (I.Flags.isManaged() ? DeviceVarFlags::OffloadGlobalManagedEntry
+   : DeviceVarFlags::OffloadGlobalEntry) |
+  Flags,
+  /*Data=*/0, Section);
 } else if (I.Flags.getKind() == DeviceVarFlags::Surface) {
   llvm::offloading::emitOffloadingEntry(
   M, I.Var, getDeviceSideName(I.D), VarSize,
-  DeviceVarFlags::OffloadGlobalSurfaceEntry, Section);
+  DeviceVarFlags::OffloadGlobalSurfaceEntry | Flags,
+  I.Flags.getSurfTexType(), Section);
 } else if (I.Flags.getKind() == DeviceVarFlags::Texture) {
   llvm::offloading::emitOffloadingEntry(
   M, I.Var, getDeviceSideName(I.D), VarSize,
-  DeviceVarFlags::OffloadGlobalTextureEntry, Section);
+  DeviceVarFlags::OffloadGlobalTextureEntry | Flags,
+  I.Flags.getSurfTexType(), Section);
 }
   }
 }
diff --git a/clang/lib/CodeGen/CGCUDARuntime.h b/clang/lib/CodeGen/CGCUDARuntime.h
index 9a9c6d26cc63c40..a224cdf0054f952 100644
--- a/clang/lib/CodeGen/CGCUDARuntime.h
+++ b/clang/lib/CodeGen/CGCUDARuntime.h
@@ -52,7 +52,7 @@ class CGCUDARuntime {
   Texture,  // Builtin texture
 };
 
-/// The kind flag for an offloading entry.
+/// The kind bit-field for an offloading entry.
 enum OffloadEntryKindFlag : uint32_t {
   /// Mark the entry as a global entry. This indicates the presense of a
   /// kernel if the size field is zero and a variable otherwise.
@@ -63,6 +63,12 @@ class CGCUDARuntime {
   OffloadGlobalSurfaceEntry = 0x2,
   /// Mark the entry as a texture variable.
   OffloadGlobalTextureEntry = 0x3,
+  /// Mark the entry as being extern.
+  OffloadGlobalExtern = 0x4,
+  /// Mark the entry as being constant.
+  OffloadGlobalConstant = 0x8,
+  /// Mark the entry as bein

[clang] [clang][CodeGen] Emit atomic IR instead of libcalls for misaligned po… (PR #73176)

2023-11-22 Thread via cfe-commits


github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. :warning:



You can test this locally with the following command:


```bash
git-clang-format --diff 07fdc084fe75f971688d4140a5bd2dcb1d60eba2 \
  51423d866934f1507b64f8049d7cfcedf9727e43 -- clang/lib/CodeGen/CGAtomic.cpp \
  clang/test/CodeGen/LoongArch/atomics.c \
  clang/test/CodeGen/PowerPC/quadword-atomics.c \
  clang/test/CodeGen/RISCV/riscv-atomics.c clang/test/CodeGen/arm-atomics-m.c \
  clang/test/CodeGen/arm-atomics-m0.c clang/test/CodeGen/atomic-ops-libcall.c \
  clang/test/CodeGen/atomic-ops.c clang/test/CodeGen/atomics-inlining.c \
  clang/test/CodeGen/c11atomics.c
```





View the diff from clang-format here.


``diff
diff --git a/clang/lib/CodeGen/CGAtomic.cpp b/clang/lib/CodeGen/CGAtomic.cpp
index bc432afb6e..ff883e49c2 100644
--- a/clang/lib/CodeGen/CGAtomic.cpp
+++ b/clang/lib/CodeGen/CGAtomic.cpp
@@ -785,74 +785,73 @@ static void EmitAtomicOp(CodeGenFunction &CGF, AtomicExpr *Expr, Address Dest,
   Builder.SetInsertPoint(ContBB);
 }
 
-static bool
-isArithmeticOp(AtomicExpr::AtomicOp op) {
+static bool isArithmeticOp(AtomicExpr::AtomicOp op) {
   switch (op) {
-case AtomicExpr::AO__atomic_add_fetch:
-case AtomicExpr::AO__atomic_fetch_add:
-case AtomicExpr::AO__c11_atomic_fetch_add:
-case AtomicExpr::AO__hip_atomic_fetch_add:
-case AtomicExpr::AO__opencl_atomic_fetch_add:
-case AtomicExpr::AO__atomic_and_fetch:
-case AtomicExpr::AO__atomic_fetch_and:
-case AtomicExpr::AO__c11_atomic_fetch_and:
-case AtomicExpr::AO__hip_atomic_fetch_and:
-case AtomicExpr::AO__opencl_atomic_fetch_and:
-case AtomicExpr::AO__atomic_or_fetch:
-case AtomicExpr::AO__atomic_fetch_or:
-case AtomicExpr::AO__c11_atomic_fetch_or:
-case AtomicExpr::AO__hip_atomic_fetch_or:
-case AtomicExpr::AO__opencl_atomic_fetch_or:
-case AtomicExpr::AO__atomic_sub_fetch:
-case AtomicExpr::AO__atomic_fetch_sub:
-case AtomicExpr::AO__c11_atomic_fetch_sub:
-case AtomicExpr::AO__hip_atomic_fetch_sub:
-case AtomicExpr::AO__opencl_atomic_fetch_sub:
-case AtomicExpr::AO__atomic_xor_fetch:
-case AtomicExpr::AO__atomic_fetch_xor:
-case AtomicExpr::AO__c11_atomic_fetch_xor:
-case AtomicExpr::AO__hip_atomic_fetch_xor:
-case AtomicExpr::AO__opencl_atomic_fetch_xor:
-case AtomicExpr::AO__atomic_nand_fetch:
-case AtomicExpr::AO__atomic_fetch_nand:
-case AtomicExpr::AO__c11_atomic_fetch_nand:
-case AtomicExpr::AO__atomic_min_fetch:
-case AtomicExpr::AO__atomic_fetch_min:
-case AtomicExpr::AO__c11_atomic_fetch_min:
-case AtomicExpr::AO__hip_atomic_fetch_min:
-case AtomicExpr::AO__opencl_atomic_fetch_min:
-case AtomicExpr::AO__atomic_max_fetch:
-case AtomicExpr::AO__atomic_fetch_max:
-case AtomicExpr::AO__c11_atomic_fetch_max:
-case AtomicExpr::AO__hip_atomic_fetch_max:
-case AtomicExpr::AO__opencl_atomic_fetch_max:
-  return true;
-case AtomicExpr::AO__c11_atomic_init:
-case AtomicExpr::AO__opencl_atomic_init:
-case AtomicExpr::AO__atomic_compare_exchange:
-case AtomicExpr::AO__atomic_compare_exchange_n:
-case AtomicExpr::AO__c11_atomic_compare_exchange_weak:
-case AtomicExpr::AO__c11_atomic_compare_exchange_strong:
-case AtomicExpr::AO__hip_atomic_compare_exchange_weak:
-case AtomicExpr::AO__hip_atomic_compare_exchange_strong:
-case AtomicExpr::AO__opencl_atomic_compare_exchange_weak:
-case AtomicExpr::AO__opencl_atomic_compare_exchange_strong:
-case AtomicExpr::AO__atomic_exchange:
-case AtomicExpr::AO__atomic_exchange_n:
-case AtomicExpr::AO__c11_atomic_exchange:
-case AtomicExpr::AO__hip_atomic_exchange:
-case AtomicExpr::AO__opencl_atomic_exchange:
-case AtomicExpr::AO__atomic_store:
-case AtomicExpr::AO__atomic_store_n:
-case AtomicExpr::AO__c11_atomic_store:
-case AtomicExpr::AO__hip_atomic_store:
-case AtomicExpr::AO__opencl_atomic_store:
-case AtomicExpr::AO__atomic_load:
-case AtomicExpr::AO__atomic_load_n:
-case AtomicExpr::AO__c11_atomic_load:
-case AtomicExpr::AO__hip_atomic_load:
-case AtomicExpr::AO__opencl_atomic_load:
-  return false;
+  case AtomicExpr::AO__atomic_add_fetch:
+  case AtomicExpr::AO__atomic_fetch_add:
+  case AtomicExpr::AO__c11_atomic_fetch_add:
+  case AtomicExpr::AO__hip_atomic_fetch_add:
+  case AtomicExpr::AO__opencl_atomic_fetch_add:
+  case AtomicExpr::AO__atomic_and_fetch:
+  case AtomicExpr::AO__atomic_fetch_and:
+  case AtomicExpr::AO__c11_atomic_fetch_and:
+  case AtomicExpr::AO__hip_atomic_fetch_and:
+  case AtomicExpr::AO__opencl_atomic_fetch_and:
+  case AtomicExpr::AO__atomic_or_fetch:
+  case AtomicExpr::AO__atomic_fetch_or:
+  case AtomicExpr::AO__c11_atomic_fetch_or:
+  case AtomicExpr::AO__hip_atomic_fetch_or:
+  case AtomicExpr::AO__opencl_atomic_fetch_or:
+  case AtomicExpr::AO__atomic_sub_fetch:
+  case AtomicExpr::AO__atomic_fetch_s
