[llvm-branch-commits] [llvm] [SROA] Use !tbaa instead of !tbaa.struct if op matches field. (PR #81289)

2024-02-16 Thread via llvm-branch-commits


@@ -4561,6 +4577,10 @@ bool SROA::presplitLoadsAndStores(AllocaInst &AI, AllocaSlices &AS) {
 PStore->copyMetadata(*SI, {LLVMContext::MD_mem_parallel_loop_access,
LLVMContext::MD_access_group,
LLVMContext::MD_DIAssignID});
+
+if (AATags)
+  PStore->setAAMetadata(

dobbelaj-snps wrote:

thx !

https://github.com/llvm/llvm-project/pull/81289
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [SROA] Use !tbaa instead of !tbaa.struct if op matches field. (PR #81289)

2024-02-16 Thread via llvm-branch-commits

https://github.com/dobbelaj-snps approved this pull request.

lgtm

https://github.com/llvm/llvm-project/pull/81289


[llvm-branch-commits] [openmp] release/18.x: [OpenMP] [cmake] Don't use -fno-semantic-interposition on Windows (#81113) (PR #81332)

2024-02-16 Thread Martin Storsjö via llvm-branch-commits

https://github.com/mstorsjo approved this pull request.

Marking as approved (approved by @jhuber6), so it shows up right in PR listings.

https://github.com/llvm/llvm-project/pull/81332


[llvm-branch-commits] [lld] [LLD] [docs] Add more release notes for COFF and MinGW (PR #81977)

2024-02-16 Thread Martin Storsjö via llvm-branch-commits

https://github.com/mstorsjo milestoned 
https://github.com/llvm/llvm-project/pull/81977


[llvm-branch-commits] [lld] [LLD] [docs] Add more release notes for COFF and MinGW (PR #81977)

2024-02-16 Thread Martin Storsjö via llvm-branch-commits

https://github.com/mstorsjo created 
https://github.com/llvm/llvm-project/pull/81977

Add review references to all items already mentioned.

Move some items to the right section (from the MinGW section to COFF, as the 
implementation is in the COFF linker side, and may be relevant for non-MinGW 
cases as well).

From 03da62f54b8f90dda00ca0632a962f9cdc73d571 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20Storsj=C3=B6?= 
Date: Mon, 12 Feb 2024 14:16:40 +0200
Subject: [PATCH] [LLD] [docs] Add more release notes for COFF and MinGW

Add review references to all items already mentioned.

Move some items to the right section (from the MinGW section to COFF,
as the implementation is in the COFF linker side, and may be relevant
for non-MinGW cases as well).
---
 lld/docs/ReleaseNotes.rst | 48 ---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/lld/docs/ReleaseNotes.rst b/lld/docs/ReleaseNotes.rst
index 82f9d93b8e86ab..56ba3463aeadc0 100644
--- a/lld/docs/ReleaseNotes.rst
+++ b/lld/docs/ReleaseNotes.rst
@@ -82,14 +82,46 @@ COFF Improvements
 
 * Added support for ``--time-trace`` and associated ``--time-trace-granularity``.
   This generates a .json profile trace of the linker execution.
+  (`#68236 `_)
+
+* The ``-dependentloadflag`` option was implemented.
+  (`#71537 `_)
 
 * LLD now prefers library paths specified with ``-libpath:`` over the implicitly
   detected toolchain paths.
+  (`#78039 `_)
+
+* Added new options ``-lldemit:llvm`` and ``-lldemit:asm`` for getting
+  the output of LTO compilation as LLVM bitcode or assembly.
+  (`#66964 `_)
+  (`#67079 `_)
+
+* Added a new option ``-build-id`` for generating a ``.buildid`` section
+  when not generating a PDB. A new symbol ``__buildid`` is generated by
+  the linker, allowing code to reference the build ID of the binary.
+  (`#71433 `_)
+  (`#74652 `_)
+
+* A new, LLD specific option, ``-lld-allow-duplicate-weak``, was added
+  for allowing duplicate weak symbols.
+  (`#68077 `_)
+
+* More correctly handle LTO of files that define ``__imp_`` prefixed dllimport
+  redirections.
+  (`#70777 `_)
+  (`#71376 `_)
+  (`#72989 `_)
+
+* Linking undefined references to weak symbols with LTO now works.
+  (`#70430 `_)
 
 * Use the ``SOURCE_DATE_EPOCH`` environment variable for the PE header and
   debug directory timestamps, if neither the ``/Brepro`` nor ``/timestamp:``
   options have been specified. This makes the linker output reproducible by
   setting this environment variable.
+  (`#81326 `_)
+
+* Lots of incremental work towards supporting linking ARM64EC binaries.
 
 MinGW Improvements
 --
@@ -97,19 +129,29 @@ MinGW Improvements
 * Added support for many LTO and ThinLTO options (most LTO options supported
   by the ELF driver, that are implemented by the COFF backend as well,
   should be supported now).
+  (`D158412 `_)
+  (`D158887 `_)
+  (`#77387 `_)
+  (`#81475 `_)
 
 * LLD no longer tries to autodetect and use library paths from MSVC/WinSDK
   installations when run in MinGW mode; that mode of operation shouldn't
   ever be needed in MinGW mode, and could be a source of unexpected
   behaviours.
+  (`D144084 `_)
 
 * The ``--icf=safe`` option now works as expected; it was previously a no-op.
-
-* More correctly handle LTO of files that define ``__imp_`` prefixed dllimport
-  redirections.
+  (`#70037 `_)
 
 * The strip flags ``-S`` and ``-s`` now can be used to strip out DWARF debug
   info and symbol tables while emitting a PDB debug info file.
+  (`#75181 `_)
+
+* The option ``--dll`` is handled as an alias for the ``--shared`` option.
+  (`#68575 `_)
+
+* The option ``--sort-common`` is ignored now.
+  (`#66336 `_)
 
 MachO Improvements
 --
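The ``SOURCE_DATE_EPOCH`` item in the COFF section above describes a fallback order for the timestamp. A minimal sketch of that order follows, assuming a decimal epoch string and a 32-bit timestamp field. The names `parseEpoch` and `chooseTimestamp` are hypothetical and are not LLD functions; how LLD derives the value when ``/Brepro`` or ``/timestamp:`` is given is not modelled here, it is simply passed in.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <optional>

// Hypothetical helper: parse a decimal epoch string.
// Returns nullopt for a missing, empty, malformed, or out-of-range value.
std::optional<uint32_t> parseEpoch(const char *Text) {
  if (!Text || !*Text)
    return std::nullopt;
  char *End = nullptr;
  unsigned long long V = std::strtoull(Text, &End, 10);
  if (*End != '\0' || V > UINT32_MAX)
    return std::nullopt;
  return static_cast<uint32_t>(V);
}

// Hypothetical model of the order described in the release note:
// an explicit option wins, then the environment value, then the current time.
uint32_t chooseTimestamp(bool HasExplicitOption, uint32_t ExplicitValue,
                         const char *EnvValue, uint32_t Now) {
  if (HasExplicitOption)
    return ExplicitValue;
  if (std::optional<uint32_t> Epoch = parseEpoch(EnvValue))
    return *Epoch;
  return Now;
}
```

A caller would typically pass `std::getenv("SOURCE_DATE_EPOCH")` as `EnvValue` and the current time as `Now`; keeping both as parameters makes the function deterministic and easy to test.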



[llvm-branch-commits] [lld] [LLD] [docs] Add more release notes for COFF and MinGW (PR #81977)

2024-02-16 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-lld

Author: Martin Storsjö (mstorsjo)


Changes

Add review references to all items already mentioned.

Move some items to the right section (from the MinGW section to COFF, as the 
implementation is in the COFF linker side, and may be relevant for non-MinGW 
cases as well).

---
Full diff: https://github.com/llvm/llvm-project/pull/81977.diff


1 Files Affected:

- (modified) lld/docs/ReleaseNotes.rst (+45-3) 


https://github.com/llvm/llvm-project/pull/81977


[llvm-branch-commits] [llvm] release/18.x: [SLP]Fix PR79229: Check that extractelement is used only in a single node (PR #81984)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/81984

Backport 48bbd7658710ef1699bf2a6532ff5830230aacc5

Requested by: @nikic

From c685d6940a8e1f43b510a79d8aefb1e2f0ba0e1e Mon Sep 17 00:00:00 2001
From: Alexey Bataev 
Date: Wed, 24 Jan 2024 10:57:18 -0800
Subject: [PATCH] [SLP]Fix PR79229: Check that extractelement is used only in a
 single node before erasing.

Before erasing the extractelement instruction, it is not enough to
check for a single use; we also need to check that it is not used in
several nodes, which can happen because of the preliminary node
reordering.

(cherry picked from commit 48bbd7658710ef1699bf2a6532ff5830230aacc5)
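
The check described above can be sketched outside of LLVM. The following is a simplified model, not the SLPVectorizer code: `Node`, `feedsUserFromSingleNode` and `makeExampleTree` are hypothetical names, scalars are plain integer ids, and the additional `!UTE` and `MultiNodeScalars` conditions from the patch are left out. It only illustrates the counting step: a scalar may be erased only if exactly one node feeding the user contains it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for an SLP tree node.
struct Node {
  std::vector<int> Scalars;   // ids of the scalar values bundled into this node
  std::vector<int> UserNodes; // indices of the nodes that consume this node
};

// True when scalar `Id` is bundled into exactly one node that feeds `UserIdx`.
bool feedsUserFromSingleNode(const std::vector<Node> &Tree, int UserIdx,
                             int Id) {
  auto Contains = [](const std::vector<int> &V, int X) {
    return std::find(V.begin(), V.end(), X) != V.end();
  };
  auto Count = std::count_if(Tree.begin(), Tree.end(), [&](const Node &N) {
    return Contains(N.UserNodes, UserIdx) && Contains(N.Scalars, Id);
  });
  return Count == 1;
}

// Builds a three-node tree: node 0 is the user, nodes 1 and 2 are operands.
// With ShareScalar set, scalar 7 appears in both operand nodes.
std::vector<Node> makeExampleTree(bool ShareScalar) {
  std::vector<Node> Tree(3);
  Tree[1] = Node{{7, 8}, {0}};
  Tree[2] = Node{{ShareScalar ? 7 : 9, 10}, {0}};
  return Tree;
}
```

In this model, `feedsUserFromSingleNode(makeExampleTree(true), 0, 7)` is false, which corresponds to the case where the extractelement must be kept even though it has a single IR use.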
---
 .../Transforms/Vectorize/SLPVectorizer.cpp|  11 +-
 .../extractelement-single-use-many-nodes.ll   | 144 ++
 2 files changed, 154 insertions(+), 1 deletion(-)
 create mode 100644 
llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp 
b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 601d2454c1e163..83f787d7fb624a 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -10216,7 +10216,16 @@ class BoUpSLP::ShuffleInstructionBuilder final : public BaseShuffleAnalysis {
   // If the only one use is vectorized - can delete the extractelement
   // itself.
   if (!EI->hasOneUse() || any_of(EI->users(), [&](User *U) {
-return !R.ScalarToTreeEntry.count(U);
+const TreeEntry *UTE = R.getTreeEntry(U);
+return !UTE || R.MultiNodeScalars.contains(U) ||
+   count_if(R.VectorizableTree,
+[&](const std::unique_ptr<TreeEntry> &TE) {
+  return any_of(TE->UserTreeIndices,
+[&](const EdgeInfo &Edge) {
+  return Edge.UserTE == UTE;
+}) &&
+ is_contained(TE->Scalars, EI);
+}) != 1;
   }))
 continue;
   R.eraseInstruction(EI);
diff --git 
a/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
 
b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
new file mode 100644
index 00..f665dac3282b79
--- /dev/null
+++ 
b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
@@ -0,0 +1,144 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 4
+; RUN: opt -passes=slp-vectorizer -mtriple=x86_64-unknown-linux-gnu 
-mcpu=x86-64-v3 -S < %s | FileCheck %s
+
+define void @foo(double %i) {
+; CHECK-LABEL: define void @foo(
+; CHECK-SAME: double [[I:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  bb:
+; CHECK-NEXT:[[TMP0:%.*]] = insertelement <4 x double> , double 
[[I]], i32 2
+; CHECK-NEXT:[[TMP1:%.*]] = fsub <4 x double> zeroinitializer, [[TMP0]]
+; CHECK-NEXT:[[TMP2:%.*]] = extractelement <4 x double> [[TMP1]], i32 1
+; CHECK-NEXT:[[TMP3:%.*]] = insertelement <2 x double> poison, double 
[[I]], i32 0
+; CHECK-NEXT:[[TMP4:%.*]] = fsub <2 x double> zeroinitializer, [[TMP3]]
+; CHECK-NEXT:[[TMP5:%.*]] = extractelement <2 x double> [[TMP4]], i32 1
+; CHECK-NEXT:[[TMP6:%.*]] = shufflevector <2 x double> [[TMP4]], <2 x 
double> poison, <8 x i32> 
+; CHECK-NEXT:[[TMP7:%.*]] = shufflevector <8 x double> [[TMP6]], <8 x 
double> , 
<8 x i32> 
+; CHECK-NEXT:[[TMP8:%.*]] = insertelement <8 x double> [[TMP7]], double 
[[TMP2]], i32 3
+; CHECK-NEXT:[[TMP9:%.*]] = shufflevector <4 x double> [[TMP1]], <4 x 
double> poison, <8 x i32> 
+; CHECK-NEXT:[[TMP10:%.*]] = shufflevector <8 x double> [[TMP9]], <8 x 
double> , <8 x i32> 
+; CHECK-NEXT:[[TMP11:%.*]] = insertelement <8 x double> [[TMP10]], double 
[[TMP5]], i32 6
+; CHECK-NEXT:[[TMP12:%.*]] = fmul <8 x double> [[TMP8]], [[TMP11]]
+; CHECK-NEXT:[[TMP13:%.*]] = fadd <8 x double> zeroinitializer, [[TMP12]]
+; CHECK-NEXT:[[TMP14:%.*]] = fadd <8 x double> [[TMP13]], zeroinitializer
+; CHECK-NEXT:[[TMP15:%.*]] = fcmp ult <8 x double> [[TMP14]], 
zeroinitializer
+; CHECK-NEXT:[[TMP16:%.*]] = freeze <8 x i1> [[TMP15]]
+; CHECK-NEXT:[[TMP17:%.*]] = call i1 @llvm.vector.reduce.and.v8i1(<8 x i1> 
[[TMP16]])
+; CHECK-NEXT:br i1 [[TMP17]], label [[BB58:%.*]], label [[BB115:%.*]]
+; CHECK:   bb115:
+; CHECK-NEXT:[[TMP18:%.*]] = fmul <2 x double> zeroinitializer, [[TMP4]]
+; CHECK-NEXT:[[TMP19:%.*]] = extractelement <2 x double> [[TMP18]], i32 0
+; CHECK-NEXT:[[TMP20:%.*]] = extractelement <2 x double> [[TMP18]], i32 1
+; CHECK-NEXT:[[I118:%.*]] = fadd double [[TMP19]], [[TMP20]]
+; CHECK-NEXT:[[TMP21:%.*]] = fmul <4 x double> zeroinitializer, [[TMP1]]
+; CHECK-NEXT:[[TMP22:%.*]] = shufflevector <2 x double> [[TMP4]], <2 x 
double> poison, <4 x i32

[llvm-branch-commits] [llvm] release/18.x: [SLP]Fix PR79229: Check that extractelement is used only in a single node (PR #81984)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/81984


[llvm-branch-commits] [llvm] release/18.x: [SLP]Fix PR79229: Check that extractelement is used only in a single node (PR #81984)

2024-02-16 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: None (llvmbot)


Changes

Backport 48bbd7658710ef1699bf2a6532ff5830230aacc5

Requested by: @nikic

---
Full diff: https://github.com/llvm/llvm-project/pull/81984.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+10-1) 
- (added) 
llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll 
(+144) 


``diff
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp 
b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 601d2454c1e163..83f787d7fb624a 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -10216,7 +10216,16 @@ class BoUpSLP::ShuffleInstructionBuilder final : public BaseShuffleAnalysis {
   // If the only one use is vectorized - can delete the extractelement
   // itself.
   if (!EI->hasOneUse() || any_of(EI->users(), [&](User *U) {
-return !R.ScalarToTreeEntry.count(U);
+const TreeEntry *UTE = R.getTreeEntry(U);
+return !UTE || R.MultiNodeScalars.contains(U) ||
+   count_if(R.VectorizableTree,
+[&](const std::unique_ptr<TreeEntry> &TE) {
+  return any_of(TE->UserTreeIndices,
+[&](const EdgeInfo &Edge) {
+  return Edge.UserTE == UTE;
+}) &&
+ is_contained(TE->Scalars, EI);
+}) != 1;
   }))
 continue;
   R.eraseInstruction(EI);
diff --git 
a/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
 
b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
new file mode 100644
index 00..f665dac3282b79
--- /dev/null
+++ 
b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
@@ -0,0 +1,144 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 4
+; RUN: opt -passes=slp-vectorizer -mtriple=x86_64-unknown-linux-gnu 
-mcpu=x86-64-v3 -S < %s | FileCheck %s
+
+define void @foo(double %i) {
+; CHECK-LABEL: define void @foo(
+; CHECK-SAME: double [[I:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  bb:
+; CHECK-NEXT:[[TMP0:%.*]] = insertelement <4 x double> , double 
[[I]], i32 2
+; CHECK-NEXT:[[TMP1:%.*]] = fsub <4 x double> zeroinitializer, [[TMP0]]
+; CHECK-NEXT:[[TMP2:%.*]] = extractelement <4 x double> [[TMP1]], i32 1
+; CHECK-NEXT:[[TMP3:%.*]] = insertelement <2 x double> poison, double 
[[I]], i32 0
+; CHECK-NEXT:[[TMP4:%.*]] = fsub <2 x double> zeroinitializer, [[TMP3]]
+; CHECK-NEXT:[[TMP5:%.*]] = extractelement <2 x double> [[TMP4]], i32 1
+; CHECK-NEXT:[[TMP6:%.*]] = shufflevector <2 x double> [[TMP4]], <2 x 
double> poison, <8 x i32> 
+; CHECK-NEXT:[[TMP7:%.*]] = shufflevector <8 x double> [[TMP6]], <8 x 
double> , 
<8 x i32> 
+; CHECK-NEXT:[[TMP8:%.*]] = insertelement <8 x double> [[TMP7]], double 
[[TMP2]], i32 3
+; CHECK-NEXT:[[TMP9:%.*]] = shufflevector <4 x double> [[TMP1]], <4 x 
double> poison, <8 x i32> 
+; CHECK-NEXT:[[TMP10:%.*]] = shufflevector <8 x double> [[TMP9]], <8 x 
double> , <8 x i32> 
+; CHECK-NEXT:[[TMP11:%.*]] = insertelement <8 x double> [[TMP10]], double 
[[TMP5]], i32 6
+; CHECK-NEXT:[[TMP12:%.*]] = fmul <8 x double> [[TMP8]], [[TMP11]]
+; CHECK-NEXT:[[TMP13:%.*]] = fadd <8 x double> zeroinitializer, [[TMP12]]
+; CHECK-NEXT:[[TMP14:%.*]] = fadd <8 x double> [[TMP13]], zeroinitializer
+; CHECK-NEXT:[[TMP15:%.*]] = fcmp ult <8 x double> [[TMP14]], 
zeroinitializer
+; CHECK-NEXT:[[TMP16:%.*]] = freeze <8 x i1> [[TMP15]]
+; CHECK-NEXT:[[TMP17:%.*]] = call i1 @llvm.vector.reduce.and.v8i1(<8 x i1> 
[[TMP16]])
+; CHECK-NEXT:br i1 [[TMP17]], label [[BB58:%.*]], label [[BB115:%.*]]
+; CHECK:   bb115:
+; CHECK-NEXT:[[TMP18:%.*]] = fmul <2 x double> zeroinitializer, [[TMP4]]
+; CHECK-NEXT:[[TMP19:%.*]] = extractelement <2 x double> [[TMP18]], i32 0
+; CHECK-NEXT:[[TMP20:%.*]] = extractelement <2 x double> [[TMP18]], i32 1
+; CHECK-NEXT:[[I118:%.*]] = fadd double [[TMP19]], [[TMP20]]
+; CHECK-NEXT:[[TMP21:%.*]] = fmul <4 x double> zeroinitializer, [[TMP1]]
+; CHECK-NEXT:[[TMP22:%.*]] = shufflevector <2 x double> [[TMP4]], <2 x 
double> poison, <4 x i32> 
+; CHECK-NEXT:[[TMP23:%.*]] = shufflevector <4 x double> , <4 x 
double> [[TMP22]], <4 x i32> 
+; CHECK-NEXT:[[TMP24:%.*]] = fadd <4 x double> [[TMP21]], [[TMP23]]
+; CHECK-NEXT:[[TMP25:%.*]] = fadd <4 x double> [[TMP24]], zeroinitializer
+; CHECK-NEXT:[[TMP26:%.*]] = select <4 x i1> zeroinitializer, <4 x double> 
zeroinitializer, <4 x double> [[TMP25]]
+; CHECK-NEXT:[[TMP27:%.*]] = fmul <4 x double> [[TMP26]], zeroinitializer
+; CHECK-NEXT:[[TMP28:%.*]] = fmul <4 x do

[llvm-branch-commits] [llvm] release/18.x: [SLP]Fix PR79229: Check that extractelement is used only in a single node (PR #81984)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81984

From 40f34121a954fc2184b1ecf19a2b797482c1434e Mon Sep 17 00:00:00 2001
From: Alexey Bataev 
Date: Wed, 24 Jan 2024 10:57:18 -0800
Subject: [PATCH 1/2] [SLP]Fix PR79229: Check that extractelement is used only
 in a single node before erasing.

Before erasing the extractelement instruction, it is not enough to
check for a single use; we also need to check that it is not used in
several nodes, which can happen because of the preliminary node
reordering.

(cherry picked from commit 48bbd7658710ef1699bf2a6532ff5830230aacc5)
---
 .../Transforms/Vectorize/SLPVectorizer.cpp|  11 +-
 .../extractelement-single-use-many-nodes.ll   | 144 ++
 2 files changed, 154 insertions(+), 1 deletion(-)
 create mode 100644 
llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp 
b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 601d2454c1e163..83f787d7fb624a 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -10216,7 +10216,16 @@ class BoUpSLP::ShuffleInstructionBuilder final : public BaseShuffleAnalysis {
   // If the only one use is vectorized - can delete the extractelement
   // itself.
   if (!EI->hasOneUse() || any_of(EI->users(), [&](User *U) {
-return !R.ScalarToTreeEntry.count(U);
+const TreeEntry *UTE = R.getTreeEntry(U);
+return !UTE || R.MultiNodeScalars.contains(U) ||
+   count_if(R.VectorizableTree,
+[&](const std::unique_ptr<TreeEntry> &TE) {
+  return any_of(TE->UserTreeIndices,
+[&](const EdgeInfo &Edge) {
+  return Edge.UserTE == UTE;
+}) &&
+ is_contained(TE->Scalars, EI);
+}) != 1;
   }))
 continue;
   R.eraseInstruction(EI);
diff --git 
a/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
 
b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
new file mode 100644
index 00..f665dac3282b79
--- /dev/null
+++ 
b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
@@ -0,0 +1,144 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 4
+; RUN: opt -passes=slp-vectorizer -mtriple=x86_64-unknown-linux-gnu 
-mcpu=x86-64-v3 -S < %s | FileCheck %s
+
+define void @foo(double %i) {
+; CHECK-LABEL: define void @foo(
+; CHECK-SAME: double [[I:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  bb:
+; CHECK-NEXT:[[TMP0:%.*]] = insertelement <4 x double> , double 
[[I]], i32 2
+; CHECK-NEXT:[[TMP1:%.*]] = fsub <4 x double> zeroinitializer, [[TMP0]]
+; CHECK-NEXT:[[TMP2:%.*]] = extractelement <4 x double> [[TMP1]], i32 1
+; CHECK-NEXT:[[TMP3:%.*]] = insertelement <2 x double> poison, double 
[[I]], i32 0
+; CHECK-NEXT:[[TMP4:%.*]] = fsub <2 x double> zeroinitializer, [[TMP3]]
+; CHECK-NEXT:[[TMP5:%.*]] = extractelement <2 x double> [[TMP4]], i32 1
+; CHECK-NEXT:[[TMP6:%.*]] = shufflevector <2 x double> [[TMP4]], <2 x 
double> poison, <8 x i32> 
+; CHECK-NEXT:[[TMP7:%.*]] = shufflevector <8 x double> [[TMP6]], <8 x 
double> , 
<8 x i32> 
+; CHECK-NEXT:[[TMP8:%.*]] = insertelement <8 x double> [[TMP7]], double 
[[TMP2]], i32 3
+; CHECK-NEXT:[[TMP9:%.*]] = shufflevector <4 x double> [[TMP1]], <4 x 
double> poison, <8 x i32> 
+; CHECK-NEXT:[[TMP10:%.*]] = shufflevector <8 x double> [[TMP9]], <8 x 
double> , <8 x i32> 
+; CHECK-NEXT:[[TMP11:%.*]] = insertelement <8 x double> [[TMP10]], double 
[[TMP5]], i32 6
+; CHECK-NEXT:[[TMP12:%.*]] = fmul <8 x double> [[TMP8]], [[TMP11]]
+; CHECK-NEXT:[[TMP13:%.*]] = fadd <8 x double> zeroinitializer, [[TMP12]]
+; CHECK-NEXT:[[TMP14:%.*]] = fadd <8 x double> [[TMP13]], zeroinitializer
+; CHECK-NEXT:[[TMP15:%.*]] = fcmp ult <8 x double> [[TMP14]], 
zeroinitializer
+; CHECK-NEXT:[[TMP16:%.*]] = freeze <8 x i1> [[TMP15]]
+; CHECK-NEXT:[[TMP17:%.*]] = call i1 @llvm.vector.reduce.and.v8i1(<8 x i1> 
[[TMP16]])
+; CHECK-NEXT:br i1 [[TMP17]], label [[BB58:%.*]], label [[BB115:%.*]]
+; CHECK:   bb115:
+; CHECK-NEXT:[[TMP18:%.*]] = fmul <2 x double> zeroinitializer, [[TMP4]]
+; CHECK-NEXT:[[TMP19:%.*]] = extractelement <2 x double> [[TMP18]], i32 0
+; CHECK-NEXT:[[TMP20:%.*]] = extractelement <2 x double> [[TMP18]], i32 1
+; CHECK-NEXT:[[I118:%.*]] = fadd double [[TMP19]], [[TMP20]]
+; CHECK-NEXT:[[TMP21:%.*]] = fmul <4 x double> zeroinitializer, [[TMP1]]
+; CHECK-NEXT:[[TMP22:%.*]] = shufflevector <2 x double> [[TMP4]], <2 x 
double> poison, <4 x i32> 
+; CHECK-NEXT:[[TMP23:%.*]] = shufflevector <4 x double> , <4 

[llvm-branch-commits] [llvm] release/18.x: [RISCV] Use APInt in useInversedSetcc to prevent crashes when mask is larger than UINT64_MAX. (#81888) (PR #81905)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81905

From 2138f4734df035219b3071f31c08e2e46df14837 Mon Sep 17 00:00:00 2001
From: Craig Topper 
Date: Thu, 15 Feb 2024 10:48:52 -0800
Subject: [PATCH] [RISCV] Use APInt in useInversedSetcc to prevent crashes when
 mask is larger than UINT64_MAX. (#81888)

There are no checks that the type is legal so we need to handle any
type.

(cherry picked from commit b57ba8ec514190b38eced26d541e8e25af66c485)
---
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp |  4 +-
 llvm/test/CodeGen/RISCV/condops.ll  | 51 +
 2 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index dba4df77663b07..37d94be5316eea 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -14654,8 +14654,8 @@ static SDValue useInversedSetcc(SDNode *N, SelectionDAG &DAG,
 ISD::CondCode CC = cast<CondCodeSDNode>(Cond.getOperand(2))->get();
 if (CC == ISD::SETEQ && LHS.getOpcode() == ISD::AND &&
 isa<ConstantSDNode>(LHS.getOperand(1)) && isNullConstant(RHS)) {
-  uint64_t MaskVal = LHS.getConstantOperandVal(1);
-  if (isPowerOf2_64(MaskVal) && !isInt<12>(MaskVal))
+  const APInt &MaskVal = LHS.getConstantOperandAPInt(1);
+  if (MaskVal.isPowerOf2() && !MaskVal.isSignedIntN(12))
 return DAG.getSelect(DL, VT,
  DAG.getSetCC(DL, CondVT, LHS, RHS, ISD::SETNE),
  False, True);
diff --git a/llvm/test/CodeGen/RISCV/condops.ll 
b/llvm/test/CodeGen/RISCV/condops.ll
index 8e53782b5dcd78..101cb5aeeb0940 100644
--- a/llvm/test/CodeGen/RISCV/condops.ll
+++ b/llvm/test/CodeGen/RISCV/condops.ll
@@ -3719,3 +3719,54 @@ entry:
   %cond = select i1 %tobool.not, i64 0, i64 %x
   ret i64 %cond
 }
+
+; Test that we don't crash on types larger than 64 bits.
+define i64 @single_bit3(i80 %x, i64 %y) {
+; RV32I-LABEL: single_bit3:
+; RV32I:   # %bb.0: # %entry
+; RV32I-NEXT:lw a0, 8(a0)
+; RV32I-NEXT:slli a0, a0, 31
+; RV32I-NEXT:srai a3, a0, 31
+; RV32I-NEXT:and a0, a3, a1
+; RV32I-NEXT:and a1, a3, a2
+; RV32I-NEXT:ret
+;
+; RV64I-LABEL: single_bit3:
+; RV64I:   # %bb.0: # %entry
+; RV64I-NEXT:slli a1, a1, 63
+; RV64I-NEXT:srai a0, a1, 63
+; RV64I-NEXT:and a0, a0, a2
+; RV64I-NEXT:ret
+;
+; RV64XVENTANACONDOPS-LABEL: single_bit3:
+; RV64XVENTANACONDOPS:   # %bb.0: # %entry
+; RV64XVENTANACONDOPS-NEXT:andi a1, a1, 1
+; RV64XVENTANACONDOPS-NEXT:vt.maskc a0, a2, a1
+; RV64XVENTANACONDOPS-NEXT:ret
+;
+; RV64XTHEADCONDMOV-LABEL: single_bit3:
+; RV64XTHEADCONDMOV:   # %bb.0: # %entry
+; RV64XTHEADCONDMOV-NEXT:slli a1, a1, 63
+; RV64XTHEADCONDMOV-NEXT:srai a0, a1, 63
+; RV64XTHEADCONDMOV-NEXT:and a0, a0, a2
+; RV64XTHEADCONDMOV-NEXT:ret
+;
+; RV32ZICOND-LABEL: single_bit3:
+; RV32ZICOND:   # %bb.0: # %entry
+; RV32ZICOND-NEXT:lw a0, 8(a0)
+; RV32ZICOND-NEXT:andi a3, a0, 1
+; RV32ZICOND-NEXT:czero.eqz a0, a1, a3
+; RV32ZICOND-NEXT:czero.eqz a1, a2, a3
+; RV32ZICOND-NEXT:ret
+;
+; RV64ZICOND-LABEL: single_bit3:
+; RV64ZICOND:   # %bb.0: # %entry
+; RV64ZICOND-NEXT:andi a1, a1, 1
+; RV64ZICOND-NEXT:czero.eqz a0, a2, a1
+; RV64ZICOND-NEXT:ret
+entry:
+  %and = and i80 %x, 18446744073709551616 ; 1 << 64
+  %tobool.not = icmp eq i80 %and, 0
+  %cond = select i1 %tobool.not, i64 0, i64 %y
+  ret i64 %cond
+}
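
The test above uses the mask `18446744073709551616` (1 << 64) on an `i80` value, which does not fit in a `uint64_t`. As a rough illustration of why the check has to be done at the full width of the constant, here is a self-contained sketch that models an 80-bit value as two words. `Wide80` and the helper names are hypothetical; the patch itself relies on `APInt::isPowerOf2()` and `APInt::isSignedIntN(12)`, which this sketch only imitates for one fixed width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an 80-bit constant:
// Lo holds bits 0..63, Hi holds bits 64..79.
struct Wide80 {
  uint64_t Lo;
  uint16_t Hi;
};

// True when the value cannot be held in a uint64_t without losing bits.
bool needsMoreThan64Bits(const Wide80 &V) { return V.Hi != 0; }

// Exactly one bit is set across both words.
bool isPowerOf2(const Wide80 &V) {
  auto OneBit = [](uint64_t X) { return X != 0 && (X & (X - 1)) == 0; };
  return (V.Hi == 0 && OneBit(V.Lo)) || (V.Lo == 0 && OneBit(V.Hi));
}

// True when V, read as a signed 80-bit value, lies in [-2048, 2047].
bool fitsSImm12(const Wide80 &V) {
  bool SmallNonNegative = V.Hi == 0 && V.Lo <= 2047;
  bool SmallNegative = V.Hi == 0xFFFF && V.Lo >= 0xFFFFFFFFFFFFF800ULL;
  return SmallNonNegative || SmallNegative;
}

// Same shape as the patched condition: a single-bit mask that is too
// large for a 12-bit signed immediate.
bool shouldInvertSetcc(const Wide80 &Mask) {
  return isPowerOf2(Mask) && !fitsSImm12(Mask);
}
```

With `Wide80{0, 1}` standing for 1 << 64, `needsMoreThan64Bits` is true and `shouldInvertSetcc` still gives a well-defined answer, which is the property the `APInt`-based version provides for any width.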



[llvm-branch-commits] [llvm] release/18.x: [RISCV] Use APInt in useInversedSetcc to prevent crashes when mask is larger than UINT64_MAX. (#81888) (PR #81905)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81905

From 023925bcdfbc06941edaa64ba789dbad2bca2ce1 Mon Sep 17 00:00:00 2001
From: Craig Topper 
Date: Thu, 15 Feb 2024 10:48:52 -0800
Subject: [PATCH] [RISCV] Use APInt in useInversedSetcc to prevent crashes when
 mask is larger than UINT64_MAX. (#81888)

There are no checks that the type is legal so we need to handle any
type.

(cherry picked from commit b57ba8ec514190b38eced26d541e8e25af66c485)
---
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp |  4 +-
 llvm/test/CodeGen/RISCV/condops.ll  | 51 +
 2 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index dba4df77663b07..37d94be5316eea 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -14654,8 +14654,8 @@ static SDValue useInversedSetcc(SDNode *N, SelectionDAG &DAG,
 ISD::CondCode CC = cast<CondCodeSDNode>(Cond.getOperand(2))->get();
 if (CC == ISD::SETEQ && LHS.getOpcode() == ISD::AND &&
 isa<ConstantSDNode>(LHS.getOperand(1)) && isNullConstant(RHS)) {
-  uint64_t MaskVal = LHS.getConstantOperandVal(1);
-  if (isPowerOf2_64(MaskVal) && !isInt<12>(MaskVal))
+  const APInt &MaskVal = LHS.getConstantOperandAPInt(1);
+  if (MaskVal.isPowerOf2() && !MaskVal.isSignedIntN(12))
 return DAG.getSelect(DL, VT,
  DAG.getSetCC(DL, CondVT, LHS, RHS, ISD::SETNE),
  False, True);
diff --git a/llvm/test/CodeGen/RISCV/condops.ll 
b/llvm/test/CodeGen/RISCV/condops.ll
index 8e53782b5dcd78..101cb5aeeb0940 100644
--- a/llvm/test/CodeGen/RISCV/condops.ll
+++ b/llvm/test/CodeGen/RISCV/condops.ll
@@ -3719,3 +3719,54 @@ entry:
   %cond = select i1 %tobool.not, i64 0, i64 %x
   ret i64 %cond
 }
+
+; Test that we don't crash on types larger than 64 bits.
+define i64 @single_bit3(i80 %x, i64 %y) {
+; RV32I-LABEL: single_bit3:
+; RV32I:   # %bb.0: # %entry
+; RV32I-NEXT:lw a0, 8(a0)
+; RV32I-NEXT:slli a0, a0, 31
+; RV32I-NEXT:srai a3, a0, 31
+; RV32I-NEXT:and a0, a3, a1
+; RV32I-NEXT:and a1, a3, a2
+; RV32I-NEXT:ret
+;
+; RV64I-LABEL: single_bit3:
+; RV64I:   # %bb.0: # %entry
+; RV64I-NEXT:slli a1, a1, 63
+; RV64I-NEXT:srai a0, a1, 63
+; RV64I-NEXT:and a0, a0, a2
+; RV64I-NEXT:ret
+;
+; RV64XVENTANACONDOPS-LABEL: single_bit3:
+; RV64XVENTANACONDOPS:   # %bb.0: # %entry
+; RV64XVENTANACONDOPS-NEXT:andi a1, a1, 1
+; RV64XVENTANACONDOPS-NEXT:vt.maskc a0, a2, a1
+; RV64XVENTANACONDOPS-NEXT:ret
+;
+; RV64XTHEADCONDMOV-LABEL: single_bit3:
+; RV64XTHEADCONDMOV:   # %bb.0: # %entry
+; RV64XTHEADCONDMOV-NEXT:slli a1, a1, 63
+; RV64XTHEADCONDMOV-NEXT:srai a0, a1, 63
+; RV64XTHEADCONDMOV-NEXT:and a0, a0, a2
+; RV64XTHEADCONDMOV-NEXT:ret
+;
+; RV32ZICOND-LABEL: single_bit3:
+; RV32ZICOND:   # %bb.0: # %entry
+; RV32ZICOND-NEXT:lw a0, 8(a0)
+; RV32ZICOND-NEXT:andi a3, a0, 1
+; RV32ZICOND-NEXT:czero.eqz a0, a1, a3
+; RV32ZICOND-NEXT:czero.eqz a1, a2, a3
+; RV32ZICOND-NEXT:ret
+;
+; RV64ZICOND-LABEL: single_bit3:
+; RV64ZICOND:   # %bb.0: # %entry
+; RV64ZICOND-NEXT:andi a1, a1, 1
+; RV64ZICOND-NEXT:czero.eqz a0, a2, a1
+; RV64ZICOND-NEXT:ret
+entry:
+  %and = and i80 %x, 18446744073709551616 ; 1 << 64
+  %tobool.not = icmp eq i80 %and, 0
+  %cond = select i1 %tobool.not, i64 0, i64 %y
+  ret i64 %cond
+}



[llvm-branch-commits] [llvm] 023925b - [RISCV] Use APInt in useInversedSetcc to prevent crashes when mask is larger than UINT64_MAX. (#81888)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Craig Topper
Date: 2024-02-16T04:37:03-08:00
New Revision: 023925bcdfbc06941edaa64ba789dbad2bca2ce1

URL: 
https://github.com/llvm/llvm-project/commit/023925bcdfbc06941edaa64ba789dbad2bca2ce1
DIFF: 
https://github.com/llvm/llvm-project/commit/023925bcdfbc06941edaa64ba789dbad2bca2ce1.diff

LOG: [RISCV] Use APInt in useInversedSetcc to prevent crashes when mask is 
larger than UINT64_MAX. (#81888)

There are no checks that the type is legal so we need to handle any
type.

(cherry picked from commit b57ba8ec514190b38eced26d541e8e25af66c485)

Added: 


Modified: 
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
llvm/test/CodeGen/RISCV/condops.ll

Removed: 




diff  --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index dba4df77663b07..37d94be5316eea 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -14654,8 +14654,8 @@ static SDValue useInversedSetcc(SDNode *N, SelectionDAG &DAG,
 ISD::CondCode CC = cast<CondCodeSDNode>(Cond.getOperand(2))->get();
 if (CC == ISD::SETEQ && LHS.getOpcode() == ISD::AND &&
 isa<ConstantSDNode>(LHS.getOperand(1)) && isNullConstant(RHS)) {
-  uint64_t MaskVal = LHS.getConstantOperandVal(1);
-  if (isPowerOf2_64(MaskVal) && !isInt<12>(MaskVal))
+  const APInt &MaskVal = LHS.getConstantOperandAPInt(1);
+  if (MaskVal.isPowerOf2() && !MaskVal.isSignedIntN(12))
 return DAG.getSelect(DL, VT,
  DAG.getSetCC(DL, CondVT, LHS, RHS, ISD::SETNE),
  False, True);

diff  --git a/llvm/test/CodeGen/RISCV/condops.ll b/llvm/test/CodeGen/RISCV/condops.ll
index 8e53782b5dcd78..101cb5aeeb0940 100644
--- a/llvm/test/CodeGen/RISCV/condops.ll
+++ b/llvm/test/CodeGen/RISCV/condops.ll
@@ -3719,3 +3719,54 @@ entry:
   %cond = select i1 %tobool.not, i64 0, i64 %x
   ret i64 %cond
 }
+
+; Test that we don't crash on types larger than 64 bits.
+define i64 @single_bit3(i80 %x, i64 %y) {
+; RV32I-LABEL: single_bit3:
+; RV32I:   # %bb.0: # %entry
+; RV32I-NEXT:lw a0, 8(a0)
+; RV32I-NEXT:slli a0, a0, 31
+; RV32I-NEXT:srai a3, a0, 31
+; RV32I-NEXT:and a0, a3, a1
+; RV32I-NEXT:and a1, a3, a2
+; RV32I-NEXT:ret
+;
+; RV64I-LABEL: single_bit3:
+; RV64I:   # %bb.0: # %entry
+; RV64I-NEXT:slli a1, a1, 63
+; RV64I-NEXT:srai a0, a1, 63
+; RV64I-NEXT:and a0, a0, a2
+; RV64I-NEXT:ret
+;
+; RV64XVENTANACONDOPS-LABEL: single_bit3:
+; RV64XVENTANACONDOPS:   # %bb.0: # %entry
+; RV64XVENTANACONDOPS-NEXT:andi a1, a1, 1
+; RV64XVENTANACONDOPS-NEXT:vt.maskc a0, a2, a1
+; RV64XVENTANACONDOPS-NEXT:ret
+;
+; RV64XTHEADCONDMOV-LABEL: single_bit3:
+; RV64XTHEADCONDMOV:   # %bb.0: # %entry
+; RV64XTHEADCONDMOV-NEXT:slli a1, a1, 63
+; RV64XTHEADCONDMOV-NEXT:srai a0, a1, 63
+; RV64XTHEADCONDMOV-NEXT:and a0, a0, a2
+; RV64XTHEADCONDMOV-NEXT:ret
+;
+; RV32ZICOND-LABEL: single_bit3:
+; RV32ZICOND:   # %bb.0: # %entry
+; RV32ZICOND-NEXT:lw a0, 8(a0)
+; RV32ZICOND-NEXT:andi a3, a0, 1
+; RV32ZICOND-NEXT:czero.eqz a0, a1, a3
+; RV32ZICOND-NEXT:czero.eqz a1, a2, a3
+; RV32ZICOND-NEXT:ret
+;
+; RV64ZICOND-LABEL: single_bit3:
+; RV64ZICOND:   # %bb.0: # %entry
+; RV64ZICOND-NEXT:andi a1, a1, 1
+; RV64ZICOND-NEXT:czero.eqz a0, a2, a1
+; RV64ZICOND-NEXT:ret
+entry:
+  %and = and i80 %x, 18446744073709551616 ; 1 << 64
+  %tobool.not = icmp eq i80 %and, 0
+  %cond = select i1 %tobool.not, i64 0, i64 %y
+  ret i64 %cond
+}





[llvm-branch-commits] [llvm] release/18.x: [RISCV] Use APInt in useInversedSetcc to prevent crashes when mask is larger than UINT64_MAX. (#81888) (PR #81905)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81905


[llvm-branch-commits] [llvm] release/18.x: [RISCV] Make sure ADDI replacement in optimizeCondBranch has a virtual reg destination. (#81938) (PR #81953)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81953

>From 38c5b352c6f3b26632f40faa17d07c2bfab88a2d Mon Sep 17 00:00:00 2001
From: Craig Topper 
Date: Thu, 15 Feb 2024 16:34:40 -0800
Subject: [PATCH] [RISCV] Make sure ADDI replacement in optimizeCondBranch has
 a virtual reg destination. (#81938)

If it isn't virtual, we may extend the live range of the physical
register past where it is valid. For example, across a call.

Found while trying to enable -riscv-enable-sink-fold which enables some
copy propagation in machine sink that led to ADDIs with physical
register destinations.

(cherry picked from commit feee627974df81e4cbf15537e4c4688aed66b12f)
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp |  3 +-
 llvm/test/CodeGen/RISCV/branch-opt.mir   | 68 
 2 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/RISCV/branch-opt.mir

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 592962cebe8973..d5b1ddfbeb3dc9 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -1229,7 +1229,8 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
 MachineBasicBlock::reverse_iterator II(&MI), E = MBB->rend();
 auto DefC1 = std::find_if(++II, E, [&](const MachineInstr &I) -> bool {
   int64_t Imm;
-  return isLoadImm(&I, Imm) && Imm == C1;
+  return isLoadImm(&I, Imm) && Imm == C1 &&
+ I.getOperand(0).getReg().isVirtual();
 });
 if (DefC1 != E)
   return DefC1->getOperand(0).getReg();
diff --git a/llvm/test/CodeGen/RISCV/branch-opt.mir b/llvm/test/CodeGen/RISCV/branch-opt.mir
new file mode 100644
index 00..ba3a20f2fbfcd3
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/branch-opt.mir
@@ -0,0 +1,68 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc %s -mtriple=riscv64 -run-pass=peephole-opt -o - | FileCheck %s
+
+# Make sure we shouldn't replace the %2 ADDI with the $x10 ADDI since it has a
+# physical register destination.
+
+--- |
+  define void @foo(i32 signext %0) {
+tail call void @bar(i32 1)
+%2 = icmp ugt i32 %0, 1
+br i1 %2, label %3, label %4
+
+  3:; preds = %1
+tail call void @bar(i32 3)
+ret void
+
+  4:; preds = %1
+ret void
+  }
+
+  declare void @bar(...)
+
+...
+---
+name:foo
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: foo
+  ; CHECK: bb.0 (%ir-block.1):
+  ; CHECK-NEXT:   successors: %bb.1(0x4000), %bb.2(0x4000)
+  ; CHECK-NEXT:   liveins: $x10
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr = COPY $x10
+  ; CHECK-NEXT:   ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+  ; CHECK-NEXT:   $x10 = ADDI $x0, 1
+  ; CHECK-NEXT:   PseudoCALL target-flags(riscv-call) @bar, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+  ; CHECK-NEXT:   ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+  ; CHECK-NEXT:   [[ADDI:%[0-9]+]]:gpr = ADDI $x0, 2
+  ; CHECK-NEXT:   BLTU [[COPY]], killed [[ADDI]], %bb.2
+  ; CHECK-NEXT:   PseudoBR %bb.1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1 (%ir-block.3):
+  ; CHECK-NEXT:   $x10 = ADDI $x0, 3
+  ; CHECK-NEXT:   PseudoTAIL target-flags(riscv-call) @bar, implicit $x2, implicit $x10
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2 (%ir-block.4):
+  ; CHECK-NEXT:   PseudoRET
+  bb.0 (%ir-block.1):
+successors: %bb.1, %bb.2
+liveins: $x10
+
+%0:gpr = COPY $x10
+ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+$x10 = ADDI $x0, 1
+PseudoCALL target-flags(riscv-call) @bar, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+%2:gpr = ADDI $x0, 2
+BLTU %0, killed %2, %bb.2
+PseudoBR %bb.1
+
+  bb.1 (%ir-block.3):
+$x10 = ADDI $x0, 3
+PseudoTAIL target-flags(riscv-call) @bar, implicit $x2, implicit $x10
+
+  bb.2 (%ir-block.4):
+PseudoRET
+
+...



[llvm-branch-commits] [llvm] 38c5b35 - [RISCV] Make sure ADDI replacement in optimizeCondBranch has a virtual reg destination. (#81938)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Craig Topper
Date: 2024-02-16T04:38:41-08:00
New Revision: 38c5b352c6f3b26632f40faa17d07c2bfab88a2d

URL: 
https://github.com/llvm/llvm-project/commit/38c5b352c6f3b26632f40faa17d07c2bfab88a2d
DIFF: 
https://github.com/llvm/llvm-project/commit/38c5b352c6f3b26632f40faa17d07c2bfab88a2d.diff

LOG: [RISCV] Make sure ADDI replacement in optimizeCondBranch has a virtual reg destination. (#81938)

If it isn't virtual, we may extend the live range of the physical
register past where it is valid. For example, across a call.

Found while trying to enable -riscv-enable-sink-fold which enables some
copy propagation in machine sink that led to ADDIs with physical
register destinations.

(cherry picked from commit feee627974df81e4cbf15537e4c4688aed66b12f)
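
The added condition can be sketched outside of LLVM as a backwards search with one extra predicate term. Everything below is a simplified model made up for illustration: `InstrModel` and `findReusableLoadImm` are hypothetical names, not `llvm::MachineInstr` or the real `optimizeCondBranch` code. It only shows that a load-immediate whose destination is a physical register is no longer a candidate for reuse.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a machine instruction.
struct InstrModel {
  bool IsLoadImm;     // true for an "ADDI dest, x0, imm" style load-immediate
  int64_t Imm;        // the immediate being materialized
  bool DestIsVirtual; // true for %N registers, false for physical ones like $x10
  int DestId;         // register number, for illustration only
};

// Scan the block backwards and return the DestId of a load-immediate of C
// that is safe to reuse, or -1 if there is none. Requiring a virtual
// destination mirrors the condition added by the patch.
inline int findReusableLoadImm(const std::vector<InstrModel> &Block, int64_t C) {
  auto It = std::find_if(Block.rbegin(), Block.rend(),
                         [&](const InstrModel &I) {
                           return I.IsLoadImm && I.Imm == C && I.DestIsVirtual;
                         });
  return It == Block.rend() ? -1 : It->DestId;
}
```

Modeling the test input as `$x10 = ADDI $x0, 1`, a call, and `%2 = ADDI $x0, 2`, a search for the constant 1 finds nothing reusable, because the only match writes the physical register `$x10`, whose value is not guaranteed to survive the call.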

Added: 
llvm/test/CodeGen/RISCV/branch-opt.mir

Modified: 
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp

Removed: 




diff  --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 592962cebe8973..d5b1ddfbeb3dc9 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -1229,7 +1229,8 @@ bool RISCVInstrInfo::optimizeCondBranch(MachineInstr &MI) const {
 MachineBasicBlock::reverse_iterator II(&MI), E = MBB->rend();
 auto DefC1 = std::find_if(++II, E, [&](const MachineInstr &I) -> bool {
   int64_t Imm;
-  return isLoadImm(&I, Imm) && Imm == C1;
+  return isLoadImm(&I, Imm) && Imm == C1 &&
+ I.getOperand(0).getReg().isVirtual();
 });
 if (DefC1 != E)
   return DefC1->getOperand(0).getReg();

diff  --git a/llvm/test/CodeGen/RISCV/branch-opt.mir b/llvm/test/CodeGen/RISCV/branch-opt.mir
new file mode 100644
index 00..ba3a20f2fbfcd3
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/branch-opt.mir
@@ -0,0 +1,68 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+# RUN: llc %s -mtriple=riscv64 -run-pass=peephole-opt -o - | FileCheck %s
+
+# Make sure we shouldn't replace the %2 ADDI with the $x10 ADDI since it has a
+# physical register destination.
+
+--- |
+  define void @foo(i32 signext %0) {
+tail call void @bar(i32 1)
+%2 = icmp ugt i32 %0, 1
+br i1 %2, label %3, label %4
+
+  3:; preds = %1
+tail call void @bar(i32 3)
+ret void
+
+  4:; preds = %1
+ret void
+  }
+
+  declare void @bar(...)
+
+...
+---
+name:foo
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: foo
+  ; CHECK: bb.0 (%ir-block.1):
+  ; CHECK-NEXT:   successors: %bb.1(0x4000), %bb.2(0x4000)
+  ; CHECK-NEXT:   liveins: $x10
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr = COPY $x10
+  ; CHECK-NEXT:   ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+  ; CHECK-NEXT:   $x10 = ADDI $x0, 1
+  ; CHECK-NEXT:   PseudoCALL target-flags(riscv-call) @bar, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+  ; CHECK-NEXT:   ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+  ; CHECK-NEXT:   [[ADDI:%[0-9]+]]:gpr = ADDI $x0, 2
+  ; CHECK-NEXT:   BLTU [[COPY]], killed [[ADDI]], %bb.2
+  ; CHECK-NEXT:   PseudoBR %bb.1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1 (%ir-block.3):
+  ; CHECK-NEXT:   $x10 = ADDI $x0, 3
+  ; CHECK-NEXT:   PseudoTAIL target-flags(riscv-call) @bar, implicit $x2, implicit $x10
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2 (%ir-block.4):
+  ; CHECK-NEXT:   PseudoRET
+  bb.0 (%ir-block.1):
+successors: %bb.1, %bb.2
+liveins: $x10
+
+%0:gpr = COPY $x10
+ADJCALLSTACKDOWN 0, 0, implicit-def dead $x2, implicit $x2
+$x10 = ADDI $x0, 1
+PseudoCALL target-flags(riscv-call) @bar, csr_ilp32_lp64, implicit-def dead $x1, implicit $x10, implicit-def $x2
+ADJCALLSTACKUP 0, 0, implicit-def dead $x2, implicit $x2
+%2:gpr = ADDI $x0, 2
+BLTU %0, killed %2, %bb.2
+PseudoBR %bb.1
+
+  bb.1 (%ir-block.3):
+$x10 = ADDI $x0, 3
+PseudoTAIL target-flags(riscv-call) @bar, implicit $x2, implicit $x10
+
+  bb.2 (%ir-block.4):
+PseudoRET
+
+...





[llvm-branch-commits] [llvm] release/18.x: [RISCV] Make sure ADDI replacement in optimizeCondBranch has a virtual reg destination. (#81938) (PR #81953)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81953


[llvm-branch-commits] [openmp] release/18.x: [OpenMP] [cmake] Don't use -fno-semantic-interposition on Windows (#81113) (PR #81332)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81332

>From d7c6794aff6625c420a719d64402827cbae55292 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20Storsj=C3=B6?= 
Date: Thu, 8 Feb 2024 15:28:46 +0200
Subject: [PATCH] [OpenMP] [cmake] Don't use -fno-semantic-interposition on
 Windows (#81113)

This was added in 4b7beab4187ab0766c3d7b272511d5751431a8da. When the
flag was added implicitly elsewhere, it was added via
llvm/cmake/modules/HandleLLVMOptions.cmake, where it wasn't added on
Windows/Cygwin targets.

This avoids one warning per object file in OpenMP.

(cherry picked from commit 72f04fa0734f8559ad515f507a4a3ce3f461f196)
---
 openmp/cmake/HandleOpenMPOptions.cmake | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/openmp/cmake/HandleOpenMPOptions.cmake b/openmp/cmake/HandleOpenMPOptions.cmake
index 71346201129b68..9387d9b3b0ff75 100644
--- a/openmp/cmake/HandleOpenMPOptions.cmake
+++ b/openmp/cmake/HandleOpenMPOptions.cmake
@@ -46,7 +46,11 @@ append_if(OPENMP_HAVE_WEXTRA_FLAG "-Wno-extra" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 append_if(OPENMP_HAVE_WPEDANTIC_FLAG "-Wno-pedantic" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 append_if(OPENMP_HAVE_WMAYBE_UNINITIALIZED_FLAG "-Wno-maybe-uninitialized" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 
-append_if(OPENMP_HAVE_NO_SEMANTIC_INTERPOSITION "-fno-semantic-interposition" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
+if (NOT (WIN32 OR CYGWIN))
+  # This flag is not relevant on Windows; the flag is accepted, but produces warnings
+  # about argument unused during compilation.
+  append_if(OPENMP_HAVE_NO_SEMANTIC_INTERPOSITION "-fno-semantic-interposition" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
+endif()
 append_if(OPENMP_HAVE_FUNCTION_SECTIONS "-ffunction-section" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 append_if(OPENMP_HAVE_DATA_SECTIONS "-fdata-sections" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 



[llvm-branch-commits] [openmp] d7c6794 - [OpenMP] [cmake] Don't use -fno-semantic-interposition on Windows (#81113)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Martin Storsjö
Date: 2024-02-16T04:40:41-08:00
New Revision: d7c6794aff6625c420a719d64402827cbae55292

URL: 
https://github.com/llvm/llvm-project/commit/d7c6794aff6625c420a719d64402827cbae55292
DIFF: 
https://github.com/llvm/llvm-project/commit/d7c6794aff6625c420a719d64402827cbae55292.diff

LOG: [OpenMP] [cmake] Don't use -fno-semantic-interposition on Windows (#81113)

This was added in 4b7beab4187ab0766c3d7b272511d5751431a8da. When the
flag was added implicitly elsewhere, it was added via
llvm/cmake/modules/HandleLLVMOptions.cmake, where it wasn't added on
Windows/Cygwin targets.

This avoids one warning per object file in OpenMP.

(cherry picked from commit 72f04fa0734f8559ad515f507a4a3ce3f461f196)

Added: 


Modified: 
openmp/cmake/HandleOpenMPOptions.cmake

Removed: 




diff  --git a/openmp/cmake/HandleOpenMPOptions.cmake b/openmp/cmake/HandleOpenMPOptions.cmake
index 71346201129b68..9387d9b3b0ff75 100644
--- a/openmp/cmake/HandleOpenMPOptions.cmake
+++ b/openmp/cmake/HandleOpenMPOptions.cmake
@@ -46,7 +46,11 @@ append_if(OPENMP_HAVE_WEXTRA_FLAG "-Wno-extra" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 append_if(OPENMP_HAVE_WPEDANTIC_FLAG "-Wno-pedantic" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 append_if(OPENMP_HAVE_WMAYBE_UNINITIALIZED_FLAG "-Wno-maybe-uninitialized" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 
-append_if(OPENMP_HAVE_NO_SEMANTIC_INTERPOSITION "-fno-semantic-interposition" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
+if (NOT (WIN32 OR CYGWIN))
+  # This flag is not relevant on Windows; the flag is accepted, but produces warnings
+  # about argument unused during compilation.
+  append_if(OPENMP_HAVE_NO_SEMANTIC_INTERPOSITION "-fno-semantic-interposition" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
+endif()
 append_if(OPENMP_HAVE_FUNCTION_SECTIONS "-ffunction-section" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 append_if(OPENMP_HAVE_DATA_SECTIONS "-fdata-sections" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
 





[llvm-branch-commits] [openmp] release/18.x: [OpenMP] [cmake] Don't use -fno-semantic-interposition on Windows (#81113) (PR #81332)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81332


[llvm-branch-commits] [llvm] release/18.x: [AArch64] Only apply bool vector bitcast opt if result is scalar (#81256) (PR #81454)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81454

>From e098f6c4aaccec326a2fc4b45323b3822e02c270 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Mon, 12 Feb 2024 10:00:34 +0100
Subject: [PATCH] [AArch64] Only apply bool vector bitcast opt if result is
 scalar (#81256)

This optimization tries to optimize bitcasts from `<N x i1>` to iN, but
currently also triggers for `<N x i1>` to `<M x iK>` bitcasts, if custom
lowering has been requested for these for an unrelated reason. Fix this
by explicitly checking that the result type is scalar.

Fixes https://github.com/llvm/llvm-project/issues/81216.

(cherry picked from commit 92d79922051f732560acf3791b543df1e6580689)
---
 .../Target/AArch64/AArch64ISelLowering.cpp|  3 +-
 .../AArch64/vec-combine-compare-to-bitmask.ll | 28 +++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index bfce5bc92a9ad1..0287856560e91a 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -24427,7 +24427,8 @@ void AArch64TargetLowering::ReplaceBITCASTResults(
 return;
   }
 
-  if (SrcVT.isVector() && SrcVT.getVectorElementType() == MVT::i1)
+  if (SrcVT.isVector() && SrcVT.getVectorElementType() == MVT::i1 &&
+  !VT.isVector())
 return replaceBoolVectorBitcast(N, Results, DAG);
 
   if (VT != MVT::i16 || (SrcVT != MVT::f16 && SrcVT != MVT::bf16))
diff --git a/llvm/test/CodeGen/AArch64/vec-combine-compare-to-bitmask.ll b/llvm/test/CodeGen/AArch64/vec-combine-compare-to-bitmask.ll
index 1b22e2f900ddb7..557aa010b3a7d9 100644
--- a/llvm/test/CodeGen/AArch64/vec-combine-compare-to-bitmask.ll
+++ b/llvm/test/CodeGen/AArch64/vec-combine-compare-to-bitmask.ll
@@ -489,3 +489,31 @@ define i6 @no_combine_illegal_num_elements(<6 x i32> %vec) {
   %bitmask = bitcast <6 x i1> %cmp_result to i6
   ret i6 %bitmask
 }
+
+; Only apply the combine when casting a vector to a scalar.
+define <2 x i8> @vector_to_vector_cast(<16 x i1> %arg) nounwind {
+; CHECK-LABEL: vector_to_vector_cast:
+; CHECK:   ; %bb.0:
+; CHECK-NEXT:sub sp, sp, #16
+; CHECK-NEXT:shl.16b v0, v0, #7
+; CHECK-NEXT:  Lloh36:
+; CHECK-NEXT:adrp x8, lCPI20_0@PAGE
+; CHECK-NEXT:  Lloh37:
+; CHECK-NEXT:ldr q1, [x8, lCPI20_0@PAGEOFF]
+; CHECK-NEXT:add x8, sp, #14
+; CHECK-NEXT:cmlt.16b v0, v0, #0
+; CHECK-NEXT:and.16b v0, v0, v1
+; CHECK-NEXT:ext.16b v1, v0, v0, #8
+; CHECK-NEXT:zip1.16b v0, v0, v1
+; CHECK-NEXT:addv.8h h0, v0
+; CHECK-NEXT:str h0, [sp, #14]
+; CHECK-NEXT:ld1.b { v0 }[0], [x8]
+; CHECK-NEXT:orr x8, x8, #0x1
+; CHECK-NEXT:ld1.b { v0 }[4], [x8]
+; CHECK-NEXT:; kill: def $d0 killed $d0 killed $q0
+; CHECK-NEXT:add sp, sp, #16
+; CHECK-NEXT:ret
+; CHECK-NEXT:.loh AdrpLdr Lloh36, Lloh37
+  %bc = bitcast <16 x i1> %arg to <2 x i8>
+  ret <2 x i8> %bc
+}



[llvm-branch-commits] [llvm] e098f6c - [AArch64] Only apply bool vector bitcast opt if result is scalar (#81256)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2024-02-16T04:50:47-08:00
New Revision: e098f6c4aaccec326a2fc4b45323b3822e02c270

URL: 
https://github.com/llvm/llvm-project/commit/e098f6c4aaccec326a2fc4b45323b3822e02c270
DIFF: 
https://github.com/llvm/llvm-project/commit/e098f6c4aaccec326a2fc4b45323b3822e02c270.diff

LOG: [AArch64] Only apply bool vector bitcast opt if result is scalar (#81256)

This optimization tries to optimize bitcasts from `<N x i1>` to iN, but
currently also triggers for `<N x i1>` to `<M x iK>` bitcasts, if custom
lowering has been requested for these for an unrelated reason. Fix this
by explicitly checking that the result type is scalar.

Fixes https://github.com/llvm/llvm-project/issues/81216.

(cherry picked from commit 92d79922051f732560acf3791b543df1e6580689)
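
As a rough model of the tightened guard, the sketch below uses a hypothetical `TypeModel` descriptor (not `llvm::EVT`) and a made-up helper name. It assumes only what the diff shows: the bool-vector path is taken when the source is a vector of `i1` and the result is not a vector.

```cpp
#include <cassert>

// Hypothetical, simplified value-type descriptor.
struct TypeModel {
  bool IsVector;
  unsigned NumElts; // 1 for scalars
  unsigned EltBits; // bits per element
};

// Models the patched condition: source is a vector of i1 AND the result
// type is scalar. Before the patch, the result type was not checked.
inline bool usesBoolVectorBitcastPath(const TypeModel &Src,
                                      const TypeModel &Dst) {
  return Src.IsVector && Src.EltBits == 1 && !Dst.IsVector;
}
```

Under this model a `<16 x i1>` to `i16` cast still takes the special path, while the `<16 x i1>` to `<2 x i8>` cast from the new test does not.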

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/test/CodeGen/AArch64/vec-combine-compare-to-bitmask.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index bfce5bc92a9ad1..0287856560e91a 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -24427,7 +24427,8 @@ void AArch64TargetLowering::ReplaceBITCASTResults(
 return;
   }
 
-  if (SrcVT.isVector() && SrcVT.getVectorElementType() == MVT::i1)
+  if (SrcVT.isVector() && SrcVT.getVectorElementType() == MVT::i1 &&
+  !VT.isVector())
 return replaceBoolVectorBitcast(N, Results, DAG);
 
   if (VT != MVT::i16 || (SrcVT != MVT::f16 && SrcVT != MVT::bf16))

diff  --git a/llvm/test/CodeGen/AArch64/vec-combine-compare-to-bitmask.ll b/llvm/test/CodeGen/AArch64/vec-combine-compare-to-bitmask.ll
index 1b22e2f900ddb7..557aa010b3a7d9 100644
--- a/llvm/test/CodeGen/AArch64/vec-combine-compare-to-bitmask.ll
+++ b/llvm/test/CodeGen/AArch64/vec-combine-compare-to-bitmask.ll
@@ -489,3 +489,31 @@ define i6 @no_combine_illegal_num_elements(<6 x i32> %vec) {
   %bitmask = bitcast <6 x i1> %cmp_result to i6
   ret i6 %bitmask
 }
+
+; Only apply the combine when casting a vector to a scalar.
+define <2 x i8> @vector_to_vector_cast(<16 x i1> %arg) nounwind {
+; CHECK-LABEL: vector_to_vector_cast:
+; CHECK:   ; %bb.0:
+; CHECK-NEXT:sub sp, sp, #16
+; CHECK-NEXT:shl.16b v0, v0, #7
+; CHECK-NEXT:  Lloh36:
+; CHECK-NEXT:adrp x8, lCPI20_0@PAGE
+; CHECK-NEXT:  Lloh37:
+; CHECK-NEXT:ldr q1, [x8, lCPI20_0@PAGEOFF]
+; CHECK-NEXT:add x8, sp, #14
+; CHECK-NEXT:cmlt.16b v0, v0, #0
+; CHECK-NEXT:and.16b v0, v0, v1
+; CHECK-NEXT:ext.16b v1, v0, v0, #8
+; CHECK-NEXT:zip1.16b v0, v0, v1
+; CHECK-NEXT:addv.8h h0, v0
+; CHECK-NEXT:str h0, [sp, #14]
+; CHECK-NEXT:ld1.b { v0 }[0], [x8]
+; CHECK-NEXT:orr x8, x8, #0x1
+; CHECK-NEXT:ld1.b { v0 }[4], [x8]
+; CHECK-NEXT:; kill: def $d0 killed $d0 killed $q0
+; CHECK-NEXT:add sp, sp, #16
+; CHECK-NEXT:ret
+; CHECK-NEXT:.loh AdrpLdr Lloh36, Lloh37
+  %bc = bitcast <16 x i1> %arg to <2 x i8>
+  ret <2 x i8> %bc
+}





[llvm-branch-commits] [llvm] release/18.x: [AArch64] Only apply bool vector bitcast opt if result is scalar (#81256) (PR #81454)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81454


[llvm-branch-commits] [llvm] 5750be5 - [CFI][annotation] Leave alone function pointers in function annotations (#81673)

2024-02-16 Thread via llvm-branch-commits

Author: yozhu
Date: 2024-02-16T04:53:41-08:00
New Revision: 5750be5fc5d130c62f3f7703926ac2c8c4992586

URL: 
https://github.com/llvm/llvm-project/commit/5750be5fc5d130c62f3f7703926ac2c8c4992586
DIFF: 
https://github.com/llvm/llvm-project/commit/5750be5fc5d130c62f3f7703926ac2c8c4992586.diff

LOG: [CFI][annotation] Leave alone function pointers in function annotations (#81673)

Function annotation, as part of llvm.metadata, is for the function
itself and doesn't apply to its corresponding jump table entry, so with
CFI we shouldn't replace function pointer in function annotation with
pointer to its corresponding jump table entry.

(cherry picked from commit c7a0db1e20251f436e3d500eac03bd9be1d88b45)
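
The idea of skipping annotation entries during replacement can be sketched with plain containers. The types and names below (`UseModel`, `replaceUsesExceptAnnotations`, the string labels) are hypothetical and stand in for LLVM's use lists; this is a model of the rule described above, not the pass's actual implementation.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical model of a single use of a function.
struct UseModel {
  std::string User;   // who holds the reference
  std::string Target; // what the reference currently points at
};

// Redirect every use of Old to New, except uses whose user is one of the
// recorded function-annotation entries; those keep pointing at the function.
inline void replaceUsesExceptAnnotations(
    std::vector<UseModel> &Uses, const std::string &Old,
    const std::string &New,
    const std::unordered_set<std::string> &AnnotationUsers) {
  for (UseModel &U : Uses) {
    if (U.Target != Old)
      continue;
    if (AnnotationUsers.count(U.User))
      continue; // annotation describes the function itself; leave it alone
    U.Target = New;
  }
}
```

In this model, an annotation entry that refers to `foo` still refers to `foo` after the replacement, while every other user is redirected to the jump table entry.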

Added: 
llvm/test/Transforms/LowerTypeTests/cfi-annotation.ll

Modified: 
llvm/lib/Transforms/IPO/LowerTypeTests.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
index 733f290b1bc93a..633fcb3314c42f 100644
--- a/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+++ b/llvm/lib/Transforms/IPO/LowerTypeTests.cpp
@@ -470,6 +470,9 @@ class LowerTypeTestsModule {
 
   Function *WeakInitializerFn = nullptr;
 
+  GlobalVariable *GlobalAnnotation;
+  DenseSet<Value *> FunctionAnnotations;
+
   bool shouldExportConstantsAsAbsoluteSymbols();
   uint8_t *exportTypeId(StringRef TypeId, const TypeIdLowering &TIL);
   TypeIdLowering importTypeId(StringRef TypeId);
@@ -531,6 +534,10 @@ class LowerTypeTestsModule {
   /// replace each use, which is a direct function call.
   void replaceDirectCalls(Value *Old, Value *New);
 
+  bool isFunctionAnnotation(Value *V) const {
+return FunctionAnnotations.contains(V);
+  }
+
 public:
   LowerTypeTestsModule(Module &M, ModuleAnalysisManager &AM,
ModuleSummaryIndex *ExportSummary,
@@ -1377,8 +1384,11 @@ void LowerTypeTestsModule::replaceWeakDeclarationWithJumpTablePtr(
   // (all?) targets. Switch to a runtime initializer.
   SmallSetVector GlobalVarUsers;
   findGlobalVariableUsersOf(F, GlobalVarUsers);
-  for (auto *GV : GlobalVarUsers)
+  for (auto *GV : GlobalVarUsers) {
+if (GV == GlobalAnnotation)
+  continue;
 moveInitializerToModuleConstructor(GV);
+  }
 
   // Can not RAUW F with an expression that uses F. Replace with a temporary
   // placeholder first.
@@ -1837,6 +1847,16 @@ LowerTypeTestsModule::LowerTypeTestsModule(
   }
   OS = TargetTriple.getOS();
   ObjectFormat = TargetTriple.getObjectFormat();
+
+  // Function annotation describes or applies to function itself, and
+  // shouldn't be associated with jump table thunk generated for CFI.
+  GlobalAnnotation = M.getGlobalVariable("llvm.global.annotations");
+  if (GlobalAnnotation && GlobalAnnotation->hasInitializer()) {
+const ConstantArray *CA =
+cast<ConstantArray>(GlobalAnnotation->getInitializer());
+for (Value *Op : CA->operands())
+  FunctionAnnotations.insert(Op);
+  }
 }
 
 bool LowerTypeTestsModule::runForTesting(Module &M, ModuleAnalysisManager &AM) {
@@ -1896,10 +1916,14 @@ void LowerTypeTestsModule::replaceCfiUses(Function *Old, Value *New,
 if (isa(U.getUser()))
   continue;
 
-// Skip direct calls to externally defined or non-dso_local functions
+// Skip direct calls to externally defined or non-dso_local functions.
 if (isDirectCall(U) && (Old->isDSOLocal() || !IsJumpTableCanonical))
   continue;
 
+// Skip function annotation.
+if (isFunctionAnnotation(U.getUser()))
+  continue;
+
 // Must handle Constants specially, we cannot call replaceUsesOfWith on a
 // constant because they are uniqued.
 if (auto *C = dyn_cast<Constant>(U.getUser())) {

diff  --git a/llvm/test/Transforms/LowerTypeTests/cfi-annotation.ll b/llvm/test/Transforms/LowerTypeTests/cfi-annotation.ll
new file mode 100644
index 00..034af89112cb63
--- /dev/null
+++ b/llvm/test/Transforms/LowerTypeTests/cfi-annotation.ll
@@ -0,0 +1,68 @@
+; REQUIRES: aarch64-registered-target
+
+; RUN: opt -passes=lowertypetests %s -o %t.o
+; RUN: llvm-dis %t.o -o - | FileCheck %s --check-prefix=CHECK-foobar
+; CHECK-foobar: {{llvm.global.annotations = .*[foo|bar], .*[foo|bar],}}
+; RUN: llvm-dis %t.o -o - | FileCheck %s --check-prefix=CHECK-cfi
+; CHECK-cfi-NOT: {{llvm.global.annotations = .*cfi.*}}
+
+target triple = "aarch64-none-linux-gnu"
+
+@.src = private unnamed_addr constant [7 x i8] c"test.c\00", align 1
+@.str = private unnamed_addr constant [30 x i8] c"annotation_string_literal_bar\00", section "llvm.metadata"
+@.str.1 = private unnamed_addr constant [7 x i8] c"test.c\00", section "llvm.metadata"
+@.str.2 = private unnamed_addr constant [30 x i8] c"annotation_string_literal_foo\00", section "llvm.metadata"
+@llvm.global.annotations = appending global [2 x { ptr, ptr, ptr, i32, ptr }] [{ ptr, ptr, ptr, i32, ptr } { ptr @bar, ptr @.str, ptr @.str.1, i32 2, ptr null }, { ptr, ptr, ptr, i32, ptr } { ptr @f

[llvm-branch-commits] [llvm] [CFI][annotation] Leave alone function pointers in function annotations (PR #81673)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81673


[llvm-branch-commits] [lld] release/18.x: [lld] Fix test failures when running as root user (#81339) (PR #81988)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/81988

Backport e165bea1d4ec2de96ee0548cece79d71a75ce8f8

Requested by: @tstellar

>From d0392f7015df24088e17f79b92b7376468ddbccd Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Fri, 9 Feb 2024 20:57:05 -0800
Subject: [PATCH] [lld] Fix test failures when running as root user (#81339)

This makes it easier to run the tests in a containerized environment.

(cherry picked from commit e165bea1d4ec2de96ee0548cece79d71a75ce8f8)
---
 lld/test/COFF/lto-cache-errors.ll | 2 +-
 lld/test/COFF/thinlto-emit-imports.ll | 2 +-
 lld/test/ELF/lto/resolution-err.ll| 2 +-
 lld/test/ELF/lto/thinlto-cant-write-index.ll  | 2 +-
 lld/test/ELF/lto/thinlto-emit-imports.ll  | 2 +-
 lld/test/MachO/invalid/invalid-lto-object-path.ll | 2 +-
 lld/test/MachO/thinlto-emit-imports.ll| 2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/lld/test/COFF/lto-cache-errors.ll b/lld/test/COFF/lto-cache-errors.ll
index 55244e5690dc34..a46190a81b6230 100644
--- a/lld/test/COFF/lto-cache-errors.ll
+++ b/lld/test/COFF/lto-cache-errors.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ;; Not supported on windows since we use permissions to deny the creation
 ; UNSUPPORTED: system-windows
 
diff --git a/lld/test/COFF/thinlto-emit-imports.ll b/lld/test/COFF/thinlto-emit-imports.ll
index a9f22c1dc2dcff..b47a6cea4eb7df 100644
--- a/lld/test/COFF/thinlto-emit-imports.ll
+++ b/lld/test/COFF/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ; Generate summary sections and test lld handling.
 ; RUN: opt -module-summary %s -o %t1.obj
diff --git a/lld/test/ELF/lto/resolution-err.ll b/lld/test/ELF/lto/resolution-err.ll
index 6dfa64b1b8b9ee..f9855abaff3279 100644
--- a/lld/test/ELF/lto/resolution-err.ll
+++ b/lld/test/ELF/lto/resolution-err.ll
@@ -1,5 +1,5 @@
 ; UNSUPPORTED: system-windows
-; REQUIRES: shell
+; REQUIRES: shell, non-root-user
 ; RUN: llvm-as %s -o %t.bc
 ; RUN: touch %t.resolution.txt
 ; RUN: chmod u-w %t.resolution.txt
diff --git a/lld/test/ELF/lto/thinlto-cant-write-index.ll b/lld/test/ELF/lto/thinlto-cant-write-index.ll
index e664acbb17de1a..286fcddd4238a1 100644
--- a/lld/test/ELF/lto/thinlto-cant-write-index.ll
+++ b/lld/test/ELF/lto/thinlto-cant-write-index.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ; Basic ThinLTO tests.
 ; RUN: opt -module-summary %s -o %t1.o
diff --git a/lld/test/ELF/lto/thinlto-emit-imports.ll b/lld/test/ELF/lto/thinlto-emit-imports.ll
index 6d0e1e65047db4..253ec08619c982 100644
--- a/lld/test/ELF/lto/thinlto-emit-imports.ll
+++ b/lld/test/ELF/lto/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ;; Test a few properties not tested by thinlto-index-only.ll
 
 ; RUN: opt -module-summary %s -o %t1.o
diff --git a/lld/test/MachO/invalid/invalid-lto-object-path.ll b/lld/test/MachO/invalid/invalid-lto-object-path.ll
index 75c6a97e446fb2..c862538d592ce8 100644
--- a/lld/test/MachO/invalid/invalid-lto-object-path.ll
+++ b/lld/test/MachO/invalid/invalid-lto-object-path.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ;; Creating read-only directories with `chmod 400` isn't supported on Windows
 ; UNSUPPORTED: system-windows
diff --git a/lld/test/MachO/thinlto-emit-imports.ll b/lld/test/MachO/thinlto-emit-imports.ll
index 47a612bd0a7b56..88f766f59c8877 100644
--- a/lld/test/MachO/thinlto-emit-imports.ll
+++ b/lld/test/MachO/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ; RUN: rm -rf %t; split-file %s %t
 
 ; Generate summary sections and test lld handling.

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] release/18.x: [lld] Fix test failures when running as root user (#81339) (PR #81988)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/81988


[llvm-branch-commits] [lld] release/18.x: [lld] Fix test failures when running as root user (#81339) (PR #81988)

2024-02-16 Thread via llvm-branch-commits

llvmbot wrote:

@MaskRay What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/81988


[llvm-branch-commits] [lld] release/18.x: [lld] Fix test failures when running as root user (#81339) (PR #81988)

2024-02-16 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-lld-macho

@llvm/pr-subscribers-lld

Author: None (llvmbot)


Changes

Backport e165bea1d4ec2de96ee0548cece79d71a75ce8f8

Requested by: @tstellar

---
Full diff: https://github.com/llvm/llvm-project/pull/81988.diff


7 Files Affected:

- (modified) lld/test/COFF/lto-cache-errors.ll (+1-1) 
- (modified) lld/test/COFF/thinlto-emit-imports.ll (+1-1) 
- (modified) lld/test/ELF/lto/resolution-err.ll (+1-1) 
- (modified) lld/test/ELF/lto/thinlto-cant-write-index.ll (+1-1) 
- (modified) lld/test/ELF/lto/thinlto-emit-imports.ll (+1-1) 
- (modified) lld/test/MachO/invalid/invalid-lto-object-path.ll (+1-1) 
- (modified) lld/test/MachO/thinlto-emit-imports.ll (+1-1) 


```diff
diff --git a/lld/test/COFF/lto-cache-errors.ll b/lld/test/COFF/lto-cache-errors.ll
index 55244e5690dc34..a46190a81b6230 100644
--- a/lld/test/COFF/lto-cache-errors.ll
+++ b/lld/test/COFF/lto-cache-errors.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ;; Not supported on windows since we use permissions to deny the creation
 ; UNSUPPORTED: system-windows
 
diff --git a/lld/test/COFF/thinlto-emit-imports.ll b/lld/test/COFF/thinlto-emit-imports.ll
index a9f22c1dc2dcff..b47a6cea4eb7df 100644
--- a/lld/test/COFF/thinlto-emit-imports.ll
+++ b/lld/test/COFF/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ; Generate summary sections and test lld handling.
 ; RUN: opt -module-summary %s -o %t1.obj
diff --git a/lld/test/ELF/lto/resolution-err.ll b/lld/test/ELF/lto/resolution-err.ll
index 6dfa64b1b8b9ee..f9855abaff3279 100644
--- a/lld/test/ELF/lto/resolution-err.ll
+++ b/lld/test/ELF/lto/resolution-err.ll
@@ -1,5 +1,5 @@
 ; UNSUPPORTED: system-windows
-; REQUIRES: shell
+; REQUIRES: shell, non-root-user
 ; RUN: llvm-as %s -o %t.bc
 ; RUN: touch %t.resolution.txt
 ; RUN: chmod u-w %t.resolution.txt
diff --git a/lld/test/ELF/lto/thinlto-cant-write-index.ll b/lld/test/ELF/lto/thinlto-cant-write-index.ll
index e664acbb17de1a..286fcddd4238a1 100644
--- a/lld/test/ELF/lto/thinlto-cant-write-index.ll
+++ b/lld/test/ELF/lto/thinlto-cant-write-index.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ; Basic ThinLTO tests.
 ; RUN: opt -module-summary %s -o %t1.o
diff --git a/lld/test/ELF/lto/thinlto-emit-imports.ll b/lld/test/ELF/lto/thinlto-emit-imports.ll
index 6d0e1e65047db4..253ec08619c982 100644
--- a/lld/test/ELF/lto/thinlto-emit-imports.ll
+++ b/lld/test/ELF/lto/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ;; Test a few properties not tested by thinlto-index-only.ll
 
 ; RUN: opt -module-summary %s -o %t1.o
diff --git a/lld/test/MachO/invalid/invalid-lto-object-path.ll b/lld/test/MachO/invalid/invalid-lto-object-path.ll
index 75c6a97e446fb2..c862538d592ce8 100644
--- a/lld/test/MachO/invalid/invalid-lto-object-path.ll
+++ b/lld/test/MachO/invalid/invalid-lto-object-path.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ;; Creating read-only directories with `chmod 400` isn't supported on Windows
 ; UNSUPPORTED: system-windows
diff --git a/lld/test/MachO/thinlto-emit-imports.ll b/lld/test/MachO/thinlto-emit-imports.ll
index 47a612bd0a7b56..88f766f59c8877 100644
--- a/lld/test/MachO/thinlto-emit-imports.ll
+++ b/lld/test/MachO/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ; RUN: rm -rf %t; split-file %s %t
 
 ; Generate summary sections and test lld handling.

```




https://github.com/llvm/llvm-project/pull/81988


[llvm-branch-commits] [lld] release/18.x: [LLD] [MinGW] Implement the --lto-emit-asm and -plugin-opt=emit-llvm options (#81475) (PR #81798)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81798

>From d442a51287dda821a216cd683eacd601bd7441c0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20Storsj=C3=B6?= 
Date: Tue, 13 Feb 2024 09:32:40 +0200
Subject: [PATCH] [LLD] [MinGW] Implement the --lto-emit-asm and
 -plugin-opt=emit-llvm options (#81475)

These were implemented in the COFF linker in
3923e61b96cf90123762f0e0381504efaba2d77a and
d12b99a4313816cf99e97cb5f579e2d51ba72b0b.

This matches the corresponding options in the ELF linker.

(cherry picked from commit d033366bd2189e33343ca93d276b40341dc39770)
---
 lld/MinGW/Driver.cpp   | 4 
 lld/MinGW/Options.td   | 5 +
 lld/test/MinGW/driver.test | 7 +++
 3 files changed, 16 insertions(+)

diff --git a/lld/MinGW/Driver.cpp b/lld/MinGW/Driver.cpp
index 4752d92e3b1d71..7b16764dd2c7ce 100644
--- a/lld/MinGW/Driver.cpp
+++ b/lld/MinGW/Driver.cpp
@@ -448,6 +448,10 @@ bool link(ArrayRef argsArr, 
llvm::raw_ostream &stdoutOS,
 add("-lto-cs-profile-generate");
   if (auto *arg = args.getLastArg(OPT_lto_cs_profile_file))
 add("-lto-cs-profile-file:" + StringRef(arg->getValue()));
+  if (args.hasArg(OPT_plugin_opt_emit_llvm))
+add("-lldemit:llvm");
+  if (args.hasArg(OPT_lto_emit_asm))
+add("-lldemit:asm");
 
   if (auto *a = args.getLastArg(OPT_thinlto_cache_dir))
 add("-lldltocache:" + StringRef(a->getValue()));
diff --git a/lld/MinGW/Options.td b/lld/MinGW/Options.td
index 02f00f27406c08..9a0a96aac7f1c6 100644
--- a/lld/MinGW/Options.td
+++ b/lld/MinGW/Options.td
@@ -158,6 +158,8 @@ def lto_cs_profile_generate: FF<"lto-cs-profile-generate">,
   HelpText<"Perform context sensitive PGO instrumentation">;
 def lto_cs_profile_file: JJ<"lto-cs-profile-file=">,
   HelpText<"Context sensitive profile file path">;
+def lto_emit_asm: FF<"lto-emit-asm">,
+  HelpText<"Emit assembly code">;
 
 def thinlto_cache_dir: JJ<"thinlto-cache-dir=">,
   HelpText<"Path to ThinLTO cached object file directory">;
@@ -181,6 +183,9 @@ def: J<"plugin-opt=cs-profile-path=">,
   Alias, HelpText<"Alias for --lto-cs-profile-file">;
 def plugin_opt_dwo_dir_eq: J<"plugin-opt=dwo_dir=">,
   HelpText<"Directory to store .dwo files when LTO and debug fission are 
used">;
+def plugin_opt_emit_asm: F<"plugin-opt=emit-asm">,
+  Alias, HelpText<"Alias for --lto-emit-asm">;
+def plugin_opt_emit_llvm: F<"plugin-opt=emit-llvm">;
 def: J<"plugin-opt=jobs=">, Alias, HelpText<"Alias for 
--thinlto-jobs=">;
 def plugin_opt_mcpu_eq: J<"plugin-opt=mcpu=">;
 
diff --git a/lld/test/MinGW/driver.test b/lld/test/MinGW/driver.test
index 559a32bfa242f8..057de2a22f6a0c 100644
--- a/lld/test/MinGW/driver.test
+++ b/lld/test/MinGW/driver.test
@@ -409,6 +409,13 @@ LTO_OPTS: -mllvm:-mcpu=x86-64 -opt:lldlto=2 -dwodir:foo 
-lto-cs-profile-generate
 RUN: ld.lld -### foo.o -m i386pep --lto-O2 --lto-CGO1 
--lto-cs-profile-generate --lto-cs-profile-file=foo 2>&1 | FileCheck 
-check-prefix=LTO_OPTS2 %s
 LTO_OPTS2:-opt:lldlto=2 -opt:lldltocgo=1 -lto-cs-profile-generate 
-lto-cs-profile-file:foo
 
+RUN: ld.lld -### foo.o -m i386pe -plugin-opt=emit-asm 2>&1 | FileCheck 
-check-prefix=LTO_EMIT_ASM %s
+RUN: ld.lld -### foo.o -m i386pe --lto-emit-asm 2>&1 | FileCheck 
-check-prefix=LTO_EMIT_ASM %s
+LTO_EMIT_ASM: -lldemit:asm
+
+RUN: ld.lld -### foo.o -m i386pe -plugin-opt=emit-llvm 2>&1 | FileCheck 
-check-prefix=LTO_EMIT_LLVM %s
+LTO_EMIT_LLVM: -lldemit:llvm
+
 Test GCC specific LTO options that GCC passes unconditionally, that we ignore.
 
 RUN: ld.lld -### foo.o -m i386pep -plugin 
/usr/lib/gcc/x86_64-w64-mingw32/10-posix/liblto_plugin.so 
-plugin-opt=/usr/lib/gcc/x86_64-w64-mingw32/10-posix/lto-wrapper 
-plugin-opt=-fresolution=/tmp/ccM9d4fP.res -plugin-opt=-pass-through=-lmingw32 
2> /dev/null



[llvm-branch-commits] [lld] release/18.x: [LLD] [MinGW] Implement the --lto-emit-asm and -plugin-opt=emit-llvm options (#81475) (PR #81798)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81798

>From 28be6f670fabe068e02d59670c26571efad1be4b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20Storsj=C3=B6?= 
Date: Tue, 13 Feb 2024 09:32:40 +0200
Subject: [PATCH] [LLD] [MinGW] Implement the --lto-emit-asm and
 -plugin-opt=emit-llvm options (#81475)

These were implemented in the COFF linker in
3923e61b96cf90123762f0e0381504efaba2d77a and
d12b99a4313816cf99e97cb5f579e2d51ba72b0b.

This matches the corresponding options in the ELF linker.

(cherry picked from commit d033366bd2189e33343ca93d276b40341dc39770)
---
 lld/MinGW/Driver.cpp   | 4 
 lld/MinGW/Options.td   | 5 +
 lld/test/MinGW/driver.test | 7 +++
 3 files changed, 16 insertions(+)

diff --git a/lld/MinGW/Driver.cpp b/lld/MinGW/Driver.cpp
index 4752d92e3b1d71..7b16764dd2c7ce 100644
--- a/lld/MinGW/Driver.cpp
+++ b/lld/MinGW/Driver.cpp
@@ -448,6 +448,10 @@ bool link(ArrayRef argsArr, 
llvm::raw_ostream &stdoutOS,
 add("-lto-cs-profile-generate");
   if (auto *arg = args.getLastArg(OPT_lto_cs_profile_file))
 add("-lto-cs-profile-file:" + StringRef(arg->getValue()));
+  if (args.hasArg(OPT_plugin_opt_emit_llvm))
+add("-lldemit:llvm");
+  if (args.hasArg(OPT_lto_emit_asm))
+add("-lldemit:asm");
 
   if (auto *a = args.getLastArg(OPT_thinlto_cache_dir))
 add("-lldltocache:" + StringRef(a->getValue()));
diff --git a/lld/MinGW/Options.td b/lld/MinGW/Options.td
index 02f00f27406c08..9a0a96aac7f1c6 100644
--- a/lld/MinGW/Options.td
+++ b/lld/MinGW/Options.td
@@ -158,6 +158,8 @@ def lto_cs_profile_generate: FF<"lto-cs-profile-generate">,
   HelpText<"Perform context sensitive PGO instrumentation">;
 def lto_cs_profile_file: JJ<"lto-cs-profile-file=">,
   HelpText<"Context sensitive profile file path">;
+def lto_emit_asm: FF<"lto-emit-asm">,
+  HelpText<"Emit assembly code">;
 
 def thinlto_cache_dir: JJ<"thinlto-cache-dir=">,
   HelpText<"Path to ThinLTO cached object file directory">;
@@ -181,6 +183,9 @@ def: J<"plugin-opt=cs-profile-path=">,
   Alias, HelpText<"Alias for --lto-cs-profile-file">;
 def plugin_opt_dwo_dir_eq: J<"plugin-opt=dwo_dir=">,
   HelpText<"Directory to store .dwo files when LTO and debug fission are 
used">;
+def plugin_opt_emit_asm: F<"plugin-opt=emit-asm">,
+  Alias, HelpText<"Alias for --lto-emit-asm">;
+def plugin_opt_emit_llvm: F<"plugin-opt=emit-llvm">;
 def: J<"plugin-opt=jobs=">, Alias, HelpText<"Alias for 
--thinlto-jobs=">;
 def plugin_opt_mcpu_eq: J<"plugin-opt=mcpu=">;
 
diff --git a/lld/test/MinGW/driver.test b/lld/test/MinGW/driver.test
index 559a32bfa242f8..057de2a22f6a0c 100644
--- a/lld/test/MinGW/driver.test
+++ b/lld/test/MinGW/driver.test
@@ -409,6 +409,13 @@ LTO_OPTS: -mllvm:-mcpu=x86-64 -opt:lldlto=2 -dwodir:foo 
-lto-cs-profile-generate
 RUN: ld.lld -### foo.o -m i386pep --lto-O2 --lto-CGO1 
--lto-cs-profile-generate --lto-cs-profile-file=foo 2>&1 | FileCheck 
-check-prefix=LTO_OPTS2 %s
 LTO_OPTS2:-opt:lldlto=2 -opt:lldltocgo=1 -lto-cs-profile-generate 
-lto-cs-profile-file:foo
 
+RUN: ld.lld -### foo.o -m i386pe -plugin-opt=emit-asm 2>&1 | FileCheck 
-check-prefix=LTO_EMIT_ASM %s
+RUN: ld.lld -### foo.o -m i386pe --lto-emit-asm 2>&1 | FileCheck 
-check-prefix=LTO_EMIT_ASM %s
+LTO_EMIT_ASM: -lldemit:asm
+
+RUN: ld.lld -### foo.o -m i386pe -plugin-opt=emit-llvm 2>&1 | FileCheck 
-check-prefix=LTO_EMIT_LLVM %s
+LTO_EMIT_LLVM: -lldemit:llvm
+
 Test GCC specific LTO options that GCC passes unconditionally, that we ignore.
 
 RUN: ld.lld -### foo.o -m i386pep -plugin 
/usr/lib/gcc/x86_64-w64-mingw32/10-posix/liblto_plugin.so 
-plugin-opt=/usr/lib/gcc/x86_64-w64-mingw32/10-posix/lto-wrapper 
-plugin-opt=-fresolution=/tmp/ccM9d4fP.res -plugin-opt=-pass-through=-lmingw32 
2> /dev/null



[llvm-branch-commits] [lld] release/18.x: [LLD] [MinGW] Implement the --lto-emit-asm and -plugin-opt=emit-llvm options (#81475) (PR #81798)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81798


[llvm-branch-commits] [lld] 28be6f6 - [LLD] [MinGW] Implement the --lto-emit-asm and -plugin-opt=emit-llvm options (#81475)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Martin Storsjö
Date: 2024-02-16T05:07:05-08:00
New Revision: 28be6f670fabe068e02d59670c26571efad1be4b

URL: https://github.com/llvm/llvm-project/commit/28be6f670fabe068e02d59670c26571efad1be4b
DIFF: https://github.com/llvm/llvm-project/commit/28be6f670fabe068e02d59670c26571efad1be4b.diff

LOG: [LLD] [MinGW] Implement the --lto-emit-asm and -plugin-opt=emit-llvm options (#81475)

These were implemented in the COFF linker in
3923e61b96cf90123762f0e0381504efaba2d77a and
d12b99a4313816cf99e97cb5f579e2d51ba72b0b.

This matches the corresponding options in the ELF linker.

(cherry picked from commit d033366bd2189e33343ca93d276b40341dc39770)
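
For illustration only, the option mapping that the new driver.test checks exercise can be sketched in C. This is not the MinGW driver's code, which is C++ and relies on LLVM's option tables. The function name `translate_lto_emit_option` is hypothetical, and only the three spellings that appear in the test lines of this patch are handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the COFF-linker flag that the MinGW driver
 * forwards for a given LTO emit option, or NULL if the option is not one of
 * the spellings covered by the tests in this patch. */
static const char *translate_lto_emit_option(const char *arg) {
  /* --lto-emit-asm and its plugin-opt alias both select assembly output. */
  if (strcmp(arg, "--lto-emit-asm") == 0 ||
      strcmp(arg, "-plugin-opt=emit-asm") == 0)
    return "-lldemit:asm";
  /* -plugin-opt=emit-llvm selects LLVM output. */
  if (strcmp(arg, "-plugin-opt=emit-llvm") == 0)
    return "-lldemit:llvm";
  return NULL;
}
```

Calling `translate_lto_emit_option("--lto-emit-asm")` yields `-lldemit:asm`, which is the string the `LTO_EMIT_ASM` check line expects to see in the `-###` output.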

Added: 


Modified: 
lld/MinGW/Driver.cpp
lld/MinGW/Options.td
lld/test/MinGW/driver.test

Removed: 




diff  --git a/lld/MinGW/Driver.cpp b/lld/MinGW/Driver.cpp
index 4752d92e3b1d71..7b16764dd2c7ce 100644
--- a/lld/MinGW/Driver.cpp
+++ b/lld/MinGW/Driver.cpp
@@ -448,6 +448,10 @@ bool link(ArrayRef argsArr, 
llvm::raw_ostream &stdoutOS,
 add("-lto-cs-profile-generate");
   if (auto *arg = args.getLastArg(OPT_lto_cs_profile_file))
 add("-lto-cs-profile-file:" + StringRef(arg->getValue()));
+  if (args.hasArg(OPT_plugin_opt_emit_llvm))
+add("-lldemit:llvm");
+  if (args.hasArg(OPT_lto_emit_asm))
+add("-lldemit:asm");
 
   if (auto *a = args.getLastArg(OPT_thinlto_cache_dir))
 add("-lldltocache:" + StringRef(a->getValue()));

diff  --git a/lld/MinGW/Options.td b/lld/MinGW/Options.td
index 02f00f27406c08..9a0a96aac7f1c6 100644
--- a/lld/MinGW/Options.td
+++ b/lld/MinGW/Options.td
@@ -158,6 +158,8 @@ def lto_cs_profile_generate: FF<"lto-cs-profile-generate">,
   HelpText<"Perform context sensitive PGO instrumentation">;
 def lto_cs_profile_file: JJ<"lto-cs-profile-file=">,
   HelpText<"Context sensitive profile file path">;
+def lto_emit_asm: FF<"lto-emit-asm">,
+  HelpText<"Emit assembly code">;
 
 def thinlto_cache_dir: JJ<"thinlto-cache-dir=">,
   HelpText<"Path to ThinLTO cached object file directory">;
@@ -181,6 +183,9 @@ def: J<"plugin-opt=cs-profile-path=">,
   Alias, HelpText<"Alias for --lto-cs-profile-file">;
 def plugin_opt_dwo_dir_eq: J<"plugin-opt=dwo_dir=">,
   HelpText<"Directory to store .dwo files when LTO and debug fission are 
used">;
+def plugin_opt_emit_asm: F<"plugin-opt=emit-asm">,
+  Alias, HelpText<"Alias for --lto-emit-asm">;
+def plugin_opt_emit_llvm: F<"plugin-opt=emit-llvm">;
 def: J<"plugin-opt=jobs=">, Alias, HelpText<"Alias for 
--thinlto-jobs=">;
 def plugin_opt_mcpu_eq: J<"plugin-opt=mcpu=">;
 

diff  --git a/lld/test/MinGW/driver.test b/lld/test/MinGW/driver.test
index 559a32bfa242f8..057de2a22f6a0c 100644
--- a/lld/test/MinGW/driver.test
+++ b/lld/test/MinGW/driver.test
@@ -409,6 +409,13 @@ LTO_OPTS: -mllvm:-mcpu=x86-64 -opt:lldlto=2 -dwodir:foo 
-lto-cs-profile-generate
 RUN: ld.lld -### foo.o -m i386pep --lto-O2 --lto-CGO1 
--lto-cs-profile-generate --lto-cs-profile-file=foo 2>&1 | FileCheck 
-check-prefix=LTO_OPTS2 %s
 LTO_OPTS2:-opt:lldlto=2 -opt:lldltocgo=1 -lto-cs-profile-generate 
-lto-cs-profile-file:foo
 
+RUN: ld.lld -### foo.o -m i386pe -plugin-opt=emit-asm 2>&1 | FileCheck 
-check-prefix=LTO_EMIT_ASM %s
+RUN: ld.lld -### foo.o -m i386pe --lto-emit-asm 2>&1 | FileCheck 
-check-prefix=LTO_EMIT_ASM %s
+LTO_EMIT_ASM: -lldemit:asm
+
+RUN: ld.lld -### foo.o -m i386pe -plugin-opt=emit-llvm 2>&1 | FileCheck 
-check-prefix=LTO_EMIT_LLVM %s
+LTO_EMIT_LLVM: -lldemit:llvm
+
 Test GCC specific LTO options that GCC passes unconditionally, that we ignore.
 
 RUN: ld.lld -### foo.o -m i386pep -plugin 
/usr/lib/gcc/x86_64-w64-mingw32/10-posix/liblto_plugin.so 
-plugin-opt=/usr/lib/gcc/x86_64-w64-mingw32/10-posix/lto-wrapper 
-plugin-opt=-fresolution=/tmp/ccM9d4fP.res -plugin-opt=-pass-through=-lmingw32 
2> /dev/null





[llvm-branch-commits] [openmp] release/18.x: [OpenMP][AIX]Define struct kmp_base_tas_lock with the order of two members swapped for big-endian (#79188) (PR #81743)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81743

>From cf130269fade1c08e3f83a7f34bc450a27287852 Mon Sep 17 00:00:00 2001
From: Xing Xue 
Date: Wed, 7 Feb 2024 15:24:52 -0500
Subject: [PATCH 1/2] [OpenMP][test]Flip bit-fields in 'struct flags' for
 big-endian in test cases (#79895)

This patch flips bit-fields in `struct flags` for big-endian in test
cases to be consistent with the definition of the structure in libomp
`kmp.h`.

(cherry picked from commit 7a9b0e4acb3b5ee15f8eb138aad937cfa4763fb8)
---
 openmp/runtime/src/kmp.h  |  3 ++-
 .../test/tasking/bug_nested_proxy_task.c  | 21 +--
 .../test/tasking/bug_proxy_task_dep_waiting.c | 21 +--
 .../test/tasking/hidden_helper_task/common.h  | 18 +---
 4 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/openmp/runtime/src/kmp.h b/openmp/runtime/src/kmp.h
index c287a31e0b1b54..b147063d228263 100644
--- a/openmp/runtime/src/kmp.h
+++ b/openmp/runtime/src/kmp.h
@@ -2494,7 +2494,8 @@ typedef struct kmp_dephash_entry kmp_dephash_entry_t;
 #define KMP_DEP_MTX 0x4
 #define KMP_DEP_SET 0x8
 #define KMP_DEP_ALL 0x80
-// Compiler sends us this info:
+// Compiler sends us this info. Note: some test cases contain an explicit copy
+// of this struct and should be in sync with any changes here.
 typedef struct kmp_depend_info {
   kmp_intptr_t base_addr;
   size_t len;
diff --git a/openmp/runtime/test/tasking/bug_nested_proxy_task.c 
b/openmp/runtime/test/tasking/bug_nested_proxy_task.c
index 43502bdcd1abd1..24fe1f3fe7607c 100644
--- a/openmp/runtime/test/tasking/bug_nested_proxy_task.c
+++ b/openmp/runtime/test/tasking/bug_nested_proxy_task.c
@@ -50,12 +50,21 @@ typedef struct kmp_depend_info {
  union {
 kmp_uint8 flag; // flag as an unsigned char
 struct { // flag as a set of 8 bits
-unsigned in : 1;
-unsigned out : 1;
-unsigned mtx : 1;
-unsigned set : 1;
-unsigned unused : 3;
-unsigned all : 1;
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  unsigned all : 1;
+  unsigned unused : 3;
+  unsigned set : 1;
+  unsigned mtx : 1;
+  unsigned out : 1;
+  unsigned in : 1;
+#else
+  unsigned in : 1;
+  unsigned out : 1;
+  unsigned mtx : 1;
+  unsigned set : 1;
+  unsigned unused : 3;
+  unsigned all : 1;
+#endif
 } flags;
  };
 } kmp_depend_info_t;
diff --git a/openmp/runtime/test/tasking/bug_proxy_task_dep_waiting.c 
b/openmp/runtime/test/tasking/bug_proxy_task_dep_waiting.c
index ff75df51aff077..688860c035728f 100644
--- a/openmp/runtime/test/tasking/bug_proxy_task_dep_waiting.c
+++ b/openmp/runtime/test/tasking/bug_proxy_task_dep_waiting.c
@@ -47,12 +47,21 @@ typedef struct kmp_depend_info {
  union {
 kmp_uint8 flag; // flag as an unsigned char
 struct { // flag as a set of 8 bits
-unsigned in : 1;
-unsigned out : 1;
-unsigned mtx : 1;
-unsigned set : 1;
-unsigned unused : 3;
-unsigned all : 1;
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  unsigned all : 1;
+  unsigned unused : 3;
+  unsigned set : 1;
+  unsigned mtx : 1;
+  unsigned out : 1;
+  unsigned in : 1;
+#else
+  unsigned in : 1;
+  unsigned out : 1;
+  unsigned mtx : 1;
+  unsigned set : 1;
+  unsigned unused : 3;
+  unsigned all : 1;
+#endif
 } flags;
 };
 } kmp_depend_info_t;
diff --git a/openmp/runtime/test/tasking/hidden_helper_task/common.h 
b/openmp/runtime/test/tasking/hidden_helper_task/common.h
index 402ecf3ed553c9..ba57656cbac41d 100644
--- a/openmp/runtime/test/tasking/hidden_helper_task/common.h
+++ b/openmp/runtime/test/tasking/hidden_helper_task/common.h
@@ -17,9 +17,21 @@ typedef struct kmp_depend_info {
   union {
 unsigned char flag;
 struct {
-  bool in : 1;
-  bool out : 1;
-  bool mtx : 1;
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  unsigned all : 1;
+  unsigned unused : 3;
+  unsigned set : 1;
+  unsigned mtx : 1;
+  unsigned out : 1;
+  unsigned in : 1;
+#else
+  unsigned in : 1;
+  unsigned out : 1;
+  unsigned mtx : 1;
+  unsigned set : 1;
+  unsigned unused : 3;
+  unsigned all : 1;
+#endif
 } flags;
   };
 } kmp_depend_info_t;

>From 34fdf52cce678cb4fd3714c31f1a798bece84303 Mon Sep 17 00:00:00 2001
From: Xing Xue 
Date: Tue, 13 Feb 2024 15:11:24 -0500
Subject: [PATCH 2/2] [OpenMP][AIX]Define struct kmp_base_tas_lock with the
 order of two members swapped for big-endian (#79188)

The direct lock data structure has bit `0` (the least significant bit)
of the first 32-bit word set to `1` to indicate it is a direct lock. On
the other hand, the first word (in 32-bit mode) or first two words (in
64-bit mode) of an

[llvm-branch-commits] [openmp] cf13026 - [OpenMP][test]Flip bit-fields in 'struct flags' for big-endian in test cases (#79895)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Xing Xue
Date: 2024-02-16T05:15:11-08:00
New Revision: cf130269fade1c08e3f83a7f34bc450a27287852

URL: https://github.com/llvm/llvm-project/commit/cf130269fade1c08e3f83a7f34bc450a27287852
DIFF: https://github.com/llvm/llvm-project/commit/cf130269fade1c08e3f83a7f34bc450a27287852.diff

LOG: [OpenMP][test]Flip bit-fields in 'struct flags' for big-endian in test cases (#79895)

This patch flips bit-fields in `struct flags` for big-endian in test
cases to be consistent with the definition of the structure in libomp
`kmp.h`.
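
A minimal sketch of why the flip is needed, assuming the GCC/Clang convention that bit-fields are allocated from the least significant bit on little-endian targets and from the most significant bit on big-endian targets (the C standard leaves this implementation-defined). The names `dep_flags_t`, `flag_after_setting`, and the `DEP_*` macros are hypothetical stand-ins that mirror the `KMP_DEP_MTX`, `KMP_DEP_SET`, and `KMP_DEP_ALL` values visible in the `kmp.h` hunk below.

```c
#include <stdint.h>
#include <string.h>

/* Mask values mirroring KMP_DEP_MTX / KMP_DEP_SET / KMP_DEP_ALL. */
#define DEP_MTX 0x4
#define DEP_SET 0x8
#define DEP_ALL 0x80

/* The same byte viewed either as a mask or as named bit-fields. */
typedef union {
  uint8_t flag;
  struct {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    unsigned all : 1;
    unsigned unused : 3;
    unsigned set : 1;
    unsigned mtx : 1;
    unsigned out : 1;
    unsigned in : 1;
#else
    unsigned in : 1;
    unsigned out : 1;
    unsigned mtx : 1;
    unsigned set : 1;
    unsigned unused : 3;
    unsigned all : 1;
#endif
  } flags;
} dep_flags_t;

/* Sets the requested bit-fields and returns the resulting mask byte. */
static uint8_t flag_after_setting(int mtx, int set, int all) {
  dep_flags_t d;
  memset(&d, 0, sizeof d); /* clear every byte of the union first */
  d.flags.mtx = mtx ? 1u : 0u;
  d.flags.set = set ? 1u : 0u;
  d.flags.all = all ? 1u : 0u;
  return d.flag;
}
```

With the conditional ordering, setting `mtx` alone should produce a mask byte equal to `DEP_MTX` on either byte order under the stated compiler assumption; without the flip, a big-endian build would place `in` in the top bit instead.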

(cherry picked from commit 7a9b0e4acb3b5ee15f8eb138aad937cfa4763fb8)

Added: 


Modified: 
openmp/runtime/src/kmp.h
openmp/runtime/test/tasking/bug_nested_proxy_task.c
openmp/runtime/test/tasking/bug_proxy_task_dep_waiting.c
openmp/runtime/test/tasking/hidden_helper_task/common.h

Removed: 




diff  --git a/openmp/runtime/src/kmp.h b/openmp/runtime/src/kmp.h
index c287a31e0b1b54..b147063d228263 100644
--- a/openmp/runtime/src/kmp.h
+++ b/openmp/runtime/src/kmp.h
@@ -2494,7 +2494,8 @@ typedef struct kmp_dephash_entry kmp_dephash_entry_t;
 #define KMP_DEP_MTX 0x4
 #define KMP_DEP_SET 0x8
 #define KMP_DEP_ALL 0x80
-// Compiler sends us this info:
+// Compiler sends us this info. Note: some test cases contain an explicit copy
+// of this struct and should be in sync with any changes here.
 typedef struct kmp_depend_info {
   kmp_intptr_t base_addr;
   size_t len;

diff  --git a/openmp/runtime/test/tasking/bug_nested_proxy_task.c 
b/openmp/runtime/test/tasking/bug_nested_proxy_task.c
index 43502bdcd1abd1..24fe1f3fe7607c 100644
--- a/openmp/runtime/test/tasking/bug_nested_proxy_task.c
+++ b/openmp/runtime/test/tasking/bug_nested_proxy_task.c
@@ -50,12 +50,21 @@ typedef struct kmp_depend_info {
  union {
 kmp_uint8 flag; // flag as an unsigned char
 struct { // flag as a set of 8 bits
-unsigned in : 1;
-unsigned out : 1;
-unsigned mtx : 1;
-unsigned set : 1;
-unsigned unused : 3;
-unsigned all : 1;
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  unsigned all : 1;
+  unsigned unused : 3;
+  unsigned set : 1;
+  unsigned mtx : 1;
+  unsigned out : 1;
+  unsigned in : 1;
+#else
+  unsigned in : 1;
+  unsigned out : 1;
+  unsigned mtx : 1;
+  unsigned set : 1;
+  unsigned unused : 3;
+  unsigned all : 1;
+#endif
 } flags;
  };
 } kmp_depend_info_t;

diff  --git a/openmp/runtime/test/tasking/bug_proxy_task_dep_waiting.c 
b/openmp/runtime/test/tasking/bug_proxy_task_dep_waiting.c
index ff75df51aff077..688860c035728f 100644
--- a/openmp/runtime/test/tasking/bug_proxy_task_dep_waiting.c
+++ b/openmp/runtime/test/tasking/bug_proxy_task_dep_waiting.c
@@ -47,12 +47,21 @@ typedef struct kmp_depend_info {
  union {
 kmp_uint8 flag; // flag as an unsigned char
 struct { // flag as a set of 8 bits
-unsigned in : 1;
-unsigned out : 1;
-unsigned mtx : 1;
-unsigned set : 1;
-unsigned unused : 3;
-unsigned all : 1;
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  unsigned all : 1;
+  unsigned unused : 3;
+  unsigned set : 1;
+  unsigned mtx : 1;
+  unsigned out : 1;
+  unsigned in : 1;
+#else
+  unsigned in : 1;
+  unsigned out : 1;
+  unsigned mtx : 1;
+  unsigned set : 1;
+  unsigned unused : 3;
+  unsigned all : 1;
+#endif
 } flags;
 };
 } kmp_depend_info_t;

diff  --git a/openmp/runtime/test/tasking/hidden_helper_task/common.h 
b/openmp/runtime/test/tasking/hidden_helper_task/common.h
index 402ecf3ed553c9..ba57656cbac41d 100644
--- a/openmp/runtime/test/tasking/hidden_helper_task/common.h
+++ b/openmp/runtime/test/tasking/hidden_helper_task/common.h
@@ -17,9 +17,21 @@ typedef struct kmp_depend_info {
   union {
 unsigned char flag;
 struct {
-  bool in : 1;
-  bool out : 1;
-  bool mtx : 1;
+#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+  unsigned all : 1;
+  unsigned unused : 3;
+  unsigned set : 1;
+  unsigned mtx : 1;
+  unsigned out : 1;
+  unsigned in : 1;
+#else
+  unsigned in : 1;
+  unsigned out : 1;
+  unsigned mtx : 1;
+  unsigned set : 1;
+  unsigned unused : 3;
+  unsigned all : 1;
+#endif
 } flags;
   };
 } kmp_depend_info_t;





[llvm-branch-commits] [openmp] release/18.x: [OpenMP][AIX]Define struct kmp_base_tas_lock with the order of two members swapped for big-endian (#79188) (PR #81743)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81743


[llvm-branch-commits] [openmp] 34fdf52 - [OpenMP][AIX]Define struct kmp_base_tas_lock with the order of two members swapped for big-endian (#79188)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Xing Xue
Date: 2024-02-16T05:15:11-08:00
New Revision: 34fdf52cce678cb4fd3714c31f1a798bece84303

URL: https://github.com/llvm/llvm-project/commit/34fdf52cce678cb4fd3714c31f1a798bece84303
DIFF: https://github.com/llvm/llvm-project/commit/34fdf52cce678cb4fd3714c31f1a798bece84303.diff

LOG: [OpenMP][AIX]Define struct kmp_base_tas_lock with the order of two members swapped for big-endian (#79188)

The direct lock data structure has bit `0` (the least significant bit)
of the first 32-bit word set to `1` to indicate it is a direct lock. On
the other hand, the first word (in 32-bit mode) or first two words (in
64-bit mode) of an indirect lock are the address of the entry allocated
from the indirect lock table. The runtime checks bit `0` of the first
32-bit word to tell if this is a direct or an indirect lock. This works
fine for 32-bit and 64-bit little-endian because its memory layout of a
64-bit address is (`low word`, `high word`). However, this causes
problems for big-endian where the memory layout of a 64-bit address is
(`high word`, `low word`). If an address of the indirect lock table
entry is something like `0x110035300`, i.e., (`0x1`, `0x10035300`), it
is treated as a direct lock. This patch defines `struct
kmp_base_tas_lock` with the ordering of the two 32-bit members flipped
for big-endian PPC64 so that when checking/setting tags in member
`poll`, the second word (the low word) is used. This patch also changes
places where `poll` is not already explicitly specified for
checking/setting tags.
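
The misclassification described above can be reproduced with plain integer arithmetic. The following is a sketch rather than runtime code: `first_word`, `low_word`, and `looks_direct` are hypothetical helpers, and the byte order is passed as a parameter so the example behaves the same on any host.

```c
#include <stdint.h>

/* First 32-bit word in memory of a 64-bit value: the low word on
 * little-endian layouts, the high word on big-endian layouts. */
static uint32_t first_word(uint64_t value, int big_endian) {
  return big_endian ? (uint32_t)(value >> 32)
                    : (uint32_t)(value & 0xFFFFFFFFu);
}

/* Low 32-bit word, independent of byte order. */
static uint32_t low_word(uint64_t value) {
  return (uint32_t)(value & 0xFFFFFFFFu);
}

/* Bit 0 set means the word is tagged as a direct lock. */
static int looks_direct(uint32_t word) { return (word & 1u) != 0; }
```

For the address `0x110035300` from the commit message, `first_word(..., 1)` is `0x1`, so the tag check wrongly reports a direct lock on big-endian, whereas `low_word(...)` is `0x10035300`, which is even and is correctly treated as indirect. Checking the low word is the effect the patch gets by flipping the two members of `struct kmp_base_tas_lock` and going through `poll`.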

(cherry picked from commit ac97562c99c3ae97f063048ccaf08ebdae60ac30)

Added: 


Modified: 
openmp/runtime/src/kmp_csupport.cpp
openmp/runtime/src/kmp_gsupport.cpp
openmp/runtime/src/kmp_lock.cpp
openmp/runtime/src/kmp_lock.h

Removed: 




diff  --git a/openmp/runtime/src/kmp_csupport.cpp 
b/openmp/runtime/src/kmp_csupport.cpp
index 9eeaeb88fb9ec7..878e78b5c7ad2d 100644
--- a/openmp/runtime/src/kmp_csupport.cpp
+++ b/openmp/runtime/src/kmp_csupport.cpp
@@ -1533,8 +1533,9 @@ void __kmpc_critical_with_hint(ident_t *loc, kmp_int32 
global_tid,
   kmp_dyna_lockseq_t lockseq = __kmp_map_hint_to_lock(hint);
   if (*lk == 0) {
 if (KMP_IS_D_LOCK(lockseq)) {
-  KMP_COMPARE_AND_STORE_ACQ32((volatile kmp_int32 *)crit, 0,
-  KMP_GET_D_TAG(lockseq));
+  KMP_COMPARE_AND_STORE_ACQ32(
+  (volatile kmp_int32 *)&((kmp_base_tas_lock_t *)crit)->poll, 0,
+  KMP_GET_D_TAG(lockseq));
 } else {
   __kmp_init_indirect_csptr(crit, loc, global_tid, KMP_GET_I_TAG(lockseq));
 }

diff  --git a/openmp/runtime/src/kmp_gsupport.cpp 
b/openmp/runtime/src/kmp_gsupport.cpp
index 88189659a23416..4dc8a90f83b4ea 100644
--- a/openmp/runtime/src/kmp_gsupport.cpp
+++ b/openmp/runtime/src/kmp_gsupport.cpp
@@ -144,7 +144,7 @@ void KMP_EXPAND_NAME(KMP_API_NAME_GOMP_BARRIER)(void) {
 
 // Mutual exclusion
 
-// The symbol that icc/ifort generates for unnamed for unnamed critical 
sections
+// The symbol that icc/ifort generates for unnamed critical sections
 // - .gomp_critical_user_ - is defined using .comm in any objects reference it.
 // We can't reference it directly here in C code, as the symbol contains a ".".
 //

diff  --git a/openmp/runtime/src/kmp_lock.cpp b/openmp/runtime/src/kmp_lock.cpp
index 85c54f4cdc7e96..0ad14f862bcb9b 100644
--- a/openmp/runtime/src/kmp_lock.cpp
+++ b/openmp/runtime/src/kmp_lock.cpp
@@ -2689,7 +2689,7 @@ void __kmp_spin_backoff(kmp_backoff_t *boff) {
 // lock word.
 static void __kmp_init_direct_lock(kmp_dyna_lock_t *lck,
kmp_dyna_lockseq_t seq) {
-  TCW_4(*lck, KMP_GET_D_TAG(seq));
+  TCW_4(((kmp_base_tas_lock_t *)lck)->poll, KMP_GET_D_TAG(seq));
   KA_TRACE(
   20,
   ("__kmp_init_direct_lock: initialized direct lock with type#%d\n", seq));
@@ -3180,8 +3180,8 @@ kmp_indirect_lock_t *__kmp_allocate_indirect_lock(void **user_lock,
   lck->type = tag;
 
   if (OMP_LOCK_T_SIZE < sizeof(void *)) {
-*((kmp_lock_index_t *)user_lock) = idx
-   << 1; // indirect lock word must be even
+*(kmp_lock_index_t *)&(((kmp_base_tas_lock_t *)user_lock)->poll) =
+idx << 1; // indirect lock word must be even
   } else {
 *((kmp_indirect_lock_t **)user_lock) = lck;
   }

diff  --git a/openmp/runtime/src/kmp_lock.h b/openmp/runtime/src/kmp_lock.h
index f21179b4eb68a1..e2a0cda01a9718 100644
--- a/openmp/runtime/src/kmp_lock.h
+++ b/openmp/runtime/src/kmp_lock.h
@@ -50,7 +50,7 @@ typedef struct ident ident_t;
 // recent versions), but we are bounded by the pointer-sized chunks that
 // the Intel compiler allocates.
 
-#if KMP_OS_LINUX && defined(KMP_GOMP_COMPAT)
+#if (KMP_OS_LINUX || KMP_OS_AIX) && defined(KMP_GOMP_COMPAT)
 #define OMP_LOCK_T_SIZE sizeof(int)
 #define OMP_NEST_LOCK_T_SIZE sizeof(void *)
 #else
@@ -120,8 +120,15 @@ extern void __kmp_validate_locks

[llvm-branch-commits] [lld] release/18.x: [lld] Add target support for SystemZ (s390x) (#75643) (PR #81675)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81675

>From 0a44c3792a6ff799df5f100670d7e19d1bc49f03 Mon Sep 17 00:00:00 2001
From: Ulrich Weigand 
Date: Tue, 13 Feb 2024 11:29:21 +0100
Subject: [PATCH] [lld] Add target support for SystemZ (s390x) (#75643)

This patch adds full support for linking SystemZ (ELF s390x) object
files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.

In addition to new platform code and the obvious changes, there were a
few additional changes to common code:

- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
to use a platform-specific name since nothing in the definition of these
relocs is actually platform-specific; it is well possible that other
platforms will need the same.

- A couple of tweaks to TLS relocation handling, as the particular
semantics of the s390x versions differ slightly. See comments in the
code.

This was tested by building and testing >1500 Fedora packages, with only
a handful of failures; as these also have issues when building with LLD
on other architectures, they seem unrelated.

Co-authored-by: Tulio Magno Quites Machado Filho 
(cherry picked from commit fe3406e349884e4ef61480dd0607f1e237102c74)
---
 lld/ELF/Arch/SystemZ.cpp| 607 
 lld/ELF/CMakeLists.txt  |   1 +
 lld/ELF/Driver.cpp  |   3 +-
 lld/ELF/InputFiles.cpp  |   2 +
 lld/ELF/InputSection.cpp|   7 +
 lld/ELF/Relocations.cpp |  25 +-
 lld/ELF/Relocations.h   |   3 +
 lld/ELF/ScriptParser.cpp|   1 +
 lld/ELF/SyntheticSections.cpp   |   3 +
 lld/ELF/Target.cpp  |   2 +
 lld/ELF/Target.h|   1 +
 lld/test/ELF/Inputs/systemz-init.s  |   5 +
 lld/test/ELF/basic-systemz.s|  63 ++
 lld/test/ELF/emulation-systemz.s|  29 +
 lld/test/ELF/lto/systemz.ll |  18 +
 lld/test/ELF/systemz-got.s  |  16 +
 lld/test/ELF/systemz-gotent-relax-align.s   |  48 ++
 lld/test/ELF/systemz-gotent-relax-und-dso.s |  68 +++
 lld/test/ELF/systemz-gotent-relax.s |  91 +++
 lld/test/ELF/systemz-ifunc-nonpreemptible.s |  75 +++
 lld/test/ELF/systemz-init-padding.s |  27 +
 lld/test/ELF/systemz-pie.s  |  38 ++
 lld/test/ELF/systemz-plt.s  |  83 +++
 lld/test/ELF/systemz-reloc-abs.s|  32 ++
 lld/test/ELF/systemz-reloc-disp12.s |  21 +
 lld/test/ELF/systemz-reloc-disp20.s |  21 +
 lld/test/ELF/systemz-reloc-got.s|  92 +++
 lld/test/ELF/systemz-reloc-gotrel.s |  36 ++
 lld/test/ELF/systemz-reloc-pc16.s   |  39 ++
 lld/test/ELF/systemz-reloc-pc32.s   |  39 ++
 lld/test/ELF/systemz-reloc-pcdbl.s  |  68 +++
 lld/test/ELF/systemz-tls-gd.s   | 142 +
 lld/test/ELF/systemz-tls-ie.s   |  87 +++
 lld/test/ELF/systemz-tls-ld.s   | 114 
 lld/test/ELF/systemz-tls-le.s   |  61 ++
 lld/test/lit.cfg.py |   1 +
 36 files changed, 1959 insertions(+), 10 deletions(-)
 create mode 100644 lld/ELF/Arch/SystemZ.cpp
 create mode 100644 lld/test/ELF/Inputs/systemz-init.s
 create mode 100644 lld/test/ELF/basic-systemz.s
 create mode 100644 lld/test/ELF/emulation-systemz.s
 create mode 100644 lld/test/ELF/lto/systemz.ll
 create mode 100644 lld/test/ELF/systemz-got.s
 create mode 100644 lld/test/ELF/systemz-gotent-relax-align.s
 create mode 100644 lld/test/ELF/systemz-gotent-relax-und-dso.s
 create mode 100644 lld/test/ELF/systemz-gotent-relax.s
 create mode 100644 lld/test/ELF/systemz-ifunc-nonpreemptible.s
 create mode 100644 lld/test/ELF/systemz-init-padding.s
 create mode 100644 lld/test/ELF/systemz-pie.s
 create mode 100644 lld/test/ELF/systemz-plt.s
 create mode 100644 lld/test/ELF/systemz-reloc-abs.s
 create mode 100644 lld/test/ELF/systemz-reloc-disp12.s
 create mode 100644 lld/test/ELF/systemz-reloc-disp20.s
 create mode 100644 lld/test/ELF/systemz-reloc-got.s
 create mode 100644 lld/test/ELF/systemz-reloc-gotrel.s
 create mode 100644 lld/test/ELF/systemz-reloc-pc16.s
 create mode 100644 lld/test/ELF/systemz-reloc-pc32.s
 create mode 100644 lld/test/ELF/systemz-reloc-pcdbl.s
 create mode 100644 lld/test/ELF/systemz-tls-gd.s
 create mode 100644 lld/test/ELF/systemz-tls-ie.s
 create mode 100644 lld/test/ELF/systemz-tls-ld.s
 create mode 100644 lld/test/ELF/systemz-tls-le.s

diff --git a/lld/ELF/Arch/SystemZ.cpp b/lld/ELF/Arch/SystemZ.cpp
new file mode 100644
index 00..d37db6877559dc
--- /dev/null
+++ b/lld/ELF/Arch/SystemZ.cpp
@@ -0,0 +1,607 @@
+/

[llvm-branch-commits] [lld] 0a44c37 - [lld] Add target support for SystemZ (s390x) (#75643)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Ulrich Weigand
Date: 2024-02-16T05:17:32-08:00
New Revision: 0a44c3792a6ff799df5f100670d7e19d1bc49f03

URL: 
https://github.com/llvm/llvm-project/commit/0a44c3792a6ff799df5f100670d7e19d1bc49f03
DIFF: 
https://github.com/llvm/llvm-project/commit/0a44c3792a6ff799df5f100670d7e19d1bc49f03.diff

LOG: [lld] Add target support for SystemZ (s390x) (#75643)

This patch adds full support for linking SystemZ (ELF s390x) object
files. Support should be generally complete:
- All relocation types are supported.
- Full shared library support (DYNAMIC, GOT, PLT, ifunc).
- Relaxation of TLS and GOT relocations where appropriate.
- Platform-specific test cases.

In addition to new platform code and the obvious changes, there were a
few additional changes to common code:

- Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
to use a platform-specific name since nothing in the definition of these
relocs is actually platform-specific; it is well possible that other
platforms will need the same.

- A couple of tweaks to TLS relocation handling, as the particular
semantics of the s390x versions differ slightly. See comments in the
code.

This was tested by building and testing >1500 Fedora packages, with only
a handful of failures; as these also have issues when building with LLD
on other architectures, they seem unrelated.

Co-authored-by: Tulio Magno Quites Machado Filho 
(cherry picked from commit fe3406e349884e4ef61480dd0607f1e237102c74)

Added: 
lld/ELF/Arch/SystemZ.cpp
lld/test/ELF/Inputs/systemz-init.s
lld/test/ELF/basic-systemz.s
lld/test/ELF/emulation-systemz.s
lld/test/ELF/lto/systemz.ll
lld/test/ELF/systemz-got.s
lld/test/ELF/systemz-gotent-relax-align.s
lld/test/ELF/systemz-gotent-relax-und-dso.s
lld/test/ELF/systemz-gotent-relax.s
lld/test/ELF/systemz-ifunc-nonpreemptible.s
lld/test/ELF/systemz-init-padding.s
lld/test/ELF/systemz-pie.s
lld/test/ELF/systemz-plt.s
lld/test/ELF/systemz-reloc-abs.s
lld/test/ELF/systemz-reloc-disp12.s
lld/test/ELF/systemz-reloc-disp20.s
lld/test/ELF/systemz-reloc-got.s
lld/test/ELF/systemz-reloc-gotrel.s
lld/test/ELF/systemz-reloc-pc16.s
lld/test/ELF/systemz-reloc-pc32.s
lld/test/ELF/systemz-reloc-pcdbl.s
lld/test/ELF/systemz-tls-gd.s
lld/test/ELF/systemz-tls-ie.s
lld/test/ELF/systemz-tls-ld.s
lld/test/ELF/systemz-tls-le.s

Modified: 
lld/ELF/CMakeLists.txt
lld/ELF/Driver.cpp
lld/ELF/InputFiles.cpp
lld/ELF/InputSection.cpp
lld/ELF/Relocations.cpp
lld/ELF/Relocations.h
lld/ELF/ScriptParser.cpp
lld/ELF/SyntheticSections.cpp
lld/ELF/Target.cpp
lld/ELF/Target.h
lld/test/lit.cfg.py

Removed: 




diff  --git a/lld/ELF/Arch/SystemZ.cpp b/lld/ELF/Arch/SystemZ.cpp
new file mode 100644
index 00..d37db6877559dc
--- /dev/null
+++ b/lld/ELF/Arch/SystemZ.cpp
@@ -0,0 +1,607 @@
+//===- SystemZ.cpp --------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "OutputSections.h"
+#include "Symbols.h"
+#include "SyntheticSections.h"
+#include "Target.h"
+#include "lld/Common/ErrorHandler.h"
+#include "llvm/BinaryFormat/ELF.h"
+#include "llvm/Support/Endian.h"
+
+using namespace llvm;
+using namespace llvm::support::endian;
+using namespace llvm::ELF;
+using namespace lld;
+using namespace lld::elf;
+
+namespace {
+class SystemZ : public TargetInfo {
+public:
+  SystemZ();
+  int getTlsGdRelaxSkip(RelType type) const override;
+  RelExpr getRelExpr(RelType type, const Symbol &s,
+ const uint8_t *loc) const override;
+  RelType getDynRel(RelType type) const override;
+  void writeGotHeader(uint8_t *buf) const override;
+  void writeGotPlt(uint8_t *buf, const Symbol &s) const override;
+  void writeIgotPlt(uint8_t *buf, const Symbol &s) const override;
+  void writePltHeader(uint8_t *buf) const override;
+  void addPltHeaderSymbols(InputSection &isd) const override;
+  void writePlt(uint8_t *buf, const Symbol &sym,
+uint64_t pltEntryAddr) const override;
+  RelExpr adjustTlsExpr(RelType type, RelExpr expr) const override;
+  RelExpr adjustGotPcExpr(RelType type, int64_t addend,
+  const uint8_t *loc) const override;
+  bool relaxOnce(int pass) const override;
+  void relocate(uint8_t *loc, const Relocation &rel,
+uint64_t val) const override;
+  int64_t getImplicitAddend(const uint8_t *buf, RelType type) const override;
+
+private:
+  void relaxGot(uint8_t *loc, const Relocation &rel, uint64_t val) const;
+

[llvm-branch-commits] [lld] release/18.x: [lld] Add target support for SystemZ (s390x) (#75643) (PR #81675)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81675
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79861 (PR #80832)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@nikic This backport introduces a build failure.

https://github.com/llvm/llvm-project/pull/80832


[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#81000 (PR #81003)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

ping @vitalybuka does this look OK?

https://github.com/llvm/llvm-project/pull/81003


[llvm-branch-commits] [llvm] release/18.x: [RISCV] Check type is legal before combining mgather to vlse intrinsic (#81107) (PR #81568)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Ping @preames @topperc what do you think about backporting this?

https://github.com/llvm/llvm-project/pull/81568


[llvm-branch-commits] [llvm] Backport [DAGCombine] Fix multi-use miscompile in load combine (#81586) (PR #81633)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@RKSimon What do you think about backporting this?

https://github.com/llvm/llvm-project/pull/81633


[llvm-branch-commits] [llvm] release/18.x: [SLP]Fix PR79229: Check that extractelement is used only in a single node (PR #81984)

2024-02-16 Thread Alexey Bataev via llvm-branch-commits

https://github.com/alexey-bataev approved this pull request.

LG

https://github.com/llvm/llvm-project/pull/81984


[llvm-branch-commits] [clang] [llvm] release/18.x: [AArch64] Backport Ampere1B support (#81297 , #81341, and #81744) (PR #81857)

2024-02-16 Thread Philipp Tomsich via llvm-branch-commits

https://github.com/ptomsich edited 
https://github.com/llvm/llvm-project/pull/81857


[llvm-branch-commits] [lld] release/18.x: [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739) (PR #81990)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/81990

Backport 6f907733e65d24edad65f763fb14402464bd578b

Requested by: @uweigand

>From 05922282105dff080bbe1e0bbb04a1f0d639d850 Mon Sep 17 00:00:00 2001
From: Ulrich Weigand 
Date: Wed, 14 Feb 2024 18:26:38 +0100
Subject: [PATCH] [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie
 (#81739)

With the new SystemZ port we noticed that -pie executables generated
from files containing R_390_TLS_IEENT relocations will have unnecessary
relocations in their GOT:

9e8d8: R_390_TLS_TPOFF  *ABS*+0x18

This is caused by the config->isPic condition in addTpOffsetGotEntry:

 static void addTpOffsetGotEntry(Symbol &sym) {
   in.got->addEntry(sym);
   uint64_t off = sym.getGotOffset();
   if (!sym.isPreemptible && !config->isPic) {
 in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
 return;
   }

It is correct that we need to retain a TPOFF relocation if the target
symbol is preemptible or if we're building a shared library. But when
building a -pie executable, those values are fixed at link time and
there's no need for any remaining dynamic relocation.

Note that the equivalent MIPS-specific code in MipsGotSection::build
checks for config->shared instead of config->isPic; we should use the
same check here. (Note also that on many other platforms we're not even
using addTpOffsetGotEntry in this case as an IE->LE relaxation is
applied before; we don't have this type of relaxation on SystemZ.)
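
As a rough illustration of the distinction drawn above, here is a minimal C++
sketch (not lld's actual code). LinkConfig, needsDynamicTpoffOld and
needsDynamicTpoffNew are hypothetical names invented for this example; only the
two flags mirror the config fields named in the message, and the assumption
that -pie sets isPic but not shared is taken from the description above.

```cpp
#include <cassert>

// Hypothetical stand-in for the two lld settings discussed in the message.
struct LinkConfig {
  bool shared; // true when building a shared library (-shared)
  bool isPic;  // true for any position-independent output (-shared or -pie)
};

// Old condition: keep a dynamic TPOFF relocation whenever the symbol is
// preemptible or the output is position independent.
inline bool needsDynamicTpoffOld(bool isPreemptible, const LinkConfig &cfg) {
  return isPreemptible || cfg.isPic;
}

// New condition: keep it only when the symbol is preemptible or a shared
// library is being built; for -pie the TP offset is fixed at link time.
inline bool needsDynamicTpoffNew(bool isPreemptible, const LinkConfig &cfg) {
  return isPreemptible || cfg.shared;
}
```

For a non-preemptible symbol in a -pie link (shared == false, isPic == true)
the old predicate asks for a dynamic relocation while the new one does not;
for -shared, and for preemptible symbols, both predicates agree.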

(cherry picked from commit 6f907733e65d24edad65f763fb14402464bd578b)
---
 lld/ELF/Relocations.cpp   |  2 +-
 lld/test/ELF/systemz-tls-ie.s | 34 ++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index f64b4219e0acc1..619fbaf5dc5452 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -940,7 +940,7 @@ void elf::addGotEntry(Symbol &sym) {
 static void addTpOffsetGotEntry(Symbol &sym) {
   in.got->addEntry(sym);
   uint64_t off = sym.getGotOffset();
-  if (!sym.isPreemptible && !config->isPic) {
+  if (!sym.isPreemptible && !config->shared) {
 in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
 return;
   }
diff --git a/lld/test/ELF/systemz-tls-ie.s b/lld/test/ELF/systemz-tls-ie.s
index 27b642ed2dfc5f..85e2f24cb61f62 100644
--- a/lld/test/ELF/systemz-tls-ie.s
+++ b/lld/test/ELF/systemz-tls-ie.s
@@ -12,6 +12,14 @@
 # RUN: llvm-objdump --section .data --full-contents %t | FileCheck --check-prefix=LE-DATA %s
 # RUN: llvm-objdump --section .got --full-contents %t | FileCheck --check-prefix=LE-GOT %s
 
+## With -pie we still have the R_390_RELATIVE for the data element, but all GOT
+## entries should be fully resolved without any remaining R_390_TLS_TPOFF.
+# RUN: ld.lld -pie %t.o -o %t.pie
+# RUN: llvm-readelf -r %t.pie | FileCheck --check-prefix=PIE-REL %s
+# RUN: llvm-objdump -d --no-show-raw-insn %t.pie | FileCheck --check-prefix=PIE %s
+# RUN: llvm-objdump --section .data --full-contents %t.pie | FileCheck --check-prefix=PIE-DATA %s
+# RUN: llvm-objdump --section .got --full-contents %t.pie | FileCheck --check-prefix=PIE-GOT %s
+
 # IE-REL: Relocation section '.rela.dyn' at offset {{.*}} contains 4 entries:
 # IE-REL: 3478 000c R_390_RELATIVE 2460
 # IE-REL: 2460 00010038 R_390_TLS_TPOFF 0008 a + 0
@@ -58,6 +66,32 @@
 # LE-GOT: 1002248    fff8
 # LE-GOT: 1002258  fffc  
 
+# PIE-REL: Relocation section '.rela.dyn' at offset {{.*}} contains 1 entries:
+# PIE-REL: 33d0 000c R_390_RELATIVE 23b8
+
+## TP offset for a is at 0x23b8
+# PIE:  lgrl%r1, 0x23b8
+# PIE-NEXT: lgf %r1, 0(%r1,%r7)
+
+## TP offset for b is at 0x23c0
+# PIE-NEXT: lgrl%r1, 0x23c0
+# PIE-NEXT: lgf %r1, 0(%r1,%r7)
+
+## TP offset for c is at 0x23c8
+# PIE-NEXT: lgrl%r1, 0x23c8
+# PIE-NEXT: lgf %r1, 0(%r1,%r7)
+
+## Data element: TP offset for a is at 0x23b8 (relocated via R_390_RELATIVE above)
+# PIE-DATA: 33d0  
+
+## TP offsets in GOT:
+# a: -8
+# b: -4
+# c: 0
+# PIE-GOT: 23a0  22d0  
+# PIE-GOT: 23b0    fff8
+# PIE-GOT: 23c0  fffc  
+
 ear %r7,%a0
 sllg%r7,%r1,32
 ear %r7,%a1



[llvm-branch-commits] [lld] release/18.x: [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739) (PR #81990)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/81990


[llvm-branch-commits] [lld] release/18.x: [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739) (PR #81990)

2024-02-16 Thread via llvm-branch-commits

llvmbot wrote:

@MaskRay What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/81990


[llvm-branch-commits] [lld] release/18.x: [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739) (PR #81990)

2024-02-16 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-lld-elf

Author: None (llvmbot)


Changes

Backport 6f907733e65d24edad65f763fb14402464bd578b

Requested by: @uweigand

---
Full diff: https://github.com/llvm/llvm-project/pull/81990.diff


2 Files Affected:

- (modified) lld/ELF/Relocations.cpp (+1-1) 
- (modified) lld/test/ELF/systemz-tls-ie.s (+34) 






https://github.com/llvm/llvm-project/pull/81990


[llvm-branch-commits] [lld] release/18.x: [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739) (PR #81990)

2024-02-16 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-lld

Author: None (llvmbot)


Changes

Backport 6f907733e65d24edad65f763fb14402464bd578b

Requested by: @uweigand

---
Full diff: https://github.com/llvm/llvm-project/pull/81990.diff


2 Files Affected:

- (modified) lld/ELF/Relocations.cpp (+1-1) 
- (modified) lld/test/ELF/systemz-tls-ie.s (+34) 






https://github.com/llvm/llvm-project/pull/81990


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79861 (PR #80832)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80832

>From 8bc0739799778d12ed6f4e89f28945368ea64386 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Thu, 1 Feb 2024 12:57:59 +0100
Subject: [PATCH 1/4] [IndVars] Add tests for #79861 (NFC)

(cherry picked from commit c105848fd29d3b46eeb794bb6b10dad04f903b09)
---
 .../test/Transforms/IndVarSimplify/pr79861.ll | 104 ++
 1 file changed, 104 insertions(+)
 create mode 100644 llvm/test/Transforms/IndVarSimplify/pr79861.ll

diff --git a/llvm/test/Transforms/IndVarSimplify/pr79861.ll b/llvm/test/Transforms/IndVarSimplify/pr79861.ll
new file mode 100644
index 00..a8e2aa42a365ce
--- /dev/null
+++ b/llvm/test/Transforms/IndVarSimplify/pr79861.ll
@@ -0,0 +1,104 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=indvars < %s | FileCheck %s
+
+target datalayout = "n64"
+
+declare void @use(i64)
+
+define void @or_disjoint() {
+; CHECK-LABEL: define void @or_disjoint() {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:br label [[LOOP:%.*]]
+; CHECK:   loop:
+; CHECK-NEXT:[[IV:%.*]] = phi i64 [ 2, [[ENTRY:%.*]] ], [ [[IV_DEC:%.*]], [[LOOP]] ]
+; CHECK-NEXT:[[OR:%.*]] = or disjoint i64 [[IV]], 1
+; CHECK-NEXT:call void @use(i64 [[OR]])
+; CHECK-NEXT:[[IV_DEC]] = add nsw i64 [[IV]], -1
+; CHECK-NEXT:[[EXIT_COND:%.*]] = icmp eq i64 [[IV_DEC]], 0
+; CHECK-NEXT:br i1 [[EXIT_COND]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:   exit:
+; CHECK-NEXT:ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 2, %entry ], [ %iv.dec, %loop ]
+  %or = or disjoint i64 %iv, 1
+  %add = add nsw i64 %iv, 1
+  %sel = select i1 false, i64 %or, i64 %add
+  call void @use(i64 %sel)
+
+  %iv.dec = add nsw i64 %iv, -1
+  %exit.cond = icmp eq i64 %iv.dec, 0
+  br i1 %exit.cond, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @add_nowrap_flags(i64 %n) {
+; CHECK-LABEL: define void @add_nowrap_flags(
+; CHECK-SAME: i64 [[N:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:br label [[LOOP:%.*]]
+; CHECK:   loop:
+; CHECK-NEXT:[[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_INC:%.*]], [[LOOP]] ]
+; CHECK-NEXT:[[ADD1:%.*]] = add nuw nsw i64 [[IV]], 123
+; CHECK-NEXT:call void @use(i64 [[ADD1]])
+; CHECK-NEXT:[[IV_INC]] = add i64 [[IV]], 1
+; CHECK-NEXT:[[EXIT_COND:%.*]] = icmp eq i64 [[IV_INC]], [[N]]
+; CHECK-NEXT:br i1 [[EXIT_COND]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:   exit:
+; CHECK-NEXT:ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 0, %entry ], [ %iv.inc, %loop ]
+  %add1 = add nuw nsw i64 %iv, 123
+  %add2 = add i64 %iv, 123
+  %sel = select i1 false, i64 %add1, i64 %add2
+  call void @use(i64 %sel)
+
+  %iv.inc = add i64 %iv, 1
+  %exit.cond = icmp eq i64 %iv.inc, %n
+  br i1 %exit.cond, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+
+define void @expander_or_disjoint(i64 %n) {
+; CHECK-LABEL: define void @expander_or_disjoint(
+; CHECK-SAME: i64 [[N:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[OR:%.*]] = or i64 [[N]], 1
+; CHECK-NEXT:br label [[LOOP:%.*]]
+; CHECK:   loop:
+; CHECK-NEXT:[[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_INC:%.*]], [[LOOP]] ]
+; CHECK-NEXT:[[IV_INC]] = add i64 [[IV]], 1
+; CHECK-NEXT:[[ADD:%.*]] = add i64 [[IV]], [[OR]]
+; CHECK-NEXT:call void @use(i64 [[ADD]])
+; CHECK-NEXT:[[EXITCOND:%.*]] = icmp ne i64 [[IV_INC]], [[OR]]
+; CHECK-NEXT:br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK:   exit:
+; CHECK-NEXT:ret void
+;
+entry:
+  %or = or disjoint i64 %n, 1
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 0, %entry ], [ %iv.inc, %loop ]
+  %iv.inc = add i64 %iv, 1
+  %add = add i64 %iv, %or
+  call void @use(i64 %add)
+  %cmp = icmp ult i64 %iv, %n
+  br i1 %cmp, label %loop, label %exit
+
+exit:
+  ret void
+}

>From cb65f48dc90dc602e775cc6bb877db3dcd2ae609 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Fri, 2 Feb 2024 10:52:05 +0100
Subject: [PATCH 2/4] [SCEVExpander] Do not reuse disjoint or (#80281)

SCEV treats "or disjoint" the same as "add nsw nuw". However, when
expanding, we cannot generally replace an add SCEV node with an "or
disjoint" instruction. Just dropping the poison flag is insufficient in
this case, we would have to actually convert the or into an add.

This is a partial fix for #79861.
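
To illustrate why the two forms are not interchangeable, the following C++
sketch (plain integer arithmetic, not LLVM code; both helper names are made up
for this note) checks when a bitwise or produces the same value as an add. In
IR the mismatch is worse than a wrong value, since an "or disjoint" whose
operands share a set bit yields poison.

```cpp
#include <cassert>
#include <cstdint>

// True when no bit is set in both operands, which is the precondition that
// the "disjoint" flag on an or instruction asserts.
inline bool bitsAreDisjoint(uint64_t a, uint64_t b) { return (a & b) == 0; }

// True when a bitwise or and a (wrapping) add give the same result.
inline bool orMatchesAdd(uint64_t a, uint64_t b) { return (a | b) == a + b; }
```

With a = 2 and b = 1 the operands are disjoint and both operations give 3.
With a = 1 and b = 1 they overlap: the or gives 1 while the add gives 2, so an
expression that SCEV models as an add cannot simply be emitted as an or.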

(cherry picked from commit 5b8e1a6ebf11b6e93bcc96a0d009febe4bb3d7bc)
---
 llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp | 7 +++
 llvm/test/Transforms/IndVarSimplify/pr79861.ll| 5 +++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index a1d7f0f9ba0f74..e6f93e72c98a77 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79861 (PR #80832)

2024-02-16 Thread Nikita Popov via llvm-branch-commits

nikic wrote:

@tstellar Thanks, should be fixed now. I missed one necessary commit.

https://github.com/llvm/llvm-project/pull/80832


[llvm-branch-commits] [llvm] Backport [DAGCombine] Fix multi-use miscompile in load combine (#81586) (PR #81633)

2024-02-16 Thread Simon Pilgrim via llvm-branch-commits

https://github.com/RKSimon approved this pull request.

LGTM for backport

https://github.com/llvm/llvm-project/pull/81633


[llvm-branch-commits] [clang] ddc2a5f - [18.x][Docs] Add release note about Clang-defined target OS macros (#80044)

2024-02-16 Thread via llvm-branch-commits

Author: Zixu Wang
Date: 2024-02-16T05:36:18-08:00
New Revision: ddc2a5ff4e149d07fcda735c1d860be95006fe2a

URL: 
https://github.com/llvm/llvm-project/commit/ddc2a5ff4e149d07fcda735c1d860be95006fe2a
DIFF: 
https://github.com/llvm/llvm-project/commit/ddc2a5ff4e149d07fcda735c1d860be95006fe2a.diff

LOG: [18.x][Docs] Add release note about Clang-defined target OS macros (#80044)

The change is included in the 18.x release. Move the release note to the
release branch and reformat.

(cherry picked from commit b40d5b1b08564d23d5e0769892ebbc32447b2987)

Added: 


Modified: 
clang/docs/ReleaseNotes.rst

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 9edbfbfbbac02e..93a67e7a895592 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -171,6 +171,22 @@ AST Dumping Potentially Breaking Changes
   "qualType": "foo"
 }
 
+Clang Frontend Potentially Breaking Changes
+---
+- Target OS macros extension
+  A new Clang extension (see :ref:`here `) is enabled for
+  Darwin (Apple platform) targets. Clang now defines ``TARGET_OS_*`` macros for
+  these targets, which could break existing code bases with improper checks for
+  the ``TARGET_OS_`` macros. For example, existing checks might fail to include
+  the ``TargetConditionals.h`` header from Apple SDKs and therefore leaving the
+  macros undefined and guarded code unexercised.
+
+  Affected code should be checked to see if it's still intended for the specific
+  target and fixed accordingly.
+
+  The extension can be turned off by the option ``-fno-define-target-os-macros``
+  as a workaround.
+
 What's New in Clang |release|?
 ==============================
 Some of the major new features and improvements to Clang are listed
@@ -351,6 +367,15 @@ New Compiler Flags
 * Full register names can be used when printing assembly via ``-mregnames``.
   This option now matches the one used by GCC.
 
+.. _target_os_detail:
+
+* ``-fdefine-target-os-macros`` and its complement
+  ``-fno-define-target-os-macros``. Enables or disables the Clang extension to
+  provide built-in definitions of a list of ``TARGET_OS_*`` macros based on the
+  target triple.
+
+  The extension is enabled by default for Darwin (Apple platform) targets.
+
 Deprecated Compiler Flags
 -------------------------
 





[llvm-branch-commits] [clang] [18.x][Docs] Add release note about Clang-defined target OS macros (PR #80044)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80044


[llvm-branch-commits] [lld] PR for llvm/llvm-project#80789 (PR #80790)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

The CI failure is unrelated to this PR.

https://github.com/llvm/llvm-project/pull/80790


[llvm-branch-commits] [lld] PR for llvm/llvm-project#80789 (PR #80790)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80790

>From 60a8ec3a35c722a9eb8298c215321b89d0faf5b5 Mon Sep 17 00:00:00 2001
From: Jinyang He 
Date: Tue, 6 Feb 2024 09:09:13 +0800
Subject: [PATCH] [lld][ELF] Support relax R_LARCH_ALIGN (#78692)

Refer to commit 6611d58f5bbc ("Relax R_RISCV_ALIGN"), we can relax
R_LARCH_ALIGN by same way. Reuse `SymbolAnchor`, `RISCVRelaxAux` and
`initSymbolAnchors` to simplify codes. As `riscvFinalizeRelax` is an
arch-specific function, put it override on `TargetInfo::finalizeRelax`,
so that LoongArch can override it, too.

The flow of relax R_LARCH_ALIGN is almost consistent with RISCV. The
difference is that LoongArch only has 4-bytes NOP and all executable
insn is 4-bytes aligned. So LoongArch not need rewrite NOP sequence.
Alignment maxBytesEmit parameter is supported in psABI v2.30.

(cherry picked from commit 06a728f3feab876f9195738b5774e82dadc0f3a7)
---
 lld/ELF/Arch/LoongArch.cpp | 156 -
 lld/ELF/Arch/RISCV.cpp |  29 +---
 lld/ELF/InputSection.cpp   |   7 +-
 lld/ELF/InputSection.h |  24 +++-
 lld/ELF/Target.h   |   3 +
 lld/ELF/Writer.cpp |   4 +-
 lld/test/ELF/loongarch-relax-align.s   | 126 +
 lld/test/ELF/loongarch-relax-emit-relocs.s |  49 +++
 8 files changed, 363 insertions(+), 35 deletions(-)
 create mode 100644 lld/test/ELF/loongarch-relax-align.s
 create mode 100644 lld/test/ELF/loongarch-relax-emit-relocs.s

diff --git a/lld/ELF/Arch/LoongArch.cpp b/lld/ELF/Arch/LoongArch.cpp
index ab2ec5b447d000..05fd38fb753fda 100644
--- a/lld/ELF/Arch/LoongArch.cpp
+++ b/lld/ELF/Arch/LoongArch.cpp
@@ -36,6 +36,8 @@ class LoongArch final : public TargetInfo {
   bool usesOnlyLowPageBits(RelType type) const override;
   void relocate(uint8_t *loc, const Relocation &rel,
 uint64_t val) const override;
+  bool relaxOnce(int pass) const override;
+  void finalizeRelax(int passes) const override;
 };
 } // end anonymous namespace
 
@@ -465,8 +467,9 @@ RelExpr LoongArch::getRelExpr(const RelType type, const Symbol &s,
   case R_LARCH_TLS_GD_HI20:
 return R_TLSGD_GOT;
   case R_LARCH_RELAX:
-// LoongArch linker relaxation is not implemented yet.
-return R_NONE;
+return config->relax ? R_RELAX_HINT : R_NONE;
+  case R_LARCH_ALIGN:
+return R_RELAX_HINT;
 
   // Other known relocs that are explicitly unimplemented:
   //
@@ -659,6 +662,155 @@ void LoongArch::relocate(uint8_t *loc, const Relocation &rel,
   }
 }
 
+static bool relax(InputSection &sec) {
+  const uint64_t secAddr = sec.getVA();
+  const MutableArrayRef<Relocation> relocs = sec.relocs();
+  auto &aux = *sec.relaxAux;
+  bool changed = false;
+  ArrayRef<SymbolAnchor> sa = ArrayRef(aux.anchors);
+  uint64_t delta = 0;
+
+  std::fill_n(aux.relocTypes.get(), relocs.size(), R_LARCH_NONE);
+  aux.writes.clear();
+  for (auto [i, r] : llvm::enumerate(relocs)) {
+const uint64_t loc = secAddr + r.offset - delta;
+uint32_t &cur = aux.relocDeltas[i], remove = 0;
+switch (r.type) {
+case R_LARCH_ALIGN: {
+  const uint64_t addend =
+  r.sym->isUndefined() ? Log2_64(r.addend) + 1 : r.addend;
+  const uint64_t allBytes = (1 << (addend & 0xff)) - 4;
+  const uint64_t align = 1 << (addend & 0xff);
+  const uint64_t maxBytes = addend >> 8;
+  const uint64_t off = loc & (align - 1);
+  const uint64_t curBytes = off == 0 ? 0 : align - off;
+  // All bytes beyond the alignment boundary should be removed.
+  // If emit bytes more than max bytes to emit, remove all.
+  if (maxBytes != 0 && curBytes > maxBytes)
+remove = allBytes;
+  else
+remove = allBytes - curBytes;
+  // If we can't satisfy this alignment, we've found a bad input.
+  if (LLVM_UNLIKELY(static_cast<int32_t>(remove) < 0)) {
+errorOrWarn(getErrorLocation((const uint8_t *)loc) +
+"insufficient padding bytes for " + lld::toString(r.type) +
+": " + Twine(allBytes) + " bytes available for " +
+"requested alignment of " + Twine(align) + " bytes");
+remove = 0;
+  }
+  break;
+}
+}
+
+// For all anchors whose offsets are <= r.offset, they are preceded by
+// the previous relocation whose `relocDeltas` value equals `delta`.
+// Decrease their st_value and update their st_size.
+for (; sa.size() && sa[0].offset <= r.offset; sa = sa.slice(1)) {
+  if (sa[0].end)
+sa[0].d->size = sa[0].offset - delta - sa[0].d->value;
+  else
+sa[0].d->value = sa[0].offset - delta;
+}
+delta += remove;
+if (delta != cur) {
+  cur = delta;
+  changed = true;
+}
+  }
+
+  for (const SymbolAnchor &a : sa) {
+if (a.end)
+  a.d->size = a.offset - delta - a.d->value;
+else
+  a.d->value = a.offset - delta;
+  }
+  // Inform assignAddresses that the siz

[llvm-branch-commits] [lld] 60a8ec3 - [lld][ELF] Support relax R_LARCH_ALIGN (#78692)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Jinyang He
Date: 2024-02-16T05:39:14-08:00
New Revision: 60a8ec3a35c722a9eb8298c215321b89d0faf5b5

URL: 
https://github.com/llvm/llvm-project/commit/60a8ec3a35c722a9eb8298c215321b89d0faf5b5
DIFF: 
https://github.com/llvm/llvm-project/commit/60a8ec3a35c722a9eb8298c215321b89d0faf5b5.diff

LOG: [lld][ELF] Support relax R_LARCH_ALIGN (#78692)

Refer to commit 6611d58f5bbc ("Relax R_RISCV_ALIGN"), we can relax
R_LARCH_ALIGN by same way. Reuse `SymbolAnchor`, `RISCVRelaxAux` and
`initSymbolAnchors` to simplify codes. As `riscvFinalizeRelax` is an
arch-specific function, put it override on `TargetInfo::finalizeRelax`,
so that LoongArch can override it, too.

The flow of relax R_LARCH_ALIGN is almost consistent with RISCV. The
difference is that LoongArch only has 4-bytes NOP and all executable
insn is 4-bytes aligned. So LoongArch not need rewrite NOP sequence.
Alignment maxBytesEmit parameter is supported in psABI v2.30.

(cherry picked from commit 06a728f3feab876f9195738b5774e82dadc0f3a7)
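
The per-relocation arithmetic in the `relax` function of the diff below can
be followed in isolation. This is a minimal standalone sketch, not lld code:
`AlignRelax` and `computeAlignRemove` are hypothetical names, and it assumes
the addend is already in the encoded form used on the non-undefined-symbol
path (low 8 bits hold log2 of the alignment, the upper bits hold the maximum
number of bytes to emit). The result is kept signed here so the "insufficient
padding" case would show up as a negative value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: padding reserved by the assembler, padding
// actually needed at this address, and the number of bytes to delete.
struct AlignRelax {
  uint64_t allBytes;
  uint64_t curBytes;
  int64_t remove;
};

// Mirrors the R_LARCH_ALIGN case for one relocation located at address loc.
AlignRelax computeAlignRemove(uint64_t loc, uint64_t addend) {
  const uint64_t align = uint64_t(1) << (addend & 0xff);
  const uint64_t allBytes = align - 4;  // NOPs are 4 bytes on LoongArch
  const uint64_t maxBytes = addend >> 8;
  const uint64_t off = loc & (align - 1);
  const uint64_t curBytes = off == 0 ? 0 : align - off;
  int64_t remove;
  if (maxBytes != 0 && curBytes > maxBytes)
    remove = static_cast<int64_t>(allBytes);  // over the limit: drop all padding
  else
    remove = static_cast<int64_t>(allBytes) - static_cast<int64_t>(curBytes);
  return {allBytes, curBytes, remove};
}
```

For a 16-byte alignment (addend 4) the assembler reserves 12 bytes. At an
address that is already aligned all 12 are removed, and at an address 4 bytes
past a boundary none are removed. With a maximum of 4 bytes (addend 0x404),
an address that would need 8 bytes of padding has all 12 removed instead.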

Added: 
lld/test/ELF/loongarch-relax-align.s
lld/test/ELF/loongarch-relax-emit-relocs.s

Modified: 
lld/ELF/Arch/LoongArch.cpp
lld/ELF/Arch/RISCV.cpp
lld/ELF/InputSection.cpp
lld/ELF/InputSection.h
lld/ELF/Target.h
lld/ELF/Writer.cpp

Removed: 




diff  --git a/lld/ELF/Arch/LoongArch.cpp b/lld/ELF/Arch/LoongArch.cpp
index ab2ec5b447d000..05fd38fb753fda 100644
--- a/lld/ELF/Arch/LoongArch.cpp
+++ b/lld/ELF/Arch/LoongArch.cpp
@@ -36,6 +36,8 @@ class LoongArch final : public TargetInfo {
   bool usesOnlyLowPageBits(RelType type) const override;
   void relocate(uint8_t *loc, const Relocation &rel,
 uint64_t val) const override;
+  bool relaxOnce(int pass) const override;
+  void finalizeRelax(int passes) const override;
 };
 } // end anonymous namespace
 
@@ -465,8 +467,9 @@ RelExpr LoongArch::getRelExpr(const RelType type, const Symbol &s,
   case R_LARCH_TLS_GD_HI20:
 return R_TLSGD_GOT;
   case R_LARCH_RELAX:
-// LoongArch linker relaxation is not implemented yet.
-return R_NONE;
+return config->relax ? R_RELAX_HINT : R_NONE;
+  case R_LARCH_ALIGN:
+return R_RELAX_HINT;
 
   // Other known relocs that are explicitly unimplemented:
   //
@@ -659,6 +662,155 @@ void LoongArch::relocate(uint8_t *loc, const Relocation &rel,
   }
 }
 
+static bool relax(InputSection &sec) {
+  const uint64_t secAddr = sec.getVA();
+  const MutableArrayRef<Relocation> relocs = sec.relocs();
+  auto &aux = *sec.relaxAux;
+  bool changed = false;
+  ArrayRef<SymbolAnchor> sa = ArrayRef(aux.anchors);
+  uint64_t delta = 0;
+
+  std::fill_n(aux.relocTypes.get(), relocs.size(), R_LARCH_NONE);
+  aux.writes.clear();
+  for (auto [i, r] : llvm::enumerate(relocs)) {
+const uint64_t loc = secAddr + r.offset - delta;
+uint32_t &cur = aux.relocDeltas[i], remove = 0;
+switch (r.type) {
+case R_LARCH_ALIGN: {
+  const uint64_t addend =
+  r.sym->isUndefined() ? Log2_64(r.addend) + 1 : r.addend;
+  const uint64_t allBytes = (1 << (addend & 0xff)) - 4;
+  const uint64_t align = 1 << (addend & 0xff);
+  const uint64_t maxBytes = addend >> 8;
+  const uint64_t off = loc & (align - 1);
+  const uint64_t curBytes = off == 0 ? 0 : align - off;
+  // All bytes beyond the alignment boundary should be removed.
+  // If emit bytes more than max bytes to emit, remove all.
+  if (maxBytes != 0 && curBytes > maxBytes)
+remove = allBytes;
+  else
+remove = allBytes - curBytes;
+  // If we can't satisfy this alignment, we've found a bad input.
+  if (LLVM_UNLIKELY(static_cast<int32_t>(remove) < 0)) {
+errorOrWarn(getErrorLocation((const uint8_t *)loc) +
+"insufficient padding bytes for " + lld::toString(r.type) +
+": " + Twine(allBytes) + " bytes available for " +
+"requested alignment of " + Twine(align) + " bytes");
+remove = 0;
+  }
+  break;
+}
+}
+
+// For all anchors whose offsets are <= r.offset, they are preceded by
+// the previous relocation whose `relocDeltas` value equals `delta`.
+// Decrease their st_value and update their st_size.
+for (; sa.size() && sa[0].offset <= r.offset; sa = sa.slice(1)) {
+  if (sa[0].end)
+sa[0].d->size = sa[0].offset - delta - sa[0].d->value;
+  else
+sa[0].d->value = sa[0].offset - delta;
+}
+delta += remove;
+if (delta != cur) {
+  cur = delta;
+  changed = true;
+}
+  }
+
+  for (const SymbolAnchor &a : sa) {
+if (a.end)
+  a.d->size = a.offset - delta - a.d->value;
+else
+  a.d->value = a.offset - delta;
+  }
+  // Inform assignAddresses that the size has changed.
+  if (!isUInt<32>(delta))
+fatal("section size decrease is too large: " + Twine(delta));
+  sec.bytesDropped = delta;
+  return changed;
+}
+
+// When relaxing just R_LARCH_ALIGN, reloc

[llvm-branch-commits] [lld] PR for llvm/llvm-project#80789 (PR #80790)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80790


[llvm-branch-commits] [lld] [LLD] [docs] Add more release notes for COFF and MinGW (PR #81977)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81977


[llvm-branch-commits] [lld] d01a4ab - [LLD] [docs] Add more release notes for COFF and MinGW (#81977)

2024-02-16 Thread via llvm-branch-commits

Author: Martin Storsjö
Date: 2024-02-16T05:48:29-08:00
New Revision: d01a4ab21044ceb20e39b783a5983a8d4cc93cb6

URL: 
https://github.com/llvm/llvm-project/commit/d01a4ab21044ceb20e39b783a5983a8d4cc93cb6
DIFF: 
https://github.com/llvm/llvm-project/commit/d01a4ab21044ceb20e39b783a5983a8d4cc93cb6.diff

LOG: [LLD] [docs] Add more release notes for COFF and MinGW (#81977)

Add review references to all items already mentioned.

Move some items to the right section (from the MinGW section to COFF, as
the implementation is in the COFF linker side, and may be relevant for
non-MinGW cases as well).

Added: 


Modified: 
lld/docs/ReleaseNotes.rst

Removed: 




diff  --git a/lld/docs/ReleaseNotes.rst b/lld/docs/ReleaseNotes.rst
index 82f9d93b8e86ab..56ba3463aeadc0 100644
--- a/lld/docs/ReleaseNotes.rst
+++ b/lld/docs/ReleaseNotes.rst
@@ -82,14 +82,46 @@ COFF Improvements
 
 * Added support for ``--time-trace`` and associated ``--time-trace-granularity``.
   This generates a .json profile trace of the linker execution.
+  (`#68236 `_)
+
+* The ``-dependentloadflag`` option was implemented.
+  (`#71537 `_)
 
 * LLD now prefers library paths specified with ``-libpath:`` over the implicitly
   detected toolchain paths.
+  (`#78039 `_)
+
+* Added new options ``-lldemit:llvm`` and ``-lldemit:asm`` for getting
+  the output of LTO compilation as LLVM bitcode or assembly.
+  (`#66964 `_)
+  (`#67079 `_)
+
+* Added a new option ``-build-id`` for generating a ``.buildid`` section
+  when not generating a PDB. A new symbol ``__buildid`` is generated by
+  the linker, allowing code to reference the build ID of the binary.
+  (`#71433 `_)
+  (`#74652 `_)
+
+* A new, LLD specific option, ``-lld-allow-duplicate-weak``, was added
+  for allowing duplicate weak symbols.
+  (`#68077 `_)
+
+* More correctly handle LTO of files that define ``__imp_`` prefixed dllimport
+  redirections.
+  (`#70777 `_)
+  (`#71376 `_)
+  (`#72989 `_)
+
+* Linking undefined references to weak symbols with LTO now works.
+  (`#70430 `_)
 
 * Use the ``SOURCE_DATE_EPOCH`` environment variable for the PE header and
   debug directory timestamps, if neither the ``/Brepro`` nor ``/timestamp:``
   options have been specified. This makes the linker output reproducible by
   setting this environment variable.
+  (`#81326 `_)
+
+* Lots of incremental work towards supporting linking ARM64EC binaries.
 
 MinGW Improvements
 ------------------
@@ -97,19 +129,29 @@ MinGW Improvements
 * Added support for many LTO and ThinLTO options (most LTO options supported
   by the ELF driver, that are implemented by the COFF backend as well,
   should be supported now).
+  (`D158412 `_)
+  (`D158887 `_)
+  (`#77387 `_)
+  (`#81475 `_)
 
 * LLD no longer tries to autodetect and use library paths from MSVC/WinSDK
   installations when run in MinGW mode; that mode of operation shouldn't
   ever be needed in MinGW mode, and could be a source of unexpected
   behaviours.
+  (`D144084 `_)
 
 * The ``--icf=safe`` option now works as expected; it was previously a no-op.
-
-* More correctly handle LTO of files that define ``__imp_`` prefixed dllimport
-  redirections.
+  (`#70037 `_)
 
 * The strip flags ``-S`` and ``-s`` now can be used to strip out DWARF debug
   info and symbol tables while emitting a PDB debug info file.
+  (`#75181 `_)
+
+* The option ``--dll`` is handled as an alias for the ``--shared`` option.
+  (`#68575 `_)
+
+* The option ``--sort-common`` is ignored now.
+  (`#66336 `_)
 
 MachO Improvements
 ------------------





[llvm-branch-commits] [llvm] 1a69056 - Backport [DAGCombine] Fix multi-use miscompile in load combine (#81586) (#81633)

2024-02-16 Thread via llvm-branch-commits

Author: Nikita Popov
Date: 2024-02-16T05:50:14-08:00
New Revision: 1a69056c899a74c311d700bd0f5618cbfee23518

URL: 
https://github.com/llvm/llvm-project/commit/1a69056c899a74c311d700bd0f5618cbfee23518
DIFF: 
https://github.com/llvm/llvm-project/commit/1a69056c899a74c311d700bd0f5618cbfee23518.diff

LOG: Backport [DAGCombine] Fix multi-use miscompile in load combine (#81586) (#81633)

(cherry picked from commit 25b9ed6e4964344e3710359bec4c831e5a8448b9)

Added: 


Modified: 
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/test/CodeGen/X86/load-combine.ll

Removed: 




diff  --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 98d8a6d9409f25..3135ec73a99e76 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -9253,7 +9253,7 @@ SDValue DAGCombiner::MatchLoadCombine(SDNode *N) {
 
   // Transfer chain users from old loads to the new load.
   for (LoadSDNode *L : Loads)
-DAG.ReplaceAllUsesOfValueWith(SDValue(L, 1), SDValue(NewLoad.getNode(), 1));
+DAG.makeEquivalentMemoryOrdering(L, NewLoad);
 
   if (!NeedsBswap)
 return NewLoad;

diff  --git a/llvm/test/CodeGen/X86/load-combine.ll b/llvm/test/CodeGen/X86/load-combine.ll
index 7f8115dc1ce389..b5f3e789918813 100644
--- a/llvm/test/CodeGen/X86/load-combine.ll
+++ b/llvm/test/CodeGen/X86/load-combine.ll
@@ -1282,3 +1282,35 @@ define i32 @zext_load_i32_by_i8_bswap_shl_16(ptr %arg) {
   %tmp8 = or i32 %tmp7, %tmp30
   ret i32 %tmp8
 }
+
+define i32 @pr80911_vector_load_multiuse(ptr %ptr, ptr %clobber) nounwind {
+; CHECK-LABEL: pr80911_vector_load_multiuse:
+; CHECK:   # %bb.0:
+; CHECK-NEXT:pushl %esi
+; CHECK-NEXT:movl {{[0-9]+}}(%esp), %ecx
+; CHECK-NEXT:movl {{[0-9]+}}(%esp), %edx
+; CHECK-NEXT:movl (%edx), %esi
+; CHECK-NEXT:movzwl (%edx), %eax
+; CHECK-NEXT:movl $0, (%ecx)
+; CHECK-NEXT:movl %esi, (%edx)
+; CHECK-NEXT:popl %esi
+; CHECK-NEXT:retl
+;
+; CHECK64-LABEL: pr80911_vector_load_multiuse:
+; CHECK64:   # %bb.0:
+; CHECK64-NEXT:movl (%rdi), %ecx
+; CHECK64-NEXT:movzwl (%rdi), %eax
+; CHECK64-NEXT:movl $0, (%rsi)
+; CHECK64-NEXT:movl %ecx, (%rdi)
+; CHECK64-NEXT:retq
+  %load = load <4 x i8>, ptr %ptr, align 16
+  store i32 0, ptr %clobber
+  store <4 x i8> %load, ptr %ptr, align 16
+  %e1 = extractelement <4 x i8> %load, i64 1
+  %e1.ext = zext i8 %e1 to i32
+  %e1.ext.shift = shl nuw nsw i32 %e1.ext, 8
+  %e0 = extractelement <4 x i8> %load, i64 0
+  %e0.ext = zext i8 %e0 to i32
+  %res = or i32 %e1.ext.shift, %e0.ext
+  ret i32 %res
+}





[llvm-branch-commits] [llvm] Backport [DAGCombine] Fix multi-use miscompile in load combine (#81586) (PR #81633)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81633


[llvm-branch-commits] [llvm] release/18.x: [SLP]Fix PR79229: Check that extractelement is used only in a single node (PR #81984)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81984

>From 5226ae4617023e3b8957e9db0b9c2c83ea7e77a2 Mon Sep 17 00:00:00 2001
From: Alexey Bataev 
Date: Wed, 24 Jan 2024 10:57:18 -0800
Subject: [PATCH 1/2] [SLP]Fix PR79229: Check that extractelement is used only
 in a single node before erasing.

Before trying to erase the extractelement instruction, not enough to
check for single use, need to check that it is not used in several nodes
because of the preliminary nodes reordering.

(cherry picked from commit 48bbd7658710ef1699bf2a6532ff5830230aacc5)
---
 .../Transforms/Vectorize/SLPVectorizer.cpp|  11 +-
 .../extractelement-single-use-many-nodes.ll   | 144 ++
 2 files changed, 154 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 601d2454c1e163..83f787d7fb624a 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -10216,7 +10216,16 @@ class BoUpSLP::ShuffleInstructionBuilder final : public BaseShuffleAnalysis {
   // If the only one use is vectorized - can delete the extractelement
   // itself.
   if (!EI->hasOneUse() || any_of(EI->users(), [&](User *U) {
-return !R.ScalarToTreeEntry.count(U);
+const TreeEntry *UTE = R.getTreeEntry(U);
+return !UTE || R.MultiNodeScalars.contains(U) ||
+   count_if(R.VectorizableTree,
+[&](const std::unique_ptr<TreeEntry> &TE) {
+  return any_of(TE->UserTreeIndices,
+[&](const EdgeInfo &Edge) {
+  return Edge.UserTE == UTE;
+}) &&
+ is_contained(TE->Scalars, EI);
+}) != 1;
   }))
 continue;
   R.eraseInstruction(EI);
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
new file mode 100644
index 00000000000000..f665dac3282b79
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
@@ -0,0 +1,144 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -passes=slp-vectorizer -mtriple=x86_64-unknown-linux-gnu -mcpu=x86-64-v3 -S < %s | FileCheck %s
+
+define void @foo(double %i) {
+; CHECK-LABEL: define void @foo(
+; CHECK-SAME: double [[I:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  bb:
+; CHECK-NEXT:[[TMP0:%.*]] = insertelement <4 x double> , double 
[[I]], i32 2
+; CHECK-NEXT:[[TMP1:%.*]] = fsub <4 x double> zeroinitializer, [[TMP0]]
+; CHECK-NEXT:[[TMP2:%.*]] = extractelement <4 x double> [[TMP1]], i32 1
+; CHECK-NEXT:[[TMP3:%.*]] = insertelement <2 x double> poison, double 
[[I]], i32 0
+; CHECK-NEXT:[[TMP4:%.*]] = fsub <2 x double> zeroinitializer, [[TMP3]]
+; CHECK-NEXT:[[TMP5:%.*]] = extractelement <2 x double> [[TMP4]], i32 1
+; CHECK-NEXT:[[TMP6:%.*]] = shufflevector <2 x double> [[TMP4]], <2 x 
double> poison, <8 x i32> 
+; CHECK-NEXT:[[TMP7:%.*]] = shufflevector <8 x double> [[TMP6]], <8 x 
double> , 
<8 x i32> 
+; CHECK-NEXT:[[TMP8:%.*]] = insertelement <8 x double> [[TMP7]], double 
[[TMP2]], i32 3
+; CHECK-NEXT:[[TMP9:%.*]] = shufflevector <4 x double> [[TMP1]], <4 x 
double> poison, <8 x i32> 
+; CHECK-NEXT:[[TMP10:%.*]] = shufflevector <8 x double> [[TMP9]], <8 x 
double> , <8 x i32> 
+; CHECK-NEXT:[[TMP11:%.*]] = insertelement <8 x double> [[TMP10]], double 
[[TMP5]], i32 6
+; CHECK-NEXT:[[TMP12:%.*]] = fmul <8 x double> [[TMP8]], [[TMP11]]
+; CHECK-NEXT:[[TMP13:%.*]] = fadd <8 x double> zeroinitializer, [[TMP12]]
+; CHECK-NEXT:[[TMP14:%.*]] = fadd <8 x double> [[TMP13]], zeroinitializer
+; CHECK-NEXT:[[TMP15:%.*]] = fcmp ult <8 x double> [[TMP14]], 
zeroinitializer
+; CHECK-NEXT:[[TMP16:%.*]] = freeze <8 x i1> [[TMP15]]
+; CHECK-NEXT:[[TMP17:%.*]] = call i1 @llvm.vector.reduce.and.v8i1(<8 x i1> 
[[TMP16]])
+; CHECK-NEXT:br i1 [[TMP17]], label [[BB58:%.*]], label [[BB115:%.*]]
+; CHECK:   bb115:
+; CHECK-NEXT:[[TMP18:%.*]] = fmul <2 x double> zeroinitializer, [[TMP4]]
+; CHECK-NEXT:[[TMP19:%.*]] = extractelement <2 x double> [[TMP18]], i32 0
+; CHECK-NEXT:[[TMP20:%.*]] = extractelement <2 x double> [[TMP18]], i32 1
+; CHECK-NEXT:[[I118:%.*]] = fadd double [[TMP19]], [[TMP20]]
+; CHECK-NEXT:[[TMP21:%.*]] = fmul <4 x double> zeroinitializer, [[TMP1]]
+; CHECK-NEXT:[[TMP22:%.*]] = shufflevector <2 x double> [[TMP4]], <2 x 
double> poison, <4 x i32> 
+; CHECK-NEXT:[[TMP23:%.*]] = shufflevector <4 x double> , <4 

[llvm-branch-commits] [llvm] 5226ae4 - [SLP]Fix PR79229: Check that extractelement is used only in a single node

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Alexey Bataev
Date: 2024-02-16T05:51:10-08:00
New Revision: 5226ae4617023e3b8957e9db0b9c2c83ea7e77a2

URL: 
https://github.com/llvm/llvm-project/commit/5226ae4617023e3b8957e9db0b9c2c83ea7e77a2
DIFF: 
https://github.com/llvm/llvm-project/commit/5226ae4617023e3b8957e9db0b9c2c83ea7e77a2.diff

LOG: [SLP]Fix PR79229: Check that extractelement is used only in a single node
before erasing.

Before trying to erase the extractelement instruction, not enough to
check for single use, need to check that it is not used in several nodes
because of the preliminary nodes reordering.

(cherry picked from commit 48bbd7658710ef1699bf2a6532ff5830230aacc5)

Added: 

llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll

Modified: 
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 601d2454c1e163..83f787d7fb624a 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -10216,7 +10216,16 @@ class BoUpSLP::ShuffleInstructionBuilder final : public BaseShuffleAnalysis {
   // If the only one use is vectorized - can delete the extractelement
   // itself.
   if (!EI->hasOneUse() || any_of(EI->users(), [&](User *U) {
-return !R.ScalarToTreeEntry.count(U);
+const TreeEntry *UTE = R.getTreeEntry(U);
+return !UTE || R.MultiNodeScalars.contains(U) ||
+   count_if(R.VectorizableTree,
+[&](const std::unique_ptr<TreeEntry> &TE) {
+  return any_of(TE->UserTreeIndices,
+[&](const EdgeInfo &Edge) {
+  return Edge.UserTE == UTE;
+}) &&
+ is_contained(TE->Scalars, EI);
+}) != 1;
   }))
 continue;
   R.eraseInstruction(EI);

diff  --git a/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
new file mode 100644
index 00000000000000..f665dac3282b79
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
@@ -0,0 +1,144 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -passes=slp-vectorizer -mtriple=x86_64-unknown-linux-gnu -mcpu=x86-64-v3 -S < %s | FileCheck %s
+
+define void @foo(double %i) {
+; CHECK-LABEL: define void @foo(
+; CHECK-SAME: double [[I:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  bb:
+; CHECK-NEXT:[[TMP0:%.*]] = insertelement <4 x double> , double 
[[I]], i32 2
+; CHECK-NEXT:[[TMP1:%.*]] = fsub <4 x double> zeroinitializer, [[TMP0]]
+; CHECK-NEXT:[[TMP2:%.*]] = extractelement <4 x double> [[TMP1]], i32 1
+; CHECK-NEXT:[[TMP3:%.*]] = insertelement <2 x double> poison, double 
[[I]], i32 0
+; CHECK-NEXT:[[TMP4:%.*]] = fsub <2 x double> zeroinitializer, [[TMP3]]
+; CHECK-NEXT:[[TMP5:%.*]] = extractelement <2 x double> [[TMP4]], i32 1
+; CHECK-NEXT:[[TMP6:%.*]] = shufflevector <2 x double> [[TMP4]], <2 x 
double> poison, <8 x i32> 
+; CHECK-NEXT:[[TMP7:%.*]] = shufflevector <8 x double> [[TMP6]], <8 x 
double> , 
<8 x i32> 
+; CHECK-NEXT:[[TMP8:%.*]] = insertelement <8 x double> [[TMP7]], double 
[[TMP2]], i32 3
+; CHECK-NEXT:[[TMP9:%.*]] = shufflevector <4 x double> [[TMP1]], <4 x 
double> poison, <8 x i32> 
+; CHECK-NEXT:[[TMP10:%.*]] = shufflevector <8 x double> [[TMP9]], <8 x 
double> , <8 x i32> 
+; CHECK-NEXT:[[TMP11:%.*]] = insertelement <8 x double> [[TMP10]], double 
[[TMP5]], i32 6
+; CHECK-NEXT:[[TMP12:%.*]] = fmul <8 x double> [[TMP8]], [[TMP11]]
+; CHECK-NEXT:[[TMP13:%.*]] = fadd <8 x double> zeroinitializer, [[TMP12]]
+; CHECK-NEXT:[[TMP14:%.*]] = fadd <8 x double> [[TMP13]], zeroinitializer
+; CHECK-NEXT:[[TMP15:%.*]] = fcmp ult <8 x double> [[TMP14]], 
zeroinitializer
+; CHECK-NEXT:[[TMP16:%.*]] = freeze <8 x i1> [[TMP15]]
+; CHECK-NEXT:[[TMP17:%.*]] = call i1 @llvm.vector.reduce.and.v8i1(<8 x i1> 
[[TMP16]])
+; CHECK-NEXT:br i1 [[TMP17]], label [[BB58:%.*]], label [[BB115:%.*]]
+; CHECK:   bb115:
+; CHECK-NEXT:[[TMP18:%.*]] = fmul <2 x double> zeroinitializer, [[TMP4]]
+; CHECK-NEXT:[[TMP19:%.*]] = extractelement <2 x double> [[TMP18]], i32 0
+; CHECK-NEXT:[[TMP20:%.*]] = extractelement <2 x double> [[TMP18]], i32 1
+; CHECK-NEXT:[[I118:%.*]] = fadd double [[TMP19]], [[TMP20]]
+; CHECK-NEXT:[[TMP21:%.*]] = fmul <4 x double> zeroinitializer, [[TMP1]]
+; CHECK-NEXT:[[TMP22:%.*]] = shufflevector <2 x double> [[TMP4]], <2 x 
double> poison, <4 x i32> 
+; CHECK-NEXT:[[TMP23

[llvm-branch-commits] [llvm] b7a4ff8 - [SLP]Fix PR79229: Do not erase extractelement, if it used in

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Alexey Bataev
Date: 2024-02-16T05:51:10-08:00
New Revision: b7a4ff80a4ccaecf1d497db51bfdc9499c3cbb48

URL: 
https://github.com/llvm/llvm-project/commit/b7a4ff80a4ccaecf1d497db51bfdc9499c3cbb48
DIFF: 
https://github.com/llvm/llvm-project/commit/b7a4ff80a4ccaecf1d497db51bfdc9499c3cbb48.diff

LOG: [SLP]Fix PR79229: Do not erase extractelement, if it used in
multiregister node.

If the node can be span between several registers and same
extractelement instruction is used in several parts, it may be required
to keep such extractelement instruction to avoid compiler crash.

(cherry picked from commit 6fe21bc1dac883efa0dfa807f327048ae9969b81)

Added: 
llvm/test/Transforms/SLPVectorizer/X86/extractelement-multi-register-use.ll

Modified: 
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 83f787d7fb624a..0a9e2c7f49f55f 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -10215,7 +10215,8 @@ class BoUpSLP::ShuffleInstructionBuilder final : public BaseShuffleAnalysis {
   UniqueBases.insert(VecBase);
   // If the only one use is vectorized - can delete the extractelement
   // itself.
-  if (!EI->hasOneUse() || any_of(EI->users(), [&](User *U) {
+  if (!EI->hasOneUse() || (NumParts != 1 && count(E->Scalars, EI) > 1) ||
+  any_of(EI->users(), [&](User *U) {
 const TreeEntry *UTE = R.getTreeEntry(U);
 return !UTE || R.MultiNodeScalars.contains(U) ||
count_if(R.VectorizableTree,

diff  --git a/llvm/test/Transforms/SLPVectorizer/X86/extractelement-multi-register-use.ll b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-multi-register-use.ll
new file mode 100644
index 00000000000000..ba406c8f20bb08
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/extractelement-multi-register-use.ll
@@ -0,0 +1,107 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -passes=slp-vectorizer -S -mtriple=x86_64-unknown-linux-gnu -mcpu=x86-64-v3 < %s | FileCheck %s
+
+define void @test(double %i) {
+; CHECK-LABEL: define void @test(
+; CHECK-SAME: double [[I:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  bb:
+; CHECK-NEXT:[[TMP0:%.*]] = insertelement <2 x double> poison, double 
[[I]], i32 0
+; CHECK-NEXT:[[TMP1:%.*]] = fsub <2 x double> zeroinitializer, [[TMP0]]
+; CHECK-NEXT:[[TMP2:%.*]] = insertelement <2 x double> , double [[I]], i32 1
+; CHECK-NEXT:[[TMP3:%.*]] = fsub <2 x double> zeroinitializer, [[TMP2]]
+; CHECK-NEXT:[[TMP4:%.*]] = extractelement <2 x double> [[TMP3]], i32 1
+; CHECK-NEXT:[[TMP5:%.*]] = fsub <2 x double> [[TMP0]], zeroinitializer
+; CHECK-NEXT:[[TMP6:%.*]] = shufflevector <2 x double> [[TMP5]], <2 x 
double> [[TMP3]], <4 x i32> 
+; CHECK-NEXT:[[TMP7:%.*]] = shufflevector <2 x double> [[TMP1]], <2 x 
double> [[TMP5]], <4 x i32> 
+; CHECK-NEXT:[[TMP8:%.*]] = shufflevector <4 x double> [[TMP6]], <4 x 
double> [[TMP7]], <8 x i32> 
+; CHECK-NEXT:[[TMP9:%.*]] = shufflevector <8 x double> [[TMP8]], <8 x 
double> , <8 x 
i32> 
+; CHECK-NEXT:[[TMP10:%.*]] = insertelement <8 x double> [[TMP9]], double 
[[TMP4]], i32 7
+; CHECK-NEXT:[[TMP11:%.*]] = fmul <8 x double> zeroinitializer, [[TMP10]]
+; CHECK-NEXT:[[TMP12:%.*]] = fadd <8 x double> zeroinitializer, [[TMP11]]
+; CHECK-NEXT:[[TMP13:%.*]] = fadd <8 x double> [[TMP12]], zeroinitializer
+; CHECK-NEXT:[[TMP14:%.*]] = fcmp ult <8 x double> [[TMP13]], 
zeroinitializer
+; CHECK-NEXT:br label [[BB116:%.*]]
+; CHECK:   bb116:
+; CHECK-NEXT:[[TMP15:%.*]] = fmul <2 x double> zeroinitializer, [[TMP5]]
+; CHECK-NEXT:[[TMP16:%.*]] = extractelement <2 x double> [[TMP15]], i32 0
+; CHECK-NEXT:[[TMP17:%.*]] = extractelement <2 x double> [[TMP15]], i32 1
+; CHECK-NEXT:[[I120:%.*]] = fadd double [[TMP16]], [[TMP17]]
+; CHECK-NEXT:[[TMP18:%.*]] = fmul <2 x double> zeroinitializer, [[TMP1]]
+; CHECK-NEXT:[[TMP19:%.*]] = fmul <2 x double> zeroinitializer, [[TMP3]]
+; CHECK-NEXT:[[TMP20:%.*]] = extractelement <2 x double> [[TMP18]], i32 0
+; CHECK-NEXT:[[TMP21:%.*]] = extractelement <2 x double> [[TMP18]], i32 1
+; CHECK-NEXT:[[I128:%.*]] = fadd double [[TMP20]], [[TMP21]]
+; CHECK-NEXT:[[I139:%.*]] = call double @llvm.maxnum.f64(double [[I128]], 
double 0.00e+00)
+; CHECK-NEXT:[[TMP22:%.*]] = fadd <2 x double> [[TMP19]], zeroinitializer
+; CHECK-NEXT:[[TMP23:%.*]] = call <2 x double> @llvm.maxnum.v2f64(<2 x 
double> [[TMP22]], <2 x double> zeroinitializer)
+; CHECK-NEXT:[[TMP24:%.*]] = fmul <2 x double> [[TMP23]], zeroinitializer
+; CHECK-NEXT:[[TMP25:%.*]] = fptosi <2 x double> [[TMP24]] to <2 x i32>
+; CHECK-NEXT:[[TMP26:%.*]

[llvm-branch-commits] [llvm] release/18.x: [SLP]Fix PR79229: Check that extractelement is used only in a single node (PR #81984)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81984


[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Simplify `ArgConverter` state (PR #81462)

2024-02-16 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer updated 
https://github.com/llvm/llvm-project/pull/81462

>From a7fffc3e18588d1112411a03936e41a2931cefec Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Fri, 16 Feb 2024 15:11:41 +
Subject: [PATCH] [mlir][Transforms][NFC] Simplify `ArgConverter` state

* When converting a block signature, `ArgConverter` creates a new block with 
the new signature and moves all operations from the old block to the new block. 
The old block is temporarily moved into a region that is stored in 
`regionMapping`. The old block is not yet deleted, so that the conversion can 
be rolled back. `regionMapping` is not needed. Instead of moving the old block 
to a temporary region, it can just be unlinked. Block erasures are handled in 
the same way in the dialect conversion.
* `regionToConverter` is a mapping from regions to type converters. That field 
is never accessed within `ArgConverter`. It should be stored in 
`ConversionPatternRewriterImpl` instead.
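
To illustrate the first bullet, here is a minimal standalone sketch of "unlink the old block, keep it alive, re-insert it on rollback". It does not use the MLIR API: `ToyBlock`, `ToyRegion`, `ToyConversion`, `convert` and `discard` are hypothetical names, and a `std::list` stands in for a region's block list.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; these are not mlir::Block or mlir::Region.
struct ToyBlock {
  std::string name;
};
using ToyRegion = std::list<std::unique_ptr<ToyBlock>>;

// Bookkeeping for one converted block.
struct ToyConversion {
  ToyBlock *newBlock;                  // linked into the region
  std::unique_ptr<ToyBlock> origBlock; // unlinked, kept alive for rollback
};

// Find the list position of a linked block.
static ToyRegion::iterator findBlock(ToyRegion &region, ToyBlock *block) {
  return std::find_if(region.begin(), region.end(),
                      [&](const std::unique_ptr<ToyBlock> &b) {
                        return b.get() == block;
                      });
}

// Insert a replacement block and unlink (but do not delete) the original.
ToyConversion convert(ToyRegion &region, ToyBlock *block, std::string newName) {
  auto it = findBlock(region, block);
  assert(it != region.end() && "block must be linked");
  auto newIt =
      region.insert(it, std::make_unique<ToyBlock>(ToyBlock{std::move(newName)}));
  ToyConversion info{newIt->get(), std::move(*it)};
  region.erase(it); // erases the now-empty list slot, not the block itself
  return info;
}

// Roll back: put the original block back in front of the new block, then
// delete the new block.
void discard(ToyRegion &region, ToyConversion &info) {
  auto it = findBlock(region, info.newBlock);
  assert(it != region.end() && "converted block must be linked");
  region.insert(it, std::move(info.origBlock));
  region.erase(it);
  info.newBlock = nullptr;
}
```

No temporary region is needed in this model: the unlinked block is simply owned by the conversion record until the conversion is either applied or discarded.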
---
 .../Transforms/Utils/DialectConversion.cpp| 79 ++-
 1 file changed, 22 insertions(+), 57 deletions(-)

diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp 
b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index 673bd0383809cb..35028001a03dd9 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -343,23 +343,6 @@ struct ArgConverter {
 const TypeConverter *converter;
   };
 
-  /// Return if the signature of the given block has already been converted.
-  bool hasBeenConverted(Block *block) const {
-return conversionInfo.count(block) || convertedBlocks.count(block);
-  }
-
-  /// Set the type converter to use for the given region.
-  void setConverter(Region *region, const TypeConverter *typeConverter) {
-assert(typeConverter && "expected valid type converter");
-regionToConverter[region] = typeConverter;
-  }
-
-  /// Return the type converter to use for the given region, or null if there
-  /// isn't one.
-  const TypeConverter *getConverter(Region *region) {
-return regionToConverter.lookup(region);
-  }
-
   
//======//
   // Rewrite Application
   
//======//
@@ -409,24 +392,10 @@ struct ArgConverter {
   ConversionValueMapping &mapping,
   SmallVectorImpl &argReplacements);
 
-  /// Insert a new conversion into the cache.
-  void insertConversion(Block *newBlock, ConvertedBlockInfo &&info);
-
   /// A collection of blocks that have had their arguments converted. This is a
   /// map from the new replacement block, back to the original block.
   llvm::MapVector conversionInfo;
 
-  /// The set of original blocks that were converted.
-  DenseSet convertedBlocks;
-
-  /// A mapping from valid regions, to those containing the original blocks of 
a
-  /// conversion.
-  DenseMap> regionMapping;
-
-  /// A mapping of regions to type converters that should be used when
-  /// converting the arguments of blocks within that region.
-  DenseMap regionToConverter;
-
   /// The pattern rewriter to use when materializing conversions.
   PatternRewriter &rewriter;
 
@@ -474,12 +443,12 @@ void ArgConverter::discardRewrites(Block *block) {
 block->getArgument(i).dropAllUses();
   block->replaceAllUsesWith(origBlock);
 
-  // Move the operations back the original block and the delete the new block.
+  // Move the operations back the original block, move the original block back
+  // into its original location and the delete the new block.
   origBlock->getOperations().splice(origBlock->end(), block->getOperations());
-  origBlock->moveBefore(block);
+  block->getParent()->getBlocks().insert(Region::iterator(block), origBlock);
   block->erase();
 
-  convertedBlocks.erase(origBlock);
   conversionInfo.erase(it);
 }
 
@@ -510,6 +479,9 @@ void ArgConverter::applyRewrites(ConversionValueMapping 
&mapping) {
 mapping.lookupOrDefault(castValue, origArg.getType()));
   }
 }
+
+delete origBlock;
+blockInfo.origBlock = nullptr;
   }
 }
 
@@ -572,9 +544,11 @@ FailureOr ArgConverter::convertSignature(
 Block *block, const TypeConverter *converter,
 ConversionValueMapping &mapping,
 SmallVectorImpl &argReplacements) {
-  // Check if the block was already converted. If the block is detached,
-  // conservatively assume it is going to be deleted.
-  if (hasBeenConverted(block) || !block->getParent())
+  // Check if the block was already converted.
+  // * If the block is mapped in `conversionInfo`, it is a converted block.
+  // * If the block is detached, conservatively assume that it is going to be
+  //   deleted; it is likely the old block (before it was converted).
+  if (conversionInfo.count(block) || !block->getParent())
 return block;
   // If a converter wasn't provided, and the block wasn't already converted,
   // there is nothing we can do

[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Improve signature conversion API (PR #81997)

2024-02-16 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer created 
https://github.com/llvm/llvm-project/pull/81997

This commit improves the block signature conversion API of the dialect 
conversion.

There is the following comment in `ArgConverter::applySignatureConversion`:
```
// If no arguments are being changed or added, there is nothing to do.
```

However, the implementation actually used to replace a block with a new block 
even if the block argument types do not change (i.e., there is "nothing to 
do"). This is fixed in this commit. The documentation of the public 
`ConversionPatternRewriter` API is updated accordingly.

This commit also removes a check that used to *sometimes* skip a block 
signature conversion if the block was already converted. This is not consistent 
with the public `ConversionPatternRewriter` API; blocks should always be 
converted, regardless of whether they were already converted or not.

Block signature conversion also used to be silently skipped when the specified 
block was detached. Instead of silently skipping, an assertion is triggered. 
Attempting to convert a detached block (which is likely an erased block) is 
invalid API usage.
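
The behavior described above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions: `ToyBlock`, `ToyRegion` and this `applySignatureConversion` are hypothetical stand-ins rather than the MLIR classes, types are modeled as strings, and the element-wise comparison is an assumption because the changed condition is truncated in this archive.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are not mlir::Block or mlir::Region.
struct ToyRegion;
struct ToyBlock {
  std::vector<std::string> argTypes; // block argument types as plain strings
  ToyRegion *parent = nullptr;       // null means the block is detached
};
struct ToyRegion {
  std::vector<std::unique_ptr<ToyBlock>> blocks;
};

// Returns the block that carries the requested signature.
ToyBlock *applySignatureConversion(ToyBlock *block,
                                   const std::vector<std::string> &convertedTypes) {
  // Converting a detached block is treated as invalid usage, not skipped.
  assert(block->parent && "cannot convert signature of detached block");

  // Nothing to do: the converted signature matches the current one, so the
  // original block is left in place and returned.
  if (std::equal(block->argTypes.begin(), block->argTypes.end(),
                 convertedTypes.begin(), convertedTypes.end()))
    return block;

  // Otherwise create a new block with the new signature. For brevity the
  // sketch appends it to the region and leaves the old block where it is.
  auto newBlock = std::make_unique<ToyBlock>();
  newBlock->argTypes = convertedTypes;
  newBlock->parent = block->parent;
  ToyBlock *result = newBlock.get();
  block->parent->blocks.push_back(std::move(newBlock));
  return result;
}
```

Callers can therefore no longer assume that the returned block differs from the one they passed in.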

>From 5dc79f6af9ac00a61767062980b13eb4ae8d2571 Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Fri, 16 Feb 2024 15:13:37 +
Subject: [PATCH] [mlir][Transforms] Dialect conversion: Improve signature
 conversion API

This commit improves the block signature conversion API of the dialect 
conversion.

There is the following comment in `ArgConverter::applySignatureConversion`:
```
// If no arguments are being changed or added, there is nothing to do.
```

However, the implementation actually used to replace a block with a new block 
even if the block argument types do not change (i.e., there is "nothing to 
do"). This is fixed in this commit. The documentation of the public 
`ConversionPatternRewriter` API is updated accordingly.

This commit also removes a check that used to *sometimes* skip a block 
signature conversion if the block was already converted. This is not consistent 
with the public `ConversionPatternRewriter` API; blocks should always be 
converted, regardless of whether they were already converted or not.

Block signature conversion also used to be silently skipped when the specified 
block was detached. Instead of silently skipping, an assertion is triggered. 
Attempting to convert a detached block (which is likely an erased block) is 
invalid API usage.
---
 mlir/include/mlir/Transforms/DialectConversion.h | 12 +---
 mlir/lib/Transforms/Utils/DialectConversion.cpp  | 10 +++---
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/mlir/include/mlir/Transforms/DialectConversion.h 
b/mlir/include/mlir/Transforms/DialectConversion.h
index 0d7722aa07ee38..2575be4cdea1ac 100644
--- a/mlir/include/mlir/Transforms/DialectConversion.h
+++ b/mlir/include/mlir/Transforms/DialectConversion.h
@@ -663,6 +663,8 @@ class ConversionPatternRewriter final : public 
PatternRewriter {
   /// Apply a signature conversion to the entry block of the given region. This
   /// replaces the entry block with a new block containing the updated
   /// signature. The new entry block to the region is returned for convenience.
+  /// If no block argument types are changing, the entry original block will be
+  /// left in place and returned.
   ///
   /// If provided, `converter` will be used for any materializations.
   Block *
@@ -671,8 +673,11 @@ class ConversionPatternRewriter final : public 
PatternRewriter {
const TypeConverter *converter = nullptr);
 
   /// Convert the types of block arguments within the given region. This
-  /// replaces each block with a new block containing the updated signature. 
The
-  /// entry block may have a special conversion if `entryConversion` is
+  /// replaces each block with a new block containing the updated signature. If
+  /// an updated signature would match the current signature, the respective
+  /// block is left in place as is.
+  ///
+  /// The entry block may have a special conversion if `entryConversion` is
   /// provided. On success, the new entry block to the region is returned for
   /// convenience. Otherwise, failure is returned.
   FailureOr convertRegionTypes(
@@ -681,7 +686,8 @@ class ConversionPatternRewriter final : public 
PatternRewriter {
 
   /// Convert the types of block arguments within the given region except for
   /// the entry region. This replaces each non-entry block with a new block
-  /// containing the updated signature.
+  /// containing the updated signature. If an updated signature would match the
+  /// current signature, the respective block is left in place as is.
   ///
   /// If special conversion behavior is needed for the non-entry blocks (for
   /// example, we need to convert only a subset of a BB arguments), such
diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp 
b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index 

[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Improve signature conversion API (PR #81997)

2024-02-16 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir

Author: Matthias Springer (matthias-springer)


Changes

This commit improves the block signature conversion API of the dialect 
conversion.

There is the following comment in `ArgConverter::applySignatureConversion`:
```
// If no arguments are being changed or added, there is nothing to do.
```

However, the implementation actually used to replace a block with a new block 
even if the block argument types do not change (i.e., there is "nothing to 
do"). This is fixed in this commit. The documentation of the public 
`ConversionPatternRewriter` API is updated accordingly.

This commit also removes a check that used to *sometimes* skip a block 
signature conversion if the block was already converted. This is not consistent 
with the public `ConversionPatternRewriter` API; blocks should always be 
converted, regardless of whether they were already converted or not.

Block signature conversion also used to be silently skipped when the specified 
block was detached. Instead of silently skipping, an assertion is triggered. 
Attempting to convert a detached block (which is likely an erased block) is 
invalid API usage.

---
Full diff: https://github.com/llvm/llvm-project/pull/81997.diff


2 Files Affected:

- (modified) mlir/include/mlir/Transforms/DialectConversion.h (+9-3) 
- (modified) mlir/lib/Transforms/Utils/DialectConversion.cpp (+3-7) 


``diff
diff --git a/mlir/include/mlir/Transforms/DialectConversion.h 
b/mlir/include/mlir/Transforms/DialectConversion.h
index 0d7722aa07ee38..2575be4cdea1ac 100644
--- a/mlir/include/mlir/Transforms/DialectConversion.h
+++ b/mlir/include/mlir/Transforms/DialectConversion.h
@@ -663,6 +663,8 @@ class ConversionPatternRewriter final : public 
PatternRewriter {
   /// Apply a signature conversion to the entry block of the given region. This
   /// replaces the entry block with a new block containing the updated
   /// signature. The new entry block to the region is returned for convenience.
+  /// If no block argument types are changing, the entry original block will be
+  /// left in place and returned.
   ///
   /// If provided, `converter` will be used for any materializations.
   Block *
@@ -671,8 +673,11 @@ class ConversionPatternRewriter final : public 
PatternRewriter {
const TypeConverter *converter = nullptr);
 
   /// Convert the types of block arguments within the given region. This
-  /// replaces each block with a new block containing the updated signature. 
The
-  /// entry block may have a special conversion if `entryConversion` is
+  /// replaces each block with a new block containing the updated signature. If
+  /// an updated signature would match the current signature, the respective
+  /// block is left in place as is.
+  ///
+  /// The entry block may have a special conversion if `entryConversion` is
   /// provided. On success, the new entry block to the region is returned for
   /// convenience. Otherwise, failure is returned.
   FailureOr convertRegionTypes(
@@ -681,7 +686,8 @@ class ConversionPatternRewriter final : public 
PatternRewriter {
 
   /// Convert the types of block arguments within the given region except for
   /// the entry region. This replaces each non-entry block with a new block
-  /// containing the updated signature.
+  /// containing the updated signature. If an updated signature would match the
+  /// current signature, the respective block is left in place as is.
   ///
   /// If special conversion behavior is needed for the non-entry blocks (for
   /// example, we need to convert only a subset of a BB arguments), such
diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp 
b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index 35028001a03dd9..c16bb144efecf5 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -544,12 +544,8 @@ FailureOr ArgConverter::convertSignature(
 Block *block, const TypeConverter *converter,
 ConversionValueMapping &mapping,
 SmallVectorImpl &argReplacements) {
-  // Check if the block was already converted.
-  // * If the block is mapped in `conversionInfo`, it is a converted block.
-  // * If the block is detached, conservatively assume that it is going to be
-  //   deleted; it is likely the old block (before it was converted).
-  if (conversionInfo.count(block) || !block->getParent())
-return block;
+  assert(block->getParent() && "cannot convert signature of detached block");
+
   // If a converter wasn't provided, and the block wasn't already converted,
   // there is nothing we can do.
   if (!converter)
@@ -570,7 +566,7 @@ Block *ArgConverter::applySignatureConversion(
   // If no arguments are being changed or added, there is nothing to do.
   unsigned origArgCount = block->getNumArguments();
   auto convertedTypes = signatureConversion.getConvertedTypes();
-  if (origArgCount == 0 && convertedTypes.empty())
+  if (llvm::equal(b

[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Turn block type conversion into `IRRewrite` (PR #81756)

2024-02-16 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer updated 
https://github.com/llvm/llvm-project/pull/81756

>From 5dc79f6af9ac00a61767062980b13eb4ae8d2571 Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Fri, 16 Feb 2024 15:13:37 +
Subject: [PATCH 1/2] [mlir][Transforms] Dialect conversion: Improve signature
 conversion API

This commit improves the block signature conversion API of the dialect 
conversion.

There is the following comment in `ArgConverter::applySignatureConversion`:
```
// If no arguments are being changed or added, there is nothing to do.
```

However, the implementation actually used to replace a block with a new block 
even if the block argument types do not change (i.e., there is "nothing to 
do"). This is fixed in this commit. The documentation of the public 
`ConversionPatternRewriter` API is updated accordingly.

This commit also removes a check that used to *sometimes* skip a block 
signature conversion if the block was already converted. This is not consistent 
with the public `ConversionPatternRewriter` API; blocks should always be 
converted, regardless of whether they were already converted or not.

Block signature conversion also used to be silently skipped when the specified 
block was detached. Instead of silently skipping, an assertion is triggered. 
Attempting to convert a detached block (which is likely an erased block) is 
invalid API usage.
---
 mlir/include/mlir/Transforms/DialectConversion.h | 12 +---
 mlir/lib/Transforms/Utils/DialectConversion.cpp  | 10 +++---
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/mlir/include/mlir/Transforms/DialectConversion.h 
b/mlir/include/mlir/Transforms/DialectConversion.h
index 0d7722aa07ee38..2575be4cdea1ac 100644
--- a/mlir/include/mlir/Transforms/DialectConversion.h
+++ b/mlir/include/mlir/Transforms/DialectConversion.h
@@ -663,6 +663,8 @@ class ConversionPatternRewriter final : public 
PatternRewriter {
   /// Apply a signature conversion to the entry block of the given region. This
   /// replaces the entry block with a new block containing the updated
   /// signature. The new entry block to the region is returned for convenience.
+  /// If no block argument types are changing, the entry original block will be
+  /// left in place and returned.
   ///
   /// If provided, `converter` will be used for any materializations.
   Block *
@@ -671,8 +673,11 @@ class ConversionPatternRewriter final : public 
PatternRewriter {
const TypeConverter *converter = nullptr);
 
   /// Convert the types of block arguments within the given region. This
-  /// replaces each block with a new block containing the updated signature. 
The
-  /// entry block may have a special conversion if `entryConversion` is
+  /// replaces each block with a new block containing the updated signature. If
+  /// an updated signature would match the current signature, the respective
+  /// block is left in place as is.
+  ///
+  /// The entry block may have a special conversion if `entryConversion` is
   /// provided. On success, the new entry block to the region is returned for
   /// convenience. Otherwise, failure is returned.
   FailureOr convertRegionTypes(
@@ -681,7 +686,8 @@ class ConversionPatternRewriter final : public 
PatternRewriter {
 
   /// Convert the types of block arguments within the given region except for
   /// the entry region. This replaces each non-entry block with a new block
-  /// containing the updated signature.
+  /// containing the updated signature. If an updated signature would match the
+  /// current signature, the respective block is left in place as is.
   ///
   /// If special conversion behavior is needed for the non-entry blocks (for
   /// example, we need to convert only a subset of a BB arguments), such
diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp 
b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index 35028001a03dd9..c16bb144efecf5 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -544,12 +544,8 @@ FailureOr ArgConverter::convertSignature(
 Block *block, const TypeConverter *converter,
 ConversionValueMapping &mapping,
 SmallVectorImpl &argReplacements) {
-  // Check if the block was already converted.
-  // * If the block is mapped in `conversionInfo`, it is a converted block.
-  // * If the block is detached, conservatively assume that it is going to be
-  //   deleted; it is likely the old block (before it was converted).
-  if (conversionInfo.count(block) || !block->getParent())
-return block;
+  assert(block->getParent() && "cannot convert signature of detached block");
+
   // If a converter wasn't provided, and the block wasn't already converted,
   // there is nothing we can do.
   if (!converter)
@@ -570,7 +566,7 @@ Block *ArgConverter::applySignatureConversion(
   // If no arguments are being changed or added, there is nothing to do.
   unsigned origArgCount

[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Turn block type conversion into `IRRewrite` (PR #81756)

2024-02-16 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer edited 
https://github.com/llvm/llvm-project/pull/81756


[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Turn op/block arg replacements into `IRRewrite`s (PR #81757)

2024-02-16 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer updated 
https://github.com/llvm/llvm-project/pull/81757

>From b8d4cbd5e237dfc9fa6b7420b85a8e5de94f0725 Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Fri, 16 Feb 2024 15:19:07 +
Subject: [PATCH] [mlir][Transforms][NFC] Turn op/block arg replacements into
 `IRRewrite`s

This commit is a refactoring of the dialect conversion. The dialect conversion 
maintains a list of "IR rewrites" that can be committed (upon success) or rolled 
back (upon failure).

Until now, op replacements and block argument replacements were tracked in 
separate data structures inside the dialect conversion. This commit turns them 
into `IRRewrite`s, so that they can be committed or rolled back just like any 
other rewrite. This simplifies the internal state of the dialect conversion.

Overview of changes:
* Add two new rewrite classes: `ReplaceBlockArgRewrite` and 
`ReplaceOperationRewrite`. Remove the `OpReplacement` helper class; it is now 
part of `ReplaceOperationRewrite`.
* Simplify `RewriterState`: `numReplacements` and `numArgReplacements` are no 
longer needed; both are now tracked by `numRewrites`.
* Add `IRRewrite::cleanup`. Operations should not be erased in `commit` because 
they may still be referenced in other internal state of the dialect conversion 
(`mapping`). Detaching operations is fine.
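
A rough standalone model of the scheme in this overview, assuming a single ordered rewrite list whose length serves as the snapshot (the role `numRewrites` plays above). All `Toy*` names are hypothetical and the code does not use the MLIR API. It shows rollback to a snapshot in reverse order, and a separate cleanup pass that runs only after every rewrite has been committed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a rewrite interface.
struct ToyRewrite {
  virtual ~ToyRewrite() = default;
  virtual void rollback() = 0;
  virtual void commit() {}
  virtual void cleanup() {}
};

// A rewrite that takes effect immediately and can be undone.
class ToyRename : public ToyRewrite {
public:
  ToyRename(std::string &targetRef, std::string newName)
      : target(targetRef), oldName(targetRef) {
    target = std::move(newName);
  }
  void rollback() override { target = oldName; }

private:
  std::string &target;
  std::string oldName;
};

struct ToyOp {
  bool linked = true;
};

// Replacing an op: detach it in commit, destroy it only in cleanup.
class ToyReplaceOp : public ToyRewrite {
public:
  explicit ToyReplaceOp(std::unique_ptr<ToyOp> &slotRef) : slot(slotRef) {}
  void rollback() override {} // nothing was modified before commit
  void commit() override { slot->linked = false; }
  void cleanup() override { slot.reset(); }

private:
  std::unique_ptr<ToyOp> &slot;
};

struct ToyRewriteLog {
  std::vector<std::unique_ptr<ToyRewrite>> rewrites;

  // The snapshot is just the current number of rewrites.
  std::size_t snapshot() const { return rewrites.size(); }

  // Undo rewrites in reverse order until the snapshot size is reached.
  void rollbackTo(std::size_t numRewrites) {
    while (rewrites.size() > numRewrites) {
      rewrites.back()->rollback();
      rewrites.pop_back();
    }
  }

  // Commit everything first; erase objects only in the cleanup pass.
  void applyAll() {
    for (auto &rewrite : rewrites)
      rewrite->commit();
    for (auto &rewrite : rewrites)
      rewrite->cleanup();
    rewrites.clear();
  }
};
```

Splitting `commit` from `cleanup` means no rewrite destroys an object while a later rewrite's `commit` might still look at it.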
---
 .../Transforms/Utils/DialectConversion.cpp| 291 +-
 1 file changed, 153 insertions(+), 138 deletions(-)

diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp 
b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index 30133a14dbae56..21fd02fcd5c725 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -153,14 +153,12 @@ namespace {
 /// This is useful when saving and undoing a set of rewrites.
 struct RewriterState {
   RewriterState(unsigned numCreatedOps, unsigned numUnresolvedMaterializations,
-unsigned numReplacements, unsigned numArgReplacements,
 unsigned numRewrites, unsigned numIgnoredOperations,
 unsigned numErased)
   : numCreatedOps(numCreatedOps),
 numUnresolvedMaterializations(numUnresolvedMaterializations),
-numReplacements(numReplacements),
-numArgReplacements(numArgReplacements), numRewrites(numRewrites),
-numIgnoredOperations(numIgnoredOperations), numErased(numErased) {}
+numRewrites(numRewrites), numIgnoredOperations(numIgnoredOperations),
+numErased(numErased) {}
 
   /// The current number of created operations.
   unsigned numCreatedOps;
@@ -168,12 +166,6 @@ struct RewriterState {
   /// The current number of unresolved materializations.
   unsigned numUnresolvedMaterializations;
 
-  /// The current number of replacements queued.
-  unsigned numReplacements;
-
-  /// The current number of argument replacements queued.
-  unsigned numArgReplacements;
-
   /// The current number of rewrites performed.
   unsigned numRewrites;
 
@@ -184,20 +176,6 @@ struct RewriterState {
   unsigned numErased;
 };
 
-//===--===//
-// OpReplacement
-
-/// This class represents one requested operation replacement via 'replaceOp' 
or
-/// 'eraseOp`.
-struct OpReplacement {
-  OpReplacement(const TypeConverter *converter = nullptr)
-  : converter(converter) {}
-
-  /// An optional type converter that can be used to materialize conversions
-  /// between the new and old values if necessary.
-  const TypeConverter *converter;
-};
-
 
//===--===//
 // UnresolvedMaterialization
 
@@ -318,8 +296,10 @@ class IRRewrite {
 MoveBlock,
 SplitBlock,
 BlockTypeConversion,
+ReplaceBlockArg,
 MoveOperation,
-ModifyOperation
+ModifyOperation,
+ReplaceOperation
   };
 
   virtual ~IRRewrite() = default;
@@ -330,6 +310,12 @@ class IRRewrite {
   /// Commit the rewrite.
   virtual void commit() {}
 
+  /// Cleanup operations. Operations may be unlinked from their blocks during
+  /// the commit/rollback phase, but they must not be erased yet. This is
+  /// because internal dialect conversion state (such as `mapping`) may still
+  /// be using them. Operations must be erased during cleanup.
+  virtual void cleanup() {}
+
   Kind getKind() const { return kind; }
 
   static bool classof(const IRRewrite *rewrite) { return true; }
@@ -356,7 +342,7 @@ class BlockRewrite : public IRRewrite {
 
   static bool classof(const IRRewrite *rewrite) {
 return rewrite->getKind() >= Kind::CreateBlock &&
-   rewrite->getKind() <= Kind::BlockTypeConversion;
+   rewrite->getKind() <= Kind::ReplaceBlockArg;
   }
 
 protected:
@@ -424,6 +410,8 @@ class EraseBlockRewrite : public BlockRewrite {
   void commit() override {
 // Erase the block.
 assert(block && "expected block");
+assert(block->empty() && "expected empty b

[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Turn op creation into `IRRewrite` (PR #81759)

2024-02-16 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer updated 
https://github.com/llvm/llvm-project/pull/81759

>From 67010343f91c6808a18731f01d139db6cb36fde6 Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Fri, 16 Feb 2024 15:21:48 +
Subject: [PATCH] [mlir][Transforms][NFC] Turn op creation into `IRRewrite`

This commit is a refactoring of the dialect conversion. The dialect conversion 
maintains a list of "IR rewrites" that can be committed (upon success) or rolled 
back (upon failure).

Until now, the dialect conversion kept track of "op creation" in separate 
internal data structures. This commit turns "op creation" into an `IRRewrite` 
that can be committed and rolled back just like any other rewrite. This commit 
simplifies the internal state of the dialect conversion.
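
The diff for this patch appends `CreateOperation` to the `Kind` enum and widens the upper bound in `OperationRewrite::classof`. The reason the enum order matters can be sketched with a shortened, hypothetical enum (`ToyKind` lists only some of the kinds and is not the real definition): each family of rewrites occupies a contiguous range, so membership is a pair of comparisons.

```cpp
#include <cassert>

// Shortened, hypothetical version of the rewrite kind enum. Enumerators of
// one family must stay contiguous for the range checks below to be valid.
enum class ToyKind {
  // Block rewrites
  CreateBlock,
  MoveBlock,
  ReplaceBlockArg,
  // Operation rewrites
  MoveOperation,
  ReplaceOperation,
  CreateOperation
};

// Range check in the style of a block-rewrite `classof`.
inline bool isBlockRewrite(ToyKind kind) {
  return kind >= ToyKind::CreateBlock && kind <= ToyKind::ReplaceBlockArg;
}

// Range check in the style of an operation-rewrite `classof`. Adding a new
// operation kind at the end requires moving the upper bound to it.
inline bool isOperationRewrite(ToyKind kind) {
  return kind >= ToyKind::MoveOperation && kind <= ToyKind::CreateOperation;
}
```

If a new kind were added without updating the matching bound, the new kind would silently fall outside its family's range.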
---
 .../Transforms/Utils/DialectConversion.cpp| 104 +++---
 1 file changed, 66 insertions(+), 38 deletions(-)

diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp 
b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index 21fd02fcd5c725..5b7ad4e7b8e281 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -152,17 +152,12 @@ namespace {
 /// This class contains a snapshot of the current conversion rewriter state.
 /// This is useful when saving and undoing a set of rewrites.
 struct RewriterState {
-  RewriterState(unsigned numCreatedOps, unsigned numUnresolvedMaterializations,
-unsigned numRewrites, unsigned numIgnoredOperations,
-unsigned numErased)
-  : numCreatedOps(numCreatedOps),
-numUnresolvedMaterializations(numUnresolvedMaterializations),
+  RewriterState(unsigned numUnresolvedMaterializations, unsigned numRewrites,
+unsigned numIgnoredOperations, unsigned numErased)
+  : numUnresolvedMaterializations(numUnresolvedMaterializations),
 numRewrites(numRewrites), numIgnoredOperations(numIgnoredOperations),
 numErased(numErased) {}
 
-  /// The current number of created operations.
-  unsigned numCreatedOps;
-
   /// The current number of unresolved materializations.
   unsigned numUnresolvedMaterializations;
 
@@ -299,7 +294,8 @@ class IRRewrite {
 ReplaceBlockArg,
 MoveOperation,
 ModifyOperation,
-ReplaceOperation
+ReplaceOperation,
+CreateOperation
   };
 
   virtual ~IRRewrite() = default;
@@ -372,7 +368,11 @@ class CreateBlockRewrite : public BlockRewrite {
 auto &blockOps = block->getOperations();
 while (!blockOps.empty())
   blockOps.remove(blockOps.begin());
-eraseBlock(block);
+if (block->getParent()) {
+  eraseBlock(block);
+} else {
+  delete block;
+}
   }
 };
 
@@ -602,7 +602,7 @@ class OperationRewrite : public IRRewrite {
 
   static bool classof(const IRRewrite *rewrite) {
 return rewrite->getKind() >= Kind::MoveOperation &&
-   rewrite->getKind() <= Kind::ReplaceOperation;
+   rewrite->getKind() <= Kind::CreateOperation;
   }
 
 protected:
@@ -708,6 +708,19 @@ class ReplaceOperationRewrite : public OperationRewrite {
   /// 1->N conversion of some kind.
   bool changedResults;
 };
+
+class CreateOperationRewrite : public OperationRewrite {
+public:
+  CreateOperationRewrite(ConversionPatternRewriterImpl &rewriterImpl,
+ Operation *op)
+  : OperationRewrite(Kind::CreateOperation, rewriterImpl, op) {}
+
+  static bool classof(const IRRewrite *rewrite) {
+return rewrite->getKind() == Kind::CreateOperation;
+  }
+
+  void rollback() override;
+};
 } // namespace
 
 /// Return "true" if there is an operation rewrite that matches the specified
@@ -925,9 +938,6 @@ struct ConversionPatternRewriterImpl : public 
RewriterBase::Listener {
   // replacing a value with one of a different type.
   ConversionValueMapping mapping;
 
-  /// Ordered vector of all of the newly created operations during conversion.
-  SmallVector createdOps;
-
   /// Ordered vector of all unresolved type conversion materializations during
   /// conversion.
   SmallVector unresolvedMaterializations;
@@ -1110,7 +1120,18 @@ void ReplaceOperationRewrite::rollback() {
 
 void ReplaceOperationRewrite::cleanup() { eraseOp(op); }
 
+void CreateOperationRewrite::rollback() {
+  for (Region &region : op->getRegions()) {
+while (!region.getBlocks().empty())
+  region.getBlocks().remove(region.getBlocks().begin());
+  }
+  op->dropAllUses();
+  eraseOp(op);
+}
+
 void ConversionPatternRewriterImpl::detachNestedAndErase(Operation *op) {
+  // if (erasedIR.erasedOps.contains(op)) return;
+
   for (Region &region : op->getRegions()) {
 for (Block &block : region.getBlocks()) {
   while (!block.getOperations().empty())
@@ -1127,8 +1148,6 @@ void ConversionPatternRewriterImpl::discardRewrites() {
   // Remove any newly created ops.
   for (UnresolvedMaterialization &materialization : unresolvedMaterializations)
 detachNestedAndErase(materialization.getOp());
-  for (auto *op : llvm:

[llvm-branch-commits] [mlir] [mlir][Transforms][NFC][WIP] Turn unresolved materializations into `IRRewrite`s (PR #81761)

2024-02-16 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer updated 
https://github.com/llvm/llvm-project/pull/81761

>From ab0cd8c2d3b66b0a11bee128dc4e11e754e11a36 Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Fri, 16 Feb 2024 15:24:42 +
Subject: [PATCH] [WIP] UnresolvedMaterialization

BEGIN_PUBLIC
No public commit message needed for presubmit.
END_PUBLIC
---
 .../Transforms/Utils/DialectConversion.cpp| 374 +-
 1 file changed, 179 insertions(+), 195 deletions(-)

diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index 5b7ad4e7b8e281..6fc9225568d028 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -152,15 +152,11 @@ namespace {
 /// This class contains a snapshot of the current conversion rewriter state.
 /// This is useful when saving and undoing a set of rewrites.
 struct RewriterState {
-  RewriterState(unsigned numUnresolvedMaterializations, unsigned numRewrites,
-unsigned numIgnoredOperations, unsigned numErased)
-  : numUnresolvedMaterializations(numUnresolvedMaterializations),
-numRewrites(numRewrites), numIgnoredOperations(numIgnoredOperations),
+  RewriterState(unsigned numRewrites, unsigned numIgnoredOperations,
+unsigned numErased)
+  : numRewrites(numRewrites), numIgnoredOperations(numIgnoredOperations),
 numErased(numErased) {}
 
-  /// The current number of unresolved materializations.
-  unsigned numUnresolvedMaterializations;
-
   /// The current number of rewrites performed.
   unsigned numRewrites;
 
@@ -171,109 +167,10 @@ struct RewriterState {
   unsigned numErased;
 };
 
-//===--===//
-// UnresolvedMaterialization
-
-/// This class represents an unresolved materialization, i.e. a materialization
-/// that was inserted during conversion that needs to be legalized at the end of
-/// the conversion process.
-class UnresolvedMaterialization {
-public:
-  /// The type of materialization.
-  enum Kind {
-/// This materialization materializes a conversion for an illegal block
-/// argument type, to a legal one.
-Argument,
-
-/// This materialization materializes a conversion from an illegal type to a
-/// legal one.
-Target
-  };
-
-  UnresolvedMaterialization(UnrealizedConversionCastOp op = nullptr,
-const TypeConverter *converter = nullptr,
-Kind kind = Target, Type origOutputType = nullptr)
-  : op(op), converterAndKind(converter, kind),
-origOutputType(origOutputType) {}
-
-  /// Return the temporary conversion operation inserted for this
-  /// materialization.
-  UnrealizedConversionCastOp getOp() const { return op; }
-
-  /// Return the type converter of this materialization (which may be null).
-  const TypeConverter *getConverter() const {
-return converterAndKind.getPointer();
-  }
-
-  /// Return the kind of this materialization.
-  Kind getKind() const { return converterAndKind.getInt(); }
-
-  /// Set the kind of this materialization.
-  void setKind(Kind kind) { converterAndKind.setInt(kind); }
-
-  /// Return the original illegal output type of the input values.
-  Type getOrigOutputType() const { return origOutputType; }
-
-private:
-  /// The unresolved materialization operation created during conversion.
-  UnrealizedConversionCastOp op;
-
-  /// The corresponding type converter to use when resolving this
-  /// materialization, and the kind of this materialization.
-  llvm::PointerIntPair converterAndKind;
-
-  /// The original output type. This is only used for argument conversions.
-  Type origOutputType;
-};
-} // namespace
-
-/// Build an unresolved materialization operation given an output type and set
-/// of input operands.
-static Value buildUnresolvedMaterialization(
-UnresolvedMaterialization::Kind kind, Block *insertBlock,
-Block::iterator insertPt, Location loc, ValueRange inputs, Type outputType,
-Type origOutputType, const TypeConverter *converter,
-SmallVectorImpl &unresolvedMaterializations) {
-  // Avoid materializing an unnecessary cast.
-  if (inputs.size() == 1 && inputs.front().getType() == outputType)
-return inputs.front();
-
-  // Create an unresolved materialization. We use a new OpBuilder to avoid
-  // tracking the materialization like we do for other operations.
-  OpBuilder builder(insertBlock, insertPt);
-  auto convertOp =
-  builder.create(loc, outputType, inputs);
-  unresolvedMaterializations.emplace_back(convertOp, converter, kind,
-  origOutputType);
-  return convertOp.getResult(0);
-}
-static Value buildUnresolvedArgumentMaterialization(
-PatternRewriter &rewriter, Location loc, ValueRange inputs,
-Type origOutputType, Type outputType, const TypeConverter *converter,
-SmallVectorImpl &unresolvedMaterial

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Main splitting functionality dev-complete (PR #82003)

2024-02-16 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz created 
https://github.com/llvm/llvm-project/pull/82003

[flang][OpenMP] TableGen support for getting leaf constructs

Implement getLeafConstructs(D), which for a composite directive D will return 
the list of the constituent leaf directives.
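
As an illustration of what such a query could look like, here is a minimal standalone sketch. It is not the TableGen-generated code from this patch: the `Directive` enumeration, its members, and the behavior for non-composite directives (returning an empty list) are assumptions made for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical directive set; the real enumeration is generated by TableGen.
enum class Directive { Parallel, Do, Simd, ParallelDo, ParallelDoSimd };

// Return the constituent leaf directives of a compound directive `d`.
// Assumption: a directive that is itself a leaf yields an empty list.
std::vector<Directive> getLeafConstructs(Directive d) {
  switch (d) {
  case Directive::ParallelDo:
    return {Directive::Parallel, Directive::Do};
  case Directive::ParallelDoSimd:
    return {Directive::Parallel, Directive::Do, Directive::Simd};
  default:
    return {};
  }
}
```

A lowering routine could then iterate over the returned leaves in order and handle each leaf construct separately.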

[flang][OpenMP] Set OpenMP attributes in MLIR module in bbc before lowering

Right now attributes like OpenMP version or target attributes for offload are 
set after lowering in bbc. The flang frontend sets them before lowering, making 
them available in the lowering process.

This change sets them before lowering in bbc as well.

getOpenMPVersion

>From ac2d8fd31c0a2b8f818a73a619496d5263c3ccb8 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek 
Date: Tue, 16 Jan 2024 16:40:47 -0600
Subject: [PATCH] [flang][OpenMP] Main splitting functionality dev-complete

[flang][OpenMP] TableGen support for getting leaf constructs

Implement getLeafConstructs(D), which for a composite directive D
will return the list of the constituent leaf directives.

[flang][OpenMP] Set OpenMP attributes in MLIR module in bbc before lowering

Right now attributes like OpenMP version or target attributes for offload
are set after lowering in bbc. The flang frontend sets them before lowering,
making them available in the lowering process.

This change sets them before lowering in bbc as well.

getOpenMPVersion
---
 flang/lib/Lower/OpenMP.cpp| 1044 -
 flang/tools/bbc/bbc.cpp   |2 +-
 .../llvm/Frontend/Directive/DirectiveBase.td  |4 +
 llvm/include/llvm/Frontend/OpenMP/OMP.td  |   60 +-
 llvm/include/llvm/TableGen/DirectiveEmitter.h |4 +
 llvm/utils/TableGen/DirectiveEmitter.cpp  |   77 ++
 6 files changed, 1174 insertions(+), 17 deletions(-)

diff --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp
index e45ab842b15556..ed6a0063848b18 100644
--- a/flang/lib/Lower/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP.cpp
@@ -31,6 +31,7 @@
 #include "mlir/Dialect/OpenMP/OpenMPDialect.h"
 #include "mlir/Dialect/SCF/IR/SCF.h"
 #include "mlir/Transforms/RegionUtils.h"
+#include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Frontend/OpenMP/OMPConstants.h"
 #include "llvm/Support/CommandLine.h"
@@ -48,6 +49,29 @@ using DeclareTargetCapturePair =
 // Common helper functions
 
//===--===//
 
+static llvm::ArrayRef getWorksharing() {
+  static llvm::omp::Directive worksharing[] = {
+  llvm::omp::Directive::OMPD_do, llvm::omp::Directive::OMPD_for,
+  llvm::omp::Directive::OMPD_scope,  llvm::omp::Directive::OMPD_sections,
+  llvm::omp::Directive::OMPD_single, llvm::omp::Directive::OMPD_workshare,
+  };
+  return worksharing;
+}
+
+static llvm::ArrayRef getWorksharingLoop() {
+  static llvm::omp::Directive worksharingLoop[] = {
+  llvm::omp::Directive::OMPD_do,
+  llvm::omp::Directive::OMPD_for,
+  };
+  return worksharingLoop;
+}
+
+static uint32_t getOpenMPVersion(const mlir::ModuleOp &mod) {
+  if (mlir::Attribute verAttr = mod->getAttr("omp.version"))
+return llvm::cast(verAttr).getVersion();
+  llvm_unreachable("Exoecting OpenMP version attribute in module");
+}
+
 static Fortran::semantics::Symbol *
 getOmpObjectSymbol(const Fortran::parser::OmpObject &ompObject) {
   Fortran::semantics::Symbol *sym = nullptr;
@@ -166,6 +190,15 @@ struct SymDsgExtractor {
 return t;
   }
 
+  static semantics::Symbol *symbol_addr(const evaluate::SymbolRef &ref) {
+// Symbols cannot be created after semantic checks, so all symbol
+// pointers that are non-null must point to one of those pre-existing
+// objects. Throughout the code, symbols are often pointed to by
+// non-const pointers, so there is no harm in casting the constness
+// away.
+return const_cast<semantics::Symbol *>(&ref.get());
+  }
+
   template  //
   static SymDsg visit(T &&) {
 // Use this to see missing overloads:
@@ -175,19 +208,12 @@ struct SymDsgExtractor {
 
   template  //
   static SymDsg visit(const evaluate::Designator &e) {
-// Symbols cannot be created after semantic checks, so all symbol
-// pointers that are non-null must point to one of those pre-existing
-// objects. Throughout the code, symbols are often pointed to by
-// non-const pointers, so there is no harm in casting the constness
-// away.
-return std::make_tuple(const_cast(e.GetLastSymbol()),
+return std::make_tuple(symbol_addr(*e.GetLastSymbol()),
evaluate::AsGenericExpr(AsRvalueRef(e)));
   }
 
   static SymDsg visit(const evaluate::ProcedureDesignator &e) {
-// See comment above regarding const_cast.
-return std::make_tuple(const_cast(e.GetSymbol()),
-   std::nullopt);
+return std::make_tuple(symbol_addr(*e.GetSymbol()), std::nullopt);
   }
 
   template  //
@@ -313,6 +339,42 @@ std::optional maybeApply(F &&func, const std::optional &inp) {
   return std

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Main splitting functionality dev-complete (PR #82003)

2024-02-16 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-openmp

Author: Krzysztof Parzyszek (kparzysz)


Changes

[flang][OpenMP] TableGen support for getting leaf constructs

Implement getLeafConstructs(D), which for a composite directive D will return 
the list of the constituent leaf directives.

[flang][OpenMP] Set OpenMP attributes in MLIR module in bbc before lowering

Right now attributes like OpenMP version or target attributes for offload are 
set after lowering in bbc. The flang frontend sets them before lowering, making 
them available in the lowering process.

This change sets them before lowering in bbc as well.

getOpenMPVersion

---

Patch is 63.80 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/82003.diff


6 Files Affected:

- (modified) flang/lib/Lower/OpenMP.cpp (+1034-10) 
- (modified) flang/tools/bbc/bbc.cpp (+1-1) 
- (modified) llvm/include/llvm/Frontend/Directive/DirectiveBase.td (+4) 
- (modified) llvm/include/llvm/Frontend/OpenMP/OMP.td (+54-6) 
- (modified) llvm/include/llvm/TableGen/DirectiveEmitter.h (+4) 
- (modified) llvm/utils/TableGen/DirectiveEmitter.cpp (+77) 


``diff
diff --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp
index e45ab842b15556..ed6a0063848b18 100644
--- a/flang/lib/Lower/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP.cpp
@@ -31,6 +31,7 @@
 #include "mlir/Dialect/OpenMP/OpenMPDialect.h"
 #include "mlir/Dialect/SCF/IR/SCF.h"
 #include "mlir/Transforms/RegionUtils.h"
+#include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Frontend/OpenMP/OMPConstants.h"
 #include "llvm/Support/CommandLine.h"
@@ -48,6 +49,29 @@ using DeclareTargetCapturePair =
 // Common helper functions
 
//===--===//
 
+static llvm::ArrayRef getWorksharing() {
+  static llvm::omp::Directive worksharing[] = {
+  llvm::omp::Directive::OMPD_do, llvm::omp::Directive::OMPD_for,
+  llvm::omp::Directive::OMPD_scope,  llvm::omp::Directive::OMPD_sections,
+  llvm::omp::Directive::OMPD_single, llvm::omp::Directive::OMPD_workshare,
+  };
+  return worksharing;
+}
+
+static llvm::ArrayRef getWorksharingLoop() {
+  static llvm::omp::Directive worksharingLoop[] = {
+  llvm::omp::Directive::OMPD_do,
+  llvm::omp::Directive::OMPD_for,
+  };
+  return worksharingLoop;
+}
+
+static uint32_t getOpenMPVersion(const mlir::ModuleOp &mod) {
+  if (mlir::Attribute verAttr = mod->getAttr("omp.version"))
+return llvm::cast(verAttr).getVersion();
+  llvm_unreachable("Exoecting OpenMP version attribute in module");
+}
+
 static Fortran::semantics::Symbol *
 getOmpObjectSymbol(const Fortran::parser::OmpObject &ompObject) {
   Fortran::semantics::Symbol *sym = nullptr;
@@ -166,6 +190,15 @@ struct SymDsgExtractor {
 return t;
   }
 
+  static semantics::Symbol *symbol_addr(const evaluate::SymbolRef &ref) {
+// Symbols cannot be created after semantic checks, so all symbol
+// pointers that are non-null must point to one of those pre-existing
+// objects. Throughout the code, symbols are often pointed to by
+// non-const pointers, so there is no harm in casting the constness
+// away.
+return const_cast<semantics::Symbol *>(&ref.get());
+  }
+
   template  //
   static SymDsg visit(T &&) {
 // Use this to see missing overloads:
@@ -175,19 +208,12 @@ struct SymDsgExtractor {
 
   template  //
   static SymDsg visit(const evaluate::Designator &e) {
-// Symbols cannot be created after semantic checks, so all symbol
-// pointers that are non-null must point to one of those pre-existing
-// objects. Throughout the code, symbols are often pointed to by
-// non-const pointers, so there is no harm in casting the constness
-// away.
-return std::make_tuple(const_cast(e.GetLastSymbol()),
+return std::make_tuple(symbol_addr(*e.GetLastSymbol()),
evaluate::AsGenericExpr(AsRvalueRef(e)));
   }
 
   static SymDsg visit(const evaluate::ProcedureDesignator &e) {
-// See comment above regarding const_cast.
-return std::make_tuple(const_cast(e.GetSymbol()),
-   std::nullopt);
+return std::make_tuple(symbol_addr(*e.GetSymbol()), std::nullopt);
   }
 
   template  //
@@ -313,6 +339,42 @@ std::optional maybeApply(F &&func, const std::optional &inp) {
   return std::move(func(*inp));
 }
 
+std::optional
+getBaseObject(const Object &object,
+  Fortran::semantics::SemanticsContext &semaCtx) {
+  // If it's just the symbol, then there is no base.
+  if (!object.dsg)
+return std::nullopt;
+
+  auto maybeRef = evaluate::ExtractDataRef(*object.dsg);
+  if (!maybeRef)
+return std::nullopt;
+
+  evaluate::DataRef ref = *maybeRef;
+
+  if (std::get_if(&ref.u)) {
+return std::nullopt;
+  } else if (auto *comp = std::get_if(&ref.u)) {
+const evaluate::DataRef &base = comp->base();
+return Object{SymDsgExtractor::symbol_addr(base.GetLastSymbol()),
+   

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Main splitting functionality dev-complete (PR #82003)

2024-02-16 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz converted_to_draft 
https://github.com/llvm/llvm-project/pull/82003


[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Main splitting functionality dev-complete (PR #82003)

2024-02-16 Thread Krzysztof Parzyszek via llvm-branch-commits

kparzysz wrote:

This is a follow-up to the [previous 
draft](https://github.com/llvm/llvm-project/pull/80059), based on the 
clause-representation stack.

https://github.com/llvm/llvm-project/pull/82003


[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Main splitting functionality dev-complete (PR #82003)

2024-02-16 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz edited 
https://github.com/llvm/llvm-project/pull/82003
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [OpenMP][MLIR] Extend explicit derived type member mapping support for OpenMP dialects lowering to LLVM-IR (PR #81510)

2024-02-16 Thread via llvm-branch-commits


@@ -1783,6 +1783,98 @@ void collectMapDataFromMapOperands(MapInfoData &mapData,
   }
 }
 
+static int getMapDataMemberIdx(MapInfoData &mapData,
+   mlir::omp::MapInfoOp memberOp) {
+  int memberDataIdx = -1;
+  for (size_t i = 0; i < mapData.MapClause.size(); ++i) {
+if (mapData.MapClause[i] == memberOp)
+  memberDataIdx = i;
+  }
+  return memberDataIdx;
+}

agozillon wrote:

Thank you, that's an excellent solution :-) 
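
The suggestion being acknowledged is not quoted in this message. For reference, one common way to write an index lookup like the one above is with `std::find`, as in the following standalone sketch. This is an assumption for illustration, not necessarily what the reviewer proposed, and `findIndex` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Return the index of the first element equal to `needle`, or -1 if absent.
template <typename T>
int findIndex(const std::vector<T> &items, const T &needle) {
  auto it = std::find(items.begin(), items.end(), needle);
  if (it == items.end())
    return -1;
  return static_cast<int>(std::distance(items.begin(), it));
}
```

Note that the quoted loop has no early exit, so it yields the last matching index, whereas this sketch yields the first; the two agree when entries are unique.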

https://github.com/llvm/llvm-project/pull/81510


[llvm-branch-commits] [libcxx] release/18.x: [libc++][modules] Re-add build dir CMakeLists.txt. (#81370) (PR #81651)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81651

>From 0756378b77054938b2e252c105e91395954366ec Mon Sep 17 00:00:00 2001
From: Mark de Wever 
Date: Tue, 13 Feb 2024 20:04:34 +0100
Subject: [PATCH] [libc++][modules] Re-add build dir CMakeLists.txt. (#81370)

This CMakeLists.txt is used to build modules without build system
support. This was removed in d06ae33ec32122bb526fb35025c1f0cf979f1090.
This is used in the documentation how to use modules.

Made some minor changes to make it work with the std.compat module using
the std module.

Note the CMakeLists.txt in the build dir should be removed once build
system support is generally available.

(cherry picked from commit fc0e9c8315564288f9079a633892abadace534cf)
---
 libcxx/docs/Modules.rst  |  4 ++
 libcxx/modules/CMakeLists.txt| 20 
 libcxx/modules/CMakeLists.txt.in | 88 
 3 files changed, 112 insertions(+)
 create mode 100644 libcxx/modules/CMakeLists.txt.in

diff --git a/libcxx/docs/Modules.rst b/libcxx/docs/Modules.rst
index 533c3fbd2a1eea..ee2b81d3b9e7ca 100644
--- a/libcxx/docs/Modules.rst
+++ b/libcxx/docs/Modules.rst
@@ -218,9 +218,13 @@ Building this project is done with the following steps, assuming the files
 
   $ mkdir build
   $ cmake -G Ninja -S . -B build -DCMAKE_CXX_COMPILER= 
-DLIBCXX_BUILD=
+  $ ninja -j1 std -C build
   $ ninja -C build
   $ build/main
 
+.. note:: The ``std`` dependencies of ``std.compat`` is not always resolved when
+  building the ``std`` target using multiple jobs.
+
 .. warning::  should point point to the real binary and
  not to a symlink.
 
diff --git a/libcxx/modules/CMakeLists.txt b/libcxx/modules/CMakeLists.txt
index 0388c048dacb8b..0dea8cfca94ac3 100644
--- a/libcxx/modules/CMakeLists.txt
+++ b/libcxx/modules/CMakeLists.txt
@@ -137,6 +137,25 @@ set(LIBCXX_MODULE_STD_COMPAT_SOURCES
   std.compat/cwctype.inc
 )
 
+# TODO MODULES the CMakeLists.txt in the build directory is only temporary.
+# This allows using as available in the build directory. Once build systems
+# have proper support for the installed files this will be removed.
+if ("${LIBCXX_GENERATED_INCLUDE_DIR}" STREQUAL "${LIBCXX_GENERATED_INCLUDE_TARGET_DIR}")
+  # This typically happens when the target is not installed.
+  set(LIBCXX_CONFIGURED_INCLUDE_DIRS "${LIBCXX_GENERATED_INCLUDE_DIR}")
+else()
+  # It's important that the arch directory be included first so that its header files
+  # which interpose on the default include dir be included instead of the default ones.
+  set(LIBCXX_CONFIGURED_INCLUDE_DIRS
+"${LIBCXX_GENERATED_INCLUDE_TARGET_DIR};${LIBCXX_GENERATED_INCLUDE_DIR}"
+  )
+endif()
+configure_file(
+  "CMakeLists.txt.in"
+  "${LIBCXX_GENERATED_MODULE_DIR}/CMakeLists.txt"
+  @ONLY
+)
+
 set(LIBCXX_MODULE_STD_INCLUDE_SOURCES)
 foreach(file ${LIBCXX_MODULE_STD_SOURCES})
   set(
@@ -166,6 +185,7 @@ configure_file(
 )
 
 set(_all_modules)
+list(APPEND _all_modules "${LIBCXX_GENERATED_MODULE_DIR}/CMakeLists.txt")
 list(APPEND _all_modules "${LIBCXX_GENERATED_MODULE_DIR}/std.cppm")
 list(APPEND _all_modules "${LIBCXX_GENERATED_MODULE_DIR}/std.compat.cppm")
 foreach(file ${LIBCXX_MODULE_STD_SOURCES} ${LIBCXX_MODULE_STD_COMPAT_SOURCES})
diff --git a/libcxx/modules/CMakeLists.txt.in b/libcxx/modules/CMakeLists.txt.in
new file mode 100644
index 00..e332d70cc16333
--- /dev/null
+++ b/libcxx/modules/CMakeLists.txt.in
@@ -0,0 +1,88 @@
+cmake_minimum_required(VERSION 3.26)
+
+project(libc++-modules LANGUAGES CXX)
+
+# Enable CMake's module support
+if(CMAKE_VERSION VERSION_LESS "3.28.0")
+  if(CMAKE_VERSION VERSION_LESS "3.27.0")
+set(CMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API "2182bf5c-ef0d-489a-91da-49dbc3090d2a")
+  else()
+set(CMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API "aa1f7df0-828a-4fcd-9afc-2dc80491aca7")
+  endif()
+  set(CMAKE_EXPERIMENTAL_CXX_MODULE_DYNDEP 1)
+else()
+  cmake_policy(VERSION 3.28)
+endif()
+
+# Default to C++ extensions being off. Libc++'s modules support have trouble
+# with extensions right now.
+set(CMAKE_CXX_EXTENSIONS OFF)
+
+# Propagates the CMake options to the modules.
+#
+# This uses the std module hard-coded since the std.compat module does not
+# depend on these flags.
+macro(compile_define_if_not condition def)
+  if (NOT ${condition})
+target_compile_definitions(std PRIVATE ${def})
+  endif()
+endmacro()
+macro(compile_define_if condition def)
+  if (${condition})
+target_compile_definitions(std PRIVATE ${def})
+  endif()
+endmacro()
+
+### STD
+
+add_library(std)
+target_sources(std
+  PUBLIC FILE_SET cxx_modules TYPE CXX_MODULES FILES
+std.cppm
+)
+
+target_include_directories(std SYSTEM PRIVATE @LIBCXX_CONFIGURED_INCLUDE_DIRS@)
+
+if (NOT @LIBCXX_ENABLE_EXCEPTIONS@)
+  target_compile_options(std PUBLIC -fno-exceptions)
+endif()
+
+target_compile_options(std
+  PUBLIC
+-nostdinc++
+-Wno-reserved-module-identifier
+-Wno-reserved-user-defined-literal
+   

[llvm-branch-commits] [libcxx] 0756378 - [libc++][modules] Re-add build dir CMakeLists.txt. (#81370)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Mark de Wever
Date: 2024-02-16T11:01:06-08:00
New Revision: 0756378b77054938b2e252c105e91395954366ec

URL: 
https://github.com/llvm/llvm-project/commit/0756378b77054938b2e252c105e91395954366ec
DIFF: 
https://github.com/llvm/llvm-project/commit/0756378b77054938b2e252c105e91395954366ec.diff

LOG: [libc++][modules] Re-add build dir CMakeLists.txt. (#81370)

This CMakeLists.txt is used to build modules without build system
support. This was removed in d06ae33ec32122bb526fb35025c1f0cf979f1090.
This is used in the documentation how to use modules.

Made some minor changes to make it work with the std.compat module using
the std module.

Note the CMakeLists.txt in the build dir should be removed once build
system support is generally available.

(cherry picked from commit fc0e9c8315564288f9079a633892abadace534cf)

Added: 
libcxx/modules/CMakeLists.txt.in

Modified: 
libcxx/docs/Modules.rst
libcxx/modules/CMakeLists.txt

Removed: 




diff  --git a/libcxx/docs/Modules.rst b/libcxx/docs/Modules.rst
index 533c3fbd2a1eea..ee2b81d3b9e7ca 100644
--- a/libcxx/docs/Modules.rst
+++ b/libcxx/docs/Modules.rst
@@ -218,9 +218,13 @@ Building this project is done with the following steps, assuming the files
 
   $ mkdir build
   $ cmake -G Ninja -S . -B build -DCMAKE_CXX_COMPILER= 
-DLIBCXX_BUILD=
+  $ ninja -j1 std -C build
   $ ninja -C build
   $ build/main
 
+.. note:: The ``std`` dependencies of ``std.compat`` is not always resolved when
+  building the ``std`` target using multiple jobs.
+
 .. warning::  should point point to the real binary and
  not to a symlink.
 

diff  --git a/libcxx/modules/CMakeLists.txt b/libcxx/modules/CMakeLists.txt
index 0388c048dacb8b..0dea8cfca94ac3 100644
--- a/libcxx/modules/CMakeLists.txt
+++ b/libcxx/modules/CMakeLists.txt
@@ -137,6 +137,25 @@ set(LIBCXX_MODULE_STD_COMPAT_SOURCES
   std.compat/cwctype.inc
 )
 
+# TODO MODULES the CMakeLists.txt in the build directory is only temporary.
+# This allows using as available in the build directory. Once build systems
+# have proper support for the installed files this will be removed.
+if ("${LIBCXX_GENERATED_INCLUDE_DIR}" STREQUAL "${LIBCXX_GENERATED_INCLUDE_TARGET_DIR}")
+  # This typically happens when the target is not installed.
+  set(LIBCXX_CONFIGURED_INCLUDE_DIRS "${LIBCXX_GENERATED_INCLUDE_DIR}")
+else()
+  # It's important that the arch directory be included first so that its header files
+  # which interpose on the default include dir be included instead of the default ones.
+  set(LIBCXX_CONFIGURED_INCLUDE_DIRS
+"${LIBCXX_GENERATED_INCLUDE_TARGET_DIR};${LIBCXX_GENERATED_INCLUDE_DIR}"
+  )
+endif()
+configure_file(
+  "CMakeLists.txt.in"
+  "${LIBCXX_GENERATED_MODULE_DIR}/CMakeLists.txt"
+  @ONLY
+)
+
 set(LIBCXX_MODULE_STD_INCLUDE_SOURCES)
 foreach(file ${LIBCXX_MODULE_STD_SOURCES})
   set(
@@ -166,6 +185,7 @@ configure_file(
 )
 
 set(_all_modules)
+list(APPEND _all_modules "${LIBCXX_GENERATED_MODULE_DIR}/CMakeLists.txt")
 list(APPEND _all_modules "${LIBCXX_GENERATED_MODULE_DIR}/std.cppm")
 list(APPEND _all_modules "${LIBCXX_GENERATED_MODULE_DIR}/std.compat.cppm")
 foreach(file ${LIBCXX_MODULE_STD_SOURCES} ${LIBCXX_MODULE_STD_COMPAT_SOURCES})

diff  --git a/libcxx/modules/CMakeLists.txt.in b/libcxx/modules/CMakeLists.txt.in
new file mode 100644
index 00..e332d70cc16333
--- /dev/null
+++ b/libcxx/modules/CMakeLists.txt.in
@@ -0,0 +1,88 @@
+cmake_minimum_required(VERSION 3.26)
+
+project(libc++-modules LANGUAGES CXX)
+
+# Enable CMake's module support
+if(CMAKE_VERSION VERSION_LESS "3.28.0")
+  if(CMAKE_VERSION VERSION_LESS "3.27.0")
+set(CMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API "2182bf5c-ef0d-489a-91da-49dbc3090d2a")
+  else()
+set(CMAKE_EXPERIMENTAL_CXX_MODULE_CMAKE_API "aa1f7df0-828a-4fcd-9afc-2dc80491aca7")
+  endif()
+  set(CMAKE_EXPERIMENTAL_CXX_MODULE_DYNDEP 1)
+else()
+  cmake_policy(VERSION 3.28)
+endif()
+
+# Default to C++ extensions being off. Libc++'s modules support have trouble
+# with extensions right now.
+set(CMAKE_CXX_EXTENSIONS OFF)
+
+# Propagates the CMake options to the modules.
+#
+# This uses the std module hard-coded since the std.compat module does not
+# depend on these flags.
+macro(compile_define_if_not condition def)
+  if (NOT ${condition})
+target_compile_definitions(std PRIVATE ${def})
+  endif()
+endmacro()
+macro(compile_define_if condition def)
+  if (${condition})
+target_compile_definitions(std PRIVATE ${def})
+  endif()
+endmacro()
+
+### STD
+
+add_library(std)
+target_sources(std
+  PUBLIC FILE_SET cxx_modules TYPE CXX_MODULES FILES
+std.cppm
+)
+
+target_include_directories(std SYSTEM PRIVATE @LIBCXX_CONFIGURED_INCLUDE_DIRS@)
+
+if (NOT @LIBCXX_ENABLE_EXCEPTIONS@)
+  target_compile_options(std PUBLIC -fno-exceptions)
+endif()
+
+target_compile_options(std
+  PUBLIC
+-nostdinc++
+-Wno-reserved-module-identifier
+   

[llvm-branch-commits] [libcxx] release/18.x: [libc++][modules] Re-add build dir CMakeLists.txt. (#81370) (PR #81651)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81651


[llvm-branch-commits] [lld] release/18.x: [lld] Fix test failures when running as root user (#81339) (PR #81988)

2024-02-16 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay approved this pull request.


https://github.com/llvm/llvm-project/pull/81988


[llvm-branch-commits] [lld] release/18.x: [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739) (PR #81990)

2024-02-16 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay approved this pull request.


https://github.com/llvm/llvm-project/pull/81990


[llvm-branch-commits] [ELF] Support placing .lbss/.lrodata/.ldata after .bss (PR #81224)

2024-02-16 Thread Fangrui Song via llvm-branch-commits


@@ -1436,6 +1436,8 @@ static void readConfigs(opt::InputArgList &args) {
   config->zInterpose = hasZOption(args, "interpose");
   config->zKeepTextSectionPrefix = getZFlag(
   args, "keep-text-section-prefix", "nokeep-text-section-prefix", false);
+  config->zLrodataAfterBss =
+  getZFlag(args, "lrodata-after-bss", "nolrodata-after-bss", false);

MaskRay wrote:

I think using "everybody" is not accurate. I've mentioned that `-fpie -no-pie` 
does not need this. And this `-fno-pie -no-pie` use has been a tiny portion of 
the user community. We are, after all, dealing with a problem that very few 
users run into in the first place. I think it matters to optimize for the 
prevailing configurations (-fpie/-fpic with -pie).

We also need evidence that "I have heard that x86-64 -fno-pic is measurably 
slower than -fpie in large workloads"

https://github.com/llvm/llvm-project/pull/81224


[llvm-branch-commits] [ELF] Support placing .lbss/.lrodata/.ldata after .bss (PR #81224)

2024-02-16 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay edited 
https://github.com/llvm/llvm-project/pull/81224


[llvm-branch-commits] [lld] release/18.x: [lld] Fix test failures when running as root user (#81339) (PR #81988)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81988

>From d71aae5f79863ce897e38f6aab46710f0257f72e Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Fri, 9 Feb 2024 20:57:05 -0800
Subject: [PATCH] [lld] Fix test failures when running as root user (#81339)

This makes it easier to run the tests in a containerized environment.

(cherry picked from commit e165bea1d4ec2de96ee0548cece79d71a75ce8f8)
---
 lld/test/COFF/lto-cache-errors.ll | 2 +-
 lld/test/COFF/thinlto-emit-imports.ll | 2 +-
 lld/test/ELF/lto/resolution-err.ll| 2 +-
 lld/test/ELF/lto/thinlto-cant-write-index.ll  | 2 +-
 lld/test/ELF/lto/thinlto-emit-imports.ll  | 2 +-
 lld/test/MachO/invalid/invalid-lto-object-path.ll | 2 +-
 lld/test/MachO/thinlto-emit-imports.ll| 2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/lld/test/COFF/lto-cache-errors.ll b/lld/test/COFF/lto-cache-errors.ll
index 55244e5690dc34..a46190a81b6230 100644
--- a/lld/test/COFF/lto-cache-errors.ll
+++ b/lld/test/COFF/lto-cache-errors.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ;; Not supported on windows since we use permissions to deny the creation
 ; UNSUPPORTED: system-windows
 
diff --git a/lld/test/COFF/thinlto-emit-imports.ll b/lld/test/COFF/thinlto-emit-imports.ll
index a9f22c1dc2dcff..b47a6cea4eb7df 100644
--- a/lld/test/COFF/thinlto-emit-imports.ll
+++ b/lld/test/COFF/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ; Generate summary sections and test lld handling.
 ; RUN: opt -module-summary %s -o %t1.obj
diff --git a/lld/test/ELF/lto/resolution-err.ll b/lld/test/ELF/lto/resolution-err.ll
index 6dfa64b1b8b9ee..f9855abaff3279 100644
--- a/lld/test/ELF/lto/resolution-err.ll
+++ b/lld/test/ELF/lto/resolution-err.ll
@@ -1,5 +1,5 @@
 ; UNSUPPORTED: system-windows
-; REQUIRES: shell
+; REQUIRES: shell, non-root-user
 ; RUN: llvm-as %s -o %t.bc
 ; RUN: touch %t.resolution.txt
 ; RUN: chmod u-w %t.resolution.txt
diff --git a/lld/test/ELF/lto/thinlto-cant-write-index.ll b/lld/test/ELF/lto/thinlto-cant-write-index.ll
index e664acbb17de1a..286fcddd4238a1 100644
--- a/lld/test/ELF/lto/thinlto-cant-write-index.ll
+++ b/lld/test/ELF/lto/thinlto-cant-write-index.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ; Basic ThinLTO tests.
 ; RUN: opt -module-summary %s -o %t1.o
diff --git a/lld/test/ELF/lto/thinlto-emit-imports.ll b/lld/test/ELF/lto/thinlto-emit-imports.ll
index 6d0e1e65047db4..253ec08619c982 100644
--- a/lld/test/ELF/lto/thinlto-emit-imports.ll
+++ b/lld/test/ELF/lto/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ;; Test a few properties not tested by thinlto-index-only.ll
 
 ; RUN: opt -module-summary %s -o %t1.o
diff --git a/lld/test/MachO/invalid/invalid-lto-object-path.ll b/lld/test/MachO/invalid/invalid-lto-object-path.ll
index 75c6a97e446fb2..c862538d592ce8 100644
--- a/lld/test/MachO/invalid/invalid-lto-object-path.ll
+++ b/lld/test/MachO/invalid/invalid-lto-object-path.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ;; Creating read-only directories with `chmod 400` isn't supported on Windows
 ; UNSUPPORTED: system-windows
diff --git a/lld/test/MachO/thinlto-emit-imports.ll b/lld/test/MachO/thinlto-emit-imports.ll
index 47a612bd0a7b56..88f766f59c8877 100644
--- a/lld/test/MachO/thinlto-emit-imports.ll
+++ b/lld/test/MachO/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ; RUN: rm -rf %t; split-file %s %t
 
 ; Generate summary sections and test lld handling.

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] d71aae5 - [lld] Fix test failures when running as root user (#81339)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Tom Stellard
Date: 2024-02-16T12:21:13-08:00
New Revision: d71aae5f79863ce897e38f6aab46710f0257f72e

URL: 
https://github.com/llvm/llvm-project/commit/d71aae5f79863ce897e38f6aab46710f0257f72e
DIFF: 
https://github.com/llvm/llvm-project/commit/d71aae5f79863ce897e38f6aab46710f0257f72e.diff

LOG: [lld] Fix test failures when running as root user (#81339)

This makes it easier to run the tests in a containerized environment.

(cherry picked from commit e165bea1d4ec2de96ee0548cece79d71a75ce8f8)

Added: 


Modified: 
lld/test/COFF/lto-cache-errors.ll
lld/test/COFF/thinlto-emit-imports.ll
lld/test/ELF/lto/resolution-err.ll
lld/test/ELF/lto/thinlto-cant-write-index.ll
lld/test/ELF/lto/thinlto-emit-imports.ll
lld/test/MachO/invalid/invalid-lto-object-path.ll
lld/test/MachO/thinlto-emit-imports.ll

Removed: 




diff  --git a/lld/test/COFF/lto-cache-errors.ll b/lld/test/COFF/lto-cache-errors.ll
index 55244e5690dc34..a46190a81b6230 100644
--- a/lld/test/COFF/lto-cache-errors.ll
+++ b/lld/test/COFF/lto-cache-errors.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ;; Not supported on windows since we use permissions to deny the creation
 ; UNSUPPORTED: system-windows
 

diff  --git a/lld/test/COFF/thinlto-emit-imports.ll b/lld/test/COFF/thinlto-emit-imports.ll
index a9f22c1dc2dcff..b47a6cea4eb7df 100644
--- a/lld/test/COFF/thinlto-emit-imports.ll
+++ b/lld/test/COFF/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ; Generate summary sections and test lld handling.
 ; RUN: opt -module-summary %s -o %t1.obj

diff  --git a/lld/test/ELF/lto/resolution-err.ll b/lld/test/ELF/lto/resolution-err.ll
index 6dfa64b1b8b9ee..f9855abaff3279 100644
--- a/lld/test/ELF/lto/resolution-err.ll
+++ b/lld/test/ELF/lto/resolution-err.ll
@@ -1,5 +1,5 @@
 ; UNSUPPORTED: system-windows
-; REQUIRES: shell
+; REQUIRES: shell, non-root-user
 ; RUN: llvm-as %s -o %t.bc
 ; RUN: touch %t.resolution.txt
 ; RUN: chmod u-w %t.resolution.txt

diff  --git a/lld/test/ELF/lto/thinlto-cant-write-index.ll b/lld/test/ELF/lto/thinlto-cant-write-index.ll
index e664acbb17de1a..286fcddd4238a1 100644
--- a/lld/test/ELF/lto/thinlto-cant-write-index.ll
+++ b/lld/test/ELF/lto/thinlto-cant-write-index.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ; Basic ThinLTO tests.
 ; RUN: opt -module-summary %s -o %t1.o

diff  --git a/lld/test/ELF/lto/thinlto-emit-imports.ll b/lld/test/ELF/lto/thinlto-emit-imports.ll
index 6d0e1e65047db4..253ec08619c982 100644
--- a/lld/test/ELF/lto/thinlto-emit-imports.ll
+++ b/lld/test/ELF/lto/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ;; Test a few properties not tested by thinlto-index-only.ll
 
 ; RUN: opt -module-summary %s -o %t1.o

diff  --git a/lld/test/MachO/invalid/invalid-lto-object-path.ll b/lld/test/MachO/invalid/invalid-lto-object-path.ll
index 75c6a97e446fb2..c862538d592ce8 100644
--- a/lld/test/MachO/invalid/invalid-lto-object-path.ll
+++ b/lld/test/MachO/invalid/invalid-lto-object-path.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 
 ;; Creating read-only directories with `chmod 400` isn't supported on Windows
 ; UNSUPPORTED: system-windows

diff  --git a/lld/test/MachO/thinlto-emit-imports.ll b/lld/test/MachO/thinlto-emit-imports.ll
index 47a612bd0a7b56..88f766f59c8877 100644
--- a/lld/test/MachO/thinlto-emit-imports.ll
+++ b/lld/test/MachO/thinlto-emit-imports.ll
@@ -1,4 +1,4 @@
-; REQUIRES: x86
+; REQUIRES: x86, non-root-user
 ; RUN: rm -rf %t; split-file %s %t
 
 ; Generate summary sections and test lld handling.





[llvm-branch-commits] [lld] release/18.x: [lld] Fix test failures when running as root user (#81339) (PR #81988)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81988


[llvm-branch-commits] [llvm] Use container on Linux to run llvm-project-tests workflow (#81349) (PR #81807)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/81807

>From f40655f940d7d070da09cd4ae6db0fad74fd716e Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Wed, 14 Feb 2024 16:05:52 -0800
Subject: [PATCH] Use container on Linux to run llvm-project-tests workflow
 (#81349)

(cherry picked from commit fe20a759fcd20e1755ea1b34c5e6447a787925dc)
---
 .github/workflows/llvm-project-tests.yml | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/llvm-project-tests.yml b/.github/workflows/llvm-project-tests.yml
index 68b4a68d1af984..43b90193406fc9 100644
--- a/.github/workflows/llvm-project-tests.yml
+++ b/.github/workflows/llvm-project-tests.yml
@@ -58,6 +58,10 @@ jobs:
   lit-tests:
 name: Lit Tests
 runs-on: ${{ matrix.os }}
+container:
+  image: ${{(startsWith(matrix.os, 'ubuntu') && 'ghcr.io/llvm/ci-ubuntu-22.04:latest') || null}}
+  volumes:
+- /mnt/:/mnt/
 strategy:
   fail-fast: false
   matrix:
@@ -77,6 +81,7 @@ jobs:
 with:
   python-version: ${{ inputs.python_version }}
   - name: Install Ninja
+if: runner.os != 'Linux'
 uses: llvm/actions/install-ninja@main
   # actions/checkout deletes any existing files in the new git directory,
   # so this needs to either run before ccache-action or it has to use
@@ -108,8 +113,8 @@ jobs:
 run: |
   if [ "${{ runner.os }}" == "Linux" ]; then
 builddir="/mnt/build/"
-sudo mkdir -p $builddir
-sudo chown `whoami`:`whoami` $builddir
+mkdir -p $builddir
+extra_cmake_args="-DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang"
   else
 builddir="$(pwd)"/build
   fi
@@ -123,6 +128,7 @@ jobs:
 -DLLDB_INCLUDE_TESTS=OFF \
 -DCMAKE_C_COMPILER_LAUNCHER=sccache \
 -DCMAKE_CXX_COMPILER_LAUNCHER=sccache \
+$extra_cmake_args \
 ${{ inputs.extra_cmake_args }}
   ninja -C "$builddir" '${{ inputs.build_target }}'
 



[llvm-branch-commits] [lld] release/18.x: [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739) (PR #81990)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/81990

>From 347977c8b16fc4db809d7e049ceca874a5e4940b Mon Sep 17 00:00:00 2001
From: Ulrich Weigand 
Date: Wed, 14 Feb 2024 18:26:38 +0100
Subject: [PATCH] [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie
 (#81739)

With the new SystemZ port we noticed that -pie executables generated
from files containing R_390_TLS_IEENT relocations will have unnecessary
relocations in their GOT:

9e8d8: R_390_TLS_TPOFF  *ABS*+0x18

This is caused by the config->isPic condition in addTpOffsetGotEntry:

 static void addTpOffsetGotEntry(Symbol &sym) {
   in.got->addEntry(sym);
   uint64_t off = sym.getGotOffset();
   if (!sym.isPreemptible && !config->isPic) {
     in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
     return;
   }

It is correct that we need to retain a TPOFF relocation if the target
symbol is preemptible or if we're building a shared library. But when
building a -pie executable, those values are fixed at link time and
there's no need for any remaining dynamic relocation.

Note that the equivalent MIPS-specific code in MipsGotSection::build
checks for config->shared instead of config->isPic; we should use the
same check here. (Note also that on many other platforms we're not even
using addTpOffsetGotEntry in this case as an IE->LE relaxation is
applied before; we don't have this type of relaxation on SystemZ.)
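
To make the difference between the two checks concrete, here is a minimal standalone sketch. `LinkConfig`, `TlsSymbol`, and both helper functions are hypothetical stand-ins rather than lld's real types, and the sketch assumes that isPic is true for both -shared and -pie links, as the description above implies.

```cpp
// Hypothetical model of the link configuration; NOT lld's real Config type.
struct LinkConfig {
  bool shared; // linking with -shared
  bool pie;    // linking with -pie
  // Assumption: position-independent output means either -shared or -pie.
  bool isPic() const { return shared || pie; }
};

// Hypothetical model of a TLS symbol; only the property used here.
struct TlsSymbol {
  bool isPreemptible;
};

// Old check: a GOT entry is a link-time constant only for non-PIC output,
// so -pie links keep a dynamic TPOFF relocation they do not need.
inline bool gotEntryIsConstantOld(const TlsSymbol &sym, const LinkConfig &cfg) {
  return !sym.isPreemptible && !cfg.isPic();
}

// New check: only shared libraries (or preemptible symbols) need the
// dynamic relocation; -pie executables get a constant GOT entry.
inline bool gotEntryIsConstantNew(const TlsSymbol &sym, const LinkConfig &cfg) {
  return !sym.isPreemptible && !cfg.shared;
}
```

In this model the two predicates differ only for a non-preemptible symbol in a -pie link, which is exactly the case the patch targets.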

(cherry picked from commit 6f907733e65d24edad65f763fb14402464bd578b)
---
 lld/ELF/Relocations.cpp   |  2 +-
 lld/test/ELF/systemz-tls-ie.s | 34 ++
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index f64b4219e0acc1..619fbaf5dc5452 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -940,7 +940,7 @@ void elf::addGotEntry(Symbol &sym) {
 static void addTpOffsetGotEntry(Symbol &sym) {
   in.got->addEntry(sym);
   uint64_t off = sym.getGotOffset();
-  if (!sym.isPreemptible && !config->isPic) {
+  if (!sym.isPreemptible && !config->shared) {
     in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
     return;
   }
diff --git a/lld/test/ELF/systemz-tls-ie.s b/lld/test/ELF/systemz-tls-ie.s
index 27b642ed2dfc5f..85e2f24cb61f62 100644
--- a/lld/test/ELF/systemz-tls-ie.s
+++ b/lld/test/ELF/systemz-tls-ie.s
@@ -12,6 +12,14 @@
 # RUN: llvm-objdump --section .data --full-contents %t | FileCheck --check-prefix=LE-DATA %s
 # RUN: llvm-objdump --section .got --full-contents %t | FileCheck --check-prefix=LE-GOT %s
 
+## With -pie we still have the R_390_RELATIVE for the data element, but all GOT
+## entries should be fully resolved without any remaining R_390_TLS_TPOFF.
+# RUN: ld.lld -pie %t.o -o %t.pie
+# RUN: llvm-readelf -r %t.pie | FileCheck --check-prefix=PIE-REL %s
+# RUN: llvm-objdump -d --no-show-raw-insn %t.pie | FileCheck --check-prefix=PIE %s
+# RUN: llvm-objdump --section .data --full-contents %t.pie | FileCheck --check-prefix=PIE-DATA %s
+# RUN: llvm-objdump --section .got --full-contents %t.pie | FileCheck --check-prefix=PIE-GOT %s
+
 # IE-REL: Relocation section '.rela.dyn' at offset {{.*}} contains 4 entries:
 # IE-REL: 3478 000c R_390_RELATIVE 2460
 # IE-REL: 2460 00010038 R_390_TLS_TPOFF 0008 a + 0
@@ -58,6 +66,32 @@
 # LE-GOT: 1002248    fff8
 # LE-GOT: 1002258  fffc  
 
+# PIE-REL: Relocation section '.rela.dyn' at offset {{.*}} contains 1 entries:
+# PIE-REL: 33d0 000c R_390_RELATIVE 23b8
+
+## TP offset for a is at 0x23b8
+# PIE:  lgrl%r1, 0x23b8
+# PIE-NEXT: lgf %r1, 0(%r1,%r7)
+
+## TP offset for b is at 0x23c0
+# PIE-NEXT: lgrl%r1, 0x23c0
+# PIE-NEXT: lgf %r1, 0(%r1,%r7)
+
+## TP offset for c is at 0x23c8
+# PIE-NEXT: lgrl%r1, 0x23c8
+# PIE-NEXT: lgf %r1, 0(%r1,%r7)
+
+## Data element: TP offset for a is at 0x23b8 (relocated via R_390_RELATIVE above)
+# PIE-DATA: 33d0  
+
+## TP offsets in GOT:
+# a: -8
+# b: -4
+# c: 0
+# PIE-GOT: 23a0  22d0  
+# PIE-GOT: 23b0    fff8
+# PIE-GOT: 23c0  fffc  
+
 ear %r7,%a0
 sllg%r7,%r1,32
 ear %r7,%a1



[llvm-branch-commits] [lld] 347977c - [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

Author: Ulrich Weigand
Date: 2024-02-16T12:24:37-08:00
New Revision: 347977c8b16fc4db809d7e049ceca874a5e4940b

URL: 
https://github.com/llvm/llvm-project/commit/347977c8b16fc4db809d7e049ceca874a5e4940b
DIFF: 
https://github.com/llvm/llvm-project/commit/347977c8b16fc4db809d7e049ceca874a5e4940b.diff

LOG: [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739)

With the new SystemZ port we noticed that -pie executables generated
from files containing R_390_TLS_IEENT relocations will have unnecessary
relocations in their GOT:

9e8d8: R_390_TLS_TPOFF  *ABS*+0x18

This is caused by the config->isPic condition in addTpOffsetGotEntry:

 static void addTpOffsetGotEntry(Symbol &sym) {
   in.got->addEntry(sym);
   uint64_t off = sym.getGotOffset();
   if (!sym.isPreemptible && !config->isPic) {
     in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
     return;
   }

It is correct that we need to retain a TPOFF relocation if the target
symbol is preemptible or if we're building a shared library. But when
building a -pie executable, those values are fixed at link time and
there's no need for any remaining dynamic relocation.

Note that the equivalent MIPS-specific code in MipsGotSection::build
checks for config->shared instead of config->isPic; we should use the
same check here. (Note also that on many other platforms we're not even
using addTpOffsetGotEntry in this case as an IE->LE relaxation is
applied before; we don't have this type of relaxation on SystemZ.)

(cherry picked from commit 6f907733e65d24edad65f763fb14402464bd578b)

Added: 


Modified: 
lld/ELF/Relocations.cpp
lld/test/ELF/systemz-tls-ie.s

Removed: 




diff  --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index f64b4219e0acc1..619fbaf5dc5452 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -940,7 +940,7 @@ void elf::addGotEntry(Symbol &sym) {
 static void addTpOffsetGotEntry(Symbol &sym) {
   in.got->addEntry(sym);
   uint64_t off = sym.getGotOffset();
-  if (!sym.isPreemptible && !config->isPic) {
+  if (!sym.isPreemptible && !config->shared) {
     in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
     return;
   }

diff  --git a/lld/test/ELF/systemz-tls-ie.s b/lld/test/ELF/systemz-tls-ie.s
index 27b642ed2dfc5f..85e2f24cb61f62 100644
--- a/lld/test/ELF/systemz-tls-ie.s
+++ b/lld/test/ELF/systemz-tls-ie.s
@@ -12,6 +12,14 @@
 # RUN: llvm-objdump --section .data --full-contents %t | FileCheck --check-prefix=LE-DATA %s
 # RUN: llvm-objdump --section .got --full-contents %t | FileCheck --check-prefix=LE-GOT %s
 
+## With -pie we still have the R_390_RELATIVE for the data element, but all GOT
+## entries should be fully resolved without any remaining R_390_TLS_TPOFF.
+# RUN: ld.lld -pie %t.o -o %t.pie
+# RUN: llvm-readelf -r %t.pie | FileCheck --check-prefix=PIE-REL %s
+# RUN: llvm-objdump -d --no-show-raw-insn %t.pie | FileCheck --check-prefix=PIE %s
+# RUN: llvm-objdump --section .data --full-contents %t.pie | FileCheck --check-prefix=PIE-DATA %s
+# RUN: llvm-objdump --section .got --full-contents %t.pie | FileCheck --check-prefix=PIE-GOT %s
+
 # IE-REL: Relocation section '.rela.dyn' at offset {{.*}} contains 4 entries:
 # IE-REL: 3478 000c R_390_RELATIVE 2460
 # IE-REL: 2460 00010038 R_390_TLS_TPOFF 0008 a + 0
@@ -58,6 +66,32 @@
 # LE-GOT: 1002248    fff8
 # LE-GOT: 1002258  fffc  
 
+# PIE-REL: Relocation section '.rela.dyn' at offset {{.*}} contains 1 entries:
+# PIE-REL: 33d0 000c R_390_RELATIVE 23b8
+
+## TP offset for a is at 0x23b8
+# PIE:  lgrl%r1, 0x23b8
+# PIE-NEXT: lgf %r1, 0(%r1,%r7)
+
+## TP offset for b is at 0x23c0
+# PIE-NEXT: lgrl%r1, 0x23c0
+# PIE-NEXT: lgf %r1, 0(%r1,%r7)
+
+## TP offset for c is at 0x23c8
+# PIE-NEXT: lgrl%r1, 0x23c8
+# PIE-NEXT: lgf %r1, 0(%r1,%r7)
+
+## Data element: TP offset for a is at 0x23b8 (relocated via R_390_RELATIVE above)
+# PIE-DATA: 33d0  
+
+## TP offsets in GOT:
+# a: -8
+# b: -4
+# c: 0
+# PIE-GOT: 23a0  22d0  
+# PIE-GOT: 23b0    fff8
+# PIE-GOT: 23c0  fffc  
+
 ear %r7,%a0
 sllg%r7,%r1,32
 ear %r7,%a1





[llvm-branch-commits] [lld] release/18.x: [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739) (PR #81990)

2024-02-16 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/81990


[llvm-branch-commits] [OpenMP][MLIR] Extend explicit derived type member mapping support for OpenMP dialects lowering to LLVM-IR (PR #81510)

2024-02-16 Thread via llvm-branch-commits


@@ -1783,6 +1783,98 @@ void collectMapDataFromMapOperands(MapInfoData &mapData,
   }
 }
 
+static int getMapDataMemberIdx(MapInfoData &mapData,
+                               mlir::omp::MapInfoOp memberOp) {
+  int memberDataIdx = -1;
+  for (size_t i = 0; i < mapData.MapClause.size(); ++i) {
+    if (mapData.MapClause[i] == memberOp)
+      memberDataIdx = i;
+  }
+  return memberDataIdx;
+}
+
+static mlir::omp::MapInfoOp
+getFirstOrLastMappedMemberPtr(mlir::omp::MapInfoOp mapInfo, bool first) {
+  // Only 1 member has been mapped, we can return it.
+  if (mapInfo.getMembersIndex()->size() == 1)
+    if (auto mapOp = mlir::dyn_cast<mlir::omp::MapInfoOp>(
+            mapInfo.getMembers()[0].getDefiningOp()))
+      return mapOp;
+
+  int64_t curPos =
+      mapInfo.getMembersIndex()->begin()->cast<mlir::IntegerAttr>().getInt();
+
+  int64_t idx = 1, curIdx = 0, memberPlacement = 0;
+  for (const auto *iter = std::next(mapInfo.getMembersIndex()->begin());
+       iter != mapInfo.getMembersIndex()->end(); iter++) {
+    memberPlacement = iter->cast<mlir::IntegerAttr>().getInt();
+    if (first) {
+      if (memberPlacement < curPos) {
+        curIdx = idx;
+        curPos = memberPlacement;
+      }
+    } else {
+      if (memberPlacement > curPos) {
+        curIdx = idx;
+        curPos = memberPlacement;
+      }
+    }
+    idx++;
+  }

agozillon wrote:

I imagine the former, sorting when the function is called, should be fine, 
though I'll have to double-check.

The latter I don't think is possible. At the very least, I'd like to maintain 
the user ordering of the maps (and, by extension, of the member indices) for 
the moment, as I think it matters specification-wise, at least until we can 
rule that out. There is only one case where the specification calls for 
re-ordering (which we still need to add support for), and that is ordering 
from/tofrom/to map clauses in front of alloc/delete/release.
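
As a rough illustration of finding the first or last mapped member without disturbing the user's ordering, here is a minimal standalone sketch. It works on a plain vector of placement values instead of MLIR attributes, and `firstOrLastMemberIdx` is a hypothetical helper name, not part of the PR.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Returns the position, within the user-ordered list `placements`, of the
// member with the smallest (first == true) or largest (first == false)
// placement value. The list itself is never reordered.
// Precondition (assumed): `placements` is non-empty.
inline std::size_t
firstOrLastMemberIdx(const std::vector<std::int64_t> &placements, bool first) {
  auto it = first ? std::min_element(placements.begin(), placements.end())
                  : std::max_element(placements.begin(), placements.end());
  return static_cast<std::size_t>(std::distance(placements.begin(), it));
}
```

Both standard algorithms return the earliest element among ties, which matches the strict `<` and `>` comparisons in the loop under review.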

https://github.com/llvm/llvm-project/pull/81510


[llvm-branch-commits] [libcxx] release/18.x: [libc++] Only include <setjmp.h> from the C library if it exists (#81887) (PR #82045)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/82045

Backport d8278b6

Requested by: @ldionne

>From 74fe0854bc9cb159e5a182f511966dd5ddb33915 Mon Sep 17 00:00:00 2001
From: Louis Dionne 
Date: Fri, 16 Feb 2024 16:45:00 -0500
Subject: [PATCH] [libc++] Only include <setjmp.h> from the C library if it
 exists (#81887)

In 2cea1babefbb, we removed the <setjmp.h> header provided by libc++.
However, we did not conditionally include the underlying <setjmp.h>
header only if the C library provides one, which we otherwise do
consistently (see e.g. 647ddc08f43c).
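
For readers unfamiliar with the pattern, here is a minimal standalone sketch of a conditional include. The `EXAMPLE_HAS_SETJMP` macro and `exampleHasSetjmp` function are hypothetical names used only for illustration; they do not appear in libc++.

```cpp
// Include <setjmp.h> only when the C library actually ships it, and record
// the outcome in a hypothetical macro so later code can branch on it.
#if defined(__has_include)
#  if __has_include(<setjmp.h>)
#    include <setjmp.h>
#    define EXAMPLE_HAS_SETJMP 1
#  endif
#endif

#ifndef EXAMPLE_HAS_SETJMP
#  define EXAMPLE_HAS_SETJMP 0
#endif

// Reports whether the header was found at compile time.
constexpr bool exampleHasSetjmp() { return EXAMPLE_HAS_SETJMP != 0; }
```

The nested `#if` keeps the `__has_include(...)` expression away from preprocessors that do not know the operator, which is a common defensive choice in portable code.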

rdar://122978778
(cherry picked from commit d8278b682386f51dfba204849c624672a3df40c7)
---
 libcxx/include/csetjmp | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/libcxx/include/csetjmp b/libcxx/include/csetjmp
index d219c8e6cb2250..9012cad22ebe74 100644
--- a/libcxx/include/csetjmp
+++ b/libcxx/include/csetjmp
@@ -33,7 +33,13 @@ void longjmp(jmp_buf env, int val);
 #include <__assert> // all public C++ headers provide the assertion handler
 #include <__config>
 
-#include <setjmp.h>
+// <setjmp.h> is not provided by libc++
+#if __has_include(<setjmp.h>)
+#  include <setjmp.h>
+#  ifdef _LIBCPP_SETJMP_H
+#    error "If libc++ starts defining <setjmp.h>, the __has_include check should move to libc++'s <setjmp.h>"
+#  endif
+#endif
 
 #if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
 #  pragma GCC system_header



[llvm-branch-commits] [libcxx] release/18.x: [libc++] Only include <setjmp.h> from the C library if it exists (#81887) (PR #82045)

2024-02-16 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/82045


[llvm-branch-commits] [libcxx] release/18.x: [libc++] Only include <setjmp.h> from the C library if it exists (#81887) (PR #82045)

2024-02-16 Thread via llvm-branch-commits

llvmbot wrote:

@mordante What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/82045