[llvm-branch-commits] [lld] [llvm] release/18.x: [LoongArch] Use R_LARCH_ALIGN with section symbol (#84741) (PR #88891)

2024-05-17 Thread Lu Weining via llvm-branch-commits

SixWeining wrote:

This will be reverted on the main branch 
(https://github.com/llvm/llvm-project/pull/92584), so I am closing it.

https://github.com/llvm/llvm-project/pull/88891
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] [llvm] release/18.x: [LoongArch] Use R_LARCH_ALIGN with section symbol (#84741) (PR #88891)

2024-05-17 Thread Lu Weining via llvm-branch-commits

https://github.com/SixWeining closed 
https://github.com/llvm/llvm-project/pull/88891


[llvm-branch-commits] [clang] release/18.x: [clang] Don't assume location of compiler-rt for OpenBSD (#92183) (PR #92293)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/92293


[llvm-branch-commits] [clang] 48c1364 - [clang] Don't assume location of compiler-rt for OpenBSD (#92183)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

Author: John Ericson
Date: 2024-05-17T16:26:37-07:00
New Revision: 48c1364200b5649dda2f9ccbe382b0bd908b99de

URL: 
https://github.com/llvm/llvm-project/commit/48c1364200b5649dda2f9ccbe382b0bd908b99de
DIFF: 
https://github.com/llvm/llvm-project/commit/48c1364200b5649dda2f9ccbe382b0bd908b99de.diff

LOG: [clang] Don't assume location of compiler-rt for OpenBSD (#92183)

If the `/usr/lib/...` path where compiler-rt is conventionally installed
on OpenBSD does not exist, fall back to the regular logic to find it.

This is a minimal change to allow OpenBSD cross compilation from a
toolchain that doesn't adopt all of OpenBSD's monorepo's conventions.

(cherry picked from commit be10746f3a4381456eb5082a968766201c17ab5d)
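
The fallback described in the log message can be sketched outside of clang as follows. This is a minimal illustration only: `findBuiltinsLibrary`, the `resourceDir` parameter, and the fallback file layout are hypothetical names chosen for the example, and `std::filesystem` stands in for the driver's virtual file system.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical stand-in for the driver logic: prefer the OpenBSD-conventional
// location under the sysroot, but only when it actually exists. Otherwise fall
// through to a generic lookup (the layout used here is an assumption).
std::string findBuiltinsLibrary(const fs::path &sysroot,
                                const fs::path &resourceDir) {
  const fs::path conventional = sysroot / "usr/lib/libcompiler_rt.a";
  std::error_code ec; // non-throwing overload: a failed check counts as "missing"
  if (fs::exists(conventional, ec))
    return conventional.string();
  // Fallback path, illustrative only.
  return (resourceDir / "lib" / "libclang_rt.builtins.a").string();
}
```

The point of the design is that a cross-compiling toolchain without the OpenBSD layout no longer gets a path to a file that is not there.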

Added: 


Modified: 
clang/lib/Driver/ToolChains/OpenBSD.cpp

Removed: 




diff  --git a/clang/lib/Driver/ToolChains/OpenBSD.cpp 
b/clang/lib/Driver/ToolChains/OpenBSD.cpp
index fd6aa4d7e6844..00b6c520fcdd7 100644
--- a/clang/lib/Driver/ToolChains/OpenBSD.cpp
+++ b/clang/lib/Driver/ToolChains/OpenBSD.cpp
@@ -371,7 +371,8 @@ std::string OpenBSD::getCompilerRT(const ArgList , 
StringRef Component,
   if (Component == "builtins") {
 SmallString<128> Path(getDriver().SysRoot);
 llvm::sys::path::append(Path, "/usr/lib/libcompiler_rt.a");
-return std::string(Path);
+if (getVFS().exists(Path))
+  return std::string(Path);
   }
   SmallString<128> P(getDriver().ResourceDir);
   std::string CRTBasename =





[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)

2024-05-17 Thread Daniel Paoliello via llvm-branch-commits

dpaoliello wrote:

> @dpaoliello (or anyone else). If you would like to add a note about this fix 
> in the release notes (completely optional). Please reply to this comment with 
> a one or two sentence description of the fix. When you are done, please add 
> the release:note label to this PR.

Fixes issues where LLVM either generated an incorrect thunk for a function 
with aligned parameters or did not correctly pass through the return value when 
`StructRet` was used.

https://github.com/llvm/llvm-project/pull/92580


[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)

2024-05-17 Thread Brad Smith via llvm-branch-commits

brad0 wrote:

> Thanks, both of you. I can't merge these, so I am guessing someone else will 
> come along that can?

The release manager does. For this release cycle that is tstellar.

https://github.com/llvm/llvm-project/pull/92601


[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)

2024-05-17 Thread John Ericson via llvm-branch-commits

Ericson2314 wrote:

Thanks, both of you. I can't merge these, so I am guessing someone else will 
come along that can?

https://github.com/llvm/llvm-project/pull/92601


[llvm-branch-commits] [clang] release/18.x: [clang] Don't assume location of compiler-rt for OpenBSD (#92183) (PR #92293)

2024-05-17 Thread Brad Smith via llvm-branch-commits

https://github.com/brad0 approved this pull request.


https://github.com/llvm/llvm-project/pull/92293


[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@brad0 Can you look at #92293 too?

https://github.com/llvm/llvm-project/pull/92601


[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)

2024-05-17 Thread Brad Smith via llvm-branch-commits

https://github.com/brad0 approved this pull request.


https://github.com/llvm/llvm-project/pull/92601


[llvm-branch-commits] [clang] release/18.x: [clang] Don't assume location of compiler-rt for OpenBSD (#92183) (PR #92293)

2024-05-17 Thread John Ericson via llvm-branch-commits

Ericson2314 wrote:

(FWIW https://github.com/llvm/llvm-project/pull/92601 is somewhat a companion 
backport.)

https://github.com/llvm/llvm-project/pull/92293


[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-libcxxabi

@llvm/pr-subscribers-libcxx

Author: None (llvmbot)


Changes

Backport af7467c

Requested by: @Ericson2314

---
Full diff: https://github.com/llvm/llvm-project/pull/92601.diff


3 Files Affected:

- (modified) libcxx/src/atomic.cpp (+14-2) 
- (modified) libcxx/src/chrono.cpp (+3-1) 
- (modified) libcxxabi/src/cxa_guard_impl.h (+15-1) 


``diff
diff --git a/libcxx/src/atomic.cpp b/libcxx/src/atomic.cpp
index 2f0389ae6974a..6b1f03c21bbcc 100644
--- a/libcxx/src/atomic.cpp
+++ b/libcxx/src/atomic.cpp
@@ -25,16 +25,28 @@
 #  if !defined(SYS_futex) && defined(SYS_futex_time64)
 #define SYS_futex SYS_futex_time64
 #  endif
+#  define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
 
 #elif defined(__FreeBSD__)
 
 #  include 
 #  include 
 
+#  define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
+
+#elif defined(__OpenBSD__)
+
+#  include 
+
+// OpenBSD has no indirect syscalls
+#  define _LIBCPP_FUTEX(...) futex(__VA_ARGS__)
+
 #else // <- Add other operating systems here
 
 // Baseline needs no new headers
 
+#  define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
+
 #endif
 
 _LIBCPP_BEGIN_NAMESPACE_STD
@@ -44,11 +56,11 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 static void
 __libcpp_platform_wait_on_address(__cxx_atomic_contention_t const volatile* 
__ptr, __cxx_contention_t __val) {
   static constexpr timespec __timeout = {2, 0};
-  syscall(SYS_futex, __ptr, FUTEX_WAIT_PRIVATE, __val, &__timeout, 0, 0);
+  _LIBCPP_FUTEX(__ptr, FUTEX_WAIT_PRIVATE, __val, &__timeout, 0, 0);
 }
 
 static void __libcpp_platform_wake_by_address(__cxx_atomic_contention_t const 
volatile* __ptr, bool __notify_one) {
-  syscall(SYS_futex, __ptr, FUTEX_WAKE_PRIVATE, __notify_one ? 1 : INT_MAX, 0, 
0, 0);
+  _LIBCPP_FUTEX(__ptr, FUTEX_WAKE_PRIVATE, __notify_one ? 1 : INT_MAX, 0, 0, 
0);
 }
 
 #elif defined(__APPLE__) && defined(_LIBCPP_USE_ULOCK)
diff --git a/libcxx/src/chrono.cpp b/libcxx/src/chrono.cpp
index c5e827c0cb59f..e7d6dfbc22924 100644
--- a/libcxx/src/chrono.cpp
+++ b/libcxx/src/chrono.cpp
@@ -31,7 +31,9 @@
 #  include  // for gettimeofday and timeval
 #endif
 
-#if defined(__APPLE__) || defined(__gnu_hurd__) || (defined(_POSIX_TIMERS) && 
_POSIX_TIMERS > 0)
+// OpenBSD does not have a fully conformant suite of POSIX timers, but
+// it does have clock_gettime and CLOCK_MONOTONIC which is all we need.
+#if defined(__APPLE__) || defined(__gnu_hurd__) || defined(__OpenBSD__) || 
(defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0)
 #  define _LIBCPP_HAS_CLOCK_GETTIME
 #endif
 
diff --git a/libcxxabi/src/cxa_guard_impl.h b/libcxxabi/src/cxa_guard_impl.h
index e00d54b3a7318..90d589be4d773 100644
--- a/libcxxabi/src/cxa_guard_impl.h
+++ b/libcxxabi/src/cxa_guard_impl.h
@@ -47,6 +47,9 @@
 #include "__cxxabi_config.h"
 #include "include/atomic_support.h" // from libc++
 #if defined(__has_include)
+#  if __has_include()
+#include 
+#  endif
 #  if __has_include()
 #include 
 #  endif
@@ -411,7 +414,18 @@ struct InitByteGlobalMutex {
 // Futex Implementation
 
//===--===//
 
-#if defined(SYS_futex)
+#if defined(__OpenBSD__)
+void PlatformFutexWait(int* addr, int expect) {
+  constexpr int WAIT = 0;
+  futex(reinterpret_cast(addr), WAIT, expect, NULL, NULL);
+  __tsan_acquire(addr);
+}
+void PlatformFutexWake(int* addr) {
+  constexpr int WAKE = 1;
+  __tsan_release(addr);
+  futex(reinterpret_cast(addr), WAKE, INT_MAX, NULL, NULL);
+}
+#elif defined(SYS_futex)
 void PlatformFutexWait(int* addr, int expect) {
   constexpr int WAIT = 0;
   syscall(SYS_futex, addr, WAIT, expect, 0);

``
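
The `chrono.cpp` comment in the diff above says that `clock_gettime` with `CLOCK_MONOTONIC` is all that is needed. A minimal sketch of such a call, assuming a POSIX system, is shown here; `monotonicNanos` is a hypothetical helper and is not part of libc++.

```cpp
#include <cstdint>
#include <time.h> // clock_gettime, CLOCK_MONOTONIC (POSIX)

// Hypothetical helper: reads the monotonic clock and returns nanoseconds,
// or -1 if the call fails.
std::int64_t monotonicNanos() {
  timespec ts{};
  if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
    return -1;
  return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

Two successive calls should never go backwards, which is the property a steady clock relies on.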




https://github.com/llvm/llvm-project/pull/92601


[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:

@brad0 What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/92601


[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)

2024-05-17 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/92601


[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)

2024-05-17 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/92601

Backport af7467c

Requested by: @Ericson2314

From 472f75ba1cd626b92bf2b10099626717fcd58e29 Mon Sep 17 00:00:00 2001
From: John Ericson 
Date: Fri, 17 May 2024 16:49:04 -0400
Subject: [PATCH] [libcxx][libcxxabi] Fix build for OpenBSD (#92186)

- No indirect syscalls on OpenBSD. Instead there is a `futex` function
which issues a direct syscall.

- Monotonic clock is available despite the full POSIX suite of timers
not being available in its entirety.

  See https://lists.boost.org/boost-bugs/2015/07/41690.php and
  
https://github.com/boostorg/log/commit/c98b1f459add14d5ce3e9e63e2469064601d7f71
  for a description of an analogous problem and fix for Boost.

(cherry picked from commit af7467ce9f447d6fe977b73db1f03a18d6bbd511)
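
The first point above (direct `futex` function versus indirect `syscall`) can be illustrated with a small standalone sketch. The macro name `DEMO_FUTEX` and the function `wakeAll` are hypothetical and exist only for this example; the header choices are assumptions about what each platform provides, not a copy of the patch.

```cpp
#include <climits>
#include <cstdint>

#if defined(__linux__)
#  include <linux/futex.h>
#  include <sys/syscall.h>
#  include <unistd.h>
#  if !defined(SYS_futex) && defined(SYS_futex_time64)
#    define SYS_futex SYS_futex_time64
#  endif
// Linux: go through the indirect syscall() entry point.
#  define DEMO_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
#elif defined(__OpenBSD__)
#  include <sys/futex.h>
// OpenBSD: call the futex() function, which issues the syscall directly.
#  define DEMO_FUTEX(...) futex(__VA_ARGS__)
#endif

// Wakes every thread waiting on `word`. Returns the number of woken threads,
// or -1 on error or on platforms this sketch does not cover.
long wakeAll(volatile std::uint32_t *word) {
#if defined(DEMO_FUTEX)
  return DEMO_FUTEX(word, FUTEX_WAKE, INT_MAX, nullptr, nullptr);
#else
  (void)word;
  return -1;
#endif
}
```

Hiding the platform difference behind one variadic macro keeps the call sites identical, which is the same shape the patch uses for `_LIBCPP_FUTEX`.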
---
 libcxx/src/atomic.cpp  | 16 ++--
 libcxx/src/chrono.cpp  |  4 +++-
 libcxxabi/src/cxa_guard_impl.h | 16 +++-
 3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/libcxx/src/atomic.cpp b/libcxx/src/atomic.cpp
index 2f0389ae6974a..6b1f03c21bbcc 100644
--- a/libcxx/src/atomic.cpp
+++ b/libcxx/src/atomic.cpp
@@ -25,16 +25,28 @@
 #  if !defined(SYS_futex) && defined(SYS_futex_time64)
 #define SYS_futex SYS_futex_time64
 #  endif
+#  define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
 
 #elif defined(__FreeBSD__)
 
 #  include 
 #  include 
 
+#  define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
+
+#elif defined(__OpenBSD__)
+
+#  include 
+
+// OpenBSD has no indirect syscalls
+#  define _LIBCPP_FUTEX(...) futex(__VA_ARGS__)
+
 #else // <- Add other operating systems here
 
 // Baseline needs no new headers
 
+#  define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
+
 #endif
 
 _LIBCPP_BEGIN_NAMESPACE_STD
@@ -44,11 +56,11 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 static void
 __libcpp_platform_wait_on_address(__cxx_atomic_contention_t const volatile* 
__ptr, __cxx_contention_t __val) {
   static constexpr timespec __timeout = {2, 0};
-  syscall(SYS_futex, __ptr, FUTEX_WAIT_PRIVATE, __val, &__timeout, 0, 0);
+  _LIBCPP_FUTEX(__ptr, FUTEX_WAIT_PRIVATE, __val, &__timeout, 0, 0);
 }
 
 static void __libcpp_platform_wake_by_address(__cxx_atomic_contention_t const 
volatile* __ptr, bool __notify_one) {
-  syscall(SYS_futex, __ptr, FUTEX_WAKE_PRIVATE, __notify_one ? 1 : INT_MAX, 0, 
0, 0);
+  _LIBCPP_FUTEX(__ptr, FUTEX_WAKE_PRIVATE, __notify_one ? 1 : INT_MAX, 0, 0, 
0);
 }
 
 #elif defined(__APPLE__) && defined(_LIBCPP_USE_ULOCK)
diff --git a/libcxx/src/chrono.cpp b/libcxx/src/chrono.cpp
index c5e827c0cb59f..e7d6dfbc22924 100644
--- a/libcxx/src/chrono.cpp
+++ b/libcxx/src/chrono.cpp
@@ -31,7 +31,9 @@
 #  include  // for gettimeofday and timeval
 #endif
 
-#if defined(__APPLE__) || defined(__gnu_hurd__) || (defined(_POSIX_TIMERS) && 
_POSIX_TIMERS > 0)
+// OpenBSD does not have a fully conformant suite of POSIX timers, but
+// it does have clock_gettime and CLOCK_MONOTONIC which is all we need.
+#if defined(__APPLE__) || defined(__gnu_hurd__) || defined(__OpenBSD__) || 
(defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0)
 #  define _LIBCPP_HAS_CLOCK_GETTIME
 #endif
 
diff --git a/libcxxabi/src/cxa_guard_impl.h b/libcxxabi/src/cxa_guard_impl.h
index e00d54b3a7318..90d589be4d773 100644
--- a/libcxxabi/src/cxa_guard_impl.h
+++ b/libcxxabi/src/cxa_guard_impl.h
@@ -47,6 +47,9 @@
 #include "__cxxabi_config.h"
 #include "include/atomic_support.h" // from libc++
 #if defined(__has_include)
+#  if __has_include()
+#include 
+#  endif
 #  if __has_include()
 #include 
 #  endif
@@ -411,7 +414,18 @@ struct InitByteGlobalMutex {
 // Futex Implementation
 
//===--===//
 
-#if defined(SYS_futex)
+#if defined(__OpenBSD__)
+void PlatformFutexWait(int* addr, int expect) {
+  constexpr int WAIT = 0;
+  futex(reinterpret_cast(addr), WAIT, expect, NULL, NULL);
+  __tsan_acquire(addr);
+}
+void PlatformFutexWake(int* addr) {
+  constexpr int WAKE = 1;
+  __tsan_release(addr);
+  futex(reinterpret_cast(addr), WAKE, INT_MAX, NULL, NULL);
+}
+#elif defined(SYS_futex)
 void PlatformFutexWait(int* addr, int expect) {
   constexpr int WAIT = 0;
   syscall(SYS_futex, addr, WAIT, expect, 0);



[llvm-branch-commits] [lld] [llvm] release/18.x: [LoongArch] Use R_LARCH_ALIGN with section symbol (#84741) (PR #88891)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar demilestoned 
https://github.com/llvm/llvm-project/pull/88891


[llvm-branch-commits] [llvm] release/18.x: [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483) (PR #92468)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/92468


[llvm-branch-commits] [llvm] 3d0752b - [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

Author: DianQK
Date: 2024-05-17T13:50:38-07:00
New Revision: 3d0752b9492efd60e85aedec79676596af6fb4f8

URL: 
https://github.com/llvm/llvm-project/commit/3d0752b9492efd60e85aedec79676596af6fb4f8
DIFF: 
https://github.com/llvm/llvm-project/commit/3d0752b9492efd60e85aedec79676596af6fb4f8.diff

LOG: [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483)

Fixes #91312.

Don't perform the transform if the alias may be replaced at link time.

(cherry picked from commit c79690040acf5bb3d857558b0878db47f7f23dc3)
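
The rule in the log message can be modeled in a few lines of standalone C++. This is a simplified sketch, not LLVM code: the `Linkage` enum, the `Alias` struct, and both function names are hypothetical, and only two weak-style linkage kinds are modeled.

```cpp
#include <string>

// Simplified, hypothetical model of linkage kinds; not LLVM's real enum.
enum class Linkage { External, Internal, LinkOnceODR, WeakAny };

struct Alias {
  std::string name;
  Linkage linkage;
};

// A symbol with weak-style linkage may be overridden by another definition
// when the final binary is linked.
bool mayBeReplacedAtLinkTime(const Alias &a) {
  return a.linkage == Linkage::LinkOnceODR || a.linkage == Linkage::WeakAny;
}

// Models the early-out described above: never fold the aliasee into an alias
// that the linker is allowed to replace.
bool canReplaceAliaseeWithAlias(const Alias &a) {
  return !mayBeReplacedAtLinkTime(a);
}
```

With this model, a `linkonce_odr` alias such as `f1_alias` in the new test is left alone, while an alias with internal linkage remains a candidate.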

Added: 
llvm/test/Transforms/GlobalOpt/alias-weak.ll

Modified: 
llvm/lib/Transforms/IPO/GlobalOpt.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/IPO/GlobalOpt.cpp 
b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
index 951372adcfa93..619b3f612f25f 100644
--- a/llvm/lib/Transforms/IPO/GlobalOpt.cpp
+++ b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
@@ -2212,6 +2212,9 @@ static bool mayHaveOtherReferences(GlobalValue , const 
LLVMUsed ) {
 
 static bool hasUsesToReplace(GlobalAlias , const LLVMUsed ,
  bool ) {
+  if (GA.isWeakForLinker())
+return false;
+
   RenameTarget = false;
   bool Ret = false;
   if (hasUseOtherThanLLVMUsed(GA, U))

diff  --git a/llvm/test/Transforms/GlobalOpt/alias-weak.ll 
b/llvm/test/Transforms/GlobalOpt/alias-weak.ll
new file mode 100644
index 0..aec2a56313b12
--- /dev/null
+++ b/llvm/test/Transforms/GlobalOpt/alias-weak.ll
@@ -0,0 +1,57 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --check-globals all --include-generated-funcs --version 4
+; RUN: opt < %s -passes=globalopt -S | FileCheck %s
+
+@f1_alias = linkonce_odr hidden alias void (), ptr @f1
+@f2_alias = linkonce_odr hidden alias void (), ptr @f2
+
+define void @foo() {
+  call void @f1_alias()
+  ret void
+}
+
+define void @bar() {
+  call void @f1()
+  ret void
+}
+
+define void @baz() {
+  call void @f2_alias()
+  ret void
+}
+
+; We cannot use `f1_alias` to replace `f1` because they are both in use
+; and `f1_alias` could be replaced at link time.
+define internal void @f1() {
+  ret void
+}
+
+; FIXME: We can use `f2_alias` to replace `f2` because `b2` is not in use.
+define internal void @f2() {
+  ret void
+}
+;.
+; CHECK: @f1_alias = linkonce_odr hidden alias void (), ptr @f1
+; CHECK: @f2_alias = linkonce_odr hidden alias void (), ptr @f2
+;.
+; CHECK-LABEL: define void @foo() local_unnamed_addr {
+; CHECK-NEXT:call void @f1_alias()
+; CHECK-NEXT:ret void
+;
+;
+; CHECK-LABEL: define void @bar() local_unnamed_addr {
+; CHECK-NEXT:call void @f1()
+; CHECK-NEXT:ret void
+;
+;
+; CHECK-LABEL: define void @baz() local_unnamed_addr {
+; CHECK-NEXT:call void @f2_alias()
+; CHECK-NEXT:ret void
+;
+;
+; CHECK-LABEL: define internal void @f1() {
+; CHECK-NEXT:ret void
+;
+;
+; CHECK-LABEL: define internal void @f2() {
+; CHECK-NEXT:ret void
+;





[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)

2024-05-17 Thread Craig Topper via llvm-branch-commits

topperc wrote:

> @topperc (or anyone else). If you would like to add a note about this fix in 
> the release notes (completely optional). Please reply to this comment with a 
> one or two sentence description of the fix. When you are done, please add the 
> release:note label to this PR.

`-Xclang -target-feature -Xclang +unaligned-scalar-mem` can be used to enable 
unaligned scalar memory accesses for CPUs that do not support unaligned vector 
accesses. `-mno-strict-align` will enable unaligned scalar and vector memory 
accesses.

https://github.com/llvm/llvm-project/pull/92143


[llvm-branch-commits] [llvm] release/18.x: [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483) (PR #92468)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/92468

From 3d0752b9492efd60e85aedec79676596af6fb4f8 Mon Sep 17 00:00:00 2001
From: DianQK 
Date: Fri, 17 May 2024 05:51:49 +0800
Subject: [PATCH] [GlobalOpt] Don't replace aliasee with alias that has weak
 linkage (#91483)

Fixes #91312.

Don't perform the transform if the alias may be replaced at link time.

(cherry picked from commit c79690040acf5bb3d857558b0878db47f7f23dc3)
---
 llvm/lib/Transforms/IPO/GlobalOpt.cpp|  3 ++
 llvm/test/Transforms/GlobalOpt/alias-weak.ll | 57 
 2 files changed, 60 insertions(+)
 create mode 100644 llvm/test/Transforms/GlobalOpt/alias-weak.ll

diff --git a/llvm/lib/Transforms/IPO/GlobalOpt.cpp 
b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
index 951372adcfa93..619b3f612f25f 100644
--- a/llvm/lib/Transforms/IPO/GlobalOpt.cpp
+++ b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
@@ -2212,6 +2212,9 @@ static bool mayHaveOtherReferences(GlobalValue , const 
LLVMUsed ) {
 
 static bool hasUsesToReplace(GlobalAlias , const LLVMUsed ,
  bool ) {
+  if (GA.isWeakForLinker())
+return false;
+
   RenameTarget = false;
   bool Ret = false;
   if (hasUseOtherThanLLVMUsed(GA, U))
diff --git a/llvm/test/Transforms/GlobalOpt/alias-weak.ll 
b/llvm/test/Transforms/GlobalOpt/alias-weak.ll
new file mode 100644
index 0..aec2a56313b12
--- /dev/null
+++ b/llvm/test/Transforms/GlobalOpt/alias-weak.ll
@@ -0,0 +1,57 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --check-globals all --include-generated-funcs --version 4
+; RUN: opt < %s -passes=globalopt -S | FileCheck %s
+
+@f1_alias = linkonce_odr hidden alias void (), ptr @f1
+@f2_alias = linkonce_odr hidden alias void (), ptr @f2
+
+define void @foo() {
+  call void @f1_alias()
+  ret void
+}
+
+define void @bar() {
+  call void @f1()
+  ret void
+}
+
+define void @baz() {
+  call void @f2_alias()
+  ret void
+}
+
+; We cannot use `f1_alias` to replace `f1` because they are both in use
+; and `f1_alias` could be replaced at link time.
+define internal void @f1() {
+  ret void
+}
+
+; FIXME: We can use `f2_alias` to replace `f2` because `b2` is not in use.
+define internal void @f2() {
+  ret void
+}
+;.
+; CHECK: @f1_alias = linkonce_odr hidden alias void (), ptr @f1
+; CHECK: @f2_alias = linkonce_odr hidden alias void (), ptr @f2
+;.
+; CHECK-LABEL: define void @foo() local_unnamed_addr {
+; CHECK-NEXT:call void @f1_alias()
+; CHECK-NEXT:ret void
+;
+;
+; CHECK-LABEL: define void @bar() local_unnamed_addr {
+; CHECK-NEXT:call void @f1()
+; CHECK-NEXT:ret void
+;
+;
+; CHECK-LABEL: define void @baz() local_unnamed_addr {
+; CHECK-NEXT:call void @f2_alias()
+; CHECK-NEXT:ret void
+;
+;
+; CHECK-LABEL: define internal void @f1() {
+; CHECK-NEXT:ret void
+;
+;
+; CHECK-LABEL: define internal void @f2() {
+; CHECK-NEXT:ret void
+;



[llvm-branch-commits] [llvm] [workflows] Fix libclang-abi-tests to work with new version scheme (PR #91096)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: 6456ebbc18a6c2eaa2d7f6cfb7b2e5938e2daf7a

https://github.com/llvm/llvm-project/pull/91096


[llvm-branch-commits] [llvm] [workflows] Fix libclang-abi-tests to work with new version scheme (PR #91096)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/91096


[llvm-branch-commits] [llvm] release/18.x: [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483) (PR #92468)

2024-05-17 Thread via llvm-branch-commits

DianQK wrote:

> > Per [#91483 
> > (comment)](https://github.com/llvm/llvm-project/pull/91483#issuecomment-2116394616),
> >  we still need to further investigate this issue, but it won't stop us from 
> > backporting it.
> > cc @MaskRay
> 
> What exactly does this mean? Was there a bug in the original patch?

It's safe. See also 
https://github.com/llvm/llvm-project/issues/91312#issuecomment-2116404306.

https://github.com/llvm/llvm-project/pull/92468


[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@dpaoliello (or anyone else): if you would like to add a note about this fix to 
the release notes (completely optional), please reply to this comment with a 
one- or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/92580


[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/92580


[llvm-branch-commits] [llvm] 9208786 - [Arm64EC] Correctly handle sret in entry thunks. (#92326)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

Author: Eli Friedman
Date: 2024-05-17T13:35:09-07:00
New Revision: 92087868d5d291464056066f3e193eca97621514

URL: 
https://github.com/llvm/llvm-project/commit/92087868d5d291464056066f3e193eca97621514
DIFF: 
https://github.com/llvm/llvm-project/commit/92087868d5d291464056066f3e193eca97621514.diff

LOG: [Arm64EC] Correctly handle sret in entry thunks. (#92326)

I accidentally left out the code to transfer sret attributes to entry
thunks, so values weren't being passed in the right registers, and the
sret pointer wasn't returned in the correct register.

Fixes #90229

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
index d4dd28aecac48..862aefe46193d 100644
--- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
@@ -514,7 +514,14 @@ Function 
*AArch64Arm64ECCallLowering::buildEntryThunk(Function *F) {
   // Call the function passed to the thunk.
   Value *Callee = Thunk->getArg(0);
   Callee = IRB.CreateBitCast(Callee, PtrTy);
-  Value *Call = IRB.CreateCall(Arm64Ty, Callee, Args);
+  CallInst *Call = IRB.CreateCall(Arm64Ty, Callee, Args);
+
+  auto SRetAttr = F->getAttributes().getParamAttr(0, Attribute::StructRet);
+  auto InRegAttr = F->getAttributes().getParamAttr(0, Attribute::InReg);
+  if (SRetAttr.isValid() && !InRegAttr.isValid()) {
+Thunk->addParamAttr(1, SRetAttr);
+Call->addParamAttr(0, SRetAttr);
+  }
 
   Value *RetVal = Call;
   if (TransformDirectToSRet) {

diff  --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
index c00c9bfe127e8..e9556b9d5cbee 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
@@ -222,12 +222,12 @@ define i8 @matches_has_sret() nounwind {
 }
 
 %TSRet = type { i64, i64 }
-define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind {
-; CHECK-LABEL:.def$ientry_thunk$cdecl$m16$v;
-; CHECK:  .section
.wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$v
+define void @has_aligned_sret(ptr align 32 sret(%TSRet), i32) nounwind {
+; CHECK-LABEL:.def$ientry_thunk$cdecl$m16$i8;
+; CHECK:  .section
.wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$i8
 ; CHECK:  // %bb.0:
-; CHECK-NEXT: stp q6, q7, [sp, #-176]!// 32-byte Folded 
Spill
-; CHECK-NEXT: .seh_save_any_reg_pxq6, 176
+; CHECK-NEXT: stp q6, q7, [sp, #-192]!// 32-byte Folded 
Spill
+; CHECK-NEXT: .seh_save_any_reg_pxq6, 192
 ; CHECK-NEXT: stp q8, q9, [sp, #32]   // 32-byte Folded 
Spill
 ; CHECK-NEXT: .seh_save_any_reg_p q8, 32
 ; CHECK-NEXT: stp q10, q11, [sp, #64] // 32-byte Folded 
Spill
@@ -236,17 +236,25 @@ define void @has_aligned_sret(ptr align 32 sret(%TSRet)) 
nounwind {
 ; CHECK-NEXT: .seh_save_any_reg_p q12, 96
 ; CHECK-NEXT: stp q14, q15, [sp, #128]// 32-byte Folded 
Spill
 ; CHECK-NEXT: .seh_save_any_reg_p q14, 128
-; CHECK-NEXT: stp x29, x30, [sp, #160]// 16-byte Folded 
Spill
-; CHECK-NEXT: .seh_save_fplr  160
-; CHECK-NEXT: add x29, sp, #160
-; CHECK-NEXT: .seh_add_fp 160
+; CHECK-NEXT: str x19, [sp, #160] // 8-byte Folded 
Spill
+; CHECK-NEXT: .seh_save_reg   x19, 160
+; CHECK-NEXT: stp x29, x30, [sp, #168]// 16-byte Folded 
Spill
+; CHECK-NEXT: .seh_save_fplr  168
+; CHECK-NEXT: add x29, sp, #168
+; CHECK-NEXT: .seh_add_fp 168
 ; CHECK-NEXT: .seh_endprologue
+; CHECK-NEXT: mov x19, x0
+; CHECK-NEXT: mov x8, x0
+; CHECK-NEXT: mov x0, x1
 ; CHECK-NEXT: blr x9
 ; CHECK-NEXT: adrpx8, __os_arm64x_dispatch_ret
 ; CHECK-NEXT: ldr x0, [x8, :lo12:__os_arm64x_dispatch_ret]
+; CHECK-NEXT: mov x8, x19
 ; CHECK-NEXT: .seh_startepilogue
-; CHECK-NEXT: ldp x29, x30, [sp, #160]// 16-byte Folded 
Reload
-; CHECK-NEXT: .seh_save_fplr  160
+; CHECK-NEXT: ldp x29, x30, [sp, #168]// 16-byte Folded 
Reload
+; CHECK-NEXT: .seh_save_fplr  168
+; CHECK-NEXT: ldr x19, [sp, #160] // 8-byte Folded 
Reload
+; CHECK-NEXT: .seh_save_reg   x19, 160
 ; CHECK-NEXT: ldp q14, q15, [sp, #128]// 32-byte Folded 
Reload
 ; CHECK-NEXT: .seh_save_any_reg_p q14, 128
 ; CHECK-NEXT: ldp q12, q13, [sp, #96] // 32-byte Folded 
Reload
@@ -255,8 +263,8 @@ define void @has_aligned_sret(ptr align 32 sret(%TSRet)) 
nounwind {
 ; CHECK-NEXT: .seh_save_any_reg_p 

[llvm-branch-commits] [llvm] bee6966 - [Arm64EC] Improve alignment mangling in arm64ec thunks. (#90115)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

Author: Eli Friedman
Date: 2024-05-17T13:35:09-07:00
New Revision: bee6966d8efa18041e2e228c3bb7b09c4618677b

URL: 
https://github.com/llvm/llvm-project/commit/bee6966d8efa18041e2e228c3bb7b09c4618677b
DIFF: 
https://github.com/llvm/llvm-project/commit/bee6966d8efa18041e2e228c3bb7b09c4618677b.diff

LOG: [Arm64EC] Improve alignment mangling in arm64ec thunks. (#90115)

In some cases, MSVC's mangling for arm64ec thunks includes the alignment
of a struct. I added some code to try to match... but it never really
worked right. The issues:

- Alignment is only mangled if it's 16 or more (I guess the default is
supposed to be 8).
- Alignment isn't mangled on return values (since the memory is
allocated by the caller).

The current patch leaves hooks to make alignment mangling work... but
doesn't actually ever mangle alignment: clang never actually encodes a
relevant alignment into the IR. Once we get clang to emit the real
size/alignment of structs, we can start emitting it.
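
The two rules listed above can be sketched as a small standalone function. This is only an illustration, not the LLVM implementation: `mangleStructFragment` and its parameters are hypothetical names, and the sketch assumes the struct fragment is built the way the diff below suggests (an `m`, the size unless it is 4, then an optional `a<alignment>`).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of LLVM: builds the struct fragment of an
// Arm64EC thunk name following the two rules from the commit message.
std::string mangleStructFragment(uint64_t sizeBytes, uint64_t alignBytes,
                                 bool isReturnValue) {
  std::string out = "m";
  // The size suffix is omitted for 4-byte structs.
  if (sizeBytes != 4)
    out += std::to_string(sizeBytes);
  // Alignment is appended only when it is 16 or more, and never for
  // return values, whose memory is allocated by the caller.
  if (alignBytes >= 16 && !isReturnValue)
    out += "a" + std::to_string(alignBytes);
  return out;
}
```

Under these assumptions a 16-byte struct returned with 32-byte alignment yields `m16`, which matches the updated `has_aligned_sret` test expectations. Note that the patch itself currently passes a default alignment for parameters, so the `a` suffix is never emitted in practice until clang encodes the real alignment.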

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
index 55c5bbc66a3f4..d4dd28aecac48 100644
--- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
@@ -181,13 +181,14 @@ void AArch64Arm64ECCallLowering::getThunkArgTypes(
   }
 
   for (unsigned E = FT->getNumParams(); I != E; ++I) {
-Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne();
 #if 0
 // FIXME: Need more information about argument size; see
 // https://reviews.llvm.org/D132926
 uint64_t ArgSizeBytes = AttrList.getParamArm64ECArgSizeBytes(I);
+Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne();
 #else
 uint64_t ArgSizeBytes = 0;
+Align ParamAlign = Align();
 #endif
 Type *Arm64Ty, *X64Ty;
 canonicalizeThunkType(FT->getParamType(I), ParamAlign,
@@ -297,7 +298,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType(
 uint64_t TotalSizeBytes = ElementCnt * ElementSizePerBytes;
 if (ElementTy->isFloatTy() || ElementTy->isDoubleTy()) {
   Out << (ElementTy->isFloatTy() ? "F" : "D") << TotalSizeBytes;
-  if (Alignment.value() >= 8 && !T->isPointerTy())
+  if (Alignment.value() >= 16 && !Ret)
 Out << "a" << Alignment.value();
   Arm64Ty = T;
   if (TotalSizeBytes <= 8) {
@@ -328,7 +329,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType(
   Out << "m";
   if (TypeSize != 4)
 Out << TypeSize;
-  if (Alignment.value() >= 8 && !T->isPointerTy())
+  if (Alignment.value() >= 16 && !Ret)
 Out << "a" << Alignment.value();
   // FIXME: Try to canonicalize Arm64Ty more thoroughly?
   Arm64Ty = T;

diff  --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
index bb9ba05f7a272..c00c9bfe127e8 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
@@ -223,8 +223,8 @@ define i8 @matches_has_sret() nounwind {
 
 %TSRet = type { i64, i64 }
 define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind {
-; CHECK-LABEL:.def$ientry_thunk$cdecl$m16a32$v;
-; CHECK:  .section
.wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16a32$v
+; CHECK-LABEL:.def$ientry_thunk$cdecl$m16$v;
+; CHECK:  .section
.wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$v
 ; CHECK:  // %bb.0:
 ; CHECK-NEXT: stp q6, q7, [sp, #-176]! // 32-byte Folded Spill
 ; CHECK-NEXT: .seh_save_any_reg_px q6, 176
@@ -457,7 +457,7 @@ define %T2 @simple_struct(%T1 %0, %T2 %1, %T3, %T4) 
nounwind {
 ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$i8$v
 ; CHECK-NEXT: .word   1
 ; CHECK-NEXT: .symidx "#has_aligned_sret"
-; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16a32$v
+; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16$v
 ; CHECK-NEXT: .word   1
 ; CHECK-NEXT: .symidx "#small_array"
 ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m2$m2F8

diff  --git a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
index 3b911e78aff2a..7a40fcd85ac58 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
@@ -236,8 +236,8 @@ declare void @has_sret(ptr sret([100 x i8])) nounwind;
 
 %TSRet = type { i64, i64 }
 declare void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind;
-; CHECK-LABEL:.def$iexit_thunk$cdecl$m16a32$v;
-; CHECK:  .section
.wowthk$aa,"xr",discard,$iexit_thunk$cdecl$m16a32$v
+; CHECK-LABEL:.def$iexit_thunk$cdecl$m16$v;
+; CHECK:  .section

[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/92580

>From bee6966d8efa18041e2e228c3bb7b09c4618677b Mon Sep 17 00:00:00 2001
From: Eli Friedman 
Date: Fri, 26 Apr 2024 11:06:11 -0700
Subject: [PATCH 1/2] [Arm64EC] Improve alignment mangling in arm64ec thunks.
 (#90115)

In some cases, MSVC's mangling for arm64ec thunks includes the alignment
of a struct. I added some code to try to match... but it never really
worked right. The issues:

- Alignment is only mangled if it's 16 or more (I guess the default is
supposed to be 8).
- Alignment isn't mangled on return values (since the memory is
allocated by the caller).

The current patch leaves hooks to make alignment mangling work... but
doesn't actually ever mangle alignment: clang never actually encodes a
relevant alignment into the IR. Once we get clang to emit the real
size/alignment of structs, we can start emitting it.
---
 llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp |  7 ---
 llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll  |  6 +++---
 llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll   | 10 +-
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
index 55c5bbc66a3f4..d4dd28aecac48 100644
--- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
@@ -181,13 +181,14 @@ void AArch64Arm64ECCallLowering::getThunkArgTypes(
   }
 
   for (unsigned E = FT->getNumParams(); I != E; ++I) {
-Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne();
 #if 0
 // FIXME: Need more information about argument size; see
 // https://reviews.llvm.org/D132926
 uint64_t ArgSizeBytes = AttrList.getParamArm64ECArgSizeBytes(I);
+Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne();
 #else
 uint64_t ArgSizeBytes = 0;
+Align ParamAlign = Align();
 #endif
 Type *Arm64Ty, *X64Ty;
 canonicalizeThunkType(FT->getParamType(I), ParamAlign,
@@ -297,7 +298,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType(
 uint64_t TotalSizeBytes = ElementCnt * ElementSizePerBytes;
 if (ElementTy->isFloatTy() || ElementTy->isDoubleTy()) {
   Out << (ElementTy->isFloatTy() ? "F" : "D") << TotalSizeBytes;
-  if (Alignment.value() >= 8 && !T->isPointerTy())
+  if (Alignment.value() >= 16 && !Ret)
 Out << "a" << Alignment.value();
   Arm64Ty = T;
   if (TotalSizeBytes <= 8) {
@@ -328,7 +329,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType(
   Out << "m";
   if (TypeSize != 4)
 Out << TypeSize;
-  if (Alignment.value() >= 8 && !T->isPointerTy())
+  if (Alignment.value() >= 16 && !Ret)
 Out << "a" << Alignment.value();
   // FIXME: Try to canonicalize Arm64Ty more thoroughly?
   Arm64Ty = T;
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
index bb9ba05f7a272..c00c9bfe127e8 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
@@ -223,8 +223,8 @@ define i8 @matches_has_sret() nounwind {
 
 %TSRet = type { i64, i64 }
 define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind {
-; CHECK-LABEL:.def$ientry_thunk$cdecl$m16a32$v;
-; CHECK:  .section
.wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16a32$v
+; CHECK-LABEL:.def$ientry_thunk$cdecl$m16$v;
+; CHECK:  .section
.wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$v
 ; CHECK:  // %bb.0:
 ; CHECK-NEXT: stp q6, q7, [sp, #-176]! // 32-byte Folded Spill
 ; CHECK-NEXT: .seh_save_any_reg_px q6, 176
@@ -457,7 +457,7 @@ define %T2 @simple_struct(%T1 %0, %T2 %1, %T3, %T4) 
nounwind {
 ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$i8$v
 ; CHECK-NEXT: .word   1
 ; CHECK-NEXT: .symidx "#has_aligned_sret"
-; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16a32$v
+; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16$v
 ; CHECK-NEXT: .word   1
 ; CHECK-NEXT: .symidx "#small_array"
 ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m2$m2F8
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
index 3b911e78aff2a..7a40fcd85ac58 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
@@ -236,8 +236,8 @@ declare void @has_sret(ptr sret([100 x i8])) nounwind;
 
 %TSRet = type { i64, i64 }
 declare void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind;
-; CHECK-LABEL:.def$iexit_thunk$cdecl$m16a32$v;
-; CHECK:  .section
.wowthk$aa,"xr",discard,$iexit_thunk$cdecl$m16a32$v
+; CHECK-LABEL:.def$iexit_thunk$cdecl$m16$v;
+; CHECK:  .section
.wowthk$aa,"xr",discard,$iexit_thunk$cdecl$m16$v
 ; CHECK:  // %bb.0:
 ; CHECK-NEXT: 

[llvm-branch-commits] [llvm] release/18.x: [workflows] Fix libclang-abi-tests to work with new version scheme (#91865) (PR #92258)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/92258


[llvm-branch-commits] [llvm] release/18.x: [workflows] Fix libclang-abi-tests to work with new version scheme (#91865) (PR #92258)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/92258

>From 6456ebbc18a6c2eaa2d7f6cfb7b2e5938e2daf7a Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Wed, 15 May 2024 06:08:29 -0700
Subject: [PATCH] [workflows] Fix libclang-abi-tests to work with new version
 scheme (#91865)

(cherry picked from commit d06270ee00e37b247eb99268fb2f106dbeee08ff)
---
 .github/workflows/libclang-abi-tests.yml | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/libclang-abi-tests.yml 
b/.github/workflows/libclang-abi-tests.yml
index ccfc1e5fb8a74..972d21c3bcedf 100644
--- a/.github/workflows/libclang-abi-tests.yml
+++ b/.github/workflows/libclang-abi-tests.yml
@@ -33,7 +33,6 @@ jobs:
   ABI_HEADERS: ${{ steps.vars.outputs.ABI_HEADERS }}
   ABI_LIBS: ${{ steps.vars.outputs.ABI_LIBS }}
   BASELINE_VERSION_MAJOR: ${{ steps.vars.outputs.BASELINE_VERSION_MAJOR }}
-  BASELINE_VERSION_MINOR: ${{ steps.vars.outputs.BASELINE_VERSION_MINOR }}
   LLVM_VERSION_MAJOR: ${{ steps.version.outputs.LLVM_VERSION_MAJOR }}
   LLVM_VERSION_MINOR: ${{ steps.version.outputs.LLVM_VERSION_MINOR }}
   LLVM_VERSION_PATCH: ${{ steps.version.outputs.LLVM_VERSION_PATCH }}
@@ -51,9 +50,9 @@ jobs:
 id: vars
 run: |
   remote_repo='https://github.com/llvm/llvm-project'
-  if [ ${{ steps.version.outputs.LLVM_VERSION_MINOR }} -ne 0 ] || [ 
${{ steps.version.outputs.LLVM_VERSION_PATCH }} -eq 0 ]; then
+  if [ ${{ steps.version.outputs.LLVM_VERSION_PATCH }} -eq 0 ]; then
 major_version=$(( ${{ steps.version.outputs.LLVM_VERSION_MAJOR }} 
- 1))
-baseline_ref="llvmorg-$major_version.0.0"
+baseline_ref="llvmorg-$major_version.1.0"
 
 # If there is a minor release, we want to use that as the base 
line.
 minor_ref=$(git ls-remote --refs -t "$remote_repo" 
llvmorg-"$major_version".[1-9].[0-9] | tail -n1 | grep -o 'llvmorg-.\+' || true)
@@ -75,7 +74,7 @@ jobs:
   else
 {
   echo "BASELINE_VERSION_MAJOR=${{ 
steps.version.outputs.LLVM_VERSION_MAJOR }}"
-  echo "BASELINE_REF=llvmorg-${{ 
steps.version.outputs.LLVM_VERSION_MAJOR }}.0.0"
+  echo "BASELINE_REF=llvmorg-${{ 
steps.version.outputs.LLVM_VERSION_MAJOR }}.1.0"
   echo "ABI_HEADERS=."
   echo "ABI_LIBS=libclang.so libclang-cpp.so"
 } >> "$GITHUB_OUTPUT"



[llvm-branch-commits] [clang] release/18.x: [clang] Don't assume location of compiler-rt for OpenBSD (#92183) (PR #92293)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

cc @epsilon-0 

https://github.com/llvm/llvm-project/pull/92293


[llvm-branch-commits] [llvm] release/18.x: [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483) (PR #92468)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

> Per [#91483 
> (comment)](https://github.com/llvm/llvm-project/pull/91483#issuecomment-2116394616),
>  we still need to further investigate this issue, but it won't stop us from 
> backporting it.
> 
> cc @MaskRay

What exactly does this mean? Was there a bug in the original patch?

https://github.com/llvm/llvm-project/pull/92468


[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@topperc (or anyone else): if you would like to add a note about this fix in 
the release notes (completely optional), please reply to this comment with a 
one- or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/92143


[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/92143


[llvm-branch-commits] [llvm] a7cd0c6 - [RISCV] Add a unaligned-scalar-mem feature like we had in clang 17.

2024-05-17 Thread Tom Stellard via llvm-branch-commits

Author: Craig Topper
Date: 2024-05-17T13:22:27-07:00
New Revision: a7cd0c61123889a632ceea67dc8c8e2c8753ae08

URL: 
https://github.com/llvm/llvm-project/commit/a7cd0c61123889a632ceea67dc8c8e2c8753ae08
DIFF: 
https://github.com/llvm/llvm-project/commit/a7cd0c61123889a632ceea67dc8c8e2c8753ae08.diff

LOG: [RISCV] Add a unaligned-scalar-mem feature like we had in clang 17.

This is ORed with the fast-unaligned-access feature, which applies
to scalar and vector together.
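
As a rough sketch of what "ORed" means here for scalar accesses: the struct and function below are hypothetical stand-ins rather than the real `RISCVSubtarget` API, and they model only the two feature bits named in this commit.

```cpp
#include <cassert>

// Hypothetical stand-in for the RISC-V subtarget; only the two feature bits
// discussed in this commit are modelled.
struct SubtargetSketch {
  bool fastUnalignedAccess = false; // +fast-unaligned-access (scalar and vector)
  bool unalignedScalarMem = false;  // +unaligned-scalar-mem (scalar only)
};

// A scalar access may be misaligned if either feature is enabled.
bool allowsMisalignedScalarAccess(const SubtargetSketch &st) {
  return st.fastUnalignedAccess || st.unalignedScalarMem;
}
```

With this shape, enabling only `unaligned-scalar-mem` is enough for scalar loads and stores, which is consistent with the new `RUN` lines reusing the existing `RV32-FAST` and `RV64-FAST` check prefixes.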

Added: 


Modified: 
llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
llvm/lib/Target/RISCV/RISCVFeatures.td
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
llvm/test/CodeGen/RISCV/memcpy-inline.ll
llvm/test/CodeGen/RISCV/memcpy.ll
llvm/test/CodeGen/RISCV/memset-inline.ll
llvm/test/CodeGen/RISCV/pr56110.ll
llvm/test/CodeGen/RISCV/unaligned-load-store.ll

Removed: 




diff  --git a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp 
b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
index 0a314fdd41cbe..89207640ee54a 100644
--- a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
@@ -317,8 +317,9 @@ bool RISCVExpandPseudo::expandRV32ZdinxStore(MachineBasicBlock &MBB,
   .addReg(MBBI->getOperand(1).getReg())
   .add(MBBI->getOperand(2));
   if (MBBI->getOperand(2).isGlobal() || MBBI->getOperand(2).isCPI()) {
-// FIXME: Zdinx RV32 can not work on unaligned memory.
-assert(!STI->hasFastUnalignedAccess());
+// FIXME: Zdinx RV32 can not work on unaligned scalar memory.
+assert(!STI->hasFastUnalignedAccess() &&
+   !STI->enableUnalignedScalarMem());
 
 assert(MBBI->getOperand(2).getOffset() % 8 == 0);
 MBBI->getOperand(2).setOffset(MBBI->getOperand(2).getOffset() + 4);

diff  --git a/llvm/lib/Target/RISCV/RISCVFeatures.td 
b/llvm/lib/Target/RISCV/RISCVFeatures.td
index 26451c80f57b4..1bb6b6a561f4a 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -1025,6 +1025,11 @@ def FeatureFastUnalignedAccess
   "true", "Has reasonably performant unaligned "
   "loads and stores (both scalar and vector)">;
 
+def FeatureUnalignedScalarMem
+   : SubtargetFeature<"unaligned-scalar-mem", "EnableUnalignedScalarMem",
+  "true", "Has reasonably performant unaligned scalar "
+  "loads and stores">;
+
 def FeaturePostRAScheduler : SubtargetFeature<"use-postra-scheduler",
 "UsePostRAScheduler", "true", "Schedule again after register allocation">;
 

diff  --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index d46093b9e260a..3fe7ddfdd4279 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1883,7 +1883,8 @@ bool RISCVTargetLowering::shouldConvertConstantLoadToIntImm(const APInt &Imm,
   // replace. If we don't support unaligned scalar mem, prefer the constant
   // pool.
   // TODO: Can the caller pass down the alignment?
-  if (!Subtarget.hasFastUnalignedAccess())
+  if (!Subtarget.hasFastUnalignedAccess() &&
+  !Subtarget.enableUnalignedScalarMem())
 return true;
 
   // Prefer to keep the load if it would require many instructions.
@@ -19772,8 +19773,10 @@ bool RISCVTargetLowering::allowsMisalignedMemoryAccesses(
 unsigned *Fast) const {
   if (!VT.isVector()) {
 if (Fast)
-  *Fast = Subtarget.hasFastUnalignedAccess();
-return Subtarget.hasFastUnalignedAccess();
+  *Fast = Subtarget.hasFastUnalignedAccess() ||
+  Subtarget.enableUnalignedScalarMem();
+return Subtarget.hasFastUnalignedAccess() ||
+   Subtarget.enableUnalignedScalarMem();
   }
 
   // All vector implementations must support element alignment

diff  --git a/llvm/test/CodeGen/RISCV/memcpy-inline.ll 
b/llvm/test/CodeGen/RISCV/memcpy-inline.ll
index 343695ee37da8..709b8264b5833 100644
--- a/llvm/test/CodeGen/RISCV/memcpy-inline.ll
+++ b/llvm/test/CodeGen/RISCV/memcpy-inline.ll
@@ -7,6 +7,10 @@
 ; RUN:   | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST
 ; RUN: llc < %s -mtriple=riscv64 -mattr=+fast-unaligned-access \
 ; RUN:   | FileCheck %s --check-prefixes=RV64-BOTH,RV64-FAST
+; RUN: llc < %s -mtriple=riscv32 -mattr=+unaligned-scalar-mem \
+; RUN:   | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST
+; RUN: llc < %s -mtriple=riscv64 -mattr=+unaligned-scalar-mem \
+; RUN:   | FileCheck %s --check-prefixes=RV64-BOTH,RV64-FAST
 
 ; --
 ; Fully unaligned cases

diff  --git a/llvm/test/CodeGen/RISCV/memcpy.ll 
b/llvm/test/CodeGen/RISCV/memcpy.ll
index 12ec0881b20d9..f8f5d25947d7f 100644
--- a/llvm/test/CodeGen/RISCV/memcpy.ll
+++ b/llvm/test/CodeGen/RISCV/memcpy.ll
@@ -7,6 +7,10 @@
 ; RUN:   | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST
 ; RUN: 

[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)

2024-05-17 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/92143

>From a7cd0c61123889a632ceea67dc8c8e2c8753ae08 Mon Sep 17 00:00:00 2001
From: Craig Topper 
Date: Thu, 16 May 2024 12:27:05 -0700
Subject: [PATCH] [RISCV] Add a unaligned-scalar-mem feature like we had in
 clang 17.

This is ORed with the fast-unaligned-access feature, which applies
to scalar and vector together.
---
 llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp | 5 +++--
 llvm/lib/Target/RISCV/RISCVFeatures.td   | 5 +
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp  | 9 ++---
 llvm/test/CodeGen/RISCV/memcpy-inline.ll | 4 
 llvm/test/CodeGen/RISCV/memcpy.ll| 4 
 llvm/test/CodeGen/RISCV/memset-inline.ll | 4 
 llvm/test/CodeGen/RISCV/pr56110.ll   | 1 +
 llvm/test/CodeGen/RISCV/unaligned-load-store.ll  | 4 
 8 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp 
b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
index 0a314fdd41cbe..89207640ee54a 100644
--- a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
@@ -317,8 +317,9 @@ bool RISCVExpandPseudo::expandRV32ZdinxStore(MachineBasicBlock &MBB,
   .addReg(MBBI->getOperand(1).getReg())
   .add(MBBI->getOperand(2));
   if (MBBI->getOperand(2).isGlobal() || MBBI->getOperand(2).isCPI()) {
-// FIXME: Zdinx RV32 can not work on unaligned memory.
-assert(!STI->hasFastUnalignedAccess());
+// FIXME: Zdinx RV32 can not work on unaligned scalar memory.
+assert(!STI->hasFastUnalignedAccess() &&
+   !STI->enableUnalignedScalarMem());
 
 assert(MBBI->getOperand(2).getOffset() % 8 == 0);
 MBBI->getOperand(2).setOffset(MBBI->getOperand(2).getOffset() + 4);
diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td 
b/llvm/lib/Target/RISCV/RISCVFeatures.td
index 26451c80f57b4..1bb6b6a561f4a 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -1025,6 +1025,11 @@ def FeatureFastUnalignedAccess
   "true", "Has reasonably performant unaligned "
   "loads and stores (both scalar and vector)">;
 
+def FeatureUnalignedScalarMem
+   : SubtargetFeature<"unaligned-scalar-mem", "EnableUnalignedScalarMem",
+  "true", "Has reasonably performant unaligned scalar "
+  "loads and stores">;
+
 def FeaturePostRAScheduler : SubtargetFeature<"use-postra-scheduler",
 "UsePostRAScheduler", "true", "Schedule again after register allocation">;
 
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index d46093b9e260a..3fe7ddfdd4279 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1883,7 +1883,8 @@ bool RISCVTargetLowering::shouldConvertConstantLoadToIntImm(const APInt &Imm,
   // replace. If we don't support unaligned scalar mem, prefer the constant
   // pool.
   // TODO: Can the caller pass down the alignment?
-  if (!Subtarget.hasFastUnalignedAccess())
+  if (!Subtarget.hasFastUnalignedAccess() &&
+  !Subtarget.enableUnalignedScalarMem())
 return true;
 
   // Prefer to keep the load if it would require many instructions.
@@ -19772,8 +19773,10 @@ bool RISCVTargetLowering::allowsMisalignedMemoryAccesses(
 unsigned *Fast) const {
   if (!VT.isVector()) {
 if (Fast)
-  *Fast = Subtarget.hasFastUnalignedAccess();
-return Subtarget.hasFastUnalignedAccess();
+  *Fast = Subtarget.hasFastUnalignedAccess() ||
+  Subtarget.enableUnalignedScalarMem();
+return Subtarget.hasFastUnalignedAccess() ||
+   Subtarget.enableUnalignedScalarMem();
   }
 
   // All vector implementations must support element alignment
diff --git a/llvm/test/CodeGen/RISCV/memcpy-inline.ll 
b/llvm/test/CodeGen/RISCV/memcpy-inline.ll
index 343695ee37da8..709b8264b5833 100644
--- a/llvm/test/CodeGen/RISCV/memcpy-inline.ll
+++ b/llvm/test/CodeGen/RISCV/memcpy-inline.ll
@@ -7,6 +7,10 @@
 ; RUN:   | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST
 ; RUN: llc < %s -mtriple=riscv64 -mattr=+fast-unaligned-access \
 ; RUN:   | FileCheck %s --check-prefixes=RV64-BOTH,RV64-FAST
+; RUN: llc < %s -mtriple=riscv32 -mattr=+unaligned-scalar-mem \
+; RUN:   | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST
+; RUN: llc < %s -mtriple=riscv64 -mattr=+unaligned-scalar-mem \
+; RUN:   | FileCheck %s --check-prefixes=RV64-BOTH,RV64-FAST
 
 ; --
 ; Fully unaligned cases
diff --git a/llvm/test/CodeGen/RISCV/memcpy.ll 
b/llvm/test/CodeGen/RISCV/memcpy.ll
index 12ec0881b20d9..f8f5d25947d7f 100644
--- a/llvm/test/CodeGen/RISCV/memcpy.ll
+++ b/llvm/test/CodeGen/RISCV/memcpy.ll
@@ -7,6 +7,10 @@
 ; RUN:   | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST
 ; RUN: llc < %s -mtriple=riscv64 

[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)

2024-05-17 Thread Eli Friedman via llvm-branch-commits

https://github.com/efriedma-quic approved this pull request.

LGTM

This only affects Arm64EC targets, the fixes are relatively small, and this 
affects correctness of generated thunks.

https://github.com/llvm/llvm-project/pull/92580


[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: Daniel Paoliello (dpaoliello)


Changes

Backports !90115 and !92326

Release notes:
Fixes issues where LLVM either generated an incorrect thunk for a function 
with aligned parameters or did not correctly pass through the return value when 
`StructRet` was used.

---
Full diff: https://github.com/llvm/llvm-project/pull/92580.diff


3 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp (+12-4) 
- (modified) llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll (+22-14) 
- (modified) llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll (+5-5) 


``diff
diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
index 55c5bbc66a3f4..862aefe46193d 100644
--- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
@@ -181,13 +181,14 @@ void AArch64Arm64ECCallLowering::getThunkArgTypes(
   }
 
   for (unsigned E = FT->getNumParams(); I != E; ++I) {
-Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne();
 #if 0
 // FIXME: Need more information about argument size; see
 // https://reviews.llvm.org/D132926
 uint64_t ArgSizeBytes = AttrList.getParamArm64ECArgSizeBytes(I);
+Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne();
 #else
 uint64_t ArgSizeBytes = 0;
+Align ParamAlign = Align();
 #endif
 Type *Arm64Ty, *X64Ty;
 canonicalizeThunkType(FT->getParamType(I), ParamAlign,
@@ -297,7 +298,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType(
 uint64_t TotalSizeBytes = ElementCnt * ElementSizePerBytes;
 if (ElementTy->isFloatTy() || ElementTy->isDoubleTy()) {
   Out << (ElementTy->isFloatTy() ? "F" : "D") << TotalSizeBytes;
-  if (Alignment.value() >= 8 && !T->isPointerTy())
+  if (Alignment.value() >= 16 && !Ret)
 Out << "a" << Alignment.value();
   Arm64Ty = T;
   if (TotalSizeBytes <= 8) {
@@ -328,7 +329,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType(
   Out << "m";
   if (TypeSize != 4)
 Out << TypeSize;
-  if (Alignment.value() >= 8 && !T->isPointerTy())
+  if (Alignment.value() >= 16 && !Ret)
 Out << "a" << Alignment.value();
   // FIXME: Try to canonicalize Arm64Ty more thoroughly?
   Arm64Ty = T;
@@ -513,7 +514,14 @@ Function 
*AArch64Arm64ECCallLowering::buildEntryThunk(Function *F) {
   // Call the function passed to the thunk.
   Value *Callee = Thunk->getArg(0);
   Callee = IRB.CreateBitCast(Callee, PtrTy);
-  Value *Call = IRB.CreateCall(Arm64Ty, Callee, Args);
+  CallInst *Call = IRB.CreateCall(Arm64Ty, Callee, Args);
+
+  auto SRetAttr = F->getAttributes().getParamAttr(0, Attribute::StructRet);
+  auto InRegAttr = F->getAttributes().getParamAttr(0, Attribute::InReg);
+  if (SRetAttr.isValid() && !InRegAttr.isValid()) {
+Thunk->addParamAttr(1, SRetAttr);
+Call->addParamAttr(0, SRetAttr);
+  }
 
   Value *RetVal = Call;
   if (TransformDirectToSRet) {
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
index bb9ba05f7a272..e9556b9d5cbee 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
@@ -222,12 +222,12 @@ define i8 @matches_has_sret() nounwind {
 }
 
 %TSRet = type { i64, i64 }
-define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind {
-; CHECK-LABEL:.def$ientry_thunk$cdecl$m16a32$v;
-; CHECK:  .section
.wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16a32$v
+define void @has_aligned_sret(ptr align 32 sret(%TSRet), i32) nounwind {
+; CHECK-LABEL:.def$ientry_thunk$cdecl$m16$i8;
+; CHECK:  .section
.wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$i8
 ; CHECK:  // %bb.0:
-; CHECK-NEXT: stp q6, q7, [sp, #-176]! // 32-byte Folded Spill
-; CHECK-NEXT: .seh_save_any_reg_px q6, 176
+; CHECK-NEXT: stp q6, q7, [sp, #-192]! // 32-byte Folded Spill
+; CHECK-NEXT: .seh_save_any_reg_px q6, 192
 ; CHECK-NEXT: stp q8, q9, [sp, #32]   // 32-byte Folded 
Spill
 ; CHECK-NEXT: .seh_save_any_reg_p q8, 32
 ; CHECK-NEXT: stp q10, q11, [sp, #64] // 32-byte Folded 
Spill
@@ -236,17 +236,25 @@ define void @has_aligned_sret(ptr align 32 sret(%TSRet)) 
nounwind {
 ; CHECK-NEXT: .seh_save_any_reg_p q12, 96
 ; CHECK-NEXT: stp q14, q15, [sp, #128]// 32-byte Folded 
Spill
 ; CHECK-NEXT: .seh_save_any_reg_p q14, 128
-; CHECK-NEXT: stp x29, x30, [sp, #160]// 16-byte Folded 
Spill
-; CHECK-NEXT: .seh_save_fplr  160
-; CHECK-NEXT: add x29, sp, #160
-; CHECK-NEXT: .seh_add_fp 160
+; CHECK-NEXT: str x19, [sp, #160] // 8-byte Folded 
Spill
+; CHECK-NEXT: .seh_save_reg   x19, 160
+; CHECK-NEXT:   

[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)

2024-05-17 Thread Daniel Paoliello via llvm-branch-commits

https://github.com/dpaoliello created 
https://github.com/llvm/llvm-project/pull/92580

Backports !90115 and !92326

Release notes:
Fixes issues where LLVM either generated an incorrect thunk for a function 
with aligned parameters or did not correctly pass through the return value when 
`StructRet` was used.

>From 5e0477fafd6aa8ea8451a7ea4968f407ca893aef Mon Sep 17 00:00:00 2001
From: Eli Friedman 
Date: Fri, 26 Apr 2024 11:06:11 -0700
Subject: [PATCH 1/2] [Arm64EC] Improve alignment mangling in arm64ec thunks.
 (#90115)

In some cases, MSVC's mangling for arm64ec thunks includes the alignment
of a struct. I added some code to try to match... but it never really
worked right. The issues:

- Alignment is only mangled if it's 16 or more (I guess the default is
supposed to be 8).
- Alignment isn't mangled on return values (since the memory is
allocated by the caller).

The current patch leaves hooks to make alignment mangling work... but
doesn't actually ever mangle alignment: clang never actually encodes a
relevant alignment into the IR. Once we get clang to emit the real
size/alignment of structs, we can start emitting it.
---
 llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp |  7 ---
 llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll  |  6 +++---
 llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll   | 10 +-
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
index 55c5bbc66a3f4..d4dd28aecac48 100644
--- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
@@ -181,13 +181,14 @@ void AArch64Arm64ECCallLowering::getThunkArgTypes(
   }
 
   for (unsigned E = FT->getNumParams(); I != E; ++I) {
-Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne();
 #if 0
 // FIXME: Need more information about argument size; see
 // https://reviews.llvm.org/D132926
 uint64_t ArgSizeBytes = AttrList.getParamArm64ECArgSizeBytes(I);
+Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne();
 #else
 uint64_t ArgSizeBytes = 0;
+Align ParamAlign = Align();
 #endif
 Type *Arm64Ty, *X64Ty;
 canonicalizeThunkType(FT->getParamType(I), ParamAlign,
@@ -297,7 +298,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType(
 uint64_t TotalSizeBytes = ElementCnt * ElementSizePerBytes;
 if (ElementTy->isFloatTy() || ElementTy->isDoubleTy()) {
   Out << (ElementTy->isFloatTy() ? "F" : "D") << TotalSizeBytes;
-  if (Alignment.value() >= 8 && !T->isPointerTy())
+  if (Alignment.value() >= 16 && !Ret)
 Out << "a" << Alignment.value();
   Arm64Ty = T;
   if (TotalSizeBytes <= 8) {
@@ -328,7 +329,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType(
   Out << "m";
   if (TypeSize != 4)
 Out << TypeSize;
-  if (Alignment.value() >= 8 && !T->isPointerTy())
+  if (Alignment.value() >= 16 && !Ret)
 Out << "a" << Alignment.value();
   // FIXME: Try to canonicalize Arm64Ty more thoroughly?
   Arm64Ty = T;
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
index bb9ba05f7a272..c00c9bfe127e8 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
@@ -223,8 +223,8 @@ define i8 @matches_has_sret() nounwind {
 
 %TSRet = type { i64, i64 }
 define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind {
-; CHECK-LABEL:.def$ientry_thunk$cdecl$m16a32$v;
-; CHECK:  .section
.wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16a32$v
+; CHECK-LABEL:.def$ientry_thunk$cdecl$m16$v;
+; CHECK:  .section
.wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$v
 ; CHECK:  // %bb.0:
 ; CHECK-NEXT: stp q6, q7, [sp, #-176]!// 32-byte Folded 
Spill
 ; CHECK-NEXT: .seh_save_any_reg_pxq6, 176
@@ -457,7 +457,7 @@ define %T2 @simple_struct(%T1 %0, %T2 %1, %T3, %T4) 
nounwind {
 ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$i8$v
 ; CHECK-NEXT: .word   1
 ; CHECK-NEXT: .symidx "#has_aligned_sret"
-; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16a32$v
+; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16$v
 ; CHECK-NEXT: .word   1
 ; CHECK-NEXT: .symidx "#small_array"
 ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m2$m2F8
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
index 3b911e78aff2a..7a40fcd85ac58 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
@@ -236,8 +236,8 @@ declare void @has_sret(ptr sret([100 x i8])) nounwind;
 
 %TSRet = type { i64, i64 }
 declare void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind;
-; CHECK-LABEL:.def$iexit_thunk$cdecl$m16a32$v;
-; CHECK:  .section

[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)

2024-05-17 Thread Daniel Paoliello via llvm-branch-commits

https://github.com/dpaoliello milestoned 
https://github.com/llvm/llvm-project/pull/92580


[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)

2024-05-17 Thread Philip Reames via llvm-branch-commits

preames wrote:

I'm fine with this approach.  No strong opinion either way, but definitely 
don't let my previous comments be blocking here.

https://github.com/llvm/llvm-project/pull/92143


[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)

2024-05-17 Thread Sergio Afonso via llvm-branch-commits

skatrak wrote:

Here's a link to the RFC about this proposal, with links to all related PRs: 
https://discourse.llvm.org/t/rfc-clause-based-representation-of-openmp-dialect-operations/79053

https://github.com/llvm/llvm-project/pull/92519


[llvm-branch-commits] [llvm] release/18.x: InstCombine: Process addrspacecast uses in PointerReplacer (#91953) (PR #92479)

2024-05-17 Thread via llvm-branch-commits

https://github.com/AtariDreams closed 
https://github.com/llvm/llvm-project/pull/92479


[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)

2024-05-17 Thread Nikita Popov via llvm-branch-commits

nikic wrote:

The approach looks reasonable to me. 

https://github.com/llvm/llvm-project/pull/92143


[llvm-branch-commits] [llvm] release/18.x: InstCombine: Process addrspacecast uses in PointerReplacer (#91953) (PR #92479)

2024-05-17 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic requested changes to this pull request.

There are test failures.

Generally I don't have enough confidence in this change for a last-minute 
backport.

https://github.com/llvm/llvm-project/pull/92479


[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)

2024-05-17 Thread Sergio Afonso via llvm-branch-commits

skatrak wrote:

> I like the idea, but also have some questions:
> 
> 1. What does an operation definition look like with all its clauses 
> defined? Does it have to be repeated for each operation supporting the same 
> clause(s)?

You can see what things would look like in #92523, but I can copy one example 
here:

```tablegen
def TeamsOp : OpenMP_Op<"teams", traits = [
    AttrSizedOperandSegments, RecursiveMemoryEffects
  ], clauses = [
    OpenMP_NumTeamsClause, OpenMP_IfClause, OpenMP_ThreadLimitClause,
    OpenMP_AllocateClause, OpenMP_ReductionClause
  ], singleRegion = true> {
  let summary = "teams construct";
  let description = [{
    The teams construct defines a region of code that triggers the creation of a
    league of teams. Once created, the number of teams remains constant for the
    duration of its code region.
  }] # clausesDescription;

  let builders = [
    OpBuilder<(ins CArg<"const TeamsClauseOps &">:$clauses)>
  ];

  let hasVerifier = 1;
}
```
> 
> 2. It seems this requires all properties of a clause to be specified in its 
> constructor. Wouldn't it be better if you could subclass `OpenMP_Clause` and 
> in there overwrite all the properties that are non-default?
> 

Actually, this only requires the list of clauses themselves to be passed in to 
the definition of the `OpenMP_Op`. The clauses are the ones that define all 
these properties, and they do all subclass `OpenMP_Clause` (see #92521).

> 3. Have you considered defining/deriving all this info from OpenMP.td 
> of LLVMFrontend? Clause information should be language-independent.

I get that, but the issue is that the information being used here is very much 
specific to the MLIR dialect, and OMP.td in LLVMFrontend doesn't seem like the 
right place to define these things, since it's outside of the MLIR project. 
Maybe descriptions could be moved there, but then we'd need some sort of 
matching system between the clauses in LLVMFrontend and the ones in MLIR. I'm 
not sure how feasible that would be.

https://github.com/llvm/llvm-project/pull/92519


[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)

2024-05-17 Thread Michael Kruse via llvm-branch-commits

Meinersbur wrote:

I like the idea, but also have some questions:
1. What does an operation definition look like with all its clauses defined? 
Does it have to be repeated for each operation supporting the same clause(s)?
2. It seems this requires all properties of a clause to be specified in its 
constructor. Wouldn't it be better if you could subclass `OpenMP_Clause` and in 
there overwrite all the properties that are non-default?
3. Have you considered defining/deriving all this info from OpenMP.td of 
LLVMFrontend? Clause information should be language-independent.

https://github.com/llvm/llvm-project/pull/92519


[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)

2024-05-17 Thread Kiran Chandramohan via llvm-branch-commits

kiranchandramohan wrote:

@skatrak Could you copy the summary of the patch and create an RFC?

https://github.com/llvm/llvm-project/pull/92519


[llvm-branch-commits] [flang] [Flang][OpenMP] Update flang with changes to the OpenMP dialect (PR #92524)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-openmp

Author: Sergio Afonso (skatrak)


Changes

This patch applies fixes after the updates to OpenMP clause operands, as well 
as updating some tests that were impacted by changes to the ordering or 
assembly format of some clauses in MLIR.

---

Patch is 20.82 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/92524.diff


10 Files Affected:

- (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+2-2) 
- (modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+2-2) 
- (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+10-9) 
- (modified) flang/test/Lower/OpenMP/atomic-capture.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/copyin-order.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/parallel-wsloop.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/parallel.f90 (+12-12) 
- (modified) flang/test/Lower/OpenMP/simd.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/target.f90 (+12-12) 
- (modified) flang/test/Lower/OpenMP/use-device-ptr-to-use-device-addr.f90 
(+1-1) 


``diff
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp 
b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index b7198c951c8fe..357cc09bfb445 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -997,7 +997,7 @@ bool ClauseProcessor::processEnter(
 }
 
 bool ClauseProcessor::processUseDeviceAddr(
-mlir::omp::UseDeviceClauseOps ,
+mlir::omp::UseDeviceAddrClauseOps ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl )
@@ -1011,7 +1011,7 @@ bool ClauseProcessor::processUseDeviceAddr(
 }
 
 bool ClauseProcessor::processUseDevicePtr(
-mlir::omp::UseDeviceClauseOps ,
+mlir::omp::UseDevicePtrClauseOps ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl )
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h 
b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index 78c148ab02163..220ea7b6d9920 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -128,13 +128,13 @@ class ClauseProcessor {
 mlir::omp::ReductionClauseOps ) const;
   bool processTo(llvm::SmallVectorImpl ) 
const;
   bool
-  processUseDeviceAddr(mlir::omp::UseDeviceClauseOps ,
+  processUseDeviceAddr(mlir::omp::UseDeviceAddrClauseOps ,
llvm::SmallVectorImpl ,
llvm::SmallVectorImpl ,
llvm::SmallVectorImpl
) const;
   bool
-  processUseDevicePtr(mlir::omp::UseDeviceClauseOps ,
+  processUseDevicePtr(mlir::omp::UseDevicePtrClauseOps ,
   llvm::SmallVectorImpl ,
   llvm::SmallVectorImpl ,
   llvm::SmallVectorImpl
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp 
b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 44011ad78f2e2..2f612dd6f2fb6 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -239,7 +239,8 @@ 
createAndSetPrivatizedLoopVar(Fortran::lower::AbstractConverter ,
 //  clause. Support for such list items in a use_device_ptr clause
 //  is deprecated."
 static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
-mlir::omp::UseDeviceClauseOps ,
+llvm::SmallVectorImpl ,
+llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl
@@ -252,10 +253,9 @@ static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
 
   // Iterate over our use_device_ptr list and shift all non-cptr arguments into
   // use_device_addr.
-  for (auto *it = clauseOps.useDevicePtrVars.begin();
-   it != clauseOps.useDevicePtrVars.end();) {
+  for (auto *it = useDevicePtrVars.begin(); it != useDevicePtrVars.end();) {
 if (!fir::isa_builtin_cptr_type(fir::unwrapRefType(it->getType()))) {
-  clauseOps.useDeviceAddrVars.push_back(*it);
+  useDeviceAddrVars.push_back(*it);
   // We have to shuffle the symbols around as well, to maintain
   // the correct Input -> BlockArg for use_device_ptr/use_device_addr.
   // NOTE: However, as map's do not seem to be included currently
@@ -263,11 +263,11 @@ static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
   // future alterations. I believe the reason they are not currently
   // is that the BlockArg assign/lowering needs to be extended
   // to a greater set of types.
-  auto idx = std::distance(clauseOps.useDevicePtrVars.begin(), it);
+  auto idx = std::distance(useDevicePtrVars.begin(), it);
   moveElementToBack(idx, useDeviceTypes);
   moveElementToBack(idx, useDeviceLocs);
   moveElementToBack(idx, useDeviceSymbols);
-  it = clauseOps.useDevicePtrVars.erase(it);
+  it = useDevicePtrVars.erase(it);
   continue;
 }
 ++it;
@@ -1005,7 +1005,7 @@ 
genCriticalDeclareClauses(Fortran::lower::AbstractConverter 

[llvm-branch-commits] [flang] [Flang][OpenMP] Update flang with changes to the OpenMP dialect (PR #92524)

2024-05-17 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak created 
https://github.com/llvm/llvm-project/pull/92524

This patch applies fixes after the updates to OpenMP clause operands, as well 
as updating some tests that were impacted by changes to the ordering or 
assembly format of some clauses in MLIR.

>From 522812fb4354812e3bcfaf1b1e52dfa9e0db05ae Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Fri, 17 May 2024 11:38:36 +0100
Subject: [PATCH] [Flang][OpenMP] Update flang with changes to the OpenMP
 dialect

This patch applies fixes after the updates to OpenMP clause operands, as well
as updating some tests that were impacted by changes to the ordering or
assembly format of some clauses in MLIR.
---
 flang/lib/Lower/OpenMP/ClauseProcessor.cpp|  4 ++--
 flang/lib/Lower/OpenMP/ClauseProcessor.h  |  4 ++--
 flang/lib/Lower/OpenMP/OpenMP.cpp | 19 ---
 flang/test/Lower/OpenMP/atomic-capture.f90|  2 +-
 flang/test/Lower/OpenMP/copyin-order.f90  |  2 +-
 flang/test/Lower/OpenMP/parallel-wsloop.f90   |  2 +-
 flang/test/Lower/OpenMP/parallel.f90  | 24 +--
 flang/test/Lower/OpenMP/simd.f90  |  2 +-
 flang/test/Lower/OpenMP/target.f90| 24 +--
 .../use-device-ptr-to-use-device-addr.f90 |  2 +-
 10 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp 
b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index b7198c951c8fe..357cc09bfb445 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -997,7 +997,7 @@ bool ClauseProcessor::processEnter(
 }
 
 bool ClauseProcessor::processUseDeviceAddr(
-mlir::omp::UseDeviceClauseOps ,
+mlir::omp::UseDeviceAddrClauseOps ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl )
@@ -1011,7 +1011,7 @@ bool ClauseProcessor::processUseDeviceAddr(
 }
 
 bool ClauseProcessor::processUseDevicePtr(
-mlir::omp::UseDeviceClauseOps ,
+mlir::omp::UseDevicePtrClauseOps ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl )
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h 
b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index 78c148ab02163..220ea7b6d9920 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -128,13 +128,13 @@ class ClauseProcessor {
 mlir::omp::ReductionClauseOps ) const;
   bool processTo(llvm::SmallVectorImpl ) 
const;
   bool
-  processUseDeviceAddr(mlir::omp::UseDeviceClauseOps ,
+  processUseDeviceAddr(mlir::omp::UseDeviceAddrClauseOps ,
llvm::SmallVectorImpl ,
llvm::SmallVectorImpl ,
llvm::SmallVectorImpl
) const;
   bool
-  processUseDevicePtr(mlir::omp::UseDeviceClauseOps ,
+  processUseDevicePtr(mlir::omp::UseDevicePtrClauseOps ,
   llvm::SmallVectorImpl ,
   llvm::SmallVectorImpl ,
   llvm::SmallVectorImpl
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp 
b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 44011ad78f2e2..2f612dd6f2fb6 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -239,7 +239,8 @@ 
createAndSetPrivatizedLoopVar(Fortran::lower::AbstractConverter ,
 //  clause. Support for such list items in a use_device_ptr clause
 //  is deprecated."
 static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
-mlir::omp::UseDeviceClauseOps ,
+llvm::SmallVectorImpl ,
+llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl ,
 llvm::SmallVectorImpl
@@ -252,10 +253,9 @@ static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
 
   // Iterate over our use_device_ptr list and shift all non-cptr arguments into
   // use_device_addr.
-  for (auto *it = clauseOps.useDevicePtrVars.begin();
-   it != clauseOps.useDevicePtrVars.end();) {
+  for (auto *it = useDevicePtrVars.begin(); it != useDevicePtrVars.end();) {
 if (!fir::isa_builtin_cptr_type(fir::unwrapRefType(it->getType()))) {
-  clauseOps.useDeviceAddrVars.push_back(*it);
+  useDeviceAddrVars.push_back(*it);
   // We have to shuffle the symbols around as well, to maintain
   // the correct Input -> BlockArg for use_device_ptr/use_device_addr.
   // NOTE: However, as map's do not seem to be included currently
@@ -263,11 +263,11 @@ static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
   // future alterations. I believe the reason they are not currently
   // is that the BlockArg assign/lowering needs to be extended
   // to a greater set of types.
-  auto idx = std::distance(clauseOps.useDevicePtrVars.begin(), it);
+  auto idx = std::distance(useDevicePtrVars.begin(), it);
   moveElementToBack(idx, useDeviceTypes);
   moveElementToBack(idx, useDeviceLocs);
   moveElementToBack(idx, 
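
The loop in the hunk above erases from `useDevicePtrVars` while iterating and keeps the parallel lists in step through `moveElementToBack`. A minimal standalone sketch of that pattern follows. It uses `std::vector` and an odd/even test as stand-ins for the MLIR containers and the `isa_builtin_cptr_type` check, and the body of `moveElementToBack` is an assumption, since the patch does not show it; `promoteOdd` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Assumed behaviour of the helper: move the element at `idx` to the end while
// preserving the relative order of the remaining elements.
template <typename T>
void moveElementToBack(std::size_t idx, std::vector<T> &v) {
  auto first = v.begin() + static_cast<std::ptrdiff_t>(idx);
  std::rotate(first, first + 1, v.end());
}

// Moves every odd value from `ptrVars` to `addrVars`. `names` is a parallel
// list ordered as "all ptr entries, then all addr entries", so the matching
// name is moved to the back each time a value is promoted.
void promoteOdd(std::vector<int> &ptrVars, std::vector<int> &addrVars,
                std::vector<std::string> &names) {
  for (auto it = ptrVars.begin(); it != ptrVars.end();) {
    if (*it % 2 != 0) {
      addrVars.push_back(*it);
      auto idx = static_cast<std::size_t>(std::distance(ptrVars.begin(), it));
      moveElementToBack(idx, names);
      // erase() returns the iterator to the next element, so `it` stays valid.
      it = ptrVars.erase(it);
      continue;
    }
    ++it;
  }
}
```

Taking the iterator returned by `erase` instead of incrementing is what keeps the loop valid after each removal.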

[llvm-branch-commits] [mlir] [MLIR][OpenMP] Clause-based OpenMP operation definition (PR #92523)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir-llvm

Author: Sergio Afonso (skatrak)


Changes

This patch updates `OpenMP_Op` definitions to be based on the new set of 
`OpenMP_Clause` definitions, and to take advantage of clause-based 
automatically-generated argument lists, descriptions, assembly format and class 
declarations.

There are also changes introduced to the clause operands structures to match 
the current set of tablegen clause definitions. These two are very closely 
linked and should be kept in sync. It would probably be a good idea to try 
generating clause operands structures from the tablegen `OpenMP_Clause` 
definitions in the future.

As a result of this change, arguments for some operations have been reordered. 
This patch addresses that by updating the affected operation build calls and 
unit tests. It also updates tests affected by the new order of arguments in 
the resulting assembly format, and others affected by previous 
inconsistencies in the printing/parsing of clauses.

The printer and parser functions for the `map` clause are updated, so that they 
are able to handle `map` clauses linked to entry block arguments as well as 
those which aren't.
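
As a rough picture of the clause operands structures mentioned above, the sketch below composes per-clause structures through the variadic `Clauses` mixin that appears in `OpenMPClauseOperands.h`. The `Value` alias, the `ExampleDataClauseOps` combination and the `addDeviceAddr` function are hypothetical stand-ins chosen for illustration; the real structures hold MLIR values and attributes.

```cpp
#include <string>
#include <vector>

// Stand-in for an MLIR value; an assumption made to keep the sketch standalone.
using Value = int;

// One small structure per clause, mirroring the naming used in the patch.
struct UseDeviceAddrClauseOps {
  std::vector<Value> useDeviceAddrVars;
};

struct UseDevicePtrClauseOps {
  std::vector<Value> useDevicePtrVars;
};

struct CriticalNameClauseOps {
  std::string criticalNameAttr;
};

namespace detail {
// Inherits from every listed clause structure, so the combined type exposes
// the fields of all of them.
template <typename... Mixins>
struct Clauses : public Mixins... {};
} // namespace detail

// Hypothetical combination of two clauses for a single operation.
using ExampleDataClauseOps =
    detail::Clauses<UseDeviceAddrClauseOps, UseDevicePtrClauseOps>;

// Code that handles a single clause can accept just that clause's structure;
// a combined structure converts to it implicitly as a base class.
inline void addDeviceAddr(UseDeviceAddrClauseOps &ops, Value v) {
  ops.useDeviceAddrVars.push_back(v);
}
```

Splitting a combined structure such as the former `UseDeviceClauseOps` into one structure per clause lets each function take only the clause it processes.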

---

Patch is 106.52 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/92523.diff


11 Files Affected:

- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h (+26-10) 
- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+298-846) 
- (modified) mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp (+2-1) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+50-28) 
- (modified) 
mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+1-1) 
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+10-10) 
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+25-26) 
- (modified) mlir/test/Target/LLVMIR/omptarget-llvm.mlir (+3-3) 
- (modified) mlir/test/Target/LLVMIR/omptarget-nowait-llvm.mlir (+1-1) 
- (modified) mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir (+2-2) 
- (modified) mlir/test/Target/LLVMIR/openmp-llvm.mlir (+1-1) 


``diff
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h
index 244cee1dd635b..bd0d44f932981 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h
@@ -39,6 +39,10 @@ struct AllocateClauseOps {
   llvm::SmallVector allocatorVars, allocateVars;
 };
 
+struct CancelDirectiveNameClauseOps {
+  ClauseCancellationConstructTypeAttr cancelDirectiveNameAttr;
+};
+
 struct CollapseClauseOps {
   llvm::SmallVector loopLBVar, loopUBVar, loopStepVar;
 };
@@ -48,6 +52,10 @@ struct CopyprivateClauseOps {
   llvm::SmallVector copyprivateFuncs;
 };
 
+struct CriticalNameClauseOps {
+  StringAttr criticalNameAttr;
+};
+
 struct DependClauseOps {
   llvm::SmallVector dependTypeAttrs;
   llvm::SmallVector dependVars;
@@ -84,6 +92,7 @@ struct GrainsizeClauseOps {
 struct HasDeviceAddrClauseOps {
   llvm::SmallVector hasDeviceAddrVars;
 };
+
 struct HintClauseOps {
   IntegerAttr hintAttr;
 };
@@ -117,10 +126,6 @@ struct MergeableClauseOps {
   UnitAttr mergeableAttr;
 };
 
-struct NameClauseOps {
-  StringAttr nameAttr;
-};
-
 struct NogroupClauseOps {
   UnitAttr nogroupAttr;
 };
@@ -209,8 +214,12 @@ struct UntiedClauseOps {
   UnitAttr untiedAttr;
 };
 
-struct UseDeviceClauseOps {
-  llvm::SmallVector useDevicePtrVars, useDeviceAddrVars;
+struct UseDeviceAddrClauseOps {
+  llvm::SmallVector useDeviceAddrVars;
+};
+
+struct UseDevicePtrClauseOps {
+  llvm::SmallVector useDevicePtrVars;
 };
 
 
//===--===//
@@ -225,7 +234,13 @@ template 
 struct Clauses : public Mixins... {};
 } // namespace detail
 
-using CriticalClauseOps = detail::Clauses;
+using CancelClauseOps =
+detail::Clauses;
+
+using CancellationPointClauseOps =
+detail::Clauses;
+
+using CriticalClauseOps = detail::Clauses;
 
 // TODO `indirect` clause.
 using DeclareTargetClauseOps = detail::Clauses;
@@ -264,10 +279,11 @@ using TargetClauseOps =
 detail::Clauses;
+PrivateClauseOps, ThreadLimitClauseOps>;
 
-using TargetDataClauseOps = detail::Clauses;
+using TargetDataClauseOps =
+detail::Clauses;
 
 using TargetEnterExitUpdateDataClauseOps =
 detail::Clauses {
 
//===--===//
 
 def ParallelOp : OpenMP_Op<"parallel", [
- AutomaticAllocationScope, AttrSizedOperandSegments,
- DeclareOpInterfaceMethods,
- DeclareOpInterfaceMethods,
- RecursiveMemoryEffects, ReductionClauseInterface]> {
+AttrSizedOperandSegments, AutomaticAllocationScope,
+DeclareOpInterfaceMethods,
+DeclareOpInterfaceMethods,
+RecursiveMemoryEffects
+  ], [
+// TODO: Sort clauses alphabetically.
+ 

[llvm-branch-commits] [mlir] [MLIR][OpenMP] Clause-based OpenMP operation definition (PR #92523)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir

Author: Sergio Afonso (skatrak)



[llvm-branch-commits] [mlir] [MLIR][OpenMP] Clause-based OpenMP operation definition (PR #92523)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-openmp

Author: Sergio Afonso (skatrak)



[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add `OpenMP_Clause` tablegen definitions (PR #92521)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir

Author: Sergio Afonso (skatrak)


Changes

This patch adds a new tablegen file for the OpenMP dialect containing the list 
of clauses currently supported.

---

Patch is 44.02 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/92521.diff


1 Files Affected:

- (added) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+1183) 


``diff
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
new file mode 100644
index 0..8b3a53a5842f3
--- /dev/null
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -0,0 +1,1183 @@
+//=== OpenMPClauses.td - OpenMP dialect clause definitions -*- tablegen -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file contains clause definitions for the OpenMP dialect.
+//
+// For each "Xyz" clause, there is an "OpenMP_XyzClauseSkip" class and an
+// "OpenMP_XyzClause" definition. The latter is an instantiation of the former
+// where all "skip" template parameters are set to `false` and should be the
+// preferred variant to use whenever possible when defining `OpenMP_Op`
+// instances.
+//
+//===--===//
+
+#ifndef OPENMP_CLAUSES
+#define OPENMP_CLAUSES
+
+include "mlir/Dialect/OpenMP/OpenMPOpBase.td"
+
+//===--===//
+// V5.2: [5.11] `aligned` clause
+//===--===//
+
+class OpenMP_AlignedClauseSkip<
+bit traits = false, bit arguments = false, bit assemblyFormat = false,
+bit description = false, bit extraClassDeclaration = false
+  > : OpenMP_Clause {
+  let arguments = (ins
+Variadic:$aligned_vars,
+OptionalAttr:$alignment_values
+  );
+
+  let assemblyFormat = [{
+`aligned` `(` custom($aligned_vars, type($aligned_vars),
+$alignment_values) `)`
+  }];
+
+  let description = [{
+The `alignment_values` attribute additionally specifies alignment of each
+corresponding aligned operand. Note that `aligned_vars` and
+`alignment_values` should contain the same number of elements.
+  }];
+}
+
+def OpenMP_AlignedClause : OpenMP_AlignedClauseSkip<>;
+
+//===--===//
+// V5.2: [6.6] `allocate` clause
+//===--===//
+
+class OpenMP_AllocateClauseSkip<
+bit traits = false, bit arguments = false, bit assemblyFormat = false,
+bit description = false, bit extraClassDeclaration = false
+  > : OpenMP_Clause {
+  let arguments = (ins
+Variadic:$allocate_vars,
+Variadic:$allocators_vars
+  );
+
+  let assemblyFormat = [{
+`allocate` `(`
+  custom($allocate_vars, type($allocate_vars),
+   $allocators_vars, type($allocators_vars)) 
`)`
+  }];
+
+  let description = [{
+The `allocators_vars` and `allocate_vars` parameters are a variadic list of
+values that specify the memory allocator to be used to obtain storage for
+private values.
+  }];
+}
+
+def OpenMP_AllocateClause : OpenMP_AllocateClauseSkip<>;
+
+//===--===//
+// V5.2: [16.1, 16.2] `cancel-directive-name` clause set
+//===--===//
+
+class OpenMP_CancelDirectiveNameClauseSkip<
+bit traits = false, bit arguments = false, bit assemblyFormat = false,
+bit description = false, bit extraClassDeclaration = false
+  > : OpenMP_Clause {
+  let arguments = (ins
+CancellationConstructTypeAttr:$cancellation_construct_type_val
+  );
+
+  let assemblyFormat = [{
+`cancellation_construct_type` `(`
+  custom($cancellation_construct_type_val) `)`
+  }];
+
+  // TODO: Add description.
+}
+
+def OpenMP_CancelDirectiveNameClause : OpenMP_CancelDirectiveNameClauseSkip<>;
+
+//===--===//
+// V5.2: [4.4.3] `collapse` clause
+//===--===//
+
+class OpenMP_CollapseClauseSkip<
+bit traits = false, bit arguments = false, bit assemblyFormat = false,
+bit description = false, bit extraClassDeclaration = false
+  > : OpenMP_Clause {
+  let traits = [
+AllTypesMatch<["lowerBound", "upperBound", "step"]>
+  ];
+
+  let arguments = (ins
+Variadic:$lowerBound,
+Variadic:$upperBound,
+Variadic:$step
+  );
+
+  let extraClassDeclaration = [{
+/// Returns the number 

[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add `OpenMP_Clause` tablegen definitions (PR #92521)

2024-05-17 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak created 
https://github.com/llvm/llvm-project/pull/92521

This patch adds a new tablegen file for the OpenMP dialect containing the list 
of clauses currently supported.

>From e1aa6cb890dfc8f7f03fade845cff45a163201ff Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Fri, 17 May 2024 10:56:32 +0100
Subject: [PATCH] [MLIR][OpenMP] Add `OpenMP_Clause` tablegen definitions

This patch adds a new tablegen file for the OpenMP dialect containing the list
of clauses currently supported.
---
 .../mlir/Dialect/OpenMP/OpenMPClauses.td  | 1183 +
 1 file changed, 1183 insertions(+)
 create mode 100644 mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td

diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
new file mode 100644
index 0..8b3a53a5842f3
--- /dev/null
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -0,0 +1,1183 @@
+//=== OpenMPClauses.td - OpenMP dialect clause definitions -*- tablegen -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// This file contains clause definitions for the OpenMP dialect.
+//
+// For each "Xyz" clause, there is an "OpenMP_XyzClauseSkip" class and an
+// "OpenMP_XyzClause" definition. The latter is an instantiation of the former
+// where all "skip" template parameters are set to `false` and should be the
+// preferred variant to use whenever possible when defining `OpenMP_Op`
+// instances.
+//
+//===--===//
+
+#ifndef OPENMP_CLAUSES
+#define OPENMP_CLAUSES
+
+include "mlir/Dialect/OpenMP/OpenMPOpBase.td"
+
+//===--===//
+// V5.2: [5.11] `aligned` clause
+//===--===//
+
+class OpenMP_AlignedClauseSkip<
+bit traits = false, bit arguments = false, bit assemblyFormat = false,
+bit description = false, bit extraClassDeclaration = false
+  > : OpenMP_Clause {
+  let arguments = (ins
+Variadic:$aligned_vars,
+OptionalAttr:$alignment_values
+  );
+
+  let assemblyFormat = [{
+`aligned` `(` custom($aligned_vars, type($aligned_vars),
+$alignment_values) `)`
+  }];
+
+  let description = [{
+The `alignment_values` attribute additionally specifies alignment of each
+corresponding aligned operand. Note that `aligned_vars` and
+`alignment_values` should contain the same number of elements.
+  }];
+}
+
+def OpenMP_AlignedClause : OpenMP_AlignedClauseSkip<>;
+
+//===--===//
+// V5.2: [6.6] `allocate` clause
+//===--===//
+
+class OpenMP_AllocateClauseSkip<
+bit traits = false, bit arguments = false, bit assemblyFormat = false,
+bit description = false, bit extraClassDeclaration = false
+  > : OpenMP_Clause {
+  let arguments = (ins
+Variadic:$allocate_vars,
+Variadic:$allocators_vars
+  );
+
+  let assemblyFormat = [{
+`allocate` `(`
+  custom($allocate_vars, type($allocate_vars),
+   $allocators_vars, type($allocators_vars)) 
`)`
+  }];
+
+  let description = [{
+The `allocators_vars` and `allocate_vars` parameters are a variadic list of
+values that specify the memory allocator to be used to obtain storage for
+private values.
+  }];
+}
+
+def OpenMP_AllocateClause : OpenMP_AllocateClauseSkip<>;
+
+//===--===//
+// V5.2: [16.1, 16.2] `cancel-directive-name` clause set
+//===--===//
+
+class OpenMP_CancelDirectiveNameClauseSkip<
+bit traits = false, bit arguments = false, bit assemblyFormat = false,
+bit description = false, bit extraClassDeclaration = false
+  > : OpenMP_Clause {
+  let arguments = (ins
+CancellationConstructTypeAttr:$cancellation_construct_type_val
+  );
+
+  let assemblyFormat = [{
+`cancellation_construct_type` `(`
+  custom($cancellation_construct_type_val) `)`
+  }];
+
+  // TODO: Add description.
+}
+
+def OpenMP_CancelDirectiveNameClause : OpenMP_CancelDirectiveNameClauseSkip<>;
+
+//===--===//
+// V5.2: [4.4.3] `collapse` clause
+//===--===//
+
+class OpenMP_CollapseClauseSkip<
+bit traits = false, bit arguments = false, bit assemblyFormat = false,
+bit description = false, bit extraClassDeclaration = false
+  > : 

[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-flang-openmp

@llvm/pr-subscribers-mlir

Author: Sergio Afonso (skatrak)


Changes

Currently, OpenMP operations are defined independently of each other. However, 
one property of the OpenMP specification is that many clauses can be applied to 
multiple constructs.

Keeping the MLIR representation of clauses consistent across all operations 
that can accept them is important, but since this information is scattered into 
multiple operation definitions, it is currently prone to divergence as new 
features and changes are added to the dialect. Furthermore, centralizing this 
information allows for a single source of truth and avoids redundancy in the 
dialect.

The proposal in this patch is to make OpenMP clauses independent top level 
definitions which can then be passed in a template argument list to OpenMP 
operation definitions, just as it's done for traits. Clauses can define these 
properties, which are joined together in order to make a default initialization 
for the fields of the same name of the OpenMP operation:

- `traits`: Optional. It gets added to the list of traits of the operation.
- `arguments`: Mandatory. It defines how the clause is represented.
- `assemblyFormat`: Optional (though it should almost always be defined). This 
is the declarative definition of the printer/parser for the `arguments`. How 
these are combined depends on whether this is an optional or required clause.
- `description`: Optional. It's used to populate a `clausesDescription` field, 
so each operation definition must still define a `description` itself. That 
field is intended to be appended to the end of the `OpenMP_Op`'s `description`.
- `extraClassDeclaration`: Optional. It can define some C++ code to be added to 
every OpenMP operation that includes that clause.

To give operation definitions fine-grained control over which features of a
given clause are inherited, the `OpenMP_Clause` class takes "skipTraits",
"skipArguments", "skipAssemblyFormat", "skipDescription" and
"skipExtraClassDeclaration" bit template arguments. These are intended to be
used very sparingly, for cases where some of the clauses would otherwise
collide in some way.
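
As a rough illustration of how clause properties could be joined into an operation's defaults, here is a minimal C++ model. It is not the TableGen implementation from the patch: `Clause`, `OpDefaults` and `combineClauses` are hypothetical names, only the `traits` and `description` properties are modelled, and plain concatenation is assumed as the joining rule.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one clause definition. In the patch these are
// TableGen records; only two of the five properties are modelled here.
struct Clause {
  std::vector<std::string> traits;  // optional traits forwarded to the op
  std::string description;          // optional text for clausesDescription
  bool skipTraits = false;          // models the "skipTraits" bit
  bool skipDescription = false;     // models the "skipDescription" bit
};

// Hypothetical result: the default values an operation would start from.
struct OpDefaults {
  std::vector<std::string> traits;
  std::string clausesDescription;
};

// Joins the properties of all clauses, honouring each clause's skip bits.
// Assumption: properties are concatenated in clause order, with the
// operation's own traits first.
inline OpDefaults combineClauses(const std::vector<std::string> &opTraits,
                                 const std::vector<Clause> &clauses) {
  OpDefaults result;
  result.traits = opTraits;
  for (const Clause &clause : clauses) {
    if (!clause.skipTraits)
      result.traits.insert(result.traits.end(), clause.traits.begin(),
                           clause.traits.end());
    if (!clause.skipDescription)
      result.clausesDescription += clause.description;
  }
  return result;
}
```

In this model, an operation that lists a clause with `skipDescription` set still receives that clause's traits, but the clause's text is left out of the combined description. That mirrors the intent that the skip bits inhibit one property at a time.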

---
Full diff: https://github.com/llvm/llvm-project/pull/92519.diff


1 Files Affected:

- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td (+163-2) 


``diff
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td
index b98d87aa74a6f..d93abd63977ef 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td
@@ -42,7 +42,168 @@ def OpenMP_MapBoundsType : OpenMP_Type<"MapBounds", 
"map_bounds_ty"> {
 // Base classes for OpenMP dialect operations.
 
//===--===//
 
-class OpenMP_Op traits = []> :
-  Op;
+// Base class for representing OpenMP clauses.
+//
+// Clauses are meant to be used in a mixin-style pattern to help define OpenMP
+// operations in a scalable way, since often the same clause can be applied to
+// multiple different operations.
+//
+// To keep the representation of clauses consistent across different operations,
+// each clause must define a set of arguments (values and attributes) which will
+// become input arguments of each OpenMP operation that accepts that clause.
+//
+// It is also recommended that an assembly format and description are defined
+// for each clause wherever possible, to make sure they are always printed,
+// parsed and described in the same way.
+//
+// Optionally, operation traits and extra class declarations might be attached
+// to clauses, which will be forwarded to all operations that include them.
+//
+// Each clause must specify whether it's required or optional. This impacts how
+// the `assemblyFormat` for operations including it gets generated.
+//
+// An `OpenMP_Op` can inhibit the inheritance of `traits`, `arguments`,
+// `assemblyFormat`, `description` and `extraClassDeclaration` fields from any
+// given `OpenMP_Clause` by setting to 1 the corresponding "skip" template
+// argument bit.
+class OpenMP_Clause {
+  bit required = isRequired;
+
+  bit ignoreTraits = skipTraits;
+  list traits = [];
+
+  bit ignoreArgs = skipArguments;
+  dag arguments;
+
+  bit ignoreAsmFormat = skipAssemblyFormat;
+  string assemblyFormat = "";
+
+  bit ignoreDesc = skipDescription;
+  string description = "";
+
+  bit ignoreExtraDecl = skipExtraClassDeclaration;
+  string extraClassDeclaration = "";
+}
+
+// Base class for representing OpenMP operations.
+//
+// This is a subclass of the builtin `Op` for the OpenMP dialect. By default,
+// some of its fields are initialized according to the list of OpenMP clauses
+// passed as template argument:
+//   - `traits`: It is a union of the traits list passed as template argument
+// and those inherited from the `traits` field of all 

[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)

2024-05-17 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak created 
https://github.com/llvm/llvm-project/pull/92519

Currently, OpenMP operations are defined independently of each other. However, 
one property of the OpenMP specification is that many clauses can be applied to 
multiple constructs.

Keeping the MLIR representation of clauses consistent across all operations 
that can accept them is important, but since this information is scattered into 
multiple operation definitions, it is currently prone to divergence as new 
features and changes are added to the dialect. Furthermore, centralizing this 
information allows for a single source of truth and avoids redundancy in the 
dialect.

The proposal in this patch is to make OpenMP clauses independent top level 
definitions which can then be passed in a template argument list to OpenMP 
operation definitions, just as it's done for traits. Clauses can define these 
properties, which are joined together in order to make a default initialization 
for the fields of the same name of the OpenMP operation:

- `traits`: Optional. It gets added to the list of traits of the operation.
- `arguments`: Mandatory. It defines how the clause is represented.
- `assemblyFormat`: Optional (though it should almost always be defined). This 
is the declarative definition of the printer/parser for the `arguments`. How 
these are combined depends on whether this is an optional or required clause.
- `description`: Optional. It's used to populate a `clausesDescription` field, 
so each operation definition must still define a `description` itself. That 
field is intended to be appended to the end of the `OpenMP_Op`'s `description`.
- `extraClassDeclaration`: Optional. It can define some C++ code to be added to 
every OpenMP operation that includes that clause.

To give operation definitions fine-grained control over which features of a
given clause are inherited, the `OpenMP_Clause` class takes "skipTraits",
"skipArguments", "skipAssemblyFormat", "skipDescription" and
"skipExtraClassDeclaration" bit template arguments. These are intended to be
used very sparingly, for cases where some of the clauses would otherwise
collide in some way.

>From fec244fb8403d1ebcabe30cd27cf23b1839b0b65 Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Fri, 17 May 2024 10:20:55 +0100
Subject: [PATCH] [MLIR][OpenMP] Support clause-based representation of
 operations

Currently, OpenMP operations are defined independently of each other. However,
one property of the OpenMP specification is that many clauses can be applied to
multiple constructs.

Keeping the MLIR representation of clauses consistent across all operations
that can accept them is important, but since this information is scattered into
multiple operation definitions, it is currently prone to divergence as new
features and changes are added to the dialect. Furthermore, centralizing this
information allows for a single source of truth and avoids redundancy in the
dialect.

The proposal in this patch is to make OpenMP clauses independent top level
definitions which can then be passed in a template argument list to OpenMP
operation definitions, just as it's done for traits. Clauses can define these
properties, which are joined together in order to make a default initialization
for the fields of the same name of the OpenMP operation:

- `traits`: Optional. It gets added to the list of traits of the operation.
- `arguments`: Mandatory. It defines how the clause is represented.
- `assemblyFormat`: Optional (though it should almost always be defined). This
is the declarative definition of the printer/parser for the `arguments`. How
these are combined depends on whether this is an optional or required clause.
- `description`: Optional. It's used to populate a `clausesDescription` field,
so each operation definition must still define a `description` itself. That
field is intended to be appended to the end of the `OpenMP_Op`'s `description`.
- `extraClassDeclaration`: Optional. It can define some C++ code to be added to
every OpenMP operation that includes that clause.

To give operation definitions fine-grained control over which features of a
given clause are inherited, the `OpenMP_Clause` class takes "skipTraits",
"skipArguments", "skipAssemblyFormat", "skipDescription" and
"skipExtraClassDeclaration" bit template arguments. These are intended to be
used very sparingly, for cases where some of the clauses would otherwise
collide in some way.
---
 .../mlir/Dialect/OpenMP/OpenMPOpBase.td   | 165 +-
 1 file changed, 163 insertions(+), 2 deletions(-)

diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td
index b98d87aa74a6f..d93abd63977ef 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td
@@ -42,7 +42,168 @@ def OpenMP_MapBoundsType : OpenMP_Type<"MapBounds", 
"map_bounds_ty"> {
 // Base classes 

[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)

2024-05-17 Thread Luke Lau via llvm-branch-commits

https://github.com/lukel97 approved this pull request.

Chiming in that this seems reasonable to me, given the performance impact of
not having unaligned scalar accesses. And hopefully we can remove this once
we've settled on a proper interface.

https://github.com/llvm/llvm-project/pull/92143
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [serialization] No transitive type change (PR #92511)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-modules

Author: Chuanqi Xu (ChuanqiXu9)


Changes

Follow-up to https://github.com/llvm/llvm-project/pull/92085.

 motivation

The motivation is still to cut off unnecessary changes in the dependency
chain. See the above link (recursively) for details.

This will be the last patch of the `no-transitive-*-change` series. Any
follow-up patches would likely be specific to C++20 named modules, handling
special cases such as `ADL` (see the reply in
https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755/53
for an example), so they won't affect the serialization part as a whole the
way this series did.

 example

After this patch, we are finally able to cut off unnecessary changes of types.
For example:

```

//--- m-partA.cppm
export module m:partA;

//--- m-partA.v1.cppm
export module m:partA;

namespace NS {
class A {
public:
int getValue() {
return 43;
}
};
}

//--- m-partB.cppm
export module m:partB;

export inline int getB() {
return 430;
}

//--- m.cppm
export module m;
export import :partA;
export import :partB;

//--- useBOnly.cppm
export module useBOnly;
import m;

export inline int get() {
return getB();
}
```

The BMI of `useBOnly.cppm` is expected to not change if we only add a new class 
in `m:partA`. This will be pretty useful in practice.

 implementation details

The key idea of this patch is similar to that of the previous patches: extend
the 32-bit type ID to 64 bits so that we can store the module file index in the
higher bits. The encoding of the type ID is then independent of the imported
modules.

But there are two differences from the previous patches:
- TypeID is not purely an index of serialized types: the lower 3 bits store
the qualifiers.
- TypeID doesn't take part in any lookup process, so there are far fewer uses
of TypeID to update than in the previous patches.

The first difference means we need some slightly more complex bit operations,
and the second makes the patch much simpler than the previous ones.
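
The bit layout described above can be sketched as follows. This is a minimal, self-contained illustration rather than the code from the patch: `encodeTypeID`, `decodeTypeID`, `DecodedTypeID` and the constants are hypothetical names, and the widths (3 qualifier bits, 29 bits of per-file type index, 32 bits of module file index) are taken from the description and the updated doc comment.

```cpp
#include <cassert>
#include <cstdint>

using TypeID = std::uint64_t;

// Assumed widths: 3 qualifier bits (const/volatile/restrict), then 29 bits
// of type index local to a module file, then 32 bits of module file index.
constexpr unsigned kQualifierBits = 3;
constexpr unsigned kLocalBits = 32;  // qualifier bits + local type index bits
constexpr std::uint32_t kQualifierMask = (1u << kQualifierBits) - 1;

// Hypothetical decoded view of a 64-bit type ID.
struct DecodedTypeID {
  std::uint32_t moduleFileIndex;
  std::uint32_t typeIndex;
  std::uint32_t qualifiers;
};

// Packs the three fields into one 64-bit ID. The low 32 bits depend only on
// the type's position inside its own module file.
inline TypeID encodeTypeID(std::uint32_t moduleFileIndex,
                           std::uint32_t typeIndex,
                           std::uint32_t qualifiers) {
  const std::uint32_t local =
      (typeIndex << kQualifierBits) | (qualifiers & kQualifierMask);
  return (static_cast<TypeID>(moduleFileIndex) << kLocalBits) | local;
}

// Splits a 64-bit ID back into its three fields.
inline DecodedTypeID decodeTypeID(TypeID id) {
  const std::uint32_t local = static_cast<std::uint32_t>(id & 0xFFFFFFFFull);
  DecodedTypeID decoded;
  decoded.moduleFileIndex = static_cast<std::uint32_t>(id >> kLocalBits);
  decoded.typeIndex = local >> kQualifierBits;
  decoded.qualifiers = local & kQualifierMask;
  return decoded;
}
```

Under these assumptions, the same local type index in two different module files yields two distinct IDs that differ only in the upper 32 bits. This is why the encoding no longer shifts when an unrelated imported module gains new types.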


---

Patch is 28.70 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/92511.diff


11 Files Affected:

- (modified) clang/include/clang/Serialization/ASTBitCodes.h (+24-8) 
- (modified) clang/include/clang/Serialization/ASTReader.h (+11-12) 
- (modified) clang/include/clang/Serialization/ASTRecordReader.h (+1-1) 
- (modified) clang/include/clang/Serialization/ModuleFile.h (-3) 
- (modified) clang/lib/Serialization/ASTReader.cpp (+55-49) 
- (modified) clang/lib/Serialization/ASTWriter.cpp (+18-13) 
- (modified) clang/lib/Serialization/ModuleFile.cpp (-1) 
- (modified) clang/test/Modules/no-transitive-decls-change.cppm (+1-11) 
- (modified) clang/test/Modules/no-transitive-identifier-change.cppm (-3) 
- (added) clang/test/Modules/no-transitive-type-change.cppm (+68) 
- (modified) clang/test/Modules/pr5.cppm (+18-18) 


``diff
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h 
b/clang/include/clang/Serialization/ASTBitCodes.h
index 1fd482b5aff0e..486d5f4042c61 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -26,6 +26,7 @@
 #include "clang/Serialization/SourceLocationEncoding.h"
 #include "llvm/ADT/DenseMapInfo.h"
 #include "llvm/Bitstream/BitCodes.h"
+#include "llvm/Support/MathExtras.h"
 #include 
 #include 
 
@@ -70,38 +71,53 @@ using DeclID = DeclIDBase::DeclID;
 
 /// An ID number that refers to a type in an AST file.
 ///
-/// The ID of a type is partitioned into two parts: the lower
+/// The ID of a type is partitioned into three parts:
+/// - the lower
 /// three bits are used to store the const/volatile/restrict
-/// qualifiers (as with QualType) and the upper bits provide a
-/// type index. The type index values are partitioned into two
+/// qualifiers (as with QualType).
+/// - the upper 29 bits provide a type index in the corresponding
+/// module file.
+/// - the upper 32 bits provide a module file index.
+///
+/// The type index values are partitioned into two
 /// sets. The values below NUM_PREDEF_TYPE_IDs are predefined type
 /// IDs (based on the PREDEF_TYPE_*_ID constants), with 0 as a
 /// placeholder for "no type". Values from NUM_PREDEF_TYPE_IDs are
 /// other types that have serialized representations.
-using TypeID = uint32_t;
+using TypeID = uint64_t;
 
 /// A type index; the type ID with the qualifier bits removed.
+/// Keep structure alignment 32-bit since the blob is assumed as 32-bit
+/// aligned.
 class TypeIdx {
+  uint32_t ModuleFileIndex = 0;
   uint32_t Idx = 0;
 
 public:
   TypeIdx() = default;
-  explicit TypeIdx(uint32_t index) : Idx(index) {}
+  explicit TypeIdx(uint32_t Idx) : ModuleFileIndex(0), Idx(Idx) {}
+
+  explicit TypeIdx(uint32_t ModuleFileIdx, uint32_t Idx)
+  : ModuleFileIndex(ModuleFileIdx), Idx(Idx) {}
+
+  uint32_t 

[llvm-branch-commits] [clang] [serialization] No transitive type change (PR #92511)

2024-05-17 Thread Chuanqi Xu via llvm-branch-commits

https://github.com/ChuanqiXu9 created 
https://github.com/llvm/llvm-project/pull/92511

Follow-up to https://github.com/llvm/llvm-project/pull/92085.

 motivation

The motivation is still to cut off unnecessary changes in the dependency
chain. See the above link (recursively) for details.

This will be the last patch of the `no-transitive-*-change` series. Any
follow-up patches would likely be specific to C++20 named modules, handling
special cases such as `ADL` (see the reply in
https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755/53
for an example), so they won't affect the serialization part as a whole the
way this series did.

 example

After this patch, we are finally able to cut off unnecessary changes of types.
For example:

```

//--- m-partA.cppm
export module m:partA;

//--- m-partA.v1.cppm
export module m:partA;

namespace NS {
class A {
public:
int getValue() {
return 43;
}
};
}

//--- m-partB.cppm
export module m:partB;

export inline int getB() {
return 430;
}

//--- m.cppm
export module m;
export import :partA;
export import :partB;

//--- useBOnly.cppm
export module useBOnly;
import m;

export inline int get() {
return getB();
}
```

The BMI of `useBOnly.cppm` is expected to not change if we only add a new class 
in `m:partA`. This will be pretty useful in practice.

 implementation details

The key idea of this patch is similar to that of the previous patches: extend
the 32-bit type ID to 64 bits so that we can store the module file index in the
higher bits. The encoding of the type ID is then independent of the imported
modules.

But there are two differences from the previous patches:
- TypeID is not purely an index of serialized types: the lower 3 bits store
the qualifiers.
- TypeID doesn't take part in any lookup process, so there are far fewer uses
of TypeID to update than in the previous patches.

The first difference means we need some slightly more complex bit operations,
and the second makes the patch much simpler than the previous ones.


>From 2265f12343f929cc81f2b4fe6d27cc4ff3f31ec2 Mon Sep 17 00:00:00 2001
From: Chuanqi Xu 
Date: Fri, 17 May 2024 14:25:53 +0800
Subject: [PATCH] [serialization] No transitive type change

---
 .../include/clang/Serialization/ASTBitCodes.h |  32 --
 clang/include/clang/Serialization/ASTReader.h |  23 ++--
 .../clang/Serialization/ASTRecordReader.h |   2 +-
 .../include/clang/Serialization/ModuleFile.h  |   3 -
 clang/lib/Serialization/ASTReader.cpp | 104 +-
 clang/lib/Serialization/ASTWriter.cpp |  31 +++---
 clang/lib/Serialization/ModuleFile.cpp|   1 -
 .../Modules/no-transitive-decls-change.cppm   |  12 +-
 .../no-transitive-identifier-change.cppm  |   3 -
 .../Modules/no-transitive-type-change.cppm|  68 
 clang/test/Modules/pr5.cppm   |  36 +++---
 11 files changed, 196 insertions(+), 119 deletions(-)
 create mode 100644 clang/test/Modules/no-transitive-type-change.cppm

diff --git a/clang/include/clang/Serialization/ASTBitCodes.h 
b/clang/include/clang/Serialization/ASTBitCodes.h
index 1fd482b5aff0e..486d5f4042c61 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -26,6 +26,7 @@
 #include "clang/Serialization/SourceLocationEncoding.h"
 #include "llvm/ADT/DenseMapInfo.h"
 #include "llvm/Bitstream/BitCodes.h"
+#include "llvm/Support/MathExtras.h"
 #include 
 #include 
 
@@ -70,38 +71,53 @@ using DeclID = DeclIDBase::DeclID;
 
 /// An ID number that refers to a type in an AST file.
 ///
-/// The ID of a type is partitioned into two parts: the lower
+/// The ID of a type is partitioned into three parts:
+/// - the lower
 /// three bits are used to store the const/volatile/restrict
-/// qualifiers (as with QualType) and the upper bits provide a
-/// type index. The type index values are partitioned into two
+/// qualifiers (as with QualType).
+/// - the upper 29 bits provide a type index in the corresponding
+/// module file.
+/// - the upper 32 bits provide a module file index.
+///
+/// The type index values are partitioned into two
 /// sets. The values below NUM_PREDEF_TYPE_IDs are predefined type
 /// IDs (based on the PREDEF_TYPE_*_ID constants), with 0 as a
 /// placeholder for "no type". Values from NUM_PREDEF_TYPE_IDs are
 /// other types that have serialized representations.
-using TypeID = uint32_t;
+using TypeID = uint64_t;
 
 /// A type index; the type ID with the qualifier bits removed.
+/// Keep structure alignment 32-bit since the blob is assumed as 32-bit
+/// aligned.
 class TypeIdx {
+  uint32_t ModuleFileIndex = 0;
   uint32_t Idx = 0;
 
 public:
   TypeIdx() = default;
-  explicit TypeIdx(uint32_t index) : Idx(index) {}
+  explicit TypeIdx(uint32_t Idx) : ModuleFileIndex(0), Idx(Idx) {}
+
+  explicit TypeIdx(uint32_t ModuleFileIdx, uint32_t Idx)
+  

[llvm-branch-commits] [llvm] [AArch64][PAC] Fix creating check instructions for BBs without an epilog (PR #92508)

2024-05-17 Thread via llvm-branch-commits

llvmbot wrote:

@llvm/pr-subscribers-backend-aarch64

Author: Igor Kudrin (igorkudrin)


Changes

`AArch64PAuth::checkAuthenticatedRegister()` splits the basic block containing 
the tail call instruction to add check instructions, assuming at least one more 
instruction before the call. This assumption is incorrect in cases where some 
execution paths lead to the termination block without creating the stack frame. 
This patch rearranges the creation of the checks so that the prior splitting is 
not required.
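
The difference between the two approaches can be illustrated with a small 
stand-alone sketch. It does not use the LLVM API: `Block` is a hypothetical 
stand-in for a machine basic block, modelled as a list of mnemonics, and 
`.Lfail` is a made-up label. Splitting after the instruction preceding the 
insertion point needs such an instruction to exist, whereas inserting directly 
before the insertion point also works when the block starts with the tail call.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for a machine basic block: a list of mnemonics.
using Block = std::list<std::string>;

// Insert the check sequence immediately before `Pos` and return an iterator
// to the first inserted instruction. Inserting before an iterator needs no
// predecessor, so this is valid even when `Pos == B.begin()`.
inline Block::iterator insertChecksBefore(Block &B, Block::iterator Pos) {
  Block::iterator First = B.insert(Pos, "eor x16, x30, x30, lsl #1");
  B.insert(Pos, "tbnz x16, #62, .Lfail");
  return First;
}
```

For example, calling `insertChecksBefore(B, B.begin())` on a block that holds 
only `b callee` places both check instructions ahead of the branch, which is 
the situation the old `std::prev(MBBI)` split could not handle.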

---
Full diff: https://github.com/llvm/llvm-project/pull/92508.diff


2 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64PointerAuth.cpp (+7-16) 
- (modified) llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll (+32) 


``diff
diff --git a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp 
b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
index 90bf089dbebf7..60d3d533d9c10 100644
--- a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
+++ b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
@@ -257,21 +257,12 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister(
 
   // Control flow has to be changed, so arrange new MBBs.
 
-  // At now, at least an AUT* instruction is expected before MBBI
-  assert(MBBI != MBB.begin() &&
- "Cannot insert the check at the very beginning of MBB");
-  // The block to insert check into.
-  MachineBasicBlock *CheckBlock = 
-  // The remaining part of the original MBB that is executed on success.
-  MachineBasicBlock *SuccessBlock = MBB.splitAt(*std::prev(MBBI));
-
   // The block that explicitly generates a break-point exception on failure.
   MachineBasicBlock *BreakBlock =
   MF.CreateMachineBasicBlock(MBB.getBasicBlock());
   MF.push_back(BreakBlock);
-  MBB.splitSuccessor(SuccessBlock, BreakBlock);
+  MBB.addSuccessor(BreakBlock);
 
-  assert(CheckBlock->getFallThrough() == SuccessBlock);
   BuildMI(BreakBlock, DL, TII->get(AArch64::BRK)).addImm(BrkImm);
 
   switch (Method) {
@@ -279,11 +270,11 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister(
   case AuthCheckMethod::DummyLoad:
 llvm_unreachable("Should be handled above");
   case AuthCheckMethod::HighBitsNoTBI:
-BuildMI(CheckBlock, DL, TII->get(AArch64::EORXrs), TmpReg)
+BuildMI(MBB, MBBI, DL, TII->get(AArch64::EORXrs), TmpReg)
 .addReg(AuthenticatedReg)
 .addReg(AuthenticatedReg)
 .addImm(1);
-BuildMI(CheckBlock, DL, TII->get(AArch64::TBNZX))
+BuildMI(MBB, MBBI, DL, TII->get(AArch64::TBNZX))
 .addReg(TmpReg)
 .addImm(62)
 .addMBB(BreakBlock);
@@ -292,16 +283,16 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister(
 assert(AuthenticatedReg == AArch64::LR &&
"XPACHint mode is only compatible with checking the LR register");
 assert(UseIKey && "XPACHint mode is only compatible with I-keys");
-BuildMI(CheckBlock, DL, TII->get(AArch64::ORRXrs), TmpReg)
+BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrs), TmpReg)
 .addReg(AArch64::XZR)
 .addReg(AArch64::LR)
 .addImm(0);
-BuildMI(CheckBlock, DL, TII->get(AArch64::XPACLRI));
-BuildMI(CheckBlock, DL, TII->get(AArch64::SUBSXrs), AArch64::XZR)
+BuildMI(MBB, MBBI, DL, TII->get(AArch64::XPACLRI));
+BuildMI(MBB, MBBI, DL, TII->get(AArch64::SUBSXrs), AArch64::XZR)
 .addReg(TmpReg)
 .addReg(AArch64::LR)
 .addImm(0);
-BuildMI(CheckBlock, DL, TII->get(AArch64::Bcc))
+BuildMI(MBB, MBBI, DL, TII->get(AArch64::Bcc))
 .addImm(AArch64CC::NE)
 .addMBB(BreakBlock);
 return;
diff --git a/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll 
b/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
index cf033cb8208cc..0cc707298e458 100644
--- a/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
+++ b/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
@@ -129,4 +129,36 @@ define i32 @tailcall_ib_key() "sign-return-address"="all" 
"sign-return-address-k
   ret i32 %call
 }
 
+define i32 @tailcall_two_branches(i1 %0) "sign-return-address"="all" {
+; COMMON-LABEL:tailcall_two_branches:
+; COMMON:tbz w0, #0, .[[ELSE:LBB[_0-9]+]]
+; COMMON:str x30, [sp, #-16]!
+; COMMON:bl callee2
+; COMMON:ldr x30, [sp], #16
+; COMMON-NEXT:   [[AUTIASP]]
+; COMMON-NEXT: .[[ELSE]]:
+
+; LDR-NEXT:  ldr w16, [x30]
+;
+; BITS-NOTBI-NEXT:   eor x16, x30, x30, lsl #1
+; BITS-NOTBI-NEXT:   tbnz x16, #62, .[[FAIL:LBB[_0-9]+]]
+;
+; XPAC-NEXT: mov x16, x30
+; XPAC-NEXT: [[XPACLRI]]
+; XPAC-NEXT: cmp x16, x30
+; XPAC-NEXT: b.ne .[[FAIL:LBB[_0-9]+]]
+;
+; COMMON-NEXT:   b callee
+; BRK-NEXT:.[[FAIL]]:
+; BRK-NEXT:  brk #0xc470
+  br i1 %0, label %2, label %3
+2:
+  call void @callee2()
+  br label %3
+3:
+  %call = tail call i32 @callee()
+  ret i32 %call
+}
+
 declare i32 @callee()
+declare void @callee2()

``

[llvm-branch-commits] [llvm] [AArch64][PAC] Fix creating check instructions for BBs without an epilog (PR #92508)

2024-05-17 Thread Igor Kudrin via llvm-branch-commits

https://github.com/igorkudrin created 
https://github.com/llvm/llvm-project/pull/92508

`AArch64PAuth::checkAuthenticatedRegister()` splits the basic block containing 
the tail call instruction to add check instructions, assuming at least one more 
instruction before the call. This assumption is incorrect in cases where some 
execution paths lead to the termination block without creating the stack frame. 
This patch rearranges the creation of the checks so that the prior splitting is 
not required.

>From a3039508f7bf9eeacbb4739460468cb3e71ba133 Mon Sep 17 00:00:00 2001
From: Igor Kudrin 
Date: Thu, 16 May 2024 22:26:32 -0700
Subject: [PATCH 1/2] test

---
 .../AArch64/sign-return-address-tailcall.ll   | 32 +++
 1 file changed, 32 insertions(+)

diff --git a/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll 
b/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
index cf033cb8208cc..0cc707298e458 100644
--- a/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
+++ b/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll
@@ -129,4 +129,36 @@ define i32 @tailcall_ib_key() "sign-return-address"="all" 
"sign-return-address-k
   ret i32 %call
 }
 
+define i32 @tailcall_two_branches(i1 %0) "sign-return-address"="all" {
+; COMMON-LABEL:tailcall_two_branches:
+; COMMON:tbz w0, #0, .[[ELSE:LBB[_0-9]+]]
+; COMMON:str x30, [sp, #-16]!
+; COMMON:bl callee2
+; COMMON:ldr x30, [sp], #16
+; COMMON-NEXT:   [[AUTIASP]]
+; COMMON-NEXT: .[[ELSE]]:
+
+; LDR-NEXT:  ldr w16, [x30]
+;
+; BITS-NOTBI-NEXT:   eor x16, x30, x30, lsl #1
+; BITS-NOTBI-NEXT:   tbnz x16, #62, .[[FAIL:LBB[_0-9]+]]
+;
+; XPAC-NEXT: mov x16, x30
+; XPAC-NEXT: [[XPACLRI]]
+; XPAC-NEXT: cmp x16, x30
+; XPAC-NEXT: b.ne .[[FAIL:LBB[_0-9]+]]
+;
+; COMMON-NEXT:   b callee
+; BRK-NEXT:.[[FAIL]]:
+; BRK-NEXT:  brk #0xc470
+  br i1 %0, label %2, label %3
+2:
+  call void @callee2()
+  br label %3
+3:
+  %call = tail call i32 @callee()
+  ret i32 %call
+}
+
 declare i32 @callee()
+declare void @callee2()

>From 2641fe82837455b422d6c8229cc2f3d3736de4da Mon Sep 17 00:00:00 2001
From: Igor Kudrin 
Date: Thu, 16 May 2024 22:26:40 -0700
Subject: [PATCH 2/2] [AArch64][PAC] Fix creating check instructions for BBs
 without an epilog

`AArch64PAuth::checkAuthenticatedRegister()` splits the basic block
containing the tail call instruction to add check instructions, assuming
at least one more instruction before the call. This assumption is
incorrect in cases where some execution paths lead to the termination
block without creating the stack frame. This patch rearranges the
creation of the checks so that the prior splitting is not required.
---
 .../lib/Target/AArch64/AArch64PointerAuth.cpp | 23 ++-
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp 
b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
index 90bf089dbebf7..60d3d533d9c10 100644
--- a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
+++ b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
@@ -257,21 +257,12 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister(
 
   // Control flow has to be changed, so arrange new MBBs.
 
-  // At now, at least an AUT* instruction is expected before MBBI
-  assert(MBBI != MBB.begin() &&
- "Cannot insert the check at the very beginning of MBB");
-  // The block to insert check into.
-  MachineBasicBlock *CheckBlock = 
-  // The remaining part of the original MBB that is executed on success.
-  MachineBasicBlock *SuccessBlock = MBB.splitAt(*std::prev(MBBI));
-
   // The block that explicitly generates a break-point exception on failure.
   MachineBasicBlock *BreakBlock =
   MF.CreateMachineBasicBlock(MBB.getBasicBlock());
   MF.push_back(BreakBlock);
-  MBB.splitSuccessor(SuccessBlock, BreakBlock);
+  MBB.addSuccessor(BreakBlock);
 
-  assert(CheckBlock->getFallThrough() == SuccessBlock);
   BuildMI(BreakBlock, DL, TII->get(AArch64::BRK)).addImm(BrkImm);
 
   switch (Method) {
@@ -279,11 +270,11 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister(
   case AuthCheckMethod::DummyLoad:
 llvm_unreachable("Should be handled above");
   case AuthCheckMethod::HighBitsNoTBI:
-BuildMI(CheckBlock, DL, TII->get(AArch64::EORXrs), TmpReg)
+BuildMI(MBB, MBBI, DL, TII->get(AArch64::EORXrs), TmpReg)
 .addReg(AuthenticatedReg)
 .addReg(AuthenticatedReg)
 .addImm(1);
-BuildMI(CheckBlock, DL, TII->get(AArch64::TBNZX))
+BuildMI(MBB, MBBI, DL, TII->get(AArch64::TBNZX))
 .addReg(TmpReg)
 .addImm(62)
 .addMBB(BreakBlock);
@@ -292,16 +283,16 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister(
 assert(AuthenticatedReg == AArch64::LR &&
"XPACHint mode is only compatible with checking the LR register");
 assert(UseIKey && "XPACHint mode is only compatible with I-keys");
-