[llvm-branch-commits] [lld] [llvm] release/18.x: [LoongArch] Use R_LARCH_ALIGN with section symbol (#84741) (PR #88891)
SixWeining wrote: This will be reverted on the main branch (https://github.com/llvm/llvm-project/pull/92584), so I am closing this PR. https://github.com/llvm/llvm-project/pull/88891 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] [llvm] release/18.x: [LoongArch] Use R_LARCH_ALIGN with section symbol (#84741) (PR #88891)
https://github.com/SixWeining closed https://github.com/llvm/llvm-project/pull/88891
[llvm-branch-commits] [clang] release/18.x: [clang] Don't assume location of compiler-rt for OpenBSD (#92183) (PR #92293)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/92293
[llvm-branch-commits] [clang] 48c1364 - [clang] Don't assume location of compiler-rt for OpenBSD (#92183)
Author: John Ericson
Date: 2024-05-17T16:26:37-07:00
New Revision: 48c1364200b5649dda2f9ccbe382b0bd908b99de
URL: https://github.com/llvm/llvm-project/commit/48c1364200b5649dda2f9ccbe382b0bd908b99de
DIFF: https://github.com/llvm/llvm-project/commit/48c1364200b5649dda2f9ccbe382b0bd908b99de.diff
LOG: [clang] Don't assume location of compiler-rt for OpenBSD (#92183)
If the `/usr/lib/...` path where compiler-rt is conventionally installed on OpenBSD does not exist, fall back to the regular logic to find it. This is a minimal change to allow OpenBSD cross compilation from a toolchain that doesn't adopt all of OpenBSD's monorepo's conventions.
(cherry picked from commit be10746f3a4381456eb5082a968766201c17ab5d)
Added:
Modified:
    clang/lib/Driver/ToolChains/OpenBSD.cpp
Removed:
diff --git a/clang/lib/Driver/ToolChains/OpenBSD.cpp b/clang/lib/Driver/ToolChains/OpenBSD.cpp
index fd6aa4d7e6844..00b6c520fcdd7 100644
--- a/clang/lib/Driver/ToolChains/OpenBSD.cpp
+++ b/clang/lib/Driver/ToolChains/OpenBSD.cpp
@@ -371,7 +371,8 @@ std::string OpenBSD::getCompilerRT(const ArgList , StringRef Component,
   if (Component == "builtins") {
     SmallString<128> Path(getDriver().SysRoot);
     llvm::sys::path::append(Path, "/usr/lib/libcompiler_rt.a");
-    return std::string(Path);
+    if (getVFS().exists(Path))
+      return std::string(Path);
   }
   SmallString<128> P(getDriver().ResourceDir);
   std::string CRTBasename =
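For illustration only, the fallback described in this commit can be sketched outside of clang. The sketch below uses `std::filesystem` in place of the driver's VFS, and the names `findBuiltinsLib` and `fallback` are hypothetical; the actual change is the `getVFS().exists(Path)` check in `OpenBSD::getCompilerRT`.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical stand-in for the driver lookup (not clang's API).
// Returns the conventional OpenBSD location when it exists under `sysroot`;
// otherwise returns `fallback`, which models the result of the regular logic.
std::string findBuiltinsLib(const fs::path &sysroot, const std::string &fallback) {
  // Use a relative suffix so that operator/ appends to sysroot
  // instead of replacing it.
  fs::path conventional = sysroot / "usr/lib/libcompiler_rt.a";
  std::error_code ec; // non-throwing overload: a missing file is not an error
  if (fs::exists(conventional, ec))
    return conventional.string();
  return fallback;
}
```

With a sysroot that lacks `usr/lib/libcompiler_rt.a`, the function returns the fallback unchanged, which is the behaviour a cross toolchain without OpenBSD's layout relies on.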
[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)
dpaoliello wrote:
> @dpaoliello (or anyone else). If you would like to add a note about this fix
> in the release notes (completely optional). Please reply to this comment with
> a one or two sentence description of the fix. When you are done, please add
> the release:note label to this PR.
Fixes issues where LLVM either generated an incorrect thunk for a function with aligned parameters or did not correctly pass through the return value when `StructRet` was used.
https://github.com/llvm/llvm-project/pull/92580
[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)
brad0 wrote:
> Thanks, both of you. I can't merge these, so I am guessing someone else will
> come along that can?
The release manager does. For this release cycle that is tstellar.
https://github.com/llvm/llvm-project/pull/92601
[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)
Ericson2314 wrote: Thanks, both of you. I can't merge these, so I am guessing someone else will come along that can? https://github.com/llvm/llvm-project/pull/92601
[llvm-branch-commits] [clang] release/18.x: [clang] Don't assume location of compiler-rt for OpenBSD (#92183) (PR #92293)
https://github.com/brad0 approved this pull request. https://github.com/llvm/llvm-project/pull/92293
[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)
tstellar wrote: @brad0 Can you look at #92293 too? https://github.com/llvm/llvm-project/pull/92601
[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)
https://github.com/brad0 approved this pull request. https://github.com/llvm/llvm-project/pull/92601
[llvm-branch-commits] [clang] release/18.x: [clang] Don't assume location of compiler-rt for OpenBSD (#92183) (PR #92293)
Ericson2314 wrote: (FWIW https://github.com/llvm/llvm-project/pull/92601 is somewhat a companion backport.) https://github.com/llvm/llvm-project/pull/92293
[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)
llvmbot wrote:
@llvm/pr-subscribers-libcxxabi @llvm/pr-subscribers-libcxx
Author: None (llvmbot)
Changes
Backport af7467c
Requested by: @Ericson2314
---
Full diff: https://github.com/llvm/llvm-project/pull/92601.diff
3 Files Affected:
- (modified) libcxx/src/atomic.cpp (+14-2)
- (modified) libcxx/src/chrono.cpp (+3-1)
- (modified) libcxxabi/src/cxa_guard_impl.h (+15-1)
```diff
diff --git a/libcxx/src/atomic.cpp b/libcxx/src/atomic.cpp
index 2f0389ae6974a..6b1f03c21bbcc 100644
--- a/libcxx/src/atomic.cpp
+++ b/libcxx/src/atomic.cpp
@@ -25,16 +25,28 @@
 # if !defined(SYS_futex) && defined(SYS_futex_time64)
 #define SYS_futex SYS_futex_time64
 # endif
+# define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
 #elif defined(__FreeBSD__)
 # include
 # include
+# define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
+
+#elif defined(__OpenBSD__)
+
+# include
+
+// OpenBSD has no indirect syscalls
+# define _LIBCPP_FUTEX(...) futex(__VA_ARGS__)
+
 #else // <- Add other operating systems here
 // Baseline needs no new headers
+# define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
+
 #endif
 _LIBCPP_BEGIN_NAMESPACE_STD
@@ -44,11 +56,11 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 static void __libcpp_platform_wait_on_address(__cxx_atomic_contention_t const volatile* __ptr, __cxx_contention_t __val) {
   static constexpr timespec __timeout = {2, 0};
-  syscall(SYS_futex, __ptr, FUTEX_WAIT_PRIVATE, __val, &__timeout, 0, 0);
+  _LIBCPP_FUTEX(__ptr, FUTEX_WAIT_PRIVATE, __val, &__timeout, 0, 0);
 }
 static void __libcpp_platform_wake_by_address(__cxx_atomic_contention_t const volatile* __ptr, bool __notify_one) {
-  syscall(SYS_futex, __ptr, FUTEX_WAKE_PRIVATE, __notify_one ? 1 : INT_MAX, 0, 0, 0);
+  _LIBCPP_FUTEX(__ptr, FUTEX_WAKE_PRIVATE, __notify_one ? 1 : INT_MAX, 0, 0, 0);
 }
 #elif defined(__APPLE__) && defined(_LIBCPP_USE_ULOCK)
diff --git a/libcxx/src/chrono.cpp b/libcxx/src/chrono.cpp
index c5e827c0cb59f..e7d6dfbc22924 100644
--- a/libcxx/src/chrono.cpp
+++ b/libcxx/src/chrono.cpp
@@ -31,7 +31,9 @@
 # include // for gettimeofday and timeval
 #endif
-#if defined(__APPLE__) || defined(__gnu_hurd__) || (defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0)
+// OpenBSD does not have a fully conformant suite of POSIX timers, but
+// it does have clock_gettime and CLOCK_MONOTONIC which is all we need.
+#if defined(__APPLE__) || defined(__gnu_hurd__) || defined(__OpenBSD__) || (defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0)
 # define _LIBCPP_HAS_CLOCK_GETTIME
 #endif
diff --git a/libcxxabi/src/cxa_guard_impl.h b/libcxxabi/src/cxa_guard_impl.h
index e00d54b3a7318..90d589be4d773 100644
--- a/libcxxabi/src/cxa_guard_impl.h
+++ b/libcxxabi/src/cxa_guard_impl.h
@@ -47,6 +47,9 @@
 #include "__cxxabi_config.h"
 #include "include/atomic_support.h" // from libc++
 #if defined(__has_include)
+# if __has_include()
+#include
+# endif
 # if __has_include()
 #include
 # endif
@@ -411,7 +414,18 @@ struct InitByteGlobalMutex {
 // Futex Implementation
 //===--===//
-#if defined(SYS_futex)
+#if defined(__OpenBSD__)
+void PlatformFutexWait(int* addr, int expect) {
+  constexpr int WAIT = 0;
+  futex(reinterpret_cast(addr), WAIT, expect, NULL, NULL);
+  __tsan_acquire(addr);
+}
+void PlatformFutexWake(int* addr) {
+  constexpr int WAKE = 1;
+  __tsan_release(addr);
+  futex(reinterpret_cast(addr), WAKE, INT_MAX, NULL, NULL);
+}
+#elif defined(SYS_futex)
 void PlatformFutexWait(int* addr, int expect) {
   constexpr int WAIT = 0;
   syscall(SYS_futex, addr, WAIT, expect, 0);
```
https://github.com/llvm/llvm-project/pull/92601
[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)
llvmbot wrote: @brad0 What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/92601
[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/92601
[llvm-branch-commits] [libcxx] [libcxxabi] release/18.x: [libcxx][libcxxabi] Fix build for OpenBSD (#92186) (PR #92601)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/92601
Backport af7467c
Requested by: @Ericson2314
From 472f75ba1cd626b92bf2b10099626717fcd58e29 Mon Sep 17 00:00:00 2001
From: John Ericson
Date: Fri, 17 May 2024 16:49:04 -0400
Subject: [PATCH] [libcxx][libcxxabi] Fix build for OpenBSD (#92186)
- No indirect syscalls on OpenBSD. Instead there is a `futex` function which issues a direct syscall.
- Monotonic clock is available despite the full POSIX suite of timers not being available in its entirety. See https://lists.boost.org/boost-bugs/2015/07/41690.php and https://github.com/boostorg/log/commit/c98b1f459add14d5ce3e9e63e2469064601d7f71 for a description of an analogous problem and fix for Boost.
(cherry picked from commit af7467ce9f447d6fe977b73db1f03a18d6bbd511)
---
 libcxx/src/atomic.cpp | 16 ++--
 libcxx/src/chrono.cpp | 4 +++-
 libcxxabi/src/cxa_guard_impl.h | 16 +++-
 3 files changed, 32 insertions(+), 4 deletions(-)
diff --git a/libcxx/src/atomic.cpp b/libcxx/src/atomic.cpp
index 2f0389ae6974a..6b1f03c21bbcc 100644
--- a/libcxx/src/atomic.cpp
+++ b/libcxx/src/atomic.cpp
@@ -25,16 +25,28 @@
 # if !defined(SYS_futex) && defined(SYS_futex_time64)
 #define SYS_futex SYS_futex_time64
 # endif
+# define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
 #elif defined(__FreeBSD__)
 # include
 # include
+# define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
+
+#elif defined(__OpenBSD__)
+
+# include
+
+// OpenBSD has no indirect syscalls
+# define _LIBCPP_FUTEX(...) futex(__VA_ARGS__)
+
 #else // <- Add other operating systems here
 // Baseline needs no new headers
+# define _LIBCPP_FUTEX(...) syscall(SYS_futex, __VA_ARGS__)
+
 #endif
 _LIBCPP_BEGIN_NAMESPACE_STD
@@ -44,11 +56,11 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 static void __libcpp_platform_wait_on_address(__cxx_atomic_contention_t const volatile* __ptr, __cxx_contention_t __val) {
   static constexpr timespec __timeout = {2, 0};
-  syscall(SYS_futex, __ptr, FUTEX_WAIT_PRIVATE, __val, &__timeout, 0, 0);
+  _LIBCPP_FUTEX(__ptr, FUTEX_WAIT_PRIVATE, __val, &__timeout, 0, 0);
 }
 static void __libcpp_platform_wake_by_address(__cxx_atomic_contention_t const volatile* __ptr, bool __notify_one) {
-  syscall(SYS_futex, __ptr, FUTEX_WAKE_PRIVATE, __notify_one ? 1 : INT_MAX, 0, 0, 0);
+  _LIBCPP_FUTEX(__ptr, FUTEX_WAKE_PRIVATE, __notify_one ? 1 : INT_MAX, 0, 0, 0);
 }
 #elif defined(__APPLE__) && defined(_LIBCPP_USE_ULOCK)
diff --git a/libcxx/src/chrono.cpp b/libcxx/src/chrono.cpp
index c5e827c0cb59f..e7d6dfbc22924 100644
--- a/libcxx/src/chrono.cpp
+++ b/libcxx/src/chrono.cpp
@@ -31,7 +31,9 @@
 # include // for gettimeofday and timeval
 #endif
-#if defined(__APPLE__) || defined(__gnu_hurd__) || (defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0)
+// OpenBSD does not have a fully conformant suite of POSIX timers, but
+// it does have clock_gettime and CLOCK_MONOTONIC which is all we need.
+#if defined(__APPLE__) || defined(__gnu_hurd__) || defined(__OpenBSD__) || (defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0)
 # define _LIBCPP_HAS_CLOCK_GETTIME
 #endif
diff --git a/libcxxabi/src/cxa_guard_impl.h b/libcxxabi/src/cxa_guard_impl.h
index e00d54b3a7318..90d589be4d773 100644
--- a/libcxxabi/src/cxa_guard_impl.h
+++ b/libcxxabi/src/cxa_guard_impl.h
@@ -47,6 +47,9 @@
 #include "__cxxabi_config.h"
 #include "include/atomic_support.h" // from libc++
 #if defined(__has_include)
+# if __has_include()
+#include
+# endif
 # if __has_include()
 #include
 # endif
@@ -411,7 +414,18 @@ struct InitByteGlobalMutex {
 // Futex Implementation
 //===--===//
-#if defined(SYS_futex)
+#if defined(__OpenBSD__)
+void PlatformFutexWait(int* addr, int expect) {
+  constexpr int WAIT = 0;
+  futex(reinterpret_cast(addr), WAIT, expect, NULL, NULL);
+  __tsan_acquire(addr);
+}
+void PlatformFutexWake(int* addr) {
+  constexpr int WAKE = 1;
+  __tsan_release(addr);
+  futex(reinterpret_cast(addr), WAKE, INT_MAX, NULL, NULL);
+}
+#elif defined(SYS_futex)
 void PlatformFutexWait(int* addr, int expect) {
   constexpr int WAIT = 0;
   syscall(SYS_futex, addr, WAIT, expect, 0);
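As a rough illustration of the second point in the commit message: reading the monotonic clock needs only `clock_gettime` and `CLOCK_MONOTONIC`, not the full POSIX timers option. The helper name `monotonicNanos` below is an assumption made for this sketch and is not libc++ code.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>

// Hypothetical helper (not part of libc++): current monotonic time in
// nanoseconds, obtained through the POSIX clock_gettime call alone.
std::int64_t monotonicNanos() {
  timespec ts{};
  int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
  assert(rc == 0 && "clock_gettime(CLOCK_MONOTONIC) failed");
  (void)rc; // keep release builds (NDEBUG) warning-free
  return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

Two successive calls should never go backwards, which is the property a steady clock depends on.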
[llvm-branch-commits] [lld] [llvm] release/18.x: [LoongArch] Use R_LARCH_ALIGN with section symbol (#84741) (PR #88891)
https://github.com/tstellar demilestoned https://github.com/llvm/llvm-project/pull/88891
[llvm-branch-commits] [llvm] release/18.x: [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483) (PR #92468)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/92468
[llvm-branch-commits] [llvm] 3d0752b - [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483)
Author: DianQK
Date: 2024-05-17T13:50:38-07:00
New Revision: 3d0752b9492efd60e85aedec79676596af6fb4f8
URL: https://github.com/llvm/llvm-project/commit/3d0752b9492efd60e85aedec79676596af6fb4f8
DIFF: https://github.com/llvm/llvm-project/commit/3d0752b9492efd60e85aedec79676596af6fb4f8.diff
LOG: [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483)
Fixes #91312.
Don't perform the transform if the alias may be replaced at link time.
(cherry picked from commit c79690040acf5bb3d857558b0878db47f7f23dc3)
Added:
    llvm/test/Transforms/GlobalOpt/alias-weak.ll
Modified:
    llvm/lib/Transforms/IPO/GlobalOpt.cpp
Removed:
diff --git a/llvm/lib/Transforms/IPO/GlobalOpt.cpp b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
index 951372adcfa93..619b3f612f25f 100644
--- a/llvm/lib/Transforms/IPO/GlobalOpt.cpp
+++ b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
@@ -2212,6 +2212,9 @@ static bool mayHaveOtherReferences(GlobalValue , const LLVMUsed ) {
 static bool hasUsesToReplace(GlobalAlias , const LLVMUsed , bool ) {
+  if (GA.isWeakForLinker())
+    return false;
+
   RenameTarget = false;
   bool Ret = false;
   if (hasUseOtherThanLLVMUsed(GA, U))
diff --git a/llvm/test/Transforms/GlobalOpt/alias-weak.ll b/llvm/test/Transforms/GlobalOpt/alias-weak.ll
new file mode 100644
index 0..aec2a56313b12
--- /dev/null
+++ b/llvm/test/Transforms/GlobalOpt/alias-weak.ll
@@ -0,0 +1,57 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --include-generated-funcs --version 4
+; RUN: opt < %s -passes=globalopt -S | FileCheck %s
+
+@f1_alias = linkonce_odr hidden alias void (), ptr @f1
+@f2_alias = linkonce_odr hidden alias void (), ptr @f2
+
+define void @foo() {
+  call void @f1_alias()
+  ret void
+}
+
+define void @bar() {
+  call void @f1()
+  ret void
+}
+
+define void @baz() {
+  call void @f2_alias()
+  ret void
+}
+
+; We cannot use `f1_alias` to replace `f1` because they are both in use
+; and `f1_alias` could be replaced at link time.
+define internal void @f1() {
+  ret void
+}
+
+; FIXME: We can use `f2_alias` to replace `f2` because `b2` is not in use.
+define internal void @f2() {
+  ret void
+}
+;.
+; CHECK: @f1_alias = linkonce_odr hidden alias void (), ptr @f1
+; CHECK: @f2_alias = linkonce_odr hidden alias void (), ptr @f2
+;.
+; CHECK-LABEL: define void @foo() local_unnamed_addr {
+; CHECK-NEXT:    call void @f1_alias()
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define void @bar() local_unnamed_addr {
+; CHECK-NEXT:    call void @f1()
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define void @baz() local_unnamed_addr {
+; CHECK-NEXT:    call void @f2_alias()
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define internal void @f1() {
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define internal void @f2() {
+; CHECK-NEXT:    ret void
+;
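The guard added to `hasUsesToReplace` can be pictured with a small stand-alone model. This is only a sketch: the `Linkage` enum and the function `mayReplaceAliasee` are hypothetical simplifications, and the set of linkages treated as weak below is my understanding of `GlobalValue::isWeakForLinker`, not something stated in the commit.

```cpp
// Simplified stand-in for LLVM's linkage kinds (not the real enum).
enum class Linkage {
  External, AvailableExternally, LinkOnceAny, LinkOnceODR, WeakAny,
  WeakODR, Appending, Internal, Private, ExternalWeak, Common
};

// Assumed model of the predicate: true when the linker may pick a
// different definition for the symbol at link time.
bool isWeakForLinker(Linkage L) {
  switch (L) {
  case Linkage::WeakAny:
  case Linkage::WeakODR:
  case Linkage::LinkOnceAny:
  case Linkage::LinkOnceODR:
  case Linkage::Common:
  case Linkage::ExternalWeak:
    return true;
  default:
    return false;
  }
}

// Hypothetical model of the new early exit: an alias that may be replaced
// at link time must not take over the uses of its aliasee.
bool mayReplaceAliasee(Linkage AliasLinkage) {
  return !isWeakForLinker(AliasLinkage);
}
```

In the test added by the commit, both aliases are `linkonce_odr`, so under this model neither is eligible, which matches the unchanged calls in the CHECK lines.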
[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)
topperc wrote:
> @topperc (or anyone else). If you would like to add a note about this fix in
> the release notes (completely optional). Please reply to this comment with a
> one or two sentence description of the fix. When you are done, please add the
> release:note label to this PR.
`-Xclang -target-feature -Xclang +unaligned-scalar-mem` can be used to enable unaligned scalar memory accesses for CPUs that do not support unaligned vector accesses. `-mno-strict-align` will enable unaligned scalar and vector memory accesses.
https://github.com/llvm/llvm-project/pull/92143
[llvm-branch-commits] [llvm] release/18.x: [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483) (PR #92468)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/92468
From 3d0752b9492efd60e85aedec79676596af6fb4f8 Mon Sep 17 00:00:00 2001
From: DianQK
Date: Fri, 17 May 2024 05:51:49 +0800
Subject: [PATCH] [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483)
Fixes #91312.
Don't perform the transform if the alias may be replaced at link time.
(cherry picked from commit c79690040acf5bb3d857558b0878db47f7f23dc3)
---
 llvm/lib/Transforms/IPO/GlobalOpt.cpp | 3 ++
 llvm/test/Transforms/GlobalOpt/alias-weak.ll | 57
 2 files changed, 60 insertions(+)
 create mode 100644 llvm/test/Transforms/GlobalOpt/alias-weak.ll
diff --git a/llvm/lib/Transforms/IPO/GlobalOpt.cpp b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
index 951372adcfa93..619b3f612f25f 100644
--- a/llvm/lib/Transforms/IPO/GlobalOpt.cpp
+++ b/llvm/lib/Transforms/IPO/GlobalOpt.cpp
@@ -2212,6 +2212,9 @@ static bool mayHaveOtherReferences(GlobalValue , const LLVMUsed ) {
 static bool hasUsesToReplace(GlobalAlias , const LLVMUsed , bool ) {
+  if (GA.isWeakForLinker())
+    return false;
+
   RenameTarget = false;
   bool Ret = false;
   if (hasUseOtherThanLLVMUsed(GA, U))
diff --git a/llvm/test/Transforms/GlobalOpt/alias-weak.ll b/llvm/test/Transforms/GlobalOpt/alias-weak.ll
new file mode 100644
index 0..aec2a56313b12
--- /dev/null
+++ b/llvm/test/Transforms/GlobalOpt/alias-weak.ll
@@ -0,0 +1,57 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --include-generated-funcs --version 4
+; RUN: opt < %s -passes=globalopt -S | FileCheck %s
+
+@f1_alias = linkonce_odr hidden alias void (), ptr @f1
+@f2_alias = linkonce_odr hidden alias void (), ptr @f2
+
+define void @foo() {
+  call void @f1_alias()
+  ret void
+}
+
+define void @bar() {
+  call void @f1()
+  ret void
+}
+
+define void @baz() {
+  call void @f2_alias()
+  ret void
+}
+
+; We cannot use `f1_alias` to replace `f1` because they are both in use
+; and `f1_alias` could be replaced at link time.
+define internal void @f1() {
+  ret void
+}
+
+; FIXME: We can use `f2_alias` to replace `f2` because `b2` is not in use.
+define internal void @f2() {
+  ret void
+}
+;.
+; CHECK: @f1_alias = linkonce_odr hidden alias void (), ptr @f1
+; CHECK: @f2_alias = linkonce_odr hidden alias void (), ptr @f2
+;.
+; CHECK-LABEL: define void @foo() local_unnamed_addr {
+; CHECK-NEXT:    call void @f1_alias()
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define void @bar() local_unnamed_addr {
+; CHECK-NEXT:    call void @f1()
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define void @baz() local_unnamed_addr {
+; CHECK-NEXT:    call void @f2_alias()
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define internal void @f1() {
+; CHECK-NEXT:    ret void
+;
+;
+; CHECK-LABEL: define internal void @f2() {
+; CHECK-NEXT:    ret void
+;
[llvm-branch-commits] [llvm] [workflows] Fix libclang-abi-tests to work with new version scheme (PR #91096)
tstellar wrote: Merged: 6456ebbc18a6c2eaa2d7f6cfb7b2e5938e2daf7a https://github.com/llvm/llvm-project/pull/91096
[llvm-branch-commits] [llvm] [workflows] Fix libclang-abi-tests to work with new version scheme (PR #91096)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/91096
[llvm-branch-commits] [llvm] release/18.x: [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483) (PR #92468)
DianQK wrote:
> > Per [#91483 (comment)](https://github.com/llvm/llvm-project/pull/91483#issuecomment-2116394616),
> > we still need to further investigate this issue, but it won't stop us from
> > backporting it.
> > cc @MaskRay
>
> What exactly does this mean? Was there a bug in the original patch?
It's safe, also see https://github.com/llvm/llvm-project/issues/91312#issuecomment-2116404306.
https://github.com/llvm/llvm-project/pull/92468
[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)
tstellar wrote: @dpaoliello (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/92580
[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/92580
[llvm-branch-commits] [llvm] 9208786 - [Arm64EC] Correctly handle sret in entry thunks. (#92326)
Author: Eli Friedman
Date: 2024-05-17T13:35:09-07:00
New Revision: 92087868d5d291464056066f3e193eca97621514
URL: https://github.com/llvm/llvm-project/commit/92087868d5d291464056066f3e193eca97621514
DIFF: https://github.com/llvm/llvm-project/commit/92087868d5d291464056066f3e193eca97621514.diff
LOG: [Arm64EC] Correctly handle sret in entry thunks. (#92326)
I accidentally left out the code to transfer sret attributes to entry thunks, so values weren't being passed in the right registers, and the sret pointer wasn't returned in the correct register.
Fixes #90229
Added:
Modified:
    llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
    llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
Removed:
diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
index d4dd28aecac48..862aefe46193d 100644
--- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
@@ -514,7 +514,14 @@ Function *AArch64Arm64ECCallLowering::buildEntryThunk(Function *F) {
   // Call the function passed to the thunk.
   Value *Callee = Thunk->getArg(0);
   Callee = IRB.CreateBitCast(Callee, PtrTy);
-  Value *Call = IRB.CreateCall(Arm64Ty, Callee, Args);
+  CallInst *Call = IRB.CreateCall(Arm64Ty, Callee, Args);
+
+  auto SRetAttr = F->getAttributes().getParamAttr(0, Attribute::StructRet);
+  auto InRegAttr = F->getAttributes().getParamAttr(0, Attribute::InReg);
+  if (SRetAttr.isValid() && !InRegAttr.isValid()) {
+    Thunk->addParamAttr(1, SRetAttr);
+    Call->addParamAttr(0, SRetAttr);
+  }
   Value *RetVal = Call;
   if (TransformDirectToSRet) {
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
index c00c9bfe127e8..e9556b9d5cbee 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll
@@ -222,12 +222,12 @@ define i8 @matches_has_sret() nounwind {
 }
 %TSRet = type { i64, i64 }
-define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind {
-; CHECK-LABEL: .def $ientry_thunk$cdecl$m16$v;
-; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$v
+define void @has_aligned_sret(ptr align 32 sret(%TSRet), i32) nounwind {
+; CHECK-LABEL: .def $ientry_thunk$cdecl$m16$i8;
+; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$i8
 ; CHECK: // %bb.0:
-; CHECK-NEXT: stp q6, q7, [sp, #-176]! // 32-byte Folded Spill
-; CHECK-NEXT: .seh_save_any_reg_px q6, 176
+; CHECK-NEXT: stp q6, q7, [sp, #-192]! // 32-byte Folded Spill
+; CHECK-NEXT: .seh_save_any_reg_px q6, 192
 ; CHECK-NEXT: stp q8, q9, [sp, #32] // 32-byte Folded Spill
 ; CHECK-NEXT: .seh_save_any_reg_p q8, 32
 ; CHECK-NEXT: stp q10, q11, [sp, #64] // 32-byte Folded Spill
@@ -236,17 +236,25 @@ define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind {
 ; CHECK-NEXT: .seh_save_any_reg_p q12, 96
 ; CHECK-NEXT: stp q14, q15, [sp, #128] // 32-byte Folded Spill
 ; CHECK-NEXT: .seh_save_any_reg_p q14, 128
-; CHECK-NEXT: stp x29, x30, [sp, #160] // 16-byte Folded Spill
-; CHECK-NEXT: .seh_save_fplr 160
-; CHECK-NEXT: add x29, sp, #160
-; CHECK-NEXT: .seh_add_fp 160
+; CHECK-NEXT: str x19, [sp, #160] // 8-byte Folded Spill
+; CHECK-NEXT: .seh_save_reg x19, 160
+; CHECK-NEXT: stp x29, x30, [sp, #168] // 16-byte Folded Spill
+; CHECK-NEXT: .seh_save_fplr 168
+; CHECK-NEXT: add x29, sp, #168
+; CHECK-NEXT: .seh_add_fp 168
 ; CHECK-NEXT: .seh_endprologue
+; CHECK-NEXT: mov x19, x0
+; CHECK-NEXT: mov x8, x0
+; CHECK-NEXT: mov x0, x1
 ; CHECK-NEXT: blr x9
 ; CHECK-NEXT: adrp x8, __os_arm64x_dispatch_ret
 ; CHECK-NEXT: ldr x0, [x8, :lo12:__os_arm64x_dispatch_ret]
+; CHECK-NEXT: mov x8, x19
 ; CHECK-NEXT: .seh_startepilogue
-; CHECK-NEXT: ldp x29, x30, [sp, #160] // 16-byte Folded Reload
-; CHECK-NEXT: .seh_save_fplr 160
+; CHECK-NEXT: ldp x29, x30, [sp, #168] // 16-byte Folded Reload
+; CHECK-NEXT: .seh_save_fplr 168
+; CHECK-NEXT: ldr x19, [sp, #160] // 8-byte Folded Reload
+; CHECK-NEXT: .seh_save_reg x19, 160
 ; CHECK-NEXT: ldp q14, q15, [sp, #128] // 32-byte Folded Reload
 ; CHECK-NEXT: .seh_save_any_reg_p q14, 128
 ; CHECK-NEXT: ldp q12, q13, [sp, #96] // 32-byte Folded Reload
@@ -255,8 +263,8 @@ define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind {
 ; CHECK-NEXT: .seh_save_any_reg_p
[llvm-branch-commits] [llvm] bee6966 - [Arm64EC] Improve alignment mangling in arm64ec thunks. (#90115)
Author: Eli Friedman Date: 2024-05-17T13:35:09-07:00 New Revision: bee6966d8efa18041e2e228c3bb7b09c4618677b URL: https://github.com/llvm/llvm-project/commit/bee6966d8efa18041e2e228c3bb7b09c4618677b DIFF: https://github.com/llvm/llvm-project/commit/bee6966d8efa18041e2e228c3bb7b09c4618677b.diff LOG: [Arm64EC] Improve alignment mangling in arm64ec thunks. (#90115) In some cases, MSVC's mangling for arm64ec thunks includes the alignment of a struct. I added some code to try to match... but it never really worked right. The issues: - Alignment is only mangled if it's 16 or more (I guess the default is supposed to be 8). - Alignment isn't mangled on return values (since the memory is allocated by the caller). The current patch leaves hooks to make alignment mangling work... but doesn't actually ever mangle alignment: clang never actually encodes a relevant alignment into the IR. Once we get clang to emit the real size/alignment of structs, we can start emitting it. Added: Modified: llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll Removed: diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp index 55c5bbc66a3f4..d4dd28aecac48 100644 --- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp @@ -181,13 +181,14 @@ void AArch64Arm64ECCallLowering::getThunkArgTypes( } for (unsigned E = FT->getNumParams(); I != E; ++I) { -Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne(); #if 0 // FIXME: Need more information about argument size; see // https://reviews.llvm.org/D132926 uint64_t ArgSizeBytes = AttrList.getParamArm64ECArgSizeBytes(I); +Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne(); #else uint64_t ArgSizeBytes = 0; +Align ParamAlign = Align(); #endif Type *Arm64Ty, *X64Ty; canonicalizeThunkType(FT->getParamType(I), 
ParamAlign, @@ -297,7 +298,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType( uint64_t TotalSizeBytes = ElementCnt * ElementSizePerBytes; if (ElementTy->isFloatTy() || ElementTy->isDoubleTy()) { Out << (ElementTy->isFloatTy() ? "F" : "D") << TotalSizeBytes; - if (Alignment.value() >= 8 && !T->isPointerTy()) + if (Alignment.value() >= 16 && !Ret) Out << "a" << Alignment.value(); Arm64Ty = T; if (TotalSizeBytes <= 8) { @@ -328,7 +329,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType( Out << "m"; if (TypeSize != 4) Out << TypeSize; - if (Alignment.value() >= 8 && !T->isPointerTy()) + if (Alignment.value() >= 16 && !Ret) Out << "a" << Alignment.value(); // FIXME: Try to canonicalize Arm64Ty more thoroughly? Arm64Ty = T; diff --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll index bb9ba05f7a272..c00c9bfe127e8 100644 --- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll +++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll @@ -223,8 +223,8 @@ define i8 @matches_has_sret() nounwind { %TSRet = type { i64, i64 } define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind { -; CHECK-LABEL:.def$ientry_thunk$cdecl$m16a32$v; -; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16a32$v +; CHECK-LABEL:.def$ientry_thunk$cdecl$m16$v; +; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$v ; CHECK: // %bb.0: ; CHECK-NEXT: stp q6, q7, [sp, #-176]!// 32-byte Folded Spill ; CHECK-NEXT: .seh_save_any_reg_pxq6, 176 @@ -457,7 +457,7 @@ define %T2 @simple_struct(%T1 %0, %T2 %1, %T3, %T4) nounwind { ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$i8$v ; CHECK-NEXT: .word 1 ; CHECK-NEXT: .symidx "#has_aligned_sret" -; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16a32$v +; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16$v ; CHECK-NEXT: .word 1 ; CHECK-NEXT: .symidx "#small_array" ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m2$m2F8 diff --git a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll index 3b911e78aff2a..7a40fcd85ac58 100644 --- a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll +++ b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll @@ -236,8 +236,8 @@ declare void @has_sret(ptr sret([100 x i8])) nounwind; %TSRet = type { i64, i64 } declare void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind; -; CHECK-LABEL:.def$iexit_thunk$cdecl$m16a32$v; -; CHECK: .section .wowthk$aa,"xr",discard,$iexit_thunk$cdecl$m16a32$v +; CHECK-LABEL:.def$iexit_thunk$cdecl$m16$v; +; CHECK: .section
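The mangling rule described in the commit message can be sketched in isolation. This is only an illustration of the patched condition (`Alignment.value() >= 16 && !Ret`) for the "m"-style struct case; `alignmentSuffix` and `mangleStruct` are hypothetical helper names, not functions in `AArch64Arm64ECCallLowering.cpp`, and the real code writes to a stream rather than returning strings.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of LLVM): returns the alignment suffix the
// patched check would append to a mangled thunk type.
std::string alignmentSuffix(uint64_t AlignmentBytes, bool IsReturnValue) {
  // Mirrors `Alignment.value() >= 16 && !Ret` from the diff: alignment is
  // only mangled when it is 16 or more, and never on return values.
  if (AlignmentBytes >= 16 && !IsReturnValue)
    return "a" + std::to_string(AlignmentBytes);
  return "";
}

// Hypothetical helper: mangles an "m"-style struct type of the given size,
// following the `Out << "m"` branch shown in the diff.
std::string mangleStruct(uint64_t TypeSize, uint64_t AlignmentBytes,
                         bool IsReturnValue) {
  std::string Out = "m";
  if (TypeSize != 4) // size 4 is left implicit in the mangling
    Out += std::to_string(TypeSize);
  Out += alignmentSuffix(AlignmentBytes, IsReturnValue);
  return Out;
}
```

Under these assumptions, a 16-byte struct returned with 32-byte alignment mangles as `m16`, which matches the updated `$m16$v` CHECK lines. Note that the patch itself passes `Align()` for parameters, so the suffix branch is not reached in practice until clang emits the real alignment.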
[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/92580 >From bee6966d8efa18041e2e228c3bb7b09c4618677b Mon Sep 17 00:00:00 2001 From: Eli Friedman Date: Fri, 26 Apr 2024 11:06:11 -0700 Subject: [PATCH 1/2] [Arm64EC] Improve alignment mangling in arm64ec thunks. (#90115) In some cases, MSVC's mangling for arm64ec thunks includes the alignment of a struct. I added some code to try to match... but it never really worked right. The issues: - Alignment is only mangled if it's 16 or more (I guess the default is supposed to be 8). - Alignment isn't mangled on return values (since the memory is allocated by the caller). The current patch leaves hooks to make alignment mangling work... but doesn't actually ever mangle alignment: clang never actually encodes a relevant alignment into the IR. Once we get clang to emit the real size/alignment of structs, we can start emitting it. --- llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp | 7 --- llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll | 6 +++--- llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll | 10 +- 3 files changed, 12 insertions(+), 11 deletions(-) diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp index 55c5bbc66a3f4..d4dd28aecac48 100644 --- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp @@ -181,13 +181,14 @@ void AArch64Arm64ECCallLowering::getThunkArgTypes( } for (unsigned E = FT->getNumParams(); I != E; ++I) { -Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne(); #if 0 // FIXME: Need more information about argument size; see // https://reviews.llvm.org/D132926 uint64_t ArgSizeBytes = AttrList.getParamArm64ECArgSizeBytes(I); +Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne(); #else uint64_t ArgSizeBytes = 0; +Align ParamAlign = Align(); #endif Type *Arm64Ty, *X64Ty; canonicalizeThunkType(FT->getParamType(I), 
ParamAlign, @@ -297,7 +298,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType( uint64_t TotalSizeBytes = ElementCnt * ElementSizePerBytes; if (ElementTy->isFloatTy() || ElementTy->isDoubleTy()) { Out << (ElementTy->isFloatTy() ? "F" : "D") << TotalSizeBytes; - if (Alignment.value() >= 8 && !T->isPointerTy()) + if (Alignment.value() >= 16 && !Ret) Out << "a" << Alignment.value(); Arm64Ty = T; if (TotalSizeBytes <= 8) { @@ -328,7 +329,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType( Out << "m"; if (TypeSize != 4) Out << TypeSize; - if (Alignment.value() >= 8 && !T->isPointerTy()) + if (Alignment.value() >= 16 && !Ret) Out << "a" << Alignment.value(); // FIXME: Try to canonicalize Arm64Ty more thoroughly? Arm64Ty = T; diff --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll index bb9ba05f7a272..c00c9bfe127e8 100644 --- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll +++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll @@ -223,8 +223,8 @@ define i8 @matches_has_sret() nounwind { %TSRet = type { i64, i64 } define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind { -; CHECK-LABEL:.def$ientry_thunk$cdecl$m16a32$v; -; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16a32$v +; CHECK-LABEL:.def$ientry_thunk$cdecl$m16$v; +; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$v ; CHECK: // %bb.0: ; CHECK-NEXT: stp q6, q7, [sp, #-176]!// 32-byte Folded Spill ; CHECK-NEXT: .seh_save_any_reg_pxq6, 176 @@ -457,7 +457,7 @@ define %T2 @simple_struct(%T1 %0, %T2 %1, %T3, %T4) nounwind { ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$i8$v ; CHECK-NEXT: .word 1 ; CHECK-NEXT: .symidx "#has_aligned_sret" -; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16a32$v +; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16$v ; CHECK-NEXT: .word 1 ; CHECK-NEXT: .symidx "#small_array" ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m2$m2F8 diff --git a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll index 3b911e78aff2a..7a40fcd85ac58 100644 --- a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll +++ b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll @@ -236,8 +236,8 @@ declare void @has_sret(ptr sret([100 x i8])) nounwind; %TSRet = type { i64, i64 } declare void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind; -; CHECK-LABEL:.def$iexit_thunk$cdecl$m16a32$v; -; CHECK: .section .wowthk$aa,"xr",discard,$iexit_thunk$cdecl$m16a32$v +; CHECK-LABEL:.def$iexit_thunk$cdecl$m16$v; +; CHECK: .section .wowthk$aa,"xr",discard,$iexit_thunk$cdecl$m16$v ; CHECK: // %bb.0: ; CHECK-NEXT:
[llvm-branch-commits] [llvm] release/18.x: [workflows] Fix libclang-abi-tests to work with new version scheme (#91865) (PR #92258)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/92258 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/18.x: [workflows] Fix libclang-abi-tests to work with new version scheme (#91865) (PR #92258)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/92258

>From 6456ebbc18a6c2eaa2d7f6cfb7b2e5938e2daf7a Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Wed, 15 May 2024 06:08:29 -0700
Subject: [PATCH] [workflows] Fix libclang-abi-tests to work with new version scheme (#91865)

(cherry picked from commit d06270ee00e37b247eb99268fb2f106dbeee08ff)
---
 .github/workflows/libclang-abi-tests.yml | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/.github/workflows/libclang-abi-tests.yml b/.github/workflows/libclang-abi-tests.yml
index ccfc1e5fb8a74..972d21c3bcedf 100644
--- a/.github/workflows/libclang-abi-tests.yml
+++ b/.github/workflows/libclang-abi-tests.yml
@@ -33,7 +33,6 @@ jobs:
       ABI_HEADERS: ${{ steps.vars.outputs.ABI_HEADERS }}
       ABI_LIBS: ${{ steps.vars.outputs.ABI_LIBS }}
       BASELINE_VERSION_MAJOR: ${{ steps.vars.outputs.BASELINE_VERSION_MAJOR }}
-      BASELINE_VERSION_MINOR: ${{ steps.vars.outputs.BASELINE_VERSION_MINOR }}
       LLVM_VERSION_MAJOR: ${{ steps.version.outputs.LLVM_VERSION_MAJOR }}
       LLVM_VERSION_MINOR: ${{ steps.version.outputs.LLVM_VERSION_MINOR }}
       LLVM_VERSION_PATCH: ${{ steps.version.outputs.LLVM_VERSION_PATCH }}
@@ -51,9 +50,9 @@ jobs:
         id: vars
         run: |
           remote_repo='https://github.com/llvm/llvm-project'
-          if [ ${{ steps.version.outputs.LLVM_VERSION_MINOR }} -ne 0 ] || [ ${{ steps.version.outputs.LLVM_VERSION_PATCH }} -eq 0 ]; then
+          if [ ${{ steps.version.outputs.LLVM_VERSION_PATCH }} -eq 0 ]; then
             major_version=$(( ${{ steps.version.outputs.LLVM_VERSION_MAJOR }} - 1))
-            baseline_ref="llvmorg-$major_version.0.0"
+            baseline_ref="llvmorg-$major_version.1.0"

             # If there is a minor release, we want to use that as the base line.
             minor_ref=$(git ls-remote --refs -t "$remote_repo" llvmorg-"$major_version".[1-9].[0-9] | tail -n1 | grep -o 'llvmorg-.\+' || true)
@@ -75,7 +74,7 @@ jobs:
           else
             {
               echo "BASELINE_VERSION_MAJOR=${{ steps.version.outputs.LLVM_VERSION_MAJOR }}"
-              echo "BASELINE_REF=llvmorg-${{ steps.version.outputs.LLVM_VERSION_MAJOR }}.0.0"
+              echo "BASELINE_REF=llvmorg-${{ steps.version.outputs.LLVM_VERSION_MAJOR }}.1.0"
               echo "ABI_HEADERS=."
               echo "ABI_LIBS=libclang.so libclang-cpp.so"
             } >> "$GITHUB_OUTPUT"
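For readers who prefer to see the branch logic outside of shell, the default baseline selection after this patch can be sketched as below. `defaultBaselineRef` is a hypothetical name used only for illustration. The sketch covers just the two assignments visible in the diff and leaves out the `minor_ref` lookup via `git ls-remote`, whose handling falls in lines the diff does not show.

```cpp
#include <string>

// Hypothetical helper mirroring the two visible branches of the workflow's
// `vars` step after the patch. It ignores the later minor_ref override.
std::string defaultBaselineRef(int LLVMMajor, int LLVMPatch) {
  if (LLVMPatch == 0) {
    // Patch version 0: compare against the previous major series, which
    // now starts at X.1.0 rather than X.0.0.
    return "llvmorg-" + std::to_string(LLVMMajor - 1) + ".1.0";
  }
  // Otherwise: compare against the first release of the same major series.
  return "llvmorg-" + std::to_string(LLVMMajor) + ".1.0";
}
```

The point of the change is visible in both branches: the baseline tag ends in `.1.0` where it previously ended in `.0.0`, and the minor version no longer takes part in the condition.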
[llvm-branch-commits] [clang] release/18.x: [clang] Don't assume location of compiler-rt for OpenBSD (#92183) (PR #92293)
tstellar wrote: cc @epsilon-0 https://github.com/llvm/llvm-project/pull/92293
[llvm-branch-commits] [llvm] release/18.x: [GlobalOpt] Don't replace aliasee with alias that has weak linkage (#91483) (PR #92468)
tstellar wrote:

> Per [#91483 (comment)](https://github.com/llvm/llvm-project/pull/91483#issuecomment-2116394616), we still need to further investigate this issue, but it won't stop us from backporting it.
>
> cc @MaskRay

What exactly does this mean? Was there a bug in the original patch?

https://github.com/llvm/llvm-project/pull/92468
[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)
tstellar wrote: @topperc (or anyone else): if you would like to add a note about this fix to the release notes (completely optional), please reply to this comment with a one- or two-sentence description of the fix. When you are done, please add the release:note label to this PR. https://github.com/llvm/llvm-project/pull/92143
[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/92143
[llvm-branch-commits] [llvm] a7cd0c6 - [RISCV] Add a unaligned-scalar-mem feature like we had in clang 17.
Author: Craig Topper Date: 2024-05-17T13:22:27-07:00 New Revision: a7cd0c61123889a632ceea67dc8c8e2c8753ae08 URL: https://github.com/llvm/llvm-project/commit/a7cd0c61123889a632ceea67dc8c8e2c8753ae08 DIFF: https://github.com/llvm/llvm-project/commit/a7cd0c61123889a632ceea67dc8c8e2c8753ae08.diff LOG: [RISCV] Add a unaligned-scalar-mem feature like we had in clang 17. This is ORed with the fast-unaligned-access feature which applies to scalar and vector together.: Added: Modified: llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp llvm/lib/Target/RISCV/RISCVFeatures.td llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/test/CodeGen/RISCV/memcpy-inline.ll llvm/test/CodeGen/RISCV/memcpy.ll llvm/test/CodeGen/RISCV/memset-inline.ll llvm/test/CodeGen/RISCV/pr56110.ll llvm/test/CodeGen/RISCV/unaligned-load-store.ll Removed: diff --git a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp index 0a314fdd41cbe..89207640ee54a 100644 --- a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp +++ b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp @@ -317,8 +317,9 @@ bool RISCVExpandPseudo::expandRV32ZdinxStore(MachineBasicBlock , .addReg(MBBI->getOperand(1).getReg()) .add(MBBI->getOperand(2)); if (MBBI->getOperand(2).isGlobal() || MBBI->getOperand(2).isCPI()) { -// FIXME: Zdinx RV32 can not work on unaligned memory. -assert(!STI->hasFastUnalignedAccess()); +// FIXME: Zdinx RV32 can not work on unaligned scalar memory. 
+assert(!STI->hasFastUnalignedAccess() && + !STI->enableUnalignedScalarMem()); assert(MBBI->getOperand(2).getOffset() % 8 == 0); MBBI->getOperand(2).setOffset(MBBI->getOperand(2).getOffset() + 4); diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td index 26451c80f57b4..1bb6b6a561f4a 100644 --- a/llvm/lib/Target/RISCV/RISCVFeatures.td +++ b/llvm/lib/Target/RISCV/RISCVFeatures.td @@ -1025,6 +1025,11 @@ def FeatureFastUnalignedAccess "true", "Has reasonably performant unaligned " "loads and stores (both scalar and vector)">; +def FeatureUnalignedScalarMem + : SubtargetFeature<"unaligned-scalar-mem", "EnableUnalignedScalarMem", + "true", "Has reasonably performant unaligned scalar " + "loads and stores">; + def FeaturePostRAScheduler : SubtargetFeature<"use-postra-scheduler", "UsePostRAScheduler", "true", "Schedule again after register allocation">; diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp index d46093b9e260a..3fe7ddfdd4279 100644 --- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp +++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp @@ -1883,7 +1883,8 @@ bool RISCVTargetLowering::shouldConvertConstantLoadToIntImm(const APInt , // replace. If we don't support unaligned scalar mem, prefer the constant // pool. // TODO: Can the caller pass down the alignment? - if (!Subtarget.hasFastUnalignedAccess()) + if (!Subtarget.hasFastUnalignedAccess() && + !Subtarget.enableUnalignedScalarMem()) return true; // Prefer to keep the load if it would require many instructions. 
@@ -19772,8 +19773,10 @@ bool RISCVTargetLowering::allowsMisalignedMemoryAccesses( unsigned *Fast) const { if (!VT.isVector()) { if (Fast) - *Fast = Subtarget.hasFastUnalignedAccess(); -return Subtarget.hasFastUnalignedAccess(); + *Fast = Subtarget.hasFastUnalignedAccess() || + Subtarget.enableUnalignedScalarMem(); +return Subtarget.hasFastUnalignedAccess() || + Subtarget.enableUnalignedScalarMem(); } // All vector implementations must support element alignment diff --git a/llvm/test/CodeGen/RISCV/memcpy-inline.ll b/llvm/test/CodeGen/RISCV/memcpy-inline.ll index 343695ee37da8..709b8264b5833 100644 --- a/llvm/test/CodeGen/RISCV/memcpy-inline.ll +++ b/llvm/test/CodeGen/RISCV/memcpy-inline.ll @@ -7,6 +7,10 @@ ; RUN: | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST ; RUN: llc < %s -mtriple=riscv64 -mattr=+fast-unaligned-access \ ; RUN: | FileCheck %s --check-prefixes=RV64-BOTH,RV64-FAST +; RUN: llc < %s -mtriple=riscv32 -mattr=+unaligned-scalar-mem \ +; RUN: | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST +; RUN: llc < %s -mtriple=riscv64 -mattr=+unaligned-scalar-mem \ +; RUN: | FileCheck %s --check-prefixes=RV64-BOTH,RV64-FAST ; -- ; Fully unaligned cases diff --git a/llvm/test/CodeGen/RISCV/memcpy.ll b/llvm/test/CodeGen/RISCV/memcpy.ll index 12ec0881b20d9..f8f5d25947d7f 100644 --- a/llvm/test/CodeGen/RISCV/memcpy.ll +++ b/llvm/test/CodeGen/RISCV/memcpy.ll @@ -7,6 +7,10 @@ ; RUN: | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST ; RUN:
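The scalar path of `allowsMisalignedMemoryAccesses` after this patch amounts to ORing the two features together. A minimal sketch follows, assuming a stand-in `SubtargetFlags` struct in place of the real `RISCVSubtarget`; the struct, its field names, and `allowsMisalignedScalarAccess` are illustrative and do not exist in LLVM.

```cpp
// Stand-in for the two subtarget features involved; this is not the real
// RISCVSubtarget class.
struct SubtargetFlags {
  bool FastUnalignedAccess = false; // +fast-unaligned-access (scalar and vector)
  bool UnalignedScalarMem = false;  // +unaligned-scalar-mem (scalar only)
};

// Hypothetical helper mirroring the non-vector branch in the diff: a scalar
// misaligned access is allowed (and reported as fast) if either feature is
// enabled.
bool allowsMisalignedScalarAccess(const SubtargetFlags &ST) {
  return ST.FastUnalignedAccess || ST.UnalignedScalarMem;
}
```

This is also why the added RUN lines with `-mattr=+unaligned-scalar-mem` reuse the existing `RV32-FAST` and `RV64-FAST` check prefixes: for these scalar tests the new feature is expected to produce the same output as `+fast-unaligned-access`.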
[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/92143 >From a7cd0c61123889a632ceea67dc8c8e2c8753ae08 Mon Sep 17 00:00:00 2001 From: Craig Topper Date: Thu, 16 May 2024 12:27:05 -0700 Subject: [PATCH] [RISCV] Add a unaligned-scalar-mem feature like we had in clang 17. This is ORed with the fast-unaligned-access feature which applies to scalar and vector together.: --- llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp | 5 +++-- llvm/lib/Target/RISCV/RISCVFeatures.td | 5 + llvm/lib/Target/RISCV/RISCVISelLowering.cpp | 9 ++--- llvm/test/CodeGen/RISCV/memcpy-inline.ll | 4 llvm/test/CodeGen/RISCV/memcpy.ll| 4 llvm/test/CodeGen/RISCV/memset-inline.ll | 4 llvm/test/CodeGen/RISCV/pr56110.ll | 1 + llvm/test/CodeGen/RISCV/unaligned-load-store.ll | 4 8 files changed, 31 insertions(+), 5 deletions(-) diff --git a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp index 0a314fdd41cbe..89207640ee54a 100644 --- a/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp +++ b/llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp @@ -317,8 +317,9 @@ bool RISCVExpandPseudo::expandRV32ZdinxStore(MachineBasicBlock , .addReg(MBBI->getOperand(1).getReg()) .add(MBBI->getOperand(2)); if (MBBI->getOperand(2).isGlobal() || MBBI->getOperand(2).isCPI()) { -// FIXME: Zdinx RV32 can not work on unaligned memory. -assert(!STI->hasFastUnalignedAccess()); +// FIXME: Zdinx RV32 can not work on unaligned scalar memory. 
+assert(!STI->hasFastUnalignedAccess() && + !STI->enableUnalignedScalarMem()); assert(MBBI->getOperand(2).getOffset() % 8 == 0); MBBI->getOperand(2).setOffset(MBBI->getOperand(2).getOffset() + 4); diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td index 26451c80f57b4..1bb6b6a561f4a 100644 --- a/llvm/lib/Target/RISCV/RISCVFeatures.td +++ b/llvm/lib/Target/RISCV/RISCVFeatures.td @@ -1025,6 +1025,11 @@ def FeatureFastUnalignedAccess "true", "Has reasonably performant unaligned " "loads and stores (both scalar and vector)">; +def FeatureUnalignedScalarMem + : SubtargetFeature<"unaligned-scalar-mem", "EnableUnalignedScalarMem", + "true", "Has reasonably performant unaligned scalar " + "loads and stores">; + def FeaturePostRAScheduler : SubtargetFeature<"use-postra-scheduler", "UsePostRAScheduler", "true", "Schedule again after register allocation">; diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp index d46093b9e260a..3fe7ddfdd4279 100644 --- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp +++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp @@ -1883,7 +1883,8 @@ bool RISCVTargetLowering::shouldConvertConstantLoadToIntImm(const APInt , // replace. If we don't support unaligned scalar mem, prefer the constant // pool. // TODO: Can the caller pass down the alignment? - if (!Subtarget.hasFastUnalignedAccess()) + if (!Subtarget.hasFastUnalignedAccess() && + !Subtarget.enableUnalignedScalarMem()) return true; // Prefer to keep the load if it would require many instructions. 
@@ -19772,8 +19773,10 @@ bool RISCVTargetLowering::allowsMisalignedMemoryAccesses( unsigned *Fast) const { if (!VT.isVector()) { if (Fast) - *Fast = Subtarget.hasFastUnalignedAccess(); -return Subtarget.hasFastUnalignedAccess(); + *Fast = Subtarget.hasFastUnalignedAccess() || + Subtarget.enableUnalignedScalarMem(); +return Subtarget.hasFastUnalignedAccess() || + Subtarget.enableUnalignedScalarMem(); } // All vector implementations must support element alignment diff --git a/llvm/test/CodeGen/RISCV/memcpy-inline.ll b/llvm/test/CodeGen/RISCV/memcpy-inline.ll index 343695ee37da8..709b8264b5833 100644 --- a/llvm/test/CodeGen/RISCV/memcpy-inline.ll +++ b/llvm/test/CodeGen/RISCV/memcpy-inline.ll @@ -7,6 +7,10 @@ ; RUN: | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST ; RUN: llc < %s -mtriple=riscv64 -mattr=+fast-unaligned-access \ ; RUN: | FileCheck %s --check-prefixes=RV64-BOTH,RV64-FAST +; RUN: llc < %s -mtriple=riscv32 -mattr=+unaligned-scalar-mem \ +; RUN: | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST +; RUN: llc < %s -mtriple=riscv64 -mattr=+unaligned-scalar-mem \ +; RUN: | FileCheck %s --check-prefixes=RV64-BOTH,RV64-FAST ; -- ; Fully unaligned cases diff --git a/llvm/test/CodeGen/RISCV/memcpy.ll b/llvm/test/CodeGen/RISCV/memcpy.ll index 12ec0881b20d9..f8f5d25947d7f 100644 --- a/llvm/test/CodeGen/RISCV/memcpy.ll +++ b/llvm/test/CodeGen/RISCV/memcpy.ll @@ -7,6 +7,10 @@ ; RUN: | FileCheck %s --check-prefixes=RV32-BOTH,RV32-FAST ; RUN: llc < %s -mtriple=riscv64
[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)
https://github.com/efriedma-quic approved this pull request. LGTM. This only affects Arm64EC targets, the fixes are relatively small, and they affect the correctness of generated thunks. https://github.com/llvm/llvm-project/pull/92580
[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)
llvmbot wrote: @llvm/pr-subscribers-backend-aarch64 Author: Daniel Paoliello (dpaoliello) Changes Backports !90115 and !92326 Release notes: Fixes issues where LLVM is either generating the incorrect thunk for a function with aligned parameters or didn't correctly pass through the return value when `StructRet` was used. --- Full diff: https://github.com/llvm/llvm-project/pull/92580.diff 3 Files Affected: - (modified) llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp (+12-4) - (modified) llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll (+22-14) - (modified) llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll (+5-5) ``diff diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp index 55c5bbc66a3f4..862aefe46193d 100644 --- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp @@ -181,13 +181,14 @@ void AArch64Arm64ECCallLowering::getThunkArgTypes( } for (unsigned E = FT->getNumParams(); I != E; ++I) { -Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne(); #if 0 // FIXME: Need more information about argument size; see // https://reviews.llvm.org/D132926 uint64_t ArgSizeBytes = AttrList.getParamArm64ECArgSizeBytes(I); +Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne(); #else uint64_t ArgSizeBytes = 0; +Align ParamAlign = Align(); #endif Type *Arm64Ty, *X64Ty; canonicalizeThunkType(FT->getParamType(I), ParamAlign, @@ -297,7 +298,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType( uint64_t TotalSizeBytes = ElementCnt * ElementSizePerBytes; if (ElementTy->isFloatTy() || ElementTy->isDoubleTy()) { Out << (ElementTy->isFloatTy() ? 
"F" : "D") << TotalSizeBytes; - if (Alignment.value() >= 8 && !T->isPointerTy()) + if (Alignment.value() >= 16 && !Ret) Out << "a" << Alignment.value(); Arm64Ty = T; if (TotalSizeBytes <= 8) { @@ -328,7 +329,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType( Out << "m"; if (TypeSize != 4) Out << TypeSize; - if (Alignment.value() >= 8 && !T->isPointerTy()) + if (Alignment.value() >= 16 && !Ret) Out << "a" << Alignment.value(); // FIXME: Try to canonicalize Arm64Ty more thoroughly? Arm64Ty = T; @@ -513,7 +514,14 @@ Function *AArch64Arm64ECCallLowering::buildEntryThunk(Function *F) { // Call the function passed to the thunk. Value *Callee = Thunk->getArg(0); Callee = IRB.CreateBitCast(Callee, PtrTy); - Value *Call = IRB.CreateCall(Arm64Ty, Callee, Args); + CallInst *Call = IRB.CreateCall(Arm64Ty, Callee, Args); + + auto SRetAttr = F->getAttributes().getParamAttr(0, Attribute::StructRet); + auto InRegAttr = F->getAttributes().getParamAttr(0, Attribute::InReg); + if (SRetAttr.isValid() && !InRegAttr.isValid()) { +Thunk->addParamAttr(1, SRetAttr); +Call->addParamAttr(0, SRetAttr); + } Value *RetVal = Call; if (TransformDirectToSRet) { diff --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll index bb9ba05f7a272..e9556b9d5cbee 100644 --- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll +++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll @@ -222,12 +222,12 @@ define i8 @matches_has_sret() nounwind { } %TSRet = type { i64, i64 } -define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind { -; CHECK-LABEL:.def$ientry_thunk$cdecl$m16a32$v; -; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16a32$v +define void @has_aligned_sret(ptr align 32 sret(%TSRet), i32) nounwind { +; CHECK-LABEL:.def$ientry_thunk$cdecl$m16$i8; +; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$i8 ; CHECK: // %bb.0: -; CHECK-NEXT: stp q6, q7, [sp, #-176]!// 32-byte Folded Spill -; CHECK-NEXT: 
.seh_save_any_reg_pxq6, 176 +; CHECK-NEXT: stp q6, q7, [sp, #-192]!// 32-byte Folded Spill +; CHECK-NEXT: .seh_save_any_reg_pxq6, 192 ; CHECK-NEXT: stp q8, q9, [sp, #32] // 32-byte Folded Spill ; CHECK-NEXT: .seh_save_any_reg_p q8, 32 ; CHECK-NEXT: stp q10, q11, [sp, #64] // 32-byte Folded Spill @@ -236,17 +236,25 @@ define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind { ; CHECK-NEXT: .seh_save_any_reg_p q12, 96 ; CHECK-NEXT: stp q14, q15, [sp, #128]// 32-byte Folded Spill ; CHECK-NEXT: .seh_save_any_reg_p q14, 128 -; CHECK-NEXT: stp x29, x30, [sp, #160]// 16-byte Folded Spill -; CHECK-NEXT: .seh_save_fplr 160 -; CHECK-NEXT: add x29, sp, #160 -; CHECK-NEXT: .seh_add_fp 160 +; CHECK-NEXT: str x19, [sp, #160] // 8-byte Folded Spill +; CHECK-NEXT: .seh_save_reg x19, 160 +; CHECK-NEXT:
[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)
https://github.com/dpaoliello created https://github.com/llvm/llvm-project/pull/92580 Backports !90115 and !92326 Release notes: Fixes issues where LLVM is either generating the incorrect thunk for a function with aligned parameters or didn't correctly pass through the return value when `StructRet` was used. >From 5e0477fafd6aa8ea8451a7ea4968f407ca893aef Mon Sep 17 00:00:00 2001 From: Eli Friedman Date: Fri, 26 Apr 2024 11:06:11 -0700 Subject: [PATCH 1/2] [Arm64EC] Improve alignment mangling in arm64ec thunks. (#90115) In some cases, MSVC's mangling for arm64ec thunks includes the alignment of a struct. I added some code to try to match... but it never really worked right. The issues: - Alignment is only mangled if it's 16 or more (I guess the default is supposed to be 8). - Alignment isn't mangled on return values (since the memory is allocated by the caller). The current patch leaves hooks to make alignment mangling work... but doesn't actually ever mangle alignment: clang never actually encodes a relevant alignment into the IR. Once we get clang to emit the real size/alignment of structs, we can start emitting it. 
--- llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp | 7 --- llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll | 6 +++--- llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll | 10 +- 3 files changed, 12 insertions(+), 11 deletions(-) diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp index 55c5bbc66a3f4..d4dd28aecac48 100644 --- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp +++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp @@ -181,13 +181,14 @@ void AArch64Arm64ECCallLowering::getThunkArgTypes( } for (unsigned E = FT->getNumParams(); I != E; ++I) { -Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne(); #if 0 // FIXME: Need more information about argument size; see // https://reviews.llvm.org/D132926 uint64_t ArgSizeBytes = AttrList.getParamArm64ECArgSizeBytes(I); +Align ParamAlign = AttrList.getParamAlignment(I).valueOrOne(); #else uint64_t ArgSizeBytes = 0; +Align ParamAlign = Align(); #endif Type *Arm64Ty, *X64Ty; canonicalizeThunkType(FT->getParamType(I), ParamAlign, @@ -297,7 +298,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType( uint64_t TotalSizeBytes = ElementCnt * ElementSizePerBytes; if (ElementTy->isFloatTy() || ElementTy->isDoubleTy()) { Out << (ElementTy->isFloatTy() ? "F" : "D") << TotalSizeBytes; - if (Alignment.value() >= 8 && !T->isPointerTy()) + if (Alignment.value() >= 16 && !Ret) Out << "a" << Alignment.value(); Arm64Ty = T; if (TotalSizeBytes <= 8) { @@ -328,7 +329,7 @@ void AArch64Arm64ECCallLowering::canonicalizeThunkType( Out << "m"; if (TypeSize != 4) Out << TypeSize; - if (Alignment.value() >= 8 && !T->isPointerTy()) + if (Alignment.value() >= 16 && !Ret) Out << "a" << Alignment.value(); // FIXME: Try to canonicalize Arm64Ty more thoroughly? 
Arm64Ty = T; diff --git a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll index bb9ba05f7a272..c00c9bfe127e8 100644 --- a/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll +++ b/llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll @@ -223,8 +223,8 @@ define i8 @matches_has_sret() nounwind { %TSRet = type { i64, i64 } define void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind { -; CHECK-LABEL:.def$ientry_thunk$cdecl$m16a32$v; -; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16a32$v +; CHECK-LABEL:.def$ientry_thunk$cdecl$m16$v; +; CHECK: .section .wowthk$aa,"xr",discard,$ientry_thunk$cdecl$m16$v ; CHECK: // %bb.0: ; CHECK-NEXT: stp q6, q7, [sp, #-176]!// 32-byte Folded Spill ; CHECK-NEXT: .seh_save_any_reg_pxq6, 176 @@ -457,7 +457,7 @@ define %T2 @simple_struct(%T1 %0, %T2 %1, %T3, %T4) nounwind { ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$i8$v ; CHECK-NEXT: .word 1 ; CHECK-NEXT: .symidx "#has_aligned_sret" -; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16a32$v +; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m16$v ; CHECK-NEXT: .word 1 ; CHECK-NEXT: .symidx "#small_array" ; CHECK-NEXT: .symidx $ientry_thunk$cdecl$m2$m2F8 diff --git a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll index 3b911e78aff2a..7a40fcd85ac58 100644 --- a/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll +++ b/llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll @@ -236,8 +236,8 @@ declare void @has_sret(ptr sret([100 x i8])) nounwind; %TSRet = type { i64, i64 } declare void @has_aligned_sret(ptr align 32 sret(%TSRet)) nounwind; -; CHECK-LABEL:.def$iexit_thunk$cdecl$m16a32$v; -; CHECK: .section
[llvm-branch-commits] [llvm] [release/18.x] Backport fixes for ARM64EC thunk generation (PR #92580)
https://github.com/dpaoliello milestoned https://github.com/llvm/llvm-project/pull/92580
[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)
preames wrote: I'm fine with this approach. No strong opinion either way, but definitely don't let my previous comments be blocking here. https://github.com/llvm/llvm-project/pull/92143
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)
skatrak wrote: Here's a link to the RFC about this proposal, with links to all related PRs: https://discourse.llvm.org/t/rfc-clause-based-representation-of-openmp-dialect-operations/79053 https://github.com/llvm/llvm-project/pull/92519
[llvm-branch-commits] [llvm] release/18.x: InstCombine: Process addrspacecast uses in PointerReplacer (#91953) (PR #92479)
https://github.com/AtariDreams closed https://github.com/llvm/llvm-project/pull/92479
[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)
nikic wrote: The approach looks reasonable to me. https://github.com/llvm/llvm-project/pull/92143
[llvm-branch-commits] [llvm] release/18.x: InstCombine: Process addrspacecast uses in PointerReplacer (#91953) (PR #92479)
https://github.com/nikic requested changes to this pull request. There are test failures. Generally I don't have enough confidence in this change for a last-minute backport. https://github.com/llvm/llvm-project/pull/92479
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)
skatrak wrote: > I like the idea, but also have some questions: > > 1. How does an operation definition look like with all its clauses > defined? Does it have to be repeated for each operation supporting the same > clause(s)? You can see how things would look like in #92523, but I can copy one example here: ```tablegen def TeamsOp : OpenMP_Op<"teams", traits = [ AttrSizedOperandSegments, RecursiveMemoryEffects ], clauses = [ OpenMP_NumTeamsClause, OpenMP_IfClause, OpenMP_ThreadLimitClause, OpenMP_AllocateClause, OpenMP_ReductionClause ], singleRegion = true> { let summary = "teams construct"; let description = [{ The teams construct defines a region of code that triggers the creation of a league of teams. Once created, the number of teams remains constant for the duration of its code region. }] # clausesDescription; let builders = [ OpBuilder<(ins CArg<"const TeamsClauseOps &">:$clauses)> ]; let hasVerifier = 1; } ``` > > 2. It seems this requires all properties of a clause specified in its > constructor. Wouldn't it be better if you could subclass `OpenMP_Clause` and > in there overwrite all the properties that are non-default? > Actually this only requires the list of clauses themselves to be passed in to the definition of the `OpenMP_Op`. They are the ones that define all these properties, and they indeed all subclass `OpenMP_Clause` (see in #92521). > 3. Have you considered to define/derive from all this info in OpenMP.td > of LLVMFrontend? Clause information should be language-independent. I get that, the issue is that the information being used here is very much MLIR dialect specific, and OMP.td in LLVMFrontend doesn't seem like the right place to be defining these things, since it's outside of the MLIR project. Maybe descriptions could potentially be moved there, but then we'd have some sort of matching system between clauses in LLVMFrontend and the ones in MLIR. Not sure how feasible that would be. 
https://github.com/llvm/llvm-project/pull/92519
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)
Meinersbur wrote:

I like the idea, but also have some questions:

1. How does an operation definition look like with all its clauses defined? Does it have to be repeated for each operation supporting the same clause(s)?
2. It seems this requires all properties of a clause specified in its constructor. Wouldn't it be better if you could subclass `OpenMP_Clause` and in there overwrite all the properties that are non-default?
3. Have you considered to define/derive from all this info in OpenMP.td of LLVMFrontend? Clause information should be language-independent.

https://github.com/llvm/llvm-project/pull/92519
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)
kiranchandramohan wrote:

@skatrak Could you copy the summary of the patch and create an RFC?

https://github.com/llvm/llvm-project/pull/92519
[llvm-branch-commits] [flang] [Flang][OpenMP] Update flang with changes to the OpenMP dialect (PR #92524)
llvmbot wrote:

@llvm/pr-subscribers-flang-fir-hlfir
@llvm/pr-subscribers-flang-openmp

Author: Sergio Afonso (skatrak)

Changes

This patch applies fixes after the updates to OpenMP clause operands, as well as updating some tests that were impacted by changes to the ordering or assembly format of some clauses in MLIR.

---

Patch is 20.82 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92524.diff

10 Files Affected:

- (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+2-2)
- (modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+2-2)
- (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+10-9)
- (modified) flang/test/Lower/OpenMP/atomic-capture.f90 (+1-1)
- (modified) flang/test/Lower/OpenMP/copyin-order.f90 (+1-1)
- (modified) flang/test/Lower/OpenMP/parallel-wsloop.f90 (+1-1)
- (modified) flang/test/Lower/OpenMP/parallel.f90 (+12-12)
- (modified) flang/test/Lower/OpenMP/simd.f90 (+1-1)
- (modified) flang/test/Lower/OpenMP/target.f90 (+12-12)
- (modified) flang/test/Lower/OpenMP/use-device-ptr-to-use-device-addr.f90 (+1-1)

```diff
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index b7198c951c8fe..357cc09bfb445 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -997,7 +997,7 @@ bool ClauseProcessor::processEnter(
 }
 
 bool ClauseProcessor::processUseDeviceAddr(
-    mlir::omp::UseDeviceClauseOps ,
+    mlir::omp::UseDeviceAddrClauseOps ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl )
@@ -1011,7 +1011,7 @@ bool ClauseProcessor::processUseDeviceAddr(
 }
 
 bool ClauseProcessor::processUseDevicePtr(
-    mlir::omp::UseDeviceClauseOps ,
+    mlir::omp::UseDevicePtrClauseOps ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl )
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index 78c148ab02163..220ea7b6d9920 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -128,13 +128,13 @@ class ClauseProcessor {
       mlir::omp::ReductionClauseOps ) const;
   bool processTo(llvm::SmallVectorImpl ) const;
   bool
-  processUseDeviceAddr(mlir::omp::UseDeviceClauseOps ,
+  processUseDeviceAddr(mlir::omp::UseDeviceAddrClauseOps ,
                        llvm::SmallVectorImpl ,
                        llvm::SmallVectorImpl ,
                        llvm::SmallVectorImpl ) const;
   bool
-  processUseDevicePtr(mlir::omp::UseDeviceClauseOps ,
+  processUseDevicePtr(mlir::omp::UseDevicePtrClauseOps ,
                       llvm::SmallVectorImpl ,
                       llvm::SmallVectorImpl ,
                       llvm::SmallVectorImpl
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 44011ad78f2e2..2f612dd6f2fb6 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -239,7 +239,8 @@ createAndSetPrivatizedLoopVar(Fortran::lower::AbstractConverter ,
 // clause. Support for such list items in a use_device_ptr clause
 // is deprecated."
 static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
-    mlir::omp::UseDeviceClauseOps ,
+    llvm::SmallVectorImpl ,
+    llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl
@@ -252,10 +253,9 @@ static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
 
   // Iterate over our use_device_ptr list and shift all non-cptr arguments into
   // use_device_addr.
-  for (auto *it = clauseOps.useDevicePtrVars.begin();
-       it != clauseOps.useDevicePtrVars.end();) {
+  for (auto *it = useDevicePtrVars.begin(); it != useDevicePtrVars.end();) {
     if (!fir::isa_builtin_cptr_type(fir::unwrapRefType(it->getType( {
-      clauseOps.useDeviceAddrVars.push_back(*it);
+      useDeviceAddrVars.push_back(*it);
       // We have to shuffle the symbols around as well, to maintain
       // the correct Input -> BlockArg for use_device_ptr/use_device_addr.
       // NOTE: However, as map's do not seem to be included currently
@@ -263,11 +263,11 @@ static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
       // future alterations. I believe the reason they are not currently
       // is that the BlockArg assign/lowering needs to be extended
       // to a greater set of types.
-      auto idx = std::distance(clauseOps.useDevicePtrVars.begin(), it);
+      auto idx = std::distance(useDevicePtrVars.begin(), it);
       moveElementToBack(idx, useDeviceTypes);
       moveElementToBack(idx, useDeviceLocs);
       moveElementToBack(idx, useDeviceSymbols);
-      it = clauseOps.useDevicePtrVars.erase(it);
+      it = useDevicePtrVars.erase(it);
       continue;
     }
     ++it;
@@ -1005,7 +1005,7 @@ genCriticalDeclareClauses(Fortran::lower::AbstractConverter
```
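The loop touched in `promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr` erases from one list while iterating and keeps the parallel type/location/symbol lists in step. The following is a minimal standalone sketch of that pattern, not the flang code: `Var`, `promoteNonCPtrToAddr` and the `std::vector`-based `moveElementToBack` are hypothetical stand-ins, and it assumes the parallel list stores the use_device_ptr entries first, followed by the use_device_addr entries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for an MLIR value: a name plus a flag that says
// whether its type is a builtin C pointer type.
struct Var {
  std::string name;
  bool isCPtr;
};

// Assumed helper: move the element at `idx` to the end of `v`, preserving
// the relative order of all other elements.
template <typename T>
void moveElementToBack(std::size_t idx, std::vector<T> &v) {
  std::rotate(v.begin() + idx, v.begin() + idx + 1, v.end());
}

// Shift every non-C-pointer entry from `ptrVars` to the back of `addrVars`.
// `symbols` is a parallel list laid out as [ptr entries..., addr entries...],
// so the matching symbol is moved to the back as well.
void promoteNonCPtrToAddr(std::vector<Var> &ptrVars,
                          std::vector<Var> &addrVars,
                          std::vector<std::string> &symbols) {
  for (auto it = ptrVars.begin(); it != ptrVars.end();) {
    if (!it->isCPtr) {
      addrVars.push_back(*it);
      auto idx = static_cast<std::size_t>(std::distance(ptrVars.begin(), it));
      moveElementToBack(idx, symbols);
      // erase() returns the iterator to the next element, so do not advance.
      it = ptrVars.erase(it);
      continue;
    }
    ++it;
  }
}
```

Passing the two variable lists separately, as the patch does, means the function no longer needs a combined clause-operands structure just to reach them.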
[llvm-branch-commits] [flang] [Flang][OpenMP] Update flang with changes to the OpenMP dialect (PR #92524)
https://github.com/skatrak created https://github.com/llvm/llvm-project/pull/92524

This patch applies fixes after the updates to OpenMP clause operands, as well as updating some tests that were impacted by changes to the ordering or assembly format of some clauses in MLIR.

>From 522812fb4354812e3bcfaf1b1e52dfa9e0db05ae Mon Sep 17 00:00:00 2001
From: Sergio Afonso
Date: Fri, 17 May 2024 11:38:36 +0100
Subject: [PATCH] [Flang][OpenMP] Update flang with changes to the OpenMP dialect

This patch applies fixes after the updates to OpenMP clause operands, as well as updating some tests that were impacted by changes to the ordering or assembly format of some clauses in MLIR.
---
 flang/lib/Lower/OpenMP/ClauseProcessor.cpp    |  4 ++--
 flang/lib/Lower/OpenMP/ClauseProcessor.h      |  4 ++--
 flang/lib/Lower/OpenMP/OpenMP.cpp             | 19 ---
 flang/test/Lower/OpenMP/atomic-capture.f90    |  2 +-
 flang/test/Lower/OpenMP/copyin-order.f90      |  2 +-
 flang/test/Lower/OpenMP/parallel-wsloop.f90   |  2 +-
 flang/test/Lower/OpenMP/parallel.f90          | 24 +--
 flang/test/Lower/OpenMP/simd.f90              |  2 +-
 flang/test/Lower/OpenMP/target.f90            | 24 +--
 .../use-device-ptr-to-use-device-addr.f90     |  2 +-
 10 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index b7198c951c8fe..357cc09bfb445 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -997,7 +997,7 @@ bool ClauseProcessor::processEnter(
 }
 
 bool ClauseProcessor::processUseDeviceAddr(
-    mlir::omp::UseDeviceClauseOps ,
+    mlir::omp::UseDeviceAddrClauseOps ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl )
@@ -1011,7 +1011,7 @@ bool ClauseProcessor::processUseDeviceAddr(
 }
 
 bool ClauseProcessor::processUseDevicePtr(
-    mlir::omp::UseDeviceClauseOps ,
+    mlir::omp::UseDevicePtrClauseOps ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl )
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index 78c148ab02163..220ea7b6d9920 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -128,13 +128,13 @@ class ClauseProcessor {
       mlir::omp::ReductionClauseOps ) const;
   bool processTo(llvm::SmallVectorImpl ) const;
   bool
-  processUseDeviceAddr(mlir::omp::UseDeviceClauseOps ,
+  processUseDeviceAddr(mlir::omp::UseDeviceAddrClauseOps ,
                        llvm::SmallVectorImpl ,
                        llvm::SmallVectorImpl ,
                        llvm::SmallVectorImpl ) const;
   bool
-  processUseDevicePtr(mlir::omp::UseDeviceClauseOps ,
+  processUseDevicePtr(mlir::omp::UseDevicePtrClauseOps ,
                       llvm::SmallVectorImpl ,
                       llvm::SmallVectorImpl ,
                       llvm::SmallVectorImpl
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 44011ad78f2e2..2f612dd6f2fb6 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -239,7 +239,8 @@ createAndSetPrivatizedLoopVar(Fortran::lower::AbstractConverter ,
 // clause. Support for such list items in a use_device_ptr clause
 // is deprecated."
 static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
-    mlir::omp::UseDeviceClauseOps ,
+    llvm::SmallVectorImpl ,
+    llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl ,
     llvm::SmallVectorImpl
@@ -252,10 +253,9 @@ static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
 
   // Iterate over our use_device_ptr list and shift all non-cptr arguments into
   // use_device_addr.
-  for (auto *it = clauseOps.useDevicePtrVars.begin();
-       it != clauseOps.useDevicePtrVars.end();) {
+  for (auto *it = useDevicePtrVars.begin(); it != useDevicePtrVars.end();) {
     if (!fir::isa_builtin_cptr_type(fir::unwrapRefType(it->getType( {
-      clauseOps.useDeviceAddrVars.push_back(*it);
+      useDeviceAddrVars.push_back(*it);
       // We have to shuffle the symbols around as well, to maintain
       // the correct Input -> BlockArg for use_device_ptr/use_device_addr.
       // NOTE: However, as map's do not seem to be included currently
@@ -263,11 +263,11 @@ static void promoteNonCPtrUseDevicePtrArgsToUseDeviceAddr(
       // future alterations. I believe the reason they are not currently
       // is that the BlockArg assign/lowering needs to be extended
       // to a greater set of types.
-      auto idx = std::distance(clauseOps.useDevicePtrVars.begin(), it);
+      auto idx = std::distance(useDevicePtrVars.begin(), it);
       moveElementToBack(idx, useDeviceTypes);
       moveElementToBack(idx, useDeviceLocs);
       moveElementToBack(idx,
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Clause-based OpenMP operation definition (PR #92523)
llvmbot wrote:

@llvm/pr-subscribers-mlir-llvm

Author: Sergio Afonso (skatrak)

Changes

This patch updates `OpenMP_Op` definitions to be based on the new set of `OpenMP_Clause` definitions, and to take advantage of clause-based automatically-generated argument lists, descriptions, assembly format and class declarations.

There are also changes introduced to the clause operands structures to match the current set of tablegen clause definitions. These two are very closely linked and should be kept in sync. It would probably be a good idea to try generating clause operands structures from the tablegen `OpenMP_Clause` definitions in the future.

As a result of this change, arguments for some operations have been reordered. This patch also addresses this by updating affected operation build calls and unit tests. Some other updates to tests related to the order of arguments in the resulting assembly format and others due to certain previous inconsistencies in the printing/parsing of clauses are addressed.

The printer and parser functions for the `map` clause are updated, so that they are able to handle `map` clauses linked to entry block arguments as well as those which aren't.

--- Patch is 106.52 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92523.diff 11 Files Affected: - (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h (+26-10) - (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+298-846) - (modified) mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp (+2-1) - (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+50-28) - (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+1-1) - (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+10-10) - (modified) mlir/test/Dialect/OpenMP/ops.mlir (+25-26) - (modified) mlir/test/Target/LLVMIR/omptarget-llvm.mlir (+3-3) - (modified) mlir/test/Target/LLVMIR/omptarget-nowait-llvm.mlir (+1-1) - (modified) mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir (+2-2) - (modified) mlir/test/Target/LLVMIR/openmp-llvm.mlir (+1-1) ``diff diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h index 244cee1dd635b..bd0d44f932981 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h @@ -39,6 +39,10 @@ struct AllocateClauseOps { llvm::SmallVector allocatorVars, allocateVars; }; +struct CancelDirectiveNameClauseOps { + ClauseCancellationConstructTypeAttr cancelDirectiveNameAttr; +}; + struct CollapseClauseOps { llvm::SmallVector loopLBVar, loopUBVar, loopStepVar; }; @@ -48,6 +52,10 @@ struct CopyprivateClauseOps { llvm::SmallVector copyprivateFuncs; }; +struct CriticalNameClauseOps { + StringAttr criticalNameAttr; +}; + struct DependClauseOps { llvm::SmallVector dependTypeAttrs; llvm::SmallVector dependVars; @@ -84,6 +92,7 @@ struct GrainsizeClauseOps { struct HasDeviceAddrClauseOps { llvm::SmallVector hasDeviceAddrVars; }; + struct HintClauseOps { IntegerAttr hintAttr; }; @@ -117,10 +126,6 @@ struct MergeableClauseOps { UnitAttr mergeableAttr; }; -struct NameClauseOps { 
- StringAttr nameAttr; -}; - struct NogroupClauseOps { UnitAttr nogroupAttr; }; @@ -209,8 +214,12 @@ struct UntiedClauseOps { UnitAttr untiedAttr; }; -struct UseDeviceClauseOps { - llvm::SmallVector useDevicePtrVars, useDeviceAddrVars; +struct UseDeviceAddrClauseOps { + llvm::SmallVector useDeviceAddrVars; +}; + +struct UseDevicePtrClauseOps { + llvm::SmallVector useDevicePtrVars; }; //===--===// @@ -225,7 +234,13 @@ template struct Clauses : public Mixins... {}; } // namespace detail -using CriticalClauseOps = detail::Clauses; +using CancelClauseOps = +detail::Clauses; + +using CancellationPointClauseOps = +detail::Clauses; + +using CriticalClauseOps = detail::Clauses; // TODO `indirect` clause. using DeclareTargetClauseOps = detail::Clauses; @@ -264,10 +279,11 @@ using TargetClauseOps = detail::Clauses; +PrivateClauseOps, ThreadLimitClauseOps>; -using TargetDataClauseOps = detail::Clauses; +using TargetDataClauseOps = +detail::Clauses; using TargetEnterExitUpdateDataClauseOps = detail::Clauses { //===--===// def ParallelOp : OpenMP_Op<"parallel", [ - AutomaticAllocationScope, AttrSizedOperandSegments, - DeclareOpInterfaceMethods, - DeclareOpInterfaceMethods, - RecursiveMemoryEffects, ReductionClauseInterface]> { +AttrSizedOperandSegments, AutomaticAllocationScope, +DeclareOpInterfaceMethods, +DeclareOpInterfaceMethods, +RecursiveMemoryEffects + ], [ +// TODO: Sort clauses alphabetically. +
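The `detail::Clauses` helper in OpenMPClauseOperands.h composes per-clause operand structures through inheritance from a pack of mixins. A minimal standalone sketch of that idea follows. It assumes the helper is a variadic template (the archived diff has lost its angle-bracket contents), replaces MLIR values and attributes with plain `int` and `std::string` stand-ins, and uses a hypothetical `ExampleClauseOps` composition and `countAddrVars` function for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for an MLIR value; the real structures hold MLIR
// values and attributes.
using Value = int;

// One small structure per clause, named after those in the diff above but
// with simplified member types.
struct UseDeviceAddrClauseOps {
  std::vector<Value> useDeviceAddrVars;
};

struct UseDevicePtrClauseOps {
  std::vector<Value> useDevicePtrVars;
};

struct CriticalNameClauseOps {
  std::string criticalNameAttr;
};

namespace detail {
// Assumed shape of the helper: inherit from every clause structure listed.
template <typename... Mixins>
struct Clauses : public Mixins... {};
} // namespace detail

// Hypothetical composition, for illustration only.
using ExampleClauseOps =
    detail::Clauses<CriticalNameClauseOps, UseDeviceAddrClauseOps,
                    UseDevicePtrClauseOps>;

// A function that only needs one clause can accept just that base, and any
// composition that includes the clause converts to it implicitly.
inline std::size_t countAddrVars(const UseDeviceAddrClauseOps &ops) {
  return ops.useDeviceAddrVars.size();
}
```

Splitting `UseDeviceClauseOps` into two structures fits this scheme: each clause contributes exactly one base, so an operation's operand structure can list only the clauses it supports.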
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Clause-based OpenMP operation definition (PR #92523)
llvmbot wrote:

@llvm/pr-subscribers-mlir

Author: Sergio Afonso (skatrak)

Changes

This patch updates `OpenMP_Op` definitions to be based on the new set of `OpenMP_Clause` definitions, and to take advantage of clause-based automatically-generated argument lists, descriptions, assembly format and class declarations.

There are also changes introduced to the clause operands structures to match the current set of tablegen clause definitions. These two are very closely linked and should be kept in sync. It would probably be a good idea to try generating clause operands structures from the tablegen `OpenMP_Clause` definitions in the future.

As a result of this change, arguments for some operations have been reordered. This patch also addresses this by updating affected operation build calls and unit tests. Some other updates to tests related to the order of arguments in the resulting assembly format and others due to certain previous inconsistencies in the printing/parsing of clauses are addressed.

The printer and parser functions for the `map` clause are updated, so that they are able to handle `map` clauses linked to entry block arguments as well as those which aren't.

--- Patch is 106.52 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92523.diff 11 Files Affected: - (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h (+26-10) - (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+298-846) - (modified) mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp (+2-1) - (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+50-28) - (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+1-1) - (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+10-10) - (modified) mlir/test/Dialect/OpenMP/ops.mlir (+25-26) - (modified) mlir/test/Target/LLVMIR/omptarget-llvm.mlir (+3-3) - (modified) mlir/test/Target/LLVMIR/omptarget-nowait-llvm.mlir (+1-1) - (modified) mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir (+2-2) - (modified) mlir/test/Target/LLVMIR/openmp-llvm.mlir (+1-1) ``diff diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h index 244cee1dd635b..bd0d44f932981 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h @@ -39,6 +39,10 @@ struct AllocateClauseOps { llvm::SmallVector allocatorVars, allocateVars; }; +struct CancelDirectiveNameClauseOps { + ClauseCancellationConstructTypeAttr cancelDirectiveNameAttr; +}; + struct CollapseClauseOps { llvm::SmallVector loopLBVar, loopUBVar, loopStepVar; }; @@ -48,6 +52,10 @@ struct CopyprivateClauseOps { llvm::SmallVector copyprivateFuncs; }; +struct CriticalNameClauseOps { + StringAttr criticalNameAttr; +}; + struct DependClauseOps { llvm::SmallVector dependTypeAttrs; llvm::SmallVector dependVars; @@ -84,6 +92,7 @@ struct GrainsizeClauseOps { struct HasDeviceAddrClauseOps { llvm::SmallVector hasDeviceAddrVars; }; + struct HintClauseOps { IntegerAttr hintAttr; }; @@ -117,10 +126,6 @@ struct MergeableClauseOps { UnitAttr mergeableAttr; }; -struct NameClauseOps { 
- StringAttr nameAttr; -}; - struct NogroupClauseOps { UnitAttr nogroupAttr; }; @@ -209,8 +214,12 @@ struct UntiedClauseOps { UnitAttr untiedAttr; }; -struct UseDeviceClauseOps { - llvm::SmallVector useDevicePtrVars, useDeviceAddrVars; +struct UseDeviceAddrClauseOps { + llvm::SmallVector useDeviceAddrVars; +}; + +struct UseDevicePtrClauseOps { + llvm::SmallVector useDevicePtrVars; }; //===--===// @@ -225,7 +234,13 @@ template struct Clauses : public Mixins... {}; } // namespace detail -using CriticalClauseOps = detail::Clauses; +using CancelClauseOps = +detail::Clauses; + +using CancellationPointClauseOps = +detail::Clauses; + +using CriticalClauseOps = detail::Clauses; // TODO `indirect` clause. using DeclareTargetClauseOps = detail::Clauses; @@ -264,10 +279,11 @@ using TargetClauseOps = detail::Clauses; +PrivateClauseOps, ThreadLimitClauseOps>; -using TargetDataClauseOps = detail::Clauses; +using TargetDataClauseOps = +detail::Clauses; using TargetEnterExitUpdateDataClauseOps = detail::Clauses { //===--===// def ParallelOp : OpenMP_Op<"parallel", [ - AutomaticAllocationScope, AttrSizedOperandSegments, - DeclareOpInterfaceMethods, - DeclareOpInterfaceMethods, - RecursiveMemoryEffects, ReductionClauseInterface]> { +AttrSizedOperandSegments, AutomaticAllocationScope, +DeclareOpInterfaceMethods, +DeclareOpInterfaceMethods, +RecursiveMemoryEffects + ], [ +// TODO: Sort clauses alphabetically. +
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Clause-based OpenMP operation definition (PR #92523)
llvmbot wrote:

@llvm/pr-subscribers-flang-openmp

Author: Sergio Afonso (skatrak)

Changes

This patch updates `OpenMP_Op` definitions to be based on the new set of `OpenMP_Clause` definitions, and to take advantage of clause-based automatically-generated argument lists, descriptions, assembly format and class declarations.

There are also changes introduced to the clause operands structures to match the current set of tablegen clause definitions. These two are very closely linked and should be kept in sync. It would probably be a good idea to try generating clause operands structures from the tablegen `OpenMP_Clause` definitions in the future.

As a result of this change, arguments for some operations have been reordered. This patch also addresses this by updating affected operation build calls and unit tests. Some other updates to tests related to the order of arguments in the resulting assembly format and others due to certain previous inconsistencies in the printing/parsing of clauses are addressed.

The printer and parser functions for the `map` clause are updated, so that they are able to handle `map` clauses linked to entry block arguments as well as those which aren't.

--- Patch is 106.52 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92523.diff 11 Files Affected: - (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h (+26-10) - (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+298-846) - (modified) mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp (+2-1) - (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+50-28) - (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+1-1) - (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+10-10) - (modified) mlir/test/Dialect/OpenMP/ops.mlir (+25-26) - (modified) mlir/test/Target/LLVMIR/omptarget-llvm.mlir (+3-3) - (modified) mlir/test/Target/LLVMIR/omptarget-nowait-llvm.mlir (+1-1) - (modified) mlir/test/Target/LLVMIR/omptarget-parallel-llvm.mlir (+2-2) - (modified) mlir/test/Target/LLVMIR/openmp-llvm.mlir (+1-1) ``diff diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h index 244cee1dd635b..bd0d44f932981 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauseOperands.h @@ -39,6 +39,10 @@ struct AllocateClauseOps { llvm::SmallVector allocatorVars, allocateVars; }; +struct CancelDirectiveNameClauseOps { + ClauseCancellationConstructTypeAttr cancelDirectiveNameAttr; +}; + struct CollapseClauseOps { llvm::SmallVector loopLBVar, loopUBVar, loopStepVar; }; @@ -48,6 +52,10 @@ struct CopyprivateClauseOps { llvm::SmallVector copyprivateFuncs; }; +struct CriticalNameClauseOps { + StringAttr criticalNameAttr; +}; + struct DependClauseOps { llvm::SmallVector dependTypeAttrs; llvm::SmallVector dependVars; @@ -84,6 +92,7 @@ struct GrainsizeClauseOps { struct HasDeviceAddrClauseOps { llvm::SmallVector hasDeviceAddrVars; }; + struct HintClauseOps { IntegerAttr hintAttr; }; @@ -117,10 +126,6 @@ struct MergeableClauseOps { UnitAttr mergeableAttr; }; -struct NameClauseOps { 
- StringAttr nameAttr; -}; - struct NogroupClauseOps { UnitAttr nogroupAttr; }; @@ -209,8 +214,12 @@ struct UntiedClauseOps { UnitAttr untiedAttr; }; -struct UseDeviceClauseOps { - llvm::SmallVector useDevicePtrVars, useDeviceAddrVars; +struct UseDeviceAddrClauseOps { + llvm::SmallVector useDeviceAddrVars; +}; + +struct UseDevicePtrClauseOps { + llvm::SmallVector useDevicePtrVars; }; //===--===// @@ -225,7 +234,13 @@ template struct Clauses : public Mixins... {}; } // namespace detail -using CriticalClauseOps = detail::Clauses; +using CancelClauseOps = +detail::Clauses; + +using CancellationPointClauseOps = +detail::Clauses; + +using CriticalClauseOps = detail::Clauses; // TODO `indirect` clause. using DeclareTargetClauseOps = detail::Clauses; @@ -264,10 +279,11 @@ using TargetClauseOps = detail::Clauses; +PrivateClauseOps, ThreadLimitClauseOps>; -using TargetDataClauseOps = detail::Clauses; +using TargetDataClauseOps = +detail::Clauses; using TargetEnterExitUpdateDataClauseOps = detail::Clauses { //===--===// def ParallelOp : OpenMP_Op<"parallel", [ - AutomaticAllocationScope, AttrSizedOperandSegments, - DeclareOpInterfaceMethods, - DeclareOpInterfaceMethods, - RecursiveMemoryEffects, ReductionClauseInterface]> { +AttrSizedOperandSegments, AutomaticAllocationScope, +DeclareOpInterfaceMethods, +DeclareOpInterfaceMethods, +RecursiveMemoryEffects + ], [ +// TODO: Sort clauses
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add `OpenMP_Clause` tablegen definitions (PR #92521)
llvmbot wrote: @llvm/pr-subscribers-mlir Author: Sergio Afonso (skatrak) Changes This patch adds a new tablegen file for the OpenMP dialect containing the list of clauses currently supported. --- Patch is 44.02 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92521.diff 1 Files Affected: - (added) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+1183) ``diff diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td new file mode 100644 index 0..8b3a53a5842f3 --- /dev/null +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td @@ -0,0 +1,1183 @@ +//=== OpenMPClauses.td - OpenMP dialect clause definitions -*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file contains clause definitions for the OpenMP dialect. +// +// For each "Xyz" clause, there is an "OpenMP_XyzClauseSkip" class and an +// "OpenMP_XyzClause" definition. The latter is an instantiation of the former +// where all "skip" template parameters are set to `false` and should be the +// preferred variant to used whenever possible when defining `OpenMP_Op` +// instances. 
+// +//===--===// + +#ifndef OPENMP_CLAUSES +#define OPENMP_CLAUSES + +include "mlir/Dialect/OpenMP/OpenMPOpBase.td" + +//===--===// +// V5.2: [5.11] `aligned` clause +//===--===// + +class OpenMP_AlignedClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let arguments = (ins +Variadic:$aligned_vars, +OptionalAttr:$alignment_values + ); + + let assemblyFormat = [{ +`aligned` `(` custom($aligned_vars, type($aligned_vars), +$alignment_values) `)` + }]; + + let description = [{ +The `alignment_values` attribute additionally specifies alignment of each +corresponding aligned operand. Note that `aligned_vars` and +`alignment_values` should contain the same number of elements. + }]; +} + +def OpenMP_AlignedClause : OpenMP_AlignedClauseSkip<>; + +//===--===// +// V5.2: [6.6] `allocate` clause +//===--===// + +class OpenMP_AllocateClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let arguments = (ins +Variadic:$allocate_vars, +Variadic:$allocators_vars + ); + + let assemblyFormat = [{ +`allocate` `(` + custom($allocate_vars, type($allocate_vars), + $allocators_vars, type($allocators_vars)) `)` + }]; + + let description = [{ +The `allocators_vars` and `allocate_vars` parameters are a variadic list of +values that specify the memory allocator to be used to obtain storage for +private values. 
+ }]; +} + +def OpenMP_AllocateClause : OpenMP_AllocateClauseSkip<>; + +//===--===// +// V5.2: [16.1, 16.2] `cancel-directive-name` clause set +//===--===// + +class OpenMP_CancelDirectiveNameClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let arguments = (ins +CancellationConstructTypeAttr:$cancellation_construct_type_val + ); + + let assemblyFormat = [{ +`cancellation_construct_type` `(` + custom($cancellation_construct_type_val) `)` + }]; + + // TODO: Add description. +} + +def OpenMP_CancelDirectiveNameClause : OpenMP_CancelDirectiveNameClauseSkip<>; + +//===--===// +// V5.2: [4.4.3] `collapse` clause +//===--===// + +class OpenMP_CollapseClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let traits = [ +AllTypesMatch<["lowerBound", "upperBound", "step"]> + ]; + + let arguments = (ins +Variadic:$lowerBound, +Variadic:$upperBound, +Variadic:$step + ); + + let extraClassDeclaration = [{ +/// Returns the number
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add `OpenMP_Clause` tablegen definitions (PR #92521)
llvmbot wrote: @llvm/pr-subscribers-mlir-openmp Author: Sergio Afonso (skatrak) Changes This patch adds a new tablegen file for the OpenMP dialect containing the list of clauses currently supported. --- Patch is 44.02 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92521.diff 1 Files Affected: - (added) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+1183) ``diff diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td new file mode 100644 index 0..8b3a53a5842f3 --- /dev/null +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td @@ -0,0 +1,1183 @@ +//=== OpenMPClauses.td - OpenMP dialect clause definitions -*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file contains clause definitions for the OpenMP dialect. +// +// For each "Xyz" clause, there is an "OpenMP_XyzClauseSkip" class and an +// "OpenMP_XyzClause" definition. The latter is an instantiation of the former +// where all "skip" template parameters are set to `false` and should be the +// preferred variant to used whenever possible when defining `OpenMP_Op` +// instances. 
+// +//===--===// + +#ifndef OPENMP_CLAUSES +#define OPENMP_CLAUSES + +include "mlir/Dialect/OpenMP/OpenMPOpBase.td" + +//===--===// +// V5.2: [5.11] `aligned` clause +//===--===// + +class OpenMP_AlignedClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let arguments = (ins +Variadic:$aligned_vars, +OptionalAttr:$alignment_values + ); + + let assemblyFormat = [{ +`aligned` `(` custom($aligned_vars, type($aligned_vars), +$alignment_values) `)` + }]; + + let description = [{ +The `alignment_values` attribute additionally specifies alignment of each +corresponding aligned operand. Note that `aligned_vars` and +`alignment_values` should contain the same number of elements. + }]; +} + +def OpenMP_AlignedClause : OpenMP_AlignedClauseSkip<>; + +//===--===// +// V5.2: [6.6] `allocate` clause +//===--===// + +class OpenMP_AllocateClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let arguments = (ins +Variadic:$allocate_vars, +Variadic:$allocators_vars + ); + + let assemblyFormat = [{ +`allocate` `(` + custom($allocate_vars, type($allocate_vars), + $allocators_vars, type($allocators_vars)) `)` + }]; + + let description = [{ +The `allocators_vars` and `allocate_vars` parameters are a variadic list of +values that specify the memory allocator to be used to obtain storage for +private values. 
+ }]; +} + +def OpenMP_AllocateClause : OpenMP_AllocateClauseSkip<>; + +//===--===// +// V5.2: [16.1, 16.2] `cancel-directive-name` clause set +//===--===// + +class OpenMP_CancelDirectiveNameClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let arguments = (ins +CancellationConstructTypeAttr:$cancellation_construct_type_val + ); + + let assemblyFormat = [{ +`cancellation_construct_type` `(` + custom($cancellation_construct_type_val) `)` + }]; + + // TODO: Add description. +} + +def OpenMP_CancelDirectiveNameClause : OpenMP_CancelDirectiveNameClauseSkip<>; + +//===--===// +// V5.2: [4.4.3] `collapse` clause +//===--===// + +class OpenMP_CollapseClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let traits = [ +AllTypesMatch<["lowerBound", "upperBound", "step"]> + ]; + + let arguments = (ins +Variadic:$lowerBound, +Variadic:$upperBound, +Variadic:$step + ); + + let extraClassDeclaration = [{ +/// Returns the
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add `OpenMP_Clause` tablegen definitions (PR #92521)
https://github.com/skatrak created https://github.com/llvm/llvm-project/pull/92521 This patch adds a new tablegen file for the OpenMP dialect containing the list of clauses currently supported. >From e1aa6cb890dfc8f7f03fade845cff45a163201ff Mon Sep 17 00:00:00 2001 From: Sergio Afonso Date: Fri, 17 May 2024 10:56:32 +0100 Subject: [PATCH] [MLIR][OpenMP] Add `OpenMP_Clause` tablegen definitions This patch adds a new tablegen file for the OpenMP dialect containing the list of clauses currently supported. --- .../mlir/Dialect/OpenMP/OpenMPClauses.td | 1183 + 1 file changed, 1183 insertions(+) create mode 100644 mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td new file mode 100644 index 0..8b3a53a5842f3 --- /dev/null +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td @@ -0,0 +1,1183 @@ +//=== OpenMPClauses.td - OpenMP dialect clause definitions -*- tablegen -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// +// +// This file contains clause definitions for the OpenMP dialect. +// +// For each "Xyz" clause, there is an "OpenMP_XyzClauseSkip" class and an +// "OpenMP_XyzClause" definition. The latter is an instantiation of the former +// where all "skip" template parameters are set to `false` and should be the +// preferred variant to used whenever possible when defining `OpenMP_Op` +// instances. 
+// +//===--===// + +#ifndef OPENMP_CLAUSES +#define OPENMP_CLAUSES + +include "mlir/Dialect/OpenMP/OpenMPOpBase.td" + +//===--===// +// V5.2: [5.11] `aligned` clause +//===--===// + +class OpenMP_AlignedClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let arguments = (ins +Variadic:$aligned_vars, +OptionalAttr:$alignment_values + ); + + let assemblyFormat = [{ +`aligned` `(` custom($aligned_vars, type($aligned_vars), +$alignment_values) `)` + }]; + + let description = [{ +The `alignment_values` attribute additionally specifies alignment of each +corresponding aligned operand. Note that `aligned_vars` and +`alignment_values` should contain the same number of elements. + }]; +} + +def OpenMP_AlignedClause : OpenMP_AlignedClauseSkip<>; + +//===--===// +// V5.2: [6.6] `allocate` clause +//===--===// + +class OpenMP_AllocateClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let arguments = (ins +Variadic:$allocate_vars, +Variadic:$allocators_vars + ); + + let assemblyFormat = [{ +`allocate` `(` + custom($allocate_vars, type($allocate_vars), + $allocators_vars, type($allocators_vars)) `)` + }]; + + let description = [{ +The `allocators_vars` and `allocate_vars` parameters are a variadic list of +values that specify the memory allocator to be used to obtain storage for +private values. 
+ }]; +} + +def OpenMP_AllocateClause : OpenMP_AllocateClauseSkip<>; + +//===--===// +// V5.2: [16.1, 16.2] `cancel-directive-name` clause set +//===--===// + +class OpenMP_CancelDirectiveNameClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > : OpenMP_Clause { + let arguments = (ins +CancellationConstructTypeAttr:$cancellation_construct_type_val + ); + + let assemblyFormat = [{ +`cancellation_construct_type` `(` + custom($cancellation_construct_type_val) `)` + }]; + + // TODO: Add description. +} + +def OpenMP_CancelDirectiveNameClause : OpenMP_CancelDirectiveNameClauseSkip<>; + +//===--===// +// V5.2: [4.4.3] `collapse` clause +//===--===// + +class OpenMP_CollapseClauseSkip< +bit traits = false, bit arguments = false, bit assemblyFormat = false, +bit description = false, bit extraClassDeclaration = false + > :
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)
llvmbot wrote:

@llvm/pr-subscribers-flang-openmp @llvm/pr-subscribers-mlir

Author: Sergio Afonso (skatrak)

Changes

Currently, OpenMP operations are defined independently of each other. However, one property of the OpenMP specification is that many clauses can be applied to multiple constructs. Keeping the MLIR representation of clauses consistent across all operations that can accept them is important, but since this information is scattered into multiple operation definitions, it is currently prone to divergence as new features and changes are added to the dialect. Furthermore, centralizing this information allows for a single source of truth and avoids redundancy in the dialect.

The proposal in this patch is to make OpenMP clauses independent top level definitions which can then be passed in a template argument list to OpenMP operation definitions, just as it's done for traits.

Clauses can define these properties, which are joined together in order to make a default initialization for the fields of the same name of the OpenMP operation:
- `traits`: Optional. It gets added to the list of traits of the operation.
- `arguments`: Mandatory. It defines how the clause is represented.
- `assemblyFormat`: Optional (though it should almost always be defined). This is the declarative definition of the printer/parser for the `arguments`. How these are combined depends on whether this is an optional or required clause.
- `description`: Optional. It's used to populate a `clausesDescription` field, so each operation definition must still define a `description` itself. That field is intended to be appended to the end of the `OpenMP_Op`'s `description`.
- `extraClassDeclaration`: Optional. It can define some C++ code to be added to every OpenMP operation that includes that clause.
To give operation definitions fine-grained control over which features of a given clause are inherited, the `OpenMP_Clause` class takes "skipTraits", "skipArguments", "skipAssemblyFormat", "skipDescription" and "skipExtraClassDeclaration" bit template arguments, each of which inhibits the corresponding feature. These are intended to be used very sparingly, for cases where some of the clauses would otherwise collide in some way.

--- Full diff: https://github.com/llvm/llvm-project/pull/92519.diff 1 Files Affected: - (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td (+163-2) ``diff diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td index b98d87aa74a6f..d93abd63977ef 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td @@ -42,7 +42,168 @@ def OpenMP_MapBoundsType : OpenMP_Type<"MapBounds", "map_bounds_ty"> { // Base classes for OpenMP dialect operations. //===--===// -class OpenMP_Op traits = []> : - Op; +// Base class for representing OpenMP clauses. +// +// Clauses are meant to be used in a mixin-style pattern to help define OpenMP +// operations in a scalable way, since often the same clause can be applied to +// multiple different operations. +// +// To keep the representation of clauses consistent across different operations, +// each clause must define a set of arguments (values and attributes) which will +// become input arguments of each OpenMP operation that accepts that clause. +// +// It is also recommended that an assembly format and description are defined +// for each clause wherever posible, to make sure they are always printed, +// parsed and described in the same way. +// +// Optionally, operation traits and extra class declarations might be attached +// to clauses, which will be forwarded to all operations that include them. +// +// Each clause must specify whether it's required or optional.
This impacts how +// the `assemblyFormat` for operations including it get generated. +// +// An `OpenMP_Op` can inhibit the inheritance of `traits`, `arguments`, +// `assemblyFormat`, `description` and `extraClassDeclaration` fields from any +// given `OpenMP_Clause` by setting to 1 the corresponding "skip" template +// argument bit. +class OpenMP_Clause { + bit required = isRequired; + + bit ignoreTraits = skipTraits; + list traits = []; + + bit ignoreArgs = skipArguments; + dag arguments; + + bit ignoreAsmFormat = skipAssemblyFormat; + string assemblyFormat = ""; + + bit ignoreDesc = skipDescription; + string description = ""; + + bit ignoreExtraDecl = skipExtraClassDeclaration; + string extraClassDeclaration = ""; +} + +// Base class for representing OpenMP operations. +// +// This is a subclass of the builtin `Op` for the OpenMP dialect. By default, +// some of its fields are initialized according to the list of OpenMP clauses +// passed as template argument: +// - `traits`: It is a union of the traits list passed as template argument +// and those inherited from the `traits` field of all
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Support clause-based representation of operations (PR #92519)
https://github.com/skatrak created https://github.com/llvm/llvm-project/pull/92519

Currently, OpenMP operations are defined independently of each other. However, one property of the OpenMP specification is that many clauses can be applied to multiple constructs. Keeping the MLIR representation of clauses consistent across all operations that can accept them is important, but since this information is scattered into multiple operation definitions, it is currently prone to divergence as new features and changes are added to the dialect. Furthermore, centralizing this information allows for a single source of truth and avoids redundancy in the dialect.

The proposal in this patch is to make OpenMP clauses independent top level definitions which can then be passed in a template argument list to OpenMP operation definitions, just as it's done for traits.

Clauses can define these properties, which are joined together in order to make a default initialization for the fields of the same name of the OpenMP operation:
- `traits`: Optional. It gets added to the list of traits of the operation.
- `arguments`: Mandatory. It defines how the clause is represented.
- `assemblyFormat`: Optional (though it should almost always be defined). This is the declarative definition of the printer/parser for the `arguments`. How these are combined depends on whether this is an optional or required clause.
- `description`: Optional. It's used to populate a `clausesDescription` field, so each operation definition must still define a `description` itself. That field is intended to be appended to the end of the `OpenMP_Op`'s `description`.
- `extraClassDeclaration`: Optional. It can define some C++ code to be added to every OpenMP operation that includes that clause.
To give operation definitions fine-grained control over which features of a given clause are inherited, the `OpenMP_Clause` class takes "skipTraits", "skipArguments", "skipAssemblyFormat", "skipDescription" and "skipExtraClassDeclaration" bit template arguments, each of which inhibits the corresponding feature. These are intended to be used very sparingly, for cases where some of the clauses would otherwise collide in some way.

>From fec244fb8403d1ebcabe30cd27cf23b1839b0b65 Mon Sep 17 00:00:00 2001
From: Sergio Afonso
Date: Fri, 17 May 2024 10:20:55 +0100
Subject: [PATCH] [MLIR][OpenMP] Support clause-based representation of operations

Currently, OpenMP operations are defined independently of each other. However, one property of the OpenMP specification is that many clauses can be applied to multiple constructs. Keeping the MLIR representation of clauses consistent across all operations that can accept them is important, but since this information is scattered into multiple operation definitions, it is currently prone to divergence as new features and changes are added to the dialect. Furthermore, centralizing this information allows for a single source of truth and avoids redundancy in the dialect.

The proposal in this patch is to make OpenMP clauses independent top level definitions which can then be passed in a template argument list to OpenMP operation definitions, just as it's done for traits.

Clauses can define these properties, which are joined together in order to make a default initialization for the fields of the same name of the OpenMP operation:
- `traits`: Optional. It gets added to the list of traits of the operation.
- `arguments`: Mandatory. It defines how the clause is represented.
- `assemblyFormat`: Optional (though it should almost always be defined). This is the declarative definition of the printer/parser for the `arguments`. How these are combined depends on whether this is an optional or required clause.
- `description`: Optional. It's used to populate a `clausesDescription` field, so each operation definition must still define a `description` itself. That field is intended to be appended to the end of the `OpenMP_Op`'s `description`.
- `extraClassDeclaration`: Optional. It can define some C++ code to be added to every OpenMP operation that includes that clause.

To give operation definitions fine-grained control over which features of a given clause are inherited, the `OpenMP_Clause` class takes "skipTraits", "skipArguments", "skipAssemblyFormat", "skipDescription" and "skipExtraClassDeclaration" bit template arguments, each of which inhibits the corresponding feature. These are intended to be used very sparingly, for cases where some of the clauses would otherwise collide in some way.
---
 .../mlir/Dialect/OpenMP/OpenMPOpBase.td | 165 +-
 1 file changed, 163 insertions(+), 2 deletions(-)

diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td
index b98d87aa74a6f..d93abd63977ef 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpBase.td
@@ -42,7 +42,168 @@ def OpenMP_MapBoundsType : OpenMP_Type<"MapBounds", "map_bounds_ty"> {
 // Base classes
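The joining of clause fields described in the PR summary above can be pictured with a toy C++ model. This is only a sketch of the idea, not the TableGen implementation from the patch: `ClauseModel`, `OpModel` and `buildOp` are hypothetical names invented here, only three of the five fields are modeled, and the way `assemblyFormat` is combined for required versus optional clauses is deliberately left out because the excerpt does not show it.

```cpp
#include <string>
#include <vector>

// Toy stand-in for an `OpenMP_Clause` record. Field names follow the PR
// description; the C++ types and the struct itself are assumptions.
struct ClauseModel {
  std::vector<std::string> traits;
  std::vector<std::string> arguments;
  std::string description;
  // "skip" bits: when set, the operation does not inherit that field.
  bool skipTraits = false;
  bool skipArguments = false;
  bool skipDescription = false;
};

// Toy stand-in for the fields an `OpenMP_Op` is default-initialized with.
struct OpModel {
  std::vector<std::string> traits;
  std::vector<std::string> arguments;
  std::string clausesDescription;
};

// Hypothetical helper: joins the fields of all clauses in order, leaving out
// any field whose skip bit is set on that clause.
inline OpModel buildOp(const std::vector<std::string> &opTraits,
                       const std::vector<ClauseModel> &clauses) {
  OpModel op;
  op.traits = opTraits; // traits passed directly to the operation come first
  for (const ClauseModel &c : clauses) {
    if (!c.skipTraits)
      op.traits.insert(op.traits.end(), c.traits.begin(), c.traits.end());
    if (!c.skipArguments)
      op.arguments.insert(op.arguments.end(), c.arguments.begin(),
                          c.arguments.end());
    if (!c.skipDescription)
      op.clausesDescription += c.description;
  }
  return op;
}
```

In this model, an operation that lists an `aligned`-like clause and a `collapse`-like clause with `skipTraits` set would receive the arguments of both clauses but only the traits of the first, which is the kind of selective inheritance the skip bits are meant to allow.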
[llvm-branch-commits] [llvm] release/18.x: [RISCV] Re-separate unaligned scalar and vector memory features in the backend. (PR #92143)
https://github.com/lukel97 approved this pull request. Chiming in that this seems reasonable to me, given the performance impact of not having unaligned scalar accesses. And hopefully we can remove this once we're settled on a proper interface. https://github.com/llvm/llvm-project/pull/92143
[llvm-branch-commits] [clang] [serialization] No transitive type change (PR #92511)
llvmbot wrote:

@llvm/pr-subscribers-clang-modules

Author: Chuanqi Xu (ChuanqiXu9)

Changes

Follow-up to https://github.com/llvm/llvm-project/pull/92085.

motivation

The motivation is still to cut off unnecessary changes in the dependency chain. See the link above (and, recursively, the PRs it links to) for details.

This will be the last patch of the `no-transitive-*-change` series. Any following patches are likely to be specific to C++20 named modules, handling special language rules such as `ADL` (see the reply in https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755/53 for an example), so they won't affect the whole serialization part the way the patches in this series did.

example

After this patch, we are finally able to cut off unnecessary changes of types. For example,

```
//--- m-partA.cppm
export module m:partA;

//--- m-partA.v1.cppm
export module m:partA;

namespace NS {
class A {
public:
  int getValue() { return 43; }
};
}

//--- m-partB.cppm
export module m:partB;

export inline int getB() { return 430; }

//--- m.cppm
export module m;
export import :partA;
export import :partB;

//--- useBOnly.cppm
export module useBOnly;
import m;

export inline int get() { return getB(); }
```

The BMI of `useBOnly.cppm` is expected not to change if we only add a new class in `m:partA`. This will be pretty useful in practice.

implementation details

The key idea of this patch is similar to that of the previous patches: extend the 32-bit type ID to 64 bits so that we can store the module file index in the higher bits. The encoding of the type ID then no longer depends on the imported modules.

But there are two differences from the previous patches:
- TypeID is not purely an index of serialized types. The lower 3 bits are used to store the qualifiers.
- TypeID does not take part in any lookup process, so TypeID is used in far fewer places than the IDs changed by the previous patches.

The first difference means we need some slightly more complex bit operations.
And the second difference makes the patch much simpler than the previous ones. --- Patch is 28.70 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/92511.diff 11 Files Affected: - (modified) clang/include/clang/Serialization/ASTBitCodes.h (+24-8) - (modified) clang/include/clang/Serialization/ASTReader.h (+11-12) - (modified) clang/include/clang/Serialization/ASTRecordReader.h (+1-1) - (modified) clang/include/clang/Serialization/ModuleFile.h (-3) - (modified) clang/lib/Serialization/ASTReader.cpp (+55-49) - (modified) clang/lib/Serialization/ASTWriter.cpp (+18-13) - (modified) clang/lib/Serialization/ModuleFile.cpp (-1) - (modified) clang/test/Modules/no-transitive-decls-change.cppm (+1-11) - (modified) clang/test/Modules/no-transitive-identifier-change.cppm (-3) - (added) clang/test/Modules/no-transitive-type-change.cppm (+68) - (modified) clang/test/Modules/pr5.cppm (+18-18) ``diff diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h index 1fd482b5aff0e..486d5f4042c61 100644 --- a/clang/include/clang/Serialization/ASTBitCodes.h +++ b/clang/include/clang/Serialization/ASTBitCodes.h @@ -26,6 +26,7 @@ #include "clang/Serialization/SourceLocationEncoding.h" #include "llvm/ADT/DenseMapInfo.h" #include "llvm/Bitstream/BitCodes.h" +#include "llvm/Support/MathExtras.h" #include #include @@ -70,38 +71,53 @@ using DeclID = DeclIDBase::DeclID; /// An ID number that refers to a type in an AST file. /// -/// The ID of a type is partitioned into two parts: the lower +/// The ID of a type is partitioned into three parts: +/// - the lower /// three bits are used to store the const/volatile/restrict -/// qualifiers (as with QualType) and the upper bits provide a -/// type index. The type index values are partitioned into two +/// qualifiers (as with QualType). +/// - the upper 29 bits provide a type index in the corresponding +/// module file. 
+/// - the upper 32 bits provide a module file index. +/// +/// The type index values are partitioned into two /// sets. The values below NUM_PREDEF_TYPE_IDs are predefined type /// IDs (based on the PREDEF_TYPE_*_ID constants), with 0 as a /// placeholder for "no type". Values from NUM_PREDEF_TYPE_IDs are /// other types that have serialized representations. -using TypeID = uint32_t; +using TypeID = uint64_t; /// A type index; the type ID with the qualifier bits removed. +/// Keep structure alignment 32-bit since the blob is assumed as 32-bit +/// aligned. class TypeIdx { + uint32_t ModuleFileIndex = 0; uint32_t Idx = 0; public: TypeIdx() = default; - explicit TypeIdx(uint32_t index) : Idx(index) {} + explicit TypeIdx(uint32_t Idx) : ModuleFileIndex(0), Idx(Idx) {} + + explicit TypeIdx(uint32_t ModuleFileIdx, uint32_t Idx) + : ModuleFileIndex(ModuleFileIdx), Idx(Idx) {} + + uint32_t
[llvm-branch-commits] [clang] [serialization] No transitive type change (PR #92511)
https://github.com/ChuanqiXu9 created https://github.com/llvm/llvm-project/pull/92511

Follow-up to https://github.com/llvm/llvm-project/pull/92085.

motivation

The motivation is still to cut off unnecessary changes in the dependency chain. See the link above (and, recursively, the PRs it links to) for details.

This will be the last patch of the `no-transitive-*-change` series. Any following patches are likely to be specific to C++20 named modules, handling special language rules such as `ADL` (see the reply in https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755/53 for an example), so they won't affect the whole serialization part the way the patches in this series did.

example

After this patch, we are finally able to cut off unnecessary changes of types. For example,

```
//--- m-partA.cppm
export module m:partA;

//--- m-partA.v1.cppm
export module m:partA;

namespace NS {
class A {
public:
  int getValue() { return 43; }
};
}

//--- m-partB.cppm
export module m:partB;

export inline int getB() { return 430; }

//--- m.cppm
export module m;
export import :partA;
export import :partB;

//--- useBOnly.cppm
export module useBOnly;
import m;

export inline int get() { return getB(); }
```

The BMI of `useBOnly.cppm` is expected not to change if we only add a new class in `m:partA`. This will be pretty useful in practice.

implementation details

The key idea of this patch is similar to that of the previous patches: extend the 32-bit type ID to 64 bits so that we can store the module file index in the higher bits. The encoding of the type ID then no longer depends on the imported modules.

But there are two differences from the previous patches:
- TypeID is not purely an index of serialized types. The lower 3 bits are used to store the qualifiers.
- TypeID does not take part in any lookup process, so TypeID is used in far fewer places than the IDs changed by the previous patches.

The first difference means we need some slightly more complex bit operations.
And the second difference makes the patch much simpler than the previous ones. >From 2265f12343f929cc81f2b4fe6d27cc4ff3f31ec2 Mon Sep 17 00:00:00 2001 From: Chuanqi Xu Date: Fri, 17 May 2024 14:25:53 +0800 Subject: [PATCH] [serialization] No transitive type change --- .../include/clang/Serialization/ASTBitCodes.h | 32 -- clang/include/clang/Serialization/ASTReader.h | 23 ++-- .../clang/Serialization/ASTRecordReader.h | 2 +- .../include/clang/Serialization/ModuleFile.h | 3 - clang/lib/Serialization/ASTReader.cpp | 104 +- clang/lib/Serialization/ASTWriter.cpp | 31 +++--- clang/lib/Serialization/ModuleFile.cpp| 1 - .../Modules/no-transitive-decls-change.cppm | 12 +- .../no-transitive-identifier-change.cppm | 3 - .../Modules/no-transitive-type-change.cppm| 68 clang/test/Modules/pr5.cppm | 36 +++--- 11 files changed, 196 insertions(+), 119 deletions(-) create mode 100644 clang/test/Modules/no-transitive-type-change.cppm diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h index 1fd482b5aff0e..486d5f4042c61 100644 --- a/clang/include/clang/Serialization/ASTBitCodes.h +++ b/clang/include/clang/Serialization/ASTBitCodes.h @@ -26,6 +26,7 @@ #include "clang/Serialization/SourceLocationEncoding.h" #include "llvm/ADT/DenseMapInfo.h" #include "llvm/Bitstream/BitCodes.h" +#include "llvm/Support/MathExtras.h" #include #include @@ -70,38 +71,53 @@ using DeclID = DeclIDBase::DeclID; /// An ID number that refers to a type in an AST file. /// -/// The ID of a type is partitioned into two parts: the lower +/// The ID of a type is partitioned into three parts: +/// - the lower /// three bits are used to store the const/volatile/restrict -/// qualifiers (as with QualType) and the upper bits provide a -/// type index. The type index values are partitioned into two +/// qualifiers (as with QualType). +/// - the upper 29 bits provide a type index in the corresponding +/// module file. 
+/// - the upper 32 bits provide a module file index. +/// +/// The type index values are partitioned into two /// sets. The values below NUM_PREDEF_TYPE_IDs are predefined type /// IDs (based on the PREDEF_TYPE_*_ID constants), with 0 as a /// placeholder for "no type". Values from NUM_PREDEF_TYPE_IDs are /// other types that have serialized representations. -using TypeID = uint32_t; +using TypeID = uint64_t; /// A type index; the type ID with the qualifier bits removed. +/// Keep structure alignment 32-bit since the blob is assumed as 32-bit +/// aligned. class TypeIdx { + uint32_t ModuleFileIndex = 0; uint32_t Idx = 0; public: TypeIdx() = default; - explicit TypeIdx(uint32_t index) : Idx(index) {} + explicit TypeIdx(uint32_t Idx) : ModuleFileIndex(0), Idx(Idx) {} + + explicit TypeIdx(uint32_t ModuleFileIdx, uint32_t Idx) +
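The three-part layout described in the updated `TypeID` comment (3 qualifier bits, a 29-bit type index local to a module file, and a 32-bit module file index) can be sketched in plain C++ as follows. This is an illustration written from that comment alone, not code from the patch: the helper names (`makeTypeID`, `moduleFileIndexOf`, `localIndexOf`, `qualifiersOf`) and the constants are hypothetical, and the exact bit positions are an assumption based on the comment's wording.

```cpp
#include <cassert>
#include <cstdint>

using TypeID = std::uint64_t;

// Assumed layout, low to high:
//   bits  0..2   const/volatile/restrict qualifiers
//   bits  3..31  type index within the owning module file
//   bits 32..63  module file index
constexpr unsigned QualifierBits = 3;
constexpr unsigned LocalIndexBits = 29;
constexpr std::uint64_t QualifierMask = (std::uint64_t{1} << QualifierBits) - 1;
constexpr std::uint64_t LocalIndexMask =
    (std::uint64_t{1} << LocalIndexBits) - 1;

// Hypothetical helper: packs the three parts into one 64-bit ID.
inline TypeID makeTypeID(std::uint32_t moduleFileIndex,
                         std::uint32_t localIndex, unsigned qualifiers) {
  assert(localIndex <= LocalIndexMask && "local index must fit in 29 bits");
  assert(qualifiers <= QualifierMask && "only 3 qualifier bits available");
  return (std::uint64_t{moduleFileIndex} << 32) |
         (std::uint64_t{localIndex} << QualifierBits) | qualifiers;
}

// Hypothetical accessors for the three parts.
inline std::uint32_t moduleFileIndexOf(TypeID id) {
  return static_cast<std::uint32_t>(id >> 32);
}
inline std::uint32_t localIndexOf(TypeID id) {
  return static_cast<std::uint32_t>((id >> QualifierBits) & LocalIndexMask);
}
inline unsigned qualifiersOf(TypeID id) {
  return static_cast<unsigned>(id & QualifierMask);
}
```

Under this layout the lower 32 bits of an ID depend only on the type's position inside its own module file, so importing additional modules would change at most the upper half, which is the independence property the description is after.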
[llvm-branch-commits] [llvm] [AArch64][PAC] Fix creating check instructions for BBs without an epilog (PR #92508)
llvmbot wrote: @llvm/pr-subscribers-backend-aarch64 Author: Igor Kudrin (igorkudrin) Changes `AArch64PAuth::checkAuthenticatedRegister()` splits the basic block containing the tail call instruction to add check instructions, assuming at least one more instruction before the call. This assumption is incorrect in cases where some execution paths lead to the termination block without creating the stack frame. This patch rearranges the creation of the checks so that the prior splitting is not required. --- Full diff: https://github.com/llvm/llvm-project/pull/92508.diff 2 Files Affected: - (modified) llvm/lib/Target/AArch64/AArch64PointerAuth.cpp (+7-16) - (modified) llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll (+32) ``diff diff --git a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp index 90bf089dbebf7..60d3d533d9c10 100644 --- a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp +++ b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp @@ -257,21 +257,12 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister( // Control flow has to be changed, so arrange new MBBs. - // At now, at least an AUT* instruction is expected before MBBI - assert(MBBI != MBB.begin() && - "Cannot insert the check at the very beginning of MBB"); - // The block to insert check into. - MachineBasicBlock *CheckBlock = - // The remaining part of the original MBB that is executed on success. - MachineBasicBlock *SuccessBlock = MBB.splitAt(*std::prev(MBBI)); - // The block that explicitly generates a break-point exception on failure. 
MachineBasicBlock *BreakBlock = MF.CreateMachineBasicBlock(MBB.getBasicBlock()); MF.push_back(BreakBlock); - MBB.splitSuccessor(SuccessBlock, BreakBlock); + MBB.addSuccessor(BreakBlock); - assert(CheckBlock->getFallThrough() == SuccessBlock); BuildMI(BreakBlock, DL, TII->get(AArch64::BRK)).addImm(BrkImm); switch (Method) { @@ -279,11 +270,11 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister( case AuthCheckMethod::DummyLoad: llvm_unreachable("Should be handled above"); case AuthCheckMethod::HighBitsNoTBI: -BuildMI(CheckBlock, DL, TII->get(AArch64::EORXrs), TmpReg) +BuildMI(MBB, MBBI, DL, TII->get(AArch64::EORXrs), TmpReg) .addReg(AuthenticatedReg) .addReg(AuthenticatedReg) .addImm(1); -BuildMI(CheckBlock, DL, TII->get(AArch64::TBNZX)) +BuildMI(MBB, MBBI, DL, TII->get(AArch64::TBNZX)) .addReg(TmpReg) .addImm(62) .addMBB(BreakBlock); @@ -292,16 +283,16 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister( assert(AuthenticatedReg == AArch64::LR && "XPACHint mode is only compatible with checking the LR register"); assert(UseIKey && "XPACHint mode is only compatible with I-keys"); -BuildMI(CheckBlock, DL, TII->get(AArch64::ORRXrs), TmpReg) +BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrs), TmpReg) .addReg(AArch64::XZR) .addReg(AArch64::LR) .addImm(0); -BuildMI(CheckBlock, DL, TII->get(AArch64::XPACLRI)); -BuildMI(CheckBlock, DL, TII->get(AArch64::SUBSXrs), AArch64::XZR) +BuildMI(MBB, MBBI, DL, TII->get(AArch64::XPACLRI)); +BuildMI(MBB, MBBI, DL, TII->get(AArch64::SUBSXrs), AArch64::XZR) .addReg(TmpReg) .addReg(AArch64::LR) .addImm(0); -BuildMI(CheckBlock, DL, TII->get(AArch64::Bcc)) +BuildMI(MBB, MBBI, DL, TII->get(AArch64::Bcc)) .addImm(AArch64CC::NE) .addMBB(BreakBlock); return; diff --git a/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll b/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll index cf033cb8208cc..0cc707298e458 100644 --- a/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll +++ 
b/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll @@ -129,4 +129,36 @@ define i32 @tailcall_ib_key() "sign-return-address"="all" "sign-return-address-k ret i32 %call } +define i32 @tailcall_two_branches(i1 %0) "sign-return-address"="all" { +; COMMON-LABEL:tailcall_two_branches: +; COMMON:tbz w0, #0, .[[ELSE:LBB[_0-9]+]] +; COMMON:str x30, [sp, #-16]! +; COMMON:bl callee2 +; COMMON:ldr x30, [sp], #16 +; COMMON-NEXT: [[AUTIASP]] +; COMMON-NEXT: .[[ELSE]]: + +; LDR-NEXT: ldr w16, [x30] +; +; BITS-NOTBI-NEXT: eor x16, x30, x30, lsl #1 +; BITS-NOTBI-NEXT: tbnz x16, #62, .[[FAIL:LBB[_0-9]+]] +; +; XPAC-NEXT: mov x16, x30 +; XPAC-NEXT: [[XPACLRI]] +; XPAC-NEXT: cmp x16, x30 +; XPAC-NEXT: b.ne .[[FAIL:LBB[_0-9]+]] +; +; COMMON-NEXT: b callee +; BRK-NEXT:.[[FAIL]]: +; BRK-NEXT: brk #0xc470 + br i1 %0, label %2, label %3 +2: + call void @callee2() + br label %3 +3: + %call = tail call i32 @callee() + ret i32 %call +} + declare i32 @callee() +declare void @callee2() ``
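For readers following the `HighBitsNoTBI` path, the two instructions the test expects (`eor x16, x30, x30, lsl #1` followed by `tbnz x16, #62, <fail>`) amount to a comparison of bits 62 and 61 of the authenticated register. The C++ function below emulates just that arithmetic; `highBitsCheckFails` is a name made up for this sketch, and the rationale in the comments (that a failed authentication leaves those two bits different while a valid address keeps them equal) is stated as an assumption rather than taken from the patch.

```cpp
#include <cstdint>

// Emulates the HighBitsNoTBI check sequence from the diff:
//   eor  tmp, reg, reg, lsl #1   ; tmp bit N = reg bit N XOR reg bit N-1
//   tbnz tmp, #62, BreakBlock    ; branch if bit 62 of tmp is set
// Returns true when the branch to the break block would be taken, i.e. when
// bits 62 and 61 of `reg` differ. Assumption: a successfully authenticated
// address has equal values in those bits, and a failed one does not.
inline bool highBitsCheckFails(std::uint64_t reg) {
  const std::uint64_t tmp = reg ^ (reg << 1);
  return ((tmp >> 62) & 1) != 0;
}
```

A value such as `0x00007fff12345678` passes this check, while a value with only bit 61 or only bit 62 set is reported as failing.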
[llvm-branch-commits] [llvm] [AArch64][PAC] Fix creating check instructions for BBs without an epilog (PR #92508)
https://github.com/igorkudrin created https://github.com/llvm/llvm-project/pull/92508 `AArch64PAuth::checkAuthenticatedRegister()` splits the basic block containing the tail call instruction to add check instructions, assuming at least one more instruction before the call. This assumption is incorrect in cases where some execution paths lead to the termination block without creating the stack frame. This patch rearranges the creation of the checks so that the prior splitting is not required. >From a3039508f7bf9eeacbb4739460468cb3e71ba133 Mon Sep 17 00:00:00 2001 From: Igor Kudrin Date: Thu, 16 May 2024 22:26:32 -0700 Subject: [PATCH 1/2] test --- .../AArch64/sign-return-address-tailcall.ll | 32 +++ 1 file changed, 32 insertions(+) diff --git a/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll b/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll index cf033cb8208cc..0cc707298e458 100644 --- a/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll +++ b/llvm/test/CodeGen/AArch64/sign-return-address-tailcall.ll @@ -129,4 +129,36 @@ define i32 @tailcall_ib_key() "sign-return-address"="all" "sign-return-address-k ret i32 %call } +define i32 @tailcall_two_branches(i1 %0) "sign-return-address"="all" { +; COMMON-LABEL:tailcall_two_branches: +; COMMON:tbz w0, #0, .[[ELSE:LBB[_0-9]+]] +; COMMON:str x30, [sp, #-16]! 
+; COMMON:bl callee2 +; COMMON:ldr x30, [sp], #16 +; COMMON-NEXT: [[AUTIASP]] +; COMMON-NEXT: .[[ELSE]]: + +; LDR-NEXT: ldr w16, [x30] +; +; BITS-NOTBI-NEXT: eor x16, x30, x30, lsl #1 +; BITS-NOTBI-NEXT: tbnz x16, #62, .[[FAIL:LBB[_0-9]+]] +; +; XPAC-NEXT: mov x16, x30 +; XPAC-NEXT: [[XPACLRI]] +; XPAC-NEXT: cmp x16, x30 +; XPAC-NEXT: b.ne .[[FAIL:LBB[_0-9]+]] +; +; COMMON-NEXT: b callee +; BRK-NEXT:.[[FAIL]]: +; BRK-NEXT: brk #0xc470 + br i1 %0, label %2, label %3 +2: + call void @callee2() + br label %3 +3: + %call = tail call i32 @callee() + ret i32 %call +} + declare i32 @callee() +declare void @callee2() >From 2641fe82837455b422d6c8229cc2f3d3736de4da Mon Sep 17 00:00:00 2001 From: Igor Kudrin Date: Thu, 16 May 2024 22:26:40 -0700 Subject: [PATCH 2/2] [AArch64][PAC] Fix creating check instructions for BBs without an epilog `AArch64PAuth::checkAuthenticatedRegister()` splits the basic block containing the tail call instruction to add check instructions, assuming at least one more instruction before the call. This assumption is incorrect in cases where some execution paths lead to the termination block without creating the stack frame. This patch rearranges the creation of the checks so that the prior splitting is not required. --- .../lib/Target/AArch64/AArch64PointerAuth.cpp | 23 ++- 1 file changed, 7 insertions(+), 16 deletions(-) diff --git a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp index 90bf089dbebf7..60d3d533d9c10 100644 --- a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp +++ b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp @@ -257,21 +257,12 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister( // Control flow has to be changed, so arrange new MBBs. - // At now, at least an AUT* instruction is expected before MBBI - assert(MBBI != MBB.begin() && - "Cannot insert the check at the very beginning of MBB"); - // The block to insert check into. 
- MachineBasicBlock *CheckBlock = - // The remaining part of the original MBB that is executed on success. - MachineBasicBlock *SuccessBlock = MBB.splitAt(*std::prev(MBBI)); - // The block that explicitly generates a break-point exception on failure. MachineBasicBlock *BreakBlock = MF.CreateMachineBasicBlock(MBB.getBasicBlock()); MF.push_back(BreakBlock); - MBB.splitSuccessor(SuccessBlock, BreakBlock); + MBB.addSuccessor(BreakBlock); - assert(CheckBlock->getFallThrough() == SuccessBlock); BuildMI(BreakBlock, DL, TII->get(AArch64::BRK)).addImm(BrkImm); switch (Method) { @@ -279,11 +270,11 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister( case AuthCheckMethod::DummyLoad: llvm_unreachable("Should be handled above"); case AuthCheckMethod::HighBitsNoTBI: -BuildMI(CheckBlock, DL, TII->get(AArch64::EORXrs), TmpReg) +BuildMI(MBB, MBBI, DL, TII->get(AArch64::EORXrs), TmpReg) .addReg(AuthenticatedReg) .addReg(AuthenticatedReg) .addImm(1); -BuildMI(CheckBlock, DL, TII->get(AArch64::TBNZX)) +BuildMI(MBB, MBBI, DL, TII->get(AArch64::TBNZX)) .addReg(TmpReg) .addImm(62) .addMBB(BreakBlock); @@ -292,16 +283,16 @@ void llvm::AArch64PAuth::checkAuthenticatedRegister( assert(AuthenticatedReg == AArch64::LR && "XPACHint mode is only compatible with checking the LR register"); assert(UseIKey && "XPACHint mode is only compatible with I-keys"); -