[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-06-19 Thread via cfe-commits
macurtis-amd wrote: > To make the question more precise, would it make sense to change AMDGPUs > unaligned access hook such that https://godbolt.org/z/Gv1j4vjqE will look the > same as on X86? > > That should also fix the motivating problem here. I'll investigate this as an alternative. Thank

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-06-18 Thread Nikita Popov via cfe-commits
nikic wrote: To make the question more precise, would it make sense to change AMDGPUs unaligned access hook such that https://godbolt.org/z/Gv1j4vjqE will look the same as on X86? That should also fix the motivating problem here. https://github.com/llvm/llvm-project/pull/133301 __

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-06-17 Thread Matt Arsenault via cfe-commits
arsenm wrote: > and that's why you want to have single element loads that get inserted into a > vector and then perform sub-vector extracts on it? We care about eliminate stack usage at any cost. Big load are always better, even if not naturally aligned. When we do care about alignment, we onl

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-06-17 Thread Matt Arsenault via cfe-commits
arsenm wrote: > We do support unaligned access IIUC, and it is actually required by HSA ABI. > CC @arsenm Unaligned access has been supported since almost always. There's just a mode control for it, you can disable it in the driver though https://github.com/llvm/llvm-project/pull/133301 __

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-06-17 Thread Shilei Tian via cfe-commits
shiltian wrote: > I assume AMDGPU does not support unaligned loads We do support unaligned access IIUC, and it is actually required by HSA ABI. CC @arsenm https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@list

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-06-17 Thread Nikita Popov via cfe-commits
nikic wrote: On x86, what we actually end up doing is to combine those to unaligned i64 loads (see https://godbolt.org/z/P5z674x4r), which is probably the best outcome if they are supported. I assume AMDGPU does not support unaligned loads, and that's why you want to have single element loads

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-06-17 Thread via cfe-commits
https://github.com/ronlieb approved this pull request. LGTM, probably needs a rebase https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-06-16 Thread via cfe-commits
macurtis-amd wrote: @nikic ping https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-06-02 Thread via cfe-commits
macurtis-amd wrote: @nikic ping or if anyone else (@davemgreen @kazutakahirata?) has time, a review would be much appreciated https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.o

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-05-08 Thread via cfe-commits
macurtis-amd wrote: @nikic ping ping ping https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-29 Thread via cfe-commits
macurtis-amd wrote: @nikic ping ping Are you okay with the latest revision? (Thanks for reviewing this. Apologies if I'm being annoying). https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://l

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-24 Thread via cfe-commits
macurtis-amd wrote: @nikic ping https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-21 Thread via cfe-commits
macurtis-amd wrote: @nikic Latest revision only promotes memsets where the region being set is within an existing vector in the underlying alloca (as determined by getTypePartition). It is much more constrained as you can see by the much smaller number of tests that are now affected, while st

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-14 Thread via cfe-commits
macurtis-amd wrote: I should preface this by mentioning that I'm not all that familiar with SROA, so thank you for your patience. > So if I understand correctly, you are marking memsets as unsplittable, > lowering them to vector zero and smaller accesses to inserts/extracts. > Yes. As a naive

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-14 Thread Nikita Popov via cfe-commits
nikic wrote: > > I don't think your general approach here is going to work. We need to be > > careful about introducing vector operations out of thin air, because LLVM > > is not going to second guess them. If you convert a memset to <32768 x i8> > > ops here, LLVM is going to carry those all

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-14 Thread Nikita Popov via cfe-commits
@@ -1170,10 +1204,17 @@ class AllocaSlices::SliceBuilder : public PtrUseVisitor { if (!IsOffsetKnown) return PI.setAborted(&II); +bool Splittable; + +if (getVectorTypeFor(II, DL)) + Splittable = isSplittableMemOp(AS.AI.getAllocatedType(), II.isVolatile

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-14 Thread Nikita Popov via cfe-commits
https://github.com/nikic requested changes to this pull request. So if I understand correctly, you are marking memsets as unsplittable, lowering them to vector zero and smaller accesses to inserts/extracts. I don't think your general approach here is going to work. We need to be careful about

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-14 Thread Nikita Popov via cfe-commits
https://github.com/nikic edited https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-14 Thread via cfe-commits
macurtis-amd wrote: @nikic ping https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-05 Thread via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-05 Thread via cfe-commits
https://github.com/macurtis-amd updated https://github.com/llvm/llvm-project/pull/133301 >From 30b4ca3cc90ab043db7abb336d028e8bb8b71dcf Mon Sep 17 00:00:00 2001 From: Matthew Curtis Date: Thu, 27 Mar 2025 14:04:40 -0500 Subject: [PATCH 1/4] [SROA] Vector promote some memsets --- clang/test/Co

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-04 Thread via cfe-commits
macurtis-amd wrote: > The tests also don't make any sense to me, seeing as they compile down to a > constant, unused insertelement. @nikic I cleaned up the test function a bit, again starting from the original example. They exercise the new code paths, though they still result in an unused in

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-04 Thread via cfe-commits
https://github.com/macurtis-amd updated https://github.com/llvm/llvm-project/pull/133301 >From 51dd5f27b9a09851c746466f02821087305ad93c Mon Sep 17 00:00:00 2001 From: Matthew Curtis Date: Thu, 27 Mar 2025 14:04:40 -0500 Subject: [PATCH 1/5] [SROA] Vector promote some memsets --- clang/test/Co

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-03 Thread via cfe-commits
macurtis-amd wrote: Thanks Nikita for taking a look at this. > Missing PR description. I've updated the PR description. > The tests also don't make any sense to me, seeing as they compile down to a > constant, unused insertelement. Yes. Tests are very synthetic. Reduced from the already red

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-03 Thread via cfe-commits
https://github.com/macurtis-amd edited https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-03 Thread Nikita Popov via cfe-commits
https://github.com/nikic requested changes to this pull request. Missing PR description. The tests also don't make any sense to me, seeing as they compile down to a constant, unused insertelement. https://github.com/llvm/llvm-project/pull/133301 ___ c

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-04-03 Thread via cfe-commits
macurtis-amd wrote: @arsenm Are you okay with the latest revision? https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-31 Thread Matt Arsenault via cfe-commits
arsenm wrote: > @arsenm Any recommendations for appeasing the [undef > deprecator](https://github.com/llvm/llvm-project/pull/133301#issuecomment-2759145421)? I don't think you did anything other than update existing tests, I would ignore it for the purposes of this change https://github.com/

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-31 Thread via cfe-commits
macurtis-amd wrote: @arsenm Any recommendations for appeasing the [undef deprecator](https://github.com/llvm/llvm-project/pull/133301#issuecomment-2759145421)? https://github.com/llvm/llvm-project/pull/133301 ___ cfe-commits mailing list cfe-commits@l

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-31 Thread via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-31 Thread via cfe-commits
@@ -0,0 +1,124 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -passes='sroa' -S | FileCheck %s +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n8:16:3

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-31 Thread via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-31 Thread via cfe-commits
@@ -0,0 +1,124 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -passes='sroa' -S | FileCheck %s +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n8:16:3

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-31 Thread via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,124 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -passes='sroa' -S | FileCheck %s +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n8:16:3

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,124 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -passes='sroa' -S | FileCheck %s +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n8:16:3

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-29 Thread via cfe-commits
@@ -1170,10 +1191,23 @@ class AllocaSlices::SliceBuilder : public PtrUseVisitor { if (!IsOffsetKnown) return PI.setAborted(&II); +auto IsSplittable = [&]() { macurtis-amd wrote: Cleaned up code removes the helper altogether. Hopefully better no

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-29 Thread via cfe-commits
macurtis-amd wrote: > Missing new tests? I'd expect to see a few new targeted tests stressing > different vector sizes and alignments, and not just updates of existing tests Added a new test. Thanks for the review! https://github.com/llvm/llvm-project/pull/133301 _

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-29 Thread via cfe-commits
@@ -2316,12 +2362,15 @@ static VectorType *isVectorPromotionViable(Partition &P, const DataLayout &DL) { // Put load and store types into a set for de-duplication. for (const Slice &S : P) { -Type *Ty; +Type *Ty = nullptr; if (auto *LI = dyn_cast(S.getUse()->

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-29 Thread via cfe-commits
@@ -1170,10 +1191,23 @@ class AllocaSlices::SliceBuilder : public PtrUseVisitor { if (!IsOffsetKnown) return PI.setAborted(&II); +auto IsSplittable = [&]() { + FixedVectorType *VTy = getVectorTypeFor(II, DL); + Type *ATy = AS.AI.getAllocatedType(); + +

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-29 Thread via cfe-commits
@@ -1011,6 +1011,26 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-29 Thread via cfe-commits
https://github.com/macurtis-amd updated https://github.com/llvm/llvm-project/pull/133301 >From c0525fd7bd2c740b5b969e8e2913a878792a377c Mon Sep 17 00:00:00 2001 From: Matthew Curtis Date: Thu, 27 Mar 2025 14:04:40 -0500 Subject: [PATCH 1/3] [SROA] Vector promote some memsets --- clang/test/Co

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-29 Thread via cfe-commits
https://github.com/macurtis-amd updated https://github.com/llvm/llvm-project/pull/133301 >From c0525fd7bd2c740b5b969e8e2913a878792a377c Mon Sep 17 00:00:00 2001 From: Matthew Curtis Date: Thu, 27 Mar 2025 14:04:40 -0500 Subject: [PATCH 1/2] [SROA] Vector promote some memsets --- clang/test/Co

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-27 Thread via cfe-commits
llvmbot wrote: @llvm/pr-subscribers-clang Author: None (macurtis-amd) Changes --- Patch is 32.26 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133301.diff 7 Files Affected: - (modified) clang/test/CodeGenOpenCL/amdgpu-nullptr.cl (+4-6) -

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-27 Thread via cfe-commits
https://github.com/macurtis-amd created https://github.com/llvm/llvm-project/pull/133301 None >From 096fde6a6eda7ba8d09444a0558f04c3c4531017 Mon Sep 17 00:00:00 2001 From: Matthew Curtis Date: Thu, 27 Mar 2025 14:04:40 -0500 Subject: [PATCH] [SROA] Vector promote some memsets --- clang/test/

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-27 Thread via cfe-commits
github-actions[bot] wrote: :warning: undef deprecator found issues in your code. :warning: You can test this locally with the following command: ``bash git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 08aedf7201e296af532575685372bb5ff7ed8b01

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-27 Thread via cfe-commits
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu Author: None (macurtis-amd) Changes --- Patch is 32.26 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133301.diff 7 Files Affected: - (modified) clang/test/CodeGenOpenCL/amdgpu-nullptr.cl