[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

2025-08-08 Thread Drew Kersnar via cfe-commits

https://github.com/dakersnar closed 
https://github.com/llvm/llvm-project/pull/145733
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

2025-08-07 Thread Drew Kersnar via cfe-commits

dakersnar wrote:

I'll merge tomorrow morning, unless other reviewers have any hesitations.

https://github.com/llvm/llvm-project/pull/145733
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

2025-08-07 Thread Nikita Popov via cfe-commits

https://github.com/nikic approved this pull request.


https://github.com/llvm/llvm-project/pull/145733
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

2025-08-07 Thread Drew Kersnar via cfe-commits


@@ -58,14 +58,55 @@ bool inferAlignment(Function &F, AssumptionCache &AC, 
DominatorTree &DT) {
   }
 
   // Compute alignment from known bits.
+  auto InferFromKnownBits = [&](Instruction &I, Value *PtrOp) {
+KnownBits Known = computeKnownBits(PtrOp, DL, &AC, &I, &DT);
+unsigned TrailZ =
+std::min(Known.countMinTrailingZeros(), +Value::MaxAlignmentExponent);
+return Align(1ull << std::min(Known.getBitWidth() - 1, TrailZ));
+  };
+
+  // Propagate alignment between loads and stores that originate from the
+  // same base pointer.
+  DenseMap BestBasePointerAligns;
+  auto InferFromBasePointer = [&](Value *PtrOp, Align LoadStoreAlign) {
+APInt OffsetFromBase(DL.getIndexTypeSizeInBits(PtrOp->getType()), 0);
+PtrOp = PtrOp->stripAndAccumulateConstantOffsets(DL, OffsetFromBase, true);
+// Derive the base pointer alignment from the load/store alignment
+// and the offset from the base pointer.
+Align BasePointerAlign =
+commonAlignment(LoadStoreAlign, OffsetFromBase.getLimitedValue());
+
+auto [It, Inserted] =
+BestBasePointerAligns.try_emplace(PtrOp, BasePointerAlign);
+if (!Inserted) {
+  // If the stored base pointer alignment is better than the
+  // base pointer alignment we derived, we may be able to use it
+  // to improve the load/store alignment. If not, store the
+  // improved base pointer alignment for future iterations.
+  if (It->second > BasePointerAlign) {
+Align BetterLoadStoreAlign =
+commonAlignment(It->second, OffsetFromBase.getLimitedValue());
+return BetterLoadStoreAlign;
+  }
+  It->second = BasePointerAlign;
+}
+return LoadStoreAlign;
+  };
+
   for (BasicBlock &BB : F) {
+// We need to reset the map for each block because alignment information

dakersnar wrote:

I updated the comment. Let me know if it looks accurate.

https://github.com/llvm/llvm-project/pull/145733
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

2025-08-07 Thread Drew Kersnar via cfe-commits


@@ -58,14 +58,55 @@ bool inferAlignment(Function &F, AssumptionCache &AC, 
DominatorTree &DT) {
   }
 
   // Compute alignment from known bits.
+  auto InferFromKnownBits = [&](Instruction &I, Value *PtrOp) {
+KnownBits Known = computeKnownBits(PtrOp, DL, &AC, &I, &DT);
+unsigned TrailZ =
+std::min(Known.countMinTrailingZeros(), +Value::MaxAlignmentExponent);
+return Align(1ull << std::min(Known.getBitWidth() - 1, TrailZ));
+  };
+
+  // Propagate alignment between loads and stores that originate from the
+  // same base pointer.
+  DenseMap BestBasePointerAligns;
+  auto InferFromBasePointer = [&](Value *PtrOp, Align LoadStoreAlign) {
+APInt OffsetFromBase(DL.getIndexTypeSizeInBits(PtrOp->getType()), 0);
+PtrOp = PtrOp->stripAndAccumulateConstantOffsets(DL, OffsetFromBase, true);
+// Derive the base pointer alignment from the load/store alignment
+// and the offset from the base pointer.
+Align BasePointerAlign =
+commonAlignment(LoadStoreAlign, OffsetFromBase.getLimitedValue());
+
+auto [It, Inserted] =
+BestBasePointerAligns.try_emplace(PtrOp, BasePointerAlign);
+if (!Inserted) {
+  // If the stored base pointer alignment is better than the
+  // base pointer alignment we derived, we may be able to use it
+  // to improve the load/store alignment. If not, store the
+  // improved base pointer alignment for future iterations.
+  if (It->second > BasePointerAlign) {
+Align BetterLoadStoreAlign =
+commonAlignment(It->second, OffsetFromBase.getLimitedValue());
+return BetterLoadStoreAlign;
+  }
+  It->second = BasePointerAlign;
+}
+return LoadStoreAlign;
+  };
+
   for (BasicBlock &BB : F) {
+// We need to reset the map for each block because alignment information

dakersnar wrote:

Good call out, thank you for pointing that out! This tracks with my 
understanding of why a backwards propagation worked in the LSV but doesn't work 
here: the LSV analyzes within the scope of what it calls a "pseudo basic 
block", which is defined as follows:

```
  /// Runs the vectorizer on a "pseudo basic block", which is a range of
  /// instructions [Begin, End) within one BB all of which have
  /// isGuaranteedToTransferExecutionToSuccessor(I) == true.
```

Anyway, I can adjust the comment to call out your example. And just to confirm, 
the hypothetical dominator tree approach described in my comment would still be 
correct, right? 

https://github.com/llvm/llvm-project/pull/145733
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

2025-08-07 Thread Philip Reames via cfe-commits


@@ -58,14 +58,55 @@ bool inferAlignment(Function &F, AssumptionCache &AC, 
DominatorTree &DT) {
   }
 
   // Compute alignment from known bits.
+  auto InferFromKnownBits = [&](Instruction &I, Value *PtrOp) {
+KnownBits Known = computeKnownBits(PtrOp, DL, &AC, &I, &DT);
+unsigned TrailZ =
+std::min(Known.countMinTrailingZeros(), +Value::MaxAlignmentExponent);
+return Align(1ull << std::min(Known.getBitWidth() - 1, TrailZ));
+  };
+
+  // Propagate alignment between loads and stores that originate from the
+  // same base pointer.
+  DenseMap BestBasePointerAligns;
+  auto InferFromBasePointer = [&](Value *PtrOp, Align LoadStoreAlign) {
+APInt OffsetFromBase(DL.getIndexTypeSizeInBits(PtrOp->getType()), 0);
+PtrOp = PtrOp->stripAndAccumulateConstantOffsets(DL, OffsetFromBase, true);
+// Derive the base pointer alignment from the load/store alignment
+// and the offset from the base pointer.
+Align BasePointerAlign =
+commonAlignment(LoadStoreAlign, OffsetFromBase.getLimitedValue());
+
+auto [It, Inserted] =
+BestBasePointerAligns.try_emplace(PtrOp, BasePointerAlign);
+if (!Inserted) {
+  // If the stored base pointer alignment is better than the
+  // base pointer alignment we derived, we may be able to use it
+  // to improve the load/store alignment. If not, store the
+  // improved base pointer alignment for future iterations.
+  if (It->second > BasePointerAlign) {
+Align BetterLoadStoreAlign =
+commonAlignment(It->second, OffsetFromBase.getLimitedValue());
+return BetterLoadStoreAlign;
+  }
+  It->second = BasePointerAlign;
+}
+return LoadStoreAlign;
+  };
+
   for (BasicBlock &BB : F) {
+// We need to reset the map for each block because alignment information

preames wrote:

FYI, the problem here is worse than described.  The code as written appears to 
be correct, I'm just pointing out a conceptual problem which may need 
reflecting in comments, etc..

Consider this:
```
%v = load i8, ptr %p, align 4
call void @throw_if_unaligned(%p, 16)
store i8 %v, ptr %p, align 16

```
Propagating the alignment forward is sound, but propagating it backwards (over 
the possibly throwing call) is not.



https://github.com/llvm/llvm-project/pull/145733
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

2025-08-06 Thread Nikita Popov via cfe-commits

nikic wrote:

> Is there any merit to keeping the abs for readability, so future maintainers 
> don't need to reason through this? Maybe just a comment?

I think it's ok to drop as long as there's test coverage, but no strong opinion 
either way.

https://github.com/llvm/llvm-project/pull/145733
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

2025-08-06 Thread Drew Kersnar via cfe-commits

dakersnar wrote:

I think you're right. Since the only thing that really matters for 
commonAlignment is the number of trailing zeros of each input, we can 
generalize whether we need the `abs` to the following statement: "will a number 
and it's two's compliment always have the same number of trailing zeros?". And 
the answer is yes, because negating will turn the least significant 1 followed 
by trailing 0s into a 0 followed by trailing 1s, and then adding 1 will revert 
those bits back to the original pattern, leaving the same number of trailing 
zeros.

Is there any merit to keeping the abs for readability, so future maintainers 
don't need to reason through this? Maybe just a comment?

https://github.com/llvm/llvm-project/pull/145733
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

2025-08-06 Thread Nikita Popov via cfe-commits

nikic wrote:

Are you sure the abs is needed? It looks like your tests also passed before 
that change. I think this ends up being correct because an offset like -8 is 
something like ...1000 in binary, which has the same effect for the common 
alignment as using ...1000.

https://github.com/llvm/llvm-project/pull/145733
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InferAlignment] Propagate alignment between loads/stores of the same base pointer (PR #145733)

2025-08-05 Thread Drew Kersnar via cfe-commits

dakersnar wrote:

Good call, they were not being handled correctly. Taking the `abs` of the 
offsets before calling `getLimitedValue` should do the trick as far as I can 
reason.

https://github.com/llvm/llvm-project/pull/145733
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits