[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2025-01-16 Thread Florian Mayer via cfe-commits

fmayer wrote:

> Seems like the root cause was an vectorizer bug, sent 
> https://github.com/llvm/llvm-project/pull/120730 that fixes it.

This is landed, so from that side we could reland this.

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-20 Thread Florian Mayer via cfe-commits

fmayer wrote:

Seems like the root cause was an vectorizer bug, sent 
https://github.com/llvm/llvm-project/pull/120730 that fixes it.

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-19 Thread Nikita Popov via cfe-commits

nikic wrote:

@fmayer The `withoutNoUnsignedSignedWrap()` API already removed the inbounds 
flag as well (because inbounds requires nusw). So I think the effect of your 
change is to drop inbounds in case all indices are negative, which should 
generally not be necessary.

It's pretty likely that the root cause here is indeed incorrect inbounds 
preservation somewhere, but I think the logic in that transform is correct.

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-19 Thread Florian Mayer via cfe-commits

fmayer wrote:

I was working on this miscompile. I cannot give a reproducer right now, but the 
following observations:

The generated IR is clearly incorrect (the `inbounds nuw`), of the form:

```
%x = alloca [12 x i32], align 16
[...]
%59 = getelementptr inbounds nuw i8, ptr %x, i64 -4
%wide.masked.load.2 = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr nonnull 
%59, i32 4, <4 x i1> , <4 x i32> poison)
```

This is some interaction between this CL and these two InstCombines:

* 
https://github.com/llvm/llvm-project/blob/5a3f1acad7e8ce0e8cb90165794dce71f4b80bcd/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp#L2495
* 
https://github.com/llvm/llvm-project/blob/5a3f1acad7e8ce0e8cb90165794dce71f4b80bcd/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp#L2949

If I disable any of the three, the error goes away. The observation is that 
with the two above, we get an `inbounds` getelementptr that is _NOT_ inbounds 
(`-4`), that then gets masked vector read to correct for that.

The comment in the first link implies that it maybe should be removing 
`inbounds`:

```
  // Even if the total offset is inbounds, we may end up representing it
  // by first performing a larger negative offset, and then a smaller
  // positive one. The large negative offset might go out of bounds. Only
  // preserve inbounds if all signs are the same.
```

but it then goes and only removes the wrap flags:

```
  if (Idx.isNonNegative() != ConstIndices[0].isNonNegative())
NW = NW.withoutNoUnsignedSignedWrap();
  if (!Idx.isNonNegative())
NW = NW.withoutNoUnsignedWrap();
```

If I change this to

```
  if (Idx.isNonNegative() != ConstIndices[0].isNonNegative())
NW = NW.withoutNoUnsignedSignedWrap().withoutInBounds();
  if (!Idx.isNonNegative())
NW = NW.withoutNoUnsignedWrap().withoutInBounds();
```

the error also goes away.

If I disable the second link (i8 canonicalization) the access is then `-1` 
(makes sense) but *NOT* marked as `inbounds`.

I don't feel like this is a sanitizer issue, it seems like MSan might just have 
gotten unlucky.

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-18 Thread Alexander Kornienko via cfe-commits

alexfh wrote:

> > Heads up: apart from a number of instances of actual UB in the code (which 
> > are quite tedious to localize and understand due to a lack of specialized 
> > tooling) we're investigating a bad interaction of this commit with msan, 
> > which seems to result in a miscompilation.
> 
> Thanks for the heads up! As I mentioned on another thread, I'm happy to just 
> entirely revert this patch due to lack of good sanitizer support (I can do it 
> tomorrow, but feel free to do it yourself earlier). Having a reproducer for 
> the msan issue is probably still valuable, as it may indicate a larger 
> problem.

Thanks! Sent a PR with a revert: 
https://github.com/llvm/llvm-project/pull/120460

As mentioned, a test case is in the works.

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-18 Thread Nikita Popov via cfe-commits

nikic wrote:

> Heads up: apart from a number of instances of actual UB in the code (which 
> are quite tedious to localize and understand due to a lack of specialized 
> tooling) we're investigating a bad interaction of this commit with msan, 
> which seems to result in a miscompilation.

Thanks for the heads up! As I mentioned on another thread, I'm happy to just 
entirely revert this patch due to lack of good sanitizer support (I can do it 
tomorrow, but feel free to do it yourself earlier). Having a reproducer for the 
msan issue is probably still valuable, as it may indicate a larger problem.

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-18 Thread Alexander Kornienko via cfe-commits

alexfh wrote:

Heads up: apart from a number of instances of actual UB in the code (which are 
quite tedious to localize and understand due to a lack of specialized tooling) 
we're investigating a bad interaction of this commit with msan, which seems to 
result in a miscompilation.

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-17 Thread Vitaly Buka via cfe-commits

vitalybuka wrote:

> I think 
> [llvm/llvm-test-suite#190](https://github.com/llvm/llvm-test-suite/pull/190) 
> should fix this issue, but haven't tested on an affected arch.
> 
> I'm a bit worried that we don't have a sanitizer to catch this issue (the 
> negative left shift issue is an unrelated one).

`-fsanitize=local-bounds` is quite related, but I guess InstCombine happens 
before the check inserted

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-13 Thread via cfe-commits

serge-sans-paille wrote:

Tested on an aarch64 machine where I've been able to reproduce. Works like a 
charm, thanks @nikic 

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-12 Thread Nikita Popov via cfe-commits

nikic wrote:

I think https://github.com/llvm/llvm-test-suite/pull/190 should fix this issue, 
but haven't tested on an affected arch.

I'm a bit worried that we don't have a sanitizer to catch this issue (the 
negative left shift issue is an unrelated one).

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-12 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> huffbench.c:319:10: runtime error: left shift of negative value -93
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior huffbench.c:319:10 

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-12 Thread Nikita Popov via cfe-commits

nikic wrote:

>From a quick look, it's probably due to UB on this line: 
>https://github.com/llvm/llvm-test-suite/blob/4f14adb8a2f2c5fa02c6b3643d45e74adfcb31f9/SingleSource/Benchmarks/CoyoteBench/huffbench.c#L114
> It indexes below the start of an object.

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-12 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> Seems that this change causes Segment Fault on multiple targets including 
> aarch64, loongarch64 and riscv64.
> 
> This is detected by a LoongArch 
> [buildbot](https://lab.llvm.org/staging/#/builders/20/builds/6282) and 
> manually checked on aarch64 and riscv64 QEMUs.
> 
> I wonder why aarch64 buildbot doesn't report this. I find that 
> `CMAKE_CXX_FLAGS_RELEASE` is empty on [aarch64 
> bot](https://lab.llvm.org/buildbot/#/builders/121/builds/614) while it is 
> `-O3 -DNDEBUG` on loongarch64. Is it an issue of the llvm-zorg framework?
> 
> UPDATE: This 
> [buildbot](https://lab.llvm.org/buildbot/#/builders/143)(aarch64) also fail 
> which uses `-O3`.
> 
> # Reproduce
> $ clang llvm-test-suite/SingleSource/Benchmarks/CoyoteBench/huffbench.c -O3 
> -o huffbench $ ./huffbench

Reproduced on Loongson-3A5000. I will provide a reduced test case later.


https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-12 Thread via cfe-commits

serge-sans-paille wrote:

Hey @nikic,
I've bisected the failures from this buildbot:
https://lab.llvm.org/buildbot/#/builders/199/builds/57
down to this commit.

The reproducer is unfortunately not pleasant:
get a stage-2 build of clang then run the llvm-test suite. This reproducibility 
chokes on `SingleSource/Benchmarks/CoyoteBench/huffbench.test`, as can be seen 
on the buildbot log.

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-12 Thread Lu Weining via cfe-commits

SixWeining wrote:

Seems that this change causes Segment Fault on multiple targets including 
aarch64, loongarch64 and riscv64.

This is detected by a LoongArch 
[buildbot](https://lab.llvm.org/staging/#/builders/20/builds/6282) and manually 
checked on aarch64 and riscv64 QEMUs.

I wonder why aarch64 buildbot doesn't report this. I find that 
`CMAKE_CXX_FLAGS_RELEASE` is empty on aarch64 bot while it is `-O3 -DNDEBUG` on 
loongarch64. Is it an issue of the llvm-zorg framework?

# Reproduce
$ clang llvm-test-suite/SingleSource/Benchmarks/CoyoteBench/huffbench.c -O3 -o 
huffbench
$ ./huffbench


https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-10 Thread LLVM Continuous Integration via cfe-commits

llvm-ci wrote:

LLVM Buildbot has detected a new failure on builder `lldb-arm-ubuntu` running 
on `linaro-lldb-arm-ubuntu` while building `clang,llvm` at step 6 "test".

Full details are available at: 
https://lab.llvm.org/buildbot/#/builders/18/builds/8300


Here is the relevant piece of the build log for the reference

```
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: functionalities/dyld-exec-linux/TestDyldExecLinux.py (443 of 
2846)
UNSUPPORTED: lldb-api :: functionalities/fat_archives/TestFatArchives.py (444 
of 2846)
PASS: lldb-api :: functionalities/executable_first/TestExecutableFirst.py (445 
of 2846)
PASS: lldb-api :: functionalities/find-line-entry/TestFindLineEntry.py (446 of 
2846)
UNSUPPORTED: lldb-api :: 
functionalities/fork/concurrent_vfork/TestConcurrentVFork.py (447 of 2846)
PASS: lldb-api :: functionalities/exec/TestExec.py (448 of 2846)
PASS: lldb-api :: functionalities/float-display/TestFloatDisplay.py (449 of 
2846)
PASS: lldb-api :: functionalities/gdb_remote_client/TestAArch64XMLRegOffsets.py 
(450 of 2846)
PASS: lldb-api :: 
functionalities/dynamic_value_child_count/TestDynamicValueChildCount.py (451 of 
2846)
PASS: lldb-api :: 
functionalities/gdb_remote_client/TestArmRegisterDefinition.py (452 of 2846)
FAIL: lldb-api :: functionalities/fork/resumes-child/TestForkResumesChild.py 
(453 of 2846)
 TEST 'lldb-api :: 
functionalities/fork/resumes-child/TestForkResumesChild.py' FAILED 

Script:
--
/usr/bin/python3.10 
/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/dotest.py 
-u CXXFLAGS -u CFLAGS --env 
LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib --env 
LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/include --env 
LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --arch 
armv8l --build-dir 
/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex 
--lldb-module-cache-dir 
/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api
 --clang-module-cache-dir 
/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api
 --executable /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/lldb 
--compiler /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/clang 
--dsymutil /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/dsymutil 
--make /usr/bin/make --llvm-tools-dir 
/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --lldb-obj-root 
/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/tools/lldb --lldb-libs-dir 
/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib 
/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/functionalities/fork/resumes-child
 -p TestForkResumesChild.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 20.0.0git (https://github.com/llvm/llvm-project.git revision 
e21ab4d16b555c28ded307571d138f594f33e325)
  clang revision e21ab4d16b555c28ded307571d138f594f33e325
  llvm revision e21ab4d16b555c28ded307571d138f594f33e325
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 
'debugserver', 'objc']

--
Command Output (stderr):
--
FAIL: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: 
test_step_over_fork (TestForkResumesChild.TestForkResumesChild)
==
FAIL: test_step_over_fork (TestForkResumesChild.TestForkResumesChild)
--
Traceback (most recent call last):
  File 
"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/functionalities/fork/resumes-child/TestForkResumesChild.py",
 line 22, in test_step_over_fork
self.expect("continue", substrs=["exited with status = 0"])
  File 
"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py",
 line 2371, in expect
self.runCmd(
  File 
"/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py",
 line 996, in runCmd
self.assertTrue(self.res.Succeeded(), msg + output)
AssertionError: False is not true : Command 'continue' did not return 
successfully
Error output:
error: Process must be launched.

Config=arm-/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang
--
Ran 1 test in 0.805s

FAILED (failures=1)


```



https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-10 Thread Nikita Popov via cfe-commits

https://github.com/nikic closed https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-09 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/119225
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-09 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)


Changes

When we have a gep inbounds from the base of an object (e.g. alloca or global), 
we know that the index cannot be negative, as this would go out of bounds. As 
such, we can infer nuw as well.

The implementation is a bit stricter than necessary, we could also accept one 
unknown index followed by known-non-negative indices.

Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently 
incorrectly doesn't require the inbounds for the alloca case, see 
https://github.com/AliveToolkit/alive2/issues/1138).

---

Patch is 90.53 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/119225.diff


29 Files Affected:

- (modified) clang/test/CodeGen/attr-counted-by.c (+2-2) 
- (modified) clang/test/CodeGen/union-tbaa1.c (+3-3) 
- (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+20) 
- (modified) llvm/test/Transforms/InstCombine/AMDGPU/memcpy-from-constant.ll 
(+3-3) 
- (modified) llvm/test/Transforms/InstCombine/cast_phi.ll (+2-2) 
- (modified) llvm/test/Transforms/InstCombine/load-cmp.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/memcpy-addrspace.ll (+8-8) 
- (modified) llvm/test/Transforms/InstCombine/memcpy-from-global.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/stpcpy-1.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/stpcpy_chk-1.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/strlen-1.ll (+3-3) 
- (modified) llvm/test/Transforms/InstCombine/strlen-4.ll (+8-8) 
- (modified) llvm/test/Transforms/InstCombine/strncat-2.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/strnlen-3.ll (+9-9) 
- (modified) llvm/test/Transforms/InstCombine/strnlen-4.ll (+2-2) 
- (modified) llvm/test/Transforms/InstCombine/strnlen-5.ll (+2-2) 
- (modified) llvm/test/Transforms/InstCombine/sub-gep.ll (+4-4) 
- (modified) llvm/test/Transforms/InstCombine/wcslen-1.ll (+3-3) 
- (modified) llvm/test/Transforms/InstCombine/wcslen-3.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/wcslen-5.ll (+8-8) 
- (modified) 
llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+4-4) 
- (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt.ll (+3-3) 
- (modified) llvm/test/Transforms/LoopVectorize/X86/small-size.ll (+19-19) 
- (modified) llvm/test/Transforms/LoopVectorize/X86/x86_fp80-vector-store.ll 
(+2-2) 
- (modified) llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll (+3-3) 
- (modified) llvm/test/Transforms/LoopVectorize/multiple-address-spaces.ll 
(+2-2) 
- (modified) llvm/test/Transforms/LoopVectorize/non-const-n.ll (+3-3) 
- (modified) llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll 
(+87-87) 
- (modified) llvm/test/Transforms/SLPVectorizer/X86/operandorder.ll (+8-8) 


``diff
diff --git a/clang/test/CodeGen/attr-counted-by.c 
b/clang/test/CodeGen/attr-counted-by.c
index 6b3cad5708835b..be4c7f07e92150 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -1043,7 +1043,7 @@ int test12_a, test12_b;
 // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, 
ptr nonnull [[BAZ]]) #[[ATTR11:[0-9]+]]
 // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr noundef 
nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 4 
dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct 
[[TBAA_STRUCT7:![0-9]+]]
 // NO-SANITIZE-WITH-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [6 
x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds 
nuw [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
 // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP0:%.*]] = load i32, ptr [[ARRAYIDX]], 
align 4, !tbaa [[TBAA2]]
 // NO-SANITIZE-WITH-ATTR-NEXT:store i32 [[TMP0]], ptr @test12_b, align 4, 
!tbaa [[TBAA2]]
 // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP1:%.*]] = load i32, ptr getelementptr 
inbounds nuw (i8, ptr @test12_foo, i64 4), align 4, !tbaa [[TBAA2]]
@@ -1085,7 +1085,7 @@ int test12_a, test12_b;
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, 
ptr nonnull [[BAZ]]) #[[ATTR9:[0-9]+]]
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr 
noundef nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 
4 dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct 
[[TBAA_STRUCT7:![0-9]+]]
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to 
i64
-// NO-SANITIZE-WITHOUT-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds 
[6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds 
nuw [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:[[TMP0:%.*]] = load i32,

[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-09 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Nikita Popov (nikic)


Changes

When we have a gep inbounds from the base of an object (e.g. alloca or global), 
we know that the index cannot be negative, as this would go out of bounds. As 
such, we can infer nuw as well.

The implementation is a bit stricter than necessary, we could also accept one 
unknown index followed by known-non-negative indices.

Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently 
incorrectly doesn't require the inbounds for the alloca case, see 
https://github.com/AliveToolkit/alive2/issues/1138).

---

Patch is 90.53 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/119225.diff


29 Files Affected:

- (modified) clang/test/CodeGen/attr-counted-by.c (+2-2) 
- (modified) clang/test/CodeGen/union-tbaa1.c (+3-3) 
- (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+20) 
- (modified) llvm/test/Transforms/InstCombine/AMDGPU/memcpy-from-constant.ll 
(+3-3) 
- (modified) llvm/test/Transforms/InstCombine/cast_phi.ll (+2-2) 
- (modified) llvm/test/Transforms/InstCombine/load-cmp.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/memcpy-addrspace.ll (+8-8) 
- (modified) llvm/test/Transforms/InstCombine/memcpy-from-global.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/stpcpy-1.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/stpcpy_chk-1.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/strlen-1.ll (+3-3) 
- (modified) llvm/test/Transforms/InstCombine/strlen-4.ll (+8-8) 
- (modified) llvm/test/Transforms/InstCombine/strncat-2.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/strnlen-3.ll (+9-9) 
- (modified) llvm/test/Transforms/InstCombine/strnlen-4.ll (+2-2) 
- (modified) llvm/test/Transforms/InstCombine/strnlen-5.ll (+2-2) 
- (modified) llvm/test/Transforms/InstCombine/sub-gep.ll (+4-4) 
- (modified) llvm/test/Transforms/InstCombine/wcslen-1.ll (+3-3) 
- (modified) llvm/test/Transforms/InstCombine/wcslen-3.ll (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/wcslen-5.ll (+8-8) 
- (modified) 
llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+4-4) 
- (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt.ll (+3-3) 
- (modified) llvm/test/Transforms/LoopVectorize/X86/small-size.ll (+19-19) 
- (modified) llvm/test/Transforms/LoopVectorize/X86/x86_fp80-vector-store.ll 
(+2-2) 
- (modified) llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll (+3-3) 
- (modified) llvm/test/Transforms/LoopVectorize/multiple-address-spaces.ll 
(+2-2) 
- (modified) llvm/test/Transforms/LoopVectorize/non-const-n.ll (+3-3) 
- (modified) llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll 
(+87-87) 
- (modified) llvm/test/Transforms/SLPVectorizer/X86/operandorder.ll (+8-8) 


``diff
diff --git a/clang/test/CodeGen/attr-counted-by.c 
b/clang/test/CodeGen/attr-counted-by.c
index 6b3cad5708835b..be4c7f07e92150 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -1043,7 +1043,7 @@ int test12_a, test12_b;
 // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, 
ptr nonnull [[BAZ]]) #[[ATTR11:[0-9]+]]
 // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr noundef 
nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 4 
dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct 
[[TBAA_STRUCT7:![0-9]+]]
 // NO-SANITIZE-WITH-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [6 
x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds 
nuw [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
 // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP0:%.*]] = load i32, ptr [[ARRAYIDX]], 
align 4, !tbaa [[TBAA2]]
 // NO-SANITIZE-WITH-ATTR-NEXT:store i32 [[TMP0]], ptr @test12_b, align 4, 
!tbaa [[TBAA2]]
 // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP1:%.*]] = load i32, ptr getelementptr 
inbounds nuw (i8, ptr @test12_foo, i64 4), align 4, !tbaa [[TBAA2]]
@@ -1085,7 +1085,7 @@ int test12_a, test12_b;
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, 
ptr nonnull [[BAZ]]) #[[ATTR9:[0-9]+]]
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr 
noundef nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 
4 dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct 
[[TBAA_STRUCT7:![0-9]+]]
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to 
i64
-// NO-SANITIZE-WITHOUT-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds 
[6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds 
nuw [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:[[TMP0:%.*]] = load i32, 

[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)

2024-12-09 Thread Nikita Popov via cfe-commits

https://github.com/nikic created 
https://github.com/llvm/llvm-project/pull/119225

When we have a gep inbounds from the base of an object (e.g. alloca or global), 
we know that the index cannot be negative, as this would go out of bounds. As 
such, we can infer nuw as well.

The implementation is a bit stricter than necessary, we could also accept one 
unknown index followed by known-non-negative indices.

Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently 
incorrectly doesn't require the inbounds for the alloca case, see 
https://github.com/AliveToolkit/alive2/issues/1138).

>From 0c337ef857e1b96253e0d18959d66c4640780274 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Mon, 9 Dec 2024 15:59:40 +0100
Subject: [PATCH] [InstCombine] Infer nuw for gep inbounds from base of object

When we have a gep inbounds from the base of an object (e.g.
alloca or global), we know that the index cannot be negative,
as this would go out of bounds. As such, we can infer nuw as well.

The implementation is a bit stricter than necessary, we could
also accept one unknown index followed by known-non-negative
indices.

Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2
currently incorrectly doesn't require the inbounds for the alloca
case, see https://github.com/AliveToolkit/alive2/issues/1138).
---
 clang/test/CodeGen/attr-counted-by.c  |   4 +-
 clang/test/CodeGen/union-tbaa1.c  |   6 +-
 .../InstCombine/InstructionCombining.cpp  |  20 ++
 .../AMDGPU/memcpy-from-constant.ll|   6 +-
 llvm/test/Transforms/InstCombine/cast_phi.ll  |   4 +-
 llvm/test/Transforms/InstCombine/load-cmp.ll  |   2 +-
 .../InstCombine/memcpy-addrspace.ll   |  16 +-
 .../InstCombine/memcpy-from-global.ll |   2 +-
 llvm/test/Transforms/InstCombine/stpcpy-1.ll  |   2 +-
 .../Transforms/InstCombine/stpcpy_chk-1.ll|   2 +-
 llvm/test/Transforms/InstCombine/strlen-1.ll  |   6 +-
 llvm/test/Transforms/InstCombine/strlen-4.ll  |  16 +-
 llvm/test/Transforms/InstCombine/strncat-2.ll |   2 +-
 llvm/test/Transforms/InstCombine/strnlen-3.ll |  18 +-
 llvm/test/Transforms/InstCombine/strnlen-4.ll |   4 +-
 llvm/test/Transforms/InstCombine/strnlen-5.ll |   4 +-
 llvm/test/Transforms/InstCombine/sub-gep.ll   |   8 +-
 llvm/test/Transforms/InstCombine/wcslen-1.ll  |   6 +-
 llvm/test/Transforms/InstCombine/wcslen-3.ll  |   2 +-
 llvm/test/Transforms/InstCombine/wcslen-5.ll  |  16 +-
 .../AArch64/sve-interleaved-accesses.ll   |   8 +-
 .../LoopVectorize/AArch64/sve2-histcnt.ll |   6 +-
 .../LoopVectorize/X86/small-size.ll   |  38 ++--
 .../X86/x86_fp80-vector-store.ll  |   4 +-
 .../LoopVectorize/interleaved-accesses.ll |   6 +-
 .../LoopVectorize/multiple-address-spaces.ll  |   4 +-
 .../Transforms/LoopVectorize/non-const-n.ll   |   6 +-
 .../PhaseOrdering/X86/excessive-unrolling.ll  | 174 +-
 .../SLPVectorizer/X86/operandorder.ll |  16 +-
 29 files changed, 214 insertions(+), 194 deletions(-)

diff --git a/clang/test/CodeGen/attr-counted-by.c 
b/clang/test/CodeGen/attr-counted-by.c
index 6b3cad5708835b..be4c7f07e92150 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -1043,7 +1043,7 @@ int test12_a, test12_b;
 // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, 
ptr nonnull [[BAZ]]) #[[ATTR11:[0-9]+]]
 // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr noundef 
nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 4 
dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct 
[[TBAA_STRUCT7:![0-9]+]]
 // NO-SANITIZE-WITH-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [6 
x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds 
nuw [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
 // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP0:%.*]] = load i32, ptr [[ARRAYIDX]], 
align 4, !tbaa [[TBAA2]]
 // NO-SANITIZE-WITH-ATTR-NEXT:store i32 [[TMP0]], ptr @test12_b, align 4, 
!tbaa [[TBAA2]]
 // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP1:%.*]] = load i32, ptr getelementptr 
inbounds nuw (i8, ptr @test12_foo, i64 4), align 4, !tbaa [[TBAA2]]
@@ -1085,7 +1085,7 @@ int test12_a, test12_b;
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, 
ptr nonnull [[BAZ]]) #[[ATTR9:[0-9]+]]
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr 
noundef nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 
4 dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct 
[[TBAA_STRUCT7:![0-9]+]]
 // NO-SANITIZE-WITHOUT-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to 
i64
-// NO-SANITIZE-WITHOUT-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds 
[6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]]
+// NO-SANITIZE-WITHOUT-ATTR-NEXT: