[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
fmayer wrote: > Seems like the root cause was an vectorizer bug, sent > https://github.com/llvm/llvm-project/pull/120730 that fixes it. This is landed, so from that side we could reland this. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
fmayer wrote: Seems like the root cause was an vectorizer bug, sent https://github.com/llvm/llvm-project/pull/120730 that fixes it. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
nikic wrote: @fmayer The `withoutNoUnsignedSignedWrap()` API already removed the inbounds flag as well (because inbounds requires nusw). So I think the effect of your change is to drop inbounds in case all indices are negative, which should generally not be necessary. It's pretty likely that the root cause here is indeed incorrect inbounds preservation somewhere, but I think the logic in that transform is correct. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
fmayer wrote: I was working on this miscompile. I cannot give a reproducer right now, but the following observations: The generated IR is clearly incorrect (the `inbounds nuw`), of the form: ``` %x = alloca [12 x i32], align 16 [...] %59 = getelementptr inbounds nuw i8, ptr %x, i64 -4 %wide.masked.load.2 = call <4 x i32> @llvm.masked.load.v4i32.p0(ptr nonnull %59, i32 4, <4 x i1> , <4 x i32> poison) ``` This is some interaction between this CL and these two InstCombines: * https://github.com/llvm/llvm-project/blob/5a3f1acad7e8ce0e8cb90165794dce71f4b80bcd/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp#L2495 * https://github.com/llvm/llvm-project/blob/5a3f1acad7e8ce0e8cb90165794dce71f4b80bcd/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp#L2949 If I disable any of the three, the error goes away. The observation is that with the two above, we get an `inbounds` getelementptr that is _NOT_ inbounds (`-4`), that then gets masked vector read to correct for that. The comment in the first link implies that it maybe should be removing `inbounds`: ``` // Even if the total offset is inbounds, we may end up representing it // by first performing a larger negative offset, and then a smaller // positive one. The large negative offset might go out of bounds. Only // preserve inbounds if all signs are the same. ``` but it then goes and only removes the wrap flags: ``` if (Idx.isNonNegative() != ConstIndices[0].isNonNegative()) NW = NW.withoutNoUnsignedSignedWrap(); if (!Idx.isNonNegative()) NW = NW.withoutNoUnsignedWrap(); ``` If I change this to ``` if (Idx.isNonNegative() != ConstIndices[0].isNonNegative()) NW = NW.withoutNoUnsignedSignedWrap().withoutInBounds(); if (!Idx.isNonNegative()) NW = NW.withoutNoUnsignedWrap().withoutInBounds(); ``` the error also goes away. If I disable the second link (i8 canonicalization) the access is then `-1` (makes sense) but *NOT* marked as `inbounds`. I don't feel like this is a sanitizer issue, it seems like MSan might just have gotten unlucky. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
alexfh wrote: > > Heads up: apart from a number of instances of actual UB in the code (which > > are quite tedious to localize and understand due to a lack of specialized > > tooling) we're investigating a bad interaction of this commit with msan, > > which seems to result in a miscompilation. > > Thanks for the heads up! As I mentioned on another thread, I'm happy to just > entirely revert this patch due to lack of good sanitizer support (I can do it > tomorrow, but feel free to do it yourself earlier). Having a reproducer for > the msan issue is probably still valuable, as it may indicate a larger > problem. Thanks! Sent a PR with a revert: https://github.com/llvm/llvm-project/pull/120460 As mentioned, a test case is in the works. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
nikic wrote: > Heads up: apart from a number of instances of actual UB in the code (which > are quite tedious to localize and understand due to a lack of specialized > tooling) we're investigating a bad interaction of this commit with msan, > which seems to result in a miscompilation. Thanks for the heads up! As I mentioned on another thread, I'm happy to just entirely revert this patch due to lack of good sanitizer support (I can do it tomorrow, but feel free to do it yourself earlier). Having a reproducer for the msan issue is probably still valuable, as it may indicate a larger problem. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
alexfh wrote: Heads up: apart from a number of instances of actual UB in the code (which are quite tedious to localize and understand due to a lack of specialized tooling) we're investigating a bad interaction of this commit with msan, which seems to result in a miscompilation. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
vitalybuka wrote: > I think > [llvm/llvm-test-suite#190](https://github.com/llvm/llvm-test-suite/pull/190) > should fix this issue, but haven't tested on an affected arch. > > I'm a bit worried that we don't have a sanitizer to catch this issue (the > negative left shift issue is an unrelated one). `-fsanitize=local-bounds` is quite related, but I guess InstCombine happens before the check inserted https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
serge-sans-paille wrote: Tested on an aarch64 machine where I've been able to reproduce. Works like a charm, thanks @nikic https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
nikic wrote: I think https://github.com/llvm/llvm-test-suite/pull/190 should fix this issue, but haven't tested on an affected arch. I'm a bit worried that we don't have a sanitizer to catch this issue (the negative left shift issue is an unrelated one). https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
dtcxzyw wrote: > huffbench.c:319:10: runtime error: left shift of negative value -93 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior huffbench.c:319:10 https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
nikic wrote: >From a quick look, it's probably due to UB on this line: >https://github.com/llvm/llvm-test-suite/blob/4f14adb8a2f2c5fa02c6b3643d45e74adfcb31f9/SingleSource/Benchmarks/CoyoteBench/huffbench.c#L114 > It indexes below the start of an object. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
dtcxzyw wrote: > Seems that this change causes Segment Fault on multiple targets including > aarch64, loongarch64 and riscv64. > > This is detected by a LoongArch > [buildbot](https://lab.llvm.org/staging/#/builders/20/builds/6282) and > manually checked on aarch64 and riscv64 QEMUs. > > I wonder why aarch64 buildbot doesn't report this. I find that > `CMAKE_CXX_FLAGS_RELEASE` is empty on [aarch64 > bot](https://lab.llvm.org/buildbot/#/builders/121/builds/614) while it is > `-O3 -DNDEBUG` on loongarch64. Is it an issue of the llvm-zorg framework? > > UPDATE: This > [buildbot](https://lab.llvm.org/buildbot/#/builders/143)(aarch64) also fail > which uses `-O3`. > > # Reproduce > $ clang llvm-test-suite/SingleSource/Benchmarks/CoyoteBench/huffbench.c -O3 > -o huffbench $ ./huffbench Reproduced on Loongson-3A5000. I will provide a reduced test case later. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
serge-sans-paille wrote: Hey @nikic, I've bisected the failures from this buildbot: https://lab.llvm.org/buildbot/#/builders/199/builds/57 down to this commit. The reproducer is unfortunately not pleasant: get a stage-2 build of clang then run the llvm-test suite. This reproducibility chokes on `SingleSource/Benchmarks/CoyoteBench/huffbench.test`, as can be seen on the buildbot log. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
SixWeining wrote: Seems that this change causes Segment Fault on multiple targets including aarch64, loongarch64 and riscv64. This is detected by a LoongArch [buildbot](https://lab.llvm.org/staging/#/builders/20/builds/6282) and manually checked on aarch64 and riscv64 QEMUs. I wonder why aarch64 buildbot doesn't report this. I find that `CMAKE_CXX_FLAGS_RELEASE` is empty on aarch64 bot while it is `-O3 -DNDEBUG` on loongarch64. Is it an issue of the llvm-zorg framework? # Reproduce $ clang llvm-test-suite/SingleSource/Benchmarks/CoyoteBench/huffbench.c -O3 -o huffbench $ ./huffbench https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `lldb-arm-ubuntu` running on `linaro-lldb-arm-ubuntu` while building `clang,llvm` at step 6 "test". Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/8300 Here is the relevant piece of the build log for the reference ``` Step 6 (test) failure: build (failure) ... PASS: lldb-api :: functionalities/dyld-exec-linux/TestDyldExecLinux.py (443 of 2846) UNSUPPORTED: lldb-api :: functionalities/fat_archives/TestFatArchives.py (444 of 2846) PASS: lldb-api :: functionalities/executable_first/TestExecutableFirst.py (445 of 2846) PASS: lldb-api :: functionalities/find-line-entry/TestFindLineEntry.py (446 of 2846) UNSUPPORTED: lldb-api :: functionalities/fork/concurrent_vfork/TestConcurrentVFork.py (447 of 2846) PASS: lldb-api :: functionalities/exec/TestExec.py (448 of 2846) PASS: lldb-api :: functionalities/float-display/TestFloatDisplay.py (449 of 2846) PASS: lldb-api :: functionalities/gdb_remote_client/TestAArch64XMLRegOffsets.py (450 of 2846) PASS: lldb-api :: functionalities/dynamic_value_child_count/TestDynamicValueChildCount.py (451 of 2846) PASS: lldb-api :: functionalities/gdb_remote_client/TestArmRegisterDefinition.py (452 of 2846) FAIL: lldb-api :: functionalities/fork/resumes-child/TestForkResumesChild.py (453 of 2846) TEST 'lldb-api :: functionalities/fork/resumes-child/TestForkResumesChild.py' FAILED Script: -- /usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --arch armv8l --build-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/dsymutil --make /usr/bin/make --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/functionalities/fork/resumes-child -p TestForkResumesChild.py -- Exit Code: 1 Command Output (stdout): -- lldb version 20.0.0git (https://github.com/llvm/llvm-project.git revision e21ab4d16b555c28ded307571d138f594f33e325) clang revision e21ab4d16b555c28ded307571d138f594f33e325 llvm revision e21ab4d16b555c28ded307571d138f594f33e325 Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc'] -- Command Output (stderr): -- FAIL: LLDB (/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang-arm) :: test_step_over_fork (TestForkResumesChild.TestForkResumesChild) == FAIL: test_step_over_fork (TestForkResumesChild.TestForkResumesChild) -- Traceback (most recent call last): File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/functionalities/fork/resumes-child/TestForkResumesChild.py", line 22, in test_step_over_fork self.expect("continue", substrs=["exited with status = 0"]) File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2371, in expect self.runCmd( File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 996, in runCmd self.assertTrue(self.res.Succeeded(), msg + output) AssertionError: False is not true : Command 'continue' did not return successfully Error output: error: Process must be launched. Config=arm-/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang -- Ran 1 test in 0.805s FAILED (failures=1) ``` https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
https://github.com/nikic closed https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
https://github.com/dtcxzyw approved this pull request. LGTM. https://github.com/llvm/llvm-project/pull/119225 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
llvmbot wrote: @llvm/pr-subscribers-llvm-transforms Author: Nikita Popov (nikic) Changes When we have a gep inbounds from the base of an object (e.g. alloca or global), we know that the index cannot be negative, as this would go out of bounds. As such, we can infer nuw as well. The implementation is a bit stricter than necessary, we could also accept one unknown index followed by known-non-negative indices. Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently incorrectly doesn't require the inbounds for the alloca case, see https://github.com/AliveToolkit/alive2/issues/1138). --- Patch is 90.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119225.diff 29 Files Affected: - (modified) clang/test/CodeGen/attr-counted-by.c (+2-2) - (modified) clang/test/CodeGen/union-tbaa1.c (+3-3) - (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+20) - (modified) llvm/test/Transforms/InstCombine/AMDGPU/memcpy-from-constant.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/cast_phi.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/load-cmp.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/memcpy-addrspace.ll (+8-8) - (modified) llvm/test/Transforms/InstCombine/memcpy-from-global.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/stpcpy-1.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/stpcpy_chk-1.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/strlen-1.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/strlen-4.ll (+8-8) - (modified) llvm/test/Transforms/InstCombine/strncat-2.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/strnlen-3.ll (+9-9) - (modified) llvm/test/Transforms/InstCombine/strnlen-4.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/strnlen-5.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/sub-gep.ll (+4-4) - (modified) llvm/test/Transforms/InstCombine/wcslen-1.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/wcslen-3.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/wcslen-5.ll (+8-8) - (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+4-4) - (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt.ll (+3-3) - (modified) llvm/test/Transforms/LoopVectorize/X86/small-size.ll (+19-19) - (modified) llvm/test/Transforms/LoopVectorize/X86/x86_fp80-vector-store.ll (+2-2) - (modified) llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll (+3-3) - (modified) llvm/test/Transforms/LoopVectorize/multiple-address-spaces.ll (+2-2) - (modified) llvm/test/Transforms/LoopVectorize/non-const-n.ll (+3-3) - (modified) llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll (+87-87) - (modified) llvm/test/Transforms/SLPVectorizer/X86/operandorder.ll (+8-8) ``diff diff --git a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c index 6b3cad5708835b..be4c7f07e92150 100644 --- a/clang/test/CodeGen/attr-counted-by.c +++ b/clang/test/CodeGen/attr-counted-by.c @@ -1043,7 +1043,7 @@ int test12_a, test12_b; // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, ptr nonnull [[BAZ]]) #[[ATTR11:[0-9]+]] // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 4 dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct [[TBAA_STRUCT7:![0-9]+]] // NO-SANITIZE-WITH-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64 -// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] +// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds nuw [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP0:%.*]] = load i32, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]] // NO-SANITIZE-WITH-ATTR-NEXT:store i32 [[TMP0]], ptr @test12_b, align 4, !tbaa [[TBAA2]] // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP1:%.*]] = load i32, ptr getelementptr inbounds nuw (i8, ptr @test12_foo, i64 4), align 4, !tbaa [[TBAA2]] @@ -1085,7 +1085,7 @@ int test12_a, test12_b; // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, ptr nonnull [[BAZ]]) #[[ATTR9:[0-9]+]] // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 4 dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct [[TBAA_STRUCT7:![0-9]+]] // NO-SANITIZE-WITHOUT-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64 -// NO-SANITIZE-WITHOUT-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] +// NO-SANITIZE-WITHOUT-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds nuw [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] // NO-SANITIZE-WITHOUT-ATTR-NEXT:[[TMP0:%.*]] = load i32,
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu Author: Nikita Popov (nikic) Changes When we have a gep inbounds from the base of an object (e.g. alloca or global), we know that the index cannot be negative, as this would go out of bounds. As such, we can infer nuw as well. The implementation is a bit stricter than necessary, we could also accept one unknown index followed by known-non-negative indices. Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently incorrectly doesn't require the inbounds for the alloca case, see https://github.com/AliveToolkit/alive2/issues/1138). --- Patch is 90.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119225.diff 29 Files Affected: - (modified) clang/test/CodeGen/attr-counted-by.c (+2-2) - (modified) clang/test/CodeGen/union-tbaa1.c (+3-3) - (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+20) - (modified) llvm/test/Transforms/InstCombine/AMDGPU/memcpy-from-constant.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/cast_phi.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/load-cmp.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/memcpy-addrspace.ll (+8-8) - (modified) llvm/test/Transforms/InstCombine/memcpy-from-global.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/stpcpy-1.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/stpcpy_chk-1.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/strlen-1.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/strlen-4.ll (+8-8) - (modified) llvm/test/Transforms/InstCombine/strncat-2.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/strnlen-3.ll (+9-9) - (modified) llvm/test/Transforms/InstCombine/strnlen-4.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/strnlen-5.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/sub-gep.ll (+4-4) - (modified) llvm/test/Transforms/InstCombine/wcslen-1.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/wcslen-3.ll (+1-1) - (modified) llvm/test/Transforms/InstCombine/wcslen-5.ll (+8-8) - (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+4-4) - (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt.ll (+3-3) - (modified) llvm/test/Transforms/LoopVectorize/X86/small-size.ll (+19-19) - (modified) llvm/test/Transforms/LoopVectorize/X86/x86_fp80-vector-store.ll (+2-2) - (modified) llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll (+3-3) - (modified) llvm/test/Transforms/LoopVectorize/multiple-address-spaces.ll (+2-2) - (modified) llvm/test/Transforms/LoopVectorize/non-const-n.ll (+3-3) - (modified) llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll (+87-87) - (modified) llvm/test/Transforms/SLPVectorizer/X86/operandorder.ll (+8-8) ``diff diff --git a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c index 6b3cad5708835b..be4c7f07e92150 100644 --- a/clang/test/CodeGen/attr-counted-by.c +++ b/clang/test/CodeGen/attr-counted-by.c @@ -1043,7 +1043,7 @@ int test12_a, test12_b; // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, ptr nonnull [[BAZ]]) #[[ATTR11:[0-9]+]] // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 4 dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct [[TBAA_STRUCT7:![0-9]+]] // NO-SANITIZE-WITH-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64 -// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] +// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds nuw [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP0:%.*]] = load i32, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]] // NO-SANITIZE-WITH-ATTR-NEXT:store i32 [[TMP0]], ptr @test12_b, align 4, !tbaa [[TBAA2]] // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP1:%.*]] = load i32, ptr getelementptr inbounds nuw (i8, ptr @test12_foo, i64 4), align 4, !tbaa [[TBAA2]] @@ -1085,7 +1085,7 @@ int test12_a, test12_b; // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, ptr nonnull [[BAZ]]) #[[ATTR9:[0-9]+]] // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 4 dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct [[TBAA_STRUCT7:![0-9]+]] // NO-SANITIZE-WITHOUT-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64 -// NO-SANITIZE-WITHOUT-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] +// NO-SANITIZE-WITHOUT-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds nuw [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] // NO-SANITIZE-WITHOUT-ATTR-NEXT:[[TMP0:%.*]] = load i32,
[clang] [llvm] [InstCombine] Infer nuw for gep inbounds from base of object (PR #119225)
https://github.com/nikic created https://github.com/llvm/llvm-project/pull/119225 When we have a gep inbounds from the base of an object (e.g. alloca or global), we know that the index cannot be negative, as this would go out of bounds. As such, we can infer nuw as well. The implementation is a bit stricter than necessary, we could also accept one unknown index followed by known-non-negative indices. Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently incorrectly doesn't require the inbounds for the alloca case, see https://github.com/AliveToolkit/alive2/issues/1138). >From 0c337ef857e1b96253e0d18959d66c4640780274 Mon Sep 17 00:00:00 2001 From: Nikita Popov Date: Mon, 9 Dec 2024 15:59:40 +0100 Subject: [PATCH] [InstCombine] Infer nuw for gep inbounds from base of object When we have a gep inbounds from the base of an object (e.g. alloca or global), we know that the index cannot be negative, as this would go out of bounds. As such, we can infer nuw as well. The implementation is a bit stricter than necessary, we could also accept one unknown index followed by known-non-negative indices. Proof: https://alive2.llvm.org/ce/z/Hp7-6w (Note that alive2 currently incorrectly doesn't require the inbounds for the alloca case, see https://github.com/AliveToolkit/alive2/issues/1138). --- clang/test/CodeGen/attr-counted-by.c | 4 +- clang/test/CodeGen/union-tbaa1.c | 6 +- .../InstCombine/InstructionCombining.cpp | 20 ++ .../AMDGPU/memcpy-from-constant.ll| 6 +- llvm/test/Transforms/InstCombine/cast_phi.ll | 4 +- llvm/test/Transforms/InstCombine/load-cmp.ll | 2 +- .../InstCombine/memcpy-addrspace.ll | 16 +- .../InstCombine/memcpy-from-global.ll | 2 +- llvm/test/Transforms/InstCombine/stpcpy-1.ll | 2 +- .../Transforms/InstCombine/stpcpy_chk-1.ll| 2 +- llvm/test/Transforms/InstCombine/strlen-1.ll | 6 +- llvm/test/Transforms/InstCombine/strlen-4.ll | 16 +- llvm/test/Transforms/InstCombine/strncat-2.ll | 2 +- llvm/test/Transforms/InstCombine/strnlen-3.ll | 18 +- llvm/test/Transforms/InstCombine/strnlen-4.ll | 4 +- llvm/test/Transforms/InstCombine/strnlen-5.ll | 4 +- llvm/test/Transforms/InstCombine/sub-gep.ll | 8 +- llvm/test/Transforms/InstCombine/wcslen-1.ll | 6 +- llvm/test/Transforms/InstCombine/wcslen-3.ll | 2 +- llvm/test/Transforms/InstCombine/wcslen-5.ll | 16 +- .../AArch64/sve-interleaved-accesses.ll | 8 +- .../LoopVectorize/AArch64/sve2-histcnt.ll | 6 +- .../LoopVectorize/X86/small-size.ll | 38 ++-- .../X86/x86_fp80-vector-store.ll | 4 +- .../LoopVectorize/interleaved-accesses.ll | 6 +- .../LoopVectorize/multiple-address-spaces.ll | 4 +- .../Transforms/LoopVectorize/non-const-n.ll | 6 +- .../PhaseOrdering/X86/excessive-unrolling.ll | 174 +- .../SLPVectorizer/X86/operandorder.ll | 16 +- 29 files changed, 214 insertions(+), 194 deletions(-) diff --git a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c index 6b3cad5708835b..be4c7f07e92150 100644 --- a/clang/test/CodeGen/attr-counted-by.c +++ b/clang/test/CodeGen/attr-counted-by.c @@ -1043,7 +1043,7 @@ int test12_a, test12_b; // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, ptr nonnull [[BAZ]]) #[[ATTR11:[0-9]+]] // NO-SANITIZE-WITH-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 4 dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct [[TBAA_STRUCT7:![0-9]+]] // NO-SANITIZE-WITH-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64 -// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] +// NO-SANITIZE-WITH-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds nuw [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP0:%.*]] = load i32, ptr [[ARRAYIDX]], align 4, !tbaa [[TBAA2]] // NO-SANITIZE-WITH-ATTR-NEXT:store i32 [[TMP0]], ptr @test12_b, align 4, !tbaa [[TBAA2]] // NO-SANITIZE-WITH-ATTR-NEXT:[[TMP1:%.*]] = load i32, ptr getelementptr inbounds nuw (i8, ptr @test12_foo, i64 4), align 4, !tbaa [[TBAA2]] @@ -1085,7 +1085,7 @@ int test12_a, test12_b; // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.lifetime.start.p0(i64 24, ptr nonnull [[BAZ]]) #[[ATTR9:[0-9]+]] // NO-SANITIZE-WITHOUT-ATTR-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 4 dereferenceable(24) [[BAZ]], ptr noundef nonnull align 4 dereferenceable(24) @test12_bar, i64 24, i1 false), !tbaa.struct [[TBAA_STRUCT7:![0-9]+]] // NO-SANITIZE-WITHOUT-ATTR-NEXT:[[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64 -// NO-SANITIZE-WITHOUT-ATTR-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [6 x i32], ptr [[BAZ]], i64 0, i64 [[IDXPROM]] +// NO-SANITIZE-WITHOUT-ATTR-NEXT: