https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81778
--- Comment #3 from Tom de Vries <vries at gcc dot gnu.org> ---
The core loop looks like this:
...
$L4:
        add.u32 %r31,%r42,-1;
        cvt.u64.u32 %r63,%r31;
        shl.b64 %r64,%r63,2;
        add.u64 %r65,%r75,%r64;
        ld.u32 %r71,[%r65];
        add.u32 %r70,%r71,-4;
        st.u32 [%r65],%r70;
        add.u32 %r42,%r42,-32;
        setp.lt.u32 %r72,%r30,%r42;
        @ %r72 bra $L4;
...

If we comment out the branch, the testcase passes.

I've investigated the ranges of the registers to understand why the branch is
taken. [ By storing the reg into the array, and printing it out afterwards.
That meant disabling the jump, otherwise I don't get to the print. ]:
...
r42: [0,63] -> [-31,32]
r30: [0-31] -> 0
     [32-63] -> 32
...

So for index 0, we have:
...
setp.lt.u32 %r72,0,-31;
...
This evaluates to true, because (u32)-31 is bigger than 0.

Changing the compare from unsigned to signed makes the test pass.