[Bug rtl-optimization/108826] Inefficient address generation on POWER and RISC-V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lis8215 at gmail dot com

--- Comment #7 from Andrew Pinski ---
*** Bug 111626 has been marked as a duplicate of this bug. ***
[Bug rtl-optimization/108826] Inefficient address generation on POWER and RISC-V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

--- Comment #6 from Andrew Pinski ---
(In reply to palmer from comment #5)
> We've run into a handful of things that look like this before, I'm not sure
> if it's a backend issue or something more general. There's two patterns
> here that are frequently bad on RISC-V: "unsigned int" array indices and
> unsigned int shifting. I think they might both boil down to some problems
> we have tracking the high parts of registers around ABI boundaries.

That seems unrelated to the issue here. In this case the shift is in DI
(ptrmode) mode already so the shift is fine. See comment #4 for the RTL
(this was the RTL even for RV64).
[Bug rtl-optimization/108826] Inefficient address generation on POWER and RISC-V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

--- Comment #5 from palmer at gcc dot gnu.org ---
We've run into a handful of things that look like this before, I'm not sure
if it's a backend issue or something more general. There's two patterns
here that are frequently bad on RISC-V: "unsigned int" array indices and
unsigned int shifting. I think they might both boil down to some problems
we have tracking the high parts of registers around ABI boundaries.

FWIW, the smallest bad code I can get is

    unsigned int func(unsigned int ui) {
        return (ui >> 6 & 5) << 2;
    }

    func:
        srliw   a0,a0,6
        slliw   a0,a0,2
        andi    a0,a0,20
        ret

which is particularly awkward as enough is going right to try and move that
andi, but we still end up with the double shifts.
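To make the "double shifts" point concrete, here is a minimal sketch of the arithmetic, not anything taken from GCC itself. The name `func_single_shift` is hypothetical; it folds the two shifts into one right shift by 6 - 2 = 4 and uses the mask 5 << 2 = 20 that already shows up in the `andi` above. `check_forms_agree` is likewise a hypothetical helper that compares the two forms over a range of inputs.

```c
#include <assert.h>

/* The function from comment #5, unchanged. */
static unsigned int func(unsigned int ui)
{
    return (ui >> 6 & 5) << 2;
}

/* Hypothetical single-shift form: bits 6 and 8 of ui land in bits 2 and 4,
   so one shift right by 4 followed by a mask of 20 selects the same bits. */
static unsigned int func_single_shift(unsigned int ui)
{
    return (ui >> 4) & 20;
}

/* Compare both forms for every input below `limit`. */
static void check_forms_agree(unsigned int limit)
{
    for (unsigned int ui = 0; ui < limit; ui++)
        assert(func(ui) == func_single_shift(ui));
}
```

Calling `check_forms_agree(1u << 16)` exercises every combination of the two bits that matter. Whether the compiler can be taught to emit the single-shift sequence is the open question in this comment; the sketch only shows that the two forms compute the same value.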
[Bug rtl-optimization/108826] Inefficient address generation on POWER and RISC-V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2023-02-16
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #4 from Andrew Pinski ---
Trying 13, 14, 15 -> 16:
   13: r84:DI=r83:DI+0xc8
      REG_DEAD r83:DI
   14: r85:DI=r84:DI<<0x2
      REG_DEAD r84:DI
   15: r86:DI=r72:DI+r85:DI
      REG_DEAD r85:DI
   16: r76:DI=sign_extend([r86:DI])
      REG_DEAD r86:DI
Failed to match this instruction:
(set (reg:DI 76 [ _5 ])
    (sign_extend:DI (mem:SI (plus:DI (plus:DI (mult:DI (reg:DI 83)
                        (const_int 4 [0x4]))
                    (reg/f:DI 72 [ _nettle_aes_decrypt_T.0_1 ]))
                (const_int 800 [0x320])) [2 _nettle_aes_decrypt_T.0_1->table[2][_4]+0 S4 A32])))
Failed to match this instruction:
(set (reg/f:DI 86)
    (plus:DI (ashift:DI (reg:DI 83)
            (const_int 2 [0x2]))
        (reg/f:DI 72 [ _nettle_aes_decrypt_T.0_1 ])))

So combine does know how to combine all 4 instructions and produce the plus
800 there. But then it goes and splits it up and fails. I can't remember if
there is 4->3 splitting or just 4->2.
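The reassociation combine found can be written out in plain C as a rough sketch. The function names are hypothetical, the local variables borrow the pseudo-register numbers from the dump only for readability, and `uint64_t` stands in for DI mode. The first function follows insns 13 to 15 as they stand; the second computes base + (index << 2) first and leaves the constant 800 (0xc8 << 2) to be added last, where a load could carry it as a displacement.

```c
#include <assert.h>
#include <stdint.h>

/* Address as computed by insns 13-15: base + ((idx + 0xc8) << 2). */
static uint64_t addr_original(uint64_t base, uint64_t idx)
{
    uint64_t r84 = idx + 0xc8;   /* insn 13 */
    uint64_t r85 = r84 << 2;     /* insn 14 */
    return base + r85;           /* insn 15 */
}

/* Reassociated form: (base + (idx << 2)) + 800, matching the shape of the
   combined RTL, with the constant separated from the scaled index. */
static uint64_t addr_reassociated(uint64_t base, uint64_t idx)
{
    uint64_t r86 = (idx << 2) + base;  /* shape of the second failed pattern */
    return r86 + 800;                  /* constant left for the memory access */
}

/* Hypothetical helper: check that both forms agree for one input pair. */
static void check_same_address(uint64_t base, uint64_t idx)
{
    assert(addr_original(base, idx) == addr_reassociated(base, idx));
}
```

Because unsigned 64-bit arithmetic wraps, the two forms agree for every base and index, so the rewrite is purely a question of which instructions the target can match. This sketch says nothing about which patterns a given backend actually provides.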
[Bug rtl-optimization/108826] Inefficient address generation on POWER and RISC-V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

--- Comment #3 from Andrew Pinski ---
(In reply to Andrew Pinski from comment #2)
> Note I need to better understand why the C++ front-end thinks this would be
> invalid ...

Oh, because the struct is unnamed :).
[Bug rtl-optimization/108826] Inefficient address generation on POWER and RISC-V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

--- Comment #2 from Andrew Pinski ---
Actually this is aarch64:

        ldr     x1, [x0, #:lo12:.LANCHOR0]
        ldr     w0, [x3, 8]
        and     w0, w2, w0, lsr 6
        add     x0, x0, 200
        ldr     w0, [x1, x0, lsl 2]
        str     w0, [x1, 800]

Note I need to better understand why the C++ front-end thinks this would be
invalid ...
[Bug rtl-optimization/108826] Inefficient address generation on POWER and RISC-V
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

--- Comment #1 from Andrew Pinski ---
AARCH64 looks ok too because of the use of more complex addresses:

        ldr     w0, [x0, #:lo12:.LANCHOR0]
        and     w0, w2, w0, lsr 6
        add     x0, x0, 200
        ldr     w0, [x1, x0, lsl 2]