[Bug rtl-optimization/108826] Inefficient address generation on POWER and RISC-V

2023-09-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lis8215 at gmail dot com

--- Comment #7 from Andrew Pinski  ---
*** Bug 111626 has been marked as a duplicate of this bug. ***

2023-02-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

--- Comment #6 from Andrew Pinski  ---
(In reply to palmer from comment #5)
> We've run into a handful of things that look like this before; I'm not sure
> if it's a backend issue or something more general.  There are two patterns
> here that are frequently bad on RISC-V: "unsigned int" array indices and
> unsigned int shifting.  I think they might both boil down to some problems
> we have tracking the high parts of registers around ABI boundaries.

That seems unrelated to the issue here. In this case the shift is already in
DI mode (ptrmode), so the shift is fine. See comment #4 for the RTL (this was
the RTL even for RV64).

2023-02-16 Thread palmer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

--- Comment #5 from palmer at gcc dot gnu.org ---
We've run into a handful of things that look like this before; I'm not sure if
it's a backend issue or something more general.  There are two patterns here
that are frequently bad on RISC-V: "unsigned int" array indices and unsigned int
shifting.  I think they might both boil down to some problems we have tracking
the high parts of registers around ABI boundaries.

FWIW, the smallest bad code I can get is

unsigned int func(unsigned int ui) {
    return (ui >> 6 & 5) << 2;
}

func:
        srliw   a0,a0,6
        slliw   a0,a0,2
        andi    a0,a0,20
        ret

which is particularly awkward as enough is going right to try and move that
andi, but we still end up with the double shifts.
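
For reference, a minimal sketch of the folding one would hope for here, written by hand rather than taken from any compiler output: shifting right by 6 and then left by 2 is a net right shift by 4, and the mask 5 moves up to 5 << 2 == 20. The function names and the check helper are hypothetical, used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The expression from the test case: shift right, mask, shift left. */
static uint32_t func_two_shifts(uint32_t ui) {
    return ((ui >> 6) & 5u) << 2;
}

/* Hand-folded form (an assumption about the desired result, not GCC output):
   a single right shift by 4, then the mask 5 << 2 == 20. */
static uint32_t func_one_shift(uint32_t ui) {
    return (ui >> 4) & 20u;
}

/* Compare both forms over a range of inputs, with and without high bits set. */
static void check_forms_agree(void) {
    for (uint32_t i = 0; i < (1u << 12); i++) {
        assert(func_two_shifts(i) == func_one_shift(i));
        assert(func_two_shifts(i | 0xFFFFF000u) == func_one_shift(i | 0xFFFFF000u));
    }
}
```

Only bits 6 and 8 of the input reach the result, so the two forms agree for every input; the range check above is just a sanity test of that reasoning.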

2023-02-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2023-02-16
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #4 from Andrew Pinski  ---
Trying 13, 14, 15 -> 16:
   13: r84:DI=r83:DI+0xc8
      REG_DEAD r83:DI
   14: r85:DI=r84:DI<<0x2
      REG_DEAD r84:DI
   15: r86:DI=r72:DI+r85:DI
      REG_DEAD r85:DI
   16: r76:DI=sign_extend([r86:DI])
      REG_DEAD r86:DI
Failed to match this instruction:
(set (reg:DI 76 [ _5 ])
    (sign_extend:DI (mem:SI (plus:DI (plus:DI (mult:DI (reg:DI 83)
                        (const_int 4 [0x4]))
                    (reg/f:DI 72 [ _nettle_aes_decrypt_T.0_1 ]))
                (const_int 800 [0x320])) [2 _nettle_aes_decrypt_T.0_1->table[2][_4]+0 S4 A32])))
Failed to match this instruction:
(set (reg/f:DI 86)
    (plus:DI (ashift:DI (reg:DI 83)
            (const_int 2 [0x2]))
        (reg/f:DI 72 [ _nettle_aes_decrypt_T.0_1 ])))


So combine does know how to combine all 4 instructions and produce the plus 800
there. But then it splits the result up and the pieces fail to match. I can't
remember whether there is 4->3 splitting or just 4->2.
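
To spell out the arithmetic in that dump, here is a small sketch in C. The function names are hypothetical; only the constants (0xc8, the shift by 2, and 800) come from the RTL. It shows that the address built by insns 13-15 equals base + idx*4 + 800, which is the form combine produced before splitting. On a target with register-plus-immediate loads, the 800 could presumably be carried as the displacement of the load itself.

```c
#include <assert.h>
#include <stdint.h>

/* Address as computed step by step by insns 13, 14 and 15. */
static uintptr_t addr_three_insns(uintptr_t base, uintptr_t idx) {
    uintptr_t t = idx + 200;   /* insn 13: r84 = r83 + 0xc8 */
    t <<= 2;                   /* insn 14: r85 = r84 << 2   */
    return base + t;           /* insn 15: r86 = r72 + r85  */
}

/* Folded form: scale the index, add the base, leave 800 as a constant offset. */
static uintptr_t addr_folded(uintptr_t base, uintptr_t idx) {
    return (base + (idx << 2)) + 800;
}

/* Compare the two forms for a range of index values. */
static void check_addresses_agree(uintptr_t base) {
    for (uintptr_t idx = 0; idx < 1024; idx++) {
        assert(addr_three_insns(base, idx) == addr_folded(base, idx));
    }
}
```

Both functions use unsigned arithmetic, so the identity (idx + 200) << 2 == (idx << 2) + 800 holds modulo the pointer width for any index.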

2023-02-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

--- Comment #3 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #2)
> Note I need to better understand why the C++ front-end thinks this would be
> invalid ...

Oh, because the struct is unnamed :).

2023-02-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

--- Comment #2 from Andrew Pinski  ---
Actually this is aarch64:
ldr x1, [x0, #:lo12:.LANCHOR0]
ldr w0, [x3, 8]
and w0, w2, w0, lsr 6
add x0, x0, 200
ldr w0, [x1, x0, lsl 2]
str w0, [x1, 800]

Note I need to better understand why the C++ front-end thinks this would be
invalid ...

2023-02-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108826

--- Comment #1 from Andrew Pinski  ---
AARCH64 looks ok too because of the use of more complex addresses:
ldr w0, [x0, #:lo12:.LANCHOR0]
and w0, w2, w0, lsr 6
add x0, x0, 200
ldr w0, [x1, x0, lsl 2]