[Bug target/108840] Aarch64 doesn't optimize away shift counter masking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108840

ktkachov at gcc dot gnu.org changed:

           What           |Removed                     |Added
----------------------------------------------------------------------------
         Resolution       |---                         |FIXED
             Status       |ASSIGNED                    |RESOLVED
   Target Milestone       |---                         |14.0

--- Comment #5 from ktkachov at gcc dot gnu.org ---
Fixed for GCC 14.
--- Comment #4 from CVS Commits ---
The master branch has been updated by Kyrylo Tkachov:

https://gcc.gnu.org/g:136330bf637b50a4f10ace017a4316541386b9c0

commit r14-62-g136330bf637b50a4f10ace017a4316541386b9c0
Author: Kyrylo Tkachov
Date:   Wed Apr 19 09:34:40 2023 +0100

    aarch64: PR target/108840 Simplify register shift RTX costs and eliminate
    shift amount masking

    In this PR we fail to eliminate explicit &31 operations for variable
    shifts such as in:

    void
    bar (int x[3], int y)
    {
      x[0] <<= (y & 31);
      x[1] <<= (y & 31);
      x[2] <<= (y & 31);
    }

    This is rejected by RTX costs that end up giving too high a cost for:

    (set (reg:SI 96)
        (ashift:SI (reg:SI 98)
            (subreg:QI (and:SI (reg:SI 99)
                    (const_int 31 [0x1f])) 0)))

    There is code to handle the AND-31 case in rtx costs, but it gets
    confused by the subreg.  It's easy enough to fix by looking inside the
    subreg when costing the expression.

    While doing that I noticed that the ASHIFT case and the other
    shift-like cases are almost identical and we should just merge them.
    This code will only be used for valid insns anyway, so the code after
    this patch should do the Right Thing (TM) for all such shift cases.

    With this patch there are no more "and wn, wn, 31" instructions left
    in the testcase.

    Bootstrapped and tested on aarch64-none-linux-gnu.

    PR target/108840

    gcc/ChangeLog:

        * config/aarch64/aarch64.cc (aarch64_rtx_costs): Merge ASHIFT and
        ROTATE, ROTATERT, LSHIFTRT, ASHIFTRT cases.  Handle subregs in op1.

    gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/pr108840.c: New test.
--- Comment #3 from ktkachov at gcc dot gnu.org ---
Created attachment 54531
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54531&action=edit
Candidate patch

Candidate patch attached.
ktkachov at gcc dot gnu.org changed:

           What           |Removed                      |Added
----------------------------------------------------------------------------
           Assignee       |unassigned at gcc dot gnu.org|ktkachov at gcc dot gnu.org
             Status       |NEW                          |ASSIGNED

--- Comment #2 from ktkachov at gcc dot gnu.org ---
I have a patch to simplify and fix the aarch64 rtx costs for this case.
I'll aim it for GCC 14 as it's not a regression.
Andrew Pinski changed:

           What           |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed       |0                           |1
   Last reconfirmed       |                            |2023-02-17
           See Also       |                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202
             Status       |UNCONFIRMED                 |NEW
           Keywords       |                            |missed-optimization

--- Comment #1 from Andrew Pinski ---
Confirmed:

Trying 8 -> 10:
    8: r93:SI=r108:SI&0x1f
      REG_DEAD r108:SI
   10: r101:SI=r102:SI<