https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108247
Andrew Waterman <andrew at sifive dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrew at sifive dot com

--- Comment #1 from Andrew Waterman <andrew at sifive dot com> ---
The base-ISA-only code sequence has the same instruction count but smaller code
size:

        slli    a1, a1, 1
        addw    a0, a1, a0
        ret

I guess this example is a proxy for a broader set of missed optimizations, some
of which are actually improved by Zba, so maybe my comment is moot. But it
would be best not to de-optimize code size in pursuit of opportunities to use
the Zba instructions.

(It also occurs to me that the sh1add + sext.w sequence is easier to fuse
because the instructions have the same destination. But choosing to increase
code size to avail a fusion opportunity is a uarch-specific tuning decision.)
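
For readers following along, here is a minimal C sketch of why the two sequences are interchangeable on RV64. It models the register semantics only. The helper names (sext32, base_isa, zba) are hypothetical, and the exact operands of the sh1add + sext.w sequence are assumed from the comment above, since the original test case is not quoted here. The code-size difference comes from encoding: sh1add has no compressed form, while slli and addw do when the C extension is available.

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of a 64-bit register value.
 * Models sext.w and the implicit sign extension of addw.
 * Assumes the usual two's-complement wrap on the narrowing cast. */
static int64_t sext32(uint64_t x) {
    return (int64_t)(int32_t)(uint32_t)x;
}

/* Base ISA only:  slli a1, a1, 1 ; addw a0, a1, a0 */
static int64_t base_isa(uint64_t a0, uint64_t a1) {
    a1 = a1 << 1;            /* slli a1, a1, 1 */
    return sext32(a1 + a0);  /* addw a0, a1, a0 */
}

/* With Zba (assumed operands):  sh1add a0, a1, a0 ; sext.w a0, a0 */
static int64_t zba(uint64_t a0, uint64_t a1) {
    a0 = (a1 << 1) + a0;     /* sh1add a0, a1, a0 */
    return sext32(a0);       /* sext.w a0, a0 */
}
```

Both functions compute the same 64-bit result for every input, so the choice between them is purely one of code size versus fusion opportunity, as noted above.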