https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115551

            Bug ID: 115551
           Summary: [missed optimization] "c1 << (a + c2)" not optimized into
                    "(c1 << c2) << a"
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
                CC: pinskia at gcc dot gnu.org
  Target Milestone: ---

Created attachment 58468
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58468&action=edit
patch to show how to get a nice output – but doesn't actually use it. Not to be used.

"c1 << (a + c2)" not optimized into "(c1 << c2) << a"

Example:

int f(int ch) {
  unsigned long mask1 = ((((1UL))) << (1 + 4 * ((1) - 1))) << (ch * 4);
  unsigned long mask2 = ((((1UL))) << (1 + 4 * ((ch + 1) - 1)));
  return mask1-mask2;
}

GCC converts this currently to:

  mask1 = 2 << (ch * 4)
  mask2 = 1 << (ch * 4 + 1)

* * *

Related to
https://lore.kernel.org/lkml/d7ef7a6158df4ba6687233b0e00d37796b069fb3.1718791090.git.u.kleine-koe...@baylibre.com/

Result:
* With the 2nd form the resulting binary gets ~25% smaller
* Saving nearly 500 bytes!

* * *

On ARM, the generated code for mask1 is:

        lsls    r0, r0, #2
        movs    r3, #2
        lsl.w   r0, r3, r0

and for mask2:

        lsls    r0, r0, #2
        adds    r0, #1  // additional 'adds' instruction
        movs    r3, #1
        lsl.w   r0, r3, r0
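
For illustration, the two shift forms from the example can be compared in a small C sketch. The function names (shift_sum, shift_folded, check_forms_agree) are hypothetical and not part of the report or the attached patch. The sketch uses an unsigned ch and keeps every shift count below 32, since the two forms are only interchangeable when the shift counts are non-negative and smaller than the width of the shifted type; it does not claim to cover the conditions a real middle-end fold would have to check.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; names are not from the report. */

/* The mask2 form: the constant stays inside the shift count, c1 << (a + c2). */
unsigned long shift_sum(unsigned ch)
{
    return 1UL << (ch * 4 + 1);
}

/* The mask1 form: the constant is folded into the shifted value, (c1 << c2) << a. */
unsigned long shift_folded(unsigned ch)
{
    return (1UL << 1) << (ch * 4);
}

/* Compare both forms for ch = 0..7, so the largest shift count is 29.
   That keeps the shifts well defined even where unsigned long is 32 bits wide. */
void check_forms_agree(void)
{
    for (unsigned ch = 0; ch <= 7; ch++)
        assert(shift_sum(ch) == shift_folded(ch));
}
```

Calling check_forms_agree() from a test driver should pass without tripping an assertion. Comparing the output of "gcc -O2 -S" for shift_sum and shift_folded is one way to see whether a given compiler version still emits the extra add.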