https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110481
Bug ID: 110481 Summary: Possible improvements in dense switch statement returning values Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- Putting this provisionally into tree-optimization, although there may be other aspects. Consider unsigned int foo(unsigned int a) { switch ((a >> 10) & 3) { case 0: return 8; case 1: return 16; case 2: return 32; case 3: return 64; } } unsigned int bar(unsigned int a) { return 1u << (((a >> 10) & 3) + 3); } unsigned int baz (unsigned int a) { switch (a & (3 << 10)) { case 0: return 8; case 1 << 10: return 16; case 2 << 10: return 32; case 3 << 10: return 64; } } which all do the same thing. The code for bar is bar: .LFB1: .cfi_startproc shrl $10, %edi movl $1, %eax movl %edi, %ecx andl $3, %ecx addl $3, %ecx sall %cl, %eax ret which is optimum except for the register move (submitted as PR110479). The compiler does not recognize that foo or baz are equivalent to bar, but that may be too much of a special case to really consider. The code for foo is foo: .LFB0: .cfi_startproc shrl $10, %edi movl $8, %eax andl $3, %edi decl %edi cmpl $2, %edi ja .L1 movzbl CSWTCH.1(%rdi), %eax .L1: ret .cfi_endproc [...] CSWTCH.1: .byte 16 .byte 32 .byte 64 where it seems strange that there is a comparison and conditional jump around the load. A look at *.optimized shows <bb 2> [local count: 1073741824]: _1 = a_4(D) >> 10; _2 = _1 & 3; _8 = _2 + 4294967295; if (_8 <= 2) goto <bb 3>; [50.00%] else goto <bb 4>; [50.00%] <bb 3> [local count: 536870913]: _6 = CSWTCH.1[_8]; _5 = (unsigned int) _6; <bb 4> [local count: 1073741824]: # _3 = PHI <_5(3), 8(2)> return _3; which assigns a probability of 50% to (a>>10)& 3 being zero. Where this comes from is unclear to me. A straightforward load from a table which also includes the 8 seems more logical to me (especially with -Os). Finally, baz generates baz: .LFB2: .cfi_startproc andl $3072, %edi movl $32, %eax cmpl $2048, %edi je .L6 ja .L8 movl $8, %eax testl %edi, %edi je .L6 movl $16, %eax ret .L8: movl $64, %eax .L6: ret when transforming into something equivalent to foo (or even bar) would seem advantageous.