http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56197

             Bug #: 56197
           Summary: [SH] Use calculated jump address instead of using a
                    jump table
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: olege...@gcc.gnu.org
            Target: sh*-*-*

I ran across this one while checking out PR 55146.
If there are a lot of cases in a switch and the length of the case blocks is
more or less constant, it can be beneficial to calculate the jump address and
eliminate the jump table.

For example, code such as

int test (int arg)
{
  int rc;
  switch (arg)
  {
  case 0:
    asm ("nop\n\tnop\n\t"
         "mov r4,%0"
         : "=r" (rc)
         : "r" (arg));
    break;
  case 1:
    asm ("nop\n\tnop\n\t"
         "mov r5,%0"
         : "=r" (rc)
         : "r" (arg));
    break;
  case 2:
    asm ("nop\n\tnop\n\t"
         "mov r6,%0"
         : "=r" (rc)
         : "r" (arg));
  [...]
  case 9:
    asm ("nop\n\tnop\n\t"
         "mov r7,%0"
         : "=r" (rc)
         : "r" (arg));
    break;
  }
  return rc;
}

Compiled with -O2 results in:

_test:
        mov     #9,r1
        cmp/hi  r1,r4
        bt      .L2
        mova    .L4,r0
        mov.b   @(r0,r4),r4
        add     r0,r4
        jmp     @r4
        nop
        .align 2
.L4:
        .byte   .L3-.L4
        .byte   .L5-.L4
        .byte   .L6-.L4
        .byte   .L7-.L4
        .byte   .L8-.L4
        .byte   .L9-.L4
        .byte   .L10-.L4
        .byte   .L11-.L4
        .byte   .L12-.L4
        .byte   .L13-.L4
        .align 1
.L13:
        mov     #9,r0
        nop
        nop
        mov     r7,r0
        .align 2
.L2:
        rts
        nop
        .align 1
.L12:
        mov     #8,r0
[...]

For a lot of cases, the jump table might become large and is likely to cause
data cache misses.  The following might be better in that case (assuming that
the length of each case block is 16 bytes):

        mov     #9,r1
        cmp/hi  r1,r4
        bt      .L2
        shll2   r4
        shll2   r4
        add     #.Lcase_0 - .Lcase_default,r4
        braf    @r4
        nop
.Lcase_default:
        rts
        nop
        .align 4
.Lcase_0:
        mov     #0,r0
        nop
        nop
        mov     r4,r0
        rts
        nop
        .align 4
.Lcase_1:
[...]
        .align 4
.Lcase_9:
        mov     #0,r0
        nop
        nop
        mov     r7,r0
        rts
        nop

However, this requires the jump table to be sorted in ascending order and the
length of the case blocks should not vary too much.

Maybe this optimization could also be beneficial on other targets than SH.
At least PR 43462 looks somewhat related to it.
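
The condition under which the table can be dropped might be easier to see in a small C model. The sketch below is only an illustration and not GCC code: the names `table_target`, `calculated_target` and `can_use_calculated_jump` are hypothetical, and it assumes that each case block is padded to a power-of-two size (16 bytes, i.e. a shift by 4, in the example above) and that offsets are measured from the first case block.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Jump-table dispatch: the target offset is loaded from memory.
   offsets[i] is the byte offset of case i's block from the first block
   (hypothetical model of the .byte table above).  */
size_t table_target (const size_t *offsets, size_t index)
{
  return offsets[index];
}

/* Calculated dispatch: no load, only a shift.  With log2_size == 4 this
   corresponds to the two shll2 instructions (index * 16).  */
size_t calculated_target (size_t index, unsigned log2_size)
{
  assert (log2_size < sizeof (size_t) * CHAR_BIT);
  return index << log2_size;
}

/* Return 1 if the table can be replaced by the calculation, i.e. if the
   blocks are laid out in ascending case order and each one starts exactly
   index * 2^log2_size bytes after the first.  Return 0 otherwise.  */
int can_use_calculated_jump (const size_t *offsets, size_t n,
                             unsigned log2_size)
{
  for (size_t i = 0; i < n; i++)
    if (offsets[i] != calculated_target (i, log2_size))
      return 0;
  return 1;
}
```

With offsets 0, 16, 32, 48 the check passes and both functions agree for every index. With offsets 0, 16, 40, 56 (one block longer than 16 bytes) the check fails, which is the case where the blocks would either need more padding or the jump table would have to stay.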