https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96475
Bug ID: 96475
Summary: direct threaded interpreter with computed gotos generates suboptimal dispatch loop
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: npiggin at gmail dot com
CC: segher at gcc dot gnu.org
Target Milestone: ---
Target: powerpc64le-linux-gnu
Created attachment 48999
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48999&action=edit
test case
With -O2, the code generated for run_program_goto in the attached test case
uses a single central indirect-branch dispatcher: each handler branches back
to that dispatcher rather than branching directly to the next handler.
Direct threaded code, with an indirect branch from each handler straight to
the next handler, is faster on a POWER9 when there are no branch
mispredictions, because it executes fewer branches. It should also generally
do better with branch prediction, since each handler has its own indirect
branch.
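
For readers without the attachment, the following is a minimal sketch of the
source-level shape being discussed; it is not the attached test case. The
opcode set, the function name run_threaded_sketch, and the DISPATCH macro are
all hypothetical. It relies on the GNU C "labels as values" extension, and it
looks handler addresses up in a table rather than storing them in the
instruction stream. The point is only that every handler ends in its own
"goto *", which is the structure the report expects to survive into the
generated code.

```c
/* Illustrative sketch only: names and opcodes are hypothetical and are
   not taken from attachment 48999. Requires GNU C labels-as-values. */

enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

long run_threaded_sketch(const unsigned char *code, long acc)
{
    /* Table mapping each opcode to the address of its handler. */
    static void *const handlers[] = {
        [OP_INC]    = &&do_inc,
        [OP_DEC]    = &&do_dec,
        [OP_DOUBLE] = &&do_double,
        [OP_HALT]   = &&do_halt,
    };
    const unsigned char *pc = code;

/* Fetch the next opcode and branch indirectly to its handler.
   Written out at the end of every handler, so at the source level
   there is one indirect branch per handler, not one shared one. */
#define DISPATCH() goto *handlers[*pc++]

    DISPATCH();

do_inc:
    acc += 1;
    DISPATCH();
do_dec:
    acc -= 1;
    DISPATCH();
do_double:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;

#undef DISPATCH
}
```

Compiling a file like this with "gcc -O2 -S" and counting the indirect
branches in the output is one way to check whether the per-handler dispatch
was kept or merged into a single central dispatcher.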