So I was curious about TCG and bored at the same time. :) This series achieves 2-6% improvements in all three benchmarks that I tried ("dc -e10000k1000vp", 1500 digits of pi with bc, SHA-1 of 100 MB of random data), so it must not be that bad.

It improves all three aspects of handling x86 condition codes: the good, the bad, and the ugly.

The good is the things that are fast. This series makes more things fast. ZF, SF, CF and PF can be computed using TCG ops most of the time, while remaining lazy. They are cheap, so repeated computation is not a problem. Only computations that require the overflow flag need the compute_eflags helper. Jumps whose result is computed in a previous basic block need it too, but even this could be eliminated in a future patch for ZF/SF/PF.

The bad is the things that are slow. After this series, QEMU does slow things less often. The result of the compute_eflags helper is cached, because the CC_OP_EFLAGS state can produce code that is just as fast as the specialized states; the only cost is setting it up the first time EFLAGS is needed, and that's unavoidable.

The ugly is the things that are rare. This series may make rare things slower, but at the same time it makes them simpler. For example, shifts result in a CC_OP_EFLAGS state instead of CC_OP_DYNAMIC. This may result in an extra call to compute EFLAGS, but at the same time it exposes more optimization opportunities downstream.

I won't really have time to look at this further, but there is more low-hanging fruit. Implementing (and using) movcond should be easier after this series, for example. I'm looking at you, Richard...

If anybody can help test it less cursorily than I did, that would be great.
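
As a rough illustration of the lazy scheme described above, here is a toy model in plain C. It is a sketch under assumptions, not the code from the series: struct cpu, do_add, do_sub, compute_all, get_zf, get_sf, get_cf, get_of and the slow_calls counter are all made-up names, and only 32-bit add and sub are modelled. ZF, SF and CF are derived directly from the saved operands, while OF goes through the slow path once and leaves the state in CC_OP_EFLAGS so that later queries are cheap.

```c
#include <stdint.h>

/* Toy model of lazy x86 flags. All names are hypothetical; this is
 * not QEMU's implementation. */
enum cc_op { CC_OP_EFLAGS, CC_OP_ADDL, CC_OP_SUBL };

#define FLAG_C 0x0001u
#define FLAG_Z 0x0040u
#define FLAG_S 0x0080u
#define FLAG_O 0x0800u

struct cpu {
    enum cc_op cc_op;    /* how to interpret cc_dst/cc_src */
    uint32_t cc_dst;     /* result of the last flag-setting op */
    uint32_t cc_src;     /* second operand, or the flags in CC_OP_EFLAGS */
    unsigned slow_calls; /* how often the slow path ran */
};

/* Flag-setting ops only record their operands; no flag is computed. */
static void do_add(struct cpu *c, uint32_t a, uint32_t b)
{
    c->cc_dst = a + b; c->cc_src = b; c->cc_op = CC_OP_ADDL;
}

static void do_sub(struct cpu *c, uint32_t a, uint32_t b)
{
    c->cc_dst = a - b; c->cc_src = b; c->cc_op = CC_OP_SUBL;
}

/* Slow path: materialize every modelled flag at once. */
static uint32_t compute_all(const struct cpu *c)
{
    uint32_t dst = c->cc_dst, src2 = c->cc_src, src1, cf, of;

    switch (c->cc_op) {
    case CC_OP_ADDL:
        src1 = dst - src2;
        cf = dst < src2;
        of = (~(src1 ^ src2) & (src1 ^ dst)) >> 31;
        break;
    case CC_OP_SUBL:
        src1 = dst + src2;
        cf = src1 < src2;
        of = ((src1 ^ src2) & (src1 ^ dst)) >> 31;
        break;
    default:
        return c->cc_src; /* already materialized */
    }
    return (cf ? FLAG_C : 0) | (dst == 0 ? FLAG_Z : 0) |
           ((dst >> 31) ? FLAG_S : 0) | (of ? FLAG_O : 0);
}

/* Cheap flags: computed inline, state stays lazy. */
static int get_zf(const struct cpu *c)
{
    if (c->cc_op == CC_OP_EFLAGS)
        return (c->cc_src & FLAG_Z) != 0;
    return c->cc_dst == 0;
}

static int get_sf(const struct cpu *c)
{
    if (c->cc_op == CC_OP_EFLAGS)
        return (c->cc_src & FLAG_S) != 0;
    return (c->cc_dst >> 31) != 0;
}

static int get_cf(const struct cpu *c)
{
    switch (c->cc_op) {
    case CC_OP_ADDL: return c->cc_dst < c->cc_src;
    case CC_OP_SUBL: return (uint32_t)(c->cc_dst + c->cc_src) < c->cc_src;
    default:         return (c->cc_src & FLAG_C) != 0;
    }
}

/* OF needs the slow path; its result is cached as CC_OP_EFLAGS. */
static int get_of(struct cpu *c)
{
    if (c->cc_op != CC_OP_EFLAGS) {
        c->cc_src = compute_all(c);
        c->cc_op = CC_OP_EFLAGS;
        c->slow_calls++;
    }
    return (c->cc_src & FLAG_O) != 0;
}
```

In this model, calling get_of twice after one do_add runs the slow path only once, and the cheap getters keep working afterwards by reading the cached flags.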

Paolo

Paolo Bonzini (14):
  i386: use OT_* consistently
  i386: introduce gen_ext_tl
  i386: factor setting of s->cc_op handling for string functions
  i386: drop cc_op argument of gen_jcc1
  i386: move eflags computation closer to gen_op_set_cc_op
  i386: factor gen_op_set_cc_op/tcg_gen_discard_tl around computing flags
  i386: add helper functions to get other flags
  i386: do not compute eflags multiple times consecutively
  i386: do not call helper to compute ZF/SF
  i386: use inverted setcond when computing NS or NZ
  i386: convert gen_compute_eflags_c to TCG
  i386: change gen_setcc_slow_T0 to gen_setcc_slow
  i386: optimize setbe
  i386: optimize setcc instructions

 target-i386/cc_helper.c          | 118 -------
 target-i386/cc_helper_template.h |  76 -----
 target-i386/helper.h             |   1 -
 target-i386/translate.c          | 691 +++++++++++++++++++++------------------
 4 files changed, 365 insertions(+), 521 deletions(-)

-- 
1.7.12.1