On Mon, Nov 26, 2018 at 15:16:00 +0000, Alex Bennée wrote: > Emilio G. Cota <c...@braap.org> writes: (snip) > > + if (tb_trans_cb && first_pass) { > > + qemu_plugin_tb_trans_cb(cpu, plugin_tb); > > + first_pass = false; > > + goto translate; > > + } > > So the only reason we are doing this two pass tango is to ensure the > plugin can insert TCG ops before the actual translation has occurred?
Not only. The idea is to provide plugins with well-defined TBs, i.e. the instruction sizes and contents can be queried by the plugin before the plugin decides how/where to instrument the TB. Since in the targets we generate TCG code and also generate host code in a single shot (TranslatorOps.translate_insn), the 2-pass approach is a workaround to first get the well-defined TB, and in the second pass inject the instrumentation in the appropriate places. This is a bit of a waste but given that it only happens at translate time, it can have negligible performance impact -- I measured a 0.13% gmean slowdown for SPEC06int. > I think we can do better, especially as the internal structures of > TCGops are implemented as a list so ops and be inserted before and after > other ops. This is currently only done by the optimiser at the moment, > see: > > TCGOp *tcg_op_insert_before(TCGContext *s, TCGOp *op, TCGOpcode opc, int > narg); > TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *op, TCGOpcode opc, int > narg); > > and all the base tcg ops end up going to tcg_emit_op which just appends > to the tail. But if we can come up with a neater way to track the op > used before the current translated expression we could do away with two > phases translation completely. This list of ops is generated via TranslatorOps.translate_insn. Unfortunately, this function also defines the guest insns that form the TB. Decoupling the two actions ("define the TB" and "translate to TCG ops") would be ideal, but I didn't want to rewrite all the target translators in QEMU, and opted instead for the 2-pass approach as a compromise. Thanks, Emilio