Emilio G. Cota <c...@braap.org> writes:
> On Tue, Nov 27, 2018 at 14:06:57 -0500, Emilio G. Cota wrote: >> On Tue, Nov 27, 2018 at 14:48:11 +0000, Alex Bennée wrote: >> > With a little tweaking to the TCG we could then insert >> > our instrumentation at the end of the pass with all the knowledge we >> > want to export to the plugin. >> >> After .translate_insn has returned for the last instruction, how >> do we insert the instrumentation that the plugin wants--say, a TB >> callback at the beginning of the TB, memory callbacks for the >> 2nd instruction, and an insn callback before the 3rd instruction >> executes? >> >> I don't see how we could achieve that with "a little tweaking" >> instead of a 2nd pass, but I'd love to be wrong =) > > (snip) >> > I don't quite follow. When we've done all our translation into TCG ops >> > haven't we by definition defined the TB? >> >> Yes, that's precisely my point. >> >> The part that's missing is that once the TB is defined, we want to >> insert instrumentation. Unfortunately, the "TCG ops" we get after >> the 1st pass (no instrumentation), are very different from the list >> of "TCG ops" that we get after the 2nd pass (after having injected >> instrumentation). Could we get the same output of the 2nd pass, >> just by using the output of the 1st and the list of injection points? >> It's probably possible, but it seems very hard to do. (Think for >> instance of memory callbacks, and further the complication of when >> they use helpers). >> >> The only reasonable way to do this I think would be to leave behind >> "placeholder" TCG ops, that then we could scan to add further TCG ops. >> But you'll agree with me that the 2nd pass is simpler :P > > It might not be that much simpler after all! > > I am exploring the approach you suggested, that is IIUC to do a > single pass and then walk the list of Ops, adding (and > reordering) Ops based on what the plugins have requested. > > Contrary to what I wrote earlier today (quoted above), this > approach seems quite promising, and certainly less ugly > than doing the 2 passes. > > I just wrote some code to go over the list and add TB callbacks, > which go right before the first insn_start Op. The code is hack-ish > in that we first generate the TCG ops we need, which get added to > the end of the ops list, and then we go over those and move them > to where we want them to be (before insn_start, in this case). > But it works and it's less of a hack than doing the whole 2nd pass. But we should be able to insert the ops directly in the right place. That is the whole point of being a list right ;-) > Insn callbacks will be trivial to implement this way; memory > callbacks should be harder because there are several qemu_ld/st > opcodes, but it should be doable; I was thinking about this last night. I wonder if we need to tag the memory tcg ops so we can find them afterwards during the insertion phase - but maybe the opcode itself provides enough information. > last, memory instrumentation > of helpers might actually be easier than with the 2 passes, because here > we just have to look for a Call TCG op to know whether a guest > instruction uses helpers, and if it does we can wrap the call > with the helpers to generate the setting/resetting of > CPUState.plugin_mem_cbs. So merging the two helper calls into one from the target code? > > I'll try to do what's in the previous paragraph tomorrow -- I'm > appending what I did so far. > > Thanks, > > Emilio > --- > diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c > index ee9e179e14..232f645cd4 100644 > --- a/accel/tcg/translator.c > +++ b/accel/tcg/translator.c > @@ -18,6 +18,7 @@ > #include "exec/gen-icount.h" > #include "exec/log.h" > #include "exec/translator.h" > +#include "exec/plugin-gen.h" > > /* Pairs with tcg_clear_temp_count. > To be called by #TranslatorOps.{translate_insn,tb_stop} if > @@ -142,6 +143,11 @@ void translator_loop(const TranslatorOps *ops, > DisasContextBase *db, > gen_tb_end(db->tb, db->num_insns - bp_insn); > > if (plugin_enabled) { > + /* collect the plugins' instrumentation */ > + qemu_plugin_tb_trans_cb(cpu, &tcg_ctx->plugin_tb); > + /* inject instrumentation */ > + qemu_plugin_gen_inject(); > + /* done */ > tcg_ctx->plugin_insn = NULL; > } > > diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c > index 75f182be37..cb5dbadc3c 100644 > --- a/accel/tcg/plugin-gen.c > +++ b/accel/tcg/plugin-gen.c > @@ -1,4 +1,5 @@ > #include "qemu/osdep.h" > +#include "qemu/queue.h" > #include "cpu.h" > #include "tcg/tcg.h" > #include "tcg/tcg-op.h" > @@ -169,8 +170,61 @@ static void gen_vcpu_udata_cb(struct qemu_plugin_dyn_cb > *cb) > tcg_temp_free_i32(cpu_index); > } > > -void qemu_plugin_gen_vcpu_udata_callbacks(struct qemu_plugin_dyn_cb_arr *arr) > +/* check that @a comes before @b */ > +static inline void ops_check(const TCGOp *a, const TCGOp *b) > { > + const TCGOp *op; > + bool seen_a = false; > + bool seen_b = false; > + > + tcg_debug_assert(a != b); > + QTAILQ_FOREACH(op, &tcg_ctx->ops, link) { > + if (op == a) { > + tcg_debug_assert(!seen_b); > + seen_a = true; > + } else if (op == b) { > + tcg_debug_assert(seen_a); > + seen_b = true; > + break; > + } > + } > +} > + > +/* > + * Move ops from @from to @dest. > + * @from must come after @dest in the list. > + */ > +static void ops_move(TCGOp *dest, TCGOp *from) > +{ > + TCGOp *curr; > + > +#ifdef CONFIG_DEBUG_TCG > + ops_check(dest, from); > +#endif > + > + if (QTAILQ_NEXT(dest, link) == from) { > + /* nothing to do */ > + return; > + } > + curr = from; > + do { > + TCGOp *next = QTAILQ_NEXT(curr, link); > + > + /* > + * This could be done more efficiently since (@from,end) will > + * remain linked together, but most likely we just have a few > + * elements, so this is simple enough. > + */ > + QTAILQ_REMOVE(&tcg_ctx->ops, curr, link); > + QTAILQ_INSERT_AFTER(&tcg_ctx->ops, dest, curr, link); > + dest = curr; > + curr = next; > + } while (curr); > +} > + > +static void inject_vcpu_udata_callbacks(struct qemu_plugin_dyn_cb_arr *arr, > TCGOp *dest) > +{ > + TCGOp *last_op = tcg_last_op(); > size_t i; > > for (i = 0; i < arr->n; i++) { > @@ -187,6 +241,10 @@ void qemu_plugin_gen_vcpu_udata_callbacks(struct > qemu_plugin_dyn_cb_arr *arr) > g_assert_not_reached(); > } > } > + g_assert(tcg_last_op() != last_op); > + last_op = QTAILQ_NEXT(last_op, link); > + g_assert(last_op); > + ops_move(dest, last_op); > } > > /* > @@ -228,3 +286,26 @@ void qemu_plugin_gen_disable_mem_helpers(void) > plugin_mem_cbs)); > tcg_temp_free_ptr(ptr); > } > + > +void qemu_plugin_gen_inject(void) > +{ > + struct qemu_plugin_tb *plugin_tb = &tcg_ctx->plugin_tb; > + > + /* TB exec callbacks */ > + if (plugin_tb->cbs.n) { > + TCGOp *op; > + > + /* Insert the callbacks right before the first insn_start */ > + QTAILQ_FOREACH(op, &tcg_ctx->ops, link) { > + if (op->opc == INDEX_op_insn_start) { > + break; > + } > + } > + /* a TB without insn_start? Something went wrong */ > + g_assert(op); > + op = QTAILQ_PREV(op, TCGOpHead, link); > + /* we should have called gen_tb_start before the 1st insn */ > + g_assert(op); > + inject_vcpu_udata_callbacks(&plugin_tb->cbs, op); > + } > +} -- Alex Bennée