Hello,

I have been wrestling with what might be a bug in the plugin memory callbacks. The immediate error is that I hit the `g_assert_not_reached()` in the `default:` case in `qemu_plugin_vcpu_mem_cb`, indicating the callback type was invalid. When breaking on this assertion in gdb, the contents of `cpu->plugin_mem_cbs` are obviously bogus (`len` was absurdly high, for example). After some further digging and instrumenting, I found that `free_dyn_cb_arr(void *p, ...)` is called shortly before the assertion is hit, with `p` pointing to the same address that `cpu->plugin_mem_cbs` holds at assertion time. In other words, we are freeing memory that `cpu->plugin_mem_cbs` still points to.
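To make the suspected sequence concrete, here is a minimal standalone sketch of the pattern described above. It is not QEMU code: every `fake_*` name is hypothetical, and "freeing" is modelled with a `live` flag on a static pool so that the sketch itself never touches freed memory. It only shows that a flush which releases the arrays without resetting the per-CPU pointers leaves those pointers dangling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; none of these are QEMU types. */
struct fake_cb_arr {
    int live;       /* 1 while "allocated", 0 once "freed" */
    size_t len;
};

struct fake_cpu {
    struct fake_cb_arr *mem_cbs;    /* models cpu->plugin_mem_cbs */
};

#define FAKE_NR_CPUS 2
#define FAKE_NR_ARRS 2

static struct fake_cpu fake_cpus[FAKE_NR_CPUS];
static struct fake_cb_arr fake_arrs[FAKE_NR_ARRS];  /* models the table of arrays */

/* Mark every array live and point each CPU at one, as if mid-instruction. */
static void fake_setup(void)
{
    for (int i = 0; i < FAKE_NR_ARRS; i++) {
        fake_arrs[i].live = 1;
        fake_arrs[i].len = 1;
    }
    for (int i = 0; i < FAKE_NR_CPUS; i++) {
        fake_cpus[i].mem_cbs = &fake_arrs[i % FAKE_NR_ARRS];
    }
}

/* Models a flush: "free" every array, optionally clearing per-CPU pointers. */
static void fake_flush(int clear_cpu_pointers)
{
    for (int i = 0; i < FAKE_NR_ARRS; i++) {
        fake_arrs[i].live = 0;
    }
    if (clear_cpu_pointers) {
        for (int i = 0; i < FAKE_NR_CPUS; i++) {
            fake_cpus[i].mem_cbs = NULL;
        }
    }
}

/* Count CPUs that still point at an array that has been "freed". */
static int fake_count_dangling(void)
{
    int n = 0;
    for (int i = 0; i < FAKE_NR_CPUS; i++) {
        if (fake_cpus[i].mem_cbs && !fake_cpus[i].mem_cbs->live) {
            n++;
        }
    }
    return n;
}
```

Calling `fake_setup()` followed by `fake_flush(0)` leaves both fake CPUs counted as dangling, whereas `fake_flush(1)` leaves none. Whether the real code can actually reach a flush while `cpu->plugin_mem_cbs` is non-NULL is exactly the part I have not been able to confirm.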
I believe the code *should* always reset `cpu->plugin_mem_cbs` to NULL at the end of an instruction/TB's execution, so it's not clear to me how this is occurring. However, I suspect it is relevant that `free_dyn_cb_arr()` is being called because my plugin called `qemu_plugin_reset()`.

I have additionally found that the addition below allows me to run successfully without hitting the assert:

diff --git a/plugins/core.c b/plugins/core.c
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -427,9 +427,14 @@ static bool free_dyn_cb_arr(void *p, uint32_t h, void *userp)

 void qemu_plugin_flush_cb(void)
 {
+    CPUState *cpu;
     qht_iter_remove(&plugin.dyn_cb_arr_ht, free_dyn_cb_arr, NULL);
     qht_reset(&plugin.dyn_cb_arr_ht);

+    CPU_FOREACH(cpu) {
+        cpu->plugin_mem_cbs = NULL;
+    }
+
     plugin_cb__simple(QEMU_PLUGIN_EV_FLUSH);
 }

Unfortunately, the workload/setup I have encountered this bug with is difficult to reproduce in a way suitable for sharing upstream (admittedly, perhaps because I do not fully understand the conditions necessary to trigger it). The failure also occurs deep into a run, and I haven't found a good way to break in gdb immediately before it happens in order to inspect it without perturbing the run enough that it no longer happens...

I welcome any feedback or insights on how to further nail down the failure case and/or help in working towards an appropriate solution.

Thanks!

-Aaron