On Fri, Apr 28, 2023 at 10:49:25PM +0300, Vladimir Sementsov-Ogievskiy wrote: > Originally, migration_incoming_co was introduced by > 25d0c16f625feb3b6 > "migration: Switch to COLO process after finishing loadvm" > to be able to enter from COLO code to one specific yield point, added > by 25d0c16f625feb3b6. > > Later in 923709896b1b0 > "migration: poll the cm event for destination qemu" > we reused this variable to wake the migration incoming coroutine from > RDMA code. > > That was doubtful idea. Entering coroutines is a very fragile thing: > you should be absolutely sure which yield point you are going to enter. > > I don't know how much is it safe to enter during qemu_loadvm_state() > which I think what RDMA want to do. But for sure RDMA shouldn't enter > the special COLO-related yield-point. As well, COLO code doesn't want > to enter during qemu_loadvm_state(), it want to enter it's own specific > yield-point. > > As well, when in 8e48ac95865ac97d > "COLO: Add block replication into colo process" we added > bdrv_invalidate_cache_all() call (now it's called activate_all()) > it became possible to enter the migration incoming coroutine during > that call which is wrong too. > > So, let't make these things separate and disjoint: loadvm_co for RDMA, > non-NULL during qemu_loadvm_state(), and colo_incoming_co for COLO, > non-NULL only around specific yield. > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@yandex-team.ru> > --- > migration/colo.c | 4 ++-- > migration/migration.c | 8 ++++++-- > migration/migration.h | 9 ++++++++- > 3 files changed, 16 insertions(+), 5 deletions(-)
The idea looks right to me, but I really know mostly nothing on coroutines and also rdma+colo.. Is the other ref in rdma.c (rdma_cm_poll_handler()) still missing? -- Peter Xu