On Fri, Apr 28, 2023 at 10:49:25PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Originally, migration_incoming_co was introduced by
> 25d0c16f625feb3b6
>    "migration: Switch to COLO process after finishing loadvm"
> to be able to enter from COLO code to one specific yield point, added
> by 25d0c16f625feb3b6.
> 
> Later in 923709896b1b0
>  "migration: poll the cm event for destination qemu"
> we reused this variable to wake the migration incoming coroutine from
> RDMA code.
> 
> That was doubtful idea. Entering coroutines is a very fragile thing:
> you should be absolutely sure which yield point you are going to enter.
> 
> I don't know how much is it safe to enter during qemu_loadvm_state()
> which I think what RDMA want to do. But for sure RDMA shouldn't enter
> the special COLO-related yield-point. As well, COLO code doesn't want
> to enter during qemu_loadvm_state(), it want to enter it's own specific
> yield-point.
> 
> As well, when in 8e48ac95865ac97d
>  "COLO: Add block replication into colo process" we added
> bdrv_invalidate_cache_all() call (now it's called activate_all())
> it became possible to enter the migration incoming coroutine during
> that call which is wrong too.
> 
> So, let't make these things separate and disjoint: loadvm_co for RDMA,
> non-NULL during qemu_loadvm_state(), and colo_incoming_co for COLO,
> non-NULL only around specific yield.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@yandex-team.ru>
> ---
>  migration/colo.c      | 4 ++--
>  migration/migration.c | 8 ++++++--
>  migration/migration.h | 9 ++++++++-
>  3 files changed, 16 insertions(+), 5 deletions(-)

The idea looks right to me, but I really know mostly nothing on coroutines
and also rdma+colo..

Is the other ref in rdma.c (rdma_cm_poll_handler()) still missing?

-- 
Peter Xu


Reply via email to