On Wed, Jan 28, 2026 at 09:30:24AM -0300, Fabiano Rosas wrote:
> >> >> > @@ -294,6 +294,14 @@ int multifd_ram_unfill_packet(MultiFDRecvParams 
> >> >> > *p, Error **errp)
> >> >> >          p->zero[i] = offset;
> >> >> >      }
> >> >> >  
> >> >> > +    if (migrate_colo()) {
> >> >> > +        multifd_colo_prepare_recv(p);
> >> >> > +        assert(p->block->colo_cache);
> >> >> > +        p->host = p->block->colo_cache;  
> >> >> 
> >> >> Can't you just use p->block->colo_cache later? I don't see why p->host
> >> >> needs to be set beforehand even in the non-colo case.
> >> >
> >> > We should not touch the guest ram directly while in colo state, since
> >> > the incoming guest is running and we either want to receive and apply a
> >> > whole checkpoint with all ram into colo cache and all device state,
> >> > or if anything goes wrong during checkpointing, keep the currently
> >> > running guest on the incoming side in pristine state.
> >> >
> >> 
> >> I was asking about setting p->host at this specific point. I don't think
> >> any of this fits the unfill function. However, I see those were
> >> suggested by Peter so let's not go back and forth.
> >
> > Actually I don't know why p->host existed before this work; IIUC we could
> > have always used p->block->host.  Maybe when Juan was developing this Juan
> > kept COLO in mind; or maybe Juan wanted to avoid frequent p->block pointer
> > reference.
> >
> 
> Maybe p->block was being reset at some point and p->host was passed
> being the point where the (whatever) lock was release. I checked and
> today there's no such thing. The p->mutex seems to be there just to
> protect against this in multifd_recv_sync_main:
> 
> WITH_QEMU_LOCK_GUARD(&p->mutex) {
>     if (multifd_recv_state->packet_num < p->packet_num) {
>         multifd_recv_state->packet_num = p->packet_num;
>     }
> }

It should be protected by various checks over migration_is_running().

E.g., QMP device-add & device-del are forbidden so no new pc-dimm hotplug /
removal allowed.  Similarly, virtio_mem_is_busy() would return true during
migration too.

We should definitely make sure ramblock will not be reset during the whole
lifecycle of migration; I believe we're not ready for that..

-- 
Peter Xu


Reply via email to