On 5/26/2026 7:23 PM, Peter Xu wrote:
External email: Use caution opening links or attachments


On Tue, May 26, 2026 at 12:08:34PM +0300, Avihai Horon wrote:
On 5/25/2026 6:01 PM, Peter Xu wrote:
External email: Use caution opening links or attachments


On Sun, May 24, 2026 at 09:34:48AM +0300, Avihai Horon wrote:
Yes I think so.

We just need to indicate modules that it’s the last query during switchover
so they can handle it properly.
Do you think it would be reasonable to add a "bool final" param to
save_query_pending handler?

For RAM it will be used to indicate we are running under the BQL (since
currently save_query_pending runs only outside BQL) and to pass the proper
last_stage param into migration_bitmap_sync_precopy().
For VFIO it will indicate we should not do a query precopy info ioctl (which
is only valid in VFIO precopy states, not while VM is stopped).
Yes, a final boolean sounds reasonable, implying both (1) last sync before
switchover, VM stopped, (2) BQL held.

For VFIO, I double checked the complete() that does not depend on the
precopy_bytes fetched, then it should be fine indeed,

vfio_save_complete_precopy():
      do {
          data_size = vfio_save_block(f, vbasedev->migration);
          if (data_size < 0) {
              return data_size;
          }
      } while (data_size);

It's just werid to see that it doesn't depend on either precopy_bytes or
initial_bytes, even if logically it should.. this will be confusing to
whoever start reading this code.. but I understand not much we can do with
the current kernel API.

Side note: should we still better update these fields to make sure they'll
be zero after migration? That means vfio_update_estimated_pending_data() in
vfio_save_complete_precopy() too, with/without further sanity checks.  That
seems to be missing right now.  I'm not sure if it's intentional.
Yes, it's intentional, since we don't use these values after calling
vfio_save_complete_precopy() -- they are only used for downtime estimation
prior switchover.
Precopy_init/dirty sizes are zeroed in vfio_save_cleanup() though, but not
stopcopy_size (however, that's benign, since upon new migration it will be
reset before used).

So calling vfio_update_estimated_pending_data() here seems redundant to me.
Logically migration can still fail during complete():

qemu_savevm_state_complete_precopy():
     ret = qemu_savevm_state_complete_precopy_iterable(f, false);
     if (ret) {
         return ret;
     }

If vfio_update_estimated_pending_data() has the safe guard for any form of
overflow, then IMHO we should try to maintain those counters if possible.

Yes, we can call vfio_update_estimated_pending_data() in vfio_save_complete_precopy(), but I am not sure I see what's the benefit of it?

If it's to ensure these counters are zero post of migration, then even if migration fails the .save_cleanup handler will be called and zero them (I can add a patch that zeroes stopcopy_size as well).

Thanks.


Reply via email to