On 6/8/2022 12:32 AM, Alex Williamson wrote:

On Tue, 7 Jun 2022 20:44:23 +0300
Avihai Horon <avih...@nvidia.com> wrote:

On 5/30/2022 8:07 PM, Avihai Horon wrote:
Hello,

Following VFIO migration protocol v2 acceptance in kernel, this series
implements VFIO migration according to the new v2 protocol and replaces
the now deprecated v1 implementation.

The main differences between v1 and v2 migration protocols are:
1. VFIO device state is represented as a finite state machine instead of
     a bitmap.

2. The migration interface with kernel is done using VFIO_DEVICE_FEATURE
     ioctl and normal read() and write() instead of the migration region
     used in v1.

3. Migration protocol v2 currently doesn't support the pre-copy phase of
     migration.

Full description of the v2 protocol and the differences from v1 can be
found here [1].
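
To illustrate difference 1, here is a minimal sketch of what "finite state
machine instead of a bitmap" means in practice: the device is in exactly one
named state, and a table says which direct transitions are accepted. All names
below (the sk_ prefix, the enum, the table) are hypothetical, and the set of
arcs is a simplified assumption for illustration, not a copy of the kernel
UAPI definitions described in [1].

```c
#include <stdbool.h>

/* Hypothetical state names for illustration; not the kernel UAPI enum. */
enum sk_mig_state {
    SK_ERROR,
    SK_STOP,
    SK_RUNNING,
    SK_STOP_COPY,
    SK_RESUMING,
    SK_NUM_STATES
};

/*
 * Assumed, simplified arc table: sk_arc[from][to] is true when a direct
 * transition is allowed. Entries not listed default to false.
 */
static const bool sk_arc[SK_NUM_STATES][SK_NUM_STATES] = {
    [SK_RUNNING][SK_STOP]   = true,
    [SK_STOP][SK_RUNNING]   = true,
    [SK_STOP][SK_STOP_COPY] = true,
    [SK_STOP_COPY][SK_STOP] = true,
    [SK_STOP][SK_RESUMING]  = true,
    [SK_RESUMING][SK_STOP]  = true,
};

/* Return true if the sketch allows moving directly from 'from' to 'to'. */
bool sk_transition_allowed(unsigned int from, unsigned int to)
{
    if (from >= SK_NUM_STATES || to >= SK_NUM_STATES) {
        return false;
    }
    return sk_arc[from][to];
}
```

With a bitmap, any combination of bits can be written and invalid
combinations have to be rejected case by case; with a state table, an
unsupported move is simply an absent arc.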

Patches 1-3 are prep patches fixing bugs and adding a QEMUFile function
that will be used later.

Patches 4-6 refactor v1 protocol code to make it easier to add v2
protocol.

Patches 7-11 implement v2 protocol and remove v1 protocol.

Thanks.

[1]
https://lore.kernel.org/all/20220224142024.147653-10-yish...@nvidia.com/

Changes from v1: 
https://lore.kernel.org/all/20220512154320.19697-1-avih...@nvidia.com/
- Split the big patch that replaced v1 with v2 into several patches as
    suggested by Joao, to make review easier.
- Change warn_report to warn_report_once when container doesn't support
    dirty tracking.
- Add Reviewed-by tag.

Avihai Horon (11):
    vfio/migration: Fix NULL pointer dereference bug
    vfio/migration: Skip pre-copy if dirty page tracking is not supported
    migration/qemu-file: Add qemu_file_get_to_fd()
    vfio/common: Change vfio_devices_all_running_and_saving() logic to
      equivalent one
    vfio/migration: Move migration v1 logic to vfio_migration_init()
    vfio/migration: Rename functions/structs related to v1 protocol
    vfio/migration: Implement VFIO migration protocol v2
    vfio/migration: Remove VFIO migration protocol v1
    vfio/migration: Reset device if setting recover state fails
    vfio: Alphabetize migration section of VFIO trace-events file
    docs/devel: Align vfio-migration docs to VFIO migration v2

   docs/devel/vfio-migration.rst |  77 ++--
   hw/vfio/common.c              |  21 +-
   hw/vfio/migration.c           | 640 ++++++++--------------------------
   hw/vfio/trace-events          |  25 +-
   include/hw/vfio/vfio-common.h |   8 +-
   migration/migration.c         |   5 +
   migration/migration.h         |   3 +
   migration/qemu-file.c         |  34 ++
   migration/qemu-file.h         |   1 +
   9 files changed, 252 insertions(+), 562 deletions(-)

Ping.
Based on the changelog, this seems like a mostly cosmetic spin and I
don't see that all of the discussion threads from v1 were resolved to
everyone's satisfaction.  I'm certainly still uncomfortable with the
pre-copy behavior and I thought there were still some action items to
figure out whether an SLA is present and vet the solution with
management tools.  Thanks,

Yes.
OK, so let's clear things up and reach an agreement before I prepare the v3 series.

There are three topics that came up in the previous discussion:

1. [PATCH v2 01/11] vfio/migration: Fix NULL pointer dereference bug.
   Juan gave his Reviewed-by but he wasn't sure about qemu_file_* usage
   outside migration thread.
   This code existed before and I fixed a NULL pointer dereference that
   I encountered.
   I suggested that later we can refactor VMChangeStateHandler to
   return an error.
   I prefer not to do this refactor right now because I am not sure
   it's as straightforward a change as it might seem - if some notifier
   fails and we abort do_vm_stop/vm_prepare_start in the middle, can
   this leave the VM in some unstable state?
   We plan to leave it as is and not do the refactor as part of this
   series.
   Are you ok with this?

2. [PATCH v2 02/11] vfio/migration: Skip pre-copy if dirty page
   tracking is not supported.
   As previously discussed, this patch doesn't consider the configured
   downtime limit.
   One way to fix it is to allow such migration only when "no SLA" (no
   downtime limit) is set. AFAIK today there is no way that one can set
   "no SLA".
   If we go with this option, we change the normal flow of migration
   (skipping pre-copy) and might need to change management tools.

   Instead, what about letting QEMU VFIO code mark all pages dirty
   (instead of the kernel)? This way we don't skip pre-copy and we get
   the same behavior we have now of perpetually dirtying all RAM, which
   respects the SLA.
   If we go with this option, do we need to block migration when the
   IOMMU is sPAPR TCE? Until now migration would be blocked because
   sPAPR TCE doesn't report the dirty_pages_supported cap, but with this
   option we would allow migration even when the dirty_pages_supported
   cap is not set (and let QEMU dirty all pages).
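
To make the "QEMU marks all pages dirty" alternative concrete, here is a
minimal sketch of setting every page's bit in a dirty bitmap for a given
range. The function name, the one-bit-per-page layout and the parameters are
assumptions for illustration only; this is not existing QEMU code and does
not use QEMU's bitmap or memory APIs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a QEMU function: set the dirty bit of every page
 * covering [0, range_size). One bit per page, least significant bit first.
 * Returns the number of pages marked, or 0 if nothing could be marked.
 */
uint64_t sk_mark_all_dirty(uint8_t *bitmap, size_t bitmap_bytes,
                           uint64_t range_size, uint64_t page_size)
{
    if (page_size == 0 || range_size == 0) {
        return 0;
    }

    /* Round up so a partial trailing page is also reported dirty. */
    uint64_t pages = range_size / page_size + (range_size % page_size != 0);

    if (pages > (uint64_t)bitmap_bytes * 8) {
        return 0; /* bitmap too small for this range */
    }

    for (uint64_t i = 0; i < pages; i++) {
        bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
    }
    return pages;
}
```

Reporting every page as dirty on each sync keeps pre-copy iterating, which
is why this option preserves today's behavior instead of skipping pre-copy.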

3. [PATCH v2 03/11] migration/qemu-file: Add qemu_file_get_to_fd().
   Juan expressed his concern about the amount of data that will go
   through main migration thread.

   This is already the case in the v1 protocol: VFIO devices send all
   their data in the main migration thread. As in v1, the data here is
   also sent in small chunks, each with a header.
   This patch just aims to eliminate an extra copy.
   We plan to leave it as is. Is this ok?
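
As a rough picture of "small chunks without an extra copy", the sketch below
writes a buffer to a file descriptor in bounded pieces, handing pointers into
the source buffer straight to write() instead of staging the data in a second
buffer first. The helper name and signature are hypothetical; this is not the
implementation of qemu_file_get_to_fd(), which is not shown in this thread.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not QEMU's qemu_file_get_to_fd(): write len bytes
 * from src to fd in pieces of at most chunk bytes. Pointers into src are
 * passed directly to write(), so no staging buffer is needed.
 * Returns 0 on success or a negative errno value.
 */
int sk_write_chunked(int fd, const void *src, size_t len, size_t chunk)
{
    const char *p = src;

    if (chunk == 0) {
        return -EINVAL;
    }

    while (len > 0) {
        size_t want = len < chunk ? len : chunk;
        ssize_t done = write(fd, p, want);

        if (done < 0) {
            if (errno == EINTR) {
                continue; /* interrupted before writing anything: retry */
            }
            return -errno;
        }
        if (done == 0) {
            return -EIO; /* no progress; avoid looping forever */
        }

        /* A short write just advances by what was accepted. */
        p += done;
        len -= (size_t)done;
    }
    return 0;
}
```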

Thanks.
