* Stefan Hajnoczi (stefa...@redhat.com) wrote:
> On Wed, Nov 11, 2020 at 04:18:34PM +0000, Thanos Makatos wrote:
> > > VFIO Migration
> > > ==============
> > > This document describes how to ensure migration compatibility for VFIO
> > > devices, including mdev and vfio-user devices.
> >
> > Is this something all VFIO/user devices will have to support? If it's not
> > mandatory, how can a device advertise support?
>
> The --print-migration-info-json command-line option described below must
> be implemented by the vfio-user device emulation program. Similarly,
> VFIO/mdev devices must provide the migration/ sysfs group.
>
> If the device implementation does not expose these standard interfaces
> then management tools can still attempt to migrate them, but there is no
> migration compatibility check or algorithm for setting up the
> destination device. In other words, it will only succeed with some luck
> or by hardcoding knowledge of the specific device implementation into
> the management tool.
>
> > > Multiple device implementations can support the same device model. Doing
> > > so means that the device implementations can offer migration
> > > compatibility because they support the same hardware interface, device
> > > state representation, and migration parameters.
> >
> > Does the above mean that a passthrough function can be migrated to a
> > vfio-user program and vice versa? If so, then it's worth mentioning.
>
> Yes, if they are migration compatible (they support the same device
> model and migration parameters) then migration is possible. I'll make
> this clear in the next revision.
>
> Note VFIO migration is currently only working for mdev devices. Alex
> Williamson mentioned that it could be extended to core VFIO PCI devices
> (without mdev) in the future.
>
> > > More complex device emulation programs may host multiple devices. The
> > > interface for configuring these device emulation programs is not
> > > standardized.
> > > Therefore, migrating these devices is beyond the scope of this document.
> >
> > Most likely a device emulation program hosting multiple devices would allow
> > some form of communication for control purposes (e.g. SPDK implements a
> > JSON-RPC server). So maybe it's possible to define interacting with such
> > programs in this document?
>
> Yes, it's definitely possible. There needs to be agreement on the RPC
> mechanism. QEMU implements QMP, SPDK has something similar but
> different, gRPC/Protobuf is popular, and D-Bus is another alternative. I
> asked about RPC mechanisms on the muser Slack instance to see if there
> was consensus but it seems to be a bit early for that.
>
> Perhaps the most realistic option will be to define bindings to several
> RPC mechanisms. That way everyone can use their preferred RPC mechanism,
> at the cost of requiring management tools to support more than one
> (which some already do, e.g. libvirt uses XDR itself but also implements
> QEMU's QMP).
>
> > > The migration information JSON is printed to standard output by a
> > > vfio-user device emulation program as follows:
> > >
> > > .. code:: bash
> > >
> > >     $ my-device --print-migration-info-json
> > >
> > > The device is instantiated by launching the destination process with the
> > > migration parameter list from the source:
> >
> > Must 'my-device --print-migration-info-json' always generate the same
> > migration information JSON? If so, then what if the output generated by
> > 'my-device --print-migration-info-json' depends on additional arguments
> > passed to 'my-device' when it was originally started?
>
> Yes, it needs to be stable in the sense that you can invoke the program
> with --print-migration-info-json and then expect launching the program
> to succeed with migration parameters that are valid according to the
> JSON.
>
> Running the same device emulation binary on different hosts can produce
> different JSON.
> This is because the binary may rely on host hardware
> resources or features (e.g. does this host have GPUs available?).
>
> It gets trickier when considering host reboots. I think the JSON can
> change between reboots. However, the management tools may cache the JSON
> so there needs to be a rule about when to refresh it.

libvirt does something similar for QEMU's current capabilities; it
normally works fine; very occasionally you have to flush the cache
though if you do something surprising which causes it to change
capabilities.

Dave

> Regarding additional command-line arguments, they can affect the JSON
> output. For example, they could include the connection details to an
> iSCSI LUN and affect the block size migration parameter. This leads to
> the same issue - can they be cached by the management tool? The answer
> is the same - stability is needed in the short-term to avoid unexpected
> failures when launching the program, but over the longer term we should
> allow JSON changes.
>
> Thanks for raising these points. I'll add details to the next revision.
>
> Stefan
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
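
To make the caching question above concrete, here is a rough sketch of how a
management tool might cache the migration info JSON and decide when to
refresh it. Nothing here comes from the proposal itself: the class name
MigrationInfoCache, the choice of cache key (binary path, binary mtime/size,
extra arguments, Linux boot ID) and the position of the extra arguments on
the command line are all assumptions for illustration. The JSON is treated
as opaque because its schema is not shown in this thread.

```python
import hashlib
import json
import os
import subprocess
from typing import Callable, Dict, Sequence


def read_boot_id() -> str:
    """Return the Linux boot ID, which changes on every host reboot."""
    with open("/proc/sys/kernel/random/boot_id") as f:
        return f.read().strip()


def binary_stamp(binary: str) -> str:
    """Cheap fingerprint of the binary so an upgrade invalidates the cache."""
    st = os.stat(binary)
    return f"{st.st_mtime_ns}:{st.st_size}"


def run_print_info(binary: str, args: Sequence[str]) -> str:
    """Run the device program and return the JSON text it prints.

    Assumption: extra arguments are passed before the flag.
    """
    result = subprocess.run(
        [binary, *args, "--print-migration-info-json"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


class MigrationInfoCache:
    """Hypothetical cache keyed on everything that may change the JSON."""

    def __init__(
        self,
        runner: Callable[[str, Sequence[str]], str] = run_print_info,
        boot_id: Callable[[], str] = read_boot_id,
        stamp: Callable[[str], str] = binary_stamp,
    ) -> None:
        self._runner = runner
        self._boot_id = boot_id
        self._stamp = stamp
        self._entries: Dict[str, dict] = {}

    def _key(self, binary: str, args: Sequence[str]) -> str:
        # A reboot, a binary upgrade or different arguments all yield a new key.
        parts = [binary, self._stamp(binary), self._boot_id(), *args]
        return hashlib.sha256("\0".join(parts).encode()).hexdigest()

    def get(self, binary: str, args: Sequence[str] = ()) -> dict:
        """Return cached JSON, querying the program only on a cache miss."""
        key = self._key(binary, args)
        if key not in self._entries:
            self._entries[key] = json.loads(self._runner(binary, args))
        return self._entries[key]

    def flush(self) -> None:
        """Manual escape hatch for the 'something surprising' case."""
        self._entries.clear()
```

Keying on the boot ID and the extra arguments covers the two cases raised in
the thread (reboots and command-line arguments such as iSCSI connection
details). It cannot detect host hardware changes that happen without a
reboot, so an explicit flush() is kept, and stale entries are never evicted
in this sketch.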