Re: [libvirt] [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

2019-04-25 Thread Yan Zhao
On Wed, Apr 24, 2019 at 05:10:43PM +0800, Christophe de Dinechin wrote:
> 
> 
> > On 23 Apr 2019, at 12:39, Daniel P. Berrangé  wrote:
> > 
> > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> >> device version attribute in mdev sysfs is used by user space software
> >> (e.g. libvirt) to query device compatibility for live migration of VFIO
> >> mdev devices. This attribute is mandatory if a mdev device supports live
> >> migration.
> >> 
> >> It consists of two parts: common part and vendor proprietary part.
> >> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> >> identifies device type. e.g., for pci device, it is
> >> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> >> vendor proprietary part: this part is varied in length. vendor driver can
> >> specify any string to identify a device.
> >> 
> >> When reading this attribute, it should show device version string of the
> >> device of type . If a device does not support live migration, it
> >> should return errno.
> >> When writing a string to this attribute, it returns errno for
> >> incompatibility or returns written string length in compatibility case.
> >> If a device does not support live migration, it always returns errno.
> >> 
> >> For user space software to use:
> >> 1.
> >> Before starting live migration, user space software first reads source side
> >> mdev device's version. e.g.
> >> "#cat \
> >> /sys/bus/pci/devices/\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> >> 00028086-193b-i915-GVTg_V5_4
> >> 
> >> 2.
> >> Then, user space software writes the source side returned version string
> >> to device version attribute in target side, and checks the return value.
> >> If a negative errno is returned in the target side, then mdev devices in
> >> source and target sides are not compatible;
> >> If a positive number is returned and it equals to the length of written
> >> string, then the two mdev devices in source and target side are compatible.
> >> e.g.
> >> (a) compatibility case
> >> "# echo 00028086-193b-i915-GVTg_V5_4 >
> >> /sys/bus/pci/devices/\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> >> 
> >> (b) incompatibility case
> >> "#echo 00028086-193b-i915-GVTg_V5_1 >
> >> /sys/bus/pci/devices/\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> >> -bash: echo: write error: Invalid argument
> > 
> > What you have written here seems to imply that each mdev type is able to
> > support many different versions at the same time. Writing a version into
> > this sysfs file then chooses which of the many versions to actually use.
> > 
> > This is good as it allows for live migration across driver software 
> > upgrades.
> > 
> > A mgmt application may well want to know what versions are supported for an
> > mdev type *before* starting a migration. A mgmt app can query all the 100's
> > of hosts it knows and thus figure out which are valid to use as the target
> > of a migration.
> > 
> > IOW, we want to avoid the ever hitting the incompatibility case in the
> > first place, by only choosing to migrate to a host that we know is going
> > to be compatible.
> > 
> > This would need some kind of way to report the full list of supported
> > versions against the mdev supported types on the host.
> > 
> > 
> >> 3. if two mdev devices are compatible, user space software can start
> >> live migration, and vice versa.
> >> 
> >> Note: if a mdev device does not support live migration, it either does
> >> not provide a version attribute, or always returns errno when its version
> >> attribute is read/written.
> >> 
> >> Cc: Alex Williamson 
> >> Cc: Erik Skultety 
> >> Cc: "Dr. David Alan Gilbert" 
> >> Cc: Cornelia Huck 
> >> Cc: "Tian, Kevin" 
> >> Cc: Zhenyu Wang 
> >> Cc: "Wang, Zhi A" 
> >> Cc: Neo Jia 
> >> Cc: Kirti Wankhede 
> >> 
> >> Signed-off-by: Yan Zhao 
> >> ---
> >> Documentation/vfio-mediated-device.txt | 36 ++
> >> samples/vfio-mdev/mbochs.c | 17 
> >> samples/vfio-mdev/mdpy.c   | 16 
> >> samples/vfio-mdev/mtty.c   | 16 
> >> 4 files changed, 85 insertions(+)
> >> 
> >> diff --git a/Documentation/vfio-mediated-device.txt 
> >> b/Documentation/vfio-mediated-device.txt
> >> index c3f69bcaf96e..bc28471c0667 100644
> >> --- a/Documentation/vfio-mediated-device.txt
> >> +++ b/Documentation/vfio-mediated-device.txt
> >> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each 
> >> Physical Device
> >>   | |   |--- available_instances
> >>   | |   |--- device_api
> >>   | |   |--- description
> >> +  | |   |--- version
> >>   | |   |--- [devices]
> >>   | |--- []
> >>   | |   |--- create
> >> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each 
> >> Physical Device
> >>   | |   |--- available_instances
> >>   | |   |--- device_api
> >>   | |   |--- desc

Re: [libvirt] [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

2019-04-24 Thread Christophe de Dinechin


> On 23 Apr 2019, at 12:39, Daniel P. Berrangé  wrote:
> 
> On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
>> device version attribute in mdev sysfs is used by user space software
>> (e.g. libvirt) to query device compatibility for live migration of VFIO
>> mdev devices. This attribute is mandatory if a mdev device supports live
>> migration.
>> 
>> It consists of two parts: common part and vendor proprietary part.
>> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
>> identifies device type. e.g., for pci device, it is
>> "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
>> vendor proprietary part: this part is varied in length. vendor driver can
>> specify any string to identify a device.
>> 
>> When reading this attribute, it should show device version string of the
>> device of type . If a device does not support live migration, it
>> should return errno.
>> When writing a string to this attribute, it returns errno for
>> incompatibility or returns written string length in compatibility case.
>> If a device does not support live migration, it always returns errno.
>> 
>> For user space software to use:
>> 1.
>> Before starting live migration, user space software first reads source side
>> mdev device's version. e.g.
>> "#cat \
>> /sys/bus/pci/devices/\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
>> 00028086-193b-i915-GVTg_V5_4
>> 
>> 2.
>> Then, user space software writes the source side returned version string
>> to device version attribute in target side, and checks the return value.
>> If a negative errno is returned in the target side, then mdev devices in
>> source and target sides are not compatible;
>> If a positive number is returned and it equals to the length of written
>> string, then the two mdev devices in source and target side are compatible.
>> e.g.
>> (a) compatibility case
>> "# echo 00028086-193b-i915-GVTg_V5_4 >
>> /sys/bus/pci/devices/\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
>> 
>> (b) incompatibility case
>> "#echo 00028086-193b-i915-GVTg_V5_1 >
>> /sys/bus/pci/devices/\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
>> -bash: echo: write error: Invalid argument
> 
> What you have written here seems to imply that each mdev type is able to
> support many different versions at the same time. Writing a version into
> this sysfs file then chooses which of the many versions to actually use.
> 
> This is good as it allows for live migration across driver software upgrades.
> 
> A mgmt application may well want to know what versions are supported for an
> mdev type *before* starting a migration. A mgmt app can query all the 100's
> of hosts it knows and thus figure out which are valid to use as the target
> of a migration.
> 
> IOW, we want to avoid the ever hitting the incompatibility case in the
> first place, by only choosing to migrate to a host that we know is going
> to be compatible.
> 
> This would need some kind of way to report the full list of supported
> versions against the mdev supported types on the host.
> 
> 
>> 3. if two mdev devices are compatible, user space software can start
>> live migration, and vice versa.
>> 
>> Note: if a mdev device does not support live migration, it either does
>> not provide a version attribute, or always returns errno when its version
>> attribute is read/written.
>> 
>> Cc: Alex Williamson 
>> Cc: Erik Skultety 
>> Cc: "Dr. David Alan Gilbert" 
>> Cc: Cornelia Huck 
>> Cc: "Tian, Kevin" 
>> Cc: Zhenyu Wang 
>> Cc: "Wang, Zhi A" 
>> Cc: Neo Jia 
>> Cc: Kirti Wankhede 
>> 
>> Signed-off-by: Yan Zhao 
>> ---
>> Documentation/vfio-mediated-device.txt | 36 ++
>> samples/vfio-mdev/mbochs.c | 17 
>> samples/vfio-mdev/mdpy.c   | 16 
>> samples/vfio-mdev/mtty.c   | 16 
>> 4 files changed, 85 insertions(+)
>> 
>> diff --git a/Documentation/vfio-mediated-device.txt 
>> b/Documentation/vfio-mediated-device.txt
>> index c3f69bcaf96e..bc28471c0667 100644
>> --- a/Documentation/vfio-mediated-device.txt
>> +++ b/Documentation/vfio-mediated-device.txt
>> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical 
>> Device
>>   | |   |--- available_instances
>>   | |   |--- device_api
>>   | |   |--- description
>> +  | |   |--- version
>>   | |   |--- [devices]
>>   | |--- []
>>   | |   |--- create
>> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical 
>> Device
>>   | |   |--- available_instances
>>   | |   |--- device_api
>>   | |   |--- description
>> +  | |   |--- version
>>   | |   |--- [devices]
>>   | |--- []
>>   |  |--- create
>> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical 
>> Device
>>   |  |--- available_instances
>>   |  |--- device_api
>>   |  |--- des

Re: [libvirt] [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

2019-04-23 Thread Neo Jia
On Tue, Apr 23, 2019 at 11:39:39AM +0100, Daniel P. Berrangé wrote:
> On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > device version attribute in mdev sysfs is used by user space software
> > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > mdev devices. This attribute is mandatory if a mdev device supports live
> > migration.
> > 
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> >  identifies device type. e.g., for pci device, it is
> >  "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> >  specify any string to identify a device.
> > 
> > When reading this attribute, it should show device version string of the
> > device of type . If a device does not support live migration, it
> > should return errno.
> > When writing a string to this attribute, it returns errno for
> > incompatibility or returns written string length in compatibility case.
> > If a device does not support live migration, it always returns errno.
> > 
> > For user space software to use:
> > 1.
> > Before starting live migration, user space software first reads source side
> > mdev device's version. e.g.
> > "#cat \
> > /sys/bus/pci/devices/\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > 00028086-193b-i915-GVTg_V5_4
> > 
> > 2.
> > Then, user space software writes the source side returned version string
> > to device version attribute in target side, and checks the return value.
> > If a negative errno is returned in the target side, then mdev devices in
> > source and target sides are not compatible;
> > If a positive number is returned and it equals to the length of written
> > string, then the two mdev devices in source and target side are compatible.
> > e.g.
> > (a) compatibility case
> > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > /sys/bus/pci/devices/\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > 
> > (b) incompatibility case
> > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > /sys/bus/pci/devices/\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > -bash: echo: write error: Invalid argument
> 
> What you have written here seems to imply that each mdev type is able to
> support many different versions at the same time. Writing a version into
> this sysfs file then chooses which of the many versions to actually use.
> 
> This is good as it allows for live migration across driver software upgrades.
> 
> A mgmt application may well want to know what versions are supported for an
> mdev type *before* starting a migration. A mgmt app can query all the 100's
> of hosts it knows and thus figure out which are valid to use as the target
> of a migration.
> 
> IOW, we want to avoid the ever hitting the incompatibility case in the
> first place, by only choosing to migrate to a host that we know is going
> to be compatible.
> 
> This would need some kind of way to report the full list of supported
> versions against the mdev supported types on the host.

What would be the typical scenario / use case for mgmt layer to query the 
version
information? Do they expect this get done completely offline as long as the
vendor driver installed on each host?

Thanks,
Neo

> 
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

2019-04-23 Thread Yan Zhao
On Tue, Apr 23, 2019 at 06:24:19PM +0800, Daniel P. Berrangé wrote:
> On Tue, Apr 23, 2019 at 01:41:57AM -0400, Yan Zhao wrote:
> > On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > > On Mon, 22 Apr 2019 21:01:52 -0400
> > > Yan Zhao  wrote:
> > >
> > > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > > Yan Zhao  wrote:
> > > > >
> > > > > > device version attribute in mdev sysfs is used by user space 
> > > > > > software
> > > > > > (e.g. libvirt) to query device compatibility for live migration of 
> > > > > > VFIO
> > > > > > mdev devices. This attribute is mandatory if a mdev device supports 
> > > > > > live
> > > > > > migration.
> > > > >
> > > > > The Subject: doesn't quite match what's being proposed here.
> > > > >
> > > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > >  identifies device type. e.g., for pci device, it is
> > > > > >  "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > >
> > > > > What purpose does this serve?  If it's intended as some sort of
> > > > > namespace feature, shouldn't we first assume that we can only support
> > > > > migration to devices of the same type?  Therefore each type would
> > > > > already have its own namespace.  Also that would make the trailing bit
> > > > > of the version string listed below in the example redundant.  A vendor
> > > > > is still welcome to include this in their version string if they wish,
> > > > > but I think the string should be entirely vendor defined.
> > > > >
> > > > hi Alex,
> > > > This common part is a kind of namespace.
> > > > Because if version string is entirely defined by vendors, I'm worried 
> > > > about
> > > > if there is a case that one vendor's version string happens to deceive 
> > > > and
> > > > interfere with another vendor's version checking?
> > > > e.g.
> > > > vendor A has a version string like: vendor id + device id + mdev type
> > > > vendor B has a version string like: device id + vendor id + mdev type
> > > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > >
> > > > In this corner case, the two vendors may regard the two device is
> > > > migratable but actually they are not.
> > > >
> > > > That's the reason for this common part that serve as a kind of namespace
> > > > that all vendors will comply with to avoid overlap.
> > >
> > > If we assume that migration can only occur between matching mdev types,
> > > this is redundant, each type already has their own namespace.
> > >
> > hi Alex,
> > do you mean user space software like libvirt needs to first check whether
> > mdev type is matching and then check whether version is matching?
> 
> I would expect that libvirt (or other mgmt apps) will always first
> check that the vendor id, device id, mdev type all match. So for the
> version string it should suffice to be a "normal" numeric value.
> 
> Essentially version string just needs to be there to distinguish
> revisions of the same mdev type over time.
>
hi Daniel,
The way that user space software checks that the vendor id, device id, mdev
type all match and version string is just revisions is somewhat restrictive?
user space software could not have so much detailed information regarding to
which devices are compatible, especially when vendor id, device id,
revision id, and mdev types are not enough or no need to be exactly the same.
By moving decision making for compatibility from user space to vendor
driver, user space software can be more change-resistant.
Agree?

Thanks
Yan

> 
> Regards,
> Daniel
> --
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

2019-04-23 Thread Daniel P . Berrangé
On Tue, Apr 23, 2019 at 08:48:52AM -0600, Alex Williamson wrote:
> On Tue, 23 Apr 2019 14:44:00 +0100
> Daniel P. Berrangé  wrote:
> 
> > On Tue, Apr 23, 2019 at 06:35:40AM -0600, Alex Williamson wrote:
> > > On Tue, 23 Apr 2019 11:39:39 +0100
> > > Daniel P. Berrangé  wrote:
> > >   
> > > > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:  
> > > > > +* version
> > > > > +
> > > > > +  This attribute is rw. It is used to check whether two devices are 
> > > > > compatible
> > > > > +  for live migration. If this attribute is missing, then the 
> > > > > corresponding mdev
> > > > > +  device is regarded as not supporting live migration.
> > > > > +
> > > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits 
> > > > > identifies
> > > > > +   device type. e.g., for pci device, it is
> > > > > +   "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > +  vendor proprietary part: this part is varied in length. vendor 
> > > > > driver can
> > > > > +   specify any string to identify a device.
> > > > > +
> > > > > +  When reading this attribute, it should show device version string 
> > > > > of the device
> > > > > +  of type . If a device does not support live migration, it 
> > > > > should
> > > > > +  return errno.
> > > > > +  When writing a string to this attribute, it returns errno for 
> > > > > incompatibility
> > > > > +  or returns written string length in compatibility case. If a 
> > > > > device does not
> > > > > +  support live migration, it always returns errno.
> > > > > +
> > > > > +  for example.
> > > > > +  # cat \
> > > > > + 
> > > > > /sys/bus/pci/devices/\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > +  00028086-193b-i915-GVTg_V5_2
> > > > > +
> > > > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > + 
> > > > > /sys/bus/pci/devices/\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > + -bash: echo: write error: Invalid argument
> > > > > +
> > > > 
> > > > IIUC this path is against the physical device. IOW, the mgmt app would 
> > > > have
> > > > to first write to the "version" file to choose a version, and then 
> > > > write to
> > > > the "create" file to actually create an virtual device. This has the 
> > > > obvious
> > > > concurrency problem if multiple devices are being created at the same 
> > > > time
> > > > and distinct versions for each device are required. There would need to 
> > > > be
> > > > a locking scheme defined to ensure safety.  
> > > 
> > > "Create a device of a given version" is not an intended feature of this
> > > interface aiui.  Writing the version attribute only indicates
> > > migration compatibility with a binary result.
> > >
> > > > Wouldn't it be better if we can pass the desired version when we write 
> > > > to
> > > > the "create" file, so that we avoid any concurrent usage problems. 
> > > > "version"
> > > > could be just a read-only file with a *list* of supported versions.
> > > > 
> > > > eg
> > > > 
> > > >   $ cat 
> > > > /sys/bus/pci/devices/\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > >   5.0
> > > >   5.1
> > > >   5.2
> > > > 
> > > >   $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> > > >   /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create 
> > > >  
> > > 
> > > This is reminiscent of the proposed aggregation support, but again,
> > > this sort of feature is not intended here.  It's no expected that any
> > > vendor driver would support creating device types of different
> > > versions, but they may support migration from different versions.  
> > 
> > Hmm, that's a subtle distinction I wasn't seeing the patch series.
> > IIUC from what you're saying, a device can be created with version
> > "C", but for an incoming migration it can (potentially) accept
> > serialized state matching any of versions "A", "B" or "C".
> > 
> > That is sufficient if migration is being used as a host upgrade
> > tool, to move from OS release N to N + 1.
> > 
> > It wouldn't cover the case where you need to support backwards
> > migration too. eg if you have a mixture of hosts with release
> > N and N + 1 and want to make sure that VMs can always move
> > betweeen any host.  That would require that you can create
> > mdevs with the lowest common denominator version, not solely
> > the most recent.
> > 
> > In QEMU this is done by mgmt applications picking a versioned
> > machine type for QEMU that is older than most recent.
> 
> I suppose we'd need to determine how important that is, this is just a
> device after all, there are always fallback mechanisms via hotplug.
> There are a lot of pieces that need to line up for backwards migration
> including support for it at the individual vendor driver.  Nothing we
> design into the API is going to require vendor drivers to support
> backwards migration

Re: [libvirt] [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

2019-04-23 Thread Alex Williamson
On Tue, 23 Apr 2019 14:44:00 +0100
Daniel P. Berrangé  wrote:

> On Tue, Apr 23, 2019 at 06:35:40AM -0600, Alex Williamson wrote:
> > On Tue, 23 Apr 2019 11:39:39 +0100
> > Daniel P. Berrangé  wrote:
> >   
> > > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:  
> > > > +* version
> > > > +
> > > > +  This attribute is rw. It is used to check whether two devices are 
> > > > compatible
> > > > +  for live migration. If this attribute is missing, then the 
> > > > corresponding mdev
> > > > +  device is regarded as not supporting live migration.
> > > > +
> > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits 
> > > > identifies
> > > > +   device type. e.g., for pci device, it is
> > > > +   "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > +  vendor proprietary part: this part is varied in length. vendor 
> > > > driver can
> > > > +   specify any string to identify a device.
> > > > +
> > > > +  When reading this attribute, it should show device version string of 
> > > > the device
> > > > +  of type . If a device does not support live migration, it 
> > > > should
> > > > +  return errno.
> > > > +  When writing a string to this attribute, it returns errno for 
> > > > incompatibility
> > > > +  or returns written string length in compatibility case. If a device 
> > > > does not
> > > > +  support live migration, it always returns errno.
> > > > +
> > > > +  for example.
> > > > +  # cat \
> > > > + 
> > > > /sys/bus/pci/devices/\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > +  00028086-193b-i915-GVTg_V5_2
> > > > +
> > > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > + 
> > > > /sys/bus/pci/devices/\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > + -bash: echo: write error: Invalid argument
> > > > +
> > > 
> > > IIUC this path is against the physical device. IOW, the mgmt app would 
> > > have
> > > to first write to the "version" file to choose a version, and then write 
> > > to
> > > the "create" file to actually create an virtual device. This has the 
> > > obvious
> > > concurrency problem if multiple devices are being created at the same time
> > > and distinct versions for each device are required. There would need to be
> > > a locking scheme defined to ensure safety.  
> > 
> > "Create a device of a given version" is not an intended feature of this
> > interface aiui.  Writing the version attribute only indicates
> > migration compatibility with a binary result.
> >
> > > Wouldn't it be better if we can pass the desired version when we write to
> > > the "create" file, so that we avoid any concurrent usage problems. 
> > > "version"
> > > could be just a read-only file with a *list* of supported versions.
> > > 
> > > eg
> > > 
> > >   $ cat 
> > > /sys/bus/pci/devices/\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > >   5.0
> > >   5.1
> > >   5.2
> > > 
> > >   $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> > >   /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create  
> > 
> > This is reminiscent of the proposed aggregation support, but again,
> > this sort of feature is not intended here.  It's no expected that any
> > vendor driver would support creating device types of different
> > versions, but they may support migration from different versions.  
> 
> Hmm, that's a subtle distinction I wasn't seeing the patch series.
> IIUC from what you're saying, a device can be created with version
> "C", but for an incoming migration it can (potentially) accept
> serialized state matching any of versions "A", "B" or "C".
> 
> That is sufficient if migration is being used as a host upgrade
> tool, to move from OS release N to N + 1.
> 
> It wouldn't cover the case where you need to support backwards
> migration too. eg if you have a mixture of hosts with release
> N and N + 1 and want to make sure that VMs can always move
> betweeen any host.  That would require that you can create
> mdevs with the lowest common denominator version, not solely
> the most recent.
> 
> In QEMU this is done by mgmt applications picking a versioned
> machine type for QEMU that is older than most recent.

I suppose we'd need to determine how important that is, this is just a
device after all, there are always fallback mechanisms via hotplug.
There are a lot of pieces that need to line up for backwards migration
including support for it at the individual vendor driver.  Nothing we
design into the API is going to require vendor drivers to support
backwards migration and we already have some vendors requiring host and
guest driver alignment.  Specifying a version with a create syntax as
you've provided is reasonable, but this also balloons into whole
tangential interface providing information regarding what versions a
vendor driver is able to create, because presumably manage

Re: [libvirt] [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

2019-04-23 Thread Daniel P . Berrangé
On Tue, Apr 23, 2019 at 06:35:40AM -0600, Alex Williamson wrote:
> On Tue, 23 Apr 2019 11:39:39 +0100
> Daniel P. Berrangé  wrote:
> 
> > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > > +* version
> > > +
> > > +  This attribute is rw. It is used to check whether two devices are 
> > > compatible
> > > +  for live migration. If this attribute is missing, then the 
> > > corresponding mdev
> > > +  device is regarded as not supporting live migration.
> > > +
> > > +  It consists of two parts: common part and vendor proprietary part.
> > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits 
> > > identifies
> > > +   device type. e.g., for pci device, it is
> > > +   "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > +  vendor proprietary part: this part is varied in length. vendor driver 
> > > can
> > > +   specify any string to identify a device.
> > > +
> > > +  When reading this attribute, it should show device version string of 
> > > the device
> > > +  of type . If a device does not support live migration, it 
> > > should
> > > +  return errno.
> > > +  When writing a string to this attribute, it returns errno for 
> > > incompatibility
> > > +  or returns written string length in compatibility case. If a device 
> > > does not
> > > +  support live migration, it always returns errno.
> > > +
> > > +  for example.
> > > +  # cat \
> > > + 
> > > /sys/bus/pci/devices/\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > +  00028086-193b-i915-GVTg_V5_2
> > > +
> > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > + 
> > > /sys/bus/pci/devices/\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > + -bash: echo: write error: Invalid argument
> > > +  
> > 
> > IIUC this path is against the physical device. IOW, the mgmt app would have
> > to first write to the "version" file to choose a version, and then write to
> > the "create" file to actually create an virtual device. This has the obvious
> > concurrency problem if multiple devices are being created at the same time
> > and distinct versions for each device are required. There would need to be
> > a locking scheme defined to ensure safety.
> 
> "Create a device of a given version" is not an intended feature of this
> interface aiui.  Writing the version attribute only indicates
> migration compatibility with a binary result.
>  
> > Wouldn't it be better if we can pass the desired version when we write to
> > the "create" file, so that we avoid any concurrent usage problems. "version"
> > could be just a read-only file with a *list* of supported versions.
> > 
> > eg
> > 
> >   $ cat 
> > /sys/bus/pci/devices/\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> >   5.0
> >   5.1
> >   5.2
> > 
> >   $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> >   /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
> 
> This is reminiscent of the proposed aggregation support, but again,
> this sort of feature is not intended here.  It's no expected that any
> vendor driver would support creating device types of different
> versions, but they may support migration from different versions.

Hmm, that's a subtle distinction I wasn't seeing the patch series.
IIUC from what you're saying, a device can be created with version
"C", but for an incoming migration it can (potentially) accept
serialized state matching any of versions "A", "B" or "C".

That is sufficient if migration is being used as a host upgrade
tool, to move from OS release N to N + 1.

It wouldn't cover the case where you need to support backwards
migration too. eg if you have a mixture of hosts with release
N and N + 1 and want to make sure that VMs can always move
betweeen any host.  That would require that you can create
mdevs with the lowest common denominator version, not solely
the most recent.

In QEMU this is done by mgmt applications picking a versioned
machine type for QEMU that is older than most recent.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

2019-04-23 Thread Alex Williamson
On Tue, 23 Apr 2019 11:39:39 +0100
Daniel P. Berrangé  wrote:

> On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > device version attribute in mdev sysfs is used by user space software
> > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > mdev devices. This attribute is mandatory if a mdev device supports live
> > migration.
> > 
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> >  identifies device type. e.g., for pci device, it is
> >  "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> >  specify any string to identify a device.
> > 
> > When reading this attribute, it should show device version string of the
> > device of type . If a device does not support live migration, it
> > should return errno.
> > When writing a string to this attribute, it returns errno for
> > incompatibility or returns written string length in compatibility case.
> > If a device does not support live migration, it always returns errno.
> > 
> > For user space software to use:
> > 1.
> > Before starting live migration, user space software first reads source side
> > mdev device's version. e.g.
> > "#cat \
> > /sys/bus/pci/devices/\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > 00028086-193b-i915-GVTg_V5_4
> > 
> > 2.
> > Then, user space software writes the source side returned version string
> > to device version attribute in target side, and checks the return value.
> > If a negative errno is returned in the target side, then mdev devices in
> > source and target sides are not compatible;
> > If a positive number is returned and it equals to the length of written
> > string, then the two mdev devices in source and target side are compatible.
> > e.g.
> > (a) compatibility case
> > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > /sys/bus/pci/devices/\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > 
> > (b) incompatibility case
> > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > /sys/bus/pci/devices/\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > -bash: echo: write error: Invalid argument  
> 
> What you have written here seems to imply that each mdev type is able to
> support many different versions at the same time. Writing a version into
> this sysfs file then chooses which of the many versions to actually use.

That's not actually what's being proposed here, reading the version
attribute provides an opaque string.  Writing the version attribute
from a different system reports whether that version string is
compatible for migration.  IOW, we're not creating a device of a given
version, we're only reporting if that version is compatible with this
mdev.  The version is intentionally opaque to allow vendor specific
nuances, therefore it's rather impractical create the sort of version
list requested below.

> This is good as it allows for live migration across driver software upgrades.
> 
> A mgmt application may well want to know what versions are supported for an
> mdev type *before* starting a migration. A mgmt app can query all the 100's
> of hosts it knows and thus figure out which are valid to use as the target
> of a migration.
> 
> IOW, we want to avoid the ever hitting the incompatibility case in the
> first place, by only choosing to migrate to a host that we know is going
> to be compatible.

This is provided, the migration source reports a version string which
can be queried against the version attribute on other hosts to insure
compatibility prior to migration.

> This would need some kind of way to report the full list of supported
> versions against the mdev supported types on the host.

This is not provided, matching versions are vendor specific.

> > 3. if two mdev devices are compatible, user space software can start
> > live migration, and vice versa.
> > 
> > Note: if a mdev device does not support live migration, it either does
> > not provide a version attribute, or always returns errno when its version
> > attribute is read/written.
> > 
> > Cc: Alex Williamson 
> > Cc: Erik Skultety 
> > Cc: "Dr. David Alan Gilbert" 
> > Cc: Cornelia Huck 
> > Cc: "Tian, Kevin" 
> > Cc: Zhenyu Wang 
> > Cc: "Wang, Zhi A" 
> > Cc: Neo Jia 
> > Cc: Kirti Wankhede 
> > 
> > Signed-off-by: Yan Zhao 
> > ---
> >  Documentation/vfio-mediated-device.txt | 36 ++
> >  samples/vfio-mdev/mbochs.c | 17 
> >  samples/vfio-mdev/mdpy.c   | 16 
> >  samples/vfio-mdev/mtty.c   | 16 
> >  4 files changed, 85 insertions(+)
> > 
> > diff --git a/Documentation/vfio-mediated-device.txt 
> > b/Documentation/vfio-mediated-device.txt
> > index c3f69bcaf96e..bc28471c0667 100644
> > --- a/Documentation/vfio-mediated-device.txt
> > +++ b/Document

Re: [libvirt] [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

2019-04-23 Thread Daniel P . Berrangé
On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> device version attribute in mdev sysfs is used by user space software
> (e.g. libvirt) to query device compatibility for live migration of VFIO
> mdev devices. This attribute is mandatory if a mdev device supports live
> migration.
> 
> It consists of two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
>  identifies device type. e.g., for pci device, it is
>  "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> vendor proprietary part: this part is varied in length. vendor driver can
>  specify any string to identify a device.
> 
> When reading this attribute, it should show device version string of the
> device of type . If a device does not support live migration, it
> should return errno.
> When writing a string to this attribute, it returns errno for
> incompatibility or returns written string length in compatibility case.
> If a device does not support live migration, it always returns errno.
> 
> For user space software to use:
> 1.
> Before starting live migration, user space software first reads source side
> mdev device's version. e.g.
> "#cat \
> /sys/bus/pci/devices/\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> 00028086-193b-i915-GVTg_V5_4
> 
> 2.
> Then, user space software writes the source side returned version string
> to device version attribute in target side, and checks the return value.
> If a negative errno is returned in the target side, then mdev devices in
> source and target sides are not compatible;
> If a positive number is returned and it equals to the length of written
> string, then the two mdev devices in source and target side are compatible.
> e.g.
> (a) compatibility case
> "# echo 00028086-193b-i915-GVTg_V5_4 >
> /sys/bus/pci/devices/\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> 
> (b) incompatibility case
> "#echo 00028086-193b-i915-GVTg_V5_1 >
> /sys/bus/pci/devices/\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> -bash: echo: write error: Invalid argument

What you have written here seems to imply that each mdev type is able to
support many different versions at the same time. Writing a version into
this sysfs file then chooses which of the many versions to actually use.

This is good as it allows for live migration across driver software upgrades.

A mgmt application may well want to know what versions are supported for an
mdev type *before* starting a migration. A mgmt app can query all the 100's
of hosts it knows and thus figure out which are valid to use as the target
of a migration.

IOW, we want to avoid the ever hitting the incompatibility case in the
first place, by only choosing to migrate to a host that we know is going
to be compatible.

This would need some kind of way to report the full list of supported
versions against the mdev supported types on the host.


> 3. if two mdev devices are compatible, user space software can start
> live migration, and vice versa.
> 
> Note: if a mdev device does not support live migration, it either does
> not provide a version attribute, or always returns errno when its version
> attribute is read/written.
> 
> Cc: Alex Williamson 
> Cc: Erik Skultety 
> Cc: "Dr. David Alan Gilbert" 
> Cc: Cornelia Huck 
> Cc: "Tian, Kevin" 
> Cc: Zhenyu Wang 
> Cc: "Wang, Zhi A" 
> Cc: Neo Jia 
> Cc: Kirti Wankhede 
> 
> Signed-off-by: Yan Zhao 
> ---
>  Documentation/vfio-mediated-device.txt | 36 ++
>  samples/vfio-mdev/mbochs.c | 17 
>  samples/vfio-mdev/mdpy.c   | 16 
>  samples/vfio-mdev/mtty.c   | 16 
>  4 files changed, 85 insertions(+)
> 
> diff --git a/Documentation/vfio-mediated-device.txt 
> b/Documentation/vfio-mediated-device.txt
> index c3f69bcaf96e..bc28471c0667 100644
> --- a/Documentation/vfio-mediated-device.txt
> +++ b/Documentation/vfio-mediated-device.txt
> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical 
> Device
>| |   |--- available_instances
>| |   |--- device_api
>| |   |--- description
> +  | |   |--- version
>| |   |--- [devices]
>| |--- []
>| |   |--- create
> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical 
> Device
>| |   |--- available_instances
>| |   |--- device_api
>| |   |--- description
> +  | |   |--- version
>| |   |--- [devices]
>| |--- []
>|  |--- create
> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical 
> Device
>|  |--- available_instances
>|  |--- device_api
>|  |--- description
> +  |  |--- version
>|  |--- [devices]
>  
>  * [mdev_supported_types]
> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physi

Re: [libvirt] [Qemu-devel] [PATCH 1/2] vfio/mdev: add version field as mandatory attribute for mdev device

2019-04-23 Thread Daniel P . Berrangé
On Tue, Apr 23, 2019 at 01:41:57AM -0400, Yan Zhao wrote:
> On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > On Mon, 22 Apr 2019 21:01:52 -0400
> > Yan Zhao  wrote:
> > 
> > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > Yan Zhao  wrote:
> > > >   
> > > > > device version attribute in mdev sysfs is used by user space software
> > > > > (e.g. libvirt) to query device compatibility for live migration of 
> > > > > VFIO
> > > > > mdev devices. This attribute is mandatory if a mdev device supports 
> > > > > live
> > > > > migration.  
> > > > 
> > > > The Subject: doesn't quite match what's being proposed here.
> > > >
> > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > >  identifies device type. e.g., for pci device, it is
> > > > >  "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).  
> > > > 
> > > > What purpose does this serve?  If it's intended as some sort of
> > > > namespace feature, shouldn't we first assume that we can only support
> > > > migration to devices of the same type?  Therefore each type would
> > > > already have its own namespace.  Also that would make the trailing bit
> > > > of the version string listed below in the example redundant.  A vendor
> > > > is still welcome to include this in their version string if they wish,
> > > > but I think the string should be entirely vendor defined.
> > > >  
> > > hi Alex,
> > > This common part is a kind of namespace.
> > > Because if version string is entirely defined by vendors, I'm worried 
> > > about
> > > if there is a case that one vendor's version string happens to deceive and
> > > interfere with another vendor's version checking?
> > > e.g.
> > > vendor A has a version string like: vendor id + device id + mdev type
> > > vendor B has a version string like: device id + vendor id + mdev type
> > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > 
> > > In this corner case, the two vendors may regard the two device is
> > > migratable but actually they are not.
> > > 
> > > That's the reason for this common part that serve as a kind of namespace
> > > that all vendors will comply with to avoid overlap.
> > 
> > If we assume that migration can only occur between matching mdev types,
> > this is redundant, each type already has their own namespace.
> >
> hi Alex,
> do you mean user space software like libvirt needs to first check whether
> mdev type is matching and then check whether version is matching?

I would expect that libvirt (or other mgmt apps) will always first
check that the vendor id, device id, mdev type all match. So for the
version string it should suffice to be a "normal" numeric value.

Essentially version string just needs to be there to distinguish
revisions of the same mdev type over time.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list