Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Tue, 2 Feb 2021 08:28:52 +0100 Erik Skultety wrote: > On Mon, Feb 01, 2021 at 04:38:56PM -0700, Alex Williamson wrote: > > On Mon, 1 Feb 2021 16:57:44 -0600 > > Jonathon Jongsma wrote: > > > > > On Mon, 1 Feb 2021 11:33:08 +0100 > > > Erik Skultety wrote: > > > > > > > On Mon, Feb 01, 2021 at 09:48:11AM +, Daniel P. Berrangé wrote: > > > > > On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote: > > > > > > On Mon, Feb 01, 2021 at 09:42:32AM +, Daniel P. Berrangé > > > > > > wrote: > > > > > > > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma > > > > > > > wrote: > > > > > > > > On Thu, 7 Jan 2021 17:43:54 +0100 > > > > > > > > Erik Skultety wrote: > > > > > > > > > > > > > > > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > > > > > > > > > > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > > > > > > > > > > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > > > > > > > > > inconsistent with the description in the patch: > > > > > > > > > > # virsh nodedev-list --active > > > > > > > > > > error: command 'nodedev-list' doesn't support option > > > > > > > > > > --active > > > > > > > > > > > > > > > > > > > > 3.virsh client hang when trying to destroy a mdev device > > > > > > > > > > which is using by a vm, and after that all 'virsh nodev*' > > > > > > > > > > cmds will hang. If restarting llibvirtd after that, > > > > > > > > > > libvirtd will hang. > > > > > > > > > > > > > > > > > > It hangs because underneath a write to the 'remove' sysfs > > > > > > > > > attribute is now blocking for some reason and since we're > > > > > > > > > relying on mdevctl to do it for us, hence "it hangs". I'm > > > > > > > > > not trying to make an excuse, it's plain wrong. I'd love to > > > > > > > > > rely on such a basic functionality, but it looks like we'll > > > > > > > > > have to go with a extremely ugly workaround and try to get > > > > > > > > > the list of active domains from the nodedev driver and see > > > > > > > > > whether any of them has the device assigned before we try > > > > > > > > > to destroy the mdev via the nodedev driver. > > > > > > > > > > > > > > > > So, I've been trying to figure out a way to do this, but as > > > > > > > > far as I know, there's no way to get a list of active domains > > > > > > > > from within the nodedev driver, and I can't think of any > > > > > > > > better ways to handle it. Any ideas? > > > > > > > > > > > > > > Correct, the nodedev driver isn't permitted to talk to any of > > > > > > > the virt drivers. > > > > > > > > > > > > Oh, not even via secondary connection? What makes nodedev so > > > > > > special, since we can open a secondary connection from e.g. the > > > > > > storage driver? > > > > > > > > > > It is technically possible, but it should never be done, because it > > > > > introduces a bi-directional dependancy between the daemons which > > > > > introduces the danger of deadlocking them. None of the secondary > > > > > drivers should connect to the hypervisor drivers. > > > > > > > > > > > > Is there anything in sysfs which reports whether the device is > > > > > > > in use ? > > > > > > > > > > > > Nothing that I know of, the way it used to work was that you > > > > > > tried to write to sysfs and kernel returned a write error with > > > > > > "device in use" or something like that, but that has changed > > > > > > since :(. > > > > > > > > Without having tried this and since mdevctl is just a Bash script, > > > > can we bypass mdevctl on destroys a little bit by constructing the > > > > path to the sysfs attribute ourselves and perform a non-blocking > > > > write of zero bytes to see if we get an error? If so, we'll skip > > > > mdevctl and report an error. If we don't, we'll invoke mdevctl to > > > > remove the device in order to remain consistent. Would that be an > > > > acceptable workaround (provided it would work)? > > > > > > As far as I can tell, this doesn't work. According to my > > > tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove > > > doesn't result in an error if the mdev is in use by a vm. It just > > > "successfully" writes zero bytes. Adding Alex to cc in case he has an > > > idea for a workaround here. > > > > [Cc +Connie] > > > > I'm not really sure why mdevs are unique here. When we write to > > remove, the first step is to release the device from the driver, so > > it's really the same as an unbind for a vfio-pci device. PCI drivers > > cannot return an error, an unbind is handled not as a request, but a > > directive, so when the device is in use we block until the unbind can > > complete. With vfio-pci (and added upstream to the mdev core - > > depending on vendor support), the driver remove callback triggers a > > virtual interrupt to the user asking to cooperatively return the device > > (triggering a hot-unplug in
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Tue, 2 Feb 2021 08:28:52 +0100 Erik Skultety wrote: > On Mon, Feb 01, 2021 at 04:38:56PM -0700, Alex Williamson wrote: > > On Mon, 1 Feb 2021 16:57:44 -0600 > > Jonathon Jongsma wrote: > > > > > On Mon, 1 Feb 2021 11:33:08 +0100 > > > Erik Skultety wrote: > > > > > > > On Mon, Feb 01, 2021 at 09:48:11AM +, Daniel P. Berrangé > > > > wrote: > > > > > On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety > > > > > wrote: > > > > > > On Mon, Feb 01, 2021 at 09:42:32AM +, Daniel P. Berrangé > > > > > > wrote: > > > > > > > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma > > > > > > > wrote: > > > > > > > > On Thu, 7 Jan 2021 17:43:54 +0100 > > > > > > > > Erik Skultety wrote: > > > > > > > > > > > > > > > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > > > > > > > > > > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > > > > > > > > > > > > > > > 2.'virsh nodedev-list' has no '--active' option, > > > > > > > > > > which is inconsistent with the description in the > > > > > > > > > > patch: # virsh nodedev-list --active > > > > > > > > > > error: command 'nodedev-list' doesn't support option > > > > > > > > > > --active > > > > > > > > > > > > > > > > > > > > 3.virsh client hang when trying to destroy a mdev > > > > > > > > > > device which is using by a vm, and after that all > > > > > > > > > > 'virsh nodev*' cmds will hang. If restarting > > > > > > > > > > llibvirtd after that, libvirtd will hang. > > > > > > > > > > > > > > > > > > It hangs because underneath a write to the 'remove' > > > > > > > > > sysfs attribute is now blocking for some reason and > > > > > > > > > since we're relying on mdevctl to do it for us, hence > > > > > > > > > "it hangs". I'm not trying to make an excuse, it's > > > > > > > > > plain wrong. I'd love to rely on such a basic > > > > > > > > > functionality, but it looks like we'll have to go > > > > > > > > > with a extremely ugly workaround and try to get the > > > > > > > > > list of active domains from the nodedev driver and > > > > > > > > > see whether any of them has the device assigned > > > > > > > > > before we try to destroy the mdev via the nodedev > > > > > > > > > driver. > > > > > > > > > > > > > > > > So, I've been trying to figure out a way to do this, > > > > > > > > but as far as I know, there's no way to get a list of > > > > > > > > active domains from within the nodedev driver, and I > > > > > > > > can't think of any better ways to handle it. Any ideas? > > > > > > > > > > > > > > > > > > > > > > Correct, the nodedev driver isn't permitted to talk to > > > > > > > any of the virt drivers. > > > > > > > > > > > > Oh, not even via secondary connection? What makes nodedev so > > > > > > special, since we can open a secondary connection from e.g. > > > > > > the storage driver? > > > > > > > > > > It is technically possible, but it should never be done, > > > > > because it introduces a bi-directional dependancy between the > > > > > daemons which introduces the danger of deadlocking them. None > > > > > of the secondary drivers should connect to the hypervisor > > > > > drivers. > > > > > > > Is there anything in sysfs which reports whether the > > > > > > > device is in use ? > > > > > > > > > > > > Nothing that I know of, the way it used to work was that you > > > > > > tried to write to sysfs and kernel returned a write error > > > > > > with "device in use" or something like that, but that has > > > > > > changed since :(. > > > > > > > > Without having tried this and since mdevctl is just a Bash > > > > script, can we bypass mdevctl on destroys a little bit by > > > > constructing the path to the sysfs attribute ourselves and > > > > perform a non-blocking write of zero bytes to see if we get an > > > > error? If so, we'll skip mdevctl and report an error. If we > > > > don't, we'll invoke mdevctl to remove the device in order to > > > > remain consistent. Would that be an acceptable workaround > > > > (provided it would work)? > > > > > > As far as I can tell, this doesn't work. According to my > > > tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove > > > doesn't result in an error if the mdev is in use by a vm. It just > > > "successfully" writes zero bytes. Adding Alex to cc in case he > > > has an idea for a workaround here. > > > > [Cc +Connie] > > > > I'm not really sure why mdevs are unique here. When we write to > > remove, the first step is to release the device from the driver, so > > it's really the same as an unbind for a vfio-pci device. PCI > > drivers cannot return an error, an unbind is handled not as a > > request, but a directive, so when the device is in use we block > > until the unbind can complete. With vfio-pci (and added upstream > > to the mdev core - depending on vendor support), the driver remove > > callback triggers a virtual interrupt to the user as
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Mon, Feb 01, 2021 at 04:38:56PM -0700, Alex Williamson wrote: > On Mon, 1 Feb 2021 16:57:44 -0600 > Jonathon Jongsma wrote: > > > On Mon, 1 Feb 2021 11:33:08 +0100 > > Erik Skultety wrote: > > > > > On Mon, Feb 01, 2021 at 09:48:11AM +, Daniel P. Berrangé wrote: > > > > On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote: > > > > > On Mon, Feb 01, 2021 at 09:42:32AM +, Daniel P. Berrangé > > > > > wrote: > > > > > > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma > > > > > > wrote: > > > > > > > On Thu, 7 Jan 2021 17:43:54 +0100 > > > > > > > Erik Skultety wrote: > > > > > > > > > > > > > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > > > > > > > > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > > > > > > > > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > > > > > > > > inconsistent with the description in the patch: > > > > > > > > > # virsh nodedev-list --active > > > > > > > > > error: command 'nodedev-list' doesn't support option > > > > > > > > > --active > > > > > > > > > > > > > > > > > > 3.virsh client hang when trying to destroy a mdev device > > > > > > > > > which is using by a vm, and after that all 'virsh nodev*' > > > > > > > > > cmds will hang. If restarting llibvirtd after that, > > > > > > > > > libvirtd will hang. > > > > > > > > > > > > > > > > It hangs because underneath a write to the 'remove' sysfs > > > > > > > > attribute is now blocking for some reason and since we're > > > > > > > > relying on mdevctl to do it for us, hence "it hangs". I'm > > > > > > > > not trying to make an excuse, it's plain wrong. I'd love to > > > > > > > > rely on such a basic functionality, but it looks like we'll > > > > > > > > have to go with a extremely ugly workaround and try to get > > > > > > > > the list of active domains from the nodedev driver and see > > > > > > > > whether any of them has the device assigned before we try > > > > > > > > to destroy the mdev via the nodedev driver. > > > > > > > > > > > > > > So, I've been trying to figure out a way to do this, but as > > > > > > > far as I know, there's no way to get a list of active domains > > > > > > > from within the nodedev driver, and I can't think of any > > > > > > > better ways to handle it. Any ideas? > > > > > > > > > > > > Correct, the nodedev driver isn't permitted to talk to any of > > > > > > the virt drivers. > > > > > > > > > > Oh, not even via secondary connection? What makes nodedev so > > > > > special, since we can open a secondary connection from e.g. the > > > > > storage driver? > > > > > > > > It is technically possible, but it should never be done, because it > > > > introduces a bi-directional dependancy between the daemons which > > > > introduces the danger of deadlocking them. None of the secondary > > > > drivers should connect to the hypervisor drivers. > > > > > > > > > > Is there anything in sysfs which reports whether the device is > > > > > > in use ? > > > > > > > > > > Nothing that I know of, the way it used to work was that you > > > > > tried to write to sysfs and kernel returned a write error with > > > > > "device in use" or something like that, but that has changed > > > > > since :(. > > > > > > Without having tried this and since mdevctl is just a Bash script, > > > can we bypass mdevctl on destroys a little bit by constructing the > > > path to the sysfs attribute ourselves and perform a non-blocking > > > write of zero bytes to see if we get an error? If so, we'll skip > > > mdevctl and report an error. If we don't, we'll invoke mdevctl to > > > remove the device in order to remain consistent. Would that be an > > > acceptable workaround (provided it would work)? > > > > As far as I can tell, this doesn't work. According to my > > tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove > > doesn't result in an error if the mdev is in use by a vm. It just > > "successfully" writes zero bytes. Adding Alex to cc in case he has an > > idea for a workaround here. > > [Cc +Connie] > > I'm not really sure why mdevs are unique here. When we write to > remove, the first step is to release the device from the driver, so > it's really the same as an unbind for a vfio-pci device. PCI drivers > cannot return an error, an unbind is handled not as a request, but a > directive, so when the device is in use we block until the unbind can > complete. With vfio-pci (and added upstream to the mdev core - > depending on vendor support), the driver remove callback triggers a > virtual interrupt to the user asking to cooperatively return the device > (triggering a hot-unplug in QEMU). Has this really worked so well in > vfio-pci that we've forgotten that an unbind can block there too or are > we better about tracking something with PCI devices vs mdevs? Does any of the current vendor guest drivers for mdev support unplug? While I'm not trying t
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Mon, 1 Feb 2021 16:57:44 -0600 Jonathon Jongsma wrote: > On Mon, 1 Feb 2021 11:33:08 +0100 > Erik Skultety wrote: > > > On Mon, Feb 01, 2021 at 09:48:11AM +, Daniel P. Berrangé wrote: > > > On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote: > > > > On Mon, Feb 01, 2021 at 09:42:32AM +, Daniel P. Berrangé > > > > wrote: > > > > > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma > > > > > wrote: > > > > > > On Thu, 7 Jan 2021 17:43:54 +0100 > > > > > > Erik Skultety wrote: > > > > > > > > > > > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > > > > > > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > > > > > > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > > > > > > > inconsistent with the description in the patch: > > > > > > > > # virsh nodedev-list --active > > > > > > > > error: command 'nodedev-list' doesn't support option > > > > > > > > --active > > > > > > > > > > > > > > > > 3.virsh client hang when trying to destroy a mdev device > > > > > > > > which is using by a vm, and after that all 'virsh nodev*' > > > > > > > > cmds will hang. If restarting llibvirtd after that, > > > > > > > > libvirtd will hang. > > > > > > > > > > > > > > It hangs because underneath a write to the 'remove' sysfs > > > > > > > attribute is now blocking for some reason and since we're > > > > > > > relying on mdevctl to do it for us, hence "it hangs". I'm > > > > > > > not trying to make an excuse, it's plain wrong. I'd love to > > > > > > > rely on such a basic functionality, but it looks like we'll > > > > > > > have to go with a extremely ugly workaround and try to get > > > > > > > the list of active domains from the nodedev driver and see > > > > > > > whether any of them has the device assigned before we try > > > > > > > to destroy the mdev via the nodedev driver. > > > > > > > > > > > > So, I've been trying to figure out a way to do this, but as > > > > > > far as I know, there's no way to get a list of active domains > > > > > > from within the nodedev driver, and I can't think of any > > > > > > better ways to handle it. Any ideas? > > > > > > > > > > Correct, the nodedev driver isn't permitted to talk to any of > > > > > the virt drivers. > > > > > > > > Oh, not even via secondary connection? What makes nodedev so > > > > special, since we can open a secondary connection from e.g. the > > > > storage driver? > > > > > > It is technically possible, but it should never be done, because it > > > introduces a bi-directional dependancy between the daemons which > > > introduces the danger of deadlocking them. None of the secondary > > > drivers should connect to the hypervisor drivers. > > > > > > > > Is there anything in sysfs which reports whether the device is > > > > > in use ? > > > > > > > > Nothing that I know of, the way it used to work was that you > > > > tried to write to sysfs and kernel returned a write error with > > > > "device in use" or something like that, but that has changed > > > > since :(. > > > > Without having tried this and since mdevctl is just a Bash script, > > can we bypass mdevctl on destroys a little bit by constructing the > > path to the sysfs attribute ourselves and perform a non-blocking > > write of zero bytes to see if we get an error? If so, we'll skip > > mdevctl and report an error. If we don't, we'll invoke mdevctl to > > remove the device in order to remain consistent. Would that be an > > acceptable workaround (provided it would work)? > > As far as I can tell, this doesn't work. According to my > tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove > doesn't result in an error if the mdev is in use by a vm. It just > "successfully" writes zero bytes. Adding Alex to cc in case he has an > idea for a workaround here. [Cc +Connie] I'm not really sure why mdevs are unique here. When we write to remove, the first step is to release the device from the driver, so it's really the same as an unbind for a vfio-pci device. PCI drivers cannot return an error, an unbind is handled not as a request, but a directive, so when the device is in use we block until the unbind can complete. With vfio-pci (and added upstream to the mdev core - depending on vendor support), the driver remove callback triggers a virtual interrupt to the user asking to cooperatively return the device (triggering a hot-unplug in QEMU). Has this really worked so well in vfio-pci that we've forgotten that an unbind can block there too or are we better about tracking something with PCI devices vs mdevs? On idea for a solution would be that vfio only allows a single open of a group at a time, so if libvirt were to open the group it could know that it's unused. If you can manage to close the group once you've already triggered the remove/unbind, then I'd think the completion of the write would be deterministic. If the group is in use else
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Mon, 1 Feb 2021 11:33:08 +0100 Erik Skultety wrote: > On Mon, Feb 01, 2021 at 09:48:11AM +, Daniel P. Berrangé wrote: > > On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote: > > > On Mon, Feb 01, 2021 at 09:42:32AM +, Daniel P. Berrangé > > > wrote: > > > > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma > > > > wrote: > > > > > On Thu, 7 Jan 2021 17:43:54 +0100 > > > > > Erik Skultety wrote: > > > > > > > > > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > > > > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > > > > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > > > > > > inconsistent with the description in the patch: > > > > > > > # virsh nodedev-list --active > > > > > > > error: command 'nodedev-list' doesn't support option > > > > > > > --active > > > > > > > > > > > > > > 3.virsh client hang when trying to destroy a mdev device > > > > > > > which is using by a vm, and after that all 'virsh nodev*' > > > > > > > cmds will hang. If restarting llibvirtd after that, > > > > > > > libvirtd will hang. > > > > > > > > > > > > It hangs because underneath a write to the 'remove' sysfs > > > > > > attribute is now blocking for some reason and since we're > > > > > > relying on mdevctl to do it for us, hence "it hangs". I'm > > > > > > not trying to make an excuse, it's plain wrong. I'd love to > > > > > > rely on such a basic functionality, but it looks like we'll > > > > > > have to go with a extremely ugly workaround and try to get > > > > > > the list of active domains from the nodedev driver and see > > > > > > whether any of them has the device assigned before we try > > > > > > to destroy the mdev via the nodedev driver. > > > > > > > > > > So, I've been trying to figure out a way to do this, but as > > > > > far as I know, there's no way to get a list of active domains > > > > > from within the nodedev driver, and I can't think of any > > > > > better ways to handle it. Any ideas? > > > > > > > > Correct, the nodedev driver isn't permitted to talk to any of > > > > the virt drivers. > > > > > > Oh, not even via secondary connection? What makes nodedev so > > > special, since we can open a secondary connection from e.g. the > > > storage driver? > > > > It is technically possible, but it should never be done, because it > > introduces a bi-directional dependancy between the daemons which > > introduces the danger of deadlocking them. None of the secondary > > drivers should connect to the hypervisor drivers. > > > > > > Is there anything in sysfs which reports whether the device is > > > > in use ? > > > > > > Nothing that I know of, the way it used to work was that you > > > tried to write to sysfs and kernel returned a write error with > > > "device in use" or something like that, but that has changed > > > since :(. > > Without having tried this and since mdevctl is just a Bash script, > can we bypass mdevctl on destroys a little bit by constructing the > path to the sysfs attribute ourselves and perform a non-blocking > write of zero bytes to see if we get an error? If so, we'll skip > mdevctl and report an error. If we don't, we'll invoke mdevctl to > remove the device in order to remain consistent. Would that be an > acceptable workaround (provided it would work)? As far as I can tell, this doesn't work. According to my tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove doesn't result in an error if the mdev is in use by a vm. It just "successfully" writes zero bytes. Adding Alex to cc in case he has an idea for a workaround here. Jonathon
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Mon, Feb 01, 2021 at 09:48:11AM +, Daniel P. Berrangé wrote: > On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote: > > On Mon, Feb 01, 2021 at 09:42:32AM +, Daniel P. Berrangé wrote: > > > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote: > > > > On Thu, 7 Jan 2021 17:43:54 +0100 > > > > Erik Skultety wrote: > > > > > > > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > > > > > inconsistent with the description in the patch: > > > > > > # virsh nodedev-list --active > > > > > > error: command 'nodedev-list' doesn't support option --active > > > > > > > > > > > > 3.virsh client hang when trying to destroy a mdev device which is > > > > > > using by a vm, and after that all 'virsh nodev*' cmds will hang. If > > > > > > restarting llibvirtd after that, libvirtd will hang. > > > > > > > > > > It hangs because underneath a write to the 'remove' sysfs attribute > > > > > is now blocking for some reason and since we're relying on mdevctl to > > > > > do it for us, hence "it hangs". I'm not trying to make an excuse, > > > > > it's plain wrong. I'd love to rely on such a basic functionality, but > > > > > it looks like we'll have to go with a extremely ugly workaround and > > > > > try to get the list of active domains from the nodedev driver and see > > > > > whether any of them has the device assigned before we try to destroy > > > > > the mdev via the nodedev driver. > > > > > > > > So, I've been trying to figure out a way to do this, but as far as > > > > I know, there's no way to get a list of active domains from within the > > > > nodedev driver, and I can't think of any better ways to handle it. Any > > > > ideas? > > > > > > Correct, the nodedev driver isn't permitted to talk to any of the virt > > > drivers. > > > > Oh, not even via secondary connection? What makes nodedev so special, since > > we > > can open a secondary connection from e.g. the storage driver? > > It is technically possible, but it should never be done, because it > introduces a bi-directional dependancy between the daemons which > introduces the danger of deadlocking them. None of the secondary > drivers should connect to the hypervisor drivers. > > > > Is there anything in sysfs which reports whether the device is in use ? > > > > Nothing that I know of, the way it used to work was that you tried to write > > to > > sysfs and kernel returned a write error with "device in use" or something > > like > > that, but that has changed since :(. Without having tried this and since mdevctl is just a Bash script, can we bypass mdevctl on destroys a little bit by constructing the path to the sysfs attribute ourselves and perform a non-blocking write of zero bytes to see if we get an error? If so, we'll skip mdevctl and report an error. If we don't, we'll invoke mdevctl to remove the device in order to remain consistent. Would that be an acceptable workaround (provided it would work)? Erik
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote: > On Mon, Feb 01, 2021 at 09:42:32AM +, Daniel P. Berrangé wrote: > > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote: > > > On Thu, 7 Jan 2021 17:43:54 +0100 > > > Erik Skultety wrote: > > > > > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > > > > inconsistent with the description in the patch: > > > > > # virsh nodedev-list --active > > > > > error: command 'nodedev-list' doesn't support option --active > > > > > > > > > > 3.virsh client hang when trying to destroy a mdev device which is > > > > > using by a vm, and after that all 'virsh nodev*' cmds will hang. If > > > > > restarting llibvirtd after that, libvirtd will hang. > > > > > > > > It hangs because underneath a write to the 'remove' sysfs attribute > > > > is now blocking for some reason and since we're relying on mdevctl to > > > > do it for us, hence "it hangs". I'm not trying to make an excuse, > > > > it's plain wrong. I'd love to rely on such a basic functionality, but > > > > it looks like we'll have to go with a extremely ugly workaround and > > > > try to get the list of active domains from the nodedev driver and see > > > > whether any of them has the device assigned before we try to destroy > > > > the mdev via the nodedev driver. > > > > > > So, I've been trying to figure out a way to do this, but as far as > > > I know, there's no way to get a list of active domains from within the > > > nodedev driver, and I can't think of any better ways to handle it. Any > > > ideas? > > > > Correct, the nodedev driver isn't permitted to talk to any of the virt > > drivers. > > Oh, not even via secondary connection? What makes nodedev so special, since we > can open a secondary connection from e.g. the storage driver? It is technically possible, but it should never be done, because it introduces a bi-directional dependancy between the daemons which introduces the danger of deadlocking them. None of the secondary drivers should connect to the hypervisor drivers. > > Is there anything in sysfs which reports whether the device is in use ? > > Nothing that I know of, the way it used to work was that you tried to write to > sysfs and kernel returned a write error with "device in use" or something like > that, but that has changed since :(. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Mon, Feb 01, 2021 at 09:42:32AM +, Daniel P. Berrangé wrote: > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote: > > On Thu, 7 Jan 2021 17:43:54 +0100 > > Erik Skultety wrote: > > > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > > > inconsistent with the description in the patch: > > > > # virsh nodedev-list --active > > > > error: command 'nodedev-list' doesn't support option --active > > > > > > > > 3.virsh client hang when trying to destroy a mdev device which is > > > > using by a vm, and after that all 'virsh nodev*' cmds will hang. If > > > > restarting llibvirtd after that, libvirtd will hang. > > > > > > It hangs because underneath a write to the 'remove' sysfs attribute > > > is now blocking for some reason and since we're relying on mdevctl to > > > do it for us, hence "it hangs". I'm not trying to make an excuse, > > > it's plain wrong. I'd love to rely on such a basic functionality, but > > > it looks like we'll have to go with a extremely ugly workaround and > > > try to get the list of active domains from the nodedev driver and see > > > whether any of them has the device assigned before we try to destroy > > > the mdev via the nodedev driver. > > > > So, I've been trying to figure out a way to do this, but as far as > > I know, there's no way to get a list of active domains from within the > > nodedev driver, and I can't think of any better ways to handle it. Any > > ideas? > > Correct, the nodedev driver isn't permitted to talk to any of the virt > drivers. Oh, not even via secondary connection? What makes nodedev so special, since we can open a secondary connection from e.g. the storage driver? > > Is there anything in sysfs which reports whether the device is in use ? Nothing that I know of, the way it used to work was that you tried to write to sysfs and kernel returned a write error with "device in use" or something like that, but that has changed since :(. Erik
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma wrote: > On Thu, 7 Jan 2021 17:43:54 +0100 > Erik Skultety wrote: > > > > Tested with v6.10.0-283-g1948d4e61e. > > > > > > 1.Can define/start/destroy mdev device successfully; > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > > inconsistent with the description in the patch: > > > # virsh nodedev-list --active > > > error: command 'nodedev-list' doesn't support option --active > > > > > > 3.virsh client hang when trying to destroy a mdev device which is > > > using by a vm, and after that all 'virsh nodev*' cmds will hang. If > > > restarting llibvirtd after that, libvirtd will hang. > > > > It hangs because underneath a write to the 'remove' sysfs attribute > > is now blocking for some reason and since we're relying on mdevctl to > > do it for us, hence "it hangs". I'm not trying to make an excuse, > > it's plain wrong. I'd love to rely on such a basic functionality, but > > it looks like we'll have to go with a extremely ugly workaround and > > try to get the list of active domains from the nodedev driver and see > > whether any of them has the device assigned before we try to destroy > > the mdev via the nodedev driver. > > So, I've been trying to figure out a way to do this, but as far as > I know, there's no way to get a list of active domains from within the > nodedev driver, and I can't think of any better ways to handle it. Any > ideas? Correct, the nodedev driver isn't permitted to talk to any of the virt drivers. Is there anything in sysfs which reports whether the device is in use ? Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Thu, 7 Jan 2021 17:43:54 +0100 Erik Skultety wrote: > > Tested with v6.10.0-283-g1948d4e61e. > > > > 1.Can define/start/destroy mdev device successfully; > > > > 2.'virsh nodedev-list' has no '--active' option, which is > > inconsistent with the description in the patch: > > # virsh nodedev-list --active > > error: command 'nodedev-list' doesn't support option --active > > > > 3.virsh client hang when trying to destroy a mdev device which is > > using by a vm, and after that all 'virsh nodev*' cmds will hang. If > > restarting llibvirtd after that, libvirtd will hang. > > It hangs because underneath a write to the 'remove' sysfs attribute > is now blocking for some reason and since we're relying on mdevctl to > do it for us, hence "it hangs". I'm not trying to make an excuse, > it's plain wrong. I'd love to rely on such a basic functionality, but > it looks like we'll have to go with a extremely ugly workaround and > try to get the list of active domains from the nodedev driver and see > whether any of them has the device assigned before we try to destroy > the mdev via the nodedev driver. So, I've been trying to figure out a way to do this, but as far as I know, there's no way to get a list of active domains from within the nodedev driver, and I can't think of any better ways to handle it. Any ideas? Jonathon
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Thu, 7 Jan 2021 17:43:54 +0100 Erik Skultety wrote: > > > > 4.Define a mdev device with the uuid specified, but the mdev device > > defined seems using another uuid. Maybe it make a little confusion > > about the use of uuid in the xml: > > #cat mdev.xml > > > > mdev_85531b6d_e5e4_41c1_aa2a_8844717f066a > > Yeah, the easy way out here is to document that the element is > read only, but that would be wrong, because we allow specifying it > for domains, networks, interfaces, etc. So, we should give the end > user the option to specify whatever name they want and generate one > if none is provided. > Unfortunately, this appears to be difficult to achieve for persistent devices. For mediated devices, we are using mdevctl as our backend persistent device definition storage. But mdevctl doesn't provide a way to attach additional metadata along with a mdev definition. So we would need to either modify mdevctl to allow us to store additional data with a device definition, or we would need to create (and keep synchronized) a separate persistent storage for libvirt-specific metadata about mediated devices. It's also worth noting that 'nodedev-create' previously allowed you to create vHBA devices in libvirt, but the 'name' element is explicitly ignored (see https://wiki.libvirt.org/page/NPIV_in_libvirt): "NOTE: If you specify "name" for the vHBA, then it will be ignored. The kernel will automatically pick the next SCSI host name in sequence not already used." That said, I do think that it will be useful or even necessary to be able to create mdevs with a specific UUID, but I think that's a separate issue than being able to specify the 'name' of the nodedev. It seems better to do that by specifying a 'uuid' element within the mdev caps, rather than trying to parse a UUID from an arbitrary name string. For example: 901891a6-2077-4476-9465-53d8995b81b4 pci__00_02_0 Thoughts? Jonathon
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Thu, Jan 07, 2021 at 05:50:10PM +0100, Erik Skultety wrote: > On Thu, Dec 24, 2020 at 08:14:24AM -0600, Jonathon Jongsma wrote: > > This patch series follows the previously-merged series which added support > > for > > transient mediated devices. This series expands mdev support to include > > persistent device definitions. Again, it relies on mdevctl as the backend. > > > > It follows the common libvirt pattern of APIs by adding the following new > > APIs > > for node devices: > > - virNodeDeviceDefineXML() - defines a persistent device > > - virNodeDeviceUndefine() - undefines a persistent device > > - virNodeDeviceCreate() - starts a previously-defined device > > > > It also adds virsh commands mapping to these new APIs: nodedev-define, > > nodedev-undefine, and nodedev-start. > > > > Since we rely on mdevctl for the definition of ediated devices, we need a > > way > > to stay up-to-date with devices that are defined by mdevctl (outside of > > libvirt). The method for staying up-to-date is currently a little bit crude > > due to the fact that mdevctl does not emit any events when new devices are > > added or removed. As a workaround, we create a file monitor for the mdevctl > > config directory and re-query mdevctl when we detect changes within that > > directory. In the future, mdevctl may introduce a more elegant solution. > > > > changes in v3: > > - streamlined tests -- removed some unnecessary duplication > > - split out patch to factor out node device name generation function > > - split nodeDeviceParseMdevctlChildDevice() into a separate function > > - added follow-up patch to remove space-padded alignment in header > > - refactored the mdevctl update handling significantly: > >- no longer a separate persistent thread that gets signaled by a timer > >- now piggybacks onto the existing udev thread and signals the thread in > > t= > > he > > same way that the udev event does. > >- Daniel suggested spawning a throw-away thread to handle mdevctl > > updates, > > but that introduces the complexity of possibly serializing multiple > > throw-away threads (e.g. if we get an 'created' event followed > > immediate= > > ly > > by a 'deleted' event, two threads may be spawned and we'd need to > > ensure > > they are properly ordered) > > - added virNodeDeviceObjListForEach() and > > virNodeDeviceObjListRemoveLocked() > >to simplify removing devices that are removed from mdevctl. > > - coding style fixes > > - NOTE: per Erik's request, I experimented with changing the way that > > mdevctl > >commands were generated and tested (e.g. introducing something like > >virMdevctlGetCommand(def, MDEVCTL_COMMAND_, ...)), but it was > >too invasive and awkward and didn't seem worthwhile > > Thanks for giving ^this a try :) > > I think we're converging nicely with the only remaining bit to agree on being > the mdevctl monitoring thread. I think we can target 7.1.0 with this series. The lifecycle events handling in this series is totally broken and needs addressing before we can consider merging. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Thu, Dec 24, 2020 at 08:14:24AM -0600, Jonathon Jongsma wrote: > This patch series follows the previously-merged series which added support for > transient mediated devices. This series expands mdev support to include > persistent device definitions. Again, it relies on mdevctl as the backend. > > It follows the common libvirt pattern of APIs by adding the following new APIs > for node devices: > - virNodeDeviceDefineXML() - defines a persistent device > - virNodeDeviceUndefine() - undefines a persistent device > - virNodeDeviceCreate() - starts a previously-defined device > > It also adds virsh commands mapping to these new APIs: nodedev-define, > nodedev-undefine, and nodedev-start. > > Since we rely on mdevctl for the definition of ediated devices, we need a way > to stay up-to-date with devices that are defined by mdevctl (outside of > libvirt). The method for staying up-to-date is currently a little bit crude > due to the fact that mdevctl does not emit any events when new devices are > added or removed. As a workaround, we create a file monitor for the mdevctl > config directory and re-query mdevctl when we detect changes within that > directory. In the future, mdevctl may introduce a more elegant solution. > > changes in v3: > - streamlined tests -- removed some unnecessary duplication > - split out patch to factor out node device name generation function > - split nodeDeviceParseMdevctlChildDevice() into a separate function > - added follow-up patch to remove space-padded alignment in header > - refactored the mdevctl update handling significantly: >- no longer a separate persistent thread that gets signaled by a timer >- now piggybacks onto the existing udev thread and signals the thread in t= > he > same way that the udev event does. >- Daniel suggested spawning a throw-away thread to handle mdevctl updates, > but that introduces the complexity of possibly serializing multiple > throw-away threads (e.g. if we get an 'created' event followed immediate= > ly > by a 'deleted' event, two threads may be spawned and we'd need to ensure > they are properly ordered) > - added virNodeDeviceObjListForEach() and virNodeDeviceObjListRemoveLocked() >to simplify removing devices that are removed from mdevctl. > - coding style fixes > - NOTE: per Erik's request, I experimented with changing the way that mdevctl >commands were generated and tested (e.g. introducing something like >virMdevctlGetCommand(def, MDEVCTL_COMMAND_, ...)), but it was >too invasive and awkward and didn't seem worthwhile Thanks for giving ^this a try :) I think we're converging nicely with the only remaining bit to agree on being the mdevctl monitoring thread. I think we can target 7.1.0 with this series. Regards, Erik
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Tue, Jan 05, 2021 at 11:50:11AM +0800, Yan Fu wrote: > Tested with v6.10.0-283-g1948d4e61e. > > 1.Can define/start/destroy mdev device successfully; > > 2.'virsh nodedev-list' has no '--active' option, which is inconsistent with > the description in the patch: > # virsh nodedev-list --active > error: command 'nodedev-list' doesn't support option --active > > 3.virsh client hang when trying to destroy a mdev device which is using by > a vm, and after that all 'virsh nodev*' cmds will hang. If restarting > llibvirtd after that, libvirtd will hang. It hangs because underneath a write to the 'remove' sysfs attribute is now blocking for some reason and since we're relying on mdevctl to do it for us, hence "it hangs". I'm not trying to make an excuse, it's plain wrong. I'd love to rely on such a basic functionality, but it looks like we'll have to go with a extremely ugly workaround and try to get the list of active domains from the nodedev driver and see whether any of them has the device assigned before we try to destroy the mdev via the nodedev driver. However, in my testing libvirtd never hung after a restart, can you elaborate on that? > > 4.Define a mdev device with the uuid specified, but the mdev device defined > seems using another uuid. Maybe it make a little confusion about the use of > uuid in the xml: > #cat mdev.xml > > mdev_85531b6d_e5e4_41c1_aa2a_8844717f066a Yeah, the easy way out here is to document that the element is read only, but that would be wrong, because we allow specifying it for domains, networks, interfaces, etc. So, we should give the end user the option to specify whatever name they want and generate one if none is provided. Regards, Erik
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Tue, Jan 5, 2021 at 11:50 AM Yan Fu wrote: > Tested with v6.10.0-283-g1948d4e61e. > > 1.Can define/start/destroy mdev device successfully; > > 2.'virsh nodedev-list' has no '--active' option, which is inconsistent > with the description in the patch: > # virsh nodedev-list --active > error: command 'nodedev-list' doesn't support option --active > > 3.virsh client hang when trying to destroy a mdev device which is using by > a vm, and after that all 'virsh nodev*' cmds will hang. If restarting > llibvirtd after that, libvirtd will hang. > Please provide the domain xml and the libvirtd log by the filter "1:qemu 1:node_device" > > 4.Define a mdev device with the uuid specified, but the mdev device > defined seems using another uuid. Maybe it make a little confusion about > the use of uuid in the xml: > #cat mdev.xml > > mdev_85531b6d_e5e4_41c1_aa2a_8844717f066a > > /sys/devices/pci:00/:00:02.0/85531b6d-e5e4-41c1-aa2a-8844717f066a > *** > pci__00_02_0 > > vfio_mdev > > > > > > > > # virsh nodedev-define /root/mdev.xml > Node device ***mdev_6cdbaad0_7215_4741_82b9_c51e1a4decda defined from > /root/mdev.xml > > > > -- > Tested by Yan Fu(y...@redhat.com) > It should be: Tested-by: Yan Fu That means it works for me. If you think there are problems here, the tested-by should not be given. > > > On Fri, Dec 25, 2020 at 10:19 AM Han Han wrote: > >> >> >> On Thu, Dec 24, 2020 at 10:15 PM Jonathon Jongsma >> wrote: >> >>> This patch series follows the previously-merged series which added >>> support for >>> transient mediated devices. This series expands mdev support to include >>> persistent device definitions. Again, it relies on mdevctl as the >>> backend. >>> >>> It follows the common libvirt pattern of APIs by adding the following >>> new APIs >>> for node devices: >>> - virNodeDeviceDefineXML() - defines a persistent device >>> - virNodeDeviceUndefine() - undefines a persistent device >>> - virNodeDeviceCreate() - starts a previously-defined device >>> >>> It also adds virsh commands mapping to these new APIs: nodedev-define, >>> nodedev-undefine, and nodedev-start. >>> >> Trivial suggestions: >> 1. Mention the bug to be resolved by this series in commit msg: >> https://bugzilla.redhat.com/show_bug.cgi?id=1699274 >> 2. Add doc of man page for the new virsh commands and options >> >>> >>> Since we rely on mdevctl for the definition of ediated devices, we need >>> a way >>> to stay up-to-date with devices that are defined by mdevctl (outside of >>> libvirt). The method for staying up-to-date is currently a little bit >>> crude >>> due to the fact that mdevctl does not emit any events when new devices >>> are >>> added or removed. As a workaround, we create a file monitor for the >>> mdevctl >>> config directory and re-query mdevctl when we detect changes within that >>> directory. In the future, mdevctl may introduce a more elegant solution. >>> >>> changes in v3: >>> - streamlined tests -- removed some unnecessary duplication >>> - split out patch to factor out node device name generation function >>> - split nodeDeviceParseMdevctlChildDevice() into a separate function >>> - added follow-up patch to remove space-padded alignment in header >>> - refactored the mdevctl update handling significantly: >>>- no longer a separate persistent thread that gets signaled by a timer >>>- now piggybacks onto the existing udev thread and signals the thread >>> in t= >>> he >>> same way that the udev event does. >>>- Daniel suggested spawning a throw-away thread to handle mdevctl >>> updates, >>> but that introduces the complexity of possibly serializing multiple >>> throw-away threads (e.g. if we get an 'created' event followed >>> immediate= >>> ly >>> by a 'deleted' event, two threads may be spawned and we'd need to >>> ensure >>> they are properly ordered) >>> - added virNodeDeviceObjListForEach() and >>> virNodeDeviceObjListRemoveLocked() >>>to simplify removing devices that are removed from mdevctl. >>> - coding style fixes >>> - NOTE: per Erik's request, I experimented with changing the way that >>> mdevctl >>>commands were generated and tested (e.g. introducing something like >>>virMdevctlGetCommand(def, MDEVCTL_COMMAND_, ...)), but it >>> was >>>too invasive and awkward and didn't seem worthwhile >>> >>> Changes in v2: >>> - rebase to latest git master >>> >>> Jonathon Jongsma (21): >>> tests: remove extra trailing semicolon >>> nodedev: introduce concept of 'active' node devices >>> nodedev: Add ability to filter by active state >>> nodedev: expose internal helper for naming devices >>> nodedev: add ability to list and parse defined mdevs >>> nodedev: add STOPPED/STARTED lifecycle events >>> nodedev: add mdevctl devices to node device list >>> nodedev: add helper functions to remove node devices >>> nodedev: handle mdevs that disappear from mdevctl >>> nodedev:
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
Tested with v6.10.0-283-g1948d4e61e. 1.Can define/start/destroy mdev device successfully; 2.'virsh nodedev-list' has no '--active' option, which is inconsistent with the description in the patch: # virsh nodedev-list --active error: command 'nodedev-list' doesn't support option --active 3.virsh client hang when trying to destroy a mdev device which is using by a vm, and after that all 'virsh nodev*' cmds will hang. If restarting llibvirtd after that, libvirtd will hang. 4.Define a mdev device with the uuid specified, but the mdev device defined seems using another uuid. Maybe it make a little confusion about the use of uuid in the xml: #cat mdev.xml mdev_85531b6d_e5e4_41c1_aa2a_8844717f066a /sys/devices/pci:00/:00:02.0/85531b6d-e5e4-41c1-aa2a-8844717f066a *** pci__00_02_0 vfio_mdev # virsh nodedev-define /root/mdev.xml Node device ***mdev_6cdbaad0_7215_4741_82b9_c51e1a4decda defined from /root/mdev.xml -- Tested by Yan Fu(y...@redhat.com) On Fri, Dec 25, 2020 at 10:19 AM Han Han wrote: > > > On Thu, Dec 24, 2020 at 10:15 PM Jonathon Jongsma > wrote: > >> This patch series follows the previously-merged series which added >> support for >> transient mediated devices. This series expands mdev support to include >> persistent device definitions. Again, it relies on mdevctl as the backend. >> >> It follows the common libvirt pattern of APIs by adding the following new >> APIs >> for node devices: >> - virNodeDeviceDefineXML() - defines a persistent device >> - virNodeDeviceUndefine() - undefines a persistent device >> - virNodeDeviceCreate() - starts a previously-defined device >> >> It also adds virsh commands mapping to these new APIs: nodedev-define, >> nodedev-undefine, and nodedev-start. >> > Trivial suggestions: > 1. Mention the bug to be resolved by this series in commit msg: > https://bugzilla.redhat.com/show_bug.cgi?id=1699274 > 2. Add doc of man page for the new virsh commands and options > >> >> Since we rely on mdevctl for the definition of ediated devices, we need a >> way >> to stay up-to-date with devices that are defined by mdevctl (outside of >> libvirt). The method for staying up-to-date is currently a little bit >> crude >> due to the fact that mdevctl does not emit any events when new devices are >> added or removed. As a workaround, we create a file monitor for the >> mdevctl >> config directory and re-query mdevctl when we detect changes within that >> directory. In the future, mdevctl may introduce a more elegant solution. >> >> changes in v3: >> - streamlined tests -- removed some unnecessary duplication >> - split out patch to factor out node device name generation function >> - split nodeDeviceParseMdevctlChildDevice() into a separate function >> - added follow-up patch to remove space-padded alignment in header >> - refactored the mdevctl update handling significantly: >>- no longer a separate persistent thread that gets signaled by a timer >>- now piggybacks onto the existing udev thread and signals the thread >> in t= >> he >> same way that the udev event does. >>- Daniel suggested spawning a throw-away thread to handle mdevctl >> updates, >> but that introduces the complexity of possibly serializing multiple >> throw-away threads (e.g. if we get an 'created' event followed >> immediate= >> ly >> by a 'deleted' event, two threads may be spawned and we'd need to >> ensure >> they are properly ordered) >> - added virNodeDeviceObjListForEach() and >> virNodeDeviceObjListRemoveLocked() >>to simplify removing devices that are removed from mdevctl. >> - coding style fixes >> - NOTE: per Erik's request, I experimented with changing the way that >> mdevctl >>commands were generated and tested (e.g. introducing something like >>virMdevctlGetCommand(def, MDEVCTL_COMMAND_, ...)), but it >> was >>too invasive and awkward and didn't seem worthwhile >> >> Changes in v2: >> - rebase to latest git master >> >> Jonathon Jongsma (21): >> tests: remove extra trailing semicolon >> nodedev: introduce concept of 'active' node devices >> nodedev: Add ability to filter by active state >> nodedev: expose internal helper for naming devices >> nodedev: add ability to list and parse defined mdevs >> nodedev: add STOPPED/STARTED lifecycle events >> nodedev: add mdevctl devices to node device list >> nodedev: add helper functions to remove node devices >> nodedev: handle mdevs that disappear from mdevctl >> nodedev: rename dataReady to udevReady >> nodedev: Refresh mdev devices when changes are detected >> api: add virNodeDeviceDefineXML() >> virsh: Add --active, --inactive, --all to nodedev-list >> virsh: add nodedev-define command >> nodedev: refactor tests to support mdev undefine >> api: add virNodeDeviceUndefine() >> virsh: Factor out function to find node device >> virsh: add nodedev-undefine command >> api: add virNodeDeviceCr
Re: [libvirt PATCH v3 00/21] Add support for persistent mediated devices
On Thu, Dec 24, 2020 at 10:15 PM Jonathon Jongsma wrote: > This patch series follows the previously-merged series which added support > for > transient mediated devices. This series expands mdev support to include > persistent device definitions. Again, it relies on mdevctl as the backend. > > It follows the common libvirt pattern of APIs by adding the following new > APIs > for node devices: > - virNodeDeviceDefineXML() - defines a persistent device > - virNodeDeviceUndefine() - undefines a persistent device > - virNodeDeviceCreate() - starts a previously-defined device > > It also adds virsh commands mapping to these new APIs: nodedev-define, > nodedev-undefine, and nodedev-start. > Trivial suggestions: 1. Mention the bug to be resolved by this series in commit msg: https://bugzilla.redhat.com/show_bug.cgi?id=1699274 2. Add doc of man page for the new virsh commands and options > > Since we rely on mdevctl for the definition of ediated devices, we need a > way > to stay up-to-date with devices that are defined by mdevctl (outside of > libvirt). The method for staying up-to-date is currently a little bit > crude > due to the fact that mdevctl does not emit any events when new devices are > added or removed. As a workaround, we create a file monitor for the mdevctl > config directory and re-query mdevctl when we detect changes within that > directory. In the future, mdevctl may introduce a more elegant solution. > > changes in v3: > - streamlined tests -- removed some unnecessary duplication > - split out patch to factor out node device name generation function > - split nodeDeviceParseMdevctlChildDevice() into a separate function > - added follow-up patch to remove space-padded alignment in header > - refactored the mdevctl update handling significantly: >- no longer a separate persistent thread that gets signaled by a timer >- now piggybacks onto the existing udev thread and signals the thread > in t= > he > same way that the udev event does. >- Daniel suggested spawning a throw-away thread to handle mdevctl > updates, > but that introduces the complexity of possibly serializing multiple > throw-away threads (e.g. if we get an 'created' event followed > immediate= > ly > by a 'deleted' event, two threads may be spawned and we'd need to > ensure > they are properly ordered) > - added virNodeDeviceObjListForEach() and > virNodeDeviceObjListRemoveLocked() >to simplify removing devices that are removed from mdevctl. > - coding style fixes > - NOTE: per Erik's request, I experimented with changing the way that > mdevctl >commands were generated and tested (e.g. introducing something like >virMdevctlGetCommand(def, MDEVCTL_COMMAND_, ...)), but it > was >too invasive and awkward and didn't seem worthwhile > > Changes in v2: > - rebase to latest git master > > Jonathon Jongsma (21): > tests: remove extra trailing semicolon > nodedev: introduce concept of 'active' node devices > nodedev: Add ability to filter by active state > nodedev: expose internal helper for naming devices > nodedev: add ability to list and parse defined mdevs > nodedev: add STOPPED/STARTED lifecycle events > nodedev: add mdevctl devices to node device list > nodedev: add helper functions to remove node devices > nodedev: handle mdevs that disappear from mdevctl > nodedev: rename dataReady to udevReady > nodedev: Refresh mdev devices when changes are detected > api: add virNodeDeviceDefineXML() > virsh: Add --active, --inactive, --all to nodedev-list > virsh: add nodedev-define command > nodedev: refactor tests to support mdev undefine > api: add virNodeDeviceUndefine() > virsh: Factor out function to find node device > virsh: add nodedev-undefine command > api: add virNodeDeviceCreate() > virsh: add "nodedev-start" command > libvirt-nodedev.h: remove space-padded alignment > > examples/c/misc/event-test.c | 4 + > include/libvirt/libvirt-nodedev.h | 98 ++-- > src/access/viraccessperm.c| 2 +- > src/access/viraccessperm.h| 6 + > src/conf/node_device_conf.h | 9 + > src/conf/virnodedeviceobj.c | 132 - > src/conf/virnodedeviceobj.h | 19 + > src/driver-nodedev.h | 14 + > src/libvirt-nodedev.c | 115 > src/libvirt_private.syms | 4 + > src/libvirt_public.syms | 3 + > src/node_device/node_device_driver.c | 538 +- > src/node_device/node_device_driver.h | 38 ++ > src/node_device/node_device_udev.c| 308 -- > src/remote/remote_driver.c| 3 + > src/remote/remote_protocol.x | 40 +- > src/remote_protocol-structs | 16 + > src/rpc/gendispatch.pl| 1 + > .
[libvirt PATCH v3 00/21] Add support for persistent mediated devices
This patch series follows the previously-merged series which added support for transient mediated devices. This series expands mdev support to include persistent device definitions. Again, it relies on mdevctl as the backend. It follows the common libvirt pattern of APIs by adding the following new APIs for node devices: - virNodeDeviceDefineXML() - defines a persistent device - virNodeDeviceUndefine() - undefines a persistent device - virNodeDeviceCreate() - starts a previously-defined device It also adds virsh commands mapping to these new APIs: nodedev-define, nodedev-undefine, and nodedev-start. Since we rely on mdevctl for the definition of ediated devices, we need a way to stay up-to-date with devices that are defined by mdevctl (outside of libvirt). The method for staying up-to-date is currently a little bit crude due to the fact that mdevctl does not emit any events when new devices are added or removed. As a workaround, we create a file monitor for the mdevctl config directory and re-query mdevctl when we detect changes within that directory. In the future, mdevctl may introduce a more elegant solution. changes in v3: - streamlined tests -- removed some unnecessary duplication - split out patch to factor out node device name generation function - split nodeDeviceParseMdevctlChildDevice() into a separate function - added follow-up patch to remove space-padded alignment in header - refactored the mdevctl update handling significantly: - no longer a separate persistent thread that gets signaled by a timer - now piggybacks onto the existing udev thread and signals the thread in t= he same way that the udev event does. - Daniel suggested spawning a throw-away thread to handle mdevctl updates, but that introduces the complexity of possibly serializing multiple throw-away threads (e.g. if we get an 'created' event followed immediate= ly by a 'deleted' event, two threads may be spawned and we'd need to ensure they are properly ordered) - added virNodeDeviceObjListForEach() and virNodeDeviceObjListRemoveLocked() to simplify removing devices that are removed from mdevctl. - coding style fixes - NOTE: per Erik's request, I experimented with changing the way that mdevctl commands were generated and tested (e.g. introducing something like virMdevctlGetCommand(def, MDEVCTL_COMMAND_, ...)), but it was too invasive and awkward and didn't seem worthwhile Changes in v2: - rebase to latest git master Jonathon Jongsma (21): tests: remove extra trailing semicolon nodedev: introduce concept of 'active' node devices nodedev: Add ability to filter by active state nodedev: expose internal helper for naming devices nodedev: add ability to list and parse defined mdevs nodedev: add STOPPED/STARTED lifecycle events nodedev: add mdevctl devices to node device list nodedev: add helper functions to remove node devices nodedev: handle mdevs that disappear from mdevctl nodedev: rename dataReady to udevReady nodedev: Refresh mdev devices when changes are detected api: add virNodeDeviceDefineXML() virsh: Add --active, --inactive, --all to nodedev-list virsh: add nodedev-define command nodedev: refactor tests to support mdev undefine api: add virNodeDeviceUndefine() virsh: Factor out function to find node device virsh: add nodedev-undefine command api: add virNodeDeviceCreate() virsh: add "nodedev-start" command libvirt-nodedev.h: remove space-padded alignment examples/c/misc/event-test.c | 4 + include/libvirt/libvirt-nodedev.h | 98 ++-- src/access/viraccessperm.c| 2 +- src/access/viraccessperm.h| 6 + src/conf/node_device_conf.h | 9 + src/conf/virnodedeviceobj.c | 132 - src/conf/virnodedeviceobj.h | 19 + src/driver-nodedev.h | 14 + src/libvirt-nodedev.c | 115 src/libvirt_private.syms | 4 + src/libvirt_public.syms | 3 + src/node_device/node_device_driver.c | 538 +- src/node_device/node_device_driver.h | 38 ++ src/node_device/node_device_udev.c| 308 -- src/remote/remote_driver.c| 3 + src/remote/remote_protocol.x | 40 +- src/remote_protocol-structs | 16 + src/rpc/gendispatch.pl| 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.argv | 1 + ...19_36ea_4111_8f0a_8c9a70e21366-define.json | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.argv | 1 + ...39_495e_4243_ad9f_beb3f14c23d9-define.json | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.argv | 1 + ...16_1ca8_49ac_b176_871d16c13076-define.json | 1 + tests/nodedevmdevctldata/mdevctl-create.argv | 1 + .../mdevctl-list-defined.argv | 1 + .../mdevctl-list-