date:20200714

Re: [PATCH 00/10] resolve hangs/crashes on libvirtd shutdown

2020-07-14 Thread Nikolay Shirokovskiy




On 14.07.2020 17:53, Daniel Henrique Barboza wrote:
> As far as code goes:
> 
> 
> Reviewed-by: Daniel Henrique Barboza 
> 
> 
> About the design I have a question about the timeout. Patch 5/10 is setting a
> 15 second timeout. How did you reach this value? Reading the bug, specially
> this comment from Daniel:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1828207#c6
> 
> He mentions "give it 5 seconds of running before shutting it down".

I guess 5 seconds is time for libvirtd to finish startup. This time has
different nature than time for libvirtd to finish it's work on shutdown
so it can be different.

> 
> 5 seconds before shutdown is something that most users can be slightly annoyed
> but in the end don't mind that much, but 15 seconds is something that will
> cause bugs to be opened because "Libvirt is taking too long to shutdown".
> Besides, it's a fair assumption that a transaction that takes more than
> 5 or so seconds to finish is already compromised* - might as well shutdown
> the daemon and deal with the errors.

15 seconds was mentioned by Daniel in [1] when he first proposed the approach
so I used this value without any extra thought. However I missed that in
the last John's series [2] the default for waiting time is 0s. May be this
is the current decision on waiting time. Let's wait for others to join
the review.

[1] https://www.redhat.com/archives/libvir-list/2018-January/msg00942.html
[2] https://www.redhat.com/archives/libvir-list/2018-July/msg00382.html

Nikolay

> 
> 
> 
> * assuming user discretion to avoid shutting down the daemon in the middle
> of a long duration transaction, of course.
> 
> 
> Thanks,
> 
> 
> DHB
> 
> 
> On 7/14/20 6:32 AM, Nikolay Shirokovskiy wrote:
>> This series follows [1] but address the issue slightly differently.
>> Instead of polling for RPC thread pool termination it waits for
>> thread pool drain in distinct thread and then signal the main loop
>> to exit.
>>
>> The series introduces new driver's methods stateShutdown/stateShutdownWait
>> to finish all driver's background threads. The methods however are only
>> implemented for qemu driver and only partially. There are other drivers
>> that have background threads and I don't check every one of them in
>> terms of how they manage their background threads.
>>
>> For example node driver creates 2 threads. One of them is supposed to live
>> a for a short amount of time and thus not tracked. This thread can cause 
>> issues
>> on shutdown. The second thread is tracked and finished synchronously on 
>> driver
>> cleanup. So this thread can not cause crashes but can cause hangs 
>> theoretically
>> speaking so we may want to move the thread finishing to stateShutdownWait
>> method so that possible hang will be handled by shutdown timeout.
>>
>> The qemu driver also has untracked threads and they can cause crashes on
>> shutdown. For example reconnect threads or reboot thread. These need to be
>> tracked.
>>
>> I'm going to address these issues in qemu and other drivers once the overall
>> approach will be approved.
>>
>> I added 2 new driver's methods so that thread finishing will be done in
>> parallel. If we have only one method then the shutdown is done one by one
>> effectively.
>>
>> I've added clean shutdown timeout in event loop as suggested by Daniel in 
>> [2].
>> Now I think why we can't just go with systemd unit management? Systemd will
>> eventually send SIGKILL and we can tune the timeout using TimeoutStopUSec
>> parameter. This way we even don't need to introduce new driver's methods.
>> Driver's background threads can be finished in stateCleanup method. AFAIU as
>> drivers are cleaned up in reverse order it is safe in the sense that already
>> cleaned up driver can not be used by background threads of not yet cleaned up
>> driver. Of course this way the cleanup is not done in parallel. Well to
>> turn it into parallel we can introduce just stateShutdown which we don't need
>> to call in netdaemon code and thus don't introduce undesired dependency of
>> netdaemon on drivers concept.
>>
>> [1] Resolve libvirtd hang on termination with connected long running client
>>  https://www.redhat.com/archives/libvir-list/2018-July/msg00382.html
>> [2] Races / crashes in shutdown of libvirtd daemon
>>  https://www.redhat.com/archives/libvir-list/2020-April/msg01328.html
>>
>> Nikolay Shirokovskiy (10):
>>    libvirt: add stateShutdown/stateShutdownWait to drivers
>>    util: always initialize priority condition
>>    util: add stop/drain functions to thread pool
>>    rpc: don't unref service ref on socket behalf twice
>>    rpc: finish all threads before exiting main loop
>>    vireventthread: add virEventThreadClose
>>    qemu: exit thread synchronously in qemuDomainObjStopWorker
>>    qemu: implement driver's shutdown/shutdown wait methods
>>    rpc: cleanup virNetDaemonClose method
>>    util: remove unused virThreadPoolNew macro
>>
>>   scripts/check-drivername.py   |  2 +
>>

Re: The issue about adding multipath device's targets into qemu-pr-helper's namespace

2020-07-14 Thread Lin Ma


On 2020-07-14 14:38, Michal Privoznik wrote:

On 7/14/20 3:41 PM, Lin Ma wrote:

Hi all,

I have a namespace question about passthrough disk(multipath device).
In case of enabling namespace and cgroups in qemu.conf, The target(s) 
of the
multipath device won't be added into qemu-pr-helper's namespace under 
certain

situation, It causes the PERSISTENT RESERVE command failure in guest.



Yeah, this is expected. I mean, the failure is expected if not all
devices are added to the namespace.


While user starts a vm,
To build namespace, The qemuDomainSetupDisk() will be invoked via 
threadA(this

thread id will be the qemu's pid),
To build cgroup, The qemuSetupImagePathCgroup() will be invoked via 
threadB.


Both of the functions invoke the virDevMapperGetTargets() trying to 
parse a

multipath device to target paths string, Then fill the targetPaths[].

The issue I experienced is:
After libvirtd started, Everything works well for the first booted vm 
which has

the passthrough multipath device.
But If I shut it down & start it again, OR keep it running & start 
another vm
which has other passthrough multipath device, Then the target(s) of 
the fresh
started vm won't be added into the related qemu-pr-helper's namespace 
and it

causes PERSISTENT RESERVE command failure in the corresponding guest.
I digged into code, In this situation, The targetPaths[] in 
qemuDomainSetupDisk()

won't be filled, it keeps NULL after virDevMapperGetTargets() returns.
The virDevMapperGetTargets doesn't fill targetPaths[] because the 
dm_task_run()

of libdevmapper returns 0 with errno 9(Bad file descriptor).
So far, I don't understand why the dm_task_run() return 0 in this 
situation.
BTW, The virDevMapperGetTargets() can always successfully fill the 
targetPaths[]

in qemuSetupImagePathCgroup().


What is your libvirt version? I've merged a fix for something similar
not a long ago: v6.5.0-rc1~190

Can you please check v6.5.0?


The libvirt version I used in the tests is git master.
The sure thing is that the multipath devices I used are valid and 
functional.


In the test A or the test B, Everytime we restart libvirtd, The issue 
won't happen for

the first booted vm.

In other words, The target of multipath device can be parsed & added to 
qemu-pr-helper's

namespace for the first booted vm.

As long as libvirtd once invoked 
qemuSetupImagePathCgroup()->virDevMapperGetTargetsImpl()

to successfully parse any valid multipath device,
From then on, The 
qemuDomainSetupDisk()->virDevMapperGetTargetsImpl()->dm_task_run()

returns 0 with errno=9 when parsing any valid multipath device.

BTW, I ran the same tests on kernel 4.12 and kernel 5.6, Got the same 
test results.


Thanks,
Lin

Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Sean Mooney

resending with full cc list since i had this typed up
i would blame my email provier but my email client does not seam to like long 
cc lists.
we probably want to continue on  alex's thread to not split the disscusion.
but i have responed inline with some example of  how openstack schdules and 
what i ment by different mdev_types


On Tue, 2020-07-14 at 20:29 +0100, Sean Mooney wrote:
> On Tue, 2020-07-14 at 11:01 -0600, Alex Williamson wrote:
> > On Tue, 14 Jul 2020 13:33:24 +0100
> > Sean Mooney  wrote:
> > 
> > > On Tue, 2020-07-14 at 11:21 +0100, Daniel P. Berrangé wrote:
> > > > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:  
> > > > > hi folks,
> > > > > we are defining a device migration compatibility interface that helps 
> > > > > upper
> > > > > layer stack like openstack/ovirt/libvirt to check if two devices are
> > > > > live migration compatible.
> > > > > The "devices" here could be MDEVs, physical devices, or hybrid of the 
> > > > > two.
> > > > > e.g. we could use it to check whether
> > > > > - a src MDEV can migrate to a target MDEV,  
> > > 
> > > mdev live migration is completely possible to do but i agree with Dan 
> > > barrange's comments
> > > from the point of view of openstack integration i dont see calling out to 
> > > a vender sepecific
> > > tool to be an accpetable
> > 
> > As I replied to Dan, I'm hoping Yan was referring more to vendor
> > specific knowledge rather than actual tools.
> > 
> > > solutions for device compatiablity checking. the sys filesystem
> > > that describs the mdevs that can be created shoudl also
> > > contain the relevent infomation such
> > > taht nova could integrate it via libvirt xml representation or directly 
> > > retrive the
> > > info from
> > > sysfs.
> > > > > - a src VF in SRIOV can migrate to a target VF in SRIOV,  
> > > 
> > > so vf to vf migration is not possible in the general case as there is no 
> > > standarised
> > > way to transfer teh device state as part of the siorv specs produced by 
> > > the pci-sig
> > > as such there is not vender neutral way to support sriov live migration. 
> > 
> > We're not talking about a general case, we're talking about physical
> > devices which have vfio wrappers or hooks with device specific
> > knowledge in order to support the vfio migration interface.  The point
> > is that a discussion around vfio device migration cannot be limited to
> > mdev devices.
> 
> ok upstream in  openstack at least we do not plan to support generic 
> livemigration
> for passthough devivces. we cheat with network interfaces since in generaly 
> operating
> systems handel hotplug of a nic somewhat safely so wehre no abstraction layer 
> like
> an mdev is present or a macvtap device we hot unplug the nic before the 
> migration
> and attach a new one after.  for gpus or crypto cards this likely would not 
> be viable
> since you can bond generic hardware devices to hide the removal and readdtion 
> of a generic
> pci device. we were hoping that there would be a convergenca around MDEVs as 
> a way to provide
> that abstraction going forward for generic device or some other new 
> mechanisum in the future.
> > 
> > > > > - a src MDEV can migration to a target VF in SRIOV.  
> > > 
> > > that also makes this unviable
> > > > >   (e.g. SIOV/SRIOV backward compatibility case)
> > > > > 
> > > > > The upper layer stack could use this interface as the last step to 
> > > > > check
> > > > > if one device is able to migrate to another device before triggering 
> > > > > a real
> > > > > live migration procedure.  
> > > 
> > > well actully that is already too late really. ideally we would want to do 
> > > this compaiablity
> > > check much sooneer to avoid the migration failing. in an openstack 
> > > envionment  at least
> > > by the time we invoke libvirt (assuming your using the libvirt driver) to 
> > > do the migration we have alreaedy
> > > finished schduling the instance to the new host. if if we do the 
> > > compatiablity check at this point
> > > and it fails then the live migration is aborted and will not be retired. 
> > > These types of late check lead to a
> > > poor user experince as unless you check the migration detial it basically 
> > > looks like the migration was ignored
> > > as it start to migrate and then continuge running on the orgininal host.
> > > 
> > > when using generic pci passhotuhg with openstack, the pci alias is 
> > > intended to reference a single vendor
> > > id/product
> > > id so you will have 1+ alias for each type of device. that allows 
> > > openstack to schedule based on the availability
> > > of
> > > a
> > > compatibale device because we track inventories of pci devices and can 
> > > query that when selecting a host.
> > > 
> > > if we were to support mdev live migration in the future we would want to 
> > > take the same declarative approch.
> > > 1 interospec the capability of the deivce we manage
> > > 2 create inventories of the allocatable devices and

Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Alex Williamson

On Tue, 14 Jul 2020 18:19:46 +0100
"Dr. David Alan Gilbert"  wrote:

> * Alex Williamson (alex.william...@redhat.com) wrote:
> > On Tue, 14 Jul 2020 11:21:29 +0100
> > Daniel P. Berrangé  wrote:
> >   
> > > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:  
> > > > hi folks,
> > > > we are defining a device migration compatibility interface that helps 
> > > > upper
> > > > layer stack like openstack/ovirt/libvirt to check if two devices are
> > > > live migration compatible.
> > > > The "devices" here could be MDEVs, physical devices, or hybrid of the 
> > > > two.
> > > > e.g. we could use it to check whether
> > > > - a src MDEV can migrate to a target MDEV,
> > > > - a src VF in SRIOV can migrate to a target VF in SRIOV,
> > > > - a src MDEV can migration to a target VF in SRIOV.
> > > >   (e.g. SIOV/SRIOV backward compatibility case)
> > > > 
> > > > The upper layer stack could use this interface as the last step to check
> > > > if one device is able to migrate to another device before triggering a 
> > > > real
> > > > live migration procedure.
> > > > we are not sure if this interface is of value or help to you. please 
> > > > don't
> > > > hesitate to drop your valuable comments.
> > > > 
> > > > 
> > > > (1) interface definition
> > > > The interface is defined in below way:
> > > > 
> > > >  __userspace
> > > >   /\  \
> > > >  / \write
> > > > / read  \
> > > >/__   ___\|/_
> > > >   | migration_version | | migration_version |-->check migration
> > > >   - -   compatibility
> > > >  device Adevice B
> > > > 
> > > > 
> > > > a device attribute named migration_version is defined under each 
> > > > device's
> > > > sysfs node. e.g. 
> > > > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version).
> > > > userspace tools read the migration_version as a string from the source 
> > > > device,
> > > > and write it to the migration_version sysfs attribute in the target 
> > > > device.
> > > > 
> > > > The userspace should treat ANY of below conditions as two devices not 
> > > > compatible:
> > > > - any one of the two devices does not have a migration_version attribute
> > > > - error when reading from migration_version attribute of one device
> > > > - error when writing migration_version string of one device to
> > > >   migration_version attribute of the other device
> > > > 
> > > > The string read from migration_version attribute is defined by device 
> > > > vendor
> > > > driver and is completely opaque to the userspace.
> > > > for a Intel vGPU, string format can be defined like
> > > > "parent device PCI ID" + "version of gvt driver" + "mdev type" + 
> > > > "aggregator count".
> > > > 
> > > > for an NVMe VF connecting to a remote storage. it could be
> > > > "PCI ID" + "driver version" + "configured remote storage URL"
> > > > 
> > > > for a QAT VF, it may be
> > > > "PCI ID" + "driver version" + "supported encryption set".
> > > > 
> > > > (to avoid namespace confliction from each vendor, we may prefix a 
> > > > driver name to
> > > > each migration_version string. e.g. i915-v1-8086-591d-i915-GVTg_V5_8-1) 
> > > >  
> > 
> > It's very strange to define it as opaque and then proceed to describe
> > the contents of that opaque string.  The point is that its contents
> > are defined by the vendor driver to describe the device, driver version,
> > and possibly metadata about the configuration of the device.  One
> > instance of a device might generate a different string from another.
> > The string that a device produces is not necessarily the only string
> > the vendor driver will accept, for example the driver might support
> > backwards compatible migrations.  
> 
> (As I've said in the previous discussion, off one of the patch series)
> 
> My view is it makes sense to have a half-way house on the opaqueness of
> this string; I'd expect to have an ID and version that are human
> readable, maybe a device ID/name that's human interpretable and then a
> bunch of other cruft that maybe device/vendor/version specific.
> 
> I'm thinking that we want to be able to report problems and include the
> string and the user to be able to easily identify the device that was
> complaining and notice a difference in versions, and perhaps also use
> it in compatibility patterns to find compatible hosts; but that does
> get tricky when it's a 'ask the device if it's compatible'.

In the reply I just sent to Dan, I gave this example of what a
"compatibility string" might look like represented as json:

{
  "device_api": "vfio-pci",
  "vendor": "vendor-driver-name",
  "version": {
"major": 0,
"minor": 1
  },
  "vfio-pci": { // Based on above device_api
"vendor": 0x1234, // Values for the exposed device
"device": 0x5678,
  // Possibly further parameters for a more

Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Alex Williamson

On Tue, 14 Jul 2020 17:47:22 +0100
Daniel P. Berrangé  wrote:

> On Tue, Jul 14, 2020 at 10:16:16AM -0600, Alex Williamson wrote:
> > On Tue, 14 Jul 2020 11:21:29 +0100
> > Daniel P. Berrangé  wrote:
> >   
> > > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:  
> > > > 
> > > > The string read from migration_version attribute is defined by device 
> > > > vendor
> > > > driver and is completely opaque to the userspace.
> > > > for a Intel vGPU, string format can be defined like
> > > > "parent device PCI ID" + "version of gvt driver" + "mdev type" + 
> > > > "aggregator count".
> > > > 
> > > > for an NVMe VF connecting to a remote storage. it could be
> > > > "PCI ID" + "driver version" + "configured remote storage URL"
> > > > 
> > > > for a QAT VF, it may be
> > > > "PCI ID" + "driver version" + "supported encryption set".
> > > > 
> > > > (to avoid namespace confliction from each vendor, we may prefix a 
> > > > driver name to
> > > > each migration_version string. e.g. i915-v1-8086-591d-i915-GVTg_V5_8-1) 
> > > >  
> > 
> > It's very strange to define it as opaque and then proceed to describe
> > the contents of that opaque string.  The point is that its contents
> > are defined by the vendor driver to describe the device, driver version,
> > and possibly metadata about the configuration of the device.  One
> > instance of a device might generate a different string from another.
> > The string that a device produces is not necessarily the only string
> > the vendor driver will accept, for example the driver might support
> > backwards compatible migrations.  
> 
> 
> > > IMHO there needs to be a mechanism for the kernel to report via sysfs
> > > what versions are supported on a given device. This puts the job of
> > > reporting compatible versions directly under the responsibility of the
> > > vendor who writes the kernel driver for it. They are the ones with the
> > > best knowledge of the hardware they've built and the rules around its
> > > compatibility.  
> > 
> > The version string discussed previously is the version string that
> > represents a given device, possibly including driver information,
> > configuration, etc.  I think what you're asking for here is an
> > enumeration of every possible version string that a given device could
> > accept as an incoming migration stream.  If we consider the string as
> > opaque, that means the vendor driver needs to generate a separate
> > string for every possible version it could accept, for every possible
> > configuration option.  That potentially becomes an excessive amount of
> > data to either generate or manage.
> > 
> > Am I overestimating how vendors intend to use the version string?  
> 
> If I'm interpreting your reply & the quoted text orrectly, the version
> string isn't really a version string in any normal sense of the word
> "version".
> 
> Instead it sounds like string encoding a set of features in some arbitrary
> vendor specific format, which they parse and do compatibility checks on
> individual pieces ? One or more parts may contain a version number, but
> its much more than just a version.
> 
> If that's correct, then I'd prefer we didn't call it a version string,
> instead call it a "capability string" to make it clear it is expressing
> a much more general concept, but...

I'd agree with that.  The intent of the previous proposal was to
provide and interface for reading a string and writing a string back in
where the result of that write indicated migration compatibility with
the device.  So yes, "version" is not the right term.
 
> > We'd also need to consider devices that we could create, for instance
> > providing the same interface enumeration prior to creating an mdev
> > device to have a confidence level that the new device would be a valid
> > target.
> > 
> > We defined the string as opaque to allow vendor flexibility and because
> > defining a common format is hard.  Do we need to revisit this part of
> > the discussion to define the version string as non-opaque with parsing
> > rules, probably with separate incoming vs outgoing interfaces?  Thanks,  
> 
> ..even if the huge amount of flexibility is technically relevant from the
> POV of the hardware/drivers, we should consider whether management apps
> actually want, or can use, that level of flexibility.
> 
> The task of picking which host to place a VM on has alot of factors to
> consider, and when there are a large number of hosts, the total amount
> of information to check gets correspondingly large.  The placement
> process is also fairly performance critical.
> 
> Running complex algorithmic logic to check compatibility of devices
> based on a arbitrary set of rules is likely to be a performance
> challenge. A flat list of supported strings is a much simpler
> thing to check as it reduces down to a simple set membership test.
> 
> IOW, even if there's some complex set of device type / vendor specific
> rules to check for compatibility, I fear apps

[libvirt PATCH 1/2] tests: Don't assume IPv4 connectivity is available

2020-07-14 Thread Andrea Bolognani

If the host doesn't have a single IPv4 address assigned to any of
its interfaces, not even the loopback one, then virnetsockettest
will fail with

  Cannot identify IPv4/6 availability

because, while the IPv6 bind attempt is conditional, the IPv4 one
is not, and in this case it will always fail.

This commit is better viewed with 'git show -w'.

Signed-off-by: Andrea Bolognani 
---
 tests/virnetsockettest.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/tests/virnetsockettest.c b/tests/virnetsockettest.c
index 78fb9cbffd..f56e623cb3 100644
--- a/tests/virnetsockettest.c
+++ b/tests/virnetsockettest.c
@@ -56,8 +56,10 @@ checkProtocols(bool *hasIPv4, bool *hasIPv6,
 
 for (i = 0; i < 50; i++) {
 int only = 1;
-if ((s4 = socket(AF_INET, SOCK_STREAM, 0)) < 0)
-goto cleanup;
+if (*hasIPv4) {
+if ((s4 = socket(AF_INET, SOCK_STREAM, 0)) < 0)
+goto cleanup;
+}
 
 if (*hasIPv6) {
 if ((s6 = socket(AF_INET6, SOCK_STREAM, 0)) < 0)
@@ -77,13 +79,15 @@ checkProtocols(bool *hasIPv4, bool *hasIPv6,
 in6.sin6_port = htons(BASE_PORT + i);
 in6.sin6_addr = in6addr_loopback;
 
-if (bind(s4, (struct sockaddr *), sizeof(in4)) < 0) {
-if (errno == EADDRINUSE) {
-VIR_FORCE_CLOSE(s4);
-VIR_FORCE_CLOSE(s6);
-continue;
+if (*hasIPv4) {
+if (bind(s4, (struct sockaddr *), sizeof(in4)) < 0) {
+if (errno == EADDRINUSE) {
+VIR_FORCE_CLOSE(s4);
+VIR_FORCE_CLOSE(s6);
+continue;
+}
+goto cleanup;
 }
-goto cleanup;
 }
 
 if (*hasIPv6) {
-- 
2.25.4

[libvirt PATCH 2/2] tests: Minimize variable scope

2020-07-14 Thread Andrea Bolognani

Signed-off-by: Andrea Bolognani 
---
 tests/virnetsockettest.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/virnetsockettest.c b/tests/virnetsockettest.c
index f56e623cb3..96c582216c 100644
--- a/tests/virnetsockettest.c
+++ b/tests/virnetsockettest.c
@@ -55,13 +55,14 @@ checkProtocols(bool *hasIPv4, bool *hasIPv6,
 return -1;
 
 for (i = 0; i < 50; i++) {
-int only = 1;
 if (*hasIPv4) {
 if ((s4 = socket(AF_INET, SOCK_STREAM, 0)) < 0)
 goto cleanup;
 }
 
 if (*hasIPv6) {
+int only = 1;
+
 if ((s6 = socket(AF_INET6, SOCK_STREAM, 0)) < 0)
 goto cleanup;
 
-- 
2.25.4

[libvirt PATCH 0/2] tests: Don't assume IPv4 connectivity is available

2020-07-14 Thread Andrea Bolognani

I started looking into this after seeing

  FAIL: virnetsockettest
  ==

  TEST: virnetsockettest
   1) Socket TCP/IPv4 Accept   ... libvirt: XML-RPC error : 
Unable to resolve address '127.0.0.1' service '5672': Address family for 
hostname not supported
  FAILED
   2) Socket TCP/IPv6 Accept   ... OK
   3) Socket TCP/IPv4+IPv6 Accept  ... OK
   4) Socket TCP/IPv4+IPv6 Accept  ... OK
   5) Socket UNIX Accept   ... OK
   6) Socket UNIX Addrs... OK
   7) Socket External Command /dev/zero... OK
   8) Socket External Command /dev/does-not-exist  ... libvirt: XML-RPC error : 
End of file while reading data: /bin/cat: /dev/does-not-exist: No such file or 
directory: Input/output error
  OK
   9) SSH test 1   ... OK
  10) SSH test 2   ... OK
  11) SSH test 3   ... OK
  12) SSH test 4   ... libvirt: XML-RPC error : 
End of file while reading data: Cannot connect to host nosuchhost: Input/output 
error
  OK
  13) SSH test 5   ... libvirt: XML-RPC error : 
End of file while reading data: : Input/output error
  OK
  14) SSH test 6   ... OK
  15) SSH test 7   ... OK

during a Debian package build.

Full log:

  
https://buildd.debian.org/status/fetch.php?pkg=libvirt=armel=6.4.0-2=1594738948=0

Just a few days ago, this issue was discussed in

  https://lists.debian.org/debian-devel/2020/07/msg00070.html

and libvirt was mentioned explicitly as a package that could be
affected by it.

Reproducing the issue is quite simple: just remove all IPv4
addresses *except* the one assigned to 'lo', and you'll see it
immediately. The problem seems to be that, in this configuration,
virNetSocketCheckProtocols() reports that both IPv4 and IPv6 are
available, but XML-RPC is apparently not happy with that.

I'm pretty bad at networking, so I'm reporting this in the hope that
someone else will figure it out O:-)

Anyway, there's another scenario that I found while digging: if you
remove *all* IPv4 addresses, *including* the one assigned to 'lo',
you get a different failure. My patch addresses this last one.

Andrea Bolognani (2):
  tests: Don't assume IPv4 connectivity is available
  tests: Minimize variable scope

 tests/virnetsockettest.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

-- 
2.25.4

Re: [PATCH v5 0/3] tpm: Fix default choices for CRB and SPAPR dev models

2020-07-14 Thread Marc-André Lureau

Hi

On Fri, Jul 10, 2020 at 12:49 AM Stefan Berger 
wrote:

> From: Stefan Berger 
>
> This series of patches adds an additional check for the SPAPR device model
> that prevents the choice of a TPM 1.2 backend and chooses a TPM 2 as
> default.
> Also CRB now chooses a TPM 2 as default since TPM 1.2 wouldn't work with
> it,
> either.
>
>Stefan
>
> v4->v5:
>   - Added R-b's
>
> Stefan Berger (3):
>   qemu: Move setting of TPM default to post parse function
>   qemu: Set SPAPR TPM default to 2.0 and prevent 1.2 choice
>   qemu: Choose TPM 2 for backend as default for CRB interface
>

Series:
Reviewed-by: Marc-André Lureau 


>  src/qemu/qemu_domain.c   | 12 +---
>  src/qemu/qemu_validate.c | 10 ++
>  2 files changed, 15 insertions(+), 7 deletions(-)
>
> --
> 2.17.1
>
>

-- 
Marc-André Lureau

[PATCH 06/11] virDomainHostdevDefFormatSubsys: Split out formatting of vHBA subsystem

2020-07-14 Thread Peter Krempa

Similarly to previous commit split out formatting of the vHBA subsystem
related stuff.

Signed-off-by: Peter Krempa 
---
 src/conf/domain_conf.c | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 7b181e0587..facfddeea7 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -26148,6 +26148,21 @@ virDomainHostdevDefFormatSubsysSCSI(virBufferPtr buf,
 }


+static void
+virDomainHostdevDefFormatSubsysSCSIHost(virBufferPtr buf,
+virDomainHostdevDefPtr def)
+{
+g_auto(virBuffer) sourceAttrBuf = VIR_BUFFER_INITIALIZER;
+virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
+
+virBufferAsprintf(, " protocol='%s' wwpn='%s'",
+  
virDomainHostdevSubsysSCSIHostProtocolTypeToString(hostsrc->protocol),
+  hostsrc->wwpn);
+
+virXMLFormatElement(buf, "source", , NULL);
+}
+
+
 static int
 virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 virDomainHostdevDefPtr def,
@@ -26155,9 +26170,7 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 bool includeTypeInAddr,
 virDomainXMLOptionPtr xmlopt)
 {
-g_auto(virBuffer) sourceAttrBuf = VIR_BUFFER_INITIALIZER;
 g_auto(virBuffer) sourceChildBuf = VIR_BUFFER_INIT_CHILD(buf);
-virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
 virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;

 switch ((virDomainHostdevSubsysType) def->source.subsys.type) {
@@ -26172,6 +26185,9 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 return virDomainHostdevDefFormatSubsysSCSI(buf, def, flags, 
includeTypeInAddr, xmlopt);

 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST:
+virDomainHostdevDefFormatSubsysSCSIHost(buf, def);
+return 0;
+
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
 break;

@@ -26182,14 +26198,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 }


-if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) {
-const char *protocol =
-
virDomainHostdevSubsysSCSIHostProtocolTypeToString(hostsrc->protocol);
-
-virBufferAsprintf(, " protocol='%s' wwpn='%s'",
-  protocol, hostsrc->wwpn);
-}
-
 switch (def->source.subsys.type) {
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB:
 break;
@@ -26210,7 +26218,7 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 return -1;
 }

-virXMLFormatElement(buf, "source", , );
+virXMLFormatElement(buf, "source", NULL, );

 return 0;
 }
-- 
2.26.2

[PATCH 07/11] virDomainHostdevDefFormatSubsys: Split out formatting of mdev subsystem

2020-07-14 Thread Peter Krempa

Similarly to previous commit split out formatting of the mdev subsystem
related stuff.

Signed-off-by: Peter Krempa 
---
 src/conf/domain_conf.c | 42 +++---
 1 file changed, 15 insertions(+), 27 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index facfddeea7..e9fee10e31 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -26163,6 +26163,19 @@ virDomainHostdevDefFormatSubsysSCSIHost(virBufferPtr 
buf,
 }


+static void
+virDomainHostdevDefFormatSubsysMDEV(virBufferPtr buf,
+virDomainHostdevDefPtr def)
+{
+g_auto(virBuffer) sourceChildBuf = VIR_BUFFER_INIT_CHILD(buf);
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
+
+virBufferAsprintf(, "\n", 
mdevsrc->uuidstr);
+
+virXMLFormatElement(buf, "source", NULL, );
+}
+
+
 static int
 virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 virDomainHostdevDefPtr def,
@@ -26170,9 +26183,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 bool includeTypeInAddr,
 virDomainXMLOptionPtr xmlopt)
 {
-g_auto(virBuffer) sourceChildBuf = VIR_BUFFER_INIT_CHILD(buf);
-virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
-
 switch ((virDomainHostdevSubsysType) def->source.subsys.type) {
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB:
 virDomainHostdevDefFormatSubsysUSB(buf, def, flags, includeTypeInAddr);
@@ -26189,7 +26199,8 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 return 0;

 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
-break;
+virDomainHostdevDefFormatSubsysMDEV(buf, def);
+return 0;

 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 default:
@@ -26197,29 +26208,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 return -1;
 }

-
-switch (def->source.subsys.type) {
-case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB:
-break;
-case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI:
-break;
-case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI:
-break;
-case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST:
-break;
-case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
-virBufferAsprintf(, "\n",
-  mdevsrc->uuidstr);
-break;
-default:
-virReportError(VIR_ERR_INTERNAL_ERROR,
-   _("unexpected hostdev type %d"),
-   def->source.subsys.type);
-return -1;
-}
-
-virXMLFormatElement(buf, "source", NULL, );
-
 return 0;
 }

-- 
2.26.2

[PATCH 10/11] docs: formatdomain-devices: Split out '' into separate file

2020-07-14 Thread Peter Krempa

Signed-off-by: Peter Krempa 
---
 docs/formatdomain-devices-hostdev.rst.in | 337 ++
 docs/formatdomain-devices.rst.in | 338 +--
 2 files changed, 338 insertions(+), 337 deletions(-)
 create mode 100644 docs/formatdomain-devices-hostdev.rst.in

diff --git a/docs/formatdomain-devices-hostdev.rst.in 
b/docs/formatdomain-devices-hostdev.rst.in
new file mode 100644
index 00..859c4b4ec5
--- /dev/null
+++ b/docs/formatdomain-devices-hostdev.rst.in
@@ -0,0 +1,337 @@
+:anchor:``
+
+Host device assignment
+~~
+
+:anchor:``
+
+USB / PCI / SCSI devices
+
+
+USB, PCI and SCSI devices attached to the host can be passed through to the
+guest using the ``hostdev`` element. :since:`since after 0.4.4 for USB, 0.6.0
+for PCI (KVM only) and 1.0.6 for SCSI (KVM only)` :
+
+::
+
+   ...
+   
+ 
+   
+ 
+ 
+   
+   
+ 
+   
+   ...
+
+or:
+
+::
+
+   ...
+   
+ 
+   
+ 
+   
+   
+   
+ 
+   
+   ...
+
+or:
+
+::
+
+   ...
+   
+ 
+   
+ 
+ 
+   
+   
+   
+ 
+   
+   ...
+
+or:
+
+::
+
+   ...
+   
+ 
+   
+ 
+ 
+   
+ 
+   
+   
+ 
+   
+   ...
+
+or:
+
+::
+
+ ...
+ 
+   
+ 
+   
+ 
+ ...
+
+or:
+
+::
+
+ ...
+ 
+   
+   
+ 
+   
+   
+   
+   
+ 
+   
+   
+   
+ 
+ ...
+
+``hostdev``
+   The ``hostdev`` element is the main container for describing host devices.
+   For each device, the ``mode`` is always "subsystem" and the ``type`` is one
+   of the following values with additional attributes noted.
+
+   ``usb``
+  USB devices are detached from the host on guest startup and reattached
+  after the guest exits or the device is hot-unplugged.
+   ``pci``
+  For PCI devices, when ``managed`` is "yes" it is detached from the host
+  before being passed on to the guest and reattached to the host after the
+  guest exits. If ``managed`` is omitted or "no", the user is responsible 
to
+  call ``virNodeDeviceDetachFlags`` (or ``virsh nodedev-detach`` before
+  starting the guest or hot-plugging the device and
+  ``virNodeDeviceReAttach`` (or ``virsh nodedev-reattach``) after 
hot-unplug
+  or stopping the guest.
+   ``scsi``
+  For SCSI devices, user is responsible to make sure the device is not used
+  by host. If supported by the hypervisor and OS, the optional ``sgio`` (
+  :since:`since 1.0.6` ) attribute indicates whether unprivileged SG_IO
+  commands are filtered for the disk. Valid settings are "filtered" or
+  "unfiltered", where the default is "filtered". The optional ``rawio`` (
+  :since:`since 1.2.9` ) attribute indicates whether the lun needs the 
rawio
+  capability. Valid settings are "yes" or "no". See the rawio description
+  within the `disk <#elementsDisks>`__ section. If a disk lun in the domain
+  already has the rawio capability, then this setting not required.
+   ``scsi_host``
+  :since:`since 2.5.0` For SCSI devices, user is responsible to make sure
+  the device is not used by host. This ``type`` passes all LUNs presented 
by
+  a single HBA to the guest. :since:`Since 5.2.0,` the ``model`` attribute
+  can be specified further with "virtio-transitional",
+  "virtio-non-transitional", or "virtio". See `Virtio transitional
+  devices <#elementsVirtioTransitional>`__ for more details.
+   ``mdev``
+  For mediated devices ( :since:`Since 3.2.0` ) the ``model`` attribute
+  specifies the device API which determines how the host's vfio driver will
+  expose the device to the guest. Currently, ``model='vfio-pci'``,
+  ``model='vfio-ccw'`` ( :since:`Since 4.4.0` ) and ``model='vfio-ap'`` (
+  :since:`Since 4.9.0` ) is supported. `MDEV `__
+  section provides more information about mediated devices as well as how 
to
+  create mediated devices on the host. :since:`Since 4.6.0 (QEMU 2.12)` an
+  optional ``display`` attribute may be used to enable or disable support
+  for an accelerated remote desktop backed by a mediated device (such as
+  NVIDIA vGPU or Intel GVT-g) as an alternative to emulated `video
+  devices <#elementsVideo>`__. This attribute is limited to
+  ``model='vfio-pci'`` only. Supported values are either ``on`` or ``off``
+  (default is 'off'). It is required to use a `graphical
+  framebuffer <#elementsGraphics>`__ in order to use this attribute,
+  currently only supported with VNC, Spice and egl-headless graphics
+  devices. :since:`Since version 5.10.0` , there is an optional ``ramfb``
+  attribute for devices with ``model='vfio-pci'``. Supported values are
+  either ``on`` or ``off`` (default is 'off'). When enabled, this attribute
+  provides a memory framebuffer

[PATCH 11/11] conf: Add support for initiator IQN setting for iSCSI hostdevs

2020-07-14 Thread Peter Krempa

We already allow controlling the initiator IQN for iSCSI based disks.
Add the same for host devices.

Signed-off-by: Peter Krempa 
---
 docs/formatdomain-devices-hostdev.rst.in   | 7 +++
 docs/schemas/domaincommon.rng  | 3 +++
 src/conf/domain_conf.c | 5 +
 .../hostdev-scsi-virtio-scsi.x86_64-4.1.0.args | 3 ++-
 .../hostdev-scsi-virtio-scsi.x86_64-latest.args| 1 +
 tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.xml| 3 +++
 tests/qemuxml2xmloutdata/hostdev-scsi-virtio-scsi.xml  | 3 +++
 7 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/docs/formatdomain-devices-hostdev.rst.in 
b/docs/formatdomain-devices-hostdev.rst.in
index 859c4b4ec5..1221627818 100644
--- a/docs/formatdomain-devices-hostdev.rst.in
+++ b/docs/formatdomain-devices-hostdev.rst.in
@@ -68,6 +68,9 @@ or:
  

  
+ 
+   
+ 
  

  
@@ -210,6 +213,10 @@ or:
   same ``name`` attribute and optionally using the ``auth`` element to
   provide the authentication credentials to the iSCSI server.

+  :since:`Since 6.6.0`, the optional ``initiator`` sub-element controls the
+  IQN of the initiator ran by the hypervisor via it's ``
   
 
+
+  
+
   
 
   
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 5988a13986..5c23a52031 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -8349,6 +8349,8 @@ virDomainHostdevSubsysSCSIiSCSIDefParseXML(xmlNodePtr 
sourcenode,
 iscsisrc->src->auth = g_steal_pointer();
 }

+virStorageSourceInitiatorParseXML(ctxt, >src->initiator);
+
 if ((flags & VIR_DOMAIN_DEF_PARSE_STATUS) &&
 xmlopt && xmlopt->privateData.storageParse) {
 if ((ctxt->node = virXPathNode("./privateData", ctxt)) &&
@@ -26128,6 +26130,9 @@ virDomainHostdevDefFormatSubsysSCSI(virBufferPtr buf,

 if (iscsisrc->src->auth)
 virStorageAuthDefFormat(, iscsisrc->src->auth);
+
+virStorageSourceInitiatorFormatXML(>src->initiator,
+   );
 } else {
 virBufferAsprintf(, "\n",
   scsihostsrc->adapter);
diff --git a/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-4.1.0.args 
b/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-4.1.0.args
index de4047000e..f2591d6956 100644
--- a/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-4.1.0.args
+++ b/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-4.1.0.args
@@ -60,7 +60,8 @@ 
data=9eao5F8qtkGt+seB1HYivWIxbtwUu6MQtg1zpj/oDtUsPr1q8wBYM91uEHCn6j/1,\
 keyid=masterKey0,iv=AAECAwQFBgcICQoLDA0ODw==,format=base64 \
 -drive file.driver=iscsi,file.portal=example.org:3260,\
 file.target=iqn.1992-01.com.example:storage,file.lun=2,file.transport=tcp,\
-file.user=myname,file.password-secret=hostdev5-secret0,if=none,format=raw,\
+file.user=myname,file.password-secret=hostdev5-secret0,\
+file.initiator-name=iqn.2020-07.com.example:test,if=none,format=raw,\
 id=drive-hostdev5 \
 -device scsi-generic,bus=scsi0.0,channel=0,scsi-id=3,lun=5,\
 drive=drive-hostdev5,id=hostdev5 \
diff --git a/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-latest.args 
b/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-latest.args
index 72980d58b8..f86cbd7314 100644
--- a/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-latest.args
+++ b/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-latest.args
@@ -67,6 +67,7 @@ keyid=masterKey0,iv=AAECAwQFBgcICQoLDA0ODw==,format=base64 \
 -blockdev '{"driver":"iscsi","portal":"example.org:3260",\
 "target":"iqn.1992-01.com.example:storage","lun":2,"transport":"tcp",\
 "user":"myname","password-secret":"hostdev5-secret0",\
+"initiator-name":"iqn.2020-07.com.example:test",\
 "node-name":"libvirt-hostdev5-backend","read-only":false}' \
 -device scsi-generic,bus=scsi0.0,channel=0,scsi-id=3,lun=5,\
 drive=libvirt-hostdev5-backend,id=hostdev5 \
diff --git a/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.xml 
b/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.xml
index 775b678b36..f1caf80644 100644
--- a/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.xml
+++ b/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.xml
@@ -67,6 +67,9 @@
 
   
 
+
+  
+
   
   
 
diff --git a/tests/qemuxml2xmloutdata/hostdev-scsi-virtio-scsi.xml 
b/tests/qemuxml2xmloutdata/hostdev-scsi-virtio-scsi.xml
index 9c823809ab..6c7e22d0c3 100644
--- a/tests/qemuxml2xmloutdata/hostdev-scsi-virtio-scsi.xml
+++ b/tests/qemuxml2xmloutdata/hostdev-scsi-virtio-scsi.xml
@@ -74,6 +74,9 @@
 
   
 
+
+  
+
   
   
 
-- 
2.26.2

[PATCH 00/11] Implement initiator IQN config for iSCSI hostdevs (and cleanups)

2020-07-14 Thread Peter Krempa

Last patch implements the initiator IQN, the rest is cleanups and
preparation.

Note that this series depends on:
https://www.redhat.com/archives/libvir-list/2020-July/msg00567.html
https://www.redhat.com/archives/libvir-list/2020-July/msg00717.html

Peter Krempa (11):
  virDomainHostdevDefFormatSubsys: Use virXMLFormatElement
  virDomainHostdevDefFormatSubsys: Split out formatting of USB subsystem
  virDomainHostdevDefFormatSubsys: Split out formatting of PCI subsystem
  virDomainHostdevDefFormatSubsys: Split out formatting of SCSI
subsystem
  virDomainHostdevDefFormatSubsysSCSI: Avoid ternary operator when
formatting address
  virDomainHostdevDefFormatSubsys: Split out formatting of vHBA
subsystem
  virDomainHostdevDefFormatSubsys: Split out formatting of mdev
subsystem
  virDomainHostdevSubsysSCSIDefParseXML: Use typecasted switch
  virDomainHostdevSubsysSCSIiSCSIDefParseXML: Use XPath to fetch
elements
  docs: formatdomain-devices: Split out '' into separate file
  conf: Add support for initiator IQN setting for iSCSI hostdevs

 docs/formatdomain-devices-hostdev.rst.in  | 344 ++
 docs/formatdomain-devices.rst.in  | 338 +
 docs/schemas/domaincommon.rng |   3 +
 src/conf/domain_conf.c| 329 ++---
 ...hostdev-scsi-virtio-scsi.x86_64-4.1.0.args |   3 +-
 ...ostdev-scsi-virtio-scsi.x86_64-latest.args |   1 +
 .../hostdev-scsi-virtio-scsi.xml  |   3 +
 .../hostdev-scsi-virtio-scsi.xml  |   3 +
 8 files changed, 545 insertions(+), 479 deletions(-)
 create mode 100644 docs/formatdomain-devices-hostdev.rst.in

-- 
2.26.2

[PATCH 01/11] virDomainHostdevDefFormatSubsys: Use virXMLFormatElement

2020-07-14 Thread Peter Krempa

Refactor the formatter to the new multiple buffer based approach so that
we can easily separate it into formatters per subsys type.

Signed-off-by: Peter Krempa 
---
 src/conf/domain_conf.c | 65 ++
 1 file changed, 27 insertions(+), 38 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 8f5a8ef4a4..4b1f27fcea 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -26030,7 +26030,9 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 bool includeTypeInAddr,
 virDomainXMLOptionPtr xmlopt)
 {
-bool closedSource = false;
+g_auto(virBuffer) sourceAttrBuf = VIR_BUFFER_INITIALIZER;
+g_auto(virBuffer) sourceChildBuf = VIR_BUFFER_INIT_CHILD(buf);
+g_auto(virBuffer) origstatesChildBuf = 
VIR_BUFFER_INIT_CHILD();
 virDomainHostdevSubsysUSBPtr usbsrc = >source.subsys.u.usb;
 virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
 virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
@@ -26053,18 +26055,17 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 virBufferAsprintf(buf, "\n", backend);
 }

-virBufferAddLit(buf, "source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB) {
 if (def->startupPolicy) {
 const char *policy;
 policy = virDomainStartupPolicyTypeToString(def->startupPolicy);
-virBufferAsprintf(buf, " startupPolicy='%s'", policy);
+virBufferAsprintf(, " startupPolicy='%s'", policy);
 }
 if (usbsrc->autoAddress && (flags & VIR_DOMAIN_DEF_FORMAT_MIGRATABLE))
-virBufferAddLit(buf, " autoAddress='yes'");
+virBufferAddLit(, " autoAddress='yes'");

 if (def->missing && !(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE))
-virBufferAddLit(buf, " missing='yes'");
+virBufferAddLit(, " missing='yes'");
 }

 if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI &&
@@ -26072,69 +26073,59 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 const char *protocol =
 virDomainHostdevSubsysSCSIProtocolTypeToString(scsisrc->protocol);

-virBufferAsprintf(buf, " protocol='%s' name='%s'",
+virBufferAsprintf(, " protocol='%s' name='%s'",
   protocol, iscsisrc->src->path);
 }

 if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) {
 const char *protocol =
 
virDomainHostdevSubsysSCSIHostProtocolTypeToString(hostsrc->protocol);
-closedSource = true;

-virBufferAsprintf(buf, " protocol='%s' wwpn='%s'/",
+virBufferAsprintf(, " protocol='%s' wwpn='%s'",
   protocol, hostsrc->wwpn);
 }

-virBufferAddLit(buf, ">\n");
-
-virBufferAdjustIndent(buf, 2);
 switch (def->source.subsys.type) {
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB:
 if (usbsrc->vendor) {
-virBufferAsprintf(buf, "\n", usbsrc->vendor);
-virBufferAsprintf(buf, "\n", 
usbsrc->product);
+virBufferAsprintf(, "\n", 
usbsrc->vendor);
+virBufferAsprintf(, "\n", 
usbsrc->product);
 }
 if (usbsrc->bus || usbsrc->device) {
-virBufferAsprintf(buf, "\n",
+virBufferAsprintf(, "\n",
   includeTypeInAddr ? "type='usb' " : "",
   usbsrc->bus, usbsrc->device);
 }
 break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI:
-virPCIDeviceAddressFormat(buf, pcisrc->addr,
+virPCIDeviceAddressFormat(, pcisrc->addr,
   includeTypeInAddr);

-if ((flags & VIR_DOMAIN_DEF_FORMAT_PCI_ORIG_STATES) &&
-(def->origstates.states.pci.unbind_from_stub ||
- def->origstates.states.pci.remove_slot ||
- def->origstates.states.pci.reprobe)) {
-virBufferAddLit(buf, "\n");
-virBufferAdjustIndent(buf, 2);
+if ((flags & VIR_DOMAIN_DEF_FORMAT_PCI_ORIG_STATES)) {
 if (def->origstates.states.pci.unbind_from_stub)
-virBufferAddLit(buf, "\n");
+virBufferAddLit(, "\n");
 if (def->origstates.states.pci.remove_slot)
-virBufferAddLit(buf, "\n");
+virBufferAddLit(, "\n");
 if (def->origstates.states.pci.reprobe)
-virBufferAddLit(buf, "\n");
-virBufferAdjustIndent(buf, -2);
-virBufferAddLit(buf, "\n");
+virBufferAddLit(, "\n");
+virXMLFormatElement(, "origstates", NULL, 
);
 }
 break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI:
 if (scsisrc->protocol == VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) {
-virBufferAddLit(buf, "src->hosts[0].name);
+virBufferAddLit(, "src->hosts[0].name);
 if

[PATCH 09/11] virDomainHostdevSubsysSCSIiSCSIDefParseXML: Use XPath to fetch elements

2020-07-14 Thread Peter Krempa

Conver the code to the new approach which uses XPath to fetch known
elements rather than looping through all XML children.

Signed-off-by: Peter Krempa 
---
 src/conf/domain_conf.c | 51 --
 1 file changed, 19 insertions(+), 32 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index c8ac4380c4..5988a13986 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -8295,9 +8295,12 @@ virDomainHostdevSubsysSCSIiSCSIDefParseXML(xmlNodePtr 
sourcenode,
virDomainXMLOptionPtr xmlopt)
 {
 int auth_secret_usage = -1;
-xmlNodePtr cur;
 virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = >u.iscsi;
 g_autoptr(virStorageAuthDef) authdef = NULL;
+xmlNodePtr node;
+VIR_XPATH_NODE_AUTORESTORE(ctxt);
+
+ctxt->node = sourcenode;

 /* For the purposes of command line creation, this needs to look
  * like a disk storage source */
@@ -8328,42 +8331,26 @@ virDomainHostdevSubsysSCSIiSCSIDefParseXML(xmlNodePtr 
sourcenode,
 return -1;
 }

-cur = sourcenode->children;
-while (cur != NULL) {
-if (cur->type == XML_ELEMENT_NODE &&
-virXMLNodeNameEqual(cur, "auth")) {
-if (iscsisrc->src->auth) {
-virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
-   _("an  definition already found for "
- "the  iSCSI definition"));
-return -1;
-}
-if (!(authdef = virStorageAuthDefParse(cur, ctxt)))
-return -1;
-if ((auth_secret_usage =
- virSecretUsageTypeFromString(authdef->secrettype)) < 0) {
-virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
-   _("invalid secret type %s"),
-   authdef->secrettype);
-return -1;
-}
-if (auth_secret_usage != VIR_SECRET_USAGE_TYPE_ISCSI) {
-virReportError(VIR_ERR_INTERNAL_ERROR,
-   _("hostdev invalid secret type '%s'"),
-   authdef->secrettype);
-return -1;
-}
-iscsisrc->src->auth = g_steal_pointer();
+if ((node = virXPathNode("./auth", ctxt))) {
+if (!(authdef = virStorageAuthDefParse(node, ctxt)))
+return -1;
+if ((auth_secret_usage = 
virSecretUsageTypeFromString(authdef->secrettype)) < 0) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+   _("invalid secret type %s"),
+   authdef->secrettype);
+return -1;
 }
-cur = cur->next;
+if (auth_secret_usage != VIR_SECRET_USAGE_TYPE_ISCSI) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("hostdev invalid secret type '%s'"),
+   authdef->secrettype);
+return -1;
+}
+iscsisrc->src->auth = g_steal_pointer();
 }

 if ((flags & VIR_DOMAIN_DEF_PARSE_STATUS) &&
 xmlopt && xmlopt->privateData.storageParse) {
-VIR_XPATH_NODE_AUTORESTORE(ctxt);
-
-ctxt->node = sourcenode;
-
 if ((ctxt->node = virXPathNode("./privateData", ctxt)) &&
 xmlopt->privateData.storageParse(ctxt, iscsisrc->src) < 0)
 return -1;
-- 
2.26.2

[PATCH 08/11] virDomainHostdevSubsysSCSIDefParseXML: Use typecasted switch

2020-07-14 Thread Peter Krempa

Use a switch statement which will not be omitted when adding potential
new types.

Signed-off-by: Peter Krempa 
---
 src/conf/domain_conf.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index e9fee10e31..c8ac4380c4 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -8392,11 +8392,21 @@ virDomainHostdevSubsysSCSIDefParseXML(xmlNodePtr 
sourcenode,
 }
 }

-if (scsisrc->protocol == VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI)
+switch ((virDomainHostdevSCSIProtocolType) scsisrc->protocol) {
+case VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_NONE:
+return virDomainHostdevSubsysSCSIHostDefParseXML(sourcenode, scsisrc);
+
+case VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI:
 return virDomainHostdevSubsysSCSIiSCSIDefParseXML(sourcenode, scsisrc, 
ctxt,
   flags, xmlopt);

-return virDomainHostdevSubsysSCSIHostDefParseXML(sourcenode, scsisrc);
+case VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_LAST:
+default:
+virReportEnumRangeError(virDomainHostdevSCSIProtocolType, 
scsisrc->protocol);
+return -1;
+}
+
+return 0;
 }

 static int
-- 
2.26.2

[PATCH 02/11] virDomainHostdevDefFormatSubsys: Split out formatting of USB subsystem

2020-07-14 Thread Peter Krempa

Separate out bits related to USB so that the logic isn't entangled in
multiple conditional statements.

Signed-off-by: Peter Krempa 
---
 src/conf/domain_conf.c | 73 +-
 1 file changed, 51 insertions(+), 22 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 4b1f27fcea..7718a59c66 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -26023,6 +26023,40 @@ virDomainNetIPInfoFormat(virBufferPtr buf,
 }


+static void
+virDomainHostdevDefFormatSubsysUSB(virBufferPtr buf,
+   virDomainHostdevDefPtr def,
+   unsigned int flags,
+   bool includeTypeInAddr)
+{
+g_auto(virBuffer) sourceAttrBuf = VIR_BUFFER_INITIALIZER;
+g_auto(virBuffer) sourceChildBuf = VIR_BUFFER_INIT_CHILD(buf);
+virDomainHostdevSubsysUSBPtr usbsrc = >source.subsys.u.usb;
+
+if (def->startupPolicy)
+virBufferAsprintf(, " startupPolicy='%s'",
+  
virDomainStartupPolicyTypeToString(def->startupPolicy));
+
+if (usbsrc->autoAddress && (flags & VIR_DOMAIN_DEF_FORMAT_MIGRATABLE))
+virBufferAddLit(, " autoAddress='yes'");
+
+if (def->missing && !(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE))
+virBufferAddLit(, " missing='yes'");
+
+if (usbsrc->vendor) {
+virBufferAsprintf(, "\n", 
usbsrc->vendor);
+virBufferAsprintf(, "\n", 
usbsrc->product);
+}
+
+if (usbsrc->bus || usbsrc->device)
+virBufferAsprintf(, "\n",
+  includeTypeInAddr ? "type='usb' " : "",
+  usbsrc->bus, usbsrc->device);
+
+virXMLFormatElement(buf, "source", , );
+}
+
+
 static int
 virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 virDomainHostdevDefPtr def,
@@ -26033,7 +26067,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 g_auto(virBuffer) sourceAttrBuf = VIR_BUFFER_INITIALIZER;
 g_auto(virBuffer) sourceChildBuf = VIR_BUFFER_INIT_CHILD(buf);
 g_auto(virBuffer) origstatesChildBuf = 
VIR_BUFFER_INIT_CHILD();
-virDomainHostdevSubsysUSBPtr usbsrc = >source.subsys.u.usb;
 virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
 virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
 virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
@@ -26041,6 +26074,23 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 virDomainHostdevSubsysSCSIHostPtr scsihostsrc = >u.host;
 virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = >u.iscsi;

+switch ((virDomainHostdevSubsysType) def->source.subsys.type) {
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB:
+virDomainHostdevDefFormatSubsysUSB(buf, def, flags, includeTypeInAddr);
+return 0;
+
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI:
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI:
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST:
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+break;
+
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
+default:
+virReportEnumRangeError(virDomainHostdevSubsysType, 
def->source.subsys.type);
+return -1;
+}
+
 if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI &&
 pcisrc->backend != VIR_DOMAIN_HOSTDEV_PCI_BACKEND_DEFAULT) {
 const char *backend =
@@ -26055,18 +26105,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 virBufferAsprintf(buf, "\n", backend);
 }

-if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB) {
-if (def->startupPolicy) {
-const char *policy;
-policy = virDomainStartupPolicyTypeToString(def->startupPolicy);
-virBufferAsprintf(, " startupPolicy='%s'", policy);
-}
-if (usbsrc->autoAddress && (flags & VIR_DOMAIN_DEF_FORMAT_MIGRATABLE))
-virBufferAddLit(, " autoAddress='yes'");
-
-if (def->missing && !(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE))
-virBufferAddLit(, " missing='yes'");
-}

 if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI &&
 scsisrc->protocol == VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) {
@@ -26087,15 +26125,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,

 switch (def->source.subsys.type) {
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB:
-if (usbsrc->vendor) {
-virBufferAsprintf(, "\n", 
usbsrc->vendor);
-virBufferAsprintf(, "\n", 
usbsrc->product);
-}
-if (usbsrc->bus || usbsrc->device) {
-virBufferAsprintf(, "\n",
-  includeTypeInAddr ? "type='usb' " : "",
-  usbsrc->bus, usbsrc->device);
-}
 break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI:
 virPCIDeviceAddressFormat(, pcisrc->addr,
-- 
2.26.2

[PATCH 04/11] virDomainHostdevDefFormatSubsys: Split out formatting of SCSI subsystem

2020-07-14 Thread Peter Krempa

Similarly to previous commit split out formatting of the SCSI subsystem
related stuff.

Signed-off-by: Peter Krempa 
---
 src/conf/domain_conf.c | 83 --
 1 file changed, 47 insertions(+), 36 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index d4a2aa7623..b337bba534 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -26101,6 +26101,51 @@ virDomainHostdevDefFormatSubsysPCI(virBufferPtr buf,
 }


+static int
+virDomainHostdevDefFormatSubsysSCSI(virBufferPtr buf,
+virDomainHostdevDefPtr def,
+unsigned int flags,
+bool includeTypeInAddr,
+virDomainXMLOptionPtr xmlopt)
+{
+g_auto(virBuffer) sourceAttrBuf = VIR_BUFFER_INITIALIZER;
+g_auto(virBuffer) sourceChildBuf = VIR_BUFFER_INIT_CHILD(buf);
+virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
+virDomainHostdevSubsysSCSIHostPtr scsihostsrc = >u.host;
+virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = >u.iscsi;
+
+if (scsisrc->protocol == VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) {
+virBufferAsprintf(, " protocol='%s' name='%s'",
+  
virDomainHostdevSubsysSCSIProtocolTypeToString(scsisrc->protocol),
+  iscsisrc->src->path);
+
+virBufferAddLit(, "src->hosts[0].name);
+if (iscsisrc->src->hosts[0].port)
+virBufferAsprintf(, " port='%u'", 
iscsisrc->src->hosts[0].port);
+virBufferAddLit(, "/>\n");
+
+if (virDomainDiskSourceFormatPrivateData(, 
iscsisrc->src,
+ flags, xmlopt) < 0)
+return -1;
+
+if (iscsisrc->src->auth)
+virStorageAuthDefFormat(, iscsisrc->src->auth);
+} else {
+virBufferAsprintf(, "\n",
+  scsihostsrc->adapter);
+virBufferAsprintf(,
+  "\n",
+  includeTypeInAddr ? "type='scsi' " : "",
+  scsihostsrc->bus, scsihostsrc->target,
+  scsihostsrc->unit);
+}
+
+virXMLFormatElement(buf, "source", , );
+return 0;
+}
+
+
 static int
 virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 virDomainHostdevDefPtr def,
@@ -26110,11 +26155,8 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 {
 g_auto(virBuffer) sourceAttrBuf = VIR_BUFFER_INITIALIZER;
 g_auto(virBuffer) sourceChildBuf = VIR_BUFFER_INIT_CHILD(buf);
-virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
 virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
 virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
-virDomainHostdevSubsysSCSIHostPtr scsihostsrc = >u.host;
-virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = >u.iscsi;

 switch ((virDomainHostdevSubsysType) def->source.subsys.type) {
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB:
@@ -26125,6 +26167,8 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 return virDomainHostdevDefFormatSubsysPCI(buf, def, flags, 
includeTypeInAddr);

 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI:
+return virDomainHostdevDefFormatSubsysSCSI(buf, def, flags, 
includeTypeInAddr, xmlopt);
+
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST:
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
 break;
@@ -26136,15 +26180,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 }


-if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI &&
-scsisrc->protocol == VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) {
-const char *protocol =
-virDomainHostdevSubsysSCSIProtocolTypeToString(scsisrc->protocol);
-
-virBufferAsprintf(, " protocol='%s' name='%s'",
-  protocol, iscsisrc->src->path);
-}
-
 if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST) {
 const char *protocol =
 
virDomainHostdevSubsysSCSIHostProtocolTypeToString(hostsrc->protocol);
@@ -26159,25 +26194,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI:
 break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI:
-if (scsisrc->protocol == VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) {
-virBufferAddLit(, "src->hosts[0].name);
-if (iscsisrc->src->hosts[0].port)
-virBufferAsprintf(, " port='%u'", 
iscsisrc->src->hosts[0].port);
-virBufferAddLit(, "/>\n");
-
-if (virDomainDiskSourceFormatPrivateData(, 
iscsisrc->src,
- flags, xmlopt) < 0)
-return -1;
-} else {
-virBufferAsprintf(, "\n",
-  scsihostsrc->adapter);
-virBufferAsprintf(,
-

[PATCH 05/11] virDomainHostdevDefFormatSubsysSCSI: Avoid ternary operator when formatting address

2020-07-14 Thread Peter Krempa

Split up formatting of the '' element rather that trying to
optimize it with formatting string hacks.

Signed-off-by: Peter Krempa 
---
 src/conf/domain_conf.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index b337bba534..7b181e0587 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -26134,11 +26134,13 @@ virDomainHostdevDefFormatSubsysSCSI(virBufferPtr buf,
 } else {
 virBufferAsprintf(, "\n",
   scsihostsrc->adapter);
-virBufferAsprintf(,
-  "\n",
-  includeTypeInAddr ? "type='scsi' " : "",
-  scsihostsrc->bus, scsihostsrc->target,
-  scsihostsrc->unit);
+
+virBufferAddLit(, "bus, scsihostsrc->target, 
scsihostsrc->unit);
+virBufferAddLit(, "/>\n");
 }

 virXMLFormatElement(buf, "source", , );
-- 
2.26.2

[PATCH 03/11] virDomainHostdevDefFormatSubsys: Split out formatting of PCI subsystem

2020-07-14 Thread Peter Krempa

Similarly to previous commit split out formatting of the PCI subsystem
related stuff.

Signed-off-by: Peter Krempa 
---
 src/conf/domain_conf.c | 74 ++
 1 file changed, 46 insertions(+), 28 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 7718a59c66..d4a2aa7623 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -26057,6 +26057,50 @@ virDomainHostdevDefFormatSubsysUSB(virBufferPtr buf,
 }


+static int
+virDomainHostdevDefFormatSubsysPCI(virBufferPtr buf,
+   virDomainHostdevDefPtr def,
+   unsigned int flags,
+   bool includeTypeInAddr)
+{
+g_auto(virBuffer) sourceAttrBuf = VIR_BUFFER_INITIALIZER;
+g_auto(virBuffer) sourceChildBuf = VIR_BUFFER_INIT_CHILD(buf);
+g_auto(virBuffer) origstatesChildBuf = 
VIR_BUFFER_INIT_CHILD();
+virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
+
+if (pcisrc->backend != VIR_DOMAIN_HOSTDEV_PCI_BACKEND_DEFAULT) {
+const char *backend = 
virDomainHostdevSubsysPCIBackendTypeToString(pcisrc->backend);
+
+if (!backend) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("unexpected pci hostdev driver name type %d"),
+   pcisrc->backend);
+return -1;
+}
+
+virBufferAsprintf(buf, "\n", backend);
+}
+
+virPCIDeviceAddressFormat(, pcisrc->addr, 
includeTypeInAddr);
+
+if ((flags & VIR_DOMAIN_DEF_FORMAT_PCI_ORIG_STATES)) {
+if (def->origstates.states.pci.unbind_from_stub)
+virBufferAddLit(, "\n");
+
+if (def->origstates.states.pci.remove_slot)
+virBufferAddLit(, "\n");
+
+if (def->origstates.states.pci.reprobe)
+virBufferAddLit(, "\n");
+
+virXMLFormatElement(, "origstates", NULL, 
);
+}
+
+virXMLFormatElement(buf, "source", , );
+return 0;
+}
+
+
 static int
 virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 virDomainHostdevDefPtr def,
@@ -26066,8 +26110,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 {
 g_auto(virBuffer) sourceAttrBuf = VIR_BUFFER_INITIALIZER;
 g_auto(virBuffer) sourceChildBuf = VIR_BUFFER_INIT_CHILD(buf);
-g_auto(virBuffer) origstatesChildBuf = 
VIR_BUFFER_INIT_CHILD();
-virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
 virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
 virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
 virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
@@ -26080,6 +26122,8 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 return 0;

 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI:
+return virDomainHostdevDefFormatSubsysPCI(buf, def, flags, 
includeTypeInAddr);
+
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI:
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST:
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
@@ -26091,20 +26135,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 return -1;
 }

-if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI &&
-pcisrc->backend != VIR_DOMAIN_HOSTDEV_PCI_BACKEND_DEFAULT) {
-const char *backend =
-virDomainHostdevSubsysPCIBackendTypeToString(pcisrc->backend);
-
-if (!backend) {
-virReportError(VIR_ERR_INTERNAL_ERROR,
-   _("unexpected pci hostdev driver name type %d"),
-   pcisrc->backend);
-return -1;
-}
-virBufferAsprintf(buf, "\n", backend);
-}
-

 if (def->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI &&
 scsisrc->protocol == VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) {
@@ -26127,18 +26157,6 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB:
 break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI:
-virPCIDeviceAddressFormat(, pcisrc->addr,
-  includeTypeInAddr);
-
-if ((flags & VIR_DOMAIN_DEF_FORMAT_PCI_ORIG_STATES)) {
-if (def->origstates.states.pci.unbind_from_stub)
-virBufferAddLit(, "\n");
-if (def->origstates.states.pci.remove_slot)
-virBufferAddLit(, "\n");
-if (def->origstates.states.pci.reprobe)
-virBufferAddLit(, "\n");
-virXMLFormatElement(, "origstates", NULL, 
);
-}
 break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI:
 if (scsisrc->protocol == VIR_DOMAIN_HOSTDEV_SCSI_PROTOCOL_TYPE_ISCSI) {
-- 
2.26.2

[libvirt PATCH 1/2] ci: Drop Debian 9 jobs

2020-07-14 Thread Andrea Bolognani

The esisting cross-compilation jobs are carefully redistributed
among Debian 10 and Debian sid to ensure we don't use the latter
for aarch64, mipsel or mips64el, since those architectures are
currently broken.

Signed-off-by: Andrea Bolognani 
---
 .gitlab-ci.yml | 74 --
 1 file changed, 12 insertions(+), 62 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 5565750b7e..702198ec8e 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -125,11 +125,6 @@ x64-centos-stream-container:
   variables:
 NAME: centos-stream
 
-x64-debian-9-container:
-  <<: *container_job_definition
-  variables:
-NAME: debian-9
-
 x64-debian-10-container:
   <<: *container_job_definition
   variables:
@@ -173,53 +168,13 @@ x64-ubuntu-2004-container:
 
 # Cross-build containers build jobs
 
-aarch64-debian-9-container:
-  <<: *container_optional_job_definition
-  variables:
-NAME: debian-9-cross-aarch64
-
-armv6l-debian-9-container:
-  <<: *container_job_definition
-  variables:
-NAME: debian-9-cross-armv6l
-
-armv7l-debian-9-container:
-  <<: *container_optional_job_definition
-  variables:
-NAME: debian-9-cross-armv7l
-
-mips-debian-9-container:
-  <<: *container_job_definition
-  variables:
-NAME: debian-9-cross-mips
-
-mips64el-debian-9-container:
-  <<: *container_job_definition
-  variables:
-NAME: debian-9-cross-mips64el
-
-mipsel-debian-9-container:
-  <<: *container_optional_job_definition
-  variables:
-NAME: debian-9-cross-mipsel
-
-ppc64le-debian-9-container:
-  <<: *container_optional_job_definition
-  variables:
-NAME: debian-9-cross-ppc64le
-
-s390x-debian-9-container:
-  <<: *container_optional_job_definition
-  variables:
-NAME: debian-9-cross-s390x
-
 aarch64-debian-10-container:
   <<: *container_job_definition
   variables:
 NAME: debian-10-cross-aarch64
 
 armv6l-debian-10-container:
-  <<: *container_optional_job_definition
+  <<: *container_job_definition
   variables:
 NAME: debian-10-cross-armv6l
 
@@ -234,12 +189,12 @@ i686-debian-10-container:
 NAME: debian-10-cross-i686
 
 mips-debian-10-container:
-  <<: *container_optional_job_definition
+  <<: *container_job_definition
   variables:
 NAME: debian-10-cross-mips
 
 mips64el-debian-10-container:
-  <<: *container_optional_job_definition
+  <<: *container_job_definition
   variables:
 NAME: debian-10-cross-mips64el
 
@@ -254,7 +209,7 @@ ppc64le-debian-10-container:
 NAME: debian-10-cross-ppc64le
 
 s390x-debian-10-container:
-  <<: *container_job_definition
+  <<: *container_optional_job_definition
   variables:
 NAME: debian-10-cross-s390x
 
@@ -311,11 +266,6 @@ mingw64-fedora-rawhide-container:
 
 # Native architecture build + test jobs
 
-x64-debian-9:
-  <<: *native_build_job_definition
-  variables:
-NAME: debian-9
-
 x64-debian-10:
   <<: *native_build_job_definition
   variables:
@@ -394,22 +344,22 @@ x64-macos-1015-build:
 
 # Cross compiled build jobs
 
-armv6l-debian-9:
+armv6l-debian-10:
   <<: *cross_build_job_definition
   variables:
-NAME: debian-9
+NAME: debian-10
 CROSS: armv6l
 
-mips64el-debian-9:
+mips64el-debian-10:
   <<: *cross_build_job_definition
   variables:
-NAME: debian-9
+NAME: debian-10
 CROSS: mips64el
 
-mips-debian-9:
+mips-debian-10:
   <<: *cross_build_job_definition
   variables:
-NAME: debian-9
+NAME: debian-10
 CROSS: mips
 
 aarch64-debian-10:
@@ -424,10 +374,10 @@ mipsel-debian-10:
 NAME: debian-10
 CROSS: mipsel
 
-s390x-debian-10:
+s390x-debian-sid:
   <<: *cross_build_job_definition
   variables:
-NAME: debian-10
+NAME: debian-sid
 CROSS: s390x
 
 armv7l-debian-sid:
-- 
2.25.4

[libvirt PATCH 0/2] ci: Drop Debian 9

2020-07-14 Thread Andrea Bolognani

The errors caused by it going EOL last week are currently blocking
all CI jobs.

Andrea Bolognani (2):
  ci: Drop Debian 9 jobs
  ci: Drop Debian 9 containers

 .gitlab-ci.yml|  74 ++
 .../libvirt-debian-9-cross-aarch64.Dockerfile | 128 --
 .../libvirt-debian-9-cross-armv6l.Dockerfile  | 126 -
 .../libvirt-debian-9-cross-armv7l.Dockerfile  | 127 -
 .../libvirt-debian-9-cross-mips.Dockerfile| 127 -
 ...libvirt-debian-9-cross-mips64el.Dockerfile | 127 -
 .../libvirt-debian-9-cross-mipsel.Dockerfile  | 127 -
 .../libvirt-debian-9-cross-ppc64le.Dockerfile | 127 -
 .../libvirt-debian-9-cross-s390x.Dockerfile   | 127 -
 ci/containers/libvirt-debian-9.Dockerfile | 118 
 10 files changed, 12 insertions(+), 1196 deletions(-)
 delete mode 100644 ci/containers/libvirt-debian-9-cross-aarch64.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-armv6l.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-armv7l.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-mips.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-mips64el.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-mipsel.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-ppc64le.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-s390x.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9.Dockerfile

-- 
2.25.4

[libvirt PATCH 2/2] ci: Drop Debian 9 containers

2020-07-14 Thread Andrea Bolognani

The corresponding libvirt-ci commit is 5abf5e7e2326.

Signed-off-by: Andrea Bolognani 
---
 .../libvirt-debian-9-cross-aarch64.Dockerfile | 128 --
 .../libvirt-debian-9-cross-armv6l.Dockerfile  | 126 -
 .../libvirt-debian-9-cross-armv7l.Dockerfile  | 127 -
 .../libvirt-debian-9-cross-mips.Dockerfile| 127 -
 ...libvirt-debian-9-cross-mips64el.Dockerfile | 127 -
 .../libvirt-debian-9-cross-mipsel.Dockerfile  | 127 -
 .../libvirt-debian-9-cross-ppc64le.Dockerfile | 127 -
 .../libvirt-debian-9-cross-s390x.Dockerfile   | 127 -
 ci/containers/libvirt-debian-9.Dockerfile | 118 
 9 files changed, 1134 deletions(-)
 delete mode 100644 ci/containers/libvirt-debian-9-cross-aarch64.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-armv6l.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-armv7l.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-mips.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-mips64el.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-mipsel.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-ppc64le.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9-cross-s390x.Dockerfile
 delete mode 100644 ci/containers/libvirt-debian-9.Dockerfile

diff --git a/ci/containers/libvirt-debian-9-cross-aarch64.Dockerfile 
b/ci/containers/libvirt-debian-9-cross-aarch64.Dockerfile
deleted file mode 100644
index 8f7f794c9c..00
--- a/ci/containers/libvirt-debian-9-cross-aarch64.Dockerfile
+++ /dev/null
@@ -1,128 +0,0 @@
-FROM debian:9
-
-RUN export DEBIAN_FRONTEND=noninteractive && \
-apt-get update && \
-apt-get dist-upgrade -y && \
-apt-get install --no-install-recommends -y \
-augeas-lenses \
-augeas-tools \
-autoconf \
-automake \
-autopoint \
-bash \
-bash-completion \
-ca-certificates \
-ccache \
-chrony \
-cpanminus \
-dnsmasq-base \
-dwarves \
-ebtables \
-flake8 \
-gcc \
-gdb \
-gettext \
-git \
-iproute2 \
-kmod \
-libc-dev-bin \
-libtool \
-libtool-bin \
-libxml2-utils \
-locales \
-lsof \
-lvm2 \
-make \
-net-tools \
-nfs-common \
-ninja-build \
-numad \
-open-iscsi \
-parted \
-patch \
-perl \
-pkgconf \
-policykit-1 \
-python3 \
-python3-docutils \
-python3-pip \
-python3-setuptools \
-python3-wheel \
-qemu-utils \
-radvd \
-screen \
-scrub \
-sheepdog \
-strace \
-sudo \
-vim \
-xsltproc \
-xz-utils \
-zfs-fuse && \
-apt-get autoremove -y && \
-apt-get autoclean -y && \
-sed -Ei 's,^# (en_US\.UTF-8 .*)$,\1,' /etc/locale.gen && \
-dpkg-reconfigure locales && \
-mkdir -p /usr/libexec/ccache-wrappers && \
-ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/aarch64-linux-gnu-cc && 
\
-ln -s /usr/bin/ccache 
/usr/libexec/ccache-wrappers/aarch64-linux-gnu-$(basename /usr/bin/gcc)
-
-RUN export DEBIAN_FRONTEND=noninteractive && \
-dpkg --add-architecture arm64 && \
-apt-get update && \
-apt-get dist-upgrade -y && \
-apt-get install --no-install-recommends -y dpkg-dev && \
-apt-get install --no-install-recommends -y \
-gcc-aarch64-linux-gnu \
-glusterfs-common:arm64 \
-libacl1-dev:arm64 \
-libapparmor-dev:arm64 \
-libattr1-dev:arm64 \
-libaudit-dev:arm64 \
-libavahi-client-dev:arm64 \
-libblkid-dev:arm64 \
-libc6-dev:arm64 \
-libcap-ng-dev:arm64 \
-libcurl4-gnutls-dev:arm64 \
-libdbus-1-dev:arm64 \
-libdevmapper-dev:arm64 \
-libfuse-dev:arm64 \
-libglib2.0-dev:arm64 \
-libgnutls28-dev:arm64 \
-libiscsi-dev:arm64 \
-libncurses5-dev:arm64 \
-libnl-3-dev:arm64 \
-libnl-route-3-dev:arm64 \
-libnuma-dev:arm64 \
-libparted-dev:arm64 \
-libpcap0.8-dev:arm64 \
-libpciaccess-dev:arm64 \
-librbd-dev:arm64 \
-libreadline-dev:arm64 \
-libsanlock-dev:arm64 \
-libsasl2-dev:arm64 \
-libselinux1-dev:arm64 \
-libssh-gcrypt-dev:arm64 \
-libssh2-1-dev:arm64 \
-

Re: [libvirt PATCH v2 00/15] convert network and nwfilter directories to glib memory allocation.

2020-07-14 Thread Laine Stump


ping

On 7/7/20 5:08 PM, Laine Stump wrote:

V1 was here:

https://www.redhat.com/archives/libvir-list/2020-June/msg01156.html

Some patches were ACKed and pushed. I re-ordered/re-organized most of
the rest, and removed some others to deal with separately (the
xmlNodeContent stuff)

What's left here is a few preliminary patches, then the standard set,
once for network and again for nwfilter:

1) convert from VIR_(RE)ALLOC(_N) to g_new0()/g_renew()
2) use g_auto*() where appropriate, removing unneeded free's
3) get rid of now-extraneous labels
4) (controversial) replace any remaining VIR_FREE() with g_free() (and
possibly g_clear_pointer() when needed

NB: these patches require my virBuffer "convert to g_auto" series
as a prerequisite:

   https://www.redhat.com/archives/libvir-list/2020-July/msg00185.html



^^ This has been pushed, so there are no longer any extra prerequisites.




Changes from V1:

   * move conversion of virFirewall and virBuffer automatics to another
 series (see above)
   
   * re-order to replace VIR_ALLOC first (without adding any g_auto*)

 instead of doing it after g_auto conversion of automatics, then do
 all g_auto additions at o

   * separate label elimination into separate patches per jtomko's
 suggestion.


Laine Stump (15):
   replace g_new() with g_new0() for consistency
   util: define g_autoptr cleanups for a couple dnsmasq objects
   define g_autoptr cleanup function for virNetworkDHCPLease
   network: replace VIR_ALLOC/REALLOC with g_new0/g_renew
   network: use g_auto wherever appropriate
   network: eliminate unnecessary labels
   network: use g_free() in place of remaining VIR_FREE()
   nwfilter: remove unnecessary code from ebtablesGetSubChainInsts()
   nwfilter: clear nrules when resetting virNWFilterInst
   nwfilter: define a typedef for struct ebtablesSubChainInst
   nwfilter: transform logic in virNWFilterRuleInstSort to eliminate
 label
   nwfilter: use standard label names when reasonable
   nwfilter: replace VIR_ALLOC with g_new0
   nwfilter: convert local pointers to use g_auto*
   nwfilter: convert remaining VIR_FREE() to g_free()

  src/datatypes.h   |   2 +
  src/network/bridge_driver.c   | 536 --
  src/network/bridge_driver_linux.c |  22 +-
  src/network/leaseshelper.c|  16 +-
  src/nwfilter/nwfilter_dhcpsnoop.c | 150 +++---
  src/nwfilter/nwfilter_driver.c|  13 +-
  src/nwfilter/nwfilter_ebiptables_driver.c | 119 ++---
  src/nwfilter/nwfilter_gentech_driver.c|  57 ++-
  src/nwfilter/nwfilter_learnipaddr.c   |  43 +-
  src/qemu/qemu_backup.c|   2 +-
  src/util/virdnsmasq.h |   4 +
  src/util/virutil.c|   2 +-
  tests/qemuhotplugmock.c   |   2 +-
  13 files changed, 379 insertions(+), 589 deletions(-)

Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Dr. David Alan Gilbert

* Alex Williamson (alex.william...@redhat.com) wrote:
> On Tue, 14 Jul 2020 11:21:29 +0100
> Daniel P. BerrangÃ©  wrote:
> 
> > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:
> > > hi folks,
> > > we are defining a device migration compatibility interface that helps 
> > > upper
> > > layer stack like openstack/ovirt/libvirt to check if two devices are
> > > live migration compatible.
> > > The "devices" here could be MDEVs, physical devices, or hybrid of the two.
> > > e.g. we could use it to check whether
> > > - a src MDEV can migrate to a target MDEV,
> > > - a src VF in SRIOV can migrate to a target VF in SRIOV,
> > > - a src MDEV can migration to a target VF in SRIOV.
> > >   (e.g. SIOV/SRIOV backward compatibility case)
> > > 
> > > The upper layer stack could use this interface as the last step to check
> > > if one device is able to migrate to another device before triggering a 
> > > real
> > > live migration procedure.
> > > we are not sure if this interface is of value or help to you. please don't
> > > hesitate to drop your valuable comments.
> > > 
> > > 
> > > (1) interface definition
> > > The interface is defined in below way:
> > > 
> > >  __userspace
> > >   /\  \
> > >  / \write
> > > / read  \
> > >/__   ___\|/_
> > >   | migration_version | | migration_version |-->check migration
> > >   - -   compatibility
> > >  device Adevice B
> > > 
> > > 
> > > a device attribute named migration_version is defined under each device's
> > > sysfs node. e.g. 
> > > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version).
> > > userspace tools read the migration_version as a string from the source 
> > > device,
> > > and write it to the migration_version sysfs attribute in the target 
> > > device.
> > > 
> > > The userspace should treat ANY of below conditions as two devices not 
> > > compatible:
> > > - any one of the two devices does not have a migration_version attribute
> > > - error when reading from migration_version attribute of one device
> > > - error when writing migration_version string of one device to
> > >   migration_version attribute of the other device
> > > 
> > > The string read from migration_version attribute is defined by device 
> > > vendor
> > > driver and is completely opaque to the userspace.
> > > for a Intel vGPU, string format can be defined like
> > > "parent device PCI ID" + "version of gvt driver" + "mdev type" + 
> > > "aggregator count".
> > > 
> > > for an NVMe VF connecting to a remote storage. it could be
> > > "PCI ID" + "driver version" + "configured remote storage URL"
> > > 
> > > for a QAT VF, it may be
> > > "PCI ID" + "driver version" + "supported encryption set".
> > > 
> > > (to avoid namespace confliction from each vendor, we may prefix a driver 
> > > name to
> > > each migration_version string. e.g. i915-v1-8086-591d-i915-GVTg_V5_8-1)
> 
> It's very strange to define it as opaque and then proceed to describe
> the contents of that opaque string.  The point is that its contents
> are defined by the vendor driver to describe the device, driver version,
> and possibly metadata about the configuration of the device.  One
> instance of a device might generate a different string from another.
> The string that a device produces is not necessarily the only string
> the vendor driver will accept, for example the driver might support
> backwards compatible migrations.

(As I've said in the previous discussion, off one of the patch series)

My view is it makes sense to have a half-way house on the opaqueness of
this string; I'd expect to have an ID and version that are human
readable, maybe a device ID/name that's human interpretable and then a
bunch of other cruft that maybe device/vendor/version specific.

I'm thinking that we want to be able to report problems and include the
string and the user to be able to easily identify the device that was
complaining and notice a difference in versions, and perhaps also use
it in compatibility patterns to find compatible hosts; but that does
get tricky when it's a 'ask the device if it's compatible'.

Dave

> > > (2) backgrounds
> > > 
> > > The reason we hope the migration_version string is opaque to the userspace
> > > is that it is hard to generalize standard comparing fields and comparing
> > > methods for different devices from different vendors.
> > > Though userspace now could still do a simple string compare to check if
> > > two devices are compatible, and result should also be right, it's still
> > > too limited as it excludes the possible candidate whose migration_version
> > > string fails to be equal.
> > > e.g. an MDEV with mdev_type_1, aggregator count 3 is probably compatible
> > > with another MDEV with mdev_type_3, aggregator count 1, even their
> > > migration_version strings

Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Alex Williamson

On Tue, 14 Jul 2020 13:33:24 +0100
Sean Mooney  wrote:

> On Tue, 2020-07-14 at 11:21 +0100, Daniel P. Berrangé wrote:
> > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:  
> > > hi folks,
> > > we are defining a device migration compatibility interface that helps 
> > > upper
> > > layer stack like openstack/ovirt/libvirt to check if two devices are
> > > live migration compatible.
> > > The "devices" here could be MDEVs, physical devices, or hybrid of the two.
> > > e.g. we could use it to check whether
> > > - a src MDEV can migrate to a target MDEV,  
> mdev live migration is completely possible to do but i agree with Dan 
> barrange's comments
> from the point of view of openstack integration i dont see calling out to a 
> vender sepecific
> tool to be an accpetable

As I replied to Dan, I'm hoping Yan was referring more to vendor
specific knowledge rather than actual tools.

> solutions for device compatiablity checking. the sys filesystem
> that describs the mdevs that can be created shoudl also
> contain the relevent infomation such
> taht nova could integrate it via libvirt xml representation or directly 
> retrive the
> info from
> sysfs.
> > > - a src VF in SRIOV can migrate to a target VF in SRIOV,  
> so vf to vf migration is not possible in the general case as there is no 
> standarised
> way to transfer teh device state as part of the siorv specs produced by the 
> pci-sig
> as such there is not vender neutral way to support sriov live migration. 

We're not talking about a general case, we're talking about physical
devices which have vfio wrappers or hooks with device specific
knowledge in order to support the vfio migration interface.  The point
is that a discussion around vfio device migration cannot be limited to
mdev devices.

> > > - a src MDEV can migration to a target VF in SRIOV.  
> that also makes this unviable
> > >   (e.g. SIOV/SRIOV backward compatibility case)
> > > 
> > > The upper layer stack could use this interface as the last step to check
> > > if one device is able to migrate to another device before triggering a 
> > > real
> > > live migration procedure.  
> well actully that is already too late really. ideally we would want to do 
> this compaiablity
> check much sooneer to avoid the migration failing. in an openstack envionment 
>  at least
> by the time we invoke libvirt (assuming your using the libvirt driver) to do 
> the migration we have alreaedy
> finished schduling the instance to the new host. if if we do the 
> compatiablity check at this point
> and it fails then the live migration is aborted and will not be retired. 
> These types of late check lead to a
> poor user experince as unless you check the migration detial it basically 
> looks like the migration was ignored
> as it start to migrate and then continuge running on the orgininal host.
> 
> when using generic pci passhotuhg with openstack, the pci alias is intended 
> to reference a single vendor id/product
> id so you will have 1+ alias for each type of device. that allows openstack 
> to schedule based on the availability of a
> compatibale device because we track inventories of pci devices and can query 
> that when selecting a host.
> 
> if we were to support mdev live migration in the future we would want to take 
> the same declarative approch.
> 1 interospec the capability of the deivce we manage
> 2 create inventories of the allocatable devices and there capabilities
> 3 schdule the instance to a host based on the device-type/capabilities and 
> claim it atomicly to prevent raceces
> 4 have the lower level hyperviors do addtional validation if need prelive 
> migration.
> 
> this proposal seams to be targeting extending step 4 where as ideally we 
> should focuse on providing the info that would
> be relevant in set 1 preferably in a vendor neutral way vai a kernel 
> interface like /sys.

I think this is reading a whole lot into the phrase "last step".  We
want to make the information available for a management engine to
consume as needed to make informed decisions regarding likely
compatible target devices.
 
> > > we are not sure if this interface is of value or help to you. please don't
> > > hesitate to drop your valuable comments.
> > > 
> > > 
> > > (1) interface definition
> > > The interface is defined in below way:
> > > 
> > >  __userspace
> > >   /\  \
> > >  / \write
> > > / read  \
> > >/__   ___\|/_
> > >   | migration_version | | migration_version |-->check migration
> > >   - -   compatibility
> > >  device Adevice B
> > > 
> > > 
> > > a device attribute named migration_version is defined under each device's
> > > sysfs node. e.g. 
> > > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version).  
> this might be useful as we could tag the inventory with the

Re: [GSoC][PATCH v4 4/4] qemu_domainjob: introduce `privateData` for `qemuDomainJobInfo`

2020-07-14 Thread Michal Privoznik


On 7/14/20 5:14 PM, Prathamesh Chavan wrote:

Currently, domainJobInfo also uses "stats" as one of the job specific
parameters. To remove this dependency, a privateData structure is
introduced.

The plan is to even have this structure renamed as
`virDomainJobInfoInternal` as there already exists a
`virDomainJobInfo'.


I see. Well, I'm not that sure about virDomainJobInfoInternal (mostly 
because this qemuDomainJobInfo structure looks too qemu specific).
How about moving it under qemuDomainJobObj privateData? I mean, 
qemuDomainJobPrivate structure you propose in 3/4?


Michal

Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Daniel P . Berrangé

On Tue, Jul 14, 2020 at 10:16:16AM -0600, Alex Williamson wrote:
> On Tue, 14 Jul 2020 11:21:29 +0100
> Daniel P. Berrangé  wrote:
> 
> > On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:
> > > 
> > > The string read from migration_version attribute is defined by device 
> > > vendor
> > > driver and is completely opaque to the userspace.
> > > for a Intel vGPU, string format can be defined like
> > > "parent device PCI ID" + "version of gvt driver" + "mdev type" + 
> > > "aggregator count".
> > > 
> > > for an NVMe VF connecting to a remote storage. it could be
> > > "PCI ID" + "driver version" + "configured remote storage URL"
> > > 
> > > for a QAT VF, it may be
> > > "PCI ID" + "driver version" + "supported encryption set".
> > > 
> > > (to avoid namespace confliction from each vendor, we may prefix a driver 
> > > name to
> > > each migration_version string. e.g. i915-v1-8086-591d-i915-GVTg_V5_8-1)
> 
> It's very strange to define it as opaque and then proceed to describe
> the contents of that opaque string.  The point is that its contents
> are defined by the vendor driver to describe the device, driver version,
> and possibly metadata about the configuration of the device.  One
> instance of a device might generate a different string from another.
> The string that a device produces is not necessarily the only string
> the vendor driver will accept, for example the driver might support
> backwards compatible migrations.

> > IMHO there needs to be a mechanism for the kernel to report via sysfs
> > what versions are supported on a given device. This puts the job of
> > reporting compatible versions directly under the responsibility of the
> > vendor who writes the kernel driver for it. They are the ones with the
> > best knowledge of the hardware they've built and the rules around its
> > compatibility.
> 
> The version string discussed previously is the version string that
> represents a given device, possibly including driver information,
> configuration, etc.  I think what you're asking for here is an
> enumeration of every possible version string that a given device could
> accept as an incoming migration stream.  If we consider the string as
> opaque, that means the vendor driver needs to generate a separate
> string for every possible version it could accept, for every possible
> configuration option.  That potentially becomes an excessive amount of
> data to either generate or manage.
> 
> Am I overestimating how vendors intend to use the version string?

If I'm interpreting your reply & the quoted text orrectly, the version
string isn't really a version string in any normal sense of the word
"version".

Instead it sounds like string encoding a set of features in some arbitrary
vendor specific format, which they parse and do compatibility checks on
individual pieces ? One or more parts may contain a version number, but
its much more than just a version.

If that's correct, then I'd prefer we didn't call it a version string,
instead call it a "capability string" to make it clear it is expressing
a much more general concept, but...

> We'd also need to consider devices that we could create, for instance
> providing the same interface enumeration prior to creating an mdev
> device to have a confidence level that the new device would be a valid
> target.
> 
> We defined the string as opaque to allow vendor flexibility and because
> defining a common format is hard.  Do we need to revisit this part of
> the discussion to define the version string as non-opaque with parsing
> rules, probably with separate incoming vs outgoing interfaces?  Thanks,

..even if the huge amount of flexibility is technically relevant from the
POV of the hardware/drivers, we should consider whether management apps
actually want, or can use, that level of flexibility.

The task of picking which host to place a VM on has alot of factors to
consider, and when there are a large number of hosts, the total amount
of information to check gets correspondingly large.  The placement
process is also fairly performance critical.

Running complex algorithmic logic to check compatibility of devices
based on a arbitrary set of rules is likely to be a performance
challenge. A flat list of supported strings is a much simpler
thing to check as it reduces down to a simple set membership test.

IOW, even if there's some complex set of device type / vendor specific
rules to check for compatibility, I fear apps will ignore them and
just define a very simplified list of compatible string, and ignore
all the extra flexibility.

I'm sure OpenStack maintainers can speak to this more, as they've put
alot of work into their scheduling engine to optimize the way it places
VMs largely driven from simple structured data reported from hosts.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|:

Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Alex Williamson

On Tue, 14 Jul 2020 11:21:29 +0100
Daniel P. Berrangé  wrote:

> On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:
> > hi folks,
> > we are defining a device migration compatibility interface that helps upper
> > layer stack like openstack/ovirt/libvirt to check if two devices are
> > live migration compatible.
> > The "devices" here could be MDEVs, physical devices, or hybrid of the two.
> > e.g. we could use it to check whether
> > - a src MDEV can migrate to a target MDEV,
> > - a src VF in SRIOV can migrate to a target VF in SRIOV,
> > - a src MDEV can migration to a target VF in SRIOV.
> >   (e.g. SIOV/SRIOV backward compatibility case)
> > 
> > The upper layer stack could use this interface as the last step to check
> > if one device is able to migrate to another device before triggering a real
> > live migration procedure.
> > we are not sure if this interface is of value or help to you. please don't
> > hesitate to drop your valuable comments.
> > 
> > 
> > (1) interface definition
> > The interface is defined in below way:
> > 
> >  __userspace
> >   /\  \
> >  / \write
> > / read  \
> >/__   ___\|/_
> >   | migration_version | | migration_version |-->check migration
> >   - -   compatibility
> >  device Adevice B
> > 
> > 
> > a device attribute named migration_version is defined under each device's
> > sysfs node. e.g. 
> > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version).
> > userspace tools read the migration_version as a string from the source 
> > device,
> > and write it to the migration_version sysfs attribute in the target device.
> > 
> > The userspace should treat ANY of below conditions as two devices not 
> > compatible:
> > - any one of the two devices does not have a migration_version attribute
> > - error when reading from migration_version attribute of one device
> > - error when writing migration_version string of one device to
> >   migration_version attribute of the other device
> > 
> > The string read from migration_version attribute is defined by device vendor
> > driver and is completely opaque to the userspace.
> > for a Intel vGPU, string format can be defined like
> > "parent device PCI ID" + "version of gvt driver" + "mdev type" + 
> > "aggregator count".
> > 
> > for an NVMe VF connecting to a remote storage. it could be
> > "PCI ID" + "driver version" + "configured remote storage URL"
> > 
> > for a QAT VF, it may be
> > "PCI ID" + "driver version" + "supported encryption set".
> > 
> > (to avoid namespace confliction from each vendor, we may prefix a driver 
> > name to
> > each migration_version string. e.g. i915-v1-8086-591d-i915-GVTg_V5_8-1)

It's very strange to define it as opaque and then proceed to describe
the contents of that opaque string.  The point is that its contents
are defined by the vendor driver to describe the device, driver version,
and possibly metadata about the configuration of the device.  One
instance of a device might generate a different string from another.
The string that a device produces is not necessarily the only string
the vendor driver will accept, for example the driver might support
backwards compatible migrations.

> > (2) backgrounds
> > 
> > The reason we hope the migration_version string is opaque to the userspace
> > is that it is hard to generalize standard comparing fields and comparing
> > methods for different devices from different vendors.
> > Though userspace now could still do a simple string compare to check if
> > two devices are compatible, and result should also be right, it's still
> > too limited as it excludes the possible candidate whose migration_version
> > string fails to be equal.
> > e.g. an MDEV with mdev_type_1, aggregator count 3 is probably compatible
> > with another MDEV with mdev_type_3, aggregator count 1, even their
> > migration_version strings are not equal.
> > (assumed mdev_type_3 is of 3 times equal resources of mdev_type_1).
> > 
> > besides that, driver version + configured resources are all elements 
> > demanding
> > to take into account.
> > 
> > So, we hope leaving the freedom to vendor driver and let it make the final 
> > decision
> > in a simple reading from source side and writing for test in the target 
> > side way.
> > 
> > 
> > we then think the device compatibility issues for live migration with 
> > assigned
> > devices can be divided into two steps:
> > a. management tools filter out possible migration target devices.
> >Tags could be created according to info from product specification.
> >we think openstack/ovirt may have vendor proprietary components to create
> >those customized tags for each product from each vendor.  
> 
> >for Intel vGPU, with a vGPU(a MDEV device) in source side, the tags to
> >search target vGPU are like:
>

Re: [PATCH] docs: bhyve: document ignoring unknown MSRs

2020-07-14 Thread Roman Bogorodskiy

  Daniel P. Berrangé wrote:

> On Tue, Jul 14, 2020 at 07:46:38PM +0400, Roman Bogorodskiy wrote:
> > Ignoring unknown MSRs using  element
> >  was supported for quite some already,
> > so add documentation for it for completeness of flags coverage,
> > as some guests can be extra picky about flags passed to bhyve,
> > and it's useful to know how to control those.
> > 
> > Signed-off-by: Roman Bogorodskiy 
> > ---
> >  docs/drvbhyve.html.in | 18 ++
> >  1 file changed, 18 insertions(+)
> 
> Reviewed-by: Daniel P. Berrangé 
> 
> 
> Regards,
> Daniel

Pushed, thanks!

> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
> 

Roman Bogorodskiy


signature.asc
Description: PGP signature

Re: [PATCH] docs: bhyve: document ignoring unknown MSRs

2020-07-14 Thread Daniel P . Berrangé

On Tue, Jul 14, 2020 at 07:46:38PM +0400, Roman Bogorodskiy wrote:
> Ignoring unknown MSRs using  element
>  was supported for quite some already,
> so add documentation for it for completeness of flags coverage,
> as some guests can be extra picky about flags passed to bhyve,
> and it's useful to know how to control those.
> 
> Signed-off-by: Roman Bogorodskiy 
> ---
>  docs/drvbhyve.html.in | 18 ++
>  1 file changed, 18 insertions(+)

Reviewed-by: Daniel P. Berrangé 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

[PATCH] docs: bhyve: document ignoring unknown MSRs

2020-07-14 Thread Roman Bogorodskiy

Ignoring unknown MSRs using  element
 was supported for quite some already,
so add documentation for it for completeness of flags coverage,
as some guests can be extra picky about flags passed to bhyve,
and it's useful to know how to control those.

Signed-off-by: Roman Bogorodskiy 
---
 docs/drvbhyve.html.in | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/docs/drvbhyve.html.in b/docs/drvbhyve.html.in
index 2e9cf5551b..66a13be1f6 100644
--- a/docs/drvbhyve.html.in
+++ b/docs/drvbhyve.html.in
@@ -462,6 +462,24 @@ Example:
 /domain
 
 
+Ignoring unknown MSRs reads and writes
+
+Since 5.1.0, it's possible to make bhyve
+ignore accesses to unimplemented Model Specific Registers (MSRs).
+Example:
+
+
+domain type="bhyve"
+...
+features
+  ...
+  msrs unknown='ignore'/
+  ...
+/features
+...
+/domain
+
+
 Pass-through of arbitrary bhyve commands
 
 Since 5.1.0, it's possible to pass additional 
command-line
-- 
2.27.0

Re: [PATCH 17/18] qemu: caps: Enable QEMU_CAPS_BLOCKDEV_HOSTDEV_SCSI

2020-07-14 Thread Kevin Wolf

Am 10.07.2020 um 16:44 hat Peter Krempa geschrieben:
> On Fri, Jul 10, 2020 at 16:33:38 +0200, Peter Krempa wrote:
> > Enable it when regular QEMU_CAPS_BLOCKDEV is present.
> > 
> > Signed-off-by: Peter Krempa 
> > ---
> >  src/qemu/qemu_capabilities.c  |  3 ++
> >  .../caps_4.2.0.aarch64.xml|  1 +
> >  .../qemucapabilitiesdata/caps_4.2.0.s390x.xml |  1 +
> >  .../caps_4.2.0.x86_64.xml |  1 +
> >  .../caps_5.0.0.aarch64.xml|  1 +
> >  .../qemucapabilitiesdata/caps_5.0.0.ppc64.xml |  1 +
> >  .../caps_5.0.0.riscv64.xml|  1 +
> >  .../caps_5.0.0.x86_64.xml |  1 +
> >  .../caps_5.1.0.x86_64.xml |  1 +
> >  .../hostdev-scsi-lsi.x86_64-latest.args   | 52 +++
> >  ...ostdev-scsi-virtio-scsi.x86_64-latest.args | 46 
> >  11 files changed, 65 insertions(+), 44 deletions(-)
> > 
> 
> Kevin, Markus, could you please check, that he change from -drive
> if=none to -blockdev is equivalent/makes sense in these cases.
> 
> Usage of SCSI hostdevs is finnicky as it tends to be used with weird
> devices such as tape libraries. I've tested it only with scsi_debug.

Apart from the obvious differences shared with other cases (like
differences in QMP for using node names instead of -drive IDs), it looks
equivalent to me.

Kevin

> > diff --git 
> > a/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-latest.args 
> > b/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-latest.args
> > index 99b0117447..72980d58b8 100644
> > --- a/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-latest.args
> > +++ b/tests/qemuxml2argvdata/hostdev-scsi-virtio-scsi.x86_64-latest.args
> > @@ -34,40 +34,42 @@ file=/tmp/lib/domain--1-QEMUGuest2/master-key.aes \
> >  -blockdev 
> > '{"node-name":"libvirt-1-format","read-only":false,"driver":"raw",\
> >  "file":"libvirt-1-storage"}' \
> >  -device 
> > ide-hd,bus=ide.0,unit=0,drive=libvirt-1-format,id=ide0-0-0,bootindex=1 \
> > --drive file=/dev/sg0,if=none,format=raw,id=drive-hostdev0 \
> > +-blockdev '{"driver":"host_device","filename":"/dev/sg0",\
> > +"node-name":"libvirt-hostdev0-backend","read-only":false}' \
> >  -device scsi-generic,bus=scsi0.0,channel=0,scsi-id=4,lun=8,\
> > -drive=drive-hostdev0,id=hostdev0 \
> > --drive file=/dev/sg0,if=none,format=raw,id=drive-hostdev1,readonly=on \
> > +drive=libvirt-hostdev0-backend,id=hostdev0 \
> > +-blockdev '{"driver":"host_device","filename":"/dev/sg0",\
> > +"node-name":"libvirt-hostdev1-backend","read-only":true}' \
> >  -device scsi-generic,bus=scsi0.0,channel=0,scsi-id=4,lun=7,\
> > -drive=drive-hostdev1,id=hostdev1 \
> > --drive file.driver=iscsi,file.portal=example.org:3260,\
> > -file.target=iqn.1992-01.com.example,file.lun=0,file.transport=tcp,if=none,\
> > -format=raw,id=drive-hostdev2 \
> > +drive=libvirt-hostdev1-backend,id=hostdev1 \
> > +-blockdev '{"driver":"iscsi","portal":"example.org:3260",\
> > +"target":"iqn.1992-01.com.example","lun":0,"transport":"tcp",\
> > +"node-name":"libvirt-hostdev2-backend","read-only":false}' \
> >  -device scsi-generic,bus=scsi0.0,channel=0,scsi-id=2,lun=4,\
> > -drive=drive-hostdev2,id=hostdev2 \
> > --drive file.driver=iscsi,file.portal=example.org:3260,\
> > -file.target=iqn.1992-01.com.example,file.lun=1,file.transport=tcp,if=none,\
> > -format=raw,id=drive-hostdev3 \
> > +drive=libvirt-hostdev2-backend,id=hostdev2 \
> > +-blockdev '{"driver":"iscsi","portal":"example.org:3260",\
> > +"target":"iqn.1992-01.com.example","lun":1,"transport":"tcp",\
> > +"node-name":"libvirt-hostdev3-backend","read-only":false}' \
> >  -device scsi-generic,bus=scsi0.0,channel=0,scsi-id=2,lun=5,\
> > -drive=drive-hostdev3,id=hostdev3 \
> > +drive=libvirt-hostdev3-backend,id=hostdev3 \
> >  -object secret,id=hostdev4-secret0,\
> >  data=9eao5F8qtkGt+seB1HYivWIxbtwUu6MQtg1zpj/oDtUsPr1q8wBYM91uEHCn6j/1,\
> >  keyid=masterKey0,iv=AAECAwQFBgcICQoLDA0ODw==,format=base64 \
> > --drive file.driver=iscsi,file.portal=example.org:3260,\
> > -file.target=iqn.1992-01.com.example:storage,file.lun=1,file.transport=tcp,\
> > -file.user=myname,file.password-secret=hostdev4-secret0,if=none,format=raw,\
> > -id=drive-hostdev4 \
> > +-blockdev '{"driver":"iscsi","portal":"example.org:3260",\
> > +"target":"iqn.1992-01.com.example:storage","lun":1,"transport":"tcp",\
> > +"user":"myname","password-secret":"hostdev4-secret0",\
> > +"node-name":"libvirt-hostdev4-backend","read-only":false}' \
> >  -device scsi-generic,bus=scsi0.0,channel=0,scsi-id=3,lun=4,\
> > -drive=drive-hostdev4,id=hostdev4 \
> > +drive=libvirt-hostdev4-backend,id=hostdev4 \
> >  -object secret,id=hostdev5-secret0,\
> >  data=9eao5F8qtkGt+seB1HYivWIxbtwUu6MQtg1zpj/oDtUsPr1q8wBYM91uEHCn6j/1,\
> >  keyid=masterKey0,iv=AAECAwQFBgcICQoLDA0ODw==,format=base64 \
> > --drive file.driver=iscsi,file.portal=example.org:3260,\
> >

Re: [GSoC][PATCH v4 3/4] qemu_domainjob: introduce `privateData` for `qemuDomainJob`

2020-07-14 Thread Prathamesh Chavan

On Tue, Jul 14, 2020 at 7:59 PM Martin Kletzander  wrote:
>
> On Mon, Jul 13, 2020 at 11:33:40PM +0530, Prathamesh Chavan wrote:
> >To remove dependecy of `qemuDomainJob` on job specific
> >paramters, a `privateData` pointer is introduced.
> >To handle it, structure of callback functions is
> >also introduced.
> >
> >Signed-off-by: Prathamesh Chavan 
> >---
> >The static callback functions structure was declared in `qemu_domain.c`
> >Due to this, all callback functions (present in `qemu_domainobj.c`)
> >were exposed using the .h file to allow their access by the
> >`qemu_domain.c` file.
> >
> > src/qemu/qemu_domain.c   |  9 ++-
> > src/qemu/qemu_domain.h   | 10 
> > src/qemu/qemu_domainjob.c| 94 
> > src/qemu/qemu_domainjob.h| 44 ---
> > src/qemu/qemu_driver.c   |  3 +-
> > src/qemu/qemu_migration.c| 28 +++---
> > src/qemu/qemu_migration_params.c |  9 ++-
> > src/qemu/qemu_process.c  | 15 +++--
> > 8 files changed, 165 insertions(+), 47 deletions(-)
> >
>
> So the biggest issue I have with this is that it is impossible to see which of
> these functions are going to end up in `src/hypervisor/virjob.c` and which of
> them will stay qemu-specific in `src/qemu/qemu_domainjob.c`.
>
> I would suggest either:
>
> 1) marking them as such in some understandable way or
>
> 2) leaving qemu-specific code in `qemu_driver.c`
>
> 3) creating yet another file for what is going to become `virjob.c`
>
> Otherwise I'm getting confused with some of the things, why are they there 
> etc.
>
But doing this rightaway would mean we are again moving back
somestuff, which we just brought in this file. (Such as the
PrivateXML[Parse,Format]Job functions).

What I had in mind, was to deal with this segregation, while the
creation of the new file happens. But for more clarity, we can have
the code reside in the same file, and spilt into two sections, where
all the code residing in the beginning of the file (all the structs
and functions) will be the ones, which we plan on moving, and all the
code residing in the lower half, will be the one which will persist.

The other way of dealing this could be to move these things back to
`qemu_domain.c`

> >diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> >index 10d2033db1..a02865dc59 100644
> >--- a/src/qemu/qemu_domain.c
> >+++ b/src/qemu/qemu_domain.c
> >@@ -88,6 +88,13 @@ VIR_ENUM_IMPL(qemuDomainNamespace,
> >   "mount",
> > );
> >
> >+static qemuDomainObjPrivateJobCallbacks qemuPrivateJobCallbacks = {
> >+.allocJobPrivate = ,
> >+.freeJobPrivate = ,
> >+.formatJob = ,
> >+.parseJob = ,
> >+};
> >+
> > /**
> >  * qemuDomainObjFromDomain:
> >  * @domain: Domain pointer that has to be looked up
> >@@ -1585,7 +1592,7 @@ qemuDomainObjPrivateAlloc(void *opaque)
> > if (VIR_ALLOC(priv) < 0)
> > return NULL;
> >
> >-if (qemuDomainObjInitJob(>job) < 0) {
> >+if (qemuDomainObjInitJob(>job, ) < 0) {
> > virReportSystemError(errno, "%s",
> >  _("Unable to init qemu driver mutexes"));
> > goto error;
> >diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
> >index e524fd0002..bb9b414a46 100644
> >--- a/src/qemu/qemu_domain.h
> >+++ b/src/qemu/qemu_domain.h
> >@@ -492,6 +492,16 @@ struct _qemuDomainXmlNsDef {
> > char **capsdel;
> > };
> >
> >+typedef struct _qemuDomainJobPrivate qemuDomainJobPrivate;
> >+typedef qemuDomainJobPrivate *qemuDomainJobPrivatePtr;
> >+struct _qemuDomainJobPrivate {
> >+bool spiceMigration;/* we asked for spice migration and 
> >we
> >+ * should wait for it to finish */
> >+bool spiceMigrated; /* spice migration completed */
> >+bool dumpCompleted; /* dump completed */
> >+qemuMigrationParamsPtr migParams;
> >+};
> >+
> > int qemuDomainObjStartWorker(virDomainObjPtr dom);
> > void qemuDomainObjStopWorker(virDomainObjPtr dom);
> >
> >diff --git a/src/qemu/qemu_domainjob.c b/src/qemu/qemu_domainjob.c
> >index d96d5334a3..fe160afe79 100644
> >--- a/src/qemu/qemu_domainjob.c
> >+++ b/src/qemu/qemu_domainjob.c
> >@@ -159,23 +159,32 @@ qemuDomainEventEmitJobCompleted(virQEMUDriverPtr 
> >driver,
> > virObjectEventStateQueue(driver->domainEventState, event);
> > }
> >
> >-
> >-int
> >-qemuDomainObjInitJob(qemuDomainJobObjPtr job)
> >+void *
> >+qemuJobAllocPrivate(void)
> > {
> >-memset(job, 0, sizeof(*job));
> >-
> >-if (virCondInit(>cond) < 0)
> >-return -1;
> >+qemuDomainJobPrivatePtr priv;
> >+if (VIR_ALLOC(priv) < 0)
> >+return NULL;
> >+return (void *)priv;
> >+}
> >
> >-if (virCondInit(>asyncCond) < 0) {
> >-virCondDestroy(>cond);
> >-return -1;
> >-}
> >
> >-return 0;
> >+static void
> >+qemuJobFreePrivateData(qemuDomainJobPrivatePtr priv)
> >+{
> >+priv->spiceMigration = false;
>

Re: [PATCH v3] conf: add 'isa' controller type

2020-07-14 Thread Daniel P . Berrangé

On Sun, Jul 12, 2020 at 07:56:03AM +0400, Roman Bogorodskiy wrote:
> Decided to revive this patch to address
> https://gitlab.com/libvirt/libvirt/-/issues/45.
> 
> Changes from v2:
> 
>  - Rebased to master,
>  - Added bhyveDomainDeviceDefValidate() to disallow ISA
>controllers idx other than 0 because we don't support
>more than one ISA controller.
> 
> 
> 
> Introduce 'isa' controller type. The only supported model
> now is 'isa-bridge'. In domain XML it looks this way:
> 
> ...
> 
>   function='0x0'/>
> 
> ...
> 
> Currently, this is needed for the bhyve driver to allow choosing a
> specific PCI address for that. In bhyve, this controller is used to
> attach serial ports and a boot ROM.
> 
> bhyve: support 'isa' controller for LPC
> 
> Support modeling of the 'isa' controller for bhyve. When controller is
> not present in the domain XML, but domain requires it to be there (e.g.
> because bootrom is used), implicitly add addition of this controller to
> the command line arguments, and bind it to PCI slot 1.
> 
> PCI slot 1 is always reserved still. User can manually define any PCI
> slot for the 'isa' controller, including PCI slot 1, but other devices
> are not allowed to use this address.
> 
> bhyve: automatically add 'isa' controller
> 
> When domain configuration requires the 'isa' controller to be present,
> automatically add it on domain post-parse stage.
> 
> Now, as this controller is always available when needed, it's not
> necessary to implicitly add it to the bhyve command line, so remove
> bhyveBuildLPCArgStr().
> 
> Also, make bhyveDomainDefNeedsISAController() static as it's no longer
> used outside of bhyve_domain.c.
> 
> bhyve: validate that ISA controller always has index 0
> 
> More than one ISA controller is not supported by bhyve,
> and multiple controllers with the same index are forbidden,
> so forbid ISA controllers with non-zero index for bhyve.


It seems like you've taken multiple separate patches and merged them
all together. Could you separate them out again, as the split you
had in v2 made alot more sense.

> Signed-off-by: Roman Bogorodskiy 
> ---
>  docs/schemas/domaincommon.rng |  6 
>  src/bhyve/bhyve_command.c | 27 +++---
>  src/bhyve/bhyve_device.c  | 23 +---
>  src/bhyve/bhyve_domain.c  | 25 +++--
>  src/bhyve/bhyve_domain.h  |  2 --
>  src/conf/domain_conf.c|  9 +
>  src/conf/domain_conf.h|  8 +
>  src/qemu/qemu_command.c   |  1 +
>  src/qemu/qemu_domain.c|  1 +
>  src/qemu/qemu_domain_address.c|  1 +
>  src/qemu/qemu_validate.c  |  1 +
>  src/vbox/vbox_common.c|  1 +
>  ...ml2argv-addr-isa-controller-on-slot-1.args | 10 ++
>  ...2argv-addr-isa-controller-on-slot-1.ldargs |  3 ++
>  ...xml2argv-addr-isa-controller-on-slot-1.xml | 26 ++
>  ...l2argv-addr-isa-controller-on-slot-31.args | 10 ++
>  ...argv-addr-isa-controller-on-slot-31.ldargs |  3 ++
>  ...ml2argv-addr-isa-controller-on-slot-31.xml | 26 ++
>  ...argv-addr-non-isa-controller-on-slot-1.xml | 23 
>  .../bhyvexml2argv-console.args|  2 +-
>  .../bhyvexml2argv-isa-controller.args | 10 ++
>  .../bhyvexml2argv-isa-controller.ldargs   |  3 ++
>  .../bhyvexml2argv-isa-controller.xml  | 24 +
>  ...bhyvexml2argv-isa-multiple-controllers.xml | 25 +
>  .../bhyvexml2argv-serial-grub-nocons.args |  2 +-
>  .../bhyvexml2argv-serial-grub.args|  2 +-
>  .../bhyvexml2argv-serial.args |  2 +-
>  .../bhyvexml2argvdata/bhyvexml2argv-uefi.args |  4 +--
>  .../bhyvexml2argv-vnc-autoport.args   |  4 +--
>  .../bhyvexml2argv-vnc-vgaconf-io.args |  4 +--
>  .../bhyvexml2argv-vnc-vgaconf-off.args|  4 +--
>  .../bhyvexml2argv-vnc-vgaconf-on.args |  4 +--
>  .../bhyvexml2argvdata/bhyvexml2argv-vnc.args  |  4 +--
>  tests/bhyvexml2argvtest.c |  5 +++
>  ...l2xmlout-addr-isa-controller-on-slot-1.xml | 36 +++
>  ...2xmlout-addr-isa-controller-on-slot-31.xml | 36 +++
>  .../bhyvexml2xmlout-console.xml   |  3 ++
>  .../bhyvexml2xmlout-isa-controller.xml| 36 +++
>  .../bhyvexml2xmlout-serial-grub-nocons.xml|  3 ++
>  .../bhyvexml2xmlout-serial-grub.xml   |  3 ++
>  .../bhyvexml2xmlout-serial.xml|  3 ++
>  .../bhyvexml2xmlout-vnc-autoport.xml  |  3 ++
>  .../bhyvexml2xmlout-vnc-vgaconf-io.xml|  3 ++
>  .../bhyvexml2xmlout-vnc-vgaconf-off.xml   |  3 ++
>  .../bhyvexml2xmlout-vnc-vgaconf-on.xml|  3 ++
>  .../bhyvexml2xmlout-vnc.xml   |  3 ++
>  tests/bhyvexml2xmltest.c  |

Re: [GSoC][PATCH v4 4/4] qemu_domainjob: introduce `privateData` for `qemuDomainJobInfo`

2020-07-14 Thread Prathamesh Chavan

Currently, domainJobInfo also uses "stats" as one of the job specific
parameters. To remove this dependency, a privateData structure is
introduced.

The plan is to even have this structure renamed as
`virDomainJobInfoInternal` as there already exists a
`virDomainJobInfo'.

On Tue, Jul 14, 2020 at 8:16 PM Michal Privoznik  wrote:
>
> On 7/13/20 8:03 PM, Prathamesh Chavan wrote:
> > To remove dependecy of `qemuDomainJobInfo` on job specific
> > paramters, a `privateData` pointer is introduced.
> > To handle it, structure of callback functions is
> > also introduced.
> >
> > Signed-off-by: Prathamesh Chavan 
> > ---
> >   src/qemu/qemu_backup.c   | 15 +++--
> >   src/qemu/qemu_domain.h   | 18 ++
> >   src/qemu/qemu_domainjob.c| 98 +---
> >   src/qemu/qemu_domainjob.h| 31 +-
> >   src/qemu/qemu_driver.c   | 18 +++---
> >   src/qemu/qemu_migration.c| 14 +++--
> >   src/qemu/qemu_migration_cookie.c |  7 ++-
> >   src/qemu/qemu_process.c  | 11 +++-
> >   8 files changed, 154 insertions(+), 58 deletions(-)
>
> I'm not exactly sure why this is needed. Can you shed more light into it
> please?
>
> Michal
>

The issue about adding multipath device's targets into qemu-pr-helper's namespace

2020-07-14 Thread Lin Ma


Hi all,

I have a namespace question about passthrough disk(multipath device).
In case of enabling namespace and cgroups in qemu.conf, The target(s) of 
the
multipath device won't be added into qemu-pr-helper's namespace under 
certain

situation, It causes the PERSISTENT RESERVE command failure in guest.

While user starts a vm,
To build namespace, The qemuDomainSetupDisk() will be invoked via 
threadA(this

thread id will be the qemu's pid),
To build cgroup, The qemuSetupImagePathCgroup() will be invoked via 
threadB.


Both of the functions invoke the virDevMapperGetTargets() trying to 
parse a

multipath device to target paths string, Then fill the targetPaths[].

The issue I experienced is:
After libvirtd started, Everything works well for the first booted vm 
which has

the passthrough multipath device.
But If I shut it down & start it again, OR keep it running & start 
another vm
which has other passthrough multipath device, Then the target(s) of the 
fresh
started vm won't be added into the related qemu-pr-helper's namespace 
and it

causes PERSISTENT RESERVE command failure in the corresponding guest.
I digged into code, In this situation, The targetPaths[] in 
qemuDomainSetupDisk()

won't be filled, it keeps NULL after virDevMapperGetTargets() returns.
The virDevMapperGetTargets doesn't fill targetPaths[] because the 
dm_task_run()

of libdevmapper returns 0 with errno 9(Bad file descriptor).
So far, I don't understand why the dm_task_run() return 0 in this 
situation.
BTW, The virDevMapperGetTargets() can always successfully fill the 
targetPaths[]

in qemuSetupImagePathCgroup().

Please refer to the following 2 tests:
The multipath configuration on host:
host:~ # multipath -l
vm1-data (3600140582d9024bc13f4b8db5ff12de0) dm-11 FreeNAS,lv68
size=6.0G features='0' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 2:0:0:2 sdd 8:48 active undef running
vm2-data (36001405fc5f29ace3ec4fb8acd32aae5) dm-8 FreeNAS,lv46
size=4.0G features='0' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=0 status=active
  `- 2:0:0:1 sde 8:64 active undef running

===
Test A:
host:~ # systemctl restart libvirtd
host:~ # virsh list
 Id   Name   State


host:~ #
host:~ # virsh domblklist vm1
 Target   Source
--
 vda  /opt/vms/vm1/disk0.qcow2
 sda  /dev/mapper/vm1-data

host:~ #
host:~ # virsh start vm1
Domain vm1 started

host:~ # virsh list
 Id   NameState
---
 1vm1running

host:~ # nsenter -t $(pidof qemu-pr-helper) -a bash
host:~ # ls -l /dev/sd*
brw-rw 1 root disk 8, 48 Jul 14 16:30 /dev/sdd
host:~ # exit
exit
host:~ #

vm1:~ # lsscsi
[0:0:0:0]diskFreeNAS  lv68 0123   /dev/sda
vm1:~ #
vm1:~ # sg_persist --in -k /dev/sda
  FreeNAS   lv68  0123
  Peripheral device type: disk
  PR generation=0x0, there are NO registered reservation keys
vm1:~ #

host:~ # virsh shutdown vm1
Domain vm1 is being shutdown

host:~ # virsh list
 Id   Name   State


host:~ #
host:~ # virsh start vm1
Domain vm1 started

host:~ # virsh list
 Id   NameState
---
 2vm1running

host:~ # nsenter -t $(pidof qemu-pr-helper) -a bash
host:~ # ls -l /dev/sd*
ls: cannot access '/dev/sd*': No such file or directory
host:~ # exit
exit
host:~ #

vm1:~ # sg_persist --in -k /dev/sda
  FreeNAS   lv68  0123
  Peripheral device type: disk
PR in (Read keys): Aborted command
Aborted command
vm1:~ #
===
Test B:
host:~ # systemctl restart libvirtd
host:~ # virsh list
 Id   Name   State


host:~ #
host:~ # virsh domblklist vm1
 Target   Source
--
 vda  /opt/vms/vm1/disk0.qcow2
 sda  /dev/mapper/vm1-data

host:~ #
host:~ # virsh start vm1
Domain vm1 started

host:~ # virsh list
 Id   NameState
---
 1vm1running

host:~ # nsenter -t $(pidof qemu-pr-helper) -a bash
host:~ # ls -l /dev/sd*
brw-rw 1 root disk 8, 48 Jul 14 17:28 /dev/sdd
host:~ # exit
exit
host:~ #

vm1:~ # lsscsi
[2:0:0:0]diskFreeNAS  lv68 0123   /dev/sda
vm1:~ #
vm1:~ # sg_persist --in -k /dev/sda
  FreeNAS   lv68  0123
  Peripheral device type: disk
  PR generation=0x0, there are NO registered reservation keys
vm1:~ #

host:~ # virsh list
 Id   NameState
---
 1vm1running

host:~ #
host:~ # virsh domblklist vm2
 Target   Source
--
 vda  /opt/vms/vm2/disk0.qcow2
 sda  /dev/mapper/vm2-data

host:~ #
host:~ # virsh start vm2
Domain vm2 started

host:~ # virsh list
 Id   NameState
---
 1vm1running
 2vm2running

host:~ # nsenter -t

[RFC PATCH 5/5] docs: formatdomain-devices: Split out split

2020-07-14 Thread Peter Krempa

Start splitting out part of  into smaller sections.

Signed-off-by: Peter Krempa 
---
 docs/formatdomain-devices-disk.rst.in | 821 +
 docs/formatdomain-devices.rst.in  | 822 +-
 2 files changed, 822 insertions(+), 821 deletions(-)
 create mode 100644 docs/formatdomain-devices-disk.rst.in

diff --git a/docs/formatdomain-devices-disk.rst.in 
b/docs/formatdomain-devices-disk.rst.in
new file mode 100644
index 00..557db4edec
--- /dev/null
+++ b/docs/formatdomain-devices-disk.rst.in
@@ -0,0 +1,821 @@
+:anchor:``
+
+Hard drives, floppy disks, CDROMs
+~
+
+Any device that looks like a disk, be it a floppy, harddisk, cdrom, or
+paravirtualized driver is specified via the ``disk`` element.
+
+::
+
+   ...
+   
+ 
+   
+   
+ 
+   
+   
+   
+ 1000
+ 40
+ 10
+   
+   
+   
+ ...
+   
+   
+   
+ ...
+   
+ 
+   ...
+ 
+   
+   
+ 
+   
+   
+   
+   
+   
+ 
+ 
+   
+   
+ 
+ 
+ 
+ 
+   
+ 
+   
+   
+ 
+ 
+   
+   
+   
+ 
+ 
+   
+   
+ 
+   somevalue
+ 
+ 
+ 
+   
+   
+   
+ 
+ 
+   
+   
+ 
+ 
+   
+   
+   
+ 
+ 
+   
+   
+ 
+   
+   
+   
+ 
+ 
+   
+   
+ 
+   
+   
+   
+ 
+ 
+   
+   
+ 
+   
+   
+   
+ 
+ 
+   
+   
+ 
+   
+ 
+ 
+   
+ 
+   
+   
+   
+ 
+ 
+   
+   
+   
+   
+   
+ 
+ 
+   
+   
+   
+ 
+ 
+   
+   
+ 
+ 
+   
+ 
+   
+   
+ 
+ 
+   
+   
+ 
+ 
+   
+ 
+   
+   
+ 
+ 
+   
+   
+ 
+ 
+   
+ 
+   
+   
+ 
+ 
+   
+   
+   
+ 
+ 
+   
+   
+   
+ 
+ 
+   
+   
+   
+ 
+ 
+ 
+   
+   
+   
+ 
+   
+   
+ 
+ 
+   
+   
+ 
+   
+   
+ 
+   
+   ...
+
+``disk``
+   The ``disk`` element is the main container for describing disks and supports
+   the following attributes:
+
+   ``type``
+  Valid values are "file", "block", "dir" ( :since:`since 0.7.5` ),
+  "network" ( :since:`since 0.8.7` ), or "volume" ( :since:`since 1.0.5` ),
+  or "nvme" ( :since:`since 6.0.0` ) and refer to the underlying source for
+  the disk. :since:`Since 0.0.3`
+   ``device``
+  Indicates how the disk is to be exposed to the guest OS. Possible values
+  for this attribute are "floppy", "disk", "cdrom", and "lun", defaulting 
to
+  "disk".
+
+  Using "lun" ( :since:`since 0.9.10` ) is only valid when the ``type`` is
+  "block" or "network" for ``protocol='iscsi'`` or when the ``type`` is
+  "volume" when using an iSCSI source ``pool`` for ``mode`` "host" or as an
+  `NPIV `__ virtual Host Bus
+  Adapter (vHBA) using a Fibre Channel storage pool. Configured in this
+  manner, the LUN behaves identically to "disk", except that generic SCSI
+  commands from the guest are accepted and passed through to the physical
+  device. Also note that device='lun' will only be recognized for actual 
raw
+  devices, but never for individual partitions or LVM partitions (in those
+  cases, the kernel will reject the generic SCSI commands, making it
+  identical to device='disk'). :since:`Since 0.1.4`
+
+   ``model``
+  Indicates the emulated device model of the disk. Typically this is
+  indicated solely by the ``bus`` property but for ``bus`` "virtio" the
+  model can be specified further with "virtio-transitional",
+  "virtio-non-transitional", or "virtio". See `Virtio transitional
+  devices <#elementsVirtioTransitional>`__ for more details. :since:`Since
+  5.2.0`
+   ``rawio``
+  Indicates whether the disk needs rawio capability. Valid settings are
+  "yes" or "no" (default is "no"). If any one disk in a domain has
+  rawio='yes', rawio capability will be enabled for all disks in the domain
+  (because, in the case of QEMU, this capability can only be set on a
+  per-process basis). This attribute is only valid when device is "lun". 
NB,
+  ``rawio`` intends to confine the capability per-device, however, current
+  QEMU implementation gives the domain process broader capability than that
+  (per-process basis, affects all the domain disks). To confine the
+

[RFC PATCH 2/5] docs: Makefile: Distribute rst files meant for inclusion

2020-07-14 Thread Peter Krempa

Allow splitting bigger rst files into sections by using the .. include::
directive. We don't want those to be processed by docutils directly but
rather just included.

Signed-off-by: Peter Krempa 
---
 docs/Makefile.am | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/Makefile.am b/docs/Makefile.am
index a480123e33..f066d3e529 100644
--- a/docs/Makefile.am
+++ b/docs/Makefile.am
@@ -322,6 +322,8 @@ dot_html_generated_in = \
   $(NULL)
 dot_html_in = \
   $(notdir $(wildcard $(srcdir)/*.html.in))
+dot_rst_in = \
+  $(notdir $(wildcard $(srcdir)/*.rst.in))
 dot_rst = \
   $(notdir $(wildcard $(srcdir)/*.rst))
 dot_rst_html_in = \
@@ -364,6 +366,7 @@ EXTRA_DIST= \
   $(internals_html_in) $(internals_rst) $(fonts) \
   $(kbase_html_in) $(kbase_rst) \
   $(manpages_rst) \
+  $(dot_rst_in) \
   aclperms.htmlinc \
   $(schema_DATA)

-- 
2.26.2

[RFC PATCH 1/5] docs: css: Make definition list 'code' entries bold when converted from rst

2020-07-14 Thread Peter Krempa

Docutils doesn't generate  for inline literals (``blah``) in rst
but rather puts them in the '.literal' class. Add a selector for making
them bold as well.

Signed-off-by: Peter Krempa 
---
 docs/generic.css | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/generic.css b/docs/generic.css
index c4092abc2b..d15d85e67a 100644
--- a/docs/generic.css
+++ b/docs/generic.css
@@ -25,7 +25,7 @@ dt {
   margin-right: 2em;
 }

-dt code {
+dt code, dt .literal {
   font-weight: bold;
 }

-- 
2.26.2

[RFC PATCH 0/5] docs: formatdomain: Convert to rst

2020-07-14 Thread Peter Krempa

The conversion was done using following steps:

https://gitlab.com/pipo.sk/libvirt/-/commit/3d7e23948564502ff01160a73abd3973360764d2

Output version is browsable at:

https://pipo.sk.gitlab.io/-/libvirt/-/jobs/638030149/artifacts/website/formatdomain.html

Note that this keeps existing anchors in place to prevent breaking
links.

Peter Krempa (5):
  docs: css: Make definition list  'code' entries bold when converted
from rst
  docs: Makefile: Distribute rst files meant for inclusion
  docs: formatdomain: Convert to rst
  docs: formatdomain: Split out  section
  docs: formatdomain-devices: Split out split 

 docs/Makefile.am  |3 +
 docs/formatdomain-devices-disk.rst.in |  821 +++
 docs/formatdomain-devices.rst.in  | 4231 +++
 docs/formatdomain.html.in | 9846 -
 docs/formatdomain.rst | 2390 ++
 docs/generic.css  |2 +-
 6 files changed, 7446 insertions(+), 9847 deletions(-)
 create mode 100644 docs/formatdomain-devices-disk.rst.in
 create mode 100644 docs/formatdomain-devices.rst.in
 delete mode 100644 docs/formatdomain.html.in
 create mode 100644 docs/formatdomain.rst

-- 
2.26.2

Re: [libvirt PATCH 0/5] docs: platforms: Convert and simplify

2020-07-14 Thread Pavel Hrdina

On Tue, Jul 14, 2020 at 03:07:38PM +0200, Andrea Bolognani wrote:
> This makes our platform support policy easier to grasp, and also
> explicitly allows us to drop Debian 9, which is EOL as of last week.
> 
> Andrea Bolognani (5):
>   docs: platforms: Add brief outline
>   docs: platforms: Convert to reStructuredText
>   docs: platforms: Simplify support policy
>   docs: platforms: Clarify stance on third-party LTS efforts
>   docs: platforms: Mention Windows API target

Reviewed-by: Pavel Hrdina 


signature.asc
Description: PGP signature

Re: [PATCH 00/10] resolve hangs/crashes on libvirtd shutdown

2020-07-14 Thread Daniel Henrique Barboza


As far as code goes:


Reviewed-by: Daniel Henrique Barboza 


About the design I have a question about the timeout. Patch 5/10 is setting a
15 second timeout. How did you reach this value? Reading the bug, specially
this comment from Daniel:

https://bugzilla.redhat.com/show_bug.cgi?id=1828207#c6

He mentions "give it 5 seconds of running before shutting it down".

5 seconds before shutdown is something that most users can be slightly annoyed
but in the end don't mind that much, but 15 seconds is something that will
cause bugs to be opened because "Libvirt is taking too long to shutdown".
Besides, it's a fair assumption that a transaction that takes more than
5 or so seconds to finish is already compromised* - might as well shutdown
the daemon and deal with the errors.



* assuming user discretion to avoid shutting down the daemon in the middle
of a long duration transaction, of course.


Thanks,


DHB


On 7/14/20 6:32 AM, Nikolay Shirokovskiy wrote:

This series follows [1] but address the issue slightly differently.
Instead of polling for RPC thread pool termination it waits for
thread pool drain in distinct thread and then signal the main loop
to exit.

The series introduces new driver's methods stateShutdown/stateShutdownWait
to finish all driver's background threads. The methods however are only
implemented for qemu driver and only partially. There are other drivers
that have background threads and I don't check every one of them in
terms of how they manage their background threads.

For example node driver creates 2 threads. One of them is supposed to live
a for a short amount of time and thus not tracked. This thread can cause issues
on shutdown. The second thread is tracked and finished synchronously on driver
cleanup. So this thread can not cause crashes but can cause hangs theoretically
speaking so we may want to move the thread finishing to stateShutdownWait
method so that possible hang will be handled by shutdown timeout.

The qemu driver also has untracked threads and they can cause crashes on
shutdown. For example reconnect threads or reboot thread. These need to be
tracked.

I'm going to address these issues in qemu and other drivers once the overall
approach will be approved.

I added 2 new driver's methods so that thread finishing will be done in
parallel. If we have only one method then the shutdown is done one by one
effectively.

I've added clean shutdown timeout in event loop as suggested by Daniel in [2].
Now I think why we can't just go with systemd unit management? Systemd will
eventually send SIGKILL and we can tune the timeout using TimeoutStopUSec
parameter. This way we even don't need to introduce new driver's methods.
Driver's background threads can be finished in stateCleanup method. AFAIU as
drivers are cleaned up in reverse order it is safe in the sense that already
cleaned up driver can not be used by background threads of not yet cleaned up
driver. Of course this way the cleanup is not done in parallel. Well to
turn it into parallel we can introduce just stateShutdown which we don't need
to call in netdaemon code and thus don't introduce undesired dependency of
netdaemon on drivers concept.

[1] Resolve libvirtd hang on termination with connected long running client
 https://www.redhat.com/archives/libvir-list/2018-July/msg00382.html
[2] Races / crashes in shutdown of libvirtd daemon
 https://www.redhat.com/archives/libvir-list/2020-April/msg01328.html

Nikolay Shirokovskiy (10):
   libvirt: add stateShutdown/stateShutdownWait to drivers
   util: always initialize priority condition
   util: add stop/drain functions to thread pool
   rpc: don't unref service ref on socket behalf twice
   rpc: finish all threads before exiting main loop
   vireventthread: add virEventThreadClose
   qemu: exit thread synchronously in qemuDomainObjStopWorker
   qemu: implement driver's shutdown/shutdown wait methods
   rpc: cleanup virNetDaemonClose method
   util: remove unused virThreadPoolNew macro

  scripts/check-drivername.py   |  2 +
  src/driver-state.h|  8 
  src/libvirt.c | 42 +++
  src/libvirt_internal.h|  2 +
  src/libvirt_private.syms  |  3 ++
  src/libvirt_remote.syms   |  1 -
  src/qemu/qemu_domain.c|  1 +
  src/qemu/qemu_driver.c| 32 +++
  src/remote/remote_daemon.c|  3 --
  src/rpc/virnetdaemon.c| 95 ---
  src/rpc/virnetdaemon.h|  2 -
  src/rpc/virnetserver.c|  8 
  src/rpc/virnetserver.h|  1 +
  src/rpc/virnetserverservice.c |  1 -
  src/util/vireventthread.c |  9 
  src/util/vireventthread.h |  1 +
  src/util/virthreadpool.c  | 65 -
  src/util/virthreadpool.h  |  6 +--
  18 files changed, 238 insertions(+), 44 deletions(-)

Re: [GSoC][PATCH v4 4/4] qemu_domainjob: introduce `privateData` for `qemuDomainJobInfo`

2020-07-14 Thread Michal Privoznik


On 7/13/20 8:03 PM, Prathamesh Chavan wrote:

To remove dependecy of `qemuDomainJobInfo` on job specific
paramters, a `privateData` pointer is introduced.
To handle it, structure of callback functions is
also introduced.

Signed-off-by: Prathamesh Chavan 
---
  src/qemu/qemu_backup.c   | 15 +++--
  src/qemu/qemu_domain.h   | 18 ++
  src/qemu/qemu_domainjob.c| 98 +---
  src/qemu/qemu_domainjob.h| 31 +-
  src/qemu/qemu_driver.c   | 18 +++---
  src/qemu/qemu_migration.c| 14 +++--
  src/qemu/qemu_migration_cookie.c |  7 ++-
  src/qemu/qemu_process.c  | 11 +++-
  8 files changed, 154 insertions(+), 58 deletions(-)


I'm not exactly sure why this is needed. Can you shed more light into it 
please?


Michal

Re: The issue about adding multipath device's targets into qemu-pr-helper's namespace

2020-07-14 Thread Michal Privoznik


On 7/14/20 3:41 PM, Lin Ma wrote:

Hi all,

I have a namespace question about passthrough disk(multipath device).
In case of enabling namespace and cgroups in qemu.conf, The target(s) of 
the
multipath device won't be added into qemu-pr-helper's namespace under 
certain

situation, It causes the PERSISTENT RESERVE command failure in guest.



Yeah, this is expected. I mean, the failure is expected if not all 
devices are added to the namespace.



While user starts a vm,
To build namespace, The qemuDomainSetupDisk() will be invoked via 
threadA(this

thread id will be the qemu's pid),
To build cgroup, The qemuSetupImagePathCgroup() will be invoked via 
threadB.


Both of the functions invoke the virDevMapperGetTargets() trying to parse a
multipath device to target paths string, Then fill the targetPaths[].

The issue I experienced is:
After libvirtd started, Everything works well for the first booted vm 
which has

the passthrough multipath device.
But If I shut it down & start it again, OR keep it running & start 
another vm
which has other passthrough multipath device, Then the target(s) of the 
fresh
started vm won't be added into the related qemu-pr-helper's namespace 
and it

causes PERSISTENT RESERVE command failure in the corresponding guest.
I digged into code, In this situation, The targetPaths[] in 
qemuDomainSetupDisk()

won't be filled, it keeps NULL after virDevMapperGetTargets() returns.
The virDevMapperGetTargets doesn't fill targetPaths[] because the 
dm_task_run()

of libdevmapper returns 0 with errno 9(Bad file descriptor).
So far, I don't understand why the dm_task_run() return 0 in this 
situation.
BTW, The virDevMapperGetTargets() can always successfully fill the 
targetPaths[]

in qemuSetupImagePathCgroup().


What is your libvirt version? I've merged a fix for something similar 
not a long ago: v6.5.0-rc1~190


Can you please check v6.5.0?

Michal

Re: [GSoC][PATCH v4 1/4] qemu_domain: remove passing `qemuDomainObjPrivatePtr` as param

2020-07-14 Thread Michal Privoznik


On 7/13/20 8:03 PM, Prathamesh Chavan wrote:

`qemuDomainObjPrivatePtr` parameter was avoided being passed
as a paramter in functions `qemuDomainObjPrivateXMLParseJob`
and `qemuDomainObjPrivateXMLFormatJob`, as we already pass
`virDomainObjPtr`, which can be used to get `privateData`
pointer.

Signed-off-by: Prathamesh Chavan 
---
  src/qemu/qemu_domain.c | 14 +++---
  1 file changed, 7 insertions(+), 7 deletions(-)


Reviewed-by: Michal Privoznik 

Michal

Re: [GSoC][PATCH v4 3/4] qemu_domainjob: introduce `privateData` for `qemuDomainJob`

2020-07-14 Thread Michal Privoznik


On 7/13/20 8:03 PM, Prathamesh Chavan wrote:

To remove dependecy of `qemuDomainJob` on job specific
paramters, a `privateData` pointer is introduced.
To handle it, structure of callback functions is
also introduced.

Signed-off-by: Prathamesh Chavan 
---
The static callback functions structure was declared in `qemu_domain.c`
Due to this, all callback functions (present in `qemu_domainobj.c`)
were exposed using the .h file to allow their access by the
`qemu_domain.c` file.



Yeah, well callbacks should be static thus should live in the 
qemu_domain.c too. Also, the private data struct is defined in 
qemu_domain.h and thus callbacks which allocate/free/format/parse should 
live in the corresponding .c file.



  src/qemu/qemu_domain.c   |  9 ++-
  src/qemu/qemu_domain.h   | 10 
  src/qemu/qemu_domainjob.c| 94 
  src/qemu/qemu_domainjob.h| 44 ---
  src/qemu/qemu_driver.c   |  3 +-
  src/qemu/qemu_migration.c| 28 +++---
  src/qemu/qemu_migration_params.c |  9 ++-
  src/qemu/qemu_process.c  | 15 +++--
  8 files changed, 165 insertions(+), 47 deletions(-)


This looks almost perfect.



diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 10d2033db1..a02865dc59 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -88,6 +88,13 @@ VIR_ENUM_IMPL(qemuDomainNamespace,
"mount",
  );
  
+static qemuDomainObjPrivateJobCallbacks qemuPrivateJobCallbacks = {

+.allocJobPrivate = ,
+.freeJobPrivate = ,
+.formatJob = ,
+.parseJob = ,
+};


I think we will need one callback more - resetJobPrivate. More below.
Also, there is no need to put ampersand before functions. Compilers are 
wise enough to tell that we need the address of the functions.



+
  /**
   * qemuDomainObjFromDomain:
   * @domain: Domain pointer that has to be looked up
@@ -1585,7 +1592,7 @@ qemuDomainObjPrivateAlloc(void *opaque)
  if (VIR_ALLOC(priv) < 0)
  return NULL;
  
-if (qemuDomainObjInitJob(>job) < 0) {

+if (qemuDomainObjInitJob(>job, ) < 0) {
  virReportSystemError(errno, "%s",
   _("Unable to init qemu driver mutexes"));
  goto error;
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index e524fd0002..bb9b414a46 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -492,6 +492,16 @@ struct _qemuDomainXmlNsDef {
  char **capsdel;
  };
  
+typedef struct _qemuDomainJobPrivate qemuDomainJobPrivate;

+typedef qemuDomainJobPrivate *qemuDomainJobPrivatePtr;
+struct _qemuDomainJobPrivate {
+bool spiceMigration;/* we asked for spice migration and we
+ * should wait for it to finish */
+bool spiceMigrated; /* spice migration completed */
+bool dumpCompleted; /* dump completed */
+qemuMigrationParamsPtr migParams;
+};
+
  int qemuDomainObjStartWorker(virDomainObjPtr dom);
  void qemuDomainObjStopWorker(virDomainObjPtr dom);
  
diff --git a/src/qemu/qemu_domainjob.c b/src/qemu/qemu_domainjob.c

index d96d5334a3..fe160afe79 100644
--- a/src/qemu/qemu_domainjob.c
+++ b/src/qemu/qemu_domainjob.c
@@ -159,23 +159,32 @@ qemuDomainEventEmitJobCompleted(virQEMUDriverPtr driver,
  virObjectEventStateQueue(driver->domainEventState, event);
  }
  
-

-int
-qemuDomainObjInitJob(qemuDomainJobObjPtr job)
+void *
+qemuJobAllocPrivate(void)
  {
-memset(job, 0, sizeof(*job));
-
-if (virCondInit(>cond) < 0)
-return -1;
+qemuDomainJobPrivatePtr priv;
+if (VIR_ALLOC(priv) < 0)
+return NULL;
+return (void *)priv;


Or plain: g_new0(qemuDomainJobPrivate, 1)


+}
  
-if (virCondInit(>asyncCond) < 0) {

-virCondDestroy(>cond);
-return -1;
-}
  
-return 0;

+static void
+qemuJobFreePrivateData(qemuDomainJobPrivatePtr priv)
+{
+priv->spiceMigration = false;
+priv->spiceMigrated = false;
+priv->dumpCompleted = false;
+qemuMigrationParamsFree(priv->migParams);
+priv->migParams = NULL;


See, this is what previously qemuDomainObjResetAsyncJob() was doing. And 
while qemuDomainObjResetAsyncJob() is called from qemuDomainObjFreeJob() 
it is also called from qemuDomainObjBeginJobInternal() and other places. 
Therefore, naming this Free() is probably not the best idea. It should 
be named qemuJobResetPrivate() or something, esp. beacuse it doesn't 
really free the private data (it doesn't call free(priv)), it just 
clears/resets them.



  }
  
+void

+qemuJobFreePrivate(void *opaque)
+{
+qemuDomainJobObjPtr job = (qemuDomainJobObjPtr) opaque;
+qemuJobFreePrivateData(job->privateData);


And this will actually free the data.


+}
  
  static void

  qemuDomainObjResetJob(qemuDomainJobObjPtr job)
@@ -207,14 +216,11 @@ qemuDomainObjResetAsyncJob(qemuDomainJobObjPtr job)
  job->phase = 0;
  job->mask = QEMU_JOB_DEFAULT_MASK;

Re: [GSoC][PATCH v4 2/4] qemu_domainjob: moved PrivateXML parse-job and format-job

2020-07-14 Thread Michal Privoznik


On 7/13/20 8:03 PM, Prathamesh Chavan wrote:

Functions `qemuDomainObjPrivateXMLParseJob` and
`qemuDomainObjPrivateXMLFormatJob` moved from
`qemu_domain` to `qemu_domainjob`.

Signed-off-by: Prathamesh Chavan 
---
  src/qemu/qemu_domain.c| 241 --
  src/qemu/qemu_domainjob.c | 241 ++
  src/qemu/qemu_domainjob.h |   8 ++
  3 files changed, 249 insertions(+), 241 deletions(-)


Reviewed-by: Michal Privoznik 

Michal

Re: [GSoC][PATCH v4 3/4] qemu_domainjob: introduce `privateData` for `qemuDomainJob`

2020-07-14 Thread Martin Kletzander


On Mon, Jul 13, 2020 at 11:33:40PM +0530, Prathamesh Chavan wrote:

To remove dependecy of `qemuDomainJob` on job specific
paramters, a `privateData` pointer is introduced.
To handle it, structure of callback functions is
also introduced.

Signed-off-by: Prathamesh Chavan 
---
The static callback functions structure was declared in `qemu_domain.c`
Due to this, all callback functions (present in `qemu_domainobj.c`)
were exposed using the .h file to allow their access by the
`qemu_domain.c` file.

src/qemu/qemu_domain.c   |  9 ++-
src/qemu/qemu_domain.h   | 10 
src/qemu/qemu_domainjob.c| 94 
src/qemu/qemu_domainjob.h| 44 ---
src/qemu/qemu_driver.c   |  3 +-
src/qemu/qemu_migration.c| 28 +++---
src/qemu/qemu_migration_params.c |  9 ++-
src/qemu/qemu_process.c  | 15 +++--
8 files changed, 165 insertions(+), 47 deletions(-)



So the biggest issue I have with this is that it is impossible to see which of
these functions are going to end up in `src/hypervisor/virjob.c` and which of
them will stay qemu-specific in `src/qemu/qemu_domainjob.c`.

I would suggest either:

1) marking them as such in some understandable way or

2) leaving qemu-specific code in `qemu_driver.c`

3) creating yet another file for what is going to become `virjob.c`

Otherwise I'm getting confused with some of the things, why are they there etc.


diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 10d2033db1..a02865dc59 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -88,6 +88,13 @@ VIR_ENUM_IMPL(qemuDomainNamespace,
  "mount",
);

+static qemuDomainObjPrivateJobCallbacks qemuPrivateJobCallbacks = {
+.allocJobPrivate = ,
+.freeJobPrivate = ,
+.formatJob = ,
+.parseJob = ,
+};
+
/**
 * qemuDomainObjFromDomain:
 * @domain: Domain pointer that has to be looked up
@@ -1585,7 +1592,7 @@ qemuDomainObjPrivateAlloc(void *opaque)
if (VIR_ALLOC(priv) < 0)
return NULL;

-if (qemuDomainObjInitJob(>job) < 0) {
+if (qemuDomainObjInitJob(>job, ) < 0) {
virReportSystemError(errno, "%s",
 _("Unable to init qemu driver mutexes"));
goto error;
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index e524fd0002..bb9b414a46 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -492,6 +492,16 @@ struct _qemuDomainXmlNsDef {
char **capsdel;
};

+typedef struct _qemuDomainJobPrivate qemuDomainJobPrivate;
+typedef qemuDomainJobPrivate *qemuDomainJobPrivatePtr;
+struct _qemuDomainJobPrivate {
+bool spiceMigration;/* we asked for spice migration and we
+ * should wait for it to finish */
+bool spiceMigrated; /* spice migration completed */
+bool dumpCompleted; /* dump completed */
+qemuMigrationParamsPtr migParams;
+};
+
int qemuDomainObjStartWorker(virDomainObjPtr dom);
void qemuDomainObjStopWorker(virDomainObjPtr dom);

diff --git a/src/qemu/qemu_domainjob.c b/src/qemu/qemu_domainjob.c
index d96d5334a3..fe160afe79 100644
--- a/src/qemu/qemu_domainjob.c
+++ b/src/qemu/qemu_domainjob.c
@@ -159,23 +159,32 @@ qemuDomainEventEmitJobCompleted(virQEMUDriverPtr driver,
virObjectEventStateQueue(driver->domainEventState, event);
}

-
-int
-qemuDomainObjInitJob(qemuDomainJobObjPtr job)
+void *
+qemuJobAllocPrivate(void)
{
-memset(job, 0, sizeof(*job));
-
-if (virCondInit(>cond) < 0)
-return -1;
+qemuDomainJobPrivatePtr priv;
+if (VIR_ALLOC(priv) < 0)
+return NULL;
+return (void *)priv;
+}

-if (virCondInit(>asyncCond) < 0) {
-virCondDestroy(>cond);
-return -1;
-}

-return 0;
+static void
+qemuJobFreePrivateData(qemuDomainJobPrivatePtr priv)
+{
+priv->spiceMigration = false;
+priv->spiceMigrated = false;
+priv->dumpCompleted = false;
+qemuMigrationParamsFree(priv->migParams);
+priv->migParams = NULL;
}

+void
+qemuJobFreePrivate(void *opaque)
+{
+qemuDomainJobObjPtr job = (qemuDomainJobObjPtr) opaque;
+qemuJobFreePrivateData(job->privateData);
+}

static void
qemuDomainObjResetJob(qemuDomainJobObjPtr job)
@@ -207,14 +216,11 @@ qemuDomainObjResetAsyncJob(qemuDomainJobObjPtr job)
job->phase = 0;
job->mask = QEMU_JOB_DEFAULT_MASK;
job->abortJob = false;
-job->spiceMigration = false;
-job->spiceMigrated = false;
-job->dumpCompleted = false;
VIR_FREE(job->error);
g_clear_pointer(>current, qemuDomainJobInfoFree);
-qemuMigrationParamsFree(job->migParams);
-job->migParams = NULL;
+job->cb->freeJobPrivate(job);


If this function is supposed to be in `virjob.c`, then the callback function
should be just called with `job->privateData` which will help you drop one level
of indirection by skipping qemuJobFreePrivate() and calling
qemuJobFreePrivateData() directly.  The function would take

[libvirt PATCH 2/5] docs: platforms: Convert to reStructuredText

2020-07-14 Thread Andrea Bolognani

Signed-off-by: Andrea Bolognani 
---
 docs/platforms.html.in | 119 -
 docs/platforms.rst | 100 ++
 2 files changed, 100 insertions(+), 119 deletions(-)
 delete mode 100644 docs/platforms.html.in
 create mode 100644 docs/platforms.rst

diff --git a/docs/platforms.html.in b/docs/platforms.html.in
deleted file mode 100644
index c50279bffd..00
--- a/docs/platforms.html.in
+++ /dev/null
@@ -1,119 +0,0 @@
-
-
-http://www.w3.org/1999/xhtml;>
-  
-Supported host platforms
-
-
-
-
-  Libvirt aims to support building and executing on multiple host OS
-  platforms, as well as working with multiple hypervisors. This document
-  outlines which platforms are targeted for each of these areas.
-
-
-Build targets
-
-
-  These platforms are used as the basis for deciding
-  upon the minimum required versions of 3rd party software libvirt depends
-  on. If a platform is not listed here, it does not imply that libvirt
-  won't work. If an unlisted platform has comparable software versions
-  to a listed platform, there is every expectation that it will work.
-  Bug reports are welcome for problems encountered on unlisted platforms
-  unless they are clearly older vintage than what is described here.
-
-
-
-  Note that when considering software versions shipped in distros as
-  support targets, libvirt considers only the version number, and assumes
-  the features in that distro match the upstream release with the same
-  version. In other words, if a distro backports extra features to the
-  software in their distro, libvirt upstream code will not add explicit
-  support for those backports, unless the feature is auto-detectable in
-  a manner that works for the upstream releases too.
-
-
-
-  The Repology site is a useful resource to identify currently shipped
-  versions of software in various operating systems, though it does not
-  cover all distros listed below.
-
-
-
-  https://repology.org/metapackage/libvirt/versions;>libvirt
-  https://repology.org/metapackage/qemu/versions;>qemu
-  https://repology.org/metapackage/qemu-kvm/versions;>qemu-kvm
-
-
-
-Linux OS
-
-
-  For distributions with frequent, short-lifetime releases, the project
-  will aim to support all versions that are not end of life by their
-  respective vendors. For the purposes of identifying supported software
-  versions, the project will look at Fedora, Ubuntu, and openSUSE distros.
-  Other short-lifetime distros will be assumed to ship similar software
-  versions.
-
-
-
-  For distributions with long-lifetime releases, the project will aim to
-  support the most recent major version at all times. Support for the
-  previous major version will be dropped 2 years after the new major
-  version is released. For the purposes of identifying supported software
-  versions, the project will look at RHEL, Debian, Ubuntu LTS, and SLES
-  distros. Other long-lifetime distros will be assumed to ship similar
-  software versions.
-
-
-Windows
-
-
-  The project supports building with current versions of the MinGW
-  toolchain, hosted on Linux.
-
-
-macOS
-
-
-  The project aims to support the most recent major version
-  at all times. Support for the previous major version will
-  be dropped 2 years after the new major version is released.
-
-
-
-  Note that to compile libvirt will require extra packages
-  to be made available on the macOS host. It is recommended
-  to use https://brew.sh/;>HomeBrew since this
-  is what libvirt CI tests with, however, https://www.macports.org/;>MacPorts
-  is an alternative option that is likely to work.
-
-
-FreeBSD
-
-
-  The project aims to support the most recent major version
-  at all times. Support for the previous major version will
-  be dropped 2 years after the new major version is released.
-
-
-Virtualization platforms
-
-
-  For hypervisor drivers which execute
-  locally (QEMU, LXC, VZ, libxl, etc), the set of supported operating
-  system platforms listed above will inform choices as to the minimum
-  required versions of 3rd party libraries and hypervisor management
-  APIs.
-
-
-  If a hypervisor is not commonly shipped directly by any distro
-  listed above, (VMware ESX, HyperV, VZ), the project aims to
-  support versions up to 5 years, or until the vendor discontinues
-  support, whichever comes first.
-
-
-  
-
diff --git a/docs/platforms.rst b/docs/platforms.rst
new file mode 100644
index 00..2845ac40ea
--- /dev/null
+++ b/docs/platforms.rst
@@ -0,0 +1,100 @@
+
+Supported host platforms
+
+
+..

[libvirt PATCH 5/5] docs: platforms: Mention Windows API target

2020-07-14 Thread Andrea Bolognani

Signed-off-by: Andrea Bolognani 
---
 docs/platforms.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/platforms.rst b/docs/platforms.rst
index 4fcb458635..b0ce0c3737 100644
--- a/docs/platforms.rst
+++ b/docs/platforms.rst
@@ -59,6 +59,9 @@ Windows
 The project supports building with current versions of the MinGW toolchain,
 hosted on Linux.
 
+The version of the Windows API that's currently targeted is Vista / Server
+2008.
+
 
 Virtualization platforms
 
-- 
2.25.4

[libvirt PATCH 3/5] docs: platforms: Simplify support policy

2020-07-14 Thread Andrea Bolognani

We discuss Linux, FreeBSD and macOS separately, and we even go as
far as splitting Linux distros into short-lifetime and long-lifetime,
when ultimately the same two priciples apply everywhere: we don't
want to support a platform longer than its vendor does, and in cases
where the vendor support is extremely long we need to have a
time-based escape hatch.

Signed-off-by: Andrea Bolognani 
---
 docs/platforms.rst | 47 ++
 1 file changed, 14 insertions(+), 33 deletions(-)

diff --git a/docs/platforms.rst b/docs/platforms.rst
index 2845ac40ea..702d8d56d3 100644
--- a/docs/platforms.rst
+++ b/docs/platforms.rst
@@ -36,46 +36,27 @@ distros listed below.
 * `qemu on Repology`_
 * `qemu-kvm on Repology`_
 
-Linux OS
-
-
-For distributions with frequent, short-lifetime releases, the project will aim
-to support all versions that are not end of life by their respective vendors.
-For the purposes of identifying supported software versions, the project will
-look at Fedora, Ubuntu, and openSUSE distros.  Other short-lifetime distros
-will be assumed to ship similar software versions.
-
-For distributions with long-lifetime releases, the project will aim to support
-the most recent major version at all times. Support for the previous major
-version will be dropped 2 years after the new major version is released. For
-the purposes of identifying supported software versions, the project will look
-at RHEL, Debian, Ubuntu LTS, and SLES distros. Other long-lifetime distros will
-be assumed to ship similar software versions.
-
-Windows

-
-The project supports building with current versions of the MinGW toolchain,
-hosted on Linux.
-
-macOS
--
+Linux, FreeBSD and macOS
+
 
 The project aims to support the most recent major version at all times. Support
 for the previous major version will be dropped 2 years after the new major
-version is released.
+version is released or when the vendor itself drops support, whichever comes
+first.
 
-Note that to compile libvirt will require extra packages to be made available
-on the macOS host. It is recommended to use `HomeBrew`_ since this is what
-libvirt CI tests with, however, `MacPorts`_ is an alternative option that is
-likely to work.
+For the purposes of identifying supported software versions available on Linux,
+the project will look at CentOS, Debian, Fedora, openSUSE, RHEL, SLES and
+Ubuntu LTS. Other distros will be assumed to ship similar software versions.
 
-FreeBSD
+For FreeBSD, decisions will be made based on the contents of the ports tree;
+for macOS, `HomeBrew`_ will be used, although `MacPorts`_ is expected to carry
+similar versions.
+
+Windows
 ---
 
-The project aims to support the most recent major version at all times. Support
-for the previous major version will be dropped 2 years after the new major
-version is released.
+The project supports building with current versions of the MinGW toolchain,
+hosted on Linux.
 
 
 Virtualization platforms
-- 
2.25.4

[libvirt PATCH 0/5] docs: platforms: Convert and simplify

2020-07-14 Thread Andrea Bolognani

This makes our platform support policy easier to grasp, and also
explicitly allows us to drop Debian 9, which is EOL as of last week.

Andrea Bolognani (5):
  docs: platforms: Add brief outline
  docs: platforms: Convert to reStructuredText
  docs: platforms: Simplify support policy
  docs: platforms: Clarify stance on third-party LTS efforts
  docs: platforms: Mention Windows API target

 docs/platforms.html.in | 115 -
 docs/platforms.rst |  85 ++
 2 files changed, 85 insertions(+), 115 deletions(-)
 delete mode 100644 docs/platforms.html.in
 create mode 100644 docs/platforms.rst

-- 
2.25.4

[libvirt PATCH 4/5] docs: platforms: Clarify stance on third-party LTS efforts

2020-07-14 Thread Andrea Bolognani

Signed-off-by: Andrea Bolognani 
---
 docs/platforms.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/platforms.rst b/docs/platforms.rst
index 702d8d56d3..4fcb458635 100644
--- a/docs/platforms.rst
+++ b/docs/platforms.rst
@@ -42,7 +42,8 @@ Linux, FreeBSD and macOS
 The project aims to support the most recent major version at all times. Support
 for the previous major version will be dropped 2 years after the new major
 version is released or when the vendor itself drops support, whichever comes
-first.
+first. In this context, third-party efforts to extend the lifetime of a distro
+are not considered, even when they are endorsed by the vendor (eg. Debian LTS).
 
 For the purposes of identifying supported software versions available on Linux,
 the project will look at CentOS, Debian, Fedora, openSUSE, RHEL, SLES and
-- 
2.25.4

[libvirt PATCH 1/5] docs: platforms: Add brief outline

2020-07-14 Thread Andrea Bolognani

This will make the document look nicer, especially after we have
converted it to reStructuredText.

Signed-off-by: Andrea Bolognani 
---
 docs/platforms.html.in | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/docs/platforms.html.in b/docs/platforms.html.in
index 5c90f1dd7a..c50279bffd 100644
--- a/docs/platforms.html.in
+++ b/docs/platforms.html.in
@@ -6,12 +6,16 @@
 
 
 
+
+  Libvirt aims to support building and executing on multiple host OS
+  platforms, as well as working with multiple hypervisors. This document
+  outlines which platforms are targeted for each of these areas.
+
+
 Build targets
 
 
-  Libvirt drivers aim to support building and executing on multiple
-  host OS platforms. This document outlines which platforms are the
-  major build targets. These platforms are used as the basis for deciding
+  These platforms are used as the basis for deciding
   upon the minimum required versions of 3rd party software libvirt depends
   on. If a platform is not listed here, it does not imply that libvirt
   won't work. If an unlisted platform has comparable software versions
-- 
2.25.4

Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Sean Mooney

On Tue, 2020-07-14 at 11:21 +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:
> > hi folks,
> > we are defining a device migration compatibility interface that helps upper
> > layer stack like openstack/ovirt/libvirt to check if two devices are
> > live migration compatible.
> > The "devices" here could be MDEVs, physical devices, or hybrid of the two.
> > e.g. we could use it to check whether
> > - a src MDEV can migrate to a target MDEV,
mdev live migration is completely possible to do but i agree with Dan 
barrange's comments
from the point of view of openstack integration i dont see calling out to a 
vender sepecific
tool to be an accpetable
solutions for device compatiablity checking. the sys filesystem
that describs the mdevs that can be created shoudl also
contain the relevent infomation such
taht nova could integrate it via libvirt xml representation or directly retrive 
the
info from
sysfs.
> > - a src VF in SRIOV can migrate to a target VF in SRIOV,
so vf to vf migration is not possible in the general case as there is no 
standarised
way to transfer teh device state as part of the siorv specs produced by the 
pci-sig
as such there is not vender neutral way to support sriov live migration. 
> > - a src MDEV can migration to a target VF in SRIOV.
that also makes this unviable
> >   (e.g. SIOV/SRIOV backward compatibility case)
> > 
> > The upper layer stack could use this interface as the last step to check
> > if one device is able to migrate to another device before triggering a real
> > live migration procedure.
well actully that is already too late really. ideally we would want to do this 
compaiablity
check much sooneer to avoid the migration failing. in an openstack envionment  
at least
by the time we invoke libvirt (assuming your using the libvirt driver) to do 
the migration we have alreaedy
finished schduling the instance to the new host. if if we do the compatiablity 
check at this point
and it fails then the live migration is aborted and will not be retired. These 
types of late check lead to a
poor user experince as unless you check the migration detial it basically looks 
like the migration was ignored
as it start to migrate and then continuge running on the orgininal host.

when using generic pci passhotuhg with openstack, the pci alias is intended to 
reference a single vendor id/product
id so you will have 1+ alias for each type of device. that allows openstack to 
schedule based on the availability of a
compatibale device because we track inventories of pci devices and can query 
that when selecting a host.

if we were to support mdev live migration in the future we would want to take 
the same declarative approch.
1 interospec the capability of the deivce we manage
2 create inventories of the allocatable devices and there capabilities
3 schdule the instance to a host based on the device-type/capabilities and 
claim it atomicly to prevent raceces
4 have the lower level hyperviors do addtional validation if need prelive 
migration.

this proposal seams to be targeting extending step 4 where as ideally we should 
focuse on providing the info that would
be relevant in set 1 preferably in a vendor neutral way vai a kernel interface 
like /sys.
 
> > we are not sure if this interface is of value or help to you. please don't
> > hesitate to drop your valuable comments.
> > 
> > 
> > (1) interface definition
> > The interface is defined in below way:
> > 
> >  __userspace
> >   /\  \
> >  / \write
> > / read  \
> >/__   ___\|/_
> >   | migration_version | | migration_version |-->check migration
> >   - -   compatibility
> >  device Adevice B
> > 
> > 
> > a device attribute named migration_version is defined under each device's
> > sysfs node. e.g. 
> > (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version).
this might be useful as we could tag the inventory with the migration version 
and only might to
devices with  the same version
> > userspace tools read the migration_version as a string from the source 
> > device,
> > and write it to the migration_version sysfs attribute in the target device.
this would not be useful as the schduler cannot directlly connect to the 
compute host
and even if it could it would be extreamly slow to do this for 1000s of hosts 
and potentally
multiple devices per host.
> > 
> > The userspace should treat ANY of below conditions as two devices not 
> > compatible:
> > - any one of the two devices does not have a migration_version attribute
> > - error when reading from migration_version attribute of one device
> > - error when writing migration_version string of one device to
> >   migration_version attribute of the other device
> > 
> > The string read from migration_version attribute is defined by

Re: device compatibility interface for live migration with assigned devices

2020-07-14 Thread Daniel P . Berrangé

On Tue, Jul 14, 2020 at 07:29:57AM +0800, Yan Zhao wrote:
> hi folks,
> we are defining a device migration compatibility interface that helps upper
> layer stack like openstack/ovirt/libvirt to check if two devices are
> live migration compatible.
> The "devices" here could be MDEVs, physical devices, or hybrid of the two.
> e.g. we could use it to check whether
> - a src MDEV can migrate to a target MDEV,
> - a src VF in SRIOV can migrate to a target VF in SRIOV,
> - a src MDEV can migration to a target VF in SRIOV.
>   (e.g. SIOV/SRIOV backward compatibility case)
> 
> The upper layer stack could use this interface as the last step to check
> if one device is able to migrate to another device before triggering a real
> live migration procedure.
> we are not sure if this interface is of value or help to you. please don't
> hesitate to drop your valuable comments.
> 
> 
> (1) interface definition
> The interface is defined in below way:
> 
>  __userspace
>   /\  \
>  / \write
> / read  \
>/__   ___\|/_
>   | migration_version | | migration_version |-->check migration
>   - -   compatibility
>  device Adevice B
> 
> 
> a device attribute named migration_version is defined under each device's
> sysfs node. e.g. 
> (/sys/bus/pci/devices/\:00\:02.0/$mdev_UUID/migration_version).
> userspace tools read the migration_version as a string from the source device,
> and write it to the migration_version sysfs attribute in the target device.
> 
> The userspace should treat ANY of below conditions as two devices not 
> compatible:
> - any one of the two devices does not have a migration_version attribute
> - error when reading from migration_version attribute of one device
> - error when writing migration_version string of one device to
>   migration_version attribute of the other device
> 
> The string read from migration_version attribute is defined by device vendor
> driver and is completely opaque to the userspace.
> for a Intel vGPU, string format can be defined like
> "parent device PCI ID" + "version of gvt driver" + "mdev type" + "aggregator 
> count".
> 
> for an NVMe VF connecting to a remote storage. it could be
> "PCI ID" + "driver version" + "configured remote storage URL"
> 
> for a QAT VF, it may be
> "PCI ID" + "driver version" + "supported encryption set".
> 
> (to avoid namespace confliction from each vendor, we may prefix a driver name 
> to
> each migration_version string. e.g. i915-v1-8086-591d-i915-GVTg_V5_8-1)
> 
> 
> (2) backgrounds
> 
> The reason we hope the migration_version string is opaque to the userspace
> is that it is hard to generalize standard comparing fields and comparing
> methods for different devices from different vendors.
> Though userspace now could still do a simple string compare to check if
> two devices are compatible, and result should also be right, it's still
> too limited as it excludes the possible candidate whose migration_version
> string fails to be equal.
> e.g. an MDEV with mdev_type_1, aggregator count 3 is probably compatible
> with another MDEV with mdev_type_3, aggregator count 1, even their
> migration_version strings are not equal.
> (assumed mdev_type_3 is of 3 times equal resources of mdev_type_1).
> 
> besides that, driver version + configured resources are all elements demanding
> to take into account.
> 
> So, we hope leaving the freedom to vendor driver and let it make the final 
> decision
> in a simple reading from source side and writing for test in the target side 
> way.
> 
> 
> we then think the device compatibility issues for live migration with assigned
> devices can be divided into two steps:
> a. management tools filter out possible migration target devices.
>Tags could be created according to info from product specification.
>we think openstack/ovirt may have vendor proprietary components to create
>those customized tags for each product from each vendor.

>for Intel vGPU, with a vGPU(a MDEV device) in source side, the tags to
>search target vGPU are like:
>a tag for compatible parent PCI IDs,
>a tag for a range of gvt driver versions,
>a tag for a range of mdev type + aggregator count
> 
>for NVMe VF, the tags to search target VF may be like:
>a tag for compatible PCI IDs,
>a tag for a range of driver versions,
>a tag for URL of configured remote storage.

Requiring management application developers to figure out this possible
compatibility based on prod specs is really unrealistic. Product specs
are typically as clear as mud, and with the suggestion we consider
different rules for different types of devices, add up to a huge amount
of complexity. This isn't something app developers should have to spend
their time figuring out.

The suggestion that we make use of vendor proprietary helper

[libvirt PATCH 0/2] ci: Refresh Dockerfiles

2020-07-14 Thread Andrea Bolognani

Pushed under the "Dockerfile refresh" rule.

Andrea Bolognani (2):
  ci: Exclude macOS when generating Dockerfiles
  ci: Refresh Dockerfiles

 ci/containers/libvirt-centos-7.Dockerfile | 2 ++
 ci/containers/libvirt-centos-8.Dockerfile | 2 ++
 ci/containers/libvirt-centos-stream.Dockerfile| 2 ++
 ci/containers/libvirt-debian-10-cross-aarch64.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-10-cross-armv6l.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-10-cross-armv7l.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-10-cross-i686.Dockerfile | 2 ++
 ci/containers/libvirt-debian-10-cross-mips.Dockerfile | 2 ++
 ci/containers/libvirt-debian-10-cross-mips64el.Dockerfile | 2 ++
 ci/containers/libvirt-debian-10-cross-mipsel.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-10-cross-ppc64le.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-10-cross-s390x.Dockerfile| 2 ++
 ci/containers/libvirt-debian-10.Dockerfile| 2 ++
 ci/containers/libvirt-debian-9-cross-aarch64.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-9-cross-armv6l.Dockerfile| 2 ++
 ci/containers/libvirt-debian-9-cross-armv7l.Dockerfile| 2 ++
 ci/containers/libvirt-debian-9-cross-mips.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-9-cross-mips64el.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-9-cross-mipsel.Dockerfile| 2 ++
 ci/containers/libvirt-debian-9-cross-ppc64le.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-9-cross-s390x.Dockerfile | 2 ++
 ci/containers/libvirt-debian-9.Dockerfile | 2 ++
 ci/containers/libvirt-debian-sid-cross-aarch64.Dockerfile | 2 ++
 ci/containers/libvirt-debian-sid-cross-armv6l.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-sid-cross-armv7l.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-sid-cross-i686.Dockerfile| 2 ++
 ci/containers/libvirt-debian-sid-cross-mips64el.Dockerfile| 2 ++
 ci/containers/libvirt-debian-sid-cross-mipsel.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-sid-cross-ppc64le.Dockerfile | 2 ++
 ci/containers/libvirt-debian-sid-cross-s390x.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-sid.Dockerfile   | 2 ++
 ci/containers/libvirt-fedora-31.Dockerfile| 2 ++
 ci/containers/libvirt-fedora-32.Dockerfile| 2 ++
 ci/containers/libvirt-fedora-rawhide-cross-mingw32.Dockerfile | 2 ++
 ci/containers/libvirt-fedora-rawhide-cross-mingw64.Dockerfile | 2 ++
 ci/containers/libvirt-fedora-rawhide.Dockerfile   | 2 ++
 ci/containers/libvirt-opensuse-151.Dockerfile | 4 +++-
 ci/containers/libvirt-ubuntu-1804.Dockerfile  | 2 ++
 ci/containers/libvirt-ubuntu-2004.Dockerfile  | 2 ++
 ci/containers/refresh | 2 +-
 40 files changed, 80 insertions(+), 2 deletions(-)

-- 
2.25.4

[libvirt PATCH 1/2] ci: Exclude macOS when generating Dockerfiles

2020-07-14 Thread Andrea Bolognani

Just like FreeBSD, macOS is not supported by the Dockerfile
generator.

Signed-off-by: Andrea Bolognani 
---
 ci/containers/refresh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ci/containers/refresh b/ci/containers/refresh
index 3a3594caa4..4cae30b869 100755
--- a/ci/containers/refresh
+++ b/ci/containers/refresh
@@ -14,7 +14,7 @@ then
 exit 1
 fi
 
-HOSTS=$($LCITOOL hosts | grep -v freebsd)
+HOSTS=$($LCITOOL hosts | grep -Ev 'freebsd|macos')
 
 for host in $HOSTS
 do
-- 
2.25.4

[libvirt PATCH 2/2] ci: Refresh Dockerfiles

2020-07-14 Thread Andrea Bolognani

All targets get cpanm, which is now part of the base system, and
xz is now installed explicitly instead of relying on it being either
present by default or dragged in indirectly.

The corresponding libvirt-ci commit is 8920e8f408ba.

Signed-off-by: Andrea Bolognani 
---
 ci/containers/libvirt-centos-7.Dockerfile | 2 ++
 ci/containers/libvirt-centos-8.Dockerfile | 2 ++
 ci/containers/libvirt-centos-stream.Dockerfile| 2 ++
 ci/containers/libvirt-debian-10-cross-aarch64.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-10-cross-armv6l.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-10-cross-armv7l.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-10-cross-i686.Dockerfile | 2 ++
 ci/containers/libvirt-debian-10-cross-mips.Dockerfile | 2 ++
 ci/containers/libvirt-debian-10-cross-mips64el.Dockerfile | 2 ++
 ci/containers/libvirt-debian-10-cross-mipsel.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-10-cross-ppc64le.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-10-cross-s390x.Dockerfile| 2 ++
 ci/containers/libvirt-debian-10.Dockerfile| 2 ++
 ci/containers/libvirt-debian-9-cross-aarch64.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-9-cross-armv6l.Dockerfile| 2 ++
 ci/containers/libvirt-debian-9-cross-armv7l.Dockerfile| 2 ++
 ci/containers/libvirt-debian-9-cross-mips.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-9-cross-mips64el.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-9-cross-mipsel.Dockerfile| 2 ++
 ci/containers/libvirt-debian-9-cross-ppc64le.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-9-cross-s390x.Dockerfile | 2 ++
 ci/containers/libvirt-debian-9.Dockerfile | 2 ++
 ci/containers/libvirt-debian-sid-cross-aarch64.Dockerfile | 2 ++
 ci/containers/libvirt-debian-sid-cross-armv6l.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-sid-cross-armv7l.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-sid-cross-i686.Dockerfile| 2 ++
 ci/containers/libvirt-debian-sid-cross-mips64el.Dockerfile| 2 ++
 ci/containers/libvirt-debian-sid-cross-mipsel.Dockerfile  | 2 ++
 ci/containers/libvirt-debian-sid-cross-ppc64le.Dockerfile | 2 ++
 ci/containers/libvirt-debian-sid-cross-s390x.Dockerfile   | 2 ++
 ci/containers/libvirt-debian-sid.Dockerfile   | 2 ++
 ci/containers/libvirt-fedora-31.Dockerfile| 2 ++
 ci/containers/libvirt-fedora-32.Dockerfile| 2 ++
 ci/containers/libvirt-fedora-rawhide-cross-mingw32.Dockerfile | 2 ++
 ci/containers/libvirt-fedora-rawhide-cross-mingw64.Dockerfile | 2 ++
 ci/containers/libvirt-fedora-rawhide.Dockerfile   | 2 ++
 ci/containers/libvirt-opensuse-151.Dockerfile | 4 +++-
 ci/containers/libvirt-ubuntu-1804.Dockerfile  | 2 ++
 ci/containers/libvirt-ubuntu-2004.Dockerfile  | 2 ++
 39 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/ci/containers/libvirt-centos-7.Dockerfile 
b/ci/containers/libvirt-centos-7.Dockerfile
index abbbdcc47a..08d9386644 100644
--- a/ci/containers/libvirt-centos-7.Dockerfile
+++ b/ci/containers/libvirt-centos-7.Dockerfile
@@ -98,6 +98,7 @@ WEiJKtQrZDJloqtyi/mmRa1VsV7RYR0VPJjhK/R8EQ7Ysshy\n\
 parted-devel \
 patch \
 perl \
+perl-App-cpanminus \
 pkgconfig \
 polkit \
 python3 \
@@ -118,6 +119,7 @@ WEiJKtQrZDJloqtyi/mmRa1VsV7RYR0VPJjhK/R8EQ7Ysshy\n\
 vim \
 wireshark-devel \
 xfsprogs-devel \
+xz \
 yajl-devel && \
 yum autoremove -y && \
 yum clean all -y && \
diff --git a/ci/containers/libvirt-centos-8.Dockerfile 
b/ci/containers/libvirt-centos-8.Dockerfile
index f376126d93..2ac825fc80 100644
--- a/ci/containers/libvirt-centos-8.Dockerfile
+++ b/ci/containers/libvirt-centos-8.Dockerfile
@@ -71,6 +71,7 @@ RUN dnf install 'dnf-command(config-manager)' -y && \
 parted-devel \
 patch \
 perl \
+perl-App-cpanminus \
 pkgconfig \
 polkit \
 python3 \
@@ -93,6 +94,7 @@ RUN dnf install 'dnf-command(config-manager)' -y && \
 vim \
 wireshark-devel \
 xfsprogs-devel \
+xz \
 yajl-devel && \
 dnf autoremove -y && \
 dnf clean all -y && \
diff --git a/ci/containers/libvirt-centos-stream.Dockerfile 
b/ci/containers/libvirt-centos-stream.Dockerfile
index bc75c95193..e0025e2acb 100644
--- a/ci/containers/libvirt-centos-stream.Dockerfile
+++ b/ci/containers/libvirt-centos-stream.Dockerfile
@@ -72,6 +72,7 @@ RUN dnf install -y centos-release-stream && \
 parted-devel \
 patch \
 perl \
+perl-App-cpanminus \
 pkgconfig \
 polkit \
 python3 \
@@ -94,6 +95,7 @@ RUN dnf install -y centos-release-stream && \
 vim \
 wireshark-devel \

Re: [PATCH] storage: find vstorage-mount binary in runtime

2020-07-14 Thread Ján Tomko


On a Tuesday in 2020, Nikolay Shirokovskiy wrote:

This allows us to use CI for vstorage driver without installing Virtuozzo
Storage packages. This way we can leave aside license considerations.

By the way we need to change configure defaults from 'check' to 'no' otherwise
vstorage driver will be build on any system with umount binary which is not
expected I guess.



If it does not have any compile-time dependencies, compiling it by
default would help developers like me - I would not even have to wait
for the CI to finish to see I've broken the compilation of this driver.


Signed-off-by: Nikolay Shirokovskiy 
---
m4/virt-storage-vstorage.m4| 17 +
src/storage/storage_backend_vstorage.c |  9 -
2 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/m4/virt-storage-vstorage.m4 b/m4/virt-storage-vstorage.m4
index e3b3bb4..cf0a543 100644
--- a/m4/virt-storage-vstorage.m4
+++ b/m4/virt-storage-vstorage.m4
@@ -21,30 +21,19 @@ dnl
AC_DEFUN([LIBVIRT_STORAGE_ARG_VSTORAGE], [
  LIBVIRT_ARG_WITH_FEATURE([STORAGE_VSTORAGE],
   [Virtuozzo Storage backend for the storage driver],
-   [check])
+   [no])
])

AC_DEFUN([LIBVIRT_STORAGE_CHECK_VSTORAGE], [
  if test "$with_storage_vstorage" = "yes" ||
 test "$with_storage_vstorage" = "check"; then
-AC_PATH_PROG([VSTORAGE], [vstorage], [], [$LIBVIRT_SBIN_PATH])
-AC_PATH_PROG([VSTORAGE_MOUNT], [vstorage-mount], [], [$LIBVIRT_SBIN_PATH])
AC_PATH_PROG([UMOUNT], [umount], [], [$LIBVIRT_SBIN_PATH])



I'm not sure why we (in general):
a) look for the paths of the helper binaries upfront
b) make decisions based on that

It seems reasonable that someone would build libvirt first and then
install e.g. lvm2 and expected the logical storage driver to work.

I doubt the runtime speedup would be significant, is this just to verify
the sanity of the build environment?

Would it make sense to enforce their presence with = "yes", but build
the driver regardless of whether the tools are present with = "check"?


if test "$with_storage_vstorage" = "yes"; then
-  if test -z "$VSTORAGE" || test -z "$VSTORAGE_MOUNT"; then
-AC_MSG_ERROR([We need vstorage and vstorage-mount tool for Vstorage 
storage driver]);
-  fi
  if test -z "$UMOUNT" ; then
AC_MSG_ERROR([We need umount for Vstorage storage driver]);
  fi
else
-  if test -z "$VSTORAGE" ; then
-with_storage_vstorage=no
-  fi
-  if test -z "$VSTORAGE_MOUNT" ; then
-with_storage_vstorage=no
-  fi
  if test -z "$UMOUNT" ; then
with_storage_vstorage=no
  fi
@@ -57,10 +46,6 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_VSTORAGE], [
if test "$with_storage_vstorage" = "yes" ; then
  AC_DEFINE_UNQUOTED([WITH_STORAGE_VSTORAGE], 1,
 [whether Vstorage backend for storage driver is 
enabled])
-  AC_DEFINE_UNQUOTED([VSTORAGE], ["$VSTORAGE"],
- [Location or name of the vstorage client tool])
-  AC_DEFINE_UNQUOTED([VSTORAGE_MOUNT], ["$VSTORAGE_MOUNT"],
- [Location or name of the vstorage mount tool])
  AC_DEFINE_UNQUOTED([UMOUNT], ["$UMOUNT"],
 [Location or name of the umount programm])
fi
diff --git a/src/storage/storage_backend_vstorage.c 
b/src/storage/storage_backend_vstorage.c
index 6cff9f1..ea3bfaa 100644
--- a/src/storage/storage_backend_vstorage.c
+++ b/src/storage/storage_backend_vstorage.c
@@ -44,9 +44,16 @@ virStorageBackendVzPoolStart(virStoragePoolObjPtr pool)
g_autofree char *grp_name = NULL;
g_autofree char *usr_name = NULL;
g_autofree char *mode = NULL;
+g_autofree char *mount_bin = NULL;
g_autoptr(virCommand) cmd = NULL;
int ret;

+if (!(mount_bin = virFindFileInPath("vstorage-mount"))) {


virCommand already does this for you if the binary path does not start
with a slash, so it would be safe to AC_DEFINE VSTORAGE_MOUNT to
"vstorage-mount" if it wasn't found.

Jano


+virReportError(VIR_ERR_INTERNAL_ERROR,
+   "%s", _("unable to find vstorage-mount"));
+return -1;
+}
+
/* Check the permissions */
if (def->target.perms.mode == (mode_t)-1)
def->target.perms.mode = VIR_STORAGE_DEFAULT_POOL_PERM_MODE;
@@ -65,7 +72,7 @@ virStorageBackendVzPoolStart(virStoragePoolObjPtr pool)

mode = g_strdup_printf("%o", def->target.perms.mode);

-cmd = virCommandNewArgList(VSTORAGE_MOUNT,
+cmd = virCommandNewArgList(mount_bin,
   "-c", def->source.name,
   def->target.path,
   "-m", mode,
--
1.8.3.1



signature.asc
Description: PGP signature

[PATCH 04/10] rpc: don't unref service ref on socket behalf twice

2020-07-14 Thread Nikolay Shirokovskiy

Second unref was added in [1]. We don't need it actually as
we pass free callback to virNetSocketAddIOCallback thus
when we call virNetSocketRemoveIOCallback the extra ref for
callback will be dropped without extra efforts.

[1] 355d8f470f9: virNetServerServiceClose: Don't leak sockets
---
 src/rpc/virnetserverservice.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/rpc/virnetserverservice.c b/src/rpc/virnetserverservice.c
index 9d5df45..e4165ea 100644
--- a/src/rpc/virnetserverservice.c
+++ b/src/rpc/virnetserverservice.c
@@ -449,6 +449,5 @@ void virNetServerServiceClose(virNetServerServicePtr svc)
 for (i = 0; i < svc->nsocks; i++) {
 virNetSocketRemoveIOCallback(svc->socks[i]);
 virNetSocketClose(svc->socks[i]);
-virObjectUnref(svc);
 }
 }
-- 
1.8.3.1

[PATCH 02/10] util: always initialize priority condition

2020-07-14 Thread Nikolay Shirokovskiy

Even if we have no priority threads on pool creation we can add them thru
virThreadPoolSetParameters later.

Signed-off-by: Nikolay Shirokovskiy 
---
 src/util/virthreadpool.c | 22 +++---
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/src/util/virthreadpool.c b/src/util/virthreadpool.c
index 379d236..10a44de 100644
--- a/src/util/virthreadpool.c
+++ b/src/util/virthreadpool.c
@@ -245,6 +245,8 @@ virThreadPoolNewFull(size_t minWorkers,
 goto error;
 if (virCondInit(>cond) < 0)
 goto error;
+if (virCondInit(>prioCond) < 0)
+goto error;
 if (virCondInit(>quit_cond) < 0)
 goto error;
 
@@ -255,13 +257,8 @@ virThreadPoolNewFull(size_t minWorkers,
 if (virThreadPoolExpand(pool, minWorkers, false) < 0)
 goto error;
 
-if (prioWorkers) {
-if (virCondInit(>prioCond) < 0)
-goto error;
-
-if (virThreadPoolExpand(pool, prioWorkers, true) < 0)
-goto error;
-}
+if (virThreadPoolExpand(pool, prioWorkers, true) < 0)
+goto error;
 
 return pool;
 
@@ -274,7 +271,6 @@ virThreadPoolNewFull(size_t minWorkers,
 void virThreadPoolFree(virThreadPoolPtr pool)
 {
 virThreadPoolJobPtr job;
-bool priority = false;
 
 if (!pool)
 return;
@@ -283,10 +279,8 @@ void virThreadPoolFree(virThreadPoolPtr pool)
 pool->quit = true;
 if (pool->nWorkers > 0)
 virCondBroadcast(>cond);
-if (pool->nPrioWorkers > 0) {
-priority = true;
+if (pool->nPrioWorkers > 0)
 virCondBroadcast(>prioCond);
-}
 
 while (pool->nWorkers > 0 || pool->nPrioWorkers > 0)
 ignore_value(virCondWait(>quit_cond, >mutex));
@@ -301,10 +295,8 @@ void virThreadPoolFree(virThreadPoolPtr pool)
 virMutexDestroy(>mutex);
 virCondDestroy(>quit_cond);
 virCondDestroy(>cond);
-if (priority) {
-VIR_FREE(pool->prioWorkers);
-virCondDestroy(>prioCond);
-}
+VIR_FREE(pool->prioWorkers);
+virCondDestroy(>prioCond);
 VIR_FREE(pool);
 }
 
-- 
1.8.3.1

[PATCH 08/10] qemu: implement driver's shutdown/shutdown wait methods

2020-07-14 Thread Nikolay Shirokovskiy

On shutdown we just stop accepting new jobs for worker thread so that on
shutdown wait we can exit worker thread faster. Yes we basically stop
processing of events for VMs but we are going to do so anyway in case of daemon
shutdown.

At the same time synchronous event processing that some API calls may require
are still possible as per VM event loop is still running and we don't need
worker thread for synchronous event processing.

Signed-off-by: Nikolay Shirokovskiy 
---
 src/qemu/qemu_driver.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index d185666..f7ff0fb 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -1098,6 +1098,36 @@ qemuStateStop(void)
 return ret;
 }
 
+
+static int
+qemuStateShutdown(void)
+{
+virThreadPoolStop(qemu_driver->workerPool);
+return 0;
+}
+
+
+static int
+qemuDomainObjStopWorkerIter(virDomainObjPtr vm,
+void *opaque G_GNUC_UNUSED)
+{
+virObjectLock(vm);
+qemuDomainObjStopWorker(vm);
+virObjectUnlock(vm);
+return 0;
+}
+
+
+static int
+qemuStateShutdownWait(void)
+{
+virDomainObjListForEach(qemu_driver->domains, false,
+qemuDomainObjStopWorkerIter, NULL);
+virThreadPoolDrain(qemu_driver->workerPool);
+return 0;
+}
+
+
 /**
  * qemuStateCleanup:
  *
@@ -23413,6 +23443,8 @@ static virStateDriver qemuStateDriver = {
 .stateCleanup = qemuStateCleanup,
 .stateReload = qemuStateReload,
 .stateStop = qemuStateStop,
+.stateShutdown = qemuStateShutdown,
+.stateShutdownWait = qemuStateShutdownWait,
 };
 
 int qemuRegister(void)
-- 
1.8.3.1

[PATCH 00/10] resolve hangs/crashes on libvirtd shutdown

2020-07-14 Thread Nikolay Shirokovskiy

This series follows [1] but address the issue slightly differently.
Instead of polling for RPC thread pool termination it waits for
thread pool drain in distinct thread and then signal the main loop
to exit.

The series introduces new driver's methods stateShutdown/stateShutdownWait
to finish all driver's background threads. The methods however are only
implemented for qemu driver and only partially. There are other drivers
that have background threads and I don't check every one of them in 
terms of how they manage their background threads. 

For example node driver creates 2 threads. One of them is supposed to live
a for a short amount of time and thus not tracked. This thread can cause issues
on shutdown. The second thread is tracked and finished synchronously on driver
cleanup. So this thread can not cause crashes but can cause hangs theoretically
speaking so we may want to move the thread finishing to stateShutdownWait
method so that possible hang will be handled by shutdown timeout.

The qemu driver also has untracked threads and they can cause crashes on
shutdown. For example reconnect threads or reboot thread. These need to be
tracked.

I'm going to address these issues in qemu and other drivers once the overall
approach will be approved.

I added 2 new driver's methods so that thread finishing will be done in
parallel. If we have only one method then the shutdown is done one by one
effectively.

I've added clean shutdown timeout in event loop as suggested by Daniel in [2].
Now I think why we can't just go with systemd unit management? Systemd will
eventually send SIGKILL and we can tune the timeout using TimeoutStopUSec
parameter. This way we even don't need to introduce new driver's methods.
Driver's background threads can be finished in stateCleanup method. AFAIU as
drivers are cleaned up in reverse order it is safe in the sense that already
cleaned up driver can not be used by background threads of not yet cleaned up
driver. Of course this way the cleanup is not done in parallel. Well to
turn it into parallel we can introduce just stateShutdown which we don't need
to call in netdaemon code and thus don't introduce undesired dependency of
netdaemon on drivers concept.

[1] Resolve libvirtd hang on termination with connected long running client
https://www.redhat.com/archives/libvir-list/2018-July/msg00382.html
[2] Races / crashes in shutdown of libvirtd daemon
https://www.redhat.com/archives/libvir-list/2020-April/msg01328.html

Nikolay Shirokovskiy (10):
  libvirt: add stateShutdown/stateShutdownWait to drivers
  util: always initialize priority condition
  util: add stop/drain functions to thread pool
  rpc: don't unref service ref on socket behalf twice
  rpc: finish all threads before exiting main loop
  vireventthread: add virEventThreadClose
  qemu: exit thread synchronously in qemuDomainObjStopWorker
  qemu: implement driver's shutdown/shutdown wait methods
  rpc: cleanup virNetDaemonClose method
  util: remove unused virThreadPoolNew macro

 scripts/check-drivername.py   |  2 +
 src/driver-state.h|  8 
 src/libvirt.c | 42 +++
 src/libvirt_internal.h|  2 +
 src/libvirt_private.syms  |  3 ++
 src/libvirt_remote.syms   |  1 -
 src/qemu/qemu_domain.c|  1 +
 src/qemu/qemu_driver.c| 32 +++
 src/remote/remote_daemon.c|  3 --
 src/rpc/virnetdaemon.c| 95 ---
 src/rpc/virnetdaemon.h|  2 -
 src/rpc/virnetserver.c|  8 
 src/rpc/virnetserver.h|  1 +
 src/rpc/virnetserverservice.c |  1 -
 src/util/vireventthread.c |  9 
 src/util/vireventthread.h |  1 +
 src/util/virthreadpool.c  | 65 -
 src/util/virthreadpool.h  |  6 +--
 18 files changed, 238 insertions(+), 44 deletions(-)

-- 
1.8.3.1

[PATCH 09/10] rpc: cleanup virNetDaemonClose method

2020-07-14 Thread Nikolay Shirokovskiy

Signed-off-by: Nikolay Shirokovskiy 
---
 src/libvirt_remote.syms |  1 -
 src/rpc/virnetdaemon.c  | 13 -
 src/rpc/virnetdaemon.h  |  2 --
 3 files changed, 16 deletions(-)

diff --git a/src/libvirt_remote.syms b/src/libvirt_remote.syms
index 0018a0c..2cd1b76 100644
--- a/src/libvirt_remote.syms
+++ b/src/libvirt_remote.syms
@@ -75,7 +75,6 @@ virNetDaemonAddServer;
 virNetDaemonAddShutdownInhibition;
 virNetDaemonAddSignalHandler;
 virNetDaemonAutoShutdown;
-virNetDaemonClose;
 virNetDaemonGetServer;
 virNetDaemonGetServers;
 virNetDaemonHasClients;
diff --git a/src/rpc/virnetdaemon.c b/src/rpc/virnetdaemon.c
index c4b31c6..87d669c 100644
--- a/src/rpc/virnetdaemon.c
+++ b/src/rpc/virnetdaemon.c
@@ -966,19 +966,6 @@ daemonServerClose(void *payload,
 return 0;
 }
 
-void
-virNetDaemonClose(virNetDaemonPtr dmn)
-{
-if (!dmn)
-return;
-
-virObjectLock(dmn);
-
-virHashForEach(dmn->servers, daemonServerClose, NULL);
-
-virObjectUnlock(dmn);
-}
-
 static int
 daemonServerHasClients(void *payload,
const void *key G_GNUC_UNUSED,
diff --git a/src/rpc/virnetdaemon.h b/src/rpc/virnetdaemon.h
index c2c7767..e708967 100644
--- a/src/rpc/virnetdaemon.h
+++ b/src/rpc/virnetdaemon.h
@@ -73,8 +73,6 @@ void virNetDaemonRun(virNetDaemonPtr dmn);
 
 void virNetDaemonQuit(virNetDaemonPtr dmn);
 
-void virNetDaemonClose(virNetDaemonPtr dmn);
-
 bool virNetDaemonHasClients(virNetDaemonPtr dmn);
 
 virNetServerPtr virNetDaemonGetServer(virNetDaemonPtr dmn,
-- 
1.8.3.1

[PATCH 10/10] util: remove unused virThreadPoolNew macro

2020-07-14 Thread Nikolay Shirokovskiy

Signed-off-by: Nikolay Shirokovskiy 
---
 src/util/virthreadpool.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/util/virthreadpool.h b/src/util/virthreadpool.h
index dd1aaf3..cb800dc 100644
--- a/src/util/virthreadpool.h
+++ b/src/util/virthreadpool.h
@@ -28,9 +28,6 @@ typedef virThreadPool *virThreadPoolPtr;
 
 typedef void (*virThreadPoolJobFunc)(void *jobdata, void *opaque);
 
-#define virThreadPoolNew(min, max, prio, func, opaque) \
-virThreadPoolNewFull(min, max, prio, func, #func, opaque)
-
 virThreadPoolPtr virThreadPoolNewFull(size_t minWorkers,
   size_t maxWorkers,
   size_t prioWorkers,
-- 
1.8.3.1

[PATCH 07/10] qemu: exit thread synchronously in qemuDomainObjStopWorker

2020-07-14 Thread Nikolay Shirokovskiy

The change won't hurt much current callers performance I guess and now we can
use the function when we need to be sure of synchronous thread exit as well.

Signed-off-by: Nikolay Shirokovskiy 
---
 src/qemu/qemu_domain.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 2d9d822..18651d0 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -1571,6 +1571,7 @@ qemuDomainObjStopWorker(virDomainObjPtr dom)
 qemuDomainObjPrivatePtr priv = dom->privateData;
 
 if (priv->eventThread) {
+virEventThreadClose(priv->eventThread);
 g_object_unref(priv->eventThread);
 priv->eventThread = NULL;
 }
-- 
1.8.3.1

[PATCH 05/10] rpc: finish all threads before exiting main loop

2020-07-14 Thread Nikolay Shirokovskiy

Currently we have issues like [1] on libvirtd shutdown as we cleanup while RPC
and other threads are still running. Let's finish all threads other then main
before cleanup.

The approach to finish threads is suggested in [2]. In order to finish RPC
threads serving API calls we let the event loop run but stop accepting new API
calls and block processing any pending API calls. We also inform all drivers of
shutdown so they can prepare for shutdown too. Then we wait for all RPC threads
and driver's background thread to finish. If finishing takes more then 15s we
just exit as we can't safely cleanup in time.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1828207
[2] https://www.redhat.com/archives/libvir-list/2020-April/msg01328.html

Signed-off-by: Nikolay Shirokovskiy 
---
 src/remote/remote_daemon.c |  3 --
 src/rpc/virnetdaemon.c | 82 +-
 src/rpc/virnetserver.c |  8 +
 src/rpc/virnetserver.h |  1 +
 4 files changed, 90 insertions(+), 4 deletions(-)

diff --git a/src/remote/remote_daemon.c b/src/remote/remote_daemon.c
index 1aa9bfc..222bb5f 100644
--- a/src/remote/remote_daemon.c
+++ b/src/remote/remote_daemon.c
@@ -1201,9 +1201,6 @@ int main(int argc, char **argv) {
 0, "shutdown", NULL, NULL);
 
  cleanup:
-/* Keep cleanup order in inverse order of startup */
-virNetDaemonClose(dmn);
-
 virNetlinkEventServiceStopAll();
 
 if (driversInitialized) {
diff --git a/src/rpc/virnetdaemon.c b/src/rpc/virnetdaemon.c
index bb81a43..c4b31c6 100644
--- a/src/rpc/virnetdaemon.c
+++ b/src/rpc/virnetdaemon.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 
+#include "libvirt_internal.h"
 #include "virnetdaemon.h"
 #include "virlog.h"
 #include "viralloc.h"
@@ -69,7 +70,10 @@ struct _virNetDaemon {
 virHashTablePtr servers;
 virJSONValuePtr srvObject;
 
+int finishTimer;
 bool quit;
+bool finished;
+bool graceful;
 
 unsigned int autoShutdownTimeout;
 size_t autoShutdownInhibitions;
@@ -80,6 +84,11 @@ struct _virNetDaemon {
 
 static virClassPtr virNetDaemonClass;
 
+static int
+daemonServerClose(void *payload,
+  const void *key G_GNUC_UNUSED,
+  void *opaque G_GNUC_UNUSED);
+
 static void
 virNetDaemonDispose(void *obj)
 {
@@ -796,11 +805,53 @@ daemonServerProcessClients(void *payload,
 return 0;
 }
 
+static int
+daemonServerShutdownWait(void *payload,
+ const void *key G_GNUC_UNUSED,
+ void *opaque G_GNUC_UNUSED)
+{
+virNetServerPtr srv = payload;
+
+virNetServerShutdownWait(srv);
+return 0;
+}
+
+static void
+daemonShutdownWait(void *opaque)
+{
+virNetDaemonPtr dmn = opaque;
+bool graceful = false;
+
+virHashForEach(dmn->servers, daemonServerShutdownWait, NULL);
+if (virStateShutdownWait() < 0)
+goto finish;
+
+graceful = true;
+
+ finish:
+virObjectLock(dmn);
+dmn->graceful = graceful;
+virEventUpdateTimeout(dmn->finishTimer, 0);
+virObjectUnlock(dmn);
+}
+
+static void
+virNetDaemonFinishTimer(int timerid G_GNUC_UNUSED,
+void *opaque)
+{
+virNetDaemonPtr dmn = opaque;
+
+virObjectLock(dmn);
+dmn->finished = true;
+virObjectUnlock(dmn);
+}
+
 void
 virNetDaemonRun(virNetDaemonPtr dmn)
 {
 int timerid = -1;
 bool timerActive = false;
+virThread shutdownThread;
 
 virObjectLock(dmn);
 
@@ -811,6 +862,9 @@ virNetDaemonRun(virNetDaemonPtr dmn)
 }
 
 dmn->quit = false;
+dmn->finishTimer = -1;
+dmn->finished = false;
+dmn->graceful = false;
 
 if (dmn->autoShutdownTimeout &&
 (timerid = virEventAddTimeout(-1,
@@ -826,7 +880,7 @@ virNetDaemonRun(virNetDaemonPtr dmn)
 virSystemdNotifyStartup();
 
 VIR_DEBUG("dmn=%p quit=%d", dmn, dmn->quit);
-while (!dmn->quit) {
+while (!dmn->finished) {
 /* A shutdown timeout is specified, so check
  * if any drivers have active state, if not
  * shutdown after timeout seconds
@@ -857,6 +911,32 @@ virNetDaemonRun(virNetDaemonPtr dmn)
 virObjectLock(dmn);
 
 virHashForEach(dmn->servers, daemonServerProcessClients, NULL);
+
+if (dmn->quit && dmn->finishTimer == -1) {
+virHashForEach(dmn->servers, daemonServerClose, NULL);
+if (virStateShutdown() < 0)
+break;
+
+if ((dmn->finishTimer = virEventAddTimeout(15 * 1000,
+   virNetDaemonFinishTimer,
+   dmn, NULL)) < 0) {
+VIR_WARN("Failed to register finish timer.");
+break;
+}
+
+if (virThreadCreateFull(, true, daemonShutdownWait,
+   "daemon-shutdown", false, dmn) < 0) {
+VIR_WARN("Failed to register join thread.");
+break;
+}
+}
+}
+
+if

[PATCH 01/10] libvirt: add stateShutdown/stateShutdownWait to drivers

2020-07-14 Thread Nikolay Shirokovskiy

stateShutdown is supposed to inform driver that it will be closed soon so that
the driver can prepare and finish all background threads quickly on
stateShutdownWait call.

Signed-off-by: Nikolay Shirokovskiy 
---
 scripts/check-drivername.py |  2 ++
 src/driver-state.h  |  8 
 src/libvirt.c   | 42 ++
 src/libvirt_internal.h  |  2 ++
 4 files changed, 54 insertions(+)

diff --git a/scripts/check-drivername.py b/scripts/check-drivername.py
index 39eff83..19d1cd1 100644
--- a/scripts/check-drivername.py
+++ b/scripts/check-drivername.py
@@ -50,6 +50,8 @@ for drvfile in drvfiles:
 "virDrvStateCleanup",
 "virDrvStateReload",
 "virDrvStateStop",
+"virDrvStateShutdown",
+"virDrvStateShutdownWait",
 "virDrvConnectSupportsFeature",
 "virDrvConnectURIProbe",
 "virDrvDomainMigratePrepare",
diff --git a/src/driver-state.h b/src/driver-state.h
index 6b3f501..1f664f3 100644
--- a/src/driver-state.h
+++ b/src/driver-state.h
@@ -45,6 +45,12 @@ typedef int
 typedef int
 (*virDrvStateStop)(void);
 
+typedef int
+(*virDrvStateShutdown)(void);
+
+typedef int
+(*virDrvStateShutdownWait)(void);
+
 typedef struct _virStateDriver virStateDriver;
 typedef virStateDriver *virStateDriverPtr;
 
@@ -55,4 +61,6 @@ struct _virStateDriver {
 virDrvStateCleanup stateCleanup;
 virDrvStateReload stateReload;
 virDrvStateStop stateStop;
+virDrvStateShutdown stateShutdown;
+virDrvStateShutdownWait stateShutdownWait;
 };
diff --git a/src/libvirt.c b/src/libvirt.c
index b2d0ba3..28f9332 100644
--- a/src/libvirt.c
+++ b/src/libvirt.c
@@ -673,6 +673,48 @@ virStateInitialize(bool privileged,
 
 
 /**
+ * virStateShutdown:
+ *
+ * Run each virtualization driver's shutdown method.
+ *
+ * Returns 0 if all succeed, -1 upon any failure.
+ */
+int
+virStateShutdown(void)
+{
+size_t i;
+
+for (i = 0; i < virStateDriverTabCount; i++) {
+if (virStateDriverTab[i]->stateShutdown &&
+virStateDriverTab[i]->stateShutdown() < 0)
+return -1;
+}
+return 0;
+}
+
+
+/**
+ * virStateShutdownWait:
+ *
+ * Run each virtualization driver's shutdown wait method.
+ *
+ * Returns 0 if all succeed, -1 upon any failure.
+ */
+int
+virStateShutdownWait(void)
+{
+size_t i;
+
+for (i = 0; i < virStateDriverTabCount; i++) {
+if (virStateDriverTab[i]->stateShutdownWait &&
+virStateDriverTab[i]->stateShutdownWait() < 0)
+return -1;
+}
+return 0;
+}
+
+
+/**
  * virStateCleanup:
  *
  * Run each virtualization driver's cleanup method.
diff --git a/src/libvirt_internal.h b/src/libvirt_internal.h
index 72c6127..5b6035f 100644
--- a/src/libvirt_internal.h
+++ b/src/libvirt_internal.h
@@ -34,6 +34,8 @@ int virStateInitialize(bool privileged,
const char *root,
virStateInhibitCallback inhibit,
void *opaque);
+int virStateShutdown(void);
+int virStateShutdownWait(void);
 int virStateCleanup(void);
 int virStateReload(void);
 int virStateStop(void);
-- 
1.8.3.1

[PATCH 06/10] vireventthread: add virEventThreadClose

2020-07-14 Thread Nikolay Shirokovskiy

virEventThreadClose is used when event loop thread should
be exited synchronously (which is not the case when event loop
is just unreferenced).

Signed-off-by: Nikolay Shirokovskiy 
---
 src/libvirt_private.syms  | 1 +
 src/util/vireventthread.c | 9 +
 src/util/vireventthread.h | 1 +
 3 files changed, 11 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index f64b1de..c85ec43 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2023,6 +2023,7 @@ virEventGLibRunOnce;
 
 
 # util/vireventthread.h
+virEventThreadClose;
 virEventThreadGetContext;
 virEventThreadNew;
 
diff --git a/src/util/vireventthread.c b/src/util/vireventthread.c
index cf86592..0a7c436 100644
--- a/src/util/vireventthread.c
+++ b/src/util/vireventthread.c
@@ -183,6 +183,15 @@ virEventThreadNew(const char *name)
 }
 
 
+void
+virEventThreadClose(virEventThread *evt)
+{
+g_main_loop_quit(evt->loop);
+g_thread_join(evt->thread);
+evt->thread = NULL;
+}
+
+
 GMainContext *
 virEventThreadGetContext(virEventThread *evt)
 {
diff --git a/src/util/vireventthread.h b/src/util/vireventthread.h
index 5826c25..6f01629 100644
--- a/src/util/vireventthread.h
+++ b/src/util/vireventthread.h
@@ -27,5 +27,6 @@
 G_DECLARE_FINAL_TYPE(virEventThread, vir_event_thread, VIR, EVENT_THREAD, 
GObject);
 
 virEventThread *virEventThreadNew(const char *name);
+void virEventThreadClose(virEventThread *evt);
 
 GMainContext *virEventThreadGetContext(virEventThread *evt);
-- 
1.8.3.1

[PATCH 03/10] util: add stop/drain functions to thread pool

2020-07-14 Thread Nikolay Shirokovskiy

Stop just send signal for threads to exit when they finish with
current task. Drain waits when all threads will finish.

Signed-off-by: Nikolay Shirokovskiy 
---
 src/libvirt_private.syms |  2 ++
 src/util/virthreadpool.c | 43 ++-
 src/util/virthreadpool.h |  3 +++
 3 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 73b72c9..f64b1de 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -3326,6 +3326,7 @@ virThreadJobSetWorker;
 
 
 # util/virthreadpool.h
+virThreadPoolDrain;
 virThreadPoolFree;
 virThreadPoolGetCurrentWorkers;
 virThreadPoolGetFreeWorkers;
@@ -3336,6 +3337,7 @@ virThreadPoolGetPriorityWorkers;
 virThreadPoolNewFull;
 virThreadPoolSendJob;
 virThreadPoolSetParameters;
+virThreadPoolStop;
 
 
 # util/virtime.h
diff --git a/src/util/virthreadpool.c b/src/util/virthreadpool.c
index 10a44de..ca44f55 100644
--- a/src/util/virthreadpool.c
+++ b/src/util/virthreadpool.c
@@ -268,19 +268,27 @@ virThreadPoolNewFull(size_t minWorkers,
 
 }
 
-void virThreadPoolFree(virThreadPoolPtr pool)
-{
-virThreadPoolJobPtr job;
 
-if (!pool)
+static void
+virThreadPoolStopLocked(virThreadPoolPtr pool)
+{
+if (pool->quit)
 return;
 
-virMutexLock(>mutex);
 pool->quit = true;
 if (pool->nWorkers > 0)
 virCondBroadcast(>cond);
 if (pool->nPrioWorkers > 0)
 virCondBroadcast(>prioCond);
+}
+
+
+static void
+virThreadPoolDrainLocked(virThreadPoolPtr pool)
+{
+virThreadPoolJobPtr job;
+
+virThreadPoolStopLocked(pool);
 
 while (pool->nWorkers > 0 || pool->nPrioWorkers > 0)
 ignore_value(virCondWait(>quit_cond, >mutex));
@@ -289,6 +297,15 @@ void virThreadPoolFree(virThreadPoolPtr pool)
 pool->jobList.head = pool->jobList.head->next;
 VIR_FREE(job);
 }
+}
+
+void virThreadPoolFree(virThreadPoolPtr pool)
+{
+if (!pool)
+return;
+
+virMutexLock(>mutex);
+virThreadPoolDrainLocked(pool);
 
 VIR_FREE(pool->workers);
 virMutexUnlock(>mutex);
@@ -475,3 +492,19 @@ virThreadPoolSetParameters(virThreadPoolPtr pool,
 virMutexUnlock(>mutex);
 return -1;
 }
+
+void
+virThreadPoolStop(virThreadPoolPtr pool)
+{
+virMutexLock(>mutex);
+virThreadPoolStopLocked(pool);
+virMutexUnlock(>mutex);
+}
+
+void
+virThreadPoolDrain(virThreadPoolPtr pool)
+{
+virMutexLock(>mutex);
+virThreadPoolDrainLocked(pool);
+virMutexUnlock(>mutex);
+}
diff --git a/src/util/virthreadpool.h b/src/util/virthreadpool.h
index c97d9b3..dd1aaf3 100644
--- a/src/util/virthreadpool.h
+++ b/src/util/virthreadpool.h
@@ -56,3 +56,6 @@ int virThreadPoolSetParameters(virThreadPoolPtr pool,
long long int minWorkers,
long long int maxWorkers,
long long int prioWorkers);
+
+void virThreadPoolStop(virThreadPoolPtr pool);
+void virThreadPoolDrain(virThreadPoolPtr pool);
-- 
1.8.3.1

Re: [PATCH] storage: find vstorage-mount binary in runtime

2020-07-14 Thread Nikolay Shirokovskiy




On 14.07.2020 12:02, Daniel P. Berrangé wrote:
> On Tue, Jul 14, 2020 at 11:55:47AM +0300, Nikolay Shirokovskiy wrote:
>>
>>
>> On 14.07.2020 11:50, Daniel P. Berrangé wrote:
>>> On Tue, Jul 14, 2020 at 10:26:00AM +0300, Nikolay Shirokovskiy wrote:
 This allows us to use CI for vstorage driver without installing Virtuozzo
 Storage packages. This way we can leave aside license considerations.

 By the way we need to change configure defaults from 'check' to 'no' 
 otherwise
 vstorage driver will be build on any system with umount binary which is not
 expected I guess.

 Signed-off-by: Nikolay Shirokovskiy 
 ---
  m4/virt-storage-vstorage.m4| 17 +
  src/storage/storage_backend_vstorage.c |  9 -
  2 files changed, 9 insertions(+), 17 deletions(-)

 diff --git a/m4/virt-storage-vstorage.m4 b/m4/virt-storage-vstorage.m4
 index e3b3bb4..cf0a543 100644
 --- a/m4/virt-storage-vstorage.m4
 +++ b/m4/virt-storage-vstorage.m4
 @@ -21,30 +21,19 @@ dnl
  AC_DEFUN([LIBVIRT_STORAGE_ARG_VSTORAGE], [
LIBVIRT_ARG_WITH_FEATURE([STORAGE_VSTORAGE],
 [Virtuozzo Storage backend for the storage 
 driver],
 -   [check])
 +   [no])
  ])
>>>
>>> Why was this changed to "no".  It means we'll never enable the storage
>>> driver without an explicit --with-storage-vstorage arg passed
>>
>> But with "check" we will have this driver built on any system with umount
>> as I mentioned in commit message. I guess this is not desired given the
>> driver actually needs some more binaries that are not usually installed.
> 
> The core rule for configure checks is that users should never have to
> pass any --with args to enable use of features that their host OS has.
> So we need some mechanism to correctly enable the driver.
> 
> I'd suggest that we in fact get rid of this entire check, and just make
> the storage driver be directly configured based on $with_vz, beucase if
> you have the hypervisor driver enabled, you want the storage driver to
> match.

These drivers are not so strongly related. One can be used without another
and vz driver soon going to be outdated.

May be we can leave "check" really check for vstorage-mount presence and
leave default to "check" also while make "yes" be forceful and don't
fail if there is no binary? This new "yes" behavior is supposed to be
used in CI.

Nikolay

Re: [PATCH] storage: find vstorage-mount binary in runtime

2020-07-14 Thread Daniel P . Berrangé

On Tue, Jul 14, 2020 at 11:55:47AM +0300, Nikolay Shirokovskiy wrote:
> 
> 
> On 14.07.2020 11:50, Daniel P. Berrangé wrote:
> > On Tue, Jul 14, 2020 at 10:26:00AM +0300, Nikolay Shirokovskiy wrote:
> >> This allows us to use CI for vstorage driver without installing Virtuozzo
> >> Storage packages. This way we can leave aside license considerations.
> >>
> >> By the way we need to change configure defaults from 'check' to 'no' 
> >> otherwise
> >> vstorage driver will be build on any system with umount binary which is not
> >> expected I guess.
> >>
> >> Signed-off-by: Nikolay Shirokovskiy 
> >> ---
> >>  m4/virt-storage-vstorage.m4| 17 +
> >>  src/storage/storage_backend_vstorage.c |  9 -
> >>  2 files changed, 9 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/m4/virt-storage-vstorage.m4 b/m4/virt-storage-vstorage.m4
> >> index e3b3bb4..cf0a543 100644
> >> --- a/m4/virt-storage-vstorage.m4
> >> +++ b/m4/virt-storage-vstorage.m4
> >> @@ -21,30 +21,19 @@ dnl
> >>  AC_DEFUN([LIBVIRT_STORAGE_ARG_VSTORAGE], [
> >>LIBVIRT_ARG_WITH_FEATURE([STORAGE_VSTORAGE],
> >> [Virtuozzo Storage backend for the storage 
> >> driver],
> >> -   [check])
> >> +   [no])
> >>  ])
> > 
> > Why was this changed to "no".  It means we'll never enable the storage
> > driver without an explicit --with-storage-vstorage arg passed
> 
> But with "check" we will have this driver built on any system with umount
> as I mentioned in commit message. I guess this is not desired given the
> driver actually needs some more binaries that are not usually installed.

The core rule for configure checks is that users should never have to
pass any --with args to enable use of features that their host OS has.
So we need some mechanism to correctly enable the driver.

I'd suggest that we in fact get rid of this entire check, and just make
the storage driver be directly configured based on $with_vz, beucase if
you have the hypervisor driver enabled, you want the storage driver to
match.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH] storage: find vstorage-mount binary in runtime

2020-07-14 Thread Nikolay Shirokovskiy




On 14.07.2020 11:50, Daniel P. Berrangé wrote:
> On Tue, Jul 14, 2020 at 10:26:00AM +0300, Nikolay Shirokovskiy wrote:
>> This allows us to use CI for vstorage driver without installing Virtuozzo
>> Storage packages. This way we can leave aside license considerations.
>>
>> By the way we need to change configure defaults from 'check' to 'no' 
>> otherwise
>> vstorage driver will be build on any system with umount binary which is not
>> expected I guess.
>>
>> Signed-off-by: Nikolay Shirokovskiy 
>> ---
>>  m4/virt-storage-vstorage.m4| 17 +
>>  src/storage/storage_backend_vstorage.c |  9 -
>>  2 files changed, 9 insertions(+), 17 deletions(-)
>>
>> diff --git a/m4/virt-storage-vstorage.m4 b/m4/virt-storage-vstorage.m4
>> index e3b3bb4..cf0a543 100644
>> --- a/m4/virt-storage-vstorage.m4
>> +++ b/m4/virt-storage-vstorage.m4
>> @@ -21,30 +21,19 @@ dnl
>>  AC_DEFUN([LIBVIRT_STORAGE_ARG_VSTORAGE], [
>>LIBVIRT_ARG_WITH_FEATURE([STORAGE_VSTORAGE],
>> [Virtuozzo Storage backend for the storage 
>> driver],
>> -   [check])
>> +   [no])
>>  ])
> 
> Why was this changed to "no".  It means we'll never enable the storage
> driver without an explicit --with-storage-vstorage arg passed

But with "check" we will have this driver built on any system with umount
as I mentioned in commit message. I guess this is not desired given the
driver actually needs some more binaries that are not usually installed.

Nikolay

> 
>>  
>>  AC_DEFUN([LIBVIRT_STORAGE_CHECK_VSTORAGE], [
>>if test "$with_storage_vstorage" = "yes" ||
>>   test "$with_storage_vstorage" = "check"; then
>> -AC_PATH_PROG([VSTORAGE], [vstorage], [], [$LIBVIRT_SBIN_PATH])
>> -AC_PATH_PROG([VSTORAGE_MOUNT], [vstorage-mount], [], 
>> [$LIBVIRT_SBIN_PATH])
>>  AC_PATH_PROG([UMOUNT], [umount], [], [$LIBVIRT_SBIN_PATH])
>>  
>>  if test "$with_storage_vstorage" = "yes"; then
>> -  if test -z "$VSTORAGE" || test -z "$VSTORAGE_MOUNT"; then
>> -AC_MSG_ERROR([We need vstorage and vstorage-mount tool for Vstorage 
>> storage driver]);
>> -  fi
>>if test -z "$UMOUNT" ; then
>>  AC_MSG_ERROR([We need umount for Vstorage storage driver]);
>>fi
>>  else
>> -  if test -z "$VSTORAGE" ; then
>> -with_storage_vstorage=no
>> -  fi
>> -  if test -z "$VSTORAGE_MOUNT" ; then
>> -with_storage_vstorage=no
>> -  fi
>>if test -z "$UMOUNT" ; then
>>  with_storage_vstorage=no
>>fi
>> @@ -57,10 +46,6 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_VSTORAGE], [
>>  if test "$with_storage_vstorage" = "yes" ; then
>>AC_DEFINE_UNQUOTED([WITH_STORAGE_VSTORAGE], 1,
>>   [whether Vstorage backend for storage driver is 
>> enabled])
>> -  AC_DEFINE_UNQUOTED([VSTORAGE], ["$VSTORAGE"],
>> - [Location or name of the vstorage client tool])
>> -  AC_DEFINE_UNQUOTED([VSTORAGE_MOUNT], ["$VSTORAGE_MOUNT"],
>> - [Location or name of the vstorage mount tool])
>>AC_DEFINE_UNQUOTED([UMOUNT], ["$UMOUNT"],
>>   [Location or name of the umount programm])
>>  fi
>> diff --git a/src/storage/storage_backend_vstorage.c 
>> b/src/storage/storage_backend_vstorage.c
>> index 6cff9f1..ea3bfaa 100644
>> --- a/src/storage/storage_backend_vstorage.c
>> +++ b/src/storage/storage_backend_vstorage.c
>> @@ -44,9 +44,16 @@ virStorageBackendVzPoolStart(virStoragePoolObjPtr pool)
>>  g_autofree char *grp_name = NULL;
>>  g_autofree char *usr_name = NULL;
>>  g_autofree char *mode = NULL;
>> +g_autofree char *mount_bin = NULL;
>>  g_autoptr(virCommand) cmd = NULL;
>>  int ret;
>>  
>> +if (!(mount_bin = virFindFileInPath("vstorage-mount"))) {
>> +virReportError(VIR_ERR_INTERNAL_ERROR,
>> +   "%s", _("unable to find vstorage-mount"));
>> +return -1;
>> +}
>> +
>>  /* Check the permissions */
>>  if (def->target.perms.mode == (mode_t)-1)
>>  def->target.perms.mode = VIR_STORAGE_DEFAULT_POOL_PERM_MODE;
>> @@ -65,7 +72,7 @@ virStorageBackendVzPoolStart(virStoragePoolObjPtr pool)
>>  
>>  mode = g_strdup_printf("%o", def->target.perms.mode);
>>  
>> -cmd = virCommandNewArgList(VSTORAGE_MOUNT,
>> +cmd = virCommandNewArgList(mount_bin,
>> "-c", def->source.name,
>> def->target.path,
>> "-m", mode,
>> -- 
>> 1.8.3.1
>>
> 
> Regards,
> Daniel
>

Re: [PATCH] storage: find vstorage-mount binary in runtime

2020-07-14 Thread Daniel P . Berrangé

On Tue, Jul 14, 2020 at 10:26:00AM +0300, Nikolay Shirokovskiy wrote:
> This allows us to use CI for vstorage driver without installing Virtuozzo
> Storage packages. This way we can leave aside license considerations.
> 
> By the way we need to change configure defaults from 'check' to 'no' otherwise
> vstorage driver will be build on any system with umount binary which is not
> expected I guess.
> 
> Signed-off-by: Nikolay Shirokovskiy 
> ---
>  m4/virt-storage-vstorage.m4| 17 +
>  src/storage/storage_backend_vstorage.c |  9 -
>  2 files changed, 9 insertions(+), 17 deletions(-)
> 
> diff --git a/m4/virt-storage-vstorage.m4 b/m4/virt-storage-vstorage.m4
> index e3b3bb4..cf0a543 100644
> --- a/m4/virt-storage-vstorage.m4
> +++ b/m4/virt-storage-vstorage.m4
> @@ -21,30 +21,19 @@ dnl
>  AC_DEFUN([LIBVIRT_STORAGE_ARG_VSTORAGE], [
>LIBVIRT_ARG_WITH_FEATURE([STORAGE_VSTORAGE],
> [Virtuozzo Storage backend for the storage 
> driver],
> -   [check])
> +   [no])
>  ])

Why was this changed to "no".  It means we'll never enable the storage
driver without an explicit --with-storage-vstorage arg passed

>  
>  AC_DEFUN([LIBVIRT_STORAGE_CHECK_VSTORAGE], [
>if test "$with_storage_vstorage" = "yes" ||
>   test "$with_storage_vstorage" = "check"; then
> -AC_PATH_PROG([VSTORAGE], [vstorage], [], [$LIBVIRT_SBIN_PATH])
> -AC_PATH_PROG([VSTORAGE_MOUNT], [vstorage-mount], [], 
> [$LIBVIRT_SBIN_PATH])
>  AC_PATH_PROG([UMOUNT], [umount], [], [$LIBVIRT_SBIN_PATH])
>  
>  if test "$with_storage_vstorage" = "yes"; then
> -  if test -z "$VSTORAGE" || test -z "$VSTORAGE_MOUNT"; then
> -AC_MSG_ERROR([We need vstorage and vstorage-mount tool for Vstorage 
> storage driver]);
> -  fi
>if test -z "$UMOUNT" ; then
>  AC_MSG_ERROR([We need umount for Vstorage storage driver]);
>fi
>  else
> -  if test -z "$VSTORAGE" ; then
> -with_storage_vstorage=no
> -  fi
> -  if test -z "$VSTORAGE_MOUNT" ; then
> -with_storage_vstorage=no
> -  fi
>if test -z "$UMOUNT" ; then
>  with_storage_vstorage=no
>fi
> @@ -57,10 +46,6 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_VSTORAGE], [
>  if test "$with_storage_vstorage" = "yes" ; then
>AC_DEFINE_UNQUOTED([WITH_STORAGE_VSTORAGE], 1,
>   [whether Vstorage backend for storage driver is 
> enabled])
> -  AC_DEFINE_UNQUOTED([VSTORAGE], ["$VSTORAGE"],
> - [Location or name of the vstorage client tool])
> -  AC_DEFINE_UNQUOTED([VSTORAGE_MOUNT], ["$VSTORAGE_MOUNT"],
> - [Location or name of the vstorage mount tool])
>AC_DEFINE_UNQUOTED([UMOUNT], ["$UMOUNT"],
>   [Location or name of the umount programm])
>  fi
> diff --git a/src/storage/storage_backend_vstorage.c 
> b/src/storage/storage_backend_vstorage.c
> index 6cff9f1..ea3bfaa 100644
> --- a/src/storage/storage_backend_vstorage.c
> +++ b/src/storage/storage_backend_vstorage.c
> @@ -44,9 +44,16 @@ virStorageBackendVzPoolStart(virStoragePoolObjPtr pool)
>  g_autofree char *grp_name = NULL;
>  g_autofree char *usr_name = NULL;
>  g_autofree char *mode = NULL;
> +g_autofree char *mount_bin = NULL;
>  g_autoptr(virCommand) cmd = NULL;
>  int ret;
>  
> +if (!(mount_bin = virFindFileInPath("vstorage-mount"))) {
> +virReportError(VIR_ERR_INTERNAL_ERROR,
> +   "%s", _("unable to find vstorage-mount"));
> +return -1;
> +}
> +
>  /* Check the permissions */
>  if (def->target.perms.mode == (mode_t)-1)
>  def->target.perms.mode = VIR_STORAGE_DEFAULT_POOL_PERM_MODE;
> @@ -65,7 +72,7 @@ virStorageBackendVzPoolStart(virStoragePoolObjPtr pool)
>  
>  mode = g_strdup_printf("%o", def->target.perms.mode);
>  
> -cmd = virCommandNewArgList(VSTORAGE_MOUNT,
> +cmd = virCommandNewArgList(mount_bin,
> "-c", def->source.name,
> def->target.path,
> "-m", mode,
> -- 
> 1.8.3.1
> 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: libvirt-6.5.0 breaks host passthrough migration

2020-07-14 Thread Jiri Denemark

On Mon, Jul 13, 2020 at 14:04:25 +0200, Jiri Denemark wrote:
> On Sat, Jul 11, 2020 at 13:44:19 -0400, Mark Mielke wrote:
> > On Sat, Jul 11, 2020 at 6:04 AM Mark Mielke  wrote:
> > 
> > > On Fri, Jul 10, 2020 at 7:48 AM Mark Mielke  wrote:
> > >
> > >> On Fri, Jul 10, 2020 at 7:14 AM Jiri Denemark 
> > >> wrote:
> > >>
> > >>> The implementation seems to be doing exactly what the commit message
> > >>>
> > >> says. The migratable=off default should be used only when QEMU does not
> > >>> support -cpu host,migratable=on|off, that is only when QEMU is very old.
> > >>> Every non-ancient version of libvirt should have the
> > >>> QEMU_CAPS_CPU_MIGRATABLE set and thus this code should choose
> > >>> migrateble=on default.
> > >>>
> > >> QEMU_CAPS_CPU_MIGRATABLE only from the  element? If so, doesn't this
> > >> mean that it is not explicitly listed for host-passthrough, and this 
> > >> means
> > >> the check is not detecting whether it is enabled or not properly?
> > >>
> > > Trying to understand what is going on more - I see "migratable" seems to
> > > be ok when launching a new machine, but the failure scenario was live
> > > migration from 6.4.0 to 6.5.0.
> > >
> > > Is this because the QEMU_CAPS_CPU_MIGRATABLE was not filled in for 6.4.0,
> > > and live migration grabs the capabilities from the source, where the
> > > absence of this capability makes it presume an older Qemu in the above 
> > > code?
> > >
> > 
> > Sorry all - I am having trouble reproducing now. The expected use cases are
> > now working.
> > 
> > Is it possible that the "migratable" flag might have been missing on some
> > of the instances, although migration worked fine, and despite having used
> > Qemu 4.2 and Qemu 5.0?
> 
> When an updated libvirtd which knows about this new capability starts,
> it would reprobe all QEMU capabilities (lazily, i.e., once they are
> needed). However, if there is a running domain, libvirt will use cached
> capabilities probed when the domain was started. I suspect migrating
> such domain could be a problem. I'll try to reproduce locally.

OK, I did not reproduce the failure, because migratable=off doesn't
enable anything more than migratable=on (likely because L1 VM in my
nested environment does not have any non-migratable features enabled).
But I was able to reproduce the issue itself and the migration could
clearly fail if migratable=off enabled some non-migratable features. The
reproducer is actually easy and one doesn't even need to migrate to see
libvirt did something wrong:

1. run libvirtd older then 6.5.0
2. start a domain with host-passthrough CPU (QEMU would default to
   migratable=on)
3. upgrade libvirt to 6.5.0 and restart libvirtd
4. virsh dumpxml $DOMAIN_STARTED_IN_STEP_2

Now you would see

which differs from the default used by QEMU. Migrating such domain would
succeed anyway, because it was actually started with migratable='on'.
But when such domain is migrated to libvirt 6.5.0, we would honor the
migratable attribute and start QEMU with -cpu host,migratable=off which
could cause failures when trying to migrate this domain again.

The problem is exactly where I was afraid it could be. When libvirtd
starts, it reads the QEMU capabilities probed by older libvirt
(QEMU_CAPS_CPU_MIGRATABLE would be off) and wrongly updates the XML of
the running domain. I'll prepare a patch to fix this.

Jirka

[PATCH] qemu: pre-create the dbus directory in qemuStateInitialize

2020-07-14 Thread Bihong Yu

>From 187323ce736dcd9b1a177d552630b0c6859a4798 Mon Sep 17 00:00:00 2001
From: Bihong Yu 
Date: Tue, 14 Jul 2020 15:44:05 +0800
Subject: [PATCH] qemu: pre-create the dbus directory in qemuStateInitialize

There are races condiction to make '/run/libvirt/qemu/dbus' directory in
virDirCreateNoFork() while concurrent start VMs, and get "failed to create
directory '/run/libvirt/qemu/dbus': File exists" error message. pre-create the
dbus directory in qemuStateInitialize.

Signed-off-by:Bihong Yu 
---
 src/qemu/qemu_dbus.c| 4 +---
 src/qemu/qemu_dbus.h| 2 +-
 src/qemu/qemu_driver.c  | 4 
 src/qemu/qemu_process.c | 3 ---
 4 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/qemu/qemu_dbus.c b/src/qemu/qemu_dbus.c
index 51f6c94..0e0306a 100644
--- a/src/qemu/qemu_dbus.c
+++ b/src/qemu/qemu_dbus.c
@@ -34,10 +34,8 @@ VIR_LOG_INIT("qemu.dbus");


 int
-qemuDBusPrepareHost(virQEMUDriverPtr driver)
+qemuDBusPreparePath(virQEMUDriverConfigPtr cfg)
 {
-g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
-
 return virDirCreate(cfg->dbusStateDir, 0770, cfg->user, cfg->group,
 VIR_DIR_CREATE_ALLOW_EXIST);
 }
diff --git a/src/qemu/qemu_dbus.h b/src/qemu/qemu_dbus.h
index 474eb10..6ce9f7b 100644
--- a/src/qemu/qemu_dbus.h
+++ b/src/qemu/qemu_dbus.h
@@ -21,7 +21,7 @@
 #include "qemu_conf.h"
 #include "qemu_domain.h"

-int qemuDBusPrepareHost(virQEMUDriverPtr driver);
+int qemuDBusPreparePath(virQEMUDriverConfigPtr cfg);

 char *qemuDBusGetAddress(virQEMUDriverPtr driver,
  virDomainObjPtr vm);
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index d185666..52b68c9 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -50,6 +50,7 @@
 #include "qemu_security.h"
 #include "qemu_checkpoint.h"
 #include "qemu_backup.h"
+#include "qemu_dbus.h"

 #include "virerror.h"
 #include "virlog.h"
@@ -790,6 +791,9 @@ qemuStateInitialize(bool privileged,
   cfg->migrationPortMax)) == NULL)
 goto error;

+if (qemuDBusPreparePath(cfg) < 0)
+goto error;
+
 if (qemuSecurityInit(qemu_driver) < 0)
 goto error;

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index eba14ed..46620ca 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -6449,9 +6449,6 @@ qemuProcessPrepareHost(virQEMUDriverPtr driver,
 qemuDomainObjPrivatePtr priv = vm->privateData;
 g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);

-if (qemuDBusPrepareHost(driver) < 0)
-return -1;
-
 if (qemuPrepareNVRAM(cfg, vm) < 0)
 return -1;

-- 
1.8.3.1

Re: [PATCH] qemu: use a fixed directory for saving qmp capabilities process

2020-07-14 Thread Jiri Denemark

On Tue, Jul 14, 2020 at 11:20:00 +0800, Bihong Yu wrote:
> >From c7ee36417b88df7dcfe5e18d1eb72b6d7c175268 Mon Sep 17 00:00:00 2001
> From: Bihong Yu 
> Date: Tue, 14 Jul 2020 11:17:46 +0800
> Subject: [PATCH] qemu: use a fixed directory for saving qmp capabilities
>  process
> 
> In some case, the qmp capabilities process possible residues. So we need to
> cleanup the residual process during start libvirt. For this reason, we
> should use a fixed directory for saving qmp capabilities process.
> 
> Change-Id: I6a74a046e192eee169a0e2677ce3aed9ded5e0ed
> Signed-off-by:Bihong Yu 
> ---
>  src/qemu/qemu_process.c | 17 -
>  1 file changed, 12 insertions(+), 5 deletions(-)

We can't do this because this code is used in more places than just QMP
capabilities probing. Specifically, APIs for CPU comparison and baseline
on s390x are passed to QEMU using this code. Which means we can have
multiple QEMU processes running at the same time.

We should rather make sure such temporary directories are not left
behind and possible cleanup them when QEMU driver starts to remove those
which were left behind when libvirtd gets restarted.

NACK

[PATCH] storage: find vstorage-mount binary in runtime

2020-07-14 Thread Nikolay Shirokovskiy

This allows us to use CI for vstorage driver without installing Virtuozzo
Storage packages. This way we can leave aside license considerations.

By the way we need to change configure defaults from 'check' to 'no' otherwise
vstorage driver will be build on any system with umount binary which is not
expected I guess.

Signed-off-by: Nikolay Shirokovskiy 
---
 m4/virt-storage-vstorage.m4| 17 +
 src/storage/storage_backend_vstorage.c |  9 -
 2 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/m4/virt-storage-vstorage.m4 b/m4/virt-storage-vstorage.m4
index e3b3bb4..cf0a543 100644
--- a/m4/virt-storage-vstorage.m4
+++ b/m4/virt-storage-vstorage.m4
@@ -21,30 +21,19 @@ dnl
 AC_DEFUN([LIBVIRT_STORAGE_ARG_VSTORAGE], [
   LIBVIRT_ARG_WITH_FEATURE([STORAGE_VSTORAGE],
[Virtuozzo Storage backend for the storage driver],
-   [check])
+   [no])
 ])
 
 AC_DEFUN([LIBVIRT_STORAGE_CHECK_VSTORAGE], [
   if test "$with_storage_vstorage" = "yes" ||
  test "$with_storage_vstorage" = "check"; then
-AC_PATH_PROG([VSTORAGE], [vstorage], [], [$LIBVIRT_SBIN_PATH])
-AC_PATH_PROG([VSTORAGE_MOUNT], [vstorage-mount], [], [$LIBVIRT_SBIN_PATH])
 AC_PATH_PROG([UMOUNT], [umount], [], [$LIBVIRT_SBIN_PATH])
 
 if test "$with_storage_vstorage" = "yes"; then
-  if test -z "$VSTORAGE" || test -z "$VSTORAGE_MOUNT"; then
-AC_MSG_ERROR([We need vstorage and vstorage-mount tool for Vstorage 
storage driver]);
-  fi
   if test -z "$UMOUNT" ; then
 AC_MSG_ERROR([We need umount for Vstorage storage driver]);
   fi
 else
-  if test -z "$VSTORAGE" ; then
-with_storage_vstorage=no
-  fi
-  if test -z "$VSTORAGE_MOUNT" ; then
-with_storage_vstorage=no
-  fi
   if test -z "$UMOUNT" ; then
 with_storage_vstorage=no
   fi
@@ -57,10 +46,6 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_VSTORAGE], [
 if test "$with_storage_vstorage" = "yes" ; then
   AC_DEFINE_UNQUOTED([WITH_STORAGE_VSTORAGE], 1,
  [whether Vstorage backend for storage driver is 
enabled])
-  AC_DEFINE_UNQUOTED([VSTORAGE], ["$VSTORAGE"],
- [Location or name of the vstorage client tool])
-  AC_DEFINE_UNQUOTED([VSTORAGE_MOUNT], ["$VSTORAGE_MOUNT"],
- [Location or name of the vstorage mount tool])
   AC_DEFINE_UNQUOTED([UMOUNT], ["$UMOUNT"],
  [Location or name of the umount programm])
 fi
diff --git a/src/storage/storage_backend_vstorage.c 
b/src/storage/storage_backend_vstorage.c
index 6cff9f1..ea3bfaa 100644
--- a/src/storage/storage_backend_vstorage.c
+++ b/src/storage/storage_backend_vstorage.c
@@ -44,9 +44,16 @@ virStorageBackendVzPoolStart(virStoragePoolObjPtr pool)
 g_autofree char *grp_name = NULL;
 g_autofree char *usr_name = NULL;
 g_autofree char *mode = NULL;
+g_autofree char *mount_bin = NULL;
 g_autoptr(virCommand) cmd = NULL;
 int ret;
 
+if (!(mount_bin = virFindFileInPath("vstorage-mount"))) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   "%s", _("unable to find vstorage-mount"));
+return -1;
+}
+
 /* Check the permissions */
 if (def->target.perms.mode == (mode_t)-1)
 def->target.perms.mode = VIR_STORAGE_DEFAULT_POOL_PERM_MODE;
@@ -65,7 +72,7 @@ virStorageBackendVzPoolStart(virStoragePoolObjPtr pool)
 
 mode = g_strdup_printf("%o", def->target.perms.mode);
 
-cmd = virCommandNewArgList(VSTORAGE_MOUNT,
+cmd = virCommandNewArgList(mount_bin,
"-c", def->source.name,
def->target.path,
"-m", mode,
-- 
1.8.3.1

Re: [PATCH] storage: fix vstorage backend build

2020-07-14 Thread Nikolay Shirokovskiy




On 13.07.2020 12:16, Andrea Bolognani wrote:
> On Mon, 2020-07-13 at 11:52 +0300, Nikolay Shirokovskiy wrote:
>> On 13.07.2020 09:57, Nikolay Shirokovskiy wrote:
>>> On 10.07.2020 21:17, Andrea Bolognani wrote:
 If you look at the list of packages through the provided "repoview"
 files, the releases from the second repository contain many more:
 additional categories include things like "Readykernel", Virtuozzo
 High Availability" and, most interesting to us, "Virtuozzo Storage".

 As mentioned in my previous message, some of these packages appear to
 be released not under the (L)GPL but under a "Virtuozzo" license that
 I haven't been able to find anywhere, and I'm not entirely sure is
 actually open source.
>>>
>>> We have Virtuozzo product and part of it is available as open source
>>> from openvz.org repo. Virtuozzo Storage is not open source but it
>>> can be used with very limited storage size without license keys.
>>> I guess you would want to read license anyway before using Virtuozzo
>>> Storage packages in CI but I'm going to remove these packages
>>> requirements anyway as I wrote in reply to Andrea's message.
>>
>> After giving it more thought I think it is useful to have vstorage
>> m4 script as it is. It is useful to detect vstorage-mount binary
>> path at build time. Although Virtuozzo Storage is not open source
>> and can not be build for distributions with different binary paths
>> by community it is a part of different Virtuozzo products and
>> theoretically speaking the path can be changed from product to product.
>> So it is useful to have path not hardcoded and not detected at runtime.
> 
> If runtime detection is good enough for qemu-img, I don't see why it
> wouldn't be good enough for vstorage too.
> 
>> Before we fix our repos please take a look at Virtuozzo license at [1]
>> (note there is different tabs and EULA tab).
>>
>> [1] https://www.virtuozzo.com/legal.html
> 
> Yeah, sorry, but I'm not going to put my name on a patch that results
> in our build environments suddenly including proprietary software.
> I'm already not a fan that happening based on principles alone, and I
> most certainly don't want to deal with any potential fallout from our
> container images violating the EULA or whatnot.
> 

Ok, given all the license stuff and the fact we can just use runtime
binary detection let's go with this variant. I'll send the patch.

Nikolay

82 matches

Mail list logo