Re: [libvirt] [PATCH v2 11/12] storage: Use virStoragePoolObj{Get|Incr}Decr}Asyncjobs
On 08/24/2017 03:09 PM, John Ferlan wrote: > Use the new accessor APIs for storage_driver. > > Signed-off-by: John Ferlan > --- > src/storage/storage_driver.c | 20 ++-- > 1 file changed, 10 insertions(+), 10 deletions(-) s/Incr}Decr/Incr|Decr/ in $SUBJ. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 07/12] storage: Use virStoragePoolObjGetAutostartLink
On 08/24/2017 03:09 PM, John Ferlan wrote: > Use the new accessor API for storage_driver. > > Signed-off-by: John Ferlan > --- > src/storage/storage_driver.c | 21 - > 1 file changed, 12 insertions(+), 9 deletions(-) > > diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c > index 7b1396f..a7a77ba 100644 > --- a/src/storage/storage_driver.c > +++ b/src/storage/storage_driver.c > @@ -814,6 +814,7 @@ static int > storagePoolUndefine(virStoragePoolPtr pool) > { > virStoragePoolObjPtr obj; > +const char *autostartLink; > virObjectEventPtr event = NULL; > int ret = -1; > > @@ -838,18 +839,17 @@ storagePoolUndefine(virStoragePoolPtr pool) > goto cleanup; > } > > +autostartLink = virStoragePoolObjGetAutostartLink(obj); virStoragePoolObjGetAutostartLink() should look a bit different. Now it's returning char * which suggests that caller is supposed to free the retval. But in fact they shouldn't. const char * would be better. But lets save that for a follow up patch. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 10/12] storage: Internally represent @autostart to bool
On 08/24/2017 03:09 PM, John Ferlan wrote: > Since it's been used that way anyway, let's just convert it to a bool > and only make the external representation be an int. > > Signed-off-by: John Ferlan > --- > src/conf/virstorageobj.c | 4 ++-- > src/conf/virstorageobj.h | 4 ++-- > src/storage/storage_driver.c | 2 +- > src/test/test_driver.c | 4 ++-- > 4 files changed, 7 insertions(+), 7 deletions(-) I'm no native speaker but probably s/to bool/as bool/ in $SUBJ? Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: Introduce a wrapper over virFileWrapperFdClose
On 09/18/2017 10:49 PM, John Ferlan wrote: > > > On 09/18/2017 05:08 AM, Michal Privoznik wrote: >> https://bugzilla.redhat.com/show_bug.cgi?id=1448268 >> >> When migrating to a file (e.g. when doing 'virsh save file'), >> couple of things are happening in the thread that is executing >> the API: >> >> 1) the domain obj is locked >> 2) iohelper is spawned as a separate process to handle all I/O >> 3) the thread waits for iohelper to finish >> 4) the domain obj is unlocked >> >> Now, the problem is that while the thread waits in step 3 for >> iohelper to finish this may take ages because iohelper calls >> fdatasync(). And unfortunately, we are waiting the whole time >> with the domain locked. So if another thread wants to jump in and >> say copy the domain name ('virsh list' for instance), they are >> stuck. >> >> The solution is to unlock the domain whenever waiting for I/O and >> lock it back again when it finished. >> >> Signed-off-by: Michal Privoznik >> --- >> src/qemu/qemu_driver.c | 27 +++ >> 1 file changed, 23 insertions(+), 4 deletions(-) >> > > I would think there'd be more concern about dropping the lock in the > middle and something changing that could more negatively affect things. > There's such a delicate balance between Save/Restore processing now - if > you adjust the locks that could mean simultaneous usage, right? Yes & no. Yes in sense that if another thread (in this case the one that is listing the domains) wants to access the domain it can. But no other API can be started really (like migrate, or hotplug) because the thread executing save/dump sets a job. A job is like a flag so that we can unlock the domain object for others to use (in a limited manner) and yet still exclude another state changing API. We lock and unlock domains all the time btw - qemuDomaninObjEnterMonitor() and qemuDomainObjExitMonitor() do just that. And exactly for the same reason as here. It may take ages for qemu to reply to our command and for that the domain obj lock should not be held. IOW, once a job is acquired we can lock & unlock domain as we please because we can be sure nobody will change the libvirt internal state of the domain. All that other threads can do is read. > > Perhaps thinking differently... Does the virdomainobjlist code for > Collect, Filter, Export really need to obtain the domain obj write lock? Unfortunately yes. vm->def is not immutable, nor vm->def->name. > > If Export is but a "snapshot in time", then using virObject{Lock|Unlock} > perhaps isn't necessary in any of the @vms list processing. Since > virDomainObjListCollectIterator sets a refcnt on the object it cannot be > freed. Beyond that any state checking done in virDomainObjListFilter is > negated because the lock isn't held until the virGetDomain call is made > during virDomainObjListExport to fill in the returned @doms list. It's > quite possible after *Filter to have the 'removing' boolean get set > before the virGetDomain call is made. So what's the need or purpose of > the write locks then? > > If the object locks are removed during Export, then virsh list won't > freeze while other operations that take time occur. Since the RWLock is > taken on the list while @vms is created, we know nothing else can > Add/Remove to/from the list - thus *that* is our snapshot in time. > Anything done after than is just reducing what's in that snapshot. > > It's "interesting" that the domain export list uses a very different > model than other drivers... That loop to take a refcnt on each object w/ > the Read lock taken is something I could see duplicated for other > drivers conceptually speaking of course if someone were to write that > kind of common processing model ;-) Right. It's not domain obj lock that is RW, it's the domain obj *list* lock that is. The domain obj lock is still a mutex. BTW: it's not only 'virsh list' that would hang. Other APIs that just read without grabbing a job would hang too: qemuDomainIsActive qemuDomainIsPersistent .. And also qemuDomainLookupByName. We really must unlock the domain instead of trying to work around the problem. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v6 2/2] qemu: add capability checking for qcow2 cache configuration
Add qemu capabilities QEMU_CAPS_L2_CACHE_SIZE, QEMU_CAPS_REFCOUNT_CACHE_SIZE, QEMU_CAPS_CACHE_CLEAN_INTERVAL. Add testing for the above qemu capabilities. Signed-off-by: Liu Qing --- src/qemu/qemu_capabilities.c | 11 src/qemu/qemu_capabilities.h | 5 src/qemu/qemu_command.c| 33 ++ tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml | 3 ++ .../caps_2.6.0-gicv2.aarch64.xml | 3 ++ .../caps_2.6.0-gicv3.aarch64.xml | 3 ++ tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml | 3 ++ tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml | 3 ++ tests/qemucapabilitiesdata/caps_2.7.0.s390x.xml| 3 ++ tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml | 3 ++ tests/qemucapabilitiesdata/caps_2.8.0.s390x.xml| 3 ++ tests/qemucapabilitiesdata/caps_2.8.0.x86_64.xml | 3 ++ tests/qemucapabilitiesdata/caps_2.9.0.ppc64le.xml | 3 ++ tests/qemucapabilitiesdata/caps_2.9.0.s390x.xml| 3 ++ tests/qemucapabilitiesdata/caps_2.9.0.x86_64.xml | 3 ++ .../qemuxml2argv-disk-drive-qcow2-cache.args | 28 ++ tests/qemuxml2argvtest.c | 4 +++ 17 files changed, 117 insertions(+) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.args diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index e7ea6f4..2db6af1 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -439,6 +439,11 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST, "virtio-net.tx_queue_size", "chardev-reconnect", "virtio-gpu.max_outputs", + + /* 270 */ + "drive.qcow2.l2-cache-size", + "drive.qcow2.refcount-cache-size", + "drive.qcow2.cache-clean-interval", ); @@ -1811,6 +1816,12 @@ static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsIntelIOMMU[] = { static struct virQEMUCapsStringFlags virQEMUCapsQMPSchemaQueries[] = { { "blockdev-add/arg-type/options/+gluster/debug-level", QEMU_CAPS_GLUSTER_DEBUG_LEVEL}, { "blockdev-add/arg-type/+gluster/debug", QEMU_CAPS_GLUSTER_DEBUG_LEVEL}, +{ "blockdev-add/arg-type/options/+qcow2/l2-cache-size", QEMU_CAPS_DRIVE_QCOW2_L2_CACHE_SIZE}, +{ "blockdev-add/arg-type/options/+qcow2/refcount-cache-size", QEMU_CAPS_DRIVE_QCOW2_REFCOUNT_CACHE_SIZE}, +{ "blockdev-add/arg-type/options/+qcow2/cache-clean-interval", QEMU_CAPS_DRIVE_QCOW2_CACHE_CLEAN_INTERVAL}, +{ "blockdev-add/arg-type/+qcow2/l2-cache-size", QEMU_CAPS_DRIVE_QCOW2_L2_CACHE_SIZE}, +{ "blockdev-add/arg-type/+qcow2/refcount-cache-size", QEMU_CAPS_DRIVE_QCOW2_REFCOUNT_CACHE_SIZE}, +{ "blockdev-add/arg-type/+qcow2/cache-clean-interval", QEMU_CAPS_DRIVE_QCOW2_CACHE_CLEAN_INTERVAL}, }; struct virQEMUCapsObjectTypeProps { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index f32687d..021297e 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -426,6 +426,11 @@ typedef enum { QEMU_CAPS_CHARDEV_RECONNECT, /* -chardev reconnect */ QEMU_CAPS_VIRTIO_GPU_MAX_OUTPUTS, /* -device virtio-(vga|gpu-*),max-outputs= */ +/* 270 */ +QEMU_CAPS_DRIVE_QCOW2_L2_CACHE_SIZE, /* -drive support qcow2 l2 cache size */ +QEMU_CAPS_DRIVE_QCOW2_REFCOUNT_CACHE_SIZE, /* -drive support qcow2 refcount cache size */ +QEMU_CAPS_DRIVE_QCOW2_CACHE_CLEAN_INTERVAL, /* -drive qcow2 cache clean interval */ + QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index d553df5..da91059 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1434,6 +1434,39 @@ qemuBuildDriveSourceStr(virDomainDiskDefPtr disk, virBufferAsprintf(buf, "format=%s,", qemuformat); } +if (disk->src->l2_cache_size > 0) { +if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DRIVE_QCOW2_L2_CACHE_SIZE)) { +virBufferAsprintf(buf, "l2-cache-size=%llu,", + disk->src->l2_cache_size); +} else { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("l2-cache-size is not supported by this QEMU")); +goto cleanup; +} +} + +if (disk->src->refcount_cache_size > 0) { +if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DRIVE_QCOW2_REFCOUNT_CACHE_SIZE)) { +virBufferAsprintf(buf, "refcount-cache-size=%llu,", + disk->src->refcount_cache_size); +} else { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("refcount-cache-size is not supported by this QEMU")); +goto cleanup; +} +} + +if (disk->src->cache_clean_interval > 0) { +if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DRIVE_QCOW2_CACHE_CLEAN_INTERVAL)) {
[libvirt] [PATCH v6 0/2] Add support for qcow2 cache
Qcow2 small IO random write performance will drop dramatically if the l2 cache table could not cover the whole disk. This will be a lot of l2 cache table RW operations if cache miss happens frequently. This patch exports the qcow2 driver parameter l2-cache-size/refcount-cache-size, first added in Qemu 2.2, and cache-clean-interval, first added in Qemu 2.5, in libvirt. change since v4: removed unnecessary cache error check Liu Qing (2): conf, docs: Add qcow2 cache configuration support qemu: add capability checking for qcow2 cache configuration docs/formatdomain.html.in | 41 + docs/schemas/domaincommon.rng | 35 src/conf/domain_conf.c | 97 -- src/qemu/qemu_capabilities.c | 11 +++ src/qemu/qemu_capabilities.h | 5 ++ src/qemu/qemu_command.c| 33 src/qemu/qemu_driver.c | 5 ++ src/util/virstoragefile.c | 3 + src/util/virstoragefile.h | 6 ++ tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml | 3 + .../caps_2.6.0-gicv2.aarch64.xml | 3 + .../caps_2.6.0-gicv3.aarch64.xml | 3 + tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml | 3 + tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml | 3 + tests/qemucapabilitiesdata/caps_2.7.0.s390x.xml| 3 + tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml | 3 + tests/qemucapabilitiesdata/caps_2.8.0.s390x.xml| 3 + tests/qemucapabilitiesdata/caps_2.8.0.x86_64.xml | 3 + tests/qemucapabilitiesdata/caps_2.9.0.ppc64le.xml | 3 + tests/qemucapabilitiesdata/caps_2.9.0.s390x.xml| 3 + tests/qemucapabilitiesdata/caps_2.9.0.x86_64.xml | 3 + .../qemuxml2argv-disk-drive-qcow2-cache.args | 28 +++ .../qemuxml2argv-disk-drive-qcow2-cache.xml| 43 ++ tests/qemuxml2argvtest.c | 4 + .../qemuxml2xmlout-disk-drive-qcow2-cache.xml | 43 ++ tests/qemuxml2xmltest.c| 1 + 26 files changed, 384 insertions(+), 7 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-qcow2-cache.xml -- 1.8.3.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v6 1/2] conf, docs: Add qcow2 cache configuration support
Random write IOPS will drop dramatically if qcow2 l2 cache could not cover the whole disk. This patch gives libvirt user a chance to adjust the qcow2 cache configuration. Three new qcow2 driver parameters (l2-cache-size, refcount-cache-size and cache-clean-interval) are added as attributes to a new subelement for a of a element. The QEMU source docs/qcow2-cache.txt provides the guidelines for defining/configuring values for each. Signed-off-by: Liu Qing --- change since v4: removed unnecessary cache error check docs/formatdomain.html.in | 41 + docs/schemas/domaincommon.rng | 35 src/conf/domain_conf.c | 97 -- src/qemu/qemu_driver.c | 5 ++ src/util/virstoragefile.c | 3 + src/util/virstoragefile.h | 6 ++ .../qemuxml2argv-disk-drive-qcow2-cache.xml| 43 ++ .../qemuxml2xmlout-disk-drive-qcow2-cache.xml | 43 ++ tests/qemuxml2xmltest.c| 1 + 9 files changed, 267 insertions(+), 7 deletions(-) create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-qcow2-cache.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 8ca7637..4574e3a 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3036,6 +3036,47 @@ set. (Since 3.5.0) +The driver element may contain a qcow2 sub +element, which to specifying further details related to a qcow2 disk. +For recommended setting guidelines refer to the QEMU source file +docs/qcow2-cache.txt. +Since 3.8.0 + + +The optional l2_cache_size attribute controls how much +memory will be consumed by qcow2 l2 table cache in bytes. This +option is only valid when the driver type is qcow2. The default +size is 2097152. +Since 3.8.0 + +In general you should leave this option alone, unless you +are very certain you know what you are doing. + + +The optional refcount_cache_size attribute controls +how much memory will be consumed by qcow2 reference count table +cache in bytes. This option is only valid when the driver type is +qcow2. The default size is 262144. +Since 3.8.0 + +In general you should leave this option alone, unless you +are very certain you know what you are doing. + + +The optional cache_clean_interval attribute defines +an interval (in seconds). All cache entries that haven't been +accessed during that interval are removed from memory. This option +is only valid when the driver type is qcow2. The default +value is 0, which disables this feature. If the interval is too +short, it will cause frequent cache write back and load, which +impact performance. If the interval is too long, unused cache +will still consume memory until it's been written back to disk. +Since 3.8.0 + +In general you should leave this option alone, unless you +are very certain you know what you are doing. + + backenddomain The optional backenddomain element allows specifying a diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index c9a4f7a..0e25f44 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1756,6 +1756,23 @@ + + + + + + + + + + + + + + + @@ -1794,6 +1811,9 @@ + + + @@ -1889,6 +1909,21 @@ + + + + + + + + + + + + + + + diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a43b25c..f225a7f 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -5734,6 +5734,30 @@ virDomainDeviceLoadparmIsValid(const char *loadparm) static void +virDomainQcow2CacheOptionsFormat(virBufferPtr buf, + virDomainDiskDefPtr def) +{ +if (def->src->l2_cache_size == 0 && +def->src->refcount_cache_size == 0 && +def->src->cache_clean_interval == 0) +return; + +virBufferAddLit(buf, "src->l2_cache_size > 0) +virBufferAsprintf(buf, " l2_cache_size='%llu'", + def->src->l2_cache_size); +if (def->src->refcount_cache_size > 0) +virBufferAsprintf(buf, " refcount_cache_size='%llu'", + def->src->refcount_cache_size); +if
Re: [libvirt] [PATCH v5 1/2] conf, docs: Add qcow2 cache configuration support
On Mon, Sep 18, 2017 at 09:49:36AM -0400, John Ferlan wrote: > > > On 09/14/2017 01:08 AM, Liu Qing wrote: > > Random write IOPS will drop dramatically if qcow2 l2 cache could not > > cover the whole disk. This patch gives libvirt user a chance to adjust > > the qcow2 cache configuration. > > > > Three new qcow2 driver parameters (l2-cache-size, refcount-cache-size > > and cache-clean-interval) are added as attributes to a new > > subelement for a of a > > element. The QEMU source docs/qcow2-cache.txt provides the guidelines > > for defining/configuring values for each. > > > > Signed-off-by: Liu Qing > > --- > > docs/formatdomain.html.in | 41 + > > docs/schemas/domaincommon.rng | 35 > > src/conf/domain_conf.c | 96 > > -- > > src/qemu/qemu_driver.c | 5 ++ > > src/util/virstoragefile.c | 3 + > > src/util/virstoragefile.h | 6 ++ > > .../qemuxml2argv-disk-drive-qcow2-cache.xml| 43 ++ > > .../qemuxml2xmlout-disk-drive-qcow2-cache.xml | 43 ++ > > tests/qemuxml2xmltest.c| 1 + > > 9 files changed, 268 insertions(+), 5 deletions(-) > > create mode 100644 > > tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.xml > > create mode 100644 > > tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-qcow2-cache.xml > > > > OK - so I had my head in other code during the v4 review comments from > Peter. So I'll just "restart" that a bit here. If I can summarize the > comments... There's a concern that we're creating a very specific set of > parameters for which there isn't great guidance available other than > going to a source file in the qemu tree and attempting to decipher it. > Something I did note at one time as a possible issue and why I also > asked for the "qcow2" subelement. It makes it painfully obvious that > there was something very specific. I figure seeing a "qcow2" element > name would fairly easily point out to anyone that specificity. > > So, based on what I read in the responses it seems that the purpose of > the values is to provide a mechanism to "allow" qemu to use a couple of > different algorithms to perform operations, but qemu wants the user to > decide whether they are willing to sacrifice one thing over another. If > you want high IOPS, then you have to sacrifice something else; whereas, > if you don't care about performance, then you sacrifice something else. > > In any case, I think what could be useful is to determine if there is a > way to "automate" something such that the "dials" would be automatically > set based on the algorithms. That is - rather than supplying the > specific values, supply a named attribute that would then perform the > calculations and set the values, such as "high_performance="yes" or > "large_capacity=yes". Where you cannot set both, but can set one or the > other. Or can you set both? Is there value in partially turning a knob? > If so maybe the attribute is "performance={default, ..., best}" and > "capacity={default, ..., large}". Then based on the value of the > attribute, code you add to libvirt would perform the calculations. > > Or do you really think that's something some supplier (such as what > you're trying to do) would want to have - the ability to control the > algorithm for the various knobs. I googled the qcow2 cache and there are already some discussion about how to give a user friendly interface, even in the qemu community. https://bugzilla.redhat.com/show_bug.cgi?id=1377735 I don't what to involve this, as this is hard to lead to an all win result. A high level policy configuration would be enough for us. If a policy configuration on libvirt level is necessary, I think a new patch could be made. For example adding new attribute in the element which controls how the cache is automatically configured. But this may lead to another discussion about what's a good policy. A policy way and the knob way could both be available on libvirt level. -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] Questions about function virPCIDeviceIsBehindSwitchLackingACS in virpci.c
Hi, In function virPCIDeviceIsBehindSwitchLackingACS, I noticed that(line 8): 1if (virPCIDeviceGetParent(dev, &parent) < 0) 2return -1; 3if (!parent) { 4/* if we have no parent, and this is the root bus, ACS doesn't come 5 * into play since devices on the root bus can't P2P without going 6 * through the root IOMMU. 7 */ 8if (dev->address.bus == 0) { 9return 0; 10} else { 11virReportError(VIR_ERR_INTERNAL_ERROR, 12 _("Failed to find parent device for %s"), 13 dev->name); 14return -1; 15} 16} Why we just return 0 only if device's bus is 0? In my server, I can see a root bus which bus number is greater than 0, see the results(just a part) after I run lspci -t: +-[:80]-+-02.0-[81-83]--+-00.0 | | \-00.1 | +-05.0 | +-05.1 | +-05.2 | \-05.4 +-[:7f]-+-08.0 | +-08.2 | +-08.3 | + . . . | \-1f.2 \-[:00]-+-00.0 +-01.0-[01]00.0 +-02.0-[02]--+-00.0 |+-00.1 |+-00.2 |\-00.3 +-02.2-[03]-- +-03.0-[04-0b]00.0-[05-0b]--+-08.0-[06-08]00.0 | \-10.0-[09-0b]00.0 +-05.0 +-05.1 +-05.2 +-05.4 +-11.0 +-11.4 +-16.0 +-16.1 +-1a.0 If I assign the device :81:00.0 to a VM, I get "Failed to find parent device". I think I should get no error with return value 0 just like bus number is 0, because bus 80 is the root bus as well in my case. In the <> Datasheet, I found that(Chapter 9.1): For some server platforms, it may be desirable to have multiple PCHs in the system Which means some PCH's may reside on a bus greater than 0. So, is this a bug? Thanks, Zongyong Wu -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 1/2] logical: implement volume resize
On 09/07/2017 08:56 AM, apolya...@beget.ru wrote: > From: Alexander Polyakov > > Add new virStorageBackendLogicalResizeVol function to > implement resize of logical volumes > > Sparse volumes are not supported > > Signed-off-by: Alexander Polyakov > --- > m4/virt-storage-lvm.m4| 4 > src/storage/storage_backend_logical.c | 37 > +++ > 2 files changed, 41 insertions(+) > Curious - is there a way to create/modify an LV to use LUKS encryption via some lvm* commands? I have another bz and the only way that comes to me right now is to create the LV using qemu-img with and then somehow have it recognized/added so that LVM recognizes it. I haven't put much thought into it yet though... Anyway... > diff --git a/m4/virt-storage-lvm.m4 b/m4/virt-storage-lvm.m4 > index a0ccca7a0..0932995b4 100644 > --- a/m4/virt-storage-lvm.m4 > +++ b/m4/virt-storage-lvm.m4 > @@ -30,6 +30,7 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_LVM], [ > AC_PATH_PROG([VGREMOVE], [vgremove], [], [$LIBVIRT_SBIN_PATH]) > AC_PATH_PROG([LVREMOVE], [lvremove], [], [$LIBVIRT_SBIN_PATH]) > AC_PATH_PROG([LVCHANGE], [lvchange], [], [$LIBVIRT_SBIN_PATH]) > +AC_PATH_PROG([LVRESIZE], [lvresize], [], [$LIBVIRT_SBIN_PATH]) > AC_PATH_PROG([VGCHANGE], [vgchange], [], [$LIBVIRT_SBIN_PATH]) > AC_PATH_PROG([VGSCAN], [vgscan], [], [$LIBVIRT_SBIN_PATH]) > AC_PATH_PROG([PVS], [pvs], [], [$LIBVIRT_SBIN_PATH]) > @@ -44,6 +45,7 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_LVM], [ >if test -z "$VGREMOVE" ; then AC_MSG_ERROR([We need vgremove for LVM > storage driver]) ; fi >if test -z "$LVREMOVE" ; then AC_MSG_ERROR([We need lvremove for LVM > storage driver]) ; fi >if test -z "$LVCHANGE" ; then AC_MSG_ERROR([We need lvchange for LVM > storage driver]) ; fi > + if test -z "$LVRESIZE" ; then AC_MSG_ERROR([We need lvresize for LVM > storage driver]) ; fi >if test -z "$VGCHANGE" ; then AC_MSG_ERROR([We need vgchange for LVM > storage driver]) ; fi >if test -z "$VGSCAN" ; then AC_MSG_ERROR([We need vgscan for LVM > storage driver]) ; fi >if test -z "$PVS" ; then AC_MSG_ERROR([We need pvs for LVM storage > driver]) ; fi > @@ -57,6 +59,7 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_LVM], [ >if test -z "$VGREMOVE" ; then with_storage_lvm=no ; fi >if test -z "$LVREMOVE" ; then with_storage_lvm=no ; fi >if test -z "$LVCHANGE" ; then with_storage_lvm=no ; fi > + if test -z "$LVRESIZE" ; then with_storage_lvm=no ; fi >if test -z "$VGCHANGE" ; then with_storage_lvm=no ; fi >if test -z "$VGSCAN" ; then with_storage_lvm=no ; fi >if test -z "$PVS" ; then with_storage_lvm=no ; fi > @@ -75,6 +78,7 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_LVM], [ >AC_DEFINE_UNQUOTED([VGREMOVE],["$VGREMOVE"],[Location of vgremove > program]) >AC_DEFINE_UNQUOTED([LVREMOVE],["$LVREMOVE"],[Location of lvremove > program]) >AC_DEFINE_UNQUOTED([LVCHANGE],["$LVCHANGE"],[Location of lvchange > program]) > + AC_DEFINE_UNQUOTED([LVRESIZE],["$LVRESIZE"],[Location of lvresize > program]) >AC_DEFINE_UNQUOTED([VGCHANGE],["$VGCHANGE"],[Location of vgchange > program]) >AC_DEFINE_UNQUOTED([VGSCAN],["$VGSCAN"],[Location of vgscan program]) >AC_DEFINE_UNQUOTED([PVS],["$PVS"],[Location of pvs program]) > diff --git a/src/storage/storage_backend_logical.c > b/src/storage/storage_backend_logical.c > index 67f70e551..e5cb09922 100644 > --- a/src/storage/storage_backend_logical.c > +++ b/src/storage/storage_backend_logical.c > @@ -1072,6 +1072,42 @@ virStorageBackendLogicalVolWipe(virConnectPtr conn, > return -1; > } > I see the lvmresize command allows you to supply which physical device the extents would come from. Would that be something we might want to try here as well? Thankfully I don't think snapshot or mirror'd LV's come into play (yet) nor thin pool or stripes. In any case, I don't know whether someone would "want" to choose a specific PV to draw from or just let LVM make the decision and let it be for now. If we did have another parameter we'd have to check if the PV existed and was part of the VG... lot's more work, but didn't want to miss noting it either. Just typing some thoughts... > +static int > +virStorageBackendLogicalResizeVol(virConnectPtr conn ATTRIBUTE_UNUSED, > + virStoragePoolObjPtr pool > ATTRIBUTE_UNUSED, > + virStorageVolDefPtr vol, > + unsigned long long capacity, > + unsigned int flags) ^ The spacing for each of the argument rows is off by a single space. > +{ > +virCheckFlags(VIR_STORAGE_VOL_RESIZE_SHRINK | > + VIR_STORAGE_VOL_RESIZE_DELTA, -1); > + > +bool delta = flags & VIR_STORAGE_VOL_RESIZE_DELTA; > +bool shrink = flags & VIR_STORAGE_VOL_RES
Re: [libvirt] [PATCH 2/2] news: Document lvm volume resize
On 09/07/2017 08:56 AM, apolya...@beget.ru wrote: > From: Alexander Polyakov > > Signed-off-by: Alexander Polyakov > --- > docs/news.xml | 8 > 1 file changed, 8 insertions(+) > > diff --git a/docs/news.xml b/docs/news.xml > index 483f9d6d1..44e99205f 100644 > --- a/docs/news.xml > +++ b/docs/news.xml > @@ -37,6 +37,14 @@ > > > > + > + > + logical: Add volume resize > + > + > + It's now possible to resize LVM volumes. I would probably add "non sparse" LVM volumes... Just to be clear. Reviewed-by: John Ferlan John > + > + > > > > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: Introduce a wrapper over virFileWrapperFdClose
On 09/18/2017 05:08 AM, Michal Privoznik wrote: > https://bugzilla.redhat.com/show_bug.cgi?id=1448268 > > When migrating to a file (e.g. when doing 'virsh save file'), > couple of things are happening in the thread that is executing > the API: > > 1) the domain obj is locked > 2) iohelper is spawned as a separate process to handle all I/O > 3) the thread waits for iohelper to finish > 4) the domain obj is unlocked > > Now, the problem is that while the thread waits in step 3 for > iohelper to finish this may take ages because iohelper calls > fdatasync(). And unfortunately, we are waiting the whole time > with the domain locked. So if another thread wants to jump in and > say copy the domain name ('virsh list' for instance), they are > stuck. > > The solution is to unlock the domain whenever waiting for I/O and > lock it back again when it finished. > > Signed-off-by: Michal Privoznik > --- > src/qemu/qemu_driver.c | 27 +++ > 1 file changed, 23 insertions(+), 4 deletions(-) > I would think there'd be more concern about dropping the lock in the middle and something changing that could more negatively affect things. There's such a delicate balance between Save/Restore processing now - if you adjust the locks that could mean simultaneous usage, right? Perhaps thinking differently... Does the virdomainobjlist code for Collect, Filter, Export really need to obtain the domain obj write lock? If Export is but a "snapshot in time", then using virObject{Lock|Unlock} perhaps isn't necessary in any of the @vms list processing. Since virDomainObjListCollectIterator sets a refcnt on the object it cannot be freed. Beyond that any state checking done in virDomainObjListFilter is negated because the lock isn't held until the virGetDomain call is made during virDomainObjListExport to fill in the returned @doms list. It's quite possible after *Filter to have the 'removing' boolean get set before the virGetDomain call is made. So what's the need or purpose of the write locks then? If the object locks are removed during Export, then virsh list won't freeze while other operations that take time occur. Since the RWLock is taken on the list while @vms is created, we know nothing else can Add/Remove to/from the list - thus *that* is our snapshot in time. Anything done after than is just reducing what's in that snapshot. It's "interesting" that the domain export list uses a very different model than other drivers... That loop to take a refcnt on each object w/ the Read lock taken is something I could see duplicated for other drivers conceptually speaking of course if someone were to write that kind of common processing model ;-) John FWIW: The "older" processing model to get the Num of domains followed by getting a list of domains could continue to do the locking... although that code could use a cleanup too, but that's a different issue. > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c > index b334cf20b..f81d3def4 100644 > --- a/src/qemu/qemu_driver.c > +++ b/src/qemu/qemu_driver.c > @@ -3216,6 +3216,25 @@ qemuOpenFileAs(uid_t fallback_uid, gid_t fallback_gid, > goto cleanup; > } > > + > +static int > +qemuFileWrapperFDClose(virDomainObjPtr vm, > + virFileWrapperFdPtr fd) > +{ > +int ret; > + > +/* virFileWrapperFd uses iohelper to write data onto disk. > + * However, iohelper calls fdatasync() which may take ages to > + * finish. Therefore, we shouldn't be waiting with the domain > + * object locked. */ > + > +virObjectUnlock(vm); > +ret = virFileWrapperFdClose(fd); > +virObjectLock(vm); > +return ret; > +} > + > + > /* Helper function to execute a migration to file with a correct save header > * the caller needs to make sure that the processors are stopped and do all > other > * actions besides saving memory */ > @@ -3276,7 +3295,7 @@ qemuDomainSaveMemory(virQEMUDriverPtr driver, > goto cleanup; > } > > -if (virFileWrapperFdClose(wrapperFd) < 0) > +if (qemuFileWrapperFDClose(vm, wrapperFd) < 0) > goto cleanup; > > if ((fd = qemuOpenFile(driver, vm, path, O_WRONLY, NULL, NULL)) < 0 || > @@ -3827,7 +3846,7 @@ doCoreDump(virQEMUDriverPtr driver, > path); > goto cleanup; > } > -if (virFileWrapperFdClose(wrapperFd) < 0) > +if (qemuFileWrapperFDClose(vm, wrapperFd) < 0) > goto cleanup; > > ret = 0; > @@ -6687,7 +6706,7 @@ qemuDomainRestoreFlags(virConnectPtr conn, > > ret = qemuDomainSaveImageStartVM(conn, driver, vm, &fd, data, path, > false, QEMU_ASYNC_JOB_START); > -if (virFileWrapperFdClose(wrapperFd) < 0) > +if (qemuFileWrapperFDClose(vm, wrapperFd) < 0) > VIR_WARN("Failed to close %s", path); > > qemuProcessEndJob(driver, vm); > @@ -6962,7 +6981,7 @@ qemuDomainObjRestore(virConnectPtr conn, > > ret = qemuDomainSaveImageS
Re: [libvirt] [PATCH v7 11/11] qemu: Add TLS support for Veritas HyperScale (VxHS)
Hi, I've done some TLS testing with this patch and results look good. The following test statically adds a VxHS disk to a guest in the TLS mode. Boots up the guest and makes sure that we can do read/writes to the VxHS disk from within the guest with TLS enabled. (1) Create a backing store file /tmp/test_vxhs_disk_1 and start the VxHS test server "qnio_server" with TLS enabled. (2) Client side TLS env was setup as follows - [root@audi ~] 2017-09-18 13:56:13# grep -i vxhs /etc/libvirt/qemu.conf | grep -v "^#" vxhs_tls = 1 vxhs_tls_x509_cert_dir = "/etc/pki/libvirt-vxhs" [root@audi ~] 2017-09-18 13:56:22# ll /etc/pki/libvirt-vxhs total 20 -r--r--r--. 1 root root 4062 Sep 17 23:15 ca-cert.pem -rw-r--r--. 1 root root 1866 Sep 17 22:52 client-cert.pem -r. 1 root root 1679 Sep 17 22:52 client-key.pem [root@audi ~] 2017-09-18 13:56:35# (3) virsh edit and add a new VxHS device with tls='yes' The XML added to existing domain - eb90327c-8302-4725-9e1b-4e85ed4dc251 (4) Start the domain and check if qemu command is correct [root@audi ~] 2017-09-18 13:29:01# virsh start myfc24 Domain myfc24 started [root@audi ~] 2017-09-18 13:29:20# ps -ef | grep qemu root 9578 1 99 13:29 ?00:00:20 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name guest=myfc24,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-myfc24/master-key.aes -machine pc-i440fx-2.6,accel=kvm,usb=off,vmport=off,dump-guest-core=off -cpu Opteron_G3 -m 1024 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 70454565-8185-4506-b50f-d2cf55d83796 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-myfc24/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive file=/var/lib/libvirt/images/myfc24_rootdisk.qcow2,format=qcow2,if=none,id=drive-ide0-0-0 -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive if=none,id=drive-ide0-0-1,readonly=on -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -object tls-creds-x509,id=objvirtio-disk2_tls0,dir=/etc/pki/libvirt-vxhs,endpoint=client,verify-peer=yes -drive file.driver=vxhs,file.tls-creds=objvirtio-disk2_tls0,file.vdisk-id=/tmp/test_vxhs_disk_1,file.server.type=tcp,file.server.host=127.0.0.1,file.server.port=,format=raw,if=none,id=drive-virtio-disk2,serial=eb90327c-8302-4725-9e1b-4e85ed4dc251,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0xa,drive=drive-virtio-disk2,id=virtio-disk2 -netdev tap,fd=27,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:e4:9e:30,bus=pci.0,addr=0x3 -netdev tap,fd=29,id=hostnet1,vhost=on,vhostfd=30 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:b1:43:c4,bus=pci.0,addr=0x8 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on (5) Log in to the guest domain and make sure we see this VxHS disk [root@camshaft ~] 2017-09-18 13:32:22# fdisk -l ... Disk /dev/vda: 1 MiB, 1048576 bytes, 2048 sectors Units: sectors of 1 * 512 = 512 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk /dev/mapper/fedora-root: 45.4 GiB, 48704258048 bytes, 95125504 sectors ... (6) Create a disk label and partition. Do mkfs and mount the FS. Copy some files to the disk and check general read/write operations. [root@camshaft ~] 2017-09-18 13:32:35# fdisk /dev/vda Created a new partition 1 of type 'Linux' and of size 1023.5 KiB. Command (m for help): p Disk /dev/vda: 1 MiB, 1048576 bytes, 2048 sectors Units: sectors of 1 * 512 = 512 bytes Sector size (logical
[libvirt] [PATCH] apparmor: support finer-grained ptrace checks
Kernel 4.13 introduced finer-grained ptrace checks https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.13.2&id=290f458a4f16f9cf6cb6562b249e69fe1c3c3a07 When Apparmor is enabled and libvirtd is confined, attempting to start a domain fails virsh start test error: Failed to start domain test error: internal error: child reported: Kernel does not provide mount namespace: Permission denied The audit log contains type=AVC msg=audit(1505466699.828:534): apparmor="DENIED" operation="ptrace" profile="/usr/sbin/libvirtd" pid=6621 comm="libvirtd" requested_mask="trace" denied_mask="trace" peer="/usr/sbin/libvirtd" It was also noticed that simply connecting to libvirtd (e.g. virsh list) resulted in the following entries in the audit log type=AVC msg=audit(1505755799.975:65): apparmor="DENIED" operation="ptrace" profile="/usr/sbin/libvirtd" pid=1418 comm="libvirtd" requested_mask="trace" denied_mask="trace" peer="unconfined" type=AVC msg=audit(1505755799.976:66): apparmor="DENIED" operation="ptrace" profile="/usr/sbin/libvirtd" pid=1418 comm="libvirtd" requested_mask="trace" denied_mask="trace" peer="unconfined" Both Apparmor denials can be fixed by adding ptrace rules to the libvirtd profile. The new rules only grant trace permission. Resolves: https://bugzilla.suse.com/show_bug.cgi?id=1058847 Signed-off-by: Jim Fehlig --- Even with debug enabled in libvirtd, I've had a hard time correlating a libvirtd action that results in the denied ptrace check seen in the audit log. I suspect it is related to accessing files in /proc as mentioned in the apparmor wiki http://wiki.apparmor.net/index.php/TechnicalDo_Proc_and_ptrace cc'ing some of the usual apparmor suspects for any words of wisdom. examples/apparmor/usr.sbin.libvirtd | 4 1 file changed, 4 insertions(+) diff --git a/examples/apparmor/usr.sbin.libvirtd b/examples/apparmor/usr.sbin.libvirtd index acb59e071..ff84aa149 100644 --- a/examples/apparmor/usr.sbin.libvirtd +++ b/examples/apparmor/usr.sbin.libvirtd @@ -37,6 +37,10 @@ network packet dgram, network packet raw, + # Support finer-grained ptrace checks, which were enabled in kernel 4.13 + ptrace trace peer=/usr/sbin/libvirtd, + ptrace trace peer=unconfined, + # Very lenient profile for libvirtd since we want to first focus on confining # the guests. Guests will have a very restricted profile. / r, -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] apparmor: delete profile on VM shutdown
instead of only unloading it. This makes sure old profiles don't pile up in /etc/apparmor.d/libvirt and we get updates to modified templates on VM restart. --- src/security/security_apparmor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c index 5afe0c5c85..1db94c632f 100644 --- a/src/security/security_apparmor.c +++ b/src/security/security_apparmor.c @@ -220,7 +220,7 @@ remove_profile(const char *profile) { int rc = -1; const char * const argv[] = { -VIRT_AA_HELPER, "-R", "-u", profile, NULL +VIRT_AA_HELPER, "-D", "-u", profile, NULL }; if (virRun(argv, NULL) == 0) -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 1/2] qemu: blockPeek: Fix filling of the return buffer
On 09/18/2017 09:11 AM, Peter Krempa wrote: > Commit 3956af495e broke the blockPeek API since virStorageFileRead > allocates a return buffer and fills it with the data, while the API > fills a user-provided buffer. This did not get caught by the compiler > since the API prototype uses a 'void *'. > > Fix it by transferring the data from the allocated buffer to the user > provided buffer. > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1491217 > --- > src/qemu/qemu_driver.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c > index e1a0dd553..93a1c6061 100644 > --- a/src/qemu/qemu_driver.c > +++ b/src/qemu/qemu_driver.c > @@ -11415,6 +11415,7 @@ qemuDomainBlockPeek(virDomainPtr dom, > virQEMUDriverPtr driver = dom->conn->privateData; > virDomainDiskDefPtr disk = NULL; > virDomainObjPtr vm; > +char *tmpbuf = NULL; > int ret = -1; > > virCheckFlags(0, -1); > @@ -11444,12 +11445,15 @@ qemuDomainBlockPeek(virDomainPtr dom, > if (virStorageFileRead(disk->src, offset, size, buffer) < 0) > goto cleanup; > > +memcpy(buffer, tmpbuf, size); Umm, where is tmpbuf actually set to a non-null pointer? Shouldn't the virStorageFileRead() call also be updated? -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] apparmor: cater for new AAVMF image location
Hi, On Mon, Sep 18, 2017 at 02:05:41PM +0200, Michal Privoznik wrote: > On 09/15/2017 06:10 PM, Guido Günther wrote: > > Things moved again, sigh. > > --- > > src/security/virt-aa-helper.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/src/security/virt-aa-helper.c b/src/security/virt-aa-helper.c > > index 55a686a59c..0b43c8e391 100644 > > --- a/src/security/virt-aa-helper.c > > +++ b/src/security/virt-aa-helper.c > > @@ -516,7 +516,8 @@ valid_path(const char *path, const bool readonly) > > "/usr/share/OVMF/", /* for OVMF images */ > > "/usr/share/ovmf/", /* for OVMF images */ > > "/usr/share/AAVMF/", /* for AAVMF images */ > > -"/usr/share/qemu-efi/" /* for AAVMF images */ > > +"/usr/share/qemu-efi/", /* for AAVMF images */ > > +"/usr/share/qemu-efi-aarch64/" /* for AAVMF images */ > > }; > > /* override the above with these */ > > const char * const override[] = { > > > > ACK Pushed. Thanks -- Guido > > Michal > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] apparmor: add attach_disconnected
Hi, On Mon, Sep 18, 2017 at 02:05:30PM +0200, Michal Privoznik wrote: > On 09/15/2017 05:17 PM, Guido Günther wrote: > > Otherwise we fail to reconnect to /dev/net/tun opened by libvirtd > > like > > > > [ 8144.507756] audit: type=1400 audit(1505488162.386:38069121): > > apparmor="DENIED" operation="file_perm" info="Failed name lookup - > > disconnected path" error=-13 > > profile="libvirt-5dfcc8a7-b79a-4fa9-a41f-f6271651934c" name="dev/net/tun" > > pid=9607 comm="qemu-system-x86" requested_mask="r" denied_mask="r" > > fsuid=117 ouid=0 > > > > --- > > I do wonder why we didn't see this earlier though. > > > > examples/apparmor/TEMPLATE.lxc | 2 +- > > examples/apparmor/TEMPLATE.qemu | 2 +- > > 2 files changed, 2 insertions(+), 2 deletions(-) > > +1/ACK/or whatever. Pushed. Thanks. -- Guido > > Michal > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] util: Fix configure.ac check for DEVLINK_CMD_ESWITCH_GET
Rather than checking for whether the devlink.h on the system has multiple symbols, let's only check for whether the command we want is defined. Turns out the mechanism of providing multiple definitions to check via AC_CHECK_DECLS in order to determine whether HAVE_DECL_DEVLINK should be set resulted in a false positive since as long as one of the defs was true, then the HAVE_DECL_DEVLINK got defined. This is further complicated by a change between kernel 4.8 and 4.11 where the originally defined name DEVLINK_CMD_ESWITCH_MODE_GET was changed to DEVLINK_CMD_ESWITCH_GET with the comment/caveat that the old format is obsolete should never be used. Therefore, even though some kernels will have a couple of the same symbols that were added at the same time (DEVLINK_ATTR_ESWITCH_MODE and DEVLINK_ESWITCH_MODE_SWITCHDEV), they won't have the newer name and thus cause a build failure. So even though multiple DEVLINK_CMD_SWITCH* symbols are used to determine whether the set HAVE_DECL_DEVLINK, this should cover those kernels before 4.11 with the old definition. Signed-off-by: John Ferlan --- I'd push this as a build breaker, but I don't want to need to go through the trouble again if i got this wrong, so hopefully someone who's seeing this can try out the patch... It's also present on a couple of the CI infrastructure environments (f23, centos-7): https://ci.centos.org/view/libvirt/job/libvirt-master-build/ configure.ac | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/configure.ac b/configure.ac index c9509c7f9..73ae7fdd5 100644 --- a/configure.ac +++ b/configure.ac @@ -630,12 +630,16 @@ fi dnl dnl check for kernel headers required by devlink dnl +dnl kernel 4.8 introduced DEVLINK_CMD_ESWITCH_MODE_GET, but that was +dnl later changed in kernel 4.11 to be just DEVLINK_CMD_ESWITCH_GET, so +dnl for "completeness" let's allow HAVE_DECL_DEVLINK to be define if +dnl either is defined. if test "$with_linux" = "yes"; then AC_CHECK_HEADERS([linux/devlink.h]) -AC_CHECK_DECLS([DEVLINK_GENL_VERSION, DEVLINK_GENL_NAME, DEVLINK_ATTR_MAX, DEVLINK_CMD_ESWITCH_GET, DEVLINK_ATTR_BUS_NAME, DEVLINK_ATTR_DEV_NAME, DEVLINK_ATTR_ESWITCH_MODE, DEVLINK_ESWITCH_MODE_SWITCHDEV], +AC_CHECK_DECLS([DEVLINK_CMD_ESWITCH_GET, DEVLINK_CMD_ESWITCH_MODE_GET], [AC_DEFINE([HAVE_DECL_DEVLINK], [1], - [whether devlink declarations are available])], + [whether devlink CMD_ESWITCH_GET is available])], [], [[#include ]]) fi -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] cpu_ppc64: Error out when model tag missing in virsh cpu-compare xml
libvirtd throws unhandled signal 11 on ppc while running virsh cpu-compare with missing model tag in the xml. This patch errors out in such situation. Signed-off-by: Nitesh Konkar --- src/cpu/cpu_ppc64.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/cpu/cpu_ppc64.c b/src/cpu/cpu_ppc64.c index b58e80a..c11ac9f 100644 --- a/src/cpu/cpu_ppc64.c +++ b/src/cpu/cpu_ppc64.c @@ -247,6 +247,12 @@ ppc64ModelFromCPU(const virCPUDef *cpu, { struct ppc64_model *model; +if (!cpu->model) { +virReportError(VIR_ERR_INVALID_ARG, "%s", + _("no guest CPU model specified")); +return NULL; +} + if (!(model = ppc64ModelFind(map, cpu->model))) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Unknown CPU model %s"), cpu->model); -- 2.9.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v4 2/7] nodedev: udev: Lock the driver in udevEventCheckMonitorFd
The udev monitor is not an immutable resource, so better protect every access to it. Signed-off-by: Erik Skultety --- src/node_device/node_device_udev.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index fe21ad7df..6c7f887ed 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1615,6 +1615,7 @@ udevHandleOneDevice(struct udev_device *device) } +/* the caller must be holding the driver lock prior to calling this function */ static bool udevEventCheckMonitorFD(struct udev_monitor *udev_monitor, int fd) @@ -1653,6 +1654,7 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, struct udev_device *device = NULL; struct udev_monitor *udev_monitor = NULL; +nodeDeviceLock(); udev_monitor = DRV_STATE_UDEV_MONITOR(driver); if (!udevEventCheckMonitorFD(udev_monitor, fd)) { @@ -1661,6 +1663,7 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, } device = udev_monitor_receive_device(udev_monitor); +nodeDeviceUnlock(); if (!device) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v4 7/7] nodedev: Work around the uevent race by hooking up virFileWaitForAccess
If we find ourselves in the situation that the 'add' uevent has been fired earlier than the sysfs tree for a device was created, we should use the best-effort approach and give kernel some predetermined amount of time, thus waiting for the attributes to be ready rather than discarding the device from our device list forever. If those don't appear in the given time frame, we need to move on, since libvirt can't wait indefinitely. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463285 Signed-off-by: Erik Skultety --- src/node_device/node_device_udev.c | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 70e15ffb8..2f63256a3 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1166,9 +1166,23 @@ udevProcessMediatedDevice(struct udev_device *dev, char *canonicalpath = NULL; virNodeDevCapMdevPtr data = &def->caps->data.mdev; -if (virAsprintf(&linkpath, "%s/mdev_type", udev_device_get_syspath(dev)) < 0) +/* Because of a kernel uevent race, we might get the 'add' event prior to + * the sysfs tree being ready, so any attempt to access any sysfs attribute + * would result in ENOENT and us dropping the device, so let's work around + * it by waiting for the attributes to become available. + */ + +if (virAsprintf(&linkpath, "%s/mdev_type", +udev_device_get_syspath(dev)) < 0) goto cleanup; +if (virFileWaitForExists(linkpath, 1, 100) < 0) { +virReportSystemError(errno, + _("failed to wait for file '%s' to appear"), + linkpath); +goto cleanup; +} + if (virFileResolveLink(linkpath, &canonicalpath) < 0) { virReportSystemError(errno, _("failed to resolve '%s'"), linkpath); goto cleanup; -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v4 5/7] nodedev: Disable/re-enable polling on the udev fd
The event loop may get scheduled earlier than the udev event handler thread which means that it would keep invoking the handler callback with "new" events, while in fact it's most likely still the same event which the handler thread hasn't managed to remove from the socket queue yet. This is due to unwanted increments of the number of events queuing on the socket and causes the handler thread to spam the logs with an error about libudev returning NULL (errno == EAGAIN). Signed-off-by: Erik Skultety --- src/node_device/node_device_udev.c | 17 + 1 file changed, 17 insertions(+) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 96760d1e4..70e15ffb8 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1696,6 +1696,7 @@ udevEventCheckMonitorFD(struct udev_monitor *udev_monitor, static void udevEventHandleThread(void *opaque) { +udevPrivate *priv = NULL; udevEventThreadDataPtr privateData = opaque; struct udev_device *device = NULL; struct udev_monitor *udev_monitor = NULL; @@ -1715,6 +1716,7 @@ udevEventHandleThread(void *opaque) virMutexUnlock(&privateData->lock); nodeDeviceLock(); +priv = driver->privateData; udev_monitor = DRV_STATE_UDEV_MONITOR(driver); if (!udevEventCheckMonitorFD(udev_monitor, privateData->monitor_fd)) { @@ -1725,6 +1727,9 @@ udevEventHandleThread(void *opaque) device = udev_monitor_receive_device(udev_monitor); nodeDeviceUnlock(); +/* Re-enable polling for new events on the @udev_monitor */ +virEventUpdateHandle(priv->watch, VIR_EVENT_HANDLE_READABLE); + if (!device) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("udev_monitor_receive_device returned NULL")); @@ -1750,10 +1755,12 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, int events ATTRIBUTE_UNUSED, void *opaque) { +udevPrivate *priv = NULL; struct udev_monitor *udev_monitor = NULL; udevEventThreadDataPtr threadData = opaque; nodeDeviceLock(); +priv = driver->privateData; udev_monitor = DRV_STATE_UDEV_MONITOR(driver); if (!udevEventCheckMonitorFD(udev_monitor, fd)) { @@ -1771,6 +1778,16 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, threadData->nevents++; virCondSignal(&threadData->threadCond); virMutexUnlock(&threadData->lock); + +/* Due to scheduling, the eventloop might poll the udev monitor before the + * handler thread actually removes the data from the socket, thus causing it + * to spam errors about data not being ready yet, because + * udevEventHandleCallback would be invoked multiple times incrementing the + * counter of events queuing for a single event. + * Therefore we need to disable polling on the fd until the thread actually + * removes the data from the socket. + */ +virEventUpdateHandle(priv->watch, 0); } -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v4 3/7] udev: Split udevEventHandleCallback in two functions
This patch splits udevEventHandleCallback in two (introduces udevEventHandleThread) in order to be later able to refactor the latter to actually become a detached thread which will wait some time for the kernel to create the whole sysfs tree for a device as we cannot do that in the event loop directly. Signed-off-by: Erik Skultety --- src/node_device/node_device_udev.c | 27 +++ 1 file changed, 23 insertions(+), 4 deletions(-) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 6c7f887ed..e144472f1 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1646,13 +1646,11 @@ udevEventCheckMonitorFD(struct udev_monitor *udev_monitor, static void -udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, -int fd, -int events ATTRIBUTE_UNUSED, -void *data ATTRIBUTE_UNUSED) +udevEventHandleThread(void *opaque) { struct udev_device *device = NULL; struct udev_monitor *udev_monitor = NULL; +int fd = (intptr_t) opaque; nodeDeviceLock(); udev_monitor = DRV_STATE_UDEV_MONITOR(driver); @@ -1676,6 +1674,27 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, } +static void +udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, +int fd, +int events ATTRIBUTE_UNUSED, +void *data ATTRIBUTE_UNUSED) +{ +struct udev_monitor *udev_monitor = NULL; + +nodeDeviceLock(); +udev_monitor = DRV_STATE_UDEV_MONITOR(driver); + +if (!udevEventCheckMonitorFD(udev_monitor, fd)) { +nodeDeviceUnlock(); +return; +} +nodeDeviceUnlock(); + +udevEventHandleThread((void *)(intptr_t) fd); +} + + /* DMI is intel-compatible specific */ #if defined(__x86_64__) || defined(__i386__) || defined(__amd64__) static void -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v4 1/7] nodedev: Introduce udevEventCheckMonitorFD helper function
We need to perform some sanity checks on the udev monitor before every use so that we know nothing changed in the meantime. The reason for moving the code to a separate function is to be able to perform the same check from a worker thread that will replace the udevEventHandleCallback in terms of poking udev for device events. Signed-off-by: Erik Skultety --- src/node_device/node_device_udev.c | 40 +- 1 file changed, 27 insertions(+), 13 deletions(-) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index f4177455c..fe21ad7df 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1615,14 +1615,10 @@ udevHandleOneDevice(struct udev_device *device) } -static void -udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, -int fd, -int events ATTRIBUTE_UNUSED, -void *data ATTRIBUTE_UNUSED) +static bool +udevEventCheckMonitorFD(struct udev_monitor *udev_monitor, +int fd) { -struct udev_device *device = NULL; -struct udev_monitor *udev_monitor = DRV_STATE_UDEV_MONITOR(driver); int udev_fd = -1; udev_fd = udev_monitor_get_fd(udev_monitor); @@ -1641,21 +1637,39 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, virEventRemoveHandle(priv->watch); priv->watch = -1; -goto cleanup; +return false; +} + +return true; +} + + +static void +udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, +int fd, +int events ATTRIBUTE_UNUSED, +void *data ATTRIBUTE_UNUSED) +{ +struct udev_device *device = NULL; +struct udev_monitor *udev_monitor = NULL; + +udev_monitor = DRV_STATE_UDEV_MONITOR(driver); + +if (!udevEventCheckMonitorFD(udev_monitor, fd)) { +nodeDeviceUnlock(); +return; } device = udev_monitor_receive_device(udev_monitor); -if (device == NULL) { + +if (!device) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("udev_monitor_receive_device returned NULL")); -goto cleanup; +return; } udevHandleOneDevice(device); - - cleanup: udev_device_unref(device); -return; } -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v4 6/7] util: Introduce virFileWaitForExists
Since we have a number of places where we workaround timing issues with devices, attributes (files in general) not being available at the time of processing them by calling usleep in a loop for a fixed number of tries, we could as well have a utility function that would do that. Therefore we won't have to duplicate this ugly workaround even more. Signed-off-by: Erik Skultety --- src/libvirt_private.syms | 1 + src/util/virfile.c | 31 +++ src/util/virfile.h | 2 ++ 3 files changed, 34 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index d12db9b3f..57a5ea300 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1730,6 +1730,7 @@ virFileStripSuffix; virFileTouch; virFileUnlock; virFileUpdatePerm; +virFileWaitForExists; virFileWrapperFdClose; virFileWrapperFdFree; virFileWrapperFdNew; diff --git a/src/util/virfile.c b/src/util/virfile.c index 2f28e83f4..01cc61e98 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -4166,3 +4166,34 @@ virFileReadValueString(char **value, const char *format, ...) VIR_FREE(str); return ret; } + + +/** + * virFileWaitForExists: + * @path: absolute path to a sysfs attribute (can be a symlink) + * @ms: how long to wait (in milliseconds) + * @tries: how many times should we try to wait for @path to become accessible + * + * Checks the existence of @path. In case the file defined by @path + * doesn't exist, we wait for it to appear in @ms milliseconds (for up to + * @tries attempts). + * + * Returns 0 on success, -1 on error, setting errno appropriately. + */ +int +virFileWaitForExists(const char *path, + size_t ms, + size_t tries) +{ +errno = 0; + +/* wait for @path to be accessible in @ms milliseconds, up to @tries */ +while (tries-- > 0 && !virFileExists(path)) { +if (tries == 0 || errno != ENOENT) +return -1; + +usleep(ms * 1000); +} + +return 0; +} diff --git a/src/util/virfile.h b/src/util/virfile.h index 57ceb8072..310a8e7f6 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -347,6 +347,8 @@ int virFileReadValueScaledInt(unsigned long long *value, const char *format, ... int virFileReadValueString(char **value, const char *format, ...) ATTRIBUTE_FMT_PRINTF(2, 3); +int virFileWaitForExists(const char *path, size_t ms, size_t tries); + int virFileInData(int fd, int *inData, -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v4 4/7] udev: Convert udevEventHandleThread to an actual thread routine
Adjust udevEventHandleThread to be a proper thread routine running in an infinite loop handling devices. Also introduce udevEventThreadData private structure. Every time there's and incoming event from udev, udevEventHandleCallback only increments the number of events queuing on the monitor and signals the worker thread to handle them. Signed-off-by: Erik Skultety --- src/node_device/node_device_udev.c | 125 +++-- 1 file changed, 107 insertions(+), 18 deletions(-) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index e144472f1..96760d1e4 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -59,6 +59,54 @@ struct _udevPrivate { bool privileged; }; +typedef struct _udevEventThreadData udevEventThreadData; +typedef udevEventThreadData *udevEventThreadDataPtr; + +struct _udevEventThreadData { +virMutex lock; +virCond threadCond; + +bool threadQuit; +int monitor_fd; +unsigned long long nevents; /* number of udev events queuing on monitor */ +}; + + +static udevEventThreadDataPtr +udevEventThreadDataNew(int fd) +{ +udevEventThreadDataPtr ret = NULL; + +if (VIR_ALLOC_QUIET(ret) < 0) +return NULL; + +if (virMutexInit(&ret->lock) < 0) +goto cleanup; + +if (virCondInit(&ret->threadCond) < 0) +goto cleanup_mutex; + +ret->monitor_fd = fd; + +return ret; + + cleanup_mutex: +virMutexDestroy(&ret->lock); + + cleanup: +VIR_FREE(ret); +return NULL; +} + + +static void +udevEventThreadDataFree(udevEventThreadDataPtr data) +{ +virMutexDestroy(&data->lock); +virCondDestroy(&data->threadCond); +VIR_FREE(data); +} + static bool udevHasDeviceProperty(struct udev_device *dev, @@ -1648,29 +1696,51 @@ udevEventCheckMonitorFD(struct udev_monitor *udev_monitor, static void udevEventHandleThread(void *opaque) { +udevEventThreadDataPtr privateData = opaque; struct udev_device *device = NULL; struct udev_monitor *udev_monitor = NULL; -int fd = (intptr_t) opaque; -nodeDeviceLock(); -udev_monitor = DRV_STATE_UDEV_MONITOR(driver); +/* continue rather than break from the loop on non-fatal errors */ +while (1) { +virMutexLock(&privateData->lock); +while (privateData->nevents == 0 && !privateData->threadQuit) { +if (virCondWait(&privateData->threadCond, &privateData->lock)) { +virReportSystemError(errno, "%s", + _("handler failed to wait on condition")); +goto cleanup; +} +} -if (!udevEventCheckMonitorFD(udev_monitor, fd)) { +privateData->nevents--; +virMutexUnlock(&privateData->lock); + +nodeDeviceLock(); +udev_monitor = DRV_STATE_UDEV_MONITOR(driver); + +if (!udevEventCheckMonitorFD(udev_monitor, privateData->monitor_fd)) { +nodeDeviceUnlock(); +goto cleanup; +} + +device = udev_monitor_receive_device(udev_monitor); nodeDeviceUnlock(); -return; -} -device = udev_monitor_receive_device(udev_monitor); -nodeDeviceUnlock(); +if (!device) { +virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("udev_monitor_receive_device returned NULL")); +goto next; +} + +udevHandleOneDevice(device); -if (!device) { -virReportError(VIR_ERR_INTERNAL_ERROR, "%s", - _("udev_monitor_receive_device returned NULL")); -return; +next: +udev_device_unref(device); } -udevHandleOneDevice(device); + cleanup: udev_device_unref(device); +udevEventThreadDataFree(privateData); +return; } @@ -1678,20 +1748,29 @@ static void udevEventHandleCallback(int watch ATTRIBUTE_UNUSED, int fd, int events ATTRIBUTE_UNUSED, -void *data ATTRIBUTE_UNUSED) +void *opaque) { struct udev_monitor *udev_monitor = NULL; +udevEventThreadDataPtr threadData = opaque; nodeDeviceLock(); udev_monitor = DRV_STATE_UDEV_MONITOR(driver); if (!udevEventCheckMonitorFD(udev_monitor, fd)) { +virMutexLock(&threadData->lock); +threadData->threadQuit = true; +virCondSignal(&threadData->threadCond); +virMutexUnlock(&threadData->lock); + nodeDeviceUnlock(); return; } nodeDeviceUnlock(); -udevEventHandleThread((void *)(intptr_t) fd); +virMutexLock(&threadData->lock); +threadData->nevents++; +virCondSignal(&threadData->threadCond); +virMutexUnlock(&threadData->lock); } @@ -1818,6 +1897,9 @@ nodeStateInitialize(bool privileged, { udevPrivate *priv = NULL; struct udev *udev = NULL; +int monitor_fd = -1; +virThread th; +udevEventThreadDataPtr threadData =
[libvirt] [PATCH v4 0/7] Work around the kernel mdev uevent race in nodedev
v3 here: https://www.redhat.com/archives/libvir-list/2017-August/msg00703.html since v3: - some minor cosmetic changes related to comments from the original patches 1-2 - everything else (provided it didn't cause merge conflicts) is unchanged Erik Skultety (7): nodedev: Introduce udevEventCheckMonitorFD helper function nodedev: udev: Lock the driver in udevEventCheckMonitorFd udev: Split udevEventHandleCallback in two functions udev: Convert udevEventHandleThread to an actual thread routine nodedev: Disable/re-enable polling on the udev fd util: Introduce virFileWaitForExists nodedev: Work around the uevent race by hooking up virFileWaitForAccess src/libvirt_private.syms | 1 + src/node_device/node_device_udev.c | 194 + src/util/virfile.c | 31 ++ src/util/virfile.h | 2 + 4 files changed, 209 insertions(+), 19 deletions(-) -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCHv2] nodedev: add switchdev to NIC capabilities
On 09/18/2017 10:57 AM, Ján Tomko wrote: > On Thu, Sep 14, 2017 at 11:23:02AM -0400, Laine Stump wrote: >> (Almost all of my comments result in "ok, no action needed". There are >> just three items I would like to see changed (2 trivial, 1 also small >> but Edan or John may think it prudent to re-test with the change before >> pushing) - I marked those comments with [**]. >> >> On 08/21/2017 05:19 AM, Edan David wrote: >>> Adding functionality to libvirt that will allow querying the interface >>> for the availability of switchdev Offloading NIC capabilities. >>> The switchdev mode was introduced in kernel 4.8, the iproute2-devlink >>> command to retrieve the swtichdev NIC feature, >>> Command example: devlink dev eswitch show pci/:03:00.0 >>> This feature is needed for Openstack so we can do a scheduling decision >>> if the NIC is in Hardware Offload (switchdev) or regular SR-IOV >>> (legacy) mode. >>> And select the appropriate hypervisors with the requested capability >>> see [1]. >>> >>> [1] - >>> https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/enable-sriov-nic-features.html >>> >>> --- >>> configure.ac | 13 ++ >>> docs/formatnode.html.in | 1 + >>> src/util/virnetdev.c | 187 >>> +- >>> src/util/virnetdev.h | 1 + >>> tests/nodedevschemadata/net_00_13_02_b9_f9_d3.xml | 1 + >>> tests/nodedevschemadata/net_00_15_58_2f_e9_55.xml | 1 + >>> 6 files changed, 203 insertions(+), 1 deletion(-) >>> > > This patch fails to compile on a Gentoo machine with > sys-kernel/linux-headers-4.8: > > util/virnetdev.c:3248:18: error: use of undeclared identifier > 'DEVLINK_CMD_ESWITCH_GET'; did you mean 'DEVLINK_CMD_ESWITCH_MODE_GET'? > gmsgh->cmd = DEVLINK_CMD_ESWITCH_GET; > ^~~ > DEVLINK_CMD_ESWITCH_MODE_GET > /usr/include/linux/devlink.h:60:2: note: 'DEVLINK_CMD_ESWITCH_MODE_GET' > declared here > DEVLINK_CMD_ESWITCH_MODE_GET, > ^ > 1 error generated. > >>> diff --git a/configure.ac b/configure.ac >>> index b12b7fa..c089798 100644 >>> --- a/configure.ac >>> +++ b/configure.ac >>> @@ -627,6 +627,19 @@ if test "$with_linux" = "yes"; then >>> AC_CHECK_HEADERS([linux/btrfs.h]) >>> fi >>> >>> +dnl >>> +dnl check for kernel headers required by devlink >>> +dnl >>> +if test "$with_linux" = "yes"; then >>> + AC_CHECK_HEADERS([linux/devlink.h]) >>> + AC_CHECK_DECLS([DEVLINK_GENL_VERSION, DEVLINK_GENL_NAME, >>> DEVLINK_ATTR_MAX, DEVLINK_CMD_ESWITCH_GET, DEVLINK_ATTR_BUS_NAME, >>> DEVLINK_ATTR_DEV_NAME, DEVLINK_ATTR_ESWITCH_MODE, >>> DEVLINK_ESWITCH_MODE_SWITCHDEV], >> >> That's very . thorough, and potentially misleading if someone ever >> wanted to use devlink to check for something other than switchdev (e.g. >> [mythical feature X]) on some system that didn't have the proper stuff >> defined for switchdev, but did have it for [other mythical feature X]. >> But that's all just hypothetical, so this is fine with me. >> >> >>> + [AC_DEFINE([HAVE_DECL_DEVLINK], >>> + [1], >>> + [whether devlink declarations is >>> available])], >> >> [**] >> s/is/are/ >> > > It seems the [action-if-found] is executed for every found symbol: > > #define HAVE_DECL_DEVLINK_GENL_VERSION 1 > #define HAVE_DECL_DEVLINK 1 > #define HAVE_DECL_DEVLINK_GENL_NAME 1 > #define HAVE_DECL_DEVLINK 1 > #define HAVE_DECL_DEVLINK_ATTR_MAX 1 > #define HAVE_DECL_DEVLINK 1 > #define HAVE_DECL_DEVLINK_CMD_ESWITCH_GET 0 > #define HAVE_DECL_DEVLINK_ATTR_BUS_NAME 1 > #define HAVE_DECL_DEVLINK 1 > #define HAVE_DECL_DEVLINK_ATTR_DEV_NAME 1 > #define HAVE_DECL_DEVLINK 1 > #define HAVE_DECL_DEVLINK_ATTR_ESWITCH_MODE 1 > #define HAVE_DECL_DEVLINK 1 > #define HAVE_DECL_DEVLINK_ESWITCH_MODE_SWITCHDEV 1 > #define HAVE_DECL_DEVLINK 1 > > so this usage of AC_CHECK_DECLS is bogus. > > Jan > So then the question becomes is it "safe enough" to only look for HAVE_DECL_DEVLINK_CMD_ESWITCH_GET and assume the others are present or does there have to be some sort of AC_CHECK_DECLS for every one of the symbols that are going to be used and then some way to only allow that code to be compiled if all are true? I'd say the former - just check for CMD_ESWITCH_GET and assume the others, but you know what can be said about assumptions... And this is why I didn't want to push this on Saturday! bad enough that I've been in and out of the office today ;-) John >>> + [], >>> + [[#include ]]) >>> +fi >>> + >>> dnl Allow perl/python overrides >>> AC_PATH_PROGS([PYTHON], [python2 python]) >>> if test -z "$PYTHON"; then > > > -- > libvir-list mailing list > libvir-list@redhat.com > https://www.redhat.com/mailman/listinfo/libvir-list > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/l
Re: [libvirt] [PATCH] check-spacing: Don't hardcode Perl interpreter path
On Mon, Sep 18, 2017 at 02:39:39PM +0200, Andrea Bolognani wrote: This is particularly useful on operating systems that don't ship Perl as part of the base system (eg. FreeBSD) while still working just as well as it did before on Linux. Signed-off-by: Andrea Bolognani --- build-aux/check-spacing.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) For the usual use case of 'make syntax-check' this should be working already, since we run it as $(PERL) with the binary path detected at configure time. But this change is an improvement. ACK if you extend it to include all perl scripts. Jan diff --git a/build-aux/check-spacing.pl b/build-aux/check-spacing.pl index 448acf234..ca8b43491 100755 --- a/build-aux/check-spacing.pl +++ b/build-aux/check-spacing.pl @@ -1,4 +1,4 @@ -#!/usr/bin/perl +#!/usr/bin/env perl # # check-spacing.pl: Report any usage of 'function (..args..)' # Also check for other syntax issues, such as correct use of ';' -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list signature.asc Description: Digital signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v5 1/2] conf, docs: Add qcow2 cache configuration support
On Mon, Sep 18, 2017 at 09:49:36 -0400, John Ferlan wrote: > > > On 09/14/2017 01:08 AM, Liu Qing wrote: > > Random write IOPS will drop dramatically if qcow2 l2 cache could not > > cover the whole disk. This patch gives libvirt user a chance to adjust > > the qcow2 cache configuration. > > > > Three new qcow2 driver parameters (l2-cache-size, refcount-cache-size > > and cache-clean-interval) are added as attributes to a new > > subelement for a of a > > element. The QEMU source docs/qcow2-cache.txt provides the guidelines > > for defining/configuring values for each. > > > > Signed-off-by: Liu Qing > > --- > > docs/formatdomain.html.in | 41 + > > docs/schemas/domaincommon.rng | 35 > > src/conf/domain_conf.c | 96 > > -- > > src/qemu/qemu_driver.c | 5 ++ > > src/util/virstoragefile.c | 3 + > > src/util/virstoragefile.h | 6 ++ > > .../qemuxml2argv-disk-drive-qcow2-cache.xml| 43 ++ > > .../qemuxml2xmlout-disk-drive-qcow2-cache.xml | 43 ++ > > tests/qemuxml2xmltest.c| 1 + > > 9 files changed, 268 insertions(+), 5 deletions(-) > > create mode 100644 > > tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.xml > > create mode 100644 > > tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-qcow2-cache.xml > > > > OK - so I had my head in other code during the v4 review comments from > Peter. So I'll just "restart" that a bit here. If I can summarize the > comments... There's a concern that we're creating a very specific set of > parameters for which there isn't great guidance available other than > going to a source file in the qemu tree and attempting to decipher it. > Something I did note at one time as a possible issue and why I also > asked for the "qcow2" subelement. It makes it painfully obvious that > there was something very specific. I figure seeing a "qcow2" element > name would fairly easily point out to anyone that specificity. > > So, based on what I read in the responses it seems that the purpose of > the values is to provide a mechanism to "allow" qemu to use a couple of > different algorithms to perform operations, but qemu wants the user to > decide whether they are willing to sacrifice one thing over another. If > you want high IOPS, then you have to sacrifice something else; whereas, > if you don't care about performance, then you sacrifice something else. > > In any case, I think what could be useful is to determine if there is a > way to "automate" something such that the "dials" would be automatically > set based on the algorithms. That is - rather than supplying the > specific values, supply a named attribute that would then perform the > calculations and set the values, such as "high_performance="yes" or > "large_capacity=yes". Where you cannot set both, but can set one or the > other. Or can you set both? Is there value in partially turning a knob? > If so maybe the attribute is "performance={default, ..., best}" and > "capacity={default, ..., large}". Then based on the value of the > attribute, code you add to libvirt would perform the calculations. While I think that having an user friendly way (not having to inspect the image to set similar parameters across different VMs) would be great, boolean parameters will basically translate into some kind of policy decision we are trying to avoid. I thought that we could set the cache e.g. in percent of the image it can cover, but that lacks transparency in regards to consumed memory, but that's similar to the boolean flags. While I really don't like meaningless knobs that can change in the future I currently can't think of anything that would be user friendly and not step into the "policy" region. signature.asc Description: PGP signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCHv2] nodedev: add switchdev to NIC capabilities
On Thu, Sep 14, 2017 at 11:23:02AM -0400, Laine Stump wrote: (Almost all of my comments result in "ok, no action needed". There are just three items I would like to see changed (2 trivial, 1 also small but Edan or John may think it prudent to re-test with the change before pushing) - I marked those comments with [**]. On 08/21/2017 05:19 AM, Edan David wrote: Adding functionality to libvirt that will allow querying the interface for the availability of switchdev Offloading NIC capabilities. The switchdev mode was introduced in kernel 4.8, the iproute2-devlink command to retrieve the swtichdev NIC feature, Command example: devlink dev eswitch show pci/:03:00.0 This feature is needed for Openstack so we can do a scheduling decision if the NIC is in Hardware Offload (switchdev) or regular SR-IOV (legacy) mode. And select the appropriate hypervisors with the requested capability see [1]. [1] - https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/enable-sriov-nic-features.html --- configure.ac | 13 ++ docs/formatnode.html.in | 1 + src/util/virnetdev.c | 187 +- src/util/virnetdev.h | 1 + tests/nodedevschemadata/net_00_13_02_b9_f9_d3.xml | 1 + tests/nodedevschemadata/net_00_15_58_2f_e9_55.xml | 1 + 6 files changed, 203 insertions(+), 1 deletion(-) This patch fails to compile on a Gentoo machine with sys-kernel/linux-headers-4.8: util/virnetdev.c:3248:18: error: use of undeclared identifier 'DEVLINK_CMD_ESWITCH_GET'; did you mean 'DEVLINK_CMD_ESWITCH_MODE_GET'? gmsgh->cmd = DEVLINK_CMD_ESWITCH_GET; ^~~ DEVLINK_CMD_ESWITCH_MODE_GET /usr/include/linux/devlink.h:60:2: note: 'DEVLINK_CMD_ESWITCH_MODE_GET' declared here DEVLINK_CMD_ESWITCH_MODE_GET, ^ 1 error generated. diff --git a/configure.ac b/configure.ac index b12b7fa..c089798 100644 --- a/configure.ac +++ b/configure.ac @@ -627,6 +627,19 @@ if test "$with_linux" = "yes"; then AC_CHECK_HEADERS([linux/btrfs.h]) fi +dnl +dnl check for kernel headers required by devlink +dnl +if test "$with_linux" = "yes"; then +AC_CHECK_HEADERS([linux/devlink.h]) +AC_CHECK_DECLS([DEVLINK_GENL_VERSION, DEVLINK_GENL_NAME, DEVLINK_ATTR_MAX, DEVLINK_CMD_ESWITCH_GET, DEVLINK_ATTR_BUS_NAME, DEVLINK_ATTR_DEV_NAME, DEVLINK_ATTR_ESWITCH_MODE, DEVLINK_ESWITCH_MODE_SWITCHDEV], That's very . thorough, and potentially misleading if someone ever wanted to use devlink to check for something other than switchdev (e.g. [mythical feature X]) on some system that didn't have the proper stuff defined for switchdev, but did have it for [other mythical feature X]. But that's all just hypothetical, so this is fine with me. + [AC_DEFINE([HAVE_DECL_DEVLINK], + [1], + [whether devlink declarations is available])], [**] s/is/are/ It seems the [action-if-found] is executed for every found symbol: #define HAVE_DECL_DEVLINK_GENL_VERSION 1 #define HAVE_DECL_DEVLINK 1 #define HAVE_DECL_DEVLINK_GENL_NAME 1 #define HAVE_DECL_DEVLINK 1 #define HAVE_DECL_DEVLINK_ATTR_MAX 1 #define HAVE_DECL_DEVLINK 1 #define HAVE_DECL_DEVLINK_CMD_ESWITCH_GET 0 #define HAVE_DECL_DEVLINK_ATTR_BUS_NAME 1 #define HAVE_DECL_DEVLINK 1 #define HAVE_DECL_DEVLINK_ATTR_DEV_NAME 1 #define HAVE_DECL_DEVLINK 1 #define HAVE_DECL_DEVLINK_ATTR_ESWITCH_MODE 1 #define HAVE_DECL_DEVLINK 1 #define HAVE_DECL_DEVLINK_ESWITCH_MODE_SWITCHDEV 1 #define HAVE_DECL_DEVLINK 1 so this usage of AC_CHECK_DECLS is bogus. Jan + [], + [[#include ]]) +fi + dnl Allow perl/python overrides AC_PATH_PROGS([PYTHON], [python2 python]) if test -z "$PYTHON"; then signature.asc Description: Digital signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: Mark graphics ports used on reconnect
On 09/18/2017 04:10 PM, Cole Robinson wrote: > On 09/18/2017 09:47 AM, Michal Privoznik wrote: >> This is an issue that's bugging me for a long time. I don't know >> exactly when and who is to blame but on daemon reconnect we lose >> qemu's port allocator internal state. > > It's the kernel's fault, broken in 4.11+, patches recently posted > upstream but not in any release yet: > > https://bugzilla.redhat.com/show_bug.cgi?id=1432684 > Cool thanks. I'll keep an eye on it. Nevertheless, we should merge this patch as from control POV it makes no sense to try to bind to ports we know are taken. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 0/2] qemu: blockPeek: Fix invalid buffer usage and
See 1/2 for the bugfix. Peter Krempa (2): qemu: blockPeek: Fix filling of the return buffer qemu: blockPeek: Enforce buffer filling src/qemu/qemu_driver.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v5 1/2] conf, docs: Add qcow2 cache configuration support
On 09/14/2017 01:08 AM, Liu Qing wrote: > Random write IOPS will drop dramatically if qcow2 l2 cache could not > cover the whole disk. This patch gives libvirt user a chance to adjust > the qcow2 cache configuration. > > Three new qcow2 driver parameters (l2-cache-size, refcount-cache-size > and cache-clean-interval) are added as attributes to a new > subelement for a of a > element. The QEMU source docs/qcow2-cache.txt provides the guidelines > for defining/configuring values for each. > > Signed-off-by: Liu Qing > --- > docs/formatdomain.html.in | 41 + > docs/schemas/domaincommon.rng | 35 > src/conf/domain_conf.c | 96 > -- > src/qemu/qemu_driver.c | 5 ++ > src/util/virstoragefile.c | 3 + > src/util/virstoragefile.h | 6 ++ > .../qemuxml2argv-disk-drive-qcow2-cache.xml| 43 ++ > .../qemuxml2xmlout-disk-drive-qcow2-cache.xml | 43 ++ > tests/qemuxml2xmltest.c| 1 + > 9 files changed, 268 insertions(+), 5 deletions(-) > create mode 100644 > tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.xml > create mode 100644 > tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-qcow2-cache.xml > OK - so I had my head in other code during the v4 review comments from Peter. So I'll just "restart" that a bit here. If I can summarize the comments... There's a concern that we're creating a very specific set of parameters for which there isn't great guidance available other than going to a source file in the qemu tree and attempting to decipher it. Something I did note at one time as a possible issue and why I also asked for the "qcow2" subelement. It makes it painfully obvious that there was something very specific. I figure seeing a "qcow2" element name would fairly easily point out to anyone that specificity. So, based on what I read in the responses it seems that the purpose of the values is to provide a mechanism to "allow" qemu to use a couple of different algorithms to perform operations, but qemu wants the user to decide whether they are willing to sacrifice one thing over another. If you want high IOPS, then you have to sacrifice something else; whereas, if you don't care about performance, then you sacrifice something else. In any case, I think what could be useful is to determine if there is a way to "automate" something such that the "dials" would be automatically set based on the algorithms. That is - rather than supplying the specific values, supply a named attribute that would then perform the calculations and set the values, such as "high_performance="yes" or "large_capacity=yes". Where you cannot set both, but can set one or the other. Or can you set both? Is there value in partially turning a knob? If so maybe the attribute is "performance={default, ..., best}" and "capacity={default, ..., large}". Then based on the value of the attribute, code you add to libvirt would perform the calculations. Or do you really think that's something some supplier (such as what you're trying to do) would want to have - the ability to control the algorithm for the various knobs. Hopefully this makes sense... One minor nit below which is easily fixable... > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in > index 8ca7637..4574e3a 100644 > --- a/docs/formatdomain.html.in > +++ b/docs/formatdomain.html.in > @@ -3036,6 +3036,47 @@ >set. (Since 3.5.0) > > > +The driver element may contain a qcow2 sub > +element, which to specifying further details related to a qcow2 disk. > +For recommended setting guidelines refer to the QEMU source file > +docs/qcow2-cache.txt. > +Since 3.8.0 > + > + > +The optional l2_cache_size attribute controls how > much > +memory will be consumed by qcow2 l2 table cache in bytes. This > +option is only valid when the driver type is qcow2. The default > +size is 2097152. > +Since 3.8.0 > + > +In general you should leave this option alone, unless you > +are very certain you know what you are doing. > + > + > +The optional refcount_cache_size attribute controls > +how much memory will be consumed by qcow2 reference count table > +cache in bytes. This option is only valid when the driver type is > +qcow2. The default size is 262144. > +Since 3.8.0 > + > +In general you should leave this option alone, unless you > +are very certain you know what you are doing. > + > + > +The optional cache_clean_interval attribute defines > +an interval (in seconds). All cache entries that haven't been > +
Re: [libvirt] [PATCH v2 0/3] Be more selective when determining cdrom for taint messaging
On 09/18/2017 09:12 AM, Michal Privoznik wrote: > On 09/11/2017 04:32 PM, John Ferlan wrote: >> v1: https://www.redhat.com/archives/libvir-list/2017-September/msg00103.html >> >> Changes since v1: >> >> Split into 3 parts... The first patch would be the bare minimum using >> STRPREFIX instead of STREQ type comparisons for the incoming path to >> be "/dev/cdrom[N]" or "/dev/srN" (or resolved to that). >> >> This would "work" for the most part, but then since it's possible to >> make even more checks let's check against the collected node device >> data. Patch 2 therefore will "tag" the already collected cdrom data >> with a capability. This allows patch3 to find any/all CDROM's on the >> host and compare the resolved path to that list of devices returning >> "true" if something matches a node device declared physical CDROM. >> >> I split things up mainly to make it easier to decide whether patch 1 >> is sufficient or not. If patch2 and patch3 are OK, I would also add >> a release note indicating the improvement to find CDROM by node device >> capability. It's a separate "improvement" on it's own as well. Whether >> it's truly useful or not, is a different question... > > [1] > >> >> John Ferlan (3): >> qemu: Be more selective when determining cdrom for taint messaging > > ACK to this one ^^ > >> nodedev: Add capability bit to detect 'cdrom' devices >> qemu: Add inquiry to nodedev for cdrom taint checking > > However, these two ^^ look like an overkill to me. It's still just a > taint message that nobody cares about. Or? > 1: Yeah, I don't think we really need such a big hammer for tiny nail. > But I might be missing something. > > Michal > I agree with you, but just in case someone wanted to use that sledge hammer in order to catch some really obscure corner condition, I figured I'd show it was possible... Still I can give it a few more days to see if someone indicates they would also like to see usage of the sledge hammer. John -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 1/2] qemu: blockPeek: Fix filling of the return buffer
Commit 3956af495e broke the blockPeek API since virStorageFileRead allocates a return buffer and fills it with the data, while the API fills a user-provided buffer. This did not get caught by the compiler since the API prototype uses a 'void *'. Fix it by transferring the data from the allocated buffer to the user provided buffer. Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1491217 --- src/qemu/qemu_driver.c | 4 1 file changed, 4 insertions(+) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e1a0dd553..93a1c6061 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11415,6 +11415,7 @@ qemuDomainBlockPeek(virDomainPtr dom, virQEMUDriverPtr driver = dom->conn->privateData; virDomainDiskDefPtr disk = NULL; virDomainObjPtr vm; +char *tmpbuf = NULL; int ret = -1; virCheckFlags(0, -1); @@ -11444,12 +11445,15 @@ qemuDomainBlockPeek(virDomainPtr dom, if (virStorageFileRead(disk->src, offset, size, buffer) < 0) goto cleanup; +memcpy(buffer, tmpbuf, size); + ret = 0; cleanup: if (disk) virStorageFileDeinit(disk->src); virDomainObjEndAPI(&vm); +VIR_FREE(tmpbuf); return ret; } -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 2/2] qemu: blockPeek: Enforce buffer filling
Documentation states: "'offset' and 'size' represent an area which must lie entirely within the device or file." Enforce the that the buffer lies within fully. --- src/qemu/qemu_driver.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 93a1c6061..bddba6b71 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -11416,6 +11416,7 @@ qemuDomainBlockPeek(virDomainPtr dom, virDomainDiskDefPtr disk = NULL; virDomainObjPtr vm; char *tmpbuf = NULL; +ssize_t nread; int ret = -1; virCheckFlags(0, -1); @@ -11442,9 +11443,16 @@ qemuDomainBlockPeek(virDomainPtr dom, if (qemuDomainStorageFileInit(driver, vm, disk->src) < 0) goto cleanup; -if (virStorageFileRead(disk->src, offset, size, buffer) < 0) +if ((nread = virStorageFileRead(disk->src, offset, size, &tmpbuf)) < 0) goto cleanup; +if (nread < size) { +virReportError(VIR_ERR_INVALID_ARG, + _("'%s' starting from %llu has only %zd bytes available"), + path, offset, nread); +goto cleanup; +} + memcpy(buffer, tmpbuf, size); ret = 0; -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: Mark graphics ports used on reconnect
On 09/18/2017 09:47 AM, Michal Privoznik wrote: > This is an issue that's bugging me for a long time. I don't know > exactly when and who is to blame but on daemon reconnect we lose > qemu's port allocator internal state. It's the kernel's fault, broken in 4.11+, patches recently posted upstream but not in any release yet: https://bugzilla.redhat.com/show_bug.cgi?id=1432684 - Cole -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction
On Mon, Sep 18, 2017 at 07:41:09AM -0500, Richard Relph wrote: > > > On 9/18/17 4:47 AM, Daniel P. Berrange wrote: > > On Mon, Sep 18, 2017 at 11:43:57AM +0200, Erik Skultety wrote: > > > [...] > > > > > > > > > c) what existing communicate interface can be used between > > > > > libvirt and qemu > > > > > > to get the measurement ? can we add a new qemu monitor command > > > > > > 'get_sev_measurement' to get the measurement ? (step 10) > > > > > > > > > > Yes, QMP commands seeem most likely. > > > > > > > > > > > d) how to pass the secret blob from libvirt to qemu ? should > > > > > we consider > > > > > > adding a new object (sev-guest-secret) -- libvirt can add the > > > > > object through > > > > > > qemu monitor. > > > > > > > > > > Yeah, that looks like a viable option too. > > > > So I could see a flow like the following: > > > > > > > > > > > >1. mgmt tool calls virConnectGetCapabilities. This returns an XML > > > > document that includes the following > > > > > > > > > > > > ...other bits... > > > > > > > > ...hex encoded PDH key... > > > > > > > > > > > > > > > >2. mgmt tool requests to start a guest calling virCreateXML(), > > > > passing VIR_DOMAIN_START_PAUSED. The XML would include > > > > > > > > > > > > ...hex encode DH key... > > > > ..hex encode info... > > > > ...int32 value.. > > > > > > > > > > > > > > > > if is provided and VIR_DOMAIN_START_PAUSED is missing, > > > > libvirt would report an error and refuse to start the guest > For ease of use, I would not add this conditional to libvirt. If is > provided and VIR_DOMAIN_START_PAUSED is missing, I’d just send the "GO" I also feel that the presence of the element might determine the usage of the VIR_DOMAIN_START_PAUSED flag implicitly. > command as it would naturally occur. > Unless that would confuse things inside libvirt or QEMU in relation to the > measurement and secret… > Many of our existing tests focus on other aspects of SEV functionality and > so they skip the MEASURE/SECRET phase of launch and just go immediately from > LAUNCH_UPDATE_DATA (or VMSA) to LAUNCH_FINISH. > I guess the key question will be how will QEMU know when to get the > MEASUREMENT and wait for a LAUNCH_SECRET before doing a LAUNCH_FINISH when > connected to libvirt. > Brijesh, this is your area. It feels to me like QEMU will have to wait to do > the LAUNCH_FINISH until it gets the first “go” from libvirt. If that’s > right, and assuming the same “go” comes from libvirt with or without > VIR_DOMAIN_START_PAUSED, then I’d simply exclude the conditional check. QEMU > would get the measurement when it is done sending the data. > Though in “real world” uses, I think the conditional is perfectly OK. > > > > > > > >3. Libvirt generates the QEMU cli arg to enable SEV using > > > > the XML data and starts QEMU, leaving CPUs paused > > > > > > > >4. QEMU emits a SEV_MEASURE event containing the measurement > > > > blob > > > Speaking of which, I expect QEMU to have a QMP command to retrieve the > > > measurement, in which case I think libvirt has to provide an API for the > > > user > > > to retrieve the measurement in case libvirtd crashes somewhere between > > > setting > > > up QEMU and waiting for the measurement event from QEMU, or simply > > > because the > > > GO missed the event for some unspecified reason. > > Yeah, that's a good point - we also ought to have a pause-reason that > > reflects that it is paused due to waiting for SEV secrets. > > > > > >5. Libvirt catches the QEMU event and emits its own > > > > VIR_CONNECT_DOMAIN_EVENT_SEV_MEASURE event containing > > > > the measurement blob > > > > > > > >6. GO does its validation of the measurement > > > > > > > >7a If validation failed, then virDomainDestroy() to stop QEMU > > > > > > > >7b If validation succeeed > > > > > > > > Optionally call > > > > > > > > virDomainSetSEVSecret() > > > Given the fact that we're likely introducing a new element to the > > > XML > > > config, I'm more inclined to utilizing the existing virSecret interfaces > > > (as > > > was originally suggested) instead of creating a vendor-specific API. You > > > could > > > have an optional secret sub-element within the element and libvirt > > > would > > > simply check if that secret has a value set, once the GO issues > > > virDomainResume(). Any particular reason for having a specific API for > > > this that > > > I'm missing? > > Initially I was intending to suggest extensive use of virSecret, but it > > turns out that despite being called a "secret", none of the SEV data we are > > passing around needs protection. Either it is safe to be public, or it is > > already encrypted. So essentially we just have some data blobs we need to > > pass into QEMU. I didn't feel we ought to be
[libvirt] [PATCH 3/6] cpu_conf: Simplify formatting of guest CPU attributes
Signed-off-by: Jiri Denemark --- src/conf/cpu_conf.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/src/conf/cpu_conf.c b/src/conf/cpu_conf.c index 6058d26fa5..02506c020b 100644 --- a/src/conf/cpu_conf.c +++ b/src/conf/cpu_conf.c @@ -604,20 +604,20 @@ virCPUDefFormatBufFull(virBufferPtr buf, if (!def) return 0; -/* Format attributes */ -if (def->type == VIR_CPU_TYPE_GUEST) { +/* Format attributes for guest CPUs unless they only specify + * topology or cache. */ +if (def->type == VIR_CPU_TYPE_GUEST && +(def->mode != VIR_CPU_MODE_CUSTOM || def->model)) { const char *tmp; -if (def->mode != VIR_CPU_MODE_CUSTOM || def->model) { -if (!(tmp = virCPUModeTypeToString(def->mode))) { -virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unexpected CPU mode %d"), def->mode); -goto cleanup; -} -virBufferAsprintf(&attributeBuf, " mode='%s'", tmp); +if (!(tmp = virCPUModeTypeToString(def->mode))) { +virReportError(VIR_ERR_INTERNAL_ERROR, + _("Unexpected CPU mode %d"), def->mode); +goto cleanup; } +virBufferAsprintf(&attributeBuf, " mode='%s'", tmp); -if (def->model && def->mode == VIR_CPU_MODE_CUSTOM) { +if (def->mode == VIR_CPU_MODE_CUSTOM) { if (!(tmp = virCPUMatchTypeToString(def->match))) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Unexpected CPU match policy %d"), -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 4/6] conf: Drop unused VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU
The only real usage of this flag was removed by "cpu_conf: Drop updateCPU from virCPUDefFormat". Signed-off-by: Jiri Denemark --- src/conf/domain_conf.c | 3 --- src/conf/domain_conf.h | 1 - src/conf/snapshot_conf.c | 3 +-- 3 files changed, 1 insertion(+), 6 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 35adce95b5..984db98107 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -87,7 +87,6 @@ struct _virDomainXMLOption { #define VIR_DOMAIN_DEF_FORMAT_COMMON_FLAGS \ (VIR_DOMAIN_DEF_FORMAT_SECURE |\ VIR_DOMAIN_DEF_FORMAT_INACTIVE | \ - VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU |\ VIR_DOMAIN_DEF_FORMAT_MIGRATABLE) VIR_ENUM_IMPL(virDomainTaint, VIR_DOMAIN_TAINT_LAST, @@ -25974,8 +25973,6 @@ unsigned int virDomainDefFormatConvertXMLFlags(unsigned int flags) formatFlags |= VIR_DOMAIN_DEF_FORMAT_SECURE; if (flags & VIR_DOMAIN_XML_INACTIVE) formatFlags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE; -if (flags & VIR_DOMAIN_XML_UPDATE_CPU) -formatFlags |= VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU; if (flags & VIR_DOMAIN_XML_MIGRATABLE) formatFlags |= VIR_DOMAIN_DEF_FORMAT_MIGRATABLE; diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index bb3b6f0c3c..4603e4f97e 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2857,7 +2857,6 @@ typedef enum { typedef enum { VIR_DOMAIN_DEF_FORMAT_SECURE = 1 << 0, VIR_DOMAIN_DEF_FORMAT_INACTIVE= 1 << 1, -VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU = 1 << 2, VIR_DOMAIN_DEF_FORMAT_MIGRATABLE = 1 << 3, /* format internal domain status information */ VIR_DOMAIN_DEF_FORMAT_STATUS = 1 << 4, diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c index 5a07d3734a..07706e0b26 100644 --- a/src/conf/snapshot_conf.c +++ b/src/conf/snapshot_conf.c @@ -703,8 +703,7 @@ virDomainSnapshotDefFormat(const char *domain_uuid, virBuffer buf = VIR_BUFFER_INITIALIZER; size_t i; -virCheckFlags(VIR_DOMAIN_DEF_FORMAT_SECURE | - VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU, NULL); +virCheckFlags(VIR_DOMAIN_DEF_FORMAT_SECURE, NULL); flags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE; -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 0/6] Fix guest CPU updates for inactive domains
This series originally started as a cleanup work, but it turned out two real bugs got fixed with this cleanup :-) Since commit v2.2.0-199-g7ce711a30e libvirt stores an updated guest CPU in domain's live definition and there's no need to update it every time we want to format the definition. Thus libvirt should never internally request a guest CPU update when formatting domain XML. But if it is asked by a user to do so, it should use the same host CPU model which is used when starting a domain to provide meaningful results. https://bugzilla.redhat.com/show_bug.cgi?id=1481309 https://bugzilla.redhat.com/show_bug.cgi?id=1485022 Jiri Denemark (6): qemuxml2xmltest: Add tests for Power CPUs cpu_conf: Drop updateCPU from virCPUDefFormat cpu_conf: Simplify formatting of guest CPU attributes conf: Drop unused VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU qemu: Use correct host model for updating guest cpu qemu: Don't update CPU when formatting live def src/bhyve/bhyve_driver.c | 2 +- src/conf/capabilities.c| 2 +- src/conf/cpu_conf.c| 38 + src/conf/cpu_conf.h| 9 ++--- src/conf/domain_capabilities.c | 2 +- src/conf/domain_conf.c | 6 +--- src/conf/domain_conf.h | 1 - src/conf/snapshot_conf.c | 3 +- src/libxl/libxl_driver.c | 2 +- src/qemu/qemu_domain.c | 17 +++--- src/qemu/qemu_domain.h | 3 +- src/qemu/qemu_driver.c | 9 - src/qemu/qemu_migration_cookie.c | 2 +- src/test/test_driver.c | 2 +- src/vz/vz_driver.c | 2 +- tests/cputest.c| 15 - .../ppc64-host+guest-compat-incompatible.xml | 2 +- .../ppc64-host+guest-compat-invalid.xml| 2 +- .../cputestdata/ppc64-host+guest-compat-valid.xml | 2 +- .../qemuxml2xmlout-pseries-cpu-compat-power9.xml | 38 + .../qemuxml2xmlout-pseries-cpu-compat.xml | 38 + .../qemuxml2xmlout-pseries-cpu-exact.xml | 39 ++ tests/qemuxml2xmltest.c| 4 +++ 23 files changed, 178 insertions(+), 62 deletions(-) create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat-power9.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-exact.xml -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 2/6] cpu_conf: Drop updateCPU from virCPUDefFormat
In the past we updated host-model CPUs with host CPU data by adding a model and features, but keeping the host-model mode. And since the CPU model is not normally formatted for host-model CPU defs, we had to pass the updateCPU flag to the formatting code to be able to properly output updated host-model CPUs. Libvirt doesn't do this anymore, host-model CPUs are turned into custom mode CPUs once updated with host CPU data and thus there's no reason for keeping the hacks inside CPU XML formatters. Signed-off-by: Jiri Denemark --- src/bhyve/bhyve_driver.c | 2 +- src/conf/capabilities.c | 2 +- src/conf/cpu_conf.c | 20 +++- src/conf/cpu_conf.h | 9 +++-- src/conf/domain_capabilities.c | 2 +- src/conf/domain_conf.c | 3 +-- src/libxl/libxl_driver.c | 2 +- src/qemu/qemu_domain.c | 4 ++-- src/qemu/qemu_driver.c | 2 +- src/qemu/qemu_migration_cookie.c | 2 +- src/test/test_driver.c | 2 +- src/vz/vz_driver.c | 2 +- tests/cputest.c | 15 +++ .../ppc64-host+guest-compat-incompatible.xml | 2 +- .../cputestdata/ppc64-host+guest-compat-invalid.xml | 2 +- tests/cputestdata/ppc64-host+guest-compat-valid.xml | 2 +- 16 files changed, 31 insertions(+), 42 deletions(-) diff --git a/src/bhyve/bhyve_driver.c b/src/bhyve/bhyve_driver.c index e8241f39ff..c096b5562f 100644 --- a/src/bhyve/bhyve_driver.c +++ b/src/bhyve/bhyve_driver.c @@ -1441,7 +1441,7 @@ bhyveConnectBaselineCPU(virConnectPtr conn, virCPUExpandFeatures(cpus[0]->arch, cpu) < 0) goto cleanup; -cpustr = virCPUDefFormat(cpu, NULL, false); +cpustr = virCPUDefFormat(cpu, NULL); cleanup: virCPUDefListFree(cpus); diff --git a/src/conf/capabilities.c b/src/conf/capabilities.c index ba554535ae..9920a675ac 100644 --- a/src/conf/capabilities.c +++ b/src/conf/capabilities.c @@ -984,7 +984,7 @@ virCapabilitiesFormatXML(virCapsPtr caps) virBufferAdjustIndent(&buf, -2); virBufferAddLit(&buf, "\n"); } -virCPUDefFormatBuf(&buf, caps->host.cpu, false); +virCPUDefFormatBuf(&buf, caps->host.cpu); for (i = 0; i < caps->host.nPagesSize; i++) { virBufferAsprintf(&buf, "\n", diff --git a/src/conf/cpu_conf.c b/src/conf/cpu_conf.c index 0bd56c9d28..6058d26fa5 100644 --- a/src/conf/cpu_conf.c +++ b/src/conf/cpu_conf.c @@ -574,12 +574,11 @@ virCPUDefParseXML(xmlXPathContextPtr ctxt, char * virCPUDefFormat(virCPUDefPtr def, -virDomainNumaPtr numa, -bool updateCPU) +virDomainNumaPtr numa) { virBuffer buf = VIR_BUFFER_INITIALIZER; -if (virCPUDefFormatBufFull(&buf, def, numa, updateCPU) < 0) +if (virCPUDefFormatBufFull(&buf, def, numa) < 0) goto cleanup; if (virBufferCheckError(&buf) < 0) @@ -596,8 +595,7 @@ virCPUDefFormat(virCPUDefPtr def, int virCPUDefFormatBufFull(virBufferPtr buf, virCPUDefPtr def, - virDomainNumaPtr numa, - bool updateCPU) + virDomainNumaPtr numa) { int ret = -1; virBuffer attributeBuf = VIR_BUFFER_INITIALIZER; @@ -619,9 +617,7 @@ virCPUDefFormatBufFull(virBufferPtr buf, virBufferAsprintf(&attributeBuf, " mode='%s'", tmp); } -if (def->model && -(def->mode == VIR_CPU_MODE_CUSTOM || - updateCPU)) { +if (def->model && def->mode == VIR_CPU_MODE_CUSTOM) { if (!(tmp = virCPUMatchTypeToString(def->match))) { virReportError(VIR_ERR_INTERNAL_ERROR, _("Unexpected CPU match policy %d"), @@ -642,7 +638,7 @@ virCPUDefFormatBufFull(virBufferPtr buf, if (def->type == VIR_CPU_TYPE_HOST && def->arch) virBufferAsprintf(&childrenBuf, "%s\n", virArchToString(def->arch)); -if (virCPUDefFormatBuf(&childrenBuf, def, updateCPU) < 0) +if (virCPUDefFormatBuf(&childrenBuf, def) < 0) goto cleanup; if (virDomainNumaDefCPUFormat(&childrenBuf, numa) < 0) @@ -677,8 +673,7 @@ virCPUDefFormatBufFull(virBufferPtr buf, int virCPUDefFormatBuf(virBufferPtr buf, - virCPUDefPtr def, - bool updateCPU) + virCPUDefPtr def) { size_t i; bool formatModel; @@ -688,8 +683,7 @@ virCPUDefFormatBuf(virBufferPtr buf, return 0; formatModel = (def->mode == VIR_CPU_MODE_CUSTOM || - def->mode == VIR_CPU_MODE_HOST_MODEL || - updateCPU); + def->mode == VIR_CPU_MODE_HOST_MODEL); formatFallbac
Re: [libvirt] [PATCH v3 0/6] Work around the kernel mdev uevent race in nodedev
On Mon, Sep 18, 2017 at 08:47:52AM -0400, John Ferlan wrote: > > > On 09/18/2017 01:53 AM, Erik Skultety wrote: > > On Thu, Aug 24, 2017 at 01:23:26PM +0200, Erik Skultety wrote: > >> v2 here: > >> https://www.redhat.com/archives/libvir-list/2017-July/msg01268.html > >> > >> Since v2: > >> - added patch 4/6 that fixes the issue with the handler thread spamming > >> logs > >> with "udev_monitor_receive_device returned NULL" > >> -> the event loop callback now disables polling on the udev monitor's fd > >> every time there's a new event, leaving the responsibility for > >> re-enabling > >> it back to the handler thread once it had removed the corresponding > >> data > >> from the socket. > >> > >> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463285 > > > > ping > > > > I think I had assumed there was going to be a v4... Was I incorrect and > you would prefer to keep this series as is presented here? Oh, I managed to somehow forget about the comments in patches 1-2 as I fixed them locally and then solely focused on our technical discussion in patches 3-6. No problem, I'll resend a v4 adjusting the first two patches, since there weren't any semantic-related points, rather than points regarding the whole idea in general. Erik > > John > > > Erik > > > > -- > > libvir-list mailing list > > libvir-list@redhat.com > > https://www.redhat.com/mailman/listinfo/libvir-list > > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 5/6] qemu: Use correct host model for updating guest cpu
when a user requested a domain xml description with vir_domain_xml_update_cpu flag, libvirt would use the host cpu definition from host capabilities rather than the one which will actually be used once the domain is started. https://bugzilla.redhat.com/show_bug.cgi?id=1481309 Signed-off-by: Jiri Denemark --- src/qemu/qemu_domain.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index aba4111ba2..74284f40c1 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -4553,6 +4553,7 @@ qemuDomainDefFormatBufInternal(virQEMUDriverPtr driver, int ret = -1; virDomainDefPtr copy = NULL; virCapsPtr caps = NULL; +virQEMUCapsPtr qemuCaps = NULL; if (!(caps = virQEMUDriverGetCapabilities(driver, false))) goto cleanup; @@ -4571,7 +4572,14 @@ qemuDomainDefFormatBufInternal(virQEMUDriverPtr driver, def->cpu && (def->cpu->mode != VIR_CPU_MODE_CUSTOM || def->cpu->model)) { -if (virCPUUpdate(def->os.arch, def->cpu, caps->host.cpu) < 0) +if (!(qemuCaps = virQEMUCapsCacheLookupCopy(driver->qemuCapsCache, +def->emulator, +def->os.machine))) +goto cleanup; + +if (virCPUUpdate(def->os.arch, def->cpu, + virQEMUCapsGetHostModel(qemuCaps, def->virtType, + VIR_QEMU_CAPS_HOST_CPU_MIGRATABLE)) < 0) goto cleanup; } @@ -4689,6 +4697,7 @@ qemuDomainDefFormatBufInternal(virQEMUDriverPtr driver, cleanup: virDomainDefFree(copy); virObjectUnref(caps); +virObjectUnref(qemuCaps); return ret; } -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] qemu: Mark graphics ports used on reconnect
This is an issue that's bugging me for a long time. I don't know exactly when and who is to blame but on daemon reconnect we lose qemu's port allocator internal state. That's okay as we should be able to rebuild it later. However, now I'm seeing port allocator biding successfully to ports that are already taken by qemu (either VNC or Spice). Thus any attempt to start another domain after daemon is restarted fails because libvirt instructs qemu to take port 5900 which is already taken. Now, I don't want to mask the real problem, but one can advocate that we should be marking graphics ports as already in use on qemuProcessReconnect anyway, because we already know that they are taken. Signed-off-by: Michal Privoznik --- src/qemu/qemu_process.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index d3155e4e7..053aba1a6 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -4035,7 +4035,8 @@ qemuProcessStartHook(virQEMUDriverPtr driver, static int qemuProcessGraphicsReservePorts(virQEMUDriverPtr driver, -virDomainGraphicsDefPtr graphics) +virDomainGraphicsDefPtr graphics, +bool reconnect) { virDomainGraphicsListenDefPtr glisten; @@ -4050,7 +4051,8 @@ qemuProcessGraphicsReservePorts(virQEMUDriverPtr driver, switch (graphics->type) { case VIR_DOMAIN_GRAPHICS_TYPE_VNC: -if (!graphics->data.vnc.autoport) { +if (!graphics->data.vnc.autoport || +reconnect) { if (virPortAllocatorSetUsed(driver->remotePorts, graphics->data.vnc.port, true) < 0) @@ -4065,7 +4067,7 @@ qemuProcessGraphicsReservePorts(virQEMUDriverPtr driver, break; case VIR_DOMAIN_GRAPHICS_TYPE_SPICE: -if (graphics->data.spice.autoport) +if (graphics->data.spice.autoport && !reconnect) return 0; if (graphics->data.spice.port > 0) { @@ -4269,7 +4271,7 @@ qemuProcessSetupGraphics(virQEMUDriverPtr driver, for (i = 0; i < vm->def->ngraphics; i++) { graphics = vm->def->graphics[i]; -if (qemuProcessGraphicsReservePorts(driver, graphics) < 0) +if (qemuProcessGraphicsReservePorts(driver, graphics, false) < 0) goto cleanup; } } @@ -6881,6 +6883,13 @@ qemuProcessReconnect(void *opaque) goto error; } +for (i = 0; i < obj->def->ngraphics; i++) { +if (qemuProcessGraphicsReservePorts(driver, +obj->def->graphics[i], +true) < 0) +goto error; +} + if (qemuProcessUpdateState(driver, obj) < 0) goto error; -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 6/6] qemu: Don't update CPU when formatting live def
Since commit v2.2.0-199-g7ce711a30e libvirt stores an updated guest CPU in domain's live definition and there's no need to update it every time we want to format the definition. The commit itself tried to address this in qemuDomainFormatXML, but forgot to fix qemuDomainDefFormatLive. Not to mention that masking a previously set flag is only acceptable if the flag was set by a public API user. Internally, libvirt should have never set the flag in the first place. https://bugzilla.redhat.com/show_bug.cgi?id=1485022 Signed-off-by: Jiri Denemark --- src/qemu/qemu_domain.c | 2 -- src/qemu/qemu_domain.h | 3 +-- src/qemu/qemu_driver.c | 7 +++ 3 files changed, 8 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 74284f40c1..5ef98911dc 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -4757,8 +4757,6 @@ char *qemuDomainFormatXML(virQEMUDriverPtr driver, } else { def = vm->def; origCPU = priv->origCPU; -if (virDomainObjIsActive(vm)) -flags &= ~VIR_DOMAIN_XML_UPDATE_CPU; } return qemuDomainDefFormatXMLInternal(driver, def, origCPU, flags); diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index b291dc3082..09201b1a40 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -40,8 +40,7 @@ # include "logging/log_manager.h" # define QEMU_DOMAIN_FORMAT_LIVE_FLAGS \ -(VIR_DOMAIN_XML_SECURE |\ - VIR_DOMAIN_XML_UPDATE_CPU) +(VIR_DOMAIN_XML_SECURE) # if ULONG_MAX == 4294967295 /* QEMU has a 64-bit limit, but we are limited by our historical choice of diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 77308d547e..a29bbea55d 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -6997,6 +6997,13 @@ static char if ((flags & VIR_DOMAIN_XML_MIGRATABLE)) flags |= QEMU_DOMAIN_FORMAT_LIVE_FLAGS; +/* The CPU is already updated in the domain's live definition, we need to + * ignore the VIR_DOMAIN_XML_UPDATE_CPU flag. + */ +if (virDomainObjIsActive(vm) && +!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE)) +flags &= ~VIR_DOMAIN_XML_UPDATE_CPU; + ret = qemuDomainFormatXML(driver, vm, flags); cleanup: -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 1/6] qemuxml2xmltest: Add tests for Power CPUs
Signed-off-by: Jiri Denemark --- .../qemuxml2xmlout-pseries-cpu-compat-power9.xml | 38 + .../qemuxml2xmlout-pseries-cpu-compat.xml | 38 + .../qemuxml2xmlout-pseries-cpu-exact.xml | 39 ++ tests/qemuxml2xmltest.c| 4 +++ 4 files changed, 119 insertions(+) create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat-power9.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-exact.xml diff --git a/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat-power9.xml b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat-power9.xml new file mode 100644 index 00..f020056219 --- /dev/null +++ b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat-power9.xml @@ -0,0 +1,38 @@ + + QEMUGuest1 + c7a5fdbd-edaf-9455-926a-d65c16db1809 + 219100 + 219100 + 4 + +hvm + + + +power9 + + + destroy + restart + destroy + +/usr/bin/qemu-system-ppc64 + + + + + + + + + + + + + + + + + + + diff --git a/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat.xml b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat.xml new file mode 100644 index 00..3cbce9fe6a --- /dev/null +++ b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat.xml @@ -0,0 +1,38 @@ + + QEMUGuest1 + c7a5fdbd-edaf-9455-926a-d65c16db1809 + 219100 + 219100 + 4 + +hvm + + + +power7 + + + destroy + restart + destroy + +/usr/bin/qemu-system-ppc64 + + + + + + + + + + + + + + + + + + + diff --git a/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-exact.xml b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-exact.xml new file mode 100644 index 00..d69b387686 --- /dev/null +++ b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-exact.xml @@ -0,0 +1,39 @@ + + QEMUGuest1 + 1ccfd97d-5eb4-478a-bbe6-88d254c16db7 + 524288 + 524288 + 1 + +hvm + + + +POWER7 +IBM + + + destroy + restart + destroy + +/usr/bin/qemu-system-ppc64 + + + + + + + + + + + + + + + + + + + diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index efac9e8286..d15ea93b19 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -1216,6 +1216,10 @@ mymain(void) DO_TEST("smartcard-passthrough-spicevmc", NONE); DO_TEST("smartcard-controller", NONE); +DO_TEST("pseries-cpu-compat-power9", NONE); +DO_TEST("pseries-cpu-compat", NONE); +DO_TEST("pseries-cpu-exact", NONE); + if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL) virFileDeleteTree(fakerootdir); -- 2.14.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 0/5] Reject parallel ports on s390, and pseries
On 09/07/2017 02:19 PM, Pino Toscano wrote: > Hi, > > this is a simple series to reject parallel ports on s390 architectures, > and pseries machines -- both simply do not support them. This fixes > https://bugzilla.redhat.com/show_bug.cgi?id=1487499 > > Pino Toscano (5): > tests: qemuxml2argv: fix expected type for usb-bus-missing > tests: qemuxml2argv: fail also on unexpected pass > qemu: pass the virDomainDef to qemuDomainChrDefValidate > qemu: reject parallel ports for s390 archs > qemu: reject parallel ports for pseries machines > > src/qemu/qemu_domain.c | 16 > .../qemuxml2argv-pseries-no-parallel.xml | 18 ++ > .../qemuxml2argv-s390-no-parallel.xml | 22 > ++ > tests/qemuxml2argvtest.c | 20 +--- > 4 files changed, 69 insertions(+), 7 deletions(-) > create mode 100644 > tests/qemuxml2argvdata/qemuxml2argv-pseries-no-parallel.xml > create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-s390-no-parallel.xml > ACK series. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCHv2] nodedev: add switchdev to NIC capabilities
On 09/17/2017 05:32 AM, Edan David wrote: > I removed the memset from virNetDevPutExtraHeader and tested on a card > supporting switchdev. > It looks fine to me :) > Great review guys, thanks! > Thanks for checking... This is now pushed. Congrats on your first libvirt patch, John > > -Original Message- > From: John Ferlan [mailto:jfer...@redhat.com] > Sent: Saturday, September 16, 2017 2:04 PM > To: Laine Stump ; libvir-list@redhat.com > Cc: Edan David > Subject: Re: [libvirt] [PATCHv2] nodedev: add switchdev to NIC capabilities > > > > On 09/14/2017 11:23 AM, Laine Stump wrote: >> (Almost all of my comments result in "ok, no action needed". There are >> just three items I would like to see changed (2 trivial, 1 also small >> but Edan or John may think it prudent to re-test with the change >> before >> pushing) - I marked those comments with [**]. >> >> On 08/21/2017 05:19 AM, Edan David wrote: >>> Adding functionality to libvirt that will allow querying the >>> interface for the availability of switchdev Offloading NIC capabilities. >>> The switchdev mode was introduced in kernel 4.8, the iproute2-devlink >>> command to retrieve the swtichdev NIC feature, Command example: >>> devlink dev eswitch show pci/:03:00.0 This feature is needed for >>> Openstack so we can do a scheduling decision if the NIC is in >>> Hardware Offload (switchdev) or regular SR-IOV (legacy) mode. >>> And select the appropriate hypervisors with the requested capability see >>> [1]. >>> >>> [1] - >>> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsp >>> ecs.openstack.org%2Fopenstack%2Fnova-specs%2Fspecs%2Fpike%2Fapproved% >>> 2Fenable-sriov-nic-features.html&data=02%7C01%7Cedand%40mellanox.com% >>> 7C1e30890a5c7a4f9f045f08d4fcf2b479%7Ca652971c7d2e4d9ba6a4d149256f461b >>> %7C0%7C0%7C636411566714987559&sdata=Ri340H6MLap3HpOMDrb%2FFk6D9agQjEh >>> C9cvR7%2FbV1Ls%3D&reserved=0 >>> --- >>> configure.ac | 13 ++ >>> docs/formatnode.html.in | 1 + >>> src/util/virnetdev.c | 187 >>> +- >>> src/util/virnetdev.h | 1 + >>> tests/nodedevschemadata/net_00_13_02_b9_f9_d3.xml | 1 + >>> tests/nodedevschemadata/net_00_15_58_2f_e9_55.xml | 1 + >>> 6 files changed, 203 insertions(+), 1 deletion(-) >>> >>> diff --git a/configure.ac b/configure.ac index b12b7fa..c089798 >>> 100644 >>> --- a/configure.ac >>> +++ b/configure.ac >>> @@ -627,6 +627,19 @@ if test "$with_linux" = "yes"; then >>> AC_CHECK_HEADERS([linux/btrfs.h]) fi >>> >>> +dnl >>> +dnl check for kernel headers required by devlink dnl if test >>> +"$with_linux" = "yes"; then >>> +AC_CHECK_HEADERS([linux/devlink.h]) >>> +AC_CHECK_DECLS([DEVLINK_GENL_VERSION, DEVLINK_GENL_NAME, >>> +DEVLINK_ATTR_MAX, DEVLINK_CMD_ESWITCH_GET, DEVLINK_ATTR_BUS_NAME, >>> +DEVLINK_ATTR_DEV_NAME, DEVLINK_ATTR_ESWITCH_MODE, >>> +DEVLINK_ESWITCH_MODE_SWITCHDEV], >> >> That's very . thorough, and potentially misleading if someone ever >> wanted to use devlink to check for something other than switchdev (e.g. >> [mythical feature X]) on some system that didn't have the proper stuff >> defined for switchdev, but did have it for [other mythical feature X]. >> But that's all just hypothetical, so this is fine with me. >> >> >>> + [AC_DEFINE([HAVE_DECL_DEVLINK], >>> + [1], >>> + [whether devlink declarations is >>> + available])], >> >> [**] >> s/is/are/ >> > > Yep - altered... > >>> + [], >>> + [[#include ]]) fi >>> + >>> dnl Allow perl/python overrides >>> AC_PATH_PROGS([PYTHON], [python2 python]) if test -z "$PYTHON"; >>> then diff --git a/docs/formatnode.html.in b/docs/formatnode.html.in >>> index 4d935b5..29244a8 100644 >>> --- a/docs/formatnode.html.in >>> +++ b/docs/formatnode.html.in >>> @@ -227,6 +227,7 @@ >>> rxhashreceive-hashing >>> >>> rdmaremote-direct-memory-access >>> >>> txudptnltx-udp-tunnel-segmentation >>> + >>> + switchdevkernel-forward-plane-offload>> + > >> >> pretty odd abbreviation. But it is what it is :-) >>> >>>capability diff --git >>> a/src/util/virnetdev.c b/src/util/virnetdev.c index 51a6e42..fc7c961 >>> 100644 >>> --- a/src/util/virnetdev.c >>> +++ b/src/util/virnetdev.c >>> @@ -59,6 +59,10 @@ >>> # include >>> #endif >>> >>> +#if HAVE_DECL_DEVLINK >>> +# include >>> +#endif >>> + >>> #ifndef IFNAMSIZ >>> # define IFNAMSIZ 16 >>> #endif >>> @@ -2481,7 +2485,8 @@ VIR_ENUM_IMPL(virNetDevFeature, >>>"ntuple", >>>"rxhash", >>>"rdma", >>> - "txudptnl") >>> + "txudptnl", >>> + "switchdev") >>> >>> #ifdef __linu
Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction
On 9/18/17 4:47 AM, Daniel P. Berrange wrote: On Mon, Sep 18, 2017 at 11:43:57AM +0200, Erik Skultety wrote: [...] > c) what existing communicate interface can be used between libvirt and qemu > to get the measurement ? can we add a new qemu monitor command > 'get_sev_measurement' to get the measurement ? (step 10) Yes, QMP commands seeem most likely. > d) how to pass the secret blob from libvirt to qemu ? should we consider > adding a new object (sev-guest-secret) -- libvirt can add the object through > qemu monitor. Yeah, that looks like a viable option too. So I could see a flow like the following: 1. mgmt tool calls virConnectGetCapabilities. This returns an XML document that includes the following ...other bits... ...hex encoded PDH key... 2. mgmt tool requests to start a guest calling virCreateXML(), passing VIR_DOMAIN_START_PAUSED. The XML would include ...hex encode DH key... ..hex encode info... ...int32 value.. if is provided and VIR_DOMAIN_START_PAUSED is missing, libvirt would report an error and refuse to start the guest For ease of use, I would not add this conditional to libvirt. If is provided and VIR_DOMAIN_START_PAUSED is missing, I’d just send the "GO" command as it would naturally occur. Unless that would confuse things inside libvirt or QEMU in relation to the measurement and secret… Many of our existing tests focus on other aspects of SEV functionality and so they skip the MEASURE/SECRET phase of launch and just go immediately from LAUNCH_UPDATE_DATA (or VMSA) to LAUNCH_FINISH. I guess the key question will be how will QEMU know when to get the MEASUREMENT and wait for a LAUNCH_SECRET before doing a LAUNCH_FINISH when connected to libvirt. Brijesh, this is your area. It feels to me like QEMU will have to wait to do the LAUNCH_FINISH until it gets the first “go” from libvirt. If that’s right, and assuming the same “go” comes from libvirt with or without VIR_DOMAIN_START_PAUSED, then I’d simply exclude the conditional check. QEMU would get the measurement when it is done sending the data. Though in “real world” uses, I think the conditional is perfectly OK. 3. Libvirt generates the QEMU cli arg to enable SEV using the XML data and starts QEMU, leaving CPUs paused 4. QEMU emits a SEV_MEASURE event containing the measurement blob Speaking of which, I expect QEMU to have a QMP command to retrieve the measurement, in which case I think libvirt has to provide an API for the user to retrieve the measurement in case libvirtd crashes somewhere between setting up QEMU and waiting for the measurement event from QEMU, or simply because the GO missed the event for some unspecified reason. Yeah, that's a good point - we also ought to have a pause-reason that reflects that it is paused due to waiting for SEV secrets. 5. Libvirt catches the QEMU event and emits its own VIR_CONNECT_DOMAIN_EVENT_SEV_MEASURE event containing the measurement blob 6. GO does its validation of the measurement 7a If validation failed, then virDomainDestroy() to stop QEMU 7b If validation succeeed Optionally call virDomainSetSEVSecret() Given the fact that we're likely introducing a new element to the XML config, I'm more inclined to utilizing the existing virSecret interfaces (as was originally suggested) instead of creating a vendor-specific API. You could have an optional secret sub-element within the element and libvirt would simply check if that secret has a value set, once the GO issues virDomainResume(). Any particular reason for having a specific API for this that I'm missing? Initially I was intending to suggest extensive use of virSecret, but it turns out that despite being called a "secret", none of the SEV data we are passing around needs protection. Either it is safe to be public, or it is already encrypted. So essentially we just have some data blobs we need to pass into QEMU. I didn't feel we ought to be abusing virSecret as a general purpose mechanism for passing in opaque data blobs which do not need any kind of protection. All of the above looks really good to me. While I agree with Daniel’s analysis of the need for “secret”, I do like using virSecret to convey the notion of secrecy. But it isn’t necessary. The end points are the SEV FW and the guest owner and all secrets they share are already encrypted. Embedding it in the “GO” command feels equally OK to me. Note that sending a secret with a “GO” other than the first one is an error… I don’t think libvirt needs to catch that, though. The SEV FW will. Regards, Daniel Thanks, Richard -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] check-spacing: Don't hardcode Perl interpreter path
This is particularly useful on operating systems that don't ship Perl as part of the base system (eg. FreeBSD) while still working just as well as it did before on Linux. Signed-off-by: Andrea Bolognani --- build-aux/check-spacing.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/build-aux/check-spacing.pl b/build-aux/check-spacing.pl index 448acf234..ca8b43491 100755 --- a/build-aux/check-spacing.pl +++ b/build-aux/check-spacing.pl @@ -1,4 +1,4 @@ -#!/usr/bin/perl +#!/usr/bin/env perl # # check-spacing.pl: Report any usage of 'function (..args..)' # Also check for other syntax issues, such as correct use of ';' -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/3] Be more selective when determining cdrom for taint messaging
On 09/11/2017 04:32 PM, John Ferlan wrote: > v1: https://www.redhat.com/archives/libvir-list/2017-September/msg00103.html > > Changes since v1: > > Split into 3 parts... The first patch would be the bare minimum using > STRPREFIX instead of STREQ type comparisons for the incoming path to > be "/dev/cdrom[N]" or "/dev/srN" (or resolved to that). > > This would "work" for the most part, but then since it's possible to > make even more checks let's check against the collected node device > data. Patch 2 therefore will "tag" the already collected cdrom data > with a capability. This allows patch3 to find any/all CDROM's on the > host and compare the resolved path to that list of devices returning > "true" if something matches a node device declared physical CDROM. > > I split things up mainly to make it easier to decide whether patch 1 > is sufficient or not. If patch2 and patch3 are OK, I would also add > a release note indicating the improvement to find CDROM by node device > capability. It's a separate "improvement" on it's own as well. Whether > it's truly useful or not, is a different question... [1] > > John Ferlan (3): > qemu: Be more selective when determining cdrom for taint messaging ACK to this one ^^ > nodedev: Add capability bit to detect 'cdrom' devices > qemu: Add inquiry to nodedev for cdrom taint checking However, these two ^^ look like an overkill to me. It's still just a taint message that nobody cares about. Or? 1: Yeah, I don't think we really need such a big hammer for tiny nail. But I might be missing something. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v3 0/6] Work around the kernel mdev uevent race in nodedev
On 09/18/2017 01:53 AM, Erik Skultety wrote: > On Thu, Aug 24, 2017 at 01:23:26PM +0200, Erik Skultety wrote: >> v2 here: https://www.redhat.com/archives/libvir-list/2017-July/msg01268.html >> >> Since v2: >> - added patch 4/6 that fixes the issue with the handler thread spamming logs >> with "udev_monitor_receive_device returned NULL" >> -> the event loop callback now disables polling on the udev monitor's fd >> every time there's a new event, leaving the responsibility for >> re-enabling >> it back to the handler thread once it had removed the corresponding data >> from the socket. >> >> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463285 > > ping > I think I had assumed there was going to be a v4... Was I incorrect and you would prefer to keep this series as is presented here? John > Erik > > -- > libvir-list mailing list > libvir-list@redhat.com > https://www.redhat.com/mailman/listinfo/libvir-list > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] virsh: man: Describe the 'create' command a bit more
On 09/08/2017 02:52 PM, Erik Skultety wrote: > So we refer to the terms 'persistent' and 'transient' across the whole > man page, without describing it further, but more importantly, how the > create command affects it, i.e. explicitly stating that domain created > via the 'create' command are going to be transient or persistent, > depending on whether there is an existing persistent domain with a > matching and , in which case it will remain persistent, but > will run using a one-time configuration, otherwise it's going to be > transient and will vanish once destroyed. > > Signed-off-by: Erik Skultety > --- > tools/virsh.pod | 32 ++-- > 1 file changed, 26 insertions(+), 6 deletions(-) ACK Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] Questions about function virPCIDeviceIsBehindSwitchLackingACS in virpci.c
In function virPCIDeviceIsBehindSwitchLackingACS, I noticed that(line 8): 1if (virPCIDeviceGetParent(dev, &parent) < 0) 2return -1; 3if (!parent) { 4/* if we have no parent, and this is the root bus, ACS doesn't come 5 * into play since devices on the root bus can't P2P without going 6 * through the root IOMMU. 7 */ 8if (dev->address.bus == 0) { 9return 0; 10} else { 11virReportError(VIR_ERR_INTERNAL_ERROR, 12 _("Failed to find parent device for %s"), 13 dev->name); 14return -1; 15} 16} Why we just return 0 only if device's bus is 0? In my server, I can see a root bus which bus number is greater than 0, see the results(just a part) after I run lspci -t: +-[:80]-+-02.0-[81-83]--+-00.0 | | \-00.1 | +-05.0 | +-05.1 | +-05.2 | \-05.4 +-[:7f]-+-08.0 | +-08.2 | +-08.3 | + . . . | \-1f.2 \-[:00]-+-00.0 +-01.0-[01]00.0 +-02.0-[02]--+-00.0 |+-00.1 |+-00.2 |\-00.3 +-02.2-[03]-- +-03.0-[04-0b]00.0-[05-0b]--+-08.0-[06-08]00.0 | \-10.0-[09-0b]00.0 +-05.0 +-05.1 +-05.2 +-05.4 +-11.0 +-11.4 +-16.0 +-16.1 +-1a.0 If I assign the device :81:00.0 to a VM, I get "Failed to find parent device". I think I should get no error with return value 0 just like bus number is 0, because bus 80 is the root bus as well in my case. In the <> Datasheet, I found that(Chapter 9.1): For some server platforms, it may be desirable to have multiple PCHs in the system Which means some PCH's may reside on a bus greater than 0. So, is this a bug? Thanks, Zongyong Wu -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH go-xml] Add support for domain hostdev and test code
On Fri, Sep 15, 2017 at 02:12:44PM +0800, zhenwei.pi wrote: > Signed-off-by: zhenwei.pi > --- > domain.go | 36 > domain_test.go | 44 > 2 files changed, 80 insertions(+) > > diff --git a/domain.go b/domain.go > index bead49a..1bcc9cc 100644 > --- a/domain.go > +++ b/domain.go > @@ -407,6 +407,29 @@ type DomainRNG struct { > Backend *DomainRNGBackend `xml:"backend"` > } > > +type DomainHostdevAdapter struct { > + Name string `xml:"name,attr,omitempty"` > +} > + > +type DomainHostdevSource struct { > + Protocol string`xml:"protocol,attr,omitempty"` > + Name string`xml:"name,attr,omitempty"` > + Wwpn string`xml:"wwpn,attr,omitempty"` This should be WWPN > + Adapter *DomainHostdevAdapter `xml:"adapter"` > + Address *DomainAddress`xml:"address"` > +} > + > +type DomainHostdev struct { > + XMLName xml.Name `xml:"hostdev"` > + Modestring `xml:"mode,attr"` > + Typestring `xml:"type,attr"` > + Sgiostring `xml:"sgio,attr,omitempty"` SGIO > + Rawio string `xml:"rawio,attr,omitempty"` And RawIO > + Managed string `xml:"managed,attr,omitempty"` > + Source *DomainHostdevSource `xml:"source"` > + Address *DomainAddress `xml:"address"` > +} > diff --git a/domain_test.go b/domain_test.go > index d1b107d..73dd47b 100644 > --- a/domain_test.go > +++ b/domain_test.go > @@ -37,6 +37,13 @@ type Address struct { > Function HexUint > } > > +type ScsiAddress struct { Should be SCSI ACK, and I'll fix the capitalization when pushing Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] apparmor: add attach_disconnected
On 09/15/2017 05:17 PM, Guido Günther wrote: > Otherwise we fail to reconnect to /dev/net/tun opened by libvirtd > like > > [ 8144.507756] audit: type=1400 audit(1505488162.386:38069121): > apparmor="DENIED" operation="file_perm" info="Failed name lookup - > disconnected path" error=-13 > profile="libvirt-5dfcc8a7-b79a-4fa9-a41f-f6271651934c" name="dev/net/tun" > pid=9607 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=117 > ouid=0 > > --- > I do wonder why we didn't see this earlier though. > > examples/apparmor/TEMPLATE.lxc | 2 +- > examples/apparmor/TEMPLATE.qemu | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) +1/ACK/or whatever. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] apparmor: cater for new AAVMF image location
On 09/15/2017 06:10 PM, Guido Günther wrote: > Things moved again, sigh. > --- > src/security/virt-aa-helper.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/security/virt-aa-helper.c b/src/security/virt-aa-helper.c > index 55a686a59c..0b43c8e391 100644 > --- a/src/security/virt-aa-helper.c > +++ b/src/security/virt-aa-helper.c > @@ -516,7 +516,8 @@ valid_path(const char *path, const bool readonly) > "/usr/share/OVMF/", /* for OVMF images */ > "/usr/share/ovmf/", /* for OVMF images */ > "/usr/share/AAVMF/", /* for AAVMF images */ > -"/usr/share/qemu-efi/" /* for AAVMF images */ > +"/usr/share/qemu-efi/", /* for AAVMF images */ > +"/usr/share/qemu-efi-aarch64/" /* for AAVMF images */ > }; > /* override the above with these */ > const char * const override[] = { > ACK Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH python] Skip sparseRecvAll / sparseSendAll in sanity test
On 09/18/2017 11:49 AM, Daniel P. Berrange wrote: > On Mon, Sep 18, 2017 at 11:47:24AM +0200, Michal Privoznik wrote: >> On 09/14/2017 02:50 PM, Daniel P. Berrange wrote: >>> The sanity test check aims to ensure that every function listed in >>> the Python code maps to a corresponding C function. The Sparse >>> send/recv methods are special though - we're never calling the >>> corresponding C APIs, instead we have a pure python impl. >>> >>> Signed-off-by: Daniel P. Berrange >>> --- >>> sanitytest.py | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/sanitytest.py b/sanitytest.py >>> index deec200..a5cb01b 100644 >>> --- a/sanitytest.py >>> +++ b/sanitytest.py >>> @@ -351,7 +351,8 @@ for klass in gotfunctions: >>> for func in sorted(gotfunctions[klass]): >>> # These are pure python methods with no C APi >>> if func in ["connect", "getConnect", "domain", "getDomain", >>> -"virEventInvokeFreeCallback"]: >>> +"virEventInvokeFreeCallback", >>> +"sparseRecvAll", "sparseSendAll"]: >>> continue >>> >>> key = "%s.%s" % (klass, func) >>> >> >> Well, what if somebody builds -python against older libvirt that lacks >> virStreamSparseRecvAll()? Are they facing an exception because cmod is >> unable to find that func? > > Yes, the same is true of any of the functions where we manually written > the Python API. To be fully correct we ought to annotate the manually > written python code and then process the .py file to strip out the APIs > we can't support. Patches welcome for doing that Okay. Since we're doing that for others you have my ACK. This makes things better and no worse. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 0/3] cpu: Add new Skylake-Server CPU model
On Wed, Sep 13, 2017 at 02:00:43PM +0200, Jiri Denemark wrote: > Jiri Denemark (3): > tests: Add CPUID data for Intel(R) Xeon(R) Gold 6148 CPU > cpu: Add clwb/pcommit CPU features > cpu: Add new Skylake-Server CPU model Reviewed-by: Pavel Hrdina signature.asc Description: PGP signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH python] Skip sparseRecvAll / sparseSendAll in sanity test
On Mon, Sep 18, 2017 at 11:47:24AM +0200, Michal Privoznik wrote: > On 09/14/2017 02:50 PM, Daniel P. Berrange wrote: > > The sanity test check aims to ensure that every function listed in > > the Python code maps to a corresponding C function. The Sparse > > send/recv methods are special though - we're never calling the > > corresponding C APIs, instead we have a pure python impl. > > > > Signed-off-by: Daniel P. Berrange > > --- > > sanitytest.py | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/sanitytest.py b/sanitytest.py > > index deec200..a5cb01b 100644 > > --- a/sanitytest.py > > +++ b/sanitytest.py > > @@ -351,7 +351,8 @@ for klass in gotfunctions: > > for func in sorted(gotfunctions[klass]): > > # These are pure python methods with no C APi > > if func in ["connect", "getConnect", "domain", "getDomain", > > -"virEventInvokeFreeCallback"]: > > +"virEventInvokeFreeCallback", > > +"sparseRecvAll", "sparseSendAll"]: > > continue > > > > key = "%s.%s" % (klass, func) > > > > Well, what if somebody builds -python against older libvirt that lacks > virStreamSparseRecvAll()? Are they facing an exception because cmod is > unable to find that func? Yes, the same is true of any of the functions where we manually written the Python API. To be fully correct we ought to annotate the manually written python code and then process the .py file to strip out the APIs we can't support. Patches welcome for doing that Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] RFC: libvirt support for QEMU live patching
On Mon, Sep 18, 2017 at 10:03:34AM +0100, Daniel P. Berrange wrote: On Mon, Sep 18, 2017 at 09:37:14AM +0200, Martin Kletzander wrote: On Fri, Sep 15, 2017 at 09:18:18AM +0100, Daniel P. Berrange wrote: > On Fri, Sep 15, 2017 at 01:27:31PM +0530, Madhu Pavan wrote: > > Hi, > > QEMU live patching should be just a matter of updating the QEMU RPM package > > and then live migrating the VMs to another QEMU instance on the same host > > (which would point to the just installed new QEMU executable). > > I think it will be useful to support it from libvirt side. After some > > searching I found a > > RFC patch posted in Nov 2013. Here is the link to it > > https://www.redhat.com/archives/libvir-list/2013-November/msg00372.html > > Approach followed in above mentioned link is as follows: > > 1. newDef = deep copy oldVm definition > > 2. newVm = create VM using newDef, start QEMU process with all vCPUs paused > > 3. oldVm migrate to newVm using unix socket > > 4. shutdown oldVm > > 5. newPid = newVm->pid > > 6. finalDef = live deep copy of newVm definition > > 7. Drop the newVm from qemu domain table without shutting down QEMU process > > 8. Assign finalDef to oldVm > > 9. oldVm attaches to QEMU process newPid using finalDef > > 10.resume all vCPUs in oldVm > > > > I can see it didn't get communities approval for having problems in handling > > UUID > > of the vm's. To fix the problem we need to teach libvirt to manage two qemu > > processes > > at once both tied to same UUID. I would like to know if there is any > > interested approach > > to get this done. I would like to send patches on this. > > > > Is there any specific reason why it is not been pursued for the last 4 year? > > It isn't possible to make it work correctly in the general case, because > both QEMU processes want to own the same files on disk. eg both might want > to listen on a UNIX socket /foo/bar, but only one can do this. If you let > the new QEMU delete the original QEMU's sockets, then you either break or > delay incoming connections during the migration time, and you make it > impossible to roll back on failure, or both. This kind of thing is not > something that's acceptable for the usage scenerio described, which would > need to be bulletproof to be usable in production. > Can't we utilize namespaces for this? Lot of the things could be separated, so we could fire up a new VM that's "containerized" like this, migrate to it and then fire up a new one and migrate back. If the first migration fails then we can still fallback. If it does not, then the second one "should" not either. As well as increasing complexity and thus chances of failure, this also doubles the performance impact on the guest. More generally I think the high level idea of in-place live-upgrades for guests on a host is flawed. Even if we had the ability to in-place upgrade QEMU for running guest, there are always going to be cases where a security upgrade requires you to reboot the host system - kernel/glibc flaw, or fixes to other servicves that can't be restarted eg dbus. Any person hosting VMs will always always need to be able to handle migrating VMs to a *separate* host to deal with such security updates. For added complexity, determinining which security upgrades could be done by restarting QEMU vs which need to have a full host restart is not trivial, unless intimately familiar with the software stack. So from an administrative / operational POV, defining two separate procedures for dealing with upgrades is unwise. You have the chance of picking the wrong one and leaving a security fix accidentally unpatched, or if you take one path 95% of the time and the other path 5% of the time, chances are you're going to screw up the less used path due to lack of practice. It makes more sense to simplify the operational protocols by standardizing of cross-host migration, followed by host reboot, for applying patches. It simplifies the decision matrix removing complexity, and ensures you are well praticed following the same approach every time. The only cost is having 1 spare server against which to perform the migrations, but if your VMs are at all important, you will have spare hardware already to deal with unforseen hardware problems. If you absolutely can't afford spare hardware, you could just run your entire "host" OS in a container, and then spin up a second "host" OS on the same machine and migrate into that. From libvirt's POV this is still cross-host migration, so needs no special case. Oh sure, I did not consider the in-place upgrade to be dealing with any security issues. And even if people would want just update qemu for some other reason they would be misusing the update for security issues and that's a problem on its own. We probably don't want that at all. And there are other ways around it. Case closed then, thanks for the insights. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https:/
Re: [libvirt] [PATCH python] Skip sparseRecvAll / sparseSendAll in sanity test
On 09/14/2017 02:50 PM, Daniel P. Berrange wrote: > The sanity test check aims to ensure that every function listed in > the Python code maps to a corresponding C function. The Sparse > send/recv methods are special though - we're never calling the > corresponding C APIs, instead we have a pure python impl. > > Signed-off-by: Daniel P. Berrange > --- > sanitytest.py | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/sanitytest.py b/sanitytest.py > index deec200..a5cb01b 100644 > --- a/sanitytest.py > +++ b/sanitytest.py > @@ -351,7 +351,8 @@ for klass in gotfunctions: > for func in sorted(gotfunctions[klass]): > # These are pure python methods with no C APi > if func in ["connect", "getConnect", "domain", "getDomain", > -"virEventInvokeFreeCallback"]: > +"virEventInvokeFreeCallback", > +"sparseRecvAll", "sparseSendAll"]: > continue > > key = "%s.%s" % (klass, func) > Well, what if somebody builds -python against older libvirt that lacks virStreamSparseRecvAll()? Are they facing an exception because cmod is unable to find that func? Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction
On Mon, Sep 18, 2017 at 11:43:57AM +0200, Erik Skultety wrote: > [...] > > > > > > > > c) what existing communicate interface can be used between libvirt > > > and qemu > > > > to get the measurement ? can we add a new qemu monitor command > > > > 'get_sev_measurement' to get the measurement ? (step 10) > > > > > > Yes, QMP commands seeem most likely. > > > > > > > d) how to pass the secret blob from libvirt to qemu ? should we > > > consider > > > > adding a new object (sev-guest-secret) -- libvirt can add the > > > object through > > > > qemu monitor. > > > > > > Yeah, that looks like a viable option too. > > > > So I could see a flow like the following: > > > > > > 1. mgmt tool calls virConnectGetCapabilities. This returns an XML > > document that includes the following > > > > > > ...other bits... > > > > ...hex encoded PDH key... > > > > > > > > 2. mgmt tool requests to start a guest calling virCreateXML(), > > passing VIR_DOMAIN_START_PAUSED. The XML would include > > > > > > ...hex encode DH key... > > ..hex encode info... > > ...int32 value.. > > > > > > > > if is provided and VIR_DOMAIN_START_PAUSED is missing, > > libvirt would report an error and refuse to start the guest > > > > 3. Libvirt generates the QEMU cli arg to enable SEV using > > the XML data and starts QEMU, leaving CPUs paused > > > > 4. QEMU emits a SEV_MEASURE event containing the measurement > > blob > > Speaking of which, I expect QEMU to have a QMP command to retrieve the > measurement, in which case I think libvirt has to provide an API for the user > to retrieve the measurement in case libvirtd crashes somewhere between setting > up QEMU and waiting for the measurement event from QEMU, or simply because the > GO missed the event for some unspecified reason. Yeah, that's a good point - we also ought to have a pause-reason that reflects that it is paused due to waiting for SEV secrets. > > > > > 5. Libvirt catches the QEMU event and emits its own > > VIR_CONNECT_DOMAIN_EVENT_SEV_MEASURE event containing > > the measurement blob > > > > 6. GO does its validation of the measurement > > > > 7a If validation failed, then virDomainDestroy() to stop QEMU > > > > 7b If validation succeeed > > > > Optionally call > > > > virDomainSetSEVSecret() > > Given the fact that we're likely introducing a new element to the XML > config, I'm more inclined to utilizing the existing virSecret interfaces (as > was originally suggested) instead of creating a vendor-specific API. You could > have an optional secret sub-element within the element and libvirt would > simply check if that secret has a value set, once the GO issues > virDomainResume(). Any particular reason for having a specific API for this > that > I'm missing? Initially I was intending to suggest extensive use of virSecret, but it turns out that despite being called a "secret", none of the SEV data we are passing around needs protection. Either it is safe to be public, or it is already encrypted. So essentially we just have some data blobs we need to pass into QEMU. I didn't feel we ought to be abusing virSecret as a general purpose mechanism for passing in opaque data blobs which do not need any kind of protection. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction
[...] > > > > > c) what existing communicate interface can be used between libvirt > > and qemu > > > to get the measurement ? can we add a new qemu monitor command > > > 'get_sev_measurement' to get the measurement ? (step 10) > > > > Yes, QMP commands seeem most likely. > > > > > d) how to pass the secret blob from libvirt to qemu ? should we > > consider > > > adding a new object (sev-guest-secret) -- libvirt can add the object > > through > > > qemu monitor. > > > > Yeah, that looks like a viable option too. > > So I could see a flow like the following: > > > 1. mgmt tool calls virConnectGetCapabilities. This returns an XML > document that includes the following > > > ...other bits... > > ...hex encoded PDH key... > > > > 2. mgmt tool requests to start a guest calling virCreateXML(), > passing VIR_DOMAIN_START_PAUSED. The XML would include > > > ...hex encode DH key... > ..hex encode info... > ...int32 value.. > > > > if is provided and VIR_DOMAIN_START_PAUSED is missing, > libvirt would report an error and refuse to start the guest > > 3. Libvirt generates the QEMU cli arg to enable SEV using > the XML data and starts QEMU, leaving CPUs paused > > 4. QEMU emits a SEV_MEASURE event containing the measurement > blob Speaking of which, I expect QEMU to have a QMP command to retrieve the measurement, in which case I think libvirt has to provide an API for the user to retrieve the measurement in case libvirtd crashes somewhere between setting up QEMU and waiting for the measurement event from QEMU, or simply because the GO missed the event for some unspecified reason. > > 5. Libvirt catches the QEMU event and emits its own > VIR_CONNECT_DOMAIN_EVENT_SEV_MEASURE event containing > the measurement blob > > 6. GO does its validation of the measurement > > 7a If validation failed, then virDomainDestroy() to stop QEMU > > 7b If validation succeeed > > Optionally call > > virDomainSetSEVSecret() Given the fact that we're likely introducing a new element to the XML config, I'm more inclined to utilizing the existing virSecret interfaces (as was originally suggested) instead of creating a vendor-specific API. You could have an optional secret sub-element within the element and libvirt would simply check if that secret has a value set, once the GO issues virDomainResume(). Any particular reason for having a specific API for this that I'm missing? Other than that I like the initial proposal for the design. Erik -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] apparmor: add attach_disconnected
Hi, Jamie Strandboge: > On Fri, 2017-09-15 at 17:17 +0200, Guido Günther wrote: >> Otherwise we fail to reconnect to /dev/net/tun opened by libvirtd >> like I confirm I see the bug on current Debian sid and Guido's patch fixes it. Please commit :) Cheers, -- intrigeri -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] apparmor: cater for new AAVMF image location
Jamie Strandboge: > On Fri, 2017-09-15 at 18:10 +0200, Guido Günther wrote: >> Things moved again, sigh. >> --- >> src/security/virt-aa-helper.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/src/security/virt-aa-helper.c b/src/security/virt-aa-helper.c >> index 55a686a59c..0b43c8e391 100644 >> --- a/src/security/virt-aa-helper.c >> +++ b/src/security/virt-aa-helper.c >> @@ -516,7 +516,8 @@ valid_path(const char *path, const bool readonly) >> "/usr/share/OVMF/", /* for OVMF images */ >> "/usr/share/ovmf/", /* for OVMF images */ >> "/usr/share/AAVMF/", /* for AAVMF images */ >> -"/usr/share/qemu-efi/" /* for AAVMF images */ >> +"/usr/share/qemu-efi/", /* for AAVMF images */ >> +"/usr/share/qemu-efi-aarch64/" /* for AAVMF images */ >> }; >> /* override the above with these */ >> const char * const override[] = { > +1. LGTM +1 too after verifying that qemu-efi-aarch64 on current Debian sid indeed ships QEMU_EFI.fd in there.. Cheers, -- intrigeri -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] RFC: libvirt support for QEMU live patching
On Mon, Sep 18, 2017 at 09:37:14AM +0200, Martin Kletzander wrote: > On Fri, Sep 15, 2017 at 09:18:18AM +0100, Daniel P. Berrange wrote: > > On Fri, Sep 15, 2017 at 01:27:31PM +0530, Madhu Pavan wrote: > > > Hi, > > > QEMU live patching should be just a matter of updating the QEMU RPM > > > package > > > and then live migrating the VMs to another QEMU instance on the same host > > > (which would point to the just installed new QEMU executable). > > > I think it will be useful to support it from libvirt side. After some > > > searching I found a > > > RFC patch posted in Nov 2013. Here is the link to it > > > https://www.redhat.com/archives/libvir-list/2013-November/msg00372.html > > > Approach followed in above mentioned link is as follows: > > > 1. newDef = deep copy oldVm definition > > > 2. newVm = create VM using newDef, start QEMU process with all vCPUs > > > paused > > > 3. oldVm migrate to newVm using unix socket > > > 4. shutdown oldVm > > > 5. newPid = newVm->pid > > > 6. finalDef = live deep copy of newVm definition > > > 7. Drop the newVm from qemu domain table without shutting down QEMU > > > process > > > 8. Assign finalDef to oldVm > > > 9. oldVm attaches to QEMU process newPid using finalDef > > > 10.resume all vCPUs in oldVm > > > > > > I can see it didn't get communities approval for having problems in > > > handling > > > UUID > > > of the vm's. To fix the problem we need to teach libvirt to manage two > > > qemu > > > processes > > > at once both tied to same UUID. I would like to know if there is any > > > interested approach > > > to get this done. I would like to send patches on this. > > > > > > Is there any specific reason why it is not been pursued for the last 4 > > > year? > > > > It isn't possible to make it work correctly in the general case, because > > both QEMU processes want to own the same files on disk. eg both might want > > to listen on a UNIX socket /foo/bar, but only one can do this. If you let > > the new QEMU delete the original QEMU's sockets, then you either break or > > delay incoming connections during the migration time, and you make it > > impossible to roll back on failure, or both. This kind of thing is not > > something that's acceptable for the usage scenerio described, which would > > need to be bulletproof to be usable in production. > > > > Can't we utilize namespaces for this? Lot of the things could be > separated, so we could fire up a new VM that's "containerized" like > this, migrate to it and then fire up a new one and migrate back. If the > first migration fails then we can still fallback. If it does not, then > the second one "should" not either. As well as increasing complexity and thus chances of failure, this also doubles the performance impact on the guest. More generally I think the high level idea of in-place live-upgrades for guests on a host is flawed. Even if we had the ability to in-place upgrade QEMU for running guest, there are always going to be cases where a security upgrade requires you to reboot the host system - kernel/glibc flaw, or fixes to other servicves that can't be restarted eg dbus. Any person hosting VMs will always always need to be able to handle migrating VMs to a *separate* host to deal with such security updates. For added complexity, determinining which security upgrades could be done by restarting QEMU vs which need to have a full host restart is not trivial, unless intimately familiar with the software stack. So from an administrative / operational POV, defining two separate procedures for dealing with upgrades is unwise. You have the chance of picking the wrong one and leaving a security fix accidentally unpatched, or if you take one path 95% of the time and the other path 5% of the time, chances are you're going to screw up the less used path due to lack of practice. It makes more sense to simplify the operational protocols by standardizing of cross-host migration, followed by host reboot, for applying patches. It simplifies the decision matrix removing complexity, and ensures you are well praticed following the same approach every time. The only cost is having 1 spare server against which to perform the migrations, but if your VMs are at all important, you will have spare hardware already to deal with unforseen hardware problems. If you absolutely can't afford spare hardware, you could just run your entire "host" OS in a container, and then spin up a second "host" OS on the same machine and migrate into that. From libvirt's POV this is still cross-host migration, so needs no special case. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH] qemu: Introduce a wrapper over virFileWrapperFdClose
https://bugzilla.redhat.com/show_bug.cgi?id=1448268 When migrating to a file (e.g. when doing 'virsh save file'), couple of things are happening in the thread that is executing the API: 1) the domain obj is locked 2) iohelper is spawned as a separate process to handle all I/O 3) the thread waits for iohelper to finish 4) the domain obj is unlocked Now, the problem is that while the thread waits in step 3 for iohelper to finish this may take ages because iohelper calls fdatasync(). And unfortunately, we are waiting the whole time with the domain locked. So if another thread wants to jump in and say copy the domain name ('virsh list' for instance), they are stuck. The solution is to unlock the domain whenever waiting for I/O and lock it back again when it finished. Signed-off-by: Michal Privoznik --- src/qemu/qemu_driver.c | 27 +++ 1 file changed, 23 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index b334cf20b..f81d3def4 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -3216,6 +3216,25 @@ qemuOpenFileAs(uid_t fallback_uid, gid_t fallback_gid, goto cleanup; } + +static int +qemuFileWrapperFDClose(virDomainObjPtr vm, + virFileWrapperFdPtr fd) +{ +int ret; + +/* virFileWrapperFd uses iohelper to write data onto disk. + * However, iohelper calls fdatasync() which may take ages to + * finish. Therefore, we shouldn't be waiting with the domain + * object locked. */ + +virObjectUnlock(vm); +ret = virFileWrapperFdClose(fd); +virObjectLock(vm); +return ret; +} + + /* Helper function to execute a migration to file with a correct save header * the caller needs to make sure that the processors are stopped and do all other * actions besides saving memory */ @@ -3276,7 +3295,7 @@ qemuDomainSaveMemory(virQEMUDriverPtr driver, goto cleanup; } -if (virFileWrapperFdClose(wrapperFd) < 0) +if (qemuFileWrapperFDClose(vm, wrapperFd) < 0) goto cleanup; if ((fd = qemuOpenFile(driver, vm, path, O_WRONLY, NULL, NULL)) < 0 || @@ -3827,7 +3846,7 @@ doCoreDump(virQEMUDriverPtr driver, path); goto cleanup; } -if (virFileWrapperFdClose(wrapperFd) < 0) +if (qemuFileWrapperFDClose(vm, wrapperFd) < 0) goto cleanup; ret = 0; @@ -6687,7 +6706,7 @@ qemuDomainRestoreFlags(virConnectPtr conn, ret = qemuDomainSaveImageStartVM(conn, driver, vm, &fd, data, path, false, QEMU_ASYNC_JOB_START); -if (virFileWrapperFdClose(wrapperFd) < 0) +if (qemuFileWrapperFDClose(vm, wrapperFd) < 0) VIR_WARN("Failed to close %s", path); qemuProcessEndJob(driver, vm); @@ -6962,7 +6981,7 @@ qemuDomainObjRestore(virConnectPtr conn, ret = qemuDomainSaveImageStartVM(conn, driver, vm, &fd, data, path, start_paused, asyncJob); -if (virFileWrapperFdClose(wrapperFd) < 0) +if (qemuFileWrapperFDClose(vm, wrapperFd) < 0) VIR_WARN("Failed to close %s", path); cleanup: -- 2.13.5 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] RFC: libvirt support for QEMU live patching
On Mon, Sep 18, 2017 at 10:26:40AM +0200, Peter Krempa wrote: On Mon, Sep 18, 2017 at 09:37:14 +0200, Martin Kletzander wrote: On Fri, Sep 15, 2017 at 09:18:18AM +0100, Daniel P. Berrange wrote: > On Fri, Sep 15, 2017 at 01:27:31PM +0530, Madhu Pavan wrote: [...] > It isn't possible to make it work correctly in the general case, because > both QEMU processes want to own the same files on disk. eg both might want > to listen on a UNIX socket /foo/bar, but only one can do this. If you let > the new QEMU delete the original QEMU's sockets, then you either break or > delay incoming connections during the migration time, and you make it > impossible to roll back on failure, or both. This kind of thing is not > something that's acceptable for the usage scenerio described, which would > need to be bulletproof to be usable in production. > Can't we utilize namespaces for this? Lot of the things could be separated, so we could fire up a new VM that's "containerized" like this, migrate to it and then fire up a new one and migrate back. If the first migration fails then we can still fallback. If it does not, then the second one "should" not either. If you are willing to accept this kind of slow-down/overhead you can as well as save the VM to file and restore it. This works as the upgrade path even now. That way you have no fallback option, though. The above solution could still support live upgrade. signature.asc Description: Digital signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 1/7] cpu_conf: Introduce virCPUDefList{Parse, Free}
On Fri, Sep 15, 2017 at 14:04:42 +0200, Ján Tomko wrote: > On Thu, Sep 14, 2017 at 12:57:14PM +0200, Jiri Denemark wrote: > >For parsing a list of CPU XMLs into a NULL-terminated list of CPU defs. > > > >Signed-off-by: Jiri Denemark > >--- > > src/conf/cpu_conf.c | 78 > > > > src/conf/cpu_conf.h | 7 + > > src/libvirt_private.syms | 2 ++ > > 3 files changed, 87 insertions(+) > > > > [...] > > >+/* > >+ * Parses a list of CPU XMLs into a NULL-terminated list of CPU defs. > >+ */ > >+virCPUDefPtr * > >+virCPUDefListParse(const char **xmlCPUs, > >+ unsigned int ncpus, > >+ virCPUType cpuType) > >+{ > >+ > > [...] > > >+if (ncpus < 1) { > > Interesting way of spelling '== 0' Yeah, caused by copy&paste engineering... see the next patch where the same condition gets removed from cpuBaselineXML. Fixed. Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] RFC: libvirt support for QEMU live patching
On Mon, Sep 18, 2017 at 09:37:14 +0200, Martin Kletzander wrote: > On Fri, Sep 15, 2017 at 09:18:18AM +0100, Daniel P. Berrange wrote: > > On Fri, Sep 15, 2017 at 01:27:31PM +0530, Madhu Pavan wrote: [...] > > It isn't possible to make it work correctly in the general case, because > > both QEMU processes want to own the same files on disk. eg both might want > > to listen on a UNIX socket /foo/bar, but only one can do this. If you let > > the new QEMU delete the original QEMU's sockets, then you either break or > > delay incoming connections during the migration time, and you make it > > impossible to roll back on failure, or both. This kind of thing is not > > something that's acceptable for the usage scenerio described, which would > > need to be bulletproof to be usable in production. > > > > Can't we utilize namespaces for this? Lot of the things could be > separated, so we could fire up a new VM that's "containerized" like > this, migrate to it and then fire up a new one and migrate back. If the > first migration fails then we can still fallback. If it does not, then > the second one "should" not either. If you are willing to accept this kind of slow-down/overhead you can as well as save the VM to file and restore it. This works as the upgrade path even now. signature.asc Description: PGP signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] RFC: libvirt support for QEMU live patching
On Fri, Sep 15, 2017 at 09:18:18AM +0100, Daniel P. Berrange wrote: On Fri, Sep 15, 2017 at 01:27:31PM +0530, Madhu Pavan wrote: Hi, QEMU live patching should be just a matter of updating the QEMU RPM package and then live migrating the VMs to another QEMU instance on the same host (which would point to the just installed new QEMU executable). I think it will be useful to support it from libvirt side. After some searching I found a RFC patch posted in Nov 2013. Here is the link to it https://www.redhat.com/archives/libvir-list/2013-November/msg00372.html Approach followed in above mentioned link is as follows: 1. newDef = deep copy oldVm definition 2. newVm = create VM using newDef, start QEMU process with all vCPUs paused 3. oldVm migrate to newVm using unix socket 4. shutdown oldVm 5. newPid = newVm->pid 6. finalDef = live deep copy of newVm definition 7. Drop the newVm from qemu domain table without shutting down QEMU process 8. Assign finalDef to oldVm 9. oldVm attaches to QEMU process newPid using finalDef 10.resume all vCPUs in oldVm I can see it didn't get communities approval for having problems in handling UUID of the vm's. To fix the problem we need to teach libvirt to manage two qemu processes at once both tied to same UUID. I would like to know if there is any interested approach to get this done. I would like to send patches on this. Is there any specific reason why it is not been pursued for the last 4 year? It isn't possible to make it work correctly in the general case, because both QEMU processes want to own the same files on disk. eg both might want to listen on a UNIX socket /foo/bar, but only one can do this. If you let the new QEMU delete the original QEMU's sockets, then you either break or delay incoming connections during the migration time, and you make it impossible to roll back on failure, or both. This kind of thing is not something that's acceptable for the usage scenerio described, which would need to be bulletproof to be usable in production. Can't we utilize namespaces for this? Lot of the things could be separated, so we could fire up a new VM that's "containerized" like this, migrate to it and then fire up a new one and migrate back. If the first migration fails then we can still fallback. If it does not, then the second one "should" not either. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list signature.asc Description: Digital signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] virsh: Enhance documentation of --rdma-pin-all option
On Fri, Sep 08, 2017 at 09:33:13PM +0200, Jiri Denemark wrote: > https://bugzilla.redhat.com/show_bug.cgi?id=1373783 > > Signed-off-by: Jiri Denemark > --- > tools/virsh.pod | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) Reviewed-by: Pavel Hrdina signature.asc Description: PGP signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list