[libvirt] interactions between virDomainSetVcpusFlags and NUMA/pinning?
Hi,

Just wondering about the interactions between virDomainSetVcpusFlags() and virDomainPinVcpuFlags() and the domain XML.

1) If I add a vCPU to a domain, do I need to pin it afterwards, or does it respect the vCPU-to-pCPU mapping specified in the domain XML?

2) Are vCPUs added/removed in strict numerical order, such that at any given time the active vCPUs are numbered 0-(N-1), where N is the number of active vCPUs?

3) Will newly-added vCPUs respect the NUMA topology specified in the domain XML?

Thanks,
Chris

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
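If a hot-added vCPU does turn out to need explicit pinning, the python bindings expect the pin mask as a tuple of booleans, one entry per host pCPU. A minimal sketch of building that mask (the helper name and the commented-out pin call are illustrative, not from this thread):

```python
def cpumap(pcpus, npcpus):
    """Build the boolean pin mask that virDomainPinVcpuFlags (and the
    python binding's Domain.pinVcpu/pinVcpuFlags) expects: one entry
    per host pCPU, True where the vCPU is allowed to run."""
    allowed = set(pcpus)
    return tuple(i in allowed for i in range(npcpus))

# e.g. allow a vCPU only on host CPUs 2 and 3 of a 4-pCPU host:
mask = cpumap([2, 3], 4)
# dom.pinVcpu(vcpu_index, mask)   # requires a live libvirt domain
```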
Re: [libvirt] anyone ever seen virDomainCreateWithFlags() essentially hang?
On 04/05/2018 12:17 PM, Jiri Denemark wrote:
> On Thu, Apr 05, 2018 at 12:00:44 -0600, Chris Friesen wrote:
>> I'm investigating something weird with libvirt 1.2.17 and qemu 2.3.0.
>> I'm using the python bindings, and I seem to have a case where
>> libvirtmod.virDomainCreateWithFlags() hung rather than returned. Then,
>> about 15 min later a subsequent call to libvirtmod.virDomainDestroy()
>> from a different eventlet within the same process seems to have
>> "unblocked" the original creation call, which raised an exception with
>> an error code of libvirt.VIR_ERR_INTERNAL_ERROR. The virDomainDestroy()
>> call came back with an error of "Requested operation is not valid:
>> domain is not running".
>>
>> The corresponding qemu logs show the guest starting up and then, a bit
>> over 15 min later, a "shutting down" log. At shutdown time the libvirtd
>> log shows "qemuMonitorIORead:609 : Unable to read from monitor:
>> Connection reset by peer".
>
> Looks like qemu is hung and is not responding to commands libvirt sends
> to QEMU's monitor socket. And since this happens while libvirt is in the
> process of starting up the domain (it sends several commands to QEMU
> before it starts the virtual CPUs and considers the domain running), you
> see a hanging virDomainCreateWithFlags API.

Seems plausible. The libvirt qemuDomainDestroyFlags() code seems to kill the qemu process first before emitting the "domain is not running" error, so that would fit with the logs.

Of course now I have an unexplained qemu hang, which isn't much better. :)

Chris
[libvirt] anyone ever seen virDomainCreateWithFlags() essentially hang?
I'm investigating something weird with libvirt 1.2.17 and qemu 2.3.0.

I'm using the python bindings, and I seem to have a case where libvirtmod.virDomainCreateWithFlags() hung rather than returned. Then, about 15 min later, a subsequent call to libvirtmod.virDomainDestroy() from a different eventlet within the same process seems to have "unblocked" the original creation call, which raised an exception with an error code of libvirt.VIR_ERR_INTERNAL_ERROR. The virDomainDestroy() call came back with an error of "Requested operation is not valid: domain is not running".

The corresponding qemu logs show the guest starting up and then, a bit over 15 min later, a "shutting down" log. At shutdown time the libvirtd log shows "qemuMonitorIORead:609 : Unable to read from monitor: Connection reset by peer".

The parent function did two additional retries, and both retries failed in similar fashion. In all three cases there seems to be a pattern of the qemu instance starting up but virDomainCreateWithFlags() not returning, then a subsequent virDomainDestroy() call for the same domain causing the virDomainCreateWithFlags() call to get "unblocked" and return -1, leading to an exception in the python code.

Any ideas what might cause this behaviour? I haven't reproduced the "hanging" behaviour myself; I'm working entirely off of logs from the affected system.

Thanks,
Chris
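Not from the thread, but a defensive pattern for callers in this situation: run the potentially-blocking create call in a worker thread and give up after a deadline, rather than letting an eventlet hang for 15 minutes. A rough sketch with a stand-in for the libvirt call (error handling elided; note libvirt calls can't actually be cancelled, so the worker keeps running in the background):

```python
import threading

def call_with_timeout(fn, timeout, *args):
    """Run fn(*args) in a worker thread; return (True, result) if it
    finishes within `timeout` seconds, else (False, None). The worker
    is a daemon thread, so a stuck call won't block process exit."""
    result = []
    worker = threading.Thread(target=lambda: result.append(fn(*args)))
    worker.daemon = True
    worker.start()
    worker.join(timeout)
    if worker.is_alive():      # deadline expired, call still blocked
        return (False, None)
    return (True, result[0])

# ok, dom = call_with_timeout(conn.createXML, 120.0, xml, flags)
ok, val = call_with_timeout(lambda: 42, 2.0)
```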
Re: [libvirt] [PATCH] qemu: fix migration with local and VIR_STORAGE_TYPE_NETWORK disks
On 02/09/2018 04:15 AM, Daniel P. Berrangé wrote:
> On Thu, Feb 08, 2018 at 01:24:58PM -0600, Chris Friesen wrote:
>> Given your comment above about "I don't want to see the semantics of
>> that change", it sounds like you're suggesting:
>>
>> 1) If there are any non-shared non-readonly network drives then the
>> user can't rely on the default behaviour of VIR_MIGRATE_NON_SHARED_INC
>> to do the right thing and therefore must explicitly specify the list
>> of drives to migrate
>
> I would not make that conditional. Just always specify the list of
> disks to migrate, if you're using a new enough libvirt.
>
>> 2) If there are no drives to migrate, then it is not valid to specify
>> VIR_MIGRATE_NON_SHARED_INC with an empty "migrate_disks", but instead
>> the caller should ensure that VIR_MIGRATE_NON_SHARED_INC is not set.
>
> Yes, don't ask for shared storage migration if there's no storage to
> migrate.

Thanks for the clarifications.

Chris
Re: [libvirt] [PATCH] qemu: fix migration with local and VIR_STORAGE_TYPE_NETWORK disks
On 02/08/2018 03:07 AM, Daniel P. Berrangé wrote:
> On Wed, Feb 07, 2018 at 01:11:33PM -0600, Chris Friesen wrote:
>> Are you okay with the other change?
>
> That part of the code was intended to be functionally identical to what
> QEMU's previous built-in storage migration code would do. I don't want
> to see the semantics of that change, because it makes libvirt behaviour
> vary depending on which QEMU version you are using. If that logic is
> not right for a particular usage scenario, applications are expected to
> provide the "migrate_disks" parameter.

My coworker has pointed out another related issue. In tools/virsh-domain.c, doMigrate(), if we specify "migrate-disks" with an empty list, the behaviour is the same as if it is not specified at all. That is, the fact that it was specified but empty is lost.

>> Our original problem scenario was where the root disk is rbd and there
>> is a read-only ISO config-drive, and "nmigrate_disks" is zero. What we
>> see in this case is that qemuMigrateDisk() returns "true" for the rbd
>> disk, which then causes qemuMigrationPrecreateStorage() to fail with
>> "pre-creation of storage targets for incremental storage migration is
>> not supported".
>
> So you want zero disks migrated. Simply don't ask for storage migration
> in the first place if you don't have any disks to migrate.

In this case yes, but now we're talking about duplicating the libvirt logic around which disks to migrate in the code that calls libvirt. There is a comment in the OpenStack nova code that looks like this:

        # Due to a quirk in the libvirt python bindings,
        # VIR_MIGRATE_NON_SHARED_INC with an empty migrate_disks is
        # interpreted as "block migrate all writable disks" rather than
        # "don't block migrate any disks". This includes attached
        # volumes, which will potentially corrupt data on those
        # volumes. Consequently we need to explicitly unset
        # VIR_MIGRATE_NON_SHARED_INC if there are no disks to be block
        # migrated.

It sounds like it's not just a quirk, but rather design intent?

Given your comment above about "I don't want to see the semantics of that change", it sounds like you're suggesting:

1) If there are any non-shared non-readonly network drives then the user can't rely on the default behaviour of VIR_MIGRATE_NON_SHARED_INC to do the right thing, and therefore must explicitly specify the list of drives to migrate.

2) If there are no drives to migrate, then it is not valid to specify VIR_MIGRATE_NON_SHARED_INC with an empty "migrate_disks"; instead the caller should ensure that VIR_MIGRATE_NON_SHARED_INC is not set.

Is that a fair summation? If so, I'd suggest that this is non-intuitive from a user's perspective and at a minimum should be more explicitly documented.

Chris
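The rule in point 2) can be captured by the caller in a few lines; this models the same workaround the nova comment earlier in this thread describes. The flag value is my transcription of libvirt's VIR_MIGRATE_NON_SHARED_INC constant, and the helper name is mine:

```python
VIR_MIGRATE_NON_SHARED_INC = 128  # assumed value, from libvirt-domain.h

def adjust_migrate_flags(flags, migrate_disks):
    """Drop VIR_MIGRATE_NON_SHARED_INC when there is nothing to
    block-migrate: flag-set plus an empty disk list means "migrate all
    writable disks", not "migrate none", so the caller must clear the
    flag itself."""
    if not migrate_disks:
        flags &= ~VIR_MIGRATE_NON_SHARED_INC
    return flags
```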
Re: [libvirt] [PATCH] qemu: fix migration with local and VIR_STORAGE_TYPE_NETWORK disks
On 02/07/2018 12:05 PM, Daniel P. Berrangé wrote:
> On Wed, Feb 07, 2018 at 11:57:19AM -0600, Chris Friesen wrote:
>> In the current implementation of qemuMigrateDisk() the value of the
>> "nmigrate_disks" parameter wrongly impacts the decision whether or not
>> to migrate a disk that is not a member of "migrate_disks":
>>
>> 1) If "nmigrate_disks" is zero, "disk" is migrated if it's non-shared
>> non-readonly with source.
>>
>> 2) If "nmigrate_disks" is non-zero and "disk" is not a member of
>> "migrate_disks" then "disk" is not migrated. This should instead
>> proceed with checking conditions as per 1) and allow migration of
>> non-shared non-readonly disks with source.
>
> Huh, this doesn't make sense. If an app has passed a list of disks in
> migrate_disks, we must *never* touch any disk that is not present in
> this list. If the app wanted the other disk(s) migrated, it would have
> included it in the list of disks it passed in.

Okay, that makes sense. I can restore the "return false" here.

Are you okay with the other change? Our original problem scenario was where the root disk is rbd and there is a read-only ISO config-drive, and "nmigrate_disks" is zero. What we see in this case is that qemuMigrateDisk() returns "true" for the rbd disk, which then causes qemuMigrationPrecreateStorage() to fail with "pre-creation of storage targets for incremental storage migration is not supported".

Chris
[libvirt] [PATCH] qemu: fix migration with local and VIR_STORAGE_TYPE_NETWORK disks
In the current implementation of qemuMigrateDisk() the value of the "nmigrate_disks" parameter wrongly impacts the decision whether or not to migrate a disk that is not a member of "migrate_disks":

1) If "nmigrate_disks" is zero, "disk" is migrated if it's non-shared non-readonly with source.

2) If "nmigrate_disks" is non-zero and "disk" is not a member of "migrate_disks" then "disk" is not migrated. This should instead proceed with checking conditions as per 1) and allow migration of non-shared non-readonly disks with source.

Fixing 2) breaks migration of VMs with a mix of rbd and local disks, because now libvirt tries to migrate the rbd root disk and it fails. This new problem is solved by updating 1) to factor in the disk source type and migrate only 'local' non-shared non-readonly disks with source.

The end result is that disks not in "migrate_disks" are treated uniformly regardless of the value of "nmigrate_disks".

Signed-off-by: Chris Friesen <chris.frie...@windriver.com>
---
 src/qemu/qemu_migration.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 5ee9e5c..77fafc6 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -409,12 +409,12 @@ qemuMigrateDisk(virDomainDiskDef const *disk,
             if (STREQ(disk->dst, migrate_disks[i]))
                 return true;
         }
-        return false;
     }

-    /* Default is to migrate only non-shared non-readonly disks
+    /* Default is to migrate only non-shared non-readonly local disks
      * with source */
     return !disk->src->shared && !disk->src->readonly &&
+           (disk->src->type != VIR_STORAGE_TYPE_NETWORK) &&
            !virStorageSourceIsEmpty(disk->src);
 }
--
1.8.3.1
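The decision logic this patch proposes can be restated as a small executable model (a paraphrase of the C in Python, not the C itself; field names shortened, and a dict standing in for virDomainDiskDef):

```python
def should_migrate_disk(disk, migrate_disks):
    """Model of qemuMigrateDisk() as proposed by the patch: an explicit
    migrate_disks list is consulted first, and any disk not listed falls
    through to the default rule, which now also excludes network-backed
    sources -- local, non-shared, non-readonly, non-empty."""
    if migrate_disks:
        if disk["dst"] in migrate_disks:
            return True
        # no early "return False": fall through to the default rule
    return (not disk["shared"] and not disk["readonly"]
            and disk["type"] != "network" and not disk["empty"])

rbd_root = {"dst": "vda", "shared": False, "readonly": False,
            "type": "network", "empty": False}
local_disk = {"dst": "vdb", "shared": False, "readonly": False,
              "type": "file", "empty": False}
```

With this rule, the rbd root disk from the problem scenario is skipped whether or not a disk list is given, while a local writable disk is still migrated by default.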
Re: [libvirt] Redesigning Libvirt: Adopting use of a safe language
On 11/20/2017 09:25 AM, Daniel P. Berrange wrote:
> When I worked in OpenStack it was a constant battle to get people to
> consider enhancements to libvirt instead of reinventing it in Python.
> It was a hard sell because most python devs just didn't want to use C
> at all because it has a high learning curve for contributors, even if
> libvirt as a community is welcoming. As a result OpenStack pretty much
> reinvented its own hypervisor-agnostic API for esx, hyperv, xenapi and
> KVM instead of enhancing libvirt's support for esx, hyperv or xenapi.

To be fair, there's also the issue that getting a change into an external project and packaged into all the distros is more unpredictable (and may take longer) than implementing the same thing in their own project. RHEL (just as an example) has been updating libvirt roughly once a year for the past couple of years.

Chris
Re: [libvirt] Redesigning Libvirt: Adopting use of a safe language
On 11/17/2017 06:37 AM, Daniel P. Berrange wrote:
> On Fri, Nov 17, 2017 at 01:34:54PM +0100, Markus Armbruster wrote:
>> "Daniel P. Berrange" writes:
>> [...]
>>> Goroutines are basically a union of the thread + coroutine concepts.
>>> The Go runtime will create N OS-level threads, where the default N
>>> currently matches the number of logical CPU cores your host has (but
>>> is tunable to other values). The application code just always creates
>>> goroutines, which are userspace threads just like coroutines. The Go
>>> runtime will dynamically switch goroutines at key points, and
>>> automatically pick suitable OS-level threads to run them on to
>>> maximize concurrency. Most cleverly, goroutines have a 2 KB default
>>> stack size, and the runtime will dynamically grow the stack if that
>>> limit is reached.
>>
>> Does this work even when the stack limit is exceeded in a C function?
>
> When you make a C call in Go, it runs on a separate stack. The
> goroutine's own stack is managed by the garbage collector, so it can't
> be exposed to C code. I'm unclear exactly what size the C stack would
> be, but it'll be the traditional fixed size, not the grow-on-demand
> behaviour of the Go stack.

Based on https://github.com/golang/go/blob/master/src/runtime/cgo/gcc_linux_amd64.c it looks like they don't explicitly specify a stack size, at least on Linux.

Are there limits as to what you're allowed to do in C code called from Go? Can you fork processes, spawn threads, call setjmp/longjmp, handle signals, sleep, etc.?

Chris
Re: [libvirt] Redesigning Libvirt: Adopting use of a safe language
On 11/16/2017 03:55 PM, John Ferlan wrote:
> On 11/14/2017 12:27 PM, Daniel P. Berrange wrote:
>> Part of the problem is that, despite Linux having very low overhead
>> thread spawning, threads still consume non-trivial resources, so we
>> try to constrain how many we use, which forces an M:N relationship
>> between jobs we need to process and threads we have available.
>
> So Go's process/thread model is then lightweight? What did they learn
> that the rest of us ought to know! Or is this just a continuation of
> the libvirtd discussion?

Goroutines are not strictly 1:1 mapped to OS threads; it's an N:M mapping where a blocking call in a goroutine will not block any other goroutines. Modern Go defaults to a number of OS threads equal to the number of cores.

Chris
Re: [libvirt] [PATCH v4 0/4] Implement migrate-getmaxdowntime command
On 08/17/2017 04:17 PM, Scott Garfinkle wrote:
> Currently, the maximum tolerable downtime for a domain being migrated
> is write-only. This patch implements a way to query that value
> nondestructively.

I'd like to register my support for the concept in general. It seems odd to have something you can write but not read.

For what it's worth, I took a look at the patches and didn't see anything horribly wrong, but I only dabble in libvirt so that's not worth much. :)

Chris
[libvirt] question about locking in qemuDomainObjBeginJobInternal()
Hi,

I'm hitting a scenario (on libvirt 1.2.12, so yeah, it's a bit old) where I'm attempting to create two domains at the same time, and they both end up erroring out with "cannot acquire state change lock":

2017-08-14T12:57:00.000 79674: warning : qemuDomainObjBeginJobInternal:1380 : Cannot start job (modify, none) for domain instance-0001; current job is (modify, none) owned by (79673, 0)
2017-08-14T12:57:00.000 79674: error : qemuDomainObjBeginJobInternal:1385 : Timed out during operation: cannot acquire state change lock
2017-08-14T12:57:01.000 79675: warning : qemuDomainObjBeginJobInternal:1380 : Cannot start job (modify, none) for domain instance-0002; current job is (modify, none) owned by (79677, 0)
2017-08-14T12:57:01.000 79675: error : qemuDomainObjBeginJobInternal:1385 : Timed out during operation: cannot acquire state change lock

Given that the lock appears to be per-domain, I assume this means that something is trying to issue multiple operations in parallel to each domain?

Chris
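If the parallel operations are indeed coming from the same management process, one mitigation (mine, not from this thread) is to serialize per-domain on the client side, so callers queue up locally instead of burning the qemu driver's job-wait timeout. A sketch:

```python
import threading

_locks = {}
_locks_guard = threading.Lock()

def domain_lock(name):
    """Return the lazily-created lock for a given domain name; the
    guard lock makes concurrent first lookups safe."""
    with _locks_guard:
        return _locks.setdefault(name, threading.Lock())

def run_serialized(name, fn, *args):
    """Run a libvirt operation against domain `name` under that
    domain's lock, so concurrent callers in this process can't race
    each other for the driver's per-domain job lock."""
    with domain_lock(name):
        return fn(*args)

# run_serialized("instance-0001", dom.attachDevice, xml)
```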
Re: [libvirt] status of support for cache allocation technology?
On 07/27/2017 05:08 AM, Martin Kletzander wrote:
>> Is the "[PATH V10 00/12] Support cache tune in libvirt" patch series
>> the most recent set of patches?
>
> No, then there were several RFCs and then patch series again, IIRC, but
> you can expect a new one written from scratch to be posted soon. I can
> add you to Cc if you want me to and I don't forget.

I'd appreciate that, thanks.

Chris
[libvirt] status of support for cache allocation technology?
Hi,

I'm just wondering what the current status is of exposing/controlling cache banks. Looking at the code, it appears that we report the banks as part of "virsh capabilities". Is it possible to associate a particular bank with a particular domain, or has that not yet been merged?

Is the "[PATH V10 00/12] Support cache tune in libvirt" patch series the most recent set of patches?

Thanks,
Chris
Re: [libvirt] libvirtd not responding to virsh, results in virsh hanging
On 03/31/2017 11:30 AM, Chris Friesen wrote:
> On 03/31/2017 11:21 AM, Chris Friesen wrote:
>> I ran tcpdump looking for TCP traffic between the two libvirtd
>> processes, and was unable to see any after several minutes. So it
>> doesn't look like there is any regular keepalive messaging going on
>> (/etc/libvirt/libvirtd.conf doesn't specify any keepalive settings,
>> so we'd be using the defaults I think). And yet the TCP connection is
>> stuck open.
>
> Turns out I ran tcpdump in the wrong window... oops. There's what
> appears to be a keepalive sequence every 5 seconds. I still don't
> understand why the connection wasn't taken down when qemu exited on
> the destination host.

One final update for now... I attached gdb to libvirtd on the source host and then killed libvirtd on the destination host. I saw the TCP connection get closed down, and gdb showed this:

[Thread 0x7f8948ab3700 (LWP 4514) exited]

At this point "virsh" commands on the source host work as expected; it's no longer hung.

So it appears we have a number of factors contributing to the hang:

1) failure of the migration in qemu
2) the connection between hosts not getting torn down when the migration fails
3) the libvirtd thread managing the migration on the source side appears to be sleeping indefinitely while holding a resource of some sort, which causes the apparent hang when we try to do other operations

Chris
Re: [libvirt] libvirtd not responding to virsh, results in virsh hanging -- correction
On 03/31/2017 11:21 AM, Chris Friesen wrote:
> I ran tcpdump looking for TCP traffic between the two libvirtd
> processes, and was unable to see any after several minutes. So it
> doesn't look like there is any regular keepalive messaging going on
> (/etc/libvirt/libvirtd.conf doesn't specify any keepalive settings, so
> we'd be using the defaults I think). And yet the TCP connection is
> stuck open.

Turns out I ran tcpdump in the wrong window... oops. There's what appears to be a keepalive sequence every 5 seconds.

I still don't understand why the connection wasn't taken down when qemu exited on the destination host.

Chris
Re: [libvirt] libvirtd not responding to virsh, results in virsh hanging
Hi,

I finally got a chance to take another look at this issue. We've reproduced it in another test lab. New information below.

On 03/18/2017 12:41 AM, Michal Privoznik wrote:
> On 17.03.2017 23:21, Chris Friesen wrote:
>> Hi, We've recently run into an issue with libvirt 1.2.17 in the
>> context of an OpenStack deployment.
>
> Let me just say that 1.2.17 is rather old libvirt. Can you try with one
> of the latest ones to see whether the bug still reproduces?

Difficult, since the version seems likely to be part of the problem. We haven't seen this issue with migrations between hosts with libvirtd 1.2.17 or between hosts with libvirtd 2.0.0, just when the versions are mismatched. The issue occurs when we are trying to do an in-service upgrade, so the source host is running libvirt 1.2.17 and we're trying to live-migrate to a dest host that has been upgraded to libvirt 2.0.0.

Interestingly, the issue doesn't always happen; it's intermittent. We recently reproduced it on the fourth guest we live-migrated from the "old" host to the "new" host--the first three migrated without difficulty. (And the first three were configured very closely to the fourth...boot from iscsi, same number/type of NICs, same number of vCPUs and amount of RAM, same topology, etc.)

To answer a previous question, yes, we're doing tunnelled migration in this case.

>> Interestingly, when I hit "c" to continue in the debugger, I got this:
>>
>> (gdb) c
>> Continuing.
>> Program received signal SIGPIPE, Broken pipe.
>> [Switching to Thread 0x7f0573fff700 (LWP 186865)]
>> 0x7f05b5cbb1cd in write () from /lib64/libpthread.so.0
>> (gdb) c
>> Continuing.
>> [Thread 0x7f0573fff700 (LWP 186865) exited]
>> (gdb) quit
>> A debugging session is active.
>> Inferior 1 [process 37471] will be detached.
>> Quit anyway? (y or n) y
>> Detaching from program: /usr/sbin/libvirtd, process 37471
>
> This is because there might be some keepalive going on. Introduced in
> 0.9.8, libvirt has a keepalive mechanism in place (repeatedly sending
> ping/pong between client & server). Now, should 5 subsequent pings get
> lost (this is configurable of course) libvirt thinks the connection is
> broken and closes it. If you attach a debugger to libvirtd, the whole
> daemon is paused, along with the event loop, so the server cannot reply
> to the client's pings, which in turn makes the client think the
> connection is broken. Thus it closes the connection, which is observed
> as a broken pipe in the daemon.

I've reproduced the issue in another test lab; in this case compute-2 is the "old" host while compute-0 and compute-1 are the "new" hosts. Three guests have live-migrated from compute-2 to compute-0, and a fourth appears to be stuck in progress, but libvirtd is hung so any "virsh" commands also hang.

Running "netstat -atpn | grep libvirtd" shows an open connection between compute-2 (192.168.205.134) and compute-0 (192.168.205.24). Presumably this corresponds to the migration that appears to be "stuck" in progress:

compute-2:/home/wrsroot# netstat -atpn | grep libvirtd
tcp    0   0 0.0.0.0:16509           0.0.0.0:*              LISTEN       35787/libvirtd
tcp    0   0 192.168.205.134:51760   192.168.205.24:16509   ESTABLISHED  35787/libvirtd
tcp6   0   0 :::16509                :::*                   LISTEN       35787/libvirtd

Running "virsh list" on compute-0 shows 9 guests, which agrees with the number of running "qemu-kvm" processes. Interestingly, the guest from the migration with an open connection in libvirtd is *not* running and doesn't show up in the "virsh list" output.

The /var/log/libvirt/qemu/instance-000e.log file on compute-0 corresponds to the instance that libvirtd is "stuck" migrating, and it ends with these lines:

2017-03-29T06:38:37.886940Z qemu-kvm: VQ 2 size 0x80 < last_avail_idx 0x47b - used_idx 0x47c
2017-03-29T06:38:37.886974Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:07.0/virtio-balloon'
2017-03-29T06:38:37.888684Z qemu-kvm: load of migration failed: Operation not permitted
2017-03-29 06:38:37.896+0000: shutting down

I think this implies a qemu incompatibility of some sort between the different qemu versions on the "old" and "new" hosts, but it doesn't explain why libvirtd didn't close down the migration connection between the two hosts.

The corresponding libvirtd logs on compute-0 are:

2017-03-29T06:38:35.000 401: warning : qemuDomainObjTaint:3580 : Domain id=10 name='instance-000e' uuid=57ae849f-aa66-422a-90a2-62db6c59db29 is tainted: high-privileges
2017-03-29T06:38:37.000 49075: error : qemuMonitorIO:695 : internal error: End of file from monitor
2017-03-29T06:38:37.000 49075: error : qemuProcessReportLogError:1810 : internal error: qemu unexpectedly closed the monitor: EAL:eal_memory.c:1591: WARNING: Address Space Layout Randomization (ASLR) is enabled in the kernel.
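For reference, the keepalive behaviour Michal describes is tunable on the server side in /etc/libvirt/libvirtd.conf. A sketch with what I believe are the default values (check the commented-out entries shipped with your libvirt version for the authoritative ones):

```
# Seconds between keepalive probes sent to idle clients;
# -1 disables server-initiated keepalive messages.
keepalive_interval = 5

# Number of unanswered probes before the connection is
# considered broken and closed.
keepalive_count = 5
```

The client side has matching knobs via virConnectSetKeepAlive(), which is presumably what produces the 5-second ping sequence observed with tcpdump earlier in this thread.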
[libvirt] libvirtd not responding to virsh, results in virsh hanging
Hi,

We've recently run into an issue with libvirt 1.2.17 in the context of an OpenStack deployment. Occasionally, after doing live migrations from a compute node with libvirt 1.2.17 to a compute node with libvirt 2.0.0, we see libvirtd on the 1.2.17 side stop responding. When this happens, if you run a command like "sudo virsh list" then it just hangs waiting for a response from libvirtd.

Running "ps -elfT | grep libvirtd" shows many threads waiting on a futex, but two threads in poll_schedule_timeout() as part of the poll() syscall. On a non-hung libvirtd I only see one thread in poll_schedule_timeout(). If I kill and restart libvirtd (this took two tries, it didn't actually die the first time) then the problem seems to go away.

I just tried attaching gdb to the "hung" libvirtd process and running "thread apply all backtrace". This printed backtraces for the threads, including the one that was apparently stuck in poll():

Thread 17 (Thread 0x7f0573fff700 (LWP 186865)):
#0  0x7f05b59d769d in poll () from /lib64/libc.so.6
#1  0x7f05b7f01b9a in virNetClientIOEventLoop () from /lib64/libvirt.so.0
#2  0x7f05b7f0234b in virNetClientSendInternal () from /lib64/libvirt.so.0
#3  0x7f05b7f036f3 in virNetClientSendWithReply () from /lib64/libvirt.so.0
#4  0x7f05b7f04eb3 in virNetClientStreamSendPacket () from /lib64/libvirt.so.0
#5  0x7f05b7ed8db5 in remoteStreamFinish () from /lib64/libvirt.so.0
#6  0x7f05b7ec7eaa in virStreamFinish () from /lib64/libvirt.so.0
#7  0x7f059bd9323d in qemuMigrationIOFunc () from /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so
#8  0x7f05b7e09aa2 in virThreadHelper () from /lib64/libvirt.so.0
#9  0x7f05b5cb4dc5 in start_thread () from /lib64/libpthread.so.0
#10 0x7f05b59e1ced in clone () from /lib64/libc.so.6

Interestingly, when I hit "c" to continue in the debugger, I got this:

(gdb) c
Continuing.
Program received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x7f0573fff700 (LWP 186865)]
0x7f05b5cbb1cd in write () from /lib64/libpthread.so.0
(gdb) c
Continuing.
[Thread 0x7f0573fff700 (LWP 186865) exited]
(gdb) quit
A debugging session is active.
Inferior 1 [process 37471] will be detached.
Quit anyway? (y or n) y
Detaching from program: /usr/sbin/libvirtd, process 37471

Now thread 186865 seems to be gone, and libvirtd is no longer hung.

Has anyone seen anything like this before? Anyone have an idea where to start looking?

Thanks,
Chris
Re: [libvirt] inconsistent handling of "qemu64" CPU model
On 05/26/2016 04:41 AM, Jiri Denemark wrote:
> The qemu64 CPU model contains svm and thus libvirt will always consider
> it incompatible with any Intel CPU (which has vmx instead of svm). On
> the other hand, QEMU by default ignores features that are missing in
> the host CPU and has no problem using the qemu64 CPU; the guest just
> won't see some of the features defined in the qemu64 model. In your
> case, you should be able to use qemu64 to get the same CPU model you'd
> get by default (if not, you may need to also add [...]). Alternatively
> [...] qemu64 should work too (and it would be better in case you use it
> on an AMD host).

It's actually OpenStack that is setting up the XML, not me, so I'd have to special-case the "qemu64" model and it'd get ugly. :)

The question remains, why is "qemu64" okay when used implicitly but not explicitly? I would have expected them to behave the same.

> But why do you even want to use the qemu64 CPU in a domain XML
> explicitly? If you're fine with that CPU, just let QEMU use the default
> one. If not, use a CPU model that fits your host/needs better.

Working around another issue would be simpler/cleaner if I could just explicitly set the model to qemu64.

Chris
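The XML markup in Jiri's reply was stripped by the list archive. As a sketch, an explicit qemu64 request with a permissive fallback might look like the following (the mode/match/fallback attribute choices are my assumption, not a quote from his mail):

```xml
<cpu mode='custom' match='exact'>
  <model fallback='allow'>qemu64</model>
</cpu>
```

With fallback='allow', libvirt may fall back to a different model when the requested one can't be provided exactly, which would sidestep the strict host-compatibility check that trips over the svm feature.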
[libvirt] inconsistent handling of "qemu64" CPU model
Hi,

I'm not sure where the problem lies, hence the CC to both lists. Please copy me on the reply.

I'm playing with OpenStack's devstack environment on an Ubuntu 14.04 host with a Celeron 2961Y CPU. (libvirt detects it as a Nehalem with a bunch of extra features.) QEMU reports version 2.2.0 (Debian 1:2.2+dfsg-5expubuntu9.7~cloud2).

If I don't specify a virtual CPU model, it appears to give me a "qemu64" CPU, and /proc/cpuinfo in the guest instance looks something like this:

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 6
model name  : QEMU Virtual CPU version 2.2.0
stepping    : 3
microcode   : 0x1
flags       : fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl pni vmx cx16 x2apic popcnt hypervisor lahf_lm abm vnmi ept

However, if I explicitly specify a custom CPU model of "qemu64" the instance refuses to boot and I get a log saying:

libvirtError: unsupported configuration: guest and host CPU are not compatible: Host CPU does not provide required features: svm

When this happens, the domain XML specifies an "hvm" os type and a custom CPU model of "qemu64".

Of course "svm" is an AMD flag and I'm running an Intel CPU. But why does it work when I just rely on the default virtual CPU? Is kvm_default_unset_features handled differently when the model is implicit vs explicit?

If I explicitly specify a custom CPU model of "kvm64" then it boots, but of course I get a different virtual CPU from what I get if I don't specify anything.

Following some old suggestions I tried turning off nested kvm, deleting /var/cache/libvirt/qemu/capabilities/*, and restarting libvirtd. That didn't help.

So... anyone got any ideas what's going on? Is there no way to explicitly specify the model that you get by default?

Thanks,
Chris
[libvirt] high outage times for qemu virtio network links during live migration, trying to debug
Hi,

I'm using libvirt (1.2.12) with qemu (2.2.0) in the context of OpenStack. If I live-migrate a guest with virtio network interfaces, I see a ~1200 msec delay in processing the network packets, and several hundred of them get dropped. I understand the dropped packets, but I'm not sure why the delay is there.

I instrumented qemu and libvirt, and the strange thing is that this delay seems to happen before qemu actually starts doing any migration-related work (i.e. before qmp_migrate() is called). Looking at my timestamps, the start of the glitch seems to coincide with libvirtd calling qemuDomainMigratePrepareTunnel3Params(), and the end of the glitch occurs when the migration is complete and we're up and running on the destination.

My question is, why doesn't qemu continue processing virtio packets while the dirty page scanning and memory transfer over the network are proceeding?

Thanks,
Chris

(Please CC me on responses, I'm not subscribed to the lists.)
Re: [libvirt] high outage times for qemu virtio network links during live migration, trying to debug
On 01/26/2016 10:50 AM, Paolo Bonzini wrote: On 26/01/2016 17:41, Chris Friesen wrote: I'm using libvirt (1.2.12) with qemu (2.2.0) in the context of OpenStack. If I live-migrate a guest with virtio network interfaces, I see a ~1200msec delay in processing the network packets, and several hundred of them get dropped. I get the dropped packets, but I'm not sure why the delay is there. I instrumented qemu and libvirt, and the strange thing is that this delay seems to happen before qemu actually starts doing any migration-related work. (i.e. before qmp_migrate() is called) Looking at my timestamps, the start of the glitch seems to coincide with libvirtd calling qemuDomainMigratePrepareTunnel3Params(), and the end of the glitch occurs when the migration is complete and we're up and running on the destination. My question is, why doesn't qemu continue processing virtio packets while the dirty page scanning and memory transfer over the network is proceeding? QEMU (or vhost) _are_ processing virtio traffic, because otherwise you'd have no delay---only dropped packets. Or am I missing something? I have separate timestamps embedded in the packet for when it was sent and when it was echoed back by the target (which is the one being migrated). What I'm seeing is that packets to the guest are being sent every msec, but they get delayed somewhere for over a second on the way to the destination VM while the migration is in progress. Once the migration is over, a bunch of packets get delivered to the app in the guest and are then processed all at once and echoed back to the sender in a big burst (and a bunch of packets are dropped, presumably due to a buffer overflowing somewhere). For comparison, we have a DPDK-based fastpath NIC type that we added (sort of like vhost-net), and it continues to process packets while the dirty page scanning is going on. Only the actual cutover affects it. 
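The instrumentation described here — a send timestamp embedded in each packet and echoed back by the guest — can be sketched in a few lines. The packet layout below is my own illustration, not the format actually used in the thread:

```python
import struct
import time

# Probe layout (illustrative): an 8-byte sequence number followed by
# the send time in microseconds, packed big-endian.
PROBE = struct.Struct("!Qq")

def make_probe(seq):
    """Build a probe payload carrying its own send timestamp."""
    return PROBE.pack(seq, int(time.time() * 1e6))

def probe_delay_us(payload):
    """On receipt of the echoed probe, recover (sequence, round-trip delay)."""
    seq, sent_us = PROBE.unpack(payload[:PROBE.size])
    return seq, int(time.time() * 1e6) - sent_us

# The sender would transmit make_probe(n) over UDP once per msec; the
# guest echoes each payload back unchanged, and probe_delay_us() on the
# sender then exposes any stall like the ~1200msec one described above.
```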
Chris
Re: [libvirt] high outage times for qemu virtio network links during live migration, trying to debug
On 01/26/2016 11:31 AM, Paolo Bonzini wrote: On 26/01/2016 18:21, Chris Friesen wrote: My question is, why doesn't qemu continue processing virtio packets while the dirty page scanning and memory transfer over the network is proceeding? QEMU (or vhost) _are_ processing virtio traffic, because otherwise you'd have no delay---only dropped packets. Or am I missing something? I have separate timestamps embedded in the packet for when it was sent and when it was echoed back by the target (which is the one being migrated). What I'm seeing is that packets to the guest are being sent every msec, but they get delayed somewhere for over a second on the way to the destination VM while the migration is in progress. Once the migration is over, a bunch of packets get delivered to the app in the guest and are then processed all at once and echoed back to the sender in a big burst (and a bunch of packets are dropped, presumably due to a buffer overflowing somewhere). That doesn't exclude a bug somewhere in net/ code. It doesn't pinpoint it to QEMU or vhost-net. In any case, what I would do is to use tracing at all levels (guest kernel, QEMU, host kernel) for packet rx and tx, and find out at which layer the hiccup appears. Is there a straightforward way to trace packet processing in qemu (preferably with millisecond-accurate timestamps)? Chris
Re: [libvirt] [Qemu-devel] high outage times for qemu virtio network links during live migration, trying to debug
On 01/26/2016 10:45 AM, Daniel P. Berrange wrote: On Tue, Jan 26, 2016 at 10:41:12AM -0600, Chris Friesen wrote: My question is, why doesn't qemu continue processing virtio packets while the dirty page scanning and memory transfer over the network is proceeding? The qemuDomainMigratePrepareTunnel3Params() method is responsible for starting the QEMU process on the target host. This should not normally have any impact on host networking connectivity, since the CPUs on that target QEMU wouldn't be running at that point. Perhaps the mere act of starting QEMU and plugging the TAP dev into the network on the target host causes some issue though? E.g. are you using a bridge that is doing STP or something like that? Well, looks like your suspicions were correct. Our fast-path backend was mistakenly sending out a GARP when the backend was initialized as part of creating the qemu process on the target host. Oops. Thanks for your help. Chris
Re: [libvirt] is there a notification when watchdog triggers?
On 11/07/2014 03:14 AM, Eric Blake wrote: On 11/06/2014 11:08 PM, Chris Friesen wrote: The libvirt.org docs say "A virtual hardware watchdog device can be added to the guest via the watchdog element. ... Currently libvirt does not support notification when the watchdog fires. This feature is planned for a future version of libvirt." Is that still accurate? Or does libvirt now support notifications? It looks outdated, as we have VIR_DOMAIN_EVENT_ID_WATCHDOG as the way to receive notification of a watchdog event in the guest: http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainEventWatchdogAction http://libvirt.org/html/libvirt-libvirt-domain.html#virConnectDomainEventWatchdogCallback Which URL had the outdated information, so we can fix it? http://libvirt.org/formatdomain.html#elementsWatchdog It's also in the draft Fedora docs: http://docs.fedoraproject.org/en-US/Fedora_Draft_Documentation/0.1/html/Virtualization_Deployment_and_Administration_Guide/index.html Chris
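For the record, a minimal sketch of consuming that event from Python. The action values are copied from libvirt's virDomainEventWatchdogAction enum so the helper runs without the bindings installed; real registration (shown in the comment) needs an event loop:

```python
# virDomainEventWatchdogAction values, as defined in libvirt.
WATCHDOG_ACTIONS = {
    0: "none",
    1: "pause",
    2: "reset",
    3: "poweroff",
    4: "shutdown",
    5: "debug",
}

def watchdog_cb(conn, dom, action, opaque):
    """Callback invoked when a guest's watchdog fires."""
    return "watchdog fired on %s, action=%s" % (
        dom.name(), WATCHDOG_ACTIONS.get(action, "unknown"))

# With the bindings installed, registration looks roughly like:
#   libvirt.virEventRegisterDefaultImpl()
#   conn = libvirt.openReadOnly("qemu:///system")
#   conn.domainEventRegisterAny(None, libvirt.VIR_DOMAIN_EVENT_ID_WATCHDOG,
#                               watchdog_cb, None)
#   while True:
#       libvirt.virEventRunDefaultImpl()
```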
[libvirt] is there a notification when watchdog triggers?
The libvirt.org docs say "A virtual hardware watchdog device can be added to the guest via the watchdog element. ... Currently libvirt does not support notification when the watchdog fires. This feature is planned for a future version of libvirt." Is that still accurate? Or does libvirt now support notifications? Chris
[libvirt] [bug] python-libvirt vcpus mismatch
I've got a libvirt-created instance where I've been messing with affinity, and now something is strange. I did the following in python:

    import libvirt
    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName('instance-0027')
    dom.vcpus()
    ([(0, 1, 52815000L, 2), (1, 1, 54807000L, 3)],
     [(False, False, True, False), (False, False, True, False)])

I'm totally confused by that 3. It's supposed to represent the physical cpu that virtual cpu 1 is running on. But cpu 3 isn't even in the allowable affinity map for vcpu 1. If I query the data other ways, I get both cpus running on physical cpu 2:

    root@compute-0:~# virsh vcpupin instance-0027
    VCPU: CPU Affinity
    ----------------------------------
       0: 2
       1: 2
    root@compute-0:~# virsh emulatorpin instance-0027
    emulator: CPU Affinity
    ----------------------------------
           *: 2
    root@compute-0:~# taskset -pac 15072
    pid 15072's current affinity list: 2
    pid 15073's current affinity list: 1-3
    pid 15075's current affinity list: 2
    pid 15076's current affinity list: 0

So I'm left with the conclusion that there is something strange going on with libvirt-python. Anyone got any ideas? Chris
[libvirt] [bug] problem with python interface, dom.vcpus() cpu info doesn't match cpu map
Hi, I was playing around with vcpupin and emulatorpin and managed to get into a strange state. From within python I get the following:

    (Pdb) dom = self._lookup_by_name(instance.name)
    (Pdb) dom.vcpus()
    ([(0, 1, 597000L, 2), (1, 1, 458000L, 3)],
     [(False, False, True, False), (False, False, True, False)])

The problem is that the cpuinfo for the second vcpu has it running on physical cpu 3, even though the affinity mask (within python and from taskset) says it can only run on physical cpu 2. The VM in question was originally started up running on physical cpus 2 and 3, then I used the vcpupin/emulatorpin commands to only use physical cpu 2. Anyone got any ideas? I'm using libvirt-1.1.2 and libvirt-python-1.1.2 on a 3.4.82 kernel.

Some more data: I have one VM running, taskset shows affinity as follows:

    root@compute-0:~# taskset -pac 7680
    pid 7680's current affinity list: 2
    pid 7681's current affinity list: 1-3
    pid 7683's current affinity list: 2
    pid 7684's current affinity list: 0

virsh has the following view:

    root@compute-0:~# virsh list
     Id    Name            State
    ----------------------------------
     3     instance-001d   running
    root@compute-0:~# virsh emulatorpin 3
    emulator: CPU Affinity
    ----------------------------------
           *: 2
    root@compute-0:~# virsh vcpupin 3
    VCPU: CPU Affinity
    ----------------------------------
       0: 2
       1: 2

Thanks, Chris
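A small helper makes this kind of mismatch mechanical to spot: it takes the two lists returned by dom.vcpus() and flags any vcpu whose reported current pCPU falls outside its own affinity map. The sample values are copied from the report above:

```python
def find_placement_mismatches(cpu_info, cpu_maps):
    """cpu_info and cpu_maps are the two lists returned by dom.vcpus():
    per-vcpu (number, state, cpuTime, cpu) tuples, and per-vcpu affinity
    maps.  Return (vcpu, reported_pcpu, allowed_pcpus) for each vcpu
    whose reported pCPU is not in its own affinity map."""
    bad = []
    for (vcpu, _state, _cputime, pcpu), allowed in zip(cpu_info, cpu_maps):
        if not allowed[pcpu]:
            ok = tuple(i for i, a in enumerate(allowed) if a)
            bad.append((vcpu, pcpu, ok))
    return bad

# Sample data from the report above (long-int suffixes dropped):
info = [(0, 1, 597000, 2), (1, 1, 458000, 3)]
maps = [(False, False, True, False), (False, False, True, False)]
print(find_placement_mismatches(info, maps))  # → [(1, 3, (2,))]
```

i.e. vcpu 1 claims to be running on pCPU 3 while its map only allows pCPU 2 — exactly the inconsistency described above.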
Re: [libvirt] [Qemu-devel] qemu leaving unix sockets behind after VM is shut down
On 05/06/2014 07:39 AM, Stefan Hajnoczi wrote: On Tue, Apr 01, 2014 at 02:34:58PM -0600, Chris Friesen wrote: When running qemu with something like this

    -device virtio-serial \
    -chardev socket,path=/tmp/foo,server,nowait,id=foo \
    -device virtserialport,chardev=foo,name=host.port.0

the VM starts up as expected and creates a socket at /tmp/foo as expected. However, when I shut down the VM the socket at /tmp/foo is left behind in the filesystem. Basically qemu has leaked a file. With something like OpenStack where we could be creating/destroying many VMs this could end up creating a significant number of files in the specified directory. Has any thought been given to either automatically cleaning up the unix socket in the filesystem when qemu exits, or else supporting the abstract namespace for unix sockets to allow for automatic cleanup? Libvirt has a special case for the monitor socket in its qemuProcessStop() function. Are you using the OpenStack libvirt driver? Perhaps QEMU should support cleanup but first I think we should check the situation with libvirt. Yes, I am in fact using OpenStack/libvirt, and did eventually track down libvirt as the code that was cleaning up the monitor socket. Even so, I think this sort of change would be valid in qemu itself. qemu created the files, so really it should be up to qemu to delete them when it's done with them. They're not usable for anything with qemu not running, so there's no good reason to leave them laying around. Chris
Re: [libvirt] why doesn't libvirt let qemu autostart live-migrated VMs?
On 04/15/2014 02:28 AM, Daniel P. Berrange wrote: On Mon, Apr 14, 2014 at 05:50:07PM -0600, Chris Friesen wrote: Hi, I've been digging through the libvirt code and something that struck me was that it appears that when using qemu libvirt will migrate the instance with autostart disabled, then sit on the source host periodically polling for migration completion, then once the host detects that migration is completed it will tell the destination to start up the VM. Why don't we let the destination autostart the VM once migration is complete? Libvirt has to have a synchronization point on the target machine, so that we can acquire any disk leases associated with the VM before the CPUs are started. Where does that happen? I'm looking at libvirt/qemu, specifically this code path on the target:

    remoteDispatchDomainMigrateFinish3ParamsHelper
      remoteDispatchDomainMigrateFinish3Params
        virDomainMigrateFinish3Params
          qemuDomainMigrateFinish3Params
            qemuMigrationFinish
              qemuProcessStartCPUs
                qemuMonitorStartCPUs

Thanks, Chris
[libvirt] why doesn't libvirt let qemu autostart live-migrated VMs?
Hi, I've been digging through the libvirt code and something that struck me was that it appears that when using qemu libvirt will migrate the instance with autostart disabled, then sit on the source host periodically polling for migration completion, then once the host detects that migration is completed it will tell the destination to start up the VM. Why don't we let the destination autostart the VM once migration is complete? Chris
[libvirt] [bug?] unix sockets opened via chardev devices not being closed on shutdown
I have a case where I'm creating a virtio channel between the host and guest using something like this:

    <channel type='unix'>
      <source mode='bind' path='/path/in/host/instance_name'/>
      <target type='virtio' name='name_in_guest'/>
    </channel>

When qemu is started up this gets created as expected, but when qemu is shut down the unix socket is left in the filesystem. It seems to me that libvirt should be deleting this unix socket the same way that it deletes the monitor socket in qemuProcessStop(). Anyone else trying to delete it is going to be subject to race conditions since they can't know whether or not a virtual machine has been (re)created that wants to use the same socket path. Chris
[libvirt] virsh domstate output when kvm killed vs guest OS panic
Hi, If I kill a libvirt-managed kvm process with kill -9, running "virsh domstate --reason name" gives

    shut off (crashed)

Looking at the code, that corresponds to VIR_DOMAIN_SHUTOFF/VIR_DOMAIN_SHUTOFF_CRASHED. The comment says that VIR_DOMAIN_SHUTOFF_CRASHED corresponds to "domain crashed". Is this supposed to be a crash of the hypervisor, or of the guest OS? If I trigger a panic in the guest, it sits there in the panicked state doing nothing and virsh domstate gives

    running (booted)

So what's the point of VIR_DOMAIN_CRASHED/VIR_DOMAIN_CRASHED_PANICKED? Chris
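For anyone hitting the same question later: as I understand it, VIR_DOMAIN_SHUTOFF_CRASHED reports the qemu process itself dying abnormally, while VIR_DOMAIN_CRASHED/VIR_DOMAIN_CRASHED_PANICKED is only reported when libvirt actually learns of a guest kernel panic, which requires a panic-notifier device in the guest. A decoding sketch — the constants are copied from libvirt's state/reason enums so it runs without the bindings; with them, feed in the pair returned by dom.state():

```python
# Values copied from libvirt's virDomainState and per-state reason enums.
VIR_DOMAIN_RUNNING = 1
VIR_DOMAIN_SHUTOFF = 5
VIR_DOMAIN_CRASHED = 6
VIR_DOMAIN_SHUTOFF_CRASHED = 3    # qemu process died abnormally
VIR_DOMAIN_CRASHED_PANICKED = 1   # guest kernel panic reported to libvirt

def describe_state(state, reason):
    """Interpret the (state, reason) pair returned by dom.state()."""
    if state == VIR_DOMAIN_SHUTOFF and reason == VIR_DOMAIN_SHUTOFF_CRASHED:
        return "hypervisor process died"
    if state == VIR_DOMAIN_CRASHED and reason == VIR_DOMAIN_CRASHED_PANICKED:
        return "guest OS panicked"
    return "state=%d reason=%d" % (state, reason)

# kill -9 of the qemu process, as in the report above:
print(describe_state(VIR_DOMAIN_SHUTOFF, VIR_DOMAIN_SHUTOFF_CRASHED))
# → hypervisor process died
```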