Re: [libvirt] Call to virDomainIsActive hangs forever
On Thu, Jun 14, 2018 at 04:31:57PM +0300, Mathieu Tarral wrote:
> Hi Daniel,
>
> Thanks for your help, i was able to run the libvirt daemon from master,
> and after add some debug messages, i understood that the call
> to virDomainIsActive was actually hanging from the client.
>
> It never reached the daemon.
>
> Looking back at the call again:
> https://gist.github.com/Wenzel/624cb8c0746c05bd824d2b3c04b6b9f9
>
> The call is waiting for pthread_sigmask to return.
>
> Looking into the libvirt source code (3.0.0) at the location of the call:
> https://github.com/libvirt/libvirt/blob/v3.0.0/src/rpc/virnetclient.c?utf8=%E2%9C%93#L1659
>
> Is this supposed to be an instant call, or does it wait for specific
> conditions/event to return ?

Yeah this is very bad - pthread_sigmask should be instantaneous, and I
can't imagine what would make it hang besides severe memory corruption
somewhere :-(

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
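For readers following along: the backtrace elsewhere in this thread shows pthread_sigmask called with how=0, which is SIG_BLOCK on Linux. A common shape for that kind of code is "block a few signals, wait for I/O, restore the old mask". The Python sketch below illustrates that shape with the standard library's signal.pthread_sigmask. It is only an illustration, not the libvirt code: the signal set and the helper names (with_signals_blocked, current_mask) are assumptions made for this example.

```python
import signal

# Signals held back while waiting on I/O (an assumption for this sketch).
BLOCKED = {signal.SIGWINCH, signal.SIGCHLD, signal.SIGPIPE}


def with_signals_blocked(wait_for_io):
    """Block BLOCKED, run wait_for_io(), then restore the previous mask."""
    # pthread_sigmask returns the mask that was in effect before the call.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, BLOCKED)
    try:
        return wait_for_io()
    finally:
        # Put the calling thread's mask back exactly as it was.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def current_mask():
    """Return the calling thread's current signal mask."""
    # SIG_BLOCK with an empty set changes nothing and returns the mask.
    return signal.pthread_sigmask(signal.SIG_BLOCK, set())
```

Both mask calls only update per-thread state and do not wait for any event, which is why a thread that appears stuck inside one is so surprising.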
Re: [libvirt] Call to virDomainIsActive hangs forever
Hi Daniel,

Thanks for your help. I was able to run the libvirt daemon from master,
and after adding some debug messages, I understood that the call
to virDomainIsActive was actually hanging in the client.

It never reached the daemon.

Looking back at the call again:
https://gist.github.com/Wenzel/624cb8c0746c05bd824d2b3c04b6b9f9

The call is waiting for pthread_sigmask to return.

Looking into the libvirt source code (3.0.0) at the location of the call:
https://github.com/libvirt/libvirt/blob/v3.0.0/src/rpc/virnetclient.c?utf8=%E2%9C%93#L1659

Is this supposed to be an instant call, or does it wait for specific
conditions/events to return?

Any idea what might be going wrong?

Thank you.
--
Mathieu Tarral
Re: [libvirt] Call to virDomainIsActive hangs forever
Hi, thanks for your answer Daniel.

I think the best way to get more information about this bug is to
reproduce it with the libvirt master branch.

However, I'm facing an issue when I try to run my own daemon: there is a
child process which is failing in a loop.

I used sudo strace -f ./libvirtd to watch all child processes, and this
is the output:
https://gist.github.com/Wenzel/620327a9b3b7e13454108c2e472eaa77

One of them is continuously failing, and I don't clearly understand why.
Can this log file help you identify the cause?

I'm here to provide more information if needed.

Thank you!

2018-03-27 16:12 GMT+03:00 Daniel P. Berrangé:
> On Tue, Mar 27, 2018 at 04:04:33PM +0300, Mathieu Tarral wrote:
>> > Are you sure this isa different thread ? It looks identical to the first
>> > stack trace you give above.
>>
>> Yes, the first one is calling libvirtmod.virDomainGetState
>> and the second one libvirtmod.virDomainIsActive.
>>
>> > Interesting. This is an identical stack trace - so we have 2 python
>> > threads both calling virDomainIsActive(). Nothing wrong with that
>> > per-se - we support multithreaded usage like this.
>>
>> virDomainGetState()
>> and
>> virDomainIsActive()
>
> Opps, yes i see.
>
>> > Can you confirm there are no other threads running libvirt code
>> > in your python app ? Did you have any thread running the libvirt
>> > event loop perhaps ?
>>
>> Actually i found 2 others threads in Python app calling libvirt.
>>
>> So, as a recap:
>
>
>>
>> (gdb) bt
>> #0 pthread_sigmask (how=how@entry=0, newmask=,
>> newmask@entry=0x7f4ffd7f8d10, oldmask=oldmask@entry=0x7f4ffd7f8c90) at
>> ../sysdeps/unix/sysv/linux/pthread_sigmask.c:50
>
> This is slightly unusual - pthread_sigmask() should complete in a tiny
> fraction of a second, so seeing it in the stack trace is odd unless you
> have very fortuitous timing when taking the stack trace.
> >> #1 0x7f508e0f52fa in virNetClientIOEventLoop >> (client=client@entry=0x55a1fde4d2b0, >> thiscall=thiscall@entry=0x7f4fe005a350) at >> ../../../src/rpc/virnetclient.c:1659 >> #2 0x7f508e0f5a16 in virNetClientIO (thiscall=0x7f4fe005a350, >> client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1944 >> #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, >> msg=msg@entry=0x7f4fe0031f50, expectReply=expectReply@entry=true, >> nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 >> #4 0x7f508e0f7443 in virNetClientSendWithReply >> (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe0031f50) at >> ../../../src/rpc/virnetclient.c:2144 >> #5 0x7f508e0f7bf2 in virNetClientProgramCall >> (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, >> serial=serial@entry=105, proc=proc@entry=14, noutfds=noutfds@entry=0, >> outfds=outfds@entry=0x0, ninfds=0x0, >> infds=0x0, args_filter=0x7f508e0ecba0 >> , args=0x7f4ffd7f8fe0, >> ret_filter=0x7f508e0ecbd0 , >> ret=0x7f4ffd7f8fd8) >> at ../../../src/rpc/virnetclientprogram.c:329 >> #6 0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, >> flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, >> fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=14, >> args_filter=0x7f508e0ecba0 , >> args=0x7f4ffd7f8fe0 "`k.\376\241U", ret_filter=0x7f508e0ecbd0 >> , ret=0x7f4ffd7f8fd8 "", >> conn=) >> at ../../../src/remote/remote_driver.c:6636 >> #7 0x7f508e0d7b58 in call (conn=, >> ret=0x7f4ffd7f8fd8 "", ret_filter=, args=0x7f4ffd7f8fe0 >> "`k.\376\241U", args_filter=, proc_nr=14, flags=0, >> priv=0x55a1fe5ac460) >> at ../../../src/remote/remote_driver.c:6658 >> #8 remoteDomainGetXMLDesc (dom=, flags=0) at >> ../../../src/remote/remote_client_bodies.h:2698 >> #9 0x7f508e08f5c1 in virDomainGetXMLDesc >> (domain=domain@entry=0x55a1fe212da0, flags=0) at >> ../../../src/libvirt-domain.c:2592 >> #10 0x7f508e46c8c0 in libvirt_virDomainGetXMLDesc (self=> out>, args=) at 
build/libvirt.c:1212
>> #11 0x55a1fb4cb6df in PyCFunction_Call () at
>> ../Objects/methodobject.c:109
>
>
> Aside from the thing mentioned above I don't see any reason why you would
> have bad problems here.
>
> I don't have much more useful to suggest, other than to try using the very
> latest libvirt to see if you get the same behaviour. If not, then it would
> point to a bug in old libvirt, but I don't recall anything that would cause
> this behaviour you see offhand.
>
> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

--
Mathieu Tarral
Re: [libvirt] Call to virDomainIsActive hangs forever
On Tue, Mar 27, 2018 at 04:04:33PM +0300, Mathieu Tarral wrote:
> > Are you sure this isa different thread ? It looks identical to the first
> > stack trace you give above.
>
> Yes, the first one is calling libvirtmod.virDomainGetState
> and the second one libvirtmod.virDomainIsActive.
>
> > Interesting. This is an identical stack trace - so we have 2 python
> > threads both calling virDomainIsActive(). Nothing wrong with that
> > per-se - we support multithreaded usage like this.
>
> virDomainGetState()
> and
> virDomainIsActive()

Oops, yes I see.

> > Can you confirm there are no other threads running libvirt code
> > in your python app ? Did you have any thread running the libvirt
> > event loop perhaps ?
>
> Actually i found 2 others threads in Python app calling libvirt.
>
> So, as a recap:
>
> (gdb) bt
> #0 pthread_sigmask (how=how@entry=0, newmask=,
> newmask@entry=0x7f4ffd7f8d10, oldmask=oldmask@entry=0x7f4ffd7f8c90) at
> ../sysdeps/unix/sysv/linux/pthread_sigmask.c:50

This is slightly unusual - pthread_sigmask() should complete in a tiny
fraction of a second, so seeing it in the stack trace is odd unless you
have very fortuitous timing when taking the stack trace.

> #1 0x7f508e0f52fa in virNetClientIOEventLoop > (client=client@entry=0x55a1fde4d2b0, > thiscall=thiscall@entry=0x7f4fe005a350) at > ../../../src/rpc/virnetclient.c:1659 > #2 0x7f508e0f5a16 in virNetClientIO (thiscall=0x7f4fe005a350, > client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1944 > #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, > msg=msg@entry=0x7f4fe0031f50, expectReply=expectReply@entry=true, > nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 > #4 0x7f508e0f7443 in virNetClientSendWithReply > (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe0031f50) at > ../../../src/rpc/virnetclient.c:2144 > #5 0x7f508e0f7bf2 in virNetClientProgramCall > (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, > serial=serial@entry=105, proc=proc@entry=14, noutfds=noutfds@entry=0, > outfds=outfds@entry=0x0, ninfds=0x0, > infds=0x0, args_filter=0x7f508e0ecba0 > , args=0x7f4ffd7f8fe0, > ret_filter=0x7f508e0ecbd0 , > ret=0x7f4ffd7f8fd8) > at ../../../src/rpc/virnetclientprogram.c:329 > #6 0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, > flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, > fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=14, > args_filter=0x7f508e0ecba0 , > args=0x7f4ffd7f8fe0 "`k.\376\241U", ret_filter=0x7f508e0ecbd0 > , ret=0x7f4ffd7f8fd8 "", > conn=) > at ../../../src/remote/remote_driver.c:6636 > #7 0x7f508e0d7b58 in call (conn=, > ret=0x7f4ffd7f8fd8 "", ret_filter=, args=0x7f4ffd7f8fe0 > "`k.\376\241U", args_filter=, proc_nr=14, flags=0, > priv=0x55a1fe5ac460) > at ../../../src/remote/remote_driver.c:6658 > #8 remoteDomainGetXMLDesc (dom=, flags=0) at > ../../../src/remote/remote_client_bodies.h:2698 > #9 0x7f508e08f5c1 in virDomainGetXMLDesc > (domain=domain@entry=0x55a1fe212da0, flags=0) at > ../../../src/libvirt-domain.c:2592 > #10 0x7f508e46c8c0 in libvirt_virDomainGetXMLDesc (self= out>, args=) at build/libvirt.c:1212 > #11 0x55a1fb4cb6df 
in PyCFunction_Call () at ../Objects/methodobject.c:109

Aside from the thing mentioned above I don't see any reason why you would
have bad problems here.

I don't have much more useful to suggest, other than to try using the very
latest libvirt to see if you get the same behaviour. If not, then it would
point to a bug in old libvirt, but I don't recall anything that would cause
this behaviour you see offhand.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
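On the question of which Python threads are inside libvirt calls: besides gdb's py-bt, the interpreter can list the stack of every thread itself. The sketch below uses sys._current_frames() from the standard library. The helper names (python_thread_stacks, threads_in) are hypothetical, and it assumes some other thread is still able to run Python code, which depends on the hung call having released the GIL.

```python
import sys
import threading
import traceback


def python_thread_stacks():
    """Return {thread name: formatted stack} for every live Python thread."""
    # Map thread identifiers to the names given at thread creation.
    names = {t.ident: t.name for t in threading.enumerate()}
    stacks = {}
    for ident, frame in sys._current_frames().items():
        name = names.get(ident, "thread-%d" % ident)
        stacks[name] = "".join(traceback.format_stack(frame))
    return stacks


def threads_in(stacks, needle="libvirt"):
    """Names of threads whose stack text mentions needle (e.g. libvirt.py)."""
    return sorted(name for name, text in stacks.items() if needle in text)
```

Calling threads_in(python_thread_stacks()) from a debug thread or a signal handler would give a list to compare against the gdb thread list.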
Re: [libvirt] Call to virDomainIsActive hangs forever
> Are you sure this isa different thread ? It looks identical to the first > stack trace you give above. Yes, the first one is calling libvirtmod.virDomainGetState and the second one libvirtmod.virDomainIsActive. > Interesting. This is an identical stack trace - so we have 2 python > threads both calling virDomainIsActive(). Nothing wrong with that > per-se - we support multithreaded usage like this. virDomainGetState() and virDomainIsActive() > Can you confirm there are no other threads running libvirt code > in your python app ? Did you have any thread running the libvirt > event loop perhaps ? Actually i found 2 others threads in Python app calling libvirt. So, as a recap: Thread 1 - calling virDomainGetState (gdb) py-bt Traceback (most recent call first): File "/usr/lib/python3/dist-packages/libvirt.py", line 2551, in state ret = libvirtmod.virDomainGetState(self._o, flags) (gdb) bt pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x7f508dfe2b86 in virCondWait (c=c@entry=0x55a1fe420728, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x7f508e0f5bbb in virNetClientIO (thiscall=0x55a1fe420710, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x7f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0) at ../../../src/rpc/virnetclient.c:2144 #5 0x7f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=108, proc=proc@entry=212, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0f13e0 , args=0x7ffd1518cfd0, ret_filter=0x7f508e0f1410 , ret=0x7ffd1518cfc8) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x7f508e0cdeb4 in 
callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=212, args_filter=0x7f508e0f13e0 , args=0x7ffd1518cfd0 "`k.\376\241U", ret_filter=0x7f508e0f1410 , ret=0x7ffd1518cfc8 "", conn=) at ../../../src/remote/remote_driver.c:6636 #7 0x7f508e0d7f90 in call (conn=, ret=0x7ffd1518cfc8 "", ret_filter=, args=0x7ffd1518cfd0 "`k.\376\241U", args_filter=, proc_nr=212, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainGetState (domain=0x55a1fe212da0, state=0x7ffd1518d0a4, reason=0x7ffd1518d0a8, flags=0) at ../../../src/remote/remote_driver.c:2458 #9 0x7f508e08e248 in virDomainGetState (domain=domain@entry=0x55a1fe212da0, state=state@entry=0x7ffd1518d0a4, reason=reason@entry=0x7ffd1518d0a8, flags=0) at ../../../src/libvirt-domain.c:2495 #10 0x7f508e466f28 in libvirt_virDomainGetState (self=, args=) at libvirt-override.c:2539 #11 0x55a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109 Thread 32 - calling virDomainIsActive (gdb) py-bt Traceback (most recent call first): File "/usr/lib/python3/dist-packages/libvirt.py", line 1340, in isActive ret = libvirtmod.virDomainIsActive(self._o) (gdb) bt #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x7f508dfe2b86 in virCondWait (c=c@entry=0x7f4ff051fdc8, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x7f508e0f5bbb in virNetClientIO (thiscall=0x7f4ff051fdb0, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x7f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10) at ../../../src/rpc/virnetclient.c:2144 #5 0x7f508e0f7bf2 in 
virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=107, proc=proc@entry=150, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0efe70 , args=0x7f4fff7fd040, ret_filter=0x7f508e0efe90 , ret=0x7f4fff7fd03c) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=150, args_filter=0x7f508e0efe70 , args=0x7f4fff7fd040 "`k.\376\241U", ret_filter=0x7f508e0efe90 , ret=0x7f4fff7fd03c "", conn=) at ../../../src/remote/remote_driver.c:6636 #7 0x7f508e0d71cb in call (conn=, ret=0x7f4fff7fd03c "", ret_filter=, args=0x7f4fff7fd040 "`k.\376\241U", args_filter=, proc_nr=150, flags=0,
Re: [libvirt] Call to virDomainIsActive hangs forever
On Tue, Mar 27, 2018 at 03:30:14PM +0300, Mathieu Tarral wrote: > I have installed the following debug symbols: > > - libvirt-daemon-dbgsym_3.0.0-4+deb9u3_amd64.deb > - libvirt-daemon-system-dbgsym_3.0.0-4+deb9u3_amd64.deb > - libvirt-clients-dbgsym_3.0.0-4+deb9u3_amd64.deb > - libvirt0-dbgsym_3.0.0-4+deb9u3_amd64.deb > - python3-libvirt-dbgsym_3.0.0-2_amd64.deb > > And i was able to reproduce my bug. > > In my Python application, i had 2 threads hanging on a libvirt call. > (sorry for the verbosity) > Thread 1: > > (gdb) py-bt > Traceback (most recent call first): >0x7f508e6b1278> > File "/usr/lib/python3/dist-packages/libvirt.py", line 2551, in state > ret = libvirtmod.virDomainGetState(self._o, flags) > > (gdb) bt > pthread_cond_wait@@GLIBC_2.3.2 () at > ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 > #1 0x7f508dfe2b86 in virCondWait (c=c@entry=0x55a1fe420728, > m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 > #2 0x7f508e0f5bbb in virNetClientIO (thiscall=0x55a1fe420710, > client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 > #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, > msg=msg@entry=0x55a1fdd798f0, expectReply=expectReply@entry=true, > nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 > #4 0x7f508e0f7443 in virNetClientSendWithReply > (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0) at > ../../../src/rpc/virnetclient.c:2144 > #5 0x7f508e0f7bf2 in virNetClientProgramCall > (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, > serial=serial@entry=108, proc=proc@entry=212, noutfds=noutfds@entry=0, > outfds=outfds@entry=0x0, ninfds=0x0, > infds=0x0, args_filter=0x7f508e0f13e0 > , args=0x7ffd1518cfd0, > ret_filter=0x7f508e0f1410 , > ret=0x7ffd1518cfc8) at ../../../src/rpc/virnetclientprogram.c:329 > #6 0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, > flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, > 
fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=212, > args_filter=0x7f508e0f13e0 , > args=0x7ffd1518cfd0 "`k.\376\241U", ret_filter=0x7f508e0f1410 > , ret=0x7ffd1518cfc8 "", > conn=) > at ../../../src/remote/remote_driver.c:6636 > #7 0x7f508e0d7f90 in call (conn=, > ret=0x7ffd1518cfc8 "", ret_filter=, args=0x7ffd1518cfd0 > "`k.\376\241U", args_filter=, proc_nr=212, flags=0, > priv=0x55a1fe5ac460) > at ../../../src/remote/remote_driver.c:6658 > #8 remoteDomainGetState (domain=0x55a1fe212da0, state=0x7ffd1518d0a4, > reason=0x7ffd1518d0a8, flags=0) at > ../../../src/remote/remote_driver.c:2458 > #9 0x7f508e08e248 in virDomainGetState > (domain=domain@entry=0x55a1fe212da0, state=state@entry=0x7ffd1518d0a4, > reason=reason@entry=0x7ffd1518d0a8, flags=0) at > ../../../src/libvirt-domain.c:2495 > #10 0x7f508e466f28 in libvirt_virDomainGetState (self= out>, args=) at libvirt-override.c:2539 > #11 0x55a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109 That's pretty much as expected - just remote client waiting for a reply from the daemon. > And Thread 2: > > (gdb) py-bt >0x7f508e6b1278> > File "/usr/lib/python3/dist-packages/libvirt.py", line 1340, in isActive > ret = libvirtmod.virDomainIsActive(self._o) > > (gdb) bt Are you sure this isa different thread ? It looks identical to the first stack trace you give above. 
> #0 pthread_cond_wait@@GLIBC_2.3.2 () at > ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 > #1 0x7f508dfe2b86 in virCondWait (c=c@entry=0x7f4ff051fdc8, > m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 > #2 0x7f508e0f5bbb in virNetClientIO (thiscall=0x7f4ff051fdb0, > client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 > #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, > msg=msg@entry=0x7f4ff051fa10, expectReply=expectReply@entry=true, > nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 > #4 0x7f508e0f7443 in virNetClientSendWithReply > (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10) at > ../../../src/rpc/virnetclient.c:2144 > #5 0x7f508e0f7bf2 in virNetClientProgramCall > (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, > serial=serial@entry=107, proc=proc@entry=150, noutfds=noutfds@entry=0, > outfds=outfds@entry=0x0, ninfds=0x0, > infds=0x0, args_filter=0x7f508e0efe70 > , args=0x7f4fff7fd040, > ret_filter=0x7f508e0efe90 , > ret=0x7f4fff7fd03c) at ../../../src/rpc/virnetclientprogram.c:329 > #6 0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, > flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, > fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=150, > args_filter=0x7f508e0efe70 , > args=0x7f4fff7fd040 "`k.\376\241U", ret_filter=0x7f508e0efe90 > , ret=0x7f4fff7fd03c "", > conn=) > at ../../../src/remote/remote_driver.c:6636 > #7 0x7f508e0d71cb in
Re: [libvirt] Call to virDomainIsActive hangs forever
I have installed the following debug symbols: - libvirt-daemon-dbgsym_3.0.0-4+deb9u3_amd64.deb - libvirt-daemon-system-dbgsym_3.0.0-4+deb9u3_amd64.deb - libvirt-clients-dbgsym_3.0.0-4+deb9u3_amd64.deb - libvirt0-dbgsym_3.0.0-4+deb9u3_amd64.deb - python3-libvirt-dbgsym_3.0.0-2_amd64.deb And i was able to reproduce my bug. In my Python application, i had 2 threads hanging on a libvirt call. (sorry for the verbosity) Thread 1: (gdb) py-bt Traceback (most recent call first): File "/usr/lib/python3/dist-packages/libvirt.py", line 2551, in state ret = libvirtmod.virDomainGetState(self._o, flags) (gdb) bt pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x7f508dfe2b86 in virCondWait (c=c@entry=0x55a1fe420728, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x7f508e0f5bbb in virNetClientIO (thiscall=0x55a1fe420710, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x7f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0) at ../../../src/rpc/virnetclient.c:2144 #5 0x7f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=108, proc=proc@entry=212, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0f13e0 , args=0x7ffd1518cfd0, ret_filter=0x7f508e0f1410 , ret=0x7ffd1518cfc8) at ../../../src/rpc/virnetclientprogram.c:329 #6 0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=212, args_filter=0x7f508e0f13e0 , args=0x7ffd1518cfd0 "`k.\376\241U", ret_filter=0x7f508e0f1410 , ret=0x7ffd1518cfc8 "", conn=) 
at ../../../src/remote/remote_driver.c:6636 #7 0x7f508e0d7f90 in call (conn=, ret=0x7ffd1518cfc8 "", ret_filter=, args=0x7ffd1518cfd0 "`k.\376\241U", args_filter=, proc_nr=212, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainGetState (domain=0x55a1fe212da0, state=0x7ffd1518d0a4, reason=0x7ffd1518d0a8, flags=0) at ../../../src/remote/remote_driver.c:2458 #9 0x7f508e08e248 in virDomainGetState (domain=domain@entry=0x55a1fe212da0, state=state@entry=0x7ffd1518d0a4, reason=reason@entry=0x7ffd1518d0a8, flags=0) at ../../../src/libvirt-domain.c:2495 #10 0x7f508e466f28 in libvirt_virDomainGetState (self=, args=) at libvirt-override.c:2539 #11 0x55a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109 And Thread 2: (gdb) py-bt File "/usr/lib/python3/dist-packages/libvirt.py", line 1340, in isActive ret = libvirtmod.virDomainIsActive(self._o) (gdb) bt #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x7f508dfe2b86 in virCondWait (c=c@entry=0x7f4ff051fdc8, m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154 #2 0x7f508e0f5bbb in virNetClientIO (thiscall=0x7f4ff051fdb0, client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894 #3 virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10, expectReply=expectReply@entry=true, nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116 #4 0x7f508e0f7443 in virNetClientSendWithReply (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10) at ../../../src/rpc/virnetclient.c:2144 #5 0x7f508e0f7bf2 in virNetClientProgramCall (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0, serial=serial@entry=107, proc=proc@entry=150, noutfds=noutfds@entry=0, outfds=outfds@entry=0x0, ninfds=0x0, infds=0x0, args_filter=0x7f508e0efe70 , args=0x7f4fff7fd040, ret_filter=0x7f508e0efe90 , ret=0x7f4fff7fd03c) at ../../../src/rpc/virnetclientprogram.c:329 #6 
0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460, flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0, fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=150, args_filter=0x7f508e0efe70 , args=0x7f4fff7fd040 "`k.\376\241U", ret_filter=0x7f508e0efe90 , ret=0x7f4fff7fd03c "", conn=) at ../../../src/remote/remote_driver.c:6636 #7 0x7f508e0d71cb in call (conn=, ret=0x7f4fff7fd03c "", ret_filter=, args=0x7f4fff7fd040 "`k.\376\241U", args_filter=, proc_nr=150, flags=0, priv=0x55a1fe5ac460) at ../../../src/remote/remote_driver.c:6658 #8 remoteDomainIsActive (dom=0x55a1fe212da0) at ../../../src/remote/remote_client_bodies.h:2842 #9 0x7f508e09df03 in virDomainIsActive (dom=dom@entry=0x55a1fe212da0) at ../../../src/libvirt-domain.c:8467 #10 0x7f508e46cbd0 in libvirt_virDomainIsActive (self=, args=) at build/libvirt.c:1288 #11
Re: [libvirt] Call to virDomainIsActive hangs forever
On Fri, Mar 23, 2018 at 01:24:46PM +0100, Erik Skultety wrote:
> On Thu, Mar 22, 2018 at 06:10:49PM +0200, Mathieu Tarral wrote:
> > Hi !
> >
> > I'm submitting my messages on this mailing list to request a bit of
> > help on a case that I have
> > where a Python application makes a call to virDomainIsActive, and the
> > call never returns.
> >
> > I have tried to investigate, but as there are no debug symbols for
> > libvirt on Debian Stretch,
> > i can only have the following GDB backtrace:
> >
> > (gdb) bt
> > #0 pthread_cond_wait@@GLIBC_2.3.2 () at
> > ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> > #1 0x7f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0
> > #2 0x7f4902808bab in ?? () from /usr/lib/libvirt.so.0
> > #3 0x7f490280a433 in virNetClientSendWithReply () from
> > /usr/lib/libvirt.so.0
> > #4 0x7f490280abe2 in virNetClientProgramCall () from
> > /usr/lib/libvirt.so.0
> > #5 0x7f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0
> > #6 0x7f49027ea1bb in ?? () from /usr/lib/libvirt.so.0
> > #7 0x7f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0
> > #8 0x7f4902b7fbd0 in libvirt_virDomainIsActive () from
> > /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so
> > #9 0x558eeec696df in PyCFunction_Call () at
> > ../Objects/methodobject.c:109
> >
> > The libvirt driver used is QEMU, and i have a specific monitoring in
> > place using virtual machine introspection:
> > https://github.com/KVM-VMI/kvm-vmi
> >
> > Now this specific monitoring somehow triggers this bug, and at this
> > point, i don't know if
> > it's a corner case in the libvirt QEMU driver or not.
> > That's why i would like to have your lights on this.
> >
> > libvirt version: 3.0.0-4
> >
> > -> Could you tell me where i should look in the code ?
>
> You're probably looking at virLogManagerDomain* methods located in
> src/logging/log_manager.c and the wait call is issued from virNetClientIO.
>
> > -> Do you have more information about this virCondWait ? Which
> > condition is it waiting for ?
> > -> How can i get the symbols without having the recompile libvirt and
> > install it system wide, erasing the binaries installed by the package
> > ?
>
> To be honest, I think it's always worth debugging a custom built binary from
> sources, since the debug symbols shipped via distro package are most likely
> generated with optimizations which makes any kind of interactive debugging
> painful. The problem is that you're going to built v3.0.0 tag on a new distro,
> since new GCCs will complain about a lot of stuff, I looked at the code, tried
> a few things, but honestly I didn't see a path where you could get to
> virNetClientProgrammCall from virDomainIsActive (since I don't know what the
> original call was), so unless you post a full backtrace, we can't help you
> much
> here.

That's easy - virDomainIsActive() calls into the driver APIs. This is
client side trace, so will get into the remote driver client, which will
then call virNetClientProgramCall.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Re: [libvirt] Call to virDomainIsActive hangs forever
On 03/22/2018 05:10 PM, Mathieu Tarral wrote:
> Hi !
>
> I'm submitting my messages on this mailing list to request a bit of
> help on a case that I have
> where a Python application makes a call to virDomainIsActive, and the
> call never returns.
>
> I have tried to investigate, but as there are no debug symbols for
> libvirt on Debian Stretch,
> i can only have the following GDB backtrace:
>
> (gdb) bt
> #0 pthread_cond_wait@@GLIBC_2.3.2 () at
> ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> #1 0x7f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0
> #2 0x7f4902808bab in ?? () from /usr/lib/libvirt.so.0
> #3 0x7f490280a433 in virNetClientSendWithReply () from
> /usr/lib/libvirt.so.0
> #4 0x7f490280abe2 in virNetClientProgramCall () from
> /usr/lib/libvirt.so.0
> #5 0x7f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0
> #6 0x7f49027ea1bb in ?? () from /usr/lib/libvirt.so.0
> #7 0x7f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0
> #8 0x7f4902b7fbd0 in libvirt_virDomainIsActive () from
> /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so
> #9 0x558eeec696df in PyCFunction_Call () at ../Objects/methodobject.c:109

This is just a client waiting for the server's reply. You want to attach
gdb to the corresponding daemon and run 't a a bt' (thread apply all
backtrace). Libvirtd is multithreaded and one of the threads will be
executing your API.

Michal
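The gdb step described above can be scripted so the dump is easy to repeat while reproducing the hang. This is a minimal sketch, assuming gdb is installed and the caller is allowed to attach to the process (usually root). The helper names gdb_all_threads_cmd and dump_all_threads are hypothetical, and the daemon's PID has to be supplied by the caller.

```python
import subprocess


def gdb_all_threads_cmd(pid):
    """Build a gdb command line that prints a backtrace of every thread."""
    return [
        "gdb", "--batch",                    # run the commands, then exit
        "-p", str(int(pid)),                 # attach to the running process
        "-ex", "thread apply all backtrace", # long form of 't a a bt'
    ]


def dump_all_threads(pid):
    """Run gdb against pid and return its output as text."""
    result = subprocess.run(
        gdb_all_threads_cmd(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```

For example, print(dump_all_threads(pid)) with the libvirtd PID would give the per-thread backtraces to post to the list.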
Re: [libvirt] Call to virDomainIsActive hangs forever
On Thu, Mar 22, 2018 at 06:10:49PM +0200, Mathieu Tarral wrote:
> Hi !
>
> I'm submitting my messages on this mailing list to request a bit of
> help on a case that I have
> where a Python application makes a call to virDomainIsActive, and the
> call never returns.
>
> I have tried to investigate, but as there are no debug symbols for
> libvirt on Debian Stretch,
> i can only have the following GDB backtrace:
>
> (gdb) bt
> #0 pthread_cond_wait@@GLIBC_2.3.2 () at
> ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> #1 0x7f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0
> #2 0x7f4902808bab in ?? () from /usr/lib/libvirt.so.0
> #3 0x7f490280a433 in virNetClientSendWithReply () from
> /usr/lib/libvirt.so.0
> #4 0x7f490280abe2 in virNetClientProgramCall () from
> /usr/lib/libvirt.so.0
> #5 0x7f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0
> #6 0x7f49027ea1bb in ?? () from /usr/lib/libvirt.so.0
> #7 0x7f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0
> #8 0x7f4902b7fbd0 in libvirt_virDomainIsActive () from
> /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so
> #9 0x558eeec696df in PyCFunction_Call () at ../Objects/methodobject.c:109
>
> The libvirt driver used is QEMU, and i have a specific monitoring in
> place using virtual machine introspection:
> https://github.com/KVM-VMI/kvm-vmi
>
> Now this specific monitoring somehow triggers this bug, and at this
> point, i don't know if
> it's a corner case in the libvirt QEMU driver or not.
> That's why i would like to have your lights on this.
>
> libvirt version: 3.0.0-4
>
> -> Could you tell me where i should look in the code ?

You're probably looking at virLogManagerDomain* methods located in
src/logging/log_manager.c and the wait call is issued from virNetClientIO.

> -> Do you have more information about this virCondWait ? Which
> condition is it waiting for ?
> -> How can i get the symbols without having the recompile libvirt and
> install it system wide, erasing the binaries installed by the package
> ?

To be honest, I think it's always worth debugging a custom-built binary
from sources, since the debug symbols shipped via distro packages are most
likely generated with optimizations, which makes any kind of interactive
debugging painful. The problem is that you're going to build the v3.0.0
tag on a new distro, and new GCCs will complain about a lot of stuff. I
looked at the code and tried a few things, but honestly I didn't see a
path where you could get to virNetClientProgramCall from virDomainIsActive
(since I don't know what the original call was), so unless you post a full
backtrace, we can't help you much here.

One more thing: yes, you'd have to compile libvirt from git, but you don't
have to install it. In fact, I'm always running an upstream daemon without
having installed it system-wide; the distro-provided version is installed
on the system but disabled most of the time.

Regards,
Erik
Re: [libvirt] Call to virDomainIsActive hangs forever
Hi,

On Thursday, 22 March 2018 17:10:49 CET Mathieu Tarral wrote:
> I have tried to investigate, but as there are no debug symbols for
> libvirt on Debian Stretch,

Yes there are: Stretch started the migration to automatically generated
packages with debug symbols -- see:
https://wiki.debian.org/AutomaticDebugPackages

After adding the right repository, for a $pkg binary there is a
$pkg-dbgsym package, if the automatic generation was enabled for the
source of $pkg (which is the case for libvirt).

--
Pino Toscano