Re: [libvirt] Call to virDomainIsActive hangs forever

2018-06-14 Thread Daniel P . Berrangé
On Thu, Jun 14, 2018 at 04:31:57PM +0300, Mathieu Tarral wrote:
> Hi Daniel,
> 
> Thanks for your help, i was able to run the libvirt daemon from master,
> and after add some debug messages, i understood that the call
> to virDomainIsActive was actually hanging from the client.
> 
> It never reached the daemon.
> 
> Looking back at the call again:
> https://gist.github.com/Wenzel/624cb8c0746c05bd824d2b3c04b6b9f9
> 
> The call is waiting for pthread_sigmask to return.
>
> Looking into the libvirt source code (3.0.0) at the location of the call:
> https://github.com/libvirt/libvirt/blob/v3.0.0/src/rpc/virnetclient.c?utf8=%E2%9C%93#L1659
> 
> Is this supposed to be an instant call, or does it wait for specific
> conditions/event to return ?

Yeah this is very bad - pthread_sigmask should be instantaneous, and I
can't imagine what would make it hang besides severe memory corruption
somewhere :-(


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Call to virDomainIsActive hangs forever

2018-06-14 Thread Mathieu Tarral
Hi Daniel,

Thanks for your help, i was able to run the libvirt daemon from master,
and after add some debug messages, i understood that the call
to virDomainIsActive was actually hanging from the client.

It never reached the daemon.

Looking back at the call again:
https://gist.github.com/Wenzel/624cb8c0746c05bd824d2b3c04b6b9f9

The call is waiting for pthread_sigmask to return.
Looking into the libvirt source code (3.0.0) at the location of the call:
https://github.com/libvirt/libvirt/blob/v3.0.0/src/rpc/virnetclient.c?utf8=%E2%9C%93#L1659

Is this supposed to be an instant call, or does it wait for specific
conditions/event to return ?

Any idea what might go wrong ?

Thank you.

-- 
Mathieu Tarral

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Call to virDomainIsActive hangs forever

2018-04-26 Thread Mathieu Tarral
Hi,

thanks for your answer Daniel.

I think the best way to get more information about this bug is to reproduce it
with libvirt master branch.

However, i'm facing an issue when i try to run my own daemon: there is
a chid process
which is failing in a loop.

I used sudo strace -f ./libvirtd to watch all child processes, and
this is the output:
https://gist.github.com/Wenzel/620327a9b3b7e13454108c2e472eaa77

One of them is continously failing, and i don't clearly understand why.

Does this log file can help you identify the cause maybe ?
I'm here to provide more information if needed.


Thank you !

2018-03-27 16:12 GMT+03:00 Daniel P. Berrangé :
> On Tue, Mar 27, 2018 at 04:04:33PM +0300, Mathieu Tarral wrote:
>> > Are you sure this isa  different thread ? It looks identical to the first
>> > stack trace you give above.
>>
>> Yes, the first one is calling libvirtmod.virDomainGetState
>> and the second one libvirtmod.virDomainIsActive.
>>
>> > Interesting. This is an identical stack trace - so we have 2 python
>> > threads both calling virDomainIsActive(). Nothing wrong with that
>> > per-se - we support multithreaded usage like this.
>>
>> virDomainGetState()
>> and
>> virDomainIsActive()
>
> Opps, yes i see.
>
>> > Can you confirm there are no other threads running libvirt code
>> > in your python app ?  Did you have any thread running the libvirt
>> > event loop perhaps ?
>>
>> Actually i found 2 others threads in Python app calling libvirt.
>>
>> So, as a recap:
>
>
>>
>> (gdb) bt
>> #0  pthread_sigmask (how=how@entry=0, newmask=,
>> newmask@entry=0x7f4ffd7f8d10, oldmask=oldmask@entry=0x7f4ffd7f8c90) at
>> ../sysdeps/unix/sysv/linux/pthread_sigmask.c:50
>
> This is slightly unusual - pthread_sigmask() should complete in a tiny
> fraction of a second, so seeing it in the stack trace is odd unless you
> have very fortuitous timing when taking the stack trace.
>
>> #1  0x7f508e0f52fa in virNetClientIOEventLoop
>> (client=client@entry=0x55a1fde4d2b0,
>> thiscall=thiscall@entry=0x7f4fe005a350) at
>> ../../../src/rpc/virnetclient.c:1659
>> #2  0x7f508e0f5a16 in virNetClientIO (thiscall=0x7f4fe005a350,
>> client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1944
>> #3  virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0,
>> msg=msg@entry=0x7f4fe0031f50, expectReply=expectReply@entry=true,
>> nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116
>> #4  0x7f508e0f7443 in virNetClientSendWithReply
>> (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe0031f50) at
>> ../../../src/rpc/virnetclient.c:2144
>> #5  0x7f508e0f7bf2 in virNetClientProgramCall
>> (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0,
>> serial=serial@entry=105, proc=proc@entry=14, noutfds=noutfds@entry=0,
>> outfds=outfds@entry=0x0, ninfds=0x0,
>> infds=0x0, args_filter=0x7f508e0ecba0
>> , args=0x7f4ffd7f8fe0,
>> ret_filter=0x7f508e0ecbd0 ,
>> ret=0x7f4ffd7f8fd8)
>> at ../../../src/rpc/virnetclientprogram.c:329
>> #6  0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460,
>> flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0,
>> fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=14,
>> args_filter=0x7f508e0ecba0 ,
>> args=0x7f4ffd7f8fe0 "`k.\376\241U", ret_filter=0x7f508e0ecbd0
>> , ret=0x7f4ffd7f8fd8 "",
>> conn=)
>> at ../../../src/remote/remote_driver.c:6636
>> #7  0x7f508e0d7b58 in call (conn=,
>> ret=0x7f4ffd7f8fd8 "", ret_filter=, args=0x7f4ffd7f8fe0
>> "`k.\376\241U", args_filter=, proc_nr=14, flags=0,
>> priv=0x55a1fe5ac460)
>> at ../../../src/remote/remote_driver.c:6658
>> #8  remoteDomainGetXMLDesc (dom=, flags=0) at
>> ../../../src/remote/remote_client_bodies.h:2698
>> #9  0x7f508e08f5c1 in virDomainGetXMLDesc
>> (domain=domain@entry=0x55a1fe212da0, flags=0) at
>> ../../../src/libvirt-domain.c:2592
>> #10 0x7f508e46c8c0 in libvirt_virDomainGetXMLDesc (self=> out>, args=) at build/libvirt.c:1212
>> #11 0x55a1fb4cb6df in PyCFunction_Call () at 
>> ../Objects/methodobject.c:109
>
>
> Aside from the thing mentioned above I don't see any reason why you would
> have bad problems here.
>
> I don't have much more useful to suggest, other than to try using the very
> latest libvirt to see if you get the same behaviour. If not, then it would
> point to a bug in old libvirt, but I don't recall anything that would cause
> this behaviour you see offhand.
>
> Regards,
> Daniel
> --
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



-- 
Mathieu Tarral

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] Call to virDomainIsActive hangs forever

2018-03-27 Thread Daniel P . Berrangé
On Tue, Mar 27, 2018 at 04:04:33PM +0300, Mathieu Tarral wrote:
> > Are you sure this isa  different thread ? It looks identical to the first
> > stack trace you give above.
> 
> Yes, the first one is calling libvirtmod.virDomainGetState
> and the second one libvirtmod.virDomainIsActive.
> 
> > Interesting. This is an identical stack trace - so we have 2 python
> > threads both calling virDomainIsActive(). Nothing wrong with that
> > per-se - we support multithreaded usage like this.
> 
> virDomainGetState()
> and
> virDomainIsActive()

Opps, yes i see.

> > Can you confirm there are no other threads running libvirt code
> > in your python app ?  Did you have any thread running the libvirt
> > event loop perhaps ?
> 
> Actually i found 2 others threads in Python app calling libvirt.
> 
> So, as a recap:


> 
> (gdb) bt
> #0  pthread_sigmask (how=how@entry=0, newmask=,
> newmask@entry=0x7f4ffd7f8d10, oldmask=oldmask@entry=0x7f4ffd7f8c90) at
> ../sysdeps/unix/sysv/linux/pthread_sigmask.c:50

This is slightly unusual - pthread_sigmask() should complete in a tiny
fraction of a second, so seeing it in the stack trace is odd unless you
have very fortuitous timing when taking the stack trace.

> #1  0x7f508e0f52fa in virNetClientIOEventLoop
> (client=client@entry=0x55a1fde4d2b0,
> thiscall=thiscall@entry=0x7f4fe005a350) at
> ../../../src/rpc/virnetclient.c:1659
> #2  0x7f508e0f5a16 in virNetClientIO (thiscall=0x7f4fe005a350,
> client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1944
> #3  virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0,
> msg=msg@entry=0x7f4fe0031f50, expectReply=expectReply@entry=true,
> nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116
> #4  0x7f508e0f7443 in virNetClientSendWithReply
> (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4fe0031f50) at
> ../../../src/rpc/virnetclient.c:2144
> #5  0x7f508e0f7bf2 in virNetClientProgramCall
> (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0,
> serial=serial@entry=105, proc=proc@entry=14, noutfds=noutfds@entry=0,
> outfds=outfds@entry=0x0, ninfds=0x0,
> infds=0x0, args_filter=0x7f508e0ecba0
> , args=0x7f4ffd7f8fe0,
> ret_filter=0x7f508e0ecbd0 ,
> ret=0x7f4ffd7f8fd8)
> at ../../../src/rpc/virnetclientprogram.c:329
> #6  0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460,
> flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0,
> fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=14,
> args_filter=0x7f508e0ecba0 ,
> args=0x7f4ffd7f8fe0 "`k.\376\241U", ret_filter=0x7f508e0ecbd0
> , ret=0x7f4ffd7f8fd8 "",
> conn=)
> at ../../../src/remote/remote_driver.c:6636
> #7  0x7f508e0d7b58 in call (conn=,
> ret=0x7f4ffd7f8fd8 "", ret_filter=, args=0x7f4ffd7f8fe0
> "`k.\376\241U", args_filter=, proc_nr=14, flags=0,
> priv=0x55a1fe5ac460)
> at ../../../src/remote/remote_driver.c:6658
> #8  remoteDomainGetXMLDesc (dom=, flags=0) at
> ../../../src/remote/remote_client_bodies.h:2698
> #9  0x7f508e08f5c1 in virDomainGetXMLDesc
> (domain=domain@entry=0x55a1fe212da0, flags=0) at
> ../../../src/libvirt-domain.c:2592
> #10 0x7f508e46c8c0 in libvirt_virDomainGetXMLDesc (self= out>, args=) at build/libvirt.c:1212
> #11 0x55a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109


Aside from the thing mentioned above I don't see any reason why you would
have bad problems here.

I don't have much more useful to suggest, other than to try using the very
latest libvirt to see if you get the same behaviour. If not, then it would
point to a bug in old libvirt, but I don't recall anything that would cause
this behaviour you see offhand.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Call to virDomainIsActive hangs forever

2018-03-27 Thread Mathieu Tarral
> Are you sure this isa  different thread ? It looks identical to the first
> stack trace you give above.

Yes, the first one is calling libvirtmod.virDomainGetState
and the second one libvirtmod.virDomainIsActive.

> Interesting. This is an identical stack trace - so we have 2 python
> threads both calling virDomainIsActive(). Nothing wrong with that
> per-se - we support multithreaded usage like this.

virDomainGetState()
and
virDomainIsActive()


> Can you confirm there are no other threads running libvirt code
> in your python app ?  Did you have any thread running the libvirt
> event loop perhaps ?

Actually i found 2 others threads in Python app calling libvirt.

So, as a recap:

Thread 1 - calling virDomainGetState
(gdb) py-bt
Traceback (most recent call first):
  
  File "/usr/lib/python3/dist-packages/libvirt.py", line 2551, in state
ret = libvirtmod.virDomainGetState(self._o, flags)

(gdb) bt
pthread_cond_wait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x7f508dfe2b86 in virCondWait (c=c@entry=0x55a1fe420728,
m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154
#2  0x7f508e0f5bbb in virNetClientIO (thiscall=0x55a1fe420710,
client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894
#3  virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0,
msg=msg@entry=0x55a1fdd798f0, expectReply=expectReply@entry=true,
nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116
#4  0x7f508e0f7443 in virNetClientSendWithReply
(client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0) at
../../../src/rpc/virnetclient.c:2144
#5  0x7f508e0f7bf2 in virNetClientProgramCall
(prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0,
serial=serial@entry=108, proc=proc@entry=212, noutfds=noutfds@entry=0,
outfds=outfds@entry=0x0, ninfds=0x0,
infds=0x0, args_filter=0x7f508e0f13e0
, args=0x7ffd1518cfd0,
ret_filter=0x7f508e0f1410 ,
ret=0x7ffd1518cfc8) at ../../../src/rpc/virnetclientprogram.c:329
#6  0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460,
flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0,
fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=212,
args_filter=0x7f508e0f13e0 ,
args=0x7ffd1518cfd0 "`k.\376\241U", ret_filter=0x7f508e0f1410
, ret=0x7ffd1518cfc8 "",
conn=)
at ../../../src/remote/remote_driver.c:6636
#7  0x7f508e0d7f90 in call (conn=,
ret=0x7ffd1518cfc8 "", ret_filter=, args=0x7ffd1518cfd0
"`k.\376\241U", args_filter=, proc_nr=212, flags=0,
priv=0x55a1fe5ac460)
at ../../../src/remote/remote_driver.c:6658
#8  remoteDomainGetState (domain=0x55a1fe212da0, state=0x7ffd1518d0a4,
reason=0x7ffd1518d0a8, flags=0) at
../../../src/remote/remote_driver.c:2458
#9  0x7f508e08e248 in virDomainGetState
(domain=domain@entry=0x55a1fe212da0, state=state@entry=0x7ffd1518d0a4,
reason=reason@entry=0x7ffd1518d0a8, flags=0) at
../../../src/libvirt-domain.c:2495
#10 0x7f508e466f28 in libvirt_virDomainGetState (self=, args=) at libvirt-override.c:2539
#11 0x55a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109

Thread 32 - calling virDomainIsActive
(gdb) py-bt
Traceback (most recent call first):
  
  File "/usr/lib/python3/dist-packages/libvirt.py", line 1340, in isActive
ret = libvirtmod.virDomainIsActive(self._o)

(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x7f508dfe2b86 in virCondWait (c=c@entry=0x7f4ff051fdc8,
m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154
#2  0x7f508e0f5bbb in virNetClientIO (thiscall=0x7f4ff051fdb0,
client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894
#3  virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0,
msg=msg@entry=0x7f4ff051fa10, expectReply=expectReply@entry=true,
nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116
#4  0x7f508e0f7443 in virNetClientSendWithReply
(client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10) at
../../../src/rpc/virnetclient.c:2144
#5  0x7f508e0f7bf2 in virNetClientProgramCall
(prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0,
serial=serial@entry=107, proc=proc@entry=150, noutfds=noutfds@entry=0,
outfds=outfds@entry=0x0, ninfds=0x0,
infds=0x0, args_filter=0x7f508e0efe70
, args=0x7f4fff7fd040,
ret_filter=0x7f508e0efe90 ,
ret=0x7f4fff7fd03c) at ../../../src/rpc/virnetclientprogram.c:329
#6  0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460,
flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0,
fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=150,
args_filter=0x7f508e0efe70 ,
args=0x7f4fff7fd040 "`k.\376\241U", ret_filter=0x7f508e0efe90
, ret=0x7f4fff7fd03c "",
conn=)
at ../../../src/remote/remote_driver.c:6636
#7  0x7f508e0d71cb in call (conn=,
ret=0x7f4fff7fd03c "", ret_filter=, args=0x7f4fff7fd040
"`k.\376\241U", args_filter=, proc_nr=150, flags=0,

Re: [libvirt] Call to virDomainIsActive hangs forever

2018-03-27 Thread Daniel P . Berrangé
On Tue, Mar 27, 2018 at 03:30:14PM +0300, Mathieu Tarral wrote:
> I have installed the following debug symbols:
> 
> - libvirt-daemon-dbgsym_3.0.0-4+deb9u3_amd64.deb
> - libvirt-daemon-system-dbgsym_3.0.0-4+deb9u3_amd64.deb
> - libvirt-clients-dbgsym_3.0.0-4+deb9u3_amd64.deb
> - libvirt0-dbgsym_3.0.0-4+deb9u3_amd64.deb
> - python3-libvirt-dbgsym_3.0.0-2_amd64.deb
> 
> And i was able to reproduce my bug.
> 
> In my Python application, i had 2 threads hanging on a libvirt call.
> (sorry for the verbosity)
> Thread 1:
> 
> (gdb) py-bt
> Traceback (most recent call first):
>0x7f508e6b1278>
>   File "/usr/lib/python3/dist-packages/libvirt.py", line 2551, in state
> ret = libvirtmod.virDomainGetState(self._o, flags)
> 
> (gdb) bt
> pthread_cond_wait@@GLIBC_2.3.2 () at
> ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> #1  0x7f508dfe2b86 in virCondWait (c=c@entry=0x55a1fe420728,
> m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154
> #2  0x7f508e0f5bbb in virNetClientIO (thiscall=0x55a1fe420710,
> client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894
> #3  virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0,
> msg=msg@entry=0x55a1fdd798f0, expectReply=expectReply@entry=true,
> nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116
> #4  0x7f508e0f7443 in virNetClientSendWithReply
> (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0) at
> ../../../src/rpc/virnetclient.c:2144
> #5  0x7f508e0f7bf2 in virNetClientProgramCall
> (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0,
> serial=serial@entry=108, proc=proc@entry=212, noutfds=noutfds@entry=0,
> outfds=outfds@entry=0x0, ninfds=0x0,
> infds=0x0, args_filter=0x7f508e0f13e0
> , args=0x7ffd1518cfd0,
> ret_filter=0x7f508e0f1410 ,
> ret=0x7ffd1518cfc8) at ../../../src/rpc/virnetclientprogram.c:329
> #6  0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460,
> flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0,
> fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=212,
> args_filter=0x7f508e0f13e0 ,
> args=0x7ffd1518cfd0 "`k.\376\241U", ret_filter=0x7f508e0f1410
> , ret=0x7ffd1518cfc8 "",
> conn=)
> at ../../../src/remote/remote_driver.c:6636
> #7  0x7f508e0d7f90 in call (conn=,
> ret=0x7ffd1518cfc8 "", ret_filter=, args=0x7ffd1518cfd0
> "`k.\376\241U", args_filter=, proc_nr=212, flags=0,
> priv=0x55a1fe5ac460)
> at ../../../src/remote/remote_driver.c:6658
> #8  remoteDomainGetState (domain=0x55a1fe212da0, state=0x7ffd1518d0a4,
> reason=0x7ffd1518d0a8, flags=0) at
> ../../../src/remote/remote_driver.c:2458
> #9  0x7f508e08e248 in virDomainGetState
> (domain=domain@entry=0x55a1fe212da0, state=state@entry=0x7ffd1518d0a4,
> reason=reason@entry=0x7ffd1518d0a8, flags=0) at
> ../../../src/libvirt-domain.c:2495
> #10 0x7f508e466f28 in libvirt_virDomainGetState (self= out>, args=) at libvirt-override.c:2539
> #11 0x55a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109


That's pretty much as expected - just remote client waiting for a reply
from the daemon.

> And Thread 2:
> 
> (gdb) py-bt
>0x7f508e6b1278>
>   File "/usr/lib/python3/dist-packages/libvirt.py", line 1340, in isActive
> ret = libvirtmod.virDomainIsActive(self._o)
> 
> (gdb) bt

Are you sure this isa  different thread ? It looks identical to the first
stack trace you give above.

> #0  pthread_cond_wait@@GLIBC_2.3.2 () at
> ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> #1  0x7f508dfe2b86 in virCondWait (c=c@entry=0x7f4ff051fdc8,
> m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154
> #2  0x7f508e0f5bbb in virNetClientIO (thiscall=0x7f4ff051fdb0,
> client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894
> #3  virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0,
> msg=msg@entry=0x7f4ff051fa10, expectReply=expectReply@entry=true,
> nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116
> #4  0x7f508e0f7443 in virNetClientSendWithReply
> (client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10) at
> ../../../src/rpc/virnetclient.c:2144
> #5  0x7f508e0f7bf2 in virNetClientProgramCall
> (prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0,
> serial=serial@entry=107, proc=proc@entry=150, noutfds=noutfds@entry=0,
> outfds=outfds@entry=0x0, ninfds=0x0,
> infds=0x0, args_filter=0x7f508e0efe70
> , args=0x7f4fff7fd040,
> ret_filter=0x7f508e0efe90 ,
> ret=0x7f4fff7fd03c) at ../../../src/rpc/virnetclientprogram.c:329
> #6  0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460,
> flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0,
> fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=150,
> args_filter=0x7f508e0efe70 ,
> args=0x7f4fff7fd040 "`k.\376\241U", ret_filter=0x7f508e0efe90
> , ret=0x7f4fff7fd03c "",
> conn=)
> at ../../../src/remote/remote_driver.c:6636
> #7  0x7f508e0d71cb in 

Re: [libvirt] Call to virDomainIsActive hangs forever

2018-03-27 Thread Mathieu Tarral
I have installed the following debug symbols:

- libvirt-daemon-dbgsym_3.0.0-4+deb9u3_amd64.deb
- libvirt-daemon-system-dbgsym_3.0.0-4+deb9u3_amd64.deb
- libvirt-clients-dbgsym_3.0.0-4+deb9u3_amd64.deb
- libvirt0-dbgsym_3.0.0-4+deb9u3_amd64.deb
- python3-libvirt-dbgsym_3.0.0-2_amd64.deb

And i was able to reproduce my bug.

In my Python application, i had 2 threads hanging on a libvirt call.
(sorry for the verbosity)
Thread 1:

(gdb) py-bt
Traceback (most recent call first):
  
  File "/usr/lib/python3/dist-packages/libvirt.py", line 2551, in state
ret = libvirtmod.virDomainGetState(self._o, flags)

(gdb) bt
pthread_cond_wait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x7f508dfe2b86 in virCondWait (c=c@entry=0x55a1fe420728,
m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154
#2  0x7f508e0f5bbb in virNetClientIO (thiscall=0x55a1fe420710,
client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894
#3  virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0,
msg=msg@entry=0x55a1fdd798f0, expectReply=expectReply@entry=true,
nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116
#4  0x7f508e0f7443 in virNetClientSendWithReply
(client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x55a1fdd798f0) at
../../../src/rpc/virnetclient.c:2144
#5  0x7f508e0f7bf2 in virNetClientProgramCall
(prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0,
serial=serial@entry=108, proc=proc@entry=212, noutfds=noutfds@entry=0,
outfds=outfds@entry=0x0, ninfds=0x0,
infds=0x0, args_filter=0x7f508e0f13e0
, args=0x7ffd1518cfd0,
ret_filter=0x7f508e0f1410 ,
ret=0x7ffd1518cfc8) at ../../../src/rpc/virnetclientprogram.c:329
#6  0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460,
flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0,
fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=212,
args_filter=0x7f508e0f13e0 ,
args=0x7ffd1518cfd0 "`k.\376\241U", ret_filter=0x7f508e0f1410
, ret=0x7ffd1518cfc8 "",
conn=)
at ../../../src/remote/remote_driver.c:6636
#7  0x7f508e0d7f90 in call (conn=,
ret=0x7ffd1518cfc8 "", ret_filter=, args=0x7ffd1518cfd0
"`k.\376\241U", args_filter=, proc_nr=212, flags=0,
priv=0x55a1fe5ac460)
at ../../../src/remote/remote_driver.c:6658
#8  remoteDomainGetState (domain=0x55a1fe212da0, state=0x7ffd1518d0a4,
reason=0x7ffd1518d0a8, flags=0) at
../../../src/remote/remote_driver.c:2458
#9  0x7f508e08e248 in virDomainGetState
(domain=domain@entry=0x55a1fe212da0, state=state@entry=0x7ffd1518d0a4,
reason=reason@entry=0x7ffd1518d0a8, flags=0) at
../../../src/libvirt-domain.c:2495
#10 0x7f508e466f28 in libvirt_virDomainGetState (self=, args=) at libvirt-override.c:2539
#11 0x55a1fb4cb6df in PyCFunction_Call () at ../Objects/methodobject.c:109



And Thread 2:

(gdb) py-bt
  
  File "/usr/lib/python3/dist-packages/libvirt.py", line 1340, in isActive
ret = libvirtmod.virDomainIsActive(self._o)

(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x7f508dfe2b86 in virCondWait (c=c@entry=0x7f4ff051fdc8,
m=m@entry=0x55a1fde4d2c0) at ../../../src/util/virthread.c:154
#2  0x7f508e0f5bbb in virNetClientIO (thiscall=0x7f4ff051fdb0,
client=0x55a1fde4d2b0) at ../../../src/rpc/virnetclient.c:1894
#3  virNetClientSendInternal (client=client@entry=0x55a1fde4d2b0,
msg=msg@entry=0x7f4ff051fa10, expectReply=expectReply@entry=true,
nonBlock=nonBlock@entry=false) at ../../../src/rpc/virnetclient.c:2116
#4  0x7f508e0f7443 in virNetClientSendWithReply
(client=client@entry=0x55a1fde4d2b0, msg=msg@entry=0x7f4ff051fa10) at
../../../src/rpc/virnetclient.c:2144
#5  0x7f508e0f7bf2 in virNetClientProgramCall
(prog=prog@entry=0x55a1fdff0f90, client=client@entry=0x55a1fde4d2b0,
serial=serial@entry=107, proc=proc@entry=150, noutfds=noutfds@entry=0,
outfds=outfds@entry=0x0, ninfds=0x0,
infds=0x0, args_filter=0x7f508e0efe70
, args=0x7f4fff7fd040,
ret_filter=0x7f508e0efe90 ,
ret=0x7f4fff7fd03c) at ../../../src/rpc/virnetclientprogram.c:329
#6  0x7f508e0cdeb4 in callFull (priv=priv@entry=0x55a1fe5ac460,
flags=flags@entry=0, fdin=fdin@entry=0x0, fdinlen=fdinlen@entry=0,
fdout=fdout@entry=0x0, fdoutlen=fdoutlen@entry=0x0, proc_nr=150,
args_filter=0x7f508e0efe70 ,
args=0x7f4fff7fd040 "`k.\376\241U", ret_filter=0x7f508e0efe90
, ret=0x7f4fff7fd03c "",
conn=)
at ../../../src/remote/remote_driver.c:6636
#7  0x7f508e0d71cb in call (conn=,
ret=0x7f4fff7fd03c "", ret_filter=, args=0x7f4fff7fd040
"`k.\376\241U", args_filter=, proc_nr=150, flags=0,
priv=0x55a1fe5ac460)
at ../../../src/remote/remote_driver.c:6658
#8  remoteDomainIsActive (dom=0x55a1fe212da0) at
../../../src/remote/remote_client_bodies.h:2842
#9  0x7f508e09df03 in virDomainIsActive
(dom=dom@entry=0x55a1fe212da0) at ../../../src/libvirt-domain.c:8467
#10 0x7f508e46cbd0 in libvirt_virDomainIsActive (self=, args=) at build/libvirt.c:1288
#11 

Re: [libvirt] Call to virDomainIsActive hangs forever

2018-03-26 Thread Daniel P . Berrangé
On Fri, Mar 23, 2018 at 01:24:46PM +0100, Erik Skultety wrote:
> On Thu, Mar 22, 2018 at 06:10:49PM +0200, Mathieu Tarral wrote:
> > Hi !
> >
> > I'm submitting my messages on this mailing list to request a bit of
> > help on a case that I have
> > where a Python application makes a call to virDomainIsActive, and the
> > call never returns.
> >
> > I have tried to investigate, but as there are no debug symbols for
> > libvirt on Debian Stretch,
> > i can only have the following GDB backtrace:
> >
> > (gdb) bt
> > #0  pthread_cond_wait@@GLIBC_2.3.2 () at
> > ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> > #1  0x7f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0
> > #2  0x7f4902808bab in ?? () from /usr/lib/libvirt.so.0
> > #3  0x7f490280a433 in virNetClientSendWithReply () from
> > /usr/lib/libvirt.so.0
> > #4  0x7f490280abe2 in virNetClientProgramCall () from 
> > /usr/lib/libvirt.so.0
> > #5  0x7f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0
> > #6  0x7f49027ea1bb in ?? () from /usr/lib/libvirt.so.0
> > #7  0x7f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0
> > #8  0x7f4902b7fbd0 in libvirt_virDomainIsActive () from
> > /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so
> > #9  0x558eeec696df in PyCFunction_Call () at 
> > ../Objects/methodobject.c:109
> >
> > The libvirt driver used is QEMU, and i have a specific monitoring in
> > place using virtual machine introspection:
> > https://github.com/KVM-VMI/kvm-vmi
> >
> > Now this specific monitoring somehow triggers this bug, and at this
> > point, i don't know if
> > it's a corner case in the libvirt QEMU driver or not.
> > That's why i would like to have your lights on this.
> >
> > libvirt version: 3.0.0-4
> >
> > -> Could you tell me where i should look in the code ?
> 
> You're probably looking at virLogManagerDomain* methods located in
> src/logging/log_manager.c and the wait call is issued from virNetClientIO.
> 
> > -> Do you have more information about this virCondWait ? Which
> > condition is it waiting for ?
> > -> How can i get the symbols without having the recompile libvirt and
> > install it system wide, erasing the binaries installed by the package
> > ?
> 
> To be honest, I think it's always worth debugging a custom built binary from
> sources, since the debug symbols shipped via distro package are most likely
> generated with optimizations which makes any kind of interactive debugging
> painful. The problem is that you're going to built v3.0.0 tag on a new distro,
> since new GCCs will complain about a lot of stuff, I looked at the code, tried
> a few things, but honestly I didn't see a path where you could get to
> virNetClientProgrammCall from virDomainIsActive (since I don't know what the
> original call was), so unless you post a full backtrace, we can't help you 
> much
> here.

That's easy - virDomainIsActive() calls into the driver APIs. This is client
side trace, so will get into the remote driver client, which will then call
virNetClientProgramCall.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Call to virDomainIsActive hangs forever

2018-03-23 Thread Michal Privoznik
On 03/22/2018 05:10 PM, Mathieu Tarral wrote:
> Hi !
> 
> I'm submitting my messages on this mailing list to request a bit of
> help on a case that I have
> where a Python application makes a call to virDomainIsActive, and the
> call never returns.
> 
> I have tried to investigate, but as there are no debug symbols for
> libvirt on Debian Stretch,
> i can only have the following GDB backtrace:
> 
> (gdb) bt
> #0  pthread_cond_wait@@GLIBC_2.3.2 () at
> ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> #1  0x7f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0
> #2  0x7f4902808bab in ?? () from /usr/lib/libvirt.so.0
> #3  0x7f490280a433 in virNetClientSendWithReply () from
> /usr/lib/libvirt.so.0
> #4  0x7f490280abe2 in virNetClientProgramCall () from 
> /usr/lib/libvirt.so.0
> #5  0x7f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0
> #6  0x7f49027ea1bb in ?? () from /usr/lib/libvirt.so.0
> #7  0x7f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0
> #8  0x7f4902b7fbd0 in libvirt_virDomainIsActive () from
> /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so
> #9  0x558eeec696df in PyCFunction_Call () at ../Objects/methodobject.c:109

This is just a client waiting for server's reply. You want to attach gdb
to corresponding daemon and run 't a a bt' (thread apply all backtrace).
Libvirtd is multithreaded and one of the threads will be executing your API.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Call to virDomainIsActive hangs forever

2018-03-23 Thread Erik Skultety
On Thu, Mar 22, 2018 at 06:10:49PM +0200, Mathieu Tarral wrote:
> Hi !
>
> I'm submitting my messages on this mailing list to request a bit of
> help on a case that I have
> where a Python application makes a call to virDomainIsActive, and the
> call never returns.
>
> I have tried to investigate, but as there are no debug symbols for
> libvirt on Debian Stretch,
> i can only have the following GDB backtrace:
>
> (gdb) bt
> #0  pthread_cond_wait@@GLIBC_2.3.2 () at
> ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> #1  0x7f49026f5b76 in virCondWait () from /usr/lib/libvirt.so.0
> #2  0x7f4902808bab in ?? () from /usr/lib/libvirt.so.0
> #3  0x7f490280a433 in virNetClientSendWithReply () from
> /usr/lib/libvirt.so.0
> #4  0x7f490280abe2 in virNetClientProgramCall () from 
> /usr/lib/libvirt.so.0
> #5  0x7f49027e0ea4 in ?? () from /usr/lib/libvirt.so.0
> #6  0x7f49027ea1bb in ?? () from /usr/lib/libvirt.so.0
> #7  0x7f49027b0ef3 in virDomainIsActive () from /usr/lib/libvirt.so.0
> #8  0x7f4902b7fbd0 in libvirt_virDomainIsActive () from
> /usr/lib/python3/dist-packages/libvirtmod.cpython-35m-x86_64-linux-gnu.so
> #9  0x558eeec696df in PyCFunction_Call () at ../Objects/methodobject.c:109
>
> The libvirt driver used is QEMU, and i have a specific monitoring in
> place using virtual machine introspection:
> https://github.com/KVM-VMI/kvm-vmi
>
> Now this specific monitoring somehow triggers this bug, and at this
> point, i don't know if
> it's a corner case in the libvirt QEMU driver or not.
> That's why i would like to have your lights on this.
>
> libvirt version: 3.0.0-4
>
> -> Could you tell me where i should look in the code ?

You're probably looking at virLogManagerDomain* methods located in
src/logging/log_manager.c and the wait call is issued from virNetClientIO.

> -> Do you have more information about this virCondWait ? Which
> condition is it waiting for ?
> -> How can i get the symbols without having the recompile libvirt and
> install it system wide, erasing the binaries installed by the package
> ?

To be honest, I think it's always worth debugging a custom built binary from
sources, since the debug symbols shipped via distro package are most likely
generated with optimizations which makes any kind of interactive debugging
painful. The problem is that you're going to built v3.0.0 tag on a new distro,
since new GCCs will complain about a lot of stuff, I looked at the code, tried
a few things, but honestly I didn't see a path where you could get to
virNetClientProgrammCall from virDomainIsActive (since I don't know what the
original call was), so unless you post a full backtrace, we can't help you much
here.
One more thing, yes, you'd have to compile libvirt from git, but you don't have
to install it, in fact, I'm always running upstream daemon, but haven't
installed it system-wide, on the contrary, I've got a distro-provided version
installed in the system which is disabled most of the time.

Regards,
Erik

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Call to virDomainIsActive hangs forever

2018-03-22 Thread Pino Toscano
Hi,

On Thursday, 22 March 2018 17:10:49 CET Mathieu Tarral wrote:
> I have tried to investigate, but as there are no debug symbols for
> libvirt on Debian Stretch,

Yes there are: Stretch started the migration to automatically generated
packages with debug symbols -- see:
https://wiki.debian.org/AutomaticDebugPackages

After adding the right repository, for a $pkg binary is a $pkg-dbgsym
package, if the automatic generation was enabled for the source of
$pkg (which is the case for libvirt).

-- 
Pino Toscano

signature.asc
Description: This is a digitally signed message part.
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list