Hi,

I'm running qemu 1.4.2 (planning to move to 1.7 soon).  I have two
instances of qemu, each with a virtio-serial channel exposed on the host via
a unix stream socket.

I've got an app that tries to connect() to both of them in turn.  The connect()
to the first socket fails with EAGAIN, the second one succeeds, and all
subsequent retries on the first also fail.  Here's an strace of the sequence:

socket(PF_FILE, SOCK_STREAM, 0)         = 6
fcntl(6, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(6, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
connect(6, {sa_family=AF_FILE, sun_path="/var/lib/libvirt/qemu/cgcs.messaging.instance-00000007.sock"}, 61) = -1 EAGAIN (Resource temporarily unavailable)
clock_gettime(CLOCK_MONOTONIC, {158877, 262941763}) = 0
socket(PF_FILE, SOCK_STREAM, 0)         = 7
fcntl(7, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(7, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
connect(7, {sa_family=AF_FILE, sun_path="/var/lib/libvirt/qemu/cgcs.messaging.instance-00000008.sock"}, 61) = 0
getdents(5, /* 0 entries */, 32768)     = 0
close(5)                                = 0
clock_gettime(CLOCK_MONOTONIC, {158877, 265359109}) = 0
poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=7, events=POLLIN}], 3, 997) = 0 (Timeout)
clock_gettime(CLOCK_MONOTONIC, {158878, 265914614}) = 0
connect(6, {sa_family=AF_FILE, sun_path="/var/lib/libvirt/qemu/cgcs.messaging.instance-00000007.sock"}, 61) = -1 EAGAIN (Resource temporarily unavailable)


With the app not running, netstat seems to show that something is trying to 
connect to the socket in question:

root@compute-0:~# netstat -ap unix |grep messaging
unix  2      [ ACC ]     STREAM     LISTENING     1109818  17379/qemu-system-x /var/lib/libvirt/qemu/cgcs.messaging.instance-00000007.sock
unix  2      [ ACC ]     STREAM     LISTENING     1110051  17425/qemu-system-x /var/lib/libvirt/qemu/cgcs.messaging.instance-00000008.sock
unix  2      [ ]         STREAM     CONNECTING    0        -                   /var/lib/libvirt/qemu/cgcs.messaging.instance-00000007.sock
unix  2      [ ]         STREAM     CONNECTING    0        -                   /var/lib/libvirt/qemu/cgcs.messaging.instance-00000007.sock
unix  2      [ ]         STREAM     CONNECTED     1109848  17379/qemu-system-x /var/lib/libvirt/qemu/cgcs.messaging.instance-00000007.sock


Here's /proc/net/unix for completeness:

root@compute-0:~/host-guest-comm# grep -a messaging /proc/net/unix
ffff880045c35540: 00000002 00000000 00010000 0001 01 1109818 /var/lib/libvirt/qemu/cgcs.messaging.instance-00000007.sock
ffff8800576b8a80: 00000002 00000000 00010000 0001 01 1110051 /var/lib/libvirt/qemu/cgcs.messaging.instance-00000008.sock
ffff880045e2f040: 00000002 00000000 00000000 0001 02     0 /var/lib/libvirt/qemu/cgcs.messaging.instance-00000007.sock
ffff88004bc5ea80: 00000002 00000000 00000000 0001 02     0 /var/lib/libvirt/qemu/cgcs.messaging.instance-00000007.sock
ffff880045e2f540: 00000002 00000000 00000000 0001 03 1109848 /var/lib/libvirt/qemu/cgcs.messaging.instance-00000007.sock
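
For reading the /proc/net/unix entries, here is a small decoding sketch in Python. It assumes the usual Linux column order (Num, RefCount, Protocol, Flags, Type, St, Inode, Path), one entry per line, with St holding the socket state (01 unconnected, 02 connecting, 03 connected) and flag bit 0x10000 marking a listening socket. The function name parse_unix_line is hypothetical, chosen only for illustration.

```python
# Socket states as they appear in the "St" column of /proc/net/unix.
SS_STATES = {1: "UNCONNECTED", 2: "CONNECTING", 3: "CONNECTED", 4: "DISCONNECTING"}
SO_ACCEPTCON = 0x10000  # flag bit set on sockets that have called listen()


def parse_unix_line(line):
    """Decode one /proc/net/unix entry (hypothetical helper, one entry per line)."""
    fields = line.split(None, 7)
    num, refcount, protocol, flags, sock_type, state, inode = fields[:7]
    path = fields[7].strip() if len(fields) > 7 else ""
    return {
        "listening": bool(int(flags, 16) & SO_ACCEPTCON),
        "state": SS_STATES.get(int(state, 16), "UNKNOWN"),
        "inode": int(inode),
        "path": path,
    }


if __name__ == "__main__":
    with open("/proc/net/unix") as f:
        next(f)  # skip the header line
        for entry in f:
            info = parse_unix_line(entry)
            if info["state"] == "CONNECTING":
                print(info)
```

Read this way, the two entries with state 02 and inode 0 are connections still queued on the instance-00000007 listener that have not been accepted yet, which would fit the backlog theory.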



The crazy thing is that I can't figure out what could be causing the 
CONNECTED/CONNECTING sockets.  There are no background processes of the 
connecting app running, no zombie processes, no forked children, etc.

Just to make things more interesting, I successfully ran this application 
several times (connecting to both sockets) before this behaviour started 
happening.  I was running it under strace and just killed it with ctrl-C.

I asked on the Linux kernel netdev list, and they suggested it might be due to
the listen() backlog of 1, combined with the listener somehow missing a
connection attempt on the socket and thus never calling accept() for it.

Anyone got any ideas?   Please CC me since I'm not subscribed to the list.

Thanks,
Chris
