Re: [Qemu-devel] [bug] busy-loop in send_all()

Amit Shah Mon, 26 May 2014 21:43:21 -0700

Hi,

Also CCing Gerd.


On (Fri) 23 May 2014 [13:55:40], Stefan Hajnoczi wrote:
> On Thu, May 15, 2014 at 11:23:54AM -0600, Chris Friesen wrote:
> > I've run into a situation that seems like a bug.  I'm using qemu 1.4.2 (with
> > additional patches) from within openstack.
> > 
> > I'm using virtio-serial-pci to provide a channel between the guest and host.
> > 
> > On occasion when doing suspend/resume I run into a case where the main qemu
> > thread ends up chewing 100% of a cpu.
> > 
> > I attached strace to the thread and it showed qemu just spitting messages:
> > 
> > write(35, "HRBT\0\1\0\3d<\230k\0\0\0\0\0\0\1\330\0\0\0\0enqueue\0"..., 472)
> > = -1 EAGAIN (Resource temporarily unavailable)
> > write(35, "HRBT\0\1\0\3d<\230k\0\0\0\0\0\0\1\330\0\0\0\0enqueue\0"..., 472)
> > = -1 EAGAIN (Resource temporarily unavailable)
> > write(35, "HRBT\0\1\0\3d<\230k\0\0\0\0\0\0\1\330\0\0\0\0enqueue\0"..., 472)
> > = -1 EAGAIN (Resource temporarily unavailable)
> > write(35, "HRBT\0\1\0\3d<\230k\0\0\0\0\0\0\1\330\0\0\0\0enqueue\0"..., 472)
> > = -1 EAGAIN (Resource temporarily unavailable)
> > 
> > File descriptor 35 is the unix socket corresponding to the virtio-serial
> > port.
> > 
> > I broke in with gdb and got a backtrace showing it was in send_all().
> > Looking at the implementation of send_all(), the core loop looks like:
> > 
> >      while (len > 0) {
> >          ret = write(fd, buf, len);
> >          if (ret < 0) {
> >              if (errno != EINTR && errno != EAGAIN)
> >                  return -1;
> >          } else if (ret == 0) {
> >              break;
> >          } else {
> >              buf += ret;
> >              len -= ret;
> >          }
> >      }
> > 
> > 
> > So if we get EAGAIN, we'll just immediately retry.
> > 
> > I'm not sure where the unix socket would get opened, but I'm assuming it's
> > set as non-blocking?  And by default /proc/sys/net/unix/max_dgram_qlen is
> > set to 10.
> > 
> > So if the other end of that unix socket is connected but isn't actually
> > paying attention to the messages then the first 10 messages will get
> > buffered but after that we'll end up with qemu spinning forever in a
> > busy-loop trying to send a message into a full buffer.
> > 
> > This seems less than ideal.  Either we should block, or else we should
> > discard the data.  And I don't think discarding the data makes sense.

Chardev flow control was added in 1.5.0.  Can you retry with that
release and let us know whether it still behaves the same way?

http://wiki.qemu.org/Features/ChardevFlowControl

Thanks,

                Amit
