On Fri, Aug 09, 2013 at 03:05:22PM +0100, Andrei Mikhailovsky wrote:
> I can confirm that I am having similar issues with Ubuntu VM guests using fio
> with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally I see hung tasks,
> occasionally the guest VM stops responding without leaving anything in the
> logs, and sometimes I see a kernel panic on the console. I typically set the
> runtime of the fio test to 60 minutes and the guest tends to stop responding
> after about 10-30 mins.
>
> I am on Ubuntu 12.04 with the 3.5 kernel backport and using ceph 0.61.7 with
> qemu 1.5.0 and libvirt 1.0.2

Josh,
In addition to the Ceph logs you can also use QEMU tracing with the
following events enabled:
virtio_blk_handle_write
virtio_blk_handle_read
virtio_blk_rw_complete

See docs/tracing.txt in the QEMU source tree for details on usage.

Inspecting the trace output will let you observe the I/O request
submission/completion from the virtio-blk device perspective.  You'll be
able to see whether requests are never being completed in some cases.
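
If the trace gets long, a small script can do the pairing for you.  The
following is only a rough sketch, not part of QEMU: it assumes each trace
line contains the event name followed by the request pointer written as
either "req 0x..." or "req=0x...", which depends on the trace backend and
QEMU version, so adjust the regular expression to match your output.  The
function name find_pending is made up for this example.

```python
import re
import sys

# ASSUMPTION: every relevant trace line has the event name followed later by
# the request pointer as "req 0x..." or "req=0x...".  Check your own trace
# output and adapt this pattern if it differs.
EVENT_RE = re.compile(
    r"(virtio_blk_handle_write|virtio_blk_handle_read|virtio_blk_rw_complete)"
    r".*?\breq[ =](0x[0-9a-fA-F]+)"
)


def find_pending(lines):
    """Return {req_pointer: submission_line} for requests with no completion.

    Hypothetical helper for illustration.  Lines are processed in order, so a
    pointer that is freed and later reused for a new request is handled.
    """
    pending = {}
    for line in lines:
        m = EVENT_RE.search(line)
        if not m:
            continue  # unrelated trace event or unparsable line
        event, req = m.group(1), m.group(2).lower()
        if event == "virtio_blk_rw_complete":
            pending.pop(req, None)  # request finished, forget it
        else:
            pending[req] = line.strip()  # request submitted, remember it
    return pending


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_pending.py <formatted-trace-file>
    with open(sys.argv[1]) as trace_file:
        for submission in find_pending(trace_file).values():
            print("never completed:", submission)
```

Run it on the formatted trace text after the guest has hung.  Requests
that were still legitimately in flight when the trace ended will also be
listed, so look for entries that stay pending across repeated runs.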

This bug looks like a corner case or race condition, since most requests
complete just fine.  The problem is that each request that never
completes keeps holding its descriptors, so eventually the virtio-blk
device becomes unusable when it runs out of them (it has 128).  Even
before that limit is reached, the guest may become unusable due to the
hung I/O requests.

Stefan
