Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
On Mon, 5 Aug 2013, Mike Dawson wrote:
> Josh,
>
> Logs are uploaded to cephdrop with the file name
> mikedawson-rbd-qemu-deadlock.
>
> - At about 2013-08-05 19:46 or 19:47, we hit the issue; traffic went to 0
> - At about 2013-08-05 19:53:51, we ran a 'virsh screenshot'
>
> Environment is:
> - Ceph 0.61.7 (client is co-mingled with three OSDs)
> - rbd cache = true and cache=writeback
> - qemu 1.4.0 (1.4.0+dfsg-1expubuntu4)
> - Ubuntu Raring with 3.8.0-25-generic
>
> This issue is reproducible in my environment, and I'm willing to run any
> wip branch you need. What else can I provide to help?

This looks like a different issue than Oliver's. I see one anomaly in the
log, where an rbd io completion is triggered a second time for no apparent
reason. I opened a separate bug

  http://tracker.ceph.com/issues/5955

and pushed wip-5955 that will hopefully shine some light on the weird
behavior I saw. Can you reproduce with this branch and

  debug objectcacher = 20
  debug ms = 1
  debug rbd = 20
  debug finisher = 20

Thanks!
sage

> Thanks,
> Mike Dawson
>
> On 8/5/2013 3:48 AM, Stefan Hajnoczi wrote:
>> On Sun, Aug 04, 2013 at 03:36:52PM +0200, Oliver Francke wrote:
>>> Am 02.08.2013 um 23:47 schrieb Mike Dawson mike.daw...@cloudapt.com:
>>> We can un-wedge the guest by opening a NoVNC session or running a
>>> 'virsh screenshot' command. After that, the guest resumes and runs as
>>> expected. At that point we can examine the guest. Each time we'll see:
>>
>> If virsh screenshot works, this confirms that QEMU itself is still
>> responding. Its main loop cannot be blocked, since it was able to
>> process the screendump command.
>>
>> This supports Josh's theory that a callback is not being invoked. The
>> virtio-blk I/O request would be left in a pending state.
>>
>> Now here is where the behavior varies between configurations:
>>
>> On a Windows guest with 1 vCPU, you may see the symptom that the guest
>> no longer responds to ping.
>>
>> On a Linux guest with multiple vCPUs, you may see the hung task message
>> from the guest kernel because other vCPUs are still making progress.
>> Just the vCPU that issued the I/O request, and whose task is in
>> UNINTERRUPTIBLE state, would really be stuck.
>>
>> Basically, the symptoms depend not just on how QEMU is behaving but
>> also on the guest kernel and how many vCPUs you have configured.
>>
>> I think this explains why both of the problems you are observing,
>> Oliver and Mike, are a result of the same bug. At least I hope they
>> are :).
>>
>> Stefan

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
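Sage's four debug settings would normally go into a client-side ceph.conf so librbd picks them up when qemu starts. A minimal sketch — the [client] section placement and the demo file path are assumptions of mine, not from the thread:

```shell
# Sketch: write Sage's requested debug levels into a client-side ceph.conf.
# /tmp/ceph-debug-demo.conf stands in for /etc/ceph/ceph.conf; the [client]
# section is the usual (assumed) place for librbd client options.
CONF=/tmp/ceph-debug-demo.conf
: > "$CONF"
cat >> "$CONF" <<'EOF'
[client]
    debug objectcacher = 20
    debug ms = 1
    debug rbd = 20
    debug finisher = 20
EOF
grep -c 'debug' "$CONF"
```

After restarting the qemu process, the librbd client logs at these levels; remember to revert them afterwards, since level 20 logging is very verbose.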
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
Hi Oliver,

(Posted this on the bug too, but:) Your last log revealed a bug in the
librados aio flush. A fix is pushed to wip-librados-aio-flush (bobtail)
and wip-5919 (master); can you retest please (with caching off again)?

Thanks!
sage

On Fri, 9 Aug 2013, Oliver Francke wrote:
> Hi Josh,
>
> just opened http://tracker.ceph.com/issues/5919 with all collected
> information incl. debug-log. Hope it helps,
>
> Oliver.
>
> On 08/08/2013 07:01 PM, Josh Durgin wrote:
>> On 08/08/2013 05:40 AM, Oliver Francke wrote:
>>> Hi Josh,
>>>
>>> I have a session logged with
>>> debug_ms=1:debug_rbd=20:debug_objectcacher=30, as you requested from
>>> Mike, even if I think we have another story here, anyway.
>>>
>>> Host-kernel is 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
>>> 3.2.0-51-amd... Do you want me to open a ticket for that stuff? I
>>> have about 5MB of compressed logfile waiting for you ;)
>>
>> Yes, that'd be great. If you could include the time when you saw the
>> guest hang, that'd be ideal. I'm not sure if this is one or two bugs,
>> but it seems likely it's a bug in rbd and not qemu.
>>
>> Thanks!
>> Josh
>>
>>> Thnx in advance,
>>> Oliver.
>>>
>>> On 08/05/2013 09:48 AM, Stefan Hajnoczi wrote:
>>>> [...]
>
> --
> Oliver Francke
> filoo GmbH
> Moltkestraße 25a
> 0 Gütersloh
> HRB4355 AG Gütersloh
> Geschäftsführer: J.Rehpöhler | C.Kunz
>
> Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
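Oliver's colon-separated settings are the form librbd options take when embedded directly in a qemu rbd drive spec. A minimal sketch of building such a -drive argument — the pool/image name "rbd/vm-disk" and the cache/if flags are illustrative placeholders of mine, not values from the thread:

```shell
# Build a -drive argument with Oliver's librbd debug options inline.
# "rbd/vm-disk", cache=none, and if=virtio are assumed placeholders.
DEBUG_OPTS="debug_ms=1:debug_rbd=20:debug_objectcacher=30"
DRIVE="file=rbd:rbd/vm-disk:${DEBUG_OPTS},cache=none,if=virtio"
echo "$DRIVE" > /tmp/drive-arg.txt
cat /tmp/drive-arg.txt
# The guest would then be started with something like:
#   qemu-system-x86_64 -drive "$DRIVE" ...
```

qemu's rbd driver passes the colon-separated key=value pairs through to librados configuration, so any ceph.conf option can be set this way per-disk.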
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
On 08/09/2013 08:03 AM, Stefan Hajnoczi wrote:
> On Fri, Aug 09, 2013 at 03:05:22PM +0100, Andrei Mikhailovsky wrote:
>> I can confirm that I am having similar issues with ubuntu vm guests
>> using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally I see
>> hung tasks, occasionally the guest vm stops responding without leaving
>> anything in the logs, and sometimes I see a kernel panic on the
>> console. I typically set the runtime of the fio test to 60 minutes,
>> and it tends to stop responding after about 10-30 mins. I am on ubuntu
>> 12.04 with the 3.5 backport kernel, using ceph 0.61.7 with qemu 1.5.0
>> and libvirt 1.0.2.

Oliver's logs show one aio_flush() never getting completed, which means
it's an issue with aio_flush in librados when rbd caching isn't used.
Mike's log is from a qemu without aio_flush(), and with caching turned on,
and shows all flushes completing quickly, so it's a separate bug.

> Josh,
>
> In addition to the Ceph logs, you can also use QEMU tracing with the
> following events enabled:
>
>   virtio_blk_handle_write
>   virtio_blk_handle_read
>   virtio_blk_rw_complete
>
> See docs/tracing.txt for details on usage. Inspecting the trace output
> will let you observe I/O request submission/completion from the
> virtio-blk device's perspective. You'll be able to see whether requests
> are never being completed in some cases.

Thanks for the info. That may be the best way to check what's happening
when caching is enabled. Mike, could you recompile qemu with tracing
enabled and get a trace of the hang you were seeing, in addition to the
ceph logs?

> This bug seems like a corner case or race condition, since most
> requests seem to complete just fine. The problem is that eventually the
> virtio-blk device becomes unusable when it runs out of descriptors (it
> has 128). And before that limit is reached, the guest may become
> unusable due to the hung I/O requests.

It seems only one request, from an important kernel thread, hung in
Oliver's case, but it's good to be aware of the descriptor limit.

Josh
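For reference, Andrei's reproducer parameters (bs=4k, direct=1, numjobs=4, iodepth=16, 60-minute run) can be written as an fio jobfile. The target filename, ioengine, and randwrite pattern below are my assumptions — the thread doesn't say which he used:

```shell
# Write Andrei's fio parameters as a jobfile; run inside the guest.
# filename, ioengine, and rw=randwrite are assumed for illustration.
cat > /tmp/hangtest.fio <<'EOF'
[hangtest]
filename=/dev/vdb
bs=4k
direct=1
numjobs=4
iodepth=16
ioengine=libaio
rw=randwrite
runtime=3600
time_based
EOF
# Then:  fio /tmp/hangtest.fio
```

With direct=1 each write bypasses the guest page cache, so 4 jobs at queue depth 16 keep up to 64 requests in flight against virtio-blk — close to the 128-descriptor ceiling Stefan mentions.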
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
On Fri, Aug 09, 2013 at 03:05:22PM +0100, Andrei Mikhailovsky wrote:
> I can confirm that I am having similar issues with ubuntu vm guests
> using fio with bs=4k direct=1 numjobs=4 iodepth=16. [...]

Josh,

In addition to the Ceph logs, you can also use QEMU tracing with the
following events enabled:

  virtio_blk_handle_write
  virtio_blk_handle_read
  virtio_blk_rw_complete

See docs/tracing.txt for details on usage. Inspecting the trace output
will let you observe I/O request submission/completion from the virtio-blk
device's perspective. You'll be able to see whether requests are never
being completed in some cases.

This bug seems like a corner case or race condition, since most requests
seem to complete just fine. The problem is that eventually the virtio-blk
device becomes unusable when it runs out of descriptors (it has 128). And
before that limit is reached, the guest may become unusable due to the
hung I/O requests.

Stefan
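The events Stefan lists can be captured with qemu's simple trace backend. A sketch, assuming a from-source qemu 1.x build (the configure flag and helper-script paths follow docs/tracing.txt of that era):

```shell
# Write the virtio-blk trace events Stefan names into an events file.
trace_events=/tmp/virtio-blk-events
cat > "$trace_events" <<'EOF'
virtio_blk_handle_write
virtio_blk_handle_read
virtio_blk_rw_complete
EOF
# Assumed build/run/inspect steps (see docs/tracing.txt in the qemu tree):
#   ./configure --enable-trace-backend=simple && make
#   qemu-system-x86_64 -trace events=/tmp/virtio-blk-events ...
#   scripts/simpletrace.py trace-events trace-$pid
```

A hang would show up as a virtio_blk_handle_write/read entry with no matching virtio_blk_rw_complete entry later in the trace.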
Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]
Hi Mike,

you might be the guy StefanHa was referring to on the qemu-devel
mailing-list. I just made some more tests, so…

Am 02.08.2013 um 23:47 schrieb Mike Dawson mike.daw...@cloudapt.com:

> Oliver,
>
> We've had a similar situation occur. For about three months, we've run
> several Windows 2008 R2 guests with virtio drivers that record video
> surveillance. We have long suffered an issue where the guest appears to
> hang indefinitely (or until we intervene). For the sake of this
> conversation, we call this state "wedged", because it appears something
> (rbd, qemu, virtio, etc) gets stuck on a deadlock.
>
> When a guest gets wedged, we see the following:
>
> - the guest will not respond to pings

When the hung_task message shows up, I can still ping the guest and
establish new ssh sessions; only the session running the while loop stops
accepting keyboard input.

> - the qemu-system-x86_64 process drops to 0% cpu
> - graphite graphs show the interface traffic dropping to 0bps
> - the guest will stay wedged forever (or until we intervene)
> - strace of qemu-system-x86_64 shows QEMU is making progress [1][2]

Nothing special here:

  5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=6, events=POLLIN},
  {fd=19, events=POLLIN}, {fd=15, events=POLLIN}, {fd=4, events=POLLIN}],
  11, -1) = 1 ([{fd=12, revents=POLLIN}])
  [pid 11793] read(5, 0x7fff16b61f00, 16) = -1 EAGAIN (Resource
  temporarily unavailable)
  [pid 11793] read(12, "\2\0\0\0\0\0\0\0\0\0\0\0\0\361p\0\252\340\374\373\373!gH\10\0E\0\0Yq\374"..., 69632) = 115
  [pid 11793] read(12, 0x7f0c1737fcec, 69632) = -1 EAGAIN (Resource
  temporarily unavailable)
  [pid 11793] poll([{fd=27, events=POLLIN|POLLERR|POLLHUP}, {fd=26,
  events=POLLIN|POLLERR|POLLHUP}, {fd=24, events=POLLIN|POLLERR|POLLHUP},
  {fd=12, events=POLLIN|POLLERR|POLLHUP}, {fd=3,
  events=POLLIN|POLLERR|POLLHUP}, {fd=

…and that for many, many threads. Inside the VM I see 75% wait, but I can
restart the spew-test in a second session. All that was tested with
rbd_cache=false,cache=none.

I also test every qemu version with a 2-CPU, 2 GiB mem Windows 7 VM under
some high load, encountering no problem at the moment. Running smooth and
fast.

> We can un-wedge the guest by opening a NoVNC session or running a
> 'virsh screenshot' command. After that, the guest resumes and runs as
> expected. At that point we can examine the guest. Each time we'll see:
>
> - No Windows error logs whatsoever while the guest is wedged
> - A time sync typically occurs right after the guest gets un-wedged
> - Scheduled tasks do not run while wedged
> - Windows error logs do not show any evidence of suspend, sleep, etc
>
> We had so many issues with guests becoming wedged that we wrote a
> script to 'virsh screenshot' them via cron. Then we installed some
> updates and had a month or so of higher stability (wedging happened
> maybe 1/10th as often). Until today we couldn't figure out why.
> Yesterday, I realized qemu was starting the instances without
> specifying cache=writeback. We corrected that and let them run
> overnight. With RBD writeback re-enabled, wedging came back as often as
> we had seen in the past. I've counted ~40 occurrences in the past
> 12-hour period. So I feel like writeback caching in RBD certainly makes
> the deadlock more likely to occur.
>
> Joshd asked us to gather RBD client logs:
>
>   "joshd: it could very well be the writeback cache not doing a
>   callback at some point - if you could gather logs of a vm getting
>   stuck with debug rbd = 20, debug ms = 1, and debug objectcacher = 30
>   that would be great"
>
> We'll do that over the weekend. If you could as well, we'd love the
> help!
>
> [1] http://www.gammacode.com/kvm/wedged-with-timestamps.txt
> [2] http://www.gammacode.com/kvm/not-wedged.txt

As I wrote above, no cache so far, so I'm omitting the verbose debugging
at the moment. But I will do it if requested.

Thnx for your report,

Oliver.

> Thanks,
> Mike Dawson
> Co-Founder, Director of Cloud Architecture
> Cloudapt LLC
> 6330 East 75th Street, Suite 170
> Indianapolis, IN 46250
>
> On 8/2/2013 6:22 AM, Oliver Francke wrote:
>> Well, I believe I'm the winner of buzzword-bingo for today. But
>> seriously speaking... as I don't have this particular problem with
>> qcow2 with kernel 3.2, nor qemu-1.2.2, nor newer kernels, I hope I'm
>> not alone here?
>>
>> We have a rising number of tickets from people reinstalling from ISOs
>> with the 3.2 kernel. The fast fallback is to start all VMs with
>> qemu-1.2.2, but we then lose some features a la latency-free RBD
>> cache ;)
>>
>> I just opened a bug for qemu per:
>>
>>   https://bugs.launchpad.net/qemu/+bug/1207686
>>
>> with all the dirty details. Installing a backport kernel 3.9.x or
>> upgrading the Ubuntu kernel to 3.8.x fixes it. So we have a bad
>> combination for all distros with a 3.2 kernel and rbd as the
>> storage-backend, I assume.
>>
>> Any similar findings? Any idea of tracing/debugging ( Josh? ;) ) very
>> welcome,
>>
>> Oliver.
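Mike's cron workaround isn't shown in the thread; a hypothetical sketch of what such a script could look like — the function name, output directory, and the idea of screenshotting every running domain are assumptions of mine:

```shell
# Hypothetical unwedge script: take a throwaway screenshot of each running
# libvirt domain so qemu's main loop gets poked (per Mike's observation
# that 'virsh screenshot' un-wedges a guest). Not from the thread.
unwedge_all() {
    # $1: virsh command (parameterized so it can be stubbed in tests)
    # $2: directory for the throwaway screenshots
    virsh_cmd=${1:-virsh}
    out=${2:-/tmp/unwedge}
    mkdir -p "$out"
    # 'virsh list --name' prints one running domain name per line.
    for dom in $("$virsh_cmd" list --name); do
        "$virsh_cmd" screenshot "$dom" "$out/$dom.ppm" >/dev/null 2>&1 || true
    done
}
# Example cron entry (every 5 minutes):
#   */5 * * * * /usr/local/sbin/unwedge-guests.sh
```

Parameterizing the virsh command keeps the sketch testable without a libvirt daemon; in production the defaults apply and the loop simply touches every running domain.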