Re: repeatable hang with loop mount and heavy IO in guest (now in host - not KVM then..)

2010-05-22 Thread Antoine Martin

On 05/23/2010 01:10 AM, Jim Paris wrote:

Antoine Martin wrote:
   

On 02/27/2010 12:38 AM, Antoine Martin wrote:
 

  1   0   0  98   0   1|   0 0 |  66B  354B|   0 0 |  3011
  1   1   0  98   0   0|   0 0 |  66B  354B|   0 0 |  2911
   

From that point onwards, nothing will happen.
 

The host has disk IO to spare... So what is it waiting for??
   

Moved to an AMD64 host. No effect.
Disabled swap before running the test. No effect.
Moved the guest to a fully up-to-date FC12 server
(2.6.31.6-145.fc12.x86_64), no effect.
 

I have narrowed it down to the guest's filesystem used to back the
loop-mounted disk image: although it was not
completely full (and had enough free inodes), freeing some space on it
prevents the system from misbehaving.

FYI: the disk image was clean and was fscked before each test. kvm
had been updated to 0.12.3.
The weird thing is that the same filesystem works fine (no system
hang) when used directly from the host; it only misbehaves via
kvm...

So I am not dismissing the possibility that kvm may be at least
partly to blame, or that it is exposing a filesystem bug (race?)
not normally encountered.
(I have backed up the full 32GB virtual disk in case someone
suggests further investigation)
   

Well, well. I've just hit the exact same bug on another *host* (not
a guest), running stock Fedora 12.
So this isn't a kvm bug after all. Definitely a loop+ext(4?) bug.
Looks like you need a pretty big loop-mounted partition to trigger
it (bigger than available RAM?).

This is what triggered it on a quad AMD system with 8GB of RAM,
on a software RAID-1 partition:
mount -o loop 2GB.dd source
dd if=/dev/zero of=8GB.dd bs=1048576 count=8192
mkfs.ext4 -F 8GB.dd
mount -o loop 8GB.dd dest
rsync -rplogtD source/* dest/
umount source
umount dest
^ this is where it hangs. I then tried to issue a 'sync' from
another terminal, which also hung.
It took more than 10 minutes to settle itself, and during that time
one CPU was stuck in wait state.
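
For anyone reproducing this: the blocked-task stacks can be captured on
demand rather than waiting for the 120s hung-task messages (a sketch,
assuming root and a kernel with magic-sysrq enabled):

# dump the stacks of all uninterruptible (D-state) tasks to the kernel log
echo w > /proc/sysrq-trigger
dmesg | tail -n 100
# list which tasks are currently stuck in D state, and where
ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /D/'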
 

This sounds like:
   https://bugzilla.kernel.org/show_bug.cgi?id=15906
   https://bugzilla.redhat.com/show_bug.cgi?id=588930
   

Indeed it does.
Let's hope this makes it to -stable fast.
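
Until a fixed kernel is available, a workaround sometimes suggested for
writeback stalls of this kind (an untested sketch, not verified against
this particular bug) is to cap dirty memory so writeback starts earlier
and less is left to flush at umount time:

# start background writeback sooner, and block writers earlier
sysctl -w vm.dirty_background_ratio=1
sysctl -w vm.dirty_ratio=5
# flush explicitly before unmounting ('dest' as in the commands above)
sync && umount dest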

Antoine


Re: repeatable hang with loop mount and heavy IO in guest (now in host - not KVM then..)

2010-05-22 Thread Jim Paris
Antoine Martin wrote:
> On 02/27/2010 12:38 AM, Antoine Martin wrote:
> >>>  1   0   0  98   0   1|   0 0 |  66B  354B|   0 0 |  3011
> >>>  1   1   0  98   0   0|   0 0 |  66B  354B|   0 0 |  2911
> >>> From that point onwards, nothing will happen.
> >>>The host has disk IO to spare... So what is it waiting for??
> >>Moved to an AMD64 host. No effect.
> >>Disabled swap before running the test. No effect.
> >>Moved the guest to a fully up-to-date FC12 server
> >>(2.6.31.6-145.fc12.x86_64), no effect.
> >I have narrowed it down to the guest's filesystem used to back the
> >loop-mounted disk image: although it was not
> >completely full (and had enough free inodes), freeing some space on it
> >prevents the system from misbehaving.
> >
> >FYI: the disk image was clean and was fscked before each test. kvm
> >had been updated to 0.12.3.
> >The weird thing is that the same filesystem works fine (no system
> >hang) when used directly from the host; it only misbehaves via
> >kvm...
> >
> >So I am not dismissing the possibility that kvm may be at least
> >partly to blame, or that it is exposing a filesystem bug (race?)
> >not normally encountered.
> >(I have backed up the full 32GB virtual disk in case someone
> >suggests further investigation)
> Well, well. I've just hit the exact same bug on another *host* (not
> a guest), running stock Fedora 12.
> So this isn't a kvm bug after all. Definitely a loop+ext(4?) bug.
> Looks like you need a pretty big loop-mounted partition to trigger
> it (bigger than available RAM?).
> 
> This is what triggered it on a quad AMD system with 8GB of RAM,
> on a software RAID-1 partition:
> mount -o loop 2GB.dd source
> dd if=/dev/zero of=8GB.dd bs=1048576 count=8192
> mkfs.ext4 -F 8GB.dd
> mount -o loop 8GB.dd dest
> rsync -rplogtD source/* dest/
> umount source
> umount dest
> ^ this is where it hangs. I then tried to issue a 'sync' from
> another terminal, which also hung.
> It took more than 10 minutes to settle itself, and during that time
> one CPU was stuck in wait state.

This sounds like:
  https://bugzilla.kernel.org/show_bug.cgi?id=15906
  https://bugzilla.redhat.com/show_bug.cgi?id=588930

-jim


Re: repeatable hang with loop mount and heavy IO in guest (now in host - not KVM then..)

2010-05-21 Thread Antoine Martin

On 02/27/2010 12:38 AM, Antoine Martin wrote:

  1   0   0  98   0   1|   0 0 |  66B  354B|   0 0 |  3011
  1   1   0  98   0   0|   0 0 |  66B  354B|   0 0 |  2911
From that point onwards, nothing will happen.
The host has disk IO to spare... So what is it waiting for??

Moved to an AMD64 host. No effect.
Disabled swap before running the test. No effect.
Moved the guest to a fully up-to-date FC12 server 
(2.6.31.6-145.fc12.x86_64), no effect.
I have narrowed it down to the guest's filesystem used to back the 
loop-mounted disk image: although it was not completely full 
(and had enough free inodes), freeing some space on it prevents the system 
from misbehaving.


FYI: the disk image was clean and was fscked before each test. kvm had 
been updated to 0.12.3.
The weird thing is that the same filesystem works fine (no system 
hang) when used directly from the host; it only misbehaves via kvm...


So I am not dismissing the possibility that kvm may be at least partly 
to blame, or that it is exposing a filesystem bug (race?) not normally 
encountered.
(I have backed up the full 32GB virtual disk in case someone suggests 
further investigation)
Well, well. I've just hit the exact same bug on another *host* (not a 
guest), running stock Fedora 12.

So this isn't a kvm bug after all. Definitely a loop+ext(4?) bug.
Looks like you need a pretty big loop-mounted partition to trigger it 
(bigger than available RAM?).


This is what triggered it on a quad AMD system with 8GB of RAM, on a software 
RAID-1 partition:

mount -o loop 2GB.dd source
dd if=/dev/zero of=8GB.dd bs=1048576 count=8192
mkfs.ext4 -F 8GB.dd
mount -o loop 8GB.dd dest
rsync -rplogtD source/* dest/
umount source
umount dest
^ this is where it hangs. I then tried to issue a 'sync' from another 
terminal, which also hung.
It took more than 10 minutes to settle itself, and during that time one CPU 
was stuck in wait state.

dstat reported almost no IO at the time (<1MB/s).
I assume dstat reports page writeback like any other disk IO?
That RAID partition does ~60MB/s, so writing back 8GB should take roughly 
8192/60 ≈ 140 seconds, not 10 minutes. (And that is even assuming it would 
have to write back the whole 8GB at umount time - which should not be the case.)
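
One way to remove the ambiguity about what dstat counts is to watch the
kernel's own writeback accounting directly (a sketch; these fields are
in /proc/meminfo on any recent kernel):

# if Dirty/Writeback shrink, data really is being written out;
# if they stay large and constant, writeback has stalled
while true; do
    grep -E '^(Dirty|Writeback):' /proc/meminfo
    sleep 5
done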


Cheers
Antoine

Here's the hung trace:
INFO: task umount:526 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount        D 0002     0   526  32488 0x
 880140f9fc88 0086 880008e3c228 810d5fd9
 880140f9fc28 880140f9fcd8 880140f9ffd8 880140f9ffd8
 88021b5e03d8 f980 00015740 88021b5e03d8
Call Trace:
 [] ? sync_page+0x0/0x4a
 [] ? __enqueue_entity+0x7b/0x7d
 [] ? bdi_sched_wait+0x0/0x12
 [] bdi_sched_wait+0xe/0x12
 [] __wait_on_bit+0x48/0x7b
 [] ? native_smp_send_reschedule+0x5c/0x5e
 [] out_of_line_wait_on_bit+0x6e/0x79
 [] ? bdi_sched_wait+0x0/0x12
 [] ? wake_bit_function+0x0/0x33
 [] wait_on_bit.clone.1+0x1e/0x20
 [] bdi_sync_writeback+0x64/0x6b
 [] sync_inodes_sb+0x22/0xec
 [] __sync_filesystem+0x4e/0x77
 [] sync_filesystem+0x4b/0x4f
 [] generic_shutdown_super+0x27/0xc9
 [] kill_block_super+0x27/0x3f
 [] deactivate_super+0x56/0x6b
 [] mntput_no_expire+0xb4/0xec
 [] sys_umount+0x2d5/0x304
 [] ? do_page_fault+0x270/0x2a0
 [] system_call_fastpath+0x16/0x1b
INFO: task umount:526 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount        D 0002     0   526  32488 0x
 880140f9fc88 0086 880008e3c228 810d5fd9
 880140f9fc28 880140f9fcd8 880140f9ffd8 880140f9ffd8
 88021b5e03d8 f980 00015740 88021b5e03d8
Call Trace:
 [] ? sync_page+0x0/0x4a
 [] ? __enqueue_entity+0x7b/0x7d
 [] ? bdi_sched_wait+0x0/0x12
 [] bdi_sched_wait+0xe/0x12
 [] __wait_on_bit+0x48/0x7b
 [] ? native_smp_send_reschedule+0x5c/0x5e
 [] out_of_line_wait_on_bit+0x6e/0x79
 [] ? bdi_sched_wait+0x0/0x12
 [] ? wake_bit_function+0x0/0x33
 [] wait_on_bit.clone.1+0x1e/0x20
 [] bdi_sync_writeback+0x64/0x6b
 [] sync_inodes_sb+0x22/0xec
 [] __sync_filesystem+0x4e/0x77
 [] sync_filesystem+0x4b/0x4f
 [] generic_shutdown_super+0x27/0xc9
 [] kill_block_super+0x27/0x3f
 [] deactivate_super+0x56/0x6b
 [] mntput_no_expire+0xb4/0xec
 [] sys_umount+0x2d5/0x304
 [] ? do_page_fault+0x270/0x2a0
 [] system_call_fastpath+0x16/0x1b
INFO: task umount:526 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount        D 0002     0   526  32488 0x
 880140f9fc88 0086 880008e3c228 810d5fd9
 880140f9fc28 880140f9fcd8 880140f9ffd8 880140f9ffd8
 88021b5e03d8 f980 00015740 88021b5e03d8
Call Trace:
 [] ? sync_page+0x0/0x4a
 [] ? __enqueue_entity+0x7b/0x7d
 [] ? bdi_sched_wait+0x0/0x12
 [] bdi_sched_wait+0xe/0x12
 [] __wait_on_bit+0x4

Re: repeatable hang with loop mount and heavy IO in guest

2010-02-26 Thread Antoine Martin



  1   0   0  98   0   1|   0 0 |  66B  354B|   0 0 |  3011
  1   1   0  98   0   0|   0 0 |  66B  354B|   0 0 |  2911
From that point onwards, nothing will happen.
The host has disk IO to spare... So what is it waiting for??

Moved to an AMD64 host. No effect.
Disabled swap before running the test. No effect.
Moved the guest to a fully up-to-date FC12 server 
(2.6.31.6-145.fc12.x86_64), no effect.
I have narrowed it down to the guest's filesystem used to back the 
loop-mounted disk image: although it was not completely full 
(and had enough free inodes), freeing some space on it prevents the system 
from misbehaving.


FYI: the disk image was clean and was fscked before each test. kvm had 
been updated to 0.12.3.
The weird thing is that the same filesystem works fine (no system hang) 
when used directly from the host; it only misbehaves via kvm...


So I am not dismissing the possibility that kvm may be at least partly 
to blame, or that it is exposing a filesystem bug (race?) not normally 
encountered.
(I have backed up the full 32GB virtual disk in case someone suggests 
further investigation)


Cheers
Antoine


Re: repeatable hang with loop mount and heavy IO in guest [NOT SOLVED]

2010-02-03 Thread Antoine Martin

On 01/23/2010 02:15 AM, Antoine Martin wrote:

On 01/23/2010 01:28 AM, Antoine Martin wrote:

On 01/22/2010 02:57 PM, Michael Tokarev wrote:

Antoine Martin wrote:

I've tried various guests, including most recent Fedora12 kernels,
custom 2.6.32.x
All of them hang around the same point (~1GB written) when I do 
heavy IO write inside the guest.

[]

Host is running: 2.6.31.4
QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88)

Please update to the latest version and repeat.  kvm-88 is ancient and
_lots_ of stuff has been fixed and changed since then; I doubt anyone
here will try to dig into kvm-88 problems.

Current kvm is qemu-kvm-0.12.2, released yesterday.

Sorry about that, I didn't realize 88 was so far behind.
Upgrading to qemu-kvm-0.12.2 did solve my IO problems.
Only for a while. Same problem just recurred, only this time it 
went a little further.

It is now just sitting there, with a load average of exactly 3.0 (+/- 5%).

Here is a good trace of the symptom during writeback: you can see it 
write the data at around 50MB/s, going from idle to sys, but 
after a while it just stops writing and sits mostly in wait state:

total-cpu-usage -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  1   0  99   0   0   0|   0 0 | 198B  614B|   0 0 |  3617
  1   0  99   0   0   0|   0 0 | 198B  710B|   0 0 |  3117
  1   1  98   0   0   0|   0   128k| 240B  720B|   0 0 |  3926
  1   1  98   0   0   0|   0 0 | 132B  564B|   0 0 |  3114
  1   0  99   0   0   0|   0 0 | 132B  468B|   0 0 |  3114
  1   1  98   0   0   0|   0 0 |  66B  354B|   0 0 |  3013
  0   4  11  85   0   0| 852k0 | 444B 1194B|   0 0 | 215   477
  2   2   0  96   0   0| 500k0 | 132B  756B|   0 0 | 169   458
  3  57   0  39   1   0| 228k   10M| 132B  692B|   0 0 | 476  5387
  6  94   0   0   0   0|  28k   23M| 132B  884B|   0 0 | 373  2142
  6  89   0   2   2   0|  40k   38M|  66B  692B|   0  8192B| 502  5651
  4  47   0  48   0   0| 140k   34M| 132B  836B|   0 0 | 605  1664
  3  64   0  30   2   0|  60k   50M| 132B  370B|   060k| 750   631
  4  59   0  35   2   0|  48k   45M| 132B  836B|   028k| 708  1293
  7  81   0  10   2   0|  68k   67M| 132B  788B|   0   124k| 928  1634
  5  74   0  20   1   0|  48k   48M| 132B  756B|   0   316k| 830  5715
  5  70   0  24   1   0| 168k   48M| 132B  676B|   0   100k| 734  5325
  4  70   0  24   1   0|  72k   49M| 132B  948B|   088k| 776  3784
  5  57   0  37   1   0|  36k   37M| 132B  996B|   0   480k| 602   369
  2  21   0  77   0   0|  36k   23M| 132B  724B|   072k| 318  1033
  4  51   0  43   2   0| 112k   43M| 132B  756B|   0   112k| 681   909
  5  55   0  40   0   0|  88k   48M| 140B  926B|  16k   12k| 698   557
total-cpu-usage -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  3  45   0  51   1   0|2248k   29M| 198B 1028B|  28k   44k| 681  5468
  1  21   0  78   0   0|  92k   17M|1275B 2049B|  92k   52k| 328  1883
  3  30   0  66   1   0| 288k   28M| 498B 2116B|   040k| 455   679
  1   1   0  98   0   0|4096B0 | 394B 1340B|4096B0 |  4119
  1   1   0  98   0   0| 148k   52k| 881B 1592B|4096B   44k|  7561
  1   2   0  97   0   0|1408k0 | 351B 1727B|   0 0 | 110   109
  2   1   0  97   0   0|8192B0 |1422B 1940B|   0 0 |  5334
  1   0   0  99   0   0|4096B   12k| 328B 1018B|   0 0 |  4124
  1   4   0  95   0   0| 340k0 |3075B 2152B|4096B0 | 153   191
  4   7   0  89   0   0|1004k   44k|1526B 1906B|   0 0 | 254   244
  0   1   0  99   0   0|  76k0 | 708B 1708B|   0 0 |  6757
  1   1   0  98   0   0|   0 0 | 174B  702B|   0 0 |  3214
  1   1   0  98   0   0|   0 0 | 132B  354B|   0 0 |  3211
  1   0   0  99   0   0|   0 0 | 132B  468B|   0 0 |  3216
  1   0   0  99   0   0|   0 0 | 132B  468B|   0 0 |  3214
  1   1   0  98   0   0|   052k| 132B  678B|   0 0 |  4127
  1   0   0  99   0   0|   0 0 | 198B  678B|   0 0 |  3517
  1   1   0  98   0   0|   0 0 | 198B  468B|   0 0 |  3414
  1   0   0  99   0   0|   0 0 |  66B  354B|   0 0 |  2811
  1   0   0  99   0   0|   0 0 |  66B  354B|   0 0 |  28 9
  1   1   0  98   0   0|   0 0 | 132B  468B|   0 0 |  3416
  1   0   0  98   0   1|   0 0 |  66B  354B|   0 0 |  3011
  1   1   0  98   0   0|   0 0 |  66B  354B|   0 0 |  2911
From that point onwards, nothing will happen.
The host has disk IO to spare... So what is it waiting for??

Moved to an AMD64 host. No effect.
Disabled swap before running the test. No effect.
Moved the guest to a fully up-to-date FC12 server 
(2.6.31.6-145.fc12.x86_64), no effect.


I am still seeing traces like these in dmesg (various lengths, but always 
ending in sync_page):


[ 2401.350143] INFO: task perl:29512 blocked for more tha

Re: repeatable hang with loop mount and heavy IO in guest [NOT SOLVED]

2010-01-24 Thread Antoine Martin

On 01/23/2010 02:15 AM, Antoine Martin wrote:

On 01/23/2010 01:28 AM, Antoine Martin wrote:

On 01/22/2010 02:57 PM, Michael Tokarev wrote:

Antoine Martin wrote:

I've tried various guests, including most recent Fedora12 kernels,
custom 2.6.32.x
All of them hang around the same point (~1GB written) when I do 
heavy IO write inside the guest.

[]

Host is running: 2.6.31.4
QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88)

Please update to the latest version and repeat.  kvm-88 is ancient and
_lots_ of stuff has been fixed and changed since then; I doubt anyone
here will try to dig into kvm-88 problems.

Current kvm is qemu-kvm-0.12.2, released yesterday.

Sorry about that, I didn't realize 88 was so far behind.
Upgrading to qemu-kvm-0.12.2 did solve my IO problems.
Only for a while. Same problem just recurred, only this time it 
went a little further.

It is now just sitting there, with a load average of exactly 3.0 (+/- 5%).

Here is a good trace of the symptom during writeback: you can see it 
write the data at around 50MB/s, going from idle to sys, but 
after a while it just stops writing and sits mostly in wait state:

[snip]

From that point onwards, nothing will happen.
The host has disk IO to spare... So what is it waiting for??
Note: if I fill the disk in the guest with zeroes but without going via 
a loop-mounted filesystem, then everything works just fine. Something 
about using the loopback makes it fall over.


Here is the simplest way to make this happen:
time dd if=/dev/zero of=./test bs=1048576 count=2048
2147483648 bytes (2.1 GB) copied, 65.1344 s, 33.0 MB/s

mkfs.ext3 -F ./test; mkdir tmp
mount -o loop ./test ./tmp
time dd if=/dev/zero of=./tmp/test-loop bs=1048576 count=2048
^ this one will never return, and you can't just kill "dd"; it's stuck.
The whole guest has to be killed at this point.
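
The same steps can be wrapped with a watchdog so a scripted run reports
the hang instead of wedging a terminal (a sketch; the sizes and file
names are just the values used above):

#!/bin/sh
# repro-loop-hang.sh: loop-mount write test with a timeout guard
set -e
dd if=/dev/zero of=./test bs=1048576 count=2048
mkfs.ext3 -F ./test
mkdir -p tmp
mount -o loop ./test ./tmp
# run the write in the background and poll it; if it is still running
# after 10 minutes, report a probable hang instead of blocking forever
dd if=/dev/zero of=./tmp/test-loop bs=1048576 count=2048 &
DD_PID=$!
for i in $(seq 1 60); do
    kill -0 "$DD_PID" 2>/dev/null || { echo "dd finished"; exit 0; }
    sleep 10
done
echo "dd (pid $DD_PID) still running after 10 minutes - probable hang"
exit 1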


QEMU PC emulator version 0.12.2 (qemu-kvm-0.12.2), Copyright (c) 
2003-2008 Fabrice Bellard

Guests: various, all recent kernels.
Host: 2.6.31.4
Before anyone suggests this, I have tried with/without elevator=noop, 
with/without virtio disks.

No effect, still hangs.

Antoine


Please advise.

Thanks
Antoine





Re: repeatable hang with loop mount and heavy IO in guest [NOT SOLVED]

2010-01-22 Thread Antoine Martin

On 01/23/2010 01:28 AM, Antoine Martin wrote:

On 01/22/2010 02:57 PM, Michael Tokarev wrote:

Antoine Martin wrote:

I've tried various guests, including most recent Fedora12 kernels,
custom 2.6.32.x
All of them hang around the same point (~1GB written) when I do 
heavy IO write inside the guest.

[]

Host is running: 2.6.31.4
QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88)

Please update to the latest version and repeat.  kvm-88 is ancient and
_lots_ of stuff has been fixed and changed since then; I doubt anyone
here will try to dig into kvm-88 problems.

Current kvm is qemu-kvm-0.12.2, released yesterday.

Sorry about that, I didn't realize 88 was so far behind.
Upgrading to qemu-kvm-0.12.2 did solve my IO problems.
Only for a while. Same problem just recurred, only this time it went 
a little further.

It is now just sitting there, with a load average of exactly 3.0 (+/- 5%).

Here is a good trace of the symptom during writeback: you can see it 
write the data at around 50MB/s, going from idle to sys, but 
after a while it just stops writing and sits mostly in wait state:

total-cpu-usage -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  1   0  99   0   0   0|   0 0 | 198B  614B|   0 0 |  3617
  1   0  99   0   0   0|   0 0 | 198B  710B|   0 0 |  3117
  1   1  98   0   0   0|   0   128k| 240B  720B|   0 0 |  3926
  1   1  98   0   0   0|   0 0 | 132B  564B|   0 0 |  3114
  1   0  99   0   0   0|   0 0 | 132B  468B|   0 0 |  3114
  1   1  98   0   0   0|   0 0 |  66B  354B|   0 0 |  3013
  0   4  11  85   0   0| 852k0 | 444B 1194B|   0 0 | 215   477
  2   2   0  96   0   0| 500k0 | 132B  756B|   0 0 | 169   458
  3  57   0  39   1   0| 228k   10M| 132B  692B|   0 0 | 476  5387
  6  94   0   0   0   0|  28k   23M| 132B  884B|   0 0 | 373  2142
  6  89   0   2   2   0|  40k   38M|  66B  692B|   0  8192B| 502  5651
  4  47   0  48   0   0| 140k   34M| 132B  836B|   0 0 | 605  1664
  3  64   0  30   2   0|  60k   50M| 132B  370B|   060k| 750   631
  4  59   0  35   2   0|  48k   45M| 132B  836B|   028k| 708  1293
  7  81   0  10   2   0|  68k   67M| 132B  788B|   0   124k| 928  1634
  5  74   0  20   1   0|  48k   48M| 132B  756B|   0   316k| 830  5715
  5  70   0  24   1   0| 168k   48M| 132B  676B|   0   100k| 734  5325
  4  70   0  24   1   0|  72k   49M| 132B  948B|   088k| 776  3784
  5  57   0  37   1   0|  36k   37M| 132B  996B|   0   480k| 602   369
  2  21   0  77   0   0|  36k   23M| 132B  724B|   072k| 318  1033
  4  51   0  43   2   0| 112k   43M| 132B  756B|   0   112k| 681   909
  5  55   0  40   0   0|  88k   48M| 140B  926B|  16k   12k| 698   557
total-cpu-usage -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  3  45   0  51   1   0|2248k   29M| 198B 1028B|  28k   44k| 681  5468
  1  21   0  78   0   0|  92k   17M|1275B 2049B|  92k   52k| 328  1883
  3  30   0  66   1   0| 288k   28M| 498B 2116B|   040k| 455   679
  1   1   0  98   0   0|4096B0 | 394B 1340B|4096B0 |  4119
  1   1   0  98   0   0| 148k   52k| 881B 1592B|4096B   44k|  7561
  1   2   0  97   0   0|1408k0 | 351B 1727B|   0 0 | 110   109
  2   1   0  97   0   0|8192B0 |1422B 1940B|   0 0 |  5334
  1   0   0  99   0   0|4096B   12k| 328B 1018B|   0 0 |  4124
  1   4   0  95   0   0| 340k0 |3075B 2152B|4096B0 | 153   191
  4   7   0  89   0   0|1004k   44k|1526B 1906B|   0 0 | 254   244
  0   1   0  99   0   0|  76k0 | 708B 1708B|   0 0 |  6757
  1   1   0  98   0   0|   0 0 | 174B  702B|   0 0 |  3214
  1   1   0  98   0   0|   0 0 | 132B  354B|   0 0 |  3211
  1   0   0  99   0   0|   0 0 | 132B  468B|   0 0 |  3216
  1   0   0  99   0   0|   0 0 | 132B  468B|   0 0 |  3214
  1   1   0  98   0   0|   052k| 132B  678B|   0 0 |  4127
  1   0   0  99   0   0|   0 0 | 198B  678B|   0 0 |  3517
  1   1   0  98   0   0|   0 0 | 198B  468B|   0 0 |  3414
  1   0   0  99   0   0|   0 0 |  66B  354B|   0 0 |  2811
  1   0   0  99   0   0|   0 0 |  66B  354B|   0 0 |  28 9
  1   1   0  98   0   0|   0 0 | 132B  468B|   0 0 |  3416
  1   0   0  98   0   1|   0 0 |  66B  354B|   0 0 |  3011
  1   1   0  98   0   0|   0 0 |  66B  354B|   0 0 |  2911
From that point onwards, nothing will happen.
The host has disk IO to spare... So what is it waiting for??

QEMU PC emulator version 0.12.2 (qemu-kvm-0.12.2), Copyright (c) 
2003-2008 Fabrice Bellard

Guests: various, all recent kernels.
Host: 2.6.31.4

Please advise.

Thanks
Antoine



Re: repeatable hang with loop mount and heavy IO in guest [SOLVED]

2010-01-22 Thread Antoine Martin

On 01/22/2010 02:57 PM, Michael Tokarev wrote:

Antoine Martin wrote:
   

I've tried various guests, including most recent Fedora12 kernels,
custom 2.6.32.x
All of them hang around the same point (~1GB written) when I do heavy IO
write inside the guest.
 

[]
   

Host is running: 2.6.31.4
QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88)
 

Please update to the latest version and repeat.  kvm-88 is ancient and
_lots_ of stuff has been fixed and changed since then; I doubt anyone
here will try to dig into kvm-88 problems.

Current kvm is qemu-kvm-0.12.2, released yesterday.
   

Sorry about that, I didn't realize 88 was so far behind.
Upgrading to qemu-kvm-0.12.2 did solve my IO problems.


Found these build issues if anyone is interested:
"--enable-io-thread" gave me:
  LINK  x86_64-softmmu/qemu-system-x86_64
kvm-all.o: In function `qemu_mutex_lock_iothread':
/usr/src/KVM/qemu-kvm-0.12.2/qemu-kvm.c:2526: multiple definition of 
`qemu_mutex_lock_iothread'

vl.o:/usr/src/KVM/qemu-kvm-0.12.2/vl.c:3772: first defined here
kvm-all.o: In function `qemu_mutex_unlock_iothread':
/usr/src/KVM/qemu-kvm-0.12.2/qemu-kvm.c:2520: multiple definition of 
`qemu_mutex_unlock_iothread'

vl.o:/usr/src/KVM/qemu-kvm-0.12.2/vl.c:3783: first defined here
collect2: ld returned 1 exit status

And "--enable-cap-kvm-pit" is defined if you look at "--help", but does 
not exist if you try to use it!?

# ./configure --enable-cap-kvm-pit | grep cap-kvm-pit
ERROR: unknown option --enable-cap-kvm-pit
  --disable-cap-kvm-pit    disable KVM pit support
  --enable-cap-kvm-pit     enable KVM pit support

Antoine


Re: repeatable hang with loop mount and heavy IO in guest

2010-01-21 Thread Michael Tokarev
Antoine Martin wrote:
> I've tried various guests, including most recent Fedora12 kernels,
> custom 2.6.32.x
> All of them hang around the same point (~1GB written) when I do heavy IO
> write inside the guest.
[]
> Host is running: 2.6.31.4
> QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88)

Please update to the latest version and repeat.  kvm-88 is ancient and
_lots_ of stuff has been fixed and changed since then; I doubt anyone
here will try to dig into kvm-88 problems.

Current kvm is qemu-kvm-0.12.2, released yesterday.

/mjt


Re: repeatable hang with loop mount and heavy IO in guest

2010-01-21 Thread RW
No, sorry, I don't have any performance data with noop. I haven't
had a crash either. BUT I have experienced severe I/O degradation
with noop. Once I wrote a big chunk of data (e.g. a simple
rsync -av /usr /opt) with noop, it worked for a while, and
after a few seconds I saw heavy writes which made the
VM virtually unusable. As far as I remember it was kjournald
that caused the writes.

I wrote a mail to the list some months ago with some benchmarks:
http://article.gmane.org/gmane.comp.emulators.kvm.devel/41112/match=benchmark
There are some I/O benchmarks in there. You can't get the graphs
currently since tauceti.net is offline until Monday. I haven't
tested noop in these benchmarks because of the problems
mentioned above, but they compare deadline and cfq a little bit
on an HP DL 380 G6 server.

Robert

On 01/21/10 22:08, Thomas Beinicke wrote:
> On Thursday 21 January 2010 21:08:38 RW wrote:
>> Some months ago I also thought elevator=noop would be a good idea.
>> But it isn't. It works well as long as you only do short IO requests.
>> Try using deadline in both host and guest.
>>
>> Robert
> 
> @Robert: I've been using noop on all of my KVMs and haven't had any problems 
> so far, and never a crash either.
> Do you have any performance data or comparisons between noop and deadline io 
> schedulers?
> 
> Cheers,
> 
> Thomas


Re: repeatable hang with loop mount and heavy IO in guest

2010-01-21 Thread Thomas Beinicke
On Thursday 21 January 2010 21:08:38 RW wrote:
> Some months ago I also thought elevator=noop would be a good idea.
> But it isn't. It works well as long as you only do short IO requests.
> Try using deadline in both host and guest.
> 
> Robert

@Robert: I've been using noop on all of my KVMs and haven't had any problems 
so far, and never a crash either.
Do you have any performance data or comparisons between noop and deadline io 
schedulers?

Cheers,

Thomas


Re: repeatable hang with loop mount and heavy IO in guest

2010-01-21 Thread RW
Some months ago I also thought elevator=noop would be a good idea.
But it isn't. It works well as long as you only do short IO requests.
Try using deadline in both host and guest.
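
For anyone comparing: the scheduler can also be switched per-device at
runtime, without rebooting with elevator=... (a sketch; substitute the
relevant device names for sda/vda):

# the active scheduler is shown in brackets
cat /sys/block/sda/queue/scheduler
# switch the host disk, and the virtio disk inside the guest
echo deadline > /sys/block/sda/queue/scheduler
echo deadline > /sys/block/vda/queue/scheduler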

Robert


On 01/21/10 18:26, Antoine Martin wrote:
> I've tried various guests, including most recent Fedora12 kernels,
> custom 2.6.32.x
> All of them hang around the same point (~1GB written) when I do heavy IO
> write inside the guest.
> I have waited 30 minutes to see if the guest would recover, but it just
> sits there, not writing back any data, not doing anything - but
> certainly not allowing any new IO writes. The host has some load on it,
> but nothing heavy enough to completely hang a guest for that long.
> 
> mount -o loop some_image.fs ./somewhere bs=512
> dd if=/dev/zero of=/somewhere/zero
> then after ~1GB: sync
> 
> Host is running: 2.6.31.4
> QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88)
> 
> Guests are booted with "elevator=noop" as the filesystems are stored as
> files, accessed as virtio disks.
> 
> 
> The "hung" backtraces always look similar to these:
> [  361.460136] INFO: task loop0:2097 blocked for more than 120 seconds.
> [  361.460139] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  361.460142] loop0 D 88000b92c848 0  2097  2
> 0x0080
> [  361.460148]  88000b92c5d0 0046 880008c1f810
> 880009829fd8
> [  361.460153]  880009829fd8 880009829fd8 88000a21ee80
> 88000b92c5d0
> [  361.460157]  880009829610 8181b768 880001af33b0
> 0002
> [  361.460161] Call Trace:
> [  361.460216]  [] ? sync_page+0x0/0x43
> [  361.460253]  [] ? io_schedule+0x2c/0x43
> [  361.460257]  [] ? sync_page+0x3e/0x43
> [  361.460261]  [] ? __wait_on_bit+0x41/0x71
> [  361.460264]  [] ? wait_on_page_bit+0x6a/0x70
> [  361.460283]  [] ? wake_bit_function+0x0/0x23
> [  361.460287]  [] ? shrink_page_list+0x3e5/0x61e
> [  361.460291]  [] ? schedule_timeout+0xa3/0xbe
> [  361.460305]  [] ? autoremove_wake_function+0x0/0x2e
> [  361.460308]  [] ? shrink_zone+0x7e1/0xaf6
> [  361.460310]  [] ? determine_dirtyable_memory+0xd/0x17
> [  361.460314]  [] ? isolate_pages_global+0xa3/0x216
> [  361.460316]  [] ? mark_page_accessed+0x2a/0x39
> [  361.460335]  [] ? __find_get_block+0x13b/0x15c
> [  361.460337]  [] ? try_to_free_pages+0x1ab/0x2c9
> [  361.460340]  [] ? isolate_pages_global+0x0/0x216
> [  361.460343]  [] ? __alloc_pages_nodemask+0x394/0x564
> [  361.460350]  [] ? __slab_alloc+0x137/0x44f
> [  361.460371]  [] ? radix_tree_preload+0x1f/0x6a
> [  361.460374]  [] ? kmem_cache_alloc+0x5d/0x88
> [  361.460376]  [] ? radix_tree_preload+0x1f/0x6a
> [  361.460379]  [] ? add_to_page_cache_locked+0x1d/0xf1
> [  361.460381]  [] ? add_to_page_cache_lru+0x27/0x57
> [  361.460384]  [] ?
> grab_cache_page_write_begin+0x7a/0xa0
> [  361.460399]  [] ? ext3_write_begin+0x7e/0x201
> [  361.460417]  [] ? do_lo_send_aops+0xa1/0x174
> [  361.460420]  [] ? virt_to_head_page+0x9/0x2a
> [  361.460422]  [] ? loop_thread+0x309/0x48a
> [  361.460425]  [] ? do_lo_send_aops+0x0/0x174
> [  361.460427]  [] ? autoremove_wake_function+0x0/0x2e
> [  361.460430]  [] ? loop_thread+0x0/0x48a
> [  361.460432]  [] ? kthread+0x78/0x80
> [  361.460441]  [] ? finish_task_switch+0x2b/0x78
> [  361.460454]  [] ? child_rip+0xa/0x20
> [  361.460460]  [] ? native_pax_close_kernel+0x0/0x32
> [  361.460463]  [] ? kthread+0x0/0x80
> [  361.460469]  [] ? child_rip+0x0/0x20
> [  361.460471] INFO: task kjournald:2098 blocked for more than 120 seconds.
> [  361.460473] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  361.460474] kjournald D 88000b92e558 0  2098  2
> 0x0080
> [  361.460477]  88000b92e2e0 0046 88000aad9840
> 88000983ffd8
> [  361.460480]  88000983ffd8 88000983ffd8 81808e00
> 88000b92e2e0
> [  361.460483]  88000983fcf0 8181b768 880001af3c40
> 0002
> [  361.460486] Call Trace:
> [  361.460488]  [] ? sync_buffer+0x0/0x3c
> [  361.460491]  [] ? io_schedule+0x2c/0x43
> [  361.460494]  [] ? sync_buffer+0x38/0x3c
> [  361.460496]  [] ? __wait_on_bit+0x41/0x71
> [  361.460499]  [] ? sync_buffer+0x0/0x3c
> [  361.460501]  [] ? out_of_line_wait_on_bit+0x6a/0x76
> [  361.460504]  [] ? wake_bit_function+0x0/0x23
> [  361.460514]  [] ?
> journal_commit_transaction+0x769/0xbb8
> [  361.460517]  [] ? finish_task_switch+0x2b/0x78
> [  361.460519]  [] ? thread_return+0x40/0x79
> [  361.460522]  [] ? kjournald+0xc7/0x1cb
> [  361.460525]  [] ? autoremove_wake_function+0x0/0x2e
> [  361.460527]  [] ? kjournald+0x0/0x1cb
> [  361.460530]  [] ? kthread+0x78/0x80
> [  361.460532]  [] ? finish_task_switch+0x2b/0x78
> [  361.460534]  [] ? child_rip+0xa/0x20
> [  361.460537]  [] ? native_pax_close_kernel+0x0/0x32
> [  361.460540]  [] ? kthread+0x0/0x80
> [  361.460542]  [] ? child_rip+0x0/0x20
> [  361.460544] INFO: task dd:2132 blocked for more than 120 se

repeatable hang with loop mount and heavy IO in guest

2010-01-21 Thread Antoine Martin
I've tried various guests, including most recent Fedora12 kernels, 
custom 2.6.32.x
All of them hang around the same point (~1GB written) when I do heavy IO 
write inside the guest.
I have waited 30 minutes to see if the guest would recover, but it just 
sits there, not writing back any data, not doing anything - but 
certainly not allowing any new IO writes. The host has some load on it, 
but nothing heavy enough to completely hang a guest for that long.


mount -o loop some_image.fs ./somewhere bs=512
dd if=/dev/zero of=/somewhere/zero
then after ~1GB: sync

Host is running: 2.6.31.4
QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88)

Guests are booted with "elevator=noop" as the filesystems are stored as 
files, accessed as virtio disks.
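
For reference, a guest set up this way would be started with something
like the following (a hypothetical invocation; the exact command line
used here is not shown in the thread):

# file-backed image exposed to the guest as a virtio disk;
# elevator=noop goes on the guest kernel's command line, not here
qemu-kvm -m 1024 -smp 2 \
    -drive file=guest.img,if=virtio \
    -net nic -net user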



The "hung" backtraces always look similar to these:
[  361.460136] INFO: task loop0:2097 blocked for more than 120 seconds.
[  361.460139] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  361.460142] loop0 D 88000b92c848 0  2097  2 
0x0080
[  361.460148]  88000b92c5d0 0046 880008c1f810 
880009829fd8
[  361.460153]  880009829fd8 880009829fd8 88000a21ee80 
88000b92c5d0
[  361.460157]  880009829610 8181b768 880001af33b0 
0002

[  361.460161] Call Trace:
[  361.460216]  [] ? sync_page+0x0/0x43
[  361.460253]  [] ? io_schedule+0x2c/0x43
[  361.460257]  [] ? sync_page+0x3e/0x43
[  361.460261]  [] ? __wait_on_bit+0x41/0x71
[  361.460264]  [] ? wait_on_page_bit+0x6a/0x70
[  361.460283]  [] ? wake_bit_function+0x0/0x23
[  361.460287]  [] ? shrink_page_list+0x3e5/0x61e
[  361.460291]  [] ? schedule_timeout+0xa3/0xbe
[  361.460305]  [] ? autoremove_wake_function+0x0/0x2e
[  361.460308]  [] ? shrink_zone+0x7e1/0xaf6
[  361.460310]  [] ? determine_dirtyable_memory+0xd/0x17
[  361.460314]  [] ? isolate_pages_global+0xa3/0x216
[  361.460316]  [] ? mark_page_accessed+0x2a/0x39
[  361.460335]  [] ? __find_get_block+0x13b/0x15c
[  361.460337]  [] ? try_to_free_pages+0x1ab/0x2c9
[  361.460340]  [] ? isolate_pages_global+0x0/0x216
[  361.460343]  [] ? __alloc_pages_nodemask+0x394/0x564
[  361.460350]  [] ? __slab_alloc+0x137/0x44f
[  361.460371]  [] ? radix_tree_preload+0x1f/0x6a
[  361.460374]  [] ? kmem_cache_alloc+0x5d/0x88
[  361.460376]  [] ? radix_tree_preload+0x1f/0x6a
[  361.460379]  [] ? add_to_page_cache_locked+0x1d/0xf1
[  361.460381]  [] ? add_to_page_cache_lru+0x27/0x57
[  361.460384]  [] ? grab_cache_page_write_begin+0x7a/0xa0
[  361.460399]  [] ? ext3_write_begin+0x7e/0x201
[  361.460417]  [] ? do_lo_send_aops+0xa1/0x174
[  361.460420]  [] ? virt_to_head_page+0x9/0x2a
[  361.460422]  [] ? loop_thread+0x309/0x48a
[  361.460425]  [] ? do_lo_send_aops+0x0/0x174
[  361.460427]  [] ? autoremove_wake_function+0x0/0x2e
[  361.460430]  [] ? loop_thread+0x0/0x48a
[  361.460432]  [] ? kthread+0x78/0x80
[  361.460441]  [] ? finish_task_switch+0x2b/0x78
[  361.460454]  [] ? child_rip+0xa/0x20
[  361.460460]  [] ? native_pax_close_kernel+0x0/0x32
[  361.460463]  [] ? kthread+0x0/0x80
[  361.460469]  [] ? child_rip+0x0/0x20
[  361.460471] INFO: task kjournald:2098 blocked for more than 120 seconds.
[  361.460473] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  361.460474] kjournald D 88000b92e558 0  2098  2 
0x0080
[  361.460477]  88000b92e2e0 0046 88000aad9840 
88000983ffd8
[  361.460480]  88000983ffd8 88000983ffd8 81808e00 
88000b92e2e0
[  361.460483]  88000983fcf0 8181b768 880001af3c40 
0002

[  361.460486] Call Trace:
[  361.460488]  [] ? sync_buffer+0x0/0x3c
[  361.460491]  [] ? io_schedule+0x2c/0x43
[  361.460494]  [] ? sync_buffer+0x38/0x3c
[  361.460496]  [] ? __wait_on_bit+0x41/0x71
[  361.460499]  [] ? sync_buffer+0x0/0x3c
[  361.460501]  [] ? out_of_line_wait_on_bit+0x6a/0x76
[  361.460504]  [] ? wake_bit_function+0x0/0x23
[  361.460514]  [] ? 
journal_commit_transaction+0x769/0xbb8

[  361.460517]  [] ? finish_task_switch+0x2b/0x78
[  361.460519]  [] ? thread_return+0x40/0x79
[  361.460522]  [] ? kjournald+0xc7/0x1cb
[  361.460525]  [] ? autoremove_wake_function+0x0/0x2e
[  361.460527]  [] ? kjournald+0x0/0x1cb
[  361.460530]  [] ? kthread+0x78/0x80
[  361.460532]  [] ? finish_task_switch+0x2b/0x78
[  361.460534]  [] ? child_rip+0xa/0x20
[  361.460537]  [] ? native_pax_close_kernel+0x0/0x32
[  361.460540]  [] ? kthread+0x0/0x80
[  361.460542]  [] ? child_rip+0x0/0x20
[  361.460544] INFO: task dd:2132 blocked for more than 120 seconds.
[  361.460546] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[  361.460547] dd            D 88000a21f0f8     0  2132   2090 
0x0080
[  361.460550]  88000a21ee80 0082 88000a21ee80 
88000b3affd8
[  361.460553]  88000b3affd8 88000b3affd8 81808e00 
880001af3510
[  361.460556]  88000b78eaf0 88000b3daa00 880008de6c40