Re: [BUG] Guest kernel divide error in kvm_unlock_kick

2014-09-22 Thread Chris Webb
Paolo Bonzini pbonz...@redhat.com wrote:

> On 11/09/2014 19:03, Chris Webb wrote:
> > Paolo Bonzini pbonz...@redhat.com wrote:
> >
> > > This is a hypercall that should have kicked VCPU 3 (see rcx).
> > >
> > > Can you please apply this patch and gather a trace of the host
> > > (using trace-cmd -e kvm qemu-kvm arguments)?
> >
> > Sure, no problem. I've built the trace-cmd tool against udis86 (I hope) and
> > have put the resulting trace.dat at
> >
> >   http://cdw.me.uk/tmp/trace.dat
> >
> > This is actually for a -smp 2 qemu (failing to kick VCPU 1?) as I was having
> > trouble persuading the -smp 4 qemu to crash as reliably under tracing.
> > (Something timing related?) Otherwise the qemu-system-x86 command line is
> > exactly as before.
>
> Do you by chance have CONFIG_DEBUG_RODATA set?  In that case, the fix is
> simply not to set it.

Absolutely right: my host and guest kernels do have CONFIG_DEBUG_RODATA set!

Your patch to use alternatives for VMCALL vs VMMCALL definitely fixed the
divide-by-zero crashes I saw.

Given that I can easily use either (or both) of these solutions, would it be
more efficient to turn off CONFIG_DEBUG_RODATA in the guest kernel so kvm
can fix up the instructions in-place, or is using alternatives for
VMCALL/VMMCALL as implemented by your patch just as good?
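
For anyone decoding the oops reports in this thread: the Code: dumps contain
the bytes 0f 01 c1, which is the Intel VMCALL encoding, whereas the native AMD
hypercall instruction VMMCALL is encoded as 0f 01 d9. The following is only a
minimal sketch for spotting which of the two a dump contains. The function
name and the shortened sample string are illustrative assumptions and are not
part of the patch discussed above.

```python
# Hypercall instruction encodings (Intel and AMD architecture manuals).
VMCALL = bytes.fromhex("0f01c1")   # Intel VT-x hypercall
VMMCALL = bytes.fromhex("0f01d9")  # AMD SVM hypercall


def hypercall_opcodes(code_line):
    """Return the hypercall mnemonics found in an oops 'Code:' byte dump."""
    # Drop the label and any <..> marker around the faulting byte.
    text = code_line.replace("Code:", "").replace("<", "").replace(">", "")
    raw = bytes.fromhex("".join(text.split()))
    found = []
    for name, pattern in (("vmcall", VMCALL), ("vmmcall", VMMCALL)):
        if pattern in raw:
            found.append(name)
    return found


# Tail of the Code: dump from the 3.16.2 guest oops (shortened for illustration).
sample = "Code: 31 db 0f b7 0c 08 b8 05 00 00 00 0f 01 c1 0f 1f 00 5b 5d 41 5c c3"
print(hypercall_opcodes(sample))
```

A guest built with the alternatives patch would be expected to show the
VMMCALL bytes at this site once running on an AMD host, but that is an
assumption about the patched kernel rather than something verified here.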

Best wishes,

Chris.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] Guest kernel divide error in kvm_unlock_kick

2014-09-11 Thread Chris Webb
Paolo Bonzini pbonz...@redhat.com wrote:

> This is a hypercall that should have kicked VCPU 3 (see rcx).
>
> Can you please apply this patch and gather a trace of the host
> (using trace-cmd -e kvm qemu-kvm arguments)?

Sure, no problem. I've built the trace-cmd tool against udis86 (I hope) and
have put the resulting trace.dat at

  http://cdw.me.uk/tmp/trace.dat

This is actually for a -smp 2 qemu (failing to kick VCPU 1?) as I was having
trouble persuading the -smp 4 qemu to crash as reliably under tracing.
(Something timing related?) Otherwise the qemu-system-x86 command line is
exactly as before.

The guest kernel crash message which corresponds to this trace was:

divide error:  [#1] PREEMPT SMP 
Modules linked in:
CPU: 0 PID: 618 Comm: mkdir Not tainted 3.16.2-guest #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
task: 88007c997080 ti: 88007c614000 task.ti: 88007c614000
RIP: 0010:[81037fe2]  [81037fe2] kvm_unlock_kick+0x72/0x80
RSP: 0018:88007c617d40  EFLAGS: 00010046
RAX: 0005 RBX:  RCX: 0001
RDX: 0001 RSI: 88007fd11c40 RDI: 
RBP: 88007fd11c40 R08: 81b98940 R09: 0001
R10:  R11: 0007 R12: 00f6
R13: 0001 R14: 0001 R15: 00011c40
FS:  7f43eb1ed700() GS:88007fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 7f43eace0a30 CR3: 01a12000 CR4: 000406f0
Stack:
 88007c994380 88007c9949aa 0046 81689715
 810f3174 0001 ea0001f16320 ea0001f17860
  88007c99e1e8 88007c997080 0001
Call Trace:
 [81689715] ? _raw_spin_unlock+0x45/0x70
 [810f3174] ? try_to_wake_up+0x2a4/0x330
 [81101e2c] ? __wake_up_common+0x4c/0x80
 [81102418] ? __wake_up_sync_key+0x38/0x60
 [810d873a] ? do_notify_parent+0x19a/0x280
 [810f4d56] ? sched_move_task+0xb6/0x190
 [810cb4fc] ? do_exit+0xa1c/0xab0
 [810cc344] ? do_group_exit+0x34/0xb0
 [810cc3cb] ? SyS_exit_group+0xb/0x10
 [8168a16d] ? system_call_fastpath+0x1a/0x1f
Code: c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0 75 
c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 0f 01 c1 0f 1f 00 5b 
5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00 
RIP  [81037fe2] kvm_unlock_kick+0x72/0x80
 RSP 88007c617d40
---[ end trace bf5a4445f9decdbb ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: mkdir/618/0x0006
Modules linked in:
CPU: 0 PID: 618 Comm: mkdir Tainted: G  D   3.16.2-guest #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
  c022d302 81684029 
 810ee956 81686266 00011c40 88007c617fd8
 00011c40 88007c997080 0006 0046
Call Trace:
 [81684029] ? dump_stack+0x49/0x6a
 [810ee956] ? __schedule_bug+0x46/0x60
 [81686266] ? __schedule+0x5a6/0x7c0
 [816828cd] ? printk+0x59/0x75
 [810cb33b] ? do_exit+0x85b/0xab0
 [816828cd] ? printk+0x59/0x75
 [8100614a] ? oops_end+0x7a/0x100
 [810033e5] ? do_error_trap+0x85/0x110
 [81037fe2] ? kvm_unlock_kick+0x72/0x80
 [8114a358] ? __alloc_pages_nodemask+0x108/0xa60
 [8168b57e] ? divide_error+0x1e/0x30
 [81037fe2] ? kvm_unlock_kick+0x72/0x80
 [81689715] ? _raw_spin_unlock+0x45/0x70
 [810f3174] ? try_to_wake_up+0x2a4/0x330
 [81101e2c] ? __wake_up_common+0x4c/0x80
 [81102418] ? __wake_up_sync_key+0x38/0x60
 [810d873a] ? do_notify_parent+0x19a/0x280
 [810f4d56] ? sched_move_task+0xb6/0x190
 [810cb4fc] ? do_exit+0xa1c/0xab0
 [810cc344] ? do_group_exit+0x34/0xb0
 [810cc3cb] ? SyS_exit_group+0xb/0x10
 [8168a16d] ? system_call_fastpath+0x1a/0x1f
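
The register dump above can be read the same way Paolo read the earlier one.
This is a small sketch, assuming the usual x86-64 KVM hypercall convention
(rax carries the hypercall number, rbx and rcx the first two arguments) and
the KVM_HC_KICK_CPU value from include/uapi/linux/kvm_para.h. The helper name
is hypothetical, and the empty RBX field in the archived dump is assumed to
have been zero.

```python
# Hypercall number from include/uapi/linux/kvm_para.h.
KVM_HC_KICK_CPU = 5


def describe_hypercall(regs):
    """Interpret rax/rbx/rcx as captured at a KVM hypercall site."""
    nr = regs.get("rax", 0)
    if nr == KVM_HC_KICK_CPU:
        # For the kick hypercall, rbx holds flags and rcx the target APIC ID.
        return "KVM_HC_KICK_CPU flags=%d apicid=%d" % (
            regs.get("rbx", 0),
            regs.get("rcx", 0),
        )
    return "hypercall %d" % nr


# Values from the -smp 2 oops above (RAX: 0005, RBX empty, RCX: 0001).
print(describe_hypercall({"rax": 0x5, "rbx": 0x0, "rcx": 0x1}))
```

Under those assumptions the dump corresponds to an attempt to kick VCPU 1,
which matches the -smp 2 description earlier in this message.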

Best wishes,

Chris.


[BUG] Guest kernel divide error in kvm_unlock_kick

2014-09-08 Thread Chris Webb
I've reported this bug before, which reliably crashes a guest kernel shortly
after boot, but have just reconfirmed that it is still present with Linux
3.16.2 guest and host kernels and Qemu 2.1.

Running a 3.16.2 x86-64 SMP guest kernel on qemu-2.1, with kvm enabled and
-cpu host on a 3.16.2 AMD Opteron host, I'm seeing a reliable kernel panic
from the guest shortly after boot. I think it is happening in kvm_unlock_kick()
in the paravirt_ops code:

divide error:  [#1] PREEMPT SMP 
Modules linked in:
CPU: 0 PID: 743 Comm: syslogd Not tainted 3.16.2-guest #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
task: 88007c972580 ti: 88007cb7c000 task.ti: 88007cb7c000
RIP: 0010:[81037fe2]  [81037fe2] kvm_unlock_kick+0x72/0x80
RSP: :88007fc03ec8  EFLAGS: 00010046
RAX: 0005 RBX:  RCX: 0003
RDX: 0003 RSI: 81a466a0 RDI: 
RBP: 81a466a0 R08: 81b98940 R09: 0246
R10: 0400 R11:  R12: 00ea
R13: 0009 R14: 0002 R15: 88007fc0d300
FS:  7f2a6473e700() GS:88007fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 004a8240 CR3: 7ac75000 CR4: 000406f0
Stack:
 81a46400 0246 0001 8168979d
 0282 81110d97 0007 88007cb7ffd8
 88007c972580 4b0782e8 0002 81a0b0c8
Call Trace:
 <IRQ>
 [8168979d] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [81110d97] ? rcu_process_callbacks+0x337/0x4f0
 [810cde2d] ? __do_softirq+0xfd/0x210
 [810ce06e] ? irq_exit+0x7e/0xa0
 [8103063b] ? smp_apic_timer_interrupt+0x3b/0x50
 [8168b04d] ? apic_timer_interrupt+0x6d/0x80
 <EOI>
 [8114180b] ? filemap_map_pages+0x17b/0x240
 [811418c0] ? filemap_map_pages+0x230/0x240
 [811679e2] ? do_read_fault.isra.70+0x2a2/0x320
 [811696cc] ? handle_mm_fault+0x37c/0xd00
 [8103bb45] ? __do_page_fault+0x185/0x4c0
 [8168b958] ? async_page_fault+0x28/0x30
 [813b9610] ? __put_user_4+0x20/0x30
 [8168b958] ? async_page_fault+0x28/0x30
Code: c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0 75 
c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 0f 01 c1 0f 1f 00 5b 
5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00 
RIP  [81037fe2] kvm_unlock_kick+0x72/0x80
 RSP 88007fc03ec8
---[ end trace be08885ac2c94c6a ]---
Kernel panic - not syncing: Fatal exception in interrupt

My host kernel config is http://cdw.me.uk/tmp/host-config.txt and the guest
config is http://cdw.me.uk/tmp/guest-config.txt with qemu command line:

 qemu-system-x86 -enable-kvm -cpu host -machine q35 -m 2048 -name $1 \
   -smp sockets=1,cores=4 -pidfile /run/$1.pid -runas nobody \
   -serial stdio -vga none -vnc none -kernel /boot/vmlinuz-guest \
   -append "console=ttyS0 root=/dev/vda" \
   -drive file=/dev/guest/$1,cache=none,format=raw,if=virtio \
   -device virtio-rng-pci \
   -device virtio-net-pci,netdev=nic,mac=$(</sys/class/net/$1/address) \
   -netdev tap,id=nic,fd=3 3<>/dev/tap$(</sys/class/net/$1/ifindex)

I can stop this crash by disabling CONFIG_PARAVIRT_SPINLOCKS in my guest
kernel, running with -cpu qemu64 instead of -cpu host, or running with -smp 1
instead of -smp 4. (Removing/changing the -machine q35 makes no difference.)

/proc/cpuinfo on the host has 8 of these:

processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 21
model   : 2
model name  : AMD Opteron(tm) Processor 6328
stepping: 0
microcode   : 0x600081c
cpu MHz : 3200.000
cache size  : 2048 KB
physical id : 0
siblings: 8
core id : 0
cpu cores   : 4
apicid  : 32
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf 
pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c 
lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch 
osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core 
perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean 
flushbyasid decodeassists pausefilter pfthreshold bmi1
bogomips: 6399.70
TLB size: 1536 4K pages
clflush size: 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

and on the guest, has 4 of these:

processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 

Re: Divide error in kvm_unlock_kick()

2014-06-17 Thread Chris Webb
I see kernel 3.15 is now out, so I retested with 3.15 guest and host. I'm
still getting exactly the same guest kernel panic: a divide error in
kvm_unlock_kick with -cpu host, but not with -cpu qemu64:

divide error:  [#1] PREEMPT SMP 
Modules linked in:
CPU: 1 PID: 781 Comm: mkdir Not tainted 3.15.0-guest #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
task: 88007cbf6180 ti: 88088000 task.ti: 88088000
RIP: 0010:[8102d1e0]  [8102d1e0] kvm_unlock_kick+0x63/0x6b
RSP: :88007fc83d38  EFLAGS: 00010046
RAX: 0005 RBX:  RCX: 0002
RDX: 0002 RSI: 88007fd11d80 RDI: 81994840
RBP: 88007fd11d80 R08:  R09: 81994840
R10: 88007c480c88 R11: 0005 R12: cec0
R13: 88007d38332a R14: 0002 R15: 88007d382d00
FS:  7fdabf7fd700() GS:88007fc8() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fd0643f6509 CR3: 7c028000 CR4: 000406e0
Stack:
 00011d80 0002 88007fd11d80 8156f83f
 810dba53 0046 88007fd0 88007d3bbe70
 81845da8 0003  
Call Trace:
 <IRQ>
 [8156f83f] ? _raw_spin_unlock+0x32/0x55
 [810dba53] ? try_to_wake_up+0x1ed/0x20f
 [810e78b8] ? autoremove_wake_function+0x9/0x2a
 [810e739a] ? __wake_up_common+0x47/0x73
 [810e7547] ? __wake_up+0x33/0x44
 [8110f10b] ? irq_work_run+0x72/0x8f
 [81006079] ? smp_irq_work_interrupt+0x26/0x2b
 [8157185d] ? irq_work_interrupt+0x6d/0x80
 [810dba64] ? try_to_wake_up+0x1fe/0x20f
 [8102ad01] ? native_apic_msr_read+0x6/0x4e
 [8156f89f] ? _raw_spin_unlock_irqrestore+0x3d/0x65
 [810f2de3] ? rcu_process_callbacks+0x15e/0x47d
 [810cccf3] ? execute_in_process_context+0x55/0x55
 [810bdb98] ? __do_softirq+0xe0/0x1e6
 [810bde23] ? irq_exit+0x3c/0x81
 [810270e4] ? smp_apic_timer_interrupt+0x3b/0x46
 [8157135d] ? apic_timer_interrupt+0x6d/0x80
 <EOI>
Code: 0c c5 c0 b8 87 81 49 8d 04 0c 48 8b 30 48 39 ee 75 ca 8a 40 08 38 d8 75 
c3 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 0f 01 c1 5b 5d 41 5c 
c3 4c 8d 54 24 08 48 83 e4 f0 b9 0a 00 00 
RIP  [8102d1e0] kvm_unlock_kick+0x63/0x6b
 RSP 88007fc83d38
---[ end trace 949b1bf47cc57d09 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Shutting down cpus with NMI
Kernel Offset: 0x0 from 0x8100 (relocation range: 
0x8000-0x9fff)
---[ end Kernel panic - not syncing: Fatal exception in interrupt

I'm at a complete loss as to what to do next to debug this. Any help would be
extremely gratefully received!

I've put 3.15 host and guest configs here:

  http://cdw.me.uk/tmp/3.15-guest-config.txt
  http://cdw.me.uk/tmp/3.15-host-config.txt

dmesg just after boot here:

  http://cdw.me.uk/tmp/3.15-guest-dmesg.txt
  http://cdw.me.uk/tmp/3.15-host-dmesg.txt

and /proc/cpuinfo from both host and guest here:

  http://cdw.me.uk/tmp/3.15-guest-cpuinfo.txt
  http://cdw.me.uk/tmp/3.15-host-cpuinfo.txt

The qemu command line was

  qemu-system-x86 -enable-kvm -cpu host -machine q35 -m 2048 -name omega \
    -smp sockets=1,cores=4 -pidfile /run/omega.pid -runas nobody \
    -serial stdio -vga none -vnc none -kernel /boot/vmlinuz-guest \
    -append "console=ttyS0 root=/dev/vda" \
    -drive file=/dev/guest/omega,cache=none,format=raw,if=virtio \
    -device virtio-rng-pci \
    -device virtio-net-pci,netdev=nic,mac=02:14:72:3c:69:54 \
    -netdev tap,id=nic,fd=3,vhost=on 3<>/dev/tapNNN

but removing the -machine q35 and -device virtio-rng-pci doesn't affect the
crash.

Dropping to -smp 1, running with -cpu qemu64, or compiling the guest kernel
without paravirtualised spinlock support does remove the panic, albeit at the
cost of performance.

Best wishes,

Chris.


Re: Divide error in kvm_unlock_kick()

2014-06-01 Thread Chris Webb
I realised my original bug report was for a guest kernel compiled without
frame pointers which might be unhelpful, so I enabled CONFIG_DEBUG_INFO and
CONFIG_FRAME_POINTER, but I don't think this has made the backtrace any more
detailed.

Is there anything more I can do to pinpoint what might be going on here?

Cheers,

Chris.


divide error:  [#1] PREEMPT SMP 
Modules linked in:
CPU: 1 PID: 1013 Comm: mkdir Not tainted 3.14.4-guest #21
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
task: 88007c8cf400 ti: 88007c7c6000 task.ti: 88007c7c6000
RIP: 0010:[8102ea86]  [8102ea86] kvm_unlock_kick+0x69/0x73
RSP: :88007fc83ca8  EFLAGS: 00010046
RAX: 0005 RBX:  RCX: 0002
RDX: 0002 RSI: 88007fd11d40 RDI: 8198f840
RBP: 88007fc83cc0 R08:  R09: 8198f840
R10: b5e0 R11: 0005 R12: 88007fd11d40
R13: cec0 R14: 88007d382b80 R15: 0002
FS:  7f4c6e265700() GS:88007fc8() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f4c6dc9a080 CR3: 7c62e000 CR4: 000406e0
Stack:
 00011d40 88007fd11d40 0002 88007fc83cd0
 815852d0 88007fc83d20 810dd694 88007fd0
 0046 88007d383172 88007d3abe68 0003
Call Trace:
 <IRQ>
 [815852d0] _raw_spin_unlock+0x36/0x5b
 [810dd694] try_to_wake_up+0x1f4/0x217
 [810dd6f6] default_wake_function+0xd/0xf
 [810e99f0] autoremove_wake_function+0xd/0x2f
 [810e944f] __wake_up_common+0x50/0x7c
 [810e962f] __wake_up+0x34/0x46
 [810f3b45] rsp_wakeup+0x1c/0x1e
 [81112e31] irq_work_run+0x77/0x9b
 [810063e2] smp_irq_work_interrupt+0x2a/0x31
 [8158739d] irq_work_interrupt+0x6d/0x80
 [81585336] ? _raw_spin_unlock_irqrestore+0x41/0x6a
 [810f5402] rcu_process_callbacks+0x162/0x486
 [810c4140] ? run_timer_softirq+0x19f/0x1c0
 [810be612] __do_softirq+0xe1/0x1e9
 [810be8b7] irq_exit+0x40/0x87
 [810283f1] smp_apic_timer_interrupt+0x3f/0x4b
 [81586e9d] apic_timer_interrupt+0x6d/0x80
 <EOI>
Code: c5 40 50 87 81 49 8d 44 0d 00 48 8b 30 4c 39 e6 75 c9 8a 40 08 38 d8 75 
c2 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 0f 01 c1 5b 41 5c 41 
5d 5d c3 4c 8d 54 24 08 48 83 e4 f0 b9 0a 
RIP  [8102ea86] kvm_unlock_kick+0x69/0x73
 RSP 88007fc83ca8
---[ end trace ed563ea2dedc59b5 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Shutting down cpus with NMI
Kernel Offset: 0x0 from 0x8100 (relocation range: 
0x8000-0x9fff)



Re: Divide error in kvm_unlock_kick()

2014-05-29 Thread Chris Webb
Chris Webb ch...@arachsys.com wrote:

> My CPU flags inside the crashing guest look like this:
>
> fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
> mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb lm rep_good nopl
> extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt aes xsave
> avx f16c hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse
> 3dnowprefetch osvw xop fma4 tbm arat npt nrip_save tsc_adjust bmi1
>
> whereas in a (working) -cpu qemu64 guest, they look like this:
>
> fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx
> fxsr sse sse2 ht syscall nx lm nopl pni cx16 x2apic popcnt hypervisor lahf_lm
> cmp_legacy svm abm sse4a

I thought I'd try to bisect on processor flags to see which was/were
implicated. The extra flags from -cpu host compared to -cpu qemu64 are:

3dnowprefetch aes arat avx bmi1 cr8_legacy extd_apicid f16c fma fma4
fxsr_opt misalignsse mmxext npt nrip_save osvw pclmulqdq pdpe1gb rep_good
sse4_1 sse4_2 ssse3 tbm tsc_adjust vme xop xsave
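
The list above is just a set difference between the two flag lines, and it
can be reproduced with a short script. This is a sketch only: the variable and
function names are made up for illustration, and the final line merely prints
a -cpu qemu64,+FLAG,... option string without checking that qemu knows every
flag name.

```python
# Flag lines copied from the two guests' /proc/cpuinfo as quoted in this thread.
CPU_HOST_GUEST_FLAGS = """
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb lm rep_good nopl
extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt aes xsave
avx f16c hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse
3dnowprefetch osvw xop fma4 tbm arat npt nrip_save tsc_adjust bmi1
"""

QEMU64_GUEST_FLAGS = """
fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx
fxsr sse sse2 ht syscall nx lm nopl pni cx16 x2apic popcnt hypervisor lahf_lm
cmp_legacy svm abm sse4a
"""


def extra_flags(crashing, working):
    """Flags present in the crashing guest but absent from the working one."""
    return sorted(set(crashing.split()) - set(working.split()))


extra = extra_flags(CPU_HOST_GUEST_FLAGS, QEMU64_GUEST_FLAGS)
print(len(extra), " ".join(extra))

# Candidate option string for adding the flags back onto the working model.
print("-cpu qemu64," + ",".join("+" + flag for flag in extra))
```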

I can add all of these to -cpu qemu64 with the +FLAG,... syntax and obtain a
working guest, but qemu doesn't recognise a handful of them:

CPU feature tsc_adjust not found
CPU feature arat not found
CPU feature cr8_legacy not found
CPU feature extd_apicid not found
CPU feature rep_good not found
CPU feature tsc_adjust not found
Failed to access perfctr msr (MSR c0010001 is )
[...]

Doing this results in a working, non-crashing guest, which suggests the
behaviour is triggered by one of tsc_adjust, arat, cr8_legacy, extd_apicid
or rep_good. However, because qemu doesn't recognise the flags, I can't run
with -cpu host,-tsc_adjust,-arat,... to investigate further. :(

Cheers,

Chris.


Re: Divide error in kvm_unlock_kick()

2014-05-29 Thread Chris Webb
Paolo Bonzini pbonz...@redhat.com wrote:

> On 29/05/2014 19:45, Chris Webb wrote:
> > Chris Webb ch...@arachsys.com wrote:
> >
> > > My CPU flags inside the crashing guest look like this:
> > >
> > > fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
> > > mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb lm rep_good nopl
> > > extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt aes xsave
> > > avx f16c hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse
> > > 3dnowprefetch osvw xop fma4 tbm arat npt nrip_save tsc_adjust bmi1
> > >
> > > whereas in a (working) -cpu qemu64 guest, they look like this:
> > >
> > > fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx
> > > fxsr sse sse2 ht syscall nx lm nopl pni cx16 x2apic popcnt hypervisor lahf_lm
> > > cmp_legacy svm abm sse4a
> >
> > I thought I'd try to bisect on processor flags to see which was/were
> > implicated.
>
> Can you dump the full /proc/cpuinfo?

On the host, it looks like this:

processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 21
model   : 2
model name  : AMD Opteron(tm) Processor 6328
stepping: 0
microcode   : 0x600081c
cpu MHz : 3200.000
cache size  : 2048 KB
physical id : 0
siblings: 8
core id : 0
cpu cores   : 4
apicid  : 32
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf 
pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c 
lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch 
osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core 
perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean 
flushbyasid decodeassists pausefilter pfthreshold bmi1
bogomips: 6399.89
TLB size: 1536 4K pages
clflush size: 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

[ x8 for processor 0 - 7; full dump at http://cdw.me.uk/tmp/host-cpuinfo.txt ]

and on the guest it looks like:

processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 21
model   : 2
model name  : AMD Opteron(tm) Processor 6328
stepping: 0
microcode   : 0x165
cpu MHz : 3199.946
cache size  : 2048 KB
physical id : 0
siblings: 4
core id : 0
cpu cores   : 4
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb lm 
rep_good nopl extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic 
popcnt aes xsave avx f16c hypervisor lahf_lm cmp_legacy svm cr8_legacy abm 
sse4a misalignsse 3dnowprefetch osvw xop fma4 tbm arat npt nrip_save tsc_adjust 
bmi1
bogomips: 6399.89
TLB size: 1536 4K pages
clflush size: 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management:

[ x4 for processor 0 - 3; full dump at http://cdw.me.uk/tmp/guest-cpuinfo.txt ]

Many thanks in advance for any pointers.

Best wishes,

Chris.


Divide error in kvm_unlock_kick()

2014-05-28 Thread Chris Webb
Running a 3.14.4 x86-64 SMP guest kernel on qemu-2.0, with kvm enabled and
-cpu host on a 3.14.4 AMD Opteron host, I'm seeing a reliable kernel panic from
the guest shortly after boot. I think it is happening in kvm_unlock_kick() in the
paravirt_ops code:

divide error:  [#1] PREEMPT SMP 
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.14.4-guest #16
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
task: 88007d384880 ti: 88007d3b2000 task.ti: 88007d3b2000
RIP: 0010:[8102f0cc]  [8102f0cc] kvm_unlock_kick+0x63/0x6b
RSP: 0018:88007fc83db0  EFLAGS: 00010046
RAX: 0005 RBX:  RCX: 0003
RDX: 0003 RSI: 88007fd91d40 RDI: 0008
RBP: 88007fd91d40 R08:  R09: 8198e840
R10: 88007cbc7400 R11: 88007cbc9d00 R12: cec0
R13: 0001 R14: 88007fd91d40 R15: 0001
FS:  7ff42a4d3700() GS:88007fc8() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 7ff42a290006 CR3: 7c76d000 CR4: 000406e0
Stack:
 88007fd11d40 88007d361cc0 88007fc8d240 81563990
 810e42a6 00038102fa73 0282 
 88007fd12668 88007fc83ecc 00ff 006b
Call Trace:
 <IRQ>
 [81563990] ? _raw_spin_unlock+0x57/0x61
 [810e42a6] ? load_balance+0x4ff/0x783
 [810e4681] ? rebalance_domains+0x157/0x20c
 [810e4841] ? run_rebalance_domains+0x10b/0x148
 [810be7c1] ? __do_softirq+0xec/0x1fe
 [810beacc] ? irq_exit+0x48/0x8d
 [815658dd] ? reschedule_interrupt+0x6d/0x80
 <EOI>
 [8100a842] ? hard_enable_TSC+0x2e/0x2e
 [8102fbe1] ? native_safe_halt+0x2/0x3
 [8100a853] ? default_idle+0x11/0x14
 [810ed4e7] ? cpu_startup_entry+0x153/0x1d2
 [810277ad] ? start_secondary+0x220/0x23c
Code: 0c c5 40 50 87 81 49 8d 04 0c 48 8b 30 48 39 ee 75 ca 8a 40 08 38 d8 75 
c3 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 0f 01 c1 5b 5d 41 5c 
c3 4c 8d 54 24 08 48 83 e4 f0 b9 0a 00 00 
RIP  [8102f0cc] kvm_unlock_kick+0x63/0x6b
 RSP 88007fc83db0
---[ end trace 2278d9742b4dff74 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Shutting down cpus with NMI
Kernel Offset: 0x0 from 0x8100 (relocation range: 
0x8000-0x9fff)

My host kernel config is http://cdw.me.uk/tmp/host-config.txt and the guest
config is http://cdw.me.uk/tmp/guest-config.txt with qemu command line:

  qemu-system-x86 -enable-kvm -cpu host -machine q35 -m 2048 -name $1 \
-smp sockets=1,cores=4 -pidfile /run/$1.pid -runas nobody \
-serial stdio -vga none -vnc none -kernel /boot/vmlinuz-guest \
-append "console=ttyS0 root=/dev/vda" \
-drive file=/dev/guest/$1,cache=none,format=raw,if=virtio \
-device virtio-net-pci,netdev=nic,mac=$(</sys/class/net/$1/address) \
-netdev tap,id=nic,fd=3 3<>/dev/tap$(</sys/class/net/$1/ifindex)

I can stop this crash by disabling CONFIG_PARAVIRT_SPINLOCKS in my guest
kernel, running with -cpu qemu64 instead of -cpu host, or running with -smp 1
instead of -smp 4. (Removing/changing the -machine q35 makes no difference.)

My CPU flags inside the crashing guest look like this:

fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb lm rep_good nopl
extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt aes xsave
avx f16c hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse
3dnowprefetch osvw xop fma4 tbm arat npt nrip_save tsc_adjust bmi1

whereas in a (working) -cpu qemu64 guest, they look like this:

fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx
fxsr sse sse2 ht syscall nx lm nopl pni cx16 x2apic popcnt hypervisor lahf_lm
cmp_legacy svm abm sse4a

I tried enabling CONFIG_PARAVIRT_DEBUG, but no extra information was reported.

Very happy to do any testing at my end which might help track down what's going
on here.

Best wishes,

Chris.


qemu-kvm guest which won't 'cont' (emulation failure?)

2011-10-24 Thread Chris Webb
I have a qemu-kvm guest (apparently a Ubuntu 11.04 x86-64 install) which has
stopped and refuses to continue:

  (qemu) info status
  VM status: paused
  (qemu) cont
  (qemu) info status
  VM status: paused

The host is running linux 2.6.39.2 with qemu-kvm 0.14.1 on 24-core Opteron
6176 box, and has nine other 2GB production guests on it running absolutely
fine.

It's been a while since I've seen one of these. When I last saw a cluster of
them, they were emulation failures (big real mode instructions, maybe?). I
also remember a message about abnormal exit in the dmesg previously, but I
don't have that here. This time, there is no host kernel output at all, just
the paused guest.

I have qemu monitor access and can even strace the relevant qemu process if
necessary: is it possible to use this to diagnose what's caused this guest
to stop, e.g. the unsupported instruction if it's an emulation failure?

Cheers,

Chris.


Re: [Qemu-devel] qemu-kvm guest which won't 'cont' (emulation failure?)

2011-10-24 Thread Chris Webb
Kevin Wolf kw...@redhat.com writes:

> On 24.10.2011 12:00, Chris Webb wrote:
> > I have qemu monitor access and can even strace the relevant qemu process if
> > necessary: is it possible to use this to diagnose what's caused this guest
> > to stop, e.g. the unsupported instruction if it's an emulation failure?
>
> Another common cause for stopped VMs are I/O errors, for example writes
> to a sparse image when the disk is full.

This guest is backed by LVM LVs, so I don't think writes can fail with ENOSPC,
but I could imagine read errors, so I've just done a trivial test to make sure
I can read them end-to-end:

  0015# dd if=/dev/mapper/guest\:e549f8e1-4c0e-4dea-826a-e4b877282c07\:ide\:0\:0 of=/dev/null bs=1M
  3136+0 records in
  3136+0 records out
  3288334336 bytes (3.3 GB) copied, 20.898 s, 157 MB/s

  0015# dd if=/dev/mapper/guest\:e549f8e1-4c0e-4dea-826a-e4b877282c07\:ide\:0\:1 of=/dev/null bs=1M
  276+0 records in
  276+0 records out
  289406976 bytes (289 MB) copied, 1.85218 s, 156 MB/s

Is there any way to ask qemu why a guest has stopped, so I can distinguish IO
problems from emulation problems from anything else?

Cheers,

Chris.


Re: [Qemu-devel] qemu-kvm guest which won't 'cont' (emulation failure?)

2011-10-24 Thread Chris Webb
Kevin Wolf kw...@redhat.com writes:

> In qemu 1.0 we'll have an extended 'info status' that includes the stop
> reason, but 0.14 doesn't have this yet (was committed to git master only
> recently).

Right, okay. I might take a look at cherry-picking and back-porting that to
our version of qemu-kvm if it's not too entangled with other changes. It
would be very useful in these situations.

> If you attach a QMP monitor (see QMP/README, don't forget to send the
> capabilities command, it's part of creating the connection) you will
> receive messages for I/O errors, though.

Thanks. I don't think I can do this with an already-running qemu-kvm that's
in a stopped state, can I? Presumably I'd need a new qemu-kvm invocation and
then have to wait to try to catch the problem again?
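
For reference, the QMP handshake Kevin describes can be sketched in a few
lines of Python. This assumes qemu was started with a QMP UNIX socket (for
example -qmp unix:/run/guest.qmp,server,nowait, where the path is purely
hypothetical) and that I/O errors arrive as BLOCK_IO_ERROR events. The
function names are illustrative, and the sketch does not handle an event
arriving before the reply to the capabilities command.

```python
import json
import socket


def qmp_negotiate(chan):
    """Read the QMP greeting, then send qmp_capabilities to enter command mode."""
    greeting = json.loads(chan.readline())
    chan.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
    chan.flush()
    reply = json.loads(chan.readline())
    return greeting, reply


def is_block_io_error(message):
    """True if a decoded QMP message is a block I/O error event."""
    return message.get("event") == "BLOCK_IO_ERROR"


def watch(path):
    """Connect to a QMP UNIX socket and print block I/O error events."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        chan = sock.makefile("rw", encoding="utf-8")
        qmp_negotiate(chan)
        for line in chan:
            message = json.loads(line)
            if is_block_io_error(message):
                print(message.get("data"))
```

Calling watch("/run/guest.qmp") would then block and report errors as they
occur, which only helps for a guest that was started with the socket in place.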

Cheers,

Chris.


Re: [Qemu-devel] qemu-kvm guest which won't 'cont' (emulation failure?)

2011-10-24 Thread Chris Webb
Kevin Wolf kw...@redhat.com writes:

> Good point... The only other thing that I can think of would be
> attaching gdb and setting a breakpoint in vm_stop() or something.

Perfect, that seems to have identified what's going on very nicely:

(gdb) break vm_stop
Breakpoint 1 at 0x407d10: file /home/root/packages/qemu-kvm/src-UMBurO/cpus.c, line 318.
(gdb) fg
Continuing.

Breakpoint 1, vm_stop (reason=0)
    at /home/root/packages/qemu-kvm/src-UMBurO/cpus.c:318
318     /home/root/packages/qemu-kvm/src-UMBurO/cpus.c: No such file or directory.
        in /home/root/packages/qemu-kvm/src-UMBurO/cpus.c
(gdb) bt
#0  vm_stop (reason=0) at /home/root/packages/qemu-kvm/src-UMBurO/cpus.c:318
#1  0x0058585f in ide_handle_rw_error (s=0x20330d8, error=28, op=8)
    at /home/root/packages/qemu-kvm/src-UMBurO/hw/ide/core.c:468
#2  0x00588376 in ide_dma_cb (opaque=0x20330d8, ret=<value optimized out>)
    at /home/root/packages/qemu-kvm/src-UMBurO/hw/ide/core.c:494
#3  0x00590092 in dma_bdrv_cb (opaque=0x2043a10, ret=-28)
    at /home/root/packages/qemu-kvm/src-UMBurO/dma-helpers.c:94
#4  0x0044d64a in qcow2_aio_write_cb (opaque=0x2034900, ret=-28)
    at block/qcow2.c:714
#5  0x0043df6d in posix_aio_process_queue (opaque=<value optimized out>)
    at posix-aio-compat.c:462
#6  0x0043e07d in posix_aio_read (opaque=0x17c8110)
    at posix-aio-compat.c:503
#7  0x00415fca in main_loop_wait (nonblocking=<value optimized out>)
    at /home/root/packages/qemu-kvm/src-UMBurO/vl.c:1383
#8  0x0042ca37 in kvm_main_loop ()
    at /home/root/packages/qemu-kvm/src-UMBurO/qemu-kvm.c:1589
#9  0x004170a3 in main (argc=32, argv=<value optimized out>,
    envp=<value optimized out>)
    at /home/root/packages/qemu-kvm/src-UMBurO/vl.c:1429

I see what's happened here: we're not explicitly setting format=raw when we
start that guest and someone's uploaded a qcow2 image directly to a block
device. Ouch. Sorry for the noise!
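
For the record, error=28 (ret=-28) in the backtrace is ENOSPC on Linux, which
fits a qcow2 image trying to grow past the end of its block device. Below is a
small sketch of the two checks that would have caught this earlier: decoding
the errno, and testing whether a device starts with the qcow/qcow2 magic bytes
"QFI\xfb". The function names are hypothetical, and the header bytes in the
example are a made-up sample rather than data read from the real device.

```python
import errno
import os

# First four bytes of a qcow/qcow2 image header.
QCOW_MAGIC = b"QFI\xfb"


def describe_error(code):
    """Map a (possibly negated) errno value to its symbolic name and message."""
    code = abs(code)
    return "%s: %s" % (errno.errorcode.get(code, "unknown"), os.strerror(code))


def looks_like_qcow(header):
    """True if the given leading bytes carry the qcow/qcow2 magic number."""
    return header[:4] == QCOW_MAGIC


def device_looks_like_qcow(path):
    """Read the first four bytes of a device or file and test for the magic."""
    with open(path, "rb") as device:
        return looks_like_qcow(device.read(4))


print(describe_error(-28))
print(looks_like_qcow(b"QFI\xfb\x00\x00\x00\x02"))
```

Passing format=raw explicitly on the -drive option avoids the probing
altogether, so the check is only useful for auditing existing block devices.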

Best wishes,

Chris.


Re: Host where KSM appears to save a negative amount of memory

2011-08-22 Thread Chris Webb
Hugh Dickins hu...@google.com writes:

> KSM chooses to show the numbers pages_shared and pages_sharing as
> exclusive counts: pages_sharing indicates the saving being made.  So it
> would be perfectly reasonable to add those two numbers together to get
> the total number of pages sharing, the number you expected it to show;
> but it doesn't make sense to subtract shared from sharing.

Hi. Many thanks for your helpful and detailed explanation. I've fixed our
monitoring to correctly use just pages_sharing to measure the savings. I
think I just assumed the meanings of pages_shared and pages_sharing from
their names. This means that ksm has been saving even more memory than we
thought on our hosts in the past!
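
For the archives, a minimal sketch of the corrected calculation, assuming
(as Hugh explains) that pages_sharing on its own is the number of pages
saved and that the two counters can be added to give the total number of
pages sharing. The function names are invented for illustration and are not
part of any kernel interface:

```c
#include <stdint.h>

/* Bytes saved by KSM: pages_sharing alone is the saving, so it is simply
 * multiplied by the page size. pages_shared is not subtracted. */
uint64_t ksm_saved_bytes(uint64_t pages_sharing, uint64_t page_size)
{
    return pages_sharing * page_size;
}

/* Total number of pages sharing: the two exclusive counts added together. */
uint64_t ksm_total_sharing(uint64_t pages_shared, uint64_t pages_sharing)
{
    return pages_shared + pages_sharing;
}
```

With 4k pages, a pages_sharing value of 100000 corresponds to roughly 390MB
saved.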

Best wishes,

Chris.


Host where KSM appears to save a negative amount of memory

2011-08-21 Thread Chris Webb
We're running KSM on kernel 2.6.39.2 with hosts running a number of qemu-kvm
virtual machines, and it has consistently been saving us a useful amount of
RAM.

To monitor the effective amount of memory saved, I've been looking at the
difference between /sys/kernel/mm/ksm/pages_sharing and pages_shared. On a
typical 32GB host, this has been coming out as at least a hundred thousand
or so, which is presumably half to one gigabyte worth of 4k pages.

However, this morning we've spotted something odd - a host where
pages_sharing is smaller than pages_shared, giving a negative saving by the
above calculation:

  # cat /sys/kernel/mm/ksm/pages_sharing
  104
  # cat /sys/kernel/mm/ksm/pages_shared
  1761313

I think this means my interpretation of these values must be wrong, as I
presumably can't have more pages being shared than instances of their use!
Can anyone shed any light on what might be going on here for me? Am I
misinterpreting these values, or does this look like it might be an
accounting bug? (If the latter, what useful debug info can I extract from
the system to help identify it?)

Best wishes,

Chris.


High CPU use of -usbdevice tablet (was Re: KVM usability)

2010-04-04 Thread Chris Webb
Avi Kivity a...@redhat.com writes:

 On 03/02/2010 11:34 AM, Jernej Simončič wrote:
 On Tuesday, March 2, 2010, 9:21:18, Chris Webb wrote:
 
 I remember about a year ago, someone asserting on the list that -usbdevice
 tablet was very CPU intensive even when not in use, and should be avoided if
 mouse support wasn't needed, e.g. on non-graphical VMs. Was that actually a
 significant hit, and is it still true today?
 It would appear that this is still the case, at least on slower hosts
 - on Atom Z530 (1,6GHz), the XP VM uses ~30% CPU when idle with
 -usbdevice tablet, but only ~4% without it. However, on a faster host
 (Core2 Quad 2,66GHz), there's practically no difference (Vista x64 VM
 uses ~1% CPU when idle regardless of -usbdevice tablet).
 
 Looks like the tablet is set to 100 Hz polling rate.  We may be able
 to get away with 30 Hz or even less (ep_bInterval, in ms, in
 hw/usb-wacom.c).

Hi Avi. Sorry for the very late follow-up, but I decided to experiment with
this. The cpu impact of the usb tablet device shows up fairly clearly on a
crude test on my (relatively low-spec) desktop. Running an idle Fedora 11
livecd on qemu-kvm 0.12.3, top shows around 0.1% of my cpu in use, but this
increases to roughly 5% when specifying -usbdevice tablet, and more detailed
examination with perf record/report suggests about a factor of thirty too.

It's actually a more general symptom with USB or at least HID devices by the
look of things: although -usb doesn't increase CPU use on its own, the same
increase in load can also be triggered by -usbdevice keyboard or mouse.
However, running with all three of -usbdevice mouse, keyboard and tablet
doesn't increase load any more than just one of these.

Changing the USB tablet polling interval from 10ms to 100ms in both
hw/usb-wacom.c and hw/usb-hid.c made no difference except for an increase in
bInterval shown in lsusb -v in the guest and the hint of jerky mouse
movement I expected from setting this value so high. A similar change to the
polling interval for the keyboard and mouse also made no difference to their
performance impact.

Taking the FRAME_TIMER_FREQ down to 100 in hw/usb-uhci.c does seem to reduce
the CPU load quite a bit, but at the expense of making the USB tablet (and
presumably all other USB devices) very laggy.

Could there be some bug here that causes the usb hid devices to wake qemu at
the maximum rate possible (FRAME_TIMER_FREQ?) rather than the configured
polling interval?
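
For illustration, a small sketch of the arithmetic behind that suspicion. It
assumes bInterval is expressed in milliseconds, as for a low/full-speed
interrupt endpoint, and that the UHCI frame timer runs at 1000 Hz unless
changed (believed to be the FRAME_TIMER_FREQ default, but treat that as an
assumption). The function names are invented for the example:

```c
/* Polls per second a device asks for with a given bInterval (in ms). */
unsigned polls_per_second(unsigned binterval_ms)
{
    return binterval_ms ? 1000u / binterval_ms : 0u;
}

/* How many times more often the frame timer fires than the device needs
 * to be polled. A value of 1 means no wasted wakeups. */
unsigned excess_wakeup_factor(unsigned frame_timer_hz, unsigned binterval_ms)
{
    unsigned polls = polls_per_second(binterval_ms);
    return polls ? frame_timer_hz / polls : 0u;
}
```

With these numbers, a 10ms interval asks for 100 polls per second, so a
1000 Hz frame timer would wake qemu ten times more often than the device
needs. At 100ms the factor is a hundred, which would be consistent with the
interval change making no visible difference to CPU use.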

Best wishes,

Chris.

PS Vmmouse works fine as an absolute pointing device in the place of
-usbdevice tablet without the performance impact, but this isn't supported
out of the box with typical linux live CDs (e.g. Fedora 11 and 12 or
Knoppix) so unfortunately it's probably less suitable as a default
configuration to expose to end-users.


Re: [PATCH][RFC/T/D] Unmapped page cache control - via boot parameter

2010-03-22 Thread Chris Webb
Chris Webb ch...@arachsys.com writes:

 Okay. What I was driving at in describing these systems as 'already broken'
 is that they will already lose data (in this sense) if they're run on bare
 metal with normal commodity SATA disks with their 32MB write caches on. That
 configuration surely describes the vast majority of PC-class desktops and
 servers!
 
 If I understand correctly, your point here is that the small cache on a real
 SATA drive gives a relatively small time window for data loss, whereas the
 worry with cache=writeback is that the host page cache can be gigabytes, so
 the time window for unsynced data to be lost is potentially enormous.
 
 Isn't the fix for that just forcing periodic sync on the host to bound-above
 the time window for unsynced data loss in the guest?

For the benefit of the archives, it turns out the simplest fix for this is
already implemented as a vm sysctl in linux. Set vm.dirty_bytes to 32<<20
(33554432), and the size of dirty page cache is bounded above by 32MB, so we
are simulating exactly the case of a SATA drive with a 32MB writeback-cache.
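
As a worked example of the number involved, a minimal sketch that turns a
cache size in MB into the byte count vm.dirty_bytes expects and formats it
as a sysctl.conf-style line. The helper names are hypothetical; only the
vm.dirty_bytes key itself comes from the kernel:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Convert a size in MB (2^20 bytes) to bytes. */
uint64_t mib_to_bytes(uint64_t mib)
{
    return mib << 20;
}

/* Write "vm.dirty_bytes = <bytes>" into buf. Returns what snprintf
 * returns: the length the full line needs, excluding the terminator. */
int format_dirty_bytes(char *buf, size_t len, uint64_t mib)
{
    return snprintf(buf, len, "vm.dirty_bytes = %" PRIu64, mib_to_bytes(mib));
}
```

For a 32MB bound this produces the line "vm.dirty_bytes = 33554432".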

Unless I'm missing something, the risk to guest OSes in this configuration
should therefore be exactly the same as the risk from running on normal
commodity hardware with such drives and no expensive battery-backed RAM.

Cheers,

Chris.


Re: [PATCH][RFC/T/D] Unmapped page cache control - via boot parameter

2010-03-22 Thread Chris Webb
Avi Kivity a...@redhat.com writes:

 On 03/22/2010 11:04 PM, Chris Webb wrote:

 Unless I'm missing something, the risk to guest OSes in this configuration
 should therefore be exactly the same as the risk from running on normal
 commodity hardware with such drives and no expensive battery-backed RAM.
 
 A host crash will destroy your data.  If  your machine is connected
 to a UPS, only a firmware crash can destroy your data.

Yes, that's a good point: in this configuration a host crash is equivalent
to a power failure rather than an OS crash in terms of data loss.

Cheers,

Chris.


Re: [PATCH][RFC/T/D] Unmapped page cache control - via boot parameter

2010-03-17 Thread Chris Webb
Anthony Liguori anth...@codemonkey.ws writes:

 This really gets down to your definition of safe behaviour.  As it
 stands, if you suffer a power outage, it may lead to guest
 corruption.
 
 While we are correct in advertising a write-cache, write-caches are
 volatile and should a drive lose power, it could lead to data
 corruption.  Enterprise disks tend to have battery backed write
 caches to prevent this.
 
 In the set up you're emulating, the host is acting as a giant write
 cache.  Should your host fail, you can get data corruption.

Hi Anthony. I suspected my post might spark an interesting discussion!

Before considering anything like this, we did quite a bit of testing with
OSes in qemu-kvm guests running filesystem-intensive work, using an ipmitool
power off to kill the host. I didn't manage to corrupt any ext3, ext4 or
NTFS filesystems despite these efforts.

Is your claim here that:-

  (a) qemu doesn't emulate a disk write cache correctly; or

  (b) operating systems are inherently unsafe running on top of a disk with
  a write-cache; or

  (c) installations that are already broken and lose data with a physical
  drive with a write-cache can lose much more in this case because the
  write cache is much bigger?

Following Christoph Hellwig's patch series from last September, I'm pretty
convinced that (a) isn't true apart from the inability to disable the
write-cache at run-time, which is something that neither recent linux nor
windows seem to want to do out of the box.

Given that modern SATA drives come with fairly substantial write-caches
nowadays which operating systems leave on without widespread disaster, I
don't really believe in (b) either, at least for the ide and scsi case.
Filesystems know they have to flush the disk cache to avoid corruption.
(Virtio makes the write cache invisible to the OS except in linux 2.6.32+ so
I know virtio-blk has to be avoided for current windows and obsolete linux
when writeback caching is on.)

I can certainly imagine (c) might be the case, although when I use strace to
watch the IO to the block device, I see pretty regular fdatasyncs being
issued by the guests, interleaved with the writes, so I'm not sure how
likely the problem would be in practice. Perhaps my test guests were
unrepresentatively well-behaved.

However, the potentially unlimited time-window for loss of incorrectly
unsynced data is also something one could imagine fixing at the qemu level.
Perhaps I should be implementing something like
cache=writeback,flushtimeout=N which, upon a write being issued to the block
device, starts an N second timer if it isn't already running. The timer is
destroyed on flush, and if it expires before it's destroyed, a gratuitous
flush is sent. Do you think this is worth doing? Just a simple 'while sleep
10; do sync; done' on the host even!
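
The timer logic described above could be sketched roughly as follows. This
is only an illustration of the proposal: flushtimeout is not an existing
qemu option, and the structure and function names here are hypothetical.
Times are plain integers in seconds so the logic can be followed without
any qemu timer API:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-drive state for a flushtimeout=N option. */
struct flush_timer {
    bool armed;         /* true while a timer is running */
    uint64_t deadline;  /* time at which a gratuitous flush is due */
    uint64_t timeout;   /* N: maximum age of unflushed data, in seconds */
};

void flush_timer_init(struct flush_timer *t, uint64_t timeout)
{
    t->armed = false;
    t->deadline = 0;
    t->timeout = timeout;
}

/* A write starts the timer only if it isn't already running. */
void flush_timer_on_write(struct flush_timer *t, uint64_t now)
{
    if (!t->armed) {
        t->armed = true;
        t->deadline = now + t->timeout;
    }
}

/* A flush issued by the guest destroys the timer. */
void flush_timer_on_flush(struct flush_timer *t)
{
    t->armed = false;
}

/* Returns true if the timer has expired, meaning the caller should send a
 * gratuitous flush. The timer is disarmed on the assumption that it does. */
bool flush_timer_expired(struct flush_timer *t, uint64_t now)
{
    if (t->armed && now >= t->deadline) {
        t->armed = false;
        return true;
    }
    return false;
}
```

Because later writes do not push the deadline back, data written at time T
is flushed no later than T + N even under a continuous stream of writes.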

We've used cache=none and cache=writethrough, and whilst performance is fine
with a single guest accessing a disk, when we chop the disks up with LVM and
run even a small handful of guests, the constant seeking to serve tiny
synchronous IOs leads to truly abysmal throughput---we've seen less than
700kB/s streaming write rates within guests when the backing store is
capable of 100MB/s.

With cache=writeback, there's still IO contention between guests, but the
write granularity is a bit coarser, so the host's elevator seems to get a
bit more of a chance to help us out and we can at least squeeze out 5-10MB/s
from two or three concurrently running guests, getting a total of 20-30% of
the performance of the underlying block device rather than a total of around
5%.

Cheers,

Chris.


Re: [PATCH][RFC/T/D] Unmapped page cache control - via boot parameter

2010-03-17 Thread Chris Webb
Avi Kivity a...@redhat.com writes:

 On 03/15/2010 10:23 PM, Chris Webb wrote:

 Wasteful duplication of page cache between guest and host notwithstanding,
 turning on cache=writeback is a spectacular performance win for our guests.
 
 Is this with qcow2, raw file, or direct volume access?

This is with direct access to logical volumes. No file systems or qcow2 in
the stack. Our typical host has a couple of SATA disks, combined in md
RAID1, chopped up into volumes with LVM2 (really just dm linear targets).
The performance measured outside qemu is excellent, and inside qemu-kvm it
is fine too until multiple guests try to access their drives at once, at
which point everything starts to grind badly.

 I can understand it for qcow2, but for direct volume access this
 shouldn't happen.  The guest schedules as many writes as it can,
 followed by a sync.  The host (and disk) can then reschedule them
 whether they are in the writeback cache or in the block layer, and
 must sync in the same way once completed.

I don't really understand what's going on here, but I wonder if the
underlying problem might be that all the O_DIRECT/O_SYNC writes from the
guests go down into the same block device at the bottom of the device mapper
stack, and thus can't be reordered with respect to one another. For our
purposes,

  Guest A   Guest B      Guest A   Guest B      Guest A   Guest B
  write A1               write A1                         write B1
            write B1     write A2               write A1
  write A2                         write B1     write A2

are all equivalent, but the system isn't allowed to reorder in this way
because there isn't a separate request queue for each logical volume, just
the one at the bottom. (I don't know whether nested request queues would
behave remotely reasonably either, though!)
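
To illustrate why the ordering matters on a single spindle, here is a toy
model, not qemu or kernel code. It assumes each guest's volume occupies its
own region of the disk and simply adds up how far the head has to travel for
a given request order. The sector numbers used with it are made up:

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of the head movements needed to service the requests in the order
 * given, starting with the head at sector `start`. */
uint64_t total_seek_distance(uint64_t start, const uint64_t *sectors, size_t n)
{
    uint64_t pos = start;
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        /* Absolute distance between the current and requested sector. */
        total += sectors[i] > pos ? sectors[i] - pos : pos - sectors[i];
        pos = sectors[i];
    }
    return total;
}
```

If guest A writes sectors 0 and 8 and guest B writes sector 1000000, the
interleaved order A1, B1, A2 costs 1999992 sectors of head travel from a
start at sector 0, whereas A1, A2, B1 costs 1000000. An elevator that is
free to reorder the three writes can pick the cheaper order.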

Also, if my guest kernel issues (say) three small writes, one at the start
of the disk, one in the middle, one at the end, and then does a flush, can
virtio really express this as one non-contiguous O_DIRECT write (the three
components of which can be reordered by the elevator with respect to one
another) rather than three distinct O_DIRECT writes which can't be permuted?
Can qemu issue a write like that? cache=writeback + flush allows this to be
optimised by the block layer in the normal way.

Cheers,

Chris.


Re: [PATCH][RFC/T/D] Unmapped page cache control - via boot parameter

2010-03-17 Thread Chris Webb
Anthony Liguori anth...@codemonkey.ws writes:

 On 03/17/2010 10:14 AM, Chris Webb wrote:
(c) installations that are already broken and lose data with a physical
drive with a write-cache can lose much more in this case because the
write cache is much bigger?
 
 This is the closest to the most accurate.
 
 It basically boils down to this: most enterprises use disks with
 battery backed write caches.  Having the host act as a giant write
 cache means that you can lose data.
 
 I agree that a well behaved file system will not become corrupt, but
 my contention is that for many types of applications, data loss ==
 corruption and not all file systems are well behaved.  And it's
 certainly valid to argue about whether common filesystems are
 broken but from a purely pragmatic perspective, this is going to
 be the case.

Okay. What I was driving at in describing these systems as 'already broken'
is that they will already lose data (in this sense) if they're run on bare
metal with normal commodity SATA disks with their 32MB write caches on. That
configuration surely describes the vast majority of PC-class desktops and
servers!

If I understand correctly, your point here is that the small cache on a real
SATA drive gives a relatively small time window for data loss, whereas the
worry with cache=writeback is that the host page cache can be gigabytes, so
the time window for unsynced data to be lost is potentially enormous.

Isn't the fix for that just forcing periodic sync on the host to bound-above
the time window for unsynced data loss in the guest?

Cheers,

Chris.


Re: [PATCH][RFC/T/D] Unmapped page cache control - via boot parameter

2010-03-17 Thread Chris Webb
Avi Kivity a...@redhat.com writes:

 Chris, can you carry out an experiment?  Write a program that
 pwrite()s a byte to a file at the same location repeatedly, with the
 file opened using O_SYNC.  Measure the write rate, and run blktrace
 on the host to see what the disk (/dev/sda, not the volume) sees.
 Should be a (write, flush, write, flush) per pwrite pattern or
 similar (for writing the data and a journal block, perhaps even
 three writes will be needed).
 
 Then scale this across multiple guests, measure and trace again.  If
 we're lucky, the flushes will be coalesced, if not, we need to work
 on it.

Sure, sounds like an excellent plan. I don't have a test machine at the
moment as the last host I was using for this has gone into production, but
I'm due to get another one to install later today or first thing tomorrow
which would be ideal for doing this. I'll follow up with the results once I
have them.
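
For reference, the test program Avi describes might look something like the
sketch below. It is a minimal version assuming a POSIX system. The function
name and return convention are invented here, and it only measures the write
rate; the blktrace side has to be run separately on the host:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Rewrite one byte at offset 0 of `path` `count` times, with the file
 * opened O_SYNC so every pwrite() has to reach stable storage.
 * Returns the achieved writes per second, or -1.0 on error. */
double sync_write_rate(const char *path, long count)
{
    struct timespec t0, t1;
    char byte = 'x';
    int fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0600);

    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < count; i++) {
        if (pwrite(fd, &byte, 1, 0) != 1) {
            close(fd);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    return secs > 0 ? count / secs : 0.0;
}
```

Running it inside one guest and then in several guests at once gives the
per-guest write rates to compare against what blktrace shows on /dev/sda.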

Cheers,

Chris.


Re: [PATCH][RFC/T/D] Unmapped page cache control - via boot parameter

2010-03-17 Thread Chris Webb
Vivek Goyal vgo...@redhat.com writes:

 Are you using CFQ in the host? What is the host kernel version? I am not sure
 what is the problem here but you might want to play with IO controller and put
 these guests in individual cgroups and see if you get better throughput even
 with cache=writethrough.

Hi. We're using the deadline IO scheduler on 2.6.32.7. We got better
performance from deadline than from cfq when we last tested, which was
admittedly around the 2.6.30 timescale so is now a rather outdated
measurement.

 If the problem is that sync writes from different guests get intermixed
 resulting in more seeks, IO controller might help as these writes will now
 go on different group service trees and in CFQ, we try to service requests
 from one service tree at a time for a period before we switch the service
 tree.

Thanks for the suggestion: I'll have a play with this. I currently use
/sys/kernel/uids/N/cpu_share with one UID per guest to divide up the CPU
between guests, but this could just as easily be done with a cgroup per
guest if a side-effect is to provide a hint about IO independence to CFQ.

Best wishes,

Chris.


Re: [PATCH][RFC/T/D] Unmapped page cache control - via boot parameter

2010-03-15 Thread Chris Webb
Avi Kivity a...@redhat.com writes:

 On 03/15/2010 10:07 AM, Balbir Singh wrote:

 Yes, it is a virtio call away, but is the cost of paying twice in
 terms of memory acceptable?
 
 Usually, it isn't, which is why I recommend cache=off.

Hi Avi. One observation about your recommendation for cache=none:

We run hosts of VMs accessing drives backed by logical volumes carved out
from md RAID1. Each host has 32GB RAM and eight cores, divided between (say)
twenty virtual machines, which pretty much fill the available memory on the
host. Our qemu-kvm is new enough that IDE and SCSI drives with writeback
caching turned on get advertised to the guest as having a write-cache, and
FLUSH gets translated to fsync() by qemu. (Consequently cache=writeback
isn't acting as cache=neverflush like it would have done a year ago. I know
that comparing performance for cache=none against that unsafe behaviour
would be somewhat unfair!)

Wasteful duplication of page cache between guest and host notwithstanding,
turning on cache=writeback is a spectacular performance win for our guests.
For example, even IDE with cache=writeback easily beats virtio with
cache=none in most of the guest filesystem performance tests I've tried. The
anecdotal feedback from clients is also very strongly in favour of
cache=writeback.

With a host full of cache=none guests, IO contention between guests is
hugely problematic, with the disks seeking non-stop to service tiny
O_DIRECT writes (especially without virtio), many of which needn't have been
synchronous if only there had been some way for the guest OS to tell qemu
that. Running with cache=writeback seems to reduce the frequency of disk
flush per guest to a much more manageable level, and to allow the host's
elevator to optimise writing out across the guests in between these flushes.

Cheers,

Chris.


Re: [Qemu-devel] [PATCH] Fix SIGFPE for vnc display of width/height = 1

2010-03-08 Thread Chris Webb
Chris Webb ch...@arachsys.com writes:

 During boot, the screen gets resized to height 1 and a mouse click at this
 point will cause a division by zero when calculating the absolute pointer
 position from the pixel (x, y). Return a click in the middle of the screen
 instead in this case.

I think this probably ought to be a candidate for 0.12-stable too. We're
seeing these crashes for real from time-to-time so it's not just a
theoretical problem.

Cheers,

Chris.


Re: [Qemu-devel] Re: Another VNC crash, qemu-kvm-0.12.3

2010-03-06 Thread Chris Webb
Alexander Graf ag...@suse.de writes:

 On 05.03.2010, at 17:52, Chris Webb wrote:

  Of course, if the screen width or height is 1, it doesn't really matter what
  the value of the mouse position for the click is, so something as simple as
  
  diff --git a/vnc.c b/vnc.c
  --- a/vnc.c
  +++ b/vnc.c
  @@ -1421,8 +1421,10 @@
           dz = 1;
   
       if (vs->absolute) {
  -        kbd_mouse_event(x * 0x7FFF / (ds_get_width(vs->ds) - 1),
  -                        y * 0x7FFF / (ds_get_height(vs->ds) - 1),
  +        kbd_mouse_event(ds_get_width(vs->ds) > 1 ?
  +                          x * 0x7FFF / (ds_get_width(vs->ds) - 1) : 0x4000,
  +                        ds_get_height(vs->ds) > 1 ?
  +                          y * 0x7FFF / (ds_get_height(vs->ds) - 1) : 0x4000,
                           dz, buttons);
       } else if (vnc_has_feature(vs, VNC_FEATURE_POINTER_TYPE_CHANGE)) {
           x -= 0x7FFF;
  
  will fix the symptom: the division by zero. The underlying cause of a 9x1
  display surface is a bit mysterious though.
 
 Is it? When booting the screen gets resized to something like 9x1 for a
 few ms. Try putting debug code in the resize callback - you'll see it.

Ah, okay. In that case, this patch could well be the correct fix rather than
just a work-around. I'll have a look for any other places in vnc.c that
might do a similar division-by-zero for small screen sizes at the same
point.

Best wishes,

Chris.


Re: Another VNC crash, qemu-kvm-0.12.3

2010-03-05 Thread Chris Webb
Anthony Liguori anth...@codemonkey.ws writes:

 On 03/01/2010 12:14 PM, Chris Webb wrote:
 We've just seen another VNC related qemu-kvm crash, this time an arithmetic
 exception at vnc.c:1424 in the newly released qemu-kvm 0.12.3.
 
   [...]
   1423     if (vs->absolute) {
   1424         kbd_mouse_event(x * 0x7FFF / (ds_get_width(vs->ds) - 1),
   1425                         y * 0x7FFF / (ds_get_height(vs->ds) - 1),
   1426                         dz, buttons);
   1427     } else if (vnc_has_feature(vs, VNC_FEATURE_POINTER_TYPE_CHANGE)) {
   1428         x -= 0x7FFF;
   [...]
 
 and sure enough:
 
   (gdb) p vs->ds->surface->width
   $1 = 9
   (gdb) p vs->ds->surface->height
   $2 = 1
 
 What a 9x1 display surface is doing on this guest is a mystery to me, but you
 definitely can't divide by one less than its height!
 
 Can you reproduce this reliably?  If so, what's the procedure?

No, I'm afraid not, although I have had a thorough play myself with a variety
of VNC clients in an attempt to reproduce.

The background here is that we're running a public hosting service where
customers can install and run their own OSes on their own qemu-kvm virtual
machines. I don't even know what VNC client (if any) was connected at the
time. I only see the core dump if the qemu-kvm crashes.

Of course, if the screen width or height is 1, it doesn't really matter what
the value of the mouse position for the click is, so something as simple as

diff --git a/vnc.c b/vnc.c
--- a/vnc.c
+++ b/vnc.c
@@ -1421,8 +1421,10 @@
         dz = 1;
 
     if (vs->absolute) {
-        kbd_mouse_event(x * 0x7FFF / (ds_get_width(vs->ds) - 1),
-                        y * 0x7FFF / (ds_get_height(vs->ds) - 1),
+        kbd_mouse_event(ds_get_width(vs->ds) > 1 ?
+                          x * 0x7FFF / (ds_get_width(vs->ds) - 1) : 0x4000,
+                        ds_get_height(vs->ds) > 1 ?
+                          y * 0x7FFF / (ds_get_height(vs->ds) - 1) : 0x4000,
                         dz, buttons);
     } else if (vnc_has_feature(vs, VNC_FEATURE_POINTER_TYPE_CHANGE)) {
         x -= 0x7FFF;

will fix the symptom: the division by zero. The underlying cause of a 9x1
display surface is a bit mysterious though.
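
The same guard, pulled out into a standalone function so it can be tried
outside qemu. This is only a sketch of the idea in the patch:
scale_abs_coord is a hypothetical helper and does not exist in vnc.c:

```c
/* Map a pixel coordinate in [0, extent - 1] onto the 0..0x7FFF range used
 * for absolute pointer events. When the surface is only one pixel wide or
 * high there is no meaningful position, so report the midpoint 0x4000
 * rather than dividing by zero. */
int scale_abs_coord(int pos, int extent)
{
    return extent > 1 ? pos * 0x7FFF / (extent - 1) : 0x4000;
}
```

For the 9x1 surface seen in the core dump, the x coordinate still scales
normally across the nine columns while the y coordinate falls back to
0x4000.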

Cheers,

Chris.


Re: KVM usability

2010-03-02 Thread Chris Webb
Dustin Kirkland kirkl...@canonical.com writes:

 On Mon, 2010-03-01 at 15:59 -0600, Anthony Liguori wrote:
  
  Defaulting usb to on and defaulting to a usb tablet is a reasonable 
  thing to do IMHO.
 
 \o/  Definitely a better user experience.

I remember about a year ago, someone asserting on the list that -usbdevice
tablet was very CPU intensive even when not in use, and should be avoided if
mouse support wasn't needed, e.g. on non-graphical VMs. Was that actually a
significant hit, and is it still true today?

Cheers,

Chris.


Re: KVM usability

2010-03-02 Thread Chris Webb
Ingo Molnar mi...@elte.hu writes:

 Yes, you are quite correct: udev has been argued to be a prime candidate for 
 tools/. (and some other kernel utilities as well)

A small, static set of userspace like klibc (only 5M unpacked!) with enough
tools for rolling up in a standard initramfs would be especially nice, and
vastly less difficult to import than qemu.

It's a pain in the neck to have to build two versions of lots of bits of
userspace: one stripped down and statically linked for initramfs and one
full-featured for the main system. However, trying to avoid initramfs
altogether is an increasingly losing battle these days, and for quite
understandable reasons.  klibc + md* + mini lvm2 (enough to activate
volumes) perhaps?

Cheers,

Chris.


Another VNC crash, qemu-kvm-0.12.3

2010-03-01 Thread Chris Webb
We've just seen another VNC related qemu-kvm crash, this time an arithmetic
exception at vnc.c:1424 in the newly released qemu-kvm 0.12.3.

  [...]
  1423     if (vs->absolute) {
  1424         kbd_mouse_event(x * 0x7FFF / (ds_get_width(vs->ds) - 1),
  1425                         y * 0x7FFF / (ds_get_height(vs->ds) - 1),
  1426                         dz, buttons);
  1427     } else if (vnc_has_feature(vs, VNC_FEATURE_POINTER_TYPE_CHANGE)) {
  1428         x -= 0x7FFF;
  [...]

and sure enough:

  (gdb) p vs->ds->surface->width
  $1 = 9
  (gdb) p vs->ds->surface->height
  $2 = 1

What a 9x1 display surface is doing on this guest is a mystery to me, but you
definitely can't divide by one less than its height!

  (gdb) p *vs
  $3 = {csock = 19, ds = 0x1c60fa0, dirty = {{4294967295, 4294967295,
    4294967295, 4294967295,
    4294967295} <repeats 2048 times>}, vd = 0x26a0110, need_update = 1,
    force_update = 0, features = 67,
    absolute = 1, last_x = -1, last_y = -1, vnc_encoding = 5, tight_quality = 9
    '\t', tight_compression = 9 '\t',
    major = 3, minor = 8, challenge = "¹{\177\226\200kÕjéPñÄA¤o)", output =
    {capacity = 925115, offset = 0,
    buffer = 0x28ba4b0 ""}, input = {capacity = 5120, offset = 6, buffer =
    0x28b90a0 "\005"},
    write_pixels = 0x4bb9e0 <vnc_write_pixels_generic>, send_hextile_tile =
    0x4bcdf0 <send_hextile_tile_generic_32>,
    clientds = {flags = 0 '\0', width = 800, height = 600, linesize = 3200,
    data = 0x7fcd00ab6010 "", pf = {
    bits_per_pixel = 32 ' ', bytes_per_pixel = 4 '\004', depth = 24 '\030',
    rmask = 0, gmask = 0, bmask = 0,
    amask = 0, rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\0', ashift
    = 24 '\030', rmax = 255 'ÿ',
    gmax = 255 'ÿ', bmax = 255 'ÿ', amax = 255 'ÿ', rbits = 8 '\b', gbits =
    8 '\b', bbits = 8 '\b',
    abits = 8 '\b'}}, audio_cap = 0x0, as = {freq = 44100, nchannels = 2,
    fmt = AUD_FMT_S16, endianness = 0},
    read_handler = 0x4beac0 <protocol_client_msg>, read_handler_expect = 6,
    modifiers_state = '\0' <repeats 255 times>,
    zlib = {capacity = 0, offset = 0, buffer = 0x0}, zlib_tmp = {capacity = 0,
    offset = 0, buffer = 0x0},
    zlib_stream = {{next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0,
    avail_out = 0, total_out = 0, msg = 0x0,
    state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler
    = 0, reserved = 0}, {next_in = 0x0,
    avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, total_out =
    0, msg = 0x0, state = 0x0, zalloc = 0,
    zfree = 0, opaque = 0x0, data_type = 0, adler = 0, reserved = 0},
    {next_in = 0x0, avail_in = 0, total_in = 0,
    next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0, state = 0x0,
    zalloc = 0, zfree = 0, opaque = 0x0,
    data_type = 0, adler = 0, reserved = 0}, {next_in = 0x0, avail_in = 0,
    total_in = 0, next_out = 0x0,
    avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree
    = 0, opaque = 0x0, data_type = 0,
    adler = 0, reserved = 0}}, next = 0x0}

  (gdb) p *vs->ds
  $4 = {surface = 0x1c81f40, opaque = 0x26a0110, gui_timer = 0x0, allocator =
    0x8199d0, listeners = 0x1c95fa0,
    mouse_set = 0, cursor_define = 0, next = 0x0}

  (gdb) p *vs->ds->surface
  $5 = {flags = 2 '\002', width = 9, height = 1, linesize = 36, data =
    0x7fcd00ab6010 "", pf = {
    bits_per_pixel = 32 ' ', bytes_per_pixel = 4 '\004', depth = 24 '\030',
    rmask = 16711680, gmask = 65280,
    bmask = 255, amask = 0, rshift = 16 '\020', gshift = 8 '\b', bshift = 0
    '\0', ashift = 24 '\030', rmax = 255 'ÿ',
    gmax = 255 'ÿ', bmax = 255 'ÿ', amax = 255 'ÿ', rbits = 8 '\b', gbits = 8
    '\b', bbits = 8 '\b', abits = 8 '\b'}}

Cheers,

Chris.


Re: [Qemu-devel] Re: qemu-kvm 0.12.2 VNC segfault

2010-02-22 Thread Chris Webb
Avi Kivity a...@redhat.com writes:

 On 02/21/2010 07:23 PM, Chris Webb wrote:
 Some sort of race where a client disconnects and is removed from the client
 list while the vnc_refresh() loop is iterating over it, maybe?
 
 Looks like c727a05459, and high time for 0.12.3.  Anthony?

Ah yes, looks like this was exactly the case that commit was trying to
prevent. Thanks!

Best wishes,

Chris.


qemu-kvm 0.12.2 VNC segfault

2010-02-21 Thread Chris Webb
I've just had a segfault from one of the qemu-kvm virtual machines we run.
This is qemu-kvm 0.12.2 running with the in-kernel kvm modules on linux
2.6.32.7 on a dual quad-core Xeon E5420 machine, with ksm enabled.

The backtrace looks like

  #0  vnc_update_client (vs=0x83f0, has_dirty=18) at vnc.c:908
  #1  0x004c015b in vnc_refresh (opaque=<value optimized out>) at vnc.c:2305
  #2  0x00405f50 in qemu_run_timers (ptimer_head=0x836cc0,
      current_time=1606536889) at /packages/qemu-kvm-0.12/src-gktOMQ/vl.c:1127
  #3  0x00408edf in main_loop_wait (timeout=1000) at
      /packages/qemu-kvm-0.12/src-gktOMQ/vl.c:4036
  #4  0x00421d7a in kvm_main_loop () at
      /packages/qemu-kvm-0.12/src-gktOMQ/qemu-kvm.c:2121
  #5  0x0040b755 in main (argc=<value optimized out>,
      argv=0x7fffcc2fa1b8, envp=<value optimized out>) at
      /packages/qemu-kvm-0.12/src-gktOMQ/vl.c:4209

and the segfault itself is rather puzzling.

  #0  vnc_update_client (vs=0x83f0, has_dirty=18) at vnc.c:908
  908     if (vs->need_update && vs->csock != -1) {
  (gdb) p vs
  $1 = (VncState *) 0x83f0
  (gdb) p *vs
  Cannot access memory at address 0x83f0

The call site in vnc_refresh() looks like:

  vs = vd->clients;
  while (vs != NULL) {
      rects += vnc_update_client(vs, has_dirty);
      vs = vs->next;
  }

but when I go up a stack frame and look at the vd over which this loop would be
iterating:

  (gdb) up
  #1  0x004c015b in vnc_refresh (opaque=<value optimized out>) at vnc.c:2305
  2305        rects += vnc_update_client(vs, has_dirty);
  (gdb) p *vd->clients
  $2 = {csock = 17, ds = 0x19b2760, dirty = {{0, 0, 0, 0} <repeats 293 times>, 
{50331648, 0, 0, 0}, {50331648, 0, 0, 0}, {50331648, 0, 0, 0}, {50331648, 0, 0, 
0}, {16777216, 0, 0, 0}, {16777216, 0, 0, 0}, {16777216, 0, 0, 0}, {16777216, 
0, 0, 0}, {16777216, 0, 0, 0}, {16777216, 0, 0, 0}, {16777216, 0, 0, 0}, 
{16777216, 0, 0, 0}, {50331648, 0, 0, 0}, {0, 0, 0, 0} <repeats 1742 times>}, 
vd = 0x1ef60b0, need_update = 0, force_update = 0, features = 0, absolute = 0, 
last_x = -1, last_y = -1, vnc_encoding = 0, tight_quality = 0 '\0', 
tight_compression = 0 '\0', major = 0, minor = 0, challenge = '\0' <repeats 15 
times>, output = {capacity = 1036, offset = 0, buffer = 0x1ec7420 "RFB 
003.008\n¦\177"}, input = {capacity = 0, offset = 0, buffer = 0x0}, 
write_pixels = 0, send_hextile_tile = 0, clientds = {flags = 0 '\0', width = 0, 
height = 0, linesize = 0, data = 0x0, pf = {bits_per_pixel = 0 '\0', 
bytes_per_pixel = 0 '\0', depth = 0 '\0', rmask = 0, gmask = 0, bmask = 0, 
amask = 0, rshift = 0 '\0', gshift = 0 '\0', bshift = 0 '\0', ashift = 0 '\0', 
rmax = 0 '\0', gmax = 0 '\0', bmax = 0 '\0', amax = 0 '\0', rbits = 0 '\0', 
gbits = 0 '\0', bbits = 0 '\0', abits = 0 '\0'}}, audio_cap = 0x0, as = {freq = 
44100, nchannels = 2, fmt = AUD_FMT_S16, endianness = 0}, read_handler = 
0x4bdb30 <protocol_version>, read_handler_expect = 12, modifiers_state = '\0' 
<repeats 255 times>, zlib = {capacity = 0, offset = 0, buffer = 0x0}, zlib_tmp 
= {capacity = 0, offset = 0, buffer = 0x0}, zlib_stream = {{next_in = 0x0, 
avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, total_out = 0, msg = 
0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler = 
0, reserved = 0}, {next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0, 
avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, 
opaque = 0x0, data_type = 0, adler = 0, reserved = 0}, {next_in = 0x0, avail_in 
= 0, total_in = 0, next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0, 
state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler = 0, 
reserved = 0}, {next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0, 
avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, 
opaque = 0x0, data_type = 0, adler = 0, reserved = 0}}, next = 0x0}
  (gdb) p vd->clients.next
  $3 = (VncState *) 0x0

So the first client in vd is fine, and the next pointer is set to zero, not
0x83f0.

Some sort of race where a client disconnects and is removed from the client
list while the vnc_refresh() loop is iterating over it, maybe?
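
To make the suspected race concrete, here is a minimal standalone sketch of a list walk that tolerates the current node being removed during the iteration. The names (struct client, client_add, client_remove, refresh_clients) are made up for illustration and are not QEMU's; the only point is that the next pointer is read before the current node can be freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for VncState: only the fields the sketch needs. */
struct client {
    int csock;              /* -1 once the connection has gone away */
    struct client *next;
};

/* Push a new client on the front of the list and return the new head. */
static struct client *client_add(struct client *head, int csock)
{
    struct client *c = malloc(sizeof(*c));
    assert(c != NULL);
    c->csock = csock;
    c->next = head;
    return c;
}

/* Unlink and free one client; returns the (possibly new) list head. */
static struct client *client_remove(struct client *head, struct client *victim)
{
    struct client **pp = &head;

    while (*pp != NULL && *pp != victim)
        pp = &(*pp)->next;
    if (*pp == victim) {
        *pp = victim->next;
        free(victim);
    }
    return head;
}

/*
 * Walk the list, dropping every client whose socket is closed.
 * The next pointer is saved before the current node can be freed,
 * so the loop never reads a freed node.
 * Returns the number of clients still connected.
 */
static int refresh_clients(struct client **head)
{
    struct client *c, *next;
    int live = 0;

    for (c = *head; c != NULL; c = next) {
        next = c->next;                 /* read before c may be freed */
        if (c->csock == -1)
            *head = client_remove(*head, c);
        else
            live++;
    }
    return live;
}
```

Saving the next pointer only protects against the current node going away. If the callback could also free the following node, the saved pointer would be stale and a different scheme (such as deferring the free until after the walk) would be needed.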

Cheers,

Chris.


Re: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM

2009-10-23 Thread Chris Webb
MORITA Kazutaka morita.kazut...@lab.ntt.co.jp writes:

 We use JGroups (Java library) for reliable multicast communication in
 our cluster manager daemon. We don't worry about the performance much
 since the cluster manager daemon is not involved in the I/O path. We
 might think about moving to corosync if it is more stable than
 JGroups.

I'd love to see this running on top of corosync too. Corosync is a well
tested, stable cluster manager, and doesn't have the JVM dependency of
jgroups so feels more suitable for building 'thin virtualisation fabrics'.

Cheers,

Chris.


Re: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM

2009-10-23 Thread Chris Webb
Chris Webb ch...@arachsys.com writes:

 MORITA Kazutaka morita.kazut...@lab.ntt.co.jp writes:
 
  We use JGroups (Java library) for reliable multicast communication in
  our cluster manager daemon. We don't worry about the performance much
  since the cluster manager daemon is not involved in the I/O path. We
  might think about moving to corosync if it is more stable than
  JGroups.
 
 I'd love to see this running on top of corosync too. Corosync is a well
 tested, stable cluster manager, and doesn't have the JVM dependency of
 jgroups so feels more suitable for building 'thin virtualisation fabrics'.

Very exciting project, by the way!

Best wishes,

Chris.


Re: [Qemu-devel] Re: [ANNOUNCE] Sheepdog: Distributed Storage System for KVM

2009-10-23 Thread Chris Webb
Javier Guerra jav...@guerrag.com writes:

 i'd just want to add my '+1 votes' on both getting rid of JVM
 dependency and using block devices (usually LVM) instead of ext3/btrfs

If the chunks into which the virtual drives are split are quite small (say
the 64MB used by Hadoop), LVM may be a less appropriate choice. It doesn't
support very large numbers of very small logical volumes very well.
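
For a rough sense of scale, the sketch below counts the chunks a single drive would need. It assumes, purely for illustration, that each chunk would map to its own logical volume; the function name chunk_count is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fixed-size chunks needed to hold a drive, rounding up. */
static uint64_t chunk_count(uint64_t drive_bytes, uint64_t chunk_bytes)
{
    assert(chunk_bytes > 0);
    return (drive_bytes + chunk_bytes - 1) / chunk_bytes;
}
```

With these assumptions, chunk_count(1ULL << 40, 64ULL << 20) gives 16384, so a single 1 TiB virtual drive split into 64 MiB chunks would already need 16384 logical volumes.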

Best wishes,

Chris.


Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)

2009-08-24 Thread Chris Webb
Chris Webb ch...@arachsys.com writes:

 With the following applied, VNC connections and disconnections still work
 correctly, so it doesn't horribly break anything, but I can't immediately
 confirm whether it will cure the rare segfaults as I haven't yet found a
 rapid way of reproducing the crashes other than by waiting for one.

Just to follow up on this: the backported patch has cured the vast majority of
VNC crashes we've been seeing on 0.10.6, although I've still seen this earlier
today:

Core was generated by `qemu-kvm -m 512 -smp 1 -uuid d6f2cb13-7421-4baa-a978-eda9bec9d075 -pidfile /var'.
Program terminated with signal 11, Segmentation fault.
[New process 16847]
[New process 16855]
(gdb) bt
#0  0x7fe42e9c6cb1 in memcpy () from /lib/libc.so.6
#1  0x004917e4 in vnc_write (vs=0x31a7f50, data=0x7fffe3a19230, len=2) at vnc.c:323
#2  0x004919bf in vnc_write_u16 (vs=0x7fe2f8cae023, value=<value optimized out>) at vnc.c:1035
#3  0x00491bf3 in vnc_framebuffer_update (vs=0x7fe2f8cae023, x=-475950544, y=2, w=16385, h=1, encoding=6) at vnc.c:286
#4  0x00496660 in send_framebuffer_update (vs=0x7fe2f8cae023, x=-475950544, y=196, w=208, h=1) at vnc.c:598
#5  0x00496f65 in vnc_update_client (opaque=<value optimized out>) at vnc.c:754
#6  0x0040822a in main_loop_wait (timeout=<value optimized out>) at /packages/qemu-kvm+vncfix/src-nUlCId/vl.c:1240
#7  0x0051753a in kvm_main_loop () at /packages/qemu-kvm+vncfix/src-nUlCId/qemu-kvm.c:596
#8  0x0040c8a5 in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /packages/qemu-kvm+vncfix/src-nUlCId/vl.c:3850
(gdb) f 1
#1  0x004917e4 in vnc_write (vs=0x31a7f50, data=0x7fffe3a19230, len=2) at vnc.c:323
323         memcpy(buffer->buffer + buffer->offset, data, len);
(gdb) f 1
#1  0x004917e4 in vnc_write (vs=0x31a7f50, data=0x7fffe3a19230, len=2) at vnc.c:323
323         memcpy(buffer->buffer + buffer->offset, data, len);
(gdb) p *vs
$1 = {timer = 0x2b90b20, csock = 18, ds = 0x28a1a20, vd = 0x28b0fc0, 
need_update = 1, dirty_row = {{0, 0, 0, 
  0} <repeats 197 times>, {65535, 262128, 0, 0}, {4294967295, 1, 0, 0}, 
{4294967288, 262143, 0, 0}, {4294443008, 
  262143, 0, 0}, {131071, 262128, 0, 0}, {4294967295, 1, 0, 0}, 
{4294967292, 262143, 0, 0}, {4294443008, 262143, 
  0, 0}, {131071, 262136, 0, 0}, {4294967295, 1, 0, 0}, {4294967292, 
262143, 0, 0}, {4294443008, 262143, 0, 0}, {
  131071, 262136, 0, 0}, {4294967295, 1, 0, 0}, {4294967292, 262143, 0, 0}, 
{4294705152, 262143, 0, 0}, {131071, 
  262136, 0, 0}, {4294967295, 1, 0, 0}, {4294967294, 262143, 0, 0}, 
{4294705152, 262143, 0, 0}, {131071, 262140, 
  0, 0}, {4294967295, 1, 0, 0}, {4294967294, 262143, 0, 0}, {4294836224, 
262143, 0, 0}, {131071, 262140, 0, 0}, {
  4294967295, 1, 0, 0}, {4294967294, 262143, 0, 0}, {4294836224, 262143, 0, 
0}, {131071, 262140, 0, 0}, {
  4294967295, 1, 0, 0}, {4294967295, 262143, 0, 0}, {4294836224, 262143, 0, 
0}, {131071, 262142, 0, 0}, {
  4294967295, 1, 0, 0}, {4294967295, 262143, 0, 0}, {4294901760, 262143, 0, 
0}, {131071, 262142, 0, 0}, {
  4294967295, 1, 0, 0}, {4294967295, 262143, 0, 0}, {4294901760, 262143, 0, 
0}, {131071, 262142, 0, 0}, {
  4294967295, 131073, 0, 0}, {4294967295, 262143, 0, 0}, {4294901760, 
262143, 0, 0}, {131071, 262143, 0, 0}, {
  4294967295, 131073, 0, 0}, {4294967295, 262143, 0, 0}, {4294934528, 
262143, 0, 0}, {131071, 262143, 0, 0}, {
  4294967295, 131075, 0, 0}, {4294967295, 262143, 0, 0}, {4294934528, 
262143, 0, 0}, {131071, 262143, 0, 0}, {
  4294967295, 196611, 0, 0}, {4294967295, 262143, 0, 0}, {4294934528, 
262143, 0, 0}, {2147614719, 262143, 0, 0}, {
  4294967295, 196611, 0, 0}, {4294967295, 262143, 0, 0}, {4294950912, 
262143, 0, 0}, {2147614719, 262143, 0, 0}, {
  4294967295, 196611, 0, 0}, {4294967295, 262143, 0, 0}, {4294950912, 
262143, 0, 0}, {2147614719, 262143, 0, 0}, {
  4294967295, 229379, 0, 0}, {4294967295, 262143, 0, 0}, {4294950912, 
262143, 0, 0}, {3221356543, 262143, 0, 0}, {
  4294967295, 229379, 0, 0}, {4294967295, 262143, 0, 0}, {4294950912, 
262143, 0, 0}, {3221356543, 262143, 0, 0}, {
  4294967295, 229377, 0, 0}, {4294967295, 262143, 0, 0}, {4294959104, 
262143, 0, 0}, {3221356543, 262143, 0, 0}, {
  4294967295, 245761, 0, 0}, {4294967295, 262143, 0, 0}, {4294959104, 
262143, 0, 0}, {3758227455, 262143, 0, 0}, {
  4294967295, 245761, 0, 0}, {4294967295, 262143, 0, 0}, {4294959104, 
262143, 0, 0}, {3758227455, 262143, 0, 0}, {
  4294967295, 245761, 0, 0}, {4294967295, 262143, 0, 0}, {4294963200, 
262143, 0, 0}, {3758227455, 262143, 0, 0}, {
  4294967295, 253953, 0, 0}, {4294967295, 262143, 0, 0}, {4294963200, 
262143, 0, 0}, {4026662911, 262143, 0, 0}, {
  4294967295, 253953, 0, 0}, {4294967295, 262143, 0, 0}, {4294963200, 
262143, 0, 0}, {4026662911, 262143, 0, 0}, {
  4294967295, 253953, 0, 0

Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)

2009-08-19 Thread Chris Webb
Avi Kivity a...@redhat.com writes:

 master branch has a patch that fixes a use-after-free when  
 disconnecting.  Unfortunately it doesn't port cleanly to stable-0.10.

I've collected quite a few more core dumps from segfaults of client virtual
machines now, all of which are VNC related and could quite plausibly be use
of a VncState after it has been freed. I looked at Gerd's patch [198a00:
vnc: rework VncState release workflow] and have taken a stab at the
equivalent patch for stable qemu & qemu-kvm 0.10.

With the following applied, VNC connections and disconnections still work
correctly, so it doesn't horribly break anything, but I can't immediately
confirm whether it will cure the rare segfaults as I haven't yet found a
rapid way of reproducing the crashes other than by waiting for one.


diff --git a/vnc.c b/vnc.c
--- a/vnc.c
+++ b/vnc.c
@@ -200,6 +200,8 @@
 static void vnc_write_u8(VncState *vs, uint8_t value);
 static void vnc_flush(VncState *vs);
 static void vnc_update_client(void *opaque);
+static void vnc_disconnect_start(VncState *vs);
+static void vnc_disconnect_finish(VncState *vs);
 static void vnc_client_read(void *opaque);
 
 static void vnc_colordepth(VncState *vs);
@@ -633,8 +635,6 @@
 
 static void vnc_copy(VncState *vs, int src_x, int src_y, int dst_x, int dst_y, int w, int h)
 {
-    vnc_update_client(vs);
-
     vnc_write_u8(vs, 0);  /* msg id */
     vnc_write_u8(vs, 0);
     vnc_write_u16(vs, 1); /* number of rects */
@@ -647,13 +647,21 @@
 static void vnc_dpy_copy(DisplayState *ds, int src_x, int src_y, int dst_x, int dst_y, int w, int h)
 {
     VncDisplay *vd = ds->opaque;
-    VncState *vs = vd->clients;
-    while (vs != NULL) {
+    VncState *vs, *vn;
+
+    for (vs = vd->clients; vs != NULL; vs = vn) {
+        vn = vs->next;
+        if (vnc_has_feature(vs, VNC_FEATURE_COPYRECT)) {
+            vnc_update_client(vs);
+            /* vs might be free()ed here */
+        }
+    }
+
+    for (vs = vd->clients; vs != NULL; vs = vs->next) {
         if (vnc_has_feature(vs, VNC_FEATURE_COPYRECT))
             vnc_copy(vs, src_x, src_y, dst_x, dst_y, w, h);
         else /* TODO */
             vnc_update(vs, dst_x, dst_y, w, h);
-        vs = vs->next;
     }
 }
 
@@ -763,6 +771,8 @@
 
     if (vs->csock != -1) {
         qemu_mod_timer(vs->timer, qemu_get_clock(rt_clock) + VNC_REFRESH_INTERVAL);
+    } else {
+        vnc_disconnect_finish(vs);
     }
 
 }
@@ -832,6 +842,47 @@
 }
 }
 
+static void vnc_disconnect_start(VncState *vs)
+{
+    if (vs->csock == -1)
+        return;
+    qemu_set_fd_handler2(vs->csock, NULL, NULL, NULL, NULL);
+    closesocket(vs->csock);
+    vs->csock = -1;
+}
+
+static void vnc_disconnect_finish(VncState *vs)
+{
+    qemu_del_timer(vs->timer);
+    qemu_free_timer(vs->timer);
+    if (vs->input.buffer) qemu_free(vs->input.buffer);
+    if (vs->output.buffer) qemu_free(vs->output.buffer);
+#ifdef CONFIG_VNC_TLS
+    if (vs->tls_session) {
+        gnutls_deinit(vs->tls_session);
+        vs->tls_session = NULL;
+    }
+#endif /* CONFIG_VNC_TLS */
+    audio_del(vs);
+
+    VncState *p, *parent = NULL;
+    for (p = vs->vd->clients; p != NULL; p = p->next) {
+        if (p == vs) {
+            if (parent)
+                parent->next = p->next;
+            else
+                vs->vd->clients = p->next;
+            break;
+        }
+        parent = p;
+    }
+    if (!vs->vd->clients)
+        dcl->idle = 1;
+
+    qemu_free(vs->old_data);
+    qemu_free(vs);
+}
+
 static int vnc_client_io_error(VncState *vs, int ret, int last_errno)
 {
     if (ret == 0 || ret == -1) {
@@ -849,36 +900,7 @@
         }
 
         VNC_DEBUG("Closing down client sock %d %d\n", ret, ret < 0 ? last_errno : 0);
-        qemu_set_fd_handler2(vs->csock, NULL, NULL, NULL, NULL);
-        closesocket(vs->csock);
-        qemu_del_timer(vs->timer);
-        qemu_free_timer(vs->timer);
-        if (vs->input.buffer) qemu_free(vs->input.buffer);
-        if (vs->output.buffer) qemu_free(vs->output.buffer);
-#ifdef CONFIG_VNC_TLS
-        if (vs->tls_session) {
-            gnutls_deinit(vs->tls_session);
-            vs->tls_session = NULL;
-        }
-#endif /* CONFIG_VNC_TLS */
-        audio_del(vs);
-
-        VncState *p, *parent = NULL;
-        for (p = vs->vd->clients; p != NULL; p = p->next) {
-            if (p == vs) {
-                if (parent)
-                    parent->next = p->next;
-                else
-                    vs->vd->clients = p->next;
-                break;
-            }
-            parent = p;
-        }
-        if (!vs->vd->clients)
-            dcl->idle = 1;
-
-        qemu_free(vs->old_data);
-        qemu_free(vs);
+        vnc_disconnect_start(vs);
 
         return 0;
     }
@@ -887,7 +909,8 @@
 
 static void vnc_client_error(VncState *vs)
 {
-    vnc_client_io_error(vs, -1, EINVAL);
+    VNC_DEBUG("Closing down client sock: protocol error\n");
+    vnc_disconnect_start(vs);
 }
 
 static void vnc_client_write(void *opaque)
@@ -947,8 +970,11 @@
 #endif /* CONFIG_VNC_TLS */

Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)

2009-08-13 Thread Chris Webb
Chris Webb ch...@arachsys.com writes:

 Avi Kivity a...@redhat.com writes:
 
  I understand it's hard, but it's nearly impossible to work out the  
  problem from so little data, so please do make the effort to obtain 
  dumps.
 
 We're trying for this at the moment, but since we can't change the rlimit
 for the running qemu-kvm processes (?), we'll have to wait until one of the
 new ones dies, which may take some time. I'll follow up when I do have
 something.

We've been lucky and relatively quickly got a core dump from one of the new
qemu-kvms with the non-zero core file rlimit. A backtrace looks like this:

  (gdb) bt
  #0  0x004068f7 in qemu_mod_timer (ts=0x30d1f30, expire_time=430489) at /packages/qemu-kvm/src-f39tF1/vl.c:1161
  #1  0x00495dd5 in vnc_update_client (opaque=<value optimized out>) at vnc.c:765
  #2  0x004081da in main_loop_wait (timeout=<value optimized out>) at /packages/qemu-kvm/src-f39tF1/vl.c:1240
  #3  0x0051613a in kvm_main_loop () at /packages/qemu-kvm/src-f39tF1/qemu-kvm.c:596
  #4  0x0040c7b7 in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /packages/qemu-kvm/src-f39tF1/vl.c:3850

The segfault appears to be a null pointer dereference. ts->clock is NULL
and line 1161 uses ts->clock->type:

  (gdb) p ts
  $4 = (QEMUTimer *) 0x30d1f30
  (gdb) p ts->clock
  $5 = (QEMUClock *) 0x0

The VncState in vnc_update_client is as follows:

  (gdb) f 1
  #1  0x00495dd5 in vnc_update_client (opaque=<value optimized out>) at vnc.c:765
  765         qemu_mod_timer(vs->timer, qemu_get_clock(rt_clock) + VNC_REFRESH_INTERVAL);
  (gdb) p *vs
  $12 = {timer = 0x30d1f30, csock = -986235208, ds = 0x0, vd = 0x0, need_update = 1, dirty_row = {{0, 0, 4294967295, 
      4294967295} <repeats 768 times>, {4294967295, 4294967295, 4294967295, 4294967295} <repeats 1280 times>}, 
    old_data = 0x7f9b8276f010 <Address 0x7f9b8276f010 out of bounds>, features = 98, absolute = 1, last_x = -1, 
    last_y = -1, vnc_encoding = 5, tight_quality = 6 '\006', tight_compression = 1 '\001', major = 3, minor = 3, 
    challenge = "\032\314i\257\302t1(\320\312\263\024pH\226", output = {capacity = 1545078, offset = 684, 
      buffer = 0x3107860 ""}, input = {capacity = 5120, offset = 0, buffer = 0x3106450 "\020\220(\003"}, 
    write_pixels = 0x490b50 <vnc_write_pixels_generic>, send_hextile_tile = 0x492030 <send_hextile_tile_generic_32>, 
    clientds = {flags = 0 '\0', width = 800, height = 600, linesize = 3200, 
      data = 0x7f9b82944010 <Address 0x7f9b82944010 out of bounds>, pf = {bits_per_pixel = 32 ' ', 
        bytes_per_pixel = 4 '\004', depth = 24 '\030', rmask = 0, gmask = 0, bmask = 0, amask = 0, rshift = 16 '\020', 
        gshift = 8 '\b', bshift = 0 '\0', ashift = 24 '\030', rmax = 255 '\377', gmax = 255 '\377', bmax = 255 '\377', 
        amax = 255 '\377', rbits = 8 '\b', gbits = 8 '\b', bbits = 8 '\b', abits = 8 '\b'}}, serverds = {
      flags = 2 '\002', width = 1024, height = 768, linesize = 4096, data = 0x7f9b8246e010 "", pf = {
        bits_per_pixel = 32 ' ', bytes_per_pixel = 4 '\004', depth = 24 '\030', rmask = 16711680, gmask = 65280, 
        bmask = 255, amask = 0, rshift = 16 '\020', gshift = 8 '\b', bshift = 0 '\0', ashift = 24 '\030', 
        rmax = 255 '\377', gmax = 255 '\377', bmax = 255 '\377', amax = 255 '\377', rbits = 8 '\b', gbits = 8 '\b', 
        bbits = 8 '\b', abits = 8 '\b'}}, audio_cap = 0x0, as = {freq = 44100, nchannels = 2, fmt = AUD_FMT_S16, 
      endianness = 0}, read_handler = 0x494b40 <protocol_client_msg>, read_handler_expect = 1, 
    modifiers_state = '\0' <repeats 255 times>, zlib = {capacity = 0, offset = 0, buffer = 0x0}, zlib_tmp = {
      capacity = 0, offset = 0, buffer = 0x0}, zlib_stream = {{next_in = 0x0, avail_in = 0, total_in = 0, 
        next_out = 0x0, avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, 
        data_type = 0, adler = 0, reserved = 0}, {next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0, 
        avail_out = 0, total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, 
        adler = 0, reserved = 0}, {next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, 
        total_out = 0, msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler = 0, 
        reserved = 0}, {next_in = 0x0, avail_in = 0, total_in = 0, next_out = 0x0, avail_out = 0, total_out = 0, 
        msg = 0x0, state = 0x0, zalloc = 0, zfree = 0, opaque = 0x0, data_type = 0, adler = 0, reserved = 0}}, 
    next = 0x0}

I'm afraid I only have one of these, so I can't say whether the other
segfaults were exactly the same or different (other than knowing the source
line matched), but I'll keep my eye out for more core dumps.

qemu-kvm command line for this guest would have been

  qemu-kvm -m 1024 -smp 1

Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)

2009-08-13 Thread Chris Webb
Chris Webb ch...@arachsys.com writes:

 The segfault appears to be a null pointer dereference. ts->clock is NULL
 and line 1161 uses ts->clock->type:
 
   (gdb) p ts
   $4 = (QEMUTimer *) 0x30d1f30
   (gdb) p ts->clock
   $5 = (QEMUClock *) 0x0

Sorry, meant to paste this too:

  (gdb) p *ts
  $1 = {clock = 0x0, expire_time = 49, cb = 0x2b63630, opaque = 0x30fe000, next 
= 0x495b40}

Cheers,

Chris.


Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)

2009-08-13 Thread Chris Webb
Avi Kivity a...@redhat.com writes:

 csock looks corrupted, should be -1 or an fd.  Was a vnc client connected?
 Was the guest playing with the display resolution?

Yes, I think in this case there was a vncviewer connected, and the guest had
started booting up into windows, which changes the resolution a couple of
times.

Best wishes,

Chris.


Re: qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)

2009-08-13 Thread Chris Webb
Chris Webb ch...@arachsys.com writes:

 Avi Kivity a...@redhat.com writes:
 
  csock looks corrupted, should be -1 or an fd.  Was a vnc client connected?
  Was the guest playing with the display resolution?
 
 Yes, I think in this case there was a vncviewer connected, and the guest had
 started booting up into windows, which changes the resolution a couple of
 times.

Also, I think the vncviewer might actually have been disconnecting at about
the time the segfault happened.

Cheers,

Chris.


qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)

2009-08-12 Thread Chris Webb
I have a couple of clusters hosting qemu-kvm virtual machines. One of these
clusters consists of dual quad-core Xeon E5420s (vmx), the other consists of
dual quad-core Barcelona Opterons (svm), and both are running x86-64 Linux
2.6.30.4 with the kvm modules included with the upstream kernel compiled in.

Running qemu-kvm 0.10.5, I was seeing occasional segfaults from the virtual
machines, perhaps two or three a day across each cluster. The guest OS didn't
appear to be a factor, as both Linux and Windows VMs have crashed. I then
switched to the recently released qemu-kvm 0.10.6, and am still seeing these
segfaults.

It's very hard for me to arrange for core dumps on these live clusters, and the
segfaults are hard to reproduce on test machines because they are rare.
However, I have unstripped copies of the respective binaries and have used gdb
to translate the segfault ip into a source file and line number, which I hope
might be useful. On both clusters and for each version of qemu-kvm, segfaults
are happening at lines #1161 and #1163 of vl.c:

[...]
/* stop a timer, but do not dealloc it */
void qemu_del_timer(QEMUTimer *ts)
{
    QEMUTimer **pt, *t;

    /* NOTE: this code must be signal safe because
       qemu_timer_expired() can be called from a signal. */
HERE ==>    pt = &active_timers[ts->clock->type];
    for(;;) {
HERE ==>        t = *pt;
        if (!t)
            break;
        if (t == ts) {
            *pt = t->next;
            break;
        }
        pt = &t->next;
    }
}
[...]

For qemu-kvm 0.10.5, I have large numbers of segfaults in both locations. For
qemu-kvm 0.10.6, my sample is much smaller, but the segfaults I have are all at
line #1161, not #1163.

Final data-point: prior to the 0.10.5 upgrade, we had been successfully running
a (fairly old) kvm-83 userspace without experiencing this segfault problem.

Any help fixing this would be gratefully received!

Cheers,

Chris.

PS One other place I have seen a segfault in 0.10.6 since we rolled it out is
at line #141 of hw/scsi-disk.c, but this has only happened once---very rare
compared to the problem I describe above.


Re: Trouble understanding net config options

2009-07-15 Thread Chris Webb
Michael Jinks michael.ji...@gmail.com writes:

 How do I make a guest use a specific tap?  Quoting
 from my initial post, my -net options are:
 
  -net nic -net tap,name=tap11 -net nic -net tap,name=tap12

You want

  -net nic,vlan=0 -net tap,vlan=0,ifname=tap11 -net nic,vlan=1 -net tap,vlan=1,ifname=tap12

to get the effect that (I think) you're looking for: one nic connected to
tap11 using vlan0 and one nic connected to tap12 using vlan1.

Without the vlan parameters, everything's on vlan0 so you get two nics and
two tap interfaces all connected together inside qemu on a single virtual
switch.

Best wishes,

Chris.


Two VNC patches

2008-12-08 Thread Chris Webb
I sent this pair of VNC-related patches to the qemu-devel list a couple of
weeks back and I'm not sure whether they've got lost in the cracks or were
in some way not acceptable and need fixing up.

The first one is a straightforward bug-fix, and the second is a trivial
convenience feature in the monitor which I imagine ought to be fairly
uncontroversial?

Cheers,

Chris.
---BeginMessage---
Fix off-by-one bug limiting VNC passwords to 7 characters instead of 8

monitor_readline expects buf_size to include the terminating \0, but
do_change_vnc in monitor.c calls it as though it doesn't. The other site
where monitor_readline reads a password (in vl.c) passes the buffer length
correctly.

Signed-off-by: Chris Webb [EMAIL PROTECTED]
---
 monitor.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/monitor.c b/monitor.c
index 22360fc..a252838 100644
--- a/monitor.c
+++ b/monitor.c
@@ -433,8 +433,7 @@ static void do_change_vnc(const char *target)
     if (strcmp(target, "passwd") == 0 ||
         strcmp(target, "password") == 0) {
         char password[9];
-        monitor_readline("Password: ", 1, password, sizeof(password)-1);
-        password[sizeof(password)-1] = '\0';
+        monitor_readline("Password: ", 1, password, sizeof(password));
         if (vnc_display_password(NULL, password) < 0)
             term_printf("could not set VNC server password\n");
     } else {

---End Message---
---BeginMessage---
Accept password as an argument to 'change vnc password' monitor command

This allows easier use of the change vnc password monitor command from
management scripts, without having to implement expect(1)-like behaviour.

Signed-off-by: Chris Webb [EMAIL PROTECTED]
---
 monitor.c |   14 +-
 qemu-doc.texi |8 
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/monitor.c b/monitor.c
index a252838..f6a2783 100644
--- a/monitor.c
+++ b/monitor.c
@@ -428,12 +428,16 @@ static void do_change_block(const char *device, const char *filename, const char
     qemu_key_check(bs, filename);
 }
 
-static void do_change_vnc(const char *target)
+static void do_change_vnc(const char *target, const char *arg)
 {
     if (strcmp(target, "passwd") == 0 ||
         strcmp(target, "password") == 0) {
         char password[9];
-        monitor_readline("Password: ", 1, password, sizeof(password));
+        if (arg) {
+            strncpy(password, arg, sizeof(password));
+            password[sizeof(password) - 1] = '\0';
+        } else
+            monitor_readline("Password: ", 1, password, sizeof(password));
         if (vnc_display_password(NULL, password) < 0)
             term_printf("could not set VNC server password\n");
     } else {
@@ -442,12 +446,12 @@ static void do_change_vnc(const char *target)
     }
 }
 
-static void do_change(const char *device, const char *target, const char *fmt)
+static void do_change(const char *device, const char *target, const char *arg)
 {
     if (strcmp(device, "vnc") == 0) {
-        do_change_vnc(target);
+        do_change_vnc(target, arg);
     } else {
-        do_change_block(device, target, fmt);
+        do_change_block(device, target, arg);
     }
 }
 
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 1735d92..ca3b181 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -1233,11 +1233,11 @@ and @var{options} are described at @ref{sec_invocation}. eg
 (qemu) change vnc localhost:1
 @end example
 
[EMAIL PROTECTED] change vnc password
[EMAIL PROTECTED] change vnc password [EMAIL PROTECTED]
 
-Change the password associated with the VNC server. The monitor will prompt for
-the new password to be entered. VNC passwords are only significant upto 8 letters.
-eg.
+Change the password associated with the VNC server. If the new password is not
+supplied, the monitor will prompt for it to be entered. VNC passwords are only
+significant up to 8 letters. eg
 
 @example
 (qemu) change vnc password

---End Message---


[RESEND] [PATCH v2] Accept password as an argument to 'change vnc password'

2008-12-08 Thread Chris Webb
Accept password as an argument to 'change vnc password' monitor command

This allows easier use of the change vnc password monitor command from
management scripts, without having to implement expect(1)-like behaviour.

Signed-off-by: Chris Webb [EMAIL PROTECTED]
---
 monitor.c |   14 +-
 qemu-doc.texi |8 
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/monitor.c b/monitor.c
index a252838..f6a2783 100644
--- a/monitor.c
+++ b/monitor.c
@@ -428,12 +428,16 @@ static void do_change_block(const char *device, const char *filename, const char
     qemu_key_check(bs, filename);
 }
 
-static void do_change_vnc(const char *target)
+static void do_change_vnc(const char *target, const char *arg)
 {
     if (strcmp(target, "passwd") == 0 ||
         strcmp(target, "password") == 0) {
         char password[9];
-        monitor_readline("Password: ", 1, password, sizeof(password));
+        if (arg) {
+            strncpy(password, arg, sizeof(password));
+            password[sizeof(password) - 1] = '\0';
+        } else
+            monitor_readline("Password: ", 1, password, sizeof(password));
         if (vnc_display_password(NULL, password) < 0)
             term_printf("could not set VNC server password\n");
     } else {
@@ -442,12 +446,12 @@ static void do_change_vnc(const char *target)
     }
 }
 
-static void do_change(const char *device, const char *target, const char *fmt)
+static void do_change(const char *device, const char *target, const char *arg)
 {
     if (strcmp(device, "vnc") == 0) {
-        do_change_vnc(target);
+        do_change_vnc(target, arg);
     } else {
-        do_change_block(device, target, fmt);
+        do_change_block(device, target, arg);
     }
 }
 
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 1735d92..ca3b181 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -1233,11 +1233,11 @@ and @var{options} are described at @ref{sec_invocation}. eg
 (qemu) change vnc localhost:1
 @end example
 
[EMAIL PROTECTED] change vnc password
[EMAIL PROTECTED] change vnc password [EMAIL PROTECTED]
 
-Change the password associated with the VNC server. The monitor will prompt for
-the new password to be entered. VNC passwords are only significant upto 8 letters.
-eg.
+Change the password associated with the VNC server. If the new password is not
+supplied, the monitor will prompt for it to be entered. VNC passwords are only
+significant up to 8 letters. eg
 
 @example
 (qemu) change vnc password



Re: [Qemu-devel] [PATCH] Fix off-by-one bug limiting VNC passwords to 7 chars

2008-11-25 Thread Chris Webb
Thiemo Seufer [EMAIL PROTECTED] writes:

 Chris Webb wrote:
[...]
  -   monitor_readline("Password: ", 1, password, sizeof(password)-1);
  +   monitor_readline("Password: ", 1, password, sizeof(password));
  password[sizeof(password)-1] = '\0';
 
 The next line can go as well, the string is already NULL terminated.

You're quite right. I'll update the two patches to reflect this change.
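
The off-by-one is easy to see in isolation. The sketch below uses a hypothetical helper, read_into(), that follows the same convention the patch description attributes to monitor_readline: buf_size counts the terminating '\0'. It is not QEMU code, just an illustration of what passing sizeof(password)-1 does under that convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a readline-style helper, not QEMU's
 * monitor_readline(): copies at most buf_size - 1 characters of input
 * into buf and always NUL-terminates, i.e. buf_size counts the '\0'.
 * Returns the number of characters stored.
 */
static size_t read_into(const char *input, char *buf, size_t buf_size)
{
    size_t n;

    if (buf_size == 0)
        return 0;
    n = strlen(input);
    if (n > buf_size - 1)
        n = buf_size - 1;
    memcpy(buf, input, n);
    buf[n] = '\0';
    return n;
}
```

With char password[9], passing sizeof(password) - 1 stores only seven characters of an eight-character password, while passing sizeof(password) stores all eight and still leaves room for the terminator, so no separate termination step is needed.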

Cheers,

Chris.


[PATCH v2] Accept password as an argument to 'change vnc password'

2008-11-25 Thread Chris Webb
Accept password as an argument to 'change vnc password' monitor command

This allows easier use of the change vnc password monitor command from
management scripts, without having to implement expect(1)-like behaviour.
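
As a rough illustration of why the strncpy() path in the patch below needs an explicit terminator: strncpy() does not write a '\0' when the source is at least as long as the destination, so a 9-byte buffer given a longer argument has to be terminated by hand. The helper name copy_truncated is hypothetical and only for this sketch; it is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst (dst_size bytes in total), always leaving dst
 * NUL-terminated. strncpy() alone writes no terminator when
 * strlen(src) >= dst_size, hence the explicit store afterwards. */
static void copy_truncated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;
    strncpy(dst, src, dst_size);
    dst[dst_size - 1] = '\0';
}
```

With a char password[9] buffer, an argument longer than 8 characters is cut to its first 8, and shorter arguments are copied unchanged.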

Signed-off-by: Chris Webb [EMAIL PROTECTED]
---
 monitor.c |   14 +-
 qemu-doc.texi |8 
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/monitor.c b/monitor.c
index a252838..f6a2783 100644
--- a/monitor.c
+++ b/monitor.c
@@ -428,12 +428,16 @@ static void do_change_block(const char *device, const char *filename, const char
     qemu_key_check(bs, filename);
 }
 
-static void do_change_vnc(const char *target)
+static void do_change_vnc(const char *target, const char *arg)
 {
     if (strcmp(target, "passwd") == 0 ||
         strcmp(target, "password") == 0) {
         char password[9];
-        monitor_readline("Password: ", 1, password, sizeof(password));
+        if (arg) {
+            strncpy(password, arg, sizeof(password));
+            password[sizeof(password) - 1] = '\0';
+        } else
+            monitor_readline("Password: ", 1, password, sizeof(password));
         if (vnc_display_password(NULL, password) < 0)
             term_printf("could not set VNC server password\n");
     } else {
@@ -442,12 +446,12 @@ static void do_change_vnc(const char *target)
     }
 }
 
-static void do_change(const char *device, const char *target, const char *fmt)
+static void do_change(const char *device, const char *target, const char *arg)
 {
     if (strcmp(device, "vnc") == 0) {
-        do_change_vnc(target);
+        do_change_vnc(target, arg);
     } else {
-        do_change_block(device, target, fmt);
+        do_change_block(device, target, arg);
     }
 }
 
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 1735d92..ca3b181 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -1233,11 +1233,11 @@ and @var{options} are described at @ref{sec_invocation}. eg
 (qemu) change vnc localhost:1
 @end example
 
-@item change vnc password
+@item change vnc password [@var{password}]
 
-Change the password associated with the VNC server. The monitor will prompt for
-the new password to be entered. VNC passwords are only significant upto 8 letters.
-eg.
+Change the password associated with the VNC server. If the new password is not
+supplied, the monitor will prompt for it to be entered. VNC passwords are only
+significant up to 8 letters. eg
 
 @example
 (qemu) change vnc password


[PATCH] Fix off-by-one bug limiting VNC passwords to 7 chars

2008-11-23 Thread Chris Webb
Fix off-by-one bug limiting VNC passwords to 7 characters instead of 8

monitor_readline expects buf_size to include the terminating \0, but
do_change_vnc in monitor.c calls it as though it doesn't. The other site
where monitor_readline reads a password (in vl.c) passes the buffer length
correctly.
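
To illustrate the convention, here is a sketch only: read_line_into is a hypothetical stand-in for monitor_readline, not QEMU code. A reader that treats buf_size as including the terminator stores at most buf_size - 1 characters, so passing sizeof(password)-1 for a 9-byte buffer leaves room for only 7.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for monitor_readline(): copies the typed input
 * into buf, treating buf_size as the full buffer size INCLUDING the
 * terminating '\0'. At most buf_size - 1 characters are stored. */
static void read_line_into(const char *typed, char *buf, size_t buf_size)
{
    size_t n;

    if (buf_size == 0)
        return;
    n = strlen(typed);
    if (n > buf_size - 1)
        n = buf_size - 1;
    memcpy(buf, typed, n);
    buf[n] = '\0';
}

/* Number of characters of `typed` that survive in a 9-byte password
 * buffer when the caller passes sizeof(password) - slack as buf_size. */
static size_t stored_length(const char *typed, size_t slack)
{
    char password[9];

    read_line_into(typed, password, sizeof(password) - slack);
    return strlen(password);
}
```

Here stored_length(typed, 1) models the old call and stored_length(typed, 0) the fixed one: an 8-character password keeps 7 characters in the first case and all 8 in the second.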

Signed-off-by: Chris Webb [EMAIL PROTECTED]
---
 monitor.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/monitor.c b/monitor.c
index 22360fc..6ae5729 100644
--- a/monitor.c
+++ b/monitor.c
@@ -433,7 +433,7 @@ static void do_change_vnc(const char *target)
     if (strcmp(target, "passwd") == 0 ||
         strcmp(target, "password") == 0) {
         char password[9];
-        monitor_readline("Password: ", 1, password, sizeof(password)-1);
+        monitor_readline("Password: ", 1, password, sizeof(password));
         password[sizeof(password)-1] = '\0';
         if (vnc_display_password(NULL, password) < 0)
             term_printf("could not set VNC server password\n");


[PATCH] Accept password as an argument to 'change vnc password'

2008-11-23 Thread Chris Webb
Accept password as an argument to 'change vnc password' monitor command

This allows easier use of the change vnc password monitor command from
management scripts, without having to implement expect(1)-like behaviour.

Signed-off-by: Chris Webb [EMAIL PROTECTED]
---
 monitor.c |   13 -
 qemu-doc.texi |8 
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/monitor.c b/monitor.c
index 22360fc..8ac73c1 100644
--- a/monitor.c
+++ b/monitor.c
@@ -428,12 +428,15 @@ static void do_change_block(const char *device, const char *filename, const char
     qemu_key_check(bs, filename);
 }
 
-static void do_change_vnc(const char *target)
+static void do_change_vnc(const char *target, const char *arg)
 {
     if (strcmp(target, "passwd") == 0 ||
         strcmp(target, "password") == 0) {
         char password[9];
-        monitor_readline("Password: ", 1, password, sizeof(password));
+        if (arg)
+            strncpy(password, arg, sizeof(password));
+        else
+            monitor_readline("Password: ", 1, password, sizeof(password));
         password[sizeof(password)-1] = '\0';
         if (vnc_display_password(NULL, password) < 0)
             term_printf("could not set VNC server password\n");
@@ -443,12 +446,12 @@ static void do_change_vnc(const char *target)
     }
 }
 
-static void do_change(const char *device, const char *target, const char *fmt)
+static void do_change(const char *device, const char *target, const char *arg)
 {
     if (strcmp(device, "vnc") == 0) {
-        do_change_vnc(target);
+        do_change_vnc(target, arg);
     } else {
-        do_change_block(device, target, fmt);
+        do_change_block(device, target, arg);
     }
 }
 
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 1735d92..ca3b181 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -1233,11 +1233,11 @@ and @var{options} are described at @ref{sec_invocation}. eg
 (qemu) change vnc localhost:1
 @end example
 
-@item change vnc password
+@item change vnc password [@var{password}]
 
-Change the password associated with the VNC server. The monitor will prompt for
-the new password to be entered. VNC passwords are only significant upto 8 letters.
-eg.
+Change the password associated with the VNC server. If the new password is not
+supplied, the monitor will prompt for it to be entered. VNC passwords are only
+significant up to 8 letters. eg
 
 @example
 (qemu) change vnc password


Unsupported delivery mode 7

2008-11-19 Thread Chris Webb
We're running kvm-78 in production on Linux 2.6.27 x86_64 on dual quad-core
Opteron 'Barcelona' machines. Our kvm modules are built from the kvm-78
sources rather than the older version bundled with the kernel, and we're
using the NPT features of the processors.

For the most part, everything is performing very well and running reliably.
However, occasionally a guest will hang as it starts (or is reset) with a
large number of messages of the form

  Unsupported delivery mode 7

in the dmesg. Following this, killing and relaunching the qemu process is
usually sufficient to get a working guest.

I'm aware that our versions of the kvm kernel modules and userspace are not
the latest release, but because we're running long-lived guests on behalf of
clients, it's quite a major operation to upgrade. Does this look like a
known bug which has already been fixed or should I try to reproduce it
properly on a test machine with an ability to debug, use magic sysrq, etc?
(It seems impossible to reproduce on my lower spec desktop machine, for what
it's worth. Normally I'd reproduce kernel problems in a KVM virtual
machine---but that's obviously not an option here!)

Am I right in suspecting it might be connected to interrupt delivery
following page migration when a guest moves from one processor to another,
and that a workaround might be to taskset guests to one or other physical
CPU until we're able to upgrade to a more recent version of KVM?

Many thanks in advance for any advice anyone can offer.

Cheers,

Chris.


Re: qemu-send.c (was Re: Since we're sharing, here's my kvmctl script)

2008-06-12 Thread Chris Webb
Javier Guerra Giraldez [EMAIL PROTECTED] writes:

 On Wednesday 11 June 2008, Chris Webb wrote:
  Hi. I have a small 'qemu-send' utility for talking to a running qemu/kvm
  process whose monitor console listens on a filesystem socket, which I think
  might be a useful building block when extending these kinds of script to do
  things like migration, pausing, and so on. The source is attached.
 
 there's a utility called socat that lets you send text to/from TCP sockets
 and unix-domain sockets.  it can even (temporarily) attach the terminal, or 
 use GNU's readline to regain interactive control of KVM/Qemu

Hi. Yes, I'm aware of socat, netcat, tcpclient et al. and even have a
similar pair of little unix/tcp/udp/syslogging utilities myself called
sk/skd which I initially used for scripting our local kvm management system.

However, it's a little bit clumsy to use these tools correctly from a shell
script if you want to get back the command output intact. You need to open
your connection to the unix server socket, wait for the prompt (skipping the
welcome banner), send the command, copy the response out until you get a
line '(qemu) ', then disconnect. For the same reason you can't do

  echo -e "GET / HTTP/1.1\n\n" >/dev/tcp/www.google.com/80
  cat </dev/tcp/www.google.com/80

having to write

  exec 3<>/dev/tcp/www.google.com/80
  echo -e "GET / HTTP/1.1\n\n" >&3
  cat <&3

instead, you need to avoid disconnecting from the socket in the middle of
the command/response exchange.
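
A minimal sketch of the wait-for-prompt step, assuming the monitor prompt arrives literally as "(qemu) " at the start of a line. The helper read_until_prompt is hypothetical and written for illustration; it is not the code from qemu-send.c, and it ignores any terminal control sequences the monitor may emit.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PROMPT "(qemu) "

/* Read from fd one byte at a time until the data read so far ends with
 * PROMPT and the prompt starts at the beginning of a line. Everything
 * before the prompt is left in out (NUL-terminated) and its length is
 * returned. Returns -1 on EOF, read error, or if out would overflow. */
static long read_until_prompt(int fd, char *out, size_t cap)
{
    const size_t plen = strlen(PROMPT);
    size_t len = 0;
    char c;

    while (len + 1 < cap) {
        ssize_t r = read(fd, &c, 1);

        if (r <= 0)
            return -1;
        out[len++] = c;
        if (len >= plen &&
            memcmp(out + len - plen, PROMPT, plen) == 0 &&
            (len == plen || out[len - plen - 1] == '\n')) {
            len -= plen;          /* drop the prompt itself */
            out[len] = '\0';
            return (long)len;
        }
    }
    return -1;
}
```

On the same connection, a client would call this once to discard the banner, write the command followed by a newline, then call it again to collect the response before disconnecting.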

(In fact, with qemu, it nearly works anyway: the new connection gets all the
output and the next prompt from the old one before the new banner, so you
just have a couple of extra prompts, a command echo and a banner at the top
and bottom to filter away. However, I'd be very reluctant to rely on this
behaviour, and in particular on it not losing output between connections.
The method I implemented in qemu-send.c should be robust against changes in
the way qemu handles its monitor sockets.)

To get the convenient syntax and behaviour I wanted, it felt easier
and cleaner to write the few lines of C needed for a standalone utility
rather than introduce a parsing shell script/function plus a dependency on
one of sk/socat/netcat/tcpclient. I suspect also that I'm just more
comfortable in C than sh; YMMV!

Cheers,

Chris.