Re: [Qemu-devel] [BUG] Guest kernel divide error in kvm_unlock_kick

2014-09-22 Thread Paolo Bonzini
On 11/09/2014 19:03, Chris Webb wrote:
 Paolo Bonzini pbonz...@redhat.com wrote:
 
 This is a hypercall that should have kicked VCPU 3 (see rcx).

 Can you please apply this patch and gather a trace of the host
 (using trace-cmd -e kvm qemu-kvm arguments)?
 
 Sure, no problem. I've built the trace-cmd tool against udis86 (I hope) and
 have put the resulting trace.dat at
 
   http://cdw.me.uk/tmp/trace.dat
 
 This is actually for a -smp 2 qemu (failing to kick VCPU 1?) as I was having
 trouble persuading the -smp 4 qemu to crash as reliably under tracing.
 (Something timing related?) Otherwise the qemu-system-x86 command line is
 exactly as before.

Do you by chance have CONFIG_DEBUG_RODATA set?  In that case, the fix is
simply not to set it.

Paolo





Re: [Qemu-devel] [BUG] Guest kernel divide error in kvm_unlock_kick

2014-09-22 Thread Chris Webb
Paolo Bonzini pbonz...@redhat.com wrote:

 On 11/09/2014 19:03, Chris Webb wrote:
 Paolo Bonzini pbonz...@redhat.com wrote:
 
 This is a hypercall that should have kicked VCPU 3 (see rcx).
 
 Can you please apply this patch and gather a trace of the host
 (using trace-cmd -e kvm qemu-kvm arguments)?
 
 Sure, no problem. I've built the trace-cmd tool against udis86 (I hope) and
 have put the resulting trace.dat at
 
  http://cdw.me.uk/tmp/trace.dat
 
 This is actually for a -smp 2 qemu (failing to kick VCPU 1?) as I was having
 trouble persuading the -smp 4 qemu to crash as reliably under tracing.
 (Something timing related?) Otherwise the qemu-system-x86 command line is
 exactly as before.
 
 Do you by chance have CONFIG_DEBUG_RODATA set?  In that case, the fix is
 simply not to set it.

Absolutely right: my host and guest kernels do have CONFIG_DEBUG_RODATA set!

Your patch to use alternatives for VMCALL vs VMMCALL definitely fixed the
divide-by-zero crashes I saw.

Given that I can easily use either (or both) of these solutions, would it be
more efficient to turn off CONFIG_DEBUG_RODATA in the guest kernel so kvm
can fix up the instructions in-place, or is using alternatives for
VMCALL/VMMCALL as implemented by your patch just as good?

Best wishes,

Chris.


Re: [Qemu-devel] [BUG] Guest kernel divide error in kvm_unlock_kick

2014-09-22 Thread Paolo Bonzini
On 22/09/2014 21:08, Chris Webb wrote:
  Do you by chance have CONFIG_DEBUG_RODATA set?  In that case, the fix is
  simply not to set it.
 
 Absolutely right: my host and guest kernels do have CONFIG_DEBUG_RODATA set!
 
 Your patch to use alternatives for VMCALL vs VMMCALL definitely fixed the
 divide-by-zero crashes I saw.
 
 Given that I can easily use either (or both) of these solutions, would it be
 more efficient to turn off CONFIG_DEBUG_RODATA in the guest kernel so kvm
 can fix up the instructions in-place, or is using alternatives for
 VMCALL/VMMCALL as implemented by your patch just as good?

I posted a patch to use alternatives if CONFIG_DEBUG_RODATA is enabled,
but the bug is in KVM, which explicitly documents that you can use either
VMCALL or VMMCALL.

I'll also look into fixing KVM, but the patch is still useful because a) KVM
would not patch a read-only text segment, so there would be a small
performance benefit; b) you cannot control already deployed hypervisors.

However, since there is a workaround, I won't push it into 3.17 so late
in the cycle.  Also, there's a chance that it is NACKed since it touches
non-KVM files.

Paolo
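
For readers following the VMCALL/VMMCALL discussion, the "Code:" line of the
oops can be searched for the hypercall instruction. The following is a minimal
sketch in Python and is not part of the thread; the helper name
find_hypercalls is made up for illustration. It assumes the standard encodings
0f 01 c1 (VMCALL) and 0f 01 d9 (VMMCALL). It is a plain byte search, not a
disassembler, so on other dumps it may report false positives.

```python
# Opcode encodings of the two hypercall instructions.
VMCALL = bytes.fromhex("0f01c1")   # Intel VT-x
VMMCALL = bytes.fromhex("0f01d9")  # AMD SVM


def find_hypercalls(code_line: str):
    """Return sorted (offset, name) pairs for hypercall opcodes in a 'Code:' dump."""
    # Drop the label and the <..> marker some kernels put around the faulting byte.
    text = code_line.replace("Code:", "").replace("<", "").replace(">", "")
    data = bytes.fromhex("".join(text.split()))
    hits = []
    for name, opcode in (("VMCALL", VMCALL), ("VMMCALL", VMMCALL)):
        start = data.find(opcode)
        while start != -1:
            hits.append((start, name))
            start = data.find(opcode, start + 1)
    return sorted(hits)


# The "Code:" bytes from the oops quoted in this thread.
code = (
    "Code: c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0 75 "
    "c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 0f 01 c1 0f 1f 00 "
    "5b 5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00"
)
hits = find_hypercalls(code)
print(hits)  # [(43, 'VMCALL')]
```

The guest kernel was built with the Intel encoding (VMCALL) and is running on
an AMD host, which is the situation the alternatives patch is meant to handle.
The five bytes just before the match, b8 05 00 00 00, load 5 into eax, which
agrees with RAX in the register dump.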



Re: [Qemu-devel] [BUG] Guest kernel divide error in kvm_unlock_kick

2014-09-11 Thread Paolo Bonzini
On 08/09/2014 15:28, Chris Webb wrote:
 divide error:  [#1] PREEMPT SMP 
 Modules linked in:
 CPU: 0 PID: 743 Comm: syslogd Not tainted 3.16.2-guest #2
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
 rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
 task: 88007c972580 ti: 88007cb7c000 task.ti: 88007cb7c000
 RIP: 0010:[81037fe2]  [81037fe2] kvm_unlock_kick+0x72/0x80
 RSP: :88007fc03ec8  EFLAGS: 00010046
 RAX: 0005 RBX:  RCX: 0003
 RDX: 0003 RSI: 81a466a0 RDI: 
 RBP: 81a466a0 R08: 81b98940 R09: 0246
 R10: 0400 R11:  R12: 00ea
 R13: 0009 R14: 0002 R15: 88007fc0d300
 FS:  7f2a6473e700() GS:88007fc0() knlGS:
 CS:  0010 DS:  ES:  CR0: 8005003b
 CR2: 004a8240 CR3: 7ac75000 CR4: 000406f0
 Stack:
  81a46400 0246 0001 8168979d
  0282 81110d97 0007 88007cb7ffd8
  88007c972580 4b0782e8 0002 81a0b0c8
 Call Trace:
  IRQ 
  [8168979d] ? _raw_spin_unlock_irqrestore+0x5d/0x80
  [81110d97] ? rcu_process_callbacks+0x337/0x4f0
  [810cde2d] ? __do_softirq+0xfd/0x210
  [810ce06e] ? irq_exit+0x7e/0xa0
  [8103063b] ? smp_apic_timer_interrupt+0x3b/0x50
  [8168b04d] ? apic_timer_interrupt+0x6d/0x80
  EOI 
  [8114180b] ? filemap_map_pages+0x17b/0x240
  [811418c0] ? filemap_map_pages+0x230/0x240
  [811679e2] ? do_read_fault.isra.70+0x2a2/0x320
  [811696cc] ? handle_mm_fault+0x37c/0xd00
  [8103bb45] ? __do_page_fault+0x185/0x4c0
  [8168b958] ? async_page_fault+0x28/0x30
  [813b9610] ? __put_user_4+0x20/0x30
  [8168b958] ? async_page_fault+0x28/0x30
 Code: c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0 75 
 c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 0f 01 c1 0f 1f 00 
 5b 5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00 

Hi Chris,

sorry for not following up on your previous patch.

This is a hypercall that should have kicked VCPU 3 (see rcx).
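
The reading of the register dump above can be reproduced with a short Python
sketch (not part of the original mail). It assumes the usual KVM hypercall
convention on x86: the hypercall number is in RAX, the first argument in RBX
and the second in RCX, with KVM_HC_KICK_CPU equal to 5 and the target APIC ID
as the second argument. The helper name parse_registers is made up for
illustration. Values that are blank in the archived dump are returned as None
rather than guessed.

```python
import re

KVM_HC_KICK_CPU = 5  # hypercall number, assumed from include/uapi/linux/kvm_para.h


def parse_registers(dump: str) -> dict:
    """Map register names to integers; values missing from the dump become None."""
    regs = {}
    for name, value in re.findall(r"\b(R[A-Z0-9]{2}):[ \t]*([0-9a-f]*)", dump):
        regs[name] = int(value, 16) if value else None
    return regs


# Two register lines copied from the quoted oops (leading zeros were lost in the archive).
dump = """\
RAX: 0005 RBX:  RCX: 0003
RDX: 0003 RSI: 81a466a0 RDI:
"""
regs = parse_registers(dump)
is_kick = regs["RAX"] == KVM_HC_KICK_CPU
target = regs["RCX"]
print(is_kick, target)
```

With the values from the quoted oops, RAX matches KVM_HC_KICK_CPU and RCX is 3,
which is the VCPU the kick was aimed at.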

Can you please apply this patch and gather a trace of the host
(using "trace-cmd record -e kvm qemu-kvm <arguments>")?

Thanks,

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index fb919c574e23..25ed29f68419 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -709,6 +709,8 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 	int result = 0;
 	struct kvm_vcpu *vcpu = apic->vcpu;
 
+	trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
+				  trig_mode, vector, false);
 	switch (delivery_mode) {
 	case APIC_DM_LOWEST:
 		vcpu->arch.apic_arb_prio++;
@@ -730,8 +732,6 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 			kvm_make_request(KVM_REQ_EVENT, vcpu);
 			kvm_vcpu_kick(vcpu);
 		}
-		trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
-					  trig_mode, vector, false);
 		break;
 
 	case APIC_DM_REMRD:



Paolo



Re: [Qemu-devel] [BUG] Guest kernel divide error in kvm_unlock_kick

2014-09-11 Thread Chris Webb
Paolo Bonzini pbonz...@redhat.com wrote:

 This is a hypercall that should have kicked VCPU 3 (see rcx).
 
 Can you please apply this patch and gather a trace of the host
 (using trace-cmd -e kvm qemu-kvm arguments)?

Sure, no problem. I've built the trace-cmd tool against udis86 (I hope) and
have put the resulting trace.dat at

  http://cdw.me.uk/tmp/trace.dat

This is actually for a -smp 2 qemu (failing to kick VCPU 1?) as I was having
trouble persuading the -smp 4 qemu to crash as reliably under tracing.
(Something timing related?) Otherwise the qemu-system-x86 command line is
exactly as before.

The guest kernel crash message which corresponds to this trace was:

divide error:  [#1] PREEMPT SMP 
Modules linked in:
CPU: 0 PID: 618 Comm: mkdir Not tainted 3.16.2-guest #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
task: 88007c997080 ti: 88007c614000 task.ti: 88007c614000
RIP: 0010:[81037fe2]  [81037fe2] kvm_unlock_kick+0x72/0x80
RSP: 0018:88007c617d40  EFLAGS: 00010046
RAX: 0005 RBX:  RCX: 0001
RDX: 0001 RSI: 88007fd11c40 RDI: 
RBP: 88007fd11c40 R08: 81b98940 R09: 0001
R10:  R11: 0007 R12: 00f6
R13: 0001 R14: 0001 R15: 00011c40
FS:  7f43eb1ed700() GS:88007fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 7f43eace0a30 CR3: 01a12000 CR4: 000406f0
Stack:
 88007c994380 88007c9949aa 0046 81689715
 810f3174 0001 ea0001f16320 ea0001f17860
  88007c99e1e8 88007c997080 0001
Call Trace:
 [81689715] ? _raw_spin_unlock+0x45/0x70
 [810f3174] ? try_to_wake_up+0x2a4/0x330
 [81101e2c] ? __wake_up_common+0x4c/0x80
 [81102418] ? __wake_up_sync_key+0x38/0x60
 [810d873a] ? do_notify_parent+0x19a/0x280
 [810f4d56] ? sched_move_task+0xb6/0x190
 [810cb4fc] ? do_exit+0xa1c/0xab0
 [810cc344] ? do_group_exit+0x34/0xb0
 [810cc3cb] ? SyS_exit_group+0xb/0x10
 [8168a16d] ? system_call_fastpath+0x1a/0x1f
Code: c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0 75 
c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 0f 01 c1 0f 1f 00 5b 
5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00 
RIP  [81037fe2] kvm_unlock_kick+0x72/0x80
 RSP 88007c617d40
---[ end trace bf5a4445f9decdbb ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: mkdir/618/0x0006
Modules linked in:
CPU: 0 PID: 618 Comm: mkdir Tainted: G  D   3.16.2-guest #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
  c022d302 81684029 
 810ee956 81686266 00011c40 88007c617fd8
 00011c40 88007c997080 0006 0046
Call Trace:
 [81684029] ? dump_stack+0x49/0x6a
 [810ee956] ? __schedule_bug+0x46/0x60
 [81686266] ? __schedule+0x5a6/0x7c0
 [816828cd] ? printk+0x59/0x75
 [810cb33b] ? do_exit+0x85b/0xab0
 [816828cd] ? printk+0x59/0x75
 [8100614a] ? oops_end+0x7a/0x100
 [810033e5] ? do_error_trap+0x85/0x110
 [81037fe2] ? kvm_unlock_kick+0x72/0x80
 [8114a358] ? __alloc_pages_nodemask+0x108/0xa60
 [8168b57e] ? divide_error+0x1e/0x30
 [81037fe2] ? kvm_unlock_kick+0x72/0x80
 [81689715] ? _raw_spin_unlock+0x45/0x70
 [810f3174] ? try_to_wake_up+0x2a4/0x330
 [81101e2c] ? __wake_up_common+0x4c/0x80
 [81102418] ? __wake_up_sync_key+0x38/0x60
 [810d873a] ? do_notify_parent+0x19a/0x280
 [810f4d56] ? sched_move_task+0xb6/0x190
 [810cb4fc] ? do_exit+0xa1c/0xab0
 [810cc344] ? do_group_exit+0x34/0xb0
 [810cc3cb] ? SyS_exit_group+0xb/0x10
 [8168a16d] ? system_call_fastpath+0x1a/0x1f

Best wishes,

Chris.


[Qemu-devel] [BUG] Guest kernel divide error in kvm_unlock_kick

2014-09-08 Thread Chris Webb
I've reported this bug before, which reliably crashes a guest kernel shortly
after boot, but have just reconfirmed that it is still present with Linux
3.16.2 guest and host kernels and Qemu 2.1.

Running a 3.16.2 x86-64 SMP guest kernel on qemu-2.1, with kvm enabled and
-cpu host on a 3.16.2 AMD Opteron host, I'm seeing a reliable kernel panic
from the guest shortly after boot. I think it is happening in kvm_unlock_kick()
in the paravirt_ops code:

divide error:  [#1] PREEMPT SMP 
Modules linked in:
CPU: 0 PID: 743 Comm: syslogd Not tainted 3.16.2-guest #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
task: 88007c972580 ti: 88007cb7c000 task.ti: 88007cb7c000
RIP: 0010:[81037fe2]  [81037fe2] kvm_unlock_kick+0x72/0x80
RSP: :88007fc03ec8  EFLAGS: 00010046
RAX: 0005 RBX:  RCX: 0003
RDX: 0003 RSI: 81a466a0 RDI: 
RBP: 81a466a0 R08: 81b98940 R09: 0246
R10: 0400 R11:  R12: 00ea
R13: 0009 R14: 0002 R15: 88007fc0d300
FS:  7f2a6473e700() GS:88007fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 004a8240 CR3: 7ac75000 CR4: 000406f0
Stack:
 81a46400 0246 0001 8168979d
 0282 81110d97 0007 88007cb7ffd8
 88007c972580 4b0782e8 0002 81a0b0c8
Call Trace:
 IRQ 
 [8168979d] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [81110d97] ? rcu_process_callbacks+0x337/0x4f0
 [810cde2d] ? __do_softirq+0xfd/0x210
 [810ce06e] ? irq_exit+0x7e/0xa0
 [8103063b] ? smp_apic_timer_interrupt+0x3b/0x50
 [8168b04d] ? apic_timer_interrupt+0x6d/0x80
 EOI 
 [8114180b] ? filemap_map_pages+0x17b/0x240
 [811418c0] ? filemap_map_pages+0x230/0x240
 [811679e2] ? do_read_fault.isra.70+0x2a2/0x320
 [811696cc] ? handle_mm_fault+0x37c/0xd00
 [8103bb45] ? __do_page_fault+0x185/0x4c0
 [8168b958] ? async_page_fault+0x28/0x30
 [813b9610] ? __put_user_4+0x20/0x30
 [8168b958] ? async_page_fault+0x28/0x30
Code: c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0 75 
c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 0f 01 c1 0f 1f 00 5b 
5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00 
RIP  [81037fe2] kvm_unlock_kick+0x72/0x80
 RSP 88007fc03ec8
---[ end trace be08885ac2c94c6a ]---
Kernel panic - not syncing: Fatal exception in interrupt

My host kernel config is http://cdw.me.uk/tmp/host-config.txt and the guest
config is http://cdw.me.uk/tmp/guest-config.txt with qemu command line:

 qemu-system-x86 -enable-kvm -cpu host -machine q35 -m 2048 -name $1 \
   -smp sockets=1,cores=4 -pidfile /run/$1.pid -runas nobody \
   -serial stdio -vga none -vnc none -kernel /boot/vmlinuz-guest \
   -append "console=ttyS0 root=/dev/vda" \
   -drive file=/dev/guest/$1,cache=none,format=raw,if=virtio \
   -device virtio-rng-pci \
   -device virtio-net-pci,netdev=nic,mac=$(< /sys/class/net/$1/address) \
   -netdev tap,id=nic,fd=3 3<>/dev/tap$(< /sys/class/net/$1/ifindex)

I can stop this crash by disabling CONFIG_PARAVIRT_SPINLOCKS in my guest
kernel, running with -cpu qemu64 instead of -cpu host, or running with -smp 1
instead of -smp 4. (Removing/changing the -machine q35 makes no difference.)
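
Since the crash depends on guest kernel options, it can help to check a config
file for the two options named in this thread: CONFIG_PARAVIRT_SPINLOCKS here,
and CONFIG_DEBUG_RODATA in the follow-up messages. The following Python sketch
is not part of the original report. The helper name enabled_options and the
sample text are made up for illustration, and the sample is not the actual
guest config linked above.

```python
def enabled_options(config_text: str) -> set:
    """Return the names of options set to y or m in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skips comments, including '# CONFIG_FOO is not set'
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return enabled


# Options that this thread associates with the crash.
SUSPECT = {"CONFIG_PARAVIRT_SPINLOCKS", "CONFIG_DEBUG_RODATA"}

# Hypothetical config fragment, for illustration only.
sample = """\
CONFIG_PARAVIRT=y
CONFIG_PARAVIRT_SPINLOCKS=y
# CONFIG_PARAVIRT_DEBUG is not set
CONFIG_DEBUG_RODATA=y
"""
found = sorted(SUSPECT & enabled_options(sample))
print(found)
```

To check a real kernel, read its config file into a string and pass that in
place of the sample text.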

/proc/cpuinfo on the host has 8 of these:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD Opteron(tm) Processor 6328
stepping        : 0
microcode       : 0x600081c
cpu MHz         : 3200.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 32
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb 
rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf 
pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c 
lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch 
osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core 
perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean 
flushbyasid decodeassists pausefilter pfthreshold bmi1
bogomips        : 6399.70
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

and on the guest, has 4 of these:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      :