Cc'ing Paolo and Benoît.
Best regards,
-Gonglei

> -----Original Message-----
> From: Gonglei (Arei)
> Sent: Thursday, November 27, 2014 8:58 PM
> To: qemu-devel@nongnu.org
> Subject: [BUG] Redhat-6.4_64bit-guest kernel panic with cpu-passthrough and guest numa
>
> Hi,
>
> Running a redhat-6.4-64bit (kernel 2.6.32-358.el6.x86_64) or older guest on
> qemu-2.1 with KVM enabled, -cpu host, a non-default CPU topology, and guest
> NUMA, I'm seeing a reliable kernel panic from the guest shortly after boot.
> It happens in find_busiest_group().
>
> Using git bisect, we found that it has happened since commit
> 787aaf5703a702094f395db6795e74230282cd62.
>
> The reproducer:
>
> (1) full qemu cmd line:
> qemu-system-x86_64 -machine pc-i440fx-2.1,accel=kvm,usb=off \
> -cpu host -m 16384 \
> -smp 16,sockets=2,cores=4,threads=2 \
> -object memory-backend-ram,size=8192M,id=ram-node0 \
> -numa node,nodeid=0,cpus=0-7,memdev=ram-node0 \
> -object memory-backend-ram,size=8192M,id=ram-node1 \
> -numa node,nodeid=1,cpus=8-15,memdev=ram-node1 \
> -boot c -drive file=/data/wxin/vm/redhat_6.4_64 \
> -vnc 0.0.0.0:0 -device cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x1.0x4 \
> -msg timestamp=on
>
> (2) the guest kernel messages:
>
> divide error: 0000 [#1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
>
> Pid: 1, comm: swapper Not tainted 2.6.32-358.el6.x86_64 #1 QEMU Standard PC (i440FX + PIIX, 1996)
> RIP: 0010:[<ffffffff81059a9c>] [<ffffffff81059a9c>] find_busiest_group+0x55c/0x9f0
> RSP: 0018:ffff88023c85f9e0 EFLAGS: 00010046
> RAX: 0000000000100000 RBX: ffff88023c85fbdc RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000010 RDI: 0000000000000010
> RBP: ffff88023c85fb50 R08: ffff88023ca16c10 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000000 R12: 00000000ffffff01
> R13: 0000000000016700 R14: ffffffffffffffff R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000001a85000 CR4: 00000000000407f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 1, threadinfo ffff88023c85e000, task ffff88043d27c040)
> Stack:
> ffff88023c85faf0 ffff88023c85fa60 ffff88023c85fbc8 0000000200000000
> <d> 0000000100000000 ffff880028210b60 0000000100000001 0000000000000008
> <d> 0000000000016700 0000000000016700 ffff88023ca16c00 0000000000016700
> Call Trace:
> [<ffffffff8150da2a>] thread_return+0x398/0x76e
> [<ffffffff8150e555>] schedule_timeout+0x215/0x2e0
> [<ffffffff81065905>] ? enqueue_entity+0x125/0x410
> [<ffffffff8150e1d3>] wait_for_common+0x123/0x180
> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
> [<ffffffff8150e2ed>] wait_for_completion+0x1d/0x20
> [<ffffffff81096a89>] kthread_create+0x99/0x120
> [<ffffffff81090950>] ? worker_thread+0x0/0x2a0
> [<ffffffff81167769>] ? alternate_node_alloc+0xc9/0xe0
> [<ffffffff810908d9>] create_workqueue_thread+0x59/0xd0
> [<ffffffff8150ebce>] ? mutex_lock+0x1e/0x50
> [<ffffffff810911bd>] __create_workqueue_key+0x14d/0x200
> [<ffffffff81c47233>] init_workqueues+0x9f/0xb1
> [<ffffffff81c2788c>] kernel_init+0x25e/0x2fe
> [<ffffffff8100c0ca>] child_rip+0xa/0x20
> [<ffffffff81c2762e>] ? kernel_init+0x0/0x2fe
> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
> Code: 8b b5 b0 fe ff ff 48 8b bd b8 fe ff ff e8 9d 85 ff ff 0f 1f 44 00 00 48 8b 95 e0 fe ff ff 48 8b 45 a8 8b 4a 08 48 c1 e0 0a 31 d2 <48> f7 f1 48 8b 4d b0 48 89 45 a0 31 c0 48 85 c9 74 0c 48 8b 45
> RIP [<ffffffff81059a9c>] find_busiest_group+0x55c/0x9f0
> RSP <ffff88023c85f9e0>
> divide error: 0000 [#2]
> ---[ end trace d7d20afc6dd05e71 ]---
> Kernel panic - not syncing: Fatal exception
> Pid: 1, comm: swapper Tainted: G D --------------- 2.6.32-358.el6.x86_64 #1
> Call Trace:
> [<ffffffff8150cfc8>] ? panic+0xa7/0x16f
> [<ffffffff815111f4>] ? oops_end+0xe4/0x100
> [<ffffffff8100f19b>] ? die+0x5b/0x90
> [<ffffffff81510a34>] ? do_trap+0xc4/0x160
> [<ffffffff8100cf7f>] ? do_divide_error+0x8f/0xb0
> [<ffffffff81059a9c>] ? find_busiest_group+0x55c/0x9f0
> [<ffffffff8113b3a9>] ? zone_statistics+0x99/0xc0
> [<ffffffff8100bdfb>] ? divide_error+0x1b/0x20
> [<ffffffff81059a9c>] ? find_busiest_group+0x55c/0x9f0
> [<ffffffff8150da2a>] ? thread_return+0x398/0x76e
> [<ffffffff8150e555>] ? schedule_timeout+0x215/0x2e0
> [<ffffffff81065905>] ? enqueue_entity+0x125/0x410
> [<ffffffff8150e1d3>] ? wait_for_common+0x123/0x180
> [<ffffffff81063310>] ? default_wake_function+0x0/0x20
> [<ffffffff8150e2ed>] ? wait_for_completion+0x1d/0x20
> [<ffffffff81096a89>] ? kthread_create+0x99/0x120
> [<ffffffff81090950>] ? worker_thread+0x0/0x2a0
> [<ffffffff81167769>] ? alternate_node_alloc+0xc9/0xe0
> [<ffffffff810908d9>] ? create_workqueue_thread+0x59/0xd0
> [<ffffffff8150ebce>] ? mutex_lock+0x1e/0x50
> [<ffffffff810911bd>] ? __create_workqueue_key+0x14d/0x200
> [<ffffffff81c47233>] ? init_workqueues+0x9f/0xb1
> [<ffffffff81c2788c>] ? kernel_init+0x25e/0x2fe
> [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
> [<ffffffff81c2762e>] ? kernel_init+0x0/0x2fe
> [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
>
> ---
> (3) host info:
>
> /proc/cpuinfo on the host has 16 of these:
>
> processor : 15
> vendor_id : GenuineIntel
> cpu family : 6
> model : 45
> model name : Intel(R) Xeon(R) CPU E5-2643 0 @ 3.30GHz
> stepping : 7
> microcode : 1803
> cpu MHz : 3301.000
> cache size : 10240 KB
> physical id : 1
> siblings : 8
> core id : 3
> cpu cores : 4
> apicid : 39
> initial apicid : 39
> fpu : yes
> fpu_exception : yes
> cpuid level : 13
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
> pdpe1gb
> rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
> nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx
> est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt
> tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts
> dtherm tpr_shadow vnmi flexpriority ept vpid
> bogomips : 6599.83
> clflush size : 64
> cache_alignment : 64
> address sizes : 46 bits physical, 48 bits virtual
> power management:
>
>
> host numa topo:
>
> node 0 cpus: 0 1 2 3 8 9 10 11
> node 0 size: 40936 MB
> node 0 free: 39625 MB
> node 1 cpus: 4 5 6 7 12 13 14 15
> node 1 size: 40960 MB
> node 1 free: 39876 MB
> node distances:
> node 0 1
> 0: 10 21
> 1: 21 10
>
> (4) With "sched_debug loglevel=8" on the kernel command line,
> you can see the following error log (note the "ERROR" lines):
>
> CPU0 attaching sched-domain:
> domain 0: span 0-15 level MC
> groups: 0 (cpu_power = 1023) 1 2 3 4 5 6 7 8 9 10 (cpu_power = 1023) 11 12 13 14 15
> ERROR: parent span is not a superset of domain->span
> domain 1: span 0-7 level CPU
> ERROR: domain->groups does not contain CPU0
> groups: 8-15 (cpu_power = 16382)
> ERROR: groups don't span domain->span
> domain 2: span 0-15 level NODE
> groups:
> ERROR: domain->cpu_power not set
>
> Any comments and help will be appreciated!
>
> Best regards,
> -Gonglei
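
A side note on the oops in (2), offered as a reading aid and not as a diagnosis. The byte marked `<48>` in the "Code:" line is where the faulting instruction starts, and the three bytes `48 f7 f1` can be decoded by hand. The small Python sketch below does that for this one instruction form only. The names `CODE_LINE`, `REGS` and `faulting_insn` are made up for illustration; the byte string and register values are copied from the oops. A zero divisor would be consistent with the "domain->cpu_power not set" error in (4), but the sketch itself does not prove that link.

```python
# Excerpts copied from the oops above: the bytes around the <48> marker
# and the three registers that matter for a 64-bit divide.
CODE_LINE = ("48 8b 95 e0 fe ff ff 48 8b 45 a8 8b 4a 08 48 c1 e0 0a 31 d2 "
             "<48> f7 f1 48 8b 4d b0 48 89 45 a0")
REGS = {"RAX": 0x0000000000100000, "RCX": 0x0, "RDX": 0x0}

# x86-64 register numbering for ModRM.rm when no REX.B bit is set.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
# Opcode F7 is a group; ModRM.reg selects the operation. Only the
# multiply/divide members are listed because that is all we need here.
GROUP_F7 = {4: "mul", 5: "imul", 6: "div", 7: "idiv"}


def faulting_insn(code_line):
    """Decode the instruction at the <..> marker.

    Handles only the 'REX.W F7 /r' register-direct form; anything else
    raises ValueError because it is outside the scope of this sketch.
    """
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    rex, opcode, modrm = (int(t.strip("<>"), 16)
                          for t in tokens[start:start + 3])
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if rex != 0x48 or opcode != 0xF7 or mod != 0b11 or reg not in GROUP_F7:
        raise ValueError("instruction form not handled by this sketch")
    return GROUP_F7[reg], REG64[rm]


if __name__ == "__main__":
    op, divisor = faulting_insn(CODE_LINE)
    value = REGS[divisor.upper()]
    print(f"faulting instruction: {op} %{divisor}, {divisor} = {value:#x}")
```

So the faulting instruction is an unsigned 64-bit divide by RCX, and the register dump shows RCX as zero, which matches the "divide error" trap.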