Public bug reported:

Hi,
this seems to me to be a kernel crash of some sort.
Somewhat in the spirit of older bugs:
- bug 1630940
- bug 1630578

Xnox asked me to look into a hang on openvswitch dep8 tests.
Initially, all I found in the log was
"ERROR: Removing temporary files on testbed timed out"

That message brought me to the two bugs above.
But there I read that it was the infra running dep8 that crashed.
So for a better bug report I tried to reproduce locally, and that actually seems to work very reliably.

To reproduce do:
$ autopkgtest-buildvm-ubuntu-cloud -a i386 -r artful -s 10G
$ pull-lp-source openvswitch
$ autopkgtest --apt-upgrade --shell --no-built-binaries openvswitch_2.8.0~git20170809.7aa47a19d-0ubuntu1.dsc -- qemu ~/work/autopkgtest-artful-i386.img
# This guest currently will crash after a while of testing

While that is running you can attach to the console and monitor of that guest.
For example:
$ sudo nc -U /tmp/autopkgtest-virt-qemu.*/ttyS0

That gave me a crash at the time of the hang, which kind of matches the older bugs. Here is the console:
[ 54.256253] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 54.257156] IP: add_grec+0x28/0x440
[ 54.257553] *pdpt = 000000001a869001 *pde = 0000000000000000
[ 54.257555]
[ 54.258338] Oops: 0000 [#1] SMP
[ 54.258638] Modules linked in: veth openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack libcrc32c 9p fscache ppdev kvm_intel kvm irqbypass joydev input_leds serio_raw 9pnet_virtio parport_pc 9pnet parport qemu_fw_cfg i2c_piix4 mac_hid ip_tables x_tables autofs4 btrfs xor raid6_pq psmouse virtio_blk virtio_net floppy pata_acpi
[ 54.261891] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.12.0-11-generic #12-Ubuntu
[ 54.262715] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1~cloud0 04/01/2014
[ 54.263610] task: c7b622c0 task.stack: c7b5a000
[ 54.264039] EIP: add_grec+0x28/0x440
[ 54.264378] EFLAGS: 00010202 CPU: 0
[ 54.264711] EAX: 00000000 EBX: dd062540 ECX: 00000006 EDX: dd062540
[ 54.265308] ESI: dd1e6e00 EDI: dd1e6e00 EBP: dbcc5f30 ESP: dbcc5ef0
[ 54.265793] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 54.266297] CR0: 80050033 CR2: 00000000 CR3: 1e930d80 CR4: 000006f0
[ 54.266885] Call Trace:
[ 54.267120] <SOFTIRQ>
[ 54.267349] mld_ifc_timer_expire+0xfe/0x250
[ 54.267754] ? mld_dad_timer_expire+0x50/0x50
[ 54.268173] call_timer_fn+0x30/0x120
[ 54.268524] ? mld_dad_timer_expire+0x50/0x50
[ 54.268942] ? mld_dad_timer_expire+0x50/0x50
[ 54.269364] run_timer_softirq+0x3c5/0x420
[ 54.269760] ? __softirqentry_text_start+0x8/0x8
[ 54.270198] __do_softirq+0xa9/0x245
[ 54.270539] ? __softirqentry_text_start+0x8/0x8
[ 54.270976] do_softirq_own_stack+0x24/0x30
[ 54.271373] </SOFTIRQ>
[ 54.271611] irq_exit+0xad/0xb0
[ 54.271913] smp_apic_timer_interrupt+0x38/0x50
[ 54.272344] apic_timer_interrupt+0x39/0x40
[ 54.272745] EIP: native_safe_halt+0x5/0x10
[ 54.273139] EFLAGS: 00000246 CPU: 0
[ 54.273624] EAX: 00000000 EBX: 00000000 ECX: 00000001 EDX: 00000000
[ 54.274000] ESI: 00000000 EDI: c7b622c0 EBP: c7b5bf10 ESP: c7b5bf10
[ 54.274361] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 54.274687] default_idle+0x1c/0xf0
[ 54.274896] arch_cpu_idle+0xe/0x10
[ 54.275286] default_idle_call+0x1e/0x30
[ 54.275787] do_idle+0x145/0x1c0
[ 54.276092] cpu_startup_entry+0x65/0x70
[ 54.276441] rest_init+0x62/0x70
[ 54.276718] start_kernel+0x3be/0x3d7
[ 54.276974] i386_start_kernel+0x9d/0xa1
[ 54.277260] startup_32_smp+0x16b/0x16d
[ 54.277509] Code: 00 00 00 3e 8d 74 26 00 55 89 e5 57 56 53 89 c6 83 ec 34 89 4d e8 65 a1 14 00 00 00 89 45 f0 31 c0 8b 42 10 f6 42 48 08 89 45 cc <8b> 00 c7 45 ec 00 00 00 00 89 45 c8 89 f0 0f 85 b4 02 00 00 8b
[ 54.279314] EIP: add_grec+0x28/0x440 SS:ESP: 0068:dbcc5ef0
[ 54.279829] CR2: 0000000000000000
[ 54.280143] ---[ end trace 3164b1c0dd7745bc ]---
[ 54.280550] Kernel panic - not syncing: Fatal exception in interrupt
[ 54.281078] Kernel Offset: 0x6000000 from 0xc1000000 (relocation range: 0xc0000000-0xdfbfdfff)
[ 54.281797] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

But since this
is lovely qemu, the machine isn't as dead as real HW would be now, so via the monitor
$ sudo nc -U /tmp/autopkgtest-virt-qemu.*/monitor
I took a dump.

Fetching the i386 debug kernel shows me I can load that in crash.
But I'd leave the authoritative look at what happened to the kernel team.
So I shared the dump via fileshare.
Please load it with the debug kernel from linux-image-4.12.0-11-generic-dbgsym.

I have issues properly loading it in crash, as this is an i386 artful guest on a 64-bit KVM host, so the format of the kdump generated by qemu is x86_64 and something seems to disagree.
But since I have simple repro steps above, I hope you can crash the guest and analyze it the way you want/need.

In case you still want my dump, fetch it from https://private-fileshare.canonical.com/~paelzer/ovs-dep8-i386.vmcore.kdump.zlib
I also have raw (non-kdump style) formats if needed.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1712831

Title:
  4.12.0-11-generic - crashing in infrastructure on i386 openvswitch tests

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1712831/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp