Public bug reported:

Hi,
this looks to me like a kernel crash of some sort.
Somewhat in the spirit of older bugs:
- bug 1630940
- bug 1630578

Xnox asked me to look into a hang on openvswitch dep8 tests.
Initially, all I found in the log was
  "ERROR: Removing temporary files on testbed timed out"
That message brought me to the two bugs above.

But from those I gathered that it was the infra running dep8 that crashed.
So for a better bug report I tried to reproduce it locally, and that actually 
seems to work very reliably.

To reproduce do:
$ autopkgtest-buildvm-ubuntu-cloud -a i386 -r artful -s 10G
$ pull-lp-source openvswitch
$ autopkgtest --apt-upgrade --shell --no-built-binaries \
    openvswitch_2.8.0~git20170809.7aa47a19d-0ubuntu1.dsc -- qemu \
    ~/work/autopkgtest-artful-i386.img
# The guest currently crashes after a while of testing


While that is running you can attach to the console and the monitor of that guest.
For example:
$ sudo nc -U /tmp/autopkgtest-virt-qemu.*/ttyS0

At the time of the hang the console showed a crash that roughly matches the 
older bugs. Here is the console output:
[   54.256253] BUG: unable to handle kernel NULL pointer dereference at   (null)
[   54.257156] IP: add_grec+0x28/0x440
[   54.257553] *pdpt = 000000001a869001 *pde = 0000000000000000 
[   54.257555] 
[   54.258338] Oops: 0000 [#1] SMP
[   54.258638] Modules linked in: veth openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack libcrc32c 9p fscache ppdev kvm_intel kvm irqbypass joydev input_leds serio_raw 9pnet_virtio parport_pc 9pnet parport qemu_fw_cfg i2c_piix4 mac_hid ip_tables x_tables autofs4 btrfs xor raid6_pq psmouse virtio_blk virtio_net floppy pata_acpi
[   54.261891] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.12.0-11-generic #12-Ubuntu
[   54.262715] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1~cloud0 04/01/2014
[   54.263610] task: c7b622c0 task.stack: c7b5a000
[   54.264039] EIP: add_grec+0x28/0x440
[   54.264378] EFLAGS: 00010202 CPU: 0
[   54.264711] EAX: 00000000 EBX: dd062540 ECX: 00000006 EDX: dd062540
[   54.265308] ESI: dd1e6e00 EDI: dd1e6e00 EBP: dbcc5f30 ESP: dbcc5ef0
[   54.265793]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   54.266297] CR0: 80050033 CR2: 00000000 CR3: 1e930d80 CR4: 000006f0
[   54.266885] Call Trace:
[   54.267120]  <SOFTIRQ>
[   54.267349]  mld_ifc_timer_expire+0xfe/0x250
[   54.267754]  ? mld_dad_timer_expire+0x50/0x50
[   54.268173]  call_timer_fn+0x30/0x120
[   54.268524]  ? mld_dad_timer_expire+0x50/0x50
[   54.268942]  ? mld_dad_timer_expire+0x50/0x50
[   54.269364]  run_timer_softirq+0x3c5/0x420
[   54.269760]  ? __softirqentry_text_start+0x8/0x8
[   54.270198]  __do_softirq+0xa9/0x245
[   54.270539]  ? __softirqentry_text_start+0x8/0x8
[   54.270976]  do_softirq_own_stack+0x24/0x30
[   54.271373]  </SOFTIRQ>
[   54.271611]  irq_exit+0xad/0xb0
[   54.271913]  smp_apic_timer_interrupt+0x38/0x50
[   54.272344]  apic_timer_interrupt+0x39/0x40
[   54.272745] EIP: native_safe_halt+0x5/0x10
[   54.273139] EFLAGS: 00000246 CPU: 0
[   54.273624] EAX: 00000000 EBX: 00000000 ECX: 00000001 EDX: 00000000
[   54.274000] ESI: 00000000 EDI: c7b622c0 EBP: c7b5bf10 ESP: c7b5bf10
[   54.274361]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   54.274687]  default_idle+0x1c/0xf0
[   54.274896]  arch_cpu_idle+0xe/0x10
[   54.275286]  default_idle_call+0x1e/0x30
[   54.275787]  do_idle+0x145/0x1c0
[   54.276092]  cpu_startup_entry+0x65/0x70
[   54.276441]  rest_init+0x62/0x70
[   54.276718]  start_kernel+0x3be/0x3d7
[   54.276974]  i386_start_kernel+0x9d/0xa1
[   54.277260]  startup_32_smp+0x16b/0x16d
[   54.277509] Code: 00 00 00 3e 8d 74 26 00 55 89 e5 57 56 53 89 c6 83 ec 34 89 4d e8 65 a1 14 00 00 00 89 45 f0 31 c0 8b 42 10 f6 42 48 08 89 45 cc <8b> 00 c7 45 ec 00 00 00 00 89 45 c8 89 f0 0f 85 b4 02 00 00 8b
[   54.279314] EIP: add_grec+0x28/0x440 SS:ESP: 0068:dbcc5ef0
[   54.279829] CR2: 0000000000000000
[   54.280143] ---[ end trace 3164b1c0dd7745bc ]---
[   54.280550] Kernel panic - not syncing: Fatal exception in interrupt
[   54.281078] Kernel Offset: 0x6000000 from 0xc1000000 (relocation range: 0xc0000000-0xdfbfdfff)
[   54.281797] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
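
Two facts that are often needed from a trace like the one above are the faulting symbol and the relocated kernel base. A minimal shell sketch, assuming the console output has been saved to a file; the function names oops_fault_symbol and kernel_runtime_base are made up for illustration and are not part of any tool:

```shell
# Hypothetical helper: print the symbol named on the first "IP:" line
# of an oops read from stdin (the "EIP:" lines are deliberately skipped).
oops_fault_symbol() {
    sed -n 's/^.*\] IP: \([A-Za-z0-9_.]*\)+.*$/\1/p' | head -n 1
}

# Hypothetical helper: add the reported "Kernel Offset" to the address it
# is relative to. $1 = base, $2 = offset, both hex as printed in the log.
kernel_runtime_base() {
    printf '0x%x\n' $(( $1 + $2 ))
}

# Usage, assuming the log was saved as console.log:
#   oops_fault_symbol < console.log
#   kernel_runtime_base 0xc1000000 0x6000000
```

For the log above this gives add_grec and 0xc7000000, which is consistent with the task pointer c7b622c0 printed in the oops.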

But since this is lovely qemu, the machine isn't as dead as real HW would be 
at this point, so I took a memory dump via the monitor:
$ sudo nc -U /tmp/autopkgtest-virt-qemu.*/monitor
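
The report does not say which monitor command produced the dump. Given the .kdump.zlib name of the shared file, one plausible way is QEMU's HMP command dump-guest-memory with -z (kdump-compressed format, zlib). A sketch under that assumption; the helper name and the output path are hypothetical:

```shell
# Hypothetical helper: print the HMP command that writes a zlib-compressed
# kdump-format dump of guest memory to the given file on the host.
hmp_dump_cmd() {
    printf 'dump-guest-memory -z %s\n' "$1"
}

# Type the printed line at the "(qemu)" prompt after attaching with
#   sudo nc -U /tmp/autopkgtest-virt-qemu.*/monitor
# The output path below is only an example.
hmp_dump_cmd /tmp/ovs-dep8-i386.vmcore.kdump.zlib
```

The file is written by the qemu process on the host, so the path has to be writable by the user qemu runs as.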

Fetching the i386 debug kernel shows me I can load that in crash.


But I'd leave the authoritative look at what happened to the kernel team, so I 
shared the dump via fileshare.
Please load it with the debug kernel from linux-image-4.12.0-11-generic-dbgsym.
I have issues loading it properly in crash: this is an i386 artful guest on a 
64-bit KVM host, so the kdump generated by qemu is in x86_64 format, and 
something seems to disagree.
But since the repro steps above are simple, I hope you can trigger the crash 
and analyze it the way you want/need.
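
For reference, loading a dump in crash usually takes the debug vmlinux followed by the dump file. A small sketch that builds that command line; the helper name is hypothetical, and the vmlinux path is the location Ubuntu dbgsym packages normally install to, which is an assumption here:

```shell
# Hypothetical helper: print the crash command line for a given kernel
# release and dump file. The vmlinux path is assumed, adjust if it differs.
crash_cmd() {
    printf 'crash /usr/lib/debug/boot/vmlinux-%s %s\n' "$1" "$2"
}

crash_cmd 4.12.0-11-generic ovs-dep8-i386.vmcore.kdump.zlib
```

Whether crash accepts the dump depends on the format mismatch described above, so this may still fail with the shared file.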

In case you still want my dump, fetch it from 
https://private-fileshare.canonical.com/~paelzer/ovs-dep8-i386.vmcore.kdump.zlib
I also have raw (non-kdump-style) formats if needed.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1712831

Title:
  4.12.0-11-generic - crashing in infrastructure on i386 openvswitch
  tests

Status in linux package in Ubuntu:
  New

