Bug#801463: BUG: unable to handle kernel NULL pointer dereference at 00000001 in smp_apic_timer_interrupt

2015-10-10 Thread Richard Kettlewell
Package: src:linux
Version: 3.16.7-ckt11-1+deb8u4
Severity: important

Dear Maintainer,

   * What led up to the situation?

My kernel has started crashing every few days, since 2015-09-09.

The system was upgraded to this kernel (linux-image-3.16.0-4-586
3.16.7-ckt11-1+deb8u4) on 2015-09-20, so it's not a regression in that
particular version.

I retrieved kernel output from the most recent crash, which replaces
the kernel log section below.


-- Package-specific info:
** Version:
Linux version 3.16.0-4-586 (debian-kernel@lists.debian.org) (gcc version 4.8.4 (Debian 4.8.4-1) ) #1 Debian 3.16.7-ckt11-1+deb8u4 (2015-09-19)

** Command line:
BOOT_IMAGE=/vmlinuz-3.16.0-4-586 root=/dev/sda3 ro console=ttyS0,115200n8

** Not tainted

** Kernel log:
[587090.909477] inbound: IN=eth2 OUT= MAC=00:04:a7:08:af:b0:00:1f:27:c0:08:01:08:00 SRC=221.3.105.106 DST=86.9.121.8 LEN=60 TOS=0x00 PREC=0x00 TTL=51 ID=62653 DF PROTO=TCP SPT=39416 DPT=23 WINDOW=5808 RES=0x00 SYN URGP=0
[587134.548195] BUG: unable to handle kernel NULL pointer dereference at 00000001
[587134.548195] IP: [] smp_apic_timer_interrupt+0x24/0x50
[587134.548195] *pde =  
[587134.548195] Oops: 0002 [#1] 
[587134.548195] Modules linked in: tcp_diag inet_diag xt_nat xt_addrtype ipt_MASQUERADE ip6t_REJECT xt_multiport ipt_REJECT xt_LOG xt_limit xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip6table_mangle iptable_mangle ip6table_raw iptable_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip6table_filter ip6_tables iptable_filter cpufreq_stats cpufreq_conservative ip_tables x_tables cpufreq_userspace cpufreq_powersave nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc sit tunnel4 ip_tunnel evdev iTCO_wdt iTCO_vendor_support video drm_kms_helper processor pcspkr serio_raw drm thermal_sys i2c_i801 i2c_algo_bit lpc_ich rng_core i2c_core shpchp w83627hf hwmon_vid bridge stp llc loop slip slhc tun fuse autofs4 ext4 crc16 mbcache jbd2 sg sd_mod crc_t10dif crct10dif_generic crct10dif_common ata_generic 8139too ata_piix ehci_pci uhci_hcd ahci libahci ehci_hcd libata scsi_mod 8139cp r8169 mii usbcore usb_common
[587134.548195] CPU: 0 PID: 0 Comm: swapper Not tainted 3.16.0-4-586 #1 Debian 3.16.7-ckt11-1+deb8u4
[587134.548195] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080012  12/22/2008
[587134.548195] task: c1592500 ti: c1584000 task.ti: c1584000
[587134.548195] EIP: 0060:[] EFLAGS: 00210896 CPU: 0
[587134.548195] EIP is at smp_apic_timer_interrupt+0x24/0x50
[587134.548195] EAX: 0001 EBX: c158beec ECX: c1585f4c EDX: c1585f4c
[587134.548195] ESI: c1584001 EDI: c1585fed EBP: c1585f94 ESP: c1585f48
[587134.548195]  DS: 007b ES: 007b FS:  GS: 00e0 SS: 0068
[587134.548195] CR0: 8005003b CR2: 0001 CR3: 34b8a000 CR4: 0790
[587134.548195] Stack:
[587134.548195]  c1425474 c1585fec  c1584000 c1584000 c1585fec c1585f94
[587134.548195]  0002007b 08be007b 08be 00e0 ff10 c102f1a2 0060 00200246
[587134.548195]  c1009d74 c1585fec c1584000 c1585f9c c100a54e c1585fd8 c1066b40 c1585fec
[587134.548195] Call Trace:
[587134.548195]  [] ? apic_timer_interrupt+0x34/0x40
[587134.548195]  [] ? native_safe_halt+0x2/0x10
[587134.548195]  [] ? default_idle+0x14/0x90
[587134.548195]  [] ? arch_cpu_idle+0xe/0x10
[587134.548195]  [] ? cpu_startup_entry+0x230/0x370
[587134.548195]  [] ? start_kernel+0x3f2/0x3f7
[587134.548195]  [] ? set_init_arg+0x3f/0x45
[587134.548195] Code: ff ff eb 80 66 90 90 55 89 e5 53 3e 8d 74 26 00 8b 0d 00 ee 59 c1 31 d2 8b 1d 48 39 59 c1 a3 48 39 59 c1 b8 b0 00 00 00 ff 91 a4 <00> 00 00 e8 d4 a4 c1 ff e8 df f6 bf ff e8 2a a5 c1 ff 89 1d 48
[587134.548195] EIP: [] smp_apic_timer_interrupt+0x24/0x50 SS:ESP 0068:c1585f48
[587134.548195] CR2: 0001
[587134.548195] ---[ end trace c2ab876b17f6fd20 ]---
[587134.548195] Kernel panic - not syncing: Attempted to kill the idle task!
[587134.548195] Kernel Offset: 0x0 from 0xc100 (relocation range: 0xc000-0xf7ffdfff)
[587134.548195] Rebooting in 300 seconds..

** Model information
not available

** Loaded modules:
tcp_diag
inet_diag
xt_nat
xt_addrtype
ipt_MASQUERADE
ip6t_REJECT
xt_multiport
ipt_REJECT
xt_LOG
xt_limit
xt_tcpudp
nf_conntrack_ipv6
nf_defrag_ipv6
xt_conntrack
ip6table_mangle
iptable_mangle
ip6table_raw
iptable_raw
iptable_nat
nf_conntrack_ipv4
nf_defrag_ipv4
nf_nat_ipv4
nf_nat
nf_conntrack
ip6table_filter
ip6_tables
cpufreq_stats
iptable_filter
cpufreq_conservative
ip_tables
x_tables
cpufreq_userspace
cpufreq_powersave
nfsd
auth_rpcgss
oid_registry
nfs_acl
nfs
lockd
fscache
sunrpc
sit
tunnel4
ip_tunnel
evdev
iTCO_wdt
iTCO_vendor_support
video
drm_kms_helper
drm
processor
thermal_sys
i2c_algo_bit
pcspkr
serio_raw
i2c_i801
lpc_ich
shpchp
i2c_core
rng_core
w83627hf
hwmon_vid
bridge
stp
llc
loop
slip
slhc
tun
fuse
autofs4
ext4
crc16
mbcache
jbd2
sg
sd_mod
crc_t10dif
crct10dif_generic
crct10dif_common
ata_generic
8139too

Bug#801463: BUG: unable to handle kernel NULL pointer dereference at 00000001 in smp_apic_timer_interrupt

2015-10-10 Thread Ben Hutchings
On Sat, 2015-10-10 at 18:11 +0100, Richard Kettlewell wrote:
> Package: src:linux
> Version: 3.16.7-ckt11-1+deb8u4
> Severity: important
> 
> Dear Maintainer,
> 
>    * What led up to the situation?
> 
> My kernel has started crashing every few days, since 2015-09-09.
> 
> The system was upgraded to this kernel (linux-image-3.16.0-4-586
> 3.16.7-ckt11-1+deb8u4) on 2015-09-20, so it's not a regression in that
> particular version.
> 
> I retrieved kernel output from the most recent crash, which replaces
> the kernel log section below.
[...]

This looks rather like a hardware failure, as the instruction pointer
is pointing to the middle of an instruction.  Here's the disassembly of
 smp_apic_timer_interrupt:

c1425ba0:   55                   push   %ebp
c1425ba1:   89 e5                mov    %esp,%ebp
c1425ba3:   53                   push   %ebx
c1425ba4:   e8 c7 fb ff ff       call   0xc1425770                  ; initial
            3e 8d 74 26 00       lea    %ds:0x0(%esi,%eiz,1),%esi   ; patched
c1425ba9:   8b 0d 00 ee 59 c1    mov    0xc159ee00,%ecx
c1425baf:   31 d2                xor    %edx,%edx
c1425bb1:   8b 1d 48 39 59 c1    mov    0xc1593948,%ebx
c1425bb7:   a3 48 39 59 c1       mov    %eax,0xc1593948
c1425bbc:   b8 b0 00 00 00       mov    $0xb0,%eax
c1425bc1:   ff 91 a4 00 00 00    call   *0xa4(%ecx)
                     ^ EIP
c1425bc7:   e8 d4 a4 c1 ff       call   0xc10400a0
c1425bcc:   e8 df f6 bf ff       call   0xc10252b0
c1425bd1:   e8 2a a5 c1 ff       call   0xc1040100
c1425bd6:   89 1d 48 39 59 c1    mov    %ebx,0xc1593948
c1425bdc:   5b                   pop    %ebx
c1425bdd:   5d                   pop    %ebp
c1425bde:   66 90                xchg   %ax,%ax
c1425be0:   c3                   ret
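
The "middle of an instruction" observation can be checked mechanically. The following is only a sketch: the instruction start addresses are copied by hand from the disassembly above, the Code: bytes are copied from the oops in the original report, and the helper names (locate, marker_index) are made up for illustration.

```python
# Sketch: check whether the faulting EIP lands on an instruction boundary.
# All addresses and bytes are copied from the oops and the disassembly;
# the helper names are hypothetical.

FUNC_START = 0xC1425BA0  # first instruction of smp_apic_timer_interrupt

# Start address of every instruction in the disassembly listing.
INSN_STARTS = [
    0xC1425BA0, 0xC1425BA1, 0xC1425BA3, 0xC1425BA4, 0xC1425BA9,
    0xC1425BAF, 0xC1425BB1, 0xC1425BB7, 0xC1425BBC, 0xC1425BC1,
    0xC1425BC7, 0xC1425BCC, 0xC1425BD1, 0xC1425BD6, 0xC1425BDC,
    0xC1425BDD, 0xC1425BDE, 0xC1425BE0,
]

# Byte dump from the "Code:" line; <..> marks the byte at EIP.
CODE = (
    "ff ff eb 80 66 90 90 55 89 e5 53 3e 8d 74 26 00 8b 0d 00 "
    "ee 59 c1 31 d2 8b 1d 48 39 59 c1 a3 48 39 59 c1 b8 b0 00 00 00 "
    "ff 91 a4 <00> 00 00 e8 d4 a4 c1 ff e8 df f6 bf ff e8 2a a5 c1 "
    "ff 89 1d 48"
)


def locate(eip, starts):
    """Return (start of the containing instruction, byte offset into it)."""
    containing = max(addr for addr in starts if addr <= eip)
    return containing, eip - containing


def marker_index(code_line):
    """Return (index of the <..> byte, list of byte tokens)."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<"):
            return i, tokens
    raise ValueError("no <..> marker in Code: line")


# EIP comes from "smp_apic_timer_interrupt+0x24/0x50".
eip = FUNC_START + 0x24
start, offset = locate(eip, INSN_STARTS)
print(f"EIP {eip:#x} is {offset} byte(s) into the instruction at {start:#x}")

# Cross-check: 0x24 bytes before the marked byte should be the function's
# first opcode byte (0x55, push %ebp).
marker, tokens = marker_index(CODE)
print("byte at function start:", tokens[marker - 0x24])
```

A non-zero offset means EIP is not at the start of any instruction in the listing. As an interpretation rather than something stated in this thread: decoded from that address, the bytes 00 00 would be add %al,(%eax), and with EAX = 1 that is a write to address 1, which would fit the Oops code 0002 and the CR2 value.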

Ben.

-- 
Ben Hutchings
Unix is many things to many people,
but it's never been everything to anybody.



Bug#801463: BUG: unable to handle kernel NULL pointer dereference at 00000001 in smp_apic_timer_interrupt

2015-10-10 Thread Richard Kettlewell
On 2015-10-10 18:49, Ben Hutchings wrote:
> This looks rather like a hardware failure, as the instruction pointer
> is pointing to the middle of an instruction.  Here's the disassembly of
>  smp_apic_timer_interrupt:

Thanks for the diagnosis.  Time to spend some money l-/

For future reference, is there a convenient way to get a disassembly
corresponding to the kernel I have installed?

ttfn/rjk



Bug#801463: BUG: unable to handle kernel NULL pointer dereference at 00000001 in smp_apic_timer_interrupt

2015-10-10 Thread Ben Hutchings
On Sat, 2015-10-10 at 19:34 +0100, Richard Kettlewell wrote:
> On 2015-10-10 18:49, Ben Hutchings wrote:
> > This looks rather like a hardware failure, as the instruction pointer
> > is pointing to the middle of an instruction.  Here's the disassembly of
> >  smp_apic_timer_interrupt:
> 
> Thanks for the diagnosis.  Time to spend some money l-/
> 
> For future reference, is there a convenient way to get a disassembly
> corresponding to the kernel I have installed?

Use scripts/extract-vmlinux from the Linux source to decompress the
image in /boot, then 'objdump -d'.
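
Those two steps could be wrapped roughly as follows. This is a sketch under assumptions: the path to scripts/extract-vmlinux depends on where a Linux source tree is unpacked, the output file name is arbitrary, and the address range is taken from this bug's oops (function start plus the 0x50 size). The function names are made up. extract-vmlinux writes the decompressed image to standard output, so it is redirected to a file; --start-address and --stop-address are standard GNU objdump options for limiting the listing.

```python
# Sketch: decompress a kernel image with scripts/extract-vmlinux, then
# disassemble an address range with objdump. Paths are assumptions.
import subprocess
import sys


def extract_cmd(script, image):
    """Command that writes the decompressed vmlinux to stdout."""
    return [script, image]


def objdump_cmd(vmlinux, start=None, stop=None):
    """'objdump -d', optionally limited to the range [start, stop)."""
    cmd = ["objdump", "-d"]
    if start is not None:
        cmd.append(f"--start-address={start:#x}")
    if stop is not None:
        cmd.append(f"--stop-address={stop:#x}")
    cmd.append(vmlinux)
    return cmd


def main(script, image, vmlinux):
    # Step 1: extract-vmlinux prints the ELF image on stdout.
    with open(vmlinux, "wb") as out:
        subprocess.run(extract_cmd(script, image), stdout=out, check=True)
    # Step 2: disassemble smp_apic_timer_interrupt, whose start address
    # and 0x50-byte size are taken from the oops in this report.
    subprocess.run(objdump_cmd(vmlinux, 0xC1425BA0, 0xC1425BF0), check=True)


if __name__ == "__main__":
    # Expected arguments: <extract-vmlinux script> <kernel image> <output>
    if len(sys.argv) == 4:
        main(sys.argv[1], sys.argv[2], sys.argv[3])
```

It might be invoked with something like the extract-vmlinux path, /boot/vmlinuz-3.16.0-4-586 and an output file name as the three arguments. The extracted image may carry no symbol names, in which case function addresses would have to come from elsewhere, for example the System.map file for the same kernel version if one is installed.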

Ben.

-- 
Ben Hutchings
Unix is many things to many people,
but it's never been everything to anybody.

