Hello,
One of our servers crashed repeatedly this week. After setting up serial
console logging, we captured the following kernel warnings and stack traces:

> [3689736.061539] WARNING: CPU: 0 PID: 29284 at 
> linux-4.1.6/lib/list_debug.c:33 __list_add+0xc0/0xd0()
> [3689736.061541] list_add corruption. prev->next should be next 
> (ffffffff81ab3ca8), but was ffffffff81ab3cc8. (prev=ffff8804d9910d58).

Compare this first warning ...

> [3689736.061602] CPU: 0 PID: 29284 Comm: slapd Tainted: G        W       
> 4.1.0-ucs190-amd64 #1 Debian 4.1.6-1.190.201604142226
> [3689736.061603] Hardware name: VMware, Inc. VMware Virtual Platform/440BX 
> Desktop Reference Platform, BIOS 6.00 09/21/2015

The machine is a VMware guest, so maybe the bug is on the VMware side?

> [3689736.061604]  0000000000000000 ffffffff817531c0 ffffffff81597807 
> ffff88083fc038a8
> [3689736.061606]  ffffffff81076c45 ffff88004b553e00 ffffffff81ab3ca8 
> ffff8804d9910d58
> [3689736.061608]  0000000000000001 0000000137090762 ffffffff81076d4a 
> ffffffff81753310
> [3689736.061609] Call Trace:
...
> [3689736.061624]  [<ffffffff8130be50>] ? __list_add+0xc0/0xd0
> [3689736.061627]  [<ffffffff810da5a6>] ? internal_add_timer+0x36/0xa0
> [3689736.061629]  [<ffffffff810dc6fa>] ? mod_timer_pending+0xfa/0x140
> [3689736.061635]  [<ffffffffa048c441>] ? __nf_ct_refresh_acct+0xb1/0xc0 
> [nf_conntrack]
> [3689736.061640]  [<ffffffffa04945bc>] ? tcp_packet+0x66c/0x1500 
> [nf_conntrack]
> [3689736.061643]  [<ffffffff810b5fff>] ? autoremove_wake_function+0x2f/0x50
> [3689736.061647]  [<ffffffffa0493ef2>] ? tcp_error+0x1b2/0x210 [nf_conntrack]
> [3689736.061650]  [<ffffffffa048e725>] ? nf_conntrack_in+0x3a5/0xb30 
> [nf_conntrack]
> [3689736.061654]  [<ffffffff81481cb4>] ? sk_reset_timer+0x14/0x20
> [3689736.061657]  [<ffffffff814cdeef>] ? nf_iterate+0x4f/0x80
> [3689736.061659]  [<ffffffff814cdfb8>] ? nf_hook_slow+0x98/0xf0
> [3689736.061662]  [<ffffffff814d52f4>] ? ip_rcv+0x314/0x400
> [3689736.061664]  [<ffffffff814d48a0>] ? inet_add_protocol+0x50/0x50
> [3689736.061668]  [<ffffffff81498ae3>] ? __netif_receive_skb_core+0x703/0x920
> [3689736.061670]  [<ffffffff8101f405>] ? read_tsc+0x5/0x10
> [3689736.061672]  [<ffffffff81498ecf>] ? netif_receive_skb_internal+0x1f/0x90
> [3689736.061673]  [<ffffffff81499af0>] ? napi_gro_receive+0xb0/0xe0
> [3689736.061678]  [<ffffffffa0097fe4>] ? e1000_clean_rx_irq+0x2b4/0x500 
> [e1000]
> [3689736.061681]  [<ffffffffa0099ccc>] ? e1000_clean+0x26c/0x900 [e1000]
> [3689736.061683]  [<ffffffff81499629>] ? net_rx_action+0x159/0x330
> [3689736.061685]  [<ffffffff8107aace>] ? __do_softirq+0xde/0x260
> [3689736.061687]  [<ffffffff8107ae95>] ? irq_exit+0x95/0xa0
> [3689736.061689]  [<ffffffff815a0b74>] ? do_IRQ+0x64/0x110
> [3689736.061691]  [<ffffffff8159e9ee>] ? common_interrupt+0x6e/0x6e
...
> [3689738.157677] WARNING: CPU: 0 PID: 29284 at 
> linux-4.1.6/lib/list_debug.c:33 __list_add+0xc0/0xd0()
> [3689738.157678] list_add corruption. prev->next should be next 
> (ffffffff81ab3cc8), but was ffffffff81ab3ca8. (prev=ffff8804d9910d58).

... with the second one: prev is the same, but the expected and the actual
value of prev->next are swapped.
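
To make the comparison concrete, here is a small Python sketch that pulls the
addresses out of the two warning lines and checks whether they mirror each
other. The regular expression and the helper names (parse, is_swapped) are my
own and only assume the message format shown in the log; the two strings are
the warnings from above, re-joined after mail wrapping.

```python
import re

# The two warning lines from the serial console, re-joined after mail wrapping.
WARNINGS = [
    "[3689736.061541] list_add corruption. prev->next should be next "
    "(ffffffff81ab3ca8), but was ffffffff81ab3cc8. (prev=ffff8804d9910d58).",
    "[3689738.157678] list_add corruption. prev->next should be next "
    "(ffffffff81ab3cc8), but was ffffffff81ab3ca8. (prev=ffff8804d9910d58).",
]

# Assumed message layout, taken from the log text only.
PATTERN = re.compile(
    r"prev->next should be next \((?P<expected>[0-9a-f]+)\), "
    r"but was (?P<actual>[0-9a-f]+)\. \(prev=(?P<prev>[0-9a-f]+)\)"
)


def parse(line):
    """Return the expected/actual/prev addresses of one warning as integers."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a list_add corruption warning: %r" % line)
    return {name: int(value, 16) for name, value in match.groupdict().items()}


def is_swapped(a, b):
    """True if both warnings share prev and have expected/actual exchanged."""
    return (
        a["prev"] == b["prev"]
        and a["expected"] == b["actual"]
        and a["actual"] == b["expected"]
    )


first, second = (parse(line) for line in WARNINGS)
distance = abs(first["expected"] - first["actual"])

print("same prev, values swapped:", is_swapped(first, second))
print("distance between the two addresses:", hex(distance))
```

For these two warnings it reports that prev is identical, that the expected
and actual prev->next values are exchanged, and that the two addresses are
0x20 bytes apart. What that distance means for the timer lists is something I
have not verified.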

...
> [3689738.157740]  [<ffffffff8130be50>] ? __list_add+0xc0/0xd0
> [3689738.157742]  [<ffffffff810da5a6>] ? internal_add_timer+0x36/0xa0
> [3689738.157744]  [<ffffffff810dc6fa>] ? mod_timer_pending+0xfa/0x140
> [3689738.157748]  [<ffffffffa048c441>] ? __nf_ct_refresh_acct+0xb1/0xc0 
> [nf_conntrack]
> [3689738.157751]  [<ffffffffa04945bc>] ? tcp_packet+0x66c/0x1500 
> [nf_conntrack]
> [3689738.157753]  [<ffffffff8101f9d5>] ? sched_clock+0x5/0x10
> [3689738.157755]  [<ffffffff8109ea48>] ? resched_curr+0x38/0xc0
> [3689738.157758]  [<ffffffff810b5fff>] ? autoremove_wake_function+0x2f/0x50
> [3689738.157760]  [<ffffffffa0493ef2>] ? tcp_error+0x1b2/0x210 [nf_conntrack]
> [3689738.157763]  [<ffffffffa048e725>] ? nf_conntrack_in+0x3a5/0xb30 
> [nf_conntrack]
> [3689738.157765]  [<ffffffff81481cb4>] ? sk_reset_timer+0x14/0x20
> [3689738.157768]  [<ffffffff814cdeef>] ? nf_iterate+0x4f/0x80
> [3689738.157769]  [<ffffffff814cdfb8>] ? nf_hook_slow+0x98/0xf0
> [3689738.157771]  [<ffffffff814d52f4>] ? ip_rcv+0x314/0x400
> [3689738.157773]  [<ffffffff814d48a0>] ? inet_add_protocol+0x50/0x50
> [3689738.157775]  [<ffffffff81498ae3>] ? __netif_receive_skb_core+0x703/0x920
> [3689738.157777]  [<ffffffff8101f405>] ? read_tsc+0x5/0x10
> [3689738.157778]  [<ffffffff81498ecf>] ? netif_receive_skb_internal+0x1f/0x90
> [3689738.157780]  [<ffffffff81499af0>] ? napi_gro_receive+0xb0/0xe0
> [3689738.157784]  [<ffffffffa0097fe4>] ? e1000_clean_rx_irq+0x2b4/0x500 
> [e1000]
> [3689738.157787]  [<ffffffffa0099ccc>] ? e1000_clean+0x26c/0x900 [e1000]
> [3689738.157789]  [<ffffffff81499629>] ? net_rx_action+0x159/0x330
> [3689738.157791]  [<ffffffff8107aace>] ? __do_softirq+0xde/0x260
> [3689738.157792]  [<ffffffff8107ae95>] ? irq_exit+0x95/0xa0
> [3689738.157794]  [<ffffffff815a0b74>] ? do_IRQ+0x64/0x110
> [3689738.157797]  [<ffffffff8159e9ee>] ? common_interrupt+0x6e/0x6e

Has anyone seen a similar issue, and does anyone know whether it is fixed in
a release after 4.1.6?

If you need more data, just ask and I will see what else I can gather.

Thank you in advance.

Philipp
-- 
Philipp Hahn
Open Source Software Engineer

Univention GmbH
be open.
Mary-Somerville-Str. 1
D-28359 Bremen
Tel.: +49 421 22232-0
Fax : +49 421 22232-99
h...@univention.de

http://www.univention.de/
Geschäftsführer: Peter H. Ganten
HRB 20755 Amtsgericht Bremen
Steuer-Nr.: 71-597-02876
[3689736.061530] ------------[ cut here ]------------
[3689736.061539] WARNING: CPU: 0 PID: 29284 at linux-4.1.6/lib/list_debug.c:33 
__list_add+0xc0/0xd0()
[3689736.061541] list_add corruption. prev->next should be next 
(ffffffff81ab3ca8), but was ffffffff81ab3cc8. (prev=ffff8804d9910d58).
[3689736.061542] Modules linked in: nfnetlink_log nfnetlink xt_addrtype 
xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 bridge stp llc overlay 
vmw_vsock_vmci_transport vsock ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle 
ip6table_filter ip6_tables xt_state iptable_mangle iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter 
ip_tables x_tables rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd grace 
fscache sunrpc quota_v2 quota_tree vmw_balloon psmouse coretemp pcspkr 
serio_raw parport_pc 8250_fintek parport shpchp i2c_piix4 vmw_vmci ac 
acpi_cpufreq processor thermal_sys battery evdev ext4 crc16 mbcache jbd2 
dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_mod sg sr_mod cdrom 
sd_mod ata_generic crc32c_intel e1000 floppy vmwgfx ttm ata_piix mptspi 
scsi_transport_spi mptscsih mptbase libata drm_kms_helper drm scsi_mod button
[3689736.061602] CPU: 0 PID: 29284 Comm: slapd Tainted: G        W       
4.1.0-ucs190-amd64 #1 Debian 4.1.6-1.190.201604142226
[3689736.061603] Hardware name: VMware, Inc. VMware Virtual Platform/440BX 
Desktop Reference Platform, BIOS 6.00 09/21/2015
[3689736.061604]  0000000000000000 ffffffff817531c0 ffffffff81597807 
ffff88083fc038a8
[3689736.061606]  ffffffff81076c45 ffff88004b553e00 ffffffff81ab3ca8 
ffff8804d9910d58
[3689736.061608]  0000000000000001 0000000137090762 ffffffff81076d4a 
ffffffff81753310
[3689736.061609] Call Trace:
[3689736.061610]  <IRQ>  [<ffffffff81597807>] ? dump_stack+0x40/0x50
[3689736.061619]  [<ffffffff81076c45>] ? warn_slowpath_common+0x95/0xe0
[3689736.061621]  [<ffffffff81076d4a>] ? warn_slowpath_fmt+0x4a/0x50
[3689736.061624]  [<ffffffff8130be50>] ? __list_add+0xc0/0xd0
[3689736.061627]  [<ffffffff810da5a6>] ? internal_add_timer+0x36/0xa0
[3689736.061629]  [<ffffffff810dc6fa>] ? mod_timer_pending+0xfa/0x140
[3689736.061635]  [<ffffffffa048c441>] ? __nf_ct_refresh_acct+0xb1/0xc0 
[nf_conntrack]
[3689736.061640]  [<ffffffffa04945bc>] ? tcp_packet+0x66c/0x1500 [nf_conntrack]
[3689736.061643]  [<ffffffff810b5fff>] ? autoremove_wake_function+0x2f/0x50
[3689736.061647]  [<ffffffffa0493ef2>] ? tcp_error+0x1b2/0x210 [nf_conntrack]
[3689736.061650]  [<ffffffffa048e725>] ? nf_conntrack_in+0x3a5/0xb30 
[nf_conntrack]
[3689736.061654]  [<ffffffff81481cb4>] ? sk_reset_timer+0x14/0x20
[3689736.061657]  [<ffffffff814cdeef>] ? nf_iterate+0x4f/0x80
[3689736.061659]  [<ffffffff814cdfb8>] ? nf_hook_slow+0x98/0xf0
[3689736.061662]  [<ffffffff814d52f4>] ? ip_rcv+0x314/0x400
[3689736.061664]  [<ffffffff814d48a0>] ? inet_add_protocol+0x50/0x50
[3689736.061668]  [<ffffffff81498ae3>] ? __netif_receive_skb_core+0x703/0x920
[3689736.061670]  [<ffffffff8101f405>] ? read_tsc+0x5/0x10
[3689736.061672]  [<ffffffff81498ecf>] ? netif_receive_skb_internal+0x1f/0x90
[3689736.061673]  [<ffffffff81499af0>] ? napi_gro_receive+0xb0/0xe0
[3689736.061678]  [<ffffffffa0097fe4>] ? e1000_clean_rx_irq+0x2b4/0x500 [e1000]
[3689736.061681]  [<ffffffffa0099ccc>] ? e1000_clean+0x26c/0x900 [e1000]
[3689736.061683]  [<ffffffff81499629>] ? net_rx_action+0x159/0x330
[3689736.061685]  [<ffffffff8107aace>] ? __do_softirq+0xde/0x260
[3689736.061687]  [<ffffffff8107ae95>] ? irq_exit+0x95/0xa0
[3689736.061689]  [<ffffffff815a0b74>] ? do_IRQ+0x64/0x110
[3689736.061691]  [<ffffffff8159e9ee>] ? common_interrupt+0x6e/0x6e
[3689736.061692]  <EOI> 
[3689736.061693] ---[ end trace 8364fe1151c67412 ]---
[3689738.157669] ------------[ cut here ]------------
[3689738.157677] WARNING: CPU: 0 PID: 29284 at linux-4.1.6/lib/list_debug.c:33 
__list_add+0xc0/0xd0()
[3689738.157678] list_add corruption. prev->next should be next 
(ffffffff81ab3cc8), but was ffffffff81ab3ca8. (prev=ffff8804d9910d58).
[3689738.157679] Modules linked in: nfnetlink_log nfnetlink xt_addrtype 
xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 bridge stp llc overlay 
vmw_vsock_vmci_transport vsock ip6t_REJECT nf_reject_ipv6 ipt_REJECT 
nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle 
ip6table_filter ip6_tables xt_state iptable_mangle iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter 
ip_tables x_tables rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd grace 
fscache sunrpc quota_v2 quota_tree vmw_balloon psmouse coretemp pcspkr 
serio_raw parport_pc 8250_fintek parport shpchp i2c_piix4 vmw_vmci ac 
acpi_cpufreq processor thermal_sys battery evdev ext4 crc16 mbcache jbd2 
dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_mod sg sr_mod cdrom 
sd_mod ata_generic crc32c_intel e1000 floppy vmwgfx ttm ata_piix mptspi 
scsi_transport_spi mptscsih mptbase libata drm_kms_helper drm scsi_mod button
[3689738.157718] CPU: 0 PID: 29284 Comm: slapd Tainted: G        W       
4.1.0-ucs190-amd64 #1 Debian 4.1.6-1.190.201604142226
[3689738.157719] Hardware name: VMware, Inc. VMware Virtual Platform/440BX 
Desktop Reference Platform, BIOS 6.00 09/21/2015
[3689738.157721]  0000000000000000 ffffffff817531c0 ffffffff81597807 
ffff88083fc038a8
[3689738.157722]  ffffffff81076c45 ffff8807b9f83578 ffffffff81ab3cc8 
ffff8804d9910d58
[3689738.157724]  0000000000000001 000000013709096f ffffffff81076d4a 
ffffffff81753310
[3689738.157725] Call Trace:
[3689738.157726]  <IRQ>  [<ffffffff81597807>] ? dump_stack+0x40/0x50
[3689738.157734]  [<ffffffff81076c45>] ? warn_slowpath_common+0x95/0xe0
[3689738.157735]  [<ffffffff81076d4a>] ? warn_slowpath_fmt+0x4a/0x50
[3689738.157738]  [<ffffffff810a8962>] ? select_task_rq_fair+0x412/0x610
[3689738.157740]  [<ffffffff8130be50>] ? __list_add+0xc0/0xd0
[3689738.157742]  [<ffffffff810da5a6>] ? internal_add_timer+0x36/0xa0
[3689738.157744]  [<ffffffff810dc6fa>] ? mod_timer_pending+0xfa/0x140
[3689738.157748]  [<ffffffffa048c441>] ? __nf_ct_refresh_acct+0xb1/0xc0 
[nf_conntrack]
[3689738.157751]  [<ffffffffa04945bc>] ? tcp_packet+0x66c/0x1500 [nf_conntrack]
[3689738.157753]  [<ffffffff8101f9d5>] ? sched_clock+0x5/0x10
[3689738.157755]  [<ffffffff8109ea48>] ? resched_curr+0x38/0xc0
[3689738.157758]  [<ffffffff810b5fff>] ? autoremove_wake_function+0x2f/0x50
[3689738.157760]  [<ffffffffa0493ef2>] ? tcp_error+0x1b2/0x210 [nf_conntrack]
[3689738.157763]  [<ffffffffa048e725>] ? nf_conntrack_in+0x3a5/0xb30 
[nf_conntrack]
[3689738.157765]  [<ffffffff81481cb4>] ? sk_reset_timer+0x14/0x20
[3689738.157768]  [<ffffffff814cdeef>] ? nf_iterate+0x4f/0x80
[3689738.157769]  [<ffffffff814cdfb8>] ? nf_hook_slow+0x98/0xf0
[3689738.157771]  [<ffffffff814d52f4>] ? ip_rcv+0x314/0x400
[3689738.157773]  [<ffffffff814d48a0>] ? inet_add_protocol+0x50/0x50
[3689738.157775]  [<ffffffff81498ae3>] ? __netif_receive_skb_core+0x703/0x920
[3689738.157777]  [<ffffffff8101f405>] ? read_tsc+0x5/0x10
[3689738.157778]  [<ffffffff81498ecf>] ? netif_receive_skb_internal+0x1f/0x90
[3689738.157780]  [<ffffffff81499af0>] ? napi_gro_receive+0xb0/0xe0
[3689738.157784]  [<ffffffffa0097fe4>] ? e1000_clean_rx_irq+0x2b4/0x500 [e1000]
[3689738.157787]  [<ffffffffa0099ccc>] ? e1000_clean+0x26c/0x900 [e1000]
[3689738.157789]  [<ffffffff81499629>] ? net_rx_action+0x159/0x330
[3689738.157791]  [<ffffffff8107aace>] ? __do_softirq+0xde/0x260
[3689738.157792]  [<ffffffff8107ae95>] ? irq_exit+0x95/0xa0
[3689738.157794]  [<ffffffff815a0b74>] ? do_IRQ+0x64/0x110
[3689738.157797]  [<ffffffff8159e9ee>] ? common_interrupt+0x6e/0x6e
[3689738.157797]  <EOI> 
[3689738.157798] ---[ end trace 8364fe1151c67413 ]---
