I've put a (less critical) production guest back on this system to see if that reproduces the problem. If it's stable, I'll move the mission-critical guest (the one that typically reproduced the problem in the past) over. Maybe that'll help narrow it down.
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1404409

Title: [regression] Intel 10Gb NIC Crashes

Status in linux package in Ubuntu: In Progress
Status in linux source package in Trusty: New
Status in linux source package in Utopic: New
Status in linux source package in Vivid: New

Bug description:

I posted this to net...@vger.kernel.org as well:
http://www.spinics.net/lists/netdev/msg309110.html

I think the next step is to try to bisect this down to a specific commit. I'm starting to look at the instructions here:
https://wiki.ubuntu.com/Kernel/KernelBisection

-----

Previous history of this thread:
http://thread.gmane.org/gmane.linux.network/326672

On 2014-11-04 22:57:19, Tom Herbert wrote:
> Using vlan and bonding? vlan_dev_hard_start_xmit called. A possible
> cause is that bonding interface is out of sync with slave interface
> w.r.t. GSO features. Do we know if this worked in 3.14, 3.15?

I'm seeing the same sort of crash/warning (skb_warn_bad_offload). It's happening on Intel 10 Gig NICs using the ixgbe driver. I'm using bridges (for virtual machines) on top of VLANs on top of 802.3ad bonding. I'm using an MTU of 9000 on the bond0 interface, but 1500 everywhere else.

I'm always bonding two ports: on one system, I'm bonding two ports on identical one-port NICs; on another system, I'm bonding two ports on a single two-port NIC. Both systems exhibit the same behavior.

Everything has worked fine for a couple years on Ubuntu 12.04 Precise (Linux 3.2.0). It immediately broke when I upgraded to Ubuntu 14.04 Trusty (Linux 3.13.0). I can also reproduce this using the packaged version of Linux 3.16.0 on Trusty.

In contrast to other reports of this bug, disabling scatter gather on the physical interfaces (e.g. eth0) does *not* stop the crashes (assuming I disabled it correctly).

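Tom Herbert's suggestion above (the bonding interface being out of sync with its slave with respect to GSO features) can be checked mechanically by comparing `ethtool -k` output for a stacked device against the device underneath it. The following is only a rough sketch, not part of the original report: the function names are made up for illustration, and it assumes `ethtool -k` prints one `feature: state` pair per line, as it does on the Trusty system described here.

```python
import subprocess


def parse_ethtool_features(text):
    """Parse `ethtool -k` output into {feature: "on" | "off"}."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        value = value.strip()
        if not sep or not value:
            # Blank lines and the "Features for <dev>:" header carry no state.
            continue
        # Keep only the state; drop notes such as "[fixed]" or "[requested on]".
        features[name.strip()] = value.split()[0]
    return features


def features_missing_below(upper, lower):
    """Features that are on for the upper device but off for the lower one.

    Returns {feature: (upper_state, lower_state)}.
    """
    return {
        name: (state, lower[name])
        for name, state in upper.items()
        if state == "on" and lower.get(name) == "off"
    }


def read_features(dev):
    """Run `ethtool -k <dev>` and parse the result (needs ethtool installed)."""
    result = subprocess.run(
        ["ethtool", "-k", dev], check=True, capture_output=True, text=True
    )
    return parse_ethtool_features(result.stdout)
```

For example, `features_missing_below(read_features("bond0"), read_features("p6p1"))` lists what bond0 advertises that the slave does not. A difference is not necessarily a bug, since the stack can fall back to software for some features, but it narrows down which offloads to toggle first while testing.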
I currently have two systems (one with Precise, one with Trusty) available to do any testing that you'd find helpful.

Here's a first pass at getting some debugging data.

The broken system (Ubuntu 14.04 Trusty):

rlaager@BROKEN:~$ uname -a
Linux BROKEN 3.13.0-43-generic #72-Ubuntu SMP Mon Dec 8 19:35:06 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

rlaager@BROKEN:~$ ethtool -k p6p1
Features for p6p1:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: on
tx-checksum-fcoe-crc: on [fixed]
tx-checksum-sctp: on
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: on [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off

rlaager@BROKEN:~$ ethtool -k bond0
Features for bond0:
rx-checksumming: off [fixed]
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [requested on]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off [requested on]
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]

rlaager@BROKEN:~$ ethtool -k br7
Features for br7:
rx-checksumming: off [fixed]
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [requested on]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: on
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [requested on]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: on [fixed]
tx-gso-robust: off [requested on]
tx-fcoe-segmentation: off [requested on]
tx-gre-segmentation: on
tx-ipip-segmentation: on
tx-sit-segmentation: on
tx-udp_tnl-segmentation: on
tx-mpls-segmentation: on
fcoe-mtu: off [fixed]
tx-nocache-copy: off [requested on]
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]

The working system (Ubuntu 12.04 Precise):

rlaager@WORKING:~$ uname -a
Linux WORKING 3.2.0-74-generic #109-Ubuntu SMP Tue Dec 9 16:45:49 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

rlaager@WORKING:~$ ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on

rlaager@WORKING:~$ ethtool -k bond0
Offload parameters for bond0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: off

rlaager@WORKING:~$ ethtool -k br7
Offload parameters for br7:
rx-checksumming: on
tx-checksumming: on
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: off
tx-vlan-offload: on
ntuple-filters: off

A stack trace from 3.13.0 (the default kernel in Ubuntu Trusty):

[ 1161.275007] WARNING: CPU: 7 PID: 0 at /build/buildd/linux-3.13.0/net/core/dev.c:2224 skb_warn_bad_offload+0xcd/0xda()
[ 1161.275011] : caps=(0x00000022000048c1, 0x0000000000000000) len=1514 data_len=1460 gso_size=1460 gso_type=1 ip_summed=1
[ 1161.275012] Modules linked in: nfsv3 ipmi_devintf ipmi_si vhost_net vhost macvtap macvlan bridge ip6t_REJECT xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_comment xt_mul mrp xt_addrtype llc bonding nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_ ch intel_powerclamp
coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd serio_raw joydev i7core_eda id nfs_acl lp parport nfs lockd sunrpc fscache ses enclosure raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor ixgbe raid6_pq dca hid_generic raid1 ptp mpt2sas smouse hid libahci scsi_transport_sas mdio linear
[ 1161.275077] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G W 3.13.0-43-generic #72-Ubuntu
[ 1161.275079] Hardware name: Supermicro X8DT6/X8DT6, BIOS 2.0a 09/14/2010
[ 1161.275080] 0000000000000009 ffff880c3fc239d8 ffffffff81720bf6 ffff880c3fc23a20
[ 1161.275085] ffff880c3fc23a10 ffffffff810677cd ffff880c1d3b9600 ffff880618e08000
[ 1161.275089] 0000000000000001 0000000000000001 ffff880c1d3b9600 ffff880c3fc23a70
[ 1161.275092] Call Trace:
[ 1161.275094] <IRQ> [<ffffffff81720bf6>] dump_stack+0x45/0x56
[ 1161.275101] [<ffffffff810677cd>] warn_slowpath_common+0x7d/0xa0
[ 1161.275105] [<ffffffff8106783c>] warn_slowpath_fmt+0x4c/0x50
[ 1161.275109] [<ffffffff8136a0a3>] ? ___ratelimit+0x93/0x100
[ 1161.275113] [<ffffffff81723afe>] skb_warn_bad_offload+0xcd/0xda
[ 1161.275118] [<ffffffff81626489>] __skb_gso_segment+0x79/0xb0
[ 1161.275122] [<ffffffff8162677a>] dev_hard_start_xmit+0x18a/0x560
[ 1161.275126] [<ffffffff81098209>] ? ttwu_do_wakeup+0x19/0xc0
[ 1161.275129] [<ffffffff8164594e>] sch_direct_xmit+0xee/0x1c0
[ 1161.275133] [<ffffffff81626d80>] __dev_queue_xmit+0x230/0x500
[ 1161.275137] [<ffffffff81627060>] dev_queue_xmit+0x10/0x20
[ 1161.275143] [<ffffffffa04ab31b>] br_dev_queue_push_xmit+0x7b/0xc0 [bridge]
[ 1161.275149] [<ffffffffa04ab532>] br_forward_finish+0x22/0x60 [bridge]
[ 1161.275155] [<ffffffffa04ab710>] __br_forward+0x80/0xf0 [bridge]
[ 1161.275161] [<ffffffffa04ab9bb>] br_forward+0x8b/0xa0 [bridge]
[ 1161.275167] [<ffffffffa04ac6d9>] br_handle_frame_finish+0x149/0x3d0 [bridge]
[ 1161.275173] [<ffffffffa04acad5>] br_handle_frame+0x175/0x250 [bridge]
[ 1161.275177] [<ffffffff81624ac2>] __netif_receive_skb_core+0x262/0x840
[ 1161.275181] [<ffffffff8101b700>] ? check_tsc_unstable+0x10/0x10
[ 1161.275184] [<ffffffff816250b8>] __netif_receive_skb+0x18/0x60
[ 1161.275188] [<ffffffff81625123>] netif_receive_skb+0x23/0x90
[ 1161.275192] [<ffffffff81625b70>] napi_gro_receive+0x80/0xb0
[ 1161.275202] [<ffffffffa014009c>] ixgbe_clean_rx_irq+0x7ac/0xb10 [ixgbe]
[ 1161.275211] [<ffffffffa0141140>] ixgbe_poll+0x460/0x800 [ixgbe]
[ 1161.275216] [<ffffffff816254a2>] net_rx_action+0x152/0x250
[ 1161.275220] [<ffffffff8106cc1c>] __do_softirq+0xec/0x2c0
[ 1161.275223] [<ffffffff8106d165>] irq_exit+0x105/0x110
[ 1161.275227] [<ffffffff817339e6>] do_IRQ+0x56/0xc0
[ 1161.275231] [<ffffffff817290ed>] common_interrupt+0x6d/0x6d
[ 1161.275232] <EOI> [<ffffffff815d361f>] ? cpuidle_enter_state+0x4f/0xc0
[ 1161.275240] [<ffffffff815d3749>] cpuidle_idle_call+0xb9/0x1f0
[ 1161.275244] [<ffffffff8101d35e>] arch_cpu_idle+0xe/0x30
[ 1161.275247] [<ffffffff810bef35>] cpu_startup_entry+0xc5/0x290
[ 1161.275251] [<ffffffff810413ed>] start_secondary+0x21d/0x2d0

A stack trace from 3.16.0 (still on Ubuntu Trusty):

[ 120.376026] WARNING: CPU: 6 PID: 0 at /build/buildd/linux-lts-utopic-3.16.0/net/core/dev.c:2246 skb_warn_bad_offload+0xcd/0xda()
[ 120.376029] : caps=(0x00000080000048c1, 0x0000000000000000) len=1514 data_len=1460 gso_size=1460 gso_type=1 ip_summed=1
[ 120.376030] Modules linked in: nfsv3 ipmi_devintf ipmi_si ipmi_msghandler vhost_net vhost macvtap macvlan bridge 8021q garp stp mrp llc bonding ip6t_REJECT xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_comment xt_multiport xt_recent xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables intel_powerclamp coretemp kvm_intel gpio_ich kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd serio_raw lpc_ich joydev i7core_edac ioatdma edac_core nfsd auth_rpcgss mac_hid nfs_acl lp parport nfs lockd sunrpc fscache ses enclosure raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor hid_generic raid6_pq ixgbe usbhid raid1 mpt2sas dca ahci raid0 ptp raid_class pps_core scsi_transport_sas multipath hid mdio libahci linear
[ 120.376085] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 3.16.0-28-generic #37-Ubuntu
[ 120.376086] Hardware name: Supermicro X8DT6/X8DT6, BIOS 2.0a 09/14/2010
[ 120.376088] 0000000000000009 ffff880c3fc039b8 ffffffff81762220 ffff880c3fc03a00
[ 120.376090] ffff880c3fc039f0 ffffffff8106dd2d ffff880c1ac99a00 ffff88061c2fc000
[ 120.376092] 0000000000000001 0000000000000001 ffff880c1ac99a00 ffff880c3fc03a50
[ 120.376094] Call Trace:
[ 120.376096] <IRQ> [<ffffffff81762220>] dump_stack+0x45/0x56
[ 120.376105] [<ffffffff8106dd2d>] warn_slowpath_common+0x7d/0xa0
[ 120.376107] [<ffffffff8106dd9c>] warn_slowpath_fmt+0x4c/0x50
[ 120.376111] [<ffffffff8138b153>] ? ___ratelimit+0x93/0x100
[ 120.376114] [<ffffffff817654da>] skb_warn_bad_offload+0xcd/0xda
[ 120.376119] [<ffffffff81661d29>] __skb_gso_segment+0x79/0xb0
[ 120.376122] [<ffffffff81662052>] dev_hard_start_xmit+0x182/0x5c0
[ 120.376125] [<ffffffff8168337e>] sch_direct_xmit+0xee/0x1c0
[ 120.376127] [<ffffffff81662690>] __dev_queue_xmit+0x200/0x4d0
[ 120.376129] [<ffffffff81662970>] dev_queue_xmit+0x10/0x20
[ 120.376135] [<ffffffffc0796ac8>] br_dev_queue_push_xmit+0x68/0xa0 [bridge]
[ 120.376138] [<ffffffffc0796cd2>] br_forward_finish+0x22/0x60 [bridge]
[ 120.376142] [<ffffffffc0796e90>] __br_forward+0x80/0xf0 [bridge]
[ 120.376145] [<ffffffffc079713b>] br_forward+0x8b/0xa0 [bridge]
[ 120.376149] [<ffffffffc0797fb9>] br_handle_frame_finish+0x139/0x3c0 [bridge]
[ 120.376153] [<ffffffffc079838e>] br_handle_frame+0x14e/0x240 [bridge]
[ 120.376155] [<ffffffff81660102>] __netif_receive_skb_core+0x1b2/0x790
[ 120.376158] [<ffffffff8101bcd9>] ? read_tsc+0x9/0x20
[ 120.376161] [<ffffffff816606f8>] __netif_receive_skb+0x18/0x60
[ 120.376163] [<ffffffff81660763>] netif_receive_skb_internal+0x23/0x90
[ 120.376165] [<ffffffff816612c0>] napi_gro_receive+0xc0/0xf0
[ 120.376174] [<ffffffffc03007ac>] ixgbe_clean_rx_irq+0x7bc/0xb40 [ixgbe]
[ 120.376180] [<ffffffffc03018a2>] ixgbe_poll+0x482/0x850 [ixgbe]
[ 120.376183] [<ffffffff8109e9e9>] ? ttwu_do_wakeup+0x19/0xc0
[ 120.376186] [<ffffffff81660b52>] net_rx_action+0x152/0x250
[ 120.376189] [<ffffffff81073055>] __do_softirq+0xf5/0x2e0
[ 120.376191] [<ffffffff81073515>] irq_exit+0x105/0x110
[ 120.376194] [<ffffffff8176d748>] do_IRQ+0x58/0xf0
[ 120.376198] [<ffffffff8176b5ed>] common_interrupt+0x6d/0x6d
[ 120.376199] <EOI> [<ffffffff815fb83f>] ? cpuidle_enter_state+0x4f/0xc0
[ 120.376204] [<ffffffff815fb838>] ? cpuidle_enter_state+0x48/0xc0
[ 120.376206] [<ffffffff815fb967>] cpuidle_enter+0x17/0x20
[ 120.376209] [<ffffffff810b527d>] cpu_startup_entry+0x31d/0x450
[ 120.376213] [<ffffffff810e028d>] ? tick_check_new_device+0xdd/0xf0
[ 120.376216] [<ffffffff8104520d>] start_secondary+0x21d/0x2e0
[ 120.376217] ---[ end trace 90d53a2c9c47f360 ]---

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-43-generic 3.13.0-43.72
ProcVersionSignature: Ubuntu 3.13.0-43.72-generic 3.13.11.11
Uname: Linux 3.13.0-43-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Dec 15 01:23 seq
 crw-rw---- 1 root audio 116, 33 Dec 15 01:23 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Date: Fri Dec 19 17:07:18 2014
HibernationDevice: RESUME=/dev/mapper/data-swap
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: Supermicro X8DT6
PciMultimedia:
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-43-generic root=/dev/mapper/data-os ro elevator=noop console=ttyS1,115200n8 console=tty1 transparent_hugepage=always nomdmonddf nomdmonisw
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-43-generic N/A
 linux-backports-modules-3.13.0-43-generic N/A
 linux-firmware 1.127.10
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
WifiSyslog:
dmi.bios.date: 09/14/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2.0a
dmi.board.asset.tag: 1234567890
dmi.board.name: X8DT6
dmi.board.vendor: Supermicro
dmi.board.version: 1234567890
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 17
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 1234567890
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr2.0a:bd09/14/2010:svnSupermicro:pnX8DT6:pvr1234567890:rvnSupermicro:rnX8DT6:rvr1234567890:cvnSupermicro:ct17:cvr1234567890:
dmi.product.name: X8DT6
dmi.product.version: 1234567890
dmi.sys.vendor: Supermicro

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1404409/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp