Re: [ovs-dev] kernel crash bug caused by ixgbevf kernel module of centos-3.10.0-229.20.1.el7
On 1/29/2018 6:23 PM, Sam wrote:
> Details below: the bug happens on a bond of enp1s16 and enp1s16f1.
>
> [huanghuai-test@yf-mos-test-net14 ~]$ sudo /usr/local/share/openvswitch/scripts/dpdk_nic_bind --status
>
> Network devices using DPDK-compatible driver
> :01:00.0 'Ethernet Controller 10-Gigabit X540-AT2' drv=igb_uio unused=ixgbe
> :01:00.1 'Ethernet Controller 10-Gigabit X540-AT2' drv=igb_uio unused=ixgbe
>
> Network devices using kernel driver
> ===
> :01:10.0 'X540 Ethernet Controller Virtual Function' if=enp1s16 drv=ixgbevf unused=bak,igb_uio
> :01:10.1 'X540 Ethernet Controller Virtual Function' if=enp1s16f1 drv=ixgbevf unused=bak,igb_uio
> :08:00.0 'I350 Gigabit Network Connection' if=eth2 drv=igb unused=igb_uio
> :08:00.1 'I350 Gigabit Network Connection' if=eth3 drv=igb unused=igb_uio
>
> Other network devices
> =

I'd try reporting it to Intel.

https://sourceforge.net/p/e1000/bugs/

- Greg

> 2018-01-30 10:19 GMT+08:00 Sam :
> > I found a bug in the ixgbevf kernel module in centos-3.10.0-229.20.1.el7.
> > The same bug is also present in 3.10.0-514.10.2.el7.
> >
> > How to reproduce this bug: enable SR-IOV first, then put heavy network
> > traffic on the VF port, and then run ifdown/ifup on the VF port. After
> > many repetitions, the bug happens.
> >
> > BUG:
> >
> > [308026.586026] ixgbevf :01:10.0: NIC Link is Down
> > [308026.586037] ixgbevf :01:10.1: NIC Link is Down
> > [308026.683724] bonding: bond1: link status definitely down for interface enp1s16, disabling it
> > [308026.683728] bonding: bond1: now running without any active interface !
> > [308026.683729] bonding: bond1: link status definitely down for interface enp1s16f1, disabling it
> > [308028.266060] bonding: bond1: Removing slave enp1s16.
> > [308028.266135] bonding: bond1: Warning: the permanent HWaddr of enp1s16 - 4e:cd:a6:59:26:2c - is still in use by bond1. Set the HWaddr of enp1s16 to a different address to avoid conflicts.
> > [308028.266139] bonding: bond1: releasing active interface enp1s16
> > [308028.359872] BUG: unable to handle kernel NULL pointer dereference at 0008
> > [308028.361319] IP: [] ixgbevf_alloc_rx_buffers+0x60/0x160 [ixgbevf]
> > [308028.362049] PGD 0
> > [308028.362777] Oops: [#1] SMP
> > [308028.363481] Modules linked in: ixgbevf(OF) igb_uio(OF) iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter nbd(OF) vhost_net macvtap macvlan udp_diag unix_diag af_packet_diag netlink_diag tun tcp_diag inet_diag uio bonding ext4 mbcache jbd2 intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel mgag200 aesni_intel iTCO_wdt lrw dcdbas gf128mul syscopyarea sysfillrect iTCO_vendor_support glue_helper sysimgblt ablk_helper ttm cryptd ipmi_devintf igb ixgbe drm_kms_helper drm i2c_algo_bit ptp i2c_core ipmi_si pps_core sg mdio ipmi_msghandler dca sb_edac mei_me mei shpchp lpc_ich pcspkr mfd_core edac_core wmi acpi_power_meter acpi_pad ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_common ahci libahci
> > [308028.368487] libata megaraid_sas [last unloaded: ixgbevf]
> > [308028.369345] CPU: 0 PID: 21971 Comm: kworker/0:1 Tainted: GF W O-- 3.10.0-229.el7.x86_64 #1
> > [308028.370226] Hardware name: Dell Inc. PowerEdge R720/068CDY, BIOS 2.5.2 01/28/2015
> > [308028.371132] Workqueue: events ixgbevf_service_task [ixgbevf]
> > [308028.372038] task: 88022b0dad80 ti: 88010905c000 task.ti: 88010905c000
> > [308028.372965] RIP: 0010:[] [] ixgbevf_alloc_rx_buffers+0x60/0x160 [ixgbevf]
> > [308028.373949] RSP: 0018:88010905fd10 EFLAGS: 00010287
> > [308028.374900] RAX: 0200 RBX: RCX:
> > [308028.375895] RDX: RSI: 01ff RDI: 8800b82061c0
> > [308028.376841] RBP: 88010905fd48 R08: 0282 R09: 0001
> > [308028.377780] R10: 0004 R11: 0005 R12:
> > [308028.378702] R13: fe00 R14: 01ff R15: 8800b82061c0
> > [308028.379628] FS: () GS:882f7fa0() knlGS:
> > [308028.380540] CS: 0010 DS: ES: CR0: 80050033
> > [308028.381471] CR2: 0008 CR3: 0190a000 CR4: 001427f0
> > [308028.382376] DR0: DR1: DR2:
> > [308028.383291] DR3: DR6: 0ff0 DR7: 0400
> > [308028.384180] Stack:
> > [308028.385051] 8832d1b58bc0 88010905fd28 8832d1b588c0 0009
> > [308028.385933] 8832d1b58bc0 8800b82061c0 1028 88010905fdb8
> > [308028.386804] a0496ba3 8832d1b58e58 00022b1e2000 819e2108
> > [308028.387693] Call Trace:
> > [308028.388520] [] ixgbevf_configure+0x5d3/0x7d0 [ixgbevf]
> > [308028.389363] [] ixgbevf_reinit_locked+0x65/0x90 [ixgbevf]
> > [308028.390213] [] ixgbevf_service_task+0x324/0x420 [ixgbevf]
> > [308028.391043] [] process_one_work+0x17b/0x470
> > [3080
Re: [ovs-dev] kernel crash bug caused by ixgbevf kernel module of centos-3.10.0-229.20.1.el7
Details below: the bug happens on a bond of enp1s16 and enp1s16f1.

[huanghuai-test@yf-mos-test-net14 ~]$ sudo /usr/local/share/openvswitch/scripts/dpdk_nic_bind --status

Network devices using DPDK-compatible driver
:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2' drv=igb_uio unused=ixgbe
:01:00.1 'Ethernet Controller 10-Gigabit X540-AT2' drv=igb_uio unused=ixgbe

Network devices using kernel driver
===
:01:10.0 'X540 Ethernet Controller Virtual Function' if=enp1s16 drv=ixgbevf unused=bak,igb_uio
:01:10.1 'X540 Ethernet Controller Virtual Function' if=enp1s16f1 drv=ixgbevf unused=bak,igb_uio
:08:00.0 'I350 Gigabit Network Connection' if=eth2 drv=igb unused=igb_uio
:08:00.1 'I350 Gigabit Network Connection' if=eth3 drv=igb unused=igb_uio

Other network devices
=

2018-01-30 10:19 GMT+08:00 Sam :
> I found a bug in the ixgbevf kernel module in centos-3.10.0-229.20.1.el7.
> The same bug is also present in 3.10.0-514.10.2.el7.
>
> How to reproduce this bug: enable SR-IOV first, then put heavy network
> traffic on the VF port, and then run ifdown/ifup on the VF port. After
> many repetitions, the bug happens.
>
> BUG:
>
> [308026.586026] ixgbevf :01:10.0: NIC Link is Down
> [308026.586037] ixgbevf :01:10.1: NIC Link is Down
> [308026.683724] bonding: bond1: link status definitely down for interface enp1s16, disabling it
> [308026.683728] bonding: bond1: now running without any active interface !
> [308026.683729] bonding: bond1: link status definitely down for interface enp1s16f1, disabling it
> [308028.266060] bonding: bond1: Removing slave enp1s16.
> [308028.266135] bonding: bond1: Warning: the permanent HWaddr of enp1s16 - 4e:cd:a6:59:26:2c - is still in use by bond1. Set the HWaddr of enp1s16 to a different address to avoid conflicts.
> [308028.266139] bonding: bond1: releasing active interface enp1s16
> [308028.359872] BUG: unable to handle kernel NULL pointer dereference at 0008
> [308028.361319] IP: [] ixgbevf_alloc_rx_buffers+0x60/0x160 [ixgbevf]
> [308028.362049] PGD 0
> [308028.362777] Oops: [#1] SMP
> [308028.363481] Modules linked in: ixgbevf(OF) igb_uio(OF) iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter nbd(OF) vhost_net macvtap macvlan udp_diag unix_diag af_packet_diag netlink_diag tun tcp_diag inet_diag uio bonding ext4 mbcache jbd2 intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel mgag200 aesni_intel iTCO_wdt lrw dcdbas gf128mul syscopyarea sysfillrect iTCO_vendor_support glue_helper sysimgblt ablk_helper ttm cryptd ipmi_devintf igb ixgbe drm_kms_helper drm i2c_algo_bit ptp i2c_core ipmi_si pps_core sg mdio ipmi_msghandler dca sb_edac mei_me mei shpchp lpc_ich pcspkr mfd_core edac_core wmi acpi_power_meter acpi_pad ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_common ahci libahci
> [308028.368487] libata megaraid_sas [last unloaded: ixgbevf]
> [308028.369345] CPU: 0 PID: 21971 Comm: kworker/0:1 Tainted: GF W O-- 3.10.0-229.el7.x86_64 #1
> [308028.370226] Hardware name: Dell Inc. PowerEdge R720/068CDY, BIOS 2.5.2 01/28/2015
> [308028.371132] Workqueue: events ixgbevf_service_task [ixgbevf]
> [308028.372038] task: 88022b0dad80 ti: 88010905c000 task.ti: 88010905c000
> [308028.372965] RIP: 0010:[] [] ixgbevf_alloc_rx_buffers+0x60/0x160 [ixgbevf]
> [308028.373949] RSP: 0018:88010905fd10 EFLAGS: 00010287
> [308028.374900] RAX: 0200 RBX: RCX:
> [308028.375895] RDX: RSI: 01ff RDI: 8800b82061c0
> [308028.376841] RBP: 88010905fd48 R08: 0282 R09: 0001
> [308028.377780] R10: 0004 R11: 0005 R12:
> [308028.378702] R13: fe00 R14: 01ff R15: 8800b82061c0
> [308028.379628] FS: () GS:882f7fa0() knlGS:
> [308028.380540] CS: 0010 DS: ES: CR0: 80050033
> [308028.381471] CR2: 0008 CR3: 0190a000 CR4: 001427f0
> [308028.382376] DR0: DR1: DR2:
> [308028.383291] DR3: DR6: 0ff0 DR7: 0400
> [308028.384180] Stack:
> [308028.385051] 8832d1b58bc0 88010905fd28 8832d1b588c0 0009
> [308028.385933] 8832d1b58bc0 8800b82061c0 1028 88010905fdb8
> [308028.386804] a0496ba3 8832d1b58e58 00022b1e2000 819e2108
> [308028.387693] Call Trace:
> [308028.388520] [] ixgbevf_configure+0x5d3/0x7d0 [ixgbevf]
> [308028.389363] [] ixgbevf_reinit_locked+0x65/0x90 [ixgbevf]
> [308028.390213] [] ixgbevf_service_task+0x324/0x420 [ixgbevf]
> [308028.391
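
The reproduction steps quoted above (repeated ifdown/ifup on the VF ports while they carry traffic) could be scripted roughly as follows. This is only a sketch under assumptions that are not in the report: the helper names `flap_commands` and `run` are made up for illustration, the order "both ports down, then both ports up" and the cycle count are guesses, and traffic generation is left out entirely. The interface names are the two bond slaves named in this message. By default it only prints the commands; pass `dry_run=False` to execute them.

```python
import subprocess
import time


def flap_commands(ifaces, cycles):
    """Build the ifdown/ifup command sequence for `cycles` rounds.

    Assumption: each round takes every interface down, then brings
    every interface back up. The report does not specify the order.
    """
    cmds = []
    for _ in range(cycles):
        for iface in ifaces:
            cmds.append(["ifdown", iface])
        for iface in ifaces:
            cmds.append(["ifup", iface])
    return cmds


def run(cmds, pause=1.0, dry_run=True):
    """Print the commands (dry run) or execute them with a pause between."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=False: keep cycling even if one ifdown/ifup call fails.
            subprocess.run(cmd, check=False)
            time.sleep(pause)


if __name__ == "__main__":
    # Hypothetical cycle count; the report only says "after many times".
    run(flap_commands(["enp1s16", "enp1s16f1"], cycles=2))
```

Running it in the default dry-run mode is a cheap way to review the exact command sequence before letting it loose on a test host.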
[ovs-dev] kernel crash bug caused by ixgbevf kernel module of centos-3.10.0-229.20.1.el7
I found a bug in the ixgbevf kernel module in centos-3.10.0-229.20.1.el7. The same bug is also present in 3.10.0-514.10.2.el7.

How to reproduce this bug: enable SR-IOV first, then put heavy network traffic on the VF port, and then run ifdown/ifup on the VF port. After many repetitions, the bug happens.

BUG:

[308026.586026] ixgbevf :01:10.0: NIC Link is Down
[308026.586037] ixgbevf :01:10.1: NIC Link is Down
[308026.683724] bonding: bond1: link status definitely down for interface enp1s16, disabling it
[308026.683728] bonding: bond1: now running without any active interface !
[308026.683729] bonding: bond1: link status definitely down for interface enp1s16f1, disabling it
[308028.266060] bonding: bond1: Removing slave enp1s16.
[308028.266135] bonding: bond1: Warning: the permanent HWaddr of enp1s16 - 4e:cd:a6:59:26:2c - is still in use by bond1. Set the HWaddr of enp1s16 to a different address to avoid conflicts.
[308028.266139] bonding: bond1: releasing active interface enp1s16
[308028.359872] BUG: unable to handle kernel NULL pointer dereference at 0008
[308028.361319] IP: [] ixgbevf_alloc_rx_buffers+0x60/0x160 [ixgbevf]
[308028.362049] PGD 0
[308028.362777] Oops: [#1] SMP
[308028.363481] Modules linked in: ixgbevf(OF) igb_uio(OF) iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter nbd(OF) vhost_net macvtap macvlan udp_diag unix_diag af_packet_diag netlink_diag tun tcp_diag inet_diag uio bonding ext4 mbcache jbd2 intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel mgag200 aesni_intel iTCO_wdt lrw dcdbas gf128mul syscopyarea sysfillrect iTCO_vendor_support glue_helper sysimgblt ablk_helper ttm cryptd ipmi_devintf igb ixgbe drm_kms_helper drm i2c_algo_bit ptp i2c_core ipmi_si pps_core sg mdio ipmi_msghandler dca sb_edac mei_me mei shpchp lpc_ich pcspkr mfd_core edac_core wmi acpi_power_meter acpi_pad ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_common ahci libahci
[308028.368487] libata megaraid_sas [last unloaded: ixgbevf]
[308028.369345] CPU: 0 PID: 21971 Comm: kworker/0:1 Tainted: GF W O-- 3.10.0-229.el7.x86_64 #1
[308028.370226] Hardware name: Dell Inc. PowerEdge R720/068CDY, BIOS 2.5.2 01/28/2015
[308028.371132] Workqueue: events ixgbevf_service_task [ixgbevf]
[308028.372038] task: 88022b0dad80 ti: 88010905c000 task.ti: 88010905c000
[308028.372965] RIP: 0010:[] [] ixgbevf_alloc_rx_buffers+0x60/0x160 [ixgbevf]
[308028.373949] RSP: 0018:88010905fd10 EFLAGS: 00010287
[308028.374900] RAX: 0200 RBX: RCX:
[308028.375895] RDX: RSI: 01ff RDI: 8800b82061c0
[308028.376841] RBP: 88010905fd48 R08: 0282 R09: 0001
[308028.377780] R10: 0004 R11: 0005 R12:
[308028.378702] R13: fe00 R14: 01ff R15: 8800b82061c0
[308028.379628] FS: () GS:882f7fa0() knlGS:
[308028.380540] CS: 0010 DS: ES: CR0: 80050033
[308028.381471] CR2: 0008 CR3: 0190a000 CR4: 001427f0
[308028.382376] DR0: DR1: DR2:
[308028.383291] DR3: DR6: 0ff0 DR7: 0400
[308028.384180] Stack:
[308028.385051] 8832d1b58bc0 88010905fd28 8832d1b588c0 0009
[308028.385933] 8832d1b58bc0 8800b82061c0 1028 88010905fdb8
[308028.386804] a0496ba3 8832d1b58e58 00022b1e2000 819e2108
[308028.387693] Call Trace:
[308028.388520] [] ixgbevf_configure+0x5d3/0x7d0 [ixgbevf]
[308028.389363] [] ixgbevf_reinit_locked+0x65/0x90 [ixgbevf]
[308028.390213] [] ixgbevf_service_task+0x324/0x420 [ixgbevf]
[308028.391043] [] process_one_work+0x17b/0x470
[308028.391888] [] worker_thread+0x11b/0x400
[308028.392728] [] ? rescuer_thread+0x400/0x400
[308028.393576] [] kthread+0xcf/0xe0
[308028.394434] [] ? kthread_create_on_node+0x140/0x140
[308028.395339] [] ret_from_fork+0x7c/0xb0
[308028.396205] [] ? kthread_create_on_node+0x140/0x140
[308028.397068] Code: c5 41 89 f6 49 89 c4 48 8d 14 40 48 8b 47 28 49 c1 e4 04 4c 03 67 20 48 8d 1c d0 0f b7 47 4c 41 29 c5 66 0f 1f 84 00 00 00 00 00 <48> 83 7b 08 00 74 73 8b 53 10 48 8b 03 48 01 d0 49 83 c4 10 48
[308028.398959] RIP [] ixgbevf_alloc_rx_buffers+0x60/0x160 [ixgbevf]
[308028.399910] RSP
[308028.400846] CR2: 0008
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
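
For anyone triaging the oops above, the `Code:` line can be read by hand: the byte in angle brackets marks the start of the faulting instruction. The sketch below extracts those bytes and decodes the single x86-64 encoding that appears here (`48 83 /7 ib` with an 8-bit displacement, i.e. a 64-bit `cmp` of a memory operand against an immediate). The function names are hypothetical and the decoder deliberately handles only this one form; a real disassembler (for example `objdump` via the kernel's `scripts/decodecode`) is the better tool for anything else.

```python
# Code: line copied from the oops in this thread.
CODE_LINE = (
    "c5 41 89 f6 49 89 c4 48 8d 14 40 48 8b 47 28 49 c1 e4 04 4c 03 67 20 "
    "48 8d 1c d0 0f b7 47 4c 41 29 c5 66 0f 1f 84 00 00 00 00 00 "
    "<48> 83 7b 08 00 74 73 8b 53 10 48 8b 03 48 01 d0 49 83 c4 10 48"
)

REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code_line, count=5):
    """Return `count` bytes starting at the <..>-marked faulting byte."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:start + count])


def decode_cmpq_disp8(insn):
    """Decode only `REX.W 83 /7 ib` with mod=01 (disp8); reject the rest."""
    rex, opcode, modrm, disp, imm = insn[:5]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # rm == 4 would mean a SIB byte follows, which this sketch does not parse.
    if rex != 0x48 or opcode != 0x83 or mod != 1 or reg != 7 or rm == 4:
        raise ValueError("unsupported encoding for this sketch")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    if imm >= 0x80:   # imm8 is sign-extended as well
        imm -= 0x100
    return f"cmpq ${imm:#x},{disp:#x}(%{REGS[rm]})"


if __name__ == "__main__":
    print(decode_cmpq_disp8(faulting_bytes(CODE_LINE)))
```

The decoded instruction compares the 8-byte value at offset 0x8 from RBX with zero. The RBX value itself is missing from the archived register dump, so RBX being NULL is an inference, but it is consistent with the reported fault address (CR2: 0008) and with the "NULL pointer dereference" message.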