Thanks a lot. I confirm your fix works.
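For anyone following along, here is a rough userspace sketch of the ordering that the commit quoted below enforces: stop the poller and wait for it to finish before purging its queue. It is an analogy in Python, not kernel code. The names `Cell` and `run_demo` are made up for illustration, the thread stands in for `gro_cell_poll`, and joining it stands in for `napi_disable`.

```python
import threading
from collections import deque


class Cell:
    """Hypothetical stand-in for one gro_cell: a packet queue plus a poller."""

    def __init__(self, packets):
        self.queue = deque(packets)      # plays the role of the napi_skbs list
        self.polled = 0                  # packets consumed by the poller
        self._stop = threading.Event()
        self._poller = threading.Thread(target=self._poll)

    def start(self):
        self._poller.start()

    def _poll(self):
        # Analogue of gro_cell_poll: drain the queue until told to stop.
        while not self._stop.is_set():
            try:
                self.queue.popleft()
            except IndexError:
                break                    # queue is empty, nothing left to poll
            self.polled += 1

    def destroy(self):
        """Quiesce the poller first, then purge. Returns the number purged."""
        self._stop.set()
        self._poller.join()              # analogue of napi_disable: wait for the poll to end
        purged = len(self.queue)
        self.queue.clear()               # only now is it safe to purge the queue
        return purged


def run_demo(total=10000):
    """Start a poller on `total` packets, tear it down, and report both counts."""
    cell = Cell(range(total))
    cell.start()
    purged = cell.destroy()
    return cell.polled, purged


if __name__ == "__main__":
    polled, purged = run_demo()
    print(f"polled={polled} purged={purged}")
```

Because `destroy` joins the poller before touching the queue, every packet is either polled or purged, never both. Purging without the join would let the poller and the teardown work on the same queue at the same time, which is the situation the commit message describes.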
Br,
Zhike Wang
JDCloud, Product Development, IaaS
------------------------------------------------------------------------------------------------
Mobile/+86 13466719566
E-mail/wangzh...@jd.com
Address/5F Building A, North-Star Century Center, 8 Beichen West Street, Chaoyang District, Beijing
Https://JDCloud.com
------------------------------------------------------------------------------------------------

-----Original Message-----
From: Lorenzo Bianconi [mailto:lorenzo.bianc...@redhat.com]
Sent: Friday, December 28, 2018 4:33 AM
To: Ben Pfaff
Cc: 王志克; Gregory Rose; ovs-discuss@openvswitch.org; ovs-...@openvswitch.org
Subject: Re: [ovs-discuss] crash when restart openvswitch with huge vxlan traffic running

> Greg, this is a kernel issue. If you have the time, will you take a
> look at it sometime?
>

Hi all,

I worked on a pretty similar issue a couple of weeks ago. Could you please take a look at the commit below (it is already in Linus's tree):

commit 8e1da73acded4751a93d4166458a7e640f37d26c
Author: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
Date: Wed Dec 19 23:23:00 2018 +0100

    gro_cell: add napi_disable in gro_cells_destroy

    Add napi_disable routine in gro_cells_destroy since starting from
    commit c42858eaf492 ("gro_cells: remove spinlock protecting receive
    queues") gro_cell_poll and gro_cells_destroy can run concurrently on
    napi_skbs list producing a kernel Oops if the tunnel interface is
    removed while gro_cell_poll is running. The following Oops has been
    triggered removing a vxlan device while the interface is receiving
    traffic

Regards,
Lorenzo

> On Thu, Dec 20, 2018 at 12:42:43PM +0000, 王志克 wrote:
> > Hi All,
> >
> > I ran the test below and the system crashed. Does anyone know whether
> > there is already a fix for it?
> >
> > Setup:
> > CentOS7.4 3.10.0-693.el7.x86_64,
> > OVS: 2.10.1
> >
> > Steps:
> > 1. Build OVS for userspace only, and reuse the kernel's built-in
> > openvswitch module.
> > 2. On Host1, create 1 vxlan interface and add 1 VF_rep to OVS.
> > 3. Attach the VF to one VM; the VM does a 5-tuple swap using a DPDK app.
> > 4. Use a traffic generator to send heavy traffic (7 Mpps with several
> > thousand connections) to the Host1 PF.
> > 5. The OVS rules are configured as below.
> >
> > VM1_PORTNAME=$1
> > VXLAN_PORTNAME=$2
> > VM1_PORT=$(ovs-vsctl list interface | grep $VM1_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
> > VXLAN_PORT=$(ovs-vsctl list interface | grep $VXLAN_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
> > ZONE=8
> > ovs-ofctl del-flows ovs-sriov
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=1000 table=0,arp, actions=NORMAL"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VM1_PORT,action=set_field:$VM1_PORT->reg6,goto_table:5"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VXLAN_PORT, tun_id=0x242, action=set_field:$VXLAN_PORT->reg6,set_field:$VM1_PORT->reg7,goto_table:5"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=5, priority=100, ip,actions=ct(table=10,zone=$ZONE)"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new+est-rel-inv+trk actions= goto_table:15"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel+inv+trk actions=drop"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel-inv-trk actions=drop"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=+new-rel-inv+trk actions= ct(commit,table=15,zone=$ZONE)"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VM1_PORT, action=set_field:0x242->tun_id,set_field:$VXLAN_PORT->reg7,goto_table:20"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VXLAN_PORT, actions=goto_table:20"
> >
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=20, priority=100, ip,action=output:NXM_NX_REG7[0..15]"
> > ovs-ofctl add-flow ovs-sriov -O openflow13 "table=200, priority=100,action=drop"
> > 6. Run "systemctl restart openvswitch" several times; the system then
> > crashes.
> >
> > Crash stack (2 kinds):
> > One
> > [ 575.459905] device vxlan_sys_4789 left promiscuous mode
> > [ 575.460103] BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000008
> > [ 575.460133] IP: [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 [vxlan]
> > [ 575.460210] PGD 0
> > [ 575.460226] Oops: 0002 [#1] SMP
> > [ 575.460254] Modules linked in: vhost_net vhost macvtap macvlan vxlan
> > ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6
> > nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle
> > ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
> > nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT
> > nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter
> > ip6_tables iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE)
> > ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE)
> > ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE)
> > mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt iTCO_vendor_support dcdbas
> > sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel
> > kvm irqbypass crc32_pclmul
> > [ 575.460619] ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper
> > ablk_helper cryptd ipmi_ssif joydev pcspkr sg mei_me mei lpc_ich ipmi_si
> > shpchp ipmi_devintf ipmi_msghandler wmi acpi_power_meter knem(OE) nfsd
> > auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod
> > crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea ixgbe
> > sysfillrect igb sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common
> > crc32c_intel drm ahci libahci megaraid_sas libata i2c_algo_bit i2c_core
> > mdio ptp dca pps_core dm_mirror dm_region_hash
dm_log dm_mod [last
> > unloaded: devlink]
> > [ 575.460885] CPU: 2 PID: 20 Comm: ksoftirqd/2 Tainted: G OE
> > ------------ 3.10.0-693.el7.x86_64 #1
> > [ 575.460912] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 1.3.6
> > 06/03/2015
> > [ 575.460933] task: ffff880152ef1fa0 ti: ffff880152efc000 task.ti:
> > ffff880152efc000
> > [ 575.460954] RIP: 0010:[<ffffffffc09b330b>] [<ffffffffc09b330b>]
> > gro_cell_poll+0x4b/0x80 [vxlan]
> > [ 575.460990] RSP: 0018:ffff880152effd68 EFLAGS: 00010202
> > [ 575.461004] RAX: 0000000000000000 RBX: ffffe8dfff448818 RCX:
> > 0000000000000000
> > [ 575.461024] RDX: 0000000000000001 RSI: ffff881fa42ebf00 RDI:
> > ffffe8dfff448818
> > [ 575.461042] RBP: ffff880152effd88 R08: 0000000000019c40 R09:
> > ffffffff815710d7
> > [ 575.461061] R10: ffff881ffec59c40 R11: ffffea007e90ba00 R12:
> > 0000000000000002
> > [ 575.461079] R13: 0000000000000040 R14: ffffe8dfff448800 R15:
> > 0000000000000001
> > [ 575.461098] FS: 0000000000000000(0000) GS:ffff881ffec40000(0000)
> > knlGS:0000000000000000
> > [ 575.461119] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 575.461134] CR2: 0000000000000008 CR3: 00000000019f2000 CR4:
> > 00000000001427e0
> > [ 575.461153] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > 0000000000000000
> > [ 575.461172] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > 0000000000000400
> > [ 575.461190] Stack:
> > [ 575.461198] ffffe8dfff448818 0000000000000000 0000000000000040
> > 0000000000000000
> > [ 575.461221] ffff880152effe08 ffffffff8158799d ffff881ffec57950
> > ffff881ffec57940
> > [ 575.461254] 00000001000432b7 0000012c52f09428 ffff881ffd57eb40
> > ffff881ffd57eb40
> > [ 575.461277] Call Trace:
> > [ 575.461290] [<ffffffff8158799d>] net_rx_action+0x16d/0x380
> > [ 575.461308] [<ffffffff81090b3f>] __do_softirq+0xef/0x280
> > [ 575.461324] [<ffffffff81090d08>] run_ksoftirqd+0x38/0x50
> > [ 575.462074] [<ffffffff810b909f>] smpboot_thread_fn+0x12f/0x180
> > [ 575.462780] [<ffffffff810b8f70>] ? lg_double_unlock+0x40/0x40
> > [ 575.463464] [<ffffffff810b098f>] kthread+0xcf/0xe0
> > [ 575.464169] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
> > [ 575.464862] [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
> > [ 575.465497] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
> > [ 575.466192] Code: 49 39 f6 74 40 48 85 f6 74 3b 83 6b f8 01 48 89 df 41
> > 83 c4 01 48 8b 0e 48 8b 46 08 48 c7 06 00 00 00 00 48 c7 46 08 00 00 00 00
> > <48> 89 41 08 48 89 08 e8 29 4f bd c0 45 39 ec 74 14 48 8b 73 e8
> > [ 575.467663] RIP [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 [vxlan]
> > [ 575.468412] RSP <ffff880152effd68>
> > [ 575.469197] CR2: 0000000000000008
> >
> > TWO:
> > [ 390.626080] device vxlan_sys_4789 left promiscuous mode
> > [ 390.626345] BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000008
> > [ 390.626411] IP: [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 [vxlan]
> > [ 390.626462] PGD 0
> > [ 390.626499] Oops: 0002 [#1] SMP
> > [ 390.626529] Modules linked in: vhost_net vhost macvtap macvlan vxlan
> > ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6
> > nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle
> > ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
> > nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT
> > nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter
> > ip6_tables iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE)
> > ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE)
> > ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE)
> > mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt iTCO_vendor_support dcdbas
> > sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel
> > kvm irqbypass crc32_pclmul
> > [ 390.627152] ghash_clmulni_intel ipmi_ssif aesni_intel lrw gf128mul
> > glue_helper ablk_helper cryptd ipmi_si pcspkr joydev ipmi_devintf
> > ipmi_msghandler mei_me mei sg lpc_ich shpchp acpi_power_meter wmi nfsd
> > auth_rpcgss nfs_acl lockd knem(OE) grace sunrpc ip_tables xfs libcrc32c
> > sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea
> > sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul crct10dif_common
> > ixgbe crc32c_intel ahci igb libahci libata megaraid_sas mdio i2c_algo_bit
> > ptp i2c_core pps_core dca dm_mirror dm_region_hash dm_log dm_mod [last
> > unloaded: devlink]
> > [ 390.627626] CPU: 11 PID: 6303 Comm: ovs-vswitchd Tainted: G OE
> > ------------ 3.10.0-693.el7.x86_64 #1
> > [ 390.627690] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 1.3.6
> > 06/03/2015
> > [ 390.627738] task: ffff881fe0e89fa0 ti: ffff881fa3590000 task.ti:
> > ffff881fa3590000
> > [ 390.627786] RIP: 0010:[<ffffffffc09c8b4a>] [<ffffffffc09c8b4a>]
> > vxlan_dellink+0x9a/0xf0 [vxlan]
> > [ 390.627848] RSP: 0018:ffff881fa3593888 EFLAGS: 00010206
> > [ 390.627883] RAX: 0000000000000000 RBX: 0000000000000010 RCX:
> > 0000000000000000
> > [ 390.627929] RDX: 0000000000000000 RSI: ffffea007fd7f600 RDI:
> > ffff881ff5fd8c00
> > [ 390.627975] RBP: ffff881fa35938b0 R08: ffff881ff5fd8b00 R09:
> > 000000018040000d
> > [ 390.628020] R10: 00000000f5fd8a01 R11: ffffea007fd7f600 R12:
> > ffff88015270e000
> > [ 390.628066] R13: ffffffff81b1caa0 R14: ffff881fa35938c0 R15:
> > ffffe8dfff60a1d8
> > [ 390.628112] FS: 00007f4ea1168ac0(0000) GS:ffff883ffe540000(0000)
> > knlGS:0000000000000000
> > [ 390.628163] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 390.628201] CR2: 0000000000000008 CR3: 0000001ff9055000 CR4:
> > 00000000001427e0
> > [ 390.628246] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > 0000000000000000
> > [ 390.628292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> > 0000000000000400
> > [ 390.628337] Stack:
> > [ 390.628354] ffff881fa35938c0 ffffffff81ad9d40 0000000000000001
> > 0000000000000000
> > [ 390.628411] ffffffff81ad9d40 ffff881fa35938e0 ffffffff81599023
> > ffff881fa35938c0
> > [ 390.628468] ffff881fa35938c0 000000001d8239fc ffff883ffd864a00
> > ffff881fa3593a70
> > [ 390.628535] Call Trace:
> > [ 390.628561] [<ffffffff81599023>] rtnl_delete_link+0x43/0x80
> > [ 390.628610] [<ffffffff8159b761>] rtnl_dellink+0x91/0xf0
> > [ 390.628649] [<ffffffff81599bd4>] rtnetlink_rcv_msg+0xa4/0x270
> > [ 390.630373] [<ffffffff815bacd0>] ? __netlink_lookup+0xc0/0x110
> > [ 390.632066] [<ffffffff81599b30>] ? rtnetlink_rcv+0x30/0x30
> > [ 390.633751] [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
> > [ 390.635426] [<ffffffff81599b28>] rtnetlink_rcv+0x28/0x30
> > [ 390.637081] [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
> > [ 390.638721] [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
> > [ 390.640371] [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
> > [ 390.642037] [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
> > [ 390.643722] [<ffffffff8156a88f>] ? sock_recvmsg+0xbf/0x100
> > [ 390.645411] [<ffffffff8132c312>] ? put_dec+0x72/0x90
> > [ 390.647075] [<ffffffff8132d303>] ? number.isra.2+0x323/0x360
> > [ 390.648724] [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
> > [ 390.650362] [<ffffffff811de9d2>] ? kmem_cache_free+0x1e2/0x200
> > [ 390.652010] [<ffffffff81217af5>] ? __d_free+0x35/0x40
> > [ 390.653623] [<ffffffff812181b0>] ? d_free+0x60/0x70
> > [ 390.655181] [<ffffffff812186b4>] ? dentry_kill+0x154/0x1b0
> > [ 390.656702] [<ffffffff81222744>] ? mntput+0x24/0x40
> > [ 390.658173] [<ffffffff81203053>] ?
__fput+0x183/0x260
> > [ 390.659606] [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
> > [ 390.660988] [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
> > [ 390.662325] [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
> > [ 390.663624] Code: a0 bb c0 49 8b 3f 49 39 ff 74 be 48 85 ff 74 b9 41 83
> > 6f 10 01 48 8b 0f 48 8b 57 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00
> > <48> 89 51 08 48 89 0a e8 8a a2 ba c0 49 8b 3f 49 39 ff 75 cc eb
> > [ 390.666406] RIP [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 [vxlan]
> > [ 390.667674] RSP <ffff881fa3593888>
> > [ 390.668892] CR2: 0000000000000008
> >
> > Br,
> > Zhike Wang

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss