On 12/27/2018 11:40 AM, Ben Pfaff wrote:
Greg, this is a kernel issue.  If you have the time, will you take a
look at it sometime?

Yep, will do.

- Greg


On Thu, Dec 20, 2018 at 12:42:43PM +0000, 王志克 wrote:
Hi All,

I ran the test below and hit a system crash. Does anyone know whether a fix 
already exists for it?

Setup:
CentOS 7.4, kernel 3.10.0-693.el7.x86_64
OVS: 2.10.1

Steps:
1.  Build OVS for userspace only, and reuse the kernel's built-in openvswitch module.
2.  On Host1, create 1 vxlan interface and add 1 VF_rep to OVS.
3.  Attach the VF to one VM; the VM swaps the 5-tuple of each packet using a DPDK app.
4.  Use a traffic generator to send heavy traffic (7 Mpps across several thousand 
connections) to the Host1 PF.
5.  Configure the OVS rules as below.

VM1_PORTNAME=$1
VXLAN_PORTNAME=$2
VM1_PORT=$(ovs-vsctl list interface | grep $VM1_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
VXLAN_PORT=$(ovs-vsctl list interface | grep $VXLAN_PORTNAME -A1 | grep ofport | sed 's/ofport *: \([0-9]*\)/\1/g')
ZONE=8
ovs-ofctl del-flows ovs-sriov
ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=1000 table=0,arp, actions=NORMAL"
ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VM1_PORT,action=set_field:$VM1_PORT->reg6,goto_table:5"
ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=0,ip,in_port=$VXLAN_PORT, tun_id=0x242, action=set_field:$VXLAN_PORT->reg6,set_field:$VM1_PORT->reg7,goto_table:5"

ovs-ofctl add-flow ovs-sriov -O openflow13 "table=5, priority=100, ip,actions=ct(table=10,zone=$ZONE)"

ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new+est-rel-inv+trk actions= goto_table:15"
ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel+inv+trk actions=drop"
ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=-new-est-rel-inv-trk actions=drop"

ovs-ofctl add-flow ovs-sriov -O openflow13 "table=10, priority=100,ip,ct_state=+new-rel-inv+trk actions= ct(commit,table=15,zone=$ZONE)"

ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VM1_PORT, action=set_field:0x242->tun_id,set_field:$VXLAN_PORT->reg7,goto_table:20"
ovs-ofctl add-flow ovs-sriov -O openflow13 "priority=100 table=15,ip, in_port=$VXLAN_PORT, actions=goto_table:20"

ovs-ofctl add-flow ovs-sriov -O openflow13 "table=20, priority=100, ip,action=output:NXM_NX_REG7[0..15]"
ovs-ofctl add-flow ovs-sriov -O openflow13 "table=200, priority=100,action=drop"
6.  Run “systemctl restart openvswitch” several times; the system then crashes.
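
For anyone trying to reproduce this, steps 5 and 6 can be scripted roughly as follows. This is only a sketch: the function names get_ofport and restart_loop, the OVS_VSCTL override variable, and the count and delay values are assumptions for illustration and were not part of the report. The port lookup uses "ovs-vsctl get Interface NAME ofport", which reads the column directly rather than parsing the output of "ovs-vsctl list interface".

```shell
#!/bin/sh
# Hypothetical helpers; names and defaults are assumptions,
# not part of the original report.

# Command used to query the OVS database; overridable for testing.
OVS_VSCTL=${OVS_VSCTL:-ovs-vsctl}

# Print the OpenFlow port number of the interface named $1.
get_ofport() {
    "$OVS_VSCTL" get Interface "$1" ofport
}

# Run a command COUNT times, sleeping DELAY seconds after each run.
# Stops and returns non-zero as soon as the command fails.
# Usage: restart_loop COUNT DELAY COMMAND [ARGS...]
restart_loop() {
    count=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        echo "restart $i/$count"
        "$@" || return 1
        sleep "$delay"
        i=$((i + 1))
    done
}
```

With traffic still running, step 6 would then be something like "restart_loop 10 5 systemctl restart openvswitch"; how many restarts are needed before the crash appears is not stated in the report.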

Crash stacks (2 kinds):
One:
[  575.459905] device vxlan_sys_4789 left promiscuous mode
[  575.460103] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000008
[  575.460133] IP: [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 [vxlan]
[  575.460210] PGD 0
[  575.460226] Oops: 0002 [#1] SMP
[  575.460254] Modules linked in: vhost_net vhost macvtap macvlan vxlan 
ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6 
nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle 
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter 
ip6_tables iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) 
ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) 
ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE) 
mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt iTCO_vendor_support dcdbas 
sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm 
irqbypass crc32_pclmul
[  575.460619]  ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper 
ablk_helper cryptd ipmi_ssif joydev pcspkr sg mei_me mei lpc_ich ipmi_si shpchp 
ipmi_devintf ipmi_msghandler wmi acpi_power_meter knem(OE) nfsd auth_rpcgss 
nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif 
crct10dif_generic mgag200 drm_kms_helper syscopyarea ixgbe sysfillrect igb 
sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel drm 
ahci libahci megaraid_sas libata i2c_algo_bit i2c_core mdio ptp dca pps_core 
dm_mirror dm_region_hash dm_log dm_mod [last unloaded: devlink]
[  575.460885] CPU: 2 PID: 20 Comm: ksoftirqd/2 Tainted: G           OE  
------------   3.10.0-693.el7.x86_64 #1
[  575.460912] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 1.3.6 
06/03/2015
[  575.460933] task: ffff880152ef1fa0 ti: ffff880152efc000 task.ti: 
ffff880152efc000
[  575.460954] RIP: 0010:[<ffffffffc09b330b>]  [<ffffffffc09b330b>] 
gro_cell_poll+0x4b/0x80 [vxlan]
[  575.460990] RSP: 0018:ffff880152effd68  EFLAGS: 00010202
[  575.461004] RAX: 0000000000000000 RBX: ffffe8dfff448818 RCX: 0000000000000000
[  575.461024] RDX: 0000000000000001 RSI: ffff881fa42ebf00 RDI: ffffe8dfff448818
[  575.461042] RBP: ffff880152effd88 R08: 0000000000019c40 R09: ffffffff815710d7
[  575.461061] R10: ffff881ffec59c40 R11: ffffea007e90ba00 R12: 0000000000000002
[  575.461079] R13: 0000000000000040 R14: ffffe8dfff448800 R15: 0000000000000001
[  575.461098] FS:  0000000000000000(0000) GS:ffff881ffec40000(0000) 
knlGS:0000000000000000
[  575.461119] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  575.461134] CR2: 0000000000000008 CR3: 00000000019f2000 CR4: 00000000001427e0
[  575.461153] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  575.461172] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  575.461190] Stack:
[  575.461198]  ffffe8dfff448818 0000000000000000 0000000000000040 
0000000000000000
[  575.461221]  ffff880152effe08 ffffffff8158799d ffff881ffec57950 
ffff881ffec57940
[  575.461254]  00000001000432b7 0000012c52f09428 ffff881ffd57eb40 
ffff881ffd57eb40
[  575.461277] Call Trace:
[  575.461290]  [<ffffffff8158799d>] net_rx_action+0x16d/0x380
[  575.461308]  [<ffffffff81090b3f>] __do_softirq+0xef/0x280
[  575.461324]  [<ffffffff81090d08>] run_ksoftirqd+0x38/0x50
[  575.462074]  [<ffffffff810b909f>] smpboot_thread_fn+0x12f/0x180
[  575.462780]  [<ffffffff810b8f70>] ? lg_double_unlock+0x40/0x40
[  575.463464]  [<ffffffff810b098f>] kthread+0xcf/0xe0
[  575.464169]  [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
[  575.464862]  [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
[  575.465497]  [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
[  575.466192] Code: 49 39 f6 74 40 48 85 f6 74 3b 83 6b f8 01 48 89 df 41 83 c4 01 
48 8b 0e 48 8b 46 08 48 c7 06 00 00 00 00 48 c7 46 08 00 00 00 00 <48> 89 41 08 
48 89 08 e8 29 4f bd c0 45 39 ec 74 14 48 8b 73 e8
[  575.467663] RIP  [<ffffffffc09b330b>] gro_cell_poll+0x4b/0x80 [vxlan]
[  575.468412]  RSP <ffff880152effd68>
[  575.469197] CR2: 0000000000000008

Two:
[  390.626080] device vxlan_sys_4789 left promiscuous mode
[  390.626345] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000008
[  390.626411] IP: [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 [vxlan]
[  390.626462] PGD 0
[  390.626499] Oops: 0002 [#1] SMP
[  390.626529] Modules linked in: vhost_net vhost macvtap macvlan vxlan 
ip6_udp_tunnel udp_tunnel openvswitch nf_conntrack_ipv6 nf_nat_ipv6 
nf_defrag_ipv6 vfio_pci vfio_iommu_type1 vfio xt_CHECKSUM iptable_mangle 
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter 
ip6_tables iptable_filter rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) 
ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_fpga_tools(OE) mlx5_ib(OE) 
ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE) 
mlx4_core(OE) mlx_compat(OE) devlink iTCO_wdt iTCO_vendor_support dcdbas 
sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm 
irqbypass crc32_pclmul
[  390.627152]  ghash_clmulni_intel ipmi_ssif aesni_intel lrw gf128mul 
glue_helper ablk_helper cryptd ipmi_si pcspkr joydev ipmi_devintf 
ipmi_msghandler mei_me mei sg lpc_ich shpchp acpi_power_meter wmi nfsd 
auth_rpcgss nfs_acl lockd knem(OE) grace sunrpc ip_tables xfs libcrc32c sd_mod 
crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect 
sysimgblt fb_sys_fops ttm drm crct10dif_pclmul crct10dif_common ixgbe 
crc32c_intel ahci igb libahci libata megaraid_sas mdio i2c_algo_bit ptp 
i2c_core pps_core dca dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 
devlink]
[  390.627626] CPU: 11 PID: 6303 Comm: ovs-vswitchd Tainted: G           OE  
------------   3.10.0-693.el7.x86_64 #1
[  390.627690] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 1.3.6 
06/03/2015
[  390.627738] task: ffff881fe0e89fa0 ti: ffff881fa3590000 task.ti: 
ffff881fa3590000
[  390.627786] RIP: 0010:[<ffffffffc09c8b4a>]  [<ffffffffc09c8b4a>] 
vxlan_dellink+0x9a/0xf0 [vxlan]
[  390.627848] RSP: 0018:ffff881fa3593888  EFLAGS: 00010206
[  390.627883] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 0000000000000000
[  390.627929] RDX: 0000000000000000 RSI: ffffea007fd7f600 RDI: ffff881ff5fd8c00
[  390.627975] RBP: ffff881fa35938b0 R08: ffff881ff5fd8b00 R09: 000000018040000d
[  390.628020] R10: 00000000f5fd8a01 R11: ffffea007fd7f600 R12: ffff88015270e000
[  390.628066] R13: ffffffff81b1caa0 R14: ffff881fa35938c0 R15: ffffe8dfff60a1d8
[  390.628112] FS:  00007f4ea1168ac0(0000) GS:ffff883ffe540000(0000) 
knlGS:0000000000000000
[  390.628163] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  390.628201] CR2: 0000000000000008 CR3: 0000001ff9055000 CR4: 00000000001427e0
[  390.628246] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  390.628292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  390.628337] Stack:
[  390.628354]  ffff881fa35938c0 ffffffff81ad9d40 0000000000000001 
0000000000000000
[  390.628411]  ffffffff81ad9d40 ffff881fa35938e0 ffffffff81599023 
ffff881fa35938c0
[  390.628468]  ffff881fa35938c0 000000001d8239fc ffff883ffd864a00 
ffff881fa3593a70
[  390.628535] Call Trace:
[  390.628561]  [<ffffffff81599023>] rtnl_delete_link+0x43/0x80
[  390.628610]  [<ffffffff8159b761>] rtnl_dellink+0x91/0xf0
[  390.628649]  [<ffffffff81599bd4>] rtnetlink_rcv_msg+0xa4/0x270
[  390.630373]  [<ffffffff815bacd0>] ? __netlink_lookup+0xc0/0x110
[  390.632066]  [<ffffffff81599b30>] ? rtnetlink_rcv+0x30/0x30
[  390.633751]  [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
[  390.635426]  [<ffffffff81599b28>] rtnetlink_rcv+0x28/0x30
[  390.637081]  [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
[  390.638721]  [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
[  390.640371]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
[  390.642037]  [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
[  390.643722]  [<ffffffff8156a88f>] ? sock_recvmsg+0xbf/0x100
[  390.645411]  [<ffffffff8132c312>] ? put_dec+0x72/0x90
[  390.647075]  [<ffffffff8132d303>] ? number.isra.2+0x323/0x360
[  390.648724]  [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
[  390.650362]  [<ffffffff811de9d2>] ? kmem_cache_free+0x1e2/0x200
[  390.652010]  [<ffffffff81217af5>] ? __d_free+0x35/0x40
[  390.653623]  [<ffffffff812181b0>] ? d_free+0x60/0x70
[  390.655181]  [<ffffffff812186b4>] ? dentry_kill+0x154/0x1b0
[  390.656702]  [<ffffffff81222744>] ? mntput+0x24/0x40
[  390.658173]  [<ffffffff81203053>] ? __fput+0x183/0x260
[  390.659606]  [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
[  390.660988]  [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
[  390.662325]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
[  390.663624] Code: a0 bb c0 49 8b 3f 49 39 ff 74 be 48 85 ff 74 b9 41 83 6f 10 01 
48 8b 0f 48 8b 57 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00 <48> 89 51 08 
48 89 0a e8 8a a2 ba c0 49 8b 3f 49 39 ff 75 cc eb
[  390.666406] RIP  [<ffffffffc09c8b4a>] vxlan_dellink+0x9a/0xf0 [vxlan]
[  390.667674]  RSP <ffff881fa3593888>
[  390.668892] CR2: 0000000000000008
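
In both traces the "Code:" line marks the first byte of the faulting instruction with angle brackets. A small helper along these lines can pull out the bytes from that marker onward so they can be fed to a disassembler. It is a sketch only: the function name faulting_bytes is hypothetical, and it expects just the "Code:" line (including its wrapped continuation lines) on stdin. The kernel source tree's scripts/decodecode does this more completely.

```shell
#!/bin/sh
# Hypothetical helper; the name is an assumption for illustration.
# Reads the "Code:" line of an oops on stdin (wrapped lines are joined)
# and prints the hex bytes starting at the byte marked <xx>.
faulting_bytes() {
    tr '\n' ' ' |
        sed -n 's/.*<\([0-9a-f][0-9a-f]\)>\(.*\)/\1\2/p' |
        tr -s ' ' |
        sed 's/ *$//'
}
```

Decoded by hand, the marked bytes 48 89 41 08 in the first trace are "mov %rax,0x8(%rcx)" and 48 89 51 08 in the second are "mov %rdx,0x8(%rcx)". RCX is 0 in both register dumps, which matches the write fault at CR2 = 0000000000000008.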

Br,
Zhike Wang
JDCloud, Product Development, IaaS



_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

