Hello Eli,

On Mon, Sep 5, 2022 at 4:46 PM Eli Britstein via dev
<ovs-dev@openvswitch.org> wrote:
>
> Vport's offloads are done on the tracked orig-in-port, but the flow itself
> is associated in the vport's map.
>
> Removing the physdev will flush all the ports that are on its map, but

all the flows*

> not the ones on other netdevs' maps. Since flows take reference count on
> both their vport and their physdev, the physdev fails to be removed.

I tested with a simple ping over a vxlan tunnel.
In my testing, I do manage to remove the physdev port.
The revalidator later flushes the expired flow (related to the vport),
and the offload thread ends up crashing.

netdev_dpdk_get_port_id (netdev=netdev@entry=0x17d333600) at ../lib/netdev-dpdk.c:5438
5438        if (!is_dpdk_class(netdev->netdev_class)) {
(gdb) bt
#0  netdev_dpdk_get_port_id (netdev=netdev@entry=0x17d333600) at ../lib/netdev-dpdk.c:5438
#1  0x0000000000a34930 in netdev_offload_dpdk_flow_destroy (rte_flow_data=0x7fa51c0104a0) at ../lib/netdev-offload-dpdk.c:2349
#2  0x0000000000926f7c in mark_to_flow_disassociate (dp=0x5c93c80, flow=0x7fa4f400d8a0) at ../lib/dpif-netdev.c:2621
#3  0x00000000009276f7 in dp_netdev_flow_offload_del (item=0x7fa4fc003660) at ../lib/dpif-netdev.c:2743
#4  dp_offload_flow (item=0x7fa4fc003660) at ../lib/dpif-netdev.c:2855
#5  dp_netdev_flow_offload_main (arg=0x59aced0) at ../lib/dpif-netdev.c:2918
#6  0x00000000009c0635 in ovsthread_wrapper (aux_=<optimized out>) at ../lib/ovs-thread.c:423
(gdb) p rte_flow_data->physdev
$5 = (struct netdev *) 0x17d333600
(gdb) p rte_flow_data->netdev
$6 = (struct netdev *) 0x60bb520
(gdb) p *rte_flow_data->physdev
$7 = {name = 0x0, netdev_class = 0x0, auto_classified = false,
  ol_flags = 0, mtu_user_config = false, ref_cnt = 0, change_seq = 0,
  reconfigure_seq = 0x0, last_reconfigure_seq = 0, n_txq = 0, n_rxq = 0,
  node = 0x0, saved_flags_list = {prev = 0x0, next = 0x0},
  flow_api = {p = 0x0}, dpif_type = 0x0, hw_info = {oor = false,
    miss_api_supported = false, offload_count = 0, pending_count = 0,
    offload_data = {p = 0x0}}}
(gdb) p *rte_flow_data->netdev
$8 = {name = 0x60b8eb0 "vxlan0",
  netdev_class = 0xcc5518 <vport_classes+1080>, auto_classified = false,
  ol_flags = 0, mtu_user_config = false, ref_cnt = 8, change_seq = 4,
  reconfigure_seq = 0x60b9120, last_reconfigure_seq = 1802, n_txq = 0,
  n_rxq = 0, node = 0x60b8da0, saved_flags_list = {prev = 0x60bb570,
    next = 0x60bb570}, flow_api = {p = 0xb7ede0 <netdev_offload_dpdk>},
  dpif_type = 0xb7e42b "netdev", hw_info = {oor = false,
    miss_api_supported = true, offload_count = 0, pending_count = 0,
    offload_data = {p = 0x60bc090}}}

There is probably something wrong with the physdev refcnt... and it
seems I am hitting an issue close but different to yours.
Can you retest on the current master branch and confirm?

>
> Fix it by flushing the physdev's offload flows in all related netdevs,
> e.g. the netdev itself, or for physical devices, all vports.
>
> Fixes: adbd4301a249 ("netdev-offload-dpdk: Use per-netdev offload metadata.")
> Reported-by: 15895987278 <wuxi_...@163.com>
> Signed-off-by: Eli Britstein <el...@nvidia.com>

-- 
David Marchand

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev