Hello Eli,

On Mon, Sep 5, 2022 at 4:46 PM Eli Britstein via dev
<ovs-dev@openvswitch.org> wrote:
>
> Vport's offloads are done on the tracked orig-in-port, but the flow itself
> is associated in the vport's map.
>
> Removing the physdev will flush all the ports that are on its map, but

all the flows*

> not the ones on other netdevs' maps. Since flows take reference count on
> both their vport and their physdev, the physdev fails to be removed.

I tested with a simple ping over a VXLAN tunnel.
In my testing, removing the physdev port succeeds.
The revalidator later flushes the expired flow (the one associated with
the vport), and the offload thread then crashes:

netdev_dpdk_get_port_id (netdev=netdev@entry=0x17d333600) at
../lib/netdev-dpdk.c:5438
5438        if (!is_dpdk_class(netdev->netdev_class)) {
(gdb) bt
#0  netdev_dpdk_get_port_id (netdev=netdev@entry=0x17d333600) at
../lib/netdev-dpdk.c:5438
#1  0x0000000000a34930 in netdev_offload_dpdk_flow_destroy
(rte_flow_data=0x7fa51c0104a0) at ../lib/netdev-offload-dpdk.c:2349
#2  0x0000000000926f7c in mark_to_flow_disassociate (dp=0x5c93c80,
flow=0x7fa4f400d8a0) at ../lib/dpif-netdev.c:2621
#3  0x00000000009276f7 in dp_netdev_flow_offload_del
(item=0x7fa4fc003660) at ../lib/dpif-netdev.c:2743
#4  dp_offload_flow (item=0x7fa4fc003660) at ../lib/dpif-netdev.c:2855
#5  dp_netdev_flow_offload_main (arg=0x59aced0) at ../lib/dpif-netdev.c:2918
#6  0x00000000009c0635 in ovsthread_wrapper (aux_=<optimized out>) at
../lib/ovs-thread.c:423

(gdb) p rte_flow_data->physdev
$5 = (struct netdev *) 0x17d333600
(gdb) p rte_flow_data->netdev
$6 = (struct netdev *) 0x60bb520
(gdb) p *rte_flow_data->physdev
$7 = {name = 0x0, netdev_class = 0x0, auto_classified = false,
ol_flags = 0, mtu_user_config = false, ref_cnt = 0, change_seq = 0,
reconfigure_seq = 0x0, last_reconfigure_seq = 0, n_txq = 0, n_rxq = 0,
node = 0x0, saved_flags_list = {
    prev = 0x0, next = 0x0}, flow_api = {p = 0x0}, dpif_type = 0x0,
hw_info = {oor = false, miss_api_supported = false, offload_count = 0,
pending_count = 0, offload_data = {p = 0x0}}}
(gdb) p *rte_flow_data->netdev
$8 = {name = 0x60b8eb0 "vxlan0", netdev_class = 0xcc5518
<vport_classes+1080>, auto_classified = false, ol_flags = 0,
mtu_user_config = false, ref_cnt = 8, change_seq = 4, reconfigure_seq
= 0x60b9120, last_reconfigure_seq = 1802,
  n_txq = 0, n_rxq = 0, node = 0x60b8da0, saved_flags_list = {prev =
0x60bb570, next = 0x60bb570}, flow_api = {p = 0xb7ede0
<netdev_offload_dpdk>}, dpif_type = 0xb7e42b "netdev", hw_info = {oor
= false, miss_api_supported = true,
    offload_count = 0, pending_count = 0, offload_data = {p = 0x60bc090}}}

There is probably something wrong with the physdev refcnt: the flow
still references a physdev whose fields are all zero, ref_cnt included.
It seems I am hitting an issue close to, but different from, yours.
Can you retest on the current master branch and confirm?
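
To make the suspected refcount problem concrete, here is a minimal
sketch of the situation described in the commit message. It is a toy
model, not OVS code: the toy_netdev and toy_flow structures and the
toy_flow_add(), toy_flow_del(), flush_own_map() and flush_all_related()
helpers are hypothetical names made up for illustration. It only
assumes what the commit message states: a flow is stored in its
vport's map and takes a reference on both the vport and the physdev.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the OVS structures. */
struct toy_netdev {
    const char *name;
    int ref_cnt;
};

struct toy_flow {
    struct toy_netdev *netdev;   /* Netdev whose map holds the flow. */
    struct toy_netdev *physdev;  /* Tracked orig-in-port doing the offload. */
    int live;
};

/* A flow takes one reference on its netdev and one on its physdev. */
static void
toy_flow_add(struct toy_flow *f, struct toy_netdev *netdev,
             struct toy_netdev *physdev)
{
    f->netdev = netdev;
    f->physdev = physdev;
    f->live = 1;
    netdev->ref_cnt++;
    physdev->ref_cnt++;
}

/* Deleting a flow drops both references exactly once. */
static void
toy_flow_del(struct toy_flow *f)
{
    if (!f->live) {
        return;
    }
    f->netdev->ref_cnt--;
    f->physdev->ref_cnt--;
    f->live = 0;
}

/* Flushes only the flows stored in 'dev's own map.
 * Returns the reference count left on 'dev'. */
static int
flush_own_map(struct toy_flow *flows, size_t n, struct toy_netdev *dev)
{
    for (size_t i = 0; i < n; i++) {
        if (flows[i].live && flows[i].netdev == dev) {
            toy_flow_del(&flows[i]);
        }
    }
    return dev->ref_cnt;
}

/* Flushes every flow that references 'dev', whichever map holds it.
 * Returns the reference count left on 'dev'. */
static int
flush_all_related(struct toy_flow *flows, size_t n, struct toy_netdev *dev)
{
    for (size_t i = 0; i < n; i++) {
        if (flows[i].live
            && (flows[i].netdev == dev || flows[i].physdev == dev)) {
            toy_flow_del(&flows[i]);
        }
    }
    return dev->ref_cnt;
}
```

With one flow stored on the physdev itself and one stored on a vport
but offloaded through that physdev, flush_own_map() on the physdev
leaves the vport flow's reference in place, while flush_all_related()
drops it. If the physdev is released while such a flow is still live,
a later delete of that flow dereferences a stale physdev pointer.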


>
> Fix it by flushing the physdev's offload flows in all related netdevs,
> e.g. the netdev itself, or for physical devices, all vports.
>
> Fixes: adbd4301a249 ("netdev-offload-dpdk: Use per-netdev offload metadata.")
> Reported-by: 15895987278 <wuxi_...@163.com>
> Signed-off-by: Eli Britstein <el...@nvidia.com>


-- 
David Marchand
