Hi Ilya,

Testing this RFC series on master branch (kernel and dpdk) passed OK.
I sent 2 small comments. Other than that:

Acked-by: Ophir Munk <ophi...@mellanox.com>
Acked-by: Eli Britstein <el...@mellanox.com>

Regards,
Ophir

> -----Original Message-----
> From: Ilya Maximets <i.maxim...@ovn.org>
> Sent: Friday, December 6, 2019 4:35 PM
> To: Ophir Munk <ophi...@mellanox.com>; Ilya Maximets <i.maxim...@ovn.org>; ovs-dev@openvswitch.org
> Cc: Roni Bar Yanai <ron...@mellanox.com>; Simon Horman <simon.hor...@netronome.com>; Ameer Mahagneh <ame...@mellanox.com>; Eli Britstein <el...@mellanox.com>
> Subject: Re: [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.
>
> On 06.12.2019 14:09, Ophir Munk wrote:
> > Hi Ilya,
> > I applied this series on master ("dpdk: Update to use DPDK 19.11.") and PINGed between two "dpdk" ports (hw-offload=true).
> > It worked fine.
> > Then I watched flow statistics (dpif based) by executing the command in (1):
> > (1) watch ovs-appctl dpif/dump-flows -m <bridge name>
> > It worked fine.
> > Then I watched flow statistics (dpctl based) by executing the command in (2):
> > (2) watch ovs-appctl dpctl/dump-flows
> > In this case OVS crashed after a few seconds.
> >
> > When inspecting the calls trace I notice that pmd->dp->dpif->dpif_class->type is a corrupted memory address.
> >
> > (gdb) bt
> > #0 0x0000000000c26fda in dpif_normalize_type (type=0x7372 <Address 0x7372 out of bounds>) at lib/dpif.c:517
> > #1 0x0000000000c19864 in dp_netdev_flow_offload_put (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378
> > #2 0x0000000000c19ca8 in dp_netdev_flow_offload_main (data=<optimized out>) at lib/dpif-netdev.c:2467
> > #3 0x0000000000ca3c9d in ovsthread_wrapper (aux_=<optimized out>) at lib/ovs-thread.c:383
> > #4 0x00007f12a2a08e25 in start_thread () from /lib64/libpthread.so.0
> > #5 0x00007f12a222c34d in clone () from /lib64/libc.so.6
> >
> > This scenario repeats whenever running the command in (2) and when the code calls dpif_normalize_type().
> > It seems to me that the memory corruption was there for a while but only now it is exposed due to your rfc series.
> >
> > In order to observe this memory corruption I added the following printouts.
> > I tested it with hw-offload=false:
> > ovs-vsctl set Open_vSwitch . other_config:hw-offload=false
> > I tested it on latest master without your rfc series.
> >
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > index 1e54936..18a804a 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct dpif_flow_dump_thread *thread_,
> >          }
> >      }
> >
> > +    VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif->dpif_class=%p \
> > +             pmd->dp->dpif->full_name=%s",
> > +             pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class,
> > +             pmd->dp->dpif->full_name);
> >      do {
> >          for (n_flows = 0; n_flows < flow_limit; n_flows++) {
> >              struct cmap_node *node;
> >
> >
> > Looking at the printouts the memory corruption is observed.
> >
> > Initially all printouts are identical and correct as expected.
> >
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
> >
> > Around this point I executed the dpctl command in (2) and you can notice that the pointer pmd->dp->dpif and everything reached through it changed.
> >
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x251c790 pmd->dp->dpif->full_name=(null)
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x33c03608 pmd->dp->dpif->full_name=
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x2605700 pmd->dp->dpif->full_name=
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0xe784250b pmd->dp->dpif->full_name=memseg-2048k-0-3
> > dpif_netdev|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 pmd->dp->dpif->dpif_class=0x2606200 pmd->dp->dpif->full_name=(null)
> >
> > The same happens if I test it with hw-offload=true.
> >
> > I hope this scenario can be recreated on other setups as well.
>
> This is interesting. I'll try to reproduce this issue on my setup.
>
> > I will look more into it.
>
> Thanks!
>
> Best regards, Ilya Maximets.
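
In the printouts above, pmd->dp stays at the same address while pmd->dp->dpif changes right after the dpctl command, and the fields read through it then vary from one printout to the next. One pattern that could produce this, offered only as a hypothesis that the thread does not confirm, is a long-lived object keeping a back-pointer to whichever handle was opened last, with a short-lived handle then being opened and closed. The sketch below models that pattern with hypothetical names (struct handle, struct datapath, handle_open, handle_close, datapath_backpointer_is_stale); none of them are OVS code. It uses a static pool with an in_use flag instead of malloc/free so that checking for a stale pointer stays well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; NOT the OVS structures. */
struct handle {
    bool in_use;             /* false once the handle has been closed */
    const char *full_name;   /* cleared on close, like freed memory */
};

struct datapath {
    struct handle *last_handle;   /* back-pointer, overwritten on every open */
};

#define MAX_HANDLES 4
static struct handle pool[MAX_HANDLES];

/* Opens a new handle on 'dp'.  Every open overwrites the back-pointer,
 * which is the assumed source of the problem in this model. */
static struct handle *
handle_open(struct datapath *dp, const char *name)
{
    for (size_t i = 0; i < MAX_HANDLES; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = true;
            pool[i].full_name = name;
            dp->last_handle = &pool[i];
            return &pool[i];
        }
    }
    return NULL;   /* pool exhausted */
}

/* Closes a handle.  The datapath's back-pointer is deliberately left
 * untouched, so it may now refer to a closed handle. */
static void
handle_close(struct handle *h)
{
    assert(h && h->in_use);
    h->in_use = false;
    h->full_name = NULL;
}

/* Returns true if the datapath still points at a handle that was closed. */
static bool
datapath_backpointer_is_stale(const struct datapath *dp)
{
    return dp->last_handle && !dp->last_handle->in_use;
}
```

In this model, opening a first handle, then opening and closing a second one, leaves datapath_backpointer_is_stale() returning true even though the first handle is still open. With real heap allocation the closed handle's memory would be reused, which would match fields such as dpif_class and full_name changing between printouts.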