Hi Ilya,
Testing this RFC series on master branch (kernel and dpdk) passed OK.
I sent 2 small comments.
Other than that - 
Acked-by Ophir Munk <ophi...@mellanox.com>
Acked-by Eli Britstein <el...@mellanox.com>

Regards,
Ophir


> -----Original Message-----
> From: Ilya Maximets <i.maxim...@ovn.org>
> Sent: Friday, December 6, 2019 4:35 PM
> To: Ophir Munk <ophi...@mellanox.com>; Ilya Maximets
> <i.maxim...@ovn.org>; ovs-dev@openvswitch.org
> Cc: Roni Bar Yanai <ron...@mellanox.com>; Simon Horman
> <simon.hor...@netronome.com>; Ameer Mahagneh
> <ame...@mellanox.com>; Eli Britstein <el...@mellanox.com>
> Subject: Re: [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading 
> via
> DPDK.
> 
> On 06.12.2019 14:09, Ophir Munk wrote:
> > Hi Ilya,
> > I applied this series on master ("dpdk: Update to use DPDK 19.11.") and
> PINGed between two "dpdk" ports (hw-offload=true).
> > It worked fine.
> > Then I watched flow statistics (dpif based) by executing the command in
> (1):
> > (1) watch ovs-appctl dpif/dump-flows -m <bridge name> It worked fine.
> > Then I watched flow statistics (dpctl based) by executing the command in
> (2):
> > (2) watch ovs-appctl dpctl/dump-flows
> > In this case OVS crashed after a few seconds.
> >
> > When inspecting the calls trace I notice that pmd->dp->dpif->dpif_class-
> >type is a corrupted memory address.
> >
> > (gdb) bt
> > #0  0x0000000000c26fda in dpif_normalize_type (type=0x7372 <Address
> > 0x7372 out of bounds>) at lib/dpif.c:517
> > #1  0x0000000000c19864 in dp_netdev_flow_offload_put
> > (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378
> > #2  0x0000000000c19ca8 in dp_netdev_flow_offload_main
> (data=<optimized
> > out>) at lib/dpif-netdev.c:2467
> > #3  0x0000000000ca3c9d in ovsthread_wrapper (aux_=<optimized out>) at
> > lib/ovs-thread.c:383
> > #4  0x00007f12a2a08e25 in start_thread () from /lib64/libpthread.so.0
> > #5  0x00007f12a222c34d in clone () from /lib64/libc.so.6
> >
> > This scenario repeats whenever running the command in (2) and when the
> code calls dpif_normalize_type().
> > It seems to me that the memory corruption was there for a while but only
> now it is exposed due to your rfc series.
> >
> > In order to observe this memory corruption I added the following
> printouts.
> > I tested it with hw-offload=false:
> > ovs-vsctl set Open_vSwitch . other_config:hw-offload=false I tested it
> > on latest master without your rfc series.
> >
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> > 1e54936..18a804a 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct
> dpif_flow_dump_thread *thread_,
> >              }
> >          }
> >
> > +        VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif-
> >dpif_class=%p \
> > +                 pmd->dp->dpif->full_name=%s",
> > +                 pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class,
> > +                 pmd->dp->dpif->full_name);
> >          do {
> >              for (n_flows = 0; n_flows < flow_limit; n_flows++) {
> >                  struct cmap_node *node;
> >
> >
> > Looking at the printouts the memory corruption is observed.
> >
> > Initially all printouts are identical and correct as expected.
> >
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80
> > pmd->dp->dpif->full_name=netdev@ovs-netdev
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80
> > pmd->dp->dpif->full_name=netdev@ovs-netdev
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80
> > pmd->dp->dpif->full_name=netdev@ovs-netdev
> >
> > Around this point I executed the dpctl command in (2) and you can notice
> that the pointers pmd->dp->dp_dpif and beyond became modified.
> >
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x251c790
> > pmd->dp->dpif->full_name=(null)
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x33c03608
> > pmd->dp->dpif->full_name=
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x2605700
> > pmd->dp->dpif->full_name=
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0xe784250b
> > pmd->dp->dpif->full_name=memseg-2048k-0-3
> > dpif_netdev|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430
> > pmd->dp->dpif->dpif_class=0xef3e80
> > pmd->dp->dpif->full_name=netdev@ovs-netdev
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x2602430 pmd->dp->dpif->dpif_class=0x2606200
> > pmd->dp->dpif->full_name=(null)
> >
> > The same happens if I test it with hw-offload=true.
> >
> > I hope this scenario can be recreated on other setups as well.
> 
> This is interesting.  I'll try to reproduce this issue on my setup.
> 
> > I will look more into it.
> 
> Thanks!
> 
> Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to