Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.
Hi Ilya, Testing this RFC series on master branch (kernel and dpdk) passed OK. I sent 2 small comments. Other than that - Acked-by Ophir Munk Acked-by Eli Britstein Regards, Ophir > -Original Message- > From: Ilya Maximets > Sent: Friday, December 6, 2019 4:35 PM > To: Ophir Munk ; Ilya Maximets > ; ovs-dev@openvswitch.org > Cc: Roni Bar Yanai ; Simon Horman > ; Ameer Mahagneh > ; Eli Britstein > Subject: Re: [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading > via > DPDK. > > On 06.12.2019 14:09, Ophir Munk wrote: > > Hi Ilya, > > I applied this series on master ("dpdk: Update to use DPDK 19.11.") and > PINGed between two "dpdk" ports (hw-offload=true). > > It worked fine. > > Then I watched flow statistics (dpif based) by executing the command in > (1): > > (1) watch ovs-appctl dpif/dump-flows -m It worked fine. > > Then I watched flow statistics (dpctl based) by executing the command in > (2): > > (2) watch ovs-appctl dpctl/dump-flows > > In this case OVS crashed after a few seconds. > > > > When inspecting the calls trace I notice that pmd->dp->dpif->dpif_class- > >type is a corrupted memory address. > > > > (gdb) bt > > #0 0x00c26fda in dpif_normalize_type (type=0x7372 > 0x7372 out of bounds>) at lib/dpif.c:517 > > #1 0x00c19864 in dp_netdev_flow_offload_put > > (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378 > > #2 0x00c19ca8 in dp_netdev_flow_offload_main > (data= > out>) at lib/dpif-netdev.c:2467 > > #3 0x00ca3c9d in ovsthread_wrapper (aux_=) at > > lib/ovs-thread.c:383 > > #4 0x7f12a2a08e25 in start_thread () from /lib64/libpthread.so.0 > > #5 0x7f12a222c34d in clone () from /lib64/libc.so.6 > > > > This scenario repeats whenever running the command in (2) and when the > code calls dpif_normalize_type(). > > It seems to me that the memory corruption was there for a while but only > now it is exposed due to your rfc series. > > > > In order to observe this memory corruption I added the following > printouts. > > I tested it with hw-offload=false: > > ovs-vsctl set Open_vSwitch . other_config:hw-offload=false I tested it > > on latest master without your rfc series. > > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index > > 1e54936..18a804a 100644 > > --- a/lib/dpif-netdev.c > > +++ b/lib/dpif-netdev.c > > @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct > dpif_flow_dump_thread *thread_, > > } > > } > > > > +VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif- > >dpif_class=%p \ > > + pmd->dp->dpif->full_name=%s", > > + pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class, > > + pmd->dp->dpif->full_name); > > do { > > for (n_flows = 0; n_flows < flow_limit; n_flows++) { > > struct cmap_node *node; > > > > > > Looking at the printouts the memory corruption is observed. > > > > Initially all printouts are identical and correct as expected. > > > > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > > pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 > > pmd->dp->dpif->full_name=netdev@ovs-netdev > > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > > pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 > > pmd->dp->dpif->full_name=netdev@ovs-netdev > > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > > pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 > > pmd->dp->dpif->full_name=netdev@ovs-netdev > > > > Around this point I executed the dpctl command in (2) and you can notice > that the pointers pmd->dp->dp_dpif and beyond became modified. > > > > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x251c790 > > pmd->dp->dpif->full_name=(null) > > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x33c03608 > > pmd->dp->dpif->full_name= > > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x2605700 > > pmd->dp->dpif->full_name= > > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0xe784250b > > pmd->dp->dpif->full_name=memseg-2048k-0-3 > > dpif_netdev|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 > > pmd->dp->dpif->dpif_class=0xef3e80 > > pmd->dp->dpif->full_name=netdev@ovs-netdev > > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > > pmd->dp->dpif=0x2602430 pmd->dp->dpif->dpif_class=0x2606200 > > pmd->dp->dpif->full_name=(null) > > > > The same happens if I test it with hw-offload=true. > > > > I hope this scenario can be recreated on other setups as well. > > This is interesting. I'll try to reproduce this issue on my setup. > > > I will look more into it. > > Thanks! > > Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org
Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.
Hi Ilya, I think the root cause for the issues mentioned below is the usage of pmd->dp->dpif->dpif_class rather than pmd->dp->class. Please see a fixing patch https://patchwork.ozlabs.org/patch/1205684/ ("dpif-netdev: Retrieve dpif_class from struct dp_netdev "). When running dpctl statistics there is a call to dpif_open(). As part of this call dp->dpif is assigned a new dpif pointer value. However pmd->dp->dpif should assume that dp->dpif is fixed throughout its life time. Hence it is better to access the dpif_class through fixed pointers that do not change (pmd->dp->class). With the fixing patch Eli issues are fixes. Also when I updated your RFCs series to use pmd->dp->class - there is no more crash and everything works fine. Ilya - can you please update the RFC series and send them as V1 patches? I will test them again and hopefully will be able to ACK them. Regards, Ophir > -Original Message- > From: Eli Britstein > Sent: Sunday, December 8, 2019 2:29 PM > To: Ilya Maximets ; Ophir Munk > ; ovs-dev@openvswitch.org > Cc: Roni Bar Yanai ; Simon Horman > ; Ameer Mahagneh > > Subject: Re: [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading > via > DPDK. > > > On 12/6/2019 4:34 PM, Ilya Maximets wrote: > > On 06.12.2019 14:09, Ophir Munk wrote: > >> Hi Ilya, > >> I applied this series on master ("dpdk: Update to use DPDK 19.11.") and > PINGed between two "dpdk" ports (hw-offload=true). > >> It worked fine. > >> Then I watched flow statistics (dpif based) by executing the command in > (1): > >> (1) watch ovs-appctl dpif/dump-flows -m It worked fine. > >> Then I watched flow statistics (dpctl based) by executing the command in > (2): > >> (2) watch ovs-appctl dpctl/dump-flows In this case OVS crashed after > >> a few seconds. > >> > >> When inspecting the calls trace I notice that pmd->dp->dpif->dpif_class- > >type is a corrupted memory address. > >> > >> (gdb) bt > >> #0 0x00c26fda in dpif_normalize_type (type=0x7372 >> 0x7372 out of bounds>) at lib/dpif.c:517 > >> #1 0x00c19864 in dp_netdev_flow_offload_put > >> (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378 > >> #2 0x00c19ca8 in dp_netdev_flow_offload_main > >> (data=) at lib/dpif-netdev.c:2467 > >> #3 0x00ca3c9d in ovsthread_wrapper (aux_=) > at > >> lib/ovs-thread.c:383 > >> #4 0x7f12a2a08e25 in start_thread () from /lib64/libpthread.so.0 > >> #5 0x7f12a222c34d in clone () from /lib64/libc.so.6 > >> > >> This scenario repeats whenever running the command in (2) and when > the code calls dpif_normalize_type(). > >> It seems to me that the memory corruption was there for a while but only > now it is exposed due to your rfc series. > >> > >> In order to observe this memory corruption I added the following > printouts. > >> I tested it with hw-offload=false: > >> ovs-vsctl set Open_vSwitch . other_config:hw-offload=false I tested > >> it on latest master without your rfc series. > >> > >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index > >> 1e54936..18a804a 100644 > >> --- a/lib/dpif-netdev.c > >> +++ b/lib/dpif-netdev.c > >> @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct > dpif_flow_dump_thread *thread_, > >> } > >> } > >> > >> +VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif- > >dpif_class=%p \ > >> + pmd->dp->dpif->full_name=%s", > >> + pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class, > >> + pmd->dp->dpif->full_name); > >> do { > >> for (n_flows = 0; n_flows < flow_limit; n_flows++) { > >> struct cmap_node *node; > >> > >> > >> Looking at the printouts the memory corruption is observed. > >> > >> Initially all printouts are identical and correct as expected. > >> > >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > >> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 > >> pmd->dp->dpif->full_name=netdev@ovs-netdev > >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > >> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 > >> pmd->dp->dpif->full_name=netdev@ovs-netdev > >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > >> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 > >> pmd->dp->dpif->full_name=netdev@ovs-netdev > >> > >> Around this point I executed the dpctl command in (2) and you can notice > that the pointers pmd->dp->dp_dpif and beyond became modified. > >> > >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > >> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x251c790 > >> pmd->dp->dpif->full_name=(null) > >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > >> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x33c03608 > >> pmd->dp->dpif->full_name= > >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 > >> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x2605700 > >> pmd->dp->dpif->full_name= > >>
Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.
On 12/6/2019 4:34 PM, Ilya Maximets wrote: > On 06.12.2019 14:09, Ophir Munk wrote: >> Hi Ilya, >> I applied this series on master ("dpdk: Update to use DPDK 19.11.") and >> PINGed between two "dpdk" ports (hw-offload=true). >> It worked fine. >> Then I watched flow statistics (dpif based) by executing the command in (1): >> (1) watch ovs-appctl dpif/dump-flows -m >> It worked fine. >> Then I watched flow statistics (dpctl based) by executing the command in (2): >> (2) watch ovs-appctl dpctl/dump-flows >> In this case OVS crashed after a few seconds. >> >> When inspecting the calls trace I notice that >> pmd->dp->dpif->dpif_class->type is a corrupted memory address. >> >> (gdb) bt >> #0 0x00c26fda in dpif_normalize_type (type=0x7372 > out of bounds>) at lib/dpif.c:517 >> #1 0x00c19864 in dp_netdev_flow_offload_put >> (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378 >> #2 0x00c19ca8 in dp_netdev_flow_offload_main (data=) >> at lib/dpif-netdev.c:2467 >> #3 0x00ca3c9d in ovsthread_wrapper (aux_=) at >> lib/ovs-thread.c:383 >> #4 0x7f12a2a08e25 in start_thread () from /lib64/libpthread.so.0 >> #5 0x7f12a222c34d in clone () from /lib64/libc.so.6 >> >> This scenario repeats whenever running the command in (2) and when the code >> calls dpif_normalize_type(). >> It seems to me that the memory corruption was there for a while but only now >> it is exposed due to your rfc series. >> >> In order to observe this memory corruption I added the following printouts. >> I tested it with hw-offload=false: >> ovs-vsctl set Open_vSwitch . other_config:hw-offload=false >> I tested it on latest master without your rfc series. >> >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >> index 1e54936..18a804a 100644 >> --- a/lib/dpif-netdev.c >> +++ b/lib/dpif-netdev.c >> @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct >> dpif_flow_dump_thread *thread_, >> } >> } >> >> +VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif->dpif_class=%p \ >> + pmd->dp->dpif->full_name=%s", >> + pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class, >> + pmd->dp->dpif->full_name); >> do { >> for (n_flows = 0; n_flows < flow_limit; n_flows++) { >> struct cmap_node *node; >> >> >> Looking at the printouts the memory corruption is observed. >> >> Initially all printouts are identical and correct as expected. >> >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 >> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 >> pmd->dp->dpif->full_name=netdev@ovs-netdev >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 >> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 >> pmd->dp->dpif->full_name=netdev@ovs-netdev >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 >> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 >> pmd->dp->dpif->full_name=netdev@ovs-netdev >> >> Around this point I executed the dpctl command in (2) and you can notice >> that the pointers pmd->dp->dp_dpif and beyond became modified. >> >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 >> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x251c790 >> pmd->dp->dpif->full_name=(null) >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 >> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x33c03608 >> pmd->dp->dpif->full_name= >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 >> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x2605700 >> pmd->dp->dpif->full_name= >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 >> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0xe784250b >> pmd->dp->dpif->full_name=memseg-2048k-0-3 >> dpif_netdev|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 >> pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 >> pmd->dp->dpif=0x2602430 pmd->dp->dpif->dpif_class=0x2606200 >> pmd->dp->dpif->full_name=(null) >> >> The same happens if I test it with hw-offload=true. >> >> I hope this scenario can be recreated on other setups as well. > This is interesting. I'll try to reproduce this issue on my setup. Hi Ilya, I see that this bug is not only with this new patch-set, but now also came into effect in master, with acceptance of 30115809da2e ("dpif-netdev: Use netdev-offload API for port lookup while offloading.") - port = dp_netdev_lookup_port(pmd->dp, in_port); + port = netdev_ports_get(in_port, pmd->dp->dpif->dpif_class); As of the corruption, the "port" is not retrieved by netdev_ports_get, and the offloading is not done. I have reverted this commits in my tests. Though this commit is correct (but should not have any effect before vport offloading), maybe we should revert it until this corruption bug is fixed. Thanks, Eli > >> I will look more into it. > Thanks! > >
Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.
On 06.12.2019 14:09, Ophir Munk wrote: > Hi Ilya, > I applied this series on master ("dpdk: Update to use DPDK 19.11.") and > PINGed between two "dpdk" ports (hw-offload=true). > It worked fine. > Then I watched flow statistics (dpif based) by executing the command in (1): > (1) watch ovs-appctl dpif/dump-flows -m > It worked fine. > Then I watched flow statistics (dpctl based) by executing the command in (2): > (2) watch ovs-appctl dpctl/dump-flows > In this case OVS crashed after a few seconds. > > When inspecting the calls trace I notice that pmd->dp->dpif->dpif_class->type > is a corrupted memory address. > > (gdb) bt > #0 0x00c26fda in dpif_normalize_type (type=0x7372 out of bounds>) at lib/dpif.c:517 > #1 0x00c19864 in dp_netdev_flow_offload_put > (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378 > #2 0x00c19ca8 in dp_netdev_flow_offload_main (data=) > at lib/dpif-netdev.c:2467 > #3 0x00ca3c9d in ovsthread_wrapper (aux_=) at > lib/ovs-thread.c:383 > #4 0x7f12a2a08e25 in start_thread () from /lib64/libpthread.so.0 > #5 0x7f12a222c34d in clone () from /lib64/libc.so.6 > > This scenario repeats whenever running the command in (2) and when the code > calls dpif_normalize_type(). > It seems to me that the memory corruption was there for a while but only now > it is exposed due to your rfc series. > > In order to observe this memory corruption I added the following printouts. > I tested it with hw-offload=false: > ovs-vsctl set Open_vSwitch . other_config:hw-offload=false > I tested it on latest master without your rfc series. > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 1e54936..18a804a 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct > dpif_flow_dump_thread *thread_, > } > } > > +VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif->dpif_class=%p \ > + pmd->dp->dpif->full_name=%s", > + pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class, > + pmd->dp->dpif->full_name); > do { > for (n_flows = 0; n_flows < flow_limit; n_flows++) { > struct cmap_node *node; > > > Looking at the printouts the memory corruption is observed. > > Initially all printouts are identical and correct as expected. > > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 > pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 > pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 > pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev > > Around this point I executed the dpctl command in (2) and you can notice that > the pointers pmd->dp->dp_dpif and beyond became modified. > > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 > pmd->dp->dpif->dpif_class=0x251c790 pmd->dp->dpif->full_name=(null) > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 > pmd->dp->dpif->dpif_class=0x33c03608 pmd->dp->dpif->full_name= > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 > pmd->dp->dpif->dpif_class=0x2605700 pmd->dp->dpif->full_name= > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 > pmd->dp->dpif->dpif_class=0xe784250b pmd->dp->dpif->full_name=memseg-2048k-0-3 > dpif_netdev|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 > pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 > pmd->dp->dpif->dpif_class=0x2606200 pmd->dp->dpif->full_name=(null) > > The same happens if I test it with hw-offload=true. > > I hope this scenario can be recreated on other setups as well. This is interesting. I'll try to reproduce this issue on my setup. > I will look more into it. Thanks! Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.
Hi Ilya, I applied this series on master ("dpdk: Update to use DPDK 19.11.") and PINGed between two "dpdk" ports (hw-offload=true). It worked fine. Then I watched flow statistics (dpif based) by executing the command in (1): (1) watch ovs-appctl dpif/dump-flows -m It worked fine. Then I watched flow statistics (dpctl based) by executing the command in (2): (2) watch ovs-appctl dpctl/dump-flows In this case OVS crashed after a few seconds. When inspecting the calls trace I notice that pmd->dp->dpif->dpif_class->type is a corrupted memory address. (gdb) bt #0 0x00c26fda in dpif_normalize_type (type=0x7372 ) at lib/dpif.c:517 #1 0x00c19864 in dp_netdev_flow_offload_put (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378 #2 0x00c19ca8 in dp_netdev_flow_offload_main (data=) at lib/dpif-netdev.c:2467 #3 0x00ca3c9d in ovsthread_wrapper (aux_=) at lib/ovs-thread.c:383 #4 0x7f12a2a08e25 in start_thread () from /lib64/libpthread.so.0 #5 0x7f12a222c34d in clone () from /lib64/libc.so.6 This scenario repeats whenever running the command in (2) and when the code calls dpif_normalize_type(). It seems to me that the memory corruption was there for a while but only now it is exposed due to your rfc series. In order to observe this memory corruption I added the following printouts. I tested it with hw-offload=false: ovs-vsctl set Open_vSwitch . other_config:hw-offload=false I tested it on latest master without your rfc series. diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 1e54936..18a804a 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct dpif_flow_dump_thread *thread_, } } +VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif->dpif_class=%p \ + pmd->dp->dpif->full_name=%s", + pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class, + pmd->dp->dpif->full_name); do { for (n_flows = 0; n_flows < flow_limit; n_flows++) { struct cmap_node *node; Looking at the printouts the memory corruption is observed. Initially all printouts are identical and correct as expected. dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev Around this point I executed the dpctl command in (2) and you can notice that the pointers pmd->dp->dp_dpif and beyond became modified. dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x251c790 pmd->dp->dpif->full_name=(null) dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x33c03608 pmd->dp->dpif->full_name= dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x2605700 pmd->dp->dpif->full_name= dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0xe784250b pmd->dp->dpif->full_name=memseg-2048k-0-3 dpif_netdev|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 pmd->dp->dpif->dpif_class=0x2606200 pmd->dp->dpif->full_name=(null) The same happens if I test it with hw-offload=true. I hope this scenario can be recreated on other setups as well. I will look more into it. Any insights on this issue would be appreciated. Regards, Ophir > -Original Message- > From: Ilya Maximets > Sent: Tuesday, December 3, 2019 1:13 PM > To: ovs-dev@openvswitch.org > Cc: Ophir Munk ; Roni Bar Yanai > ; Simon Horman > ; Ilya Maximets > Subject: [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via > DPDK. > > These patches are necessary to enable vxlan offloading for userspace > datapath via DPDK flow offloading API (not tested). > They allows to properly distinguish userspace tunnels from the system > tunnels on netdev level in order to choose appropriate flow offloading API > provider. Before these patches it was not possible because tunneling ports > are implemented by the same netdev-vport classes and uses same dpif port > names, so only ofproto really knows which netdevs assigned to each dpif. > > RFC v3: > * Rebase on current master. > > RFC v2: > * Added 2 patches for using dpif type. > * Last patch updated to use netdev dpif_type instead of traversing > dpif ports. > > > Ilya Maximets (4): > netdev: Allow storing dpif type into netdev structure.
Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.
On Tue, Dec 03, 2019 at 12:13:11PM +0100, Ilya Maximets wrote: > These patches are necessary to enable vxlan offloading for > userspace datapath via DPDK flow offloading API (not tested). > They allows to properly distinguish userspace tunnels from the > system tunnels on netdev level in order to choose appropriate > flow offloading API provider. Before these patches it was not > possible because tunneling ports are implemented by the same > netdev-vport classes and uses same dpif port names, so only > ofproto really knows which netdevs assigned to each dpif. This series looks good to me. Reviewed-by: Simon Horman > > RFC v3: > * Rebase on current master. > > RFC v2: > * Added 2 patches for using dpif type. > * Last patch updated to use netdev dpif_type instead of traversing > dpif ports. > > > Ilya Maximets (4): > netdev: Allow storing dpif type into netdev structure. > netdev-offload: Use dpif type instead of class. > netdev-offload: Allow offloading to netdev without ifindex. > netdev-offload: Disallow offloading to unrelated tunneling vports. > > lib/dpif-netdev.c | 13 +++ > lib/dpif-netlink.c| 23 ++-- > lib/dpif.c| 21 ++- > lib/netdev-offload-dpdk.c | 8 + > lib/netdev-offload-tc.c | 12 +-- > lib/netdev-offload.c | 68 ++- > lib/netdev-offload.h | 16 - > lib/netdev-provider.h | 3 +- > lib/netdev.c | 16 + > lib/netdev.h | 2 ++ > ofproto/ofproto-dpif-upcall.c | 5 ++- > 11 files changed, 114 insertions(+), 73 deletions(-) > > -- > 2.17.1 > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.
These patches are necessary to enable vxlan offloading for userspace datapath via DPDK flow offloading API (not tested). They allows to properly distinguish userspace tunnels from the system tunnels on netdev level in order to choose appropriate flow offloading API provider. Before these patches it was not possible because tunneling ports are implemented by the same netdev-vport classes and uses same dpif port names, so only ofproto really knows which netdevs assigned to each dpif. RFC v3: * Rebase on current master. RFC v2: * Added 2 patches for using dpif type. * Last patch updated to use netdev dpif_type instead of traversing dpif ports. Ilya Maximets (4): netdev: Allow storing dpif type into netdev structure. netdev-offload: Use dpif type instead of class. netdev-offload: Allow offloading to netdev without ifindex. netdev-offload: Disallow offloading to unrelated tunneling vports. lib/dpif-netdev.c | 13 +++ lib/dpif-netlink.c| 23 ++-- lib/dpif.c| 21 ++- lib/netdev-offload-dpdk.c | 8 + lib/netdev-offload-tc.c | 12 +-- lib/netdev-offload.c | 68 ++- lib/netdev-offload.h | 16 - lib/netdev-provider.h | 3 +- lib/netdev.c | 16 + lib/netdev.h | 2 ++ ofproto/ofproto-dpif-upcall.c | 5 ++- 11 files changed, 114 insertions(+), 73 deletions(-) -- 2.17.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev