Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.

2019-12-11 Thread Ophir Munk
Hi Ilya,
Testing this RFC series on master branch (kernel and dpdk) passed OK.
I sent 2 small comments.
Other than that - 
Acked-by Ophir Munk 
Acked-by Eli Britstein 

Regards,
Ophir


> -Original Message-
> From: Ilya Maximets 
> Sent: Friday, December 6, 2019 4:35 PM
> To: Ophir Munk ; Ilya Maximets
> ; ovs-dev@openvswitch.org
> Cc: Roni Bar Yanai ; Simon Horman
> ; Ameer Mahagneh
> ; Eli Britstein 
> Subject: Re: [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading 
> via
> DPDK.
> 
> On 06.12.2019 14:09, Ophir Munk wrote:
> > Hi Ilya,
> > I applied this series on master ("dpdk: Update to use DPDK 19.11.") and
> PINGed between two "dpdk" ports (hw-offload=true).
> > It worked fine.
> > Then I watched flow statistics (dpif based) by executing the command in
> (1):
> > (1) watch ovs-appctl dpif/dump-flows -m  It worked fine.
> > Then I watched flow statistics (dpctl based) by executing the command in
> (2):
> > (2) watch ovs-appctl dpctl/dump-flows
> > In this case OVS crashed after a few seconds.
> >
> > When inspecting the calls trace I notice that pmd->dp->dpif->dpif_class-
> >type is a corrupted memory address.
> >
> > (gdb) bt
> > #0  0x00c26fda in dpif_normalize_type (type=0x7372  > 0x7372 out of bounds>) at lib/dpif.c:517
> > #1  0x00c19864 in dp_netdev_flow_offload_put
> > (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378
> > #2  0x00c19ca8 in dp_netdev_flow_offload_main
> (data= > out>) at lib/dpif-netdev.c:2467
> > #3  0x00ca3c9d in ovsthread_wrapper (aux_=) at
> > lib/ovs-thread.c:383
> > #4  0x7f12a2a08e25 in start_thread () from /lib64/libpthread.so.0
> > #5  0x7f12a222c34d in clone () from /lib64/libc.so.6
> >
> > This scenario repeats whenever running the command in (2) and when the
> code calls dpif_normalize_type().
> > It seems to me that the memory corruption was there for a while but only
> now it is exposed due to your rfc series.
> >
> > In order to observe this memory corruption I added the following
> printouts.
> > I tested it with hw-offload=false:
> > ovs-vsctl set Open_vSwitch . other_config:hw-offload=false I tested it
> > on latest master without your rfc series.
> >
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> > 1e54936..18a804a 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct
> dpif_flow_dump_thread *thread_,
> >  }
> >  }
> >
> > +VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif-
> >dpif_class=%p \
> > + pmd->dp->dpif->full_name=%s",
> > + pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class,
> > + pmd->dp->dpif->full_name);
> >  do {
> >  for (n_flows = 0; n_flows < flow_limit; n_flows++) {
> >  struct cmap_node *node;
> >
> >
> > Looking at the printouts the memory corruption is observed.
> >
> > Initially all printouts are identical and correct as expected.
> >
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80
> > pmd->dp->dpif->full_name=netdev@ovs-netdev
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80
> > pmd->dp->dpif->full_name=netdev@ovs-netdev
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80
> > pmd->dp->dpif->full_name=netdev@ovs-netdev
> >
> > Around this point I executed the dpctl command in (2) and you can notice
> that the pointers pmd->dp->dp_dpif and beyond became modified.
> >
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x251c790
> > pmd->dp->dpif->full_name=(null)
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x33c03608
> > pmd->dp->dpif->full_name=
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x2605700
> > pmd->dp->dpif->full_name=
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0xe784250b
> > pmd->dp->dpif->full_name=memseg-2048k-0-3
> > dpif_netdev|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430
> > pmd->dp->dpif->dpif_class=0xef3e80
> > pmd->dp->dpif->full_name=netdev@ovs-netdev
> > dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> > pmd->dp->dpif=0x2602430 pmd->dp->dpif->dpif_class=0x2606200
> > pmd->dp->dpif->full_name=(null)
> >
> > The same happens if I test it with hw-offload=true.
> >
> > I hope this scenario can be recreated on other setups as well.
> 
> This is interesting.  I'll try to reproduce this issue on my setup.
> 
> > I will look more into it.
> 
> Thanks!
> 
> Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org

Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.

2019-12-08 Thread Ophir Munk
Hi Ilya,
I think the root cause for the issues mentioned below is the usage of 
pmd->dp->dpif->dpif_class
rather than pmd->dp->class. Please see a fixing patch 
https://patchwork.ozlabs.org/patch/1205684/
("dpif-netdev: Retrieve dpif_class from struct dp_netdev ").

When running dpctl statistics there is a call to dpif_open(). As part of this 
call dp->dpif is assigned a new dpif pointer value.
However pmd->dp->dpif should assume that dp->dpif is fixed throughout its life 
time. 
Hence it is better to access the dpif_class through fixed pointers that do not 
change (pmd->dp->class).

With the fixing patch Eli issues are fixes.
Also when I updated your RFCs series to use pmd->dp->class - there is no more 
crash and everything works fine.
Ilya - can you please update the RFC series and send them as V1 patches? 
I will test them again and hopefully will be able to ACK them.

Regards,
Ophir

> -Original Message-
> From: Eli Britstein 
> Sent: Sunday, December 8, 2019 2:29 PM
> To: Ilya Maximets ; Ophir Munk
> ; ovs-dev@openvswitch.org
> Cc: Roni Bar Yanai ; Simon Horman
> ; Ameer Mahagneh
> 
> Subject: Re: [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading 
> via
> DPDK.
> 
> 
> On 12/6/2019 4:34 PM, Ilya Maximets wrote:
> > On 06.12.2019 14:09, Ophir Munk wrote:
> >> Hi Ilya,
> >> I applied this series on master ("dpdk: Update to use DPDK 19.11.") and
> PINGed between two "dpdk" ports (hw-offload=true).
> >> It worked fine.
> >> Then I watched flow statistics (dpif based) by executing the command in
> (1):
> >> (1) watch ovs-appctl dpif/dump-flows -m  It worked fine.
> >> Then I watched flow statistics (dpctl based) by executing the command in
> (2):
> >> (2) watch ovs-appctl dpctl/dump-flows In this case OVS crashed after
> >> a few seconds.
> >>
> >> When inspecting the calls trace I notice that pmd->dp->dpif->dpif_class-
> >type is a corrupted memory address.
> >>
> >> (gdb) bt
> >> #0  0x00c26fda in dpif_normalize_type (type=0x7372  >> 0x7372 out of bounds>) at lib/dpif.c:517
> >> #1  0x00c19864 in dp_netdev_flow_offload_put
> >> (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378
> >> #2  0x00c19ca8 in dp_netdev_flow_offload_main
> >> (data=) at lib/dpif-netdev.c:2467
> >> #3  0x00ca3c9d in ovsthread_wrapper (aux_=)
> at
> >> lib/ovs-thread.c:383
> >> #4  0x7f12a2a08e25 in start_thread () from /lib64/libpthread.so.0
> >> #5  0x7f12a222c34d in clone () from /lib64/libc.so.6
> >>
> >> This scenario repeats whenever running the command in (2) and when
> the code calls dpif_normalize_type().
> >> It seems to me that the memory corruption was there for a while but only
> now it is exposed due to your rfc series.
> >>
> >> In order to observe this memory corruption I added the following
> printouts.
> >> I tested it with hw-offload=false:
> >> ovs-vsctl set Open_vSwitch . other_config:hw-offload=false I tested
> >> it on latest master without your rfc series.
> >>
> >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> >> 1e54936..18a804a 100644
> >> --- a/lib/dpif-netdev.c
> >> +++ b/lib/dpif-netdev.c
> >> @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct
> dpif_flow_dump_thread *thread_,
> >>   }
> >>   }
> >>
> >> +VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif-
> >dpif_class=%p \
> >> + pmd->dp->dpif->full_name=%s",
> >> + pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class,
> >> + pmd->dp->dpif->full_name);
> >>   do {
> >>   for (n_flows = 0; n_flows < flow_limit; n_flows++) {
> >>   struct cmap_node *node;
> >>
> >>
> >> Looking at the printouts the memory corruption is observed.
> >>
> >> Initially all printouts are identical and correct as expected.
> >>
> >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> >> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80
> >> pmd->dp->dpif->full_name=netdev@ovs-netdev
> >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> >> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80
> >> pmd->dp->dpif->full_name=netdev@ovs-netdev
> >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> >> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80
> >> pmd->dp->dpif->full_name=netdev@ovs-netdev
> >>
> >> Around this point I executed the dpctl command in (2) and you can notice
> that the pointers pmd->dp->dp_dpif and beyond became modified.
> >>
> >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> >> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x251c790
> >> pmd->dp->dpif->full_name=(null)
> >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> >> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x33c03608
> >> pmd->dp->dpif->full_name=
> >> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010
> >> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x2605700
> >> pmd->dp->dpif->full_name=
> >> 

Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.

2019-12-08 Thread Eli Britstein

On 12/6/2019 4:34 PM, Ilya Maximets wrote:
> On 06.12.2019 14:09, Ophir Munk wrote:
>> Hi Ilya,
>> I applied this series on master ("dpdk: Update to use DPDK 19.11.") and 
>> PINGed between two "dpdk" ports (hw-offload=true).
>> It worked fine.
>> Then I watched flow statistics (dpif based) by executing the command in (1):
>> (1) watch ovs-appctl dpif/dump-flows -m 
>> It worked fine.
>> Then I watched flow statistics (dpctl based) by executing the command in (2):
>> (2) watch ovs-appctl dpctl/dump-flows
>> In this case OVS crashed after a few seconds.
>>
>> When inspecting the calls trace I notice that 
>> pmd->dp->dpif->dpif_class->type is a corrupted memory address.
>>
>> (gdb) bt
>> #0  0x00c26fda in dpif_normalize_type (type=0x7372 > out of bounds>) at lib/dpif.c:517
>> #1  0x00c19864 in dp_netdev_flow_offload_put 
>> (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378
>> #2  0x00c19ca8 in dp_netdev_flow_offload_main (data=) 
>> at lib/dpif-netdev.c:2467
>> #3  0x00ca3c9d in ovsthread_wrapper (aux_=) at 
>> lib/ovs-thread.c:383
>> #4  0x7f12a2a08e25 in start_thread () from /lib64/libpthread.so.0
>> #5  0x7f12a222c34d in clone () from /lib64/libc.so.6
>>
>> This scenario repeats whenever running the command in (2) and when the code 
>> calls dpif_normalize_type().
>> It seems to me that the memory corruption was there for a while but only now 
>> it is exposed due to your rfc series.
>>
>> In order to observe this memory corruption I added the following printouts.
>> I tested it with hw-offload=false:
>> ovs-vsctl set Open_vSwitch . other_config:hw-offload=false
>> I tested it on latest master without your rfc series.
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 1e54936..18a804a 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct 
>> dpif_flow_dump_thread *thread_,
>>   }
>>   }
>>   
>> +VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif->dpif_class=%p \
>> + pmd->dp->dpif->full_name=%s",
>> + pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class,
>> + pmd->dp->dpif->full_name);
>>   do {
>>   for (n_flows = 0; n_flows < flow_limit; n_flows++) {
>>   struct cmap_node *node;
>>
>>
>> Looking at the printouts the memory corruption is observed.
>>
>> Initially all printouts are identical and correct as expected.
>>
>> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 
>> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 
>> pmd->dp->dpif->full_name=netdev@ovs-netdev
>> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 
>> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 
>> pmd->dp->dpif->full_name=netdev@ovs-netdev
>> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 
>> pmd->dp->dpif=0x2525e70 pmd->dp->dpif->dpif_class=0xef3e80 
>> pmd->dp->dpif->full_name=netdev@ovs-netdev
>>
>> Around this point I executed the dpctl command in (2) and you can notice 
>> that the pointers pmd->dp->dp_dpif and beyond became modified.
>>
>> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 
>> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x251c790 
>> pmd->dp->dpif->full_name=(null)
>> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 
>> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x33c03608 
>> pmd->dp->dpif->full_name=
>> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 
>> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0x2605700 
>> pmd->dp->dpif->full_name=
>> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 
>> pmd->dp->dpif=0x25c3280 pmd->dp->dpif->dpif_class=0xe784250b 
>> pmd->dp->dpif->full_name=memseg-2048k-0-3
>> dpif_netdev|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 
>> pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
>> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 
>> pmd->dp->dpif=0x2602430 pmd->dp->dpif->dpif_class=0x2606200 
>> pmd->dp->dpif->full_name=(null)
>>
>> The same happens if I test it with hw-offload=true.
>>
>> I hope this scenario can be recreated on other setups as well.
> This is interesting.  I'll try to reproduce this issue on my setup.

Hi Ilya,

I see that this bug is not only with this new patch-set, but now also 
came into effect in master, with acceptance of

30115809da2e ("dpif-netdev: Use netdev-offload API for port lookup while 
offloading.")

-    port = dp_netdev_lookup_port(pmd->dp, in_port);
+    port = netdev_ports_get(in_port, pmd->dp->dpif->dpif_class);
As of the corruption, the "port" is not retrieved by netdev_ports_get, 
and the offloading is not done. I have reverted this commits in my tests.

Though this commit is correct (but should not have any effect before 
vport offloading), maybe we should revert it until this corruption bug 
is fixed.

Thanks,

Eli

>
>> I will look more into it.
> Thanks!
>
> 

Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.

2019-12-06 Thread Ilya Maximets
On 06.12.2019 14:09, Ophir Munk wrote:
> Hi Ilya,
> I applied this series on master ("dpdk: Update to use DPDK 19.11.") and 
> PINGed between two "dpdk" ports (hw-offload=true).
> It worked fine.
> Then I watched flow statistics (dpif based) by executing the command in (1):
> (1) watch ovs-appctl dpif/dump-flows -m 
> It worked fine.
> Then I watched flow statistics (dpctl based) by executing the command in (2):
> (2) watch ovs-appctl dpctl/dump-flows
> In this case OVS crashed after a few seconds.
> 
> When inspecting the calls trace I notice that pmd->dp->dpif->dpif_class->type 
> is a corrupted memory address.
> 
> (gdb) bt
> #0  0x00c26fda in dpif_normalize_type (type=0x7372  out of bounds>) at lib/dpif.c:517
> #1  0x00c19864 in dp_netdev_flow_offload_put 
> (offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378
> #2  0x00c19ca8 in dp_netdev_flow_offload_main (data=) 
> at lib/dpif-netdev.c:2467
> #3  0x00ca3c9d in ovsthread_wrapper (aux_=) at 
> lib/ovs-thread.c:383
> #4  0x7f12a2a08e25 in start_thread () from /lib64/libpthread.so.0
> #5  0x7f12a222c34d in clone () from /lib64/libc.so.6
> 
> This scenario repeats whenever running the command in (2) and when the code 
> calls dpif_normalize_type().
> It seems to me that the memory corruption was there for a while but only now 
> it is exposed due to your rfc series.
> 
> In order to observe this memory corruption I added the following printouts.
> I tested it with hw-offload=false: 
> ovs-vsctl set Open_vSwitch . other_config:hw-offload=false
> I tested it on latest master without your rfc series.
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 1e54936..18a804a 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct 
> dpif_flow_dump_thread *thread_,
>  }
>  }
>  
> +VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif->dpif_class=%p \
> + pmd->dp->dpif->full_name=%s",
> + pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class,
> + pmd->dp->dpif->full_name);
>  do {
>  for (n_flows = 0; n_flows < flow_limit; n_flows++) {
>  struct cmap_node *node;
> 
> 
> Looking at the printouts the memory corruption is observed.
> 
> Initially all printouts are identical and correct as expected.
> 
> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 
> pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 
> pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 
> pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
> 
> Around this point I executed the dpctl command in (2) and you can notice that 
> the pointers pmd->dp->dp_dpif and beyond became modified.
> 
> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 
> pmd->dp->dpif->dpif_class=0x251c790 pmd->dp->dpif->full_name=(null)
> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 
> pmd->dp->dpif->dpif_class=0x33c03608 pmd->dp->dpif->full_name=
> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 
> pmd->dp->dpif->dpif_class=0x2605700 pmd->dp->dpif->full_name=
> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 
> pmd->dp->dpif->dpif_class=0xe784250b pmd->dp->dpif->full_name=memseg-2048k-0-3
> dpif_netdev|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 
> pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
> dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 
> pmd->dp->dpif->dpif_class=0x2606200 pmd->dp->dpif->full_name=(null)
> 
> The same happens if I test it with hw-offload=true.
> 
> I hope this scenario can be recreated on other setups as well.

This is interesting.  I'll try to reproduce this issue on my setup.

> I will look more into it.

Thanks!

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.

2019-12-06 Thread Ophir Munk
Hi Ilya,
I applied this series on master ("dpdk: Update to use DPDK 19.11.") and PINGed 
between two "dpdk" ports (hw-offload=true).
It worked fine.
Then I watched flow statistics (dpif based) by executing the command in (1):
(1) watch ovs-appctl dpif/dump-flows -m 
It worked fine.
Then I watched flow statistics (dpctl based) by executing the command in (2):
(2) watch ovs-appctl dpctl/dump-flows
In this case OVS crashed after a few seconds.

When inspecting the calls trace I notice that pmd->dp->dpif->dpif_class->type 
is a corrupted memory address.

(gdb) bt
#0  0x00c26fda in dpif_normalize_type (type=0x7372 ) at lib/dpif.c:517
#1  0x00c19864 in dp_netdev_flow_offload_put 
(offload=offload@entry=0x7f11fc00b790) at lib/dpif-netdev.c:2378
#2  0x00c19ca8 in dp_netdev_flow_offload_main (data=) at 
lib/dpif-netdev.c:2467
#3  0x00ca3c9d in ovsthread_wrapper (aux_=) at 
lib/ovs-thread.c:383
#4  0x7f12a2a08e25 in start_thread () from /lib64/libpthread.so.0
#5  0x7f12a222c34d in clone () from /lib64/libc.so.6

This scenario repeats whenever running the command in (2) and when the code 
calls dpif_normalize_type().
It seems to me that the memory corruption was there for a while but only now it 
is exposed due to your rfc series.

In order to observe this memory corruption I added the following printouts.
I tested it with hw-offload=false: 
ovs-vsctl set Open_vSwitch . other_config:hw-offload=false
I tested it on latest master without your rfc series.

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 1e54936..18a804a 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3625,6 +3625,10 @@ dpif_netdev_flow_dump_next(struct dpif_flow_dump_thread 
*thread_,
 }
 }
 
+VLOG_ERR("pmd->dp=%p pmd->dp->dpif=%p pmd->dp->dpif->dpif_class=%p \
+ pmd->dp->dpif->full_name=%s",
+ pmd->dp, pmd->dp->dpif, pmd->dp->dpif->dpif_class,
+ pmd->dp->dpif->full_name);
 do {
 for (n_flows = 0; n_flows < flow_limit; n_flows++) {
 struct cmap_node *node;


Looking at the printouts the memory corruption is observed.

Initially all printouts are identical and correct as expected.

dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 
pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 
pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2525e70 
pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev

Around this point I executed the dpctl command in (2) and you can notice that 
the pointers pmd->dp->dp_dpif and beyond became modified.

dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 
pmd->dp->dpif->dpif_class=0x251c790 pmd->dp->dpif->full_name=(null)
dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 
pmd->dp->dpif->dpif_class=0x33c03608 pmd->dp->dpif->full_name=
dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 
pmd->dp->dpif->dpif_class=0x2605700 pmd->dp->dpif->full_name=
dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x25c3280 
pmd->dp->dpif->dpif_class=0xe784250b pmd->dp->dpif->full_name=memseg-2048k-0-3
dpif_netdev|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 
pmd->dp->dpif->dpif_class=0xef3e80 pmd->dp->dpif->full_name=netdev@ovs-netdev
dpif_netdev(revalidator82)|ERR|pmd->dp=0x7fdfea3ea010 pmd->dp->dpif=0x2602430 
pmd->dp->dpif->dpif_class=0x2606200 pmd->dp->dpif->full_name=(null)

The same happens if I test it with hw-offload=true.

I hope this scenario can be recreated on other setups as well.
I will look more into it.
Any insights on this issue would be appreciated.

Regards,
Ophir

> -Original Message-
> From: Ilya Maximets 
> Sent: Tuesday, December 3, 2019 1:13 PM
> To: ovs-dev@openvswitch.org
> Cc: Ophir Munk ; Roni Bar Yanai
> ; Simon Horman
> ; Ilya Maximets 
> Subject: [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via
> DPDK.
> 
> These patches are necessary to enable vxlan offloading for userspace
> datapath via DPDK flow offloading API (not tested).
> They allows to properly distinguish userspace tunnels from the system
> tunnels on netdev level in order to choose appropriate flow offloading API
> provider.  Before these patches it was not possible because tunneling ports
> are implemented by the same netdev-vport classes and uses same dpif port
> names, so only ofproto really knows which netdevs assigned to each dpif.
> 
> RFC v3:
>   * Rebase on current master.
> 
> RFC v2:
>   * Added 2 patches for using dpif type.
>   * Last patch updated to use netdev dpif_type instead of traversing
> dpif ports.
> 
> 
> Ilya Maximets (4):
>   netdev: Allow storing dpif type into netdev structure.

Re: [ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.

2019-12-04 Thread Simon Horman
On Tue, Dec 03, 2019 at 12:13:11PM +0100, Ilya Maximets wrote:
> These patches are necessary to enable vxlan offloading for
> userspace datapath via DPDK flow offloading API (not tested).
> They allows to properly distinguish userspace tunnels from the
> system tunnels on netdev level in order to choose appropriate
> flow offloading API provider.  Before these patches it was not
> possible because tunneling ports are implemented by the same
> netdev-vport classes and uses same dpif port names, so only
> ofproto really knows which netdevs assigned to each dpif.

This series looks good to me.

Reviewed-by: Simon Horman 

> 
> RFC v3:
>   * Rebase on current master.
> 
> RFC v2:
>   * Added 2 patches for using dpif type.
>   * Last patch updated to use netdev dpif_type instead of traversing
> dpif ports.
> 
> 
> Ilya Maximets (4):
>   netdev: Allow storing dpif type into netdev structure.
>   netdev-offload: Use dpif type instead of class.
>   netdev-offload: Allow offloading to netdev without ifindex.
>   netdev-offload: Disallow offloading to unrelated tunneling vports.
> 
>  lib/dpif-netdev.c | 13 +++
>  lib/dpif-netlink.c| 23 ++--
>  lib/dpif.c| 21 ++-
>  lib/netdev-offload-dpdk.c |  8 +
>  lib/netdev-offload-tc.c   | 12 +--
>  lib/netdev-offload.c  | 68 ++-
>  lib/netdev-offload.h  | 16 -
>  lib/netdev-provider.h |  3 +-
>  lib/netdev.c  | 16 +
>  lib/netdev.h  |  2 ++
>  ofproto/ofproto-dpif-upcall.c |  5 ++-
>  11 files changed, 114 insertions(+), 73 deletions(-)
> 
> -- 
> 2.17.1
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [RFC v3 0/4] netdev-offload: Prerequisites of vport offloading via DPDK.

2019-12-03 Thread Ilya Maximets
These patches are necessary to enable vxlan offloading for
userspace datapath via DPDK flow offloading API (not tested).
They allows to properly distinguish userspace tunnels from the
system tunnels on netdev level in order to choose appropriate
flow offloading API provider.  Before these patches it was not
possible because tunneling ports are implemented by the same
netdev-vport classes and uses same dpif port names, so only
ofproto really knows which netdevs assigned to each dpif.

RFC v3:
  * Rebase on current master.

RFC v2:
  * Added 2 patches for using dpif type.
  * Last patch updated to use netdev dpif_type instead of traversing
dpif ports.


Ilya Maximets (4):
  netdev: Allow storing dpif type into netdev structure.
  netdev-offload: Use dpif type instead of class.
  netdev-offload: Allow offloading to netdev without ifindex.
  netdev-offload: Disallow offloading to unrelated tunneling vports.

 lib/dpif-netdev.c | 13 +++
 lib/dpif-netlink.c| 23 ++--
 lib/dpif.c| 21 ++-
 lib/netdev-offload-dpdk.c |  8 +
 lib/netdev-offload-tc.c   | 12 +--
 lib/netdev-offload.c  | 68 ++-
 lib/netdev-offload.h  | 16 -
 lib/netdev-provider.h |  3 +-
 lib/netdev.c  | 16 +
 lib/netdev.h  |  2 ++
 ofproto/ofproto-dpif-upcall.c |  5 ++-
 11 files changed, 114 insertions(+), 73 deletions(-)

-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev