Re: [ovs-dev] [PATCH v5 0/7] netdev datapath: Partial action offload

2020-07-12 Thread Sriharsha Basavapatna via dev
Hi Ilya,

Thanks for your comments, please see my response inline.

On Fri, Jul 10, 2020 at 5:41 PM Ilya Maximets  wrote:
>
> On 7/9/20 8:47 AM, Sriharsha Basavapatna via dev wrote:
> > Hi,
> >
> > This patchset extends the "Partial HW acceleration" mode to offload a
> > part of the action processing to HW, instead of offloading just lookup
> > (MARK/RSS), for "vhost-user" ports. This is referred to as "Partial Action
> > Offload". This mode does not require SRIOV/switchdev configuration. In this
> > mode, forwarding (output) action is still performed by OVS-DPDK SW datapath.
>
> Hi.  I like the idea of egress offloading.  It's interesting.

Yes,  you get the performance benefits of flow offload to virtual NICs
in a VM without SR-IOV (VFs).

> IIUC, HW will perform matching on egress packets and perform actions before
> actually sending them. Is that right?

Right, HW matches on the rule and performs the subset of actions
specified before transmitting. For example, egress packet from the VM
(from a virtio nic) is processed by OVS and OVS forwards it to the
physical port (PF) and the packet is encapsulated (VXLAN) on its way
out to the wire.
>
> For the implementation I have a few concerns:
>
> 1. Why only vhost-user ports?  I mean, egress offloading could be done
>for any ingress port in case ingress port doesn't support full offloading.

You are right. But just to reduce the scope of initial implementation
and validation, we focused on the use-case involving vhost-user ports.
Once this design is in place, it should be pretty easy to remove this
restriction to include more non-offload capable ingress ports.
>
>Moreover, you could have classification offloading on ingress and actions
>offloading on egress at the same time.  This might be useful, for example,
>if we have two diferent NICs that both supports offloading, but we have to
>send packets between them.  But, yes, that might be a way for further
>improvement.

Yes, good idea. This use case can be supported in the future, with
some slight extensions to the current design. Forwarding between two
physical ports, but since they are from different NICs (i.e, they
don't belong to the same eSwitch), break up the offload flow rule into
2 separate rules, with classification on the ingress-port and
partial-action offloading on the egress-port.
>
>Regarding vhost-user, you're exposing too much of netdev internals to
>datapath layer by checking for a specific netdev type.  This is not
>a good thing to do.  Especially because egress offloading doesn't depend
>on a type of ingress interface.

Sure, I'll remove this netdev type specific check from dpif-netdev.
But the current use case for egress offload that we are considering,
is only when the ingress device is not offload capable. So, I'd like
to retain this condition, but I'll move this into the offload layer
(see below).
>
> 2. I'm worried about other offload providers like tinux-tc that doesn't know
>anything that happens in dpif-netdev and will not work correctly if
>dpif-netdev will try to use egress offloading on it.

Maybe I'm missing something here, but this cannot happen since
dpif-netdev handles only user (dpdk) datapath flows and thus should
only be talking to netdev-offload-dpdk offload provider and in turn
only to devices of class netdev-dpdk  ? But I'll add additional checks
to the new offload APIs to make sure only flows of a given datapath
class are processed.

>I see that you're using netdev_dpdk_flow_api_supported() inside the
>dpif-netdev, but that is the violation of netdev-offload abstraction layer.

Sure, this will be moved into the offload layer API that will be
introduced (see below).
>
>I think, some more generic extension of netdev-offload interface required,
>so all offload providers will be able to use this API.  I mean, egress
>offloading should be possible with TC.  You don't need to add support for
>this, but we should have a generic interface to utilize this support in
>the future.
>
>At least there should be something more generic like info.offload_direction
>of some enumeration type like
> enum netdev_offload_direction {
> NETDEV_OFFLOAD_INGRESS,
> NETDEV_OFFLOAD_EGRESS,
> };
>And each offload provider should decide by itself if it supports some type
>of offloading or not.

I agree; will implement the following new offload APIs:

netdev-offload.h:  netdev_partial_offload_egress()
netdev-offload.h:  netdev_partial_offload_ingress()
netdev-offload-provider.h:  netdev_flow_api::flow_offload_egress_partial()
netdev-offload-provider.h:  netdev_flow_api::flow_offload_ingress_partial()
netdev-offload-dpdk.c:  netdev_offload_dpdk_egress_partial()
netdev-offload-dpdk.c:  netdev_offload_dpdk_ingress_partial()

>netdev specific or specific to particular offload provider functions should
>not be used in general outside of their own modules.
>
> 3. Some new identifiers 

Re: [ovs-dev] [PATCH v5 0/7] netdev datapath: Partial action offload

2020-07-10 Thread Ilya Maximets
On 7/9/20 8:47 AM, Sriharsha Basavapatna via dev wrote:
> Hi,
> 
> This patchset extends the "Partial HW acceleration" mode to offload a
> part of the action processing to HW, instead of offloading just lookup
> (MARK/RSS), for "vhost-user" ports. This is referred to as "Partial Action
> Offload". This mode does not require SRIOV/switchdev configuration. In this
> mode, forwarding (output) action is still performed by OVS-DPDK SW datapath.

Hi.  I like the idea of egress offloading.  It's interesting.
IIUC, HW will perform matching on egress packets and perform actions before
actually sending them. Is that right?

For the implementation I have a few concerns:

1. Why only vhost-user ports?  I mean, egress offloading could be done
   for any ingress port in case ingress port doesn't support full offloading.

   Moreover, you could have classification offloading on ingress and actions
   offloading on egress at the same time.  This might be useful, for example,
   if we have two diferent NICs that both supports offloading, but we have to
   send packets between them.  But, yes, that might be a way for further
   improvement.

   Regarding vhost-user, you're exposing too much of netdev internals to
   datapath layer by checking for a specific netdev type.  This is not
   a good thing to do.  Especially because egress offloading doesn't depend
   on a type of ingress interface.

2. I'm worried about other offload providers like tinux-tc that doesn't know
   anything that happens in dpif-netdev and will not work correctly if
   dpif-netdev will try to use egress offloading on it.
   I see that you're using netdev_dpdk_flow_api_supported() inside the
   dpif-netdev, but that is the violation of netdev-offload abstraction layer. 

   I think, some more generic extension of netdev-offload interface required,
   so all offload providers will be able to use this API.  I mean, egress
   offloading should be possible with TC.  You don't need to add support for
   this, but we should have a generic interface to utilize this support in
   the future.

   At least there should be something more generic like info.offload_direction
   of some enumeration type like
enum netdev_offload_direction {
NETDEV_OFFLOAD_INGRESS,
NETDEV_OFFLOAD_EGRESS,
};
   And each offload provider should decide by itself if it supports some type
   of offloading or not.
   netdev specific or specific to particular offload provider functions should
   not be used in general outside of their own modules.

3. Some new identifiers for flow dumps to distinguish classification and actions
   partial offloading needed.

4. Ingress partial actions offloading sounds interesting, but current
   implementation forces datapath assistance for such actions.  This will
   significantly impact kernel datapath as all the vlan operations will be
   sent back and forth between kernel and userspace.  And I'm not sure if this
   will even work correctly with current patches.


I didn't make a full review.  These are just quick comments from the 
architectural
point of view.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5 0/7] netdev datapath: Partial action offload

2020-07-09 Thread Sriharsha Basavapatna via dev
Hi Ilya,

I removed the RFC tag on this partial-offload patchset and it is
rebased to the latest master branch. We are targeting this for 2.14.
So just a gentle reminder on this offload patchset.

Thanks,
-Harsha

On Thu, Jul 9, 2020 at 12:17 PM Sriharsha Basavapatna
 wrote:
>
> Hi,
>
> This patchset extends the "Partial HW acceleration" mode to offload a
> part of the action processing to HW, instead of offloading just lookup
> (MARK/RSS), for "vhost-user" ports. This is referred to as "Partial Action
> Offload". This mode does not require SRIOV/switchdev configuration. In this
> mode, forwarding (output) action is still performed by OVS-DPDK SW datapath.
>
> Thanks,
> -Harsha
>
> **
>
> v4-->v5:
>
> - Rebased to the current ovs-master (includes vxlan-encap full offload)
> - Added 2 patches to support partial offload of vlan push/pop actions
>
> v3-->v4:
>
> - Removed mega-ufid mapping code from dpif-netdev (patch #5) and updated the
>   existing mega-ufid mapping code in netdev-offload-dpdk to support partial
>   action offload.
>
> v2-->v3:
>
> - Added more commit log comments in the refactoring patch (#1).
> - Removed wrapper function for should_partial_offload_egress().
> - Removed partial offload check for output action in parse_flow_actions().
> - Changed patch sequence (skip-encap and get-stats before offload patch).
> - Removed locking code (details in email), added inline comments.
> - Moved mega-ufid mapping code from skip-encap (#3) to the offload patch (#5).
>
> v1-->v2:
>
> Fixed review comments from Eli:
> - Revamped action list parsing to reject multiple clone/output actions
> - Updated comments to reflect support for single clone/output action
> - Removed place-holder function for ingress partial action offload
> - Created a separate patch (#2) to query dpdk-vhost netdevs
> - Set transfer attribute to 0 for partial action offload
> - Updated data type of 'dp_flow' in 'dp_netdev_execute_aux'
> - Added a mutex to synchronize between offload and datapath threads
> - Avoid fall back to mark/rss when egress partial offload fails
> - Drop count action for partial action offload
>
> Other changes:
> - Avoid duplicate offload requests for the same mega-ufid (from PMD threads)
> - Added a coverage counter to track pkts with tnl-push partial offloaded
> - Fixed dp_netdev_pmd_remove_flow() to delete partial offloaded flow
>
> **
>
> Sriharsha Basavapatna (7):
>   dpif-netdev: Refactor dp_netdev_flow_offload_put()
>   netdev-dpdk: provide a function to identify dpdk-vhost netdevs
>   dpif-netdev: Skip encap action during datapath execution
>   dpif-netdev: Support flow_get() with partial-action-offload
>   dpif-netdev: Support partial-action-offload of VXLAN encap flow
>   dpif-netdev: Support partial offload of PUSH_VLAN action
>   dpif-netdev: Support partial offload of POP_VLAN action
>
>  lib/dpif-netdev.c | 463 ++
>  lib/netdev-dpdk.c |   5 +
>  lib/netdev-dpdk.h |   1 +
>  lib/netdev-offload-dpdk.c |  96 ++--
>  lib/netdev-offload.h  |   3 +
>  lib/odp-execute.c |  19 +-
>  6 files changed, 502 insertions(+), 85 deletions(-)
>
> --
> 2.25.0.rc2
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5 0/7] netdev datapath: Partial action offload

2020-07-09 Thread Sriharsha Basavapatna via dev
Hi,

This patchset extends the "Partial HW acceleration" mode to offload a
part of the action processing to HW, instead of offloading just lookup
(MARK/RSS), for "vhost-user" ports. This is referred to as "Partial Action
Offload". This mode does not require SRIOV/switchdev configuration. In this
mode, forwarding (output) action is still performed by OVS-DPDK SW datapath.

Thanks,
-Harsha

**

v4-->v5:

- Rebased to the current ovs-master (includes vxlan-encap full offload)
- Added 2 patches to support partial offload of vlan push/pop actions

v3-->v4:

- Removed mega-ufid mapping code from dpif-netdev (patch #5) and updated the
  existing mega-ufid mapping code in netdev-offload-dpdk to support partial
  action offload.

v2-->v3:

- Added more commit log comments in the refactoring patch (#1).
- Removed wrapper function for should_partial_offload_egress().
- Removed partial offload check for output action in parse_flow_actions().
- Changed patch sequence (skip-encap and get-stats before offload patch).
- Removed locking code (details in email), added inline comments.
- Moved mega-ufid mapping code from skip-encap (#3) to the offload patch (#5).

v1-->v2:

Fixed review comments from Eli:
- Revamped action list parsing to reject multiple clone/output actions
- Updated comments to reflect support for single clone/output action
- Removed place-holder function for ingress partial action offload
- Created a separate patch (#2) to query dpdk-vhost netdevs
- Set transfer attribute to 0 for partial action offload
- Updated data type of 'dp_flow' in 'dp_netdev_execute_aux'
- Added a mutex to synchronize between offload and datapath threads
- Avoid fall back to mark/rss when egress partial offload fails
- Drop count action for partial action offload

Other changes:
- Avoid duplicate offload requests for the same mega-ufid (from PMD threads)
- Added a coverage counter to track pkts with tnl-push partial offloaded
- Fixed dp_netdev_pmd_remove_flow() to delete partial offloaded flow

**

Sriharsha Basavapatna (7):
  dpif-netdev: Refactor dp_netdev_flow_offload_put()
  netdev-dpdk: provide a function to identify dpdk-vhost netdevs
  dpif-netdev: Skip encap action during datapath execution
  dpif-netdev: Support flow_get() with partial-action-offload
  dpif-netdev: Support partial-action-offload of VXLAN encap flow
  dpif-netdev: Support partial offload of PUSH_VLAN action
  dpif-netdev: Support partial offload of POP_VLAN action

 lib/dpif-netdev.c | 463 ++
 lib/netdev-dpdk.c |   5 +
 lib/netdev-dpdk.h |   1 +
 lib/netdev-offload-dpdk.c |  96 ++--
 lib/netdev-offload.h  |   3 +
 lib/odp-execute.c |  19 +-
 6 files changed, 502 insertions(+), 85 deletions(-)

-- 
2.25.0.rc2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev