Re: [ovs-dev] [PATCH v5 0/7] netdev datapath: Partial action offload
Hi Ilya, Thanks for your comments, please see my response inline. On Fri, Jul 10, 2020 at 5:41 PM Ilya Maximets wrote: > > On 7/9/20 8:47 AM, Sriharsha Basavapatna via dev wrote: > > Hi, > > > > This patchset extends the "Partial HW acceleration" mode to offload a > > part of the action processing to HW, instead of offloading just lookup > > (MARK/RSS), for "vhost-user" ports. This is referred to as "Partial Action > > Offload". This mode does not require SRIOV/switchdev configuration. In this > > mode, forwarding (output) action is still performed by OVS-DPDK SW datapath. > > Hi. I like the idea of egress offloading. It's interesting. Yes, you get the performance benefits of flow offload to virtual NICs in a VM without SR-IOV (VFs). > IIUC, HW will perform matching on egress packets and perform actions before > actually sending them. Is that right? Right, HW matches on the rule and performs the subset of actions specified before transmitting. For example, egress packet from the VM (from a virtio nic) is processed by OVS and OVS forwards it to the physical port (PF) and the packet is encapsulated (VXLAN) on its way out to the wire. > > For the implementation I have a few concerns: > > 1. Why only vhost-user ports? I mean, egress offloading could be done >for any ingress port in case ingress port doesn't support full offloading. You are right. But just to reduce the scope of initial implementation and validation, we focused on the use-case involving vhost-user ports. Once this design is in place, it should be pretty easy to remove this restriction to include more non-offload capable ingress ports. > >Moreover, you could have classification offloading on ingress and actions >offloading on egress at the same time. This might be useful, for example, >if we have two diferent NICs that both supports offloading, but we have to >send packets between them. But, yes, that might be a way for further >improvement. Yes, good idea. This use case can be supported in the future, with some slight extensions to the current design. Forwarding between two physical ports, but since they are from different NICs (i.e, they don't belong to the same eSwitch), break up the offload flow rule into 2 separate rules, with classification on the ingress-port and partial-action offloading on the egress-port. > >Regarding vhost-user, you're exposing too much of netdev internals to >datapath layer by checking for a specific netdev type. This is not >a good thing to do. Especially because egress offloading doesn't depend >on a type of ingress interface. Sure, I'll remove this netdev type specific check from dpif-netdev. But the current use case for egress offload that we are considering, is only when the ingress device is not offload capable. So, I'd like to retain this condition, but I'll move this into the offload layer (see below). > > 2. I'm worried about other offload providers like tinux-tc that doesn't know >anything that happens in dpif-netdev and will not work correctly if >dpif-netdev will try to use egress offloading on it. Maybe I'm missing something here, but this cannot happen since dpif-netdev handles only user (dpdk) datapath flows and thus should only be talking to netdev-offload-dpdk offload provider and in turn only to devices of class netdev-dpdk ? But I'll add additional checks to the new offload APIs to make sure only flows of a given datapath class are processed. >I see that you're using netdev_dpdk_flow_api_supported() inside the >dpif-netdev, but that is the violation of netdev-offload abstraction layer. Sure, this will be moved into the offload layer API that will be introduced (see below). > >I think, some more generic extension of netdev-offload interface required, >so all offload providers will be able to use this API. I mean, egress >offloading should be possible with TC. You don't need to add support for >this, but we should have a generic interface to utilize this support in >the future. > >At least there should be something more generic like info.offload_direction >of some enumeration type like > enum netdev_offload_direction { > NETDEV_OFFLOAD_INGRESS, > NETDEV_OFFLOAD_EGRESS, > }; >And each offload provider should decide by itself if it supports some type >of offloading or not. I agree; will implement the following new offload APIs: netdev-offload.h: netdev_partial_offload_egress() netdev-offload.h: netdev_partial_offload_ingress() netdev-offload-provider.h: netdev_flow_api::flow_offload_egress_partial() netdev-offload-provider.h: netdev_flow_api::flow_offload_ingress_partial() netdev-offload-dpdk.c: netdev_offload_dpdk_egress_partial() netdev-offload-dpdk.c: netdev_offload_dpdk_ingress_partial() >netdev specific or specific to particular offload provider functions should >not be used in general outside of their own modules. > > 3. Some new identifiers
Re: [ovs-dev] [PATCH v5 0/7] netdev datapath: Partial action offload
On 7/9/20 8:47 AM, Sriharsha Basavapatna via dev wrote: > Hi, > > This patchset extends the "Partial HW acceleration" mode to offload a > part of the action processing to HW, instead of offloading just lookup > (MARK/RSS), for "vhost-user" ports. This is referred to as "Partial Action > Offload". This mode does not require SRIOV/switchdev configuration. In this > mode, forwarding (output) action is still performed by OVS-DPDK SW datapath. Hi. I like the idea of egress offloading. It's interesting. IIUC, HW will perform matching on egress packets and perform actions before actually sending them. Is that right? For the implementation I have a few concerns: 1. Why only vhost-user ports? I mean, egress offloading could be done for any ingress port in case ingress port doesn't support full offloading. Moreover, you could have classification offloading on ingress and actions offloading on egress at the same time. This might be useful, for example, if we have two diferent NICs that both supports offloading, but we have to send packets between them. But, yes, that might be a way for further improvement. Regarding vhost-user, you're exposing too much of netdev internals to datapath layer by checking for a specific netdev type. This is not a good thing to do. Especially because egress offloading doesn't depend on a type of ingress interface. 2. I'm worried about other offload providers like tinux-tc that doesn't know anything that happens in dpif-netdev and will not work correctly if dpif-netdev will try to use egress offloading on it. I see that you're using netdev_dpdk_flow_api_supported() inside the dpif-netdev, but that is the violation of netdev-offload abstraction layer. I think, some more generic extension of netdev-offload interface required, so all offload providers will be able to use this API. I mean, egress offloading should be possible with TC. You don't need to add support for this, but we should have a generic interface to utilize this support in the future. At least there should be something more generic like info.offload_direction of some enumeration type like enum netdev_offload_direction { NETDEV_OFFLOAD_INGRESS, NETDEV_OFFLOAD_EGRESS, }; And each offload provider should decide by itself if it supports some type of offloading or not. netdev specific or specific to particular offload provider functions should not be used in general outside of their own modules. 3. Some new identifiers for flow dumps to distinguish classification and actions partial offloading needed. 4. Ingress partial actions offloading sounds interesting, but current implementation forces datapath assistance for such actions. This will significantly impact kernel datapath as all the vlan operations will be sent back and forth between kernel and userspace. And I'm not sure if this will even work correctly with current patches. I didn't make a full review. These are just quick comments from the architectural point of view. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v5 0/7] netdev datapath: Partial action offload
Hi Ilya, I removed the RFC tag on this partial-offload patchset and it is rebased to the latest master branch. We are targeting this for 2.14. So just a gentle reminder on this offload patchset. Thanks, -Harsha On Thu, Jul 9, 2020 at 12:17 PM Sriharsha Basavapatna wrote: > > Hi, > > This patchset extends the "Partial HW acceleration" mode to offload a > part of the action processing to HW, instead of offloading just lookup > (MARK/RSS), for "vhost-user" ports. This is referred to as "Partial Action > Offload". This mode does not require SRIOV/switchdev configuration. In this > mode, forwarding (output) action is still performed by OVS-DPDK SW datapath. > > Thanks, > -Harsha > > ** > > v4-->v5: > > - Rebased to the current ovs-master (includes vxlan-encap full offload) > - Added 2 patches to support partial offload of vlan push/pop actions > > v3-->v4: > > - Removed mega-ufid mapping code from dpif-netdev (patch #5) and updated the > existing mega-ufid mapping code in netdev-offload-dpdk to support partial > action offload. > > v2-->v3: > > - Added more commit log comments in the refactoring patch (#1). > - Removed wrapper function for should_partial_offload_egress(). > - Removed partial offload check for output action in parse_flow_actions(). > - Changed patch sequence (skip-encap and get-stats before offload patch). > - Removed locking code (details in email), added inline comments. > - Moved mega-ufid mapping code from skip-encap (#3) to the offload patch (#5). > > v1-->v2: > > Fixed review comments from Eli: > - Revamped action list parsing to reject multiple clone/output actions > - Updated comments to reflect support for single clone/output action > - Removed place-holder function for ingress partial action offload > - Created a separate patch (#2) to query dpdk-vhost netdevs > - Set transfer attribute to 0 for partial action offload > - Updated data type of 'dp_flow' in 'dp_netdev_execute_aux' > - Added a mutex to synchronize between offload and datapath threads > - Avoid fall back to mark/rss when egress partial offload fails > - Drop count action for partial action offload > > Other changes: > - Avoid duplicate offload requests for the same mega-ufid (from PMD threads) > - Added a coverage counter to track pkts with tnl-push partial offloaded > - Fixed dp_netdev_pmd_remove_flow() to delete partial offloaded flow > > ** > > Sriharsha Basavapatna (7): > dpif-netdev: Refactor dp_netdev_flow_offload_put() > netdev-dpdk: provide a function to identify dpdk-vhost netdevs > dpif-netdev: Skip encap action during datapath execution > dpif-netdev: Support flow_get() with partial-action-offload > dpif-netdev: Support partial-action-offload of VXLAN encap flow > dpif-netdev: Support partial offload of PUSH_VLAN action > dpif-netdev: Support partial offload of POP_VLAN action > > lib/dpif-netdev.c | 463 ++ > lib/netdev-dpdk.c | 5 + > lib/netdev-dpdk.h | 1 + > lib/netdev-offload-dpdk.c | 96 ++-- > lib/netdev-offload.h | 3 + > lib/odp-execute.c | 19 +- > 6 files changed, 502 insertions(+), 85 deletions(-) > > -- > 2.25.0.rc2 > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev