Hi Roi,

Thank you for your reply. I was on vacation and couldn't get back to you on this sooner. Please find my replies inline.
Regards
_Sugesh

> -----Original Message-----
> From: Roi Dayan [mailto:[email protected]]
> Sent: Monday, January 16, 2017 12:57 PM
> To: Paul Blakey <[email protected]>; Chandran, Sugesh
> <[email protected]>; Joe Stringer <[email protected]>
> Cc: [email protected]; ovs dev <[email protected]>; Shahar Klein
> <[email protected]>; Mark Bloch <[email protected]>; Simon
> Horman <[email protected]>; Hadar Hen Zion
> <[email protected]>; Rony Efraim <[email protected]>; Jiri Pirko
> <[email protected]>; Eric Garver <[email protected]>; Marcelo Ricardo Leitner
> <[email protected]>; Or Gerlitz <[email protected]>; Andy
> Gospodarek <[email protected]>
> Subject: Re: [ovs-dev] [PATCH ovs V2 00/21] Introducing HW offload support
> for openvswitch
>
> On 16/01/2017 11:04, Paul Blakey wrote:
> >
> > On 16/01/2017 11:03, Paul Blakey wrote:
> >>
> >> On 13/01/2017 02:42, Chandran, Sugesh wrote:
> >>> Hi Paul,
> >>>
> >>> Thank you for releasing the patch set. I have a few high-level
> >>> comments on the patch set, as below.
> >>>
> >>> My first concern is the lack of device-specific (e.g. FPGA)
> >>> initialization support. This has to be done based on the user
> >>> requirement/use case. Consider a P4-based OVS switch that wants to
> >>> define an FPGA behavior/profile at the time of OVS initialization.
> >>> This profile defines the set of features/number of flow fields to
> >>> be enabled on the hardware for the specific application. I feel
> >>> there must be some option for the control path to initialize the
> >>> entire hardware module. Some FPGAs can offer dynamic feature
> >>> initialization based on application need. This is something like
> >>> loading different firmware on the fly based on what the application
> >>> profile is. Each firmware varies in performance and scalability
> >>> figures, hence it is worth defining the profile on a need basis. Do
> >>> you think it is possible to offer this support with the current
> >>> implementation model?
> Hi Sugesh,
>
> What about netdev_initialize? We didn't need it currently, but it can
> be added later when needed.

[Sugesh] I feel that would be fine.

> >>> According to my understanding, OVS tries to install the flows in
> >>> hardware first if offload is enabled. The flows fall back to
> >>> software only when the hardware insert fails. I would like the
> >>> flexibility of defining the flow installation behavior from the
> >>> control path rather than just based on netdev capacity. Consider a
> >>> multi-tenant deployment with limited hardware offload support (say
> >>> the hardware supports only a limited number of flows); OVS should
> >>> install only priority flows in hardware to offer better
> >>> performance. The current implementation operates in a
> >>> first-come-first-served model, which may not be optimal.
>
> This can be added later as a feature. Here, as you understood, we
> added the basic flow, which is: if the user enabled hw-offload, then
> try to offload and fall back to software.

[Sugesh] Sure.

> >>> From an orchestration point of view, how would the system know
> >>> whether a port can do hardware offload or not? More specifically,
> >>> how could the system know the capabilities of the hardware offload?
> >>> In many cases this information is sufficient to identify whether a
> >>> flow is hardware-ready, which avoids the cost of hardware
> >>> interaction at the time of flow insert. Maybe the port can be added
> >>> as a software port, which will populate the OVSDB with hardware
> >>> capability information. The orchestration can use the exposed
> >>> hardware capability matrix to determine whether or not to enable
> >>> hardware offload on that port.
>
> We definitely thought about it but didn't do it yet. Currently we just
> let the user enable or disable hw-offload, which by default is
> disabled.

[Sugesh] Ok.

> >>> Just to confirm, the provided APIs should be implemented for
> >>> virtual ports (virtio-vhost ports) as well. Is that the case?
>
> You mean netdev providers?
> Then yes; they can be NULLs though.

[Sugesh] Ok.

> >>> I assume a flow between a hardware port and a software-only port is
> >>> always programmed in software. The packets from the hardware port
> >>> arrive as normal packets in the OVS datapath. Is that correct?
>
> Can you give an example? Do you mean like veth? Then if the TC policy
> is skip_sw, yes: we can't add a HW offload rule for veth, so we'll
> fall back to the OVS datapath.

[Sugesh] Ok. And is this true for virtio ports, where the vhost backend
is in software? For example, the way north-south traffic is treated
when the VM port is a software-only port.

> >>> The hardware flow APIs are not defined for the dpdk netdev. This
> >>> may fail the OVS-DPDK compilation.
>
> Right. We'll add the missing API (NULLs currently) to make sure we
> pass compilation. We'll make sure to test the ovs-dpdk compilation for
> the next submission.

[Sugesh] Thank you for that.

> Thanks,
> Roi

> >>>
> >>> Regards
> >>> _Sugesh
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: [email protected] [mailto:ovs-dev-
> >>>> [email protected]] On Behalf Of Joe Stringer
> >>>> Sent: Tuesday, January 10, 2017 6:09 PM
> >>>> To: Paul Blakey <[email protected]>
> >>>> Cc: ovs dev <[email protected]>; Shahar Klein
> >>>> <[email protected]>; Mark Bloch <[email protected]>; Simon
> >>>> Horman <[email protected]>; Hadar Hen Zion
> >>>> <[email protected]>; Rony Efraim <[email protected]>; Jiri Pirko
> >>>> <[email protected]>; Eric Garver <[email protected]>; Marcelo Ricardo
> >>>> Leitner <[email protected]>; Or Gerlitz <[email protected]>;
> >>>> Andy Gospodarek <[email protected]>
> >>>> Subject: Re: [ovs-dev] [PATCH ovs V2 00/21] Introducing HW offload
> >>>> support for openvswitch
> >>>>
> >>>> On 10 January 2017 at 10:07, Joe Stringer <[email protected]> wrote:
> >>>>> On 8 January 2017 at 07:18, Paul Blakey <[email protected]> wrote:
> >>>>>>
> >>>>>> On 05/01/2017 02:11, Joe Stringer wrote:
> >>>>>>
> >>>>>>> On 25
> >>>>>>> December 2016 at 03:39, Paul Blakey <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>> This patch series introduces rule offload functionality to
> >>>>>>>> dpif-netlink via the netdev ports' new flow offloading API.
> >>>>>>>> The user can specify whether to enable rule offloading or not
> >>>>>>>> via the OVS configuration. Netdev providers are able to
> >>>>>>>> implement the netdev flow offload API in order to offload
> >>>>>>>> rules.
> >>>>>>>>
> >>>>>>>> This patch series also implements one offload scheme for
> >>>>>>>> netdev-linux, using the TC flower classifier, which was chosen
> >>>>>>>> because it is sort of natural to state OVS DP rules for this
> >>>>>>>> classifier. However, the code can be extended to support other
> >>>>>>>> classifiers such as U32, eBPF, etc., which support offload as
> >>>>>>>> well.
> >>>>>>>>
> >>>>>>>> The use case we are currently addressing is the new SR-IOV
> >>>>>>>> switchdev mode in the Linux kernel, which was introduced in
> >>>>>>>> version 4.8 [1][2]. This series was tested against SR-IOV VF
> >>>>>>>> vport representors of the Mellanox 100G ConnectX-4 series
> >>>>>>>> exposed by the mlx5 kernel driver.
> >>>>>>>>
> >>>>>>>> Changes from V1:
> >>>>>>>> - Added a generic netdev flow offloads API.
> >>>>>>>> - Implemented the relevant flow API in netdev-linux (and
> >>>>>>>>   netdev-vport).
> >>>>>>>> - Added an other_config hw-offload option to enable
> >>>>>>>>   offloading (defaults to false).
> >>>>>>>> - Fixed coding style to conform with OVS.
> >>>>>>>> - Policy removed for now (how best to implement it will be
> >>>>>>>>   discussed later).
> >>>>>>>
> >>>>>>> Hi Paul,
> >>>>>>>
> >>>>>>> Happy holidays to you too ;)
> >>>>>>
> >>>>>> Hi :)
> >>>>>> Thanks for your thorough review. We are working on resubmitting
> >>>>>> the next patch set, and I'll try to reply as much as I can right
> >>>>>> now, but this is a big review, so I'll split this into several
> >>>>>> replies.
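[Sugesh] For readers following along, the other_config hw-offload knob
from the changelog above would presumably be set through ovs-vsctl.
A minimal sketch; the key name follows the patch description, and the
exact syntax may differ in the final series:

```shell
# Enable the hw-offload option added by this series (defaults to
# false); ovs-vswitchd then tries TC offload first and falls back to
# the software datapath on failure.
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true

# Inspect the current setting.
ovs-vsctl get Open_vSwitch . other_config:hw-offload
```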
> >>>>>>>
> >>>>>>> A few high-level comments:
> >>>>>>> * Overall, the patchset is looking in better state than
> >>>>>>>   previously.
> >>>>>>> * That said, on platforms other than your specific test
> >>>>>>>   environment, OVS fails to build.
> >>>>>>> * netdev-vport seems to have acquired platform-specific code.
> >>>>>>
> >>>>>> Thanks. We've seen your links for Travis CI and AppVeyor; we'll
> >>>>>> create an account and fix it accordingly.
> >>>>>>>
> >>>>>>> * I don't think that ovs-dpctl should read the database. One of
> >>>>>>>   the nice things about ovs-dpctl is that it is currently
> >>>>>>>   standalone, i.e. it doesn't require anything else to be
> >>>>>>>   running to be able to query the state of the kernel datapath.
> >>>>>>>   What are you trying to achieve with this? Perhaps you can
> >>>>>>>   extend the ovs-appctl dpctl/... commands instead to provide
> >>>>>>>   the functionality you need?
> >>>>>>
> >>>>>> We have two config options that aren't saved with the kernel
> >>>>>> datapath (and dumped back on dpif open): whether offload is
> >>>>>> enabled, to know if we even need to dump from netdevs for the
> >>>>>> dump-flows command (I guess we could just dump the netdevs
> >>>>>> anyway) and for add-flow; and the TC flower skip_hw/skip_sw
> >>>>>> flag, which is also needed for the add-flow command, so we know
> >>>>>> which flag will be used when inserting into flower if offloading
> >>>>>> is enabled and possible.
> >>>>>
> >>>>> I discussed this with a few people offline, and it seems that the
> >>>>> most reasonable way to handle it for dump is just to dump from
> >>>>> the netdevs anyway. You can fetch the list of relevant ports from
> >>>>> the OVS datapath in the kernel, then only dump those ports.
> >>>>>
> >>>>> For adding and deleting flows, using ovs-dpctl is not really
> >>>>> supported; it's primarily for debug. ovs-vswitchd is going to
> >>>>> override that behaviour anyway: if you add a flow, it may delete
> >>>>> it; if you delete a flow, it may re-add it.
> >>>>> I suggest that for debugging purposes the default behaviour is
> >>>>> to install into OVS, but if you want to put it into TC sw/hw,
> >>>>> provide an additional flag on the command line.
> >>>>>
> >>>>> If you want ovs-vswitchd to decide how to install the flow, you
> >>>>> can always use "ovs-appctl dpctl/add-flow ..." and the command
> >>>>> will go via ovs-vswitchd, which will make the offloading decision
> >>>>> based on the same logic as it would when handling regular
> >>>>> upcalls.
> >>>>>
> >>>>>>> * It would be good to see at least some basic system tests a la
> >>>>>>>   tests/system-traffic.at or tests/system-ovn.at. Maybe there
> >>>>>>>   could be a tests/system-tc.at that uses SKIP_HW to show that
> >>>>>>>   flows can be installed and that they work.
> >>>>>>
> >>>>>> Sure.
> >>>>>>>
> >>>>>>> * I think it would be great to get some ability to see what's
> >>>>>>>   happening with flow offloads, e.g. "ovs-appctl
> >>>>>>>   dpctl/show-hw-offloads", which could give you some idea of
> >>>>>>>   how many flows are being offloaded.
> >>>>>>
> >>>>>> You mean a number/stats, or actually dumping only offloaded
> >>>>>> flows?
> >>>>>
> >>>>> At the time, I was thinking just stats.
> >>>>>
> >>>>> However, I think it would also be useful to have some flag for
> >>>>> ovs-dpctl that chooses which flows to fetch.
> >>>>>
> >>>>> Something like:
> >>>>>   ovs-dpctl dump-flows: dump all flows from both OVS and TC
> >>>>>     (flower only).
> >>>>>   ovs-dpctl dump-flows --only-flower
> >>>>>   ovs-dpctl dump-flows --only-ovs
> >>>>>
> >>>>> Each flow, when printed, should also indicate in some way where
> >>>>> it came from.
> >>>>
> >>>> Could be something like ovs-dpctl dump-flows --type=flower
> >>>> instead, to be a bit more generic.
> >>>> _______________________________________________
> >>>> dev mailing list
> >>>> [email protected]
> >>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> >> +cc Roi, who was removed from cc somehow; he can answer some of
> >> these questions for you.
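[Sugesh] As background for the skip_sw/skip_hw discussion in the thread
above: these are standard TC flower flags that can also be exercised by
hand with tc, independently of OVS. A minimal sketch; the device name
and IP addresses are placeholders, not taken from the thread:

```shell
# Attach an ingress qdisc so flower filters can be added
# (eth0 is a placeholder device).
tc qdisc add dev eth0 ingress

# skip_hw: install the rule in the TC software datapath only.
tc filter add dev eth0 ingress protocol ip flower skip_hw \
    dst_ip 192.0.2.1 action drop

# skip_sw: ask the driver to offload the rule to hardware only;
# this fails on devices that cannot offload it (e.g. a veth),
# which is the fallback case Roi describes above.
tc filter add dev eth0 ingress protocol ip flower skip_sw \
    src_ip 198.51.100.1 action drop
```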
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
