On Thu, Jul 21, 2016 at 3:55 AM, Chandran, Sugesh
<sugesh.chand...@intel.com> wrote:
> Hi Mark & Jesse
>
> Thank you for looking into the proposal.
> Please find my answers inline below.
>
> Regards
> _Sugesh
>
>> -----Original Message-----
>> From: Gray, Mark D
>> Sent: Wednesday, July 20, 2016 7:17 PM
>> To: Jesse Gross <je...@kernel.org>
>> Cc: Chandran, Sugesh <sugesh.chand...@intel.com>;
>> dev@openvswitch.org; Giller, Robin <robin.gil...@intel.com>
>> Subject: RE: [ovs-dev] Considering the possibility of integrating DPDK
>> generic classifier APIs in OVS.
>>
>> >
>> > On Wed, Jul 20, 2016 at 6:43 PM, Gray, Mark D <mark.d.g...@intel.com>
>> > wrote:
>> > > [Gray, Mark D] I think we should focus on one or two use cases
>> > > rather than a general offload like you discuss below. A general
>> > > offload involves a huge amount of code churn and there are a lot of
>> > > difficulties, some that you have highlighted below.
>> > > A more focused implementation will flush out any issues with the API.
>> > > In particular, the VxLAN use case that you mentioned above and
>> > > perhaps the offload of the hash calculation (but the generic
>> > > filtering api would also need to support generation of hashes) could
>> > > be two targets for this DPDK api.
>> >
>> > I agree that targeting a specific use case is a good idea (as well as
>> > your other comments). It's probably worthwhile talking to John
>> > Fastabend about this (also from Intel) since he has tried to do something
>> > similar for the past several years in Linux. Many of the general
>> > problems listed in the original email turn out to be very difficult.
>> > (Examples include capabilities; describing flows in a hardware
>> > independent manner is something that OpenFlow tried to tackle for a
>> > long time; which flows to offload in the face of table size limits
>> > while maintaining correct forwarding behavior; etc.)
>>  [Gray, Mark D]
>> Yes, John and I have discussed a lot of this in depth and we have done
>> whiteboarding of possible hw offload designs in OVS which is why I am quite
>> familiar with the issues.
> [Sugesh] I feel that the design must take into account all the capabilities
> of the DPDK APIs, even though it is used only for VxLAN and hashing for now.
> The earlier implementation installs the flows in hardware when a flow gets
> populated in the datapath. Everything happens in the datapath.
> The main focus of that implementation was just to optimize VxLAN
> traffic, so we haven’t considered other cases where flow director can be useful.
> The generic APIs can do much more than just flow director, so having
> a generic, extensible design in OVS helps in many ways.
> Comments?
>
>>
>> >
>> > I think the VXLAN acceleration was a good use case since the vswitch
>> > is the owner of the tunnel endpoint and therefore is better equipped
>> > to make policy decisions. The main concern that I had with the
>> > previous implementation was that it was making assumptions about the
>> > contents of the inner flow based on the UDP source port, which is not
>> > really safe since that is just a hash.
>> [Gray, Mark D]
>> I read your comments on this and had a look through Sugesh's code to try and
>> see where this was happening. I couldn’t see it but I agree that the source
>> port is basically random and it's only a hash of the inner flow by
>> convention.
>> Sugesh, is Jesse's concern valid in your implementation? I thought it was
>> actually extracting the inner header and you weren't making an assumption
>> about the source port?
> [Sugesh] It's not possible to do the inner packet extraction and lookup because
> the inner packet miniflow also includes metadata from the outer header. In the
> last implementation, the inner packet lookup matches on the hash plus a tunnel
> flag to find the flow.
> The proposed design may solve this by having the control plane insert two
> sets of rules in the datapath for VxLAN tunnel traffic.
>
> Set 1 (software fallback path)
> Rule 1: Outer tunnel header rule.
> Rule 2: Inner header rule.
> Set 1 consists purely of software rules and is expected to be present in the
> datapath until the hardware flow insertion is complete.
>
> Set 2 (software + hardware path)
> Rule 1: Outer header hardware flow director rule, programmed by the OVS
> control plane.
> Rule 2: Inner header software rule (matches on the miniflow from the inner
> header only, plus the hardware-reported tunnel ID). Please note that no
> tunnel metadata from the outer header is used here.
>
> To start off, OVS can verify the hardware capabilities when a user adds a
> port. For now these are only flow director and hashing.
> For every flow insertion request to/from the port, the control plane has to
> check whether it is a candidate for a hardware flow and insert the flows in
> the NIC accordingly.
> We can assume that every hardware flow is associated with an ID and a queue
> (or either one of them).
> The flow lookup and matching logic has to be changed to handle these
> parameters too.
> Similarly, this should be represented in OpenFlow as well. Can we use any
> OpenFlow registers for this?
> Please let me know your thoughts on this.

I think this should really be transparent to the OpenFlow layer - if
the device is capable of doing these matches, then the performance
will seamlessly increase, otherwise things work the same as before
with software. That being said, the actual datapath flow keys (either
hardware or software) are internal to OVS and could have some
knowledge of this. So I don't think that it makes sense to use
OpenFlow registers but there could be some space used in struct flow.
This could also be used to do a lookup and population of tunnel
metadata in rule #2.
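
As a rough illustration of that last point, here is a minimal sketch (in
Python rather than OVS's C, purely for brevity) of how a hardware-reported
flow ID could be used to look up and populate tunnel metadata, with the
software parse as the fallback. Everything here is an assumption made for
illustration: `TunnelMetadata`, `Packet`, `HwFlowTable` and their fields are
hypothetical names, not existing OVS or DPDK structures.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class TunnelMetadata:
    """Outer-header fields that rule #2 needs (hypothetical subset)."""
    tun_id: int
    ip_src: str
    ip_dst: str


@dataclass
class Packet:
    # What a software parse of the outer header would yield.
    outer: TunnelMetadata
    # ID reported by the NIC; None if no hardware rule matched.
    hw_flow_id: Optional[int] = None


class HwFlowTable:
    """Maps a hardware-reported flow ID to the tunnel metadata it stands for."""

    def __init__(self) -> None:
        self._by_id: Dict[int, TunnelMetadata] = {}

    def install(self, hw_flow_id: int, md: TunnelMetadata) -> None:
        self._by_id[hw_flow_id] = md

    def resolve(self, pkt: Packet) -> Tuple[TunnelMetadata, str]:
        """Return (tunnel metadata, path taken) for a received packet."""
        if pkt.hw_flow_id is not None:
            md = self._by_id.get(pkt.hw_flow_id)
            if md is not None:
                return md, "hardware"
        # No ID, or an ID we do not know: behave exactly as before.
        return pkt.outer, "software"
```

The property that matters is that both paths hand the same metadata to the
inner-header lookup, so nothing above the datapath can tell which one ran.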

We also need to make sure that there aren't any behavior changes
regardless of what fields in the flow are used. For example, if there
is a flow with an ACL on a field that the NIC doesn't support, it's
important to ensure that the original behavior is retained. However,
if it is the outer header that is being offloaded to hardware, complex
matches are relatively rare there so it would probably be reasonable
to detect this and drop back to software. A fairly conservative
algorithm would probably keep this pretty simple and still allow
offloading in almost all real world cases.
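
One possible shape for such a conservative check, again only as a sketch in
Python: offload the outer-header rule only when every field it matches on is
one the NIC reports as supported, and otherwise stay in software. The field
names and the `NIC_SUPPORTED` set are hypothetical placeholders; a real
implementation would take them from whatever capability information the
device exposes.

```python
from typing import FrozenSet, Iterable

# Hypothetical set of outer-header fields a given NIC can match on.
NIC_SUPPORTED: FrozenSet[str] = frozenset(
    {"outer_ip_src", "outer_ip_dst", "outer_udp_dst", "tun_id"}
)


def choose_path(outer_match_fields: Iterable[str],
                nic_supported: FrozenSet[str] = NIC_SUPPORTED) -> str:
    """Pick "hardware" only if the NIC supports every matched field."""
    fields = frozenset(outer_match_fields)
    # An empty match or any unsupported field keeps the flow in software,
    # so the original forwarding behavior is retained.
    if fields and fields <= nic_supported:
        return "hardware"
    return "software"
```

For example, a rule matching only on `outer_ip_dst` and `tun_id` would be
offloaded, while one that also carries an ACL on a field outside the set
would fall back to software as a whole.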

Finally, this is starting to sound like a specific case of a technique
that I mentioned previously on the original patch set. A generalized
version might actually be easier to implement and have uses in more
cases. There is a better description of the technique in section 6 of
this paper:
https://www.usenix.org/system/files/conference/atc16/atc16_paper_jackson.pdf
