On Fri, Sep 05, 2014 at 12:07:17PM -0700, Jesse Gross wrote:
> On Thu, Sep 4, 2014 at 12:28 AM, Simon Horman
> <simon.hor...@netronome.com> wrote:
> > On Tue, Sep 02, 2014 at 07:20:30PM -0700, Pravin Shelar wrote:
> >> On Tue, Sep 2, 2014 at 6:55 PM, Jesse Gross <je...@nicira.com> wrote:
> >> > On Mon, Sep 1, 2014 at 1:10 AM, Simon Horman 
> >> > <simon.hor...@netronome.com> wrote:
> >> >> On Thu, Aug 28, 2014 at 10:12:49AM +0900, Simon Horman wrote:
> >> >>> On Wed, Aug 27, 2014 at 03:03:53PM -0500, Jesse Gross wrote:
> >> >>> > On Wed, Aug 27, 2014 at 11:51 AM, Ben Pfaff <b...@nicira.com> wrote:
> >> >>> > > On Wed, Aug 27, 2014 at 10:26:14AM +0900, Simon Horman wrote:
> >> >>> > >> On Fri, Aug 22, 2014 at 08:30:08AM -0700, Ben Pfaff wrote:
> >> >>> > >> > On Fri, Aug 22, 2014 at 09:19:41PM +0900, Simon Horman wrote:
> >> >>> > >> What we would like to do is to provide something generally useful
> >> >>> > >> which may be used as appropriate to:
> >> >>> > >
> >> >>> > > I'm going to skip past these ideas, which do sound interesting,
> >> >>> > > because I think that they're more for Pravin and Jesse than for
> >> >>> > > me.  I hope that they will provide some reactions to them.
> >> >>> >
> >> >>> > For the hardware offloading piece in particular, I would take a look
> >> >>> > at the discussion that has been going on in the netdev mailing list.
> >> >>> > I think the general consensus (to the extent that there is one) is
> >> >>> > that the hardware offload interface should be a block outside of OVS
> >> >>> > and then OVS (most likely from userspace) configures it.
> >> >>>
> >> >>> Thanks, I am now digesting that conversation.
> >> >>
> >> >> A lively conversation indeed.
> >> >>
> >> >> We are left with two questions for you:
> >> >>
> >> >> 1. Would you look at a proposal (I have some rough code that even works)
> >> >>    for a select group action in the datapath prior to the finalisation
> >> >>    of the question of offloads infrastructure in the kernel?
> >> >>
> >> >>    From our point of view we would ultimately like to use such an
> >> >>    action to offload to hardware. But it seems that there might be
> >> >>    use-cases (not the one that I have rough code for) where such an
> >> >>    action may be useful, for example to allow parts of IPVS to be
> >> >>    used to provide stateful load balancing.
> >> >>
> >> >>    Put another way: it doesn't seem that a select group action is
> >> >>    dependent on offloads, though there are cases where they could be
> >> >>    used together.
> >> >
> >> > I agree that this is orthogonal to offloading and seems fine to do
> >> > now. It seems particularly nice if we can use IPVS in a clean way,
> >> > similar to what is currently being worked on for connection tracking.
> >> >
> >> > I guess I'm not entirely sure how you plan to offload this to hardware
> >> > so it's hard to say how it would intersect in the future. However, the
> >> > current plan is to have offloading be directed from a higher point
> >> > (i.e. userspace) and have the OVS kernel module remain a software path
> >> > so probably it doesn't really matter.
> >> >
> >> > However, I'll let Pravin comment since he'll be the one reviewing the code.
> >
> > Ok, my reading of the recent offload thread, which is somewhat clouded by
> > preconceptions, is that offloads could be handled by hooks in the datapath.
> > But I understand other ideas are also under discussion; indeed, that is
> > clearer to me now that you have pointed it out. Thanks.
> 
> I'm curious about what exactly you are trying to offload though.
> Is it the actual group selection operation? Is it the whole datapath
> and these use cases happen to contain groups?

We would like to offload the entire flow, including the group selection
operation. The reason for this is to avoid the extra flow setup cost
involved when selection occurs in user-space.

Although it is conceivable that an entire datapath could be offloaded
by a Netronome flow processor, I think it is more practical to allow
offloads at a finer granularity: for instance, some actions may not yet
be implemented in the code that runs on the flow processor.

> >> I agree it is good to integrate datapath with IPVS. I would like to
> >> see the design proposal.
> >
> > So far I have got as far as a prototype select group action for the
> > datapath. In its current incarnation it just implements a hash,
> > using the RSS hash.
> >
> > The attributes of a select group action are one or more nested
> > bucket attributes, and each bucket attribute contains a weight attribute
> > and nested action attributes. I have it in mind to add a selection method
> > attribute to the select group action, as per my proposal for
> > OpenFlow[1].
> >
> > As such, the current use of a hash to select buckets is not
> > particularly important, as I would like to support the provision of
> > implementations of multiple selection methods.
> >
> > I have not yet fleshed out an IPVS proposal. But my general idea
> > is that when the datapath executes a select group action for
> > an IPVS group it would call the IPVS scheduler (the IPVS term
> > for its connection tracker) to determine where to forward a packet.
> >
> > On the IPVS side this would probably require adding support for zones,
> > so that the entries relating to OVS would be separate from anything
> > else it is doing.
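
(As an aside, to make the nesting I describe above a little more concrete,
the sketch below is roughly the attribute layout I have in mind. The names
are purely illustrative rather than a proposed UAPI; the prototype may well
differ.)

    /* Illustrative only -- not the prototype's actual definitions. */
    enum ovs_select_group_attr {
        OVS_SELECT_GROUP_ATTR_UNSPEC,
        OVS_SELECT_GROUP_ATTR_METHOD,   /* u32: selection method, e.g. hash */
        OVS_SELECT_GROUP_ATTR_BUCKET,   /* nested bucket; may appear once per bucket */
        __OVS_SELECT_GROUP_ATTR_MAX
    };

    enum ovs_select_group_bucket_attr {
        OVS_BUCKET_ATTR_UNSPEC,
        OVS_BUCKET_ATTR_WEIGHT,         /* u32: relative weight of the bucket */
        OVS_BUCKET_ATTR_ACTIONS,        /* nested: actions to execute if selected */
        __OVS_BUCKET_ATTR_MAX
    };
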
> 
> This sounds an awful lot like a cross between how bonding is
> implemented (which I think is pretty much the same as the RSS hash
> backed version that you describe above) and an IPVS version of the
> connection tracking proposal that Justin sent out recently. Both of
> these use recirculation as the "select group" action.

Yes, now that you mention it, there are strong similarities.

From my point of view the conntrack proposal I am familiar with
lacks the ability for the connection tracker to return details
of the selected end-point to user-space. But I think that could be resolved.

> I know you said that this might lead to a large number of flows post
> selection but I'm not sure why this is inherently true (or that it can't
> be mitigated).

The scenario I am thinking of is something like this:

1. Pre-recirculation flow
   * match:   proto=ip/0xffff,ip_dst=a.b.c.d/255.255.255.255,tp_dst=80/0xffff
   * actions: ...,recirc_id=X,recirculate

2. Post-recirculation flow

   Supposing that stateful L4 load balancing is used as the selection method,
   it seems to me that the resulting flow would need to do an exact
   match on all fields of the 5-tuple.

   e.g.:
   * match:   recirc_id=X,proto=ip/0xffff,ip_dst=a.b.c.d/255.255.255.255,
              ip_src=e.f.g.h/255.255.255.255,tp_dst=80/0xffff,tp_src=p/0xffff
   * actions: output:3

   So I see that there would basically need to be a post-recirculation flow
   for each connection, each of which would need to be established via an
   upcall. This is what I meant by a large number of flows post selection.

   It is not obvious to me how this can be mitigated other than by
   having a selection algorithm that lends itself to masking, for example
   by basing its end-point selection on a masked ip_src (a rough sketch of
   what I mean is at the end of this mail). But I believe such an approach
   would lead to uneven balancing.

   It would be possible to not match on proto, ip_dst and tp_dst in
   the post-recirculation flow, as these are redundant due to the match on
   recirc_id. But I don't think that alters the number of flows that would
   be created.
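
To illustrate the masking idea mentioned above, here is only a sketch of the
kind of selection function I have in mind, not code from the prototype; the
mask and hash constant are arbitrary:

    #include <stdint.h>

    /* Select a bucket from a masked client address.  Every client within
     * the same masked range picks the same bucket, so a single masked
     * post-recirculation flow could cover the whole range.  The balance
     * then depends on how clients happen to be spread across the ranges
     * rather than on per-connection state, hence the uneven balancing
     * I refer to above. */
    static unsigned int
    select_bucket_masked(uint32_t ip_src, uint32_t mask, unsigned int n_buckets)
    {
        uint32_t key = ip_src & mask;              /* e.g. mask = 0xffffff00 */

        return (key * 2654435761u) % n_buckets;    /* simple multiplicative hash */
    }
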
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
