Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-06 Thread Yang, Yi
On Wed, Sep 06, 2017 at 04:03:29PM +0800, Hannes Frederic Sowa wrote:
> "Yang, Yi"  writes:
> >> 
> >> > If you check GENEVE implementation, tun_metadata* can be set or matched
> >> > as any other match field.
> >> 
> >> Yes, I wrote that in my previous mail. I wonder why NSH context metadata
> >> is not in tun_metadata as well?
> >
> > tun_metadata is tunnel metadata, GENEVE needs tunnel port, but NSH is
> > not so, NSH can't directly use tun_metadata, for MD type 2, we need to a
> > lot of rework on tun_metadata to make it shared between GENEVE and NSH,
> > I don't think this can happen in near term. So tun_metadata isn't option
> > for this now.
> 
> Sorry, I couldn't follow you. Why can't you store the context headers in
> tun_metadata exactly?

tun_metadata can work only if in_port and out_port are GENEVE tunnel
port, it is designed specially for GENEVE, Eth+NSH can work without
any tunnel port involved, context headers for NSH are not tunnel
metadata, they aren't tunnel-specific, you can't treat them as same
thing. In addtition, tun_metadata also can be matched and set, they have
the same issue as you concern.

> 
> [...]
> 
> Bye,
> Hannes


Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-06 Thread Hannes Frederic Sowa
Jan Scheurich  writes:

>> > There is no way we can re-use the existing TLV tunnel metadata
>> > infrastructure in OVS for matching and setting NSH MD2 TLV headers. We
>> > will need to introduce a new (perhaps similar) scheme for modelling
>> > generic TLV match registers in OVS that are assigned to protocol TLVs
>> > by the controller. This is FFS.
>> 
>> This is what I don't understand.
>> 
>> Why can't you just reuse the space in the struct sw_flow_key where
>> geneve would put in their metadata. There are 255 empty bytes at the
>> beginning if you don't have other tunnel metadata anyway.
>> 
>> If you receive packets over vxlan(gpe), tun_opts gets populated with an
>> ip_tunnel_key. Couldn't you use the options space in there after the
>> ip_tunnel_key to store the NSH context just for the sake of storing them
>> somewhere instead of adding 16 bytes to sw_flow_key?
>
> There is a significant conceptual difference between tunnel metadata
> (copied from a popped tunnel header) and packed match fields extracted
> during parsing of the packets. If we'd store them in the same space in
> the sw_flow_key struct, we are calling for trouble.
>
> NSH is transport agnostic, it should work over Ethernet, VXLAN(GPE)
> and other transport tunnels. Think about an NSH packet arriving on an
> Geneve tunnel port. Any Geneve tunnel options have already been stored
> in the tun_opts metadata bytes. Now the datapath parses the NSH header
> and overwrites the tun_opts metadata with the NSH metadata. This would
> break the OVS semantics.

Obviously you would use key->tun_opts_len and start appending there and
not simply overwrite. Otherwise that would be rather silly.

> I absolutely understand your concern about efficient space utilization
> in the flow struct for TLV match fields and it will be part of the
> design challenge for MD2 TLV support to find a good balance between
> memory and run-time efficiency. But that is FFS. For the four fixed
> size MD1 headers the decision has been to include them as additional
> attributes in the flow key.

Okay, then.

Bye,
Hannes


RE: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-06 Thread Jan Scheurich
> > There is no way we can re-use the existing TLV tunnel metadata
> > infrastructure in OVS for matching and setting NSH MD2 TLV headers. We
> > will need to introduce a new (perhaps similar) scheme for modelling
> > generic TLV match registers in OVS that are assigned to protocol TLVs
> > by the controller. This is FFS.
> 
> This is what I don't understand.
> 
> Why can't you just reuse the space in the struct sw_flow_key where
> geneve would put in their metadata. There are 255 empty bytes at the
> beginning if you don't have other tunnel metadata anyway.
> 
> If you receive packets over vxlan(gpe), tun_opts gets populated with an
> ip_tunnel_key. Couldn't you use the options space in there after the
> ip_tunnel_key to store the NSH context just for the sake of storing them
> somewhere instead of adding 16 bytes to sw_flow_key?

There is a significant conceptual difference between tunnel metadata (copied 
from a popped tunnel header) and packed match fields extracted during parsing 
of the packets. If we'd store them in the same space in the sw_flow_key struct, 
we are calling for trouble.

NSH is transport agnostic, it should work over Ethernet, VXLAN(GPE) and other 
transport tunnels. Think about an NSH packet arriving on an Geneve tunnel port. 
Any Geneve tunnel options have already been stored in the tun_opts metadata 
bytes. Now the datapath parses the NSH header and overwrites the tun_opts 
metadata with the NSH metadata. This would break the OVS semantics.

I absolutely understand your concern about efficient space utilization in the 
flow struct for TLV match fields and it will be part of the design challenge 
for MD2 TLV support to find a good balance between memory and run-time 
efficiency. But that is FFS. For the four fixed size MD1 headers the decision 
has been to include them as additional attributes in the flow key.

BR, Jan






Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-06 Thread Hannes Frederic Sowa
Jan Scheurich  writes:

>> >> Yes, I wrote that in my previous mail. I wonder why NSH context metadata
>> >> is not in tun_metadata as well?
>> >
>> > tun_metadata is tunnel metadata, GENEVE needs tunnel port, but NSH is
>> > not so, NSH can't directly use tun_metadata, for MD type 2, we need to a
>> > lot of rework on tun_metadata to make it shared between GENEVE and NSH,
>> > I don't think this can happen in near term. So tun_metadata isn't option
>> > for this now.
>> 
>> Sorry, I couldn't follow you. Why can't you store the context headers in
>> tun_metadata exactly?
>> 
>
> I think we mixing things. Let me try to clarify:
>
> 1. NSH context metadata has end-to-end significance for the SFP. They
> must be part of the NSH header and cannot be transported as tunnel
> metadata, because transport tunnels (e.g. Geneve) only connect pairs
> of SFFs in the path.

No questions asked. I am not talking about a design choice of the
protocol but an implementation detail of the patch.

> So we need OVS to be able to match on and set NSH context header
> fields, also for MD2 TLVs in the future.

So be it.

> 2. OVS today has support for matching on TLV tunnel metadata after
> termination of a Geneve tunnel. This infrastructure is only usable for
> OVS tunnel ports (like Geneve) but not for matching on TLV headers of
> the NSH protocol, which is not modelled as an OVS tunnel port but
> handled in the OpenFlow pipeline (with generic encp/decap actions to
> enter/terminate an NSH SFP). This was a strategic decision by the OVS
> community two years ago.

I am talking about the tun_opts field in the sw_flow_keys structure for
the kernel dp only.

> There is no way we can re-use the existing TLV tunnel metadata
> infrastructure in OVS for matching and setting NSH MD2 TLV headers. We
> will need to introduce a new (perhaps similar) scheme for modelling
> generic TLV match registers in OVS that are assigned to protocol TLVs
> by the controller. This is FFS.

This is what I don't understand.

Why can't you just reuse the space in the struct sw_flow_key where
geneve would put in their metadata. There are 255 empty bytes at the
beginning if you don't have other tunnel metadata anyway.

If you receive packets over vxlan(gpe), tun_opts gets populated with an
ip_tunnel_key. Couldn't you use the options space in there after the
ip_tunnel_key to store the NSH context just for the sake of storing them
somewhere instead of adding 16 bytes to sw_flow_key?

Thanks,
Hannes


RE: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-06 Thread Jan Scheurich
> >> Yes, I wrote that in my previous mail. I wonder why NSH context metadata
> >> is not in tun_metadata as well?
> >
> > tun_metadata is tunnel metadata, GENEVE needs tunnel port, but NSH is
> > not so, NSH can't directly use tun_metadata, for MD type 2, we need to a
> > lot of rework on tun_metadata to make it shared between GENEVE and NSH,
> > I don't think this can happen in near term. So tun_metadata isn't option
> > for this now.
> 
> Sorry, I couldn't follow you. Why can't you store the context headers in
> tun_metadata exactly?
> 

I think we mixing things. Let me try to clarify:

1. NSH context metadata has end-to-end significance for the SFP. They must be 
part of the NSH header and cannot be transported as tunnel metadata, because 
transport tunnels (e.g. Geneve) only connect pairs of SFFs in the path. So we 
need OVS to be able to match on and set NSH context header fields, also for MD2 
TLVs in the future. 

2. OVS today has support for matching on TLV tunnel metadata after termination 
of a Geneve tunnel. This infrastructure is only usable for OVS tunnel ports 
(like Geneve) but not for matching on TLV headers of the NSH protocol, which is 
not modelled as an OVS tunnel port but handled in the OpenFlow pipeline (with 
generic encp/decap actions to enter/terminate an NSH SFP). This was a strategic 
decision by the OVS community two years ago.

There is no way we can re-use the existing TLV tunnel metadata infrastructure 
in OVS for matching and setting NSH MD2 TLV headers. We will need to introduce 
a new (perhaps similar) scheme for modelling generic TLV match registers in OVS 
that are assigned to protocol TLVs by the controller. This is FFS.

BR, Jan



Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-06 Thread Hannes Frederic Sowa
"Yang, Yi"  writes:

> On Tue, Sep 05, 2017 at 09:12:09PM +0800, Hannes Frederic Sowa wrote:
>> "Yang, Yi"  writes:
>> 
>> > We can change this later if we really find a better way to handle this
>> > because it isn't defined in include/uapi/linux/openvswitch.h, so I still
>> > have backdoor to do this if needed :-)
>> 
>> Sorry, I can't follow you.
>> 
>> It doesn't matter if something is defined in uapi headers, the
>> observable behavior matters. If you allow users to configure flows with
>> specific fields, it should not stop working at a future point in time.
>
> Anyway this can be changed if it is really needed, so far current way is
> the best one we can come up with, we would like to use it if you can
> have better proposal. We have explained repeatedly context headers must
> be matched and set, this is bottom line.

I understand that in the way you use those context fields you will have
to match on those. I understand that there is no way around that.

>> > For our sfc use case in Opendaylight, we use context[0] for tunnel ID,
>> > context[1] for destination IP for reverse RSP, they are used to match
>> > and set in OpenFlow table, you can't limit users to use them in such
>> > ways.
>> 
>> So in your specific case you expect the masks to be completely stable
>> because you defined a protocol on top of NSH, understood. And that is
>> stable accross all possible paths. Understood as well.
>> 
>> > If you check GENEVE implementation, tun_metadata* can be set or matched
>> > as any other match field.
>> 
>> Yes, I wrote that in my previous mail. I wonder why NSH context metadata
>> is not in tun_metadata as well?
>
> tun_metadata is tunnel metadata, GENEVE needs tunnel port, but NSH is
> not so, NSH can't directly use tun_metadata, for MD type 2, we need to a
> lot of rework on tun_metadata to make it shared between GENEVE and NSH,
> I don't think this can happen in near term. So tun_metadata isn't option
> for this now.

Sorry, I couldn't follow you. Why can't you store the context headers in
tun_metadata exactly?

[...]

Bye,
Hannes


Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-05 Thread Yang, Yi
On Tue, Sep 05, 2017 at 09:12:09PM +0800, Hannes Frederic Sowa wrote:
> "Yang, Yi"  writes:
> 
> > We can change this later if we really find a better way to handle this
> > because it isn't defined in include/uapi/linux/openvswitch.h, so I still
> > have backdoor to do this if needed :-)
> 
> Sorry, I can't follow you.
> 
> It doesn't matter if something is defined in uapi headers, the
> observable behavior matters. If you allow users to configure flows with
> specific fields, it should not stop working at a future point in time.

Anyway this can be changed if it is really needed, so far current way is
the best one we can come up with, we would like to use it if you can
have better proposal. We have explained repeatedly context headers must
be matched and set, this is bottom line.

> 
> > For our sfc use case in Opendaylight, we use context[0] for tunnel ID,
> > context[1] for destination IP for reverse RSP, they are used to match
> > and set in OpenFlow table, you can't limit users to use them in such
> > ways.
> 
> So in your specific case you expect the masks to be completely stable
> because you defined a protocol on top of NSH, understood. And that is
> stable accross all possible paths. Understood as well.
> 
> > If you check GENEVE implementation, tun_metadata* can be set or matched
> > as any other match field.
> 
> Yes, I wrote that in my previous mail. I wonder why NSH context metadata
> is not in tun_metadata as well?

tun_metadata is tunnel metadata, GENEVE needs tunnel port, but NSH is
not so, NSH can't directly use tun_metadata, for MD type 2, we need to a
lot of rework on tun_metadata to make it shared between GENEVE and NSH,
I don't think this can happen in near term. So tun_metadata isn't option
for this now.

> 
> > Actually the most important information in NSH are just these context
> > headers, you can't limit imagination of users by removing them from flow
> > keys.
> >
> > My point is to bring miniflow into kernel data path to fix your concern,
> > this will benefit your employer directly :-)
> 
> Okay, interesting. It will probably not help if you still have a hash of
> a packet inside the flow table and use that for load balancing.
> 
> [...]
> 
> BTW I don't want to stop this patch, I am merely interested in how the
> bigger picture will look like in the end.
> 
> Bye,
> Hannes


Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-05 Thread Hannes Frederic Sowa
Hello Jan,

Jan Scheurich  writes:

> Please have a look at the Google doc that sketches the overall
> solution to support NSH in OVS.
> https://drive.google.com/open?id=1oWMYUH8sjZJzWa72o2q9kU0N6pNE-rwZcLH3-kbbDR8
>
> In details it is slightly outdated but the NSH MD1 support (and all
> prerequisites like PTAP and Generic Encap/Decap) have been implemented
> in OVS 2.8 (ofproto layer and the userspace datapath).
>
> The NSH use cases are discussed in the chapter "Support for NSH". OVS
> is primarily targeting the Classifier and SFF use cases. Obviously the
> classifier must be able to set the MD1 context headers. The final SFF
> must be able to inspect the context headers, typically to determine
> the L2 or L3 forwarding context (e.g. VRF) in which to forward the
> decapsulated packet on to its final destination.

>From Yi Yang's mail I understood that you put a tunnel ID into
context[0]. In this case, I can understand that you want to match on
that. I was under the impression that the combination of tenant id from
the vxlan-gpe header and the path id plus ttl would give enough state
for the SDN to keep state on where the packet is in the network. I don't
understand what a tunnel id is.

I understood that the context fields are usable by the service function
chains, but they are actually in use by the outer SDN and basically
standardized.

> This information has end-to-end significance for the SFP and is
> encoded by the classifier in the NSH context headers. It cannot be put
> into transport tunnel options of e.g. a Geneve tunnel connecting two
> SFFs along the SFP.

Because the TLVs are not end to end in the geneve case? Yes, this makes
sense.

> We are now trying to establish feature parity in the kernel
> datapath. If the kernel datapath chooses not to support the MD1
> fields, OVS with kernel datapath will not be able to address the
> classifier and final SFF use cases.

As I understand this now, you are designing some kind of protocol or
particular implementation on top of NSH. Thus you don't expect masks to
grow limitless and you don't allow any SFC elements to take part in the
decision what to communicate over the context fields.

I still wonder about the hash as part of the NSH context and how that
fits into the picture, which looked like the only standardized
application of context headers from your google doc.

Btw., I don't want to block this patch.

Bye,
Hannes


Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-05 Thread Hannes Frederic Sowa
"Yang, Yi"  writes:

> On Tue, Sep 05, 2017 at 12:30:09PM +0200, Hannes Frederic Sowa wrote:
>> "Yang, Yi"  writes:
>> 
>> > I'm not sure what new action you expect to bring here, I think group
>> > action is just for this, as you said it isn't only bound to NSH, you can
>> > start a new thread to discuss this. I don't think it is in scope of NSH.
>> 
>> It is in scope of this discussion as you will provide a user space API
>> that makes the NSH context fields accessible from user space in a
>> certain way. If you commit to this, there is no way going back.
>
> We can change this later if we really find a better way to handle this
> because it isn't defined in include/uapi/linux/openvswitch.h, so I still
> have backdoor to do this if needed :-)

Sorry, I can't follow you.

It doesn't matter if something is defined in uapi headers, the
observable behavior matters. If you allow users to configure flows with
specific fields, it should not stop working at a future point in time.

>> I haven't yet grasped the idea on how those fields will be used in OVS
>> besides load balancing. Even for load balancing the tunnel itself
>> (vxlan-gpe + UDP source port or ipv6 flowlabel) already provides enough
>> entropy to do per-flow load balancing. What else is needed?  Why a
>> context header for that? You just need multiple action chains and pick
>> one randomly.
>
> For our sfc use case in Opendaylight, we use context[0] for tunnel ID,
> context[1] for destination IP for reverse RSP, they are used to match
> and set in OpenFlow table, you can't limit users to use them in such
> ways.

So in your specific case you expect the masks to be completely stable
because you defined a protocol on top of NSH, understood. And that is
stable accross all possible paths. Understood as well.

> If you check GENEVE implementation, tun_metadata* can be set or matched
> as any other match field.

Yes, I wrote that in my previous mail. I wonder why NSH context metadata
is not in tun_metadata as well?

> Actually the most important information in NSH are just these context
> headers, you can't limit imagination of users by removing them from flow
> keys.
>
> My point is to bring miniflow into kernel data path to fix your concern,
> this will benefit your employer directly :-)

Okay, interesting. It will probably not help if you still have a hash of
a packet inside the flow table and use that for load balancing.

[...]

BTW I don't want to stop this patch, I am merely interested in how the
bigger picture will look like in the end.

Bye,
Hannes


RE: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-05 Thread Jan Scheurich
Hi Hannes,

Please have a look at the Google doc that sketches the overall solution to 
support NSH in OVS. 
https://drive.google.com/open?id=1oWMYUH8sjZJzWa72o2q9kU0N6pNE-rwZcLH3-kbbDR8

In details it is slightly outdated but the NSH MD1 support (and all 
prerequisites like PTAP and Generic Encap/Decap) have been implemented in OVS 
2.8 (ofproto layer and the userspace datapath). 

The NSH use cases are discussed in the chapter "Support for NSH". OVS is 
primarily targeting the Classifier and SFF use cases. Obviously the classifier 
must be able to set the MD1 context headers. The final SFF must be able to 
inspect the context headers, typically to determine the L2 or L3 forwarding 
context (e.g. VRF) in which to forward the decapsulated packet on to its final 
destination.

This information has end-to-end significance for the SFP and is encoded by the 
classifier in the NSH context headers. It cannot be put into transport tunnel 
options of e.g. a Geneve tunnel connecting two SFFs along the SFP.

We are now trying to establish feature parity in the kernel datapath. If the 
kernel datapath chooses not to support the MD1 fields, OVS with kernel datapath 
will not be able to address the classifier and final SFF use cases.

BR, Jan

> -Original Message-
> From: Hannes Frederic Sowa [mailto:han...@stressinduktion.org]
> Sent: Tuesday, 05 September, 2017 12:30
> To: Yang, Yi <yi.y.y...@intel.com>
> Cc: netdev@vger.kernel.org; d...@openvswitch.org; jb...@redhat.com; 
> e...@erig.me; b...@ovn.org; Jan Scheurich
> <jan.scheur...@ericsson.com>
> Subject: Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support
> 
> "Yang, Yi" <yi.y.y...@intel.com> writes:
> 
> > I'm not sure what new action you expect to bring here, I think group
> > action is just for this, as you said it isn't only bound to NSH, you can
> > start a new thread to discuss this. I don't think it is in scope of NSH.
> 
> It is in scope of this discussion as you will provide a user space API
> that makes the NSH context fields accessible from user space in a
> certain way. If you commit to this, there is no way going back.
> 
> I haven't yet grasped the idea on how those fields will be used in OVS
> besides load balancing. Even for load balancing the tunnel itself
> (vxlan-gpe + UDP source port or ipv6 flowlabel) already provides enough
> entropy to do per-flow load balancing. What else is needed?  Why a
> context header for that? You just need multiple action chains and pick
> one randomly.
> 
> The only protocol that I can compare that to is geneve with TLVs, but
> the TLVs are global and uniquie and a property of the networking
> forwarding backplane and not a property of the path inside a tenant. So
> I expect this actually to be the first case where I think that matters.
> 
> Why are context labels that special that they are not part of tun_ops?
> 
> Thanks,
> Hannes


Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-05 Thread Yang, Yi
On Tue, Sep 05, 2017 at 12:30:09PM +0200, Hannes Frederic Sowa wrote:
> "Yang, Yi"  writes:
> 
> > I'm not sure what new action you expect to bring here, I think group
> > action is just for this, as you said it isn't only bound to NSH, you can
> > start a new thread to discuss this. I don't think it is in scope of NSH.
> 
> It is in scope of this discussion as you will provide a user space API
> that makes the NSH context fields accessible from user space in a
> certain way. If you commit to this, there is no way going back.

We can change this later if we really find a better way to handle this
because it isn't defined in include/uapi/linux/openvswitch.h, so I still
have backdoor to do this if needed :-)

> 
> I haven't yet grasped the idea on how those fields will be used in OVS
> besides load balancing. Even for load balancing the tunnel itself
> (vxlan-gpe + UDP source port or ipv6 flowlabel) already provides enough
> entropy to do per-flow load balancing. What else is needed?  Why a
> context header for that? You just need multiple action chains and pick
> one randomly.

For our sfc use case in Opendaylight, we use context[0] for tunnel ID,
context[1] for destination IP for reverse RSP, they are used to match
and set in OpenFlow table, you can't limit users to use them in such
ways.

If you check GENEVE implementation, tun_metadata* can be set or matched
as any other match field.

Actually the most important information in NSH are just these context
headers, you can't limit imagination of users by removing them from flow
keys.

My point is to bring miniflow into kernel data path to fix your concern,
this will benefit your employer directly :-)

I'm just curious your company has hardware to offload NSH? Which
company are you from?

> 
> The only protocol that I can compare that to is geneve with TLVs, but
> the TLVs are global and uniquie and a property of the networking
> forwarding backplane and not a property of the path inside a tenant. So
> I expect this actually to be the first case where I think that matters.
> 
> Why are context labels that special that they are not part of tun_ops?
> 
> Thanks,
> Hannes


Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-05 Thread Hannes Frederic Sowa
"Yang, Yi"  writes:

> I'm not sure what new action you expect to bring here, I think group
> action is just for this, as you said it isn't only bound to NSH, you can
> start a new thread to discuss this. I don't think it is in scope of NSH.

It is in scope of this discussion as you will provide a user space API
that makes the NSH context fields accessible from user space in a
certain way. If you commit to this, there is no way going back.

I haven't yet grasped the idea on how those fields will be used in OVS
besides load balancing. Even for load balancing the tunnel itself
(vxlan-gpe + UDP source port or ipv6 flowlabel) already provides enough
entropy to do per-flow load balancing. What else is needed?  Why a
context header for that? You just need multiple action chains and pick
one randomly.

The only protocol that I can compare that to is geneve with TLVs, but
the TLVs are global and uniquie and a property of the networking
forwarding backplane and not a property of the path inside a tenant. So
I expect this actually to be the first case where I think that matters.

Why are context labels that special that they are not part of tun_ops?

Thanks,
Hannes


Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-04 Thread Yang, Yi
On Mon, Sep 04, 2017 at 07:22:26PM +0800, Hannes Frederic Sowa wrote:
> Hello,
> 
> "Yang, Yi"  writes:
> 
> > On Wed, Aug 30, 2017 at 05:53:27PM +0800, Hannes Frederic Sowa wrote:
> >> Hello,
> >> 
> >> Yi Yang  writes:
> >> 
> >> [...]
> >> 
> >> > +struct ovs_key_nsh {
> >> > +u8 flags;
> >> > +u8 ttl;
> >> > +u8 mdtype;
> >> > +u8 np;
> >> > +__be32 path_hdr;
> >> > +__be32 context[NSH_MD1_CONTEXT_SIZE];
> >> > +};
> >> > +
> >> >  struct sw_flow_key {
> >> >  u8 tun_opts[IP_TUNNEL_OPTS_MAX];
> >> >  u8 tun_opts_len;
> >> > @@ -144,6 +154,7 @@ struct sw_flow_key {
> >> >  };
> >> >  } ipv6;
> >> >  };
> >> > +struct ovs_key_nsh nsh; /* network service header */
> >> >  struct {
> >> >  /* Connection tracking fields not packed above. */
> >> >  struct {
> >> 
> >> Does it makes sense to keep the context headers as part of the flow?
> >> What is the reasoning behind it? With mdtype 2 headers this might either
> >> not work very well or will increase sw_flow_key size causing slowdowns
> >> for all protocols.
> >
> > For userspace, miniflow can handle such issue, but kernel data path
> > didn't provide such mechanism, so I don't think we can think of a better
> > way to fix this.
> 
> I do think that both, kernel and dpdk dp, are fine if you don't match on
> context fields, which I think is the common case.
> 
> As soon as a few paths start to match on contexts at different offsets I
> am not sure if miniflow will actually help. I don't know enough about
> that. Set cover is a NP hard problem, so constructing plain simple flows
> will be an approximation somewhere anyway at some point.
> 
> If the context values are in some way discrete it might make sense, but
> for load balancing purposes your ingress router might fill out one
> context field with a flow hash of the inner flow to allow ECMP or
> loadbalancing. You might want to treat that separately. Otherwise the
> different paths might always need to context[3] & 0x3 (bits as number of
> possible next hops) and put that into the flow state. Because every hop
> and every path might have different numbers of nexthops etc., I don't
> think this is easy unifiable anymore.

Anyway, this is a common issue but not NSH-implemention-specific issue.
In my perspective, kernel data path won't have better perfromance than
userspace DPDK data path, maybe you're considering hardware offload, but
so far there isn't an infrastructure in OVS to offload NSH.

> 
> > We have to have context headers in flow for match and set, every hop in
> > service function chaining can put its metedata into context headers on
> > demand, MD type 2 is much more complicated than you can imagine, it is
> > impossible to use an uniform way to handle MD type 1 and MD type 2, our
> > way is to keep MD type 1 keys in struct ovs_key_nsh and to reuse struct
> > flow_tnl for MD type 2 metadata, in case of MD type 2, we can set
> > context headers to 0 in struct ovs_key_nsh.
> 
> Understood and agreed.
> 
> > I beleive every newly-added key will result in similiar issue you
> > concern, maybe it will be better to introduce miniflow in kernel data
> > path, but it isn't in scope of this patch series.
> 
> You will see the same problem with offloading flows e.g. to network
> cards very soon. Flow explosion here will cause your 100Gbit/s NIC
> suddenly to go to software slow path.

Do you mean the whole OVS data path offload? As I said, this is a common
issue, it isn't NSH-specific.

> 
> > It will be great if you can have a better proposal to fix your concern.
> 
> I do think that something alike miniflow might make sense, but I don't
> know enough about miniflow.
> 
> New ACTIONS, that are not only bound to NSH, seem appealing to me at the
> moment. E.g. have a subtable match in the kernel dp or allowing for some
> OXM % modulo -> select action or neighbor alike thing would probably
> cover a lot of cases in the beginning. The submatch would probably very
> much reassemble the miniflow in dpdk dp.

I'm not sure what new action you expect to bring here, I think group
action is just for this, as you said it isn't only bound to NSH, you can
start a new thread to discuss this. I don't think it is in scope of NSH.

> 
> Thanks,
> Hannes


RE: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-04 Thread Jan Scheurich
> >> Does it makes sense to keep the context headers as part of the flow?
> >> What is the reasoning behind it? With mdtype 2 headers this might either
> >> not work very well or will increase sw_flow_key size causing slowdowns
> >> for all protocols.
> >
> > For userspace, miniflow can handle such issue, but kernel data path
> > didn't provide such mechanism, so I don't think we can think of a better
> > way to fix this.
> 
> I do think that both, kernel and dpdk dp, are fine if you don't match on
> context fields, which I think is the common case.

Agree

> As soon as a few paths start to match on contexts at different offsets I
> am not sure if miniflow will actually help. I don't know enough about
> that. Set cover is a NP hard problem, so constructing plain simple flows
> will be an approximation somewhere anyway at some point.
> 
> If the context values are in some way discrete it might make sense, but
> for load balancing purposes your ingress router might fill out one
> context field with a flow hash of the inner flow to allow ECMP or
> loadbalancing. You might want to treat that separately. Otherwise the
> different paths might always need to context[3] & 0x3 (bits as number of
> possible next hops) and put that into the flow state. Because every hop
> and every path might have different numbers of nexthops etc., I don't
> think this is easy unifiable anymore.

Yes, that would become a scaling challenge for the datapaths and also for 
schemes to off-load that datapath to HW. I believe it requires discipline
on the controller side to not let the number of needed masks explode for
OVS.

> > I believe every newly-added key will result in similiar issue you
> > concern, maybe it will be better to introduce miniflow in kernel data
> > path, but it isn't in scope of this patch series.
> 
> You will see the same problem with offloading flows e.g. to network
> cards very soon. Flow explosion here will cause your 100Gbit/s NIC
> suddenly to go to software slow path.
> 
> > It will be great if you can have a better proposal to fix your concern.
> 
> I do think that something alike miniflow might make sense, but I don't
> know enough about miniflow.

Struct miniflow in the netdev datapath is merely a compact representation
of struct flow. struct flow is partitioned in to 64-bit "stripes" and struct
miniflow only stores those 64-bit bit stripes that have a non-zero mask stripe.
It reduces the memory footprint for storing packet flow and megaflow entries
but does not provide any megaflow lookup performance as such.

> New ACTIONS, that are not only bound to NSH, seem appealing to me at the
> moment. E.g. have a subtable match in the kernel dp or allowing for some
> OXM % modulo -> select action or neighbor alike thing would probably
> cover a lot of cases in the beginning. The submatch would probably very
> much reassemble the miniflow in dpdk dp.

I would like to discuss your ideas to optimize scalable load-sharing in OVS.
But I think we should do that in a separate mail thread to let the immediate
NSH kernel datapath work proceed. 

Can you  provide some more details of what you suggest? Perhaps take that 
on the ovs-dev mailing list.

BR, Jan


Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-04 Thread Hannes Frederic Sowa
Hello,

"Yang, Yi"  writes:

> On Wed, Aug 30, 2017 at 05:53:27PM +0800, Hannes Frederic Sowa wrote:
>> Hello,
>> 
>> Yi Yang  writes:
>> 
>> [...]
>> 
>> > +struct ovs_key_nsh {
>> > +  u8 flags;
>> > +  u8 ttl;
>> > +  u8 mdtype;
>> > +  u8 np;
>> > +  __be32 path_hdr;
>> > +  __be32 context[NSH_MD1_CONTEXT_SIZE];
>> > +};
>> > +
>> >  struct sw_flow_key {
>> >u8 tun_opts[IP_TUNNEL_OPTS_MAX];
>> >u8 tun_opts_len;
>> > @@ -144,6 +154,7 @@ struct sw_flow_key {
>> >};
>> >} ipv6;
>> >};
>> > +  struct ovs_key_nsh nsh; /* network service header */
>> >struct {
>> >/* Connection tracking fields not packed above. */
>> >struct {
>> 
>> Does it makes sense to keep the context headers as part of the flow?
>> What is the reasoning behind it? With mdtype 2 headers this might either
>> not work very well or will increase sw_flow_key size causing slowdowns
>> for all protocols.
>
> For userspace, miniflow can handle such issue, but kernel data path
> didn't provide such mechanism, so I don't think we can think of a better
> way to fix this.

I do think that both, kernel and dpdk dp, are fine if you don't match on
context fields, which I think is the common case.

As soon as a few paths start to match on contexts at different offsets I
am not sure if miniflow will actually help. I don't know enough about
that. Set cover is a NP hard problem, so constructing plain simple flows
will be an approximation somewhere anyway at some point.

If the context values are in some way discrete it might make sense, but
for load balancing purposes your ingress router might fill out one
context field with a flow hash of the inner flow to allow ECMP or
loadbalancing. You might want to treat that separately. Otherwise the
different paths might always need to context[3] & 0x3 (bits as number of
possible next hops) and put that into the flow state. Because every hop
and every path might have different numbers of nexthops etc., I don't
think this is easy unifiable anymore.

> We have to have context headers in flow for match and set, every hop in
> service function chaining can put its metedata into context headers on
> demand, MD type 2 is much more complicated than you can imagine, it is
> impossible to use an uniform way to handle MD type 1 and MD type 2, our
> way is to keep MD type 1 keys in struct ovs_key_nsh and to reuse struct
> flow_tnl for MD type 2 metadata, in case of MD type 2, we can set
> context headers to 0 in struct ovs_key_nsh.

Understood and agreed.

> I beleive every newly-added key will result in similiar issue you
> concern, maybe it will be better to introduce miniflow in kernel data
> path, but it isn't in scope of this patch series.

You will see the same problem with offloading flows e.g. to network
cards very soon. Flow explosion here will cause your 100Gbit/s NIC
suddenly to go to software slow path.

> It will be great if you can have a better proposal to fix your concern.

I do think that something alike miniflow might make sense, but I don't
know enough about miniflow.

New ACTIONS, that are not only bound to NSH, seem appealing to me at the
moment. E.g. have a subtable match in the kernel dp or allowing for some
OXM % modulo -> select action or neighbor alike thing would probably
cover a lot of cases in the beginning. The submatch would probably very
much reassemble the miniflow in dpdk dp.

Thanks,
Hannes


Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-09-03 Thread Yang, Yi
On Wed, Aug 30, 2017 at 05:53:27PM +0800, Hannes Frederic Sowa wrote:
> Hello,
> 
> Yi Yang  writes:
> 
> [...]
> 
> > +struct ovs_key_nsh {
> > +   u8 flags;
> > +   u8 ttl;
> > +   u8 mdtype;
> > +   u8 np;
> > +   __be32 path_hdr;
> > +   __be32 context[NSH_MD1_CONTEXT_SIZE];
> > +};
> > +
> >  struct sw_flow_key {
> > u8 tun_opts[IP_TUNNEL_OPTS_MAX];
> > u8 tun_opts_len;
> > @@ -144,6 +154,7 @@ struct sw_flow_key {
> > };
> > } ipv6;
> > };
> > +   struct ovs_key_nsh nsh; /* network service header */
> > struct {
> > /* Connection tracking fields not packed above. */
> > struct {
> 
> Does it makes sense to keep the context headers as part of the flow?
> What is the reasoning behind it? With mdtype 2 headers this might either
> not work very well or will increase sw_flow_key size causing slowdowns
> for all protocols.

For userspace, miniflow can handle such issue, but kernel data path
didn't provide such mechanism, so I don't think we can think of a better
way to fix this.

We have to have context headers in flow for match and set, every hop in
service function chaining can put its metedata into context headers on
demand, MD type 2 is much more complicated than you can imagine, it is
impossible to use an uniform way to handle MD type 1 and MD type 2, our
way is to keep MD type 1 keys in struct ovs_key_nsh and to reuse struct
flow_tnl for MD type 2 metadata, in case of MD type 2, we can set
context headers to 0 in struct ovs_key_nsh.

I beleive every newly-added key will result in similiar issue you
concern, maybe it will be better to introduce miniflow in kernel data
path, but it isn't in scope of this patch series.

It will be great if you can have a better proposal to fix your concern.

> 
> [...]


Re: [PATCH net-next v6 3/3] openvswitch: enable NSH support

2017-08-30 Thread Hannes Frederic Sowa
Hello,

Yi Yang  writes:

[...]

> +struct ovs_key_nsh {
> + u8 flags;
> + u8 ttl;
> + u8 mdtype;
> + u8 np;
> + __be32 path_hdr;
> + __be32 context[NSH_MD1_CONTEXT_SIZE];
> +};
> +
>  struct sw_flow_key {
>   u8 tun_opts[IP_TUNNEL_OPTS_MAX];
>   u8 tun_opts_len;
> @@ -144,6 +154,7 @@ struct sw_flow_key {
>   };
>   } ipv6;
>   };
> + struct ovs_key_nsh nsh; /* network service header */
>   struct {
>   /* Connection tracking fields not packed above. */
>   struct {

Does it makes sense to keep the context headers as part of the flow?
What is the reasoning behind it? With mdtype 2 headers this might either
not work very well or will increase sw_flow_key size causing slowdowns
for all protocols.

[...]