RE: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-23 Thread David Laight
From: Ben Pfaff
> Sent: 22 August 2017 18:35
...
> We solved the alignment problem in OVS userspace a different way, by
> defining our versions of the network protocol headers so that they only
> need 16-bit alignment.  In turn, we did that by defining a
> ovs_16aligned_be32 type as a pair of be16s and ovs_16aligned_be64 as
> four be16s, and using helper functions for reads and writes.
...

If you add __attribute__((aligned(2))) to the 32bit and 64bit structure
members then gcc will do the loads and shifts for you.

David



Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-23 Thread Jiri Benc
On Tue, 22 Aug 2017 17:38:26 +0800, Yang, Yi wrote:
> I checked GSO support code, we have two cases to handle, one case is to
> output NSH packet to VxLAN-gpe port, the other case is to output NSH packet
> to Ethernet port (tap, veth, vhost, physical eth, etc.), I don't think
> it is a good way to do GSO support in OVS module, because we can't know
> the final output port at this point, if the final output port is
> VxLAN-gpe port, it will still face GSO issue, but I didn't see vxlan
> module handles this, maybe udp tunnel did this. I think a NSH packet can
> be split into two packets, one has NSH header, another one doesn't,
> so I'm not sure how vxlan module can avoid this situation.

As I said before, by either implementing GSO support for NSH or by
segmenting in software before pushing the NSH header. In the first
case, the packet will be software segmented before being pushed to the
hardware, in the second case, it will be software segmented before
pushing the NSH header.

> For the second case, we can add a nsh GSO module in net/nsh/nsh_gso.c
> (as MPLS did in net/mpls/mpls_gso.c), that is the best way to handle
> this, what do you think about it?

Yes, that's one of the two options that we've been talking about. It's
the better and the more efficient one.

 Jiri


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-22 Thread Ben Pfaff
On Tue, Aug 22, 2017 at 08:32:49AM +, Jan Scheurich wrote:
> > > Or why else does OVS user space code take so great pain to model
> > > possible misalignment and provide/use safe access functions?
> > 
> > I don't know how the ovs user space deals with packet allocation. In
> > the kernel, the network header is aligned in a way that it allows
> > efficient 32bit access.
> 
> It seems that OVS has not had the same approach as Linux. There is no
> config parameter covering the alignment characteristics of the machine
> architecture. For packets buffers received from outside sources
> (e.g. DPDK interfaces) they make no assumptions on alignment and play
> safe. For packets allocated inside OVS, the Ethernet packet is
> typically stored so that the L3 header is 32-bit aligned, so that the
> misalignment precautions would be unnecessary. But I didn't check all
> code paths.

We solved the alignment problem in OVS userspace a different way, by
defining our versions of the network protocol headers so that they only
need 16-bit alignment.  In turn, we did that by defining a
ovs_16aligned_be32 type as a pair of be16s and ovs_16aligned_be64 as
four be16s, and using helper functions for reads and writes.  This made
it harder to screw up alignment in a subtle way and only find out long
after release when someone tested a corner case on a RISC architecture.
It probably has a performance cost on those RISC architectures, for the
cases where the access really is aligned, but it's more obviously
correct and I highly value that for OVS userspace.

As far as I can tell it's not actually possible, in the general case, to
add padding such that all parts of a packet are aligned.  VXLAN is the
case that comes to mind.  With VXLAN, as far as I can tell, the 14-byte
inner Ethernet header mean that you can align either the outer IPv4
header or the inner IPv4 header, but not both.  That means that no
matter how careful OVS is about aligning packets, it would still have to
deal with unaligned accesses in some cases.

I see that the VXLAN issue has come up in Linux before:
https://www.ietf.org/mail-archive/web/nvo3/current/msg05743.html


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-22 Thread Yang, Yi
On Mon, Aug 21, 2017 at 05:47:13PM +0800, Jiri Benc wrote:
> 
> > I don't know how we can support this, is it a must-have thing?
> 
> What would happen if you get a GSO packet? Ports of an ovs bridge claim
> GSO support, thus they may get a GSO packet. You have to handle it one
> way or the other: either software segment the packet before pushing the
> header, or implement proper GSO support for NSH.
> 

I checked GSO support code, we have two cases to handle, one case is to
output NSH packet to VxLAN-gpe port, the other case is to output NSH packet
to Ethernet port (tap, veth, vhost, physical eth, etc.), I don't think
it is a good way to do GSO support in OVS module, because we can't know
the final output port at this point, if the final output port is
VxLAN-gpe port, it will still face GSO issue, but I didn't see vxlan
module handles this, maybe udp tunnel did this. I think a NSH packet can
be split into two packets, one has NSH header, another one doesn't, so I'm not 
sure how vxlan module can avoid this situation.

For the second case, we can add a nsh GSO module in net/nsh/nsh_gso.c (as MPLS 
did in net/mpls/mpls_gso.c), that is the best way to handle this, what do you
think about it?


RE: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-22 Thread Jan Scheurich
> > If I understand correctly, this is a default definition that can be
> > overridden by drivers so that we still cannot rely on the Ethernet
> > payload always being 32-bit-aligned.
> 
> Not by drivers, by architectures. Only architectures that can efficiently
> handle unaligned access (or on which the cost of handling unaligned access
> is lower than the cost of unaligned DMA access) set NET_IP_ALIGN to 0.

Thanks, got it!

> 
> > Furthermore, there seem to be machine architectures that cannot handle
> > misaligned 32-bit access at all (not even with a performance penalty).
> 
> Those leave NET_IP_ALIGN to 2.

Dito

> 
> > Or why else does OVS user space code take so great pain to model
> > possible misalignment and provide/use safe access functions?
> 
> I don't know how the ovs user space deals with packet allocation. In the
> kernel, the network header is aligned in a way that it allows efficient 32bit
> access.

It seems that OVS has not had the same approach as Linux. There is no config 
parameter covering the alignment characteristics of the machine architecture. 
For packets buffers received from outside sources (e.g. DPDK interfaces) they 
make no assumptions on alignment and play safe. For packets allocated inside 
OVS, the Ethernet packet is typically stored so that the L3 header is 32-bit 
aligned, so that the misalignment precautions would be unnecessary. But I 
didn't check all code paths.




Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Jiri Benc
On Mon, 21 Aug 2017 10:10:38 +, Jan Scheurich wrote:
> If I understand correctly, this is a default definition that can be
> overridden by drivers so that we still cannot rely on the Ethernet
> payload always being 32-bit-aligned. 

Not by drivers, by architectures. Only architectures that can
efficiently handle unaligned access (or on which the cost of handling
unaligned access is lower than the cost of unaligned DMA access) set
NET_IP_ALIGN to 0.

> Furthermore, there seem to be machine architectures that cannot
> handle misaligned 32-bit access at all (not even with a performance
> penalty).

Those leave NET_IP_ALIGN to 2.

> Or why else does OVS user space code take so great pain to model
> possible misalignment and provide/use safe access functions?

I don't know how the ovs user space deals with packet allocation. In
the kernel, the network header is aligned in a way that it allows
efficient 32bit access.

> Does Linux kernel code generally ignore this risk?

Given the fact that IPv4 addresses are 32bit, are accessed as such and
one can't say that IPv4 implementation on Linux is non-functional, the
answer is obvious :-)

 Jiri


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Yang, Yi
On Mon, Aug 21, 2017 at 05:47:13PM +0800, Jiri Benc wrote:
> On Mon, 21 Aug 2017 17:15:42 +0800, Yang, Yi wrote:
> > The issue is it is used union in
> > 
> > struct nsh_hdr {
> > ovs_be16 ver_flags_ttl_len;
> > uint8_t md_type;
> > uint8_t next_proto;
> > ovs_16aligned_be32 path_hdr;
> > union {
> > struct nsh_md1_ctx md1;
> > struct nsh_md2_tlv md2;
> > };
> > };
> 
> This should work (modulo the non-kernel type names, of course). Did you
> mean to put [] after md2?

Yes, the original version has [] after md2.

> 
> > in Linux kernel build, it complained it, I changed it to
> 
> What was the error message?

./include/net/nsh.h:213:25: error: flexible array member in union
  struct nsh_md2_tlv md2[];
 ^

> 
> > struct nsh_hdr {
> > ovs_be16 ver_flags_ttl_len;
> > uint8_t md_type;
> > uint8_t next_proto;
> > ovs_16aligned_be32 path_hdr;
> > union {
> > struct nsh_md1_ctx md1;
> > struct nsh_md2_tlv md2[0];
> > };
> > };
> 
> I wouldn't use this. First, zero length array is a GCC extension. It
> would indeed be better not to use that in uAPI. Second, I wouldn't even
> use a flexible array member here, see my reply to Jan for the reasons.
> 
> Note that I commented on struct nsh_md2_tlv having __u8[] as the last
> member which IMHO makes good sense. I'm not entirely sure what C99 says
> about flexible array member being part of a struct inside union inside
> a struct, though. GCC seems to cope with that just fine but AFAIK it
> has some extension over the C standard wrt. flexible array members.

Yes, if struct nsh_md2_tlv has  __u8[] as last field, 

struct nsh_md2_tlv {
__be16 md_class;
u8 type;
u8 length;
u8 md_value[];
};

struct nsh_hdr {
__be16 ver_flags_ttl_len;
u8 md_type;
u8 next_proto;
__be32 path_hdr;
union {
struct nsh_md1_ctx md1;
struct nsh_md2_tlv md2;
};
};

it is ok, so let us use this one.

> 
> > I don't know how we can support this, is it a must-have thing?
> 
> What would happen if you get a GSO packet? Ports of an ovs bridge claim
> GSO support, thus they may get a GSO packet. You have to handle it one
> way or the other: either software segment the packet before pushing the
> header, or implement proper GSO support for NSH.

This is an issue, I'll investigate it and find a way to handle this.

> 
> > But struct nsh_hdr had different struct from struct ovs_key_nsh, we
> > have no way to make them completely same, do you mean we should use the
> > same name if they are same fields and represent the same thing?
> 
> Yes.
> 
> Thanks,
> 
>  Jiri


RE: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Jan Scheurich
> > NSH can be carried over Ethernet with a 14 byte header. In that case
> > the total NSH header would typically be 16-bit aligned, so that all
> > 32-bit members would be misaligned.
> 
> See NET_IP_ALIGN in include/linux/skbuff.h.

If I understand correctly, this is a default definition that can be overridden 
by drivers so that we still cannot rely on the Ethernet payload always being 
32-bit-aligned. 

Furthermore, there seem to be machine architectures that cannot handle 
misaligned 32-bit access at all (not even with a performance penalty).

Or why else does OVS user space code take so great pain to model possible 
misalignment and provide/use safe access functions?

Does Linux kernel code generally ignore this risk?

/Jan


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Jiri Benc
On Mon, 21 Aug 2017 09:42:27 +, Jan Scheurich wrote:
> I understand your concern. But not declaring md2 as array is wrong as
> well, as there might not be an MD2 TLV context header. An MD2 NSH
> header is perfectly valid without any TLV. So in any case the user of
> the struct needs to be aware of the NSH semantics.

Good point.

> NSH can be carried over Ethernet with a 14 byte header. In that case
> the total NSH header would typically be 16-bit aligned, so that all
> 32-bit members would be misaligned.

See NET_IP_ALIGN in include/linux/skbuff.h.

 Jiri


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Jiri Benc
On Mon, 21 Aug 2017 17:15:42 +0800, Yang, Yi wrote:
> The issue is it is used union in
> 
> struct nsh_hdr {
> ovs_be16 ver_flags_ttl_len;
> uint8_t md_type;
> uint8_t next_proto;
> ovs_16aligned_be32 path_hdr;
> union {
> struct nsh_md1_ctx md1;
> struct nsh_md2_tlv md2;
> };
> };

This should work (modulo the non-kernel type names, of course). Did you
mean to put [] after md2?

> in Linux kernel build, it complained it, I changed it to

What was the error message?

> struct nsh_hdr {
> ovs_be16 ver_flags_ttl_len;
> uint8_t md_type;
> uint8_t next_proto;
> ovs_16aligned_be32 path_hdr;
> union {
> struct nsh_md1_ctx md1;
> struct nsh_md2_tlv md2[0];
> };
> };

I wouldn't use this. First, zero length array is a GCC extension. It
would indeed be better not to use that in uAPI. Second, I wouldn't even
use a flexible array member here, see my reply to Jan for the reasons.

Note that I commented on struct nsh_md2_tlv having __u8[] as the last
member which IMHO makes good sense. I'm not entirely sure what C99 says
about flexible array member being part of a struct inside union inside
a struct, though. GCC seems to cope with that just fine but AFAIK it
has some extension over the C standard wrt. flexible array members.

> I don't know how we can support this, is it a must-have thing?

What would happen if you get a GSO packet? Ports of an ovs bridge claim
GSO support, thus they may get a GSO packet. You have to handle it one
way or the other: either software segment the packet before pushing the
header, or implement proper GSO support for NSH.

> But struct nsh_hdr had different struct from struct ovs_key_nsh, we
> have no way to make them completely same, do you mean we should use the
> same name if they are same fields and represent the same thing?

Yes.

Thanks,

 Jiri


RE: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Jan Scheurich
> > struct nsh_hdr {
> > ovs_be16 ver_flags_ttl_len;
> > uint8_t mdtype;
> > uint8_t np;
> > ovs_16aligned_be32 path_hdr;
> > union {
> > struct nsh_md1_ctx md1;
> > struct nsh_md2_tlv md2[];
> 
> I'm not that sure about this. With each member of md2 having a different
> size, you can't use md2 as an array. However, if it was declared as an
> array, it might encourage such (wrong) usage.
> 
> In particular, nsh_hdr->md2[1] is always wrong.
> 
> It seems better to not declare md2 as an array.

I understand your concern. But not declaring md2 as array is wrong as well, as 
there might not be an MD2 TLV context header. An MD2 NSH header is perfectly 
valid without any TLV. So in any case the user of the struct needs to be aware 
of the NSH semantics.

> >
> > I wonder about the possible 16-bit alignment of the 32-bit fields,
> > though. How is that handled in the kernel?
> 
> get_unaligned_*
> 
> > Also struct nsh_md1_ctx
> > has 32-bit members, which might not be 32-bit aligned in the packet.
> 
> I don't see that happening, it seems the header before md1 is 8 bytes and
> sizeof(md1) is 32 bytes? And for md2, the standard mandates that the md2
> size is a multiply of 4 bytes, too.

NSH can be carried over Ethernet with a 14 byte header. In that case the total 
NSH header would typically be 16-bit aligned, so that all 32-bit members would 
be misaligned.




Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Jiri Benc
On Mon, 21 Aug 2017 09:04:30 +, Jan Scheurich wrote:
> The second member of the union should be a variable length array []
> of struct nsh_md2_tlv
> 
> struct nsh_hdr {
> ovs_be16 ver_flags_ttl_len;
> uint8_t mdtype;
> uint8_t np;
> ovs_16aligned_be32 path_hdr;
> union {
> struct nsh_md1_ctx md1;
> struct nsh_md2_tlv md2[];

I'm not that sure about this. With each member of md2 having
a different size, you can't use md2 as an array. However, if it was
declared as an array, it might encourage such (wrong) usage.

In particular, nsh_hdr->md2[1] is always wrong.

It seems better to not declare md2 as an array.

> };
> };
> 
> That was the original design before Ben removed it due to missing
> support in Microsoft compiler. For the Kernel datapath we should go
> back to that. 
> 
> I wonder about the possible 16-bit alignment of the 32-bit fields,
> though. How is that handled in the kernel?

get_unaligned_*

> Also struct nsh_md1_ctx
> has 32-bit members, which might not be 32-bit aligned in the packet.

I don't see that happening, it seems the header before md1 is 8 bytes
and sizeof(md1) is 32 bytes? And for md2, the standard mandates that
the md2 size is a multiply of 4 bytes, too.

 Jiri


RE: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Jan Scheurich
> 
> The second member of the union should be a variable length array [] of
> struct nsh_md2_tlv
> 
> struct nsh_hdr {
> ovs_be16 ver_flags_ttl_len;
> uint8_t mdtype;
> uint8_t np;
> ovs_16aligned_be32 path_hdr;
> union {
> struct nsh_md1_ctx md1;
> struct nsh_md2_tlv md2[];
> };
> };
> 
> That was the original design before Ben removed it due to missing support
> in Microsoft compiler.
> For the Kernel datapath we should go back to that.
> 
> I wonder about the possible 16-bit alignment of the 32-bit fields, though.
> How is that handled in the kernel? Also struct nsh_md1_ctx has 32-bit
> members, which might not be 32-bit aligned in the packet.
> 

One afterthought: 

I think it would be good to add another member to the union to represent 
unstructured context headers as used, e.g., in the context of push_nsh.

struct nsh_hdr {
ovs_be16 ver_flags_ttl_len;
uint8_t mdtype;
uint8_t np;
ovs_16aligned_be32 path_hdr;
union {
uint8_t ctx_headers[];
struct nsh_md1_ctx md1;
struct nsh_md2_tlv md2[];
};
};




Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Yang, Yi
On Mon, Aug 21, 2017 at 11:18:54AM +0200, Jiri Benc wrote:
> On Mon, 21 Aug 2017 16:39:01 +0800, Yang, Yi wrote:
> > Anyway, we need to keep the code in userspace consistent with the one in
> > kernel as possible as, otherwise it will be a burden for developer, I
> > know userspace has different coding standard from kernel, this will make
> > developer painful if we have two sets of code although they have same
> > functionality.
> 
> I'm sorry, I don't get this. What's wrong with having __u8[] as the
> last member of the struct? That's C99. It's 18 years old standard.
> We're using that throughout our uAPI. Why that should be a problem for
> any user space program?

The issue is it is used union in

struct nsh_hdr {
ovs_be16 ver_flags_ttl_len;
uint8_t md_type;
uint8_t next_proto;
ovs_16aligned_be32 path_hdr;
union {
struct nsh_md1_ctx md1;
struct nsh_md2_tlv md2;
};
};

in Linux kernel build, it complained it, I changed it to

struct nsh_hdr {
ovs_be16 ver_flags_ttl_len;
uint8_t md_type;
uint8_t next_proto;
ovs_16aligned_be32 path_hdr;
union {
struct nsh_md1_ctx md1;
struct nsh_md2_tlv md2[0];
};
};

It is ok, but for Microsoft compiler, it isn't allowed there is struct
nsh_md2_tlv md2[0] in a union, that is Ben Pfaff's hack :-)

> 
> > > MPLS supports GSO and needs this for segmentation. I don't see anything
> > > GSO related in this patch.
> > > 
> > > How do you plan to address GSO, anyway?
> > 
> > No plan to do that, I'm not an expert on this, we can remove it if
> > you're very sure it is necessary.
> 
> Without GSO, I don't see any use for inner_protocol.
> 
> However, don't you need to software segment the packet if it's GSO
> before pushing the NSH header?
> 
> And wouldn't it be better to implement GSO for NSH, anyway?

I don't know how we can support this, is it a must-have thing?

> 
> > To make sure we make agreement, please confirm if this one is ok?
> > 
> > struct nsh_hdr {
> > ovs_be16 ver_flags_ttl_len;
> > uint8_t mdtype;
> > uint8_t np;
> > ovs_16aligned_be32 path_hdr;
> > union {
> > struct nsh_md1_ctx md1;
> > struct nsh_md2_tlv md2;
> > };
> > };
> > 
> > Or it will be better if you can provide your preferred version here.
> 
> I don't really care that much about the names if it's clear what they
> mean. I was merely commenting on the inconsistency which looked weird.
> Whether it's md_type or mdtype, I don't have a preference (does not
> mean others won't, though :-)). Just pick one and stick to it, as far
> as I'm concerned.

But struct nsh_hdr had different struct from struct ovs_key_nsh, we
have no way to make them completely same, do you mean we should use the
same name if they are same fields and represent the same thing?

> 
>  Jiri


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Jiri Benc
On Mon, 21 Aug 2017 16:39:01 +0800, Yang, Yi wrote:
> Anyway, we need to keep the code in userspace consistent with the one in
> kernel as possible as, otherwise it will be a burden for developer, I
> know userspace has different coding standard from kernel, this will make
> developer painful if we have two sets of code although they have same
> functionality.

I'm sorry, I don't get this. What's wrong with having __u8[] as the
last member of the struct? That's C99. It's 18 years old standard.
We're using that throughout our uAPI. Why that should be a problem for
any user space program?

> > MPLS supports GSO and needs this for segmentation. I don't see anything
> > GSO related in this patch.
> > 
> > How do you plan to address GSO, anyway?
> 
> No plan to do that, I'm not an expert on this, we can remove it if
> you're very sure it is necessary.

Without GSO, I don't see any use for inner_protocol.

However, don't you need to software segment the packet if it's GSO
before pushing the NSH header?

And wouldn't it be better to implement GSO for NSH, anyway?

> To make sure we make agreement, please confirm if this one is ok?
> 
> struct nsh_hdr {
> ovs_be16 ver_flags_ttl_len;
> uint8_t mdtype;
> uint8_t np;
> ovs_16aligned_be32 path_hdr;
> union {
> struct nsh_md1_ctx md1;
> struct nsh_md2_tlv md2;
> };
> };
> 
> Or it will be better if you can provide your preferred version here.

I don't really care that much about the names if it's clear what they
mean. I was merely commenting on the inconsistency which looked weird.
Whether it's md_type or mdtype, I don't have a preference (does not
mean others won't, though :-)). Just pick one and stick to it, as far
as I'm concerned.

 Jiri


RE: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Jan Scheurich
> struct nsh_hdr {
> ovs_be16 ver_flags_ttl_len;
> uint8_t mdtype;
> uint8_t np;
> ovs_16aligned_be32 path_hdr;
> union {
> struct nsh_md1_ctx md1;
> struct nsh_md2_tlv md2;
> };
> };
> 

The second member of the union should be a variable length array [] of struct 
nsh_md2_tlv

struct nsh_hdr {
ovs_be16 ver_flags_ttl_len;
uint8_t mdtype;
uint8_t np;
ovs_16aligned_be32 path_hdr;
union {
struct nsh_md1_ctx md1;
struct nsh_md2_tlv md2[];
};
};

That was the original design before Ben removed it due to missing support in 
Microsoft compiler. 
For the Kernel datapath we should go back to that. 

I wonder about the possible 16-bit alignment of the 32-bit fields, though. How 
is that handled in the kernel? Also struct nsh_md1_ctx has 32-bit members, 
which might not be 32-bit aligned in the packet.

BR, Jan


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Yang, Yi
On Mon, Aug 21, 2017 at 10:19:24AM +0200, Jiri Benc wrote:
> On Mon, 21 Aug 2017 14:11:10 +0800, Yang, Yi wrote:
> > In OVS code, it has been removed because of Microsoft compiler issue.
> 
> We absolutely, completely and utterly do not care in the kernel. Please
> never make such arguments and never make the code look worse because of
> a compiler for other operating systems.

Anyway, we need to keep the code in userspace consistent with the one in
kernel as possible as, otherwise it will be a burden for developer, I
know userspace has different coding standard from kernel, this will make
developer painful if we have two sets of code although they have same
functionality.

> 
> > > I was wondering about this during the reviews of the previous versions.
> > > Now I've given this more thought but I still don't see it - why is the
> > > inner_protocol set here?
> > 
> > I saw push_mpls has it, so also set it.
> 
> MPLS supports GSO and needs this for segmentation. I don't see anything
> GSO related in this patch.
> 
> How do you plan to address GSO, anyway?

No plan to do that, I'm not an expert on this, we can remove it if
you're very sure it is necessary.

> 
> > > > +   err = check_header(skb, length);
> > > > +   if (unlikely(err))
> > > > +   return err;
> > > > +
> > > > +   key->nsh.flags = nsh_get_flags(nsh);
> > > 
> > > Again, need to reload nsh.
> > 
> > I used skb->len in v5, so we can't avoid such issue.
> 
> And how do you ensure that the skb has enough headroom, then? That is
> wrong. All I said is that you have to reload the pointers to the header
> which is what you have to do.

Ok, got it, will add it.

> 
> > > Wouldn't it be nicer if the fields of struct ovs_nsh_key_base and of
> > > struct nsh_hdr had the same names?
> > 
> > Such change also will impact on OVS code, so I prefer not to change
> > them.
> 
> We do not care.
> 
> The order in which you send patches to different projects is your
> choice. The only standard by which we measure and evaluate kernel
> patches is the technical suitability of the patches. Whether or not
> some other projects have dependencies on some kind of previous versions
> of out of tree kernel patches have zero relevance here.
> 
> If you designed other code to depend on your notion on how a kernel API
> will look like before getting the actual patches accepted to the
> kernel, that's your problem and you'll have to deal with that.
> 
> > For struct nsh_hdr, we need more self-descriptive fields, but for struct
> > ovs_nsh_key_base, because we need to use keys in OVS OpenFlow, so np is
> > obviously better than next_proto, we also try our best to make sure the
> > old NSH implementation has same match fields as the new one does.
> 
> Not relevant.

Ok, I can follow your standard :-)

To make sure we make agreement, please confirm if this one is ok?

struct nsh_hdr {
ovs_be16 ver_flags_ttl_len;
uint8_t mdtype;
uint8_t np;
ovs_16aligned_be32 path_hdr;
union {
struct nsh_md1_ctx md1;
struct nsh_md2_tlv md2;
};
};

Or it will be better if you can provide your preferred version here.

> 
>  Jiri


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Jiri Benc
On Mon, 21 Aug 2017 14:11:10 +0800, Yang, Yi wrote:
> In OVS code, it has been removed because of Microsoft compiler issue.

We absolutely, completely and utterly do not care in the kernel. Please
never make such arguments and never make the code look worse because of
a compiler for other operating systems.

> > I was wondering about this during the reviews of the previous versions.
> > Now I've given this more thought but I still don't see it - why is the
> > inner_protocol set here?
> 
> I saw push_mpls has it, so also set it.

MPLS supports GSO and needs this for segmentation. I don't see anything
GSO related in this patch.

How do you plan to address GSO, anyway?

> > > + err = check_header(skb, length);
> > > + if (unlikely(err))
> > > + return err;
> > > +
> > > + key->nsh.flags = nsh_get_flags(nsh);
> > 
> > Again, need to reload nsh.
> 
> I used skb->len in v5, so we can't avoid such issue.

And how do you ensure that the skb has enough headroom, then? That is
wrong. All I said is that you have to reload the pointers to the header
which is what you have to do.

> > Wouldn't it be nicer if the fields of struct ovs_nsh_key_base and of
> > struct nsh_hdr had the same names?
> 
> Such change also will impact on OVS code, so I prefer not to change
> them.

We do not care.

The order in which you send patches to different projects is your
choice. The only standard by which we measure and evaluate kernel
patches is the technical suitability of the patches. Whether or not
some other projects have dependencies on some kind of previous versions
of out of tree kernel patches have zero relevance here.

If you designed other code to depend on your notion on how a kernel API
will look like before getting the actual patches accepted to the
kernel, that's your problem and you'll have to deal with that.

> For struct nsh_hdr, we need more self-descriptive fields, but for struct
> ovs_nsh_key_base, because we need to use keys in OVS OpenFlow, so np is
> obviously better than next_proto, we also try our best to make sure the
> old NSH implementation has same match fields as the new one does.

Not relevant.

 Jiri


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Yang, Yi
On Fri, Aug 18, 2017 at 09:31:03PM +0800, Jiri Benc wrote:
> On Fri, 18 Aug 2017 15:26:01 +0200, Jiri Benc wrote:
> > Out of time for today, will continue the review next week. Again, feel
> > free to send a new version meanwhile or wait for the rest of the
> > review, whatever works better for you.
> 
> One more thing: before you send a new version, be sure and double check
> that you addressed *all* of the feedback. In this version, you still
> have not addressed a few things I gave feedback on in the previous
> version. This wastes everyone's time.

I did check every comment you added and tried my best to fix them, maybe
you mean that hacky attr & mask code, I think that one is ok in v4, I have
used your proposal in v5 and has small change, from my perspective, they
don't have what difference :-)

> 
> It doesn't mean I'm always right, of course, but you either explain why
> I'm wrong or change the code. It's generally not acceptable to ignore
> feedback.

Will be more careful later in order that we can have a more efficient
review. Thank you so much for your great comments.

> 
> Thank you,
> 
>  Jiri


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Yang, Yi
On Sat, Aug 19, 2017 at 03:09:47AM +0800, Eric Garver wrote:
> On Fri, Aug 18, 2017 at 03:24:31PM +0800, Yi Yang wrote:
> > v3->v4
> >  - Add new NSH match field ttl
> >  - Update NSH header to the latest format
> >which will be final format and won't change
> >per its author's confirmation.
> >  - Fix comments for v3.
> 
> Hi Yi,
> Only a few comments below since Jiri already supplied lots of feedback.
>

Eric, thank you for your comments, I have fixed them in v5, please help
review v5, thanks a lot.
 
> > @@ -1210,6 +1373,20 @@ static int do_execute_actions(struct datapath *dp, 
> > struct sk_buff *skb,
> > case OVS_ACTION_ATTR_POP_ETH:
> > err = pop_eth(skb, key);
> > break;
> > +
> > +   case OVS_ACTION_ATTR_PUSH_NSH: {
> > +   u8 buffer[256];
> 
> Use NSH_M_TYPE2_MAX_LEN

Replaced 256 to NSH_M_TYPE2_MAX_LEN in v5.

> > +   case OVS_NSH_KEY_ATTR_MD1: {
> > +   const struct ovs_nsh_key_md1 *md1 =
> > +   (struct ovs_nsh_key_md1 *)nla_data(a);
> > +
> > +   has_md1 = true;
> > +   memcpy(nsh->context, md1->context, sizeof(*md1));
> > +   break;
> > +   }
> > +   case OVS_NSH_KEY_ATTR_MD2:
> > +   /* Not supported yet */
> 
> return -ENOTPSUPP if it's not supported.

Did that way in v5.

> > +   case OVS_NSH_KEY_ATTR_MD1: {
> > +   const struct ovs_nsh_key_md1 *md1 =
> > +   (struct ovs_nsh_key_md1 *)nla_data(a);
> > +
> > +   has_md1 = true;
> > +   for (i = 0; i < NSH_MD1_CONTEXT_SIZE; i++)
> > +   SW_FLOW_KEY_PUT(match, nsh.context[i],
> > +   md1->context[i], is_mask);
> > +   break;
> > +   }
> > +   case OVS_NSH_KEY_ATTR_MD2:
> > +   /* Not supported yet */
> 
> return -ENOTPSUPP if it's not supported.

Done in v5.

> > @@ -2636,6 +2984,17 @@ static int __ovs_nla_copy_actions(struct net *net, 
> > const struct nlattr *attr,
> > mac_proto = MAC_PROTO_ETHERNET;
> > break;
> >  
> > +   case OVS_ACTION_ATTR_PUSH_NSH:
> 
> You need to some validation here, especially the metadata lengths.
> Relying on action_lens is not enough because it's variable.
> 

Good point, I added validate_nsh for this.



Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-21 Thread Yang, Yi
On Fri, Aug 18, 2017 at 09:26:01PM +0800, Jiri Benc wrote:
> On Fri, 18 Aug 2017 15:24:31 +0800, Yi Yang wrote:
> > +struct nsh_md2_tlv {
> > +   __be16 md_class;
> > +   u8 type;
> > +   u8 length;
> > +   /* Followed by variable-length data. */
> > +};
> 
> What was wrong with the u8[] field that was present at the end of the
> struct in the previous version of the patch?

In OVS code, it has been removed because of Microsoft compiler issue.

> 
> > +#define NSH_M_TYPE2_MAX_LEN 256
> 
> This is defined twice, please delete this define and keep the one lower
> in the file.

Removed duplicate one in v5.

> 
> > +#define NSH_DST_PORT4790 /* UDP Port for NSH on VXLAN. */
> 
> This is a VXLAN-GPE port, it has nothing to do with NSH (except that
> VXLAN-GPE can contain a NSH packet). It's also unused. Please remove it.
> 

Removed in v5.

> > +/* NSH Metadata Length. */
> > +#define NSH_M_TYPE1_MDLEN 16
> 
> This is unused and it seems it's not much useful anyway,
> sizeof(struct nsh_md1_ctx) provides the same value. Please remove this
> define.

Removed in v5.

> 
> > +#define NSH_MD1_CTX(nsh_hdr_ptr) (&(nsh_hdr_ptr)->md1)
> > +
> > +#define NSH_MD2_CTX(nsh_hdr_ptr) (&(nsh_hdr_ptr)->md2)
> 
> Please remove these two. They are unused and would just obscure things
> anyway.
> 
> > +static inline struct nsh_md1_ctx *nsh_md1_ctx(struct nsh_hdr *nsh)
> > +{
> > +   return >md1;
> > +}
> > +
> > +static inline struct nsh_md2_tlv *nsh_md2_ctx(struct nsh_hdr *nsh)
> > +{
> > +   return >md2;
> > +}
> 
> And remove these too, for the same reason. Just use nsh->md1 when you
> need the metadata, there's no reason for these helper functions. They
> just obscure things.
> 

Removed them in v5

> > +static inline void nsh_set_flags_and_ttl(struct nsh_hdr *nsh, u8 flags, u8 
> > ttl)
> > +{
> > +   nsh->ver_flags_ttl_len
> > +   = htons((ntohs(nsh->ver_flags_ttl_len)
> > +   & ~(NSH_FLAGS_MASK | NSH_TTL_MASK))
> > +   | ((flags << NSH_FLAGS_SHIFT) & NSH_FLAGS_MASK)
> > +   | ((ttl << NSH_TTL_SHIFT) & NSH_TTL_MASK));
> > +}
> > +
> > +static inline void nsh_set_flags_ttl_len(struct nsh_hdr *nsh, u8 flags,
> > +u8 ttl, u8 len)
> > +{
> > +   nsh->ver_flags_ttl_len
> > +   = htons((ntohs(nsh->ver_flags_ttl_len)
> > +   & ~(NSH_FLAGS_MASK | NSH_TTL_MASK | NSH_LEN_MASK))
> > +   | ((flags << NSH_FLAGS_SHIFT) & NSH_FLAGS_MASK)
> > +   | ((ttl << NSH_TTL_SHIFT) & NSH_TTL_MASK)
> > +   | ((len << NSH_LEN_SHIFT) & NSH_LEN_MASK));
> > +}
> 
> Okay. Could those two perhaps use a common function?
> 
> static inline void __nsh_set_flags(struct nsh_hdr *nsh, u16 value, u16 mask)
> {
>   nsh->ver_flags_ttl_len = nsh->ver_flags_ttl_len & ~htons(mask)
>   | htons(value);
> }
> 
> static inline void nsh_set_flags_and_ttl(struct nsh_hdr *nsh, u8 flags, u8 
> ttl)
> {
>   __nsh_set_flags(nsh, flags << NSH_FLAGS_SHIFT | ttl << NSH_TTL_SHIFT,
>   NSH_FLAGS_MASK | NSH_TTL_MASK);
> }
> 
> etc.

Thanks for this good suggestion, applied in v5 with small change.

> 
> > +static int push_nsh(struct sk_buff *skb, struct sw_flow_key *key,
> > +   const struct nsh_hdr *nsh_src)
> > +{
> [...]
> > +   if (!skb->inner_protocol)
> > +   skb_set_inner_protocol(skb, skb->protocol);
> 
> I was wondering about this during the reviews of the previous versions.
> Now I've given this more thought but I still don't see it - why is the
> inner_protocol set here?

I saw push_mpls has it, so also set it.

> 
> > +   case OVS_KEY_ATTR_NSH: {
> > +   struct ovs_key_nsh nsh;
> > +   struct ovs_key_nsh nsh_mask;
> > +   size_t size = nla_len(a) / 2;
> > +   struct nlattr attr[1 + size / sizeof(struct nlattr) + 1];
> > +   struct nlattr mask[1 + size / sizeof(struct nlattr) + 1];
> > +
> > +   attr->nla_type = nla_type(a);
> > +   mask->nla_type = attr->nla_type;
> > +   attr->nla_len = NLA_HDRLEN + size;
> > +   mask->nla_len = attr->nla_len;
> > +   memcpy(attr + 1, (char *)(a + 1), size);
> > +   memcpy(mask + 1, (char *)(a + 1) + size, size);
> 
> No, please. See my reply to the previous version for how to do this in
> a less hacky way.

I have used your proposal in previous comments and have it in v5.

> 
> > +   case OVS_ACTION_ATTR_PUSH_NSH: {
> > +   u8 buffer[256];
> > +   struct nsh_hdr *nsh_hdr = (struct nsh_hdr *)buffer;
> > +   const struct nsh_hdr *nsh_src = nsh_hdr;
> > +
> > +   nsh_hdr_from_nlattr(nla_data(a), nsh_hdr);
> 
> This is very dangerous security wise. You have to protect against
> buffer overflow, one way or other. The current code may not overflow
> (I have not checked that, though) but a future addition may break the
> 

Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-18 Thread Eric Garver
On Fri, Aug 18, 2017 at 03:24:31PM +0800, Yi Yang wrote:
> v3->v4
>  - Add new NSH match field ttl
>  - Update NSH header to the latest format
>which will be final format and won't change
>per its author's confirmation.
>  - Fix comments for v3.

Hi Yi,
Only a few comments below since Jiri already supplied lots of feedback.

> 
> v2->v3
>  - Change OVS_KEY_ATTR_NSH to nested key to handle
>length-fixed attributes and length-variable
>attriubte more flexibly.
>  - Remove struct ovs_action_push_nsh completely
>  - Add code to handle nested attribute for SET_MASKED
>  - Change PUSH_NSH to use the nested OVS_KEY_ATTR_NSH
>to transfer NSH header data.
>  - Fix comments and coding style issues by Jiri and Eric
> 
> v1->v2
>  - Change encap_nsh and decap_nsh to push_nsh and pop_nsh
>  - Dynamically allocate struct ovs_action_push_nsh for
>length-variable metadata.
> 
> OVS master and 2.8 branch has merged NSH userspace
> patch series, this patch is to enable NSH support
> in kernel data path in order that OVS can support
> NSH in 2.8 release in compat mode by porting this.
> 
> Signed-off-by: Yi Yang 
> ---
[..]
> @@ -1210,6 +1373,20 @@ static int do_execute_actions(struct datapath *dp, 
> struct sk_buff *skb,
>   case OVS_ACTION_ATTR_POP_ETH:
>   err = pop_eth(skb, key);
>   break;
> +
> + case OVS_ACTION_ATTR_PUSH_NSH: {
> + u8 buffer[256];

Use NSH_M_TYPE2_MAX_LEN

> + struct nsh_hdr *nsh_hdr = (struct nsh_hdr *)buffer;
> + const struct nsh_hdr *nsh_src = nsh_hdr;
> +
> + nsh_hdr_from_nlattr(nla_data(a), nsh_hdr);
> + err = push_nsh(skb, key, nsh_src);
> + break;
> + }
> +
> + case OVS_ACTION_ATTR_POP_NSH:
> + err = pop_nsh(skb, key);
> + break;
>   }
>  
>   if (unlikely(err)) {
[..]
> +int nsh_key_from_nlattr(const struct nlattr *attr,
> + struct ovs_key_nsh *nsh)
> +{
> + struct nlattr *a;
> + int rem;
> + bool has_md1 = false;
> + bool has_md2 = false;
> +
> + nla_for_each_nested(a, attr, rem) {
> + int type = nla_type(a);
> +
> + if (type > OVS_NSH_KEY_ATTR_MAX) {
> + OVS_NLERR(1, "nsh attr %d is out of range max %d",
> +   type, OVS_NSH_KEY_ATTR_MAX);
> + return -EINVAL;
> + }
> +
> + if (!check_attr_len(nla_len(a),
> + ovs_nsh_key_attr_lens[type].len)) {
> + OVS_NLERR(
> + 1,
> + "nsh attr %d has unexpected len %d expected %d",
> + type,
> + nla_len(a),
> + ovs_nsh_key_attr_lens[type].len
> + );
> + return -EINVAL;
> + }
> +
> + switch (type) {
> + case OVS_NSH_KEY_ATTR_BASE: {
> + const struct ovs_nsh_key_base *base =
> + (struct ovs_nsh_key_base *)nla_data(a);
> +
> + memcpy(nsh, base, sizeof(*base));
> + break;
> + }
> + case OVS_NSH_KEY_ATTR_MD1: {
> + const struct ovs_nsh_key_md1 *md1 =
> + (struct ovs_nsh_key_md1 *)nla_data(a);
> +
> + has_md1 = true;
> + memcpy(nsh->context, md1->context, sizeof(*md1));
> + break;
> + }
> + case OVS_NSH_KEY_ATTR_MD2:
> + /* Not supported yet */

return -ENOTPSUPP if it's not supported.

> + has_md2 = true;
> + break;
> + default:
> + OVS_NLERR(1, "Unknown nsh attribute %d",
> +   type);
> + return -EINVAL;
> + }
> + }
> +
> + if (rem > 0) {
> + OVS_NLERR(1, "nsh attribute has %d unknown bytes.", rem);
> + return -EINVAL;
> + }
> +
> + if ((has_md1 && nsh->mdtype != NSH_M_TYPE1) ||
> + (has_md2 && nsh->mdtype != NSH_M_TYPE2)) {
> + OVS_NLERR(1, "nsh attribute has unmatched MD type %d.",
> +   nsh->mdtype);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +static int nsh_key_put_from_nlattr(const struct nlattr *attr,
> +struct sw_flow_match *match, bool is_mask,
> +bool log)
> +{
> + struct nlattr *a;
> + int rem;
> + bool has_md1 = false;
> + bool has_md2 = false;
> + u8 mdtype = 0;
> +
> + nla_for_each_nested(a, attr, rem) {
> + int type = nla_type(a);
> +  

Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-18 Thread Jiri Benc
On Fri, 18 Aug 2017 15:26:01 +0200, Jiri Benc wrote:
> Out of time for today, will continue the review next week. Again, feel
> free to send a new version meanwhile or wait for the rest of the
> review, whatever works better for you.

One more thing: before you send a new version, be sure and double check
that you addressed *all* of the feedback. In this version, you still
have not addressed a few things I gave feedback on in the previous
version. This wastes everyone's time.

It doesn't mean I'm always right, of course, but you either explain why
I'm wrong or change the code. It's generally not acceptable to ignore
feedback.

Thank you,

 Jiri


Re: [PATCH net-next v4] openvswitch: enable NSH support

2017-08-18 Thread Jiri Benc
On Fri, 18 Aug 2017 15:24:31 +0800, Yi Yang wrote:
> +struct nsh_md2_tlv {
> + __be16 md_class;
> + u8 type;
> + u8 length;
> + /* Followed by variable-length data. */
> +};

What was wrong with the u8[] field that was present at the end of the
struct in the previous version of the patch?

> +#define NSH_M_TYPE2_MAX_LEN 256

This is defined twice, please delete this define and keep the one lower
in the file.

> +#define NSH_DST_PORT4790 /* UDP Port for NSH on VXLAN. */

This is a VXLAN-GPE port, it has nothing to do with NSH (except that
VXLAN-GPE can contain a NSH packet). It's also unused. Please remove it.

> +/* NSH Metadata Length. */
> +#define NSH_M_TYPE1_MDLEN 16

This is unused and it seems it's not much useful anyway,
sizeof(struct nsh_md1_ctx) provides the same value. Please remove this
define.

> +#define NSH_MD1_CTX(nsh_hdr_ptr) (&(nsh_hdr_ptr)->md1)
> +
> +#define NSH_MD2_CTX(nsh_hdr_ptr) (&(nsh_hdr_ptr)->md2)

Please remove these two. They are unused and would just obscure things
anyway.

> +static inline struct nsh_md1_ctx *nsh_md1_ctx(struct nsh_hdr *nsh)
> +{
> + return >md1;
> +}
> +
> +static inline struct nsh_md2_tlv *nsh_md2_ctx(struct nsh_hdr *nsh)
> +{
> + return >md2;
> +}

And remove these too, for the same reason. Just use nsh->md1 when you
need the metadata, there's no reason for these helper functions. They
just obscure things.

> +static inline void nsh_set_flags_and_ttl(struct nsh_hdr *nsh, u8 flags, u8 
> ttl)
> +{
> + nsh->ver_flags_ttl_len
> + = htons((ntohs(nsh->ver_flags_ttl_len)
> + & ~(NSH_FLAGS_MASK | NSH_TTL_MASK))
> + | ((flags << NSH_FLAGS_SHIFT) & NSH_FLAGS_MASK)
> + | ((ttl << NSH_TTL_SHIFT) & NSH_TTL_MASK));
> +}
> +
> +static inline void nsh_set_flags_ttl_len(struct nsh_hdr *nsh, u8 flags,
> +  u8 ttl, u8 len)
> +{
> + nsh->ver_flags_ttl_len
> + = htons((ntohs(nsh->ver_flags_ttl_len)
> + & ~(NSH_FLAGS_MASK | NSH_TTL_MASK | NSH_LEN_MASK))
> + | ((flags << NSH_FLAGS_SHIFT) & NSH_FLAGS_MASK)
> + | ((ttl << NSH_TTL_SHIFT) & NSH_TTL_MASK)
> + | ((len << NSH_LEN_SHIFT) & NSH_LEN_MASK));
> +}

Okay. Could those two perhaps use a common function?

static inline void __nsh_set_flags(struct nsh_hdr *nsh, u16 value, u16 mask)
{
nsh->ver_flags_ttl_len = nsh->ver_flags_ttl_len & ~htons(mask)
| htons(value);
}

static inline void nsh_set_flags_and_ttl(struct nsh_hdr *nsh, u8 flags, u8 ttl)
{
__nsh_set_flags(nsh, flags << NSH_FLAGS_SHIFT | ttl << NSH_TTL_SHIFT,
NSH_FLAGS_MASK | NSH_TTL_MASK);
}

etc.

> +static int push_nsh(struct sk_buff *skb, struct sw_flow_key *key,
> + const struct nsh_hdr *nsh_src)
> +{
[...]
> + if (!skb->inner_protocol)
> + skb_set_inner_protocol(skb, skb->protocol);

I was wondering about this during the reviews of the previous versions.
Now I've given this more thought but I still don't see it - why is the
inner_protocol set here?

> + case OVS_KEY_ATTR_NSH: {
> + struct ovs_key_nsh nsh;
> + struct ovs_key_nsh nsh_mask;
> + size_t size = nla_len(a) / 2;
> + struct nlattr attr[1 + size / sizeof(struct nlattr) + 1];
> + struct nlattr mask[1 + size / sizeof(struct nlattr) + 1];
> +
> + attr->nla_type = nla_type(a);
> + mask->nla_type = attr->nla_type;
> + attr->nla_len = NLA_HDRLEN + size;
> + mask->nla_len = attr->nla_len;
> + memcpy(attr + 1, (char *)(a + 1), size);
> + memcpy(mask + 1, (char *)(a + 1) + size, size);

No, please. See my reply to the previous version for how to do this in
a less hacky way.

> + case OVS_ACTION_ATTR_PUSH_NSH: {
> + u8 buffer[256];
> + struct nsh_hdr *nsh_hdr = (struct nsh_hdr *)buffer;
> + const struct nsh_hdr *nsh_src = nsh_hdr;
> +
> + nsh_hdr_from_nlattr(nla_data(a), nsh_hdr);

This is very dangerous security wise. You have to protect against
buffer overflow, one way or other. The current code may not overflow
(I have not checked that, though) but a future addition may break the
assumption without being obvious it's a problem.

Note that the previous version had exactly the same problem but it was
hidden and I didn't notice it. Which means that getting rid of that
push_nsh_para struct was a very good thing, the code is more clean and
more obvious now.

> +static int parse_nsh(struct sk_buff *skb, struct sw_flow_key *key)
> +{
> + struct nsh_hdr *nsh = (struct nsh_hdr *)skb_network_header(skb);
> + u8 version, length;
> + int err;
> +
> + err = check_header(skb, NSH_BASE_HDR_LEN);
> + if (unlikely(err))
> +   

[PATCH net-next v4] openvswitch: enable NSH support

2017-08-18 Thread Yi Yang
v3->v4
 - Add new NSH match field ttl
 - Update NSH header to the latest format
   which will be final format and won't change
   per its author's confirmation.
 - Fix comments for v3.

v2->v3
 - Change OVS_KEY_ATTR_NSH to nested key to handle
   length-fixed attributes and length-variable
   attriubte more flexibly.
 - Remove struct ovs_action_push_nsh completely
 - Add code to handle nested attribute for SET_MASKED
 - Change PUSH_NSH to use the nested OVS_KEY_ATTR_NSH
   to transfer NSH header data.
 - Fix comments and coding style issues by Jiri and Eric

v1->v2
 - Change encap_nsh and decap_nsh to push_nsh and pop_nsh
 - Dynamically allocate struct ovs_action_push_nsh for
   length-variable metadata.

OVS master and 2.8 branch has merged NSH userspace
patch series, this patch is to enable NSH support
in kernel data path in order that OVS can support
NSH in 2.8 release in compat mode by porting this.

Signed-off-by: Yi Yang 
---
 drivers/net/vxlan.c  |   7 +
 include/net/nsh.h| 325 +++
 include/uapi/linux/if_ether.h|   1 +
 include/uapi/linux/openvswitch.h |  30 
 net/openvswitch/actions.c| 177 +++
 net/openvswitch/flow.c   |  52 ++
 net/openvswitch/flow.h   |  11 ++
 net/openvswitch/flow_netlink.c   | 361 ++-
 net/openvswitch/flow_netlink.h   |   3 +
 9 files changed, 966 insertions(+), 1 deletion(-)
 create mode 100644 include/net/nsh.h

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index ae3a1da..a36c41e 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #if IS_ENABLED(CONFIG_IPV6)
 #include 
@@ -1268,6 +1269,9 @@ static bool vxlan_parse_gpe_hdr(struct vxlanhdr *unparsed,
case VXLAN_GPE_NP_IPV6:
*protocol = htons(ETH_P_IPV6);
break;
+   case VXLAN_GPE_NP_NSH:
+   *protocol = htons(ETH_P_NSH);
+   break;
case VXLAN_GPE_NP_ETHERNET:
*protocol = htons(ETH_P_TEB);
break;
@@ -1807,6 +1811,9 @@ static int vxlan_build_gpe_hdr(struct vxlanhdr *vxh, u32 
vxflags,
case htons(ETH_P_IPV6):
gpe->next_protocol = VXLAN_GPE_NP_IPV6;
return 0;
+   case htons(ETH_P_NSH):
+   gpe->next_protocol = VXLAN_GPE_NP_NSH;
+   return 0;
case htons(ETH_P_TEB):
gpe->next_protocol = VXLAN_GPE_NP_ETHERNET;
return 0;
diff --git a/include/net/nsh.h b/include/net/nsh.h
new file mode 100644
index 000..5186fff
--- /dev/null
+++ b/include/net/nsh.h
@@ -0,0 +1,325 @@
+#ifndef __NET_NSH_H
+#define __NET_NSH_H 1
+
+/*
+ * Network Service Header:
+ *  0   1   2   3
+ *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * |Ver|O|U|TTL|   Length  |U|U|U|U|MD Type| Next Protocol |
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * |  Service Path Identifier (SPI)| Service Index |
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * |   |
+ * ~   Mandatory/Optional Context Headers  ~
+ * |   |
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *
+ * Version: The version field is used to ensure backward compatibility
+ * going forward with future NSH specification updates.  It MUST be set
+ * to 0x0 by the sender, in this first revision of NSH.  Given the
+ * widespread implementation of existing hardware that uses the first
+ * nibble after an MPLS label stack for ECMP decision processing, this
+ * document reserves version 01b and this value MUST NOT be used in
+ * future versions of the protocol.  Please see [RFC7325] for further
+ * discussion of MPLS-related forwarding requirements.
+ *
+ * O bit: Setting this bit indicates an Operations, Administration, and
+ * Maintenance (OAM) packet.  The actual format and processing of SFC
+ * OAM packets is outside the scope of this specification (see for
+ * example [I-D.ietf-sfc-oam-framework] for one approach).
+ *
+ * The O bit MUST be set for OAM packets and MUST NOT be set for non-OAM
+ * packets.  The O bit MUST NOT be modified along the SFP.
+ *
+ * SF/SFF/SFC Proxy/Classifier implementations that do not support SFC
+ * OAM procedures SHOULD discard packets with O bit set, but MAY support
+ * a configurable parameter to enable forwarding received SFC OAM
+ * packets unmodified to the next element in the chain.  Forwarding OAM
+ * packets unmodified by SFC elements that do not support SFC OAM
+ * procedures may be acceptable for a subset of OAM functions, but can
+ *