Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-09-26 Thread pravin shelar
On Mon, Sep 26, 2016 at 9:53 AM, Jiri Benc  wrote:
> Reviving a very old thread, sorry. Simon handed this over to me, I'm
> preparing v12.
>
> On Fri, 15 Jul 2016 14:07:37 -0700, pravin shelar wrote:
>> I am not sure if you can use only mac_len to detect L3 packet. This
>> does not work with MPLS packets, mac_len is used to account MPLS
>> headers pushed on skb. Therefore in case of a MPLS header on L3
>> packet, mac_len would be non zero and we have to look at either
>> mac_header or some other metadata like is_layer3 flag from key to
>> check for L3 packet.
>
> I went through the relevant code paths and I don't see any problem in
> using mac_len for that. MPLS GSO seems to work correctly. The kernel
> MPLS code expects mac_len to be just the L2 header len, excluding MPLS.
> The same is the case for openvswitch (you're not correct that "mac_len
> is used to account MPLS headers pushed on skb", at least not with the
> current code). In no place I see any problem with mac_len being 0, the
> calculations just nicely work.
>
> What was your concern with that, Pravin?
>
> In another mail in this thread you mentioned skb_mpls_header. That one
> works correctly with mac_len == 0 if mac_header points to the beginning
> of the packet.
>
> You also wrote:
>
>> I was thinking in overall networking stack rather than just ovs
>> datapath. I think we should have consistent method of detecting L3
>> packet. As commented in previous mail it could be achieved using
>> skb-protocol and device type.
>
> Again, mac_len == 0 works correctly and consistently, provided that
> both mac_header and network_header point to the same place. In case of
> a MPLS packet it would be the beginning of MPLS headers.
>
>> > --- a/include/net/mpls.h
>> > +++ b/include/net/mpls.h
>> > @@ -34,6 +34,8 @@ static inline bool eth_p_mpls(__be16 eth_type)
>> >   */
>> >  static inline unsigned char *skb_mpls_header(struct sk_buff *skb)
>> >  {
>> > -   return skb_mac_header(skb) + skb->mac_len;
>> > +   return skb_mac_header_was_set(skb) ?
>> > +   skb_mac_header(skb) + skb->mac_len :
>> > +   skb->data;
>> >  }
>>
>> This function is also called from GSO layer.
>
> I don't see it used anywhere outside of openvswitch. Not even when
> grepping git history. I may be missing something, though.
>
>> issue is in GSO layer, it
>> does reset mac header and mac length and then calls mpls-gso-handler.
>> So all subsequent check for L3 packet fails.
>> So far we have explored three different ways to detect L3 packet but
>> each has its own issue.
>> 1. skb mac header : GSO can reset mac header.
>> 2. skb mac length : MPLS uses mac_len to account for MPLS header
>> length along with L2 header
>
> It does not appear to be the case. Or at least not anymore.
>
>> 3. skb protocol: ETH_P_TEB is not set for all L2 frames, networking
>> stack is not ready to handle this type for given skb.
>>
>> So none of them works consistently. I think the only option to detect
>> L3 packet reliably (and without adding field to skb) is to use
>> skb-protocol along with ARPHRD_NONE device type. If ARPHRD_NONE type
>> device generates L2 packet it needs to set protocol to ETH_P_TEB. Some
>> networking stack function also needs to be fixed to handle this
>> protocol type, e.g. vlan_get_protocol(), br_dev_queue_push_xmit(),
>> etc.
>
> All of this said, I'm not opposed to using the skb_eth_header_present
> helper and checking the device type, it works. I just want to understand
> whether I missed some problem with mac_len. Seems to make some things
> simpler if we could use mac_len.
>

After commit 48d2ab609b6bb ("net: mpls: Fixups for GSO") MPLS does not
need to use skb mac-len to track the header, so using mac-len test for
L3 packet detection would result in better and cleaner solution.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-09-26 Thread Jiri Benc
Reviving a very old thread, sorry. Simon handed this over to me, I'm
preparing v12.

On Fri, 15 Jul 2016 14:07:37 -0700, pravin shelar wrote:
> I am not sure if you can use only mac_len to detect L3 packet. This
> does not work with MPLS packets, mac_len is used to account MPLS
> headers pushed on skb. Therefore in case of a MPLS header on L3
> packet, mac_len would be non zero and we have to look at either
> mac_header or some other metadata like is_layer3 flag from key to
> check for L3 packet.

I went through the relevant code paths and I don't see any problem in
using mac_len for that. MPLS GSO seems to work correctly. The kernel
MPLS code expects mac_len to be just the L2 header len, excluding MPLS.
The same is the case for openvswitch (you're not correct that "mac_len
is used to account MPLS headers pushed on skb", at least not with the
current code). In no place I see any problem with mac_len being 0, the
calculations just nicely work.

What was your concern with that, Pravin?

In another mail in this thread you mentioned skb_mpls_header. That one
works correctly with mac_len == 0 if mac_header points to the beginning
of the packet.

You also wrote:

> I was thinking in overall networking stack rather than just ovs
> datapath. I think we should have consistent method of detecting L3
> packet. As commented in previous mail it could be achieved using
> skb-protocol and device type.

Again, mac_len == 0 works correctly and consistently, provided that
both mac_header and network_header point to the same place. In case of
a MPLS packet it would be the beginning of MPLS headers.

> > --- a/include/net/mpls.h
> > +++ b/include/net/mpls.h
> > @@ -34,6 +34,8 @@ static inline bool eth_p_mpls(__be16 eth_type)
> >   */
> >  static inline unsigned char *skb_mpls_header(struct sk_buff *skb)
> >  {
> > -   return skb_mac_header(skb) + skb->mac_len;
> > +   return skb_mac_header_was_set(skb) ?
> > +   skb_mac_header(skb) + skb->mac_len :
> > +   skb->data;
> >  }
> 
> This function is also called from GSO layer.

I don't see it used anywhere outside of openvswitch. Not even when
grepping git history. I may be missing something, though.

> issue is in GSO layer, it
> does reset mac header and mac length and then calls mpls-gso-handler.
> So all subsequent check for L3 packet fails.
> So far we have explored three different ways to detect L3 packet but
> each has its own issue.
> 1. skb mac header : GSO can reset mac header.
> 2. skb mac length : MPLS uses mac_len to account for MPLS header
> length along with L2 header

It does not appear to be the case. Or at least not anymore.

> 3. skb protocol: ETH_P_TEB is not set for all L2 frames, networking
> stack is not ready to handle this type for given skb.
> 
> So none of them works consistently. I think the only option to detect
> L3 packet reliably (and without adding field to skb) is to use
> skb-protocol along with ARPHRD_NONE device type. If ARPHRD_NONE type
> device generates L2 packet it needs to set protocol to ETH_P_TEB. Some
> networking stack function also needs to be fixed to handle this
> protocol type, e.g. vlan_get_protocol(), br_dev_queue_push_xmit(),
> etc.

All of this said, I'm not opposed to using the skb_eth_header_present
helper and checking the device type, it works. I just want to understand
whether I missed some problem with mac_len. Seems to make some things
simpler if we could use mac_len.

Thanks,

 Jiri
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-30 Thread Joe Stringer
On 26 August 2016 at 02:13, Simon Horman  wrote:
> On Thu, Aug 25, 2016 at 05:33:57PM -0700, Joe Stringer wrote:
>> On 25 August 2016 at 03:08, Simon Horman  wrote:
>> > Please find my working patch below.
>> >
>> > From: Simon Horman 
>> > Subject: [PATCH] system-traffic: Exercise GSO
>> >
>> > Exercise GSO for: unencapsulated; MPLS; GRE; and MPLS in GRE.
>> >
>> > There is scope to extend this testing to other encapsulation formats
>> > if desired.
>> >
>> > This is motivated by a desire to test GRE and MPLS encapsulation in
>> > the context of L3/VPN (MPLS over non-TEB GRE work). That is not
>> > tested here but tests for those cases would idealy be based on those in
>> > this patch.
>> >
>> > Signed-off-by: Simon Horman 
>>
>> I realised that these tests disable TSO, but they don't actually check
>> if GSO is enabled. Maybe it's safe to assume this, but it's more
>> explicit to actually look for it in the tests.
>
> Good point, I'll see about checking that.
>
>> With particular setups (fedora23 in particular, both kernel and
>> userspace testsuites) I see this:
>>
>> ./system-traffic.at:371: ip netns exec at_ns0 sh << NS_EXEC_HEREDOC
>> ip route add 10.1.2.0/24 encap mpls 100 via inet 10.1.1.2 dev ns_gre0
>> NS_EXEC_HEREDOC
>> --- /dev/null 2016-08-19 01:28:02.15100 +
>> +++ 
>> /home/gitlab-runner/builds/83c49bff/0/root/gitlab-ovs/ovs/tests/system-kmod-testsuite.dir/at-groups/10/stderr
>> 2016-08-25 17:16:27.32400 +
>> @@ -0,0 +1 @@
>> +Error: either "to" is duplicate, or "encap" is a garbage.
>>
>> I'm guessing the ip tool is a little out of date. We could detect and
>> skip this with something like:
>>
>> AT_SKIP_IF([ip route help 2>&1 | grep encap])
>>
>> in the CHECK_MPLS.
>
> Thanks, I'll add something like that.
>
>> Hmm, I'm still seeing the bad counts of segments retransmited even
>> with the diff change on a kernel I have built at bf0f500bd019 ("Merge
>> tag 'trace-v4.8-1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace").
>> If it's passing on latest net-next then maybe I just need to swap out
>> that box's kernel for a newer build. I'll try that.
>
> It is possible that it is detecting a bug.
> Which test is failing?

FWIW I tried with a newer build, commit 9a0a5c4cb1af ("net:
systemport: Fix ordering in intrl2_*_mask_clear macro"). I no longer
see the issue.

Unfortunately I lost my test output. It was one of these two:

  8: datapath - ping over gre tunnel FAILED
(system-traffic.at:294)
  9: datapath - http over gre tunnel FAILED
(system-traffic.at:348)

I also realised that I didn't have MPLS router enabled in my kernel
config so the MPLS tests were getting skipped. I enabled MPLS_ROUTING,
but now I see this failure on the "http over mpls" tests:

./system-traffic.at:111: ip netns exec at_ns0 sh << NS_EXEC_HEREDOC
ip route add 10.1.1.0/24 encap mpls 100 via inet 172.31.1.2 dev p0
NS_EXEC_HEREDOC
--- /dev/null 2016-08-30 15:22:28.813316948 -0700
+++ 
/home/gitlab-runner/builds/f1d4a2be/0/root/gitlab-ovs/ovs/tests/system-kmod-testsuite.dir/at-groups/4/stderr
2016-08-30 15:33:45.133306581 -0700
@@ -0,0 +1 @@
+RTNETLINK answers: Operation not supported


> At this stage I have mostly added TSO/GSO testing to existing checks.
> Perhaps it would be better to break them out into separate checks so
> ping/http can be be checked without considering TSO/GSO which may have some
> value in situations where TSO/GSO is broken which is actually what I am
> interested in testing.

Sounds reasonable.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-26 Thread Simon Horman
On Thu, Aug 25, 2016 at 05:33:57PM -0700, Joe Stringer wrote:
> On 25 August 2016 at 03:08, Simon Horman  wrote:
> > Please find my working patch below.
> >
> > From: Simon Horman 
> > Subject: [PATCH] system-traffic: Exercise GSO
> >
> > Exercise GSO for: unencapsulated; MPLS; GRE; and MPLS in GRE.
> >
> > There is scope to extend this testing to other encapsulation formats
> > if desired.
> >
> > This is motivated by a desire to test GRE and MPLS encapsulation in
> > the context of L3/VPN (MPLS over non-TEB GRE work). That is not
> > tested here but tests for those cases would idealy be based on those in
> > this patch.
> >
> > Signed-off-by: Simon Horman 
> 
> I realised that these tests disable TSO, but they don't actually check
> if GSO is enabled. Maybe it's safe to assume this, but it's more
> explicit to actually look for it in the tests.

Good point, I'll see about checking that.

> With particular setups (fedora23 in particular, both kernel and
> userspace testsuites) I see this:
> 
> ./system-traffic.at:371: ip netns exec at_ns0 sh << NS_EXEC_HEREDOC
> ip route add 10.1.2.0/24 encap mpls 100 via inet 10.1.1.2 dev ns_gre0
> NS_EXEC_HEREDOC
> --- /dev/null 2016-08-19 01:28:02.15100 +
> +++ 
> /home/gitlab-runner/builds/83c49bff/0/root/gitlab-ovs/ovs/tests/system-kmod-testsuite.dir/at-groups/10/stderr
> 2016-08-25 17:16:27.32400 +
> @@ -0,0 +1 @@
> +Error: either "to" is duplicate, or "encap" is a garbage.
> 
> I'm guessing the ip tool is a little out of date. We could detect and
> skip this with something like:
> 
> AT_SKIP_IF([ip route help 2>&1 | grep encap])
> 
> in the CHECK_MPLS.

Thanks, I'll add something like that.

> Hmm, I'm still seeing the bad counts of segments retransmited even
> with the diff change on a kernel I have built at bf0f500bd019 ("Merge
> tag 'trace-v4.8-1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace").
> If it's passing on latest net-next then maybe I just need to swap out
> that box's kernel for a newer build. I'll try that.

It is possible that it is detecting a bug.
Which test is failing?

At this stage I have mostly added TSO/GSO testing to existing checks.
Perhaps it would be better to break them out into separate checks so
ping/http can be be checked without considering TSO/GSO which may have some
value in situations where TSO/GSO is broken which is actually what I am
interested in testing.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-25 Thread Joe Stringer
On 25 August 2016 at 03:08, Simon Horman  wrote:
> Please find my working patch below.
>
> From: Simon Horman 
> Subject: [PATCH] system-traffic: Exercise GSO
>
> Exercise GSO for: unencapsulated; MPLS; GRE; and MPLS in GRE.
>
> There is scope to extend this testing to other encapsulation formats
> if desired.
>
> This is motivated by a desire to test GRE and MPLS encapsulation in
> the context of L3/VPN (MPLS over non-TEB GRE work). That is not
> tested here but tests for those cases would idealy be based on those in
> this patch.
>
> Signed-off-by: Simon Horman 

I realised that these tests disable TSO, but they don't actually check
if GSO is enabled. Maybe it's safe to assume this, but it's more
explicit to actually look for it in the tests.

With particular setups (fedora23 in particular, both kernel and
userspace testsuites) I see this:

./system-traffic.at:371: ip netns exec at_ns0 sh << NS_EXEC_HEREDOC
ip route add 10.1.2.0/24 encap mpls 100 via inet 10.1.1.2 dev ns_gre0
NS_EXEC_HEREDOC
--- /dev/null 2016-08-19 01:28:02.15100 +
+++ 
/home/gitlab-runner/builds/83c49bff/0/root/gitlab-ovs/ovs/tests/system-kmod-testsuite.dir/at-groups/10/stderr
2016-08-25 17:16:27.32400 +
@@ -0,0 +1 @@
+Error: either "to" is duplicate, or "encap" is a garbage.

I'm guessing the ip tool is a little out of date. We could detect and
skip this with something like:

AT_SKIP_IF([ip route help 2>&1 | grep encap])

in the CHECK_MPLS.

Hmm, I'm still seeing the bad counts of segments retransmited even
with the diff change on a kernel I have built at bf0f500bd019 ("Merge
tag 'trace-v4.8-1' of
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace").
If it's passing on latest net-next then maybe I just need to swap out
that box's kernel for a newer build. I'll try that.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-25 Thread Simon Horman
On Tue, Aug 23, 2016 at 10:51:47AM +0200, Simon Horman wrote:
> On Mon, Aug 22, 2016 at 02:47:42PM -0700, Joe Stringer wrote:
> > On 22 August 2016 at 04:04, Simon Horman  wrote:
> > > On Wed, Aug 10, 2016 at 10:17:30AM -0700, Joe Stringer wrote:
> > >> On 10 August 2016 at 03:20, Simon Horman  
> > >> wrote:
> > >> > On Tue, Aug 09, 2016 at 08:47:40AM -0700, pravin shelar wrote:
> > >> >> On Mon, Aug 8, 2016 at 8:17 AM, Simon Horman 
> > >> >>  wrote:
> > >> >> > Light testing seems to indicate that it works for GSO skbs
> > >> >> > received over both L3 and L2 GRE tunnels by OvS with both
> > >> >> > IP-in-MPLS and IP (without MPLS) payloads.
> > >> >> >
> > >> >>
> > >> >> Thanks for testing it. Can you also add those tests to OVS kmod test 
> > >> >> suite?
> > >> >> ..
> > >> >
> > >> > Sure, I will look into doing that.
> > >> > Am I correct in thinking Joe Stringer is the best person to contact if
> > >> > I run into trouble there?
> > >>
> > >> Sure. The basics of running the tests is documented here:
> > >> https://github.com/openvswitch/ovs/blob/master/INSTALL.md#datapath-testing
> > >>
> > >> You should be able to get a good feel for how to add tests by perusing
> > >> the commits to tests/system-{traffic,kmod-macros}.at in the OVS source
> > >> tree.
> > >
> > > Thanks Joe,
> > >
> > > it took me a while but I think that I have something working
> > > against the head branch of the OVS tree. I'd value opinions
> > > on the direction I have taken.
> > >
> > > Subject: [PATCH] system-traffic: Exercise GSO
> > >
> > > Exercise GSO for: unencapsulated; MPLS; GRE; and MPLS in GRE.
> > >
> > > There is scope to extend this testing to other encapsulation formats
> > > if desired.
> > >
> > > This is motivated by a desire to test GRE and MPLS encapsulation in
> > > the context of L3/VPN (MPLS over non-TEB GRE work). That is not
> > > tested here but tests for those cases would ideally be based on those in
> > > this patch.
> > 
> > This makes sense to me. There's a few corners that could be improved,
> > primarily for reproducing sane results on a variety of systems, then a
> > couple of style comments. Please do run the tests via both "make
> > check-kernel" and "make check-system-userspace" before submitting,
> > ideally with at least two varieties of kernel: One where you would
> > expect the test to pass, and one where you would expect the tests to
> > be skipped.

Both make check-kernel and make check-system-userspace are now working.
I have tested against net-next and the 3.16 kernel that ships with
Debian stable.

> Thanks. I'm glad I ran this by you before expanding the number of tests.
> 
> > * CHECK_MPLS is defined in system-kmod-macros.at, so a corresponding
> > version should be provided in system-userspace-macros.at. If the
> > criteria for running the test(s) with both userspace and kernel
> > datapaths is the same, then this could instead be moved into
> > system-common-macros.at.
> 
> Understood.
>
> > * "datapath - ping over gre tunnel" adds a command to execute in
> > at_ns1, but that namespace doesn't exist.
> 
> Oops.

I have removed the chunk in question, it seems to be an artifact
of my development of the tests.

> > * "datapath - http over gre tunnel" is missing MPLS_CHECK.
> 
> Thanks, I'll fix that.

On further inspection it seems tome that this check does not use MPLS,
rather it is testing GSO for GRE (without MPLS).

> > * Is there a way to clear the netstat statistics before running the
> > tests which rely on it? I'm getting a failure on one of my systems
> > (ubuntu trusty with a 4.7 kernel), but I'm not sure if the counter was
> > already high before I ran the test.
> 
> I'll look into that. If not they could be recorded to allow a check
> for a non-zero delta.
> 
> Possibly an entirely different mechanism is needed to check for GSO
> functioning. But I'm not sure what it would be at this point.
>
> > * "datapath - http over mpls between two ports"  (maybe others too?)
> > should shift all openflow rules into a single section using AT_DATA,
> > similar to the other tests. This makes it easier to reason about the
> > flow table and understand what's going on before reading through the
> > rest of the test.
> 
> Sure, will do.
> 
> > * If there is a common set of configuration you do for local stack
> > within a namespace to route MPLS traffic, you could consider adding
> > another macro into system-common-macros.at.
> 
> Ok, possibly there is if some of the configuration is parametrised:
> e.g. over the namespace/netdev to send/receive MPLS using native Linux MPLS
> routing.
> 
> > I also see this error on "http over mpls over gre tunnel":
> > +sh: 1: cannot create /proc/sys/net/mpls/conf/ns_gre0/input: Directory
> > nonexistent
> > 
> > Maybe MPLS + GRE needs a separate check?
> 
> Yes, that is probably the case.
> 
> I believe some versions of the kernel support MPLS routing for
> some interfaces but not GRE interfaces.

Please find my working patch below.

From: Simon H

Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-23 Thread Simon Horman
On Mon, Aug 22, 2016 at 02:47:42PM -0700, Joe Stringer wrote:
> On 22 August 2016 at 04:04, Simon Horman  wrote:
> > On Wed, Aug 10, 2016 at 10:17:30AM -0700, Joe Stringer wrote:
> >> On 10 August 2016 at 03:20, Simon Horman  
> >> wrote:
> >> > On Tue, Aug 09, 2016 at 08:47:40AM -0700, pravin shelar wrote:
> >> >> On Mon, Aug 8, 2016 at 8:17 AM, Simon Horman 
> >> >>  wrote:
> >> >> > Light testing seems to indicate that it works for GSO skbs
> >> >> > received over both L3 and L2 GRE tunnels by OvS with both
> >> >> > IP-in-MPLS and IP (without MPLS) payloads.
> >> >> >
> >> >>
> >> >> Thanks for testing it. Can you also add those tests to OVS kmod test 
> >> >> suite?
> >> >> ..
> >> >
> >> > Sure, I will look into doing that.
> >> > Am I correct in thinking Joe Stringer is the best person to contact if
> >> > I run into trouble there?
> >>
> >> Sure. The basics of running the tests is documented here:
> >> https://github.com/openvswitch/ovs/blob/master/INSTALL.md#datapath-testing
> >>
> >> You should be able to get a good feel for how to add tests by perusing
> >> the commits to tests/system-{traffic,kmod-macros}.at in the OVS source
> >> tree.
> >
> > Thanks Joe,
> >
> > it took me a while but I think that I have something working
> > against the head branch of the OVS tree. I'd value opinions
> > on the direction I have taken.
> >
> > Subject: [PATCH] system-traffic: Exercise GSO
> >
> > Exercise GSO for: unencapsulated; MPLS; GRE; and MPLS in GRE.
> >
> > There is scope to extend this testing to other encapsulation formats
> > if desired.
> >
> > This is motivated by a desire to test GRE and MPLS encapsulation in
> > the context of L3/VPN (MPLS over non-TEB GRE work). That is not
> > tested here but tests for those cases would ideally be based on those in
> > this patch.
> 
> This makes sense to me. There's a few corners that could be improved,
> primarily for reproducing sane results on a variety of systems, then a
> couple of style comments. Please do run the tests via both "make
> check-kernel" and "make check-system-userspace" before submitting,
> ideally with at least two varieties of kernel: One where you would
> expect the test to pass, and one where you would expect the tests to
> be skipped.

Thanks. I'm glad I ran this by you before expanding the number of tests.

> * CHECK_MPLS is defined in system-kmod-macros.at, so a corresponding
> version should be provided in system-userspace-macros.at. If the
> criteria for running the test(s) with both userspace and kernel
> datapaths is the same, then this could instead be moved into
> system-common-macros.at.

Understood.

> * "datapath - ping over gre tunnel" adds a command to execute in
> at_ns1, but that namespace doesn't exist.

Oops.

> * "datapath - http over gre tunnel" is missing MPLS_CHECK.

Thanks, I'll fix that.

> * Is there a way to clear the netstat statistics before running the
> tests which rely on it? I'm getting a failure on one of my systems
> (ubuntu trusty with a 4.7 kernel), but I'm not sure if the counter was
> already high before I ran the test.

I'll look into that. If not they could be recorded to allow a check
for a non-zero delta.

Possibly an entirely different mechanism is needed to check for GSO
functioning. But I'm not sure what it would be at this point.

> * "datapath - http over mpls between two ports"  (maybe others too?)
> should shift all openflow rules into a single section using AT_DATA,
> similar to the other tests. This makes it easier to reason about the
> flow table and understand what's going on before reading through the
> rest of the test.

Sure, will do.

> * If there is a common set of configuration you do for local stack
> within a namespace to route MPLS traffic, you could consider adding
> another macro into system-common-macros.at.

Ok, possibly there is if some of the configuration is parametrised:
e.g. over the namespace/netdev to send/receive MPLS using native Linux MPLS
routing.

> I also see this error on "http over mpls over gre tunnel":
> +sh: 1: cannot create /proc/sys/net/mpls/conf/ns_gre0/input: Directory
> nonexistent
> 
> Maybe MPLS + GRE needs a separate check?

Yes, that is probably the case.

I believe some versions of the kernel support MPLS routing for
some interfaces but not GRE interfaces.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-22 Thread Joe Stringer
On 22 August 2016 at 04:04, Simon Horman  wrote:
> On Wed, Aug 10, 2016 at 10:17:30AM -0700, Joe Stringer wrote:
>> On 10 August 2016 at 03:20, Simon Horman  wrote:
>> > On Tue, Aug 09, 2016 at 08:47:40AM -0700, pravin shelar wrote:
>> >> On Mon, Aug 8, 2016 at 8:17 AM, Simon Horman  
>> >> wrote:
>> >> > Light testing seems to indicate that it works for GSO skbs
>> >> > received over both L3 and L2 GRE tunnels by OvS with both
>> >> > IP-in-MPLS and IP (without MPLS) payloads.
>> >> >
>> >>
>> >> Thanks for testing it. Can you also add those tests to OVS kmod test 
>> >> suite?
>> >> ..
>> >
>> > Sure, I will look into doing that.
>> > Am I correct in thinking Joe Stringer is the best person to contact if
>> > I run into trouble there?
>>
>> Sure. The basics of running the tests is documented here:
>> https://github.com/openvswitch/ovs/blob/master/INSTALL.md#datapath-testing
>>
>> You should be able to get a good feel for how to add tests by perusing
>> the commits to tests/system-{traffic,kmod-macros}.at in the OVS source
>> tree.
>
> Thanks Joe,
>
> it took me a while but I think that I have something working
> against the head branch of the OVS tree. I'd value opinions
> on the direction I have taken.
>
> Subject: [PATCH] system-traffic: Exercise GSO
>
> Exercise GSO for: unencapsulated; MPLS; GRE; and MPLS in GRE.
>
> There is scope to extend this testing to other encapsulation formats
> if desired.
>
> This is motivated by a desire to test GRE and MPLS encapsulation in
> the context of L3/VPN (MPLS over non-TEB GRE work). That is not
> tested here but tests for those cases would ideally be based on those in
> this patch.

This makes sense to me. There's a few corners that could be improved,
primarily for reproducing sane results on a variety of systems, then a
couple of style comments. Please do run the tests via both "make
check-kernel" and "make check-system-userspace" before submitting,
ideally with at least two varieties of kernel: One where you would
expect the test to pass, and one where you would expect the tests to
be skipped.

* CHECK_MPLS is defined in system-kmod-macros.at, so a corresponding
version should be provided in system-userspace-macros.at. If the
criteria for running the test(s) with both userspace and kernel
datapaths is the same, then this could instead be moved into
system-common-macros.at.
* "datapath - ping over gre tunnel" adds a command to execute in
at_ns1, but that namespace doesn't exist.
* "datapath - http over gre tunnel" is missing MPLS_CHECK.
* Is there a way to clear the netstat statistics before running the
tests which rely on it? I'm getting a failure on one of my systems
(ubuntu trusty with a 4.7 kernel), but I'm not sure if the counter was
already high before I ran the test.
* "datapath - http over mpls between two ports"  (maybe others too?)
should shift all openflow rules into a single section using AT_DATA,
similar to the other tests. This makes it easier to reason about the
flow table and understand what's going on before reading through the
rest of the test.
* If there is a common set of configuration you do for local stack
within a namespace to route MPLS traffic, you could consider adding
another macro into system-common-macros.at.

I also see this error on "http over mpls over gre tunnel":
+sh: 1: cannot create /proc/sys/net/mpls/conf/ns_gre0/input: Directory
nonexistent

Maybe MPLS + GRE needs a separate check?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-22 Thread Simon Horman
On Wed, Aug 10, 2016 at 10:17:30AM -0700, Joe Stringer wrote:
> On 10 August 2016 at 03:20, Simon Horman  wrote:
> > On Tue, Aug 09, 2016 at 08:47:40AM -0700, pravin shelar wrote:
> >> On Mon, Aug 8, 2016 at 8:17 AM, Simon Horman  
> >> wrote:
> >> > Light testing seems to indicate that it works for GSO skbs
> >> > received over both L3 and L2 GRE tunnels by OvS with both
> >> > IP-in-MPLS and IP (without MPLS) payloads.
> >> >
> >>
> >> Thanks for testing it. Can you also add those tests to OVS kmod test suite?
> >> ..
> >
> > Sure, I will look into doing that.
> > Am I correct in thinking Joe Stringer is the best person to contact if
> > I run into trouble there?
> 
> Sure. The basics of running the tests is documented here:
> https://github.com/openvswitch/ovs/blob/master/INSTALL.md#datapath-testing
> 
> You should be able to get a good feel for how to add tests by perusing
> the commits to tests/system-{traffic,kmod-macros}.at in the OVS source
> tree.

Thanks Joe,

it took me a while but I think that I have something working
against the head branch of the OVS tree. I'd value opinions
on the direction I have taken.

Subject: [PATCH] system-traffic: Exercise GSO

Exercise GSO for: unencapsulated; MPLS; GRE; and MPLS in GRE.

There is scope to extend this testing to other encapsulation formats
if desired.

This is motivated by a desire to test GRE and MPLS encapsulation in
the context of L3/VPN (MPLS over non-TEB GRE work). That is not
tested here but tests for those cases would ideally be based on those in
this patch.

---
 tests/system-common-macros.at |  36 +--
 tests/system-kmod-macros.at   |  22 +
 tests/system-traffic.at   | 225 +-
 3 files changed, 274 insertions(+), 9 deletions(-)

diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
index 4ffc3822a4d3..a201cf8ce100 100644
--- a/tests/system-common-macros.at
+++ b/tests/system-common-macros.at
@@ -56,7 +56,7 @@ m4_define([ADD_INT],
 ]
 )
 
-# ADD_VETH([port], [namespace], [ovs-br], [ip_addr] [mac_addr [gateway]])
+# ADD_VETH([port], [namespace], [ovs-br], [ip_addr] [mac_addr [gateway 
[ofport]]])
 #
 # Add a pair of veth ports. 'port' will be added to name space 'namespace',
 # and "ovs-'port'" will be added to ovs bridge 'ovs-br'.
@@ -64,8 +64,8 @@ m4_define([ADD_INT],
 # The 'port' in 'namespace' will be brought up with static IP address
 # with 'ip_addr' in CIDR notation.
 #
-# Optionally, one can specify the 'mac_addr' for 'port' and the default
-# 'gateway'.
+# Optionally, one can specify the 'mac_addr' for 'port', the default
+# 'gateway' and the 'ofport' number.
 #
 # The existing 'port' or 'ovs-port' will be removed before new ones are added.
 #
@@ -74,8 +74,14 @@ m4_define([ADD_VETH],
   CONFIGURE_VETH_OFFLOADS([$1])
   AT_CHECK([ip link set $1 netns $2])
   AT_CHECK([ip link set dev ovs-$1 up])
-  AT_CHECK([ovs-vsctl add-port $3 ovs-$1 -- \
-set interface ovs-$1 external-ids:iface-id="$1"])
+  if test -n "$7"; then
+AT_CHECK([ovs-vsctl add-port $3 ovs-$1 -- \
+  set interface ovs-$1 external-ids:iface-id="$1" \
+  ofport_request=$7])
+  else
+AT_CHECK([ovs-vsctl add-port $3 ovs-$1 -- \
+  set interface ovs-$1 external-ids:iface-id="$1"])
+  fi
   NS_CHECK_EXEC([$2], [ip addr add $4 dev $1])
   NS_CHECK_EXEC([$2], [ip link set dev $1 up])
   if test -n "$5"; then
@@ -99,7 +105,7 @@ m4_define([ADD_VLAN],
 ]
 )
 
-# ADD_OVS_TUNNEL([type], [bridge], [port], [remote-addr], [overlay-addr])
+# ADD_OVS_TUNNEL([type], [bridge], [port], [remote-addr], [overlay-addr 
[ofport]])
 #
 # Add an ovs-based tunnel device in the root namespace, with name 'port' and
 # type 'type'. The tunnel device will be configured as point-to-point with the
@@ -107,9 +113,17 @@ m4_define([ADD_VLAN],
 #
 # 'port will be configured with the address 'overlay-addr'.
 #
+# Optionally one can specify the 'ofport' number
+#
 m4_define([ADD_OVS_TUNNEL],
-   [AT_CHECK([ovs-vsctl add-port $2 $3 -- \
-  set int $3 type=$1 options:remote_ip=$4])
+   [if test -n "$6"; then
+  AT_CHECK([ovs-vsctl add-port $2 $3 -- \
+set int $3 type=$1 options:remote_ip=$4 \
+   ofport_request=$6])
+else
+  AT_CHECK([ovs-vsctl add-port $2 $3 -- \
+set int $3 type=$1 options:remote_ip=$4])
+fi
 AT_CHECK([ip addr add dev $2 $5])
 AT_CHECK([ip link set dev $2 up])
 AT_CHECK([ip link set dev $2 mtu 1450])
@@ -143,6 +157,12 @@ m4_define([ADD_NATIVE_TUNNEL],
 #
 m4_define([FORMAT_PING], [grep "transmitted" | sed 's/time.*ms$/time 0ms/'])
 
+# FORMAT_DD([])
+#
+# Strip variant pieces from dd output so the output can be reliably compared.
+#
+m4_define([FORMAT_DD], [sed 's/copied,.*$/copied, .../'])
+
 # FORMAT_CT([ip-addr])
 #
 # Strip content from the piped input which would differ from test to test
diff --git a/tests/system

Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-10 Thread Joe Stringer
On 10 August 2016 at 03:20, Simon Horman  wrote:
> On Tue, Aug 09, 2016 at 08:47:40AM -0700, pravin shelar wrote:
>> On Mon, Aug 8, 2016 at 8:17 AM, Simon Horman  
>> wrote:
>> > Light testing seems to indicate that it works for GSO skbs
>> > received over both L3 and L2 GRE tunnels by OvS with both
>> > IP-in-MPLS and IP (without MPLS) payloads.
>> >
>>
>> Thanks for testing it. Can you also add those tests to OVS kmod test suite?
>> ..
>
> Sure, I will look into doing that.
> Am I correct in thinking Joe Stringer is the best person to contact if
> I run into trouble there?

Sure. The basics of running the tests is documented here:
https://github.com/openvswitch/ovs/blob/master/INSTALL.md#datapath-testing

You should be able to get a good feel for how to add tests by perusing
the commits to tests/system-{traffic,kmod-macros}.at in the OVS source
tree.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-10 Thread Simon Horman
On Tue, Aug 09, 2016 at 08:47:40AM -0700, pravin shelar wrote:
> On Mon, Aug 8, 2016 at 8:17 AM, Simon Horman  
> wrote:

...

> > Hi Pravin,
> >
> > I have made an attempt to implement your suggestion to the extent that
> > I understand it. The following is an incremental change on top
> > of this patch-set. Does it move things closer to what you have in mind?
> >
> Following approach looks good to me. I have posted couple of comments.

Thanks, I am rather glad to hear that.

> > Light testing seems to indicate that it works for GSO skbs
> > received over both L3 and L2 GRE tunnels by OvS with both
> > IP-in-MPLS and IP (without MPLS) payloads.
> >
> 
> Thanks for testing it. Can you also add those tests to OVS kmod test suite?
> ..

Sure, I will look into doing that.
Am I correct in thinking Joe Stringer is the best person to contact if
I run into trouble there?

> > diff --git a/net/mpls/mpls_gso.c b/net/mpls/mpls_gso.c
> > index 2055e57ed1c3..113cba89653d 100644
> > --- a/net/mpls/mpls_gso.c
> > +++ b/net/mpls/mpls_gso.c
> > @@ -39,16 +39,18 @@ static struct sk_buff *mpls_gso_segment(struct sk_buff 
> > *skb,
> > mpls_features = skb->dev->mpls_features & features;
> > segs = skb_mac_gso_segment(skb, mpls_features);
> >
> > -
> > -   /* Restore outer protocol. */
> > -   skb->protocol = mpls_protocol;
> > -
> > /* Re-pull the mac header that the call to skb_mac_gso_segment()
> >  * above pulled.  It will be re-pushed after returning
> >  * skb_mac_gso_segment(), an indirect caller of this function.
> >  */
> > __skb_pull(skb, skb->data - skb_mac_header(skb));
> >
> > +   /* Restore outer protocol. */
> > +   skb->protocol = mpls_protocol;
> > +   if (!IS_ERR(segs))
> > +   for (skb = segs; skb; skb = skb->next)
> > +   skb->protocol = mpls_protocol;
> > +
> 
> skb_mac_gso_segment() can also return NULL. Therefore segs should be
> checked for NULL case.

Sure, I will fix that.
I think that can be trivially resolved using IS_ERR_OR_NULL()

> > diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> > index 0001f651c934..424164222f1e 100644
> > --- a/net/openvswitch/actions.c
> > +++ b/net/openvswitch/actions.c
> 
> ...
> > @@ -308,8 +319,8 @@ static int pop_eth(struct sk_buff *skb, struct 
> > sw_flow_key *key)
> >  {
> > skb_pull_rcsum(skb, ETH_HLEN);
> > skb_reset_mac_header(skb);
> > -   skb->mac_len -= ETH_HLEN;
> >
> 
> I am not sure why this line is removed.

I will restore it.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-10 Thread Simon Horman
On Mon, Aug 08, 2016 at 05:28:39PM +0200, Jiri Benc wrote:
> On Mon, 8 Aug 2016 17:17:17 +0200, Simon Horman wrote:
> > +bool skb_mac_header_present(struct sk_buff *skb)
> > +{
> > +   return skb->dev->type == ARPHRD_ETHER ||
> > +   (skb->dev->type == ARPHRD_NONE &&
> > +skb->protocol == htons(ETH_P_TEB));
> > +}
> > +EXPORT_SYMBOL(skb_mac_header_present);
> 
> I'd suggest a different name, this looks like it has something to do
> with skb->mac_header, which it doesn't. skb_eth_header_present, perhaps?

I struggled to come up with a reasonable name and I do like your
suggestion better than mine. I'll update the code as you suggest unless a
better name emerges.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-09 Thread pravin shelar
On Mon, Aug 8, 2016 at 8:17 AM, Simon Horman  wrote:
> On Wed, Jul 20, 2016 at 11:06:37AM -0700, pravin shelar wrote:
>> On Tue, Jul 19, 2016 at 5:02 PM, Simon Horman
>>  wrote:
>> > On Mon, Jul 18, 2016 at 03:34:52PM -0700, pravin shelar wrote:
>> >> On Sun, Jul 17, 2016 at 9:50 PM, Simon Horman
>> >>  wrote:
>> >> > [CC Jiri Benc for portion regarding GRE]
>> >> >
>> >> > Hi Pravin,
>> >> >
>> >> > On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote:
>> >> >> On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman
>> >> >>  wrote:
>> >> >> > Hi Pravin,
>> >> >> >
>> >> >> > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote:
>> >> >> >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
>> >> >> >>  wrote:
>> >> >> >
>> >> >> > ...
>> >> >>
>> >> >> >
>> >> >> >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
>> >> >> >> > index 0ea128eeeab2..86f2cfb19de3 100644
>> >> >> >> > --- a/net/openvswitch/flow.c
>> >> >> >> > +++ b/net/openvswitch/flow.c
>> >> >> >> ...
>> >> >> >>
>> >> >> >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct 
>> >> >> >> > ip_tunnel_info *tun_info,
>> >> >> >> > key->phy.skb_mark = skb->mark;
>> >> >> >> > ovs_ct_fill_key(skb, key);
>> >> >> >> > key->ovs_flow_hash = 0;
>> >> >> >> > +   key->phy.is_layer3 = skb->mac_len == 0;
>> >> >> >>
>> >> >> >> I do not think mac_len can be used. mac_header needs to be checked.
>> >> >> >> ...
>> >> >> >
>> >> >> > Yes, indeed. The update to use skb_mac_header_was_set() here 
>> >> >> > accidently
>> >> >> > slipped into the following patch, sorry about that.
>> >> >> >
>> >> >> > With that change in place I believe that this patch is internally
>> >> >> > consistent because mac_header and mac_len are set correctly by the
>> >> >> > call to key_extract() which is called by ovs_flow_key_extract() just
>> >> >> > after where the excerpt above ends.
>> >> >> >
>> >> >> > That said, I do think that it is possible to rely on 
>> >> >> > skb_mac_header_was_set
>> >> >> > throughout the datapath, including action processing etc... I have 
>> >> >> > provided
>> >> >> > an incremental patch - which I created on top of this entire series 
>> >> >> > - at
>> >> >> > the end of this email. If you prefer that approach I am happy to 
>> >> >> > take it,
>> >> >> > though I do feel that using mac_len leads to slightly cleaner code. 
>> >> >> > Let me
>> >> >> > know what you think.
>> >> >> >
>> >> >>
>> >> >>
>> >> >> I am not sure if you can use only mac_len to detect L3 packet. This
>> >> >> does not work with MPLS packets, mac_len is used to account MPLS
>> >> >> headers pushed on skb. Therefore in case of a MPLS header on L3
>> >> >> packet, mac_len would be non zero and we have to look at either
>> >> >> mac_header or some other metadata like is_layer3 flag from key to
>> >> >> check for L3 packet.
>> >> >
>> >> > At least within OvS mac_len does not include the length of the MPLS 
>> >> > label
>> >> > stack. Rather, the MPLS label stack length is the difference between the
>> >> > end of (mac_header + mac_len) and network_header.
>> >> >
>> >> > So I think that the scheme does work as mac_len is 0 if there is no L2
>> >> > header regardless of if an MPLS label stack is present or not.
>> >> >
>> >>
>> >> I was thinking in overall networking stack rather than just ovs
>> >> datapath. I think we should have consistent method of detecting L3
>> >> packet. As commented in previous mail it could be achieved using
>> >> skb-protocol and device type.
>> >
>> > This is somewhat of a surprise to me. As far as I recall when MPLS support
>> > was added to OvS it and the accompanying support for MPLS GSO was the only
>> > MPLS support present in the kernel. And at the time the scheme developed by
>> > Jesse Gross, myself and others was as I describe above.
>> >
>> > Internally OvS relies on this scheme and in particular it is used
>> > by skb_mpls_header() to calculate the beginning of the MPLS label stack
>> > accurately in the presence of VLAN tags.
>> >
>> > Is it mpls_gso_segment() that you are concerned about?
>> > If so, perhaps the problem could be addressed there.
>>
>> Yes.
>> Can you read the comment I made in previous main in context of
>> function skb_mpls_header(). I have given rational for requested
>> change.
>
> Hi Pravin,
>
> I have made an attempt to implement your suggestion to the extent that
> I understand it. The following is an incremental change on top
> of this patch-set. Does it move things closer to what you have in mind?
>
Following approach looks good to me. I have posted couple of comments.

> Light testing seems to indicate that it works for GSO skbs
> received over both L3 and L2 GRE tunnels by OvS with both
> IP-in-MPLS and IP (without MPLS) payloads.
>

Thanks for testing it. Can you also add those tests to OVS kmod test suite?
..

> diff --git a/net/mpls/mpls_gso.c b/net/mpls/mpls_gso.c
> index 2055e57ed1c3..113cba89653d 100644
> --- a/net/mpls/mpls_gso.c
> +++

Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-08 Thread Jiri Benc
On Mon, 8 Aug 2016 17:17:17 +0200, Simon Horman wrote:
> +bool skb_mac_header_present(struct sk_buff *skb)
> +{
> + return skb->dev->type == ARPHRD_ETHER ||
> + (skb->dev->type == ARPHRD_NONE &&
> +  skb->protocol == htons(ETH_P_TEB));
> +}
> +EXPORT_SYMBOL(skb_mac_header_present);

I'd suggest a different name, this looks like it has something to do
with skb->mac_header, which it doesn't. skb_eth_header_present, perhaps?

 Jiri
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-08-08 Thread Simon Horman
On Wed, Jul 20, 2016 at 11:06:37AM -0700, pravin shelar wrote:
> On Tue, Jul 19, 2016 at 5:02 PM, Simon Horman
>  wrote:
> > On Mon, Jul 18, 2016 at 03:34:52PM -0700, pravin shelar wrote:
> >> On Sun, Jul 17, 2016 at 9:50 PM, Simon Horman
> >>  wrote:
> >> > [CC Jiri Benc for portion regarding GRE]
> >> >
> >> > Hi Pravin,
> >> >
> >> > On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote:
> >> >> On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman
> >> >>  wrote:
> >> >> > Hi Pravin,
> >> >> >
> >> >> > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote:
> >> >> >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
> >> >> >>  wrote:
> >> >> >
> >> >> > ...
> >> >>
> >> >> >
> >> >> >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> >> >> >> > index 0ea128eeeab2..86f2cfb19de3 100644
> >> >> >> > --- a/net/openvswitch/flow.c
> >> >> >> > +++ b/net/openvswitch/flow.c
> >> >> >> ...
> >> >> >>
> >> >> >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct 
> >> >> >> > ip_tunnel_info *tun_info,
> >> >> >> > key->phy.skb_mark = skb->mark;
> >> >> >> > ovs_ct_fill_key(skb, key);
> >> >> >> > key->ovs_flow_hash = 0;
> >> >> >> > +   key->phy.is_layer3 = skb->mac_len == 0;
> >> >> >>
> >> >> >> I do not think mac_len can be used. mac_header needs to be checked.
> >> >> >> ...
> >> >> >
> >> >> > Yes, indeed. The update to use skb_mac_header_was_set() here 
> >> >> > accidently
> >> >> > slipped into the following patch, sorry about that.
> >> >> >
> >> >> > With that change in place I believe that this patch is internally
> >> >> > consistent because mac_header and mac_len are set correctly by the
> >> >> > call to key_extract() which is called by ovs_flow_key_extract() just
> >> >> > after where the excerpt above ends.
> >> >> >
> >> >> > That said, I do think that it is possible to rely on 
> >> >> > skb_mac_header_was_set
> >> >> > throughout the datapath, including action processing etc... I have 
> >> >> > provided
> >> >> > an incremental patch - which I created on top of this entire series - 
> >> >> > at
> >> >> > the end of this email. If you prefer that approach I am happy to take 
> >> >> > it,
> >> >> > though I do feel that using mac_len leads to slightly cleaner code. 
> >> >> > Let me
> >> >> > know what you think.
> >> >> >
> >> >>
> >> >>
> >> >> I am not sure if you can use only mac_len to detect L3 packet. This
> >> >> does not work with MPLS packets, mac_len is used to account MPLS
> >> >> headers pushed on skb. Therefore in case of a MPLS header on L3
> >> >> packet, mac_len would be non zero and we have to look at either
> >> >> mac_header or some other metadata like is_layer3 flag from key to
> >> >> check for L3 packet.
> >> >
> >> > At least within OvS mac_len does not include the length of the MPLS label
> >> > stack. Rather, the MPLS label stack length is the difference between the
> >> > end of (mac_header + mac_len) and network_header.
> >> >
> >> > So I think that the scheme does work as mac_len is 0 if there is no L2
> >> > header regardless of if an MPLS label stack is present or not.
> >> >
> >>
> >> I was thinking in overall networking stack rather than just ovs
> >> datapath. I think we should have consistent method of detecting L3
> >> packet. As commented in previous mail it could be achieved using
> >> skb-protocol and device type.
> >
> > This is somewhat of a surprise to me. As far as I recall when MPLS support
> > was added to OvS it and the accompanying support for MPLS GSO was the only
> > MPLS support present in the kernel. And at the time the scheme developed by
> > Jesse Gross, myself and others was as I describe above.
> >
> > Internally OvS relies on this scheme and in particular it is used
> > by skb_mpls_header() to calculate the beginning of the MPLS label stack
> > accurately in the presence of VLAN tags.
> >
> > Is it mpls_gso_segment() that you are concerned about?
> > If so, perhaps the problem could be addressed there.
> 
> Yes.
> Can you read the comment I made in previous main in context of
> function skb_mpls_header(). I have given rational for requested
> change.

Hi Pravin,

I have made an attempt to implement your suggestion to the extent that
I understand it. The following is an incremental change on top
of this patch-set. Does it move things closer to what you have in mind?

Light testing seems to indicate that it works for GSO skbs
received over both L3 and L2 GRE tunnels by OvS with both
IP-in-MPLS and IP (without MPLS) payloads.

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 72ece516535d..42033537eb4d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2171,17 +2171,14 @@ static inline void skb_reset_mac_header(struct sk_buff 
*skb)
skb->mac_header = skb->data - skb->head;
 }
 
-static inline void skb_unset_mac_header(struct sk_buff *skb)
-{
-   skb->mac_header = (typeof(skb->mac_header))~0U;
-}
-
 static inline void skb_set_mac_heade

Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-07-21 Thread Jiri Benc
On Mon, 18 Jul 2016 13:50:27 +0900, Simon Horman wrote:
> On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote:
> > I think we should send L2 header with l2 header pushed on skb. This is
> > what OVS expect. The skb-push should be done for all l2 packets rather
> > than for particular type of device.
> 
> The following seems to achieve that.
> Jiri, what do you think?
> 
> diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
> index a20248355da0..edbc10690b60 100644
> --- a/net/ipv4/ip_gre.c
> +++ b/net/ipv4/ip_gre.c
> @@ -281,10 +281,9 @@ static int __ipgre_rcv(struct sk_buff *skb, const struct 
> tnl_ptk_info *tpi,
>  raw_proto, false) < 0)
>   goto drop;
>  
> - if (tunnel->dev->type != ARPHRD_NONE)
> + if (tunnel->dev->type != ARPHRD_NONE ||
> + tpi->proto == htons(ETH_P_TEB))
>   skb_pop_mac_header(skb);

This is wrong. The MAC header for ARPHRD_NONE interfaces is null,
that's the meaning of ARPHRD_NONE. mac_header cannot point to the outer
IP header. That would be ARPHRD_IPGRE.

 Jiri
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-07-20 Thread pravin shelar
On Tue, Jul 19, 2016 at 5:02 PM, Simon Horman
 wrote:
> On Mon, Jul 18, 2016 at 03:34:52PM -0700, pravin shelar wrote:
>> On Sun, Jul 17, 2016 at 9:50 PM, Simon Horman
>>  wrote:
>> > [CC Jiri Benc for portion regarding GRE]
>> >
>> > Hi Pravin,
>> >
>> > On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote:
>> >> On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman
>> >>  wrote:
>> >> > Hi Pravin,
>> >> >
>> >> > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote:
>> >> >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
>> >> >>  wrote:
>> >> >
>> >> > ...
>> >>
>> >> >
>> >> >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
>> >> >> > index 0ea128eeeab2..86f2cfb19de3 100644
>> >> >> > --- a/net/openvswitch/flow.c
>> >> >> > +++ b/net/openvswitch/flow.c
>> >> >> ...
>> >> >>
>> >> >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct 
>> >> >> > ip_tunnel_info *tun_info,
>> >> >> > key->phy.skb_mark = skb->mark;
>> >> >> > ovs_ct_fill_key(skb, key);
>> >> >> > key->ovs_flow_hash = 0;
>> >> >> > +   key->phy.is_layer3 = skb->mac_len == 0;
>> >> >>
>> >> >> I do not think mac_len can be used. mac_header needs to be checked.
>> >> >> ...
>> >> >
>> >> > Yes, indeed. The update to use skb_mac_header_was_set() here accidently
>> >> > slipped into the following patch, sorry about that.
>> >> >
>> >> > With that change in place I believe that this patch is internally
>> >> > consistent because mac_header and mac_len are set correctly by the
>> >> > call to key_extract() which is called by ovs_flow_key_extract() just
>> >> > after where the excerpt above ends.
>> >> >
>> >> > That said, I do think that it is possible to rely on 
>> >> > skb_mac_header_was_set
>> >> > throughout the datapath, including action processing etc... I have 
>> >> > provided
>> >> > an incremental patch - which I created on top of this entire series - at
>> >> > the end of this email. If you prefer that approach I am happy to take 
>> >> > it,
>> >> > though I do feel that using mac_len leads to slightly cleaner code. Let 
>> >> > me
>> >> > know what you think.
>> >> >
>> >>
>> >>
>> >> I am not sure if you can use only mac_len to detect L3 packet. This
>> >> does not work with MPLS packets, mac_len is used to account MPLS
>> >> headers pushed on skb. Therefore in case of a MPLS header on L3
>> >> packet, mac_len would be non zero and we have to look at either
>> >> mac_header or some other metadata like is_layer3 flag from key to
>> >> check for L3 packet.
>> >
>> > At least within OvS mac_len does not include the length of the MPLS label
>> > stack. Rather, the MPLS label stack length is the difference between the
>> > end of (mac_header + mac_len) and network_header.
>> >
>> > So I think that the scheme does work as mac_len is 0 if there is no L2
>> > header regardless of if an MPLS label stack is present or not.
>> >
>>
>> I was thinking in overall networking stack rather than just ovs
>> datapath. I think we should have consistent method of detecting L3
>> packet. As commented in previous mail it could be achieved using
>> skb-protocol and device type.
>
> This is somewhat of a surprise to me. As far as I recall when MPLS support
> was added to OvS it and the accompanying support for MPLS GSO was the only
> MPLS support present in the kernel. And at the time the scheme developed by
> Jesse Gross, myself and others was as I describe above.
>
> Internally OvS relies on this scheme and in particular it is used
> by skb_mpls_header() to calculate the beginning of the MPLS label stack
> accurately in the presence of VLAN tags.
>
> Is it mpls_gso_segment() that you are concerned about?
> If so, perhaps the problem could be addressed there.

Yes.
Can you read the comment I made in previous main in context of
function skb_mpls_header(). I have given rational for requested
change.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-07-19 Thread Simon Horman
On Mon, Jul 18, 2016 at 03:34:52PM -0700, pravin shelar wrote:
> On Sun, Jul 17, 2016 at 9:50 PM, Simon Horman
>  wrote:
> > [CC Jiri Benc for portion regarding GRE]
> >
> > Hi Pravin,
> >
> > On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote:
> >> On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman
> >>  wrote:
> >> > Hi Pravin,
> >> >
> >> > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote:
> >> >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
> >> >>  wrote:
> >> >
> >> > ...
> >>
> >> >
> >> >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> >> >> > index 0ea128eeeab2..86f2cfb19de3 100644
> >> >> > --- a/net/openvswitch/flow.c
> >> >> > +++ b/net/openvswitch/flow.c
> >> >> ...
> >> >>
> >> >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct 
> >> >> > ip_tunnel_info *tun_info,
> >> >> > key->phy.skb_mark = skb->mark;
> >> >> > ovs_ct_fill_key(skb, key);
> >> >> > key->ovs_flow_hash = 0;
> >> >> > +   key->phy.is_layer3 = skb->mac_len == 0;
> >> >>
> >> >> I do not think mac_len can be used. mac_header needs to be checked.
> >> >> ...
> >> >
> >> > Yes, indeed. The update to use skb_mac_header_was_set() here accidently
> >> > slipped into the following patch, sorry about that.
> >> >
> >> > With that change in place I believe that this patch is internally
> >> > consistent because mac_header and mac_len are set correctly by the
> >> > call to key_extract() which is called by ovs_flow_key_extract() just
> >> > after where the excerpt above ends.
> >> >
> >> > That said, I do think that it is possible to rely on 
> >> > skb_mac_header_was_set
> >> > throughout the datapath, including action processing etc... I have 
> >> > provided
> >> > an incremental patch - which I created on top of this entire series - at
> >> > the end of this email. If you prefer that approach I am happy to take it,
> >> > though I do feel that using mac_len leads to slightly cleaner code. Let 
> >> > me
> >> > know what you think.
> >> >
> >>
> >>
> >> I am not sure if you can use only mac_len to detect L3 packet. This
> >> does not work with MPLS packets, mac_len is used to account MPLS
> >> headers pushed on skb. Therefore in case of a MPLS header on L3
> >> packet, mac_len would be non zero and we have to look at either
> >> mac_header or some other metadata like is_layer3 flag from key to
> >> check for L3 packet.
> >
> > At least within OvS mac_len does not include the length of the MPLS label
> > stack. Rather, the MPLS label stack length is the difference between the
> > end of (mac_header + mac_len) and network_header.
> >
> > So I think that the scheme does work as mac_len is 0 if there is no L2
> > header regardless of if an MPLS label stack is present or not.
> >
> 
> I was thinking in overall networking stack rather than just ovs
> datapath. I think we should have consistent method of detecting L3
> packet. As commented in previous mail it could be achieved using
> skb-protocol and device type.

This is somewhat of a surprise to me. As far as I recall when MPLS support
was added to OvS it and the accompanying support for MPLS GSO was the only
MPLS support present in the kernel. And at the time the scheme developed by
Jesse Gross, myself and others was as I describe above.

Internally OvS relies on this scheme and in particular it is used
by skb_mpls_header() to calculate the beginning of the MPLS label stack
accurately in the presence of VLAN tags.

Is it mpls_gso_segment() that you are concerned about?
If so, perhaps the problem could be addressed there.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-07-18 Thread pravin shelar
On Sun, Jul 17, 2016 at 9:50 PM, Simon Horman
 wrote:
> [CC Jiri Benc for portion regarding GRE]
>
> Hi Pravin,
>
> On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote:
>> On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman
>>  wrote:
>> > Hi Pravin,
>> >
>> > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote:
>> >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
>> >>  wrote:
>> >
>> > ...
>>
>> >
>> >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
>> >> > index 0ea128eeeab2..86f2cfb19de3 100644
>> >> > --- a/net/openvswitch/flow.c
>> >> > +++ b/net/openvswitch/flow.c
>> >> ...
>> >>
>> >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct 
>> >> > ip_tunnel_info *tun_info,
>> >> > key->phy.skb_mark = skb->mark;
>> >> > ovs_ct_fill_key(skb, key);
>> >> > key->ovs_flow_hash = 0;
>> >> > +   key->phy.is_layer3 = skb->mac_len == 0;
>> >>
>> >> I do not think mac_len can be used. mac_header needs to be checked.
>> >> ...
>> >
>> > Yes, indeed. The update to use skb_mac_header_was_set() here accidently
>> > slipped into the following patch, sorry about that.
>> >
>> > With that change in place I believe that this patch is internally
>> > consistent because mac_header and mac_len are set correctly by the
>> > call to key_extract() which is called by ovs_flow_key_extract() just
>> > after where the excerpt above ends.
>> >
>> > That said, I do think that it is possible to rely on skb_mac_header_was_set
>> > throughout the datapath, including action processing etc... I have provided
>> > an incremental patch - which I created on top of this entire series - at
>> > the end of this email. If you prefer that approach I am happy to take it,
>> > though I do feel that using mac_len leads to slightly cleaner code. Let me
>> > know what you think.
>> >
>>
>>
>> I am not sure if you can use only mac_len to detect L3 packet. This
>> does not work with MPLS packets, mac_len is used to account MPLS
>> headers pushed on skb. Therefore in case of a MPLS header on L3
>> packet, mac_len would be non zero and we have to look at either
>> mac_header or some other metadata like is_layer3 flag from key to
>> check for L3 packet.
>
> At least within OvS mac_len does not include the length of the MPLS label
> stack. Rather, the MPLS label stack length is the difference between the
> end of (mac_header + mac_len) and network_header.
>
> So I think that the scheme does work as mac_len is 0 if there is no L2
> header regardless of if an MPLS label stack is present or not.
>

I was thinking in overall networking stack rather than just ovs
datapath. I think we should have consistent method of detecting L3
packet. As commented in previous mail it could be achieved using
skb-protocol and device type.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-07-17 Thread Simon Horman
[CC Jiri Benc for portion regarding GRE]

Hi Pravin,

On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote:
> On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman
>  wrote:
> > Hi Pravin,
> >
> > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote:
> >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
> >>  wrote:
> >
> > ...
> 
> >
> >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> >> > index 0ea128eeeab2..86f2cfb19de3 100644
> >> > --- a/net/openvswitch/flow.c
> >> > +++ b/net/openvswitch/flow.c
> >> ...
> >>
> >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct 
> >> > ip_tunnel_info *tun_info,
> >> > key->phy.skb_mark = skb->mark;
> >> > ovs_ct_fill_key(skb, key);
> >> > key->ovs_flow_hash = 0;
> >> > +   key->phy.is_layer3 = skb->mac_len == 0;
> >>
> >> I do not think mac_len can be used. mac_header needs to be checked.
> >> ...
> >
> > Yes, indeed. The update to use skb_mac_header_was_set() here accidently
> > slipped into the following patch, sorry about that.
> >
> > With that change in place I believe that this patch is internally
> > consistent because mac_header and mac_len are set correctly by the
> > call to key_extract() which is called by ovs_flow_key_extract() just
> > after where the excerpt above ends.
> >
> > That said, I do think that it is possible to rely on skb_mac_header_was_set
> > throughout the datapath, including action processing etc... I have provided
> > an incremental patch - which I created on top of this entire series - at
> > the end of this email. If you prefer that approach I am happy to take it,
> > though I do feel that using mac_len leads to slightly cleaner code. Let me
> > know what you think.
> >
> 
> 
> I am not sure if you can use only mac_len to detect L3 packet. This
> does not work with MPLS packets, mac_len is used to account MPLS
> headers pushed on skb. Therefore in case of a MPLS header on L3
> packet, mac_len would be non zero and we have to look at either
> mac_header or some other metadata like is_layer3 flag from key to
> check for L3 packet.

At least within OvS mac_len does not include the length of the MPLS label
stack. Rather, the MPLS label stack length is the difference between the
end of (mac_header + mac_len) and network_header.

So I think that the scheme does work as mac_len is 0 if there is no L2
header regardless of if an MPLS label stack is present or not.

> >> > diff --git a/net/openvswitch/vport-netdev.c 
> >> > b/net/openvswitch/vport-netdev.c
> >> > index 4e3972344aa6..733e7914f6bd 100644
> >> > --- a/net/openvswitch/vport-netdev.c
> >> > +++ b/net/openvswitch/vport-netdev.c
> >> > @@ -57,8 +57,10 @@ static void netdev_port_receive(struct sk_buff *skb)
> >> > if (unlikely(!skb))
> >> > return;
> >> >
> >> > -   skb_push(skb, ETH_HLEN);
> >> > -   skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
> >> > +   if (vport->dev->type == ARPHRD_ETHER) {
> >> > +   skb_push(skb, ETH_HLEN);
> >> > +   skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
> >> > +   }
> >> This is still required for tunnel device of ARPHRD_NONE which can
> >> handle l2 packets.
> >
> > That is not necessary given the current implementation (of ipgre) as it
> > supplies an skb with the mac header in place if the inner packet was an
> > Ethernet packet. This scheme could of course be adjusted.
> >
> > ...
> >
> 
> I think we should send L2 header with l2 header pushed on skb. This is
> what OVS expect. The skb-push should be done for all l2 packets rather
> than for particular type of device.

The following seems to achieve that.
Jiri, what do you think?

diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index a20248355da0..edbc10690b60 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -281,10 +281,9 @@ static int __ipgre_rcv(struct sk_buff *skb, const struct 
tnl_ptk_info *tpi,
   raw_proto, false) < 0)
goto drop;
 
-   if (tunnel->dev->type != ARPHRD_NONE)
+   if (tunnel->dev->type != ARPHRD_NONE ||
+   tpi->proto == htons(ETH_P_TEB))
skb_pop_mac_header(skb);
-   else if (tpi->proto != htons(ETH_P_TEB))
-   skb_unset_mac_header(skb);
else
skb_reset_mac_header(skb);
if (tunnel->collect_md) {
@@ -326,7 +325,7 @@ static int ipgre_rcv(struct sk_buff *skb, const struct 
tnl_ptk_info *tpi,
 * also ETH_P_TEB traffic.
 */
itn = net_generic(net, ipgre_net_id);
-   res = __ipgre_rcv(skb, tpi, itn, hdr_len, true);
+   res = __ipgre_rcv(skb, tpi, itn, hdr_len, false);
}
return res;
 }
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 05985209f611..933ac46db53a 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/o

Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-07-15 Thread pravin shelar
On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman
 wrote:
> Hi Pravin,
>
> On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote:
>> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
>>  wrote:
>
> ...

>
>> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
>> > index 0ea128eeeab2..86f2cfb19de3 100644
>> > --- a/net/openvswitch/flow.c
>> > +++ b/net/openvswitch/flow.c
>> ...
>>
>> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct ip_tunnel_info 
>> > *tun_info,
>> > key->phy.skb_mark = skb->mark;
>> > ovs_ct_fill_key(skb, key);
>> > key->ovs_flow_hash = 0;
>> > +   key->phy.is_layer3 = skb->mac_len == 0;
>>
>> I do not think mac_len can be used. mac_header needs to be checked.
>> ...
>
> Yes, indeed. The update to use skb_mac_header_was_set() here accidently
> slipped into the following patch, sorry about that.
>
> With that change in place I believe that this patch is internally
> consistent because mac_header and mac_len are set correctly by the
> call to key_extract() which is called by ovs_flow_key_extract() just
> after where the excerpt above ends.
>
> That said, I do think that it is possible to rely on skb_mac_header_was_set
> throughout the datapath, including action processing etc... I have provided
> an incremental patch - which I created on top of this entire series - at
> the end of this email. If you prefer that approach I am happy to take it,
> though I do feel that using mac_len leads to slightly cleaner code. Let me
> know what you think.
>


I am not sure if you can use only mac_len to detect L3 packet. This
does not work with MPLS packets, mac_len is used to account MPLS
headers pushed on skb. Therefore in case of a MPLS header on L3
packet, mac_len would be non zero and we have to look at either
mac_header or some other metadata like is_layer3 flag from key to
check for L3 packet.


>> > diff --git a/net/openvswitch/vport-netdev.c 
>> > b/net/openvswitch/vport-netdev.c
>> > index 4e3972344aa6..733e7914f6bd 100644
>> > --- a/net/openvswitch/vport-netdev.c
>> > +++ b/net/openvswitch/vport-netdev.c
>> > @@ -57,8 +57,10 @@ static void netdev_port_receive(struct sk_buff *skb)
>> > if (unlikely(!skb))
>> > return;
>> >
>> > -   skb_push(skb, ETH_HLEN);
>> > -   skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
>> > +   if (vport->dev->type == ARPHRD_ETHER) {
>> > +   skb_push(skb, ETH_HLEN);
>> > +   skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
>> > +   }
>> This is still required for tunnel device of ARPHRD_NONE which can
>> handle l2 packets.
>
> That is not necessary given the current implementation (of ipgre) as it
> supplies an skb with the mac header in place if the inner packet was an
> Ethernet packet. This scheme could of course be adjusted.
>
> ...
>

I think we should send L2 header with l2 header pushed on skb. This is
what OVS expect. The skb-push should be done for all l2 packets rather
than for particular type of device.

>
>
> Update to use skb_mac_header_was_set() more as mentioned above.
> Please let me know what you think about this approach.
>
>  include/net/mpls.h   |4 ++-
>  net/openvswitch/actions.c|   42 
> ---
>  net/openvswitch/flow.c   |   23 +++
>  net/openvswitch/vport-internal_dev.c |2 -
>  net/openvswitch/vport-netdev.c   |4 +--
>  5 files changed, 44 insertions(+), 31 deletions(-)
>
> diff --git a/include/net/mpls.h b/include/net/mpls.h
> index 5b3b5addfb08..296b68661be0 100644
> --- a/include/net/mpls.h
> +++ b/include/net/mpls.h
> @@ -34,6 +34,8 @@ static inline bool eth_p_mpls(__be16 eth_type)
>   */
>  static inline unsigned char *skb_mpls_header(struct sk_buff *skb)
>  {
> -   return skb_mac_header(skb) + skb->mac_len;
> +   return skb_mac_header_was_set(skb) ?
> +   skb_mac_header(skb) + skb->mac_len :
> +   skb->data;
>  }

This function is also called from GSO layer. issue is in GSO layer, it
does reset mac header and mac length and then calls mpls-gso-handler.
So all subsequent check for L3 packet fails.
So far we have explored three different ways to detect L3 packet but
each has its own issue.
1. skb mac header : GSO can reset mac header.
2. skb mac length : MPLS uses mac_len to account for MPLS header
length along with L2 header
3. skb protocol: ETH_P_TEB is not set for all L2 frames, networking
stack is not ready to handle this type for given skb.

So none of them works consistently. I think the only option to detect
L3 packet reliably (and without adding field to skb) is to use
skb-protocol along with ARPHRD_NONE device type. If ARPHRD_NONE type
device generates L2 packet it needs to set protocol to ETH_P_TEB. Some
networking stack function also needs to be fixed to handle this
protocol type, e.g. vlan_get_protocol(), br_dev_queue_push_xmit(),
etc.
___
dev ma

Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-07-13 Thread Simon Horman
Hi Pravin,

On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote:
> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
>  wrote:

...

> > diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> > index 12e8a8942a42..0001f651c934 100644
> > --- a/net/openvswitch/actions.c
> > +++ b/net/openvswitch/actions.c
> > @@ -301,6 +301,43 @@ static int set_eth_addr(struct sk_buff *skb, struct 
> > sw_flow_key *flow_key,
> > return 0;
> >  }
> >
> > +/* pop_eth does not support VLAN packets as this action is never called
> > + * for them.
> > + */
> > +static int pop_eth(struct sk_buff *skb, struct sw_flow_key *key)
> > +{
> > +   skb_pull_rcsum(skb, ETH_HLEN);
> > +   skb_reset_mac_header(skb);
> > +   skb->mac_len -= ETH_HLEN;
> > +
> > +   invalidate_flow_key(key);
> > +   return 0;
> > +}
> This is changing l2 packet to l3 packet by reseting mac header. We
> need to unset mac header so that OVS key-extract can identify this
> packet later on, for example after recirc action.
> Other option would be keeping key is_layer3 consistent with packet.
> Push ethernet and pop ethernet action can unset and set the flag in
> flow key. So that OVS can keep track of packet headers by looking at
> packet key. When packet leaves OVS (in netdev-send) we can unset mac
> header according to this flag.

...

> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> > index 0ea128eeeab2..86f2cfb19de3 100644
> > --- a/net/openvswitch/flow.c
> > +++ b/net/openvswitch/flow.c
> ...
> 
> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct ip_tunnel_info 
> > *tun_info,
> > key->phy.skb_mark = skb->mark;
> > ovs_ct_fill_key(skb, key);
> > key->ovs_flow_hash = 0;
> > +   key->phy.is_layer3 = skb->mac_len == 0;
> 
> I do not think mac_len can be used. mac_header needs to be checked.
> ...

Yes, indeed. The update to use skb_mac_header_was_set() here accidently
slipped into the following patch, sorry about that.

With that change in place I believe that this patch is internally
consistent because mac_header and mac_len are set correctly by the
call to key_extract() which is called by ovs_flow_key_extract() just
after where the excerpt above ends.

That said, I do think that it is possible to rely on skb_mac_header_was_set
throughout the datapath, including action processing etc... I have provided
an incremental patch - which I created on top of this entire series - at
the end of this email. If you prefer that approach I am happy to take it,
though I do feel that using mac_len leads to slightly cleaner code. Let me
know what you think.

> > diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> > index c78a6a1476fb..fc94fbe1ddc3 100644
> > --- a/net/openvswitch/flow_netlink.c
> > +++ b/net/openvswitch/flow_netlink.c
> ...
> 
> > @@ -898,20 +922,33 @@ static int metadata_from_nlattrs(struct net *net, 
> > struct sw_flow_match *match,
> >sizeof(*cl), is_mask);
> > *attrs &= ~(1ULL << OVS_KEY_ATTR_CT_LABELS);
> > }
> > -   return 0;
> > -}
> >
> > -static int ovs_key_from_nlattrs(struct net *net, struct sw_flow_match 
> > *match,
> > -   u64 attrs, const struct nlattr **a,
> > -   bool is_mask, bool log)
> > -{
> > -   int err;
> > +   /* For layer 3 packets the ethernet type is provided
> > +* and treated as metadata but no MAC addresses are provided.
> > +*/
> > +   if (*attrs & (1ULL << OVS_KEY_ATTR_ETHERTYPE) &&
> > +   !(*attrs & (1 << OVS_KEY_ATTR_ETHERNET))) {
> > +   int err;
> >
> Here attr_ethertype can be processed irrespective of attr_ethernet.
> is_layer3 can be derived independently. This way there is no need to
> again process attr_ethertyp in l2_from_nlattrs().

Thanks, I have reworked things as you suggest.

...

> > diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
> > index 4e3972344aa6..733e7914f6bd 100644
> > --- a/net/openvswitch/vport-netdev.c
> > +++ b/net/openvswitch/vport-netdev.c
> > @@ -57,8 +57,10 @@ static void netdev_port_receive(struct sk_buff *skb)
> > if (unlikely(!skb))
> > return;
> >
> > -   skb_push(skb, ETH_HLEN);
> > -   skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
> > +   if (vport->dev->type == ARPHRD_ETHER) {
> > +   skb_push(skb, ETH_HLEN);
> > +   skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
> > +   }
> This is still required for tunnel device of ARPHRD_NONE which can
> handle l2 packets.

That is not necessary given the current implementation (of ipgre) as it
supplies an skb with the mac header in place if the inner packet was an
Ethernet packet. This scheme could of course be adjusted.

...



Update to use skb_mac_header_was_set() more as mentioned above.
Please let me know what you think about this approach.

 include/net/mpls.h

Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

2016-07-07 Thread pravin shelar
On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
 wrote:
> From: Lorand Jakab 
>
> Implementation of the pop_eth and push_eth actions in the kernel, and
> layer 3 flow support.
>
> This doesn't actually do anything yet as no layer 2 tunnel ports are
> supported yet. The original patch by Lorand was against the Open vSwitch
> tree which has L2 LISP tunnels but that is not supported in mainline Linux.
> I (Simon) plan to follow up with support for non-TEB GRE ports based on
> work by Thomas Morin.
>
> Cc: Thomas Morin 
> Signed-off-by: Lorand Jakab 
> Signed-off-by: Simon Horman 
>
> ---

> ---
>  include/uapi/linux/openvswitch.h |  11 ++
>  net/openvswitch/actions.c|  45 
>  net/openvswitch/datapath.c   |  13 +--
>  net/openvswitch/flow.c   |  65 +++
>  net/openvswitch/flow.h   |   4 +-
>  net/openvswitch/flow_netlink.c   | 213 
> ---
>  net/openvswitch/vport-geneve.c   |   2 +-
>  net/openvswitch/vport-gre.c  |   2 +-
>  net/openvswitch/vport-internal_dev.c |   6 +
>  net/openvswitch/vport-netdev.c   |  19 +++-
>  net/openvswitch/vport-netdev.h   |   2 +
>  net/openvswitch/vport-vxlan.c|   2 +-
>  12 files changed, 279 insertions(+), 105 deletions(-)
>
> diff --git a/include/uapi/linux/openvswitch.h 
> b/include/uapi/linux/openvswitch.h
> index 5cde501433eb..6f505e486e93 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
...

> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 12e8a8942a42..0001f651c934 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -301,6 +301,43 @@ static int set_eth_addr(struct sk_buff *skb, struct 
> sw_flow_key *flow_key,
> return 0;
>  }
>
> +/* pop_eth does not support VLAN packets as this action is never called
> + * for them.
> + */
> +static int pop_eth(struct sk_buff *skb, struct sw_flow_key *key)
> +{
> +   skb_pull_rcsum(skb, ETH_HLEN);
> +   skb_reset_mac_header(skb);
> +   skb->mac_len -= ETH_HLEN;
> +
> +   invalidate_flow_key(key);
> +   return 0;
> +}
This is changing l2 packet to l3 packet by reseting mac header. We
need to unset mac header so that OVS key-extract can identify this
packet later on, for example after recirc action.
Other option would be keeping key is_layer3 consistent with packet.
Push ethernet and pop ethernet action can unset and set the flag in
flow key. So that OVS can keep track of packet headers by looking at
packet key. When packet leaves OVS (in netdev-send) we can unset mac
header according to this flag.

> +
> +static int push_eth(struct sk_buff *skb, struct sw_flow_key *key,
> +   const struct ovs_action_push_eth *ethh)
> +{
> +   struct ethhdr *hdr;
> +
> +   /* Add the new Ethernet header */
> +   if (skb_cow_head(skb, ETH_HLEN) < 0)
> +   return -ENOMEM;
> +
> +   skb_push(skb, ETH_HLEN);
> +   skb_reset_mac_header(skb);
> +   skb->mac_len += ETH_HLEN;
> +
> +   hdr = eth_hdr(skb);
> +   ether_addr_copy(hdr->h_source, ethh->addresses.eth_src);
> +   ether_addr_copy(hdr->h_dest, ethh->addresses.eth_dst);
> +   hdr->h_proto = skb->protocol;
> +
> +   skb_postpush_rcsum(skb, hdr, ETH_HLEN);
> +
> +   invalidate_flow_key(key);
> +   return 0;
> +}
> +


> diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> index 0ea128eeeab2..86f2cfb19de3 100644
> --- a/net/openvswitch/flow.c
> +++ b/net/openvswitch/flow.c
...

> @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct ip_tunnel_info 
> *tun_info,
> key->phy.skb_mark = skb->mark;
> ovs_ct_fill_key(skb, key);
> key->ovs_flow_hash = 0;
> +   key->phy.is_layer3 = skb->mac_len == 0;

I do not think mac_len can be used. mac_header needs to be checked.
...

> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index c78a6a1476fb..fc94fbe1ddc3 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
...

> @@ -898,20 +922,33 @@ static int metadata_from_nlattrs(struct net *net, 
> struct sw_flow_match *match,
>sizeof(*cl), is_mask);
> *attrs &= ~(1ULL << OVS_KEY_ATTR_CT_LABELS);
> }
> -   return 0;
> -}
>
> -static int ovs_key_from_nlattrs(struct net *net, struct sw_flow_match *match,
> -   u64 attrs, const struct nlattr **a,
> -   bool is_mask, bool log)
> -{
> -   int err;
> +   /* For layer 3 packets the ethernet type is provided
> +* and treated as metadata but no MAC addresses are provided.
> +*/
> +   if (*attrs & (1ULL << OVS_KEY_ATTR_ETHERTYPE) &&
> +   !(*attrs & (1 << OVS_KEY_ATTR_ETHERNET))) {
> +   int err;
>
Here attr_ethertype can be processed irrespective of attr_ethernet.
is_layer3 can be derived