date:20160421

[ovs-dev] [PATCH] datapath-windows: Add parentheses to fix error C2275

2016-04-21 Thread Sairam Venugopal

Add braces around the body of the if statement to prevent Visual Studio
from giving the "error C2275: illegal use of this type as an expression".
This happens when a variable is declared after a block. The error occurs
only with certain versions of the compiler.

Signed-off-by: Sairam Venugopal 
---
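
As background for the error described above: older Visual Studio C compilers are generally understood to follow C89 rules, under which declarations must precede statements within a block. The sketch below shows the most conservative layout, with the declaration hoisted above the NULL check. It is an illustration only, not the change made by this patch (which only adds braces); `struct ct_entry` and `ct_entry_expired` are hypothetical stand-ins for POVS_CT_ENTRY and OvsCtEntryExpired.

```c
#include <stddef.h>

/* Hypothetical stand-in for POVS_CT_ENTRY; only 'expiration' matters here. */
struct ct_entry {
    unsigned long long expiration;
};

/* Returns 1 if 'entry' is NULL or expired at time 'now', 0 otherwise. */
static int
ct_entry_expired(const struct ct_entry *entry, unsigned long long now)
{
    /* C89 style: every declaration comes before the first statement. */
    unsigned long long expiration;

    if (entry == NULL) {
        return 1;
    }

    expiration = entry->expiration;
    return expiration <= now;
}
```

Hoisting the declaration trades a slightly wider variable scope for compatibility with strict C89 compilers.
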
 datapath-windows/ovsext/Conntrack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
index fbeb70c..158d9d8 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -186,8 +186,9 @@ OvsCtEntryDelete(POVS_CT_ENTRY entry)
 static __inline BOOLEAN
 OvsCtEntryExpired(POVS_CT_ENTRY entry)
 {
-if (entry == NULL)
+if (entry == NULL) {
 return TRUE;
+}
 
 UINT64 currentTime;
 NdisGetCurrentSystemTime((LARGE_INTEGER *)&currentTime);
-- 
1.9.5.msysgit.0

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

[ovs-dev] [PATCH] packets: use flow protocol when recalculating ipv6 checksums

2016-04-21 Thread Simon Horman

When using masked actions the ipv6_proto field of an action
to set IPv6 fields may be zero rather than the prevailing protocol
which will result in skipping checksum recalculation.

This patch resolves the problem by relying on the protocol
in the flow key rather than that in the set field action.

A similar fix for the kernel datapath has been accepted into David Miller's
'net' tree as b4f70527f052 ("openvswitch: use flow protocol when
recalculating ipv6 checksums").

Cc: Jarno Rajahalme 
Fixes: 6d670e7f0d45 ("lib/odp: Masked set action execution and printing.")
Signed-off-by: Simon Horman 
---
While preparing this I noticed that there does seem to be some scope
to consolidate packet_rh_present() and part of miniflow_extract().
---
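
To illustrate the problem described in the commit message, here is a minimal sketch of why a protocol taken from a masked set action can suppress checksum recalculation, while the protocol known for the packet does not. The structs and function names are hypothetical simplifications, not the OVS definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define PROTO_TCP    6
#define PROTO_UDP    17
#define PROTO_ICMPV6 58

/* Hypothetical, simplified set-field action: with a masked action,
 * fields that the action does not touch may be left as zero. */
struct set_ipv6_action {
    uint8_t proto;
};

/* Hypothetical, simplified flow key holding the packet's real protocol. */
struct flow_key {
    uint8_t nw_proto;
};

static bool
proto_has_l4_csum(uint8_t proto)
{
    return proto == PROTO_TCP || proto == PROTO_UDP || proto == PROTO_ICMPV6;
}

/* Pre-fix behaviour: trusts the action's protocol, which may be zero. */
static bool
needs_csum_update_old(const struct set_ipv6_action *act)
{
    return proto_has_l4_csum(act->proto);
}

/* Post-fix behaviour: relies on the protocol of the packet itself. */
static bool
needs_csum_update(const struct flow_key *flow)
{
    return proto_has_l4_csum(flow->nw_proto);
}
```

With a masked action whose `proto` is 0 applied to a TCP packet, the old check returns false and the checksum is left stale, whereas the new check returns true.
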
 lib/odp-execute.c |  4 +---
 lib/packets.c | 39 +--
 lib/packets.h |  2 +-
 3 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/lib/odp-execute.c b/lib/odp-execute.c
index b5204b270677..7efd9ec1d349 100644
--- a/lib/odp-execute.c
+++ b/lib/odp-execute.c
@@ -99,7 +99,6 @@ odp_set_ipv6(struct dp_packet *packet, const struct 
ovs_key_ipv6 *key,
 
 packet_set_ipv6(
 packet,
-key->ipv6_proto,
 mask_ipv6_addr(nh->ip6_src.be32, key->ipv6_src, mask->ipv6_src, sbuf),
 mask_ipv6_addr(nh->ip6_dst.be32, key->ipv6_dst, mask->ipv6_dst, dbuf),
 key->ipv6_tclass | (old_tc & ~mask->ipv6_tclass),
@@ -257,8 +256,7 @@ odp_execute_set_action(struct dp_packet *packet, const 
struct nlattr *a)
 
 case OVS_KEY_ATTR_IPV6:
 ipv6_key = nl_attr_get_unspec(a, sizeof(struct ovs_key_ipv6));
-packet_set_ipv6(packet, ipv6_key->ipv6_proto,
-ipv6_key->ipv6_src, ipv6_key->ipv6_dst,
+packet_set_ipv6(packet, ipv6_key->ipv6_src, ipv6_key->ipv6_dst,
 ipv6_key->ipv6_tclass, ipv6_key->ipv6_label,
 ipv6_key->ipv6_hlimit);
 break;
diff --git a/lib/packets.c b/lib/packets.c
index d0c0e68b534d..962fbdb6913c 100644
--- a/lib/packets.c
+++ b/lib/packets.c
@@ -837,10 +837,9 @@ packet_set_ipv4_addr(struct dp_packet *packet,
  *
  * This function assumes that L3 and L4 offsets are set in the packet. */
 static bool
-packet_rh_present(struct dp_packet *packet)
+packet_rh_present(struct dp_packet *packet, uint8_t *nexthdr)
 {
 const struct ovs_16aligned_ip6_hdr *nh;
-int nexthdr;
 size_t len;
 size_t remaining;
 uint8_t *data = dp_packet_l3(packet);
@@ -852,14 +851,14 @@ packet_rh_present(struct dp_packet *packet)
 nh = ALIGNED_CAST(struct ovs_16aligned_ip6_hdr *, data);
 data += sizeof *nh;
 remaining -= sizeof *nh;
-nexthdr = nh->ip6_nxt;
+*nexthdr = nh->ip6_nxt;
 
 while (1) {
-if ((nexthdr != IPPROTO_HOPOPTS)
-&& (nexthdr != IPPROTO_ROUTING)
-&& (nexthdr != IPPROTO_DSTOPTS)
-&& (nexthdr != IPPROTO_AH)
-&& (nexthdr != IPPROTO_FRAGMENT)) {
+if ((*nexthdr != IPPROTO_HOPOPTS)
+&& (*nexthdr != IPPROTO_ROUTING)
+&& (*nexthdr != IPPROTO_DSTOPTS)
+&& (*nexthdr != IPPROTO_AH)
+&& (*nexthdr != IPPROTO_FRAGMENT)) {
 /* It's either a terminal header (e.g., TCP, UDP) or one we
  * don't understand.  In either case, we're done with the
  * packet, so use it to fill in 'nw_proto'. */
@@ -875,34 +874,34 @@ packet_rh_present(struct dp_packet *packet)
 return false;
 }
 
-if (nexthdr == IPPROTO_AH) {
+if (*nexthdr == IPPROTO_AH) {
 /* A standard AH definition isn't available, but the fields
  * we care about are in the same location as the generic
  * option header--only the header length is calculated
  * differently. */
 const struct ip6_ext *ext_hdr = (struct ip6_ext *)data;
 
-nexthdr = ext_hdr->ip6e_nxt;
+*nexthdr = ext_hdr->ip6e_nxt;
 len = (ext_hdr->ip6e_len + 2) * 4;
-} else if (nexthdr == IPPROTO_FRAGMENT) {
+} else if (*nexthdr == IPPROTO_FRAGMENT) {
 const struct ovs_16aligned_ip6_frag *frag_hdr
 = ALIGNED_CAST(struct ovs_16aligned_ip6_frag *, data);
 
-nexthdr = frag_hdr->ip6f_nxt;
+*nexthdr = frag_hdr->ip6f_nxt;
 len = sizeof *frag_hdr;
-} else if (nexthdr == IPPROTO_ROUTING) {
+} else if (*nexthdr == IPPROTO_ROUTING) {
 const struct ip6_rthdr *rh = (struct ip6_rthdr *)data;
 
 if (rh->ip6r_segleft > 0) {
 return true;
 }
 
-nexthdr = rh->ip6r_nxt;
+*nexthdr = rh->ip6r_nxt;
 len = (rh->ip6r_len + 1) * 8;
 } else {
 const struct ip6_ext *ext_hdr = (struct ip6_ext *)data;
 
-nexthdr = ext_hdr->ip6e_nxt;
+

Re: [ovs-dev] [PATCH v2] Add configurable OpenFlow port name.

2016-04-21 Thread Takashi YAMAMOTO

i don't have any problem with Xiao's approach.
just wanted to make sure alternatives considered.

wrt implementation, Xiao, can you rebase it?

On Fri, Apr 22, 2016 at 3:39 AM, Ben Pfaff  wrote:

> Yamamoto-san, I could really use your opinion here: do you think that
> this should be done differently?  If you do, then I will not accept it.
> But if you do not feel strongly about it, then I'll start properly
> reviewing it.
>
> Thanks,
>
> Ben.
>
> On Mon, Apr 18, 2016 at 07:05:52PM +0800, Xiao Liang wrote:
> > On Mon, Apr 18, 2016 at 5:46 PM, Takashi YAMAMOTO 
> wrote:
> > > for some reasons you want to change of_name without re-creating a port?
> > > why? (just curious)
> > >
> >
> > I don't have special use cases, just for convenience.
> >
> > >
> > > On Mon, Apr 18, 2016 at 4:19 PM, Xiao Liang 
> wrote:
> > >>
> > >> By introducing of_name, ovs_name serves as a key of ports which is
> > >> shared by ofproto and netdev. It's easier to find and convert ports
> > >> back and forth. of_name and kernel_name could be configured (if
> > >> supported) independently of each other.
> > >>
> > >> On Mon, Apr 18, 2016 at 11:43 AM, Takashi YAMAMOTO 
> > >> wrote:
> > >> > let me explain what netdev-bsd does first.
> > >> > on some platform "tap" interfaces are always named automatically by
> > >> > kernel
> > >> > itself
> > >> > and there's no way to rename them.  say, they always will have names
> > >> > like
> > >> > "tap0".
> > >> > so if you does "ovs-vsctl add-port br0 foo",
> > >> >   ovs_name = "foo"
> > >> >   kernel_name = "tap0"
> > >> >
> > >> > now, you are going to add another name for openflow. let's call it
> > >> > of_name.
> > >> > eg. "ovs-vsctl add-port br0 foo -- set int foo ofname=wan",
> > >> >   of_name = "wan"
> > >> >   ovs_name = "foo"
> > >> >   kernel_name = "tap0"
> > >> >
> > >> > while i don't have strong opinions either ways,
> > >> > i'm not sure why you want to use different names for of_name and
> > >> > ovs_name
> > >> > in the first place.  eg. what's wrong with "ovs-vsctl add-port br0
> wan".
> > >> > can you explain a little?
> > >> >
> > >> > On Mon, Apr 18, 2016 at 10:37 AM, Xiao Liang 
> > >> > wrote:
> > >> >>
> > >> >> Hi Ben, Yamamoto-san,
> > >> >>
> > >> >> Kindly remind you of this thread. Would like to hear your
> preference
> > >> >> on the way to implement this feature.
> > >> >>
> > >> >> On Mon, Apr 11, 2016 at 11:18 PM, Ben Pfaff  wrote:
> > >> >> > On Mon, Apr 11, 2016 at 04:30:04PM +0800, Xiao Liang wrote:
> > >> >> >> On Mon, Apr 11, 2016 at 3:42 PM, Takashi Yamamoto
> > >> >> >>  wrote:
> > >> >> >> > hi,
> > >> >> >> >
> > >> >> >> > have you considered the opposite way?
> > >> >> >> > ie. have an ability to specify the device name.
> > >> >> >> >
> > >> >> >> > netdev-bsd already has a distinction between "kernel name" and
> > >> >> >> > "ovs
> > >> >> >> > name".
> > >> >> >> >
> > >> >> >>
> > >> >> >> Hi,
> > >> >> >>
> > >> >> >> I'm not familiar with netdev-bsd code, but I think this approach
> > >> >> >> will
> > >> >> >> make ports more difficult to manage and need much more effort.
> > >> >> >
> > >> >> > Yamamoto-san: thanks for bringing this up.  I'm going to wait
> for you
> > >> >> > and Xiao to talk this through a bit before continuing review.
> > >> >
> > >> >
> > >
> > >
>

Re: [ovs-dev] [PATCH] datapath-windows: Fixed buffer overflow in OvsInitVportWithNicParam

2016-04-21 Thread Nithin Raju

Why do we need the Ethernet address (macAddress) to be 32 bytes?

We should just make sure that when we copy, we use sizeof ()
and in this case, it is sizeof (vport->macAddress).
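
A minimal sketch of that suggestion, assuming simplified stand-in structs (`struct vport`, `struct nic_param`) and a stand-in constant for NDIS_MAX_PHYS_ADDRESS_LENGTH; the helper name `vport_copy_perm_mac` is hypothetical:

```c
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN 6
#define MAX_PHYS_ADDRESS_LENGTH 32  /* Stand-in for NDIS_MAX_PHYS_ADDRESS_LENGTH. */

/* Hypothetical, simplified destination: a 6-byte Ethernet address. */
struct vport {
    uint8_t permMacAddress[ETH_ADDR_LEN];
};

/* Hypothetical, simplified source: a 32-byte physical address buffer. */
struct nic_param {
    uint8_t PermanentMacAddress[MAX_PHYS_ADDRESS_LENGTH];
};

/* Copy only as many bytes as the destination can hold. */
static void
vport_copy_perm_mac(struct vport *vport, const struct nic_param *nic)
{
    memcpy(vport->permMacAddress, nic->PermanentMacAddress,
           sizeof vport->permMacAddress);
}
```

Bounding the copy by the size of the destination keeps the 6-byte fields and avoids the overflow without enlarging the structs.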

-- Nithin

-Original Message-
From: dev  on behalf of Paul Boca

Date: Monday, April 18, 2016 at 12:52 AM
To: "dev@openvswitch.org" 
Subject: [ovs-dev] [PATCH] datapath-windows: Fixed buffer overflow in
OvsInitVportWithNicParam

>nicParam->PermanentMacAddress is 32 bytes and vport->permMacAddress is 6
>bytes
>
>Signed-off-by: Paul-Daniel Boca 
>---
> datapath-windows/ovsext/DpInternal.h | 6 +++---
> datapath-windows/ovsext/Vport.h  | 6 +++---
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
>diff --git a/datapath-windows/ovsext/DpInternal.h
>b/datapath-windows/ovsext/DpInternal.h
>index a3ce311..760552d 100644
>--- a/datapath-windows/ovsext/DpInternal.h
>+++ b/datapath-windows/ovsext/DpInternal.h
>@@ -41,9 +41,9 @@ typedef struct _OVS_VPORT_GET {
> typedef struct _OVS_VPORT_EXT_INFO {
> uint32_t dpNo;
> uint32_t portNo;
>-uint8_t macAddress[ETH_ADDR_LEN];
>-uint8_t permMACAddress[ETH_ADDR_LEN];
>-uint8_t vmMACAddress[ETH_ADDR_LEN];
>+uint8_t macAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
>+uint8_t permMACAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
>+uint8_t vmMACAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
> uint16_t nicIndex;
> uint32_t portId;
> uint32_t type;
>diff --git a/datapath-windows/ovsext/Vport.h
>b/datapath-windows/ovsext/Vport.h
>index 373896d..3f18eb1 100644
>--- a/datapath-windows/ovsext/Vport.h
>+++ b/datapath-windows/ovsext/Vport.h
>@@ -102,9 +102,9 @@ typedef struct _OVS_VPORT_ENTRY {
> NDIS_SWITCH_NIC_STATE  nicState;
> NDIS_SWITCH_PORT_TYPE  portType;
> 
>-UINT8  permMacAddress[ETH_ADDR_LEN];
>-UINT8  currMacAddress[ETH_ADDR_LEN];
>-UINT8  vmMacAddress[ETH_ADDR_LEN];
>+UINT8  permMacAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
>+UINT8  currMacAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
>+UINT8  vmMacAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
> 
> NDIS_SWITCH_PORT_NAME  hvPortName;
> IF_COUNTED_STRING  portFriendlyName;
>-- 
>2.7.2.windows.1


[ovs-dev] [PATCH] datapath: Fix datapath build on CentOS 6.5 (2.6.32-431) kernels

2016-04-21 Thread Wanlong Gao

The build was failing with the following error:


  CC [M]  /home/sabyasse/Linux/src/sandbox/ovs_v1/datapath/linux/vport.o
/home/sabyasse/Linux/src/sandbox/ovs_v1/datapath/linux/vport.c: In
function ‘ovs_vport_get_stats’:
/home/sabyasse/Linux/src/sandbox/ovs_v1/datapath/linux/vport.c:328:
error: implicit declaration of function ‘dev_get_stats64’


The issue is fixed by checking for existence of dev_get_stats64 in
netdevice.h and then using it (in C6.7+, 2.6.32-594 kernels). For
previous kernels use compat rpl_dev_get_stats.

Signed-off-by: Sabyasachi Sengupta 
Signed-off-by: Wanlong Gao 
---

This issue still exists and should be backported to LTS branch-2.5.

Thanks,
Wanlong Gao
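
The compat pattern used by this fix can be sketched in isolation as follows. The feature macro, structs, and function names here are hypothetical; in the real patch the macro comes from the configure-time grep of netdevice.h and the fallback is rpl_dev_get_stats.

```c
/* Hypothetical, simplified device and statistics types. */
struct device {
    unsigned long long rx_packets;
};

struct link_stats {
    unsigned long long rx_packets;
};

/* Hypothetical replacement function, analogous to rpl_dev_get_stats(). */
static struct link_stats *
rpl_get_stats(const struct device *dev, struct link_stats *storage)
{
    storage->rx_packets = dev->rx_packets;
    return storage;
}

/* HAVE_NATIVE_GET_STATS64 would be defined by the build system only when
 * the platform headers declare native_get_stats64(). */
#ifdef HAVE_NATIVE_GET_STATS64
#define get_stats native_get_stats64
#else
#define get_stats rpl_get_stats
#endif
```

Callers always use `get_stats()`; which implementation they reach is decided once, at configure time, rather than by kernel version checks scattered through the code.
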

 acinclude.m4| 1 +
 datapath/linux/compat/include/linux/netdevice.h | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/acinclude.m4 b/acinclude.m4
index acd7ce7..fd67598 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -399,6 +399,7 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   [OVS_DEFINE([HAVE_SOCK_CREATE_KERN_NET])])
   OVS_GREP_IFELSE([$KSRC/include/linux/netdevice.h], [dev_disable_lro])
   OVS_GREP_IFELSE([$KSRC/include/linux/netdevice.h], [dev_get_stats])
+  OVS_GREP_IFELSE([$KSRC/include/linux/netdevice.h], [dev_get_stats64])
   OVS_GREP_IFELSE([$KSRC/include/linux/netdevice.h], [dev_get_by_index_rcu])
   OVS_GREP_IFELSE([$KSRC/include/linux/netdevice.h], [dev_recursion_level])
   OVS_GREP_IFELSE([$KSRC/include/linux/netdevice.h], [__skb_gso_segment])
diff --git a/datapath/linux/compat/include/linux/netdevice.h 
b/datapath/linux/compat/include/linux/netdevice.h
index e9fa995..71fef90 100644
--- a/datapath/linux/compat/include/linux/netdevice.h
+++ b/datapath/linux/compat/include/linux/netdevice.h
@@ -229,7 +229,13 @@ struct rtnl_link_stats64 *rpl_dev_get_stats(struct 
net_device *dev,
 
 #if RHEL_RELEASE_CODE < RHEL_RELEASE_VERSION(7,0)
 /* Only required on RHEL 6. */
+#ifdef HAVE_DEV_GET_STATS64
 #define dev_get_stats dev_get_stats64
+#else
+#define dev_get_stats rpl_dev_get_stats
+struct rtnl_link_stats64 *rpl_dev_get_stats(struct net_device *dev,
+   struct rtnl_link_stats64 *storage);
+#endif
 #endif
 
 #ifndef netdev_dbg
-- 
2.5.0


Re: [ovs-dev] [PATCH] datapath-windows: Fixed buffer overflow in OvsInitVportWithNicParam

2016-04-21 Thread Sorin Vinturis

Good catch!

Acked-by: Sorin Vinturis 

-Original Message-
From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Paul Boca
Sent: Monday, 18 April, 2016 10:52
To: dev@openvswitch.org
Subject: [ovs-dev] [PATCH] datapath-windows: Fixed buffer overflow in 
OvsInitVportWithNicParam

nicParam->PermanentMacAddress is 32 bytes and vport->permMacAddress is 6
bytes

Signed-off-by: Paul-Daniel Boca 
---
 datapath-windows/ovsext/DpInternal.h | 6 +++---
 datapath-windows/ovsext/Vport.h  | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/datapath-windows/ovsext/DpInternal.h 
b/datapath-windows/ovsext/DpInternal.h
index a3ce311..760552d 100644
--- a/datapath-windows/ovsext/DpInternal.h
+++ b/datapath-windows/ovsext/DpInternal.h
@@ -41,9 +41,9 @@ typedef struct _OVS_VPORT_GET {  typedef struct 
_OVS_VPORT_EXT_INFO {
 uint32_t dpNo;
 uint32_t portNo;
-uint8_t macAddress[ETH_ADDR_LEN];
-uint8_t permMACAddress[ETH_ADDR_LEN];
-uint8_t vmMACAddress[ETH_ADDR_LEN];
+uint8_t macAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
+uint8_t permMACAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
+uint8_t vmMACAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
 uint16_t nicIndex;
 uint32_t portId;
 uint32_t type;
diff --git a/datapath-windows/ovsext/Vport.h b/datapath-windows/ovsext/Vport.h 
index 373896d..3f18eb1 100644
--- a/datapath-windows/ovsext/Vport.h
+++ b/datapath-windows/ovsext/Vport.h
@@ -102,9 +102,9 @@ typedef struct _OVS_VPORT_ENTRY {
 NDIS_SWITCH_NIC_STATE  nicState;
 NDIS_SWITCH_PORT_TYPE  portType;
 
-UINT8  permMacAddress[ETH_ADDR_LEN];
-UINT8  currMacAddress[ETH_ADDR_LEN];
-UINT8  vmMacAddress[ETH_ADDR_LEN];
+UINT8  permMacAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
+UINT8  currMacAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
+UINT8  vmMacAddress[NDIS_MAX_PHYS_ADDRESS_LENGTH];
 
 NDIS_SWITCH_PORT_NAME  hvPortName;
 IF_COUNTED_STRING  portFriendlyName;
--
2.7.2.windows.1

[ovs-dev] [PATCH v2 13/15] native tunnel: Add support for STT

2016-04-21 Thread Pravin B Shelar

This patch uses the userspace tunneling mechanism to implement the
STT tunneling protocol.
For details about STT, refer to the draft:
https://tools.ietf.org/html/draft-davie-stt-07

Signed-off-by: Pravin B Shelar 
---
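
The REASM_HI_THRESH and REASM_LO_THRESH comments in this patch describe a high/low watermark policy: once memory use exceeds the high threshold, least recently used fragments are freed until use drops to the low threshold. A minimal sketch of that policy follows, with scaled-down thresholds and hypothetical types (`struct frag`, `struct reasm`) that are not the ones used by the patch.

```c
#include <stddef.h>

#define HI_THRESH 4096u   /* Scaled-down stand-in for REASM_HI_THRESH. */
#define LO_THRESH 3072u   /* Scaled-down stand-in for REASM_LO_THRESH. */

/* Hypothetical fragment record, linked from least to most recently used. */
struct frag {
    unsigned int mem;
    struct frag *next;
};

/* Hypothetical reassembly state. */
struct reasm {
    struct frag *lru_head;   /* Least recently used fragment. */
    unsigned int mem_used;   /* Sum of 'mem' over all listed fragments. */
};

/* If memory use exceeds HI_THRESH, unlink fragments from the LRU end until
 * use is at or below LO_THRESH.  Returns the number of fragments evicted. */
static unsigned int
reasm_evict(struct reasm *r)
{
    unsigned int evicted = 0;

    if (r->mem_used <= HI_THRESH) {
        return 0;
    }
    while (r->lru_head && r->mem_used > LO_THRESH) {
        struct frag *victim = r->lru_head;

        r->lru_head = victim->next;
        r->mem_used -= victim->mem;
        victim->next = NULL;
        evicted++;
    }
    return evicted;
}
```

The gap between the two thresholds gives hysteresis, so the evictor does not run again on every packet once the high mark has been crossed.
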
 lib/automake.mk   |   2 +
 lib/netdev-native-stt.c   | 700 ++
 lib/netdev-native-stt.h   |  37 +++
 lib/netdev-vport.c|   9 +-
 lib/odp-util.c|  62 +++-
 lib/packets.h |  33 ++
 lib/timeval.h |   1 +
 tests/tunnel-push-pop-ipv6.at |  19 +-
 tests/tunnel-push-pop.at  |  27 ++
 9 files changed, 885 insertions(+), 5 deletions(-)
 create mode 100644 lib/netdev-native-stt.c
 create mode 100644 lib/netdev-native-stt.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 7972392..a7a0911 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -252,6 +252,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/tnl-neigh-cache.h \
lib/tnl-ports.c \
lib/tnl-ports.h \
+   lib/netdev-native-stt.c \
+   lib/netdev-native-stt.h \
lib/netdev-native-tnl.c \
lib/netdev-native-tnl.h \
lib/token-bucket.c \
diff --git a/lib/netdev-native-stt.c b/lib/netdev-native-stt.c
new file mode 100644
index 0000000..0a894ad
--- /dev/null
+++ b/lib/netdev-native-stt.c
@@ -0,0 +1,700 @@
+/*
+ * Copyright (c) 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "openvswitch/list.h"
+
+#include "byte-order.h"
+#include "csum.h"
+#include "daemon.h"
+#include "dirs.h"
+#include "dpif.h"
+#include "dp-packet.h"
+#include "entropy.h"
+#include "flow.h"
+#include "hash.h"
+#include "hmap.h"
+#include "id-pool.h"
+#include "netdev-provider.h"
+#include "netdev-vport.h"
+#include "netdev-vport-private.h"
+#include "odp-netlink.h"
+#include "dp-packet.h"
+#include "dp-packet-lso.h"
+#include "ovs-router.h"
+#include "packets.h"
+#include "poll-loop.h"
+#include "random.h"
+#include "route-table.h"
+#include "shash.h"
+#include "socket-util.h"
+#include "timeval.h"
+#include "netdev-native-stt.h"
+#include "netdev-native-tnl.h"
+#include "openvswitch/vlog.h"
+#include "unaligned.h"
+#include "unixctl.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(native_stt);
+static struct vlog_rate_limit err_rl = VLOG_RATE_LIMIT_INIT(60, 5);
+
+/* The maximum amount of memory used to store packets waiting to be reassembled
+ * on a given CPU.  Once this threshold is exceeded we will begin freeing the
+ * least recently used fragments.
+ */
+#define REASM_HI_THRESH (4 * 1024 * 1024)
+/* The target for the high memory evictor.  Once we have exceeded
+ * REASM_HI_THRESH, we will continue freeing fragments until we hit
+ * this limit.
+ */
+#define REASM_LO_THRESH (3 * 1024 * 1024)
+/* The length of time a given packet has to be reassembled from the time the
+ * first fragment arrives.  Once this limit is exceeded it becomes available
+ * for cleaning.
+ */
+#define HZ 10L
+
+#define FRAG_EXP_TIME  (30 * HZ)
+
+#define FRAG_HASH_SHIFT 8
+#define FRAG_HASH_ENTRIES   (1 << FRAG_HASH_SHIFT)
+#define FRAG_HASH_SEGS  ((sizeof(uint32_t) * 8) / FRAG_HASH_SHIFT)
+
+struct pkt_key {
+struct in6_addr ipv6_src;
+struct in6_addr ipv6_dst;
+ovs_be32 pkt_seq;
+};
+
+struct pkt_frag {
+struct dp_packet *pkts;
+unsigned long timestamp;
+struct ovs_list lru_node;
+struct pkt_key key;
+};
+
+struct first_frag {
+struct dp_packet *last_pkt;
+unsigned int mem_used;
+uint16_t tot_len;
+uint16_t rcvd_len;
+bool set_ecn_ce;
+};
+
+struct frag_packet_data {
+uint16_t offset;
+   uint16_t pkt_size;
+struct dp_packet *next;
+/* Only valid for the first packet in the chain. */
+struct first_frag first;
+};
+
+BUILD_ASSERT_DECL(DP_PACKET_CONTEXT_SIZE >= sizeof(struct frag_packet_data));
+
+#define FRAG_DATA(packet) ((struct frag_packet_data *)(packet)->data)
+#define STT_PACKET_DATA(pkt)   ((unsigned char *)dp_packet_l4(pkt) + \
+  sizeof(struct tcp_header))
+
+struct stt_reassemble {
+struct pkt_frag frag_hash[FRAG_HASH_ENTRIES];
+struct ovs_list frag_lru;
+unsigned int frag_mem_used;
+

[ovs-dev] [PATCH v2 12/15] tnl-ports: Handle STT ports.

2016-04-21 Thread Pravin B Shelar

STT runs over TCP, so tunnel traffic needs to be filtered on the basis
of TCP port numbers as well as UDP port numbers.

Signed-off-by: Pravin B Shelar 
---
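
The idea behind this change can be sketched as a lookup keyed on the pair (IP protocol, transport port) instead of the UDP port alone. The table layout and function names below are hypothetical simplifications, and the port numbers used in the usage note are sample values only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROTO_TCP 6
#define PROTO_UDP 17
#define PROTO_GRE 47

/* Map a tunnel type name to its IP protocol; 0 means "unknown type". */
static uint8_t
tnl_type_to_proto(const char *type)
{
    if (!strcmp(type, "geneve") || !strcmp(type, "vxlan")) {
        return PROTO_UDP;
    }
    if (!strcmp(type, "stt")) {
        return PROTO_TCP;
    }
    if (!strcmp(type, "gre")) {
        return PROTO_GRE;
    }
    return 0;
}

/* Hypothetical, simplified tunnel port record. */
struct tnl_port {
    uint8_t nw_proto;
    uint16_t tp_port;   /* Host byte order in this sketch. */
};

/* Find a port matching both protocol and transport port, or NULL. */
static const struct tnl_port *
tnl_port_find(const struct tnl_port *ports, size_t n,
              uint8_t nw_proto, uint16_t tp_port)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (ports[i].nw_proto == nw_proto && ports[i].tp_port == tp_port) {
            return &ports[i];
        }
    }
    return NULL;
}
```

For example, with a table holding (UDP, 4789) and (TCP, 7471), a lookup for (TCP, 4789) finds nothing, so a TCP packet is not mistaken for traffic belonging to the UDP tunnel on the same port number.
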
 lib/tnl-ports.c  | 82 +---
 lib/tnl-ports.h  |  4 +--
 ofproto/tunnel.c |  8 --
 3 files changed, 62 insertions(+), 32 deletions(-)

diff --git a/lib/tnl-ports.c b/lib/tnl-ports.c
index f399d04..e8d43f0 100644
--- a/lib/tnl-ports.c
+++ b/lib/tnl-ports.c
@@ -51,7 +51,8 @@ static struct ovs_list addr_list;
 
 struct tnl_port {
 odp_port_t port;
-ovs_be16 udp_port;
+ovs_be16 tp_port;
+uint8_t nw_proto;
 char dev_name[IFNAMSIZ];
 struct ovs_list node;
 };
@@ -82,7 +83,7 @@ tnl_port_free(struct tnl_port_in *p)
 
 static void
 tnl_port_init_flow(struct flow *flow, struct eth_addr mac,
-   struct in6_addr *addr, ovs_be16 udp_port)
+   struct in6_addr *addr, uint8_t nw_proto, ovs_be16 tp_port)
 {
 memset(flow, 0, sizeof *flow);
 
@@ -95,24 +96,20 @@ tnl_port_init_flow(struct flow *flow, struct eth_addr mac,
 flow->ipv6_dst = *addr;
 }
 
-if (udp_port) {
-flow->nw_proto = IPPROTO_UDP;
-} else {
-flow->nw_proto = IPPROTO_GRE;
-}
-flow->tp_dst = udp_port;
+flow->nw_proto = nw_proto;
+flow->tp_dst = tp_port;
 }
 
 static void
 map_insert(odp_port_t port, struct eth_addr mac, struct in6_addr *addr,
-   ovs_be16 udp_port, const char dev_name[])
+   uint8_t nw_proto, ovs_be16 tp_port, const char dev_name[])
 {
 const struct cls_rule *cr;
 struct tnl_port_in *p;
 struct match match;
 
 memset(&match, 0, sizeof match);
-tnl_port_init_flow(&match.flow, mac, addr, udp_port);
+tnl_port_init_flow(&match.flow, mac, addr, nw_proto, tp_port);
 
 do {
 cr = classifier_lookup(, CLS_MAX_VERSION, , NULL);
@@ -129,9 +126,9 @@ map_insert(odp_port_t port, struct eth_addr mac, struct 
in6_addr *addr,
  /* XXX: No fragments support. */
 match.wc.masks.nw_frag = FLOW_NW_FRAG_MASK;
 
-/* 'udp_port' is zero for non-UDP tunnels (e.g. GRE). In this case it
+/* 'tp_port' is zero for GRE tunnels. In this case it
  * doesn't make sense to match on UDP port numbers. */
-if (udp_port) {
+if (tp_port) {
 match.wc.masks.tp_dst = OVS_BE16_MAX;
 }
 if (IN6_IS_ADDR_V4MAPPED(addr)) {
@@ -152,40 +149,65 @@ map_insert(odp_port_t port, struct eth_addr mac, struct 
in6_addr *addr,
 
 static void
 map_insert_ipdev__(struct ip_device *ip_dev, char dev_name[],
-   odp_port_t port, ovs_be16 udp_port)
+   odp_port_t port, uint8_t nw_proto, ovs_be16 tp_port)
 {
 if (ip_dev->n_addr) {
 int i;
 
 for (i = 0; i < ip_dev->n_addr; i++) {
 map_insert(port, ip_dev->mac, _dev->addr[i],
-   udp_port, dev_name);
+   nw_proto, tp_port, dev_name);
 }
 }
 }
 
+static uint8_t
+tnl_type_to_nw_proto(const char type[])
+{
+if (!strcmp(type, "geneve")) {
+return IPPROTO_UDP;
+}
+if (!strcmp(type, "stt")) {
+return IPPROTO_TCP;
+}
+if (!strcmp(type, "gre")) {
+return IPPROTO_GRE;
+}
+if (!strcmp(type, "vxlan")) {
+return IPPROTO_UDP;
+}
+return 0;
+}
+
 void
-tnl_port_map_insert(odp_port_t port,
-ovs_be16 udp_port, const char dev_name[])
+tnl_port_map_insert(odp_port_t port, ovs_be16 tp_port,
+const char dev_name[], const char type[])
 {
 struct tnl_port *p;
 struct ip_device *ip_dev;
+uint8_t nw_proto;
+
+nw_proto = tnl_type_to_nw_proto(type);
+if (!nw_proto) {
+return;
+}
 
 ovs_mutex_lock();
 LIST_FOR_EACH(p, node, _list) {
-if (udp_port == p->udp_port) {
+if (tp_port == p->tp_port && p->nw_proto == nw_proto) {
  goto out;
 }
 }
 
 p = xzalloc(sizeof *p);
 p->port = port;
-p->udp_port = udp_port;
+p->tp_port = tp_port;
+p->nw_proto = nw_proto;
 ovs_strlcpy(p->dev_name, dev_name, sizeof p->dev_name);
 ovs_list_insert(_list, >node);
 
 LIST_FOR_EACH(ip_dev, node, _list) {
-map_insert_ipdev__(ip_dev, p->dev_name, p->port, p->udp_port);
+map_insert_ipdev__(ip_dev, p->dev_name, p->port, p->nw_proto, 
p->tp_port);
 }
 
 out:
@@ -205,39 +227,43 @@ tnl_port_unref(const struct cls_rule *cr)
 }
 
 static void
-map_delete(struct eth_addr mac, struct in6_addr *addr, ovs_be16 udp_port)
+map_delete(struct eth_addr mac, struct in6_addr *addr,
+   ovs_be16 tp_port, uint8_t nw_proto)
 {
 const struct cls_rule *cr;
 struct flow flow;
 
-tnl_port_init_flow(, mac, addr, udp_port);
+tnl_port_init_flow(, mac, addr, nw_proto, tp_port);
 
 cr = classifier_lookup(, CLS_MAX_VERSION, , NULL);
 tnl_port_unref(cr);
 }
 
 static void
-ipdev_map_delete(struct ip_device *ip_dev, ovs_be16 udp_port)
+ipdev_map_delete(struct

[ovs-dev] [PATCH v2 11/15] tunnel: Add IP ECN related functions.

2016-04-21 Thread Pravin B Shelar

Add set and get functions for the IP Explicit Congestion Notification
(ECN) flag. These functions will be used by the STT reassembly code.

Signed-off-by: Pravin B Shelar 
---
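
For the IPv4 case, setting the CE codepoint changes the TOS byte, so the header checksum has to be adjusted. The sketch below works on a raw 20-byte IPv4 header and applies the incremental update from RFC 1624 (HC' = ~(~HC + ~m + m')). It is a standalone illustration with hypothetical helper names, not the OVS implementation, which uses its own header structs and recalc_csum16().

```c
#include <stddef.h>
#include <stdint.h>

#define IP_ECN_MASK 0x03
#define IP_ECN_CE   0x03

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t
csum_fold(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}

/* Checksum over 'len' bytes ('len' must be even), reading big-endian
 * 16-bit words.  Returns 0 for a header whose checksum field is valid. */
static uint16_t
ip_csum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t) hdr[i] << 8) | hdr[i + 1];
    }
    return (uint16_t) ~csum_fold(sum);
}

/* Set ECN CE in the TOS byte (offset 1) and incrementally update the
 * header checksum (offsets 10 and 11). */
static void
ipv4_set_ecn_ce(uint8_t *hdr)
{
    uint8_t old_tos = hdr[1];
    uint8_t new_tos = old_tos | IP_ECN_CE;
    uint16_t old_csum, new_csum;
    uint32_t sum;

    if (new_tos == old_tos) {
        return;
    }
    old_csum = (uint16_t) ((hdr[10] << 8) | hdr[11]);
    sum = (uint16_t) ~old_csum;          /* ~HC */
    sum += (uint16_t) ~(uint16_t) old_tos; /* ~m  */
    sum += new_tos;                      /* m'  */
    new_csum = (uint16_t) ~csum_fold(sum);

    hdr[1] = new_tos;
    hdr[10] = (uint8_t) (new_csum >> 8);
    hdr[11] = (uint8_t) (new_csum & 0xff);
}
```

The incremental form touches only the changed 16-bit word and the checksum field, which avoids re-summing the whole header for a single-byte change.
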
 lib/packets.c| 21 +
 lib/packets.h|  7 +++
 ofproto/tunnel.c |  6 +++---
 3 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/lib/packets.c b/lib/packets.c
index d0c0e68..13ba531 100644
--- a/lib/packets.c
+++ b/lib/packets.c
@@ -1388,3 +1388,24 @@ packet_csum_pseudoheader6(const struct 
ovs_16aligned_ip6_hdr *ip6)
 return partial;
 }
 #endif
+
+void
+IP_ECN_set_ce(struct dp_packet *pkt, bool is_ipv6)
+{
+if (is_ipv6) {
+ovs_16aligned_be32 *ip6 = dp_packet_l3(pkt);
+
+put_16aligned_be32(ip6, get_16aligned_be32(ip6) |
+htonl(IP_ECN_CE << 20));
+} else {
+struct ip_header *nh = dp_packet_l3(pkt);
+uint8_t tos = nh->ip_tos;
+
+tos |= IP_ECN_CE;
+if (nh->ip_tos != tos) {
+nh->ip_csum = recalc_csum16(nh->ip_csum, htons(nh->ip_tos),
+htons((uint16_t) tos));
+nh->ip_tos = tos;
+}
+}
+}
diff --git a/lib/packets.h b/lib/packets.h
index f1e29f8..8d627a5 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -575,6 +575,12 @@ char *ip_parse_cidr_len(const char *s, int *n, ovs_be32 
*ip,
 #define IP_ECN_MASK 0x03
 #define IP_DSCP_MASK 0xfc
 
+static inline int
+IP_ECN_is_ce(uint8_t dsfield)
+{
+return (dsfield & IP_ECN_MASK) == IP_ECN_CE;
+}
+
 #define IP_VERSION 4
 
 #define IP_DONT_FRAGMENT  0x4000 /* Don't fragment. */
@@ -1057,5 +1063,6 @@ void compose_arp(struct dp_packet *, uint16_t arp_op,
 void compose_nd(struct dp_packet *, const struct eth_addr eth_src,
 struct in6_addr *, struct in6_addr *);
 uint32_t packet_csum_pseudoheader(const struct ip_header *);
+void IP_ECN_set_ce(struct dp_packet *pkt, bool is_ipv6);
 
 #endif /* packets.h */
diff --git a/ofproto/tunnel.c b/ofproto/tunnel.c
index 18297b2..e65a2e4 100644
--- a/ofproto/tunnel.c
+++ b/ofproto/tunnel.c
@@ -342,7 +342,7 @@ tnl_process_ecn(struct flow *flow)
 return true;
 }
 
-if (is_ip_any(flow) && (flow->tunnel.ip_tos & IP_ECN_MASK) == IP_ECN_CE) {
+if (is_ip_any(flow) && IP_ECN_is_ce(flow->tunnel.ip_tos)) {
 if ((flow->nw_tos & IP_ECN_MASK) == IP_ECN_NOT_ECT) {
 VLOG_WARN_RL(, "dropping tunnel packet marked ECN CE"
  " but is not ECN capable");
@@ -382,7 +382,7 @@ tnl_wc_init(struct flow *flow, struct flow_wildcards *wc)
 memset(>masks.pkt_mark, 0xff, sizeof wc->masks.pkt_mark);
 
 if (is_ip_any(flow)
-&& (flow->tunnel.ip_tos & IP_ECN_MASK) == IP_ECN_CE) {
+&& IP_ECN_is_ce(flow->tunnel.ip_tos)) {
 wc->masks.nw_tos |= IP_ECN_MASK;
 }
 }
@@ -455,7 +455,7 @@ tnl_port_send(const struct ofport_dpif *ofport, struct flow 
*flow,
 if (is_ip_any(flow)) {
 wc->masks.nw_tos |= IP_ECN_MASK;
 
-if ((flow->nw_tos & IP_ECN_MASK) == IP_ECN_CE) {
+if (IP_ECN_is_ce(flow->nw_tos)) {
 flow->tunnel.ip_tos |= IP_ECN_ECT_0;
 } else {
 flow->tunnel.ip_tos |= flow->nw_tos & IP_ECN_MASK;
-- 
2.5.5


[ovs-dev] [PATCH v2 15/15] netdev: Add support for GENEVE/VXLAN tunnel segmentation

2016-04-21 Thread Pravin B Shelar

This patch adds support for segmenting large UDP-based tunnel
packets. With this patch, large packets generated by STT can
be forwarded to a GENEVE/VXLAN port.

Signed-off-by: Pravin B Shelar 
---
 lib/dp-packet-lso.c | 32 +++-
 lib/dp-packet-lso.h |  1 +
 lib/netdev-native-tnl.c | 19 +++
 lib/netdev-vport.c  | 11 +--
 4 files changed, 52 insertions(+), 11 deletions(-)

diff --git a/lib/dp-packet-lso.c b/lib/dp-packet-lso.c
index bdcc987..c40ffd2 100644
--- a/lib/dp-packet-lso.c
+++ b/lib/dp-packet-lso.c
@@ -235,11 +235,41 @@ segment_gre_packet(struct dp_packet *orig)
 }
 
 static struct dp_packet *
+segment_udp_tnl_packet(struct dp_packet *orig)
+{
+struct dp_packet *seg_list, *seg;
+uint8_t l2_pad_size = orig->l2_pad_size;
+uint16_t l2_5_ofs = orig->l2_5_ofs;
+uint16_t l3_ofs = orig->l3_ofs;
+uint16_t l4_ofs = orig->l4_ofs;
+
+seg_list = segment_eth_packet(orig, orig->lso.outer_hlen);
+restore_outer_headers(orig, orig->lso.outer_hlen, l2_pad_size, l2_5_ofs,
+  l3_ofs, l4_ofs);
+
+FOR_EACH_LSO_SEG(seg_list, seg) {
+struct udp_header *udp;
+restore_outer_headers(seg, orig->lso.outer_hlen, l2_pad_size, l2_5_ofs,
+  l3_ofs, l4_ofs);
+
+udp = dp_packet_l4(seg);
+udp->udp_len = htons(dp_packet_size(seg) - seg->l4_ofs);
+if (udp->udp_csum) {
+fixup_segment_cheksum(seg, orig, UDP_CSUM_OFFSET);
+}
+}
+return seg_list;
+}
+
+static struct dp_packet *
 segment_l4_packet(struct dp_packet *orig)
 {
 if (orig->lso.type & DPBUF_LSO_GRE) {
 orig->lso.type &= ~DPBUF_LSO_GRE;
 return segment_gre_packet(orig);
+} else if (orig->lso.type & DPBUF_LSO_UDP_TNL) {
+orig->lso.type &= ~DPBUF_LSO_UDP_TNL;
+return segment_udp_tnl_packet(orig);
 } else if (orig->lso.type & (DPBUF_LSO_TCPv4 | DPBUF_LSO_TCPv6)) {
 return segment_tcp_packet(orig);
 } else if (orig->lso.type & (DPBUF_LSO_UDPv4 | DPBUF_LSO_UDPv6)) {
@@ -258,7 +288,7 @@ segment_ipv4_packet(struct dp_packet *orig)
 int ip_offset = 0;
 bool inc_ip_id = false;
 
-if (orig->lso.type & (DPBUF_LSO_TCPv4 | DPBUF_LSO_GRE)) {
+if (orig->lso.type & (DPBUF_LSO_TCPv4 | DPBUF_LSO_GRE | 
DPBUF_LSO_UDP_TNL)) {
 inc_ip_id = true;
 ip_id = ntohs(orig_iph->ip_id);
 }
diff --git a/lib/dp-packet-lso.h b/lib/dp-packet-lso.h
index fdf93a6..8d1cf03 100644
--- a/lib/dp-packet-lso.h
+++ b/lib/dp-packet-lso.h
@@ -32,6 +32,7 @@
 #define DPBUF_LSO_UDPv4 (1 << 2)
 #define DPBUF_LSO_UDPv6 (1 << 3)
 #define DPBUF_LSO_GRE   (1 << 4)
+#define DPBUF_LSO_UDP_TNL   (1 << 5)
 
 struct dp_packet_lso_ctx {
 struct dp_packet *next;   /* Used to list lso segments. */
diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index 60cb81f..34d0f7a 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -227,20 +227,31 @@ push_udp_header(struct dp_packet *packet,
 udp->udp_src = get_src_port(packet);
 udp->udp_len = htons(ip_tot_size);
 
+if (packet->lso.type) {
+packet->lso.type |= DPBUF_LSO_UDP_TNL;
+packet->lso.outer_hlen = data->header_len;
+}
 if (udp->udp_csum) {
 uint32_t csum;
+ovs_be32 udp_csum;
+
 if (is_header_ipv6(dp_packet_data(packet))) {
 csum = packet_csum_pseudoheader6(ipv6_hdr(dp_packet_data(packet)));
 } else {
 csum = packet_csum_pseudoheader(ip_hdr(dp_packet_data(packet)));
 }
 
-csum = csum_continue(csum, udp, ip_tot_size);
-udp->udp_csum = csum_finish(csum);
+if (packet->lso.type) {
+udp_csum = ~csum_finish(csum);
+} else {
+csum = csum_continue(csum, udp, ip_tot_size);
+udp_csum = csum_finish(csum);
 
-if (!udp->udp_csum) {
-udp->udp_csum = htons(0xffff);
+if (!udp_csum) {
+udp_csum = htons(0xffff);
+}
 }
+udp->udp_csum = udp_csum;
 }
 }
 
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index 5530818..e158a51 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -207,10 +207,7 @@ netdev_vport_construct(struct netdev *netdev_)
 eth_addr_random(>etheraddr);
 
 /* Add a default destination port for tunnel ports if none specified. */
-if (!strcmp(type, "gre")) {
-netdev_->supported_lso_types = DPBUF_LSO_TCPv4 | DPBUF_LSO_TCPv6 |
-   DPBUF_LSO_UDPv4 | DPBUF_LSO_UDPv6;
-} else if (!strcmp(type, "geneve")) {
+if (!strcmp(type, "geneve")) {
 dev->tnl_cfg.dst_port = htons(GENEVE_DST_PORT);
 } else if (!strcmp(type, "vxlan")) {
 dev->tnl_cfg.dst_port = htons(VXLAN_DST_PORT);
@@ -218,11 +215,13 @@ netdev_vport_construct(struct netdev *netdev_)
 dev->tnl_cfg.dst_port = htons(LISP_DST_PORT);
 } else if

[ovs-dev] [PATCH v2 14/15] netdev: Add support for GRE segmentation

2016-04-21 Thread Pravin B Shelar

This patch adds support for segmenting large GRE packets. This
means that there are two sets of headers for a given packet. To
get the offset of the inner packet, an outer_hlen member is added
to dp-packet.
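
For illustration only (not part of the patch): a minimal sketch of the idea
described above, where each segment carries a copy of both header sets and the
inner packet is found by skipping outer_hlen bytes. The names flat_packet,
segment_count() and build_segment() are hypothetical and do not exist in OVS.
The sketch also leaves out the per-segment fixups (IP length, IP ID, checksums)
that a real implementation has to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flat view of a tunnelled packet; not the OVS dp_packet. */
struct flat_packet {
    const unsigned char *buf;  /* Start of the outer headers. */
    size_t outer_hlen;         /* Bytes of outer (tunnel) headers. */
    size_t inner_hlen;         /* Bytes of inner headers before the payload. */
    size_t payload_len;        /* Bytes of inner payload. */
};

/* Number of segments needed for 'payload_len' bytes with segment size 'mss'. */
static size_t
segment_count(size_t payload_len, size_t mss)
{
    assert(mss > 0);
    return (payload_len + mss - 1) / mss;
}

/* Writes segment 'idx' into 'out' (capacity 'out_cap'): a copy of both header
 * sets followed by that segment's slice of the payload.  Returns the number
 * of bytes written, or 0 if 'idx' is out of range or 'out' is too small. */
static size_t
build_segment(const struct flat_packet *p, size_t mss, size_t idx,
              unsigned char *out, size_t out_cap)
{
    size_t hdr_len = p->outer_hlen + p->inner_hlen;
    size_t offset = idx * mss;
    size_t chunk;

    if (mss == 0 || offset >= p->payload_len) {
        return 0;
    }
    chunk = p->payload_len - offset < mss ? p->payload_len - offset : mss;
    if (hdr_len + chunk > out_cap) {
        return 0;
    }
    memcpy(out, p->buf, hdr_len);                            /* Both headers. */
    memcpy(out + hdr_len, p->buf + hdr_len + offset, chunk); /* Payload slice. */
    return hdr_len + chunk;
}
```

With 4 bytes of outer headers, 2 bytes of inner headers and 10 bytes of
payload, an mss of 4 gives three segments, the last one carrying 2 payload
bytes.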

Signed-off-by: Pravin B Shelar 
---
 lib/dp-packet-lso.c | 74 +++--
 lib/dp-packet-lso.h |  2 ++
 lib/dp-packet.h |  1 +
 lib/netdev-native-tnl.c | 12 ++--
 lib/netdev-vport.c  |  5 +++-
 5 files changed, 82 insertions(+), 12 deletions(-)

diff --git a/lib/dp-packet-lso.c b/lib/dp-packet-lso.c
index 14a5ed8..bdcc987 100644
--- a/lib/dp-packet-lso.c
+++ b/lib/dp-packet-lso.c
@@ -67,6 +67,9 @@ static struct vlog_rate_limit err_rl = VLOG_RATE_LIMIT_INIT(60, 5);
 #define TCP_CSUM_OFFSET offsetof(struct tcp_header, tcp_csum)
 
 static struct dp_packet *
+segment_eth_packet(struct dp_packet *orig, int offset);
+
+static struct dp_packet *
 segment_packet__(struct dp_packet *orig, int header_len)
 {
 struct dp_packet *seg_list = NULL, *prev = NULL;
@@ -74,12 +77,21 @@ segment_packet__(struct dp_packet *orig, int header_len)
 int offset = header_len;
 int size = dp_packet_size(orig);
 struct dp_packet *seg;
+unsigned char *src;
 
+src = (unsigned char *) dp_packet_data(orig) - orig->lso.outer_hlen;
 if (!mss) {
-seg_list = dp_packet_clone(orig);
-memset(&seg_list->lso, 0, sizeof seg_list->lso);
-PACKET_LSO_CTX(seg_list)->next = NULL;
-return seg_list;
+seg = dp_packet_clone_with_headroom(orig, orig->lso.outer_hlen);
+
+if (orig->lso.outer_hlen) {
+unsigned char *dst;
+
+dst = (unsigned char *) dp_packet_data(seg) - orig->lso.outer_hlen;
+memcpy(dst, src, orig->lso.outer_hlen);
+}
+memset(&seg->lso, 0, sizeof seg->lso);
+PACKET_LSO_CTX(seg)->next = NULL;
+return seg;
 }
 while (offset < size) {
 int current_seg_size;
@@ -87,8 +99,10 @@ segment_packet__(struct dp_packet *orig, int header_len)
 
 current_seg_size = size < (offset + mss) ? (size - offset) : mss;
 seg = dp_packet_new(0);
-dp_packet_put(seg, dp_packet_data(orig), header_len);
-
+dp_packet_put(seg, src, header_len + orig->lso.outer_hlen);
+if (orig->lso.outer_hlen) {
+dp_packet_reset_packet(seg, orig->lso.outer_hlen);
+}
 data = (unsigned char *)dp_packet_data(orig) + offset;
 dp_packet_put(seg, data, current_seg_size);
 offset += mss;
@@ -179,10 +193,54 @@ segment_tcp_packet(struct dp_packet *orig)
 return seg_list;
 }
 
+static void
+restore_outer_headers(struct dp_packet *p, int hlen, uint8_t l2_pad_size,
+  uint16_t l2_5_ofs, uint16_t l3_ofs, uint16_t l4_ofs)
+{
+dp_packet_reset_packet(p, -hlen);
+p->l2_pad_size = l2_pad_size;
+p->l2_5_ofs = l2_5_ofs;
+p->l3_ofs = l3_ofs;
+p->l4_ofs = l4_ofs;
+}
+
+static struct dp_packet *
+segment_gre_packet(struct dp_packet *orig)
+{
+struct dp_packet *seg_list, *seg;
+const struct gre_base_hdr *greh;
+uint8_t l2_pad_size = orig->l2_pad_size;
+uint16_t l2_5_ofs = orig->l2_5_ofs;
+uint16_t l3_ofs = orig->l3_ofs;
+uint16_t l4_ofs = orig->l4_ofs;
+
+seg_list = segment_eth_packet(orig, orig->lso.outer_hlen);
+restore_outer_headers(orig, orig->lso.outer_hlen, l2_pad_size, l2_5_ofs,
+  l3_ofs, l4_ofs);
+
+FOR_EACH_LSO_SEG(seg_list, seg) {
+restore_outer_headers(seg, orig->lso.outer_hlen, l2_pad_size, l2_5_ofs,
+  l3_ofs, l4_ofs);
+
+greh = dp_packet_l4(seg);
+
+if (greh->flags & htons(GRE_CSUM)) {
+ovs_be16 *csum_opt = (ovs_be16 *) (greh + 1);
+int gre_size = dp_packet_size(seg) - seg->l4_ofs;
+
+*csum_opt = csum(greh, gre_size);
+}
+}
+return seg_list;
+}
+
 static struct dp_packet *
 segment_l4_packet(struct dp_packet *orig)
 {
-if (orig->lso.type & (DPBUF_LSO_TCPv4 | DPBUF_LSO_TCPv6)) {
+if (orig->lso.type & DPBUF_LSO_GRE) {
+orig->lso.type &= ~DPBUF_LSO_GRE;
+return segment_gre_packet(orig);
+} else if (orig->lso.type & (DPBUF_LSO_TCPv4 | DPBUF_LSO_TCPv6)) {
 return segment_tcp_packet(orig);
 } else if (orig->lso.type & (DPBUF_LSO_UDPv4 | DPBUF_LSO_UDPv6)) {
 return segment_udp_packet(orig);
@@ -200,7 +258,7 @@ segment_ipv4_packet(struct dp_packet *orig)
 int ip_offset = 0;
 bool inc_ip_id = false;
 
-if (orig->lso.type & DPBUF_LSO_TCPv4) {
+if (orig->lso.type & (DPBUF_LSO_TCPv4 | DPBUF_LSO_GRE)) {
 inc_ip_id = true;
 ip_id = ntohs(orig_iph->ip_id);
 }
diff --git a/lib/dp-packet-lso.h b/lib/dp-packet-lso.h
index 09815e8..fdf93a6 100644
--- a/lib/dp-packet-lso.h
+++ b/lib/dp-packet-lso.h
@@ -31,9 +31,11 @@
 #define DPBUF_LSO_TCPv6 (1 << 1)
 #define DPBUF_LSO_UDPv4 (1 << 2)
 #define DPBUF_LSO_UDPv6 (1 << 3)
+#define

[ovs-dev] [PATCH v2 08/15] dpif-netdev: Refactor userspace action

2016-04-21 Thread Pravin B Shelar

Large segment support needs this refactored function to
send individual segments.
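
For illustration only (not part of the patch): a minimal sketch of why a
per-packet helper is convenient once a large packet has been split into a
list of segments. The caller walks the list and hands each segment to the
helper, reading the next pointer first because the helper may free the
segment. The names seg, seg_prepend(), for_each_segment() and
consume_segment() are hypothetical and are not OVS APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical segment node; a real segment would hold packet data. */
struct seg {
    struct seg *next;
    int len;
};

typedef void (*seg_handler)(struct seg *seg, void *aux);

/* Pushes a new segment of 'len' bytes on the front of 'list'. */
static struct seg *
seg_prepend(struct seg *list, int len)
{
    struct seg *s = malloc(sizeof *s);

    if (!s) {
        return list;
    }
    s->len = len;
    s->next = list;
    return s;
}

/* Calls 'handle' once per segment and returns the number of segments.
 * Ownership of each segment passes to 'handle'. */
static int
for_each_segment(struct seg *list, seg_handler handle, void *aux)
{
    int n = 0;

    assert(handle != NULL);
    while (list) {
        struct seg *next = list->next;  /* Read before 'handle' may free it. */

        list->next = NULL;
        handle(list, aux);
        list = next;
        n++;
    }
    return n;
}

/* Example handler: accounts for the segment's bytes, then frees it. */
static void
consume_segment(struct seg *seg, void *aux)
{
    int *total = aux;

    *total += seg->len;
    free(seg);
}
```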

Signed-off-by: Pravin B Shelar 
---
 lib/dpif-netdev.c | 41 ++---
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index f34aeae..00f130c 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3724,6 +3724,30 @@ push_tnl_action(const struct dp_netdev *dp,
 }
 
 static void
+dp_execute_userspace_action(struct dp_netdev_pmd_thread *pmd,
+struct dp_packet *packet, bool may_steal,
+struct flow *flow, ovs_u128 *ufid,
+struct ofpbuf *actions,
+const struct nlattr *userdata)
+{
+struct dp_packet_batch b;
+int error;
+
+ofpbuf_clear(actions);
+
+error = dp_netdev_upcall(pmd, packet, flow, NULL, ufid,
+ DPIF_UC_ACTION, userdata, actions,
+ NULL);
+if (!error || error == ENOSPC) {
+packet_batch_init_packet(&b, packet);
+dp_netdev_execute_actions(pmd, &b, may_steal,
+  actions->data, actions->size);
+} else if (may_steal) {
+dp_packet_delete(packet);
+}
+}
+
+static void
 dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
   const struct nlattr *a, bool may_steal)
 OVS_NO_THREAD_SAFETY_ANALYSIS
@@ -3819,23 +3843,10 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
 ofpbuf_init(&actions, 0);
 
 for (i = 0; i < packets_->count; i++) {
-int error;
-struct dp_packet_batch b;
-
-ofpbuf_clear(&actions);
-
 flow_extract(packets[i], &flow);
 dpif_flow_hash(dp->dpif, &flow, sizeof flow, &ufid);
-error = dp_netdev_upcall(pmd, packets[i], &flow, NULL, &ufid,
- DPIF_UC_ACTION, userdata,&actions,
- NULL);
-if (!error || error == ENOSPC) {
-packet_batch_init_packet(&b, packets[i]);
-dp_netdev_execute_actions(pmd, &b, may_steal,
-  actions.data, actions.size);
-} else if (may_steal) {
-dp_packet_delete(packets[i]);
-}
+dp_execute_userspace_action(pmd, packets[i], may_steal, &flow,
+&ufid, &actions, userdata);
 }
 ofpbuf_uninit(&actions);
 fat_rwlock_unlock(&dp->upcall_rwlock);
-- 
2.5.5

[ovs-dev] [PATCH v2 01/15] netdev-vport: Factor-out tunnel Push-pop code into separate module.

2016-04-21 Thread Pravin B Shelar

It is better to move the functions specific to tunnel push-pop
actions into a separate module.

Signed-off-by: Pravin B Shelar 
---
 lib/automake.mk|   3 +
 lib/netdev-native-tnl.c| 641 +++
 lib/netdev-native-tnl.h| 108 
 lib/netdev-provider.h  |   2 +-
 lib/netdev-vport-private.h |  63 +
 lib/netdev-vport.c | 664 +
 6 files changed, 820 insertions(+), 661 deletions(-)
 create mode 100644 lib/netdev-native-tnl.c
 create mode 100644 lib/netdev-native-tnl.h
 create mode 100644 lib/netdev-vport-private.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 1ec2115..a3c3464 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -126,6 +126,7 @@ lib_libopenvswitch_la_SOURCES = \
lib/netdev-provider.h \
lib/netdev-vport.c \
lib/netdev-vport.h \
+   lib/netdev-vport-private.h \
lib/netdev.c \
lib/netdev.h \
lib/netflow.h \
@@ -249,6 +250,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/tnl-neigh-cache.h \
lib/tnl-ports.c \
lib/tnl-ports.h \
+   lib/netdev-native-tnl.c \
+   lib/netdev-native-tnl.h \
lib/token-bucket.c \
lib/tun-metadata.c \
lib/tun-metadata.h \
diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
new file mode 100644
index 0000000..b52b068
--- /dev/null
+++ b/lib/netdev-native-tnl.c
@@ -0,0 +1,641 @@
+/*
+ * Copyright (c) 2016 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "openvswitch/list.h"
+#include "byte-order.h"
+#include "csum.h"
+#include "daemon.h"
+#include "dirs.h"
+#include "dpif.h"
+#include "dp-packet.h"
+#include "entropy.h"
+#include "flow.h"
+#include "hash.h"
+#include "hmap.h"
+#include "id-pool.h"
+#include "netdev-provider.h"
+#include "netdev-vport.h"
+#include "netdev-vport-private.h"
+#include "odp-netlink.h"
+#include "dp-packet.h"
+#include "ovs-router.h"
+#include "packets.h"
+#include "poll-loop.h"
+#include "random.h"
+#include "route-table.h"
+#include "shash.h"
+#include "socket-util.h"
+#include "timeval.h"
+#include "netdev-native-tnl.h"
+#include "openvswitch/vlog.h"
+#include "unaligned.h"
+#include "unixctl.h"
+#include "util.h"
+
+VLOG_DEFINE_THIS_MODULE(native_tnl);
+static struct vlog_rate_limit err_rl = VLOG_RATE_LIMIT_INIT(60, 5);
+
+#define VXLAN_HLEN   (sizeof(struct udp_header) + \
+  sizeof(struct vxlanhdr))
+
+#define GENEVE_BASE_HLEN   (sizeof(struct udp_header) + \
+sizeof(struct genevehdr))
+
+uint16_t tnl_udp_port_min = 32768;
+uint16_t tnl_udp_port_max = 61000;
+
+void *
+ip_extract_tnl_md(struct dp_packet *packet, struct flow_tnl *tnl,
+  unsigned int *hlen)
+{
+void *nh;
+struct ip_header *ip;
+struct ovs_16aligned_ip6_hdr *ip6;
+void *l4;
+int l3_size;
+
+nh = dp_packet_l3(packet);
+ip = nh;
+ip6 = nh;
+l4 = dp_packet_l4(packet);
+
+if (!nh || !l4) {
+return NULL;
+}
+
+*hlen = sizeof(struct eth_header);
+
+l3_size = dp_packet_size(packet) -
+  ((char *)nh - (char *)dp_packet_data(packet));
+
+if (IP_VER(ip->ip_ihl_ver) == 4) {
+
+ovs_be32 ip_src, ip_dst;
+
+if (csum(ip, IP_IHL(ip->ip_ihl_ver) * 4)) {
+VLOG_WARN_RL(&err_rl, "ip packet has invalid checksum");
+return NULL;
+}
+
+if (ntohs(ip->ip_tot_len) > l3_size) {
+VLOG_WARN_RL(&err_rl, "ip packet is truncated (IP length %d, actual %d)",
+ ntohs(ip->ip_tot_len), l3_size);
+return NULL;
+}
+if (IP_IHL(ip->ip_ihl_ver) * 4 > sizeof(struct ip_header)) {
+VLOG_WARN_RL(&err_rl, "ip options not supported on tunnel packets "
+ "(%d bytes)", IP_IHL(ip->ip_ihl_ver) * 4);
+return NULL;
+}
+
+ip_src = get_16aligned_be32(&ip->ip_src);
+ip_dst = get_16aligned_be32(&ip->ip_dst);
+
+tnl->ip_src = ip_src;
+tnl->ip_dst = ip_dst;
+tnl->ip_tos = ip->ip_tos;
+tnl->ip_ttl = ip->ip_ttl;
+
+*hlen += IP_HEADER_LEN;
+
+} else if (IP_VER(ip->ip_ihl_ver) == 6) {
+
+memcpy(tnl->ipv6_src.s6_addr, ip6->ip6_src.be16, sizeof ip6->ip6_src);
+

[ovs-dev] [PATCH v2 06/15] dpif-netdev: create batch object

2016-04-21 Thread Pravin B Shelar

The DPDK datapath operates on batches of packets. To pass a batch of
packets around we use a packets array and a count. The next patch needs
to associate metadata with each batch of packets, so introduce
a batch structure to make handling the metadata easier.
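
For illustration only (not part of the patch): a minimal sketch of why a
batch structure helps. Because callees take a single pointer, a metadata field
can be added later without touching any function signature. All names here
(sketch_pkt, sketch_batch, the 'trunc' field and the helper functions) are
hypothetical; only the count-plus-array layout follows the patch below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_MAX_BURST = 32 };     /* Hypothetical; mirrors NETDEV_MAX_BURST. */

struct sketch_pkt {                 /* Hypothetical stand-in for a packet. */
    size_t size;
};

struct sketch_batch {
    int count;
    bool trunc;                     /* Hypothetical per-batch metadata. */
    struct sketch_pkt *pkts[SKETCH_MAX_BURST];
};

static void
sketch_batch_init(struct sketch_batch *b)
{
    b->count = 0;
    b->trunc = false;
}

/* Appends 'p'.  Returns false and leaves the batch unchanged when full. */
static bool
sketch_batch_add(struct sketch_batch *b, struct sketch_pkt *p)
{
    assert(b && p);
    if (b->count >= SKETCH_MAX_BURST) {
        return false;
    }
    b->pkts[b->count++] = p;
    return true;
}

/* Sums the sizes of all packets in the batch.  The callee needs only the
 * batch pointer, not a separate array and count. */
static size_t
sketch_batch_bytes(const struct sketch_batch *b)
{
    size_t total = 0;
    int i;

    for (i = 0; i < b->count; i++) {
        total += b->pkts[i]->size;
    }
    return total;
}
```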

Signed-off-by: Pravin B Shelar 
---
 lib/dp-packet.h  |  31 +
 lib/dpif-netdev.c| 147 +--
 lib/dpif.c   |  12 ++--
 lib/netdev.c |  34 +-
 lib/netdev.h |  12 ++--
 lib/odp-execute.c|  10 ++-
 lib/odp-execute.h|   5 +-
 ofproto/ofproto-dpif-xlate.c |   5 +-
 8 files changed, 146 insertions(+), 110 deletions(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 4a8b5ab..ce223e8 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -563,6 +563,37 @@ dp_packet_rss_invalidate(struct dp_packet *p)
 #endif
 }
 
+enum { NETDEV_MAX_BURST = 32 }; /* Maximum number packets in a batch. */
+
+struct dp_packet_batch {
+int count;
+struct dp_packet *packets[NETDEV_MAX_BURST];
+};
+
+static inline void dp_packet_batch_init(struct dp_packet_batch *b)
+{
+b->count = 0;
+}
+
+static inline void
+dp_packet_batch_clone(struct dp_packet_batch *dst,
+  struct dp_packet_batch *src)
+{
+int i;
+
+for (i = 0; i < src->count; i++) {
+dst->packets[i] = dp_packet_clone(src->packets[i]);
+}
+dst->count = src->count;
+}
+
+static inline void
+packet_batch_init_packet(struct dp_packet_batch *b, struct dp_packet *p)
+{
+b->count = 1;
+b->packets[0] = p;
+}
+
 #ifdef  __cplusplus
 }
 #endif
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b2e8ae0..494 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -473,14 +473,14 @@ static void do_del_port(struct dp_netdev *dp, struct dp_netdev_port *)
 static int dpif_netdev_open(const struct dpif_class *, const char *name,
 bool create, struct dpif **);
 static void dp_netdev_execute_actions(struct dp_netdev_pmd_thread *pmd,
-  struct dp_packet **, int c,
+  struct dp_packet_batch *,
   bool may_steal,
   const struct nlattr *actions,
   size_t actions_len);
 static void dp_netdev_input(struct dp_netdev_pmd_thread *,
-struct dp_packet **, int cnt, odp_port_t port_no);
+struct dp_packet_batch *, odp_port_t port_no);
 static void dp_netdev_recirculate(struct dp_netdev_pmd_thread *,
-  struct dp_packet **, int cnt);
+  struct dp_packet_batch *);
 
 static void dp_netdev_disable_upcall(struct dp_netdev *);
 static void dp_netdev_pmd_reload_done(struct dp_netdev_pmd_thread *pmd);
@@ -2344,7 +2344,7 @@ dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute)
 {
 struct dp_netdev *dp = get_dp_netdev(dpif);
 struct dp_netdev_pmd_thread *pmd;
-struct dp_packet *pp;
+struct dp_packet_batch pp;
 
 if (dp_packet_size(execute->packet) < ETH_HEADER_LEN ||
 dp_packet_size(execute->packet) > UINT16_MAX) {
@@ -2366,8 +2366,8 @@ dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute)
 ovs_mutex_lock(&dp->port_mutex);
 }
 
-pp = execute->packet;
-dp_netdev_execute_actions(pmd, &pp, 1, false, execute->actions,
+packet_batch_init_packet(&pp, execute->packet);
+dp_netdev_execute_actions(pmd, &pp, false, execute->actions,
   execute->actions_len);
 if (pmd->core_id == NON_PMD_CORE_ID) {
 dp_netdev_pmd_unref(pmd);
@@ -2561,17 +2561,18 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
struct dp_netdev_port *port,
struct netdev_rxq *rxq)
 {
-struct dp_packet *packets[NETDEV_MAX_BURST];
-int error, cnt;
+struct dp_packet_batch batch;
+int error;
 
+dp_packet_batch_init(&batch);
 cycles_count_start(pmd);
-error = netdev_rxq_recv(rxq, packets, &cnt);
+error = netdev_rxq_recv(rxq, &batch);
 cycles_count_end(pmd, PMD_CYCLES_POLLING);
 if (!error) {
 *recirc_depth_get() = 0;
 
 cycles_count_start(pmd);
-dp_netdev_input(pmd, packets, cnt, port->port_no);
+dp_netdev_input(pmd, &batch, port->port_no);
 cycles_count_end(pmd, PMD_CYCLES_PROCESSING);
 } else if (error != EAGAIN && error != EOPNOTSUPP) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
@@ -3332,13 +3333,11 @@ dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
 }
 
 struct packet_batch_per_flow {
-unsigned int packet_count;
 unsigned int byte_count;
 uint16_t tcp_flags;
-
 struct dp_netdev_flow *flow;
 
-struct dp_packet *packets[NETDEV_MAX_BURST];
+struct

[ovs-dev] [PATCH v2 05/15] dpif-netdev: rename packet_batch

2016-04-21 Thread Pravin B Shelar

The next patch introduces a new structure named packet_batch, so
rename the existing packet_batch structure to packet_batch_per_flow.
This does not change any functionality.

Signed-off-by: Pravin B Shelar 
---
 lib/dpif-netdev.c | 36 +++-
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 5dcb862..b2e8ae0 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -327,8 +327,8 @@ struct dp_netdev_flow {
 /* While processing a group of input packets, the datapath uses the next
  * member to store a pointer to the output batch for the flow.  It is
  * reset after the batch has been sent out (See dp_netdev_queue_batches(),
- * packet_batch_init() and packet_batch_execute()). */
-struct packet_batch *batch;
+ * packet_batch_per_flow_init() and packet_batch_per_flow_execute()). */
+struct packet_batch_per_flow *batch;
 
 /* Packet classification. */
 struct dpcls_rule cr;/* In owning dp_netdev's 'cls'. */
@@ -3331,7 +3331,7 @@ dpif_netdev_packet_get_rss_hash(struct dp_packet *packet,
 return hash;
 }
 
-struct packet_batch {
+struct packet_batch_per_flow {
 unsigned int packet_count;
 unsigned int byte_count;
 uint16_t tcp_flags;
@@ -3342,8 +3342,9 @@ struct packet_batch {
 };
 
 static inline void
-packet_batch_update(struct packet_batch *batch, struct dp_packet *packet,
-const struct miniflow *mf)
+packet_batch_per_flow_update(struct packet_batch_per_flow *batch,
+ struct dp_packet *packet,
+ const struct miniflow *mf)
 {
 batch->tcp_flags |= miniflow_get_tcp_flags(mf);
 batch->packets[batch->packet_count++] = packet;
@@ -3351,7 +3352,8 @@ packet_batch_update(struct packet_batch *batch, struct dp_packet *packet,
 }
 
 static inline void
-packet_batch_init(struct packet_batch *batch, struct dp_netdev_flow *flow)
+packet_batch_per_flow_init(struct packet_batch_per_flow *batch,
+   struct dp_netdev_flow *flow)
 {
 flow->batch = batch;
 
@@ -3362,9 +3364,9 @@ packet_batch_init(struct packet_batch *batch, struct dp_netdev_flow *flow)
 }
 
 static inline void
-packet_batch_execute(struct packet_batch *batch,
- struct dp_netdev_pmd_thread *pmd,
- long long now)
+packet_batch_per_flow_execute(struct packet_batch_per_flow *batch,
+  struct dp_netdev_pmd_thread *pmd,
+  long long now)
 {
 struct dp_netdev_actions *actions;
 struct dp_netdev_flow *flow = batch->flow;
@@ -3381,16 +3383,16 @@ packet_batch_execute(struct packet_batch *batch,
 static inline void
 dp_netdev_queue_batches(struct dp_packet *pkt,
 struct dp_netdev_flow *flow, const struct miniflow *mf,
-struct packet_batch *batches, size_t *n_batches)
+struct packet_batch_per_flow *batches, size_t *n_batches)
 {
-struct packet_batch *batch = flow->batch;
+struct packet_batch_per_flow *batch = flow->batch;
 
 if (OVS_UNLIKELY(!batch)) {
 batch = &batches[(*n_batches)++];
-packet_batch_init(batch, flow);
+packet_batch_per_flow_init(batch, flow);
 }
 
-packet_batch_update(batch, pkt, mf);
+packet_batch_per_flow_update(batch, pkt, mf);
 }
 
 /* Try to process all ('cnt') the 'packets' using only the exact match cache
@@ -3407,7 +3409,7 @@ dp_netdev_queue_batches(struct dp_packet *pkt,
 static inline size_t
 emc_processing(struct dp_netdev_pmd_thread *pmd, struct dp_packet **packets,
size_t cnt, struct netdev_flow_key *keys,
-   struct packet_batch batches[], size_t *n_batches,
+   struct packet_batch_per_flow batches[], size_t *n_batches,
bool md_is_valid, odp_port_t port_no)
 {
 struct emc_cache *flow_cache = &pmd->flow_cache;
@@ -3461,7 +3463,7 @@ static inline void
 fast_path_processing(struct dp_netdev_pmd_thread *pmd,
  struct dp_packet **packets, size_t cnt,
  struct netdev_flow_key *keys,
- struct packet_batch batches[], size_t *n_batches)
+ struct packet_batch_per_flow batches[], size_t *n_batches)
 {
 #if !defined(__CHECKER__) && !defined(_WIN32)
 const size_t PKT_ARRAY_SIZE = cnt;
@@ -3614,7 +3616,7 @@ dp_netdev_input__(struct dp_netdev_pmd_thread *pmd,
 enum { PKT_ARRAY_SIZE = NETDEV_MAX_BURST };
 #endif
 struct netdev_flow_key keys[PKT_ARRAY_SIZE];
-struct packet_batch batches[PKT_ARRAY_SIZE];
+struct packet_batch_per_flow batches[PKT_ARRAY_SIZE];
 long long now = time_msec();
 size_t newcnt, n_batches, i;
 
@@ -3630,7 +3632,7 @@ dp_netdev_input__(struct dp_netdev_pmd_thread *pmd,
 }
 
 for (i = 0; i < n_batches; i++) {
-packet_batch_execute(&batches[i], pmd, now);
+packet_batch_per_flow_execute(&batches[i], pmd, now);
 }

[ovs-dev] [PATCH v2 10/15] netdev-vport: Introduce ip_build_header()

2016-04-21 Thread Pravin B Shelar

This function can be used to build various tunnel headers.

Signed-off-by: Pravin B Shelar 
---
 lib/netdev-native-tnl.c | 35 +++
 lib/netdev-native-tnl.h | 25 +
 2 files changed, 28 insertions(+), 32 deletions(-)

diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index d28cfbf..9c2dc7e 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -249,27 +249,10 @@ udp_build_header(struct netdev_tunnel_config *tnl_cfg,
  struct ovs_action_push_tnl *data,
  unsigned int *hlen)
 {
-struct ip_header *ip;
-struct ovs_16aligned_ip6_hdr *ip6;
 struct udp_header *udp;
 bool is_ipv6;
 
-*hlen = sizeof(struct eth_header);
-
-is_ipv6 = is_header_ipv6(data->header);
-
-if (is_ipv6) {
-ip6 = ipv6_hdr(data->header);
-ip6->ip6_nxt = IPPROTO_UDP;
-udp = (struct udp_header *) (ip6 + 1);
-*hlen += IPV6_HEADER_LEN;
-} else {
-ip = ip_hdr(data->header);
-ip->ip_proto = IPPROTO_UDP;
-udp = (struct udp_header *) (ip + 1);
-*hlen += IP_HEADER_LEN;
-}
-
+udp = ip_build_header(data, IPPROTO_UDP, hlen, &is_ipv6);
 udp->udp_dst = tnl_cfg->dst_port;
 
 if (is_ipv6 || tnl_flow->tunnel.flags & FLOW_TNL_F_CSUM) {
@@ -403,28 +386,16 @@ netdev_gre_build_header(const struct netdev *netdev,
 {
 struct netdev_vport *dev = netdev_vport_cast(netdev);
 struct netdev_tunnel_config *tnl_cfg;
-struct ip_header *ip;
-struct ovs_16aligned_ip6_hdr *ip6;
 struct gre_base_hdr *greh;
 ovs_16aligned_be32 *options;
-int hlen;
+unsigned int hlen, offset;
 bool is_ipv6;
 
-is_ipv6 = is_header_ipv6(data->header);
-
 /* XXX: RCUfy tnl_cfg. */
 ovs_mutex_lock(&dev->mutex);
 tnl_cfg = &dev->tnl_cfg;
 
-if (is_ipv6) {
-ip6 = ipv6_hdr(data->header);
-ip6->ip6_nxt = IPPROTO_GRE;
-greh = (struct gre_base_hdr *) (ip6 + 1);
-} else {
-ip = ip_hdr(data->header);
-ip->ip_proto = IPPROTO_GRE;
-greh = (struct gre_base_hdr *) (ip + 1);
-}
+greh = ip_build_header(data, IPPROTO_GRE, &hlen, &is_ipv6);
 
 greh->protocol = htons(ETH_TYPE_TEB);
 greh->flags = 0;
diff --git a/lib/netdev-native-tnl.h b/lib/netdev-native-tnl.h
index dbe6bd0..a0dfa8c 100644
--- a/lib/netdev-native-tnl.h
+++ b/lib/netdev-native-tnl.h
@@ -82,6 +82,31 @@ ipv6_hdr(void *eth)
 return (void *)((char *)eth + sizeof (struct eth_header));
 }
 
+static inline void *
+ip_build_header(struct ovs_action_push_tnl *data,
+uint8_t next_proto,
+unsigned int *hlen,
+bool *is_ipv6)
+{
+*hlen = sizeof(struct eth_header);
+*is_ipv6 = is_header_ipv6(data->header);
+if (*is_ipv6) {
+struct ovs_16aligned_ip6_hdr *ip6;
+
+ip6 = ipv6_hdr(data->header);
+ip6->ip6_nxt = next_proto;
+*hlen += IPV6_HEADER_LEN;
+return ip6 + 1;
+} else {
+struct ip_header *ip;
+
+ip = ip_hdr(data->header);
+ip->ip_proto = next_proto;
+*hlen += IP_HEADER_LEN;
+return ip + 1;
+}
+}
+
 extern uint16_t tnl_udp_port_min;
 extern uint16_t tnl_udp_port_max;
 
-- 
2.5.5

[ovs-dev] [PATCH v2 07/15] dpif-netdev: Refactor fast path process function.

2016-04-21 Thread Pravin B Shelar

Once the datapath supports large packets, we need to segment a packet
before sending it to upcall. Refactoring this code makes it a bit cleaner.

Signed-off-by: Pravin B Shelar 
---
 lib/dpif-netdev.c | 128 +-
 1 file changed, 70 insertions(+), 58 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 494..f34aeae 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3461,6 +3461,74 @@ emc_processing(struct dp_netdev_pmd_thread *pmd, struct dp_packet_batch *packets
 }
 
 static inline void
+handle_packet(struct dp_netdev_pmd_thread *pmd, struct dp_packet *packet,
+  const struct netdev_flow_key *key,
+  struct ofpbuf *actions, struct ofpbuf *put_actions,
+  int *lost_cnt)
+{
+struct ofpbuf *add_actions;
+struct dp_packet_batch b;
+struct match match;
+ovs_u128 ufid;
+int error;
+
+match.tun_md.valid = false;
+miniflow_expand(&key->mf, &match.flow);
+
+ofpbuf_clear(actions);
+ofpbuf_clear(put_actions);
+
+dpif_flow_hash(pmd->dp->dpif, &match.flow, sizeof match.flow, &ufid);
+error = dp_netdev_upcall(pmd, packet, &match.flow, &match.wc,
+ &ufid, DPIF_UC_MISS, NULL, actions,
+ put_actions);
+if (OVS_UNLIKELY(error && error != ENOSPC)) {
+dp_packet_delete(packet);
+(*lost_cnt)++;
+return;
+}
+
+/* The Netlink encoding of datapath flow keys cannot express
+ * wildcarding the presence of a VLAN tag. Instead, a missing VLAN
+ * tag is interpreted as exact match on the fact that there is no
+ * VLAN.  Unless we refactor a lot of code that translates between
+ * Netlink and struct flow representations, we have to do the same
+ * here. */
+if (!match.wc.masks.vlan_tci) {
+match.wc.masks.vlan_tci = htons(0xffff);
+}
+
+/* We can't allow the packet batching in the next loop to execute
+ * the actions.  Otherwise, if there are any slow path actions,
+ * we'll send the packet up twice. */
+packet_batch_init_packet(&b, packet);
+dp_netdev_execute_actions(pmd, &b, true,
+  actions->data, actions->size);
+
+add_actions = put_actions->size ? put_actions : actions;
+if (OVS_LIKELY(error != ENOSPC)) {
+struct dp_netdev_flow *netdev_flow;
+
+/* XXX: There's a race window where a flow covering this packet
+ * could have already been installed since we last did the flow
+ * lookup before upcall.  This could be solved by moving the
+ * mutex lock outside the loop, but that's an awful long time
+ * to be locking everyone out of making flow installs.  If we
+ * move to a per-core classifier, it would be reasonable. */
+ovs_mutex_lock(&pmd->flow_mutex);
+netdev_flow = dp_netdev_pmd_lookup_flow(pmd, key);
+if (OVS_LIKELY(!netdev_flow)) {
+netdev_flow = dp_netdev_flow_add(pmd, &match, &ufid,
+ add_actions->data,
+ add_actions->size);
+}
+ovs_mutex_unlock(&pmd->flow_mutex);
+
+emc_insert(&pmd->flow_cache, key, netdev_flow);
+}
+}
+
+static inline void
 fast_path_processing(struct dp_netdev_pmd_thread *pmd,
  struct dp_packet_batch *packets_,
  struct netdev_flow_key *keys,
@@ -3477,7 +3545,6 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
 struct dpcls_rule *rules[PKT_ARRAY_SIZE];
 struct dp_netdev *dp = pmd->dp;
 struct emc_cache *flow_cache = &pmd->flow_cache;
-struct dp_packet_batch b;
 int miss_cnt = 0, lost_cnt = 0;
 bool any_miss;
 size_t i;
@@ -3490,16 +3557,12 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
 if (OVS_UNLIKELY(any_miss) && !fat_rwlock_tryrdlock(&dp->upcall_rwlock)) {
 uint64_t actions_stub[512 / 8], slow_stub[512 / 8];
 struct ofpbuf actions, put_actions;
-ovs_u128 ufid;
 
 ofpbuf_use_stub(&actions, actions_stub, sizeof actions_stub);
 ofpbuf_use_stub(&put_actions, slow_stub, sizeof slow_stub);
 
 for (i = 0; i < cnt; i++) {
 struct dp_netdev_flow *netdev_flow;
-struct ofpbuf *add_actions;
-struct match match;
-int error;
 
 if (OVS_LIKELY(rules[i])) {
 continue;
@@ -3515,59 +3578,8 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
 }
 
 miss_cnt++;
-
-match.tun_md.valid = false;
-miniflow_expand(&keys[i].mf, &match.flow);
-
-ofpbuf_clear(&actions);
-ofpbuf_clear(&put_actions);
-
-dpif_flow_hash(dp->dpif, &match.flow, sizeof match.flow, &ufid);
-error = dp_netdev_upcall(pmd, packets[i], &match.flow, &match.wc,
- &ufid, DPIF_UC_MISS, NULL, &actions,
- &put_actions);
-if (OVS_UNLIKELY(error && error != ENOSPC)) {
-

[ovs-dev] [PATCH v2 04/15] dp-packet: use packet reset function.

2016-04-21 Thread Pravin B Shelar

Signed-off-by: Pravin B Shelar 
---
 lib/dp-packet.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 201fd14..4a8b5ab 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -516,7 +516,7 @@ dp_packet_reset_packet(struct dp_packet *b, int off)
 {
 dp_packet_set_size(b, dp_packet_size(b) - off);
 dp_packet_set_data(b, ((unsigned char *) dp_packet_data(b) + off));
-b->l2_5_ofs = b->l3_ofs = b->l4_ofs = UINT16_MAX;
+dp_packet_reset_offsets(b);
 }
 
 /* Returns the RSS hash of the packet 'p'.  Note that the returned value is
-- 
2.5.5

[ovs-dev] [PATCH v2 00/15] dpif-netdev: Large packet segmentation and STT.

2016-04-21 Thread Pravin B Shelar

The following patch series adds support for STT. STT can generate a
large packet by merging multiple STT segments. To handle packets larger
than the device MTU, we need to segment them. We could simply segment
the packet in STT and send each segment up to the userspace datapath,
but that generates many more packets and lowers performance. So this
patch series improves the userspace datapath to handle large packets
and to segment a packet only when required, typically just before
sending it over the device.
This makes it easier to use DPDK segmentation offloads in future
integration work.
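
For illustration only (not part of the series): a rough sketch of the cost
argument above, counting datapath lookups and transmitted frames when a large
packet is segmented early (on receive) versus late (just before transmit).
The names dev_stats, frames_for(), process_early() and process_late() are
hypothetical, and header overhead is ignored for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters for the comparison. */
struct dev_stats {
    size_t lookups;    /* Datapath flow lookups performed. */
    size_t tx_frames;  /* Frames handed to the device. */
};

/* Frames needed on the wire for 'len' bytes at 'mtu' bytes per frame. */
static size_t
frames_for(size_t len, size_t mtu)
{
    assert(mtu > 0);
    return len == 0 ? 1 : (len + mtu - 1) / mtu;
}

/* Early: segment on receive, so every segment goes through the datapath. */
static void
process_early(struct dev_stats *s, size_t len, size_t mtu)
{
    size_t n = frames_for(len, mtu);

    s->lookups += n;
    s->tx_frames += n;
}

/* Late: one lookup for the large packet, segmentation only at transmit. */
static void
process_late(struct dev_stats *s, size_t len, size_t mtu)
{
    s->lookups += 1;
    s->tx_frames += frames_for(len, mtu);
}
```

For a 60000-byte packet and a 1500-byte MTU, both approaches transmit 40
frames, but the early approach performs 40 lookups against 1 for the late
approach.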

I have also fixed the patches according to the comments I got from Jesse on v1.

v1-v2: (major changes)
- Added support for large packet segmentation.
- renamed tunnel module to netdev-native-tnl
- move STT to separate module.

Pravin B Shelar (15):
  netdev-vport: Factor-out tunnel Push-pop code into separate module.
  netdev: Return number of packet from netdev_pop_header()
  dp-packet: Add private data
  dp-packet: use packet reset function.
  dpif-netdev: rename packet_batch
  dpif-netdev: create batch object
  dpif-netdev: Refactor fast path process function.
  dpif-netdev: Refactor userspace action
  netdev: Add Large segment offload support.
  netdev-vport: introduce ip_build_header()
  tunnel: Add IP ECN related functions.
  tnl-ports: Handle STT ports.
  native tunnel: Add support for STT
  netdev: Add support for GRE segmentation
  netdev: Add support for GENEVE/VXLAN tunnel segmentation

 lib/automake.mk   |   7 +
 lib/dp-packet-lso.c   | 490 +
 lib/dp-packet-lso.h   |  60 
 lib/dp-packet.c   |   2 +
 lib/dp-packet.h   |  62 +++-
 lib/dpif-netdev.c | 351 -
 lib/dpif.c|  12 +-
 lib/netdev-native-stt.c   | 700 ++
 lib/netdev-native-stt.h   |  37 +++
 lib/netdev-native-tnl.c   | 650 +++
 lib/netdev-native-tnl.h   | 133 
 lib/netdev-provider.h |   9 +-
 lib/netdev-vport-private.h|  63 
 lib/netdev-vport.c| 677 +---
 lib/netdev.c  | 150 +++--
 lib/netdev.h  |  12 +-
 lib/odp-execute.c |  10 +-
 lib/odp-execute.h |   5 +-
 lib/odp-util.c|  62 +++-
 lib/packets.c |  21 ++
 lib/packets.h |  42 +++
 lib/timeval.h |   1 +
 lib/tnl-ports.c   |  82 +++--
 lib/tnl-ports.h   |   4 +-
 ofproto/ofproto-dpif-xlate.c  |   5 +-
 ofproto/tunnel.c  |  14 +-
 tests/tunnel-push-pop-ipv6.at |  19 +-
 tests/tunnel-push-pop.at  |  27 ++
 28 files changed, 2808 insertions(+), 899 deletions(-)
 create mode 100644 lib/dp-packet-lso.c
 create mode 100644 lib/dp-packet-lso.h
 create mode 100644 lib/netdev-native-stt.c
 create mode 100644 lib/netdev-native-stt.h
 create mode 100644 lib/netdev-native-tnl.c
 create mode 100644 lib/netdev-native-tnl.h
 create mode 100644 lib/netdev-vport-private.h

-- 
2.5.5

[ovs-dev] [PATCH v2 02/15] netdev: Return number of packet from netdev_pop_header()

2016-04-21 Thread Pravin B Shelar

The current tunnel-pop API does not allow the netdev implementation
to retain a packet, but STT can keep a packet from a batch of packets
during TCP reassembly processing. To return the exact count of
valid packets, STT needs the packet count parameter to be passed
by reference.
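
For illustration only (not part of the patch): a minimal sketch of a pop
function that takes the packet count by reference. Packets that the
implementation retains (for example, while waiting for reassembly) are
dropped from the caller's array, the remaining packets are compacted to the
front, and *cnt is updated. The names sketch_pkt and sketch_pop_header() are
hypothetical, and the 'complete' flag merely stands in for real reassembly
state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet: 'complete' is false while reassembly is pending. */
struct sketch_pkt {
    int id;
    bool complete;
};

/* Keeps only complete packets in 'pkts', preserving their order, and stores
 * the number kept in '*cnt'.  Incomplete packets stay owned by the
 * implementation (here they are simply skipped).  Returns 0 on success. */
static int
sketch_pop_header(struct sketch_pkt **pkts, int *cnt)
{
    int i, kept = 0;

    assert(pkts != NULL && cnt != NULL && *cnt >= 0);
    for (i = 0; i < *cnt; i++) {
        if (pkts[i]->complete) {
            pkts[kept++] = pkts[i];
        }
    }
    *cnt = kept;
    return 0;
}
```

A caller would then check the updated count and return early when it is
zero, as the dp_execute_cb() hunk below does.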

Signed-off-by: Pravin B Shelar 
---
 lib/dpif-netdev.c   |  9 +++--
 lib/netdev-native-tnl.c | 41 +
 lib/netdev-native-tnl.h |  6 +++---
 lib/netdev-provider.h   |  6 --
 lib/netdev.c| 14 ++
 lib/netdev.h|  2 +-
 6 files changed, 46 insertions(+), 32 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 1e8a37c..5dcb862 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3724,7 +3724,6 @@ dp_execute_cb(void *aux_, struct dp_packet **packets, int cnt,
 struct dp_netdev *dp = pmd->dp;
 int type = nl_attr_type(a);
 struct dp_netdev_port *p;
-int i;
 
 switch ((enum ovs_action_attr)type) {
 case OVS_ACTION_ATTR_OUTPUT:
@@ -3775,8 +3774,12 @@ dp_execute_cb(void *aux_, struct dp_packet **packets, int cnt,
packets = tnl_pkt;
 }
 
-err = netdev_pop_header(p->netdev, packets, cnt);
+err = netdev_pop_header(p->netdev, packets, &cnt);
+if (!cnt) {
+return;
+}
 if (!err) {
+int i;
 
 for (i = 0; i < cnt; i++) {
 packets[i]->md.in_port.odp_port = portno;
@@ -3799,6 +3802,7 @@ dp_execute_cb(void *aux_, struct dp_packet **packets, int cnt,
 struct ofpbuf actions;
 struct flow flow;
 ovs_u128 ufid;
+int i;
 
 userdata = nl_attr_find_nested(a, OVS_USERSPACE_ATTR_USERDATA);
 ofpbuf_init(&actions, 0);
@@ -3830,6 +3834,7 @@ dp_execute_cb(void *aux_, struct dp_packet **packets, int cnt,
 case OVS_ACTION_ATTR_RECIRC:
 if (*depth < MAX_RECIRC_DEPTH) {
 struct dp_packet *recirc_pkts[NETDEV_MAX_BURST];
+int i;
 
 if (!may_steal) {
dp_netdev_clone_pkt_batch(recirc_pkts, packets, cnt);
diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index b52b068..d28cfbf 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -353,7 +353,7 @@ parse_gre_header(struct dp_packet *packet,
 return hlen;
 }
 
-int
+struct dp_packet *
 netdev_gre_pop_header(struct dp_packet *packet)
 {
 struct pkt_metadata *md = &packet->md;
@@ -365,17 +365,20 @@ netdev_gre_pop_header(struct dp_packet *packet)
 
 pkt_metadata_init_tnl(md);
 if (hlen > dp_packet_size(packet)) {
-return EINVAL;
+goto err;
 }
 
 hlen = parse_gre_header(packet, tnl);
 if (hlen < 0) {
-return -hlen;
+goto err;
 }
 
 dp_packet_reset_packet(packet, hlen);
 
-return 0;
+return packet;
+err:
+dp_packet_delete(packet);
+return NULL;
 }
 
 void
@@ -450,7 +453,7 @@ netdev_gre_build_header(const struct netdev *netdev,
 return 0;
 }
 
-int
+struct dp_packet *
 netdev_vxlan_pop_header(struct dp_packet *packet)
 {
 struct pkt_metadata *md = &packet->md;
@@ -460,12 +463,12 @@ netdev_vxlan_pop_header(struct dp_packet *packet)
 
 pkt_metadata_init_tnl(md);
 if (VXLAN_HLEN > dp_packet_l4_size(packet)) {
-return EINVAL;
+goto err;
 }
 
 vxh = udp_extract_tnl_md(packet, tnl, &hlen);
 if (!vxh) {
-return EINVAL;
+goto err;
 }
 
 if (get_16aligned_be32(&vxh->vx_flags) != htonl(VXLAN_FLAGS) ||
@@ -473,14 +476,17 @@ netdev_vxlan_pop_header(struct dp_packet *packet)
 VLOG_WARN_RL(&err_rl, "invalid vxlan flags=%#x vni=%#x\n",
  ntohl(get_16aligned_be32(&vxh->vx_flags)),
  ntohl(get_16aligned_be32(&vxh->vx_vni)));
-return EINVAL;
+goto err;
 }
 tnl->tun_id = htonll(ntohl(get_16aligned_be32(&vxh->vx_vni)) >> 8);
 tnl->flags |= FLOW_TNL_F_KEY;
 
 dp_packet_reset_packet(packet, hlen + VXLAN_HLEN);
 
-return 0;
+return packet;
+err:
+dp_packet_delete(packet);
+return NULL;
 }
 
 int
@@ -508,7 +514,7 @@ netdev_vxlan_build_header(const struct netdev *netdev,
 return 0;
 }
 
-int
+struct dp_packet *
 netdev_geneve_pop_header(struct dp_packet *packet)
 {
 struct pkt_metadata *md = &packet->md;
@@ -520,12 +526,12 @@ netdev_geneve_pop_header(struct dp_packet *packet)
 if (GENEVE_BASE_HLEN > dp_packet_l4_size(packet)) {
 VLOG_WARN_RL(&err_rl, "geneve packet too small: min header=%u packet size=%"PRIuSIZE"\n",
  (unsigned int)GENEVE_BASE_HLEN, dp_packet_l4_size(packet));
-return EINVAL;
+goto err;
 }
 
 gnh = udp_extract_tnl_md(packet, tnl, &hlen);
 if (!gnh) {
-return EINVAL;
+goto err;
 }
 
 opts_len = gnh->opt_len * 4;
@@ -533,18 +539,18 @@ netdev_geneve_pop_header(struct dp_packet
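
The API change in this patch (per-packet pop functions that return the packet or NULL, and a batch call that reports the surviving count through a pointer) can be sketched roughly as below. This is only an illustration under stated assumptions: `struct pkt`, `pop_one()` and `pop_header_batch()` are hypothetical stand-ins, not OVS functions, and the compaction loop is a guess at how a caller might keep the batch dense.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dp_packet. */
struct pkt {
    int valid;          /* Nonzero if the tunnel header parses cleanly. */
};

/* Hypothetical per-packet callback: returns the packet on success, or NULL
 * if the packet was consumed (dropped, or held back for reassembly). */
static struct pkt *
pop_one(struct pkt *p)
{
    return p->valid ? p : NULL;
}

/* Pops the header from each packet in 'pkts', compacting survivors to the
 * front of the array.  '*cnt' is the batch size on entry and the number of
 * packets that are still valid on return. */
static void
pop_header_batch(struct pkt **pkts, int *cnt)
{
    int n = 0;

    assert(*cnt >= 0);
    for (int i = 0; i < *cnt; i++) {
        struct pkt *p = pop_one(pkts[i]);

        if (p) {
            pkts[n++] = p;
        }
    }
    *cnt = n;
}
```

A caller would then check the updated count before touching the batch, much as the `if (!cnt) return;` test in the dpif-netdev.c hunk above does.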

[ovs-dev] [PATCH v2 03/15] dp-packet: Add private data

2016-04-21 Thread Pravin B Shelar

This scratchpad can be used by any layer to keep private data.
STT will use it for TCP reassembly state.

Signed-off-by: Pravin B Shelar 
Acked-by: Jesse Gross 
---
 lib/dp-packet.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 000b09d..201fd14 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -36,6 +36,8 @@ enum OVS_PACKED_ENUM dp_packet_source {
 * ref to build_dp_packet() in netdev-dpdk. */
 };
 
+#define DP_PACKET_CONTEXT_SIZE 64
+
 /* Buffer for holding packet data.  A dp_packet is automatically reallocated
  * as necessary if it grows too large for the available memory.
  */
@@ -58,7 +60,10 @@ struct dp_packet {
 * or UINT16_MAX. */
 uint16_t l4_ofs;   /* Transport-level header offset,
   or UINT16_MAX. */
-struct pkt_metadata md;
+union {
+struct pkt_metadata md;
+uint64_t data[DP_PACKET_CONTEXT_SIZE / 8];
+};
 };
 
 static inline void *dp_packet_data(const struct dp_packet *);
-- 
2.5.5
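
As a rough illustration of how a layer might use such a scratchpad, the sketch below overlays a small private struct on the union's raw storage. It is a minimal sketch, not OVS code: `struct packet`, `struct meta`, `struct reasm_state` and the two accessors are hypothetical names, and only the 64-byte size and the union layout are taken from the patch. Note that the private data shares storage with the metadata, so writing one overwrites the other.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CONTEXT_SIZE 64     /* Mirrors DP_PACKET_CONTEXT_SIZE in the patch. */

/* Hypothetical stand-in for struct pkt_metadata. */
struct meta {
    uint32_t recirc_id;
    uint32_t in_port;
};

/* Hypothetical per-layer state, e.g. what a reassembly layer might keep. */
struct reasm_state {
    uint32_t next_seq;
    uint16_t frag_count;
};

struct packet {
    union {
        struct meta md;
        uint64_t data[CONTEXT_SIZE / 8];
    };
};

/* The private state must fit in the scratchpad. */
_Static_assert(sizeof(struct reasm_state) <= CONTEXT_SIZE,
               "private state larger than scratchpad");

/* Stores 'state' in the packet's scratchpad (this clobbers 'md'). */
static void
packet_set_state(struct packet *p, const struct reasm_state *state)
{
    memcpy(p->data, state, sizeof *state);
}

/* Copies the scratchpad contents back out into 'state'. */
static void
packet_get_state(const struct packet *p, struct reasm_state *state)
{
    memcpy(state, p->data, sizeof *state);
}
```

Copying with memcpy() rather than casting the array pointer keeps the sketch clear of strict-aliasing questions; real code may choose differently.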


Re: [ovs-dev] [PATCH v3] Add VxLAN-GBP support for user space data path

2016-04-21 Thread Jesse Gross

On Thu, Apr 21, 2016 at 6:03 PM, Li, Johnson  wrote:
>> -Original Message-
>> From: Jesse Gross [mailto:je...@kernel.org]
>> Sent: Friday, April 22, 2016 12:44 AM
>> To: Li, Johnson 
>> Cc: ovs dev 
>> Subject: Re: [ovs-dev] [PATCH v3] Add VxLAN-GBP support for user space
>> data path
>>
>> On Wed, Apr 20, 2016 at 11:43 PM, Johnson.Li  wrote:
>> > From: Johnson Li 
>> >
>> > In user space, only standard VxLAN was supported. This patch adds
>> > VxLAN-GBP support for the user space data path.
>> >
>> > How to use:
>> > 1> Create VxLAN port with GBP extension
>> >   $ovs-vsctl add-port br-int vxlan0 -- set interface vxlan0 \
>> >type=vxlan options:dst_port=4789 \
>> >options:remote_ip=192.168.60.22 \
>> >options:key=1000 options:exts=gbp
>> > 2> Add flow for transmitting
>> >   $ovs-ofctl add-flow br-int "table=0, priority=260, \
>> >  in_port=LOCAL actions=load:0x100->NXM_NX_TUN_GBP_ID[], \
>> >  output:1"
>> > 3> Add flow for receiving
>> >   $ovs-ofctl add-flow br-int "table=0, priority=260, \
>> >  in_port=1,tun_gbp_id=0x100 actions=output:LOCAL"
>> >
>> > Check data path flow rules:
>> > $ovs-appctl dpif/dump-flows br-int
>> >   recirc_id(0),in_port(1),eth_type(0x0800),ipv4(tos=0/0x3,frag=no),
>> >   packets:0, bytes:0, used:never, actions:tnl_push(tnl_port(2),
>> >   header(size=50,type=4,eth(dst=90:e2:ba:48:7f:a4,src=90:e2:ba:48:7e:1c,
>> >   dl_type=0x0800),ipv4(src=192.168.60.21,dst=192.168.60.22,proto=17,
>> >   tos=0,ttl=64,frag=0x4000),udp(src=0,dst=4789,csum=0x0),
>> >   vxlan(flags=0x88000100,vni=0x3e8)),out_port(3))
>> >   tunnel(tun_id=0x3e8,src=192.168.60.22,dst=192.168.60.21,
>> >   vxlan(gbp(id=256)),flags(-df-csum+key)),skb_mark(0),recirc_id(0),
>> >   in_port(2),eth(dst=ae:1b:ed:1e:e3:4e),eth_type(0x0800),
>> >   ipv4(dst=172.168.60.21,proto=1/0x10,frag=no), packets:0, bytes:0,
>> >   used:never, actions:1
>> >
>> > ---
>> > Change Log:
>> >   v3: Change Macro definition, add more comments, add unit test.
>> >   v2: Set Each enabled bit for the VxLAN-GBP.
>> >
>> > Signed-off-by: Johnson Li 
>>
>> Please do not submit a new version of a patch without addressing the
>> existing comments. I have asked you several times to not interpret bits from
>> an extension without checking whether that extension has explicitly been
>> enabled.
>>
> The code in the kernel data path checks vxlan_sock.flags:
> (vs->flags & VXLAN_F_GBP).
> It can do this because the kernel implementation is built on a UDP socket.
> In userspace no such socket exists, and the code lives in the tunnel's
> pop_header callback, where no flag indicates whether GBP is enabled on the
> tunnel. So, as I understand it, in userspace we must trust the flags in the
> packet. Also, from the drafts I have read, the G bit is used only by GBP;
> other drafts keep this bit reserved, so it must be set to 0.
> The only parameter of the tunnel's pop_header callback is
> struct dp_packet *packet
> so we cannot get any additional flags from this input.
> That is why I don't check additional flags.

It is true that the code is structured differently between OVS and Linux.

However, that does not mean that you can ignore correctness simply
because it isn't convenient. If you need to restructure things in
order for your changes to work, then you must do that. There are many,
many changes in both OVS and Linux that require this type of
infrastructure work before patches can go in. In neither case are
patches applied before this is done.

>> In addition, it appears that you are copying code from the Linux kernel. You
>> cannot do this as the licenses are not compatible.
> No code was copied from the kernel, but I did copy the comment that
> defines the VxLAN-GBP header from vxlan.h.

Comments are copyrighted too. Please do not copy anything from the
Linux kernel into OVS.

In addition, the way that I could tell that this is from Linux is that
you copied references to things that only exist in Linux and don't make
sense in the context of OVS.
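
To make the requested check concrete, here is a minimal sketch of interpreting the G bit only when the tunnel was configured with the GBP extension. It is an illustration under assumptions, not the OVS implementation: `struct tnl_cfg` and `parse_vxlan_flags()` are hypothetical, and the bit positions are inferred from the `flags=0x88000100` / `gbp(id=256)` example quoted above (G bit in the top bit, I bit at 0x08000000, policy ID in the low 16 bits).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VXLAN_FLAG_VNI    0x08000000u   /* I bit: VNI is valid. */
#define VXLAN_FLAG_GBP    0x80000000u   /* G bit: group policy ID present. */
#define VXLAN_GBP_ID_MASK 0x0000ffffu

/* Hypothetical per-tunnel configuration, filled in when the port is
 * created (for example from options:exts=gbp). */
struct tnl_cfg {
    bool gbp_enabled;
};

/* Checks the first 32-bit word of a VXLAN header, given in host byte order.
 * Returns false if the word is not acceptable for this tunnel.  On success,
 * '*gbp_id' is the group policy ID, or 0 if none was carried. */
static bool
parse_vxlan_flags(const struct tnl_cfg *cfg, uint32_t flags, uint16_t *gbp_id)
{
    assert(cfg && gbp_id);
    *gbp_id = 0;

    if (!(flags & VXLAN_FLAG_VNI)) {
        return false;
    }

    if (cfg->gbp_enabled) {
        /* The extension was explicitly enabled, so the G bit may be
         * interpreted. */
        if (flags & VXLAN_FLAG_GBP) {
            *gbp_id = flags & VXLAN_GBP_ID_MASK;
        }
        return true;
    }

    /* Plain VXLAN: every bit other than I is reserved and must be zero,
     * so a packet carrying GBP bits is rejected rather than trusted. */
    return flags == VXLAN_FLAG_VNI;
}
```

The point of the design is that the decision comes from configuration rather than from the packet: the same header word is accepted on a GBP-enabled tunnel and rejected on a plain one.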

Re: [ovs-dev] [PATCH v3] Add VxLAN-GBP support for user space data path

2016-04-21 Thread Li, Johnson

> -Original Message-
> From: Jesse Gross [mailto:je...@kernel.org]
> Sent: Friday, April 22, 2016 12:44 AM
> To: Li, Johnson 
> Cc: ovs dev 
> Subject: Re: [ovs-dev] [PATCH v3] Add VxLAN-GBP support for user space
> data path
> 
> On Wed, Apr 20, 2016 at 11:43 PM, Johnson.Li  wrote:
> > From: Johnson Li 
> >
> > In user space, only standard VxLAN was supported. This patch adds
> > VxLAN-GBP support for the user space data path.
> >
> > How to use:
> > 1> Create VxLAN port with GBP extension
> >   $ovs-vsctl add-port br-int vxlan0 -- set interface vxlan0 \
> >type=vxlan options:dst_port=4789 \
> >options:remote_ip=192.168.60.22 \
> >options:key=1000 options:exts=gbp
> > 2> Add flow for transmitting
> >   $ovs-ofctl add-flow br-int "table=0, priority=260, \
> >  in_port=LOCAL actions=load:0x100->NXM_NX_TUN_GBP_ID[], \
> >  output:1"
> > 3> Add flow for receiving
> >   $ovs-ofctl add-flow br-int "table=0, priority=260, \
> >  in_port=1,tun_gbp_id=0x100 actions=output:LOCAL"
> >
> > Check data path flow rules:
> > $ovs-appctl dpif/dump-flows br-int
> >   recirc_id(0),in_port(1),eth_type(0x0800),ipv4(tos=0/0x3,frag=no),
> >   packets:0, bytes:0, used:never, actions:tnl_push(tnl_port(2),
> >   header(size=50,type=4,eth(dst=90:e2:ba:48:7f:a4,src=90:e2:ba:48:7e:1c,
> >   dl_type=0x0800),ipv4(src=192.168.60.21,dst=192.168.60.22,proto=17,
> >   tos=0,ttl=64,frag=0x4000),udp(src=0,dst=4789,csum=0x0),
> >   vxlan(flags=0x88000100,vni=0x3e8)),out_port(3))
> >   tunnel(tun_id=0x3e8,src=192.168.60.22,dst=192.168.60.21,
> >   vxlan(gbp(id=256)),flags(-df-csum+key)),skb_mark(0),recirc_id(0),
> >   in_port(2),eth(dst=ae:1b:ed:1e:e3:4e),eth_type(0x0800),
> >   ipv4(dst=172.168.60.21,proto=1/0x10,frag=no), packets:0, bytes:0,
> >   used:never, actions:1
> >
> > ---
> > Change Log:
> >   v3: Change Macro definition, add more comments, add unit test.
> >   v2: Set Each enabled bit for the VxLAN-GBP.
> >
> > Signed-off-by: Johnson Li 
> 
> Please do not submit a new version of a patch without addressing the
> existing comments. I have asked you several times to not interpret bits from
> an extension without checking whether that extension has explicitly been
> enabled.
> 
The code in the kernel data path checks vxlan_sock.flags:
(vs->flags & VXLAN_F_GBP).
It can do this because the kernel implementation is built on a UDP socket.
In userspace no such socket exists, and the code lives in the tunnel's
pop_header callback, where no flag indicates whether GBP is enabled on the
tunnel. So, as I understand it, in userspace we must trust the flags in the
packet. Also, from the drafts I have read, the G bit is used only by GBP;
other drafts keep this bit reserved, so it must be set to 0.
The only parameter of the tunnel's pop_header callback is
struct dp_packet *packet
so we cannot get any additional flags from this input.
That is why I don't check additional flags.
> In addition, it appears that you are copying code from the Linux kernel. You
> cannot do this as the licenses are not compatible.
No code was copied from the kernel, but I did copy the comment that
defines the VxLAN-GBP header from vxlan.h.

Re: [ovs-dev] [PATCH] FAQ: Add entry for OVS/DPDK version dependencies.

2016-04-21 Thread Daniele Di Proietto

Thanks for writing this up, applied to master!

2016-04-19 3:35 GMT-07:00 Kevin Traynor :

> For a given release this is listed in the INSTALL.DPDK.md
> but it gets asked quite a bit on the mailing list, so create
> a table in the FAQ.
>
> Signed-off-by: Kevin Traynor 
> ---
>  FAQ.md |   13 +
>  1 files changed, 13 insertions(+), 0 deletions(-)
>
> diff --git a/FAQ.md b/FAQ.md
> index 0fee992..5777471 100644
> --- a/FAQ.md
> +++ b/FAQ.md
> @@ -233,6 +233,19 @@ Validate flow actions |  YES   |   YES  |N/A|   NO|
>  Multiple datapaths|  YES   |   YES  |YES|   NO|
>  Tunnel TSO - STT  |  N/A   |   YES  |NO |   YES   |
>
> +### Q: What DPDK version does each Open vSwitch release work with?
> +
> +A: The following table lists the DPDK version against which the
> +   given versions of Open vSwitch will successfully build.
> +
> +| Open vSwitch | DPDK
> +|:------------:|:-----:
> +|2.2.x | 1.6
> +|2.3.x | 1.6
> +|2.4.x | 2.0
> +|2.5.x | 2.2
> +|2.6.x | 16.04
> +
>  ### Q: I get an error like this when I configure Open vSwitch:
>
> configure: error: Linux kernel in  is version , but
> --
> 1.7.4.1
>

[ovs-dev] Returned mail: see transcript for details

2016-04-21 Thread john

The message was undeliverable due to the following reason:

Your message could not be delivered because the destination server was
not reachable within the allowed queue period. The amount of time
a message is queued before it is returned depends on local
configuration parameters.

Most likely there is a network problem that prevented delivery, but
it is also possible that the computer is turned off, or does not
have a mail system running right now.

Your message was not delivered within 2 days:
Server 109.121.255.98 is not responding.

The following recipients did not receive this message:


Please reply to postmas...@vcod.com
if you feel this message to be in error.


Re: [ovs-dev] [PATCH 2/2] system-traffic: Add basic geneve tunnel sanity test.

2016-04-21 Thread Daniele Di Proietto

Thanks for adding these tests!

Acked-by: Daniele Di Proietto 


On 20/04/2016 16:07, "Joe Stringer"  wrote:

>Signed-off-by: Joe Stringer 
>---
> tests/system-common-macros.at |  4 
> tests/system-traffic.at   | 41
>+
> 2 files changed, 45 insertions(+)
>
>diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at
>index 1b9b5c1e9f15..2116f1e31357 100644
>--- a/tests/system-common-macros.at
>+++ b/tests/system-common-macros.at
>@@ -159,3 +159,7 @@ m4_define([OVS_CHECK_VXLAN],
> # OVS_CHECK_GRE()
> m4_define([OVS_CHECK_GRE],
> [AT_SKIP_IF([! ip link add foo type gretap help 2>&1 | grep gre >/dev/null])])
>+
>+# OVS_CHECK_GENEVE()
>+m4_define([OVS_CHECK_GENEVE],
>+[AT_SKIP_IF([! ip link add foo type geneve help 2>&1 | grep geneve >/dev/null])])
>diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>index 8684a5f06c68..a3d93e92c887 100644
>--- a/tests/system-traffic.at
>+++ b/tests/system-traffic.at
>@@ -188,6 +188,47 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3
>-w 2 10.1.1.100 | FORMAT_PI
> OVS_TRAFFIC_VSWITCHD_STOP
> AT_CLEANUP
> 
>+AT_SETUP([datapath - ping over geneve tunnel])
>+OVS_CHECK_GENEVE()
>+
>+OVS_TRAFFIC_VSWITCHD_START()
>+ADD_BR([br-underlay])
>+
>+AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
>+AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
>+
>+ADD_NAMESPACES(at_ns0)
>+
>+dnl Set up underlay link from host into the namespace using veth pair.
>+ADD_VETH(p0, at_ns0, br-underlay, "172.31.1.1/24")
>+AT_CHECK([ip addr add dev br-underlay "172.31.1.100/24"])
>+AT_CHECK([ip link set dev br-underlay up])
>+
>+dnl Set up tunnel endpoints on OVS outside the namespace and with a native
>+dnl linux device inside the namespace.
>+ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.31.1.1], [10.1.1.100/24])
>+ADD_NATIVE_TUNNEL([geneve], [ns_gnv0], [at_ns0], [172.31.1.100], [10.1.1.1/24],
>+  [vni 0])
>+
>+dnl First, check the underlay
>+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 172.31.1.100 | FORMAT_PING], [0], [dnl
>+3 packets transmitted, 3 received, 0% packet loss, time 0ms
>+])
>+
>+dnl Okay, now check the overlay with different packet sizes
>+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | FORMAT_PING], [0], [dnl
>+3 packets transmitted, 3 received, 0% packet loss, time 0ms
>+])
>+NS_CHECK_EXEC([at_ns0], [ping -s 1600 -q -c 3 -i 0.3 -w 2 10.1.1.100 | FORMAT_PING], [0], [dnl
>+3 packets transmitted, 3 received, 0% packet loss, time 0ms
>+])
>+NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | FORMAT_PING], [0], [dnl
>+3 packets transmitted, 3 received, 0% packet loss, time 0ms
>+])
>+
>+OVS_TRAFFIC_VSWITCHD_STOP
>+AT_CLEANUP
>+
> AT_SETUP([conntrack - controller])
> CHECK_CONNTRACK()
> OVS_TRAFFIC_VSWITCHD_START()
>-- 
>2.1.4
>


Re: [ovs-dev] [PATCH 1/2] system-traffic: Add basic gre tunnel sanity test.

2016-04-21 Thread Daniele Di Proietto

Acked-by: Daniele Di Proietto 


On 21/04/2016 13:29, "Joe Stringer"  wrote:

>On 20 April 2016 at 16:07, Joe Stringer  wrote:
>> Signed-off-by: Joe Stringer 
>> ---
>
>
>> +dnl Set up tunnel endpoints on OVS outside the namespace and with a native
>> +dnl linux device inside the namespace.
>> +ADD_OVS_TUNNEL([gre], [br0], [at_gre0], [172.31.1.1], [10.1.1.100/24])
>> +ADD_NATIVE_TUNNEL([gretap], [ns_gre0], [at_ns0], [172.31.1.100], [10.1.1.1/24],
>> +  [local 172.31.1.1])
>
>The "local" option is optional, I'll probably drop it since it's not
>necessary.

I tested with and without it and it appears to work


Re: [ovs-dev] [PATCH] ovn: Fix link in tutorial

2016-04-21 Thread Russell Bryant

On Mon, Apr 18, 2016 at 1:25 AM, Jamie Lennox  wrote:

> Correct the link to the ovn-northd man page in the OVN tutorial.
>
> Signed-off-by: Jamie Lennox 
>

Thanks for the patch!  I applied this to master.

Since this was your first patch to OVS, I also added your name to the
AUTHORS file.  Thanks again!

-- 
Russell Bryant

Re: [ovs-dev] [PATCH] system-traffic: Fix IPv6 frag vxlan check.

2016-04-21 Thread Daniele Di Proietto

Thanks for fixing this

Acked-by: Daniele Di Proietto 

On 21/04/2016 14:10, "Joe Stringer"  wrote:

>This was missed before somehow, which would cause the test to fail
>(rather than being skipped) if iproute2 didn't support setting the
>vxlan dstport on the kernel tunnel device.
>
>Signed-off-by: Joe Stringer 
>---
> tests/system-traffic.at | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>index dceae150d148..10c571769462 100644
>--- a/tests/system-traffic.at
>+++ b/tests/system-traffic.at
>@@ -1495,7 +1495,7 @@ OVS_TRAFFIC_VSWITCHD_STOP
> AT_CLEANUP
> 
> AT_SETUP([conntrack - IPv6 Fragmentation over vxlan])
>-AT_SKIP_IF([! ip link help 2>&1 | grep vxlan >/dev/null])
>+OVS_CHECK_VXLAN()
> CHECK_CONNTRACK()
> 
> OVS_TRAFFIC_VSWITCHD_START()
>-- 
>2.1.4
>


Re: [ovs-dev] [PATCH] datapath-windows: remove OvsAllocateForwardingContextForNBL

2016-04-21 Thread Ben Pfaff

On Fri, Apr 15, 2016 at 07:05:07AM -0700, Nithin Raju wrote:
> dead code.
> 
> Signed-off-by: Nithin Raju 

Applied, thanks!

Re: [ovs-dev] [PATCH RFC] ovn: Add support for DSCP marking

2016-04-21 Thread Ben Pfaff

On Fri, Apr 15, 2016 at 04:30:41PM +0530, bscha...@redhat.com wrote:
> From: Babu Shanmugam 
> 
> Added an additional option 'dscp_code' for VMI Logical_Ports in addition to the
> ingress_policing_rate and burst in the OVN Northbound database.
> 
> Also in the controller, replaced the earlier approach of setting the rate and
> burst parameters in the Interface table with the Port table's qos parameter
> (using the default queue). In this patch, 'linux-htb' is used as a
> fixed QoS type.
> 
> Signed-off-by: Babu Shanmugam 

I think that this should be two patches, one for DSCP, one for the QoS
changes.

There are some style issues, such as lack of {} around single statements.

The DSCP feature seems fine to me.

Converting policing into shaping changes the semantics.  If these are
superior semantics (which does seem likely) then that's fine but it
needs documentation and rationale.

Re: [ovs-dev] [PATCH] datapath-windows: Add ICMP types in NetProto.h

2016-04-21 Thread Ben Pfaff

On Thu, Apr 14, 2016 at 01:22:40PM -0700, Sairam Venugopal wrote:
> Update NetProto.h to include ICMP and ICMPv6 types. Update ICMP header to
> keep it consistent with KVM. Add UDP and ICMP min length definitions.
> 
> Signed-off-by: Sairam Venugopal 

Applied, thanks!

Re: [ovs-dev] [PATCH] datapath-windows: Refactor Conntrack Module in Hyper-V

2016-04-21 Thread Ben Pfaff

On Thu, Apr 14, 2016 at 12:07:11PM -0700, Sairam Venugopal wrote:
> Minor refactors around naming and reusability in lieu of adding support for
> other protocols for tracking connections.
> 
> Signed-off-by: Sairam Venugopal 

Applied, thanks!

Re: [ovs-dev] [ovs-dev,v15,1/6] More updates to ovn test output

2016-04-21 Thread Ben Pfaff

On Thu, Apr 14, 2016 at 08:27:05AM -0500, Ryan Moats wrote:
> From: RYAN D. MOATS 
> 
> Adding more detail that helps find what went wrong.
> 
> Signed-off-by: RYAN D. MOATS 

Applied, thanks!

Re: [ovs-dev] Introduce OVSDB readme markdown

2016-04-21 Thread Ben Pfaff

On Thu, Apr 21, 2016 at 06:20:19PM -0500, Ryan Moats wrote:
> Ben Pfaff  wrote on 04/21/2016 01:49:15 PM:
> 
> > From: Ben Pfaff 
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: dev@openvswitch.org
> > Date: 04/21/2016 01:49 PM
> > Subject: Re: [ovs-dev] Introduce OVSDB readme markdown
> >
> > On Wed, Apr 13, 2016 at 12:00:01PM -0500, Ryan Moats wrote:
> > > From: RYAN D. MOATS 
> > >
> > > Provide a point to start collecting documentation on OVSDB
> > > and seed it with experiences from making use of change
> > > tracking.
> > >
> > > Signed-off-by: RYAN D. MOATS 
> >
> > Needs to be added to EXTRA_DIST, otherwise it breaks the build:
> >
> > The following files are in git but not the distribution:
> > ovsdb/README.md
> > Makefile:5286: recipe for target 'dist-hook-git' failed
> >
> > Do you think that it's better to put this in a separate file, or do you
> > think that it might be more useful in lib/ovsdb-idl.h?
> 
> I wasn't really sure where to put it, so lib/ovsdb-idl.h
> looks as good a place as any.  I'll change the patch set
> around to move the content over there for v2.

In the meantime, I've whitelisted you at my email provider, so there's a
chance that I'll actually receive v2 in the conventional way!

[ovs-dev] [PATCH] Add change tracking documentation

2016-04-21 Thread Ryan Moats

From: RYAN D. MOATS 

Change tracking is a bit different from what someone with
"classic" database experience might expect, so let's add
the knowledge gained from the experience of making change
tracking work for incremental processing.

Signed-off-by: RYAN D. MOATS 
---
 lib/ovsdb-idl.h |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/lib/ovsdb-idl.h b/lib/ovsdb-idl.h
index bad2dc6..538602c 100644
--- a/lib/ovsdb-idl.h
+++ b/lib/ovsdb-idl.h
@@ -113,7 +113,15 @@ void ovsdb_idl_add_table(struct ovsdb_idl *,
 void ovsdb_idl_omit(struct ovsdb_idl *, const struct ovsdb_idl_column *);
 void ovsdb_idl_omit_alert(struct ovsdb_idl *, const struct ovsdb_idl_column *);
 
-/* Change tracking. */
+/* Change tracking.  In OVSDB, change tracking is applied at each client in
+ * the IDL layer.  This means that when a client makes a request to track
+ * changes on a particular table, they are essentially requesting 
+ * information about the incremental changes to that table from the point in
+ * time that the request is made.  Once the client clears tracked changes,
+ * that information will no longer be available.  The implication of this
+ * is that it is not a simple matter for code to "replay" changes from the
+ * past.  Rather, code should be structured with a path for processing the
+ * full table as well as a path that processes incremental changes.  */
 enum ovsdb_idl_change {
 OVSDB_IDL_CHANGE_INSERT,
 OVSDB_IDL_CHANGE_MODIFY,
-- 
1.7.1
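
The two-path structure recommended in the new comment (one path that processes the full table, one that processes only tracked changes) can be illustrated with a toy model. This is a minimal sketch that does not use the OVSDB IDL at all: `struct table`, `struct consumer` and every function below are hypothetical, and the "processing" is just keeping a running sum of row values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ROWS 8

/* Toy table with per-row change flags; not the OVSDB IDL. */
struct row {
    int value;
    bool changed;       /* Set only for rows added while tracking is on. */
};

struct table {
    struct row rows[MAX_ROWS];
    size_t n;
    bool tracking;      /* Changes are recorded only from the moment this
                         * is turned on. */
};

struct consumer {
    int total;          /* Derived state: sum of all row values. */
    bool synced;        /* False until the full path has run once. */
};

static void
table_insert(struct table *t, int value)
{
    assert(t->n < MAX_ROWS);
    t->rows[t->n].value = value;
    t->rows[t->n].changed = t->tracking;
    t->n++;
}

/* Forgets tracked changes; after this they cannot be replayed. */
static void
table_clear_tracked(struct table *t)
{
    for (size_t i = 0; i < t->n; i++) {
        t->rows[i].changed = false;
    }
}

static void
consumer_run(struct consumer *c, struct table *t)
{
    if (!c->synced) {
        /* Full path: rows that predate tracking are visible only here. */
        c->total = 0;
        for (size_t i = 0; i < t->n; i++) {
            c->total += t->rows[i].value;
        }
        c->synced = true;
    } else {
        /* Incremental path: only rows changed since the last clear. */
        for (size_t i = 0; i < t->n; i++) {
            if (t->rows[i].changed) {
                c->total += t->rows[i].value;
            }
        }
    }
    table_clear_tracked(t);
}
```

In this model a row inserted before tracking was enabled never shows up as a tracked change, which is why the consumer needs the full path at least once before it can rely on the incremental one.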


Re: [ovs-dev] Introduce OVSDB readme markdown

2016-04-21 Thread Ryan Moats

Ben Pfaff  wrote on 04/21/2016 01:49:15 PM:

> From: Ben Pfaff 
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: dev@openvswitch.org
> Date: 04/21/2016 01:49 PM
> Subject: Re: [ovs-dev] Introduce OVSDB readme markdown
>
> On Wed, Apr 13, 2016 at 12:00:01PM -0500, Ryan Moats wrote:
> > From: RYAN D. MOATS 
> >
> > Provide a point to start collecting documentation on OVSDB
> > and seed it with experiences from making use of change
> > tracking.
> >
> > Signed-off-by: RYAN D. MOATS 
>
> Needs to be added to EXTRA_DIST, otherwise it breaks the build:
>
> The following files are in git but not the distribution:
> ovsdb/README.md
> Makefile:5286: recipe for target 'dist-hook-git' failed
>
> Do you think that it's better to put this in a separate file, or do you
> think that it might be more useful in lib/ovsdb-idl.h?

I wasn't really sure where to put it, so lib/ovsdb-idl.h
looks as good a place as any.  I'll change the patch set
around to move the content over there for v2.

Ryan

Re: [ovs-dev] [PATCH v8 07/16] hmap: Add HMAP_FOR_EACH_POP.

2016-04-21 Thread Daniele Di Proietto



On 21/04/2016 14:54, "Ben Pfaff"  wrote:

>On Thu, Apr 21, 2016 at 09:41:03PM +, Daniele Di Proietto wrote:
>> 
>> 
>> On 21/04/2016 11:28, "Ben Pfaff"  wrote:
>> 
>> >On Tue, Apr 19, 2016 at 03:28:39PM -0700, Daniele Di Proietto wrote:
>> >> Makes popping each member of the hmap a bit easier.
>> >> 
>> >> Signed-off-by: Daniele Di Proietto 
>> >
>> >It's unfortunately quite expensive, though: O(n**2) in the number of
>> >buckets in the hmap, as opposed to O(n) for HMAP_FOR_EACH_SAFE.
>> 
>> You're right, I didn't realize that hmap_first() is O(n) in the number of
>> buckets. Apologies for this oversight and thanks for noticing it.
>> 
>> How about this instead?
>> 
>> ---8<---
>> static inline struct hmap_node *
>> hmap_pop_helper__(struct hmap *hmap, size_t *bucket) {
>> 
>> for (; *bucket <= hmap->mask; (*bucket)++) {
>> struct hmap_node *node = hmap->buckets[*bucket];
>> 
>> if (node) {
>> hmap_remove(hmap, node);
>> return node;
>> }
>> }
>> 
>> return NULL;
>> }
>> 
>> #define HMAP_FOR_EACH_POP(NODE, MEMBER, HMAP) \
>> for (size_t bucket__ = 0;   \
>>  (INIT_CONTAINER(NODE, hmap_pop_helper__(HMAP, &bucket__), MEMBER), \
>>   false) \
>>  || (NODE != OBJECT_CONTAINING(NULL, NODE, MEMBER)) || (NODE = NULL);)
>> ---8<---
>> 
>> I wanted to introduce this because I found that sometimes having a
>> "next" local variable is too verbose, but if you don't think it's
>> worth it I can drop this patch.
>
>Much better, thanks.
>
>You can write "(a, false) || b || c" as "a, b || c" though.

Right, I'll fold this in, thanks.


Re: [ovs-dev] [PATCH v8 07/16] hmap: Add HMAP_FOR_EACH_POP.

2016-04-21 Thread Ben Pfaff

On Thu, Apr 21, 2016 at 09:41:03PM +, Daniele Di Proietto wrote:
> 
> 
> On 21/04/2016 11:28, "Ben Pfaff"  wrote:
> 
> >On Tue, Apr 19, 2016 at 03:28:39PM -0700, Daniele Di Proietto wrote:
> >> Makes popping each member of the hmap a bit easier.
> >> 
> >> Signed-off-by: Daniele Di Proietto 
> >
> >It's unfortunately quite expensive, though: O(n**2) in the number of
> >buckets in the hmap, as opposed to O(n) for HMAP_FOR_EACH_SAFE.
> 
> You're right, I didn't realize that hmap_first() is O(n) in the number of
> buckets. Apologies for this oversight and thanks for noticing it.
> 
> How about this instead?
> 
> ---8<---
> static inline struct hmap_node *
> hmap_pop_helper__(struct hmap *hmap, size_t *bucket) {
> 
> for (; *bucket <= hmap->mask; (*bucket)++) {
> struct hmap_node *node = hmap->buckets[*bucket];
> 
> if (node) {
> hmap_remove(hmap, node);
> return node;
> }
> }
> 
> return NULL;
> }
> 
> #define HMAP_FOR_EACH_POP(NODE, MEMBER, HMAP) \
> for (size_t bucket__ = 0;   \
>  (INIT_CONTAINER(NODE, hmap_pop_helper__(HMAP, &bucket__), MEMBER), \
>   false) \
>  || (NODE != OBJECT_CONTAINING(NULL, NODE, MEMBER)) || (NODE = NULL);)
> ---8<---
> 
> I wanted to introduce this because I found that sometimes having a
> "next" local variable is too verbose, but if you don't think it's
> worth it I can drop this patch.

Much better, thanks.

You can write "(a, false) || b || c" as "a, b || c" though.
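
The complexity point in this thread can be seen in a stripped-down model. The sketch below is not lib/hmap.h: `struct map`, `map_pop()` and `MAP_FOR_EACH_POP` are hypothetical and the table has a fixed size. What it shares with the proposal is that the loop carries the bucket index between pops, so a full drain scans each bucket once, instead of rescanning from bucket 0 on every pop as a loop built on a "first node" lookup would.

```c
#include <assert.h>
#include <stddef.h>

#define N_BUCKETS 4     /* Must be a power of two. */

struct node {
    struct node *next;
    int key;
};

struct map {
    struct node *buckets[N_BUCKETS];
    size_t n;
};

static void
map_insert(struct map *m, struct node *node, int key)
{
    size_t b = (size_t) key & (N_BUCKETS - 1);

    node->key = key;
    node->next = m->buckets[b];
    m->buckets[b] = node;
    m->n++;
}

/* Removes and returns some node, or NULL if the map is empty.  '*bucket'
 * remembers where the previous scan stopped, so draining the whole map
 * costs O(buckets + nodes) in total. */
static struct node *
map_pop(struct map *m, size_t *bucket)
{
    for (; *bucket < N_BUCKETS; (*bucket)++) {
        struct node *node = m->buckets[*bucket];

        if (node) {
            m->buckets[*bucket] = node->next;
            m->n--;
            return node;
        }
    }
    return NULL;
}

#define MAP_FOR_EACH_POP(NODE, MAP) \
    for (size_t bucket__ = 0; ((NODE) = map_pop(MAP, &bucket__)) != NULL; )

/* Drains the map, returning the sum of the keys that were popped. */
static int
map_drain_sum(struct map *m)
{
    struct node *node;
    int sum = 0;

    MAP_FOR_EACH_POP (node, m) {
        sum += node->key;
    }
    return sum;
}
```

Because each popped node is already unlinked when the loop body runs, the body can free it without needing a separate "next" variable, which was the convenience the original macro was after.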

Re: [ovs-dev] [PATCH] netdev-linux: Fix ingress policing burst rate configuration via tc

2016-04-21 Thread Ben Pfaff

On Thu, Apr 14, 2016 at 11:51:44AM +0200, Miguel Angel Ajo wrote:
> The tc_police structure was filled with a value calculated in bits
> where bytes were expected. This caused the configured burst value
> to be 8 times higher than intended.
> 
> Documentation and defaults have been corrected accordingly to minimize
> nuisances on users sticking to the defaults.
> 
> The suggested burst value is now 80% of policing rate to make sure
> TCP works correctly.
> 
> Signed-off-by: Miguel Angel Ajo 
> Tested-by: Miguel Angel Ajo 

Thanks.  I applied this to master and branch-2.5.
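
The unit mix-up described in the commit message comes down to a factor of 8. The sketch below is only an illustration with assumed conventions: the helper names are hypothetical, a kilobit is taken to be 1000 bits, and the 80% rule is applied directly to the rate figure as the commit message suggests. None of this is the netdev-linux code.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a burst size configured in kilobits into the byte count that
 * a bytes-based policer field expects.  Assumes 1 kilobit = 1000 bits. */
static uint64_t
burst_kbits_to_bytes(uint32_t kbits)
{
    return (uint64_t) kbits * 1000 / 8;
}

/* The mistaken conversion: the value is left in bits, so a field that is
 * read as bytes ends up 8 times larger than intended. */
static uint64_t
burst_kbits_to_bits(uint32_t kbits)
{
    return (uint64_t) kbits * 1000;
}

/* Suggested burst for a given policing rate: 80% of the rate figure. */
static uint32_t
suggested_burst_kbits(uint32_t rate_kbps)
{
    return (uint32_t) ((uint64_t) rate_kbps * 8 / 10);
}
```

For example, with these assumptions a burst of 8000 kbit corresponds to 1,000,000 bytes, while the uncorrected value would have been 8,000,000.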

Re: [ovs-dev] [PATCH V4] ovn-controller: reload configured SB probe timer

2016-04-21 Thread Ben Pfaff

On Thu, Apr 14, 2016 at 12:24:03AM -0700, ngh...@us.ibm.com wrote:
> There are four sessions established from ovn-controller to the following:
> OVN Southbound — JSONRPC based
> Local ovsdb — JSONRPC based
> Local vswitchd — openflow based from ofctrl
> Local vswitchd — openflow based from pinctrl
> 
> All of these sessions have their own probe_interval, and currently one
> [SB] of them can be configured using ovn-vsctl command, but that is not
> effective on the fly —in other words, ovn-controller has to be restarted to
> use that probe_timer value, this patch takes care of that.
> For the local ovsdb connection, probe timer is already disabled. For the last
> two connections, they do not need probe_timer as they are over unix domain
> socket. This patch takes care of that as well.
> 
> This change has been tested putting logs in several places like in
> ovn-controller.c, lib/rconn.c to make sure the right probe_timer
> values are in effect. Also, by making sure from ovn-controller's
> log file that there is no more reconnect happening due to probe
> under heavy load.

I agree with Lei Huang's comments.

Fails to build:

../lib/rconn.c:342:10: error: implicit declaration of function 'stream_or_pstream_needs_probes' is invalid in C99 [-Werror,-Wimplicit-function-declaration]

In addition, please be careful about style issues.  For example, there
should be a space on either side of "=", and a space after each ",".

Author and date should not be part of the commit message:
> Author: Nirapada Ghosh 
> Date:   Wed Mar 30 19:03:10 2016 -0700
> Signed-off-by: Nirapada Ghosh 

> +/* If SB probe timer is changed using ovs-vsctl command, this function
> + * will set that probe timer value for the session.
> + * cfg: Holding the external-id values read from southbound DB.
> + * sb_idl: pointer to the ovs_idl connection to OVN southbound.
> + */
> +static void
> +set_probe_timer_if_changed(const struct ovsrec_open_vswitch *cfg,
> +   const struct ovsdb_idl *sb_idl)
> +{
> +static int probe_int_sb = DEFAULT_PROBE_INTERVAL * 1000;
> +int probe_int_sb_new = probe_int_sb;
> +
> +extract_probe_timer(cfg, "ovn-remote-probe-interval", &probe_int_sb_new);
> +if (probe_int_sb_new != probe_int_sb) {
> +ovsdb_idl_set_probe_interval(sb_idl, probe_int_sb_new);

We generally don't log this kind of thing:

> +VLOG_INFO("OVN SB probe interval changed %d->%d ",
> +  probe_int_sb,
> +  probe_int_sb_new);
> +probe_int_sb = probe_int_sb_new;
> +}
> +}
> +
> +/* Given key_name, the following function retrieves probe_timer value from the
> + * configuration passed, this configuration comes from the "external-ids"
> + * which were configured via ovs-vsctl command.
> + *
> + * cfg: Holding the external-id values read from NB database.
> + * keyname: Name to extract the value for.
> + * ret_value_ptr: Pointer to integer location where the value read
> + * should be copied.
> + * The function returns true if success, keeps the original
> + * value of ret_value_ptr intact in case of a failure.
> + */
> +static bool
> +extract_probe_timer(const struct ovsrec_open_vswitch *cfg,
> +char *key_name,
> +int *ret_value_ptr)
> +{
> +const char *probe_interval= smap_get(&cfg->external_ids, key_name);

This comment seems unnecessary:

> +int ret_value_temp=0; /* Temporary location to hold the value, in case of
> +   * failure, str_to_int() sets the ret_value to 0,
> +   * which is a valid value of probe_timer. */
> +if ((!probe_interval) ||
> +(!str_to_int(probe_interval, 10, &ret_value_temp)))  {
> +VLOG_WARN("OVN OVSDB invalid remote probe interval:%s for %s",
> + probe_interval, key_name);
> +return false;
> +}
> +*ret_value_ptr = ret_value_temp;
> +return true;
> +}

Thanks,

Ben.
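
For reference, the "parse into a temporary, copy only on success" pattern
that the quoted extract_probe_timer() relies on can be sketched with plain
libc as below. This is only an illustration: parse_int_keep_on_failure()
is a hypothetical name and uses strtol(), not the OVS str_to_int() helper.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parses 's' as a base-10 int.  On success stores
 * the result in '*valuep' and returns true.  On failure returns false
 * and leaves '*valuep' untouched, so a previously configured value
 * (including a valid 0) is never clobbered. */
static bool
parse_int_keep_on_failure(const char *s, int *valuep)
{
    char *end;
    long parsed;

    if (!s || !*s) {
        return false;
    }

    errno = 0;
    parsed = strtol(s, &end, 10);
    if (errno || *end != '\0' || parsed < INT_MIN || parsed > INT_MAX) {
        return false;
    }

    *valuep = (int) parsed;
    return true;
}
```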

Re: [ovs-dev] [PATCH 4/4] classifier: Avoid inserting duplicates to cmaps.

2016-04-21 Thread Ben Pfaff

On Wed, Apr 13, 2016 at 07:06:46PM -0700, Jarno Rajahalme wrote:
> Staged lookup indices assumed that cmap is efficient at dealing with
> duplicates.  Duplicates are implemented as linked lists, however,
> causing removal of rules to become O(n^2) and highly cache-inefficient
> with a large number of duplicates.
> 
> This was problematic especially when many rules shared the same match
> in packet metadata (e.g., a port number, but nothing else).
> 
> This patch fixes the problem by introducing a new 'count' variant of
> the cmap (ccmap), which can be efficiently used to keep counts of
> inserted hash values provided by the caller.  This does not require a
> node in the user data structure, so this makes the classifier
> implementation a bit more memory efficient, too.
> 
> Reported-by: Alok Kumar Maurya 
> Signed-off-by: Jarno Rajahalme 

At first I was unhappy that we needed a new data structure, then I
discovered that I like the new data structure.  Thanks for doing this.

This is a lot of new code to write without adding a test!  Can you adapt
test-cmap.c, or write something else, to test ccmap?
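
As a rough illustration of the "count map" idea only (this is not the
ccmap from the patch, which is a concurrent, cache-line-bucketed
structure), a toy version that keeps a count per caller-provided hash
value could look like the sketch below. All names are hypothetical, and
the fixed-size linear scan is an assumption made to keep it short:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 16

struct toy_entry {
    uint32_t hash;
    uint32_t count;             /* 0 means the slot is free. */
};

struct toy_count_map {
    struct toy_entry slots[TOY_SLOTS];
};

/* Returns the number of times 'hash' is currently inserted. */
static uint32_t
toy_count_find(const struct toy_count_map *map, uint32_t hash)
{
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        if (map->slots[i].count && map->slots[i].hash == hash) {
            return map->slots[i].count;
        }
    }
    return 0;
}

/* Records one more instance of 'hash'.  Returns the new count, or 0 if
 * the toy table is full. */
static uint32_t
toy_count_inc(struct toy_count_map *map, uint32_t hash)
{
    struct toy_entry *free_slot = NULL;

    for (size_t i = 0; i < TOY_SLOTS; i++) {
        struct toy_entry *e = &map->slots[i];

        if (e->count && e->hash == hash) {
            return ++e->count;
        } else if (!e->count && !free_slot) {
            free_slot = e;
        }
    }
    if (free_slot) {
        free_slot->hash = hash;
        free_slot->count = 1;
        return 1;
    }
    return 0;
}

/* Removes one instance of 'hash'.  Returns the remaining count. */
static uint32_t
toy_count_dec(struct toy_count_map *map, uint32_t hash)
{
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        struct toy_entry *e = &map->slots[i];

        if (e->count && e->hash == hash) {
            return --e->count;
        }
    }
    return 0;
}
```

Because duplicates only bump a counter, removing one of many identical
hash values does not require walking a list of duplicate nodes.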

My compilers do not like this.  Clang 3.5.0:

../lib/ccmap.c:193:9: error: address argument to atomic operation must be a 
pointer to non-const _Atomic type ('const ccmap_node_t *' (aka 'const 
_Atomic(uint64_t) *') invalid)
../lib/ovs-atomic.h:393:5: note: expanded from macro 'atomic_read_relaxed'
../lib/ovs-atomic-clang.h:53:15: note: expanded from macro 
'atomic_read_explicit'
../lib/ccmap.c:245:9: error: address argument to atomic operation must be a 
pointer to non-const _Atomic type ('const ccmap_node_t *' (aka 'const 
_Atomic(uint64_t) *') invalid)
../lib/ovs-atomic.h:393:5: note: expanded from macro 'atomic_read_relaxed'
../lib/ovs-atomic-clang.h:53:15: note: expanded from macro 
'atomic_read_explicit'
../lib/ccmap.c:544:13: error: address argument to atomic operation must be 
a pointer to non-const _Atomic type ('const ccmap_node_t *' (aka 'const 
_Atomic(uint64_t) *') invalid)
../lib/ovs-atomic.h:393:5: note: expanded from macro 'atomic_read_relaxed'
../lib/ovs-atomic-clang.h:53:15: note: expanded from macro 
'atomic_read_explicit'

Sparse:

../lib/ccmap.c:193:9: warning: incorrect type in argument 1 (different 
modifiers)
../lib/ccmap.c:193:9:expected void *
../lib/ccmap.c:193:9:got unsigned long long const *
../lib/ccmap.c:193:9: warning: incorrect type in argument 1 (different 
modifiers)
../lib/ccmap.c:193:9:expected void *
../lib/ccmap.c:193:9:got unsigned long long const *
../lib/ccmap.c:193:9: warning: incorrect type in argument 1 (different 
modifiers)
../lib/ccmap.c:193:9:expected void *
../lib/ccmap.c:193:9:got unsigned long long const *
../lib/ccmap.c:193:9: warning: incorrect type in argument 1 (different 
modifiers)
../lib/ccmap.c:193:9:expected void *
../lib/ccmap.c:193:9:got unsigned long long const *
../lib/ccmap.c:245:9: warning: incorrect type in argument 1 (different 
modifiers)
../lib/ccmap.c:245:9:expected void *
../lib/ccmap.c:245:9:got unsigned long long const *
../lib/ccmap.c:245:9: warning: incorrect type in argument 1 (different 
modifiers)
../lib/ccmap.c:245:9:expected void *
../lib/ccmap.c:245:9:got unsigned long long const *
../lib/ccmap.c:544:13: warning: incorrect type in argument 1 (different 
modifiers)
../lib/ccmap.c:544:13:expected void *
../lib/ccmap.c:544:13:got unsigned long long const *
../lib/ccmap.c:544:13: warning: incorrect type in argument 1 (different 
modifiers)
../lib/ccmap.c:544:13:expected void *
../lib/ccmap.c:544:13:got unsigned long long const *

These comments and macros are the only mention of "entries", everything
else talks about "nodes", perhaps the names here should be updated.
Also, CCMAP_PADDING is never used and it is always 0 despite the
comment:

/* An entry is an 32-bit hash and a 32-bit count. */
#define CCMAP_ENTRY_SIZE (4 + 4)

/* Number of entries per bucket: 8 */
#define CCMAP_K (CACHE_LINE_SIZE / CCMAP_ENTRY_SIZE)

/* Pad to make a bucket a full cache line in size: 4 on 32-bit, 0 on
 * 64-bit. */
#define CCMAP_PADDING ((CACHE_LINE_SIZE - 4) - (CCMAP_K * CCMAP_ENTRY_SIZE))

There's a lot of explicit "inline" in this file, presumably left over
from cmap.c.  You might be able to drop some of it.

This comment on other_hash() talks about another comment that does not
exist:

/* Given a rehashed value 'hash', returns the other hash for that rehashed
 * value.  This is symmetric: other_hash(other_hash(x)) == x.  (See also "Hash
 * Functions" at the top of this file.) */

Some of the code here retains the style that we used to use where
variables are declared at the top of the block rather than where
needed.  It's nice to modernize when one can, e.g. to declare the loop
index in the for

Re: [ovs-dev] [PATCH v8 07/16] hmap: Add HMAP_FOR_EACH_POP.

2016-04-21 Thread Daniele Di Proietto



On 21/04/2016 11:28, "Ben Pfaff"  wrote:

>On Tue, Apr 19, 2016 at 03:28:39PM -0700, Daniele Di Proietto wrote:
>> Makes popping each member of the hmap a bit easier.
>> 
>> Signed-off-by: Daniele Di Proietto 
>
>It's unfortunately quite expensive, though: O(n**2) in the number of
>buckets in the hmap, as opposed to O(n) for HMAP_FOR_EACH_SAFE.

You're right, I didn't realize that hmap_first() is O(n) in the number of
buckets. Apologies for this oversight and thanks for noticing it.

How about this instead?

---8<---
static inline struct hmap_node *
hmap_pop_helper__(struct hmap *hmap, size_t *bucket) {

    for (; *bucket <= hmap->mask; (*bucket)++) {
        struct hmap_node *node = hmap->buckets[*bucket];

        if (node) {
            hmap_remove(hmap, node);
            return node;
        }
    }

    return NULL;
}

#define HMAP_FOR_EACH_POP(NODE, MEMBER, HMAP) \
    for (size_t bucket__ = 0; \
         (INIT_CONTAINER(NODE, hmap_pop_helper__(HMAP, &bucket__), MEMBER), \
          false) \
         || (NODE != OBJECT_CONTAINING(NULL, NODE, MEMBER)) \
         || (NODE = NULL);)
---8<---

I wanted to introduce this because I found that sometimes having a
"next" local variable is too verbose, but if you don't think it's
worth it I can drop this patch.

Thanks,

Daniele
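
The bucket-cursor idea above can be tried outside of OVS with the
self-contained sketch below. The toy_* types and names are hypothetical
stand-ins for struct hmap, not the real API. The point is that the scan
resumes at '*bucket' instead of restarting from bucket 0 on every pop, so
draining the whole map visits each bucket once.

```c
#include <stddef.h>

struct toy_node {
    struct toy_node *next;
    int value;
};

struct toy_map {
    struct toy_node **buckets;
    size_t mask;                /* Number of buckets minus one. */
};

/* Unlinks and returns one node, resuming the bucket scan at '*bucket'.
 * Returns NULL once every bucket from '*bucket' onward is empty. */
static struct toy_node *
toy_map_pop(struct toy_map *map, size_t *bucket)
{
    for (; *bucket <= map->mask; (*bucket)++) {
        struct toy_node *node = map->buckets[*bucket];

        if (node) {
            map->buckets[*bucket] = node->next;
            node->next = NULL;
            return node;
        }
    }

    return NULL;
}
```

A caller would initialize a size_t cursor to 0 and call toy_map_pop() in a
loop until it returns NULL.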


[ovs-dev] [PATCH] system-traffic: Fix IPv6 frag vxlan check.

2016-04-21 Thread Joe Stringer

This was missed before somehow, which would cause the test to fail
(rather than being skipped) if iproute2 didn't support setting the
vxlan dstport on the kernel tunnel device.

Signed-off-by: Joe Stringer 
---
 tests/system-traffic.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index dceae150d148..10c571769462 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -1495,7 +1495,7 @@ OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
 AT_SETUP([conntrack - IPv6 Fragmentation over vxlan])
-AT_SKIP_IF([! ip link help 2>&1 | grep vxlan >/dev/null])
+OVS_CHECK_VXLAN()
 CHECK_CONNTRACK()
 
 OVS_TRAFFIC_VSWITCHD_START()
-- 
2.1.4


[ovs-dev] [PATCH 2/5] compat: nf_defrag_ipv6: avoid/free clone operations.

2016-04-21 Thread Joe Stringer

Upstream commit:
netfilter: ipv6: nf_defrag: avoid/free clone operations

commit 6aafeef03b9d9ecf
("netfilter: push reasm skb through instead of original frag skbs")
changed ipv6 defrag to not use the original skbs anymore.

So rather than keeping the original skbs around just to discard them
afterwards just use the original skbs directly for the fraglist of
the newly assembled skb and remove the extra clone/free operations.

The skb that completes the fragment queue is morphed into the
reassembled one instead, just like ipv4 defrag.

openvswitch doesn't need any additional skb_morph magic anymore to deal
with this situation so just remove that.

A followup patch can then also remove the NF_HOOK (re)invocation in
the ipv6 netfilter defrag hook.

Cc: Joe Stringer 
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 

Upstream: 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free clone 
operations")
Signed-off-by: Joe Stringer 
---
 datapath/conntrack.c   |  14 ---
 .../include/net/netfilter/ipv6/nf_defrag_ipv6.h|  12 +--
 datapath/linux/compat/nf_conntrack_reasm.c | 104 -
 3 files changed, 46 insertions(+), 84 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index 82db51567313..f721a8920678 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -337,21 +337,7 @@ static int handle_fragments(struct net *net, struct 
sw_flow_key *key,
if (!reasm)
return -EINPROGRESS;
 
-   if (skb == reasm) {
-   kfree_skb(skb);
-   return -EINVAL;
-   }
-
-   /* Don't free 'skb' even though it is one of the original
-* fragments, as we're going to morph it into the head.
-*/
-   skb_get(skb);
-   nf_ct_frag6_consume_orig(reasm);
-
key->ip.proto = ipv6_hdr(reasm)->nexthdr;
-   skb_morph(skb, reasm);
-   skb->next = reasm->next;
-   consume_skb(reasm);
ovs_cb.dp_cb.mru = IP6CB(skb)->frag_max_size;
 #endif /* IP frag support */
} else {
diff --git a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h 
b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
index fe99ced37227..a3b86dab2c9c 100644
--- a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
+++ b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
@@ -16,17 +16,17 @@
 #define OVS_NF_DEFRAG6_BACKPORT 1
 struct sk_buff *rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb,
   u32 user);
+#define nf_ct_frag6_gather rpl_nf_ct_frag6_gather
+#endif /* HAVE_NF_CT_FRAG6_CONSUME_ORIG */
+
+#ifdef OVS_NF_DEFRAG6_BACKPORT
 int __init rpl_nf_ct_frag6_init(void);
 void rpl_nf_ct_frag6_cleanup(void);
-void rpl_nf_ct_frag6_consume_orig(struct sk_buff *skb);
-#define nf_ct_frag6_gather rpl_nf_ct_frag6_gather
-#else /* HAVE_NF_CT_FRAG6_CONSUME_ORIG */
+#else /* !OVS_NF_DEFRAG6_BACKPORT */
 static inline int __init rpl_nf_ct_frag6_init(void) { return 0; }
 static inline void rpl_nf_ct_frag6_cleanup(void) { }
-static inline void rpl_nf_ct_frag6_consume_orig(struct sk_buff *skb) { }
-#endif /* HAVE_NF_CT_FRAG6_CONSUME_ORIG */
+#endif /* OVS_NF_DEFRAG6_BACKPORT */
 #define nf_ct_frag6_init rpl_nf_ct_frag6_init
 #define nf_ct_frag6_cleanup rpl_nf_ct_frag6_cleanup
-#define nf_ct_frag6_consume_orig rpl_nf_ct_frag6_consume_orig
 
 #endif /* __NF_DEFRAG_IPV6_WRAPPER_H */
diff --git a/datapath/linux/compat/nf_conntrack_reasm.c 
b/datapath/linux/compat/nf_conntrack_reasm.c
index 701bd15d8efd..c6dc7ebec5b5 100644
--- a/datapath/linux/compat/nf_conntrack_reasm.c
+++ b/datapath/linux/compat/nf_conntrack_reasm.c
@@ -62,7 +62,6 @@ struct nf_ct_frag6_skb_cb
 {
struct inet6_skb_parm   h;
int offset;
-   struct sk_buff  *orig;
 };
 
 #define NFCT_FRAG6_CB(skb) ((struct nf_ct_frag6_skb_cb*)((skb)->cb))
@@ -94,12 +93,6 @@ static unsigned int nf_hashfn(struct inet_frag_queue *q)
return nf_hash_frag(nq->id, &nq->saddr, &nq->daddr);
 }
 
-static void nf_skb_free(struct sk_buff *skb)
-{
-   if (NFCT_FRAG6_CB(skb)->orig)
-   kfree_skb(NFCT_FRAG6_CB(skb)->orig);
-}
-
 static void nf_ct_frag6_expire(unsigned long data)
 {
struct frag_queue *fq;
@@ -300,9 +293,9 @@ err:
  * the last and the first frames arrived and all the bits are here.
  */
 static struct sk_buff *
-nf_ct_frag6_reasm(struct frag_queue *fq, struct net_device *dev)
+nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev,  struct 
net_device *dev)
 {
-   struct sk_buff *fp, *op, *head = fq->q.fragments;
+   struct sk_buff *fp, *head = fq->q.fragments;
int

[ovs-dev] [PATCH 1/5] compat: ipv6: Pass struct net into nf_ct_frag6_gather.

2016-04-21 Thread Joe Stringer

Upstream commit:
ipv6: Pass struct net into nf_ct_frag6_gather

The function nf_ct_frag6_gather is called on both the input and the
output paths of the networking stack.  In particular ipv6_defrag which
calls nf_ct_frag6_gather is called from both the PRE_ROUTING chain
on input and the LOCAL_OUT chain on output.

The addition of a net parameter makes it explicit which network
namespace the packets are being reassembled in, and removes the need
for nf_ct_frag6_gather to guess.

Signed-off-by: "Eric W. Biederman" 
Acked-by: Pablo Neira Ayuso 
Signed-off-by: David S. Miller 

Upstream: b72775977c39 ("ipv6: Pass struct net into nf_ct_frag6_gather")
Signed-off-by: Joe Stringer 
---
 datapath/conntrack.c  | 2 +-
 datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h | 3 ++-
 datapath/linux/compat/nf_conntrack_reasm.c| 5 ++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index 0338f9f0c930..82db51567313 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -333,7 +333,7 @@ static int handle_fragments(struct net *net, struct 
sw_flow_key *key,
struct sk_buff *reasm;
 
memset(IP6CB(skb), 0, sizeof(struct inet6_skb_parm));
-   reasm = nf_ct_frag6_gather(skb, user);
+   reasm = nf_ct_frag6_gather(net, skb, user);
if (!reasm)
return -EINPROGRESS;
 
diff --git a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h 
b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
index 416cafff03b1..fe99ced37227 100644
--- a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
+++ b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
@@ -14,7 +14,8 @@
 #if defined(HAVE_NF_CT_FRAG6_CONSUME_ORIG) || \
 defined(HAVE_NF_CT_FRAG6_OUTPUT)
 #define OVS_NF_DEFRAG6_BACKPORT 1
-struct sk_buff *rpl_nf_ct_frag6_gather(struct sk_buff *skb, u32 user);
+struct sk_buff *rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb,
+  u32 user);
 int __init rpl_nf_ct_frag6_init(void);
 void rpl_nf_ct_frag6_cleanup(void);
 void rpl_nf_ct_frag6_consume_orig(struct sk_buff *skb);
diff --git a/datapath/linux/compat/nf_conntrack_reasm.c 
b/datapath/linux/compat/nf_conntrack_reasm.c
index ef29115b6fbd..701bd15d8efd 100644
--- a/datapath/linux/compat/nf_conntrack_reasm.c
+++ b/datapath/linux/compat/nf_conntrack_reasm.c
@@ -487,12 +487,11 @@ find_prev_fhdr(struct sk_buff *skb, u8 *prevhdrp, int 
*prevhoff, int *fhoff)
return 0;
 }
 
-struct sk_buff *rpl_nf_ct_frag6_gather(struct sk_buff *skb, u32 user)
+struct sk_buff *rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb,
+  u32 user)
 {
struct sk_buff *clone;
struct net_device *dev = skb->dev;
-   struct net *net = skb_dst(skb) ? dev_net(skb_dst(skb)->dev)
-  : dev_net(skb->dev);
struct frag_hdr *fhdr;
struct frag_queue *fq;
struct ipv6hdr *hdr;
-- 
2.1.4


[ovs-dev] [PATCH 3/5] compat: nf_defrag_ipv6: avoid nf_iterate recursion.

2016-04-21 Thread Joe Stringer

Upstream commit:
netfilter: ipv6: avoid nf_iterate recursion

The previous patch changed nf_ct_frag6_gather() to morph reassembled skb
with the previous one.

This means that the return value is always NULL or the skb argument.
So change it to an err value.

Instead of invoking NF_HOOK recursively with threshold to skip
already-called hooks we can now just return NF_ACCEPT to move on to the
next hook except for -EINPROGRESS (which means skb has been queued for
reassembly), in which case we return NF_STOLEN.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 

Upstream: daaa7d647f81 ("netfilter: ipv6: avoid nf_iterate recursion")
Signed-off-by: Joe Stringer 
---
 datapath/conntrack.c   | 11 ++--
 .../include/net/netfilter/ipv6/nf_defrag_ipv6.h|  3 +-
 datapath/linux/compat/nf_conntrack_reasm.c | 72 ++
 3 files changed, 37 insertions(+), 49 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index f721a8920678..0491e76c776c 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -311,6 +311,7 @@ static int handle_fragments(struct net *net, struct 
sw_flow_key *key,
u16 zone, struct sk_buff *skb)
 {
struct ovs_gso_cb ovs_cb = *OVS_GSO_CB(skb);
+   int err;
 
if (!skb->dev) {
OVS_NLERR(true, "%s: skb has no dev; dropping", __func__);
@@ -319,7 +320,6 @@ static int handle_fragments(struct net *net, struct 
sw_flow_key *key,
 
if (key->eth.type == htons(ETH_P_IP)) {
enum ip_defrag_users user = IP_DEFRAG_CONNTRACK_IN + zone;
-   int err;
 
memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
err = ip_defrag(skb, user);
@@ -330,14 +330,13 @@ static int handle_fragments(struct net *net, struct 
sw_flow_key *key,
 #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
} else if (key->eth.type == htons(ETH_P_IPV6)) {
enum ip6_defrag_users user = IP6_DEFRAG_CONNTRACK_IN + zone;
-   struct sk_buff *reasm;
 
memset(IP6CB(skb), 0, sizeof(struct inet6_skb_parm));
-   reasm = nf_ct_frag6_gather(net, skb, user);
-   if (!reasm)
-   return -EINPROGRESS;
+   err = nf_ct_frag6_gather(net, skb, user);
+   if (err)
+   return err;
 
-   key->ip.proto = ipv6_hdr(reasm)->nexthdr;
+   key->ip.proto = ipv6_hdr(skb)->nexthdr;
ovs_cb.dp_cb.mru = IP6CB(skb)->frag_max_size;
 #endif /* IP frag support */
} else {
diff --git a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h 
b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
index a3b86dab2c9c..dc440db99924 100644
--- a/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
+++ b/datapath/linux/compat/include/net/netfilter/ipv6/nf_defrag_ipv6.h
@@ -14,8 +14,7 @@
 #if defined(HAVE_NF_CT_FRAG6_CONSUME_ORIG) || \
 defined(HAVE_NF_CT_FRAG6_OUTPUT)
 #define OVS_NF_DEFRAG6_BACKPORT 1
-struct sk_buff *rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb,
-  u32 user);
+int rpl_nf_ct_frag6_gather(struct net *net, struct sk_buff *skb, u32 user);
 #define nf_ct_frag6_gather rpl_nf_ct_frag6_gather
 #endif /* HAVE_NF_CT_FRAG6_CONSUME_ORIG */
 
diff --git a/datapath/linux/compat/nf_conntrack_reasm.c 
b/datapath/linux/compat/nf_conntrack_reasm.c
index c6dc7ebec5b5..31c47b487356 100644
--- a/datapath/linux/compat/nf_conntrack_reasm.c
+++ b/datapath/linux/compat/nf_conntrack_reasm.c
@@ -285,14 +285,15 @@ err:
 
 /*
  * Check if this packet is complete.
- * Returns NULL on failure by any reason, and pointer
- * to current nexthdr field in reassembled frame.
  *
  * It is called with locked fq, and caller must check that
  * queue is eligible for reassembly i.e. it is not COMPLETE,
  * the last and the first frames arrived and all the bits are here.
+ *
+ * returns true if *prev skb has been transformed into the reassembled
+ * skb, false otherwise.
  */
-static struct sk_buff *
+static bool
 nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev,  struct 
net_device *dev)
 {
struct sk_buff *fp, *head = fq->q.fragments;
@@ -306,22 +307,21 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff 
*prev,  struct net_devic
 
ecn = ip_frag_ecn_table[fq->ecn];
if (unlikely(ecn == 0xff))
-   goto out_fail;
+   return false;
 
/* Unfragmented part is taken from the first segment. */
payload_len = ((head->data - skb_network_header(head)) -
   sizeof(struct ipv6hdr) + fq->q.len -
   sizeof(struct frag_hdr));
if (payload_len > IPV6_MAXPLEN) {
-   pr_debug("payload len

[ovs-dev] [PATCH 5/5] datapath: Orphan skbs before IPv6 defrag

2016-04-21 Thread Joe Stringer

Upstream commit:
openvswitch: Orphan skbs before IPv6 defrag

This is the IPv6 counterpart to commit 8282f27449bf ("inet: frag: Always
orphan skbs inside ip_defrag()").

Prior to commit 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free
clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be
cloned (implicitly orphaning) prior to queueing for reassembly. As such,
when the IPv6 message is eventually reassembled, the skb->sk for all
fragments would be NULL. After that commit was introduced, rather than
cloning, the original skbs were queued directly without orphaning. The
end result is that all frags except for the first and last may have a
socket attached.

This commit explicitly orphans such skbs during nf_ct_frag6_gather() to
prevent BUG_ON(skb->sk) during a later call to ip6_fragment().

kernel BUG at net/ipv6/ip6_output.c:631!
[...]
Call Trace:
 
 [] ? __lock_acquire+0x927/0x20a0
 [] ? do_output.isra.28+0x1b0/0x1b0 [openvswitch]
 [] ? __lock_is_held+0x52/0x70
 [] ovs_fragment+0x1f7/0x280 [openvswitch]
 [] ? mark_held_locks+0x75/0xa0
 [] ? _raw_spin_unlock_irqrestore+0x36/0x50
 [] ? dst_discard_out+0x20/0x20
 [] ? dst_ifdown+0x80/0x80
 [] do_output.isra.28+0xf3/0x1b0 [openvswitch]
 [] do_execute_actions+0x709/0x12c0 [openvswitch]
 [] ? ovs_flow_stats_update+0x74/0x1e0 [openvswitch]
 [] ? ovs_flow_stats_update+0xa1/0x1e0 [openvswitch]
 [] ? _raw_spin_unlock+0x27/0x40
 [] ovs_execute_actions+0x45/0x120 [openvswitch]
 [] ovs_dp_process_packet+0x85/0x150 [openvswitch]
 [] ? _raw_spin_unlock+0x27/0x40
 [] ovs_execute_actions+0xc4/0x120 [openvswitch]
 [] ovs_dp_process_packet+0x85/0x150 [openvswitch]
 [] ? key_extract+0x442/0xc10 [openvswitch]
 [] ovs_vport_receive+0x5d/0xb0 [openvswitch]
 [] ? __lock_acquire+0x927/0x20a0
 [] ? __lock_acquire+0x927/0x20a0
 [] ? __lock_acquire+0x927/0x20a0
 [] ? _raw_spin_unlock_irqrestore+0x36/0x50
 [] internal_dev_xmit+0x6d/0x150 [openvswitch]
 [] ? internal_dev_xmit+0x5/0x150 [openvswitch]
 [] dev_hard_start_xmit+0x2df/0x660
 [] ? validate_xmit_skb.isra.105.part.106+0x1a/0x2b0
 [] __dev_queue_xmit+0x8f5/0x950
 [] ? __dev_queue_xmit+0x50/0x950
 [] ? mark_held_locks+0x75/0xa0
 [] dev_queue_xmit+0x10/0x20
 [] neigh_resolve_output+0x178/0x220
 [] ? ip6_finish_output2+0x219/0x7b0
 [] ip6_finish_output2+0x219/0x7b0
 [] ? ip6_finish_output2+0x65/0x7b0
 [] ? ip_idents_reserve+0x6b/0x80
 [] ? ip6_fragment+0x93f/0xc50
 [] ip6_fragment+0xba1/0xc50
 [] ? ip6_flush_pending_frames+0x40/0x40
 [] ip6_finish_output+0xcb/0x1d0
 [] ip6_output+0x5f/0x1a0
 [] ? ip6_fragment+0xc50/0xc50
 [] ip6_local_out+0x3d/0x80
 [] ip6_send_skb+0x2f/0xc0
 [] ip6_push_pending_frames+0x4d/0x50
 [] icmpv6_push_pending_frames+0xac/0xe0
 [] icmpv6_echo_reply+0x42e/0x500
 [] icmpv6_rcv+0x4cf/0x580
 [] ip6_input_finish+0x1a7/0x690
 [] ? ip6_input_finish+0x5/0x690
 [] ip6_input+0x30/0xa0
 [] ? ip6_rcv_finish+0x1a0/0x1a0
 [] ip6_rcv_finish+0x4e/0x1a0
 [] ipv6_rcv+0x45f/0x7c0
 [] ? ipv6_rcv+0x36/0x7c0
 [] ? ip6_make_skb+0x1c0/0x1c0
 [] __netif_receive_skb_core+0x229/0xb80
 [] ? mark_held_locks+0x75/0xa0
 [] ? process_backlog+0x6f/0x230
 [] __netif_receive_skb+0x16/0x70
 [] process_backlog+0x78/0x230
 [] ? process_backlog+0xdd/0x230
 [] net_rx_action+0x203/0x480
 [] ? mark_held_locks+0x75/0xa0
 [] __do_softirq+0xde/0x49f
 [] ? ip6_finish_output2+0x228/0x7b0
 [] do_softirq_own_stack+0x1c/0x30
 
 [] do_softirq.part.18+0x3b/0x40
 [] __local_bh_enable_ip+0xb6/0xc0
 [] ip6_finish_output2+0x251/0x7b0
 [] ? ip6_fragment+0xba1/0xc50
 [] ? ip_idents_reserve+0x6b/0x80
 [] ? ip6_fragment+0x93f/0xc50
 [] ip6_fragment+0xba1/0xc50
 [] ? ip6_flush_pending_frames+0x40/0x40
 [] ip6_finish_output+0xcb/0x1d0
 [] ip6_output+0x5f/0x1a0
 [] ? ip6_fragment+0xc50/0xc50
 [] ip6_local_out+0x3d/0x80
 [] ip6_send_skb+0x2f/0xc0
 [] ip6_push_pending_frames+0x4d/0x50
 [] rawv6_sendmsg+0xa28/0xe30
 [] ? inet_sendmsg+0xc7/0x1d0
 [] inet_sendmsg+0x106/0x1d0
 [] ? inet_sendmsg+0x5/0x1d0
 [] sock_sendmsg+0x38/0x50
 [] SYSC_sendto+0xf6/0x170
 [] ? trace_hardirqs_on_thunk+0x1b/0x1d
 [] SyS_sendto+0xe/0x10
 [] entry_SYSCALL_64_fastpath+0x18/0xa8
Code: 06 48 83 3f 00 75 26 48 8b 87 d8 00 00 00 2b 87 d0 00 00 00 48 39 d0 
72 14 8b 87 e4 00 00 00 83 f8 01 75 09 48 83 7f 18 00 74 9a <0f> 0b 41 8b 86 cc 
00 00 00 49 8#
RIP  [] ip6_fragment+0x73a/0xc50
 RSP 

Fixes: 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free clone
operations")
Reported-by: Daniele Di Proietto 
Signed-off-by: Joe Stringer 
Signed-off-by: David S. Miller

[ovs-dev] [PATCH 4/5] compat: nf_defrag_ipv6: fix NULL deref panic.

2016-04-21 Thread Joe Stringer

Upstream commit:
netfilter: ipv6: nf_defrag: fix NULL deref panic

Valdis reports NULL deref in nf_ct_frag6_gather.
Problem is bogus use of skb_queue_walk() -- we miss first skb in the list
since we start with head->next instead of head.

In case the element we're looking for was head->next we won't find
a result and then trip over NULL iter.

(defrag uses plain NULL-terminated list rather than one terminated by
 head-of-list-pointer, which is what skb_queue_walk expects).

Fixes: 029f7f3b8701cc7a ("netfilter: ipv6: nf_defrag: avoid/free clone 
operations")
Reported-by: Valdis Kletnieks 
Tested-by: Valdis Kletnieks 
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 

Upstream: e97ac12859db ("netfilter: ipv6: nf_defrag: fix NULL deref panic")
Signed-off-by: Joe Stringer 
---
 datapath/linux/compat/nf_conntrack_reasm.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/datapath/linux/compat/nf_conntrack_reasm.c 
b/datapath/linux/compat/nf_conntrack_reasm.c
index 31c47b487356..5000351e9664 100644
--- a/datapath/linux/compat/nf_conntrack_reasm.c
+++ b/datapath/linux/compat/nf_conntrack_reasm.c
@@ -365,11 +365,14 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff 
*prev,  struct net_devic
return false;
 
fp->next = prev->next;
-   skb_queue_walk(head, iter) {
-   if (iter->next != prev)
-   continue;
-   iter->next = fp;
-   break;
+
+   iter = head;
+   while (iter) {
+   if (iter->next == prev) {
+   iter->next = fp;
+   break;
+   }
+   iter = iter->next;
}
 
skb_morph(prev, head);
-- 
2.1.4
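
For readers unfamiliar with the list-walk problem this patch fixes, the
replacement loop can be reduced to the standalone sketch below. It uses a
hypothetical 'struct item' instead of struct sk_buff and is only an
illustration of walking a plain NULL-terminated list starting at the head
itself, so that a target sitting at head->next is still found.

```c
#include <stddef.h>

struct item {
    struct item *next;
};

/* Replaces 'target' with 'repl' in the NULL-terminated list that starts
 * at 'head'.  The walk begins at 'head' rather than 'head->next', so the
 * case target == head->next is handled.  Returns 1 if 'target' was found
 * and replaced, 0 otherwise. */
static int
list_replace_item(struct item *head, struct item *target, struct item *repl)
{
    struct item *iter;

    repl->next = target->next;

    for (iter = head; iter; iter = iter->next) {
        if (iter->next == target) {
            iter->next = repl;
            return 1;
        }
    }

    return 0;
}
```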


[ovs-dev] [PATCH 0/5] Backport nf_defrag_ipv6 changes.

2016-04-21 Thread Joe Stringer

This short series addresses some of the netfilter/defrag-related changes
recently upstream and backports the equivalent fixes to our compat code.
The last two patches address bugs introduced by the second patch; I've
left them as separate patches here to mirror the upstream commits.

Tested using kmod tests on Ubuntu 3.13.0-24, 4.2.0-35, and RHEL 3.10.0-327
kernels, plus build tests for vanilla kernels on Travis:
https://travis-ci.org/joestringer/openvswitch/builds/124819321

Joe Stringer (5):
  compat: ipv6: Pass struct net into nf_ct_frag6_gather.
  compat: nf_defrag_ipv6: avoid/free clone operations.
  compat: nf_defrag_ipv6: avoid nf_iterate recursion.
  compat: nf_defrag_ipv6: fix NULL deref panic.
  datapath: Orphan skbs before IPv6 defrag

 datapath/conntrack.c   |  26 +---
 .../include/net/netfilter/ipv6/nf_defrag_ipv6.h|  14 +-
 datapath/linux/compat/nf_conntrack_reasm.c | 172 +
 3 files changed, 83 insertions(+), 129 deletions(-)

-- 
2.1.4


Re: [ovs-dev] [PATCH 3/4] classifier: Remove rare optimization case.

2016-04-21 Thread Ben Pfaff

On Wed, Apr 13, 2016 at 07:06:45PM -0700, Jarno Rajahalme wrote:
> This optimization applied when a staged lookup index would narrow down
> to a single rule, which happens sometimes in simple test cases, but
> presumably less often in more populated flow tables.  The result of
> this optimization allowed a bit more general megaflows, but the bit
> patterns produced were sometimes cryptic.  Finally, a later fix to a
> more important performance problem does not allow for this
> optimization any more, so remove it now.
> 
> Signed-off-by: Jarno Rajahalme 

If it causes problems, we can try harder.

Acked-by: Ben Pfaff 

Re: [ovs-dev] [PATCH 2/4] classifier: Remove logging.

2016-04-21 Thread Ben Pfaff

On Wed, Apr 13, 2016 at 07:06:44PM -0700, Jarno Rajahalme wrote:
> The only vlog line was a leftover from debugging.
> 
> Signed-off-by: Jarno Rajahalme 

Acked-by: Ben Pfaff 

Re: [ovs-dev] [PATCH 1/2] system-traffic: Add basic gre tunnel sanity test.

2016-04-21 Thread Joe Stringer

On 20 April 2016 at 16:07, Joe Stringer  wrote:
> Signed-off-by: Joe Stringer 
> ---


> +dnl Set up tunnel endpoints on OVS outside the namespace and with a native
> +dnl linux device inside the namespace.
> +ADD_OVS_TUNNEL([gre], [br0], [at_gre0], [172.31.1.1], [10.1.1.100/24])
> +ADD_NATIVE_TUNNEL([gretap], [ns_gre0], [at_ns0], [172.31.1.100], 
> [10.1.1.1/24],
> +  [local 172.31.1.1])

The "local" option is optional, I'll probably drop it since it's not necessary.

Re: [ovs-dev] [PATCH net] openvswitch: use flow protocol when recalculating ipv6 checksums

2016-04-21 Thread David Miller

From: Simon Horman 
Date: Thu, 21 Apr 2016 11:49:15 +1000

> When using masked actions the ipv6_proto field of an action
> to set IPv6 fields may be zero rather than the prevailing protocol
> which will result in skipping checksum recalculation.
> 
> This patch resolves the problem by relying on the protocol
> in the flow key rather than that in the set field action.
> 
> Fixes: 83d2b9ba1abc ("net: openvswitch: Support masked set actions.")
> Cc: Jarno Rajahalme 
> Signed-off-by: Simon Horman 

Looks good to me, applied and queued up for -stable.

Thanks Simon.

Re: [ovs-dev] [PATCH 1/4] classifier: Remove redundant index.

2016-04-21 Thread Ben Pfaff

On Wed, Apr 13, 2016 at 07:06:43PM -0700, Jarno Rajahalme wrote:
> The test for figuring out if the last index had the same fields as the
> actual rules map was broken, resulting in keeping an unnecessary
> index around.
> 
> Signed-off-by: Jarno Rajahalme 

Acked-by: Ben Pfaff 

Re: [ovs-dev] Introduce OVSDB readme markdown

2016-04-21 Thread Ben Pfaff

On Wed, Apr 13, 2016 at 12:00:01PM -0500, Ryan Moats wrote:
> From: RYAN D. MOATS 
> 
> Provide a point to start collecting documentation on OVSDB
> and seed it with experiences from making use of change
> tracking.
> 
> Signed-off-by: RYAN D. MOATS 

Needs to be added to EXTRA_DIST, otherwise it breaks the build:

The following files are in git but not the distribution:
ovsdb/README.md
Makefile:5286: recipe for target 'dist-hook-git' failed

Do you think that it's better to put this in a separate file, or do you
think that it might be more useful in lib/ovsdb-idl.h?

Thanks,

Ben.

Re: [ovs-dev] [PATCH v2] Add configurable OpenFlow port name.

2016-04-21 Thread Ben Pfaff

Yamamoto-san, I could really use your opinion here: do you think that
this should be done differently?  If you do, then I will not accept it.
But if you do not feel strongly about it, then I'll start properly
reviewing it.

Thanks,

Ben.

On Mon, Apr 18, 2016 at 07:05:52PM +0800, Xiao Liang wrote:
> On Mon, Apr 18, 2016 at 5:46 PM, Takashi YAMAMOTO  wrote:
> > is there some reason you want to change of_name without re-creating a port?
> > why? (just curious)
> >
> 
> I don't have special use cases, just for convenience.
> 
> >
> > On Mon, Apr 18, 2016 at 4:19 PM, Xiao Liang  wrote:
> >>
> >> By introducing of_name, ovs_name serves as a key of ports which is
> >> shared by ofproto and netdev. It's easier to find and convert ports
> >> back and forth. of_name and kernel_name could be configured (if
> >> supported) independently of each other.
> >>
> >> On Mon, Apr 18, 2016 at 11:43 AM, Takashi YAMAMOTO 
> >> wrote:
> >> > let me explain what netdev-bsd does first.
> >> > on some platforms "tap" interfaces are always named automatically by
> >> > the kernel itself and there's no way to rename them.  say, they will
> >> > always have names like "tap0".
> >> > so if you do "ovs-vsctl add-port br0 foo",
> >> >   ovs_name = "foo"
> >> >   kernel_name = "tap0"
> >> >
> >> > now, you are going to add another name for openflow. let's call it
> >> > of_name.
> >> > eg. "ovs-vsctl add-port br0 foo -- set int foo ofname=wan",
> >> >   of_name = "wan"
> >> >   ovs_name = "foo"
> >> >   kernel_name = "tap0"
> >> >
> >> > while i don't have strong opinions either way,
> >> > i'm not sure why you want to use different names for of_name and
> >> > ovs_name in the first place.  eg. what's wrong with
> >> > "ovs-vsctl add-port br0 wan".  can you explain a little?
> >> >
> >> > On Mon, Apr 18, 2016 at 10:37 AM, Xiao Liang 
> >> > wrote:
> >> >>
> >> >> Hi Ben, Yamamoto-san,
> >> >>
> >> >> A gentle reminder about this thread.  I would like to hear your
> >> >> preference on how to implement this feature.
> >> >>
> >> >> On Mon, Apr 11, 2016 at 11:18 PM, Ben Pfaff  wrote:
> >> >> > On Mon, Apr 11, 2016 at 04:30:04PM +0800, Xiao Liang wrote:
> >> >> >> On Mon, Apr 11, 2016 at 3:42 PM, Takashi Yamamoto
> >> >> >>  wrote:
> >> >> >> > hi,
> >> >> >> >
> >> >> >> > have you considered the opposite way?
> >> >> >> > ie. have an ability to specify the device name.
> >> >> >> >
> >> >> >> > netdev-bsd already has a distinction between "kernel name" and
> >> >> >> > "ovs name".
> >> >> >> >
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> I'm not familiar with netdev-bsd code, but I think this approach
> >> >> >> will make ports more difficult to manage and need much more effort.
> >> >> >
> >> >> > Yamamoto-san: thanks for bringing this up.  I'm going to wait for you
> >> >> > and Xiao to talk this through a bit before continuing review.
> >> >> ___
> >> >> dev mailing list
> >> >> dev@openvswitch.org
> >> >> http://openvswitch.org/mailman/listinfo/dev
> >> >
> >> >
> >
> >

Re: [ovs-dev] [PATCH RFC] dpif-netdev: ACL+dpcls for Wildcard matching.

2016-04-21 Thread Ben Pfaff

On Wed, Apr 13, 2016 at 10:45:09AM +0100, antonio.fische...@intel.com wrote:
> The purpose of this implementation is to improve the performance
> of wildcard matching in user-space.
> This RFC patch shows the basic functionality, some aspects were not
> covered yet.
> 
> I would like to get some feedback on whether people think integrating
> the DPDK ACL table in this manner is potentially a good solution or not.
> 
> DPDK ACL tables show a better performance on lookup operations than the
> Classifier.  However their insertion time for new rules is unacceptable.
> This solution attempts to combine the better performance of ACL lookups
> with the lower insertion latency of the Classifier.

How much does the performance improve?

Re: [ovs-dev] [PATCH V2, 3/3] datapath-windows: Removed always true condition in VXLAN

2016-04-21 Thread Ben Pfaff

On Mon, Apr 18, 2016 at 08:34:43AM +, Paul Boca wrote:
> Instance ID flag must be set to 1 in case of valid VXLAN id
> 
> Signed-off-by: Paul-Daniel Boca 
> Acked-by: Sorin Vinturis 

Applied, thanks!

Re: [ovs-dev] [PATCH V3, 2/3] datapath-windows: Removed double initialization on local variables

2016-04-21 Thread Ben Pfaff

On Mon, Apr 18, 2016 at 09:46:07AM +, Paul Boca wrote:
> Signed-off-by: Paul-Daniel Boca 
> Acked-by: Sorin Vinturis 
> ---
> v3: fixed a minor compilation issue.

Applied, thanks!

Re: [ovs-dev] [PATCH V2, 1/3] datapath-windows: Avoid using uninitialized gOvsExtDriverHandle

2016-04-21 Thread Ben Pfaff

On Mon, Apr 18, 2016 at 08:33:56AM +, Paul Boca wrote:
> Ensure gOvsExtDriverHandle is not used if initialization fails
> Added PAGED_CODE() where needed
> 
> Signed-off-by: Paul-Daniel Boca 
> Acked-by: Sorin Vinturis 

Applied, thanks!

Re: [ovs-dev] [PATCH v8 07/16] hmap: Add HMAP_FOR_EACH_POP.

2016-04-21 Thread Ben Pfaff

On Tue, Apr 19, 2016 at 03:28:39PM -0700, Daniele Di Proietto wrote:
> Makes popping each member of the hmap a bit easier.
> 
> Signed-off-by: Daniele Di Proietto 

It's unfortunately quite expensive, though: O(n**2) in the number of
buckets in the hmap, as opposed to O(n) for HMAP_FOR_EACH_SAFE.
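To make the cost difference concrete, here is a minimal sketch, assuming a
pop-style loop that restarts its search at the first bucket every time it
removes a node.  The toy_map type and all toy_* helpers are hypothetical and
are not the OVS hmap API; they only count how many buckets each style of
drain inspects.

```c
#include <stddef.h>

#define N_BUCKETS 8

/* Hypothetical toy types, not the OVS hmap structures. */
struct toy_node { struct toy_node *next; };
struct toy_map { struct toy_node *buckets[N_BUCKETS]; };

/* Returns the first node at or after bucket 'start', or NULL if there is
 * none.  Every bucket inspected is counted in '*visited'; on success the
 * bucket index is stored in '*found'. */
static struct toy_node *
toy_first_from(struct toy_map *m, size_t start, size_t *visited, size_t *found)
{
    for (size_t i = start; i < N_BUCKETS; i++) {
        (*visited)++;
        if (m->buckets[i]) {
            *found = i;
            return m->buckets[i];
        }
    }
    return NULL;
}

/* Pop-style drain: each iteration searches again from bucket 0.
 * Returns the total number of buckets inspected. */
static size_t
toy_drain_restarting(struct toy_map *m)
{
    size_t visited = 0, b = 0;
    struct toy_node *n;

    while ((n = toy_first_from(m, 0, &visited, &b)) != NULL) {
        m->buckets[b] = n->next;    /* Unlink the node just found. */
    }
    return visited;
}

/* Cursor-style drain: the search resumes at the bucket where it last
 * stopped.  Returns the total number of buckets inspected. */
static size_t
toy_drain_resuming(struct toy_map *m)
{
    size_t visited = 0, b = 0;
    struct toy_node *n;

    while ((n = toy_first_from(m, b, &visited, &b)) != NULL) {
        m->buckets[b] = n->next;    /* Unlink the node just found. */
    }
    return visited;
}
```

With one node in each of the 8 buckets, the restarting drain inspects 44
buckets and the resuming drain inspects 16.  The first figure grows
quadratically with the bucket count and the second linearly.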

Re: [ovs-dev] [PATCH v3] debian : upstream_version fix

2016-04-21 Thread Ben Pfaff

On Tue, Apr 05, 2016 at 07:56:35AM +, Zoltán Balogh wrote:
> From:  
> 
> The Debian Policy Manual 
> (https://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Version) 
> says that the upstream_version may contain only alphanumerics and the 
> characters . + - : ~ (full stop, plus, hyphen, colon, tilde) and should start 
> with a digit.
> 
> Currently, the upstream_version is defined in the debian/rules file:
> 
> DEB_UPSTREAM_VERSION=$(shell dpkg-parsechangelog | sed -rne 's,^Version: 
> ([0-9]:)*([^-]+).*,\2,p')
> 
> The version number is taken from the dpkg-parsechangelog printout then the 
> first part of the version number which does not contain hyphen is filtered 
> out with sed. However the Debian Policy Manual says that hyphen is allowed in 
> the upstream_version. 
> 
> This is not a problem with current vanilla OVS debian version. But, if a 
> postfix string including a hyphen is added to the upstream_version then 
> installation of datapath-dkms package will fail. 
> 
> Signed-off-by: Simon Horman 
> Reported-by: Zoltán Balogh 
> Tested-by: Zoltán Balogh 
> 
> ---
> 
> diff --git a/debian/rules b/debian/rules index 2a70bd6..7110851 100755
> --- a/debian/rules
> +++ b/debian/rules
> @@ -13,7 +13,7 @@
>  
>  PACKAGE=openvswitch
>  PACKAGE_DKMS=openvswitch-datapath-dkms
> -DEB_UPSTREAM_VERSION=$(shell dpkg-parsechangelog | sed -rne 's,^Version: 
> ([0-9]:)*([^-]+).*,\2,p')
> +DEB_UPSTREAM_VERSION=$(shell dpkg-parsechangelog | sed -rne 
> +'s,^Version: 
> +([0-9]+:)?([0-9][a-zA-Z0-9.+:~-]*)(-[a-zA-Z0-9*.~]*),\2,p')
>  
>  ifneq (,$(filter parallel=%,$(DEB_BUILD_OPTIONS)))  PARALLEL = -j$(patsubst 
> parallel=%,%,$(filter parallel=%,$(DEB_BUILD_OPTIONS)))

This patch is weirdly corrupted.  It claims to be changing a hunk that
has 7 lines before and after, but there is 1 - line and 3 + lines.
"patch" won't apply it.

Please resubmit.  (If you can't get your mailer to submit it properly,
you can submit a github pull request.)
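For reference, the splitting rule that the quoted commit message relies on
can be sketched in C.  This is only an illustration, assuming the usual
Debian layout [epoch:]upstream_version[-debian_revision], where the epoch
ends at the first colon and the revision starts at the last hyphen.  The
function name toy_upstream_version is hypothetical and is not part of
debian/rules or dpkg.

```c
#include <string.h>

/* Copies the upstream_version part of 'version' into 'out', which has room
 * for 'size' bytes.  Returns 0 on success, -1 if the result does not fit.
 * Hypothetical helper for illustration only. */
static int
toy_upstream_version(const char *version, char *out, size_t size)
{
    const char *start = version;
    const char *colon = strchr(version, ':');
    if (colon) {
        start = colon + 1;              /* Skip "epoch:". */
    }

    /* The revision begins at the LAST hyphen, so earlier hyphens stay in
     * the upstream part. */
    const char *end = strrchr(start, '-');
    if (!end) {
        end = start + strlen(start);    /* No debian_revision present. */
    }

    size_t len = (size_t) (end - start);
    if (len >= size) {
        return -1;
    }
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

For "1:2.5.90-rc1-1" this yields "2.5.90-rc1", whereas the original
([^-]+) pattern stops at the first hyphen and yields only "2.5.90".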

Re: [ovs-dev] [PATCH] tunneling: Fix for concomitant IPv4 and IPv6 tunnels

2016-04-21 Thread Ben Pfaff

On Fri, Apr 01, 2016 at 10:06:05AM -0300, Thadeu Lima de Souza Cascardo wrote:
> When using an IPv6 tunnel on the same bridge as an IPv4 tunnel, the flow
> received from the IPv6 tunnel would have an IPv4 address added to it, causing
> problems when trying to put or execute the action on Linux datapath.
> 
> Clearing the IPv6 address when we have a valid IPv4 address fixes this
> problem.
> 
> Signed-off-by: Thadeu Lima de Souza Cascardo 

Applied, thanks!

Re: [ovs-dev] [PATCH 1/2] ofproto-dpif: Rename "recurse" to "indentation".

2016-04-21 Thread Ben Pfaff

On Wed, Apr 13, 2016 at 09:45:09PM -0700, Ben Pfaff wrote:
> The "recurse" member of struct xlate_in and struct xlate_ctx is used for
> two purposes: to determine the amount of indentation in "ofproto/trace"
> output and to limit the depth of recursion.  An upcoming commit will
> separate these tasks, and so in preparation this commit renames "recurse"
> to "indentation".
> 
> Signed-off-by: Ben Pfaff 

I posted a v2:
http://openvswitch.org/pipermail/dev/2016-April/069917.html

Re: [ovs-dev] [ovs-dev, 1/2] ofproto-dpif: Rename "recurse" to "indentation".

2016-04-21 Thread Ben Pfaff

On Thu, Apr 21, 2016 at 09:02:46AM -0500, Ryan Moats wrote:
> 
> > --- Original Message ---
> > The "recurse" member of struct xlate_in and struct xlate_ctx is used for
> > two purposes: to determine the amount of indentation in "ofproto/trace"
> > output and to limit the depth of recursion.  An upcoming commit will
> > separate these tasks, and so in preparation this commit renames "recurse"
> > to "indentation".
> >
> > Signed-off-by: Ben Pfaff 
> 
> Ben, I appear to be getting an error when I try to compile this:
> 
> ofproto/ofproto-dpif-xlate.c:3292:13: error: too many arguments to function
> 'xlate_recursively'
>  xlate_recursively(ctx, rule, table_id <= old_table_id);
>  ^
> ofproto/ofproto-dpif-xlate.c:3214:1: note: declared here
>  xlate_recursively(struct xlate_ctx *ctx, struct rule_dpif *rule)

Thanks for pointing that out.  I sent v2:
http://openvswitch.org/pipermail/dev/2016-April/069917.html

[ovs-dev] [PATCH v2 2/2] ofproto-dpif: Do not count resubmit to later tables against limit.

2016-04-21 Thread Ben Pfaff

Open vSwitch must ensure that flow translation takes a finite amount of
time.  Until now it has implemented this by limiting the depth of
recursion.  The initial limit, in version 1.0.1, was no recursion at all,
and then over the years it has increased to 8 levels, then 16, then 32,
and 64 for the last few years.  Now reports are coming in that 64 levels
are inadequate for some OVN setups.  The natural inclination would be to
double the limit again to 128 levels.

This commit attempts another approach.  Instead of increasing the limit,
it reduces the class of resubmits that count against the limit.  Since the
goal for the depth limit is to prevent an infinite amount of work, it's
not necessary to count resubmits that can't lead to infinite work.  In
particular, a resubmit from a table numbered x to a table y > x cannot do
this, because any OpenFlow switch has a finite number of tables.  Because
in fact a resubmit (or goto_table) from one table to a later table is the
most common form of an OpenFlow pipeline, I suspect that this will greatly
alleviate the pressure to increase the depth limit.
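
The rule described above can be sketched as follows.  This is a simplified
illustration, not the code in the patch: the toy_ctx structure, the
TOY_MAX_DEPTH constant and both functions are hypothetical names, and the
sketch leaves out the decrement that a real translation would do when a
nested resubmit returns.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_DEPTH 64    /* Hypothetical stand-in for the depth limit. */

struct toy_ctx {
    int depth;          /* Resubmits charged against the limit so far. */
    uint8_t table_id;   /* Table currently being translated. */
};

/* Returns true if a resubmit from ctx->table_id to 'new_table' has to be
 * charged against the depth limit.  A jump to a strictly later table cannot
 * repeat forever, because table IDs are bounded. */
static bool
toy_counts_against_depth(const struct toy_ctx *ctx, uint8_t new_table)
{
    return new_table <= ctx->table_id;
}

/* Returns true and updates 'ctx' if the resubmit may proceed, false if it
 * would exceed the depth limit. */
static bool
toy_enter_resubmit(struct toy_ctx *ctx, uint8_t new_table)
{
    if (toy_counts_against_depth(ctx, new_table)) {
        if (ctx->depth >= TOY_MAX_DEPTH) {
            return false;
        }
        ctx->depth++;
    }
    ctx->table_id = new_table;
    return true;
}
```

In this sketch a pipeline that walks tables 1, 2, ..., 200 in order never
touches the counter, while a flow that keeps resubmitting to its own table
is cut off after 64 levels.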

Reported-by: Guru Shetty 
Signed-off-by: Ben Pfaff 
---
 lib/ofp-actions.c| 19 +++--
 ofproto/ofproto-dpif-xlate.c | 66 ++--
 ofproto/ofproto-dpif-xlate.h | 23 +--
 ofproto/ofproto-dpif.c   |  5 ++--
 ofproto/ofproto-dpif.h   |  3 +-
 tests/ofproto-dpif.at| 41 ++-
 utilities/ovs-ofctl.8.in | 28 ---
 7 files changed, 152 insertions(+), 33 deletions(-)

diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index 39b6fbc..946d145 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -3738,8 +3738,23 @@ format_FIN_TIMEOUT(const struct ofpact_fin_timeout *a, 
struct ds *s)
  *
  * Resubmit actions may be used any number of times within a set of actions.
  *
- * Resubmit actions may nest to an implementation-defined depth.  Beyond this
- * implementation-defined depth, further resubmit actions are simply ignored.
+ * Resubmit actions may nest.  To prevent infinite loops and excessive resource
+ * use, the implementation may limit nesting depth and the total number of
+ * resubmits:
+ *
+ *- Open vSwitch 1.0.1 and earlier did not support recursion.
+ *
+ *- Open vSwitch 1.0.2 and 1.0.3 limited recursion to 8 levels.
+ *
+ *- Open vSwitch 1.1 and 1.2 limited recursion to 16 levels.
+ *
+ *- Open vSwitch 1.2 through 1.8 limited recursion to 32 levels.
+ *
+ *- Open vSwitch 1.9 through 2.0 limited recursion to 64 levels.
+ *
+ *- Open vSwitch 2.1 through 2.5 limited recursion to 64 levels and impose
+ *  a total limit of 4,096 resubmits per flow translation (earlier versions
+ *  did not impose any total limit).
  *
  * NXAST_RESUBMIT ignores 'table' and 'pad'.  NXAST_RESUBMIT_TABLE requires
  * 'pad' to be all-bits-zero.
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index e4ec55c..509e455 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -67,14 +67,22 @@ COVERAGE_DEFINE(xlate_actions_too_many_output);
 VLOG_DEFINE_THIS_MODULE(ofproto_dpif_xlate);
 
 /* Maximum depth of flow table recursion (due to resubmit actions) in a
- * flow translation. */
-#define MAX_RESUBMIT_RECURSION 64
-#define MAX_INTERNAL_RESUBMITS 1   /* Max resbmits allowed using rules in
-  internal table. */
+ * flow translation.
+ *
+ * The goal of limiting the depth of resubmits is to ensure that flow
+ * translation eventually terminates.  Only resubmits to the same table or an
+ * earlier table count against the maximum depth.  This is because resubmits to
+ * strictly monotonically increasing table IDs will eventually terminate, since
+ * any OpenFlow switch has a finite number of tables.  OpenFlow tables are most
+ * commonly traversed in numerically increasing order, so this limit has little
+ * effect on conventionally designed OpenFlow pipelines.
+ *
+ * Outputs to patch ports and to groups also count against the depth limit. */
+#define MAX_DEPTH 64
 
 /* Maximum number of resubmit actions in a flow translation, whether they are
  * recursive or not. */
-#define MAX_RESUBMITS (MAX_RESUBMIT_RECURSION * MAX_RESUBMIT_RECURSION)
+#define MAX_RESUBMITS (MAX_DEPTH * MAX_DEPTH)
 
 struct xbridge {
 struct hmap_node hmap_node;   /* Node in global 'xbridges' map. */
@@ -195,8 +203,31 @@ struct xlate_ctx {
  * wants actions. */
 struct ofpbuf *odp_actions;
 
-/* Resubmit statistics, via xlate_table_action(). */
-int indentation;/* Current resubmit nesting depth. */
+/* Statistics maintained by xlate_table_action().
+ *
+ * 'indentation' is the nesting level for resubmits.  It is used to indent
+ * the output of resubmit_hook (e.g. for the "ofproto/trace" feature).
+ *
+ * The other statistics limit the amount of work that a

[ovs-dev] [PATCH v2 0/2] more flexible limits for resubmits

2016-04-21 Thread Ben Pfaff

v1->v2:
- Fix bad patch split between patch 1 and patch 2 (thanks Ryan).

Ben Pfaff (2):
  ofproto-dpif: Rename "recurse" to "indentation".
  ofproto-dpif: Do not count resubmit to later tables against limit.

 lib/ofp-actions.c| 19 --
 ofproto/ofproto-dpif-xlate.c | 82 
 ofproto/ofproto-dpif-xlate.h | 36 +++
 ofproto/ofproto-dpif.c   | 46 +
 ofproto/ofproto-dpif.h   |  5 +--
 tests/ofproto-dpif.at| 41 +-
 utilities/ovs-ofctl.8.in | 28 ---
 7 files changed, 189 insertions(+), 68 deletions(-)

-- 
2.1.3


[ovs-dev] [PATCH v2 1/2] ofproto-dpif: Rename "recurse" to "indentation".

2016-04-21 Thread Ben Pfaff

The "recurse" member of struct xlate_in and struct xlate_ctx is used for
two purposes: to determine the amount of indentation in "ofproto/trace"
output and to limit the depth of recursion.  An upcoming commit will
separate these tasks, and so in preparation this commit renames "recurse"
to "indentation".

Signed-off-by: Ben Pfaff 
---
 ofproto/ofproto-dpif-xlate.c | 22 +++---
 ofproto/ofproto-dpif-xlate.h | 15 ---
 ofproto/ofproto-dpif.c   | 43 ++-
 ofproto/ofproto-dpif.h   |  2 +-
 4 files changed, 42 insertions(+), 40 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 5937913..e4ec55c 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -196,7 +196,7 @@ struct xlate_ctx {
 struct ofpbuf *odp_actions;
 
 /* Resubmit statistics, via xlate_table_action(). */
-int recurse;/* Current resubmit nesting depth. */
+int indentation;/* Current resubmit nesting depth. */
 int resubmits;  /* Total number of resubmits. */
 bool in_group;  /* Currently translating ofgroup, if true. */
 bool in_action_set; /* Currently translating action_set, if true. */
@@ -583,7 +583,7 @@ xlate_report(struct xlate_ctx *ctx, const char *format, ...)
 va_list args;
 
 va_start(args, format);
-ctx->xin->report_hook(ctx->xin, ctx->recurse, format, args);
+ctx->xin->report_hook(ctx->xin, ctx->indentation, format, args);
 va_end(args);
 }
 }
@@ -2805,7 +2805,7 @@ compose_table_xlate(struct xlate_ctx *ctx, const struct 
xport *out_dev,
 
 return ofproto_dpif_execute_actions__(xbridge->ofproto, , NULL,
   , sizeof output,
-  ctx->recurse, ctx->resubmits, 
packet);
+  ctx->indentation, ctx->resubmits, 
packet);
 }
 
 static void
@@ -3222,20 +3222,20 @@ xlate_recursively(struct xlate_ctx *ctx, struct 
rule_dpif *rule)
 }
 
 ctx->resubmits++;
-ctx->recurse++;
+ctx->indentation++;
 ctx->rule = rule;
 ctx->rule_cookie = rule_dpif_get_flow_cookie(rule);
 actions = rule_dpif_get_actions(rule);
 do_xlate_actions(actions->ofpacts, actions->ofpacts_len, ctx);
 ctx->rule_cookie = old_cookie;
 ctx->rule = old_rule;
-ctx->recurse--;
+ctx->indentation--;
 }
 
 static bool
 xlate_resubmit_resource_check(struct xlate_ctx *ctx)
 {
-if (ctx->recurse >= MAX_RESUBMIT_RECURSION + MAX_INTERNAL_RESUBMITS) {
+if (ctx->indentation >= MAX_RESUBMIT_RECURSION + MAX_INTERNAL_RESUBMITS) {
 XLATE_REPORT_ERROR(ctx, "resubmit actions recursed over %d times",
MAX_RESUBMIT_RECURSION);
 ctx->error = XLATE_RECURSION_TOO_DEEP;
@@ -3274,7 +3274,7 @@ xlate_table_action(struct xlate_ctx *ctx, ofp_port_t 
in_port, uint8_t table_id,
may_packet_in, honor_table_miss);
 
 if (OVS_UNLIKELY(ctx->xin->resubmit_hook)) {
-ctx->xin->resubmit_hook(ctx->xin, rule, ctx->recurse + 1);
+ctx->xin->resubmit_hook(ctx->xin, rule, ctx->indentation + 1);
 }
 
 if (rule) {
@@ -3323,9 +3323,9 @@ xlate_group_bucket(struct xlate_ctx *ctx, struct 
ofputil_bucket *bucket)
 struct flow old_flow = ctx->xin->flow;
 
 ofpacts_execute_action_set(&action_list, &action_set);
-ctx->recurse++;
+ctx->indentation++;
 do_xlate_actions(action_list.data, action_list.size, ctx);
-ctx->recurse--;
+ctx->indentation--;
 
 ofpbuf_uninit(&action_list);
 
@@ -4824,7 +4824,7 @@ xlate_in_init(struct xlate_in *xin, struct ofproto_dpif 
*ofproto,
 xin->resubmit_hook = NULL;
 xin->report_hook = NULL;
 xin->resubmit_stats = NULL;
-xin->recurse = 0;
+xin->indentation = 0;
 xin->resubmits = 0;
 xin->wc = wc;
 xin->odp_actions = odp_actions;
@@ -5089,7 +5089,7 @@ xlate_actions(struct xlate_in *xin, struct xlate_out 
*xout)
 .wc = xin->wc ? xin->wc : _wc,
 .odp_actions = xin->odp_actions ? xin->odp_actions : _actions,
 
-.recurse = xin->recurse,
+.indentation = xin->indentation,
 .resubmits = xin->resubmits,
 .in_group = false,
 .in_action_set = false,
diff --git a/ofproto/ofproto-dpif-xlate.h b/ofproto/ofproto-dpif-xlate.h
index c4c23d5..e224ecc 100644
--- a/ofproto/ofproto-dpif-xlate.h
+++ b/ofproto/ofproto-dpif-xlate.h
@@ -83,18 +83,19 @@ struct xlate_in {
  * 'rule' is the rule being submitted into.  It will be null if the
  * resubmit or OFPP_TABLE action didn't find a matching rule.
  *
- * 'recurse' is the resubmit recursion depth at time of invocation.
+ * 'indentation' is the resubmit recursion depth at time of invocation,
+ * suitable for indenting the output.
  *
  * This is normally null so the client

Re: [ovs-dev] [RFC PATCH] create vxlan device using rtnetlink interface

2016-04-21 Thread Jesse Gross

On Thu, Apr 21, 2016 at 9:43 AM, Thadeu Lima de Souza Cascardo
 wrote:
> On Wed, Apr 20, 2016 at 11:38:31AM -0700, Jesse Gross wrote:
>> One minor comment that I noticed on the patch itself - I don't know if
>> the port mapping functions are handling IPsec variants of tunnels
>> correctly in all situations (or maybe are handling them by accident).
>> Since both IPsec and non-IPsec ports share the same dpif_port name, we
>> might get the wrong class back and then it seems like we don't handle
>> the IPsec class types when converting between names and types, which
>> could itself be an independent bug. Can you take a look?
>
> Sure, I am not too familiar with the ipsec code. Looking at it, I don't see
> anything really special about it. It just requires that some configuration is
> available and that the ovs-monitor-ipsec daemon is running. When adding and
> removing devices, the same procedure would still work. So, I don't see why we
> would need to handle it specially, do you?

Never mind, I just noticed that netdev_to_ovs_vport_type() handles GRE
differently from the other tunnel types to account for the "ipsec_"
prefix in the class name. I was concerned that we wouldn't properly
translate IPsec netdev types into the Linux OVS vport type but that
doesn't appear to be the case.

> On the other hand, I noticed that VXLAN-GBP tunnels will have the problem of
> sharing the same dpif_port name with non GBP tunnels. That means that tunnels
> using different remote IPs with the same port all must be GBP or not GBP. If we
> used extensions for GPE, that would also apply there. Maybe we should just add
> an infix gbp and gpe, resulting in vxlan_sys_gbp_4789 and vxlan_sys_gpe_4789.
> How about that?

I agree that this is a problem. Different extensions will need to run
on different UDP port numbers since there's no way to otherwise know
how to interpret the bits when the packet is first received. I think
we need to block differing extensions on the same UDP port when the
VXLAN ports are added to OVS, otherwise they'll silently be combined
onto the same backer. Your idea of adding a tag indicating the
extension type is nice for debugging (though you'll need to find a way
to abbreviate the name so that it fits within the interface name
length) but it's not strictly necessary: if extensions are running on
different UDP ports then the backers will already be distinguished and
if they are running on the same port then attempting to add the second
one will silently fail at the kernel level.
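
As a rough illustration of the name length concern, the sketch below builds
the proposed names and checks them against a 16-byte limit.  TOY_IFNAMSIZ is
assumed here to mirror Linux's IFNAMSIZ (NUL terminator included), and
toy_backer_name is a hypothetical helper, not an existing OVS function.

```c
#include <stdbool.h>
#include <stdio.h>

#define TOY_IFNAMSIZ 16     /* Assumed to mirror Linux IFNAMSIZ, NUL included. */

/* Builds "vxlan_sys_<port>" or, when 'ext' is a non-empty string,
 * "vxlan_sys_<ext>_<port>".  Returns true only if the whole name fits in
 * 'name' without truncation. */
static bool
toy_backer_name(char name[TOY_IFNAMSIZ], const char *ext, unsigned int port)
{
    int n = (ext && ext[0]
             ? snprintf(name, TOY_IFNAMSIZ, "vxlan_sys_%s_%u", ext, port)
             : snprintf(name, TOY_IFNAMSIZ, "vxlan_sys_%u", port));

    return n > 0 && n < TOY_IFNAMSIZ;
}
```

Under that assumption "vxlan_sys_4789" (14 characters) fits, but
"vxlan_sys_gbp_4789" (18 characters) does not, which is why the tag would
need to be abbreviated.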

Re: [ovs-dev] [dpdk-dev] Memory leak when adding/removing vhost_user ports

2016-04-21 Thread Yuanhan Liu

On Thu, Apr 21, 2016 at 04:04:03PM +0200, Christian Ehrhardt wrote:
> Thanks Ilya,
> yeah we usually wait for the point releases as they undergo some extra testing
> and verification.
> .1 shouldn't be too much into the future I guess.
> Thanks a lot for identifying.
> 
> That said, I'd still go on with Yuanhan to finalize the dpdk side leak fix we
> identified, so we eventually get it committed.
> So Yuanhan, what do you think of my last revised version of your patch for
> upstream DPDK (there with the vhost_destroy_device then)?

That's good.

> I mean it is essentially your patch plus a bit of polishing, not mine so I
> don't feel entitled to submit it as mine :-)

Thanks. I will make and send out a formal patch later.

--yliu

Re: [ovs-dev] [dpdk-dev] Memory leak when adding/removing vhost_user ports

2016-04-21 Thread Yuanhan Liu

On Thu, Apr 21, 2016 at 02:01:26PM +0300, Ilya Maximets wrote:
> Hi, Christian.
> You're likely using the tar archive of openvswitch from openvswitch.org.
> Unfortunately, it doesn't contain many of the bug fixes from git/branch-2.5.
> 
> The problem that you are facing has been solved in branch-2.5 by
> 
> commit d9df7b9206831631ddbd90f9cbeef1b4fc5a8e89
> Author: Ilya Maximets 
> Date:   Thu Mar 3 11:30:06 2016 +0300
> 
> netdev-dpdk: Fix memory leak in netdev_dpdk_vhost_destruct().
> 
> Fixes: 4573fbd38fa1 ("netdev-dpdk: Add vhost-user multiqueue support")
> Signed-off-by: Ilya Maximets 
> Acked-by: Flavio Leitner 
> Acked-by: Daniele Di Proietto

Hi Ilya,

Thanks for the info. I actually checked this piece of code. I was
using the new code, so I didn't find anything wrong.

--yliu

Re: [ovs-dev] [PATCH v3] Add VxLAN-GBP support for user space data path

2016-04-21 Thread Jesse Gross

On Wed, Apr 20, 2016 at 11:43 PM, Johnson.Li  wrote:
> From: Johnson Li 
>
> In user space, only standard VxLAN was supported. This patch adds
> VxLAN-GBP support for the user space data path.
>
> How to use:
> 1> Create VxLAN port with GBP extension
>   $ovs-vsctl add-port br-int vxlan0 -- set interface vxlan0 \
>type=vxlan options:dst_port=4789 \
>options:remote_ip=192.168.60.22 \
>options:key=1000 options:exts=gbp
> 2> Add flow for transmitting
>   $ovs-ofctl add-flow br-int "table=0, priority=260, \
>  in_port=LOCAL actions=load:0x100->NXM_NX_TUN_GBP_ID[], \
>  output:1"
> 3> Add flow for receiving
>   $ovs-ofctl add-flow br-int "table=0, priority=260, \
>  in_port=1,tun_gbp_id=0x100 actions=output:LOCAL"
>
> Check data path flow rules:
> $ovs-appctl dpif/dump-flows br-int
>   recirc_id(0),in_port(1),eth_type(0x0800),ipv4(tos=0/0x3,frag=no),
>   packets:0, bytes:0, used:never, actions:tnl_push(tnl_port(2),
>   header(size=50,type=4,eth(dst=90:e2:ba:48:7f:a4,src=90:e2:ba:48:7e:1c,
>   dl_type=0x0800),ipv4(src=192.168.60.21,dst=192.168.60.22,proto=17,
>   tos=0,ttl=64,frag=0x4000),udp(src=0,dst=4789,csum=0x0),
>   vxlan(flags=0x88000100,vni=0x3e8)),out_port(3))
>   tunnel(tun_id=0x3e8,src=192.168.60.22,dst=192.168.60.21,
>   vxlan(gbp(id=256)),flags(-df-csum+key)),skb_mark(0),recirc_id(0),
>   in_port(2),eth(dst=ae:1b:ed:1e:e3:4e),eth_type(0x0800),
>   ipv4(dst=172.168.60.21,proto=1/0x10,frag=no), packets:0, bytes:0,
>   used:never, actions:1
>
> ---
> Change Log:
>   v3: Change Macro definition, add more comments, add unit test.
>   v2: Set Each enabled bit for the VxLAN-GBP.
>
> Signed-off-by: Johnson Li 

Please do not submit a new version of a patch without addressing the
existing comments. I have asked you several times to not interpret
bits from an extension without checking whether that extension has
explicitly been enabled.

In addition, it appears that you are copying code from the Linux
kernel. You cannot do this as the licenses are not compatible.

Re: [ovs-dev] [PATCH] ovn-northd: Add support for static_routes.

2016-04-21 Thread Guru Shetty

On 20 April 2016 at 18:38, steve.ruan  wrote:

> From: Guru Shetty 
>
Run the following command (with your name added) for your next respin:

git commit --amend --author="Author Name "



>
> static routes are useful when connecting multiple
> routers with each other.
>
> Signed-off-by: steve.ruan 
> Signed-off-by: Gurucharan Shetty 
> Co-authored-by: Gurucharan Shetty 
>
> Reported-by: Na Zhu 
> Reported-by: Dustin Lundquist 
>
> Reported-at:
> https://bugs.launchpad.net/networking-ovn/+bug/1545140
> https://bugs.launchpad.net/networking-ovn/+bug/1539347
> ---
>  ovn/northd/ovn-northd.8.xml   |   5 +-
>  ovn/northd/ovn-northd.c   | 101 +
>  ovn/ovn-nb.ovsschema  |  15 +++-
>  ovn/ovn-nb.xml|  31 +++
>  ovn/utilities/ovn-nbctl.8.xml |   5 ++
>  ovn/utilities/ovn-nbctl.c |   5 ++
>  tests/ovn.at  | 204
> ++
>  7 files changed, 362 insertions(+), 4 deletions(-)
>
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index da776e1..978853c 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -682,8 +682,9 @@ next;
>  
>
>  
> -  If the route has a gateway, G is the gateway IP
> address,
> -  otherwise it is ip4.dst.
> +  If the route has a gateway, G is the gateway IP
> address.
> +  Instead, if the route is from a configured static route,
> G
> +  is the next hop IP address.  Else it is ip4.dst.
>  
>
>
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 260c02f..5d86219 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -234,6 +234,18 @@ allocate_tnlid(struct hmap *set, const char *name,
> uint32_t max,
>  return 0;
>  }
>
> +/* Holds the next hop ip address and the logical router port via which
> + * a static route is reachable. */
> +struct route_to_port {
> +ovs_be32 ip;/* network address of the route. */
> +ovs_be32 mask;  /* network mask of the route. */
> +ovs_be32 next_hop;  /* next_hop ip address for the above
> route. */
> +struct uuid rport;  /* output port specified by CMS, or null
> if not specified */
> +struct ovn_port *ovn_port;  /* The logical router port via which the
> packet
> + * needs to exit to reach the next hop. */
> +};
> +
> +
>  /* The 'key' comes from nbs->header_.uuid or nbr->header_.uuid or
>   * sb->external_ids:logical-switch. */
>  struct ovn_datapath {
> @@ -249,6 +261,9 @@ struct ovn_datapath {
>  /* Logical router data (digested from nbr). */
>  const struct ovn_port *gateway_port;
>  ovs_be32 gateway;
> +/* Maps a static route to a ovn logical router port via which packet
> + * needs to exit. */
> +struct shash routes_map;
>
>  /* Logical switch data. */
>  struct ovn_port **router_ports;
> @@ -272,6 +287,7 @@ ovn_datapath_create(struct hmap *datapaths, const
> struct uuid *key,
>  od->nbs = nbs;
>  od->nbr = nbr;
>  hmap_init(&od->port_tnlids);
> +shash_init(&od->routes_map);
>  od->port_key_hint = 0;
>  hmap_insert(datapaths, &od->key_node, uuid_hash(&od->key));
>  return od;
> @@ -286,6 +302,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct
> ovn_datapath *od)
>   * use it. */
>  hmap_remove(datapaths, &od->key_node);
>  destroy_tnlids(&od->port_tnlids);
> +shash_destroy_free_data(&od->routes_map);
>  free(od->router_ports);
>  free(od);
>  }
> @@ -318,6 +335,50 @@ ovn_datapath_from_sbrec(struct hmap *datapaths,
>  }
>
>  static void
> +build_static_route(struct ovn_datapath *od,
> + const struct nbrec_logical_router_static_route *route)
> +{
> +ovs_be32 prefix, next_hop, mask;
> +
> +/* verify nexthop */
> +char *error = ip_parse_masked(route->nexthop, &next_hop, &mask);
> +if (error || mask != OVS_BE32_MAX) {
> +static struct vlog_rate_limit rl
> += VLOG_RATE_LIMIT_INIT(5, 1);
> +VLOG_WARN_RL(&rl, "bad next hop ip address %s",
> +route->nexthop);
> +free(error);
> +return;
> +}
> +
> +/* verify prefix */
> +error = ip_parse_masked(route->prefix, &prefix, &mask);
> +if (error || !ip_is_cidr(mask)) {
> +static struct vlog_rate_limit rl
> += VLOG_RATE_LIMIT_INIT(5, 1);
> +VLOG_WARN_RL(, "bad 'network' in static routes %s",
> +  route->prefix);
> +free(error);
> +return;
> +}
> +
> +struct uuid lrp_uuid;
> +struct route_to_port *route_port = xmalloc(sizeof *route_port);
> +route_port->ip = prefix;
> +route_port->mask = mask;
> +route_port->next_hop = next_hop;
> +/* The

Re: [ovs-dev] [RFC PATCH] create vxlan device using rtnetlink interface

2016-04-21 Thread Thadeu Lima de Souza Cascardo

On Wed, Apr 20, 2016 at 11:38:31AM -0700, Jesse Gross wrote:
> Thanks for that analysis. I agree with you that this looks safe. I'm
> glad - there are fewer corner cases than I was expecting.
> 
> One minor comment that I noticed on the patch itself - I don't know if
> the port mapping functions are handling IPsec variants of tunnels
> correctly in all situations (or maybe are handling them by accident).
> Since both IPsec and non-IPsec ports share the same dpif_port name, we
> might get the wrong class back and then it seems like we don't handle
> the IPsec class types when converting between names and types, which
> could itself be an independent bug. Can you take a look?

Sure, I am not too familiar with the ipsec code. Looking at it, I don't see
anything really special about it. It just requires that some configuration is
available and that the ovs-monitor-ipsec daemon is running. When adding and
removing devices, the same procedure would still work. So, I don't see why we
would need to handle it specially, do you?

On the other hand, I noticed that VXLAN-GBP tunnels will have the problem of
sharing the same dpif_port name with non GBP tunnels. That means that tunnels
using different remote IPs with the same port all must be GBP or not GBP. If we
used extensions for GPE, that would also apply there. Maybe we should just add
an infix gbp and gpe, resulting in vxlan_sys_gbp_4789 and vxlan_sys_gpe_4789.
How about that?
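
A minimal sketch of the naming scheme proposed above, assuming the plain name has the form vxlan_sys_<port>. The helper name vxlan_port_name and its ext parameter are made up for illustration and are not existing OVS functions:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: builds the dpif port name for a VXLAN tunnel.
 * 'ext' is an optional extension infix such as "gbp" or "gpe"; pass NULL
 * (or "") for a plain VXLAN tunnel.  Returns what snprintf() returns. */
static int
vxlan_port_name(char *buf, size_t len, const char *ext, uint16_t port)
{
    if (ext && ext[0] != '\0') {
        return snprintf(buf, len, "vxlan_sys_%s_%u", ext, (unsigned) port);
    }
    return snprintf(buf, len, "vxlan_sys_%u", (unsigned) port);
}
```

With distinct names per extension, a GBP tunnel and a non-GBP tunnel on the same UDP port would no longer map to the same dpif_port name.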

Regards.
Cascardo.

Re: [ovs-dev] [PATCH] ovn-northd: Add support for static_routes.

2016-04-21 Thread Mickey Spiegel

For the case where the static route specifies the output_port (logical router 
port), this patch is not as efficient and streamlined as it could be.

With this patch, in ovn/ovn-nb.ovsschema, the output_port is defined as a 
string that consists of a uuid:
+"Logical_Router_Static_Route": {
+"columns": {
+"prefix": {"type": "string"},
+"nexthop": {"type": "string"},
+"output_port": {"type": "string"}},

Since output_port is a uuid but the ports hmap in ovn/northd/ovn-northd.c is 
keyed by name, the for (size_t i = 0; i < od->nbr->n_ports; i++) loop in 
join_logical_ports has to walk SHASH_FOR_EACH(node, &od->routes_map) 
looking for a matching output_port uuid.

Instead the output port should be defined as:
+"output_port": {"type": {"key": "string", "min": 0, "max": 1}},
just like Logical_Router_Port peer, where the string consists of a name rather 
than a uuid.
Then the code can use ovn_port_find.

Mickey

-"dev"  wrote: -
To: dev@openvswitch.org
From: "steve.ruan" 
Sent by: "dev" 
Date: 04/20/2016 06:38PM
Cc: Guru Shetty 
Subject: [ovs-dev] [PATCH] ovn-northd: Add support for static_routes.

From: Guru Shetty 

static routes are useful when connecting multiple
routers with each other.
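
The build_static_route() hunk in this patch accepts a route only if the next hop parses as a single host address (mask of all ones) and the prefix carries a contiguous CIDR mask. A minimal sketch of those two checks follows. It works on host-byte-order uint32_t values for readability, whereas the patch uses ovs_be32, and the function names are illustrative stand-ins rather than the OVS helpers:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if 'mask' (host byte order) selects exactly one address (/32). */
static bool
mask_is_host(uint32_t mask)
{
    return mask == UINT32_MAX;
}

/* True if 'mask' (host byte order) is a run of 1-bits followed only by
 * 0-bits, i.e. a valid CIDR netmask.  The inverted mask must then have
 * the form 2^n - 1, so adding 1 to it clears every bit it had set. */
static bool
mask_is_cidr(uint32_t mask)
{
    uint32_t inv = ~mask;
    return (inv & (inv + 1)) == 0;
}

/* Mirrors the two rejections in the patch: bad next hop, bad prefix. */
static bool
static_route_masks_ok(uint32_t next_hop_mask, uint32_t prefix_mask)
{
    return mask_is_host(next_hop_mask) && mask_is_cidr(prefix_mask);
}
```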

Signed-off-by: steve.ruan 
Signed-off-by: Gurucharan Shetty 
Co-authored-by: Gurucharan Shetty 

Reported-by: Na Zhu 
Reported-by: Dustin Lundquist 

Reported-at:
https://bugs.launchpad.net/networking-ovn/+bug/1545140
https://bugs.launchpad.net/networking-ovn/+bug/1539347
---
 ovn/northd/ovn-northd.8.xml   |   5 +-
 ovn/northd/ovn-northd.c   | 101 +
 ovn/ovn-nb.ovsschema  |  15 +++-
 ovn/ovn-nb.xml|  31 +++
 ovn/utilities/ovn-nbctl.8.xml |   5 ++
 ovn/utilities/ovn-nbctl.c |   5 ++
 tests/ovn.at  | 204 ++
 7 files changed, 362 insertions(+), 4 deletions(-)

diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index da776e1..978853c 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -682,8 +682,9 @@ next;
 
 
 
-  If the route has a gateway, G is the gateway IP address,
-  otherwise it is ip4.dst.
+  If the route has a gateway, G is the gateway IP address.
+  Instead, if the route is from a configured static route, G
+  is the next hop IP address.  Else it is ip4.dst.
 
   
 
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 260c02f..5d86219 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -234,6 +234,18 @@ allocate_tnlid(struct hmap *set, const char *name, 
uint32_t max,
 return 0;
 }
 
+/* Holds the next hop ip address and the logical router port via which
+ * a static route is reachable. */
+struct route_to_port {
+ovs_be32 ip;/* network address of the route. */
+ovs_be32 mask;  /* network mask of the route. */
+ovs_be32 next_hop;  /* next_hop ip address for the above route. */
+struct uuid rport;  /* output port specified by CMS, or null if not specified */
+struct ovn_port *ovn_port;  /* The logical router port via which the packet
+ * needs to exit to reach the next hop. */
+};
+
+
 /* The 'key' comes from nbs->header_.uuid or nbr->header_.uuid or
  * sb->external_ids:logical-switch. */
 struct ovn_datapath {
@@ -249,6 +261,9 @@ struct ovn_datapath {
 /* Logical router data (digested from nbr). */
 const struct ovn_port *gateway_port;
 ovs_be32 gateway;
+/* Maps a static route to a ovn logical router port via which packet
+ * needs to exit. */
+struct shash routes_map;
 
 /* Logical switch data. */
 struct ovn_port **router_ports;
@@ -272,6 +287,7 @@ ovn_datapath_create(struct hmap *datapaths, const struct uuid *key,
 od->nbs = nbs;
 od->nbr = nbr;
 hmap_init(&od->port_tnlids);
+shash_init(&od->routes_map);
 od->port_key_hint = 0;
 hmap_insert(datapaths, &od->key_node, uuid_hash(&od->key));
 return od;
@@ -286,6 +302,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od)
  * use it. */
 hmap_remove(datapaths, &od->key_node);
 destroy_tnlids(&od->port_tnlids);
+shash_destroy_free_data(&od->routes_map);
 free(od->router_ports);
 free(od);
 }
@@ -318,6 +335,50 @@ ovn_datapath_from_sbrec(struct hmap *datapaths,
 }
 
 static void
+build_static_route(struct ovn_datapath *od,
+ const struct nbrec_logical_router_static_route *route)
+{
+ovs_be32 prefix, next_hop, mask;
+
+/* verify nexthop */
+char *error = ip_parse_masked(route->nexthop, &next_hop, &mask);
+if (error || mask != OVS_BE32_MAX) {
+

[ovs-dev] [PATCH] netdev-dpdk: Set pmd thread priority

2016-04-21 Thread Bhanuprakash Bodireddy

Set the DPDK pmd thread scheduling policy to SCHED_RR and its static
priority to the highest priority value of that policy. This deals with
the pmd thread starvation case, where another CPU-hogging process gets
scheduled/affinitized to the same core the pmd thread is running on,
thereby significantly impacting datapath performance.
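
As a standalone illustration of the mechanism described above, the following sketch raises the calling thread to SCHED_RR at the maximum static priority using only POSIX calls. The function name set_rr_max_priority is hypothetical. Without sufficient privileges (for example CAP_SYS_NICE on Linux) the call is expected to fail and return a nonzero error number:

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Switches the calling thread to the SCHED_RR policy at the highest
 * static priority that policy allows.  Returns 0 on success or a
 * positive errno value (for example EPERM) on failure. */
static int
set_rr_max_priority(void)
{
    struct sched_param param;

    memset(&param, 0, sizeof param);
    param.sched_priority = sched_get_priority_max(SCHED_RR);
    return pthread_setschedparam(pthread_self(), SCHED_RR, &param);
}
```

A caller would typically log the returned error and keep running under the default policy, which is what the VLOG_WARN in this patch does.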

The realtime scheduling policy is applied only when a CPU mask is passed
to 'pmd-cpu-mask'. The exception to this is 'pmd-cpu-mask=1', where the
policy and priority shall not be applied to pmd thread spawned on core0.
For example:

* In the absence of pmd-cpu-mask or if pmd-cpu-mask=1, one pmd
  thread shall be created and affinitized to 'core 0' with default
  scheduling policy and priority applied.

* If pmd-cpu-mask is specified with CPU mask > 1, one or more pmd
  threads shall be spawned on the corresponding core(s) in the mask
  and real time scheduling policy SCHED_RR and highest static
  priority is applied to the pmd thread(s).
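
The two cases above can be sketched as follows: decode the mask into core ids, then apply the real-time policy to every core except core 0, which matches the "if (pmd->core_id)" check in the patch. The helper names are invented for this illustration and are not OVS functions:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the ids of the cores selected by 'mask' into 'cores' (at most
 * 'max_cores' entries) and returns how many were written.  Bit N of the
 * mask selects core N. */
static size_t
cores_from_mask(uint64_t mask, unsigned cores[], size_t max_cores)
{
    size_t n = 0;

    for (unsigned bit = 0; bit < 64 && n < max_cores; bit++) {
        if (mask & (UINT64_C(1) << bit)) {
            cores[n++] = bit;
        }
    }
    return n;
}

/* Per the patch, a pmd thread on core 0 keeps the default policy. */
static bool
core_gets_realtime_policy(unsigned core_id)
{
    return core_id != 0;
}
```

For pmd-cpu-mask=6 (binary 110) this selects cores 1 and 2, both of which would get SCHED_RR. For pmd-cpu-mask=1 it selects only core 0, which would not.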

To reproduce, use the following commands:

ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6
taskset 0x2 cat /dev/zero > /dev/null &

Signed-off-by: Bhanuprakash Bodireddy 
---
 lib/dpif-netdev.c |  9 +
 lib/netdev-dpdk.c | 14 ++
 lib/netdev-dpdk.h |  1 +
 3 files changed, 24 insertions(+)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 1e8a37c..4a46816 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2670,6 +2670,15 @@ pmd_thread_main(void *f_)
 /* Stores the pmd thread's 'pmd' to 'per_pmd_key'. */
 ovsthread_setspecific(pmd->dp->per_pmd_key, pmd);
 pmd_thread_setaffinity_cpu(pmd->core_id);
+
+#ifdef DPDK_NETDEV
+/* Set pmd thread's scheduling policy to SCHED_RR and priority to
+ * highest priority of SCHED_RR policy, In absence of pmd-cpu-mask (or)
+ * pmd-cpu-mask=1, default scheduling policy and priority shall
+ * apply to pmd thread */
+ if (pmd->core_id)
+pmd_thread_setpriority();
+#endif
 reload:
 emc_cache_init(&pmd->flow_cache);
 
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 208c5f5..6518c87 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2926,6 +2926,20 @@ pmd_thread_setaffinity_cpu(unsigned cpu)
 return 0;
 }
 
+void
+pmd_thread_setpriority(void)
+{
+struct sched_param threadparam;
+int err;
+
+memset(&threadparam, 0, sizeof(threadparam));
+threadparam.sched_priority = sched_get_priority_max(SCHED_RR);
+err = pthread_setschedparam(pthread_self(), SCHED_RR, &threadparam);
+if (err) {
+VLOG_WARN("Thread priority error %d",err);
+}
+}
+
 static bool
 dpdk_thread_is_pmd(void)
 {
diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
index 646d3e2..168673b 100644
--- a/lib/netdev-dpdk.h
+++ b/lib/netdev-dpdk.h
@@ -26,6 +26,7 @@ int dpdk_init(int argc, char **argv);
 void netdev_dpdk_register(void);
 void free_dpdk_buf(struct dp_packet *);
 int pmd_thread_setaffinity_cpu(unsigned cpu);
+void pmd_thread_setpriority(void);
 
 #else
 
-- 
2.4.11


Re: [ovs-dev] [ovs-dev,RFC] ovn: Add support for DSCP marking

2016-04-21 Thread Ryan Moats


> --- Original Message ---
> Added an additional option 'dscp_code' for VMI Logical_Ports in addition to the
> ingress_policing_rate and burst in the OVN Northbound database.
>
> Also in the controller, replaced the earlier approach of setting the rate and
> burst parameters in the Interface table with the Port table's qos parameter
> (using the default queue). In this patch, 'linux-htb' is used as a
> fixed QoS type.
>
> Signed-off-by: Babu Shanmugam 

Acked-by: Ryan Moats 

[ovs-dev] netdev-linux: Fix ingress policing burst rate configuration via tc

2016-04-21 Thread Ryan Moats


> --- Original Message ---
> The tc_police structure was filled with a value calculated in bits
> instead of bytes, while bytes were expected. This led to a burst
> value 8 times higher than intended.
>
> Documentation and defaults have been corrected accordingly to minimize
> disruption for users sticking to the defaults.
>
> The suggested burst value is now 80% of policing rate to make sure
> TCP works correctly.
>
> Signed-off-by: Miguel Angel Ajo 
> Tested-by: Miguel Angel Ajo 

I guess it would have been nice to have a unit test for this, but it does
work and makes sense, so...

Acked-by: Ryan Moats 
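
The unit mix-up described in the quoted commit message can be illustrated with a small sketch. It assumes the configured burst is given in kilobits and that one kilobit is 1000 bits; the actual multiplier used by netdev-linux is not shown in this thread, and the function names are hypothetical:

```c
#include <stdint.h>

/* Assumption for this sketch only: 1 kilobit = 1000 bits. */
#define BITS_PER_KBIT 1000u

/* The value that was being passed on: a burst expressed in bits. */
static uint64_t
burst_kbits_to_bits(uint64_t kbits)
{
    return kbits * BITS_PER_KBIT;
}

/* The value that is expected: the same burst expressed in bytes. */
static uint64_t
burst_kbits_to_bytes(uint64_t kbits)
{
    return kbits * BITS_PER_KBIT / 8;
}

/* Suggested burst per the commit message: 80% of the policing rate. */
static uint64_t
suggested_burst_kbits(uint64_t rate_kbps)
{
    return rate_kbps * 80 / 100;
}
```

Passing the first value where the second is expected yields a burst exactly 8 times too large, which is the factor the commit message reports.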

Re: [ovs-dev] [dpdk-dev] Memory leak when adding/removing vhost_user ports

2016-04-21 Thread Christian Ehrhardt

Thanks Ilya,
yeah we usually wait for the point releases as they undergo some extra
testing and verification.
The .1 release shouldn't be too far in the future, I guess.
Thanks a lot for identifying it.

That said, I'd still go on with Yuanhan to finalize the dpdk side leak fix
we identified, so we eventually get it committed.
So Yuanhan, what do you think of my last revised version of your patch for
upstream DPDK (there with the vhost_destroy_device then)?
I mean it is essentially your patch plus a bit of polishing, not mine so I
don't feel entitled to submit it as mine :-)

Kind Regards,
Christian


Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

On Thu, Apr 21, 2016 at 1:01 PM, Ilya Maximets 
wrote:

> Hi, Christian.
> You're likely using the tar archive of openvswitch from openvswitch.org.
> Unfortunately, it doesn't contain many of the bug fixes from git/branch-2.5.
>
> The problem that you are facing has been solved in branch-2.5 by
>
> commit d9df7b9206831631ddbd90f9cbeef1b4fc5a8e89
> Author: Ilya Maximets 
> Date:   Thu Mar 3 11:30:06 2016 +0300
>
> netdev-dpdk: Fix memory leak in netdev_dpdk_vhost_destruct().
>
> Fixes: 4573fbd38fa1 ("netdev-dpdk: Add vhost-user multiqueue support")
> Signed-off-by: Ilya Maximets 
> Acked-by: Flavio Leitner 
> Acked-by: Daniele Di Proietto 
>
> Best regards, Ilya Maximets.
>
> > I assume there is a leak somewhere on adding/removing vhost_user ports.
> > Although it could also be "only" a fragmentation issue.
> >
> > Reproduction is easy:
> > I set up a pair of nicely working OVS-DPDK connected KVM Guests.
> > Then in a loop I
> >- add up to 512 more ports
> >- test connectivity between the two guests
> >- remove up to 512 ports
> >
> > Depending on memory and the amount of multiqueue/rxq I use it seems to
> > slightly change when exactly it breaks. But for my default setup of 4
> > queues and 5G Hugepages initialized by DPDK it always breaks at the sixth
> > iteration.
> > Here a link to the stack trace indicating a memory shortage (TBC):
> > https://launchpadlibrarian.net/253916410/apport-retrace.log
> >
> > Known Todos:
> > - I want to track it down more, and will try to come up with a non
> > openvswitch based looping testcase that might show it as well to simplify
> > debugging.
> > - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK 16.04 and
> > Openvswitch master is planned.
> >
> > I will go on debugging this and let you know, but I wanted to give a heads
> > up to everyone.
> > In case this is a known issue for some of you please let me know.
> >
> > Kind Regards,
> > Christian Ehrhardt
> > Software Engineer, Ubuntu Server
> > Canonical Ltd
> >
> > P.S. I think it is a dpdk issue, but adding Daniele on CC to represent
> > ovs-dpdk as well.
>

Re: [ovs-dev] [ovs-dev, 1/2] ofproto-dpif: Rename "recurse" to "indentation".

2016-04-21 Thread Ryan Moats


> --- Original Message ---
> The "recurse" member of struct xlate_in and struct xlate_ctx is used for
> two purposes: to determine the amount of indentation in "ofproto/trace"
> output and to limit the depth of recursion.  An upcoming commit will
> separate these tasks, and so in preparation this commit renames "recurse"
> to "indentation".
>
> Signed-off-by: Ben Pfaff 

Ben, I appear to be getting an error when I try to compile this:

ofproto/ofproto-dpif-xlate.c:3292:13: error: too many arguments to function
'xlate_recursively'
 xlate_recursively(ctx, rule, table_id <= old_table_id);
 ^
ofproto/ofproto-dpif-xlate.c:3214:1: note: declared here
 xlate_recursively(struct xlate_ctx *ctx, struct rule_dpif *rule)

Checking the patch, I see where you add the line at 3292, but I don't see
where the method signature gets changed...

I'm going to hold off on part 2 until we clear this part up...

Ryan Moats

[ovs-dev] Reply: Reply: Reply: ovs + dpdk vhost-user match flows but cannot execute actions

2016-04-21 Thread lifuqiong

Hi Volkan,
Thank you.
Promiscuous mode: enabled.

port info detail:
testpmd> show port info 0

* Infos for port 0  *
MAC address: A0:36:9F:09:36:C0
Connect to socket: 0
memory allocation on the socket: 0
Link status: up
Link speed: 100 Mbps
Link duplex: full-duplex
Promiscuous mode: enabled
Allmulticast mode: disabled
Maximum number of MAC addresses: 32
Maximum number of MAC addresses of hash filtering: 0

Is there something wrong? 


-Original Message-
From: Ali Volkan Atli [mailto:volkan.a...@argela.com.tr] 
Sent: April 21, 2016 19:16
To: lifuqiong
Cc: dev@openvswitch.org
Subject: RE: [ovs-dev] Reply: Reply: ovs + dpdk vhost-user match flows but
cannot execute actions


Hi

please check "promiscuous mode" by using the show command as follows:

testpmd> show port info 0

- Volkan

From: dev [dev-boun...@openvswitch.org] on behalf of lifuqiong
[lifuqi...@cncloudsec.com]
Sent: Thursday, April 21, 2016 11:26 AM
To: 'Mauricio Vásquez'
Cc: dev@openvswitch.org
Subject: [ovs-dev] Reply: Reply: ovs + dpdk vhost-user match flows but
cannot execute actions

Hi Mauricio Vasquez:

 When I executed the "stop" command, I got the output below. Is there something wrong?

Thank you.



testpmd> stop

Telling cores to stop...

Waiting for lcores to finish...



  -- Forward statistics for port 0  --

  RX-packets: 0  RX-dropped: 1353  RX-total: 1353

  RX-error: 1434

  RX-nombufs: 0

  TX-packets: 0  TX-dropped: 0 TX-total: 0

 




  +++ Accumulated forward statistics for all ports+++

  RX-packets: 0  RX-dropped: 1353  RX-total: 1353

  TX-packets: 0  TX-dropped: 0 TX-total: 0

 






From: lifuqiong [mailto:lifuqi...@cncloudsec.com]
Sent: April 21, 2016 16:22
To: 'Mauricio Vásquez'
Cc: 'dev@openvswitch.org'
Subject: Reply: [ovs-dev] ovs + dpdk vhost-user match flows but cannot execute
actions



Hi Mauricio Vasquez:

 Thank you for the advice.

 I connected the DPDK NIC directly to the outside PC, but things still do not
work.

 While using test_pmd to test my NIC, I executed:

testpmd> start

 After a while, the port stats show a lot of missed and error packets.

My NIC is an Intel I350, and the kernel command line is as follows:

cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.16.0-30-generic.efi.signed
root=/dev/mapper/host52--vg-root ro default_hugepagesz=1GB hugepagesz=1G
hugepages=8 iommu=pt intel_iommu=on



Is there something suspicious?



show port stats all



   NIC statistics for port 0


  RX-packets: 211RX-missed: 1062   RX-bytes:  27840

  RX-errors: 1062

  RX-nombuf:  0

  TX-packets: 0  TX-errors: 0  TX-bytes:  0

 




   NIC statistics for port 1


  RX-packets: 211RX-missed: 1062   RX-bytes:  27865

  RX-errors: 1062

  RX-nombuf:  0

  TX-packets: 0  TX-errors: 0  TX-bytes:  0

 






Thank you

LiFuqiong

From: Mauricio Vásquez [mailto:mauricio.vasquezber...@studenti.polito.it]
Sent: April 20, 2016 20:01
To: lifuqiong
Cc: dev@openvswitch.org
Subject: Re: Reply: [ovs-dev] ovs + dpdk vhost-user match flows but cannot
execute actions



The problem appears to be outside ovs.
Somewhere you wrote that you are using a physical switch. Could you remove
that and connect the outside PC directly to the dpdk port?



If things still do not work, I would suggest using the test_pmd
(http://dpdk.org/doc/guides/testpmd_app_ug/index.html) application to test
your card.

Once again, please let me know the results.

Mauricio Vasquez,



On Wed, Apr 20, 2016 at 1:01 PM, lifuqiong  wrote:

Hi Mauricio Vasquez:

Thank you for your good advice.

I changed my environment as you mentioned and tried to ping from the VM to the PC,
but they still cannot ping each other.

The result is still that my DPDK physical NIC drops packets. Is my NIC the
problem?



Here is my configuration:

1.  ovs-ofctl dump-flows ovsbr0

NXST_FLOW reply (xid=0x4):

cookie=0x0, duration=364.902s, table=0, n_packets=0, n_bytes=0,
idle_age=364, in_port=1 actions=output:3

cookie=0x0, duration=356.014s, table=0, n_packets=191, n_bytes=8334,
idle_age=1, in_port=3 actions=output:1



2.  ovs-ofctl dump-ports ovsbr0

OFPST_PORT reply (xid=0x2): 3 ports

  port LOCAL: rx pkts=23, bytes=1278, drop=0, errs=0, frame=0, over=0, crc=0

   tx pkts=631, bytes=28750, drop=0, errs=0, coll=0

  port  1: rx pkts=272, bytes=26283, drop=188024, errs=0, frame=?,

[ovs-dev] [PATCH] netdev-dpdk: Add vHost User PMD

2016-04-21 Thread Ciara Loftus

DPDK 16.04 introduces the vHost PMD which allows 'dpdkvhostuser' ports
to be controlled by the librte_ether API, like physical 'dpdk' ports.
The commit integrates this functionality into OVS, and refactors some
of the existing vhost code such that it is vhost-cuse specific.
Similarly, there is now some overlap between dpdk and vhost-user port
code.
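
The INSTALL.DPDK.md text added by this patch notes that dpdk, dpdkr and dpdkvhostuser ports together are capped by CONFIG_RTE_MAX_ETHPORTS (32 by default), and the code adds an array that tracks used and unused vHost user driver IDs. A minimal sketch of such a fixed-size ID tracker is shown below. It is an assumption about the general shape only, not the implementation in this patch, and MAX_ETHPORTS, id_alloc and id_free are names invented here:

```c
#include <stdbool.h>

/* Stand-in for RTE_MAX_ETHPORTS; 32 is the default quoted in the patch. */
#define MAX_ETHPORTS 32

/* id_in_use[i] is true while driver ID 'i' is assigned to a port. */
static bool id_in_use[MAX_ETHPORTS];

/* Returns the lowest free ID and marks it used, or -1 if all
 * MAX_ETHPORTS IDs are taken. */
static int
id_alloc(void)
{
    for (int i = 0; i < MAX_ETHPORTS; i++) {
        if (!id_in_use[i]) {
            id_in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Releases 'id' so that a later id_alloc() can hand it out again. */
static void
id_free(int id)
{
    if (id >= 0 && id < MAX_ETHPORTS) {
        id_in_use[id] = false;
    }
}
```

Once every slot is taken, a further allocation fails, which is how the combined port limit would surface to a caller.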

Signed-off-by: Ciara Loftus 
---
 INSTALL.DPDK.md   |  12 ++
 NEWS  |   2 +
 lib/netdev-dpdk.c | 515 +-
 3 files changed, 254 insertions(+), 275 deletions(-)
 mode change 100644 => 100755 lib/netdev-dpdk.c

diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md
index 7f76df8..5006812 100644
--- a/INSTALL.DPDK.md
+++ b/INSTALL.DPDK.md
@@ -945,6 +945,18 @@ Restrictions:
 increased to the desired number of queues. Both DPDK and OVS must be
 recompiled for this change to take effect.
 
+  DPDK 'eth' type ports:
+  - dpdk, dpdkr and dpdkvhostuser ports are 'eth' type ports in the context of
+DPDK as they are all managed by the rte_ether API. This means that they
+adhere to the DPDK configuration option CONFIG_RTE_MAX_ETHPORTS which by
+default is set to 32. This means by default the combined total number of
+dpdk, dpdkr and dpdkvhostuser ports allowable in OVS with DPDK is 32. This
+value can be changed if desired by modifying the configuration file in
+DPDK, or by overriding the default value on the command line when building
+DPDK. eg.
+
+`make install CONFIG_RTE_MAX_ETHPORTS=64`
+
 Bug Reporting:
 --
 
diff --git a/NEWS b/NEWS
index ea7f3a1..4dc0201 100644
--- a/NEWS
+++ b/NEWS
@@ -26,6 +26,8 @@ Post-v2.5.0
assignment.
  * Type of log messages from PMD threads changed from INFO to DBG.
  * QoS functionality with sample egress-policer implementation.
+ * vHost PMD integration brings vhost-user ports under control of the
+   rte_ether DPDK API.
- ovs-benchmark: This utility has been removed due to lack of use and
  bitrot.
- ovs-appctl:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
old mode 100644
new mode 100755
index 208c5f5..4fccd63
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -56,6 +56,7 @@
 #include "rte_mbuf.h"
 #include "rte_meter.h"
 #include "rte_virtio_net.h"
+#include "rte_eth_vhost.h"
 
 VLOG_DEFINE_THIS_MODULE(dpdk);
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
@@ -109,6 +110,8 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / 
ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF))
 
 static char *cuse_dev_name = NULL;/* Character device cuse_dev_name. */
 static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
+/* Array that tracks the used & unused vHost user driver IDs */
+static unsigned int vhost_user_drv_ids[RTE_MAX_ETHPORTS];
 
 /*
  * Maximum amount of time in micro seconds to try and enqueue to vhost.
@@ -143,7 +146,8 @@ enum { DRAIN_TSC = 20ULL };
 
 enum dpdk_dev_type {
 DPDK_DEV_ETH = 0,
-DPDK_DEV_VHOST = 1,
+DPDK_DEV_VHOST_USER = 1,
+DPDK_DEV_VHOST_CUSE = 2,
 };
 
 static int rte_eal_init_ret = ENODEV;
@@ -275,8 +279,6 @@ struct dpdk_tx_queue {
 * from concurrent access.  It is used only
 * if the queue is shared among different
 * pmd threads (see 'txq_needs_locking'). */
-int map;   /* Mapping of configured vhost-user queues
-* to enabled by guest. */
 uint64_t tsc;
 struct rte_mbuf *burst_pkts[MAX_TX_QUEUE_LEN];
 };
@@ -329,12 +331,22 @@ struct netdev_dpdk {
 int real_n_rxq;
 bool txq_needs_locking;
 
-/* virtio-net structure for vhost device */
+/* Spinlock for vhost cuse transmission. Other DPDK devices use spinlocks
+ * in dpdk_tx_queue */
+rte_spinlock_t vhost_cuse_tx_lock;
+
+/* virtio-net structure for vhost cuse device */
 OVSRCU_TYPE(struct virtio_net *) virtio_dev;
 
 /* Identifier used to distinguish vhost devices from each other */
 char vhost_id[PATH_MAX];
 
+/* ID of vhost user port given to the PMD driver */
+unsigned int vhost_pmd_id;
+
+/* Number of virtqueue pairs reported by the guest */
+uint32_t reported_queues;
+
 /* In dpdk_list. */
 struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex);
 
@@ -352,13 +364,20 @@ struct netdev_rxq_dpdk {
 static bool dpdk_thread_is_pmd(void);
 
 static int netdev_dpdk_construct(struct netdev *);
+static int netdev_dpdk_vhost_user_construct(struct netdev *);
 
 struct virtio_net * netdev_dpdk_get_virtio(const struct netdev_dpdk *dev);
 
+void vring_state_changed_callback(uint8_t port_id,
+enum rte_eth_event_type type OVS_UNUSED, void *param OVS_UNUSED);
+void device_state_changed_callback(uint8_t port_id,
+enum rte_eth_event_type type OVS_UNUSED, void *param OVS_UNUSED);
+
 static bool
 is_dpdk_class(const struct

[ovs-dev] Delivery failed

2016-04-21 Thread Returned mail


Re: [ovs-dev] Reply: Reply: ovs + dpdk vhost-user match flows but cannot execute actions

2016-04-21 Thread Ali Volkan Atli


Hi

please check "promiscuous mode" by using the show command as follows:

testpmd> show port info 0

- Volkan

From: dev [dev-boun...@openvswitch.org] on behalf of lifuqiong 
[lifuqi...@cncloudsec.com]
Sent: Thursday, April 21, 2016 11:26 AM
To: 'Mauricio Vásquez'
Cc: dev@openvswitch.org
Subject: [ovs-dev] Reply: Reply: ovs + dpdk vhost-user match flows but cannot 
execute actions

Hi Mauricio Vasquez:

 When I executed the "stop" command, I got the output below. Is there something wrong?

Thank you.



testpmd> stop

Telling cores to stop...

Waiting for lcores to finish...



  -- Forward statistics for port 0  --

  RX-packets: 0  RX-dropped: 1353  RX-total: 1353

  RX-error: 1434

  RX-nombufs: 0

  TX-packets: 0  TX-dropped: 0 TX-total: 0

  



  +++ Accumulated forward statistics for all ports+++

  RX-packets: 0  RX-dropped: 1353  RX-total: 1353

  TX-packets: 0  TX-dropped: 0 TX-total: 0

  





From: lifuqiong [mailto:lifuqi...@cncloudsec.com]
Sent: April 21, 2016 16:22
To: 'Mauricio Vásquez'
Cc: 'dev@openvswitch.org'
Subject: Reply: [ovs-dev] ovs + dpdk vhost-user match flows but cannot execute 
actions



Hi Mauricio Vasquez:

 Thank you for the advice.

 I connected the DPDK NIC directly to the outside PC, but things still do not
work.

 While using test_pmd to test my NIC, I executed:

testpmd> start

 After a while, the port stats show a lot of missed and error packets.

My NIC is an Intel I350, and the kernel command line is as follows:

cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.16.0-30-generic.efi.signed 
root=/dev/mapper/host52--vg-root ro default_hugepagesz=1GB hugepagesz=1G 
hugepages=8 iommu=pt intel_iommu=on



Is there something suspicious?



show port stats all



   NIC statistics for port 0  

  RX-packets: 211RX-missed: 1062   RX-bytes:  27840

  RX-errors: 1062

  RX-nombuf:  0

  TX-packets: 0  TX-errors: 0  TX-bytes:  0

  



   NIC statistics for port 1  

  RX-packets: 211RX-missed: 1062   RX-bytes:  27865

  RX-errors: 1062

  RX-nombuf:  0

  TX-packets: 0  TX-errors: 0  TX-bytes:  0

  





Thank you

LiFuqiong

From: Mauricio Vásquez [mailto:mauricio.vasquezber...@studenti.polito.it]
Sent: April 20, 2016 20:01
To: lifuqiong
Cc: dev@openvswitch.org
Subject: Re: Reply: [ovs-dev] ovs + dpdk vhost-user match flows but cannot execute 
actions



The problem appears to be outside ovs.
Somewhere you wrote that you are using a physical switch. Could you remove that 
and connect the outside PC directly to the dpdk port?



If things still do not work, I would suggest using the test_pmd 
(http://dpdk.org/doc/guides/testpmd_app_ug/index.html) application to test your 
card.

Once again, please let me know the results.

Mauricio Vasquez,



On Wed, Apr 20, 2016 at 1:01 PM, lifuqiong  wrote:

Hi Mauricio Vasquez:

Thank you for your good advice.

I changed my environment as you mentioned and tried to ping from the VM to the PC, 
but they still cannot ping each other.

The result is still that my DPDK physical NIC drops packets. Is my NIC the problem?



Here is my configuration:

1.  ovs-ofctl dump-flows ovsbr0

NXST_FLOW reply (xid=0x4):

cookie=0x0, duration=364.902s, table=0, n_packets=0, n_bytes=0, idle_age=364, 
in_port=1 actions=output:3

cookie=0x0, duration=356.014s, table=0, n_packets=191, n_bytes=8334, 
idle_age=1, in_port=3 actions=output:1



2.  ovs-ofctl dump-ports ovsbr0

OFPST_PORT reply (xid=0x2): 3 ports

  port LOCAL: rx pkts=23, bytes=1278, drop=0, errs=0, frame=0, over=0, crc=0

   tx pkts=631, bytes=28750, drop=0, errs=0, coll=0

  port  1: rx pkts=272, bytes=26283, drop=188024, errs=0, frame=?, over=?, crc=?

   tx pkts=0, bytes=0, drop=0, errs=0, coll=?

  port  3: rx pkts=2110, bytes=192508, drop=?, errs=0, frame=?, over=?, crc=?

   tx pkts=2462, bytes=205658, drop=0, errs=?, coll=?



3.  ovs-ofctl show ovsbr0

OFPT_FEATURES_REPLY (xid=0x2): dpid:2c534a01cd5e

n_tables:254, n_buffers:256

capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP

actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src 
mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst

1(dpdk0): addr:2c:53:4a:01:cd:5e

 config: 0

 state:  0

 current:100MB-FD

 supported:  100MB-FD 1GB-HD 1GB-FD 10GB-FD AUTO_NEG AUTO_PAUSE_ASYM

Re: [ovs-dev] [dpdk-dev] Memory leak when adding/removing vhost_user ports

2016-04-21 Thread Ilya Maximets

Hi, Christian.
You're likely using the tar archive of openvswitch from openvswitch.org.
Unfortunately, it doesn't contain many of the bug fixes from git/branch-2.5.

The problem that you are facing has been solved in branch-2.5 by

commit d9df7b9206831631ddbd90f9cbeef1b4fc5a8e89
Author: Ilya Maximets 
Date:   Thu Mar 3 11:30:06 2016 +0300

netdev-dpdk: Fix memory leak in netdev_dpdk_vhost_destruct().

Fixes: 4573fbd38fa1 ("netdev-dpdk: Add vhost-user multiqueue support")
Signed-off-by: Ilya Maximets 
Acked-by: Flavio Leitner 
Acked-by: Daniele Di Proietto 

Best regards, Ilya Maximets.

> I assume there is a leak somewhere on adding/removing vhost_user ports.
> Although it could also be "only" a fragmentation issue.
> 
> Reproduction is easy:
> I set up a pair of nicely working OVS-DPDK connected KVM Guests.
> Then in a loop I
>- add up to 512 more ports
>- test connectivity between the two guests
>- remove up to 512 ports
> 
> Depending on memory and the amount of multiqueue/rxq I use it seems to
> slightly change when exactly it breaks. But for my default setup of 4
> queues and 5G Hugepages initialized by DPDK it always breaks at the sixth
> iteration.
> Here a link to the stack trace indicating a memory shortage (TBC):
> https://launchpadlibrarian.net/253916410/apport-retrace.log
> 
> Known Todos:
> - I want to track it down more, and will try to come up with a non
> openvswitch based looping testcase that might show it as well to simplify
> debugging.
> - in use were Openvswitch-dpdk 2.5 and DPDK 2.2; Retest with DPDK 16.04 and
> Openvswitch master is planned.
> 
> I will go on debugging this and let you know, but I wanted to give a heads
> up to everyone.
> In case this is a known issue for some of you please let me know.
> 
> Kind Regards,
> Christian Ehrhardt
> Software Engineer, Ubuntu Server
> Canonical Ltd
> 
> P.S. I think it is a dpdk issue, but adding Daniele on CC to represent
> ovs-dpdk as well.

[ovs-dev] Reply: Reply: ovs + dpdk vhost-user match flows but cannot execute actions

2016-04-21 Thread lifuqiong

Hi Mauricio Vasquez:

 When I executed the "stop" command, I got the output below. Is there something wrong?

Thank you.

 

testpmd> stop
Telling cores to stop...
Waiting for lcores to finish...

  -- Forward statistics for port 0  --
  RX-packets: 0          RX-dropped: 1353      RX-total: 1353
  RX-error: 1434
  RX-nombufs: 0
  TX-packets: 0          TX-dropped: 0         TX-total: 0

  +++ Accumulated forward statistics for all ports+++
  RX-packets: 0          RX-dropped: 1353      RX-total: 1353
  TX-packets: 0          TX-dropped: 0         TX-total: 0

 

 

From: lifuqiong [mailto:lifuqi...@cncloudsec.com]
Sent: April 21, 2016 16:22
To: 'Mauricio Vásquez'
Cc: 'dev@openvswitch.org'
Subject: Reply: [ovs-dev] ovs + dpdk vhost-user match flows but cannot execute 
actions

 

Hi Mauricio Vasquez:

 Thank you for the advice.

 I connected the dpdk NIC directly to the outside PC, but things still do not 
work.

 I then used test_pmd to test my NIC and executed:

testpmd> start

 After a while, the port stats show a lot of missed and error packets.

My NIC is an Intel I350.

The kernel command line is as follows:

cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.16.0-30-generic.efi.signed 
root=/dev/mapper/host52--vg-root ro default_hugepagesz=1GB hugepagesz=1G 
hugepages=8 iommu=pt intel_iommu=on

 

Is there something suspicious? 

 

show port stats all

 

   NIC statistics for port 0  

  RX-packets: 211        RX-missed: 1062       RX-bytes:  27840
  RX-errors: 1062
  RX-nombuf:  0
  TX-packets: 0          TX-errors: 0          TX-bytes:  0

  NIC statistics for port 1
  RX-packets: 211        RX-missed: 1062       RX-bytes:  27865
  RX-errors: 1062
  RX-nombuf:  0
  TX-packets: 0          TX-errors: 0          TX-bytes:  0

  

 

 

Thank you

LiFuqiong 

From: Mauricio Vásquez [mailto:mauricio.vasquezber...@studenti.polito.it]
Sent: April 20, 2016 20:01
To: lifuqiong
Cc: dev@openvswitch.org
Subject: Re: Reply: [ovs-dev] ovs + dpdk vhost-user match flows but cannot execute 
actions

 

The problem appears to be outside ovs.
Somewhere you wrote that you are using a physical switch. Could you remove that 
and connect the outside PC directly to the dpdk port?

 

If things still do not work, I would suggest using the test_pmd 
(http://dpdk.org/doc/guides/testpmd_app_ug/index.html) application to test your 
card.

Once again, please let me know the results.

Mauricio Vasquez, 

 

On Wed, Apr 20, 2016 at 1:01 PM, lifuqiong  wrote:

Hi Mauricio Vasquez:

Thank you for your good advice.

I changed my environment as you mentioned and tried to ping from the VM to the PC, 
but they still cannot ping each other.

As a result, my dpdk physical NIC is still dropping packets. Is this a problem with my NIC?

 

Here is my configuration:

1.  ovs-ofctl dump-flows ovsbr0

NXST_FLOW reply (xid=0x4):

cookie=0x0, duration=364.902s, table=0, n_packets=0, n_bytes=0, idle_age=364, 
in_port=1 actions=output:3

cookie=0x0, duration=356.014s, table=0, n_packets=191, n_bytes=8334, 
idle_age=1, in_port=3 actions=output:1

 

2.  ovs-ofctl dump-ports ovsbr0

OFPST_PORT reply (xid=0x2): 3 ports

  port LOCAL: rx pkts=23, bytes=1278, drop=0, errs=0, frame=0, over=0, crc=0

   tx pkts=631, bytes=28750, drop=0, errs=0, coll=0

  port  1: rx pkts=272, bytes=26283, drop=188024, errs=0, frame=?, over=?, crc=?

   tx pkts=0, bytes=0, drop=0, errs=0, coll=?

  port  3: rx pkts=2110, bytes=192508, drop=?, errs=0, frame=?, over=?, crc=?

   tx pkts=2462, bytes=205658, drop=0, errs=?, coll=?

 

3.  ovs-ofctl show ovsbr0

OFPT_FEATURES_REPLY (xid=0x2): dpid:2c534a01cd5e

n_tables:254, n_buffers:256

capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP

actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src 
mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst

1(dpdk0): addr:2c:53:4a:01:cd:5e

 config: 0

 state:  0

 current:100MB-FD

 supported:  100MB-FD 1GB-HD 1GB-FD 10GB-FD AUTO_NEG AUTO_PAUSE_ASYM

 speed: 100 Mbps now, 1 Mbps max

3(vhost-user-0): addr:00:00:00:00:00:00

 config: PORT_DOWN

 state:  LINK_DOWN

 speed: 0 Mbps now, 0 Mbps max

LOCAL(ovsbr0): addr:2c:53:4a:01:cd:5e

 config: 0

 state:  0

 current:10MB-FD COPPER

 speed: 10 Mbps now, 0 Mbps max

OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

From:

[ovs-dev] Reply: ovs + dpdk vhost-user match flows but cannot execute actions

2016-04-21 Thread lifuqiong

Hi Mauricio Vasquez:

 Thank you for the advice.

 I connected the dpdk NIC directly to the outside PC, but things still do not 
work.

 I then used test_pmd to test my NIC and executed:

testpmd> start

 After a while, the port stats show a lot of missed and error packets.

My NIC is an Intel I350.

The kernel command line is as follows:

cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-3.16.0-30-generic.efi.signed 
root=/dev/mapper/host52--vg-root ro default_hugepagesz=1GB hugepagesz=1G 
hugepages=8 iommu=pt intel_iommu=on

 

Is there something suspicious? 

 

show port stats all

 

   NIC statistics for port 0  

  RX-packets: 211        RX-missed: 1062       RX-bytes:  27840
  RX-errors: 1062
  RX-nombuf:  0
  TX-packets: 0          TX-errors: 0          TX-bytes:  0

  NIC statistics for port 1
  RX-packets: 211        RX-missed: 1062       RX-bytes:  27865
  RX-errors: 1062
  RX-nombuf:  0
  TX-packets: 0          TX-errors: 0          TX-bytes:  0

  

 

 

Thank you

LiFuqiong 

From: Mauricio Vásquez [mailto:mauricio.vasquezber...@studenti.polito.it]
Sent: April 20, 2016 20:01
To: lifuqiong
Cc: dev@openvswitch.org
Subject: Re: Reply: [ovs-dev] ovs + dpdk vhost-user match flows but cannot execute 
actions

 

The problem appears to be outside ovs.
Somewhere you wrote that you are using a physical switch. Could you remove that 
and connect the outside PC directly to the dpdk port?

 

If things still do not work, I would suggest using the test_pmd 
(http://dpdk.org/doc/guides/testpmd_app_ug/index.html) application to test your 
card.

Once again, please let me know the results.

Mauricio Vasquez, 

 

On Wed, Apr 20, 2016 at 1:01 PM, lifuqiong  wrote:

Hi Mauricio Vasquez:

Thank you for your good advice.

I changed my environment as you mentioned and tried to ping from the VM to the PC, 
but they still cannot ping each other.

As a result, my dpdk physical NIC is still dropping packets. Is this a problem with my NIC?

 

Here is my configuration:

1.  ovs-ofctl dump-flows ovsbr0

NXST_FLOW reply (xid=0x4):

cookie=0x0, duration=364.902s, table=0, n_packets=0, n_bytes=0, idle_age=364, 
in_port=1 actions=output:3

cookie=0x0, duration=356.014s, table=0, n_packets=191, n_bytes=8334, 
idle_age=1, in_port=3 actions=output:1

 

2.  ovs-ofctl dump-ports ovsbr0

OFPST_PORT reply (xid=0x2): 3 ports

  port LOCAL: rx pkts=23, bytes=1278, drop=0, errs=0, frame=0, over=0, crc=0

   tx pkts=631, bytes=28750, drop=0, errs=0, coll=0

  port  1: rx pkts=272, bytes=26283, drop=188024, errs=0, frame=?, over=?, crc=?

   tx pkts=0, bytes=0, drop=0, errs=0, coll=?

  port  3: rx pkts=2110, bytes=192508, drop=?, errs=0, frame=?, over=?, crc=?

   tx pkts=2462, bytes=205658, drop=0, errs=?, coll=?

 

3.  ovs-ofctl show ovsbr0

OFPT_FEATURES_REPLY (xid=0x2): dpid:2c534a01cd5e

n_tables:254, n_buffers:256

capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP

actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src 
mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst

1(dpdk0): addr:2c:53:4a:01:cd:5e

 config: 0

 state:  0

 current:100MB-FD

 supported:  100MB-FD 1GB-HD 1GB-FD 10GB-FD AUTO_NEG AUTO_PAUSE_ASYM

 speed: 100 Mbps now, 1 Mbps max

3(vhost-user-0): addr:00:00:00:00:00:00

 config: PORT_DOWN

 state:  LINK_DOWN

 speed: 0 Mbps now, 0 Mbps max

LOCAL(ovsbr0): addr:2c:53:4a:01:cd:5e

 config: 0

 state:  0

 current:10MB-FD COPPER

 speed: 10 Mbps now, 0 Mbps max

OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

From: Mauricio Vásquez [mailto:mauricio.vasquezber...@studenti.polito.it]
Sent: April 20, 2016 18:00
To: lifuqiong
Cc: dev@openvswitch.org
Subject: Re: Reply: Reply: [ovs-dev] ovs + dpdk vhost-user match flows but cannot 
execute actions

 

Hi lifuqion,

It appears to me that the problem is that port1 is dropping the packets. I 
would suggest creating a simplified setup:

- One dpdk physical NIC

- One vhost-user port

- Remove the NORMAL flow
ovs-ofctl del-flows ovsbr0

- Add a pair of flows between the physical NIC and the VM (please check the 
openflow numbers of the ports, you can use the ovs-ofctl show ovsbr0 command)
ovs-ofctl add-flow ovsbr0 "in_port=1 action=output:2"
ovs-ofctl add-flow ovsbr0 "in_port=2 action=output:1"

Try to ping, then use the dump-flows and dump-ports commands to view the packet 
counters.

Please let me know the results of that test.

Mauricio Vasquez, 

 

On Wed, Apr 20, 2016 at 2:30 AM, lifuqiong  wrote:

Hi Mauricio Vasquez:

 

root@host52:~# ovs-vsctl show

[ovs-dev] [PATCH v3] Add VxLAN-GBP support for user space data path

2016-04-21 Thread Johnson.Li

From: Johnson Li 

In user space, only standard VxLAN was supported. This patch adds
VxLAN-GBP support to the user space data path.

How to use:
1> Create VxLAN port with GBP extension
  $ovs-vsctl add-port br-int vxlan0 -- set interface vxlan0 \
   type=vxlan options:dst_port=4789 \
   options:remote_ip=192.168.60.22 \
   options:key=1000 options:exts=gbp
2> Add flow for transmitting
  $ovs-ofctl add-flow br-int "table=0, priority=260, \
 in_port=LOCAL actions=load:0x100->NXM_NX_TUN_GBP_ID[], \
 output:1"
3> Add flow for receiving
  $ovs-ofctl add-flow br-int "table=0, priority=260, \
 in_port=1,tun_gbp_id=0x100 actions=output:LOCAL"

Check data path flow rules:
$ovs-appctl dpif/dump-flows br-int
  recirc_id(0),in_port(1),eth_type(0x0800),ipv4(tos=0/0x3,frag=no),
  packets:0, bytes:0, used:never, actions:tnl_push(tnl_port(2),
  header(size=50,type=4,eth(dst=90:e2:ba:48:7f:a4,src=90:e2:ba:48:7e:1c,
  dl_type=0x0800),ipv4(src=192.168.60.21,dst=192.168.60.22,proto=17,
  tos=0,ttl=64,frag=0x4000),udp(src=0,dst=4789,csum=0x0),
  vxlan(flags=0x88000100,vni=0x3e8)),out_port(3))
  tunnel(tun_id=0x3e8,src=192.168.60.22,dst=192.168.60.21,
  vxlan(gbp(id=256)),flags(-df-csum+key)),skb_mark(0),recirc_id(0),
  in_port(2),eth(dst=ae:1b:ed:1e:e3:4e),eth_type(0x0800),
  ipv4(dst=172.168.60.21,proto=1/0x10,frag=no), packets:0, bytes:0,
  used:never, actions:1
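
For readers following the flag values in the dump above, here is a minimal
sketch of how the first VXLAN-GBP header word can be packed and unpacked in
host byte order. The macro names are the ones used in the patch, but their
numeric values are an assumption derived from the header diagram (G is the
most significant bit, I is bit 4, D is bit 9, A is bit 12, and the Group
Policy ID occupies the low 16 bits). The helper functions gbp_pack_flags(),
gbp_unpack_flags() and vxlan_pack_vni() are hypothetical and are not part of
the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, derived from the header diagram (bit 0 = MSB). */
#define VXLAN_HF_GBP             UINT32_C(0x80000000) /* G: GBP present. */
#define VXLAN_HF_VNI             UINT32_C(0x08000000) /* I: VNI present. */
#define VXLAN_GBP_DONT_LEARN     UINT32_C(0x00400000) /* D bit. */
#define VXLAN_GBP_POLICY_APPLIED UINT32_C(0x00080000) /* A bit. */
#define VXLAN_GBP_ID_MASK        UINT32_C(0x0000FFFF) /* Group Policy ID. */

/* Hypothetical helper: builds the first header word in host byte order.
 * The caller would convert it with htonl() before writing it out. */
static uint32_t
gbp_pack_flags(uint16_t gbp_id, bool dont_learn, bool policy_applied)
{
    uint32_t flags = VXLAN_HF_VNI | VXLAN_HF_GBP | gbp_id;

    if (dont_learn) {
        flags |= VXLAN_GBP_DONT_LEARN;
    }
    if (policy_applied) {
        flags |= VXLAN_GBP_POLICY_APPLIED;
    }
    return flags;
}

/* Hypothetical helper: extracts the policy ID and the D/A bits from a
 * host-order header word.  Returns false if the G bit is not set. */
static bool
gbp_unpack_flags(uint32_t flags, uint16_t *gbp_id, uint8_t *gbp_flags)
{
    if (!(flags & VXLAN_HF_GBP)) {
        return false;
    }
    *gbp_id = flags & VXLAN_GBP_ID_MASK;
    *gbp_flags = (flags & (VXLAN_GBP_DONT_LEARN
                           | VXLAN_GBP_POLICY_APPLIED)) >> 16;
    return true;
}

/* Hypothetical helper: the 24-bit VNI sits in the upper three bytes of
 * the second header word; the low byte is reserved. */
static uint32_t
vxlan_pack_vni(uint32_t vni)
{
    return (vni & UINT32_C(0x00FFFFFF)) << 8;
}
```

Under these assumptions, gbp_pack_flags(0x100, false, false) gives
0x88000100, which matches the flags=0x88000100 value shown for group policy
ID 0x100 in the dump, and vxlan_pack_vni(0x3e8) gives 0x0003e800 for
key=1000.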

---
Change Log:
  v3: Change Macro definition, add more comments, add unit test.
  v2: Set Each enabled bit for the VxLAN-GBP.

Signed-off-by: Johnson Li 

diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index e398562..a7b5923 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -1286,6 +1286,7 @@ netdev_vxlan_pop_header(struct dp_packet *packet)
 struct flow_tnl *tnl = &md->tunnel;
 struct vxlanhdr *vxh;
 unsigned int hlen;
+uint32_t vxh_flags;
 
 pkt_metadata_init_tnl(md);
 if (VXLAN_HLEN > dp_packet_l4_size(packet)) {
@@ -1297,7 +1298,22 @@ netdev_vxlan_pop_header(struct dp_packet *packet)
 return EINVAL;
 }
 
-if (get_16aligned_be32(&vxh->vx_flags) != htonl(VXLAN_FLAGS) ||
+/* VXLAN protocol header:
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * |G|R|R|R|I|R|R|C|   Reserved|
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *
+ * G = 1   Group Policy (VXLAN-GBP)
+ * I = 1   VXLAN Network Identifier (VNI) present
+ */
+vxh_flags = htonl(get_16aligned_be32(&vxh->vx_flags));
+if (vxh_flags & VXLAN_HF_GBP) {
+tnl->gbp_id = ntohs(vxh_flags & VXLAN_GBP_ID_MASK);
+vxh_flags &= (VXLAN_GBP_DONT_LEARN | VXLAN_GBP_POLICY_APPLIED);
+tnl->gbp_flags = vxh_flags >> 16;
+}
+
+if (!(get_16aligned_be32(&vxh->vx_flags) & htonl(VXLAN_HF_VNI)) ||
 (get_16aligned_be32(&vxh->vx_vni) & htonl(0xff))) {
 VLOG_WARN_RL(&err_rl, "invalid vxlan flags=%#x vni=%#x\n",
  ntohl(get_16aligned_be32(&vxh->vx_flags)),
@@ -1312,6 +1328,27 @@ netdev_vxlan_pop_header(struct dp_packet *packet)
 return 0;
 }
 
+static void
+netdev_vxlan_build_gbp_header(struct vxlanhdr *vxh,
+  const struct flow *tnl_flow)
+{
+/* VNI must be valid, so I bit should be set. */
+uint32_t vxh_flags = VXLAN_HF_VNI;
+
+/* G bit to indicates that the source TSI Group membership is being
+ * carried within the Group Policy ID field. */
+vxh_flags |= VXLAN_HF_GBP;
+
+/* Only D bit and A bit are valid in gbp_flags. Other bits which are
+ * set should be ignored. */
+vxh_flags |= (tnl_flow->tunnel.gbp_flags << 16)
+  & (VXLAN_GBP_DONT_LEARN | VXLAN_GBP_POLICY_APPLIED);
+
+vxh_flags |= ntohs(tnl_flow->tunnel.gbp_id);
+
+put_16aligned_be32(&vxh->vx_flags, htonl(vxh_flags));
+}
+
 static int
 netdev_vxlan_build_header(const struct netdev *netdev,
   struct ovs_action_push_tnl *data,
@@ -1328,7 +1365,11 @@ netdev_vxlan_build_header(const struct netdev *netdev,
 
 vxh = udp_build_header(tnl_cfg, tnl_flow, data, &hash);
 
-put_16aligned_be32(&vxh->vx_flags, htonl(VXLAN_FLAGS));
+if (tnl_cfg->exts & (1 << OVS_VXLAN_EXT_GBP)) {
+netdev_vxlan_build_gbp_header(vxh, tnl_flow);
+} else {
+put_16aligned_be32(&vxh->vx_flags, htonl(VXLAN_HF_VNI));
+}
 put_16aligned_be32(&vxh->vx_vni, htonl(ntohll(tnl_flow->tunnel.tun_id) << 8));
 
 ovs_mutex_unlock(&dev->mutex);
diff --git a/lib/packets.h b/lib/packets.h
index 8139a6b..23e964d 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -995,13 +995,49 @@ struct gre_base_hdr {
 #define GRE_FLAGS   0x00F8
 #define GRE_VERSION 0x0007
 
-/* VXLAN protocol header */
+/*
+ * VXLAN Group Based Policy Extension (VXLAN_F_GBP):
+ * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ * |G|R|R|R|I|R|R|R|R|D|R|R|A|R|R|R|Group Policy ID|
+ *

Re: [ovs-dev] [PATCH v2 RFC] ovn: Support native dhcp using 'continuations'

2016-04-21 Thread Numan Siddique

On Thu, Apr 21, 2016 at 4:53 AM, Ramu Ramamurthy 
wrote:

> Tested-by: Ramu Ramamurthy 
>
> Numan, I tested this patch to work on devstack+ovn without the
> openstack-plugin,
> with manual configuration.
>
> Notes:
>
> 1) In ovn/utilities/ovn-nbctl.c, usage() Can you add a help string to
> ovn-nbctl for the new command
> lswitch-get-dhcp-options, and lswitch-set-dhcp-options
>
> 2) When server_mac was not defined, the lflow was created like this,
> maybe the error
> checks can be tighter in ovn-northd.c
>
> table=3(  ls_in_dhcp), priority=   50, match=(inport ==
> "7fe086b5-9ab0-4c14-bf62-0291a62a4b14" && eth.src == fa:16:3e:cf:3d:bc
> && ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68
> && udp.dst == 67), action=(dhcp_offer(offerip = 10.0.1.3, netmask =
> 255.255.255.0, router = 10.0.1.1, mtu = 1400, server_id = 10.0.1.2,
> dns_server = {8.8.8.8,7.7.7.7}, lease_time = 43200); eth.dst =
> eth.src; eth.src = (null); ip4.dst = 10.0.1.3; ip4.src = 10.0.1.2;
> udp.src = 67; udp.dst = 68; outport = inport; inport = ""; /* Allow
> sending out inport. */ output;)
>
> The ovn-controller failed to parse the above flow at eth.src
>
> 2016-04-20T23:19:49Z|00070|lflow|WARN|error parsing actions
> "dhcp_offer(offerip = 10.0.1.2, netmask = 255.255.255.0, router =
> 10.0.1.1, mtu = 1400, server_id = 10.0.1.2, dns_server =
> {8.8.8.8,7.7.7.7}, lease_time = 43200); eth.dst = eth.src; eth.src =
> (null); ip4.dst = 10.0.1.2; ip4.src = 10.0.1.2; udp.src = 67; udp.dst
> = 68; outport = inport; inport = ""; /* Allow sending out inport. */
> output;": Syntax error at `(' expecting constant.
>
>
>
Thanks Ramu for testing this out. I will fix the above 2 issues.

Regards
Numan
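
The "eth.src = (null)" in the logged flow is consistent with a NULL string
being formatted through a "%s" conversion (glibc prints "(null)" in that
case). A minimal sketch of the kind of tighter check Ramu suggests could look
like the following. The helper build_dhcp_eth_action() and its signature are
hypothetical and are not taken from ovn-northd.c; the idea is only that the
action string is not emitted at all when server_mac is missing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from ovn-northd.c): writes the Ethernet part of
 * the DHCP reply action into 'buf'.  Returns false, leaving the caller to
 * skip the logical flow, if 'server_mac' is NULL or empty or if the result
 * would not fit in 'size' bytes. */
static bool
build_dhcp_eth_action(const char *server_mac, char *buf, size_t size)
{
    if (!server_mac || !server_mac[0]) {
        /* Without this check a NULL pointer would reach "%s" below. */
        return false;
    }

    int n = snprintf(buf, size, "eth.dst = eth.src; eth.src = %s;",
                     server_mac);
    return n >= 0 && (size_t) n < size;
}
```

With a check like this, a logical switch whose DHCP options lack server_mac
would produce no DHCP flow (and could log a warning) instead of a flow that
ovn-controller cannot parse.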
