Re: [ovs-dev] [PATCH] netdev-dpdk: round up mbuf_size to cache_line_size

2017-06-08 Thread santosh
Hi Darrell,

On Friday 09 June 2017 04:50 AM, Darrell Ball wrote:

>
> On 5/31/17, 6:08 AM, "ovs-dev-boun...@openvswitch.org on behalf of Santosh 
> Shukla"  santosh.shu...@caviumnetworks.com> wrote:
>
> Some PMD drivers (e.g. the vNIC ThunderX PMD) want mbuf_size to be a multiple
> of cache_line_size. Without this fix, netdev-dpdk initialization would
> simply fail for those drivers.
> 
> Signed-off-by: Santosh Shukla 
> ---
> - Tested arm64/ThunderX platform for vNIC pmd,
> - Topology: phy-phy and phy-vm
> - Tested x86 platform for XL710/i40e pmd.
> 
>  lib/netdev-dpdk.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 9941f884f..1952a5cec 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -76,9 +76,12 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
>  #define MTU_TO_MAX_FRAME_LEN(mtu)   ((mtu) + ETHER_HDR_MAX_LEN)
>  #define FRAME_LEN_TO_MTU(frame_len) ((frame_len)\
>   - ETHER_HDR_LEN - ETHER_CRC_LEN)
> -#define MBUF_SIZE(mtu)  (MTU_TO_MAX_FRAME_LEN(mtu)  \
> +#define _MBUF_SIZE(mtu) (MTU_TO_MAX_FRAME_LEN(mtu)  \
>   + sizeof(struct dp_packet) \
>   + RTE_PKTMBUF_HEADROOM)
> +#define MBUF_SIZE(mtu)  ROUND_UP((_MBUF_SIZE(mtu)), \
> + RTE_CACHE_LINE_SIZE)
>
> Why not just add ROUND_UP to the existing MBUF_SIZE macro; I think it is
> more straightforward and there are probably enough ‘MTU’ macros
> already.

OK, will do in v2.
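
For reference, folding the rounding into a single macro could look roughly like the sketch below. It is only an illustration: the SKETCH_ names and all constant values are placeholders chosen for the example, not the definitions from lib/netdev-dpdk.c, lib/util.h, or the DPDK headers.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values for illustration only; the real ones come from the
 * DPDK and OVS headers and differ per build. */
#define SKETCH_CACHE_LINE_SIZE  64
#define SKETCH_ETHER_HDR_MAX    26
#define SKETCH_DP_PACKET_SIZE   640
#define SKETCH_HEADROOM         128

/* Round x up to the next multiple of y (y > 0). */
#define SKETCH_ROUND_UP(x, y)   ((((x) + (y) - 1) / (y)) * (y))

/* One macro: raw buffer size, rounded up to a cache-line multiple. */
#define SKETCH_MBUF_SIZE(mtu)                                   \
    SKETCH_ROUND_UP((mtu) + SKETCH_ETHER_HDR_MAX                \
                    + SKETCH_DP_PACKET_SIZE + SKETCH_HEADROOM,  \
                    SKETCH_CACHE_LINE_SIZE)

static size_t
sketch_mbuf_size(size_t mtu)
{
    return SKETCH_MBUF_SIZE(mtu);
}
```

With these placeholder numbers an MTU of 1500 gives a raw size of 2294 bytes, which rounds up to 2304.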

>
> +
>  #define NETDEV_DPDK_MBUF_ALIGN  1024
>  #define NETDEV_DPDK_MAX_PKT_LEN 9728
>  
> @@ -495,7 +498,7 @@ dpdk_mp_create(int socket_id, int mtu)
>MP_CACHE_SZ,
>sizeof (struct dp_packet)
>   - sizeof (struct rte_mbuf),
> -  MBUF_SIZE(mtu)
> +  MBUF_SIZE(dmp->mtu)
>
> This change does not seem necessary and distracts from the point
> of the patch.

OK, I will stick to 'mtu' instead of 'dmp->mtu'. Higher up in the function, mtu
is assigned to dmp->mtu, so it made sense to me to use 'dmp->mtu', but I agree
with your argument. Perhaps it should go in as a separate cleanup patch.

Will do in v2.

Thanks for the review feedback.

>   - sizeof(struct dp_packet),
>socket_id);
>  if (dmp->mp) {
> -- 
> 2.11.0
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 4/4] windows-datapath: Temporary workaround checksum issue with NAT

2017-06-08 Thread Yin Lin
From: Alin Gabriel Serdean 

There is a known bug with NAT where checksum computation is wrong on
the RX path if offload is enabled. This patch works around the problem
by always computing a software checksum and should be reverted once
we figure out the root cause of the checksum error.

Signed-off-by: Alin Gabriel Serdean 
---
 datapath-windows/ovsext/Actions.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/datapath-windows/ovsext/Actions.c b/datapath-windows/ovsext/Actions.c
index 668825b..3ea066b 100644
--- a/datapath-windows/ovsext/Actions.c
+++ b/datapath-windows/ovsext/Actions.c
@@ -1573,6 +1573,44 @@ OvsUpdateAddressAndPort(OvsForwardingContext *ovsFwdCtx,
 }
 *portField = newPort;
 }
+PNET_BUFFER_LIST curNbl = ovsFwdCtx->curNbl;
+PNET_BUFFER_LIST newNbl = NULL;
+if (layers->isTcp) {
+UINT32 mss = OVSGetTcpMSS(curNbl);
+if (mss) {
+OVS_LOG_TRACE("l4Offset %d", layers->l4Offset);
+newNbl = OvsTcpSegmentNBL(ovsFwdCtx->switchContext, curNbl, layers,
+  mss, 0, FALSE);
+if (newNbl == NULL) {
+OVS_LOG_ERROR("Unable to segment NBL");
+return NDIS_STATUS_FAILURE;
+}
+/* Clear out LSO flags after this point */
+NET_BUFFER_LIST_INFO(newNbl, TcpLargeSendNetBufferListInfo) = 0;
+}
+}
+/* If we didn't split the packet above, make a copy now */
+if (newNbl == NULL) {
+csumInfo.Value = NET_BUFFER_LIST_INFO(curNbl,
+  TcpIpChecksumNetBufferListInfo);
+OvsApplySWChecksumOnNB(layers, curNbl, &csumInfo);
+}
+
+if (newNbl) {
+curNbl = newNbl;
+OvsCompleteNBLForwardingCtx(ovsFwdCtx,
+L"Complete after cloning NBL for encapsulation");
+OvsInitForwardingCtx(ovsFwdCtx, ovsFwdCtx->switchContext,
+ newNbl, ovsFwdCtx->srcVportNo, 0,
+ NET_BUFFER_LIST_SWITCH_FORWARDING_DETAIL(newNbl),
+ ovsFwdCtx->completionList,
+ &ovsFwdCtx->layers, FALSE);
+ovsFwdCtx->curNbl = newNbl;
+}
+
+NET_BUFFER_LIST_INFO(curNbl,
+ TcpIpChecksumNetBufferListInfo) = 0;
+
 return NDIS_STATUS_SUCCESS;
 }
 
-- 
2.10.2.windows.1
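
As background for the "software checksum" mentioned in the commit message, the general one's-complement Internet checksum (RFC 1071) can be sketched as below. This is not the driver's OvsApplySWChecksumOnNB(), whose body is not part of this patch; sketch_inet_csum is a hypothetical name, and the function assumes a small, contiguous buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum (RFC 1071) over 'len' bytes.
 * Bytes are combined into big-endian 16-bit words; the result is returned
 * as a host-order integer.  Assumes 'len' is small enough (packet sized)
 * that the 32-bit accumulator cannot overflow. */
static uint16_t
sketch_inet_csum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t) data[i] << 8) | data[i + 1];
    }
    if (len & 1) {
        sum += (uint32_t) data[len - 1] << 8;   /* Pad odd byte with zero. */
    }
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);     /* Fold carries back in. */
    }
    return (uint16_t) ~sum;
}
```

A receiver can verify a packet by running the same sum over the data with the checksum field included; a result of zero means the checksum matches.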



[ovs-dev] [PATCH v8 2/4] datapath-windows: Add NAT module in conntrack

2017-06-08 Thread Yin Lin
Signed-off-by: Yin Lin 
---
 datapath-windows/automake.mk|   2 +
 datapath-windows/ovsext/Conntrack-nat.c | 433 
 datapath-windows/ovsext/Conntrack-nat.h |  39 +++
 3 files changed, 474 insertions(+)
 create mode 100644 datapath-windows/ovsext/Conntrack-nat.c
 create mode 100644 datapath-windows/ovsext/Conntrack-nat.h

diff --git a/datapath-windows/automake.mk b/datapath-windows/automake.mk
index 49011f12..36018ea 100644
--- a/datapath-windows/automake.mk
+++ b/datapath-windows/automake.mk
@@ -16,7 +16,9 @@ EXTRA_DIST += \
datapath-windows/ovsext/Conntrack-icmp.c \
datapath-windows/ovsext/Conntrack-other.c \
datapath-windows/ovsext/Conntrack-related.c \
+   datapath-windows/ovsext/Conntrack-nat.c \
datapath-windows/ovsext/Conntrack-tcp.c \
+   datapath-windows/ovsext/Conntrack-nat.h \
datapath-windows/ovsext/Conntrack.c \
datapath-windows/ovsext/Conntrack.h \
datapath-windows/ovsext/Datapath.c \
diff --git a/datapath-windows/ovsext/Conntrack-nat.c b/datapath-windows/ovsext/Conntrack-nat.c
new file mode 100644
index 000..ae6b92c
--- /dev/null
+++ b/datapath-windows/ovsext/Conntrack-nat.c
@@ -0,0 +1,433 @@
+#include "Conntrack-nat.h"
+#include "Jhash.h"
+
+PLIST_ENTRY ovsNatTable = NULL;
+PLIST_ENTRY ovsUnNatTable = NULL;
+
+/*
+ *---
+ * OvsHashNatKey
+ * Hash NAT related fields in a Conntrack key.
+ *---
+ */
+static __inline UINT32
+OvsHashNatKey(const OVS_CT_KEY *key)
+{
+UINT32 hash = 0;
+#define HASH_ADD(field) \
+hash = OvsJhashBytes(&key->field, sizeof(key->field), hash)
+
+HASH_ADD(src.addr.ipv4_aligned);
+HASH_ADD(dst.addr.ipv4_aligned);
+HASH_ADD(src.port);
+HASH_ADD(dst.port);
+HASH_ADD(zone);
+#undef HASH_ADD
+return hash;
+}
+
+/*
+ *---
+ * OvsNatKeyAreSame
+ * Compare NAT related fields in a Conntrack key.
+ *---
+ */
+static __inline BOOLEAN
+OvsNatKeyAreSame(const OVS_CT_KEY *key1, const OVS_CT_KEY *key2)
+{
+// XXX: Compare IPv6 key as well
+#define FIELD_COMPARE(field) \
+if (key1->field != key2->field) return FALSE
+
+FIELD_COMPARE(src.addr.ipv4_aligned);
+FIELD_COMPARE(dst.addr.ipv4_aligned);
+FIELD_COMPARE(src.port);
+FIELD_COMPARE(dst.port);
+FIELD_COMPARE(zone);
+return TRUE;
+#undef FIELD_COMPARE
+}
+
+/*
+ *---
+ * OvsNatGetBucket
+ * Returns the row of NAT table that has the same hash as the given NAT
+ * hash key. If isReverse is TRUE, returns the row of reverse NAT table
+ * instead.
+ *---
+ */
+static __inline PLIST_ENTRY
+OvsNatGetBucket(const OVS_CT_KEY *key, BOOLEAN isReverse)
+{
+uint32_t hash = OvsHashNatKey(key);
+if (isReverse) {
+return &ovsUnNatTable[hash & NAT_HASH_TABLE_MASK];
+} else {
+return &ovsNatTable[hash & NAT_HASH_TABLE_MASK];
+}
+}
+
+/*
+ *---
+ * OvsNatInit
+ * Initialize NAT related resources.
+ *---
+ */
+NTSTATUS OvsNatInit()
+{
+ASSERT(ovsNatTable == NULL);
+
+/* Init the Hash Buffer */
+ovsNatTable = OvsAllocateMemoryWithTag(
+sizeof(LIST_ENTRY) * NAT_HASH_TABLE_SIZE,
+OVS_CT_POOL_TAG);
+if (ovsNatTable == NULL) {
+goto failNoMem;
+}
+
+ovsUnNatTable = OvsAllocateMemoryWithTag(
+sizeof(LIST_ENTRY) * NAT_HASH_TABLE_SIZE,
+OVS_CT_POOL_TAG);
+if (ovsUnNatTable == NULL) {
+goto freeNatTable;
+}
+
+for (int i = 0; i < NAT_HASH_TABLE_SIZE; i++) {
+InitializeListHead(&ovsNatTable[i]);
+InitializeListHead(&ovsUnNatTable[i]);
+}
+return STATUS_SUCCESS;
+
+freeNatTable:
+OvsFreeMemoryWithTag(ovsNatTable, OVS_CT_POOL_TAG);
+failNoMem:
+return STATUS_INSUFFICIENT_RESOURCES;
+}
+
+/*
+ *
+ * OvsNatFlush
+ * Flushes out all NAT entries that match the given zone.
+ *
+ */
+VOID OvsNatFlush(UINT16 zone)
+{
+PLIST_ENTRY link, next;
+for (int i = 0; i < NAT_HASH_TABLE_SIZE; i++) {
+LIST_FORALL_SAFE(&ovsNatTable[i], link, next) {
+POVS_NAT_ENTRY entry =
+CONTAINING_RECORD(link, OVS_NAT_ENTRY, link);
+/* zone is a non-zero value */
+if (!zone || zone == entry->key.zone) {
+OvsNatDeleteEntry(entry);
+}
+}
+ 
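
The bucket selection used by OvsNatGetBucket() (hash the NAT fields of the key, then mask the hash down to a table index) can be illustrated with the stand-alone sketch below. All names are hypothetical, the table size is an assumed power of two, and FNV-1a is used only as a stand-in for the driver's Jhash, so the hash values will not match the real module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE 1024u                 /* Assumed; power of two. */
#define SKETCH_TABLE_MASK (SKETCH_TABLE_SIZE - 1)

struct sketch_key {
    uint32_t src_addr, dst_addr;
    uint16_t src_port, dst_port;
    uint16_t zone;
};

/* FNV-1a over a byte range; a stand-in for Jhash, not the real thing. */
static uint32_t
sketch_hash_bytes(const void *p, size_t n, uint32_t hash)
{
    const uint8_t *b = p;

    for (size_t i = 0; i < n; i++) {
        hash = (hash ^ b[i]) * 16777619u;
    }
    return hash;
}

/* Hash field by field so that struct padding never enters the hash. */
static uint32_t
sketch_hash_key(const struct sketch_key *key)
{
    uint32_t hash = 2166136261u;
#define SKETCH_HASH_ADD(field) \
    hash = sketch_hash_bytes(&key->field, sizeof key->field, hash)
    SKETCH_HASH_ADD(src_addr);
    SKETCH_HASH_ADD(dst_addr);
    SKETCH_HASH_ADD(src_port);
    SKETCH_HASH_ADD(dst_port);
    SKETCH_HASH_ADD(zone);
#undef SKETCH_HASH_ADD
    return hash;
}

static bool
sketch_key_equal(const struct sketch_key *a, const struct sketch_key *b)
{
    return a->src_addr == b->src_addr && a->dst_addr == b->dst_addr
           && a->src_port == b->src_port && a->dst_port == b->dst_port
           && a->zone == b->zone;
}

/* Masking works as a modulo only because the table size is a power of two. */
static uint32_t
sketch_bucket(const struct sketch_key *key)
{
    return sketch_hash_key(key) & SKETCH_TABLE_MASK;
}
```

Equal keys always land in the same bucket, so a lookup only has to walk one list and compare keys with sketch_key_equal().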

[ovs-dev] [PATCH v8 3/4] datapath-windows: NAT integration with conntrack

2017-06-08 Thread Yin Lin
This patch integrates NAT module with existing conntrack module. NAT
action is now supported.

Signed-off-by: Yin Lin 
---
 datapath-windows/ovsext/Actions.c  | 119 ++-
 datapath-windows/ovsext/Actions.h  |  20 
 datapath-windows/ovsext/Conntrack.c| 201 +
 datapath-windows/ovsext/Conntrack.h|  25 ++--
 datapath-windows/ovsext/ovsext.vcxproj |   2 +
 5 files changed, 282 insertions(+), 85 deletions(-)

diff --git a/datapath-windows/ovsext/Actions.c b/datapath-windows/ovsext/Actions.c
index c3f0362..668825b 100644
--- a/datapath-windows/ovsext/Actions.c
+++ b/datapath-windows/ovsext/Actions.c
@@ -1380,7 +1380,7 @@ PUINT8 OvsGetHeaderBySize(OvsForwardingContext *ovsFwdCtx,
  *  based on the specified key.
  *
  */
-static __inline NDIS_STATUS
+NDIS_STATUS
 OvsUpdateUdpPorts(OvsForwardingContext *ovsFwdCtx,
   const struct ovs_key_udp *udpAttr)
 {
@@ -1427,7 +1427,7 @@ OvsUpdateUdpPorts(OvsForwardingContext *ovsFwdCtx,
  *  based on the specified key.
  *
  */
-static __inline NDIS_STATUS
+NDIS_STATUS
 OvsUpdateTcpPorts(OvsForwardingContext *ovsFwdCtx,
   const struct ovs_key_tcp *tcpAttr)
 {
@@ -1465,12 +1465,125 @@ OvsUpdateTcpPorts(OvsForwardingContext *ovsFwdCtx,
 
 /*
  *
+ * OvsUpdateAddressAndPort --
+ *  Updates the source/destination IP and port fields in
+ *  ovsFwdCtx.curNbl inline based on the specified key.
+ *
+ */
+NDIS_STATUS
+OvsUpdateAddressAndPort(OvsForwardingContext *ovsFwdCtx,
+UINT32 newAddr, UINT16 newPort,
+BOOLEAN isSource, BOOLEAN isTx)
+{
+PUINT8 bufferStart;
+UINT32 hdrSize;
+OVS_PACKET_HDR_INFO *layers = &ovsFwdCtx->layers;
+IPHdr *ipHdr;
+TCPHdr *tcpHdr = NULL;
+UDPHdr *udpHdr = NULL;
+UINT32 *addrField = NULL;
+UINT16 *portField = NULL;
+UINT16 *checkField = NULL;
+BOOLEAN l4Offload = FALSE;
+NDIS_TCP_IP_CHECKSUM_NET_BUFFER_LIST_INFO csumInfo;
+
+ASSERT(layers->value != 0);
+
+if (layers->isTcp || layers->isUdp) {
+hdrSize = layers->l4Offset +
+  (layers->isTcp ? sizeof (*tcpHdr) : sizeof (*udpHdr));
+} else {
+hdrSize = layers->l3Offset + sizeof (*ipHdr);
+}
+
+bufferStart = OvsGetHeaderBySize(ovsFwdCtx, hdrSize);
+if (!bufferStart) {
+return NDIS_STATUS_RESOURCES;
+}
+
+ipHdr = (IPHdr *)(bufferStart + layers->l3Offset);
+
+if (layers->isTcp) {
+tcpHdr = (TCPHdr *)(bufferStart + layers->l4Offset);
+} else if (layers->isUdp) {
+udpHdr = (UDPHdr *)(bufferStart + layers->l4Offset);
+}
+
+csumInfo.Value = NET_BUFFER_LIST_INFO(ovsFwdCtx->curNbl,
+  TcpIpChecksumNetBufferListInfo);
+/*
+ * Adjust the IP header inline as dictated by the action, and also update
+ * the IP and the TCP checksum for the data modified.
+ *
+ * In the future, this could be optimized to make one call to
+ * ChecksumUpdate32(). Ignoring this for now, since for the most common
+ * case, we only update the TTL.
+ */
+
+if (isSource) {
+addrField = &ipHdr->saddr;
+if (tcpHdr) {
+portField = &tcpHdr->source;
+checkField = &tcpHdr->check;
+l4Offload = isTx ? (BOOLEAN)csumInfo.Transmit.TcpChecksum :
+((BOOLEAN)csumInfo.Receive.TcpChecksumSucceeded ||
+ (BOOLEAN)csumInfo.Receive.TcpChecksumFailed);
+} else if (udpHdr) {
+portField = &udpHdr->source;
+checkField = &udpHdr->check;
+l4Offload = isTx ? (BOOLEAN)csumInfo.Transmit.UdpChecksum :
+((BOOLEAN)csumInfo.Receive.UdpChecksumSucceeded ||
+ (BOOLEAN)csumInfo.Receive.UdpChecksumFailed);
+}
+} else {
+addrField = &ipHdr->daddr;
+if (tcpHdr) {
+portField = &tcpHdr->dest;
+checkField = &tcpHdr->check;
+} else if (udpHdr) {
+portField = &udpHdr->dest;
+checkField = &udpHdr->check;
+}
+}
+if (*addrField != newAddr) {
+UINT32 oldAddr = *addrField;
+if (checkField && *checkField != 0) {
+if (l4Offload) {
+/* Recompute IP pseudo checksum */
+*checkField = ~(*checkField);
+*checkField = ChecksumUpdate32(*checkField, oldAddr,
+   newAddr);
+*checkField = ~(*checkField);
+} else {
+*checkField = ChecksumUpdate32(*checkField, oldAddr,
+ 
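
The address rewrite above relies on updating the L4 checksum incrementally instead of recomputing it over the whole packet. ChecksumUpdate32() itself is not shown in this excerpt, so the following is only a sketch of the general technique from RFC 1624 (HC' = ~(~HC + ~m + m')), with hypothetical function names; the driver's helper may differ in byte order and argument conventions.

```c
#include <assert.h>
#include <stdint.h>

/* Incrementally update a one's-complement checksum when one 16-bit word
 * of the covered data changes from old_word to new_word (RFC 1624, eqn. 3). */
static uint16_t
sketch_csum_update16(uint16_t csum, uint16_t old_word, uint16_t new_word)
{
    uint32_t sum = (uint16_t) ~csum;

    sum += (uint16_t) ~old_word;
    sum += new_word;
    sum = (sum & 0xffff) + (sum >> 16);     /* Fold carries, twice at most. */
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t) ~sum;
}

/* A 32-bit field such as an IPv4 address is two 16-bit words. */
static uint16_t
sketch_csum_update32(uint16_t csum, uint32_t old_val, uint32_t new_val)
{
    csum = sketch_csum_update16(csum, (uint16_t) (old_val >> 16),
                                (uint16_t) (new_val >> 16));
    return sketch_csum_update16(csum, (uint16_t) old_val,
                                (uint16_t) new_val);
}
```

For the data words 0001 f203 f4f5 f6f7 (checksum 0x220d), replacing f203 with 1234 yields 0x01dd, the same value a full recomputation gives.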

[ovs-dev] [PATCH v8 1/4] datapath-windows: Add support for NAT in conntrack

2017-06-08 Thread Yin Lin
From: Anand Kumar 

Add support for parsing netlink attributes related to NAT
in conntrack.

Co-Authored-by: Anand Kumar 
Co-Authored-by: Darrell Ball 
Signed-off-by: Yin Lin 
---
 datapath-windows/ovsext/Conntrack.c | 83 -
 datapath-windows/ovsext/Conntrack.h | 17 
 datapath-windows/ovsext/Flow.c  |  4 +-
 3 files changed, 100 insertions(+), 4 deletions(-)

diff --git a/datapath-windows/ovsext/Conntrack.c b/datapath-windows/ovsext/Conntrack.c
index 609ae5a..6b3435c 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -651,7 +651,8 @@ OvsCtExecute_(PNET_BUFFER_LIST curNbl,
   UINT16 zone,
   MD_MARK *mark,
   MD_LABELS *labels,
-  PCHAR helper)
+  PCHAR helper,
+  PNAT_ACTION_INFO natInfo)
 {
 NDIS_STATUS status = NDIS_STATUS_SUCCESS;
 POVS_CT_ENTRY entry = NULL;
@@ -660,6 +661,9 @@ OvsCtExecute_(PNET_BUFFER_LIST curNbl,
 UINT64 currentTime;
 NdisGetCurrentSystemTime((LARGE_INTEGER *) &currentTime);
 
+/* XXX: Not referenced for now */
+UNREFERENCED_PARAMETER(natInfo);
+
 /* Retrieve the Conntrack Key related fields from packet */
 OvsCtSetupLookupCtx(key, zone, &ctx, curNbl, layers->l4Offset);
 
@@ -753,11 +757,14 @@ OvsExecuteConntrackAction(OvsForwardingContext *fwdCtx,
 MD_MARK *mark = NULL;
 MD_LABELS *labels = NULL;
 PCHAR helper = NULL;
+NAT_ACTION_INFO natActionInfo;
 PNET_BUFFER_LIST curNbl = fwdCtx->curNbl;
 OVS_PACKET_HDR_INFO *layers = &fwdCtx->layers;
 PNET_BUFFER_LIST newNbl = NULL;
+NAT_ACTION_INFO natActionInfo;
 NDIS_STATUS status;
 
+memset(&natActionInfo, 0, sizeof natActionInfo);
 status = OvsDetectCtPacket(fwdCtx, key, &newNbl);
 if (status != NDIS_STATUS_SUCCESS) {
 return status;
@@ -780,6 +787,78 @@ OvsExecuteConntrackAction(OvsForwardingContext *fwdCtx,
 if (ctAttr) {
 labels = NlAttrGet(ctAttr);
 }
+natActionInfo.natAction = NAT_ACTION_NONE;
+ctAttr = NlAttrFindNested(a, OVS_CT_ATTR_NAT);
+if (ctAttr) {
+/* Parse nested NAT attributes. */
+PNL_ATTR natAttr;
+unsigned int left;
+BOOLEAN hasMinIp = FALSE;
+BOOLEAN hasMinPort = FALSE;
+BOOLEAN hasMaxIp = FALSE;
+BOOLEAN hasMaxPort = FALSE;
+NL_NESTED_FOR_EACH_UNSAFE (natAttr, left, ctAttr) {
+enum ovs_nat_attr sub_type_nest = NlAttrType(natAttr);
+switch(sub_type_nest) {
+case OVS_NAT_ATTR_SRC:
+case OVS_NAT_ATTR_DST:
+natActionInfo.natAction |=
+((sub_type_nest == OVS_NAT_ATTR_SRC)
+? NAT_ACTION_SRC : NAT_ACTION_DST);
+break;
+case OVS_NAT_ATTR_IP_MIN:
+   if (natAttr->nlaLen < NLA_HDRLEN) {
+OVS_LOG_ERROR("Incorrect header length for "
+  "OVS_NAT_ATTR_IP_MIN message.");
+break;
+}
+memcpy(&natActionInfo.minAddr,
+   NlAttrData(natAttr), natAttr->nlaLen - NLA_HDRLEN);
+hasMinIp = TRUE;
+break;
+case OVS_NAT_ATTR_IP_MAX:
+if (natAttr->nlaLen < NLA_HDRLEN) {
+OVS_LOG_ERROR("Incorrect header length for "
+  "OVS_NAT_ATTR_IP_MAX message.");
+break;
+}
+memcpy(&natActionInfo.maxAddr,
+   NlAttrData(natAttr), natAttr->nlaLen - NLA_HDRLEN);
+hasMaxIp = TRUE;
+break;
+case OVS_NAT_ATTR_PROTO_MIN:
+natActionInfo.minPort = NlAttrGetU16(natAttr);
+hasMinPort = TRUE;
+break;
+case OVS_NAT_ATTR_PROTO_MAX:
+natActionInfo.maxPort = NlAttrGetU16(natAttr);
+hasMaxPort = TRUE;
+break;
+case OVS_NAT_ATTR_PERSISTENT:
+case OVS_NAT_ATTR_PROTO_HASH:
+case OVS_NAT_ATTR_PROTO_RANDOM:
+break;
+}
+}
+if (natActionInfo.natAction == NAT_ACTION_NONE) {
+natActionInfo.natAction = NAT_ACTION_REVERSE;
+}
+if (hasMinIp && !hasMaxIp) {
+memcpy(&natActionInfo.maxAddr,
+   &natActionInfo.minAddr,
+   sizeof(natActionInfo.maxAddr));
+}
+if (hasMinPort && !hasMaxPort) {
+natActionInfo.maxPort = natActionInfo.minPort;
+}
+if (hasMinPort || hasMaxPort) {
+if (natActionInfo.natAction & NAT_ACTION_SRC) {
+natActionInfo.natAction |= NAT_ACTION_SRC_PORT;
+} else if (natActionInfo.natAction & NAT_ACTION_DST) {
+natActionInfo.natAction |= NAT_ACTION_DST_PORT;
+}
+}
+}
 ctAttr 
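
The defaulting rules applied after the attribute loop (no direction means reverse NAT, a missing maximum defaults to the minimum, and any port attribute turns on port translation for the chosen direction) can be summarized in the stand-alone sketch below. The struct, the flag values, and the function name are hypothetical; the real NAT_ACTION_* constants and NAT_ACTION_INFO layout live in the driver headers and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
enum {
    SK_NAT_NONE     = 0,
    SK_NAT_SRC      = 1 << 0,
    SK_NAT_SRC_PORT = 1 << 1,
    SK_NAT_DST      = 1 << 2,
    SK_NAT_DST_PORT = 1 << 3,
    SK_NAT_REVERSE  = 1 << 4
};

struct sk_nat_info {
    uint16_t action;
    uint32_t min_addr, max_addr;
    uint16_t min_port, max_port;
};

/* Apply the same defaulting steps as the patch, in the same order. */
static void
sk_nat_normalize(struct sk_nat_info *info, bool has_min_ip, bool has_max_ip,
                 bool has_min_port, bool has_max_port)
{
    if (info->action == SK_NAT_NONE) {
        info->action = SK_NAT_REVERSE;      /* No SRC/DST attribute seen. */
    }
    if (has_min_ip && !has_max_ip) {
        info->max_addr = info->min_addr;    /* Single address, not a range. */
    }
    if (has_min_port && !has_max_port) {
        info->max_port = info->min_port;    /* Single port, not a range. */
    }
    if (has_min_port || has_max_port) {
        if (info->action & SK_NAT_SRC) {
            info->action |= SK_NAT_SRC_PORT;
        } else if (info->action & SK_NAT_DST) {
            info->action |= SK_NAT_DST_PORT;
        }
    }
}
```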

Re: [ovs-dev] [PATCH 1/2] hash: New helper functions for adding words in a buffer to a hash.

2017-06-08 Thread Darrell Ball
Acked-by: Darrell Ball 

On 6/7/17, 9:37 AM, "ovs-dev-boun...@openvswitch.org on behalf of Ben Pfaff" wrote:

These will receive their first user (outside of hash.h) in the following
commit.

Signed-off-by: Ben Pfaff 
---
 lib/hash.h | 66 +-
 1 file changed, 48 insertions(+), 18 deletions(-)

diff --git a/lib/hash.h b/lib/hash.h
index d68ed759a7e9..7dffeaa9cacc 100644
--- a/lib/hash.h
+++ b/lib/hash.h
@@ -90,6 +90,15 @@ static inline uint32_t mhash_finish(uint32_t hash)
 return hash;
 }
 
+static inline uint32_t hash_add(uint32_t hash, uint32_t data);
+static inline uint32_t hash_add64(uint32_t hash, uint64_t data);
+static inline uint32_t hash_finish(uint32_t hash, uint32_t final);
+
+static inline uint32_t hash_add_words(uint32_t, const uint32_t *, size_t);
+static inline uint32_t hash_add_words64(uint32_t, const uint64_t *, size_t);
+static inline uint32_t hash_add_bytes32(uint32_t, const uint32_t *, size_t);
+static inline uint32_t hash_add_bytes64(uint32_t, const uint64_t *, size_t);
+
 #if !(defined(__SSE4_2__) && defined(__x86_64__))
 /* Mhash-based implementation. */
 
@@ -114,29 +123,15 @@ static inline uint32_t hash_finish(uint32_t hash, uint32_t final)
  * This is inlined for the compiler to have access to the 'n_words', which
  * in many cases is a constant. */
 static inline uint32_t
-hash_words_inline(const uint32_t p[], size_t n_words, uint32_t basis)
+hash_words_inline(const uint32_t *p, size_t n_words, uint32_t basis)
 {
-uint32_t hash;
-size_t i;
-
-hash = basis;
-for (i = 0; i < n_words; i++) {
-hash = hash_add(hash, p[i]);
-}
-return hash_finish(hash, n_words * 4);
+return hash_finish(hash_add_words(basis, p, n_words), n_words * 4);
 }
 
 static inline uint32_t
-hash_words64_inline(const uint64_t p[], size_t n_words, uint32_t basis)
+hash_words64_inline(const uint64_t *p, size_t n_words, uint32_t basis)
 {
-uint32_t hash;
-size_t i;
-
-hash = basis;
-for (i = 0; i < n_words; i++) {
-hash = hash_add64(hash, p[i]);
-}
-return hash_finish(hash, n_words * 8);
+return hash_finish(hash_add_words64(basis, p, n_words), n_words * 8);
 }
 
 static inline uint32_t hash_pointer(const void *p, uint32_t basis)
@@ -358,6 +353,41 @@ static inline uint32_t hash_boolean(bool x, uint32_t basis)
 const uint32_t P1 = 0xe90f1258;   /* This is hash_int(2, 0). */
 return (x ? P0 : P1) ^ hash_rot(basis, 1);
 }
+?
+/* Helper functions for calling hash_add() for several 32- or 64-bit words in a
+ * buffer.  These are not hash functions by themselves, since they need
+ * hash_finish() to be called, so if you are looking for a full hash function
+ * see hash_words(), etc. */
+
+static inline uint32_t
+hash_add_words(uint32_t hash, const uint32_t *p, size_t n_words)
+{
+for (size_t i = 0; i < n_words; i++) {
+hash = hash_add(hash, p[i]);
+}
+return hash;
+}
+
+static inline uint32_t
+hash_add_words64(uint32_t hash, const uint64_t *p, size_t n_words)
+{
+for (size_t i = 0; i < n_words; i++) {
+hash = hash_add64(hash, p[i]);
+}
+return hash;
+}
+
+static inline uint32_t
+hash_add_bytes32(uint32_t hash, const uint32_t *p, size_t n_bytes)
+{
+return hash_add_words(hash, p, n_bytes / 4);
+}
+
+static inline uint32_t
+hash_add_bytes64(uint32_t hash, const uint64_t *p, size_t n_bytes)
+{
+return hash_add_words64(hash, p, n_bytes / 8);
+}
 
 #ifdef __cplusplus
 }
-- 
2.10.2
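
To illustrate why the accumulate step is split from the finish step: several buffers can be fed into one running hash, and the finalizer then runs exactly once. The sketch below uses a Murmur3-style mixer as a stand-in; the sk_ names are hypothetical and the mixer is not claimed to produce the same values as lib/hash.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t
sk_rot(uint32_t x, int k)
{
    return (x << k) | (x >> (32 - k));
}

/* Murmur3-style mixing step, standing in for hash_add(). */
static uint32_t
sk_hash_add(uint32_t hash, uint32_t data)
{
    data *= 0xcc9e2d51;
    data = sk_rot(data, 15);
    data *= 0x1b873593;
    hash ^= data;
    hash = sk_rot(hash, 13);
    return hash * 5 + 0xe6546b64;
}

/* Murmur3-style finalizer, standing in for hash_finish(). */
static uint32_t
sk_hash_finish(uint32_t hash, uint32_t final)
{
    hash ^= final;
    hash ^= hash >> 16;
    hash *= 0x85ebca6b;
    hash ^= hash >> 13;
    hash *= 0xc2b2ae35;
    hash ^= hash >> 16;
    return hash;
}

/* Accumulate only: the result is not a finished hash yet. */
static uint32_t
sk_hash_add_words(uint32_t hash, const uint32_t *p, size_t n_words)
{
    for (size_t i = 0; i < n_words; i++) {
        hash = sk_hash_add(hash, p[i]);
    }
    return hash;
}

static uint32_t
sk_hash_add_bytes32(uint32_t hash, const uint32_t *p, size_t n_bytes)
{
    return sk_hash_add_words(hash, p, n_bytes / 4);
}

/* Full hash over two buffers: accumulate both, then finish exactly once. */
static uint32_t
sk_hash_two(const uint32_t *a, size_t na, const uint32_t *b, size_t nb,
            uint32_t basis)
{
    uint32_t hash = sk_hash_add_words(basis, a, na);

    hash = sk_hash_add_words(hash, b, nb);
    return sk_hash_finish(hash, (uint32_t) ((na + nb) * 4));
}
```

Hashing two buffers this way gives the same result as hashing their concatenation, which is what makes the helpers composable.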



Re: [ovs-dev] [PATCH 2/2] conntrack: Hash entire NAT data structure in nat_range_hash().

2017-06-08 Thread Darrell Ball
Acked-by: Darrell Ball dlu...@gmail.com

I added an incremental patch so that the pre-existing hash function,
conn_key_hash(), can use the new abstraction. I am not sure whether we need a
separate patch.

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 55029c8..d3e514d 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1509,20 +1509,30 @@ conn_key_extract(struct conntrack *ct, struct dp_packet *pkt, ovs_be16 dl_type,
 
 return false;
 }
+
+static uint32_t
+ct_addr_hash_add(uint32_t hash, const struct ct_addr *addr)
+{
+BUILD_ASSERT_DECL(sizeof *addr % 4 == 0);
+return hash_add_bytes32(hash, (const uint32_t *) addr, sizeof *addr);
+}
+
+static uint32_t
+ct_endpoint_hash_add(uint32_t hash, const struct ct_endpoint *ep)
+{
+BUILD_ASSERT_DECL(sizeof *ep % 4 == 0);
+return hash_add_bytes32(hash, (const uint32_t *) ep, sizeof *ep);
+}
 ^L
 /* Symmetric */
 static uint32_t
 conn_key_hash(const struct conn_key *key, uint32_t basis)
 {
 uint32_t hsrc, hdst, hash;
-int i;
 
 hsrc = hdst = basis;
-
-for (i = 0; i < sizeof(key->src) / sizeof(uint32_t); i++) {
-hsrc = hash_add(hsrc, ((uint32_t *) &key->src)[i]);
-hdst = hash_add(hdst, ((uint32_t *) &key->dst)[i]);
-}
+hsrc = ct_endpoint_hash_add(hsrc, &key->src);
+hdst = ct_endpoint_hash_add(hdst, &key->dst);
 
 /* Even if source and destination are swapped the hash will be the same. */
 hash = hsrc ^ hdst;
@@ -1613,20 +1623,6 @@ nat_ipv6_addr_increment(struct in6_addr *ipv6_aligned, uint32_t increment)
 }
 
 static uint32_t
-ct_addr_hash_add(uint32_t hash, const struct ct_addr *addr)
-{
-BUILD_ASSERT_DECL(sizeof *addr % 4 == 0);
-return hash_add_bytes32(hash, (const uint32_t *) addr, sizeof *addr);
-}
-
-static uint32_t
-ct_endpoint_hash_add(uint32_t hash, const struct ct_endpoint *ep)
-{
-BUILD_ASSERT_DECL(sizeof *ep % 4 == 0);
-return hash_add_bytes32(hash, (const uint32_t *) ep, sizeof *ep);
-}
-
-static uint32_t
 nat_range_hash(const struct conn *conn, uint32_t basis)
 {
 uint32_t hash = basis;
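
The "Symmetric" property of conn_key_hash() comes from hashing each endpoint separately from the same basis and combining the two results with XOR, which is commutative. A minimal sketch of that idea follows; the struct, the mixer, and the sk_ names are hypothetical stand-ins, and the remaining fields that the real function hashes are left out.

```c
#include <assert.h>
#include <stdint.h>

struct sk_endpoint {
    uint32_t addr;
    uint16_t port;
};

/* Simple multiply-xorshift mixer; a stand-in for hash_add(). */
static uint32_t
sk_mix(uint32_t hash, uint32_t data)
{
    hash ^= data;
    hash *= 0x9e3779b1u;
    hash ^= hash >> 15;
    return hash;
}

static uint32_t
sk_endpoint_hash_add(uint32_t hash, const struct sk_endpoint *ep)
{
    hash = sk_mix(hash, ep->addr);
    return sk_mix(hash, ep->port);
}

/* Same value whether (src, dst) or (dst, src) is passed, because both
 * endpoint hashes start from the same basis and XOR is commutative. */
static uint32_t
sk_conn_hash(const struct sk_endpoint *src, const struct sk_endpoint *dst,
             uint32_t basis)
{
    uint32_t hsrc = sk_endpoint_hash_add(basis, src);
    uint32_t hdst = sk_endpoint_hash_add(basis, dst);

    return hsrc ^ hdst;
}
```

This lets both directions of a connection find the same table entry. One consequence of the XOR is that a key whose source equals its destination contributes zero from this step.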

On 6/7/17, 9:37 AM, "ovs-dev-boun...@openvswitch.org on behalf of Ben Pfaff" wrote:

From: Darrell Ball 

Part of the hash input for nat_range_hash() was accidentally
omitted, so this fixes the problem.  Also, add a missing call to
hash_finish().

Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Signed-off-by: Darrell Ball 
Signed-off-by: Ben Pfaff 
---
 lib/conntrack-private.h |  4 
 lib/conntrack.c | 47 ++-
 2 files changed, 30 insertions(+), 21 deletions(-)

diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index bfa88f0fe7e2..55084d3ea64e 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -42,6 +42,10 @@ struct ct_endpoint {
 };
 };
 
+/* Verify that there is no padding in struct ct_endpoint, to facilitate
+ * hashing in ct_endpoint_hash_add(). */
+BUILD_ASSERT_DECL(sizeof(struct ct_endpoint) == sizeof(struct ct_addr) + 4);
+
 /* Changes to this structure need to be reflected in conn_key_hash() */
 struct conn_key {
 struct ct_endpoint src;
diff --git a/lib/conntrack.c b/lib/conntrack.c
index 44a6bc4e3ccf..55029c82e7a6 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1613,36 +1613,41 @@ nat_ipv6_addr_increment(struct in6_addr *ipv6_aligned, uint32_t increment)
 }
 
 static uint32_t
-nat_range_hash(const struct conn *conn, uint32_t basis)
+ct_addr_hash_add(uint32_t hash, const struct ct_addr *addr)
 {
-uint32_t hash = basis;
-int i;
-uint16_t port;
+BUILD_ASSERT_DECL(sizeof *addr % 4 == 0);
+return hash_add_bytes32(hash, (const uint32_t *) addr, sizeof *addr);
+}
 
-for (i = 0;
- i < sizeof(conn->nat_info->min_addr) / sizeof(uint32_t);
- i++) {
-hash = hash_add(hash, ((uint32_t *) &conn->nat_info->min_addr)[i]);
-hash = hash_add(hash, ((uint32_t *) &conn->nat_info->max_addr)[i]);
-}
+static uint32_t
+ct_endpoint_hash_add(uint32_t hash, const struct ct_endpoint *ep)
+{
+BUILD_ASSERT_DECL(sizeof *ep % 4 == 0);
+return hash_add_bytes32(hash, (const uint32_t *) ep, sizeof *ep);
+}
 
-memcpy(&port, &conn->nat_info->min_port, sizeof port);
-hash = hash_add(hash, port);
+static uint32_t
+nat_range_hash(const struct conn *conn, uint32_t basis)
+{
+uint32_t hash = basis;
 
-for (i = 0; i < sizeof(conn->key.src.addr) / sizeof(uint32_t); i++) {
-hash = hash_add(hash, ((uint32_t *) &conn->key.src)[i]);
-hash = hash_add(hash, ((uint32_t *) &conn->key.dst)[i]);
-}
+hash = ct_addr_hash_add(hash, &conn->nat_info->min_addr);
+hash = ct_addr_hash_add(hash, &conn

Re: [ovs-dev] [PATCH v7 3/4] datapath-windows: NAT integration with conntrack

2017-06-08 Thread Yin Lin
Hi Alin,

Thanks for the feedback. Those are very sharp catches! I have addressed all of
them except the last one. The last one is a memory leak that has been there for
a while and is not touched by my patch. The fix is not easy either. Take a look
at the following code:

OVS_CT_ENTRY *
OvsConntrackCreateIcmpEntry(UINT64 now)
{
struct conn_icmp *conn;

conn = OvsAllocateMemoryWithTag(sizeof(struct conn_icmp),
OVS_CT_POOL_TAG);
if (!conn) {
return NULL;
}

conn->state = ICMPS_FIRST;

OvsConntrackUpdateExpiration(&conn->up, now,
 icmp_timeouts[conn->state]);

return &conn->up;
}

We allocate a conn_icmp structure on the heap, but only return a reference to
conn->up, so the caller cannot even release the memory.
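
One common C idiom for this situation, sketched here with hypothetical stand-in types, is to recover the containing allocation from the embedded member with offsetof() (the same idea as CONTAINING_RECORD). Whether the caller in the driver knows the concrete type of an entry at the point where it would be freed is not clear from this excerpt, so this is an illustration of the idiom and not a proposed fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for OVS_CT_ENTRY and struct conn_icmp; layouts are assumed. */
struct sk_ct_entry {
    unsigned long long expiration;
};

struct sk_conn_icmp {
    struct sk_ct_entry up;      /* Embedded base entry. */
    int state;
};

/* Pointer to the enclosing struct, given a pointer to one of its members. */
#define SK_CONTAINER_OF(ptr, type, member) \
    ((type *) ((char *) (ptr) - offsetof(type, member)))

static struct sk_ct_entry *
sk_create_icmp_entry(void)
{
    struct sk_conn_icmp *conn = malloc(sizeof *conn);

    if (!conn) {
        return NULL;
    }
    conn->up.expiration = 0;
    conn->state = 0;
    return &conn->up;           /* Caller only sees the embedded member. */
}

static struct sk_conn_icmp *
sk_icmp_from_entry(struct sk_ct_entry *entry)
{
    return SK_CONTAINER_OF(entry, struct sk_conn_icmp, up);
}

/* Free the whole allocation, starting from the member pointer. */
static void
sk_destroy_icmp_entry(struct sk_ct_entry *entry)
{
    if (entry) {
        free(sk_icmp_from_entry(entry));
    }
}
```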

We'll have to create another patch to address this. Sai, can you take care of 
the memory leak?

Best regards,
Yin Lin

-Original Message-
From: Alin Serdean [mailto:aserd...@cloudbasesolutions.com] 
Sent: Tuesday, May 23, 2017 7:18 AM
To: Yin Lin ; d...@openvswitch.org
Subject: RE: [ovs-dev] [PATCH v7 3/4] datapath-windows: NAT integration with 
conntrack

Hi Yin,

Sorry it took a while to review the patch.

I just have a few inlined comments. I am stripping the code a bit to be more 
readable.

I will run some tests tonight over the current series to see that everything is 
ok from a functionality perspective.

Thanks,
Alin.

> 
> This patch integrates NAT module with existing conntrack module. NAT
> action is now supported.
> 
> Signed-off-by: Yin Lin 
> diff --git a/datapath-windows/automake.mk b/datapath-windows/automake.mk
> index 7177630..f1632cc 100644
> --- a/datapath-windows/automake.mk
> +++ b/datapath-windows/automake.mk
> @@ -16,9 +16,9 @@ EXTRA_DIST += \
>   datapath-windows/ovsext/Conntrack-icmp.c \
>   datapath-windows/ovsext/Conntrack-other.c \
>   datapath-windows/ovsext/Conntrack-related.c \
> -datapath-windows/ovsext/Conntrack-nat.c \
> + datapath-windows/ovsext/Conntrack-nat.c \
>   datapath-windows/ovsext/Conntrack-tcp.c \
> -datapath-windows/ovsext/Conntrack-nat.h \
> + datapath-windows/ovsext/Conntrack-nat.h \
[Alin Serdean] Just use tab instead of 4 spaces in patch 2/4
>   datapath-windows/ovsext/Conntrack.c \
>   datapath-windows/ovsext/Conntrack.h \
>   datapath-windows/ovsext/Datapath.c \
[Alin Serdean] <== cut >
> +
> +/* NatInfo is always initialized to be disabled, so that if NAT action
> + * fails, we will not end up deleting an non-existent NAT entry.
> + */
> +if (natInfo != NULL && OvsIsForwardNat(natInfo->natAction)) {
> +entry->natInfo = *natInfo;
> +if (!OvsNatCtEntry(entry)) {
> +return FALSE;
> +}
> +ctx->hash = OvsHashCtKey(&entry->key);
> +} else {
> +entry->natInfo.natAction = natInfo->natAction;
[Alin Serdean] Shouldn't this be guarded for natInfo==NULL?
> +}
> +
[Alin Serdean] <== cut >
> -OvsCtUpdateFlowKey(key, state, ctx->key.zone, 0, NULL);
> -return entry;
> +break;
>  }
>  default:
>  goto invalid;
>  }
> 
> +if (commit && !entry) {
> +return NULL;
> +}
> +if (entry && !OvsCtAddEntry(entry, ctx, natInfo, currentTime)) {
[Alin Serdean] Shouldn't we delete the 'entry' here if OvsCtAddEntry failed?
> +return NULL;
> +}
> +OvsCtUpdateFlowKey(key, state, ctx->key.zone, 0, NULL);
> +return entry;
>  invalid:
>  state |= OVS_CS_F_INVALID;
>  OvsCtUpdateFlowKey(key, state, ctx->key.zone, 0, NULL);
> @@ -269,11 +289,11 @@ invalid:
> 


Re: [ovs-dev] [PATCH 4/4] ovn-controller: Use separate thread for packet-in processing.

2017-06-08 Thread Ben Pfaff
On Thu, Jun 08, 2017 at 04:21:17PM -0700, Han Zhou wrote:
> On Thu, Jun 8, 2017 at 4:14 PM, Ben Pfaff  wrote:
> >
> > On Tue, Jun 06, 2017 at 05:24:19PM -0700, Han Zhou wrote:
> > > On Tue, Jun 6, 2017 at 3:56 PM, Ben Pfaff  wrote:
> > > >
> > > > On Thu, May 25, 2017 at 05:26:47PM -0700, Han Zhou wrote:
> > > > > This patch introduces multi-threading for ovn-controller and uses a
> > > > > dedicated thread for packet-in processing as a start. It decouples
> > > > > packet-in processing and ovs flow computing, so that packet-in inputs
> > > > > won't trigger flow recomputing, and flow computing won't block
> > > > > packet-in processing. In large scale environments this largely
> > > > > reduces CPU cost and improves performance.
> > > >
> > > > Won't this double the load on the southbound database server, as well
> > > > as the bandwidth to and from it?  We already have a bottleneck there.
> > >
> > > Ben, yes this is the trade-off. Here are the considerations:
> > >
> > > 1. The bottleneck in ovn-controller is easier to hit (you don't even
> > > need many HVs to hit it).
> > > 2. The bottleneck in the southbound DB does exist when the number of HVs
> > > increases, but since you are already working on the ovsdb clustering I
> > > suppose it will be resolved.
> > >
> > > However I agree that this is not ideal. Alternatively we can spin up a
> > > dedicated thread for SB IDL processing, and other "worker" threads just
> > > read the data with proper locking. This will be more complicated but
> > > should be doable, what do you think?
> >
> > I spent a little time thinking about this.  I think that the approach
> > that you're proposing is probably practical.  Do you want to try to
> > experiment with it and see whether it's reasonably possible?
> 
> It basically needs to separate reads and writes for SB IDL from the
> xxx_run() functions, which may be tricky. But if there is no other way
> around I'll go down this path.

I thought you were proposing some very coarse-grained locking.  Maybe
you should describe what you mean in a little more detail.


Re: [ovs-dev] [PATCH v3 1/2] netdev-dpdk: Remove Rx checksum reconfigure.

2017-06-08 Thread Darrell Ball
Oops, I did notice a minor issue with the original code that this patch moves
to the _init function; in theory, the fix should be a separate patch.


On 6/8/17, 2:40 PM, "ovs-dev-boun...@openvswitch.org on behalf of Darrell Ball" wrote:

LGTM

Acked-by: Darrell Ball 

On 6/8/17, 10:12 AM, "ovs-dev-boun...@openvswitch.org on behalf of Kevin Traynor" wrote:

Rx checksum offload is enabled by default on DPDK physical NICs
where available, with reconfiguration through
options:rx-checksum-offload. However, changing rx-checksum-offload
did not result in a reconfiguration of the NIC and wrong status is
reported for it.

As there seems to be diminishing reasons why a user would want
to disable Rx checksum offload, just remove the broken reconfiguration
option.

Fixes: 1a2bb11817a4 ("netdev-dpdk: Enable Rx checksum offloading feature on DPDK physical ports.")
Reported-by: Kevin Traynor 
Suggested-by: Sugesh Chandran 
Signed-off-by: Kevin Traynor 
---
 Documentation/howto/dpdk.rst | 17 +---
 lib/netdev-dpdk.c| 47 +++-
 vswitchd/vswitch.xml | 14 -
 3 files changed, 13 insertions(+), 65 deletions(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 93248b4..af01d3e 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -277,16 +277,5 @@ Rx Checksum Offload
 ---
 
-By default, DPDK physical ports are enabled with Rx checksum offload. 
Rx
-checksum offload can be configured on a DPDK physical port either when 
adding
-or at run time.
-
-To disable Rx checksum offload when adding a DPDK port dpdk-p0::
-
-$ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 
type=dpdk \
-  options:dpdk-devargs=:01:00.0 
options:rx-checksum-offload=false
-
-Similarly to disable the Rx checksum offloading on a existing DPDK 
port dpdk-p0::
-
-$ ovs-vsctl set Interface dpdk-p0 options:rx-checksum-offload=false
+By default, DPDK physical ports are enabled with Rx checksum offload.
 
 Rx checksum offload can offer performance improvement only for 
tunneling
@@ -294,8 +283,4 @@ traffic in OVS-DPDK because the checksum validation 
of tunnel packets is
 offloaded to the NIC. Also enabling Rx checksum may slightly reduce the
 performance of non-tunnel traffic, specifically for smaller size 
packet.
-DPDK vectorization is disabled when checksum offloading is configured 
on DPDK
-physical ports which in turn effects the non-tunnel traffic 
performance.
-So it is advised to turn off the Rx checksum offload for non-tunnel 
traffic use
-cases to achieve the best performance.
 
 .. _extended-statistics:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index b770b70..79afda5 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -719,27 +719,4 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, 
int n_rxq, int n_txq)
 
 static void
-dpdk_eth_checksum_offload_configure(struct netdev_dpdk *dev)
-OVS_REQUIRES(dev->mutex)
-{
-struct rte_eth_dev_info info;
-bool rx_csum_ol_flag = false;
-uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
- DEV_RX_OFFLOAD_TCP_CKSUM |
- DEV_RX_OFFLOAD_IPV4_CKSUM;
-rte_eth_dev_info_get(dev->port_id, &info);
-rx_csum_ol_flag = (dev->hw_ol_features & 
NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
-
-if (rx_csum_ol_flag &&
-(info.rx_offload_capa & rx_chksm_offload_capa) !=
- rx_chksm_offload_capa) {
-VLOG_WARN_ONCE("Rx checksum offload is not supported on device 
%"PRIu8,
-   dev->port_id);
-dev->hw_ol_features &= ~NETDEV_RX_CHECKSUM_OFFLOAD;
-return;
-}
-netdev_request_reconfigure(&dev->up);
-}
-
-static void
 dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 {
@@ -759,7 +736,19 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 int diag;
 int n_rxq, n_txq;
+uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
+ DEV_RX_OFFLOAD_TCP_CKSUM |
+ DEV_RX_OFFLOAD_IPV4_CKSUM;
 
 rte_eth_dev_info_get(dev->port_id, &info);
 
+i

Re: [ovs-dev] [PATCH 4/4] ovn-controller: Use separate thread for packet-in processing.

2017-06-08 Thread Han Zhou
On Thu, Jun 8, 2017 at 4:14 PM, Ben Pfaff  wrote:
>
> On Tue, Jun 06, 2017 at 05:24:19PM -0700, Han Zhou wrote:
> > On Tue, Jun 6, 2017 at 3:56 PM, Ben Pfaff  wrote:
> > >
> > > On Thu, May 25, 2017 at 05:26:47PM -0700, Han Zhou wrote:
> > > > This patch introduces multi-threading for ovn-controller and use
> > > > dedicated thread for packet-in processing as a start. It decouples
> > > > packet-in processing and ovs flow computing, so that packet-in inputs
> > > > won't trigger flow recomputing, and flow computing won't block
> > > > packet-in processing. In large scale environment this largely reduces
> > > > CPU cost and improves performance.
> > >
> > > Won't this double the load on the southbound database server, as well as
> > > the bandwidth to and from it?  We already have a bottleneck there.
> >
> > Ben, yes this is the trade-off. Here are the considerations:
> >
> > 1. The bottle-neck in ovn-controller is easier to hit (you don't even need
> > many number of HVs to hit it)
> > 2. The bottle-neck of southbound DB do exist when number of HV increases
> > but since you are already working on the ovsdb clustering I suppose it will
> > be resolved.
> >
> > However I agree that this is not ideal. Alternatively we can spin-up a
> > dedicated thread for SB IDL processing and other "worker" thread just read
> > the data with proper locking. This will be more complicated but should be
> > doable, what do you think?
>
> I spent a little time thinking about this.  I think that the approach
> that you're proposing is probably practical.  Do you want to try to
> experiment with it and see whether it's reasonably possible?

It basically needs to separate reads and writes for SB IDL from the
xxx_run() functions, which may be tricky. But if there is no other way
around I'll go down this path.
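
A minimal sketch of one coarse-grained reading of "a dedicated thread
updates, workers read with proper locking" follows. Everything in it is
hypothetical: example_sb_snapshot and its functions are illustrative names,
not ovn-controller or IDL code, and a single mutex is only one of several
possible locking schemes.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical shared snapshot of southbound data; not an OVS structure. */
struct example_sb_snapshot {
    pthread_mutex_t mutex;   /* Single coarse-grained lock for all fields. */
    uint64_t seqno;          /* Bumped on every update by the IDL thread. */
    int n_rows;              /* Placeholder for the real database contents. */
};

void
example_snapshot_init(struct example_sb_snapshot *s)
{
    pthread_mutex_init(&s->mutex, NULL);
    s->seqno = 0;
    s->n_rows = 0;
}

/* Writer side: meant to be called only from the dedicated thread that
 * processes database updates. */
void
example_snapshot_update(struct example_sb_snapshot *s, int n_rows)
{
    pthread_mutex_lock(&s->mutex);
    s->n_rows = n_rows;
    s->seqno++;
    pthread_mutex_unlock(&s->mutex);
}

/* Reader side: worker threads copy out what they need under the lock and
 * get the sequence number back so they can detect later changes. */
uint64_t
example_snapshot_read(struct example_sb_snapshot *s, int *n_rows)
{
    uint64_t seqno;

    pthread_mutex_lock(&s->mutex);
    *n_rows = s->n_rows;
    seqno = s->seqno;
    pthread_mutex_unlock(&s->mutex);
    return seqno;
}
```

A worker would compare the returned sequence number with the one it saw
last and skip its work when nothing changed. Whether the xxx_run()
functions can be split along these lines is exactly the open question in
this thread.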


Re: [ovs-dev] [PATCH] netdev-dpdk: round up mbuf_size to cache_line_size

2017-06-08 Thread Darrell Ball


On 5/31/17, 6:08 AM, "ovs-dev-boun...@openvswitch.org on behalf of Santosh 
Shukla"  wrote:

Some PMD drivers (e.g. the vNIC ThunderX PMD) want mbuf_size to be a multiple
of cache_line_size. Without this fix, netdev-dpdk initialization would
simply fail for those drivers.

Signed-off-by: Santosh Shukla 
---
- Tested arm64/ThunderX platform for vNIC pmd,
- Topology: phy-phy and phy-vm
- Tested x86 platform for XL710/i40e pmd.

 lib/netdev-dpdk.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 9941f884f..1952a5cec 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -76,9 +76,12 @@ static struct vlog_rate_limit rl = 
VLOG_RATE_LIMIT_INIT(5, 20);
 #define MTU_TO_MAX_FRAME_LEN(mtu)   ((mtu) + ETHER_HDR_MAX_LEN)
 #define FRAME_LEN_TO_MTU(frame_len) ((frame_len)\
  - ETHER_HDR_LEN - ETHER_CRC_LEN)
-#define MBUF_SIZE(mtu)  (MTU_TO_MAX_FRAME_LEN(mtu)  \
+#define _MBUF_SIZE(mtu) (MTU_TO_MAX_FRAME_LEN(mtu)  \
  + sizeof(struct dp_packet) \
  + RTE_PKTMBUF_HEADROOM)
+#define MBUF_SIZE(mtu)  ROUND_UP((_MBUF_SIZE(mtu)), \
+ RTE_CACHE_LINE_SIZE)

Why not just add ROUND_UP to the existing MBUF_SIZE macro; I think it is
more straightforward and there are probably enough ‘MTU’ macros
already.


+
 #define NETDEV_DPDK_MBUF_ALIGN  1024
 #define NETDEV_DPDK_MAX_PKT_LEN 9728
 
@@ -495,7 +498,7 @@ dpdk_mp_create(int socket_id, int mtu)
   MP_CACHE_SZ,
   sizeof (struct dp_packet)
  - sizeof (struct 
rte_mbuf),
-  MBUF_SIZE(mtu)
+  MBUF_SIZE(dmp->mtu)

This change does not seem necessary and distracts from the point
of the patch.

  - sizeof(struct 
dp_packet),
   socket_id);
 if (dmp->mp) {
-- 
2.11.0
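
The rounding suggested in the review can be sketched in isolation as
below. This is an illustration under stated assumptions, not the v2 patch:
EXAMPLE_CACHE_LINE_SIZE stands in for RTE_CACHE_LINE_SIZE with an assumed
value of 128, EXAMPLE_ROUND_UP mimics the behavior of a round-up helper,
and example_mbuf_size() and its parameters are hypothetical names.

```c
#include <stddef.h>

/* Assumed stand-in for DPDK's RTE_CACHE_LINE_SIZE; the real value depends
 * on the build target. */
#define EXAMPLE_CACHE_LINE_SIZE 128

/* Round x up to the nearest multiple of y (y must be greater than 0). */
#define EXAMPLE_ROUND_UP(x, y) ((((x) + (y) - 1) / (y)) * (y))

/* Hypothetical sizing helper: frame + per-packet metadata + headroom, with
 * the rounding folded into the single size computation rather than split
 * across two macros. */
static inline size_t
example_mbuf_size(size_t frame_len, size_t metadata_len, size_t headroom)
{
    return EXAMPLE_ROUND_UP(frame_len + metadata_len + headroom,
                            (size_t) EXAMPLE_CACHE_LINE_SIZE);
}
```

With the assumed 128-byte line, an unrounded size of 2346 bytes becomes
2432, so a driver that insists on cache-line multiples would accept it.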



Re: [ovs-dev] [PATCH 4/4] ovn-controller: Use separate thread for packet-in processing.

2017-06-08 Thread Ben Pfaff
On Tue, Jun 06, 2017 at 05:24:19PM -0700, Han Zhou wrote:
> On Tue, Jun 6, 2017 at 3:56 PM, Ben Pfaff  wrote:
> >
> > On Thu, May 25, 2017 at 05:26:47PM -0700, Han Zhou wrote:
> > > This patch introduces multi-threading for ovn-controller and use
> > > dedicated thread for packet-in processing as a start. It decouples
> > > packet-in processing and ovs flow computing, so that packet-in inputs
> > > won't trigger flow recomputing, and flow computing won't block
> > > packet-in processing. In large scale environment this largely reduces
> > > CPU cost and improves performance.
> >
> > Won't this double the load on the southbound database server, as well as
> > the bandwidth to and from it?  We already have a bottleneck there.
> 
> Ben, yes this is the trade-off. Here are the considerations:
> 
> 1. The bottle-neck in ovn-controller is easier to hit (you don't even need
> many number of HVs to hit it)
> 2. The bottle-neck of southbound DB do exist when number of HV increases
> but since you are already working on the ovsdb clustering I suppose it will
> be resolved.
> 
> However I agree that this is not ideal. Alternatively we can spin-up a
> dedicated thread for SB IDL processing and other "worker" thread just read
> the data with proper locking. This will be more complicated but should be
> doable, what do you think?

I spent a little time thinking about this.  I think that the approach
that you're proposing is probably practical.  Do you want to try to
experiment with it and see whether it's reasonably possible?


Re: [ovs-dev] [PATCH 5/6] redhat: dynamically allocate and reference ovs user

2017-06-08 Thread Flavio Leitner
On Thu, Jun 08, 2017 at 04:39:07PM -0400, Aaron Conole wrote:
> >> half-dozen-of-the-other thing.  Whichever we choose, the requirements,
> >> in my mind, are simple:
> >> 
> >> 1. Don't break existing users who are upgrading.
> >> 2. Provide new installs with non-root users, because it's a good
> >>security practice.
> >> 
> >> Did I make sense?
> >
> > We share the same requirements.
> 
> Awesome!  Okay, I'll respin with your suggestions folded in.

Well, I haven't put much thought on it so feel free to push back.

> In the meantime, I will resubmit patches 2/6 and 3/6 as a separate
> series since they are independent of this work.

Sounds good!

Thanks Aaron,
-- 
Flavio



Re: [ovs-dev] [PATCH] system-traffic: 802.1ad: Add double VLAN match test case

2017-06-08 Thread Joe Stringer
Yes, I'm the right person.

At this point I'm looking for the tap device persistence issue to be
addressed before running it on my tester box which runs tests on a
variety of platforms.

On 8 June 2017 at 14:35, Ben Pfaff  wrote:
> Joe, are you the right person to review this?
>
> Thanks,
>
> Ben.
>
> On Thu, Jun 01, 2017 at 01:59:27PM -0400, Eric Garver wrote:
>> Test case to match outer, pop outer, then match inner VLAN.
>>
>> Signed-off-by: Eric Garver 
>> ---
>>  tests/system-traffic.at | 35 +++
>>  1 file changed, 35 insertions(+)
>>
>> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>> index b19e7538a2ef..4b883a0eaa7e 100644
>> --- a/tests/system-traffic.at
>> +++ b/tests/system-traffic.at
>> @@ -3772,3 +3772,38 @@ NS_CHECK_EXEC([at_ns0], [ping -q -c 1 -w 3 10.4.2.2], 
>> [1], [ignore])
>>
>>  OVS_TRAFFIC_VSWITCHD_STOP(["/dropping VLAN \(0\|300\) packet received on 
>> dot1q-tunnel port/d"])
>>  AT_CLEANUP
>> +
>> +AT_SETUP([802.1ad - double vlan match])
>> +OVS_TRAFFIC_VSWITCHD_START([set Open_vSwitch . other_config:vlan-limit=0])
>> +OVS_CHECK_8021AD()
>> +
>> +ADD_NAMESPACES(at_ns0, at_ns1)
>> +
>> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
>> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
>> +
>> +ADD_SVLAN(p0, at_ns0, 4094, "10.255.2.1/24")
>> +ADD_SVLAN(p1, at_ns1, 4094, "10.255.2.2/24")
>> +
>> +ADD_CVLAN(p0.4094, at_ns0, 100, "10.2.2.1/24")
>> +ADD_CVLAN(p1.4094, at_ns1, 100, "10.2.2.2/24")
>> +
>> +AT_DATA([flows-br0.txt], [dnl
>> +table=0,priority=1action=drop
>> +table=0,priority=100 dl_vlan=4094 action=pop_vlan,goto_table:1
>> +table=1,priority=100 dl_vlan=100  
>> action=push_vlan:0x88a8,mod_vlan_vid:4094,normal
>> +])
>> +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows-br0.txt])
>> +
>> +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping -c 1 10.2.2.2])
>> +
>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.2.2.2 | FORMAT_PING], 
>> [0], [dnl
>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
>> +])
>> +
>> +NS_CHECK_EXEC([at_ns0], [ping -s 1600 -q -c 3 -i 0.3 -w 2 10.2.2.2 | 
>> FORMAT_PING], [0], [dnl
>> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
>> +])
>> +
>> +OVS_TRAFFIC_VSWITCHD_STOP
>> +AT_CLEANUP
>> --
>> 2.12.0
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
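
For readers unfamiliar with the frame layout the quoted test exercises
(match outer, pop outer, then match inner VLAN), here is a minimal sketch
of the same sequence on a raw frame. It assumes the outer tag uses TPID
0x88a8 and the inner tag uses TPID 0x8100; the function and macro names
are hypothetical and none of this is OVS datapath code.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_TPID_8021AD 0x88a8   /* Outer (service) tag. */
#define EX_TPID_8021Q  0x8100   /* Inner (customer) tag, assumed here. */
#define EX_VID_MASK    0x0fff   /* Low 12 bits of the TCI hold the VLAN ID. */

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t
ex_get_be16(const uint8_t *p)
{
    return (uint16_t) ((p[0] << 8) | p[1]);
}

/* Returns 1 if 'frame' carries an outer 802.1ad tag with 'outer_vid'
 * followed by an inner 802.1Q tag with 'inner_vid', otherwise 0. */
int
example_match_double_vlan(const uint8_t *frame, size_t len,
                          uint16_t outer_vid, uint16_t inner_vid)
{
    const uint8_t *tag;

    /* 12 bytes of MAC addresses, two 4-byte tags, 2-byte payload type. */
    if (len < 12 + 4 + 4 + 2) {
        return 0;
    }

    tag = frame + 12;
    if (ex_get_be16(tag) != EX_TPID_8021AD
        || (ex_get_be16(tag + 2) & EX_VID_MASK) != outer_vid) {
        return 0;
    }

    tag += 4;   /* Equivalent of popping the outer tag. */
    return ex_get_be16(tag) == EX_TPID_8021Q
           && (ex_get_be16(tag + 2) & EX_VID_MASK) == inner_vid;
}
```

Calling it with outer_vid 4094 and inner_vid 100 corresponds to the two
dl_vlan matches in flows-br0.txt.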


Re: [ovs-dev] [PATCH v3 1/2] netdev-dpdk: Remove Rx checksum reconfigure.

2017-06-08 Thread Darrell Ball
LGTM

Acked-by: Darrell Ball 

On 6/8/17, 10:12 AM, "ovs-dev-boun...@openvswitch.org on behalf of Kevin 
Traynor"  
wrote:

Rx checksum offload is enabled by default on DPDK physical NICs
where available, with reconfiguration through
options:rx-checksum-offload. However, changing rx-checksum-offload
did not result in a reconfiguration of the NIC and wrong status is
reported for it.

As there seems to be diminishing reasons why a user would want
to disable Rx checksum offload, just remove the broken reconfiguration
option.

Fixes: 1a2bb11817a4 ("netdev-dpdk: Enable Rx checksum offloading feature on 
DPDK physical ports.")
Reported-by: Kevin Traynor 
Suggested-by: Sugesh Chandran 
Signed-off-by: Kevin Traynor 
---
 Documentation/howto/dpdk.rst | 17 +---
 lib/netdev-dpdk.c| 47 
+++-
 vswitchd/vswitch.xml | 14 -
 3 files changed, 13 insertions(+), 65 deletions(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 93248b4..af01d3e 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -277,16 +277,5 @@ Rx Checksum Offload
 ---
 
-By default, DPDK physical ports are enabled with Rx checksum offload. Rx
-checksum offload can be configured on a DPDK physical port either when 
adding
-or at run time.
-
-To disable Rx checksum offload when adding a DPDK port dpdk-p0::
-
-$ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
-  options:dpdk-devargs=:01:00.0 options:rx-checksum-offload=false
-
-Similarly to disable the Rx checksum offloading on a existing DPDK port 
dpdk-p0::
-
-$ ovs-vsctl set Interface dpdk-p0 options:rx-checksum-offload=false
+By default, DPDK physical ports are enabled with Rx checksum offload.
 
 Rx checksum offload can offer performance improvement only for tunneling
@@ -294,8 +283,4 @@ traffic in OVS-DPDK because the checksum validation of 
tunnel packets is
 offloaded to the NIC. Also enabling Rx checksum may slightly reduce the
 performance of non-tunnel traffic, specifically for smaller size packet.
-DPDK vectorization is disabled when checksum offloading is configured on 
DPDK
-physical ports which in turn effects the non-tunnel traffic performance.
-So it is advised to turn off the Rx checksum offload for non-tunnel 
traffic use
-cases to achieve the best performance.
 
 .. _extended-statistics:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index b770b70..79afda5 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -719,27 +719,4 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int 
n_rxq, int n_txq)
 
 static void
-dpdk_eth_checksum_offload_configure(struct netdev_dpdk *dev)
-OVS_REQUIRES(dev->mutex)
-{
-struct rte_eth_dev_info info;
-bool rx_csum_ol_flag = false;
-uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
- DEV_RX_OFFLOAD_TCP_CKSUM |
- DEV_RX_OFFLOAD_IPV4_CKSUM;
-rte_eth_dev_info_get(dev->port_id, &info);
-rx_csum_ol_flag = (dev->hw_ol_features & NETDEV_RX_CHECKSUM_OFFLOAD) 
!= 0;
-
-if (rx_csum_ol_flag &&
-(info.rx_offload_capa & rx_chksm_offload_capa) !=
- rx_chksm_offload_capa) {
-VLOG_WARN_ONCE("Rx checksum offload is not supported on device 
%"PRIu8,
-   dev->port_id);
-dev->hw_ol_features &= ~NETDEV_RX_CHECKSUM_OFFLOAD;
-return;
-}
-netdev_request_reconfigure(&dev->up);
-}
-
-static void
 dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) OVS_REQUIRES(dev->mutex)
 {
@@ -759,7 +736,19 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 int diag;
 int n_rxq, n_txq;
+uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
+ DEV_RX_OFFLOAD_TCP_CKSUM |
+ DEV_RX_OFFLOAD_IPV4_CKSUM;
 
 rte_eth_dev_info_get(dev->port_id, &info);
 
+if ((info.rx_offload_capa & rx_chksm_offload_capa) !=
+rx_chksm_offload_capa) {
+VLOG_WARN_ONCE("Rx checksum offload is not supported on device 
%"PRIu8,
+dev->port_id);
+dev->hw_ol_features &= ~NETDEV_RX_CHECKSUM_OFFLOAD;
+} else {
+dev->hw_ol_features |= NETDEV_RX_CHECKSUM_OFFLOAD;
+}
+
 n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
 n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
@@ -1205,6 +1194,4 @@ netdev_dpdk_set_config(struct netdev *netdev, const 
struct smap *args,
 {RTE_FC_RX_PAUSE, RTE_F
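
The capability test that the patch moves into dpdk_eth_dev_init() is an
"all required bits present" check. A standalone sketch of that pattern
follows; the EX_* bit values are made up for illustration and stand in for
the DEV_RX_OFFLOAD_*_CKSUM and NETDEV_RX_CHECKSUM_OFFLOAD flags, and
example_update_rx_csum() is a hypothetical helper, not a function in
netdev-dpdk.c.

```c
#include <stdint.h>

/* Made-up bit values standing in for the real DPDK/OVS flags. */
#define EX_RX_CHECKSUM_OFFLOAD   0x1u   /* Feature flag kept per device. */
#define EX_RX_OFFLOAD_IPV4_CKSUM 0x2u
#define EX_RX_OFFLOAD_UDP_CKSUM  0x4u
#define EX_RX_OFFLOAD_TCP_CKSUM  0x8u

/* Returns the updated feature mask: the Rx checksum flag is set only when
 * the device reports every one of the required capabilities, and cleared
 * otherwise.  Other feature bits are left untouched. */
static inline uint32_t
example_update_rx_csum(uint32_t hw_ol_features, uint32_t rx_offload_capa)
{
    const uint32_t required = EX_RX_OFFLOAD_UDP_CKSUM
                              | EX_RX_OFFLOAD_TCP_CKSUM
                              | EX_RX_OFFLOAD_IPV4_CKSUM;

    if ((rx_offload_capa & required) != required) {
        hw_ol_features &= ~EX_RX_CHECKSUM_OFFLOAD;
    } else {
        hw_ol_features |= EX_RX_CHECKSUM_OFFLOAD;
    }
    return hw_ol_features;
}
```

Comparing the masked value against the full mask, rather than testing it
for non-zero, is what makes a device with only partial checksum support
count as unsupported.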

Re: [ovs-dev] [PATCH v2 1/4] ovn: l3ha, handling of multiple gateways

2017-06-08 Thread Ben Pfaff
On Thu, Jun 08, 2017 at 02:05:05PM +, majop...@redhat.com wrote:
> From: Miguel Angel Ajo 
> 
> This patch handles multiple gateways with priorities in chassisredirect
> ports, any gateway with a chassis redirect port will implement the
> rules to de-encapsulate incoming packets for such port.
> 
> And hosts targeting a remote chassisredirect port will setup a
> bundle(active_backup, ..) action to each tunnel port, in the given
> priority order.
> 
> Signed-off-by: Miguel Angel Ajo 

I feel unqualified to fully and properly review this series.  Guru, is
it something you'd feel able to take a look at?  Is anyone else planning
to review this?

Thanks,

Ben.


Re: [ovs-dev] [PATCH] system-traffic: 802.1ad: Add double VLAN match test case

2017-06-08 Thread Ben Pfaff
Joe, are you the right person to review this?

Thanks,

Ben.

On Thu, Jun 01, 2017 at 01:59:27PM -0400, Eric Garver wrote:
> Test case to match outer, pop outer, then match inner VLAN.
> 
> Signed-off-by: Eric Garver 
> ---
>  tests/system-traffic.at | 35 +++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index b19e7538a2ef..4b883a0eaa7e 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -3772,3 +3772,38 @@ NS_CHECK_EXEC([at_ns0], [ping -q -c 1 -w 3 10.4.2.2], 
> [1], [ignore])
>  
>  OVS_TRAFFIC_VSWITCHD_STOP(["/dropping VLAN \(0\|300\) packet received on 
> dot1q-tunnel port/d"])
>  AT_CLEANUP
> +
> +AT_SETUP([802.1ad - double vlan match])
> +OVS_TRAFFIC_VSWITCHD_START([set Open_vSwitch . other_config:vlan-limit=0])
> +OVS_CHECK_8021AD()
> +
> +ADD_NAMESPACES(at_ns0, at_ns1)
> +
> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
> +
> +ADD_SVLAN(p0, at_ns0, 4094, "10.255.2.1/24")
> +ADD_SVLAN(p1, at_ns1, 4094, "10.255.2.2/24")
> +
> +ADD_CVLAN(p0.4094, at_ns0, 100, "10.2.2.1/24")
> +ADD_CVLAN(p1.4094, at_ns1, 100, "10.2.2.2/24")
> +
> +AT_DATA([flows-br0.txt], [dnl
> +table=0,priority=1action=drop
> +table=0,priority=100 dl_vlan=4094 action=pop_vlan,goto_table:1
> +table=1,priority=100 dl_vlan=100  
> action=push_vlan:0x88a8,mod_vlan_vid:4094,normal
> +])
> +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows-br0.txt])
> +
> +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping -c 1 10.2.2.2])
> +
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.2.2.2 | FORMAT_PING], 
> [0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +NS_CHECK_EXEC([at_ns0], [ping -s 1600 -q -c 3 -i 0.3 -w 2 10.2.2.2 | 
> FORMAT_PING], [0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +OVS_TRAFFIC_VSWITCHD_STOP
> +AT_CLEANUP
> -- 
> 2.12.0
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Ben Pfaff
Thanks Aaron and Darrell, I applied this to master.

On Thu, Jun 08, 2017 at 08:59:25PM +, Darrell Ball wrote:
> Thanks for doing this.
> 
> Acked-by: Darrell Ball 
> 
> On 6/8/17, 1:41 PM, "Aaron Conole"  wrote:
> 
> Since vhost-user client mode ports are the preferred mechanism for
> interconnecting Open vSwitch with VMs when using DPDK, and since there
> are currently no known use cases for vhost-user server mode ports apart
> from version incompatibilities with QEMU, announce that server mode ports
> are considered deprecated and will be removed in a future release.
> 
> v1->v2:
> * Verbiage changes as suggested by Kevin Traynor, and Darrell Ball.
> 
> Cc: Ciara Loftus 
> Cc: Kevin Traynor 
> Suggested-by: Darrell Ball 
> Signed-off-by: Aaron Conole 
> ---
> Previous version can be found at:
> 
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/333609.html
> 
>  Documentation/topics/dpdk/vhost-user.rst | 26 ++
>  NEWS |  2 ++
>  lib/netdev-dpdk.c|  2 ++
>  3 files changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> b/Documentation/topics/dpdk/vhost-user.rst
> index a1c19fd..3b11c4d 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -32,13 +32,20 @@ documentation`_ on same.
>  Quick Example
>  -
>  
> -This example demonstrates how to add two ``dpdkvhostuser`` ports to an 
> existing
> -bridge called ``br0``::
> +This example demonstrates how to add two ``dpdkvhostuserclient`` ports 
> to an
> +existing bridge called ``br0``::
>  
> -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
> --- set Interface dpdkvhostuser0 type=dpdkvhostuser
> -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
> --- set Interface dpdkvhostuser1 type=dpdkvhostuser
> +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
> +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
> +   options:vhost-server-path=/tmp/dpdkvhostclient0
> +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
> +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
> +   options:vhost-server-path=/tmp/dpdkvhostclient1
> +
> +For the above examples to work, an appropriate server socket must be 
> created
> +at the paths specified (``/tmp/dpdkvhostclient0`` and
> +``/tmp/dpdkvhostclient0``).  These sockets can be created with QEMU; see 
> the
> +:ref:`vhost-user client ` section for details.
>  
>  vhost-user vs. vhost-user-client
>  
> @@ -59,7 +66,9 @@ means if OVS dies, all VMs **must** be restarted. On 
> the other hand, for
>  vhost-user-client ports, OVS acts as the client and QEMU the server. 
> This means
>  OVS can die and be restarted without issue, and it is also possible to 
> restart
>  an instance itself. For this reason, vhost-user-client ports are the 
> preferred
> -type for most use cases.
> +type for all known use cases; the only limitation is that vhost-user 
> client
> +mode ports require QEMU version 2.7.  Ports of type vhost-user are 
> currently
> +deprecated and will be removed in a future release.
>  
>  .. _dpdk-vhost-user:
>  
> @@ -68,7 +77,8 @@ vhost-user
>  
>  .. important::
>  
> -   Use of vhost-user ports requires QEMU >= 2.2
> +   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user ports are
> +   *deprecated*.
>  
>  To use vhost-user ports, you must first add said ports to the switch. 
> DPDK
>  vhost-user ports can have arbitrary names with the exception of forward 
> and
> diff --git a/NEWS b/NEWS
> index 82004c8..b81d033 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -16,6 +16,8 @@ Post-v2.7.0
> Log level can be changed in a usual OVS way using
> 'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
> still can be configured via extra arguments for DPDK EAL.
> + * dpdkvhostuser ports are marked as deprecated.  They will be 
> removed
> +   in an upcoming release.
> - IPFIX now provides additional counters:
>   * Total counters since metering process startup.
>   * Per-flow TCP flag counters.
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index b770b70..9ab4aeb 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -966,6 +966,8 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
>  err = vho

Re: [ovs-dev] [PATCH v4 1/3] rstp: Add rstp port name for human reading.

2017-06-08 Thread Ben Pfaff
On Wed, May 31, 2017 at 08:38:14PM -0700, nickcooper-zhangtonghao wrote:
> This patch is useful to debug rstp subsystem and log the
> port name instead of port number. This patch will also
> be used to display rstp info for next patches.
> 
> Signed-off-by: nickcooper-zhangtonghao 
> Acked-by: Jarno Rajahalme 

Thanks!  I applied all of these to master.


Re: [ovs-dev] [PATCH v3 0/3] role-based access controls for ovsdb-server, ovn-sb

2017-06-08 Thread Ben Pfaff
On Wed, May 31, 2017 at 07:04:02PM -0400, Lance Richardson wrote:
> This series implements role-based access control infrastructure for
> ovsdb-server, and uses that infrastructure to apply role-based access
> controls to the OVN_Southbound database. This implementation follows
> the outline discussed at:
> 
>  https://mail.openvswitch.org/pipermail/ovs-dev/2017-March/329801.html

Thanks a lot!  I applied these patches to master.  I only changed them
to add a couple of relevant items to NEWS.


Re: [ovs-dev] [PATCH v2] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Darrell Ball
Thanks for doing this.

Acked-by: Darrell Ball 

On 6/8/17, 1:41 PM, "Aaron Conole"  wrote:

Since vhost-user client mode ports are the preferred mechanism for
interconnecting Open vSwitch with VMs when using DPDK, and since there
are currently no known use cases for vhost-user server mode ports apart
from version incompatibilities with QEMU, announce that server mode ports
are considered deprecated and will be removed in a future release.

v1->v2:
* Verbiage changes as suggested by Kevin Traynor, and Darrell Ball.

Cc: Ciara Loftus 
Cc: Kevin Traynor 
Suggested-by: Darrell Ball 
Signed-off-by: Aaron Conole 
---
Previous version can be found at:

https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/333609.html

 Documentation/topics/dpdk/vhost-user.rst | 26 ++
 NEWS |  2 ++
 lib/netdev-dpdk.c|  2 ++
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
index a1c19fd..3b11c4d 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -32,13 +32,20 @@ documentation`_ on same.
 Quick Example
 -
 
-This example demonstrates how to add two ``dpdkvhostuser`` ports to an 
existing
-bridge called ``br0``::
+This example demonstrates how to add two ``dpdkvhostuserclient`` ports to 
an
+existing bridge called ``br0``::
 
-$ ovs-vsctl add-port br0 dpdkvhostuser0 \
--- set Interface dpdkvhostuser0 type=dpdkvhostuser
-$ ovs-vsctl add-port br0 dpdkvhostuser1 \
--- set Interface dpdkvhostuser1 type=dpdkvhostuser
+$ ovs-vsctl add-port br0 dpdkvhostclient0 \
+-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
+   options:vhost-server-path=/tmp/dpdkvhostclient0
+$ ovs-vsctl add-port br0 dpdkvhostclient1 \
+-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
+   options:vhost-server-path=/tmp/dpdkvhostclient1
+
+For the above examples to work, an appropriate server socket must be 
created
+at the paths specified (``/tmp/dpdkvhostclient0`` and
+``/tmp/dpdkvhostclient1``).  These sockets can be created with QEMU; see 
the
+:ref:`vhost-user client ` section for details.
 
 vhost-user vs. vhost-user-client
 
@@ -59,7 +66,9 @@ means if OVS dies, all VMs **must** be restarted. On the 
other hand, for
 vhost-user-client ports, OVS acts as the client and QEMU the server. This 
means
 OVS can die and be restarted without issue, and it is also possible to 
restart
 an instance itself. For this reason, vhost-user-client ports are the 
preferred
-type for most use cases.
+type for all known use cases; the only limitation is that vhost-user client
+mode ports require QEMU version 2.7.  Ports of type vhost-user are 
currently
+deprecated and will be removed in a future release.
 
 .. _dpdk-vhost-user:
 
@@ -68,7 +77,8 @@ vhost-user
 
 .. important::
 
-   Use of vhost-user ports requires QEMU >= 2.2
+   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user ports are
+   *deprecated*.
 
 To use vhost-user ports, you must first add said ports to the switch. DPDK
 vhost-user ports can have arbitrary names with the exception of forward and
diff --git a/NEWS b/NEWS
index 82004c8..b81d033 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,8 @@ Post-v2.7.0
Log level can be changed in a usual OVS way using
'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
still can be configured via extra arguments for DPDK EAL.
+ * dpdkvhostuser ports are marked as deprecated.  They will be removed
+   in an upcoming release.
- IPFIX now provides additional counters:
  * Total counters since metering process startup.
  * Per-flow TCP flag counters.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index b770b70..9ab4aeb 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -966,6 +966,8 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
 err = vhost_common_construct(netdev);
 
 ovs_mutex_unlock(&dpdk_mutex);
+VLOG_WARN_ONCE("dpdkvhostuser ports are considered deprecated;  "
+   "please migrate to dpdkvhostuserclient ports.");
 return err;
 }
 
-- 
2.9.4




Re: [ovs-dev] [PATCH v3 2/2] netdev-dpdk: Show Rx checksum status when false.

2017-06-08 Thread Ben Pfaff
On Thu, Jun 08, 2017 at 06:12:20PM +0100, Kevin Traynor wrote:
> Currently ovs-appctl dpctl/show only shows the Rx checksum offload
> status when true. Change to also show the status when false.
> 
> CC: Sugesh Chandran 
> Signed-off-by: Kevin Traynor 

This patch looks obviously correct to me, so I applied it to master.
I'll let others, who know DPDK better, review patch 1.


[ovs-dev] [PATCH 1/2] Debian: Rework libopenvswitch packages

2017-06-08 Thread Ben Warren via dev
From: Ben Warren 

The 'openvswitch-common' package did not work well with cross-compiling
since it required Python.  This package is broken into two packages as
follows:
- libopenvswitch: contains library files (.a, .so)
- openvswitch-common: depends on libopenvswitch, contains command-line
  tools such as ovs-ofctl, ovs-appctl etc.

In addition, this 'openvswitch-dev' library is renamed to
'libopenvswitch-dev' to align more closely with Debian policy.  It
depends on libopenvswitch.

Signed-off-by: Ben Warren 
---
 debian/.gitignore |  4 +++-
 debian/automake.mk|  3 ++-
 debian/control| 23 +--
 debian/libopenvswitch-dev.install | 11 +++
 debian/libopenvswitch.install |  1 +
 debian/openvswitch-common.install |  1 -
 debian/openvswitch-dev.install| 11 ---
 7 files changed, 38 insertions(+), 16 deletions(-)
 create mode 100644 debian/libopenvswitch-dev.install
 create mode 100644 debian/libopenvswitch.install
 delete mode 100644 debian/openvswitch-dev.install

diff --git a/debian/.gitignore b/debian/.gitignore
index 4baed48..9ec70eb 100644
--- a/debian/.gitignore
+++ b/debian/.gitignore
@@ -6,6 +6,8 @@
 /control
 /copyright
 /files
+/libopenvswitch
+/libopenvswitch-dev
 /nicira-switch
 /openvswitch
 /openvswitch-common
@@ -13,7 +15,6 @@
 /openvswitch-datapath-source
 /openvswitch-datapath-dkms
 /openvswitch-dbg
-/openvswitch-dev
 /openvswitch-ipsec
 /openvswitch-pki
 /openvswitch-switch
@@ -22,6 +23,7 @@
 /openvswitch-testcontroller
 /openvswitch-vtep
 /ovn-common
+/ovn-controller-vtep
 /ovn-host
 /ovn-central
 /ovn-docker
diff --git a/debian/automake.mk b/debian/automake.mk
index 07ea912..4d8e204 100644
--- a/debian/automake.mk
+++ b/debian/automake.mk
@@ -7,6 +7,8 @@ EXTRA_DIST += \
debian/copyright.in \
debian/dkms.conf.in \
debian/dirs \
+   debian/libopenvswitch.install \
+   debian/libopenvswitch-dev.install \
debian/openvswitch-common.dirs \
debian/openvswitch-common.docs \
debian/openvswitch-common.install \
@@ -18,7 +20,6 @@ EXTRA_DIST += \
debian/openvswitch-datapath-source.copyright \
debian/openvswitch-datapath-source.dirs \
debian/openvswitch-datapath-source.install \
-   debian/openvswitch-dev.install \
debian/openvswitch-pki.dirs \
debian/openvswitch-pki.postinst \
debian/openvswitch-pki.postrm \
diff --git a/debian/control b/debian/control
index 0b75f2b..42e6f16 100644
--- a/debian/control
+++ b/debian/control
@@ -59,6 +59,7 @@ Architecture: linux-any
 Depends: openssl,
  python (>= 2.7),
  python-six,
+ libopenvswitch (= ${binary:Version}),
  ${misc:Depends},
  ${shlibs:Depends}
 Suggests: ethtool
@@ -76,6 +77,22 @@ Description: Open vSwitch common components
  openvswitch-common provides components required by both openvswitch-switch
  and openvswitch-testcontroller.
 
+Package: libopenvswitch
+Architecture: linux-any
+Depends: libssl-dev,
+ ${misc:Depends},
+ ${shlibs:Depends}
+Description: Open vSwitch common components
+ Open vSwitch is a production quality, multilayer, software-based,
+ Ethernet virtual switch. It is designed to enable massive network
+ automation through programmatic extension, while still supporting
+ standard management interfaces and protocols (e.g. NetFlow, IPFIX,
+ sFlow, SPAN, RSPAN, CLI, LACP, 802.1ag). In addition, it is designed
+ to support distribution across multiple physical servers similar to
+ VMware's vNetwork distributed vswitch or Cisco's Nexus 1000V.
+ .
+ libopenvswitch provides runtime libraries for use by openvswitch binaries
+
 Package: openvswitch-switch
 Architecture: linux-any
 Suggests: openvswitch-datapath-module
@@ -283,11 +300,13 @@ Description: Open vSwitch VTEP utilities
  This package provides utilities that are useful to interact with a
  VTEP-configured database and a VTEP emulator.
 
-Package: openvswitch-dev
+Package: libopenvswitch-dev
 Architecture: linux-any
 Depends:
- openvswitch-common (>= ${binary:Version}),
+ libopenvswitch (>= ${binary:Version}),
  ${misc:Depends}
+Conflicts: openvswitch-dev
+Replaces: openvswitch-dev
 Description: Open vSwitch development package
  Open vSwitch is a production quality, multilayer, software-based, Ethernet
  virtual switch. It is designed to enable massive network automation through
diff --git a/debian/libopenvswitch-dev.install b/debian/libopenvswitch-dev.install
new file mode 100644
index 000..11791e4
--- /dev/null
+++ b/debian/libopenvswitch-dev.install
@@ -0,0 +1,11 @@
+usr/lib/lib*.so
+usr/lib/lib*.a
+usr/lib/pkgconfig
+include/*.h usr/include/openvswitch
+include/openflow/*.h usr/include/openvswitch/openflow
+include/openvswitch/*.h usr/include/openvswitch/openvswitch
+include/sparse/*.h usr/include/openvswitch/sparse
+include/sparse/arpa/*.h usr/include/openvswitch/sparse/arpa
+include/sparse/netinet/*.h usr/include/openvsw

[ovs-dev] [PATCH 2/2] Debian: Provide multi-arch support

2017-06-08 Thread Ben Warren via dev
From: Ben Warren 

This puts all libraries and pkg-config files in architecture-specific
directories for easier cross-compiling.

Signed-off-by: Ben Warren 
---
 debian/compat | 2 +-
 debian/control| 3 +++
 debian/libopenvswitch-dev.install | 6 +++---
 debian/libopenvswitch.install | 2 +-
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/debian/compat b/debian/compat
index 45a4fb7..ec63514 100644
--- a/debian/compat
+++ b/debian/compat
@@ -1 +1 @@
-8
+9
diff --git a/debian/control b/debian/control
index 42e6f16..2173735 100644
--- a/debian/control
+++ b/debian/control
@@ -56,6 +56,7 @@ Description: Open vSwitch datapath module source - DKMS version
 
 Package: openvswitch-common
 Architecture: linux-any
+Multi-Arch: foreign
 Depends: openssl,
  python (>= 2.7),
  python-six,
@@ -79,6 +80,7 @@ Description: Open vSwitch common components
 
 Package: libopenvswitch
 Architecture: linux-any
+Multi-Arch: same
 Depends: libssl-dev,
  ${misc:Depends},
  ${shlibs:Depends}
@@ -302,6 +304,7 @@ Description: Open vSwitch VTEP utilities
 
 Package: libopenvswitch-dev
 Architecture: linux-any
+Multi-Arch: same
 Depends:
  libopenvswitch (>= ${binary:Version}),
  ${misc:Depends}
diff --git a/debian/libopenvswitch-dev.install b/debian/libopenvswitch-dev.install
index 11791e4..ca3d22c 100644
--- a/debian/libopenvswitch-dev.install
+++ b/debian/libopenvswitch-dev.install
@@ -1,6 +1,6 @@
-usr/lib/lib*.so
-usr/lib/lib*.a
-usr/lib/pkgconfig
+usr/lib/*/lib*.so
+usr/lib/*/lib*.a
+usr/lib/*/pkgconfig
 include/*.h usr/include/openvswitch
 include/openflow/*.h usr/include/openvswitch/openflow
 include/openvswitch/*.h usr/include/openvswitch/openvswitch
diff --git a/debian/libopenvswitch.install b/debian/libopenvswitch.install
index d0dbfd1..3ddde58 100644
--- a/debian/libopenvswitch.install
+++ b/debian/libopenvswitch.install
@@ -1 +1 @@
-usr/lib/lib*.so.*
+usr/lib/*/lib*.so.*
-- 
2.6.4



[ovs-dev] [PATCH 0/2] Debian: refactor development packages

2017-06-08 Thread Ben Warren via dev
From: Ben Warren 

The existing packages were not configured properly for
cross-compilation.  In particular, development packages and their
dependencies should not require executables.

A new package, 'libopenvswitch', is introduced, which contains the
library files (*.so and *.a), while 'openvswitch-dev' is renamed
to 'libopenvswitch-dev' to be consistent with other development
packages.

Finally, the packages are modified to make use of "Multi-Arch",
which places library files in architecture-specific locations in
the file system.  For example:

Without Multi-Arch:
/usr/lib/libopenvswitch-2.so.7

With Multi-Arch:
/usr/lib/x86_64-linux-gnu/libopenvswitch-2.so.7

allowing multiple architectures to co-exist and thus making cross-
compiling easier.

Ben Warren (2):
  Debian: Rework libopenvswitch packages
  Debian: Provide multi-arch support

 debian/.gitignore |  4 +++-
 debian/automake.mk|  3 ++-
 debian/compat |  2 +-
 debian/control| 26 --
 debian/libopenvswitch-dev.install | 11 +++
 debian/libopenvswitch.install |  1 +
 debian/openvswitch-common.install |  1 -
 debian/openvswitch-dev.install| 11 ---
 8 files changed, 42 insertions(+), 17 deletions(-)
 create mode 100644 debian/libopenvswitch-dev.install
 create mode 100644 debian/libopenvswitch.install
 delete mode 100644 debian/openvswitch-dev.install

-- 
2.6.4



Re: [ovs-dev] [PATCH] dpif-netdev: Fix insertion probability

2017-06-08 Thread Ben Pfaff
On Wed, May 17, 2017 at 09:28:49AM +0100, Ciara Loftus wrote:
> -#ifdef DPDK_NETDEV
> -if (min && (key->hash ^ (uint32_t) pmd->last_cycles) <= min) {
> -#else
>  if (min && (key->hash ^ random_uint32()) <= min) {
> -#endif

If A is random, then A^B is still random, so
  if (min && random_uint32() <= min) {
is equivalent.
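
The equivalence holds when A is uniformly distributed and independent of B:
XOR with a fixed B is then a bijection, so it only permutes the possible
values of A.  A minimal sketch that checks this exhaustively on 8-bit values
(the function name and the 8-bit width are assumptions made for illustration;
they are not taken from dpif-netdev):

```c
#include <stdint.h>

/* Counts the 8-bit values 'a' for which (a ^ b) <= min.  Because XOR with a
 * fixed 'b' is a bijection on 0..255, the count is always min + 1, which is
 * the same as the number of values 'a' with a <= min. */
unsigned int
count_xor_le(uint8_t b, uint8_t min)
{
    unsigned int count = 0;

    for (unsigned int a = 0; a < 256; a++) {
        if (((uint8_t) a ^ b) <= min) {
            count++;
        }
    }
    return count;
}
```

For any 'b', count_xor_le(b, min) comes out as min + 1, so the probability
of passing the "<= min" test does not depend on the value that is XORed in.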


Re: [ovs-dev] [PATCH] netdev-dpdk: round up mbuf_size to cache_line_size

2017-06-08 Thread Ben Pfaff
On Wed, Jun 07, 2017 at 06:40:54PM +0530, santosh wrote:
> Hi,
> 
> On Monday 05 June 2017 10:31 AM, santosh wrote:
> 
> > On Wednesday 31 May 2017 06:38 PM, Santosh Shukla wrote:
> >
> >> Some pmd driver(e.g: vNIC thunderx PMD) want mbuf_size to be multiple of
> >> cache_line_size. With out this fix, Netdev-dpdk initialization would
> >> simply fail for those drivers.
> >>
> >> Signed-off-by: Santosh Shukla 
> >> ---
> >> - Tested arm64/ThunderX platform for vNIC pmd,
> >> - Topology: phy-phy and phy-vm
> >> - Tested x86 platform for XL710/i40e pmd.
> > Ping? Review comment/feedback?
> 
> Pinging again? Appreciate any review comment/feedback about patch. Thanks.

It seems reasonable to me but I'd like to hear from someone who knows
DPDK better.
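
For readers following along, the rounding the patch relies on can be sketched
as below.  This is only an illustration of the arithmetic: the helper names,
the example size of 2218 bytes and the cache-line sizes of 64 and 128 bytes
are assumptions, not values taken from netdev-dpdk.c.

```c
#include <stddef.h>

/* Rounds 'size' up to the nearest multiple of 'align'.  'align' must be
 * greater than zero. */
size_t
round_up_size(size_t size, size_t align)
{
    return (size + align - 1) / align * align;
}

/* Returns nonzero if 'size' is a whole number of cache lines. */
int
is_cache_line_multiple(size_t size, size_t cache_line_size)
{
    return size % cache_line_size == 0;
}
```

With these helpers, a 2218-byte buffer becomes 2240 bytes for a 64-byte cache
line and 2304 bytes for a 128-byte one; a size that is already aligned is
left unchanged.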


[ovs-dev] [PATCH v2] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Aaron Conole
Since vhost-user client mode ports are the preferred mechanism for
interconnecting Open vSwitch with VMs when using DPDK, and since there
are currently no known use cases for vhost-user server mode ports apart
from version incompatibilities with QEMU, announce that server mode ports
are considered deprecated and will be removed in a future release.

v1->v2:
* Verbiage changes as suggested by Kevin Traynor, and Darrell Ball.

Cc: Ciara Loftus 
Cc: Kevin Traynor 
Suggested-by: Darrell Ball 
Signed-off-by: Aaron Conole 
---
Previous version can be found at:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/333609.html

 Documentation/topics/dpdk/vhost-user.rst | 26 ++
 NEWS |  2 ++
 lib/netdev-dpdk.c|  2 ++
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
index a1c19fd..3b11c4d 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -32,13 +32,20 @@ documentation`_ on same.
 Quick Example
 -
 
-This example demonstrates how to add two ``dpdkvhostuser`` ports to an existing
-bridge called ``br0``::
+This example demonstrates how to add two ``dpdkvhostuserclient`` ports to an
+existing bridge called ``br0``::
 
-$ ovs-vsctl add-port br0 dpdkvhostuser0 \
--- set Interface dpdkvhostuser0 type=dpdkvhostuser
-$ ovs-vsctl add-port br0 dpdkvhostuser1 \
--- set Interface dpdkvhostuser1 type=dpdkvhostuser
+$ ovs-vsctl add-port br0 dpdkvhostclient0 \
+-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
+   options:vhost-server-path=/tmp/dpdkvhostclient0
+$ ovs-vsctl add-port br0 dpdkvhostclient1 \
+-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
+   options:vhost-server-path=/tmp/dpdkvhostclient1
+
+For the above examples to work, an appropriate server socket must be created
+at the paths specified (``/tmp/dpdkvhostclient0`` and
+``/tmp/dpdkvhostclient1``).  These sockets can be created with QEMU; see the
+:ref:`vhost-user client ` section for details.
 
 vhost-user vs. vhost-user-client
 
@@ -59,7 +66,9 @@ means if OVS dies, all VMs **must** be restarted. On the other hand, for
 vhost-user-client ports, OVS acts as the client and QEMU the server. This means
 OVS can die and be restarted without issue, and it is also possible to restart
 an instance itself. For this reason, vhost-user-client ports are the preferred
-type for most use cases.
+type for all known use cases; the only limitation is that vhost-user client
+mode ports require QEMU version 2.7.  Ports of type vhost-user are currently
+deprecated and will be removed in a future release.
 
 .. _dpdk-vhost-user:
 
@@ -68,7 +77,8 @@ vhost-user
 
 .. important::
 
-   Use of vhost-user ports requires QEMU >= 2.2
+   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user ports are
+   *deprecated*.
 
 To use vhost-user ports, you must first add said ports to the switch. DPDK
 vhost-user ports can have arbitrary names with the exception of forward and
diff --git a/NEWS b/NEWS
index 82004c8..b81d033 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,8 @@ Post-v2.7.0
Log level can be changed in a usual OVS way using
'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
still can be configured via extra arguments for DPDK EAL.
+ * dpdkvhostuser ports are marked as deprecated.  They will be removed
+   in an upcoming release.
- IPFIX now provides additional counters:
  * Total counters since metering process startup.
  * Per-flow TCP flag counters.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index b770b70..9ab4aeb 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -966,6 +966,8 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
 err = vhost_common_construct(netdev);
 
 ovs_mutex_unlock(&dpdk_mutex);
+VLOG_WARN_ONCE("dpdkvhostuser ports are considered deprecated;  "
+   "please migrate to dpdkvhostuserclient ports.");
 return err;
 }
 
-- 
2.9.4



Re: [ovs-dev] [PATCH] testsuite: exit gracefully if it fails.

2017-06-08 Thread Ben Pfaff
On Thu, Jun 08, 2017 at 04:19:00PM -0300, Flavio Leitner wrote:
> On Thu, Jun 08, 2017 at 10:53:05AM -0700, Ben Pfaff wrote:
> > On Thu, Jun 08, 2017 at 02:30:48PM -0300, Flavio Leitner wrote:
> > > The daemon is killed leaving resources behind when a test fails.
> > > > Fix this by first signaling the daemon to exit gracefully.
> > > 
> > > Suggested-by: Joe Stringer 
> > > Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent")
> > > Signed-off-by: Flavio Leitner 
> > > ---
> > >  tests/ofproto-macros.at | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
> > > index faff5b0..5ac5d05 100644
> > > --- a/tests/ofproto-macros.at
> > > +++ b/tests/ofproto-macros.at
> > > @@ -323,6 +323,9 @@ m4_define([_OVS_VSWITCHD_START],
> > > > AT_CHECK([ovs-vswitchd $1 --detach --no-chdir --pidfile --log-file -vvconn -vofproto_dpif -vunixctl], [0], [], [stderr])
> > > AT_CAPTURE_FILE([ovs-vswitchd.log])
> > > on_exit "kill `cat ovs-vswitchd.pid`"
> > > +   dnl Wait for the daemon to exit gracefully
> > > > +   on_exit "for i in 1 2 3 4 5 6 7 8 9; do kill -0 `cat ovs-vswitchd.pid` || break; sleep 0.1 || sleep 1; done"
> > > +   on_exit "ovs-appctl -t ovs-vswitchd exit --cleanup"
> > 
> > Thanks for the patch.
> > 
> > At first, I thought that this did the steps in the wrong order, but
> > "on_exit" reverses the order. 
> 
> That's documented! :)

I wrote the implementation *and* the documentation for this bit here.
One forgets ;-)

> > It would be less surprising to do this with just one call to on_exit,
> > e.g.
> > 
> > on_exit '
> > ovs-appctl -t ovs-vswitchd exit --cleanup
> > for i in 1 2 3 4 5 6 7 8 9; do
> > kill -0 `cat ovs-vswitchd.pid` || break
> > sleep 0.1 || sleep 1
> > done
> > kill `cat ovs-vswitchd.pid`
> > '
> > 
> > Actually, I think that all of this could be put in a shell function:
> > 
> > kill_ovs_vswitchd() {
> > ovs-appctl -t ovs-vswitchd exit --cleanup
> > for i in 1 2 3 4 5 6 7 8 9; do
> > kill -0 `cat ovs-vswitchd.pid` || break
> > sleep 0.1 || sleep 1
> > done
> > kill `cat ovs-vswitchd.pid`
> > }
> > 
> > and then just "on_exit kill_ovs_vswitchd".  Maybe that is the best
> > approach.
> 
> I agree, let me respin the patch.

Thanks!
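
The ordering point discussed above (hooks registered with "on_exit" run in
reverse order) can be modelled with a small last-in, first-out list.  This is
a sketch in C rather than the testsuite's shell, and every name in it is
hypothetical; it only illustrates why registering "kill", then the wait loop,
then the appctl exit yields the sequence exit, wait, kill.

```c
#include <stdio.h>
#include <string.h>

#define MAX_HOOKS 8

/* Hypothetical stand-in for the testsuite's on_exit list: hooks are stored
 * in registration order and run last-registered-first. */
static const char *hooks[MAX_HOOKS];
static int n_hooks;

/* Registers 'action' to run at exit.  Returns 0 on success, -1 if full. */
int
register_on_exit(const char *action)
{
    if (n_hooks >= MAX_HOOKS) {
        return -1;
    }
    hooks[n_hooks++] = action;
    return 0;
}

/* "Runs" the hooks in reverse registration order, writing the resulting
 * sequence into 'out' as a comma-separated list.  'out_size' must be > 0. */
void
run_on_exit(char *out, size_t out_size)
{
    out[0] = '\0';
    while (n_hooks > 0) {
        const char *action = hooks[--n_hooks];
        size_t used = strlen(out);

        snprintf(out + used, out_size - used, "%s%s", used ? "," : "", action);
    }
}
```

Registering "kill", "wait" and "appctl-exit" in that order and then calling
run_on_exit() produces "appctl-exit,wait,kill", which is the order the patch
depends on.  Putting all three steps into one hook or one shell function, as
suggested above, removes that dependence on registration order.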


Re: [ovs-dev] [PATCH 5/6] redhat: dynamically allocate and reference ovs user

2017-06-08 Thread Aaron Conole
Flavio Leitner  writes:

> On Thu, Jun 08, 2017 at 03:46:24PM -0400, Aaron Conole wrote:
>> Flavio Leitner  writes:
>> 
>> > On Sat, Jun 03, 2017 at 11:10:00AM -0400, Aaron Conole wrote:
>> >> After this commit, the fedora RPM will create the openvswitch user, from the
>> >> non-static pool, for use as an Open vSwitch daemon user.  This only happens
>> >> on install - not upgrade.  This will be the default user:group
>> >> combination for the openvswitch daemons.
>> >> 
>> >> To do this in a way that doesn't impact existing installations, the
>> >> /etc/openvswitch directory will be created during the installation,
>> >> rather than being provided as part of the rpm.
>> >
>> > In the previous patch you add the user configuration to the sysconfig
>> > file and here it adds the same info again to another file which the
>> > user might change, getting us back to state today but now with 2
>> > files.
>> 
>> Sortof.  The idea is the openvswitch-pre file is like the default.conf
>> you detail below.  I could remove the comment from the openvswitch
>> config file, if you think it's causing confusion.  More follows.
>
> /etc/sysconfig is meant to be changed by the user and the name
> implies that it is executed before something, which something?

Agreed.  Poor choice on my part.

>> > Perhaps we could adopt another approach that we have a default
>> > recommended configuration and then a file where the user can
>> > customize it?
>> >
>> > In this case we would create /etc/openvswitch/default.conf.
>> >
>> > If the user wants to change something, it replaces the variable in
>> > /etc/sysconfig/openvswitch as it works today.  Since default.conf is
>> > owned by the system, we can assume it's not edited by the user.
>> 
>> That's the assumption on openvswitch-pre, but I might have gotten the
>> %files section wrong for it (meaning, I assume a user will not edit
>> it).  I can call it /etc/openvswitch/default.conf if you want; that's
>> probably better, actually - so v2 I will do that, regardless what form
>> it takes.
>
> OK, I assume that since it's /etc/sysconfig, the user could change.
>
>> > Then we ship /etc/openvswitch/default.conf with
>> > OVS_USER_ID="openvswitch:openvswitch" by default, so new installations
>> > will have the file state correct in the rpmdb.
>> 
>> Is that not the case for the file I added?  I thought it was okay, but I
>> might misunderstand the way the file flags work.
>
> Sort of, because we have the config in three places now:
> One could edit the systemd services to change the line
> Environment="OVS_USER_ID=root:root", or the /etc/sysconfig/openvswitch
> or /etc/sysconfig/openvswitch-pre
>
>> > The %post appends to the end of /etc/sysconfig/openvswitch the
>> > variable replacing the default user id to root.
>> 
>> Here are the issues I can think of:
>> 
>>  - The rpmdb now thinks that the openvswitch file has been modified by
>>the user (so the %config part will be flagged even though the user
>>did nothing).
>
> Yup, then on the future upgrades, the config file should not be replaced
> and that would happen with /etc/sysconfig/openvswitch-pre, right?

Well, upgrades wouldn't write to openvswitch-pre; that only happens on
install.

>>  - The %post on upgrade will have to detect if the user has an
>>OVS_USER_ID specified, and ignore it appropriately
>
> Right.
>
>>  - The file permissions have to change again (ie: we change them to in
>>the %files section to be owned openvswitch:openvswitch, then have to
>>change them in the %post for upgrade to be the correct
>>user/group... but only if we know we are editing the ovs_user_id, I
>>think).
>
> You mean /etc/openvswitch or another files?

Yes, /etc/openvswitch/ and all the files under it.  We have this problem
anyway, I guess.  The pains of transitioning :)

>> I'm willing to try and rework this series to go in that direction, but I
>> prefer doing it this way (setting the uid on new installs in %post)
>> because it's an easy binary decision.  Is it new?  Okay, there's no way
>> we can break something on the user, so change things around.  On the
>> other hand, working from the opposite, we have to detect properly that
>> the user's system won't be broken by the change.  Maybe there's a good
>> enough regular expression / other test we can design.  I'll take any
>> suggestion :)
>
> Wouldn't the rpm change the /etc/openvswitch permissions back to
> default when the user upgrades from a new installation?

I didn't notice this; I can look for it specifically to confirm.  Maybe
I missed it in my testing.

>> > Then on new installations we have /etc/openvswitch/default.conf with
>> > the recommended system options, nothing on /etc/sysconfig/openvswitch,
>> > no need to add root userid to the services.
>> >
>> > On upgrades, there will be the default.conf recommending to run as
>> > user, and /etc/sysconfig/openvswitch changing to root which the
>> > admin can comment out to move on, and system ser

Re: [ovs-dev] [PATCH v1] netdev-linux: Remove device in netdev_shash when deleted

2017-06-08 Thread Ben Pfaff
On Wed, May 31, 2017 at 09:30:43AM +0800, fukaige wrote:
> Start a virtual machine with its backend tap device attached to a brought up linux bridge.
> If we delete the linux bridge when vm is still running, we'll get the following error when
> trying to create a ovs bridge with the same name.
> 
> The reason is that ovs-router subsystem add the linux bridge into netdev_shash, but does
> not remove it when the bridge is deleted in the situation. When the bridge is deleted, ovs
> will receive a RTM_DELLINK msg, take this chance to remove the bridge in netdev_shash.
> 
> ovs-vsctl: Error detected while setting up 'br-eth'.  See ovs-vswitchd log for details.
> 
> ovs-vswitchd log:
> 2017-05-11T03:45:25.293Z|00026|ofproto_dpif|INFO|system@ovs-system: Datapath supports recirculation
> 2017-05-11T03:45:25.293Z|00027|ofproto_dpif|INFO|system@ovs-system: MPLS label stack length probed as 1
> 2017-05-11T03:45:25.293Z|00028|ofproto_dpif|INFO|system@ovs-system: Datapath supports unique flow ids
> 2017-05-11T03:45:25.293Z|00029|ofproto_dpif|INFO|system@ovs-system: Datapath supports ct_state
> 2017-05-11T03:45:25.293Z|00030|ofproto_dpif|INFO|system@ovs-system: Datapath supports ct_zone
> 2017-05-11T03:45:25.293Z|00031|ofproto_dpif|INFO|system@ovs-system: Datapath supports ct_mark
> 2017-05-11T03:45:25.293Z|00032|ofproto_dpif|INFO|system@ovs-system: Datapath supports ct_label
> 2017-05-11T03:45:25.364Z|1|ofproto_dpif_upcall(handler226)|INFO|received packet on unassociated datapath port 0
> 2017-05-11T03:45:25.368Z|00033|netdev_linux|WARN|ethtool command ETHTOOL_GFLAGS on network device br-eth failed: No such device
> 2017-05-11T03:45:25.368Z|00034|dpif|WARN|system@ovs-system: failed to add br-eth as port: No such device
> 2017-05-11T03:45:25.368Z|00035|bridge|INFO|bridge br-eth: using datapath ID 2a51cf9f2841
> 2017-05-11T03:45:25.368Z|00036|connmgr|INFO|br-eth: added service controller "punix:/var/run/openvswitch/br-eth.mgmt"
> 
> Change-Id: Ib5ead59bc91453f83549da89937c0d3607e0385e
> Signed-off-by: fukaige 

Thanks for identifying the problem and coming up with a fix.

The fix looks to me like it works at the wrong level of abstraction.
netdev-linux and tnl-port-map are not directly related, and I believe
that the same problem could arise with other kinds of netdevs and
tnl-port-map.  So, I think that it would be better if tnl-ports itself
registered a callback upon device destruction, for example via a call to
if_notifier_create() from tnl_port_map_init().

Thanks,

Ben.
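
To make the suggested layering concrete, here is a rough, self-contained
sketch of the idea: the code that caches device references registers its own
callback and drops the stale entry when told a device is gone.  All names are
hypothetical, and the callback shape is an assumption chosen for the example;
the real if_notifier_create() interface in OVS may look different.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CACHED 4

/* Hypothetical cache owned by the tunnel-port code: names of devices it
 * holds a reference to.  An empty string marks a free slot. */
static char cached[MAX_CACHED][16];

/* Hypothetical notifier: one callback invoked whenever a device goes away. */
typedef void device_gone_cb(const char *name);
static device_gone_cb *gone_cb;

void
notifier_register(device_gone_cb *cb)
{
    gone_cb = cb;
}

/* Called by the (hypothetical) link-event source on a delete event. */
void
notifier_device_deleted(const char *name)
{
    if (gone_cb) {
        gone_cb(name);
    }
}

/* The cache's own callback: drops any entry for the deleted device. */
void
cache_forget(const char *name)
{
    for (size_t i = 0; i < MAX_CACHED; i++) {
        if (name[0] && !strcmp(cached[i], name)) {
            cached[i][0] = '\0';
        }
    }
}

/* Adds 'name' to the cache.  Returns 0 on success, -1 if the cache is full
 * or the name is too long. */
int
cache_add(const char *name)
{
    if (strlen(name) >= sizeof cached[0]) {
        return -1;
    }
    for (size_t i = 0; i < MAX_CACHED; i++) {
        if (cached[i][0] == '\0') {
            strcpy(cached[i], name);
            return 0;
        }
    }
    return -1;
}

/* Returns 1 if 'name' is cached, 0 otherwise. */
int
cache_contains(const char *name)
{
    for (size_t i = 0; i < MAX_CACHED; i++) {
        if (name[0] && !strcmp(cached[i], name)) {
            return 1;
        }
    }
    return 0;
}
```

In this arrangement the device-specific code never needs to know about the
cache: it only reports the deletion, and whoever registered a callback cleans
up after itself.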


Re: [ovs-dev] [PATCH] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Aaron Conole
Darrell Ball  writes:

> On 6/8/17, 12:29 PM, "Flavio Leitner"  wrote:
>
> On Thu, Jun 08, 2017 at 06:54:24PM +, Darrell Ball wrote:
> > 
> > 
> > On 6/8/17, 11:22 AM, "Darrell Ball"  wrote:
> > 
> > 
> > 
> > On 6/8/17, 11:13 AM, "Flavio Leitner"  wrote:
> > 
> > On Thu, Jun 08, 2017 at 09:40:52AM -0400, Aaron Conole wrote:
> > > Hi Darrell,
> > > 
> > > Thanks so much for the review!  Comments below.
> > > 
> > > Darrell Ball  writes:
> > > 
> > > > > On 6/7/17, 3:46 PM, "Aaron Conole"  wrote:
> > > > >
> > > > > Since vhost-user server mode ports are the preferred mechanism for
> > > > > interconnecting Open vSwitch with VMs when using DPDK, and since there
> > > > > are currently no known use cases for vhost-user server mode ports apart
> > > > > from version incompatibilities with QEMU, announce that server mode ports
> > > > > are considered deprecated and will be removed in a future release.
> > > > > 
> > > > > Cc: Ciara Loftus 
> > > > > Cc: Kevin Traynor 
> > > > > Suggested-by: Darrell Ball 
> > > > > Signed-off-by: Aaron Conole 
> > > > > ---
> > > > >  Documentation/topics/dpdk/vhost-user.rst | 24 
> > > > >  NEWS |  2 ++
> > > > >  lib/netdev-dpdk.c|  2 ++
> > > > >  3 files changed, 20 insertions(+), 8 deletions(-)
> > > > > 
> > > > > diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst
> > > > > index a1c19fd..9d36cf2 100644
> > > > > --- a/Documentation/topics/dpdk/vhost-user.rst
> > > > > +++ b/Documentation/topics/dpdk/vhost-user.rst
> > > > > @@ -32,13 +32,19 @@ documentation`_ on same.
> > > > >  Quick Example
> > > > >  -
> > > > >  
> > > > > -This example demonstrates how to add two ``dpdkvhostuser`` ports to an existing
> > > > > -bridge called ``br0``::
> > > > > +This example demonstrates how to add two ``dpdkvhostuserclient`` ports to an
> > > > > +existing bridge called ``br0``::
> > > > >  
> > > > > -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
> > > > > --- set Interface dpdkvhostuser0 type=dpdkvhostuser
> > > > > -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
> > > > > --- set Interface dpdkvhostuser1 type=dpdkvhostuser
> > > > > +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
> > > > > +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
> > > > > +   options:vhost-server-path=/tmp/dpdkvhostclient0
> > > > > +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
> > > > > +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
> > > > > +   options:vhost-server-path=/tmp/dpdkvhostclient1
> > > > > +
> > > > > +For the above examples to work, an appropriate server socket must be created
> > > > > +at the paths specified (``/tmp/dpdkvhostclient0`` and
> > > > > +``/tmp/dpdkvhostclient0``).
> > > > >  
> > > > >  vhost-user vs. vhost-user-client
> > > > >  
> > > > > @@ -59,7 +65,8 @@ means if OVS dies, all VMs **must** be restarted. On the other hand, for
> > > > >  vhost-user-client ports, OVS acts as the client and QEMU the server. This means
> > > > >  OVS can die and be restarted without issue, and it is also possible to restart
> > > > >  an instance itself. For this reason, vhost-user-client ports are the preferred
> > > > > -type for most use cases.
> > > > > +type for most use cases.  Ports of type vhost-user are currently deprecated and
> > > > > +will be removed in a future release.
> > > > >
> > > > > type for all known use cases; the only limitation is that vhost-user client mode ports
> > > > > require QEMU version 2.7.  Ports of type vhost-user are currently deprecated and
> > > > > will be removed in a future release.
> > > 
> > > Will update with this verbiage.  Thanks.
> > > 
> > > >  .. _dpdk-vhost-user:
> > > >  
> > > > @@ -68,7 +75,8 @@ vhost-user
> 

Re: [ovs-dev] [PATCH 5/6] redhat: dynamically allocate and reference ovs user

2017-06-08 Thread Flavio Leitner
On Thu, Jun 08, 2017 at 03:46:24PM -0400, Aaron Conole wrote:
> Flavio Leitner  writes:
> 
> > On Sat, Jun 03, 2017 at 11:10:00AM -0400, Aaron Conole wrote:
> >> After this commit, the fedora RPM will create the openvswitch user, from the
> >> non-static pool, for use as an Open vSwitch daemon user.  This only happens
> >> on install - not upgrade.  This will be the default user:group
> >> combination for the openvswitch daemons.
> >> 
> >> To do this in a way that doesn't impact existing installations, the
> >> /etc/openvswitch directory will be created during the installation,
> >> rather than being provided as part of the rpm.
> >
> > In the previous patch you add the user configuration to the sysconfig
> > file and here it adds the same info again to another file which the
> > user might change, getting us back to state today but now with 2
> > files.
> 
> Sortof.  The idea is the openvswitch-pre file is like the default.conf
> you detail below.  I could remove the comment from the openvswitch
> config file, if you think it's causing confusion.  More follows.

/etc/sysconfig is meant to be changed by the user and the name
implies that it is executed before something, which something?


> > Perhaps we could adopt another approach that we have a default
> > recommended configuration and then a file where the user can
> > customize it?
> >
> > In this case we would create /etc/openvswitch/default.conf.
> >
> > If the user wants to change something, it replaces the variable in
> > /etc/sysconfig/openvswitch as it works today.  Since default.conf is
> > owned by the system, we can assume it's not edited by the user.
> 
> That's the assumption on openvswitch-pre, but I might have gotten the
> %files section wrong for it (meaning, I assume a user will not edit
> it).  I can call it /etc/openvswitch/default.conf if you want; that's
> probably better, actually - so v2 I will do that, regardless what form
> it takes.

OK, I assume that since it's /etc/sysconfig, the user could change.

> > Then we ship /etc/openvswitch/default.conf with
> > OVS_USER_ID="openvswitch:openvswitch" by default, so new installations
> > will have the file state correct in the rpmdb.
> 
> Is that not the case for the file I added?  I thought it was okay, but I
> might misunderstand the way the file flags work.

Sort of, because we have the config in three places now:
One could edit the systemd services to change the line
Environment="OVS_USER_ID=root:root", or the /etc/sysconfig/openvswitch
or /etc/sysconfig/openvswitch-pre

> > The %post appends to the end of /etc/sysconfig/openvswitch the
> > variable replacing the default user id to root.
> 
> Here are the issues I can think of:
> 
>  - The rpmdb now thinks that the openvswitch file has been modified by
>the user (so the %config part will be flagged even though the user
>did nothing).

Yup, then on the future upgrades, the config file should not be replaced
and that would happen with /etc/sysconfig/openvswitch-pre, right?


>  - The %post on upgrade will have to detect if the user has an
>OVS_USER_ID specified, and ignore it appropriately

Right.

>  - The file permissions have to change again (ie: we change them to in
>the %files section to be owned openvswitch:openvswitch, then have to
>change them in the %post for upgrade to be the correct
>user/group... but only if we know we are editing the ovs_user_id, I
>think).

You mean /etc/openvswitch or another files?


> I'm willing to try and rework this series to go in that direction, but I
> prefer doing it this way (setting the uid on new installs in %post)
> because it's an easy binary decision.  Is it new?  Okay, there's no way
> we can break something on the user, so change things around.  On the
> other hand, working from the opposite, we have to detect properly that
> the user's system won't be broken by the change.  Maybe there's a good
> enough regular expression / other test we can design.  I'll take any
> suggestion :)

Wouldn't the rpm change the /etc/openvswitch permissions back to
default when the user upgrades from a new installation?

> > Then on new installations we have /etc/openvswitch/default.conf with
> > the recommended system options, nothing on /etc/sysconfig/openvswitch,
> > no need to add root userid to the services.
> >
> > On upgrades, there will be the default.conf recommending to run as
> > user, and /etc/sysconfig/openvswitch changing to root which the
> > admin can comment out to move on, and system services are ok.
> >
> > What do you think? I am sure I missed something.
> 
> In one sense, it's probably more of a six-of-one and
> half-dozen-of-the-other thing.  Whichever we choose, the requirements,
> in my mind, are simple:
> 
> 1. Don't break existing users who are upgrading.
> 2. Provide new installs with non-root users, because it's a good
>security practice.
> 
> Did I make sense?

We share the same requirements.

fbl


> 
> > fbl
> >
> >
> >> 
> >> Signed-off

Re: [ovs-dev] [PATCH] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Darrell Ball


On 6/8/17, 12:29 PM, "Flavio Leitner"  wrote:

On Thu, Jun 08, 2017 at 06:54:24PM +, Darrell Ball wrote:
> 
> 
> On 6/8/17, 11:22 AM, "Darrell Ball"  wrote:
> 
> 
> 
> On 6/8/17, 11:13 AM, "Flavio Leitner"  wrote:
> 
> On Thu, Jun 08, 2017 at 09:40:52AM -0400, Aaron Conole wrote:
> > Hi Darrell,
> > 
> > Thanks so much for the review!  Comments below.
> > 
> > Darrell Ball  writes:
> > 
> > > On 6/7/17, 3:46 PM, "Aaron Conole"  wrote:
> > >
> > > > Since vhost-user server mode ports are the preferred mechanism for
> > > > interconnecting Open vSwitch with VMs when using DPDK, and since there
> > > > are currently no known use cases for vhost-user server mode ports apart
> > > > from version incompatibilities with QEMU, announce that server mode ports
> > > > are considered deprecated and will be removed in a future release.
> > > 
> > > Cc: Ciara Loftus 
> > > Cc: Kevin Traynor 
> > > Suggested-by: Darrell Ball 
> > > Signed-off-by: Aaron Conole 
> > > ---
> > >  Documentation/topics/dpdk/vhost-user.rst | 24 

> > >  NEWS |  2 ++
> > >  lib/netdev-dpdk.c|  2 ++
> > >  3 files changed, 20 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
> > > index a1c19fd..9d36cf2 100644
> > > --- a/Documentation/topics/dpdk/vhost-user.rst
> > > +++ b/Documentation/topics/dpdk/vhost-user.rst
> > > @@ -32,13 +32,19 @@ documentation`_ on same.
> > >  Quick Example
> > >  -
> > >  
> > > > -This example demonstrates how to add two ``dpdkvhostuser`` ports to an existing
> > > > -bridge called ``br0``::
> > > > +This example demonstrates how to add two ``dpdkvhostuserclient`` ports to an
> > > > +existing bridge called ``br0``::
> > > >  
> > > > -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
> > > > --- set Interface dpdkvhostuser0 type=dpdkvhostuser
> > > > -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
> > > > --- set Interface dpdkvhostuser1 type=dpdkvhostuser
> > > > +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
> > > > +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
> > > > +   options:vhost-server-path=/tmp/dpdkvhostclient0
> > > > +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
> > > > +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
> > > > +   options:vhost-server-path=/tmp/dpdkvhostclient1
> > > > +
> > > > +For the above examples to work, an appropriate server socket must be created
> > > > +at the paths specified (``/tmp/dpdkvhostclient0`` and
> > > > +``/tmp/dpdkvhostclient0``).
> > >  
> > >  vhost-user vs. vhost-user-client
> > >  
> > > @@ -59,7 +65,8 @@ means if OVS dies, all VMs **must** be 
restarted. On the other hand, for
> > >  vhost-user-client ports, OVS acts as the client and QEMU 
the server. This means
> > >  OVS can die and be restarted without issue, and it is 
also possible to restart
> > >  an instance itself. For this reason, vhost-user-client 
ports are the preferred
> > > -type for most use cases.
> > > +type for most use cases.  Ports of type vhost-user are 
currently deprecated and
> > > +will be removed in a future release.
> > >
> > > type for all known use cases; the only limitation is that 
vhost-user client mode ports
> > > require QEMU version 2.7.  Ports of type vhost-user are 
currently deprecated and
> > > will be removed in a future release.
> > 
> > Will update with this verbiage.  Thanks.
> > 
> > >  .. _dpdk-vhost-user:
> > >  
> > > @@ -68,7 +75,8 @@ vhost-user
> > >  
> > >  .. important::
> > >  
> > > -   Use of vhost-user ports requires QEMU >= 2.2
> > > +   Use of vhost-user ports requires QEMU >= 2.2;  
vhost-

Re: [ovs-dev] [PATCH net] openvswitch: warn about missing first netlink attribute

2017-06-08 Thread David Miller
From: Nicolas Dichtel 
Date: Thu,  8 Jun 2017 10:37:45 +0200

> The first netlink attribute (value 0) must always be defined
> as none/unspec.
> 
> Because we cannot change an existing UAPI, I add a comment to point the
> mistake and avoid to propagate it in a new ovs API in the future.
> 
> Signed-off-by: Nicolas Dichtel 

Ok, I agree, we don't want people cut-and-pasting this kind of thing
and hopefully this comment prevents that.

Applied, thanks.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 5/6] redhat: dynamically allocate and reference ovs user

2017-06-08 Thread Aaron Conole
Flavio Leitner  writes:

> On Sat, Jun 03, 2017 at 11:10:00AM -0400, Aaron Conole wrote:
>> After this commit, the fedora RPM will create the openvswitch user, from the
>> non-static pool, for use as an Open vSwitch daemon user.  This only happens
>> on install - not upgrade.  This will be the default user:group
>> combination for the openvswitch daemons.
>> 
>> To do this in a way that doesn't impact existing installations, the
>> /etc/openvswitch directory will be created during the installation,
>> rather than being provided as part of the rpm.
>
> In the previous patch you add the user configuration to the sysconfig
> file and here it adds the same info again to another file which the
> user might change, getting us back to state today but now with 2
> files.

Sort of.  The idea is the openvswitch-pre file is like the default.conf
you detail below.  I could remove the comment from the openvswitch
config file, if you think it's causing confusion.  More follows.
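To make the layering concrete, here is a rough shell sketch of how the value would resolve, assuming the unit files keep Environment="OVS_USER_ID=root:root" and then read openvswitch-pre before openvswitch, as in the patch. It only emulates the override order; resolve_user_id is a hypothetical helper, and it assumes plain unquoted KEY=value lines.

```shell
#!/bin/sh
# Sketch only: emulate Environment= followed by EnvironmentFile= entries.
# A later file overrides an earlier one; a missing file is skipped, which
# mirrors the optional "-" prefix used in the unit files.
# resolve_user_id is an illustrative name, not part of the patch.
resolve_user_id() {
    value="root:root"            # built-in default from Environment=
    for f in "$@"; do
        [ -f "$f" ] || continue  # optional file: ignore if absent
        # Take the last active assignment in this file, if any.
        line=$(grep -E '^OVS_USER_ID=' "$f" | tail -n 1)
        if [ -n "$line" ]; then
            value=${line#OVS_USER_ID=}
        fi
    done
    echo "$value"
}
```

Called as `resolve_user_id /etc/sysconfig/openvswitch-pre /etc/sysconfig/openvswitch`, the second file wins whenever it sets the variable, so an admin's explicit choice still overrides the packaged default.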

> Perhaps we could adopt another approach that we have a default
> recommended configuration and then a file where the user can
> customize it?
>
> In this case we would create /etc/openvswitch/default.conf.
>
> If the user wants to change something, it replaces the variable in
> /etc/sysconfig/openvswitch as it works today.  Since default.conf is
> owned by the system, we can assume it's not edited by the user.

That's the assumption on openvswitch-pre, but I might have gotten the
%files section wrong for it (meaning, I assume a user will not edit
it).  I can call it /etc/openvswitch/default.conf if you want; that's
probably better, actually - so v2 I will do that, regardless what form
it takes.

> Then we ship /etc/openvswitch/default.conf with
> OVS_USER_ID="openvswitch:openvswitch" by default, so new installations
> will have the file state correct in the rpmdb.

Is that not the case for the file I added?  I thought it was okay, but I
might misunderstand the way the file flags work.

> The %post appends to the end of /etc/sysconfig/openvswitch the
> variable replacing the default user id to root.

Here are the issues I can think of:

 - The rpmdb now thinks that the openvswitch file has been modified by
   the user (so the %config part will be flagged even though the user
   did nothing).
 - The %post on upgrade will have to detect if the user has an
   OVS_USER_ID specified, and ignore it appropriately
 - The file permissions have to change again (ie: we change them in
   the %files section to be owned openvswitch:openvswitch, then have to
   change them in the %post for upgrade to be the correct
   user/group... but only if we know we are editing the ovs_user_id, I
   think).
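For the second point, a minimal sketch of what such a detection could look like, assuming the sysconfig file uses ordinary shell-style assignments. has_user_id is a hypothetical helper name, and the pattern is only one guess at "good enough"; it treats commented-out lines as not set.

```shell
#!/bin/sh
# Sketch only: succeed if the given file contains an active (uncommented)
# OVS_USER_ID assignment, optionally prefixed with "export".
# has_user_id is an illustrative name, not an existing helper.
has_user_id() {
    [ -f "$1" ] &&
        grep -Eq '^[[:space:]]*(export[[:space:]]+)?OVS_USER_ID=' "$1"
}
```

An upgrade scriptlet could then do something like `has_user_id /etc/sysconfig/openvswitch || echo ...` to append a default only when the admin has not already chosen one.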

I'm willing to try and rework this series to go in that direction, but I
prefer doing it this way (setting the uid on new installs in %post)
because it's an easy binary decision.  Is it new?  Okay, there's no way
we can break something on the user, so change things around.  On the
other hand, working from the opposite, we have to detect properly that
the user's system won't be broken by the change.  Maybe there's a good
enough regular expression / other test we can design.  I'll take any
suggestion :)
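The binary decision I mean is the usual scriptlet check on $1, which rpm sets to 1 for a fresh install and to 2 or more for an upgrade. A small sketch, with post_action as a made-up name and the echoed strings standing in for the real work:

```shell
#!/bin/sh
# Sketch only: the install-versus-upgrade branch of a %post scriptlet.
# rpm passes the count of installed instances of the package as $1.
# post_action and its messages are illustrative placeholders.
post_action() {
    if [ "$1" -eq 1 ]; then
        # Fresh install: nothing pre-existing to break.
        echo "new-install: use openvswitch:openvswitch"
    else
        # Upgrade: leave the existing setup untouched.
        echo "upgrade: leave existing ownership alone"
    fi
}
```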

> Then on new installations we have /etc/openvswitch/default.conf with
> the recommended system options, nothing on /etc/sysconfig/openvswitch,
> no need to add root userid to the services.
>
> On upgrades, there will be the default.conf recommending to run as
> user, and /etc/sysconfig/openvswitch changing to root which the
> admin can comment out to move on, and system services are ok.
>
> What do you think? I am sure I missed something.

In one sense, it's probably more of a six-of-one and
half-dozen-of-the-other thing.  Whichever we choose, the requirements,
in my mind, are simple:

1. Don't break existing users who are upgrading.
2. Provide new installs with non-root users, because it's a good
   security practice.

Did I make sense?

> fbl
>
>
>> 
>> Signed-off-by: Aaron Conole 
>> ---
>>  rhel/openvswitch-fedora.spec.in  | 15 ++-
>>  rhel/usr_lib_systemd_system_ovs-vswitchd.service |  1 +
>>  rhel/usr_lib_systemd_system_ovsdb-server.service |  2 ++
>>  3 files changed, 17 insertions(+), 1 deletion(-)
>> 
>> diff --git a/rhel/openvswitch-fedora.spec.in 
>> b/rhel/openvswitch-fedora.spec.in
>> index fe6f15f..f4da735 100644
>> --- a/rhel/openvswitch-fedora.spec.in
>> +++ b/rhel/openvswitch-fedora.spec.in
>> @@ -92,6 +92,8 @@ Requires: openssl hostname iproute module-init-tools
>>  #Upstream kernel commit 4f647e0a3c37b8d5086214128614a136064110c3
>>  #Requires: kernel >= 3.15.0-0
>>  
>> +Requires(post): /usr/bin/getent
>> +Requires(post): /usr/sbin/useradd
>>  Requires(post): systemd-units
>>  Requires(preun): systemd-units
>>  Requires(postun): systemd-units
>> @@ -354,6 +356,16 @@ rm -rf $RPM_BUILD_ROOT
>>  %endif
>>  
>>  %post
>> +if [ $1 -eq 1 ]; then
>> +getent passwd openvswitch >/dev/null || \
>> +useradd -r -d / -s 

Re: [ovs-dev] [PATCH] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Flavio Leitner
On Thu, Jun 08, 2017 at 06:54:24PM +, Darrell Ball wrote:
> 
> 
> On 6/8/17, 11:22 AM, "Darrell Ball"  wrote:
> 
> 
> 
> On 6/8/17, 11:13 AM, "Flavio Leitner"  wrote:
> 
> On Thu, Jun 08, 2017 at 09:40:52AM -0400, Aaron Conole wrote:
> > Hi Darrell,
> > 
> > Thanks so much for the review!  Comments below.
> > 
> > Darrell Ball  writes:
> > 
> > > On 6/7/17, 3:46 PM, "Aaron Conole"  wrote:
> > >
> > > > Since vhost-user server mode ports are the preferred mechanism for
> > > > interconnecting Open vSwitch with VMs when using DPDK, and since there
> > > > are currently no known use cases for vhost-user server mode ports apart
> > > > from version incompatibilities with QEMU, announce that server mode ports
> > > > are considered deprecated and will be removed in a future release.
> > > 
> > > Cc: Ciara Loftus 
> > > Cc: Kevin Traynor 
> > > Suggested-by: Darrell Ball 
> > > Signed-off-by: Aaron Conole 
> > > ---
> > >  Documentation/topics/dpdk/vhost-user.rst | 24 
> 
> > >  NEWS |  2 ++
> > >  lib/netdev-dpdk.c|  2 ++
> > >  3 files changed, 20 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> b/Documentation/topics/dpdk/vhost-user.rst
> > > index a1c19fd..9d36cf2 100644
> > > --- a/Documentation/topics/dpdk/vhost-user.rst
> > > +++ b/Documentation/topics/dpdk/vhost-user.rst
> > > @@ -32,13 +32,19 @@ documentation`_ on same.
> > >  Quick Example
> > >  -
> > >  
> > > > -This example demonstrates how to add two ``dpdkvhostuser`` ports to an existing
> > > > -bridge called ``br0``::
> > > > +This example demonstrates how to add two ``dpdkvhostuserclient`` ports to an
> > > > +existing bridge called ``br0``::
> > > >  
> > > > -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
> > > > --- set Interface dpdkvhostuser0 type=dpdkvhostuser
> > > > -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
> > > > --- set Interface dpdkvhostuser1 type=dpdkvhostuser
> > > > +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
> > > > +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
> > > > +   options:vhost-server-path=/tmp/dpdkvhostclient0
> > > > +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
> > > > +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
> > > > +   options:vhost-server-path=/tmp/dpdkvhostclient1
> > > > +
> > > > +For the above examples to work, an appropriate server socket must be created
> > > > +at the paths specified (``/tmp/dpdkvhostclient0`` and
> > > > +``/tmp/dpdkvhostclient0``).
> > >  
> > >  vhost-user vs. vhost-user-client
> > >  
> > > @@ -59,7 +65,8 @@ means if OVS dies, all VMs **must** be 
> restarted. On the other hand, for
> > >  vhost-user-client ports, OVS acts as the client and QEMU the 
> server. This means
> > >  OVS can die and be restarted without issue, and it is also 
> possible to restart
> > >  an instance itself. For this reason, vhost-user-client ports 
> are the preferred
> > > -type for most use cases.
> > > +type for most use cases.  Ports of type vhost-user are 
> currently deprecated and
> > > +will be removed in a future release.
> > >
> > > type for all known use cases; the only limitation is that 
> vhost-user client mode ports
> > > require QEMU version 2.7.  Ports of type vhost-user are currently 
> deprecated and
> > > will be removed in a future release.
> > 
> > Will update with this verbiage.  Thanks.
> > 
> > >  .. _dpdk-vhost-user:
> > >  
> > > @@ -68,7 +75,8 @@ vhost-user
> > >  
> > >  .. important::
> > >  
> > > -   Use of vhost-user ports requires QEMU >= 2.2
> > > +   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user 
> ports are
> > > +   *deprecated*.
> > >  
> > >  To use vhost-user ports, you must first add said ports to 
> the switch. DPDK
> > >  vhost-user ports can have arbitrary names with the exception 
> of forward and
> > > diff --git a/NEWS b/NEWS
> > > index 82004c8..b81d033 100644

Re: [ovs-dev] [PATCH] testsuite: exit gracefully if it fails.

2017-06-08 Thread Flavio Leitner
On Thu, Jun 08, 2017 at 10:53:05AM -0700, Ben Pfaff wrote:
> On Thu, Jun 08, 2017 at 02:30:48PM -0300, Flavio Leitner wrote:
> > The daemon is killed leaving resources behind when a test fails.
> > This fixes to first signal the daemon to exit gracefully.
> > 
> > Suggested-by: Joe Stringer 
> > Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent")
> > Signed-off-by: Flavio Leitner 
> > ---
> >  tests/ofproto-macros.at | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
> > index faff5b0..5ac5d05 100644
> > --- a/tests/ofproto-macros.at
> > +++ b/tests/ofproto-macros.at
> > @@ -323,6 +323,9 @@ m4_define([_OVS_VSWITCHD_START],
> > AT_CHECK([ovs-vswitchd $1 --detach --no-chdir --pidfile --log-file 
> > -vvconn -vofproto_dpif -vunixctl], [0], [], [stderr])
> > AT_CAPTURE_FILE([ovs-vswitchd.log])
> > on_exit "kill `cat ovs-vswitchd.pid`"
> > +   dnl Wait for the daemon to exit gracefully
> > +   on_exit "for i in 1 2 3 4 5 6 7 8 9; do kill -0 `cat ovs-vswitchd.pid` 
> > || break; sleep 0.1 || sleep 1; done"
> > +   on_exit "ovs-appctl -t ovs-vswitchd exit --cleanup"
> 
> Thanks for the patch.
> 
> At first, I thought that this did the steps in the wrong order, but
> "on_exit" reverses the order. 

That's documented! :)
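For anyone reading along who has not run into that behavior, here is a minimal sketch of a last-in, first-out cleanup stack. It is not the testsuite's actual on_exit implementation; on_exit_sketch, run_cleanup_sketch, and CLEANUP_CMDS are hypothetical names used only to show the ordering.

```shell
#!/bin/sh
# Sketch only: a LIFO list of cleanup commands, one command per line.
# These names are illustrative and do not come from the OVS testsuite.
CLEANUP_CMDS=""

# Register a command; it is placed ahead of everything registered earlier.
on_exit_sketch() {
    CLEANUP_CMDS="$1
$CLEANUP_CMDS"
}

# Run all registered commands, most recently registered first.
run_cleanup_sketch() {
    printf '%s\n' "$CLEANUP_CMDS" | while IFS= read -r cmd; do
        if [ -n "$cmd" ]; then
            eval "$cmd"
        fi
    done
    return 0
}

# Same registration order as the patch: kill, then wait, then graceful exit.
on_exit_sketch "echo kill"
on_exit_sketch "echo wait"
on_exit_sketch "echo graceful-exit"
```

Running run_cleanup_sketch after these three registrations prints graceful-exit, then wait, then kill, which is the order the patch relies on.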


> It would be less surprising to do this with just one call to on_exit,
> e.g.
> 
> on_exit '
> ovs-appctl -t ovs-vswitchd exit --cleanup
> for i in 1 2 3 4 5 6 7 8 9; do
> kill -0 `cat ovs-vswitchd.pid` || break
> sleep 0.1 || sleep 1
> done
> kill `cat ovs-vswitchd.pid`
> '
> 
> Actually, I think that all of this could be put in a shell function:
> 
> kill_ovs_vswitchd() {
> ovs-appctl -t ovs-vswitchd exit --cleanup
> for i in 1 2 3 4 5 6 7 8 9; do
> kill -0 `cat ovs-vswitchd.pid` || break
> sleep 0.1 || sleep 1
> done
> kill `cat ovs-vswitchd.pid`
> }
> 
> and then just "on_exit kill_ovs_vswitchd".  Maybe that is the best
> approach.

I agree, let me respin the patch.
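As a rough idea of the shape, the sketch below generalizes the function so the pid and the graceful request are passed in. stop_gracefully is a hypothetical name, and taking arguments is an assumption for illustration, not necessarily what the respin will look like.

```shell
#!/bin/sh
# Sketch only: ask a process to stop, poll briefly, then fall back to kill.
# Usage: stop_gracefully PID COMMAND [ARGS...]
# Returns 0 if the process went away on its own, 1 if it had to be killed.
# stop_gracefully is an illustrative name.
stop_gracefully() {
    pid=$1
    shift
    # "$@" is the graceful request; its output is not interesting here.
    "$@" >/dev/null 2>&1
    for i in 1 2 3 4 5 6 7 8 9; do
        # kill -0 delivers no signal; it only checks the pid still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        # Fall back to whole seconds where sleep rejects fractions.
        sleep 0.1 2>/dev/null || sleep 1
    done
    # Still running after the polling window: send the default SIGTERM.
    kill "$pid" 2>/dev/null
    return 1
}
```

With the commands from the patch this would be called as `stop_gracefully "$(cat ovs-vswitchd.pid)" ovs-appctl -t ovs-vswitchd exit --cleanup`.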

-- 
Flavio



Re: [ovs-dev] [PATCH 5/6] redhat: dynamically allocate and reference ovs user

2017-06-08 Thread Flavio Leitner
On Sat, Jun 03, 2017 at 11:10:00AM -0400, Aaron Conole wrote:
> After this commit, the fedora RPM will create the openvswitch user, from the
> non-static pool, for use as an Open vSwitch daemon user.  This only happens
> on install - not upgrade.  This will be the default user:group
> combination for the openvswitch daemons.
> 
> To do this in a way that doesn't impact existing installations, the
> /etc/openvswitch directory will be created during the installation,
> rather than being provided as part of the rpm.

In the previous patch you add the user configuration to the sysconfig
file and here it adds the same info again to another file which the
user might change, getting us back to state today but now with 2
files.

Perhaps we could adopt another approach that we have a default
recommended configuration and then a file where the user can
customize it?

In this case we would create /etc/openvswitch/default.conf.

If the user wants to change something, it replaces the variable in
/etc/sysconfig/openvswitch as it works today.  Since default.conf is
owned by the system, we can assume it's not edited by the user.

Then we ship /etc/openvswitch/default.conf with
OVS_USER_ID="openvswitch:openvswitch" by default, so new installations
will have the file state correct in the rpmdb.

The %post appends to the end of /etc/sysconfig/openvswitch the
variable replacing the default user id to root.

Then on new installations we have /etc/openvswitch/default.conf with
the recommended system options, nothing on /etc/sysconfig/openvswitch,
no need to add root userid to the services.

On upgrades, there will be the default.conf recommending to run as
user, and /etc/sysconfig/openvswitch changing to root which the
admin can comment out to move on, and system services are ok.

What do you think? I am sure I missed something.

fbl


> 
> Signed-off-by: Aaron Conole 
> ---
>  rhel/openvswitch-fedora.spec.in  | 15 ++-
>  rhel/usr_lib_systemd_system_ovs-vswitchd.service |  1 +
>  rhel/usr_lib_systemd_system_ovsdb-server.service |  2 ++
>  3 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in
> index fe6f15f..f4da735 100644
> --- a/rhel/openvswitch-fedora.spec.in
> +++ b/rhel/openvswitch-fedora.spec.in
> @@ -92,6 +92,8 @@ Requires: openssl hostname iproute module-init-tools
>  #Upstream kernel commit 4f647e0a3c37b8d5086214128614a136064110c3
>  #Requires: kernel >= 3.15.0-0
>  
> +Requires(post): /usr/bin/getent
> +Requires(post): /usr/sbin/useradd
>  Requires(post): systemd-units
>  Requires(preun): systemd-units
>  Requires(postun): systemd-units
> @@ -354,6 +356,16 @@ rm -rf $RPM_BUILD_ROOT
>  %endif
>  
>  %post
> +if [ $1 -eq 1 ]; then
> +getent passwd openvswitch >/dev/null || \
> +useradd -r -d / -s /sbin/nologin -c "Open vSwitch Daemons" openvswitch
> +echo "OVS_USER_ID=openvswitch:openvswitch" > \
> + %{_sysconfdir}/sysconfig/openvswitch-pre
> +
> +# In the case of upgrade, this is not needed.
> +install -d -m 0755 -o openvswitch -g openvswitch /etc/openvswitch
> +fi
> +
>  %if 0%{?systemd_post:1}
>  %systemd_post %{name}.service
>  %else
> @@ -480,7 +492,8 @@ fi
>  %defattr(-,root,root)
>  %{_sysconfdir}/bash_completion.d/ovs-appctl-bashcomp.bash
>  %{_sysconfdir}/bash_completion.d/ovs-vsctl-bashcomp.bash
> -%dir %{_sysconfdir}/openvswitch
> +%ghost %{_sysconfdir}/openvswitch
> +%ghost %{_sysconfdir}/sysconfig/openvswitch-pre
>  %config %ghost %{_sysconfdir}/openvswitch/conf.db
>  %ghost %{_sysconfdir}/openvswitch/.conf.db.~lock~
>  %config %ghost %{_sysconfdir}/openvswitch/system-id.conf
> diff --git a/rhel/usr_lib_systemd_system_ovs-vswitchd.service b/rhel/usr_lib_systemd_system_ovs-vswitchd.service
> index d63bf4d..0434d20 100644
> --- a/rhel/usr_lib_systemd_system_ovs-vswitchd.service
> +++ b/rhel/usr_lib_systemd_system_ovs-vswitchd.service
> @@ -11,6 +11,7 @@ PartOf=openvswitch.service
>  Type=forking
>  Restart=on-failure
>  Environment="OVS_USER_ID=root:root"
> +EnvironmentFile=-/etc/sysconfig/openvswitch-pre
>  EnvironmentFile=-/etc/sysconfig/openvswitch
>  ExecStart=/usr/share/openvswitch/scripts/ovs-ctl \
>--no-ovsdb-server --no-monitor --system-id=random \
> diff --git a/rhel/usr_lib_systemd_system_ovsdb-server.service b/rhel/usr_lib_systemd_system_ovsdb-server.service
> index 67b50c8..8354087 100644
> --- a/rhel/usr_lib_systemd_system_ovsdb-server.service
> +++ b/rhel/usr_lib_systemd_system_ovsdb-server.service
> @@ -9,7 +9,9 @@ PartOf=openvswitch.service
>  Type=forking
>  Restart=on-failure
>  Environment="OVS_USER_ID=root:root"
> +EnvironmentFile=-/etc/sysconfig/openvswitch-pre
>  EnvironmentFile=-/etc/sysconfig/openvswitch
> +ExecStartPre=/usr/bin/chown ${OVS_USER_ID} /var/run/openvswitch
>  ExecStart=/usr/share/openvswitch/scripts/ovs-ctl \
>--no-ovs-vswitchd --no-monitor --system-id=random \
>--ovs-user=${OVS_USER_ID}

Re: [ovs-dev] [PATCH] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Aaron Conole
Hi Kevin,

Kevin Traynor  writes:

> On 06/07/2017 11:46 PM, Aaron Conole wrote:
>> Since vhost-user server mode ports are the preferred mechanism for
>> interconnecting Open vSwitch with VMs when using DPDK, and since there
>> are currently no known use cases for vhost-user server mode ports apart
>> from version incompatibilities with QEMU, announce that server mode ports
>> are considered deprecated and will be removed in a future release.
>> 
>> Cc: Ciara Loftus 
>> Cc: Kevin Traynor 
>> Suggested-by: Darrell Ball 
>> Signed-off-by: Aaron Conole 
>> ---
>>  Documentation/topics/dpdk/vhost-user.rst | 24 
>>  NEWS |  2 ++
>>  lib/netdev-dpdk.c|  2 ++
>>  3 files changed, 20 insertions(+), 8 deletions(-)
>> 
>> diff --git a/Documentation/topics/dpdk/vhost-user.rst
>> b/Documentation/topics/dpdk/vhost-user.rst
>> index a1c19fd..9d36cf2 100644
>> --- a/Documentation/topics/dpdk/vhost-user.rst
>> +++ b/Documentation/topics/dpdk/vhost-user.rst
>> @@ -32,13 +32,19 @@ documentation`_ on same.
>>  Quick Example
>>  -
>>  
>> -This example demonstrates how to add two ``dpdkvhostuser`` ports to an 
>> existing
>> -bridge called ``br0``::
>> +This example demonstrates how to add two ``dpdkvhostuserclient`` ports to an
>> +existing bridge called ``br0``::
>>  
>> -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
>> --- set Interface dpdkvhostuser0 type=dpdkvhostuser
>> -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
>> --- set Interface dpdkvhostuser1 type=dpdkvhostuser
>> +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
>> +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
>> +   options:vhost-server-path=/tmp/dpdkvhostclient0
>> +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
>> +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
>> +   options:vhost-server-path=/tmp/dpdkvhostclient1
>> +
>> +For the above examples to work, an appropriate server socket must be created
>> +at the paths specified (``/tmp/dpdkvhostclient0`` and
>> +``/tmp/dpdkvhostclient0``).
>
> You could mention QEMU here. So the reader knows where to look.
> "These can be created by QEMU. See below for details."?

Good idea.  I'll add it.

Thanks for the review!

>>  vhost-user vs. vhost-user-client
>>  
>> @@ -59,7 +65,8 @@ means if OVS dies, all VMs **must** be restarted. On the 
>> other hand, for
>>  vhost-user-client ports, OVS acts as the client and QEMU the server. This 
>> means
>>  OVS can die and be restarted without issue, and it is also possible to 
>> restart
>>  an instance itself. For this reason, vhost-user-client ports are the 
>> preferred
>> -type for most use cases.
>> +type for most use cases.  Ports of type vhost-user are currently deprecated 
>> and
>> +will be removed in a future release.
>>  
>>  .. _dpdk-vhost-user:
>>  
>> @@ -68,7 +75,8 @@ vhost-user
>>  
>>  .. important::
>>  
>> -   Use of vhost-user ports requires QEMU >= 2.2
>> +   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user ports are
>> +   *deprecated*.
>>  
>>  To use vhost-user ports, you must first add said ports to the switch. DPDK
>>  vhost-user ports can have arbitrary names with the exception of forward and
>> diff --git a/NEWS b/NEWS
>> index 82004c8..b81d033 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -16,6 +16,8 @@ Post-v2.7.0
>> Log level can be changed in a usual OVS way using
>> 'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
>> still can be configured via extra arguments for DPDK EAL.
>> + * dpdkvhostuser ports are marked as deprecated.  They will be removed
>> +   in an upcoming release.
>> - IPFIX now provides additional counters:
>>   * Total counters since metering process startup.
>>   * Per-flow TCP flag counters.
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index b770b70..9ab4aeb 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -966,6 +966,8 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
>>  err = vhost_common_construct(netdev);
>>  
>>  ovs_mutex_unlock(&dpdk_mutex);
>> +VLOG_WARN_ONCE("dpdkvhostuser ports are considered deprecated;  "
>> +   "please migrate to dpdkvhostuserclient ports.");
>>  return err;
>>  }
>>  
>> 


Re: [ovs-dev] [PATCH] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Darrell Ball


On 6/8/17, 11:22 AM, "Darrell Ball"  wrote:



On 6/8/17, 11:13 AM, "Flavio Leitner"  wrote:

On Thu, Jun 08, 2017 at 09:40:52AM -0400, Aaron Conole wrote:
> Hi Darrell,
> 
> Thanks so much for the review!  Comments below.
> 
> Darrell Ball  writes:
> 
> > On 6/7/17, 3:46 PM, "Aaron Conole"  wrote:
> >
> > > Since vhost-user server mode ports are the preferred mechanism for
> > > interconnecting Open vSwitch with VMs when using DPDK, and since there
> > > are currently no known use cases for vhost-user server mode ports apart
> > > from version incompatibilities with QEMU, announce that server mode ports
> > > are considered deprecated and will be removed in a future release.
> > 
> > Cc: Ciara Loftus 
> > Cc: Kevin Traynor 
> > Suggested-by: Darrell Ball 
> > Signed-off-by: Aaron Conole 
> > ---
> >  Documentation/topics/dpdk/vhost-user.rst | 24 

> >  NEWS |  2 ++
> >  lib/netdev-dpdk.c|  2 ++
> >  3 files changed, 20 insertions(+), 8 deletions(-)
> > 
> > diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
> > index a1c19fd..9d36cf2 100644
> > --- a/Documentation/topics/dpdk/vhost-user.rst
> > +++ b/Documentation/topics/dpdk/vhost-user.rst
> > @@ -32,13 +32,19 @@ documentation`_ on same.
> >  Quick Example
> >  -
> >  
> > > -This example demonstrates how to add two ``dpdkvhostuser`` ports to an existing
> > > -bridge called ``br0``::
> > > +This example demonstrates how to add two ``dpdkvhostuserclient`` ports to an
> > > +existing bridge called ``br0``::
> > >  
> > > -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
> > > --- set Interface dpdkvhostuser0 type=dpdkvhostuser
> > > -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
> > > --- set Interface dpdkvhostuser1 type=dpdkvhostuser
> > > +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
> > > +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
> > > +   options:vhost-server-path=/tmp/dpdkvhostclient0
> > > +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
> > > +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
> > > +   options:vhost-server-path=/tmp/dpdkvhostclient1
> > > +
> > > +For the above examples to work, an appropriate server socket must be created
> > > +at the paths specified (``/tmp/dpdkvhostclient0`` and
> > > +``/tmp/dpdkvhostclient0``).
> >  
> >  vhost-user vs. vhost-user-client
> >  
> > @@ -59,7 +65,8 @@ means if OVS dies, all VMs **must** be 
restarted. On the other hand, for
> >  vhost-user-client ports, OVS acts as the client and QEMU the 
server. This means
> >  OVS can die and be restarted without issue, and it is also 
possible to restart
> >  an instance itself. For this reason, vhost-user-client ports 
are the preferred
> > -type for most use cases.
> > +type for most use cases.  Ports of type vhost-user are 
currently deprecated and
> > +will be removed in a future release.
> >
> > type for all known use cases; the only limitation is that 
vhost-user client mode ports
> > require QEMU version 2.7.  Ports of type vhost-user are currently 
deprecated and
> > will be removed in a future release.
> 
> Will update with this verbiage.  Thanks.
> 
> >  .. _dpdk-vhost-user:
> >  
> > @@ -68,7 +75,8 @@ vhost-user
> >  
> >  .. important::
> >  
> > -   Use of vhost-user ports requires QEMU >= 2.2
> > +   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user 
ports are
> > +   *deprecated*.
> >  
> >  To use vhost-user ports, you must first add said ports to the 
switch. DPDK
> >  vhost-user ports can have arbitrary names with the exception 
of forward and
> > diff --git a/NEWS b/NEWS
> > index 82004c8..b81d033 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -16,6 +16,8 @@ Post-v2.7.0
> > Log level can be changed in a usual OVS way using
> > 'ovs-appctl vlog' commands for 'dpdk' module. Lower 
bound
> > still c

Re: [ovs-dev] [PATCH] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Darrell Ball


On 6/8/17, 11:13 AM, "Flavio Leitner"  wrote:

On Thu, Jun 08, 2017 at 09:40:52AM -0400, Aaron Conole wrote:
> Hi Darrell,
> 
> Thanks so much for the review!  Comments below.
> 
> Darrell Ball  writes:
> 
> > On 6/7/17, 3:46 PM, "Aaron Conole"  wrote:
> >
> > > Since vhost-user server mode ports are the preferred mechanism for
> > > interconnecting Open vSwitch with VMs when using DPDK, and since there
> > > are currently no known use cases for vhost-user server mode ports apart
> > > from version incompatibilities with QEMU, announce that server mode ports
> > > are considered deprecated and will be removed in a future release.
> > 
> > Cc: Ciara Loftus 
> > Cc: Kevin Traynor 
> > Suggested-by: Darrell Ball 
> > Signed-off-by: Aaron Conole 
> > ---
> >  Documentation/topics/dpdk/vhost-user.rst | 24 

> >  NEWS |  2 ++
> >  lib/netdev-dpdk.c|  2 ++
> >  3 files changed, 20 insertions(+), 8 deletions(-)
> > 
> > diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
> > index a1c19fd..9d36cf2 100644
> > --- a/Documentation/topics/dpdk/vhost-user.rst
> > +++ b/Documentation/topics/dpdk/vhost-user.rst
> > @@ -32,13 +32,19 @@ documentation`_ on same.
> >  Quick Example
> >  -
> >  
> > > -This example demonstrates how to add two ``dpdkvhostuser`` ports to an existing
> > > -bridge called ``br0``::
> > > +This example demonstrates how to add two ``dpdkvhostuserclient`` ports to an
> > > +existing bridge called ``br0``::
> > >  
> > > -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
> > > --- set Interface dpdkvhostuser0 type=dpdkvhostuser
> > > -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
> > > --- set Interface dpdkvhostuser1 type=dpdkvhostuser
> > > +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
> > > +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
> > > +   options:vhost-server-path=/tmp/dpdkvhostclient0
> > > +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
> > > +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
> > > +   options:vhost-server-path=/tmp/dpdkvhostclient1
> > > +
> > > +For the above examples to work, an appropriate server socket must be created
> > > +at the paths specified (``/tmp/dpdkvhostclient0`` and
> > > +``/tmp/dpdkvhostclient0``).
> >  
> >  vhost-user vs. vhost-user-client
> >  
> > @@ -59,7 +65,8 @@ means if OVS dies, all VMs **must** be restarted. 
On the other hand, for
> >  vhost-user-client ports, OVS acts as the client and QEMU the 
server. This means
> >  OVS can die and be restarted without issue, and it is also 
possible to restart
> >  an instance itself. For this reason, vhost-user-client ports are 
the preferred
> > -type for most use cases.
> > +type for most use cases.  Ports of type vhost-user are currently 
deprecated and
> > +will be removed in a future release.
> >
> > type for all known use cases; the only limitation is that vhost-user 
client mode ports
> > require QEMU version 2.7.  Ports of type vhost-user are currently 
deprecated and
> > will be removed in a future release.
> 
> Will update with this verbiage.  Thanks.
> 
> >  .. _dpdk-vhost-user:
> >  
> > @@ -68,7 +75,8 @@ vhost-user
> >  
> >  .. important::
> >  
> > -   Use of vhost-user ports requires QEMU >= 2.2
> > +   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user ports 
are
> > +   *deprecated*.
> >  
> >  To use vhost-user ports, you must first add said ports to the 
switch. DPDK
> >  vhost-user ports can have arbitrary names with the exception of 
forward and
> > diff --git a/NEWS b/NEWS
> > index 82004c8..b81d033 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -16,6 +16,8 @@ Post-v2.7.0
> > Log level can be changed in a usual OVS way using
> > 'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
> > still can be configured via extra arguments for DPDK EAL.
> > + * dpdkvhostuser ports are marked as deprecated.  They will be 
removed
> > +   in an upcoming release.
> > - IPFIX now provides additional counters:
> >   * Total counters since metering process startup.
> >   * Per-flow TCP flag counters.
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
  

Re: [ovs-dev] [PATCH] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Flavio Leitner
On Thu, Jun 08, 2017 at 09:40:52AM -0400, Aaron Conole wrote:
> Hi Darrell,
> 
> Thanks so much for the review!  Comments below.
> 
> Darrell Ball  writes:
> 
> > On 6/7/17, 3:46 PM, "Aaron Conole"  wrote:
> >
> > Since vhost-user client mode ports are the preferred mechanism for
> > interconnecting Open vSwitch with VMs when using DPDK, and since there
> > are currently no known use cases for vhost-user server mode ports apart
> > from version incompatibilities with QEMU, announce that server mode 
> > ports
> > are considered deprecated and will be removed in a future release.
> > 
> > Cc: Ciara Loftus 
> > Cc: Kevin Traynor 
> > Suggested-by: Darrell Ball 
> > Signed-off-by: Aaron Conole 
> > ---
> >  Documentation/topics/dpdk/vhost-user.rst | 24 
> >  NEWS |  2 ++
> >  lib/netdev-dpdk.c|  2 ++
> >  3 files changed, 20 insertions(+), 8 deletions(-)
> > 
> > diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> > b/Documentation/topics/dpdk/vhost-user.rst
> > index a1c19fd..9d36cf2 100644
> > --- a/Documentation/topics/dpdk/vhost-user.rst
> > +++ b/Documentation/topics/dpdk/vhost-user.rst
> > @@ -32,13 +32,19 @@ documentation`_ on same.
> >  Quick Example
> >  -------------
> >  
> > -This example demonstrates how to add two ``dpdkvhostuser`` ports to an 
> > existing
> > -bridge called ``br0``::
> > +This example demonstrates how to add two ``dpdkvhostuserclient`` ports 
> > to an
> > +existing bridge called ``br0``::
> >  
> > -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
> > --- set Interface dpdkvhostuser0 type=dpdkvhostuser
> > -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
> > --- set Interface dpdkvhostuser1 type=dpdkvhostuser
> > +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
> > +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
> > +   options:vhost-server-path=/tmp/dpdkvhostclient0
> > +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
> > +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
> > +   options:vhost-server-path=/tmp/dpdkvhostclient1
> > +
> > +For the above examples to work, an appropriate server socket must be created
> > +at the paths specified (``/tmp/dpdkvhostclient0`` and
> > +``/tmp/dpdkvhostclient1``).
> >  
> >  vhost-user vs. vhost-user-client
> >  
> > @@ -59,7 +65,8 @@ means if OVS dies, all VMs **must** be restarted. On 
> > the other hand, for
> >  vhost-user-client ports, OVS acts as the client and QEMU the server. 
> > This means
> >  OVS can die and be restarted without issue, and it is also possible to 
> > restart
> >  an instance itself. For this reason, vhost-user-client ports are the 
> > preferred
> > -type for most use cases.
> > +type for most use cases.  Ports of type vhost-user are currently 
> > deprecated and
> > +will be removed in a future release.
> >
> > type for all known use cases; the only limitation is that vhost-user client 
> > mode ports
> > require QEMU version 2.7.  Ports of type vhost-user are currently 
> > deprecated and
> > will be removed in a future release.
> 
> Will update with this verbiage.  Thanks.
> 
> >  .. _dpdk-vhost-user:
> >  
> > @@ -68,7 +75,8 @@ vhost-user
> >  
> >  .. important::
> >  
> > -   Use of vhost-user ports requires QEMU >= 2.2
> > +   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user ports are
> > +   *deprecated*.
> >  
> >  To use vhost-user ports, you must first add said ports to the switch. 
> > DPDK
> >  vhost-user ports can have arbitrary names with the exception of 
> > forward and
> > diff --git a/NEWS b/NEWS
> > index 82004c8..b81d033 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -16,6 +16,8 @@ Post-v2.7.0
> > Log level can be changed in a usual OVS way using
> > 'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
> > still can be configured via extra arguments for DPDK EAL.
> > + * dpdkvhostuser ports are marked as deprecated.  They will be 
> > removed
> > +   in an upcoming release.
> > - IPFIX now provides additional counters:
> >   * Total counters since metering process startup.
> >   * Per-flow TCP flag counters.
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index b770b70..9ab4aeb 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -966,6 +966,8 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
> >  err = vhost_common_construct(netdev);
> >  
> >  ovs_mutex_unlock(&dpdk_mutex);
> > +VLOG_WARN_ONCE("dpdkvhostuser ports are considered deprecated;  "
> > +   

Re: [ovs-dev] [PATCH] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Darrell Ball


On 6/8/17, 6:40 AM, "Aaron Conole"  wrote:

Hi Darrell,

Thanks so much for the review!  Comments below.

Darrell Ball  writes:

> On 6/7/17, 3:46 PM, "Aaron Conole"  wrote:
>
> Since vhost-user client mode ports are the preferred mechanism for
> interconnecting Open vSwitch with VMs when using DPDK, and since there
> are currently no known use cases for vhost-user server mode ports apart
> from version incompatibilities with QEMU, announce that server mode ports
> are considered deprecated and will be removed in a future release.
> 
> Cc: Ciara Loftus 
> Cc: Kevin Traynor 
> Suggested-by: Darrell Ball 
> Signed-off-by: Aaron Conole 
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 24 

>  NEWS |  2 ++
>  lib/netdev-dpdk.c|  2 ++
>  3 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
> index a1c19fd..9d36cf2 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -32,13 +32,19 @@ documentation`_ on same.
>  Quick Example
>  -------------
>  
> -This example demonstrates how to add two ``dpdkvhostuser`` ports to 
an existing
> -bridge called ``br0``::
> +This example demonstrates how to add two ``dpdkvhostuserclient`` 
ports to an
> +existing bridge called ``br0``::
>  
> -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
> --- set Interface dpdkvhostuser0 type=dpdkvhostuser
> -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
> --- set Interface dpdkvhostuser1 type=dpdkvhostuser
> +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
> +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
> +   options:vhost-server-path=/tmp/dpdkvhostclient0
> +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
> +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
> +   options:vhost-server-path=/tmp/dpdkvhostclient1
> +
> +For the above examples to work, an appropriate server socket must be created
> +at the paths specified (``/tmp/dpdkvhostclient0`` and
> +``/tmp/dpdkvhostclient1``).
>  
>  vhost-user vs. vhost-user-client
>  
> @@ -59,7 +65,8 @@ means if OVS dies, all VMs **must** be restarted. 
On the other hand, for
>  vhost-user-client ports, OVS acts as the client and QEMU the server. 
This means
>  OVS can die and be restarted without issue, and it is also possible 
to restart
>  an instance itself. For this reason, vhost-user-client ports are the 
preferred
> -type for most use cases.
> +type for most use cases.  Ports of type vhost-user are currently 
deprecated and
> +will be removed in a future release.
>
> type for all known use cases; the only limitation is that vhost-user 
client mode ports
> require QEMU version 2.7.  Ports of type vhost-user are currently 
deprecated and
> will be removed in a future release.

Will update with this verbiage.  Thanks.

>  .. _dpdk-vhost-user:
>  
> @@ -68,7 +75,8 @@ vhost-user
>  
>  .. important::
>  
> -   Use of vhost-user ports requires QEMU >= 2.2
> +   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user ports 
are
> +   *deprecated*.
>  
>  To use vhost-user ports, you must first add said ports to the 
switch. DPDK
>  vhost-user ports can have arbitrary names with the exception of 
forward and
> diff --git a/NEWS b/NEWS
> index 82004c8..b81d033 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -16,6 +16,8 @@ Post-v2.7.0
> Log level can be changed in a usual OVS way using
> 'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
> still can be configured via extra arguments for DPDK EAL.
> + * dpdkvhostuser ports are marked as deprecated.  They will be 
removed
> +   in an upcoming release.
> - IPFIX now provides additional counters:
>   * Total counters since metering process startup.
>   * Per-flow TCP flag counters.
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index b770b70..9ab4aeb 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -966,6 +966,8 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
>  err = vhost_common_construct(netdev);
>  
>   

Re: [ovs-dev] [PATCH] dpif-netdev: Fix insertion probability

2017-06-08 Thread Kevin Traynor
Hi All, any objections to this fix being merged?

On 05/17/2017 09:28 AM, Ciara Loftus wrote:
> emc_conditional_insert uses pmd->last_cycles and the packet's RSS hash
> to generate a random number used to determine whether or not an emc
> entry should be inserted. This works for single-packet bursts as
> last_cycles is updated for each burst. However, for bursts > 1 packet,
> where the packets in the batch generate the same RSS hash,
> pmd->last_cycles remains constant for the entire burst also, and thus
> cannot be used as a random number for each packet in the burst.
> 
> This commit replaces the use of pmd->last_cycles with random_uint32()
> for this purpose and subsequently fixes the behavior of the
> emc_insert_inv_prob setting for high-throughput (large bursts)
> single-flow cases.
> 
> Fixes: 4c30b24602c3 ("dpif-netdev: Conditional EMC insert")
> Reported-by: Kevin Traynor 
> Signed-off-by: Ciara Loftus 
> ---
>  lib/dpif-netdev.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index d215156..ab1e26e 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -2037,11 +2037,7 @@ emc_probabilistic_insert(struct dp_netdev_pmd_thread *pmd,
>  uint32_t min;
>  atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
>  
> -#ifdef DPDK_NETDEV
> -if (min && (key->hash ^ (uint32_t) pmd->last_cycles) <= min) {
> -#else
>  if (min && (key->hash ^ random_uint32()) <= min) {
> -#endif
>  emc_insert(&pmd->flow_cache, key, flow);
>  }
>  }
> 
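
To illustrate the problem described in the commit message, here is a minimal
shell sketch (not OVS code). The helper name emc_should_insert and all sample
values are hypothetical; the only thing taken from the patch is the comparison
(hash ^ salt) <= min. When the salt stays constant for a whole burst and all
packets share one RSS hash, every packet in the burst gets the same decision;
with a salt that changes per packet, decisions can differ within the burst.

```sh
#!/bin/sh
# Hypothetical helper mirroring the patch's test: (hash ^ salt) <= min.
# Usage: emc_should_insert HASH SALT MIN
# Succeeds (exit status 0) when the entry would be inserted.
emc_should_insert() {
    [ "$3" -ne 0 ] && [ $(( ($1 ^ $2) & 4294967295 )) -le "$3" ]
}

hash=305419896     # same RSS hash for every packet of a single flow
min=1000000000     # illustrative threshold, not a real emc_insert_min value

# Salt that stays constant for the whole burst (like pmd->last_cycles):
# all four packets get the same answer.
for pkt in 1 2 3 4; do
    if emc_should_insert "$hash" 42 "$min"; then
        echo "constant salt, packet $pkt: insert"
    else
        echo "constant salt, packet $pkt: skip"
    fi
done

# Salt that differs per packet (a stand-in for random_uint32()):
# decisions can now differ within the burst.
for salt in 305419000 4000000000 305000000 1234; do
    if emc_should_insert "$hash" "$salt" "$min"; then
        echo "per-packet salt $salt: insert"
    else
        echo "per-packet salt $salt: skip"
    fi
done
```

The fixed salts in the second loop only stand in for random values so that the
sketch stays deterministic.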

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v2 6/6] vxlan: allow multiple VXLANs with same VNI for IPv6 link-local addresses

2017-06-08 Thread Matthias Schiffer
On 04/16/2017 05:15 PM, Matthias Schiffer wrote:
> On 04/14/2017 07:38 PM, Stephen Hemminger wrote:
>> On Fri, 14 Apr 2017 18:44:46 +0200
>> Matthias Schiffer  wrote:
>>
>>> As link-local addresses are only valid for a single interface, we can allow
>>> the same VNI to be used for multiple independent VXLANs, as long as the used
>>> interfaces are distinct. This way, VXLANs can always be used as a drop-in
>>> replacement for VLANs with greater ID space.
>>>
>>> This also extends VNI lookup to respect the ifindex when link-local IPv6
>>> addresses are used, so using the same VNI on multiple interfaces can
>>> actually work.
>>>
>>> Signed-off-by: Matthias Schiffer 
>>
>> Why does this have to be IPv6 specific?
> 
> I'm not familiar with IPv4 link-local addresses and how route lookup works
> for them. vxlan_get_route() sets flowi4_oif to the outgoing interface; does
> __ip_route_output_key_hash() do the right thing for link-local addresses
> when such addresses are used on multiple interfaces? I see some special
> casing for multicast destinations, but none for link-local ones.
> 

Getting back to this (sorry for the delay, I got caught up in other
projects), I'm seeing the following pros and cons regarding the support of
VXLAN over IPv4 link-local addresses:

+ There should be no technical reason not to support it; as everything is
in the kernel, the usual problems with IPv4 LL (userspace APIs not
supporting passing a scope ID as part of the IP address) don't apply here

+ The code needed to support IPv4 LL should be easy to add

- IPv4 LL semantics aren't as well-defined as for IPv6. While IPv4 LL
addresses are usually in the 169.254.x.y range, the Linux kernel allows
setting the address scope independently of the range for IPv4. In contrast
to this, we need to judge the validity of the configuration based on
syntactic properties of the IP addresses (at least if we don't want to add
a lot more complexity to the validation, and probably other parts of the
code). Generally, code that checks for the 169.254.x.y range is uncommon in
the kernel (I think I only found a single instance, somewhere in the SCTP
implementation).

- IPv4 LL addresses are mostly used for zeroconf; I don't really see a
use case for zeroconf addresses + VXLANs

- Personally, I have no interest in IPv4
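
As a rough illustration of what judging an address by its "syntactic
properties" means, here is a small userspace shell sketch. It is not kernel
code, and the helper name is_ipv4_link_local is made up for this example. It
only looks at the textual 169.254.x.y prefix, so it cannot see an address
scope that was configured independently of the range, which is exactly the
limitation described above.

```sh
#!/bin/sh
# Hypothetical helper: purely textual check for the 169.254.0.0/16 range.
# It does not validate the octets and knows nothing about address scope.
is_ipv4_link_local() {
    case "$1" in
        169.254.*.*) return 0 ;;
        *)           return 1 ;;
    esac
}

for addr in 169.254.10.1 192.168.1.1 10.0.0.1; do
    if is_ipv4_link_local "$addr"; then
        echo "$addr: in 169.254.0.0/16"
    else
        echo "$addr: not in 169.254.0.0/16"
    fi
done
```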


I probably forgot a few more arguments... All in all, I'd like the VXLAN
maintainers to decide if we do want IPv4 LL support or not, and if the
verdict is to support it, I'll implement it in the next revision of my
patchset.

Matthias



Re: [ovs-dev] [PATCH v2] datapath-windows: Add original conntrack tuple to FlowKey

2017-06-08 Thread Guru Shetty
On 2 June 2017 at 10:37, Sairam Venugopal  wrote:

> Add the original tuple to Flow Key. In case of ICMP and UDP, default the
> parent entry to NULL until related connections are supported.
>
> Signed-off-by: Sairam Venugopal 
>
Applied, thanks!


> ---
>  datapath-windows/ovsext/Conntrack.c  | 35 -
>  datapath-windows/ovsext/DpInternal.h |  1 +
>  datapath-windows/ovsext/Flow.c   | 50 ++
> --
>  3 files changed, 78 insertions(+), 8 deletions(-)
>
> diff --git a/datapath-windows/ovsext/Conntrack.c
> b/datapath-windows/ovsext/Conntrack.c
> index dce0c1b..609ae5a 100644
> --- a/datapath-windows/ovsext/Conntrack.c
> +++ b/datapath-windows/ovsext/Conntrack.c
> @@ -198,7 +198,7 @@ OvsCtEntryCreate(PNET_BUFFER_LIST curNbl,
>  }
>
>  state |= OVS_CS_F_NEW;
> -POVS_CT_ENTRY parentEntry = NULL;
> +POVS_CT_ENTRY parentEntry;
>  parentEntry = OvsCtRelatedLookup(ctx->key, currentTime);
>  if (parentEntry != NULL) {
>  state |= OVS_CS_F_RELATED;
> @@ -209,10 +209,10 @@ OvsCtEntryCreate(PNET_BUFFER_LIST curNbl,
>  if (!entry) {
>  return NULL;
>  }
> -/* If this is related entry, then update parent */
> -if (parentEntry != NULL) {
> -entry->parent = parentEntry;
> -}
> +
> +/* Set parent entry for related FTP connections */
> +entry->parent = parentEntry;
> +
>  OvsCtAddEntry(entry, ctx, currentTime);
>  *entryCreated = TRUE;
>  }
> @@ -235,6 +235,9 @@ OvsCtEntryCreate(PNET_BUFFER_LIST curNbl,
>  if (!entry) {
>  return NULL;
>  }
> +
> +/* XXX Add support for ICMP-Related */
> +entry->parent = NULL;
>  OvsCtAddEntry(entry, ctx, currentTime);
>  *entryCreated = TRUE;
>  }
> @@ -250,6 +253,9 @@ OvsCtEntryCreate(PNET_BUFFER_LIST curNbl,
>  if (!entry) {
>  return NULL;
>  }
> +
> +/* Default UDP related to NULL until TFTP is supported */
> +entry->parent = NULL;
>  OvsCtAddEntry(entry, ctx, currentTime);
>  *entryCreated = TRUE;
>  }
> @@ -586,8 +592,8 @@ OvsProcessConntrackEntry(PNET_BUFFER_LIST curNbl,
>  } else {
>  POVS_CT_ENTRY parentEntry;
>  parentEntry = OvsCtRelatedLookup(ctx->key, currentTime);
> +entry->parent = parentEntry;
>  if (parentEntry != NULL) {
> -entry->parent = parentEntry;
>  state |= OVS_CS_F_RELATED;
>  }
>  }
> @@ -702,6 +708,23 @@ OvsCtExecute_(PNET_BUFFER_LIST curNbl,
>  }
>  }
>
> +/* Add original tuple information to flow Key */
> +if (entry && entry->key.dl_type == ntohs(ETH_TYPE_IPV4)) {
> +OVS_CT_KEY *ctKey;
> +if (entry->parent != NULL) {
> +POVS_CT_ENTRY parent = entry->parent;
> +ctKey = &parent->key;
> +} else {
> +ctKey = &entry->key;
> +}
> +
> +key->ct.tuple_ipv4.ipv4_src = ctKey->src.addr.ipv4_aligned;
> +key->ct.tuple_ipv4.ipv4_dst = ctKey->dst.addr.ipv4_aligned;
> +key->ct.tuple_ipv4.src_port = ctKey->src.port;
> +key->ct.tuple_ipv4.dst_port = ctKey->dst.port;
> +key->ct.tuple_ipv4.ipv4_proto = ctKey->nw_proto;
> +}
> +
>  if (entryCreated && entry) {
>  OvsPostCtEventEntry(entry, OVS_EVENT_CT_NEW);
>  }
> diff --git a/datapath-windows/ovsext/DpInternal.h
> b/datapath-windows/ovsext/DpInternal.h
> index 9d1a783..743891c 100644
> --- a/datapath-windows/ovsext/DpInternal.h
> +++ b/datapath-windows/ovsext/DpInternal.h
> @@ -199,6 +199,7 @@ typedef __declspec(align(8)) struct OvsFlowKey {
>  UINT32 mark;
>  UINT32 state;
>  struct ovs_key_ct_labels labels;
> +struct ovs_key_ct_tuple_ipv4 tuple_ipv4;
>  } ct;/* Connection Tracking Flags */
>  } OvsFlowKey;
>
> diff --git a/datapath-windows/ovsext/Flow.c b/datapath-windows/ovsext/
> Flow.c
> index 96ff9fa..80f5676 100644
> --- a/datapath-windows/ovsext/Flow.c
> +++ b/datapath-windows/ovsext/Flow.c
> @@ -180,6 +180,10 @@ const NL_POLICY nlFlowKeyPolicy[] = {
>  [OVS_KEY_ATTR_CT_LABELS] = {.type = NL_A_UNSPEC,
>  .minLen = sizeof(struct
> ovs_key_ct_labels),
>  .maxLen = sizeof(struct
> ovs_key_ct_labels),
> +.optional = TRUE},
> +[OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4] = {.type = NL_A_UNSPEC,
> +.minLen = sizeof(struct
> ovs_key_ct_tuple_ipv4),
> +.maxLen = sizeof(struct
> ovs_ke

Re: [ovs-dev] [PATCH] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Kevin Traynor
On 06/07/2017 11:46 PM, Aaron Conole wrote:
> Since vhost-user client mode ports are the preferred mechanism for
> interconnecting Open vSwitch with VMs when using DPDK, and since there
> are currently no known use cases for vhost-user server mode ports apart
> from version incompatibilities with QEMU, announce that server mode ports
> are considered deprecated and will be removed in a future release.
> 
> Cc: Ciara Loftus 
> Cc: Kevin Traynor 
> Suggested-by: Darrell Ball 
> Signed-off-by: Aaron Conole 
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 24 
>  NEWS |  2 ++
>  lib/netdev-dpdk.c|  2 ++
>  3 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> b/Documentation/topics/dpdk/vhost-user.rst
> index a1c19fd..9d36cf2 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -32,13 +32,19 @@ documentation`_ on same.
>  Quick Example
>  -------------
>  
> -This example demonstrates how to add two ``dpdkvhostuser`` ports to an 
> existing
> -bridge called ``br0``::
> +This example demonstrates how to add two ``dpdkvhostuserclient`` ports to an
> +existing bridge called ``br0``::
>  
> -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
> --- set Interface dpdkvhostuser0 type=dpdkvhostuser
> -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
> --- set Interface dpdkvhostuser1 type=dpdkvhostuser
> +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
> +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
> +   options:vhost-server-path=/tmp/dpdkvhostclient0
> +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
> +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
> +   options:vhost-server-path=/tmp/dpdkvhostclient1
> +
> +For the above examples to work, an appropriate server socket must be created
> +at the paths specified (``/tmp/dpdkvhostclient0`` and
> +``/tmp/dpdkvhostclient1``).

You could mention QEMU here. So the reader knows where to look.
"These can be created by QEMU. See below for details."?

>  
>  vhost-user vs. vhost-user-client
>  
> @@ -59,7 +65,8 @@ means if OVS dies, all VMs **must** be restarted. On the 
> other hand, for
>  vhost-user-client ports, OVS acts as the client and QEMU the server. This 
> means
>  OVS can die and be restarted without issue, and it is also possible to 
> restart
>  an instance itself. For this reason, vhost-user-client ports are the 
> preferred
> -type for most use cases.
> +type for most use cases.  Ports of type vhost-user are currently deprecated 
> and
> +will be removed in a future release.
>  
>  .. _dpdk-vhost-user:
>  
> @@ -68,7 +75,8 @@ vhost-user
>  
>  .. important::
>  
> -   Use of vhost-user ports requires QEMU >= 2.2
> +   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user ports are
> +   *deprecated*.
>  
>  To use vhost-user ports, you must first add said ports to the switch. DPDK
>  vhost-user ports can have arbitrary names with the exception of forward and
> diff --git a/NEWS b/NEWS
> index 82004c8..b81d033 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -16,6 +16,8 @@ Post-v2.7.0
> Log level can be changed in a usual OVS way using
> 'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
> still can be configured via extra arguments for DPDK EAL.
> + * dpdkvhostuser ports are marked as deprecated.  They will be removed
> +   in an upcoming release.
> - IPFIX now provides additional counters:
>   * Total counters since metering process startup.
>   * Per-flow TCP flag counters.
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index b770b70..9ab4aeb 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -966,6 +966,8 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
>  err = vhost_common_construct(netdev);
>  
>  ovs_mutex_unlock(&dpdk_mutex);
> +VLOG_WARN_ONCE("dpdkvhostuser ports are considered deprecated;  "
> +   "please migrate to dpdkvhostuserclient ports.");
>  return err;
>  }
>  
> 



Re: [ovs-dev] [PATCH] testsuite: exit gracefully if it fails.

2017-06-08 Thread Ben Pfaff
On Thu, Jun 08, 2017 at 02:30:48PM -0300, Flavio Leitner wrote:
> When a test fails, the daemon is killed, leaving resources behind.
> This fix first signals the daemon to exit gracefully.
> 
> Suggested-by: Joe Stringer 
> Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent")
> Signed-off-by: Flavio Leitner 
> ---
>  tests/ofproto-macros.at | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
> index faff5b0..5ac5d05 100644
> --- a/tests/ofproto-macros.at
> +++ b/tests/ofproto-macros.at
> @@ -323,6 +323,9 @@ m4_define([_OVS_VSWITCHD_START],
> AT_CHECK([ovs-vswitchd $1 --detach --no-chdir --pidfile --log-file 
> -vvconn -vofproto_dpif -vunixctl], [0], [], [stderr])
> AT_CAPTURE_FILE([ovs-vswitchd.log])
> on_exit "kill `cat ovs-vswitchd.pid`"
> +   dnl Wait for the daemon to exit gracefully
> +   on_exit "for i in 1 2 3 4 5 6 7 8 9; do kill -0 `cat ovs-vswitchd.pid` || 
> break; sleep 0.1 || sleep 1; done"
> +   on_exit "ovs-appctl -t ovs-vswitchd exit --cleanup"

Thanks for the patch.

At first, I thought that this did the steps in the wrong order, but
"on_exit" reverses the order.

It would be less surprising to do this with just one call to on_exit,
e.g.

    on_exit '
        ovs-appctl -t ovs-vswitchd exit --cleanup
        for i in 1 2 3 4 5 6 7 8 9; do
            kill -0 `cat ovs-vswitchd.pid` || break
            sleep 0.1 || sleep 1
        done
        kill `cat ovs-vswitchd.pid`
    '

Actually, I think that all of this could be put in a shell function:

    kill_ovs_vswitchd() {
        ovs-appctl -t ovs-vswitchd exit --cleanup
        for i in 1 2 3 4 5 6 7 8 9; do
            kill -0 `cat ovs-vswitchd.pid` || break
            sleep 0.1 || sleep 1
        done
        kill `cat ovs-vswitchd.pid`
    }

and then just "on_exit kill_ovs_vswitchd".  Maybe that is the best
approach.
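
The ordering point can be sketched with a tiny last-in, first-out cleanup
registry. This is only an illustration of the behavior, assuming that later
registrations run first; it is not the testsuite's own on_exit, and the names
register_cleanup, run_cleanup and CLEANUP_CMDS are made up for the example.

```sh
#!/bin/sh
# Hypothetical LIFO cleanup registry; not the testsuite's own on_exit.
CLEANUP_CMDS=""

# Prepend the command so that the most recently registered one runs first.
register_cleanup() {
    CLEANUP_CMDS="$1
$CLEANUP_CMDS"
}

# Run all registered commands, newest first.
run_cleanup() {
    eval "$CLEANUP_CMDS"
}

# Registered in the same order as the three on_exit calls in the patch.
register_cleanup 'echo "kill the daemon (fallback)"'
register_cleanup 'echo "wait for the daemon to exit"'
register_cleanup 'echo "ask the daemon to exit gracefully"'
run_cleanup
```

Because each new command is prepended, the graceful exit request runs first
and the plain kill runs last, which is the order the patch relies on. A single
function registered once avoids having to reason about that reversal at all.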

Thanks,

Ben.


Re: [ovs-dev] [PATCH] testsuite: release resources when vswitch exits.

2017-06-08 Thread Flavio Leitner
On Thu, Jun 08, 2017 at 10:16:11AM -0700, Ben Pfaff wrote:
> On Wed, Jun 07, 2017 at 05:56:36PM -0700, Joe Stringer wrote:
> > On 7 June 2017 at 17:36, Joe Stringer  wrote:
> > > On 7 June 2017 at 13:58, Flavio Leitner  wrote:
> > >> This changes the testsuite macro to release the resources
> > >> configured by ovs-vswitchd when exiting, as it used to.
> > >>
> > >> Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent")
> > >> Fixes: fe13ccdca6a22 ("vswitchd: Add --cleanup option to the 'appctl
> > >>exit' command")
> > >>
> > >> Reported-by: Eric Garver 
> > >> Signed-off-by: Flavio Leitner 
> > >
> > > Thanks for the fix, applied to master.
> > 
> > While this fixes successful test runs with OVS_APP_EXIT_AND_WAIT(), if
> > anything fails in the middle of the test then OVS isn't going to
> > perform cleanup, which will leave devices hanging around, which will
> > cause subsequent tests to fail.
> > 
> > I think we need to amend the _OVS_VSWITCHD_START macro as well, for
> > the following line:
> > on_exit "kill `cat ovs-vswitchd.pid`"
> > 
> > This should use appctl to do a tidy cleanup as well.
> 
> Be careful about that.  There needs to be a fallback to kill vswitchd
> even if it doesn't exit gracefully.  The test framework waits for all
> processes to exit, so the testsuite will ultimately hang if a process
> never does.

I sent this patch to fix it which I think addresses both concerns:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-June/333693.html

-- 
Flavio


[ovs-dev] [PATCH] testsuite: exit gracefully if it fails.

2017-06-08 Thread Flavio Leitner
When a test fails, the daemon is killed, leaving resources behind.
This fix first signals the daemon to exit gracefully.

Suggested-by: Joe Stringer 
Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent")
Signed-off-by: Flavio Leitner 
---
 tests/ofproto-macros.at | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
index faff5b0..5ac5d05 100644
--- a/tests/ofproto-macros.at
+++ b/tests/ofproto-macros.at
@@ -323,6 +323,9 @@ m4_define([_OVS_VSWITCHD_START],
AT_CHECK([ovs-vswitchd $1 --detach --no-chdir --pidfile --log-file -vvconn -vofproto_dpif -vunixctl], [0], [], [stderr])
AT_CAPTURE_FILE([ovs-vswitchd.log])
on_exit "kill `cat ovs-vswitchd.pid`"
+   dnl Wait for the daemon to exit gracefully
+   on_exit "for i in 1 2 3 4 5 6 7 8 9; do kill -0 `cat ovs-vswitchd.pid` || break; sleep 0.1 || sleep 1; done"
+   on_exit "ovs-appctl -t ovs-vswitchd exit --cleanup"
AT_CHECK([[sed < stderr '
 /ovs_numa|INFO|Discovered /d
 /vlog|INFO|opened log file/d
-- 
2.9.4



Re: [ovs-dev] [PATCH v2] datapath-windows: Add original conntrack tuple to FlowKey

2017-06-08 Thread Alin Serdean
Acked-by: Alin Gabriel Serdean 

> -----Original Message-----
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Sairam Venugopal
> Sent: Friday, June 2, 2017 8:37 PM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH v2] datapath-windows: Add original conntrack
> tuple to FlowKey
> 
> Add the original tuple to Flow Key. In case of ICMP and UDP, default the
> parent entry to NULL until related connections are supported.
> 
> Signed-off-by: Sairam Venugopal 


Re: [ovs-dev] [PATCH v2 1/3] netdev-dpdk: Fix Rx checksum reconfigure.

2017-06-08 Thread Kevin Traynor
On 06/01/2017 02:19 AM, Darrell Ball wrote:
> 
> 
> On 5/31/17, 6:59 AM, "ovs-dev-boun...@openvswitch.org on behalf of Kevin 
> Traynor"  
> wrote:
> 
> On 05/30/2017 05:09 PM, Chandran, Sugesh wrote:
> > 
> > 
> > Regards
> > _Sugesh
> > 
> > 
> >> -Original Message-
> >> From: Kevin Traynor [mailto:ktray...@redhat.com]
> >> Sent: Monday, May 29, 2017 1:37 PM
> >> To: Chandran, Sugesh 
> >> Cc: 'd...@openvswitch.org' 
> >> Subject: Re: [PATCH v2 1/3] netdev-dpdk: Fix Rx checksum reconfigure.
> >>
> >> On 05/26/2017 03:04 PM, Chandran, Sugesh wrote:
> >>>
> >>>
> >>> Regards
> >>> _Sugesh
> >>>
> >>>
>  -Original Message-
>  From: Chandran, Sugesh
>  Sent: Wednesday, May 17, 2017 10:50 AM
>  To: Kevin Traynor 
>  Cc: d...@openvswitch.org
>  Subject: RE: [PATCH v2 1/3] netdev-dpdk: Fix Rx checksum reconfigure.
> 
> 
> 
>  Regards
>  _Sugesh
> 
> > -Original Message-
> > From: Kevin Traynor [mailto:ktray...@redhat.com]
> > Sent: Wednesday, May 17, 2017 10:10 AM
> > To: Chandran, Sugesh 
> > Cc: d...@openvswitch.org
> > Subject: Re: [PATCH v2 1/3] netdev-dpdk: Fix Rx checksum 
> reconfigure.
> >
> > On 05/16/2017 05:48 PM, Chandran, Sugesh wrote:
> >> Hi Kevin,
> >> Thank you for sending out this patch series.
> >> Have you tested the tunneling decap use case with checksum offload?
> >> I am seeing weird behavior when testing tunneling with Rx
> >> checksum offload ON and OFF. (I see the same behavior on master as
> >> well.)
> >>
> >> Here is a summary of the issue, with the steps:
> >>
> >> 1) Send tunnel traffic to OVS to do the decap.
> >> 2) Set & unset the checksum offload.
> >> 3) I don't see any performance difference in either case.
> >>
> >> I then went ahead and added some debug messages to see what is happening
> >>
> >> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> >> index
> >> 2798324..49ca847 100644
> >> --- a/lib/netdev-native-tnl.c
> >> +++ b/lib/netdev-native-tnl.c
> >> @@ -86,6 +86,7 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet
> > *packet, struct flow_tnl *tnl,
> >>  ovs_be32 ip_src, ip_dst;
> >>
> >>  if (OVS_UNLIKELY(!dp_packet_ip_checksum_valid(packet))) {
> >> +VLOG_INFO("Checksum is not validated...");
> >>  if (csum(ip, IP_IHL(ip->ip_ihl_ver) * 4)) {
> >>  VLOG_WARN_RL(&err_rl, "ip packet has invalid 
> checksum");
> >>  return NULL;
> >> @@ -182,6 +183,7 @@ udp_extract_tnl_md(struct dp_packet *packet,
> >> struct flow_tnl *tnl,
> >>
> >>  if (udp->udp_csum) {
> >>  if (OVS_UNLIKELY(!dp_packet_l4_checksum_valid(packet))) {
> >> +VLOG_INFO("Checksum is not validated...");
> >>  uint32_t csum;
> >>  if 
> (netdev_tnl_is_header_ipv6(dp_packet_data(packet))) {
> >>  csum =
> >> packet_csum_pseudoheader6(dp_packet_l3(packet));
> >> sugeshch@silpixa00389816:~/repo/ovs_master$
> >>
> >> These debug messages are not showing at all when I am sending the
> >> traffic. (I tried it with rx checksum ON and OFF)
> >>
> >> Looks like ol_flags are always reporting checksum is good.
> >>
> >> I am using DPDK 17.02 for the testing.
> >> If I remember correctly it was reporting properly at the time of rx
> >> checksum
> > offload.
> >> Looks like DPDK is reporting checksum valid in all the cases even
> >> it is
> > disabled.
> >>
> >> Any inputs on this?
> >>
> >
> > Hi Sugesh, I was trying to fix the reconfiguration code not applying
> > the OVSDB value so that's all I tested. I guess it's best to roll
> > back to your original test and take it from there? You probably
> > tested with DPDK
> > 16.11.0 and I see some changes since then (e.g. below). Also, maybe
> > you were testing enabled/disabled on first configure? It's the same
> > configure code, but perhaps there is some different state in DPDK
> > when the port is configured initially.
> >
>  Yes, I tried to configure initially as well as run time.
>  [Sugesh] Also,
>  At the time of Rx checksum offload implementation, one of the
>  comments suggests not to use any configuration option at all.
>  Keep it ON for all the physical ports when it is supported. The
>  

Re: [ovs-dev] [PATCH RFC] ofctl: Fix nonstandard isatty on Windows

2017-06-08 Thread Alin Serdean
> >
> > Signed-off-by: Alin Gabriel Serdean 
> 
> Thanks for finding and fixing the problem.
> 
> This approach fixes a problem in one place only.  I see that there are other
> calls to isatty() in the tree, so I would prefer to fix them all in one 
> place.  How
> about the following approach instead?  I have not tested it.
[Alin Serdean] Thanks for the incremental. It makes more sense to have it in
one place.
I think isatty might already be defined.
I will work from your incremental and add you as co-author.
Thanks for the quick review.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/3] netdev-linux: make tap devices persistent.

2017-06-08 Thread Flavio Leitner
On Thu, Jun 08, 2017 at 11:24:58AM +, Vishal Deep Ajmera wrote:
> Hi Flavio,
> 
> I am facing some issue with ovs-master on Ubuntu 14.04 system. Here are the 
> steps I followed and the error message I get. Can it be possibly related to 
> your patch for making tap device persistent ? Let me know if I am missing 
> some steps here.
> 
> 1. Create netdev bridge
> $ ovs-vsctl add-br br0 -- set Bridge br0 datapath_type=netdev
> $ ovs-vsctl show
> 4c516433-c305-4894-96ad-f44af0b01b63
>     Bridge "br0"
>         Port "br0"
>             Interface "br0"
>                 type: internal
>     ovs_version: "2.7.90"
> 
> 2. Bring up br0 interface
> $ ifconfig br0 up
> 
> 3. Restart the openvswitch
> $ service openvswitch-switch restart
> 
> 4. Show ovs db
> $ ovs-vsctl show
> 4c516433-c305-4894-96ad-f44af0b01b63
>     Bridge "br0"
>         Port "br0"
>             Interface "br0"
>                 type: internal
>                 error: "could not open network device br0 (File exists)"
>     ovs_version: "2.7.90"


Eric reported the same thing to me when a unit test failed.
Oddly, I can't reproduce it (yet), but from reviewing the code it seems that
an rtnetlink message to add a route can open the device without adding it to
the DP; when OVS then tries to add it, that error would be reported.

I will look more into it and see if I can fix it.
Thanks for the report.
Flavio



> 
> Regards,
> Vishal
> 
> -Original Message-
> From: Flavio Leitner [mailto:f...@redhat.com] 
> Sent: Wednesday, June 07, 2017 8:28 PM
> To: Vishal Deep Ajmera 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 2/3] netdev-linux: make tap devices persistent.
> 
> On Wed, Jun 07, 2017 at 09:39:16AM +, Vishal Deep Ajmera wrote:
> > Hi Flavio,
> > 
> > If the tap-device is persistent but the 'netdev' datapath is not yet 
> > started then will it create any issues in the system ? What happens if 
> > we start sending packets on this interface whereas data-path is not 
> > present ?
> 
> The link goes down and packets are dropped.
> fbl
> 
> > 
> > Regards,
> > Vishal
> > 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org 
> > [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Flavio Leitner
> > Sent: Tuesday, May 30, 2017 1:10 AM
> > To: d...@openvswitch.org
> > Cc: Flavio Leitner 
> > Subject: [ovs-dev] [PATCH 2/3] netdev-linux: make tap devices persistent.
> > 
> > When using data path type "netdev", bridge port is a tun device and when 
> > OVS restarts, that device and its network configuration is lost.
> > 
> > This patch enables the tap device to persist instead.
> > 
> > Signed-off-by: Flavio Leitner 
> > ---
> >  lib/netdev-linux.c | 8 
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> > index 3ad3d45..d181e4f 100644
> > --- a/lib/netdev-linux.c
> > +++ b/lib/netdev-linux.c
> > @@ -866,6 +866,13 @@ netdev_linux_construct_tap(struct netdev *netdev_)
> >          goto error_close;
> >      }
> >  
> > +    if (ioctl(netdev->tap_fd, TUNSETPERSIST, 1)) {
> > +        VLOG_WARN("%s: creating tap device failed (persist): %s", name,
> > +                  ovs_strerror(errno));
> > +        error = errno;
> > +        goto error_close;
> > +    }
> > +
> >      return 0;
> >  
> >  error_close:
> > @@ -885,6 +892,7 @@ netdev_linux_destruct(struct netdev *netdev_)
> >      if (netdev_get_class(netdev_) == &netdev_tap_class
> >          && netdev->tap_fd >= 0)
> >      {
> > +        ioctl(netdev->tap_fd, TUNSETPERSIST, 0);
> >          close(netdev->tap_fd);
> >      }
> >  
> > --
> > 2.9.4
> > 
> 
> --
> Flavio

-- 
Flavio


Re: [ovs-dev] [PATCH] testsuite: release resources when vswitch exits.

2017-06-08 Thread Ben Pfaff
On Wed, Jun 07, 2017 at 05:56:36PM -0700, Joe Stringer wrote:
> On 7 June 2017 at 17:36, Joe Stringer  wrote:
> > On 7 June 2017 at 13:58, Flavio Leitner  wrote:
> >> This change the testsuite macro to release the resources
> >> configured by ovs-vswitchd when exiting as it used to be.
> >>
> >> Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent")
> >> Fixes: fe13ccdca6a22 ("vswitchd: Add --cleanup option to the 'appctl
> >>exit' command")
> >>
> >> Reported-by: Eric Garver 
> >> Signed-off-by: Flavio Leitner 
> >
> > Thanks for the fix, applied to master.
> 
> While this fixes successful test runs with OVS_APP_EXIT_AND_WAIT(), if
> anything fails in the middle of the test then OVS isn't going to
> perform cleanup, which will leave devices hanging around, which will
> cause subsequent tests to fail.
> 
> I think we need to amend the _OVS_VSWITCHD_START macro as well, for
> the following line:
> on_exit "kill `cat ovs-vswitchd.pid`"
> 
> This should use appctl to do a tidy cleanup as well.

Be careful about that.  There needs to be a fallback to kill vswitchd
even if it doesn't exit gracefully.  The test framework waits for all
processes to exit, so the testsuite will ultimately hang if a process
never does.
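
A minimal sketch of that fallback, written in C rather than the testsuite's
shell. It assumes SIGTERM as a stand-in for the graceful exit request, and
both helper names are hypothetical: ask the process to exit, wait for a
bounded grace period, and only then force-kill it, so the caller cannot hang.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: asks child 'pid' to exit (SIGTERM stands in for the
 * graceful exit request), polls for up to 'grace_ms' milliseconds, then
 * falls back to SIGKILL.  Returns the number of the signal that ended the
 * child, 0 if it exited on its own, or -1 on error. */
static int
stop_child_with_fallback(pid_t pid, int grace_ms)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 };   /* 10 ms. */
    int status;

    assert(pid > 0);
    kill(pid, SIGTERM);
    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) {
            return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        } else if (r < 0) {
            return -1;
        }
        nanosleep(&tick, NULL);
    }

    /* Grace period expired: force the issue so the caller cannot hang. */
    kill(pid, SIGKILL);
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Hypothetical fixture: forks a child that sleeps forever.  If 'stubborn',
 * SIGTERM stays blocked in the child, so it never exits gracefully. */
static pid_t
spawn_sleeper(bool stubborn)
{
    sigset_t term, old;
    pid_t pid;

    sigemptyset(&term);
    sigaddset(&term, SIGTERM);
    /* Block before fork() so the child inherits the mask without a race. */
    sigprocmask(SIG_BLOCK, &term, &old);
    pid = fork();
    if (pid == 0) {
        if (!stubborn) {
            signal(SIGTERM, SIG_DFL);
            sigprocmask(SIG_SETMASK, &old, NULL);
        }
        for (;;) {
            pause();
        }
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    return pid;
}
```

A cooperative child is reaped as soon as it honours the request; a stubborn
one costs at most the grace period before it is killed.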


Re: [ovs-dev] ovn: SFC Patch V3

2017-06-08 Thread Mickey Spiegel
A couple more issues that have not come up in a while or at all so far:

1. A port can have multiple mac addresses. Right now you are only using the
first mac address on a port (traffic_port->nbsp->addresses[0]). Rules need
to be installed for all of the mac addresses on the port.

2. The VNF ports are associated with the logical switch's datapath. Any
rules in non-SFC specific tables (e.g. ingress stages 1-9, egress stages
1-9) will be applied over and over again at each SFC hop. This affects
performance. Besides the overhead of additional redundant lookups, if there
are stateful ACL rules then conntrack recirc will occur. Possibly even
worse, if any of the VNFs change any of the fields in the ACL match
condition, then the packet could fail the ACL and be dropped. One option to
avoid all of this would be to insert a pipeline stage at the beginning of
the ingress pipeline and egress pipelines (we already have one for SFC in
the egress pipeline, which can be reused for this purpose as well),
skipping most if not all other pipeline stages on VNF ports. For
intermediate SFC hops in the ingress pipeline, I guess the rules could be
put in this first table rather than the current table 10. For the egress
pipeline, I guess just output directly?
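
For issue 1, a rough sketch of walking every address entry instead of only
the first one. It assumes each entry is a string that starts with an Ethernet
address, optionally followed by IP addresses, or a keyword such as "unknown".
The struct and function names are hypothetical and are not taken from the
patch or from OVN's own parsing helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical container for one Ethernet address. */
struct ex_mac {
    unsigned char b[6];
};

/* Parses a leading "xx:xx:xx:xx:xx:xx" from 's'.  Returns 1 on success,
 * 0 if 's' does not start with an Ethernet address. */
static int
ex_parse_mac(const char *s, struct ex_mac *mac)
{
    return sscanf(s, "%2hhx:%2hhx:%2hhx:%2hhx:%2hhx:%2hhx",
                  &mac->b[0], &mac->b[1], &mac->b[2],
                  &mac->b[3], &mac->b[4], &mac->b[5]) == 6;
}

/* Collects the MAC of every entry in 'addresses', not just addresses[0].
 * Entries without a MAC (e.g. "unknown") are skipped.  Returns the number
 * of MACs stored in 'macs', at most 'max_macs'. */
static size_t
ex_collect_macs(const char *const *addresses, size_t n_addresses,
                struct ex_mac *macs, size_t max_macs)
{
    size_t n = 0;

    assert(addresses != NULL || n_addresses == 0);
    for (size_t i = 0; i < n_addresses && n < max_macs; i++) {
        if (ex_parse_mac(addresses[i], &macs[n])) {
            n++;
        }
    }
    return n;
}
```

A caller would then install one rule per collected MAC rather than a single
rule for the first address.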

Mickey


On Wed, May 10, 2017 at 3:49 PM, Mickey Spiegel 
wrote:

> Three issues before diving in:
>
>
> 1. Placement of S_SWITCH_IN_CHAIN
>
> For some reason I thought S_SWITCH_IN_CHAIN was after all the stateful
> processing, but now I see that it is table 7. That breaks ACLs and other
> stateful processing, since REGBIT_CONNTRACK_COMMIT is set in
> S_SWITCH_IN_ACL and matched in S_SWITCH_IN_STATEFUL.
>
> S_SWITCH_IN_CHAIN should instead be table 10. The comments below are
> written assuming this change.
>
>
> 2. Ingress pipeline needs to be expanded to more than 16 tables
>
> DNS went in since the v3 patch and used up the last of the 16 ingress
> tables. If you rebase, you will see that there is no space in the ingress
> pipeline for the addition of S_SWITCH_IN_CHAIN. Someone (not me) needs to
> expand the ingress pipeline to more than 16 stages before you can proceed.
>
>
> 3. While writing this response, I paid a little more attention to the
> "exit-lport" direction and noticed a couple of significant issues.
>
> a. If a packet goes from VM A on port 1 to VM B on port 4, there is a
> logical port chain classifier on port 1 in the "entry-lport" direction, and
> there is a logical port chain classifier on port 4 in the "exit-lport"
> direction, you will only go down one of the service chains. Since the
> priorities are equal, I cannot even tell you which one of the service
> chains. Logically I would think that the packet should go down both service
> chains, first the port 1 "entry-lport" service chain and then the port 4
> "exit-lport" service chain.
>
> b. This is done in the ingress pipeline, not the egress pipeline, and is
> based on matching eth.dst. This assumes that the forwarding decision will
> be based on eth.dst, since you are immediately going down the service
> chain, skipping the other ingress pipeline stages, and at the end you go
> directly to the egress pipeline with outport based on eth.dst. That is
> quite restrictive for a generic forwarding architecture like OVN. I would
> think that the right thing to do would be to move the classifier to the
> egress pipeline stage, but then I do not know how to avoid loops. When a
> packet comes to the egress pipeline stage where the VM resides, there is no
> way to tell whether the packet has already gone down the service chain or
> not. I guess you could put a S_SWITCH_IN_EGRESS_CHAIN ingress pipeline
> stage right after L2_LKUP instead, and match on outport in addition to
> eth.dst, but it feels a bit unclean.
>
> On Tue, May 9, 2017 at 4:33 PM, John McDowall wrote:
>
>> Mickey,
>>
>>
>>
>> Thanks for the review. I need some help understanding a couple of things:
>>
>>
>>
>> 1)   The proposed change, I could see the previous logic where we
>> inserted the flow back in the ingress pipeline just after the IN_CHAIN
>> stage. The changes you suggest seem to imply that the action is still
>> insert after the _*IN*_CHAIN stage but in the egress (OUT) pipeline. I
>> am missing something here – can you give me some more info?
>>
> Assume you have port 1 to a VM on HV1, port 2 as the input port to a VNF
> on HV2, and port 3 as the output port from that same VNF on HV2. The
> service chain is just that one VNF, with direction "entry-lport".
>
> The packet progresses as follows:
>
> HV1, ingress pipeline, inport 1
> Tables 1 to 9
> Table 10 (S_SWITCH_IN_CHAIN) hits priority 100 flow setting outport = 2
> and then going to output immediately
>
> Table 32 sends the packet through a geneve tunnel to HV2
>
> HV2, egress pipeline, outport 2
> Tables 48 to 65
>
> After packet gets delivered to the VNF, it comes back
>
> HV2, ingress pipeline, inport 3
> Tables 1 to 9
>
> Table 10 (S_SWITCH_IN_CHAIN)
>
> With the co

[ovs-dev] [PATCH v3 2/2] netdev-dpdk: Show Rx checksum status when false.

2017-06-08 Thread Kevin Traynor
Currently ovs-appctl dpctl/show only shows the Rx checksum offload
status when true. Change to also show the status when false.

CC: Sugesh Chandran 
Signed-off-by: Kevin Traynor 
---
 lib/netdev-dpdk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 79afda5..c4f32ac 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1103,4 +1103,6 @@ netdev_dpdk_get_config(const struct netdev *netdev, struct smap *args)
         if (dev->hw_ol_features & NETDEV_RX_CHECKSUM_OFFLOAD) {
             smap_add(args, "rx_csum_offload", "true");
+        } else {
+            smap_add(args, "rx_csum_offload", "false");
         }
     }
-- 
1.8.3.1



[ovs-dev] [PATCH v3 1/2] netdev-dpdk: Remove Rx checksum reconfigure.

2017-06-08 Thread Kevin Traynor
Rx checksum offload is enabled by default on DPDK physical NICs
where available, with reconfiguration through
options:rx-checksum-offload. However, changing rx-checksum-offload
did not result in a reconfiguration of the NIC, and the wrong status
was reported for it.

As there seem to be diminishing reasons why a user would want
to disable Rx checksum offload, just remove the broken reconfiguration
option.

Fixes: 1a2bb11817a4 ("netdev-dpdk: Enable Rx checksum offloading feature on DPDK physical ports.")
Reported-by: Kevin Traynor 
Suggested-by: Sugesh Chandran 
Signed-off-by: Kevin Traynor 
---
 Documentation/howto/dpdk.rst | 17 +---
 lib/netdev-dpdk.c| 47 +++-
 vswitchd/vswitch.xml | 14 -
 3 files changed, 13 insertions(+), 65 deletions(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 93248b4..af01d3e 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -277,16 +277,5 @@ Rx Checksum Offload
 ---
 
-By default, DPDK physical ports are enabled with Rx checksum offload. Rx
-checksum offload can be configured on a DPDK physical port either when adding
-or at run time.
-
-To disable Rx checksum offload when adding a DPDK port dpdk-p0::
-
-    $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
-      options:dpdk-devargs=:01:00.0 options:rx-checksum-offload=false
-
-Similarly to disable the Rx checksum offloading on a existing DPDK port dpdk-p0::
-
-    $ ovs-vsctl set Interface dpdk-p0 options:rx-checksum-offload=false
+By default, DPDK physical ports are enabled with Rx checksum offload.
 
 Rx checksum offload can offer performance improvement only for tunneling
@@ -294,8 +283,4 @@ traffic in OVS-DPDK because the checksum validation of tunnel packets is
 offloaded to the NIC. Also enabling Rx checksum may slightly reduce the
 performance of non-tunnel traffic, specifically for smaller size packet.
-DPDK vectorization is disabled when checksum offloading is configured on DPDK
-physical ports which in turn effects the non-tunnel traffic performance.
-So it is advised to turn off the Rx checksum offload for non-tunnel traffic use
-cases to achieve the best performance.
 
 .. _extended-statistics:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index b770b70..79afda5 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -719,27 +719,4 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
 
 static void
-dpdk_eth_checksum_offload_configure(struct netdev_dpdk *dev)
-    OVS_REQUIRES(dev->mutex)
-{
-    struct rte_eth_dev_info info;
-    bool rx_csum_ol_flag = false;
-    uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
-                                     DEV_RX_OFFLOAD_TCP_CKSUM |
-                                     DEV_RX_OFFLOAD_IPV4_CKSUM;
-    rte_eth_dev_info_get(dev->port_id, &info);
-    rx_csum_ol_flag = (dev->hw_ol_features & NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
-
-    if (rx_csum_ol_flag &&
-        (info.rx_offload_capa & rx_chksm_offload_capa) !=
-         rx_chksm_offload_capa) {
-        VLOG_WARN_ONCE("Rx checksum offload is not supported on device %"PRIu8,
-                       dev->port_id);
-        dev->hw_ol_features &= ~NETDEV_RX_CHECKSUM_OFFLOAD;
-        return;
-    }
-    netdev_request_reconfigure(&dev->up);
-}
-
-static void
 dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) OVS_REQUIRES(dev->mutex)
 {
@@ -759,7 +736,19 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
     int diag;
     int n_rxq, n_txq;
+    uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
+                                     DEV_RX_OFFLOAD_TCP_CKSUM |
+                                     DEV_RX_OFFLOAD_IPV4_CKSUM;
 
     rte_eth_dev_info_get(dev->port_id, &info);
 
+    if ((info.rx_offload_capa & rx_chksm_offload_capa) !=
+            rx_chksm_offload_capa) {
+        VLOG_WARN_ONCE("Rx checksum offload is not supported on device %"PRIu8,
+                        dev->port_id);
+        dev->hw_ol_features &= ~NETDEV_RX_CHECKSUM_OFFLOAD;
+    } else {
+        dev->hw_ol_features |= NETDEV_RX_CHECKSUM_OFFLOAD;
+    }
+
     n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq);
     n_txq = MIN(info.max_tx_queues, dev->up.n_txq);
@@ -1205,6 +1194,4 @@ netdev_dpdk_set_config(struct netdev *netdev, const struct smap *args,
         {RTE_FC_RX_PAUSE, RTE_FC_FULL}
     };
-    bool rx_chksm_ofld;
-    bool temp_flag;
     const char *new_devargs;
     int err = 0;
@@ -1288,14 +1275,4 @@ netdev_dpdk_set_config(struct netdev *netdev, const struct smap *args,
     }
 
-    /* Rx checksum offload configuration */
-    /* By default the Rx checksum offload is ON */
-    rx_chksm_ofld = smap_get_bool(args, "rx-checksum-offload", true);
-    temp_flag = (dev->hw_ol_features & NETDEV_RX_CHECKSUM_OFFLOAD)
-                        != 0;
-    if (temp_flag != rx_chksm_ofld) {
-        dev->hw_ol_feature

[ovs-dev] [PATCH v3 0/2] DPDK Rx checksum reconfigure.

2017-06-08 Thread Kevin Traynor
V3: After tests by Sugesh and discussion on ML, drop the Rx checksum
offload reconfig option.
V2: Added 3/3 to refactor Rx checksum config code and make generic.

Kevin Traynor (2):
  netdev-dpdk: Remove Rx checksum reconfigure.
  netdev-dpdk: Show Rx checksum status when false.

 Documentation/howto/dpdk.rst | 17 +--
 lib/netdev-dpdk.c| 49 +---
 vswitchd/vswitch.xml | 14 -
 3 files changed, 15 insertions(+), 65 deletions(-)
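
As a rough illustration of what the two patches do together, the sketch below
sets the Rx checksum feature bit only when the device advertises all three
checksum capabilities, and always reports a status string. The bit values and
the ex_ names are placeholders for illustration; real code would use DPDK's
DEV_RX_OFFLOAD_*_CKSUM flags and NETDEV_RX_CHECKSUM_OFFLOAD from
lib/netdev-dpdk.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder bit values for illustration only. */
#define EX_RX_OFFLOAD_IPV4_CKSUM (UINT32_C(1) << 1)
#define EX_RX_OFFLOAD_UDP_CKSUM  (UINT32_C(1) << 2)
#define EX_RX_OFFLOAD_TCP_CKSUM  (UINT32_C(1) << 3)
#define EX_RX_CHECKSUM_OFFLOAD   (UINT32_C(1) << 0)

/* Returns 'features' with the Rx checksum bit set if 'rx_offload_capa'
 * contains all three checksum capabilities, and cleared otherwise.  Note
 * the "== need" style comparison: having only some of the bits is not
 * enough. */
static uint32_t
ex_update_rx_csum_feature(uint32_t features, uint32_t rx_offload_capa)
{
    const uint32_t need = EX_RX_OFFLOAD_IPV4_CKSUM
                          | EX_RX_OFFLOAD_UDP_CKSUM
                          | EX_RX_OFFLOAD_TCP_CKSUM;

    if ((rx_offload_capa & need) != need) {
        return features & ~EX_RX_CHECKSUM_OFFLOAD;
    }
    return features | EX_RX_CHECKSUM_OFFLOAD;
}

/* Status string to report whether or not the feature is enabled. */
static const char *
ex_rx_csum_status(uint32_t features)
{
    return (features & EX_RX_CHECKSUM_OFFLOAD) ? "true" : "false";
}
```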

-- 
1.8.3.1



Re: [ovs-dev] [PATCH V9 05/31] netdev: Adding a new netdev API to be used for offloading flows

2017-06-08 Thread Joe Stringer
On 8 June 2017 at 05:05, Roi Dayan  wrote:
>
>
> On 31/05/2017 04:37, Joe Stringer wrote:
>>
>> On 28 May 2017 at 04:59, Roi Dayan  wrote:
>>>
>>> From: Paul Blakey 
>>>
>>> Add a new API interface for offloading dpif flows to netdev.
>>> The API consist on the following:
>>>   flow_put - offload a new flow
>>>   flow_get - query an offloaded flow
>>>   flow_del - delete an offloaded flow
>>>   flow_flush - flush all offloaded flows
>>>   flow_dump_* - dump all offloaded flows
>>>
>>> In upcoming commits we will introduce an implementation of this
>>> API for netdev-linux.
>>>
>>> Signed-off-by: Paul Blakey 
>>> Reviewed-by: Roi Dayan 
>>> Reviewed-by: Simon Horman 
>>> ---
>>
>>
>> 
>>
>>> @@ -769,6 +777,67 @@ struct netdev_class {
>>>
>>>  /* Discards all packets waiting to be received from 'rx'. */
>>>  int (*rxq_drain)(struct netdev_rxq *rx);
>>> +
>>> +/* ##  ## */
>>> +/* ## netdev flow offloading functions ## */
>>> +/* ##  ## */
>>> +
>>> +/* If a particular netdev class does not support offloading flows,
>>> + * all these function pointers must be NULL. */
>>> +
>>> +/* Flush all offloaded flows from a netdev.
>>> + * Return 0 if successful, otherwise returns a positive errno value.
>>> */
>>> +int (*flow_flush)(struct netdev *);
>>> +
>>> +/* Flow dumping interface.
>>> + *
>>> + * This is the back-end for the flow dumping interface described in
>>> + * dpif.h.  Please read the comments there first, because this code
>>> + * closely follows it.
>>> + *
>>> + * 'flow_dump_create' is being executed in a dpif thread so there is
>>> + * no need for 'flow_dump_thread_create' implementation.
>>
>>
>> I find this comment a bit confusing, but it's a good thing it was here
>> because it raises a couple of discussion points.
>>
>> 'flow_dump_thread_create', perhaps poorly named, doesn't create a
>> thread, but allocates memory for per-thread state so that each thread
>> may dump safely in parallel while operating on an independent netlink
>> dump and independent buffers. I guess that in the DPIF flow dump there
>> is global dump state and per-thread state, while in this netdev flow
>> dump API there is only global state?
>>
>> Describing that this interface doesn't need something that isn't being
>> defined is a bit strange. If it's not needed, then we probably don't
>> need to describe why it's not needed here since there's no such
>> function. Then, the comment can be dropped.
>>
>>> + * On success returns allocated netdev_flow_dump data, on failure
>>> returns
>>
>>
>> ^ returns allocated netdev_flow_dump_data "and returns 0"...?
>>
>>> + * positive errno. */
>>> +int (*flow_dump_create)(struct netdev *, struct netdev_flow_dump
>>> **dump);
>>> +int (*flow_dump_destroy)(struct netdev_flow_dump *);
>>> +
>>> +/* Returns true while there are more flows to dump.
>>
>>
>> s/while/if/
>>
>>> + * rbuffer is used as a temporary buffer and needs to be pre
>>> allocated
>>> + * by the caller. while there are more flows the same rbuffer should
>>> + * be provided. wbuffer is used to store dumped actions and needs to
>>> be
>>> + * pre allocated by the caller. */
>>
>>
>> I have a couple of extra questions which this description could be
>> expanded to answer:
>>
>> Who is responsible for freeing 'rbuffer' and 'wbuffer'? I expect the
>> caller, but this could be more explicit.
>
>
> caller. as noted the function expects them to be pre allocated.

Makes sense, but to be precise in the API documentation it should
probably state that the caller is responsible for freeing those
buffers.

>> Are the pointers that are returned valid beyond the next call to
>> flow_dump_next()?
>
>
> yes. what can we add to make it clear?

Hmm, ok. Usually when you make a call to the DPIF layer
flow_dump_next, you provide a buffer which gets populated and the
flows point within the buffer (round 1). Once you call flow_dump_next
again after that (round 2), then the flows in round 1 are not
guaranteed to be valid, because they point within the buffer that is
getting manipulated by this function. The DPIF layer describes this
limitation in its documentation, which implies that callers who wish
to preserve the flow beyond the next (round 2) call to dump_next would
have to make a copy. Is that the case here too? I guess that if there
is not such a restriction, then I'm not sure if there's anything to
describe.
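
To make the DPIF-style restriction concrete, here is a toy sketch (it is not
the netdev API under review, and every ex_ name is hypothetical). Each call
overwrites the caller's buffer, so a flow pointer from round 1 no longer
refers to round 1 data after round 2 unless the caller copied it first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define EX_BUF_SIZE 64

/* Hypothetical dump state: strings stand in for flows. */
struct ex_dump {
    const char *const *flows;
    size_t n_flows;
    size_t next;
};

/* Writes the next flow into the caller-owned 'rbuffer' and points '*flow'
 * into that buffer.  Returns true if a flow was dumped, false when the dump
 * is complete.  The caller allocates and frees 'rbuffer'. */
static bool
ex_dump_next(struct ex_dump *dump, char rbuffer[EX_BUF_SIZE],
             const char **flow)
{
    if (dump->next >= dump->n_flows) {
        return false;
    }
    snprintf(rbuffer, EX_BUF_SIZE, "%s", dump->flows[dump->next++]);
    *flow = rbuffer;
    return true;
}
```

A caller that needs round 1's flow after calling ex_dump_next() again has to
copy it out of 'rbuffer' first, which is the obligation the DPIF
documentation spells out.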

> Hi Joe,
>
> I accidentally skipped your comments here for V10.
> I'll address them in the next update.

OK, thanks.

> We skipped addressing port_hmap_obj as we also wanted to move it from
> global to be per dpif which I think got me stuck somewhere in the
> process. I don't remember the reason.
> maybe we can still do as a first step changing this void* to some
> type but still be global and later to be per dpif.
> in any case can we address this in a later commit

Re: [ovs-dev] [PATCH RFC] ofctl: Fix nonstandard isatty on Windows

2017-06-08 Thread Ben Pfaff
On Thu, Jun 08, 2017 at 08:28:45AM +, Alin Serdean wrote:
> A lot of tests are failing, due to the open flow ports being outputted using
> names instead of numbers.
> i.e.: 
> http://64.119.130.115/ovs/beb75a40fdc295bfd6521b0068b4cd12f6de507c/testsuite.dir/0464/testsuite.log.gz
> 
> The issues encountered above is because 'monitor' with 'detach' arguments are
> specified, that in turn will call 'close_standard_fds'
> (https://github.com/openvswitch/ovs/blob/master/lib/daemon-unix.c#L472)
> which will create a duplicate fd over '/dev/null' on Linux and 'nul' on 
> Windows.
> 
> 'isatty' will be called on those FDs.
> What POSIX standard says:
> http://pubs.opengroup.org/onlinepubs/009695399/functions/isatty.html
> 'The isatty() function shall test whether fildes, an open file descriptor,
> is associated with a terminal device.'
> What MSDN says:
> https://msdn.microsoft.com/en-us/library/f4s0ddew(VS.80).aspx
> 'The _isatty function determines whether fd is associated with a character
> device (a terminal, console, printer, or serial port).'
> 
> This patch adds another check using 'GetConsoleMode'
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms683167(v=vs.85).aspx
> which will fail if the handle pointing to the file descriptor is not 
> associated
> to a console.
> 
> Signed-off-by: Alin Gabriel Serdean 

Thanks for finding and fixing the problem.

This approach fixes a problem in one place only.  I see that there are
other calls to isatty() in the tree, so I would prefer to fix them all
in one place.  How about the following approach instead?  I have not
tested it.

diff --git a/include/windows/unistd.h b/include/windows/unistd.h
index 2e9f0aef1647..00ca69762102 100644
--- a/include/windows/unistd.h
+++ b/include/windows/unistd.h
@@ -85,4 +85,19 @@ __inline long sysconf(int type)
     return value;
 }
 
+/* On Windows, a console is a specialized character device, and isatty() only
+ * reports whether a file description is a character device and thus reports
+ * that devices such as /dev/null are ttys.  This replacement avoids that
+ * problem. */
+#define isatty(fd) rpl_isatty(fd)
+static __inline int
+rpl_isatty(int fd)
+{
+    HANDLE h = (HANDLE) _get_osfhandle(fd);
+    DWORD st;
+    return (_isatty(STDOUT_FILENO)
+            && h != INVALID_HANDLE_VALUE
+            && GetConsoleMode(h, &st));
+}
+
 #endif /* unistd.h  */
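
For comparison, a small sketch of the POSIX behaviour that the replacement is
meant to match: isatty() on a descriptor opened on /dev/null reports that it
is not a terminal. The helper name is hypothetical, and this is POSIX-only
code, not part of the proposed Windows change.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if 'path' opens to a terminal, 0 if it opens to something else,
 * or -1 if it cannot be opened. */
static int
ex_path_is_tty(const char *path)
{
    int fd = open(path, O_RDONLY);
    int tty;

    if (fd < 0) {
        return -1;
    }
    tty = isatty(fd);
    close(fd);
    return tty;
}
```

On a POSIX system, ex_path_is_tty("/dev/null") is 0, which is the answer the
daemonized processes described above need to get on Windows as well.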


Re: [ovs-dev] [PATCH] checkpatch: Also allow .at files to have leading tabs.

2017-06-08 Thread Ben Pfaff
On Wed, Jun 07, 2017 at 06:57:31PM -0400, Aaron Conole wrote:
> Ben Pfaff  writes:
> 
> > Autotest .at files often have lines with samples of expected output from
> > various programs, which fairly often includes leading tabs, so this warning
> > causes false positives there.
> >
> > Signed-off-by: Ben Pfaff 
> > ---
> 
> I'm assuming the list is there because probably there could be other
> extensions where we wish to disable this, and it matches the line length
> blacklist.

That's right, I was imitating the structure there.

> I haven't tested it yet, but am confident enough to give:
> 
> Reviewed-by: Aaron Conole 

Thanks!  I applied this to master.


Re: [ovs-dev] [PATCH v2 1/3] windows: add definition of getpid and getcwd

2017-06-08 Thread Alin Serdean
> On Fri, May 19, 2017 at 11:16:16PM +, Alin Serdean wrote:
> > > >  #define WIN32_LEAN_AND_MEAN
> > > > +#include 
> > > >  #include 
> > > > +#include 
> > >
> > > Thanks for the revised patch.
> > >
> > > Does #include  make a difference?  Every .c file should
> > > already start out with that #include, and so if it makes a
> > > difference then it probably indicates that some .c file has
> > > forgotten it.  (But the Makefile checks for that, so it is
> > > unlikely.)
> > >
> > > Thanks,
> > >
> > > Ben.
> > [Alin Serdean] I did a clean compile and a run of unit tests and everything
> was ok.
> > I included  for
> https://github.com/openvswitch/ovs/blob/master/include/windows/windef
> s.h#L41 .
> > Should I change  to  ?
> 
> It looks like config.h always include windefs.h, on Windows.  Every .c file
> includes config.h.  Thus, there should be no need for anything else to ever
> include config.h or windefs.h.  Right?
[Alin Serdean] Yup, it makes total sense. I was a bit tired when I wrote the 
reply 😊.


Re: [ovs-dev] Request to tag 2.7 branch with v2.7.1

2017-06-08 Thread Ben Pfaff
On Thu, Jun 08, 2017 at 04:36:21AM +0530, Numan Siddique wrote:
> Is it possible to tag the 2.7 branch with v2.7.1. Recently there were many
> back ports to 2.7 branch. OVS 2.7 is required for RDO [1] and the RDO
> community is expecting a new tag to build the latest OVS 2.7 branch.

It's fine with me if we release 2.7.1.  Justin?


Re: [ovs-dev] [PATCH V10 04/33] tc: Move functions the create/parse handle to be static inline

2017-06-08 Thread Simon Horman
On Thu, Jun 08, 2017 at 02:46:21PM +0300, Roi Dayan wrote:
> Those functions are just wrappers to available macros for readability.
> Move them to tc.h to avoid function-call overhead.
> 
> Signed-off-by: Roi Dayan 

This looks good to me. I'd be happy to apply it if someone provided a review.
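
For readers without the patch at hand, a sketch of what such static inline
wrappers could look like. The EX_ macros are local copies that mirror the
TC_H_* definitions in <linux/pkt_sched.h> so the example builds on its own,
and the ex_ function names are hypothetical, not necessarily those in tc.h.

```c
#include <assert.h>

/* Local copies mirroring <linux/pkt_sched.h>, for illustration only. */
#define EX_TC_H_MAJ_MASK 0xFFFF0000U
#define EX_TC_H_MIN_MASK 0x0000FFFFU
#define EX_TC_H_MAJ(h) ((h) & EX_TC_H_MAJ_MASK)
#define EX_TC_H_MIN(h) ((h) & EX_TC_H_MIN_MASK)
#define EX_TC_H_MAKE(maj, min) \
    (((maj) & EX_TC_H_MAJ_MASK) | ((min) & EX_TC_H_MIN_MASK))
#define EX_TC_H_INGRESS 0xFFFFFFF1U

/* Builds a handle from a 16-bit major and a 16-bit minor number. */
static inline unsigned int
ex_tc_make_handle(unsigned int major, unsigned int minor)
{
    return EX_TC_H_MAKE(major << 16, minor);
}

/* Extracts the major number from 'handle'. */
static inline unsigned int
ex_tc_get_major(unsigned int handle)
{
    return EX_TC_H_MAJ(handle) >> 16;
}

/* Extracts the minor number from 'handle'. */
static inline unsigned int
ex_tc_get_minor(unsigned int handle)
{
    return EX_TC_H_MIN(handle);
}
```

Because each wrapper is a single macro expansion, declaring it static inline
in a header lets the compiler fold it into the caller. With these
definitions, EX_TC_H_MAKE(EX_TC_H_INGRESS, 0) evaluates to 0xFFFF0000, that
is, major ffff and minor 0.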


Re: [ovs-dev] [PATCH V10 03/33] tc: Refactor tcm handle assignment when creating filter qdisc

2017-06-08 Thread Simon Horman
On Thu, Jun 08, 2017 at 02:46:20PM +0300, Roi Dayan wrote:
> Use the available TC macros instead of 0x.
> 
> Signed-off-by: Roi Dayan 

This looks good to me. I'd be happy to apply it if someone reviewed it.

> ---
>  lib/tc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/tc.c b/lib/tc.c
> index d3263a2..a71a9e0 100644
> --- a/lib/tc.c
> +++ b/lib/tc.c
> @@ -95,7 +95,7 @@ tc_add_del_ingress_qdisc(int ifindex, bool add)
>      int flags = add ? NLM_F_EXCL | NLM_F_CREATE : 0;
>  
>      tcmsg = tc_make_request(ifindex, type, flags, &request);
> -    tcmsg->tcm_handle = tc_make_handle(0x, 0);
> +    tcmsg->tcm_handle = TC_H_MAKE(TC_H_INGRESS, 0);
>      tcmsg->tcm_parent = TC_H_INGRESS;
>      nl_msg_put_string(&request, TCA_KIND, "ingress");
>      nl_msg_put_unspec(&request, TCA_OPTIONS, NULL, 0);
> -- 
> 2.7.4
> 


Re: [ovs-dev] [PATCH V10 02/33] tc: Introduce tc module

2017-06-08 Thread Simon Horman
On Thu, Jun 08, 2017 at 02:46:19PM +0300, Roi Dayan wrote:
> Add tc module to expose tc operations to be used by other modules.
> Move some tc related functions from netdev-linux.c to tc.c
> This patch doesn't change any functionality.
> 
> Signed-off-by: Paul Blakey 
> Co-authored-by: Roi Dayan 

Hi Roi,

as your name appears in the From field (as the author) I think
that Paul's name rather than yours should be in the Co-authored-by tag.
If you agree please consider responding to this email with the correct tag.

> Signed-off-by: Roi Dayan 
> Acked-by: Joe Stringer 
> Acked-by: Flavio Leitner 

...


Re: [ovs-dev] [PATCH V10 01/33] netdev-linux: Refactor two tc functions

2017-06-08 Thread Simon Horman
On Thu, Jun 08, 2017 at 02:46:18PM +0300, Roi Dayan wrote:
> Refactor tc_make_request and tc_add_del_ingress_qdisc to accept
> ifindex instead of netdev struct.
> We later want to move those outside netdev-linux module to be
> used by other modules.
> This patch doesn't change any functionality.
> 
> Signed-off-by: Paul Blakey 
> Co-authored-by: Roi Dayan 

Hi Roi,

as your name appears in the From field (as the author) I think
that Paul's name rather than yours should be in the Co-authored-by tag.
If you agree please consider responding to this email with the correct tag.

> Signed-off-by: Roi Dayan 
> Acked-by: Joe Stringer 
> Acked-by: Flavio Leitner 
> ---
>  lib/netdev-linux.c | 91 
> +-
>  1 file changed, 55 insertions(+), 36 deletions(-)
> 
> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> index 1b88775..d794453 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -442,10 +442,14 @@ static unsigned int tc_ticks_to_bytes(unsigned int rate, unsigned int ticks);
>  static unsigned int tc_bytes_to_ticks(unsigned int rate, unsigned int size);
>  static unsigned int tc_buffer_per_jiffy(unsigned int rate);
>  
> -static struct tcmsg *tc_make_request(const struct netdev *, int type,
> +static struct tcmsg *tc_make_request(int ifindex, int type,
>                                       unsigned int flags, struct ofpbuf *);
> +static struct tcmsg *netdev_linux_tc_make_request(const struct netdev *,
> +                                                  int type,
> +                                                  unsigned int flags,
> +                                                  struct ofpbuf *);
>  static int tc_transact(struct ofpbuf *request, struct ofpbuf **replyp);
> -static int tc_add_del_ingress_qdisc(struct netdev *netdev, bool add);
> +static int tc_add_del_ingress_qdisc(int ifindex, bool add);
>  static int tc_add_policer(struct netdev *,
>                            uint32_t kbits_rate, uint32_t kbits_burst);
>  
> @@ -2089,6 +2093,7 @@ netdev_linux_set_policing(struct netdev *netdev_,
>  {
>      struct netdev_linux *netdev = netdev_linux_cast(netdev_);
>      const char *netdev_name = netdev_get_name(netdev_);
> +    int ifindex;
>      int error;
>  
>      kbits_burst = (!kbits_rate ? 0       /* Force to 0 if no rate specified. */
> @@ -2106,9 +2111,14 @@ netdev_linux_set_policing(struct netdev *netdev_,
>          netdev->cache_valid &= ~VALID_POLICING;
>      }
>  
> +    error = get_ifindex(netdev_, &ifindex);
> +    if (error) {
> +        goto out;
> +    }
> +
>      COVERAGE_INC(netdev_set_policing);
>      /* Remove any existing ingress qdisc. */
> -    error = tc_add_del_ingress_qdisc(netdev_, false);
> +    error = tc_add_del_ingress_qdisc(ifindex, false);
>      if (error) {
>          VLOG_WARN_RL(&rl, "%s: removing policing failed: %s",
>                       netdev_name, ovs_strerror(error));
> @@ -2116,7 +2126,7 @@ netdev_linux_set_policing(struct netdev *netdev_,
>      }
>  
>      if (kbits_rate) {
> -        error = tc_add_del_ingress_qdisc(netdev_, true);
> +        error = tc_add_del_ingress_qdisc(ifindex, true);
>          if (error) {
>              VLOG_WARN_RL(&rl, "%s: adding policing qdisc failed: %s",
>                           netdev_name, ovs_strerror(error));
> @@ -2385,7 +2395,7 @@ start_queue_dump(const struct netdev *netdev, struct queue_dump_state *state)
>      struct ofpbuf request;
>      struct tcmsg *tcmsg;
>  
> -    tcmsg = tc_make_request(netdev, RTM_GETTCLASS, 0, &request);
> +    tcmsg = netdev_linux_tc_make_request(netdev, RTM_GETTCLASS, 0, &request);
>      if (!tcmsg) {
>          return false;
>      }
> @@ -2944,8 +2954,8 @@ codel_setup_qdisc__(struct netdev *netdev, uint32_t 
> target, uint32_t limit,
>  
>  tc_del_qdisc(netdev);
>  
> -tcmsg = tc_make_request(netdev, RTM_NEWQDISC,
> -NLM_F_EXCL | NLM_F_CREATE, &request);
> +tcmsg = netdev_linux_tc_make_request(netdev, RTM_NEWQDISC,
> + NLM_F_EXCL | NLM_F_CREATE, 
> &request);
>  if (!tcmsg) {
>  return ENODEV;
>  }
> @@ -3162,8 +3172,8 @@ fqcodel_setup_qdisc__(struct netdev *netdev, uint32_t 
> target, uint32_t limit,
>  
>  tc_del_qdisc(netdev);
>  
> -tcmsg = tc_make_request(netdev, RTM_NEWQDISC,
> -NLM_F_EXCL | NLM_F_CREATE, &request);
> +tcmsg = netdev_linux_tc_make_request(netdev, RTM_NEWQDISC,
> + NLM_F_EXCL | NLM_F_CREATE, 
> &request);
>  if (!tcmsg) {
>  return ENODEV;
>  }
> @@ -3386,8 +3396,8 @@ sfq_setup_qdisc__(struct netdev *netdev, uint32_t 
> quantum, uint32_t perturb)
>  
>  tc_del_qdisc(netdev);
>  
> -tcmsg = tc_make_request(netdev, RTM_NEWQDISC,
> -NLM_F_EXCL | NLM_F_CREATE, &request);
> +tcmsg = netdev_linux_tc_make_request(netdev, RTM_NEWQDISC,
> +   

[ovs-dev] [PATCH v2 4/4] ovn: document l3ha redirect-chassis changes

2017-06-08 Thread majopela
From: Miguel Angel Ajo 

This commit documents the ovn-architecture, ovn-nb
and ovn-sb changes to the redirect-chassis option
in Logical_Router_Port and Port_Binding tables.

Signed-off-by: Miguel Angel Ajo 
---
 ovn/ovn-architecture.7.xml |  6 --
 ovn/ovn-nb.xml | 25 +
 ovn/ovn-sb.xml | 13 ++---
 3 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index bce32a6..3c4b8a9 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -1331,8 +1331,10 @@
 simply a way to indicate that although the packet is destined for
 the distributed gateway port, it needs to be redirected to a
 different chassis.  At table 32, packets with this logical egress
-port are sent to a specific chassis, in the same way that table 32
-directs packets whose logical egress port is a VIF or a type
+port are sent to a specific chassis, or to the master chassis from
+a set of chassis defined in the redirect-chassis option,
+in the same way that table 32 directs packets whose logical egress
+port is a VIF or a type
 l3gateway port to different chassis.  Once the packet
 arrives at that chassis, table 33 resets the logical egress port to
 the value representing the distributed gateway port.  For each
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index eb348fe..8c133fb 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -1281,13 +1281,13 @@
 
 
 
-  Even when a redirect-chassis is specified, the
+  Even when redirect-chassis is specified, the
   logical router port still effectively resides on each chassis.
   However, due to the implications of the use of L2 learning in the
   physical network, as well as the need to support advanced features
   such as one-to-many NAT (aka IP masquerading), a subset of the
   logical router processing is handled in a centralized manner on
-  the specified redirect-chassis.
+  the specified set of redirect-chassis.
 
 
 
@@ -1297,10 +1297,18 @@
   column="external_mac" table="NAT"/>s specified in NAT rules are
   automatically programmed in the peer logical switch's
   destination lookup on the chassis where the  resides.  In addition, the
-  logical router's MAC address is automatically programmed in the
-  peer logical switch's destination lookup flow on the
-  redirect-chassis.
+  column="logical_port" table="NAT"/> resides and is master.
+  In addition, the logical router's MAC address is automatically
+  programmed in the peer logical switch's destination lookup flow
+  on the redirect-chassis.
+
+
+  The following format is accepted:
+  chassis1_name[:chassis1_prio]
+  [,chassis2_name[:chassis2_prio]]
+  [...]
+  where priorities are decimal values, and a higher number means
+  higher priority.
 
 
 
@@ -1462,7 +1470,8 @@
 processed in a distributed manner on all chassis.  If this is
 not specified for a NAT rule on a distributed router, then
 this NAT rule will be processed in a centralized manner on
-the gateway port instance on the redirect-chassis.
+the gateway port instance on the master
+redirect-chassis.
   
 
   
@@ -1489,7 +1498,7 @@
 distributed manner on all chassis.  If this is not specified
 for a NAT rule on a distributed router, then this NAT rule
 will be processed in a centralized manner on the gateway
-port instance on the redirect-chassis.
+port instance on the master redirect-chassis.
   
 
   
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index f3c3212..b65e81b 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -1918,7 +1918,7 @@ tcp.flags = RST;
   chassisredirect
   
 A logical port that represents a particular instance, bound
-to a specific chassis, of an otherwise distributed parent
+to a specific set of chassis, of an otherwise distributed parent
 port (e.g. of type patch).  A
 chassisredirect port should never be used as an
 inport.  When an ingress pipeline sets the
@@ -2122,8 +2122,15 @@ tcp.flags = RST;
   
 
   
-The chassis that this chassisredirect port
-is bound to.  This is taken from chassis, with optional priority that
+this chassisredirect port is bound to. With the
+following format:
+  chassis1_name[:chassis1_prio]
+  [,chassis2_name[:chassis2_prio]]
+  [...]
+where priorities are decimal values, and a higher number means
+higher priority.
+This is taken from 
 in the OVN_Northbound database's  table.
-- 
1.8.3.1
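
For readers following the format documented above (chassis1_name[:chassis1_prio][,chassis2_name[:chassis2_prio]][...]), here is a minimal C sketch of how such a string could be parsed. It is illustrative only and is not the code from this series: struct chassis_prio, parse_chassis_list and MAX_NAME are hypothetical names, and the sketch assumes entries are separated by commas only, with a missing priority defaulting to 0.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAME 64   /* Hypothetical limit, not taken from the patch. */

struct chassis_prio {
    char name[MAX_NAME];
    int prio;
};

/* Parses "name[:prio][,name[:prio]]..." into 'out', storing at most 'max'
 * entries.  Entries without a priority get 0.  Empty or over-long names
 * are skipped.  Returns the number of entries stored. */
static size_t
parse_chassis_list(const char *s, struct chassis_prio *out, size_t max)
{
    size_t n = 0;

    while (*s && n < max) {
        size_t len = strcspn(s, ",");            /* Length of this entry. */
        const char *colon = memchr(s, ':', len);
        size_t name_len = colon ? (size_t) (colon - s) : len;

        if (name_len > 0 && name_len < MAX_NAME) {
            memcpy(out[n].name, s, name_len);
            out[n].name[name_len] = '\0';
            /* strtol() stops at the first non-digit, i.e. at ',' or NUL. */
            out[n].prio = colon ? (int) strtol(colon + 1, NULL, 10) : 0;
            n++;
        }

        s += len;
        if (*s == ',') {
            s++;                                 /* Skip the separator. */
        }
    }
    return n;
}
```

With an input such as "gw1:20,gw2:10,gw3" this yields three entries, the last one with priority 0.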


[ovs-dev] [PATCH v2 3/4] ovn: l3ha make is_chassis_active aware of redirect-chassis

2017-06-08 Thread majopela
From: Miguel Angel Ajo 

is_chassis_active is now true for redirect-chassis ports only when
the specific lport is active on the local chassis.

As a result, the ARP responder and redirection OpenFlow rules are
automatically inserted or removed when a router goes active or backup.
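
As a rough illustration of the "active on the local chassis" check, the sketch below mirrors the loop this patch removes from consider_local_datapath(): walk the candidates in priority order, report active if the local chassis is reached first, and report backup as soon as a higher-priority chassis with a live tunnel is seen. It is a simplified, standalone sketch using plain string arrays; chassis_is_active and name_in_set are hypothetical names, not the helpers added by this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if 'name' appears in the 'set' array of 'n' strings. */
static bool
name_in_set(const char *name, const char *const *set, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(set[i], name)) {
            return true;
        }
    }
    return false;
}

/* 'ordered' lists chassis names from highest to lowest priority.
 * 'live' lists the chassis whose tunnels are currently considered up.
 * Returns true if 'local' should act as the active chassis. */
static bool
chassis_is_active(const char *const *ordered, size_t n_ordered,
                  const char *local,
                  const char *const *live, size_t n_live)
{
    for (size_t i = 0; i < n_ordered; i++) {
        if (!strcmp(ordered[i], local)) {
            return true;    /* No live chassis outranks us. */
        }
        if (name_in_set(ordered[i], live, n_live)) {
            return false;   /* A higher-priority chassis is reachable. */
        }
    }
    return false;           /* 'local' is not a candidate at all. */
}
```

For example, with candidates gw1, gw2, gw3 and local chassis gw2, the result is backup while the tunnel to gw1 is up, and active once it goes down.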

Signed-off-by: Miguel Angel Ajo 
---
 ovn/controller/binding.c| 21 +---
 ovn/controller/binding.h|  2 +-
 ovn/controller/lflow.c  | 38 ++--
 ovn/controller/lflow.h  |  4 +++-
 ovn/controller/lport.c  | 43 -
 ovn/controller/lport.h  |  7 ++-
 ovn/controller/ovn-controller.c |  6 --
 7 files changed, 89 insertions(+), 32 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index a2d375c..7ec7c70 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -430,17 +430,8 @@ consider_local_datapath(struct controller_ctx *ctx,
 
 if (redirect_chassis &&
redirect_chassis_contains(redirect_chassis, chassis_rec)) {
-struct redirect_chassis *rc;
-LIST_FOR_EACH (rc, node, redirect_chassis) {
-if (!strcmp(rc->chassis_id, chassis_rec->name)) {
-/* sb_rec_port_binding->chassis should reflect master */
-our_chassis = true;
-break;
-}
-if (sset_contains(active_tunnels, rc->chassis_id)) {
-break;
-}
-}
+our_chassis = redirect_chassis_is_active(
+redirect_chassis, chassis_rec, active_tunnels);
 add_local_datapath(ldatapaths, lports, binding_rec->datapath,
false, local_datapaths, our_chassis);
 }
@@ -491,7 +482,7 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 const struct sbrec_chassis *chassis_rec,
 const struct ldatapath_index *ldatapaths,
 const struct lport_index *lports, struct hmap *local_datapaths,
-struct sset *local_lports)
+struct sset *local_lports, struct sset *active_tunnels)
 {
 if (!chassis_rec) {
 return;
@@ -500,13 +491,12 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 const struct sbrec_port_binding *binding_rec;
 struct shash lport_to_iface = SHASH_INITIALIZER(&lport_to_iface);
 struct sset egress_ifaces = SSET_INITIALIZER(&egress_ifaces);
-struct sset active_tunnels = SSET_INITIALIZER(&active_tunnels);
 struct hmap qos_map;
 
 hmap_init(&qos_map);
 if (br_int) {
 get_local_iface_ids(br_int, &lport_to_iface, local_lports,
-&egress_ifaces, &active_tunnels);
+&egress_ifaces, active_tunnels);
 }
 
 /* Run through each binding record to see if it is resident on this
@@ -517,7 +507,7 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 chassis_rec, binding_rec,
 sset_is_empty(&egress_ifaces) ? NULL :
 &qos_map, local_datapaths, &lport_to_iface,
-local_lports, &active_tunnels);
+local_lports, active_tunnels);
 
 }
 if (!sset_is_empty(&egress_ifaces)
@@ -530,7 +520,6 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 
 shash_destroy(&lport_to_iface);
 sset_destroy(&egress_ifaces);
-sset_destroy(&active_tunnels);
 hmap_destroy(&qos_map);
 }
 
diff --git a/ovn/controller/binding.h b/ovn/controller/binding.h
index 2b88ec5..99c5225 100644
--- a/ovn/controller/binding.h
+++ b/ovn/controller/binding.h
@@ -32,7 +32,7 @@ void binding_register_ovs_idl(struct ovsdb_idl *);
 void binding_run(struct controller_ctx *, const struct ovsrec_bridge *br_int,
  const struct sbrec_chassis *, const struct ldatapath_index *,
  const struct lport_index *, struct hmap *local_datapaths,
- struct sset *all_lports);
+ struct sset *all_lports, struct sset *active_tunnels);
 void bfd_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
  const struct sbrec_chassis *chassis_rec,
  struct hmap *local_datapaths);
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index b7d1bcb..f901bea 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -53,6 +53,7 @@ struct lookup_port_aux {
 struct condition_aux {
 const struct lport_index *lports;
 const struct sbrec_chassis *chassis;
+const struct sset *active_tunnels;
 };
 
 static void consider_logical_flow(const struct lport_index *lports,
@@ -65,7 +66,8 @@ static void consider_logical_flow(const struct lport_index 
*lports,
  

[ovs-dev] [PATCH v2 2/4] ovn: l3ha, enable bfd between tunnel endpoints

2017-06-08 Thread majopela
From: venkata anil 

This patch enables the BFD protocol between gateways and transport nodes.
All gateway nodes with any HA chassisredirect port will enable BFD
to all tunnel endpoints, while transport nodes will enable BFD
to all tunnel endpoints hosting an HA gateway chassisredirect port.
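
The rule above, together with the bfd/bfd_status check that the diff adds to get_local_iface_ids(), can be sketched as two small predicates. This is a minimal sketch under stated assumptions: should_enable_bfd and tunnel_is_active are hypothetical names, and the string arguments stand in for the "enable" and "state" values read from the Interface record.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Decides whether BFD should run on the tunnel between the local chassis
 * and one remote endpoint.  A gateway (a chassis hosting an HA
 * chassisredirect port) enables BFD to every endpoint; a transport node
 * enables it only towards endpoints that host such a port. */
static bool
should_enable_bfd(bool local_hosts_ha_port, bool remote_hosts_ha_port)
{
    return local_hosts_ha_port || remote_hosts_ha_port;
}

/* A tunnel counts as active only when BFD is enabled on it and the
 * reported BFD state is "up".  Either value may be NULL if unset. */
static bool
tunnel_is_active(const char *bfd_enable, const char *bfd_state)
{
    return bfd_enable && !strcmp(bfd_enable, "true")
           && bfd_state && !strcmp(bfd_state, "up");
}
```

Two plain transport nodes therefore do not monitor each other, which keeps the number of BFD sessions proportional to the number of gateways.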

Signed-off-by: Venkata Anil 
Signed-off-by: Miguel Angel Ajo 
Co-Authored-by: Miguel Angel Ajo 
---
 ovn/controller/binding.c| 201 
 ovn/controller/binding.h|   3 +
 ovn/controller/ovn-controller.c |   4 +
 ovn/controller/ovn-controller.h |   3 +
 tests/ovn.at|  62 +
 5 files changed, 255 insertions(+), 18 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index d45c5df..a2d375c 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -58,6 +58,8 @@ binding_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 ovsdb_idl_add_table(ovs_idl, &ovsrec_table_interface);
 ovsdb_idl_add_column(ovs_idl, &ovsrec_interface_col_name);
 ovsdb_idl_add_column(ovs_idl, &ovsrec_interface_col_external_ids);
+ovsdb_idl_add_column(ovs_idl, &ovsrec_interface_col_bfd);
+ovsdb_idl_add_column(ovs_idl, &ovsrec_interface_col_bfd_status);
 ovsdb_idl_add_column(ovs_idl, &ovsrec_interface_col_status);
 
 ovsdb_idl_add_table(ovs_idl, &ovsrec_table_qos);
@@ -68,7 +70,8 @@ static void
 get_local_iface_ids(const struct ovsrec_bridge *br_int,
 struct shash *lport_to_iface,
 struct sset *local_lports,
-struct sset *egress_ifaces)
+struct sset *egress_ifaces,
+struct sset *active_tunnels)
 {
 int i;
 
@@ -100,6 +103,20 @@ get_local_iface_ids(const struct ovsrec_bridge *br_int,
 if (tunnel_iface) {
 sset_add(egress_ifaces, tunnel_iface);
 }
+/* Add ovn-chassis-id if the bfd_status of the tunnel
+ * is active */
+const char *bfd = smap_get(&iface_rec->bfd, "enable");
+if (bfd && !strcmp(bfd, "true")) {
+const char *status = smap_get(&iface_rec->bfd_status,
+  "state");
+if (status && !strcmp(status, "up")) {
+const char *id = smap_get(&port_rec->external_ids,
+  "ovn-chassis-id");
+if (id) {
+sset_add(active_tunnels, id);
+}
+}
+}
 }
 }
 }
@@ -110,7 +127,8 @@ add_local_datapath__(const struct ldatapath_index 
*ldatapaths,
  const struct lport_index *lports,
  const struct sbrec_datapath_binding *datapath,
  bool has_local_l3gateway, int depth,
- struct hmap *local_datapaths)
+ struct hmap *local_datapaths,
+ bool has_local_redirectchassis)
 {
 uint32_t dp_key = datapath->tunnel_key;
 
@@ -119,6 +137,9 @@ add_local_datapath__(const struct ldatapath_index 
*ldatapaths,
 if (has_local_l3gateway) {
 ld->has_local_l3gateway = true;
 }
+if (has_local_redirectchassis) {
+ld->has_local_redirectchassis = true;
+}
 return;
 }
 
@@ -129,6 +150,7 @@ add_local_datapath__(const struct ldatapath_index 
*ldatapaths,
 ovs_assert(ld->ldatapath);
 ld->localnet_port = NULL;
 ld->has_local_l3gateway = has_local_l3gateway;
+ld->has_local_redirectchassis = has_local_redirectchassis;
 
 if (depth >= 100) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
@@ -146,7 +168,14 @@ add_local_datapath__(const struct ldatapath_index 
*ldatapaths,
 lports, peer_name);
 if (peer && peer->datapath) {
 add_local_datapath__(ldatapaths, lports, peer->datapath,
- false, depth + 1, local_datapaths);
+ false, depth + 1, local_datapaths,
+ false);
+ld->n_peer_dps++;
+ld->peer_dps = xrealloc(
+ld->peer_dps,
+ld->n_peer_dps * sizeof *ld->peer_dps);
+ld->peer_dps[ld->n_peer_dps - 1] = ldatapath_lookup_by_key(
+ldatapaths, peer->datapath->tunnel_key);
 }
 }
 }
@@ -157,10 +186,11 @@ static void
 add_local_datapath(const struct ldatapath_index *ldatapaths,
const struct lport_index *lports,
const struct sbrec_datapath_binding *datapath,
-   bool has_local_l3gateway, struct hmap *local_datapaths)
+   bool has_

[ovs-dev] [PATCH v2 1/4] ovn: l3ha, handling of multiple gateways

2017-06-08 Thread majopela
From: Miguel Angel Ajo 

This patch handles multiple gateways with priorities in chassisredirect
ports. Any gateway with a chassisredirect port will implement the
rules to de-encapsulate incoming packets for that port.

Hosts targeting a remote chassisredirect port will set up a
bundle(active_backup, ..) action over the tunnel ports, in the given
priority order.
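
To make the priority order concrete, here is a small standalone sketch of sorting gateways from highest to lowest priority, breaking ties by name, in the spirit of compare_chassis_prio_() in the diff below. struct gw, compare_gw and sort_gateways are hypothetical names for illustration. The comparison uses explicit tests rather than subtracting the two priorities, which sidesteps integer overflow for extreme values. As far as I understand the bundle action, active_backup then picks the first live member in the listed order, so the sorted order is also the failover order.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct gw {
    const char *name;
    int prio;
};

/* qsort() comparator: higher priority first, ties broken by name. */
static int
compare_gw(const void *a_, const void *b_)
{
    const struct gw *a = a_;
    const struct gw *b = b_;

    if (a->prio != b->prio) {
        return a->prio < b->prio ? 1 : -1;
    }
    return strcmp(a->name, b->name);
}

/* Sorts 'gws' in place into failover order. */
static void
sort_gateways(struct gw *gws, size_t n)
{
    qsort(gws, n, sizeof *gws, compare_gw);
}
```

Breaking ties by name keeps the order deterministic across chassis, so every host builds the same member list.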

Signed-off-by: Miguel Angel Ajo 
---
 ovn/controller/binding.c|   9 +--
 ovn/controller/lflow.c  |   6 +-
 ovn/controller/lport.c  | 119 
 ovn/controller/lport.h  |  28 ++
 ovn/controller/ovn-controller.c |   5 +-
 ovn/controller/physical.c   | 114 --
 tests/ovn.at| 114 ++
 7 files changed, 369 insertions(+), 26 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index bb76608..d45c5df 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -394,12 +394,13 @@ consider_local_datapath(struct controller_ctx *ctx,
false, local_datapaths);
 }
 } else if (!strcmp(binding_rec->type, "chassisredirect")) {
-const char *chassis_id = smap_get(&binding_rec->options,
-  "redirect-chassis");
-our_chassis = chassis_id && !strcmp(chassis_id, chassis_rec->name);
-if (our_chassis) {
+if (pb_redirect_chassis_contains(binding_rec, chassis_rec)) {
 add_local_datapath(ldatapaths, lports, binding_rec->datapath,
false, local_datapaths);
+// XXX this should only be set to true if our chassis
+// (chassis_rec) is the master for this chassisredirect port
+// but for now we'll bind it only when not bound
+our_chassis = !binding_rec->chassis;
 }
 } else if (!strcmp(binding_rec->type, "l3gateway")) {
 const char *chassis_id = smap_get(&binding_rec->options,
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index b1b4b23..b7d1bcb 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -96,7 +96,11 @@ is_chassis_resident_cb(const void *c_aux_, const char 
*port_name)
 
 const struct sbrec_port_binding *pb
 = lport_lookup_by_name(c_aux->lports, port_name);
-return pb && pb->chassis && pb->chassis == c_aux->chassis;
+if (pb && pb->chassis && pb->chassis == c_aux->chassis) {
+return true;
+} else {
+return pb_redirect_chassis_contains(pb, c_aux->chassis);
+}
 }
 
 static bool
diff --git a/ovn/controller/lport.c b/ovn/controller/lport.c
index 906fda2..52608a1 100644
--- a/ovn/controller/lport.c
+++ b/ovn/controller/lport.c
@@ -237,3 +237,122 @@ mcgroup_lookup_by_dp_name(const struct mcgroup_index 
*mcgroups,
 }
 return NULL;
 }
+
+
+/* redirect-chassis option parsing
+ */
+static int
+compare_chassis_prio_(const void *a_, const void *b_)
+{
+const struct redirect_chassis *chassis_a = a_;
+const struct redirect_chassis *chassis_b = b_;
+int prio_diff = chassis_b->prio - chassis_a->prio;
+if (!prio_diff) {
+return strcmp(chassis_a->chassis_id, chassis_b->chassis_id);
+}
+return prio_diff;
+}
+
+struct ovs_list*
+parse_redirect_chassis(const struct sbrec_port_binding *binding)
+{
+
+const char *redir_chassis_const;
+char *redir_chassis_str;
+char *save_ptr1 = NULL;
+char *chassis_prio;
+
+struct redirect_chassis *redirect_chassis =
+xmalloc(sizeof *redirect_chassis);
+
+int n=0;
+
+redir_chassis_const = smap_get(&binding->options, "redirect-chassis");
+
+if (!redir_chassis_const) {
+free(redirect_chassis);
+return NULL;
+}
+
+redir_chassis_str = xstrdup(redir_chassis_const);
+
+for (chassis_prio = strtok_r(redir_chassis_str, ", ", &save_ptr1);
+ chassis_prio; chassis_prio = strtok_r(NULL, ", ", &save_ptr1)) {
+
+char *save_ptr2 = NULL;
+char *chassis_name = strtok_r(chassis_prio, ":", &save_ptr2);
+char *prio = strtok_r(NULL, ":", &save_ptr2);
+
+if (strlen(chassis_name) > UUID_LEN) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(&rl, "chassis name (%s) in redirect-chassis option "
+  "of logical port %s is too long, ignoring.",
+  chassis_name, binding->logical_port);
+continue;
+}
+
+ovs_strzcpy(redirect_chassis[n].chassis_id, chassis_name,
+UUID_LEN + 1);
+
+/* chassis with no priority get lowest priority: 0 */
+redirect_chassis[n].prio = prio ? atoi(prio):0;
+
+redirect_chassis = xrealloc(redirect_chassis,
+sizeof *redirect_chassis * (++n + 1));
+
+}
+
+free(redir_chassis_str);
+
+qsort(redirect_chas

Re: [ovs-dev] [RFC PATCH 02/21] ovsschema: Introduce 'keepalive' column in Open_vSwitch.

2017-06-08 Thread Bodireddy, Bhanuprakash
>On Wed, Jun 07, 2017 at 05:14:58PM +0100, Bhanuprakash Bodireddy wrote:
>> This commit adds new ovsdb column "keepalive". It shows the overall
>> datapath status and the health of the cores running datapath threads.
>>
>> Signed-off-by: Bhanuprakash Bodireddy
>> 
>
>I'm a little uncomfortable with having OVS report that it's nonfunctional.  If
>it's dead, then from my point of view the most natural response would be to call
>abort(), to let the monitoring process restart it and presumably fix the
>problem.  What's the guiding philosophy here?

Hello Ben,

In some scenarios it's correct to let the monitoring process instantly restart
OvS in case of failures.

However, as part of the OPNFV Barometer project, key KPI statistics are exposed to
monitor the health of compute nodes. These include CPU, memory, and cache utilization,
link status, packet statistics, networking MIBs, etc. vSwitch health is most
important, and it is exposed by the KA patches to monitoring apps like
collectd, which internally relays the information to the OpenStack service
Ceilometer. As you are aware, Ceilometer only collects the events and metering
data and isn't entitled to take any decisions.

In case of a vSwitch issue, based on the criticality of the failure and also
considering other KPIs from the compute node, fault management services like 'Doctor'
can take action to migrate the VNFs to another compute node and further mark the
compute node as offline so that nova won't schedule VMs on this problematic
compute node.

- Bhanuprakash.


Re: [ovs-dev] [PATCH v5] Update relevant artifacts to add support for DPDK 17.05.

2017-06-08 Thread Kavanagh, Mark B
>From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of mweglicx
>Sent: Thursday, June 8, 2017 1:47 PM
>To: d...@openvswitch.org
>Subject: [ovs-dev] [PATCH v5] Update relevant artifacts to add support for DPDK 17.05.
>
>Upgrading to DPDK 17.05 adds new significant features
>relevant to OVS, including, but not limited to:
>- vhost pmd,
>- tun/tap PMD,
>- VFIO hotplug support,
>- Generic flow API.
>
>Following changes are applied:
>- netdev-dpdk: Changes required by DPDK API modifications.
>- doc: Because of DPDK API changes, backward compatibility
>  with previous DPDK releases will be broken, thus all
>  relevant documentation entries are updated.
>- .travis: DPDK version change from 16.11.1 to 17.05.
>- rhel/openvswitch-fedora.spec.in: DPDK version change
>  from 16.11 to 17.05.
>
>v1->v2: Patch rework based on minor review comments.
>v2->v3: VHOST user client reconfiguration corrected.
>v3->v4: Patch is rebased against OVS master, minor
>rework based on review comments.
>v4->v5: Some error log messages are corrected, NEWS file
>is updated accordingly.
>
>Signed-off-by: Michal Weglicki 

Hi Michal,

Did you get a chance to look at the behavior of the switch when 
rte_vhost_driver_start returns an error to netdev_dpdk_vhost_construct?

In my testing, I found that the result is that ovs-vswitchd terminates, and the 
bridge becomes unresponsive.

Thanks,
Mark 




>---
> .travis/linux-build.sh   |   2 +-
> Documentation/intro/install/dpdk.rst |   8 +-
> Documentation/topics/dpdk/vhost-user.rst |   8 +-
> NEWS |   1 +
> lib/netdev-dpdk.c| 144 +++
> rhel/openvswitch-fedora.spec.in  |   2 +-
> tests/dpdk/ring_client.c |   6 +-
> 7 files changed, 105 insertions(+), 66 deletions(-)
>
>diff --git a/.travis/linux-build.sh b/.travis/linux-build.sh
>index 8750d68..ec15fd8 100755
>--- a/.travis/linux-build.sh
>+++ b/.travis/linux-build.sh
>@@ -80,7 +80,7 @@ fi
>
> if [ "$DPDK" ]; then
> if [ -z "$DPDK_VER" ]; then
>-DPDK_VER="16.11.1"
>+DPDK_VER="17.05"
> fi
> install_dpdk $DPDK_VER
> if [ "$CC" = "clang" ]; then
>diff --git a/Documentation/intro/install/dpdk.rst 
>b/Documentation/intro/install/dpdk.rst
>index e83f852..536450b 100644
>--- a/Documentation/intro/install/dpdk.rst
>+++ b/Documentation/intro/install/dpdk.rst
>@@ -40,7 +40,7 @@ Build requirements
> In addition to the requirements described in :doc:`general`, building Open
> vSwitch with DPDK will require the following:
>
>-- DPDK 16.11
>+- DPDK 17.05
>
> - A `DPDK supported NIC`_
>
>@@ -69,9 +69,9 @@ Install DPDK
> #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
>
>$ cd /usr/src/
>-   $ wget http://fast.dpdk.org/rel/dpdk-16.11.1.tar.xz
>-   $ tar xf dpdk-16.11.1.tar.xz
>-   $ export DPDK_DIR=/usr/src/dpdk-stable-16.11.1
>+   $ wget http://fast.dpdk.org/rel/dpdk-17.05.tar.xz
>+   $ tar xf dpdk-17.05.tar.xz
>+   $ export DPDK_DIR=/usr/src/dpdk-17.05
>$ cd $DPDK_DIR
>
> #. (Optional) Configure DPDK as a shared library
>diff --git a/Documentation/topics/dpdk/vhost-user.rst 
>b/Documentation/topics/dpdk/vhost-
>user.rst
>index a1c19fd..e533206 100644
>--- a/Documentation/topics/dpdk/vhost-user.rst
>+++ b/Documentation/topics/dpdk/vhost-user.rst
>@@ -282,9 +282,9 @@ To begin, instantiate a guest as described in 
>:ref:`dpdk-vhost-user` or
> DPDK sources to VM and build DPDK::
>
> $ cd /root/dpdk/
>-$ wget http://fast.dpdk.org/rel/dpdk-16.11.1.tar.xz
>-$ tar xf dpdk-16.11.1.tar.xz
>-$ export DPDK_DIR=/root/dpdk/dpdk-stable-16.11.1
>+$ wget http://fast.dpdk.org/rel/dpdk-17.05.tar.xz
>+$ tar xf dpdk-17.05.tar.xz
>+$ export DPDK_DIR=/root/dpdk/dpdk-17.05
> $ export DPDK_TARGET=x86_64-native-linuxapp-gcc
> $ export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
> $ cd $DPDK_DIR
>@@ -368,7 +368,7 @@ Sample XML
> 
> 
>   
>-  
>+  
>   
>   
> 
>diff --git a/NEWS b/NEWS
>index 82004c8..6059368 100644
>--- a/NEWS
>+++ b/NEWS
>@@ -16,6 +16,7 @@ Post-v2.7.0
>Log level can be changed in a usual OVS way using
>'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
>still can be configured via extra arguments for DPDK EAL.
>+ * Support for DPDK v17.05.
>- IPFIX now provides additional counters:
>  * Total counters since metering process startup.
>  * Per-flow TCP flag counters.
>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>index b770b70..495640b 100644
>--- a/lib/netdev-dpdk.c
>+++ b/lib/netdev-dpdk.c
>@@ -22,6 +22,9 @@
> #include 
> #include 
> #include 
>+#include 
>+#include 
>+#include 
>
> #include 
> #include 
>@@ -31,7 +34,7 @@
> #include 
> #include 
> #include 
>-#include 
>+#include 
>
> #include "dirs.h"
> #include "dp-packet.h"
>@@ -56,6 +59,8 

Re: [ovs-dev] [PATCH] dpdk: announce deprecation of vhost-user server ports

2017-06-08 Thread Aaron Conole
Hi Darrell,

Thanks so much for the review!  Comments below.

Darrell Ball  writes:

> On 6/7/17, 3:46 PM, "Aaron Conole"  wrote:
>
> Since vhost-user client mode ports are the preferred mechanism for
> interconnecting Open vSwitch with VMs when using DPDK, and since there
> are currently no known use cases for vhost-user server mode ports apart
> from version incompatibilities with QEMU, announce that server mode ports
> are considered deprecated and will be removed in a future release.
> 
> Cc: Ciara Loftus 
> Cc: Kevin Traynor 
> Suggested-by: Darrell Ball 
> Signed-off-by: Aaron Conole 
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 24 
>  NEWS |  2 ++
>  lib/netdev-dpdk.c|  2 ++
>  3 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> b/Documentation/topics/dpdk/vhost-user.rst
> index a1c19fd..9d36cf2 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -32,13 +32,19 @@ documentation`_ on same.
>  Quick Example
>  -
>  
> -This example demonstrates how to add two ``dpdkvhostuser`` ports to an 
> existing
> -bridge called ``br0``::
> +This example demonstrates how to add two ``dpdkvhostuserclient`` ports 
> to an
> +existing bridge called ``br0``::
>  
> -$ ovs-vsctl add-port br0 dpdkvhostuser0 \
> --- set Interface dpdkvhostuser0 type=dpdkvhostuser
> -$ ovs-vsctl add-port br0 dpdkvhostuser1 \
> --- set Interface dpdkvhostuser1 type=dpdkvhostuser
> +$ ovs-vsctl add-port br0 dpdkvhostclient0 \
> +-- set Interface dpdkvhostclient0 type=dpdkvhostuserclient \
> +   options:vhost-server-path=/tmp/dpdkvhostclient0
> +$ ovs-vsctl add-port br0 dpdkvhostclient1 \
> +-- set Interface dpdkvhostclient1 type=dpdkvhostuserclient \
> +   options:vhost-server-path=/tmp/dpdkvhostclient1
> +
> +For the above examples to work, an appropriate server socket must be 
> created
> +at the paths specified (``/tmp/dpdkvhostclient0`` and
> +``/tmp/dpdkvhostclient1``).
>  
>  vhost-user vs. vhost-user-client
>  
> @@ -59,7 +65,8 @@ means if OVS dies, all VMs **must** be restarted. On 
> the other hand, for
>  vhost-user-client ports, OVS acts as the client and QEMU the server. 
> This means
>  OVS can die and be restarted without issue, and it is also possible to 
> restart
>  an instance itself. For this reason, vhost-user-client ports are the 
> preferred
> -type for most use cases.
> +type for most use cases.  Ports of type vhost-user are currently 
> deprecated and
> +will be removed in a future release.
>
> type for all known use cases; the only limitation is that vhost-user client 
> mode ports
> require QEMU version 2.7.  Ports of type vhost-user are currently deprecated 
> and
> will be removed in a future release.

Will update with this verbiage.  Thanks.

>  .. _dpdk-vhost-user:
>  
> @@ -68,7 +75,8 @@ vhost-user
>  
>  .. important::
>  
> -   Use of vhost-user ports requires QEMU >= 2.2
> +   Use of vhost-user ports requires QEMU >= 2.2;  vhost-user ports are
> +   *deprecated*.
>  
>  To use vhost-user ports, you must first add said ports to the switch. 
> DPDK
>  vhost-user ports can have arbitrary names with the exception of forward 
> and
> diff --git a/NEWS b/NEWS
> index 82004c8..b81d033 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -16,6 +16,8 @@ Post-v2.7.0
> Log level can be changed in a usual OVS way using
> 'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
> still can be configured via extra arguments for DPDK EAL.
> + * dpdkvhostuser ports are marked as deprecated.  They will be 
> removed
> +   in an upcoming release.
> - IPFIX now provides additional counters:
>   * Total counters since metering process startup.
>   * Per-flow TCP flag counters.
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index b770b70..9ab4aeb 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -966,6 +966,8 @@ netdev_dpdk_vhost_construct(struct netdev *netdev)
>  err = vhost_common_construct(netdev);
>  
>  ovs_mutex_unlock(&dpdk_mutex);
> +VLOG_WARN_ONCE("dpdkvhostuser ports are considered deprecated;  "
> +   "please migrate to dpdkvhostuserclient ports.");
>
> I think we can:
> 1) Print the socket name and port name
> 2) I am not sure ‘_ONCE’ is required; do you really think the log will have
> that many instances?

My idea to not print the socket / port name is because I figure there
wo

[ovs-dev] [PATCH v5] Update relevant artifacts to add support for DPDK 17.05.

2017-06-08 Thread mweglicx
Upgrading to DPDK 17.05 adds new significant features
relevant to OVS, including, but not limited to:
- vhost pmd,
- tun/tap PMD,
- VFIO hotplug support,
- Generic flow API.

Following changes are applied:
- netdev-dpdk: Changes required by DPDK API modifications.
- doc: Because of DPDK API changes, backward compatibility
  with previous DPDK releases will be broken, thus all
  relevant documentation entries are updated.
- .travis: DPDK version change from 16.11.1 to 17.05.
- rhel/openvswitch-fedora.spec.in: DPDK version change
  from 16.11 to 17.05.

v1->v2: Patch rework based on minor review comments.
v2->v3: VHOST user client reconfiguration corrected.
v3->v4: Patch is rebased against OVS master, minor
rework based on review comments.
v4->v5: Some error log messages are corrected, NEWS file
is updated accordingly.

Signed-off-by: Michal Weglicki 
---
 .travis/linux-build.sh   |   2 +-
 Documentation/intro/install/dpdk.rst |   8 +-
 Documentation/topics/dpdk/vhost-user.rst |   8 +-
 NEWS |   1 +
 lib/netdev-dpdk.c| 144 +++
 rhel/openvswitch-fedora.spec.in  |   2 +-
 tests/dpdk/ring_client.c |   6 +-
 7 files changed, 105 insertions(+), 66 deletions(-)

diff --git a/.travis/linux-build.sh b/.travis/linux-build.sh
index 8750d68..ec15fd8 100755
--- a/.travis/linux-build.sh
+++ b/.travis/linux-build.sh
@@ -80,7 +80,7 @@ fi
 
 if [ "$DPDK" ]; then
 if [ -z "$DPDK_VER" ]; then
-DPDK_VER="16.11.1"
+DPDK_VER="17.05"
 fi
 install_dpdk $DPDK_VER
 if [ "$CC" = "clang" ]; then
diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index e83f852..536450b 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -40,7 +40,7 @@ Build requirements
 In addition to the requirements described in :doc:`general`, building Open
 vSwitch with DPDK will require the following:
 
-- DPDK 16.11
+- DPDK 17.05
 
 - A `DPDK supported NIC`_
 
@@ -69,9 +69,9 @@ Install DPDK
 #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
 
$ cd /usr/src/
-   $ wget http://fast.dpdk.org/rel/dpdk-16.11.1.tar.xz
-   $ tar xf dpdk-16.11.1.tar.xz
-   $ export DPDK_DIR=/usr/src/dpdk-stable-16.11.1
+   $ wget http://fast.dpdk.org/rel/dpdk-17.05.tar.xz
+   $ tar xf dpdk-17.05.tar.xz
+   $ export DPDK_DIR=/usr/src/dpdk-17.05
$ cd $DPDK_DIR
 
 #. (Optional) Configure DPDK as a shared library
diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
index a1c19fd..e533206 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -282,9 +282,9 @@ To begin, instantiate a guest as described in 
:ref:`dpdk-vhost-user` or
 DPDK sources to VM and build DPDK::
 
 $ cd /root/dpdk/
-$ wget http://fast.dpdk.org/rel/dpdk-16.11.1.tar.xz
-$ tar xf dpdk-16.11.1.tar.xz
-$ export DPDK_DIR=/root/dpdk/dpdk-stable-16.11.1
+$ wget http://fast.dpdk.org/rel/dpdk-17.05.tar.xz
+$ tar xf dpdk-17.05.tar.xz
+$ export DPDK_DIR=/root/dpdk/dpdk-17.05
 $ export DPDK_TARGET=x86_64-native-linuxapp-gcc
 $ export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
 $ cd $DPDK_DIR
@@ -368,7 +368,7 @@ Sample XML
 
 
   
-  
+  
   
   
 
diff --git a/NEWS b/NEWS
index 82004c8..6059368 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,7 @@ Post-v2.7.0
Log level can be changed in a usual OVS way using
'ovs-appctl vlog' commands for 'dpdk' module. Lower bound
still can be configured via extra arguments for DPDK EAL.
+ * Support for DPDK v17.05.
- IPFIX now provides additional counters:
  * Total counters since metering process startup.
  * Per-flow TCP flag counters.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index b770b70..495640b 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -22,6 +22,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 #include 
 #include 
@@ -31,7 +34,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include "dirs.h"
 #include "dp-packet.h"
@@ -56,6 +59,8 @@
 #include "timeval.h"
 #include "unixctl.h"
 
+enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
+
 VLOG_DEFINE_THIS_MODULE(netdev_dpdk);
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
 
@@ -170,6 +175,21 @@ static const struct rte_eth_conf port_conf = {
 },
 };
 
+/*
+ * These callbacks allow virtio-net devices to be added to vhost ports when
+ * configuration has been fully completed.
+ */
+static int new_device(int vid);
+static void destroy_device(int vid);
+static int vring_state_changed(int vid, uint16_t queue_id, int enable);
+static const struct vhost_device_ops virtio_net_device_ops =
+{
+.new_device =  new_device,
+.destroy_dev

Re: [ovs-dev] Vhostuser ports - Jumbo MTU issue

2017-06-08 Thread Kavanagh, Mark B


>-Original Message-
>From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of sai kiran
>Sent: Thursday, June 8, 2017 5:59 AM
>To: ovs-dev@openvswitch.org
>Subject: [ovs-dev] Vhostuser ports - Jumbo MTU issue
>
>Hi,
>
>I am trying to use OVS with DPDK, using OVS (2.7.0) and DPDK 16.11.1
>
>[root@localhost openvswitch-2.7.0]# ./utilities/ovs-vsctl -V
>ovs-vsctl (Open vSwitch) 2.7.0
>DB Schema 7.14.0
>
>
>I have added a netdev datapath and vhost-user ports following the steps
>mentioned in OVS manuals.
>
>$OVS_DIR/utilities/ovs-vsctl add-br ovs-br0 -- set bridge ovs-br0
>datapath_type=netdev
>$OVS_DIR/utilities/ovs-vsctl add-port ovs-br0 vhost-user1 -- set Interface
>vhost-user1 type=dpdkvhostuser
>$OVS_DIR/utilities/ovs-vsctl add-port ovs-br0 vhost-user2 -- set Interface
>vhost-user2 type=dpdkvhostuser
>
>
>[root@localhost openvswitch-2.7.0]# ./utilities/ovs-vsctl show
>13552504-fd97-41e0-8bd9-1d68d229aa61
>Bridge "ovs-br0"
>Port "vhost-user1"
>Interface "vhost-user1"
>type: dpdkvhostuser
>Port "vhost-user2"
>Interface "vhost-user2"
>type: dpdkvhostuser
>Port "ovs-br0"
>Interface "ovs-br0"
>type: internal
>
>
>
>Now, when I try to set MTU of the interface to 9000, it is not taking
>effect.
>
>[root@localhost openvswitch-2.7.0]# ./utilities/ovs-vsctl -- set Interface
>vhost-user1 mtu_request=9000
>2017-06-08T04:42:02Z|2|ovsdb_idl|WARN|Interface table in Open_vSwitch
>database lacks mtu_request column (database needs upgrade?)

Hi Sai,

From this error message, it appears that you're using an outdated OvS database.
I suggest removing the existing database configuration and rebuilding it:

sudo rm -rf /usr/local/etc/openvswitch/conf.db
sudo $OVS_DIR/ovsdb/ovsdb-tool create /usr/local/etc/openvswitch/conf.db $OVS_DIR/vswitchd/vswitch.ovsschema

FWIW, I've tested this setup on OvS 2.7.0 with DPDK v16.11.1 and have been 
unable to reproduce the observed behavior.

Thanks,
Mark

>ovs-vsctl: transaction error: {"details":"No column mtu_request in table
>Interface.","error":"unknown column","syntax":"{\"mtu_request\":9000}"}
>
>[root@localhost openvswitch-2.7.0]# ./utilities/ovs-vsctl -- get Interface
>vhost-user1 mtu
>1500
>[root@localhost openvswitch-2.7.0]# ./utilities/ovs-vsctl -- set Interface
>vhost-user1 mtu=9000
>
>[root@localhost openvswitch-2.7.0]# ./utilities/ovs-vsctl -- get Interface
>vhost-user1 mtu
>1500
>
>
>
>Please advise what else I need to do, to successfully set Jumbo MTU on
>vhostuser ports and to use jumbo traffic.
>
>Thanks in advance.
>
>Regards,
>*Sai Kiran*
>___
>dev mailing list
>d...@openvswitch.org
>https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] Add a git-checkpatches script

2017-06-08 Thread majopela
From: Miguel Angel Ajo 

This utility script exports the patches at the tail of your current branch
and runs utilities/checkpatch.py on each of them.

Signed-off-by: Miguel Angel Ajo 
---
 utilities/git-checkpatches | 26 ++
 1 file changed, 26 insertions(+)
 create mode 100755 utilities/git-checkpatches

diff --git a/utilities/git-checkpatches b/utilities/git-checkpatches
new file mode 100755
index 000..d25f41f
--- /dev/null
+++ b/utilities/git-checkpatches
@@ -0,0 +1,26 @@
+#!/bin/bash
+
+# usage: git-checkpatches [number of patches]
+#
+# this script can be used to pass checkpatch on a set of patches
+# of your git history
+#
+
+# just one patch by default
+PATCHES=${1:-1}
+
+DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
+TMPDIR=$(mktemp -d)
+
+git_span="HEAD~$PATCHES..HEAD"
+
+echo "Checking patches $git_span"
+
+git format-patch -n $git_span -o $TMPDIR
+for patch in $TMPDIR/*.patch; do
+echo "Checking patch $patch "
+$DIR/checkpatch.py $patch
+echo ""
+done
+
+rm -rf $TMPDIR
-- 
1.8.3.1



[ovs-dev] Adding a script to help you check a set of patches

2017-06-08 Thread majopela
I found myself repeating the same steps again and again while
checking the set of patches that I'm working on, so I scripted
them and thought this could be useful to others.




Re: [ovs-dev] [PATCH V9 05/31] netdev: Adding a new netdev API to be used for offloading flows

2017-06-08 Thread Roi Dayan



On 31/05/2017 04:37, Joe Stringer wrote:

On 28 May 2017 at 04:59, Roi Dayan  wrote:

From: Paul Blakey 

Add a new API interface for offloading dpif flows to netdev.
The API consist on the following:
  flow_put - offload a new flow
  flow_get - query an offloaded flow
  flow_del - delete an offloaded flow
  flow_flush - flush all offloaded flows
  flow_dump_* - dump all offloaded flows

In upcoming commits we will introduce an implementation of this
API for netdev-linux.
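
The convention discussed in this thread, where a netdev class that cannot offload flows leaves its flow callbacks NULL, can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct toy_netdev_class`, `toy_flow_flush()` and the choice of EOPNOTSUPP for the unsupported case are hypothetical names and choices, not the OVS definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_netdev;

/* Hypothetical, trimmed-down stand-in for a netdev class.  Only one
 * flow-offload callback is modelled; it is NULL when the class cannot
 * offload flows. */
struct toy_netdev_class {
    const char *type;
    int (*flow_flush)(struct toy_netdev *);
};

struct toy_netdev {
    const struct toy_netdev_class *ops;
    int n_flows;                /* Number of "offloaded" flows. */
};

/* Wrapper around the callback.  Returns 0 on success, otherwise a
 * positive errno value.  Returning EOPNOTSUPP for a missing callback is
 * an assumption of this sketch. */
static int
toy_flow_flush(struct toy_netdev *dev)
{
    assert(dev != NULL && dev->ops != NULL);
    return dev->ops->flow_flush ? dev->ops->flow_flush(dev) : EOPNOTSUPP;
}

/* Example implementation for a class that does support offloading. */
static int
offload_flush(struct toy_netdev *dev)
{
    dev->n_flows = 0;
    return 0;
}

static const struct toy_netdev_class offload_class = {
    .type = "offload", .flow_flush = offload_flush,
};

static const struct toy_netdev_class plain_class = {
    .type = "plain", .flow_flush = NULL,
};
```

A caller would go through `toy_flow_flush()` rather than dereferencing the callback directly, so a class without offload support yields an error code instead of a NULL call.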

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---





@@ -769,6 +777,67 @@ struct netdev_class {

 /* Discards all packets waiting to be received from 'rx'. */
 int (*rxq_drain)(struct netdev_rxq *rx);
+
+/* ## -------------------------------- ## */
+/* ## netdev flow offloading functions ## */
+/* ## -------------------------------- ## */
+
+/* If a particular netdev class does not support offloading flows,
+ * all these function pointers must be NULL. */
+
+/* Flush all offloaded flows from a netdev.
+ * Return 0 if successful, otherwise returns a positive errno value. */
+int (*flow_flush)(struct netdev *);
+
+/* Flow dumping interface.
+ *
+ * This is the back-end for the flow dumping interface described in
+ * dpif.h.  Please read the comments there first, because this code
+ * closely follows it.
+ *
+ * 'flow_dump_create' is being executed in a dpif thread so there is
+ * no need for 'flow_dump_thread_create' implementation.


I find this comment a bit confusing, but it's a good thing it was here
because it raises a couple of discussion points.

'flow_dump_thread_create', perhaps poorly named, doesn't create a
thread, but allocates memory for per-thread state so that each thread
may dump safely in parallel while operating on an independent netlink
dump and independent buffers. I guess that in the DPIF flow dump there
is global dump state and per-thread state, while in this netdev flow
dump API there is only global state?

Describing that this interface doesn't need something that isn't being
defined is a bit strange. If it's not needed, then we probably don't
need to describe why it's not needed here since there's no such
function. Then, the comment can be dropped.


+ * On success returns allocated netdev_flow_dump data, on failure returns


^ returns allocated netdev_flow_dump_data "and returns 0"...?


+ * positive errno. */
+int (*flow_dump_create)(struct netdev *, struct netdev_flow_dump **dump);
+int (*flow_dump_destroy)(struct netdev_flow_dump *);
+
+/* Returns true while there are more flows to dump.


s/while/if/


+ * rbuffer is used as a temporary buffer and needs to be pre allocated
+ * by the caller. while there are more flows the same rbuffer should
+ * be provided. wbuffer is used to store dumped actions and needs to be
+ * pre allocated by the caller. */


I have a couple of extra questions which this description could be
expanded to answer:

Who is responsible for freeing 'rbuffer' and 'wbuffer'? I expect the
caller, but this could be more explicit.


The caller. As noted, the function expects them to be preallocated.



Are the pointers that are returned valid beyond the next call to
flow_dump_next()?


Yes. What can we add to make it clear?



Please also capitalize the starts of sentences. (I'll say this once,
but it applies to several of the comments).


ok




+bool (*flow_dump_next)(struct netdev_flow_dump *, struct match *,
+   struct nlattr **actions,
+   struct dpif_flow_stats *stats, ovs_u128 *ufid,
+   struct ofpbuf *rbuffer, struct ofpbuf *wbuffer);
+
+/* Offload the given flow on netdev.
+ * To modify a flow, use the same ufid.
+ * actions are in netlink format, as with struct dpif_flow_put.


Is this "OVS netlink format" or "TC flower netlink format"?


netlink




+ * info is extra info needed to offload the flow.
+ * Read the comments on 'struct dpif_flow_put' in dpif.h about stats.


This sentence about stats is more descriptive if it states something such as:

'stats' is populated according to the rules set out in the description
above 'struct dpif_flow_del'.



ok


+ * Return 0 if successful, otherwise returns a positive errno value. */
+int (*flow_put)(struct netdev *, struct match *, struct nlattr *actions,
+size_t actions_len, const ovs_u128 *ufid,
+struct offload_info *info, struct dpif_flow_stats *);
+
+/* Queries a flow specified by ufid on netdev.
+ * Fills output buffer as wbuffer in flow_dump_next.
+ * the buffer needs to be pre allocated.
+ * Return 0 if successful, otherwise returns a positive errno value. */


How is the caller expected to use the parameters? If it is expected to
use the buffer and interpret its data, that should be described. If
not, then 'buffer' should be described as

[ovs-dev] [PATCH V10 31/33] dpif: Refactor flow logging functions to be used by other modules

2017-06-08 Thread Roi Dayan
Expose the flow logging functions so that other modules can reuse them.

Signed-off-by: Roi Dayan 
Reviewed-by: Paul Blakey 
---
 lib/dpif.c | 87 +++---
 lib/dpif.h | 28 
 2 files changed, 72 insertions(+), 43 deletions(-)

diff --git a/lib/dpif.c b/lib/dpif.c
index 7dc0d64..10bdd70 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -92,24 +92,10 @@ static struct vlog_rate_limit dpmsg_rl = VLOG_RATE_LIMIT_INIT(600, 600);
 /* Not really much point in logging many dpif errors. */
 static struct vlog_rate_limit error_rl = VLOG_RATE_LIMIT_INIT(60, 5);
 
-static void log_flow_message(const struct dpif *dpif, int error,
- const char *operation,
- const struct nlattr *key, size_t key_len,
- const struct nlattr *mask, size_t mask_len,
- const ovs_u128 *ufid,
- const struct dpif_flow_stats *stats,
- const struct nlattr *actions, size_t actions_len);
 static void log_operation(const struct dpif *, const char *operation,
   int error);
-static bool should_log_flow_message(int error);
-static void log_flow_put_message(struct dpif *, const struct dpif_flow_put *,
- int error);
-static void log_flow_del_message(struct dpif *, const struct dpif_flow_del *,
- int error);
-static void log_execute_message(struct dpif *, const struct dpif_execute *,
-bool subexecute, int error);
-static void log_flow_get_message(const struct dpif *,
- const struct dpif_flow_get *, int error);
+static bool should_log_flow_message(const struct vlog_module *module,
+int error);
 
 /* Incremented whenever tnl route, arp, etc changes. */
 struct seq *tnl_conf_seq;
@@ -1125,8 +,9 @@ dpif_flow_dump_next(struct dpif_flow_dump_thread *thread,
 if (n > 0) {
 struct dpif_flow *f;
 
-for (f = flows; f < &flows[n] && should_log_flow_message(0); f++) {
-log_flow_message(dpif, 0, "flow_dump",
+for (f = flows; f < &flows[n]
+ && should_log_flow_message(&this_module, 0); f++) {
+log_flow_message(dpif, 0, &this_module, "flow_dump",
  f->key, f->key_len, f->mask, f->mask_len,
  &f->ufid, &f->stats, f->actions, f->actions_len);
 }
@@ -1231,7 +1218,8 @@ dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_,
 execute.probe = false;
 execute.mtu = 0;
 aux->error = dpif_execute(aux->dpif, &execute);
-log_execute_message(aux->dpif, &execute, true, aux->error);
+log_execute_message(aux->dpif, &this_module, &execute,
+true, aux->error);
 
 dp_packet_delete(clone);
 
@@ -1346,7 +1334,7 @@ dpif_operate(struct dpif *dpif, struct dpif_op **ops, size_t n_ops)
 struct dpif_flow_put *put = &op->u.flow_put;
 
 COVERAGE_INC(dpif_flow_put);
-log_flow_put_message(dpif, put, error);
+log_flow_put_message(dpif, &this_module, put, error);
 if (error && put->stats) {
 memset(put->stats, 0, sizeof *put->stats);
 }
@@ -1360,7 +1348,7 @@ dpif_operate(struct dpif *dpif, struct dpif_op **ops, size_t n_ops)
 if (error) {
 memset(get->flow, 0, sizeof *get->flow);
 }
-log_flow_get_message(dpif, get, error);
+log_flow_get_message(dpif, &this_module, get, error);
 
 break;
 }
@@ -1369,7 +1357,7 @@ dpif_operate(struct dpif *dpif, struct dpif_op **ops, size_t n_ops)
 struct dpif_flow_del *del = &op->u.flow_del;
 
 COVERAGE_INC(dpif_flow_del);
-log_flow_del_message(dpif, del, error);
+log_flow_del_message(dpif, &this_module, del, error);
 if (error && del->stats) {
 memset(del->stats, 0, sizeof *del->stats);
 }
@@ -1378,7 +1366,8 @@ dpif_operate(struct dpif *dpif, struct dpif_op **ops, size_t n_ops)
 
 case DPIF_OP_EXECUTE:
 COVERAGE_INC(dpif_execute);
-log_execute_message(dpif, &op->u.execute, false, error);
+log_execute_message(dpif, &this_module, &op->u.execute,
+false, error);
 break;
 }
 }
@@ -1690,14 +1679,16 @@ flow_message_log_level(int error)
 }
 
 static bool
-should_log_flow_message(int error)
+should_log_flow_message(const struct vlog_module *module, int error)
 {
-return !vlog_should_drop(&this_mod

[ovs-dev] [PATCH V10 30/33] netdev: Init flow api on already added ports on offload enable

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

Ports that were added to a switch before offload is enabled are not
initialized for offloading, so when offload is enabled we need to go over
those ports and initialize them.
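
The idea can be illustrated with a small, single-threaded sketch. Everything here is hypothetical (`struct toy_port`, `toy_port_add()`, `toy_enable_offload()`); the actual patch walks the `port_to_netdev` map under `netdev_hmap_mutex`, which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PORTS 8

/* Hypothetical port record.  'flow_api_ready' stands in for whatever
 * per-port state the real flow API initialization sets up. */
struct toy_port {
    const char *name;
    bool flow_api_ready;
};

static struct toy_port ports[MAX_PORTS];
static size_t n_ports;
static bool offload_enabled;

static void
toy_init_flow_api(struct toy_port *port)
{
    port->flow_api_ready = true;
}

/* Adds a port.  Ports added after offload is enabled are initialized
 * right away; earlier ones are left for toy_enable_offload(). */
static struct toy_port *
toy_port_add(const char *name)
{
    struct toy_port *port;

    assert(n_ports < MAX_PORTS);
    port = &ports[n_ports++];
    port->name = name;
    port->flow_api_ready = false;
    if (offload_enabled) {
        toy_init_flow_api(port);
    }
    return port;
}

/* Enables offload at most once and initializes the ports that already
 * exist, which is the step this patch adds. */
static void
toy_enable_offload(void)
{
    size_t i;

    if (offload_enabled) {
        return;
    }
    offload_enabled = true;
    for (i = 0; i < n_ports; i++) {
        toy_init_flow_api(&ports[i]);
    }
}
```

Without the loop in `toy_enable_offload()`, a port added before the switch-over would never become ready for offloading.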

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/netdev.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/lib/netdev.c b/lib/netdev.c
index 0aae83a..001b7b3 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -2346,6 +2346,18 @@ netdev_ports_flow_get(const void *obj, struct match *match,
 }
 
 #ifdef __linux__
+static void
+netdev_ports_flow_init(void)
+{
+struct port_to_netdev_data *data;
+
+ovs_mutex_lock(&netdev_hmap_mutex);
+HMAP_FOR_EACH(data, node, &port_to_netdev) {
+   netdev_init_flow_api(data->netdev);
+}
+ovs_mutex_unlock(&netdev_hmap_mutex);
+}
+
 void
 netdev_set_flow_api_enabled(const struct smap *ovs_other_config)
 {
@@ -2360,6 +2372,8 @@ netdev_set_flow_api_enabled(const struct smap *ovs_other_config)
 tc_set_policy(smap_get_def(ovs_other_config, "tc-policy",
TC_POLICY_DEFAULT));
 
+netdev_ports_flow_init();
+
 ovsthread_once_done(&once);
 }
 }
-- 
2.7.4



[ovs-dev] [PATCH V10 29/33] tests: Add system-offloads-testsuite

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

The new system-offloads-testsuite, which can be launched via
`make check-offloads`, tests offloading capabilities
to make sure that certain flows are actually offloaded.

The tests run on virtual netdevices (VETH).

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 tests/.gitignore   |  1 +
 tests/automake.mk  | 16 +
 tests/ofproto-macros.at|  6 ++--
 tests/system-offloads-testsuite.at | 25 ++
 tests/system-offloads-traffic.at   | 67 ++
 5 files changed, 113 insertions(+), 2 deletions(-)
 create mode 100644 tests/system-offloads-testsuite.at
 create mode 100644 tests/system-offloads-traffic.at

diff --git a/tests/.gitignore b/tests/.gitignore
index f4540a3..77e5a95 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -12,6 +12,7 @@
 /pki/
 /system-kmod-testsuite
 /system-userspace-testsuite
+/system-offloads-testsuite
 /test-aes128
 /test-atomic
 /test-bundle
diff --git a/tests/automake.mk b/tests/automake.mk
index c6bd120..e88c622 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -4,9 +4,11 @@ EXTRA_DIST += \
$(SYSTEM_TESTSUITE_AT) \
$(SYSTEM_KMOD_TESTSUITE_AT) \
$(SYSTEM_USERSPACE_TESTSUITE_AT) \
+   $(SYSTEM_OFFLOADS_TESTSUITE_AT) \
$(TESTSUITE) \
$(SYSTEM_KMOD_TESTSUITE) \
$(SYSTEM_USERSPACE_TESTSUITE) \
+   $(SYSTEM_OFFLOADS_TESTSUITE) \
tests/atlocal.in \
$(srcdir)/package.m4 \
$(srcdir)/tests/testsuite \
@@ -112,12 +114,18 @@ SYSTEM_TESTSUITE_AT = \
tests/system-ovn.at \
tests/system-traffic.at
 
+SYSTEM_OFFLOADS_TESTSUITE_AT = \
+   tests/system-common-macros.at \
+   tests/system-offloads-traffic.at \
+   tests/system-offloads-testsuite.at
+
 check_SCRIPTS += tests/atlocal
 
 TESTSUITE = $(srcdir)/tests/testsuite
 TESTSUITE_PATCH = $(srcdir)/tests/testsuite.patch
 SYSTEM_KMOD_TESTSUITE = $(srcdir)/tests/system-kmod-testsuite
 SYSTEM_USERSPACE_TESTSUITE = $(srcdir)/tests/system-userspace-testsuite
+SYSTEM_OFFLOADS_TESTSUITE = $(srcdir)/tests/system-offloads-testsuite
 DISTCLEANFILES += tests/atconfig tests/atlocal
 
 AUTOTEST_PATH = utilities:vswitchd:ovsdb:vtep:tests:$(PTHREAD_WIN32_DIR_DLL):ovn/controller-vtep:ovn/northd:ovn/utilities:ovn/controller
@@ -229,6 +237,10 @@ check-system-userspace: all
set $(SHELL) '$(SYSTEM_USERSPACE_TESTSUITE)' -C tests  AUTOTEST_PATH='$(AUTOTEST_PATH)' $(TESTSUITEFLAGS) -j1; \
"$$@" || (test X'$(RECHECK)' = Xyes && "$$@" --recheck)
 
+check-offloads: all
+   set $(SHELL) '$(SYSTEM_OFFLOADS_TESTSUITE)' -C tests  AUTOTEST_PATH='$(AUTOTEST_PATH)' $(TESTSUITEFLAGS) -j1; \
+   "$$@" || (test X'$(RECHECK)' = Xyes && "$$@" --recheck)
+
 clean-local:
test ! -f '$(TESTSUITE)' || $(SHELL) '$(TESTSUITE)' -C tests --clean
 
@@ -253,6 +265,10 @@ $(SYSTEM_USERSPACE_TESTSUITE): package.m4 $(SYSTEM_TESTSUITE_AT) $(SYSTEM_USERSP
$(AM_V_GEN)$(AUTOTEST) -I '$(srcdir)' -o $@.tmp $@.at
$(AM_V_at)mv $@.tmp $@
 
+$(SYSTEM_OFFLOADS_TESTSUITE): package.m4 $(SYSTEM_TESTSUITE_AT) $(SYSTEM_OFFLOADS_TESTSUITE_AT) $(COMMON_MACROS_AT)
+   $(AM_V_GEN)$(AUTOTEST) -I '$(srcdir)' -o $@.tmp $@.at
+   $(AM_V_at)mv $@.tmp $@
+
 # The `:;' works around a Bash 3.2 bug when the output is not writeable.
 $(srcdir)/package.m4: $(top_srcdir)/configure.ac
$(AM_V_GEN):;{ \
diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
index faff5b0..0adf555 100644
--- a/tests/ofproto-macros.at
+++ b/tests/ofproto-macros.at
@@ -317,7 +317,7 @@ m4_define([_OVS_VSWITCHD_START],
AT_CAPTURE_FILE([ovsdb-server.log])
 
dnl Initialize database.
-   AT_CHECK([ovs-vsctl --no-wait init])
+   AT_CHECK([ovs-vsctl --no-wait init $2])
 
dnl Start ovs-vswitchd.
AT_CHECK([ovs-vswitchd $1 --detach --no-chdir --pidfile --log-file -vvconn -vofproto_dpif -vunixctl], [0], [], [stderr])
@@ -331,7 +331,9 @@ m4_define([_OVS_VSWITCHD_START],
 /ofproto|INFO|using datapath ID/d
 /netdev_linux|INFO|.*device has unknown hardware address family/d
 /ofproto|INFO|datapath ID changed to fedcba9876543210/d
-/dpdk|INFO|DPDK Disabled - Use other_config:dpdk-init to enable/d']])
+/dpdk|INFO|DPDK Disabled - Use other_config:dpdk-init to enable/d
+/netdev: Flow API/d
+/tc: Using policy/d']])
 ])
 
 # OVS_VSWITCHD_START([vsctl-args], [vsctl-output], [=override],
diff --git a/tests/system-offloads-testsuite.at b/tests/system-offloads-testsuite.at
new file mode 100644
index 000..eb5d2d4
--- /dev/null
+++ b/tests/system-offloads-testsuite.at
@@ -0,0 +1,25 @@
+AT_INIT
+
+AT_COPYRIGHT([Copyright (c) 2016 Mellanox Technologies, Ltd.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at:
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agree

[ovs-dev] [PATCH V10 32/33] dpif-netlink: Use dpif logging functions

2017-06-08 Thread Roi Dayan
Remove redundant logging functions and reuse
the exposed dpif logging functions.

Signed-off-by: Roi Dayan 
Reviewed-by: Paul Blakey 
---
 lib/dpif-netlink.c | 39 ++-
 1 file changed, 6 insertions(+), 33 deletions(-)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 75cd228..13c797f 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -2183,32 +2183,6 @@ out:
 return err;
 }
 
-static void
-dbg_print_flow(const struct nlattr *key, size_t key_len,
-   const struct nlattr *mask, size_t mask_len,
-   const struct nlattr *actions, size_t actions_len,
-   const ovs_u128 *ufid,
-   const char *op)
-{
-struct ds s;
-
-ds_init(&s);
-ds_put_cstr(&s, op);
-ds_put_cstr(&s, " (");
-odp_format_ufid(ufid, &s);
-ds_put_cstr(&s, ")");
-if (key_len) {
-ds_put_cstr(&s, "\nflow (verbose): ");
-odp_flow_format(key, key_len, mask, mask_len, NULL, &s, true);
-ds_put_cstr(&s, "\nflow: ");
-odp_flow_format(key, key_len, mask, mask_len, NULL, &s, false);
-}
-ds_put_cstr(&s, "\nactions: ");
-format_odp_actions(&s, actions, actions_len);
-VLOG_DBG("\n%s", ds_cstr(&s));
-ds_destroy(&s);
-}
-
 static int
 try_send_to_netdev(struct dpif_netlink *dpif, struct dpif_op *op)
 {
@@ -2221,9 +2195,8 @@ try_send_to_netdev(struct dpif_netlink *dpif, struct dpif_op *op)
 if (!put->ufid) {
 break;
 }
-dbg_print_flow(put->key, put->key_len, put->mask, put->mask_len,
-   put->actions, put->actions_len, put->ufid,
-   (put->flags & DPIF_FP_MODIFY ? "PUT(MODIFY)" : "PUT"));
+
+log_flow_put_message(&dpif->dpif, &this_module, put, 0);
 err = parse_flow_put(dpif, put);
 break;
 }
@@ -2233,8 +2206,8 @@ try_send_to_netdev(struct dpif_netlink *dpif, struct dpif_op *op)
 if (!del->ufid) {
 break;
 }
-dbg_print_flow(del->key, del->key_len, NULL, 0, NULL, 0,
-   del->ufid, "DEL");
+
+log_flow_del_message(&dpif->dpif, &this_module, del, 0);
 err = netdev_ports_flow_del(DPIF_HMAP_KEY(&dpif->dpif), del->ufid,
 del->stats);
 break;
@@ -2245,8 +2218,8 @@ try_send_to_netdev(struct dpif_netlink *dpif, struct dpif_op *op)
 if (!op->u.flow_get.ufid) {
 break;
 }
-dbg_print_flow(get->key, get->key_len, NULL, 0, NULL, 0,
-   get->ufid, "GET");
+
+log_flow_get_message(&dpif->dpif, &this_module, get, 0);
 err = parse_flow_get(dpif, get);
 break;
 }
-- 
2.7.4



[ovs-dev] [PATCH V10 33/33] NEWS: add a note about hw offloading

2017-06-08 Thread Roi Dayan
Signed-off-by: Roi Dayan 
---
 NEWS | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/NEWS b/NEWS
index 82004c8..fd2c98e 100644
--- a/NEWS
+++ b/NEWS
@@ -58,6 +58,9 @@ Post-v2.7.0
  * Transparently pop and push Ethernet headers at transmit/reception
of packets to/from L3 tunnels.
- The BFD detection multiplier is now user-configurable.
+   - New support for HW offloading
+ * HW offloading is disabled by default.
+ * HW offloading is done through the TC interface.
 
 v2.7.0 - 21 Feb 2017
 -
-- 
2.7.4



[ovs-dev] [PATCH V10 27/33] dpctl: Add an option to dump only certain kinds of flows

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

Usage:
# to dump all datapath flows (default):
ovs-dpctl dump-flows

# to dump only flows that are in the kernel datapath:
ovs-dpctl dump-flows type=ovs

# to dump only flows that are offloaded:
ovs-dpctl dump-flows type=offloaded
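
The way trailing `filter=` and `type=` arguments are peeled off the command line can be sketched on its own. The sketch below follows the shape of the loop in the patch but is not the patch code: `struct dump_args`, `parse_dump_args()` and `dump_type_is_supported()` are hypothetical names, and the strings alias `argv` instead of being copied with `xstrdup()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Result of scanning the argument vector.  The pointers alias argv;
 * nothing is copied in this sketch. */
struct dump_args {
    const char *filter;   /* Text after "filter=", or NULL. */
    const char *type;     /* Text after "type=", or NULL. */
    int argc;             /* Arguments left once the options are consumed. */
};

/* Peels "filter=..." and "type=..." off the end of argv, in either order,
 * accepting each at most once.  Stops at the first trailing argument that
 * is neither, or that repeats an option already seen. */
static struct dump_args
parse_dump_args(int argc, const char *argv[])
{
    struct dump_args out = { NULL, NULL, argc };
    int last = 0;

    assert(argc >= 1 && argv != NULL);
    while (out.argc > 1 && last != out.argc) {
        const char *arg = argv[out.argc - 1];

        last = out.argc;
        if (!strncmp(arg, "filter=", 7) && !out.filter) {
            out.filter = arg + 7;
            out.argc--;
        } else if (!strncmp(arg, "type=", 5) && !out.type) {
            out.type = arg + 5;
            out.argc--;
        }
    }
    return out;
}

/* Returns nonzero if 'type' is one of the two values named in the usage
 * text above. */
static int
dump_type_is_supported(const char *type)
{
    return !strcmp(type, "ovs") || !strcmp(type, "offloaded");
}
```

Whatever remains in `argv[1]` after the loop is treated as the datapath name, which is why the options have to come last on the command line.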

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/dpctl.c   | 44 --
 lib/dpctl.man |  7 -
 lib/dpif-netdev.c |  3 ++-
 lib/dpif-netlink.c| 63 ++-
 lib/dpif-provider.h   |  6 +++--
 lib/dpif.c|  4 +--
 lib/dpif.h|  3 ++-
 ofproto/ofproto-dpif-upcall.c |  3 ++-
 ofproto/ofproto-dpif.c|  2 +-
 9 files changed, 106 insertions(+), 29 deletions(-)

diff --git a/lib/dpctl.c b/lib/dpctl.c
index 2dfaeeb..a2ee8a2 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -754,6 +754,11 @@ format_dpif_flow(struct ds *ds, const struct dpif_flow *f, struct hmap *ports,
 format_odp_actions(ds, f->actions, f->actions_len);
 }
 
+static char *supported_dump_types[] = {
+"offloaded",
+"ovs",
+};
+
 static int
 dpctl_dump_flows(int argc, const char *argv[], struct dpctl_params *dpctl_p)
 {
@@ -762,6 +767,7 @@ dpctl_dump_flows(int argc, const char *argv[], struct dpctl_params *dpctl_p)
 char *name;
 
 char *filter = NULL;
+char *type = NULL;
 struct flow flow_filter;
 struct flow_wildcards wc_filter;
 
@@ -773,22 +779,29 @@ dpctl_dump_flows(int argc, const char *argv[], struct dpctl_params *dpctl_p)
 struct dpif_flow_dump *flow_dump;
 struct dpif_flow f;
 int pmd_id = PMD_ID_NULL;
+int lastargc = 0;
 int error;
 
-if (argc > 1 && !strncmp(argv[argc - 1], "filter=", 7)) {
-filter = xstrdup(argv[--argc] + 7);
+while (argc > 1 && lastargc != argc) {
+lastargc = argc;
+if (!strncmp(argv[argc - 1], "filter=", 7) && !filter) {
+filter = xstrdup(argv[--argc] + 7);
+} else if (!strncmp(argv[argc - 1], "type=", 5) && !type) {
+type = xstrdup(argv[--argc] + 5);
+}
 }
-name = (argc == 2) ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
+
+name = (argc > 1) ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
 if (!name) {
 error = EINVAL;
-goto out_freefilter;
+goto out_free;
 }
 
 error = parsed_dpif_open(name, false, &dpif);
 free(name);
 if (error) {
 dpctl_error(dpctl_p, error, "opening datapath");
-goto out_freefilter;
+goto out_free;
 }
 
 
@@ -816,6 +829,20 @@ dpctl_dump_flows(int argc, const char *argv[], struct dpctl_params *dpctl_p)
 }
 }
 
+if (type) {
+error = EINVAL;
+for (int i = 0; i < ARRAY_SIZE(supported_dump_types); i++) {
+if (!strcmp(supported_dump_types[i], type)) {
+error = 0;
+break;
+}
+}
+if (error) {
+dpctl_error(dpctl_p, error, "Failed to parse type (%s)", type);
+goto out_free;
+}
+}
+
 /* Make sure that these values are different. PMD_ID_NULL means that the
  * pmd is unspecified (e.g. because the datapath doesn't have different
  * pmd threads), while NON_PMD_CORE_ID refers to every non pmd threads
@@ -823,7 +850,7 @@ dpctl_dump_flows(int argc, const char *argv[], struct dpctl_params *dpctl_p)
 BUILD_ASSERT(PMD_ID_NULL != NON_PMD_CORE_ID);
 
 ds_init(&ds);
-flow_dump = dpif_flow_dump_create(dpif, false);
+flow_dump = dpif_flow_dump_create(dpif, false, (type ? type : "dpctl"));
 flow_dump_thread = dpif_flow_dump_thread_create(flow_dump);
 while (dpif_flow_dump_next(flow_dump_thread, &f, 1)) {
 if (filter) {
@@ -874,8 +901,9 @@ out_dpifclose:
 odp_portno_names_destroy(&portno_names);
 hmap_destroy(&portno_names);
 dpif_close(dpif);
-out_freefilter:
+out_free:
 free(filter);
+free(type);
 return error;
 }
 
@@ -1558,7 +1586,7 @@ static const struct dpctl_command all_commands[] = {
 { "set-if", "dp iface...", 2, INT_MAX, dpctl_set_if, DP_RW },
 { "dump-dps", "", 0, 0, dpctl_dump_dps, DP_RO },
 { "show", "[dp...]", 0, INT_MAX, dpctl_show, DP_RO },
-{ "dump-flows", "[dp] [filter=..]", 0, 2, dpctl_dump_flows, DP_RO },
+{ "dump-flows", "[dp] [filter=..] [type=..]", 0, 3, dpctl_dump_flows, DP_RO },
 { "add-flow", "[dp] flow actions", 2, 3, dpctl_add_flow, DP_RW },
 { "mod-flow", "[dp] flow actions", 2, 3, dpctl_mod_flow, DP_RW },
 { "get-flow", "[dp] ufid", 1, 2, dpctl_get_flow, DP_RO },
diff --git a/lib/dpctl.man b/lib/dpctl.man
index f7ae311..f6e4a7a 100644
--- a/lib/dpctl.man
+++ b/lib/dpctl.man
@@ -99,7 +99,7 @@ default.  When multiple datapaths exist, then a datapath name is
 required.
 .
 .TP
-.DO "[\fB\-m \fR| \fB\-\-more\fR]" \*(DX\fBdump\-flows\fR "[\fIdp\fR] [\fBfilter=\fIfilter\fR]"
+.DO "[\fB\-m \fR| \fB\-

[ovs-dev] [PATCH V10 25/33] netdev-tc-offloads: Add ingress on netdev flow api init

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/netdev-tc-offloads.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c
index d1036d1..26f1c1b 100644
--- a/lib/netdev-tc-offloads.c
+++ b/lib/netdev-tc-offloads.c
@@ -33,6 +33,7 @@
 #include "hash.h"
 #include "dpif.h"
 #include "tc.h"
+#include "netdev-linux.h"
 
 VLOG_DEFINE_THIS_MODULE(netdev_tc_offloads);
 
@@ -923,7 +924,27 @@ netdev_tc_flow_del(struct netdev *netdev OVS_UNUSED,
 }
 
 int
-netdev_tc_init_flow_api(struct netdev *netdev OVS_UNUSED)
+netdev_tc_init_flow_api(struct netdev *netdev)
 {
+int ifindex;
+int error;
+
+ifindex = netdev_get_ifindex(netdev);
+if (ifindex < 0) {
+VLOG_ERR_RL(&error_rl, "failed to get ifindex for %s: %s",
+netdev_get_name(netdev), ovs_strerror(-ifindex));
+return -ifindex;
+}
+
+error = tc_add_del_ingress_qdisc(ifindex, true);
+
+if (error && error != EEXIST) {
+VLOG_ERR("failed adding ingress qdisc required for offloading: %s",
+ ovs_strerror(error));
+return error;
+}
+
+VLOG_INFO("added ingress qdisc to %s", netdev_get_name(netdev));
+
 return 0;
 }
-- 
2.7.4



[ovs-dev] [PATCH V10 26/33] dpctl: Add filter arg to dump-flows command info

2017-06-08 Thread Roi Dayan
Mention the filter argument in the command usage so that it appears in
bash completion.

Signed-off-by: Roi Dayan 
---
 lib/dpctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/dpctl.c b/lib/dpctl.c
index ae23913..2dfaeeb 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1558,7 +1558,7 @@ static const struct dpctl_command all_commands[] = {
 { "set-if", "dp iface...", 2, INT_MAX, dpctl_set_if, DP_RW },
 { "dump-dps", "", 0, 0, dpctl_dump_dps, DP_RO },
 { "show", "[dp...]", 0, INT_MAX, dpctl_show, DP_RO },
-{ "dump-flows", "[dp]", 0, 2, dpctl_dump_flows, DP_RO },
+{ "dump-flows", "[dp] [filter=..]", 0, 2, dpctl_dump_flows, DP_RO },
 { "add-flow", "[dp] flow actions", 2, 3, dpctl_add_flow, DP_RW },
 { "mod-flow", "[dp] flow actions", 2, 3, dpctl_mod_flow, DP_RW },
 { "get-flow", "[dp] ufid", 1, 2, dpctl_get_flow, DP_RO },
-- 
2.7.4



[ovs-dev] [PATCH V10 24/33] netdev-vport: Use common offloads interface

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

netdev vports are backed by an actual netdev at the kernel
level, so they can use the common netdev-tc offloads interface
for flow offloading (if enabled).

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/netdev-linux.c |  7 +++---
 lib/netdev-linux.h |  2 ++
 lib/netdev-vport.c | 65 +++---
 3 files changed, 52 insertions(+), 22 deletions(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index ce0a153..f5dc30f 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -530,7 +530,6 @@ static int set_flags(const char *, unsigned int flags);
 static int update_flags(struct netdev_linux *netdev, enum netdev_flags off,
 enum netdev_flags on, enum netdev_flags *old_flagsp)
 OVS_REQUIRES(netdev->mutex);
-static int do_get_ifindex(const char *netdev_name);
 static int get_ifindex(const struct netdev *, int *ifindexp);
 static int do_set_addr(struct netdev *netdev,
int ioctl_nr, const char *ioctl_name,
@@ -5414,8 +5413,8 @@ set_flags(const char *name, unsigned int flags)
 return af_inet_ifreq_ioctl(name, &ifr, SIOCSIFFLAGS, "SIOCSIFFLAGS");
 }
 
-static int
-do_get_ifindex(const char *netdev_name)
+int
+linux_get_ifindex(const char *netdev_name)
 {
 struct ifreq ifr;
 int error;
@@ -5438,7 +5437,7 @@ get_ifindex(const struct netdev *netdev_, int *ifindexp)
 struct netdev_linux *netdev = netdev_linux_cast(netdev_);
 
 if (!(netdev->cache_valid & VALID_IFINDEX)) {
-int ifindex = do_get_ifindex(netdev_get_name(netdev_));
+int ifindex = linux_get_ifindex(netdev_get_name(netdev_));
 
 if (ifindex < 0) {
 netdev->get_ifindex_error = -ifindex;
diff --git a/lib/netdev-linux.h b/lib/netdev-linux.h
index d944691..880f864 100644
--- a/lib/netdev-linux.h
+++ b/lib/netdev-linux.h
@@ -27,6 +27,7 @@ struct netdev;
 
 int netdev_linux_ethtool_set_flag(struct netdev *netdev, uint32_t flag,
   const char *flag_name, bool enable);
+int linux_get_ifindex(const char *netdev_name);
 
 #define LINUX_FLOW_OFFLOAD_API  \
 netdev_tc_flow_flush,   \
@@ -37,4 +38,5 @@ int netdev_linux_ethtool_set_flag(struct netdev *netdev, uint32_t flag,
 netdev_tc_flow_get, \
 netdev_tc_flow_del, \
 netdev_tc_init_flow_api
+
 #endif /* netdev-linux.h */
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index fc02438..640cdbe 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -45,6 +45,10 @@
 #include "unaligned.h"
 #include "unixctl.h"
 #include "openvswitch/vlog.h"
+#include "netdev-tc-offloads.h"
+#ifdef __linux__
+#include "netdev-linux.h"
+#endif
 
 VLOG_DEFINE_THIS_MODULE(netdev_vport);
 
@@ -806,10 +810,37 @@ get_stats(const struct netdev *netdev, struct netdev_stats *stats)
 }
 
 
+#ifdef __linux__
+static int
+netdev_vport_get_ifindex__(const struct netdev *netdev_)
+{
+char buf[NETDEV_VPORT_NAME_BUFSIZE];
+const char *name = netdev_vport_get_dpif_port(netdev_, buf, sizeof(buf));
+
+return linux_get_ifindex(name);
+}
+
+static int
+netdev_vport_get_ifindex(const struct netdev *netdev_)
+{
+if (netdev_is_flow_api_enabled())
+return netdev_vport_get_ifindex__(netdev_);
+else
+return -EOPNOTSUPP;
+}
+
+#define NETDEV_VPORT_GET_IFINDEX netdev_vport_get_ifindex
+#define NETDEV_FLOW_OFFLOAD_API LINUX_FLOW_OFFLOAD_API
+#else /* !__linux__ */
+#define NETDEV_VPORT_GET_IFINDEX NULL
+#define NETDEV_FLOW_OFFLOAD_API NO_OFFLOAD_API
+#endif /* __linux__ */
+
 #define VPORT_FUNCTIONS(GET_CONFIG, SET_CONFIG, \
 GET_TUNNEL_CONFIG, GET_STATUS,  \
 BUILD_HEADER,   \
-PUSH_HEADER, POP_HEADER)\
+PUSH_HEADER, POP_HEADER,\
+GET_IFINDEX)\
 NULL,   \
 netdev_vport_run,   \
 netdev_vport_wait,  \
@@ -834,7 +865,7 @@ get_stats(const struct netdev *netdev, struct netdev_stats *stats)
 netdev_vport_get_etheraddr, \
 NULL,   /* get_mtu */   \
 NULL,   /* set_mtu */   \
-NULL,   /* get_ifindex */   \
+GET_IFINDEX,\
 NULL,   /* get_carrier */   \
 NULL,   /* get_carrier_resets */\
 NULL,   /* get_miimon */\
@@ -875,24 +906,19 @@ get_stats(const struct netdev *netdev, struct netdev_stats *stats)
 NULL,

[ovs-dev] [PATCH V10 28/33] dpctl: Indicate if flow is offloaded when dumping flows of all types

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

When verbosity is requested on dump-flows (-m) indicate which flows
are offloaded.

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
---
 lib/dpctl.c| 11 ---
 lib/dpif-netlink.c |  4 
 lib/dpif.h |  1 +
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/lib/dpctl.c b/lib/dpctl.c
index a2ee8a2..7f44d02 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -739,7 +739,7 @@ dpctl_dump_dps(int argc OVS_UNUSED, const char *argv[] OVS_UNUSED,
 
 static void
 format_dpif_flow(struct ds *ds, const struct dpif_flow *f, struct hmap *ports,
- struct dpctl_params *dpctl_p)
+ char *type, struct dpctl_params *dpctl_p)
 {
 if (dpctl_p->verbosity && f->ufid_present) {
 odp_format_ufid(&f->ufid, ds);
@@ -750,6 +750,9 @@ format_dpif_flow(struct ds *ds, const struct dpif_flow *f, struct hmap *ports,
 ds_put_cstr(ds, ", ");
 
 dpif_flow_stats_format(&f->stats, ds);
+if (dpctl_p->verbosity && !type && f->offloaded) {
+ds_put_cstr(ds, ", offloaded:yes");
+}
 ds_put_cstr(ds, ", actions:");
 format_odp_actions(ds, f->actions, f->actions_len);
 }
@@ -850,6 +853,7 @@ dpctl_dump_flows(int argc, const char *argv[], struct dpctl_params *dpctl_p)
 BUILD_ASSERT(PMD_ID_NULL != NON_PMD_CORE_ID);
 
 ds_init(&ds);
+memset(&f, 0, sizeof f);
 flow_dump = dpif_flow_dump_create(dpif, false, (type ? type : "dpctl"));
 flow_dump_thread = dpif_flow_dump_thread_create(flow_dump);
 while (dpif_flow_dump_next(flow_dump_thread, &f, 1)) {
@@ -886,7 +890,8 @@ dpctl_dump_flows(int argc, const char *argv[], struct dpctl_params *dpctl_p)
 }
 pmd_id = f.pmd_id;
 }
-format_dpif_flow(&ds, &f, &portno_names, dpctl_p);
+format_dpif_flow(&ds, &f, &portno_names, type, dpctl_p);
+
 dpctl_print(dpctl_p, "%s\n", ds_cstr(&ds));
 }
 dpif_flow_dump_thread_destroy(flow_dump_thread);
@@ -1069,7 +1074,7 @@ dpctl_get_flow(int argc, const char *argv[], struct dpctl_params *dpctl_p)
 }
 
 ds_init(&ds);
-format_dpif_flow(&ds, &flow, &portno_names, dpctl_p);
+format_dpif_flow(&ds, &flow, &portno_names, NULL, dpctl_p);
 dpctl_print(dpctl_p, "%s\n", ds_cstr(&ds));
 ds_destroy(&ds);
 
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index f10c638..75cd228 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -1639,6 +1639,7 @@ dpif_netlink_flow_to_dpif_flow(struct dpif *dpif, struct dpif_flow *dpif_flow,
&dpif_flow->ufid);
 }
 dpif_netlink_flow_get_stats(datapath_flow, &dpif_flow->stats);
+dpif_flow->offloaded = false;
 }
 
 /* The design is such that all threads are working together on the first dump
@@ -1718,6 +1719,9 @@ dpif_netlink_netdev_match_to_dpif_flow(struct match *match,
 flow->ufid = *ufid;
 
 flow->pmd_id = PMD_ID_NULL;
+
+flow->offloaded = true;
+
 return 0;
 }
 
diff --git a/lib/dpif.h b/lib/dpif.h
index b1f516e..38efd29 100644
--- a/lib/dpif.h
+++ b/lib/dpif.h
@@ -591,6 +591,7 @@ struct dpif_flow {
 bool ufid_present;/* True if 'ufid' was provided by datapath.*/
 unsigned pmd_id;  /* Datapath poll mode driver id. */
 struct dpif_flow_stats stats; /* Flow statistics. */
+bool offloaded;   /* True if flow is offloaded */
 };
 int dpif_flow_dump_next(struct dpif_flow_dump_thread *,
 struct dpif_flow *flows, int max_flows);
-- 
2.7.4
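
The condition added to format_dpif_flow() in the patch above can be illustrated in isolation. The following is a minimal standalone sketch, not the OVS code: `struct flow_info`, `format_flow()`, and the fixed "actions:drop" suffix are hypothetical names and values chosen for illustration. It only shows that the marker is emitted when verbosity is set, no dump type was requested, and the flow is flagged as offloaded.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the fields of struct dpif_flow used here. */
struct flow_info {
    bool offloaded;          /* True if the flow lives in hardware/tc. */
    unsigned long packets;   /* Illustrative statistic. */
};

/* Formats 'f' into 'out'.  The ", offloaded:yes" marker is added only when
 * verbosity is requested, no explicit dump type was given, and the flow is
 * marked offloaded, mirroring the check in the patch. */
static void
format_flow(char *out, size_t size, const struct flow_info *f,
            int verbosity, const char *type)
{
    bool mark = verbosity && !type && f->offloaded;

    snprintf(out, size, "packets:%lu%s, actions:drop",
             f->packets, mark ? ", offloaded:yes" : "");
}
```

Passing a non-NULL type suppresses the marker, presumably because a dump restricted to one flow type already tells the reader where those flows live.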



[ovs-dev] [PATCH V10 18/33] netdev-tc-offloads: Implement netdev flow put using tc interface

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

Currently only tunnel offload is supported.

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/netdev-tc-offloads.c | 406 +--
 1 file changed, 396 insertions(+), 10 deletions(-)

diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c
index 49e97fa..6b8f554 100644
--- a/lib/netdev-tc-offloads.c
+++ b/lib/netdev-tc-offloads.c
@@ -89,7 +89,7 @@ del_ufid_tc_mapping(const ovs_u128 *ufid)
 
 /* Add ufid entry to ufid_tc hashmap.
  * If entry exists already it will be replaced. */
-static void OVS_UNUSED
+static void
 add_ufid_tc_mapping(const ovs_u128 *ufid, int prio, int handle,
 struct netdev *netdev, int ifindex)
 {
@@ -120,7 +120,7 @@ add_ufid_tc_mapping(const ovs_u128 *ufid, int prio, int handle,
  * Returns handle if successful and fill prio and netdev for that ufid.
  * Otherwise returns 0.
  */
-static int OVS_UNUSED
+static int
 get_ufid_tc_mapping(const ovs_u128 *ufid, int *prio, struct netdev **netdev)
 {
 size_t ufid_hash = hash_bytes(ufid, sizeof *ufid, 0);
@@ -183,7 +183,7 @@ struct prio_map_data {
  *
  * Return prio on success or 0 if we are out of prios.
  */
-static uint16_t OVS_UNUSED
+static uint16_t
 get_prio_for_tc_flower(struct tc_flower *flower)
 {
 static struct hmap prios = HMAP_INITIALIZER(&prios);
@@ -440,16 +440,402 @@ netdev_tc_flow_dump_next(struct netdev_flow_dump *dump,
 return false;
 }
 
+static int
+parse_put_flow_set_action(struct tc_flower *flower, const struct nlattr *set,
+  size_t set_len)
+{
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+const struct nlattr *set_attr;
+size_t set_left;
+
+NL_ATTR_FOR_EACH_UNSAFE(set_attr, set_left, set, set_len) {
+if (nl_attr_type(set_attr) == OVS_KEY_ATTR_TUNNEL) {
+const struct nlattr *tunnel = nl_attr_get(set_attr);
+const size_t tunnel_len = nl_attr_get_size(set_attr);
+const struct nlattr *tun_attr;
+size_t tun_left;
+
+flower->set.set = true;
+NL_ATTR_FOR_EACH_UNSAFE(tun_attr, tun_left, tunnel, tunnel_len) {
+switch (nl_attr_type(tun_attr)) {
+case OVS_TUNNEL_KEY_ATTR_ID: {
+flower->set.id = nl_attr_get_be64(tun_attr);
+}
+break;
+case OVS_TUNNEL_KEY_ATTR_IPV4_SRC: {
+flower->set.ipv4.ipv4_src = nl_attr_get_be32(tun_attr);
+}
+break;
+case OVS_TUNNEL_KEY_ATTR_IPV4_DST: {
+flower->set.ipv4.ipv4_dst = nl_attr_get_be32(tun_attr);
+}
+break;
+case OVS_TUNNEL_KEY_ATTR_IPV6_SRC: {
+flower->set.ipv6.ipv6_src =
+nl_attr_get_in6_addr(tun_attr);
+}
+break;
+case OVS_TUNNEL_KEY_ATTR_IPV6_DST: {
+flower->set.ipv6.ipv6_dst =
+nl_attr_get_in6_addr(tun_attr);
+}
+break;
+case OVS_TUNNEL_KEY_ATTR_TP_SRC: {
+flower->set.tp_src = nl_attr_get_be16(tun_attr);
+}
+break;
+case OVS_TUNNEL_KEY_ATTR_TP_DST: {
+flower->set.tp_dst = nl_attr_get_be16(tun_attr);
+}
+break;
+}
+}
+} else {
+VLOG_DBG_RL(&rl, "unsupported set action type: %d",
+nl_attr_type(set_attr));
+return EOPNOTSUPP;
+}
+}
+return 0;
+}
+
+static int
+test_key_and_mask(struct match *match) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+const struct flow *key = &match->flow;
+struct flow *mask = &match->wc.masks;
+
+if (mask->pkt_mark) {
+VLOG_DBG_RL(&rl, "offloading attribute pkt_mark isn't supported");
+return EOPNOTSUPP;
+}
+
+if (mask->recirc_id && key->recirc_id) {
+VLOG_DBG_RL(&rl, "offloading attribute recirc_id isn't supported");
+return EOPNOTSUPP;
+}
+mask->recirc_id = 0;
+
+if (mask->dp_hash) {
+VLOG_DBG_RL(&rl, "offloading attribute dp_hash isn't supported");
+return EOPNOTSUPP;
+}
+
+if (mask->conj_id) {
+VLOG_DBG_RL(&rl, "offloading attribute conj_id isn't supported");
+return EOPNOTSUPP;
+}
+
+if (mask->skb_priority) {
+VLOG_DBG_RL(&rl, "offloading attribute skb_priority isn't supported");
+return EOPNOTSUPP;
+}
+
+if (mask->actset_output) {
+VLOG_DBG_RL(&rl,
+"offloading attribute actset_output isn't supported");
+return EOPNOTSUPP;
+}
+
+if (mask->ct_state) {
+VLOG_DBG_RL(&rl, "offloading attribute ct_state isn't supported");
+return EOPNOTSUPP;
+

[ovs-dev] [PATCH V10 23/33] netdev-linux: Disallow setting policing when configured with hw offload

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

Report policing as not supported. Otherwise the ingress qdisc would be
removed, and the offload rules would be removed along with it.

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/netdev-linux.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 44dfac5..ce0a153 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -2087,6 +2087,14 @@ netdev_linux_set_policing(struct netdev *netdev_,
 int ifindex;
 int error;
 
+if (netdev_is_flow_api_enabled()) {
+if (kbits_rate) {
+VLOG_WARN_RL(&rl, "%s: policing with offload isn't supported",
+ netdev_name);
+}
+return EOPNOTSUPP;
+}
+
 kbits_burst = (!kbits_rate ? 0   /* Force to 0 if no rate specified. */
: !kbits_burst ? 8000 /* Default to 8000 kbits if 0. */
: kbits_burst);   /* Stick with user-specified value. */
-- 
2.7.4
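
The guard added to netdev_linux_set_policing() above can be sketched on its own. This is a minimal illustration under stated assumptions: `flow_api_enabled` stands in for netdev_is_flow_api_enabled(), `n_warnings` stands in for the rate-limited VLOG_WARN_RL() call, and `set_policing()` is a hypothetical name. The point is that the request is always refused while offload is enabled, but a warning is raised only when a non-zero rate was actually requested.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for netdev_is_flow_api_enabled(); an assumption of this sketch. */
static bool flow_api_enabled;

/* Counts warnings instead of logging them. */
static int n_warnings;

/* Returns EOPNOTSUPP whenever offload is enabled.  Warns only if the caller
 * asked for a non-zero rate; a zero rate means "no policing" and is refused
 * silently. */
static int
set_policing(uint32_t kbits_rate)
{
    if (flow_api_enabled) {
        if (kbits_rate) {
            n_warnings++;
        }
        return EOPNOTSUPP;
    }
    /* A real implementation would configure the ingress qdisc here. */
    return 0;
}
```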



[ovs-dev] [PATCH V10 22/33] netdev-tc-offloads: Implement flow get using tc interface

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

Search for an offloaded flow matching the requested ufid and, if found,
dump it and parse it back to the required format.

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/netdev-tc-offloads.c | 50 ++--
 1 file changed, 44 insertions(+), 6 deletions(-)

diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c
index 287d7cd..d1036d1 100644
--- a/lib/netdev-tc-offloads.c
+++ b/lib/netdev-tc-offloads.c
@@ -840,13 +840,51 @@ netdev_tc_flow_put(struct netdev *netdev,
 
 int
 netdev_tc_flow_get(struct netdev *netdev OVS_UNUSED,
-   struct match *match OVS_UNUSED,
-   struct nlattr **actions OVS_UNUSED,
-   const ovs_u128 *ufid OVS_UNUSED,
-   struct dpif_flow_stats *stats OVS_UNUSED,
-   struct ofpbuf *buf OVS_UNUSED)
+   struct match *match,
+   struct nlattr **actions,
+   const ovs_u128 *ufid,
+   struct dpif_flow_stats *stats,
+   struct ofpbuf *buf)
 {
-return EOPNOTSUPP;
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
+struct netdev *dev;
+struct tc_flower flower;
+odp_port_t in_port;
+int prio = 0;
+int ifindex;
+int handle;
+int err;
+
+handle = get_ufid_tc_mapping(ufid, &prio, &dev);
+if (!handle) {
+return ENOENT;
+}
+
+ifindex = netdev_get_ifindex(dev);
+if (ifindex < 0) {
+VLOG_ERR_RL(&error_rl, "failed to get ifindex for %s: %s",
+netdev_get_name(dev), ovs_strerror(-ifindex));
+netdev_close(dev);
+return -ifindex;
+}
+
+VLOG_DBG_RL(&rl, "flow get (dev %s prio %d handle %d)",
+netdev_get_name(dev), prio, handle);
+err = tc_get_flower(ifindex, prio, handle, &flower);
+netdev_close(dev);
+if (err) {
+VLOG_ERR_RL(&error_rl, "flow get failed (dev %s prio %d handle %d): %s",
+netdev_get_name(dev), prio, handle, ovs_strerror(err));
+return err;
+}
+
+in_port = netdev_ifindex_to_odp_port(ifindex);
+parse_tc_flower_to_match(&flower, match, actions, stats, buf);
+
+match->wc.masks.in_port.odp_port = u32_to_odp(UINT32_MAX);
+match->flow.in_port.odp_port = in_port;
+
+return 0;
 }
 
 int
-- 
2.7.4



[ovs-dev] [PATCH V10 20/33] netdev-tc-offloads: Implement netdev flow del using tc interface

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/netdev-tc-offloads.c | 33 ++---
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c
index 6b8f554..287d7cd 100644
--- a/lib/netdev-tc-offloads.c
+++ b/lib/netdev-tc-offloads.c
@@ -851,10 +851,37 @@ netdev_tc_flow_get(struct netdev *netdev OVS_UNUSED,
 
 int
 netdev_tc_flow_del(struct netdev *netdev OVS_UNUSED,
-   const ovs_u128 *ufid OVS_UNUSED,
-   struct dpif_flow_stats *stats OVS_UNUSED)
+   const ovs_u128 *ufid,
+   struct dpif_flow_stats *stats)
 {
-return EOPNOTSUPP;
+struct netdev *dev;
+int prio = 0;
+int ifindex;
+int handle;
+int error;
+
+handle = get_ufid_tc_mapping(ufid, &prio, &dev);
+if (!handle) {
+return ENOENT;
+}
+
+ifindex = netdev_get_ifindex(dev);
+if (ifindex < 0) {
+VLOG_ERR_RL(&error_rl, "failed to get ifindex for %s: %s",
+netdev_get_name(dev), ovs_strerror(-ifindex));
+netdev_close(dev);
+return -ifindex;
+}
+
+error = tc_del_filter(ifindex, prio, handle);
+del_ufid_tc_mapping(ufid);
+
+netdev_close(dev);
+
+if (stats) {
+memset(stats, 0, sizeof *stats);
+}
+return error;
 }
 
 int
-- 
2.7.4
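
The delete path in the patch above follows a lookup, remove, zero-stats sequence. The sketch below shows that sequence in a standalone form. It is not the OVS implementation: a small fixed array replaces the ufid_tc hashmap, an unsigned int replaces ovs_u128, the tc filter removal is only a comment, and `struct flow_stats`, `struct ufid_mapping`, `mapping_add()`, and `flow_del()` are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types; the real code uses ovs_u128 ufids and an hmap. */
struct flow_stats {
    unsigned long n_packets;
    unsigned long n_bytes;
};

struct ufid_mapping {
    unsigned int ufid;
    int handle;             /* 0 marks an unused slot. */
};

#define N_SLOTS 4
static struct ufid_mapping slots[N_SLOTS];

/* Records 'handle' for 'ufid' in the first free slot.  Returns 0 on
 * success or ENOSPC if the table is full. */
static int
mapping_add(unsigned int ufid, int handle)
{
    for (size_t i = 0; i < N_SLOTS; i++) {
        if (!slots[i].handle) {
            slots[i].ufid = ufid;
            slots[i].handle = handle;
            return 0;
        }
    }
    return ENOSPC;
}

/* Deletes the flow identified by 'ufid'.  Returns ENOENT if no mapping
 * exists; otherwise drops the mapping and zeroes 'stats' if provided. */
static int
flow_del(unsigned int ufid, struct flow_stats *stats)
{
    for (size_t i = 0; i < N_SLOTS; i++) {
        if (slots[i].handle && slots[i].ufid == ufid) {
            /* A real implementation would remove the tc filter here. */
            slots[i].handle = 0;
            if (stats) {
                memset(stats, 0, sizeof *stats);
            }
            return 0;
        }
    }
    return ENOENT;
}
```

As in the patch, a handle of 0 doubles as "not found", so valid handles must be non-zero.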



[ovs-dev] [PATCH V10 21/33] dpif-netlink: Use netdev flow get api to query a flow

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

Search all netdevs added to the datapath for a given flow
using the netdev flow API and parse it back to a dpif flow.

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/dpif-netlink.c | 51 ++-
 lib/netdev.c   | 21 +
 lib/netdev.h   |  5 +
 3 files changed, 76 insertions(+), 1 deletion(-)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index f959ed9..21b9a99 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -1986,6 +1986,45 @@ dpif_netlink_operate__(struct dpif_netlink *dpif,
 }
 
 static int
+parse_flow_get(struct dpif_netlink *dpif, struct dpif_flow_get *get)
+{
+struct dpif_flow *dpif_flow = get->flow;
+struct match match;
+struct nlattr *actions;
+struct dpif_flow_stats stats;
+struct ofpbuf buf;
+uint64_t act_buf[1024 / 8];
+struct odputil_keybuf maskbuf;
+struct odputil_keybuf keybuf;
+struct odputil_keybuf actbuf;
+struct ofpbuf key, mask, act;
+int err;
+
+ofpbuf_use_stack(&buf, &act_buf, sizeof act_buf);
+err = netdev_ports_flow_get(DPIF_HMAP_KEY(&dpif->dpif), &match,
+&actions, get->ufid, &stats, &buf);
+if (err) {
+return err;
+}
+
+VLOG_DBG("found flow from netdev, translating to dpif flow");
+
+ofpbuf_use_stack(&key, &keybuf, sizeof keybuf);
+ofpbuf_use_stack(&act, &actbuf, sizeof actbuf);
+ofpbuf_use_stack(&mask, &maskbuf, sizeof maskbuf);
+dpif_netlink_netdev_match_to_dpif_flow(&match, &key, &mask, actions,
+   &stats,
+   (ovs_u128 *) get->ufid,
+   dpif_flow,
+   false);
+ofpbuf_put(get->buffer, nl_attr_get(actions), nl_attr_get_size(actions));
+dpif_flow->actions = ofpbuf_at(get->buffer, 0, 0);
+dpif_flow->actions_len = nl_attr_get_size(actions);
+
+return 0;
+}
+
+static int
 parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put)
 {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
@@ -2157,7 +2196,17 @@ try_send_to_netdev(struct dpif_netlink *dpif, struct dpif_op *op)
 del->stats);
 break;
 }
-case DPIF_OP_FLOW_GET:
+case DPIF_OP_FLOW_GET: {
+struct dpif_flow_get *get = &op->u.flow_get;
+
+if (!op->u.flow_get.ufid) {
+break;
+}
+dbg_print_flow(get->key, get->key_len, NULL, 0, NULL, 0,
+   get->ufid, "GET");
+err = parse_flow_get(dpif, get);
+break;
+}
 case DPIF_OP_EXECUTE:
 default:
 break;
diff --git a/lib/netdev.c b/lib/netdev.c
index 4311c21..0aae83a 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -2324,6 +2324,27 @@ netdev_ports_flow_del(const void *obj, const ovs_u128 *ufid,
 return ENOENT;
 }
 
+int
+netdev_ports_flow_get(const void *obj, struct match *match,
+  struct nlattr **actions,
+  const ovs_u128 *ufid,
+  struct dpif_flow_stats *stats,
+  struct ofpbuf *buf)
+{
+struct port_to_netdev_data *data;
+
+ovs_mutex_lock(&netdev_hmap_mutex);
+HMAP_FOR_EACH(data, node, &port_to_netdev) {
+if (data->obj == obj && !netdev_flow_get(data->netdev, match, actions,
+ ufid, stats, buf)) {
+ovs_mutex_unlock(&netdev_hmap_mutex);
+return 0;
+}
+}
+ovs_mutex_unlock(&netdev_hmap_mutex);
+return ENOENT;
+}
+
 #ifdef __linux__
 void
 netdev_set_flow_api_enabled(const struct smap *ovs_other_config)
diff --git a/lib/netdev.h b/lib/netdev.h
index 2ddc595..31846fa 100644
--- a/lib/netdev.h
+++ b/lib/netdev.h
@@ -192,6 +192,11 @@ struct netdev_flow_dump **netdev_ports_flow_dump_create(const void *obj,
 void netdev_ports_flow_flush(const void *obj);
 int netdev_ports_flow_del(const void *obj, const ovs_u128 *ufid,
   struct dpif_flow_stats *stats);
+int netdev_ports_flow_get(const void *obj, struct match *match,
+  struct nlattr **actions,
+  const ovs_u128 *ufid,
+  struct dpif_flow_stats *stats,
+  struct ofpbuf *buf);
 
 /* native tunnel APIs */
 /* Structure to pass parameters required to build a tunnel header. */
-- 
2.7.4



[ovs-dev] [PATCH V10 19/33] dpif-netlink: Use netdev flow del api to delete a flow

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

If a flow was offloaded to a netdev we delete it using netdev
flow api.

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/dpif-netlink.c | 13 -
 lib/netdev.c   | 18 ++
 lib/netdev.h   |  2 ++
 3 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index fc8bf20..f959ed9 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -2145,7 +2145,18 @@ try_send_to_netdev(struct dpif_netlink *dpif, struct dpif_op *op)
 err = parse_flow_put(dpif, put);
 break;
 }
-case DPIF_OP_FLOW_DEL:
+case DPIF_OP_FLOW_DEL: {
+struct dpif_flow_del *del = &op->u.flow_del;
+
+if (!del->ufid) {
+break;
+}
+dbg_print_flow(del->key, del->key_len, NULL, 0, NULL, 0,
+   del->ufid, "DEL");
+err = netdev_ports_flow_del(DPIF_HMAP_KEY(&dpif->dpif), del->ufid,
+del->stats);
+break;
+}
 case DPIF_OP_FLOW_GET:
 case DPIF_OP_EXECUTE:
 default:
diff --git a/lib/netdev.c b/lib/netdev.c
index 41960b6..4311c21 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -2306,6 +2306,24 @@ netdev_ports_flow_dump_create(const void *obj, int *ports)
 return dumps;
 }
 
+int
+netdev_ports_flow_del(const void *obj, const ovs_u128 *ufid,
+  struct dpif_flow_stats *stats)
+{
+struct port_to_netdev_data *data;
+
+ovs_mutex_lock(&netdev_hmap_mutex);
+HMAP_FOR_EACH(data, node, &port_to_netdev) {
+if (data->obj == obj && !netdev_flow_del(data->netdev, ufid, stats)) {
+ovs_mutex_unlock(&netdev_hmap_mutex);
+return 0;
+}
+}
+ovs_mutex_unlock(&netdev_hmap_mutex);
+
+return ENOENT;
+}
+
 #ifdef __linux__
 void
 netdev_set_flow_api_enabled(const struct smap *ovs_other_config)
diff --git a/lib/netdev.h b/lib/netdev.h
index 0b2e674..2ddc595 100644
--- a/lib/netdev.h
+++ b/lib/netdev.h
@@ -190,6 +190,8 @@ odp_port_t netdev_ifindex_to_odp_port(int ifindex);
 struct netdev_flow_dump **netdev_ports_flow_dump_create(const void *obj,
 int *ports);
 void netdev_ports_flow_flush(const void *obj);
+int netdev_ports_flow_del(const void *obj, const ovs_u128 *ufid,
+  struct dpif_flow_stats *stats);
 
 /* native tunnel APIs */
 /* Structure to pass parameters required to build a tunnel header. */
-- 
2.7.4
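
netdev_ports_flow_del() in the patch above tries every netdev that belongs to the datapath and stops at the first one that reports success. A minimal standalone sketch of that search pattern follows. It is an illustration only: a plain array replaces the port_to_netdev hmap, locking is omitted, an unsigned int replaces the ufid, and `struct port_entry`, `ports_flow_del()`, `del_never()`, and `del_if_7()` are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-port record; 'obj' identifies the owning datapath. */
struct port_entry {
    const void *obj;
    int (*flow_del)(unsigned int ufid);   /* 0 on success, else errno. */
};

/* Tries each port owned by 'obj' in turn.  Returns 0 as soon as one port
 * deletes the flow, or ENOENT if none of them knows it. */
static int
ports_flow_del(const struct port_entry *ports, size_t n,
               const void *obj, unsigned int ufid)
{
    for (size_t i = 0; i < n; i++) {
        if (ports[i].obj == obj && !ports[i].flow_del(ufid)) {
            return 0;
        }
    }
    return ENOENT;
}

/* Example callbacks: one port that never has the flow, one that only
 * knows the flow with ufid 7. */
static int
del_never(unsigned int ufid)
{
    (void) ufid;
    return ENOENT;
}

static int
del_if_7(unsigned int ufid)
{
    return ufid == 7 ? 0 : ENOENT;
}
```

Note that any per-port error is collapsed into ENOENT by the caller, which is also how the patch behaves.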



[ovs-dev] [PATCH V10 17/33] netdev-tc-offloads: Add flower mask to priority map

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

The flower classifier requires a different priority per mask,
so we hash the mask and generate a new priority for
each new mask used.

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/netdev-tc-offloads.c | 54 
 1 file changed, 54 insertions(+)

diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c
index 4b14c5c..49e97fa 100644
--- a/lib/netdev-tc-offloads.c
+++ b/lib/netdev-tc-offloads.c
@@ -170,6 +170,60 @@ find_ufid(int prio, int handle, struct netdev *netdev, ovs_u128 *ufid)
 return (data != NULL);
 }
 
+struct prio_map_data {
+struct hmap_node node;
+struct tc_flower_key mask;
+ovs_be16 protocol;
+uint16_t prio;
+};
+
+/* Get free prio for tc flower
+ * If prio is already allocated for mask/eth_type combination then return it.
+ * If not assign new prio.
+ *
+ * Return prio on success or 0 if we are out of prios.
+ */
+static uint16_t OVS_UNUSED
+get_prio_for_tc_flower(struct tc_flower *flower)
+{
+static struct hmap prios = HMAP_INITIALIZER(&prios);
+static struct ovs_mutex prios_lock = OVS_MUTEX_INITIALIZER;
+static uint16_t last_prio = 0;
+size_t key_len = sizeof(struct tc_flower_key);
+size_t hash = hash_bytes(&flower->mask, key_len,
+ (OVS_FORCE uint32_t) flower->key.eth_type);
+struct prio_map_data *data;
+struct prio_map_data *new_data;
+
+/* We can use the same prio for same mask/eth combination but must have
+ * different prio if not. Flower classifier will reject same prio for
+ * different mask/eth combination. */
+ovs_mutex_lock(&prios_lock);
+HMAP_FOR_EACH_WITH_HASH(data, node, hash, &prios) {
+if (!memcmp(&flower->mask, &data->mask, key_len)
+&& data->protocol == flower->key.eth_type) {
+ovs_mutex_unlock(&prios_lock);
+return data->prio;
+}
+}
+
+if (last_prio == UINT16_MAX) {
+/* last_prio can overflow if there will be many different kinds of
+ * flows which shouldn't happen organically. */
+ovs_mutex_unlock(&prios_lock);
+return 0;
+}
+
+new_data = xzalloc(sizeof *new_data);
+memcpy(&new_data->mask, &flower->mask, key_len);
+new_data->prio = ++last_prio;
+new_data->protocol = flower->key.eth_type;
+hmap_insert(&prios, &new_data->node, hash);
+ovs_mutex_unlock(&prios_lock);
+
+return new_data->prio;
+}
+
 int
 netdev_tc_flow_flush(struct netdev *netdev)
 {
-- 
2.7.4
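
The mask-to-priority allocation described in this patch can be sketched in isolation. The following is a minimal standalone illustration, not the OVS code: a small linear table replaces the hmap, the mutex is omitted, and `struct mask_key`, `prio_for_mask()`, and `MAX_MASKS` are hypothetical names. It shows the same contract: an existing (mask, eth_type) pair gets its previous priority back, a new pair gets the next free priority, and 0 signals that no priorities are left.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the (mask, eth_type) pair; not the OVS struct. */
struct mask_key {
    uint8_t mask[16];
    uint16_t eth_type;
};

#define MAX_MASKS 8   /* Illustrative table size. */

static struct mask_key known[MAX_MASKS];
static uint16_t n_known;

/* Returns the priority already assigned to 'key', or assigns the next free
 * one.  Returns 0 when the table is full, mirroring "out of prios". */
static uint16_t
prio_for_mask(const struct mask_key *key)
{
    for (uint16_t i = 0; i < n_known; i++) {
        if (!memcmp(known[i].mask, key->mask, sizeof key->mask)
            && known[i].eth_type == key->eth_type) {
            return i + 1;
        }
    }
    if (n_known == MAX_MASKS) {
        return 0;
    }
    known[n_known] = *key;
    return ++n_known;
}
```

Priorities start at 1 so that 0 stays available as the failure value, as in the patch.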



[ovs-dev] [PATCH V10 15/33] netdev-tc-offloads: Implement netdev flow dump api using tc interface

2017-06-08 Thread Roi Dayan
From: Paul Blakey 

Signed-off-by: Paul Blakey 
Reviewed-by: Roi Dayan 
Reviewed-by: Simon Horman 
---
 lib/netdev-tc-offloads.c | 186 ---
 1 file changed, 177 insertions(+), 9 deletions(-)

diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c
index 0786048..4b14c5c 100644
--- a/lib/netdev-tc-offloads.c
+++ b/lib/netdev-tc-offloads.c
@@ -150,7 +150,7 @@ get_ufid_tc_mapping(const ovs_u128 *ufid, int *prio, struct netdev **netdev)
  *
  * Returns true on success.
  */
-static bool OVS_UNUSED
+static bool
 find_ufid(int prio, int handle, struct netdev *netdev, ovs_u128 *ufid)
 {
 int ifindex = netdev_get_ifindex(netdev);
@@ -188,9 +188,20 @@ int
 netdev_tc_flow_dump_create(struct netdev *netdev,
struct netdev_flow_dump **dump_out)
 {
-struct netdev_flow_dump *dump = xzalloc(sizeof *dump);
+struct netdev_flow_dump *dump;
+int ifindex;
+
+ifindex = netdev_get_ifindex(netdev);
+if (ifindex < 0) {
+VLOG_ERR_RL(&error_rl, "failed to get ifindex for %s: %s",
+netdev_get_name(netdev), ovs_strerror(-ifindex));
+return -ifindex;
+}
 
+dump = xzalloc(sizeof *dump);
+dump->nl_dump = xzalloc(sizeof *dump->nl_dump);
 dump->netdev = netdev_ref(netdev);
+tc_dump_flower_start(ifindex, dump->nl_dump);
 
 *dump_out = dump;
 
@@ -200,21 +211,178 @@ netdev_tc_flow_dump_create(struct netdev *netdev,
 int
 netdev_tc_flow_dump_destroy(struct netdev_flow_dump *dump)
 {
+nl_dump_done(dump->nl_dump);
 netdev_close(dump->netdev);
+free(dump->nl_dump);
 free(dump);
+return 0;
+}
+
+static int
+parse_tc_flower_to_match(struct tc_flower *flower,
+ struct match *match,
+ struct nlattr **actions,
+ struct dpif_flow_stats *stats,
+ struct ofpbuf *buf) {
+size_t act_off;
+struct tc_flower_key *key = &flower->key;
+struct tc_flower_key *mask = &flower->mask;
+odp_port_t outport = 0;
+
+if (flower->ifindex_out) {
+outport = netdev_ifindex_to_odp_port(flower->ifindex_out);
+if (!outport) {
+return ENOENT;
+}
+}
+
+ofpbuf_clear(buf);
+
+match_init_catchall(match);
+match_set_dl_type(match, key->eth_type);
+match_set_dl_src_masked(match, key->src_mac, mask->src_mac);
+match_set_dl_dst_masked(match, key->dst_mac, mask->dst_mac);
+if (key->vlan_id || key->vlan_prio) {
+match_set_dl_vlan(match, htons(key->vlan_id));
+match_set_dl_vlan_pcp(match, key->vlan_prio);
+match_set_dl_type(match, key->encap_eth_type);
+}
+
+if (key->ip_proto &&
+(key->eth_type == htons(ETH_P_IP)
+ || key->eth_type == htons(ETH_P_IPV6))) {
+match_set_nw_proto(match, key->ip_proto);
+}
+
+match_set_nw_src_masked(match, key->ipv4.ipv4_src, mask->ipv4.ipv4_src);
+match_set_nw_dst_masked(match, key->ipv4.ipv4_dst, mask->ipv4.ipv4_dst);
+
+match_set_ipv6_src_masked(match,
+  &key->ipv6.ipv6_src, &mask->ipv6.ipv6_src);
+match_set_ipv6_dst_masked(match,
+  &key->ipv6.ipv6_dst, &mask->ipv6.ipv6_dst);
+
+match_set_tp_dst_masked(match, key->dst_port, mask->dst_port);
+match_set_tp_src_masked(match, key->src_port, mask->src_port);
+
+if (flower->tunnel.tunnel) {
+match_set_tun_id(match, flower->tunnel.id);
+if (flower->tunnel.ipv4.ipv4_dst) {
+match_set_tun_src(match, flower->tunnel.ipv4.ipv4_src);
+match_set_tun_dst(match, flower->tunnel.ipv4.ipv4_dst);
+} else if (!is_all_zeros(&flower->tunnel.ipv6.ipv6_dst,
+   sizeof flower->tunnel.ipv6.ipv6_dst)) {
+match_set_tun_ipv6_src(match, &flower->tunnel.ipv6.ipv6_src);
+match_set_tun_ipv6_dst(match, &flower->tunnel.ipv6.ipv6_dst);
+}
+if (flower->tunnel.tp_dst) {
+match_set_tun_tp_dst(match, flower->tunnel.tp_dst);
+}
+}
+
+act_off = nl_msg_start_nested(buf, OVS_FLOW_ATTR_ACTIONS);
+{
+if (flower->vlan_pop) {
+nl_msg_put_flag(buf, OVS_ACTION_ATTR_POP_VLAN);
+}
+
+if (flower->vlan_push_id || flower->vlan_push_prio) {
+struct ovs_action_push_vlan *push;
+push = nl_msg_put_unspec_zero(buf, OVS_ACTION_ATTR_PUSH_VLAN,
+  sizeof *push);
+
+push->vlan_tpid = htons(ETH_TYPE_VLAN);
+push->vlan_tci = htons(flower->vlan_push_id
+   | (flower->vlan_push_prio << 13)
+   | VLAN_CFI);
+}
+
+if (flower->set.set) {
+size_t set_offset = nl_msg_start_nested(buf, OVS_ACTION_ATTR_SET);
+size_t tunnel_offset =
+nl_msg_start_nested(buf, OVS_KEY_ATTR_TUNNEL);
+
+nl_msg_put_be64
