[ovs-dev] hi!

2016-03-29 Thread dev
Hello!

We are looking for employees working remotely.

My name is Lucas, am the personnel manager of a large International company.
Most of the work you can do from home, that is, at a distance.
Salary is $1000-$4000.

If you are interested in this offer, please visit our site: --> 
www.bestearntrade.com

Best regards!

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] Bit-level setting with ct()

2016-03-29 Thread Joe Stringer
On 30 March 2016 at 11:05, Justin Pettit  wrote:
> Hi, Joe.  Russell is adding the ability to set ct_mark and ct_label in OVN 
> logical flows.  The unit tests and ovs-ofctl documentation only show setting 
> the whole field through OpenFlow, but I think the code supports bit-level 
> manipulation.  Does that seem correct to you?

Seems like it was broken. I sent a couple of patches.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 2/2] ofproto-dpif-xlate: Fix bitwise ops on ct_labels.

2016-03-29 Thread Joe Stringer
Using the action ct(commit,set_field:0x1/0x1->ct_label), ie, specifying
a mask, would previously overwrite the entire ct_labels field rather than
modifying only the specified bits. Fix the issue.

Fixes: 9daf23484fb1 ("Add connection tracking label support.")
Signed-off-by: Joe Stringer 
---
 lib/util.h   | 22 ++
 ofproto/ofproto-dpif-xlate.c |  4 +++-
 tests/system-traffic.at  | 34 ++
 utilities/ovs-ofctl.8.in |  2 +-
 4 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/lib/util.h b/lib/util.h
index 389c093a2fd7..5678671fbcf5 100644
--- a/lib/util.h
+++ b/lib/util.h
@@ -611,6 +611,28 @@ ovs_be128_is_zero(const ovs_be128 *val)
 return !(val->be64.hi || val->be64.lo);
 }
 
+static inline ovs_u128
+ovs_u128_xor(const ovs_u128 a, const ovs_u128 b)
+{
+ovs_u128 dst;
+
+dst.u64.hi = a.u64.hi ^ b.u64.hi;
+dst.u64.lo = a.u64.lo ^ b.u64.lo;
+
+return dst;
+}
+
+static inline ovs_u128
+ovs_u128_or(const ovs_u128 a, const ovs_u128 b)
+{
+ovs_u128 dst;
+
+dst.u64.hi = a.u64.hi | b.u64.hi;
+dst.u64.lo = a.u64.lo | b.u64.lo;
+
+return dst;
+}
+
 void xsleep(unsigned int seconds);
 
 bool is_stdout_a_tty(void);
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 25297851f9c5..b59a5407bb9f 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4315,7 +4315,9 @@ put_ct_label(const struct flow *flow, struct flow 
*base_flow,
 OVS_CT_ATTR_LABELS,
 sizeof(*odp_ct_label));
 odp_ct_label->key = flow->ct_label;
-odp_ct_label->mask = wc->masks.ct_label;
+odp_ct_label->mask = ovs_u128_xor(base_flow->ct_label, flow->ct_label);
+wc->masks.ct_label = ovs_u128_or(wc->masks.ct_label,
+ odp_ct_label->mask);
 }
 }
 
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 9d2c57faa6d7..9c0068410d85 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -889,6 +889,40 @@ NS_CHECK_EXEC([at_ns2], [wget 10.1.1.4 -t 3 -T 1 -v -o 
wget1.log], [4])
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([conntrack - ct_label bit-fiddling])
+CHECK_CONNTRACK()
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_NAMESPACES(at_ns0, at_ns1, at_ns2, at_ns3)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
+ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
+
+dnl Allow traffic between ns0<->ns1 using the ct_labels. Return traffic should
+dnl cause an additional bit to be set in the connection labels (and be allowed)
+AT_DATA([flows.txt], [dnl
+priority=1,action=drop
+priority=10,arp,action=normal
+priority=10,icmp,action=normal
+priority=100,in_port=1,tcp,action=ct(commit,exec(set_field:0x1/0x1->ct_label)),2
+priority=100,in_port=2,ct_state=-trk,tcp,action=ct(table=0,commit,exec(set_field:0x2/0x2->ct_label))
+priority=100,in_port=2,ct_state=+trk,ct_label=0x1,tcp,action=1
+priority=100,in_port=2,ct_state=+trk,ct_label=0x20001,tcp,action=1
+])
+
+AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
+
+dnl HTTP requests from p0->p1 should work fine.
+NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid])
+NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o 
wget0.log])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2) | grep TIME], 
[0], [dnl
+tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=,dport=),reply=(src=10.1.1.2,dst=10.1.1.1,sport=,dport=),labels=0x20001,protoinfo=(state=TIME_WAIT)
+])
+
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
+
 AT_SETUP([conntrack - ICMP related])
 CHECK_CONNTRACK()
 OVS_TRAFFIC_VSWITCHD_START()
diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in
index 51a57f777b37..8188b5352349 100644
--- a/utilities/ovs-ofctl.8.in
+++ b/utilities/ovs-ofctl.8.in
@@ -1730,7 +1730,7 @@ Store a 32-bit metadata value with the connection. If the 
connection is
 committed, then subsequent lookups for packets in this connection will
 populate the \fBct_mark\fR flow field when the packet is sent to the
 connection tracker with the \fBtable\fR specified.
-.IP \fBset_field:\fIvalue\fR->ct_label\fR
+.IP \fBset_field:\fIvalue\fR[\fB/\fImask\fR]->ct_label\fR
 Store a 128-bit metadata value with the connection.  If the connection is
 committed, then subsequent lookups for packets in this connection will
 populate the \fBct_label\fR flow field when the packet is sent to the
-- 
2.1.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 1/2] ofproto-dpif-xlate: Fix bitwise ops on ct_mark.

2016-03-29 Thread Joe Stringer
Using the action ct(commit,set_field:0x1/0x1->ct_mark), ie, specifying a
mask, would previously overwrite the entire ct_mark field rather than
modifying only the specified bits. Fix the issue.

Fixes: 8e53fe8cf7a1 ("Add connection tracking mark support.")
Signed-off-by: Joe Stringer 
---
 ofproto/ofproto-dpif-xlate.c |  3 ++-
 tests/system-traffic.at  | 34 ++
 utilities/ovs-ofctl.8.in |  2 +-
 3 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 19e690ec1ecb..25297851f9c5 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4291,7 +4291,8 @@ put_ct_mark(const struct flow *flow, struct flow 
*base_flow,
 } odp_attr;
 
 odp_attr.key = flow->ct_mark;
-odp_attr.mask = wc->masks.ct_mark;
+odp_attr.mask = base_flow->ct_mark ^ flow->ct_mark;
+wc->masks.ct_mark |= odp_attr.mask;
 
 if (odp_attr.mask && odp_attr.key != base_flow->ct_mark) {
 nl_msg_put_unspec(odp_actions, OVS_CT_ATTR_MARK, _attr,
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 28adbdcb9ee6..9d2c57faa6d7 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -770,6 +770,40 @@ 
tcp,orig=(src=10.1.1.3,dst=10.1.1.4,sport=,dport=),reply=(src=
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([conntrack - ct_mark bit-fiddling])
+CHECK_CONNTRACK()
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_NAMESPACES(at_ns0, at_ns1, at_ns2, at_ns3)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
+ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
+
+dnl Allow traffic between ns0<->ns1 using the ct_mark. Return traffic should
+dnl cause an additional bit to be set in the connection (and be allowed).
+AT_DATA([flows.txt], [dnl
+priority=1,action=drop
+priority=10,arp,action=normal
+priority=10,icmp,action=normal
+priority=100,in_port=1,tcp,action=ct(commit,exec(set_field:0x1/0x1->ct_mark)),2
+priority=100,in_port=2,ct_state=-trk,tcp,action=ct(table=0,commit,exec(set_field:0x2/0x2->ct_mark))
+priority=100,in_port=2,ct_state=+trk,ct_mark=1,tcp,action=1
+priority=100,in_port=2,ct_state=+trk,ct_mark=3,tcp,action=1
+])
+
+AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
+
+dnl HTTP requests from p0->p1 should work fine.
+NETNS_DAEMONIZE([at_ns1], [[$PYTHON $srcdir/test-l7.py]], [http0.pid])
+NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o 
wget0.log])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2) | grep TIME], 
[0], [dnl
+tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=,dport=),reply=(src=10.1.1.2,dst=10.1.1.1,sport=,dport=),mark=3,protoinfo=(state=TIME_WAIT)
+])
+
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
+
 AT_SETUP([conntrack - ct_mark from register])
 CHECK_CONNTRACK()
 OVS_TRAFFIC_VSWITCHD_START()
diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in
index 6e2613207979..51a57f777b37 100644
--- a/utilities/ovs-ofctl.8.in
+++ b/utilities/ovs-ofctl.8.in
@@ -1725,7 +1725,7 @@ fields are accepted within the \fBexec\fR action, and 
these fields may only be
 modified with this option. For example:
 .
 .RS
-.IP \fBset_field:\fIvalue\fR->ct_mark\fR
+.IP \fBset_field:\fIvalue\fR[\fB/\fImask\fR]->ct_mark\fR
 Store a 32-bit metadata value with the connection. If the connection is
 committed, then subsequent lookups for packets in this connection will
 populate the \fBct_mark\fR flow field when the packet is sent to the
-- 
2.1.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] dpif-netdev: Remove PMD latency on seq_mutex

2016-03-29 Thread Daniele Di Proietto


On 29/03/2016 06:44, "Karl Rister"  wrote:

>On 03/29/2016 08:08 AM, Flavio Leitner wrote:
>> On Tue, Mar 29, 2016 at 02:13:18AM +, Daniele Di Proietto wrote:
>>> Hi Flavio and Karl,
>>>
>>> thanks for the patch! I have a couple of comments:
>>>
>>> Can you point out a configuration where this is the bottleneck?
>>> I'm interested in reproducing this.
>> 
>> Karl, since you did the tests, could you please provide more details?
>
>When performing packet forwarding latency tests, I first noticed system
>and idle time when looking at CPU statistics when I expected the PMD
>threads to be 100% in userspace.  I used the kernel ftrace facility to
>track down what was happening and saw that the PMD thread was being
>context switched out and going idle.  The PMD thread was pinned to CPU
>core thread isolated with isolcpus so there was no competing task that
>could be scheduled to cause the context switch and I would not expect
>the polling thread to ever go idle.  Further analysis with frace and gdb
>tracked the cause to seq_mutex blocking when another task held the mutex.
>
>I would estimate that this change removed packet latency spikes of 35-45
>usecs in our test scenario.
>
>The test is forwarding packets through a KVM guest using OVS+DPDK in the
>host and the DPDK testpmd application in the guest.

Thanks the explanation

>
>Flavio, I thought I remembered you also saying that you saw a throughput
>improvement in a test you were running?
>
>> 
>> 
>>> I think the implementation would look simpler if we could
>>> avoid explicitly taking the mutex in dpif-netdev and instead
>>> having a ovsrcu_try_quiesce(). What do you think?
>> 
>> My concern is that it is freeing one entry from EMC each round
>> and it should quiesce to allow the callbacks to run.  If, for
>> some reason, it fails to quiesce for a long period, then it might
>> backlog a significant number of entries.
>
>My initial approach, which Flavio's code is very similar to, was simply
>trying to provide the simplest change to achieve what I was looking for.
> I could certainly see alternative solutions being more appropriate.
>
>> 
>> 
>>> I think we can avoid the recursive mutex as well if we introduce
>>> some explicit APIs in seq (seq_try_lock, seq_change_protected and
>>> seq_unlock), but I'd like to understand the performance implication
>>> of this commit first.
>
>One other area of the sequence code that I thought was curious was a
>single mutex that covered all sequences.  If updating the API is a
>possibility I would think going to a mutex per sequence might be an
>appropriate change as well.  That said, I don't have data that
>specifically points this out as a problem.

If we find this to be a bottleneck I think we can have a finer-grained
locking.

>
>> 
>> The issue is the latency spike when the PMD thread blocks on the
>> busy mutex.
>> 
>> The goal with recursive locking is to make sure we can sweep
>> the EMC cache and quiesce without blocking.  Fixing seq API
>> would help to not block, but then we have no control to whether
>> we did both tasks in the same round.
>> 
>> fbl
>> 
>> 
>>>
>>> On 23/03/2016 20:54, "dev on behalf of Flavio Leitner"
>>>  wrote:
>>>
 The PMD thread needs to keep processing RX queues in order
 archive maximum throughput.  However, it also needs to run
 other tasks after some time processing the RX queues which
 a mutex can block the PMD thread.  That causes latency
 spikes and affects the throughput.

 Convert to recursive mutex so that PMD thread can test first
 and if it gets the lock, continue as before, otherwise try
 again another time.  There is an additional logic to make
 sure the PMD thread will try harder as the attempt to get
 the mutex continues to fail.

 Co-authored-by: Karl Rister 
 Signed-off-by: Flavio Leitner 
>>>
>>> Oh, we're going to need a signoff from Karl as well :-)
>
>Signed-off-by: Karl Rister 
>
>Is this good enough?

Absolutely, thanks!

>
>>>
>>> Thanks,
>>>
>>> Daniele
>>>
 ---
 include/openvswitch/thread.h |  3 +++
 lib/dpif-netdev.c| 33 ++---
 lib/seq.c| 15 ++-
 lib/seq.h|  3 +++
 4 files changed, 42 insertions(+), 12 deletions(-)

 diff --git a/include/openvswitch/thread.h
b/include/openvswitch/thread.h
 index af6f2bb..6d20720 100644
 --- a/include/openvswitch/thread.h
 +++ b/include/openvswitch/thread.h
 @@ -44,6 +44,9 @@ struct OVS_LOCKABLE ovs_mutex {
 #define OVS_ADAPTIVE_MUTEX_INITIALIZER OVS_MUTEX_INITIALIZER
 #endif

 +#define OVS_RECURSIVE_MUTEX_INITIALIZER \
 +   { PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP, "" }
 +
 /* ovs_mutex functions analogous to pthread_mutex_*() functions.
  *
  * Most of these 

[ovs-dev] Mboihuudlmlqeorx

2016-03-29 Thread Post Office


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] datapath-windows: Fix the hash length when using recirculation

2016-03-29 Thread Alin Serdean
Current implementation of hashing does not take into consideration the
value of recirculation.

This patch updates the length of the hash to include the value of recirculation
in the hash itself.

To make sure the length is a multiple of 8 include the dphash in the
calculation.

Also clean some unnecessary code.

Signed-off-by: Alin Gabriel Serdean 
---
 datapath-windows/ovsext/Actions.c |  6 ++
 datapath-windows/ovsext/Flow.c| 24 +---
 datapath-windows/ovsext/Flow.h|  4 ++--
 datapath-windows/ovsext/User.c|  9 +++--
 4 files changed, 24 insertions(+), 19 deletions(-)

diff --git a/datapath-windows/ovsext/Actions.c 
b/datapath-windows/ovsext/Actions.c
index a91454d..00b3625 100644
--- a/datapath-windows/ovsext/Actions.c
+++ b/datapath-windows/ovsext/Actions.c
@@ -1964,6 +1964,12 @@ OvsDoRecirc(POVS_SWITCH_CONTEXT switchContext,
 
 status = OvsExtractFlow(ovsFwdCtx.curNbl, ovsFwdCtx.srcVportNo, key,
 , NULL);
+
+if (key->recircId || key->dpHash) {
+key->l2.keyLen += sizeof(key->recircId);
+key->l2.keyLen += sizeof(key->dpHash);
+}
+
 if (status != NDIS_STATUS_SUCCESS) {
 OvsCompleteNBLForwardingCtx(,
 L"OVS-Dropped due to extract flow failure");
diff --git a/datapath-windows/ovsext/Flow.c b/datapath-windows/ovsext/Flow.c
index 02c41b7..3a397ce 100644
--- a/datapath-windows/ovsext/Flow.c
+++ b/datapath-windows/ovsext/Flow.c
@@ -1380,13 +1380,20 @@ _MapKeyAttrToFlowPut(PNL_ATTR *keyAttrs,
 {
 _MapTunAttrToFlowPut(keyAttrs, tunnelAttrs, destKey);
 
+/*  L3 + L4.  */
+destKey->l2.keyLen = OVS_WIN_TUNNEL_KEY_SIZE + OVS_L2_KEY_SIZE
+ - destKey->l2.offset;
+
 if (keyAttrs[OVS_KEY_ATTR_RECIRC_ID]) {
 destKey->recircId = NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_RECIRC_ID]);
-destKey->l2.keyLen += sizeof(destKey->recircId);
 }
 
 if (keyAttrs[OVS_KEY_ATTR_DP_HASH]) {
 destKey->dpHash = NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_DP_HASH]);
+}
+
+if (destKey->recircId || destKey->dpHash) {
+destKey->l2.keyLen += sizeof(destKey->recircId);
 destKey->l2.keyLen += sizeof(destKey->dpHash);
 }
 
@@ -1413,10 +1420,6 @@ _MapKeyAttrToFlowPut(PNL_ATTR *keyAttrs,
 destKey->l2.vlanTci = NlAttrGetU16(keyAttrs[OVS_KEY_ATTR_VLAN]);
 }
 
-/*  L3 + L4.  */
-destKey->l2.keyLen = OVS_WIN_TUNNEL_KEY_SIZE + OVS_L2_KEY_SIZE
- - destKey->l2.offset;
-
 switch (ntohs(destKey->l2.dlType)) {
 case ETH_TYPE_IPV4: {
 
@@ -1757,23 +1760,22 @@ DeleteAllFlows(OVS_DATAPATH *datapath)
 }
 }
 
-NDIS_STATUS
+VOID
 OvsGetFlowMetadata(OvsFlowKey *key,
PNL_ATTR *keyAttrs)
 {
-NDIS_STATUS status = NDIS_STATUS_SUCCESS;
-
 if (keyAttrs[OVS_KEY_ATTR_RECIRC_ID]) {
 key->recircId = NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_RECIRC_ID]);
-key->l2.keyLen += sizeof(key->recircId);
 }
 
 if (keyAttrs[OVS_KEY_ATTR_DP_HASH]) {
 key->dpHash = NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_DP_HASH]);
-key->l2.keyLen += sizeof(key->dpHash);
 }
 
-return status;
+if (key->recircId || key->dpHash) {
+key->l2.keyLen += sizeof(key->recircId);
+key->l2.keyLen += sizeof(key->dpHash);
+}
 }
 
 /*
diff --git a/datapath-windows/ovsext/Flow.h b/datapath-windows/ovsext/Flow.h
index 310c472..486e449 100644
--- a/datapath-windows/ovsext/Flow.h
+++ b/datapath-windows/ovsext/Flow.h
@@ -51,8 +51,8 @@ NDIS_STATUS OvsDeleteFlowTable(OVS_DATAPATH *datapath);
 NDIS_STATUS OvsAllocateFlowTable(OVS_DATAPATH *datapath,
  POVS_SWITCH_CONTEXT switchContext);
 
-NDIS_STATUS OvsGetFlowMetadata(OvsFlowKey *key,
-   PNL_ATTR *keyAttrs);
+VOID OvsGetFlowMetadata(OvsFlowKey *key,
+PNL_ATTR *keyAttrs);
 NDIS_STATUS OvsExtractFlow(const NET_BUFFER_LIST *pkt, UINT32 inPort,
OvsFlowKey *flow, POVS_PACKET_HDR_INFO layers,
OvsIPv4TunnelKey *tunKey);
diff --git a/datapath-windows/ovsext/User.c b/datapath-windows/ovsext/User.c
index 6b2d94a..902e35a 100644
--- a/datapath-windows/ovsext/User.c
+++ b/datapath-windows/ovsext/User.c
@@ -430,13 +430,11 @@ OvsExecuteDpIoctl(OvsPacketExecute *execute)
 }
 // XXX: Figure out if any of the other members of fwdDetail need to be set.
 
-status = OvsGetFlowMetadata(, execute->keyAttrs);
-if (status != STATUS_SUCCESS) {
-goto dropit;
-}
-
 ndisStatus = OvsExtractFlow(pNbl, fwdDetail->SourcePortId, , ,
 NULL);
+
+OvsGetFlowMetadata(, execute->keyAttrs);
+
 if (ndisStatus == NDIS_STATUS_SUCCESS) {
 NdisAcquireRWLockRead(gOvsSwitchContext->dispatchLock, , 0);
 ndisStatus = OvsActionsExecute(gOvsSwitchContext, NULL, pNbl,
@@ -456,7 +454,6 @@ 

Re: [ovs-dev] [PATCH 1/2] [Patch V3] OVN: Physical Endpoints

2016-03-29 Thread Darrell Lu
>
> please see responses inline
>
> -- Forwarded message --
> From: Russell Bryant 
> Date: Tue, Mar 29, 2016 at 11:18 AM
> Subject: Re: [ovs-dev] [PATCH 1/2] [Patch V3] OVN: Physical Endpoints
> To: Darrell Ball 
> Cc: ovs dev 
>
>
>
>
> On Tue, Mar 29, 2016 at 12:56 AM, Darrell Ball  wrote:
>
>> The following patch series implements physical-logical separation
>> to be used presently by gateways.
>>
>> The physical endpoint changes allow the physical network to be
>> managed more easily by a dedicated provider network management function
>> and also allow the northbound schema to remain purely logical.
>> Another benefit to allow more easily for additional encapsulations to be
>> used when interacting with physical provider networks.
>>
>> Physical endpoints are defined here as endpoints at the border of a
>> physical network.
>> This presently applies to gateways.
>> If no physical endpt, the encap is assumed empty.
>> For gateways, a single physical endpoint must be bound to each logical
>> port.
>> The chassis name must match the chassis system-id. All 6 arguments must
>> be supplied for gateway physical endpoints configuration.
>> Support is provided to bind multiple physical endpoints to a logical port
>> for
>> future considerations.
>>
>> Physical Endpoint Patch:
>>
>> ovn-sb.ovsschema: Add physical endpoint table and phys_endpts to logical
>> ports
>>
>> ovn-sb.xml: Update documentation regarding physical endpoints and their
>> binding to logical ports. Also describe possible future encapsulation
>> type usages.
>>
>> ovn-sbctl.c: Add configuration for physcial endpoints and binding to
>> logical ports.
>>
>> ovn-sbctl.8.in: Update the ovn-sbctl man page for physical endpoints and
>> binding
>> to logical ports.
>>
>> Signed-off-by: Darrell Ball 
>> ---
>>  ovn/ovn-sb.ovsschema |  23 +++-
>>  ovn/ovn-sb.xml   |  75 +++
>>  ovn/utilities/ovn-sbctl.8.in |  47 +++
>>  ovn/utilities/ovn-sbctl.c| 290
>> ++-
>>  4 files changed, 429 insertions(+), 6 deletions(-)
>>
>> diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema
>> index 32f8748..14ecb99 100644
>> --- a/ovn/ovn-sb.ovsschema
>> +++ b/ovn/ovn-sb.ovsschema
>> @@ -1,7 +1,7 @@
>>  {
>>  "name": "OVN_Southbound",
>> -"version": "1.2.0",
>> -"cksum": "1259182303 5368",
>> +"version": "1.3.0",
>> +"cksum": "4176169852 6394",
>>
>
> The patch will have to be rebased as this has changed in the meantime.
>
>
>>  "tables": {
>>  "Chassis": {
>>  "columns": {
>> @@ -71,6 +71,21 @@
>>   "min": 0, "max": "unlimited"}}},
>>  "indexes": [["tunnel_key"]],
>>  "isRoot": true},
>> +"Physical_Endpoint": {
>> +"columns": {
>> +"name": {"type": "string"},
>> +"chassis": {"type": {"key": {"type": "uuid",
>> + "refTable": "Chassis",
>> + "refType": "weak"},
>> + "min": 1, "max": 1}},
>> +"chassis_port": {"type": {"key": "string", "min": 1,
>> "max": 1}},
>> +"type": {"type": {"key": {
>> +   "type": "string",
>> +   "enum": ["set", ["vlan"]]}}},
>> +"ingress_encap": {"type": "string"},
>> +"egress_encap": {"type": "string"}},
>> +"indexes": [["name"]],
>> +"isRoot": true},
>>  "Port_Binding": {
>>  "columns": {
>>  "logical_port": {"type": "string"},
>> @@ -96,6 +111,10 @@
>>   "refTable": "Chassis",
>>   "refType": "weak"},
>>   "min": 0, "max": 1}},
>> +"phys_endpts": {"type": {"key": {"type": "uuid",
>> + "refTable":
>> "Physical_Endpoint",
>> + "refType": "weak"},
>> + "min": 0, "max": "unlimited"}},
>>  "mac": {"type": {"key": "string",
>>   "min": 0,
>>   "max": "unlimited"}}},
>>
>
> The code and documentation both seem to imply that there should be at most
> 1 Physical_Endpoint per Port_Binding.  Should "unlimited" by "1" here?  If
> multiple endpoints gains a meaning in the future, it can be changed.
>
> >> The port binding code and documentation clearly support binding
> multiple physical endpoints to a logical
> >> port.
> >> look here: *cmd_lport_bind_phys_endpts*(*struct* ctl_context *ctx)
>
>
> More broadly, I'm also wondering if having Physical_Endpoints on each
> Port_Binding makes sense.  We 

Re: [ovs-dev] [PATCH v2] datapath-windows: Update Recirculation to use the right parameters

2016-03-29 Thread Alin Serdean
I acked the port part. I will send out a patch that deals with the keylen soon. 
I found some more things I don't like.

Alin.
> -Mesaj original-
> De la: Sairam Venugopal [mailto:vsai...@vmware.com]
> Trimis: Wednesday, March 30, 2016 3:53 AM
> Către: Alin Serdean ; Nithin Raju
> ; dev@openvswitch.org
> Subiect: Re: [ovs-dev] [PATCH v2] datapath-windows: Update Recirculation
> to use the right parameters
> 
> Hi Alin,
> 
> I have sent out a newer series of patches with the changes. These are
> necessary to fix the master and test out Connection Tracking patch. We can
> circle back and cleanup Flow.c to use KeyLen and align it by 8, after ensuring
> that things are working properly.
> 
> 
> Thanks,
> Sairam
> 
> On 3/29/16, 9:28 AM, "Alin Serdean" 
> wrote:
> 
> >Comments inlined.
> >
> >
> >
> >Thanks,
> >
> >Alin.
> >
> >> -Mesaj original-
> >
> >> This comparison determines the value for ŒisRecv¹ parameter. ŒisRecv¹
> >
> >> determines whether the OOB data should be interpreted as receive data
> >> or
> >
> >> send data. So, the existing code is checking for:
> >
> >> srcPortNo == switchContext->virtualExternalPortId
> >
> >>
> >
> >> Left side data is a UINT32 and right side data is a NDIS_SWITCH_PORT_ID.
> >
> >> So, clearly this comparison is not right since we are comparing
> >>different
> >
> >> types. They are not expected to have the same value even if they are
> >
> >> representing the same vport.
> >
> >>
> >
> >[Alin Gabriel Serdean: ] from the header ntddndis.h:
> >
> >typedef UINT32 NDIS_SWITCH_PORT_ID, *PNDIS_SWITCH_PORT_ID;
> >
> >I don't think it is a problem of type but a problem with the number
> >itself.
> >
> >> The proposed fix at least makes sure that we are comparing the right
> >>types.
> >
> >> So, I¹d go with it. Is the comparison right? It seems so. Basically
> >>we want to
> >
> >> determine if the packet came from a VIF or a physical NIC.
> >
> >> ŒfwdDetail¹ is a reliable way of doing it. Only way this could mess
> >>up is if we
> >
> >> mean to explicitly change the ŒsrcPortNo¹ during tunneling. In such
> >>cases,
> >
> >> the 'fwdDetail->SourcePortId¹ will end up different from the
> >>ŒsrcPortNo¹.
> >
> >> This is exactly why we use ŒsrcPortNo¹ for comparison around the code
> >> to
> >
> >> allow flexibility.
> >
> >[Alin Gabriel Serdean: ] fwdDetail is not reliable in our case because
> >we could have cloned the nbl and not update the OOB data and this is
> >not what we want to do in our case. We want to reprocess the packet as
> >it came from the same input port
> >(https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__github.com_openvs
> >wit
> >ch_ovs_blob_master_ofproto_ofproto-2Ddpif-2Drid.h-23L64-
> 2DL69=BQIGaQ&
> >c=S
> >qcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-
> uEs=Dcruz40PROJ40ROzSpxyQSLw6f
> >crO
> >WpJgEcEmNR3JEQ=SiAZF6ppbO8xujF9FCVi7GvwJdKI2aCc81fTzQUV0ZA
> =dzc7xmRZ
> >NyK
> >wLwlgvoBr97WN7PfuX_f4NXQjzkqcLVU= )
> >
> >Indeed, there is a problem because srcportno == vport->portno and we
> >should use in our case vport->portId when doing the comparison. I'll
> >send out a patch to update it.
> >
> >>
> >
> >>
> >
> >> In any case, the problematic case here is tunneling + recirc. We can
> >>deal with
> >
> >> that in a subsequent patch. For now, this is the right fix.
> >
> >>
> >
> >> I¹d also add a XXX comment to investigate that "tunneling + recirc²
> >>case in the
> >
> >> future.
> >
> >>
> >
> >> > ,
> >
> >> > ovsFwdCtx.switchContext,
> >> > diff
> >
> >> >--git a/datapath-windows/ovsext/Flow.c b/datapath-
> >
> >> windows/ovsext/Flow.c
> >
> >> >index 02c41b7..d49697c 100644
> >
> >> >--- a/datapath-windows/ovsext/Flow.c
> >
> >> >+++ b/datapath-windows/ovsext/Flow.c
> >
> >> >@@ -2133,6 +2133,9 @@ OvsLookupFlow(OVS_DATAPATH *datapath,
> >
> >> >
> >
> >> > if (!hashValid) {
> >
> >> > *hash = OvsJhashBytes(start, size, 0);
> >
> >> >+if (key->recircId) {
> >
> >> >+*hash = OvsJhashWords((UINT32*)hash, 1, key->recircId);
> >
> >> >+}
> >
> >>
> >
> >> Ok, we have two options:
> >
> >> 1. Increment ŒkeyLen¹ and include recircId as part of it. So, while
> >>hashing we
> >
> >> make sure that recircId gets included.
> >
> >> 2. Explicitly add Œkey->recirId¹ during hashing.
> >
> >>
> >
> >> I¹m ok with either way. The ŒkeyLen¹ approach is more efficient for
> >> now,
> >
> >> since it avoids a Œif¹ check and also an additional function call. In
> >>the future, if
> >
> >> we have a lot of meta data such as ŒrecircId¹, then we should
> >> consider
> >
> >> adding a Œkey->metaDataLen¹ and go from there. For now, I¹d add a
> >
> >> comment in ŒOvsFlowKey¹ to set future direction, and go with the
> >
> >> ŒkeyLen¹ approach.
> >
> >[Alin Gabriel Serdean: ] I would go with option one so we maintain the
> >same of hashing we have at the 

Re: [ovs-dev] [PATCH] dpif-netdev: Remove PMD latency on seq_mutex

2016-03-29 Thread Daniele Di Proietto


On 29/03/2016 06:08, "Flavio Leitner"  wrote:

>On Tue, Mar 29, 2016 at 02:13:18AM +, Daniele Di Proietto wrote:
>> Hi Flavio and Karl,
>> 
>> thanks for the patch! I have a couple of comments:
>> 
>> Can you point out a configuration where this is the bottleneck?
>> I'm interested in reproducing this.
>
>Karl, since you did the tests, could you please provide more details?
>
>
>> I think the implementation would look simpler if we could
>> avoid explicitly taking the mutex in dpif-netdev and instead
>> having a ovsrcu_try_quiesce(). What do you think?
>
>My concern is that it is freeing one entry from EMC each round
>and it should quiesce to allow the callbacks to run.  If, for
>some reason, it fails to quiesce for a long period, then it might
>backlog a significant number of entries.
>
>
>> I think we can avoid the recursive mutex as well if we introduce
>> some explicit APIs in seq (seq_try_lock, seq_change_protected and
>> seq_unlock), but I'd like to understand the performance implication
>> of this commit first.
>
>The issue is the latency spike when the PMD thread blocks on the
>busy mutex.
>
>The goal with recursive locking is to make sure we can sweep
>the EMC cache and quiesce without blocking.  Fixing seq API
>would help to not block, but then we have no control to whether
>we did both tasks in the same round.
>
>fbl

If I understand your concerns correctly, I think we can have something
like:

if (ovsrcu_try_quiesce()) {
...
emc_cache_slow_sweep();
...
}

Sure, the swept flows will need to wait another round to actually get
freed,
but I think this is ok

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [RFC 0/2] ovn: Add minimal software l2 gateway.

2016-03-29 Thread Russell Bryant
I keep getting hung up on the additional complexity introduced by the new
Physical_Endpoints table proposed here:

http://openvswitch.org/pipermail/dev/2016-March/068705.html

I wanted to see how much work it would be to implement a software L2 gateway
while trying to minimize the amount of change required to do so.

The real code changes here are really quite small.  A new port type called
"gateway" is introduced that works *very* close to the existing localnet ports.

I'm interested in feedback on this alternative approach.

Thanks,

Russell Bryant (2):
  ovn: Minor refactoring.
  ovn: Add minimal software l2 gateway.

 ovn/controller/binding.c|  12 ++-
 ovn/controller/ovn-controller.8.xml |  15 ++--
 ovn/controller/ovn-controller.c |   2 +-
 ovn/controller/patch.c  |  68 ---
 ovn/controller/patch.h  |   2 +-
 ovn/controller/physical.c   |  10 ++-
 ovn/ovn-nb.xml  |  19 +
 ovn/ovn-sb.xml  |  38 +
 tests/ovn.at| 164 
 9 files changed, 285 insertions(+), 45 deletions(-)

-- 
2.5.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [RFC 2/2] ovn: Add minimal software l2 gateway.

2016-03-29 Thread Russell Bryant
This patch implements one approach to using ovn-controller to implement
a software l2 gateway between logical and physical networks.

A new logical port type called "gateway" is introduced here.  It is very
close to how localnet ports work, with the following exception:
A localnet port makes OVN use the physical network as the
transport between hypervisors instead of tunnels. A gateway port still
uses tunnels between all hypervisors, and packets only go to/from the
specified physical network as needed via the chassis the gateway port
is bound to.

Signed-off-by: Russell Bryant 
---
 ovn/controller/binding.c|   5 ++
 ovn/controller/ovn-controller.8.xml |  15 ++--
 ovn/controller/ovn-controller.c |   2 +-
 ovn/controller/patch.c  |  24 --
 ovn/controller/patch.h  |   2 +-
 ovn/controller/physical.c   |  10 ++-
 ovn/ovn-nb.xml  |  19 +
 ovn/ovn-sb.xml  |  38 +
 tests/ovn.at| 164 
 9 files changed, 259 insertions(+), 20 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index d3ca9c9..ab12ecb 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -205,6 +205,11 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 }
 sbrec_port_binding_set_chassis(binding_rec, chassis_rec);
 }
+} else if (!strcmp(binding_rec->type, "gateway")
+   && binding_rec->chassis == chassis_rec) {
+/* A locally bound gateway port. */
+sset_add(_lports, binding_rec->logical_port);
+add_local_datapath(local_datapaths, binding_rec);
 } else if (binding_rec->chassis == chassis_rec) {
 if (ctx->ovnsb_idl_txn) {
 VLOG_INFO("Releasing lport %s from this chassis.",
diff --git a/ovn/controller/ovn-controller.8.xml 
b/ovn/controller/ovn-controller.8.xml
index 69d4cc0..277d5b4 100644
--- a/ovn/controller/ovn-controller.8.xml
+++ b/ovn/controller/ovn-controller.8.xml
@@ -169,18 +169,19 @@
 
   The presence of this key identifies a patch port as one created by
   ovn-controller to connect the integration bridge and
-  another bridge to implement a localnet logical port.
-  Its value is the name of the logical port with type=localnet that
-  the port implements.
+  another bridge to implement a localnet or
+  gateway logical port.  Its value is the name of the
+  logical port with type set to localnet
+  or gateway that the port implements.
   See external_ids:ovn-bridge-mappings, above,
   for more information.
 
 
 
-  Each localnet logical port is implemented as a pair of
-  patch ports, one in the integration bridge, one in a different
-  bridge, with the same external-ids:ovn-localnet-port
-  value.
+  Each localnet and gateway logical port
+  is implemented as a pair of patch ports, one in the integration
+  bridge, one in a different bridge, with the same
+  external-ids:ovn-localnet-port value.
 
   
 
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index e52b731..fd57e21 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -293,7 +293,7 @@ main(int argc, char *argv[])
 }
 
 if (br_int) {
-patch_run(, br_int, _datapaths);
+patch_run(, br_int, _datapaths, chassis_id);
 
 struct lport_index lports;
 struct mcgroup_index mcgroups;
diff --git a/ovn/controller/patch.c b/ovn/controller/patch.c
index 798cd7a..0d8fc5b 100644
--- a/ovn/controller/patch.c
+++ b/ovn/controller/patch.c
@@ -134,7 +134,8 @@ static void
 add_bridge_mappings(struct controller_ctx *ctx,
 const struct ovsrec_bridge *br_int,
 struct shash *existing_ports,
-struct hmap *local_datapaths)
+struct hmap *local_datapaths,
+const char *chassis_id)
 {
 /* Get ovn-bridge-mappings. */
 const char *mappings_cfg = "";
@@ -196,23 +197,30 @@ add_bridge_mappings(struct controller_ctx *ctx,
 continue;
 }
 ld->localnet_port = binding;
+} else if (!strcmp(binding->type, "gateway")) {
+if (!binding->chassis || strcmp(chassis_id, 
binding->chassis->name)) {
+/* This gateway port is not bound to this chassis, so we should
+ * not create any patch ports for it. */
+continue;
+}
 } else {
-/* Not a binding for a localnet port. */
+/* not a localnet or gateway port. */
 continue;
 }
 
 const char *network = 

[ovs-dev] [PATCH 1/2] ovn: Minor refactoring.

2016-03-29 Thread Russell Bryant
This commit applies a minor restructuring of this code to put the
localnet port specific code in its own block.  This is mostly to make a
future patch easier to read.

Signed-off-by: Russell Bryant 
---
 ovn/controller/patch.c | 44 ++--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/ovn/controller/patch.c b/ovn/controller/patch.c
index 753ce3e..798cd7a 100644
--- a/ovn/controller/patch.c
+++ b/ovn/controller/patch.c
@@ -175,32 +175,32 @@ add_bridge_mappings(struct controller_ctx *ctx,
 
 const struct sbrec_port_binding *binding;
 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
-if (strcmp(binding->type, "localnet")) {
+if (!strcmp(binding->type, "localnet")) {
+struct local_datapath *ld;
+ld = CONTAINER_OF(hmap_first_with_hash(local_datapaths,
+  binding->datapath->tunnel_key),
+  struct local_datapath, hmap_node);
+if (!ld) {
+/* This localnet port is on a datapath with no
+ * logical ports bound to this chassis, so there's no need
+ * to create patch ports for it. */
+continue;
+}
+if (ld->localnet_port) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+VLOG_WARN_RL(, "localnet port '%s' already set for datapath 
"
+ "'%"PRId64"', skipping the new port '%s'.",
+ ld->localnet_port->logical_port,
+ binding->datapath->tunnel_key,
+ binding->logical_port);
+continue;
+}
+ld->localnet_port = binding;
+} else {
 /* Not a binding for a localnet port. */
 continue;
 }
 
-struct local_datapath *ld;
-ld = CONTAINER_OF(hmap_first_with_hash(local_datapaths,
-  binding->datapath->tunnel_key),
-  struct local_datapath, hmap_node);
-if (!ld) {
-/* This localnet port is on a datapath with no
- * logical ports bound to this chassis, so there's no need
- * to create patch ports for it. */
-continue;
-}
-if (ld->localnet_port) {
-static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-VLOG_WARN_RL(, "localnet port '%s' already set for datapath "
- "'%"PRId64"', skipping the new port '%s'.",
- ld->localnet_port->logical_port,
- binding->datapath->tunnel_key,
- binding->logical_port);
-continue;
-}
-ld->localnet_port = binding;
-
 const char *network = smap_get(>options, "network_name");
 if (!network) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-- 
2.5.5

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath-windows: Update Recirculation to use the right parameters

2016-03-29 Thread Sairam Venugopal
Hi Alin,

I have sent out a newer series of patches with the changes. These are
necessary to fix the master and test out Connection Tracking patch. We can
circle back and cleanup Flow.c to use KeyLen and align it by 8, after
ensuring that things are working properly.


Thanks,
Sairam

On 3/29/16, 9:28 AM, "Alin Serdean" 
wrote:

>Comments inlined.
>
>
>
>Thanks,
>
>Alin.
>
>> -Mesaj original-
>
>> This comparison determines the value for ŒisRecv¹ parameter. ŒisRecv¹
>
>> determines whether the OOB data should be interpreted as receive data or
>
>> send data. So, the existing code is checking for:
>
>> srcPortNo == switchContext->virtualExternalPortId
>
>> 
>
>> Left side data is a UINT32 and right side data is a NDIS_SWITCH_PORT_ID.
>
>> So, clearly this comparison is not right since we are comparing
>>different
>
>> types. They are not expected to have the same value even if they are
>
>> representing the same vport.
>
>> 
>
>[Alin Gabriel Serdean: ] from the header ntddndis.h:
>
>typedef UINT32 NDIS_SWITCH_PORT_ID, *PNDIS_SWITCH_PORT_ID;
>
>I don't think it is a problem of type but a problem with the number
>itself.
>
>> The proposed fix at least makes sure that we are comparing the right
>>types.
>
>> So, I¹d go with it. Is the comparison right? It seems so. Basically we
>>want to
>
>> determine if the packet came from a VIF or a physical NIC.
>
>> ŒfwdDetail¹ is a reliable way of doing it. Only way this could mess up
>>is if we
>
>> mean to explicitly change the ŒsrcPortNo¹ during tunneling. In such
>>cases,
>
>> the 'fwdDetail->SourcePortId¹ will end up different from the
>>ŒsrcPortNo¹.
>
>> This is exactly why we use ŒsrcPortNo¹ for comparison around the code to
>
>> allow flexibility.
>
>[Alin Gabriel Serdean: ] fwdDetail is not reliable in our case because we
>could have cloned the nbl and not update the OOB data and this is not
>what we want to do in our case. We want to reprocess the packet as it
>came from the same input port
>(https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openvswit
>ch_ovs_blob_master_ofproto_ofproto-2Ddpif-2Drid.h-23L64-2DL69=BQIGaQ=S
>qcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs=Dcruz40PROJ40ROzSpxyQSLw6fcrO
>WpJgEcEmNR3JEQ=SiAZF6ppbO8xujF9FCVi7GvwJdKI2aCc81fTzQUV0ZA=dzc7xmRZNyK
>wLwlgvoBr97WN7PfuX_f4NXQjzkqcLVU= )
>
>Indeed, there is a problem because srcportno == vport->portno and we
>should use in our case vport->portId when doing the comparison. I'll send
>out a patch to update it.
>
>> 
>
>> 
>
>> In any case, the problematic case here is tunneling + recirc. We can
>>deal with
>
>> that in a subsequent patch. For now, this is the right fix.
>
>> 
>
>> I¹d also add a XXX comment to investigate that "tunneling + recirc²
>>case in the
>
>> future.
>
>> 
>
>> > ,
>
>> > ovsFwdCtx.switchContext, diff
>
>> >--git a/datapath-windows/ovsext/Flow.c b/datapath-
>
>> windows/ovsext/Flow.c
>
>> >index 02c41b7..d49697c 100644
>
>> >--- a/datapath-windows/ovsext/Flow.c
>
>> >+++ b/datapath-windows/ovsext/Flow.c
>
>> >@@ -2133,6 +2133,9 @@ OvsLookupFlow(OVS_DATAPATH *datapath,
>
>> >
>
>> > if (!hashValid) {
>
>> > *hash = OvsJhashBytes(start, size, 0);
>
>> >+if (key->recircId) {
>
>> >+*hash = OvsJhashWords((UINT32*)hash, 1, key->recircId);
>
>> >+}
>
>> 
>
>> Ok, we have two options:
>
>> 1. Increment ŒkeyLen¹ and include recircId as part of it. So, while
>>hashing we
>
>> make sure that recircId gets included.
>
>> 2. Explicitly add Œkey->recirId¹ during hashing.
>
>> 
>
>> I¹m ok with either way. The ŒkeyLen¹ approach is more efficient for now,
>
>> since it avoids a Œif¹ check and also an additional function call. In
>>the future, if
>
>> we have a lot of meta data such as ŒrecircId¹, then we should consider
>
>> adding a Œkey->metaDataLen¹ and go from there. For now, I¹d add a
>
>> comment in ŒOvsFlowKey¹ to set future direction, and go with the
>
>> ŒkeyLen¹ approach.
>
>[Alin Gabriel Serdean: ] I would go with option one so we maintain the
>same of hashing we have at the moment.
>
>> 
>
>> Of course, there¹s a bug when ŒkeyLen¹ gets set in
>
>> _MapKeyAttrToFlowPut().
>
>> The value gets incremented due to ŒrecircId¹ but gets reset again.
>
>> 
>
>> > }
>
>> >
>
>> > head = >flowTable[HASH_BUCKET(*hash)];
>
>> 
>
>> ___
>
>> dev mailing list
>
>> dev@openvswitch.org
>
>> 
>>https://urldefense.proofpoint.com/v2/url?u=http-3A__openvswitch.org_mailm
>>an_listinfo_dev=BQIGaQ=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs=
>>Dcruz40PROJ40ROzSpxyQSLw6fcrOWpJgEcEmNR3JEQ=SiAZF6ppbO8xujF9FCVi7GvwJdK
>>I2aCc81fTzQUV0ZA=rvvo6cM83_V6xVB2DDLyTXvP7e4vuYT5SLhq9HLgLT4=
>

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [RFC PATCH] tunneling: Improving vxlan performance using DPDK flow director feature.

2016-03-29 Thread Jesse Gross
On Tue, Mar 29, 2016 at 12:43 AM, Chandran, Sugesh
 wrote:
>> -Original Message-
>> From: Jesse Gross [mailto:je...@kernel.org]
>> Sent: Friday, March 25, 2016 12:38 AM
>> To: Chandran, Sugesh 
>> Cc: dev@openvswitch.org
>> Subject: Re: [ovs-dev] [RFC PATCH] tunneling: Improving vxlan performance
>> using DPDK flow director feature.
>>  * Chaining together the multiple lookups used by tunnels on the assumption
>> that the outer VXLAN source port distinguishes the inner flow. This would
>> allow avoiding netdev_flow_key_equal_mf() a second time. This is definitely
>> not legal because the VXLAN source port is only capturing a small subset of
>> the total data that OVS is using.
>
> [Sugesh] From our analysis we found that optimizing one lookup give no
> significant performance boost when compared with the overhead. This is  due 
> to the
> fact that the second netdev_flow_key_equal_mf() still need the tunnel 
> information
> to match on a flow.  We found in our tests that most CPU cycles spends on 
> extracting
> header fields from the packets than lookup.
>
> The proposal is to avoid the header field extraction by using an additional 
> unique software
> flow ID to match on. The two flows for tunnel are marked with this ID when 
> installing on the
> EMC. The hardware report this ID along with hash(to mitigate the hash 
> collision in EMC)
> for every incoming packets that match on a hardware rule. This used in EMC
> along with hash to find the flow. Currently OVS compares  hash +key(from 
> header fields)
> to match a flow. The inner flow uses the same unique ID and hardware flow 
> flag to match
> on than the source port. We have modified the code little bit more, so that 
> it saves the hardware
> id in the matching flow, for every emc_insert.

I think that the performance improvements look cool but unfortunately,
I just don't see how this can work.

There really isn't a way to avoid extracting the header fields in
software - I don't think that any NIC short of an NPU or other
programmable hardware has the capability to match on all of the fields
that OVS supports. Certainly, the UDP source port used by VXLAN and
other tunnel protocols does not contain all of the information and,
worse, it's controlled by a remote system. We can't trust the
information contained in it without further verification because OVS
flow rules are often used for security checks. I realize that in many
cases this will appear to work because for a flow represented by a
5-tuple many of the other fields will be the same. However, we can't
just make this assumption.

One possible exception to this rule is if we did an analysis on the
flows that are actually being used by OVS and only tried to extract
those fields. This is a pure software optimization that might have
similar effects to what you are observing here. This most likely makes
the most sense in the context of a BPF based datapath where the flow
extractor can be dynamically generated and compiled.

>> I'm not sure that I really see any advantage in using a Flow Director perfect
>> filter to return a software defined hash value compared to just using the RSS
>> hash directly as we are doing today. I think the main case where it would be
>> useful is if hardware wildcarding was used to skip the EMC altogether and its
>> size constraints. If that was done then I think that this would no longer be
>> specialized to VXLAN at all.
> [Sugesh] This may give performance improvement when we have
> large set of rules that overflows EMC. But for a typical use case where 
> 80-90% rules hits EMC
> doesn’t get any performance benefit out of it. Please correct me if I am 
> wrong here.
> The intention here is to optimize the tunneling performance in all the use 
> cases.

To be honest, I think that last 10-20% may be more interesting. Up to
this point in time, the DPDK implementation in OVS has placed a lot of
emphasis on PPS throughput with a relatively small number of streams.
However, while this looks great on benchmarks, it doesn't necessarily
match real world use cases. Even worse, it tends to fall apart at the
worst possible times - like a DoS attack. If the NIC were able to
effectively enlarge the EMC to handle these cases then I think that
would be a huge boost to the usability of OVS on DPDK.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 3/3] datapath-windows: Add Connection Tracking Support

2016-03-29 Thread Sairam Venugopal
Enables support for Stateful Firewall in Hyper-V by adding a Connection
Tracking module. The module has been ported over from the userspace
implementation patch of a similar name.

The current version of the module supports ct_zone for TCP packets.
Support for ct_mark, ct_label and other packet formats will be added in
subsequent patches.

The conntrack-tcp module is adapted from FreeBSD's pf subsystem and hence the
BSD license. It has been ported over to match OVS Hyper-V coding style.
The update to NOTICE and debian/copyright.in will be handled in another
patch.

Signed-off-by: Sairam Venugopal 
Signed-off-by: Daniele Di Proietto 
Co-Authored-by: Daniele Di Proietto 
---
 datapath-windows/automake.mk|   3 +
 datapath-windows/ovsext/Actions.c   |  24 ++
 datapath-windows/ovsext/Conntrack-tcp.c | 533 
 datapath-windows/ovsext/Conntrack.c | 472 
 datapath-windows/ovsext/Conntrack.h | 110 +++
 datapath-windows/ovsext/Debug.h |   1 +
 datapath-windows/ovsext/DpInternal.h|   7 +
 datapath-windows/ovsext/Flow.c  |  98 +-
 datapath-windows/ovsext/Switch.c|  10 +-
 datapath-windows/ovsext/Util.h  |   3 +-
 datapath-windows/ovsext/ovsext.vcxproj  |   3 +
 11 files changed, 1259 insertions(+), 5 deletions(-)
 create mode 100644 datapath-windows/ovsext/Conntrack-tcp.c
 create mode 100644 datapath-windows/ovsext/Conntrack.c
 create mode 100644 datapath-windows/ovsext/Conntrack.h

diff --git a/datapath-windows/automake.mk b/datapath-windows/automake.mk
index 04fc97f..c9af806 100644
--- a/datapath-windows/automake.mk
+++ b/datapath-windows/automake.mk
@@ -13,6 +13,9 @@ EXTRA_DIST += \
datapath-windows/ovsext/Atomic.h \
datapath-windows/ovsext/BufferMgmt.c \
datapath-windows/ovsext/BufferMgmt.h \
+   datapath-windows/ovsext/Conntrack-tcp.c \
+   datapath-windows/ovsext/Conntrack.c \
+   datapath-windows/ovsext/Conntrack.h \
datapath-windows/ovsext/Datapath.c \
datapath-windows/ovsext/Datapath.h \
datapath-windows/ovsext/Debug.c \
diff --git a/datapath-windows/ovsext/Actions.c 
b/datapath-windows/ovsext/Actions.c
index 3e5dac9..54ad04f 100644
--- a/datapath-windows/ovsext/Actions.c
+++ b/datapath-windows/ovsext/Actions.c
@@ -17,6 +17,7 @@
 #include "precomp.h"
 
 #include "Actions.h"
+#include "Conntrack.h"
 #include "Debug.h"
 #include "Event.h"
 #include "Flow.h"
@@ -1786,6 +1787,29 @@ OvsDoExecuteActions(POVS_SWITCH_CONTEXT switchContext,
 break;
 }
 
+case OVS_ACTION_ATTR_CT:
+{
+if (ovsFwdCtx.destPortsSizeOut > 0
+|| ovsFwdCtx.tunnelTxNic != NULL
+|| ovsFwdCtx.tunnelRxNic != NULL) {
+status = OvsOutputBeforeSetAction();
+if (status != NDIS_STATUS_SUCCESS) {
+dropReason = L"OVS-adding destination failed";
+goto dropit;
+}
+}
+
+status = OvsExecuteConntrackAction(ovsFwdCtx.switchContext,
+   ovsFwdCtx.curNbl, layers,
+   key, (const PNL_ATTR)a);
+if (status != NDIS_STATUS_SUCCESS) {
+OVS_LOG_ERROR("CT Action failed");
+dropReason = L"OVS-conntrack action failed";
+goto dropit;
+}
+break;
+}
+
 case OVS_ACTION_ATTR_RECIRC:
 {
 if (ovsFwdCtx.destPortsSizeOut > 0 || ovsFwdCtx.tunnelTxNic != NULL
diff --git a/datapath-windows/ovsext/Conntrack-tcp.c 
b/datapath-windows/ovsext/Conntrack-tcp.c
new file mode 100644
index 000..71b9378
--- /dev/null
+++ b/datapath-windows/ovsext/Conntrack-tcp.c
@@ -0,0 +1,533 @@
+/*-
+ * Copyright (c) 2001 Daniel Hartmeier
+ * Copyright (c) 2002 - 2008 Henning Brauer
+ * Copyright (c) 2012 Gleb Smirnoff 
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *- Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ *- Redistributions in binary form must reproduce the above
+ *  copyright notice, this list of conditions and the following
+ *  disclaimer in the documentation and/or other materials provided
+ *  with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * 

[ovs-dev] [PATCH 2/3] datapath-windows: Update flow lookup to support RecircId and DpHash

2016-03-29 Thread Sairam Venugopal
Update the OvsLookupFlow to include RecircId and DpHash in its flow
comparison. Revert the keyLen related changes until they are aligned
appropriately.

Signed-off-by: Sairam Venugopal 
---
 datapath-windows/ovsext/Flow.c | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/datapath-windows/ovsext/Flow.c b/datapath-windows/ovsext/Flow.c
index 02c41b7..0d48963 100644
--- a/datapath-windows/ovsext/Flow.c
+++ b/datapath-windows/ovsext/Flow.c
@@ -1382,12 +1382,10 @@ _MapKeyAttrToFlowPut(PNL_ATTR *keyAttrs,
 
 if (keyAttrs[OVS_KEY_ATTR_RECIRC_ID]) {
 destKey->recircId = NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_RECIRC_ID]);
-destKey->l2.keyLen += sizeof(destKey->recircId);
 }
 
 if (keyAttrs[OVS_KEY_ATTR_DP_HASH]) {
 destKey->dpHash = NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_DP_HASH]);
-destKey->l2.keyLen += sizeof(destKey->dpHash);
 }
 
 /* = L2 headers = */
@@ -1765,12 +1763,10 @@ OvsGetFlowMetadata(OvsFlowKey *key,
 
 if (keyAttrs[OVS_KEY_ATTR_RECIRC_ID]) {
 key->recircId = NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_RECIRC_ID]);
-key->l2.keyLen += sizeof(key->recircId);
 }
 
 if (keyAttrs[OVS_KEY_ATTR_DP_HASH]) {
 key->dpHash = NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_DP_HASH]);
-key->l2.keyLen += sizeof(key->dpHash);
 }
 
 return status;
@@ -2032,7 +2028,7 @@ OvsExtractFlow(const NET_BUFFER_LIST *packet,
 }
 
 __inline BOOLEAN
-FlowEqual(UINT64 *src, UINT64 *dst, UINT32 size)
+FlowMemoryEqual(UINT64 *src, UINT64 *dst, UINT32 size)
 {
 UINT32 i;
 ASSERT((size & 0x7) == 0);
@@ -2046,6 +2042,20 @@ FlowEqual(UINT64 *src, UINT64 *dst, UINT32 size)
 return TRUE;
 }
 
+__inline BOOLEAN
+FlowEqual(OvsFlow *flow, const OvsFlowKey *key, UINT64 hash)
+{
+UINT16 size = key->l2.keyLen;
+UINT16 offset = key->l2.offset;
+UINT64 *src = (UINT64 *)((UINT8 *)>key + offset);
+UINT64 *dst = (UINT64 *)((UINT8 *)key + offset);
+
+return (flow->hash == hash &&
+flow->key.l2.val == key->l2.val &&
+flow->key.recircId == key->recircId &&
+flow->key.dpHash == key->dpHash &&
+FlowMemoryEqual(src, dst, size));
+}
 
 /*
  * 
@@ -2133,6 +2143,12 @@ OvsLookupFlow(OVS_DATAPATH *datapath,
 
 if (!hashValid) {
 *hash = OvsJhashBytes(start, size, 0);
+if (key->recircId) {
+*hash = OvsJhashWords((UINT32*)hash, 1, key->recircId);
+}
+if (key->dpHash) {
+*hash = OvsJhashWords((UINT32*)hash, 1, key->dpHash);
+}
 }
 
 head = >flowTable[HASH_BUCKET(*hash)];
@@ -2140,10 +2156,7 @@ OvsLookupFlow(OVS_DATAPATH *datapath,
 while (link != head) {
 OvsFlow *flow = CONTAINING_RECORD(link, OvsFlow, ListEntry);
 
-if (flow->hash == *hash &&
-flow->key.l2.val == key->l2.val &&
-FlowEqual((UINT64 *)((uint8 *)>key + offset),
- (UINT64 *)start, size)) {
+if (FlowEqual(flow, key, *hash)) {
 return flow;
 }
 link = link->Flink;
-- 
2.5.0.windows.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 0/3] datapath-windows: Enable support for Connection Tracking

2016-03-29 Thread Sairam Venugopal
Add support for Stateful firewall in Hyper-V by implementing Connection 
tracking module.
The module has been ported from the patch 
(https://patchwork.ozlabs.org/patch/544906/) authored by Daniele Di Proietto 


The first two patches fixes the issues with Master that prohibits Connection 
Tracking from working properly. The connection tracking currently works with 
ct_zones on TCP packets. ct_labels, ct_mark and other formats will be supported 
in future patches.

Sairam Venugopal (3):
  datapath-windows: Update Recirculation to use portId instead of portNo
  datapath-windows: Update flow lookup to support RecircId and DpHash
  datapath-windows: Add Connection Tracking Support

 datapath-windows/automake.mk|   3 +
 datapath-windows/ovsext/Actions.c   |  26 +-
 datapath-windows/ovsext/Conntrack-tcp.c | 533 
 datapath-windows/ovsext/Conntrack.c | 472 
 datapath-windows/ovsext/Conntrack.h | 110 +++
 datapath-windows/ovsext/Debug.h |   1 +
 datapath-windows/ovsext/DpInternal.h|   7 +
 datapath-windows/ovsext/Flow.c  | 129 +++-
 datapath-windows/ovsext/Switch.c|  10 +-
 datapath-windows/ovsext/Util.h  |   3 +-
 datapath-windows/ovsext/ovsext.vcxproj  |   3 +
 11 files changed, 1282 insertions(+), 15 deletions(-)
 create mode 100644 datapath-windows/ovsext/Conntrack-tcp.c
 create mode 100644 datapath-windows/ovsext/Conntrack.c
 create mode 100644 datapath-windows/ovsext/Conntrack.h

-- 
2.5.0.windows.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 1/3] datapath-windows: Update Recirculation to use portId instead of portNo

2016-03-29 Thread Sairam Venugopal
Fix OvsDoRecirc to use the right PortId when there is a flow miss. This is
used to determine if a packet is received or transmitted by comparing
against the virtualExternalPortId.

Signed-off-by: Sairam Venugopal 
---
 datapath-windows/ovsext/Actions.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/datapath-windows/ovsext/Actions.c 
b/datapath-windows/ovsext/Actions.c
index a91454d..3e5dac9 100644
--- a/datapath-windows/ovsext/Actions.c
+++ b/datapath-windows/ovsext/Actions.c
@@ -2015,7 +2015,7 @@ OvsDoRecirc(POVS_SWITCH_CONTEXT switchContext,
 }
 status = OvsCreateAndAddPackets(NULL, 0, OVS_PACKET_CMD_MISS,
 vport, key, ovsFwdCtx.curNbl,
-srcPortNo ==
+vport->portId ==
 switchContext->virtualExternalPortId,
 ,
 ovsFwdCtx.switchContext,
-- 
2.5.0.windows.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn-controller: Loopback prevention flows for local ports only.

2016-03-29 Thread Han Zhou
Currently in physical_run() we added per-port loopback prevention
flows for all lports. The flows are actually required only for
local ports on the chassis. This change greatly reduces number of
flows in table 34.

Signed-off-by: Han Zhou 
---
 ovn/controller/physical.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 657c3e2..1f11e48 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -384,6 +384,18 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id 
mff_ovn_geneve,
 ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 100, ,
 );
 
+/* Table 34, Priority 100.
+ * ===
+ *
+ * Drop packets whose logical inport and outport are the same. */
+match_init_catchall();
+ofpbuf_clear();
+match_set_metadata(, htonll(binding->datapath->tunnel_key));
+match_set_reg(, MFF_LOG_INPORT - MFF_REG0, 
binding->tunnel_key);
+match_set_reg(, MFF_LOG_OUTPORT - MFF_REG0, 
binding->tunnel_key);
+ofctrl_add_flow(flow_table, OFTABLE_DROP_LOOPBACK, 100,
+, );
+
 /* Table 64, Priority 100.
  * ===
  *
@@ -469,18 +481,6 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id 
mff_ovn_geneve,
 ofctrl_add_flow(flow_table, OFTABLE_REMOTE_OUTPUT, 100,
 , );
 }
-
-/* Table 34, Priority 100.
- * ===
- *
- * Drop packets whose logical inport and outport are the same. */
-match_init_catchall();
-ofpbuf_clear();
-match_set_metadata(, htonll(binding->datapath->tunnel_key));
-match_set_reg(, MFF_LOG_INPORT - MFF_REG0, binding->tunnel_key);
-match_set_reg(, MFF_LOG_OUTPORT - MFF_REG0, binding->tunnel_key);
-ofctrl_add_flow(flow_table, OFTABLE_DROP_LOOPBACK, 100,
-, );
 }
 
 /* Handle output to multicast groups, in tables 32 and 33. */
-- 
2.1.0

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [RFC] ofp-actions: Extend the use of max_len in OUTPUT action.

2016-03-29 Thread pravin shelar
On Tue, Mar 29, 2016 at 3:16 PM, William Tu  wrote:
> Before, OpenFlow specification defines 'max_len' in struct ofp_action_output
> as the max number of bytes to send when port is OFPP_CONTROLLER.  A max_len
> of zero means no bytes of the packet should be sent, and max_len of
> OFPCML_NO_BUFFER means the complete packet is sent to the controller.
> It is left undefined of max_len, when output port is not OFPP_CONTROLLER.
> The patch extends the use of max_len when output is non-controller.
>
> One use case is to enable port mirroring to send smaller packets to the
> destination port so that only useful packet information is mirrored/copied,
> saving some performance overhead of copying entire packet payload.
> The patch proposes adding a '(max_len=)' after the output action.  An
> example use case is below as well as shown in the tests/:
>
> - Output to port 1 with max_len 100 bytes.
> - The output packet size on port 1 will be MIN(original_packet_size, 100).
> # ovs-ofctl add-flow br0 'actions=output:1(max_len=100)'
>
> - The scope of max_len is limited to output action itself.  The following
>   output:1 and output:2 will be the original packet size.
> # ovs-ofctl add-flow br0 'actions=output:1(max_len=100),output:1,output:2'
>
> Implementation/Limitaions:
> - Userspace and kernel datapath is supported, no Windows support.
> - The minimum value of max_len is 60 byte (minimum Ethernet frame size).
>   This is defined in OVS_ACTION_OUTPUT_MIN.
> - Since 'max_len' is undefined in OpenFlow spec, the controller might
>   accidentally place a garbage value in max_len and send to OVS.
> - For compatibility, if the kernel datapath is not supported, set
>   max_len to zero.
> - OUTPUT_REG with max_len is not supported.
> - actions=1(max_len=100) is not supported, must specify as 'output:1'.
> - Only output:[0-9]*(max_len=) is supported.  Output to IN_PORT,
>   TABLE, NORMAL, FLOOD, ALL, and LOCAL are not supported.
>
> Signed-off-by: William Tu 
> ---
>  datapath/actions.c| 19 +--
>  datapath/datapath.h   |  1 +
>  datapath/flow_netlink.c   | 10 ++--
>  datapath/linux/compat/include/linux/openvswitch.h |  7 +++
>  datapath/vport.c  |  6 +++
>  lib/dp-packet.c   |  1 +
>  lib/dp-packet.h   |  1 +
>  lib/dpctl.c   | 19 ---
>  lib/dpif-netdev.c | 20 ++-
>  lib/netdev.c  |  8 +++
>  lib/netlink.h |  1 +
>  lib/odp-util.c| 27 --
>  lib/ofp-actions.c | 41 +++
>  lib/ofp-actions.h |  4 +-
>  ofproto/ofproto-dpif-xlate.c  | 33 +++-
>  ofproto/ofproto-dpif.c| 45 
>  ofproto/ofproto-dpif.h|  4 ++
>  tests/ofp-print.at|  6 +--
>  tests/ofproto-dpif.at | 53 +++
>  tests/system-traffic.at   | 63 
> +++
>  20 files changed, 330 insertions(+), 39 deletions(-)
>
> diff --git a/datapath/actions.c b/datapath/actions.c
> index dcf8591..d64dadf 100644
> --- a/datapath/actions.c
> +++ b/datapath/actions.c
> @@ -738,10 +738,15 @@ err:
>  }
>
>  static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
> - struct sw_flow_key *key)
> +uint16_t max_len, struct sw_flow_key *key)
>  {
> struct vport *vport = ovs_vport_rcu(dp, out_port);
>
> +   /* This is after skb_clone called from do_execute_actions,
> +  so max_len only applies to the current skb. */
> +   if (unlikely(max_len != 0))
> +   OVS_CB(skb)->max_len = max_len;
> +
> if (likely(vport)) {
> u16 mru = OVS_CB(skb)->mru;
>
> @@ -1034,6 +1039,7 @@ static int do_execute_actions(struct datapath *dp, 
> struct sk_buff *skb,
>  * is slightly obscure just to avoid that.
>  */
> int prev_port = -1;
> +   uint16_t max_len = 0;
> const struct nlattr *a;
> int rem;
>
> @@ -1045,15 +1051,18 @@ static int do_execute_actions(struct datapath *dp, 
> struct sk_buff *skb,
> struct sk_buff *out_skb = skb_clone(skb, GFP_ATOMIC);
>
> if (out_skb)
> -   do_output(dp, out_skb, prev_port, key);
> +   do_output(dp, out_skb, prev_port, max_len, 
> key);
>
> prev_port = -1;
> }
>
> switch (nla_type(a)) {
> - 

Re: [ovs-dev] [PATCH v3 3/3] ovn: Apply ACL changes to existing connections.

2016-03-29 Thread Justin Pettit

> On Mar 21, 2016, at 7:54 AM, Russell Bryant  wrote:
> 
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index 2cc9c34..c8cca54 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> 
> +allow ACLs translate into logical flows with
> +the next; action.  If there are any stateful ACLs
> +on this datapath, then alloc ACLs translate to

Should that be "allow" instead of "alloc"?

> +ct_commit; next;.
> +  
> +  
> +allow-related ACLs translate into logical
> +flows with the ct_commit(ct_label=0); next; actions
> +for new connections and "next;" for existing connections.

Should that "next;" be surrounded by a  declaration instead of quotes?

> -A priority-65535 flow that allows any traffic that has been
> -committed to the connection tracker (i.e., established flows).
> +A priority-65535 flow that allows any traffic in the reply
> +direction for a connection that has been committed to the
> +connection tracker (i.e., established flows), as long as
> +the committed flow does not have ct_label=1 set.

I realize it has the same effect since we set the whole field, but I think it 
actually just matches on "ct_label[0]".

> +We only handle traffic in the reply direction here because
> +we want all packets going in the request direction to still
> +go through the flows that implement the currently defined
> +policy based on ACLs.  If a connection is no longer allowed by
> +policy, ct_label will get set and packets in the
> +reply direction will no longer be allowed, either.

I think it would be good to specify which bit gets set here, since we may use 
other bits in the future, and we may forget to update this comment.

>   
> 
>   
> A priority-65535 flow that allows any traffic that is considered
> related to a committed flow in the connection tracker (e.g., an
> -ICMP Port Unreachable from a non-listening UDP port).
> +ICMP Port Unreachable from a non-listening UDP port), as long
> +as the committed flow does not have ct_label=1 set.

Same comment about "ct_label[0]".

> A priority-65535 flow that drops all traffic marked by the
> -connection tracker as invalid.
> +connection tracker as invalid or reply packets with
> +ct_label=1 set meaning that the connection should

Ditto.

> +no longer be allowed due to a policy change.  Packets in the
> +request direction are skipped here to let a newly created
> +ACL re-allow this connection.

This sentence makes it sound like invalid packets will be allowed in the 
request direction, which I don't think is desired behavior, and it doesn't 
appear to be what the code does (ie, the code looks right to me).

> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -1384,48 +1384,77 @@ build_acls(struct ovn_datapath *od, struct hmap 
> *lflows, struct hmap *ports)
> ...
> + * We use "ct_commit" for a connection that is not already known
> + * by the connection tracker.  Once a connection is committed,
> + * subsequent packets will hit the flow at priority 0 that just
> + * uses "next;"
> + *
> + * We also check for established connections that have ct_label set

Once again, I would mention "ct_label[0]".

> + * on them.  That's a connection that was disallowed, but is now
> + * allowed by policy again since it hit this default-allow flow.
> + * We need to clear the mark to let the connection continue.

"mark" is correct generically, but it might be clearer to say "label".  Also, I 
would mention that it's the "special" bit.

> /* Ingress and Egress ACL Table (Priority 65535).
>  *
> - * Always allow traffic that is established to a committed
> - * conntrack entry.  This is enforced at a higher priority than
> + * Allow reply traffic that is part of an established
> + * conntrack entry that has not been marked for deletion
> + * (bit 0 of ct_label).  We only match traffic in the
> + * reply direction because we want traffic in the request
> + * direction to hit the currently defined policy from ACLs.
> + *
> + * This is enforced at a higher priority than
>  * ACLs can be defined. */

I think this may be able to to fit on one line.

> @@ -1435,38 +1464,102 @@ build_acls(struct ovn_datapath *od, struct hmap 
> *lflows, struct hmap *ports)
> bool ingress = !strcmp(acl->direction, "from-lport") ? true :false;
> enum ovn_stage stage = ingress ? S_SWITCH_IN_ACL : S_SWITCH_OUT_ACL;
> 
> -if (!strcmp(acl->action, "allow")) {
> -/* If there are any stateful flows, we must even commit "allow"
> - * actions.  This is because, while 

Re: [ovs-dev] [PATCH v3 2/3] ovn: Add ct_commit(ct_mark=INT, ct_label=INT); action.

2016-03-29 Thread Justin Pettit

> On Mar 29, 2016, at 2:07 PM, Russell Bryant  wrote:
> 
> On Tue, Mar 29, 2016 at 5:00 PM, Justin Pettit  wrote:
> 
> > On Mar 28, 2016, at 2:33 PM, Russell Bryant  wrote:
> >
> >
>>> > Yeah, no tests.  I honestly wasn't sure how to test it since we can't use 
>>> > ct() in the test suite.  I was hoping that we could start adding some 
>>> > tests for this stuff once the userspace conntrack patches go in.
>>> 
>> There actually are some tests written that are executed when you run "make 
>> check-kernel", but it has to load the kernel module locally and run them.  
>> Those tests don't do bitwise manipulation, though.  Obviously, userspace 
>> conntrack would make all of this easier.  Daniele plans to start working on 
>> those patches again soon.
> 
> Yeah, we don't have any OVN tests in the check-kernel suite.  There's nothing 
> kernel datapath specific in OVN, so ideally we could keep all of our tests in 
> the other suite, which is easier to run in more places.

If we can avoid OVN tests with check-kernel, that would be ideal.  I was just 
referring to testing the OVS functionality, which only recently had the ability 
to run kernel unit tests.

>>> > Bit-level twiddling would indeed be nice.  I didn't have a need for it in 
>>> > this series, though.  Are you OK with it coming as a future enhancement, 
>>> > or would you like to see it now?
>>> 
>> In general, I think its nicer to provide a completed interface if it doesn't 
>> add too much complexity.  (I don't think it requires much (if any) more code 
>> since we have functions that parse bit-level setting.)  Also, the intent of 
>> the next patch is really to do flag manipulation and not setting a 128-bit 
>> field.  However, there should have been some bit-level tests for the 
>> functionality that's already checked in, so I'm happy to just add the tests 
>> and this flexibility to my to-do list.
>> 
> Yeah, you're right, it does matching on the bit.  It's setting the whole 
> value, which only works because this was the first use of the field.  It'll 
> be needed before we add a second bit.
> 
> I can add it now.  I was mostly just being lazy.  It may be next week before 
> I get to it, though.  I'm traveling all week and hacking time is hit or miss.

I definitely understand.  :-)  If you can do that, it would be great, but I'm 
okay if we want to add the support once we clearly document and test the 
capability in OVS itself.  I've sent Joe a message confirming whether this is 
indeed supported.

Thanks!

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [RFC] ofp-actions: Extend the use of max_len in OUTPUT action.

2016-03-29 Thread William Tu
Before, OpenFlow specification defines 'max_len' in struct ofp_action_output
as the max number of bytes to send when port is OFPP_CONTROLLER.  A max_len
of zero means no bytes of the packet should be sent, and max_len of
OFPCML_NO_BUFFER means the complete packet is sent to the controller.
It is left undefined of max_len, when output port is not OFPP_CONTROLLER.
The patch extends the use of max_len when output is non-controller.

One use case is to enable port mirroring to send smaller packets to the
destination port so that only useful packet information is mirrored/copied,
saving some performance overhead of copying entire packet payload.
The patch proposes adding a '(max_len=)' after the output action.  An
example use case is below as well as shown in the tests/:

- Output to port 1 with max_len 100 bytes.
- The output packet size on port 1 will be MIN(original_packet_size, 100).
# ovs-ofctl add-flow br0 'actions=output:1(max_len=100)'

- The scope of max_len is limited to output action itself.  The following
  output:1 and output:2 will be the original packet size.
# ovs-ofctl add-flow br0 'actions=output:1(max_len=100),output:1,output:2'

Implementation/Limitaions:
- Userspace and kernel datapath is supported, no Windows support.
- The minimum value of max_len is 60 byte (minimum Ethernet frame size).
  This is defined in OVS_ACTION_OUTPUT_MIN.
- Since 'max_len' is undefined in OpenFlow spec, the controller might
  accidentally place a garbage value in max_len and send to OVS.
- For compatibility, if the kernel datapath is not supported, set
  max_len to zero.
- OUTPUT_REG with max_len is not supported.
- actions=1(max_len=100) is not supported, must specify as 'output:1'.
- Only output:[0-9]*(max_len=) is supported.  Output to IN_PORT,
  TABLE, NORMAL, FLOOD, ALL, and LOCAL are not supported.

Signed-off-by: William Tu 
---
 datapath/actions.c| 19 +--
 datapath/datapath.h   |  1 +
 datapath/flow_netlink.c   | 10 ++--
 datapath/linux/compat/include/linux/openvswitch.h |  7 +++
 datapath/vport.c  |  6 +++
 lib/dp-packet.c   |  1 +
 lib/dp-packet.h   |  1 +
 lib/dpctl.c   | 19 ---
 lib/dpif-netdev.c | 20 ++-
 lib/netdev.c  |  8 +++
 lib/netlink.h |  1 +
 lib/odp-util.c| 27 --
 lib/ofp-actions.c | 41 +++
 lib/ofp-actions.h |  4 +-
 ofproto/ofproto-dpif-xlate.c  | 33 +++-
 ofproto/ofproto-dpif.c| 45 
 ofproto/ofproto-dpif.h|  4 ++
 tests/ofp-print.at|  6 +--
 tests/ofproto-dpif.at | 53 +++
 tests/system-traffic.at   | 63 +++
 20 files changed, 330 insertions(+), 39 deletions(-)

diff --git a/datapath/actions.c b/datapath/actions.c
index dcf8591..d64dadf 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -738,10 +738,15 @@ err:
 }
 
 static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
- struct sw_flow_key *key)
+uint16_t max_len, struct sw_flow_key *key)
 {
struct vport *vport = ovs_vport_rcu(dp, out_port);
 
+   /* This is after skb_clone called from do_execute_actions,
+  so max_len only applies to the current skb. */
+   if (unlikely(max_len != 0))
+   OVS_CB(skb)->max_len = max_len;
+
if (likely(vport)) {
u16 mru = OVS_CB(skb)->mru;
 
@@ -1034,6 +1039,7 @@ static int do_execute_actions(struct datapath *dp, struct 
sk_buff *skb,
 * is slightly obscure just to avoid that.
 */
int prev_port = -1;
+   uint16_t max_len = 0;
const struct nlattr *a;
int rem;
 
@@ -1045,15 +1051,18 @@ static int do_execute_actions(struct datapath *dp, 
struct sk_buff *skb,
struct sk_buff *out_skb = skb_clone(skb, GFP_ATOMIC);
 
if (out_skb)
-   do_output(dp, out_skb, prev_port, key);
+   do_output(dp, out_skb, prev_port, max_len, key);
 
prev_port = -1;
}
 
switch (nla_type(a)) {
-   case OVS_ACTION_ATTR_OUTPUT:
-   prev_port = nla_get_u32(a);
+   case OVS_ACTION_ATTR_OUTPUT: {
+   struct ovs_action_output *output = nla_data(a);
+   prev_port = output->port;
+   

[ovs-dev] partial occupation

2016-03-29 Thread dev
Hello!

We are looking for employees working remotely.

My name is Lucas, am the personnel manager of a large International company.
Most of the work you can do from home, that is, at a distance.
Salary is $1000-$4000.

If you are interested in this offer, please visit our site: --> 
www.bestearntrade.com

Best regards!

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Bit-level setting with ct()

2016-03-29 Thread Justin Pettit
Hi, Joe.  Russell is adding the ability to set ct_mark and ct_label in OVN 
logical flows.  The unit tests and ovs-ofctl documentation only show setting 
the whole field through OpenFlow, but I think the code supports bit-level 
manipulation.  Does that seem correct to you?

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 2/4] ovsdb: Introduce OVSDB replication feature

2016-03-29 Thread Cabrera Vega, Mario Alberto
Replication is enabled by using the following option when starting the
database server:

--sync-from=server

Where 'server' can take any form described in the ovsdb-client(1)
manpage as an active connection. If this option is specified, the
replication process is immediately started.

Signed-off-by: Mario Cabrera 
---
ovsdb/automake.mk |   6 +-
ovsdb/ovsdb-server.1.in   |   3 +
ovsdb/ovsdb-server.c  |  46 ++--
ovsdb/replication-syn.man |   2 +
ovsdb/replication.c   | 597 ++
ovsdb/replication.h   |  39 +++
ovsdb/replication.man |   8 +
tests/ovsdb-server.at |  51 
8 files changed, 725 insertions(+), 27 deletions(-)
create mode 100644 ovsdb/replication-syn.man
create mode 100644 ovsdb/replication.c
create mode 100644 ovsdb/replication.h
create mode 100644 ovsdb/replication.man

diff --git a/ovsdb/automake.mk b/ovsdb/automake.mk
index 7db6fea..099ed3c 100644
--- a/ovsdb/automake.mk
+++ b/ovsdb/automake.mk
@@ -24,6 +24,8 @@ ovsdb_libovsdb_la_SOURCES = \
  ovsdb/monitor.h \
  ovsdb/query.c \
  ovsdb/query.h \
+ ovsdb/replication.c \
+ ovsdb/replication.h \
  ovsdb/row.c \
  ovsdb/row.h \
  ovsdb/server.c \
@@ -42,7 +44,9 @@ pkgconfig_DATA += \
 MAN_FRAGMENTS += \
  ovsdb/remote-active.man \
-  ovsdb/remote-passive.man
+ ovsdb/remote-passive.man \
+ ovsdb/replication.man \
+ ovsdb/replication-syn.man
 # ovsdb-tool
bin_PROGRAMS += ovsdb/ovsdb-tool
diff --git a/ovsdb/ovsdb-server.1.in b/ovsdb/ovsdb-server.1.in
index 6c85729..1025ade 100644
--- a/ovsdb/ovsdb-server.1.in
+++ b/ovsdb/ovsdb-server.1.in
@@ -19,6 +19,7 @@ ovsdb\-server \- Open vSwitch database server
.so lib/daemon-syn.man
.so lib/service-syn.man
.so lib/vlog-syn.man
+.so ovsdb/replication-syn.man
.so lib/ssl-syn.man
.so lib/ssl-bootstrap-syn.man
.so lib/ssl-peer-ca-cert-syn.man
@@ -100,6 +101,8 @@ configured remotes.
.so lib/service.man
.SS "Logging Options"
.so lib/vlog.man
+.SS "Syncing Options"
+.so ovsdb/replication.man
.SS "Public Key Infrastructure Options"
The options described below for configuring the SSL public key
infrastructure accept a special syntax for obtaining their
diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index fa662b1..63dd209 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -42,6 +42,7 @@
#include "ovsdb-error.h"
#include "poll-loop.h"
#include "process.h"
+#include "replication.h"
#include "row.h"
#include "simap.h"
#include "shash.h"
@@ -59,15 +60,7 @@
 VLOG_DEFINE_THIS_MODULE(ovsdb_server);
-struct db {
-/* Initialized in main(). */
-char *filename;
-struct ovsdb_file *file;
-struct ovsdb *db;
-
-/* Only used by update_remote_status(). */
-struct ovsdb_txn *txn;
-};
+struct db;
 /* SSL configuration. */
static char *private_key_file;
@@ -75,6 +68,9 @@ static char *certificate_file;
static char *ca_cert_file;
static bool bootstrap_ca_cert;
+/* Replication configuration. */
+static bool connect_to_remote_server;
+
static unixctl_cb_func ovsdb_server_exit;
static unixctl_cb_func ovsdb_server_compact;
static unixctl_cb_func ovsdb_server_reconnect;
@@ -159,6 +155,10 @@ main_loop(struct ovsdb_jsonrpc_server *jsonrpc, struct 
shash *all_dbs,
 report_error_if_changed(reconfigure_ssl(all_dbs), _error);
 ovsdb_jsonrpc_server_run(jsonrpc);
+if (connect_to_remote_server) {
+ replication_run(all_dbs);
+}
+
 SHASH_FOR_EACH(node, all_dbs) {
 struct db *db = node->data;
 ovsdb_trigger_run(db->db, time_msec());
@@ -170,9 +170,9 @@ main_loop(struct ovsdb_jsonrpc_server *jsonrpc, struct 
shash *all_dbs,
 }
 }
-/* update Manager status(es) every 5 seconds */
+/* update Manager status(es) every 2.5 seconds */
 if (time_msec() >= status_timer) {
-status_timer = time_msec() + 5000;
+status_timer = time_msec() + 2500;
 update_remote_status(jsonrpc, remotes, all_dbs);
 }
@@ -350,6 +350,7 @@ main(int argc, char *argv[])
 sset_destroy();
 sset_destroy(_filenames);
 unixctl_server_destroy(unixctl);
+disconnect_remote_server();
 if (run_process && process_exited(run_process)) {
 int status = process_status(run_process);
@@ -433,21 +434,6 @@ open_db(struct server_config *config, const char *filename)
 return error;
}
-static const struct db *
-find_db(const struct shash *all_dbs, const char *db_name)
-{
-struct shash_node *node;
-
-SHASH_FOR_EACH(node, all_dbs) {
-struct db *db = node->data;
-if (!strcmp(db->db->schema->name, db_name)) {
-return db;
-}
-}
-
-return NULL;
-}
-
static char * OVS_WARN_UNUSED_RESULT
parse_db_column__(const struct shash *all_dbs,
   const char *name_, char *name,
@@ 

[ovs-dev] [PATCH 4/4] ovsdb: Add unixctl commands for OVSDB replication

2016-03-29 Thread Cabrera Vega, Mario Alberto
Set and get the server to replicate from:

ovsdb-server/set-remote-ovsdb-server {server}
ovsdb-server/get-remote-ovsdb-server

Set and get the replicated table blacklist:

ovsdb-server/set-sync-excluded-tables {DB:table,...}
ovsdb-server/get-sync-excluded-tables

Connect to the configured server and start replication:

ovsdb-server/connect-remote-ovsdb-server

Disconnect from the remote server and stop replication, without dropping
the replicated data:

ovsdb-server/disconnect-remote-ovsdb-server

Signed-off-by: Mario Cabrera 
---
 ovsdb/ovsdb-server.1.in |  21 ++
 ovsdb/ovsdb-server.c|  97 +
 ovsdb/replication.c |  13 +++-
 ovsdb/replication.h |   3 +
 tests/ovsdb-server.at   | 188 
 5 files changed, 321 insertions(+), 1 deletion(-)

diff --git a/ovsdb/ovsdb-server.1.in b/ovsdb/ovsdb-server.1.in
index 1025ade..4a8b551 100644
--- a/ovsdb/ovsdb-server.1.in
+++ b/ovsdb/ovsdb-server.1.in
@@ -185,6 +185,27 @@ again (with \fBovsdb\-server/add\-db\fR).
 Outputs a list of the currently configured databases added either through
 the command line or through the \fBovsdb\-server/add\-db\fR command.
 .
+.IP "\fBovsdb\-server/set\-remote\-ovsdb\-server \fIserver"
+Sets  the remote \fIserver\fR from which \fBovsdb\-server\fR connects through
+\fBovsdb\-server/connect\-remote\-ovsdb\-server\fR.
+.
+.IP "\fBovsdb\-server/get\-remote\-ovsdb\-server"
+Gets the remote server from which \fBovsdb\-server\fR is currently 
synchronizing
+its databases.
+.
+.IP "\fBovsdb\-server/connect\-remote\-ovsdb\-server"
+Causes \fBovsdb\-server\fR to synchronize its databases with the server
+specified by \fBovsdb\-server/set\-remote\-ovsdb\-server\fR.
+.
+.IP "\fBovsdb\-server/disconnect\-remote\-ovsdb\-server"
+Causes \fBovsdb\-server\fR to  stop  synchronizing  its  databases with a 
remote server.
+.
+.IP "\fBovsdb\-server/set\-sync\-excluded\-tables 
\fIdb\fB:\fItable\fR[\fB,\fIdb\fB:\fItable\fR]..."
+Sets the \fItable\fR whitin \fIdb\fR that will be excluded from 
synchronization.
+.
+.IP "\fBovsdb\-server/get\-sync\-excluded\-tables"
+Gets  the  tables  that are currently excluded from synchronization.
+.
 .so lib/vlog-unixctl.man
 .so lib/memory-unixctl.man
 .so lib/coverage-unixctl.man
diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index 93beaf0..bb2364e 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -77,6 +77,12 @@ static unixctl_cb_func ovsdb_server_reconnect;
 static unixctl_cb_func ovsdb_server_perf_counters_clear;
 static unixctl_cb_func ovsdb_server_perf_counters_show;
 static unixctl_cb_func ovsdb_server_disable_monitor2;
+static unixctl_cb_func ovsdb_server_set_remote_ovsdb_server;
+static unixctl_cb_func ovsdb_server_get_remote_ovsdb_server;
+static unixctl_cb_func ovsdb_server_connect_remote_ovsdb_server;
+static unixctl_cb_func ovsdb_server_disconnect_remote_ovsdb_server;
+static unixctl_cb_func ovsdb_server_set_sync_excluded_tables;
+static unixctl_cb_func ovsdb_server_get_sync_excluded_tables;
 
 struct server_config {
 struct sset *remotes;
@@ -333,6 +339,19 @@ main(int argc, char *argv[])
 unixctl_command_register("ovsdb-server/perf-counters-clear", "", 0, 0,
  ovsdb_server_perf_counters_clear, NULL);
 
+unixctl_command_register("ovsdb-server/set-remote-ovsdb-server", "", 0, 1,
+  ovsdb_server_set_remote_ovsdb_server, NULL);
+unixctl_command_register("ovsdb-server/get-remote-ovsdb-server", "", 0, 0,
+  ovsdb_server_get_remote_ovsdb_server, NULL);
+unixctl_command_register("ovsdb-server/connect-remote-ovsdb-server", "", 
0, 0,
+  ovsdb_server_connect_remote_ovsdb_server, NULL);
+unixctl_command_register("ovsdb-server/disconnect-remote-ovsdb-server", 
"", 0, 0,
+  ovsdb_server_disconnect_remote_ovsdb_server, 
NULL);
+unixctl_command_register("ovsdb-server/set-sync-excluded-tables", "", 0, 1,
+  ovsdb_server_set_sync_excluded_tables, NULL);
+unixctl_command_register("ovsdb-server/get-sync-excluded-tables", "", 0, 0,
+  ovsdb_server_get_sync_excluded_tables, NULL);
+
 /* Simulate the behavior of OVS release prior to version 2.5 that
  * does not support the monitor2 method.  */
 unixctl_command_register("ovsdb-server/disable-monitor2", "", 0, 0,
@@ -1018,6 +1037,84 @@ report_error_if_changed(char *error, char **last_errorp)
 }
 
 static void
+ovsdb_server_set_remote_ovsdb_server(struct unixctl_conn *conn,
+ int argc OVS_UNUSED, const char *argv[],
+ void *arg_ OVS_UNUSED)
+{
+set_remote_ovsdb_server(argv[1]);
+connect_to_remote_server = false;
+unixctl_command_reply(conn, NULL);
+}
+
+static void

[ovs-dev] [PATCH 3/4] ovsdb: Add table exclusion functionality to OVSDB

2016-03-29 Thread Cabrera Vega, Mario Alberto
A blacklist of tables that will be excluded from replication can be
specified by the following option:

--sync-exclude-tables=db:table[,db:table]...

Where 'table' corresponds to a table name, and 'db' corresponds to the
database name where the table resides.

Signed-off-by: Mario Cabrera 
---
 ovsdb/ovsdb-server.c   |   6 ++
 ovsdb/replication.c|  56 ++-
 ovsdb/replication.h|   1 +
 tests/automake.mk  |   1 +
 tests/ovsdb-replication.at | 174 +
 tests/ovsdb-server.at  |  56 +++
 tests/ovsdb.at |   1 +
 7 files changed, 292 insertions(+), 3 deletions(-)
 create mode 100644 tests/ovsdb-replication.at

diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index 63dd209..93beaf0 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -1265,6 +1265,7 @@ parse_options(int *argcp, char **argvp[],
 OPT_BOOTSTRAP_CA_CERT,
 OPT_PEER_CA_CERT,
 OPT_SYNC_FROM,
+OPT_SYNC_EXCLUDE,
 VLOG_OPTION_ENUMS,
 DAEMON_OPTION_ENUMS
 };
@@ -1284,6 +1285,7 @@ parse_options(int *argcp, char **argvp[],
 {"certificate", required_argument, NULL, 'c'},
 {"ca-cert", required_argument, NULL, 'C'},
 {"sync-from",   required_argument, NULL, OPT_SYNC_FROM},
+{"sync-exclude-tables", required_argument, NULL, OPT_SYNC_EXCLUDE},
 {NULL, 0, NULL, 0},
 };
 char *short_options = ovs_cmdl_long_options_to_short_options(long_options);
@@ -1349,6 +1351,10 @@ parse_options(int *argcp, char **argvp[],
 connect_to_remote_server = true;
 break;
 
+case OPT_SYNC_EXCLUDE:
+set_tables_blacklist(optarg);
+break;
+
 case '?':
 exit(EXIT_FAILURE);
 
diff --git a/ovsdb/replication.c b/ovsdb/replication.c
index d9e609e..5507a5a 100644
--- a/ovsdb/replication.c
+++ b/ovsdb/replication.c
@@ -35,8 +35,10 @@
 static char *remote_ovsdb_server;
 static struct jsonrpc *rpc;
 static struct sset monitored_tables = SSET_INITIALIZER(_tables);
+static struct sset tables_blacklist = SSET_INITIALIZER(_blacklist);
 static bool reset_dbs = true;
 
+void replication_init(void);
 static struct jsonrpc *open_jsonrpc(const char *server);
 static struct ovsdb_error *check_jsonrpc_error(int error,
struct jsonrpc_msg **reply_);
@@ -73,6 +75,14 @@ static struct ovsdb_error *execute_update(struct ovsdb_txn 
*txn,
   struct json *new);
 

 void
+replication_init(void)
+{
+sset_init(_tables);
+sset_init(_blacklist);
+reset_dbs = true;
+}
+
+void
 replication_run(struct shash *all_dbs)
 {
 if (sset_is_empty(_tables) && remote_ovsdb_server) {
@@ -109,10 +119,30 @@ set_remote_ovsdb_server(const char *remote_server)
 }
 
 void
+set_tables_blacklist(const char *blacklist)
+{
+char *save_ptr = NULL;
+char *blacklist_item;
+
+replication_init();
+
+if (blacklist) {
+   char *t_blacklist = strdup(blacklist);
+for (blacklist_item = strtok_r(t_blacklist, ",", _ptr);
+ blacklist_item != NULL;
+ blacklist_item = strtok_r(NULL, ",", _ptr)) {
+sset_add(_blacklist, blacklist_item);
+}
+free(t_blacklist);
+}
+}
+
+void
 disconnect_remote_server(void)
 {
 jsonrpc_close(rpc);
 sset_destroy(_tables);
+sset_destroy(_blacklist);
 
 if (remote_ovsdb_server) {
 free(remote_ovsdb_server);
@@ -160,8 +190,18 @@ reset_database(struct ovsdb *db, struct ovsdb_txn *txn)
 struct ovsdb_table *table = table_node->data;
 struct ovsdb_row *row;
 
-HMAP_FOR_EACH (row, hmap_node, >rows) {
-ovsdb_txn_row_delete(txn, row);
+size_t blacklist_item_len = strlen(db->schema->name) +
+strlen(table_node->name) + 2;
+
+/* Do not reset if table is blacklisted. */
+char blacklist_item[blacklist_item_len];
+snprintf(blacklist_item, blacklist_item_len, "%s%s%s",
+ db->schema->name, ":", table_node->name);
+
+if (!sset_contains(_blacklist, blacklist_item)) {
+HMAP_FOR_EACH (row, hmap_node, >rows) {
+ovsdb_txn_row_delete(txn, row);
+}
 }
 }
 }
@@ -293,7 +333,17 @@ send_monitor_requests(struct shash *all_dbs)
 
 for (int j = 0; j < n; j++) {
 struct ovsdb_table_schema *table = nodes[j]->data;
-add_monitored_table(table, monitor_request);
+size_t blacklist_item_len = strlen(db_name) +
+strlen(table->name) + 2;
+char blacklist_item[blacklist_item_len];
+
+snprintf(blacklist_item, blacklist_item_len, "%s%s%s",
+ db_name, ":", table->name);
+
+

[ovs-dev] [PATCH 1/4] docs: OVSDB replication design document

2016-03-29 Thread Cabrera Vega, Mario Alberto
This patch series add database replication functionality between two 
ovsdb-servers.
The main idea is that an "active" server replicate its database contents to an
"standby" server in order to provide "fail over" characteristics.

---
From 511d124fed0f7fcf27327242a9089bce561da411 Mon Sep 17 00:00:00 2001
From: Mario Cabrera 
Date: Tue, 29 Mar 2016 09:28:05 -0600
Subject: [PATCH 1/4] docs: OVSDB replication design document

The database replication functionality is designed to provide "fail
over" characteristics. There are two participating databases, one of
which is the "active" database and the other is the "stand by" database.
Replication happens exclusively from the active to the stand by
database.

This document explains how the replication functionality is implemented.

Signed-off-by: Mario Cabrera 
---
 Documentation/OVSDB-replication.md | 123 +
 Documentation/automake.mk  |   3 +-
 2 files changed, 125 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/OVSDB-replication.md

diff --git a/Documentation/OVSDB-replication.md 
b/Documentation/OVSDB-replication.md
new file mode 100644
index 000..4a4eb5e
--- /dev/null
+++ b/Documentation/OVSDB-replication.md
@@ -0,0 +1,123 @@
+OVSDB replication implementation
+
+
+Overview
+
+Given two Open vSwitch databases that have the same schema, OVSDB replication
+consists on maintaining these databases in the same state with one another,
+i.e each of the databases have the same contents at any given time even if they
+are not running in the same host. This document elaborates on the 
implementation
+details to provide this functionality.
+
+Terminology
+===
+-   Source of truth database: database whose content will be replicated to
+another database.
+-   Active server: ovsdb-server providing RPC interface to the source of
+truth database.
+-   Standby server: ovsdb-server providing RPC interface to the database that
+is not the source of truth.
+
+Design
+==
+The overall design of replication consist on one ovsdb-server (active server)
+communicating the state of its databases to another ovsdb-server
+(standby server) so that the latter keep its own databases in that same state.
+In order to achieve this, the standby server acts as a client of the active
+server, in the sense that it sends a monitor request to keep up to date with
+the changes in the active server databases. When a notification from the
+active server arrives, the standby server executes the necessary set of
+operations so its databases reach the same state as the the active server
+databases. Below is the design represented as a diagram.
+
++--+replication +--+
+|Active|<---|   Standby|
+| OVSDB-server || OVSDB-server |
++--++--+
+|  |
+|  |
++---+  +---+
+|  SoT  |  |   |
+| OVSDB |  | OVSDB |
++---+  +---+
+
+Setting up the replication
+==
+To initiate the replication process, the standby server must be executed
+indicating the location of the active server via the command line option
+"--sync-from=server", where server can take any form described in the
+ovsdb-client manpage and it must specify an active connection type (tcp, unix,
+ssl). This option will cause the standby server to attempt to send a monitor
+request to the active server in every main loop iteration, until the active
+server responds.
+
+When sending a monitor request the standby server is doing the following:
+
+1. Erase the content of the databases for which it is providing a RPC 
interface.
+2. Open the jsonrpc channel to communicate with the active server.
+3. Fetch all the databases located in the active server.
+4. For each database with the same schema in both the active and standby
+servers: construct and send a monitor request message specifying the tables
+that will be monitored (i.e all the tables on the database except the ones
+blacklisted*).
+5. Set the standby database to the current state of the active database.
+
+Once the monitor request message is sent, the standby server will continuosly
+receive notifications of changes occuring to the tables specified in the
+request. The process of handling this notifications is detailed in the next
+section.
+
+*A set of tables that will be excluded from replication can be configure as a
+blacklist of tables via the command line option 
"--sync-exclude-tables=db:table[,db:table]...",
+where db corresponds to the database where the table resides.
+
+Replication process
+===
+The replication proccess consists 

Re: [ovs-dev] [PATCH] nx-match: Fix use-after-free parsing matches.

2016-03-29 Thread Joe Stringer
On 23 March 2016 at 06:41, Ben Pfaff  wrote:
> On Mon, Mar 07, 2016 at 11:31:02AM -0800, Joe Stringer wrote:
>> Address pointed by header_ptr might be free'd due to realloc
>> happened in ofpbuf_put_hex(). Reported by valgrind in the test
>> 379: check TCP flags expression in OXM and NXM.
>>
>> Invalid write of size 4
>> nx_match_from_string_raw (nx-match.c:1510)
>> nx_match_from_string (nx-match.c:1538)
>> ofctl_parse_nxm__ (ovs-ofctl.c:3325)
>> ovs_cmdl_run_command (command-line.c:121)
>> main (ovs-ofctl.c:137)
>>
>> Address 0x7a2cc40 is 0 bytes inside a block of size 64 free'd
>> free (vg_replace_malloc.c:530)
>> ofpbuf_resize__ (ofpbuf.c:246)
>> ofpbuf_put (ofpbuf.c:386)
>> ofpbuf_put_hex (ofpbuf.c:414)
>> nx_match_from_string_raw (nx-match.c:1488)
>> nx_match_from_string (nx-match.c:1538)
>> ofctl_parse_nxm__ (ovs-ofctl.c:3325)
>>
>> Reported-by: William Tu 
>> Signed-off-by: Joe Stringer 
>
> Acked-by: Ben Pfaff 

Thanks, applied to master.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ofp-actions: Fix use-after-free with ofpact_finish().

2016-03-29 Thread Joe Stringer
On 20 March 2016 at 05:34, Ben Pfaff  wrote:
> On Mon, Mar 07, 2016 at 03:36:37PM -0800, Joe Stringer wrote:
>> ofpact_finish() may now reallocate the buffer it is passed, but not all
>> callers updated their local pointers to the current action in the
>> buffer. This could potentially lead to several use-after-free bugs.
>>
>> Update ofpact_finish() to return the new pointer to the ofpact which is
>> provided, and update the calling points to ensure that their local
>> pointers are pointing into the correct (potentially reallocated) buffer.
>>
>> Fixes: 2bd318dec242 ("ofp-actions: Make composing actions harder to screw 
>> up.")
>> Reported-by: William Tu 
>> Signed-off-by: Joe Stringer 
>
> Thank you.
>
> Acked-by: Ben Pfaff 

Thanks, applied to master.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v4 0/3] Testsuite fixes.

2016-03-29 Thread Joe Stringer
On 4 March 2016 at 18:31, Ilya Maximets  wrote:
> version 4:
> * Reworked prohibition of parallel execution.
>
> version 3:
> * AT_SKIP_IF ---> AT_CHECK(... || return 77).
> * Using of GNU make extentions removed.
>
> version 2:
> * 'testsuite: Add timeout to add_of_br() command.' removed
>   because already applied.
> * New patch 'tests/automake.mk: Prohibition of parallel
>   system-traffic test execution.'
> * delay after ovs-vswitchd killing replaced with 'Waiting
>   for port's availability before creation.'
>
> Ilya Maximets (3):
>   system-traffic.at: Skip tests if namespaces or veths aren't supported.
>   check-system-userspace: Waiting for port's availability before
> creation.
>   tests/automake.mk: Prohibition of parallel system-traffic test
> execution.

I applied this series to master. I probably should have done this
before taking a vacation, my apologies for the delay :-)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v3 2/3] ovn: Add ct_commit(ct_mark=INT, ct_label=INT); action.

2016-03-29 Thread Russell Bryant
On Tue, Mar 29, 2016 at 5:00 PM, Justin Pettit  wrote:

>
> > On Mar 28, 2016, at 2:33 PM, Russell Bryant  wrote:
> >
> >
> > Yeah, no tests.  I honestly wasn't sure how to test it since we can't
> use ct() in the test suite.  I was hoping that we could start adding some
> tests for this stuff once the userspace conntrack patches go in.
>
> There actually are some tests written that are executed when you run "make
> check-kernel", but it has to load the kernel module locally and run them.
> Those tests don't do bitwise manipulation, though.  Obviously, userspace
> conntrack would make all of this easier.  Daniele plans to start working on
> those patches again soon.


Yeah, we don't have any OVN tests in the check-kernel suite.  There's
nothing kernel datapath specific in OVN, so ideally we could keep all of
our tests in the other suite, which is easier to run in more places.


>
> > Bit-level twiddling would indeed be nice.  I didn't have a need for it
> in this series, though.  Are you OK with it coming as a future enhancement,
> or would you like to see it now?
>
> In general, I think its nicer to provide a completed interface if it
> doesn't add too much complexity.  (I don't think it requires much (if any)
> more code since we have functions that parse bit-level setting.)  Also, the
> intent of the next patch is really to do flag manipulation and not setting
> a 128-bit field.  However, there should have been some bit-level tests for
> the functionality that's already checked in, so I'm happy to just add the
> tests and this flexibility to my to-do list.
>

Yeah, you're right, it does matching on the bit.  It's setting the whole
value, which only works because this was the first use of the field.  It'll
be needed before we add a second bit.

I can add it now.  I was mostly just being lazy.  It may be next week
before I get to it, though.  I'm traveling all week and hacking time is hit
or miss.

-- 
Russell Bryant
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v3 2/3] ovn: Add ct_commit(ct_mark=INT, ct_label=INT); action.

2016-03-29 Thread Justin Pettit

> On Mar 28, 2016, at 2:33 PM, Russell Bryant  wrote:
> 
> 
> Yeah, no tests.  I honestly wasn't sure how to test it since we can't use 
> ct() in the test suite.  I was hoping that we could start adding some tests 
> for this stuff once the userspace conntrack patches go in.

There actually are some tests written that are executed when you run "make 
check-kernel", but it has to load the kernel module locally and run them.  
Those tests don't do bitwise manipulation, though.  Obviously, userspace 
conntrack would make all of this easier.  Daniele plans to start working on 
those patches again soon.

> Bit-level twiddling would indeed be nice.  I didn't have a need for it in 
> this series, though.  Are you OK with it coming as a future enhancement, or 
> would you like to see it now?

In general, I think its nicer to provide a completed interface if it doesn't 
add too much complexity.  (I don't think it requires much (if any) more code 
since we have functions that parse bit-level setting.)  Also, the intent of the 
next patch is really to do flag manipulation and not setting a 128-bit field.  
However, there should have been some bit-level tests for the functionality 
that's already checked in, so I'm happy to just add the tests and this 
flexibility to my to-do list.

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] The part-time employment

2016-03-29 Thread dev
Hello!

We are looking for employees working remotely.

My name is Lucas, am the personnel manager of a large International company.
Most of the work you can do from home, that is, at a distance.
Salary is $1000-$4000.

If you are interested in this offer, please visit our site: --> 
www.bestearntrade.com

Best regards!

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v3] netdev-dpdk: vhost: Fix txq enabling in the absence of notifications.

2016-03-29 Thread Daniele Di Proietto
Thanks!

I applied this to master and branch-2.5

On 28/03/2016 23:20, "Ilya Maximets"  wrote:

>According to QEMU documentation (docs/specs/vhost-user.txt) one queue
>should be enabled initially. More queues are enabled dynamically, by
>sending message VHOST_USER_SET_VRING_ENABLE.
>
>Currently all queues in OVS disabled by default. This breaks above
>specification. So, queue #0 should be enabled by default to support
>QEMU versions less than 2.5 and fix probable issues if QEMU will not
>send VHOST_USER_SET_VRING_ENABLE for queue #0 according to documentation.
>Also this will fix currently broken vhost-cuse support in OVS.
>
>Fixes: 585a5beaa2a4 ("netdev-dpdk: vhost-user: Fix sending packets to
>  queues not enabled by guest.")
>Reported-by: Mauricio Vasquez B
>
>Signed-off-by: Ilya Maximets 
>---
>
>version 3:
>   * Fixed qid checking in __netdev_dpdk_vhost_send()
>
>version 2:
>   * Fixed initialization in netdev_dpdk_alloc_txq().
>   * Clearing moved to separate function.
>
> lib/netdev-dpdk.c | 28 
> 1 file changed, 24 insertions(+), 4 deletions(-)
>
>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>index 7c4cd07..8eea788 100644
>--- a/lib/netdev-dpdk.c
>+++ b/lib/netdev-dpdk.c
>@@ -103,6 +103,9 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF /
>ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF))
> #define NIC_PORT_TX_Q_SIZE 2048  /* Size of Physical NIC TX Queue, Max
>(n+32<=4096)*/
> 
> #define OVS_VHOST_MAX_QUEUE_NUM 1024  /* Maximum number of vHost TX
>queues. */
>+#define OVS_VHOST_QUEUE_MAP_UNKNOWN (-1) /* Mapping not initialized. */
>+#define OVS_VHOST_QUEUE_DISABLED(-2) /* Queue was disabled by guest
>and not
>+  * yet mapped to another queue.
>*/
> 
> static char *cuse_dev_name = NULL;/* Character device cuse_dev_name.
>*/
> static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets
>*/
>@@ -671,7 +674,7 @@ netdev_dpdk_alloc_txq(struct netdev_dpdk *netdev,
>unsigned int n_txqs)
> }
> 
> /* Initialize map for vhost devices. */
>-netdev->tx_q[i].map = -1;
>+netdev->tx_q[i].map = OVS_VHOST_QUEUE_MAP_UNKNOWN;
> rte_spinlock_init(>tx_q[i].tx_lock);
> }
> }
>@@ -1265,7 +1268,7 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int
>qid,
> 
> qid = vhost_dev->tx_q[qid % vhost_dev->real_n_txq].map;
> 
>-if (OVS_UNLIKELY(!is_vhost_running(virtio_dev) || qid == -1)) {
>+if (OVS_UNLIKELY(!is_vhost_running(virtio_dev) || qid < 0)) {
> rte_spinlock_lock(_dev->stats_lock);
> vhost_dev->stats.tx_dropped+= cnt;
> rte_spinlock_unlock(_dev->stats_lock);
>@@ -2019,7 +2022,7 @@ netdev_dpdk_remap_txqs(struct netdev_dpdk *netdev)
> }
> 
> if (n_enabled == 0 && total_txqs != 0) {
>-enabled_queues[0] = -1;
>+enabled_queues[0] = OVS_VHOST_QUEUE_DISABLED;
> n_enabled = 1;
> }
> 
>@@ -2056,6 +2059,10 @@ netdev_dpdk_vhost_set_queues(struct netdev_dpdk
>*netdev, struct virtio_net *dev)
> netdev->real_n_rxq = qp_num;
> netdev->real_n_txq = qp_num;
> netdev->txq_needs_locking = true;
>+/* Enable TX queue 0 by default if it wasn't disabled. */
>+if (netdev->tx_q[0].map == OVS_VHOST_QUEUE_MAP_UNKNOWN) {
>+netdev->tx_q[0].map = 0;
>+}
> 
> netdev_dpdk_remap_txqs(netdev);
> 
>@@ -2104,6 +2111,18 @@ new_device(struct virtio_net *dev)
> return 0;
> }
> 
>+/* Clears mapping for all available queues of vhost interface. */
>+static void
>+netdev_dpdk_txq_map_clear(struct netdev_dpdk *dev)
>+OVS_REQUIRES(dev->mutex)
>+{
>+int i;
>+
>+for (i = 0; i < dev->real_n_txq; i++) {
>+dev->tx_q[i].map = OVS_VHOST_QUEUE_MAP_UNKNOWN;
>+}
>+}
>+
> /*
>  * Remove a virtio-net device from the specific vhost port.  Use
>dev->remove
>  * flag to stop any more packets from being sent or received to/from a
>VM and
>@@ -2123,6 +2142,7 @@ destroy_device(volatile struct virtio_net *dev)
> ovs_mutex_lock(_dev->mutex);
> dev->flags &= ~VIRTIO_DEV_RUNNING;
> ovsrcu_set(_dev->virtio_dev, NULL);
>+netdev_dpdk_txq_map_clear(vhost_dev);
> exists = true;
> ovs_mutex_unlock(_dev->mutex);
> break;
>@@ -2169,7 +2189,7 @@ vring_state_changed(struct virtio_net *dev,
>uint16_t queue_id, int enable)
> if (enable) {
> vhost_dev->tx_q[qid].map = qid;
> } else {
>-vhost_dev->tx_q[qid].map = -1;
>+vhost_dev->tx_q[qid].map = OVS_VHOST_QUEUE_DISABLED;
> }
> netdev_dpdk_remap_txqs(vhost_dev);
> exists = true;
>-- 
>2.5.0
>

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn: Fix a typo in ovn-northd documentation

2016-03-29 Thread Justin Pettit

> On Mar 29, 2016, at 1:12 PM, Bruce Davie  wrote:
> 
> Signed-off-by: Bruce Davie 

Thanks!  I pushed this change.

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 1/3] flow: Fix remote DoS for crafted MPLS packets with debug logging enabled.

2016-03-29 Thread Justin Pettit

> On Mar 28, 2016, at 5:26 PM, Ben Pfaff  wrote:
> 
>> Vulnerability: CVE-2016-2074
>> Reported-by: Kashyap Thimmaraju 
>> Reported-by: Bhargava Shastry 
>> Signed-off-by: Ben Pfaff 
>> Acked-by: Jesse Gross 
> 
> Jesse acked this one, privately.  It's my patch so I can't ack it ;-)

Yep.  I just wanted to get it out on the mailing list so there wouldn't be any 
gaps.

Thanks for the reviews.

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC] OVN: Openstack floating ip support

2016-03-29 Thread Chandra Sekhar Vejendla
Ryan,

I'll send out a V2 patch with the documentation changes you have suggested.

In the IP_DNAT stage inport is reset to address cases which involve
east-west communication via a floating ip where both the source and
the destination VM's are in the same logical switch. This would require
that the router send out the packet on the same interface it came in after
applying the NAT rules.

Thanks,
Chandra


Ryan Moats/Omaha/IBM wrote on 03/25/2016 10:15:22 AM:

> From: Ryan Moats/Omaha/IBM
> To: Chandra Sekhar Vejendla/San Jose/IBM@IBMUS
> Cc: dev@openvswitch.org
> Date: 03/25/2016 10:15 AM
> Subject: Re: [ovs-dev] [PATCH RFC] OVN: Openstack floating ip support
>
> "dev"  wrote on 03/22/2016 04:19:07 PM:
>
> > From: Chandra Sekhar Vejendla/San Jose/IBM@IBMUS
> > To: dev@openvswitch.org
> > Date: 03/22/2016 04:19 PM
> > Subject: [ovs-dev] [PATCH RFC] OVN: Openstack floating ip support
> > Sent by: "dev" 
> >
> > This patch adds distributed floating ip support for ovn. The assumption
made
> > here is that the external network is a single L2 broadcast domain
> and all the
> > chassis have connectivity to the external network.
> >
> > 2 new tables are added in the LROUTER pipeline IN_IP_DNAT & IP_IN_SNAT.
> > IN_IP_DNAT will modify the dst ip of the packet from floating ip to vm
ip.
> > IN_IP_SNAT will modify the src ip of the packet from vm ip to floating
ip.
> >
> > Rules in IN_IP_DNAT:
> > - Priority 100 rule to set the reg2 to 0x1 if dst & src networks are
> >   connectected via a router and both the networks are private.
> > - Priority 90 rule to modify the dst ip from floating ip to vm ip.
> > - Priority 0 rule to go to next table.
> >
> > Rules in IN_IP_SNAT:
> > - Priority 100 rule to skip modifying the src ip when reg2 is set to
0x1
> > - Priority 90 rule to modify the src ip from vm ip to floating ip
> and dst mac
> >   to floating ip port mac if the packet is egressing via the gateway
port
> > - Priority 50 rule to modify the src ip from vm ip to floating ip
> > - Priority 0 rule to go to next table.
> >
> > Priority 100 rules in IN_IP_DNAT and IN_IP_SNAT serves 2 purposes.
> > - Avoid NAT when vms in different LSWITCHES connected via a LROUTER
talk to
> >   each other using private ips.
> > - When 2 VMs connected to the same LSWITCH or different LSWITCHES
connected
> >   via a router try to talk to each other, the dst ip of the packet
should
> >   first be DNATed and then the src ip should be SNATed.
> >
> > The initial design was to stage DNAT in the ingress pipeline and the
SNAT in
> > the egress pipeline, but now both the stages are in the ingress
> pipeline. This
> > was done to solve the cases highlighted above [Priority 100
> rules]. There is a
> > need to use information from DNAT stage when SNAT is being processed.
This
> > would require an explicit register to be burnt to store the
information.
> >
> > Flows modified in the LSWITCH pipeline
> >
> > Rules in IN_PORT_SEC:
> > - Priority 50 rule to allow packets ingressing the LSWITCH router port
> >   with a src mac of floating ip port
> >
> > Rules in ARP_RSP:
> > - Priority 150 rule to respond to arp request for floating ip. To
> prevent arp
> >   responses for floating ip's from all the chassis, "lport" optionis
set in
> >   the external_id's column of the lflow table. lport will point to
> > the vif-id of
> >   the vm that is associated with the floating ip. When ovn-controller
is
> >   processing the flows, if it sees an lport option set in the
external_ids
> >   column, it will install this lflow only if the lport is a local
> port on the
> >   chassis.
> >
> > Rules in L2_LKUP:
> > - Priority 50 rule to set the outport to the lrouter port when the dst
mac
> >   matches the floating ip mac
> >
> > Rules in OUT_PORT_SEC:
> > - Priority 50 rule to allow packet egressing the lrouter port witha mac
of a
> >   floating ip port.
> >
> > Had to increase MAX_RESUBMIT_RECURSION from 64 to 96. When 2 VMs
connected
> > via vm1->LS->LR->LS->LR->LS->vm2 are trying to talk to each other, the
> > resubmits are exceeding the existing 64 limit.
> >
> > When a floating ip is associated with a VM ip, NB will set the
> options of the
> > floating ip lport to "fixed-ip-port=, router-
> port= > logical router port".
> >
> > If you want to try out this patch with openstack, add the
> following patch [1]
> > to networking-ovn.
> >
> > [1] https://review.openstack.org/#/c/295547/

> I know this is an RFC patch, but I'd really like to see two improvements
> w.r.t the documentation:
>
> 1. include the needed changes to ovn-northd.8.xml.
> 2. It looks like the DNAT code is also clearing the inport value and I
> didn't find the reason why anywhere - I'm pretty sure I know why, but
> confirmation would be good.
>
> Ryan Moats (regXboi)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn: Fix a typo in ovn-northd documentation

2016-03-29 Thread Bruce Davie
Signed-off-by: Bruce Davie 
---
 ovn/northd/ovn-northd.8.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 7954e22..743c939 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -36,7 +36,7 @@
   
   
 
-  --ovsnb-db=database
+  --ovnsb-db=database
 
 
   The database containing the OVN Southbound Database.
-- 
2.6.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC] OVN: Openstack floating ip support

2016-03-29 Thread Chandra Sekhar Vejendla

Hi Guru,

The rule which responds to the ARP request for the floating ip is
installed only on the hypervisor which hosts the VM mapped to that
floating ip. This is done through a hack right now. When the NB is
installing this rule, it appends a key-value "lport: "
to the external-id's column of the lflow. When the controller is
processing the lflows, if it sees a "lport" key-value in the flow,
it will verify if the lport is a local resident and only then it will
install the rule.


Here is the packet walk flow of all the scenarios. This implementation
does not use the CONNTRACK NAT support. Since floating ip is a case of
1-1 IP mapping we are relying on just modifying the DIP and the SIP.


Topology:

+--+ +--+                       +--+ +--+
| VM1      | | VM2      |                       | VM1      | | VM2      |
| 10.1.1.3 | | 10.1.1.4 |                       | 10.1.2.5 | | 10.1.2.6 |
| MAC:m_vm1| | MAC:m_vm2|                       | MAC:m_vm1| | MAC:m_vm2|
+--+ +--+                       +--+ +--+
   |           |                                         |           |
   |           |                                         |           |
   |           |                                         |           |
  PVM1       PVM2               *                PVM3       PVM4
++            **         **            ++
|                |          **   *      *  **          |                |
|    Net1        |         *       *  *      *         |     Net2       |
|  10.1.1.0/24   |P1RP1*      ROUTER     *RP2R2| 10.1.2.0/24    |
| GW: 10.1.1.1   |         *       *  *      *         | GW: 10.1.2.1   |
| GW MAC: m_net1 |          **   *      *  **          | GW MAC: m_net2 |
++            **         **            ++
                                *
                                   RP3
                                    |
                                    |
                                    |
 +-+                P3                 +-+
 | FIP1:198.44.1.3 |       +---+       | FIP1:198.44.1.5 |
 | MAC: m_fip1     |---|    Provider Net   |---| MAC: m_fip3     |
 | VM: VM1         |       |  198.44.1.0/24    |       | VM: VM3         |
 +-+       |  GW: 198.44.1.254 |       +-+
 | FIP1:198.44.1.4 |---|RouterIP:198.44.1.2|---| FIP: 198.44.1.6 |
 | MAC: m_fip2     |       |RouterMac: m_pn1   |       | MAC: m_fip4     |
 | VM: VM2         |       +---+       | VM: VM4         |
 +-+             localnet              +-+
                                    |
                                    |
                                    |
   +-+           +-+           ++
   |  EXT HOST   |           | Physical Network|           | Gateway    |
   | 198.44.1.253|---|  198.44.1.0/24  |---|198.44.1.254|
   |  MAC: m_eh  |           |                 |           | MAC: m_gw  |
   +-+           +-+           ++



Router Ports:
-
    RP1 --> P1 (Net1) IP: 10.1.1.1, MAC: m_net1
    RP2 --> P2 (Net2) IP: 10.1.2.1, MAC: m_net2
    RP3 --> P3 (Provider Net) IP: 198.44.1.2, MAC: m_pn1

Floating ip Mappings:

    FIP1 mapped to VM1 (198.44.1.3 - 10.1.1.3)
    FIP2 mapped to Vm2 (198.44.1.4 - 10.1.1.4)
    FIP3 mapped to VM3 (198.44.1.5 - 10.1.2.5)
    FIP4 mapped to VM4 (198.44.1.6 - 10.1.2.6

Rules in logical router:

 Table=lr_in_admission, P=50, match=(dst==m_fip1 && inport = RP3), action=
    next;
 Table=lr_in_admission, P=50, match=(dst==m_fip2 && inport = RP3), action=
    next;
 Table=lr_in_admission, P=50, match=(dst==m_fip3 && inport = RP3), action=
    next;
 Table=lr_in_admission, P=50, match=(dst==m_fip4 && inport = RP3), action=
    next;

 Table=lr_in_ip_input, P=0, action=next;

 Table=lr_in_ip_dnat, P=100, match=(SIP == 10.1.1.0/24 && DIP ==
    10.1.2.0/24); action (reg2 = 0x1;next)
 Table=lr_in_ip_dnat, P=100, match=(SIP == 10.1.2.0/24 && DIP ==
    10.1.1.0/24); action (reg2 = 0x1;next)
 Table=lr_in_ip_dnat, P=90, match=(DIP == 192.168.1.3), action=(
    DIP=10.1.1.3; inport=" "; next;)
 Table=lr_in_ip_dnat, P=90, match=(DIP == 192.168.1.4), action=(
    DIP=10.1.1.4; inport= " "; next;)
 Table=lr_in_ip_dnat, P=90, match=(DIP == 192.168.1.5), action=(
    DIP=10.1.2.5; inport = " "; next;)
 Table=lr_in_ip_dnat, P=90, match=(DIP == 192.168.1.6), action=(
    DIP=10.1.2.6; inport = " "; next;)
 Table=lr_in_ip_dnat, P=0, action=next;

 Table=lr_in_ip_routing, P=24 match=(DIP=10.1.1.0/24), action=(ip.ttl--,
    SMAC=m_net1;outport=RP1;next;)
 Table=lr_in_ip_routing, P=24 match=(DIP=10.1.2.0/24), action=(ip.ttl--,
    SMAC=m_net3;outport=RP2;next;)
 

Re: [ovs-dev] [PATCH] SECURITY.md: Add advisory document details.

2016-03-29 Thread Ben Pfaff
On Tue, Mar 29, 2016 at 12:26:33PM -0700, Justin Pettit wrote:
> 
> > On Mar 29, 2016, at 11:07 AM, Ben Pfaff  wrote:
> > 
> > +  This section should state the risks of the procedure. For
> > +  example. if it can crash Open vSwitch or disrupt packet
> 
> I think that should be a comma instead of a period.
> 
> Thanks a lot for writing this up!
> 
> Acked-by: Justin Pettit 

Thanks, applied to master with that fix.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] SECURITY.md: Add advisory document details.

2016-03-29 Thread Ben Pfaff
Thanks, applied to master.

On Tue, Mar 29, 2016 at 12:27:24PM -0600, Ryan Moats wrote:
> Yep, looks sensible...
> 
> Acked-by: Ryan Moats 
> 
> "dev"  wrote on 03/29/2016 01:07:53 PM:
> 
> > From: Ben Pfaff 
> > To: dev@openvswitch.org
> > Cc: Ben Pfaff 
> > Date: 03/29/2016 01:09 PM
> > Subject: [ovs-dev] [PATCH] SECURITY.md: Add advisory document details.
> > Sent by: "dev" 
> >
> > Signed-off-by: Ben Pfaff 
> > ---
> >  SECURITY.md | 104 +
> > ---
> >  1 file changed, 100 insertions(+), 4 deletions(-)
> >
> > diff --git a/SECURITY.md b/SECURITY.md
> > index 08a6ed8..33b85b5 100644
> > --- a/SECURITY.md
> > +++ b/SECURITY.md
> > @@ -101,16 +101,112 @@ determines that it is not a realistic
> vulnerability.
> >  Step 3a: Document
> >  
> >
> > -The security team develops a security advisory document.  The document
> > -credits the reporter and describes the vulnerability, including all of
> > -the relevant information from the assessment in step 2.  The security
> > +The security team develops a security advisory document.  The security
> >  team may, at its discretion, include the reporter (via "CC") in
> >  developing the security advisory document, but in any case should
> >  accept feedback from the reporter before finalizing the document.
> > -
> >  When the document is final, the security team should obtain a CVE for
> >  the vulnerability from a CNA (https://cve.mitre.org/cve/cna.html).
> >
> > +The document credits the reporter and describes the vulnerability,
> > +including all of the relevant information from the assessment in step
> > +2.  Suitable sections for the document include:
> > +
> > +* Title: The CVE identifier, a short description of the
> > +  vulnerability.  The title should mention Open vSwitch.
> > +
> > +  In email, the title becomes the subject.  Pre-release advisories
> > +  are often passed around in encrypted email, which have plaintext
> > +  subjects, so the title should not be too specific.
> > +
> > +* Description: A few paragraphs describing the general
> > +  characteristics of the vulnerability, including the versions of
> > +  Open vSwitch that are vulnerable, the kind of attack that
> > +  exposes the vulnerability, and potential consequences of the
> > +  attack.
> > +
> > +  The description should re-state the CVE identifier, in case the
> > +  subject is lost when an advisory is sent over email.
> > +
> > +* Mitigation: How an Open vSwitch administrator can minimize the
> > +  potential for exploitation of the vulnerability, before applying
> > +  a fix.  If no mitigation is possible or recommended, explain
> > +  why, to reduce the chance that at-risk users believe they are
> > +  not at risk.
> > +
> > +* Fix: Describe how to fix the vulnerability, perhaps in terms of
> > +  applying a source patch.  The patch or patches themselves, if
> > +  included in the email, should be at the very end of the advisory
> > +  to reduce the risk that a reader would stop reading at this
> > +  point.
> > +
> > +* Recommendation: A concise description of the security team's
> > +  recommendation to users.
> > +
> > +* Acknowledgments: Thank the reporters.
> > +
> > +* Vulnerability Check: A step-by-step procedure by which a user
> > +  can determine whether an installed copy of Open vSwitch is
> > +  vulnerable.
> > +
> > +  The procedure should clearly describe how to interpret the
> > +  results, including expected results in vulnerable and
> > +  not-vulnerable cases.  Thus, procedures that produce clear and
> > +  easily distinguished results are preferred.
> > +
> > +  The procedure should assume as little understanding of Open
> > +  vSwitch as possible, to make it more likely that a competent
> > +  administrator who does not specialize in Open vSwitch can
> > +  perform it successfully.
> > +
> > +  The procedure should have minimal dependencies on tools that are
> > +  not widely installed.
> > +
> > +  Given a choice, the procedure should be one that takes at least
> > +  some work to turn into a useful exploit.  For example, a
> > +  procedure based on "ovs-appctl" commands, which require local
> > +  administrator access, is preferred to one that sends test
> > +  packets to a machine, which only requires network connectivity.
> > +
> > +  The section should say which operating systems it is designed
> > +  for.  If the procedure is likely to be specific to particular
> > +  architectures (e.g. x86-64, i386), it should state on which ones
> > +  it has been tested.
> > +
> > +  This section should state the risks of the procedure. For
> > +  example. if it can crash Open vSwitch or 

Re: [ovs-dev] [PATCH] SECURITY.md: Add advisory document details.

2016-03-29 Thread Justin Pettit

> On Mar 29, 2016, at 11:07 AM, Ben Pfaff  wrote:
> 
> +  This section should state the risks of the procedure. For
> +  example. if it can crash Open vSwitch or disrupt packet

I think that should be a comma instead of a period.

Thanks a lot for writing this up!

Acked-by: Justin Pettit 

--Justin


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn-controller: Optimize processing for non-local datapath without patch ports.

2016-03-29 Thread Han Zhou
For non-local datapaths, if there are no patch ports attached, it
means the lflows and port bindings would never be needed on the
Chassis. Since lflow_run() and physical_run() are the bottlenecks,
skipping the processing for such lflows and port bindings can save
significant amount of CPU, at the same time largely reduce the
number of rules in local openflow tables. This is specifically
useful when most of the lswitches are created for bridged networks,
where logical router is not used.

Test precondition:
2k hypervisors, 20k lports, 200 lswitches (each with a localnet
port).

Test case:
step1: add 50 hypervisors (simulated on 1 BM with 40 cores), and
   wait for flow updates complete on all new hypervisors.
step2: create a lswitch and a localnet port, create and bind 100
   lports evenly on these hypervisors. Repeat this 5 times.

Before the change:
Step1 took around 20 minutes.
Step2 took 936 seconds.

After the change:
Step1 took less than 1 minute: 20x faster.
Step2 took 464 seconds: 2x faster.

Signed-off-by: Han Zhou 
---
 ovn/controller/lflow.c  | 38 +++---
 ovn/controller/lflow.h  |  3 ++-
 ovn/controller/ovn-controller.c | 16 +---
 ovn/controller/ovn-controller.h |  6 ++
 ovn/controller/patch.c  | 22 +++---
 ovn/controller/patch.h  |  2 +-
 ovn/controller/physical.c   | 34 +-
 ovn/controller/physical.h   |  2 +-
 8 files changed, 102 insertions(+), 21 deletions(-)

diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index 0614a54..be10d18 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -198,6 +198,7 @@ static void
 add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
   const struct mcgroup_index *mcgroups,
   const struct hmap *local_datapaths,
+  const struct hmap *patched_datapaths,
   const struct simap *ct_zones, struct hmap *flow_table)
 {
 uint32_t conj_id_ofs = 1;
@@ -211,17 +212,18 @@ add_logical_flows(struct controller_ctx *ctx, const 
struct lport_index *lports,
 if (!ldp) {
 continue;
 }
-if (!ingress && is_switch(ldp)) {
+if (is_switch(ldp)) {
 /* For a logical switch datapath, local_datapaths tells us if there
- * are any local ports for this datapath.  If not, processing
- * logical flows for the egress pipeline of this datapath is
- * unnecessary.
+ * are any local ports for this datapath.  If not, we can skip
+ * processing logical flows if the flow belongs to egress pipeline
+ * or if that logical switch datapath is not patched to any logical
+ * router.
  *
- * We still need the ingress pipeline because even if there are no
- * local ports, we still may need to execute the ingress pipeline
- * after a packet leaves a logical router.  Further optimization
- * is possible, but not based on what we know with local_datapaths
- * right now.
+ * Otherwise, we still need the ingress pipeline because even if
+ * there are no local ports, we still may need to execute the 
ingress
+ * pipeline after a packet leaves a logical router.  Further
+ * optimization is possible, but not based on what we know with
+ * local_datapaths right now.
  *
  * A better approach would be a kind of "flood fill" algorithm:
  *
@@ -231,12 +233,25 @@ add_logical_flows(struct controller_ctx *ctx, const 
struct lport_index *lports,
  *   2. For each patch port P in a logical datapath in S, add the
  *  logical datapath of the remote end of P to S.  Iterate
  *  until S reaches a fixed point.
+ *
+ * This can be implemented in northd, which can generate the sets 
and
+ * save it on each port-binding record in SB, and ovn-controller 
can
+ * use the information directly. However, there can be update 
storms
+ * when a pair of patch ports are added/removed to 
connect/disconnect
+ * large lrouters and lswitches. This need to be studied further.
  */
 
 struct hmap_node *ld;
 ld = hmap_first_with_hash(local_datapaths, ldp->tunnel_key);
 if (!ld) {
-continue;
+if (!ingress) {
+continue;
+}
+struct hmap_node *pd;
+pd = hmap_first_with_hash(patched_datapaths, ldp->tunnel_key);
+if (!pd) {
+continue;
+}
 }
 }
 
@@ -416,10 +431,11 @@ void
 lflow_run(struct controller_ctx *ctx, const struct lport_index 

Re: [ovs-dev] [PATCH] SECURITY.md: Add advisory document details.

2016-03-29 Thread Ryan Moats
Yep, looks sensible...

Acked-by: Ryan Moats 

"dev"  wrote on 03/29/2016 01:07:53 PM:

> From: Ben Pfaff 
> To: dev@openvswitch.org
> Cc: Ben Pfaff 
> Date: 03/29/2016 01:09 PM
> Subject: [ovs-dev] [PATCH] SECURITY.md: Add advisory document details.
> Sent by: "dev" 
>
> Signed-off-by: Ben Pfaff 
> ---
>  SECURITY.md | 104 +
> ---
>  1 file changed, 100 insertions(+), 4 deletions(-)
>
> diff --git a/SECURITY.md b/SECURITY.md
> index 08a6ed8..33b85b5 100644
> --- a/SECURITY.md
> +++ b/SECURITY.md
> @@ -101,16 +101,112 @@ determines that it is not a realistic
vulnerability.
>  Step 3a: Document
>  
>
> -The security team develops a security advisory document.  The document
> -credits the reporter and describes the vulnerability, including all of
> -the relevant information from the assessment in step 2.  The security
> +The security team develops a security advisory document.  The security
>  team may, at its discretion, include the reporter (via "CC") in
>  developing the security advisory document, but in any case should
>  accept feedback from the reporter before finalizing the document.
> -
>  When the document is final, the security team should obtain a CVE for
>  the vulnerability from a CNA (https://cve.mitre.org/cve/cna.html).
>
> +The document credits the reporter and describes the vulnerability,
> +including all of the relevant information from the assessment in step
> +2.  Suitable sections for the document include:
> +
> +* Title: The CVE identifier, a short description of the
> +  vulnerability.  The title should mention Open vSwitch.
> +
> +  In email, the title becomes the subject.  Pre-release advisories
> +  are often passed around in encrypted email, which have plaintext
> +  subjects, so the title should not be too specific.
> +
> +* Description: A few paragraphs describing the general
> +  characteristics of the vulnerability, including the versions of
> +  Open vSwitch that are vulnerable, the kind of attack that
> +  exposes the vulnerability, and potential consequences of the
> +  attack.
> +
> +  The description should re-state the CVE identifier, in case the
> +  subject is lost when an advisory is sent over email.
> +
> +* Mitigation: How an Open vSwitch administrator can minimize the
> +  potential for exploitation of the vulnerability, before applying
> +  a fix.  If no mitigation is possible or recommended, explain
> +  why, to reduce the chance that at-risk users believe they are
> +  not at risk.
> +
> +* Fix: Describe how to fix the vulnerability, perhaps in terms of
> +  applying a source patch.  The patch or patches themselves, if
> +  included in the email, should be at the very end of the advisory
> +  to reduce the risk that a reader would stop reading at this
> +  point.
> +
> +* Recommendation: A concise description of the security team's
> +  recommendation to users.
> +
> +* Acknowledgments: Thank the reporters.
> +
> +* Vulnerability Check: A step-by-step procedure by which a user
> +  can determine whether an installed copy of Open vSwitch is
> +  vulnerable.
> +
> +  The procedure should clearly describe how to interpret the
> +  results, including expected results in vulnerable and
> +  not-vulnerable cases.  Thus, procedures that produce clear and
> +  easily distinguished results are preferred.
> +
> +  The procedure should assume as little understanding of Open
> +  vSwitch as possible, to make it more likely that a competent
> +  administrator who does not specialize in Open vSwitch can
> +  perform it successfully.
> +
> +  The procedure should have minimal dependencies on tools that are
> +  not widely installed.
> +
> +  Given a choice, the procedure should be one that takes at least
> +  some work to turn into a useful exploit.  For example, a
> +  procedure based on "ovs-appctl" commands, which require local
> +  administrator access, is preferred to one that sends test
> +  packets to a machine, which only requires network connectivity.
> +
> +  The section should say which operating systems it is designed
> +  for.  If the procedure is likely to be specific to particular
> +  architectures (e.g. x86-64, i386), it should state on which ones
> +  it has been tested.
> +
> +  This section should state the risks of the procedure. For
> +  example. if it can crash Open vSwitch or disrupt packet
> +  forwarding, say so.
> +
> +  It is more useful to explain how to check an installed and
> +  running Open vSwitch than one built locally from source, but if
> +  it is easy to use the procedure from a sandbox environment, it
> +  can be helpful to explain how to 

Re: [ovs-dev] [PATCH 1/2] [Patch V3] OVN: Physical Endpoints

2016-03-29 Thread Russell Bryant
On Tue, Mar 29, 2016 at 12:56 AM, Darrell Ball  wrote:

> The following patch series implements physical-logical separation
> to be used presently by gateways.
>
> The physical endpoint changes allow the physical network to be
> managed more easily by a dedicated provider network management function
> and also allow the northbound schema to remain purely logical.
> Another benefit to allow more easily for additional encapsulations to be
> used when interacting with physical provider networks.
>
> Physical endpoints are defined here as endpoints at the border of a
> physical network.
> This presently applies to gateways.
> If no physical endpt, the encap is assumed empty.
> For gateways, a single physical endpoint must be bound to each logical
> port.
> The chassis name must match the chassis system-id. All 6 arguments must
> be supplied for gateway physical endpoints configuration.
> Support is provided to bind multiple physical endpoints to a logical port
> for
> future considerations.
>
> Physical Endpoint Patch:
>
> ovn-sb.ovsschema: Add physical endpoint table and phys_endpts to logical
> ports
>
> ovn-sb.xml: Update documentation regarding physical endpoints and their
> binding to logical ports. Also describe possible future encapsulation type
> usages.
>
> ovn-sbctl.c: Add configuration for physcial endpoints and binding to
> logical ports.
>
> ovn-sbctl.8.in: Update the ovn-sbctl man page for physical endpoints and
> binding
> to logical ports.
>
> Signed-off-by: Darrell Ball 
> ---
>  ovn/ovn-sb.ovsschema |  23 +++-
>  ovn/ovn-sb.xml   |  75 +++
>  ovn/utilities/ovn-sbctl.8.in |  47 +++
>  ovn/utilities/ovn-sbctl.c| 290
> ++-
>  4 files changed, 429 insertions(+), 6 deletions(-)
>
> diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema
> index 32f8748..14ecb99 100644
> --- a/ovn/ovn-sb.ovsschema
> +++ b/ovn/ovn-sb.ovsschema
> @@ -1,7 +1,7 @@
>  {
>  "name": "OVN_Southbound",
> -"version": "1.2.0",
> -"cksum": "1259182303 5368",
> +"version": "1.3.0",
> +"cksum": "4176169852 6394",
>

The patch will have to be rebased as this has changed in the meantime.


>  "tables": {
>  "Chassis": {
>  "columns": {
> @@ -71,6 +71,21 @@
>   "min": 0, "max": "unlimited"}}},
>  "indexes": [["tunnel_key"]],
>  "isRoot": true},
> +"Physical_Endpoint": {
> +"columns": {
> +"name": {"type": "string"},
> +"chassis": {"type": {"key": {"type": "uuid",
> + "refTable": "Chassis",
> + "refType": "weak"},
> + "min": 1, "max": 1}},
> +"chassis_port": {"type": {"key": "string", "min": 1,
> "max": 1}},
> +"type": {"type": {"key": {
> +   "type": "string",
> +   "enum": ["set", ["vlan"]]}}},
> +"ingress_encap": {"type": "string"},
> +"egress_encap": {"type": "string"}},
> +"indexes": [["name"]],
> +"isRoot": true},
>  "Port_Binding": {
>  "columns": {
>  "logical_port": {"type": "string"},
> @@ -96,6 +111,10 @@
>   "refTable": "Chassis",
>   "refType": "weak"},
>   "min": 0, "max": 1}},
> +"phys_endpts": {"type": {"key": {"type": "uuid",
> + "refTable":
> "Physical_Endpoint",
> + "refType": "weak"},
> + "min": 0, "max": "unlimited"}},
>  "mac": {"type": {"key": "string",
>   "min": 0,
>   "max": "unlimited"}}},
>

The code and documentation both seem to imply that there should be at most
1 Physical_Endpoint per Port_Binding.  Should "unlimited" by "1" here?  If
multiple endpoints gains a meaning in the future, it can be changed.

More broadly, I'm also wondering if having Physical_Endpoints on each
Port_Binding makes sense.  We already have Chassis.  To me, binding the
gateway port to a chassis is the relationship that makes sense here.  How
that is realized on the chassis is a separate thing.  It seems like a list
of physical endpoints could be on the Chassis, instead.

We also have a somewhat conflicting approach in configuring each chassis.
We have ovn-bridge-mappings, but these patches take a different approach
that doesn't appear to actually work in all cases.

A Physical_Endpoint has a "chassis_port".  The code will look for
br- and create it if it doesn't exist.  If ovn-controller
creates it, it won't do anything useful as no interface 

[ovs-dev] [PATCH] SECURITY.md: Add advisory document details.

2016-03-29 Thread Ben Pfaff
Signed-off-by: Ben Pfaff 
---
 SECURITY.md | 104 +---
 1 file changed, 100 insertions(+), 4 deletions(-)

diff --git a/SECURITY.md b/SECURITY.md
index 08a6ed8..33b85b5 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -101,16 +101,112 @@ determines that it is not a realistic vulnerability.
 Step 3a: Document
 
 
-The security team develops a security advisory document.  The document
-credits the reporter and describes the vulnerability, including all of
-the relevant information from the assessment in step 2.  The security
+The security team develops a security advisory document.  The security
 team may, at its discretion, include the reporter (via "CC") in
 developing the security advisory document, but in any case should
 accept feedback from the reporter before finalizing the document.
-
 When the document is final, the security team should obtain a CVE for
 the vulnerability from a CNA (https://cve.mitre.org/cve/cna.html).
 
+The document credits the reporter and describes the vulnerability,
+including all of the relevant information from the assessment in step
+2.  Suitable sections for the document include:
+
+* Title: The CVE identifier, a short description of the
+  vulnerability.  The title should mention Open vSwitch.
+
+  In email, the title becomes the subject.  Pre-release advisories
+  are often passed around in encrypted email, which have plaintext
+  subjects, so the title should not be too specific.
+
+* Description: A few paragraphs describing the general
+  characteristics of the vulnerability, including the versions of
+  Open vSwitch that are vulnerable, the kind of attack that
+  exposes the vulnerability, and potential consequences of the
+  attack.
+
+  The description should re-state the CVE identifier, in case the
+  subject is lost when an advisory is sent over email.
+
+* Mitigation: How an Open vSwitch administrator can minimize the
+  potential for exploitation of the vulnerability, before applying
+  a fix.  If no mitigation is possible or recommended, explain
+  why, to reduce the chance that at-risk users believe they are
+  not at risk.
+
+* Fix: Describe how to fix the vulnerability, perhaps in terms of
+  applying a source patch.  The patch or patches themselves, if
+  included in the email, should be at the very end of the advisory
+  to reduce the risk that a reader would stop reading at this
+  point.
+
+* Recommendation: A concise description of the security team's
+  recommendation to users.
+
+* Acknowledgments: Thank the reporters.
+
+* Vulnerability Check: A step-by-step procedure by which a user
+  can determine whether an installed copy of Open vSwitch is
+  vulnerable.
+
+  The procedure should clearly describe how to interpret the
+  results, including expected results in vulnerable and
+  not-vulnerable cases.  Thus, procedures that produce clear and
+  easily distinguished results are preferred.
+
+  The procedure should assume as little understanding of Open
+  vSwitch as possible, to make it more likely that a competent
+  administrator who does not specialize in Open vSwitch can
+  perform it successfully.
+
+  The procedure should have minimal dependencies on tools that are
+  not widely installed.
+
+  Given a choice, the procedure should be one that takes at least
+  some work to turn into a useful exploit.  For example, a
+  procedure based on "ovs-appctl" commands, which require local
+  administrator access, is preferred to one that sends test
+  packets to a machine, which only requires network connectivity.
+
+  The section should say which operating systems it is designed
+  for.  If the procedure is likely to be specific to particular
+  architectures (e.g. x86-64, i386), it should state on which ones
+  it has been tested.
+
+  This section should state the risks of the procedure. For
+  example. if it can crash Open vSwitch or disrupt packet
+  forwarding, say so.
+
+  It is more useful to explain how to check an installed and
+  running Open vSwitch than one built locally from source, but if
+  it is easy to use the procedure from a sandbox environment, it
+  can be helpful to explain how to do so.
+
+* Patch: If a patch or patches are available, and it is practical
+  to include them in the email, put them at the end.  Format them
+  as described in CONTRIBUTING.md, that is, as output by "git
+  format-patch".
+
+  The patch subjects should include the version for which they are
+  suited, e.g. "[PATCH branch-2.3]" for a patch against Open
+  vSwitch 2.3.x.  If there are multiple patches for multiple
+  versions of Open vSwitch, put them in separate sections with
+  clear titles.
+
+  Multiple patches for a 

[ovs-dev] Bill N-AE7C8A

2016-03-29 Thread Lorraine Frye
Dear dev,

Please check the bill in attachment.
In order to avoid fine you have to pay in 48 hours.

Best regards
Lorraine Frye
Divisional Finance Director
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC] ovn-controller: Optimize processing for non-local datapath without patch ports.

2016-03-29 Thread Han Zhou
On Tue, Mar 29, 2016 at 6:57 AM, Ryan Moats  wrote:
>
> Acked-by: Ryan Moats 
>

Ryan, thanks for the ack.

Scale testing shows very good result:

Test precondition:
2k hypervisors, 20k lports, 200 lswitches (each with a localnet port).

Test case:
step1: add 50 hypervisors (simulated on 1 BM with 40 cores), and wait for
flow updates complete on all new hypervisors.
step2: create a lswitch and a localnet port, create and bind 100 lports
evenly on these hypervisors. Repeat this 5 times.

Before the change:
Step1 took around 20 minutes.
Step2 took 936 seconds.

After the change:
Step1 took less than 1 minute: 20 times faster.
Step2 took 464 seconds: 2x faster.

I will do some more regression test then submit a formal patch.

--
Best regards,
Han
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] CCE29032016_00032

2016-03-29 Thread d...@openvswitch.com





Sent from my iPhone
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Bill N-08E4AC

2016-03-29 Thread Cecilia Watson
Dear dev,

Please check the bill in attachment.
In order to avoid fine you have to pay in 48 hours.

Best regards
Cecilia Watson
Business Director USA Job
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Bill N-29A3F4

2016-03-29 Thread Imogene Padilla
Dear dev,

Please check the bill in attachment.
In order to avoid fine you have to pay in 48 hours.

Best regards
Imogene Padilla
Divisional Managing Director
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] ovn: Add ovn-bridge-mappings to Chassis external_ids.

2016-03-29 Thread Russell Bryant
On Tue, Mar 29, 2016 at 11:43 AM, Ben Pfaff  wrote:

> On Mon, Mar 28, 2016 at 06:44:49PM -0400, Russell Bryant wrote:
> > Publish ovn-controller's local bridge mappings configuration
> > in the external_ids column of the Chassis table.  Having this
> > information available for reading is useful to applications
> > integrating with OVN.
> >
> > Signed-off-by: Russell Bryant 
>
> Ryan's comment about a test seems reasonable to me.
>

*holds head in shame*

Acked-by: Ben Pfaff 
>

Thanks.  I added test coverage and pushed this to master.

-- 
Russell Bryant

-- 
Russell Bryant
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Bill N-46D3C3

2016-03-29 Thread Francesco Williams
Dear dev,

Please check the bill in attachment.
In order to avoid fine you have to pay in 48 hours.

Best regards
Francesco Williams
Operations Director (CEO Designate)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath-windows: Add Windows thread atomic APIs for x64 binaries.

2016-03-29 Thread Ben Pfaff
I understand that InterlockedExchange64 works to implement a 64-bit
store.  I also understand 32-bit builds use it for 64-bit atomic stores:
because 32-bit code cannot otherwise store to 64-bit objects.  I don't
understand why 64-bit code would use it for 64-bit atomic stores; it
does not make any sense for MSVC to convert such a store to two 32-bit
instructions, and I doubt that it does.

Here's the code I'm talking about:

/* MSVC converts 64 bit writes into two instructions. So there is
 * a possibility that an interrupt can make a 64 bit write non-atomic even
 * when 8 byte aligned. So use InterlockedExchange64().
 *
 * For atomic stores, 'consume' and 'acquire' semantics are not valid. But we
 * are using 'Exchange' to get atomic stores here and we only have
 * InterlockedExchange64(), InterlockedExchangeNoFence64() and
 * InterlockedExchange64Acquire() available. So we are forced to use
 * InterlockedExchange64() which uses full memory barrier for everything
 * greater than 'memory_order_relaxed'. */
#define atomic_store64(DST, SRC, ORDER)\
if (ORDER == memory_order_relaxed) {   \
InterlockedExchangeNoFence64((int64_t volatile *) (DST),   \
 (int64_t) (SRC)); \
} else {   \
InterlockedExchange64((int64_t volatile *) (DST), (int64_t) (SRC));\
}

Similarly for atomic_read64():

/* MSVC converts 64 bit reads into two instructions. So there is
 * a possibility that an interrupt can make a 64 bit read non-atomic even
 * when 8 byte aligned. So use fully memory barrier InterlockedOr64(). */
#define atomic_read64(SRC, DST, ORDER) \
__pragma (warning(push))   \
__pragma (warning(disable:4047))   \
*(DST) = InterlockedOr64((int64_t volatile *) (SRC), 0);   \
__pragma (warning(pop))


On Tue, Mar 29, 2016 at 04:20:13PM +, Sorin Vinturis wrote:
> Ben, ovs-atomic-msvc.h contains support for 8,16,32 and 64-bit arithmetic and 
> logical operations.
> Regarding 64-bit operations, like read and write, they are performed using 
> specific 64-bit interlocked functions, i.e. InterlockedCompareExchange64 and, 
> respectivelly, InterlockedOr64. Both functions are performed atomically.
> 
> All remaining 64-bit operations use, also, 64-bit interlocked functions, 
> which are executed atomically, except for logical AND operation. For the 
> latter operation, an intrinsic function is used, _InterlockedExchangeAdd64, 
> which means that is a faster function built in to the compiler. More details 
> about this is here: 
> https://msdn.microsoft.com/ro-ro/library/26td21ds(v=vs.80).aspx.
> 
> 
> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org] 
> Sent: Tuesday, 29 March, 2016 00:25
> To: Sorin Vinturis
> Cc: Guru Shetty; dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH] datapath-windows: Add Windows thread atomic 
> APIs for x64 binaries.
> 
> On Mon, Mar 28, 2016 at 09:11:24PM +, Sorin Vinturis wrote:
> > Hi Guru,
> > 
> > I have verified the ovs-atomic-msvc.h header and all the defined 
> > macros and functions have 32-bit and 64-bit correspondent. All 64-bit 
> > macros use InterlockedXXX functions that are atomic:
> > InterlockedExchangeNoFence64, InterlockedExchange64, 
> > InterlockedCompareExchange64, _InterlockedExchangeAdd64, 
> > InterlockedAnd64, InterlockedOr64, InterlockedXor64. I have ran 
> > test-atomic.c unit tests, on both 32-bit and 64-bit platforms, and all 
> > the tests are verified.
> > 
> > I also tested the 64-bit build on Hyper-V and it is stable.
> > 
> > Is there a reason why we should not use ovs-atomic-msvc APIs for 64-bit 
> > builds?
> 
> ovs-atomic-msvc.h seems pretty oriented toward a 32-bit build.  At least, I 
> really doubt MSVC would break 64-bit reads and writes into two 32-bit reads 
> and writes on 64-bit builds, which is what the code in the header currently 
> assumes.  You don't want to update it to handle 64-bit builds differently?  I 
> guess that these would be optimizations, but they seem straightforward.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v10 0/6] Convert DPDK configuration from command line to DB based

2016-03-29 Thread Ben Pfaff
On Tue, Mar 29, 2016 at 12:08:50PM -0400, Aaron Conole wrote:
> Ben Pfaff  writes:
> 
> > On Tue, Mar 29, 2016 at 06:04:44AM +, Wojciechowicz, RobertX wrote:
> >> > -Original Message-
> >> > From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Aaron
> >> > Conole
> >> > Sent: Monday, March 28, 2016 8:55 PM
> >> > To: dev@openvswitch.org
> >> > Cc: Flavio Leitner 
> >> > Subject: Re: [ovs-dev] [PATCH v10 0/6] Convert DPDK configuration from
> >> > command line to DB based
> >> > 
> >> > Hi (and apologies if the top posting is inappropriate),
> >> > 
> >> > Don't want to be a pest, but just pinging re: this series. What work
> >> > remains? I want to try and close this out to do some additional
> >> > vhostuser config work, so anything that might be gating this please let
> >> > me know and I'll work on it.
> >> > 
> >> 
> >> Please remember to add "vhost-sock-dir" to the database,
> >> even if there will be used the default directory (no command line value).
> >
> > I'm nervous about adding unrestricted directory names to the database,
> > because they could allow a remote database user to write to arbitrary
> > places in the file system.
> 
> I see your point here. Is there a suggested mechanism to resolve this?
> What if we had a scheme like:
> 
> ovs_rundir() + dboption
> 
> where we scrubbed dboption for '..' characters. Since I'm in this area
> right now doing the change, I don't mind altering this scheme, but it
> does slightly change the semantic of the option so I'd want to hear from
> folks before making said scheme change.

That would make me happy; I can't speak for the others of course.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] CCE29032016_00050

2016-03-29 Thread d...@openvswitch.com





Sent from my iPhone
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath-windows: Update Recirculation to use the right parameters

2016-03-29 Thread Alin Serdean
Comments inlined.

Thanks,
Alin.
> -Mesaj original-
> This comparison determines the value for ŒisRecv¹ parameter. ŒisRecv¹
> determines whether the OOB data should be interpreted as receive data or
> send data. So, the existing code is checking for:
> srcPortNo == switchContext->virtualExternalPortId
> 
> Left side data is a UINT32 and right side data is a NDIS_SWITCH_PORT_ID.
> So, clearly this comparison is not right since we are comparing different
> types. They are not expected to have the same value even if they are
> representing the same vport.
> 
[Alin Gabriel Serdean: ] from the header ntddndis.h:
typedef UINT32 NDIS_SWITCH_PORT_ID, *PNDIS_SWITCH_PORT_ID; 
I don't think it is a problem of type but a problem with the number itself.
> The proposed fix at least makes sure that we are comparing the right types.
> So, I¹d go with it. Is the comparison right? It seems so. Basically we want to
> determine if the packet came from a VIF or a physical NIC.
> ŒfwdDetail¹ is a reliable way of doing it. Only way this could mess up is if 
> we
> mean to explicitly change the ŒsrcPortNo¹ during tunneling. In such cases,
> the 'fwdDetail->SourcePortId¹ will end up different from the ŒsrcPortNo¹.
> This is exactly why we use ŒsrcPortNo¹ for comparison around the code to
> allow flexibility.
[Alin Gabriel Serdean: ] fwdDetail is not reliable in our case because we could 
have cloned the nbl and not update the OOB data and this is not what we want to 
do in our case. We want to reprocess the packet as it came from the same input 
port 
(https://github.com/openvswitch/ovs/blob/master/ofproto/ofproto-dpif-rid.h#L64-L69)
Indeed, there is a problem because srcportno == vport->portno and we should use 
in our case vport->portId when doing the comparison. I'll send out a patch to 
update it.
> 
> 
> In any case, the problematic case here is tunneling + recirc. We can deal with
> that in a subsequent patch. For now, this is the right fix.
> 
> I¹d also add a XXX comment to investigate that "tunneling + recirc² case in 
> the
> future.
> 
> > ,
> > ovsFwdCtx.switchContext, diff
> >--git a/datapath-windows/ovsext/Flow.c b/datapath-
> windows/ovsext/Flow.c
> >index 02c41b7..d49697c 100644
> >--- a/datapath-windows/ovsext/Flow.c
> >+++ b/datapath-windows/ovsext/Flow.c
> >@@ -2133,6 +2133,9 @@ OvsLookupFlow(OVS_DATAPATH *datapath,
> >
> > if (!hashValid) {
> > *hash = OvsJhashBytes(start, size, 0);
> >+if (key->recircId) {
> >+*hash = OvsJhashWords((UINT32*)hash, 1, key->recircId);
> >+}
> 
> Ok, we have two options:
> 1. Increment ŒkeyLen¹ and include recircId as part of it. So, while hashing we
> make sure that recircId gets included.
> 2. Explicitly add Œkey->recirId¹ during hashing.
> 
> I¹m ok with either way. The ŒkeyLen¹ approach is more efficient for now,
> since it avoids a Œif¹ check and also an additional function call. In the 
> future, if
> we have a lot of meta data such as ŒrecircId¹, then we should consider
> adding a Œkey->metaDataLen¹ and go from there. For now, I¹d add a
> comment in ŒOvsFlowKey¹ to set future direction, and go with the
> ŒkeyLen¹ approach.
[Alin Gabriel Serdean: ] I would go with option one so we maintain the same of 
hashing we have at the moment.
> 
> Of course, there¹s a bug when ŒkeyLen¹ gets set in
> _MapKeyAttrToFlowPut().
> The value gets incremented due to ŒrecircId¹ but gets reset again.
> 
> > }
> >
> > head = >flowTable[HASH_BUCKET(*hash)];
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] CCE29032016_00021

2016-03-29 Thread d...@openvswitch.com





Sent from my iPhone
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath-windows: Add Windows thread atomic APIs for x64 binaries.

2016-03-29 Thread Sorin Vinturis
Ben, ovs-atomic-msvc.h contains support for 8,16,32 and 64-bit arithmetic and 
logical operations.
Regarding 64-bit operations, like read and write, they are performed using 
specific 64-bit interlocked functions, i.e. InterlockedCompareExchange64 and, 
respectivelly, InterlockedOr64. Both functions are performed atomically.

All remaining 64-bit operations use, also, 64-bit interlocked functions, which 
are executed atomically, except for logical AND operation. For the latter 
operation, an intrinsic function is used, _InterlockedExchangeAdd64, which 
means that is a faster function built in to the compiler. More details about 
this is here: https://msdn.microsoft.com/ro-ro/library/26td21ds(v=vs.80).aspx.


-Original Message-
From: Ben Pfaff [mailto:b...@ovn.org] 
Sent: Tuesday, 29 March, 2016 00:25
To: Sorin Vinturis
Cc: Guru Shetty; dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH] datapath-windows: Add Windows thread atomic APIs 
for x64 binaries.

On Mon, Mar 28, 2016 at 09:11:24PM +, Sorin Vinturis wrote:
> Hi Guru,
> 
> I have verified the ovs-atomic-msvc.h header and all the defined 
> macros and functions have 32-bit and 64-bit correspondent. All 64-bit 
> macros use InterlockedXXX functions that are atomic:
> InterlockedExchangeNoFence64, InterlockedExchange64, 
> InterlockedCompareExchange64, _InterlockedExchangeAdd64, 
> InterlockedAnd64, InterlockedOr64, InterlockedXor64. I have ran 
> test-atomic.c unit tests, on both 32-bit and 64-bit platforms, and all 
> the tests are verified.
> 
> I also tested the 64-bit build on Hyper-V and it is stable.
> 
> Is there a reason why we should not use ovs-atomic-msvc APIs for 64-bit 
> builds?

ovs-atomic-msvc.h seems pretty oriented toward a 32-bit build.  At least, I 
really doubt MSVC would break 64-bit reads and writes into two 32-bit reads and 
writes on 64-bit builds, which is what the code in the header currently 
assumes.  You don't want to update it to handle 64-bit builds differently?  I 
guess that these would be optimizations, but they seem straightforward.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v5 08/12] dpif-netdev: Change pmd thread configuration in dpif_netdev_run().

2016-03-29 Thread Kavanagh, Mark B
>
>Hi Mark,
>
>thanks for your comment, I replied inline
>
>On 24/03/2016 10:17, "Kavanagh, Mark B"  wrote:
>
>>Hi Daniele,
>>
>>One comment inline.
>>
>>Cheers,
>>Mark
>>
>>>-Original Message-
>>>From: Daniele Di Proietto [mailto:diproiet...@vmware.com]
>>>Sent: Wednesday, March 23, 2016 6:37 PM
>>>To: dev@openvswitch.org
>>>Cc: Ben Pfaff; Kavanagh, Mark B; Ilya Maximets; Daniele Di Proietto
>>>Subject: [PATCH v5 08/12] dpif-netdev: Change pmd thread configuration
>>>in dpif_netdev_run().
>>>
>>>Signed-off-by: Daniele Di Proietto 
>>>Tested-by: Ilya Maximets 
>>>Acked-by: Ilya Maximets 
>>>---
>>> lib/dpif-netdev.c   | 140
>>>++--
>>> lib/dpif-provider.h |   3 +-
>>> 2 files changed, 83 insertions(+), 60 deletions(-)
>>>
>>>diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>>index 66c0b19..6aaeaeb 100644
>>>--- a/lib/dpif-netdev.c
>>>+++ b/lib/dpif-netdev.c
>>>@@ -223,7 +223,9 @@ struct dp_netdev {
>>> ovsthread_key_t per_pmd_key;
>>>
>>> /* Cpu mask for pin of pmd threads. */
>>>+char *requested_pmd_cmask;
>>> char *pmd_cmask;
>>>+
>>> uint64_t last_tnl_conf_seq;
>>> };
>>>
>>>@@ -2447,82 +2449,44 @@ dpif_netdev_operate(struct dpif *dpif, struct
>>>dpif_op **ops, size_t
>>>n_ops)
>>> }
>>> }
>>>
>>>-/* Returns true if the configuration for rx queues or cpu mask
>>>- * is changed. */
>>>+/* Returns true if the configuration for rx queues is changed. */
>>> static bool
>>>-pmd_config_changed(const struct dp_netdev *dp, const char *cmask)
>>>+pmd_n_rxq_changed(const struct dp_netdev *dp)
>>> {
>>> struct dp_netdev_port *port;
>>>
>>> CMAP_FOR_EACH (port, node, >ports) {
>>>-struct netdev *netdev = port->netdev;
>>>-int requested_n_rxq = netdev_requested_n_rxq(netdev);
>>>-if (netdev_is_pmd(netdev)
>>>+int requested_n_rxq = netdev_requested_n_rxq(port->netdev);
>>>+
>>>+if (netdev_is_pmd(port->netdev)
>>> && port->latest_requested_n_rxq != requested_n_rxq) {
>>> return true;
>>> }
>>> }
>>>
>>>-if (dp->pmd_cmask != NULL && cmask != NULL) {
>>>-return strcmp(dp->pmd_cmask, cmask);
>>>-} else {
>>>-return (dp->pmd_cmask != NULL || cmask != NULL);
>>>+return false;
>>>+}
>>>+
>>>+static bool
>>>+cmask_equals(const char *a, const char *b)
>>>+{
>>>+if (a && b) {
>>>+return !strcmp(a, b);
>>> }
>>>+
>>>+return a == NULL && b == NULL;
>>> }
>>>
>>>-/* Resets pmd threads if the configuration for 'rxq's or cpu mask
>>>changes. */
>>>+/* Changes the number or the affinity of pmd threads.  The changes are
>>>actually
>>>+ * applied in dpif_netdev_run(). */
>>> static int
>>> dpif_netdev_pmd_set(struct dpif *dpif, const char *cmask)
>>> {
>>> struct dp_netdev *dp = get_dp_netdev(dpif);
>>>
>>>-if (pmd_config_changed(dp, cmask)) {
>>>-struct dp_netdev_port *port;
>>>-
>>>-dp_netdev_destroy_all_pmds(dp);
>>>-
>>>-CMAP_FOR_EACH (port, node, >ports) {
>>>-struct netdev *netdev = port->netdev;
>>>-int requested_n_rxq = netdev_requested_n_rxq(netdev);
>>>-if (netdev_is_pmd(port->netdev)
>>>-&& port->latest_requested_n_rxq != requested_n_rxq) {
>>>-int i, err;
>>>-
>>>-/* Closes the existing 'rxq's. */
>>>-for (i = 0; i < netdev_n_rxq(port->netdev); i++) {
>>>-netdev_rxq_close(port->rxq[i]);
>>>-port->rxq[i] = NULL;
>>>-}
>>>-port->n_rxq = 0;
>>>-
>>>-/* Sets the new rx queue config.  */
>>>-err = netdev_set_multiq(port->netdev,
>>>-ovs_numa_get_n_cores() + 1,
>>>-requested_n_rxq);
>>>-if (err && (err != EOPNOTSUPP)) {
>>>-VLOG_ERR("Failed to set dpdk interface %s rx_queue
>>>to:"
>>>- " %u", netdev_get_name(port->netdev),
>>>- requested_n_rxq);
>>>-return err;
>>>-}
>>>-port->latest_requested_n_rxq = requested_n_rxq;
>>>-/* If the set_multiq() above succeeds, reopens the
>>>'rxq's. */
>>>-port->n_rxq = netdev_n_rxq(port->netdev);
>>>-port->rxq = xrealloc(port->rxq, sizeof *port->rxq *
>>>port->n_rxq);
>>>-for (i = 0; i < port->n_rxq; i++) {
>>>-netdev_rxq_open(port->netdev, >rxq[i], i);
>>>-}
>>>-}
>>>-}
>>>-/* Reconfigures the cpu mask. */
>>>-ovs_numa_set_cpu_mask(cmask);
>>>-free(dp->pmd_cmask);
>>>-dp->pmd_cmask = cmask ? xstrdup(cmask) : NULL;
>>>-
>>>-/* Restores the non-pmd. */
>>>-

Re: [ovs-dev] [PATCH v10 0/6] Convert DPDK configuration from command line to DB based

2016-03-29 Thread Aaron Conole
Ben Pfaff  writes:

> On Tue, Mar 29, 2016 at 06:04:44AM +, Wojciechowicz, RobertX wrote:
>> > -Original Message-
>> > From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Aaron
>> > Conole
>> > Sent: Monday, March 28, 2016 8:55 PM
>> > To: dev@openvswitch.org
>> > Cc: Flavio Leitner 
>> > Subject: Re: [ovs-dev] [PATCH v10 0/6] Convert DPDK configuration from
>> > command line to DB based
>> > 
>> > Hi (and apologies if the top posting is inappropriate),
>> > 
>> > Don't want to be a pest, but just pinging re: this series. What work
>> > remains? I want to try and close this out to do some additional
>> > vhostuser config work, so anything that might be gating this please let
>> > me know and I'll work on it.
>> > 
>> 
>> Please remember to add "vhost-sock-dir" to the database,
>> even if there will be used the default directory (no command line value).
>
> I'm nervous about adding unrestricted directory names to the database,
> because they could allow a remote database user to write to arbitrary
> places in the file system.

I see your point here. Is there a suggested mechanism to resolve this?
What if we had a scheme like:

ovs_rundir() + dboption

where we scrubbed dboption for '..' characters. Since I'm in this area
right now doing the change, I don't mind altering this scheme, but it
does slightly change the semantic of the option so I'd want to hear from
folks before making said scheme change.

Thanks,
-Aaron
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath-windows: Remove unnecessary keylen computation in Flow.c

2016-03-29 Thread Alin Serdean
Again please move the keylen reset upwards before the keylen gets updated.

We could do as you propose Nithin. Although hash is applicable from my 
knowledge only in the case of bonding.

We could simply just align the structure "OvsFlowKey" to 8 and make sure we 
initialize the members properly that would be better in the future IMO. Thus 
resulting we only set the members we receive from the userspace in this 
particular case recircId.

Alin.

> -Mesaj original-
> De la: dev [mailto:dev-boun...@openvswitch.org] În numele Nithin Raju
> Trimis: Tuesday, March 29, 2016 5:43 PM
> Către: Sairam Venugopal ; dev@openvswitch.org
> Subiect: Re: [ovs-dev] [PATCH] datapath-windows: Remove unnecessary
> keylen computation in Flow.c
> 
> Let¹s go in the order in which the various fields in OvsFlowKey are laid out.
> Let¹s setup ŒdestKey->recircId¹ and ŒdestKey->dpHash¹ after the L2
> headers.
> 
> Also, you want to combine this with the fix in the other patch during hash
> computation. They all go together.
> 
> 
> -Original Message-
> From: dev  on behalf of Sairam Venugopal
> 
> Date: Friday, March 25, 2016 at 11:41 AM
> To: "dev@openvswitch.org" 
> Subject: [ovs-dev] [PATCH] datapath-windows: Remove unnecessary
> keylencomputation in Flow.c
> 
> >destKey->l2.keylen gets reset after this line. This change
> >doesn't help with hash computation and can be removed. This is what
> >destKey->l2.keyLen gets set to:
> >
> >destKey->l2.keyLen = OVS_WIN_TUNNEL_KEY_SIZE + OVS_L2_KEY_SIZE
> > - destKey->l2.offset;
> >
> >Signed-off-by: Sairam Venugopal 
> >---
> > datapath-windows/ovsext/Flow.c | 2 --
> > 1 file changed, 2 deletions(-)
> >
> >diff --git a/datapath-windows/ovsext/Flow.c
> >b/datapath-windows/ovsext/Flow.c index d49697c..004c54a 100644
> >--- a/datapath-windows/ovsext/Flow.c
> >+++ b/datapath-windows/ovsext/Flow.c
> >@@ -1382,12 +1382,10 @@ _MapKeyAttrToFlowPut(PNL_ATTR *keyAttrs,
> >
> > if (keyAttrs[OVS_KEY_ATTR_RECIRC_ID]) {
> > destKey->recircId =
> >NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_RECIRC_ID]);
> >-destKey->l2.keyLen += sizeof(destKey->recircId);
> > }
> >
> > if (keyAttrs[OVS_KEY_ATTR_DP_HASH]) {
> > destKey->dpHash =
> NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_DP_HASH]);
> >-destKey->l2.keyLen += sizeof(destKey->dpHash);
> > }
> >
> > /* = L2 headers = */
> >--
> >2.5.0.windows.1
> >
> >___
> >dev mailing list
> >dev@openvswitch.org
> >https://urldefense.proofpoint.com/v2/url?u=http-
> 3A__openvswitch.org_mai
> >lma
> >n_listinfo_dev=BQIGaQ=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-
> YihVMNtXt-uEs
> >=pN
> >HQcdr7B40b4h6Yb7FIedI1dnBsxdDuTLBYD3JqV80=oRGQX3o3i3alWuYX8
> DM0Rq2_Ree
> >DJO
> 4aL_3CkyeAp40=89cOfCUZyjbS3g8Nk_phUFg5pgZHZKTGNV2rkS550zc=
> 
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath-windows: Remove unnecessary keylen computation in Flow.c

2016-03-29 Thread Alin Serdean
If we don't move that upwards it will reset the keylen after recircid/hash has 
been set to update it.

> -Mesaj original-
> De la: Nithin Raju [mailto:nit...@vmware.com]
> Trimis: Tuesday, March 29, 2016 5:45 PM
> Către: Alin Serdean ; Sairam Venugopal
> ; dev@openvswitch.org
> Subiect: Re: [ovs-dev] [PATCH] datapath-windows: Remove unnecessary
> keylen computation in Flow.c
> 
> -Original Message-
> From: dev  on behalf of Alin Serdean
> 
> Date: Monday, March 28, 2016 at 3:35 AM
> To: Sairam Venugopal , "dev@openvswitch.org"
> 
> Subject: Re: [ovs-dev] [PATCH] datapath-windows: Remove
> unnecessary   keylen  computation in Flow.c
> 
> >Please move:
> >
> >
> >
> >destKey->l2.keyLen = OVS_WIN_TUNNEL_KEY_SIZE + OVS_L2_KEY_SIZE
> >
> > - destKey->l2.offset;
> >
> >
> >
> >After
> >
> >_MapTunAttrToFlowPut(keyAttrs, tunnelAttrs, destKey);
> 
> Why?

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v10 0/6] Convert DPDK configuration from command line to DB based

2016-03-29 Thread Ben Pfaff
On Tue, Mar 29, 2016 at 06:04:44AM +, Wojciechowicz, RobertX wrote:
> > -Original Message-
> > From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Aaron
> > Conole
> > Sent: Monday, March 28, 2016 8:55 PM
> > To: dev@openvswitch.org
> > Cc: Flavio Leitner 
> > Subject: Re: [ovs-dev] [PATCH v10 0/6] Convert DPDK configuration from
> > command line to DB based
> > 
> > Hi (and apologies if the top posting is inappropriate),
> > 
> > Don't want to be a pest, but just pinging re: this series. What work
> > remains? I want to try and close this out to do some additional
> > vhostuser config work, so anything that might be gating this please let
> > me know and I'll work on it.
> > 
> 
> Please remember to add "vhost-sock-dir" to the database,
> even if there will be used the default directory (no command line value).

I'm nervous about adding unrestricted directory names to the database,
because they could allow a remote database user to write to arbitrary
places in the file system.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] ovn: Add ovn-bridge-mappings to Chassis external_ids.

2016-03-29 Thread Ben Pfaff
On Mon, Mar 28, 2016 at 06:44:49PM -0400, Russell Bryant wrote:
> Publish ovn-controller's local bridge mappings configuration
> in the external_ids column of the Chassis table.  Having this
> information available for reading is useful to applications
> integrating with OVN.
> 
> Signed-off-by: Russell Bryant 

Ryan's comment about a test seems reasonable to me.

Acked-by: Ben Pfaff 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] debian: Ship ovn-[ns]b man pages in ovn-common.

2016-03-29 Thread Russell Bryant
On Tue, Mar 29, 2016 at 10:49 AM, Guru Shetty  wrote:

>
>
> On 24 March 2016 at 17:04, Russell Bryant  wrote:
>
>> Move ovn-nb and ovn-sb man pages to ovn-common so that the man pages for
>> these DB schemas are always available with the corresponding command
>> line utilities, ovn-nbctl and ovn-sbctl.
>>
>> Signed-off-by: Russell Bryant 
>>
> Acked-by: Gurucharan Shetty 
>

Thanks!  I applied this to master.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] debian: Ship ovn-[ns]b man pages in ovn-common.

2016-03-29 Thread Russell Bryant
On Mon, Mar 28, 2016 at 9:16 PM, Simon Horman 
wrote:

> On Thu, Mar 24, 2016 at 08:04:33PM -0400, Russell Bryant wrote:
> > On Thu, Mar 24, 2016 at 8:04 PM, Russell Bryant  wrote:
> >
> > > Move ovn-nb and ovn-sb man pages to ovn-common so that the man pages
> for
> > > these DB schemas are always available with the corresponding command
> > > line utilities, ovn-nbctl and ovn-sbctl.
> > >
> > > Signed-off-by: Russell Bryant 
> > > ---
> > >  debian/ovn-central.manpages | 2 --
> > >  debian/ovn-common.manpages  | 2 ++
> > >  2 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/debian/ovn-central.manpages b/debian/ovn-central.manpages
> > > index 60ac082..2ddb437 100644
> > > --- a/debian/ovn-central.manpages
> > > +++ b/debian/ovn-central.manpages
> > > @@ -1,3 +1 @@
> > > -ovn/ovn-nb.5
> > > -ovn/ovn-sb.5
> > >  ovn/northd/ovn-northd.8
> > > diff --git a/debian/ovn-common.manpages b/debian/ovn-common.manpages
> > > index d48d801..a1bd031 100644
> > > --- a/debian/ovn-common.manpages
> > > +++ b/debian/ovn-common.manpages
> > > @@ -1,4 +1,6 @@
> > >  ovn/ovn-architecture.7
> > > +ovn/ovn-nb.5
> > > +ovn/ovn-sb.5
> > >  ovn/utilities/ovn-ctl.8
> > >  ovn/utilities/ovn-nbctl.8
> > >  ovn/utilities/ovn-sbctl.8
> > > --
> > > 2.5.5
> > >
> > >
> > Note that this is completely untested.
>
> It looks trivial and correct but just in case:
>
> Tested-by: Simon Horman 
>
>
Thanks for trying it.  It looked trivial, but since I don't really know
debian packaging, I figured I could have missed something.

-- 
Russell Bryant
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] ovn: Add ovn-bridge-mappings to Chassis external_ids.

2016-03-29 Thread Russell Bryant
On Tue, Mar 29, 2016 at 10:43 AM, Ryan Moats  wrote:

> This makes sense to me, though before I ack it, I'm wondering if a test
> case would be a useful addition?
>
That's a fair request.  I have no good excuse for not adding a test case
for this.

Thanks,

-- 
Russell Bryant
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath-windows: Fix OVS_KEY_ATTR_ICMPV6 support

2016-03-29 Thread Ben Pfaff
On Thu, Mar 10, 2016 at 02:12:50PM +, Alin Serdean wrote:
> This patch applies the conversions needed for the ICMPv6 type and code.
> 
> Signed-off-by: Alin Gabriel Serdean 

Though this was acked, it currently gets patch rejects.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath-windows: Add OVS_KEY_ATTR_ICMP support

2016-03-29 Thread Ben Pfaff
On Thu, Mar 10, 2016 at 02:07:06PM +, Alin Serdean wrote:
> Revisit the mapping of an IPv4 key to netlink key and add the according
> transformation.
> 
> Also add support for OVS_KEY_ATTR_ICMP to the windows datapath.
> 
> Signed-off-by: Alin Gabriel Serdean 

Though this was acked, it gets patch rejects when I try to apply it.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] Implementation of Packet-Out in Bundles

2016-03-29 Thread Ben Pfaff
On Tue, Mar 15, 2016 at 09:21:40PM +, André Mantas wrote:
> I think one option could be do extra validations depending on the error
> returned by ofproto_check_ofpacts like checking if previous entries in the
> bundle would make the validation successful. But since group and meter mods
> are not implemented yet this isn't feasible.
> I need to focus on my project for now but later I might be able to help
> with group and meter mods.
> 
> To submit the patch is "just" follow CONTRIBUTING.md, right?

Yes, that's correct.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath-windows: Pause switch state on PnP event

2016-03-29 Thread Ben Pfaff
On Thu, Mar 10, 2016 at 01:33:42PM +, Alin Serdean wrote:
> A PnP(plug and play) event will be triggered before trying to disable
> the extension. We could use this PnP event to prepare for detaching
> the datapath.
> 
> This patch sets the switch into a paused state so no more net buffers
> are queued.
> 
> Also clean some commentaries.
> 
> Signed-off-by: Alin Gabriel Serdean 
> ---
> v2: Address comments

It's been a while since this was posted but I haven't seen any reviews.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] CONTRIBUTING.md: Describe a new "Vulnerability" tag.

2016-03-29 Thread Ben Pfaff
Thanks, applied to master.

On Mon, Mar 28, 2016 at 11:13:10PM -0400, Russell Bryant wrote:
> Acked-by: Russell Bryant 
> 
> On Monday, March 28, 2016, Ben Pfaff  wrote:
> 
> > Signed-off-by: Ben Pfaff >
> > ---
> >  CONTRIBUTING.md | 11 +++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
> > index 439c56a..8fcebba 100644
> > --- a/CONTRIBUTING.md
> > +++ b/CONTRIBUTING.md
> > @@ -250,6 +250,17 @@ Examples of common tags follow.
> >
> >  git log -1 --pretty=format:"Fixes: %h (\"%s\")" --abbrev=12
> > COMMIT_REF
> >
> > +Vulnerability: CVE-2016-2074
> > +
> > +Specifies that the patch fixes or is otherwise related to a
> > +security vulnerability with the given CVE identifier.  Other
> > +identifiers in public vulnerability databases are also
> > +suitable.
> > +
> > +If the vulnerability was reported publicly, then it is also
> > +appropriate to cite the URL to the report in a Reported-at
> > +tag.  Use a Reported-by tag to acknowledge the reporters.
> > +
> >  Developer's Certificate of Origin
> >  -
> >
> > --
> > 2.1.3
> >
> > ___
> > dev mailing list
> > dev@openvswitch.org 
> > http://openvswitch.org/mailman/listinfo/dev
> >
> 
> 
> -- 
> Russell Bryant
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] AUTHORS: Add Bhargava Shastry and KashyapThimmaraju.

2016-03-29 Thread Ben Pfaff
Thanks, will push soon.

On Tue, Mar 29, 2016 at 07:31:45AM -0600, Ryan Moats wrote:
> Acked-by: Ryan Moats 
> 
> "dev"  wrote on 03/28/2016 09:40:53 PM:
> 
> > From: Ben Pfaff 
> > To: dev@openvswitch.org
> > Cc: Ben Pfaff 
> > Date: 03/28/2016 09:42 PM
> > Subject: [ovs-dev] [PATCH] AUTHORS: Add Bhargava Shastry and Kashyap
> > Thimmaraju.
> > Sent by: "dev" 
> >
> > Bhargava and Kashyap reported vulnerability CVE-2016-2074, which was
> > present only on the branches for 2.3 and 2.4 and thus did not require any
> > commits on master.  They still deserve credit as bug reporters, so this
> > commit provides that.
> >
> > Signed-off-by: Ben Pfaff 
> > ---
> >  AUTHORS | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/AUTHORS b/AUTHORS
> > index f0d1e23..6ec6456 100644
> > --- a/AUTHORS
> > +++ b/AUTHORS
> > @@ -268,6 +268,7 @@ Atzm Watanabe   a...@stratosphere.co.jp
> >  Aurélien Poulain   aurepoul...@viacesi.fr
> >  Bastian Blank   wa...@debian.org
> >  Ben Basler  bbas...@nicira.com
> > +Bhargava Shastrybshas...@sec.t-labs.tu-berlin.de
> >  Bob Ballbob.b...@citrix.com
> >  Brad Hall   b...@nicira.com
> >  Brandon Heller  brand...@stanford.edu
> > @@ -333,6 +334,7 @@ John Darrington j...@darrington.wattle.id.au
> >  John Galgay j...@galgay.net
> >  John Hurley john.hur...@netronome.com
> >  John Reumannnofutznetwo...@gmail.com
> > +Kashyap Thimmaraju  kashyap.thimmar...@sec.t-labs.tu-berlin.de
> >  Keith Holleman  hollemani...@gmail.com
> >  K 華k940...@hotmail.com
> >  Kevin Mancuso   kevin.manc...@rackspace.com
> > --
> > 2.1.3
> >
> > ___
> > dev mailing list
> > dev@openvswitch.org
> > http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] debian: Ship ovn-[ns]b man pages in ovn-common.

2016-03-29 Thread Guru Shetty
On 24 March 2016 at 17:04, Russell Bryant  wrote:

> Move ovn-nb and ovn-sb man pages to ovn-common so that the man pages for
> these DB schemas are always available with the corresponding command
> line utilities, ovn-nbctl and ovn-sbctl.
>
> Signed-off-by: Russell Bryant 
>
Acked-by: Gurucharan Shetty 

> ---
>  debian/ovn-central.manpages | 2 --
>  debian/ovn-common.manpages  | 2 ++
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/debian/ovn-central.manpages b/debian/ovn-central.manpages
> index 60ac082..2ddb437 100644
> --- a/debian/ovn-central.manpages
> +++ b/debian/ovn-central.manpages
> @@ -1,3 +1 @@
> -ovn/ovn-nb.5
> -ovn/ovn-sb.5
>  ovn/northd/ovn-northd.8
> diff --git a/debian/ovn-common.manpages b/debian/ovn-common.manpages
> index d48d801..a1bd031 100644
> --- a/debian/ovn-common.manpages
> +++ b/debian/ovn-common.manpages
> @@ -1,4 +1,6 @@
>  ovn/ovn-architecture.7
> +ovn/ovn-nb.5
> +ovn/ovn-sb.5
>  ovn/utilities/ovn-ctl.8
>  ovn/utilities/ovn-nbctl.8
>  ovn/utilities/ovn-sbctl.8
> --
> 2.5.5
>
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath-windows: Remove unnecessary keylen computation in Flow.c

2016-03-29 Thread Nithin Raju
-Original Message-
From: dev  on behalf of Alin Serdean

Date: Monday, March 28, 2016 at 3:35 AM
To: Sairam Venugopal , "dev@openvswitch.org"

Subject: Re: [ovs-dev] [PATCH] datapath-windows: Remove
unnecessary keylen  computation in Flow.c

>Please move:
>
>
>
>destKey->l2.keyLen = OVS_WIN_TUNNEL_KEY_SIZE + OVS_L2_KEY_SIZE
>
> - destKey->l2.offset;
>
>
>
>After 
>
>_MapTunAttrToFlowPut(keyAttrs, tunnelAttrs, destKey);

Why?

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] ovn: Add ovn-bridge-mappings to Chassis external_ids.

2016-03-29 Thread Ryan Moats
This makes sense to me, though before I ack it, I'm wondering if a test
case would be a useful addition?

Ryan

"dev"  wrote on 03/28/2016 05:44:49 PM:

> From: Russell Bryant 
> To: dev@openvswitch.org
> Date: 03/28/2016 05:45 PM
> Subject: [ovs-dev] [PATCH v2] ovn: Add ovn-bridge-mappings to
> Chassis external_ids.
> Sent by: "dev" 
>
> Publish ovn-controller's local bridge mappings configuration
> in the external_ids column of the Chassis table.  Having this
> information available for reading is useful to applications
> integrating with OVN.
>
> Signed-off-by: Russell Bryant 
> ---
>  ovn/controller/chassis.c | 24 
>  ovn/ovn-sb.xml   |  7 +++
>  2 files changed, 31 insertions(+)
>
>  v1->v2:
>- add verify() before updating a key in Chassis external_ids
>- add get_bridge_mappings() helper
>
>
> diff --git a/ovn/controller/chassis.c b/ovn/controller/chassis.c
> index 52c9993..d40181b 100644
> --- a/ovn/controller/chassis.c
> +++ b/ovn/controller/chassis.c
> @@ -18,6 +18,7 @@
>
>  #include "chassis.h"
>
> +#include "lib/smap.h"
>  #include "lib/vswitch-idl.h"
>  #include "openvswitch/dynamic-string.h"
>  #include "openvswitch/vlog.h"
> @@ -55,6 +56,13 @@ pop_tunnel_name(uint32_t *type)
>  OVS_NOT_REACHED();
>  }
>
> +static const char *
> +get_bridge_mappings(const struct smap *ext_ids)
> +{
> +const char *bridge_mappings = smap_get(ext_ids,
"ovn-bridge-mappings");
> +return bridge_mappings ? bridge_mappings : "";
> +}
> +
>  void
>  chassis_run(struct controller_ctx *ctx, const char *chassis_id)
>  {
> @@ -102,6 +110,8 @@ chassis_run(struct controller_ctx *ctx, const
> char *chassis_id)
>  hostname = hostname_;
>  }
>
> +const char *bridge_mappings = get_bridge_mappings(>
external_ids);
> +
>  const struct sbrec_chassis *chassis_rec
>  = get_chassis(ctx->ovnsb_idl, chassis_id);
>
> @@ -110,6 +120,17 @@ chassis_run(struct controller_ctx *ctx, const
> char *chassis_id)
>  sbrec_chassis_set_hostname(chassis_rec, hostname);
>  }
>
> +const char *chassis_bridge_mappings
> += get_bridge_mappings(_rec->external_ids);
> +if (strcmp(bridge_mappings, chassis_bridge_mappings)) {
> +struct smap new_ids;
> +smap_clone(_ids, _rec->external_ids);
> +smap_replace(_ids, "ovn-bridge-mappings",
bridge_mappings);
> +sbrec_chassis_verify_external_ids(chassis_rec);
> +sbrec_chassis_set_external_ids(chassis_rec, _ids);
> +smap_destroy(_ids);
> +}
> +
>  /* Compare desired tunnels against those currently in the
> database. */
>  uint32_t cur_tunnels = 0;
>  bool same = true;
> @@ -145,9 +166,12 @@ chassis_run(struct controller_ctx *ctx, const
> char *chassis_id)
>chassis_id);
>
>  if (!chassis_rec) {
> +struct smap ext_ids = SMAP_CONST1(_ids,
"ovn-bridge-mappings",
> +  bridge_mappings);
>  chassis_rec = sbrec_chassis_insert(ctx->ovnsb_idl_txn);
>  sbrec_chassis_set_name(chassis_rec, chassis_id);
>  sbrec_chassis_set_hostname(chassis_rec, hostname);
> +sbrec_chassis_set_external_ids(chassis_rec, _ids);
>  }
>
>  int n_encaps = count_1bits(req_tunnels);
> diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
> index d68f3f6..321bf5b 100644
> --- a/ovn/ovn-sb.xml
> +++ b/ovn/ovn-sb.xml
> @@ -172,6 +172,13 @@
>ovn-controller-vtep will leave this column empty.
>  
>
> +
> +  ovn-controller populates this key with the set of
bridge
> +  mappings it has been configured to use.  Other applications
> should treat
> +  this key as read-only.  See ovn-controller(8) for
more
> +  information.
> +
> +
>  
>The overall purpose of these columns is described under
Common
>Columns at the beginning of this document.
> --
> 2.5.5
>
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath-windows: Remove unnecessary keylen computation in Flow.c

2016-03-29 Thread Nithin Raju
Let¹s go in the order in which the various fields in OvsFlowKey are laid
out. Let¹s setup ŒdestKey->recircId¹ and ŒdestKey->dpHash¹ after the L2
headers.

Also, you want to combine this with the fix in the other patch during hash
computation. They all go together.


-Original Message-
From: dev  on behalf of Sairam Venugopal

Date: Friday, March 25, 2016 at 11:41 AM
To: "dev@openvswitch.org" 
Subject: [ovs-dev] [PATCH] datapath-windows: Remove unnecessary
keylen  computation in Flow.c

>destKey->l2.keylen gets reset after this line. This change
>doesn't help with hash computation and can be removed. This is what
>destKey->l2.keyLen gets set to:
>
>destKey->l2.keyLen = OVS_WIN_TUNNEL_KEY_SIZE + OVS_L2_KEY_SIZE
> - destKey->l2.offset;
>
>Signed-off-by: Sairam Venugopal 
>---
> datapath-windows/ovsext/Flow.c | 2 --
> 1 file changed, 2 deletions(-)
>
>diff --git a/datapath-windows/ovsext/Flow.c
>b/datapath-windows/ovsext/Flow.c
>index d49697c..004c54a 100644
>--- a/datapath-windows/ovsext/Flow.c
>+++ b/datapath-windows/ovsext/Flow.c
>@@ -1382,12 +1382,10 @@ _MapKeyAttrToFlowPut(PNL_ATTR *keyAttrs,
> 
> if (keyAttrs[OVS_KEY_ATTR_RECIRC_ID]) {
> destKey->recircId =
>NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_RECIRC_ID]);
>-destKey->l2.keyLen += sizeof(destKey->recircId);
> }
> 
> if (keyAttrs[OVS_KEY_ATTR_DP_HASH]) {
> destKey->dpHash = NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_DP_HASH]);
>-destKey->l2.keyLen += sizeof(destKey->dpHash);
> }
> 
> /* = L2 headers = */
>-- 
>2.5.0.windows.1
>
>___
>dev mailing list
>dev@openvswitch.org
>https://urldefense.proofpoint.com/v2/url?u=http-3A__openvswitch.org_mailma
>n_listinfo_dev=BQIGaQ=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs=pN
>HQcdr7B40b4h6Yb7FIedI1dnBsxdDuTLBYD3JqV80=oRGQX3o3i3alWuYX8DM0Rq2_ReeDJO
>4aL_3CkyeAp40=89cOfCUZyjbS3g8Nk_phUFg5pgZHZKTGNV2rkS550zc= 

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] datapath-windows: Update Recirculation to use the right parameters

2016-03-29 Thread Nithin Raju
Thanks for sending out the fix.

I¹m ok with one of the fixes. The other one needs to be reworked to use
one of the approaches consistently. Pls. see inlined comments.

-- Nithin

-Original Message-
From: dev  on behalf of Sairam Venugopal

Date: Friday, March 25, 2016 at 11:06 AM
To: "dev@openvswitch.org" 
Subject: [ovs-dev] [PATCH v2] datapath-windows: Update Recirculation to
use the right parameters

>Update OvsLookupFlow() to include flowKey->RecircId when computing hash.
>Use the right source port Id for checking if a packet is received or
>transmitted.
>
>Signed-off-by: Sairam Venugopal 
>---
> datapath-windows/ovsext/Actions.c | 2 +-
> datapath-windows/ovsext/Flow.c| 3 +++
> 2 files changed, 4 insertions(+), 1 deletion(-)
>
>diff --git a/datapath-windows/ovsext/Actions.c
>b/datapath-windows/ovsext/Actions.c
>index a91454d..7742096 100644
>--- a/datapath-windows/ovsext/Actions.c
>+++ b/datapath-windows/ovsext/Actions.c
>@@ -2015,7 +2015,7 @@ OvsDoRecirc(POVS_SWITCH_CONTEXT switchContext,
> }
> status = OvsCreateAndAddPackets(NULL, 0, OVS_PACKET_CMD_MISS,
> vport, key, ovsFwdCtx.curNbl,
>-srcPortNo ==
>+ 
>ovsFwdCtx.fwdDetail->SourcePortId ==
>  
>switchContext->virtualExternalPortId,

This comparison determines the value for ŒisRecv¹ parameter. ŒisRecv¹
determines whether the OOB data should be interpreted as receive data or
send data. So, the existing code is checking for:
srcPortNo == switchContext->virtualExternalPortId

Left side data is a UINT32 and right side data is a NDIS_SWITCH_PORT_ID.
So, clearly this comparison is not right since we are comparing different
types. They are not expected to have the same value even if they are
representing the same vport.

The proposed fix at least makes sure that we are comparing the right
types. So, I¹d go with it. Is the comparison right? It seems so. Basically
we want to determine if the packet came from a VIF or a physical NIC.
ŒfwdDetail¹ is a reliable way of doing it. Only way this could mess up is
if we mean to explicitly change the ŒsrcPortNo¹ during tunneling. In such
cases, the 'fwdDetail->SourcePortId¹ will end up different from the
ŒsrcPortNo¹. This is exactly why we use ŒsrcPortNo¹ for comparison around
the code to allow flexibility.


In any case, the problematic case here is tunneling + recirc. We can deal
with that in a subsequent patch. For now, this is the right fix.

I¹d also add a XXX comment to investigate that "tunneling + recirc² case
in the future.

> ,
> ovsFwdCtx.switchContext,
>diff --git a/datapath-windows/ovsext/Flow.c
>b/datapath-windows/ovsext/Flow.c
>index 02c41b7..d49697c 100644
>--- a/datapath-windows/ovsext/Flow.c
>+++ b/datapath-windows/ovsext/Flow.c
>@@ -2133,6 +2133,9 @@ OvsLookupFlow(OVS_DATAPATH *datapath,
> 
> if (!hashValid) {
> *hash = OvsJhashBytes(start, size, 0);
>+if (key->recircId) {
>+*hash = OvsJhashWords((UINT32*)hash, 1, key->recircId);
>+}

Ok, we have two options:
1. Increment ŒkeyLen¹ and include recircId as part of it. So, while
hashing we make sure that recircId gets included.
2. Explicitly add Œkey->recirId¹ during hashing.

I¹m ok with either way. The ŒkeyLen¹ approach is more efficient for now,
since it avoids a Œif¹ check and also an additional function call. In the
future, if we have a lot of meta data such as ŒrecircId¹, then we should
consider adding a Œkey->metaDataLen¹ and go from there. For now, I¹d add a
comment in ŒOvsFlowKey¹ to set future direction, and go with the ŒkeyLen¹
approach.

Of course, there¹s a bug when ŒkeyLen¹ gets set in _MapKeyAttrToFlowPut().
The value gets incremented due to ŒrecircId¹ but gets reset again.

> }
> 
> head = >flowTable[HASH_BUCKET(*hash)];

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn-northd: Fix peering of routers.

2016-03-29 Thread Ryan Moats
Acked-by: Ryan Moats 

"dev"  wrote on 03/28/2016 04:31:41 PM:

> From: Gurucharan Shetty 
> To: dev@openvswitch.org
> Cc: Gurucharan Shetty 
> Date: 03/28/2016 04:48 PM
> Subject: [ovs-dev] [PATCH] ovn-northd: Fix peering of routers.
> Sent by: "dev" 
>
> 1. Currently, the ovn-nb man page says that the 'peer'
> in a logical_router_port table should point to the name
> of the peer's logical router port. But the schema had declared
> this column as a uuid. This looks not to be the intention as peers
> for logical switches connected to routers is a name (and not a uuid).
> So this patch changes the schema to be name.
>
> 2. In the southbound database, in the port_binding table, for a
> logical_router_port, the peer was pointing back to itself. This
> was causing ovn-controller to create patch ports where the peer
> was wrongly pointing back to the source itself. This clearly looks
> to be an error. So this patch fixes the peer in southbound database
> to correclty point to the real peer.
>
> 3. ovn-northd.c currently skips generating logical flows to transfer
> packets between two peers with comment about needing 'ARP for
> neighboring routers'. It looked to me that since the router peer
> is a logical object that has to be created in OVN-NB database, we
> always need to statically assign the mac address. So this patch
> picks the mac address from the database.
>
> Signed-off-by: Gurucharan Shetty 
> ---
>  ovn/northd/ovn-northd.c |   28 -
>  ovn/ovn-nb.ovsschema|9 +--
>  tests/ovn.at|  155 
> +++
>  3 files changed, 184 insertions(+), 8 deletions(-)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC] ovn-controller: Optimize processing for non-local datapath without patch ports.

2016-03-29 Thread Ryan Moats
Acked-by: Ryan Moats 

"dev"  wrote on 03/28/2016 02:10:35 AM:

> From: Han Zhou 
> To: dev@openvswitch.org
> Date: 03/28/2016 02:11 AM
> Subject: [ovs-dev] [PATCH RFC] ovn-controller: Optimize processing
> for non-local datapath without patch ports.
> Sent by: "dev" 
>
> For non-local datapaths, if there are no patch ports attached, it
> means the lflows and port bindings would never be needed on the
> Chassis. Skipping the processing for such lflows and port bindings
> can save significant amount of CPU, at the same time largely reduce
> the number of rules in local openflow tables.
>
> Signed-off-by: Han Zhou 
> ---
>  ovn/controller/lflow.c  | 38 ++
+---
>  ovn/controller/lflow.h  |  3 ++-
>  ovn/controller/ovn-controller.c | 16 +---
>  ovn/controller/ovn-controller.h |  6 ++
>  ovn/controller/patch.c  | 22 +++---
>  ovn/controller/patch.h  |  2 +-
>  ovn/controller/physical.c   | 34 +-
>  ovn/controller/physical.h   |  2 +-
>  8 files changed, 102 insertions(+), 21 deletions(-)
>
> diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
> index 0614a54..be10d18 100644
> --- a/ovn/controller/lflow.c
> +++ b/ovn/controller/lflow.c
> @@ -198,6 +198,7 @@ static void
>  add_logical_flows(struct controller_ctx *ctx, const struct
> lport_index *lports,
>const struct mcgroup_index *mcgroups,
>const struct hmap *local_datapaths,
> +  const struct hmap *patched_datapaths,
>const struct simap *ct_zones, struct hmap *flow_table)
>  {
>  uint32_t conj_id_ofs = 1;
> @@ -211,17 +212,18 @@ add_logical_flows(struct controller_ctx *ctx,
> const struct lport_index *lports,
>  if (!ldp) {
>  continue;
>  }
> -if (!ingress && is_switch(ldp)) {
> +if (is_switch(ldp)) {
>  /* For a logical switch datapath, local_datapaths tells
> us if there
> - * are any local ports for this datapath.  If not,
processing
> - * logical flows for the egress pipeline of this datapath is
> - * unnecessary.
> + * are any local ports for this datapath.  If not, we can
skip
> + * processing logical flows if the flow belongs to
> egress pipeline
> + * or if that logical switch datapath is not patched to
> any logical
> + * router.
>   *
> - * We still need the ingress pipeline because even if
> there are no
> - * local ports, we still may need to execute the ingress
pipeline
> - * after a packet leaves a logical router.  Further
optimization
> - * is possible, but not based on what we know with
> local_datapaths
> - * right now.
> + * Otherwise, we still need the ingress pipeline because
even if
> + * there are no local ports, we still may need to
> execute the ingress
> + * pipeline after a packet leaves a logical router.  Further
> + * optimization is possible, but not based on what we know
with
> + * local_datapaths right now.
>   *
>   * A better approach would be a kind of "flood fill"
algorithm:
>   *
> @@ -231,12 +233,25 @@ add_logical_flows(struct controller_ctx *ctx,
> const struct lport_index *lports,
>   *   2. For each patch port P in a logical datapath in S,
add the
>   *  logical datapath of the remote end of P to S.
Iterate
>   *  until S reaches a fixed point.
> + *
> + * This can be implemented in northd, which can
> generate the sets and
> + * save it on each port-binding record in SB, and ovn-
> controller can
> + * use the information directly. However, there can be
> update storms
> + * when a pair of patch ports are added/removed to
> connect/disconnect
> + * large lrouters and lswitches. This need to be studied
further.
>   */
>
>  struct hmap_node *ld;
>  ld = hmap_first_with_hash(local_datapaths, ldp->tunnel_key);
>  if (!ld) {
> -continue;
> +if (!ingress) {
> +continue;
> +}
> +struct hmap_node *pd;
> +pd = hmap_first_with_hash(patched_datapaths,
> ldp->tunnel_key);
> +if (!pd) {
> +continue;
> +}
>  }
>  }
>
> @@ -416,10 +431,11 @@ void
>  lflow_run(struct controller_ctx *ctx, const struct lport_index *lports,
>const struct mcgroup_index *mcgroups,
>const struct hmap *local_datapaths,
> +  const struct hmap *patched_datapaths,

Re: [ovs-dev] [PATCH] dpif-netdev: Remove PMD latency on seq_mutex

2016-03-29 Thread Karl Rister
On 03/29/2016 08:08 AM, Flavio Leitner wrote:
> On Tue, Mar 29, 2016 at 02:13:18AM +, Daniele Di Proietto wrote:
>> Hi Flavio and Karl,
>>
>> thanks for the patch! I have a couple of comments:
>>
>> Can you point out a configuration where this is the bottleneck?
>> I'm interested in reproducing this.
> 
> Karl, since you did the tests, could you please provide more details?

When performing packet forwarding latency tests, I first noticed system
and idle time when looking at CPU statistics when I expected the PMD
threads to be 100% in userspace.  I used the kernel ftrace facility to
track down what was happening and saw that the PMD thread was being
context switched out and going idle.  The PMD thread was pinned to CPU
core thread isolated with isolcpus so there was no competing task that
could be scheduled to cause the context switch and I would not expect
the polling thread to ever go idle.  Further analysis with frace and gdb
tracked the cause to seq_mutex blocking when another task held the mutex.

I would estimate that this change removed packet latency spikes of 35-45
usecs in our test scenario.

The test is forwarding packets through a KVM guest using OVS+DPDK in the
host and the DPDK testpmd application in the guest.

Flavio, I thought I remembered you also saying that you saw a throughput
improvement in a test you were running?

> 
> 
>> I think the implementation would look simpler if we could
>> avoid explicitly taking the mutex in dpif-netdev and instead
>> having a ovsrcu_try_quiesce(). What do you think?
> 
> My concern is that it is freeing one entry from EMC each round
> and it should quiesce to allow the callbacks to run.  If, for
> some reason, it fails to quiesce for a long period, then it might
> backlog a significant number of entries.

My initial approach, which Flavio's code is very similar to, was simply
trying to provide the simplest change to achieve what I was looking for.
 I could certainly see alternative solutions being more appropriate.

> 
> 
>> I think we can avoid the recursive mutex as well if we introduce
>> some explicit APIs in seq (seq_try_lock, seq_change_protected and
>> seq_unlock), but I'd like to understand the performance implication
>> of this commit first.

One other area of the sequence code that I thought was curious was a
single mutex that covered all sequences.  If updating the API is a
possibility I would think going to a mutex per sequence might be an
appropriate change as well.  That said, I don't have data that
specifically points this out as a problem.

> 
> The issue is the latency spike when the PMD thread blocks on the
> busy mutex.
> 
> The goal with recursive locking is to make sure we can sweep
> the EMC cache and quiesce without blocking.  Fixing seq API
> would help to not block, but then we have no control to whether
> we did both tasks in the same round.
> 
> fbl
> 
> 
>>
>> On 23/03/2016 20:54, "dev on behalf of Flavio Leitner"
>>  wrote:
>>
>>> The PMD thread needs to keep processing RX queues in order
>>> archive maximum throughput.  However, it also needs to run
>>> other tasks after some time processing the RX queues which
>>> a mutex can block the PMD thread.  That causes latency
>>> spikes and affects the throughput.
>>>
>>> Convert to recursive mutex so that PMD thread can test first
>>> and if it gets the lock, continue as before, otherwise try
>>> again another time.  There is an additional logic to make
>>> sure the PMD thread will try harder as the attempt to get
>>> the mutex continues to fail.
>>>
>>> Co-authored-by: Karl Rister 
>>> Signed-off-by: Flavio Leitner 
>>
>> Oh, we're going to need a signoff from Karl as well :-)

Signed-off-by: Karl Rister 

Is this good enough?

>>
>> Thanks,
>>
>> Daniele
>>
>>> ---
>>> include/openvswitch/thread.h |  3 +++
>>> lib/dpif-netdev.c| 33 ++---
>>> lib/seq.c| 15 ++-
>>> lib/seq.h|  3 +++
>>> 4 files changed, 42 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/include/openvswitch/thread.h b/include/openvswitch/thread.h
>>> index af6f2bb..6d20720 100644
>>> --- a/include/openvswitch/thread.h
>>> +++ b/include/openvswitch/thread.h
>>> @@ -44,6 +44,9 @@ struct OVS_LOCKABLE ovs_mutex {
>>> #define OVS_ADAPTIVE_MUTEX_INITIALIZER OVS_MUTEX_INITIALIZER
>>> #endif
>>>
>>> +#define OVS_RECURSIVE_MUTEX_INITIALIZER \
>>> +   { PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP, "" }
>>> +
>>> /* ovs_mutex functions analogous to pthread_mutex_*() functions.
>>>  *
>>>  * Most of these functions abort the process with an error message on any
>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>> index 0f2385a..a10b2d1 100644
>>> --- a/lib/dpif-netdev.c
>>> +++ b/lib/dpif-netdev.c
>>> @@ -2668,12 +2668,15 @@ static void *
>>> pmd_thread_main(void *f_)
>>> {
>>> struct dp_netdev_pmd_thread *pmd 

Re: [ovs-dev] [ovs-dev, v3] lib/ovs-thread: Add Transactional Memory (TM) support.

2016-03-29 Thread Ryan Moats

 Original Message ---
> v2->v3
> - Pass all test cases
> - Introduce separate lock elision for rwlock
> - Add performance comparison of cmap/hmap search
>
> v1->v2
> - Fix a bug at UNLOCK_ELISION
> - Add checking if glibc version >= 2.21 (OVS_CHECK_GLIBC_TSX)
> - Add checking of whether cpu has TSX support (OVS_CHECK_RTM)
> - Enable LOCK_ELISION only when CPU has TSX and glibc doesn't
>   (if glibc version >= 2.21, then using phtread_mutex has lock elision)
>   - Add 20% mutation test-cmap testcase
>   - List failed testcases below
>
> The patch shows the preliminary results of enabling RTM
> (Restricted Transactional Memory). A successful transactional execution
> elides the lock, i.e., the lock is by-passed, and exposes concurrency.
> However, transactions might abort due to several reasons
> such as data conflicts, I/O operations, syscall, etc. When transaction
aborts,
> it falls back to the original locking mechanisms. Thus, the performance
> improvement depends on the abort rate, and the overhead of speculative
execution
> and rollback.
>
> The patch adds ovs_##ELISION_FUNC at LOCK_FUNCTION, TRY_LOCK_FUNCTION,
and
> UNLOCK_FUNCTION macros, and calls either rwlock_elision or mutex_elision
> accordingly. Experiments show that for cmap, the TM does not seem to have
> observable improvements below 5% mutations, while hmap is more obvious.
> For cmap search over 5% mutation, the search time of TM shows much better
> scalability when % of writers increase.

This patch doesn't apply cleanly - will there be a rebase?

Thanks in advance,
Ryan Moats

Results are shown by using test-cmap benchmark with different mutation %.
$ tests/ovstest test-cmap benchmark 2000 4  1

Unit: ms, number presented as baseline / TM
  cmap_search hmap_search
 0.1%   117/117312/292
   2%   124/120688/328
   5%   142/125   1093/403
  10%   237/132   1588/432
  20%   512/160   2516/679
  40%  1010/277   2927/1007
  60%  1278/324   3313/1119
  80%  1614/343   3392/1291

Signed-off-by: William Tu 
---
 configure.ac  |   2 +
 lib/ovs-thread.c  | 201 +
+
 m4/openvswitch.m4 |  21 ++
 3 files changed, 210 insertions(+), 14 deletions(-)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] AUTHORS: Add Bhargava Shastry and KashyapThimmaraju.

2016-03-29 Thread Ryan Moats
Acked-by: Ryan Moats 

"dev"  wrote on 03/28/2016 09:40:53 PM:

> From: Ben Pfaff 
> To: dev@openvswitch.org
> Cc: Ben Pfaff 
> Date: 03/28/2016 09:42 PM
> Subject: [ovs-dev] [PATCH] AUTHORS: Add Bhargava Shastry and Kashyap
> Thimmaraju.
> Sent by: "dev" 
>
> Bhargava and Kashyap reported vulnerability CVE-2016-2074, which was
> present only on the branches for 2.3 and 2.4 and thus did not require any
> commits on master.  They still deserve credit as bug reporters, so this
> commit provides that.
>
> Signed-off-by: Ben Pfaff 
> ---
>  AUTHORS | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/AUTHORS b/AUTHORS
> index f0d1e23..6ec6456 100644
> --- a/AUTHORS
> +++ b/AUTHORS
> @@ -268,6 +268,7 @@ Atzm Watanabe   a...@stratosphere.co.jp
>  Aurélien Poulain   aurepoul...@viacesi.fr
>  Bastian Blank   wa...@debian.org
>  Ben Basler  bbas...@nicira.com
> +Bhargava Shastrybshas...@sec.t-labs.tu-berlin.de
>  Bob Ballbob.b...@citrix.com
>  Brad Hall   b...@nicira.com
>  Brandon Heller  brand...@stanford.edu
> @@ -333,6 +334,7 @@ John Darrington j...@darrington.wattle.id.au
>  John Galgay j...@galgay.net
>  John Hurley john.hur...@netronome.com
>  John Reumannnofutznetwo...@gmail.com
> +Kashyap Thimmaraju  kashyap.thimmar...@sec.t-labs.tu-berlin.de
>  Keith Holleman  hollemani...@gmail.com
>  K 華k940...@hotmail.com
>  Kevin Mancuso   kevin.manc...@rackspace.com
> --
> 2.1.3
>
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] CONTRIBUTING.md: Describe a new "Vulnerability"tag.

2016-03-29 Thread Ryan Moats
Acked-by: Ryan Moats 

"dev"  wrote on 03/28/2016 09:34:58 PM:

> From: Ben Pfaff 
> To: dev@openvswitch.org
> Cc: Ben Pfaff 
> Date: 03/28/2016 09:35 PM
> Subject: [ovs-dev] [PATCH] CONTRIBUTING.md: Describe a new
> "Vulnerability" tag.
> Sent by: "dev" 
>
> Signed-off-by: Ben Pfaff 
> ---
>  CONTRIBUTING.md | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
> index 439c56a..8fcebba 100644
> --- a/CONTRIBUTING.md
> +++ b/CONTRIBUTING.md
> @@ -250,6 +250,17 @@ Examples of common tags follow.
>
>  git log -1 --pretty=format:"Fixes: %h (\"%s\")"
--abbrev=12COMMIT_REF
>
> +Vulnerability: CVE-2016-2074
> +
> +Specifies that the patch fixes or is otherwise related to a
> +security vulnerability with the given CVE identifier.  Other
> +identifiers in public vulnerability databases are also
> +suitable.
> +
> +If the vulnerability was reported publicly, then it is also
> +appropriate to cite the URL to the report in a Reported-at
> +tag.  Use a Reported-by tag to acknowledge the reporters.
> +
>  Developer's Certificate of Origin
>  -
>
> --
> 2.1.3
>
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] dpif-netdev: Remove PMD latency on seq_mutex

2016-03-29 Thread Flavio Leitner
On Tue, Mar 29, 2016 at 02:13:18AM +, Daniele Di Proietto wrote:
> Hi Flavio and Karl,
> 
> thanks for the patch! I have a couple of comments:
> 
> Can you point out a configuration where this is the bottleneck?
> I'm interested in reproducing this.

Karl, since you did the tests, could you please provide more details?


> I think the implementation would look simpler if we could
> avoid explicitly taking the mutex in dpif-netdev and instead
> having a ovsrcu_try_quiesce(). What do you think?

My concern is that it is freeing one entry from EMC each round
and it should quiesce to allow the callbacks to run.  If, for
some reason, it fails to quiesce for a long period, then it might
backlog a significant number of entries.


> I think we can avoid the recursive mutex as well if we introduce
> some explicit APIs in seq (seq_try_lock, seq_change_protected and
> seq_unlock), but I'd like to understand the performance implication
> of this commit first.

The issue is the latency spike when the PMD thread blocks on the
busy mutex.

The goal with recursive locking is to make sure we can sweep
the EMC cache and quiesce without blocking.  Fixing seq API
would help to not block, but then we have no control to whether
we did both tasks in the same round.

fbl


> 
> On 23/03/2016 20:54, "dev on behalf of Flavio Leitner"
>  wrote:
> 
> >The PMD thread needs to keep processing RX queues in order
> >archive maximum throughput.  However, it also needs to run
> >other tasks after some time processing the RX queues which
> >a mutex can block the PMD thread.  That causes latency
> >spikes and affects the throughput.
> >
> >Convert to recursive mutex so that PMD thread can test first
> >and if it gets the lock, continue as before, otherwise try
> >again another time.  There is an additional logic to make
> >sure the PMD thread will try harder as the attempt to get
> >the mutex continues to fail.
> >
> >Co-authored-by: Karl Rister 
> >Signed-off-by: Flavio Leitner 
> 
> Oh, we're going to need a signoff from Karl as well :-)
> 
> Thanks,
> 
> Daniele
> 
> >---
> > include/openvswitch/thread.h |  3 +++
> > lib/dpif-netdev.c| 33 ++---
> > lib/seq.c| 15 ++-
> > lib/seq.h|  3 +++
> > 4 files changed, 42 insertions(+), 12 deletions(-)
> >
> >diff --git a/include/openvswitch/thread.h b/include/openvswitch/thread.h
> >index af6f2bb..6d20720 100644
> >--- a/include/openvswitch/thread.h
> >+++ b/include/openvswitch/thread.h
> >@@ -44,6 +44,9 @@ struct OVS_LOCKABLE ovs_mutex {
> > #define OVS_ADAPTIVE_MUTEX_INITIALIZER OVS_MUTEX_INITIALIZER
> > #endif
> > 
> >+#define OVS_RECURSIVE_MUTEX_INITIALIZER \
> >+   { PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP, "" }
> >+
> > /* ovs_mutex functions analogous to pthread_mutex_*() functions.
> >  *
> >  * Most of these functions abort the process with an error message on any
> >diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> >index 0f2385a..a10b2d1 100644
> >--- a/lib/dpif-netdev.c
> >+++ b/lib/dpif-netdev.c
> >@@ -2668,12 +2668,15 @@ static void *
> > pmd_thread_main(void *f_)
> > {
> > struct dp_netdev_pmd_thread *pmd = f_;
> >-unsigned int lc = 0;
> >+unsigned int lc_max = 1024;
> >+unsigned int lc_start;
> >+unsigned int lc;
> > struct rxq_poll *poll_list;
> > unsigned int port_seq = PMD_INITIAL_SEQ;
> > int poll_cnt;
> > int i;
> > 
> >+lc_start = 0;
> > poll_cnt = 0;
> > poll_list = NULL;
> > 
> >@@ -2698,24 +2701,32 @@ reload:
> >  * reloading the updated configuration. */
> > dp_netdev_pmd_reload_done(pmd);
> > 
> >+lc = lc_start;
> > for (;;) {
> > for (i = 0; i < poll_cnt; i++) {
> > dp_netdev_process_rxq_port(pmd, poll_list[i].port,
> >poll_list[i].rx);
> > }
> > 
> >-if (lc++ > 1024) {
> >-unsigned int seq;
> >+if (lc++ > lc_max) {
> >+if (!seq_pmd_trylock()) {
> >+unsigned int seq;
> >+lc_start = 0;
> >+lc = 0;
> > 
> >-lc = 0;
> >+emc_cache_slow_sweep(>flow_cache);
> >+coverage_try_clear();
> >+ovsrcu_quiesce();
> > 
> >-emc_cache_slow_sweep(>flow_cache);
> >-coverage_try_clear();
> >-ovsrcu_quiesce();
> >+seq_pmd_unlock();
> > 
> >-atomic_read_relaxed(>change_seq, );
> >-if (seq != port_seq) {
> >-port_seq = seq;
> >-break;
> >+atomic_read_relaxed(>change_seq, );
> >+if (seq != port_seq) {
> >+port_seq = seq;
> >+break;
> >+}
> >+} else {
> >+lc_start += (lc_start + lc_max)/2;
> >+lc = lc_start;
> > }
> >   

[ovs-dev] Get Rich

2016-03-29 Thread dev
You can make good money!Follow the instructions on the next page -->> 
www.bestearntrade.com
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 3/3] NEWS: dpdk port hotplug is now supported

2016-03-29 Thread Mauricio Vásquez
Right, I'll wait for a review and then I'll send a V2 with all the patches
in one.

On Mon, Mar 28, 2016 at 11:43 PM, Justin Pettit  wrote:

>
> > On Mar 28, 2016, at 1:52 AM, Mauricio Vasquez B <
> mauricio.vasquezber...@studenti.polito.it> wrote:
> >
> > Signed-off-by: Mauricio Vasquez B <
> mauricio.vasquezber...@studenti.polito.it>
> > ---
> > NEWS | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/NEWS b/NEWS
> > index ea7f3a1..2ba8659 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -26,6 +26,7 @@ Post-v2.5.0
> >assignment.
> >  * Type of log messages from PMD threads changed from INFO to DBG.
> >  * QoS functionality with sample egress-policer implementation.
> > + * Port Hotplug is now supported.
> >- ovs-benchmark: This utility has been removed due to lack of use and
> >  bitrot.
> >- ovs-appctl:
>
> This isn't a full review, but we normally add documentation, introduce
> tests, update NEWS alongside the code that adds it.
>
> --Justin
>
>
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [CudaMailTagged] U.S. Manufacturer seeks Distributors Worldwide

2016-03-29 Thread Tony Williams

Distributors Needed Worldwide

27 year old U.S. Company with proven floor safety products has openings
for exclusive distributors in several countries.

One application last for a minimum of 4 years - Guaranteed!

Indoors or Outdoors

For use on Tile - Marble - Granite - Terrazzo - Concrete

Does not change the appearance of the floor

Easy to apply - No training necessary

Currently used by McDonalds- Burger King - KFC - Sheraton - Holiday Inns -
Pfizer - Mercedes - BMW, etc.

To learn more please send us your Name and Country. Complete details will
be sent to you, upon receipt.

Fully Secured Inventory Investment of $3,000 - $5,000 USD required
Tony Williams
Email: kamite...@sina.com

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v3] netdev-dpdk: vhost: Fix txq enabling in the absence of notifications.

2016-03-29 Thread Flavio Leitner
On Tue, Mar 29, 2016 at 09:20:41AM +0300, Ilya Maximets wrote:
> According to QEMU documentation (docs/specs/vhost-user.txt) one queue
> should be enabled initially. More queues are enabled dynamically, by
> sending message VHOST_USER_SET_VRING_ENABLE.
> 
> Currently all queues in OVS disabled by default. This breaks above
> specification. So, queue #0 should be enabled by default to support
> QEMU versions less than 2.5 and fix probable issues if QEMU will not
> send VHOST_USER_SET_VRING_ENABLE for queue #0 according to documentation.
> Also this will fix currently broken vhost-cuse support in OVS.
> 
> Fixes: 585a5beaa2a4 ("netdev-dpdk: vhost-user: Fix sending packets to
>   queues not enabled by guest.")
> Reported-by: Mauricio Vasquez B 
> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Flavio Leitner 


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Returned mail: Data format error

2016-03-29 Thread The Post Office
The original message was included as attachment

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Get Rich With Social Media NOW!

2016-03-29 Thread dev

Hi there,

This is the most importantday of your life. Why?Because you are going to make 
at least $1650!

How do I know? Because I did exactly the same thing last month using this incredible 
system. >>GoThere Now

It has completely changed my life and nowIm spreading the good news.
Get your FREE access from me now, beforeall the places are snapped up!

Enjoy!
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] Requested receipt ID:BB1149

2016-03-29 Thread Benjamin Shelton
Dear dev,

Please find attached your receipt, sent as requested.

We are making improvements to our billing systems to help serve you better and 
because of that the attached invoice will look different from your previous 
ones.

Kind regards,
Benjamin Shelton
Financial Director - Multinational Group
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [RFC PATCH] tunneling: Improving vxlan performance using DPDK flow director feature.

2016-03-29 Thread Chandran, Sugesh


Regards
_Sugesh

> -Original Message-
> From: Jesse Gross [mailto:je...@kernel.org]
> Sent: Friday, March 25, 2016 12:38 AM
> To: Chandran, Sugesh 
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [RFC PATCH] tunneling: Improving vxlan performance
> using DPDK flow director feature.
> 
> On Fri, Mar 18, 2016 at 8:50 AM, Chandran, Sugesh
>  wrote:
> > Hi Jesse,
> > Please find my answers inline.
> >
> > Regards
> > _Sugesh
> >
> >
> >> -Original Message-
> >> From: Jesse Gross [mailto:je...@kernel.org]
> >> Sent: Thursday, March 17, 2016 11:50 PM
> >> To: Chandran, Sugesh 
> >> Cc: dev@openvswitch.org
> >> Subject: Re: [ovs-dev] [RFC PATCH] tunneling: Improving vxlan
> >> performance using DPDK flow director feature.
> >>
> >> On Thu, Mar 17, 2016 at 3:43 PM, Chandran, Sugesh
> >>  wrote:
> >> > Hi,
> >> >
> >> > This patch proposes an approach that uses Flow director feature on
> >> > the
> >> Intel Fortville NICs to boost the VxLAN tunneling performance. In our
> >> testing we verified that the VxLAN performance is almost doubled with
> this patch.
> >> > The solution programs the NIC to report the flow ID along with the
> >> > VxLAN
> >> packets, and it is matched by OVS in software. There may be corner
> >> cases that needs to addressed in the approach, For eg:  There is a
> >> possibility of race condition where NIC reports flow ID that may
> >> match on different flow in OVS. This happen when a rule is evicted by
> >> a new rule with same flowID+ hash in the OVS software. The packets
> >> may hit on wrong new rule in OVS until the flow get deleted in the
> hardware too.
> >> >
> >> > It is a hardware specific implementation (Only work with Intel
> >> > Fortville
> >> NICs) for now, however the proposal works with any programmable
> >> NICs.This RFC proves that the OVS can offer very high speed tunneling
> >> performance using flow programmability in NICs. I am looking for
> >> comments/suggestions on adding this support(such as configuring,
> >> enable it for all the programmable NICs and etc) in OVS userspace
> >> datapath for improving the performance.
> >>
> >> This is definitely very interesting to see. Can you post some more
> >> specific performance numbers?
> > [Sugesh]
> > VxLAN DECAP performance(Unidirectional, Single flow, Single CPU Core)
> > ---
> > PKT-IN - 9.3 Mpps
> > Pkt size - 114 byte VxLAN Packets(64 byte payload) PKT-OUT - 5.6 Mpps(
> > Without Optimization) PKT-OUT - 9.3 Mpps(After the optimization, It
> > hits the Input Line rate)
> >
> > VxLAN ENCAP-DECAP performance (Bidirectional, Single CPU Core)
> > --
> > --- PKT-IN - 9.3 Mpps, PKT SIZE - 114 Byte VxLAN Packets (64
> > Byte payload) --> PKT-IN - 14 Mpps, PKT SIZE - 64 Byte UDP packets <--
> >
> > PKT-OUT - 3.6 Mpps(Without Optimization) PKT-OUT - 5.3 Mpps(Using the
> > patch)
> 
> Thanks, that is interesting to see, particularly for a gateway-type use case
> where an appliance is translating between encapsulated and non-
> encapsulated packets.
> 
> >> Is this really specific to VXLAN? I'm sure that it could be
> >> generalized to other tunneling protocols (Geneve would be nice given
> >> that OVN is using it and I know Fortville supports it). But shouldn't
> >> it apply to non-tunneled traffic as well?
> > Yes, this can be applied for any tunneling protocol provided the NIC
> > hardware is programmed to handle those packets.
> > We haven’t tested it for non-tunneled packets. The performance
> > improvement on non-tunneled packets are subjective due to the fact
> > that there is a limitation on number of hardware flows(8K on FVL), and
> > software still has to spend cycles on matching the flow IDs reported
> > by hardware.  This improves the tunneling performance in all the cases,
> because it tunnel packets needs two lookup than one.
> 
> Looking at the code some more, I think there are basically two sources of
> optimization here:
>  * Accelerating the EMC by avoiding netdev_flow_key_equal_mf() on the
> assumption that the rule you've installed points exactly to the correct flow.
> However, I don't think this is legal because the flows that you are
> programming the hardware with don't capture the full set of values in an OVS
> flow. For example, in the case of tunnels, there is no match on DMAC.

[Sugesh] We can program hardware to match on all the fields that we want , 
including the tunnel fields in the outer header.

>  * Chaining together the multiple lookups used by tunnels on the assumption
> that the outer VXLAN source port distinguishes the inner flow. This would
> allow avoiding netdev_flow_key_equal_mf() a second time. This is definitely
> not legal because the VXLAN source port is only capturing a small subset of
> the total data that OVS is 

[ovs-dev] Get Rich

2016-03-29 Thread dev

Hi there,

This is the most importantday of your life. Why?Because you are going to make 
at least $1650!

How do I know? Because I did exactly the same thing last month using this incredible 
system. >>GoThere Now

It has completely changed my life and nowIm spreading the good news.
Get your FREE access from me now, beforeall the places are snapped up!

Enjoy!
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3] netdev-dpdk: vhost: Fix txq enabling in the absence of notifications.

2016-03-29 Thread Ilya Maximets
According to QEMU documentation (docs/specs/vhost-user.txt) one queue
should be enabled initially. More queues are enabled dynamically, by
sending message VHOST_USER_SET_VRING_ENABLE.

Currently all queues in OVS disabled by default. This breaks above
specification. So, queue #0 should be enabled by default to support
QEMU versions less than 2.5 and fix probable issues if QEMU will not
send VHOST_USER_SET_VRING_ENABLE for queue #0 according to documentation.
Also this will fix currently broken vhost-cuse support in OVS.

Fixes: 585a5beaa2a4 ("netdev-dpdk: vhost-user: Fix sending packets to
  queues not enabled by guest.")
Reported-by: Mauricio Vasquez B 
Signed-off-by: Ilya Maximets 
---

version 3:
* Fixed qid checking in __netdev_dpdk_vhost_send()

version 2:
* Fixed initialization in netdev_dpdk_alloc_txq().
* Clearing moved to separate function.

 lib/netdev-dpdk.c | 28 
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 7c4cd07..8eea788 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -103,6 +103,9 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / 
ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF))
 #define NIC_PORT_TX_Q_SIZE 2048  /* Size of Physical NIC TX Queue, Max 
(n+32<=4096)*/
 
 #define OVS_VHOST_MAX_QUEUE_NUM 1024  /* Maximum number of vHost TX queues. */
+#define OVS_VHOST_QUEUE_MAP_UNKNOWN (-1) /* Mapping not initialized. */
+#define OVS_VHOST_QUEUE_DISABLED(-2) /* Queue was disabled by guest and not
+  * yet mapped to another queue. */
 
 static char *cuse_dev_name = NULL;/* Character device cuse_dev_name. */
 static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
@@ -671,7 +674,7 @@ netdev_dpdk_alloc_txq(struct netdev_dpdk *netdev, unsigned 
int n_txqs)
 }
 
 /* Initialize map for vhost devices. */
-netdev->tx_q[i].map = -1;
+netdev->tx_q[i].map = OVS_VHOST_QUEUE_MAP_UNKNOWN;
 rte_spinlock_init(>tx_q[i].tx_lock);
 }
 }
@@ -1265,7 +1268,7 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
 
 qid = vhost_dev->tx_q[qid % vhost_dev->real_n_txq].map;
 
-if (OVS_UNLIKELY(!is_vhost_running(virtio_dev) || qid == -1)) {
+if (OVS_UNLIKELY(!is_vhost_running(virtio_dev) || qid < 0)) {
 rte_spinlock_lock(_dev->stats_lock);
 vhost_dev->stats.tx_dropped+= cnt;
 rte_spinlock_unlock(_dev->stats_lock);
@@ -2019,7 +2022,7 @@ netdev_dpdk_remap_txqs(struct netdev_dpdk *netdev)
 }
 
 if (n_enabled == 0 && total_txqs != 0) {
-enabled_queues[0] = -1;
+enabled_queues[0] = OVS_VHOST_QUEUE_DISABLED;
 n_enabled = 1;
 }
 
@@ -2056,6 +2059,10 @@ netdev_dpdk_vhost_set_queues(struct netdev_dpdk *netdev, 
struct virtio_net *dev)
 netdev->real_n_rxq = qp_num;
 netdev->real_n_txq = qp_num;
 netdev->txq_needs_locking = true;
+/* Enable TX queue 0 by default if it wasn't disabled. */
+if (netdev->tx_q[0].map == OVS_VHOST_QUEUE_MAP_UNKNOWN) {
+netdev->tx_q[0].map = 0;
+}
 
 netdev_dpdk_remap_txqs(netdev);
 
@@ -2104,6 +2111,18 @@ new_device(struct virtio_net *dev)
 return 0;
 }
 
+/* Clears mapping for all available queues of vhost interface. */
+static void
+netdev_dpdk_txq_map_clear(struct netdev_dpdk *dev)
+OVS_REQUIRES(dev->mutex)
+{
+int i;
+
+for (i = 0; i < dev->real_n_txq; i++) {
+dev->tx_q[i].map = OVS_VHOST_QUEUE_MAP_UNKNOWN;
+}
+}
+
 /*
  * Remove a virtio-net device from the specific vhost port.  Use dev->remove
  * flag to stop any more packets from being sent or received to/from a VM and
@@ -2123,6 +2142,7 @@ destroy_device(volatile struct virtio_net *dev)
 ovs_mutex_lock(_dev->mutex);
 dev->flags &= ~VIRTIO_DEV_RUNNING;
 ovsrcu_set(_dev->virtio_dev, NULL);
+netdev_dpdk_txq_map_clear(vhost_dev);
 exists = true;
 ovs_mutex_unlock(_dev->mutex);
 break;
@@ -2169,7 +2189,7 @@ vring_state_changed(struct virtio_net *dev, uint16_t 
queue_id, int enable)
 if (enable) {
 vhost_dev->tx_q[qid].map = qid;
 } else {
-vhost_dev->tx_q[qid].map = -1;
+vhost_dev->tx_q[qid].map = OVS_VHOST_QUEUE_DISABLED;
 }
 netdev_dpdk_remap_txqs(vhost_dev);
 exists = true;
-- 
2.5.0

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v10 0/6] Convert DPDK configuration from command line to DB based

2016-03-29 Thread Wojciechowicz, RobertX
Hi Aaron,

just gentle reminder

> -Original Message-
> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Aaron
> Conole
> Sent: Monday, March 28, 2016 8:55 PM
> To: dev@openvswitch.org
> Cc: Flavio Leitner 
> Subject: Re: [ovs-dev] [PATCH v10 0/6] Convert DPDK configuration from
> command line to DB based
> 
> Hi (and apologies if the top posting is inappropriate),
> 
> Don't want to be a pest, but just pinging re: this series. What work
> remains? I want to try and close this out to do some additional
> vhostuser config work, so anything that might be gating this please let
> me know and I'll work on it.
> 

Please remember to add "vhost-sock-dir" to the database,
even if there will be used the default directory (no command line value).

> Thanks,
> -Aaron
> 

Thanks,
Robert

> Aaron Conole  writes:
> > Currently, configuration of DPDK parameters is done via the command line
> > through a --dpdk **OPTIONS** -- command line argument. This has a
> number of
> > challenges, including:
> > * It must be the first option passed to ovs-vswitchd
> > * It is the only datapath feature in OVS to be configured on the command
> line
> > * It requires specialized knowledge of sub-component command switches
> > * It also inteprets non-EAL arguments (confusing users)
> > * It is a broken model.
> >
> >
> > This series brings the following changes to openvswitch:
> > * All DPDK options are taken from the ovs database rather than the
> >   command line
> > * Non-EAL arguments also have separate database entries
> > * DPDK lcores are optionally auto-assigned to a single core based on the
> >   bridge coremask.
> > * DPDK options have default behaviors
> > * Updated documentation
> >
> > v2:
> > * Dropped the vhost-user socket configuration options. Those can be re-
> added
> >   as an extension
> > * Incorporated feedback from Kevin Traynor.
> >
> > v3:
> > * Went back to a global dpdk-init
> > * Language cleanup and various minor fixes
> >
> > v4:
> > * Added a way to pass arbitrary eal arguments
> >
> > v5:
> > * Restore the socket-mem default, and fix up the ovs-dev.py script, along
> >   with the manpage for ovsdb-server
> >
> > v6:
> > * Correct a documentation issue with INSTALL.DPDK.md
> > * Correct a non-dpdk enabled OVS incorrect warning variable
> > * Remove an excess whitespace
> >
> > v7:
> > * After testing by Christian with dpdk-alloc-mem
> >
> > v8:
> > * Confirmed ``make check`` operation with and without dpdk.
> >   Retested on live-host
> >
> > v9:
> > * Cleanup of comments
> > * Cleanup of one place where headers are specified
> > * Mark the dpdk coremask and numa config as optional
> > * Added 5/6 to scan the extras and warn the user when conflicting
> >   DB entries are present
> > * Acks given for all but patch 5/6
> >
> > v10:
> > * Rebased against latest upstream
> > * ACK or Tested-by for all patches
> > * Code cleanup on patch 2/6 (vhost-cuse warning)
> > * DB options documentation cleanup.
> >
> > Aaron Conole (6):
> >   netdev-dpdk: Restore thread affinity after DPDK init
> >   netdev-dpdk: Convert initialization from cmdline to db
> >   netdev-dpdk: Autofill lcore coremask if absent
> >   netdev-dpdk: Allow arbitrary eal arguments
> >   netdev-dpdk: Check dpdk-extra when reading db
> >   NEWS: Announce the DPDK EAL configuration change
> >
> >  FAQ.md |   6 +-
> >  INSTALL.DPDK.md|  86 +++---
> >  NEWS   |   3 +
> >  lib/netdev-dpdk.c  | 409
> +
> >  lib/netdev-dpdk.h  |  22 ++-
> >  tests/ofproto-macros.at|   3 +-
> >  utilities/ovs-dev.py   |   7 +-
> >  vswitchd/bridge.c  |   3 +
> >  vswitchd/ovs-vswitchd.8.in |   5 +-
> >  vswitchd/ovs-vswitchd.c|  25 +--
> >  vswitchd/vswitch.xml   | 142 +++-
> >  11 files changed, 581 insertions(+), 130 deletions(-)
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
--
Intel Research and Development Ireland Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263


This e-mail and any attachments may contain confidential material for the sole
use of the intended recipient(s). Any review or distribution by others is
strictly prohibited. If you are not the intended recipient, please contact the
sender and delete all copies.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


  1   2   >