date:20170926

Re: [ovs-dev] [PATCH v3 0/9] OVS-DPDK flow offload with rte_flow

2017-09-26 Thread Darrell Ball



On 9/26/17, 8:27 PM, "Yuanhan Liu"  wrote:

On Wed, Sep 27, 2017 at 03:13:33AM +, Darrell Ball wrote:
> 
> 
> On 9/26/17, 6:25 PM, "Yuanhan Liu"  wrote:
> 
> On Tue, Sep 26, 2017 at 09:58:16PM +, Darrell Ball wrote:
> > The second major change is there is a thread introduced to do 
the real
> > flow offload. This is for diminishing the overhead by hw flow 
offload
> > installation/deletion at data path. See patch 9 for more 
detailed info.
> > 
> > [Darrell] There might be other options to handle this such as 
dividing time
> > to HWOL in a given PMD. 
> > Another option to have PMD isolation.
> 
> Good to know, though I'm not sure I understand it so far. But it can 
be
> discussed later. That's also the reason I put it in the last patch, so
> that we could easily turn it to another solution, or even simpler 
(just
> remove it).
> 
> > In the last discussion, RSS action was suggested to use to get 
rid of
> > the QUEUE action workaround. Unfortunately, it didn't work. The 
flow
> > creation failed with MLX5 PMD driver, and that's the only 
driver in
> > DPDK that supports RSS action so far.
> > 
> > 
> > [Darrell] 
> > I wonder if we should take a pause before jumping into the next 
version of the code.
> 
> I have no objections.
> 
> > If workarounds are required in OVS code, then so be it as long as 
they are not
> > overly disruptive to the existing code and hard to maintain.
> > In the case of RTE_FLOW_ACTION_TYPE_RSS, we might have a reasonable 
option
> > to avoid some unpleasant OVS workarounds.
> > This would make a significant difference in the code paths, if we 
supported it, so
> > we need to be sure as early as possible.
> 
> I agree.
> 
> > The support needed would be in some drivers and seems reasonably 
doable. 
> > Moreover, this was discussed in the last dpdk meeting and the 
support was
> > indicated as existing?, although I only verified the MLX code, 
myself.
> 
> From the code point of view, yes, the code was there months ago.
> 
> > I had seen the MLX code supporting _RSS action and there are some 
checks for
> > supported cases; when you say “it didn't work”, what was the issue ?
> 
> I actually have met the error months ago, I even debugged it. IIRC,
> the error is from libibverbs, when trying to create the hw flow. I
> think I need double-confirm it with our colleague who developed this
> feature.
> 
> > Let us have a discussion also about the Intel nic side and the 
Napatech side.
> 
> I think it's not necessary for Napatech, because they can support
> MARK only action.
> 
> It is not necessary for Napatech; however to avoid the need to detect the 
Napatech
> special (good) case, ‘ideally’ we do the exact same programming even in 
Napatech case.

Will following look okay to you (assuming we have the probe ability and DPDK
has some basic capability feedback)?

if (!pure_mark_cap_probed) {
if (dpdk_rte_flow_has_rss_action_support) {
append_rss_action();
} else {
fail and return;
}
}

/* create flow once, with no retries, if fails, let it fail */
flow = rte_flow_create(...);

I assume that's how the code looks like finally. What do you think?


[Darrell] It looks fine; of course, if we could drop dependencies on cap probe, 
it would be ideal (.  


--yliu

> For Intel, yes, I think Intel folks could give some comments here.
> 
> > It seems reasonable to ask where the disconnect is and whether this 
support
> > can be added and then make a decision based on the answers. 
> 
> No objections.
> 
>   --yliu
> 
> 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3 0/9] OVS-DPDK flow offload with rte_flow

2017-09-26 Thread Yuanhan Liu

On Wed, Sep 27, 2017 at 03:13:33AM +, Darrell Ball wrote:
> 
> 
> On 9/26/17, 6:25 PM, "Yuanhan Liu"  wrote:
> 
> On Tue, Sep 26, 2017 at 09:58:16PM +, Darrell Ball wrote:
> > The second major change is there is a thread introduced to do the 
> real
> > flow offload. This is for diminishing the overhead by hw flow 
> offload
> > installation/deletion at data path. See patch 9 for more detailed 
> info.
> > 
> > [Darrell] There might be other options to handle this such as dividing 
> time
> > to HWOL in a given PMD. 
> > Another option to have PMD isolation.
> 
> Good to know, though I'm not sure I understand it so far. But it can be
> discussed later. That's also the reason I put it in the last patch, so
> that we could easily turn it to another solution, or even simpler (just
> remove it).
> 
> > In the last discussion, RSS action was suggested to use to get rid 
> of
> > the QUEUE action workaround. Unfortunately, it didn't work. The flow
> > creation failed with MLX5 PMD driver, and that's the only driver in
> > DPDK that supports RSS action so far.
> > 
> > 
> > [Darrell] 
> > I wonder if we should take a pause before jumping into the next version 
> of the code.
> 
> I have no objections.
> 
> > If workarounds are required in OVS code, then so be it as long as they 
> are not
> > overly disruptive to the existing code and hard to maintain.
> > In the case of RTE_FLOW_ACTION_TYPE_RSS, we might have a reasonable 
> option
> > to avoid some unpleasant OVS workarounds.
> > This would make a significant difference in the code paths, if we 
> supported it, so
> > we need to be sure as early as possible.
> 
> I agree.
> 
> > The support needed would be in some drivers and seems reasonably 
> doable. 
> > Moreover, this was discussed in the last dpdk meeting and the support 
> was
> > indicated as existing?, although I only verified the MLX code, myself.
> 
> From the code point of view, yes, the code was there months ago.
> 
> > I had seen the MLX code supporting _RSS action and there are some 
> checks for
> > supported cases; when you say “it didn't work”, what was the issue ?
> 
> I actually have met the error months ago, I even debugged it. IIRC,
> the error is from libibverbs, when trying to create the hw flow. I
> think I need double-confirm it with our colleague who developed this
> feature.
> 
> > Let us have a discussion also about the Intel nic side and the Napatech 
> side.
> 
> I think it's not necessary for Napatech, because they can support
> MARK only action.
> 
> It is not necessary for Napatech; however to avoid the need to detect the 
> Napatech
> special (good) case, ‘ideally’ we do the exact same programming even in 
> Napatech case.

Will following look okay to you (assuming we have the probe ability and DPDK
has some basic capability feedback)?

if (!pure_mark_cap_probed) {
if (dpdk_rte_flow_has_rss_action_support) {
append_rss_action();
} else {
fail and return;
}
}

/* create flow once, with no retries, if fails, let it fail */
flow = rte_flow_create(...);

I assume that's how the code looks like finally. What do you think?

--yliu

> For Intel, yes, I think Intel folks could give some comments here.
> 
> > It seems reasonable to ask where the disconnect is and whether this 
> support
> > can be added and then make a decision based on the answers. 
> 
> No objections.
> 
>   --yliu
> 
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3 0/9] OVS-DPDK flow offload with rte_flow

2017-09-26 Thread Darrell Ball

On 9/26/17, 6:25 PM, "Yuanhan Liu"  wrote:

On Tue, Sep 26, 2017 at 09:58:16PM +, Darrell Ball wrote:
> The second major change is there is a thread introduced to do the real
> flow offload. This is for diminishing the overhead by hw flow offload
> installation/deletion at data path. See patch 9 for more detailed 
info.
> 
> [Darrell] There might be other options to handle this such as dividing 
time
> to HWOL in a given PMD. 
> Another option to have PMD isolation.

Good to know, though I'm not sure I understand it so far. But it can be
discussed later. That's also the reason I put it in the last patch, so
that we could easily turn it to another solution, or even simpler (just
remove it).

> In the last discussion, RSS action was suggested to use to get rid of
> the QUEUE action workaround. Unfortunately, it didn't work. The flow
> creation failed with MLX5 PMD driver, and that's the only driver in
> DPDK that supports RSS action so far.
> 
> 
> [Darrell] 
> I wonder if we should take a pause before jumping into the next version 
of the code.

I have no objections.

> If workarounds are required in OVS code, then so be it as long as they 
are not
> overly disruptive to the existing code and hard to maintain.
> In the case of RTE_FLOW_ACTION_TYPE_RSS, we might have a reasonable option
> to avoid some unpleasant OVS workarounds.
> This would make a significant difference in the code paths, if we 
supported it, so
> we need to be sure as early as possible.

I agree.

> The support needed would be in some drivers and seems reasonably doable. 
> Moreover, this was discussed in the last dpdk meeting and the support was
> indicated as existing?, although I only verified the MLX code, myself.

From the code point of view, yes, the code was there months ago.

> I had seen the MLX code supporting _RSS action and there are some checks 
for
> supported cases; when you say “it didn't work”, what was the issue ?

I actually have met the error months ago, I even debugged it. IIRC,
the error is from libibverbs, when trying to create the hw flow. I
think I need double-confirm it with our colleague who developed this
feature.

> Let us have a discussion also about the Intel nic side and the Napatech 
side.

I think it's not necessary for Napatech, because they can support
MARK only action.

It is not necessary for Napatech; however to avoid the need to detect the 
Napatech
special (good) case, ‘ideally’ we do the exact same programming even in 
Napatech case.

For Intel, yes, I think Intel folks could give some comments here.

> It seems reasonable to ask where the disconnect is and whether this 
support
> can be added and then make a decision based on the answers. 

No objections.

--yliu

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3 0/9] OVS-DPDK flow offload with rte_flow

2017-09-26 Thread Yuanhan Liu

On Wed, Sep 27, 2017 at 09:24:54AM +0800, Yuanhan Liu wrote:
> > The support needed would be in some drivers and seems reasonably doable. 
> > Moreover, this was discussed in the last dpdk meeting and the support was
> > indicated as existing?, although I only verified the MLX code, myself.
> 
> >From the code point of view, yes, the code was there months ago.
> 
> > I had seen the MLX code supporting _RSS action and there are some checks for
> > supported cases; when you say “it didn't work”, what was the issue ?
> 
> I actually have met the error months ago, I even debugged it. IIRC,
> the error is from libibverbs, when trying to create the hw flow. I
> think I need double-confirm it with our colleague who developed this
> feature.

Hmm, I think I was wrong, and apparenetly, I think I have did something
stupid in the last try. I double confirmed it just now, at least the
flow creation now works. I just need take some time to intergrate it
to OVS-DPDK, to see how it goes. I will get it back to you when I have
done that. Currently, I was occupied by something else.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next v9] openvswitch: enable NSH support

2017-09-26 Thread Yang, Yi

On Tue, Sep 26, 2017 at 06:49:14PM +0800, Jiri Benc wrote:
> On Tue, 26 Sep 2017 12:55:39 +0800, Yang, Yi wrote:
> > After push_nsh, the packet won't be recirculated to flow pipeline, so
> > key->eth.type must be set explicitly here, but for pop_nsh, the packet
> > will be recirculated to flow pipeline, it will be reparsed, so
> > key->eth.type will be set in packet parse function, we needn't handle it
> > in pop_nsh.
> 
> This seems to be a very different approach than what we currently have.
> Looking at the code, the requirement after "destructive" actions such
> as pushing or popping headers is to recirculate.

This is optimization proposed by Jan Scheurich, recurculating after push_nsh
will impact on performance, recurculating after pop_nsh is unavoidable, So
also cc jan.scheur...@ericsson.com.

Actucally all the keys before push_nsh are still there after push_nsh,
push_nsh has updated all the nsh keys, so recirculating remains avoidable.

> 
> Setting key->eth.type to satisfy conditions in the output path without
> updating the rest of the key looks very hacky and fragile to me. There
> might be other conditions and dependencies that are not obvious.
> I don't think the code was written with such code path in mind.
> 
> I'd like to hear what Pravin thinks about this.
> 
>  Jiri
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3 0/9] OVS-DPDK flow offload with rte_flow

2017-09-26 Thread Yuanhan Liu

On Tue, Sep 26, 2017 at 09:58:16PM +, Darrell Ball wrote:
> The second major change is there is a thread introduced to do the real
> flow offload. This is for diminishing the overhead by hw flow offload
> installation/deletion at data path. See patch 9 for more detailed info.
> 
> [Darrell] There might be other options to handle this such as dividing time
> to HWOL in a given PMD. 
> Another option to have PMD isolation.

Good to know, though I'm not sure I understand it so far. But it can be
discussed later. That's also the reason I put it in the last patch, so
that we could easily turn it to another solution, or even simpler (just
remove it).

> In the last discussion, RSS action was suggested to use to get rid of
> the QUEUE action workaround. Unfortunately, it didn't work. The flow
> creation failed with MLX5 PMD driver, and that's the only driver in
> DPDK that supports RSS action so far.
> 
> 
> [Darrell] 
> I wonder if we should take a pause before jumping into the next version of 
> the code.

I have no objections.

> If workarounds are required in OVS code, then so be it as long as they are not
> overly disruptive to the existing code and hard to maintain.
> In the case of RTE_FLOW_ACTION_TYPE_RSS, we might have a reasonable option
> to avoid some unpleasant OVS workarounds.
> This would make a significant difference in the code paths, if we supported 
> it, so
> we need to be sure as early as possible.

I agree.

> The support needed would be in some drivers and seems reasonably doable. 
> Moreover, this was discussed in the last dpdk meeting and the support was
> indicated as existing?, although I only verified the MLX code, myself.

From the code point of view, yes, the code was there months ago.

> I had seen the MLX code supporting _RSS action and there are some checks for
> supported cases; when you say “it didn't work”, what was the issue ?

I actually have met the error months ago, I even debugged it. IIRC,
the error is from libibverbs, when trying to create the hw flow. I
think I need double-confirm it with our colleague who developed this
feature.

> Let us have a discussion also about the Intel nic side and the Napatech side.

I think it's not necessary for Napatech, because they can support
MARK only action.

For Intel, yes, I think Intel folks could give some comments here.

> It seems reasonable to ask where the disconnect is and whether this support
> can be added and then make a decision based on the answers. 

No objections.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next v9] openvswitch: enable NSH support

2017-09-26 Thread Yang, Yi

On Wed, Sep 27, 2017 at 04:59:36AM +0800, Eric Garver wrote:
> On Tue, Sep 26, 2017 at 01:02:15PM +0800, Yang, Yi wrote:
> > On Tue, Sep 26, 2017 at 03:28:42AM +0800, Eric Garver wrote:
> > > On Mon, Sep 25, 2017 at 10:16:09PM +0800, Yi Yang wrote:
> > > > +
> > > > +   length = nsh_hdr_len(nsh_hdr);
> > > > +   skb_pull(skb, length);
> > > 
> > > Do you need to verify you can actually pull length bytes? I don't see
> > > any guarantee.
> > 
> > I have added skb length check in pop_nsh, so that can verify this.
> 
> That doesn't help other code that may call skb_pop_nsh(). skb_vlan_pop()
> calls skb_ensure_writable() which seems like the right thing to do.

Make sense, I will move it to skp_pop_nsh, thanks.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next v10] openvswitch: enable NSH support

2017-09-26 Thread Yang, Yi

On Tue, Sep 26, 2017 at 10:42:40PM +0800, Jiri Benc wrote:
> On Tue, 26 Sep 2017 21:52:41 +0800, Yang, Yi wrote:
> > > + return ((ret != 0) ? false : true);
> > 
> > But I don't think this is a problematic line from my understanding,
> 
> Why not:
> 
>   return ((ret != 0 == true) ? false : true) == true;
> 
> ?
> 
> Sigh. This is equal to:
> 
>   return !ret;
> 
> which you should use.

Ok, got it, I'll use "return !ret;", real programming art :-), I also saw
!!(condition), personally its readability is not good, typical kernel
style :-)
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] conntrack: Fix conn_type need be checked when remove rev_conn.

2017-09-26 Thread Darrell Ball



On 9/23/17, 2:36 AM, "ovs-dev-boun...@openvswitch.org on behalf of wangyunjian" 
 wrote:

We use hping3 to send random tcp, udp from VM1 to VM2. We met a issue that 
conn_lookup( ) will find
rev_conn->conn_type = CT_CONN_TYPE_DEFAULT in nat_clean().

[Darrell] So you are sending traffic in one direction between 2 OVS ports ?
   What is the associated packet src/dst IP and src/dst ports, conn 
key src/dst addresses, src/dst ports
   and reverse conn key src/dst addresses, src/dst ports causing 
the problem?
   Is it the TCP or UDP traffic related ?
   
Is this rev_conn->conn_type = CT_CONN_TYPE_DEFAULT ok?

[Darrell] I asked for information related to such a possible case, but no 
details were provided yet – see link below.
   Furthermore, the other information provided was not consistent, 
as mentioned earlier.
   The data would need to be from a non-instrumented version of the 
code.
   
I think the rev_conn->conn_type need to be CT_CONN_TYPE_UN_NAT.


From: Darrell Ball [mailto:dlu...@gmail.com]
Sent: Tuesday, September 12, 2017 12:38 PM
To: wangyunjian 
Cc: ovs dev ; Huanglili (lee) 
; b...@ovs.org
Subject: Re: [ovs-dev] [PATCH] conntrack: Fix conn_type need be checked 
when remove rev_conn.

We cannot merge this patch.

Can you provide answers to the questions I asked here


https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2017-2DSeptember_045308.html&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=lN8LgYzgffaggReqxu_7a9QKH74RKoTbK0Faru94z7o&s=l8K4j912IB0MotzNP8RaUgLFTrO4BI9QTggLebSzhoY&e=
 

Thanks Darrell


On Mon, Sep 11, 2017 at 2:49 AM, w00273186 
mailto:wangyunj...@huawei.com>> wrote:
From: Yunjian Wang mailto:wangyunj...@huawei.com>>

The rev_conn need will be removed, only when conn_type is 
CT_CONN_TYPE_UN_NAT.
This crash will be triggered when remove conn in ct-clean thread.

Reported-by: Lili Huang 
mailto:huanglili.hu...@huawei.com>>
Signed-off-by: Yunjian Wang 
mailto:wangyunj...@huawei.com>>
---
 lib/conntrack.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 419cb1d..c1adb56 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -684,9 +684,10 @@ nat_clean(struct conntrack *ct, struct conn *conn,

 /* In the unlikely event, rev conn was recreated, then skip
  * rev_conn cleanup. */
-if (rev_conn && (!nat_conn_key_node ||
- conn_key_cmp(&nat_conn_key_node->value,
-  &rev_conn->rev_key))) {
+if (rev_conn &&
+(rev_conn->conn_type == CT_CONN_TYPE_UN_NAT) &&
+(!nat_conn_key_node || conn_key_cmp(&nat_conn_key_node->value,
+&rev_conn->rev_key))) {
 hmap_remove(&ct->buckets[bucket_rev_conn].connections,
 &rev_conn->node);
 free(rev_conn);
--
1.8.3.1


___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=lN8LgYzgffaggReqxu_7a9QKH74RKoTbK0Faru94z7o&s=z8oNo064OLtfO_bcwjyMAq0xXydcu0oWif5YkLENYtw&e=
 

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=lN8LgYzgffaggReqxu_7a9QKH74RKoTbK0Faru94z7o&s=z8oNo064OLtfO_bcwjyMAq0xXydcu0oWif5YkLENYtw&e=
 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v4 3/3] ovn-controller: Use separate thread for packet-in processing.

2017-09-26 Thread Guoshuai Li


This is very useful to me.

I found a problem in my use:
In the ovn-controller and the south of the database connection，when the 
ovn-controller as a passive service, the SB as a client, such as 
configured to


ovs-vsctl set Open_vSwitch . external-ids:ovn-remote=ptcp:6644:0.0.0.0
ovn-sbctl set-connection  tcp:10.157.145.211:6644 
tcp:10.157.145.212:6644 tcp:10.157.145.213:6644 tcp:10.157.145.214:6644


This configuration is to determine the status of the chassis through the 
state of the connection in the OVNSB。



But here the two threads listening conflict：
2017-09-27T00:00:59.387Z|26144|socket_util|ERR|6644:0.0.0.0: bind: 
Address already in use
2017-09-27T00:00:59.387Z|26145|reconnect|INFO|ptcp:6644:0.0.0.0: 
listening...
2017-09-27T00:00:59.387Z|26146|reconnect|INFO|ptcp:6644:0.0.0.0: listen 
attempt failed (Address already in use)


Do you have a good way to fix it?

And this patch also conflict with master...


 2017/8/28 12:14, Han Zhou :

This patch introduces multi-threading for ovn-controller and use
dedicated thread for packet-in processing as a start. It decouples
packet-in processing and ovs flow computing, so that packet-in inputs
won't trigger flow recomputing, and flow computing won't block
packet-in processing. In large scale environment this largely reduces
CPU cost and improves performance.

Related effort and discussion:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331813.html

Signed-off-by: Han Zhou 
---
v3->v4: rebased on master.

  ovn/controller/ovn-controller.c |  71 ++-
  ovn/controller/ovn-controller.h |  38 +++
  ovn/controller/pinctrl.c| 105 
  ovn/controller/pinctrl.h|   1 +
  4 files changed, 193 insertions(+), 22 deletions(-)

diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 414443f..cb04244 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -56,6 +56,8 @@
  #include "stream.h"
  #include "unixctl.h"
  #include "util.h"
+#include "latch.h"
+#include "ovs-thread.h"
  
  VLOG_DEFINE_THIS_MODULE(main);
  
@@ -66,8 +68,6 @@ static unixctl_cb_func inject_pkt;

  #define DEFAULT_BRIDGE_NAME "br-int"
  #define DEFAULT_PROBE_INTERVAL_MSEC 5000
  
-static void update_probe_interval(struct controller_ctx *,

-  const char *ovnsb_remote);
  static void parse_options(int argc, char *argv[]);
  OVS_NO_RETURN static void usage(void);
  
@@ -78,7 +78,7 @@ struct pending_pkt {

  char *flow_s;
  };
  
-static char *ovs_remote;

+char *ovs_remote;
  
  struct local_datapath *

  get_local_datapath(const struct hmap *local_datapaths, uint32_t tunnel_key)
@@ -129,7 +129,7 @@ get_bridge(struct ovsdb_idl *ovs_idl, const char *br_name)
  return NULL;
  }
  
-static void

+void
  update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
 const struct sbrec_chassis *chassis,
 const struct sset *local_ifaces,
@@ -257,7 +257,7 @@ create_br_int(struct controller_ctx *ctx)
  return bridge;
  }
  
-static const struct ovsrec_bridge *

+const struct ovsrec_bridge *
  get_br_int(struct controller_ctx *ctx)
  {
  const struct ovsrec_open_vswitch *cfg;
@@ -269,7 +269,7 @@ get_br_int(struct controller_ctx *ctx)
  return get_bridge(ctx->ovs_idl, br_int_name(cfg));
  }
  
-static const char *

+const char *
  get_chassis_id(const struct ovsdb_idl *ovs_idl)
  {
  const struct ovsrec_open_vswitch *cfg = 
ovsrec_open_vswitch_first(ovs_idl);
@@ -309,7 +309,7 @@ update_ssl_config(const struct ovsdb_idl *ovs_idl)
  
  /* Retrieves the OVN Southbound remote location from the

   * "external-ids:ovn-remote" key in 'ovs_idl' and returns a copy of it. */
-static char *
+char *
  get_ovnsb_remote(struct ovsdb_idl *ovs_idl)
  {
  while (1) {
@@ -498,6 +498,22 @@ get_nb_cfg(struct ovsdb_idl *idl)
  }
  
  static void

+ctrl_thread_create(struct ctrl_thread *thread, const char *name,
+void *(*start)(void *))
+{
+latch_init(&thread->exit_latch);
+thread->thread = ovs_thread_create(name, start, thread);
+}
+
+static void
+ctrl_thread_exit(struct ctrl_thread *thread)
+{
+latch_set(&thread->exit_latch);
+xpthread_join(thread->thread, NULL);
+latch_destroy(&thread->exit_latch);
+}
+
+void
  ctrl_register_ovs_idl(struct ovsdb_idl *ovs_idl)
  {
  /* We do not monitor all tables by default, so modules must register
@@ -574,6 +590,22 @@ create_ovnsb_indexes(struct ovsdb_idl *ovnsb_idl)
 OVSDB_INDEX_ASC, NULL);
  }
  
+void

+connect_ovnsb(struct ovsdb_idl_loop *ovnsb_idl_loop,
+  struct ovnsb_cursors *cursors,
+  const char *ovnsb_remote)
+{
+ovnsb_idl_loop->idl = ovsdb_idl_create(ovnsb_remote,
+&sbrec_idl_class, true, true);
+
+create_ovnsb_indexes(ovnsb_idl_loop->idl);
+lport_init(cursors, ovnsb_idl_loop->idl);
+
+ovsdb_idl_omit_alert(ovnsb_idl_loop->idl, &

Re: [ovs-dev] [branch-2.8 2/2] Prepare for 2.8.2.

2017-09-26 Thread Ben Pfaff

On Tue, Sep 26, 2017 at 01:37:37PM -0700, Justin Pettit wrote:
> Signed-off-by: Justin Pettit 

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [branch-2.8 1/2] Set release date for 2.8.1.

2017-09-26 Thread Ben Pfaff

On Tue, Sep 26, 2017 at 01:37:36PM -0700, Justin Pettit wrote:
> Signed-off-by: Justin Pettit 

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] 2.7.3 release?

2017-09-26 Thread Ben Pfaff

On Tue, Sep 26, 2017 at 10:34:07PM +, Darrell Ball wrote:
> 
> 
> On 9/25/17, 12:41 PM, "Ben Pfaff"  wrote:
> 
> On Mon, Sep 25, 2017 at 06:37:29PM +, Darrell Ball wrote:
> > 
> > 
> > On 9/25/17, 11:31 AM, "Ben Pfaff"  wrote:
> > 
> > On Mon, Sep 25, 2017 at 06:28:46PM +, Darrell Ball wrote:
> > > 
> > > 
> > > On 9/25/17, 10:49 AM, "ovs-dev-boun...@openvswitch.org on behalf 
> of Justin Pettit"  jpet...@ovn.org> wrote:
> > > 
> > > 
> > > > On Sep 25, 2017, at 5:54 AM, Russell Bryant 
>  wrote:
> > > > 
> > > > There's some important changes in branch-2.7.  Can we do a 
> 2.7.3
> > > > release soon?  Does anyone have anything they need 
> addressed first?
> > > 
> > > I'm happy to do a release.  I think we're due for a 2.8.1 
> release, too.
> > > 
> > > Ian wants a DPDK patch moved into branch-2.8.  Is there 
> anything else that's important?
> > > 
> > > 
> > > 2.8.0 was Aug 31.
> > > Maybe 2.8.1 can wait a couple weeks to give time for other 
> potential fixes?
> > 
> > There will always be more fixes, and we can always do more releases.
> > 
> > Do you know of important fixes that are in the works?
> > 
> > There are a couple backport requests in the dpdk merge.
> 
> OK.
> 
> I merged that request, and backported one of them, but "netdev-dpdk:
> reset packet_type for reused dp_packets." had a conflict.  Can you send
> out a backport?
> 
> I sent out a two patch series backport request here
> https://patchwork.ozlabs.org/project/openvswitch/list/?series=5042

Thanks, I applied these to branch-2.8.

> 
> > Also, ideally, I would like to include
> > 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.ozlabs.org_patch_816712_&d=DwIBAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=dUNQc7lLVYHsKc9aahwyrvZMr8T9MBwGKdlBzWvttMM&s=EyJ0ntHzrXaLo0BMDREZnHkpMhNLixvNi-HdUDBoWzQ&e=
>  
> > as well.
> 
> Just patch 5 from that series?  Usually, I expect important bug fixes to
> be early in a series.
> 
> I changed the ordering of the patches, so that patches 1-3 are relevant to 
> backporting and sent a new version
> https://patchwork.ozlabs.org/project/openvswitch/list/?series=5051
> The changes are borderline in meeting the test for backporting, but I think 
> the debug change would be nice to have.

Thanks.  I applied the series to master and the first 3 patches to
branch-2.8.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [patch v3 1/5] conntrack: Create nat_conn_keys_insert().

2017-09-26 Thread Ben Pfaff

On Mon, Sep 25, 2017 at 08:51:42PM -0700, Darrell Ball wrote:
> Create a separate function from existing code, so the
> code can be reused in a subsequent patch; no change
> in functionality.
> 
> Acked-by: Bhanuprakash Bodireddy 
> Signed-off-by: Darrell Ball 

Thanks Darrell and Bhanu.

I applied this series to master and backported the first three patches
to branch-2.8.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [branch 2.8 1/2] dp-packet: Refactor DPDK packet initialization.

2017-09-26 Thread Ben Pfaff

On Mon, Sep 25, 2017 at 07:01:35PM -0700, Darrell Ball wrote:
> DPDK uses dp-packet pools and manages the mbuf portion of
> each packet. When a pool is created, partial initialization is
> also done on the OVS portion (i.e. non-mbuf).  Since packet
> memory is reused, this is not very useful for transient
> fields and is also misleading.  Furthermore, some of these
> transient fields are properly initialized for DPDK packets
> entering OVS anyways, which is the only reasonable way to do this.
> Another field, cutlen, is initialized in this manner in the pool
> and intended to be reset when cutlen is applied on sending the
> packet out. However, if cutlen context is set but the packet is
> not sent out for some reason, then the packet header would be
> corrupted in the memory pool.  It is better to just reset the
> cutlen in the packets when received.  I did not detect a
> degradation in performance, however, I would be willing to
> have some degradation, since this is a proper way to handle
> this.  In addition to initializing cutlen in received packets,
> the other OVS transient fields are removed from the DPDK pool
> initialization.
> 
> Acked-by: Sugesh Chandran 
> Signed-off-by: Darrell Ball 

Thanks, I applied both of these to branch-2.8.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] 2.7.3 release?

2017-09-26 Thread Darrell Ball



On 9/25/17, 12:41 PM, "Ben Pfaff"  wrote:

On Mon, Sep 25, 2017 at 06:37:29PM +, Darrell Ball wrote:
> 
> 
> On 9/25/17, 11:31 AM, "Ben Pfaff"  wrote:
> 
> On Mon, Sep 25, 2017 at 06:28:46PM +, Darrell Ball wrote:
> > 
> > 
> > On 9/25/17, 10:49 AM, "ovs-dev-boun...@openvswitch.org on behalf of 
Justin Pettit"  
wrote:
> > 
> > 
> > > On Sep 25, 2017, at 5:54 AM, Russell Bryant  
wrote:
> > > 
> > > There's some important changes in branch-2.7.  Can we do a 
2.7.3
> > > release soon?  Does anyone have anything they need addressed 
first?
> > 
> > I'm happy to do a release.  I think we're due for a 2.8.1 
release, too.
> > 
> > Ian wants a DPDK patch moved into branch-2.8.  Is there 
anything else that's important?
> > 
> > 
> > 2.8.0 was Aug 31.
> > Maybe 2.8.1 can wait a couple weeks to give time for other 
potential fixes?
> 
> There will always be more fixes, and we can always do more releases.
> 
> Do you know of important fixes that are in the works?
> 
> There are a couple backport requests in the dpdk merge.

OK.

I merged that request, and backported one of them, but "netdev-dpdk:
reset packet_type for reused dp_packets." had a conflict.  Can you send
out a backport?

I sent out a two patch series backport request here
https://patchwork.ozlabs.org/project/openvswitch/list/?series=5042


> Also, ideally, I would like to include
> 
https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.ozlabs.org_patch_816712_&d=DwIBAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=dUNQc7lLVYHsKc9aahwyrvZMr8T9MBwGKdlBzWvttMM&s=EyJ0ntHzrXaLo0BMDREZnHkpMhNLixvNi-HdUDBoWzQ&e=
 
> as well.

Just patch 5 from that series?  Usually, I expect important bug fixes to
be early in a series.

I changed the ordering of the patches, so that patches 1-3 are relevant to 
backporting and sent a new version
https://patchwork.ozlabs.org/project/openvswitch/list/?series=5051
The changes are borderline in meeting the test for backporting, but I think the 
debug change would be nice to have.







___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3 0/9] OVS-DPDK flow offload with rte_flow

2017-09-26 Thread Darrell Ball



On 9/25/17, 10:37 PM, "Yuanhan Liu"  wrote:

Some hightlights in v3
==

Here is the v3, with 2 major changes and further testings (including
many flows). This took more effort than I thought, thus v3 publication
has been delayed for a while.

The first major change is the mark and id association is done with array
instead of CMAP now. This gives us further performance gain: it could
be up to 70% now (please see the exact number below).

This change also make the code a bit more complex though, due to the
lock issue. RCU is used (not quite sure it's been used rightly though).
For now, RCU only protects the array base address update (due to
reallocate), it doesn't protect the array item (array[i] = xx]) change.
I think it's buggy and I need rethink about it.

The second major change is there is a thread introduced to do the real
flow offload. This is for diminishing the overhead by hw flow offload
installation/deletion at data path. See patch 9 for more detailed info.

[Darrell] There might be other options to handle this such as dividing time
to HWOL in a given PMD. 
Another option to have PMD isolation.

In the last discussion, RSS action was suggested to use to get rid of
the QUEUE action workaround. Unfortunately, it didn't work. The flow
creation failed with MLX5 PMD driver, and that's the only driver in
DPDK that supports RSS action so far.


[Darrell] 
I wonder if we should take a pause before jumping into the next version of the 
code.

If workarounds are required in OVS code, then so be it as long as they are not
overly disruptive to the existing code and hard to maintain.
In the case of RTE_FLOW_ACTION_TYPE_RSS, we might have a reasonable option
to avoid some unpleasant OVS workarounds.
This would make a significant difference in the code paths, if we supported it, 
so
we need to be sure as early as possible.
The support needed would be in some drivers and seems reasonably doable. 
Moreover, this was discussed in the last dpdk meeting and the support was
indicated as existing?, although I only verified the MLX code, myself.

I had seen the MLX code supporting _RSS action and there are some checks for
supported cases; when you say “it didn't work”, what was the issue ?
Let us have a discussion also about the Intel nic side and the Napatech side.
It seems reasonable to ask where the disconnect is and whether this support
can be added and then make a decision based on the answers. 

What do you think?


I also tested many flows this time. The result is more exciting: it
could be up to 267% boost, with 512 mega flows (with each has another
512 exact matching flows, thus it's 512*512=256K flows in total), one
core and one queue doing PHY-PHY forwarding. For the offload case, the
performance keeps the same as with one flow only: because the cost of
the mark to flow translation is constant, no matter how many flows
are inserted (as far as they are all offloaded). However, for the
vanilla ovs-dpdk, the more flows, the worse the performance is. In
another word, the more flows, the bigger difference we will see.

There were too many discussions in last version. I'm sorry if I missed
some comments and didn't do the corresponding changes in v3. Please let
me know if I made such mistakes.

And below are the formal cover letter introduction, for someone who
is the first time to see this patchset.

---
Hi,

Here is a joint work from Mellanox and Napatech, to enable the flow hw
offload with the DPDK generic flow interface (rte_flow).

The basic idea is to associate the flow with a mark id (a unit32_t number).
Later, we then get the flow directly from the mark id, bypassing the heavy
emc processing, including miniflow_extract.

The association is done with array in patch 1. It also reuses the flow
APIs introduced while adding the tc offloads. The emc bypassing is done
in patch 2. The flow offload is done in patch 4, which mainly does two
things:

- translate the ovs match to DPDK rte flow patterns
- bind those patterns with a MARK action.

Afterwards, the NIC will set the mark id in every pkt's mbuf when it
matches the flow. That's basically how we could get the flow directly
from the received mbuf.

While testing with PHY-PHY forwarding with one core, one queue and one
flow, I got about 70% performance boost. For PHY-vhost forwarding, I got
about 50% performance boost. It's basically the performance I got with v1,
when the tcp_flags is the ignored. In summary, the CMPA to array change
gives up yet another 16% performance boost.

The major issue mentioned in v1 is also workarounded: the queue index
is never set to 0 blindly anymore, but set to the rxq that first
receives the upcall pkt.

Note th

Re: [ovs-dev] [PATCH net-next v9] openvswitch: enable NSH support

2017-09-26 Thread Eric Garver

On Tue, Sep 26, 2017 at 01:02:15PM +0800, Yang, Yi wrote:
> On Tue, Sep 26, 2017 at 03:28:42AM +0800, Eric Garver wrote:
> > On Mon, Sep 25, 2017 at 10:16:09PM +0800, Yi Yang wrote:
> > > +
> > > + length = nsh_hdr_len(nsh_hdr);
> > > + skb_pull(skb, length);
> > 
> > Do you need to verify you can actually pull length bytes? I don't see
> > any guarantee.
> 
> I have added skb length check in pop_nsh, so that can verify this.

That doesn't help other code that may call skb_pop_nsh(). skb_vlan_pop()
calls skb_ensure_writable() which seems like the right thing to do.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [branch-2.8 1/2] Set release date for 2.8.1.

2017-09-26 Thread Justin Pettit

Signed-off-by: Justin Pettit 
---
 NEWS | 5 +++--
 debian/changelog | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/NEWS b/NEWS
index 7e9c0f63a90b..ef3abd56b7f2 100644
--- a/NEWS
+++ b/NEWS
@@ -1,8 +1,9 @@
-v2.8.1 - xx xxx 
-
+v2.8.1 - 26 Sep 2017
+-
- OVN:
  * The "requested-chassis" option for a logical switch port now accepts a
chassis "hostname" in addition to a chassis "name".
+   - Bug fixes
 
 v2.8.0 - 31 Aug 2017
 -
diff --git a/debian/changelog b/debian/changelog
index 0bf3c4c82139..8724b9bc6fff 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -3,7 +3,7 @@ openvswitch (2.8.1-1) unstable; urgency=medium
[ Open vSwitch team ]
* New upstream version
 
- -- Open vSwitch team   Thu, 31 Aug 2017 09:32:16 -0700
+ -- Open vSwitch team   Tue, 26 Sep 2017 13:34:23 -0700
 
 openvswitch (2.8.0-1) unstable; urgency=medium
 
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [branch-2.8 2/2] Prepare for 2.8.2.

2017-09-26 Thread Justin Pettit

Signed-off-by: Justin Pettit 
---
 NEWS | 4 
 configure.ac | 2 +-
 debian/changelog | 7 +++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index ef3abd56b7f2..669d4ac8 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,7 @@
+v2.8.2 - xx xxx 
+-
+
+
 v2.8.1 - 26 Sep 2017
 -
- OVN:
diff --git a/configure.ac b/configure.ac
index 6ade118a029c..757a6375eaef 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 2.8.1, b...@openvswitch.org)
+AC_INIT(openvswitch, 2.8.2, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([datapath/datapath.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index 8724b9bc6fff..b536a44dc528 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,10 @@
+openvswitch (2.8.2-1) unstable; urgency=medium
+
+   [ Open vSwitch team ]
+   * New upstream version
+
+ -- Open vSwitch team   Tue, 26 Sep 2017 13:34:23 -0700
+
 openvswitch (2.8.1-1) unstable; urgency=medium
 
[ Open vSwitch team ]
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 3/4] netdev-dpdk: log an err message when a mempool name is empty.

2017-09-26 Thread Darrell Ball



On 9/26/17, 8:06 AM, "ovs-dev-boun...@openvswitch.org on behalf of 
antonio.fische...@intel.com"  wrote:

Log an error message when the creation of a name for a
new mempool fails.

CC: Ciara Loftus 
CC: Kevin Traynor 
CC: Aaron Conole 
Signed-off-by: Antonio Fischetti 
---
 lib/netdev-dpdk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f3f42ee..7c673ec 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -502,6 +502,9 @@ dpdk_mp_name(struct dpdk_mp *dmp)
 int ret = snprintf(mp_name, RTE_MEMPOOL_NAMESIZE, "ovs_%x_%d_%u",
h, dmp->mtu, dmp->mp_size);
 if (ret < 0 || ret >= RTE_MEMPOOL_NAMESIZE) {
+VLOG_ERR("Failed to generate a mempool name for \"%s\". "
+"Hash:0x%x, mtu:%d, mbufs:%u",
+dmp->if_name, h, dmp->mtu, dmp->mp_size);

[Darrell] “Unlikely” to fail but this could be an ovs_assert(…)


 return NULL;
 }
 return mp_name;
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=VXc6b9RFgB9FaMDL4JevWOtvj0gTpvDJYBS30fvVj8Y&s=Eeo-qAFITbCVhSWvMvxHDN4IzLsDEIYI1MmIsoUtDOE&e=
 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 1/4] netdev-dpdk: fix mempool management with vhu client.

2017-09-26 Thread Darrell Ball



On 9/26/17, 12:58 PM, "Darrell Ball"  wrote:



On 9/26/17, 8:04 AM, "ovs-dev-boun...@openvswitch.org on behalf of 
antonio.fische...@intel.com"  wrote:

In a PVP test where vhostuser ports are configured as
clients, OvS crashes when QEMU is launched.
This patch avoids the repeated calls to netdev_change_seq_changed
after the requested mempool is already acquired.

CC: Ciara Loftus 
CC: Kevin Traynor 
CC: Aaron Conole 
Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each 
port.")
Signed-off-by: Antonio Fischetti 
---
To replicate the bug scenario:

[Darrell] Just curious, but reproducibility with below steps ?; what about 
using libvirt ?
   Do we have the stacktrace ?
 


 PVP test setup
 --
CLIENT_SOCK_DIR=/tmp
SOCK0=dpdkvhostuser0
SOCK1=dpdkvhostuser1

1 PMD
Add 2 dpdk ports, n_rxq=1
Add 2 vhu ports both of type dpdkvhostuserclient and specify 
vhost-server-path
 ovs-vsctl set Interface dpdkvhostuser0 
options:vhost-server-path="$CLIENT_SOCK_DIR/$SOCK0"
 ovs-vsctl set Interface dpdkvhostuser1 
options:vhost-server-path="$CLIENT_SOCK_DIR/$SOCK1"

Set port-based rules: dpdk0 <--> vhu0 and dpdk1 <--> vhu1
 add-flow br0 in_port=1,action=output:3
 add-flow br0 in_port=3,action=output:1
 add-flow br0 in_port=4,action=output:2
 add-flow br0 in_port=2,action=output:4

 Launch QEMU
 ---
As OvS vhu ports are acting as clients, we must specify 'server' in the 
next command.
VM_IMAGE=

 sudo -E taskset 0x3F00 $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 
-name us-vhost-vm1 -cpu host -enable-kvm -m 4096M -object 
memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa 
node,memdev=mem -mem-prealloc -smp 4 -drive file=$VM_IMAGE -chardev 
socket,id=char0,path=$CLIENT_SOCK_DIR/$SOCK0,server -netdev 
type=vhost-user,id=mynet1,chardev=char0,vhostforce -device 
virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off -chardev 
socket,id=char1,path=$CLIENT_SOCK_DIR/$SOCK1,server -netdev 
type=vhost-user,id=mynet2,chardev=char1,vhostforce -device 
virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off --nographic

 Expected behavior
 -
With this fix OvS shouldn't crash.
---
 lib/netdev-dpdk.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 648d719..2f5ec71 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -549,7 +549,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
   dmp->socket_id);
 if (dmp->mp) {
 VLOG_DBG("Allocated \"%s\" mempool with %u mbufs", mp_name,
- dmp->mp_size);
+dmp->mp_size);


[Darrell]
is this needed?



 } else if (rte_errno == EEXIST) {
 /* A mempool with the same name already exists.  We just
  * retrieve its pointer to be returned to the caller. */
@@ -571,6 +571,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
  * initializes some OVS specific fields of dp_packet.
  */
 rte_mempool_obj_iter(dmp->mp, ovs_rte_pktmbuf_init, NULL);
+
 return dmp;
 }
 } while (!mp_exists &&
@@ -3247,11 +3248,14 @@ dpdk_vhost_reconfigure_helper(struct 
netdev_dpdk *dev)
 
 netdev_dpdk_remap_txqs(dev);
 
-err = netdev_dpdk_mempool_configure(dev);
-if (err) {
-return err;
-} else {
-netdev_change_seq_changed(&dev->up);
+if (dev->requested_socket_id != dev->socket_id
+|| dev->requested_mtu != dev->mtu) {


[Darrell] Can this check be moved to common code in 
netdev_dpdk_mempool_configure() and using a special return value
that skips netdev_change_seq_changed(&dev->up); in the callers, if
the call would be redundant ?


+err = netdev_dpdk_mempool_configure(dev);
+if (err) {
+return err;
+} else {
+netdev_change_seq_changed(&dev->up);
+}
 }
 
 if (netdev_dpdk_get_vid(dev) >= 0) {
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__m

Re: [ovs-dev] [PATCH 2/4] netdev-dpdk: if mempool already exists don't reinit packet areas.

2017-09-26 Thread Darrell Ball



On 9/26/17, 8:05 AM, "ovs-dev-boun...@openvswitch.org on behalf of 
antonio.fische...@intel.com"  wrote:

Skip initialization of mempool objects if this was already
done in a previous call to dpdk_mp_create.

CC: Ciara Loftus 
CC: Kevin Traynor 
CC: Aaron Conole 
Signed-off-by: Antonio Fischetti 
---
 lib/netdev-dpdk.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2f5ec71..f3f42ee 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -566,12 +566,15 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
 }
 free(mp_name);
 if (dmp->mp) {
-/* rte_pktmbuf_pool_create has done some initialization of the
- * rte_mbuf part of each dp_packet, while ovs_rte_pktmbuf_init
- * initializes some OVS specific fields of dp_packet.
- */
-rte_mempool_obj_iter(dmp->mp, ovs_rte_pktmbuf_init, NULL);
-
+/* If the current mp was already created by a previous call
+ * we don't need to init again all its elements. */
+if (!mp_exists) {
+/* rte_pktmbuf_pool_create has done some initialization of 
the
+ * rte_mbuf part of each dp_packet, while 
ovs_rte_pktmbuf_init
+ * initializes some OVS specific fields of dp_packet.
+ */
+rte_mempool_obj_iter(dmp->mp, ovs_rte_pktmbuf_init, NULL);


[Darrell] Can this be moved inside 
if (dmp->mp) {
VLOG_DBG("Allocated \"%s\" mempool with %u mbufs", mp_name,
 dmp->mp_size);
}….. 

for clarity?


+}
 return dmp;
 }
 } while (!mp_exists &&
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=Bf45eQULq41ut2PpTtt6Dah9xN86c0suku7rL1WVaTs&s=2HhdsumV1sIAkn7BT6u3jzjIPghy-48d2IgkqXemk8c&e=
 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 1/4] netdev-dpdk: fix mempool management with vhu client.

2017-09-26 Thread Darrell Ball



On 9/26/17, 8:04 AM, "ovs-dev-boun...@openvswitch.org on behalf of 
antonio.fische...@intel.com"  wrote:

In a PVP test where vhostuser ports are configured as
clients, OvS crashes when QEMU is launched.
This patch avoids the repeated calls to netdev_change_seq_changed
after the requested mempool is already acquired.

CC: Ciara Loftus 
CC: Kevin Traynor 
CC: Aaron Conole 
Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each 
port.")
Signed-off-by: Antonio Fischetti 
---
To replicate the bug scenario:

[Darrell] Just curious, but reproducibility with below steps ?; what about 
using libvirt ?
   Do we have the stacktrace ?
 


 PVP test setup
 --
CLIENT_SOCK_DIR=/tmp
SOCK0=dpdkvhostuser0
SOCK1=dpdkvhostuser1

1 PMD
Add 2 dpdk ports, n_rxq=1
Add 2 vhu ports both of type dpdkvhostuserclient and specify 
vhost-server-path
 ovs-vsctl set Interface dpdkvhostuser0 
options:vhost-server-path="$CLIENT_SOCK_DIR/$SOCK0"
 ovs-vsctl set Interface dpdkvhostuser1 
options:vhost-server-path="$CLIENT_SOCK_DIR/$SOCK1"

Set port-based rules: dpdk0 <--> vhu0 and dpdk1 <--> vhu1
 add-flow br0 in_port=1,action=output:3
 add-flow br0 in_port=3,action=output:1
 add-flow br0 in_port=4,action=output:2
 add-flow br0 in_port=2,action=output:4

 Launch QEMU
 ---
As OvS vhu ports are acting as clients, we must specify 'server' in the 
next command.
VM_IMAGE=

 sudo -E taskset 0x3F00 $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 -name 
us-vhost-vm1 -cpu host -enable-kvm -m 4096M -object 
memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa 
node,memdev=mem -mem-prealloc -smp 4 -drive file=$VM_IMAGE -chardev 
socket,id=char0,path=$CLIENT_SOCK_DIR/$SOCK0,server -netdev 
type=vhost-user,id=mynet1,chardev=char0,vhostforce -device 
virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off -chardev 
socket,id=char1,path=$CLIENT_SOCK_DIR/$SOCK1,server -netdev 
type=vhost-user,id=mynet2,chardev=char1,vhostforce -device 
virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off --nographic

 Expected behavior
 -
With this fix OvS shouldn't crash.
---
 lib/netdev-dpdk.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 648d719..2f5ec71 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -549,7 +549,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
   dmp->socket_id);
 if (dmp->mp) {
 VLOG_DBG("Allocated \"%s\" mempool with %u mbufs", mp_name,
- dmp->mp_size);
+dmp->mp_size);
 } else if (rte_errno == EEXIST) {
 /* A mempool with the same name already exists.  We just
  * retrieve its pointer to be returned to the caller. */
@@ -571,6 +571,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
  * initializes some OVS specific fields of dp_packet.
  */
 rte_mempool_obj_iter(dmp->mp, ovs_rte_pktmbuf_init, NULL);
+
 return dmp;
 }
 } while (!mp_exists &&
@@ -3247,11 +3248,14 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk 
*dev)
 
 netdev_dpdk_remap_txqs(dev);
 
-err = netdev_dpdk_mempool_configure(dev);
-if (err) {
-return err;
-} else {
-netdev_change_seq_changed(&dev->up);
+if (dev->requested_socket_id != dev->socket_id
+|| dev->requested_mtu != dev->mtu) {


[Darrell] Can this check be moved to common code in 
netdev_dpdk_mempool_configure() and using a special return value
that skips netdev_change_seq_changed(&dev->up); in the callers, if
the call would be redundant ?


+err = netdev_dpdk_mempool_configure(dev);
+if (err) {
+return err;
+} else {
+netdev_change_seq_changed(&dev->up);
+}
 }
 
 if (netdev_dpdk_get_vid(dev) >= 0) {
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org

https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=eAz1yiSbXvMuAA292xJwliYRlu6n1Y0nqm0MuU2JTXY&s=4gLg6fgwmWYEM4s5v4S0NjY52DBbxKTmaQalT8194NE&e=
 


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] (no subject)

2017-09-26 Thread Rick Mikulski

hi Dev

http://bit.ly/2fr4D80





Best Wishes

Rick
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [branch-2.7 2/2] Prepare for 2.7.4.

2017-09-26 Thread Justin Pettit


> On Sep 26, 2017, at 8:45 AM, Ben Pfaff  wrote:
> 
> On Mon, Sep 25, 2017 at 11:32:50PM -0700, Justin Pettit wrote:
>> Signed-off-by: Justin Pettit 
> 
> Acked-by: Ben Pfaff 

Thanks.  I pushed the series to branch-2.7.

--Justin



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] Help to Modify OVS Source Code

2017-09-26 Thread Ashish Varma

You can look at the "handle_packet_out" function at "ofproto.c" which
handles the "PACKET_OUT" message from the controller. (where you can check
for the LLDP eth type in the packet)

On Tue, Sep 26, 2017 at 9:56 AM, Mohammed Kamel  wrote:

> Hello,
>
> I am new at mining and OVS. I need help with modifying the OVS source code
> to make the following: I want to print a message on the counsel each time
> the OVS switch receive a LLDP packet from the controller. and how to use
> this modified code in the mininet?? I really need your help with that ASAP.
>
> Thank you,
> Mohammed
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] rhel: fix log directory permissions

2017-09-26 Thread Flavio Leitner

On Mon, 25 Sep 2017 14:42:48 -0400
Aaron Conole  wrote:

> Flavio Leitner  writes:
> 
> > On Fri, 22 Sep 2017 09:44:18 -0400
> > Aaron Conole  wrote:
> >  
> >> When the logrotate script runs, and Open vSwitch is running as a non-root
> >> user, the /var/log/openvswitch directory doesn't have other rx bits set.
> >> This means the reopen attempt will fail with "permission denied", even 
> >> though
> >> the default logrotate configuration creates a new log file with the
> >> appropriate attributes.
> >> 
> >> This change sets the r/x bits for other on /var/log/messages  
> >
> > /var/log/openvswitch? :-)  
> 
> D'oh!  Let's blame it on the problem between the keyboard and chair.

Old habits die hard

-- 
Flavio

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] Help to Modify OVS Source Code

2017-09-26 Thread Mohammed Kamel

Hello,

I am new at mining and OVS. I need help with modifying the OVS source code to 
make the following: I want to print a message on the counsel each time the OVS 
switch receive a LLDP packet from the controller. and how to use this modified 
code in the mininet?? I really need your help with that ASAP. 

Thank you,
Mohammed
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] New Production Norwegian Mackerels

2017-09-26 Thread Bonesca - Jona

    [ View in browser ]( http://r.newsletter.bonescamail.nl/7xa28jxb2oatrf.html 
)   
 
[  ]( http://r.newsletter.bonescamail.nl/track/click/vp48y7p7xaoatrd ) 
 
New production Purse Seined Norwegian Mackerels, we are loading this week for 
arrival Holland end next week :
 
Topquality fish, also very suitable for smoking 25 % fatcontent . 
 
300-500 average size 380 grs  
400-600 average size 450 grs 
 
Please send us your price/volume inquiries.    








   [ Click here for complete overview latest offers
Klicken Sie hier für die komplette Liste der letzten Angebote
Klik hier voor het complete overzicht recente aanbiedngen
Cliquez ici pour la liste complète des offres récentes ]( 
http://r.newsletter.bonescamail.nl/track/click/vp48y7p8pqoatrd )     
This email was sent to d...@openvswitch.org
You received this email because you are registered with Bonesca Import en 
Export BV
 
[ Unsubscribe here ]( http://r.newsletter.bonescamail.nl/7xa28jxb2oatrg.html )  

Sent by
[  ]( http://r.newsletter.bonescamail.nl/track/click/vp48y7p9i6oatrd )     
© 2017 Bonesca Import en Export BV  

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [branch-2.7 2/2] Prepare for 2.7.4.

2017-09-26 Thread Ben Pfaff

On Mon, Sep 25, 2017 at 11:32:50PM -0700, Justin Pettit wrote:
> Signed-off-by: Justin Pettit 

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [branch-2.7 1/2] Set release date for 2.7.3.

2017-09-26 Thread Ben Pfaff

On Mon, Sep 25, 2017 at 11:32:49PM -0700, Justin Pettit wrote:
> Signed-off-by: Justin Pettit 

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] Proposal for enabling dp_hash irrespective of OF version

2017-09-26 Thread Jan Scheurich

Hi Ben,

As usual there is a trade-off scalability vs. max performance: The dp_hash 
selection method avoids the upcalls and micro-megaflows for the price of an 
additional recirculation.

I guess that is why Jarno didn't want to make it the default selection method 
in the first place. It might have broken characteristics assumed by previous 
users of select groups.

Personally I have no strong objection against hard-coding the dp_hash scheme as 
new default, but there might be users whose use cases do not require 
scalability and determinism.

BR, Jan

> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: Monday, 25 September, 2017 18:48
> To: Jan Scheurich 
> Cc: Nitin Katiyar ; ovs-dev@openvswitch.org
> Subject: Re: [ovs-dev] Proposal for enabling dp_hash irrespective of OF 
> version
> 
> If the new selection method is superior to the default one, is there a
> reason to require the controller to select it at all?
> 
> On Mon, Sep 25, 2017 at 04:28:26PM +, Jan Scheurich wrote:
> > Hi Ben,
> >
> > The current hard-coded default select group implementation requires every 
> > single L4 flow to be load-balanced in an upcall and creates
> a dedicated megaflow for every 5-tuple. This is clearly not scalable in 
> deployments where the number of parallel L4 flows and the L4 flow
> setup rate is unknown and cannot be controlled (e.g. in Telco cloud 
> deployments where the VNFs carry end-user traffic).
> >
> > We need a scalable select group implementation to implement distributed 
> > ECMP L3 forwarding in OVS. The dp_hash based
> implementation is mostly already in place, thanks to Jarno, but its use is 
> unfortunately tied to an OF 1.5 Group Mod extension that our
> controllers do not support.
> >
> > Our aim with this proposal is to provide a scalable select group 
> > implementation in OVS for any OpenFlow controller. As there is no
> functional difference between the original selection method and the dp_hash 
> based one, I don't see a reason why the controller should
> have to be involved for choosing one or the other. This is different in 
> nature from the extension to specify the hash fields for the bucket
> selection.
> >
> > As Cloud provider we would like to be able to configure OVS to by default 
> > use apply the scalable dp_hash selection method, unless the
> controller specifies something else. A natural place seems to be to add this 
> as a bridge property.
> >
> > Regards, Jan
> >
> > N.B.  The pre-OF 1.5 Group Mod message simply has no extension mechanism to 
> > carry addition group properties. We would have to add
> a proprietary new NX Group Mod message, which would not make life much 
> simpler for controllers.
> >
> > > -Original Message-
> > > From: ovs-dev-boun...@openvswitch.org 
> > > [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Ben Pfaff
> > > Sent: Monday, 25 September, 2017 17:31
> > > To: Nitin Katiyar 
> > >
> > > This proposes adding a default selection method that could be set via
> > > OVSDB.  This would probably work (I have not thought through the
> > > implications), but the usual way that we would solve this kind of thing
> > > in OVS is by making the features of the OF1.5 group_mod available in
> > > earlier versions.
> >
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 4/4] netdev-dpdk: Reword mp_size as n_mbufs.

2017-09-26 Thread antonio . fischetti

The mp_size parameter to be passed to rte mempool creation
functions is meant to contain the number of elements inside
the requested mempool. So it is renamed as n_mbufs.

CC: Ciara Loftus 
CC: Kevin Traynor 
CC: Aaron Conole 
Signed-off-by: Antonio Fischetti 
---
DPDK doc for rte_pktmbuf_pool_create() at
 http://dpdk.org/doc/api/rte__mbuf_8h.html
DPDK doc for rte_mempool_create() at
 http://dpdk.org/doc/api/rte__mempool_8h.html

 lib/netdev-dpdk.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 7c673ec..46862df 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -308,7 +308,7 @@ struct dpdk_mp {
 int mtu;
 int socket_id;
 char if_name[IFNAMSIZ];
-unsigned mp_size;
+unsigned n_mbufs;   /* Number of mbufs inside the mempool. */
 struct ovs_list list_node OVS_GUARDED_BY(dpdk_mp_mutex);
 };
 
@@ -500,11 +500,11 @@ dpdk_mp_name(struct dpdk_mp *dmp)
 uint32_t h = hash_string(dmp->if_name, 0);
 char *mp_name = xcalloc(RTE_MEMPOOL_NAMESIZE, sizeof *mp_name);
 int ret = snprintf(mp_name, RTE_MEMPOOL_NAMESIZE, "ovs_%x_%d_%u",
-   h, dmp->mtu, dmp->mp_size);
+   h, dmp->mtu, dmp->n_mbufs);
 if (ret < 0 || ret >= RTE_MEMPOOL_NAMESIZE) {
 VLOG_ERR("Failed to generate a mempool name for \"%s\". "
 "Hash:0x%x, mtu:%d, mbufs:%u",
-dmp->if_name, h, dmp->mtu, dmp->mp_size);
+dmp->if_name, h, dmp->mtu, dmp->n_mbufs);
 return NULL;
 }
 return mp_name;
@@ -522,13 +522,13 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
 ovs_strzcpy(dmp->if_name, dev->up.name, IFNAMSIZ);
 
 /*
- * XXX: rough estimation of memory required for port:
+ * XXX: rough estimation of number of mbufs required for this port:
  * 
  * + 
  * + 
  * + 
  */
-dmp->mp_size = dev->requested_n_rxq * dev->requested_rxq_size
+dmp->n_mbufs = dev->requested_n_rxq * dev->requested_rxq_size
 + dev->requested_n_txq * dev->requested_txq_size
 + MIN(RTE_MAX_LCORE, dev->requested_n_rxq) * NETDEV_MAX_BURST
 + MIN_NB_MBUF;
@@ -540,10 +540,10 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
 
 VLOG_DBG("Requesting a mempool of %u mbufs for netdev %s "
  "with %d Rx and %d Tx queues.",
- dmp->mp_size, dev->up.name,
+ dmp->n_mbufs, dev->up.name,
  dev->requested_n_rxq, dev->requested_n_txq);
 
-dmp->mp = rte_pktmbuf_pool_create(mp_name, dmp->mp_size,
+dmp->mp = rte_pktmbuf_pool_create(mp_name, dmp->n_mbufs,
   MP_CACHE_SZ,
   sizeof (struct dp_packet)
  - sizeof (struct rte_mbuf),
@@ -552,7 +552,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
   dmp->socket_id);
 if (dmp->mp) {
 VLOG_DBG("Allocated \"%s\" mempool with %u mbufs", mp_name,
-dmp->mp_size);
+dmp->n_mbufs);
 } else if (rte_errno == EEXIST) {
 /* A mempool with the same name already exists.  We just
  * retrieve its pointer to be returned to the caller. */
@@ -565,7 +565,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
 mp_exists = true;
 } else {
 VLOG_ERR("Failed mempool \"%s\" create request of %u mbufs",
- mp_name, dmp->mp_size);
+ mp_name, dmp->n_mbufs);
 }
 free(mp_name);
 if (dmp->mp) {
@@ -581,7 +581,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
 return dmp;
 }
 } while (!mp_exists &&
-(rte_errno == ENOMEM && (dmp->mp_size /= 2) >= MIN_NB_MBUF));
+(rte_errno == ENOMEM && (dmp->n_mbufs /= 2) >= MIN_NB_MBUF));
 
 rte_free(dmp);
 return NULL;
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 3/4] netdev-dpdk: log an err message when a mempool name is empty.

2017-09-26 Thread antonio . fischetti

Log an error message when the creation of a name for a
new mempool fails.

CC: Ciara Loftus 
CC: Kevin Traynor 
CC: Aaron Conole 
Signed-off-by: Antonio Fischetti 
---
 lib/netdev-dpdk.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f3f42ee..7c673ec 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -502,6 +502,9 @@ dpdk_mp_name(struct dpdk_mp *dmp)
 int ret = snprintf(mp_name, RTE_MEMPOOL_NAMESIZE, "ovs_%x_%d_%u",
h, dmp->mtu, dmp->mp_size);
 if (ret < 0 || ret >= RTE_MEMPOOL_NAMESIZE) {
+VLOG_ERR("Failed to generate a mempool name for \"%s\". "
+"Hash:0x%x, mtu:%d, mbufs:%u",
+dmp->if_name, h, dmp->mtu, dmp->mp_size);
 return NULL;
 }
 return mp_name;
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 2/4] netdev-dpdk: if mempool already exists don't reinit packet areas.

2017-09-26 Thread antonio . fischetti

Skip initialization of mempool objects if this was already
done in a previous call to dpdk_mp_create.

CC: Ciara Loftus 
CC: Kevin Traynor 
CC: Aaron Conole 
Signed-off-by: Antonio Fischetti 
---
 lib/netdev-dpdk.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2f5ec71..f3f42ee 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -566,12 +566,15 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
 }
 free(mp_name);
 if (dmp->mp) {
-/* rte_pktmbuf_pool_create has done some initialization of the
- * rte_mbuf part of each dp_packet, while ovs_rte_pktmbuf_init
- * initializes some OVS specific fields of dp_packet.
- */
-rte_mempool_obj_iter(dmp->mp, ovs_rte_pktmbuf_init, NULL);
-
+/* If the current mp was already created by a previous call
+ * we don't need to init again all its elements. */
+if (!mp_exists) {
+/* rte_pktmbuf_pool_create has done some initialization of the
+ * rte_mbuf part of each dp_packet, while ovs_rte_pktmbuf_init
+ * initializes some OVS specific fields of dp_packet.
+ */
+rte_mempool_obj_iter(dmp->mp, ovs_rte_pktmbuf_init, NULL);
+}
 return dmp;
 }
 } while (!mp_exists &&
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 1/4] netdev-dpdk: fix mempool management with vhu client.

2017-09-26 Thread antonio . fischetti

In a PVP test where vhostuser ports are configured as
clients, OvS crashes when QEMU is launched.
This patch avoids the repeated calls to netdev_change_seq_changed
after the requested mempool is already acquired.

CC: Ciara Loftus 
CC: Kevin Traynor 
CC: Aaron Conole 
Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.")
Signed-off-by: Antonio Fischetti 
---
To replicate the bug scenario:

 PVP test setup
 --
CLIENT_SOCK_DIR=/tmp
SOCK0=dpdkvhostuser0
SOCK1=dpdkvhostuser1

1 PMD
Add 2 dpdk ports, n_rxq=1
Add 2 vhu ports both of type dpdkvhostuserclient and specify vhost-server-path
 ovs-vsctl set Interface dpdkvhostuser0 
options:vhost-server-path="$CLIENT_SOCK_DIR/$SOCK0"
 ovs-vsctl set Interface dpdkvhostuser1 
options:vhost-server-path="$CLIENT_SOCK_DIR/$SOCK1"

Set port-based rules: dpdk0 <--> vhu0 and dpdk1 <--> vhu1
 add-flow br0 in_port=1,action=output:3
 add-flow br0 in_port=3,action=output:1
 add-flow br0 in_port=4,action=output:2
 add-flow br0 in_port=2,action=output:4

 Launch QEMU
 ---
As OvS vhu ports are acting as clients, we must specify 'server' in the next 
command.
VM_IMAGE=

 sudo -E taskset 0x3F00 $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 -name 
us-vhost-vm1 -cpu host -enable-kvm -m 4096M -object 
memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa 
node,memdev=mem -mem-prealloc -smp 4 -drive file=$VM_IMAGE -chardev 
socket,id=char0,path=$CLIENT_SOCK_DIR/$SOCK0,server -netdev 
type=vhost-user,id=mynet1,chardev=char0,vhostforce -device 
virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off -chardev 
socket,id=char1,path=$CLIENT_SOCK_DIR/$SOCK1,server -netdev 
type=vhost-user,id=mynet2,chardev=char1,vhostforce -device 
virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off --nographic

 Expected behavior
 -
With this fix OvS shouldn't crash.
---
 lib/netdev-dpdk.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 648d719..2f5ec71 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -549,7 +549,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
   dmp->socket_id);
 if (dmp->mp) {
 VLOG_DBG("Allocated \"%s\" mempool with %u mbufs", mp_name,
- dmp->mp_size);
+dmp->mp_size);
 } else if (rte_errno == EEXIST) {
 /* A mempool with the same name already exists.  We just
  * retrieve its pointer to be returned to the caller. */
@@ -571,6 +571,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
  * initializes some OVS specific fields of dp_packet.
  */
 rte_mempool_obj_iter(dmp->mp, ovs_rte_pktmbuf_init, NULL);
+
 return dmp;
 }
 } while (!mp_exists &&
@@ -3247,11 +3248,14 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
 
 netdev_dpdk_remap_txqs(dev);
 
-err = netdev_dpdk_mempool_configure(dev);
-if (err) {
-return err;
-} else {
-netdev_change_seq_changed(&dev->up);
+if (dev->requested_socket_id != dev->socket_id
+|| dev->requested_mtu != dev->mtu) {
+err = netdev_dpdk_mempool_configure(dev);
+if (err) {
+return err;
+} else {
+netdev_change_seq_changed(&dev->up);
+}
 }
 
 if (netdev_dpdk_get_vid(dev) >= 0) {
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next v10] openvswitch: enable NSH support

2017-09-26 Thread Jiri Benc

On Tue, 26 Sep 2017 21:52:41 +0800, Yang, Yi wrote:
> > +   return ((ret != 0) ? false : true);
> 
> But I don't think this is a problematic line from my understanding,

Why not:

return ((ret != 0 == true) ? false : true) == true;

?

Sigh. This is equal to:

return !ret;

which you should use.

 Jiri
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3 2/4] netdev: Remove unused may_steal.

2017-09-26 Thread Ilya Maximets

On 25.09.2017 18:23, Bodireddy, Bhanuprakash wrote:
>> Not needed anymore because 'may_steal' already handled on dpif-netdev
>> layer and always true;
> 
> LGTM.
>  'may_steal' is still used by QoS policer in netdev layer. I am not familiar 
> with Policer functionality but
> Just wondering may_steal isn't needed with this change.

This was added by a recent commit
"netdev-dpdk: Execute QoS Checking before copying to mbuf."

And yes, above commit will be mostly reverted, because 'may_steal' is always 
true.

> 
> - Bhanuprakash.
> 
>>
>> Signed-off-by: Ilya Maximets 
>> ---
>> lib/dpif-netdev.c |  2 +-
>> lib/netdev-bsd.c  |  4 ++--
>> lib/netdev-dpdk.c | 25 +++--
>> lib/netdev-dummy.c|  4 ++--
>> lib/netdev-linux.c|  4 ++--
>> lib/netdev-provider.h |  7 +++
>> lib/netdev.c  | 12 
>> lib/netdev.h  |  2 +-
>> 8 files changed, 26 insertions(+), 34 deletions(-)
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index a2a25be..dcf55f3
>> 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -3121,7 +3121,7 @@ dp_netdev_pmd_flush_output_on_port(struct
>> dp_netdev_pmd_thread *pmd,
>> tx_qid = pmd->static_tx_qid;
>> }
>>
>> -netdev_send(p->port->netdev, tx_qid, &p->output_pkts, true,
>> dynamic_txqs);
>> +netdev_send(p->port->netdev, tx_qid, &p->output_pkts,
>> + dynamic_txqs);
>> dp_packet_batch_init(&p->output_pkts);
>> }
>>
>> diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c index 8a4cdb3..4f243b5 
>> 100644
>> --- a/lib/netdev-bsd.c
>> +++ b/lib/netdev-bsd.c
>> @@ -680,7 +680,7 @@ netdev_bsd_rxq_drain(struct netdev_rxq *rxq_)
>>  */
>> static int
>> netdev_bsd_send(struct netdev *netdev_, int qid OVS_UNUSED,
>> -struct dp_packet_batch *batch, bool may_steal,
>> +struct dp_packet_batch *batch,
>> bool concurrent_txq OVS_UNUSED)  {
>> struct netdev_bsd *dev = netdev_bsd_cast(netdev_); @@ -728,7 +728,7
>> @@ netdev_bsd_send(struct netdev *netdev_, int qid OVS_UNUSED,
>> }
>>
>> ovs_mutex_unlock(&dev->mutex);
>> -dp_packet_delete_batch(batch, may_steal);
>> +dp_packet_delete_batch(batch, true);
>>
>> return error;
>> }
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 1d82bca..8e3158f
>> 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -1872,12 +1872,12 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid,
>> struct dp_packet_batch *batch)  static int  netdev_dpdk_vhost_send(struct
>> netdev *netdev, int qid,
>>struct dp_packet_batch *batch,
>> -   bool may_steal, bool concurrent_txq OVS_UNUSED)
>> +   bool concurrent_txq OVS_UNUSED)
>> {
>>
>> -if (OVS_UNLIKELY(!may_steal || batch->packets[0]->source !=
>> DPBUF_DPDK)) {
>> +if (OVS_UNLIKELY(batch->packets[0]->source != DPBUF_DPDK)) {
>> dpdk_do_tx_copy(netdev, qid, batch);
>> -dp_packet_delete_batch(batch, may_steal);
>> +dp_packet_delete_batch(batch, true);
>> } else {
>> dp_packet_batch_apply_cutlen(batch);
>> __netdev_dpdk_vhost_send(netdev, qid, batch->packets, batch-
>>> count); @@ -1887,11 +1887,11 @@ netdev_dpdk_vhost_send(struct netdev
>> *netdev, int qid,
>>
>> static inline void
>> netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
>> -   struct dp_packet_batch *batch, bool may_steal,
>> +   struct dp_packet_batch *batch,
>>bool concurrent_txq)  {
>> if (OVS_UNLIKELY(!(dev->flags & NETDEV_UP))) {
>> -dp_packet_delete_batch(batch, may_steal);
>> +dp_packet_delete_batch(batch, true);
>> return;
>> }
>>
>> @@ -1900,12 +1900,11 @@ netdev_dpdk_send__(struct netdev_dpdk *dev,
>> int qid,
>> rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>> }
>>
>> -if (OVS_UNLIKELY(!may_steal ||
>> - batch->packets[0]->source != DPBUF_DPDK)) {
>> +if (OVS_UNLIKELY(batch->packets[0]->source != DPBUF_DPDK)) {
>> struct netdev *netdev = &dev->up;
>>
>> dpdk_do_tx_copy(netdev, qid, batch);
>> -dp_packet_delete_batch(batch, may_steal);
>> +dp_packet_delete_batch(batch, true);
>> } else {
>> int dropped;
>> int cnt = batch->count;
>> @@ -1933,12 +1932,11 @@ netdev_dpdk_send__(struct netdev_dpdk *dev,
>> int qid,
>>
>> static int
>> netdev_dpdk_eth_send(struct netdev *netdev, int qid,
>> - struct dp_packet_batch *batch, bool may_steal,
>> - bool concurrent_txq)
>> + struct dp_packet_batch *batch, bool
>> + concurrent_txq)
>> {
>> struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>
>> -netdev_dpdk_send__(dev, qid, batch, may_steal, concurrent_txq);
>> +netdev_dpdk_send__(dev, qid, batch, concurrent_txq);
>> return 0;
>> }
>>
>> @@ -2905,8 +2903,7 @@ dpdk_ring_open(const char dev_name[],
>> dpdk_port_t *eth_por

Re: [ovs-dev] [PATCH v3 1/4] dpif-netdev: Output packet batching.

2017-09-26 Thread Ilya Maximets

On 25.09.2017 18:07, Bodireddy, Bhanuprakash wrote:
> Hi Ilya,
> 
> This series needs to be rebased.  Few comments below.

Hi. Thanks for review.

I just returned from vacation and starting working on this.

Comments inline.

Best regards, Ilya Maximets.

> 
>> While processing incoming batch of packets they are scattered across many
>> per-flow batches and sent separately.
>>
>> This becomes an issue while using more than a few flows.
>>
>> For example if we have balanced-tcp OvS bonding with 2 ports there will be
>> 256 datapath internal flows for each dp_hash pattern. This will lead to
>> scattering of a single recieved batch across all of that 256 per-flow 
>> batches and
>> invoking send for each packet separately. This behaviour greatly degrades
>> overall performance of netdev_send because of inability to use advantages of
>> vectorized transmit functions.
>> But the half (if 2 ports in bonding) of datapath flows will have the same 
>> output
>> actions. This means that we can collect them in a single place back and send 
>> at
>> once using single call to netdev_send. This patch introduces per-port packet
>> batch for output packets for that purpose.
>>
>> 'output_pkts' batch is thread local and located in send port cache.
>>
>> Signed-off-by: Ilya Maximets 
>> ---
>> lib/dpif-netdev.c | 104
>> ++
>> 1 file changed, 82 insertions(+), 22 deletions(-)
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index e2cd931..a2a25be
>> 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -502,6 +502,7 @@ struct tx_port {
>> int qid;
>> long long last_used;
>> struct hmap_node node;
>> +struct dp_packet_batch output_pkts;
>> };
>>
>> /* PMD: Poll modes drivers.  PMD accesses devices via polling to eliminate
>> @@ -633,9 +634,10 @@ static void dp_netdev_execute_actions(struct
>> dp_netdev_pmd_thread *pmd,
>>   size_t actions_len,
>>   long long now);  static void 
>> dp_netdev_input(struct
>> dp_netdev_pmd_thread *,
>> -struct dp_packet_batch *, odp_port_t port_no);
>> +struct dp_packet_batch *, odp_port_t port_no,
>> +long long now);
>> static void dp_netdev_recirculate(struct dp_netdev_pmd_thread *,
>> -  struct dp_packet_batch *);
>> +  struct dp_packet_batch *, long long
>> + now);
>>
>> static void dp_netdev_disable_upcall(struct dp_netdev *);  static void
>> dp_netdev_pmd_reload_done(struct dp_netdev_pmd_thread *pmd); @@ -
>> 667,6 +669,9 @@ static void dp_netdev_add_rxq_to_pmd(struct
>> dp_netdev_pmd_thread *pmd,  static void
>> dp_netdev_del_rxq_from_pmd(struct dp_netdev_pmd_thread *pmd,
>>struct rxq_poll *poll)
>> OVS_REQUIRES(pmd->port_mutex);
>> +static void
>> +dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread
>> *pmd,
>> +   long long now);
>> static void reconfigure_datapath(struct dp_netdev *dp)
>> OVS_REQUIRES(dp->port_mutex);
>> static bool dp_netdev_pmd_try_ref(struct dp_netdev_pmd_thread *pmd);
>> @@ -2809,6 +2814,7 @@ dpif_netdev_execute(struct dpif *dpif, struct
>> dpif_execute *execute)
>> struct dp_netdev *dp = get_dp_netdev(dpif);
>> struct dp_netdev_pmd_thread *pmd;
>> struct dp_packet_batch pp;
>> +long long now = time_msec();
> 
> [BHANU] Calling time_msec() can be moved little down in this function, may be 
> after the 'probe' check.

Yes, sure. I'm going to follow suggestion from Jan and rebase this series
on top of other my patch https://patchwork.ozlabs.org/patch/800276/
(dpif-netdev: Keep latest measured time for PMD thread). Time update
moved closer to action executing in that patch.

I'll reply to Jan's mails additionally.

> 
>>
>> if (dp_packet_size(execute->packet) < ETH_HEADER_LEN ||
>> dp_packet_size(execute->packet) > UINT16_MAX) { @@ -2851,8 +2857,8
>> @@ dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute)
>>
>> dp_packet_batch_init_packet(&pp, execute->packet);
>> dp_netdev_execute_actions(pmd, &pp, false, execute->flow,
>> -  execute->actions, execute->actions_len,
>> -  time_msec());
>> +  execute->actions, execute->actions_len, now);
>> +dp_netdev_pmd_flush_output_packets(pmd, now);
> 
> [BHANU] Is this code path mostly run in non-pmd thread context? I can only 
> think of bfd case where the 
> where all the above runs in monitoring thread(non-pmd) context.

Yes, I think so, but I'm not sure is there some weird cases.
Also, there is no explicit restrictions to run this from the PMD thread.

P.S. In my cases 'main' thread is the main consumer of this function for
 sending LACP messages.

> 
>>
>> if (pmd->core_id == NON_PMD_CORE

Re: [ovs-dev] [PATCH] rhel: fix log directory permissions

2017-09-26 Thread Timothy M. Redaelli

On 09/22/2017 03:44 PM, Aaron Conole wrote:
> When the logrotate script runs, and Open vSwitch is running as a non-root
> user, the /var/log/openvswitch directory doesn't have other rx bits set.
> This means the reopen attempt will fail with "permission denied", even though
> the default logrotate configuration creates a new log file with the
> appropriate attributes.
> 
> This change sets the r/x bits for other on /var/log/messages
> 
> Signed-off-by: Aaron Conole 
> Tested-by: Jean Hsiao 

Acked-by: Timothy Redaelli 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next v10] openvswitch: enable NSH support

2017-09-26 Thread Yang, Yi

On Tue, Sep 26, 2017 at 07:05:34PM +0800, Jiri Benc wrote:
> On Tue, 26 Sep 2017 12:47:16 +0800, Yi Yang wrote:
> > +   return ((ret != 0) ? false : true);

Jiri, I appriciate your review very carefully and professionally from my
heart for 10 versions, that is really very very big effort.

I carefully thought this comment:

"""
> + return ((ret != 0) ? false : true);

Too little coffee or too late in the night, right? ;-)
"""

But I don't think this is a problematic line from my understanding,
because validate_nsh is declared to return bool. I had thought you were
saying "it was late, it was time for you to leave office, so no more
comments", this is my readout :-)

So the best comment you can give out here is one line you prefer :-)

I'm not a kernel developer full time, I'm adapting to several coding
style at the time in kernel, OVS and Opendaylight.

> 
> I'm not going to review this version but this caught my eye - I pointed
> out this silly construct in my review of v9. I can understand that
> working late in the night and rewriting the code back and forth, one
> could end up with such construct and overlook it at the final
> self-review before submission. Happens to everyone.
> 
> But leaving this after a review pointed it out means you're not paying
> enough attention to your work. Even the fact that you sent v10 so
> shortly after v9 means you did not spend the needed time on reflecting
> on the review and that you did not properly test the new version. And
> I told you exactly this before.
> 
> Honestly, I'm starting to be tired with reviewing your patch again and
> again and pointing out silly mistakes like this one and repeating basic
> things to you again and again.

I did test it in my sfc test environment, I don't see any warning, error
during build and runtime, it can work well.

Sorry if missing your comments, that is understanding issue in most
cases but not I don't take your comments seriously. You know English
isn't my native language, it is a big challenge for me to understand
your comments very well. Neverthless, code is our common language, I can
understand code better than your descriptive words :-)

> 
>  Jiri
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH RFC 39/52] ovn-sbctl: Allow retries by default.

2017-09-26 Thread Miguel Angel Ajo Pelayo

Wow, great :)

Having random cap times within a range sounds like a nice improvement,
that's a good idea.

On Mon, Sep 25, 2017 at 2:34 PM, Ben Pfaff  wrote:

> OVS has that already too, but the retry time caps out at 8 seconds
> currently.  It goes from 1 to 2 to 4 to 8 seconds.  The maximum could
> easily be raised.  I don't know a good way to decide what the maximum
> should be.
>
> Possibly, it should be a random time within a range, to avoid
> synchronizing clients' retries when a server goes down.  That is not yet
> implemented (I just thought of it now).
>
> On Mon, Sep 25, 2017 at 10:53:09AM +0200, Miguel Angel Ajo Pelayo wrote:
> > It makes sense.
> >
> > As an idea for future improvements (also applicable to the python client
> > and the ovsdbapp): something that I've found valuable in distributed
> > systems is the ability to set an exponential retry time (with a ceiling)
> > because it's a good failsafe mechanism when one of the clients sends some
> > sort of big/buggy request that it's going to trigger a failure/long
> > execution time on the server and finally timeout.
> >
> > The exponential backoff makes it more likely than the server will
> recover,
> > or have time to serve other clients.
> >
> > On Fri, Sep 22, 2017 at 9:48 PM, Ben Pfaff  wrote:
> >
> > > Thank you for the review.
> > >
> > > There isn't currently a way to control the number of retries.  If that
> > > is a valuable feature, then it could be added.
> > >
> > > You can control the timeout for these utilities with the --timeout
> > > option (or use the "timeout" program from GNU coreutils).
> > >
> > > On Thu, Sep 21, 2017 at 01:17:09PM +0200, Miguel Angel Ajo Pelayo
> wrote:
> > > > Makes sense. Is there any way to control the number of retries, and
> the
> > > > retry time?
> > > >
> > > > Acked-by: Miguel Angel Ajo 
> > > >
> > > > On Wed, Sep 20, 2017 at 12:01 AM, Ben Pfaff  wrote:
> > > >
> > > > > Most of the OVS database-manipulation utilities (ovn-sbctl,
> ovn-nbctl,
> > > > > ovs-vsctl, vtep-ctl) don't retry their connections by default
> because
> > > > > they assume that the database is either up or down and likely to
> stay
> > > > > that way.  The OVN southbound database, however, is a likely
> candidate
> > > > > for high availability clustering, so that even if it appears to be
> > > > > down for a moment it will be available again soon.  So, prepare for
> > > > > the clustering implementation by enabling retry by default in
> > > > > ovn-sbctl.
> > > > >
> > > > > Signed-off-by: Ben Pfaff 
> > > > > ---
> > > > >  ovn/utilities/ovn-sbctl.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/ovn/utilities/ovn-sbctl.c b/ovn/utilities/ovn-sbctl.c
> > > > > index 7af5863b08fc..f4452fa6ee85 100644
> > > > > --- a/ovn/utilities/ovn-sbctl.c
> > > > > +++ b/ovn/utilities/ovn-sbctl.c
> > > > > @@ -120,7 +120,7 @@ main(int argc, char *argv[])
> > > > >  }
> > > > >
> > > > >  /* Initialize IDL. */
> > > > > -idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false,
> > > false);
> > > > > +idl = the_idl = ovsdb_idl_create(db, &sbrec_idl_class, false,
> > > true);
> > > > >  run_prerequisites(commands, n_commands, idl);
> > > > >
> > > > >  /* Execute the commands.
> > > > > --
> > > > > 2.10.2
> > > > >
> > > > > ___
> > > > > dev mailing list
> > > > > d...@openvswitch.org
> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > > >
> > >
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v2 4/4] conntrack: read current nr of connections.

2017-09-26 Thread antonio . fischetti

Example:
  ovs-appctl dpctl/ct-get-glbl-cfg totconn

CC: Kevin Traynor 
Signed-off-by: Antonio Fischetti 
---
 lib/conntrack.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 684e468..b04ffb9 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2412,13 +2412,23 @@ ct_rd_max_conn(struct conntrack *ct, uint32_t *cur_val) 
{
 return 0;
 }
 
+/* Read the total nr of connections currently managed. */
+static int
+rd_tot_conn(struct conntrack *ct, uint32_t *cur_val) {
+*cur_val = atomic_count_get(&ct->n_conn);
+return 0;
+}
+
 /* List of managed parameters. */
 /* Max nr of connections managed by CT module. */
 #define CT_RW_MAX_CONN "maxconn"
+/* Total nr of connections currently managed by CT module. */
+#define CT_RW_TOT_CONN "totconn"
 
 /* List of parameters that can be read/written at run-time. */
 struct ct_cfg_params cfg_params[] = {
 {CT_RW_MAX_CONN, ct_wr_max_conn, ct_rd_max_conn},
+{CT_RW_TOT_CONN, NULL, rd_tot_conn},
 };
 
 int
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v2 2/4] conntrack: add commands to r/w CT parameters.

2017-09-26 Thread antonio . fischetti

Add infrastructure to implement:
 - dpctl/ct-get-glbl-cfg to read a current value of available
   conntrack parameters.
 - dpctl/ct-set-glbl-cfg to set a value to the available conntrack
   parameters.

CC: Kevin Traynor 
Signed-off-by: Antonio Fischetti 
---
 lib/conntrack.c | 60 ++
 lib/conntrack.h |  3 +++
 lib/ct-dpif.c   | 28 
 lib/ct-dpif.h   |  2 ++
 lib/dpctl.c | 76 +
 lib/dpif-netdev.c   | 19 ++
 lib/dpif-netlink.c  |  2 ++
 lib/dpif-provider.h |  4 +++
 8 files changed, 194 insertions(+)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 419cb1d..391d008 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -67,6 +67,13 @@ enum ct_alg_mode {
 CT_TFTP_MODE,
 };
 
+/* Variable to manage read/write on CT parameters. */
+struct ct_cfg_params {
+char *cli;  /* Parameter name in human format. */
+int (*wr)(struct conntrack *, uint32_t);
+int (*rd)(struct conntrack *, uint32_t *);
+};
+
 static bool conn_key_extract(struct conntrack *, struct dp_packet *,
  ovs_be16 dl_type, struct conn_lookup_ctx *,
  uint16_t zone);
@@ -2391,6 +2398,59 @@ conntrack_flush(struct conntrack *ct, const uint16_t 
*zone)
 return 0;
 }
 
+/* List of parameters that can be read/written at run-time. */
+struct ct_cfg_params cfg_params[] = {};
+
+int
+conntrack_set_param(struct conntrack *ct,
+const char *set_param)
+{
+uint32_t max_conn;
+char buf[16] = "";
+
+/* Check if the specified param can be managed. */
+for (int i = 0; i < sizeof(cfg_params) / sizeof(struct ct_cfg_params);
+ i++) {
+if (!strncmp(set_param, cfg_params[i].cli,
+strlen(cfg_params[i].cli))) {
+ovs_strzcpy(buf, cfg_params[i].cli, sizeof(buf) - 1);
+strncat(buf, "=%"SCNu32, sizeof(buf) - 1 - strlen(buf));
+if (ovs_scan(set_param, buf, &max_conn)) {
+return (cfg_params[i].wr
+? cfg_params[i].wr(ct, max_conn)
+: EOPNOTSUPP);
+} else {
+return EINVAL;
+}
+}
+}
+/* The entered param didn't match any in the list. */
+VLOG_WARN("%s parameter is not managed by this command.", set_param);
+
+return EINVAL;
+}
+
+int
+conntrack_get_param(struct conntrack *ct,
+const char *get_param, uint32_t *val)
+{
+/* Check if the specified param can be managed. */
+for (int i = 0; i < sizeof(cfg_params) / sizeof(struct ct_cfg_params);
+ i++) {
+if (!strncmp(get_param, cfg_params[i].cli,
+strlen(cfg_params[i].cli))) {
+
+return (cfg_params[i].rd
+? cfg_params[i].rd(ct, val)
+: EOPNOTSUPP);
+}
+}
+/* The entered param didn't match any in the list. */
+VLOG_WARN("%s parameter is not managed by this command.", get_param);
+
+return EINVAL;
+}
+
 /* This function must be called with the ct->resources read lock taken. */
 static struct alg_exp_node *
 expectation_lookup(struct hmap *alg_expectations,
diff --git a/lib/conntrack.h b/lib/conntrack.h
index fbeef1c..4eb9a9a 100644
--- a/lib/conntrack.h
+++ b/lib/conntrack.h
@@ -114,6 +114,9 @@ int conntrack_dump_next(struct conntrack_dump *, struct 
ct_dpif_entry *);
 int conntrack_dump_done(struct conntrack_dump *);
 
 int conntrack_flush(struct conntrack *, const uint16_t *zone);
+int conntrack_set_param(struct conntrack *, const char *set_param);
+int conntrack_get_param(struct conntrack *, const char *get_param,
+uint32_t *val);
 
 /* 'struct ct_lock' is a wrapper for an adaptive mutex.  It's useful to try
  * different types of locks (e.g. spinlocks) */
diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
index c79e69e..599bc57 100644
--- a/lib/ct-dpif.c
+++ b/lib/ct-dpif.c
@@ -127,6 +127,34 @@ ct_dpif_flush(struct dpif *dpif, const uint16_t *zone)
 : EOPNOTSUPP);
 }
 
+int
+ct_dpif_set_param(struct dpif *dpif, const char *set_param)
+{
+if (!set_param) {
+VLOG_DBG("%s: ct_set_param: no input param", dpif_name(dpif));
+return EINVAL;
+}
+VLOG_DBG("%s: ct_set_param: %s", dpif_name(dpif), set_param);
+
+return (dpif->dpif_class->ct_set_param
+? dpif->dpif_class->ct_set_param(dpif, set_param)
+: EOPNOTSUPP);
+}
+
+int
+ct_dpif_get_param(struct dpif *dpif, const char *get_param, uint32_t *val)
+{
+if (!get_param) {
+VLOG_DBG("%s: ct_get_param: no input param", dpif_name(dpif));
+return EINVAL;
+}
+VLOG_DBG("%s: ct_get_param: %s", dpif_name(dpif), get_param);
+
+return (dpif->dpif_class->ct_get_param
+? dpif->dpif_class->ct_get_param(dpif, get_param, val)
+: EOPNOTSUPP);
+}
+
 void
 ct_dpif_entry_uninit(struct ct_dpif_entry *entry)

[ovs-dev] [PATCH v2 3/4] conntrack: r/w upper limit connection value.

2017-09-26 Thread antonio . fischetti

Read/Write the upper limit value for connections.

Example:
   # set a new upper limit
   ovs-appctl dpctl/ct-set-glbl-cfg maxconn=100

   # display cur upper limit
   ovs-appctl dpctl/ct-get-glbl-cfg maxconn

CC: Kevin Traynor 
Signed-off-by: Antonio Fischetti 
---
 lib/conntrack.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 391d008..684e468 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2398,8 +2398,28 @@ conntrack_flush(struct conntrack *ct, const uint16_t 
*zone)
 return 0;
 }
 
+/* Set a new value for the upper limit of connections. */
+static int
+ct_wr_max_conn(struct conntrack *ct, uint32_t new_val) {
+atomic_init(&ct->n_conn_limit, new_val);
+return 0;
+}
+
+/* Read the current upper limit of connections. */
+static int
+ct_rd_max_conn(struct conntrack *ct, uint32_t *cur_val) {
+atomic_read_relaxed(&ct->n_conn_limit, cur_val);
+return 0;
+}
+
+/* List of managed parameters. */
+/* Max nr of connections managed by CT module. */
+#define CT_RW_MAX_CONN "maxconn"
+
 /* List of parameters that can be read/written at run-time. */
-struct ct_cfg_params cfg_params[] = {};
+struct ct_cfg_params cfg_params[] = {
+{CT_RW_MAX_CONN, ct_wr_max_conn, ct_rd_max_conn},
+};
 
 int
 conntrack_set_param(struct conntrack *ct,
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v2 1/4] dpctl: Add a comment to functions retrieving the datapath name.

2017-09-26 Thread antonio . fischetti

Signed-off-by: Antonio Fischetti 
---
 lib/dpctl.c | 28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/lib/dpctl.c b/lib/dpctl.c
index 8951d6e..b6eecf0 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -825,6 +825,9 @@ dpctl_dump_flows(int argc, const char *argv[], struct 
dpctl_params *dpctl_p)
 }
 }
 
+/* The datapath name is not a mandatory parameter for this command.
+ * If it is not specified - so argc == 1 - we retrieve it from the
+ * current setup, assuming only one exists. */
 name = (argc > 1) ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
 if (!name) {
 error = EINVAL;
@@ -960,6 +963,9 @@ dpctl_put_flow(int argc, const char *argv[], enum 
dpif_flow_put_flags flags,
 struct simap port_names;
 int n, error;
 
+/* The datapath name is not a mandatory parameter for this command.
+ * If it is not specified - so argc < 4 - we retrieve it from the
+ * current setup, assuming only one exists. */
 dp_name = argc == 4 ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
 if (!dp_name) {
 return EINVAL;
@@ -1069,6 +1075,9 @@ dpctl_get_flow(int argc, const char *argv[], struct 
dpctl_params *dpctl_p)
 struct ds ds;
 int n, error;
 
+/* The datapath name is not a mandatory parameter for this command.
+ * If it is not specified - so argc < 3 - we retrieve it from the
+ * current setup, assuming only one exists. */
 dp_name = argc == 3 ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
 if (!dp_name) {
 return EINVAL;
@@ -1125,6 +1134,9 @@ dpctl_del_flow(int argc, const char *argv[], struct 
dpctl_params *dpctl_p)
 struct simap port_names;
 int n, error;
 
+/* The datapath name is not a mandatory parameter for this command.
+ * If it is not specified - so argc < 3 - we retrieve it from the
+ * current setup, assuming only one exists. */
 dp_name = argc == 3 ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
 if (!dp_name) {
 return EINVAL;
@@ -1202,6 +1214,9 @@ dpctl_del_flows(int argc, const char *argv[], struct 
dpctl_params *dpctl_p)
 char *name;
 int error;
 
+/* The datapath name is not a mandatory parameter for this command.
+ * If it is not specified - so argc < 2 - we retrieve it from the
+ * current setup, assuming only one exists. */
 name = (argc == 2) ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
 if (!name) {
 return EINVAL;
@@ -1268,6 +1283,9 @@ dpctl_dump_conntrack(int argc, const char *argv[],
 pzone = &zone;
 argc--;
 }
+/* The datapath name is not a mandatory parameter for this command.
+ * If it is not specified - so argc < 2 - we retrieve it from the
+ * current setup, assuming only one exists. */
 name = (argc == 2) ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
 if (!name) {
 return EINVAL;
@@ -1314,6 +1332,9 @@ dpctl_flush_conntrack(int argc, const char *argv[],
 pzone = &zone;
 argc--;
 }
+/* The datapath name is not a mandatory parameter for this command.
+ * If it is not specified - so argc < 2 - we retrieve it from the
+ * current setup, assuming only one exists. */
 name = (argc == 2) ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
 if (!name) {
 return EINVAL;
@@ -1361,7 +1382,9 @@ dpctl_ct_stats_show(int argc, const char *argv[],
 }
 }
 }
-
+/* The datapath name is not a mandatory parameter for this command.
+ * If it is not specified - so argc == 1 - we retrieve it from the
+ * current setup, assuming only one exists. */
 name = (argc > 1) ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
 if (!name) {
 return EINVAL;
@@ -1489,6 +1512,9 @@ dpctl_ct_bkts(int argc, const char *argv[],
 }
 }
 
+/* The datapath name is not a mandatory parameter for this command.
+ * If it is not specified - so argc < 2 - we retrieve it from the
+ * current setup, assuming only one exists. */
 name = (argc == 2) ? xstrdup(argv[1]) : get_one_dp(dpctl_p);
 if (!name) {
 return EINVAL;
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v2 0/4] Conntrack: add commands to r/w CT parameters.

2017-09-26 Thread antonio . fischetti

This series adds two new commands to allow read/write of
some of the CT configuration parameters. This could be
used for maintenance purposes or to find a better tuning
of the current setup.

V2: Reworked based on comments.
V1: First implementation.

Fischetti, Antonio (4):
  dpctl: Add a comment to functions retrieving the datapath name.
  conntrack: add commands to r/w CT parameters.
  conntrack: r/w upper limit connection value.
  conntrack: read current nr of connections.

 lib/conntrack.c |  90 +
 lib/conntrack.h |   3 ++
 lib/ct-dpif.c   |  28 ++
 lib/ct-dpif.h   |   2 +
 lib/dpctl.c | 104 +++-
 lib/dpif-netdev.c   |  19 ++
 lib/dpif-netlink.c  |   2 +
 lib/dpif-provider.h |   4 ++
 8 files changed, 251 insertions(+), 1 deletion(-)

-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v4 2/2] ofproto-dpif-ipfix: add interface Information Elements to flow key

2017-09-26 Thread Michal Weglicki

Extend flow key part of data record to include following Information Elements:
- ingressInterface
- ingressInterfaceType
- egressInterface
- egressInterfaceType
- interfaceName
- interfaceDescription

In case of input sampling we don't have information about egress port.
Define templates depending not only on protocol types, but also on flow
direction. Only egress flow will include egress information elements.

With this change, dpif_ipfix_exporter stores every port in hmap rather
than only tunnel ports. It allows to easily retrieve required
information about interfaces during sampling upcalls.

v1->v2
* Add interfaceType and interfaceDescription
* Rework ipfix_get_iface_data_record function
v2->v3: Code rebase.
v3->v4: Minor comments applied.

Co-authored-by: Michal Weglicki 
Signed-off-by: Michal Weglicki 
Signed-off-by: Przemyslaw Szczerbik 
---
 ofproto/ofproto-dpif-ipfix.c | 358 +++
 ofproto/ofproto-dpif-ipfix.h |   6 +-
 ofproto/ofproto-dpif-xlate.c |   4 +-
 ofproto/ofproto-dpif.c   |  19 +--
 4 files changed, 277 insertions(+), 110 deletions(-)

diff --git a/ofproto/ofproto-dpif-ipfix.c b/ofproto/ofproto-dpif-ipfix.c
index 472c272..ab9192b 100644
--- a/ofproto/ofproto-dpif-ipfix.c
+++ b/ofproto/ofproto-dpif-ipfix.c
@@ -115,11 +115,12 @@ struct dpif_ipfix_global_stats {
 };
 
 struct dpif_ipfix_port {
-struct hmap_node hmap_node; /* In struct dpif_ipfix's "tunnel_ports" hmap. 
*/
+struct hmap_node hmap_node; /* In struct dpif_ipfix's "ports" hmap. */
 struct ofport *ofport;  /* To retrieve port stats. */
 odp_port_t odp_port;
 enum dpif_ipfix_tunnel_type tunnel_type;
 uint8_t tunnel_key_length;
+uint32_t ifindex;
 };
 
 struct dpif_ipfix_exporter {
@@ -157,9 +158,9 @@ struct dpif_ipfix_flow_exporter_map_node {
 struct dpif_ipfix {
 struct dpif_ipfix_bridge_exporter bridge_exporter;
 struct hmap flow_exporter_map;  /* dpif_ipfix_flow_exporter_map_node. */
-struct hmap tunnel_ports;   /* Contains "struct dpif_ipfix_port"s.
- * It makes tunnel port lookups faster in
- * sampling upcalls. */
+struct hmap ports;  /* Contains "struct dpif_ipfix_port"s.
+ * It makes port lookups faster in sampling
+ * upcalls. */
 struct ovs_refcount ref_cnt;
 };
 
@@ -293,7 +294,8 @@ BUILD_ASSERT_DECL(sizeof(struct 
ipfix_template_field_specifier) == 8);
 /* Cf. IETF RFC 5102 Section 5.11.6. */
 enum ipfix_flow_direction {
 INGRESS_FLOW = 0x00,
-EGRESS_FLOW = 0x01
+EGRESS_FLOW = 0x01,
+NUM_IPFIX_FLOW_DIRECTION
 };
 
 /* Part of data record flow key for common metadata and Ethernet entities. */
@@ -308,6 +310,18 @@ struct ipfix_data_record_flow_key_common {
 });
 BUILD_ASSERT_DECL(sizeof(struct ipfix_data_record_flow_key_common) == 20);
 
+/* Part of data record flow key for interface information. Since some of the
+ * elements have variable length, members of this structure should be appended
+ * to the 'struct dp_packet' one by one. */
+struct ipfix_data_record_flow_key_iface {
+ovs_be32 if_index; /* (INGRESS | EGRESS)_INTERFACE */
+ovs_be32 if_type; /* (INGRESS | EGRESS)_INTERFACE_TYPE */
+uint8_t if_name_len;   /* Variable length element: INTERFACE_NAME */
+char *if_name;
+uint8_t if_descr_len; /* Variable length element: INTERFACE_DESCRIPTION */
+char *if_descr;
+};
+
 /* Part of data record flow key for VLAN entities. */
 OVS_PACKED(
 struct ipfix_data_record_flow_key_vlan {
@@ -745,7 +759,7 @@ dpif_ipfix_find_port(const struct dpif_ipfix *di,
 struct dpif_ipfix_port *dip;
 
 HMAP_FOR_EACH_IN_BUCKET (dip, hmap_node, hash_odp_port(odp_port),
- &di->tunnel_ports) {
+ &di->ports) {
 if (dip->odp_port == odp_port) {
 return dip;
 }
@@ -754,82 +768,116 @@ dpif_ipfix_find_port(const struct dpif_ipfix *di,
 }
 
 static void
-dpif_ipfix_del_port(struct dpif_ipfix *di,
+dpif_ipfix_del_port__(struct dpif_ipfix *di,
   struct dpif_ipfix_port *dip)
 OVS_REQUIRES(mutex)
 {
-hmap_remove(&di->tunnel_ports, &dip->hmap_node);
+hmap_remove(&di->ports, &dip->hmap_node);
 free(dip);
 }
 
+static enum dpif_ipfix_tunnel_type
+dpif_ipfix_tunnel_type(const struct ofport *ofport)
+{
+const char *type = netdev_get_type(ofport->netdev);
+
+if (type == NULL) {
+return DPIF_IPFIX_TUNNEL_UNKNOWN;
+}
+if (strcmp(type, "gre") == 0) {
+return DPIF_IPFIX_TUNNEL_GRE;
+} else if (strcmp(type, "vxlan") == 0) {
+return DPIF_IPFIX_TUNNEL_VXLAN;
+} else if (strcmp(type, "lisp") == 0) {
+return DPIF_IPFIX_TUNNEL_LISP;
+} else if (strcmp(type, "geneve") == 0) {
+return DPIF_IPFIX_TUNNEL_GENEVE;
+} else if (strcmp(type, "stt") == 0) {
+return DPIF_IPFIX_TUN

[ovs-dev] [PATCH v4 1/2] netdev-dpdk: extend netdev_dpdk_get_status to include if_type and if_descr

2017-09-26 Thread Michal Weglicki

This commit extends netdev_dpdk_get_status API to include additional
driver-related information: if_type and if_descr.

v2->v3: Code rebase.
v3->v4: Minor comments applied.

Co-authored-by: Michal Weglicki 
Signed-off-by: Michal Weglicki 
Signed-off-by: Przemyslaw Szczerbik 
---
 lib/netdev-dpdk.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index c60f46f..5cc70d1 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "dirs.h"
 #include "dp-packet.h"
@@ -2476,6 +2477,14 @@ netdev_dpdk_get_status(const struct netdev *netdev, 
struct smap *args)
 smap_add_format(args, "max_vfs", "%u", dev_info.max_vfs);
 smap_add_format(args, "max_vmdq_pools", "%u", dev_info.max_vmdq_pools);
 
+/* Querying the DPDK library for iftype may be done in future, pending
+ * support; cf. RFC 3635 Section 3.2.4. */
+enum { IF_TYPE_ETHERNETCSMACD = 6 };
+
+smap_add_format(args, "if_type", "%"PRIu32, IF_TYPE_ETHERNETCSMACD);
+smap_add_format(args, "if_descr", "%s %s", rte_version(),
+   dev_info.driver_name);
+
 if (dev_info.pci_dev) {
 smap_add_format(args, "pci-vendor_id", "0x%u",
 dev_info.pci_dev->id.vendor_id);
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next v10] openvswitch: enable NSH support

2017-09-26 Thread Jiri Benc

On Tue, 26 Sep 2017 12:47:16 +0800, Yi Yang wrote:
> + return ((ret != 0) ? false : true);

I'm not going to review this version but this caught my eye - I pointed
out this silly construct in my review of v9. I can understand that
working late in the night and rewriting the code back and forth, one
could end up with such construct and overlook it at the final
self-review before submission. Happens to everyone.

But leaving this after a review pointed it out means you're not paying
enough attention to your work. Even the fact that you sent v10 so
shortly after v9 means you did not spend the needed time on reflecting
on the review and that you did not properly test the new version. And
I told you exactly this before.

Honestly, I'm starting to be tired with reviewing your patch again and
again and pointing out silly mistakes like this one and repeating basic
things to you again and again.

 Jiri
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next v10] openvswitch: enable NSH support

2017-09-26 Thread Jiri Benc

On Tue, 26 Sep 2017 12:47:16 +0800, Yi Yang wrote:
> v9->v10
>  - Change struct ovs_key_nsh to
>struct ovs_nsh_key_base base;
>__be32 context[NSH_MD1_CONTEXT_SIZE];
>  - Fix new comments for v9

NAK, we haven't finished the discussion for v9 yet. It's not
appropriate to send a new version until there's a conclusion (or at
least until the discussion dies).

 Jiri
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next v9] openvswitch: enable NSH support

2017-09-26 Thread Jiri Benc

On Tue, 26 Sep 2017 12:55:39 +0800, Yang, Yi wrote:
> After push_nsh, the packet won't be recirculated to flow pipeline, so
> key->eth.type must be set explicitly here, but for pop_nsh, the packet
> will be recirculated to flow pipeline, it will be reparsed, so
> key->eth.type will be set in packet parse function, we needn't handle it
> in pop_nsh.

This seems to be a very different approach than what we currently have.
Looking at the code, the requirement after "destructive" actions such
as pushing or popping headers is to recirculate.

Setting key->eth.type to satisfy conditions in the output path without
updating the rest of the key looks very hacky and fragile to me. There
might be other conditions and dependencies that are not obvious.
I don't think the code was written with such code path in mind.

I'd like to hear what Pravin thinks about this.

 Jiri
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v2 3/3] Update ovn.at to add multipath testcase

2017-09-26 Thread Zhenyu Gao

1. Add multipath testcase in ovn testsuites. It simulates dispatching
packets in router by consuming multipath feature in
Logical_Router_Static_Route.

Signed-off-by: Zhenyu Gao 
---
 tests/ovn.at | 229 +++
 1 file changed, 229 insertions(+)

diff --git a/tests/ovn.at b/tests/ovn.at
index 6c38b97..6d59e64 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -4448,6 +4448,235 @@ OVN_CLEANUP([hv1],[hv2])
 
 AT_CLEANUP
 
+AT_SETUP([ovn -- 3 HVs, 4LSs, 3LR, packet test with multipath])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+# Logical network:
+# Three LRs - R1 and R2 that are connected to each other via LS "join"
+# in 20.0.0.0/24 network. R1 and R3 that are connected to each other
+# via LS "join2" in 20.0.0.0/24 network.
+# R1 has switchess foo (192.168.1.0/24)
+# connected to it. R2 and R3 has alice (172.16.1.0/24) connected to it.
+# R2 and R3 are gateway routers.
+
+
+# Create three hypervisor and create OVS ports corresponding to logical ports.
+net_add n1
+
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+ovs-vsctl -- add-port br-int hv1-vif1 -- \
+set interface hv1-vif1 external-ids:iface-id=foo1 \
+options:tx_pcap=hv1/vif1-tx.pcap \
+options:rxq_pcap=hv1/vif1-rx.pcap \
+ofport-request=1
+
+
+sim_add hv2
+as hv2
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.2
+ovs-vsctl -- add-port br-int hv2-vif1 -- \
+set interface hv2-vif1 external-ids:iface-id=alice1 \
+options:tx_pcap=hv2/vif1-tx.pcap \
+options:rxq_pcap=hv2/vif1-rx.pcap \
+ofport-request=1
+
+sim_add hv3
+as hv3
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.3
+ovs-vsctl -- add-port br-int hv3-vif1 -- \
+set interface hv3-vif1 external-ids:iface-id=alice2 \
+options:tx_pcap=hv3/vif1-tx.pcap \
+options:rxq_pcap=hv3/vif1-rx.pcap \
+ofport-request=1
+
+# Pre-populate the hypervisors' ARP tables so that we don't lose any
+# packets for ARP resolution (native tunneling doesn't queue packets
+# for ARP resolution).
+ovn_populate_arp
+
+ovn-nbctl create Logical_Router name=R1
+ovn-nbctl create Logical_Router name=R2 options:chassis="hv2"
+ovn-nbctl create Logical_Router name=R3 options:chassis="hv3"
+
+ovn-nbctl ls-add foo
+ovn-nbctl ls-add alice
+ovn-nbctl ls-add join
+ovn-nbctl ls-add join2
+
+# Connect foo to R1
+ovn-nbctl lrp-add R1 foo 00:00:01:01:02:03 192.168.1.1/24
+ovn-nbctl lsp-add foo rp-foo -- set Logical_Switch_Port rp-foo \
+type=router options:router-port=foo addresses=\"00:00:01:01:02:03\"
+
+# Connect alice to R2
+ovn-nbctl lrp-add R2 alice 00:00:02:01:02:03 172.16.1.1/24
+ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port rp-alice \
+type=router options:router-port=alice addresses=\"00:00:02:01:02:03\"
+
+# Connect R1 to join
+ovn-nbctl lrp-add R1 R1_join 00:00:04:01:02:03 20.0.0.1/24
+ovn-nbctl lsp-add join r1-join -- set Logical_Switch_Port r1-join \
+type=router options:router-port=R1_join addresses='"00:00:04:01:02:03"'
+
+# Connect R2 to join
+ovn-nbctl lrp-add R2 R2_join 00:00:04:01:02:04 20.0.0.2/24
+ovn-nbctl lsp-add join r2-join -- set Logical_Switch_Port r2-join \
+type=router options:router-port=R2_join addresses='"00:00:04:01:02:04"'
+
+# Connnect R1 to join2
+ovn-nbctl lrp-add R1 R1_join2 00:00:05:01:02:03 20.0.0.3/24
+ovn-nbctl lsp-add join2 r1-join2 -- set Logical_Switch_Port r1-join2 \
+type=router options:router-port=R1_join2 addresses='"00:00:05:01:02:03"'
+
+# Connect R3 to join2  Set same IP for multipath routing
+ovn-nbctl lrp-add R3 R3_join2 00:00:06:01:02:04 20.0.0.2/24
+ovn-nbctl lsp-add join2 r3-join2 -- set Logical_Switch_Port r3-join2 \
+type=router options:router-port=R3_join2 addresses='"00:00:06:01:02:04"'
+
+#Connect alice to R3
+ovn-nbctl lrp-add R3 R3_alice 00:00:07:01:02:03 172.16.1.10/24
+ovn-nbctl lsp-add alice r3-alice -- set Logical_Switch_Port r3-alice \
+type=router options:router-port=R3_alice addresses=\"00:00:07:01:02:03\"
+
+
+#install multipath routes
+ovn-nbctl lr-route-add R1 "172.16.1.0/24" 20.0.0.2 R1_join R1_join2 fake_port
+
+#install static routes
+ovn-nbctl -- --id=@lrt create Logical_Router_Static_Route \
+ip_prefix=192.168.1.0/24 nexthop=20.0.0.1 -- add Logical_Router \
+R2 static_routes @lrt
+
+ovn-nbctl -- --id=@lrt create Logical_Router_Static_Route \
+ip_prefix=192.168.1.0/24 nexthop=20.0.0.3 -- add Logical_Router \
+R3 static_routes @lrt
+
+# Create logical port foo1 in foo
+ovn-nbctl lsp-add foo foo1 \
+-- lsp-set-addresses foo1 "f0:00:00:01:02:03 192.168.1.2"
+
+# Create logical port alice1 in alice
+ovn-nbctl lsp-add alice alice1 \
+-- lsp-set-addresses alice1 "f0:00:00:01:02:04 172.16.1.2"
+
+# Create logical port alice2 in alice
+ovn-nbctl lsp-add alice alice2 \
+-- lsp-set-addresses alice2 "f0:00:00:01:02:06 172.16.1.4"
+
+# Allow some time for ovn-northd and ovn-controller to catch up.
+# XXX This should be more systematic.
+sleep 2
+
+ip_to_hex() {
+printf "%02

[ovs-dev] [PATCH v2 2/3] Add multipath support in ovn-controller and ovn-trace

2017-09-26 Thread Zhenyu Gao

1. Update actions.h, actions.c to handle multipath action in ovn-controller.
2. Update ovn-trace to simulate action of multipath. It helps multipath
debugging.

Signed-off-by: Zhenyu Gao 
---
 include/ovn/actions.h |  32 -
 ovn/lib/actions.c | 168 ++
 ovn/utilities/ovn-trace.c |  34 ++
 3 files changed, 233 insertions(+), 1 deletion(-)

diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index 0a04af7..574cc9d 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -72,7 +72,8 @@ struct simap;
 OVNACT(PUT_DHCPV6_OPTS,   ovnact_put_dhcp_opts)   \
 OVNACT(SET_QUEUE, ovnact_set_queue)   \
 OVNACT(DNS_LOOKUP,ovnact_dns_lookup)  \
-OVNACT(LOG,   ovnact_log)
+OVNACT(LOG,   ovnact_log) \
+OVNACT(MULTIPATH, ovnact_multipath)
 
 /* enum ovnact_type, with a member OVNACT_ for each action. */
 enum OVS_PACKED_ENUM ovnact_type {
@@ -274,6 +275,35 @@ struct ovnact_log {
 char *name;
 };
 
+/* Fields to use when hashing flows. */
+enum ovnact_hash_fields {
+OVNACT_HASH_FIELDS_ETH_SRC,
+OVNACT_HASH_FIELDS_SYMMETRIC_L4,
+OVNACT_HASH_FIELDS_SYMMETRIC_L3L4,
+OVNACT_HASH_FIELDS_SYMMETRIC_L3L4_UDP,
+OVNACT_HASH_FIELDS_NW_SRC,
+OVNACT_HASH_FIELDS_NW_DST
+};
+
+/* Multipath link choice algorithm to apply. */
+enum ovnact_mp_algorithm {
+OVNACT_MP_ALG_MODULO_N,
+OVNACT_MP_ALG_HASH_THRESHOLD,
+OVNACT_MP_ALG_HRW,
+OVNACT_MP_ALG_ITER_HASH
+};
+
+/* OVNACT_MULTIPATH. */
+struct ovnact_multipath {
+struct ovnact ovnact;
+enum ovnact_hash_fields fields;
+uint16_t basis;
+enum ovnact_mp_algorithm algorithm;
+uint16_t n_links;
+uint32_t arg;
+struct expr_field dst;
+};
+
 /* Internal use by the helpers below. */
 void ovnact_init(struct ovnact *, enum ovnact_type, size_t len);
 void *ovnact_put(struct ofpbuf *, enum ovnact_type, size_t len);
diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
index d0d73b6..8b41af2 100644
--- a/ovn/lib/actions.c
+++ b/ovn/lib/actions.c
@@ -1873,6 +1873,172 @@ ovnact_log_free(struct ovnact_log *log)
 free(log->name);
 }
 
+static void
+parse_multipath(struct action_context *ctx)
+{
+int basis;
+int n_links;
+int arg;
+struct ovnact_multipath *put_multipath =
+ovnact_put_MULTIPATH(ctx->ovnacts);
+
+if (!lexer_force_match(ctx->lexer, LEX_T_LPAREN)) {
+lexer_error(ctx->lexer, " expecting parenthesis \"(\" ");
+return;
+}
+
+if (lexer_match_id(ctx->lexer, "eth_src")) {
+put_multipath->fields = OVNACT_HASH_FIELDS_ETH_SRC;
+} else if (lexer_match_id(ctx->lexer, "symmetric_l4")) {
+put_multipath->fields = OVNACT_HASH_FIELDS_SYMMETRIC_L4;
+} else if (lexer_match_id(ctx->lexer, "symmetric_l3l4")) {
+put_multipath->fields = OVNACT_HASH_FIELDS_SYMMETRIC_L3L4;
+} else if (lexer_match_id(ctx->lexer, "symmetric_l3l4+udp")) {
+put_multipath->fields = OVNACT_HASH_FIELDS_SYMMETRIC_L3L4_UDP;
+} else if (lexer_match_id(ctx->lexer, "nw_src")) {
+put_multipath->fields = OVNACT_HASH_FIELDS_NW_SRC;
+} else if (lexer_match_id(ctx->lexer, "nw_dst")) {
+put_multipath->fields = OVNACT_HASH_FIELDS_NW_DST;
+} else {
+lexer_error(ctx->lexer, "multipath gets invalid hash fields type");
+return;
+}
+
+lexer_match(ctx->lexer, LEX_T_COMMA);
+
+if (!lexer_get_int(ctx->lexer, &basis)) {
+lexer_error(ctx->lexer, "multipath requires numeric value for basis");
+return;
+}
+if (basis < 0 || basis >= 65535) {
+lexer_error(ctx->lexer, "multipath basis is not in valid range");
+return;
+}
+
+put_multipath->basis = basis;
+lexer_match(ctx->lexer, LEX_T_COMMA);
+
+if (lexer_match_id(ctx->lexer, "modulo_n")) {
+put_multipath->algorithm = OVNACT_MP_ALG_MODULO_N;
+} else if (lexer_match_id(ctx->lexer, "hash_threshold")) {
+put_multipath->algorithm = OVNACT_MP_ALG_HASH_THRESHOLD;
+} else if (lexer_match_id(ctx->lexer, "hrw")) {
+put_multipath->algorithm = OVNACT_MP_ALG_HRW;
+} else if (lexer_match_id(ctx->lexer, "iter_hash")) {
+put_multipath->algorithm = OVNACT_MP_ALG_ITER_HASH;
+} else {
+lexer_error(ctx->lexer, "multipath gets invalid algorithm type");
+return;
+}
+lexer_match(ctx->lexer, LEX_T_COMMA);
+
+if (!lexer_get_int(ctx->lexer, &n_links)) {
+lexer_error(ctx->lexer,
+"multipath requires numeric value for n_links");
+return;
+}
+if (n_links < 0 || n_links >= 65535) {
+lexer_error(ctx->lexer, "multipath n_links is not in valid range");
+return;
+}
+put_multipath->n_links = n_links;
+lexer_match(ctx->lexer, LEX_T_COMMA);
+
+if (!lexer_get_int(ctx->lexer, &arg)) {
+lexer_error(ctx->lexer, "multipath requires numeric value for

[ovs-dev] [PATCH v2 1/3] Add multipath static router in OVN northd and north-db

2017-09-26 Thread Zhenyu Gao

1. ovn-nb.ovsschema was updated output_port field. Change the max entry
number from 1 to unlimited.
2. Add multipath feature in ovn-northd part. northd generates multipath
flows to dispatch traffic by using packet's IP dst address if route's
output_port contains two or more ports.
3. Add new table(lr_in_multipath) in ovn-northd's router ingress stages
to dispatch traffic to ports.
4. Add multipath flow in Table 5(lr_in_ip_routing) and store hash result
into reg0. reg9[2] was used to indicate packet which need dispatching.
5. Add multipath feature description in ovn/northd/ovn-northd.8.xml
and ovn/ovn-nb.xml
6. ovn-nbctl.c was updated to handle configuring mulitiple output_port.

Signed-off-by: Zhenyu Gao 
---
 ovn/northd/ovn-northd.8.xml |  67 +++-
 ovn/northd/ovn-northd.c | 257 +---
 ovn/ovn-nb.ovsschema|   7 +-
 ovn/ovn-nb.xml  |   4 +
 ovn/utilities/ovn-nbctl.c   |  28 +++--
 5 files changed, 311 insertions(+), 52 deletions(-)

diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 0d85ec0..b1ce9a9 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -1598,6 +1598,9 @@ icmp4 {
   port (ingress table ARP Request will generate an ARP
   request, if needed, with reg0 as the target protocol
   address and reg1 as the source protocol address).
+  A IP route can be configured that it has multipath to next-hop.
+  If a packet has multipath to destination, OVN assign the port
+  index into reg[0] to indicate the packet's output port in table 6.
 
 
 
@@ -1617,6 +1620,28 @@ icmp4 {
 
   
 
+  IPv4/IPV6 multipath routing table. For each route to IPv4/IPv6
+  network N with netmask M, on multipath port
+  P with IP address A and Ethernet
+  address E, a logical flow with match
+  ip4.dst ==N/M,whose priority
+  is the number of 1-bits plus 10 in M,
+  has the following actions:
+
+
+
+ip.ttl--;
+multipath (nw_dst, 0, modulo_n, n_links, 0, reg0);
+reg9[2] = 1
+next;
+
+
+  n_links is the number of multipath port.
+
+  
+
+  
+
   IPv4 routing table.  For each route to IPv4 network N with
   netmask M, on router port P with IP address
   A and Ethernet
@@ -1686,7 +1711,43 @@ next;
   
 
 
-Ingress Table 6: ARP/ND Resolution
+Ingress Table 6: Multipath
+
+  Any packet taht reaches this table is an IP packet and reg9[2]=1
+  using the following flows to route to corresponding port. This table
+  implement dispatching by consuming reg0.
+
+
+
+  
+
+  A packet with netmask M, IP address A and
+  reg9[2] = 1, whose priority above 1 has following
+  actions:
+
+
+
+reg0 = G;
+reg1 = A;
+eth.src = E;
+outport = P;
+flags.loopback = 1;
+next;
+
+
+
+  G is the gateway IP address. A, E
+  and P are the values that were described in multipath
+  routeing in table 5
+
+
+
+  A priority-0 logical flow with match has actions next;.
+
+  
+
+
+Ingress Table 7: ARP/ND Resolution
 
 
   Any packet that reaches this table is an IP packet whose next-hop
@@ -1779,7 +1840,7 @@ next;
   
 
 
-Ingress Table 7: Gateway Redirect
+Ingress Table 8: Gateway Redirect
 
 
   For distributed logical routers where one of the logical router
@@ -1836,7 +1897,7 @@ next;
   
 
 
-Ingress Table 8: ARP Request
+Ingress Table 9: ARP Request
 
 
   In the common case where the Ethernet destination has been resolved, this
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 49e4ac3..f8bfee2 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -135,9 +135,10 @@ enum ovn_stage {
 PIPELINE_STAGE(ROUTER, IN,  UNSNAT,  3, "lr_in_unsnat")   \
 PIPELINE_STAGE(ROUTER, IN,  DNAT,4, "lr_in_dnat") \
 PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,  5, "lr_in_ip_routing")   \
-PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 6, "lr_in_arp_resolve")  \
-PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 7, "lr_in_gw_redirect")  \
-PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 8, "lr_in_arp_request")  \
+PIPELINE_STAGE(ROUTER, IN,  MULTIPATH,   6, "lr_in_multipath")\
+PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE, 7, "lr_in_arp_resolve")  \
+PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT, 8, "lr_in_gw_redirect")  \
+PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST, 9, "lr_in_arp_request")  \
   \
 /* Logical router egress stages. */   \
 PIPELINE_STAGE(ROUTER, OUT, UNDNAT,0, "lr_out_undnat")\
@@ -173,6 +174,11 @@ enum ovn_stage {
  * one of the logical router's own IP addresses.

Re: [ovs-dev] [PATCH v2] dpctl: manage ret value when dumping CT entries.

2017-09-26 Thread Fischetti, Antonio

Actually I forgot to add in the commit message

Reviewed-by: Greg Rose 

Can this please be added later?

Thanks, Antonio

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org]
> On Behalf Of antonio.fische...@intel.com
> Sent: Tuesday, September 26, 2017 10:37 AM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH v2] dpctl: manage ret value when dumping CT entries.
> 
> Manage error value returned by ct_dpif_dump_next.
> 
> Signed-off-by: Antonio Fischetti 
> ---
>  lib/dpctl.c | 27 ---
>  1 file changed, 24 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/dpctl.c b/lib/dpctl.c
> index 8951d6e..d229c97 100644
> --- a/lib/dpctl.c
> +++ b/lib/dpctl.c
> @@ -1286,7 +1286,7 @@ dpctl_dump_conntrack(int argc, const char *argv[],
>  return error;
>  }
> 
> -while (!ct_dpif_dump_next(dump, &cte)) {
> +while (!(error = ct_dpif_dump_next(dump, &cte))) {
>  struct ds s = DS_EMPTY_INITIALIZER;
> 
>  ct_dpif_format_entry(&cte, &s, dpctl_p->verbosity,
> @@ -1296,6 +1296,13 @@ dpctl_dump_conntrack(int argc, const char *argv[],
>  dpctl_print(dpctl_p, "%s\n", ds_cstr(&s));
>  ds_destroy(&s);
>  }
> +if (error == EOF) {
> +/* Any CT entry was dumped with no issue. */
> +error = 0;
> +} else if (error) {
> +dpctl_error(dpctl_p, error, "dumping conntrack entry");
> +}
> +
>  ct_dpif_dump_done(dump);
>  dpif_close(dpif);
>  return error;
> @@ -1384,7 +1391,7 @@ dpctl_ct_stats_show(int argc, const char *argv[],
>  }
> 
>  int tot_conn = 0;
> -while (!ct_dpif_dump_next(dump, &cte)) {
> +while (!(error = ct_dpif_dump_next(dump, &cte))) {
>  ct_dpif_entry_uninit(&cte);
>  tot_conn++;
>  switch (cte.tuple_orig.ip_proto) {
> @@ -1425,6 +1432,13 @@ dpctl_ct_stats_show(int argc, const char *argv[],
>  break;
>  }
>  }
> +if (error == EOF) {
> +/* All CT entries were dumped with no issue.  */
> +error = 0;
> +} else if (error) {
> +dpctl_error(dpctl_p, error, "dumping conntrack entry");
> +/* Fall through to show any other info we collected. */
> +}
> 
>  dpctl_print(dpctl_p, "Connections Stats:\nTotal: %d\n", tot_conn);
>  if (proto_stats[CT_STATS_TCP]) {
> @@ -1521,7 +1535,7 @@ dpctl_ct_bkts(int argc, const char *argv[],
>  int tot_conn = 0;
>  uint32_t *conn_per_bkts = xzalloc(tot_bkts * sizeof(uint32_t));
> 
> -while (!ct_dpif_dump_next(dump, &cte)) {
> +while (!(error = ct_dpif_dump_next(dump, &cte))) {
>  ct_dpif_entry_uninit(&cte);
>  tot_conn++;
>  if (tot_bkts > 0) {
> @@ -1533,6 +1547,13 @@ dpctl_ct_bkts(int argc, const char *argv[],
>  }
>  }
>  }
> +if (error == EOF) {
> +/* All CT entries were dumped with no issue.  */
> +error = 0;
> +} else if (error) {
> +dpctl_error(dpctl_p, error, "dumping conntrack entry");
> +/* Fall through and display all the collected info.  */
> +}
> 
>  dpctl_print(dpctl_p, "Current Connections: %d\n", tot_conn);
>  dpctl_print(dpctl_p, "\n");
> --
> 2.4.11
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v2] dpctl: manage ret value when dumping CT entries.

2017-09-26 Thread antonio . fischetti

Manage error value returned by ct_dpif_dump_next.

Signed-off-by: Antonio Fischetti 
---
 lib/dpctl.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/lib/dpctl.c b/lib/dpctl.c
index 8951d6e..d229c97 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1286,7 +1286,7 @@ dpctl_dump_conntrack(int argc, const char *argv[],
 return error;
 }
 
-while (!ct_dpif_dump_next(dump, &cte)) {
+while (!(error = ct_dpif_dump_next(dump, &cte))) {
 struct ds s = DS_EMPTY_INITIALIZER;
 
 ct_dpif_format_entry(&cte, &s, dpctl_p->verbosity,
@@ -1296,6 +1296,13 @@ dpctl_dump_conntrack(int argc, const char *argv[],
 dpctl_print(dpctl_p, "%s\n", ds_cstr(&s));
 ds_destroy(&s);
 }
+if (error == EOF) {
+/* Any CT entry was dumped with no issue. */
+error = 0;
+} else if (error) {
+dpctl_error(dpctl_p, error, "dumping conntrack entry");
+}
+
 ct_dpif_dump_done(dump);
 dpif_close(dpif);
 return error;
@@ -1384,7 +1391,7 @@ dpctl_ct_stats_show(int argc, const char *argv[],
 }
 
 int tot_conn = 0;
-while (!ct_dpif_dump_next(dump, &cte)) {
+while (!(error = ct_dpif_dump_next(dump, &cte))) {
 ct_dpif_entry_uninit(&cte);
 tot_conn++;
 switch (cte.tuple_orig.ip_proto) {
@@ -1425,6 +1432,13 @@ dpctl_ct_stats_show(int argc, const char *argv[],
 break;
 }
 }
+if (error == EOF) {
+/* All CT entries were dumped with no issue.  */
+error = 0;
+} else if (error) {
+dpctl_error(dpctl_p, error, "dumping conntrack entry");
+/* Fall through to show any other info we collected. */
+}
 
 dpctl_print(dpctl_p, "Connections Stats:\nTotal: %d\n", tot_conn);
 if (proto_stats[CT_STATS_TCP]) {
@@ -1521,7 +1535,7 @@ dpctl_ct_bkts(int argc, const char *argv[],
 int tot_conn = 0;
 uint32_t *conn_per_bkts = xzalloc(tot_bkts * sizeof(uint32_t));
 
-while (!ct_dpif_dump_next(dump, &cte)) {
+while (!(error = ct_dpif_dump_next(dump, &cte))) {
 ct_dpif_entry_uninit(&cte);
 tot_conn++;
 if (tot_bkts > 0) {
@@ -1533,6 +1547,13 @@ dpctl_ct_bkts(int argc, const char *argv[],
 }
 }
 }
+if (error == EOF) {
+/* All CT entries were dumped with no issue.  */
+error = 0;
+} else if (error) {
+dpctl_error(dpctl_p, error, "dumping conntrack entry");
+/* Fall through and display all the collected info.  */
+}
 
 dpctl_print(dpctl_p, "Current Connections: %d\n", tot_conn);
 dpctl_print(dpctl_p, "\n");
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 1/2] dpctl: manage ret value when dumping CT entries.

2017-09-26 Thread Fischetti, Antonio

Thanks Darrell, comments inline. I'll post a v2 based on your suggestions.

-Antonio

> -Original Message-
> From: Darrell Ball [mailto:db...@vmware.com]
> Sent: Monday, September 25, 2017 8:31 PM
> To: Fischetti, Antonio ; d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 1/2] dpctl: manage ret value when dumping CT
> entries.
> 
> 
> 
> On 9/25/17, 2:27 AM, "Fischetti, Antonio"  wrote:
> 
> Hi Darrell,
> I agree with your suggestion in keeping 'error'
> as the only variable to manage return values.
> 
> In this case - as I'm assuming we shouldn't return an EOF to the
> caller - I should manage error as below?
> 
> if (error == EOF) {
> error = 0;<< EOF is not an issue, so return 0 to the 
> caller
> } else if (error) {
> dpctl_error(dpctl_p, error, "dumping conntrack entry");
> ct_dpif_dump_done(dump);
> dpif_close(dpif);
> return error;
> }
> 
> 
> [Darrell] For sure - EOF should not be returned to user since it is not an
> error.
>The other point I wanted to make is:
> I think you can trivially fall thru. for 
> dpctl_dump_conntrack()
>  After doing just dpctl_error(dpctl_p, error, "dumping
> conntrack entry");
>(comments inline).
>And in the other 2 cases, I think you can still try to print 
> out
> whatever is known
>after breaking out of the while loop and use the existing
> cleanup code.
>You would print error in the cases as well
> What do you think ?

[Antonio] Good idea, thanks. So for example the CT protocol stats or anything
else could be displayed anyway.


> 
> 
> 
> 
> > -Original Message-
> > From: Darrell Ball [mailto:db...@vmware.com]
> > Sent: Friday, September 22, 2017 8:42 AM
> > To: Fischetti, Antonio ; 
> d...@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH 1/2] dpctl: manage ret value when dumping
> CT
> > entries.
> >
> > Few comments Antonio
> >
> > Darrell
> >
> > On 9/13/17, 5:37 AM, "ovs-dev-boun...@openvswitch.org on behalf of
> > antonio.fische...@intel.com"  of
> > antonio.fische...@intel.com> wrote:
> >
> > Manage error value returned by ct_dpif_dump_next.
> >
> > Signed-off-by: Antonio Fischetti 
> > ---
> >  lib/dpctl.c | 28 +---
> >  1 file changed, 25 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/dpctl.c b/lib/dpctl.c
> > index 8951d6e..86d0f90 100644
> > --- a/lib/dpctl.c
> > +++ b/lib/dpctl.c
> > @@ -1263,6 +1263,7 @@ dpctl_dump_conntrack(int argc, const char
> *argv[],
> >  struct dpif *dpif;
> >  char *name;
> >  int error;
> > +int ret;
> >
> >  if (argc > 1 && ovs_scan(argv[argc - 1], "zone=%"SCNu16, 
> &zone))
> {
> >  pzone = &zone;
> > @@ -1286,7 +1287,7 @@ dpctl_dump_conntrack(int argc, const char
> *argv[],
> >  return error;
> >  }
> >
> > -while (!ct_dpif_dump_next(dump, &cte)) {
> > +while (!(ret = ct_dpif_dump_next(dump, &cte))) {
> >  struct ds s = DS_EMPTY_INITIALIZER;
> >
> >  ct_dpif_format_entry(&cte, &s, dpctl_p->verbosity,
> > @@ -1296,6 +1297,13 @@ dpctl_dump_conntrack(int argc, const char
> *argv[],
> >  dpctl_print(dpctl_p, "%s\n", ds_cstr(&s));
> >  ds_destroy(&s);
> >  }
> > +if (ret && ret != EOF) {
> > +dpctl_error(dpctl_p, ret, "dumping conntrack entry");
> > +ct_dpif_dump_done(dump);
> > +dpif_close(dpif);
> > +return ret;
> > +}
> > +
> >
> > [Darrell] Maybe we can reuse ‘error’ ?
> > if (error && error != EOF) {
> >and just do
> >dpctl_error(dpctl_p, error, "dumping conntrack entry");
> >  and then fall thru for cleanup ?

[Antonio]
Yes, will do that.

> >
> >
> >  ct_dpif_dump_done(dump);
> >  dpif_close(dpif);
> >  return error;
> > @@ -1348,6 +1356,7 @@ dpctl_ct_stats_show(int argc, const char
> *argv[],
> >  int proto_stats[CT_STATS_MAX];
> >  int tcp_conn_per_states[CT_DPIF_TCPS_MAX_NUM];
> >  int error;
> > +int ret;
> >
> >  while (argc > 1 && lastargc != argc) {
> >  lastargc = argc;
> > @@ -1384,7 +1393,7 @@ dpctl_ct_stats_show(int argc, const char
> *argv[],
> >  }
> >
> >  int tot_conn = 0;
> > -while (!ct_dpif_dump_next(dump, &cte)) {
> > +while (!(ret = ct_dpif_dump_next(dump, &cte))) {
> >

Re: [ovs-dev] [PATCH 2/2] dpctl: init CT entry variable.

2017-09-26 Thread Fischetti, Antonio



> -Original Message-
> From: Darrell Ball [mailto:db...@vmware.com]
> Sent: Monday, September 25, 2017 8:35 PM
> To: Fischetti, Antonio ; d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 2/2] dpctl: init CT entry variable.
> 
> 
> 
> On 9/25/17, 2:51 AM, "Fischetti, Antonio"  wrote:
> 
> > -Original Message-
> > From: Darrell Ball [mailto:db...@vmware.com]
> > Sent: Friday, September 22, 2017 9:26 AM
> > To: Fischetti, Antonio ; 
> d...@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH 2/2] dpctl: init CT entry variable.
> >
> >
> >
> > On 9/13/17, 5:37 AM, "ovs-dev-boun...@openvswitch.org on behalf of
> > antonio.fische...@intel.com"  of
> > antonio.fische...@intel.com> wrote:
> >
> > ct_dpif_entry_uninit could potentially be called even if
> > ct_dpif_dump_next failed. As ct_dpif_entry_uninit receives
> > a pointer to a CT entry - and just checks it is not null -
> > it's safer to init to zero any instantiated ct_dpif_entry
> > variable before its usage.
> >
> > [Darrell] I took a look and did not see a particular problem.
> >Was there an issue that we are trying to address?; if so,
> this
> > may hide it ?
> 
> [Antonio]
> This change is more a matter of keeping safe habits for future
> code additions.
> In a new CT function that could be added down the line, one could
> potentially call ct_dpif_entry_uninit without checking what was
> returned by ct_dpif_dump_next.
> 
> As this is not in the hotpath, I added a memset to be extra-careful
> when initializing the local CT entry variable.
> 
> Maybe also a comment on top of the fn definition could help on this,
> something like?
> 
> +/* This function must be called when the returned
> +   value from ct_dpif_dump_next is 0. */
> void
> ct_dpif_entry_uninit(struct ct_dpif_entry *entry)
> {
> if (entry) {
> if (entry->helper.name) {
> 
> 
> [Darrell] It is usually better to wait for the new code that might need this
> and
>associate those patches as part of the same series.
>Can we do that ?

[Antonio] Yes, makes sense. I'll keep this one into my backlog.


> 
> 
> 
> >
> >
> > Signed-off-by: Antonio Fischetti 
> > ---
> >  lib/dpctl.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/lib/dpctl.c b/lib/dpctl.c
> > index 86d0f90..77d4e58 100644
> > --- a/lib/dpctl.c
> > +++ b/lib/dpctl.c
> > @@ -1287,6 +1287,7 @@ dpctl_dump_conntrack(int argc, const char
> *argv[],
> >  return error;
> >  }
> >
> > +memset(&cte, 0, sizeof(cte));
> >  while (!(ret = ct_dpif_dump_next(dump, &cte))) {
> >  struct ds s = DS_EMPTY_INITIALIZER;
> >
> > @@ -1392,6 +1393,7 @@ dpctl_ct_stats_show(int argc, const char
> *argv[],
> >  return error;
> >  }
> >
> > +memset(&cte, 0, sizeof(cte));
> >  int tot_conn = 0;
> >  while (!(ret = ct_dpif_dump_next(dump, &cte))) {
> >  ct_dpif_entry_uninit(&cte);
> > @@ -1532,6 +1534,7 @@ dpctl_ct_bkts(int argc, const char *argv[],
> >   return 0;
> >  }
> >
> > +memset(&cte, 0, sizeof(cte));
> >  dpctl_print(dpctl_p, "Total Buckets: %d\n", tot_bkts);
> >
> >  int tot_conn = 0;
> > --
> > 2.4.11
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://urldefense.proofpoint.com/v2/url?u=https-
> > 3A__mail.openvswitch.org_mailman_listinfo_ovs-
> > 2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-
> >
> uZnsw&m=3FF1c4sa7rHZb5a1DAZQlnsPZywcY7R_LNFki9WS9So&s=tU4fSt243XI_2QHkAF4R2h0sm
> > vtTC8fDyiOXBI02_t8&e=
> >
> 
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] rhel: fix log directory permissions

2017-09-26 Thread Markos Chandras

On 22/09/17 23:47, Flavio Leitner wrote:
> On Fri, 22 Sep 2017 09:44:18 -0400
> Aaron Conole  wrote:
> 
>> When the logrotate script runs, and Open vSwitch is running as a non-root
>> user, the /var/log/openvswitch directory doesn't have other rx bits set.
>> This means the reopen attempt will fail with "permission denied", even though
>> the default logrotate configuration creates a new log file with the
>> appropriate attributes.
>>
>> This change sets the r/x bits for other on /var/log/messages
> 
> /var/log/openvswitch? :-)
> 
> Reproduced here
> # ovs-appctl -t ovs-vswitchd vlog/reopen 
> Permission denied
> ovs-appctl: ovs-vswitchd: server returned an error
> 
> Acked-by: Flavio Leitner 
> 

Reviewed-by: Markos Chandras 

I guess we may want this patch in the 2.8 branch as well :)

-- 
markos

SUSE LINUX GmbH | GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg) Maxfeldstr. 5, D-90409, Nürnberg
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

61 matches

Mail list logo