Re: [ovs-dev] [PATCH v2] ovn-vtep: fix arping from vtep-gw physical port

2016-09-29 Thread Ramu Ramamurthy
> Hello Ramu
>
> Can you describe your configuration for this test failure when logical
> switch
> arp responders are skipped for logical switch "router type" ports  ?
> I know the existing OVN tests (both system and non-system) pass either way.
>
> Thanks Darrell
>

There is no test that failed. But, I found that when a VM boots, and
gets dhcp-ip,
it pings the default-gw IP address, and for this it arps the
gateway-ip. That arp
hits the arp-responder flows for the router-port.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3] ovn-vtep: fix arping from vtep-gw physical port

2016-09-27 Thread Ramu Ramamurthy
Currently, arping from a vtep-gw physical-switch port to
a VIF IP address does not work.

When a physical-switch-port arps for an IP address
of a VIF, that arp packet comes into the VIF hypervisor via a
vxlan tunnel. That arp packet must not be responded-to by the
arp responder table because, potentially, multiple hypervisors
could independently respond.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/northd/ovn-northd.8.xml | 5 +++--
 ovn/northd/ovn-northd.c | 5 -
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 1da633a..f27b776 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -421,8 +421,9 @@
 
 
   
-Priority-100 flows to skip ARP responder if inport is of type
-localnet, and advances directly to the next table.
+Priority-100 flows to skip the ARP responder if inport type is
+localnet or vtep, and advances directly
+to the next table.
   
 
   
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index cdc5525..9606c95 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2683,7 +2683,10 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap 
*ports,
 continue;
 }
 
-if (!strcmp(op->nbsp->type, "localnet")) {
+/* Skip arp responder if the logical switch inport
+ * is of type localnet or vtep. */
+if (!strcmp(op->nbsp->type, "localnet")
+|| !strcmp(op->nbsp->type, "vtep")) {
 ds_clear();
 ds_put_format(, "inport == %s", op->json_key);
 ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,
-- 
1.9.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] ovn-vtep: fix arping from vtep-gw physical port

2016-09-26 Thread Ramu Ramamurthy
-if (!strcmp(op->nbsp->type, "localnet")) {
+/* Skip arp responder if the logical switch inport is not
+ * associated with a local VIF or a l2gateway port */
+if ((strcmp(op->nbsp->type, "")) &&
+(strcmp(op->nbsp->type, "l2gateway"))) {
 ds_clear();
 ds_put_format(, "inport == %s", op->json_key);
 ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,

The logical switch port of type "router" needs to have an
arp-responder entry for VIFs to get the
mac of the gateway port, but the above skips it and would introduce a
regression.

I would suggest a) that the code is simpler if we explicitly list the
two known port-types for which we want to
skip the arp-responder b) keep every other port-type as-is with
respect to the arp-responder (and not make
code changes for them in this commit). Please let me know what you
think of this..

-if (!strcmp(op->nbsp->type, "localnet")) {
+if (!strcmp(op->nbsp->type, "localnet")
+|| !strcmp(op->nbsp->type, "vtep")) {
 ds_clear();
 ds_put_format(, "inport == %s", op->json_key);
 ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2] ovn-vtep: fix arping from vtep-gw physical port

2016-09-22 Thread Ramu Ramamurthy
Currently, arping from a vtep-gw physical-switch port to
a VIF IP address does not work.

When a physical-switch-port arps for an IP address
of a VIF, that arp packet comes into the VIF hypervisor via a
vxlan tunnel. That arp packet must not be responded-to by the
arp responder table because, potentially, multiple hypervisors
could independently respond.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/northd/ovn-northd.8.xml | 4 ++--
 ovn/northd/ovn-northd.c | 4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 1da633a..3b885f9 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -421,8 +421,8 @@
 
 
   
-Priority-100 flows to skip ARP responder if inport is of type
-localnet, and advances directly to the next table.
+Priority-100 flows to skip ARP responder if inport is not associated
+with a local VIF, and advances directly to the next table.
   
 
   
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 2cbbb47..d752589 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2683,7 +2683,9 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap 
*ports,
 continue;
 }
 
-if (!strcmp(op->nbsp->type, "localnet")) {
+/* Skip installing arp responder if the logical switch inport
+ * is not associated with a local VIF. */
+if (strcmp(op->nbsp->type, "")) {
 ds_clear();
 ds_put_format(, "inport == %s", op->json_key);
 ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,
-- 
1.9.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn-vtep: fix arping from vtep-gw physical port

2016-09-20 Thread Ramu Ramamurthy
>
> Does the following diff work on your system?
>
>
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index 7797417..4f1cd89 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -421,8 +421,8 @@
>
>  
>
> -Priority-100 flows to skip ARP responder if inport is of type
> -localnet, and advances directly to the next table.
> +Priority-100 flows to skip ARP responder if inport is not
> associated
> +with a local VIF, and advances directly to the next table.
>
>
>
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 8c7e80c..bcb9738 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -2428,7 +2428,9 @@ build_lswitch_flows(struct hmap *datapaths, struct
> hmap *ports,
>  continue;
>  }
>
> -if (!strcmp(op->nbsp->type, "localnet")) {
> +/* Skip installing arp responder if the logical switch inport
> + * is not associated with a local VIF. */
> +if (strcmp(op->nbsp->type, "")) {
>  ds_clear();
>  ds_put_format(, "inport == %s", op->json_key);
>  ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,

Yes, the above works, The above code skips the arp-responder for all
non-vif ports
(localnet, vtep, router, l2gateway). Is that ok ?

My usecase requires to skip the arp-responder for
vtep because, multiple hypervisors could respond to an arp from vtep.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn-vtep: fix arping from vtep-gw physical port

2016-09-19 Thread Ramu Ramamurthy
Currently, arping from a vetp-gw physical-switch port to
a VIF IP address does not work.

When a physical-switch-port arps for an IP address
of a VIF, that arp packet comes into the VIF hypervisor via a
vxlan tunnel. That arp packet must not be responded-to by the
arp responder table. Its treatment must be similar to the
arps coming in from the localnet port - which is to
skip the arp responder table.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/northd/ovn-northd.8.xml | 3 ++-
 ovn/northd/ovn-northd.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 307e8be..d6dc860 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -422,7 +422,8 @@
 
   
 Priority-100 flows to skip ARP responder if inport is of type
-localnet, and advances directly to the next table.
+localnet or vtep, and advances directly to
+the next table.
   
 
   
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 5ccb516..ebbc18a 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2669,7 +2669,8 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap 
*ports,
 continue;
 }
 
-if (!strcmp(op->nbsp->type, "localnet")) {
+if (!strcmp(op->nbsp->type, "localnet")
+|| !strcmp(op->nbsp->type, "vtep")) {
 ds_clear();
 ds_put_format(, "inport == %s", op->json_key);
 ovn_lflow_add(lflows, op->od, S_SWITCH_IN_ARP_ND_RSP, 100,
-- 
1.9.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] procfs interface to openvswitch kernel module to help debugging

2016-09-12 Thread Ramu Ramamurthy
Hi,

I dont see a /proc filesystem based interface to help debug the
openvswitch kernel module. Some of the knobs that may be useful are:
a) debug knobs for stats (for those stats are not already reported),
b) control knobs like: stop/start forwarding on the fast-path etc.
Such an interface would help debugging for some problems.

Does such an interface already exist ? If it does not, does it make
sense to add such an interface ?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn: fix ovn-northd leaks in build_acl

2016-09-01 Thread Ramu Ramamurthy
Hi Russell, I updated the patch with a Signed-off-by, but it landed
here as new on patchworks,
https://patchwork.ozlabs.org/patch/665079/
instead of updating the original here:
https://patchwork.ozlabs.org/patch/661079/

On Thu, Sep 1, 2016 at 4:50 PM, Russell Bryant <russ...@ovn.org> wrote:
>
> On Fri, Aug 19, 2016 at 9:06 PM, Ramu Ramamurthy <ramu.ramamur...@gmail.com>
> wrote:
>>
>> From: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
>>
>> The following leaks are due to missing ds_destroy in a few
>> places in build_acl.
>>
>> 5,850 bytes in 50 blocks are definitely lost in loss record 93 of 93
>>at 0x4C29BFD: malloc (in
>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>by 0x4C2BACB: realloc (in
>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>by 0x449507: xrealloc (util.c:123)
>>by 0x42CC73: ds_reserve (dynamic-string.c:63)
>>by 0x42D08F: ds_put_format_valist (dynamic-string.c:161)
>>by 0x42D176: ds_put_format (dynamic-string.c:142)
>>by 0x40D380: build_acls (ovn-northd.c:2320)
>>by 0x40D380: build_lswitch_flows.constprop.36 (ovn-northd.c:2472)
>>by 0x4072D9: build_lflows (ovn-northd.c:3845)
>>by 0x4072D9: ovnnb_db_run (ovn-northd.c:3971)
>>by 0x4072D9: main (ovn-northd.c:4375)
>>
>> 9,360 bytes in 72 blocks are definitely lost in loss record 93 of 93
>>at 0x4C29BFD: malloc (in
>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>by 0x4C2BACB: realloc (in
>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>by 0x449507: xrealloc (util.c:123)
>>by 0x42CC73: ds_reserve (dynamic-string.c:63)
>>by 0x42D08F: ds_put_format_valist (dynamic-string.c:161)
>>by 0x42D176: ds_put_format (dynamic-string.c:142)
>>by 0x40D505: build_acls (ovn-northd.c:2346)
>>by 0x40D505: build_lswitch_flows.constprop.36 (ovn-northd.c:2472)
>>by 0x4072D9: build_lflows (ovn-northd.c:3845)
>>by 0x4072D9: ovnnb_db_run (ovn-northd.c:3971)
>>by 0x4072D9: main (ovn-northd.c:4375)
>> ---
>>  ovn/northd/ovn-northd.c | 2 ++
>>  1 file changed, 2 insertions(+)
>
>
> I was about to apply this, but it's missing your Signed-off-by.  Can you
> please re-submit with Signed-off-by included?
>
> Thanks,
>
> --
> Russell Bryant
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn: fix ovn-northd leaks in build_acl

2016-09-01 Thread Ramu Ramamurthy
The following leaks are due to missing ds_destroy in a few
places in build_acl.

5,850 bytes in 50 blocks are definitely lost in loss record 93 of 93
   at 0x4C29BFD: malloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x4C2BACB: realloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x449507: xrealloc (util.c:123)
   by 0x42CC73: ds_reserve (dynamic-string.c:63)
   by 0x42D08F: ds_put_format_valist (dynamic-string.c:161)
   by 0x42D176: ds_put_format (dynamic-string.c:142)
   by 0x40D380: build_acls (ovn-northd.c:2320)
   by 0x40D380: build_lswitch_flows.constprop.36 (ovn-northd.c:2472)
   by 0x4072D9: build_lflows (ovn-northd.c:3845)
   by 0x4072D9: ovnnb_db_run (ovn-northd.c:3971)
   by 0x4072D9: main (ovn-northd.c:4375)

9,360 bytes in 72 blocks are definitely lost in loss record 93 of 93
   at 0x4C29BFD: malloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x4C2BACB: realloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x449507: xrealloc (util.c:123)
   by 0x42CC73: ds_reserve (dynamic-string.c:63)
   by 0x42D08F: ds_put_format_valist (dynamic-string.c:161)
   by 0x42D176: ds_put_format (dynamic-string.c:142)
   by 0x40D505: build_acls (ovn-northd.c:2346)
   by 0x40D505: build_lswitch_flows.constprop.36 (ovn-northd.c:2472)
   by 0x4072D9: build_lflows (ovn-northd.c:3845)
   by 0x4072D9: ovnnb_db_run (ovn-northd.c:3971)
   by 0x4072D9: main (ovn-northd.c:4375)

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
Acked-by: Ryan Moats <rmo...@us.ibm.com>
---
 ovn/northd/ovn-northd.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 0ad9190..1731332 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2342,6 +2342,7 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows)
 ovn_lflow_add(
 lflows, od, S_SWITCH_OUT_ACL, 34000, ds_cstr(),
 actions);
+ds_destroy();
 }
 }
 
@@ -2368,6 +2369,7 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows)
 ovn_lflow_add(
 lflows, od, S_SWITCH_OUT_ACL, 34000, ds_cstr(),
 actions);
+ds_destroy();
 }
 }
 }
-- 
1.8.3.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2] ovn: log dhcp responses for debugging

2016-09-01 Thread Ramu Ramamurthy
Add a few messages at INFO to help debug the vif lifecycle.
A logsearch on mac or ip helps debug what happened to the
vif and when. This helps easily correlate logs across CMS and ovn.

Logs appear like this:

2016-09-01T18:15:48Z|00014|binding|INFO|Claiming lport 
eee1a9af-7513-4540-9385-9e3972bfca05 for this chassis.
2016-09-01T18:15:48Z|00015|binding|INFO|Claiming fa:16:3e:01:c3:4a 10.0.0.7 
fd93:b509:aa46:0:f816:3eff:fe01:c34a
2016-09-01T18:15:59Z|00016|pinctrl|INFO|DHCPOFFER fa:16:3e:01:c3:4a 10.0.0.7
2016-09-01T18:15:59Z|00017|pinctrl|INFO|DHCPACK fa:16:3e:01:c3:4a 10.0.0.7
2016-09-01T18:16:22Z|00018|binding|INFO|Releasing lport 
eee1a9af-7513-4540-9385-9e3972bfca05 from this chassis.
2016-09-01T18:16:22Z|00019|binding|INFO|Releasing fa:16:3e:01:c3:4a 10.0.0.7 
fd93:b509:aa46:0:f816:3eff:fe01:c34a

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 v1->v2: add rate limit

 ovn/controller/binding.c | 6 ++
 ovn/controller/pinctrl.c | 7 +++
 2 files changed, 13 insertions(+)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index 0353a7b..13c51da 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -143,6 +143,9 @@ consider_local_datapath(struct controller_ctx *ctx,
 } else {
 VLOG_INFO("Claiming lport %s for this chassis.",
   binding_rec->logical_port);
+for (int i = 0; i < binding_rec->n_mac; i++) {
+VLOG_INFO("Claiming %s", binding_rec->mac[i]);
+}
 }
 sbrec_port_binding_set_chassis(binding_rec, chassis_rec);
 }
@@ -179,6 +182,9 @@ consider_local_datapath(struct controller_ctx *ctx,
 if (ctx->ovnsb_idl_txn) {
 VLOG_INFO("Releasing lport %s from this chassis.",
   binding_rec->logical_port);
+for (int i = 0; i < binding_rec->n_mac; i++) {
+VLOG_INFO("Releasing %s", binding_rec->mac[i]);
+}
 sbrec_port_binding_set_chassis(binding_rec, NULL);
 sset_find_and_delete(all_lports, binding_rec->logical_port);
 }
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index d65e213..323faa1 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -354,6 +354,13 @@ pinctrl_handle_put_dhcp_opts(
 pin->packet = dp_packet_data(_out);
 pin->packet_len = dp_packet_size(_out);
 
+/* Log the response. */
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(20, 40);
+const struct eth_header *l2 = dp_packet_l2(_out);
+VLOG_INFO_RL(, "DHCP%s "ETH_ADDR_FMT" "IP_FMT"",
+ msg_type == DHCP_MSG_OFFER ? "OFFER" : "ACK",
+ ETH_ADDR_ARGS(l2->eth_src), IP_ARGS(*offer_ip));
+
 success = 1;
 exit:
 if (!ofperr) {
-- 
1.8.3.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn: log dhcp responses at INFO for debugging

2016-08-31 Thread Ramu Ramamurthy
Since dhcp responses are important for debugging the vif lifecycle,
we want to log them at info level which is the default log level.
A logsearch (on macaddr) can quickly tell the dhcp status using 
such messages. There is no need to rate limit such logs
because, we expect dhcp messages to be at low rate normally.

Logs appear like this:

2016-09-01T00:08:16Z|00014|pinctrl|INFO|DHCPOFFER fa:16:3e:25:b0:78 10.0.0.5
2016-09-01T00:08:16Z|00015|pinctrl|INFO|DHCPACK fa:16:3e:25:b0:78 10.0.0.5

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/controller/pinctrl.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index d65e213..ad42fe9 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -354,6 +354,12 @@ pinctrl_handle_put_dhcp_opts(
 pin->packet = dp_packet_data(_out);
 pin->packet_len = dp_packet_size(_out);
 
+/* Log the response. */
+const struct eth_header *l2 = dp_packet_l2(_out);
+VLOG_INFO("DHCP%s "ETH_ADDR_FMT" "IP_FMT"",
+ msg_type == DHCP_MSG_OFFER ? "OFFER" : "ACK",
+ ETH_ADDR_ARGS(l2->eth_src), IP_ARGS(*offer_ip));
+
 success = 1;
 exit:
 if (!ofperr) {
-- 
1.8.3.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn: add lsp-deletion and bcast-flow removal tests for localnet

2016-08-30 Thread Ramu Ramamurthy
From: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>

Add 2 tests for scenarios around lsp-deletion and flow removal
which have escaped current unit tests.

This test depends on the following patch:
"ovn-controller: Back out incremental processing" and passes
after applying it, but fails currently on master.

1) In the following sequence of events,
createi vif1, create vif2, delete vif1
we find that the localnet patch port
got deleted, whereas it should exist because there is a
bound vif2.

2) The flow broadcasting to tunnels in table=32 must be deleted
when a localnet port gets bound, but we find that the flow remains
in table 32 causing broadcasts to both tunnels and localnet patch.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 tests/ovn.at | 71 
 1 file changed, 71 insertions(+)

diff --git a/tests/ovn.at b/tests/ovn.at
index a23b422..3fa267a 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -5053,3 +5053,74 @@ OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int 
table=0 | grep REG13 | wc -l`
 OVN_CLEANUP([hv1])
 
 AT_CLEANUP
+
+
+AT_SETUP([ovn -- lsp deletion and broadcast-flow deletion on localnet])
+AT_KEYWORDS([ovn])
+ovn_start
+ovn-nbctl ls-add lsw0
+net_add n1
+for i in 1 2; do
+sim_add hv$i
+as hv$i
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.$i
+ovs-vsctl add-br br-eth0
+AT_CHECK([ovs-vsctl set Open_vSwitch . 
external-ids:ovn-bridge-mappings=physnet1:br-eth0])
+done
+
+# Create a localnet port.
+AT_CHECK([ovn-nbctl lsp-add lsw0 ln_port])
+AT_CHECK([ovn-nbctl lsp-set-addresses ln_port unknown])
+AT_CHECK([ovn-nbctl lsp-set-type ln_port localnet])
+AT_CHECK([ovn-nbctl lsp-set-options ln_port network_name=physnet1])
+
+
+# Create 3 vifs.
+AT_CHECK([ovn-nbctl lsp-add lsw0 localvif1])
+AT_CHECK([ovn-nbctl lsp-set-addresses localvif1 "f0:00:00:00:00:01 
192.168.1.1"])
+AT_CHECK([ovn-nbctl lsp-set-port-security localvif1 "f0:00:00:00:00:01"])
+AT_CHECK([ovn-nbctl lsp-add lsw0 localvif2])
+AT_CHECK([ovn-nbctl lsp-set-addresses localvif2 "f0:00:00:00:00:01 
192.168.1.2"])
+AT_CHECK([ovn-nbctl lsp-set-port-security localvif2 "f0:00:00:00:00:02"])
+AT_CHECK([ovn-nbctl lsp-add lsw0 localvif3])
+AT_CHECK([ovn-nbctl lsp-set-addresses localvif3 "f0:00:00:00:00:03 
192.168.1.3"])
+AT_CHECK([ovn-nbctl lsp-set-port-security localvif3 "f0:00:00:00:00:03"])
+
+# Bind the localvif1 to hv1.
+as hv1
+AT_CHECK([ovs-vsctl add-port br-int localvif1 -- set Interface localvif1 
external_ids:iface-id=localvif1])
+
+# On hv1, check that there are no flows outputting bcast to tunnel
+OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=32 | ofctl_strip | 
grep output | wc -l` -eq 0])
+
+# On hv2, check that there is 1 flow outputting bcast to tunnel to hv1.
+as hv2
+OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=32 | ofctl_strip | 
grep output | wc -l` -eq 1])
+
+# Now bind vif2 on hv2.
+AT_CHECK([ovs-vsctl add-port br-int localvif2 -- set Interface localvif2 
external_ids:iface-id=localvif2])
+
+# At this point, the broadcast flow on vif2 should be deleted.
+# because, there is now a localnet vif bound (table=32 programming logic)
+OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=32 | ofctl_strip | 
grep output | wc -l` -eq 0])
+
+# Verify that the local net patch port exists on hv2.
+OVS_WAIT_UNTIL([test `ovs-vsctl show | grep "Port patch-br-int-to-ln_port" | 
wc -l` -eq 1])
+
+# Now bind vif3 on hv2.
+AT_CHECK([ovs-vsctl add-port br-int localvif3 -- set Interface localvif3 
external_ids:iface-id=localvif3])
+
+# Verify that the local net patch port still exists on hv2
+OVS_WAIT_UNTIL([test `ovs-vsctl show | grep "Port patch-br-int-to-ln_port" | 
wc -l` -eq 1])
+
+# Delete localvif2
+AT_CHECK([ovn-nbctl lsp-del localvif2])
+
+# Verify that the local net patch port still exists on hv2,
+# because, localvif3 is still bound.
+OVS_WAIT_UNTIL([test `ovs-vsctl show | grep "Port patch-br-int-to-ln_port" | 
wc -l` -eq 1])
+
+
+OVN_CLEANUP([hv1],[hv2])
+AT_CLEANUP
-- 
1.8.3.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn: fix northd leaks in build_acl

2016-08-19 Thread Ramu Ramamurthy
From: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>

The following leaks are due to missing ds_destroy in a few
places in build_acl.

5,850 bytes in 50 blocks are definitely lost in loss record 93 of 93
   at 0x4C29BFD: malloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x4C2BACB: realloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x449507: xrealloc (util.c:123)
   by 0x42CC73: ds_reserve (dynamic-string.c:63)
   by 0x42D08F: ds_put_format_valist (dynamic-string.c:161)
   by 0x42D176: ds_put_format (dynamic-string.c:142)
   by 0x40D380: build_acls (ovn-northd.c:2320)
   by 0x40D380: build_lswitch_flows.constprop.36 (ovn-northd.c:2472)
   by 0x4072D9: build_lflows (ovn-northd.c:3845)
   by 0x4072D9: ovnnb_db_run (ovn-northd.c:3971)
   by 0x4072D9: main (ovn-northd.c:4375)

9,360 bytes in 72 blocks are definitely lost in loss record 93 of 93
   at 0x4C29BFD: malloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x4C2BACB: realloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x449507: xrealloc (util.c:123)
   by 0x42CC73: ds_reserve (dynamic-string.c:63)
   by 0x42D08F: ds_put_format_valist (dynamic-string.c:161)
   by 0x42D176: ds_put_format (dynamic-string.c:142)
   by 0x40D505: build_acls (ovn-northd.c:2346)
   by 0x40D505: build_lswitch_flows.constprop.36 (ovn-northd.c:2472)
   by 0x4072D9: build_lflows (ovn-northd.c:3845)
   by 0x4072D9: ovnnb_db_run (ovn-northd.c:3971)
   by 0x4072D9: main (ovn-northd.c:4375)

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/northd/ovn-northd.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 625937d..756d188 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2324,6 +2324,7 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows)
 ovn_lflow_add(
 lflows, od, S_SWITCH_OUT_ACL, 34000, ds_cstr(),
 actions);
+ds_destroy();
 }
 }
 
@@ -2350,6 +2351,7 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows)
 ovn_lflow_add(
 lflows, od, S_SWITCH_OUT_ACL, 34000, ds_cstr(),
 actions);
+ds_destroy();
 }
 }
 }
-- 
1.8.3.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn: fix ovn-northd leaks in build_acl

2016-08-19 Thread Ramu Ramamurthy
From: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>

The following leaks are due to missing ds_destroy in a few
places in build_acl.

5,850 bytes in 50 blocks are definitely lost in loss record 93 of 93
   at 0x4C29BFD: malloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x4C2BACB: realloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x449507: xrealloc (util.c:123)
   by 0x42CC73: ds_reserve (dynamic-string.c:63)
   by 0x42D08F: ds_put_format_valist (dynamic-string.c:161)
   by 0x42D176: ds_put_format (dynamic-string.c:142)
   by 0x40D380: build_acls (ovn-northd.c:2320)
   by 0x40D380: build_lswitch_flows.constprop.36 (ovn-northd.c:2472)
   by 0x4072D9: build_lflows (ovn-northd.c:3845)
   by 0x4072D9: ovnnb_db_run (ovn-northd.c:3971)
   by 0x4072D9: main (ovn-northd.c:4375)

9,360 bytes in 72 blocks are definitely lost in loss record 93 of 93
   at 0x4C29BFD: malloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x4C2BACB: realloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
   by 0x449507: xrealloc (util.c:123)
   by 0x42CC73: ds_reserve (dynamic-string.c:63)
   by 0x42D08F: ds_put_format_valist (dynamic-string.c:161)
   by 0x42D176: ds_put_format (dynamic-string.c:142)
   by 0x40D505: build_acls (ovn-northd.c:2346)
   by 0x40D505: build_lswitch_flows.constprop.36 (ovn-northd.c:2472)
   by 0x4072D9: build_lflows (ovn-northd.c:3845)
   by 0x4072D9: ovnnb_db_run (ovn-northd.c:3971)
   by 0x4072D9: main (ovn-northd.c:4375)
---
 ovn/northd/ovn-northd.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 625937d..756d188 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2324,6 +2324,7 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows)
 ovn_lflow_add(
 lflows, od, S_SWITCH_OUT_ACL, 34000, ds_cstr(),
 actions);
+ds_destroy();
 }
 }
 
@@ -2350,6 +2351,7 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows)
 ovn_lflow_add(
 lflows, od, S_SWITCH_OUT_ACL, 34000, ds_cstr(),
 actions);
+ds_destroy();
 }
 }
 }
-- 
1.8.3.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2] pinctrl: Fix memory leak for NAT IPs in send_garp_run().

2016-08-19 Thread Ramu Ramamurthy
I applied this patch, and while it fixes the original leak, it still
produces the following leak:

 valgrind.8774-==8776== 2,464 bytes in 44 blocks are definitely lost
in loss record 176 of 179
valgrind.8774-==8776==at 0x4C29BFD: malloc (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
valgrind.8774-==8776==by 0x48E3C7: xmalloc (util.c:112)
valgrind.8774-==8776==by 0x411BAD: get_nat_addresses_and_keys
(pinctrl.c:1221)
valgrind.8774-==8776==by 0x411BAD: send_garp_run (pinctrl.c:1259)
valgrind.8774-==8776==by 0x411BAD: pinctrl_run (pinctrl.c:787)
valgrind.8774:==8776==by 0x407ABE: main (ovn-controller.c:451)

With the following incremental change, no leak is found in that test.
( as the above points to leaking laddrs which
was mallocd in get_nat_addresses_and_keys())

 struct lport_addresses *laddrs = iter->data;
 destroy_lport_addresses(laddrs);
 shash_delete(_addresses, iter);
+   free(laddrs);
 }
 shash_destroy(_addresses);


On Fri, Aug 19, 2016 at 9:03 AM, Ben Pfaff <b...@ovn.org> wrote:
> send_garp_run() allocated and populated a shash of struct lport_addresses,
> but it only freed some of the data.  This fixes the problem.
>
> CC: Chandra S Vejendla <csvej...@us.ibm.com>
> Reported-by: Ramu Ramamurthy <ramu.ramamur...@gmail.com>
> Fixes: 8439c2ebd823 ("ovn: Support for GARP for NAT IPs via localnet")
> Signed-off-by: Ben Pfaff <b...@ovn.org>
> ---
> v1->v2: Rebase.
>
>  ovn/controller/pinctrl.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
> index 358602a..fc72bd5 100644
> --- a/ovn/controller/pinctrl.c
> +++ b/ovn/controller/pinctrl.c
> @@ -1059,7 +1059,6 @@ send_garp_update(const struct sbrec_port_binding 
> *binding_rec,
>  }
>  free(name);
>  }
> -destroy_lport_addresses(laddrs);
>  return;
>  }
>
> @@ -1302,7 +1301,14 @@ send_garp_run(const struct ovsrec_bridge *br_int, 
> const char *chassis_id,
>  sset_destroy(_vifs);
>  sset_destroy(_l3gw_ports);
>  simap_destroy(_ofports);
> -shash_destroy_free_data(_addresses);
> +
> +SHASH_FOR_EACH_SAFE (iter, next, _addresses) {
> +struct lport_addresses *laddrs = iter->data;
> +destroy_lport_addresses(laddrs);
> +shash_delete(_addresses, iter);
> +}
> +shash_destroy(_addresses);
> +
>  sset_destroy(_ip_keys);
>  }
>
> --
> 2.1.3
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [bug] ovn: make check-valgrind issues

2016-08-18 Thread Ramu Ramamurthy
The following are among many reported by
"make check-valgrind" on the ovn tests. The
ones in pinctrl.c and binding.c seem real.



2246/valgrind.28645-==28647== 1,496 bytes in 22 blocks are definitely
lost in loss record 175 of 179
2246/valgrind.28645-==28647==at 0x4C29BFD: malloc (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
2246/valgrind.28645-==28647==by 0x4C2BACB: realloc (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
2246/valgrind.28645-==28647==by 0x48E407: xrealloc (util.c:123)
2246/valgrind.28645-==28647==by 0x4201A4: add_ipv4_netaddr.isra.0
(ovn-util.c:28)
2246/valgrind.28645-==28647==by 0x420535: extract_lsp_addresses
(ovn-util.c:103)
2246/valgrind.28645-==28647==by 0x4119EB:
get_nat_addresses_and_keys (pinctrl.c:1223)
2246/valgrind.28645-==28647==by 0x4119EB: send_garp_run (pinctrl.c:1260)
2246/valgrind.28645-==28647==by 0x4119EB: pinctrl_run (pinctrl.c:787)
2246/valgrind.28645:==28647==by 0x407ABE: main (ovn-controller.c:451)

2258/valgrind.29417-==29419== 195 (96 direct, 99 indirect) bytes in 3
blocks are definitely lost in loss record 163 of 173
2258/valgrind.29417-==29419==at 0x4C29BFD: malloc (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
2258/valgrind.29417-==29419==by 0x48E1E7: xmalloc (util.c:112)
2258/valgrind.29417-==29419==by 0x435044: resize (hmap.c:100)
2258/valgrind.29417-==29419==by 0x487F2D: hmap_insert_at (hmap.h:277)
2258/valgrind.29417-==29419==by 0x487F2D: sset_add__ (sset.c:53)
2258/valgrind.29417-==29419==by 0x487FE7: sset_clone (sset.c:82)
2258/valgrind.29417-==29419==by 0x40933B: binding_run (binding.c:289)
2258/valgrind.29417:==29419==by 0x4077B1: main (ovn-controller.c:438)


2258/valgrind.29417-==29419== 512 (384 direct, 128 indirect) bytes in
6 blocks are definitely lost in loss record 167 of 173
2258/valgrind.29417-==29419==at 0x4C2B974: calloc (in
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
2258/valgrind.29417-==29419==by 0x48E1B4: xcalloc (util.c:95)
2258/valgrind.29417-==29419==by 0x4093D1: binding_run (binding.c:298)
2258/valgrind.29417:==29419==by 0x4077B1: main (ovn-controller.c:438)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn: fix valgrind leak

2016-08-18 Thread Ramu Ramamurthy
From: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>

This commit fixes the following leak found by check-valgrind in the test:
"send gratuitous arp for nat ips in localnet"

sset gets allocated but not destroyed.

valgrind.14154-==14157== 1,892 bytes in 44 blocks are definitely lost in loss 
record 176 of 180
valgrind.14154-==14157==at 0x4C29BFD: malloc (in 
/usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
valgrind.14154-==14157==by 0x48E1E7: xmalloc (util.c:112)
valgrind.14154-==14157==by 0x487EDC: sset_add__ (sset.c:51)
valgrind.14154-==14157==by 0x411A46: get_nat_addresses_and_keys 
(pinctrl.c:1231)
valgrind.14154-==14157==by 0x411A46: send_garp_run (pinctrl.c:1260)
valgrind.14154-==14157==by 0x411A46: pinctrl_run (pinctrl.c:787)
valgrind.14154:==14157==by 0x407ABE: main (ovn-controller.c:451)

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/controller/pinctrl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index 8bbc42d..358602a 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -1303,6 +1303,7 @@ send_garp_run(const struct ovsrec_bridge *br_int, const 
char *chassis_id,
 sset_destroy(_l3gw_ports);
 simap_destroy(_ofports);
 shash_destroy_free_data(_addresses);
+sset_destroy(_ip_keys);
 }
 
 static void
-- 
1.8.3.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7] ovn-northd: Add logical flows to support native DHCPv4

2016-07-24 Thread Ramu Ramamurthy
On Sat, Jul 23, 2016 at 5:49 AM, Numan Siddique <nusid...@redhat.com> wrote:
> OVN implements a native DHCPv4 support which caters to the common
> use case of providing an IP address to a booting instance by
> providing stateless replies to DHCPv4 requests based on statically
> configured address mappings. To do this it allows a short list of
> DHCPv4 options to be configured and applied at each compute host
> running ovn-controller.
>
> A new table 'DHCP_Options' is added in OVN NB DB to store the DHCP
> options. Logical ports refer to this table to configure the DHCPv4
> options.
>
> For each logical port configured with DHCPv4 Options following flows
> are added
>  - A logical flow which copies the DHCPv4 options to the DHCPv4
>request packets using the 'put_dhcp_opts' action and advances the
>packet to the next stage.
>
>  - A logical flow which implements the DHCP reponder by sending
>the DHCPv4 reply back to the inport once the 'put_dhcp_opts' action
>is applied.
>
> Signed-off-by: Numan Siddique <nusid...@redhat.com>
> Co-authored-by: Ben Pfaff <b...@ovn.org>
> Signed-off-by: Ben Pfaff <b...@ovn.org>
> ---

Tested-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
against openstack cms.
Acked-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v7] ovn-northd: Add logical flows to support native DHCPv4

2016-07-24 Thread Ramu Ramamurthy
Tested to work end-to-end with openstack.

On Sat, Jul 23, 2016 at 5:49 AM, Numan Siddique  wrote:
> OVN implements a native DHCPv4 support which caters to the common
> use case of providing an IP address to a booting instance by
> providing stateless replies to DHCPv4 requests based on statically
> configured address mappings. To do this it allows a short list of
> DHCPv4 options to be configured and applied at each compute host
> running ovn-controller.
>
> A new table 'DHCP_Options' is added in OVN NB DB to store the DHCP
> options. Logical ports refer to this table to configure the DHCPv4
> options.
>
> For each logical port configured with DHCPv4 Options following flows
> are added
>  - A logical flow which copies the DHCPv4 options to the DHCPv4
>request packets using the 'put_dhcp_opts' action and advances the
>packet to the next stage.
>
>  - A logical flow which implements the DHCP reponder by sending
>the DHCPv4 reply back to the inport once the 'put_dhcp_opts' action
>is applied.
>
> Signed-off-by: Numan Siddique 
> Co-authored-by: Ben Pfaff 
> Signed-off-by: Ben Pfaff 
> ---
>  ovn/northd/ovn-northd.8.xml   |  91 +-
>  ovn/northd/ovn-northd.c   | 256 +-
>  ovn/ovn-nb.ovsschema  |  20 ++-
>  ovn/ovn-nb.xml| 270 
>  ovn/utilities/ovn-nbctl.8.xml |  30 +
>  ovn/utilities/ovn-nbctl.c | 197 +
>  tests/ovn.at  | 281 
> ++
>  7 files changed, 1135 insertions(+), 10 deletions(-)
>
>
> v6 -> v7
>   * Rebased and resolved the merge conflicts.
>
> v5 -> v6
>   * Rebased and fixed the compilation errors because of renaming of struct 
> ovn_port's nbs
> to nbsp
>
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index ced2839..b95caef 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -457,7 +457,90 @@ output;
>
>  
>
> -Ingress Table 10: Destination Lookup
> +Ingress Table 10: DHCP option processing
> +
> +
> +  This table adds the DHCPv4 options to a DHCPv4 packet from the
> +  logical ports configured with IPv4 address(es) and DHCPv4 options.
> +
> +
> +
> +  
> +
> +  A priority-100 logical flow is added for these logical ports
> +  which matches the IPv4 packet with udp.src = 68 and
> +  udp.dst = 67 and applies the action
> +  put_dhcp_opts and advances the packet to the next 
> table.
> +
> +
> +
> +reg0[3] = put_dhcp_opts(offer_ip = O, options...);
> +next;
> +
> +
> +
> +  For DHCPDISCOVER and DHCPREQUEST, this transforms the packet into a
> +  DHCP reply, adds the DHCP offer IP O and options to the
> +  packet, and stores 1 into reg0[3].  For other kinds of packets, it
> +  just stores 0 into reg0[3].  Either way, it continues to the next
> +  table.
> +
> +
> +  
> +
> +  
> +A priority-0 flow that matches all packets to advances to table 11.
> +  
> +
> +
> +Ingress Table 11: DHCP responses
> +
> +
> +  This table implements DHCP responder for the DHCP replies generated by
> +  the previous table.
> +
> +
> +
> +  
> +
> +  A priority 100 logical flow is added for the logical ports 
> configured
> +  with DHCPv4 options which matches IPv4 packets with udp.src 
> == 68
> +   udp.dst == 67  reg0[3] == 1 and
> +  responds back to the inport after applying these
> +  actions.  If reg0[3] is set to 1, it means that the
> +  action put_dhcp_opts was successful.
> +
> +
> +
> +eth.dst = eth.src;
> +eth.src = E;
> +ip4.dst = O;
> +ip4.src = S;
> +udp.src = 67;
> +udp.dst = 68;
> +outport = P;
> +inport = ""; /* Allow sending out inport. */
> +output;
> +
> +
> +
> +  where E is the server MAC address and S is 
> the
> +  server IPv4 address defined in the DHCPv4 options and O 
> is
> +  the IPv4 address defined in the logical port's addresses column.
> +
> +
> +
> +  (This terminates ingress packet processing; the packet does not go
> +   to the next ingress table.)
> +
> +  
> +
> +  
> +A priority-0 flow that matches all packets to advances to table 12.
> +  
> +
> +
> +Ingress Table 12: Destination Lookup
>
>  
>This table implements switching behavior.  It contains these logical
> @@ -531,6 +614,12 @@ output;
>there are no rules added for load balancing new connections.
>  
>
> +
> +  Also a priority 34000 logical flow is added for each logical port which
> +  has DHCPv4 options defined to allow the DHCPv4 reply packet from the
> 

Re: [ovs-dev] [PATCH v5] ovn-northd: Add logical flows to support native DHCPv4

2016-07-16 Thread Ramu Ramamurthy
Tested-By: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
with corresponding openstack patch.

Is there a mechanism to rate-limit the rate of broadcast
dhcp-requests ? This is useful when there are untrusted vifs requesting dhcp.
I sent dhcp-requests from a namespace at a high rate (100pkts/s, 200pks/s etc),
ovn-controller is able to handle 500 dhcp-requests/s easily with
replies. However, cpu of the
ovn-controller climbs with with the dhcp-request rate.

On Thu, Jul 14, 2016 at 10:07 AM, Numan Siddique <nusid...@redhat.com> wrote:
> OVN implements a native DHCPv4 support which caters to the common
> use case of providing an IP address to a booting instance by
> providing stateless replies to DHCPv4 requests based on statically
> configured address mappings. To do this it allows a short list of
> DHCPv4 options to be configured and applied at each compute host
> running ovn-controller.
>
> A new table 'DHCP_Options' is added in OVN NB DB to store the DHCP
> options. Logical ports refer to this table to configure the DHCPv4
> options.
>
> For each logical port configured with DHCPv4 Options following flows
> are added
>  - A logical flow which copies the DHCPv4 options to the DHCPv4
>request packets using the 'put_dhcp_opts' action and advances the
>packet to the next stage.
>
>  - A logical flow which implements the DHCP reponder by sending
>the DHCPv4 reply back to the inport once the 'put_dhcp_opts' action
>is applied.
>
> Signed-off-by: Numan Siddique <nusid...@redhat.com>
> Co-authored-by: Ben Pfaff <b...@ovn.org>
> Signed-off-by: Ben Pfaff <b...@ovn.org>
> ---
>  ovn/northd/ovn-northd.8.xml   |  91 +-
>  ovn/northd/ovn-northd.c   | 256 +-
>  ovn/ovn-nb.ovsschema  |  20 ++-
>  ovn/ovn-nb.xml| 270 
>  ovn/utilities/ovn-nbctl.8.xml |  30 +
>  ovn/utilities/ovn-nbctl.c | 197 +
>  tests/ovn.at  | 281 
> ++
>  7 files changed, 1135 insertions(+), 10 deletions(-)
>
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index 178a1a9..447e042 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -423,7 +423,90 @@ output;
>
>  
>
> -Ingress Table 10: Destination Lookup
> +Ingress Table 10: DHCP option processing
> +
> +
> +  This table adds the DHCPv4 options to a DHCPv4 packet from the
> +  logical ports configured with IPv4 address(es) and DHCPv4 options.
> +
> +
> +
> +  
> +
> +  A priority-100 logical flow is added for these logical ports
> +  which matches the IPv4 packet with udp.src = 68 and
> +  udp.dst = 67 and applies the action
> +  put_dhcp_opts and advances the packet to the next 
> table.
> +
> +
> +
> +reg0[3] = put_dhcp_opts(offer_ip = O, options...);
> +next;
> +
> +
> +
> +  For DHCPDISCOVER and DHCPREQUEST, this transforms the packet into a
> +  DHCP reply, adds the DHCP offer IP O and options to the
> +  packet, and stores 1 into reg0[3].  For other kinds of packets, it
> +  just stores 0 into reg0[3].  Either way, it continues to the next
> +  table.
> +
> +
> +  
> +
> +  
> +A priority-0 flow that matches all packets to advances to table 11.
> +  
> +
> +
> +Ingress Table 11: DHCP responses
> +
> +
> +  This table implements DHCP responder for the DHCP replies generated by
> +  the previous table.
> +
> +
> +
> +  
> +
> +  A priority 100 logical flow is added for the logical ports 
> configured
> +  with DHCPv4 options which matches IPv4 packets with udp.src 
> == 68
> +   udp.dst == 67  reg0[3] == 1 and
> +  responds back to the inport after applying these
> +  actions.  If reg0[3] is set to 1, it means that the
> +  action put_dhcp_opts was successful.
> +
> +
> +
> +eth.dst = eth.src;
> +eth.src = E;
> +ip4.dst = O;
> +ip4.src = S;
> +udp.src = 67;
> +udp.dst = 68;
> +outport = P;
> +inport = ""; /* Allow sending out inport. */
> +output;
> +
> +
> +
> +  where E is the server MAC address and S is 
> the
> +  server IPv4 address defined in the DHCPv4 options and O 
> is
> +  the IPv4 address defined in the logical port's addresses column.
> +
> +
> +
> +

Re: [ovs-dev] [PATCH v4] ovn-northd: Add logical flows to support native DHCPv4

2016-07-11 Thread Ramu Ramamurthy
Tested-By: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com> against the
corresponding (rebased) openstack patch:
https://review.openstack.org/#/c/243174/
Hence, Acked-By: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>


On Sun, Jul 3, 2016 at 8:54 PM, Numan Siddique <nusid...@redhat.com> wrote:
> OVN implements a native DHCPv4 support which caters to the common
> use case of providing an IP address to a booting instance by
> providing stateless replies to DHCPv4 requests based on statically
> configured address mappings. To do this it allows a short list of
> DHCPv4 options to be configured and applied at each compute host
> running ovn-controller.
>
> A new table 'DHCP_Options' is added in OVN NB DB to store the DHCP
> options. Logical ports refer to this table to configure the DHCPv4
> options.
>
> For each logical port configured with DHCPv4 Options following flows
> are added
>  - A logical flow which copies the DHCPv4 options to the DHCPv4
>request packets using the 'put_dhcp_opts' action and advances the
>packet to the next stage.
>
>  - A logical flow which implements the DHCP reponder by sending
>the DHCPv4 reply back to the inport once the 'put_dhcp_opts' action
>is applied.
>
> Signed-off-by: Numan Siddique <nusid...@redhat.com>
> Co-authored-by: Ben Pfaff <b...@ovn.org>
> Signed-off-by: Ben Pfaff <b...@ovn.org>
> ---
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v10 2/2] ovn-northd: Add logical flows to support native DHCP

2016-06-08 Thread Ramu Ramamurthy
I tested and verified this patchset using the corresponding WIP
openstack dhcp patch.
native-dhcp works end-to-end as advertized - I tested with the default
dhcp-options
that are used when VMs boot.

On Mon, Jun 6, 2016 at 10:49 PM, Numan Siddique  wrote:
> OVN implements a native DHCP support which caters to the common
> use case of providing an IP address to a booting instance by
> providing stateless replies to DHCP requests based on statically
> configured address mappings. To do this it allows a short list of
> DHCP options to be configured and applied at each compute host
> running ovn-controller.
>
> A new table 'Subnet' is added in OVN NB DB to store the DHCP options.
>
> For each logical port following flows are added if the CMS has defined
> DHCP options in the 'Subnet' column
>
>  - A logical flow which copies the DHCP options to the DHCP
>request packets using the 'put_dhcp_opts' action and advances the
>packet to the next stage.
>
>  - A logical flow which implements the DHCP reponder by sending
>the DHCP reply back to the inport once the 'put_dhcp_opts' action
>is applied.
>
> Signed-Off-by: Numan Siddique 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v8 2/2] ovn-northd: Add logical flows to support native DHCP

2016-05-25 Thread Ramu Ramamurthy
Numan, I tested this patchset along with the corresponding WIP ml2
native-dhcp openstack patch.
It looks good to me,

On Tue, May 24, 2016 at 11:04 AM, Numan Siddique  wrote:
> OVN implements a native DHCP support which caters to the common
> use case of providing an IP address to a booting instance by
> providing stateless replies to DHCP requests based on statically
> configured address mappings. To do this it allows a short list of
> DHCP options to be configured and applied at each compute host
> running ovn-controller.
>
> A new table 'Subnet' is added in OVN NB DB to store the DHCP options.
>
> For each logical port following flows are added if the CMS has defined
> DHCP options in the 'Subnet' column
>
>  - A logical flow which copies the DHCP options to the DHCP
>request packets using the 'put_dhcp_opts' action and advances the
>packet to the next stage.
>
>  - A logical flow which implements the DHCP reponder by sending
>the DHCP reply back to the inport once the 'put_dhcp_opts' action
>is applied.
>
> Signed-Off-by: Numan Siddique 
> ---
>  ovn/northd/ovn-northd.8.xml   |  89 +++-
>  ovn/northd/ovn-northd.c   | 265 ++-
>  ovn/ovn-nb.ovsschema  |  19 ++-
>  ovn/ovn-nb.xml| 314 
> +-
>  ovn/utilities/ovn-nbctl.8.xml |  29 
>  ovn/utilities/ovn-nbctl.c | 196 ++
>  tests/ovn.at  | 250 +
>  tests/test-ovn-dhcp.c | 135 ++
>  8 files changed, 1287 insertions(+), 10 deletions(-)
>
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index 970c352..9a4a46a 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -326,7 +326,88 @@ output;
>
>  
>
> -Ingress Table 6: Destination Lookup
> +Ingress Table 6: DHCP pause
> +
> +
> +  This table adds the DHCP options to a DHCP packet from the
> +  logical ports configured with IPv4 address(es) and DHCP options.
> +
> +
> +
> +  
> +
> +  A priority-100 logical flow is added for these logical ports
> +  which matches the packet with udp.src = 68 and
> +  udp.dst = 67 and applies the action
> +  put_dhcp_opts and advances the packet to the table 7.
> +
> +
> +
> +put_dhcp_opts(R, offer_ip = O, DHCP_OPTIONS);
> +next;
> +
> +
> +
> +  The action put_dhcp_opts adds the DHCP options to
> +  the packet and sets the yiaddr to the offer ip
> +  defined in O only if the packet is a
> +  DHCPDISCOVER/DHCPREQUEST and resumes the pipeline by setting
> +  0x1 to the OVS register R.
> +  If the packet is non-DHCP packet or invalid DHCP packet
> +  it resumes the pipeline without any modifications.
> +
> +
> +  
> +
> +  
> +A priority-0 flow that matches all packets to advances to table 7.
> +  
> +
> +
> +Ingress Table 7: DHCP resume
> +
> +
> +  This table implements DHCP responder for the resumed DHCP packets
> +  from the previous table.
> +
> +
> +
> +  
> +
> +  A priority 100 logical flow is added for the logical ports 
> configured
> +  with DHCP options which matches the packet with
> +  udp.src = 68, udp.dst = 67 and
> +  reg0 = 0x1 and responds back to the
> +  inport after applying these actions.
> +  If reg0 is set to 1, it means that the action
> +  put_dhcp_opts was successful.
> +
> +
> +
> +eth.dst = eth.src;
> +eth.src = E;
> +ip4.dst = O;
> +ip4.src = S;
> +udp.src = 67;
> +udp.dst = 68;
> +outport = P;
> +inport = ""; /* Allow sending out inport. */
> +output;
> +
> +
> +
> +  Where E is the server mac address and S is 
> the
> +  server IPv4 address defined in the DHCP options and O is
> +  the IPv4 address defined in the logical port's addresses column.
> +
> +  
> +
> +  
> +A priority-0 flow that matches all packets to advances to table 8.
> +  
> +
> +
> +Ingress Table 8: Destination Lookup
>
>  
>This table implements switching behavior.  It contains these logical
> @@ -370,6 +451,12 @@ output;
>This is similar to ingress table 4 except for to-lport 
> ACLs.
>  
>
> +
> +  Also a priority 34000 logical flow is added for each subnet of the 
> logical
> +  switch which has DHCP options defined to allow the DHCP reply packet
> +  from the Ingress Table 7: DHCP resume.
> +
> +
>  Egress Table 2: Egress Port Security - IP
>
>  
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 44e9430..ff16622 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -26,6 +26,7 @@
>  

Re: [ovs-dev] [PATCH v7 1/2] ovn-controller: Add 'put_dhcp_opts' action in ovn-controller

2016-05-23 Thread Ramu Ramamurthy
Tested-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>

I tested this patchset to work end-to-end with openstack (with the
corresponding WIP openstack patch).

> +/* Check that the DHCP Message Type (opt 53) is present or not with
> + * valid values - DHCP_MSG_DISCOVER or DHCP_MSG_REQUEST as the first
> + * DHCP option.
> + */
> +if (!(in_dhcp_opt[0] == DHCP_OPT_MSG_TYPE && in_dhcp_opt[1] == 1 && (
> +in_dhcp_opt[2] == DHCP_MSG_DISCOVER ||
> +in_dhcp_opt[2] == DHCP_MSG_REQUEST))) {
> +/* Invalid dhcp packet. Resume the packet without further
> + * processing. */
> +goto exit;
> +}
> +

Would it make sense to log messages are not handling (nak, decline,
release, inform)
 to help debug (rare) scenarios when VMs dont get dhcp IPs as expected.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v5 2/2] ovn: Add logical flows to support native DHCP

2016-05-17 Thread Ramu Ramamurthy
> The reason for not adding the flow in IN_ACL is because the CMS can add
> flows to allow or drop DHCP  traffic on a logical port if it wants to. In
> the case of OpenStack networking-ovn, it is adding the below flows for each
> logical port.
>
>   table=4(   ls_in_acl), priority= 2002, match=(ct.new && (inport ==
> "2636f285-6d1a-4ad9-89db-c3323349c554" && ip4)), action=(ct_commit; next;)
>   table=4(   ls_in_acl), priority= 2002, match=(ct.new && (inport ==
> "2636f285-6d1a-4ad9-89db-c3323349c554" && ip6)), action=(ct_commit; next;)
>   table=4(   ls_in_acl), priority= 2001, match=(inport ==
> "2636f285-6d1a-4ad9-89db-c3323349c554" && ip), action=(drop;)
>
>
> Actually if we want we can remove the 34000 OUT_ACL flow from ovn-northd and
> let CMS add it. I initially thought its good to take care of it on
> ovn-northd. But now I am not sure whats the best approach. Please let me
> know your comments.
>
Thanks, It may be fine to program the out-acl flows implicitly (by
northd) as a consequence of the CMS
setting enable_dhcp on the port (ie the user has accepted DHCP traffic
on the port by setting enable_dhcp).
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] pinctrl: Fix "sparse" warning.

2016-05-17 Thread Ramu Ramamurthy
Darrell, Ben,

Thanks for the fix, The warning was introduced by my patch "send GARP
on localnet"
In the future, I will run my changes through sparse (and possibly
clang) to detect such
problems prior to sharing a patch.

ovn/controller/pinctrl.c:609:39: warning: incorrect type in assignment
(different base types)
ovn/controller/pinctrl.c:609:39:expected restricted ofp_port_t
[usertype] port
ovn/controller/pinctrl.c:609:39:got int [signed] ofport

On Tue, May 17, 2016 at 8:12 AM, Darrell Ball  wrote:
> On Tue, May 17, 2016 at 7:44 AM, Ben Pfaff  wrote:
>
>> The ofport member should be an ofp_port_t, since it represents an OpenFlow
>> port number.
>>
>> Fixes: 9baaabfff3c7 ("ovn: Fix localnet ports deletion and recreation
>> sometimes after restart.")
>>
>
> Is this rather related to "ovn: send garp on localnet." ?
>
>
>
>> Signed-off-by: Ben Pfaff 
>> ---
>>  ovn/controller/pinctrl.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
>> index 1611bcd..bc57c40 100644
>> --- a/ovn/controller/pinctrl.c
>> +++ b/ovn/controller/pinctrl.c
>> @@ -514,7 +514,7 @@ struct garp_data {
>>  ovs_be32 ipv4;   /* Ipv4 address of port. */
>>  long long int announce_time; /* Next announcement in ms. */
>>  int backoff; /* Backoff for the next announcement. */
>> -int ofport;  /* ofport used to output this GARP. */
>> +ofp_port_t ofport;   /* ofport used to output this GARP. */
>>  };
>>
>>  /* Contains GARPs to be sent. */
>> @@ -548,7 +548,8 @@ send_garp_update(const struct sbrec_port_binding
>> *binding_rec,
>>  if (!ld || !ld->localnet_port) {
>>  return;
>>  }
>> -int ofport = simap_get(localnet_ofports,
>> ld->localnet_port->logical_port);
>> +ofp_port_t ofport = u16_to_ofp(simap_get(localnet_ofports,
>> +
>>  ld->localnet_port->logical_port));
>>
>>  /* Update GARP if it exists. */
>>  struct garp_data *garp = shash_find_data(_garp_data,
>> --
>> 2.1.3
>>
>> ___
>> dev mailing list
>> dev@openvswitch.org
>> http://openvswitch.org/mailman/listinfo/dev
>>
> ___
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v5 2/2] ovn: Add logical flows to support native DHCP

2016-05-16 Thread Ramu Ramamurthy
Tested-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>

I tested v5 of this patchset to work end-to-end with openstack (using
your openstack changes which are also under review). The options tested
include dns-server and classless-static-route.

A question I have is why you program these out-acl flows at prio 34000,
whereas you did not need to program the corresponding in-acl flow ?
  table=1(  ls_out_acl), priority=34000, match=(eth.src ==
fa:16:3e:94:07:40 && ip4.src == 10.0.2.1 && udp && udp.src == 67 &&
udp.dst == 68), action=(ct_commit; next;)

On Mon, May 16, 2016 at 2:23 AM, Numan Siddique <nusid...@redhat.com> wrote:
> OVN implements a native DHCP support which caters to the common
> use case of providing an IP address to a booting instance by
> providing stateless replies to DHCP requests based on statically
> configured address mappings. To do this it allows a short list of
> DHCP options to be configured and applied at each compute host
> running ovn-controller.
>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH RFC] ovn: localnet ports deleted, recreated sometimes after restart

2016-04-28 Thread Ramu Ramamurthy
On graceful-restart of ovn-controller, the chassis row is
inserted in the Chassis table. During this transaction,
there is a window of time where an idl row-read may not
return the newly created row - even though the row should exist,
but the transaction is in an incomplete state.
As a result, get_chassis() in binding_run() returns a null chassis record
binding_run exits early, and does not
create local_datapaths, and patch_run deletes
localnet patch ports. In a later run, the localnet patch ports are recreated.

This is reproducable consistently but not on every restart.
The fix is to handle the case that the chassis record
may be null in binding_run, and yet create local_datapaths.

RFC because, I am not sure if the fix is correct - ie,
can there be local_datapaths with no chassis record ? And if this
fix is not correct, then what is the correct fix ? Would it be
to not run binding_run() and patch_run() when ovnsb_idl_txn is null ?

Restart logs follow with commentary:

2016-04-28T18:35:42.448Z|1|vlog|INFO|opened log file 
/home/ovs/ovs/tests/testsuite.dir/2035/hv/ovn-controller.log
2016-04-28T18:35:42.449Z|2|reconnect|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/db.sock:
 connecting...
2016-04-28T18:35:42.449Z|3|reconnect|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/db.sock:
 connected
2016-04-28T18:35:42.452Z|4|reconnect|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/ovn-sb/ovn-sb.sock:
 connecting...
2016-04-28T18:35:42.452Z|5|reconnect|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/ovn-sb/ovn-sb.sock:
 connected
2016-04-28T18:35:42.454Z|6|ovsdb_idl|INFO|ovsdb_idl_txn_insert:
Chassis row inserted into transaction above
2016-04-28T18:35:42.454Z|7|binding|INFO|Claiming lport localvif2 for this 
chassis.
2016-04-28T18:35:42.454Z|8|binding|INFO|Claiming lport localvif3 for this 
chassis.
2016-04-28T18:35:42.454Z|9|binding|INFO|Claiming lport localcif4 for this 
chassis.
2016-04-28T18:35:42.454Z|00010|binding|INFO|Claiming lport localcif5 for this 
chassis.
2016-04-28T18:35:42.454Z|00011|binding|INFO|Claiming lport localcif1 for this 
chassis.
2016-04-28T18:35:42.454Z|00012|binding|INFO|Claiming lport localvif1 for this 
chassis.
2016-04-28T18:35:42.454Z|00013|binding|INFO|Claiming lport localvif201 for this 
chassis.
2016-04-28T18:35:42.454Z|00014|binding|INFO|Claiming lport localcif3 for this 
chassis.
2016-04-28T18:35:42.454Z|00015|binding|INFO|Claiming lport localcif2 for this 
chassis.
   Binding run found the chassis record and has claimed the vifs
2016-04-28T18:35:42.455Z|00016|ofctrl|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt:
 connecting to switch
2016-04-28T18:35:42.455Z|00017|rconn|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt:
 connecting...
2016-04-28T18:35:42.455Z|00018|pinctrl|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt:
 connecting to switch
2016-04-28T18:35:42.456Z|00019|rconn|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt:
 connecting...
2016-04-28T18:35:42.457Z|00020|ovsdb_idl|INFO|ovsdb_idl_row_clear_new:
At this point read of Chassis table returns no rows, and
the transaction status is still incomplete.
2016-04-28T18:35:42.457Z|00021|binding|INFO|no chassis rec!
Binding run exits early because chassis_rec was null
2016-04-28T18:35:42.459Z|00022|patch|INFO|removing port 
patch-br-int-to-localnet201
2016-04-28T18:35:42.459Z|00023|patch|INFO|removing port 
patch-br-int-to-localnet1
2016-04-28T18:35:42.459Z|00024|patch|INFO|removing port 
patch-localnet1-to-br-int
2016-04-28T18:35:42.459Z|00025|patch|INFO|removing port 
patch-localnet201-to-br-int
   Localnet ports are removed above, because local_datapaths dont 
exist
2016-04-28T18:35:42.459Z|00026|rconn|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt:
 connected
2016-04-28T18:35:42.460Z|00027|rconn|INFO|unix:/home/ovs/ovs/tests/testsuite.dir/2035/hv/br-int.mgmt:
 connected
2016-04-28T18:35:42.460Z|00028|ovsdb_idl|INFO|ovsdb_idl_row_create:
   Now, the transaction is complete

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/controller/binding.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index 32fcb85..9f50cd5 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -155,9 +155,6 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 const struct sbrec_port_binding *binding_rec;
 
 chassis_rec = get_chassis(ctx->ovnsb_idl, chassis_id);
-if (!chassis_rec) {
-return;
-}
 
 struct shash lports = SHASH_INITIALIZER();
 if (br_int) {
@@ -190,10 +187,10 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 if (iface_rec && ctx->ovs_idl_txn) {
 

Re: [ovs-dev] [PATCH v1 2/2] ovn: Send garp on localnet

2016-04-26 Thread Ramu Ramamurthy
>
> Is there an equivalent of GARP for IPv6?
>

The "Unsolicited Neighbor Advertisement" seems to be the
equivalent of GARP for IPv6.

http://tools.ietf.org/html/rfc2461#section-7.2.6

The problem motivating this fix was seen with IPv4 though.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v2 2/2] ovn: send garp on localnet

2016-04-26 Thread Ramu Ramamurthy
In some usecases such as VM migration or when VMs reuse IP addresses,
VMs become unreachable externally because
external switches/routers on localnet
have stale port-mac or arp caches. The problem resolves
after some time when the caches ageout which could be
minutes for port-mac bindings or hours for arp caches.

To fix this, send some gratuitous ARPs when a logical
port on a localnet datapath gets added. Such gratuitous arps
help on a best-effort basis to update the mac-port bindings and arp-caches
of external switches and routers on the localnet.

Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1545897
Reported-by: Kyle Mestery <mest...@mestery.com>
Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/controller/ovn-controller.c |   2 +-
 ovn/controller/pinctrl.c| 244 +++-
 ovn/controller/pinctrl.h|   5 +-
 tests/ovn.at|  63 +++
 4 files changed, 311 insertions(+), 3 deletions(-)

diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index f68f842..ae7d44b 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -373,7 +373,7 @@ main(int argc, char *argv[])
 
 enum mf_field_id mff_ovn_geneve = ofctrl_run(br_int);
 
-pinctrl_run(, , br_int);
+pinctrl_run(, , br_int, chassis_id, _datapaths);
 
 struct hmap flow_table = HMAP_INITIALIZER(_table);
 lflow_run(, , , _datapaths,
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index cbac50e..aec2aa3 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -22,6 +22,8 @@
 #include "dp-packet.h"
 #include "flow.h"
 #include "lport.h"
+#include "ovn-controller.h"
+#include "lib/sset.h"
 #include "openvswitch/ofp-actions.h"
 #include "openvswitch/ofp-msgs.h"
 #include "openvswitch/ofp-print.h"
@@ -30,6 +32,7 @@
 #include "ovn-controller.h"
 #include "ovn/lib/actions.h"
 #include "ovn/lib/logical-fields.h"
+#include "ovn/lib/ovn-util.h"
 #include "poll-loop.h"
 #include "rconn.h"
 #include "socket-util.h"
@@ -54,6 +57,14 @@ static void run_put_arps(struct controller_ctx *,
 static void wait_put_arps(struct controller_ctx *);
 static void flush_put_arps(void);
 
+static void init_send_garps(void);
+static void destroy_send_garps(void);
+static void send_garp_wait(void);
+static void send_garp_run(const struct ovsrec_bridge *,
+  const char *chassis_id,
+  const struct lport_index *lports,
+  struct hmap *local_datapaths);
+
 COVERAGE_DEFINE(pinctrl_drop_put_arp);
 
 void
@@ -62,6 +73,7 @@ pinctrl_init(void)
 swconn = rconn_create(5, 0, DSCP_DEFAULT, 1 << OFP13_VERSION);
 conn_seq_no = 0;
 init_put_arps();
+init_send_garps();
 }
 
 static ovs_be32
@@ -266,7 +278,9 @@ pinctrl_recv(const struct ofp_header *oh, enum ofptype type)
 
 void
 pinctrl_run(struct controller_ctx *ctx, const struct lport_index *lports,
-const struct ovsrec_bridge *br_int)
+const struct ovsrec_bridge *br_int,
+const char *chassis_id,
+struct hmap *local_datapaths)
 {
 if (br_int) {
 char *target;
@@ -307,6 +321,7 @@ pinctrl_run(struct controller_ctx *ctx, const struct 
lport_index *lports,
 }
 
 run_put_arps(ctx, lports);
+send_garp_run(br_int, chassis_id, lports, local_datapaths);
 }
 
 void
@@ -315,6 +330,7 @@ pinctrl_wait(struct controller_ctx *ctx)
 wait_put_arps(ctx);
 rconn_run_wait(swconn);
 rconn_recv_wait(swconn);
+send_garp_wait();
 }
 
 void
@@ -322,6 +338,7 @@ pinctrl_destroy(void)
 {
 rconn_destroy(swconn);
 destroy_put_arps();
+destroy_send_garps();
 }
 
 /* Implementation of the "put_arp" OVN action.  This action sends a packet to
@@ -484,3 +501,228 @@ flush_put_arps(void)
 free(pa);
 }
 }
+
+/*
+ * Send gratuitous arp for vif on localnet.
+ *
+ * When a new vif on localnet is added, gratuitous arps are sent announcing
+ * the port's mac,ip mapping.  On localnet, such announcements are needed for
+ * switches and routers on the broadcast segment to update their port-mac
+ * and arp tables.
+ */
+struct garp_data {
+struct eth_addr ea;  /* Ethernet address of port. */
+ovs_be32 ipv4;   /* Ipv4 address of port. */
+long long int announce_time; /* Next announcement in ms. */
+int backoff; /* Backoff for the next announcement. */
+int ofport;  /* Ofport used to output this garp. */
+};
+
+/* Contains garps to be sent. */
+static struct shash send_garp_data;
+
+/* Next garp announcement in ms. */
+static long long int send_garp_time;
+
+static void
+init_send_garps(void)
+{
+shash_i

[ovs-dev] [PATCH v2 1/2] ovn: Move extract_lport_addresses

2016-04-26 Thread Ramu Ramamurthy
Move the function extract_lport_addresses to a file
in ovn/lib since that function can be used by ovn-controller also
to parse addresses stored in the mac column of the
port_binding table. Currently that function is used only
in ovn_northd.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/lib/automake.mk |   2 +
 ovn/lib/ovn-util.c  | 104 
 ovn/lib/ovn-util.h  |  44 
 ovn/northd/ovn-northd.c | 104 +---
 4 files changed, 151 insertions(+), 103 deletions(-)
 create mode 100644 ovn/lib/ovn-util.c
 create mode 100644 ovn/lib/ovn-util.h

diff --git a/ovn/lib/automake.mk b/ovn/lib/automake.mk
index 9e09352..2b178da 100644
--- a/ovn/lib/automake.mk
+++ b/ovn/lib/automake.mk
@@ -10,6 +10,8 @@ ovn_lib_libovn_la_SOURCES = \
ovn/lib/expr.h \
ovn/lib/lex.c \
ovn/lib/lex.h \
+   ovn/lib/ovn-util.c \
+   ovn/lib/ovn-util.h \
ovn/lib/logical-fields.h
 nodist_ovn_lib_libovn_la_SOURCES = \
ovn/lib/ovn-nb-idl.c \
diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
new file mode 100644
index 000..abdc247
--- /dev/null
+++ b/ovn/lib/ovn-util.c
@@ -0,0 +1,104 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include "ovn-util.h"
+#include "openvswitch/vlog.h"
+
+VLOG_DEFINE_THIS_MODULE(ovn_util);
+
+/*
+ * Extracts the mac, ipv4 and ipv6 addresses from the input param 'address'
+ * which should be of the format 'MAC [IP1 IP2 ..]" where IPn should be
+ * a valid IPv4 or IPv6 address and stores them in the 'ipv4_addrs' and
+ * 'ipv6_addrs' fields of input param 'laddrs'.
+ * The caller has to free the 'ipv4_addrs' and 'ipv6_addrs' fields.
+ * If input param 'store_ipv6' is true only then extracted ipv6 addresses
+ * are stored in 'ipv6_addrs' fields.
+ * Return true if at least 'MAC' is found in 'address', false otherwise.
+ * Eg 1.
+ * If 'address' = '00:00:00:00:00:01 10.0.0.4 fe80::ea2a:eaff:fe28:3390/64
+ * 30.0.0.3/23' and 'store_ipv6' = true
+ * then returns true with laddrs->n_ipv4_addrs = 2, naddrs->n_ipv6_addrs = 1.
+ *
+ * Eg. 2
+ * If 'address' = '00:00:00:00:00:01 10.0.0.4 fe80::ea2a:eaff:fe28:3390/64
+ * 30.0.0.3/23' and 'store_ipv6' = false
+ * then returns true with laddrs->n_ipv4_addrs = 2, naddrs->n_ipv6_addrs = 0.
+ *
+ * Eg 3. If 'address' = '00:00:00:00:00:01 10.0.0.4 addr 30.0.0.4', then
+ * returns true with laddrs->n_ipv4_addrs = 1 and laddrs->n_ipv6_addrs = 0.
+ */
+bool
+extract_lport_addresses(char *address, struct lport_addresses *laddrs,
+bool store_ipv6)
+{
+char *buf = address;
+int buf_index = 0;
+char *buf_end = buf + strlen(address);
+if (!ovs_scan_len(buf, _index, ETH_ADDR_SCAN_FMT,
+  ETH_ADDR_SCAN_ARGS(laddrs->ea))) {
+return false;
+}
+
+ovs_be32 ip4;
+struct in6_addr ip6;
+unsigned int plen;
+char *error;
+
+laddrs->n_ipv4_addrs = 0;
+laddrs->n_ipv6_addrs = 0;
+laddrs->ipv4_addrs = NULL;
+laddrs->ipv6_addrs = NULL;
+
+/* Loop through the buffer and extract the IPv4/IPv6 addresses
+ * and store in the 'laddrs'. Break the loop if invalid data is found.
+ */
+buf += buf_index;
+while (buf < buf_end) {
+buf_index = 0;
+error = ip_parse_cidr_len(buf, _index, , );
+if (!error) {
+laddrs->n_ipv4_addrs++;
+laddrs->ipv4_addrs = xrealloc(
+laddrs->ipv4_addrs,
+sizeof (struct ipv4_netaddr) * laddrs->n_ipv4_addrs);
+laddrs->ipv4_addrs[laddrs->n_ipv4_addrs - 1].addr = ip4;
+laddrs->ipv4_addrs[laddrs->n_ipv4_addrs - 1].plen = plen;
+buf += buf_index;
+continue;
+}
+free(error);
+error = ipv6_parse_cidr_len(buf, _index, , );
+if (!error && store_ipv6) {
+laddrs->n_ipv6_addrs++;
+laddrs->ipv6_addrs = xrealloc(
+laddrs->ipv6_addrs,
+sizeof(struct ipv6_netaddr) * laddrs->n_ipv6_addrs);
+memcpy(>ipv6_addrs[laddrs->n_ipv6_addrs - 1].addr, ,
+   sizeof(struct in6_addr));
+laddrs->ipv6_addrs[laddrs->n_ipv6_addrs - 1].plen = plen;
+   

[ovs-dev] [PATCH v2 0/2] ovn: send garp on localnet

2016-04-26 Thread Ramu Ramamurthy

The first patch refactors the function extract_lport_addresses
so that it can be used in ovn-northd and in ovn-controller.

The second patch depends on the first patch, and implements the
functionality to send gratuitous arps on localnet.

Changes v1->v2:

Changes per code-review:

* send gratuitous arp request instead of a reply
* use poll_timer_wait_until for timing
* Various code changes to conform to coding style


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v2 RFC] ovn: Support native dhcp using 'continuations'

2016-04-20 Thread Ramu Ramamurthy
Tested-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>

Numan, I tested this patch to work on devstack+ovn without the openstack-plugin,
with manual configuration.

Notes:

1) In ovn/utilities/ovn-nbctl.c, usage() Can you add a help string to
ovn-nbctl for the new command
lswitch-get-dhcp-options, and lswitch-set-dhcp-options

2) When server_mac was not defined, the lflow was created like this,
maybe the error
checks can be tighter in ovn-northd.c

table=3(  ls_in_dhcp), priority=   50, match=(inport ==
"7fe086b5-9ab0-4c14-bf62-0291a62a4b14" && eth.src == fa:16:3e:cf:3d:bc
&& ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && udp.src == 68
&& udp.dst == 67), action=(dhcp_offer(offerip = 10.0.1.3, netmask =
255.255.255.0, router = 10.0.1.1, mtu = 1400, server_id = 10.0.1.2,
dns_server = {8.8.8.8,7.7.7.7}, lease_time = 43200); eth.dst =
eth.src; eth.src = (null); ip4.dst = 10.0.1.3; ip4.src = 10.0.1.2;
udp.src = 67; udp.dst = 68; outport = inport; inport = ""; /* Allow
sending out inport. */ output;)

The ovn-controller failed to parse the above flow at eth.src

2016-04-20T23:19:49Z|00070|lflow|WARN|error parsing actions
"dhcp_offer(offerip = 10.0.1.2, netmask = 255.255.255.0, router =
10.0.1.1, mtu = 1400, server_id = 10.0.1.2, dns_server =
{8.8.8.8,7.7.7.7}, lease_time = 43200); eth.dst = eth.src; eth.src =
(null); ip4.dst = 10.0.1.2; ip4.src = 10.0.1.2; udp.src = 67; udp.dst
= 68; outport = inport; inport = ""; /* Allow sending out inport. */
output;": Syntax error at `(' expecting constant.





On Mon, Apr 18, 2016 at 10:29 AM, Numan Siddique <nusid...@redhat.com> wrote:
> To support native dhcp in ovn
>  - A new column 'dhcp-options' is added in 'Logical_Switch' north db.
>  - A logical flow is added for each logical port to handle dhcp packets
>if the CMS has defined dhcp options in this column.
>
> Eg.
> action=(dhcp_offer(offerip = 10.0.0.2, router = 10.0.0.1,
> server_id = 10.0.0.2, mtu = 1300, lease_time = 3600,
> netmask = 255.255.255.0); eth.dst = eth.src; eth.src = 00:00:00:00:00:03;
> ip4.dst = 10.0.0.2; ip4.src = 10.0.0.2; udp.src = 67; udp.dst = 68;
> outport = inport; inport = ""; /* Allow sending out inport. */ output;)
>
>  - ovn-controller will convert this logical flow to a packet-in OF flow with
>'pause' flag set. The dhcp options defined in 'dhcp_offer' action
>are stored in the 'userdata'
>
>  - When the ovn-controller receives the packet-in, it would frame a new
>dhcp packet with the dhcp options set in the 'userdata' and resume
>the packet.
>
> TODO: Test cases and updating the necessary documentation.
>
> Signed-off-by: Numan Siddique <nusid...@redhat.com>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC] ovn: distributed logical port for VM metadata access

2016-04-19 Thread Ramu Ramamurthy
On Mon, Apr 18, 2016 at 2:55 PM, Aaron Rosen  wrote:
> I like this idea as well.
>
> The one question I have with is is how we should determine which ip address
> to select for the 'distributed' port?

Aaron, Thanks for your review, and feedback.

We can use the dhcp-port (and its IP address) for the distributed-port.

The current native-dhcp proposal
(http://openvswitch.org/pipermail/dev/2016-April/069787.html)
assumes that a dhcp-server ip-address "server_id" is defined on the subnet.

action=(dhcp_offer(offerip = 10.0.0.2, router = 10.0.0.1,
server_id = 10.0.0.2, mtu = 1300, lease_time = 3600,

For openstack, This means that a DHCP port has been created on that
subnet in neutron.
In the absence of a dhcp-agent, the neutron-ovn-plugin would have to
auto-create the
dhcp-port in neutron upon creation of a subnet, and then use that
port's IP address as
the "server_id" when it programs the "dhcp-options"
column of the Logical_Switch table.

The pros of the distributed-port approach is that a) HA is not needed,
b) it runs the existing
neutron-metadata-proxy/neutron-metadata-agent as-is, c) In the future,
we could remove the
neutron-metadata-agent also, by (nova-compute) configuring the
instance-id and tenant-id as external-ids
of the VM's ovs interface - hence not need to run any neutron-agents at all.
The drawbacks include creation of namespaces and metadata-proxy
processes on each hypervisor.


> If we want to avoid creating the network name spaces and
> running the http proxy on each hypervisor is if we took a similar approach
> that openstack uses for handling dhcp/metadata requests.
> When a subnet is created we could have the neutron-ovn-plugin notify a
> metadata agent which would create a port on the given subnet for the logical
> network. Then, to get instances to route its metadata traffic to this
> logical port we could have ovn distribute an additional host-route via dhcp
> (using option 121).  Similar to what you are proposing.
>
>
> I.e:  So for example if someone created a network/subnet.
>
> In the ovn plugin we can signal the metadata agent to create a port on that
> network.  Then, for every port that is created on this network we would
> distribute a hostroute of 169.254.169.254/32 via ; Then,
> we'd have the metadata agent just run there which would answer these meta
> data requests and route them back.
>
> One down side to this solution is that this metadata agent would need to be
> made HA in some way. In your current solution if the metadata agent crashes
> or something the failure is isolated to the hypervisor. That said, this type
> of HA seems like it can be implemented in at least an active passive
> solution easily enough.
>
> Thoughts?
>

Your proposal is an alternative solution - which involves changes only to the
neutron components  (and no changes in ovn ?). Would there be only one
modified neutron-metadata-agent in an active-passive configuration serving all
the VMs ? If there are multiple agents, would you need agent-scheduling to
assign networks to agents ?

Could you share more details of the approach ?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC] ovn: distributed logical port for VM metadata access

2016-04-15 Thread Ramu Ramamurthy
>
> neutron-ovn-metadata would be something we maintain in our OpenStack plugin
> repo (networking-ovn), right?
>

Russell, Thanks for your review and feedback,

neutron-ovn-metadata would be maintained in networking-ovn repo.

> I like this proposal.  It suggests adding only the minimal amount of support
> needed from OVN itself to enable us to get our OpenStack-specific job done.
> This is much better than anything I had thought of.
>
> Thank you!
>
> --
> Russell Bryant
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC] ovn: distributed logical port for VM metadata access

2016-04-14 Thread Ramu Ramamurthy
> This is an interesting proposal.  The alternative, which strikes me as a
> terrible idea, would be to somehow implement a TCP/IP stack and HTTP
> server inside OVS.
>
> Does there need to be a single metadata server per hypervisor, or one
> per logical switch per hypervisor?
>

Ben, Thanks for your review, and feedback.

The metadata-proxy-server is one per logical-switch that has at least
1 VM on the hypervisor.
In the worst case, there may be one metadata-proxy-server per VM on
the hypervisor
if each VM belonged to a different logical switch.

The metadata-proxy-server forwards the http request from the VM after
setting a few http headers -
one of the http-headers is the VMs IP address which is obtained from
the http request,
and the other http header is the network-id which comes from the
configuration of the process.
As such, the metadata-proxy-server is expected to be a lightweight process.

> I think that a lot of the commit message (include some bits that I
> snipped) should go into documentation.
>
> I feel like this needs to be properly evaluated by someone who
> understands OpenStack and Neutron better than me.  Maybe Russell?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v1 RFC] ovn: Support native dhcp using 'continuations'

2016-04-13 Thread Ramu Ramamurthy
> +/* dhcp options */
> +shash_init(_opt_symtab);
> +dhcp_opt_expr_symtab_add_field(_opt_symtab, "offerip", 0,
> +   DHCP_OPT_TYPE_IP4);
> +dhcp_opt_expr_symtab_add_field(_opt_symtab, "netmask", 1,
> +   DHCP_OPT_TYPE_IP4);
> +dhcp_opt_expr_symtab_add_field(_opt_symtab, "router", 3,
> +   DHCP_OPT_TYPE_IP4);
> +dhcp_opt_expr_symtab_add_field(_opt_symtab, "dns_server", 6,
> +   DHCP_OPT_TYPE_IP4);
> +dhcp_opt_expr_symtab_add_field(_opt_symtab, "domain_name", 15,
> +   DHCP_OPT_TYPE_STR);
> +dhcp_opt_expr_symtab_add_field(_opt_symtab, "ip_forward_enable", 19,
> +   DHCP_OPT_TYPE_BOOL);
> +dhcp_opt_expr_symtab_add_field(_opt_symtab, "mtu", 26,
> +   DHCP_OPT_TYPE_UINT16);
> +dhcp_opt_expr_symtab_add_field(_opt_symtab, "lease_time", 51,
> +   DHCP_OPT_TYPE_UINT32);
> +dhcp_opt_expr_symtab_add_field(_opt_symtab, "server_id", 54,
> +   DHCP_OPT_TYPE_IP4);
>  }

Numan, DHCP static-route option (249) is used in the DHCP-offer to
enable VM Metadata access.
Could you consider adding support for that option - Or I can help to
add that option.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v1 1/2] ovn: Move extract_lport_addresses

2016-04-12 Thread Ramu Ramamurthy
On Mon, Apr 11, 2016 at 6:12 PM, Ben Pfaff <b...@ovn.org> wrote:
> On Sun, Apr 03, 2016 at 09:49:03PM -0400, Ramu Ramamurthy wrote:
>> Move the function extract_lport_addresses to a file
>> in ovn/lib since that function can be used by ovn-controller also
>> to parse addresses stored in the mac column of the
>> port_binding table. Currently that function is used only
>> in ovn_northd.
>>
>> Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
>
> Please add the standard license notice at the top of each of the new
> files.
>
> Function-level comments go on function definitions, not their header
> file prototypes.
>
> The comment
> +/* contains ovn utility functions */
> isn't very useful and I'd recommend deleting it.
>
> Thanks,
>
> Ben.

Ben, Thanks for the review, I will update in the next version
with changes noted above.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v1 2/2] ovn: Send garp on localnet

2016-04-12 Thread Ramu Ramamurthy
On Mon, Apr 11, 2016 at 6:29 PM, Ben Pfaff <b...@ovn.org> wrote:
> On Sun, Apr 03, 2016 at 09:49:04PM -0400, Ramu Ramamurthy wrote:
>> In some usecases such as VM migration or when VMs reuse IP addresses,
>> VMs become unreachable externally because
>> external switches/routers on localnet
>> have stale port-mac or arp caches. The problem resolves
>> after some time when the caches ageout which could be
>> minutes for port-mac bindings or hours for arp caches.
>>
>> To fix this, send some gratuitous ARPs when a logical
>> port on a localnet datapath gets added. Such gratuitous arps
>> help on a best-effort basis to update the mac-port bindings and arp-caches
>> of external switches and routers on the localnet.
>>
>> Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1545897
>> Reported-by: Kyle Mestery <mest...@mestery.com>
>> Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
>

Ben, Thanks for your review,

> Darrell (CCed) brought up some issues with this idea.  I don't know
> whether his questions have been satisfactorily answered.
>
In the next version, I will update as per Darrell's comments.

> Is there an equivalent of GARP for IPv6?
>
I dont know enough IPv6 to answer this, The problem
we saw which motivated this fix was in a usecase with IPv4.

> The GARP code manages and acts based on timers, but I don't see anything
> that calls, e.g., poll_timer_wait_until(), to ensure that ovn-controller
> wakes up when it's time to send a GARP.  This call would ordinarily be
> made by adding another call to pinctrl_wait().
>
The code assumes that pinctr_run() would be run often enough to not
need to fire timers. But this is not a correct assumption, as there
could be pauses between successive runs of pinctl_run(), causing
successive garps to be sent at arbitrary intervals.
An approach based on timers makes the fix deterministic.
I will look into this further in the next version.

> Please be careful about coding style.  The first thing that jumps out at
> me here is comment style.  CodingStyle.md says:
>
>   Comments should be written as full sentences that start with a
> capital letter and end with a period.  Put two spaces between
> sentences.
>
> (Comments that aren't on lines of their own don't need to be sentences
> but still should be capitalized and end with a period.)
>
I will update the next version with careful attention to the
coding style for this and the following notes on sizeof, and expressions,

> The following is unsafe if there is no such datapath, because
> CONTAINER_OF does not treat a null pointer specially:
>
> struct local_datapath *ld =
> CONTAINER_OF(hmap_first_with_hash(local_datapaths,
>   pb->datapath->tunnel_key),
>  struct local_datapath, hmap_node);
>
For this I will use the get_local_datapath() helper.

> Please write sizeof *garp here instead of sizeof(struct garp_data):
>
> /* the mac address must be valid */
> struct garp_data *garp = xmalloc(sizeof(struct garp_data));
>
> as requested in CodingStyle:
>
>   The "sizeof" operator is unique among C operators in that it accepts
> two very different kinds of operands: an expression or a type.  In
> general, prefer to specify an expression, e.g. "int *x =
> xmalloc(sizeof *x);".  When the operand of sizeof is an expression,
> there is no need to parenthesize that operand, and please don't.
>
> Please put spaces around + and * as recommended by CodingStyle, e.g. here:
> +garp->announce_time = time_msec()+1000;
> and here:
> +garp->announce_time = current_time + garp->backoff*1000;
>
> Thanks,
>
> Ben.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v1 2/2] ovn: Send garp on localnet

2016-04-12 Thread Ramu Ramamurthy
On Tue, Apr 12, 2016 at 9:37 AM, Darrell Ball <dlu...@gmail.com> wrote:
> Ramu
>
> High level comment earlier that still applies:
>
> 1) garp reply is still being used rather than garp request
>garp replies are more likely to be dropped or explicitly filtered so garp
> requests are "better"
>
>  Do you need to make changes to build_lswitch_flows() to allow the garp
> requests out the localnet
> ports ?

Darrell, Thanks, In the next version, I will update the code to send
garp requests.

garp requests are responded-to by the ARP responder flows. This interferes
with sending garps using the packet out destination OFPP_TABLE which
injects the packet into the ingress pipeline. So a change in needed
the arp-responder flows to not respond to garps. Alternatively, I can change
the packet out destination to directly output garps onto the localnet ofport.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH RFC] ovn: distributed logical port for VM metadata access

2016-04-11 Thread Ramu Ramamurthy
A new logical-port-type called "distributed" is introduced with this change.

A distributed logical port is not bound to any chassis. An instance of
the distributed port is created  as an ovs internal port at all chasses
where the datapath has at least 1 vif. VIFs on a chassis can communicate
with the locally created distributed port on the same chassis. A distributed
port cannot be reached remotely via tunnels or via localnet from other
logical ports on the logical switch. Unicast traffic originated from
external localnet VMs cannot reach the distributed port.

The distributed logical port allows the implementation of services
in a logical switch that are available at the same IP,mac address on each
chassis to serve local VMs on that logical switch bound to that chassis.
The motivating usecase is the Openstack dhcp-based metadata
access which is described in detail below.

Please suggest any feedback on a) the usecase, b) the approach suggested
here to implement it, c) any alternatives. If it makes sense
I can pursue this work further.

Usecase - DHCP-based metadata access

VMs (using cloudinit) use metadata access at the url http://169.254.169.254.
(https://cloudinit.readthedocs.org/en/latest/topics/datasources.html#ec2)

In Openstack DHCP-based metadata access, the DHCP server advertizes a static 
route
to the link-local address 169.254.169.254 pointing to the DHCP port's IP address
using DHCP option 249. A neutron-metadata-proxy server listens on the DHCP port
which is also aliased with the link local address 169.254.169.254, and
serves metadata requests from VMs. The metadata-proxy server extracts
the VMs IP address from the http request, and sets the "X-Forwarded-For" and
"X-Neutron-Network-Id" http headers, and forwards the request to the
neutron-metadata-agent which in turn forwards the request to nova.

With proposed native-dhcp implementation in OVN, there is no need to run the
neutron-DHCP-agent in Openstack. Since DHCP-based metadata was implemented by
the dhcp-agent, A new approach is needed to implement the metadata-service.
The following is a proposed approach.

1) The DHCP logical port's type is configured to be "distributed" in the
NorthBound DB. Options on the
logical port are configured to indicate the neutron network id.

2) OVN controller creates internal OVS port on each chassis where VIFs on that
datapath are present. External IDs on that interface denote the neutron-network 
id,
and dhcp-port ip address, and the link-local ip address to which a static route
was advertized.

3) An ovn-metadata entity monitors changes to the OVSDB interface table, and
upon seeing a new interface does the following using information from the
external-ids on that interface: a) creates namespace and sets the
internal port in the namespace, b) spawns the neutron-metadata-proxy server
listening on that port

Example

1) Create a distributed port (This would be done by neutron-ovn-plugin, upon 
creation of the DHCP port)

ovn-nbctl lport-add 369611f4-f6a5-4a97-a9a2-75e241dc0803 vif-1
ovn-nbctl lport-set-addresses vif-1 "fe:16:3e:ab:cd:01 10.0.0.100"
ovn-nbctl lport-set-port-security vif-1 "fe:16:3e:ab:cd:01"
ovn-nbctl lport-set-type vif-1 distributed
ovn-nbctl lport-set-options vif-1 name="vif1" 
network="c827c93e-76f3-4fc7-8e98-1df0b97cf927" mac="fe:16:3e:ab:cd:01" 
ipv4="10.0.0.100/24" application="neutron-metadata"

2) Boot a VM, after VM boots, set the static route inside the VM (this would be 
done
by native-DHCP server automatically by injecting a static route in the DHCP 
offer)
sudo ip route add 169.254.169.254 via 10.0.0.100

3) On the hypervisor (This would be done by a new neutron-ovn-metadata entity 
running on
each hypervisor upon monitoring ovs interface table and detecting a new vif1 
interface
with options indicating that the application is metadata)

sudo ip netns add vif-1
sudo ip link set vif-1 netns vif-1
sudo ip netns exec vif-1 ip link set dev vif1 address fe:16:3e:ab:cd:01
sudo ip netns exec vif-1 ip addr add 10.0.0.100/24 dev vif1
sudo ip netns exec vif-1 ip addr add 169.254.169.254 dev vif1
sudo ip netns exec vif-1 ip link set dev vif1 up
sudo ip netns exec vif-1 /usr/bin/python /usr/bin/neutron-ns-metadata-proxy 
--pid_file=/opt/stack/data/neutron/external/pids/vif-1.pid 
--metadata_proxy_socket=/opt/stack/data/neutron/metadata_proxy 
--network_id=c827c93e-76f3-4fc7-8e98-1df0b97cf927 
--state_path=/opt/stack/data/neutron --metadata_port=80 
--metadata_proxy_user=1001 --metadata_proxy_group=1001 --debug --verbose

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/controller/binding.c  | 73 +--
 ovn/controller/physical.c | 10 +--
 2 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index d3ca9c9..abf9556 100644
--- a/ovn/controller/binding.

Re: [ovs-dev] [PATCH v2] ovn: Add software l2 gateway.

2016-04-06 Thread Ramu Ramamurthy
On Mon, Apr 4, 2016 at 5:58 AM, Russell Bryant  wrote:
> This patch implements one approach to using ovn-controller to implement
> a software l2 gateway between logical and physical networks.
>
> A new logical port type called "gateway" is introduced here.  It is very
> close to how localnet ports work, with the following exception:
>
> - A localnet port makes OVN use the physical network as the
>   transport between hypervisors instead of tunnels. A gateway port still
>   uses tunnels between all hypervisors, and packets only go to/from the
>   specified physical network as needed via the chassis the gateway port
>   is bound to.
>
> - A gateway port also gets bound to a chassis while a localnet port does
>   not.  This binding is not done by ovn-controller.  It is left as an
>   administrative function.  In the case of OpenStack, the Neutron plugin
>   will do this.
>
> Signed-off-by: Russell Bryant 
> ---
>

Can there be more than 1 gateway to the same l2 network  (say for HA) ?
If gateway port G1 on chassis1 and gateway port G2 on chassis2,
Then, when a VM a chassis3 sends a broadcast, it will tunnel to
chassis1, and chassis2,
and then get sent out of G1 and G2 to the same L2 network, ie, Can
that broadcast packet
get sent in duplicate to the physical network ? If so, would you have
to run xSTP on the
gateway ports ?
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v1 RFC] ovn: Support native dhcp using 'continuations'

2016-04-04 Thread Ramu Ramamurthy
> @@ -89,10 +89,11 @@ enum ovn_stage {
>  PIPELINE_STAGE(SWITCH, IN,  PORT_SEC_L2,0, "ls_in_port_sec_l2") \
>  PIPELINE_STAGE(SWITCH, IN,  PORT_SEC_IP,1, "ls_in_port_sec_ip") \
>  PIPELINE_STAGE(SWITCH, IN,  PORT_SEC_ND,2, "ls_in_port_sec_nd") \
> -PIPELINE_STAGE(SWITCH, IN,  PRE_ACL,3, "ls_in_pre_acl")  \
> -PIPELINE_STAGE(SWITCH, IN,  ACL,4, "ls_in_acl")  \
> -PIPELINE_STAGE(SWITCH, IN,  ARP_RSP,5, "ls_in_arp_rsp")  \
> -PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,6, "ls_in_l2_lkup")  \
> +PIPELINE_STAGE(SWITCH, IN,  DHCP,   3, "ls_in_dhcp") \
> +PIPELINE_STAGE(SWITCH, IN,  PRE_ACL,4, "ls_in_pre_acl")  \
> +PIPELINE_STAGE(SWITCH, IN,  ACL,5, "ls_in_acl")  \
> +PIPELINE_STAGE(SWITCH, IN,  ARP_RSP,6, "ls_in_arp_rsp")  \
> +PIPELINE_STAGE(SWITCH, IN,  L2_LKUP,7, "ls_in_l2_lkup")  \
>\

Would it make sense to put DHCP after ACL instead of before - so,
some control is
provided on the handling of DHCP packets via acl rules. For instance,
OpenStack programs these DHCP ACL
rules currently allowing client->server communication - and the
usecase would be if i want to drop
all client->server DHCP traffic for a rogue VM.

table=2(   ls_in_acl), priority= 2002, match=(inport ==
"55c0912f-f7aa-4318-82f1-6118032839e3" && ip4 && (ip4.dst ==
255.255.255.255 || ip4.dst == 10.10.0.0/16) && udp && udp.src == 68 &&
udp.dst == 67), action=(ct_commit; next;)
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v1 2/2] ovn: Send garp on localnet

2016-04-03 Thread Ramu Ramamurthy
In some usecases such as VM migration or when VMs reuse IP addresses,
VMs become unreachable externally because
external switches/routers on localnet
have stale port-mac or arp caches. The problem resolves
after some time when the caches ageout which could be
minutes for port-mac bindings or hours for arp caches.

To fix this, send some gratuitous ARPs when a logical
port on a localnet datapath gets added. Such gratuitous arps
help on a best-effort basis to update the mac-port bindings and arp-caches
of external switches and routers on the localnet.

Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1545897
Reported-by: Kyle Mestery <mest...@mestery.com>
Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/controller/ovn-controller.c |   2 +-
 ovn/controller/pinctrl.c| 217 +++-
 ovn/controller/pinctrl.h|   5 +-
 tests/ovn.at|  61 +++
 4 files changed, 282 insertions(+), 3 deletions(-)

diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 6027011..3f0aea4 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -335,7 +335,7 @@ main(int argc, char *argv[])
 
 enum mf_field_id mff_ovn_geneve = ofctrl_run(br_int);
 
-pinctrl_run(, , br_int);
+pinctrl_run(, , br_int, chassis_id, _datapaths);
 
 struct hmap flow_table = HMAP_INITIALIZER(_table);
 lflow_run(, , , _datapaths,
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index 3fcab99..57be002 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -24,6 +24,7 @@
 #include "ofp-actions.h"
 #include "ovn/lib/actions.h"
 #include "ovn/lib/logical-fields.h"
+#include "ovn/lib/ovn-util.h"
 #include "ofp-msgs.h"
 #include "ofp-print.h"
 #include "ofp-util.h"
@@ -54,6 +55,13 @@ static void run_put_arps(struct controller_ctx *,
 static void wait_put_arps(struct controller_ctx *);
 static void flush_put_arps(void);
 
+static void init_send_garps(void);
+static void destroy_send_garps(void);
+static void send_garp_run(const struct lport_index *lports,
+  const struct ovsrec_bridge *br_int,
+  const char *chassis_id,
+  struct hmap *local_datapaths);
+
 COVERAGE_DEFINE(pinctrl_drop_put_arp);
 
 void
@@ -62,6 +70,7 @@ pinctrl_init(void)
 swconn = rconn_create(5, 0, DSCP_DEFAULT, 1 << OFP13_VERSION);
 conn_seq_no = 0;
 init_put_arps();
+init_send_garps();
 }
 
 static ovs_be32
@@ -266,7 +275,9 @@ pinctrl_recv(const struct ofp_header *oh, enum ofptype type)
 
 void
 pinctrl_run(struct controller_ctx *ctx, const struct lport_index *lports,
-const struct ovsrec_bridge *br_int)
+const struct ovsrec_bridge *br_int,
+const char *chassis_id,
+struct hmap *local_datapaths)
 {
 if (br_int) {
 char *target;
@@ -307,6 +318,9 @@ pinctrl_run(struct controller_ctx *ctx, const struct 
lport_index *lports,
 }
 
 run_put_arps(ctx, lports);
+if (rconn_is_connected(swconn)) {
+send_garp_run(lports, br_int, chassis_id, local_datapaths);
+}
 }
 
 void
@@ -322,6 +336,7 @@ pinctrl_destroy(void)
 {
 rconn_destroy(swconn);
 destroy_put_arps();
+destroy_send_garps();
 }
 
 /* Implementation of the "put_arp" OVN action.  This action sends a packet to
@@ -484,3 +499,203 @@ flush_put_arps(void)
 free(pa);
 }
 }
+
+/*
+ * Send gratuitous arp for vifs on localnets
+ *
+ * When a new vif on localnet is added, gratuitous arps
+ * are sent announcing the port's mac,ip mapping.
+ * On localnet, such announcements are needed for
+ * switches and routers on the broadcast segment to
+ * update their port-mac and arp tables.
+ */
+struct garp_data {
+int ofport; /* ofport used for this garp */
+struct eth_addr ea; /* ethernet address of port */
+ovs_be32 ipv4;  /* ipv4 address of port */
+long long int announce_time; /* next announcement in ms */
+int backoff; /* backoff for the next announcement */
+};
+
+/* Contains garps to be sent */
+static struct shash send_garp_data;
+
+static void
+init_send_garps(void)
+{
+shash_init(_garp_data);
+}
+
+static void
+destroy_send_garps(void)
+{
+shash_destroy(_garp_data);
+}
+
+/* Add a new vif to announce garps */
+static void
+send_garp_add(const struct sbrec_port_binding *binding_rec,
+  int ofport)
+{
+int i;
+for (i = 0; i < binding_rec->n_mac; i++) {
+struct lport_addresses laddrs;
+if (extract_lport_addresses(binding_rec->mac[i], ,
+false)) {
+/* the mac address must be valid */
+struct garp_data *garp = xmalloc(sizeof(struct garp_data));
+garp->ofport 

[ovs-dev] [PATCH v1 1/2] ovn: Move extract_lport_addresses

2016-04-03 Thread Ramu Ramamurthy
Move the function extract_lport_addresses to a file
in ovn/lib since that function can be used by ovn-controller also
to parse addresses stored in the mac column of the
port_binding table. Currently that function is used only
in ovn_northd.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/lib/automake.mk |   4 +-
 ovn/lib/ovn-util.c  |  71 +
 ovn/lib/ovn-util.h  |  53 
 ovn/northd/ovn-northd.c | 104 +---
 4 files changed, 128 insertions(+), 104 deletions(-)
 create mode 100644 ovn/lib/ovn-util.c
 create mode 100644 ovn/lib/ovn-util.h

diff --git a/ovn/lib/automake.mk b/ovn/lib/automake.mk
index 9e09352..32e7b87 100644
--- a/ovn/lib/automake.mk
+++ b/ovn/lib/automake.mk
@@ -10,7 +10,9 @@ ovn_lib_libovn_la_SOURCES = \
ovn/lib/expr.h \
ovn/lib/lex.c \
ovn/lib/lex.h \
-   ovn/lib/logical-fields.h
+   ovn/lib/logical-fields.h \
+   ovn/lib/ovn-util.c \
+   ovn/lib/ovn-util.h
 nodist_ovn_lib_libovn_la_SOURCES = \
ovn/lib/ovn-nb-idl.c \
ovn/lib/ovn-nb-idl.h \
diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
new file mode 100644
index 000..e6297ed
--- /dev/null
+++ b/ovn/lib/ovn-util.c
@@ -0,0 +1,71 @@
+
+#include 
+#include "ovn-util.h"
+#include "openvswitch/vlog.h"
+
+VLOG_DEFINE_THIS_MODULE(ovn_util);
+
+/* contains ovn utility functions */
+
+bool
+extract_lport_addresses(char *address, struct lport_addresses *laddrs,
+bool store_ipv6)
+{
+char *buf = address;
+int buf_index = 0;
+char *buf_end = buf + strlen(address);
+if (!ovs_scan_len(buf, _index, ETH_ADDR_SCAN_FMT,
+  ETH_ADDR_SCAN_ARGS(laddrs->ea))) {
+return false;
+}
+
+ovs_be32 ip4;
+struct in6_addr ip6;
+unsigned int plen;
+char *error;
+
+laddrs->n_ipv4_addrs = 0;
+laddrs->n_ipv6_addrs = 0;
+laddrs->ipv4_addrs = NULL;
+laddrs->ipv6_addrs = NULL;
+
+/* Loop through the buffer and extract the IPv4/IPv6 addresses
+ * and store in the 'laddrs'. Break the loop if invalid data is found.
+ */
+buf += buf_index;
+while (buf < buf_end) {
+buf_index = 0;
+error = ip_parse_cidr_len(buf, _index, , );
+if (!error) {
+laddrs->n_ipv4_addrs++;
+laddrs->ipv4_addrs = xrealloc(
+laddrs->ipv4_addrs,
+sizeof (struct ipv4_netaddr) * laddrs->n_ipv4_addrs);
+laddrs->ipv4_addrs[laddrs->n_ipv4_addrs - 1].addr = ip4;
+laddrs->ipv4_addrs[laddrs->n_ipv4_addrs - 1].plen = plen;
+buf += buf_index;
+continue;
+}
+free(error);
+error = ipv6_parse_cidr_len(buf, _index, , );
+if (!error && store_ipv6) {
+laddrs->n_ipv6_addrs++;
+laddrs->ipv6_addrs = xrealloc(
+laddrs->ipv6_addrs,
+sizeof(struct ipv6_netaddr) * laddrs->n_ipv6_addrs);
+memcpy(>ipv6_addrs[laddrs->n_ipv6_addrs - 1].addr, ,
+   sizeof(struct in6_addr));
+laddrs->ipv6_addrs[laddrs->n_ipv6_addrs - 1].plen = plen;
+}
+
+if (error) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_INFO_RL(, "invalid syntax '%s' in address", address);
+free(error);
+break;
+}
+buf += buf_index;
+}
+
+return true;
+}
diff --git a/ovn/lib/ovn-util.h b/ovn/lib/ovn-util.h
new file mode 100644
index 000..7852568
--- /dev/null
+++ b/ovn/lib/ovn-util.h
@@ -0,0 +1,53 @@
+
+
+#ifndef OVN_UTIL_H
+#define OVN_UTIL_H 1
+
+#include "lib/packets.h"
+
+struct ipv4_netaddr {
+ovs_be32 addr;
+unsigned int plen;
+};
+
+struct ipv6_netaddr {
+struct in6_addr addr;
+unsigned int plen;
+};
+
+struct lport_addresses {
+struct eth_addr ea;
+size_t n_ipv4_addrs;
+struct ipv4_netaddr *ipv4_addrs;
+size_t n_ipv6_addrs;
+struct ipv6_netaddr *ipv6_addrs;
+};
+
+/*
+ * Extracts the mac, ipv4 and ipv6 addresses from the input param 'address'
+ * which should be of the format 'MAC [IP1 IP2 ..]" where IPn should be
+ * a valid IPv4 or IPv6 address and stores them in the 'ipv4_addrs' and
+ * 'ipv6_addrs' fields of input param 'laddrs'.
+ * The caller has to free the 'ipv4_addrs' and 'ipv6_addrs' fields.
+ * If input param 'store_ipv6' is true only then extracted ipv6 addresses
+ * are stored in 'ipv6_addrs' fields.
+ * Return true if at least 'MAC' is found in 'address', false otherwise.
+ * Eg 1.
+ * If 'address' = '00:00:00:00:00:01 10.0.0.4 fe80::ea2a:eaff:fe28:3390/64
+ * 30.0.0.3/23' and 'store_ipv6' = true
+ * then returns true with laddrs->n_ipv4_addrs = 2, naddrs->n_ipv6_addrs = 1.
+ *
+ * Eg. 2
+ 

Re: [ovs-dev] [PATCH RFC 0/3] - send garp on localnet

2016-03-31 Thread Ramu Ramamurthy
>
> On a separate but related topic, northd programs localnet to not respond to
> regular
> arp requests coming from the network via a localnet port.
>
> How can IP traffic be initiated from a VM outside OVN control or from the
> physical network to a VM inside OVN controlled HV, connected via a localnet
> port ?
>

Traffic initiated from VM1 outside OVN to VM2 inside OVN would flow as follows:

VM1 would first arp for VM2's IP. This arp is a broadcast, and would
enter br-int via the
localnet patch on ALL hypervisors connected to that localnet. br-int
"floods" that arp
to all locally attached VIFs on that localnet datapath. VM2 happens to
be one of those VIFs.
VM2 receives that arp, and replies back.

As a result of the above interaction, external switches would learn
(port->source mac) bindings
and the external router would learn the arp mappings (IP->mac)

The problem is that port->source mac binding is cached for 5 minutes
at switches, and the arp
binding is cached for 4 hours by some routers !!

During that period, consider the following 2 scenarios:

1) VM2 moved between hypervisors due to migration and stayed silent.
VM2 cannot be
accessed externally due to stale port->mac mappings for several
minutes. Eventually,
the mac-learning entry ages out and the switch treats VM2's mac as unknown and
floods it as unknown unicast.

OR

2) VM2 got deleted, and VM3 is booted within OVN, and VM3 has the same IP as VM2
BUT a different mac, AND VM3 stayed silent.
VM3 cannot be accessed externally due to stale arp mapping at a router
for several hours !.
Eventually, the router would ageout its arp entry, and then, the
router would ARP for VM3's IP,
and traffic to VM3 is successfully established.

So, the problem addressed by this code change ONLY handles the corner
cases of stale mac-learning
entries at switches or stale arp entries at routers.


> Are you depending on flooding the network with gARPs periodically (to handle
> new external hosts) for all VMs on all HVs inside OVN, attached to localnet
> ?

No, the gARP is sent only a few times when a logical port on localnet comes up
to handle the special corner case (of stale mac-learning or arp
caches) described above.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC 2/3] send garp on localnet

2016-03-31 Thread Ramu Ramamurthy
On Thu, Mar 31, 2016 at 12:33 PM, Numan Siddique  wrote:
>>
>> +/* Add a new vif to announce garps */
>> +static void
>> +send_garp_add(const struct sbrec_port_binding *binding_rec,
>> +  int ofport)
>> +{
>> +int i;
>> +for (i = 0; i < binding_rec->n_mac; i++) {
>> +struct lport_addresses laddrs;
>> +if (extract_lport_addresses(binding_rec->mac[i], ,
>> +false)) {
>> +/* the mac address must be valid */
>> +struct garp_data *garp = xmalloc(sizeof(struct garp_data));
>> +garp->ofport = ofport;
>> +garp->ea = laddrs.ea;
>> +garp->ipv4 = 0;
>> +if (laddrs.n_ipv4_addrs) {
>> +garp->ipv4 = laddrs.ipv4_addrs[0].addr;
>>
>
> You have to free laddrs.ipv4_addrs.
>
> Also I think you can merge patchset 2 and 3 as patch 3 tests patch 2
>
Thanks  !, I missed the free of laddrs.ipv4_addr,
In the next version, I will combine the test with the code in one commit.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC 0/3] - send garp on localnet

2016-03-31 Thread Ramu Ramamurthy
>
> Why did you choose to use reply packets ?
>
> request packets are less likely to be rejected by external networks
> and I think thats discussed in the RFC
>
>

I wanted to use the GARP request, but found a problem.

OVN currently programs flows to respond to gratuitous arp requests from VMs,
Such flows also respond to ARP probes from VMs. This behavior needs to
change so that OVN must not respond to GARP requests or ARP probes.
It must flood them. This change is a work in progress.
In fact, some DHCP clients send ARP probes after the DHCP ACK to
verify no address collisions, and upon seeing ARP probe response will decline
dhcp offered IP.

As soon as the above problem is fixed, I can update the code to
generate GARP requests.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH RFC 3/3] garp generate test

2016-03-31 Thread Ramu Ramamurthy
add a test to validate garp generation on localnet

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 tests/ovn.at | 61 
 1 file changed, 61 insertions(+)

diff --git a/tests/ovn.at b/tests/ovn.at
index f2ceba3..715d1d8 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -1986,3 +1986,64 @@ OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
 OVS_APP_EXIT_AND_WAIT([ovsdb-server])
 
 AT_CLEANUP
+
+AT_SETUP([ovn -- gratuitous arp generate])
+AT_KEYWORDS([ovn])
+
+ovn_start
+ovn-nbctl lswitch-add lsw0
+net_add n1
+sim_add hv
+as hv
+ovs-vsctl \
+-- add-br br-phys \
+-- add-br br-eth0
+
+ovn_attach n1 br-phys 192.168.0.1
+
+AT_CHECK([ovs-vsctl set Open_vSwitch . 
external-ids:ovn-bridge-mappings=physnet1:br-eth0])
+AT_CHECK([ovs-vsctl add-port br-eth0 snoopvif -- set Interface snoopvif 
options:tx_pcap=hv/snoopvif-tx.pcap options:rxq_pcap=hv/snoopvif-rx.pcap])
+
+# create a vif
+AT_CHECK([ovn-nbctl lport-add lsw0 localvif1])
+AT_CHECK([ovn-nbctl lport-set-addresses localvif1 "f0:00:00:00:00:01 
192.168.1.2"])
+AT_CHECK([ovn-nbctl lport-set-port-security localvif1 "f0:00:00:00:00:01"])
+
+# create a localnet port
+AT_CHECK([ovn-nbctl lport-add lsw0 ln_port])
+AT_CHECK([ovn-nbctl lport-set-addresses ln_port unknown])
+AT_CHECK([ovn-nbctl lport-set-type ln_port localnet])
+AT_CHECK([ovn-nbctl lport-set-options ln_port network_name=physnet1])
+
+AT_CHECK([ovs-vsctl add-port br-int localvif1 -- set Interface localvif1 
external_ids:iface-id=localvif1])
+
+#wait for packet to be received
+OVS_WAIT_UNTIL([test `wc -c < "hv/snoopvif-tx.pcap"` -ge 50])
+trim_zeros() {
+sed 's/\(00\)\{1,\}$//'
+}
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv/snoopvif-tx.pcap | trim_zeros > 
packets
+expected="f00108060001080006040002f001c0a80102c0a80102"
+echo $expected > expout
+AT_CHECK([sort packets], [0], [expout])
+cat packets
+
+as hv
+OVS_APP_EXIT_AND_WAIT([ovn-controller])
+OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([ovn-northd])
+
+as main
+OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+AT_CLEANUP
-- 
2.3.9

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH RFC 2/3] send garp on localnet

2016-03-31 Thread Ramu Ramamurthy
Generate gratuitous ARP replies when a logical
port on a localnet gets added. Such gratuitous arps
help to update the mac-port bindings and arp-caches
of external switches and routers on the localnet.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/controller/ovn-controller.c |   2 +-
 ovn/controller/pinctrl.c| 216 +++-
 ovn/controller/pinctrl.h|   5 +-
 3 files changed, 220 insertions(+), 3 deletions(-)

diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 6027011..3f0aea4 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -335,7 +335,7 @@ main(int argc, char *argv[])
 
 enum mf_field_id mff_ovn_geneve = ofctrl_run(br_int);
 
-pinctrl_run(, , br_int);
+pinctrl_run(, , br_int, chassis_id, _datapaths);
 
 struct hmap flow_table = HMAP_INITIALIZER(_table);
 lflow_run(, , , _datapaths,
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index 3fcab99..ceafacc 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -24,6 +24,7 @@
 #include "ofp-actions.h"
 #include "ovn/lib/actions.h"
 #include "ovn/lib/logical-fields.h"
+#include "ovn/lib/ovn-util.h"
 #include "ofp-msgs.h"
 #include "ofp-print.h"
 #include "ofp-util.h"
@@ -54,6 +55,13 @@ static void run_put_arps(struct controller_ctx *,
 static void wait_put_arps(struct controller_ctx *);
 static void flush_put_arps(void);
 
+static void init_send_garps(void);
+static void destroy_send_garps(void);
+static void send_garp_run(const struct lport_index *lports,
+  const struct ovsrec_bridge *br_int,
+  const char *chassis_id,
+  struct hmap *local_datapaths);
+
 COVERAGE_DEFINE(pinctrl_drop_put_arp);
 
 void
@@ -62,6 +70,7 @@ pinctrl_init(void)
 swconn = rconn_create(5, 0, DSCP_DEFAULT, 1 << OFP13_VERSION);
 conn_seq_no = 0;
 init_put_arps();
+init_send_garps();
 }
 
 static ovs_be32
@@ -266,7 +275,9 @@ pinctrl_recv(const struct ofp_header *oh, enum ofptype type)
 
 void
 pinctrl_run(struct controller_ctx *ctx, const struct lport_index *lports,
-const struct ovsrec_bridge *br_int)
+const struct ovsrec_bridge *br_int,
+const char *this_chassis_id,
+struct hmap *local_datapaths)
 {
 if (br_int) {
 char *target;
@@ -307,6 +318,9 @@ pinctrl_run(struct controller_ctx *ctx, const struct 
lport_index *lports,
 }
 
 run_put_arps(ctx, lports);
+if (rconn_is_connected(swconn)) {
+send_garp_run(lports, br_int, this_chassis_id, local_datapaths);
+}
 }
 
 void
@@ -322,6 +336,7 @@ pinctrl_destroy(void)
 {
 rconn_destroy(swconn);
 destroy_put_arps();
+destroy_send_garps();
 }
 
 /* Implementation of the "put_arp" OVN action.  This action sends a packet to
@@ -484,3 +499,202 @@ flush_put_arps(void)
 free(pa);
 }
 }
+
+/*
+ * Send gratuitous arp for vifs on localnets
+ *
+ * When a new vif on localnet is added, gratuitous arps
+ * are sent announcing the port's mac,ip mapping.
+ * On localnet, such announcements are needed for
+ * switches and routers on the broadcast segment to
+ * update their port-mac and arp tables.
+ */
+struct garp_data {
+int ofport; /* ofport used for this garp */
+struct eth_addr ea; /* ethernet address of port */
+ovs_be32 ipv4;  /* ipv4 address of port */
+long long int announce_time; /* next announcement in ms */
+int backoff; /* backoff for the next announcement */
+};
+
+/* Contains garps to be sent */
+static struct shash send_garp_data;
+
+static void
+init_send_garps(void)
+{
+shash_init(_garp_data);
+}
+
+static void
+destroy_send_garps(void)
+{
+shash_destroy(_garp_data);
+}
+
+/* Add a new vif to announce garps */
+static void
+send_garp_add(const struct sbrec_port_binding *binding_rec,
+  int ofport)
+{
+int i;
+for (i = 0; i < binding_rec->n_mac; i++) {
+struct lport_addresses laddrs;
+if (extract_lport_addresses(binding_rec->mac[i], ,
+false)) {
+/* the mac address must be valid */
+struct garp_data *garp = xmalloc(sizeof(struct garp_data));
+garp->ofport = ofport;
+garp->ea = laddrs.ea;
+garp->ipv4 = 0;
+if (laddrs.n_ipv4_addrs) {
+garp->ipv4 = laddrs.ipv4_addrs[0].addr;
+}
+garp->backoff = 1;
+garp->announce_time = time_msec()+1000;
+shash_add(_garp_data, binding_rec->logical_port, garp);
+break;
+}
+}
+}
+
+/* Remove a port from garp announce */
+static void
+send_garp_delete(const char *lp)
+{
+struct garp_data *garp = 

[ovs-dev] [PATCH RFC 1/3] refactor extract_lport_addresses

2016-03-31 Thread Ramu Ramamurthy
Move the function extract_lport_addresses under ovn/lib
since that function can be used by ovn-controller also
to parse addresses stored in the mac column of the
port_binding table. Currently that funtion is used only
in ovn_northd.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/lib/automake.mk |   4 +-
 ovn/lib/ovn-util.c  |  71 +
 ovn/lib/ovn-util.h  |  53 
 ovn/northd/ovn-northd.c | 104 +---
 4 files changed, 128 insertions(+), 104 deletions(-)
 create mode 100644 ovn/lib/ovn-util.c
 create mode 100644 ovn/lib/ovn-util.h

diff --git a/ovn/lib/automake.mk b/ovn/lib/automake.mk
index 9e09352..32e7b87 100644
--- a/ovn/lib/automake.mk
+++ b/ovn/lib/automake.mk
@@ -10,7 +10,9 @@ ovn_lib_libovn_la_SOURCES = \
ovn/lib/expr.h \
ovn/lib/lex.c \
ovn/lib/lex.h \
-   ovn/lib/logical-fields.h
+   ovn/lib/logical-fields.h \
+   ovn/lib/ovn-util.c \
+   ovn/lib/ovn-util.h
 nodist_ovn_lib_libovn_la_SOURCES = \
ovn/lib/ovn-nb-idl.c \
ovn/lib/ovn-nb-idl.h \
diff --git a/ovn/lib/ovn-util.c b/ovn/lib/ovn-util.c
new file mode 100644
index 000..1c25efb
--- /dev/null
+++ b/ovn/lib/ovn-util.c
@@ -0,0 +1,71 @@
+
+#include 
+#include "lib/packets.h"
+#include "ovn-util.h"
+#include "openvswitch/vlog.h"
+
+VLOG_DEFINE_THIS_MODULE(ovn_util);
+
+
+bool
+extract_lport_addresses(char *address, struct lport_addresses *laddrs,
+bool store_ipv6)
+{
+char *buf = address;
+int buf_index = 0;
+char *buf_end = buf + strlen(address);
+if (!ovs_scan_len(buf, _index, ETH_ADDR_SCAN_FMT,
+  ETH_ADDR_SCAN_ARGS(laddrs->ea))) {
+return false;
+}
+
+ovs_be32 ip4;
+struct in6_addr ip6;
+unsigned int plen;
+char *error;
+
+laddrs->n_ipv4_addrs = 0;
+laddrs->n_ipv6_addrs = 0;
+laddrs->ipv4_addrs = NULL;
+laddrs->ipv6_addrs = NULL;
+
+/* Loop through the buffer and extract the IPv4/IPv6 addresses
+ * and store in the 'laddrs'. Break the loop if invalid data is found.
+ */
+buf += buf_index;
+while (buf < buf_end) {
+buf_index = 0;
+error = ip_parse_cidr_len(buf, _index, , );
+if (!error) {
+laddrs->n_ipv4_addrs++;
+laddrs->ipv4_addrs = xrealloc(
+laddrs->ipv4_addrs,
+sizeof (struct ipv4_netaddr) * laddrs->n_ipv4_addrs);
+laddrs->ipv4_addrs[laddrs->n_ipv4_addrs - 1].addr = ip4;
+laddrs->ipv4_addrs[laddrs->n_ipv4_addrs - 1].plen = plen;
+buf += buf_index;
+continue;
+}
+free(error);
+error = ipv6_parse_cidr_len(buf, _index, , );
+if (!error && store_ipv6) {
+laddrs->n_ipv6_addrs++;
+laddrs->ipv6_addrs = xrealloc(
+laddrs->ipv6_addrs,
+sizeof(struct ipv6_netaddr) * laddrs->n_ipv6_addrs);
+memcpy(>ipv6_addrs[laddrs->n_ipv6_addrs - 1].addr, ,
+   sizeof(struct in6_addr));
+laddrs->ipv6_addrs[laddrs->n_ipv6_addrs - 1].plen = plen;
+}
+
+if (error) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_INFO_RL(, "invalid syntax '%s' in address", address);
+free(error);
+break;
+}
+buf += buf_index;
+}
+
+return true;
+}
diff --git a/ovn/lib/ovn-util.h b/ovn/lib/ovn-util.h
new file mode 100644
index 000..863434a
--- /dev/null
+++ b/ovn/lib/ovn-util.h
@@ -0,0 +1,53 @@
+
+
+#ifndef OVN_UTIL_H
+#define OVN_UTIL_H 1
+
+
+
+struct ipv4_netaddr {
+ovs_be32 addr;
+unsigned int plen;
+};
+
+struct ipv6_netaddr {
+struct in6_addr addr;
+unsigned int plen;
+};
+
+struct lport_addresses {
+struct eth_addr ea;
+size_t n_ipv4_addrs;
+struct ipv4_netaddr *ipv4_addrs;
+size_t n_ipv6_addrs;
+struct ipv6_netaddr *ipv6_addrs;
+};
+
+/*
+ * Extracts the mac, ipv4 and ipv6 addresses from the input param 'address'
+ * which should be of the format 'MAC [IP1 IP2 ..]" where IPn should be
+ * a valid IPv4 or IPv6 address and stores them in the 'ipv4_addrs' and
+ * 'ipv6_addrs' fields of input param 'laddrs'.
+ * The caller has to free the 'ipv4_addrs' and 'ipv6_addrs' fields.
+ * If input param 'store_ipv6' is true only then extracted ipv6 addresses
+ * are stored in 'ipv6_addrs' fields.
+ * Return true if at least 'MAC' is found in 'address', false otherwise.
+ * Eg 1.
+ * If 'address' = '00:00:00:00:00:01 10.0.0.4 fe80::ea2a:eaff:fe28:3390/64
+ * 30.0.0.3/23' and 'store_ipv6' = true
+ * then returns true with laddrs->n_ipv4_addrs = 2, naddrs->n_ipv6_addrs = 1.
+ *
+ * Eg. 2
+ * If 'address' = '00:00:00:00:00:01 10.0.0.4 fe80::e

[ovs-dev] [PATCH RFC 0/3] - send garp on localnet

2016-03-31 Thread Ramu Ramamurthy

Problem:
When a VM comes up with interface on localnet, external
switches on the broadcast segment have to update their
port->mac mappings and routers have to update their arp caches.
This normally happens when the VM initiates traffic.
on the localnet interface. But in some scenarios, when the
VM remains silent, the VM is unreachable externally. This can
happen when the VM is migrated or when the VM is reusing an earlier used
IP but with a different mac. The fix is to for OVN to initiate gratuitous ARP
which updates external routers/switches on localnet.

This problem has been reported at:
https://bugs.launchpad.net/networking-ovn/+bug/1545897

What to generate:
When ipv4 and mac are defined on the port, we can generate a gratuitous ARP
request or a reply. ARP requests seem to be preferred over ARP replies
as suggested here:
https://tools.ietf.org/html/rfc5227#section-3
In this change, a broadcast garp reply is generated.

How many to generate:
3 to 5 arps spaced over 1-2 minutes seems to be reasonable to allow
for losses. For comparison qemu generates 5 rarps when a VM port comes up.

In this change, the garp packet is input to the port, and goes through
the pipelines. An alternative is to output directly
onto the localnet patch port.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn-controller: add restart test

2016-03-23 Thread Ramu Ramamurthy
On Wed, Mar 23, 2016 at 4:23 PM, Ben Pfaff <b...@ovn.org> wrote:
> On Tue, Mar 22, 2016 at 04:37:16PM -0700, Ramu Ramamurthy wrote:
>> > This adds a test for this invariant for a specific test case, which is
>> > useful.  However, I believe that this should be a universal invariant
>> > for ovn-controller, so that at any time if ovn-controller is restarted
>> > gracefully after it has converged, it should preserve these flows.  Is
>> > that true?  If so, then it would be more useful to add such a check to
>> > *every* test case for OVN.
>> >
>> > What do you think?
>> >
>> > Thanks,
>> >
>> > Ben.
>>
>> I am finding that, after a graceful exit of ovn-controller
>> using "ovs-appctl -t ovn-controller exit", and then a restart of the
>> ovn-controller,
>> the flows before restart and after restart dont match on table 0 in 1 out of
>> 2 runs (or 1 out of 3 runs).
>>
>> This is because, localnet ports in the test
>> are being deleted and recreated in such failing runs - and therefore, their
>> ofports are mismatching in the flow-table (before and after restart).
>>
>> Localnet ports are being deleted and recreated in the failing runs
>> because, ovn-controller is cleaning up the chassis row during graceful
>> exit, and upon startup there still seems
>> to be a race between patch_run and binding_run when the chassis does
>> not exist, I am still
>> chasing this down.
>
> Should that behavior of ovn-controller on graceful restart be considered
> a bug, though?  If so, then we should probably fix it and add a test (or
> modify multiple tests, as I suggested above) to make sure that it does
> not recur.
>
Indeed, its a bug in code, and I will update with patch(es) as needed
to fix such
restart bugs, along with test modifications.

>> Thats why the test as currently coded is killing the ovn-controller
>> ungracefully (kill -9) and restarting
>> it. As such the restart test and expected invariance of flows seems useful.
>
> I guess that kill -9 and restart can be a useful test too.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn-controller: add restart test

2016-03-22 Thread Ramu Ramamurthy
> This adds a test for this invariant for a specific test case, which is
> useful.  However, I believe that this should be a universal invariant
> for ovn-controller, so that at any time if ovn-controller is restarted
> gracefully after it has converged, it should preserve these flows.  Is
> that true?  If so, then it would be more useful to add such a check to
> *every* test case for OVN.
>
> What do you think?
>
> Thanks,
>
> Ben.

I am finding that, after a graceful exit of ovn-controller
using "ovs-appctl -t ovn-controller exit", and then a restart of the
ovn-controller,
the flows before restart and after restart dont match on table 0 in 1 out of
2 runs (or 1 out of 3 runs).

This is because, localnet ports in the test
are being deleted and recreated in such failing runs - and therefore, their
ofports are mismatching in the flow-table (before and after restart).

Localnet ports are being deleted and recreated in the failing runs
because, ovn-controller is cleaning up the chassis row during graceful
exit, and upon startup there still seems
to be a race between patch_run and binding_run when the chassis does
not exist, I am still
chasing this down.

Thats why the test as currently coded is killing the ovn-controller
ungracefully (kill -9) and restarting
it. As such the restart test and expected invariance of flows seems useful.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v6 3/3] ovn: Add doc for zone-id external-id on iface record

2016-03-22 Thread Ramu Ramamurthy
Update documentation for the new external-id
in the interface record of ovsdb. The key consists if the
string "ovn-zone-id" concatenated with the logical port id,
and the value is the zone-id.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/controller/ovn-controller.8.xml | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/ovn/controller/ovn-controller.8.xml 
b/ovn/controller/ovn-controller.8.xml
index b261af9..8f60ba0 100644
--- a/ovn/controller/ovn-controller.8.xml
+++ b/ovn/controller/ovn-controller.8.xml
@@ -194,6 +194,21 @@
   logical patch port that it implements.
 
   
+
+  
+external-ids:ovn-zone-id-logicalportid in the 
Interface table
+  
+
+  
+
+  This key is set by the ovn-controller to identify the
+  conntrack zone-id used for a OVN logical port. The key consists of
+  the string ovn-zone-id- concatenated with the logical
+  port-id. Its value contains the zone-id used for the logical port.
+  The interface for the parent logical port contains the zone-id keys 
for
+  child logical ports.
+
+  
 
 
 Runtime Management Commands
-- 
2.3.9

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v6 2/3] ovn: Add a restart test

2016-03-22 Thread Ramu Ramamurthy
Add a test to validate that a restart of the ovn-controller
will program zoneids consistently on ports. Flows before
restart are compared against flows after restart to detect
problems with ofports or zone-ids.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 tests/ovn-controller.at | 131 
 1 file changed, 131 insertions(+)

diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
index 971ea1e..7d17cd8 100644
--- a/tests/ovn-controller.at
+++ b/tests/ovn-controller.at
@@ -128,3 +128,134 @@ as ovn-sb
 OVS_APP_EXIT_AND_WAIT([ovsdb-server])
 
 AT_CLEANUP
+
+AT_SETUP([ovn-controller - zone-id assignment])
+AT_KEYWORDS([ovn])
+#
+# The purpose of this test is to validate the zone-id
+# assigned to ports and programmed in the flow table.
+# It verifies that the zone-id is assigned consistently
+# after ovn-controller restart
+#
+
+# create 2 localnet ports
+# create 3 vifs
+# create 5 cifs
+
+ovn_init_db ovn-sb
+net_add n1
+sim_add hv
+as hv
+ovs-vsctl \
+-- add-br br-phys \
+-- add-br br-eth0 \
+-- add-br br-eth1 \
+-- add-br br-eth2
+
+ovn_attach n1 br-phys 192.168.0.1
+ovs-appctl -t ovn-controller vlog/set any:any:dbg
+
+AT_CHECK([ovs-vsctl set Open_vSwitch . 
external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1,physnet3:br-eth2])
+
+# Create a localnet port
+ovn-sbctl \
+-- --id=@dp101 create Datapath_Binding tunnel_key=101 \
+-- create Port_Binding datapath=@dp101 logical_port=localnet1 tunnel_key=1 
\
+type=localnet options:network_name=physnet1 \
+-- create Port_Binding datapath=@dp101 logical_port=localvif1 tunnel_key=2
+ovs-vsctl add-port br-int localvif1 -- set Interface localvif1 
external_ids:iface-id=localvif1
+
+# Create a vif port
+ovn-sbctl \
+-- --id=@dp102 create Datapath_Binding tunnel_key=102 \
+-- create Port_Binding datapath=@dp102 logical_port=localvif2 tunnel_key=3
+ovs-vsctl add-port br-int localvif2 -- set Interface localvif2 
external_ids:iface-id=localvif2
+
+
+# Create a vif port and some cifs on it
+ovn-sbctl \
+-- --id=@dp103 create Datapath_Binding tunnel_key=103 \
+-- create Port_Binding datapath=@dp103 logical_port=localvif3 tunnel_key=4 
\
+-- create Port_Binding datapath=@dp103 logical_port=localcif1 tunnel_key=5 
parent_port=localvif3 tag=1 \
+-- create Port_Binding datapath=@dp103 logical_port=localcif2 tunnel_key=6 
parent_port=localvif3 tag=2 \
+-- create Port_Binding datapath=@dp103 logical_port=localcif3 tunnel_key=7 
parent_port=localvif3 tag=3 \
+-- create Port_Binding datapath=@dp103 logical_port=localcif4 tunnel_key=8 
parent_port=localvif3 tag=4 \
+-- create Port_Binding datapath=@dp103 logical_port=localcif5 tunnel_key=9 
parent_port=localvif3 tag=5
+ovs-vsctl add-port br-int localvif3 -- set Interface localvif3 
external_ids:iface-id=localvif3
+
+# Create another localnet port
+ovn-sbctl \
+-- --id=@dp201 create Datapath_Binding tunnel_key=201 \
+-- create Port_Binding datapath=@dp201 logical_port=localnet201 
tunnel_key=1 \
+type=localnet options:network_name=physnet2 \
+-- create Port_Binding datapath=@dp201 logical_port=localvif201 
tunnel_key=2
+ovs-vsctl add-port br-int localvif201 -- set Interface localvif201 
external_ids:iface-id=localvif201
+
+# wait until all expected flows are programmed in table 0
+OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=0 | ofctl_strip | tail 
-n+2 | sort | wc -l` -ge 13])
+
+# dump the zoneids
+ovs-vsctl get interface patch-br-int-to-localnet1 external-ids
+ovs-vsctl get interface patch-br-int-to-localnet201 external-ids
+ovs-vsctl get interface localvif1 external-ids
+ovs-vsctl get interface localvif2 external-ids
+ovs-vsctl get interface localvif3 external-ids
+
+# verify zone-id tags are set on the interface
+OVS_WAIT_UNTIL([ovs-vsctl get interface patch-br-int-to-localnet1 
external-ids:ovn-zone-id-localnet1])
+OVS_WAIT_UNTIL([ovs-vsctl get interface patch-br-int-to-localnet201 
external-ids:ovn-zone-id-localnet201])
+OVS_WAIT_UNTIL([ovs-vsctl get interface localvif1 
external-ids:ovn-zone-id-localvif1])
+OVS_WAIT_UNTIL([ovs-vsctl get interface localvif2 
external-ids:ovn-zone-id-localvif2])
+OVS_WAIT_UNTIL([ovs-vsctl get interface localvif3 
external-ids:ovn-zone-id-localvif3])
+OVS_WAIT_UNTIL([ovs-vsctl get interface localvif3 
external-ids:ovn-zone-id-localcif1])
+OVS_WAIT_UNTIL([ovs-vsctl get interface localvif3 
external-ids:ovn-zone-id-localcif2])
+OVS_WAIT_UNTIL([ovs-vsctl get interface localvif3 
external-ids:ovn-zone-id-localcif3])
+OVS_WAIT_UNTIL([ovs-vsctl get interface localvif3 
external-ids:ovn-zone-id-localcif4])
+OVS_WAIT_UNTIL([ovs-vsctl get interface localvif3 
external-ids:ovn-zone-id-localcif5])
+
+# save the table0 flows for a later comparison after restart
+ovs-ofctl dump-flows br-int table=0 | ofctl_strip | tail -n+2 | sort > expout
+cat expout
+
+#kill the ovn controller ungracefully, so nothing gets c

[ovs-dev] [PATCH v6 1/3] ovn: Assign zone-id consistently

2016-03-22 Thread Ramu Ramamurthy
Currently, ovn-controller does not record the
zoneid assigned to logical port. Tests indicate that
zoneids can be different on the same logical port
after ovn-controller restart.

The fix sets the zone-id as an external-id of the interface record,
and recovers the zone-id from that record.

Reported-by: Russell Bryant <russ...@ovn.org>
Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1538696
Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/controller/binding.c | 157 ++-
 1 file changed, 141 insertions(+), 16 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index d3ca9c9..77d0a2d 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -19,6 +19,7 @@
 #include "lib/bitmap.h"
 #include "lib/hmap.h"
 #include "lib/sset.h"
+#include "lib/type-props.h"
 #include "lib/util.h"
 #include "lib/vswitch-idl.h"
 #include "openvswitch/vlog.h"
@@ -50,7 +51,8 @@ binding_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 }
 
 static void
-get_local_iface_ids(const struct ovsrec_bridge *br_int, struct shash *lports)
+get_local_iface_ids(const struct ovsrec_bridge *br_int, struct shash *lports,
+struct shash *localnet_ports)
 {
 int i;
 
@@ -63,10 +65,19 @@ get_local_iface_ids(const struct ovsrec_bridge *br_int, 
struct shash *lports)
 continue;
 }
 
+const char *localnet = smap_get(_rec->external_ids,
+"ovn-localnet-port");
+
 for (j = 0; j < port_rec->n_interfaces; j++) {
 const struct ovsrec_interface *iface_rec;
 
 iface_rec = port_rec->interfaces[j];
+
+if (!strcmp(iface_rec->type, "patch") && localnet) {
+shash_add(localnet_ports, localnet, iface_rec);
+break;
+}
+
 iface_id = smap_get(_rec->external_ids, "iface-id");
 if (!iface_id) {
 continue;
@@ -76,30 +87,127 @@ get_local_iface_ids(const struct ovsrec_bridge *br_int, 
struct shash *lports)
 }
 }
 
+static void get_zones_on_ifaces(struct shash *lports,
+struct simap *iface_zones,
+struct shash *parent_ports)
+{
+const char OVN_ZONE_ID[] = "ovn-zone-id-";
+const struct ovsrec_interface *iface_rec;
+struct shash_node *node;
+struct shash already_added = SHASH_INITIALIZER(_added);
+
+/* get zone-ids set in external-ids of iface records, and
+ * also get parent ports for logical ports
+ * for efficient iteration on child ports, maintain already_added
+ */
+SHASH_FOR_EACH(node, lports) {
+iface_rec = node->data;
+if (!iface_rec || shash_find(_added, iface_rec->name)) {
+continue;
+}
+shash_add(_added, iface_rec->name, iface_rec);
+struct smap_node *zone;
+int zone_id;
+SMAP_FOR_EACH(zone, _rec->external_ids) {
+zone_id = smap_get_int(_rec->external_ids, zone->key, 0);
+if (!strncmp(zone->key, OVN_ZONE_ID, strlen(OVN_ZONE_ID)) &&
+zone_id) {
+simap_put(iface_zones, zone->key+strlen(OVN_ZONE_ID), zone_id);
+shash_add(parent_ports, zone->key+strlen(OVN_ZONE_ID), 
iface_rec);
+}
+}
+}
+shash_destroy(_added);
+}
+
+static void
+update_zone_on_iface_rec(const struct ovsrec_interface *iface_rec,
+ const char *iface_id, int zone_id, struct 
controller_ctx *ctx)
+{
+
+if (!iface_rec || !ctx->ovs_idl_txn) {
+return;
+}
+struct smap new;
+char *zone_key = xasprintf("ovn-zone-id-%s",iface_id);
+char zone[INT_STRLEN(int)+1];
+snprintf(zone, sizeof zone, "%d", zone_id);
+smap_clone(, _rec->external_ids);
+smap_replace(, zone_key, zone);
+ovsrec_interface_verify_external_ids(iface_rec);
+ovsrec_interface_set_external_ids(iface_rec, );
+smap_destroy();
+free(zone_key);
+}
+
+static void
+delete_zone_on_iface_rec(const struct ovsrec_interface *iface_rec,
+ const char *key, struct controller_ctx *ctx)
+{
+if (!iface_rec || !ctx->ovs_idl_txn) {
+return;
+}
+struct smap new;
+char *zone_key = xasprintf("ovn-zone-id-%s", key);
+smap_clone(, _rec->external_ids);
+smap_remove(, zone_key);
+ovsrec_interface_verify_external_ids(iface_rec);
+ovsrec_interface_set_external_ids(iface_rec, );
+smap_destroy();
+free(zone_key);
+}
+
 static void
-update_ct_zones(struct sset *lports, struct simap *ct_zones,
-unsigned long *ct_zone_bitmap)
+update_ct_zones(struct shash *lports, struct simap *ct_zones,
+u

[ovs-dev] [PATCH v6 0/3] ovn: Assign zoneid consistently

2016-03-22 Thread Ramu Ramamurthy

Changes v5 to v6

* updates per code-review
* handle child-ports and localnet ports
* add a test
* update docs
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] ovn: broadcast packets dropped with "drop recirc action"

2016-03-19 Thread Ramu Ramamurthy

On 2016-03-15 11:16, Andy Zhou wrote:

On Mon, Mar 14, 2016 at 12:11 PM, Ramu Ramamurthy <
srama...@linux.vnet.ibm.com> wrote:


When a logical switch (localnet) has lots of ports on a hypervisor,
I find that broadcast packets from one of the ports is only forwarded 
to

a subset of the other ports, and the kernel module shows the message -
"kernel: openvswitch: ovs-system: deferred action limit reached, drop
recirc action"

There appears to be a discussion of a related problem and a possible 
patch

here:
http://openvswitch.org/pipermail//discuss/2015-October/019168.html
http://openvswitch.org/pipermail//discuss/2015-October/019198.html



Thanks for the bug report.

Those patches are now dropped since I have not heard back from the bug
reporter.

I have just posted a potential fix as RFC at:

http://openvswitch.org/pipermail/dev/2016-March/067794.html

Would you please test it against your set up?  Thanks.


I tried the patch in the link above on devstack-with-ovn. The 
openvswitch kernel module

is rebuilt with the above patch, I find that there is a system hang,
when I send the broadcast packet. The same test earlier showed "drop 
recirc action" errors.


The following change works however,

iff --git a/datapath/actions.c b/datapath/actions.c
index 20413c9..39dedcc 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -72,7 +72,7 @@ struct ovs_frag_data {
 static DEFINE_PER_CPU(struct ovs_frag_data, ovs_frag_data_storage);
 #endif

-#define DEFERRED_ACTION_FIFO_SIZE 10
+#define DEFERRED_ACTION_FIFO_SIZE 70

But, it appears that we cannot arbitrarily increase  
DEFERRED_ACTION_FIFO_SIZE  because
when I set it to 100, I am not able to load the openvswitch module - the 
following errors are seen.


[ 1094.163171] WARNING: at mm/percpu.c:896 pcpu_alloc+0x64a/0x670()
[ 1094.163174] illegal size (46408) or align (8) for percpu allocation

It appears that there is a hard limit on DEFERRED_ACTION_FIFO_SIZE 
determined by:

PCPU_MIN_UNIT_SIZE.



___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] ovn: broadcast packets dropped with "drop recirc action"

2016-03-19 Thread Ramu Ramamurthy

On 2016-03-16 16:35, Andy Zhou wrote:

On Wed, Mar 16, 2016 at 4:00 PM, Ramu Ramamurthy <
srama...@linux.vnet.ibm.com> wrote:


On 2016-03-15 11:16, Andy Zhou wrote:


On Mon, Mar 14, 2016 at 12:11 PM, Ramu Ramamurthy <
srama...@linux.vnet.ibm.com> wrote:

When a logical switch (localnet) has lots of ports on a hypervisor,
I find that broadcast packets from one of the ports is only 
forwarded to
a subset of the other ports, and the kernel module shows the message 
-
"kernel: openvswitch: ovs-system: deferred action limit reached, 
drop

recirc action"

There appears to be a discussion of a related problem and a possible
patch
here:
http://openvswitch.org/pipermail//discuss/2015-October/019168.html
http://openvswitch.org/pipermail//discuss/2015-October/019198.html




Thanks for the bug report.

Those patches are now dropped since I have not heard back from the 
bug

reporter.

I have just posted a potential fix as RFC at:

http://openvswitch.org/pipermail/dev/2016-March/067794.html

Would you please test it against your set up?  Thanks.



I tried the patch in the link above on devstack-with-ovn. The 
openvswitch

kernel module
is rebuilt with the above patch, I find that there is a system hang,
when I send the broadcast packet. The same test earlier showed "drop
recirc action" errors.



Ahh, I see.  Would you please try the following incremental change? 
Thanks.


 diff --git a/datapath/actions.c b/datapath/actions.c
index 3b85eaf..5e81a38 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -116,7 +116,7 @@ static struct deferred_action 
*action_fifo_get(struct

action

 static struct deferred_action *action_fifo_put(struct action_fifo 
*fifo)

 {
-   if (fifo->head >= fifo->size - 1) {
+   if (fifo->head >= fifo->size / sizeof(*fifo->fifo) - 1) {
/* out of fifo buffer space, try to allocate a new fifo
 * buffer. */
struct deferred_action *new_fifo;







thanks Andy, That worked !, I was able to broadcast (udp-dhcp-discovers) 
to 100 ports and

saw no "drop recirc action" errors.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] ovn: broadcast packets dropped with "drop recirc action"

2016-03-15 Thread Ramu Ramamurthy

On 2016-03-15 11:16, Andy Zhou wrote:

On Mon, Mar 14, 2016 at 12:11 PM, Ramu Ramamurthy <
srama...@linux.vnet.ibm.com> wrote:


When a logical switch (localnet) has lots of ports on a hypervisor,
I find that broadcast packets from one of the ports is only forwarded 
to

a subset of the other ports, and the kernel module shows the message -
"kernel: openvswitch: ovs-system: deferred action limit reached, drop
recirc action"

There appears to be a discussion of a related problem and a possible 
patch

here:
http://openvswitch.org/pipermail//discuss/2015-October/019168.html
http://openvswitch.org/pipermail//discuss/2015-October/019198.html



Thanks for the bug report.

Those patches are now dropped since I have not heard back from the bug
reporter.

I have just posted a potential fix as RFC at:

http://openvswitch.org/pipermail/dev/2016-March/067794.html

Would you please test it against your set up?  Thanks.


Andy, Thanks for the fix,

I will update after testing it,

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] ovn: broadcast packets dropped with "drop recirc action"

2016-03-14 Thread Ramu Ramamurthy

When a logical switch (localnet) has lots of ports on a hypervisor,
I find that broadcast packets from one of the ports is only forwarded to
a subset of the other ports, and the kernel module shows the message -
"kernel: openvswitch: ovs-system: deferred action limit reached, drop 
recirc action"


There appears to be a discussion of a related problem and a possible 
patch here:

http://openvswitch.org/pipermail//discuss/2015-October/019168.html
http://openvswitch.org/pipermail//discuss/2015-October/019198.html

there is another related discussion here but it seems to have been 
fixed,

http://openvswitch.org/pipermail/dev/2015-November/061840.html
but I am still seeing the problem

I appreciate any feedback on possible fix for this problem.
(I will try the fixes in above link).

Details (for reproduction):

1. Create 50 logical ports on a localnet logical switch and bind them

for i in {1..50};do ovn-nbctl lport-add 
064d37d8-b3dd-45e6-81a3-15fc8d0a99a3 vif-${i}; done
for i in {1..50}; do sudo ip link add vif-${i} type veth peer name  
vifpeer-${i}; done

for i in {1..50}; do sudo ip link set dev vif-${i} up; done
for i in {1..50}; do sudo ip link set dev vifpeer-${i} up; done
sudo ip addr add 10.10.10.1/16 dev vifpeer-1

2. Send a broadcast packet on one of the ports
sudo dhclient -v vifpeer-1 <<< this triggers a broadcast DHCP DISCOVER 
message


3. I find that only about 10 of the 50 ports get the dhcp DISCOVER 
broadcast message.


The problem appears is in this flow in table 33. It seems that only the
first 10 resubmits to table 34 in the following flow have really taken 
effect,


cookie=0x0, duration=60631.984s, table=33, n_packets=118, n_bytes=11074, 
idle_age=2205, hard_age=5826, priority=100,reg7=0x,metadata=0x2 
actions=load:0x32->NXM_NX_REG5[],load:0x31->NXM_NX_REG7[],resubmit(,34),load:0x11->NXM_NX_REG5[],load:0x10->NXM_NX_REG7[],resubmit(,34),load:0x31->NXM_NX_REG5[],load:0x30->NXM_NX_REG7[],resubmit(,34),load:0x26->NXM_NX_REG5[],load:0x25->NXM_NX_REG7[],resubmit(,34),load:0x3->NXM_NX_REG5[],load:0x2->NXM_NX_REG7[],resubmit(,34),load:0xa->NXM_NX_REG5[],load:0x9->NXM_NX_REG7[],resubmit(,34),load:0xd->NXM_NX_REG5[],load:0xc->NXM_NX_REG7[],resubmit(,34),load:0x35->NXM_NX_REG5[],load:0x34->NXM_NX_REG7[],resubmit(,34),load:0x18->NXM_NX_REG5[],load:0x17->NXM_NX_REG7[],resubmit(,34),load:0x14->NXM_NX_REG5[],load:0x13->NXM_NX_REG7[],resubmit(,34),load:0x4->NXM_NX_REG5[],load:0x4->NXM_NX_REG7[],resubmit(,34),load:0x10->NXM_NX_REG5[],load:0xf->NXM_NX_REG7[],resubmit(,34),load:0x2b->NXM_NX_REG5[],load:0x2a->NXM_NX_REG7[],resubmit(,34),load:0x2d->NXM_NX_REG5[],load:0x2c->NXM_NX_REG7[],resubmit(,34),load:0x27->NXM_NX_REG5[],load:0x26->NXM_NX_REG7[],resubmit(,34),load:0xb->NXM_NX_REG5[],load:0x


a->NXM_N
X_REG7[],resubmit(,34),load:0x17->NXM_NX_REG5[],load:0x16->NXM_NX_REG7[],resubmit(,34),load:0x1f->NXM_NX_REG5[],load:0x1e->NXM_NX_REG7[],resubmit(,34),load:0x1b->NXM_NX_REG5[],load:0x1a->NXM_NX_REG7[],resubmit(,34),load:0x28->NXM_NX_REG5[],load:0x27->NXM_NX_REG7[],resubmit(,34),load:0x33->NXM_NX_REG5[],load:0x32->NXM_NX_REG7[],resubmit(,34),load:0x30->NXM_NX_REG5[],load:0x2f->NXM_NX_REG7[],resubmit(,34),load:0x5->NXM_NX_REG5[],load:0x5->NXM_NX_REG7[],resubmit(,34),load:0x34->NXM_NX_REG5[],load:0x33->NXM_NX_REG7[],resubmit(,34),load:0x15->NXM_NX_REG5[],load:0x14->NXM_NX_REG7[],resubmit(,34),load:0x8->NXM_NX_REG5[],load:0x7->NXM_NX_REG7[],resubmit(,34),load:0xc->NXM_NX_REG5[],load:0xb->NXM_NX_REG7[],resubmit(,34),load:0xe->NXM_NX_REG5[],load:0xd->NXM_NX_REG7[],resubmit(,34),load:0x1c->NXM_NX_REG5[],load:0x1b->NXM_NX_REG7[],resubmit(,34),load:0x1e->NXM_NX_REG5[],load:0x1d->NXM_NX_REG7[],resubmit(,34),load:0x12->NXM_NX_REG5[],load:0x11->NXM_NX_REG7[],resubmit(,34),load:0x9->NXM_N

X_REG5[]
,load:0x8->NXM_NX_REG7[],resubmit(,34),load:0x19->NXM_NX_REG5[],load:0x18->NXM_NX_REG7[],resubmit(,34),load:0x22->NXM_NX_REG5[],load:0x21->NXM_NX_REG7[],resubmit(,34),load:0x2a->NXM_NX_REG5[],load:0x29->NXM_NX_REG7[],resubmit(,34),load:0x13->NXM_NX_REG5[],load:0x12->NXM_NX_REG7[],resubmit(,34),load:0x1a->NXM_NX_REG5[],load:0x19->NXM_NX_REG7[],resubmit(,34),load:0x1d->NXM_NX_REG5[],load:0x1c->NXM_NX_REG7[],resubmit(,34),load:0x29->NXM_NX_REG5[],load:0x28->NXM_NX_REG7[],resubmit(,34),load:0x36->NXM_NX_REG5[],load:0x35->NXM_NX_REG7[],resubmit(,34),load:0x25->NXM_NX_REG5[],load:0x24->NXM_NX_REG7[],resubmit(,34),load:0x24->NXM_NX_REG5[],load:0x23->NXM_NX_REG7[],resubmit(,34),load:0x16->NXM_NX_REG5[],load:0x15->NXM_NX_REG7[],resubmit(,34),load:0x2c->NXM_NX_REG5[],load:0x2b->NXM_NX_REG7[],resubmit(,34),load:0x2e->NXM_NX_REG5[],load:0x2d->NXM_NX_REG7[],resubmit(,34),load:0xf->NXM_NX_REG5[],load:0xe->NXM_NX_REG7[],resubmit(,34),load:0x2->NXM_NX_REG5[],load:0x1->NXM_NX_REG7[],resubmit(

,34),loa

Re: [ovs-dev] [PATCH] ovn-controller: add restart test

2016-03-07 Thread ramu
> What I would probably do is submit them together: the zone-ids fix + this
> test case that helps test it.  If you take out your "xyz" change, will this
> test fail currently?
>
> If i take out the change which masks the zone-ids, this test fails on
master because, zone-ids dont match before and after restart.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn-controller: add restart test

2016-03-07 Thread ramu
>
> Does this depend on your zone-ids fix?  Do you have another rev of that
> coming?
>

Russell, This test does not depend on the zone-id fix, because it replaces
zone-ids on flows
with a dummy string "xyz". It passes on the master as is.

I will send the next revision of the zone-id fix soon, and as part of that
fix, this test
will also be updated to compare zone-ids on flows.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn-controller: add restart test

2016-03-07 Thread Ramu Ramamurthy
Add a test to validate that a restart of the ovn-controller
will program flows in table 0 consistently. Flows before
restart are compared against flows after restart to detect
problems with ofports or zone-ids.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 tests/ovn-controller.at | 127 
 1 file changed, 127 insertions(+)

diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
index 2ff56af..68b78ff 100644
--- a/tests/ovn-controller.at
+++ b/tests/ovn-controller.at
@@ -117,3 +117,130 @@ check_patches
 # Gracefully terminate ovn-controller
 ovs-appctl -t ovn-controller exit
 AT_CLEANUP
+
+AT_SETUP([ovn-controller - restart])
+AT_KEYWORDS([ovn])
+
+# The purpose of this test is to validate flows programmed
+# (in table 0) after ovn-controller restart.
+# This test confirms that:
+# a) ofports remain the same, so, ports were not deleted/readded
+# b) zone-ids remain the same on ports
+#
+# For now, zone-ids in flows are masked out because, they are not assigned
+# consistently. This test is added to prepare for that code-change.
+#
+
+ovn_init_db ovn-sb
+net_add n1
+sim_add hv
+as hv
+ovs-vsctl \
+-- add-br br-phys \
+-- add-br br-eth0 \
+-- add-br br-eth1 \
+-- add-br br-eth2
+
+ovn_attach n1 br-phys 192.168.0.1
+
+AT_CHECK([ovs-vsctl set Open_vSwitch . 
external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1,physnet3:br-eth2])
+
+#
+# - Create 2 localnet ports
+# - Create 4 VIFs
+# - Create 5 CIFs
+#
+
+#
+# Create a localnet port, along with a vif
+#
+ovn-sbctl \
+-- --id=@dp101 create Datapath_Binding tunnel_key=101 \
+-- create Port_Binding datapath=@dp101 logical_port=localnet1 tunnel_key=1 
\
+type=localnet options:network_name=physnet1 \
+-- create Port_Binding datapath=@dp101 logical_port=localvif1 tunnel_key=2
+ovs-vsctl add-port br-int localvif1 -- set Interface localvif1 
external_ids:iface-id=localvif1
+#
+# Create another localnet port, along with a vif
+#
+ovn-sbctl \
+-- --id=@dp201 create Datapath_Binding tunnel_key=201 \
+-- create Port_Binding datapath=@dp201 logical_port=localnet201 
tunnel_key=1 \
+type=localnet options:network_name=physnet2 \
+-- create Port_Binding datapath=@dp201 logical_port=localvif201 
tunnel_key=2
+
+ovs-vsctl add-port br-int localvif201 -- set Interface localvif201 
external_ids:iface-id=localvif201
+
+#
+# Create a vif port
+#
+ovn-sbctl \
+-- --id=@dp102 create Datapath_Binding tunnel_key=102 \
+-- create Port_Binding datapath=@dp102 logical_port=localvif2 tunnel_key=3
+ovs-vsctl add-port br-int localvif2 -- set Interface localvif2 
external_ids:iface-id=localvif2
+
+#
+# Create a vif port and add 5 cifs on it
+#
+ovn-sbctl \
+-- --id=@dp103 create Datapath_Binding tunnel_key=103 \
+-- create Port_Binding datapath=@dp103 logical_port=localvif3 tunnel_key=4 
\
+-- create Port_Binding datapath=@dp103 logical_port=localcif1 tunnel_key=5 
parent_port=localvif3 tag=1 \
+-- create Port_Binding datapath=@dp103 logical_port=localcif2 tunnel_key=6 
parent_port=localvif3 tag=2 \
+-- create Port_Binding datapath=@dp103 logical_port=localcif3 tunnel_key=7 
parent_port=localvif3 tag=3 \
+-- create Port_Binding datapath=@dp103 logical_port=localcif4 tunnel_key=8 
parent_port=localvif3 tag=4 \
+-- create Port_Binding datapath=@dp103 logical_port=localcif5 tunnel_key=9 
parent_port=localvif3 tag=5
+ovs-vsctl add-port br-int localvif3 -- set Interface localvif3 
external_ids:iface-id=localvif3
+
+ovs-vsctl show
+
+# wait for flows in table 0 to get programmed
+OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=0 | ofctl_strip | tail 
-n+2 | sort | wc -l` -ge 13])
+
+# save the table0 flows for a later comparison after restart
+ovs-ofctl dump-flows br-int table=0 | ofctl_strip | tail -n+2 | sort | sed 
's/load.\+NXM_NX_REG5/load:xyz->NXM_NX_REG5/g' > expout
+cat expout
+
+
+#kill the ovn controller ungracefully
+kill -9 `cat "$OVS_RUNDIR"/ovn-controller.pid`
+
+echo "killed the ovn-controller"
+
+#clean the flows
+ovs-ofctl del-flows br-int
+
+#restart the ovn controller
+start_daemon ovn-controller
+
+#wait for the table 0 flows to get reprogrammed
+OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int table=0 | ofctl_strip | tail 
-n+2 | sort | wc -l` -ge 13])
+
+# save flows for a comparison
+ovs-ofctl dump-flows br-int table=0 | ofctl_strip | tail -n+2 | sort | sed 
's/load.\+NXM_NX_REG5/load:xyz->NXM_NX_REG5/g' > afterkill
+cat afterkill
+
+#compare the table 0 flows before and after restart
+AT_CHECK([ovs-ofctl diff-flows afterkill expout])
+
+
+ovs-appctl -t ovn-controller exit
+echo "shutdown the ovn-controller gracefully"
+
+#clean the flows
+ovs-ofctl del-flows br-int
+
+start_daemon ovn-controller
+echo "restarted the ovn-controller"
+
+#wait for the table 0 flows to get reprogrammed
+OVS_WAIT_UNTIL([test `ovs-ofctl dump-flows br-int

[ovs-dev] [PATCH] ovn-controller: race between binding-run and patch-run for localnet ports

2016-03-04 Thread Ramu Ramamurthy
when ctx->ovnsb_idl_txn is null, binding_run exits early
and does not add any local_datapaths, but patch_run
doesnt check this, and ends up deleting localnet ports,
because there are no local datapaths for them,
They get readded in a subsequent run causing unnecessary
deletion and readdition.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 ovn/controller/patch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovn/controller/patch.c b/ovn/controller/patch.c
index 753ce3e..cfae613 100644
--- a/ovn/controller/patch.c
+++ b/ovn/controller/patch.c
@@ -280,7 +280,7 @@ void
 patch_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
   struct hmap *local_datapaths)
 {
-if (!ctx->ovs_idl_txn) {
+if (!ctx->ovs_idl_txn || !ctx->ovnsb_idl_txn) {
 return;
 }
 
-- 
2.3.9

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [ovs-dev, v5] ovn-controller: Assign zone-id consistently

2016-02-23 Thread ramu
Thanks for your review!

>
> From this part of the patch, it looks like external-ids:zone-id is
> accepted even if there are duplicates, or if the value is not valid.  I
> think that it should reject such cases:
>


> With that in mind, update_local_zone_ids() should also update
> external-ids:zone-id if it needs to change, instead of leaving it the
> same.
>
>
Here, please use INT_STRLEN(int) + 1 as the size of zone[].
> +char zone[12];
> +zone_id = simap_get(ct_zones, iface_id);
> +snprintf(zone, sizeof zone, "%d", zone_id);
>
>
> In the next version I will incorporate these changes,
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [RFC] OVN: Support Static Route

2016-02-23 Thread Ramu Ramamurthy


Reported-At: https://bugs.launchpad.net/networking-ovn/+bug/1539347
Reported-By: na...@cn.ibm.com
Co-Authored-By: na...@cn.ibm.com,rua...@cn.ibm.com

---
openstack-neutron (CMS) allows the configuration of
"static-routes" on a router as shown below.

neutron router-update --route 
destination=192.168.3.0/24,nexthop=192.168.1.100 router1


The static route gets implemented (in openstack) by adding a route in 
the router-namespace using a command such as: ip route add to 
destination-cidr via nexthop.


The next-hop is a port known to the CMS, and so, the CMS knows the 
IP->mac mapping

of the nexthop port.

The usecases addressed by static route includes a VPN router. A VPN 
router is a VM that has interfaces on a private network, and on a 
provider network.
The static route allows other VMs on the private network to reach the 
VPN-network via

the VPN router.

Example:
  Private networks are 192.168.1.0/24 and 192.168.2.0/24
  VPN-router has an interface (192.168.1.100) on private network 
(192.168.1.0/24)

  VPN connects to 192.168.3.0/24 over the provider network
  the following static route is needed to allow VMs private networks 
to reach the VPN network

  destination=192.168.3.0/24, nexthop=192.168.1.100


An approach to implement static routes in OVN follows:

1) A new column called static route is added to the Logical Router table
* the static route column may in turn refer to a new 
Logical_Router_Static_Route table whose columns include

 destination, nexthop
OR for simplicity,
   * the static route column may contain the static routes as a string 
with formatting to denote separation - for example, 
"destination=192.168.3.0/24,nexthop=192.168.1.100;destination=192.168.2.0/24,nexthop=10.0.1.4"


2) Based on the presence of the "static route" field in the logical 
router, OVN northd programs a logical flow corresponding to the each 
static route as follow:


actions : "ip.ttl--; reg0 = nexthop; next;"
external_ids : {stage-name=lr_in_ip_routing}
match : "ip4.dst == destination cidr"
pipeline : ingress

The lr_in_arp stage would have already have an arp-resolution-entry for 
the nexthop since the nexthop (IP,Mac) is already added as a logical 
port.



___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v5] ovn-controller: Assign zone-id consistently

2016-02-15 Thread ramu
>
>
> Unfortunately, I thought of another issue that complicates this whole
> approach.  A single interface does not necessarily map to a single
> logical port and zone ID.  We support sub-ports, initially aimed at
> modelling containers in VMs.  That means we need to track N different
> zone IDs on a single interface record.  For more info, see "Life Cycle
> of a Container Interface Inside a VM" in ovn-archtecture(7).
>
> For that use case, we could easily have many hundreds of sub-ports using
> a single interface.  Maybe we should have external-id keys of the form
> "ovn-zone-id-" where  is the logical port the zone id is
> for?  We'd need some additional code for cleaning up zone IDs for
> sub-ports that have been deleted.
>

Thanks, In the case of a cif, can we have the key as: ovn-zone-id-
where the tag is the vlan
which distinguishes the cif on the vif ? But, this will result in external
id containings lots of keys - would that be acceptable ?

An alternative approach just for comparison is: during ovn-controller
startup to look at the installed flows in table 0 of the flow table, and
recover the zone-id -> (ofport, tag) map and from there deduce the
logical-port -> zone-id map
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v5] ovn-controller: Assign zone-id consistently

2016-02-15 Thread Ramu Ramamurthy
Currently, ovn-controller does not record the
lport->zoneid map, and so, after ovn-controller restart,
zone-ids may get set inconsistently on lports, resulting
in possible hits to already established connections.

Set zone-id as an external-id of the interface record,
and recover the zone-id from that record

Reported-by: Russell Bryant <russ...@ovn.org>
Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1538696
Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 Changes v3 to v4: update as per code-review
   * simplify code to update zone-id only when needed
   * update documentation for external id
   * update authors
 Changes v4 to v5: fix formatting

 AUTHORS |  1 +
 ovn/controller/binding.c| 39 +++--
 ovn/controller/ovn-controller.8.xml | 13 +
 3 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/AUTHORS b/AUTHORS
index 936394d..597899d 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -161,6 +161,7 @@ pritesh pritesh.koth...@cisco.com
 Pravin B Shelar pshe...@nicira.com
 Raju Subramanianrsubraman...@nicira.com
 Rami Rosen  ramir...@gmail.com
+Ramu Ramamurthy ramu.ramamur...@us.ibm.com
 Randall Sharo   andall.sh...@navy.mil
 Ravi Kerur  ravi.ke...@telekom.com
 Reid Price  r...@nicira.com
diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index cb12cea..543686c 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -50,7 +50,39 @@ binding_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 }
 
 static void
-get_local_iface_ids(const struct ovsrec_bridge *br_int, struct shash *lports)
+update_local_zone_id(const struct ovsrec_interface *iface_rec, const char 
*iface_id,
+ struct simap *ct_zones, unsigned long *ct_zone_bitmap,
+ struct controller_ctx *ctx)
+{
+int zone_id;
+const char *OVN_ZONE_ID = "ovn-zone-id";
+
+zone_id = smap_get_int(_rec->external_ids, OVN_ZONE_ID, 0);
+if (zone_id && !simap_contains(ct_zones, iface_id)) {
+/* get zone-id from the local interface record */
+bitmap_set1(ct_zone_bitmap, zone_id);
+simap_put(ct_zones, iface_id, zone_id);
+}
+if (!zone_id && simap_contains(ct_zones, iface_id) &&
+ctx->ovs_idl_txn) {
+/* zone-id has been assigned to lport, but not
+ * recorded in the local interface record yet */
+struct smap new;
+char zone[12];
+zone_id = simap_get(ct_zones, iface_id);
+snprintf(zone, sizeof zone, "%d", zone_id);
+smap_clone(, _rec->external_ids);
+smap_replace(, OVN_ZONE_ID, zone);
+ovsrec_interface_verify_external_ids(iface_rec);
+ovsrec_interface_set_external_ids(iface_rec, );
+smap_destroy();
+}
+}
+
+static void
+get_local_iface_ids(const struct ovsrec_bridge *br_int, struct shash *lports,
+struct simap *ct_zones, unsigned long *ct_zone_bitmap,
+struct controller_ctx *ctx)
 {
 int i;
 
@@ -72,10 +104,13 @@ get_local_iface_ids(const struct ovsrec_bridge *br_int, 
struct shash *lports)
 continue;
 }
 shash_add(lports, iface_id, iface_rec);
+update_local_zone_id(iface_rec, iface_id, ct_zones,
+ ct_zone_bitmap, ctx);
 }
 }
 }
 
+
 static void
 update_ct_zones(struct sset *lports, struct simap *ct_zones,
 unsigned long *ct_zone_bitmap)
@@ -165,7 +200,7 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 
 struct shash lports = SHASH_INITIALIZER();
 if (br_int) {
-get_local_iface_ids(br_int, );
+get_local_iface_ids(br_int, , ct_zones, ct_zone_bitmap, ctx);
 } else {
 /* We have no integration bridge, therefore no local logical ports.
  * We'll remove our chassis from all port binding records below. */
diff --git a/ovn/controller/ovn-controller.8.xml 
b/ovn/controller/ovn-controller.8.xml
index b261af9..383de61 100644
--- a/ovn/controller/ovn-controller.8.xml
+++ b/ovn/controller/ovn-controller.8.xml
@@ -194,6 +194,19 @@
   logical patch port that it implements.
 
   
+
+  
+external-ids:ovn-zone-id in the
+Interface table
+  
+
+  
+
+  This key is set by the ovn-controller to identify the
+  conntrack-zone id used for the OVN logical port. Its value specifies
+  the zone-id.
+
+  
 
 
 Runtime Management Commands
-- 
2.3.9

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v4] ovn-controller: Assign zone-id consistently

2016-02-15 Thread Ramu Ramamurthy
Currently, ovn-controller does not record the
lport->zoneid map, and so, after ovn-controller restart,
zone-ids may get set inconsistently on lports, resulting
in possible hits to already established connections.

Set zone-id as an external-id of the interface record,
and recover the zone-id from that record

Reported-by: Russell Bryant <russ...@ovn.org>
Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1538696
Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
 Changes v3 to v4: update as per code-review 
  * simplify code to update zone-id only when needed
  * update documentation for external id
  * update authors

 AUTHORS |  1 +
 ovn/controller/binding.c| 39 +++--
 ovn/controller/ovn-controller.8.xml | 13 +
 3 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/AUTHORS b/AUTHORS
index 936394d..597899d 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -161,6 +161,7 @@ pritesh pritesh.koth...@cisco.com
 Pravin B Shelar pshe...@nicira.com
 Raju Subramanianrsubraman...@nicira.com
 Rami Rosen  ramir...@gmail.com
+Ramu Ramamurthy ramu.ramamur...@us.ibm.com
 Randall Sharo   andall.sh...@navy.mil
 Ravi Kerur  ravi.ke...@telekom.com
 Reid Price  r...@nicira.com
diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index cb12cea..a2ce2ba 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -50,7 +50,39 @@ binding_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 }
 
 static void
-get_local_iface_ids(const struct ovsrec_bridge *br_int, struct shash *lports)
+update_local_zone_id(const struct ovsrec_interface *iface_rec, const char 
*iface_id,
+ struct simap *ct_zones, unsigned long *ct_zone_bitmap,
+ struct controller_ctx *ctx)
+{
+int zone_id;
+const char *OVN_ZONE_ID = "ovn-zone-id";
+
+zone_id = smap_get_int(_rec->external_ids, OVN_ZONE_ID, 0);
+if (zone_id && !simap_contains(ct_zones, iface_id)) {
+/* get zone-id from the local interface record */
+bitmap_set1(ct_zone_bitmap, zone_id);
+simap_put(ct_zones, iface_id, zone_id);
+}
+if (!zone_id && simap_contains(ct_zones, iface_id) &&
+ctx->ovs_idl_txn) {
+/* zone-id has been assigned to lport, but not
+ * recorded in the local interface record yet */
+struct smap new;
+char zone[12];
+zone_id = simap_get(ct_zones, iface_id);
+snprintf(zone, sizeof zone, "%d", zone_id);
+smap_clone(, _rec->external_ids);
+smap_replace(, OVN_ZONE_ID, zone);
+ovsrec_interface_verify_external_ids(iface_rec);
+ovsrec_interface_set_external_ids(iface_rec, );
+smap_destroy();
+}
+}
+
+static void
+get_local_iface_ids(const struct ovsrec_bridge *br_int, struct shash *lports,
+struct simap *ct_zones, unsigned long *ct_zone_bitmap,
+   struct controller_ctx *ctx)
 {
 int i;
 
@@ -72,10 +104,13 @@ get_local_iface_ids(const struct ovsrec_bridge *br_int, 
struct shash *lports)
 continue;
 }
 shash_add(lports, iface_id, iface_rec);
+update_local_zone_id(iface_rec, iface_id, ct_zones,
+ ct_zone_bitmap, ctx);
 }
 }
 }
 
+
 static void
 update_ct_zones(struct sset *lports, struct simap *ct_zones,
 unsigned long *ct_zone_bitmap)
@@ -165,7 +200,7 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 
 struct shash lports = SHASH_INITIALIZER();
 if (br_int) {
-get_local_iface_ids(br_int, );
+get_local_iface_ids(br_int, , ct_zones, ct_zone_bitmap, ctx);
 } else {
 /* We have no integration bridge, therefore no local logical ports.
  * We'll remove our chassis from all port binding records below. */
diff --git a/ovn/controller/ovn-controller.8.xml 
b/ovn/controller/ovn-controller.8.xml
index b261af9..383de61 100644
--- a/ovn/controller/ovn-controller.8.xml
+++ b/ovn/controller/ovn-controller.8.xml
@@ -194,6 +194,19 @@
   logical patch port that it implements.
 
   
+
+  
+external-ids:ovn-zone-id in the
+Interface table
+  
+
+  
+
+  This key is set by the ovn-controller to identify the
+  conntrack-zone id used for the OVN logical port. Its value specifies
+  the zone-id.
+
+  
 
 
 Runtime Management Commands
-- 
2.3.9

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v3] ovn-controller: Assign zone-id consistently

2016-02-11 Thread Ramu Ramamurthy
Currently, ovn-controller does not record the
lport->zoneid map, and so, after ovn-controller restart,
zone-ids may get set inconsistently on lports, resulting
in possible hits to already established connections.

Set zone-id as an external-id of the interface record,
and recover the zone-id from that record

Reported-by: Russell Bryant <russ...@ovn.org>
Reported-at: https://bugs.launchpad.net/networking-ovn/+bug/1538696
Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>
---
Changes v1 to v2
 * add check for null br-int
Changes v2 to v3
 * updates per code-review comments

 ovn/controller/binding.c | 64 +---
 1 file changed, 60 insertions(+), 4 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index cb12cea..4aecffe 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -50,7 +50,8 @@ binding_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 }
 
 static void
-get_local_iface_ids(const struct ovsrec_bridge *br_int, struct shash *lports)
+get_local_iface_ids(const struct ovsrec_bridge *br_int, struct shash *lports,
+struct simap *ct_zones, unsigned long *ct_zone_bitmap)
 {
 int i;
 
@@ -65,6 +66,8 @@ get_local_iface_ids(const struct ovsrec_bridge *br_int, 
struct shash *lports)
 
 for (j = 0; j < port_rec->n_interfaces; j++) {
 const struct ovsrec_interface *iface_rec;
+const char *zone;
+int zone_id;
 
 iface_rec = port_rec->interfaces[j];
 iface_id = smap_get(_rec->external_ids, "iface-id");
@@ -72,13 +75,65 @@ get_local_iface_ids(const struct ovsrec_bridge *br_int, 
struct shash *lports)
 continue;
 }
 shash_add(lports, iface_id, iface_rec);
+zone = smap_get(_rec->external_ids, "ovn-zone-id");
+if (zone && ovs_scan(zone, "%d", _id)) {
+bitmap_set1(ct_zone_bitmap, zone_id);
+simap_put(ct_zones, iface_id, zone_id);
+}
+}
+}
+}
+
+static void
+update_local_zone_ids(const struct ovsrec_bridge *br_int, struct simap 
*ct_zones,
+  struct controller_ctx *ctx)
+{
+int i;
+
+if (!ctx->ovs_idl_txn || !br_int) {
+return;
+}
+
+for (i = 0; i < br_int->n_ports; i++) {
+const struct ovsrec_port *port_rec = br_int->ports[i];
+int j;
+
+if (!strcmp(port_rec->name, br_int->name)) {
+continue;
+}
+
+for (j = 0; j < port_rec->n_interfaces; j++) {
+const struct ovsrec_interface *iface_rec;
+const char *iface_id;
+struct smap new;
+int zone_id;
+char zone[12];
+
+iface_rec = port_rec->interfaces[j];
+iface_id = smap_get(_rec->external_ids, "iface-id");
+
+if (!iface_id ||
+smap_get(_rec->external_ids, "ovn-zone-id") ||
+!simap_contains(ct_zones, iface_id)) {
+continue;
+}
+
+zone_id = simap_get(ct_zones, iface_id);
+snprintf(zone, sizeof zone, "%d", zone_id);
+smap_clone(, _rec->external_ids);
+smap_replace(, "ovn-zone-id", zone);
+ovsrec_interface_verify_external_ids(iface_rec);
+ovsrec_interface_set_external_ids(iface_rec, );
+smap_destroy();
 }
 }
 }
 
 static void
 update_ct_zones(struct sset *lports, struct simap *ct_zones,
-unsigned long *ct_zone_bitmap)
+unsigned long *ct_zone_bitmap,
+const struct ovsrec_bridge *br_int,
+struct controller_ctx *ctx)
 {
 struct simap_node *ct_zone, *ct_zone_next;
 const char *iface_id;
@@ -119,6 +174,7 @@ update_ct_zones(struct sset *lports, struct simap *ct_zones,
  * xxx zone, but we need a generic interface to the conntrack
  * xxx table. */
 }
+update_local_zone_ids(br_int, ct_zones, ctx);
 }
 
 static void
@@ -165,7 +221,7 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 
 struct shash lports = SHASH_INITIALIZER();
 if (br_int) {
-get_local_iface_ids(br_int, );
+get_local_iface_ids(br_int, , ct_zones, ct_zone_bitmap);
 } else {
 /* We have no integration bridge, therefore no local logical ports.
  * We'll remove our chassis from all port binding records below. */
@@ -224,7 +280,7 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,
 VLOG_DBG("No port binding record for lport %s", node->name);
 }
 
-update_ct_zones(_lports, ct_zones, ct_zone_bitmap);
+update_ct_zones(_lports, ct_zones, ct_zone_bitmap, br_int, ctx);
 
 shash_destroy();
 sset_destroy(_lports);
-- 
2.3.9

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn-controller: Assign zone-ids consistently

2016-02-08 Thread Ramu Ramamurthy



Currently, conntrack zone-id is assigned to lport by ovn-controller,
but the ovn-controller does not remember what was earlier assigned
to the same lport (possibly in an earlier run across restart).

So, after ovn-controller restart, the zone-ids may get set 
inconsistently

on lports, resulting in possible hits to already established
connections.

Fix is to remember the zone-id as an external-id of the interface record
in the local ovs-db, and recover zone-ids assigned earlier to lports 
from

that record.

This patch fixes:
https://bugs.launchpad.net/networking-ovn/+bug/1538696

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>

---
 ovn/controller/binding.c | 70 
+++-

 1 file changed, 63 insertions(+), 7 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index ce9cccf..28adcef 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -46,10 +46,60 @@ binding_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 ovsdb_idl_add_column(ovs_idl, _interface_col_external_ids);
 }

+
 static void
-get_local_iface_ids(const struct ovsrec_bridge *br_int, struct sset 
*lports)
+update_local_zone_ids(const struct ovsrec_bridge *br_int, struct simap 
*ct_zones,

+  struct controller_ctx *ctx)
 {
 int i;
+struct smap new;
+int zone_id;
+char *zone;
+
+if (!ctx->ovs_idl_txn) {
+return;
+}
+
+for (i = 0; i < br_int->n_ports; i++) {
+const struct ovsrec_port *port_rec = br_int->ports[i];
+const char *iface_id;
+int j;
+
+if (!strcmp(port_rec->name, br_int->name)) {
+continue;
+}
+
+for (j = 0; j < port_rec->n_interfaces; j++) {
+const struct ovsrec_interface *iface_rec;
+
+iface_rec = port_rec->interfaces[j];
+iface_id = smap_get(_rec->external_ids, "iface-id");
+
+if (!iface_id || smap_get(_rec->external_ids, 
"zone-id") ||

+!simap_contains(ct_zones, iface_id)) {
+continue;
+}
+
+zone_id = simap_get(ct_zones, iface_id);
+zone = xasprintf("%d", zone_id);
+smap_clone(, _rec->external_ids);
+smap_replace(, "zone-id", zone);
+ovsrec_interface_verify_external_ids(iface_rec);
+ovsrec_interface_set_external_ids(iface_rec, );
+free(zone);
+smap_destroy();
+}
+}
+}
+
+
+static void
+get_local_iface_ids(const struct ovsrec_bridge *br_int, struct sset 
*lports,
+struct simap *ct_zones, unsigned long 
*ct_zone_bitmap)

+{
+int i;
+const char *zone;
+int zone_id;

 for (i = 0; i < br_int->n_ports; i++) {
 const struct ovsrec_port *port_rec = br_int->ports[i];
@@ -69,13 +119,21 @@ get_local_iface_ids(const struct ovsrec_bridge 
*br_int, struct sset *lports)

 continue;
 }
 sset_add(lports, iface_id);
+zone = smap_get(_rec->external_ids, "zone-id");
+if (zone && ovs_scan(zone, "%d", _id)) {
+bitmap_set1(ct_zone_bitmap, zone_id);
+simap_put(ct_zones, iface_id, zone_id);
+}
 }
 }
 }

+
 static void
 update_ct_zones(struct sset *lports, struct simap *ct_zones,
-unsigned long *ct_zone_bitmap)
+unsigned long *ct_zone_bitmap,
+const struct ovsrec_bridge *br_int,
+struct controller_ctx *ctx)
 {
 struct simap_node *ct_zone, *ct_zone_next;
 const char *iface_id;
@@ -112,10 +170,8 @@ update_ct_zones(struct sset *lports, struct simap 
*ct_zones,

 bitmap_set1(ct_zone_bitmap, zone);
 simap_put(ct_zones, iface_id, zone);

-/* xxx We should erase any old entries for this
- * xxx zone, but we need a generic interface to the conntrack
- * xxx table. */
 }
+update_local_zone_ids(br_int, ct_zones, ctx);
 }

 static void
@@ -154,7 +210,7 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,

 sset_init();
 sset_init(_lports);
 if (br_int) {
-get_local_iface_ids(br_int, );
+get_local_iface_ids(br_int, , ct_zones, ct_zone_bitmap);
 } else {
 /* We have no integration bridge, therefore no local logical 
ports.
  * We'll remove our chassis from all port binding records 
below. */
@@ -203,7 +259,7 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,

 VLOG_DBG("No port binding record for lport %s", name);
 }

-update_ct_zones(_lports, ct_zones, ct_zone_bitmap);
+update_ct_zones(_lports, ct_zones, ct_zone_bitmap, br_int, 
ctx);


 sset_destroy();
 sset_destroy(_lports);
--
1.8.3.1





___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] ovn-controller: Assign zone-ids consistently

2016-02-08 Thread Ramu Ramamurthy


Resend (apologies) as the mailer mangled spaces on the patch.

Signed-off-by: Ramu Ramamurthy <ramu.ramamur...@us.ibm.com>

---
 ovn/controller/binding.c | 70 
+++-

 1 file changed, 63 insertions(+), 7 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index ce9cccf..28adcef 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -46,10 +46,60 @@ binding_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 ovsdb_idl_add_column(ovs_idl, _interface_col_external_ids);
 }

+
 static void
-get_local_iface_ids(const struct ovsrec_bridge *br_int, struct sset 
*lports)
+update_local_zone_ids(const struct ovsrec_bridge *br_int, struct simap 
*ct_zones,

+  struct controller_ctx *ctx)
 {
 int i;
+struct smap new;
+int zone_id;
+char *zone;
+
+if (!ctx->ovs_idl_txn) {
+return;
+}
+
+for (i = 0; i < br_int->n_ports; i++) {
+const struct ovsrec_port *port_rec = br_int->ports[i];
+const char *iface_id;
+int j;
+
+if (!strcmp(port_rec->name, br_int->name)) {
+continue;
+}
+
+for (j = 0; j < port_rec->n_interfaces; j++) {
+const struct ovsrec_interface *iface_rec;
+
+iface_rec = port_rec->interfaces[j];
+iface_id = smap_get(_rec->external_ids, "iface-id");
+
+if (!iface_id || smap_get(_rec->external_ids, 
"zone-id") ||

+!simap_contains(ct_zones, iface_id)) {
+continue;
+}
+
+zone_id = simap_get(ct_zones, iface_id);
+zone = xasprintf("%d", zone_id);
+smap_clone(, _rec->external_ids);
+smap_replace(, "zone-id", zone);
+ovsrec_interface_verify_external_ids(iface_rec);
+ovsrec_interface_set_external_ids(iface_rec, );
+free(zone);
+smap_destroy();
+}
+}
+}
+
+
+static void
+get_local_iface_ids(const struct ovsrec_bridge *br_int, struct sset 
*lports,
+struct simap *ct_zones, unsigned long 
*ct_zone_bitmap)

+{
+int i;
+const char *zone;
+int zone_id;

 for (i = 0; i < br_int->n_ports; i++) {
 const struct ovsrec_port *port_rec = br_int->ports[i];
@@ -69,13 +119,21 @@ get_local_iface_ids(const struct ovsrec_bridge 
*br_int, struct sset *lports)

 continue;
 }
 sset_add(lports, iface_id);
+zone = smap_get(_rec->external_ids, "zone-id");
+if (zone && ovs_scan(zone, "%d", _id)) {
+bitmap_set1(ct_zone_bitmap, zone_id);
+simap_put(ct_zones, iface_id, zone_id);
+}
 }
 }
 }

+
 static void
 update_ct_zones(struct sset *lports, struct simap *ct_zones,
-unsigned long *ct_zone_bitmap)
+unsigned long *ct_zone_bitmap,
+const struct ovsrec_bridge *br_int,
+struct controller_ctx *ctx)
 {
 struct simap_node *ct_zone, *ct_zone_next;
 const char *iface_id;
@@ -112,10 +170,8 @@ update_ct_zones(struct sset *lports, struct simap 
*ct_zones,

 bitmap_set1(ct_zone_bitmap, zone);
 simap_put(ct_zones, iface_id, zone);

-/* xxx We should erase any old entries for this
- * xxx zone, but we need a generic interface to the conntrack
- * xxx table. */
 }
+update_local_zone_ids(br_int, ct_zones, ctx);
 }

 static void
@@ -154,7 +210,7 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,

 sset_init();
 sset_init(_lports);
 if (br_int) {
-get_local_iface_ids(br_int, );
+get_local_iface_ids(br_int, , ct_zones, ct_zone_bitmap);
 } else {
 /* We have no integration bridge, therefore no local logical 
ports.
  * We'll remove our chassis from all port binding records 
below. */
@@ -203,7 +259,7 @@ binding_run(struct controller_ctx *ctx, const struct 
ovsrec_bridge *br_int,

 VLOG_DBG("No port binding record for lport %s", name);
 }

-update_ct_zones(_lports, ct_zones, ct_zone_bitmap);
+update_ct_zones(_lports, ct_zones, ct_zone_bitmap, br_int, 
ctx);


 sset_destroy();
 sset_destroy(_lports);
--
1.8.3.1



On 2016-02-08 10:48, Ramu Ramamurthy wrote:

Currently, conntrack zone-id is assigned to lport by ovn-controller,
but the ovn-controller does not remember what was earlier assigned
to the same lport (possibly in an earlier run across restart).

So, after ovn-controller restart, the zone-ids may get set 
inconsistently

on lports, resulting in possible hits to already established
connections.

Fix is to remember the zone-id as an external-id of the interface 
record
in the local ovs-db, and recover zone-ids assigned earlier to lports 
from

that record.


Re: [ovs-dev] [PATCH] ovn-controller: Assign zone-ids consistently

2016-02-08 Thread Ramu Ramamurthy

On 2016-02-08 11:12, Russell Bryant wrote:

On 02/08/2016 02:00 PM, Ramu Ramamurthy wrote:


Resend (apologies) as the mailer mangled spaces on the patch.


Can you try posting with the git-send-email utility?


I am posting again from a better behaved mailer

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] ovs segments frames when transmitting to veth

2016-01-13 Thread Ramu Ramamurthy


Problem:
-

We are running centos 7.2 (ovs-kernel module from distribution) with OVS 
2.3.1.


Large tcp frames (> 30Kbytes gro'd) are coming into a vxlan-tunnel port 
and switched by an ovs-bridge onto a veth port


vxlan-tunnel--(gro'd tcp frames sized > 30K)-->OVS-bridge--(frames sized 
1448)-->veth port


As the ovs transmits onto the veth port, the frames are getting 
segmented into 1448 byte sized packets. All the offload settings are at 
default on the veth port (gso on, gro on, tso on)


We expect that the OVS can transmit the frame without segmentation onto 
the veth port. But, due to segmentation, the performance suffers.


Root Cause
---

The following trace with debugs shows the root cause that the gso_type 
is not correctly set on
the skb as the packet comes in, and as a result, the dev_xmit_skb is 
segmenting the frame as ovs transmits it.





ksoftirqd/25-231   [025] ..s.   319.964120: vxlan_rcv: 
gso_size=1448,gso_type=1001


The above is debug in vport-vxlan.c:vxlan_rcv:
trace_printk("gso_size=%d,gso_type=%x\n",skb_shinfo(skb)->gso_size,skb_shinfo(skb)->gso_type);

Here, gso_type seem to have an invalid value.

ksoftirqd/25-231   [025] ..s.   319.964120: ovs_vport_receive 
<-vxlan_rcv
ksoftirqd/25-231   [025] ..s.   319.964121: ovs_flow_key_extract 
<-ovs_vport_receive
ksoftirqd/25-231   [025] ..s.   319.964121: key_extract 
<-ovs_flow_key_extract
ksoftirqd/25-231   [025] ..s.   319.964121: check_header 
<-key_extract

ksoftirqd/25-231   [025] ..s.   319.964121: tcphdr_ok <-key_extract
ksoftirqd/25-231   [025] ..s.   319.964121: ovs_dp_process_packet 
<-ovs_vport_receive
ksoftirqd/25-231   [025] ..s.   319.964121: 
ovs_flow_tbl_lookup_stats <-ovs_dp_process_packet
ksoftirqd/25-231   [025] ..s.   319.964121: masked_flow_lookup 
<-ovs_flow_tbl_lookup_stats
ksoftirqd/25-231   [025] ..s.   319.964121: ovs_flow_mask_key 
<-masked_flow_lookup
ksoftirqd/25-231   [025] ..s.   319.964122: find_bucket.isra.2 
<-masked_flow_lookup
ksoftirqd/25-231   [025] ..s.   319.964122: masked_flow_lookup 
<-ovs_flow_tbl_lookup_stats
ksoftirqd/25-231   [025] ..s.   319.964122: ovs_flow_mask_key 
<-masked_flow_lookup
ksoftirqd/25-231   [025] ..s.   319.964122: find_bucket.isra.2 
<-masked_flow_lookup
ksoftirqd/25-231   [025] ..s.   319.964122: ovs_flow_stats_update 
<-ovs_dp_process_packet
ksoftirqd/25-231   [025] ..s.   319.964122: _raw_spin_loca 
<-ovs_flow_stats_update
ksoftirqd/25-231   [025] ..s.   319.964122: _raw_spin_unlock 
<-ovs_flow_stats_update
ksoftirqd/25-231   [025] ..s.   319.964122: ovs_execute_actions 
<-ovs_dp_process_packet
ksoftirqd/25-231   [025] ..s.   319.964123: do_execute_actions 
<-ovs_execute_actions
ksoftirqd/25-231   [025] ..s.   319.964123: do_output 
<-do_execute_actions
ksoftirqd/25-231   [025] ..s.   319.964123: ovs_lookup_vport 
<-do_output
ksoftirqd/25-231   [025] ..s.   319.964123: ovs_vport_send 
<-do_output
ksoftirqd/25-231   [025] ..s.   319.964123: netdev_send 
<-ovs_vport_send




ksoftirqd/25-231   [025] ..s.   319.964123: netdev_send: devmtu=8950
ksoftirqd/25-231   [025] ..s.   319.964123: netdev_send: 
devname=ovs-veth0

ksoftirqd/25-231   [025] ..s.   319.964124: netdev_send: gso_segs=45
ksoftirqd/25-231   [025] ..s.   319.964124: netif_skb_features 
<-netdev_send
ksoftirqd/25-231   [025] ..s.   319.964124: skb_network_protocol 
<-netif_skb_features

ksoftirqd/25-231   [025] ..s.   319.964124: netdev_send: ip_summed=3
ksoftirqd/25-231   [025] ..s.   319.964124: netdev_send: 
gso_segs=45,len=65226,feat=6103db59e9
ksoftirqd/25-231   [025] ..s.   319.964124: netdev_send: 
net_gso_ok=0,has_frag_list=1,frag_feat=1
ksoftirqd/25-231   [025] ..s.   319.964125: netdev_send: 
gso_size=1448,gso_type=4097,ok=0


  The above are debugs in vport-netdev.c:netdev_send:
trace_printk("devmtu=%d\n",mtu);
trace_printk("devname=%s\n",netdev_vport->dev->name);
trace_printk("gso_segs=%d\n",skb_shinfo(skb)->gso_segs);
features = netif_skb_features(skb);
feature = skb_shinfo(skb)->gso_type << NETIF_F_GSO_SHIFT;
net_gso_ok = ((features & feature) == feature);
gso_ok = skb_gso_ok(skb, features);
has_frag_list = skb_has_frag_list(skb);
frag_feat = features & NETIF_F_FRAGLIST ? 1 : 0;
trace_printk("ip_summed=%d\n",skb->ip_summed);

trace_printk("gso_segs=%d,len=%d,feat=%llx\n",skb_shinfo(skb)->gso_segs,len,features);
   
trace_printk("net_gso_ok=%d,has_frag_list=%d,frag_feat=%d\n",net_gso_ok,has_frag_list,frag_feat);

trace_printk("gso_size=%d,gso_type=%d,ok=%d\n",skb_shinfo(skb)->gso_size,skb_shinfo(skb)->gso_type,gso_ok);





ksoftirqd/25-231   [025] ..s.   319.964125: dev_queue_xmit 
<-netdev_send
ksoftirqd/25-231   [025] ..s.   319.964125: local_bh_disable 
<-dev_queue_xmit
ksoftirqd/25-231   [025] ..s.   

Re: [ovs-dev] ovs segments frames when transmitting to veth

2016-01-13 Thread Ramu Ramamurthy

On 2016-01-13 16:22, Jesse Gross wrote:

On Wed, Jan 13, 2016 at 3:51 PM, Ramu Ramamurthy
<srama...@linux.vnet.ibm.com> wrote:


Problem:
-

We are running centos 7.2 (ovs-kernel module from distribution) with 
OVS

2.3.1.

Large tcp frames (> 30Kbytes gro'd) are coming into a vxlan-tunnel 
port and

switched by an ovs-bridge onto a veth port

vxlan-tunnel--(gro'd tcp frames sized > 30K)-->OVS-bridge--(frames 
sized

1448)-->veth port

As the ovs transmits onto the veth port, the frames are getting 
segmented
into 1448 byte sized packets. All the offload settings are at default 
on the

veth port (gso on, gro on, tso on)

We expect that the OVS can transmit the frame without segmentation 
onto the

veth port. But, due to segmentation, the performance suffers.


I think the problem is that tunnel decapsulation (in all cases, not
just VXLAN, and not just with OVS) doesn't appear to strip off the GSO
type that indicates that it is a tunnel frame. As a result, the veth
(which didn't support tunnel GSO until 3.13) sees this as a tunnel
frame rather than just a TCP frame and forces software segmentation.
The issue isn't with OVS code, so I would recommend that you submit a
patch to upstream Linux.

Thanks, I will look into the fix for this upstream.

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH RFC] - vxlan udp csum and openstack ovs performance

2015-08-27 Thread Ramu Ramamurthy

On 2015-08-27 18:26, Jesse Gross wrote:

On Thu, Aug 27, 2015 at 5:46 PM, Ramu Ramamurthy
srama...@linux.vnet.ibm.com wrote:
We found that enabling UDP checksums for vxlan triggers GRO on the 
receiver,

and boosts the vxlan performance to 8+ Gbps.
Hence, we want to enable vxlan csum for OVS vxlan tunnels. This is 
already

supported in OVS 2.4, and kernel versions 4+ - But we
want this fix to appear in 3.x upstream kernels, and the corresponding
userspace change in OVS 2.3.


Sorry, this isn't going to happen. Neither the upstream kernel nor OVS
backport features to already released versions.

OVS 2.4 has been released with this feature and is packaged with a
kernel module of backports. Please just use that - considering that
you'd have to make changes all the way up the stack, I don't think
it's any more disruptive.


We cannot change the base distribution - we are using RHEL 7 (3.x 
based).


So, you are suggesting to use OVS 2.4 userspace components, and
build the openvswitch kernel-module
shipped with OVS 2.4 along with its dependencies (vxlan, etc) also 
shipped
with OVS 2.4  against the RHEL 7.1 (3.x kernel)  - and not use any of 
the

upstream components for openvswitch, vxlan etc ?



___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH RFC] - vxlan udp csum and openstack ovs performance

2015-08-27 Thread Ramu Ramamurthy


posting again, since the earlier emailer mangles email addresses.

Overview:

We are running openstack kilo with neutron OVS using intel 10G 
adapters (br kx4 dual-port, 82599es).
neutron OVS is using vxlan tunnels. We are using RHEL 7.1 (which uses 
the 3.x kernel).

Openvswitch kernel module is from upstream (3.x) and ovs is 2.3.0 based.

Problem:

We find that the aggregate throughput does not exceed 3Gbps on the 10Gb 
link

when running various VM test scenarios (40VMs--40VMs).
The root cause is the inability to perform GRO at the physical device 
for the vxlan protocol.


See discussion on this problem and its follow ups:
http://marc.info/?t=14352771932r=1w=2

Fix:

We found that enabling UDP checksums for vxlan triggers GRO on the 
receiver, and boosts the vxlan performance to 8+ Gbps.
Hence, we want to enable vxlan csum for OVS vxlan tunnels. This is 
already supported in OVS 2.4, and kernel versions 4+ - But we
want this fix to appear in 3.x upstream kernels, and the corresponding 
userspace change in OVS 2.3.


So, the following set of patches achieves 8+Gbps throughput on 10G 
adapters (vs the 2Gbps without these) on

the above Openstack topology.

1)
   Backport this patch (to ovs 2.3) which allows the csum configuration 
on vxlan ports


http://openvswitch.org/pipermail/dev/2015-March/052823.html

2) Apply this new patch to send the CSUM attribute to the kernel module

--- /home/ovs/tmp/openvswitch-2.3.0/lib/dpif-linux.c 2014-08-14 
16:34:33.0 -0400

+++ lib/dpif-linux.c 2015-08-27 18:56:27.282610654 -0400
@@ -666,10 +666,15 @@
 }

 tnl_cfg = netdev_get_tunnel_config(netdev);
-if (tnl_cfg  tnl_cfg-dst_port != 0) {
+if (tnl_cfg  (tnl_cfg-dst_port != 0 || tnl_cfg-csum)) {
 ofpbuf_use_stack(options, options_stub, sizeof options_stub);
-nl_msg_put_u16(options, OVS_TUNNEL_ATTR_DST_PORT,
-   ntohs(tnl_cfg-dst_port));
+if (tnl_cfg-dst_port != 0) {
+nl_msg_put_u16(options, OVS_TUNNEL_ATTR_DST_PORT,
+   ntohs(tnl_cfg-dst_port));
+}
+if (tnl_cfg-csum) {
+nl_msg_put_flag(options, OVS_TUNNEL_KEY_ATTR_CSUM);
+}
 request.options = ofpbuf_data(options);
 request.options_len = ofpbuf_size(options);
 }

3) Apply this patch to the upstream 3.x kernels to set the UDP csum flag 
when creating the vxlan socket


--- ../linux-3.10.0-229.el7/net/openvswitch/vport-vxlan.c 2015-01-29 
18:15:53.0 -0500

+++ vport-vxlan.c 2015-08-27 18:51:35.619688332 -0400
@@ -100,6 +100,7 @@
  struct nlattr *a;
  u16 dst_port;
  int err;
+ u32 vxlan_sock_flags = 0;

  if (!options) {
  err = -EINVAL;
@@ -114,6 +115,11 @@
  goto error;
  }

+ a = nla_find_nested(options, OVS_TUNNEL_KEY_ATTR_CSUM);
+ if (a) {
+ vxlan_sock_flags |= VXLAN_F_UDP_CSUM;
+ }
+
  vport = ovs_vport_alloc(sizeof(struct vxlan_port),
  ovs_vxlan_vport_ops, parms);
  if (IS_ERR(vport))
@@ -122,7 +128,7 @@
  vxlan_port = vxlan_vport(vport);
  strncpy(vxlan_port-name, parms-name, IFNAMSIZ);

- vs = vxlan_sock_add(net, htons(dst_port), vxlan_rcv, vport, true, 0);
+ vs = vxlan_sock_add(net, htons(dst_port), vxlan_rcv, vport, true, 
vxlan_sock_flags);

  if (IS_ERR(vs)) {
  ovs_vport_free(vport);
  return (void *)vs;


4) Patch Openstack neutron-ovs-agent to optionally set the csum flag on 
vxlan tunnels

(this will be taken to the Openstack Neutron community)

Conclusion:

Since above patches provide a huge performance boost to OVS-based 
Openstack neutron using vxlan tunnels on
widely deployed 10G adapters, we want the community feedback on how best 
to apply the above patches.


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev