Re: [ovs-dev] [PATCH] ovn-controller: update_ct_zone operates always on empty set

2016-07-26 Thread Babu Shanmugam



On Wednesday 27 July 2016 06:43 AM, Russell Bryant wrote:
> On Tue, Jul 26, 2016 at 6:46 AM, wrote:
>
>> From: Babu Shanmugam
>>
>> Commit 263064a (Convert binding_run to incremental processing.)
>> removed the usage of all_lports from binding_run, but it is in fact
>> used in the context of the caller, especially by update_ct_zones().
>>
>> Without this change, update_ct_zones operates on an empty set always.
>>
>> Signed-off-by: Babu Shanmugam
>
> Ouch. This is a really bad regression.  If I understand correctly,
> we're not setting a ct zone ID for any logical ports.  All are just
> using the default zone of 0.

Yes Russell, your understanding is correct.

> We should think about a good way to test OVN's use of conntrack zones
> to ensure that entries end up in separate zones for separate ports.  A
> good test for that may require userspace conntrack support, though.
>
> Another test we could do now would be looking at the flows in table 0
> and ensuring that the input flow for each port has a different
> conntrack zone ID assigned.  That feels like kind of a hack, though.

I agree that we need more test cases. I have not yet had time to work
out a proper approach for a test case, but I will have a look at it.
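
For what it is worth, the table 0 check discussed here boils down to
asserting that no two ports share a zone ID and that none is left at the
default zone 0. A minimal sketch in C, assuming the per-port zone IDs have
already been pulled out of a flow dump into an array (that extraction step
and the name zones_are_distinct() are hypothetical, not OVN code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of OVN: returns true if every port has
 * its own nonzero conntrack zone.  Zone 0 is the default zone, so seeing
 * it means no zone was assigned to that port. */
bool
zones_are_distinct(const uint16_t *zones, size_t n)
{
    assert(zones != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (zones[i] == 0) {
            return false;           /* Still on the default zone. */
        }
        for (size_t j = i + 1; j < n; j++) {
            if (zones[i] == zones[j]) {
                return false;       /* Two ports share a zone. */
            }
        }
    }
    return true;
}
```

With the regression described in this thread, every port would report zone
0 and such a check would fail on the first entry.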


Thank you,
Babu
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn-nbctl: Improve usage message.

2016-07-26 Thread Ben Pfaff
The most important change here is to delete misspelled "the".

Signed-off-by: Ben Pfaff 
---
 ovn/utilities/ovn-nbctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
index e594a32..947e3a1 100644
--- a/ovn/utilities/ovn-nbctl.c
+++ b/ovn/utilities/ovn-nbctl.c
@@ -402,9 +402,9 @@ DHCP Options commands:\n\
   dhcp-options-list\n\
lists the DHCP_Options rows\n\
   dhcp-options-set-options DHCP_OPTIONS_UUID  KEY=VALUE [KEY=VALUE]...\n\
-   set DHCP options to the DHCP_OPTIONS_UUID\n\
+   set DHCP options for DHCP_OPTIONS_UUID\n\
   dhcp-options-get-options DHCO_OPTIONS_UUID \n\
-   displays the DHCP options of th DHCP_OPTIONS_UUID\n\
+   displays the DHCP options for DHCP_OPTIONS_UUID\n\
 \n\
 %s\
 \n\
-- 
2.1.3




Re: [ovs-dev] [PATCH v1 2/3] ovn-controller: Add 'put_dhcpv6_opts' action in ovn-controller

2016-07-26 Thread Numan Siddique
On Wed, Jul 27, 2016 at 2:07 AM, Ben Pfaff  wrote:

> On Wed, Jul 27, 2016 at 12:55:00AM +0530, Numan Siddique wrote:
> > This patch adds a new OVN action 'put_dhcpv6_opts' to support native
> > DHCPv6 in OVN.
> >
> > ovn-controller parses this action and adds a NXT_PACKET_IN2
> > OF flow with 'pause' flag set and the DHCPv6 options stored in
> > 'userdata' field.
> >
> > When the valid DHCPv6 packet is received by ovn-controller, it frames a
> > new DHCPv6 reply packet with the DHCPv6 options present in the
> > 'userdata' field and resumes the packet and stores 1 in the 1-bit
> > subfield.
> > If the packet is invalid, it resumes the packet without modifying it
> > and stores 0 in the 1-bit subfield.
> >
> > Eg. reg0[3] = put_dhcpv6_opts(IA_ADDR = aef0::4,
> >  SERVER_ID = 00:00:00:00:10:02,
> >  DNS_RECURSIVE_SERVER={ae70::1,ae70::2})
> >
> > A new 'DHCPv6_Options' table is added in SB DB which stores
> > the supported DHCPv6 options with DHCPv6 code and type. ovn-northd is
> > expected to populate this table.
> >
> > Upcoming patch will add logical flows with this action.
> >
> > Signed-off-by: Numan Siddique 
>
> Same comment here as previously, that the put_dhcpv6_opts action needs
> documentation in ovn-sb.xml.
>


I think I have added the documentation in ovn-sb.xml. I will recheck.
I will submit another patch set addressing the comments.

Thanks again for the reviews.

Numan
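
For readers unfamiliar with how DHCPv6 options are laid out on the wire:
each option is a 2-byte code, a 2-byte length and then the data, all in
network byte order. The sketch below shows only that generic layout; it is
not how ovn-controller encodes 'userdata', and put_dhcpv6_option() is a
hypothetical name chosen for illustration. Option code 23 (DNS recursive
name server, RFC 3646) is used in the usage note as an example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: appends one DHCPv6 option (code, length, data) to
 * 'buf'.  Returns the number of bytes written, or 0 if it does not fit. */
size_t
put_dhcpv6_option(uint8_t *buf, size_t capacity, uint16_t code,
                  const void *data, uint16_t len)
{
    assert(buf != NULL);
    assert(data != NULL || len == 0);
    if (capacity < 4 + (size_t) len) {
        return 0;
    }
    buf[0] = code >> 8;         /* Option code, big-endian. */
    buf[1] = code & 0xff;
    buf[2] = len >> 8;          /* Option length, big-endian. */
    buf[3] = len & 0xff;
    if (len) {
        memcpy(buf + 4, data, len);
    }
    return 4 + (size_t) len;
}
```

Encoding a single 16-byte IPv6 address as option 23 would produce a
20-byte option: 4 bytes of header followed by the address.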



Re: [ovs-dev] [PATCH v4 1/4] Add support for 802.1ad (QinQ tunneling)

2016-07-26 Thread Ben Pfaff
On Wed, Jul 27, 2016 at 11:51:25AM +0800, Xiao Liang wrote:
> On Tue, Jul 26, 2016 at 12:52 AM, Ben Pfaff  wrote:
> > On Tue, Jul 12, 2016 at 11:38:54PM +0800, Xiao Liang wrote:
> >> Flow key handling changes:
> >> - Add VLAN header array in struct flow, to record multiple 802.1q VLAN
> >>   headers.
> >> - Add dpif multi-VLAN capability probing. If datapath supports multi-VLAN,
> >>   increase the maximum depth of nested OVS_KEY_ATTR_ENCAP.
> >>
> >> Refactor VLAN handling in dpif-xlate:
> >> - Introduce 'xvlan' to track VLAN stack during flow processing.
> >> - Input and output VLAN translation according to the xbundle type.
> >>
> >> Push VLAN action support:
> >> - Allow ethertype 0x88a8 in VLAN headers and push_vlan action.
> >> - Support push_vlan on dot1q packets.
> >>
> >> Add new port VLAN mode "dot1q-tunnel":
> >> - Example:
> >> ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100
> >>   Pushes another VLAN 100 header on packets (tagged and untagged) on
> >>   ingress, and pops it on egress.
> >> - Customer VLAN check:
> >> ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100 cvlans=10,20
> >>   Only customer VLANs 10 and 20 are allowed.
> >>
> >> Signed-off-by: Xiao Liang 
> >
> > The following incremental fixes some warnings from "sparse".  The one
> > from odp-util.c seems petty, but the others correct real conceptual
> > errors even if they would not be bugs in practice.
> >
> > diff --git a/lib/odp-util.c b/lib/odp-util.c
> > index 46ff6de..56a6145 100644
> > --- a/lib/odp-util.c
> > +++ b/lib/odp-util.c
> > @@ -5047,7 +5047,7 @@ parse_8021q_onward(const struct nlattr *attrs[OVS_KEY_ATTR_MAX + 1],
> >
> >  while (encaps < FLOW_MAX_VLAN_HEADERS &&
> > (is_mask?
> > -(src_flow->vlan[encaps].tci & htons(VLAN_CFI)) :
> > +(src_flow->vlan[encaps].tci & htons(VLAN_CFI)) != 0 :
> >  eth_type_vlan(flow->dl_type))) {
> >  /* Calculate fitness of outer attributes. */
> >  encap  = (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_ENCAP)
> > diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
> > index c4b656e..7184184 100644
> > --- a/lib/ofp-actions.c
> > +++ b/lib/ofp-actions.c
> > @@ -1666,7 +1666,7 @@ static void
> >  format_PUSH_VLAN(const struct ofpact_push_vlan *push_vlan, struct ds *s)
> >  {
> >  ds_put_format(s, "%spush_vlan:%s%#"PRIx16,
> > -  colors.param, colors.end, htons(push_vlan->ethertype));
> > +  colors.param, colors.end, ntohs(push_vlan->ethertype));
> >  }
> >
> >  /* Action structure for OFPAT10_SET_DL_SRC/DST and OFPAT11_SET_DL_SRC/DST. */
> > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > index fd41ac8..90cf74a 100644
> > --- a/ofproto/ofproto-dpif-xlate.c
> > +++ b/ofproto/ofproto-dpif-xlate.c
> > @@ -1920,7 +1920,7 @@ xvlan_extract(const struct flow *flow, struct xvlan *xvlan)
> >  !(flow->vlan[i].tci & htons(VLAN_CFI))) {
> >  break;
> >  }
> > -xvlan[i].tpid = htons(flow->vlan[i].tpid);
> > +xvlan[i].tpid = ntohs(flow->vlan[i].tpid);
> >  xvlan[i].vid = vlan_tci_to_vid(flow->vlan[i].tci);
> >  xvlan[i].pcp = flow->vlan[i].tci & htons(VLAN_PCP_MASK);
> >  }
> 
> Thanks for pointing it out.

I'm working on a thorough review of this series and hope to provide it
tomorrow.
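
A note on the htons()/ntohs() changes quoted above: on common platforms
the two functions perform the same operation (a 16-bit byte swap on
little-endian hosts, nothing on big-endian ones), which is why the original
code would not misbehave in practice. The fix is about naming the direction
of the conversion correctly. A small sketch, where ethertype_to_host() and
same_swap() are hypothetical names used only for illustration:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Converts an ethertype stored in network byte order to host byte order.
 * The data goes from network to host, so ntohs() is the call that
 * documents the intent. */
uint16_t
ethertype_to_host(uint16_t ethertype_be)
{
    return ntohs(ethertype_be);
}

/* Returns nonzero if htons() and ntohs() agree on 'x', i.e. if using the
 * "wrong" one would still produce the same bits. */
int
same_swap(uint16_t x)
{
    return htons(x) == ntohs(x);
}
```

Tools such as sparse can still flag the mix-up because they track the
byte order as part of the type, independent of the resulting bits.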


Re: [ovs-dev] [PATCH] ovn-controller: squelch expected duplicate flow warnings

2016-07-26 Thread Ryan Moats
Guru Shetty  wrote on 07/26/2016 10:22:30 PM:

> From: Guru Shetty 
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: Guru Shetty , ovs dev 
> Date: 07/26/2016 10:22 PM
> Subject: Re: [ovs-dev] [PATCH] ovn-controller: squelch expected
> duplicate flow warnings
>
>
> On Jul 26, 2016, at 5:30 PM, Ryan Moats  wrote:

> Guru Shetty  wrote on 07/26/2016 06:05:47 PM:
>
> > From: Guru Shetty 
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: ovs dev 
> > Date: 07/26/2016 06:06 PM
> > Subject: Re: [ovs-dev] [PATCH] ovn-controller: squelch expected
> > duplicate flow warnings
> >
> > On 26 July 2016 at 15:54, Ryan Moats  wrote:
> >
> >
> >
> > Guru Shetty  wrote on 07/26/2016 03:54:29 PM:
> >
> > > From: Guru Shetty 
> > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > Cc: ovs dev 
> > > Date: 07/26/2016 03:54 PM
> > > Subject: Re: [ovs-dev] [PATCH] ovn-controller: squelch expected
> > > duplicate flow warnings
> > >
> > > On 24 July 2016 at 10:07, Ryan Moats  wrote:
> > > In the physical processing of ovn-controller, there are two
> > > sets of OF flows that are still fully recalculated every cycle:
> > >
> > >   Flows that aren't associated with any logical flow, and
> > >   Flows calculated based on multicast groups
> > >
> > > Because these flows are recalculated fully each cycle, full
> > > duplicates of existing OF flows are created and the OF management
> > > code in ovn-controller pollutes the logs with false positive
> > > warnings about repeated duplicates.
> > >
> > > As a short term measure, ignore full duplicates for both of
> > > these types of flows, but still warn if the action changes
> > > (as that is not expected and may be indicative of a problem).
> > >
> > > Signed-off-by: Ryan Moats 
> > >
> > > I also noticed that "commit 70c7cfef188b5ae9940abd5 (ovn-controller:
> > > Add incremental processing to lflow_run and physical_run)" causes
> > > load balancing system unit tests to fail. A little debugging shows
> > > that groups are getting deleted when new flows are added.  My hunch
> > > > is that this is likely because 'desired_groups' in ofctrl_put gets
> > > deleted in every run. But in the next run, it does not get updated
> > > as we no longer process all flows.
> >
> > That's going to take persisting the desired_groups data.
> >
> > I can take a shot if you'd like, just give me the link to the
> > patch set that includes the load balancing system unit tests
> > and I'll see what I can do to make it right ...
> >
> > It already exists in the OVN repo. tests/system-ovn.at
>
> Ack and verified that it is failing - I'll take a deeper look
> later tonight/tomorrow and see what I can make work.
>
> Thanks much.
>
> (Just to make sure you have the environment right, you should have
> the right kernel modules with conntrack support installed on your
> machine. On master, it will only work on pre-4.6 kernels if there is
> no OVS kernel module already installed from the upstream kernel. To
> make it work, you should either remove the upstream kernel modules or
> install an /etc/depmod.d/openvswitch.conf to override the upstream
> one. On 4.6 and above it should not matter, as the upstream kernel
> module has conntrack support.
>
> You can make sure that you get the tests working before the said
> commit so that you don't go on a wild goose chase.)

Mitigation patch is at http://patchwork.ozlabs.org/patch/653068/ for
review.

In my previous message, I incorrectly stated that the above patch didn't
handle flow modifications correctly.  It actually does.
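
For anyone skimming the thread, the behaviour described in the original
commit message quoted above (skip full duplicates silently, but still warn
when only the actions differ) can be sketched roughly as follows. This is a
simplified illustration, not the ovn-controller code; struct simple_flow
and classify_duplicate() are hypothetical names, and matches and actions
are compared as plain strings only to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for an installed OpenFlow flow (hypothetical). */
struct simple_flow {
    uint8_t table_id;
    uint16_t priority;
    const char *match;      /* Match in text form, e.g. "in_port=1". */
    const char *actions;    /* Actions in text form, e.g. "drop". */
};

enum dup_kind {
    DUP_NONE,           /* Different flow: install it. */
    DUP_FULL,           /* Same key and same actions: skip silently. */
    DUP_ACTION_CHANGE   /* Same key, different actions: worth a warning. */
};

/* Classifies 'new_flow' against an already known flow 'old'. */
enum dup_kind
classify_duplicate(const struct simple_flow *old,
                   const struct simple_flow *new_flow)
{
    assert(old != NULL && new_flow != NULL);
    if (old->table_id != new_flow->table_id
        || old->priority != new_flow->priority
        || strcmp(old->match, new_flow->match)) {
        return DUP_NONE;
    }
    return strcmp(old->actions, new_flow->actions)
           ? DUP_ACTION_CHANGE : DUP_FULL;
}
```

Only the DUP_ACTION_CHANGE case would be logged, which keeps the expected
per-cycle recalculation from flooding the logs.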


[ovs-dev] [PATCH] ovn-controller: Persist desired conntrack groups.

2016-07-26 Thread Ryan Moats
[1] indicates that with incremental processing of logical flows
desired conntrack groups are not being persisted.  This patch
adds this capability, with the side effect of adding a ds_copy
method that this capability leverages.

[1] http://openvswitch.org/pipermail/dev/2016-July/076320.html

Signed-off-by: Ryan Moats 
---
 include/openvswitch/dynamic-string.h |  1 +
 lib/dynamic-string.c |  8 
 ovn/controller/lflow.c   |  4 +++-
 ovn/controller/ofctrl.c  | 33 +++--
 ovn/controller/ofctrl.h  |  3 +++
 ovn/lib/actions.c| 11 ---
 ovn/lib/actions.h|  9 +++--
 tests/test-ovn.c |  5 -
 8 files changed, 57 insertions(+), 17 deletions(-)

diff --git a/include/openvswitch/dynamic-string.h b/include/openvswitch/dynamic-string.h
index dfe2688..398e41a 100644
--- a/include/openvswitch/dynamic-string.h
+++ b/include/openvswitch/dynamic-string.h
@@ -73,6 +73,7 @@ void ds_swap(struct ds *, struct ds *);
 
 int ds_last(const struct ds *);
 bool ds_chomp(struct ds *, int c);
+void ds_copy(struct ds *, struct ds *);
 
 /* Inline functions. */
 
diff --git a/lib/dynamic-string.c b/lib/dynamic-string.c
index 1f17a9f..692468f 100644
--- a/lib/dynamic-string.c
+++ b/lib/dynamic-string.c
@@ -456,3 +456,11 @@ ds_chomp(struct ds *ds, int c)
 return false;
 }
 }
+
+void
+ds_copy(struct ds *dst, struct ds *source)
+{
+dst->length = source->length;
+dst->allocated = source->allocated;
+dst->string = xmemdup(source->string, source->allocated + 1);
+}
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index 42c9055..67b702c 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -383,6 +383,7 @@ add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
 
 if (full_flow_processing) {
 ovn_flow_table_clear();
+ovn_group_table_clear(group_table, false);
 full_logical_flow_processing = true;
 full_neighbor_flow_processing = true;
 full_flow_processing = false;
@@ -522,7 +523,8 @@ consider_logical_flow(const struct lport_index *lports,
 .output_ptable = output_ptable,
 .arp_ptable = OFTABLE_MAC_BINDING,
 };
-error = actions_parse_string(lflow->actions, , , );
+error = actions_parse_string(lflow->actions, , , ,
+ >header_.uuid);
 if (error) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
 VLOG_WARN_RL(, "error parsing actions \"%s\": %s",
diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
index f0451b7..40591c7 100644
--- a/ovn/controller/ofctrl.c
+++ b/ovn/controller/ofctrl.c
@@ -118,9 +118,6 @@ static enum mf_field_id mff_ovn_geneve;
 
 static void ovn_flow_table_destroy(void);
 
-static void ovn_group_table_clear(struct group_table *group_table,
-  bool existing);
-
 static void ofctrl_recv(const struct ofp_header *, enum ofptype);
 
 static struct hmap match_flow_table = HMAP_INITIALIZER(_flow_table);
@@ -630,6 +627,16 @@ ofctrl_remove_flows(const struct uuid *uuid)
 ovn_flow_destroy(f);
 }
 }
+
+/* Remove any group_info information created by this logical flow. */
+struct group_info *g, *next_g;
+HMAP_FOR_EACH_SAFE (g, next_g, hmap_node, >desired_groups) {
+if (uuid_equals(>lflow_uuid, uuid)) {
+hmap_remove(>desired_groups, >hmap_node);
+ds_destroy(>group);
+free(g);
+}
+}
 }
 
 /* Shortcut to remove all flows matching the supplied UUID and add this
@@ -777,6 +784,15 @@ queue_flow_mod(struct ofputil_flow_mod *fm)
 
 /* group_table. */
 
+static struct group_info *
+group_info_clone(struct group_info *source) {
+struct group_info *clone = xmalloc(sizeof *clone);
+ds_copy(>group, >group);
+clone->group_id = source->group_id;
+clone->hmap_node.hash = source->hmap_node.hash;
+return clone;
+}
+
 /* Finds and returns a group_info in 'existing_groups' whose key is identical
  * to 'target''s key, or NULL if there is none. */
 static struct group_info *
@@ -795,7 +811,7 @@ ovn_group_lookup(struct hmap *exisiting_groups,
 }
 
 /* Clear either desired_groups or existing_groups in group_table. */
-static void
+void
 ovn_group_table_clear(struct group_table *group_table, bool existing)
 {
 struct group_info *g, *next;
@@ -1000,13 +1016,10 @@ ofctrl_put(struct group_table *group_table)
 /* Move the contents of desired_groups to existing_groups. */
 HMAP_FOR_EACH_SAFE(desired, next_group, hmap_node,
_table->desired_groups) {
-hmap_remove(_table->desired_groups, >hmap_node);
 if (!ovn_group_lookup(_table->existing_groups, desired)) {
-hmap_insert(_table->existing_groups, >hmap_node,
-desired->hmap_node.hash);
-} 
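
One thing that may be worth checking in the ds_copy() shown earlier in this
patch: if a source string has never been written to and its 'string' member
is still NULL (which, as far as I recall, is how an empty 'struct ds'
starts out), duplicating 'allocated + 1' bytes from it would read through a
null pointer. Below is a sketch of a copy that handles that case. It uses a
stand-in 'struct dstr' and plain malloc() so that it compiles on its own;
both the struct and dstr_copy() are hypothetical, and unlike the patch it
copies only 'length + 1' bytes rather than the full allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a dynamic string; redeclared here for illustration only. */
struct dstr {
    char *string;       /* Null-terminated buffer, or NULL if empty. */
    size_t length;      /* Bytes in use, not counting the terminator. */
    size_t allocated;   /* Bytes allocated, not counting the terminator. */
};

/* Deep-copies 'src' into 'dst', overwriting (not freeing) whatever 'dst'
 * held before.  The copy is sized to the bytes actually in use. */
void
dstr_copy(struct dstr *dst, const struct dstr *src)
{
    assert(dst != NULL && src != NULL);
    dst->length = src->length;
    if (src->string == NULL) {
        dst->string = NULL;         /* Nothing to duplicate. */
        dst->allocated = 0;
        return;
    }
    dst->allocated = src->length;
    dst->string = malloc(src->length + 1);
    if (dst->string == NULL) {
        abort();                    /* Out of memory. */
    }
    memcpy(dst->string, src->string, src->length);
    dst->string[src->length] = '\0';
}
```

The caller owns the copy and is responsible for freeing it.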

Re: [ovs-dev] [PATCH] ovsdb: Fix memory leak in execute_update.

2016-07-26 Thread nickcooper-zhangtonghao
Hi William,
I reviewed your patch and found other bugs.

If 'ovsdb_condition_from_json' returns an error, cnd->clauses will be
set to NULL, so 'ovsdb_condition_destroy' should check 'cnd->clauses'
before freeing it.

The same applies to 'ovsdb_row_from_json'.


Signed-off-by: nickcooper-zhangtonghao 
---
 ovsdb/column.c  | 5 -
 ovsdb/condition.c   | 6 +-
 ovsdb/replication.c | 3 +++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/ovsdb/column.c b/ovsdb/column.c
index 8838df3..f01301f 100644
--- a/ovsdb/column.c
+++ b/ovsdb/column.c
@@ -129,7 +129,10 @@ ovsdb_column_set_init(struct ovsdb_column_set *set)
 void
 ovsdb_column_set_destroy(struct ovsdb_column_set *set)
 {
-free(set->columns);
+if (set->columns) {
+free(set->columns);
+set->columns = NULL;
+}
 }

 void
diff --git a/ovsdb/condition.c b/ovsdb/condition.c
index 6da3b08..f337a37 100644
--- a/ovsdb/condition.c
+++ b/ovsdb/condition.c
@@ -485,7 +485,11 @@ ovsdb_condition_destroy(struct ovsdb_condition *cnd)
 for (i = 0; i < cnd->n_clauses; i++) {
 ovsdb_clause_free(>clauses[i]);
 }
-free(cnd->clauses);
+
+if (cnd->clauses) {
+free(cnd->clauses);
+cnd->clauses = NULL;
+}
 cnd->n_clauses = 0;

 ovsdb_condition_optimize_destroy(cnd);
diff --git a/ovsdb/replication.c b/ovsdb/replication.c
index 52b7085..bfd2ca1 100644
--- a/ovsdb/replication.c
+++ b/ovsdb/replication.c
@@ -568,6 +568,8 @@ execute_delete(struct ovsdb_txn *txn, const char *uuid,
 }

 ovsdb_condition_destroy();
+json_destroy(CONST_CAST(struct json *, where));
+
 return error;
 }

@@ -625,6 +627,7 @@ execute_update(struct ovsdb_txn *txn, const char *uuid,
 ovsdb_row_destroy(row);
 ovsdb_column_set_destroy();
 ovsdb_condition_destroy();
+json_destroy(CONST_CAST(struct json *, where));

 return error;
 }
--
1.8.3.1
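
A side note on the guards added in column.c and condition.c above: the C
standard defines free(NULL) as a no-op, so the 'if' is not needed to avoid
a crash. What the change does add is resetting the pointer, which makes a
second destroy call harmless. A sketch of that pattern, using a
hypothetical 'struct column_set' rather than the real ovsdb type:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a set that owns a heap-allocated array. */
struct column_set {
    int *columns;
    size_t n_columns;
};

/* Releases the array.  Safe to call on an empty set and safe to call
 * twice, because the pointer is cleared after it is freed. */
void
column_set_destroy(struct column_set *set)
{
    assert(set != NULL);
    free(set->columns);     /* free(NULL) is defined to do nothing. */
    set->columns = NULL;
    set->n_columns = 0;
}
```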


> From: William Tu 
> To: dev@openvswitch.org
> Subject: [ovs-dev] [PATCH] ovsdb: Fix memory leak in execute_update.
> Message-ID: <1469582910-64371-1-git-send-email-u9012...@gmail.com>
> 
> Valgrind testcase 1804 ovsdb-server.at:1023 insert rows, update rows by value
> reports the following leak.
>json_from_string (json.c:1025)
>execute_update (replication.c:614), similarly at execute_delete()
>process_table_update (replication.c:502)
>process_notification.part.5 (replication.c:445)
>process_notification (replication.c:402)
>check_for_notifications (replication.c:418)
>replication_run (replication.c:110)
> 
> Signed-off-by: William Tu 
> ---
> ovsdb/replication.c | 3 +++
> 1 file changed, 3 insertions(+)
> 
> diff --git a/ovsdb/replication.c b/ovsdb/replication.c
> index af7ae5c..fe89d39 100644
> --- a/ovsdb/replication.c
> +++ b/ovsdb/replication.c
> @@ -573,6 +573,8 @@ execute_delete(struct ovsdb_txn *txn, const char *uuid,
> }
> 
> ovsdb_condition_destroy();
> +json_destroy(CONST_CAST(struct json *, where));
> +
> return error;
> }
> 
> @@ -630,6 +632,7 @@ execute_update(struct ovsdb_txn *txn, const char *uuid,
> ovsdb_row_destroy(row);
> ovsdb_column_set_destroy();
> ovsdb_condition_destroy();
> +json_destroy(CONST_CAST(struct json *, where));
> 
> return error;
> }
> -- 
> 2.5.0





Re: [ovs-dev] [PATCH v4 1/4] Add support for 802.1ad (QinQ tunneling)

2016-07-26 Thread Xiao Liang
On Tue, Jul 26, 2016 at 12:52 AM, Ben Pfaff  wrote:
> On Tue, Jul 12, 2016 at 11:38:54PM +0800, Xiao Liang wrote:
>> Flow key handling changes:
>> - Add VLAN header array in struct flow, to record multiple 802.1q VLAN
>>   headers.
>> - Add dpif multi-VLAN capability probing. If datapath supports multi-VLAN,
>>   increase the maximum depth of nested OVS_KEY_ATTR_ENCAP.
>>
>> Refactor VLAN handling in dpif-xlate:
>> - Introduce 'xvlan' to track VLAN stack during flow processing.
>> - Input and output VLAN translation according to the xbundle type.
>>
>> Push VLAN action support:
>> - Allow ethertype 0x88a8 in VLAN headers and push_vlan action.
>> - Support push_vlan on dot1q packets.
>>
>> Add new port VLAN mode "dot1q-tunnel":
>> - Example:
>> ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100
>>   Pushes another VLAN 100 header on packets (tagged and untagged) on ingress,
>>   and pops it on egress.
>> - Customer VLAN check:
>> ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100 cvlans=10,20
>>   Only customer VLANs 10 and 20 are allowed.
>>
>> Signed-off-by: Xiao Liang 
>
> The following incremental fixes some warnings from "sparse".  The one
> from odp-util.c seems petty, but the others correct real conceptual
> errors even if they would not be bugs in practice.
>
> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index 46ff6de..56a6145 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
> @@ -5047,7 +5047,7 @@ parse_8021q_onward(const struct nlattr *attrs[OVS_KEY_ATTR_MAX + 1],
>
>  while (encaps < FLOW_MAX_VLAN_HEADERS &&
> (is_mask?
> -(src_flow->vlan[encaps].tci & htons(VLAN_CFI)) :
> +(src_flow->vlan[encaps].tci & htons(VLAN_CFI)) != 0 :
>  eth_type_vlan(flow->dl_type))) {
>  /* Calculate fitness of outer attributes. */
>  encap  = (present_attrs & (UINT64_C(1) << OVS_KEY_ATTR_ENCAP)
> diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
> index c4b656e..7184184 100644
> --- a/lib/ofp-actions.c
> +++ b/lib/ofp-actions.c
> @@ -1666,7 +1666,7 @@ static void
>  format_PUSH_VLAN(const struct ofpact_push_vlan *push_vlan, struct ds *s)
>  {
>  ds_put_format(s, "%spush_vlan:%s%#"PRIx16,
> -  colors.param, colors.end, htons(push_vlan->ethertype));
> +  colors.param, colors.end, ntohs(push_vlan->ethertype));
>  }
>
>  /* Action structure for OFPAT10_SET_DL_SRC/DST and OFPAT11_SET_DL_SRC/DST. */
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index fd41ac8..90cf74a 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -1920,7 +1920,7 @@ xvlan_extract(const struct flow *flow, struct xvlan *xvlan)
>  !(flow->vlan[i].tci & htons(VLAN_CFI))) {
>  break;
>  }
> -xvlan[i].tpid = htons(flow->vlan[i].tpid);
> +xvlan[i].tpid = ntohs(flow->vlan[i].tpid);
>  xvlan[i].vid = vlan_tci_to_vid(flow->vlan[i].tci);
>  xvlan[i].pcp = flow->vlan[i].tci & htons(VLAN_PCP_MASK);
>  }

Thanks for pointing it out.
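
For reference, the ingress behaviour described for "dot1q-tunnel" in the
commit message quoted above (push an outer tag with ethertype 0x88a8 in
front of whatever the frame already carries) looks roughly like this at
the byte level. This is a standalone sketch, not OVS code; push_vlan_tag()
and the two constants are hypothetical names, and the PCP and DEI bits are
simply left at zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_ADDRS_LEN 12    /* Destination MAC + source MAC. */
#define VLAN_TAG_LEN 4      /* TPID (2 bytes) + TCI (2 bytes). */

/* Inserts a VLAN tag right after the MAC addresses of the Ethernet frame
 * in 'frame' ('len' bytes used, 'capacity' bytes available).  Returns the
 * new frame length, or 0 if the frame is too short, the buffer too small
 * or the VLAN ID out of range. */
size_t
push_vlan_tag(uint8_t *frame, size_t len, size_t capacity,
              uint16_t tpid, uint16_t vid)
{
    assert(frame != NULL);
    if (len < MAC_ADDRS_LEN + 2 || len + VLAN_TAG_LEN > capacity
        || vid > 0xfff) {
        return 0;
    }
    /* Shift everything after the MAC addresses to make room. */
    memmove(frame + MAC_ADDRS_LEN + VLAN_TAG_LEN, frame + MAC_ADDRS_LEN,
            len - MAC_ADDRS_LEN);
    frame[12] = tpid >> 8;      /* TPID, big-endian. */
    frame[13] = tpid & 0xff;
    frame[14] = vid >> 8;       /* TCI: PCP and DEI zero, then the VID. */
    frame[15] = vid & 0xff;
    return len + VLAN_TAG_LEN;
}
```

Calling it with tpid 0x88a8 and vid 100 on a frame that is already
802.1Q-tagged leaves the inner tag intact behind the new outer one, which
is the stacking that the 'tag=100' example relies on.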


[ovs-dev] [PATCH] ovsdb-client: Fix memory leak reported by Valgrind.

2016-07-26 Thread William Tu
Testcase 1857: ovsdb-monitor.at:538 monitor-cond-change reports the
following definite memory leak:
ovsdb_schema_create (ovsdb.c:34)
ovsdb_schema_from_json (ovsdb.c:196)
fetch_schema (ovsdb-client.c:385)
do_monitor_cond (ovsdb-client.c:1112)

Signed-off-by: William Tu 
---
 ovsdb/ovsdb-client.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/ovsdb/ovsdb-client.c b/ovsdb/ovsdb-client.c
index 7dcc07c..1f83f3b 100644
--- a/ovsdb/ovsdb-client.c
+++ b/ovsdb/ovsdb-client.c
@@ -1120,6 +1120,7 @@ do_monitor_cond(struct jsonrpc *rpc, const char *database,
 NULL, ));
 ovsdb_condition_destroy();
 do_monitor__(rpc, database, OVSDB_MONITOR_V2, --argc, ++argv, condition);
+ovsdb_schema_destroy(schema);
 }
 
 struct dump_table_aux {
-- 
2.5.0



Re: [ovs-dev] [PATCH] ovn-controller: update_ct_zone operates always on empty set

2016-07-26 Thread Guru Shetty


> On Jul 26, 2016, at 6:13 PM, Russell Bryant  wrote:
> 
>> On Tue, Jul 26, 2016 at 6:46 AM,  wrote:
>> 
>> From: Babu Shanmugam 
>> 
>> Commit 263064a (Convert binding_run to incremental processing.) removed
>> the usage of all_lports from binding_run, but it is in fact used in the
>> context of the caller, especially by update_ct_zones().
>> 
>> Without this change, update_ct_zones operates on an empty set always.
>> 
>> Signed-off-by: Babu Shanmugam 
> 
> Ouch. This is a really bad regression.  If I understand correctly, we're
> not setting a ct zone ID for any logical ports.  All are just using the
> default zone of 0.
> 
> We should think about a good way to test OVN's use of conntrack zones to
> ensure that entries end up in separate zones for separate ports.  A good
> test for that may require userspace conntrack support, though.

I have added a couple of OVN tests in system-ovn.at that leverage the
kernel module and conntrack. A basic firewall test can be added.

> 
> Another test we could do now would be looking at the flows in table 0 and
> ensuring that the input flow for each port has a different conntrack zone
> ID assigned.  That feels like kind of a hack, though.
> 
> ---
>> ovn/controller/binding.c| 4 +++-
>> ovn/controller/binding.h| 3 ++-
>> ovn/controller/ovn-controller.c | 2 +-
>> 3 files changed, 6 insertions(+), 3 deletions(-)
>> 
>> diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
>> index e83c1d5..7bc6fb4 100644
>> --- a/ovn/controller/binding.c
>> +++ b/ovn/controller/binding.c
>> @@ -230,7 +230,8 @@ consider_local_datapath(struct controller_ctx *ctx,
>> 
>> void
>> binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
>> -const char *chassis_id, struct hmap *local_datapaths)
>> +const char *chassis_id, struct hmap *local_datapaths,
>> +struct sset *all_lports)
>> {
>> const struct sbrec_chassis *chassis_rec;
>> const struct sbrec_port_binding *binding_rec;
>> @@ -292,6 +293,7 @@ binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
>> }
>> }
>> 
>> +sset_clone(all_lports, _ids);
> 
> I don't think this is quite sufficient.  It's missing, at least:
> 
> - the IDs of sub-ports
> 
> - localnet ports
> 
> The old handling of building up all_lports ensured those got added.
> 
> 
>> shash_destroy(_to_iface);
>> }
>> 
>> diff --git a/ovn/controller/binding.h b/ovn/controller/binding.h
>> index 8753d44..fbd16c8 100644
>> --- a/ovn/controller/binding.h
>> +++ b/ovn/controller/binding.h
>> @@ -29,7 +29,8 @@ struct sset;
>> void binding_register_ovs_idl(struct ovsdb_idl *);
>> void binding_reset_processing(void);
>> void binding_run(struct controller_ctx *, const struct ovsrec_bridge *br_int,
>> - const char *chassis_id, struct hmap *local_datapaths);
>> + const char *chassis_id, struct hmap *local_datapaths,
>> + struct sset *all_lports);
>> bool binding_cleanup(struct controller_ctx *, const char *chassis_id);
>> 
>> #endif /* ovn/binding.h */
>> diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
>> index 4d9490a..6a6bb93 100644
>> --- a/ovn/controller/ovn-controller.c
>> +++ b/ovn/controller/ovn-controller.c
>> @@ -425,7 +425,7 @@ main(int argc, char *argv[])
>> if (chassis_id) {
>> chassis_run(, chassis_id);
>> encaps_run(, br_int, chassis_id);
>> -binding_run(, br_int, chassis_id, _datapaths);
>> +binding_run(, br_int, chassis_id, _datapaths,
>> _lports);
>> }
>> 
>> if (br_int && chassis_id) {
>> --
>> 2.5.5
>> 
> 
> 
> 
> -- 
> Russell Bryant


Re: [ovs-dev] [PATCH] ovn-controller: squelch expected duplicate flow warnings

2016-07-26 Thread Guru Shetty


> On Jul 26, 2016, at 5:30 PM, Ryan Moats  wrote:
> 
> Guru Shetty  wrote on 07/26/2016 06:05:47 PM:
> 
> > From: Guru Shetty 
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: ovs dev 
> > Date: 07/26/2016 06:06 PM
> > Subject: Re: [ovs-dev] [PATCH] ovn-controller: squelch expected 
> > duplicate flow warnings
> > 
> > On 26 July 2016 at 15:54, Ryan Moats  wrote:
> > 
> > 
> > 
> > Guru Shetty  wrote on 07/26/2016 03:54:29 PM:
> > 
> > > From: Guru Shetty 
> > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > Cc: ovs dev 
> > > Date: 07/26/2016 03:54 PM
> > > Subject: Re: [ovs-dev] [PATCH] ovn-controller: squelch expected
> > > duplicate flow warnings
> > >
> > > On 24 July 2016 at 10:07, Ryan Moats  wrote:
> > > In the physical processing of ovn-controller, there are two
> > > sets of OF flows that are still fully recalculated every cycle:
> > >
> > >   Flows that aren't associated with any logical flow, and
> > >   Flows calculated based on multicast groups
> > >
> > > Because these flows are recalculated fully each cycle, full
> > > duplicates of existing OF flows are created and the OF management
> > > code in ovn-controller pollutes the logs with false positive
> > > warnings about repeated duplicates.
> > >
> > > As a short term measure, ignore full duplicates for both of
> > > these types of flows, but still warn if the action changes
> > > (as that is not expected and may be indicative of a problem).
> > >
> > > Signed-off-by: Ryan Moats 
> > >
> > > I also noticed that "commit 70c7cfef188b5ae9940abd5 (ovn-controller:
> > > Add incremental processing to lflow_run and physical_run)" causes
> > > load balancing system unit tests to fail. A little debugging shows
> > > that groups are getting deleted when new flows are added.  My hunch
> > > > is that this is likely because 'desired_groups' in ofctrl_put gets
> > > deleted in every run. But in the next run, it does not get updated
> > > as we no longer process all flows.
> > 
> > That's going to take persisting the desired_groups data.
> > 
> > I can take a shot if you'd like, just give me the link to the
> > patch set that includes the load balancing system unit tests
> > and I'll see what I can do to make it right ...
> > 
> > It already exists in the OVN repo. tests/system-ovn.at
> 
> Ack and verified that it is failing - I'll take a deeper look
> later tonight/tomorrow and see what I can make work.
> 

Thanks much. 

(Just to make sure you have the environment right, you should have the right
kernel modules with conntrack support installed on your machine. On master, it
will only work on pre-4.6 kernels if there is no OVS kernel module already
installed from the upstream kernel. To make it work, you should either remove
the upstream kernel modules or install an /etc/depmod.d/openvswitch.conf to
override the upstream one. On 4.6 and above it should not matter, as the
upstream kernel module has conntrack support.

You can make sure that you get the tests working before the said commit so
that you don't go on a wild goose chase.)


[ovs-dev] [PATCH] ovsdb: Fix memory leak reported by Valgrind.

2016-07-26 Thread William Tu
Valgrind testcase 1967: simple idl, conditional, modify as delete due
to condition - C reports the following leak:
json_array_create_empty (json.c:185)
json_parser_push_array (json.c:1234)
json_parser_input (json.c:1328)
json_lex_input (json.c:945)
json_parser_feed (json.c:1103)
json_from_string (json.c:1025)
parse_json (test-ovsdb.c:227)
update_conditions (test-ovsdb.c:2324)
do_idl (test-ovsdb.c:2389)
ovs_cmdl_run_command (command-line.c:121)
main (test-ovsdb.c:73)

Signed-off-by: William Tu 
---
 tests/test-ovsdb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/test-ovsdb.c b/tests/test-ovsdb.c
index c21001e..4a68bca 100644
--- a/tests/test-ovsdb.c
+++ b/tests/test-ovsdb.c
@@ -2344,6 +2344,7 @@ update_conditions(struct ovsdb_idl *idl, char *commands)
 parse_link2_json_clause(idl, add_cmd, json->u.array.elems[i]);
 }
 }
+json_destroy(json);
 }
 }
 
-- 
2.5.0
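
The fix follows the usual ownership rule: the caller that parses a JSON value owns the resulting tree and has to destroy it after the last use. Below is a minimal, self-contained sketch of that pattern. It deliberately does not use the OVS `struct json` API; `toy_array`, `toy_parse()`, `toy_destroy()` and the `live_arrays` counter are hypothetical stand-ins invented for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a parsed JSON array; not the OVS 'struct json'. */
struct toy_array {
    size_t n;
    int *elems;
};

/* Counts toy_array objects that are allocated and not yet destroyed. */
static int live_arrays;

/* Returns a new array owned by the caller, who must call toy_destroy(). */
static struct toy_array *
toy_parse(size_t n)
{
    struct toy_array *a = malloc(sizeof *a);
    if (!a) {
        abort();
    }
    a->n = n;
    a->elems = calloc(n ? n : 1, sizeof *a->elems);
    if (!a->elems) {
        abort();
    }
    live_arrays++;
    return a;
}

/* Releases an array returned by toy_parse().  NULL is accepted. */
static void
toy_destroy(struct toy_array *a)
{
    if (a) {
        free(a->elems);
        free(a);
        live_arrays--;
    }
}

/* Same shape as the patched loop: parse, walk the elements, then release
 * the tree before leaving the scope that owns it. */
static size_t
process_command(size_t n)
{
    struct toy_array *a = toy_parse(n);
    size_t visited = 0;

    for (size_t i = 0; i < a->n; i++) {
        visited++;
    }
    toy_destroy(a);   /* Dropping this call is the analogue of the leak. */
    return visited;
}
```

After `process_command()` returns, `live_arrays` is back to zero, which is the toy equivalent of a clean Valgrind run.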



[ovs-dev] [PATCH, v4] Scanning only changed entries in the ovnsb

2016-07-26 Thread Hui Kang
- Improve performance by scanning only changed port binding entries
when determining whether to mark the logical switch port up or
down

v3->v4:
- Add an initialization function to scan all entries in Port_binding table
  when ovn-northd restarts or fails over

Signed-off-by: Hui Kang 
---
 ovn/northd/ovn-northd.c | 95 +
 1 file changed, 65 insertions(+), 30 deletions(-)

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 17fbf29..35566a7 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -53,6 +53,11 @@ struct northd_context {
 struct ovsdb_idl_txn *ovnsb_txn;
 };
 
+struct lport_hash_node {
+struct hmap_node node;
+const struct nbrec_logical_switch_port *nbsp;
+};
+
 static const char *ovnnb_db;
 static const char *ovnsb_db;
 
@@ -3196,13 +3201,47 @@ ovnnb_db_run(struct northd_context *ctx)
 hmap_destroy();
 }
 
+static void
+sb_chassis_update_nbsec(const struct sbrec_port_binding *sb,
+struct hmap *lports_hmap)
+
+{
+struct lport_hash_node *hash_node;
+const struct nbrec_logical_switch_port *nbsp;
+
+nbsp = NULL;
+HMAP_FOR_EACH_WITH_HASH(hash_node, node,
+hash_string(sb->logical_port, 0),
+lports_hmap) {
+if (!strcmp(sb->logical_port, hash_node->nbsp->name)) {
+nbsp = hash_node->nbsp;
+break;
+}
+}
+
+if (!nbsp) {
+/* The logical port doesn't exist for this port binding.  This can
+ * happen under normal circumstances when ovn-northd hasn't gotten
+ * around to pruning the Port_Binding yet. */
+return;
+}
+
+if (sb->chassis && (!nbsp->up || !*nbsp->up)) {
+bool up = true;
+nbrec_logical_switch_port_set_up(nbsp, &up, 1);
+} else if (!sb->chassis && (!nbsp->up || *nbsp->up)) {
+bool up = false;
+nbrec_logical_switch_port_set_up(nbsp, &up, 1);
+}
+}
+
 /*
  * The only change we get notified about is if the 'chassis' column of the
  * 'Port_Binding' table changes.  When this column is not empty, it means we
  * need to set the corresponding logical port as 'up' in the northbound DB.
  */
 static void
-ovnsb_db_run(struct northd_context *ctx)
+ovnsb_db_run(struct northd_context *ctx, bool is_init)
 {
 if (!ctx->ovnnb_txn) {
 return;
@@ -3211,10 +3250,7 @@ ovnsb_db_run(struct northd_context *ctx)
 const struct sbrec_port_binding *sb;
 const struct nbrec_logical_switch_port *nbsp;
 
-struct lport_hash_node {
-struct hmap_node node;
-const struct nbrec_logical_switch_port *nbsp;
-} *hash_node;
+struct lport_hash_node *hash_node;
 
 hmap_init(_hmap);
 
@@ -3224,30 +3260,13 @@ ovnsb_db_run(struct northd_context *ctx)
 hmap_insert(_hmap, _node->node, hash_string(nbsp->name, 
0));
 }
 
-SBREC_PORT_BINDING_FOR_EACH(sb, ctx->ovnsb_idl) {
-nbsp = NULL;
-HMAP_FOR_EACH_WITH_HASH(hash_node, node,
-hash_string(sb->logical_port, 0),
-_hmap) {
-if (!strcmp(sb->logical_port, hash_node->nbsp->name)) {
-nbsp = hash_node->nbsp;
-break;
-}
-}
-
-if (!nbsp) {
-/* The logical port doesn't exist for this port binding.  This can
- * happen under normal circumstances when ovn-northd hasn't gotten
- * around to pruning the Port_Binding yet. */
-continue;
+if (is_init) {
+SBREC_PORT_BINDING_FOR_EACH(sb, ctx->ovnsb_idl) {
+sb_chassis_update_nbsec(sb, _hmap);
 }
-
-if (sb->chassis && (!nbsp->up || !*nbsp->up)) {
-bool up = true;
-nbrec_logical_switch_port_set_up(nbsp, &up, 1);
-} else if (!sb->chassis && (!nbsp->up || *nbsp->up)) {
-bool up = false;
-nbrec_logical_switch_port_set_up(nbsp, &up, 1);
+} else {
+SBREC_PORT_BINDING_FOR_EACH_TRACKED(sb, ctx->ovnsb_idl) {
+sb_chassis_update_nbsec(sb, _hmap);
 }
 }
 
@@ -3417,6 +3436,19 @@ add_column_noalert(struct ovsdb_idl *idl,
 ovsdb_idl_omit_alert(idl, column);
 }
 
+static void
+ovnsb_db_run_init(struct ovsdb_idl_loop *ovnsb_idl_loop)
+{
+struct northd_context ctx = {
+.ovnnb_idl = NULL,
+.ovnnb_txn = NULL,
+.ovnsb_idl = ovnsb_idl_loop->idl,
+.ovnsb_txn = ovsdb_idl_loop_run(ovnsb_idl_loop),
+};
+
+ovnsb_db_run(&ctx, true);
+}
+
 int
 main(int argc, char *argv[])
 {
@@ -3485,7 +3517,6 @@ main(int argc, char *argv[])
 add_column_noalert(ovnsb_idl_loop.idl, _port_binding_col_type);
 add_column_noalert(ovnsb_idl_loop.idl, _port_binding_col_options);
 add_column_noalert(ovnsb_idl_loop.idl, _port_binding_col_mac);
-ovsdb_idl_add_column(ovnsb_idl_loop.idl, _port_binding_col_chassis);
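
The idea in this patch, one full scan at start-up and then a scan of changed rows only, can be sketched without the IDL. The code below is a simplified illustration, not the ovn-northd implementation: `toy_binding`, `toy_update_one()` and `toy_run()` are hypothetical names, and a per-row `changed` flag stands in for the IDL change tracking (a real tracked iteration would walk only the list of changed rows rather than test a flag on each row).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified row; the real code uses IDL-generated
 * 'struct sbrec_port_binding' rows and IDL change tracking. */
struct toy_binding {
    bool has_chassis;   /* Stand-in for a non-empty 'chassis' column. */
    bool changed;       /* Set when the row changed since the last run. */
    bool up;            /* Stand-in for the northbound 'up' state. */
};

/* Updates 'up' for one row; returns true if the value was modified. */
static bool
toy_update_one(struct toy_binding *b)
{
    if (b->up != b->has_chassis) {
        b->up = b->has_chassis;
        return true;
    }
    return false;
}

/* Processes every row when 'is_init' is true, otherwise only changed rows.
 * Clears the 'changed' flags and returns the number of rows processed. */
static size_t
toy_run(struct toy_binding *rows, size_t n, bool is_init)
{
    size_t visited = 0;

    for (size_t i = 0; i < n; i++) {
        if (is_init || rows[i].changed) {
            toy_update_one(&rows[i]);
            visited++;
        }
        rows[i].changed = false;
    }
    return visited;
}
```

The `is_init` pass matters because a freshly started or failed-over process has no change history, so every row has to be reconciled once before incremental runs are safe.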
 

Re: [ovs-dev] [PATCH] ovn-controller: update_ct_zone operates always on empty set

2016-07-26 Thread Ryan Moats
"dev"  wrote on 07/26/2016 08:13:00 PM:

> From: Russell Bryant 
> To: Babu Shanmugam 
> Cc: ovs dev 
> Date: 07/26/2016 08:13 PM
> Subject: Re: [ovs-dev] [PATCH] ovn-controller: update_ct_zone
> operates always on empty set
> Sent by: "dev" 
>
> On Tue, Jul 26, 2016 at 6:46 AM,  wrote:
>
> > From: Babu Shanmugam 
> >
> > Commit 263064a (Convert binding_run to incremental processing.) removed
> > the usage
> > of all_lports from binding_run, but it is in fact used in the context of
> > the caller,
> > especially by update_ct_zones().
> >
> > Without this change, update_ct_zones operates on an empty set always.
> >
> > Signed-off-by: Babu Shanmugam 
> >
>
> Ouch. This is a really bad regression.  If I understand correctly, we're
> not setting a ct zone ID for any logical ports.  All are just using the
> default zone of 0.
>
> We should think about a good way to test OVN's use of conntrack zones to
> ensure that entries end up in separate zones for separate ports.  A good
> test for that may require userspace conntrack support, though.
>
> Another test we could do now would be looking at the flows in table 0 and
> ensuring that the input flow for each port has a different conntrack zone
> ID assigned.  That feels like kind of a hack, though.
>
> ---
> >  ovn/controller/binding.c| 4 +++-
> >  ovn/controller/binding.h| 3 ++-
> >  ovn/controller/ovn-controller.c | 2 +-
> >  3 files changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> > index e83c1d5..7bc6fb4 100644
> > --- a/ovn/controller/binding.c
> > +++ b/ovn/controller/binding.c
> > @@ -230,7 +230,8 @@ consider_local_datapath(struct controller_ctx *ctx,
> >
> >  void
> >  binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge
> > *br_int,
> > -const char *chassis_id, struct hmap *local_datapaths)
> > +const char *chassis_id, struct hmap *local_datapaths,
> > +struct sset *all_lports)
> >  {
> >  const struct sbrec_chassis *chassis_rec;
> >  const struct sbrec_port_binding *binding_rec;
> > @@ -292,6 +293,7 @@ binding_run(struct controller_ctx *ctx, const
struct
> > ovsrec_bridge *br_int,
> >  }
> >  }
> >
> > +sset_clone(all_lports, _ids);
> >
>
> I don't think this is quite sufficient.  It's missing, at least:
>
>  - the IDs of sub-ports
>
>  - localnet ports
>
> The old handling of building up all_lports ensured those got added.
>

Yes, I'm inclined to put the all_lports code back in and persist it in
a similar way to how local_ids is persisted.

I just checked, and the only test that I'm seeing fail consistently
is the LB test Guru pointed out previously, so I think we've got a
problem in our test code.  While I'd be happy to write the test case,
I'm not sure I understand it, so can somebody give me a pointer to an
existing test case or draft something that can be used to help fix this
and avoid the regression in the future?

Ryan


[ovs-dev] [PATCH] ovsdb: Fix memory leak in execute_update.

2016-07-26 Thread William Tu
Valgrind testcase 1804 ovsdb-server.at:1023 insert rows, update rows by value
reports the following leak.
json_from_string (json.c:1025)
execute_update (replication.c:614), similarly at execute_delete()
process_table_update (replication.c:502)
process_notification.part.5 (replication.c:445)
process_notification (replication.c:402)
check_for_notifications (replication.c:418)
replication_run (replication.c:110)

Signed-off-by: William Tu 
---
 ovsdb/replication.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/ovsdb/replication.c b/ovsdb/replication.c
index af7ae5c..fe89d39 100644
--- a/ovsdb/replication.c
+++ b/ovsdb/replication.c
@@ -573,6 +573,8 @@ execute_delete(struct ovsdb_txn *txn, const char *uuid,
 }
 
 ovsdb_condition_destroy();
+json_destroy(CONST_CAST(struct json *, where));
+
 return error;
 }
 
@@ -630,6 +632,7 @@ execute_update(struct ovsdb_txn *txn, const char *uuid,
 ovsdb_row_destroy(row);
 ovsdb_column_set_destroy();
 ovsdb_condition_destroy();
+json_destroy(CONST_CAST(struct json *, where));
 
 return error;
 }
-- 
2.5.0



Re: [ovs-dev] ovsdb active backup deployment

2016-07-26 Thread Russell Bryant
On Tue, Jul 26, 2016 at 3:48 PM, Andy Zhou  wrote:

>
>
> On Tue, Jul 26, 2016 at 11:59 AM, Russell Bryant  wrote:
>
>>
>>
>> On Tue, Jul 26, 2016 at 2:41 PM, Andy Zhou  wrote:
>>
>>>
>>>
>>> On Tue, Jul 26, 2016 at 5:37 AM, Russell Bryant  wrote:
>>>


>>>> On Mon, Jul 25, 2016 at 8:15 PM, Andy Zhou  wrote:
>>>>
>>>>> Hi, Ryan and Russell,
>>>>>
>>>>
>>>> Can we move this discussion to the ovs dev mailing list?  Feel free to
>>>> just add it in a reply if you'd like.
>>>>
>>> Done.
>>>
>>>>
>>>>
>>>>> I am wondering how we can actually use the active/backup feature that
>>>>> is now part of
>>>>> OVSDB to increase OVN availability.
>>>>>
>>>>
>>>> To be clear, I haven't actually tried this yet.  I'm only speaking
>>>> about how I think it should work.
>>>>
>>>>
>>>>> Specifically:
>>>>>
>>>>> 1. When the active OVSDB server fails, should the backup server take
>>>>> over and allow write transactions? One simpler possibility is to allow
>>>>> read-only access to the backup server.
>>>>>
>>>>
>>>> The backup server needs to take over.  It's OK if that requires
>>>> intervention by an HA manager like Pacemaker.  If we can't make the passive
>>>> server take over, I'd say the solution is incomplete.

>>>
>>> O.K., makes sense.
>>>
>>> One possible issue with the backup server taking over is "split brain".  If,
>>> due to a network error, the backup server becomes disconnected from the
>>> active server, we may have both servers thinking they are the active
>>> server.  Does Pacemaker help with solving this issue?
>>>
>>
>> It can, yes.  I would expect Pacemaker to explicitly configure a node to
>> be either the active or passive node.
>>
> Manual switching is more straightforward. I agree.
>
>>

>>>>> 2. When a crashed active OVSDB server recovers, should it become the
>>>>> new backup, or should it switch back?
>>>>>
>>>>
>>>> Becoming the new backup is fine.  Again, this can be orchestrated by an
>>>> HA manager (Pacemaker).

>>> I am not familiar with Pacemaker. Can I assume it can provide a correct
>>> --sync-from argument (pointing to the backup server) when relaunching the
>>> OVSDB server?
>>>
>>
>> Yes.  I'd have to consult with some Pacemaker experts on exactly what the
>> implementation would look like, but roughly:
>>
>> Pacemaker manages services using "OCF Resource Agents", which are just
>> scripts with a defined set of inputs and outputs for service management.  I
>> would imagine a Pacemaker cluster being told it must have exactly 1 active
>> and 1 passive OVSDB service.  When the passive OVSDB service is started, it
>> would include the "sync-from" argument based on where the active OVSDB
>> service is currently running.
>>
>> We really need to prototype this and document it.  I'm guessing too
>> much.  Pacemaker is frequently used to manage active/passive HA, though.
>>
> Sounds reasonable.  I will work on ovsdb internal changes to support
> manual switching, using appctl commands, then look into prototyping with
> HA systems.  I have not used Pacemaker in the past, so it may take some
> time to ramp up.
>

I should be able to help.  We need to do this work anyway for integration
into OpenStack deployment tools.  Let me see if I can get some helpful
examples to follow.

-- 
Russell Bryant


Re: [ovs-dev] [PATCH] ovn-controller: update_ct_zone operates always on empty set

2016-07-26 Thread Russell Bryant
On Tue, Jul 26, 2016 at 6:46 AM,  wrote:

> From: Babu Shanmugam 
>
> Commit 263064a (Convert binding_run to incremental processing.) removed
> the usage
> of all_lports from binding_run, but it is in fact used in the context of
> the caller,
> especially by update_ct_zones().
>
> Without this change, update_ct_zones operates on an empty set always.
>
> Signed-off-by: Babu Shanmugam 
>

Ouch. This is a really bad regression.  If I understand correctly, we're
not setting a ct zone ID for any logical ports.  All are just using the
default zone of 0.

We should think about a good way to test OVN's use of conntrack zones to
ensure that entries end up in separate zones for separate ports.  A good
test for that may require userspace conntrack support, though.

Another test we could do now would be looking at the flows in table 0 and
ensuring that the input flow for each port has a different conntrack zone
ID assigned.  That feels like kind of a hack, though.
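
The property such a test would check, that every port ends up with its own nonzero zone, can be written down in a few lines. The sketch below is only an illustration under assumed names (`toy_assign_zones()`, `toy_zones_distinct()`, `TOY_MAX_ZONES` are hypothetical) and is not how ovn-controller allocates zones. It hands out the lowest free zone per port and keeps zone 0 as the default.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ZONES 16   /* Hypothetical small limit for the sketch. */

/* Assigns the lowest free zone in 1..TOY_MAX_ZONES-1 to each of 'n_ports'
 * ports, writing the results to 'zones'.  Zone 0 stays the default zone.
 * Returns the number of ports that received a zone. */
static size_t
toy_assign_zones(size_t n_ports, int zones[])
{
    bool used[TOY_MAX_ZONES] = { false };
    size_t assigned = 0;

    used[0] = true;   /* Reserve the default zone. */
    for (size_t i = 0; i < n_ports; i++) {
        zones[i] = 0;
        for (int z = 1; z < TOY_MAX_ZONES; z++) {
            if (!used[z]) {
                used[z] = true;
                zones[i] = z;
                assigned++;
                break;
            }
        }
    }
    return assigned;
}

/* The property a test would check: every zone is nonzero and unique. */
static bool
toy_zones_distinct(size_t n_ports, const int zones[])
{
    for (size_t i = 0; i < n_ports; i++) {
        if (zones[i] == 0) {
            return false;
        }
        for (size_t j = i + 1; j < n_ports; j++) {
            if (zones[i] == zones[j]) {
                return false;
            }
        }
    }
    return true;
}
```

With an empty port set nothing is assigned, so every port would be left on zone 0 and `toy_zones_distinct()` would fail, which is the toy version of this regression.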

---
>  ovn/controller/binding.c| 4 +++-
>  ovn/controller/binding.h| 3 ++-
>  ovn/controller/ovn-controller.c | 2 +-
>  3 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> index e83c1d5..7bc6fb4 100644
> --- a/ovn/controller/binding.c
> +++ b/ovn/controller/binding.c
> @@ -230,7 +230,8 @@ consider_local_datapath(struct controller_ctx *ctx,
>
>  void
>  binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge
> *br_int,
> -const char *chassis_id, struct hmap *local_datapaths)
> +const char *chassis_id, struct hmap *local_datapaths,
> +struct sset *all_lports)
>  {
>  const struct sbrec_chassis *chassis_rec;
>  const struct sbrec_port_binding *binding_rec;
> @@ -292,6 +293,7 @@ binding_run(struct controller_ctx *ctx, const struct
> ovsrec_bridge *br_int,
>  }
>  }
>
> +sset_clone(all_lports, _ids);
>

I don't think this is quite sufficient.  It's missing, at least:

 - the IDs of sub-ports

 - localnet ports

The old handling of building up all_lports ensured those got added.


>  shash_destroy(_to_iface);
>  }
>
> diff --git a/ovn/controller/binding.h b/ovn/controller/binding.h
> index 8753d44..fbd16c8 100644
> --- a/ovn/controller/binding.h
> +++ b/ovn/controller/binding.h
> @@ -29,7 +29,8 @@ struct sset;
>  void binding_register_ovs_idl(struct ovsdb_idl *);
>  void binding_reset_processing(void);
>  void binding_run(struct controller_ctx *, const struct ovsrec_bridge
> *br_int,
> - const char *chassis_id, struct hmap *local_datapaths);
> + const char *chassis_id, struct hmap *local_datapaths,
> + struct sset *all_lports);
>  bool binding_cleanup(struct controller_ctx *, const char *chassis_id);
>
>  #endif /* ovn/binding.h */
> diff --git a/ovn/controller/ovn-controller.c
> b/ovn/controller/ovn-controller.c
> index 4d9490a..6a6bb93 100644
> --- a/ovn/controller/ovn-controller.c
> +++ b/ovn/controller/ovn-controller.c
> @@ -425,7 +425,7 @@ main(int argc, char *argv[])
>  if (chassis_id) {
>  chassis_run(, chassis_id);
>  encaps_run(, br_int, chassis_id);
> -binding_run(, br_int, chassis_id, _datapaths);
> +binding_run(, br_int, chassis_id, _datapaths,
> _lports);
>  }
>
>  if (br_int && chassis_id) {
> --
> 2.5.5
>
>



-- 
Russell Bryant


Re: [ovs-dev] [PATCH] datapath: Add support for kernel 4.6

2016-07-26 Thread pravin shelar
On Tue, Jul 26, 2016 at 4:37 PM, Amitabha Biswas  wrote:
> Typo in the previous ack
>
> Acked-by: Amitabha Biswas 
>
> On Jul 26, 2016, at 4:22 PM, Amitabha Biswas  wrote:
>
> I was able to compile the openvswitch modules on Linux 4.6 kernel and
> stacked using OpenStack networking-ovn.
>
> The basic NAT system tests passed and the OVN test suite passed.
>
> Asked-by: Amitabha Biswas 
>
> On Jul 26, 2016, at 2:07 PM, Jesse Gross  wrote:
>
> On Mon, Jul 25, 2016 at 6:40 PM, Pravin B Shelar  wrote:
>
> Most of the patch irons out the USE_UPSTREAM_TUNNEL case, where the
> datapath directly uses the upstream tunneling modules.
>
> Signed-off-by: Pravin B Shelar 
>
>
> Acked-by: Jesse Gross 

Thanks. Pushed to master.


Re: [ovs-dev] [PATCH v2 1/3] datapath: compat: fix udp checksum calculation

2016-07-26 Thread pravin shelar
On Tue, Jul 26, 2016 at 4:03 PM, Jesse Gross  wrote:
> On Tue, Jul 26, 2016 at 3:59 PM, pravin shelar  wrote:
>> On Tue, Jul 26, 2016 at 3:53 PM, Jesse Gross  wrote:
>>> On Tue, Jul 26, 2016 at 3:24 PM, Pravin B Shelar  wrote:
>>>> diff --git a/datapath/linux/compat/include/net/udp.h
>>>> b/datapath/linux/compat/include/net/udp.h
>>>> index fa49fa5..266e70a 100644
>>>> --- a/datapath/linux/compat/include/net/udp.h
>>>> +++ b/datapath/linux/compat/include/net/udp.h
>>>> @@ -54,7 +54,7 @@ static inline __sum16 udp_v4_check(int len, __be32 saddr,
>>>>  }
>>>>  #endif
>>>>
>>>> -#ifndef HAVE_UDP_SET_CSUM
>>>> +#if LINUX_VERSION_CODE < KERNEL_VERSION(3,18,0)
>>>
>>> I'm a little nervous about these version checks being hard to maintain
>>> - especially since they don't correspond to anything obvious in this
>>> function upstream. Maybe we could just declare a #define with a name
>>> that would make it clearer. That might actually be useful in any case
>>> since I suspect that we will start seeing some backports in
>>> distributions that will allow us to avoid doing OVS segmentation even
>>> on older kernels.
>>
>> Is it fine if I do it as part of separate patch? This patch is about
>> fixing the UDP checksum issue. And the requested change is about
>> general code improvement.
>
> Yes, that's fine. I think we'll want to convert all of the GSO related
> 3.18 version checks to use this symbol, so that's mostly not related
> to checksums anyways.
>
> Acked-by: Jesse Gross 

Thanks for the reviews. I pushed this series to master. I also pushed
the first two patches to branch 2.5.
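
For reference, the named-symbol approach suggested in the review could look roughly like the sketch below. The macro name `TOY_COMPAT_GSO_FIXUP` and the helper `toy_uses_compat_gso()` are hypothetical and were not proposed in this thread. The two stand-in definitions at the top exist only so that the fragment compiles in userspace; kernel code gets both from `<linux/version.h>`.

```c
/* Userspace stand-ins so the sketch compiles outside a kernel tree. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(3, 16, 0)   /* Assumed for demo. */
#endif

/* Hypothetical name: one symbol that records why the compat code exists, so
 * call sites do not repeat the raw version comparison. */
#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 18, 0)
#define TOY_COMPAT_GSO_FIXUP 1
#endif

/* Returns 1 when the compat path is compiled in, 0 otherwise. */
static int
toy_uses_compat_gso(void)
{
#ifdef TOY_COMPAT_GSO_FIXUP
    return 1;
#else
    return 0;
#endif
}
```

Keeping the version comparison in one place also leaves room for a build-time check to undefine the symbol on distribution kernels that carry the relevant backports.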


[ovs-dev] [PATCH v5 04/16] conntrack: New userspace connection tracker.

2016-07-26 Thread Daniele Di Proietto
This commit adds the conntrack module.

It is a connection tracker that resides entirely in userspace.  Its
primary user will be the dpif-netdev datapath.

The module's main goal is to provide conntrack_execute(), which offers a
convenient interface to implement the datapath ct() action.

The conntrack module uses two submodules to deal with the l4 protocol
details (conntrack-other for UDP and ICMP, conntrack-tcp for TCP).

The conntrack-tcp submodule implementation is adapted from FreeBSD's pf
subsystem, therefore it's BSD licensed.  It has been slightly altered to
match the OVS coding style and to allow the pickup of already
established connections.

Signed-off-by: Daniele Di Proietto 
Acked-by: Antonio Fischetti 
Acked-by: Joe Stringer 
---
 COPYING |   1 +
 debian/copyright.in |   4 +
 include/openvswitch/types.h |   4 +
 lib/automake.mk |   5 +
 lib/conntrack-other.c   |  85 +
 lib/conntrack-private.h |  89 +
 lib/conntrack-tcp.c | 462 +++
 lib/conntrack.c | 890 
 lib/conntrack.h | 150 
 lib/util.h  |   9 +
 10 files changed, 1699 insertions(+)
 create mode 100644 lib/conntrack-other.c
 create mode 100644 lib/conntrack-private.h
 create mode 100644 lib/conntrack-tcp.c
 create mode 100644 lib/conntrack.c
 create mode 100644 lib/conntrack.h

diff --git a/COPYING b/COPYING
index 308e3ea..afb98b9 100644
--- a/COPYING
+++ b/COPYING
@@ -25,6 +25,7 @@ License, version 2.
 The following files are licensed under the 2-clause BSD license.
 include/windows/getopt.h
 lib/getopt_long.c
+lib/conntrack-tcp.c
 
 The following files are licensed under the 3-clause BSD-license
 include/windows/netinet/icmp6.h
diff --git a/debian/copyright.in b/debian/copyright.in
index 57d007a..a15f4dd 100644
--- a/debian/copyright.in
+++ b/debian/copyright.in
@@ -21,6 +21,9 @@ Upstream Copyright Holders:
Copyright (c) 2014 Michael Chapman
Copyright (c) 2014 WindRiver, Inc.
Copyright (c) 2014 Avaya, Inc.
+   Copyright (c) 2001 Daniel Hartmeier
+   Copyright (c) 2002 - 2008 Henning Brauer
+   Copyright (c) 2012 Gleb Smirnoff 
 
 License:
 
@@ -90,6 +93,7 @@ License:
lib/getopt_long.c
include/windows/getopt.h
datapath-windows/ovsext/Conntrack-tcp.c
+   lib/conntrack-tcp.c
 
 * The following files are licensed under the 3-clause BSD-license
 
diff --git a/include/openvswitch/types.h b/include/openvswitch/types.h
index da56d4b..2f5fcca 100644
--- a/include/openvswitch/types.h
+++ b/include/openvswitch/types.h
@@ -108,6 +108,10 @@ static const ovs_u128 OVS_U128_MAX = { { UINT32_MAX, 
UINT32_MAX,
  UINT32_MAX, UINT32_MAX } };
 static const ovs_be128 OVS_BE128_MAX OVS_UNUSED = { { OVS_BE32_MAX, 
OVS_BE32_MAX,
OVS_BE32_MAX, OVS_BE32_MAX } };
+static const ovs_u128 OVS_U128_MIN OVS_UNUSED = { {0, 0, 0, 0} };
+static const ovs_u128 OVS_BE128_MIN OVS_UNUSED = { {0, 0, 0, 0} };
+
+#define OVS_U128_ZERO OVS_U128_MIN
 
 /* A 64-bit value, in network byte order, that is only aligned on a 32-bit
  * boundary. */
diff --git a/lib/automake.mk b/lib/automake.mk
index 71c9d41..b1da53d 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -49,6 +49,11 @@ lib_libopenvswitch_la_SOURCES = \
lib/compiler.h \
lib/connectivity.c \
lib/connectivity.h \
+   lib/conntrack-private.h \
+   lib/conntrack-tcp.c \
+   lib/conntrack-other.c \
+   lib/conntrack.c \
+   lib/conntrack.h \
lib/coverage.c \
lib/coverage.h \
lib/crc32c.c \
diff --git a/lib/conntrack-other.c b/lib/conntrack-other.c
new file mode 100644
index 000..295cb2c
--- /dev/null
+++ b/lib/conntrack-other.c
@@ -0,0 +1,85 @@
+/*
+ * Copyright (c) 2015, 2016 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+
+#include "conntrack-private.h"
+#include "dp-packet.h"
+
+enum other_state {
+OTHERS_FIRST,
+OTHERS_MULTIPLE,
+OTHERS_BIDIR,
+};
+
+struct conn_other {
+struct conn up;
+enum other_state state;
+};
+
+static const enum ct_timeout other_timeouts[] = {
+[OTHERS_FIRST] = CT_TM_OTHER_FIRST,
+[OTHERS_MULTIPLE] = CT_TM_OTHER_MULTIPLE,
+

[ovs-dev] [PATCH v5 12/16] tests: Add conntrack ofproto-dpif tests.

2016-07-26 Thread Daniele Di Proietto
While the system testsuite already has connection tracking tests, it
will still be useful to add some to the standard testsuite because:

* They're run more often by developers.
* Some of them are more interesting for the userspace datapath.

Signed-off-by: Daniele Di Proietto 
Acked-by: Flavio Leitner 
---
 tests/ofproto-dpif.at | 678 ++
 1 file changed, 678 insertions(+)

diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index 67bb5e2..19ff4ce 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -8118,5 +8118,683 @@ AT_CHECK([grep "Final flow:" stdout], [0], [Final flow: 
unchanged
 AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(100)'], [0], [stdout])
 AT_CHECK([grep "Final flow:" stdout], [0], [Final flow: unchanged
 ])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - conntrack - controller])
+OVS_VSWITCHD_START
+
+add_of_ports br0 1 2
+
+AT_CHECK([ovs-appctl vlog/set dpif_netdev:dbg vconn:info ofproto_dpif:info])
+
+dnl Allow new connections on p1->p2, but not on p2->p1.
+AT_DATA([flows.txt], [dnl
+priority=1,action=drop
+priority=10,arp,action=normal
+priority=100,in_port=1,udp,action=ct(commit,zone=0),controller
+priority=100,in_port=2,ct_state=-trk,udp,action=ct(table=0,zone=0)
+priority=100,in_port=2,ct_state=+trk+est-new,udp,action=controller
+])
+
+AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
+
+AT_CAPTURE_FILE([ofctl_monitor.log])
+AT_CHECK([ovs-ofctl monitor br0 65534 invalid_ttl -P nxt_packet_in --detach 
--no-chdir --pidfile 2> ofctl_monitor.log])
+
+AT_CHECK([ovs-appctl netdev-dummy/receive p2 
'in_port(2),eth(src=50:54:00:00:00:0a,dst=50:54:00:00:00:09),eth_type(0x0800),ipv4(src=10.1.1.2,dst=10.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=2,dst=1)'])
+
+dnl OK, now start a new connection from port 1.
+AT_CHECK([ovs-appctl netdev-dummy/receive p1 
'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0,ttl=64,frag=no),udp(src=1,dst=2)'])
+
+dnl Now try a reply from port 2.
+AT_CHECK([ovs-appctl netdev-dummy/receive p2 
'in_port(2),eth(src=50:54:00:00:00:0a,dst=50:54:00:00:00:09),eth_type(0x0800),ipv4(src=10.1.1.2,dst=10.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=2,dst=1)'])
+
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 4])
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+dnl Check this output. We only see the latter two packets, not the first.
+dnl Note that the first packet doesn't have the ct_state bits set. This
+dnl happens because the ct_state field is available only after recirc.
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=60 in_port=1 (via action) 
data_len=60 (unbuffered)
+udp,vlan_tci=0x,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=1,tp_dst=2
 udp_csum:e9d6
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=60 
ct_state=est|rpl|trk,in_port=2 (via action) data_len=60 (unbuffered)
+udp,vlan_tci=0x,dl_src=50:54:00:00:00:0a,dl_dst=50:54:00:00:00:09,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=2,tp_dst=1
 udp_csum:e9d6
+])
+
+AT_CHECK([ovs-ofctl monitor br0 65534 invalid_ttl -P nxt_packet_in --detach 
--no-chdir --pidfile 2> ofctl_monitor.log])
+
+dnl OK, now start a second connection from port 1
+AT_CHECK([ovs-appctl netdev-dummy/receive p1 
'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0,ttl=64,frag=no),udp(src=3,dst=4)'])
+
+dnl Now try a reply from port 2.
+AT_CHECK([ovs-appctl netdev-dummy/receive p2 
'in_port(2),eth(src=50:54:00:00:00:0a,dst=50:54:00:00:00:09),eth_type(0x0800),ipv4(src=10.1.1.2,dst=10.1.1.1,proto=17,tos=0,ttl=64,frag=no),udp(src=4,dst=3)'])
+
+
+OVS_WAIT_UNTIL([test `wc -l < ofctl_monitor.log` -ge 4])
+OVS_WAIT_UNTIL([ovs-appctl -t ovs-ofctl exit])
+
+dnl Check this output. We should see both packets
+dnl Note that the first packet doesn't have the ct_state bits set. This
+dnl happens because the ct_state field is available only after recirc.
+AT_CHECK([cat ofctl_monitor.log], [0], [dnl
+NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=60 in_port=1 (via action) 
data_len=60 (unbuffered)
+udp,vlan_tci=0x,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=3,tp_dst=4
 udp_csum:e9d2
+dnl
+NXT_PACKET_IN (xid=0x0): cookie=0x0 total_len=60 
ct_state=est|rpl|trk,in_port=2 (via action) data_len=60 (unbuffered)
+udp,vlan_tci=0x,dl_src=50:54:00:00:00:0a,dl_dst=50:54:00:00:00:09,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=4,tp_dst=3
 udp_csum:e9d2
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([ofproto-dpif - conntrack - ipv6])
+OVS_VSWITCHD_START
+
+add_of_ports br0 1 2
+
+AT_CHECK([ovs-appctl vlog/set dpif_netdev:dbg vconn:info 

[ovs-dev] [PATCH v5 15/16] conntrack: Track ICMP type and code.

2016-07-26 Thread Daniele Di Proietto
From the connection tracker perspective, an ICMP connection is a tuple
identified by source ip address, destination ip address and ICMP id.

While this allows basic ICMP traffic (pings) to work, it doesn't take
into account the ICMP type: the connection tracker will allow
requests/replies in any direction.

This is improved by making the ICMP type and code part of the connection
tuple.  An ICMP echo request packet from A to B, will create a
connection that matches ICMP echo request from A to B and ICMP echo
replies from B to A.  The same is done for timestamp and info
request/replies, and for ICMPv6.

A new module, conntrack-icmp, is implemented to allow only "request"
types to create new connections.

Also, since they're tracked in both userspace and kernel
implementations, ICMP type and code are always printed in ct-dpif (a few
testcases are updated as a consequence).
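
The request/reply pairing described above can be illustrated with a small, self-contained sketch. It is not the code from this patch: `toy_icmp_key` and `toy_icmp4_reverse()` are hypothetical names, and the key is reduced to id, type and code (the real tuple also carries addresses and more). The type numbers are the standard ICMPv4 values from RFC 792.

```c
#include <stdbool.h>
#include <stdint.h>

/* ICMPv4 type numbers from RFC 792. */
enum {
    TOY_ICMP4_ECHO_REPLY = 0,
    TOY_ICMP4_ECHO_REQUEST = 8,
    TOY_ICMP4_TIMESTAMP = 13,
    TOY_ICMP4_TIMESTAMPREPLY = 14,
    TOY_ICMP4_INFOREQUEST = 15,
    TOY_ICMP4_INFOREPLY = 16,
};

/* Hypothetical, simplified key for one direction of an ICMP "connection". */
struct toy_icmp_key {
    uint16_t id;
    uint8_t type;
    uint8_t code;
};

/* Fills 'reply' with the key expected in the reverse direction.  Returns
 * false if 'req' is not a request type, in which case it should not be
 * allowed to create a connection. */
static bool
toy_icmp4_reverse(const struct toy_icmp_key *req, struct toy_icmp_key *reply)
{
    uint8_t reply_type;

    switch (req->type) {
    case TOY_ICMP4_ECHO_REQUEST:
        reply_type = TOY_ICMP4_ECHO_REPLY;
        break;
    case TOY_ICMP4_TIMESTAMP:
        reply_type = TOY_ICMP4_TIMESTAMPREPLY;
        break;
    case TOY_ICMP4_INFOREQUEST:
        reply_type = TOY_ICMP4_INFOREPLY;
        break;
    default:
        return false;
    }

    reply->id = req->id;
    reply->type = reply_type;
    reply->code = req->code;
    return true;
}
```

An echo reply arriving first has no request to pair with, so `toy_icmp4_reverse()` rejects it instead of creating a connection.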

Reported-by: Subramani Paramasivam 
Signed-off-by: Daniele Di Proietto 
Acked-by: Joe Stringer 
---
 lib/automake.mk |   1 +
 lib/conntrack-icmp.c| 105 
 lib/conntrack-private.h |  11 -
 lib/conntrack.c |  62 
 lib/conntrack.h |   2 +
 lib/ct-dpif.c   |  24 ---
 lib/ct-dpif.h   |   3 +-
 lib/netlink-conntrack.c |   2 +-
 tests/system-traffic.at |  12 +++---
 9 files changed, 188 insertions(+), 34 deletions(-)
 create mode 100644 lib/conntrack-icmp.c

diff --git a/lib/automake.mk b/lib/automake.mk
index b1da53d..4110e5f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -49,6 +49,7 @@ lib_libopenvswitch_la_SOURCES = \
lib/compiler.h \
lib/connectivity.c \
lib/connectivity.h \
+   lib/conntrack-icmp.c \
lib/conntrack-private.h \
lib/conntrack-tcp.c \
lib/conntrack-other.c \
diff --git a/lib/conntrack-icmp.c b/lib/conntrack-icmp.c
new file mode 100644
index 000..40fd1d8
--- /dev/null
+++ b/lib/conntrack-icmp.c
@@ -0,0 +1,105 @@
+/*
+ * Copyright (c) 2015, 2016 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+
+#include 
+#include 
+#include 
+#include 
+
+#include "conntrack-private.h"
+#include "dp-packet.h"
+
+enum icmp_state {
+ICMPS_FIRST,
+ICMPS_REPLY,
+};
+
+struct conn_icmp {
+struct conn up;
+enum icmp_state state;
+};
+
+static const enum ct_timeout icmp_timeouts[] = {
+[ICMPS_FIRST] = CT_TM_ICMP_FIRST,
+[ICMPS_REPLY] = CT_TM_ICMP_REPLY,
+};
+
+static struct conn_icmp *
+conn_icmp_cast(const struct conn *conn)
+{
+return CONTAINER_OF(conn, struct conn_icmp, up);
+}
+
+static enum ct_update_res
+icmp_conn_update(struct conn *conn_, struct conntrack_bucket *ctb,
+ struct dp_packet *pkt OVS_UNUSED, bool reply, long long now)
+{
+struct conn_icmp *conn = conn_icmp_cast(conn_);
+
+if (reply && conn->state != ICMPS_REPLY) {
+conn->state = ICMPS_REPLY;
+}
+
+conn_update_expiration(ctb, &conn->up, icmp_timeouts[conn->state], now);
+
+return CT_UPDATE_VALID;
+}
+
+static bool
+icmp4_valid_new(struct dp_packet *pkt)
+{
+struct icmp_header *icmp = dp_packet_l4(pkt);
+
+return icmp->icmp_type == ICMP4_ECHO_REQUEST
+   || icmp->icmp_type == ICMP4_INFOREQUEST
+   || icmp->icmp_type == ICMP4_TIMESTAMP;
+}
+
+static bool
+icmp6_valid_new(struct dp_packet *pkt)
+{
+struct icmp6_header *icmp6 = dp_packet_l4(pkt);
+
+return icmp6->icmp6_type == ICMP6_ECHO_REQUEST;
+}
+
+static struct conn *
+icmp_new_conn(struct conntrack_bucket *ctb, struct dp_packet *pkt OVS_UNUSED,
+   long long now)
+{
+struct conn_icmp *conn;
+
+conn = xzalloc(sizeof *conn);
+conn->state = ICMPS_FIRST;
+
+conn_init_expiration(ctb, &conn->up, icmp_timeouts[conn->state], now);
+
+return &conn->up;
+}
+
+struct ct_l4_proto ct_proto_icmp4 = {
+.new_conn = icmp_new_conn,
+.valid_new = icmp4_valid_new,
+.conn_update = icmp_conn_update,
+};
+
+struct ct_l4_proto ct_proto_icmp6 = {
+.new_conn = icmp_new_conn,
+.valid_new = icmp6_valid_new,
+.conn_update = icmp_conn_update,
+};
diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index df32525..013f19f 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -40,7 +40,14 @@ struct ct_addr {
 
 struct ct_endpoint {
 struct ct_addr addr;
-ovs_be16 port;
+union {
+

[ovs-dev] [PATCH v5 06/16] tests: Add very simple conntrack benchmark.

2016-07-26 Thread Daniele Di Proietto
This introduces a very limited but simple benchmark for
conntrack_execute(). It repeatedly sends the same batch of packets
through the connection tracker and reports the time spent processing
them.

While this is not a realistic benchmark, it has proven useful during
development to evaluate different batching and locking strategies.

E.g. the line:

`./tests/ovstest test-conntrack benchmark 1 1488 32`

starts 1 thread that will send 1488 packets to the connection
tracker, 32 at a time. It will print the time taken to process them.
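
As a rough illustration of what "32 at a time" means for the example
above, here is a small sketch that counts how many batches a loop shaped
like the one in ct_thread_main() submits. The helper name count_batches()
is made up for this note; it is not part of the patch. Since every batch
is a full batch, 1488 packets at 32 per batch appears to work out to 47
calls (1504 packets), not exactly 1488.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many batches a loop of the form
 * "for (i = 0; i < n_pkts; i += batch_size)" submits.  The last batch is
 * still a full batch, so the total number of packets handed to the
 * connection tracker can exceed n_pkts. */
size_t
count_batches(size_t n_pkts, size_t batch_size)
{
    size_t i, n_batches = 0;

    assert(batch_size > 0);
    for (i = 0; i < n_pkts; i += batch_size) {
        n_batches++;
    }
    return n_batches;
}
```

Calling count_batches(1488, 32) gives the number of conntrack_execute()
calls one thread would make for the command line shown above.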

Signed-off-by: Daniele Di Proietto 
Acked-by: Flavio Leitner 
---
 tests/automake.mk  |   1 +
 tests/test-conntrack.c | 172 +
 2 files changed, 173 insertions(+)
 create mode 100644 tests/test-conntrack.c

diff --git a/tests/automake.mk b/tests/automake.mk
index 575ffeb..a9ebf91 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -328,6 +328,7 @@ tests_ovstest_SOURCES = \
tests/test-classifier.c \
tests/test-ccmap.c \
tests/test-cmap.c \
+   tests/test-conntrack.c \
tests/test-csum.c \
tests/test-flows.c \
tests/test-hash.c \
diff --git a/tests/test-conntrack.c b/tests/test-conntrack.c
new file mode 100644
index 000..37c7277
--- /dev/null
+++ b/tests/test-conntrack.c
@@ -0,0 +1,172 @@
+/*
+ * Copyright (c) 2015 Nicira, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include "conntrack.h"
+
+#include "dp-packet.h"
+#include "fatal-signal.h"
+#include "flow.h"
+#include "netdev.h"
+#include "ovs-thread.h"
+#include "ovstest.h"
+#include "timeval.h"
+
+static const char payload[] = "5054000a505400090800451c00"
+  "11a4cd0a0101010a010102000100020008";
+
+static struct dp_packet_batch *
+prepare_packets(size_t n, bool change, unsigned tid)
+{
+struct dp_packet_batch *pkt_batch = xzalloc(sizeof *pkt_batch);
+struct flow flow;
+size_t i;
+
+ovs_assert(n <= ARRAY_SIZE(pkt_batch->packets));
+
+dp_packet_batch_init(pkt_batch);
+pkt_batch->count = n;
+
+for (i = 0; i < n; i++) {
+struct udp_header *udp;
+struct dp_packet *pkt = dp_packet_new(sizeof payload/2);
+
+dp_packet_put_hex(pkt, payload, NULL);
+flow_extract(pkt, &flow);
+
+udp = dp_packet_l4(pkt);
+udp->udp_src = htons(ntohs(udp->udp_src) + tid);
+
+if (change) {
+udp->udp_dst = htons(ntohs(udp->udp_dst) + i);
+}
+
+pkt_batch->packets[i] = pkt;
+}
+
+return pkt_batch;
+}
+
+static void
+destroy_packets(struct dp_packet_batch *pkt_batch)
+{
+dp_packet_delete_batch(pkt_batch, true);
+free(pkt_batch);
+}
+
+struct thread_aux {
+pthread_t thread;
+unsigned tid;
+};
+
+static struct conntrack ct;
+static unsigned long n_threads, n_pkts, batch_size;
+static bool change_conn = false;
+static struct ovs_barrier barrier;
+
+static void *
+ct_thread_main(void *aux_)
+{
+struct thread_aux *aux = aux_;
+struct dp_packet_batch *pkt_batch;
+size_t i;
+
+pkt_batch = prepare_packets(batch_size, change_conn, aux->tid);
+ovs_barrier_block(&barrier);
+for (i = 0; i < n_pkts; i += batch_size) {
+conntrack_execute(&ct, pkt_batch, true, 0, NULL, NULL, NULL);
+}
+ovs_barrier_block(&barrier);
+destroy_packets(pkt_batch);
+
+return NULL;
+}
+
+static void
+test_benchmark(struct ovs_cmdl_context *ctx)
+{
+struct thread_aux *threads;
+long long start;
+unsigned i;
+
+fatal_signal_init();
+
+/* Parse arguments */
+n_threads = strtoul(ctx->argv[1], NULL, 0);
+if (!n_threads) {
+ovs_fatal(0, "n_threads must be at least one");
+}
+n_pkts = strtoul(ctx->argv[2], NULL, 0);
+batch_size = strtoul(ctx->argv[3], NULL, 0);
+if (batch_size == 0 || batch_size > NETDEV_MAX_BURST) {
+ovs_fatal(0, "batch_size must be between 1 and NETDEV_MAX_BURST(%u)",
+  NETDEV_MAX_BURST);
+}
+if (ctx->argc > 4) {
+change_conn = strtoul(ctx->argv[4], NULL, 0);
+}
+
+threads = xcalloc(n_threads, sizeof *threads);
+ovs_barrier_init(&barrier, n_threads + 1);
+conntrack_init(&ct);
+
+/* Create threads */
+for (i = 0; i < n_threads; i++) {
+threads[i].tid = i;
+threads[i].thread = ovs_thread_create("ct_thread", ct_thread_main,
+   

[ovs-dev] [PATCH v5 16/16] conntrack: Add 'dl_type' parameter to conntrack_execute().

2016-07-26 Thread Daniele Di Proietto
Now that dpif_execute has a 'flow' member, it's pretty easy to access
the flow (or the matching megaflow) in dp_execute_cb().

This means that it's no longer necessary for the connection tracker to
reextract 'dl_type' from the packet; it can be passed as a parameter.

This change means that we have to slightly complicate test-conntrack to
group the packets by dl_type before passing them to the connection
tracker.
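
One possible way to do the grouping is sketched below. This is only an
illustration: struct toy_pkt and group_by_dl_type() are hypothetical
names, and the approach (splitting the batch into runs of consecutive
packets with the same dl_type) is an assumption, not necessarily what
the test-conntrack change in this patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet: only the Ethernet type matters here. */
struct toy_pkt {
    uint16_t dl_type;
};

/* Splits 'pkts' into runs of consecutive packets that share a dl_type and
 * stores the length of each run in 'run_len' (which must have room for 'n'
 * entries).  Returns the number of runs.  Each run could then be handed to
 * the connection tracker in a single call with that run's dl_type. */
size_t
group_by_dl_type(const struct toy_pkt *pkts, size_t n, size_t *run_len)
{
    size_t n_runs = 0;
    size_t i;

    assert(pkts != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (i > 0 && pkts[i].dl_type == pkts[i - 1].dl_type) {
            run_len[n_runs - 1]++;
        } else {
            run_len[n_runs++] = 1;
        }
    }
    return n_runs;
}
```

Grouping only consecutive packets keeps the original packet order, at
the cost of more calls when the types are interleaved.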

Signed-off-by: Daniele Di Proietto 
Acked-by: Joe Stringer 
---
 lib/conntrack.c| 47 ++---
 lib/conntrack.h|  3 ++-
 lib/dpif-netdev.c  | 21 ++--
 tests/test-conntrack.c | 52 +++---
 4 files changed, 77 insertions(+), 46 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 8e6c826..6ef9114 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -53,7 +53,8 @@ struct conn_lookup_ctx {
 };
 
 static bool conn_key_extract(struct conntrack *, struct dp_packet *,
- struct conn_lookup_ctx *, uint16_t zone);
+ ovs_be16 dl_type, struct conn_lookup_ctx *,
+ uint16_t zone);
 static uint32_t conn_key_hash(const struct conn_key *, uint32_t basis);
 static void conn_key_reverse(struct conn_key *);
 static void conn_key_lookup(struct conntrack_bucket *ctb,
@@ -265,7 +266,8 @@ process_one(struct conntrack *ct, struct dp_packet *pkt,
  * 'setlabel' behaves similarly for the connection label.*/
 int
 conntrack_execute(struct conntrack *ct, struct dp_packet_batch *pkt_batch,
-  bool commit, uint16_t zone, const uint32_t *setmark,
+  ovs_be16 dl_type, bool commit, uint16_t zone,
+  const uint32_t *setmark,
   const struct ovs_key_ct_labels *setlabel,
   const char *helper)
 {
@@ -299,7 +301,7 @@ conntrack_execute(struct conntrack *ct, struct 
dp_packet_batch *pkt_batch,
 for (i = 0; i < cnt; i++) {
 unsigned bucket;
 
-if (!conn_key_extract(ct, pkts[i], [i], zone)) {
+if (!conn_key_extract(ct, pkts[i], dl_type, [i], zone)) {
 write_ct_md(pkts[i], CS_INVALID, zone, 0, OVS_U128_ZERO);
 continue;
 }
@@ -917,7 +919,7 @@ extract_l4(struct conn_key *key, const void *data, size_t 
size, bool *related,
 }
 
 static bool
-conn_key_extract(struct conntrack *ct, struct dp_packet *pkt,
+conn_key_extract(struct conntrack *ct, struct dp_packet *pkt, ovs_be16 dl_type,
  struct conn_lookup_ctx *ctx, uint16_t zone)
 {
 const struct eth_header *l2 = dp_packet_l2(pkt);
@@ -941,43 +943,32 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt,
  *We already have the l3 and l4 headers' pointers.  Extracting
  *the l3 addresses and the l4 ports is really cheap, since they
  *can be found at fixed locations.
- * 2) To extract the l3 and l4 types.
- *Extracting the l3 and l4 types (especially the l3[1]) on the
- *other hand is quite expensive, because they're not at a
- *fixed location.
+ * 2) To extract the l4 type.
+ *Extracting the l4 types, for IPv6 can be quite expensive, because
+ *it's not at a fixed location.
  *
  * Here's a way to avoid (2) with the help of the datapath.
- * The datapath doesn't keep the packet's extracted flow[2], so
+ * The datapath doesn't keep the packet's extracted flow[1], so
  * using that is not an option.  We could use the packet's matching
- * megaflow for l3 type (it's always unwildcarded), and for l4 type
- * (we have to unwildcard it first).  This means either:
+ * megaflow, but we have to make sure that the l4 type (nw_proto)
+ * is unwildcarded.  This means either:
  *
- * a) dpif-netdev passes the matching megaflow to dp_execute_cb(), which
- *is used to extract the l3 type.  Unfortunately, dp_execute_cb() is
- *used also in dpif_netdev_execute(), which doesn't have a matching
- *megaflow.
+ * a) dpif-netdev unwildcards the l4 type when a new flow is installed
+ *if the actions contains ct().
  *
- * b) We define an alternative OVS_ACTION_ATTR_CT, used only by the
- *userspace datapath, which includes l3 (and l4) type.  The
- *alternative action could be generated by ofproto-dpif specifically
- *for the userspace datapath. Having a different interface for
- *userspace and kernel doesn't seem very clean, though.
+ * b) ofproto-dpif-xlate unwildcards the l4 type when translating a ct()
+ *action.  This is already done in different actions, but it's
+ *unnecessary for the kernel.
  *
  * ---
- * [1] A simple benchmark (running only the connection tracker
- * over and over on the same packets) shows that if the
- * l3 type is already provided we 

[ovs-dev] [PATCH v5 13/16] system-tests: Run conntrack tests with userspace.

2016-07-26 Thread Daniele Di Proietto
The userspace connection tracker doesn't support ALGs, frag reassembly
or NAT yet, so skip those tests.

Also, connection tracking state input from a local port is not possible
in userspace.

The userspace datapath pads all frames with 0, to make them at
least 64 bytes.

Finally, the userspace datapath checks for the IPv4 header checksum, so
fix those in the hardcoded packets.
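
For anyone fixing up hardcoded packets by hand, the IPv4 header checksum
is the usual Internet checksum from RFC 1071. The following is a
minimal standalone sketch; toy_ipv4_csum() is a made-up name and this is
not the csum() code OVS uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Computes the Internet checksum (RFC 1071) over an IPv4 header of 'len'
 * bytes.  The checksum field (bytes 10 and 11) must be zero on input.
 * The result is in host byte order. */
uint16_t
toy_ipv4_csum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(hdr != NULL && len % 2 == 0);
    for (i = 0; i < len; i += 2) {
        sum += (uint32_t) (hdr[i] << 8 | hdr[i + 1]);
    }
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) ~sum;
}
```

The result goes into bytes 10 and 11 of the header, most significant
byte first.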

Signed-off-by: Daniele Di Proietto 
Acked-by: Joe Stringer 
Acked-by: Flavio Leitner 
---
 tests/system-kmod-macros.at  | 28 +
 tests/system-ovn.at  | 10 +---
 tests/system-traffic.at  | 54 +---
 tests/system-userspace-macros.at | 45 ++---
 4 files changed, 116 insertions(+), 21 deletions(-)

diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
index 2134db7..e1b5707 100644
--- a/tests/system-kmod-macros.at
+++ b/tests/system-kmod-macros.at
@@ -67,3 +67,31 @@ m4_define([CHECK_CONNTRACK],
  on_exit 'ovstest test-netlink-conntrack flush'
 ]
 )
+
+# CHECK_CONNTRACK_ALG()
+#
+# Perform requirements checks for running conntrack ALG tests. The kernel
+# supports ALG, so no check is needed.
+#
+m4_define([CHECK_CONNTRACK_ALG])
+
+# CHECK_CONNTRACK_FRAG()
+#
+# Perform requirements checks for running conntrack fragmentations tests.
+# The kernel always supports fragmentation, so no check is needed.
+m4_define([CHECK_CONNTRACK_FRAG])
+
+# CHECK_CONNTRACK_LOCAL_STACK()
+#
+# Perform requirements checks for running conntrack tests with local stack.
+# The kernel always supports reading the connection state of an skb coming
+# from an internal port, without an explicit ct() action, so no check is
+# needed.
+m4_define([CHECK_CONNTRACK_LOCAL_STACK])
+
+# CHECK_CONNTRACK_NAT()
+#
+# Perform requirements checks for running conntrack NAT tests. The kernel
+# always supports NAT, so no check is needed.
+#
+m4_define([CHECK_CONNTRACK_NAT])
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 13f380f..c043f74 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -2,6 +2,7 @@ AT_SETUP([ovn -- 2 LRs connected via LS, gateway router, NAT])
 AT_KEYWORDS([ovnnat])
 
 CHECK_CONNTRACK()
+CHECK_CONNTRACK_NAT()
 ovn_start
 OVS_TRAFFIC_VSWITCHD_START()
 ADD_BR([br-int])
@@ -111,7 +112,7 @@ NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 30.0.0.2 
| FORMAT_PING], \
 # Check conntrack entries.
 AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(172.16.1.2) | \
 sed -e 's/zone=[[0-9]]*/zone=/'], [0], [dnl
-icmp,orig=(src=172.16.1.2,dst=30.0.0.2,id=),reply=(src=192.168.1.2,dst=172.16.1.2,id=),zone=
+icmp,orig=(src=172.16.1.2,dst=30.0.0.2,id=,type=8,code=0),reply=(src=192.168.1.2,dst=172.16.1.2,id=,type=0,code=0),zone=
 ])
 
 # South-North SNAT: 'bar1' pings 'alice1'. But 'alice1' receives traffic
@@ -124,7 +125,7 @@ NS_CHECK_EXEC([bar1], [ping -q -c 3 -i 0.3 -w 2 172.16.1.2 
| FORMAT_PING], \
 # We verify that SNAT indeed happened via 'dump-conntrack' command.
 AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | \
 sed -e 's/zone=[[0-9]]*/zone=/'], [0], [dnl
-icmp,orig=(src=192.168.2.2,dst=172.16.1.2,id=),reply=(src=172.16.1.2,dst=30.0.0.1,id=),zone=
+icmp,orig=(src=192.168.2.2,dst=172.16.1.2,id=,type=8,code=0),reply=(src=172.16.1.2,dst=30.0.0.1,id=,type=0,code=0),zone=
 ])
 
 # Add static routes to handle east-west NAT.
@@ -143,14 +144,14 @@ NS_CHECK_EXEC([bar1], [ping -q -c 3 -i 0.3 -w 2 30.0.0.2 
| FORMAT_PING], \
 # 30.0.0.2 to R2, it hits the DNAT rule and converts 30.0.0.2 to 192.168.1.2
 AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.2) | \
 sed -e 's/zone=[[0-9]]*/zone=/'], [0], [dnl
-icmp,orig=(src=192.168.2.2,dst=30.0.0.2,id=),reply=(src=192.168.1.2,dst=192.168.2.2,id=),zone=
+icmp,orig=(src=192.168.2.2,dst=30.0.0.2,id=,type=8,code=0),reply=(src=192.168.1.2,dst=192.168.2.2,id=,type=0,code=0),zone=
 ])
 
 # As we have a SNAT rule that converts 192.168.2.2 to 30.0.0.1, the source is
 # SNATted and 'foo1' receives it.
 AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | \
 sed -e 's/zone=[[0-9]]*/zone=/'], [0], [dnl
-icmp,orig=(src=192.168.2.2,dst=192.168.1.2,id=),reply=(src=192.168.1.2,dst=30.0.0.1,id=),zone=
+icmp,orig=(src=192.168.2.2,dst=192.168.1.2,id=,type=8,code=0),reply=(src=192.168.1.2,dst=30.0.0.1,id=,type=0,code=0),zone=
 ])
 
 OVS_APP_EXIT_AND_WAIT([ovn-controller])
@@ -173,6 +174,7 @@ AT_SETUP([ovn -- load-balancing])
 AT_KEYWORDS([ovnlb])
 
 CHECK_CONNTRACK()
+CHECK_CONNTRACK_NAT()
 ovn_start
 OVS_TRAFFIC_VSWITCHD_START()
 ADD_BR([br-int])
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index a337950..0b4b4b7 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -510,13 +510,13 @@ AT_CAPTURE_FILE([ofctl_monitor.log])
 AT_CHECK([ovs-ofctl monitor br0 65534 invalid_ttl --detach --no-chdir 
--pidfile 2> ofctl_monitor.log])
 
 dnl Send an unsolicited reply from 

[ovs-dev] [PATCH v5 03/16] flow: Introduce parse_dl_type().

2016-07-26 Thread Daniele Di Proietto
The function simply returns the Ethernet type of the packet (after
discarding the VLAN tag, if there is one).  It will be used by a
following commit.
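
To make the idea concrete, here is a simplified standalone sketch of
reading the Ethernet type while skipping one 802.1Q tag. It is not the
OVS implementation: toy_parse_dl_type() is a hypothetical name, it
returns the type in host byte order (the real function returns an
ovs_be16), and it ignores LLC/SNAP encapsulation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_TYPE_VLAN 0x8100   /* 802.1Q tag protocol identifier. */

/* Returns the Ethernet type of 'frame' in host byte order, skipping one
 * 802.1Q VLAN tag if present.  Returns 0 if the frame is too short.
 * Simplification: LLC/SNAP encapsulation is not handled. */
uint16_t
toy_parse_dl_type(const uint8_t *frame, size_t size)
{
    size_t ofs = 12;            /* Skip destination and source MACs. */
    uint16_t type;

    assert(frame != NULL);
    if (size < ofs + 2) {
        return 0;
    }
    type = (uint16_t) (frame[ofs] << 8 | frame[ofs + 1]);
    if (type == TOY_ETH_TYPE_VLAN) {
        ofs += 4;               /* Skip TPID and TCI. */
        if (size < ofs + 2) {
            return 0;
        }
        type = (uint16_t) (frame[ofs] << 8 | frame[ofs + 1]);
    }
    return type;
}
```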

Signed-off-by: Daniele Di Proietto 
Acked-by: Flavio Leitner 
---
 lib/flow.c | 14 --
 lib/flow.h |  1 +
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/lib/flow.c b/lib/flow.c
index f94b1f2..8cf707b 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -328,7 +328,7 @@ parse_mpls(const void **datap, size_t *sizep)
 return MIN(count, FLOW_MAX_MPLS_LABELS);
 }
 
-static inline ovs_be16
+static inline ALWAYS_INLINE ovs_be16
 parse_vlan(const void **datap, size_t *sizep)
 {
 const struct eth_header *eth = *datap;
@@ -350,7 +350,7 @@ parse_vlan(const void **datap, size_t *sizep)
 return 0;
 }
 
-static inline ovs_be16
+static inline ALWAYS_INLINE ovs_be16
 parse_ethertype(const void **datap, size_t *sizep)
 {
 const struct llc_snap_header *llc;
@@ -827,6 +827,16 @@ miniflow_extract(struct dp_packet *packet, struct miniflow 
*dst)
 dst->map = mf.map;
 }
 
+ovs_be16
+parse_dl_type(const struct eth_header *data_, size_t size)
+{
+const void *data = data_;
+
+parse_vlan(&data, &size);
+
+return parse_ethertype(&data, &size);
+}
+
 /* For every bit of a field that is wildcarded in 'wildcards', sets the
  * corresponding bit in 'flow' to zero. */
 void
diff --git a/lib/flow.h b/lib/flow.h
index c041e8a..fd9c712 100644
--- a/lib/flow.h
+++ b/lib/flow.h
@@ -108,6 +108,7 @@ void flow_compose(struct dp_packet *, const struct flow *);
 
 bool parse_ipv6_ext_hdrs(const void **datap, size_t *sizep, uint8_t *nw_proto,
  uint8_t *nw_frag);
+ovs_be16 parse_dl_type(const struct eth_header *data_, size_t size);
 
 static inline uint64_t
 flow_get_xreg(const struct flow *flow, int idx)
-- 
2.8.1



[ovs-dev] [PATCH v5 14/16] system-tests: Add ping through conntrack test.

2016-07-26 Thread Daniele Di Proietto
Signed-off-by: Daniele Di Proietto 
Acked-by: Joe Stringer 
---
 tests/system-traffic.at | 84 +
 1 file changed, 84 insertions(+)

diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 0b4b4b7..fd8b918 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -608,6 +608,90 @@ NS_CHECK_EXEC([at_ns1], [wget http://[[fc00::1]] -t 3 -T 1 
-v -o wget1.log], [4]
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([conntrack - IPv4 ping])
+CHECK_CONNTRACK()
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
+ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
+
+dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from 
ns1->ns0.
+AT_DATA([flows.txt], [dnl
+priority=1,action=drop
+priority=10,arp,action=normal
+priority=100,in_port=1,icmp,action=ct(commit),2
+priority=100,in_port=2,icmp,ct_state=-trk,action=ct(table=0)
+priority=100,in_port=2,icmp,ct_state=+trk+est,action=1
+])
+
+AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
+
+dnl Pings from ns0->ns1 should work fine.
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], 
[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2)], [0], [dnl
+icmp,orig=(src=10.1.1.1,dst=10.1.1.2,id=),reply=(src=10.1.1.2,dst=10.1.1.1,id=)
+])
+
+AT_CHECK([ovs-appctl dpctl/flush-conntrack])
+
+dnl Pings from ns1->ns0 should fail.
+NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], 
[0], [dnl
+7 packets transmitted, 0 received, 100% packet loss, time 0ms
+])
+
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
+
+AT_SETUP([conntrack - IPv6 ping])
+CHECK_CONNTRACK()
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "fc00::1/96")
+ADD_VETH(p1, at_ns1, br0, "fc00::2/96")
+
+AT_DATA([flows.txt], [dnl
+
+dnl ICMPv6 echo request and reply go to table 1.  The rest of the traffic goes
+dnl through normal action.
+table=0,priority=10,icmp6,icmp_type=128,action=goto_table:1
+table=0,priority=10,icmp6,icmp_type=129,action=goto_table:1
+table=0,priority=1,action=normal
+
+dnl Allow everything from ns0->ns1. Only allow return traffic from ns1->ns0.
+table=1,priority=100,in_port=1,icmp6,action=ct(commit),2
+table=1,priority=100,in_port=2,icmp6,ct_state=-trk,action=ct(table=0)
+table=1,priority=100,in_port=2,icmp6,ct_state=+trk+est,action=1
+table=1,priority=1,action=drop
+])
+
+AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
+
+OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 1 fc00::2])
+
+dnl Pings from ns1->ns0 should fail.
+NS_CHECK_EXEC([at_ns1], [ping6 -q -c 3 -i 0.3 -w 2 fc00::1 | FORMAT_PING], 
[0], [dnl
+7 packets transmitted, 0 received, 100% packet loss, time 0ms
+])
+
+dnl Pings from ns0->ns1 should work fine.
+NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 2 fc00::2 | FORMAT_PING], 
[0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fc00::2)], [0], [dnl
+icmpv6,orig=(src=fc00::1,dst=fc00::2,id=),reply=(src=fc00::2,dst=fc00::1,id=)
+])
+
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
+
 AT_SETUP([conntrack - commit, recirc])
 CHECK_CONNTRACK()
 OVS_TRAFFIC_VSWITCHD_START()
-- 
2.8.1



[ovs-dev] [PATCH v5 09/16] dpif-netdev: Implement conntrack dump functions.

2016-07-26 Thread Daniele Di Proietto
New functions are implemented in the conntrack module to support this.

Signed-off-by: Daniele Di Proietto 
Acked-by: Flavio Leitner 
---
 lib/conntrack-private.h |   3 ++
 lib/conntrack-tcp.c |  34 +
 lib/conntrack.c | 123 
 lib/conntrack.h |  15 ++
 lib/dpif-netdev.c   |  60 +--
 5 files changed, 232 insertions(+), 3 deletions(-)

diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index 5aac938..df32525 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -22,6 +22,7 @@
 #include 
 
 #include "conntrack.h"
+#include "ct-dpif.h"
 #include "openvswitch/hmap.h"
 #include "openvswitch/list.h"
 #include "openvswitch/types.h"
@@ -76,6 +77,8 @@ struct ct_l4_proto {
   struct conntrack_bucket *,
   struct dp_packet *pkt, bool reply,
   long long now);
+void (*conn_get_protoinfo)(const struct conn *,
+   struct ct_dpif_protoinfo *);
 };
 
 extern struct ct_l4_proto ct_proto_tcp;
diff --git a/lib/conntrack-tcp.c b/lib/conntrack-tcp.c
index 7edcce3..ea22400 100644
--- a/lib/conntrack-tcp.c
+++ b/lib/conntrack-tcp.c
@@ -457,8 +457,42 @@ tcp_new_conn(struct conntrack_bucket *ctb, struct 
dp_packet *pkt,
 return >up;
 }
 
+static uint8_t
+tcp_peer_to_protoinfo_flags(const struct tcp_peer *peer)
+{
+uint8_t res = 0;
+
+if (peer->wscale & CT_WSCALE_FLAG) {
+res |= CT_DPIF_TCPF_WINDOW_SCALE;
+}
+
+if (peer->wscale & CT_WSCALE_UNKNOWN) {
+res |= CT_DPIF_TCPF_BE_LIBERAL;
+}
+
+return res;
+}
+
+static void
+tcp_conn_get_protoinfo(const struct conn *conn_,
+   struct ct_dpif_protoinfo *protoinfo)
+{
+const struct conn_tcp *conn = conn_tcp_cast(conn_);
+
+protoinfo->proto = IPPROTO_TCP;
+protoinfo->tcp.state_orig = conn->peer[0].state;
+protoinfo->tcp.state_reply = conn->peer[1].state;
+
+protoinfo->tcp.wscale_orig = conn->peer[0].wscale & CT_WSCALE_MASK;
+protoinfo->tcp.wscale_reply = conn->peer[1].wscale & CT_WSCALE_MASK;
+
+protoinfo->tcp.flags_orig = tcp_peer_to_protoinfo_flags(&conn->peer[0]);
+protoinfo->tcp.flags_reply = tcp_peer_to_protoinfo_flags(&conn->peer[1]);
+}
+
 struct ct_l4_proto ct_proto_tcp = {
 .new_conn = tcp_new_conn,
 .valid_new = tcp_valid_new,
 .conn_update = tcp_conn_update,
+.conn_get_protoinfo = tcp_conn_get_protoinfo,
 };
diff --git a/lib/conntrack.c b/lib/conntrack.c
index 094a230..47214e1 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -26,6 +26,7 @@
 #include "conntrack-private.h"
 #include "coverage.h"
 #include "csum.h"
+#include "ct-dpif.h"
 #include "dp-packet.h"
 #include "flow.h"
 #include "netdev.h"
@@ -1050,3 +1051,125 @@ delete_conn(struct conn *conn)
 {
 free(conn);
 }
+
+static void
+ct_endpoint_to_ct_dpif_inet_addr(const struct ct_addr *a,
+ union ct_dpif_inet_addr *b,
+ ovs_be16 dl_type)
+{
+if (dl_type == htons(ETH_TYPE_IP)) {
+b->ip = a->ipv4_aligned;
+} else if (dl_type == htons(ETH_TYPE_IPV6)){
+b->in6 = a->ipv6_aligned;
+}
+}
+
+static void
+conn_key_to_tuple(const struct conn_key *key, struct ct_dpif_tuple *tuple)
+{
+if (key->dl_type == htons(ETH_TYPE_IP)) {
+tuple->l3_type = AF_INET;
+} else if (key->dl_type == htons(ETH_TYPE_IPV6)) {
+tuple->l3_type = AF_INET6;
+}
+tuple->ip_proto = key->nw_proto;
+ct_endpoint_to_ct_dpif_inet_addr(&key->src.addr, &tuple->src,
+ key->dl_type);
+ct_endpoint_to_ct_dpif_inet_addr(&key->dst.addr, &tuple->dst,
+ key->dl_type);
+
+if (key->nw_proto == IPPROTO_ICMP || key->nw_proto == IPPROTO_ICMPV6) {
+tuple->icmp_id = key->src.port;
+/* ICMP type and code are not tracked */
+tuple->icmp_type = 0;
+tuple->icmp_code = 0;
+} else {
+tuple->src_port = key->src.port;
+tuple->dst_port = key->dst.port;
+}
+}
+
+static void
+conn_to_ct_dpif_entry(const struct conn *conn, struct ct_dpif_entry *entry,
+  long long now)
+{
+struct ct_l4_proto *class;
+long long expiration;
+memset(entry, 0, sizeof *entry);
+conn_key_to_tuple(&conn->key, &entry->tuple_orig);
+conn_key_to_tuple(&conn->rev_key, &entry->tuple_reply);
+
+entry->zone = conn->key.zone;
+entry->mark = conn->mark;
+
+memcpy(&entry->labels, &conn->label, sizeof(entry->labels));
+/* Not implemented yet */
+entry->timestamp.start = 0;
+entry->timestamp.stop = 0;
+
+expiration = conn->expiration - now;
+entry->timeout = (expiration > 0) ? expiration / 1000 : 0;
+
+class = l4_protos[conn->key.nw_proto];
+if (class->conn_get_protoinfo) {
+class->conn_get_protoinfo(conn, &entry->protoinfo);
+}
+}
+

[ovs-dev] [PATCH v5 11/16] flow: Generate checksum and udp_len in flow_compose().

2016-07-26 Thread Daniele Di Proietto
This is useful to test the connection tracker, which performs checksum
and udp length verification.
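
For reference, the sketch below shows the kind of check this enables:
a UDP checksum computed over the IPv4 pseudo-header plus the UDP header
and payload, with the UDP length taken from the datagram size. The
toy_* helpers are hypothetical names written for this note; they only
mirror the idea behind csum_continue()/csum_finish() and are not the
OVS functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds 'len' bytes at 'data' to a running ones' complement sum.  An odd
 * trailing byte is padded with zero, as RFC 1071 describes. */
uint32_t
toy_csum_add(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += (uint32_t) (data[i] << 8 | data[i + 1]);
    }
    if (len & 1) {
        sum += (uint32_t) data[len - 1] << 8;
    }
    return sum;
}

/* Folds the running sum to 16 bits and complements it. */
uint16_t
toy_csum_finish(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) ~sum;
}

/* Computes the UDP checksum over the IPv4 pseudo-header (source,
 * destination, protocol 17, UDP length) followed by the UDP header and
 * payload in 'udp'.  Running it on a datagram whose checksum field is
 * already filled in yields 0. */
uint16_t
toy_udp_csum(const uint8_t src[4], const uint8_t dst[4],
             const uint8_t *udp, size_t udp_len)
{
    uint8_t pseudo[12];
    uint32_t sum;
    size_t i;

    assert(udp_len >= 8 && udp_len <= 0xffff);
    for (i = 0; i < 4; i++) {
        pseudo[i] = src[i];
        pseudo[4 + i] = dst[i];
    }
    pseudo[8] = 0;
    pseudo[9] = 17;                         /* IPPROTO_UDP. */
    pseudo[10] = (uint8_t) (udp_len >> 8);
    pseudo[11] = (uint8_t) udp_len;

    sum = toy_csum_add(0, pseudo, sizeof pseudo);
    return toy_csum_finish(toy_csum_add(sum, udp, udp_len));
}
```

A receiver can verify a datagram by running the same computation with
the checksum field left in place and checking for a result of 0.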

Signed-off-by: Daniele Di Proietto 
Acked-by: Joe Stringer 
---
 lib/flow.c|  62 ++--
 tests/ofproto-dpif.at | 198 +-
 2 files changed, 153 insertions(+), 107 deletions(-)

diff --git a/lib/flow.c b/lib/flow.c
index 8cf707b..ba4f8c7 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -2238,6 +2238,7 @@ flow_compose_l4(struct dp_packet *p, const struct flow 
*flow)
 udp = dp_packet_put_zeros(p, l4_len);
 udp->udp_src = flow->tp_src;
 udp->udp_dst = flow->tp_dst;
+udp->udp_len = htons(l4_len);
 } else if (flow->nw_proto == IPPROTO_SCTP) {
 struct sctp_header *sctp;
 
@@ -2252,8 +2253,6 @@ flow_compose_l4(struct dp_packet *p, const struct flow 
*flow)
 icmp = dp_packet_put_zeros(p, l4_len);
 icmp->icmp_type = ntohs(flow->tp_src);
 icmp->icmp_code = ntohs(flow->tp_dst);
-/* Checksum has already been zeroed by put_zeros call. */
-icmp->icmp_csum = csum(icmp, ICMP_HEADER_LEN);
 } else if (flow->nw_proto == IPPROTO_IGMP) {
 struct igmp_header *igmp;
 
@@ -2262,8 +2261,6 @@ flow_compose_l4(struct dp_packet *p, const struct flow 
*flow)
 igmp->igmp_type = ntohs(flow->tp_src);
 igmp->igmp_code = ntohs(flow->tp_dst);
 put_16aligned_be32(&igmp->group, flow->igmp_group_ip4);
-/* Checksum has already been zeroed by put_zeros call. */
-igmp->igmp_csum = csum(igmp, IGMP_HEADER_LEN);
 } else if (flow->nw_proto == IPPROTO_ICMPV6) {
 struct icmp6_hdr *icmp;
 
@@ -2297,22 +2294,65 @@ flow_compose_l4(struct dp_packet *p, const struct flow 
*flow)
 nd_opt->nd_opt_mac = flow->arp_tha;
 }
 }
-icmp->icmp6_cksum = (OVS_FORCE uint16_t)
-csum(icmp, (char *)dp_packet_tail(p) - (char *)icmp);
 }
 }
 return l4_len;
 }
 
+static void
+flow_compose_l4_csum(struct dp_packet *p, const struct flow *flow,
+ uint32_t pseudo_hdr_csum)
+{
+size_t l4_len = (char *) dp_packet_tail(p) - (char *) dp_packet_l4(p);
+
+if (!(flow->nw_frag & FLOW_NW_FRAG_ANY)
+|| !(flow->nw_frag & FLOW_NW_FRAG_LATER)) {
+if (flow->nw_proto == IPPROTO_TCP) {
+struct tcp_header *tcp = dp_packet_l4(p);
+
+/* Checksum has already been zeroed by put_zeros call in
+ * flow_compose_l4(). */
+tcp->tcp_csum = csum_finish(csum_continue(pseudo_hdr_csum,
+  tcp, l4_len));
+} else if (flow->nw_proto == IPPROTO_UDP) {
+struct udp_header *udp = dp_packet_l4(p);
+
+/* Checksum has already been zeroed by put_zeros call in
+ * flow_compose_l4(). */
+udp->udp_csum = csum_finish(csum_continue(pseudo_hdr_csum,
+  udp, l4_len));
+} else if (flow->nw_proto == IPPROTO_ICMP) {
+struct icmp_header *icmp = dp_packet_l4(p);
+
+/* Checksum has already been zeroed by put_zeros call in
+ * flow_compose_l4(). */
+icmp->icmp_csum = csum(icmp, l4_len);
+} else if (flow->nw_proto == IPPROTO_IGMP) {
+struct igmp_header *igmp = dp_packet_l4(p);
+
+/* Checksum has already been zeroed by put_zeros call in
+ * flow_compose_l4(). */
+igmp->igmp_csum = csum(igmp, l4_len);
+} else if (flow->nw_proto == IPPROTO_ICMPV6) {
+struct icmp6_hdr *icmp = dp_packet_l4(p);
+
+/* Checksum has already been zeroed by put_zeros call in
+ * flow_compose_l4(). */
+icmp->icmp6_cksum = (OVS_FORCE uint16_t)
+csum_finish(csum_continue(pseudo_hdr_csum, icmp, l4_len));
+}
+}
+}
+
 /* Puts into 'b' a packet that flow_extract() would parse as having the given
  * 'flow'.
  *
  * (This is useful only for testing, obviously, and the packet isn't really
- * valid. It hasn't got some checksums filled in, for one, and lots of fields
- * are just zeroed.) */
+ * valid.  Lots of fields are just zeroed.) */
 void
 flow_compose(struct dp_packet *p, const struct flow *flow)
 {
+uint32_t pseudo_hdr_csum;
 size_t l4_len;
 
 /* eth_compose() sets l3 pointer and makes sure it is 32-bit aligned. */
@@ -2353,6 +2393,9 @@ flow_compose(struct dp_packet *p, const struct flow *flow)
 ip->ip_tot_len = htons(p->l4_ofs - p->l3_ofs + l4_len);
 /* Checksum has already been zeroed by put_zeros call. */
 ip->ip_csum = csum(ip, sizeof *ip);
+
+pseudo_hdr_csum = packet_csum_pseudoheader(ip);
+flow_compose_l4_csum(p, flow, pseudo_hdr_csum);
 } else if 

[ovs-dev] [PATCH v5 10/16] dpif-netdev: Implement conntrack flush interface.

2016-07-26 Thread Daniele Di Proietto
New functions are implemented in the conntrack module to support this.
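
The flush takes an optional zone: a NULL pointer means "flush
everything", otherwise only connections in that zone are removed. A
much simplified sketch of that behaviour on a plain linked list follows;
struct toy_conn, toy_conn_add() and toy_flush() are hypothetical and
leave out the buckets, locking and expiration lists of the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, much simplified connection entry. */
struct toy_conn {
    struct toy_conn *next;
    uint16_t zone;
};

/* Pushes a new connection in 'zone' onto the list at '*head'. */
void
toy_conn_add(struct toy_conn **head, uint16_t zone)
{
    struct toy_conn *conn = malloc(sizeof *conn);

    assert(conn != NULL);
    conn->zone = zone;
    conn->next = *head;
    *head = conn;
}

/* Removes every connection if 'zone' is NULL, otherwise only those whose
 * zone equals '*zone'.  Returns the number of connections removed. */
size_t
toy_flush(struct toy_conn **head, const uint16_t *zone)
{
    struct toy_conn **p = head;
    size_t n_removed = 0;

    while (*p) {
        struct toy_conn *conn = *p;

        if (!zone || *zone == conn->zone) {
            *p = conn->next;    /* Unlink before freeing. */
            free(conn);
            n_removed++;
        } else {
            p = &conn->next;
        }
    }
    return n_removed;
}
```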

Signed-off-by: Daniele Di Proietto 
Acked-by: Flavio Leitner 
---
 lib/conntrack.c   | 23 +++
 lib/conntrack.h   |  2 ++
 lib/dpif-netdev.c | 10 +-
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 47214e1..15a9582 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -1173,3 +1173,26 @@ conntrack_dump_done(struct conntrack_dump *dump 
OVS_UNUSED)
 {
 return 0;
 }
+
+int
+conntrack_flush(struct conntrack *ct, const uint16_t *zone)
+{
+unsigned i;
+
+for (i = 0; i < CONNTRACK_BUCKETS; i++) {
+struct conn *conn, *next;
+
+ct_lock_lock(&ct->buckets[i].lock);
+HMAP_FOR_EACH_SAFE(conn, next, node, &ct->buckets[i].connections) {
+if (!zone || *zone == conn->key.zone) {
+ovs_list_remove(&conn->exp_node);
+hmap_remove(&ct->buckets[i].connections, &conn->node);
+atomic_count_dec(&ct->n_conn);
+delete_conn(conn);
+}
+}
+ct_lock_unlock(&ct->buckets[i].lock);
+}
+
+return 0;
+}
diff --git a/lib/conntrack.h b/lib/conntrack.h
index 2f0680e..8802d35 100644
--- a/lib/conntrack.h
+++ b/lib/conntrack.h
@@ -83,6 +83,8 @@ int conntrack_dump_start(struct conntrack *, struct 
conntrack_dump *,
  const uint16_t *pzone);
 int conntrack_dump_next(struct conntrack_dump *, struct ct_dpif_entry *);
 int conntrack_dump_done(struct conntrack_dump *);
+
+int conntrack_flush(struct conntrack *, const uint16_t *zone);
 
 /* 'struct ct_lock' is a wrapper for an adaptive mutex.  It's useful to try
  * different types of locks (e.g. spinlocks) */
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 48861a2..5793995 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4309,6 +4309,14 @@ dpif_netdev_ct_dump_done(struct dpif *dpif OVS_UNUSED,
 return err;
 }
 
+static int
+dpif_netdev_ct_flush(struct dpif *dpif, const uint16_t *zone)
+{
+struct dp_netdev *dp = get_dp_netdev(dpif);
+
+return conntrack_flush(&dp->conntrack, zone);
+}
+
 const struct dpif_class dpif_netdev_class = {
 "netdev",
 dpif_netdev_init,
@@ -4352,7 +4360,7 @@ const struct dpif_class dpif_netdev_class = {
 dpif_netdev_ct_dump_start,
 dpif_netdev_ct_dump_next,
 dpif_netdev_ct_dump_done,
-NULL,   /* ct_flush */
+dpif_netdev_ct_flush,
 };
 
 static void
-- 
2.8.1



[ovs-dev] [PATCH v5 08/16] dpif-netdev: Execute conntrack action.

2016-07-26 Thread Daniele Di Proietto
This commit implements the OVS_ACTION_ATTR_CT action in dpif-netdev.

To allow ofproto-dpif to detect the conntrack feature, flow_put will no
longer discard flows with ct_* fields set. We still shouldn't allow
flows with NAT bits set, since there is no support for NAT.
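
The check amounts to testing a flow's ct_state against a mask of
supported bits. Below is a self-contained sketch of that test. The
TOY_CS_* bit values are illustrative assumptions (the real CS_* values
live in the OVS headers), and toy_ct_state_supported() is a hypothetical
helper, not a function from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit assignments; the real CS_* values live in the OVS
 * headers and are an assumption here. */
#define TOY_CS_NEW          0x01
#define TOY_CS_ESTABLISHED  0x02
#define TOY_CS_RELATED      0x04
#define TOY_CS_REPLY_DIR    0x08
#define TOY_CS_INVALID      0x10
#define TOY_CS_TRACKED      0x20
#define TOY_CS_SRC_NAT      0x40
#define TOY_CS_DST_NAT      0x80

#define TOY_CS_SUPPORTED_MASK (TOY_CS_NEW | TOY_CS_ESTABLISHED \
                               | TOY_CS_RELATED | TOY_CS_INVALID \
                               | TOY_CS_REPLY_DIR | TOY_CS_TRACKED)
#define TOY_CS_UNSUPPORTED_MASK (~(uint32_t) TOY_CS_SUPPORTED_MASK)

/* Returns true if a flow matching on 'ct_state' uses only supported bits,
 * i.e. none of the NAT bits and no unknown bits. */
bool
toy_ct_state_supported(uint32_t ct_state)
{
    return !(ct_state & TOY_CS_UNSUPPORTED_MASK);
}
```

Defining the unsupported mask as the complement of the supported one
means any bit added later is rejected until it is explicitly listed.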

Signed-off-by: Daniele Di Proietto 
Acked-by: Flavio Leitner 
Acked-by: Antonio Fischetti 
---
 FAQ.md|  2 +-
 NEWS  |  2 ++
 lib/dpif-netdev.c | 63 ---
 tests/dpif-netdev.at  | 16 ++---
 tests/ofproto-dpif.at | 24 ++--
 tests/pmd.at  |  2 +-
 6 files changed, 79 insertions(+), 30 deletions(-)

diff --git a/FAQ.md b/FAQ.md
index 35e1cac..cec420b 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -193,7 +193,7 @@ A: Open vSwitch supports different datapaths on different 
platforms.  Each
 Feature   | Linux upstream | Linux OVS tree | Userspace | Hyper-V |
 --|:--:|:--:|:-:|:---:|
 NAT   |  4.6   |   YES  |NO |   NO|
-Connection tracking   |  4.3   |   YES  |NO | PARTIAL |
+Connection tracking   |  4.3   |   YES  |  PARTIAL  | PARTIAL |
 Tunnel - LISP |  NO|   YES  |NO |   NO|
 Tunnel - STT  |  NO|   YES  |NO |   YES   |
 Tunnel - GRE  |  3.11  |   YES  |YES|   YES   |
diff --git a/NEWS b/NEWS
index 73d3fcf..39157b8 100644
--- a/NEWS
+++ b/NEWS
@@ -59,6 +59,8 @@ Post-v2.5.0
  * PMD threads servicing vHost User ports can now come from the NUMA
node that device memory is located on if CONFIG_RTE_LIBRTE_VHOST_NUMA
is enabled in DPDK.
+ * Basic connection tracking for the userspace datapath (no ALG,
+   fragmentation or NAT support yet)
- Increase number of registers to 16.
- ovs-benchmark: This utility has been removed due to lack of use and
  bitrot.
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index f05ca4e..c928ffe 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -33,6 +33,7 @@
 
 #include "bitmap.h"
 #include "cmap.h"
+#include "conntrack.h"
 #include "coverage.h"
 #include "csum.h"
 #include "dp-packet.h"
@@ -89,9 +90,17 @@ static struct shash dp_netdevs 
OVS_GUARDED_BY(dp_netdev_mutex)
 
 static struct vlog_rate_limit upcall_rl = VLOG_RATE_LIMIT_INIT(600, 600);
 
+#define DP_NETDEV_CS_SUPPORTED_MASK (CS_NEW | CS_ESTABLISHED | CS_RELATED \
+ | CS_INVALID | CS_REPLY_DIR | CS_TRACKED)
+#define DP_NETDEV_CS_UNSUPPORTED_MASK (~(uint32_t)DP_NETDEV_CS_SUPPORTED_MASK)
+
 static struct odp_support dp_netdev_support = {
 .max_mpls_depth = SIZE_MAX,
 .recirc = true,
+.ct_state = true,
+.ct_zone = true,
+.ct_mark = true,
+.ct_label = true,
 };
 
 /* Stores a miniflow with inline values */
@@ -228,6 +237,8 @@ struct dp_netdev {
 char *pmd_cmask;
 
 uint64_t last_tnl_conf_seq;
+
+struct conntrack conntrack;
 };
 
 static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev *dp,
@@ -934,6 +945,8 @@ create_dp_netdev(const char *name, const struct dpif_class 
*class,
 dp->upcall_aux = NULL;
 dp->upcall_cb = NULL;
 
+conntrack_init(&dp->conntrack);
+
 cmap_init(&dp->poll_threads);
 ovs_mutex_init_recursive(&dp->non_pmd_mutex);
 ovsthread_key_create(&dp->per_pmd_key, NULL);
@@ -1004,6 +1017,8 @@ dp_netdev_free(struct dp_netdev *dp)
 ovs_mutex_destroy(&dp->non_pmd_mutex);
 ovsthread_key_delete(dp->per_pmd_key);
 
+conntrack_destroy(&dp->conntrack);
+
 ovs_mutex_lock(&dp->port_mutex);
 HMAP_FOR_EACH_SAFE (port, next, node, &dp->ports) {
 do_del_port(dp, port);
@@ -2030,9 +2045,7 @@ dpif_netdev_flow_from_nlattrs(const struct nlattr *key, 
uint32_t key_len,
 return EINVAL;
 }
 
-/* Userspace datapath doesn't support conntrack. */
-if (flow->ct_state || flow->ct_zone || flow->ct_mark
-|| !ovs_u128_is_zero(flow->ct_label)) {
+if (flow->ct_state & DP_NETDEV_CS_UNSUPPORTED_MASK) {
 return EINVAL;
 }
 
@@ -4172,12 +4185,46 @@ dp_execute_cb(void *aux_, struct dp_packet_batch 
*packets_,
 VLOG_WARN("Packet dropped. Max recirculation depth exceeded.");
 break;
 
-case OVS_ACTION_ATTR_CT:
-/* If a flow with this action is slow-pathed, datapath assistance is
- * required to implement it. However, we don't support this action
- * in the userspace datapath. */
-VLOG_WARN("Cannot execute conntrack action in userspace.");
+case OVS_ACTION_ATTR_CT: {
+const struct nlattr *b;
+bool commit = false;
+unsigned int left;
+uint16_t zone = 0;
+const char *helper = NULL;
+const uint32_t *setmark = NULL;
+const struct ovs_key_ct_labels *setlabel 

[ovs-dev] [PATCH v5 05/16] conntrack: Periodically delete expired connections.

2016-07-26 Thread Daniele Di Proietto
This commit adds a thread that periodically removes expired connections.

The expiration time of a connection can be expressed by:

expiration = now + timeout

For each possible 'timeout' value (there aren't many) we keep a list.
When the expiration is updated, we move the connection to the back of the
corresponding 'timeout' list. This way, the list is always ordered by
'expiration'.

When the cleanup thread iterates through the lists for expired
connections, it can stop at the first non-expired connection.
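
For readers who want the idea in isolation, here is a minimal sketch of the
per-timeout lists described above. It is not the code from this patch: the
list helpers, the two timeout classes and their values, and the names
'struct bucket' and sweep_expired() are hypothetical stand-ins for
ovs_list, ct_timeout_val[], 'struct conntrack_bucket' and the cleanup
thread. It assumes 'now' never goes backwards.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the structures in the patch. */
enum timeout_class { TM_SHORT, TM_LONG, N_TIMEOUTS };

static const long long timeout_ms[N_TIMEOUTS] = { 1000, 30000 };

struct node {
    struct node *prev, *next;
};

struct conn {
    struct node exp_node;       /* Must stay the first member (see sweep). */
    long long expiration;
};

struct bucket {
    struct node exp_lists[N_TIMEOUTS];  /* One circular list per timeout. */
};

static void
bucket_init(struct bucket *b)
{
    for (int i = 0; i < N_TIMEOUTS; i++) {
        b->exp_lists[i].prev = b->exp_lists[i].next = &b->exp_lists[i];
    }
}

static void
list_push_back(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void
list_remove(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* First insertion: append, so each list stays sorted by 'expiration'. */
static void
conn_init_expiration(struct bucket *b, struct conn *c,
                     enum timeout_class tm, long long now)
{
    c->expiration = now + timeout_ms[tm];
    list_push_back(&b->exp_lists[tm], &c->exp_node);
}

/* Refresh: unlink, then re-append at the back of the (possibly new) list. */
static void
conn_update_expiration(struct bucket *b, struct conn *c,
                       enum timeout_class tm, long long now)
{
    list_remove(&c->exp_node);
    conn_init_expiration(b, c, tm, now);
}

/* Unlinks expired connections and returns how many were removed. */
static size_t
sweep_expired(struct bucket *b, long long now)
{
    size_t n_removed = 0;

    for (int i = 0; i < N_TIMEOUTS; i++) {
        struct node *head = &b->exp_lists[i];

        while (head->next != head) {
            struct conn *c = (struct conn *) head->next;

            if (c->expiration > now) {
                break;          /* Everything behind it expires later. */
            }
            list_remove(&c->exp_node);
            n_removed++;
        }
    }
    return n_removed;
}
```

Because every list holds a single timeout value and entries are only ever
appended with the current time, the head of each list is always the next
connection to expire, so a sweep costs one comparison per list plus one
step per connection actually removed.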

Suggested-by: Joe Stringer 
Signed-off-by: Daniele Di Proietto 
---
 lib/conntrack-other.c   |  11 +--
 lib/conntrack-private.h |  21 --
 lib/conntrack-tcp.c |  20 +++---
 lib/conntrack.c | 186 
 lib/conntrack.h |  36 +-
 5 files changed, 243 insertions(+), 31 deletions(-)

diff --git a/lib/conntrack-other.c b/lib/conntrack-other.c
index 295cb2c..2920889 100644
--- a/lib/conntrack-other.c
+++ b/lib/conntrack-other.c
@@ -43,8 +43,8 @@ conn_other_cast(const struct conn *conn)
 }
 
 static enum ct_update_res
-other_conn_update(struct conn *conn_, struct dp_packet *pkt OVS_UNUSED,
-  bool reply, long long now)
+other_conn_update(struct conn *conn_, struct conntrack_bucket *ctb,
+  struct dp_packet *pkt OVS_UNUSED, bool reply, long long now)
 {
 struct conn_other *conn = conn_other_cast(conn_);
 
@@ -54,7 +54,7 @@ other_conn_update(struct conn *conn_, struct dp_packet *pkt 
OVS_UNUSED,
 conn->state = OTHERS_MULTIPLE;
 }
 
-update_expiration(conn_, other_timeouts[conn->state], now);
+conn_update_expiration(ctb, &conn->up, other_timeouts[conn->state], now);
 
 return CT_UPDATE_VALID;
 }
@@ -66,14 +66,15 @@ other_valid_new(struct dp_packet *pkt OVS_UNUSED)
 }
 
 static struct conn *
-other_new_conn(struct dp_packet *pkt OVS_UNUSED, long long now)
+other_new_conn(struct conntrack_bucket *ctb, struct dp_packet *pkt OVS_UNUSED,
+   long long now)
 {
 struct conn_other *conn;
 
 conn = xzalloc(sizeof *conn);
 conn->state = OTHERS_FIRST;
 
-update_expiration(&conn->up, other_timeouts[conn->state], now);
+conn_init_expiration(ctb, &conn->up, other_timeouts[conn->state], now);
 
 return &conn->up;
 }
diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index bc32448..5aac938 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -69,10 +69,13 @@ enum ct_update_res {
 };
 
 struct ct_l4_proto {
-struct conn *(*new_conn)(struct dp_packet *pkt, long long now);
+struct conn *(*new_conn)(struct conntrack_bucket *, struct dp_packet *pkt,
+ long long now);
 bool (*valid_new)(struct dp_packet *pkt);
-enum ct_update_res (*conn_update)(struct conn *conn, struct dp_packet *pkt,
-  bool reply, long long now);
+enum ct_update_res (*conn_update)(struct conn *conn,
+  struct conntrack_bucket *,
+  struct dp_packet *pkt, bool reply,
+  long long now);
 };
 
 extern struct ct_l4_proto ct_proto_tcp;
@@ -81,9 +84,19 @@ extern struct ct_l4_proto ct_proto_other;
 extern long long ct_timeout_val[];
 
 static inline void
-update_expiration(struct conn *conn, enum ct_timeout tm, long long now)
+conn_init_expiration(struct conntrack_bucket *ctb, struct conn *conn,
+enum ct_timeout tm, long long now)
 {
 conn->expiration = now + ct_timeout_val[tm];
+ovs_list_push_back(&ctb->exp_lists[tm], &conn->exp_node);
+}
+
+static inline void
+conn_update_expiration(struct conntrack_bucket *ctb, struct conn *conn,
+   enum ct_timeout tm, long long now)
+{
+ovs_list_remove(&conn->exp_node);
+conn_init_expiration(ctb, conn, tm, now);
 }
 
 #endif /* conntrack-private.h */
diff --git a/lib/conntrack-tcp.c b/lib/conntrack-tcp.c
index 6da798d..7edcce3 100644
--- a/lib/conntrack-tcp.c
+++ b/lib/conntrack-tcp.c
@@ -152,8 +152,8 @@ tcp_payload_length(struct dp_packet *pkt)
 }
 
 static enum ct_update_res
-tcp_conn_update(struct conn* conn_, struct dp_packet *pkt, bool reply,
-long long now)
+tcp_conn_update(struct conn *conn_, struct conntrack_bucket *ctb,
+struct dp_packet *pkt, bool reply, long long now)
 {
 struct conn_tcp *conn = conn_tcp_cast(conn_);
 struct tcp_header *tcp = dp_packet_l4(pkt);
@@ -319,18 +319,18 @@ tcp_conn_update(struct conn* conn_, struct dp_packet 
*pkt, bool reply,
 
 if (src->state >= CT_DPIF_TCPS_FIN_WAIT_2
 && dst->state >= CT_DPIF_TCPS_FIN_WAIT_2) {
-update_expiration(conn_, CT_TM_TCP_CLOSED, now);
+conn_update_expiration(ctb, &conn->up, CT_TM_TCP_CLOSED, now);
 } else if (src->state >= CT_DPIF_TCPS_CLOSING
&& dst->state >= CT_DPIF_TCPS_CLOSING) {
-

[ovs-dev] [PATCH v5 02/16] flow: Export parse_ipv6_ext_hdrs().

2016-07-26 Thread Daniele Di Proietto
This will be used by a future commit.

Signed-off-by: Daniele Di Proietto 
Acked-by: Joe Stringer 
Acked-by: Flavio Leitner 
---
 lib/flow.c | 140 ++---
 lib/flow.h |   3 ++
 2 files changed, 81 insertions(+), 62 deletions(-)

diff --git a/lib/flow.c b/lib/flow.c
index 5775127..f94b1f2 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -440,6 +440,82 @@ invalid:
 arp_buf[1] = eth_addr_zero;
 }
 
+static inline bool
+parse_ipv6_ext_hdrs__(const void **datap, size_t *sizep, uint8_t *nw_proto,
+  uint8_t *nw_frag)
+{
+while (1) {
+if (OVS_LIKELY((*nw_proto != IPPROTO_HOPOPTS)
+   && (*nw_proto != IPPROTO_ROUTING)
+   && (*nw_proto != IPPROTO_DSTOPTS)
+   && (*nw_proto != IPPROTO_AH)
+   && (*nw_proto != IPPROTO_FRAGMENT))) {
+/* It's either a terminal header (e.g., TCP, UDP) or one we
+ * don't understand.  In either case, we're done with the
+ * packet, so use it to fill in 'nw_proto'. */
+return true;
+}
+
+/* We only verify that at least 8 bytes of the next header are
+ * available, but many of these headers are longer.  Ensure that
+ * accesses within the extension header are within those first 8
+ * bytes. All extension headers are required to be at least 8
+ * bytes. */
+if (OVS_UNLIKELY(*sizep < 8)) {
+return false;
+}
+
+if ((*nw_proto == IPPROTO_HOPOPTS)
+|| (*nw_proto == IPPROTO_ROUTING)
+|| (*nw_proto == IPPROTO_DSTOPTS)) {
+/* These headers, while different, have the fields we care
+ * about in the same location and with the same
+ * interpretation. */
+const struct ip6_ext *ext_hdr = *datap;
+*nw_proto = ext_hdr->ip6e_nxt;
+if (OVS_UNLIKELY(!data_try_pull(datap, sizep,
+(ext_hdr->ip6e_len + 1) * 8))) {
+return false;
+}
+} else if (*nw_proto == IPPROTO_AH) {
+/* A standard AH definition isn't available, but the fields
+ * we care about are in the same location as the generic
+ * option header--only the header length is calculated
+ * differently. */
+const struct ip6_ext *ext_hdr = *datap;
+*nw_proto = ext_hdr->ip6e_nxt;
+if (OVS_UNLIKELY(!data_try_pull(datap, sizep,
+(ext_hdr->ip6e_len + 2) * 4))) {
+return false;
+}
+} else if (*nw_proto == IPPROTO_FRAGMENT) {
+const struct ovs_16aligned_ip6_frag *frag_hdr = *datap;
+
+*nw_proto = frag_hdr->ip6f_nxt;
+if (!data_try_pull(datap, sizep, sizeof *frag_hdr)) {
+return false;
+}
+
+/* We only process the first fragment. */
+if (frag_hdr->ip6f_offlg != htons(0)) {
+*nw_frag = FLOW_NW_FRAG_ANY;
+if ((frag_hdr->ip6f_offlg & IP6F_OFF_MASK) != htons(0)) {
+*nw_frag |= FLOW_NW_FRAG_LATER;
+*nw_proto = IPPROTO_FRAGMENT;
+return true;
+}
+}
+}
+}
+}
+
+bool
+parse_ipv6_ext_hdrs(const void **datap, size_t *sizep, uint8_t *nw_proto,
+uint8_t *nw_frag)
+{
+return parse_ipv6_ext_hdrs__(datap, sizep, nw_proto, nw_frag);
+}
+
 /* Initializes 'flow' members from 'packet' and 'md'
  *
  * Initializes 'packet' header l2 pointer to the start of the Ethernet
@@ -642,68 +718,8 @@ miniflow_extract(struct dp_packet *packet, struct miniflow 
*dst)
 nw_ttl = nh->ip6_hlim;
 nw_proto = nh->ip6_nxt;
 
-while (1) {
-if (OVS_LIKELY((nw_proto != IPPROTO_HOPOPTS)
-   && (nw_proto != IPPROTO_ROUTING)
-   && (nw_proto != IPPROTO_DSTOPTS)
-   && (nw_proto != IPPROTO_AH)
-   && (nw_proto != IPPROTO_FRAGMENT))) {
-/* It's either a terminal header (e.g., TCP, UDP) or one we
- * don't understand.  In either case, we're done with the
- * packet, so use it to fill in 'nw_proto'. */
-break;
-}
-
-/* We only verify that at least 8 bytes of the next header are
- * available, but many of these headers are longer.  Ensure that
- * accesses within the extension header are within those first 8
- * bytes. All extension headers are required to be at least 8
- * bytes. */
-if (OVS_UNLIKELY(size < 8)) {
-goto out;
-}
-
-if ((nw_proto == IPPROTO_HOPOPTS)
- 

[ovs-dev] [PATCH v5 07/16] tests: Add test-conntrack pcap test.

2016-07-26 Thread Daniele Di Proietto
Simple program that runs the packets in a pcap file through the
connection tracker and prints the 'ct_state' for each packet.

E.g. the line:

`./tests/ovstest test-conntrack pcap capture.pcap 2`

sends the packets in `capture.pcap` to the connection tracker, 2 per
call.

Useful for debugging.

Signed-off-by: Daniele Di Proietto 
Acked-by: Flavio Leitner 
---
 tests/test-conntrack.c | 73 ++
 1 file changed, 73 insertions(+)

diff --git a/tests/test-conntrack.c b/tests/test-conntrack.c
index 37c7277..0ff70d1 100644
--- a/tests/test-conntrack.c
+++ b/tests/test-conntrack.c
@@ -23,6 +23,7 @@
 #include "netdev.h"
 #include "ovs-thread.h"
 #include "ovstest.h"
+#include "pcap-file.h"
 #include "timeval.h"
 
 static const char payload[] = "5054000a505400090800451c00"
@@ -145,6 +146,74 @@ test_benchmark(struct ovs_cmdl_context *ctx)
 ovs_barrier_destroy(&barrier);
 free(threads);
 }
+
+static void
+test_pcap(struct ovs_cmdl_context *ctx)
+{
+size_t total_count, i, batch_size;
+FILE *pcap;
+int err;
+
+pcap = ovs_pcap_open(ctx->argv[1], "rb");
+if (!pcap) {
+return;
+}
+
+batch_size = 1;
+if (ctx->argc > 2) {
+batch_size = strtoul(ctx->argv[2], NULL, 0);
+if (batch_size == 0 || batch_size > NETDEV_MAX_BURST) {
+ovs_fatal(0,
+  "batch_size must be between 1 and NETDEV_MAX_BURST(%u)",
+  NETDEV_MAX_BURST);
+}
+}
+
+fatal_signal_init();
+
+conntrack_init(&ct);
+total_count = 0;
+for (;;) {
+struct dp_packet_batch pkt_batch;
+
+dp_packet_batch_init(&pkt_batch);
+
+for (i = 0; i < batch_size; i++) {
+struct flow dummy_flow;
+
+err = ovs_pcap_read(pcap, &pkt_batch.packets[i], NULL);
+if (err) {
+break;
+}
+flow_extract(pkt_batch.packets[i], &dummy_flow);
+}
+
+pkt_batch.count = i;
+if (pkt_batch.count == 0) {
+break;
+}
+
+conntrack_execute(&ct, &pkt_batch, true, 0, NULL, NULL, NULL);
+
+for (i = 0; i < pkt_batch.count; i++) {
+struct ds ds = DS_EMPTY_INITIALIZER;
+struct dp_packet *pkt = pkt_batch.packets[i];
+
+total_count++;
+
+format_flags(&ds, ct_state_to_string, pkt->md.ct_state, '|');
+printf("%"PRIuSIZE": %s\n", total_count, ds_cstr(&ds));
+
+ds_destroy(&ds);
+}
+
+dp_packet_delete_batch(&pkt_batch, true);
+if (err) {
+break;
+}
+}
+conntrack_destroy(&ct);
+}
 
 static const struct ovs_cmdl_command commands[] = {
 /* Connection tracker tests. */
@@ -154,6 +223,10 @@ static const struct ovs_cmdl_command commands[] = {
  * destination port */
 {"benchmark", "n_threads n_pkts batch_size [change_connection]", 3, 4,
  test_benchmark},
+/* Reads packets from 'file' and sends them to the connection tracker,
+ * 'batch_size' (1 by default) per call, with the commit flag set.
+ * Prints the ct_state of each packet. */
+{"pcap", "file [batch_size]", 1, 2, test_pcap},
 
 {NULL, NULL, 0, 0, NULL},
 };
-- 
2.8.1



[ovs-dev] [PATCH v5 01/16] packets: Define ICMP types.

2016-07-26 Thread Daniele Di Proietto
Linux and FreeBSD have slightly different names for these constants.
Windows doesn't define them.  It is simpler to redefine them from
scratch for OVS.  The new names are different than those used in Linux
and FreeBSD.

These definitions will be used by a future commit.

Signed-off-by: Daniele Di Proietto 
Acked-by: Joe Stringer 
Acked-by: Flavio Leitner 
Acked-by: Ryan Moats 
---
 lib/packets.h | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/lib/packets.h b/lib/packets.h
index 5fd1e51..6ab235a 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -611,9 +611,21 @@ struct ip_header {
 ovs_16aligned_be32 ip_src;
 ovs_16aligned_be32 ip_dst;
 };
-
 BUILD_ASSERT_DECL(IP_HEADER_LEN == sizeof(struct ip_header));
 
+/* ICMPv4 types. */
+#define ICMP4_ECHO_REPLY 0
+#define ICMP4_DST_UNREACH 3
+#define ICMP4_SOURCEQUENCH 4
+#define ICMP4_REDIRECT 5
+#define ICMP4_ECHO_REQUEST 8
+#define ICMP4_TIME_EXCEEDED 11
+#define ICMP4_PARAM_PROB 12
+#define ICMP4_TIMESTAMP 13
+#define ICMP4_TIMESTAMPREPLY 14
+#define ICMP4_INFOREQUEST 15
+#define ICMP4_INFOREPLY 16
+
 #define ICMP_HEADER_LEN 8
 struct icmp_header {
 uint8_t icmp_type;
-- 
2.8.1



[ovs-dev] [PATCH v5 00/16] Userspace (DPDK) connection tracker

2016-07-26 Thread Daniele Di Proietto
This series aims to implement the ct() action for the dpif-netdev datapath.
The bulk of the code is in the new conntrack module: it contains some packet
parsing code, some lookup tables and the logic to implement all the ct bits.

The conntrack module is helped by conntrack-tcp, for TCP window and flags
tracking: the bulk of the code of this submodule is from FreeBSD's pf
subsystem and is therefore BSD licensed.

The rest of the series integrates the connection tracker with the rest of
OVS: the ct() action is implemented in dpif-netdev, and the debugging
interfaces required by dpctl/{dump,flush}-conntrack are implemented.

Besides adding some unit tests, this series ports the existing conntrack
system test to the userspace datapath.  Some small modifications are
required to pass the testsuite, and some tests still have to be skipped.

This can also be downloaded at:

https://github.com/ddiproietto/ovs/tree/userconntrack_20160726

Any feedback is appreciated, thanks.

v4 -> v5:
* Rebase: hmap.h is moved, include ct_* field in some unit tests,
  skip and adapt to the new ct dump format the OVN tests.
* Style and typo fixes.
* Add coverage counter to detect long cleanup.
* Use ovs_barrier instead of pthread_barrier in test (fix compilation
  on OS X).
* Fix dumping tcp state in the reply direction.
* Squash together flow_compose improvements (checksum and udp_len).

v3 -> v4:
* Rebase: use struct dp_packet_batch, add extra ct_ fields in some
  new tests, use struct hmap_pos, skip some new system NAT tests.
* Style and typo fixes.
* Add OVS_NOT_REACHED() in switch in process_one().
* New commit: use dl_type from flow or matching megaflow.

v2 -> v3:
* Rebased.
* Squashed commits for flushing (in dpif-netdev and conntrack).
* Squashed commits for dumping (in dpif-netdev and conntrack).
* Use adaptive mutex instead of spinlock: this prevents livelock
  if the cleanup thread is executed on the same CPU as a forwarding
  thread.  Performance impact is minimal.
* Validate L3 and L4 checksum.
* Use proper L3 and L4 checksum in hardcoded packets in system and unit
  tests.
* Consider ICMPv6 as well as ICMP in l4_protos and conn_key_to_tuple.
* Mention conntrack in NEWS and FAQ.md.
* Use uint16_t for ct_state.
* Fix possible NULL dereference for conn in process_one().
* Add OVS_U128_MIN, OVS_U128_ZERO.
* Use HMAP_FOR_EACH_POP.
* Check that UDP length is valid.
* Style fix: prefer 'sizeof *object' instead of 'sizeof type'
* Don't accept packets from/to UDP/TCP port 0.
* Use defines for timeouts.
* Check expiration inside lookup loop in conn_key_lookup().
* Limit the number of connections.
* Simplify case in tcp_get_wscale().
* Introduce general INT_MOD_* macros for comparisons in modular arithmetic.
* Improve comments.
* New cleanup mechanism: we keep connections in an ordered list and we have
  a separate thread to perform the cleanup.  This doesn't block the main
  thread for long intervals anymore.
* Correctly fill UDP length and UDP/TCP/ICMP checksums in flow_compose():
  it's useful to write testcases for the connection tracker.
* Added system test with ICMP traffic through the connection tracker.
* Track ICMP type and code.

v1 -> v2:
* Fixed bug in tcp_get_wscale(), related to TCP options parsing.
* Changed names of ICMP constants: now they're different from Linux and
  FreeBSD.
* Fixed bug in parse_ipv6_ext_hdrs().
* Used ALWAYS_INLINE in parse_vlan and parse_ethertype, to avoid a
  performance regression in miniflow_extract().
* Updated copyright info in COPYING and debian/copyright.in.
* Rebased.
* Changed batching strategy in conntrack_execute() to allow a newly
  created connection to be picked up by packets in the same batch.
* Added an ovs-test module to throw pcap files at the connection tracker.
* Added a workaround for the userspace testsuite on new kernels and a tcp
  non-conntrack test.



Daniele Di Proietto (16):
  packets: Define ICMP types.
  flow: Export parse_ipv6_ext_hdrs().
  flow: Introduce parse_dl_type().
  conntrack: New userspace connection tracker.
  conntrack: Periodically delete expired connections.
  tests: Add very simple conntrack benchmark.
  tests: Add test-conntrack pcap test.
  dpif-netdev: Execute conntrack action.
  dpif-netdev: Implement conntrack dump functions.
  dpif-netdev: Implement conntrack flush interface.
  flow: Generate checksum and udp_len in flow_compose().
  tests: Add conntrack ofproto-dpif tests.
  system-tests: Run conntrack tests with userspace.
  system-tests: Add ping through conntrack test.
  conntrack: Track ICMP type and code.
  conntrack: Add 'dl_type' parameter to conntrack_execute().

 COPYING  |1 +
 FAQ.md   |2 +-
 NEWS |2 +
 debian/copyright.in  |4 +
 include/openvswitch/types.h  |4 +
 lib/automake.mk  |6 +
 lib/conntrack-icmp.c |  105 
 lib/conntrack-other.c|   86 +++
 

Re: [ovs-dev] [PATCH 2/3] ovsdb: Fix memory leak in replication logic

2016-07-26 Thread William Tu
Thanks for fixing the memory leak!

Acked-by: William Tu 

On Tue, Jul 26, 2016 at 1:08 PM, Andy Zhou  wrote:
> Release the memory of reply message of the initial "monitor" request.
>
> Reported-at: http://openvswitch.org/pipermail/dev/2016-July/076075.html
> Signed-off-by: Andy Zhou 
> ---
>  ovsdb/replication.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/ovsdb/replication.c b/ovsdb/replication.c
> index 3d589ef..af7ae5c 100644
> --- a/ovsdb/replication.c
> +++ b/ovsdb/replication.c
> @@ -365,6 +365,10 @@ get_initial_db_state(const struct db *database)
>  if (msg->type == JSONRPC_REPLY) {
>  process_notification(msg->result, database->db);
>  }
> +
> +if (msg) {
> +jsonrpc_msg_destroy(msg);
> +}
>  }
>
>  static void
> --
> 1.9.1
>


Re: [ovs-dev] [PATCH 1/3] ovsdb: Properly close replication rpc connection

2016-07-26 Thread William Tu
Hi Andy,

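Thanks for fixing the memory leak! I've tested it and it solved the
issue. By the way, I don't think we have to assign NULL to static
variables: the C99 standard guarantees that objects with static storage
duration are initialized to zero (a null pointer, for pointers) when no
initializer is given.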
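
A small sketch of that point, for illustration only. The variable names
below are made up and are not the ones in ovsdb/replication.c; the check
function is hypothetical too.

```c
#include <stddef.h>

struct jsonrpc;                      /* Opaque here; only pointers are used. */

static char *remote_implicit;        /* No initializer: starts as NULL. */
static char *remote_explicit = NULL; /* Same initial value, spelled out. */
static struct jsonrpc *rpc_implicit; /* Also starts as a null pointer. */
static int counter_implicit;         /* Arithmetic types start as 0. */

/* Returns 1 if every implicitly initialized object above has the same value
 * it would have with an explicit "= NULL" or "= 0" initializer. */
static int
statics_start_zeroed(void)
{
    return remote_implicit == NULL
           && remote_implicit == remote_explicit
           && rpc_implicit == NULL
           && counter_implicit == 0;
}
```

So the explicit "= NULL" is harmless but redundant; whether to keep it is
a style question rather than a correctness one.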

Acked-by: William Tu 

On Tue, Jul 26, 2016 at 1:08 PM, Andy Zhou  wrote:
> This patch removes rpc related memory leak reported below.
>
> Reported-at: http://openvswitch.org/pipermail/dev/2016-July/076075.html
> Signed-off-by: Andy Zhou 
> ---
>  ovsdb/ovsdb-server.c | 1 +
>  ovsdb/replication.c  | 5 +++--
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
> index 239cca8..1c6ddca 100644
> --- a/ovsdb/ovsdb-server.c
> +++ b/ovsdb/ovsdb-server.c
> @@ -202,6 +202,7 @@ main_loop(struct ovsdb_jsonrpc_server *jsonrpc, struct 
> shash *all_dbs,
>  }
>  }
>
> +disconnect_remote_server();
>  free(remotes_error);
>  }
>
> diff --git a/ovsdb/replication.c b/ovsdb/replication.c
> index 52b7085..3d589ef 100644
> --- a/ovsdb/replication.c
> +++ b/ovsdb/replication.c
> @@ -32,8 +32,8 @@
>  #include "table.h"
>  #include "transaction.h"
>
> -static char *remote_ovsdb_server;
> -static struct jsonrpc *rpc;
> +static char *remote_ovsdb_server = NULL;
> +static struct jsonrpc *rpc = NULL;
>  static struct sset monitored_tables = SSET_INITIALIZER(&monitored_tables);
>  static struct sset tables_blacklist = SSET_INITIALIZER(&tables_blacklist);
>  static bool reset_dbs = true;
> @@ -391,6 +391,7 @@ check_for_notifications(struct shash *all_dbs)
>  if (error == EAGAIN) {
>  return;
>  } else if (error) {
> +jsonrpc_close(rpc);
>  rpc = open_jsonrpc(remote_ovsdb_server);
>  if (!rpc) {
>  /* Remote server went down. */
> --
> 1.9.1
>


Re: [ovs-dev] [PATCH] netdev-provider: fix comments for netdev_rxq_recv

2016-07-26 Thread William Tu
Sorry, this should be

Acked-by: William Tu 

On Tue, Jul 26, 2016 at 8:32 AM, William Tu  wrote:
> Hi Mark,
>
> Thanks for fixing them! Looks good to me.
>
> Signed-off-by: William Tu 
>
>
>
> On Tue, Jul 26, 2016 at 6:19 AM, Mark Kavanagh
>  wrote:
>> Commit 64839cf43 applies batch objects to netdev-providers, but
>> some comments were not updated accordingly. Fix these:
>>- replace 'pkts' with 'batch'
>>- replace '*cnt' with 'batch->count'
>>- replace MAX_RX_BATCH with NETDEV_MAX_BURST
>>- remove superfluous whitespace
>>
>> Signed-off-by: Mark Kavanagh 
>> ---
>>  lib/netdev-provider.h |  4 ++--
>>  lib/netdev.c  | 15 ---
>>  2 files changed, 10 insertions(+), 9 deletions(-)
>>
>> diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
>> index 915a5a5..3ded6c1 100644
>> --- a/lib/netdev-provider.h
>> +++ b/lib/netdev-provider.h
>> @@ -729,8 +729,8 @@ struct netdev_class {
>>
>>  /* Attempts to receive a batch of packets from 'rx'.  In 'batch', the
>>   * caller supplies 'packets' as the pointer to the beginning of an array
>> - * of MAX_RX_BATCH pointers to dp_packet.  If successful, the
>> - * implementation stores pointers to up to MAX_RX_BATCH dp_packets into
>> + * of NETDEV_MAX_BURST pointers to dp_packet.  If successful, the
>> + * implementation stores pointers to up to NETDEV_MAX_BURST dp_packets 
>> into
>>   * the array, transferring ownership of the packets to the caller, 
>> stores
>>   * the number of received packets into 'count', and returns 0.
>>   *
>> diff --git a/lib/netdev.c b/lib/netdev.c
>> index 31a6a46..be86519 100644
>> --- a/lib/netdev.c
>> +++ b/lib/netdev.c
>> @@ -608,14 +608,15 @@ netdev_rxq_close(struct netdev_rxq *rx)
>>  }
>>  }
>>
>> -/* Attempts to receive a batch of packets from 'rx'.  'pkts' should point to
>> - * the beginning of an array of MAX_RX_BATCH pointers to dp_packet.  If
>> - * successful, this function stores pointers to up to MAX_RX_BATCH 
>> dp_packets
>> - * into the array, transferring ownership of the packets to the caller, 
>> stores
>> - * the number of received packets into '*cnt', and returns 0.
>> +/* Attempts to receive a batch of packets from 'rx'.  'batch' should point 
>> to
>> + * the beginning of an array of NETDEV_MAX_BURST pointers to dp_packet.  If
>> + * successful, this function stores pointers to up to NETDEV_MAX_BURST
>> + * dp_packets into the array, transferring ownership of the packets to the
>> + * caller, stores the number of received packets in 'batch->count', and 
>> returns
>> + * 0.
>>   *
>>   * The implementation does not necessarily initialize any non-data members 
>> of
>> - * 'pkts'.  That is, the caller must initialize layer pointers and metadata
>> + * 'batch'.  That is, the caller must initialize layer pointers and metadata
>>   * itself, if desired, e.g. with pkt_metadata_init() and miniflow_extract().
>>   *
>>   * Returns EAGAIN immediately if no packet is ready to be received or 
>> another
>> @@ -625,7 +626,7 @@ netdev_rxq_recv(struct netdev_rxq *rx, struct 
>> dp_packet_batch *batch)
>>  {
>>  int retval;
>>
>> -retval = rx->netdev->netdev_class->rxq_recv(rx,  batch);
>> +retval = rx->netdev->netdev_class->rxq_recv(rx, batch);
>>  if (!retval) {
>>  COVERAGE_INC(netdev_received);
>>  } else {
>> --
>> 1.9.3
>>


Re: [ovs-dev] [PATCH] ovn-controller: squelch expected duplicate flow warnings

2016-07-26 Thread Ryan Moats

Guru Shetty  wrote on 07/26/2016 06:05:47 PM:

> From: Guru Shetty 
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: ovs dev 
> Date: 07/26/2016 06:06 PM
> Subject: Re: [ovs-dev] [PATCH] ovn-controller: squelch expected
> duplicate flow warnings
>
> On 26 July 2016 at 15:54, Ryan Moats  wrote:
>
>
>
> Guru Shetty  wrote on 07/26/2016 03:54:29 PM:
>
> > From: Guru Shetty 
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: ovs dev 
> > Date: 07/26/2016 03:54 PM
> > Subject: Re: [ovs-dev] [PATCH] ovn-controller: squelch expected
> > duplicate flow warnings
> >
> > On 24 July 2016 at 10:07, Ryan Moats  wrote:
> > In the physical processing of ovn-controller, there are two
> > sets of OF flows that are still fully recalculated every cycle:
> >
> >   Flows that aren't associated with any logical flow, and
> >   Flows calculated based on multicast groups
> >
> > Because these flows are recalculated fully each cycle, full
> > duplicates of existing OF flows are created and the OF management
> > code in ovn-controller pollutes the logs with false positive
> > warnings about repeated duplicates.
> >
> > As a short term measure, ignore full duplicates for both of
> > these types of flows, but still warn if the action changes
> > (as that is not expected and may be indicative of a problem).
> >
> > Signed-off-by: Ryan Moats 
> >
> > I also noticed that "commit 70c7cfef188b5ae9940abd5 (ovn-controller:
> > Add incremental processing to lflow_run and physical_run)" causes
> > load balancing system unit tests to fail. A little debugging shows
> > that groups are getting deleted when new flows are added.  My hunch
> > is that this is likely because 'desired_groups' in ofctl_put gets
> > deleted in every run. But in the next run, it does not get updated
> > as we no longer process all flows.
>
> That's going to take persisting the desired_groups data.
>
> I can take a shot if you'd like, just give me the link to the
> patch set that includes the load balancing system unit tests
> and I'll see what I can do to make it right ...
>
> It already exists in the OVN repo. tests/system-ovn.at

Ack and verified that it is failing - I'll take a deeper look
later tonight/tomorrow and see what I can make work.



Re: [ovs-dev] [PATCH v4] dpif-netdev: XPS (Transmit Packet Steering) implementation.

2016-07-26 Thread Daniele Di Proietto
Thanks for the patch.

I think the caller of dp_netdev_execute_actions() should always pass a valid
timestamp.  We can pass it from aux->now to dp_execute_userspace_actions(), and
we can add it to fast_path_processing() so that it can be passed down to
handle_packet_upcall().  In the other cases it's fine to call time_msec(),
since we're in the slow path anyway.

One more thing: I think we should avoid XPS entirely if there are enough txqs, 
to avoid any possible locks and even writing tx->last_used.
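
To make the quoted dpdk0 example and the "enough txqs" condition concrete,
here is a rough sketch. The function names are hypothetical, and the
predicate assumes exactly one static tx queue-id per sending thread; the
real condition in dpif-netdev may need to count threads differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Static mapping described in the quoted text: a thread's fixed tx queue-id
 * is truncated to the port's queue count. */
static int
static_txq(int thread_qid, int n_txq)
{
    return thread_qid % n_txq;
}

/* Counts how many distinct tx queues of a port are hit by the given threads. */
static int
count_used_txqs(const int *thread_qids, size_t n_threads, int n_txq)
{
    int n_used = 0;

    for (int q = 0; q < n_txq; q++) {
        for (size_t i = 0; i < n_threads; i++) {
            if (static_txq(thread_qids[i], n_txq) == q) {
                n_used++;
                break;
            }
        }
    }
    return n_used;
}

/* Hypothetical predicate: dynamic allocation is only needed when the port
 * has fewer tx queues than there are sending threads. */
static bool
needs_dynamic_txq(int n_txq, int n_threads)
{
    return n_txq < n_threads;
}
```

With the queue-ids 0, 1, 4 and 5 from the quoted example and 4 tx queues,
count_used_txqs() reports 2, which matches the "2 of 4" observation. When
needs_dynamic_txq() is false, every thread can keep its own queue and the
XPS bookkeeping could be skipped.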


Thanks,

Daniele

On 13/07/2016 05:34, "Ilya Maximets"  wrote:

>If the number of CPUs in pmd-cpu-mask is not divisible by the number of
>queues, and in a few more complex situations, there may be an unfair
>distribution of TX queue-ids between PMD threads.
>
>For example, if we have 2 ports with 4 queues and 6 CPUs in pmd-cpu-mask,
>such a distribution is possible:
><>
>pmd thread numa_id 0 core_id 13:
>port: vhost-user1   queue-id: 1
>port: dpdk0 queue-id: 3
>pmd thread numa_id 0 core_id 14:
>port: vhost-user1   queue-id: 2
>pmd thread numa_id 0 core_id 16:
>port: dpdk0 queue-id: 0
>pmd thread numa_id 0 core_id 17:
>port: dpdk0 queue-id: 1
>pmd thread numa_id 0 core_id 12:
>port: vhost-user1   queue-id: 0
>port: dpdk0 queue-id: 2
>pmd thread numa_id 0 core_id 15:
>port: vhost-user1   queue-id: 3
><>
>
>As we can see above, the dpdk0 port is polled by threads on cores:
>   12, 13, 16 and 17.
>
>By design of dpif-netdev, there is only one TX queue-id assigned to each
>pmd thread. These queue-ids are sequential, similar to core-ids, and a
>thread will send packets to the queue with exactly this queue-id regardless
>of port.
>
>In previous example:
>
>   pmd thread on core 12 will send packets to tx queue 0
>   pmd thread on core 13 will send packets to tx queue 1
>   ...
>   pmd thread on core 17 will send packets to tx queue 5
>
>So, for dpdk0 port after truncating in netdev-dpdk:
>
>   core 12 --> TX queue-id 0 % 4 == 0
>   core 13 --> TX queue-id 1 % 4 == 1
>   core 16 --> TX queue-id 4 % 4 == 0
>   core 17 --> TX queue-id 5 % 4 == 1
>
>As a result, only 2 of 4 queues are used.
>
>To fix this issue, a kind of XPS is implemented in the following way:
>
>   * TX queue-ids are allocated dynamically.
>   * When a PMD thread first tries to send packets to a new port,
> it allocates the least used TX queue for this port.
>   * PMD threads periodically perform revalidation of
> allocated TX queue-ids. If a queue wasn't used in the last
> XPS_TIMEOUT_MS milliseconds, it will be freed during revalidation.
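
The allocation and timeout steps listed above could look roughly like the
following sketch. The function names txq_pick_least_used() and
txq_is_stale() are hypothetical, the exact timeout comparison is an
assumption, and the locking the patch adds around the per-port use
counters (txq_used_mutex) is left out for brevity.

```c
#include <assert.h>

/* Hypothetical sketch: returns the index of the TX queue with the fewest
 * users and increments its use count.  'txq_used[i]' is the number of
 * threads currently using queue 'i'.  The caller is assumed to hold
 * whatever lock protects 'txq_used'. */
static int
txq_pick_least_used(unsigned *txq_used, int n_txq)
{
    int min_qid = 0;

    assert(n_txq > 0);
    for (int i = 1; i < n_txq; i++) {
        if (txq_used[i] < txq_used[min_qid]) {
            min_qid = i;
        }
    }
    txq_used[min_qid]++;
    return min_qid;
}

/* Returns nonzero if a queue last used at 'last_used' should be released
 * at time 'now'.  Both timestamps and 'timeout_ms' are in milliseconds. */
static int
txq_is_stale(long long last_used, long long now, long long timeout_ms)
{
    return now - last_used >= timeout_ms;
}
```

Ties go to the lowest queue index here, which is an arbitrary choice made
for the sketch.
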
>
>Reported-by: Zhihong Wang 
>Signed-off-by: Ilya Maximets 
>---
> lib/dpif-netdev.c | 170 +-
> 1 file changed, 117 insertions(+), 53 deletions(-)
>
>diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>index e0107b7..6345944 100644
>--- a/lib/dpif-netdev.c
>+++ b/lib/dpif-netdev.c
>@@ -248,6 +248,8 @@ enum pmd_cycles_counter_type {
> PMD_N_CYCLES
> };
> 
>+#define XPS_TIMEOUT_MS 500LL
>+
> /* A port in a netdev-based datapath. */
> struct dp_netdev_port {
> odp_port_t port_no;
>@@ -256,6 +258,8 @@ struct dp_netdev_port {
> struct netdev_saved_flags *sf;
> unsigned n_rxq; /* Number of elements in 'rxq' */
> struct netdev_rxq **rxq;
>+unsigned *txq_used; /* Number of threads that uses each tx queue. 
>*/
>+struct ovs_mutex txq_used_mutex;
> char *type; /* Port type as requested by user. */
> };
> 
>@@ -384,8 +388,9 @@ struct rxq_poll {
> 
> /* Contained by struct dp_netdev_pmd_thread's 'port_cache' or 'tx_ports'. */
> struct tx_port {
>-odp_port_t port_no;
>-struct netdev *netdev;
>+struct dp_netdev_port *port;
>+int qid;
>+long long last_used;
> struct hmap_node node;
> };
> 
>@@ -498,7 +503,8 @@ static void dp_netdev_execute_actions(struct 
>dp_netdev_pmd_thread *pmd,
>   struct dp_packet_batch *,
>   bool may_steal,
>   const struct nlattr *actions,
>-  size_t actions_len);
>+  size_t actions_len,
>+  long long now);
> static void dp_netdev_input(struct dp_netdev_pmd_thread *,
> struct dp_packet_batch *, odp_port_t port_no);
> static void dp_netdev_recirculate(struct dp_netdev_pmd_thread *,
>@@ -541,6 +547,12 @@ static void dp_netdev_pmd_flow_flush(struct 
>dp_netdev_pmd_thread *pmd);
> static void pmd_load_cached_ports(struct dp_netdev_pmd_thread *pmd)
> OVS_REQUIRES(pmd->port_mutex);
> 
>+static void

Re: [ovs-dev] [PATCH] datapath: Add support for kernel 4.6

2016-07-26 Thread Amitabha Biswas
Typo in the previous ack

Acked-by: Amitabha Biswas

> On Jul 26, 2016, at 4:22 PM, Amitabha Biswas  wrote:
> 
> I was able to compile the openvswitch modules on Linux 4.6 kernel and stacked 
> using OpenStack networking-ovn.
> 
> The basic NAT system tests passed and the OVN test suite passed.
> 
> Asked-by: Amitabha Biswas
> 
>> On Jul 26, 2016, at 2:07 PM, Jesse Gross wrote:
>> 
>> On Mon, Jul 25, 2016 at 6:40 PM, Pravin B Shelar wrote:
>>> Most of the patch irons out the USE_UPSTREAM_TUNNEL case, where the datapath
>>> directly uses upstream tunneling modules.
>>> 
>>> Signed-off-by: Pravin B Shelar
>> 
>> Acked-by: Jesse Gross
>> ___
>> dev mailing list
>> dev@openvswitch.org 
>> http://openvswitch.org/mailman/listinfo/dev
> 

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] datapath: Add support for kernel 4.6

2016-07-26 Thread Amitabha Biswas
I was able to compile the openvswitch modules on Linux 4.6 kernel and stacked 
using OpenStack networking-ovn.

The basic NAT system tests passed and the OVN test suite passed.

Asked-by: Amitabha Biswas

> On Jul 26, 2016, at 2:07 PM, Jesse Gross  wrote:
> 
> On Mon, Jul 25, 2016 at 6:40 PM, Pravin B Shelar  wrote:
>> Most of the patch irons out the USE_UPSTREAM_TUNNEL case, where the datapath
>> directly uses upstream tunneling modules.
>> 
>> Signed-off-by: Pravin B Shelar 
> 
> Acked-by: Jesse Gross 


Re: [ovs-dev] [PATCH v3 3/3] dpif-netdev: Introduce pmd-rxq-affinity.

2016-07-26 Thread Daniele Di Proietto
Looks mostly good to me, a couple more comments inline

Thanks,

Daniele


On 26/07/2016 06:48, "Ilya Maximets"  wrote:

>On 26.07.2016 04:46, Daniele Di Proietto wrote:
>> Thanks for the patch.
>> 
>> I haven't been able to apply this without the XPS patch.
>
>That was the original idea. Using this patch with the current
>tx queue management may lead to performance issues on multiqueue
>configurations.

Ok, in this case it should be part of the same series.

>
>> This looks like a perfect chance to add more tests to pmd.at.  I can do it 
>> if you want
>
>Sounds good.
>
>> I started taking a look at this patch and I have a few comments inline.  
>> I'll keep looking at it tomorrow
>> 
>> Thanks,
>> 
>> Daniele
>> 
>> 
>> On 15/07/2016 04:54, "Ilya Maximets"  wrote:
>> 
>>> New 'other_config:pmd-rxq-affinity' field for Interface table to
>>> perform manual pinning of RX queues to desired cores.
>>>
>>> This functionality is required to achieve maximum performance because
>>> all kinds of ports have different cost of rx/tx operations and
>>> only user can know about expected workload on different ports.
>>>
>>> Example:
>>> # ./bin/ovs-vsctl set interface dpdk0 options:n_rxq=4 \
>>>   other_config:pmd-rxq-affinity="0:3,1:7,3:8"
>>> Queue #0 pinned to core 3;
>>> Queue #1 pinned to core 7;
>>> Queue #2 not pinned.
>>> Queue #3 pinned to core 8;
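
To make the "<queue>:<core>" list format from the example concrete, here
is a minimal parsing sketch. parse_rxq_affinity() and CORE_UNPINNED are
hypothetical names chosen for illustration; the parser in the patch itself
may be structured differently.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define CORE_UNPINNED (-1)

/* Hypothetical parser for a "<queue>:<core>,<queue>:<core>,..." list.
 * Sets 'core_ids[q]' for each q in [0, n_rxq) to the pinned core, or to
 * CORE_UNPINNED if the queue is not mentioned.  Returns 0 on success and
 * -1 on a malformed list, in which case 'core_ids' may be partially
 * filled. */
static int
parse_rxq_affinity(const char *list, int *core_ids, int n_rxq)
{
    const char *p = list;

    assert(core_ids && n_rxq >= 0);
    for (int i = 0; i < n_rxq; i++) {
        core_ids[i] = CORE_UNPINNED;
    }
    if (!list) {
        return 0;
    }

    while (*p) {
        char *end;
        long queue, core;

        /* Queue id, followed by ':'. */
        queue = strtol(p, &end, 10);
        if (end == p || *end != ':' || queue < 0 || queue >= n_rxq) {
            return -1;
        }
        p = end + 1;

        /* Core id, followed by ',' or the end of the string. */
        core = strtol(p, &end, 10);
        if (end == p || core < 0 || core > INT_MAX) {
            return -1;
        }
        core_ids[queue] = (int) core;

        if (*end == ',') {
            p = end + 1;
            if (!*p) {
                return -1;      /* Trailing comma. */
            }
        } else if (*end == '\0') {
            p = end;
        } else {
            return -1;
        }
    }
    return 0;
}
```

With the list "0:3,1:7,3:8" and n_rxq = 4, queues 0, 1 and 3 map to cores
3, 7 and 8, and queue 2 stays CORE_UNPINNED.
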
>>>
>>> It's decided to automatically isolate cores that have rxq explicitly
>>> assigned to them because it's useful to keep constant polling rate on
>>> some performance critical ports while adding/deleting other ports
>>> without explicit pinning of all ports.
>>>
>>> Signed-off-by: Ilya Maximets 
>>> ---
>>> INSTALL.DPDK.md  |  49 +++-
>>> NEWS |   2 +
>>> lib/dpif-netdev.c| 218 
>>> ++-
>>> tests/pmd.at |   6 ++
>>> vswitchd/vswitch.xml |  23 ++
>>> 5 files changed, 257 insertions(+), 41 deletions(-)
>>>
>>> diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md
>>> index 5407794..7609aa7 100644
>>> --- a/INSTALL.DPDK.md
>>> +++ b/INSTALL.DPDK.md
>>> @@ -289,14 +289,57 @@ advanced install guide [INSTALL.DPDK-ADVANCED.md]
>>>  # Check current stats
>>>ovs-appctl dpif-netdev/pmd-stats-show
>>>
>>> + # Clear previous stats
>>> +   ovs-appctl dpif-netdev/pmd-stats-clear
>>> + ```
>>> +
>>> +  7. Port/rxq assigment to PMD threads
>>> +
>>> + ```
>>>  # Show port/rxq assignment
>>>ovs-appctl dpif-netdev/pmd-rxq-show
>>> + ```
>>>
>>> - # Clear previous stats
>>> -   ovs-appctl dpif-netdev/pmd-stats-clear
>>> + To change default rxq assignment to pmd threads rxqs may be manually
>>> + pinned to desired cores using:
>>> +
>>> + ```
>>> + ovs-vsctl set Interface  \
>>> +   other_config:pmd-rxq-affinity=
>>>  ```
>>> + where:
>>> +
>>> + ```
>>> +  ::= NULL | 
>>> +  ::=  |
>>> +   , 
>>> +  ::=  : 
>>> + ```
>>> +
>>> + Example:
>>> +
>>> + ```
>>> + ovs-vsctl set interface dpdk0 options:n_rxq=4 \
>>> +   other_config:pmd-rxq-affinity="0:3,1:7,3:8"
>>> +
>>> + Queue #0 pinned to core 3;
>>> + Queue #1 pinned to core 7;
>>> + Queue #2 not pinned.
>>> + Queue #3 pinned to core 8;
>>> + ```
>>> +
>>> + After that PMD threads on cores where RX queues was pinned will become
>>> + `isolated`. This means that this thread will poll only pinned RX 
>>> queues.
>>> +
>>> + WARNING: If there are no `non-isolated` PMD threads, `non-pinned` RX 
>>> queues
>>> + will not be polled. Also, if provided `core_id` is not available (ex. 
>>> this
>>> + `core_id` not in `pmd-cpu-mask`), RX queue will not be polled by any
>>> + PMD thread.
>>> +
>>> + Isolation of PMD threads also can be checked using
>>> + `ovs-appctl dpif-netdev/pmd-rxq-show` command.
>>>
>>> -  7. Stop vswitchd & Delete bridge
>>> +  8. Stop vswitchd & Delete bridge
>>>
>>>  ```
>>>  ovs-appctl -t ovs-vswitchd exit
>>> diff --git a/NEWS b/NEWS
>>> index 6496dc1..9ccc1f5 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -44,6 +44,8 @@ Post-v2.5.0
>>>Old 'other_config:n-dpdk-rxqs' is no longer supported.
>>>Not supported by vHost interfaces. For them number of rx and tx 
>>> queues
>>>is applied from connected virtio device.
>>> + * New 'other_config:pmd-rxq-affinity' field for PMD interfaces, that
>>> +   allows to pin port's rx queues to desired cores.
>>>  * New appctl command 'dpif-netdev/pmd-rxq-show' to check the port/rxq
>>>assignment.
>>>  * Type of log messages from PMD threads changed from INFO to DBG.
>>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>>> index 18ce316..e5a8dec 100644
>>> --- a/lib/dpif-netdev.c
>>> +++ b/lib/dpif-netdev.c
>>> @@ -63,6 +63,7 @@
>>> 

Re: [ovs-dev] [PATCH] ovn-controller: squelch expected duplicate flow warnings

2016-07-26 Thread Guru Shetty
On 26 July 2016 at 15:54, Ryan Moats  wrote:

>
>
>
> Guru Shetty  wrote on 07/26/2016 03:54:29 PM:
>
> > From: Guru Shetty 
> > To: Ryan Moats/Omaha/IBM@IBMUS
> > Cc: ovs dev 
> > Date: 07/26/2016 03:54 PM
> > Subject: Re: [ovs-dev] [PATCH] ovn-controller: squelch expected
> > duplicate flow warnings
> >
> > On 24 July 2016 at 10:07, Ryan Moats  wrote:
> > In the physical processing of ovn-controller, there are two
> > sets of OF flows that are still fully recalculated every cycle:
> >
> >   Flows that aren't associated with any logical flow, and
> >   Flows calculated based on multicast groups
> >
> > Because these flows are recalculated fully each cycle, full
> > duplicates of existing OF flows are created and the OF management
> > code in ovn-controller pollutes the logs with false positive
> > warnings about repeated duplicates.
> >
> > As a short term measure, ignore full duplicates for both of
> > these types of flows, but still warn if the action changes
> > (as that is not expected and may be indicative of a problem).
> >
> > Signed-off-by: Ryan Moats 
> >
> > I also noticed that "commit 70c7cfef188b5ae9940abd5 (ovn-controller:
> > Add incremental processing to lflow_run and physical_run)" causes
> > load balancing system unit tests to fail. A little debugging shows
> > that groups are getting deleted when new flows are added.  My hunch
> > is that this is likely because 'desired_groups' in ofctl_put gets
> > deleted in every run. But in the next run, it does not get updated
> > as we no longer process all flows.
>
> That's going to take persisting the desired_groups data.
>
> I can take a shot if you'd like, just give me the link to the
> patch set that includes the load balancing system unit tests
> and I'll see what I can do to make it right ...
>

It already exists in the OVN repo. tests/system-ovn.at






Re: [ovs-dev] [PATCH v2 1/3] datapath: compat: fix udp checksum calculation

2016-07-26 Thread Jesse Gross
On Tue, Jul 26, 2016 at 3:59 PM, pravin shelar  wrote:
> On Tue, Jul 26, 2016 at 3:53 PM, Jesse Gross  wrote:
>> On Tue, Jul 26, 2016 at 3:24 PM, Pravin B Shelar  wrote:
>>> diff --git a/datapath/linux/compat/include/net/udp.h 
>>> b/datapath/linux/compat/include/net/udp.h
>>> index fa49fa5..266e70a 100644
>>> --- a/datapath/linux/compat/include/net/udp.h
>>> +++ b/datapath/linux/compat/include/net/udp.h
>>> @@ -54,7 +54,7 @@ static inline __sum16 udp_v4_check(int len, __be32 saddr,
>>>  }
>>>  #endif
>>>
>>> -#ifndef HAVE_UDP_SET_CSUM
>>> +#if LINUX_VERSION_CODE < KERNEL_VERSION(3,18,0)
>>
>> I'm a little nervous about these version checks being hard to maintain
>> - especially since they don't correspond to anything obvious in this
>> function upstream. Maybe we could just declare a #define with a name
>> that would make it clearer. That might actually be useful in any case
>> since I suspect that we will start seeing some backports in
>> distributions that will allow us to avoid doing OVS segmentation even
>> on older kernels.
>
> Is it fine if I do it as part of separate patch? This patch is about
> fixing the UDP checksum issue. And the requested change is about
> general code improvement.

Yes, that's fine. I think we'll want to convert all of the GSO related
3.18 version checks to use this symbol, so that's mostly not related
to checksums anyways.

Acked-by: Jesse Gross 


Re: [ovs-dev] [PATCH v2 1/3] datapath: compat: fix udp checksum calculation

2016-07-26 Thread pravin shelar
On Tue, Jul 26, 2016 at 3:53 PM, Jesse Gross  wrote:
> On Tue, Jul 26, 2016 at 3:24 PM, Pravin B Shelar  wrote:
>> diff --git a/datapath/linux/compat/include/net/udp.h 
>> b/datapath/linux/compat/include/net/udp.h
>> index fa49fa5..266e70a 100644
>> --- a/datapath/linux/compat/include/net/udp.h
>> +++ b/datapath/linux/compat/include/net/udp.h
>> @@ -54,7 +54,7 @@ static inline __sum16 udp_v4_check(int len, __be32 saddr,
>>  }
>>  #endif
>>
>> -#ifndef HAVE_UDP_SET_CSUM
>> +#if LINUX_VERSION_CODE < KERNEL_VERSION(3,18,0)
>
> I'm a little nervous about these version checks being hard to maintain
> - especially since they don't correspond to anything obvious in this
> function upstream. Maybe we could just declare a #define with a name
> that would make it clearer. That might actually be useful in any case
> since I suspect that we will start seeing some backports in
> distributions that will allow us to avoid doing OVS segmentation even
> on older kernels.

Is it fine if I do it as part of separate patch? This patch is about
fixing the UDP checksum issue. And the requested change is about
general code improvement.


Re: [ovs-dev] [PATCH v2 3/3] datapath: compat: simplify ip_local_out().

2016-07-26 Thread Jesse Gross
On Tue, Jul 26, 2016 at 3:24 PM, Pravin B Shelar  wrote:
> Signed-off-by: Pravin B Shelar 
> ---
>  datapath/linux/compat/gso.c | 82 
> ++---
>  1 file changed, 33 insertions(+), 49 deletions(-)

Acked-by: Jesse Gross 


Re: [ovs-dev] [PATCH v2 2/3] datapath: compat: unset skb encapsulation bit

2016-07-26 Thread Jesse Gross
On Tue, Jul 26, 2016 at 3:24 PM, Pravin B Shelar  wrote:
> The OVS compat layer can handle tunnel GSO packets, but it
> keeps skb encapsulation on for packets handled in GSO. This can
> confuse some NIC drivers. I have seen this issue on Intel devices:
>
>   i40e :42:00.0: TX driver issue detected, PF reset issued
>
> The following patch resets this bit when the compat layer handles the packet.
>
> VMware-BZ: 1698877
> Signed-off-by: Pravin B Shelar 

Acked-by: Jesse Gross 


Re: [ovs-dev] [PATCH] ovn-controller: squelch expected duplicate flow warnings

2016-07-26 Thread Ryan Moats



Guru Shetty  wrote on 07/26/2016 03:54:29 PM:

> From: Guru Shetty 
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: ovs dev 
> Date: 07/26/2016 03:54 PM
> Subject: Re: [ovs-dev] [PATCH] ovn-controller: squelch expected
> duplicate flow warnings
>
> On 24 July 2016 at 10:07, Ryan Moats  wrote:
> In the physical processing of ovn-controller, there are two
> sets of OF flows that are still fully recalculated every cycle:
>
>   Flows that aren't associated with any logical flow, and
>   Flows calculated based on multicast groups
>
> Because these flows are recalculated fully each cycle, full
> duplicates of existing OF flows are created and the OF management
> code in ovn-controller pollutes the logs with false positive
> warnings about repeated duplicates.
>
> As a short term measure, ignore full duplicates for both of
> these types of flows, but still warn if the action changes
> (as that is not expected and may be indicative of a problem).
>
> Signed-off-by: Ryan Moats 
>
> I also noticed that "commit 70c7cfef188b5ae9940abd5 (ovn-controller:
> Add incremental processing to lflow_run and physical_run)" causes
> load balancing system unit tests to fail. A little debugging shows
> that groups are getting deleted when new flows are added.  My hunch
> is that this is likely because 'desired_groups' in ofctl_put gets
> deleted in every run. But in the next run, it does not get updated
> as we no longer process all flows.

That's going to take persisting the desired_groups data.

I can take a shot if you'd like, just give me the link to the
patch set that includes the load balancing system unit tests
and I'll see what I can do to make it right ...


[ovs-dev] [PATCH v2 2/3] datapath: compat: unset skb encapsulation bit

2016-07-26 Thread Pravin B Shelar
The OVS compat layer can handle tunnel GSO packets, but it
keeps skb encapsulation on for packets handled in GSO. This can
confuse some NIC drivers. I have seen this issue on Intel devices:

>>>  i40e :42:00.0: TX driver issue detected, PF reset issued

The following patch resets this bit when the compat layer handles the packet.

VMware-BZ: 1698877
Signed-off-by: Pravin B Shelar 
---
 datapath/linux/compat/gso.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/datapath/linux/compat/gso.c b/datapath/linux/compat/gso.c
index 32790c3..3a73bcd 100644
--- a/datapath/linux/compat/gso.c
+++ b/datapath/linux/compat/gso.c
@@ -186,6 +186,8 @@ static struct sk_buff *tnl_skb_gso_segment(struct sk_buff 
*skb,
 * make copy of it to restore it back. */
memcpy(cb, skb->cb, sizeof(cb));
 
+   skb->encapsulation = 0;
+
/* We are handling offloads by segmenting l3 packet, so
 * no need to call OVS compat segmentation function. */
 
-- 
1.9.1



[ovs-dev] [PATCH v2 0/3] datapath: tunneling fixes.

2016-07-26 Thread Pravin B Shelar
The first two patches fix issues related to Geneve and VXLAN tunnels.
The third patch is just a code improvement.

Pravin B Shelar (3):
  datapath: compat: fix udp checksum calculation
  datapath: compat: unset skb encapsulation bit
  datapath: compat: simplify ip_local_out().

 acinclude.m4|  1 -
 datapath/linux/compat/gso.c | 84 ++---
 datapath/linux/compat/include/net/udp.h |  2 +-
 datapath/linux/compat/udp.c |  5 +-
 datapath/linux/compat/udp_tunnel.c  |  3 +-
 5 files changed, 41 insertions(+), 54 deletions(-)

-- 
1.9.1



[ovs-dev] [PATCH v2 1/3] datapath: compat: fix udp checksum calculation

2016-07-26 Thread Pravin B Shelar
In the upstream Linux kernel networking stack, udp_set_csum() is called
with only the UDP header applied, but in the compat layer it can be
called with the IP header as well. The following patch takes the
transport offset into account.
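
The effect of the starting offset can be seen in a small userspace
sketch. csum_range() below is a hypothetical stand-in for the role that
skb_checksum() plays in the patch; it is a plain RFC 1071 style ones'
complement sum, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of 'len' bytes of 'buf' starting at 'offset',
 * folded to 16 bits.  Hypothetical helper intended for packet-sized
 * buffers only (the 32-bit accumulator is folded once at the end). */
static uint16_t
csum_range(const uint8_t *buf, size_t offset, size_t len)
{
    const uint8_t *p = buf + offset;
    uint32_t sum = 0;

    assert(buf);
    while (len > 1) {
        sum += ((uint32_t) p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len) {
        /* Odd trailing byte is padded with a zero byte. */
        sum += (uint32_t) p[0] << 8;
    }
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}
```

If the buffer still has other header bytes in front of the transport
header, summing 'len' bytes from offset 0 covers the wrong bytes, while
summing from the transport offset covers the intended range.
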

Signed-off-by: Pravin B Shelar 
---
 acinclude.m4| 1 -
 datapath/linux/compat/include/net/udp.h | 2 +-
 datapath/linux/compat/udp.c | 5 +++--
 datapath/linux/compat/udp_tunnel.c  | 3 ++-
 4 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index 5f38539..3f31e5f 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -639,7 +639,6 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   [OVS_GREP_IFELSE([$KSRC/include/net/udp.h], 
[inet_get_local_port_range(net],
[OVS_DEFINE([HAVE_UDP_FLOW_SRC_PORT])])])
   OVS_GREP_IFELSE([$KSRC/include/net/udp.h], [udp_v4_check])
-  OVS_GREP_IFELSE([$KSRC/include/net/udp.h], [udp_set_csum])
   OVS_GREP_IFELSE([$KSRC/include/net/udp_tunnel.h], [udp_tunnel_gro_complete])
   OVS_FIND_FIELD_IFELSE([$KSRC/include/net/udp_tunnel.h], 
[udp_tunnel_sock_cfg],
 [gro_receive])
diff --git a/datapath/linux/compat/include/net/udp.h 
b/datapath/linux/compat/include/net/udp.h
index fa49fa5..266e70a 100644
--- a/datapath/linux/compat/include/net/udp.h
+++ b/datapath/linux/compat/include/net/udp.h
@@ -54,7 +54,7 @@ static inline __sum16 udp_v4_check(int len, __be32 saddr,
 }
 #endif
 
-#ifndef HAVE_UDP_SET_CSUM
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,18,0)
 #define udp_set_csum rpl_udp_set_csum
 void rpl_udp_set_csum(bool nocheck, struct sk_buff *skb,
  __be32 saddr, __be32 daddr, int len);
diff --git a/datapath/linux/compat/udp.c b/datapath/linux/compat/udp.c
index f0362b6..4cd22fa 100644
--- a/datapath/linux/compat/udp.c
+++ b/datapath/linux/compat/udp.c
@@ -1,6 +1,6 @@
 #include 
 
-#ifndef HAVE_UDP_SET_CSUM
+#if LINUX_VERSION_CODE < KERNEL_VERSION(3,18,0)
 
 #include 
 
@@ -26,12 +26,13 @@ void rpl_udp_set_csum(bool nocheck, struct sk_buff *skb,
skb->csum_offset = offsetof(struct udphdr, check);
uh->check = ~udp_v4_check(len, saddr, daddr, 0);
} else {
+   int l4_offset = skb_transport_offset(skb);
__wsum csum;
 
BUG_ON(skb->ip_summed == CHECKSUM_PARTIAL);
 
uh->check = 0;
-   csum = skb_checksum(skb, 0, len, 0);
+   csum = skb_checksum(skb, l4_offset, len, 0);
uh->check = udp_v4_check(len, saddr, daddr, csum);
if (uh->check == 0)
uh->check = CSUM_MANGLED_0;
diff --git a/datapath/linux/compat/udp_tunnel.c 
b/datapath/linux/compat/udp_tunnel.c
index 9265c8a..9cf7286 100644
--- a/datapath/linux/compat/udp_tunnel.c
+++ b/datapath/linux/compat/udp_tunnel.c
@@ -203,12 +203,13 @@ static void udp6_set_csum(bool nocheck, struct sk_buff 
*skb,
skb->csum_offset = offsetof(struct udphdr, check);
uh->check = ~udp_v6_check(len, saddr, daddr, 0);
} else {
+   int l4_offset = skb_transport_offset(skb);
__wsum csum;
 
BUG_ON(skb->ip_summed == CHECKSUM_PARTIAL);
 
uh->check = 0;
-   csum = skb_checksum(skb, 0, len, 0);
+   csum = skb_checksum(skb, l4_offset, len, 0);
uh->check = udp_v6_check(len, saddr, daddr, csum);
if (uh->check == 0)
uh->check = CSUM_MANGLED_0;
-- 
1.9.1



[ovs-dev] [PATCH v2 3/3] datapath: compat: simplify ip_local_out().

2016-07-26 Thread Pravin B Shelar
Signed-off-by: Pravin B Shelar 
---
 datapath/linux/compat/gso.c | 82 ++---
 1 file changed, 33 insertions(+), 49 deletions(-)

diff --git a/datapath/linux/compat/gso.c b/datapath/linux/compat/gso.c
index 3a73bcd..fbbbc89 100644
--- a/datapath/linux/compat/gso.c
+++ b/datapath/linux/compat/gso.c
@@ -229,101 +229,85 @@ free:
 
 static int output_ip(struct sk_buff *skb)
 {
-   int ret = NETDEV_TX_OK;
-   int err;
-
memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
 
 #undef ip_local_out
-   err = ip_local_out(skb);
-   if (unlikely(net_xmit_eval(err)))
-   ret = err;
-
-   return ret;
+   return ip_local_out(skb);
 }
 
 int rpl_ip_local_out(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
-   int ret = NETDEV_TX_OK;
-   int id = -1;
-
if (!OVS_GSO_CB(skb)->fix_segment)
return output_ip(skb);
 
if (skb_is_gso(skb)) {
-   struct iphdr *iph;
+   int ret;
+   int id;
 
-   iph = ip_hdr(skb);
-   id = ntohs(iph->id);
skb = tnl_skb_gso_segment(skb, 0, false, AF_INET);
if (!skb || IS_ERR(skb))
-   return 0;
+   return NET_XMIT_DROP;
+
+   id = ntohs(ip_hdr(skb)->id);
+   do {
+   struct sk_buff *next_skb = skb->next;
+
+   skb->next = NULL;
+   ip_hdr(skb)->id = htons(id++);
+
+   ret = output_ip(skb);
+   skb = next_skb;
+   } while (skb);
+   return ret;
}  else if (skb->ip_summed == CHECKSUM_PARTIAL) {
int err;
 
err = skb_checksum_help(skb);
if (unlikely(err))
-   return 0;
+   return NET_XMIT_DROP;
}
 
-   while (skb) {
-   struct sk_buff *next_skb = skb->next;
-   struct iphdr *iph;
-
-   skb->next = NULL;
-
-   iph = ip_hdr(skb);
-   if (id >= 0)
-   iph->id = htons(id++);
-
-   ret = output_ip(skb);
-   skb = next_skb;
-   }
-   return ret;
+   return output_ip(skb);
 }
 EXPORT_SYMBOL_GPL(rpl_ip_local_out);
 
 static int output_ipv6(struct sk_buff *skb)
 {
-   int ret = NETDEV_TX_OK;
-   int err;
-
memset(IP6CB(skb), 0, sizeof (*IP6CB(skb)));
 #undef ip6_local_out
-   err = ip6_local_out(skb);
-   if (unlikely(net_xmit_eval(err)))
-   ret = err;
-
-   return ret;
+   return ip6_local_out(skb);
 }
 
 int rpl_ip6_local_out(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
-   int ret = NETDEV_TX_OK;
 
if (!OVS_GSO_CB(skb)->fix_segment)
return output_ipv6(skb);
 
if (skb_is_gso(skb)) {
+   int ret;
+
skb = tnl_skb_gso_segment(skb, 0, false, AF_INET6);
if (!skb || IS_ERR(skb))
-   return 0;
+   return NET_XMIT_DROP;
+
+   do {
+   struct sk_buff *next_skb = skb->next;
+
+   skb->next = NULL;
+   ret = output_ipv6(skb);
+   skb = next_skb;
+   } while (skb);
+   return ret;
}  else if (skb->ip_summed == CHECKSUM_PARTIAL) {
int err;
 
err = skb_checksum_help(skb);
if (unlikely(err))
-   return 0;
+   return NET_XMIT_DROP;
}
 
-   while (skb) {
-   struct sk_buff *next_skb = skb->next;
-
-   skb->next = NULL;
-   ret = output_ipv6(skb);
-   skb = next_skb;
-   }
-   return ret;
+   return output_ipv6(skb);
 }
 EXPORT_SYMBOL_GPL(rpl_ip6_local_out);
 #endif /* 3.18 */
-- 
1.9.1



[ovs-dev] [PATCH] ovs-numa: fixed cmask parse with 0x prefix

2016-07-26 Thread Wei Shen
Fix a minor bug that would print a confusing warning about the core mask,
"ovs_numa|WARN|Invalid cpu mask: x", when dpdk-lcore-mask has a 0x prefix, e.g.
0x123, which is the convention used in INSTALL.DPDK.md.
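
As a rough illustration of the behaviour described above, the sketch
below parses a hexadecimal mask with or without the prefix.
parse_cpu_mask() is a hypothetical name, and the 64-bit limit and the
acceptance of an uppercase "0X" are assumptions of this sketch rather
than properties of ovs_numa_set_cpu_mask().

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical sketch: parses a hexadecimal CPU mask, with or without a
 * leading "0x", into '*mask'.  Returns 0 on success, or -1 on an invalid
 * digit, an empty mask, or a mask longer than 16 hex digits. */
static int
parse_cpu_mask(const char *cmask, unsigned long long *mask)
{
    size_t len;

    assert(cmask && mask);

    /* Skip the prefix if supplied, so that 'x' is not seen as a digit. */
    if (!strncmp(cmask, "0x", 2) || !strncmp(cmask, "0X", 2)) {
        cmask += 2;
    }

    len = strlen(cmask);
    if (len == 0 || len > 16) {
        return -1;
    }

    *mask = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char) cmask[i];
        unsigned digit;

        if (!isxdigit(c)) {
            return -1;
        }
        digit = isdigit(c) ? (unsigned) (c - '0')
                           : (unsigned) (toupper(c) - 'A' + 10);
        *mask = (*mask << 4) | digit;
    }
    return 0;
}
```

Both "0x123" and "123" yield the same mask here, while a stray 'x'
without the leading zero is still rejected.
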
---
 lib/ovs-numa.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/lib/ovs-numa.c b/lib/ovs-numa.c
index c8173e0..c1938eb 100644
--- a/lib/ovs-numa.c
+++ b/lib/ovs-numa.c
@@ -551,6 +551,10 @@ ovs_numa_set_cpu_mask(const char *cmask)
 return;
 }
 
+/* Skip 0x if supplied in the cmask */
+if (!strncmp(cmask, "0x", 2))
+cmask += 2;
+
 for (i = strlen(cmask) - 1; i >= 0; i--) {
 char hex = toupper((unsigned char)cmask[i]);
 int bin, j;
-- 
2.5.5



Re: [ovs-dev] releasing 2.6: branch Aug 1, release Sep 15

2016-07-26 Thread Jesse Gross
On Sun, Jul 24, 2016 at 10:53 AM, Ben Pfaff  wrote:
> On Sun, Jul 24, 2016 at 08:39:31AM -0300, Thadeu Lima de Souza Cascardo wrote:
>> On Sat, Jul 23, 2016 at 08:59:35AM -0700, Ben Pfaff wrote:
>> > The proposed Open vSwitch release schedule calls for branching 2.6 from
>> > master on Aug. 1, followed by a period of bug fixes and stabilization,
>> > with release on Sep. 15.  The proposed release schedule is posted here
>> > for review:
>> > https://patchwork.ozlabs.org/patch/650319/
>> >
>> > I don't yet know of a reason to modify this schedule.
>> >
>> > If you know of reasons to change it, now is an appropriate time to bring
>> > it up for discussion.  In addition, if you have features planned for 2.6
>> > that risk hitting master somewhat late for the branch, it is also a good
>> > time to bring these up for discussion, so that we can plan to backport
>> > them to the branch early on, or to delay the branch by a few days.
>>
>> I would like to see the rtnetlink patchset included. One of things
>> that needs to happen before that is taking those decisions about
>> netdev_open and the existence of conflicting port types with the same
>> name. For example, a system interface and an interface in the database
>> with the same name but a different type.
>>
>> I will post some comments on the discussion we already have opened for
>> that.
>>
>> Just wanted to take the opportunity to mention this expectation of
>> getting those into 2.6.
>
> For that feature, I need to defer to Jesse (added to the thread).

I think since there isn't yet a patch for this that is about ready
to be applied, we'll need to make a call at the time the code is
applied to master. If it's one day after we branch, sure that's fine;
one day before release, obviously not; anything in the middle we'll
need to decide.

However, based on the code that has been sent out previously, I think
this is mostly infrastructure at this point rather than user-visible
changes. It would allow other features to be built on top of it but
that would be a follow on change. If that's the case, is there any
particular reason to try to get this in 2.6?


Re: [ovs-dev] [PATCH] ovn: Rename "gateway" to "l3gateway".

2016-07-26 Thread Kyle Mestery
On Tue, Jul 26, 2016 at 3:49 PM, Russell Bryant  wrote:
> When L3 gateway support was added, it introduced a port type called
> "gateway" and a corresponding option called "gateway-chassis".  Since
> that time, we also have an L2 gateway port type called "l2gateway" and a
> corresponding option called "l2gateway-chassis".  This patch renames the
> L3 gateway port type and option to "l3gateway" and "l3gateway-chassis"
> to make things a little more clear and consistent.
>
> Signed-off-by: Russell Bryant 

This seems very reasonable to me Russell, and makes sense given the
two disparate gateway port types.

Acked-by: Kyle Mestery 

> ---
>  ovn/controller/binding.c |  2 +-
>  ovn/controller/patch.c   |  4 ++--
>  ovn/northd/ovn-northd.c  | 12 ++--
>  ovn/ovn-sb.xml   | 18 +-
>  4 files changed, 18 insertions(+), 18 deletions(-)
>
> diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
> index e83c1d5..78ebec4 100644
> --- a/ovn/controller/binding.c
> +++ b/ovn/controller/binding.c
> @@ -219,7 +219,7 @@ consider_local_datapath(struct controller_ctx *ctx,
>  add_local_datapath(local_datapaths, binding_rec);
>  }
>  } else if (chassis_rec && binding_rec->chassis == chassis_rec
> -   && strcmp(binding_rec->type, "gateway")) {
> +   && strcmp(binding_rec->type, "l3gateway")) {
>  if (ctx->ovnsb_idl_txn) {
>  VLOG_INFO("Releasing lport %s from this chassis.",
>binding_rec->logical_port);
> diff --git a/ovn/controller/patch.c b/ovn/controller/patch.c
> index 707d08b..012e6ba 100644
> --- a/ovn/controller/patch.c
> +++ b/ovn/controller/patch.c
> @@ -346,9 +346,9 @@ add_logical_patch_ports(struct controller_ctx *ctx,
>  const struct sbrec_port_binding *binding;
>  SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
>  bool local_port = false;
> -if (!strcmp(binding->type, "gateway")) {
> +if (!strcmp(binding->type, "l3gateway")) {
>  const char *chassis = smap_get(>options,
> -   "gateway-chassis");
> +   "l3gateway-chassis");
>  if (chassis && !strcmp(local_chassis_id, chassis)) {
>  local_port = true;
>  }
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 38a3d30..7f5927e 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -770,10 +770,10 @@ ovn_port_update_sbrec(const struct ovn_port *op)
>  sbrec_port_binding_set_datapath(op->sb, op->od->sb);
>  if (op->nbrp) {
>  /* If the router is for l3 gateway, it resides on a chassis
> - * and its port type is "gateway". */
> + * and its port type is "l3gateway". */
>  const char *chassis = smap_get(>od->nbr->options, "chassis");
>  if (chassis) {
> -sbrec_port_binding_set_type(op->sb, "gateway");
> +sbrec_port_binding_set_type(op->sb, "l3gateway");
>  } else {
>  sbrec_port_binding_set_type(op->sb, "patch");
>  }
> @@ -783,7 +783,7 @@ ovn_port_update_sbrec(const struct ovn_port *op)
>  smap_init();
>  smap_add(, "peer", peer);
>  if (chassis) {
> -smap_add(, "gateway-chassis", chassis);
> +smap_add(, "l3gateway-chassis", chassis);
>  }
>  sbrec_port_binding_set_options(op->sb, );
>  smap_destroy();
> @@ -802,9 +802,9 @@ ovn_port_update_sbrec(const struct ovn_port *op)
>  }
>
>  /* A switch port connected to a gateway router is also of
> - * type "gateway". */
> + * type "l3gateway". */
>  if (chassis) {
> -sbrec_port_binding_set_type(op->sb, "gateway");
> +sbrec_port_binding_set_type(op->sb, "l3gateway");
>  } else {
>  sbrec_port_binding_set_type(op->sb, "patch");
>  }
> @@ -818,7 +818,7 @@ ovn_port_update_sbrec(const struct ovn_port *op)
>  smap_init();
>  smap_add(, "peer", router_port);
>  if (chassis) {
> -smap_add(, "gateway-chassis", chassis);
> +smap_add(, "l3gateway-chassis", chassis);
>  }
>  sbrec_port_binding_set_options(op->sb, );
>  smap_destroy();
> diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
> index 3d26e65..3cdf91b 100644
> --- a/ovn/ovn-sb.xml
> +++ b/ovn/ovn-sb.xml
> @@ -1420,8 +1420,8 @@ tcp.flags = RST;
>database, which identifies logical ports via the conventions described
>in IntegrationGuide.md.  (The exceptions are for
>Port_Binding records with type of
> -  gateway, whose locations are identified by
> -  ovn-northd via the options:gateway-chassis
> +  l3gateway, whose locations are identified by
> +  

Re: [ovs-dev] [PATCH] [PATCH v1] ovn-northd: Fix {}-enclosed constants for ND responder

2016-07-26 Thread Russell Bryant
On Tue, Jul 26, 2016 at 2:02 AM, Zong Kai LI  wrote:

> It missed a comma as the constant separator in the match string for the ND responder.
>
> Signed-off-by: Zong Kai LI 


Thanks!  I applied this to master.

-- 
Russell Bryant


Re: [ovs-dev] [PATCH 2/2] rhel: Allow openvswitch to get parent information

2016-07-26 Thread Flavio Leitner
On Tue, Jul 26, 2016 at 12:57:07PM -0700, Joe Stringer wrote:
> On 25 July 2016 at 18:16, Flavio Leitner  wrote:
> > Updates SELinux to allow ovs-vsctl to get parent process
> > information and log that to the database:
> >
> > record 241: 2016-07-26 00:59:47.418 "ovs-vsctl (invoked by /bin/bash
> > (pid 1589)): ovs-vsctl -t 10 -- --if-exist ...
> >
> > Jul 25 12:57:35 localhost.localdomain audit[830]: AVC avc:  denied  {
> > search } for  pid=830 comm="ovs-vsctl" name="731" dev="proc" ino=14140
> > scontext=system_u:system_r:openvswitch_t:s0
> > tcontext=system_u:system_r:initrc_t:s0 tclass=dir permissive=0
> >
> > Signed-off-by: Flavio Leitner 
> > ---
> >  selinux/openvswitch-custom.te | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/selinux/openvswitch-custom.te b/selinux/openvswitch-custom.te
> > index fc32b97..5739595 100644
> > --- a/selinux/openvswitch-custom.te
> > +++ b/selinux/openvswitch-custom.te
> > @@ -2,8 +2,13 @@ module openvswitch-custom 1.0;
> >
> >  require {
> >  type openvswitch_t;
> > +attribute domain;
> >  class netlink_socket { setopt getopt create connect getattr write read };
> > +class dir { search };
> > +class file { open getattr read };
> >  }
> >
> >  #= openvswitch_t ==
> >  allow openvswitch_t self:netlink_socket { setopt getopt create connect getattr write read };
> > +allow openvswitch_t domain:dir { search };
> > +allow openvswitch_t domain:file { open getattr read };
> 
> Hi Flavio,
> 
> Thanks for spending some time to get OVS in better shape with SELinux.
> I figure that once this settles down a bit we should take the policy
> file here and work towards upstreaming all of the policy changes.

Yeah, we can try to do both in parallel.  Once this gets in, I will
open the bz requesting to fix Fedora which would fix upstream too.

> As far as I can follow, this "domain" type is not just for accessing
> OVS directories and files (like openvswitch_t), but for a much wider
> range of paths:
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/SELinux_Guide/rhlcommon-section-0048.html
> 
> "# The domain attribute identifies every type that can be
> # assigned to a process.  This attribute is used in TE rules
> # that should be applied to all domains, e.g. permitting
> # init to kill all processes."
> 
> Is my understanding (+documentation) correct here? Is there a similar

Your understanding is correct.  It turns out that we don't know which
process will be the parent, so it could be bash running unconfined, or
initrc_t, or any other context (neutron?).

> but more restrictive policy that allows ovs-vsctl to access, for
> example, /var/run/openvswitch/* (with var_run_openvswitch_t or
> similar)? Alternatively is there an example of another daemon that has
> a similar policy that sets a precedent for writing the policy like
> this?

I spent a few hours on this and couldn't find a way to restrict it
more than what I proposed with SELinux.  Basically the above is an expansion
of the interface domain_read_all_domains_state()[1], which other
applications use to read other processes' state.  However, that
seemed relatively new and probably not available on older distros, so
I expanded it into the relevant actions, removing what we don't need.

[1] http://danwalsh.livejournal.com/51435.html

> Would you also be able to provide the full ovs-vsctl commandline? It
> was a little difficult to understand exactly what was going on during
> this event, or try to reproduce.

utilities/ovs-vsctl.c:2473

2472 static char *
2473 vsctl_parent_process_info(void)
2474 {
2475 #ifdef __linux__
2476 pid_t parent_pid;
2477 char *procfile;
2478 struct ds s;
2479 FILE *f;
2480 
2481 parent_pid = getppid();
2482 procfile = xasprintf("/proc/%d/cmdline", parent_pid);
2483 
2484 f = fopen(procfile, "r");

That is called from do_vsctl() to find the parent info.  If you run as
root, then it's unconfined and it works, but it doesn't work during
boot time (initrc_t) for instance.
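
For anyone following along without the tree at hand, here is a minimal
standalone sketch of the same idea: build the /proc/<pid>/cmdline path and
read it.  This is not the ovs-vsctl code; read_cmdline() is a hypothetical
helper written only for illustration.  The point is that a failed fopen()
is what the denied "search" on the /proc/<pid> directory looks like from
the caller's side.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of OVS.
 *
 * Reads /proc/<pid>/cmdline into 'buf' (at most 'size' - 1 bytes) and
 * replaces the NUL separators between arguments with spaces.  Returns the
 * number of bytes stored, or -1 if the file cannot be opened (for example
 * because the process does not exist or access to /proc/<pid> is denied). */
static long
read_cmdline(pid_t pid, char *buf, size_t size)
{
    char path[64];
    FILE *f;
    size_t n;

    assert(buf != NULL && size > 0);
    snprintf(path, sizeof path, "/proc/%ld/cmdline", (long) pid);

    f = fopen(path, "r");
    if (!f) {
        buf[0] = '\0';
        return -1;
    }
    n = fread(buf, 1, size - 1, f);
    fclose(f);

    /* Drop trailing NULs, then turn the remaining separators into spaces. */
    while (n > 0 && buf[n - 1] == '\0') {
        n--;
    }
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == '\0') {
            buf[i] = ' ';
        }
    }
    buf[n] = '\0';
    return (long) n;
}
```

Calling read_cmdline(getppid(), buf, sizeof buf) from a confined process
is the case that trips the AVC denial quoted above; when it returns -1 the
caller simply has no parent information to log.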

To reproduce you just need to configure an OVS interface using ifcfg
with ONBOOT=yes and reboot.

> Lastly, I've just applied the other SELinux patch so you'll need to
> rebase this one.

Sure, not a problem.

-- 
fbl



Re: [ovs-dev] [PATCH] datapath: Add support for kernel 4.6

2016-07-26 Thread Jesse Gross
On Mon, Jul 25, 2016 at 6:40 PM, Pravin B Shelar  wrote:
> Most of the patch irons out the USE_UPSTREAM_TUNNEL case, where the
> datapath directly uses upstream tunneling modules.
>
> Signed-off-by: Pravin B Shelar 

Acked-by: Jesse Gross 


Re: [ovs-dev] [PATCH] ovn-controller: squelch expected duplicate flow warnings

2016-07-26 Thread Guru Shetty
On 24 July 2016 at 10:07, Ryan Moats  wrote:

> In the physical processing of ovn-controller, there are two
> sets of OF flows that are still fully recalculated every cycle:
>
>   Flows that aren't associated with any logical flow, and
>   Flows calculated based on multicast groups
>
> Because these flows are recalculated fully each cycle, full
> duplicates of existing OF flows are created and the OF management
> code in ovn-controller pollutes the logs with false positive
> warnings about repeated duplicates.
>
> As a short term measure, ignore full duplicates for both of
> these types of flows, but still warn if the action changes
> (as that is not expected and may be indicative of a problem).
>
> Signed-off-by: Ryan Moats 
>

I also noticed that commit 70c7cfef188b5ae9940abd5 ("ovn-controller: Add
incremental processing to lflow_run and physical_run") causes the load
balancing system unit tests to fail. A little debugging shows that groups
are getting deleted when new flows are added.  My hunch is that this is
because 'desired_groups' in ofctrl_put() gets deleted in every run, but
in the next run it does not get updated, as we no longer process all flows.
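
The rule described in the quoted commit message (stay quiet about a full
duplicate when the caller asks for that, but always warn when the actions
differ) can be sketched in isolation as follows.  This is an illustration
only: struct toy_flow, classify_duplicate() and should_log_duplicate() are
hypothetical names, and strings stand in for the real match and ofpacts
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified flow.  Real ovn-controller flows carry a
 * struct match and ofpacts rather than strings. */
struct toy_flow {
    uint8_t table_id;
    uint16_t priority;
    const char *match;
    const char *actions;
};

enum dup_kind { DUP_NONE, DUP_FULL, DUP_ACTIONS_CHANGED };

/* Compares a newly added flow against an existing one. */
static enum dup_kind
classify_duplicate(const struct toy_flow *existing,
                   const struct toy_flow *added)
{
    assert(existing && added);
    if (existing->table_id != added->table_id
        || existing->priority != added->priority
        || strcmp(existing->match, added->match)) {
        return DUP_NONE;
    }
    return strcmp(existing->actions, added->actions)
           ? DUP_ACTIONS_CHANGED : DUP_FULL;
}

/* A full duplicate is logged only if 'dupwarn' is set; a duplicate whose
 * actions changed is always logged. */
static bool
should_log_duplicate(enum dup_kind kind, bool dupwarn)
{
    switch (kind) {
    case DUP_FULL:
        return dupwarn;
    case DUP_ACTIONS_CHANGED:
        return true;
    case DUP_NONE:
    default:
        return false;
    }
}
```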


---
>  ovn/controller/ofctrl.c   | 26 +-
>  ovn/controller/ofctrl.h   |  3 +++
>  ovn/controller/physical.c | 28 +++-
>  3 files changed, 43 insertions(+), 14 deletions(-)
>
> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
> index f0451b7..2b26f2d 100644
> --- a/ovn/controller/ofctrl.c
> +++ b/ovn/controller/ofctrl.c
> @@ -550,10 +550,10 @@ log_ovn_flow_rl(struct vlog_rate_limit *rl, enum vlog_level level,
>   *
>   * This just assembles the desired flow tables in memory.  Nothing is actually
>   * sent to the switch until a later call to ofctrl_run(). */
> -void
> -ofctrl_add_flow(uint8_t table_id, uint16_t priority,
> +static void
> +_ofctrl_add_flow(uint8_t table_id, uint16_t priority,
>  const struct match *match, const struct ofpbuf *actions,
> -const struct uuid *uuid)
> +const struct uuid *uuid, bool dupwarn)
>  {
>  /* Structure that uses table_id+priority+various things as hashes. */
>  struct ovn_flow *f = xmalloc(sizeof *f);
> @@ -591,8 +591,10 @@ ofctrl_add_flow(uint8_t table_id, uint16_t priority,
>   */
>  if (ofpacts_equal(f->ofpacts, f->ofpacts_len,
>d->ofpacts, d->ofpacts_len)) {
> -static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> -log_ovn_flow_rl(&rl, VLL_INFO, f, "duplicate flow");
> +if (dupwarn) {
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> +log_ovn_flow_rl(&rl, VLL_INFO, f, "duplicate flow");
> +}
>  } else {
>  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
>  log_ovn_flow_rl(&rl, VLL_WARN, f,
> @@ -617,6 +619,20 @@ ofctrl_add_flow(uint8_t table_id, uint16_t priority,
>  f->uuid_hindex_node.hash);
>  }
>
> +void
> +ofctrl_add_flow(uint8_t table_id, uint16_t priority,
> +const struct match *match, const struct ofpbuf *actions,
> +const struct uuid *uuid) {
> +_ofctrl_add_flow(table_id, priority, match, actions, uuid, true);
> +}
> +
> +void
> +ofctrl_add_flow_no_warn(uint8_t table_id, uint16_t priority,
> +const struct match *match, const struct ofpbuf *actions,
> +const struct uuid *uuid) {
> +_ofctrl_add_flow(table_id, priority, match, actions, uuid, false);
> +}
> +
>  /* Removes a bundles of flows from the flow table. */
>  void
>  ofctrl_remove_flows(const struct uuid *uuid)
> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
> index 49b95b0..b591e82 100644
> --- a/ovn/controller/ofctrl.h
> +++ b/ovn/controller/ofctrl.h
> @@ -42,6 +42,9 @@ struct ovn_flow *ofctrl_dup_flow(struct ovn_flow *source);
>  void ofctrl_add_flow(uint8_t table_id, uint16_t priority,
>   const struct match *, const struct ofpbuf *ofpacts,
>   const struct uuid *uuid);
> +void ofctrl_add_flow_no_warn(uint8_t table_id, uint16_t priority,
> + const struct match *, const struct ofpbuf *ofpacts,
> + const struct uuid *uuid);
>
>  void ofctrl_remove_flows(const struct uuid *uuid);
>
> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> index a104e33..9e6dff4 100644
> --- a/ovn/controller/physical.c
> +++ b/ovn/controller/physical.c
> @@ -549,8 +549,9 @@ consider_mc_group(enum mf_field_id mff_ovn_geneve,
>   * group as the logical output port. */
>  put_load(mc->tunnel_key, MFF_LOG_OUTPORT, 0, 32, ofpacts_p);
>
> -ofctrl_add_flow(OFTABLE_LOCAL_OUTPUT, 100,
> -, 

Re: [ovs-dev] [PATCH v1 3/3] ovn-northd: Add logical flows to support DHCPv6

2016-07-26 Thread Ben Pfaff
On Wed, Jul 27, 2016 at 12:55:24AM +0530, Numan Siddique wrote:
> OVN implements native DHCPv6. DHCPv6 options are stored
> in the 'DHCP_Options' NB table and logical ports refer to this
> table to configure the DHCPv6 options.
> 
> For each logical port configured with DHCPv6 options, the following flows
> are added:
>  - A logical flow which copies the DHCPv6 options to the DHCPv6
>    request packets using the 'put_dhcpv6_opts' action and advances the
>    packet to the next stage.
> 
>  - A logical flow which implements the DHCPv6 responder by sending
>    the DHCPv6 reply back to the inport once the 'put_dhcpv6_opts' action
>    is applied.
> 
> Signed-off-by: Numan Siddique 

I think that the change to packets.c is a mistake; I don't see a reason
for it.

build_dhcpv6_action() could use ipv6_mask_is_any() instead of checking
for zeros by hand.  Actually ipv6_mask_is_any() is a pretty terrible
name for that function; it at least needs a comment to indicate that it
checks for all-zeros.

(Also not all systems define the s6_addr32 member of struct in6_addr.)

It seems fairly obnoxious to give the DHCPv6 options all-caps names,
could we make them lowercase instead?

A buffer for formatting an IPv6 address should be INET6_ADDRSTRLEN bytes
long, not IPV6_SCAN_LEN.  I guess they're the same number but
conceptually fairly different.

I don't see why build_dhcpv6_action() and build_acls() memset the string
buffers before formatting into them.
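
To make the last few points concrete, here is a rough standalone sketch
using only standard socket headers, not the OVS helpers.  ipv6_is_all_zeros()
and ipv6_format() are hypothetical names for illustration.  The sketch
avoids s6_addr32, sizes the buffer with INET6_ADDRSTRLEN, and does no
memset, since inet_ntop() writes a NUL-terminated string itself.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: returns true if every byte of 'addr' is zero.
 * Compares against the standard in6addr_any constant instead of touching
 * the nonportable s6_addr32 member. */
static bool
ipv6_is_all_zeros(const struct in6_addr *addr)
{
    return !memcmp(addr, &in6addr_any, sizeof *addr);
}

/* Hypothetical helper: formats 'addr' into 'buf', which must have room
 * for INET6_ADDRSTRLEN bytes.  No memset is needed beforehand because
 * inet_ntop() NUL-terminates its output.  Returns 'buf', or NULL on
 * error. */
static const char *
ipv6_format(const struct in6_addr *addr, char buf[INET6_ADDRSTRLEN])
{
    assert(addr != NULL && buf != NULL);
    return inet_ntop(AF_INET6, addr, buf, INET6_ADDRSTRLEN);
}
```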

Like DHCPv4, I don't think that nothing_to_add is a safe optimization in
practice.  It'll burn us eventually.

Please write
+}
+else {
as
+} else {

Thanks,

Ben.


[ovs-dev] [PATCH] ovn: Rename "gateway" to "l3gateway".

2016-07-26 Thread Russell Bryant
When L3 gateway support was added, it introduced a port type called
"gateway" and a corresponding option called "gateway-chassis".  Since
that time, we also have an L2 gateway port type called "l2gateway" and a
corresponding option called "l2gateway-chassis".  This patch renames the
L3 gateway port type and option to "l3gateway" and "l3gateway-chassis"
to make things a little more clear and consistent.

Signed-off-by: Russell Bryant 
---
 ovn/controller/binding.c |  2 +-
 ovn/controller/patch.c   |  4 ++--
 ovn/northd/ovn-northd.c  | 12 ++--
 ovn/ovn-sb.xml   | 18 +-
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index e83c1d5..78ebec4 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -219,7 +219,7 @@ consider_local_datapath(struct controller_ctx *ctx,
 add_local_datapath(local_datapaths, binding_rec);
 }
 } else if (chassis_rec && binding_rec->chassis == chassis_rec
-   && strcmp(binding_rec->type, "gateway")) {
+   && strcmp(binding_rec->type, "l3gateway")) {
 if (ctx->ovnsb_idl_txn) {
 VLOG_INFO("Releasing lport %s from this chassis.",
   binding_rec->logical_port);
diff --git a/ovn/controller/patch.c b/ovn/controller/patch.c
index 707d08b..012e6ba 100644
--- a/ovn/controller/patch.c
+++ b/ovn/controller/patch.c
@@ -346,9 +346,9 @@ add_logical_patch_ports(struct controller_ctx *ctx,
 const struct sbrec_port_binding *binding;
 SBREC_PORT_BINDING_FOR_EACH (binding, ctx->ovnsb_idl) {
 bool local_port = false;
-if (!strcmp(binding->type, "gateway")) {
+if (!strcmp(binding->type, "l3gateway")) {
 const char *chassis = smap_get(&binding->options,
-   "gateway-chassis");
+   "l3gateway-chassis");
 if (chassis && !strcmp(local_chassis_id, chassis)) {
 local_port = true;
 }
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 38a3d30..7f5927e 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -770,10 +770,10 @@ ovn_port_update_sbrec(const struct ovn_port *op)
 sbrec_port_binding_set_datapath(op->sb, op->od->sb);
 if (op->nbrp) {
 /* If the router is for l3 gateway, it resides on a chassis
- * and its port type is "gateway". */
+ * and its port type is "l3gateway". */
 const char *chassis = smap_get(&op->od->nbr->options, "chassis");
 if (chassis) {
-sbrec_port_binding_set_type(op->sb, "gateway");
+sbrec_port_binding_set_type(op->sb, "l3gateway");
 } else {
 sbrec_port_binding_set_type(op->sb, "patch");
 }
@@ -783,7 +783,7 @@ ovn_port_update_sbrec(const struct ovn_port *op)
 smap_init();
 smap_add(, "peer", peer);
 if (chassis) {
-smap_add(, "gateway-chassis", chassis);
+smap_add(, "l3gateway-chassis", chassis);
 }
 sbrec_port_binding_set_options(op->sb, );
 smap_destroy();
@@ -802,9 +802,9 @@ ovn_port_update_sbrec(const struct ovn_port *op)
 }
 
 /* A switch port connected to a gateway router is also of
- * type "gateway". */
+ * type "l3gateway". */
 if (chassis) {
-sbrec_port_binding_set_type(op->sb, "gateway");
+sbrec_port_binding_set_type(op->sb, "l3gateway");
 } else {
 sbrec_port_binding_set_type(op->sb, "patch");
 }
@@ -818,7 +818,7 @@ ovn_port_update_sbrec(const struct ovn_port *op)
 smap_init();
 smap_add(, "peer", router_port);
 if (chassis) {
-smap_add(, "gateway-chassis", chassis);
+smap_add(, "l3gateway-chassis", chassis);
 }
 sbrec_port_binding_set_options(op->sb, );
 smap_destroy();
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 3d26e65..3cdf91b 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -1420,8 +1420,8 @@ tcp.flags = RST;
   database, which identifies logical ports via the conventions described
   in IntegrationGuide.md.  (The exceptions are for
   Port_Binding records with type of
-  gateway, whose locations are identified by
-  ovn-northd via the options:gateway-chassis
+  l3gateway, whose locations are identified by
+  ovn-northd via the options:l3gateway-chassis
   column in this table.  ovn-controller is still responsible
   to populate the chassis column.)
 
@@ -1475,12 +1475,12 @@ tcp.flags = RST;
 connectivity to the corresponding physical network.
   
 
-  gateway
+  l3gateway
   
 The physical location of the L3 gateway.  To successfully identify a
 chassis, 

Re: [ovs-dev] [PATCH] selinux: Allow ovs-ctl force-reload-kmod.

2016-07-26 Thread Flavio Leitner
On Tue, Jul 26, 2016 at 01:31:00PM -0700, Joe Stringer wrote:
> On 26 July 2016 at 13:00, Flavio Leitner  wrote:
> > On Tue, Jul 26, 2016 at 12:41:01PM -0700, Joe Stringer wrote:
> >> On 25 July 2016 at 16:57, Flavio Leitner  wrote:
> >> > On Fri, Jul 22, 2016 at 02:10:51PM -0700, Joe Stringer wrote:
> >> >> When invoking ovs-ctl force-reload-kmod via '/etc/init.d/openvswitch
> >> >> force-reload-kmod', spurious errors would output related to 'hostname'
> >> >> and 'ip', and the system's selinux audit log would complain about some
> >> >> of the invocations such as those listed at the end of this commit message.
> >> >>
> >> >> This patch loosens restrictions for openvswitch_t (used for ovs-ctl, as
> >> >> well as all of the OVS daemons) to allow it to execute 'hostname' and
> >> >> 'ip' commands, and also to execute temporary files created as
> >> >> openvswitch_tmp_t. This allows force-reload-kmod to run correctly.
> >> >>
> >> >> Example audit logs:
> >> >> type=AVC msg=audit(1468515192.912:16720): avc:  denied  { getattr } for
> >> >> pid=11687 comm="ovs-ctl" path="/usr/bin/hostname" dev="dm-1"
> >> >> ino=33557805 scontext=system_u:system_r:openvswitch_t:s0
> >> >> tcontext=system_u:object_r:hostname_exec_t:s0 tclass=file
> >> >>
> >> >> type=AVC msg=audit(1468519445.766:16829): avc:  denied  { getattr } for
> >> >> pid=13920 comm="ovs-save" path="/usr/sbin/ip" dev="dm-1" ino=67572988
> >> >> scontext=unconfined_u:system_r:openvswitch_t:s0
> >> >> tcontext=system_u:object_r:ifconfig_exec_t:s0 tclass=file
> >> >>
> >> >> type=AVC msg=audit(1468519445.890:16833): avc:  denied  { execute } for
> >> >> pid=13849 comm="ovs-ctl" name="tmp.jdEGHntG3Z" dev="dm-1" ino=106876762
> >> >> scontext=unconfined_u:system_r:openvswitch_t:s0
> >> >> tcontext=unconfined_u:object_r:openvswitch_tmp_t:s0 tclass=file
> >> >>
> >> >> Signed-off-by: Joe Stringer 
> >> >> ---
> >> >
> >> > LGTM.
> >> > Acked-by: Flavio Leitner 
> >>
> >> Thanks for the review, applied to master.
> >
> > I also opened a bug to get this fixed in Fedora:
> >
> > Bug 1360465 - SELinux blocks OVS to run 'hostname' and 'ip'
> > https://bugzilla.redhat.com/show_bug.cgi?id=1360465
> >
> Thanks. For what it's worth, when I tried, if I invoke
> "/usr/share/openvswitch/scripts/ovs-ctl force-reload-kmod" directly on
> centos7, OVS restarts unconfined. Usually in the openvswitch.spec path
> I will run it indirectly via /etc/init.d/openvswitch, but that isn't
> an option in the fedora packaging.

Right, because systemd doesn't support custom actions, so we have
a few fixed actions available to play with.  The plan is to move to
1:1 mapping between services and OVS daemons and run external scripts
to manage those.  See Aaron's patchset stepping in that direction.

The 'hostname' issue affects openvswitch-fedora.spec as well.

-- 
fbl



Re: [ovs-dev] [PATCH v3] Scanning only changed entries in the ovnsb

2016-07-26 Thread Hui Kang


Russell Bryant  wrote on 07/26/2016 03:50:44 PM:

> From: Russell Bryant 
> To: Hui Kang/Watson/IBM@IBMUS
> Cc: Ben Pfaff , Hui Kang , ovs
> dev 
> Date: 07/26/2016 03:51 PM
> Subject: Re: [ovs-dev] [PATCH v3] Scanning only changed entries in the ovnsb
>
> On Tue, Jul 26, 2016 at 3:44 PM, Hui Kang  wrote:
>
>
> "dev"  wrote on 07/26/2016 02:20:27 PM:
>
> > From: Ben Pfaff 
> > To: Hui Kang 
> > Cc: dev@openvswitch.org
> > Date: 07/26/2016 02:20 PM
> > Subject: Re: [ovs-dev] [PATCH v3] Scanning only changed entries in the ovnsb
> > Sent by: "dev" 
> >
> > On Sat, Jul 16, 2016 at 11:58:25PM -0400, Hui Kang wrote:
> > > Improve performance by scanning only changed port binding entries
> > > when determining whether to mark the logical switch port up or
> > > down
> > >
> > > Signed-off-by: Hui Kang 
> >
> > Won't this skip an initial round of updates at ovn-northd startup time?
> > (Certainly ovn-northd might get killed and restarted occasionally,
> > especially if we're doing failover to a second host.)
>
> Hi, Ben,
> On second thought, I think skipping the initial round is the purpose of
> this patch.
>
> ovsdb_idl_create(ovsdb) copies the Port_Binding table from the southbound
> database whenever ovn-northd gets started. In this case, the northbound
> DB and southbound DB are synced. In ovnsb_db_run, ovn-northd only gets
> notified when there is a change to the Chassis column [1]. Therefore,
> ovnsb_db_run should only look at the entries whose Chassis column has
> changed. There is no need to initialize by iterating over every entry in
> the Port_Binding table. Please correct me if my understanding is incorrect.
> Thanks.
>
> What if the Chassis column changes in some Port_Binding records
> while ovn-northd isn't running?

Hi Russell,
Thanks for correcting me. In this case, initialization is necessary each
time ovn-northd restarts.
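
The conclusion reached in this thread (a full pass after every restart,
then changed entries only) might look roughly like the sketch below.  It
is an illustration under stated assumptions and not the patch itself:
struct toy_binding and sync_port_state() are hypothetical, and a plain
'changed' flag stands in for the IDL change tracking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a Port_Binding row. */
struct toy_binding {
    bool has_chassis;   /* Chassis column is set. */
    bool changed;       /* Row changed since the previous run. */
    bool up;            /* Derived state of the logical switch port. */
};

/* Updates 'up' for the rows that need it and returns how many rows were
 * examined.  On the first call after startup ('*initialized' is false)
 * every row is examined, because rows may have changed while the daemon
 * was not running.  Later calls examine only rows flagged as changed. */
static size_t
sync_port_state(struct toy_binding *rows, size_t n, bool *initialized)
{
    size_t examined = 0;

    assert(initialized != NULL);
    for (size_t i = 0; i < n; i++) {
        if (*initialized && !rows[i].changed) {
            continue;
        }
        rows[i].up = rows[i].has_chassis;
        rows[i].changed = false;
        examined++;
    }
    *initialized = true;
    return examined;
}
```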

- Hui

>
> --
> Russell Bryant


Re: [ovs-dev] [PATCH v1 2/3] ovn-controller: Add 'put_dhcpv6_opts' action in ovn-controller

2016-07-26 Thread Ben Pfaff
On Wed, Jul 27, 2016 at 12:55:00AM +0530, Numan Siddique wrote:
> This patch adds a new OVN action 'put_dhcpv6_opts' to support native
> DHCPv6 in OVN.
> 
> ovn-controller parses this action and adds a NXT_PACKET_IN2
> OF flow with 'pause' flag set and the DHCPv6 options stored in
> 'userdata' field.
> 
> When a valid DHCPv6 packet is received by ovn-controller, it frames a
> new DHCPv6 reply packet with the DHCPv6 options present in the
> 'userdata' field, resumes the packet, and stores 1 in the 1-bit subfield.
> If the packet is invalid, it resumes the packet without modifying it and
> stores 0 in the 1-bit subfield.
> 
> E.g. reg0[3] = put_dhcpv6_opts(IA_ADDR = aef0::4, SERVER_ID = 00:00:00:00:10:02,
>  DNS_RECURSIVE_SERVER={ae70::1,ae70::2})
> 
> A new 'DHCPv6_Options' table is added in SB DB which stores
> the supported DHCPv6 options with DHCPv6 code and type. ovn-northd is
> expected to populate this table.
> 
> An upcoming patch will add logical flows with this action.
> 
> Signed-off-by: Numan Siddique 

Same comment here as previously, that the put_dhcpv6_opts action needs
documentation in ovn-sb.xml.


Re: [ovs-dev] [PATCH v1 1/3] ovn-northd: Add logical flows to support native DHCPv4

2016-07-26 Thread Ben Pfaff
I added them myself, thanks for the reminder.

On Wed, Jul 27, 2016 at 01:12:38AM +0530, Numan Siddique wrote:
> This patch has "Tested-by: Ramu Ramamurthy "
> and "Acked-by: Ramu Ramamurthy " and every time
> I forget to add this when I resubmit the patch.
> 
> Ramu - My apologies
> 
> Thanks
> Numan
> 
> 
> On Wed, Jul 27, 2016 at 12:54 AM, Numan Siddique 
> wrote:
> 
> > OVN implements native DHCPv4 support which caters to the common
> > use case of providing an IP address to a booting instance by
> > providing stateless replies to DHCPv4 requests based on statically
> > configured address mappings. To do this it allows a short list of
> > DHCPv4 options to be configured and applied at each compute host
> > running ovn-controller.
> >
> > A new table 'DHCP_Options' is added in OVN NB DB to store the DHCP
> > options. Logical ports refer to this table to configure the DHCPv4
> > options.
> >
> > For each logical port configured with DHCPv4 options, the following flows
> > are added:
> >  - A logical flow which copies the DHCPv4 options to the DHCPv4
> >    request packets using the 'put_dhcp_opts' action and advances the
> >    packet to the next stage.
> >
> >  - A logical flow which implements the DHCP responder by sending
> >    the DHCPv4 reply back to the inport once the 'put_dhcp_opts' action
> >    is applied.
> >
> > Signed-off-by: Numan Siddique 
> > Co-authored-by: Ben Pfaff 
> > Signed-off-by: Ben Pfaff 
> > ---
> >  ovn/northd/ovn-northd.8.xml   |  91 +-
> >  ovn/northd/ovn-northd.c   | 256 +-
> >  ovn/ovn-nb.ovsschema  |  20 ++-
> >  ovn/ovn-nb.xml| 270
> >  ovn/utilities/ovn-nbctl.8.xml |  30 +
> >  ovn/utilities/ovn-nbctl.c | 197 +
> >  tests/ovn.at  | 281 ++
> >  7 files changed, 1135 insertions(+), 10 deletions(-)
> >
> > diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> > index ced2839..b95caef 100644
> > --- a/ovn/northd/ovn-northd.8.xml
> > +++ b/ovn/northd/ovn-northd.8.xml
> > @@ -457,7 +457,90 @@ output;
> >
> >  
> >
> > -Ingress Table 10: Destination Lookup
> > +Ingress Table 10: DHCP option processing
> > +
> > +
> > +  This table adds the DHCPv4 options to a DHCPv4 packet from the
> > +  logical ports configured with IPv4 address(es) and DHCPv4 options.
> > +
> > +
> > +
> > +  
> > +
> > +  A priority-100 logical flow is added for these logical ports
> > +  which matches the IPv4 packet with udp.src = 68 and
> > +  udp.dst = 67 and applies the action
> > +  put_dhcp_opts and advances the packet to the next table.
> > +
> > +
> > +
> > +reg0[3] = put_dhcp_opts(offer_ip = O, options...);
> > +next;
> > +
> > +
> > +
> > +  For DHCPDISCOVER and DHCPREQUEST, this transforms the packet into a
> > +  DHCP reply, adds the DHCP offer IP O and options to the
> > +  packet, and stores 1 into reg0[3].  For other kinds of packets, it
> > +  just stores 0 into reg0[3].  Either way, it continues to the next
> > +  table.
> > +
> > +
> > +  
> > +
> > +  
> > +A priority-0 flow that matches all packets to advances to table 11.
> > +  
> > +
> > +
> > +Ingress Table 11: DHCP responses
> > +
> > +
> > +  This table implements DHCP responder for the DHCP replies generated by
> > +  the previous table.
> > +
> > +
> > +
> > +  
> > +
> > +  A priority 100 logical flow is added for the logical ports configured
> > +  with DHCPv4 options which matches IPv4 packets with udp.src == 68
> > +  && udp.dst == 67 && reg0[3] == 1 and
> > +  responds back to the inport after applying these
> > +  actions.  If reg0[3] is set to 1, it means that the
> > +  action put_dhcp_opts was successful.
> > +
> > +
> > +
> > +eth.dst = eth.src;
> > +eth.src = E;
> > +ip4.dst = O;
> > +ip4.src = S;
> > +udp.src = 67;
> > +udp.dst = 68;
> > +outport = P;
> > +inport = ""; /* Allow sending out inport. */
> > +output;
> > +
> > +
> > +
> > +  where E is the server MAC address and S is the
> > +  server IPv4 address defined in the DHCPv4 options and O is
> > +  the IPv4 address defined in the logical port's addresses column.
> > +
> > +
> > +
> > +  (This terminates ingress packet processing; the packet does not go
> > +   to the next ingress table.)
> > +
> > +  
> > +
> > +  
> > +A priority-0 flow that matches all packets to advances to table
> > 

Re: [ovs-dev] [PATCH v1 1/3] ovn-northd: Add logical flows to support native DHCPv4

2016-07-26 Thread Ben Pfaff
On Wed, Jul 27, 2016 at 12:54:39AM +0530, Numan Siddique wrote:
> OVN implements native DHCPv4 support which caters to the common
> use case of providing an IP address to a booting instance by
> providing stateless replies to DHCPv4 requests based on statically
> configured address mappings. To do this it allows a short list of
> DHCPv4 options to be configured and applied at each compute host
> running ovn-controller.
> 
> A new table 'DHCP_Options' is added in OVN NB DB to store the DHCP
> options. Logical ports refer to this table to configure the DHCPv4
> options.
> 
> For each logical port configured with DHCPv4 options, the following flows
> are added:
>  - A logical flow which copies the DHCPv4 options to the DHCPv4
>    request packets using the 'put_dhcp_opts' action and advances the
>    packet to the next stage.
> 
>  - A logical flow which implements the DHCP responder by sending
>    the DHCPv4 reply back to the inport once the 'put_dhcp_opts' action
>    is applied.
> 
> Signed-off-by: Numan Siddique 
> Co-authored-by: Ben Pfaff 
> Signed-off-by: Ben Pfaff 

Thanks for the revision!

The check_and_add_supported_dhcp_opts_to_sb_db() function seems awfully
optimistic that, once it adds the DHCP options, nothing could ever make
them go away.  I think that's unrealistic: we want ovn-northd to be
resilient against changes to the southbound database, which might be
inadvertent or due to the database getting rolled back or recreated for
various reasons.  So I removed the nothing_to_add check there.
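
As a rough illustration of why the cached flag is unsafe, here is a
standalone sketch of the reconcile-every-run approach.  It is not the
ovn-northd code: struct toy_table and reconcile() are hypothetical, the
option names are only examples, and a small array stands in for the
southbound table, with rows assumed to be unique.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ROWS 16

/* Hypothetical stand-in for the southbound DHCP options table. */
struct toy_table {
    const char *names[MAX_ROWS];
    size_t n;
};

static bool
table_contains(const struct toy_table *t, const char *name)
{
    for (size_t i = 0; i < t->n; i++) {
        if (!strcmp(t->names[i], name)) {
            return true;
        }
    }
    return false;
}

static bool
is_supported(const char *const *supported, size_t n_supported,
             const char *name)
{
    for (size_t i = 0; i < n_supported; i++) {
        if (!strcmp(supported[i], name)) {
            return true;
        }
    }
    return false;
}

/* Makes 'table' hold exactly the 'supported' names and returns the number
 * of rows inserted or deleted.  There is deliberately no static "nothing
 * to add" flag: the table is compared on every call, so rows that vanish
 * behind our back are restored on the next run. */
static size_t
reconcile(struct toy_table *table, const char *const *supported,
          size_t n_supported)
{
    size_t changes = 0;
    size_t kept = 0;

    /* Delete rows that are not in the supported set. */
    for (size_t i = 0; i < table->n; i++) {
        if (is_supported(supported, n_supported, table->names[i])) {
            table->names[kept++] = table->names[i];
        } else {
            changes++;
        }
    }
    table->n = kept;

    /* Insert supported options that are missing. */
    for (size_t i = 0; i < n_supported; i++) {
        if (!table_contains(table, supported[i])) {
            assert(table->n < MAX_ROWS);
            table->names[table->n++] = supported[i];
            changes++;
        }
    }
    return changes;
}
```

With a static flag set after the first successful run, the third call in
a sequence like "populate, no-op, database recreated" would return early
and leave the table empty; without it, the missing rows come back.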

I also made some improvements to the documentation.

I'm appending the changes I folded in.  With those changes, I applied
this to master.  Thanks again!

--8<--cut here-->8--

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 578fbbb..aecc1d8 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -3288,12 +3288,6 @@ static struct dhcp_opts_map supported_dhcp_opts[] = {
 static void
 check_and_add_supported_dhcp_opts_to_sb_db(struct northd_context *ctx)
 {
-static bool nothing_to_add = false;
-
-if (nothing_to_add) {
-return;
-}
-
 struct hmap dhcp_opts_to_add = HMAP_INITIALIZER(&dhcp_opts_to_add);
 for (size_t i = 0; (i < sizeof(supported_dhcp_opts) /
 sizeof(supported_dhcp_opts[0])); i++) {
@@ -3307,18 +3301,13 @@ check_and_add_supported_dhcp_opts_to_sb_db(struct northd_context *ctx)
 dhcp_opts_find(&dhcp_opts_to_add, opt_row->name);
 if (dhcp_opt) {
 hmap_remove(&dhcp_opts_to_add, &dhcp_opt->hmap_node);
-}
-else {
+} else {
 sbrec_dhcp_options_delete(opt_row);
 }
 }
 
-if (!dhcp_opts_to_add.n) {
-nothing_to_add = true;
-}
-
 struct dhcp_opts_map *opt;
-HMAP_FOR_EACH_POP(opt, hmap_node, &dhcp_opts_to_add) {
+HMAP_FOR_EACH (opt, hmap_node, &dhcp_opts_to_add) {
 struct sbrec_dhcp_options *sbrec_dhcp_option =
 sbrec_dhcp_options_insert(ctx->ovnsb_txn);
 sbrec_dhcp_options_set_name(sbrec_dhcp_option, opt->name);
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 86dbfa7..abd0340 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -932,7 +932,7 @@
 
   
 
-  
+  
 
   OVN implements a native DHCPv4 support which caters to the common
   use case of providing an IPv4 address to a booting instance by
@@ -952,34 +952,59 @@
 
   
 CMS should define the set of DHCPv4 options as key/value pairs in the
- column of this table. In order for the
+ column of this table. For
 ovn-controller to include these DHCPv4 options, the
  of 
 should refer to an entry in this table.
   
 
-  
+  
 
-  Below are the supported DHCPv4 options whose values are IPv4 address
-  or addresses. If the value has more than one IPv4 address, then it
-  should be enclosed within '{}' braces. Please refer to the
-  RFC 2132 "https://tools.ietf.org/html/rfc2132; for
-  more details on the DHCPv4 options and their codes.
+  The following options must be defined.
 
 
-
+
+  The IP address for the DHCP server to use.  This should be in the
+  subnet of the offered IP.  This is also included in the DHCP offer as
+  option 54, ``server identifier.''
+
+
+
+  The Ethernet address for the DHCP server to use.
+
+
+
   
-The DHCPv4 option code for this option is 1.
+The IP address of a gateway for the client to use.  This should be
+in the subnet of the offered IP.  The DHCPv4 option code for this
+option is 3.
   
+
 
+
   
-Example. key="netmask", value="255.255.255.0"
+The offered lease time in seconds, 
+  
+
+  
+  

Re: [ovs-dev] [PATCH] selinux: Allow ovs-ctl force-reload-kmod.

2016-07-26 Thread Joe Stringer
On 26 July 2016 at 13:00, Flavio Leitner  wrote:
> On Tue, Jul 26, 2016 at 12:41:01PM -0700, Joe Stringer wrote:
>> On 25 July 2016 at 16:57, Flavio Leitner  wrote:
>> > On Fri, Jul 22, 2016 at 02:10:51PM -0700, Joe Stringer wrote:
>> >> When invoking ovs-ctl force-reload-kmod via '/etc/init.d/openvswitch
>> >> force-reload-kmod', spurious errors would output related to 'hostname'
>> >> and 'ip', and the system's selinux audit log would complain about some
>> >> of the invocations such as those listed at the end of this commit message.
>> >>
>> >> This patch loosens restrictions for openvswitch_t (used for ovs-ctl, as
>> >> well as all of the OVS daemons) to allow it to execute 'hostname' and
>> >> 'ip' commands, and also to execute temporary files created as
>> >> openvswitch_tmp_t. This allows force-reload-kmod to run correctly.
>> >>
>> >> Example audit logs:
>> >> type=AVC msg=audit(1468515192.912:16720): avc:  denied  { getattr } for
>> >> pid=11687 comm="ovs-ctl" path="/usr/bin/hostname" dev="dm-1"
>> >> ino=33557805 scontext=system_u:system_r:openvswitch_t:s0
>> >> tcontext=system_u:object_r:hostname_exec_t:s0 tclass=file
>> >>
>> >> type=AVC msg=audit(1468519445.766:16829): avc:  denied  { getattr } for
>> >> pid=13920 comm="ovs-save" path="/usr/sbin/ip" dev="dm-1" ino=67572988
>> >> scontext=unconfined_u:system_r:openvswitch_t:s0
>> >> tcontext=system_u:object_r:ifconfig_exec_t:s0 tclass=file
>> >>
>> >> type=AVC msg=audit(1468519445.890:16833): avc:  denied  { execute } for
>> >> pid=13849 comm="ovs-ctl" name="tmp.jdEGHntG3Z" dev="dm-1" ino=106876762
>> >> scontext=unconfined_u:system_r:openvswitch_t:s0
>> >> tcontext=unconfined_u:object_r:openvswitch_tmp_t:s0 tclass=file
>> >>
>> >> Signed-off-by: Joe Stringer 
>> >> ---
>> >
>> > LGTM.
>> > Acked-by: Flavio Leitner 
>> >
>> >
>>
>> Thanks for the review, applied to master.
>
> I also opened a bug to get this fixed in Fedora:
>
> Bug 1360465 - SELinux blocks OVS to run 'hostname' and 'ip'
> https://bugzilla.redhat.com/show_bug.cgi?id=1360465
>
> --
> fbl

Thanks. For what it's worth, when I tried, if I invoke
"/usr/share/openvswitch/scripts/ovs-ctl force-reload-kmod" directly on
centos7, OVS restarts unconfined. Usually in the openvswitch.spec path
I will run it indirectly via /etc/init.d/openvswitch, but that isn't
an option in the fedora packaging.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 3/3] test: Replication tests uses unix domain socket

2016-07-26 Thread Andy Zhou
Fix replication test titles to fit changes committed by
63b35ecc06cbd16bc0e93d1e26021d81c413a485.

Signed-off-by: Andy Zhou 
---
 tests/ovsdb-server.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/ovsdb-server.at b/tests/ovsdb-server.at
index e70498d..6c10164 100644
--- a/tests/ovsdb-server.at
+++ b/tests/ovsdb-server.at
@@ -1042,7 +1042,7 @@ AT_BANNER([OVSDB -- ovsdb-server replication 
table-exclusion])
 # TITLE is provided to AT_SETUP and KEYWORDS to AT_KEYWORDS.
 m4_define([OVSDB_CHECK_REPLICATION],
[AT_SETUP([$1])
-   AT_KEYWORDS([ovsdb server tcp replication table-exclusion])
+   AT_KEYWORDS([ovsdb server replication table-exclusion])
$2 > schema
AT_CHECK([ovsdb-tool create db1 schema], [0], [stdout], [ignore])
AT_CHECK([ovsdb-tool create db2 schema], [0], [stdout], [ignore])
-- 
1.9.1



[ovs-dev] [PATCH 2/3] ovsdb: Fix memory leak in replication logic

2016-07-26 Thread Andy Zhou
Release the memory of the reply message to the initial "monitor" request.

Reported-at: http://openvswitch.org/pipermail/dev/2016-July/076075.html
Signed-off-by: Andy Zhou 
---
 ovsdb/replication.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/ovsdb/replication.c b/ovsdb/replication.c
index 3d589ef..af7ae5c 100644
--- a/ovsdb/replication.c
+++ b/ovsdb/replication.c
@@ -365,6 +365,10 @@ get_initial_db_state(const struct db *database)
 if (msg->type == JSONRPC_REPLY) {
 process_notification(msg->result, database->db);
 }
+
+if (msg) {
+jsonrpc_msg_destroy(msg);
+}
 }
 
 static void
-- 
1.9.1

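The ownership rule behind this fix can be sketched as follows. This is only
an illustration: struct reply, reply_create(), reply_destroy() and
handle_initial_state() are hypothetical stand-ins for struct jsonrpc_msg and
the replication code, not the OVS API. The point is that the caller that
receives the reply owns it and must destroy it on every path once it has
been processed.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a JSON-RPC reply message. */
struct reply {
    bool is_reply;      /* Plays the role of msg->type == JSONRPC_REPLY. */
    char *result;       /* Heap-allocated payload owned by the message. */
};

static int live_replies;   /* Messages allocated and not yet freed. */

/* Allocates a reply.  The caller owns the returned message. */
static struct reply *
reply_create(bool is_reply, const char *result)
{
    struct reply *msg = malloc(sizeof *msg);
    if (!msg) {
        return NULL;
    }
    msg->is_reply = is_reply;
    msg->result = malloc(strlen(result) + 1);
    if (!msg->result) {
        free(msg);
        return NULL;
    }
    strcpy(msg->result, result);
    live_replies++;
    return msg;
}

/* Frees a reply and its payload.  Accepts NULL, like free(). */
static void
reply_destroy(struct reply *msg)
{
    if (msg) {
        free(msg->result);
        free(msg);
        live_replies--;
    }
}

/* Consumes 'msg': processes it if it is a reply, then always frees it.
 * Returns the number of payload bytes processed. */
static size_t
handle_initial_state(struct reply *msg)
{
    size_t processed = 0;

    if (msg && msg->is_reply) {
        processed = strlen(msg->result);
    }
    reply_destroy(msg);
    return processed;
}
```

In this sketch the NULL check happens before any field of the message is
read, so the function does not depend on the receive step always returning
a message.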


[ovs-dev] [PATCH 1/3] ovsdb: Properly close replication rpc connection

2016-07-26 Thread Andy Zhou
This patch fixes the RPC-related memory leak reported below.

Reported-at: http://openvswitch.org/pipermail/dev/2016-July/076075.html
Signed-off-by: Andy Zhou 
---
 ovsdb/ovsdb-server.c | 1 +
 ovsdb/replication.c  | 5 +++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
index 239cca8..1c6ddca 100644
--- a/ovsdb/ovsdb-server.c
+++ b/ovsdb/ovsdb-server.c
@@ -202,6 +202,7 @@ main_loop(struct ovsdb_jsonrpc_server *jsonrpc, struct 
shash *all_dbs,
 }
 }
 
+disconnect_remote_server();
 free(remotes_error);
 }
 
diff --git a/ovsdb/replication.c b/ovsdb/replication.c
index 52b7085..3d589ef 100644
--- a/ovsdb/replication.c
+++ b/ovsdb/replication.c
@@ -32,8 +32,8 @@
 #include "table.h"
 #include "transaction.h"
 
-static char *remote_ovsdb_server;
-static struct jsonrpc *rpc;
+static char *remote_ovsdb_server = NULL;
+static struct jsonrpc *rpc = NULL;
 static struct sset monitored_tables = SSET_INITIALIZER(&monitored_tables);
 static struct sset tables_blacklist = SSET_INITIALIZER(&tables_blacklist);
 static bool reset_dbs = true;
@@ -391,6 +391,7 @@ check_for_notifications(struct shash *all_dbs)
 if (error == EAGAIN) {
 return;
 } else if (error) {
+jsonrpc_close(rpc);
 rpc = open_jsonrpc(remote_ovsdb_server);
 if (!rpc) {
 /* Remote server went down. */
-- 
1.9.1

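The reconnect part of this change follows a common pattern: close the old
handle before overwriting the pointer that refers to it. A minimal sketch,
assuming a hypothetical struct conn with conn_open()/conn_close() helpers in
place of struct jsonrpc, open_jsonrpc() and jsonrpc_close():

```c
#include <stdlib.h>

/* Hypothetical stand-in for a connection handle such as 'struct jsonrpc'. */
struct conn {
    int id;
};

static int open_conns;   /* Handles opened and not yet closed. */

/* Opens a new connection.  Returns NULL on allocation failure. */
static struct conn *
conn_open(int id)
{
    struct conn *c = malloc(sizeof *c);
    if (c) {
        c->id = id;
        open_conns++;
    }
    return c;
}

/* Closes a connection.  Accepts NULL, like free(). */
static void
conn_close(struct conn *c)
{
    if (c) {
        free(c);
        open_conns--;
    }
}

/* Replaces '*connp' with a fresh connection.  The old handle is closed
 * first so that overwriting the pointer cannot leak it. */
static void
conn_reconnect(struct conn **connp, int new_id)
{
    conn_close(*connp);
    *connp = conn_open(new_id);
}
```

Initializing the handle to NULL, as the patch does for the static
variables, is what makes the first call to a helper like conn_reconnect()
safe in this sketch.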


Re: [ovs-dev] [PATCH] selinux: Allow ovs-ctl force-reload-kmod.

2016-07-26 Thread Flavio Leitner
On Tue, Jul 26, 2016 at 12:41:01PM -0700, Joe Stringer wrote:
> On 25 July 2016 at 16:57, Flavio Leitner  wrote:
> > On Fri, Jul 22, 2016 at 02:10:51PM -0700, Joe Stringer wrote:
> >> When invoking ovs-ctl force-reload-kmod via '/etc/init.d/openvswitch
> >> force-reload-kmod', spurious errors would output related to 'hostname'
> >> and 'ip', and the system's selinux audit log would complain about some
> >> of the invocations such as those listed at the end of this commit message.
> >>
> >> This patch loosens restrictions for openvswitch_t (used for ovs-ctl, as
> >> well as all of the OVS daemons) to allow it to execute 'hostname' and
> >> 'ip' commands, and also to execute temporary files created as
> >> openvswitch_tmp_t. This allows force-reload-kmod to run correctly.
> >>
> >> Example audit logs:
> >> type=AVC msg=audit(1468515192.912:16720): avc:  denied  { getattr } for
> >> pid=11687 comm="ovs-ctl" path="/usr/bin/hostname" dev="dm-1"
> >> ino=33557805 scontext=system_u:system_r:openvswitch_t:s0
> >> tcontext=system_u:object_r:hostname_exec_t:s0 tclass=file
> >>
> >> type=AVC msg=audit(1468519445.766:16829): avc:  denied  { getattr } for
> >> pid=13920 comm="ovs-save" path="/usr/sbin/ip" dev="dm-1" ino=67572988
> >> scontext=unconfined_u:system_r:openvswitch_t:s0
> >> tcontext=system_u:object_r:ifconfig_exec_t:s0 tclass=file
> >>
> >> type=AVC msg=audit(1468519445.890:16833): avc:  denied  { execute } for
> >> pid=13849 comm="ovs-ctl" name="tmp.jdEGHntG3Z" dev="dm-1" ino=106876762
> >> scontext=unconfined_u:system_r:openvswitch_t:s0
> >> tcontext=unconfined_u:object_r:openvswitch_tmp_t:s0 tclass=file
> >>
> >> Signed-off-by: Joe Stringer 
> >> ---
> >
> > LGTM.
> > Acked-by: Flavio Leitner 
> >
> >
> 
> Thanks for the review, applied to master.

I also opened a bug to get this fixed in Fedora:

Bug 1360465 - SELinux blocks OVS to run 'hostname' and 'ip'
https://bugzilla.redhat.com/show_bug.cgi?id=1360465

-- 
fbl



Re: [ovs-dev] [PATCH] ovn-controller: squelch expected duplicate flow warnings

2016-07-26 Thread Guru Shetty
On 24 July 2016 at 10:07, Ryan Moats  wrote:

> In the physical processing of ovn-controller, there are two
> sets of OF flows that are still fully recalculated every cycle:
>
>   Flows that aren't associated with any logical flow, and
>   Flows calculated based on multicast groups
>
> Because these flows are recalculated fully each cycle, full
> duplicates of existing OF flows are created and the OF management
> code in ovn-controller pollutes the logs with false positive
> warnings about repeated duplicates.
>
> As a short term measure, ignore full duplicates for both of
> these types of flows, but still warn if the action changes
> (as that is not expected and may be indicative of a problem).
>
> Signed-off-by: Ryan Moats 
>
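The approach described in the commit message above can be sketched in a few
lines. This is a simplified illustration, not the ovn-controller code: the
flow table, the string-based match and actions, and the counters are all
hypothetical. One internal function takes a 'dupwarn' flag, and two thin
wrappers select whether exact duplicates are reported; a duplicate whose
actions differ is always reported.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FLOWS 16

/* Hypothetical, much simplified flow: a match string and an action string.
 * The strings are not copied, so they must outlive the table. */
struct flow {
    const char *match;
    const char *actions;
};

static struct flow flows[MAX_FLOWS];
static size_t n_flows;
static int n_info;   /* "duplicate flow" messages logged. */
static int n_warn;   /* "duplicate flow with modified action" messages. */

/* Adds a flow unless one with the same match already exists.  An exact
 * duplicate is logged only if 'dupwarn' is true.  A duplicate whose actions
 * differ is always logged as a warning. */
static void
add_flow__(const char *match, const char *actions, bool dupwarn)
{
    for (size_t i = 0; i < n_flows; i++) {
        if (!strcmp(flows[i].match, match)) {
            if (!strcmp(flows[i].actions, actions)) {
                if (dupwarn) {
                    n_info++;
                }
            } else {
                n_warn++;
            }
            return;
        }
    }
    if (n_flows < MAX_FLOWS) {
        flows[n_flows].match = match;
        flows[n_flows].actions = actions;
        n_flows++;
    }
}

/* Default entry point: exact duplicates are reported. */
static void
add_flow(const char *match, const char *actions)
{
    add_flow__(match, actions, true);
}

/* For flows that are recalculated every cycle: exact duplicates are
 * expected and therefore not reported. */
static void
add_flow_no_warn(const char *match, const char *actions)
{
    add_flow__(match, actions, false);
}
```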

Even with this patch, I get consistent unit test failures when I run the OVN
system tests via:

make check-kernel TESTSUITEFLAGS="-k ovn"

The test that fails has the following warning:
+2016-07-26T09:58:04.535Z|00013|ofctrl|WARN|duplicate flow with modified
action for parent a356d28e-84f1-4984-94b2-3ee5a3db2b9b: table_id=32,
priority=100, reg15=0x,metadata=0x6,
actions=set_field:0x1->reg15,resubmit(,34),set_field:0x->reg15,resubmit(,33)




> ---
>  ovn/controller/ofctrl.c   | 26 +-
>  ovn/controller/ofctrl.h   |  3 +++
>  ovn/controller/physical.c | 28 +++-
>  3 files changed, 43 insertions(+), 14 deletions(-)
>
> diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
> index f0451b7..2b26f2d 100644
> --- a/ovn/controller/ofctrl.c
> +++ b/ovn/controller/ofctrl.c
> @@ -550,10 +550,10 @@ log_ovn_flow_rl(struct vlog_rate_limit *rl, enum
> vlog_level level,
>   *
>   * This just assembles the desired flow tables in memory.  Nothing is
> actually
>   * sent to the switch until a later call to ofctrl_run(). */
> -void
> -ofctrl_add_flow(uint8_t table_id, uint16_t priority,
> +static void
> +_ofctrl_add_flow(uint8_t table_id, uint16_t priority,
>  const struct match *match, const struct ofpbuf *actions,
> -const struct uuid *uuid)
> +const struct uuid *uuid, bool dupwarn)
>  {
>  /* Structure that uses table_id+priority+various things as hashes. */
>  struct ovn_flow *f = xmalloc(sizeof *f);
> @@ -591,8 +591,10 @@ ofctrl_add_flow(uint8_t table_id, uint16_t priority,
>   */
>  if (ofpacts_equal(f->ofpacts, f->ofpacts_len,
>d->ofpacts, d->ofpacts_len)) {
> -static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(5, 1);
> -log_ovn_flow_rl(&rl, VLL_INFO, f, "duplicate flow");
> +if (dupwarn) {
> +static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(5, 1);
> +log_ovn_flow_rl(&rl, VLL_INFO, f, "duplicate flow");
> +}
>  } else {
>  static struct vlog_rate_limit rl =
> VLOG_RATE_LIMIT_INIT(5, 1);
>  log_ovn_flow_rl(&rl, VLL_WARN, f,
> @@ -617,6 +619,20 @@ ofctrl_add_flow(uint8_t table_id, uint16_t priority,
>  f->uuid_hindex_node.hash);
>  }
>
> +void
> +ofctrl_add_flow(uint8_t table_id, uint16_t priority,
> +const struct match *match, const struct ofpbuf *actions,
> +const struct uuid *uuid) {
> +_ofctrl_add_flow(table_id, priority, match, actions, uuid, true);
> +}
> +
> +void
> +ofctrl_add_flow_no_warn(uint8_t table_id, uint16_t priority,
> +const struct match *match, const struct ofpbuf
> *actions,
> +const struct uuid *uuid) {
> +_ofctrl_add_flow(table_id, priority, match, actions, uuid, false);
> +}
> +
>  /* Removes a bundles of flows from the flow table. */
>  void
>  ofctrl_remove_flows(const struct uuid *uuid)
> diff --git a/ovn/controller/ofctrl.h b/ovn/controller/ofctrl.h
> index 49b95b0..b591e82 100644
> --- a/ovn/controller/ofctrl.h
> +++ b/ovn/controller/ofctrl.h
> @@ -42,6 +42,9 @@ struct ovn_flow *ofctrl_dup_flow(struct ovn_flow
> *source);
>  void ofctrl_add_flow(uint8_t table_id, uint16_t priority,
>   const struct match *, const struct ofpbuf *ofpacts,
>   const struct uuid *uuid);
> +void ofctrl_add_flow_no_warn(uint8_t table_id, uint16_t priority,
> + const struct match *, const struct ofpbuf
> *ofpacts,
> + const struct uuid *uuid);
>
>  void ofctrl_remove_flows(const struct uuid *uuid);
>
> diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
> index a104e33..9e6dff4 100644
> --- a/ovn/controller/physical.c
> +++ b/ovn/controller/physical.c
> @@ -549,8 +549,9 @@ consider_mc_group(enum mf_field_id mff_ovn_geneve,
>   * group as the logical output port. */
>  put_load(mc->tunnel_key, MFF_LOG_OUTPORT, 0, 32, ofpacts_p);
>
> -ofctrl_add_flow(OFTABLE_LOCAL_OUTPUT, 100,
> -   

[ovs-dev] [PATCH v4] Windows: Local named pipe implementation

2016-07-26 Thread Alin Serdean
Currently, for the punix/unix command line arguments on Windows, we
create a file and write into it the TCP port number to connect to. This
is a security concern.

This patch adds support for the command line arguments punix/unix trying
to mimic AF_UNIX behind a local named pipe.

This patch drops the TCP socket implementation behind command line
arguments punix/unix and switches to the local named pipe implementation.

Since we do not write anything to the file created by the punix/unix
arguments, switch tests to plain file existence.

Man pages and code comments have been updated.

Signed-off-by: Alin Gabriel Serdean 
Acked-by: Paul Boca 
---
v4: improve spelling in man pages
v3: squash commits update documentation and code comments
v2: Address comments, fix handle leaks.
---
 lib/automake.mk  |   1 +
 lib/stream-tcp.c | 115 --
 lib/stream-windows.c | 587 +++
 lib/unixctl.c|   5 +-
 lib/unixctl.man  |  11 +-
 lib/vconn-active.man |   3 +-
 ovsdb/remote-active.man  |   4 +-
 ovsdb/remote-passive.man |   4 +-
 tests/ovsdb-server.at|   6 +-
 9 files changed, 606 insertions(+), 130 deletions(-)
 create mode 100644 lib/stream-windows.c

diff --git a/lib/automake.mk b/lib/automake.mk
index 71c9d41..9067c95 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -293,6 +293,7 @@ lib_libopenvswitch_la_SOURCES += \
lib/latch-windows.c \
lib/route-table-stub.c \
lib/if-notifier-stub.c \
+   lib/stream-windows.c \
lib/strsep.c
 else
 lib_libopenvswitch_la_SOURCES += \
diff --git a/lib/stream-tcp.c b/lib/stream-tcp.c
index 2b57ca7..1749fad 100644
--- a/lib/stream-tcp.c
+++ b/lib/stream-tcp.c
@@ -74,64 +74,6 @@ const struct stream_class tcp_stream_class = {
 NULL,   /* run_wait */
 NULL,   /* wait */
 };
-
-#ifdef _WIN32
-#include "dirs.h"
-
-static int
-windows_open(const char *name, char *suffix, struct stream **streamp,
- uint8_t dscp)
-{
-int error, port;
-FILE *file;
-char *suffix_new, *path;
-
-/* If the path does not contain a ':', assume it is relative to
- * OVS_RUNDIR. */
-if (!strchr(suffix, ':')) {
-path = xasprintf("%s/%s", ovs_rundir(), suffix);
-} else {
-path = xstrdup(suffix);
-}
-
-file = fopen(path, "r");
-if (!file) {
-error = errno;
-VLOG_DBG("%s: could not open %s (%s)", name, suffix,
- ovs_strerror(error));
-return error;
-}
-
-error = fscanf(file, "%d", &port);
-if (error != 1) {
-VLOG_ERR("failed to read port from %s", suffix);
-fclose(file);
-return EINVAL;
-}
-fclose(file);
-
-suffix_new = xasprintf("127.0.0.1:%d", port);
-
-error = tcp_open(name, suffix_new, streamp, dscp);
-
-free(suffix_new);
-free(path);
-return error;
-}
-
-const struct stream_class windows_stream_class = {
-"unix", /* name */
-false,  /* needs_probes */
-windows_open,  /* open */
-NULL,   /* close */
-NULL,   /* connect */
-NULL,   /* recv */
-NULL,   /* send */
-NULL,   /* run */
-NULL,   /* run_wait */
-NULL,   /* wait */
-};
-#endif
 
 /* Passive TCP. */
 
@@ -198,60 +140,3 @@ const struct pstream_class ptcp_pstream_class = {
 NULL,
 NULL,
 };
-
-#ifdef _WIN32
-static int
-pwindows_open(const char *name, char *suffix, struct pstream **pstreamp,
-  uint8_t dscp)
-{
-int error;
-char *suffix_new, *path;
-FILE *file;
-struct pstream *listener;
-
-suffix_new = xstrdup("0:127.0.0.1");
-
-/* If the path does not contain a ':', assume it is relative to
- * OVS_RUNDIR. */
-if (!strchr(suffix, ':')) {
-path = xasprintf("%s/%s", ovs_rundir(), suffix);
-} else {
-path = xstrdup(suffix);
-}
-
-error = new_pstream(suffix_new, name, pstreamp, dscp, path, false);
-if (error) {
-goto exit;
-}
-listener = *pstreamp;
-
-file = fopen(path, "w");
-if (!file) {
-error = errno;
-VLOG_DBG("could not open %s (%s)", path, ovs_strerror(error));
-goto exit;
-}
-
-fprintf(file, "%d\n", ntohs(listener->bound_port));
-if (fflush(file) == EOF) {
-error = EIO;
-VLOG_ERR("write failed for %s", path);
-fclose(file);
-goto exit;
-}
-fclose(file);
-
-exit:
-free(suffix_new);
-return error;
-}
-
-const struct pstream_class pwindows_pstream_class = {
-"punix",
-false,
-pwindows_open,
-NULL,
-NULL,
-NULL,
-};
-#endif
diff --git a/lib/stream-windows.c b/lib/stream-windows.c
new file mode 100644
index 

Re: [ovs-dev] [PATCH 2/2] rhel: Allow openvswitch to get parent information

2016-07-26 Thread Joe Stringer
On 25 July 2016 at 18:16, Flavio Leitner  wrote:
> Updates SELinux to allow ovs-vsctl to get parent process
> information and log that to the database:
>
> record 241: 2016-07-26 00:59:47.418 "ovs-vsctl (invoked by /bin/bash
> (pid 1589)): ovs-vsctl -t 10 -- --if-exist ...
>
> Jul 25 12:57:35 localhost.localdomain audit[830]: AVC avc:  denied  {
> search } for  pid=830 comm="ovs-vsctl" name="731" dev="proc" ino=14140
> scontext=system_u:system_r:openvswitch_t:s0
> tcontext=system_u:system_r:initrc_t:s0 tclass=dir permissive=0
>
> Signed-off-by: Flavio Leitner 
> ---
>  selinux/openvswitch-custom.te | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/selinux/openvswitch-custom.te b/selinux/openvswitch-custom.te
> index fc32b97..5739595 100644
> --- a/selinux/openvswitch-custom.te
> +++ b/selinux/openvswitch-custom.te
> @@ -2,8 +2,13 @@ module openvswitch-custom 1.0;
>
>  require {
>  type openvswitch_t;
> +attribute domain;
>  class netlink_socket { setopt getopt create connect getattr write 
> read };
> +class dir { search };
> +class file { open getattr read };
>  }
>
>  #= openvswitch_t ==
>  allow openvswitch_t self:netlink_socket { setopt getopt create connect 
> getattr write read };
> +allow openvswitch_t domain:dir { search };
> +allow openvswitch_t domain:file { open getattr read };

Hi Flavio,

Thanks for spending some time to get OVS in better shape with SELinux.
I figure that once this settles down a bit we should take the policy
file here and work towards upstreaming all of the policy changes.

As far as I can follow, this "domain" type is not just for accessing
OVS directories and files (like openvswitch_t), but for a much wider
range of paths:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/SELinux_Guide/rhlcommon-section-0048.html

"# The domain attribute identifies every type that can be
# assigned to a process.  This attribute is used in TE rules
# that should be applied to all domains, e.g. permitting
# init to kill all processes."

Is my understanding (+documentation) correct here? Is there a similar
but more restrictive policy that allows ovs-vsctl to access, for
example, /var/run/openvswitch/* (with var_run_openvswitch_t or
similar)? Alternatively, is there an example of another daemon with
a similar policy that sets a precedent for writing the policy like
this?

Would you also be able to provide the full ovs-vsctl commandline? It
was a little difficult to understand exactly what was going on during
this event, or try to reproduce.

Lastly, I've just applied the other SELinux patch so you'll need to
rebase this one.

Cheers,
Joe


Re: [ovs-dev] [PATCH v3] Scanning only changed entries in the ovnsb

2016-07-26 Thread Russell Bryant
On Tue, Jul 26, 2016 at 3:44 PM, Hui Kang  wrote:

>
>
> "dev"  wrote on 07/26/2016 02:20:27 PM:
>
> > From: Ben Pfaff 
> > To: Hui Kang 
> > Cc: dev@openvswitch.org
> > Date: 07/26/2016 02:20 PM
> > Subject: Re: [ovs-dev] [PATCH v3] Scanning only changed entries in the
> ovnsb
> > Sent by: "dev" 
> >
> > On Sat, Jul 16, 2016 at 11:58:25PM -0400, Hui Kang wrote:
> > > Improve performance by scanning only changed port binding entries
> > > when determining whether to mark the logical switch port up or
> > > down
> > >
> > > Signed-off-by: Hui Kang 
> >
> > Won't this skip an initial round of updates at ovn-northd startup time?
> > (Certainly ovn-northd might get killed and restarted occasionally,
> > especially if we're doing failover to a second host.)
>
> Hi, Ben,
> On second thought, I think skipping the initial round is the purpose of
> this patch.
>
> ovsdb_idl_create(ovsdb) copies the Port_binding table from the southbound
> database whenever ovn-northd gets started. In this case, the northbound
> DB and southbound DB are synced. In ovnsb_db_run, ovn-northd only gets
> notified when there is a change to the Chassis column [1]. Therefore,
> ovnsb_db_run should only look at the entries whose Chassis column has
> changed. There is no need to initialize by iterating over every entry in the
> Port_binding table. Please correct me if my understanding is incorrect.
> Thanks.
>

What if the Chassis column changes in some Port_Binding records while
ovn-northd isn't running?
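
One way to address that concern is to examine every row on the first pass
and only changed rows afterwards. The sketch below is purely illustrative:
struct port_binding, its fields and update_port_state() are hypothetical and
do not correspond to the IDL or to the patch under review.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified Port_Binding row. */
struct port_binding {
    bool has_chassis;   /* True if the row's chassis column is set. */
    bool changed;       /* True if the row changed since the last run. */
    bool up;            /* Derived state: logical port is "up". */
};

/* Updates 'up' for the rows in 'rows'.  When 'first_run' is true every row
 * is examined, which covers changes made while the process was not running.
 * Otherwise only rows flagged as changed are examined.  Returns the number
 * of rows examined. */
static size_t
update_port_state(struct port_binding *rows, size_t n, bool first_run)
{
    size_t examined = 0;

    for (size_t i = 0; i < n; i++) {
        if (first_run || rows[i].changed) {
            rows[i].up = rows[i].has_chassis;
            rows[i].changed = false;
            examined++;
        }
    }
    return examined;
}
```

With only the incremental branch, a row whose chassis changed while the
daemon was down would keep its stale 'up' value until it changed again.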

-- 
Russell Bryant


Re: [ovs-dev] [PATCH 2/3] datapath: compat: unset skb encapsulation bit

2016-07-26 Thread Jesse Gross
On Tue, Jul 26, 2016 at 12:30 PM, pravin shelar  wrote:
> On Tue, Jul 26, 2016 at 11:14 AM, Jesse Gross  wrote:
>> On Tue, Jul 26, 2016 at 10:56 AM, pravin shelar  wrote:
>>> On Tue, Jul 26, 2016 at 10:06 AM, Jesse Gross  wrote:
 On Mon, Jul 25, 2016 at 5:49 PM, Pravin B Shelar  wrote:
> The OVS compat layer can handle tunnel GSO packets, but it keeps the
> skb encapsulation bit set for packets handled in GSO. This can
> confuse some NIC drivers. I have seen this issue on Intel devices:
>
>   i40e :42:00.0: TX driver issue detected, PF reset issued
>
> The following patch resets this bit when the compat layer handles the packet.
>
> VMware-BZ: 1698877
> Signed-off-by: Pravin B Shelar 

 In upstream, this is done as part of the GSO code (for example, in
 __skb_udp_tunnel_segment()) so that probably makes more sense and is
 safer if this is GSO specific. There is already code in
 ovs_iptunnel_handle_offloads() that will clear the encapsulation bit
 in the case of checksum offload on the outer header.

>>> ovs_iptunnel_handle_offloads() is not equivalent to
>>> skb_udp_tunnel_segment(). If handle-offload clears the bit, it would not
>>> be possible to check the encapsulation bit after the handle-offload call
>>> in the tunnel implementation. At this point this bit is not checked, so it
>>> is not an issue. But that would be different behavior compared to
>>> upstream.
>>
>> I was actually referring to existing behavior in
>> ovs_iptunnel_handle_offloads(). Here is the code that I was talking
>> about:
>>
>> #if LINUX_VERSION_CODE >= KERNEL_VERSION(3,8,0)
>> /* If packet is not gso and we are resolving any partial checksum,
>>  * clear encapsulation flag. This allows setting CHECKSUM_PARTIAL
>>  * on the outer header without confusing devices that implement
>>  * NETIF_F_IP_CSUM with encapsulation.
>>  */
>> if (csum_help)
>> skb->encapsulation = 0;
>> #endif
>>
>> In the case of outer checksums being enabled and offloaded and no GSO,
>> we will have cleared skb->encapsulation bit already, so there should
>> be no confusion for the NIC driver. As a result, this seems like a
>> GSO-only problem and we can just mirror what upstream does and clear
>> the bit in the GSO code. I'm a little bit nervous about
>> indiscriminately clearing it in all cases, since it seems like it is
>> possible that we will accidentally do it in some case where we are
>> trying to use more of the network stack.
>>
>
> So you are fine with clearing the bit in ip_local_out() but only for
> GSO packets.

Effectively, yes, just in a different spot. I think inside
tnl_skb_gso_segment() would make the most sense.
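
To make the suggestion concrete, here is a rough userspace sketch of the
idea. Everything in it is an assumption for illustration: struct pkt stands
in for the two sk_buff fields under discussion, and compat_gso_segment() and
compat_xmit() are hypothetical names, not the datapath functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the sk_buff state discussed in this thread. */
struct pkt {
    bool is_gso;          /* Packet still needs software segmentation. */
    bool encapsulation;   /* Plays the role of skb->encapsulation. */
};

/* Stand-in for the compat segmentation path.  Once the compat layer has
 * segmented the packet itself, the flag is cleared so the NIC driver does
 * not treat the result as an encapsulated offload candidate. */
static void
compat_gso_segment(struct pkt *p)
{
    p->is_gso = false;
    p->encapsulation = false;
}

/* Stand-in for the transmit path.  Only packets that go through the compat
 * GSO code have the flag cleared.  All others keep the value set earlier. */
static void
compat_xmit(struct pkt *p)
{
    if (p->is_gso) {
        compat_gso_segment(p);
    }
}
```

Clearing the flag in the segmentation function, rather than unconditionally
on transmit, keeps the change limited to the GSO case.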


Re: [ovs-dev] ovsdb active backup deployment

2016-07-26 Thread Andy Zhou
On Tue, Jul 26, 2016 at 11:59 AM, Russell Bryant  wrote:

>
>
> On Tue, Jul 26, 2016 at 2:41 PM, Andy Zhou  wrote:
>
>>
>>
>> On Tue, Jul 26, 2016 at 5:37 AM, Russell Bryant  wrote:
>>
>>>
>>>
>>> On Mon, Jul 25, 2016 at 8:15 PM, Andy Zhou  wrote:
>>>
 Hi, Ryan and Russell,

>>>
>>> Can we move this discussion to the ovs dev mailing list?  Feel free to
>>> just add it in a reply if you'd like.
>>>
>> Done.
>>
>>>
>>>
 I am wondering how we can actually use the active/backup feature that
 is now part of
 OVSDB to increase OVN availability.

>>>
>>> To be clear, I haven't actually tried this yet.  I'm only speaking about
>>> how I think it should work.
>>>
>>>
 Specifically:

 1. When the active OVSDB server fails, should the backup server take
 over and allow write transactions? One simpler possibility is to allow
 read-only access to the backup server.

>>>
>>> The backup server needs to take over.  It's OK if that requires
>>> intervention by an HA manager like Pacemaker.  If we can't make the passive
>>> server take over, I'd say the solution is incomplete.
>>>
>>
>> O.K. make sense.
>>
>> One possible issue with the backup server taking over is "split brain".  If,
>> due to a network error, the backup server becomes disconnected from the
>> active server, then both servers may think they are the active server.
>> Does Pacemaker help with solving this issue?
>>
>
> It can, yes.  I would expect Pacemaker to explicitly configure a node to
> be either the active or passive node.
>
Manual switching is more straightforward. I agree.

>
>>>
 2. When a crashed active OVSDB server recovers, should it become the
 new backup, or should it switch back?

>>>
>>> Becoming the new backup is fine.  Again, this can be orchestrated by an
>>> HA manager (Pacemaker).
>>>
>> I am not familiar with Pacemaker. Can I assume it can provide a correct
>> --sync-from argument (pointing to the backup server) when relaunching the
>> OVSDB server?
>>
>
> Yes.  I'd have to consult with some Pacemaker experts on exactly what the
> implementation would look like, but roughly:
>
> Pacemaker manages services using "OCF Resource Agents", which are just
> scripts with a defined set of inputs and outputs for service management.  I
> would imagine a Pacemaker cluster being told it must have exactly 1 active
> and 1 passive OVSDB service.  When the passive OVSDB service is started, it
> would include the "sync-from" argument based on where the active OVSDB
> service is currently running.
>
> We really need to prototype this and document it.  I'm guessing too much.
> Pacemaker is frequently used to manage active/passive HA, though.
>
Sounds reasonable. I will work on the ovsdb internal changes to support
manual switching using appctl commands, and then look into prototyping with
HA systems.  I have not used Pacemaker in the past, so it may take some
time to ramp up.

>
>>>
 Ben said one of you, or both may have worked with similar active-backup
 systems before, so I am very interested in your inputs.

 Thanks,

 Andy

>>>
>>>
>>>
>>> --
>>> Russell Bryant
>>>
>>
>>
> --
> Russell Bryant
>


Re: [ovs-dev] [PATCH v3] ovn: Make it possible for CMS to detect when the OVN system is up-to-date.

2016-07-26 Thread Russell Bryant
On Sun, Jul 24, 2016 at 4:14 PM, Ben Pfaff  wrote:

> Until now, there has been no reliable way for the CMS (or ovn-nbctl, or
> anything else) to detect when changes made to the northbound configuration
> have been passed through to the southbound database or to the hypervisors.
> This commit adds this feature to the system, by adding sequence numbers
> to the northbound and southbound databases and adding code in ovn-nbctl,
> ovn-northd, and ovn-controller to keep those sequence numbers up-to-date.
>
> The biggest user-visible change from this commit is a new option
> --wait to ovn-nbctl.  With --wait=sb, ovn-nbctl now waits for ovn-northd
> to update the southbound database; with --wait=hv, it waits for the
> changes to make their way to Open vSwitch on every hypervisor.
>
> Signed-off-by: Ben Pfaff 
> ---
> v1->v2: Rebase to fix up database version number.
> v3: Rebase due to changes on master.
>
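
The sequence-number mechanism described in the commit message can be
sketched as a simple comparison. The names below (struct nb_state, nb_cfg,
sb_cfg, hv_cfg, wait_satisfied()) are assumptions made for this
illustration and are not taken from the patch text quoted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* What the caller wants to wait for, mirroring --wait=sb and --wait=hv. */
enum wait_type { WAIT_NONE, WAIT_SB, WAIT_HV };

/* Hypothetical view of the sequence numbers kept in the northbound DB. */
struct nb_state {
    int64_t nb_cfg;   /* Bumped by the client with each config change. */
    int64_t sb_cfg;   /* Latest nb_cfg reflected in the southbound DB. */
    int64_t hv_cfg;   /* Latest nb_cfg applied on every hypervisor. */
};

/* Returns true once the change numbered 'target' has propagated as far as
 * 'type' requires. */
static bool
wait_satisfied(const struct nb_state *s, enum wait_type type, int64_t target)
{
    switch (type) {
    case WAIT_NONE:
        return true;
    case WAIT_SB:
        return s->sb_cfg >= target;
    case WAIT_HV:
        return s->hv_cfg >= target;
    }
    return false;
}
```

A client would bump the northbound number in the same transaction as its
configuration change, then poll until a check like wait_satisfied() returns
true for the value it wrote.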

The code looks sane and seems to be working for me with some basic
testing.  Cool feature, thanks!

Acked-by: Russell Bryant 


-- 
Russell Bryant


Re: [ovs-dev] [ovs-dev, v2, 2/2] ovn-northd: Add logical flows to support DHCPv6

2016-07-26 Thread Ben Pfaff
On Wed, Jul 27, 2016 at 01:00:44AM +0530, Numan Siddique wrote:
> On Wed, Jul 27, 2016 at 12:08 AM, Ben Pfaff  wrote:
> 
> > On Wed, Jul 27, 2016 at 12:01:06AM +0530, Numan Siddique wrote:
> > > OVN implements native DHCPv6. DHCPv6 options are stored
> > > in the 'DHCP_Options' NB table and logical ports refer to this
> > > table to configure the DHCPv6 options.
> > >
> > > For each logical port configured with DHCPv6 Options following flows
> > > are added
> > >  - A logical flow which copies the DHCPv6 options to the DHCPv6
> > >request packets using the 'put_dhcpv6_opts' action and advances the
> > >packet to the next stage.
> > >
> > >  - A logical flow which implements the DHCPv6 responder by sending
> > >the DHCPv6 reply back to the inport once the 'put_dhcpv6_opts' action
> > >is applied.
> > >
> > > Signed-off-by: Numan Siddique 
> >
> > I'm still getting lots of patch rejects with v2.  Can you rebase?  Or
> > maybe you could push this somewhere as a branch or a pull request?
> >
> >
> Since DHCPv6 patch 2 depends on the ovn northd DHCPv4 patch, I submitted
> another series combining all these into one series to avoid the patch
> rejects.

Oh I didn't realize that--maybe I didn't read carefully enough.

I'll review the combined series.

> Also I have pushed it into my github branch here -
> https://github.com/numansiddique/ovs/tree/ovn_dhcp

Thanks!


Re: [ovs-dev] [PATCH v3] Scanning only changed entries in the ovnsb

2016-07-26 Thread Hui Kang


"dev"  wrote on 07/26/2016 02:20:27 PM:

> From: Ben Pfaff 
> To: Hui Kang 
> Cc: dev@openvswitch.org
> Date: 07/26/2016 02:20 PM
> Subject: Re: [ovs-dev] [PATCH v3] Scanning only changed entries in the
ovnsb
> Sent by: "dev" 
>
> On Sat, Jul 16, 2016 at 11:58:25PM -0400, Hui Kang wrote:
> > Improve performance by scanning only changed port binding entries
> > when determining whether to mark the logical switch port up or
> > down
> >
> > Signed-off-by: Hui Kang 
>
> Won't this skip an initial round of updates at ovn-northd startup time?
> (Certainly ovn-northd might get killed and restarted occasionally,
> especially if we're doing failover to a second host.)

Hi, Ben,
On second thought, I think skipping the initial round is the purpose of
this patch.

ovsdb_idl_create(ovsdb) copies the Port_binding table from the southbound
database whenever ovn-northd gets started. In this case, the northbound
DB and southbound DB are synced. In ovnsb_db_run, ovn-northd only gets
notified when there is a change to the Chassis column [1]. Therefore,
ovnsb_db_run should only look at the entries whose Chassis column has
changed. There is no need to initialize by iterating over every entry in the
Port_binding table. Please correct me if my understanding is incorrect.
Thanks.

- Hui


[1]
https://github.com/openvswitch/ovs/blob/master/ovn/northd/ovn-northd.c#L3034




Re: [ovs-dev] [PATCH v1 1/3] ovn-northd: Add logical flows to support native DHCPv4

2016-07-26 Thread Numan Siddique
This patch has "Tested-by: Ramu Ramamurthy "
and "Acked-by: Ramu Ramamurthy ", but I forget to add
them every time I resubmit the patch.

Ramu - My apologies

Thanks
Numan


On Wed, Jul 27, 2016 at 12:54 AM, Numan Siddique 
wrote:

> OVN implements a native DHCPv4 support which caters to the common
> use case of providing an IP address to a booting instance by
> providing stateless replies to DHCPv4 requests based on statically
> configured address mappings. To do this it allows a short list of
> DHCPv4 options to be configured and applied at each compute host
> running ovn-controller.
>
> A new table 'DHCP_Options' is added in OVN NB DB to store the DHCP
> options. Logical ports refer to this table to configure the DHCPv4
> options.
>
> For each logical port configured with DHCPv4 options, the following
> flows are added:
>  - A logical flow which copies the DHCPv4 options to the DHCPv4
>    request packets using the 'put_dhcp_opts' action and advances the
>    packet to the next stage.
>
>  - A logical flow which implements the DHCP responder by sending
>    the DHCPv4 reply back to the inport once the 'put_dhcp_opts' action
>    is applied.
>
> Signed-off-by: Numan Siddique 
> Co-authored-by: Ben Pfaff 
> Signed-off-by: Ben Pfaff 
> ---
>  ovn/northd/ovn-northd.8.xml   |  91 +-
>  ovn/northd/ovn-northd.c   | 256 +-
>  ovn/ovn-nb.ovsschema  |  20 ++-
>  ovn/ovn-nb.xml| 270
> 
>  ovn/utilities/ovn-nbctl.8.xml |  30 +
>  ovn/utilities/ovn-nbctl.c | 197 +
>  tests/ovn.at  | 281
> ++
>  7 files changed, 1135 insertions(+), 10 deletions(-)
>
> diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
> index ced2839..b95caef 100644
> --- a/ovn/northd/ovn-northd.8.xml
> +++ b/ovn/northd/ovn-northd.8.xml
> @@ -457,7 +457,90 @@ output;
>
>  
>
> -Ingress Table 10: Destination Lookup
> +Ingress Table 10: DHCP option processing
> +
> +
> +  This table adds the DHCPv4 options to a DHCPv4 packet from the
> +  logical ports configured with IPv4 address(es) and DHCPv4 options.
> +
> +
> +
> +  
> +
> +  A priority-100 logical flow is added for these logical ports
> +  which matches the IPv4 packet with udp.src = 68 and
> +  udp.dst = 67 and applies the action
> +  put_dhcp_opts and advances the packet to the next table.
> +
> +
> +
> +reg0[3] = put_dhcp_opts(offer_ip = O, options...);
> +next;
> +
> +
> +
> +  For DHCPDISCOVER and DHCPREQUEST, this transforms the packet into a
> +  DHCP reply, adds the DHCP offer IP O and options to the
> +  packet, and stores 1 into reg0[3].  For other kinds of packets, it
> +  just stores 0 into reg0[3].  Either way, it continues to the next
> +  table.
> +
> +
> +  
> +
> +  
> +A priority-0 flow that matches all packets to advances to table 11.
> +  
> +
> +
> +Ingress Table 11: DHCP responses
> +
> +
> +  This table implements DHCP responder for the DHCP replies generated by
> +  the previous table.
> +
> +
> +
> +  
> +
> +  A priority 100 logical flow is added for the logical ports configured
> +  with DHCPv4 options which matches IPv4 packets with udp.src == 68
> +  && udp.dst == 67 && reg0[3] == 1 and
> +  responds back to the inport after applying these
> +  actions.  If reg0[3] is set to 1, it means that the
> +  action put_dhcp_opts was successful.
> +
> +
> +
> +eth.dst = eth.src;
> +eth.src = E;
> +ip4.dst = O;
> +ip4.src = S;
> +udp.src = 67;
> +udp.dst = 68;
> +outport = P;
> +inport = ""; /* Allow sending out inport. */
> +output;
> +
> +
> +
> +  where E is the server MAC address and S is the
> +  server IPv4 address defined in the DHCPv4 options and O is
> +  the IPv4 address defined in the logical port's addresses column.
> +
> +
> +
> +  (This terminates ingress packet processing; the packet does not go
> +   to the next ingress table.)
> +
> +  
> +
> +  
> +A priority-0 flow that matches all packets to advances to table 12.
> +  
> +
> +
> +Ingress Table 12: Destination Lookup
>
>  
>This table implements switching behavior.  It contains these logical
> @@ -531,6 +614,12 @@ output;
>there are no rules added for load balancing new connections.
>  
>
> +
> +  Also a priority 34000 logical flow is added for each logical port which
> +  has DHCPv4 options defined to allow the 

Re: [ovs-dev] [PATCH] rhel/openvswitch.spec: Add SELinux policy.

2016-07-26 Thread Joe Stringer
On 25 July 2016 at 17:34, Flavio Leitner  wrote:
> On Mon, Jul 25, 2016 at 02:09:26PM -0700, Joe Stringer wrote:
>> Commit 9b897c9125ef ("rhel: provide our own SELinux custom policy
>> package") added the SELinux policy to the fedora packaging as a
>> subpackage. This patch makes the corresponding change to
>> openvswitch.spec, so that users of that specfile can generate the
>> selinux policy package without having to build all of the fedora
>> packages.
>>
>> Signed-off-by: Joe Stringer 
>> ---
>
> Thanks Joe!
> Acked-by: Flavio Leitner 

Thanks for the review, applied to master.


Re: [ovs-dev] [PATCH] selinux: Allow ovs-ctl force-reload-kmod.

2016-07-26 Thread Joe Stringer
On 25 July 2016 at 16:57, Flavio Leitner  wrote:
> On Fri, Jul 22, 2016 at 02:10:51PM -0700, Joe Stringer wrote:
>> When invoking ovs-ctl force-reload-kmod via '/etc/init.d/openvswitch
>> force-reload-kmod', spurious errors would output related to 'hostname'
>> and 'ip', and the system's selinux audit log would complain about some
>> of the invocations such as those listed at the end of this commit message.
>>
>> This patch loosens restrictions for openvswitch_t (used for ovs-ctl, as
>> well as all of the OVS daemons) to allow it to execute 'hostname' and
>> 'ip' commands, and also to execute temporary files created as
>> openvswitch_tmp_t. This allows force-reload-kmod to run correctly.
>>
>> Example audit logs:
>> type=AVC msg=audit(1468515192.912:16720): avc:  denied  { getattr } for
>> pid=11687 comm="ovs-ctl" path="/usr/bin/hostname" dev="dm-1"
>> ino=33557805 scontext=system_u:system_r:openvswitch_t:s0
>> tcontext=system_u:object_r:hostname_exec_t:s0 tclass=file
>>
>> type=AVC msg=audit(1468519445.766:16829): avc:  denied  { getattr } for
>> pid=13920 comm="ovs-save" path="/usr/sbin/ip" dev="dm-1" ino=67572988
>> scontext=unconfined_u:system_r:openvswitch_t:s0
>> tcontext=system_u:object_r:ifconfig_exec_t:s0 tclass=file
>>
>> type=AVC msg=audit(1468519445.890:16833): avc:  denied  { execute } for
>> pid=13849 comm="ovs-ctl" name="tmp.jdEGHntG3Z" dev="dm-1" ino=106876762
>> scontext=unconfined_u:system_r:openvswitch_t:s0
>> tcontext=unconfined_u:object_r:openvswitch_tmp_t:s0 tclass=file
>>
>> Signed-off-by: Joe Stringer 
>> ---
>
> LGTM.
> Acked-by: Flavio Leitner 
>
>

Thanks for the review, applied to master.


Re: [ovs-dev] [networking-ovn] Re: Issue when using ovn with Openstack

2016-07-26 Thread Richard Theis
"dev"  wrote on 07/20/2016 12:42:20 AM:

> From: Chen Li 
> To: Ryan Moats/Omaha/IBM@IBMUS
> Cc: dev@openvswitch.org, "OpenStack Development Mailing List \(not 
> for usage questions\)" 
> Date: 07/20/2016 12:42 AM
> Subject: Re: [ovs-dev] [networking-ovn] Re: Issue when using ovn 
> with Openstack
> Sent by: "dev" 
> 
> Just tested with another multi-node setup running older code (downloaded
> last week), and everything works.
> 
> Here are the latest commits of the working ovs/neutron/networking-ovn trees:
> 
> ovs:
> 
> commit 3041e1fc963886c3e19f1b848df20c6f9d96b289
> Author: William Tu 
> Date:   Fri Jul 1 09:45:52 2016 -0700
> 
> system-traffic: Remove datapath specific tests and macro.
> 
> We generally try to keep the testsuite independent of the underlying
> datapath. This patch removes the datapath-specific tests and macros.
> 
> Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/141642065
> Signed-off-by: William Tu 
> Signed-off-by: Joe Stringer 
> 
> 
> neutron:
> 
> commit 2f29a5db3ca0a5a26571be07109bf49e57c0fc2d
> Merge: 86c02d8 09a6a46
> Author: Jenkins 
> Date:   Thu Jul 14 23:24:58 2016 +
> 
> Merge "Pecan: Implement pagination"
> 
> 
> networking-ovn:
> 
> commit 12919d0d48c63d252be4a42f3f20a2176b53f7d9
> Merge: 381b57f 28b2c55
> Author: Jenkins 
> Date:   Thu Jul 14 23:36:10 2016 +
> 
> Merge "Grenade plugin for testing OVN migration from ML2/OVS"
> 
> 

If this problem is still occurring on master, please file a bug at [1] with
your recreate scenario in detail.  If you can pinpoint the commit that broke
things that would be even better.

[1] https://bugs.launchpad.net/networking-ovn/+filebug

Thanks,
Richard

> On Wed, Jul 20, 2016 at 1:17 PM, Chen Li wrote:
> 
> > Thanks for adding openstack-dev.
> >
> > Yes, I'm running with devstack, and using the master branch of everything.
> > I updated everything several hours ago to make sure this is not an
> > issue that has already been fixed.
> >
> > The last change in neutron:
> >
> > commit 122a971656671f92927d44ddd3725cca74b4e0bb
> > Merge: 827bb07 01a6c9c
> > Author: Jenkins 
> > Date:   Tue Jul 19 17:14:33 2016 +
> >
> > Merge "Generalize agent extension mechanism"
> >
> > The last change in networking-ovn:
> >
> > commit a8abf7517f86df6e0ff532cd49550b4dc3c0a9ed
> > Author: Ryan Moats 
> > Date:   Fri Jul 15 11:32:33 2016 -0500
> >
> > [doc] Prettify logical flow examples
> >
> > Rather than showing database objects, use the output of ovn-sbctl
> > lflow-list, because it is prettier.
> >
> > Change-Id: I243b7316731c6c723bf6e64c9326800272643578
> >
> >
> >
> > I do not know where to find neutron.ini and networking-ovn.ini. Do you
> > mean neutron.conf and networking-ovn.conf? Could you point me to where
> > I can find them?
> > I made no changes to these configuration files after stack.sh finished.
> >
> > On Wed, Jul 20, 2016 at 12:42 PM, Ryan Moats wrote:
> >
> >> "dev"  wrote on 07/19/2016 10:44:27 PM:
> >>
> >> > From: Chen Li 
> >> > To: dev@openvswitch.org
> >> > Date: 07/19/2016 10:44 PM
> >> > Subject: [ovs-dev] Issue when using ovn with Openstack
> >> > Sent by: "dev" 
> >> >
> >> > Hi list,
> >> >
> >> > I have an all-in-one devstack environment with OVN enabled.
> >> > I created a neutron network.
> >> > I created a port A on the network with secgroup A.
> >> > I created a VM on the network with secgroup B.
> >> > Secgroup B allows both ICMP and TCP port 22.
> >> >
> >> > Then I tried to ping the VM from the DHCP namespace. Since secgroup B
> >> > allows ICMP, I expected this to work, but unfortunately it does
> >> > not. SSH failed too.
> >> >
> >> > Can anyone help me solve this issue?
> >> >
> >> > I did some basic checks, and it looks like flows are missing in table 52.
> >> >
> >> > Here are flows in table 52:
> >> >
> >> > sudo ovs-ofctl dump-flows br-int |grep table=52
> >> >
> >> >  cookie=0x0, duration=7766.195s, table=52, n_packets=0, n_bytes=0,
> >> > idle_age=7766,
> >> priority=65535,icmp6,metadata=0x4,icmp_type=135,icmp_code=0
> >> > actions=resubmit(,53)
> >> >  cookie=0x0, duration=7766.195s, table=52, n_packets=0, n_bytes=0,
> >> > idle_age=7766,
> >> priority=65535,icmp6,metadata=0x4,icmp_type=136,icmp_code=0
> >> > actions=resubmit(,53)
> >> >  cookie=0x0, duration=7766.195s, table=52, n_packets=4, 
n_bytes=1474,
> >> > idle_age=7744, priority=2002,udp,reg15=0x2,metadata=0x4,nw_src=
> >> > 192.168.0.0/24,tp_src=67,tp_dst=68
> >> > actions=load:0x1->NXM_NX_REG0[1],resubmit(,53)
> >> >  

[ovs-dev] [PATCH] debian: Add six dependency to python-openvswitch.

2016-07-26 Thread Joe Stringer
python-openvswitch uses the Python "six" library; add a dependency on it
to the Debian package.

VMware-BZ: #1700259
Reported-by: Devang Doshi 
Signed-off-by: Joe Stringer 
---
 debian/control | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debian/control b/debian/control
index 37aff8db9db3..9ba6e6b75f04 100644
--- a/debian/control
+++ b/debian/control
@@ -240,7 +240,7 @@ Description: Debug symbols for Open vSwitch packages
 Package: python-openvswitch
 Architecture: all
 Section: python
-Depends: ${misc:Depends}, ${python:Depends}
+Depends: ${misc:Depends}, ${python:Depends}, python-six
 Description: Python bindings for Open vSwitch
  Open vSwitch is a production quality, multilayer, software-based,
  Ethernet virtual switch. It is designed to enable massive network
-- 
2.9.0



Re: [ovs-dev] [ovs-dev, v2, 2/2] ovn-northd: Add logical flows to support DHCPv6

2016-07-26 Thread Numan Siddique
On Wed, Jul 27, 2016 at 12:08 AM, Ben Pfaff  wrote:

> On Wed, Jul 27, 2016 at 12:01:06AM +0530, Numan Siddique wrote:
> > OVN implements native DHCPv6. DHCPv6 options are stored
> > in the 'DHCP_Options' NB table and logical ports refer to this
> > table to configure the DHCPv6 options.
> >
> > For each logical port configured with DHCPv6 options, the following
> > flows are added:
> >  - A logical flow which copies the DHCPv6 options to the DHCPv6
> >    request packets using the 'put_dhcpv6_opts' action and advances the
> >    packet to the next stage.
> >
> >  - A logical flow which implements the DHCPv6 responder by sending
> >    the DHCPv6 reply back to the inport once the 'put_dhcpv6_opts' action
> >    is applied.
> >
> > Signed-off-by: Numan Siddique 
>
> I'm still getting lots of patch rejects with v2.  Can you rebase?  Or
> maybe you could push this somewhere as a branch or a pull request?
>
>
Since DHCPv6 patch 2 depends on the ovn northd DHCPv4 patch, I submitted
another series combining all these into one series to avoid the patch
rejects.

Also I have pushed it into my github branch here -
https://github.com/numansiddique/ovs/tree/ovn_dhcp

Thanks for the review.

Numan


> Thanks,
>
> Ben.
>


Re: [ovs-dev] [PATCH 2/3] datapath: compat: unset skb encapsulation bit

2016-07-26 Thread pravin shelar
On Tue, Jul 26, 2016 at 11:14 AM, Jesse Gross  wrote:
> On Tue, Jul 26, 2016 at 10:56 AM, pravin shelar  wrote:
>> On Tue, Jul 26, 2016 at 10:06 AM, Jesse Gross  wrote:
>>> On Mon, Jul 25, 2016 at 5:49 PM, Pravin B Shelar  wrote:
>>>> OVS compat layer can handle tunnel GSO packets. but it does
>>>> keep skb encapsulation on for packet handled in GSO. This can
>>>> confuse some NIC drivers. I have seen this issue on intel devices:
>>>>
>>>>   i40e :42:00.0: TX driver issue detected, PF reset issued
>>>>
>>>> Following patch resets this bit in case compat layer handles the packet.
>>>>
>>>> VMware-BZ: 1698877
>>>> Signed-off-by: Pravin B Shelar
>>>
>>> In upstream, this is done as part of the GSO code (for example, in
>>> __skb_udp_tunnel_segment()) so that probably makes more sense and is
>>> safer if this is GSO specific. There is already code in
>>> ovs_iptunnel_handle_offloads() that will clear the encapsulation bit
>>> in the case of checksum offload on the outer header.
>>>
>> ovs_iptunnel_handle_offloads() is not equivalent to
>> skb_udp_tunnel_segment(). If handle-offload clears the bit, it would not
>> be possible to check the encapsulation bit after the handle-offload call
>> in the tunnel implementation. At this point the bit is not checked, so it
>> is not an issue. But that would be different behavior compared to
>> upstream.
>
> I was actually referring to existing behavior in
> ovs_iptunnel_handle_offloads(). Here is the code that I was talking
> about:
>
> #if LINUX_VERSION_CODE >= KERNEL_VERSION(3,8,0)
>         /* If packet is not gso and we are resolving any partial checksum,
>          * clear encapsulation flag. This allows setting CHECKSUM_PARTIAL
>          * on the outer header without confusing devices that implement
>          * NETIF_F_IP_CSUM with encapsulation.
>          */
>         if (csum_help)
>                 skb->encapsulation = 0;
> #endif
>
> In the case of outer checksums being enabled and offloaded and no GSO,
> we will have cleared skb->encapsulation bit already, so there should
> be no confusion for the NIC driver. As a result, this seems like a
> GSO-only problem and we can just mirror what upstream does and clear
> the bit in the GSO code. I'm a little bit nervous about
> indiscriminately clearing it in all cases, since it seems like it is
> possible that we will accidentally do it in some case where we are
> trying to use more of the network stack.
>

So you are fine with clearing the bit in ip_local_out() but only for
GSO packets.

>>> Something that I noticed while looking at this is it looks like the
>>> recent patch that moved the check for gso_type_mask into a GSO-only
>>> block in ovs_iptunnel_handle_offloads() might cause a bit of a
>>> performance regression. Even though that field is GSO-specific, it is
>>> also used to control whether we resolve partial checksums. Even in
>>> cases where we do need to use the OVS offload compat code, I suspect
>>> we could take better advantage of hardware offloads - not computing
>>> the checksum when !skb->encapsulation (since every kernel can do UDP
>>> checksum offload), using scatter/gather in GSO, checking for backport
>>> support when clearing type in rpl_udp_tunnel_handle_offloads().
>>
>> I am not sure I understand it correctly. The recent change actually
>> uses the compat GSO code only for GSO packets; earlier it was used for
>> non-GSO packets too. By not setting "OVS_GSO_CB(skb)->fix_segment" we
>> are using the networking stack.
>
> That's true - it will default to NULL. In that case, I don't know if
> this is entirely safe though. When encapsulation offloads first went
> in, many drivers essentially assumed that this meant VXLAN. It wasn't
> until 3.18 that ndo_features_check was available and drivers started
> implementing it. While this affects TSO more than checksum, issues are
> still possible if we try to pass a different type of encapsulation.

I agree, I am trying to fix these issues.


[ovs-dev] [PATCH v1 3/3] ovn-northd: Add logical flows to support DHCPv6

2016-07-26 Thread Numan Siddique
OVN implements native DHCPv6. DHCPv6 options are stored
in the 'DHCP_Options' NB table and logical ports refer to this
table to configure the DHCPv6 options.

For each logical port configured with DHCPv6 options, the following
flows are added:
 - A logical flow which copies the DHCPv6 options to the DHCPv6
   request packets using the 'put_dhcpv6_opts' action and advances the
   packet to the next stage.

 - A logical flow which implements the DHCPv6 responder by sending
   the DHCPv6 reply back to the inport once the 'put_dhcpv6_opts' action
   is applied.

Signed-off-by: Numan Siddique 
---
 lib/packets.c   |  29 --
 ovn/northd/ovn-northd.8.xml |  58 +++-
 ovn/northd/ovn-northd.c | 183 ++-
 ovn/ovn-nb.ovsschema|   9 +-
 ovn/ovn-nb.xml  |  88 -
 tests/ovn.at| 226 
 6 files changed, 573 insertions(+), 20 deletions(-)

diff --git a/lib/packets.c b/lib/packets.c
index 1bf887e..4a8f645 100644
--- a/lib/packets.c
+++ b/lib/packets.c
@@ -692,16 +692,6 @@ ipv6_addr_bitand(const struct in6_addr *a, const struct in6_addr *b)
return dst;
 }
 
-struct in6_addr
-ipv6_addr_bitxor(const struct in6_addr *a, const struct in6_addr *b)
-{
-   struct in6_addr dst;
-   IPV6_FOR_EACH (i) {
-   dst.s6_addrX[i] = a->s6_addrX[i] ^ b->s6_addrX[i];
-   }
-   return dst;
-}
-
 bool
 ipv6_is_zero(const struct in6_addr *a)
 {
@@ -713,6 +703,25 @@ ipv6_is_zero(const struct in6_addr *a)
return true;
 }
 
+struct in6_addr ipv6_addr_bitxor(const struct in6_addr *a,
+ const struct in6_addr *b)
+{
+int i;
+struct in6_addr dst;
+
+#ifdef s6_addr32
+for (i=0; i<4; i++) {
+dst.s6_addr32[i] = a->s6_addr32[i] ^ b->s6_addr32[i];
+}
+#else
+for (i=0; i<16; i++) {
+dst.s6_addr[i] = a->s6_addr[i] ^ b->s6_addr[i];
+}
+#endif
+
+return dst;
+}
+
 /* Returns an in6_addr consisting of 'mask' high-order 1-bits and 128-N
  * low-order 0-bits. */
 struct in6_addr
diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index b95caef..3ccfd7e 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -460,8 +460,9 @@ output;
 Ingress Table 10: DHCP option processing
 
 
-  This table adds the DHCPv4 options to a DHCPv4 packet from the
-  logical ports configured with IPv4 address(es) and DHCPv4 options.
+  This table adds the DHCPv4 options to a DHCPv4 packet and DHCPv6 options
+  to a DHCPv6 packet from the logical ports configured with IPv4 address(es)
+  and DHCPv4 options and IPv6 address(es) and DHCPv6 options.
 
 
 
@@ -489,6 +490,21 @@ next;
   
 
   
+
+  A priority-100 logical flow is added for these logical ports
+  which matches the IPv6 packet with udp.src = 546 and
+  udp.dst = 547 and applies the action
+  put_dhcpv6_opts and advances the packet to the next
+  table.
+
+
+
+reg0[3] = put_dhcpv6_opts(options...);
+next;
+
+  
+
+  
 A priority-0 flow that matches all packets to advances to table 11.
   
 
@@ -536,6 +552,41 @@ output;
   
 
   
+
+  A priority 100 logical flow is added for the logical ports configured
+  with DHCPv6 options which matches IPv6 packets with udp.src == 546
+  && udp.dst == 547 && reg0[3] == 1 and
+  responds back to the inport after applying these
+  actions.  If reg0[3] is set to 1, it means that the
+  action put_dhcpv6_opts was successful.
+
+
+
+eth.dst = eth.src;
+eth.src = E;
+ip6.dst = O;
+ip6.src = S;
+udp.src = 547;
+udp.dst = 546;
+outport = P;
+inport = ""; /* Allow sending out inport. */
+output;
+
+
+
+  where E is the server MAC address and S is the
+  server IPv6 LLA address  generated from the SERVER_ID
+  defined in the DHCPv6 options and O is
+  the IPv6 address defined in the logical port's addresses column.
+
+
+
+  (This terminates packet processing; the packet does not go on the
+  next ingress table.)
+
+  
+
+  
 A priority-0 flow that matches all packets to advances to table 12.
   
 
@@ -616,7 +667,8 @@ output;
 
 
   Also a priority 34000 logical flow is added for each logical port which
-  has DHCPv4 options defined to allow the DHCPv4 reply packet from the
+  has DHCPv4 options defined to allow the DHCPv4 reply packet and which has
+  DHCPv6 options defined to allow the DHCPv6 reply packet from the
   Ingress Table 11: DHCP responses.
 
 
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 578fbbb..3075634 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -1446,6 +1446,72 @@ build_dhcpv4_action(struct ovn_port 

[ovs-dev] [PATCH v1 2/3] ovn-controller: Add 'put_dhcpv6_opts' action in ovn-controller

2016-07-26 Thread Numan Siddique
This patch adds a new OVN action 'put_dhcpv6_opts' to support native
DHCPv6 in OVN.

ovn-controller parses this action and adds a NXT_PACKET_IN2
OF flow with 'pause' flag set and the DHCPv6 options stored in
'userdata' field.

When a valid DHCPv6 packet is received by ovn-controller, it frames a
new DHCPv6 reply packet with the DHCPv6 options present in the
'userdata' field, resumes the packet, and stores 1 in the 1-bit subfield.
If the packet is invalid, it resumes the packet without modifying it and
stores 0 in the 1-bit subfield.

E.g. reg0[3] = put_dhcpv6_opts(IA_ADDR = aef0::4, SERVER_ID = 00:00:00:00:10:02,
 DNS_RECURSIVE_SERVER={ae70::1,ae70::2})
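
For illustration only, here is a minimal Python sketch of how a SERVER_ID
given as a MAC address can be encoded as a DHCPv6 Server Identifier option
carrying a DUID-LL (RFC 3315, sections 9.4 and 22.3). It assumes the DUID-LL
form that the pinctrl.c hunk in this patch describes. The function name
server_id_option is hypothetical and is not part of the patch.

```python
import struct

DHCPV6_OPT_SERVER_ID = 2   # OPTION_SERVERID, RFC 3315 section 22.3
DUID_LL = 3                # DUID based on link-layer address, section 9.4
HW_TYPE_ETHERNET = 1       # IANA hardware type for Ethernet


def server_id_option(mac: str) -> bytes:
    """Encode 'mac' (aa:bb:cc:dd:ee:ff) as a Server Identifier option."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("expected a 6-byte Ethernet address")
    # DUID-LL: 2-byte DUID type, 2-byte hardware type, link-layer address.
    duid = struct.pack("!HH", DUID_LL, HW_TYPE_ETHERNET) + mac_bytes
    # Option header: 2-byte option code, 2-byte option length (10 here).
    return struct.pack("!HH", DHCPV6_OPT_SERVER_ID, len(duid)) + duid


if __name__ == "__main__":
    print(server_id_option("00:00:00:00:10:02").hex())
```

The option length works out to the 6-byte MAC plus 4 bytes of DUID type and
hardware type.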

A new 'DHCPv6_Options' table is added in SB DB which stores
the supported DHCPv6 options with DHCPv6 code and type. ovn-northd is
expected to populate this table.

An upcoming patch will add logical flows with this action.

Signed-off-by: Numan Siddique 
---
 ovn/controller/lflow.c   |   7 ++
 ovn/controller/pinctrl.c | 295 +++
 ovn/lib/actions.c| 114 ++
 ovn/lib/actions.h|   9 ++
 ovn/lib/ovn-dhcp.h   |  71 
 ovn/ovn-sb.ovsschema |  15 ++-
 ovn/ovn-sb.xml   | 129 +
 tests/ovn.at |  11 ++
 tests/test-ovn.c |   6 +
 9 files changed, 655 insertions(+), 2 deletions(-)

diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index 42c9055..22e105e 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -396,6 +396,13 @@ add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
  dhcp_opt_row->type);
 }
 
+
+const struct sbrec_dhcpv6_options *dhcpv6_opt_row;
+SBREC_DHCPV6_OPTIONS_FOR_EACH(dhcpv6_opt_row, ctx->ovnsb_idl) {
+   dhcp_opt_add(_opts, dhcpv6_opt_row->name, dhcpv6_opt_row->code,
+dhcpv6_opt_row->type);
+}
+
 if (full_logical_flow_processing) {
 SBREC_LOGICAL_FLOW_FOR_EACH (lflow, ctx->ovnsb_idl) {
 consider_logical_flow(lports, mcgroups, lflow, local_datapaths,
diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index 0ae6501..99a66f9 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -37,6 +37,7 @@
 #include "ovn-controller.h"
 #include "ovn/lib/actions.h"
 #include "ovn/lib/logical-fields.h"
+#include "ovn/lib/ovn-dhcp.h"
 #include "ovn/lib/ovn-util.h"
 #include "poll-loop.h"
 #include "rconn.h"
@@ -365,6 +366,295 @@ exit:
 }
 }
 
+static bool
+compose_out_dhcpv6_opts(struct ofpbuf *userdata,
+struct ofpbuf *out_dhcpv6_opts, ovs_be32 iaid)
+{
+while (userdata->size) {
+struct dhcp_opt6_header *userdata_opt = ofpbuf_try_pull(
+userdata, sizeof *userdata_opt);
+if (!userdata_opt) {
+return false;
+}
+
+uint8_t *userdata_opt_data = ofpbuf_try_pull(userdata,
+ userdata_opt->len);
+if (!userdata_opt_data) {
+return false;
+}
+
+switch(userdata_opt->code) {
+case DHCPV6_OPT_SERVER_ID_CODE:
+{
+/* The Server Identifier option is used to carry a DUID
+ * identifying a server between a client and a server.
+ * See RFC 3315 Sec 9 and Sec 22.3
+ *
+ * We will use DUID Based on Link-layer Address [DUID-LL]
+ */
+
+struct dhcpv6_opt_server_id *opt_server_id = ofpbuf_put_zeros(
+out_dhcpv6_opts, sizeof *opt_server_id);
+
+opt_server_id->opt.code = htons(DHCPV6_OPT_SERVER_ID_CODE);
+opt_server_id->opt.len = htons(userdata_opt->len + 4);
+opt_server_id->duid_type = htons(DHCPV6_DUID_LL);
+opt_server_id->hw_type = htons(DHCPV6_HW_TYPE_ETH);
+memcpy(&opt_server_id->mac, userdata_opt_data,
+sizeof(struct eth_addr));
+break;
+}
+
+case DHCPV6_OPT_IA_ADDR_CODE:
+{
+if (userdata_opt->len != sizeof(struct in6_addr)) {
+return false;
+}
+
+/* IA Address option is used to specify IPv6 addresses associated
+ * with an IA_NA or IA_TA. The IA Address option must be
+ * encapsulated in the Options field of an IA_NA or IA_TA option.
+ *
+ * We will encapsulate the IA Address within the IA_NA option.
+ * Please see RFC 3315 section 22.5 and 22.6
+ */
+struct dhcpv6_opt_ia_na *opt_ia_na = ofpbuf_put_zeros(
+out_dhcpv6_opts, sizeof *opt_ia_na);
+opt_ia_na->opt.code = htons(DHCPV6_OPT_IA_NA_CODE);
+/* IA_NA length (in bytes)-
+ *  IAID - 4
+ *  T1   - 4
+ *  T2   - 4
+ *  IA Address - sizeof(struct dhcpv6_opt_ia_addr)
+

[ovs-dev] [PATCH v1 1/3] ovn-northd: Add logical flows to support native DHCPv4

2016-07-26 Thread Numan Siddique
OVN implements a native DHCPv4 support which caters to the common
use case of providing an IP address to a booting instance by
providing stateless replies to DHCPv4 requests based on statically
configured address mappings. To do this it allows a short list of
DHCPv4 options to be configured and applied at each compute host
running ovn-controller.

A new table 'DHCP_Options' is added in OVN NB DB to store the DHCP
options. Logical ports refer to this table to configure the DHCPv4
options.

For each logical port configured with DHCPv4 options, the following
flows are added:
 - A logical flow which copies the DHCPv4 options to the DHCPv4
   request packets using the 'put_dhcp_opts' action and advances the
   packet to the next stage.

 - A logical flow which implements the DHCP responder by sending
   the DHCPv4 reply back to the inport once the 'put_dhcp_opts' action
   is applied.
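
The responder flow can be pictured as a header rewrite. Below is a minimal
Python sketch, assuming a packet is modelled as a plain dict keyed by OVN
field names. That representation and the function name dhcpv4_response are
hypothetical, chosen only to mirror the match and actions documented for
"Ingress Table 11: DHCP responses" in this patch; it is not how ovn-northd
or ovn-controller handle packets.

```python
def dhcpv4_response(pkt, server_mac, server_ip, offer_ip):
    """Return the rewritten reply, or None if the responder flow does not match."""
    matches = (pkt.get("udp.src") == 68 and pkt.get("udp.dst") == 67
               and pkt.get("reg0[3]") == 1)
    if not matches:
        return None            # falls through to the priority-0 flow
    reply = dict(pkt)
    reply["eth.dst"] = pkt["eth.src"]
    reply["eth.src"] = server_mac
    reply["ip4.dst"] = offer_ip
    reply["ip4.src"] = server_ip
    reply["udp.src"] = 67
    reply["udp.dst"] = 68
    reply["outport"] = pkt["inport"]
    reply["inport"] = ""       # allow sending out the port it arrived on
    return reply


if __name__ == "__main__":
    request = {
        "eth.src": "f0:00:00:00:00:01", "eth.dst": "ff:ff:ff:ff:ff:ff",
        "ip4.src": "0.0.0.0", "ip4.dst": "255.255.255.255",
        "udp.src": 68, "udp.dst": 67, "reg0[3]": 1, "inport": "lp1",
    }
    print(dhcpv4_response(request, "0a:00:00:00:00:01", "10.0.0.1", "10.0.0.4"))
```

A request for which 'put_dhcp_opts' stored 0 in reg0[3] returns None here,
which corresponds to the packet continuing to the next table.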

Signed-off-by: Numan Siddique 
Co-authored-by: Ben Pfaff 
Signed-off-by: Ben Pfaff 
---
 ovn/northd/ovn-northd.8.xml   |  91 +-
 ovn/northd/ovn-northd.c   | 256 +-
 ovn/ovn-nb.ovsschema  |  20 ++-
 ovn/ovn-nb.xml| 270 
 ovn/utilities/ovn-nbctl.8.xml |  30 +
 ovn/utilities/ovn-nbctl.c | 197 +
 tests/ovn.at  | 281 ++
 7 files changed, 1135 insertions(+), 10 deletions(-)

diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index ced2839..b95caef 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -457,7 +457,90 @@ output;
   
 
 
-Ingress Table 10: Destination Lookup
+Ingress Table 10: DHCP option processing
+
+
+  This table adds the DHCPv4 options to a DHCPv4 packet from the
+  logical ports configured with IPv4 address(es) and DHCPv4 options.
+
+
+
+  
+
+  A priority-100 logical flow is added for these logical ports
+  which matches the IPv4 packet with udp.src = 68 and
+  udp.dst = 67 and applies the action
+  put_dhcp_opts and advances the packet to the next table.
+
+
+
+reg0[3] = put_dhcp_opts(offer_ip = O, options...);
+next;
+
+
+
+  For DHCPDISCOVER and DHCPREQUEST, this transforms the packet into a
+  DHCP reply, adds the DHCP offer IP O and options to the
+  packet, and stores 1 into reg0[3].  For other kinds of packets, it
+  just stores 0 into reg0[3].  Either way, it continues to the next
+  table.
+
+
+  
+
+  
+A priority-0 flow that matches all packets to advances to table 11.
+  
+
+
+Ingress Table 11: DHCP responses
+
+
+  This table implements DHCP responder for the DHCP replies generated by
+  the previous table.
+
+
+
+  
+
+  A priority 100 logical flow is added for the logical ports configured
+  with DHCPv4 options which matches IPv4 packets with udp.src == 68
+  && udp.dst == 67 && reg0[3] == 1 and
+  responds back to the inport after applying these
+  actions.  If reg0[3] is set to 1, it means that the
+  action put_dhcp_opts was successful.
+
+
+
+eth.dst = eth.src;
+eth.src = E;
+ip4.dst = O;
+ip4.src = S;
+udp.src = 67;
+udp.dst = 68;
+outport = P;
+inport = ""; /* Allow sending out inport. */
+output;
+
+
+
+  where E is the server MAC address and S is the
+  server IPv4 address defined in the DHCPv4 options and O is
+  the IPv4 address defined in the logical port's addresses column.
+
+
+
+  (This terminates ingress packet processing; the packet does not go
+   to the next ingress table.)
+
+  
+
+  
+A priority-0 flow that matches all packets to advances to table 12.
+  
+
+
+Ingress Table 12: Destination Lookup
 
 
   This table implements switching behavior.  It contains these logical
@@ -531,6 +614,12 @@ output;
   there are no rules added for load balancing new connections.
 
 
+
+  Also a priority 34000 logical flow is added for each logical port which
+  has DHCPv4 options defined to allow the DHCPv4 reply packet from the
+  Ingress Table 11: DHCP responses.
+
+
 Egress Table 6: Egress Port Security - IP
 
 
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 38a3d30..578fbbb 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -27,6 +27,7 @@
 #include "openvswitch/hmap.h"
 #include "openvswitch/json.h"
 #include "ovn/lib/lex.h"
+#include "ovn/lib/ovn-dhcp.h"
 #include "ovn/lib/ovn-nb-idl.h"
 #include "ovn/lib/ovn-sb-idl.h"
 #include "ovn/lib/ovn-util.h"
@@ -99,7 +100,9 @@ enum ovn_stage {
 PIPELINE_STAGE(SWITCH, IN,  LB, 7, 

[ovs-dev] [PATCH v1 0/3] ovn: Support native DHCPv4 and DHCPv6 proposal

2016-07-26 Thread Numan Siddique
This patch series supports native DHCPv4 and native DHCPv6 in OVN.

Patch 1 adds DHCPv4 logical flows in ovn-northd. It uses the OVN action
"put_dhcp_opts".

Patches 2 and 3 add native DHCPv6 support.


Numan Siddique (3):
  ovn-northd: Add logical flows to support native DHCPv4
  ovn-controller: Add 'put_dhcpv6_opts' action in ovn-controller
  ovn-northd: Add logical flows to support DHCPv6

 lib/packets.c |  29 ++-
 ovn/controller/lflow.c|   7 +
 ovn/controller/pinctrl.c  | 295 
 ovn/lib/actions.c | 114 ++
 ovn/lib/actions.h |   9 +
 ovn/lib/ovn-dhcp.h|  71 ++
 ovn/northd/ovn-northd.8.xml   | 143 +++-
 ovn/northd/ovn-northd.c   | 435 ++-
 ovn/ovn-nb.ovsschema  |  25 +-
 ovn/ovn-nb.xml| 354 +
 ovn/ovn-sb.ovsschema  |  15 +-
 ovn/ovn-sb.xml| 129 +++
 ovn/utilities/ovn-nbctl.8.xml |  30 +++
 ovn/utilities/ovn-nbctl.c | 197 
 tests/ovn.at  | 518 ++
 tests/test-ovn.c  |   6 +
 16 files changed, 2354 insertions(+), 23 deletions(-)

-- 
2.7.4



Re: [ovs-dev] [PATCH v2 3/3] rhel: Improved Systemd Integration

2016-07-26 Thread Flavio Leitner
On Mon, Jul 25, 2016 at 02:03:53PM -0400, Aaron Conole wrote:
> This commit builds upon some of the recent ovs-ctl changes to build a
> more integrated systemd setup.  A new service (ovs-vswitchd) is
> added to track the ovs-vswitchd, and ovsdb-server service is reserved
> for the ovsdb-server daemon.  The systemd scripts still use ovs-ctl to
> actually initialize the daemons.
> 
> Signed-off-by: Aaron Conole 
> Reviewed-by: Markos Chandras 
> ---

Acked-by: Flavio Leitner 



Re: [ovs-dev] ovsdb active backup deployment

2016-07-26 Thread Russell Bryant
On Tue, Jul 26, 2016 at 2:41 PM, Andy Zhou  wrote:

>
>
> On Tue, Jul 26, 2016 at 5:37 AM, Russell Bryant  wrote:
>
>>
>>
>> On Mon, Jul 25, 2016 at 8:15 PM, Andy Zhou  wrote:
>>
>>> Hi, Rayn and Russell,
>>>
>>
>> Can we move this discussion to the ovs dev mailing list?  Feel free to
>> just add it in a reply if you'd like.
>>
> Done.
>
>>
>>
>>> I am wondering how we can actually use the active/backup feature that is
>>> now part of
>>> OVSDB to increase OVN availability.
>>>
>>
>> To be clear, I haven't actually tried this yet.  I'm only speaking about
>> how I think it should work.
>>
>>
>>> Specifically:
>>>
>>> 1. When the active OVSDB server fails, should the backup server take
>>> over and allow write transactions? One simpler possibility is to allow
>>> read-only access to the backup server.
>>>
>>
>> The backup server needs to take over.  It's OK if that requires
>> intervention by an HA manager like Pacemaker.  If we can't make the passive
>> server take over, I'd say the solution is incomplete.
>>
>
> O.K., makes sense.
>
> One possible issue with the backup server taking over is "split brain".  If,
> due to a network error, the backup server becomes disconnected from the
> active server, both servers may then think they are the active server.
> Does Pacemaker help with solving this issue?
>

It can, yes.  I would expect Pacemaker to explicitly configure a node to be
either the active or passive node.


>
>>
>>> 2. When a crashed active OVSDB server recovers, should it become the new
>>> backup, or should it switch back?
>>>
>>
>> Becoming the new backup is fine.  Again, this can be orchestrated by an
>> HA manager (Pacemaker).
>>
> I am not familiar with Pacemaker. Can I assume it can provide a correct
> --sync-from argument (pointing to the backup server) when relaunching the
> OVSDB server?
>

Yes.  I'd have to consult with some Pacemaker experts on exactly what the
implementation would look like, but roughly:

Pacemaker manages services using "OCF Resource Agents", which are just
scripts with a defined set of inputs and outputs for service management.  I
would imagine a Pacemaker cluster being told it must have exactly 1 active
and 1 passive OVSDB service.  When the passive OVSDB service is started, it
would include the "sync-from" argument based on where the active OVSDB
service is currently running.

We really need to prototype this and document it.  I'm guessing too much.
Pacemaker is frequently used to manage active/passive HA, though.
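
As a rough illustration of the last step, an agent acting for the HA manager could pick the ovsdb-server arguments by role. This is only a sketch under assumptions: `build_ovsdb_cmd()`, `enum ovsdb_role`, and any remote passed to it (for example `tcp:192.0.2.10:6640`) are hypothetical names and values, not an existing Pacemaker or OVS interface. Only the `--sync-from` option itself comes from the discussion above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical roles that an HA manager might assign to a node. */
enum ovsdb_role { OVSDB_ACTIVE, OVSDB_BACKUP };

/* Writes a command line for 'role' into 'buf'.  A backup is pointed at
 * 'active_remote' with --sync-from; an active server gets no extra option.
 * Returns the length that snprintf() reports. */
static int
build_ovsdb_cmd(enum ovsdb_role role, const char *active_remote,
                char *buf, size_t size)
{
    if (role == OVSDB_BACKUP) {
        assert(active_remote != NULL);  /* A backup needs a server to follow. */
        return snprintf(buf, size, "ovsdb-server --sync-from=%s",
                        active_remote);
    }
    return snprintf(buf, size, "ovsdb-server");
}
```

A real resource agent would also have to pass the database file and remotes, and would need a way to promote the backup, none of which is covered here.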


>>
>>> Ben said one of you, or both may have worked with similar active-backup
>>> systems before, so I am very interested in your inputs.
>>>
>>> Thanks,
>>>
>>> Andy
>>>
>>
>>
>>
>> --
>> Russell Bryant
>>
>
>
-- 
Russell Bryant


Re: [ovs-dev] ovsdb active backup deployment

2016-07-26 Thread Andy Zhou
On Tue, Jul 26, 2016 at 5:37 AM, Russell Bryant  wrote:

>
>
> On Mon, Jul 25, 2016 at 8:15 PM, Andy Zhou  wrote:
>
>> Hi, Rayn and Russell,
>>
>
> Can we move this discussion to the ovs dev mailing list?  Feel free to
> just add it in a reply if you'd like.
>
Done.

>
>
>> I am wondering how we can actually use the active/backup feature that is
>> now part of
>> OVSDB to increase OVN availability.
>>
>
> To be clear, I haven't actually tried this yet.  I'm only speaking about
> how I think it should work.
>
>
>> Specifically:
>>
>> 1. When the active OVSDB server fails, should the backup server take
>> over and allow write transactions? One simpler possibility is to allow
>> read-only access to the backup server.
>>
>
> The backup server needs to take over.  It's OK if that requires
> intervention by an HA manager like Pacemaker.  If we can't make the passive
> server take over, I'd say the solution is incomplete.
>

O.K., makes sense.

One possible issue with the backup server taking over is "split brain".  If,
due to a network error, the backup server becomes disconnected from the
active server, both servers may then think they are the active server.
Does Pacemaker help with solving this issue?

>
>
>> 2. When a crashed active OVSDB server recovers, should it become the new
>> backup, or should it switch back?
>>
>
> Becoming the new backup is fine.  Again, this can be orchestrated by an HA
> manager (Pacemaker).
>
I am not familiar with Pacemaker. Can I assume it can provide a correct
--sync-from argument (pointing to the backup server) when relaunching the
OVSDB server?

>
>
>> Ben said one of you, or both may have worked with similar active-backup
>> systems before, so I am very interested in your inputs.
>>
>> Thanks,
>>
>> Andy
>>
>
>
>
> --
> Russell Bryant
>


Re: [ovs-dev] [RFC 4/5] dpctl: uses open_type when calling netdev_open

2016-07-26 Thread Daniele Di Proietto
2016-07-26 11:29 GMT-07:00 Thadeu Lima de Souza Cascardo <
casca...@redhat.com>:

> On Tue, Jul 26, 2016 at 11:20:38AM -0700, Daniele Di Proietto wrote:
> > Hi Cascardo,
> >
> > thanks for your input on this.  It's quite messy right now, but I believe
> > we have a chance
> > to fix this up.
> >
> > Replies inline
> >
> > 2016-07-26 7:33 GMT-07:00 Thadeu Lima de Souza Cascardo <
> casca...@redhat.com
> > >:
> >
> > > On Mon, Jul 25, 2016 at 11:03:29AM -0700, Daniele Di Proietto wrote:
> > > > 2016-07-25 9:57 GMT-07:00 Thadeu Lima de Souza Cascardo <
> > > casca...@redhat.com
> > > > >:
> > > >
> > > > > On Fri, Jul 22, 2016 at 02:49:39PM -0700, Daniele Di Proietto
> wrote:
> > > > > > I would prefer if dpctl kept using the datapath types.  The
> > > translation
> > > > > > from database types to datapath type should happen in ofproto,
> dpctl
> > > is
> > > > > > supposed to be used to interact with the datapath directly.
> > > > > >
> > > > > > What do you guys think?
> > > > > >
> > > > > > The rest of the series looks good to me as well.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Daniele
> > > > > >
> > > > >
> > > > >
> > > > Hi Cascardo,
> > > >
> > > > Thanks for the detailed analysis.  The problem is that there are
> three
> > > > types:
> > > >
> > > > a) the database type
> > > > b) the port type in dpif-netdev
> > > > c) the netdev type
> > > >
> > > > I was assuming that b and c are always equal, but they're not.  The
> only
> > > > case
> > > > when they're not equal is the "ovs-netdev" (or "ovs-dummy") port.
> > > >
> > > > I think we can easily remove this case and make b and c always equal
> > > > with the following changes:
> > >
> > > Well, we also have ofproto type.
> > >
> > >
> > If I'm not mistaken, ofproto type is always equal to b) and therefore to
> c).
> >
>
> I didn't think so, I thought that a would equal the ofproto type. But it
> seems
> you are right, and we can just have two types: database type and netdev
> type,
> and make sure dpif and ofproto types match the netdev type.
>
> >
> > > I had a different approach, in which I would use the netdev_type when
> > > doing the
> > > query. That broke tests too. The affected tests were just dpctl output
> > > shown to
> > > the user. But I would expect some breakage when ofproto_query also used
> > > the same
> > > type and vswitchd would see the database type and ofproto type as
> > > different and
> > > try to reconfigure the port.
> > >
> > >
> > I don't think so. In bridge_delete_or_reconfigure() we compare the
> ofproto
> > type with
> > iface->netdev_type:
> >
> > if (strcmp(ofproto_port.type, iface->netdev_type)
> > || netdev_set_config(iface->netdev, &iface->cfg->options,
>
> You are right. It was just some confusion because of the related problem I
> found
> (and this code is a recent fix from myself because we were using the
> database
> type).
>
> The problem was that we were comparing the database type to "system" and
> my mind
> was thinking "system" was the database type and the ofproto_port.type was
> different. I just didn't look back at the code and made a quick assumption.
>
> > NULL)) {
> > /* The interface is the wrong type or can't be configured.
> >  * Delete it. */
> > goto delete;
> > }
> >
> >
> > > Then, I looked at your patch below and noticed that you do the
> opposite,
> > > you
> > > eliminate the open_type and only use it for the internal type. Then I
> > > thought
> > > that would break other cases. But dpif_netdev_port_add uses the
> netdev_type
> > > already. Hey...
> > >
> > > So, dpctl does see it as tap when I add an internal port. Which
> probably
> > > means
> > > ofproto_type is also tap. I guess we will have to fix that too.
> > >
> >
> > To sum up, why do we have to fix that?  The translation between the
> database
> > type and the netdev type happens in vswitchd/bridge.c.  The below layers,
> > ofproto
> > and dpif-netdev, deal with the netdev type directly.
> >
> > Is there a problem with this approach?
> >
> > The few changes I suggested fix the confusion for the "ovs-netdev" port.
> >
>
> I guess that just causes the slight difference in behavior that dpctl will
> output "tap" instead of "internal" for the local port and only for the
> local
> port. For other internal ports, "tap" was already used. As you mentioned
> dpctl
> is a debugging tool and would be OK with that change, I will use your
> patch with
> your from and my sign-off. Is that OK?
>
>
I think it's fine to change the output.

You can be the author of the change.  Quite a few other lines need to be
changed
in the testsuite as well.
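
A minimal sketch of the model agreed on in this thread: the database type is translated once, and the layers below compare netdev types only. The helper names `netdev_type_from_db()` and `port_needs_recreate()` are hypothetical, not functions in the OVS tree, and the "internal" to "tap" mapping for the userspace ("netdev") datapath is an assumption based on the dpctl output described in the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: maps a database interface type to the netdev type
 * used by the layers below bridge.c.  Assumes that on the userspace
 * ("netdev") datapath an "internal" port is backed by a "tap" netdev and
 * that every other type passes through unchanged. */
static const char *
netdev_type_from_db(const char *datapath_type, const char *db_type)
{
    assert(datapath_type && db_type);
    if (!strcmp(datapath_type, "netdev") && !strcmp(db_type, "internal")) {
        return "tap";
    }
    return db_type;
}

/* Hypothetical helper mirroring the strcmp() check quoted from
 * bridge_delete_or_reconfigure(): the port has to be deleted and re-added
 * when the type reported by ofproto differs from the netdev type. */
static bool
port_needs_recreate(const char *ofproto_port_type, const char *netdev_type)
{
    assert(ofproto_port_type && netdev_type);
    return strcmp(ofproto_port_type, netdev_type) != 0;
}
```

Under these assumptions, comparing an ofproto type of "tap" against the raw database type "internal" would wrongly request a re-add, while comparing it against the translated netdev type would not.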

Here's a signoff for my part:

Signed-off-by: Daniele Di Proietto 

Thanks for getting to the bottom of this.

Daniele


> Thanks.
> Cascardo.
>
> >
> > >
> > > I am attaching my version of the patch here as well. Which of the 3
> > > versions do
> > > you think I should send? The original one I sent 
