[ovs-dev] [RFC] ofproto/bond: operational vs administratively disabled bond interface

2017-01-17 Thread Eelco Chaudron

Currently OVS does not distinguish between a bond slave being operationally
disabled, i.e. the link being down, and administratively disabled.

Take the example where the administrator disables a link in a bond with
"ovs-appctl bond/disable-slave bond0 enp129s0f0"; it is automatically
enabled again because the link is up.

I would like to change this behavior such that, when disabled through
appctl, the slave is no longer used until explicitly enabled again via
appctl.
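
The intended semantics, sketched with appctl (bond and slave names are
examples only):

    # Disable a slave; with this change it must stay disabled even
    # though its link is up:
    ovs-appctl bond/disable-slave bond0 enp129s0f0
    ovs-appctl bond/show bond0          # enp129s0f0 shown as disabled

    # Only an explicit enable puts it back into use:
    ovs-appctl bond/enable-slave bond0 enp129s0f0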

Any comments? If not, I'll go ahead and prepare a patch set.

Cheers,

Eelco



Re: [ovs-dev] [PATCH] ovn-controller: Provide the option to set Encap.options:csum

2017-01-17 Thread Numan Siddique
On Mon, Jan 16, 2017 at 11:46 PM, Russell Bryant  wrote:

>
>
> On Mon, Jan 16, 2017 at 12:01 PM, Ben Pfaff  wrote:
>
>> On Mon, Jan 16, 2017 at 09:28:28AM -0500, Russell Bryant wrote:
>> > On Sat, Jan 14, 2017 at 11:29 AM, Ben Pfaff  wrote:
>> >
>> > > On Sat, Jan 14, 2017 at 07:37:53PM +0530, Numan Siddique wrote:
>> > > > On Sat, Jan 14, 2017 at 3:18 AM, Ben Pfaff  wrote:
>> > > >
>> > > > > On Tue, Jan 10, 2017 at 11:34:42AM +0530, Numan Siddique wrote:
>> > > > > > ovn-controller by default enables UDP checksums for geneve
>> > > > > > tunnels. With this patch user can set the desired value in
>> > > > > > Open_vSwitch.external_ids:ovn_encap_csum.
>> > > > > >
>> > > > > > Signed-off-by: Numan Siddique 
>> > > > >
>> > > > > I don't see technical problems with this, but I also don't know
>> > > > > why a user would want to disable checksums.  Can you send a v2
>> > > > > that adds this rationale to the documentation and to the commit
>> > > > > message?
>> > > > >
>> > > >
>> > > > Thanks for the review. Sure I will do that. The reason for this
>> > > > patch is - we are seeing significant performance increase (more than
>> > > > 100%) in our testing when tunnel checksum is disabled.
>> > > >
>> > > > The lab servers have NICs with geneve offload support
>> > > > (tx-udp_tnl-segmentation).
>> > >
>> > > OK, that sounds like a good reason to document.
>> >
>> >
>> > In particular, it looks like the NICs we have, Intel X710, will do TCP
>> > Segmentation Offload (TSO) with geneve or vxlan, but only if udp
>> > checksums are turned off.  Once they're on, TCP throughput gets cut to
>> > less than half.
>> >
>> > This is going to be painful to document well if it's hardware dependent.
>> > I'm not sure what the better default is, since checksums should actually
>> > improve performance for NICs without geneve offload support.
>>
>> Well, it's at least helpful to document that performance is the reason,
>> and that offloads factor into the issue.  Otherwise readers will have no
>> idea why they'd want to turn off checksums.  At worst, users can
>> benchmark both cases in their environments.
>>
>
> Totally agreed.  I was just adding some more detail about what we've seen,
> and then reflecting on how we might want to document this new option for
> users.
>
>
I have submitted v2 of the patch with updated documentation -
https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327660.html

Thanks
Numan

> --
> Russell Bryant
>
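
For reference, with the v2 patch applied, disabling tunnel checksums should
come down to something like this on each chassis (the key name is taken from
the commit message above):

    ovs-vsctl set Open_vSwitch . external_ids:ovn_encap_csum=false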


[ovs-dev] [PATCH v10 0/8] ovn: add distributed NAT capability

2017-01-17 Thread Mickey Spiegel
PATCH v9 -> PATCH v10
Patch 3 (distributed gateway port / chassisredirect port_binding) is
based on v8, with some rewrites in ovn-nb.xml to make it higher level
and remove references to sb constructs.
A decision still needs to be made between v10 patch 3 and v9 patch 3,
which dropped the suggested new port type of "chassisredirect" and
replaced it with a MLF_FORCE_CHASSIS_REDIRECT flag.
The egress loopback patch (patch 4) has been completely rewritten as
an action.  Flags are no longer needed for egress loopback, and there
are no longer any changes to ovn/controller/physical.c.  The action
now includes ct_clear.
The NAT flows patch now uses the new egress_loopback action, and adds
REGBIT_EGRESS_LOOPBACK, instead of the previous
MLF_EGRESS_LOOPBACK_OCCURRED flag.
SNAT recirc changes are now documented in ovn-sb.xml.
A new patch has been added extending ovn-nbctl nat commands with new
arguments for logical_port and external_mac.


Currently OVN supports NAT functionality by connecting each distributed
logical router to a centralized "l3gateway" router that resides on a
single chassis.  NAT is only carried out in the "l3gateway" router.

This patch set introduces NAT capability in the distributed logical
router itself, avoiding the need to pass through a transit logical
switch and a second logical router, and in many cases avoiding the need
to pass through a centralized chassis.

NAT functionality is associated with the logical router gateway port.
In order to support one-to-many SNAT (aka IP masquerading), where
multiple private IP addresses spread across multiple chassis are mapped
to a single public IP address, it will be necessary to handle some of
the logical router processing on a specific chassis in a centralized
manner.  Some NAT flows are handled in a distributed manner on all
chassis (following the local "patch" port as is normally done for
distributed logical routers), while other NAT flows are handled on a
centralized "redirect-chassis".

Possible future work items (hopefully not required for this patch set
to be accepted) include:
1. The NAT flows patch lifts the restriction that conntrack zones are
   only assigned to datapaths for gateway routers.  Given recent
   changes to ovn-controller, a hypervisor only sees the datapaths
   for which there is a port resident on this chassis, or datapaths
   reachable from ports resident on this chassis.  Is that good
   enough?  Or should conntrack zone assignment for datapaths be
   restricted further, perhaps only to logical router datapaths?
2. The current automated test for NAT flows is single node, so it does
   not cover the distributed functionality.  Full coverage requires a
   multi-node test with conntrack NAT capability, either in the kernel
   or userspace.  Is this possible?
   Multi-node tests have been added for the chassisredirect patch,
   testing non-NAT aspects of the distributed router gateway port.
3. Consider how to generalize distributed versus centralized handling
   of non-NAT traffic being output on the distributed gateway port.
   If MAC learning is used in the upstream network, then the
   distributed gateway port’s MAC address must be restricted to the
   redirect-chassis by using the chassisredirect port.  In the
   presence of dynamic protocols such as BGP EVPN, non-NAT traffic
   could be handled in a distributed manner.
4. Gratuitous ARP for NAT addresses needs to be updated for
   distributed NAT.
5. Add load balancing on the redirect chassis of an otherwise
   distributed logical router.

PATCH v8 -> PATCH v9
This is just the first 3 patches.  The rest of the patch set will follow
once a decision is made on v8 patch 4 versus v9 patch 3.
The peer address patch was accepted, so it is no longer included in the
patch set.
The suggested new port type of "chassisredirect" has been dropped and
replaced by MLF_FORCE_CHASSIS_REDIRECT flag.

PATCH v7 -> PATCH v8
Incorporated incremental changes to is_chassis_resident() from blp.
Added patch that describes logical routers and logical patches in
ovn-architecture.
Renamed chassisredirect patch to emphasize distributed gateway ports
as well.
Added description of distributed gateway ports to ovn-architecture,
in distributed gateway port / chassisredirect patch.
Rewrote commit message for distributed gateway port / chassisredirect.

PATCH v6 -> PATCH v7
Rebase.
Documentation improvements to lsp addresses "router" patch as
suggested by blp.  Also added to ovn-nbctl documentation.

PATCH v5 -> PATCH v6
Added patch to automatically add router addresses to the addresses of
type "router" lsps.
Restricted logical switch destination lookup flows for logical router
distributed gateway port's MAC to the redirect chassis.
Automatically add distributed NAT MAC addresses to logical switch
destination lookup flows on the chassis where the NAT logical port resides.
Added tests for reachability from VIFs on the same logical switch as
localnet, through the logical router's distributed gateway port, to
in

[ovs-dev] [PATCH v10 1/8] ovn: document logical routers and logical patch ports in ovn-architecture

2017-01-17 Thread Mickey Spiegel
This patch adds a description of logical routers and logical patch ports,
including gateway routers, to ovn/ovn-architecture.7.xml.

Signed-off-by: Mickey Spiegel 
---
 ovn/ovn-architecture.7.xml | 148 ++---
 1 file changed, 140 insertions(+), 8 deletions(-)

diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index b049e51..d92f878 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -381,6 +381,36 @@
   switch.  Logical switches and routers are both implemented as logical
   datapaths.
 
+
+
+  
+Logical ports represent the points of connectivity in and
+out of logical switches and logical routers.  Some common types of
+logical ports are:
+  
+
+  
+
+  Logical ports representing VIFs.
+
+
+
+  Localnet ports represent the points of connectivity
+  between logical switches and the physical network.  They are
+  implemented as OVS patch ports between the integration bridge
+  and the separate Open vSwitch bridge that underlay physical
+  ports attach to.
+
+
+
+  Logical patch ports represent the points of
+  connectivity between logical switches and logical routers, and
+  in some cases between peer logical routers.  There is a pair of
+  logical patch ports at each such point of connectivity, one on
+  each side.
+
+  
+
   
 
   Life Cycle of a VIF
@@ -1040,17 +1070,119 @@
 is a container nested with a VM, then before sending the packet the
 actions push on a VLAN header with an appropriate VLAN ID.
   
-
-  
-If the logical egress port is a logical patch port, then table 65
-outputs to an OVS patch port that represents the logical patch port.
-The packet re-enters the OpenFlow flow table from the OVS patch port's
-peer in table 0, which identifies the logical datapath and logical
-input port based on the OVS patch port's OpenFlow port number.
-  
 
   
 
+  Logical Routers and Logical Patch Ports
+
+  
+Typically logical routers and logical patch ports do not have a
+physical location and effectively reside on every hypervisor.  This is
+the case for logical patch ports between logical routers and logical
+switches behind those logical routers, to which VMs (and VIFs) attach.
+  
+
+  
+Consider a packet sent from one virtual machine or container to another
+VM or container that resides on a different subnet.  The packet will
+traverse tables 0 to 65 as described in the previous section
+Architectural Physical Life Cycle of a Packet, using the
+logical datapath representing the logical switch that the sender is
+attached to.  At table 32, the packet will use the fallback flow that
+resubmits locally to table 33 on the same hypervisor.  In this case,
+all of the processing from table 0 to table 65 occurs on the hypervisor
+where the sender resides.
+  
+
+  
+When the packet reaches table 65, the logical egress port is a logical
+patch port.  The behavior at table 65 differs depending on the OVS
+version:
+  
+
+  
+
+  In OVS versions 2.6 and earlier, table 65 outputs to an OVS patch
+  port that represents the logical patch port.  The packet re-enters
+  the OpenFlow flow table from the OVS patch port's peer in table 0,
+  which identifies the logical datapath and logical input port based
+  on the OVS patch port's OpenFlow port number.
+
+
+
+  In OVS versions 2.7 and later, the packet is cloned and resubmitted
+  directly to OpenFlow flow table 16, setting the logical ingress
+  port to the peer logical patch port, and using the peer logical
+  patch port's logical datapath (that represents the logical router).
+
+  
+
+  
+The packet re-enters the ingress pipeline in order to traverse tables
+16 to 65 again, this time using the logical datapath representing the
+logical router.  The processing continues as described in the previous
+section Architectural Physical Life Cycle of a Packet.
+When the packet reaches table 65, the logical egress port will once
+again be a logical patch port.  In the same manner as described above,
+this logical patch port will cause the packet to be resubmitted to
+OpenFlow tables 16 to 65, this time using the logical datapath
+representing the logical switch that the destination VM or container
+is attached to.
+  
+
+  
+The packet traverses tables 16 to 65 a third and final time.  If the
+destination VM or container resides on a remote hypervisor, then table
+32 will send the packet on a tunnel port from the sender's hypervisor
+to the remote hypervisor.  Finally table 65 will output the packet
+directly to the destination VM or container.
+  
+
+  
+  

[ovs-dev] [PATCH v10 2/8] ovn: add is_chassis_resident match expression component

2017-01-17 Thread Mickey Spiegel
This patch introduces a new match expression component
is_chassis_resident().  Unlike match expression comparisons,
is_chassis_resident is not pushed down to OpenFlow.  It is a
conditional that is evaluated in the controller during expr_simplify(),
when it is replaced by a boolean expression.  The is_chassis_resident
conditional evaluates to "true" when the specified string identifies a
port name that is resident on this controller chassis, i.e., the
corresponding southbound database Port_Binding has a chassis column
that matches this chassis.  Otherwise it evaluates to "false".

This allows higher level features to specify flows that are only
installed on some chassis rather than on all chassis with the
corresponding datapath.
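
For example, a logical flow whose match includes the conditional below
(port name hypothetical) is only turned into OpenFlow flows on the chassis
where "cr-lr0" is bound; on every other chassis it simplifies to "false":

    outport == "cr-lr0" && is_chassis_resident("cr-lr0")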

Suggested-by: Ben Pfaff 
Signed-off-by: Mickey Spiegel 
Acked-by: Ben Pfaff 
---
 include/ovn/expr.h  |  22 +-
 ovn/controller/lflow.c  |  31 ++--
 ovn/controller/lflow.h  |   5 +-
 ovn/controller/ovn-controller.c |   5 +-
 ovn/lib/expr.c  | 160 ++--
 ovn/ovn-sb.xml  |  14 
 ovn/utilities/ovn-trace.c   |  21 +-
 tests/ovn.at|  14 
 tests/test-ovn.c|  24 +-
 9 files changed, 279 insertions(+), 17 deletions(-)

diff --git a/include/ovn/expr.h b/include/ovn/expr.h
index 2169a8c..711713e 100644
--- a/include/ovn/expr.h
+++ b/include/ovn/expr.h
@@ -292,6 +292,15 @@ enum expr_type {
 EXPR_T_AND, /* Logical AND of 2 or more subexpressions. */
 EXPR_T_OR,  /* Logical OR of 2 or more subexpressions. */
 EXPR_T_BOOLEAN, /* True or false constant. */
+EXPR_T_CONDITION,   /* Conditional to be evaluated in the
+ * controller during expr_simplify(),
+ * prior to constructing OpenFlow matches. */
+};
+
+/* Expression condition type. */
+enum expr_cond_type {
+EXPR_COND_CHASSIS_RESIDENT, /* Check if specified logical port name is
+ * resident on the controller chassis. */
 };
 
 /* Relational operator. */
@@ -349,6 +358,14 @@ struct expr {
 
 /* EXPR_T_BOOLEAN. */
 bool boolean;
+
+/* EXPR_T_CONDITION. */
+struct {
+enum expr_cond_type type;
+bool not;
+/* XXX Should arguments for conditions be generic? */
+char *string;
+} cond;
 };
 };
 
@@ -375,7 +392,10 @@ void expr_destroy(struct expr *);
 
 struct expr *expr_annotate(struct expr *, const struct shash *symtab,
char **errorp);
-struct expr *expr_simplify(struct expr *);
+struct expr *expr_simplify(struct expr *,
+   bool (*is_chassis_resident)(const void *c_aux,
+   const char *port_name),
+   const void *c_aux);
 struct expr *expr_normalize(struct expr *);
 
 bool expr_honors_invariants(const struct expr *);
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index 71d8c59..3d7633e 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -50,12 +50,18 @@ struct lookup_port_aux {
 const struct sbrec_datapath_binding *dp;
 };
 
+struct condition_aux {
+const struct lport_index *lports;
+const struct sbrec_chassis *chassis;
+};
+
 static void consider_logical_flow(const struct lport_index *lports,
   const struct mcgroup_index *mcgroups,
   const struct sbrec_logical_flow *lflow,
   const struct hmap *local_datapaths,
   struct group_table *group_table,
   const struct simap *ct_zones,
+  const struct sbrec_chassis *chassis,
   struct hmap *dhcp_opts,
   struct hmap *dhcpv6_opts,
   uint32_t *conj_id_ofs,
@@ -85,6 +91,16 @@ lookup_port_cb(const void *aux_, const char *port_name, 
unsigned int *portp)
 }
 
 static bool
+is_chassis_resident_cb(const void *c_aux_, const char *port_name)
+{
+const struct condition_aux *c_aux = c_aux_;
+
+const struct sbrec_port_binding *pb
+= lport_lookup_by_name(c_aux->lports, port_name);
+return pb && pb->chassis && pb->chassis == c_aux->chassis;
+}
+
+static bool
 is_switch(const struct sbrec_datapath_binding *ldp)
 {
 return smap_get(&ldp->external_ids, "logical-switch") != NULL;
@@ -98,6 +114,7 @@ add_logical_flows(struct controller_ctx *ctx, const struct 
lport_index *lports,
   const struct hmap *local_datapaths,
   struct group_table *group_table,
   const struct simap *ct_zones,
+  const struct sbrec_chassis *chassis,
   const struct shash *addr_sets,
  

[ovs-dev] [PATCH v10 3/8] ovn: Introduce distributed gateway port and "chassisredirect" port binding

2017-01-17 Thread Mickey Spiegel
Currently OVN distributed logical routers achieve reachability to
physical networks by passing through a "join" logical switch to a
centralized gateway router, which then connects to another logical
switch that has a localnet port connecting to the physical network.

This patch adds logical port and port binding abstractions that allow
an OVN distributed logical router to connect directly to a logical
switch that has a localnet port connecting to the physical network.
In this patch, this logical router port is called a "distributed
gateway port".

The primary design goal of distributed gateway ports is to allow as
much traffic as possible to be handled locally on the hypervisor
where a VM or container resides.  Whenever possible, packets from
the VM or container to the outside world should be processed
completely on that VM's or container's hypervisor, eventually
traversing a localnet port instance on that hypervisor to the
physical network.  Whenever possible, packets from the outside
world to a VM or container should be directed through the physical
network directly to the VM's or container's hypervisor, where the
packet will enter the integration bridge through a localnet port.

However, due to the implications of the use of L2 learning in the
physical network, as well as the need to support advanced features
such as one-to-many NAT (aka IP masquerading), where multiple
logical IP addresses spread across multiple chassis are mapped to
one external IP address, it will be necessary to handle some of the
logical router processing on a specific chassis in a centralized
manner.  For this reason, the user must associate a chassis with
each distributed gateway port.

In order to allow for the distributed processing of some packets,
distributed gateway ports need to be logical patch ports that
effectively reside on every hypervisor, rather than "l3gateway"
ports that are bound to a particular chassis.  However, the flows
associated with distributed gateway ports often need to be
associated with physical locations.  This is implemented in this
patch (and subsequent patches) by adding "is_chassis_resident()"
match conditions to several logical router flows.

While most of the physical location dependent aspects of distributed
gateway ports can be handled by restricting some flows to specific
chassis, one additional mechanism is required.  When a packet
leaves the ingress pipeline and the logical egress port is the
distributed gateway port, one of two different sets of actions is
required at table 32:
- If the packet can be handled locally on the sender's hypervisor
  (e.g. one-to-one NAT traffic), then the packet should just be
  resubmitted locally to table 33, in the normal manner for
  distributed logical patch ports.
- However, if the packet needs to be handled on the chassis
  associated with the distributed gateway port (e.g. one-to-many
  SNAT traffic or non-NAT traffic), then table 32 must send the
  packet on a tunnel port to that chassis.
In order to trigger the second set of actions, the
"chassisredirect" type of southbound port_binding is introduced.
Setting the logical egress port to the type "chassisredirect"
logical port is simply a way to indicate that although the packet
is destined for the distributed gateway port, it needs to be
redirected to a different chassis.  At table 32, packets with this
logical egress port are sent to a specific chassis, in the same
way that table 32 directs packets whose logical egress port is a
VIF or a type "l3gateway" port to different chassis.  Once the
packet arrives at that chassis, table 33 resets the logical egress
port to the value representing the distributed gateway port.  For
each distributed gateway port, there is one type "chassisredirect"
port, in addition to the distributed logical patch port
representing the distributed gateway port.

A "chassisredirect" port represents a particular instance, bound
to a specific chassis, of an otherwise distributed port.  A
"chassisredirect" port is associated with a chassis in the same
manner as a "l3gateway" port.  However, unlike "l3gateway" ports,
"chassisredirect" ports have no associated IP or MAC addresses,
and "chassisredirect" ports should never be used as the "inport".
Any pipeline stages that depend on port specific IP or MAC addresses
should be carried out in the context of the distributed gateway
port's logical patch port.

Although the abstraction represented by the "chassisredirect" port
binding is generalized, in this patch the "chassisredirect" port binding
is only created for NB logical router ports that specify the new
"redirect-chassis" option.  There is no explicit notion of a
"chassisredirect" port in the NB database.  The expectation is when
capabilities are implemented that take advantage of "chassisredirect"
ports (e.g. distributed gateway ports), flows specifying a
"chassisredirect" port as the outport will be added as part of that
capability.
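
As a sketch of the intended northbound configuration (port and chassis
names hypothetical), the new option is set on a distributed router's
gateway port like so:

    ovn-nbctl set Logical_Router_Port lr0-public options:redirect-chassis=hv1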

Signed-off-by: Mickey Spiegel 
---
 ovn/controlle

[ovs-dev] [PATCH v10 4/8] ovn: add egress_loopback action

2017-01-17 Thread Mickey Spiegel
This patch adds an action that loops a clone of the packet back to the
beginning of the ingress pipeline with logical inport equal to the value
of the current logical outport.  The following actions are executed on
the clone:

    clears the connection tracking state
    in_port = 0
    inport = outport
    outport = 0
    flags = 0
    reg0 ... reg9 = 0
    nested actions from inside "{ ... }"
        for example "reg9[1] = 1" to indicate that egress loopback has
        occurred
    executes the ingress pipeline as a subroutine

This action is expected to be executed in the egress pipeline.  No
changes are made to the logical datapath or to the connection tracking
zones, which will continue to be correct when carrying out loopback
from the egress pipeline to the ingress pipeline.

This capability is needed in order to implement some of the east/west
distributed NAT flows.
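
A hypothetical use in a logical flow action, following the reg9 example
above:

    egress_loopback { reg9[1] = 1; };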

Signed-off-by: Mickey Spiegel 
---
 include/ovn/actions.h |  5 +++-
 ovn/controller/lflow.c|  1 +
 ovn/lib/actions.c | 71 +--
 ovn/ovn-sb.xml| 35 +++
 ovn/utilities/ovn-trace.c | 49 
 tests/ovn.at  |  4 +++
 tests/test-ovn.c  |  1 +
 7 files changed, 163 insertions(+), 3 deletions(-)

diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index 0bf6145..5f9d203 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -68,7 +68,8 @@ struct simap;
 OVNACT(PUT_ND,ovnact_put_mac_bind)  \
 OVNACT(PUT_DHCPV4_OPTS, ovnact_put_dhcp_opts)   \
 OVNACT(PUT_DHCPV6_OPTS, ovnact_put_dhcp_opts)   \
-OVNACT(SET_QUEUE,   ovnact_set_queue)
+OVNACT(SET_QUEUE,   ovnact_set_queue)   \
+OVNACT(EGRESS_LOOPBACK, ovnact_nest)
 
 /* enum ovnact_type, with a member OVNACT_ for each action. */
 enum OVS_PACKED_ENUM ovnact_type {
@@ -444,6 +445,8 @@ struct ovnact_encode_params {
 uint8_t output_ptable;  /* OpenFlow table for 'output' to resubmit. */
 uint8_t mac_bind_ptable;/* OpenFlow table for 'get_arp'/'get_nd' to
resubmit. */
+uint8_t ingress_ptable; /* OpenFlow table for 'egress_loopback' to
+   resubmit. */
 };
 
 void ovnacts_encode(const struct ovnact[], size_t ovnacts_len,
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index 3d7633e..c41368f 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -226,6 +226,7 @@ consider_logical_flow(const struct lport_index *lports,
 .first_ptable = first_ptable,
 .output_ptable = output_ptable,
 .mac_bind_ptable = OFTABLE_MAC_BINDING,
+.ingress_ptable = OFTABLE_LOG_INGRESS_PIPELINE,
 };
 ovnacts_encode(ovnacts.data, ovnacts.size, &ep, &ofpacts);
 ovnacts_free(ovnacts.data, ovnacts.size);
diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
index 686ecc5..dda675b 100644
--- a/ovn/lib/actions.c
+++ b/ovn/lib/actions.c
@@ -169,6 +169,21 @@ put_load(uint64_t value, enum mf_field_id dst, int ofs, 
int n_bits,
 bitwise_copy(&n_value, 8, 0, sf->value, sf->field->n_bytes, ofs, n_bits);
 bitwise_one(ofpact_set_field_mask(sf), sf->field->n_bytes, ofs, n_bits);
 }
+
+static void
+put_move(enum mf_field_id src, int src_ofs,
+ enum mf_field_id dst, int dst_ofs,
+ int n_bits,
+ struct ofpbuf *ofpacts)
+{
+struct ofpact_reg_move *move = ofpact_put_REG_MOVE(ofpacts);
+move->src.field = mf_from_id(src);
+move->src.ofs = src_ofs;
+move->src.n_bits = n_bits;
+move->dst.field = mf_from_id(dst);
+move->dst.ofs = dst_ofs;
+move->dst.n_bits = n_bits;
+}
 
 /* Context maintained during ovnacts_parse(). */
 struct action_context {
@@ -1021,7 +1036,10 @@ free_CT_LB(struct ovnact_ct_lb *ct_lb)
 }
 
 /* Implements the "arp" and "nd_na" actions, which execute nested actions on a
- * packet derived from the one being processed. */
+ * packet derived from the one being processed.  Also implements the
+ * "egress_loopback" action, which executes nested actions after clearing
+ * registers and connection state, then loops the packet back to the
+ * beginning of the ingress pipeline. */
 static void
 parse_nested_action(struct action_context *ctx, enum ovnact_type type,
 const char *prereq)
@@ -1055,7 +1073,9 @@ parse_nested_action(struct action_context *ctx, enum 
ovnact_type type,
 return;
 }
 
-add_prerequisite(ctx, prereq);
+if (prereq) {
+add_prerequisite(ctx, prereq);
+}
 
 struct ovnact_nest *on = ovnact_put(ctx->ovnacts, type, sizeof *on);
 on->nested_len = nested.size;
@@ -1075,6 +1095,12 @@ parse_ND_NA(struct action_context *ctx)
 }
 
 static void
+parse_EGRESS_LOOPBACK(struct action_context *ctx)
+{
+parse_nested_action(ctx, OVNACT_EGRESS_LOOPBACK, NULL);
+}
+
+static void
 format_nested_action(const struct ovnact_nest *on, const char *name,
  

[ovs-dev] [PATCH v10 5/8] ovn: move load balancing flows after NAT flows

2017-01-17 Thread Mickey Spiegel
This will make it easy for distributed NAT to reuse some of the
existing code for NAT flows, while leaving load balancing and defrag
as functionality specific to gateway routers.  There is no intent to
change any functionality in this patch.

Signed-off-by: Mickey Spiegel 
Acked-by: Gurucharan Shetty 
---
 ovn/northd/ovn-northd.c | 140 
 1 file changed, 70 insertions(+), 70 deletions(-)

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 87c80d1..219a69c 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -4099,76 +4099,6 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
 const char *lb_force_snat_ip = get_force_snat_ip(od, "lb",
  &snat_ip);
 
-/* A set to hold all ips that need defragmentation and tracking. */
-struct sset all_ips = SSET_INITIALIZER(&all_ips);
-
-for (int i = 0; i < od->nbr->n_load_balancer; i++) {
-struct nbrec_load_balancer *lb = od->nbr->load_balancer[i];
-struct smap *vips = &lb->vips;
-struct smap_node *node;
-
-SMAP_FOR_EACH (node, vips) {
-uint16_t port = 0;
-
-/* node->key contains IP:port or just IP. */
-char *ip_address = NULL;
-ip_address_and_port_from_lb_key(node->key, &ip_address, &port);
-if (!ip_address) {
-continue;
-}
-
-if (!sset_contains(&all_ips, ip_address)) {
-sset_add(&all_ips, ip_address);
-}
-
-/* Higher priority rules are added for load-balancing in DNAT
- * table.  For every match (on a VIP[:port]), we add two flows
- * via add_router_lb_flow().  One flow is for specific matching
- * on ct.new with an action of "ct_lb($targets);".  The other
- * flow is for ct.est with an action of "ct_dnat;". */
-ds_clear(&actions);
-ds_put_format(&actions, "ct_lb(%s);", node->value);
-
-ds_clear(&match);
-ds_put_format(&match, "ip && ip4.dst == %s",
-  ip_address);
-free(ip_address);
-
-if (port) {
-if (lb->protocol && !strcmp(lb->protocol, "udp")) {
-ds_put_format(&match, " && udp && udp.dst == %d",
-  port);
-} else {
-ds_put_format(&match, " && tcp && tcp.dst == %d",
-  port);
-}
-add_router_lb_flow(lflows, od, &match, &actions, 120,
-   lb_force_snat_ip);
-} else {
-add_router_lb_flow(lflows, od, &match, &actions, 110,
-   lb_force_snat_ip);
-}
-}
-}
-
-/* If there are any load balancing rules, we should send the
- * packet to conntrack for defragmentation and tracking.  This helps
- * with two things.
- *
- * 1. With tracking, we can send only new connections to pick a
- *DNAT ip address from a group.
- * 2. If there are L4 ports in load balancing rules, we need the
- *defragmentation to match on L4 ports. */
-const char *ip_address;
-SSET_FOR_EACH(ip_address, &all_ips) {
-ds_clear(&match);
-ds_put_format(&match, "ip && ip4.dst == %s", ip_address);
-ovn_lflow_add(lflows, od, S_ROUTER_IN_DEFRAG,
-  100, ds_cstr(&match), "ct_next;");
-}
-
-sset_destroy(&all_ips);
-
 for (int i = 0; i < od->nbr->n_nat; i++) {
 const struct nbrec_nat *nat;
 
@@ -4323,6 +4253,76 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
 * routing in the openflow pipeline. */
 ovn_lflow_add(lflows, od, S_ROUTER_IN_DNAT, 50,
   "ip", "flags.loopback = 1; ct_dnat;");
+
+/* A set to hold all ips that need defragmentation and tracking. */
+struct sset all_ips = SSET_INITIALIZER(&all_ips);
+
+for (int i = 0; i < od->nbr->n_load_balancer; i++) {
+struct nbrec_load_balancer *lb = od->nbr->load_balancer[i];
+struct smap *vips = &lb->vips;
+struct smap_node *node;
+
+SMAP_FOR_EACH (node, vips) {
+uint16_t port = 0;
+
+/* node->key contains IP:port or just IP. */
+char *ip_address = NULL;
+ip_address_and_port_from_lb_key(node->key, &ip_address, &port);
+if (!ip_address) {
+continue;
+}
+
+if (!sset_contains(&all_ips, ip_address)) {
+ 

[ovs-dev] [PATCH v10 8/8] ovn: ovn-nbctl commands for distributed NAT

2017-01-17 Thread Mickey Spiegel
This patch adds the new optional arguments "logical_port" and
"external_mac" to lr-nat-add, and displays that information in
lr-nat-list.
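
For example (router, addresses, and port names hypothetical), a distributed
dnat_and_snat rule using the new arguments would be added as:

    ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.16.1.10 10.0.0.5 \
        lp1 00:00:00:a1:b2:c3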

Signed-off-by: Mickey Spiegel 
---
 ovn/utilities/ovn-nbctl.8.xml | 27 +++---
 ovn/utilities/ovn-nbctl.c | 54 +--
 tests/ovn-nbctl.at| 47 +
 tests/system-ovn.at   | 30 +---
 4 files changed, 119 insertions(+), 39 deletions(-)

diff --git a/ovn/utilities/ovn-nbctl.8.xml b/ovn/utilities/ovn-nbctl.8.xml
index 4911c6a..c408484 100644
--- a/ovn/utilities/ovn-nbctl.8.xml
+++ b/ovn/utilities/ovn-nbctl.8.xml
@@ -444,7 +444,7 @@
 NAT Commands
 
 
-  [--may-exist] lr-nat-add router 
type external_ip logical_ip
+  [--may-exist] lr-nat-add router 
type external_ip logical_ip 
[logical_port external_mac]
   
 
   Adds the specified NAT to router.
@@ -453,6 +453,13 @@
   The external_ip is an IPv4 address.
   The logical_ip is an IPv4 network (e.g 192.168.1.0/24)
   or an IPv4 address.
+  The logical_port and external_mac are only
+  accepted when router is a distributed router (rather
+  than a gateway router) and type is
+  dnat_and_snat.
+  The logical_port is the name of an existing logical
+  switch port where the logical_ip resides.
+  The external_mac is an Ethernet address.
 
 
   When type is dnat, the externally
@@ -475,8 +482,22 @@
   the IP address in external_ip.
 
 
-  It is an error if a NAT already exists,
-  unless --may-exist is specified.
+  When the logical_port and external_mac
+  are specified, the NAT rule will be programmed on the chassis
+  where the logical_port resides.  This includes
+  ARP replies for the external_ip, which return the
+  value of external_mac.  All packets transmitted
+  with source IP address equal to external_ip will
+  be sent using the external_mac.
+
+
+  It is an error if a NAT already exists with the same values
+  of router, type, external_ip,
+  and logical_ip, unless --may-exist is
+  specified.  When --may-exist,
+  logical_port, and external_mac are all
+  specified, the existing values of logical_port and
+  external_mac are overwritten.
 
   
 
diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
index 4397daf..661f7de 100644
--- a/ovn/utilities/ovn-nbctl.c
+++ b/ovn/utilities/ovn-nbctl.c
@@ -384,7 +384,7 @@ Route commands:\n\
   lr-route-list ROUTER  print routes for ROUTER\n\
 \n\
 NAT commands:\n\
-  lr-nat-add ROUTER TYPE EXTERNAL_IP LOGICAL_IP\n\
+  lr-nat-add ROUTER TYPE EXTERNAL_IP LOGICAL_IP [LOGICAL_PORT EXTERNAL_MAC]\n\
 add a NAT to ROUTER\n\
   lr-nat-del ROUTER [TYPE [IP]]\n\
 remove NATs from ROUTER\n\
@@ -2233,6 +2233,30 @@ nbctl_lr_nat_add(struct ctl_context *ctx)
 new_logical_ip = normalize_ipv4_prefix(ipv4, plen);
 }
 
+const char *logical_port;
+const char *external_mac;
+if (ctx->argc == 6) {
+ctl_fatal("lr-nat-add with logical_port "
+  "must also specify external_mac.");
+} else if (ctx->argc == 7) {
+if (strcmp(nat_type, "dnat_and_snat")) {
+ctl_fatal("logical_port and external_mac are only valid when "
+  "type is \"dnat_and_snat\".");
+}
+
+logical_port = ctx->argv[5];
+lsp_by_name_or_uuid(ctx, logical_port, true);
+
+external_mac = ctx->argv[6];
+struct eth_addr ea;
+if (!eth_addr_from_string(external_mac, &ea)) {
+ctl_fatal("invalid mac address %s.", external_mac);
+}
+} else {
+logical_port = NULL;
+external_mac = NULL;
+}
+
 bool may_exist = shash_find(&ctx->options, "--may-exist") != NULL;
 int is_snat = !strcmp("snat", nat_type);
 for (size_t i = 0; i < lr->n_nat; i++) {
@@ -2243,6 +2267,10 @@ nbctl_lr_nat_add(struct ctl_context *ctx)
 if (!strcmp(is_snat ? external_ip : new_logical_ip,
 is_snat ? nat->external_ip : nat->logical_ip)) {
 if (may_exist) {
+nbrec_nat_verify_logical_port(nat);
+nbrec_nat_verify_external_mac(nat);
+nbrec_nat_set_logical_port(nat, logical_port);
+nbrec_nat_set_external_mac(nat, external_mac);
 free(new_logical_ip);
 return;
 }
@@ -2265,6 +2293,10 @@ nbctl_lr_nat_add(struct ctl_context *ctx)
 nbrec_nat_set_type(nat, nat_type);
 nbrec_nat_set_external_ip(nat, external_ip);
 nbrec_nat_set_lo

[ovs-dev] [PATCH v10 6/8] ovn: avoid snat recirc only on gateway routers

2017-01-17 Thread Mickey Spiegel
Currently, for performance reasons on gateway routers, ct_snat
that does not specify an IP address does not immediately trigger
recirculation.  On gateway routers, ct_snat that does not specify
an IP address happens in the UNSNAT pipeline stage, which is
followed by the DNAT pipeline stage that triggers recirculation
for all packets.  This DNAT pipeline stage recirculation takes
care of the recirculation needs of UNSNAT as well as other cases
such as UNDNAT.

On distributed routers, UNDNAT is handled in the egress pipeline
stage, separately from DNAT in the ingress pipeline stages.  The
DNAT pipeline stage only triggers recirculation for some packets.
Due to this difference in design, UNSNAT needs to trigger its own
recirculation.

This patch restricts the logic that avoids recirculation for
ct_snat, so that it only applies to datapaths representing
gateway routers.

Signed-off-by: Mickey Spiegel 
---
 include/ovn/actions.h  |  3 +++
 ovn/controller/lflow.c | 10 ++
 ovn/lib/actions.c  | 15 +--
 ovn/ovn-sb.xml | 23 +++
 tests/ovn.at   |  2 +-
 5 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index 5f9d203..810b901 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -418,6 +418,9 @@ struct ovnact_encode_params {
 /* 'true' if the flow is for a switch. */
 bool is_switch;
 
+/* 'true' if the flow is for a gateway router. */
+bool is_gateway_router;
+
 /* A map from a port name to its connection tracking zone. */
 const struct simap *ct_zones;
 
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index c41368f..b011109 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -107,6 +107,15 @@ is_switch(const struct sbrec_datapath_binding *ldp)
 
 }
 
+static bool
+is_gateway_router(const struct sbrec_datapath_binding *ldp,
+  const struct hmap *local_datapaths)
+{
+struct local_datapath *ld =
+get_local_datapath(local_datapaths, ldp->tunnel_key);
+return ld ? ld->has_local_l3gateway : false;
+}
+
 /* Adds the logical flows from the Logical_Flow table to flow tables. */
 static void
 add_logical_flows(struct controller_ctx *ctx, const struct lport_index *lports,
@@ -220,6 +229,7 @@ consider_logical_flow(const struct lport_index *lports,
 .lookup_port = lookup_port_cb,
 .aux = &aux,
 .is_switch = is_switch(ldp),
+.is_gateway_router = is_gateway_router(ldp, local_datapaths),
 .ct_zones = ct_zones,
 .group_table = group_table,
 
diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
index dda675b..5c57ab7 100644
--- a/ovn/lib/actions.c
+++ b/ovn/lib/actions.c
@@ -803,12 +803,15 @@ encode_ct_nat(const struct ovnact_ct_nat *cn,
 ct = ofpacts->header;
 if (cn->ip) {
 ct->flags |= NX_CT_F_COMMIT;
-} else if (snat) {
-/* XXX: For performance reasons, we try to prevent additional
- * recirculations.  So far, ct_snat which is used in a gateway router
- * does not need a recirculation. ct_snat(IP) does need a
- * recirculation.  Should we consider a method to let the actions
- * specify whether an action needs recirculation if there more use
+} else if (snat && ep->is_gateway_router) {
+/* For performance reasons, we try to prevent additional
+ * recirculations.  ct_snat which is used in a gateway router
+ * does not need a recirculation.  ct_snat(IP) does need a
+ * recirculation.  ct_snat in a distributed router needs
+ * recirculation regardless of whether an IP address is
+ * specified.
+ * XXX Should we consider a method to let the actions specify
+ * whether an action needs recirculation if there are more use
  * cases?. */
 ct->recirc_table = NX_CT_RECIRC_NONE;
 }
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index a7c29c3..8fe0e2b 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -1122,11 +1122,26 @@
 
   
 ct_snat sends the packet through the SNAT zone to
-unSNAT any packet that was SNATed in the opposite direction.  If
-the packet needs to be sent to the next tables, then it should be
-followed by a next; action.  The next tables will not
-see the changes in the packet caused by the connection tracker.
+unSNAT any packet that was SNATed in the opposite direction.  The
+behavior on gateway routers differs from the behavior on a
+distributed router:
   
+  
+
+  On a gateway router, if the packet needs to be sent to the next
+  tables, then it should be followed by a next;
+  action.  The next tables will not see the changes in the packet
+  caused by the connection tracker.
+
+
+  On a distribute

[ovs-dev] [PATCH v10 7/8] ovn: distributed NAT flows

2017-01-17 Thread Mickey Spiegel
This patch implements the flows required in the ingress and egress
pipeline stages in order to support NAT on a distributed logical router.

NAT functionality is associated with the logical router gateway port.
The flows that carry out NAT functionality all have match conditions on
inport or outport equal to the logical router gateway port.  There are
additional flows that are used to redirect traffic when necessary,
using the tunnel key of a "chassisredirect" SB port binding in order to
redirect traffic to the instance of the logical router gateway port on
the centralized "redirect-chassis".

North/south traffic subject to one-to-one "dnat_and_snat" is handled
in a distributed manner, with south-to-north traffic going to the
local instance of the logical router gateway port.  North/south
traffic subject to (possibly one-to-many) "snat" is handled in a
centralized manner, with south-to-north traffic going to the instance
of the logical router gateway port on the "redirect-chassis".
North-to-south traffic is directed to the corresponding chassis by
limiting ARP responses to the appropriate instance of the logical
router gateway port on one chassis.  For centralized NAT rules, this
is the instance on the "redirect-chassis".  For distributed NAT rules,
this is the chassis where the corresponding logical port resides, using
an ethernet address specified in the NB NAT rule to trigger upstream
MAC learning.

East/west NAT traffic is all handled in a centralized manner.  While it
is certainly possible to handle some of this traffic in a distributed
manner, the centralized approach keeps the NAT flows simpler and
cleaner.  The expectation is that east/west NAT traffic is not as
important to optimize as north/south NAT traffic, with most east/west
traffic not requiring NAT.

Automated tests are currently limited to only a single node.  The
single node automated tests cover both north/south and east/west
traffic flows.

Signed-off-by: Mickey Spiegel 
---
 ovn/controller/ovn-controller.c |   6 +-
 ovn/northd/ovn-northd.8.xml | 374 ++--
 ovn/northd/ovn-northd.c | 416 ++--
 ovn/ovn-architecture.7.xml  |   7 +-
 ovn/ovn-nb.ovsschema|   6 +-
 ovn/ovn-nb.xml  |  49 -
 tests/system-ovn.at | 338 
 7 files changed, 1115 insertions(+), 81 deletions(-)

diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 7cef3f8..ea299da 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -323,10 +323,8 @@ update_ct_zones(struct sset *lports, const struct hmap 
*local_datapaths,
 /* Local patched datapath (gateway routers) need zones assigned. */
 const struct local_datapath *ld;
 HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
-if (!ld->has_local_l3gateway) {
-continue;
-}
-
+/* XXX Add method to limit zone assignment to logical router
+ * datapaths with NAT */
 char *dnat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "dnat");
 char *snat = alloc_nat_zone_key(&ld->datapath->header_.uuid, "snat");
 sset_add(&all_users, dnat);
diff --git a/ovn/northd/ovn-northd.8.xml b/ovn/northd/ovn-northd.8.xml
index 49e4291..168826e 100644
--- a/ovn/northd/ovn-northd.8.xml
+++ b/ovn/northd/ovn-northd.8.xml
@@ -752,9 +752,25 @@ output;
column is set to router and
   the connected logical router port specifies a
-  redirect-chassis, the flow is only programmed on the
-  redirect-chassis.
+  redirect-chassis:
 
+
+
+  
+The flow for the connected logical router port's Ethernet
+address is only programmed on the redirect-chassis.
+  
+
+  
+If the logical router has rules specified in
+ with
+, then
+those addresses are also used to populate the switch's destination
+lookup on the chassis where
+ is
+resident.
+  
+
   
 
   
@@ -890,6 +906,23 @@ output;
   redirect-chassis.
 
   
+
+  
+
+  For each dnat_and_snat NAT rule on a distributed
+  router that specifies an external Ethernet address E,
+  a priority-50 flow that matches inport == GW
+  && eth.dst == E, where GW
+  is the logical router gateway port, with action
+  next;.
+
+
+
+  This flow is only programmed on the gateway port instance on
+  the chassis where the logical_port specified in
+  the NAT rule resides.
+
+  
 
 
 
@@ -928,7 +961,9 @@ output;
   
   
 ip4.src or ip6.src is any IP
-address owned by the router.
+address owned by the router, unless the packet was recirculated
+   

Re: [ovs-dev] [PATCH] netdev-dpdk: Create separate memory pool for each port

2017-01-17 Thread Robert Wojciechowicz
On Sun, Jan 15, 2017 at 06:14:47PM +, Stokes, Ian wrote:
> > Since it's possible to delete a memory pool in DPDK, we can try to
> > estimate the required memory size better when a port is reconfigured,
> > e.g. with a different number of rx queues.
> 
> Thanks for the patch.
> 
> It will need to be rebased for it to apply cleanly to master.

Yes, this patch has been waiting here for quite some time. I need to rebase it
and send it once again.

> I've applied the changes manually just to test and from what I can see
> there are no issues.
> It also resolves the issue reported with having 64 queues for dpdk physical
> ports, which causes ovs to crash.

Thanks for testing

> 
> A few other comments below.
> > 
> > Signed-off-by: Robert Wojciechowicz 
> > ---
> >  lib/netdev-dpdk.c | 85 +++---
> > -
> >  1 file changed, 41 insertions(+), 44 deletions(-)
> > 
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 7c1523e..4c9e1e8
> > 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
[...]
> > 
> >  static struct dpdk_mp *
> > -dpdk_mp_get(int socket_id, int mtu)
> > +dpdk_mp_get(struct netdev_dpdk *dev, int mtu)
> >  {
> >  struct dpdk_mp *dmp;
> > 
> >  ovs_mutex_lock(&dpdk_mp_mutex);
> > -LIST_FOR_EACH (dmp, list_node, &dpdk_mp_list) {
> > -if (dmp->socket_id == socket_id && dmp->mtu == mtu) {
> > -dmp->refcount++;
> > -goto out;
> > -}
> > -}
> > -
> As you remove the use of dpdk_mp_list above, I wonder is there a need to 
> maintain the global list?
> With mempools now set on a per netdev basis, perhaps the global list could be 
> removed also. Thoughts?
> 

I was thinking about that when I prepared this patch. In this implementation
a memory pool will always be used by only one port, so I could get rid of
`refcount` and `dpdk_mp_list`. I didn't do that because I didn't want to make
a huge revolution in the implementation without even knowing whether this
idea of separate memory pools makes sense for the community. Now I have
positive feedback from you, so maybe it's worth working on this patch
further.
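
For illustration, a per-port pool would be created along these lines (a
sketch, not the patch itself; it assumes the netdev_dpdk fields
'dev->up.name' and 'dev->requested_socket_id', and 'n_mbufs' and 'mbuf_size'
are placeholders for the estimated pool dimensions):

    /* Embed the port name in the mempool name so that each port owns
     * exactly one pool. */
    char name[RTE_MEMPOOL_NAMESIZE];
    snprintf(name, sizeof name, "ovs_mp_%s_%d", dev->up.name, mtu);

    struct rte_mempool *mp
        = rte_pktmbuf_pool_create(name, n_mbufs, MP_CACHE_SZ,
                                  0 /* priv_size */,
                                  mbuf_size /* data room sized for 'mtu' */,
                                  dev->requested_socket_id);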

Br,
Robert


[ovs-dev] [PATCH] ofproto-dpif: Reduce the time of creating a large amount of bridges

2017-01-17 Thread zhangsha (A)
From e7623970cd71f8af7c3849a5beff87fd31fbdf68 Mon Sep 17 00:00:00 2001
From: Sha Zhang 
Date: Tue, 17 Jan 2017 11:34:24 +0800
Subject: [PATCH] ofproto-dpif: Reduce the time of creating a large amount of
bridges

This patch moves the functions xlate_txn_start and xlate_txn_commit out of the
loop traversing all the ofproto-dpifs, to reduce the time of creating a large
amount of bridges in separate database transactions.
As a global variable, new_xcfg should be allocated once at the beginning and
committed once at the end, rather than doing it repeatedly in the loop body.

Signed-off-by: Sha Zhang 
---
ofproto/ofproto-dpif.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 72e91b9..9cb8472 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -432,6 +432,7 @@ type_run(const char *type)
 }
 backer->need_revalidate = 0;
+xlate_txn_start();
 HMAP_FOR_EACH (ofproto, all_ofproto_dpifs_node, &all_ofproto_dpifs) {
 struct ofport_dpif *ofport;
 struct ofbundle *bundle;
@@ -440,7 +441,6 @@ type_run(const char *type)
 continue;
 }
-xlate_txn_start();
 xlate_ofproto_set(ofproto, ofproto->up.name,
   ofproto->backer->dpif, ofproto->ml,
   ofproto->stp, ofproto->rstp, ofproto->ms,
@@ -471,9 +471,9 @@ type_run(const char *type)
  ofport->up.pp.state, ofport->is_tunnel,
  ofport->may_enable);
 }
-xlate_txn_commit();
 }
+xlate_txn_commit();
 udpif_revalidate(backer->udpif);
 }
--
1.8.3.1


Re: [ovs-dev] [PATCH] vlan.rst: Strip leftover HTML.

2017-01-17 Thread Russell Bryant
On Mon, Jan 16, 2017 at 5:14 PM, Ben Pfaff  wrote:

> On Mon, Jan 16, 2017 at 04:46:58PM -0500, Russell Bryant wrote:
> > String a couple of closing HTML tags that were left over from when this
> > doc
>
> s/String/Strip/
>

Oops, fixed.


>
> > was converted from the web site to RST.
> >
> > Signed-off-by: Russell Bryant 
>
> Acked-by: Ben Pfaff 
>

Thanks, applied to master.

-- 
Russell Bryant


Re: [ovs-dev] [RFC] ofproto/bond: operational vs administratively disabled bond interface

2017-01-17 Thread Flavio Leitner

Hi,

On Tue, 17 Jan 2017 09:34:19 +0100
Eelco Chaudron  wrote:

> Currently OVS does not distinguish between a bond slave being operationally
> disabled, i.e. the link being down, and administratively disabled.
> 
> Take the example where the administrator disables a link in a bond with
> "ovs-appctl bond/disable-slave bond0 enp129s0f0"; it is automatically
> enabled again because the link is up.
> 
> I would like to change this behavior such that, when disabled through
> appctl, the slave is no longer used until explicitly enabled again via
> appctl.

Eelco and I discussed this off list and I agree that this sounds like
a bug.  The slave should not be used if the admin has disabled it
regardless of its link state.

-- 
Flavio



Re: [ovs-dev] [PATCH] netdev-dpdk: Use intermediate queue during packet transmission.

2017-01-17 Thread Fischetti, Antonio
We will respin a patch v2 addressing your comments, so please let's continue
the discussion on v2.

Thanks,
Antonio

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Fischetti, Antonio
> Sent: Thursday, January 12, 2017 5:40 PM
> To: Ilya Maximets ; Bodireddy, Bhanuprakash
> ; Aaron Conole 
> Cc: d...@openvswitch.org; Daniele Di Proietto 
> Subject: Re: [ovs-dev] [PATCH] netdev-dpdk: Use intermediate queue during
> packet transmission.
> 
> Hi Ilya,
> thanks for your detailed feedback. I've added a couple of replies
> inline about when the instant send happens and the may_steal param.
> 
> 
> Regards,
> Antonio
> 
> > -Original Message-
> > From: Ilya Maximets [mailto:i.maxim...@samsung.com]
> > Sent: Tuesday, December 20, 2016 12:47 PM
> > To: Bodireddy, Bhanuprakash ; Aaron
> > Conole 
> > Cc: d...@openvswitch.org; Daniele Di Proietto ;
> > Thadeu Lima de Souza Cascardo ; Fischetti, Antonio
> > ; Markus Magnusson
> > 
> > Subject: Re: [ovs-dev] [PATCH] netdev-dpdk: Use intermediate queue
> during
> > packet transmission.
> >
> > On 20.12.2016 15:19, Bodireddy, Bhanuprakash wrote:
> > >> -Original Message-
> > >> From: Ilya Maximets [mailto:i.maxim...@samsung.com]
> > >> Sent: Tuesday, December 20, 2016 8:09 AM
> > >> To: Bodireddy, Bhanuprakash ; Aaron
> > >> Conole 
> > >> Cc: d...@openvswitch.org; Daniele Di Proietto
> ;
> > >> Thadeu Lima de Souza Cascardo ; Fischetti,
> Antonio
> > >> ; Markus Magnusson
> > >> 
> > >> Subject: Re: [ovs-dev] [PATCH] netdev-dpdk: Use intermediate queue
> > during
> > >> packet transmission.
> > >>
> > >> On 19.12.2016 21:05, Bodireddy, Bhanuprakash wrote:
> > >>> Thanks Ilya and Aaron for reviewing this patch and providing your
> > >> comments, my reply inline.
> > >>>
> >  -Original Message-
> >  From: Ilya Maximets [mailto:i.maxim...@samsung.com]
> >  Sent: Monday, December 19, 2016 8:41 AM
> >  To: Aaron Conole ; Bodireddy, Bhanuprakash
> >  
> >  Cc: d...@openvswitch.org; Daniele Di Proietto
> >  ; Thadeu Lima de Souza Cascardo
> >  ; Fischetti, Antonio
> >  ; Markus Magnusson
> >  
> >  Subject: Re: [ovs-dev] [PATCH] netdev-dpdk: Use intermediate queue
> >  during packet transmission.
> > 
> >  Hi,
> > 
> >  I didn't test this patch yet. Bhanu, could you please describe your
> >  test scenario and performance results in more details.
> > >>>
> > >>> During the recent performance analysis improvements for classifier, we
> > >>> found that the bottleneck was also observed at flow batching.
> > >>> This was due to fewer packets in a batch. To reproduce this, a simple
> > >>> P2P test case can be used with 30 IXIA streams and matching IP flow
> > >>> rules.
> > >>> Eg:  For an IXIA stream with src Ip: 2.2.2.1, dst Ip: 5.5.5.1
> > >>> ovs-ofctl add-flow br0
> > >>> dl_type=0x0800,nw_src=2.2.2.1,actions=output:2
> > >>>
> > >>> For an IXIA stream with src Ip: 4.4.4.1, dst Ip: 15.15.15.1
> > >>> ovs-ofctl add-flow br0
> > >>> dl_type=0x0800,nw_src=4.4.4.1,actions=output:2
> > >>>
> > >>> This leaves fewer packets in batches, and
> > >>> packet_batch_per_flow_execute() shall be invoked for every batch.
> > >>> With 30 flows, I see the throughput drops to ~7.0 Mpps in the PHY2PHY
> > >>> case for 64 byte udp packets.
> > >>>
> >  It'll be nice if you provide throughput and latency measurement
> >  results for different scenarios and packet sizes. Latency is
> > important here.
> > >>> We are yet to do latency measurements in this case. With 30 IXIA
> > >>> streams comprising of 64 byte udp packets there was a throughput
> > >>> improvement of 30% in the P2P case and 13-15% in the PVP case (single
> > >>> queue). We will try to get the latency stats with and without this
> > >>> patch.
> > >>>
> > 
> >  About the patch itself:
> > 
> >   1. 'drain' called only for PMD threads. This will lead to
> >  broken connection with non-PMD ports.
> > >>> I see that non-PMD ports are handled with the vswitchd thread.
> > >>> Tested the PVP loopback case with tap ports and found it to be working
> > >>> as expected. Can you let me know the specific case you are referring
> > >>> to here so that I can verify if the patch breaks it.
> > >>
> > >> I meant something like this:
> > >>
> > >>
> > >> *---HOST-1(br-int)-*   *-HOST-2(br-int)--*
> > >> |  |   ||
> > >> |   internal_port <--> dpdk0 <---> dpdk0 <--> internal_port  |
> > >> |   192.168.0.1/24 |   | 192.168.0.2/24 |
> > >> *--*   *---*
> > >>
> > >> (HOST-1)# ping 192.168.0.2
> > >>
> > >> In this case I'm expecting that the first (NETDEV_MAX_BURST - 1) icmp
> > >> packets will get stuck in the TX queue. The next packet will cause sendi

Re: [ovs-dev] [PATCH] dp-packet: Fix comment on DPDK buffer data.

2017-01-17 Thread Fischetti, Antonio
Self-NACK. Will include this change with another patch.

Antonio

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of antonio.fische...@intel.com
> Sent: Wednesday, January 11, 2017 5:11 PM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH] dp-packet: Fix comment on DPDK buffer data.
> 
> Small fix on the comments about the DPDK buffer data to refer
> to dp_packet_init_dpdk() function.
> 
> Signed-off-by: Antonio Fischetti 
> ---
>  lib/dp-packet.c | 4 ++--
>  lib/dp-packet.h | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> index 793b54f..14ab15e 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -91,8 +91,8 @@ dp_packet_use_const(struct dp_packet *b, const void
> *data, size_t size)
> 
>  /* Initializes 'b' as an empty dp_packet that contains the 'allocated'
> bytes of
>   * memory starting at 'base'.  DPDK allocated dp_packet and *data is
> allocated
> - * from one continous memory region, so in memory data start right after
> - * dp_packet.  Therefore there is special method to free this type of
> + * from one continuous memory region, so in memory data start right after
> + * dp_packet.  Therefore there is a special method to free this type of
>   * buffer.  dp_packet base, data and size are initialized by dpdk rcv()
> so no
>   * need to initialize those fields. */
>  void
> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
> index cf7d247..b941540 100644
> --- a/lib/dp-packet.h
> +++ b/lib/dp-packet.h
> @@ -39,7 +39,7 @@ enum OVS_PACKED_ENUM dp_packet_source {
>  DPBUF_STACK,   /* Un-movable stack space or static
> buffer. */
>  DPBUF_STUB,/* Starts on stack, may expand into heap.
> */
>  DPBUF_DPDK,/* buffer data is from DPDK allocated
> memory.
> -* ref to build_dp_packet() in netdev-
> dpdk. */
> +* Ref to dp_packet_init_dpdk() in dp-
> packet. */
>  };
> 
>  #define DP_PACKET_CONTEXT_SIZE 64
> --
> 2.4.11
> 


[ovs-dev] [PATCH v2 0/2] netdev-dpdk: Use intermediate queue during packet transmission.

2017-01-17 Thread antonio . fischetti
After packet classification, packets are queued into batches depending
on the matching netdev flow. Each batch is then processed to execute the
related actions. This becomes particularly inefficient if there are few
packets in each batch, as every call to rte_eth_tx_burst() incurs expensive
MMIO writes.

This patchset adds back the intermediate queue implementation. Packets are
queued and burst when their count is >= NETDEV_MAX_BURST. The drain logic
is also refactored to handle packets left behind in the tx queues. Testing
shows significant performance gains with this implementation.
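
For illustration, the core idea looks roughly like this (a minimal sketch,
not the actual patch; names such as txq_buf and SKETCH_MAX_BURST are made
up, and error/drop accounting is simplified):

#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define SKETCH_MAX_BURST 32             /* stands in for NETDEV_MAX_BURST */

struct txq_buf {
    struct rte_mbuf *pkts[SKETCH_MAX_BURST];
    int count;
};

/* Push the buffered packets to the NIC in one burst, amortizing the
 * expensive MMIO doorbell write over the whole batch. */
static void
txq_flush(struct txq_buf *buf, uint16_t port_id, uint16_t qid)
{
    uint16_t sent = rte_eth_tx_burst(port_id, qid, buf->pkts, buf->count);

    /* A real implementation would retry and/or count unsent packets as
     * drops; here we just free them. */
    for (int i = sent; i < buf->count; i++) {
        rte_pktmbuf_free(buf->pkts[i]);
    }
    buf->count = 0;
}

/* Queue a packet, bursting only once enough packets have accumulated.
 * A periodic drain (not shown) flushes packets that linger in the queue
 * when traffic is slow. */
static void
txq_enqueue(struct txq_buf *buf, uint16_t port_id, uint16_t qid,
            struct rte_mbuf *pkt)
{
    buf->pkts[buf->count++] = pkt;
    if (buf->count >= SKETCH_MAX_BURST) {
        txq_flush(buf, port_id, qid);
    }
}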

Fixes: b59cc14e032d("netdev-dpdk: Use instant sending instead of
queueing of packets.")

V1 => V2:
  * xps_get_tx_qid() is no longer called twice. The last used qid is stored so
    the drain function will flush the right queue even when XPS is enabled.

  * netdev_txq_drain() is called unconditionally and not just for dpdk ports.

  * txq_drain() takes the 'tx_lock' for the queue in the case of dynamic tx
    queues.

  * Restored counting of dropped packets.

  * Changed scheduling of drain function.

  * Updated comments in netdev-provider.h

  * Fixed a comment in dp-packet.h.

Fischetti, Antonio (2):
  netdev-dpdk: Use intermediate queue during packet transmission.
  dp-packet: Fix comment on DPDK buffer data.

 lib/dp-packet.c   |  4 +--
 lib/dp-packet.h   | 10 +++
 lib/dpif-netdev.c | 53 +++--
 lib/netdev-bsd.c  |  1 +
 lib/netdev-dpdk.c | 82 ++-
 lib/netdev-dummy.c|  1 +
 lib/netdev-linux.c|  1 +
 lib/netdev-provider.h |  8 +
 lib/netdev-vport.c|  3 +-
 lib/netdev.c  |  9 ++
 lib/netdev.h  |  1 +
 11 files changed, 156 insertions(+), 17 deletions(-)

-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 2/2] dp-packet: Fix comment on DPDK buffer data.

2017-01-17 Thread antonio . fischetti
Fix comments about the DPDK buffer data to refer to the
dp_packet_init_dpdk() function.

Signed-off-by: Antonio Fischetti 
---
 lib/dp-packet.c |  4 ++--
 lib/dp-packet.h | 10 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 793b54f..14ab15e 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -91,8 +91,8 @@ dp_packet_use_const(struct dp_packet *b, const void *data, 
size_t size)
 
 /* Initializes 'b' as an empty dp_packet that contains the 'allocated' bytes of
  * memory starting at 'base'.  DPDK allocated dp_packet and *data is allocated
- * from one continous memory region, so in memory data start right after
- * dp_packet.  Therefore there is special method to free this type of
+ * from one continuous memory region, so in memory data start right after
+ * dp_packet.  Therefore there is a special method to free this type of
  * buffer.  dp_packet base, data and size are initialized by dpdk rcv() so no
  * need to initialize those fields. */
 void
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index cf7d247..27a5506 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -35,11 +35,11 @@ extern "C" {
 #endif
 
 enum OVS_PACKED_ENUM dp_packet_source {
-DPBUF_MALLOC,  /* Obtained via malloc(). */
-DPBUF_STACK,   /* Un-movable stack space or static buffer. */
-DPBUF_STUB,/* Starts on stack, may expand into heap. */
-DPBUF_DPDK,/* buffer data is from DPDK allocated memory.
-* ref to build_dp_packet() in netdev-dpdk. */
+DPBUF_MALLOC,  /* Obtained via malloc(). */
+DPBUF_STACK,   /* Un-movable stack space or static buffer. */
+DPBUF_STUB,/* Starts on stack, may expand into heap. */
+DPBUF_DPDK,/* Buffer data is from DPDK allocated memory.
+* Refer to dp_packet_init_dpdk() in dp-packet.c */
 };
 
 #define DP_PACKET_CONTEXT_SIZE 64
-- 
2.4.11

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 1/2] netdev-dpdk: Use intermediate queue during packet transmission.

2017-01-17 Thread antonio . fischetti
This patch implements the intermediate Tx queues on 'dpdk' type ports.

Test results:
 * In the worst-case scenario with fewer packets per batch, a significant
   bottleneck is observed for the netdev_dpdk_eth_send() function due to
   expensive MMIO writes.

 * It is also observed that the CPI (cycles per instruction) rate for the
   function stood between 3.15 and 4.1, which is significantly higher than the
   acceptable limit of 1.0 for HPC applications and the theoretical limit of
   0.25 (as the backend pipeline can retire 4 micro-operations in a cycle).

 * With this patch, CPI for netdev_dpdk_eth_send() is at 0.55 and the overall
   throughput improved significantly.


Signed-off-by: Antonio Fischetti 
Signed-off-by: Bhanuprakash Bodireddy 
Co-authored-by: Bhanuprakash Bodireddy 
Signed-off-by: Markus Magnusson 
Co-authored-by: Markus Magnusson 
---
 lib/dpif-netdev.c | 53 +++--
 lib/netdev-bsd.c  |  1 +
 lib/netdev-dpdk.c | 82 ++-
 lib/netdev-dummy.c|  1 +
 lib/netdev-linux.c|  1 +
 lib/netdev-provider.h |  8 +
 lib/netdev-vport.c|  3 +-
 lib/netdev.c  |  9 ++
 lib/netdev.h  |  1 +
 9 files changed, 149 insertions(+), 10 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 3901129..58ac429 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -289,6 +289,8 @@ struct dp_netdev_rxq {
 struct dp_netdev_pmd_thread *pmd;  /* pmd thread that will poll this 
queue. */
 };
 
+#define LAST_USED_QID_NONE -1
+
 /* A port in a netdev-based datapath. */
 struct dp_netdev_port {
 odp_port_t port_no;
@@ -303,6 +305,8 @@ struct dp_netdev_port {
 char *type; /* Port type as requested by user. */
 char *rxq_affinity_list;/* Requested affinity of rx queues. */
 bool need_reconfigure;  /* True if we should reconfigure netdev. */
+int last_used_qid;  /* Last queue id where packets could be
+   enqueued. */
 };
 
 /* Contained by struct dp_netdev_flow's 'stats' member.  */
@@ -619,6 +623,9 @@ static int dpif_netdev_xps_get_tx_qid(const struct 
dp_netdev_pmd_thread *pmd,
 static inline bool emc_entry_alive(struct emc_entry *ce);
 static void emc_clear_entry(struct emc_entry *ce);
 
+static struct tx_port *pmd_send_port_cache_lookup
+(const struct dp_netdev_pmd_thread *pmd, odp_port_t port_no);
+
 static void
 emc_cache_init(struct emc_cache *flow_cache)
 {
@@ -3507,15 +3514,19 @@ pmd_load_queues_and_ports(struct dp_netdev_pmd_thread 
*pmd,
 return i;
 }
 
+enum { DRAIN_TSC = 2ULL };
+
 static void *
 pmd_thread_main(void *f_)
 {
 struct dp_netdev_pmd_thread *pmd = f_;
-unsigned int lc = 0;
+unsigned int lc = 0, lc_drain = 0;
 struct polled_queue *poll_list;
 bool exiting;
 int poll_cnt;
 int i;
+uint64_t prev = 0, now = 0;
+struct tx_port *tx_port;
 
 poll_list = NULL;
 
@@ -3548,6 +3559,26 @@ reload:
poll_list[i].port_no);
 }
 
+#define MAX_LOOP_TO_DRAIN 128
+if (lc_drain++ > MAX_LOOP_TO_DRAIN) {
+lc_drain = 0;
+prev = now;
+now = pmd->last_cycles;
+if ((now - prev) > DRAIN_TSC) {
+HMAP_FOR_EACH (tx_port, node, &pmd->tx_ports) {
+if (tx_port->port->last_used_qid != LAST_USED_QID_NONE) {
+/* This queue may contain some buffered packets waiting
+ * to be sent out. */
+netdev_txq_drain(tx_port->port->netdev,
+tx_port->port->last_used_qid,
+tx_port->port->dynamic_txqs);
+/* Mark it as empty. */
+tx_port->port->last_used_qid = LAST_USED_QID_NONE;
+}
+}
+}
+}
+
 if (lc++ > 1024) {
 bool reload;
 
@@ -3883,6 +3914,7 @@ dp_netdev_add_port_tx_to_pmd(struct dp_netdev_pmd_thread 
*pmd,
 
 tx->port = port;
 tx->qid = -1;
+port->last_used_qid = LAST_USED_QID_NONE;
 
 hmap_insert(&pmd->tx_ports, &tx->node, hash_port_no(tx->port->port_no));
 pmd->need_reload = true;
@@ -4538,7 +4570,24 @@ dp_execute_cb(void *aux_, struct dp_packet_batch 
*packets_,
 } else {
 tx_qid = pmd->static_tx_qid;
 }
+//TODO Add UNLIKELY to the 1st condition?
+/* Is the current qid the same as the last one we used? */
+if ((p->port->last_used_qid != LAST_USED_QID_NONE) &&
+(p->port->last_used_qid != tx_qid)) {
+/* The current assigned queue was changed, we need to drain
+ * packets from the previous queue. */
+netdev_txq_drain(p->port->netdev,
+p->port->last_used_qid,
+p->port->dynamic_txqs);
+p->port->last_used

Re: [ovs-dev] [BUG] ovs-ofctl version 2.5.0 will crash with OFPFMFC_BAD_COMMAND

2017-01-17 Thread Vidyasagara Guntaka via dev
Hi Ben,

Here is more debug information related to this incident (still using version
2.5.0):

Summary :

We think that there is a race condition between the processing of OF
controller connections and packet-miss processing in ovs-vswitchd.

Reasoning :

Please consider the following GDB Debug Session:

Breakpoint 1, ofconn_set_protocol (ofconn=0x16d5810, 
protocol=OFPUTIL_P_OF10_STD) at ofproto/connmgr.c:999
(gdb) f 2
#2  0x0045f586 in connmgr_wants_packet_in_on_miss (mgr=0x16a6de0) at 
ofproto/connmgr.c:1613
1613enum ofputil_protocol protocol = ofconn_get_protocol(ofconn);
(gdb) p *ofconn
$2 = {node = {prev = 0x16a6e18, next = 0x16a6e18}, hmap_node = {hash = 0, next 
= 0x0}, connmgr = 0x16a6de0, rconn = 0x16edb50, type = OFCONN_SERVICE, band = 
OFPROTO_IN_BAND, enable_async_msgs = true, 
  role = OFPCR12_ROLE_EQUAL, protocol = OFPUTIL_P_OF10_STD_TID, 
packet_in_format = NXPIF_OPENFLOW10, packet_in_counter = 0x167a170, schedulers 
= {0x0, 0x0}, pktbuf = 0x0, miss_send_len = 0, 
  controller_id = 0, reply_counter = 0x1673190, master_async_config = {3, 7, 7, 
0, 0, 0}, slave_async_config = {0, 7, 0, 0, 0, 0}, n_add = 0, n_delete = 0, 
n_modify = 0, 
  first_op = -9223372036854775808, last_op = -9223372036854775808, 
next_op_report = 9223372036854775807, op_backoff = -9223372036854775808, 
monitors = {buckets = 0x16d58f0, one = 0x0, mask = 0, 
n = 0}, monitor_paused = 0, monitor_counter = 0x16759f0, updates = {prev = 
0x16d5918, next = 0x16d5918}, sent_abbrev_update = false, bundles = {buckets = 
0x16d5938, one = 0x0, mask = 0, n = 0}}
(gdb) bt
#0  ofconn_set_protocol (ofconn=0x16d5810, protocol=OFPUTIL_P_OF10_STD) at 
ofproto/connmgr.c:999
#1  0x0045e194 in ofconn_get_protocol (ofconn=0x16d5810) at 
ofproto/connmgr.c:982
#2  0x0045f586 in connmgr_wants_packet_in_on_miss (mgr=0x16a6de0) at 
ofproto/connmgr.c:1613
#3  0x00435261 in rule_dpif_lookup_from_table (ofproto=0x16a6880, 
version=323, flow=0x7f2ace7f86e8, wc=0x7f2ace7f84b0, stats=0x0, 
table_id=0x7f2ace7f7eda "", in_port=28, may_packet_in=true, 
honor_table_miss=true) at ofproto/ofproto-dpif.c:3973
#4  0x00457ecf in xlate_actions (xin=0x7f2ace7f86e0, 
xout=0x7f2ace7f8010) at ofproto/ofproto-dpif-xlate.c:5188
#5  0x004481b1 in revalidate_ukey (udpif=0x16a7300, 
ukey=0x7f2ab80060e0, stats=0x7f2ace7f94e0, odp_actions=0x7f2ace7f8a40, 
reval_seq=585728, recircs=0x7f2ace7f8a30)
at ofproto/ofproto-dpif-upcall.c:1866
#6  0x00448fb2 in revalidate (revalidator=0x1691990) at 
ofproto/ofproto-dpif-upcall.c:2186
#7  0x0044593e in udpif_revalidator (arg=0x1691990) at 
ofproto/ofproto-dpif-upcall.c:862
#8  0x0050b93d in ovsthread_wrapper (aux_=0x16f4560) at 
lib/ovs-thread.c:340
#9  0x7f2ad75c2184 in start_thread () from 
/lib/x86_64-linux-gnu/libpthread.so.0
#10 0x7f2ad6de137d in clone () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) f 1
#1  0x0045e194 in ofconn_get_protocol (ofconn=0x16d5810) at 
ofproto/connmgr.c:982
982 ofconn_set_protocol(CONST_CAST(struct ofconn *, ofconn),
(gdb) l
977 {
978 if (ofconn->protocol == OFPUTIL_P_NONE &&
979 rconn_is_connected(ofconn->rconn)) {
980 int version = rconn_get_version(ofconn->rconn);
981 if (version > 0) {
982 ofconn_set_protocol(CONST_CAST(struct ofconn *, ofconn),
983 
ofputil_protocol_from_ofp_version(version));
984 }
985 }
986 
(gdb) p *ofconn
$3 = {node = {prev = 0x16a6e18, next = 0x16a6e18}, hmap_node = {hash = 0, next 
= 0x0}, connmgr = 0x16a6de0, rconn = 0x16edb50, type = OFCONN_SERVICE, band = 
OFPROTO_IN_BAND, enable_async_msgs = true, 
  role = OFPCR12_ROLE_EQUAL, protocol = OFPUTIL_P_OF10_STD_TID, 
packet_in_format = NXPIF_OPENFLOW10, packet_in_counter = 0x167a170, schedulers 
= {0x0, 0x0}, pktbuf = 0x0, miss_send_len = 0, 
  controller_id = 0, reply_counter = 0x1673190, master_async_config = {3, 7, 7, 
0, 0, 0}, slave_async_config = {0, 7, 0, 0, 0, 0}, n_add = 0, n_delete = 0, 
n_modify = 0, 
  first_op = -9223372036854775808, last_op = -9223372036854775808, 
next_op_report = 9223372036854775807, op_backoff = -9223372036854775808, 
monitors = {buckets = 0x16d58f0, one = 0x0, mask = 0, 
n = 0}, monitor_paused = 0, monitor_counter = 0x16759f0, updates = {prev = 
0x16d5918, next = 0x16d5918}, sent_abbrev_update = false, bundles = {buckets = 
0x16d5938, one = 0x0, mask = 0, n = 0}}
(gdb) p ofconn
$4 = (const struct ofconn *) 0x16d5810
(gdb) c
Continuing.
[Thread 0x7f2ad79f5980 (LWP 20165) exited]

From the above GDB session, ovs-vswitchd is in the middle of processing a
packet miss that was read from the data path.
The breakpoint was set inside ofconn_set_protocol so that we hit it if the
protocol was already set to something other than OFPUTIL_P_NONE and is now
being set to OFPUTIL_P_OF10_STD.
Yes, we modified the code in ofconn_set_protocol with this if condit

Re: [ovs-dev] [PATCH v2 15/17] datapath: enable vxlangpe creation in compat mode

2017-01-17 Thread Jiri Benc
On Wed, 28 Dec 2016 20:26:28 +0800, Yi Yang wrote:
> --- a/datapath/linux/compat/include/linux/openvswitch.h
> +++ b/datapath/linux/compat/include/linux/openvswitch.h
> @@ -291,6 +291,7 @@ enum ovs_vport_attr {
>  enum {
>   OVS_VXLAN_EXT_UNSPEC,
>   OVS_VXLAN_EXT_GBP,  /* Flag or __u32 */
> + OVS_VXLAN_EXT_GPE,  /* Flag or __u32 */

Please leave several free slots between OVS_VXLAN_EXT_GBP and
OVS_VXLAN_EXT_GPE. Although it's not expected that we add more values
to this enum upstream, we should not rule it out completely. Having a
buffer here allows to prevent nasty problems with attribute conflicts
between the upstream kernel and the compat code.

 Jiri
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [BUG] ovs-ofctl version 2.5.0 will crash with OFPFMFC_BAD_COMMAND

2017-01-17 Thread Ben Pfaff
It would be more helpful to have a simple reproduction case.

Why haven't you tried a newer version from branch-2.5?

On Tue, Jan 17, 2017 at 07:59:05AM -0800, Vidyasagara Guntaka wrote:
> Hi Ben,
> 
> Here i is more debug information related to this incident (still using 
> version 2.5.0):
> 
> Summary :
> 
> We think that there is some race condition involved in processing OF 
> Controller connections and Packet miss processing in ovs-vswitchd.
> 
> Reasoning :
> 
> Please consider the following GDB Debug Session:
> 
> Breakpoint 1, ofconn_set_protocol (ofconn=0x16d5810, 
> protocol=OFPUTIL_P_OF10_STD) at ofproto/connmgr.c:999
> (gdb) f 2
> #2  0x0045f586 in connmgr_wants_packet_in_on_miss (mgr=0x16a6de0) at 
> ofproto/connmgr.c:1613
> 1613  enum ofputil_protocol protocol = ofconn_get_protocol(ofconn);
> (gdb) p *ofconn
> $2 = {node = {prev = 0x16a6e18, next = 0x16a6e18}, hmap_node = {hash = 0, 
> next = 0x0}, connmgr = 0x16a6de0, rconn = 0x16edb50, type = OFCONN_SERVICE, 
> band = OFPROTO_IN_BAND, enable_async_msgs = true, 
>   role = OFPCR12_ROLE_EQUAL, protocol = OFPUTIL_P_OF10_STD_TID, 
> packet_in_format = NXPIF_OPENFLOW10, packet_in_counter = 0x167a170, 
> schedulers = {0x0, 0x0}, pktbuf = 0x0, miss_send_len = 0, 
>   controller_id = 0, reply_counter = 0x1673190, master_async_config = {3, 7, 
> 7, 0, 0, 0}, slave_async_config = {0, 7, 0, 0, 0, 0}, n_add = 0, n_delete = 
> 0, n_modify = 0, 
>   first_op = -9223372036854775808, last_op = -9223372036854775808, 
> next_op_report = 9223372036854775807, op_backoff = -9223372036854775808, 
> monitors = {buckets = 0x16d58f0, one = 0x0, mask = 0, 
> n = 0}, monitor_paused = 0, monitor_counter = 0x16759f0, updates = {prev 
> = 0x16d5918, next = 0x16d5918}, sent_abbrev_update = false, bundles = 
> {buckets = 0x16d5938, one = 0x0, mask = 0, n = 0}}
> (gdb) bt
> #0  ofconn_set_protocol (ofconn=0x16d5810, protocol=OFPUTIL_P_OF10_STD) at 
> ofproto/connmgr.c:999
> #1  0x0045e194 in ofconn_get_protocol (ofconn=0x16d5810) at 
> ofproto/connmgr.c:982
> #2  0x0045f586 in connmgr_wants_packet_in_on_miss (mgr=0x16a6de0) at 
> ofproto/connmgr.c:1613
> #3  0x00435261 in rule_dpif_lookup_from_table (ofproto=0x16a6880, 
> version=323, flow=0x7f2ace7f86e8, wc=0x7f2ace7f84b0, stats=0x0, 
> table_id=0x7f2ace7f7eda "", in_port=28, may_packet_in=true, 
> honor_table_miss=true) at ofproto/ofproto-dpif.c:3973
> #4  0x00457ecf in xlate_actions (xin=0x7f2ace7f86e0, 
> xout=0x7f2ace7f8010) at ofproto/ofproto-dpif-xlate.c:5188
> #5  0x004481b1 in revalidate_ukey (udpif=0x16a7300, 
> ukey=0x7f2ab80060e0, stats=0x7f2ace7f94e0, odp_actions=0x7f2ace7f8a40, 
> reval_seq=585728, recircs=0x7f2ace7f8a30)
> at ofproto/ofproto-dpif-upcall.c:1866
> #6  0x00448fb2 in revalidate (revalidator=0x1691990) at 
> ofproto/ofproto-dpif-upcall.c:2186
> #7  0x0044593e in udpif_revalidator (arg=0x1691990) at 
> ofproto/ofproto-dpif-upcall.c:862
> #8  0x0050b93d in ovsthread_wrapper (aux_=0x16f4560) at 
> lib/ovs-thread.c:340
> #9  0x7f2ad75c2184 in start_thread () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #10 0x7f2ad6de137d in clone () from /lib/x86_64-linux-gnu/libc.so.6
> (gdb) f 1
> #1  0x0045e194 in ofconn_get_protocol (ofconn=0x16d5810) at 
> ofproto/connmgr.c:982
> 982   ofconn_set_protocol(CONST_CAST(struct ofconn *, ofconn),
> (gdb) l
> 977   {
> 978   if (ofconn->protocol == OFPUTIL_P_NONE &&
> 979   rconn_is_connected(ofconn->rconn)) {
> 980   int version = rconn_get_version(ofconn->rconn);
> 981   if (version > 0) {
> 982   ofconn_set_protocol(CONST_CAST(struct ofconn *, ofconn),
> 983   
> ofputil_protocol_from_ofp_version(version));
> 984   }
> 985   }
> 986   
> (gdb) p *ofconn
> $3 = {node = {prev = 0x16a6e18, next = 0x16a6e18}, hmap_node = {hash = 0, 
> next = 0x0}, connmgr = 0x16a6de0, rconn = 0x16edb50, type = OFCONN_SERVICE, 
> band = OFPROTO_IN_BAND, enable_async_msgs = true, 
>   role = OFPCR12_ROLE_EQUAL, protocol = OFPUTIL_P_OF10_STD_TID, 
> packet_in_format = NXPIF_OPENFLOW10, packet_in_counter = 0x167a170, 
> schedulers = {0x0, 0x0}, pktbuf = 0x0, miss_send_len = 0, 
>   controller_id = 0, reply_counter = 0x1673190, master_async_config = {3, 7, 
> 7, 0, 0, 0}, slave_async_config = {0, 7, 0, 0, 0, 0}, n_add = 0, n_delete = 
> 0, n_modify = 0, 
>   first_op = -9223372036854775808, last_op = -9223372036854775808, 
> next_op_report = 9223372036854775807, op_backoff = -9223372036854775808, 
> monitors = {buckets = 0x16d58f0, one = 0x0, mask = 0, 
> n = 0}, monitor_paused = 0, monitor_counter = 0x16759f0, updates = {prev 
> = 0x16d5918, next = 0x16d5918}, sent_abbrev_update = false, bundles = 
> {buckets = 0x16d5938, one = 0x0, mask = 0, n = 0}}
> (gdb) p ofconn
> $4 = (const struct ofconn *) 0x16d5810
> (gdb) c
> Continuing.
> [Thread 0x7f2ad79f5980 (LWP 20165) exited]
> 
> Fr

Re: [ovs-dev] [BUG] ovs-ofctl version 2.5.0 will crash with OFPFMFC_BAD_COMMAND

2017-01-17 Thread Vidyasagara Guntaka via dev
This issue happened on our in-use systems and we were trying to find a way
to move forward while avoiding this issue, so that we do not have to upgrade
OVS on thousands of our hypervisors and cause downtime. Our debugging did
help us avoid the issue for now: by installing an explicit rule to drop
packets when there is no match, the issue is not seen over many hours of
test runs.
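
For reference, such a lowest-priority catch-all rule can be installed along
these lines (the bridge name is illustrative; the exact command is not part
of the original report):

  ovs-ofctl add-flow br-int "priority=0,actions=drop"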

We will definitely run this test with the latest version, but we will need
more time since we are busy with our release-related activities.

Regards,
Sagar.

On Tue, Jan 17, 2017 at 8:42 AM, Ben Pfaff  wrote:

> It would be more helpful to have a simple reproduction case.
>
> Why haven't you tried a newer version from branch-2.5?
>
> On Tue, Jan 17, 2017 at 07:59:05AM -0800, Vidyasagara Guntaka wrote:
> > Hi Ben,
> >
> > Here i is more debug information related to this incident (still using
> version 2.5.0):
> >
> > Summary :
> >
> > We think that there is some race condition involved in processing OF
> Controller connections and Packet miss processing in ovs-vswitchd.
> >
> > Reasoning :
> >
> > Please consider the following GDB Debug Session:
> >
> > Breakpoint 1, ofconn_set_protocol (ofconn=0x16d5810,
> protocol=OFPUTIL_P_OF10_STD) at ofproto/connmgr.c:999
> > (gdb) f 2
> > #2  0x0045f586 in connmgr_wants_packet_in_on_miss
> (mgr=0x16a6de0) at ofproto/connmgr.c:1613
> > 1613  enum ofputil_protocol protocol =
> ofconn_get_protocol(ofconn);
> > (gdb) p *ofconn
> > $2 = {node = {prev = 0x16a6e18, next = 0x16a6e18}, hmap_node = {hash =
> 0, next = 0x0}, connmgr = 0x16a6de0, rconn = 0x16edb50, type =
> OFCONN_SERVICE, band = OFPROTO_IN_BAND, enable_async_msgs = true,
> >   role = OFPCR12_ROLE_EQUAL, protocol = OFPUTIL_P_OF10_STD_TID,
> packet_in_format = NXPIF_OPENFLOW10, packet_in_counter = 0x167a170,
> schedulers = {0x0, 0x0}, pktbuf = 0x0, miss_send_len = 0,
> >   controller_id = 0, reply_counter = 0x1673190, master_async_config =
> {3, 7, 7, 0, 0, 0}, slave_async_config = {0, 7, 0, 0, 0, 0}, n_add = 0,
> n_delete = 0, n_modify = 0,
> >   first_op = -9223372036854775808, last_op = -9223372036854775808,
> next_op_report = 9223372036854775807, op_backoff = -9223372036854775808,
> monitors = {buckets = 0x16d58f0, one = 0x0, mask = 0,
> > n = 0}, monitor_paused = 0, monitor_counter = 0x16759f0, updates =
> {prev = 0x16d5918, next = 0x16d5918}, sent_abbrev_update = false, bundles =
> {buckets = 0x16d5938, one = 0x0, mask = 0, n = 0}}
> > (gdb) bt
> > #0  ofconn_set_protocol (ofconn=0x16d5810, protocol=OFPUTIL_P_OF10_STD)
> at ofproto/connmgr.c:999
> > #1  0x0045e194 in ofconn_get_protocol (ofconn=0x16d5810) at
> ofproto/connmgr.c:982
> > #2  0x0045f586 in connmgr_wants_packet_in_on_miss
> (mgr=0x16a6de0) at ofproto/connmgr.c:1613
> > #3  0x00435261 in rule_dpif_lookup_from_table
> (ofproto=0x16a6880, version=323, flow=0x7f2ace7f86e8, wc=0x7f2ace7f84b0,
> stats=0x0, table_id=0x7f2ace7f7eda "", in_port=28, may_packet_in=true,
> > honor_table_miss=true) at ofproto/ofproto-dpif.c:3973
> > #4  0x00457ecf in xlate_actions (xin=0x7f2ace7f86e0,
> xout=0x7f2ace7f8010) at ofproto/ofproto-dpif-xlate.c:5188
> > #5  0x004481b1 in revalidate_ukey (udpif=0x16a7300,
> ukey=0x7f2ab80060e0, stats=0x7f2ace7f94e0, odp_actions=0x7f2ace7f8a40,
> reval_seq=585728, recircs=0x7f2ace7f8a30)
> > at ofproto/ofproto-dpif-upcall.c:1866
> > #6  0x00448fb2 in revalidate (revalidator=0x1691990) at
> ofproto/ofproto-dpif-upcall.c:2186
> > #7  0x0044593e in udpif_revalidator (arg=0x1691990) at
> ofproto/ofproto-dpif-upcall.c:862
> > #8  0x0050b93d in ovsthread_wrapper (aux_=0x16f4560) at
> lib/ovs-thread.c:340
> > #9  0x7f2ad75c2184 in start_thread () from /lib/x86_64-linux-gnu/
> libpthread.so.0
> > #10 0x7f2ad6de137d in clone () from /lib/x86_64-linux-gnu/libc.so.6
> > (gdb) f 1
> > #1  0x0045e194 in ofconn_get_protocol (ofconn=0x16d5810) at
> ofproto/connmgr.c:982
> > 982   ofconn_set_protocol(CONST_CAST(struct ofconn *,
> ofconn),
> > (gdb) l
> > 977   {
> > 978   if (ofconn->protocol == OFPUTIL_P_NONE &&
> > 979   rconn_is_connected(ofconn->rconn)) {
> > 980   int version = rconn_get_version(ofconn->rconn);
> > 981   if (version > 0) {
> > 982   ofconn_set_protocol(CONST_CAST(struct ofconn *,
> ofconn),
> > 983   ofputil_protocol_from_ofp_
> version(version));
> > 984   }
> > 985   }
> > 986
> > (gdb) p *ofconn
> > $3 = {node = {prev = 0x16a6e18, next = 0x16a6e18}, hmap_node = {hash =
> 0, next = 0x0}, connmgr = 0x16a6de0, rconn = 0x16edb50, type =
> OFCONN_SERVICE, band = OFPROTO_IN_BAND, enable_async_msgs = true,
> >   role = OFPCR12_ROLE_EQUAL, protocol = OFPUTIL_P_OF10_STD_TID,
> packet_in_format = NXPIF_OPENFLOW10, packet_in_counter = 0x167a170,
> schedulers = {0x0, 0x0}, pktbuf = 0x0, miss_send_len = 0,
> >   controller_id = 0, reply_counter = 0x1673190, master_asy

[ovs-dev] [PATCH v2] dpif-netdev: Change definitions of 'idle' & 'processing' cycles

2017-01-17 Thread Ciara Loftus
Instead of counting all polling cycles as processing cycles, count only
the cycles where packets were received from the polling.

Signed-off-by: Georg Schmuecking 
Signed-off-by: Ciara Loftus 
Co-authored-by: Ciara Loftus 
---
v2:
- Rebase
---
 lib/dpif-netdev.c | 57 ++-
 1 file changed, 44 insertions(+), 13 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 3901129..3854c79 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -272,7 +272,10 @@ enum dp_stat_type {
 
 enum pmd_cycles_counter_type {
 PMD_CYCLES_POLLING, /* Cycles spent polling NICs. */
-PMD_CYCLES_PROCESSING,  /* Cycles spent processing packets */
+PMD_CYCLES_IDLE,/* Cycles spent idle or unsuccessful polling */
+PMD_CYCLES_PROCESSING,  /* Cycles spent successfully polling and
+ * processing polled packets */
+
 PMD_N_CYCLES
 };
 
@@ -747,10 +750,10 @@ pmd_info_show_stats(struct ds *reply,
 }
 
 ds_put_format(reply,
-  "\tpolling cycles:%"PRIu64" (%.02f%%)\n"
+  "\tidle cycles:%"PRIu64" (%.02f%%)\n"
   "\tprocessing cycles:%"PRIu64" (%.02f%%)\n",
-  cycles[PMD_CYCLES_POLLING],
-  cycles[PMD_CYCLES_POLLING] / (double)total_cycles * 100,
+  cycles[PMD_CYCLES_IDLE],
+  cycles[PMD_CYCLES_IDLE] / (double)total_cycles * 100,
   cycles[PMD_CYCLES_PROCESSING],
   cycles[PMD_CYCLES_PROCESSING] / (double)total_cycles * 100);
 
@@ -2892,30 +2895,43 @@ cycles_count_end(struct dp_netdev_pmd_thread *pmd,
 non_atomic_ullong_add(&pmd->cycles.n[type], interval);
 }
 
-static void
+/* Calculate the intermediate cycle result and add to the counter 'type' */
+static inline void
+cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd,
+  enum pmd_cycles_counter_type type)
+OVS_NO_THREAD_SAFETY_ANALYSIS
+{
+unsigned long long new_cycles = cycles_counter();
+unsigned long long interval = new_cycles - pmd->last_cycles;
+pmd->last_cycles = new_cycles;
+
+non_atomic_ullong_add(&pmd->cycles.n[type], interval);
+}
+
+static int
 dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
struct netdev_rxq *rx,
odp_port_t port_no)
 {
 struct dp_packet_batch batch;
 int error;
+int batch_cnt = 0;
 
 dp_packet_batch_init(&batch);
-cycles_count_start(pmd);
 error = netdev_rxq_recv(rx, &batch);
-cycles_count_end(pmd, PMD_CYCLES_POLLING);
 if (!error) {
 *recirc_depth_get() = 0;
 
-cycles_count_start(pmd);
+batch_cnt = batch.count;
 dp_netdev_input(pmd, &batch, port_no);
-cycles_count_end(pmd, PMD_CYCLES_PROCESSING);
 } else if (error != EAGAIN && error != EOPNOTSUPP) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
 
 VLOG_ERR_RL(&rl, "error receiving data from %s: %s",
 netdev_rxq_get_name(rx), ovs_strerror(error));
 }
+
+return batch_cnt;
 }
 
 static struct tx_port *
@@ -3377,21 +3393,27 @@ dpif_netdev_run(struct dpif *dpif)
 struct dp_netdev *dp = get_dp_netdev(dpif);
 struct dp_netdev_pmd_thread *non_pmd;
 uint64_t new_tnl_seq;
+int process_packets = 0;
 
 ovs_mutex_lock(&dp->port_mutex);
 non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID);
 if (non_pmd) {
 ovs_mutex_lock(&dp->non_pmd_mutex);
+cycles_count_start(non_pmd);
 HMAP_FOR_EACH (port, node, &dp->ports) {
 if (!netdev_is_pmd(port->netdev)) {
 int i;
 
 for (i = 0; i < port->n_rxq; i++) {
-dp_netdev_process_rxq_port(non_pmd, port->rxqs[i].rx,
-   port->port_no);
+process_packets +=
+dp_netdev_process_rxq_port(non_pmd,
+   port->rxqs[i].rx,
+   port->port_no);
 }
 }
 }
+cycles_count_end(non_pmd, process_packets ? PMD_CYCLES_PROCESSING
+  : PMD_CYCLES_IDLE);
 dpif_netdev_xps_revalidate_pmd(non_pmd, time_msec(), false);
 ovs_mutex_unlock(&dp->non_pmd_mutex);
 
@@ -3516,6 +3538,7 @@ pmd_thread_main(void *f_)
 bool exiting;
 int poll_cnt;
 int i;
+int process_packets = 0;
 
 poll_list = NULL;
 
@@ -3542,10 +3565,12 @@ reload:
 lc = UINT_MAX;
 }
 
+cycles_count_start(pmd);
 for (;;) {
 for (i = 0; i < poll_cnt; i++) {
-dp_netdev_process_rxq_port(pmd, poll_list[i].rx,
-   poll_list[i].port_no);
+process_packets +=
+dp_netdev_process_rxq_port(pmd, poll_list


Re: [ovs-dev] [RFC] ofproto/bond: operational vs administratively disabled bond interface

2017-01-17 Thread Ben Pfaff
On Tue, Jan 17, 2017 at 12:10:59PM -0200, Flavio Leitner wrote:
> On Tue, 17 Jan 2017 09:34:19 +0100
> Eelco Chaudron  wrote:
> 
> > Currently OVS does not distinguish between a bond slave being operational
> > disabled, i.e. link being down, and administratively disabled.
> > 
> > Take the example where the administrator disabled a link in a bond,
> > "ovs-appctl bond/disable-slave bond0 enp129s0f0", it's automatically
> > enabled again due to the fact the link is up.
> > 
> > I would like to change this behavior such that when disabled trough appctl
> > the slave is no longer used until explicitly enabled again via appctl.
> 
> Eelco and I discussed this off list and I agree that this sounds like
> a bug.  The slave should not be used if the admin has disabled it
> regardless of its link state.

The behavior matches the documentation:

   bond/enable-slave port slave
   bond/disable-slave port slave
  Enables (or disables) slave on the given bond port, skipping any
  updelay (or downdelay).

  This setting is not permanent: it persists only until the carrier
  status of slave changes.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v10 1/8] ovn: document logical routers and logical patch ports in ovn-architecture

2017-01-17 Thread Ben Pfaff
On Tue, Jan 17, 2017 at 01:45:02AM -0800, Mickey Spiegel wrote:
> This patch adds a description of logical routers and logical patch ports,
> including gateway routers, to ovn/ovn-architecture.7.xml.
> 
> Signed-off-by: Mickey Spiegel 

Thanks for working on the documentation!  I applied this to master,
folding in just one change (because I wanted to emphasize that the
behavior isn't different, just the implementation):

diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index d92f878..c0ceb65 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -1096,8 +1096,8 @@
 
   
 When the packet reaches table 65, the logical egress port is a logical
-patch port.  The behavior at table 65 differs depending on the OVS
-version:
+patch port.  The implementation in table 65 differs depending on the OVS
+version, although the observed behavior is meant to be the same:
   
 
   

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] dpif-netdev: Change definitions of 'idle' & 'processing' cycles

2017-01-17 Thread Kevin Traynor
On 01/17/2017 05:43 PM, Ciara Loftus wrote:
> Instead of counting all polling cycles as processing cycles, only count
> the cycles where packets were received from the polling.

This makes these stats much clearer. One minor comment below, other than
that

Acked-by: Kevin Traynor 

> 
> Signed-off-by: Georg Schmuecking 
> Signed-off-by: Ciara Loftus 
> Co-authored-by: Ciara Loftus 
> ---
> v2:
> - Rebase
> ---
>  lib/dpif-netdev.c | 57 
> ++-
>  1 file changed, 44 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 3901129..3854c79 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -272,7 +272,10 @@ enum dp_stat_type {
>  
>  enum pmd_cycles_counter_type {
>  PMD_CYCLES_POLLING, /* Cycles spent polling NICs. */

this is not used anymore and can be removed

> -PMD_CYCLES_PROCESSING,  /* Cycles spent processing packets */
> +PMD_CYCLES_IDLE,/* Cycles spent idle or unsuccessful polling 
> */
> +PMD_CYCLES_PROCESSING,  /* Cycles spent successfully polling and
> + * processing polled packets */
> +
>  PMD_N_CYCLES
>  };
>  
> @@ -747,10 +750,10 @@ pmd_info_show_stats(struct ds *reply,
>  }
>  
>  ds_put_format(reply,
> -  "\tpolling cycles:%"PRIu64" (%.02f%%)\n"
> +  "\tidle cycles:%"PRIu64" (%.02f%%)\n"
>"\tprocessing cycles:%"PRIu64" (%.02f%%)\n",
> -  cycles[PMD_CYCLES_POLLING],
> -  cycles[PMD_CYCLES_POLLING] / (double)total_cycles * 100,
> +  cycles[PMD_CYCLES_IDLE],
> +  cycles[PMD_CYCLES_IDLE] / (double)total_cycles * 100,
>cycles[PMD_CYCLES_PROCESSING],
>cycles[PMD_CYCLES_PROCESSING] / (double)total_cycles * 
> 100);
>  
> @@ -2892,30 +2895,43 @@ cycles_count_end(struct dp_netdev_pmd_thread *pmd,
>  non_atomic_ullong_add(&pmd->cycles.n[type], interval);
>  }
>  
> -static void
> +/* Calculate the intermediate cycle result and add to the counter 'type' */
> +static inline void
> +cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd,
> +  enum pmd_cycles_counter_type type)
> +OVS_NO_THREAD_SAFETY_ANALYSIS
> +{
> +unsigned long long new_cycles = cycles_counter();
> +unsigned long long interval = new_cycles - pmd->last_cycles;
> +pmd->last_cycles = new_cycles;
> +
> +non_atomic_ullong_add(&pmd->cycles.n[type], interval);
> +}
> +
> +static int
>  dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
> struct netdev_rxq *rx,
> odp_port_t port_no)
>  {
>  struct dp_packet_batch batch;
>  int error;
> +int batch_cnt = 0;
>  
>  dp_packet_batch_init(&batch);
> -cycles_count_start(pmd);
>  error = netdev_rxq_recv(rx, &batch);
> -cycles_count_end(pmd, PMD_CYCLES_POLLING);
>  if (!error) {
>  *recirc_depth_get() = 0;
>  
> -cycles_count_start(pmd);
> +batch_cnt = batch.count;
>  dp_netdev_input(pmd, &batch, port_no);
> -cycles_count_end(pmd, PMD_CYCLES_PROCESSING);
>  } else if (error != EAGAIN && error != EOPNOTSUPP) {
>  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>  
>  VLOG_ERR_RL(&rl, "error receiving data from %s: %s",
>  netdev_rxq_get_name(rx), ovs_strerror(error));
>  }
> +
> +return batch_cnt;
>  }
>  
>  static struct tx_port *
> @@ -3377,21 +3393,27 @@ dpif_netdev_run(struct dpif *dpif)
>  struct dp_netdev *dp = get_dp_netdev(dpif);
>  struct dp_netdev_pmd_thread *non_pmd;
>  uint64_t new_tnl_seq;
> +int process_packets = 0;
>  
>  ovs_mutex_lock(&dp->port_mutex);
>  non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID);
>  if (non_pmd) {
>  ovs_mutex_lock(&dp->non_pmd_mutex);
> +cycles_count_start(non_pmd);
>  HMAP_FOR_EACH (port, node, &dp->ports) {
>  if (!netdev_is_pmd(port->netdev)) {
>  int i;
>  
>  for (i = 0; i < port->n_rxq; i++) {
> -dp_netdev_process_rxq_port(non_pmd, port->rxqs[i].rx,
> -   port->port_no);
> +process_packets +=
> +dp_netdev_process_rxq_port(non_pmd,
> +   port->rxqs[i].rx,
> +   port->port_no);
>  }
>  }
>  }
> +cycles_count_end(non_pmd, process_packets ? PMD_CYCLES_PROCESSING
> +  : PMD_CYCLES_IDLE);
>  dpif_netdev_xps_revalidate_pmd(non_pmd, time_msec(), false);
>  ovs_mutex_unlock(&dp->non_pmd_mutex);
>  
> @@ -3516,6 +3538,7 @@ pmd_thread_main(void *f_)
>  bool exiting;
>  

Re: [ovs-dev] [PATCH 1/6] datapath-windows: Fix alignment in MapTunAttrToFlowPut

2017-01-17 Thread Sairam Venugopal
Acked-by: Sairam Venugopal 





On 1/10/17, 8:48 AM, "ovs-dev-boun...@openvswitch.org on behalf of Alin 
Serdean"  wrote:

>Found by inspection.
>
>Signed-off-by: Alin Gabriel Serdean 
>---
> datapath-windows/ovsext/Flow.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/datapath-windows/ovsext/Flow.c b/datapath-windows/ovsext/Flow.c
>index 2e8b42b..749d83a 100644
>--- a/datapath-windows/ovsext/Flow.c
>+++ b/datapath-windows/ovsext/Flow.c
>@@ -1809,7 +1809,7 @@ MapTunAttrToFlowPut(PNL_ATTR *keyAttrs,
> }
> 
> if (tunAttrs[OVS_TUNNEL_KEY_ATTR_OAM]) {
>-destKey->tunKey.flags |= OVS_TNL_F_OAM;
>+destKey->tunKey.flags |= OVS_TNL_F_OAM;
> }
> 
> if (tunAttrs[OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS]) {
>-- 
>2.10.2.windows.1
>___
>dev mailing list
>d...@openvswitch.org
>https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=X2oRO4F4ebnUFPnYizFIvjlYkWE7OoYt3JMNaIC94LU&s=A-q0ojWDoYrOOwbjz6V1EmHWNYTPFkii9E24dn7lQ2s&e=
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/3] datapath: compat: Fix build on RHEL 7.3

2017-01-17 Thread Joe Stringer
On 13 January 2017 at 19:14, Han Zhou  wrote:
> Should this one be applied to stable branches such as ovs2.6?
>
> Now I got the compile error while compiling ovs2.6 with kernel version
> 3.10.0-514.2.2.el7.x86_64 (CentOS 7.3):
>
> ovs/datapath/linux/compat/include/linux/netdevice.h:125:34: error:
> conflicting types for 'netdev_notifier_info_to_dev'
>  static inline struct net_device *netdev_notifier_info_to_dev(void *info)

This is effectively a new feature (support for a new kernel); we
typically do not backport new features or new kernel support to
older branches. Users who wish to use OVS on newer CentOS kernels may
use the upstream module provided by CentOS, or use a newer release
(master for now, or v2.7 when it comes out, ETA next month).
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/6] datapath-windows: Allow tunnel action to modify destination port

2017-01-17 Thread Sairam Venugopal
Patches 2 and 3 could’ve been combined into the same patch since they are
both about dst_port.

Acked-by: Sairam Venugopal 





On 1/10/17, 8:48 AM, "ovs-dev-boun...@openvswitch.org on behalf of Alin 
Serdean"  wrote:

>'OvsTunnelAttrToIPv4TunnelKey' modifies 'tunkey' with the received netlink
>attributes(i.e. OVS_TUNNEL_KEY_ATTR_IPV4_DST).
>
>Change the order of the value assignment to reflect the values received via
>userspace.
>
>Signed-off-by: Alin Gabriel Serdean 
>---
> datapath-windows/ovsext/Actions.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/datapath-windows/ovsext/Actions.c 
>b/datapath-windows/ovsext/Actions.c
>index b4a286b..1530d8b 100644
>--- a/datapath-windows/ovsext/Actions.c
>+++ b/datapath-windows/ovsext/Actions.c
>@@ -1492,11 +1492,11 @@ OvsExecuteSetAction(OvsForwardingContext *ovsFwdCtx,
> case OVS_KEY_ATTR_TUNNEL:
> {
> OvsIPv4TunnelKey tunKey;
>+tunKey.flow_hash = (uint16)(hash ? *hash : OvsHashFlow(key));
>+tunKey.dst_port = key->ipKey.l4.tpDst;
> NTSTATUS convertStatus = OvsTunnelAttrToIPv4TunnelKey((PNL_ATTR)a, 
> &tunKey);
> status = SUCCEEDED(convertStatus) ? NDIS_STATUS_SUCCESS : 
> NDIS_STATUS_FAILURE;
> ASSERT(status == NDIS_STATUS_SUCCESS);
>-tunKey.flow_hash = (uint16)(hash ? *hash : OvsHashFlow(key));
>-tunKey.dst_port = key->ipKey.l4.tpDst;
> RtlCopyMemory(&ovsFwdCtx->tunKey, &tunKey, sizeof ovsFwdCtx->tunKey);
> break;
> }
>-- 
>2.10.2.windows.1
>___
>dev mailing list
>d...@openvswitch.org
>https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=ZijUW-XONuy4-0Gaus241SbH5Y3jgJ6h3MNUavcwlRc&s=1XFYfOzat_zlBGT85Z2Z7b3sEH8rtGJQKCGCjXsLtag&e=
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/6] datapath-windows: Add support for OVS_TUNNEL_KEY_ATTR_TP_DST

2017-01-17 Thread Sairam Venugopal
We should consolidate this logic into one function for retrieving the tunKey
from the attributes. This can be done separately.

Acked-by: Sairam Venugopal 





On 1/10/17, 8:48 AM, "ovs-dev-boun...@openvswitch.org on behalf of Alin 
Serdean"  wrote:

>Add support for netlink attribute OVS_TUNNEL_KEY_ATTR_TP_DST get/set
>flow functions.
>
>Signed-off-by: Alin Gabriel Serdean 
>---
> datapath-windows/ovsext/Flow.c | 8 
> 1 file changed, 8 insertions(+)
>
>diff --git a/datapath-windows/ovsext/Flow.c b/datapath-windows/ovsext/Flow.c
>index 749d83a..96ff9fa 100644
>--- a/datapath-windows/ovsext/Flow.c
>+++ b/datapath-windows/ovsext/Flow.c
>@@ -1737,6 +1737,9 @@ OvsTunnelAttrToIPv4TunnelKey(PNL_ATTR attr,
> case OVS_TUNNEL_KEY_ATTR_OAM:
> tunKey->flags |= OVS_TNL_F_OAM;
> break;
>+case OVS_TUNNEL_KEY_ATTR_TP_DST:
>+tunKey->dst_port = NlAttrGetBe16(a);
>+break;
> case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS:
> if (hasOpt) {
> /* Duplicate options attribute is not allowed. */
>@@ -1812,6 +1815,11 @@ MapTunAttrToFlowPut(PNL_ATTR *keyAttrs,
> destKey->tunKey.flags |= OVS_TNL_F_OAM;
> }
> 
>+if (tunAttrs[OVS_TUNNEL_KEY_ATTR_TP_DST]) {
>+destKey->tunKey.dst_port =
>+NlAttrGetU16(tunAttrs[OVS_TUNNEL_KEY_ATTR_TP_DST]);
>+}
>+
> if (tunAttrs[OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS]) {
> NTSTATUS status = OvsTunnelAttrToGeneveOptions(
>   tunAttrs[OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS],
>-- 
>2.10.2.windows.1
>___
>dev mailing list
>d...@openvswitch.org
>https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=b5wDdE2vUUc40ZLOT7Qh6Xa7oJn8L5PUF1xaK4QXlRc&s=wV7RC6p0TO50RU-QknKEC1uCMcEjT0QBCechGsJCnWc&e=
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 4/6] datapath-windows: VXLAN Check for flow destination port

2017-01-17 Thread Sairam Venugopal
Do we need to convert the tunKey->dst_port to network byte order?
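
For context, the byte-order relationship implied by patches 3/6 and 4/6
seems to be the following (my reading, annotated; not text from the
original mail):

/* NlAttrGetBe16() returns the attribute value in network byte order, so
 * tunKey->dst_port is already big-endian when it was set from
 * OVS_TUNNEL_KEY_ATTR_TP_DST.  The configured vportVxlan->dstPort, by
 * contrast, is kept in host order, which is why only the fallback is
 * converted: */
udpHdr->dest = tunKey->dst_port ? tunKey->dst_port
                                : htons(vportVxlan->dstPort);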



On 1/10/17, 8:48 AM, "ovs-dev-boun...@openvswitch.org on behalf of Alin 
Serdean"  wrote:

>Change the UDP destination port(VXLAN header) to check if it was set by
>the userspace, use it if it was set.
>If the userspace did not specify a destination port, use the configured
>vport destination port.
>
>Signed-off-by: Alin Gabriel Serdean 
>---
> datapath-windows/ovsext/Vxlan.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/datapath-windows/ovsext/Vxlan.c b/datapath-windows/ovsext/Vxlan.c
>index 949e069..84c2f2f 100644
>--- a/datapath-windows/ovsext/Vxlan.c
>+++ b/datapath-windows/ovsext/Vxlan.c
>@@ -289,7 +289,8 @@ OvsDoEncapVxlan(POVS_VPORT_ENTRY vport,
> /* UDP header */
> udpHdr = (UDPHdr *)((PCHAR)ipHdr + sizeof *ipHdr);
> udpHdr->source = htons(tunKey->flow_hash | MAXINT16);
>-udpHdr->dest = htons(vportVxlan->dstPort);
>+udpHdr->dest = tunKey->dst_port ? tunKey->dst_port :
>+  htons(vportVxlan->dstPort);
> udpHdr->len = htons(NET_BUFFER_DATA_LENGTH(curNb) - headRoom +
> sizeof *udpHdr + sizeof *vxlanHdr);
> 
>-- 
>2.10.2.windows.1
>___
>dev mailing list
>d...@openvswitch.org
>https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=5_pKhBH-PUBhzkQB9iiJQHwEXF4_z3jQTCICjBvpj1M&s=LmmZoSZ1s-GGDQBO7ydyZsKXCOqTZ_QAp99ZVHO633M&e=
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 15/17] datapath: enable vxlangpe creation in compat mode

2017-01-17 Thread Yang, Yi Y
Makes sense, will change it in v3.

-Original Message-
From: Jiri Benc [mailto:jb...@redhat.com] 
Sent: Wednesday, January 18, 2017 12:37 AM
To: Yang, Yi Y 
Cc: d...@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v2 15/17] datapath: enable vxlangpe creation in 
compat mode

On Wed, 28 Dec 2016 20:26:28 +0800, Yi Yang wrote:
> --- a/datapath/linux/compat/include/linux/openvswitch.h
> +++ b/datapath/linux/compat/include/linux/openvswitch.h
> @@ -291,6 +291,7 @@ enum ovs_vport_attr {  enum {
>   OVS_VXLAN_EXT_UNSPEC,
>   OVS_VXLAN_EXT_GBP,  /* Flag or __u32 */
> + OVS_VXLAN_EXT_GPE,  /* Flag or __u32 */

Please leave several free slots between OVS_VXLAN_EXT_GBP and 
OVS_VXLAN_EXT_GPE. Although it's not expected that we add more values to this 
enum upstream, we should not rule it out completely. Having a buffer here 
allows to prevent nasty problems with attribute conflicts between the upstream 
kernel and the compat code.

 Jiri
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] dpif-netdev: Change definitions of 'idle' & 'processing' cycles

2017-01-17 Thread Daniele Di Proietto
2017-01-17 11:43 GMT-08:00 Kevin Traynor :
> On 01/17/2017 05:43 PM, Ciara Loftus wrote:
>> Instead of counting all polling cycles as processing cycles, only count
>> the cycles where packets were received from the polling.
>
> This makes these stats much clearer. One minor comment below, other than
> that
>
> Acked-by: Kevin Traynor 
>
>>
>> Signed-off-by: Georg Schmuecking 
>> Signed-off-by: Ciara Loftus 
>> Co-authored-by: Ciara Loftus 

Minor: the co-authored-by tag should be different from the main author.

This makes it easier to understand how busy a pmd thread is, a valid question
that a sysadmin might have.

The counters were originally introduced to help developers understand how
cycles are spent between drivers (netdev rx) and datapath processing (dpif).
Do you think it's ok to lose this type of information?  Perhaps it is, since
a developer can also use a profiler; I'm not sure.

Maybe we could keep 'last_cycles' as it is and introduce a separate counter
to get the idle/busy ratio.  I'm not 100% sure this is the best way.
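
To make that concrete, a rough sketch of what I mean (illustrative only;
the new counter name is made up):

enum pmd_cycles_counter_type {
    PMD_CYCLES_POLLING,     /* Cycles spent polling NICs, as today. */
    PMD_CYCLES_PROCESSING,  /* Cycles spent processing packets, as today. */
    PMD_CYCLES_BUSY,        /* New: poll + processing cycles for polls that
                             * actually received packets. */
    PMD_N_CYCLES
};

That way the developer-oriented polling/processing split is preserved, and
the idle time a sysadmin cares about can still be derived as total - busy.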

What do you guys think?

Thanks,

Daniele

>> ---
>> v2:
>> - Rebase
>> ---
>>  lib/dpif-netdev.c | 57 
>> ++-
>>  1 file changed, 44 insertions(+), 13 deletions(-)
>>
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index 3901129..3854c79 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -272,7 +272,10 @@ enum dp_stat_type {
>>
>>  enum pmd_cycles_counter_type {
>>  PMD_CYCLES_POLLING, /* Cycles spent polling NICs. */
>
> this is not used anymore and can be removed
>
>> -PMD_CYCLES_PROCESSING,  /* Cycles spent processing packets */
>> +PMD_CYCLES_IDLE,/* Cycles spent idle or unsuccessful 
>> polling */
>> +PMD_CYCLES_PROCESSING,  /* Cycles spent successfully polling and
>> + * processing polled packets */
>> +
>>  PMD_N_CYCLES
>>  };
>>
>> @@ -747,10 +750,10 @@ pmd_info_show_stats(struct ds *reply,
>>  }
>>
>>  ds_put_format(reply,
>> -  "\tpolling cycles:%"PRIu64" (%.02f%%)\n"
>> +  "\tidle cycles:%"PRIu64" (%.02f%%)\n"
>>"\tprocessing cycles:%"PRIu64" (%.02f%%)\n",
>> -  cycles[PMD_CYCLES_POLLING],
>> -  cycles[PMD_CYCLES_POLLING] / (double)total_cycles * 100,
>> +  cycles[PMD_CYCLES_IDLE],
>> +  cycles[PMD_CYCLES_IDLE] / (double)total_cycles * 100,
>>cycles[PMD_CYCLES_PROCESSING],
>>cycles[PMD_CYCLES_PROCESSING] / (double)total_cycles * 
>> 100);
>>
>> @@ -2892,30 +2895,43 @@ cycles_count_end(struct dp_netdev_pmd_thread *pmd,
>>  non_atomic_ullong_add(&pmd->cycles.n[type], interval);
>>  }
>>
>> -static void
>> +/* Calculate the intermediate cycle result and add to the counter 'type' */
>> +static inline void
>> +cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd,
>> +  enum pmd_cycles_counter_type type)

I'd add an OVS_REQUIRES(&cycles_counter_fake_mutex)

>> +OVS_NO_THREAD_SAFETY_ANALYSIS
>> +{
>> +unsigned long long new_cycles = cycles_counter();
>> +unsigned long long interval = new_cycles - pmd->last_cycles;
>> +pmd->last_cycles = new_cycles;
>> +
>> +non_atomic_ullong_add(&pmd->cycles.n[type], interval);
>> +}
>> +
>> +static int
>>  dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
>> struct netdev_rxq *rx,
>> odp_port_t port_no)
>>  {
>>  struct dp_packet_batch batch;
>>  int error;
>> +int batch_cnt = 0;
>>
>>  dp_packet_batch_init(&batch);
>> -cycles_count_start(pmd);
>>  error = netdev_rxq_recv(rx, &batch);
>> -cycles_count_end(pmd, PMD_CYCLES_POLLING);
>>  if (!error) {
>>  *recirc_depth_get() = 0;
>>
>> -cycles_count_start(pmd);
>> +batch_cnt = batch.count;
>>  dp_netdev_input(pmd, &batch, port_no);
>> -cycles_count_end(pmd, PMD_CYCLES_PROCESSING);
>>  } else if (error != EAGAIN && error != EOPNOTSUPP) {
>>  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>>
>>  VLOG_ERR_RL(&rl, "error receiving data from %s: %s",
>>  netdev_rxq_get_name(rx), ovs_strerror(error));
>>  }
>> +
>> +return batch_cnt;
>>  }
>>
>>  static struct tx_port *
>> @@ -3377,21 +3393,27 @@ dpif_netdev_run(struct dpif *dpif)
>>  struct dp_netdev *dp = get_dp_netdev(dpif);
>>  struct dp_netdev_pmd_thread *non_pmd;
>>  uint64_t new_tnl_seq;
>> +int process_packets = 0;
>>
>>  ovs_mutex_lock(&dp->port_mutex);
>>  non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID);
>>  if (non_pmd) {
>>  ovs_mutex_lock(&dp->non_pmd_mutex);
>> +cycles_count_start(non_pmd);
>>  HMAP_FOR_EACH (port, node, &dp->ports) {
>>  if (!netdev_is_pmd(port->netdev)) {
>>  int i;
>>
>>  

Re: [ovs-dev] [PATCH 1/1] dpif-netdev: Conditional EMC insert

2017-01-17 Thread Daniele Di Proietto
2017-01-12 8:49 GMT-08:00 Ciara Loftus :
> Unconditional insertion of EMC entries results in EMC thrashing at high
> numbers of parallel flows. When this occurs, the performance of the EMC
> often falls below that of the dpcls classifier, rendering the EMC
> practically useless.
>
> Instead of unconditionally inserting entries into the EMC when a miss
> occurs, use a 1% probability of insertion. This ensures that the most
> frequent flows have the highest chance of creating an entry in the EMC,
> and the probability of thrashing the EMC is also greatly reduced.
>
> Signed-off-by: Ciara Loftus 
> Signed-off-by: Georg Schmuecking 
> Co-authored-by: Georg Schmuecking 

Thanks for the patch

> ---
>  lib/dpif-netdev.c | 20 ++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 546a1e9..8d55ba2 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -144,6 +144,9 @@ struct netdev_flow_key {
>  #define EM_FLOW_HASH_MASK (EM_FLOW_HASH_ENTRIES - 1)
>  #define EM_FLOW_HASH_SEGS 2
>
> +#define EM_FLOW_INSERT_PROB 100
> +#define EM_FLOW_INSERT_MIN (UINT32_MAX / EM_FLOW_INSERT_PROB)
> +
>  struct emc_entry {
>  struct dp_netdev_flow *flow;
>  struct netdev_flow_key key;   /* key.hash used for emc hash value. */
> @@ -1994,6 +1997,19 @@ emc_insert(struct emc_cache *cache, const struct 
> netdev_flow_key *key,
>  emc_change_entry(to_be_replaced, flow, key);
>  }
>
> +static inline void
> +emc_probabilistic_insert(struct dp_netdev_pmd_thread *pmd,
> + struct emc_cache *cache,
> + const struct netdev_flow_key *key,
> + struct dp_netdev_flow *flow)
> +{
> +/* One in every EM_FLOW_INSERT_PROB packets are inserted to reduce
> + * thrashing */
> +if ((key->hash ^ (uint32_t)pmd->last_cycles) <= EM_FLOW_INSERT_MIN) {

pmd->last_cycles is always 0 when OVS is compiled without DPDK.  While we
currently don't require high throughput from this code unless DPDK is
enabled, I think that depending only on the hash might decrease the
coverage of the exact match cache in the unit tests.
Have you thought about just using a counter?
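
Something like this is what I have in mind (an illustrative sketch only,
with a plain counter standing in for the hash/cycles check):

static inline void
emc_probabilistic_insert(struct emc_cache *cache,
                         const struct netdev_flow_key *key,
                         struct dp_netdev_flow *flow)
{
    /* Thread-local in a real implementation; one insertion is performed
     * for every EM_FLOW_INSERT_PROB candidate flows. */
    static uint32_t insert_count;

    if (++insert_count >= EM_FLOW_INSERT_PROB) {
        insert_count = 0;
        emc_insert(cache, key, flow);
    }
}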

> +emc_insert(cache, key, flow);
> +}
> +}
> +
>  static inline struct dp_netdev_flow *
>  emc_lookup(struct emc_cache *cache, const struct netdev_flow_key *key)
>  {
> @@ -4092,7 +4108,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd, 
> struct dp_packet *packet,
>  }
>  ovs_mutex_unlock(&pmd->flow_mutex);
>
> -emc_insert(&pmd->flow_cache, key, netdev_flow);
> +emc_probabilistic_insert(pmd, &pmd->flow_cache, key, netdev_flow);
>  }
>  }
>
> @@ -4187,7 +4203,7 @@ fast_path_processing(struct dp_netdev_pmd_thread *pmd,
>
>  flow = dp_netdev_flow_cast(rules[i]);
>
> -emc_insert(flow_cache, &keys[i], flow);
> +emc_probabilistic_insert(pmd, flow_cache, &keys[i], flow);
>  dp_netdev_queue_batches(packet, flow, &keys[i].mf, batches, 
> n_batches);
>  }
>
> --
> 2.4.3
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/2] netdev-dpdk: Use intermediate queue during packet transmission.

2017-01-17 Thread Ilya Maximets
Not a complete review. This code is full of races.
See details inline.

Best regards, Ilya Maximets.

On 17.01.2017 18:37, antonio.fische...@intel.com wrote:
> This patch implements the intermediate Tx queues on 'dpdk' type ports.
> 
> Test results:
>  * In worst case scenario with fewer packets per batch, a significant
>bottleneck is observed for netdev_dpdk_eth_send() function due to
>expensive MMIO writes.
> 
>  * Also its observed that CPI(cycles per instruction) Rate for the function
>stood between 3.15 and 4.1 which is significantly higher than acceptable
>limit of 1.0 for HPC applications and theoretical limit of 0.25 (As Backend
>pipeline can retire 4 micro-operations in a cycle).
> 
>  * With this patch, CPI for netdev_dpdk_eth_send() is at 0.55 and the overall
>throughput improved significantly.
> 
> 
> Signed-off-by: Antonio Fischetti 
> Signed-off-by: Bhanuprakash Bodireddy 
> Co-authored-by: Bhanuprakash Bodireddy 
> Signed-off-by: Markus Magnusson 
> Co-authored-by: Markus Magnusson 
> ---
>  lib/dpif-netdev.c | 53 +++--
>  lib/netdev-bsd.c  |  1 +
>  lib/netdev-dpdk.c | 82 
> ++-
>  lib/netdev-dummy.c|  1 +
>  lib/netdev-linux.c|  1 +
>  lib/netdev-provider.h |  8 +
>  lib/netdev-vport.c|  3 +-
>  lib/netdev.c  |  9 ++
>  lib/netdev.h  |  1 +
>  9 files changed, 149 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 3901129..58ac429 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -289,6 +289,8 @@ struct dp_netdev_rxq {
>  struct dp_netdev_pmd_thread *pmd;  /* pmd thread that will poll this 
> queue. */
>  };
>  
> +#define LAST_USED_QID_NONE -1
> +
>  /* A port in a netdev-based datapath. */
>  struct dp_netdev_port {
>  odp_port_t port_no;
> @@ -303,6 +305,8 @@ struct dp_netdev_port {
>  char *type; /* Port type as requested by user. */
>  char *rxq_affinity_list;/* Requested affinity of rx queues. */
>  bool need_reconfigure;  /* True if we should reconfigure netdev. */
> +int last_used_qid;  /* Last queue id where packets could be
> +   enqueued. */
>  };
>  
>  /* Contained by struct dp_netdev_flow's 'stats' member.  */
> @@ -619,6 +623,9 @@ static int dpif_netdev_xps_get_tx_qid(const struct 
> dp_netdev_pmd_thread *pmd,
>  static inline bool emc_entry_alive(struct emc_entry *ce);
>  static void emc_clear_entry(struct emc_entry *ce);
>  
> +static struct tx_port *pmd_send_port_cache_lookup
> +(const struct dp_netdev_pmd_thread *pmd, odp_port_t port_no);
> +
>  static void
>  emc_cache_init(struct emc_cache *flow_cache)
>  {
> @@ -3507,15 +3514,19 @@ pmd_load_queues_and_ports(struct dp_netdev_pmd_thread 
> *pmd,
>  return i;
>  }
>  
> +enum { DRAIN_TSC = 2ULL };
> +
>  static void *
>  pmd_thread_main(void *f_)
>  {
>  struct dp_netdev_pmd_thread *pmd = f_;
> -unsigned int lc = 0;
> +unsigned int lc = 0, lc_drain = 0;
>  struct polled_queue *poll_list;
>  bool exiting;
>  int poll_cnt;
>  int i;
> +uint64_t prev = 0, now = 0;
> +struct tx_port *tx_port;
>  
>  poll_list = NULL;
>  
> @@ -3548,6 +3559,26 @@ reload:
> poll_list[i].port_no);
>  }
>  
> +#define MAX_LOOP_TO_DRAIN 128
> +if (lc_drain++ > MAX_LOOP_TO_DRAIN) {
> +lc_drain = 0;
> +prev = now;
> +now = pmd->last_cycles;
> +if ((now - prev) > DRAIN_TSC) {
> +HMAP_FOR_EACH (tx_port, node, &pmd->tx_ports) {

'pmd->tx_ports' must be protected by 'pmd->port_mutex'. Also, it can be
changed while the pmd is still working. I think you wanted something like
'pmd->send_port_cache'.

> +if (tx_port->port->last_used_qid != LAST_USED_QID_NONE) {
> +/* This queue may contain some buffered packets 
> waiting
> + * to be sent out. */
> +netdev_txq_drain(tx_port->port->netdev,
> +tx_port->port->last_used_qid,
> +tx_port->port->dynamic_txqs);
> +/* Mark it as empty. */
> +tx_port->port->last_used_qid = LAST_USED_QID_NONE;

'port' is a pointer to the common structure --> 'port->last_used_qid' will be
concurrently updated by all threads --> total mess.
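
One way to avoid the shared write would be to keep the state per-thread, in
the pmd-private tx_port rather than in the shared dp_netdev_port (a sketch
only; the exact field layout is my guess):

struct tx_port {
    struct dp_netdev_port *port;
    int qid;
    int last_used_qid;          /* Private to the owning pmd thread. */
    long long last_used;
    struct hmap_node node;
};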

> +}
> +}
> +}
> +}
> +
>  if (lc++ > 1024) {
>  bool reload;
>  
> @@ -3883,6 +3914,7 @@ dp_netdev_add_port_tx_to_pmd(struct 
> dp_netdev_pmd_thread *pmd,
>  
>  tx->port = port;
>  tx->qid = -1;
> +port->last_used_qid = LAST_USED_QID_NONE;
>  
>  hmap_insert(&pmd->tx_ports, &tx->node, hash_port_no(tx->port->port_no));
>  pmd->need_reload = true;
> @@ -4538

[ovs-dev] [PATCH] configuration.rst: Update the example of DPDK port's configuration

2017-01-17 Thread Binbin Xu
Since the hotplug of DPDK ports was introduced, a valid dpdk-devargs must be
specified. Otherwise, the DPDK device cannot be made available.

Signed-off-by: Binbin Xu 
---
 Documentation/faq/configuration.rst | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/Documentation/faq/configuration.rst 
b/Documentation/faq/configuration.rst
index c03d069..8bd0e11 100644
--- a/Documentation/faq/configuration.rst
+++ b/Documentation/faq/configuration.rst
@@ -107,12 +107,11 @@ Q: How do I configure a DPDK port as an access port?
 startup when other_config:dpdk-init is set to 'true'.
 
 Secondly, when adding a DPDK port, unlike a system port, the type for the
-interface must be specified. For example::
+interface and a valid dpdk-devargs must be specified. For example::
 
 $ ovs-vsctl add-br br0
-$ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
-
-Finally, it is required that DPDK port names begin with ``dpdk``.
+$ ovs-vsctl add-port br0 myportname -- set Interface myportname \
+type=dpdk options:dpdk-devargs=:06:00.0
 
 Refer to :doc:`/intro/install/dpdk` for more information on enabling and
 using DPDK with Open vSwitch.
-- 
1.8.3.1


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev