date:20160805

[ovs-dev] [PATCH V14] Function tracer to trace all function calls

2016-08-05 Thread nghosh

From: Nirapada Ghosh 

In some circumstances, we need to find out where in the code most
of the CPU time is being spent, so as to pinpoint the bottleneck and
resolve it with the proper changes. The '-finstrument-functions'
compiler flag makes that possible, and this patch does exactly that.

The patch includes a Python script [generate_ft_report.py] that
can convert the trace output to a human-readable format, showing
symbol names instead of addresses along with their execution
times. The script uses addr2line, which expects the executable to
be built with the -g flag.
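
A minimal sketch of the pairing step such a report needs, assuming a
hypothetical record format (direction, function address, caller address,
microsecond timestamp) and a single thread; the format actually written
by the patch is not shown in this message and may differ:

```python
from collections import defaultdict


def summarize_trace(lines):
    """Pair entry/exit records and sum inclusive time per function address.

    Assumed (hypothetical) record format, one record per line:
        <direction> <func_addr> <caller_addr> <timestamp_us>
    where <direction> is "entry" or "exit".
    """
    stack = []                  # open calls as (func_addr, entry_time)
    totals = defaultdict(int)   # func_addr -> inclusive microseconds
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue            # skip malformed records
        direction, func, _caller, stamp = fields
        stamp = int(stamp)
        if direction == "entry":
            stack.append((func, stamp))
        elif direction == "exit" and stack and stack[-1][0] == func:
            _, started = stack.pop()
            totals[func] += stamp - started
    return dict(totals)


trace = [
    "entry 0x4005d0 0x400700 100",
    "entry 0x4006a0 0x4005e4 120",
    "exit 0x4006a0 0x4005e4 150",
    "exit 0x4005d0 0x400700 200",
]
report = summarize_trace(trace)
```

Each address in the result could then be turned into a symbol name with
something like "addr2line -f -e <binary> <address>", which is why the
binary needs debug information.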

To enable this feature, ovs needs to be configured with
"--enable-ft" command line argument [i.e. configure --enable-ft]

This feature writes the tracing output to a log file, which is
set using the "ovs-appctl vlog/trace [filename]" command.
"ovs-appctl vlog/trace off" turns the logging off.

The feature uses the hooks that GNU C provides when code is compiled
with the -finstrument-functions flag; we just have to implement
them. This means that once you compile the code with the --enable-ft
option, function calls are always routed to the tracing routine.
In other words, even if tracing is disabled, the extra calls still
happen, though with very little CPU overhead, because they
return right away. The initialization call [constructor] happens
even before main() is invoked, so there is no way to completely disable
tracing when configured with --enable-ft. So, unless you intend to debug
a module in OVS, this option is best left off, which is the default.

It is intended to be used for debugging purposes only.

Signed-off-by: Nirapada Ghosh 
---
 configure.ac|  10 
 lib/vlog-unixctl.man|   9 +++
 lib/vlog.c  | 123 
 utilities/automake.mk   |   9 +++
 utilities/generate_ft_report.py |  83 +++
 utilities/ovs-appctl.8.in   |   8 +++
 utilities/ovs-appctl.c  |   5 ++
 7 files changed, 247 insertions(+)
 create mode 100644 utilities/generate_ft_report.py

diff --git a/configure.ac b/configure.ac
index 05d80d5..6eb2c1c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -28,6 +28,16 @@ AC_PROG_MKDIR_P
 AC_PROG_FGREP
 AC_PROG_EGREP
 
+AC_ARG_ENABLE([ft],
+  [AC_HELP_STRING([--enable-ft], [Turn on function tracing])],
+  [case "${enableval}" in
+(yes) ft=true ;;
+(no)  ft=false ;;
+(*) AC_MSG_ERROR([bad value ${enableval} for --enable-ft]) ;;
+  esac],
+  [ft=false])
+AM_CONDITIONAL([ENABLE_FT], [test x$ft = xtrue])
+
 AC_ARG_VAR([PERL], [path to Perl interpreter])
 AC_PATH_PROG([PERL], perl, no)
 if test "$PERL" = no; then
diff --git a/lib/vlog-unixctl.man b/lib/vlog-unixctl.man
index 7372a7e..35f9f06 100644
--- a/lib/vlog-unixctl.man
+++ b/lib/vlog-unixctl.man
@@ -4,6 +4,15 @@
 .  IP "\\$1"
 ..
 .SS "VLOG COMMANDS"
+This command is used to enable/disable function-tracing, available
+only when configured with --enable-ft and only with GNUC.
+.IP "\fBvlog/trace\fR [\fIfilename\fR]"
+Sets function tracing on or off. If "off" is passed, it
+turns off tracing for the module in question. Otherwise,
+\fIfilename\fR is the name of the trace log file and tracing will
+be turned on with this command automatically.
+.
+.PP
 These commands manage \fB\*(PN\fR's logging settings.
 .IP "\fBvlog/set\fR [\fIspec\fR]"
 Sets logging levels.  Without any \fIspec\fR, sets the log level for
diff --git a/lib/vlog.c b/lib/vlog.c
index 37b..95514fe 100644
--- a/lib/vlog.c
+++ b/lib/vlog.c
@@ -46,6 +46,26 @@
 
 VLOG_DEFINE_THIS_MODULE(vlog);
 
+#if defined(ENABLE_FT) && defined(__GNUC__)
+/* File pointer for logging trace output. */
+static FILE *ft_fp;
+/* Global flag holding current state of ft-enabled or not. */
+static bool ft_enabled = false;
+
+/* Prototypes for the functions declared/used in this file. */
+static void vlog_unixctl_set_ft(struct unixctl_conn *conn, int argc,
+const char *argv[], void *aux OVS_UNUSED);
+char * vlog_set_ft_log(const char *s_);
+void __attribute__ ((constructor,no_instrument_function)) ft_begin(void);
+void __attribute__ ((destructor,no_instrument_function)) ft_end(void);
+void __attribute__ ((no_instrument_function)) ft(const char * direction,
+ void *func, void * caller);
+void __attribute__ ((no_instrument_function)) __cyg_profile_func_enter(
+ void *func, void *caller);
+void __attribute__ ((no_instrument_function)) __cyg_profile_func_exit(
+ void *func, void *caller);
+#endif
+
 /* ovs_assert() logs the assertion message, so using ovs_assert() in this
  * source file could cause recursion. */
 #undef ovs_assert
@@ -467,6 +487,7 @@ vlog_change_owner_unix(uid_t user, gid_t group)
 }
 #endif
 
+
 /* Set debugging levels.  Returns null if successful, otherwise an error
  * message that

[ovs-dev] [PATCH v4 2/2] python: Add support for partial map and partial set updates

2016-08-05 Thread Ryan Moats

Allow the python IDL to use mutate operations more freely
by mimicking the partial map and partial set operations now
available in the C IDL.

Unit tests for both of these types of operations are included.
They are not carbon copies of the C tests, because testing
idempotency is a bit difficult for the current python IDL
test harness.
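
The idea behind a partial update can be sketched on plain Python sets
and dicts: the value a reader sees is the committed column value with
the pending inserts applied and the pending removes taken out. This is
a simplified illustration with hypothetical helper names, not the
ovs.db.data.Datum API used by the patch:

```python
def apply_set_mutations(current, inserts=None, removes=None):
    """Return the set a reader should see for a set column.

    'current' is the committed value; 'inserts' and 'removes' are the
    pending partial updates for the column (either may be None).
    """
    result = set(current)
    if inserts:
        result |= set(inserts)
    if removes:
        result -= set(removes)
    return result


def apply_map_mutations(current, inserts=None, removes=None):
    """Same idea for a map column: merge new pairs, then drop removed keys."""
    result = dict(current)
    if inserts:
        result.update(inserts)
    for key in (removes or ()):
        result.pop(key, None)   # removing an absent key is a no-op
    return result
```

Neither helper modifies its input, so the committed value stays intact
until the transaction is actually sent.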

Signed-off-by: Ryan Moats 
---
 python/ovs/db/idl.py | 196 ---
 tests/ovsdb-idl.at   |  30 
 tests/test-ovsdb.py  |  88 +++
 3 files changed, 303 insertions(+), 11 deletions(-)

diff --git a/python/ovs/db/idl.py b/python/ovs/db/idl.py
index 92a7382..6f376c7 100644
--- a/python/ovs/db/idl.py
+++ b/python/ovs/db/idl.py
@@ -18,6 +18,7 @@ import uuid
 import six
 
 import ovs.jsonrpc
+import ovs.db.data as data
 import ovs.db.parser
 import ovs.db.schema
 from ovs.db import error
@@ -588,8 +589,7 @@ class Idl(object):
 continue
 
 try:
-datum_diff = ovs.db.data.Datum.from_json(column.type,
- datum_diff_json)
+datum_diff = data.Datum.from_json(column.type, datum_diff_json)
 except error.Error as e:
 # XXX rate-limit
 vlog.warn("error parsing column %s in table %s: %s"
@@ -614,7 +614,7 @@ class Idl(object):
 continue
 
 try:
-datum = ovs.db.data.Datum.from_json(column.type, datum_json)
+datum = data.Datum.from_json(column.type, datum_json)
 except error.Error as e:
 # XXX rate-limit
 vlog.warn("error parsing column %s in table %s: %s"
@@ -730,6 +730,24 @@ class Row(object):
 #   - None, if this transaction deletes this row.
 self.__dict__["_changes"] = {}
 
+# _mutations describes changes to this row to be handled via a
+# mutate operation on the wire.  It takes the following values:
+#
+#   - {}, the empty dictionary, if no transaction is active or if the
+# row has yet not been mutated within this transaction.
+#
+#   - A dictionary that contains two keys:
+#
+# - "_inserts" contains a dictionary that maps column names to
+#   new keys/key-value pairs that should be inserted into the
+#   column
+# - "_removes" contains a dictionary that maps column names to
+#   the keys/key-value pairs that should be removed from the
+#   column
+#
+#   - None, if this transaction deletes this row.
+self.__dict__["_mutations"] = {}
+
 # A dictionary whose keys are the names of columns that must be
 # verified as prerequisites when the transaction commits.  The values
 # in the dictionary are all None.
@@ -750,17 +768,47 @@ class Row(object):
 
 def __getattr__(self, column_name):
 assert self._changes is not None
+assert self._mutations is not None
 
 datum = self._changes.get(column_name)
+inserts = None
+if '_inserts' in self._mutations.keys():
+inserts = self._mutations['_inserts'].get(column_name)
+removes = None
+if '_removes' in self._mutations.keys():
+removes = self._mutations['_removes'].get(column_name)
 if datum is None:
 if self._data is None:
-raise AttributeError("%s instance has no attribute '%s'" %
- (self.__class__.__name__, column_name))
+if inserts is None:
+raise AttributeError("%s instance has no attribute '%s'" %
+ (self.__class__.__name__,
+  column_name))
+else:
+datum = inserts
 if column_name in self._data:
 datum = self._data[column_name]
+try:
+if inserts is not None:
+datum.extend(inserts)
+if removes is not None:
+datum = [x for x in datum if x not in removes]
+except error.Error:
+pass
+try:
+if inserts is not None:
+datum.merge(inserts)
+if removes is not None:
+for key in removes.keys():
+del datum[key]
+except error.Error:
+pass
 else:
-raise AttributeError("%s instance has no attribute '%s'" %
- (self.__class__.__name__, column_name))
+if inserts is None:
+raise AttributeError("%s instance has no attribute '%s'" %
+

[ovs-dev] [PATCH v4 1/2] ovsdb: Add/use partial set updates.

2016-08-05 Thread Ryan Moats

This patchset mimics the changes introduced in

  f199df26 (ovsdb-idl: Add partial map updates functionality.)
  010fe7ae (ovsdb-idlc.in: Autogenerate partial map updates functions.)
  7251075c (tests: Add test for partial map updates.)

but for columns that store sets of values rather than key-value
pairs.  These columns will now be able to use the OVSDB mutate
operation to transmit deltas on the wire rather than use
verify/update and transmit wait/update operations on the wire.
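
As a rough illustration of what "transmit deltas on the wire" means, the
sketch below builds an OVSDB "mutate" operation in the JSON shape defined
by RFC 7047. The function name, table name and column name are
hypothetical and chosen only for the example:

```python
import json


def set_mutate_op(table, row_uuid, column, inserts=(), deletes=()):
    """Build an OVSDB "mutate" operation carrying only the set delta."""
    mutations = []
    if inserts:
        mutations.append([column, "insert", ["set", list(inserts)]])
    if deletes:
        mutations.append([column, "delete", ["set", list(deletes)]])
    return {
        "op": "mutate",
        "table": table,
        "where": [["_uuid", "==", ["uuid", row_uuid]]],
        "mutations": mutations,
    }


# Hypothetical table/column names and an arbitrary example UUID.
op = set_mutate_op("my_table", "550e8400-e29b-41d4-a716-446655440000",
                   "my_set", inserts=[5], deletes=[2])
wire = json.dumps(op, sort_keys=True)
```

Only the elements being added or removed appear in the request; the rest
of the column's contents never has to be sent.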

Side effect of modifying the comments in the partial map update
tests.

Signed-off-by: Ryan Moats 
---
 lib/automake.mk  |   2 +
 lib/ovsdb-idl-provider.h |   3 +
 lib/ovsdb-idl.c  | 390 +++
 lib/ovsdb-idl.h  |   6 +
 lib/ovsdb-set-op.c   | 170 +
 lib/ovsdb-set-op.h   |  44 ++
 ovsdb/ovsdb-idlc.in  |  65 +++-
 tests/idltest.ovsschema  |  30 
 tests/idltest2.ovsschema |  30 
 tests/ovsdb-idl.at   |  36 +
 tests/test-ovsdb.c   | 137 -
 11 files changed, 806 insertions(+), 107 deletions(-)
 create mode 100644 lib/ovsdb-set-op.c
 create mode 100644 lib/ovsdb-set-op.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 97c83e9..30a281f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -187,6 +187,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/ovsdb-idl.h \
lib/ovsdb-map-op.c \
lib/ovsdb-map-op.h \
+   lib/ovsdb-set-op.c \
+   lib/ovsdb-set-op.h \
lib/ovsdb-condition.h \
lib/ovsdb-condition.c \
lib/ovsdb-parser.c \
diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
index 55ed793..64e8ec3 100644
--- a/lib/ovsdb-idl-provider.h
+++ b/lib/ovsdb-idl-provider.h
@@ -20,6 +20,7 @@
 #include "openvswitch/list.h"
 #include "ovsdb-idl.h"
 #include "ovsdb-map-op.h"
+#include "ovsdb-set-op.h"
 #include "ovsdb-types.h"
 #include "openvswitch/shash.h"
 #include "uuid.h"
@@ -39,6 +40,8 @@ struct ovsdb_idl_row {
 struct hmap_node txn_node;  /* Node in ovsdb_idl_txn's list. */
 unsigned long int *map_op_written; /* Bitmap of columns pending map ops. */
 struct map_op_list **map_op_lists; /* Per-column map operations. */
+unsigned long int *set_op_written; /* Bitmap of columns pending set ops. */
+struct set_op_list **set_op_lists; /* Per-column set operations. */
 
 /* Tracking data */
 unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index d70fb10..691f3bf 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -184,6 +184,7 @@ static struct ovsdb_idl_row *ovsdb_idl_row_create(struct ovsdb_idl_table *,
 static void ovsdb_idl_row_destroy(struct ovsdb_idl_row *);
 static void ovsdb_idl_row_destroy_postprocess(struct ovsdb_idl *);
 static void ovsdb_idl_destroy_all_map_op_lists(struct ovsdb_idl_row *);
+static void ovsdb_idl_destroy_all_set_op_lists(struct ovsdb_idl_row *);
 
 static void ovsdb_idl_row_parse(struct ovsdb_idl_row *);
 static void ovsdb_idl_row_unparse(struct ovsdb_idl_row *);
@@ -200,6 +201,10 @@ static void ovsdb_idl_txn_add_map_op(struct ovsdb_idl_row *,
  const struct ovsdb_idl_column *,
  struct ovsdb_datum *,
  enum map_op_type);
+static void ovsdb_idl_txn_add_set_op(struct ovsdb_idl_row *,
+ const struct ovsdb_idl_column *,
+ struct ovsdb_datum *,
+ enum set_op_type);
 
 static void ovsdb_idl_send_lock_request(struct ovsdb_idl *);
 static void ovsdb_idl_send_unlock_request(struct ovsdb_idl *);
@@ -1811,7 +1816,9 @@ ovsdb_idl_row_create(struct ovsdb_idl_table *table, const struct uuid *uuid)
 row->uuid = *uuid;
 row->table = table;
 row->map_op_written = NULL;
 row->map_op_lists = NULL;
+row->set_op_written = NULL;
+row->set_op_lists = NULL;
 return row;
 }
 
@@ -1822,6 +1829,7 @@ ovsdb_idl_row_destroy(struct ovsdb_idl_row *row)
 ovsdb_idl_row_clear_old(row);
 hmap_remove(&row->table->rows, &row->hmap_node);
 ovsdb_idl_destroy_all_map_op_lists(row);
+ovsdb_idl_destroy_all_set_op_lists(row);
 if (ovsdb_idl_track_is_set(row->table)) {
 row->change_seqno[OVSDB_IDL_CHANGE_DELETE]
 = row->table->change_seqno[OVSDB_IDL_CHANGE_DELETE]
@@ -1856,6 +1864,27 @@ ovsdb_idl_destroy_all_map_op_lists(struct ovsdb_idl_row *row)
 }
 
 static void
+ovsdb_idl_destroy_all_set_op_lists(struct ovsdb_idl_row *row)
+{
+if (row->set_op_written) {
+/* Clear Set Operation Lists */
+size_t idx, n_columns;
+const struct ovsdb_idl_column *columns;
+const struct ovsdb_type *type;
+n_columns = row->table->class->n_columns;
+columns = row->table->class->columns;
+

[ovs-dev] [PATCH v4 0/2] Partial set operations and Python IDL update

2016-08-05 Thread Ryan Moats

This patch set adds partial set updates and updates the Python IDL
to support partial map and partial set operations. The python unit
tests are not a complete carbon copy of their C brethren as the
Python IDL test harness does not appear to handle idempotency testing
and dumping of a map to string ends up dumping an unordered list
leading to sporadic test failures (suggestions for how to fix this
will be appreciated).
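
One possible way to make such a dump deterministic, sketched here with a
hypothetical helper and an arbitrary output format rather than the test
harness's real one, is to sort the entries by key before formatting:

```python
def dump_map(mapping):
    """Render a map as a string with its entries in sorted key order."""
    pairs = ("%s=%s" % (key, value) for key, value in sorted(mapping.items()))
    return "{" + ", ".join(pairs) + "}"
```

With sorted keys the expected test output no longer depends on dictionary
iteration order.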

v3->v4:
  Fixed an issue with idltest2.ovsschema that will lead to
unit test case failure

v2->v3:
  Fixed some whitespace errors in patch

v1->v2:
  Removed RFC designation for ovsdb patch
  Added python patch

Ryan Moats (2):
  ovsdb: Add/use partial set updates.
  python: Add support for partial map and partial set updates

 lib/automake.mk  |   2 +
 lib/ovsdb-idl-provider.h |   3 +
 lib/ovsdb-idl.c  | 390 +++
 lib/ovsdb-idl.h  |   6 +
 lib/ovsdb-set-op.c   | 170 +
 lib/ovsdb-set-op.h   |  44 ++
 ovsdb/ovsdb-idlc.in  |  65 +++-
 python/ovs/db/idl.py | 196 ++--
 tests/idltest.ovsschema  |  30 
 tests/idltest2.ovsschema |  30 
 tests/ovsdb-idl.at   |  66 
 tests/test-ovsdb.c   | 137 -
 tests/test-ovsdb.py  |  88 +++
 13 files changed, 1109 insertions(+), 118 deletions(-)
 create mode 100644 lib/ovsdb-set-op.c
 create mode 100644 lib/ovsdb-set-op.h

-- 
2.7.4 (Apple Git-66)


Ryan Moats (2):
  ovsdb: Add/use partial set updates.
  python: Add support for partial map and partial set updates

 lib/automake.mk  |   2 +
 lib/ovsdb-idl-provider.h |   3 +
 lib/ovsdb-idl.c  | 390 +++
 lib/ovsdb-idl.h  |   6 +
 lib/ovsdb-set-op.c   | 170 +
 lib/ovsdb-set-op.h   |  44 ++
 ovsdb/ovsdb-idlc.in  |  65 +++-
 python/ovs/db/idl.py | 196 ++--
 tests/idltest.ovsschema  |  30 
 tests/ovsdb-idl.at   |  66 
 tests/test-ovsdb.c   | 137 -
 tests/test-ovsdb.py  |  88 +++
 12 files changed, 1079 insertions(+), 118 deletions(-)
 create mode 100644 lib/ovsdb-set-op.c
 create mode 100644 lib/ovsdb-set-op.h

-- 
2.7.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH v2 3/3] netdev-dpdk: vHost client mode and reconnect

2016-08-05 Thread Daniele Di Proietto

The patch mostly looks good to me, thanks.

I'm not 100% sure about the interface.  Can we make the flag interface
specific?

If I'm not mistaken we currently limit vhost-sock-dir to be under OVS
rundir.  With client mode this is not necessary anymore.

I hope that client will be made the default mode at some point, I think we
should keep that in mind when considering the interface.

Since we're planning to break compatibility with the dpdk phy naming
change, maybe we can break compatibility also with vhost ports and add a
path option.

Thoughts?

Daniele

2016-08-04 7:09 GMT-07:00 Ciara Loftus :

> A new other_config DB option has been added called 'vhost-driver-mode'.
> By default this is set to 'server' which is the mode of operation OVS
> with DPDK has used up until this point - whereby OVS creates and manages
> vHost user sockets.
>
> If set to 'client', OVS will act as the vHost client and connect to
> sockets created and managed by QEMU which acts as the server. This mode
> allows for reconnect capability, which allows vHost ports to resume
> normal connectivity in event of switch reset.
>
> QEMU v2.7.0+ is required when using OVS in client mode and QEMU in
> server mode.
>
> Signed-off-by: Ciara Loftus 
> ---
> v2
> - Updated comments in vhost construct & destruct
> - Add check for server-mode before printing error when destruct is called
>   on a running VM
> - Fixed coding style/standards issues
> - Use strcmp instead of strncmp when processing 'vhost-driver-mode'
>
>  INSTALL.DPDK-ADVANCED.md | 27 +++
>  NEWS |  1 +
>  lib/netdev-dpdk.c| 31 +++
>  vswitchd/vswitch.xml | 13 +
>  4 files changed, 64 insertions(+), 8 deletions(-)
>
> diff --git a/INSTALL.DPDK-ADVANCED.md b/INSTALL.DPDK-ADVANCED.md
> index f9587b5..a773533 100755
> --- a/INSTALL.DPDK-ADVANCED.md
> +++ b/INSTALL.DPDK-ADVANCED.md
> @@ -483,6 +483,33 @@ For users wanting to do packet forwarding using kernel stack below are the steps
> where `-L`: Changes the numbers of channels of the specified network device
> and `combined`: Changes the number of multi-purpose channels.
>
> +4. Enable OVS vHost client-mode & vHost reconnect (OPTIONAL)
> +
> +   By default, OVS DPDK acts as the vHost socket server and QEMU the
> +   client. In QEMU v2.7 the option is available for QEMU to act as the
> +   server. In order for this to work, OVS DPDK must be switched to 'client'
> +   mode. This is possible by setting the 'vhost-driver-mode' DB entry to
> +   'client' like so:
> +
> +   ```
> +   ovs-vsctl set Open_vSwitch . other_config:vhost-driver-mode="client"
> +   ```
> +
> +   This must be done before the switch is launched. It cannot successfully
> +   be changed after switch has launched.
> +
> +   One must also append ',server' to the 'chardev' arguments on the QEMU
> +   command line, to instruct QEMU to use vHost server mode, like so:
> +
> +   
> +   -chardev socket,id=char0,path=/usr/local/var/run/openvswitch/vhost0,server
> +   
> +
> +   One benefit of using this mode is the ability for vHost ports to
> +   'reconnect' in event of the switch crashing or being brought down. Once
> +   it is brought back up, the vHost ports will reconnect automatically and
> +   normal service will resume.
> +
>- VM Configuration with libvirt
>
>  * change the user/group, access control policty and restart libvirtd.
> diff --git a/NEWS b/NEWS
> index 9f09e1c..99412ba 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -70,6 +70,7 @@ Post-v2.5.0
> fragmentation or NAT support yet)
>   * Support for DPDK 16.07
>   * Remove dpdkvhostcuse port type.
> + * OVS client mode for vHost and vHost reconnect (Requires QEMU 2.7)
> - Increase number of registers to 16.
> - ovs-benchmark: This utility has been removed due to lack of use and
>   bitrot.
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 7692cc8..39c448b 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -136,7 +136,8 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF))
>  #define OVS_VHOST_QUEUE_DISABLED (-2) /* Queue was disabled by guest and not
>   * yet mapped to another queue. */
>
> -static char *vhost_sock_dir = NULL;   /* Location of vhost-user sockets */
> +static char *vhost_sock_dir = NULL; /* Location of vhost-user sockets */
> +static uint64_t vhost_driver_flags = 0; /* Denote whether client/server mode */
>
>  #define VHOST_ENQ_RETRY_NUM 8
>  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
> @@ -833,7 +834,6 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev)
>  struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>  const char *name = netdev->name;
>  int err;
>

[ovs-dev] [RFC] Introducing experimental OVN integration for Mesos

2016-08-05 Thread Nimay Desai

This commit introduces experimental support for OVN integration with Apache
Mesos.  It is experimental because the network pluggability infrastructure for
Mesos is being continuously developed in the Mesos master branch.  Mesos does
not yet have all the components necessary to allow usage of OVN as a complete
container networking solution.  Mainly, it lacks the port mapping
infrastructure to support North to South connectivity.

With this commit, you can have clean East-West and South-North connectivity.
INSTALL.Mesos.md includes instructions on how to use the setup scripts to
create an OVN overlay network and attach Mesos nodes to the network.  It also
includes instructions on how to set up the plugin and start Mesos so that
containers automatically connect to the OVN network upon creation.

Signed-off-by: Nimay Desai 
---
 INSTALL.Mesos.md   | 176 +++
 Makefile.am|   1 +
 ovn/utilities/automake.mk  |   8 +-
 ovn/utilities/ovn-mesos-overlay-driver | 182 +
 ovn/utilities/ovn-mesos-plugin | 105 +++
 python/automake.mk |  15 ++-
 python/ovn/__init__.py |   1 +
 python/ovn/mesos/__init__.py   |   1 +
 python/ovn/mesos/ovnutil.py|  71 +
 9 files changed, 554 insertions(+), 6 deletions(-)
 create mode 100644 INSTALL.Mesos.md
 create mode 100755 ovn/utilities/ovn-mesos-overlay-driver
 create mode 100755 ovn/utilities/ovn-mesos-plugin
 create mode 100644 python/ovn/__init__.py
 create mode 100644 python/ovn/mesos/__init__.py
 create mode 100644 python/ovn/mesos/ovnutil.py

diff --git a/INSTALL.Mesos.md b/INSTALL.Mesos.md
new file mode 100644
index 000..dbeee81
--- /dev/null
+++ b/INSTALL.Mesos.md
@@ -0,0 +1,176 @@
+How to Use Open vSwitch with Apache Mesos
+=
+
+This document describes how to use Open Virtual Networking with Apache Mesos .
+This document assumes that you installed Open vSwitch by following [INSTALL.md]
+or by using the distribution packages such as .deb or .rpm.  Consult
+www.mesos.apache.org for instructions on how to install Mesos.
+
+
+Setup
+=
+
+* Start the central components.
+
+OVN architecture has a central component which stores your networking intent
+in a database.  On one of your machines, with an IP Address of $CENTRAL_IP,
+where you have installed and started Open vSwitch, you will need to start some
+central components.
+
+Start ovn-northd daemon.  This daemon translates networking intent from Mesos
+stored in the OVN_Northbound database to logical flows in OVN_Southbound
+database.  It is also responsible for managing and dynamically allocating
+IP/MAC addresses for Mesos containers.
+
+```
+/usr/share/openvswitch/scripts/ovn-ctl start_northd
+```
+
+* One time setup.
+
+On each host, where you plan to spawn your containers, you will need to
+run the following command once.  (You need to run it again if your OVS database
+gets cleared.  It is harmless to run it again in any case.)
+
+$LOCAL_IP in the below command is the IP address via which other hosts
+can reach this host.  This acts as your local tunnel endpoint.
+
+$ENCAP_TYPE is the type of tunnel that you would like to use for overlay
+networking.  The options are "geneve" or "stt".  (Please note that your
+kernel should have support for your chosen $ENCAP_TYPE.  Both geneve
+and stt are part of the Open vSwitch kernel module that is compiled from this
+repo.  If you use the Open vSwitch kernel module from upstream Linux,
+you will need a minimum kernel version of 3.18 for geneve.  There is no stt
+support in upstream Linux.  You can verify whether you have the support in your
+kernel by doing a "lsmod | grep $ENCAP_TYPE".)
+
+```
+ovs-vsctl set Open_vSwitch . external_ids:ovn-remote="tcp:$CENTRAL_IP:6641" \
+  external_ids:ovn-nb="tcp:$CENTRAL_IP:6641" external_ids:ovn-encap-ip=$LOCAL_IP external_ids:ovn-encap-type="$ENCAP_TYPE"
+```
+
+And finally, start the ovn-controller.  (You need to run the below command
+on every boot)
+
+```
+/usr/share/openvswitch/scripts/ovn-ctl start_controller
+```
+
+* Initialize the OVN network using the OVN network driver.
+
+Run the OVN network driver with the "plugin-init" subcommand once on any host.
+Running "ovn-nbctl show" should now display a single logical router called
+"mesos-router."
+
+```
+PYTHONPATH=$OVS_PYTHON_LIBS_PATH ovn-mesos-overlay-driver plugin-init
+```
+
+* Add each of the hosts to the OVN network.
+
+On each host where you will have a Mesos agent/master running, run the
+OVN network driver with the "node-init" subcommand. $SUBNET is the subnet
+(e.g. 172.16.1.0/24) of your host, $CLUSTER_SUBNET is the subnet of your entire
+Mesos cluster (e.g. 172.16.0.0/16), gateway will be the IPv4 address of your
+host's router port (e.g. 172.16.1.1/24), and $PATH_TO_CNI_CONFIG_DIR is the
+absolute path

Re: [ovs-dev] [PATCH 1/3] system-userspace-macros: Check the exit code of ethtool.

2016-08-05 Thread Daniele Di Proietto






On 05/08/2016 11:16, "Joe Stringer"  wrote:

>On 4 August 2016 at 18:40, Daniele Di Proietto  wrote:
>> If the ethtool command is not available on the system we should fail,
>> since the userspace testsuite cannot work properly without disabling
>> offloads.
>>
>> Also, add ethtool to the list of installed packages on Vagrantfile.
>>
>> Fixes: ddcf96d2dcc1 ("system-tests: Disable offloads in userspace tests.")
>> Reported-by: Joe Stringer 
>> Signed-off-by: Daniele Di Proietto 
>
>Thanks, this should make it more obvious when offloads are causing failures.
>
>The commit message doesn't really explain why the vagrantfile change
>is in the same commit; a simple mention that it's being added here 'to
>ensure that offloads don't cause test failures in the vagrant VM when
>the kernel is updated' would make it more clear why these two changes
>are in the same commit.

Ok

>
>Acked-by: Joe Stringer 

Thanks! I applied this and the next patch to master

Re: [ovs-dev] [PATCH 3/3] check-kernel: Remove '-d' from TESTSUITEFLAGS.

2016-08-05 Thread Daniele Di Proietto






On 05/08/2016 10:18, "Andy Zhou"  wrote:

>
>
>On Thu, Aug 4, 2016 at 6:43 PM, Daniele Di Proietto 
> wrote:
>
>The '-d' flag tells autotest to always keep the testcase output, but
>prevents '--recheck' from working.  If a user wants to always keep the
>output from the tests, the '-d' flag can be passed explicitly.  This is
>more in line with other test make target ('check',
>'check-system-userspace').
>
>CC: Andy Zhou 
>Signed-off-by: Daniele Di Proietto 
>---
> tests/automake.mk  | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/tests/automake.mk b/tests/automake.mk
>index a9ebf91..5d12ae5 100644
>--- a/tests/automake.mk
>+++ b/tests/automake.mk
>@@ -243,7 +243,7 @@ EXTRA_DIST += tests/run-ryu
>
> # Run kmod tests. Assume kernel modules has been installed or linked into the kernel
> check-kernel: all tests/atconfig tests/atlocal $(SYSTEM_KMOD_TESTSUITE)
>-   $(SHELL) '$(SYSTEM_KMOD_TESTSUITE)' -C tests  AUTOTEST_PATH='$(AUTOTEST_PATH)' -d $(TESTSUITEFLAGS) -j1
>+   $(SHELL) '$(SYSTEM_KMOD_TESTSUITE)' -C tests  AUTOTEST_PATH='$(AUTOTEST_PATH)' $(TESTSUITEFLAGS) -j1
> # Testing the out of tree Kernel module
> check-kmod: all tests/atconfig tests/atlocal $(SYSTEM_KMOD_TESTSUITE)
>
>
>
>
>LGTM
>Acked-by: Andy Zhou 

Thanks, pushed to master

Re: [ovs-dev] [ovs-dev,V2] netdev-dpdk: fix memory leak

2016-08-05 Thread Daniele Di Proietto

Thanks for the report, I didn't realize that the callback could come in the
same thread.

I sent a patch that I believe should fix the deadlock here:

http://openvswitch.org/pipermail/dev/2016-August/077315.html
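
The failure mode discussed in this thread can be sketched in a few lines:
a callback that takes a non-reentrant lock is invoked synchronously by
code that already holds the same lock. The function names below are
stand-ins for illustration only, and the second variant shows one generic
way to avoid the problem, not necessarily the approach taken in the
linked patch:

```python
import threading

dpdk_mutex = threading.Lock()   # non-reentrant, like a plain mutex


def destroy_device():
    """Callback that needs the lock; reports whether it could take it."""
    if not dpdk_mutex.acquire(blocking=False):
        return "would deadlock"     # the calling thread already holds it
    try:
        return "ok"
    finally:
        dpdk_mutex.release()


def destruct_holding_lock():
    """Calls the callback in the same thread while holding the lock."""
    with dpdk_mutex:
        return destroy_device()


def destruct_without_lock():
    """Runs the callback first, then takes the lock for the remaining work."""
    result = destroy_device()
    with dpdk_mutex:
        pass                        # remaining teardown would go here
    return result
```

The non-blocking acquire is used only so the example reports the problem
instead of hanging; a blocking acquire in the first variant would never
return.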

2016-08-05 7:48 GMT-07:00 Ilya Maximets :

> On 04.08.2016 12:49, Mark Kavanagh wrote:
> > DPDK v16.07 introduces the ability to free memzones.
> > Up until this point, DPDK memory pools created in OVS could
> > not be destroyed, thus incurring a memory leak.
> >
> > Leverage the DPDK v16.07 rte_mempool API to free DPDK
> > mempools when their associated reference count reaches 0 (this
> > indicates that the memory pool is no longer in use).
> >
> > Signed-off-by: Mark Kavanagh 
> > ---
> >
> > v2->v1: rebase to head of master, and remove 'RFC' tag
> >
> >  lib/netdev-dpdk.c | 29 +++--
> >  1 file changed, 15 insertions(+), 14 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index aaac0d1..ffcd35c 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -506,7 +506,7 @@ dpdk_mp_get(int socket_id, int mtu)
> OVS_REQUIRES(dpdk_mutex)
> >  }
> >
> >  static void
> > -dpdk_mp_put(struct dpdk_mp *dmp)
> > +dpdk_mp_put(struct dpdk_mp *dmp) OVS_REQUIRES(dpdk_mutex)
> >  {
> >
> >  if (!dmp) {
> > @@ -514,15 +514,12 @@ dpdk_mp_put(struct dpdk_mp *dmp)
> >  }
> >
> >  dmp->refcount--;
> > -ovs_assert(dmp->refcount >= 0);
> >
> > -#if 0
> > -/* I could not find any API to destroy mp. */
> > -if (dmp->refcount == 0) {
> > -list_delete(dmp->list_node);
> > -/* destroy mp-pool. */
> > -}
> > -#endif
> > +if (OVS_UNLIKELY(!dmp->refcount)) {
> > +ovs_list_remove(&dmp->list_node);
> > +rte_mempool_free(dmp->mp);
> > + }
> > +
> >  }
> >
> >  static void
> > @@ -928,16 +925,18 @@ netdev_dpdk_destruct(struct netdev *netdev)
> >  {
> >  struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >
> > +ovs_mutex_lock(&dpdk_mutex);
> >  ovs_mutex_lock(&dev->mutex);
> > +
> >  rte_eth_dev_stop(dev->port_id);
> >  free(ovsrcu_get_protected(struct ingress_policer *,
> >    &dev->ingress_policer));
> > -ovs_mutex_unlock(&dev->mutex);
> >
> > -ovs_mutex_lock(&dpdk_mutex);
> >  rte_free(dev->tx_q);
> >  ovs_list_remove(&dev->list_node);
> >  dpdk_mp_put(dev->dpdk_mp);
> > +
> > +ovs_mutex_unlock(&dev->mutex);
> >  ovs_mutex_unlock(&dpdk_mutex);
> >  }
> >
> > @@ -946,6 +945,9 @@ netdev_dpdk_vhost_destruct(struct netdev *netdev)
> >  {
> >  struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >
> > +ovs_mutex_lock(&dpdk_mutex);
> > +ovs_mutex_lock(&dev->mutex);
> > +
> >  /* Guest becomes an orphan if still attached. */
> >  if (netdev_dpdk_get_vid(dev) >= 0) {
> >  VLOG_ERR("Removing port '%s' while vhost device still attached.",
> > @@ -961,15 +963,14 @@ netdev_dpdk_vhost_destruct(struct netdev *netdev)
> >  fatal_signal_remove_file_to_unlink(dev->vhost_id);
> >  }
> >
> > -ovs_mutex_lock(&dev->mutex);
> >  free(ovsrcu_get_protected(struct ingress_policer *,
> >    &dev->ingress_policer));
> > -ovs_mutex_unlock(&dev->mutex);
> >
> > -ovs_mutex_lock(&dpdk_mutex);
> >  rte_free(dev->tx_q);
> >  ovs_list_remove(&dev->list_node);
> >  dpdk_mp_put(dev->dpdk_mp);
> > +
> > +ovs_mutex_unlock(&dev->mutex);
> >  ovs_mutex_unlock(&dpdk_mutex);
> >  }
>
> I agree that the locking here was wrong, but this change introduces an issue
> because 'rte_vhost_driver_unregister()' may call 'destroy_device()', and OVS
> will abort on the attempt to lock 'dpdk_mutex' again:
>
> VHOST_CONFIG: free connfd = 37 for device '/vhost1'
> ovs-vswitchd: lib/netdev-dpdk.c:2305: pthread_mutex_lock failed (Resource deadlock avoided)
>
> Program received signal SIGABRT, Aborted.
> 0x007fb7ad6d38 in raise () from /lib64/libc.so.6
> (gdb) bt
> #0  0x007fb7ad6d38 in raise () from /lib64/libc.so.6
> #1  0x007fb7ad8aa8 in abort () from /lib64/libc.so.6
> #2  0x00692be0 in ovs_abort_valist at lib/util.c:335
> #3  0x00692ba0 in ovs_abort at lib/util.c:327
> #4  0x00651800 in ovs_mutex_lock_at (l_=0x899ab0 ,
> where=0x78a458 "lib/netdev-dpdk.c:2305") at lib/ovs-thread.c:76
> #5  0x006c0190 in destroy_device (vid=0) at lib/netdev-dpdk.c:2305
> #6  0x004ea850 in vhost_destroy_device ()
> #7  0x004ee578 in rte_vhost_driver_unregister ()
> #8  0x006bc8c8 in netdev_dpdk_vhost_destruct (netdev=0x7f6bffed00)
> at lib/netdev-dpdk.c:944
> #9  0x005e4ad4 in netdev_unref (dev=0x7f6bffed00) at
> lib/netdev.c:499
> #10 0x005e4b9c in netdev_close (netdev=0x7f6bffed00) at
> lib/netdev.c:523
> [...]
> #20 0x0053ad94 in main (argc=7, argv=0x7ff318) at
> vswitchd/ovs-vswitchd.c:112
>
> May be reproduced by removing port while virtio still attached.
> This blocks reconnection feature and deletion of port while QEMU still
>

Re: [ovs-dev] [PATCH] dpcls_lookup: added comments.

2016-08-05 Thread Jarno Rajahalme

Acked-by: Jarno Rajahalme 

Pushed to master with a rebase and minor edits.

  Jarno


> On Aug 5, 2016, at 6:40 AM, antonio.fische...@intel.com wrote:
> 
> This patch adds some comments to the dpcls_lookup() function,
> which is one of the most important places where the Userspace
> wildcard matching happens.
> The purpose is to give some more explanations on its design
> and also on how it works.
> 
> Signed-off-by: Antonio Fischetti 
> ---
> lib/dpif-netdev.c | 40 ++--
> 1 file changed, 34 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index e0107b7..a390758 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -4492,8 +4492,8 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule,
> return true;
> }
> 
> -/* For each miniflow in 'flows' performs a classifier lookup writing the result
> - * into the corresponding slot in 'rules'.  If a particular entry in 'flows' is
> +/* For each miniflow in 'keys' performs a classifier lookup writing the result
> + * into the corresponding slot in 'rules'.  If a particular entry in 'keys' is
>  * NULL it is skipped.
>  *
>  * This function is optimized for use in the userspace datapath and therefore
> @@ -4501,12 +4501,15 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule,
>  * classifier_lookup() function.  Specifically, it does not implement
>  * priorities, instead returning any rule which matches the flow.
>  *
> - * Returns true if all flows found a corresponding rule. */
> + * Returns true if all miniflows found a corresponding rule. */
> static bool
> dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
>  struct dpcls_rule **rules, const size_t cnt)
> {
> -/* The batch size 16 was experimentally found faster than 8 or 32. */
> +/* The received 'cnt' miniflows are the search-keys that will be processed
> + * in batches of 16 elements.  N_MAPS will contain the number of these
> + * 16-elements batches.  i.e. for 'cnt'=32, N_MAPS shall be 2.
> + * The batch size 16 was experimentally found faster than 8 or 32. */
> typedef uint16_t map_type;
> #define MAP_BITS (sizeof(map_type) * CHAR_BIT)
> 
> @@ -4524,6 +4527,16 @@ dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
> }
> memset(rules, 0, cnt * sizeof *rules);
> 
> +/* The Datapath classifier - aka dpcls - is composed of subtables.
> + * They are dynamically created depending on the new rules we need to
> + * cache.
> + * Each subtable collects rules with a certain subset of packet fields and
> + * with a given unique mask.
> + * We need to process every search-key against each subtable.
> + * When an entry is found the search can stop because rules are
> + * non-overlapping by nature.
> + * The next macro loops on the current subtables listed into the
> + * 'cls->subtables' pvector. */
> PVECTOR_FOR_EACH (subtable, &cls->subtables) {
> const struct netdev_flow_key *mkeys = keys;
> struct dpcls_rule **mrules = rules;
> @@ -4532,6 +4545,7 @@ dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
> 
> BUILD_ASSERT_DECL(sizeof remains == sizeof *maps);
> 
> +/* Loops on each batch of 16 search-keys. */
> for (m = 0; m < N_MAPS; m++, mkeys += MAP_BITS, mrules += MAP_BITS) {
> uint32_t hashes[MAP_BITS];
> const struct cmap_node *nodes[MAP_BITS];
> @@ -4542,14 +4556,25 @@ dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
> continue; /* Skip empty maps. */
> }
> 
> -/* Compute hashes for the remaining keys. */
> +/* Compute hashes for the remaining keys.
> + * Beside the search-key we need to pass also the specific mask
> + * of the current subtable, because we are using Hash tables for
> + * a wildcard match.
> + * The mask will be applied to the search-key before computing the
> + * Hash value. */
> ULLONG_FOR_EACH_1(i, map) {
> hashes[i] = netdev_flow_key_hash_in_mask(&mkeys[i],
>  &subtable->mask);
> }
> /* Lookup. */
> map = cmap_find_batch(&subtable->rules, map, hashes, nodes);
> -/* Check results. */
> +/* Check results.
> + * When the i-th bit of map is set, it means that a Hash entry
> + * was found for the i-th search-key.  Considering how Hash
> + * mechanism works, we still need to check that the found entry
> + * really matches our masked search-key.  Otherwise we will loop on
> + * the linked nodes - which will be present if any collision
> + * occurred - to repeat the check for a match. */
>
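
The lookup scheme described in the comments above (one hash table per
subtable, keyed by the masked search-key, stopping at the first match) can
be sketched roughly as follows.  This is an illustrative Python model only,
not the OVS implementation: Subtable, apply_mask, n_maps and the tuple-based
keys are hypothetical names, and the round-up in n_maps is an assumption
consistent with the "'cnt'=32 gives N_MAPS 2" remark.

```python
MAP_BITS = 16  # batch size named in the patch comments


def n_maps(cnt):
    """Number of 16-key batches for 'cnt' search-keys (round-up assumed)."""
    return (cnt + MAP_BITS - 1) // MAP_BITS


def apply_mask(key, mask):
    """Zero every field bit that the subtable's mask wildcards."""
    return tuple(k & m for k, m in zip(key, mask))


class Subtable:
    """All rules sharing one mask, stored in a hash table."""

    def __init__(self, mask):
        self.mask = mask
        self.rules = {}  # hash of masked key -> list of (masked_key, rule)

    def insert(self, key, rule):
        masked = apply_mask(key, self.mask)
        self.rules.setdefault(hash(masked), []).append((masked, rule))

    def lookup(self, key):
        masked = apply_mask(key, self.mask)
        # A hash hit is not yet a match: walk the colliding nodes and
        # compare the masked key itself.
        for stored, rule in self.rules.get(hash(masked), []):
            if stored == masked:
                return rule
        return None


def dpcls_lookup(subtables, keys):
    """Return (rules, all_found); rules[i] is None when keys[i] missed."""
    rules = [None] * len(keys)
    for subtable in subtables:
        for i, key in enumerate(keys):
            if rules[i] is None:  # keys that already matched are skipped
                rules[i] = subtable.lookup(key)
    return rules, all(r is not None for r in rules)
```

The mask is applied before hashing, which is what lets an exact-match hash
table serve a wildcard lookup; the equality check after the hash hit covers
the collision case mentioned in the "Check results" comment.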

[ovs-dev] [PATCH] netdev-dpdk: Fix deadlock in destroy_device().

2016-08-05 Thread Daniele Di Proietto

netdev_dpdk_vhost_destruct() calls rte_vhost_driver_unregister(), which
can trigger the destroy_device() callback.  destroy_device() will try to
take two mutexes already held by netdev_dpdk_vhost_destruct(), causing a
deadlock.

This problem can be solved by dropping the mutexes before calling
rte_vhost_driver_unregister().  The netdev_dpdk_vhost_destruct() and
construct() calls are already serialized by netdev_mutex.

This commit also makes clear that dev->vhost_id is constant and can be
accessed without taking any mutexes in the lifetime of the devices.
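
The locking pattern can be illustrated with a small model.  This is only a
sketch in Python, not the C code of this patch: the function names are
hypothetical stand-ins, and a non-blocking acquire is used so that the
"buggy" variant reports the problem instead of hanging (the C code aborts
with "Resource deadlock avoided").

```python
import threading

dpdk_mutex = threading.Lock()   # stands in for the global 'dpdk_mutex'
events = []


def destroy_device():
    """Callback that takes the lock itself, like destroy_device()."""
    if not dpdk_mutex.acquire(blocking=False):
        events.append("callback: lock already held")
        return
    try:
        events.append("callback: ran under lock")
    finally:
        dpdk_mutex.release()


def driver_unregister():
    """Stand-in for rte_vhost_driver_unregister(), which may call back."""
    destroy_device()


def destruct_buggy():
    with dpdk_mutex:
        events.append("destruct: cleanup under lock")
        driver_unregister()      # callback runs while the lock is held


def destruct_fixed():
    with dpdk_mutex:
        events.append("destruct: cleanup under lock")
    driver_unregister()          # lock dropped before the callback can run
```

In the fixed variant the cleanup that needs the lock is finished first, and
only then is the function that may call back invoked.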

Fixes: 8d38823bdf8b("netdev-dpdk: fix memory leak")
Reported-by: Ilya Maximets 
Signed-off-by: Daniele Di Proietto 
---
 lib/netdev-dpdk.c | 34 --
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f37ec1c..98bff62 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -355,8 +355,10 @@ struct netdev_dpdk {
 /* True if vHost device is 'up' and has been reconfigured at least once */
 bool vhost_reconfigured;
 
-/* Identifier used to distinguish vhost devices from each other */
-char vhost_id[PATH_MAX];
+/* Identifier used to distinguish vhost devices from each other.  It does
+ * not change during the lifetime of a struct netdev_dpdk.  It can be read
+ * without holding any mutex. */
+const char vhost_id[PATH_MAX];
 
 /* In dpdk_list. */
 struct ovs_list list_node OVS_GUARDED_BY(dpdk_mutex);
@@ -846,7 +848,8 @@ netdev_dpdk_vhost_cuse_construct(struct netdev *netdev)
 }
 
 ovs_mutex_lock(&dpdk_mutex);
-strncpy(dev->vhost_id, netdev->name, sizeof(dev->vhost_id));
+strncpy(CONST_CAST(char *, dev->vhost_id), netdev->name,
+sizeof dev->vhost_id);
 err = vhost_construct_helper(netdev);
 ovs_mutex_unlock(&dpdk_mutex);
 return err;
@@ -878,7 +881,7 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev)
 /* Take the name of the vhost-user port and append it to the location where
  * the socket is to be created, then register the socket.
  */
-snprintf(dev->vhost_id, sizeof(dev->vhost_id), "%s/%s",
+snprintf(CONST_CAST(char *,dev->vhost_id), sizeof(dev->vhost_id), "%s/%s",
  vhost_sock_dir, name);
 
 err = rte_vhost_driver_register(dev->vhost_id, flags);
@@ -938,6 +941,17 @@ netdev_dpdk_destruct(struct netdev *netdev)
 ovs_mutex_unlock(&dpdk_mutex);
 }
 
+/* rte_vhost_driver_unregister() can call back destroy_device(), which will
+ * try to acquire 'dpdk_mutex' and possibly 'dev->mutex'.  To avoid a
+ * deadlock, none of the mutexes must be held while calling this function. */
+static int
+dpdk_vhost_driver_unregister(struct netdev_dpdk *dev)
+OVS_EXCLUDED(dpdk_mutex)
+OVS_EXCLUDED(dev->mutex)
+{
+return rte_vhost_driver_unregister(dev->vhost_id);
+}
+
 static void
 netdev_dpdk_vhost_destruct(struct netdev *netdev)
 {
@@ -955,12 +969,6 @@ netdev_dpdk_vhost_destruct(struct netdev *netdev)
  dev->vhost_id);
 }
 
-if (rte_vhost_driver_unregister(dev->vhost_id)) {
-VLOG_ERR("Unable to remove vhost-user socket %s", dev->vhost_id);
-} else {
-fatal_signal_remove_file_to_unlink(dev->vhost_id);
-}
-
 free(ovsrcu_get_protected(struct ingress_policer *,
                           &dev->ingress_policer));
 
@@ -970,6 +978,12 @@ netdev_dpdk_vhost_destruct(struct netdev *netdev)
 
 ovs_mutex_unlock(&dev->mutex);
 ovs_mutex_unlock(&dpdk_mutex);
+
+if (dpdk_vhost_driver_unregister(dev)) {
+VLOG_ERR("Unable to remove vhost-user socket %s", dev->vhost_id);
+} else {
+fatal_signal_remove_file_to_unlink(dev->vhost_id);
+}
 }
 
 static void
-- 
2.8.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [ovs-discuss] [openvswitch 2.5.90] testsuite: 2224 failed

2016-08-05 Thread Daniele Di Proietto

I can reproduce this too

With -march=native, if the CPU has CRC32 extensions we use a different hash
function.  I suspect the DHCP options are written to the packet in a
different order because of this.  Perhaps we should make the test agnostic
of the order, or order the options on the DHCP packet.
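
As a rough idea of what an order-agnostic check could look like, here is a
Python sketch that parses the DHCP options field into a dictionary and
compares dictionaries instead of raw bytes.  The helper names are
hypothetical, and the sketch ignores options that repeat the same code; it
is not what the test suite does today.

```python
DHCP_MAGIC = bytes.fromhex("63825363")
PAD, END = 0, 255


def parse_dhcp_options(blob):
    """Parse the options field: magic cookie, then code/length/value items."""
    if blob[:4] != DHCP_MAGIC:
        raise ValueError("missing DHCP magic cookie")
    options = {}
    i = 4
    while i < len(blob):
        code = blob[i]
        if code == END:
            break
        if code == PAD:          # pad option has no length byte
            i += 1
            continue
        length = blob[i + 1]
        options[code] = blob[i + 2:i + 2 + length]
        i += 2 + length
    return options


def same_options(a, b):
    """True when both blobs carry the same options, in any order."""
    return parse_dhcp_options(a) == parse_dhcp_options(b)
```

The alternative mentioned above, emitting the options in a fixed order,
would keep the byte-for-byte comparison in the test unchanged.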

Thanks,

Daniele

2016-08-05 11:41 GMT-07:00 Lance Richardson :

> Wow, that is a very strange finding.
>
> I also see it on Fedora 23 with gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6),
> 2/100 failures with default configuration, 100% failure rate with
> -march=native.
>
>Lance
>
> - Original Message -
> > From: "Ilya Maximets" 
> > To: "Numan Siddique" , "Ben Pfaff" ,
> b...@openvswitch.org
> > Cc: dev@openvswitch.org, "Ramu Ramamurthy" ,
> "Dyasly Sergey" 
> > Sent: Friday, August 5, 2016 9:21:33 AM
> > Subject: Re: [ovs-dev] [openvswitch 2.5.90] testsuite: 2224 failed
> >
> > Exactly same situation with gcc (GCC) 6.1.1 20160510 (Red Hat 6.1.1-2).
> >
> > On 05.08.2016 14:37, Ilya Maximets wrote:
> > > There is one interesting bug:
> > >
> > > Test 2224 (ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS) constantly fails
> > > with 'CFLAGS=-march=native'. All other tests work normally.
> > >
> > > Environment:
> > >
> > > * OVS current master:
> > >   commit d59831e9b08e ("bridge: No QoS configured is not an error")
> > > * Red Hat Enterprise Linux Server release 7.2 (Maipo)
> > > * Compiler: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
> > > * Intel(R) Xeon(R) CPU E5-2690 v3
> > >
> > > Test scenario:
> > >
> > > 1. Checkout current master branch.
> > >
> > > 2. Configure OVS with default configuration:
> > >
> > ># ./boot.sh && ./configure && make
> > >
> > > 3. Check test #2224
> > >
> > ># make check TESTSUITEFLAGS='2224'
> > >2224: ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS   ok
> > >
> > > 4. Clean up
> > >
> > ># make distclean
> > >
> > > 5. Configure OVS with '-march=native':
> > >
> > ># ./boot.sh && ./configure CFLAGS="-march=native" && make
> > >
> > > 6. Check test #2224
> > >
> > ># make check TESTSUITEFLAGS='2224'
> > >2224: ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS   FAILED
> > >(ovn.at:3205)
> > >
> > > Test failed because of bad packet:
> > >
> > > ./ovn.at:3205: cat 1.packets | cut -c 53-
> > > --- expout  2016-08-05 14:29:47.205360523 +0300
> > > +++ /ovs/tests/testsuite.dir/at-groups/2224/stdout   2016-08-05
> > > 14:29:47.215360172 +0300
> > > @@ -1 +1 @@
> > > -0a010a0400430044011c020106006359aa76
> 0a04
> > >  f001
> 
> > >  
> 
> > >  
> 
> > >  
> 
> > >  
> 638253633501020104ff
> > >  0003040a0136040a0133040e10ff
> > > +0a010a0400430044011c020106006359aa76
> 0a04
> > >  f001
> 
> > >  
> 
> > >  
> 
> > >  
> 
> > >  
> 6382536335010236040a
> > >  010104ff0003040a0133040e10ff
> > >
> > > Full log attached.
> > >
> > > Best regards, Ilya Maximets.
> > >
> ___
> discuss mailing list
> disc...@openvswitch.org
> http://openvswitch.org/mailman/listinfo/discuss
>

Re: [ovs-dev] [openvswitch 2.5.90] testsuite: 2224 failed

2016-08-05 Thread Lance Richardson

Wow, that is a very strange finding.

I also see it on Fedora 23 with gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6),
2/100 failures with default configuration, 100% failure rate with -march=native.

   Lance

- Original Message -
> From: "Ilya Maximets" 
> To: "Numan Siddique" , "Ben Pfaff" , 
> b...@openvswitch.org
> Cc: dev@openvswitch.org, "Ramu Ramamurthy" , 
> "Dyasly Sergey" 
> Sent: Friday, August 5, 2016 9:21:33 AM
> Subject: Re: [ovs-dev] [openvswitch 2.5.90] testsuite: 2224 failed
> 
> Exactly same situation with gcc (GCC) 6.1.1 20160510 (Red Hat 6.1.1-2).
> 
> On 05.08.2016 14:37, Ilya Maximets wrote:
> > There is one interesting bug:
> > 
> > Test 2224 (ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS) constantly fails
> > with 'CFLAGS=-march=native'. All other tests work normally.
> > 
> > Environment:
> > 
> > * OVS current master:
> >   commit d59831e9b08e ("bridge: No QoS configured is not an error")
> > * Red Hat Enterprise Linux Server release 7.2 (Maipo)
> > * Compiler: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
> > * Intel(R) Xeon(R) CPU E5-2690 v3
> > 
> > Test scenario:
> > 
> > 1. Checkout current master branch.
> > 
> > 2. Configure OVS with default configuration:
> > 
> ># ./boot.sh && ./configure && make
> > 
> > 3. Check test #2224
> > 
> ># make check TESTSUITEFLAGS='2224'
> >2224: ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS   ok
> > 
> > 4. Clean up
> > 
> ># make distclean
> > 
> > 5. Configure OVS with '-march=native':
> > 
> ># ./boot.sh && ./configure CFLAGS="-march=native" && make
> > 
> > 6. Check test #2224
> > 
> ># make check TESTSUITEFLAGS='2224'
> >2224: ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS   FAILED
> >(ovn.at:3205)
> > 
> > Test failed because of bad packet:
> > 
> > ./ovn.at:3205: cat 1.packets | cut -c 53-
> > --- expout  2016-08-05 14:29:47.205360523 +0300
> > +++ /ovs/tests/testsuite.dir/at-groups/2224/stdout   2016-08-05
> > 14:29:47.215360172 +0300
> > @@ -1 +1 @@
> > -0a010a0400430044011c020106006359aa760a04
> >  
> > f001
> >  
> > 
> >  
> > 
> >  
> > 
> >  
> > 638253633501020104ff
> >  0003040a0136040a0133040e10ff
> > +0a010a0400430044011c020106006359aa760a04
> >  
> > f001
> >  
> > 
> >  
> > 
> >  
> > 
> >  
> > 6382536335010236040a
> >  010104ff0003040a0133040e10ff
> > 
> > Full log attached.
> > 
> > Best regards, Ilya Maximets.
> > 

Re: [ovs-dev] [PATCH] system-traffic: Make ping6 vlan test more reliable.

2016-08-05 Thread Joe Stringer

On 4 August 2016 at 18:28, Daniele Di Proietto  wrote:
> LGTM, thanks
>
> Acked-by: 

Thanks, applied.

Re: [ovs-dev] [PATCH 2/3] system-traffic: Flush conntrack after debug ping6.

2016-08-05 Thread Joe Stringer

On 4 August 2016 at 18:42, Daniele Di Proietto  wrote:
> We want to discard any state created by the initial ping6 (used to wait
> for an available IP address).  Otherwise some weird state can show up in
> the connection tracking tables (such as ICMP connection from link-local
> addresses).
>
> Fixes: e5cf8cce2759("system-tests: Add ping through conntrack test.")
> Reported-by: Joe Stringer 
> Signed-off-by: Daniele Di Proietto 

Acked-by: Joe Stringer 

Re: [ovs-dev] [PATCH 1/3] system-userspace-macros: Check the exit code of ethtool.

2016-08-05 Thread Joe Stringer

On 4 August 2016 at 18:40, Daniele Di Proietto  wrote:
> If the ethtool command is not available on the system we should fail,
> since the userspace testsuite cannot work properly without disabling
> offloads.
>
> Also, add ethtool to the list of installed packages on Vagrantfile.
>
> Fixes: ddcf96d2dcc1 ("system-tests: Disable offloads in userspace tests.")
> Reported-by: Joe Stringer 
> Signed-off-by: Daniele Di Proietto 

Thanks, this should make it more obvious when offloads are causing failures.

The commit message doesn't really explain why the vagrantfile change
is in the same commit; a simple mention that it's being added here 'to
ensure that offloads don't cause test failures in the vagrant VM when
the kernel is updated' would make it more clear why these two changes
are in the same commit.

Acked-by: Joe Stringer 

Re: [ovs-dev] Let's talk the NB DB IDL Part I - things we've see scaling the networking-ovn to NB DB connection

2016-08-05 Thread Ryan Moats

"dev"  wrote on 08/04/2016 10:34:08 AM:

> From: Ryan Moats/Omaha/IBM@IBMUS
> To: Ben Pfaff 
> Cc: ovs-dev 
> Date: 08/04/2016 10:34 AM
> Subject: Re: [ovs-dev] Let's talk the NB DB IDL Part I - things
> we've see scaling the networking-ovn to NB DB connection
> Sent by: "dev" 
>
> "dev"  wrote on 08/03/2016 04:53:42 PM:
>
> > From: Ben Pfaff 
> > To: Russell Bryant 
> > Cc: ovs-dev 
> > Date: 08/03/2016 04:54 PM
> > Subject: Re: [ovs-dev] Let's talk the NB DB IDL Part I - things
> > we've see scaling the networking-ovn to NB DB connection
> > Sent by: "dev" 
> >
> > On Wed, Aug 03, 2016 at 11:58:52AM -0400, Russell Bryant wrote:
> > > On Wed, Aug 3, 2016 at 11:39 AM, Kyle Mestery wrote:
> > >
> > > > On Wed, Aug 3, 2016 at 10:30 AM, Ryan Moats wrote:
> > > > >
> > > > > Russell Bryant  wrote on 08/03/2016 10:11:57 AM:
> > > > >
> > > > >> From: Russell Bryant 
> > > > >> To: Ryan Moats/Omaha/IBM@IBMUS
> > > > >> Cc: Ben Pfaff , ovs-dev 
> > > > >> Date: 08/03/2016 10:12 AM
> > > > >> Subject: Re: [ovs-dev] Let's talk the NB DB IDL Part I - things
> > > > >> we've see scaling the networking-ovn to NB DB connection
> > > > >>
> > > > > >> On Wed, Aug 3, 2016 at 9:28 AM, Ryan Moats wrote:
> > > > >>
> > > > >>
> > > > >> Ben Pfaff  wrote on 08/03/2016 12:27:48 AM:
> > > > >>
> > > > >> > From: Ben Pfaff 
> > > > >> > To: Ryan Moats/Omaha/IBM@IBMUS
> > > > >> > Cc: ovs-dev 
> > > > >> > Date: 08/03/2016 12:28 AM
> > > > > >> > Subject: Re: [ovs-dev] Let's talk the NB DB IDL Part I - things
> > > > > >> > we've see scaling the networking-ovn to NB DB connection
> > > > >> >
> > > > >> > On Wed, Aug 03, 2016 at 12:06:47AM -0500, Ryan Moats wrote:
> > > > >> > > Ben Pfaff  wrote on 08/02/2016 11:52:23 PM:
> > > > >> > >
> > > > >> > > > From: Ben Pfaff 
> > > > >> > > > To: Ryan Moats/Omaha/IBM@IBMUS
> > > > >> > > > Cc: ovs-dev 
> > > > >> > > > Date: 08/02/2016 11:52 PM
> > > > > >> > > > Subject: Re: [ovs-dev] Let's talk the NB DB IDL Part I - things
> > > > > >> > > > we've see scaling the networking-ovn to NB DB connection
> > > > > >> > > >
> > > > > >> > > > On Tue, Aug 02, 2016 at 11:45:07PM -0500, Ryan Moats wrote:
> > > > > >> > > > > "dev"  wrote on 08/02/2016 10:56:07 PM:
> > > > > >> > > > > > Ben Pfaff  wrote on 08/02/2016 10:14:46 PM:
> > > > > >> > > > > > > Presumably this means that networking-ovn is calling "verify" on the
> > > > > >> > > > > > > column in question.  Probably, networking-ovn should use the partial map
> > > > > >> > > > > > > update functionality introduced in commit f199df26e8e28 "ovsdb-idl: Add
> > > > > >> > > > > > > partial map updates functionality."  I don't know whether it's in the
> > > > > >> > > > > > > Python IDL yet.
> > > > > >> > > > > >
> > > > > >> > > > > > Indeed they are and thanks for the pointer to the commit - I'll dig
> > > > > >> > > > > > into it tomorrow and see if that code is reflected in the Python
> > > > > >> > > > > > IDL via that or another commit.  If it is, great.  If not, there
> > > > > >> > > > > > will likely also be a patch adding it so that we can move along.
> > > > > >> > > > >
> > > > > >> > > > > Hmm, maybe I'm misreading something, but I don't think that's going
> > > > > >> > > > > to work without some additional modifications - the partial map commit
> > > > > >> > > > > currently codes for columns that have a particular value type defined
> > > > > >> > > > > by the schema.  The problem we are seeing is with the ports and acls
> > > > > >> > > > > columns of the logical switch table, which are lists of strong
> > > > > >> > > > > references.  Since they don't have a defined value, the generated IDL
> > > > > >> > > > > code doesn't provide hooks for using partial map operations and we default
> > > > > >> > > > > back to update/verify with the given above results.
> > > > > >> > > > >
> > > > > >> > > > > Now, I think this is an oversight, because I can argue that since these
> > > > > >> > > > > are strong references, I should be able to use partial maps to update
> > > > > >> > > > > them as keys with a null value.  Does this make sense or am I breaking
> > > > > >> > > > > something if I look at going this route?
> > > > > >> > > >
> > > > >> > > > If they're implemented as

Re: [ovs-dev] [PATCH v3 1/2] ovsdb: Add/use partial set updates.

2016-08-05 Thread Kyle Mestery

On Fri, Aug 5, 2016 at 12:49 PM, Ryan Moats  wrote:
> This patchset mimics the changes introduced in
>
>   f199df26 (ovsdb-idl: Add partial map updates functionality.)
>   010fe7ae (ovsdb-idlc.in: Autogenerate partial map updates functions.)
>   7251075c (tests: Add test for partial map updates.)
>
> but for columns that store sets of values rather than key-value
> pairs.  These columns will now be able to use the OVSDB mutate
> operation to transmit deltas on the wire rather than use
> verify/update and transmit wait/update operations on the wire.
>
> Side effect of modifying the comments in the partial map update
> tests.
>

I was very interested in using this patch to verify a performance
problem we were seeing. Our environment was testing with 400 chassis,
a single (OpenStack provider) network, and 8000 ports. Before this
change, running a test with ovn-scale-test (create port in NB DB, bind
ports in SB DB) was taking approximately 8 hours (480 minutes). I've
re-run this test with Ryan's change, and it now completes in 87
minutes. This patch has shaved almost 400 minutes off the test!

I hope we can look at getting this into 2.6, because for large
deployments, this is a huge performance win!

Note, I have not verified the OpenStack portion of this yet, this was
purely with the OVN control plane.

Tested-by: Kyle Mestery 

> Signed-off-by: Ryan Moats 
> ---
>  lib/automake.mk  |   2 +
>  lib/ovsdb-idl-provider.h |   3 +
>  lib/ovsdb-idl.c  | 390 +++
>  lib/ovsdb-idl.h  |   6 +
>  lib/ovsdb-set-op.c   | 170 +
>  lib/ovsdb-set-op.h   |  44 ++
>  ovsdb/ovsdb-idlc.in  |  65 +++-
>  tests/idltest.ovsschema  |  30 
>  tests/ovsdb-idl.at   |  36 +
>  tests/test-ovsdb.c   | 137 -
>  10 files changed, 776 insertions(+), 107 deletions(-)
>  create mode 100644 lib/ovsdb-set-op.c
>  create mode 100644 lib/ovsdb-set-op.h
>
> diff --git a/lib/automake.mk b/lib/automake.mk
> index 97c83e9..30a281f 100644
> --- a/lib/automake.mk
> +++ b/lib/automake.mk
> @@ -187,6 +187,8 @@ lib_libopenvswitch_la_SOURCES = \
> lib/ovsdb-idl.h \
> lib/ovsdb-map-op.c \
> lib/ovsdb-map-op.h \
> +   lib/ovsdb-set-op.c \
> +   lib/ovsdb-set-op.h \
> lib/ovsdb-condition.h \
> lib/ovsdb-condition.c \
> lib/ovsdb-parser.c \
> diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
> index 55ed793..64e8ec3 100644
> --- a/lib/ovsdb-idl-provider.h
> +++ b/lib/ovsdb-idl-provider.h
> @@ -20,6 +20,7 @@
>  #include "openvswitch/list.h"
>  #include "ovsdb-idl.h"
>  #include "ovsdb-map-op.h"
> +#include "ovsdb-set-op.h"
>  #include "ovsdb-types.h"
>  #include "openvswitch/shash.h"
>  #include "uuid.h"
> @@ -39,6 +40,8 @@ struct ovsdb_idl_row {
>  struct hmap_node txn_node;  /* Node in ovsdb_idl_txn's list. */
>  unsigned long int *map_op_written; /* Bitmap of columns pending map ops. */
>  struct map_op_list **map_op_lists; /* Per-column map operations. */
> +unsigned long int *set_op_written; /* Bitmap of columns pending set ops. */
> +struct set_op_list **set_op_lists; /* Per-column set operations. */
>
>  /* Tracking data */
>  unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
> index d70fb10..691f3bf 100644
> --- a/lib/ovsdb-idl.c
> +++ b/lib/ovsdb-idl.c
> @@ -184,6 +184,7 @@ static struct ovsdb_idl_row *ovsdb_idl_row_create(struct ovsdb_idl_table *,
>  static void ovsdb_idl_row_destroy(struct ovsdb_idl_row *);
>  static void ovsdb_idl_row_destroy_postprocess(struct ovsdb_idl *);
>  static void ovsdb_idl_destroy_all_map_op_lists(struct ovsdb_idl_row *);
> +static void ovsdb_idl_destroy_all_set_op_lists(struct ovsdb_idl_row *);
>
>  static void ovsdb_idl_row_parse(struct ovsdb_idl_row *);
>  static void ovsdb_idl_row_unparse(struct ovsdb_idl_row *);
> @@ -200,6 +201,10 @@ static void ovsdb_idl_txn_add_map_op(struct ovsdb_idl_row *,
>   const struct ovsdb_idl_column *,
>   struct ovsdb_datum *,
>   enum map_op_type);
> +static void ovsdb_idl_txn_add_set_op(struct ovsdb_idl_row *,
> + const struct ovsdb_idl_column *,
> + struct ovsdb_datum *,
> + enum set_op_type);
>
>  static void ovsdb_idl_send_lock_request(struct ovsdb_idl *);
>  static void ovsdb_idl_send_unlock_request(struct ovsdb_idl *);
> @@ -1811,7 +1816,9 @@ ovsdb_idl_row_create(struct ovsdb_idl_table *table, const struct uuid *uuid)
>  row->uuid = *uuid;
>  row->table = table;
>  row->map_op_written = NULL;
> -row->map_op_lists = NULL;
> +row->map_op_written = NULL;
> +row->set_op_lists = NULL;
> +

[ovs-dev] [PATCH v3 2/2] python: Add support for partial map and partial set updates

2016-08-05 Thread Ryan Moats

Allow the Python IDL to use mutate operations more freely
by mimicking the partial map and partial set operations now
available in the C IDL.

Unit tests for both of these types of operations are included.
They are not carbon copies of the C tests, because testing
idempotency is a bit difficult for the current python IDL
test harness.
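
To show the idea behind the pending "_inserts"/"_removes" bookkeeping, here
is a standalone sketch of how such pending mutations could be folded into
the value a reader sees inside the transaction.  The two helpers are
hypothetical and are not part of ovs.db.idl; only the "_inserts" and
"_removes" dictionary layout is taken from this patch.

```python
def effective_set(stored, mutations, column):
    """Values of a set-valued column as seen inside the transaction."""
    inserts = mutations.get("_inserts", {}).get(column, [])
    removes = mutations.get("_removes", {}).get(column, [])
    result = list(stored)
    # Add pending inserts without duplicating members already present.
    result.extend(v for v in inserts if v not in result)
    return [v for v in result if v not in removes]


def effective_map(stored, mutations, column):
    """Key-value pairs of a map-valued column inside the transaction."""
    inserts = mutations.get("_inserts", {}).get(column, {})
    removes = mutations.get("_removes", {}).get(column, {})
    result = dict(stored)
    result.update(inserts)
    for key in removes:          # iterating a dict yields its keys
        result.pop(key, None)
    return result
```

Neither helper modifies the stored value, so the committed data stays
untouched until the transaction completes.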

Signed-off-by: Ryan Moats 
---
 python/ovs/db/idl.py | 196 ---
 tests/ovsdb-idl.at   |  30 
 tests/test-ovsdb.py  |  88 +++
 3 files changed, 303 insertions(+), 11 deletions(-)

diff --git a/python/ovs/db/idl.py b/python/ovs/db/idl.py
index 92a7382..6f376c7 100644
--- a/python/ovs/db/idl.py
+++ b/python/ovs/db/idl.py
@@ -18,6 +18,7 @@ import uuid
 import six
 
 import ovs.jsonrpc
+import ovs.db.data as data
 import ovs.db.parser
 import ovs.db.schema
 from ovs.db import error
@@ -588,8 +589,7 @@ class Idl(object):
 continue
 
 try:
-datum_diff = ovs.db.data.Datum.from_json(column.type,
- datum_diff_json)
+datum_diff = data.Datum.from_json(column.type, datum_diff_json)
 except error.Error as e:
 # XXX rate-limit
 vlog.warn("error parsing column %s in table %s: %s"
@@ -614,7 +614,7 @@ class Idl(object):
 continue
 
 try:
-datum = ovs.db.data.Datum.from_json(column.type, datum_json)
+datum = data.Datum.from_json(column.type, datum_json)
 except error.Error as e:
 # XXX rate-limit
 vlog.warn("error parsing column %s in table %s: %s"
@@ -730,6 +730,24 @@ class Row(object):
 #   - None, if this transaction deletes this row.
 self.__dict__["_changes"] = {}
 
+# _mutations describes changes to this row to be handled via a
+# mutate operation on the wire.  It takes the following values:
+#
+#   - {}, the empty dictionary, if no transaction is active or if the
+# row has yet not been mutated within this transaction.
+#
+#   - A dictionary that contains two keys:
+#
+# - "_inserts" contains a dictionary that maps column names to
+#   new keys/key-value pairs that should be inserted into the
+#   column
+# - "_removes" contains a dictionary that maps column names to
+#   the keys/key-value pairs that should be removed from the
+#   column
+#
+#   - None, if this transaction deletes this row.
+self.__dict__["_mutations"] = {}
+
 # A dictionary whose keys are the names of columns that must be
 # verified as prerequisites when the transaction commits.  The values
 # in the dictionary are all None.
@@ -750,17 +768,47 @@ class Row(object):
 
 def __getattr__(self, column_name):
 assert self._changes is not None
+assert self._mutations is not None
 
 datum = self._changes.get(column_name)
+inserts = None
+if '_inserts' in self._mutations.keys():
+inserts = self._mutations['_inserts'].get(column_name)
+removes = None
+if '_removes' in self._mutations.keys():
+removes = self._mutations['_removes'].get(column_name)
 if datum is None:
 if self._data is None:
-raise AttributeError("%s instance has no attribute '%s'" %
- (self.__class__.__name__, column_name))
+if inserts is None:
+raise AttributeError("%s instance has no attribute '%s'" %
+ (self.__class__.__name__,
+  column_name))
+else:
+datum = inserts
 if column_name in self._data:
 datum = self._data[column_name]
+try:
+if inserts is not None:
+datum.extend(inserts)
+if removes is not None:
+datum = [x for x in datum if x not in removes]
+except error.Error:
+pass
+try:
+if inserts is not None:
+datum.merge(inserts)
+if removes is not None:
+for key in removes.keys():
+del datum[key]
+except error.Error:
+pass
 else:
-raise AttributeError("%s instance has no attribute '%s'" %
- (self.__class__.__name__, column_name))
+if inserts is None:
+raise AttributeError("%s instance has no attribute '%s'" %
+

[ovs-dev] [PATCH v3 0/2] Partial set operations and Python IDL update

2016-08-05 Thread Ryan Moats

This patch set adds partial set updates and updates the Python IDL
to support partial map and partial set operations.  The Python unit
tests are not a complete carbon copy of their C brethren, because the
Python IDL test harness does not appear to handle idempotency testing,
and dumping a map to a string ends up dumping an unordered list,
leading to sporadic test failures.  (Suggestions for how to fix this
will be appreciated.)

Ryan Moats (2):
  ovsdb: Add/use partial set updates.
  python: Add support for partial map and partial set updates

 lib/automake.mk  |   2 +
 lib/ovsdb-idl-provider.h |   3 +
 lib/ovsdb-idl.c  | 390 +++
 lib/ovsdb-idl.h  |   6 +
 lib/ovsdb-set-op.c   | 170 +
 lib/ovsdb-set-op.h   |  44 ++
 ovsdb/ovsdb-idlc.in  |  65 +++-
 python/ovs/db/idl.py | 196 ++--
 tests/idltest.ovsschema  |  30 
 tests/ovsdb-idl.at   |  66 
 tests/test-ovsdb.c   | 137 -
 tests/test-ovsdb.py  |  88 +++
 12 files changed, 1079 insertions(+), 118 deletions(-)
 create mode 100644 lib/ovsdb-set-op.c
 create mode 100644 lib/ovsdb-set-op.h

-- 
2.7.4


[ovs-dev] [PATCH v3 1/2] ovsdb: Add/use partial set updates.

2016-08-05 Thread Ryan Moats

This patchset mimics the changes introduced in

  f199df26 (ovsdb-idl: Add partial map updates functionality.)
  010fe7ae (ovsdb-idlc.in: Autogenerate partial map updates functions.)
  7251075c (tests: Add test for partial map updates.)

but for columns that store sets of values rather than key-value
pairs.  These columns will now be able to use the OVSDB mutate
operation to transmit deltas on the wire rather than use
verify/update and transmit wait/update operations on the wire.
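
To give a feel for the difference on the wire, the following Python sketch
builds the two kinds of OVSDB operations as JSON, following the operation
layout in RFC 7047.  It does not use the IDL; the helper names and the UUIDs
in the usage note are made up for illustration.

```python
import json


def uuid_set(uuids):
    """Encode UUIDs in OVSDB JSON set notation."""
    return ["set", [["uuid", u] for u in uuids]]


def update_op(table, row_uuid, column, all_uuids):
    """Full rewrite: the whole column value travels on the wire."""
    return {"op": "update", "table": table,
            "where": [["_uuid", "==", ["uuid", row_uuid]]],
            "row": {column: uuid_set(all_uuids)}}


def mutate_op(table, row_uuid, column, added=(), removed=()):
    """Delta: only the members being inserted or deleted are sent."""
    mutations = []
    if added:
        mutations.append([column, "insert", uuid_set(added)])
    if removed:
        mutations.append([column, "delete", uuid_set(removed)])
    return {"op": "mutate", "table": table,
            "where": [["_uuid", "==", ["uuid", row_uuid]]],
            "mutations": mutations}
```

For a column that already references many rows, adding one member with
mutate_op("Logical_Switch", row, "ports", added=[new]) serializes (via
json.dumps) to a message whose size does not depend on how many members the
set already has, whereas update_op has to repeat every member.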

As a side effect, this also modifies the comments in the partial map
update tests.

Signed-off-by: Ryan Moats 
---
 lib/automake.mk  |   2 +
 lib/ovsdb-idl-provider.h |   3 +
 lib/ovsdb-idl.c  | 390 +++
 lib/ovsdb-idl.h  |   6 +
 lib/ovsdb-set-op.c   | 170 +
 lib/ovsdb-set-op.h   |  44 ++
 ovsdb/ovsdb-idlc.in  |  65 +++-
 tests/idltest.ovsschema  |  30 
 tests/ovsdb-idl.at   |  36 +
 tests/test-ovsdb.c   | 137 -
 10 files changed, 776 insertions(+), 107 deletions(-)
 create mode 100644 lib/ovsdb-set-op.c
 create mode 100644 lib/ovsdb-set-op.h

diff --git a/lib/automake.mk b/lib/automake.mk
index 97c83e9..30a281f 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -187,6 +187,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/ovsdb-idl.h \
lib/ovsdb-map-op.c \
lib/ovsdb-map-op.h \
+   lib/ovsdb-set-op.c \
+   lib/ovsdb-set-op.h \
lib/ovsdb-condition.h \
lib/ovsdb-condition.c \
lib/ovsdb-parser.c \
diff --git a/lib/ovsdb-idl-provider.h b/lib/ovsdb-idl-provider.h
index 55ed793..64e8ec3 100644
--- a/lib/ovsdb-idl-provider.h
+++ b/lib/ovsdb-idl-provider.h
@@ -20,6 +20,7 @@
 #include "openvswitch/list.h"
 #include "ovsdb-idl.h"
 #include "ovsdb-map-op.h"
+#include "ovsdb-set-op.h"
 #include "ovsdb-types.h"
 #include "openvswitch/shash.h"
 #include "uuid.h"
@@ -39,6 +40,8 @@ struct ovsdb_idl_row {
 struct hmap_node txn_node;  /* Node in ovsdb_idl_txn's list. */
 unsigned long int *map_op_written; /* Bitmap of columns pending map ops. */
 struct map_op_list **map_op_lists; /* Per-column map operations. */
+unsigned long int *set_op_written; /* Bitmap of columns pending set ops. */
+struct set_op_list **set_op_lists; /* Per-column set operations. */
 
 /* Tracking data */
 unsigned int change_seqno[OVSDB_IDL_CHANGE_MAX];
diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index d70fb10..691f3bf 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -184,6 +184,7 @@ static struct ovsdb_idl_row *ovsdb_idl_row_create(struct ovsdb_idl_table *,
 static void ovsdb_idl_row_destroy(struct ovsdb_idl_row *);
 static void ovsdb_idl_row_destroy_postprocess(struct ovsdb_idl *);
 static void ovsdb_idl_destroy_all_map_op_lists(struct ovsdb_idl_row *);
+static void ovsdb_idl_destroy_all_set_op_lists(struct ovsdb_idl_row *);
 
 static void ovsdb_idl_row_parse(struct ovsdb_idl_row *);
 static void ovsdb_idl_row_unparse(struct ovsdb_idl_row *);
@@ -200,6 +201,10 @@ static void ovsdb_idl_txn_add_map_op(struct ovsdb_idl_row *,
  const struct ovsdb_idl_column *,
  struct ovsdb_datum *,
  enum map_op_type);
+static void ovsdb_idl_txn_add_set_op(struct ovsdb_idl_row *,
+ const struct ovsdb_idl_column *,
+ struct ovsdb_datum *,
+ enum set_op_type);
 
 static void ovsdb_idl_send_lock_request(struct ovsdb_idl *);
 static void ovsdb_idl_send_unlock_request(struct ovsdb_idl *);
@@ -1811,7 +1816,9 @@ ovsdb_idl_row_create(struct ovsdb_idl_table *table, const struct uuid *uuid)
 row->uuid = *uuid;
 row->table = table;
 row->map_op_written = NULL;
 row->map_op_lists = NULL;
+row->set_op_written = NULL;
+row->set_op_lists = NULL;
 return row;
 }
 
@@ -1822,6 +1829,7 @@ ovsdb_idl_row_destroy(struct ovsdb_idl_row *row)
 ovsdb_idl_row_clear_old(row);
 hmap_remove(&row->table->rows, &row->hmap_node);
 ovsdb_idl_destroy_all_map_op_lists(row);
+ovsdb_idl_destroy_all_set_op_lists(row);
 if (ovsdb_idl_track_is_set(row->table)) {
 row->change_seqno[OVSDB_IDL_CHANGE_DELETE]
 = row->table->change_seqno[OVSDB_IDL_CHANGE_DELETE]
@@ -1856,6 +1864,27 @@ ovsdb_idl_destroy_all_map_op_lists(struct ovsdb_idl_row *row)
 }
 
 static void
+ovsdb_idl_destroy_all_set_op_lists(struct ovsdb_idl_row *row)
+{
+if (row->set_op_written) {
+/* Clear Set Operation Lists */
+size_t idx, n_columns;
+const struct ovsdb_idl_column *columns;
+const struct ovsdb_type *type;
+n_columns = row->table->class->n_columns;
+columns = row->table->class->columns;
+BITMAP_FOR_EACH_1 (idx, n_columns,

Re: [ovs-dev] [ovs-dev, 3/4] ovsdb: Fix bug, set rpc to NULL after freeing.

2016-08-05 Thread Andy Zhou

On Fri, Aug 5, 2016 at 9:08 AM, Daniel Levy  wrote:

> Tested this and it works, however it needs a rebase.
>

The need for a rebase may be caused by patches ahead of this one in the series.

Thanks for testing and reporting.

>
> --
> Sincerely,
> Daniel Levy
>

Re: [ovs-dev] [PATCH 3/3] check-kernel: Remove '-d' from TESTSUITEFLAGS.

2016-08-05 Thread Andy Zhou

On Thu, Aug 4, 2016 at 6:43 PM, Daniele Di Proietto wrote:

> The '-d' flag tells autotest to always keep the testcase output, but
> prevents '--recheck' from working.  If a user wants to always keep the
> output from the tests, the '-d' flag can be passed explicitly.  This is
> more in line with other test make targets ('check',
> 'check-system-userspace').
>
> CC: Andy Zhou 
> Signed-off-by: Daniele Di Proietto 
> ---
>  tests/automake.mk | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tests/automake.mk b/tests/automake.mk
> index a9ebf91..5d12ae5 100644
> --- a/tests/automake.mk
> +++ b/tests/automake.mk
> @@ -243,7 +243,7 @@ EXTRA_DIST += tests/run-ryu
>
>  # Run kmod tests. Assume kernel modules has been installed or linked into the kernel
>  check-kernel: all tests/atconfig tests/atlocal $(SYSTEM_KMOD_TESTSUITE)
> -   $(SHELL) '$(SYSTEM_KMOD_TESTSUITE)' -C tests AUTOTEST_PATH='$(AUTOTEST_PATH)' -d $(TESTSUITEFLAGS) -j1
> +   $(SHELL) '$(SYSTEM_KMOD_TESTSUITE)' -C tests AUTOTEST_PATH='$(AUTOTEST_PATH)' $(TESTSUITEFLAGS) -j1
>
>  # Testing the out of tree Kernel module
>  check-kmod: all tests/atconfig tests/atlocal $(SYSTEM_KMOD_TESTSUITE)
>

LGTM
Acked-by: Andy Zhou 

[ovs-dev] [ovs-dev, 3/4] ovsdb: Fix bug, set rpc to NULL after freeing.

2016-08-05 Thread Daniel Levy

Tested this and it works, however it needs a rebase.

-- 
Sincerely,
Daniel Levy

Re: [ovs-dev] [CudaMailTagged] [PATCH 2/2] Tunnel: Fix the issue of tunnel port creation

2016-08-05 Thread Thadeu Lima de Souza Cascardo

On Sat, Aug 06, 2016 at 04:55:15AM +0800, Binbin Xu wrote:
> If a kernel space vxlan port is added first and we then try to add a
> user space vxlan port, the user space vxlan port unfortunately cannot
> be created.
> 
> This commit separates the names of kernel space and user space tunnel
> ports, for example:
>          kernel_space     user_space
> vxlan    vxlan_sys_4789   vxlan_usr_4789
> gre      gre_sys          gre_usr
> ..
> 

This makes sense to me. You point out a real problem, and both tunnel types
should be able to coexist.

However, I am only guessing why this would fail. Can you point out some of the
details on why this fails?

Thanks.
Cascardo.
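
To make the proposed naming concrete, here is a minimal sketch of how
such names could be built.  It is not taken from the patch:
`build_port_name` and `NAME_BUFSIZE` are hypothetical names, and the
buffer size of 16 is only assumed to mirror the usual IFNAMSIZ.

```c
#include <stdbool.h>
#include <stdio.h>

#define NAME_BUFSIZE 16   /* Hypothetical; assumed to mirror IFNAMSIZ. */

/* Builds "<base>_<sys|usr>[_<port>]" into 'buf'.  A 'dst_port' of 0 means
 * the tunnel type has no destination port.  Returns 'buf', or NULL if the
 * name would not fit into 'bufsize' bytes including the terminator. */
static const char *
build_port_name(const char *base, bool userspace, unsigned int dst_port,
                char buf[], size_t bufsize)
{
    const char *type = userspace ? "usr" : "sys";
    int n;

    if (dst_port) {
        n = snprintf(buf, bufsize, "%s_%s_%u", base, type, dst_port);
    } else {
        n = snprintf(buf, bufsize, "%s_%s", base, type);
    }
    return n >= 0 && (size_t) n < bufsize ? buf : NULL;
}
```

With this scheme a kernel vxlan port on 4789 becomes "vxlan_sys_4789" and
its user space counterpart "vxlan_usr_4789", so the two no longer collide.
The longer suffix leaves less room for the base name, which is why the
sketch reports an overflow instead of silently truncating.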

> Signed-off-by: Binbin Xu 
> ---
>  lib/dpif-netdev.c  |  3 ++-
>  lib/dpif-netlink.c |  2 +-
>  lib/netdev-vport.c | 26 +++---
>  lib/netdev-vport.h |  2 +-
>  lib/netdev.c   | 13 ++---
>  lib/netdev.h   |  2 +-
>  ofproto/ofproto-dpif.c | 15 ++-
>  vswitchd/bridge.c  |  2 +-
>  8 files changed, 41 insertions(+), 24 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index e39362e..3bdf204 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -1343,7 +1343,8 @@ dpif_netdev_port_add(struct dpif *dpif, struct netdev *netdev,
>  int error;
>  
>  ovs_mutex_lock(&dp->port_mutex);
> -dpif_port = netdev_vport_get_dpif_port(netdev, namebuf, sizeof namebuf);
> +dpif_port = netdev_vport_get_dpif_port(netdev, "netdev",
> +   namebuf, sizeof namebuf);
>  if (*port_nop != ODPP_NONE) {
>  port_no = *port_nop;
>  error = dp_netdev_lookup_port(dp, *port_nop) ? EBUSY : 0;
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index a39faa2..1b0cd2c 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -810,7 +810,7 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, struct netdev *netdev,
>  {
>  const struct netdev_tunnel_config *tnl_cfg;
>  char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
> -const char *name = netdev_vport_get_dpif_port(netdev,
> +const char *name = netdev_vport_get_dpif_port(netdev, "netlink",
>namebuf, sizeof namebuf);
>  const char *type = netdev_get_type(netdev);
>  struct dpif_netlink_vport request, reply;
> diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
> index f5eb76f..e374f9d 100755
> --- a/lib/netdev-vport.c
> +++ b/lib/netdev-vport.c
> @@ -119,16 +119,19 @@ netdev_vport_class_get_dpif_port(const struct netdev_class *class)
>  }
>  
>  const char *
> -netdev_vport_get_dpif_port(const struct netdev *netdev,
> +netdev_vport_get_dpif_port(const struct netdev *netdev, const char *dp_type,
> char namebuf[], size_t bufsize)
>  {
>  const struct netdev_class *class = netdev_get_class(netdev);
>  const char *dpif_port = netdev_vport_class_get_dpif_port(class);
> +char *type;
>  
>  if (!dpif_port) {
>  return netdev_get_name(netdev);
>  }
>  
> +type = !strcmp(dp_type, "netdev") ? "usr" : "sys";
> +
>  if (netdev_vport_needs_dst_port(netdev)) {
>  const struct netdev_vport *vport = netdev_vport_cast(netdev);
>  
> @@ -138,13 +141,14 @@ netdev_vport_get_dpif_port(const struct netdev *netdev,
>   * port numbers but assert just in case.
>   */
>  BUILD_ASSERT(NETDEV_VPORT_NAME_BUFSIZE >= IFNAMSIZ);
> -ovs_assert(strlen(dpif_port) + 6 < IFNAMSIZ);
> -snprintf(namebuf, bufsize, "%s_%d", dpif_port,
> +ovs_assert(strlen(dpif_port) + 10 < IFNAMSIZ);
> +snprintf(namebuf, bufsize, "%s_%s_%d", dpif_port, type,
>   ntohs(vport->tnl_cfg.dst_port));
> -return namebuf;
>  } else {
> -return dpif_port;
> +snprintf(namebuf, NETDEV_VPORT_NAME_BUFSIZE, "%s_%s",
> +dpif_port, type);
>  }
> +return namebuf;
>  }
>  
>  /* Whenever the route-table change number is incremented,
> @@ -890,18 +894,18 @@ netdev_vport_tunnel_register(void)
>  /* The name of the dpif_port should be short enough to accomodate adding
>   * a port number to the end if one is necessary. */
>  static const struct vport_class vport_classes[] = {
> -TUNNEL_CLASS("geneve", "genev_sys", netdev_geneve_build_header,
> +TUNNEL_CLASS("geneve", "genev", netdev_geneve_build_header,
>  netdev_tnl_push_udp_header,
>  netdev_geneve_pop_header),
> -TUNNEL_CLASS("gre", "gre_sys", netdev_gre_build_header,
> +TUNNEL_CLASS("gre", "gre", netdev_gre_build_header,
> netdev_gre_push_header,
> netdev_gre_pop_header),
> -TUNNEL_CLASS("ipsec_gre", "gre_sys", NULL, NULL, NULL),
> -

[ovs-dev] OVS is compatible/support LLDP and BDDP packets?

2016-08-05 Thread Maurizio Marrocco

Hi OVS team,

My question is the following: is OVS compatible with LLDP and BDDP (Broadcast
Domain Discovery Protocol) packets? Or is there an API that manages these packets?

Thanks

Maurizio


[ovs-dev] [PATCH 5/7] tests: Add a new MTU test.

2016-08-05 Thread Mark Kavanagh

From: Daniele Di Proietto 

Also, netdev-dummy needs to call netdev_change_seq_changed() in
set_mtu().

Signed-off-by: Daniele Di Proietto 
---
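The netdev-dummy change boils down to "only signal a change when something
really changed".  A minimal sketch of that pattern, assuming a simplified
device with a plain counter: `struct toy_netdev` and `toy_netdev_set_mtu`
are hypothetical names and stand in for the real netdev and
netdev_change_seq_changed().

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a netdev with a change counter. */
struct toy_netdev {
    int mtu;
    uint64_t change_seq;
};

/* Sets the MTU and bumps 'change_seq' only when the value really changes,
 * so that observers polling the counter are not woken up needlessly. */
static void
toy_netdev_set_mtu(struct toy_netdev *dev, int mtu)
{
    if (dev->mtu != mtu) {
        dev->mtu = mtu;
        dev->change_seq++;
    }
}
```

Setting the same MTU twice leaves the counter untouched, while a real
change advances it once.
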
 lib/netdev-dummy.c|  5 -
 tests/ofproto-dpif.at | 30 ++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
index 92af15f..c8f82b7 100644
--- a/lib/netdev-dummy.c
+++ b/lib/netdev-dummy.c
@@ -1155,7 +1155,10 @@ netdev_dummy_set_mtu(const struct netdev *netdev, int mtu)
 struct netdev_dummy *dev = netdev_dummy_cast(netdev);
 
 ovs_mutex_lock(&dev->mutex);
-dev->mtu = mtu;
+if (dev->mtu != mtu) {
+dev->mtu = mtu;
+netdev_change_seq_changed(netdev);
+}
 ovs_mutex_unlock(&dev->mutex);
 
 return 0;
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index a46fc81..3638063 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -8859,3 +8859,33 @@ n_packets=0
 
 OVS_VSWITCHD_STOP
 AT_CLEANUP
+
+AT_SETUP([ofproto - set mtu])
+OVS_VSWITCHD_START
+
+add_of_ports br0 1
+
+# Check that initial MTU is 1500 for 'br0' and 'p1'.
+AT_CHECK([ovs-vsctl get Interface br0 mtu], [0], [dnl
+1500
+])
+AT_CHECK([ovs-vsctl get Interface p1 mtu], [0], [dnl
+1500
+])
+
+# Request new MTU for 'p1'
+AT_CHECK([ovs-vsctl set Interface p1 mtu_request=1600])
+
+# Check that the new MTU is applied
+AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface p1 mtu=1600])
+# The internal port 'br0' should have the same MTU value as p1, because it's
+# the new bridge minimum.
+AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface br0 mtu=1600])
+
+AT_CHECK([ovs-vsctl del-port br0 p1])
+
+# When 'p1' is deleted, the internal port should return to the default MTU
+AT_CHECK([ovs-vsctl --timeout=10 wait-until Interface br0 mtu=1500])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
-- 
1.9.3


Re: [ovs-dev] [ovs-dev,V2] netdev-dpdk: fix memory leak

2016-08-05 Thread Ilya Maximets

On 04.08.2016 12:49, Mark Kavanagh wrote:
> DPDK v16.07 introduces the ability to free memzones.
> Up until this point, DPDK memory pools created in OVS could
> not be destroyed, thus incurring a memory leak.
> 
> Leverage the DPDK v16.07 rte_mempool API to free DPDK
> mempools when their associated reference count reaches 0 (this
> indicates that the memory pool is no longer in use).
> 
> Signed-off-by: Mark Kavanagh 
> ---
> 
> v2->v1: rebase to head of master, and remove 'RFC' tag
> 
>  lib/netdev-dpdk.c | 29 +++--
>  1 file changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index aaac0d1..ffcd35c 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -506,7 +506,7 @@ dpdk_mp_get(int socket_id, int mtu) OVS_REQUIRES(dpdk_mutex)
>  }
>  
>  static void
> -dpdk_mp_put(struct dpdk_mp *dmp)
> +dpdk_mp_put(struct dpdk_mp *dmp) OVS_REQUIRES(dpdk_mutex)
>  {
>  
>  if (!dmp) {
> @@ -514,15 +514,12 @@ dpdk_mp_put(struct dpdk_mp *dmp)
>  }
>  
>  dmp->refcount--;
> -ovs_assert(dmp->refcount >= 0);
>  
> -#if 0
> -/* I could not find any API to destroy mp. */
> -if (dmp->refcount == 0) {
> -list_delete(dmp->list_node);
> -/* destroy mp-pool. */
> -}
> -#endif
> +if (OVS_UNLIKELY(!dmp->refcount)) {
> +ovs_list_remove(&dmp->list_node);
> +rte_mempool_free(dmp->mp);
> + }
> +
>  }
>  
>  static void
> @@ -928,16 +925,18 @@ netdev_dpdk_destruct(struct netdev *netdev)
>  {
>  struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>  
> +ovs_mutex_lock(&dpdk_mutex);
>  ovs_mutex_lock(&dev->mutex);
> +
>  rte_eth_dev_stop(dev->port_id);
>  free(ovsrcu_get_protected(struct ingress_policer *,
>  &dev->ingress_policer));
> -ovs_mutex_unlock(&dev->mutex);
>  
> -ovs_mutex_lock(&dpdk_mutex);
>  rte_free(dev->tx_q);
>  ovs_list_remove(&dev->list_node);
>  dpdk_mp_put(dev->dpdk_mp);
> +
> +ovs_mutex_unlock(&dev->mutex);
>  ovs_mutex_unlock(&dpdk_mutex);
>  }
>  
> @@ -946,6 +945,9 @@ netdev_dpdk_vhost_destruct(struct netdev *netdev)
>  {
>  struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>  
> +ovs_mutex_lock(&dpdk_mutex);
> +ovs_mutex_lock(&dev->mutex);
> +
>  /* Guest becomes an orphan if still attached. */
>  if (netdev_dpdk_get_vid(dev) >= 0) {
>  VLOG_ERR("Removing port '%s' while vhost device still attached.",
> @@ -961,15 +963,14 @@ netdev_dpdk_vhost_destruct(struct netdev *netdev)
>  fatal_signal_remove_file_to_unlink(dev->vhost_id);
>  }
>  
> -ovs_mutex_lock(&dev->mutex);
>  free(ovsrcu_get_protected(struct ingress_policer *,
>  &dev->ingress_policer));
> -ovs_mutex_unlock(&dev->mutex);
>  
> -ovs_mutex_lock(&dpdk_mutex);
>  rte_free(dev->tx_q);
>  ovs_list_remove(&dev->list_node);
>  dpdk_mp_put(dev->dpdk_mp);
> +
> +ovs_mutex_unlock(&dev->mutex);
>  ovs_mutex_unlock(&dpdk_mutex);
>  }

I agree that the locking here was wrong, but this change introduces a new
issue: 'rte_vhost_driver_unregister()' may call 'destroy_device()', and OVS
is then aborted on the attempt to lock 'dpdk_mutex' again:

VHOST_CONFIG: free connfd = 37 for device '/vhost1'
ovs-vswitchd: lib/netdev-dpdk.c:2305: pthread_mutex_lock failed (Resource deadlock avoided)

Program received signal SIGABRT, Aborted.
0x007fb7ad6d38 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x007fb7ad6d38 in raise () from /lib64/libc.so.6
#1  0x007fb7ad8aa8 in abort () from /lib64/libc.so.6
#2  0x00692be0 in ovs_abort_valist at lib/util.c:335
#3  0x00692ba0 in ovs_abort at lib/util.c:327
#4  0x00651800 in ovs_mutex_lock_at (l_=0x899ab0 <dpdk_mutex>, where=0x78a458 "lib/netdev-dpdk.c:2305") at lib/ovs-thread.c:76
#5  0x006c0190 in destroy_device (vid=0) at lib/netdev-dpdk.c:2305
#6  0x004ea850 in vhost_destroy_device ()
#7  0x004ee578 in rte_vhost_driver_unregister ()
#8  0x006bc8c8 in netdev_dpdk_vhost_destruct (netdev=0x7f6bffed00) at lib/netdev-dpdk.c:944
#9  0x005e4ad4 in netdev_unref (dev=0x7f6bffed00) at lib/netdev.c:499
#10 0x005e4b9c in netdev_close (netdev=0x7f6bffed00) at lib/netdev.c:523
[...]
#20 0x0053ad94 in main (argc=7, argv=0x7ff318) at vswitchd/ovs-vswitchd.c:112

This can be reproduced by removing a port while virtio is still attached.
It blocks the reconnection feature and the deletion of a port while QEMU is
still attached.

Someone should fix this. Any thoughts?

Best regards, Ilya Maximets.

[ovs-dev] [PATCH 7/7] netdev-dpdk: add support for Jumbo Frames

2016-08-05 Thread Mark Kavanagh

Add support for Jumbo Frames to DPDK-enabled port types,
using single-segment-mbufs.

Using this approach, the amount of memory allocated to each mbuf
to store frame data is increased to a value greater than 1518B
(typical Ethernet maximum frame length). The increased space
available in the mbuf means that an entire Jumbo Frame of a specific
size can be carried in a single mbuf, as opposed to partitioning
it across multiple mbuf segments.

The amount of space allocated to each mbuf to hold frame data is
defined dynamically by the user with ovs-vsctl, via the 'mtu_request'
parameter.

Signed-off-by: Mark Kavanagh 
[diproiet...@vmware.com rebased]
Signed-off-by: Daniele Di Proietto 
---

Previous: http://openvswitch.org/pipermail/dev/2016-July/076845.html

v2->v1:
- rebase to HEAD of master
- fall back to previous 'good' MTU if reconfigure fails
- introduce new field 'last_mtu' in struct netdev_dpdk to facilitate
  fall-back
- rename 'mtu_request' to 'requested_mtu' in struct netdev_dpdk
- remove rebasing artifact in INSTALL.DPDK-Advanced.md
- remove superfluous variable in dpdk_mp_configure
- fix minor coding style infraction
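
As a quick reference for the frame-size figures used in this patch (1518B
maximum frame for the standard 1500B MTU), the relation between MTU and
frame length can be sketched as below.  The helper and macro names are
hypothetical, and the sketch assumes untagged frames, i.e. a 14B L2 header
plus a 4B CRC and no VLAN tag.

```c
/* Hypothetical constants: untagged Ethernet L2 header and CRC lengths. */
#define L2_HDR_LEN 14
#define CRC_LEN 4

/* Returns the on-wire frame length needed to carry an IP packet of 'mtu'
 * bytes. */
static int
mtu_to_frame_len(int mtu)
{
    return mtu + L2_HDR_LEN + CRC_LEN;
}

/* Returns the largest MTU that fits into a frame of 'frame_len' bytes. */
static int
frame_len_to_mtu(int frame_len)
{
    return frame_len - L2_HDR_LEN - CRC_LEN;
}
```

So an mtu_request of 9000 corresponds to a 9018B frame, and the 9728B
frames used for validation correspond to an MTU of 9710.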

 INSTALL.DPDK-ADVANCED.md |  58 -
 INSTALL.DPDK.md  |   1 -
 NEWS |   1 +
 lib/netdev-dpdk.c| 165 ---
 4 files changed, 197 insertions(+), 28 deletions(-)

diff --git a/INSTALL.DPDK-ADVANCED.md b/INSTALL.DPDK-ADVANCED.md
index 0ab43d4..5e758ce 100755
--- a/INSTALL.DPDK-ADVANCED.md
+++ b/INSTALL.DPDK-ADVANCED.md
@@ -1,5 +1,5 @@
 OVS DPDK ADVANCED INSTALL GUIDE
-=
+===
 
 ## Contents
 
@@ -12,7 +12,8 @@ OVS DPDK ADVANCED INSTALL GUIDE
 7. [QOS](#qos)
 8. [Rate Limiting](#rl)
 9. [Flow Control](#fc)
-10. [Vsperf](#vsperf)
+10. [Jumbo Frames](#jumbo)
+11. [Vsperf](#vsperf)
 
 ##  1. Overview
 
@@ -862,7 +863,58 @@ respective parameter. To disable the flow control at tx side,
 
 `ovs-vsctl set Interface dpdk0 options:tx-flow-ctrl=false`
 
-##  10. Vsperf
+##  10. Jumbo Frames
+
+By default, DPDK ports are configured with standard Ethernet MTU (1500B). To
+enable Jumbo Frames support for a DPDK port, change the Interface's `mtu_request`
+attribute to a sufficiently large value.
+
+e.g. Add a DPDK Phy port with MTU of 9000:
+
+`ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk -- set Interface dpdk0 mtu_request=9000`
+
+e.g. Change the MTU of an existing port to 6200:
+
+`ovs-vsctl set Interface dpdk0 mtu_request=6200`
+
+When Jumbo Frames are enabled, the size of a DPDK port's mbuf segments is
+increased, such that a full Jumbo Frame of a specific size may be accommodated
+within a single mbuf segment.
+
+Jumbo frame support has been validated against 9728B frames (largest frame size
+supported by Fortville NIC), using the DPDK `i40e` driver, but larger frames
+(particularly in use cases involving East-West traffic only), and other DPDK NIC
+drivers may be supported.
+
+### 10.1 vHost Ports and Jumbo Frames
+
+Some additional configuration is needed to take advantage of jumbo frames with
+vhost ports:
+
+1. `mergeable buffers` must be enabled for vHost ports, as demonstrated in
+the QEMU command line snippet below:
+
+```
+'-netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \'
+'-device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=on'
+```
+
+2. Where virtio devices are bound to the Linux kernel driver in a guest
+   environment (i.e. interfaces are not bound to an in-guest DPDK driver),
+   the MTU of those logical network interfaces must also be increased to a
+   sufficiently large value. This avoids segmentation of Jumbo Frames
+   received in the guest. Note that 'MTU' refers to the length of the IP
+   packet only, and not that of the entire frame.
+
+   To calculate the exact MTU of a standard IPv4 frame, subtract the L2
+   header and CRC lengths (i.e. 18B) from the max supported frame size.
+   So, to set the MTU for a 9018B Jumbo Frame:
+
+   ```
+   ifconfig eth1 mtu 9000
+   ```
+
+##  11. Vsperf
 
 Vsperf project goal is to develop vSwitch test framework that can be used to
 validate the suitability of different vSwitch implementations in a Telco deployment
diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md
index 253d022..a810ac8 100644
--- a/INSTALL.DPDK.md
+++ b/INSTALL.DPDK.md
@@ -590,7 +590,6 @@ can be found in [Vhost Walkthrough].
 
 ##  6. Limitations
 
-  - Supports MTU size 1500, MTU setting for DPDK netdevs will be in future OVS release.
   - Currently DPDK ports does not use HW offload functionality.
   - Network Interface Firmware requirements:
 Each release of DPDK is validated against a specific firmware version for
diff --git a/NEWS b/NEWS
index ce10982..53c816b 100644
--- a/NEWS
+++ b/NEWS
@@

[ovs-dev] [PATCH 6/7] netdev: Make netdev_set_mtu() netdev parameter non-const.

2016-08-05 Thread Mark Kavanagh

From: Daniele Di Proietto 

Every provider silently drops the const attribute when converting the
parameter to the appropriate subclass.  Might as well drop the const
attribute from the parameter, since this is a "set" function.

Signed-off-by: Daniele Di Proietto 
---
v2->v1: add missing 'Signed-off-by' field in commit message.

 lib/netdev-dummy.c| 2 +-
 lib/netdev-linux.c| 2 +-
 lib/netdev-provider.h | 2 +-
 lib/netdev.c  | 2 +-
 lib/netdev.h  | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
index c8f82b7..dec1a8e 100644
--- a/lib/netdev-dummy.c
+++ b/lib/netdev-dummy.c
@@ -1150,7 +1150,7 @@ netdev_dummy_get_mtu(const struct netdev *netdev, int *mtup)
 }
 
 static int
-netdev_dummy_set_mtu(const struct netdev *netdev, int mtu)
+netdev_dummy_set_mtu(struct netdev *netdev, int mtu)
 {
 struct netdev_dummy *dev = netdev_dummy_cast(netdev);
 
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 1b5f7c1..20b5cc7 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1382,7 +1382,7 @@ netdev_linux_get_mtu(const struct netdev *netdev_, int *mtup)
  * networking ioctl interface.
  */
 static int
-netdev_linux_set_mtu(const struct netdev *netdev_, int mtu)
+netdev_linux_set_mtu(struct netdev *netdev_, int mtu)
 {
 struct netdev_linux *netdev = netdev_linux_cast(netdev_);
 struct ifreq ifr;
diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
index 5bcfeba..cd04ae9 100644
--- a/lib/netdev-provider.h
+++ b/lib/netdev-provider.h
@@ -389,7 +389,7 @@ struct netdev_class {
  * If 'netdev' does not have an MTU (e.g. as some tunnels do not), then
  * this function should return EOPNOTSUPP.  This function may be set to
  * null if it would always return EOPNOTSUPP. */
-int (*set_mtu)(const struct netdev *netdev, int mtu);
+int (*set_mtu)(struct netdev *netdev, int mtu);
 
 /* Returns the ifindex of 'netdev', if successful, as a positive number.
  * On failure, returns a negative errno value.
diff --git a/lib/netdev.c b/lib/netdev.c
index 589d37c..5cf8bbb 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -869,7 +869,7 @@ netdev_get_mtu(const struct netdev *netdev, int *mtup)
  * MTU (as e.g. some tunnels do not).  On other failure, returns a positive
  * errno value. */
 int
-netdev_set_mtu(const struct netdev *netdev, int mtu)
+netdev_set_mtu(struct netdev *netdev, int mtu)
 {
 const struct netdev_class *class = netdev->netdev_class;
 int error;
diff --git a/lib/netdev.h b/lib/netdev.h
index dc7ede8..d8ec627 100644
--- a/lib/netdev.h
+++ b/lib/netdev.h
@@ -132,7 +132,7 @@ const char *netdev_get_name(const struct netdev *);
 const char *netdev_get_type(const struct netdev *);
 const char *netdev_get_type_from_name(const char *);
 int netdev_get_mtu(const struct netdev *, int *mtup);
-int netdev_set_mtu(const struct netdev *, int mtu);
+int netdev_set_mtu(struct netdev *, int mtu);
 int netdev_get_ifindex(const struct netdev *);
 int netdev_set_tx_multiq(struct netdev *, unsigned int n_txq);
 
-- 
1.9.3


[ovs-dev] [PATCH 3/7] netdev: Pass 'netdev_class' to ->run() and ->wait().

2016-08-05 Thread Mark Kavanagh

From: Daniele Di Proietto 

This will allow run() and wait() methods to be shared between different
classes and still perform class-specific work.

Signed-off-by: Daniele Di Proietto 
---
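To illustrate why the extra argument is useful, here is a minimal sketch
of one run() function shared by two classes.  All names (`toy_class`,
`toy_dev`, `toy_run`) are hypothetical and the structures are heavily
simplified compared to struct netdev_class.

```c
#include <stddef.h>

/* Hypothetical, simplified provider class with a run() callback that
 * receives the class it is being run for. */
struct toy_class {
    const char *type;
    void (*run)(const struct toy_class *);
};

struct toy_dev {
    const struct toy_class *class_;
    int n_runs;
};

static void toy_run(const struct toy_class *);

/* Two classes that share the same run() implementation. */
static const struct toy_class toy_a = { "toy-a", toy_run };
static const struct toy_class toy_b = { "toy-b", toy_run };

static struct toy_dev devs[] = {
    { &toy_a, 0 }, { &toy_b, 0 }, { &toy_a, 0 },
};

/* Performs periodic work only for devices that belong to 'class_'. */
static void
toy_run(const struct toy_class *class_)
{
    size_t i;

    for (i = 0; i < sizeof devs / sizeof devs[0]; i++) {
        if (devs[i].class_ == class_) {
            devs[i].n_runs++;
        }
    }
}
```

Without the class argument, the shared function would have to process
every device each time it is invoked, once per registered class.
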
 lib/netdev-bsd.c  |  6 +++---
 lib/netdev-dummy.c|  4 ++--
 lib/netdev-linux.c|  6 +++---
 lib/netdev-provider.h | 14 ++
 lib/netdev-vport.c|  4 ++--
 lib/netdev.c  |  4 ++--
 6 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
index 2bba0ed..75a330b 100644
--- a/lib/netdev-bsd.c
+++ b/lib/netdev-bsd.c
@@ -146,7 +146,7 @@ static void ifr_set_flags(struct ifreq *, int flags);
 static int af_link_ioctl(unsigned long command, const void *arg);
 #endif
 
-static void netdev_bsd_run(void);
+static void netdev_bsd_run(const struct netdev_class *);
 static int netdev_bsd_get_mtu(const struct netdev *netdev_, int *mtup);
 
 static bool
@@ -180,7 +180,7 @@ netdev_get_kernel_name(const struct netdev *netdev)
  * interface status changes, and eventually calls all the user callbacks.
  */
 static void
-netdev_bsd_run(void)
+netdev_bsd_run(const struct netdev_class *netdev_class OVS_UNUSED)
 {
 rtbsd_notifier_run();
 }
@@ -190,7 +190,7 @@ netdev_bsd_run(void)
  * be called.
  */
 static void
-netdev_bsd_wait(void)
+netdev_bsd_wait(const struct netdev_class *netdev_class OVS_UNUSED)
 {
 rtbsd_notifier_wait();
 }
diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
index a950409..2a6aa56 100644
--- a/lib/netdev-dummy.c
+++ b/lib/netdev-dummy.c
@@ -622,7 +622,7 @@ dummy_netdev_get_conn_state(struct dummy_packet_conn *conn)
 }
 
 static void
-netdev_dummy_run(void)
+netdev_dummy_run(const struct netdev_class *netdev_class OVS_UNUSED)
 {
 struct netdev_dummy *dev;
 
@@ -636,7 +636,7 @@ netdev_dummy_run(void)
 }
 
 static void
-netdev_dummy_wait(void)
+netdev_dummy_wait(const struct netdev_class *netdev_class OVS_UNUSED)
 {
 struct netdev_dummy *dev;
 
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index fa37bcf..1b5f7c1 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -526,7 +526,7 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
  * changes in the device miimon status, so we can use atomic_count. */
 static atomic_count miimon_cnt = ATOMIC_COUNT_INIT(0);
 
-static void netdev_linux_run(void);
+static void netdev_linux_run(const struct netdev_class *);
 
 static int netdev_linux_do_ethtool(const char *name, struct ethtool_cmd *,
int cmd, const char *cmd_name);
@@ -623,7 +623,7 @@ netdev_linux_miimon_enabled(void)
 }
 
 static void
-netdev_linux_run(void)
+netdev_linux_run(const struct netdev_class *netdev_class OVS_UNUSED)
 {
 struct nl_sock *sock;
 int error;
@@ -697,7 +697,7 @@ netdev_linux_run(void)
 }
 
 static void
-netdev_linux_wait(void)
+netdev_linux_wait(const struct netdev_class *netdev_class OVS_UNUSED)
 {
 struct nl_sock *sock;
 
diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
index ae390cb..5bcfeba 100644
--- a/lib/netdev-provider.h
+++ b/lib/netdev-provider.h
@@ -236,15 +236,21 @@ struct netdev_class {
 int (*init)(void);
 
 /* Performs periodic work needed by netdevs of this class.  May be null if
- * no periodic work is necessary. */
-void (*run)(void);
+ * no periodic work is necessary.
+ *
+ * 'netdev_class' points to the class.  It is useful in case the same
+ * function is used to implement different classes. */
+void (*run)(const struct netdev_class *netdev_class);
 
 /* Arranges for poll_block() to wake up if the "run" member function needs
  * to be called.  Implementations are additionally required to wake
  * whenever something changes in any of its netdevs which would cause their
  * ->change_seq() function to change its result.  May be null if nothing is
- * needed here. */
-void (*wait)(void);
+ * needed here.
+ *
+ * 'netdev_class' points to the class.  It is useful in case the same
+ * function is used to implement different classes. */
+void (*wait)(const struct netdev_class *netdev_class);
 
 /* ##  ## */
 /* ## netdev Functions ## */
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index 87a30f8..7eabd2c 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -321,7 +321,7 @@ netdev_vport_update_flags(struct netdev *netdev OVS_UNUSED,
 }
 
 static void
-netdev_vport_run(void)
+netdev_vport_run(const struct netdev_class *netdev_class OVS_UNUSED)
 {
 uint64_t seq;
 
@@ -334,7 +334,7 @@ netdev_vport_run(void)
 }
 
 static void
-netdev_vport_wait(void)
+netdev_vport_wait(const struct netdev_class *netdev_class OVS_UNUSED)
 {
 uint64_t seq;
 
diff --git a/lib/netdev.c b/lib/netdev.c
index 75bf1cb..589d37c 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -160,7 +160,7 @@ netdev_run(void)
 struct netdev_registered_class *rc;
 CMAP_FOR_EACH (rc, cmap_node,

[ovs-dev] [PATCH 4/7] netdev-dummy: Add dummy-internal class.

2016-08-05 Thread Mark Kavanagh

From: Daniele Di Proietto 

"internal" netdevs are treated specially in OVS (e.g. for MTU), but
the dummy datapath remaps both "system" and "internal" devices to the
same "dummy" netdev class, so there's no way to discern those in tests.

This commit adds a new "dummy-internal" netdev type, which will be used
by the dummy datapath for internal ports, so that other parts of the
code can understand which ports are internal just by looking at the
netdev object.

The alternative solution, using the original interface type ("internal")
instead of the translated netdev type ("dummy"), is harder to implement,
because in so many places only the netdev object is available.

Signed-off-by: Daniele Di Proietto 
---
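The type remapping this patch changes can be sketched in isolation as
follows.  This is a simplified illustration, not the patched function:
`port_open_type` takes a plain boolean (a hypothetical parameter) instead
of a dpif class.

```c
#include <stdbool.h>
#include <string.h>

/* Maps an interface type to the netdev type that should be opened.  Only
 * "internal" is remapped; every other type is passed through unchanged. */
static const char *
port_open_type(bool dummy_datapath, const char *type)
{
    return strcmp(type, "internal") ? type
           : dummy_datapath ? "dummy-internal"
           : "tap";
}
```

With the dummy datapath, "internal" now maps to "dummy-internal" rather
than to the generic "dummy", so internal ports stay distinguishable by
their netdev type alone.
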
 lib/dpif-netdev.c |  2 +-
 lib/netdev-dummy.c| 14 --
 tests/bridge.at   |  6 +++---
 tests/dpctl.at| 12 ++--
 tests/mpls-xlate.at   |  4 ++--
 tests/netdev-type.at  |  2 +-
 tests/ofproto-dpif.at | 18 +-
 tests/ovs-vswitchd.at |  6 +++---
 tests/pmd.at  |  8 
 tests/tunnel-push-pop-ipv6.at |  4 ++--
 tests/tunnel-push-pop.at  |  4 ++--
 tests/tunnel.at   | 28 ++--
 12 files changed, 59 insertions(+), 49 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e39362e..6f2e07d 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -888,7 +888,7 @@ static const char *
 dpif_netdev_port_open_type(const struct dpif_class *class, const char *type)
 {
 return strcmp(type, "internal") ? type
-  : dpif_netdev_class_is_dummy(class) ? "dummy"
+  : dpif_netdev_class_is_dummy(class) ? "dummy-internal"
   : "tap";
 }
 
diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
index 2a6aa56..92af15f 100644
--- a/lib/netdev-dummy.c
+++ b/lib/netdev-dummy.c
@@ -622,12 +622,15 @@ dummy_netdev_get_conn_state(struct dummy_packet_conn *conn)
 }
 
 static void
-netdev_dummy_run(const struct netdev_class *netdev_class OVS_UNUSED)
+netdev_dummy_run(const struct netdev_class *netdev_class)
 {
 struct netdev_dummy *dev;
 
 ovs_mutex_lock(&dummy_list_mutex);
 LIST_FOR_EACH (dev, list_node, &dummy_list) {
+if (netdev_get_class(&dev->up) != netdev_class) {
+continue;
+}
 ovs_mutex_lock(&dev->mutex);
 dummy_packet_conn_run(dev);
 ovs_mutex_unlock(&dev->mutex);
@@ -636,12 +639,15 @@ netdev_dummy_run(const struct netdev_class *netdev_class OVS_UNUSED)
 }
 
 static void
-netdev_dummy_wait(const struct netdev_class *netdev_class OVS_UNUSED)
+netdev_dummy_wait(const struct netdev_class *netdev_class)
 {
 struct netdev_dummy *dev;
 
 ovs_mutex_lock(&dummy_list_mutex);
 LIST_FOR_EACH (dev, list_node, &dummy_list) {
+if (netdev_get_class(&dev->up) != netdev_class) {
+continue;
+}
 ovs_mutex_lock(&dev->mutex);
 dummy_packet_conn_wait(&dev->conn);
 ovs_mutex_unlock(&dev->mutex);
@@ -1380,6 +1386,9 @@ netdev_dummy_update_flags(struct netdev *netdev_,
 static const struct netdev_class dummy_class =
 NETDEV_DUMMY_CLASS("dummy", false, NULL);
 
+static const struct netdev_class dummy_internal_class =
+NETDEV_DUMMY_CLASS("dummy-internal", false, NULL);
+
 static const struct netdev_class dummy_pmd_class =
 NETDEV_DUMMY_CLASS("dummy-pmd", true,
netdev_dummy_reconfigure);
@@ -1751,6 +1760,7 @@ netdev_dummy_register(enum dummy_level level)
 netdev_dummy_override("system");
 }
 netdev_register_provider(&dummy_class);
+netdev_register_provider(&dummy_internal_class);
 netdev_register_provider(&dummy_pmd_class);
 
 netdev_vport_tunnel_register();
diff --git a/tests/bridge.at b/tests/bridge.at
index 37c55ba..3dbabe5 100644
--- a/tests/bridge.at
+++ b/tests/bridge.at
@@ -12,7 +12,7 @@ add_of_ports br0 1 2
 AT_CHECK([ovs-appctl dpif/show], [0], [dnl
 dummy@ovs-dummy: hit:0 missed:0
br0:
-   br0 65534/100: (dummy)
+   br0 65534/100: (dummy-internal)
p1 1/1: (dummy)
p2 2/2: (dummy)
 ])
@@ -23,7 +23,7 @@ AT_CHECK([ovs-appctl dpctl/del-if dummy@ovs-dummy p1])
 AT_CHECK([ovs-appctl dpif/show], [0], [dnl
 dummy@ovs-dummy: hit:0 missed:0
br0:
-   br0 65534/100: (dummy)
+   br0 65534/100: (dummy-internal)
p2 2/2: (dummy)
 ])
 
@@ -32,7 +32,7 @@ AT_CHECK([ovs-vsctl del-port p2])
 AT_CHECK([ovs-appctl dpif/show], [0], [dnl
 dummy@ovs-dummy: hit:0 missed:0
br0:
-   br0 65534/100: (dummy)
+   br0 65534/100: (dummy-internal)
p1 1/1: (dummy)
 ])
 OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
diff --git a/tests/dpctl.at b/tests/dpctl.at
index b6d5dd6..8c761c8 100644
--- a/tests/dpctl.at
+++ b/tests/dpctl.at
@@ -23,14 +23,14 @@ AT_CHECK([ovs-appctl dpctl/show dummy@br0], [0], [dnl
 dummy@br0:
lookups: hit:0

[ovs-dev] [PATCH 2/7] vswitchd: Introduce 'mtu_request' column in Interface.

2016-08-05 Thread Mark Kavanagh

From: Daniele Di Proietto 

The 'mtu_request' column can be used to set the MTU of a specific
interface.

This column is useful because it will allow changing the MTU of DPDK
devices (implemented in a future commit), which are not accessible
outside the ovs-vswitchd process, but it can be used for kernel
interfaces as well.

The current implementation of set_mtu() in netdev-dpdk is removed
because it's broken.  It will be reintroduced by a subsequent commit on
this series.
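
As a rough illustration of the rule this patch adds to bridge.c, here is a
minimal standalone sketch in C. It is not the OVS code: struct iface_cfg and
requested_mtu() are hypothetical names. The only thing carried over from the
patch is the condition itself: the optional column holds exactly one value and
the interface is not of the datapath's internal type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced view of an Interface row.  'mtu_request' is an
 * optional column, modelled here as a pointer plus a count of 0 or 1. */
struct iface_cfg {
    const int64_t *mtu_request;
    size_t n_mtu_request;
};

/* Returns true and stores the requested MTU in '*mtu' when a request is
 * present and 'iface_type' differs from 'internal_type' (the type that
 * internal ports have in this datapath).  Otherwise returns false and
 * leaves '*mtu' untouched. */
static bool
requested_mtu(const struct iface_cfg *cfg, const char *iface_type,
              const char *internal_type, int *mtu)
{
    assert(cfg && iface_type && internal_type && mtu);

    if (cfg->n_mtu_request == 1 && strcmp(iface_type, internal_type)) {
        *mtu = (int) *cfg->mtu_request;
        return true;
    }
    return false;
}
```

In the patch the value is then handed to netdev_set_mtu(); in this sketch the
caller would simply read 'mtu' when the function returns true.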

Signed-off-by: Daniele Di Proietto 
---
 NEWS   |  2 ++
 lib/netdev-dpdk.c  | 53 +-
 vswitchd/bridge.c  |  9 
 vswitchd/vswitch.ovsschema | 10 +++--
 vswitchd/vswitch.xml   | 52 +
 5 files changed, 58 insertions(+), 68 deletions(-)

diff --git a/NEWS b/NEWS
index c2ed71d..ce10982 100644
--- a/NEWS
+++ b/NEWS
@@ -101,6 +101,8 @@ Post-v2.5.0
- ovs-pki: Changed message digest algorithm from SHA-1 to SHA-512 because
  SHA-1 is no longer secure and some operating systems have started to
  disable it in OpenSSL.
+   - Add 'mtu_request' column to the Interface table. It can be used to
+ configure the MTU of non-internal ports.
 
 
 v2.5.0 - 26 Feb 2016
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f37ec1c..60db568 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1639,57 +1639,6 @@ netdev_dpdk_get_mtu(const struct netdev *netdev, int *mtup)
 }
 
 static int
-netdev_dpdk_set_mtu(const struct netdev *netdev, int mtu)
-{
-struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
-int old_mtu, err, dpdk_mtu;
-struct dpdk_mp *old_mp;
-struct dpdk_mp *mp;
-uint32_t buf_size;
-
-ovs_mutex_lock(&dpdk_mutex);
-ovs_mutex_lock(&dev->mutex);
-if (dev->mtu == mtu) {
-err = 0;
-goto out;
-}
-
-buf_size = dpdk_buf_size(mtu);
-dpdk_mtu = FRAME_LEN_TO_MTU(buf_size);
-
-mp = dpdk_mp_get(dev->socket_id, dpdk_mtu);
-if (!mp) {
-err = ENOMEM;
-goto out;
-}
-
-rte_eth_dev_stop(dev->port_id);
-
-old_mtu = dev->mtu;
-old_mp = dev->dpdk_mp;
-dev->dpdk_mp = mp;
-dev->mtu = mtu;
-dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
-
-err = dpdk_eth_dev_init(dev);
-if (err) {
-dpdk_mp_put(mp);
-dev->mtu = old_mtu;
-dev->dpdk_mp = old_mp;
-dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
-dpdk_eth_dev_init(dev);
-goto out;
-}
-
-dpdk_mp_put(old_mp);
-netdev_change_seq_changed(netdev);
-out:
-ovs_mutex_unlock(&dev->mutex);
-ovs_mutex_unlock(&dpdk_mutex);
-return err;
-}
-
-static int
 netdev_dpdk_get_carrier(const struct netdev *netdev, bool *carrier);
 
 static int
@@ -2964,7 +2913,7 @@ netdev_dpdk_vhost_cuse_reconfigure(struct netdev *netdev)
 netdev_dpdk_set_etheraddr,\
 netdev_dpdk_get_etheraddr,\
 netdev_dpdk_get_mtu,  \
-netdev_dpdk_set_mtu,  \
+NULL,   /* set_mtu */ \
 netdev_dpdk_get_ifindex,  \
 GET_CARRIER,  \
 netdev_dpdk_get_carrier_resets,   \
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index ddf1fe5..397be70 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -775,6 +775,15 @@ bridge_delete_or_reconfigure_ports(struct bridge *br)
 goto delete;
 }
 
+if (iface->cfg->n_mtu_request == 1
+&& strcmp(iface->type,
+  ofproto_port_open_type(br->type, "internal"))) {
+/* Try to set the MTU to the requested value.  This is not done
+ * for internal interfaces, since their MTU is decided by the
+ * ofproto module, based on other ports in the bridge. */
+netdev_set_mtu(iface->netdev, *iface->cfg->mtu_request);
+}
+
 /* If the requested OpenFlow port for 'iface' changed, and it's not
  * already the correct port, then we might want to temporarily delete
  * this interface, so we can add it back again with the new OpenFlow
diff --git a/vswitchd/vswitch.ovsschema b/vswitchd/vswitch.ovsschema
index 32fdf28..8966803 100644
--- a/vswitchd/vswitch.ovsschema
+++ b/vswitchd/vswitch.ovsschema
@@ -1,6 +1,6 @@
 {"name": "Open_vSwitch",
- "version": "7.13.0",
- "cksum": "889248633 22774",
+ "version": "7.14.0",
+ "cksum": "3974332717 22936",
  "tables": {
"Open_vSwitch": {
  "columns": {
@@ -321,6 +321,12 @@
"mtu": {
  "type": {"key": "integer", "min": 0, "max": 1},
  "ephemeral": true},
+   "mtu_request": {
+ "type": {
+   "key": {"type": "integer",
+   "minInteger": 1},
+

[ovs-dev] [PATCH 1/7] ofproto: Consider datapath_type when looking for internal ports.

2016-08-05 Thread Mark Kavanagh

From: Daniele Di Proietto 

Interfaces with type "internal" end up having a netdev with type "tap"
in the dpif-netdev datapath, so a strcmp will fail to match internal
interfaces.

We can translate the types with ofproto_port_open_type() before calling
strcmp to fix this.

This fixes a minor issue where internal interfaces are considered
non-internal in the userspace datapath for the purpose of adjusting the
MTU.
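
To make the failure mode concrete, a small standalone sketch (not OVS code)
follows. port_open_type() is a hypothetical stand-in for
ofproto_port_open_type() that models only the "internal" to "tap" translation
described above; the assumption that every other datapath leaves the type
unchanged is made only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for ofproto_port_open_type(): maps a configured
 * port type to the netdev type that the datapath actually opens.  Only the
 * "internal" -> "tap" case of the "netdev" datapath is modelled. */
static const char *
port_open_type(const char *datapath_type, const char *port_type)
{
    assert(datapath_type && port_type);

    if (!strcmp(datapath_type, "netdev") && !strcmp(port_type, "internal")) {
        return "tap";
    }
    return port_type;
}

/* Old check: compares the netdev type with the literal "internal", so a
 * "tap" netdev backing an internal port is missed. */
static bool
is_internal_naive(const char *netdev_type)
{
    return !strcmp(netdev_type, "internal");
}

/* New check: translates "internal" for the datapath before comparing. */
static bool
is_internal_translated(const char *datapath_type, const char *netdev_type)
{
    return !strcmp(netdev_type, port_open_type(datapath_type, "internal"));
}
```

For a "tap" netdev in the "netdev" datapath the naive check reports
non-internal while the translated check reports internal, which is the
difference the patch is after.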

Signed-off-by: Daniele Di Proietto 
---
 ofproto/ofproto.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 8e59c69..088f91a 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -220,7 +220,8 @@ static void learned_cookies_flush(struct ofproto *, struct ovs_list *dead_cookie
 /* ofport. */
 static void ofport_destroy__(struct ofport *) OVS_EXCLUDED(ofproto_mutex);
 static void ofport_destroy(struct ofport *, bool del);
-static inline bool ofport_is_internal(const struct ofport *);
+static inline bool ofport_is_internal(const struct ofproto *,
+  const struct ofport *);
 
 static int update_port(struct ofproto *, const char *devname);
 static int init_ports(struct ofproto *);
@@ -2465,7 +2466,7 @@ static void
 ofport_remove(struct ofport *ofport)
 {
 struct ofproto *p = ofport->ofproto;
-bool is_internal = ofport_is_internal(ofport);
+bool is_internal = ofport_is_internal(p, ofport);
 
 connmgr_send_port_status(ofport->ofproto->connmgr, NULL, &ofport->pp,
  OFPPR_DELETE);
@@ -2751,9 +2752,10 @@ init_ports(struct ofproto *p)
 }
 
 static inline bool
-ofport_is_internal(const struct ofport *port)
+ofport_is_internal(const struct ofproto *p, const struct ofport *port)
 {
-return !strcmp(netdev_get_type(port->netdev), "internal");
+return !strcmp(netdev_get_type(port->netdev),
+   ofproto_port_open_type(p->type, "internal"));
 }
 
 /* Find the minimum MTU of all non-datapath devices attached to 'p'.
@@ -2770,7 +2772,7 @@ find_min_mtu(struct ofproto *p)
 
 /* Skip any internal ports, since that's what we're trying to
  * set. */
-if (ofport_is_internal(ofport)) {
+if (ofport_is_internal(p, ofport)) {
 continue;
 }
 
@@ -2797,7 +2799,7 @@ update_mtu(struct ofproto *p, struct ofport *port)
 port->mtu = 0;
 return;
 }
-if (ofport_is_internal(port)) {
+if (ofport_is_internal(p, port)) {
 if (dev_mtu > p->min_mtu) {
if (!netdev_set_mtu(port->netdev, p->min_mtu)) {
dev_mtu = p->min_mtu;
@@ -2827,7 +2829,7 @@ update_mtu_ofproto(struct ofproto *p)
 HMAP_FOR_EACH (ofport, hmap_node, &p->ports) {
 struct netdev *netdev = ofport->netdev;
 
-if (ofport_is_internal(ofport)) {
+if (ofport_is_internal(p, ofport)) {
 if (!netdev_set_mtu(netdev, p->min_mtu)) {
 ofport->mtu = p->min_mtu;
 }
-- 
1.9.3

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH v2 0/5] check-kernel: add 802.1ad tests

2016-08-05 Thread Eric Garver

Joe,

Thanks for further review. I'll add the changes you have below to the
series.

I'll take a look at the "check-system-userspace" failure. The "802.1ad
- push/pop outer tag" test fails on at least one of my setups.

On Thu, Aug 04, 2016 at 05:53:59PM -0700, Joe Stringer wrote:
> Thanks for updating the series.
> 
> With the incremental patch below this is looking pretty reliable for
> check-kmod/check-kernel on the platforms I can test on, although
> there's still some issue with "make check-system-userspace". It seems
> like the userspace datapath cannot receive double-tagged packets from
> AF_PACKET properly; it's unclear where the issue is yet.
> 
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index ff67be997370..694eeb5f4665 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -85,6 +85,8 @@ ADD_SVLAN(p1, at_ns1, 4094, "10.255.2.2/24")
> ADD_CVLAN(p0.4094, at_ns0, 100, "10.2.2.1/24")
> ADD_CVLAN(p1.4094, at_ns1, 100, "10.2.2.2/24")
> 
> +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping -c 1 10.2.2.2])
> +
> NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.2.2.2 |
> FORMAT_PING], [0], [dnl
> 3 packets transmitted, 3 received, 0% packet loss, time 0ms
> ])
> @@ -176,7 +178,7 @@ ADD_CVLAN(p1.4094, at_ns1, 100, "fc00:1::2/96")
> dnl Linux seems to take a little time to get its IPv6 stack in order. Without
> dnl waiting, we get occasional failures due to the following error:
> dnl "connect: Cannot assign requested address"
> -OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 1 fc00::2])
> +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 1 fc00:1::2])
> 
> NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 2 fc00:1::2 |
> FORMAT_PING], [0], [dnl
> 3 packets transmitted, 3 received, 0% packet loss, time 0ms
> 
> 
> 
> On 2 August 2016 at 08:20, Eric Garver  wrote:
> > This series adds 6 test cases to the "check-kernel" make target for
> > 802.1ad. It is meant as a counterpart to the 802.1ad work currently
> > going on and being discussed on the dev list.
> >
> > User space support for 802.1ad is being worked on by Xiao Liang (based
> > on Thomas F Herbert's work).
> >
> > Kernel support is being worked on by myself (also based on Tom's work).
> > I will post (and CC ovs-dev) the kernel series once net-next opens again
> > for new content. If there is interest I can post that series to ovs-dev
> > for discussion in the mean time.
> >
> > These patches have been tested with Xiao's most recent series and my yet
> > to be posted kernel series.
> >
> > Update v2:
> >  - Properly skip tests on older versions of OVS and kernel
> >  - Set CVLAN mtu to 1496 to allow tests to pass on older kernels
> >
> > Eric Garver (5):
> >   check-kernel: Add macros to check for and test 802.1ad.
> >   check-kernel: 802.1ad: Add datapath ping tests for CVLANs.
> >   check-kernel: 802.1ad: Add conntrack ping tests for CVLANs.
> >   check-kernel: 802.1ad: Add push/pop test case.
> >   check-kernel: 802.1ad: Add dot1q-tunnel test case.
> >
> >  tests/system-common-macros.at |  30 -
> >  tests/system-traffic.at   | 268 
> > ++
> >  2 files changed, 297 insertions(+), 1 deletion(-)
> >
> > --
> > 2.5.5
> >

[ovs-dev] [PATCH] dpcls_lookup: added comments.

2016-08-05 Thread antonio . fischetti

This patch adds some comments to the dpcls_lookup() function,
which is one of the most important places where the Userspace
wildcard matching happens.
The purpose is to give some more explanations on its design
and also on how it works.
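
For readers unfamiliar with the batching that the new comments describe, the
bookkeeping can be sketched in isolation as below. This is only an
illustration under stated assumptions: n_maps_for() and init_maps() are
hypothetical helpers, not functions in dpif-netdev.c. Only map_type and
MAP_BITS are taken from the code being commented.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t map_type;
#define MAP_BITS (sizeof(map_type) * CHAR_BIT)

/* Number of MAP_BITS-sized batches needed to cover 'cnt' search-keys
 * (hypothetical helper, i.e. a rounded-up division). */
static size_t
n_maps_for(size_t cnt)
{
    return (cnt + MAP_BITS - 1) / MAP_BITS;
}

/* Initializes one bitmap per batch.  Bit 'i' of maps[m] is set when
 * search-key number m * MAP_BITS + i still needs a lookup.  'maps' must
 * have room for n_maps_for(cnt) elements. */
static void
init_maps(map_type maps[], size_t cnt)
{
    size_t n_maps = n_maps_for(cnt);
    size_t i;

    assert(cnt > 0);

    for (i = 0; i < n_maps; i++) {
        maps[i] = (map_type) ~0u;       /* Every key is still unmatched. */
    }
    if (cnt % MAP_BITS) {
        /* The last batch is partial: clear the bits past the last key. */
        maps[n_maps - 1] >>= MAP_BITS - cnt % MAP_BITS;
    }
}
```

With cnt = 20 this gives two maps, 0xffff and 0x000f: a full batch of 16 keys
followed by a partial batch of 4. A lookup loop would clear a bit once the
corresponding key has found its rule and skip any map that has become zero.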

Signed-off-by: Antonio Fischetti 
---
 lib/dpif-netdev.c | 40 ++--
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e0107b7..a390758 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4492,8 +4492,8 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule,
 return true;
 }
 
-/* For each miniflow in 'flows' performs a classifier lookup writing the result
- * into the corresponding slot in 'rules'.  If a particular entry in 'flows' is
+/* For each miniflow in 'keys' performs a classifier lookup writing the result
+ * into the corresponding slot in 'rules'.  If a particular entry in 'keys' is
  * NULL it is skipped.
  *
  * This function is optimized for use in the userspace datapath and therefore
@@ -4501,12 +4501,15 @@ dpcls_rule_matches_key(const struct dpcls_rule *rule,
  * classifier_lookup() function.  Specifically, it does not implement
  * priorities, instead returning any rule which matches the flow.
  *
- * Returns true if all flows found a corresponding rule. */
+ * Returns true if all miniflows found a corresponding rule. */
 static bool
 dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
  struct dpcls_rule **rules, const size_t cnt)
 {
-/* The batch size 16 was experimentally found faster than 8 or 32. */
+/* The received 'cnt' miniflows are the search-keys that will be processed
+ * in batches of 16 elements.  N_MAPS will contain the number of these
+ * 16-elements batches.  i.e. for 'cnt'=32, N_MAPS shall be 2.
+ * The batch size 16 was experimentally found faster than 8 or 32. */
 typedef uint16_t map_type;
 #define MAP_BITS (sizeof(map_type) * CHAR_BIT)
 
@@ -4524,6 +4527,16 @@ dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
 }
 memset(rules, 0, cnt * sizeof *rules);
 
+/* The Datapath classifier - aka dpcls - is composed of subtables.
+ * They are dynamically created depending on the new rules we need to
+ * cache.
+ * Each subtable collects rules with a certain subset of packet fields and
+ * with a given unique mask.
+ * We need to process every search-key against each subtable.
+ * When an entry is found the search can stop because rules are
+ * non-overlapping by nature.
+ * The next macro loops on the current subtables listed into the
+ * 'cls->subtables' pvector. */
 PVECTOR_FOR_EACH (subtable, &cls->subtables) {
 const struct netdev_flow_key *mkeys = keys;
 struct dpcls_rule **mrules = rules;
@@ -4532,6 +4545,7 @@ dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
 
 BUILD_ASSERT_DECL(sizeof remains == sizeof *maps);
 
+/* Loops on each batch of 16 search-keys. */
 for (m = 0; m < N_MAPS; m++, mkeys += MAP_BITS, mrules += MAP_BITS) {
 uint32_t hashes[MAP_BITS];
 const struct cmap_node *nodes[MAP_BITS];
@@ -4542,14 +4556,25 @@ dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
 continue; /* Skip empty maps. */
 }
 
-/* Compute hashes for the remaining keys. */
+/* Compute hashes for the remaining keys.
+ * Beside the search-key we need to pass also the specific mask
+ * of the current subtable, because we are using Hash tables for
+ * a wildcard match.
+ * The mask will be applied to the search-key before computing the
+ * Hash value. */
 ULLONG_FOR_EACH_1(i, map) {
 hashes[i] = netdev_flow_key_hash_in_mask(&mkeys[i],
  &subtable->mask);
 }
 /* Lookup. */
 map = cmap_find_batch(&subtable->rules, map, hashes, nodes);
-/* Check results. */
+/* Check results.
+ * When the i-th bit of map is set, it means that a Hash entry
+ * was found for the i-th search-key.  Considering how Hash
+ * mechanism works, we still need to check that the found entry
+ * really matches our masked search-key.  Otherwise we will loop on
+ * the linked nodes - which will be present if any collision
+ * occurred - to repeat the check for a match. */
 ULLONG_FOR_EACH_1(i, map) {
 struct dpcls_rule *rule;
 
@@ -4559,6 +4584,9 @@ dpcls_lookup(const struct dpcls *cls, const struct netdev_flow_key keys[],
 goto next;
 }
 }
+/* The search did find an entry but none of the linked nodes
+ *

Re: [ovs-dev] [openvswitch 2.5.90] testsuite: 2224 failed

2016-08-05 Thread Ilya Maximets

Same situation on another environment:

* Ubuntu 16.04 LTS
* Compiler: gcc (Ubuntu 5.3.1-14ubuntu2.1) 5.3.1 20160413
* Intel(R) Core(TM) i7-3770 CPU

Best regards, Ilya Maximets.

On 05.08.2016 14:37, Ilya Maximets wrote:
> There is one interesting bug:
> 
> Test 2224 (ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS) constantly fails
> with 'CFLAGS=-march=native'. All other tests works normally.
> 
> Environment:
> 
>   * OVS current master:
> commit d59831e9b08e ("bridge: No QoS configured is not an error")
>   * Red Hat Enterprise Linux Server release 7.2 (Maipo)
>   * Compiler: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
>   * Intel(R) Xeon(R) CPU E5-2690 v3
> 
> Test scenario:
> 
>   1. Checkout current master branch.
> 
>   2. Configure OVS with default configuration:
> 
>  # ./boot.sh && ./configure && make
> 
>   3. Check test #2224
> 
>  # make check TESTSUITEFLAGS='2224'
>  2224: ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS   ok
> 
>   4. Clean up
> 
>  # make distclean
> 
>   5. Configure OVS with '-march=native':
> 
>  # ./boot.sh && ./configure CFLAGS="-march=native" && make
> 
>   6. Check test #2224
> 
>  # make check TESTSUITEFLAGS='2224'
>  2224: ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS   FAILED 
> (ovn.at:3205)
> 
> Test failed because of bad packet:
> 
> ./ovn.at:3205: cat 1.packets | cut -c 53-
> --- expout  2016-08-05 14:29:47.205360523 +0300
> +++ /ovs/tests/testsuite.dir/at-groups/2224/stdout   2016-08-05 
> 14:29:47.215360172 +0300
> @@ -1 +1 @@
> -0a010a0400430044011c020106006359aa760a04
>  
> f001
>  
> 
>  
> 
>  
> 
>  
> 638253633501020104ff
>  0003040a0136040a0133040e10ff
> +0a010a0400430044011c020106006359aa760a04
>  
> f001
>  
> 
>  
> 
>  
> 
>  
> 6382536335010236040a
>  010104ff0003040a0133040e10ff
> 
> Full log attached.
> 
> Best regards, Ilya Maximets.
> 

[ovs-dev] Failing Test 2214 on i686

2016-08-05 Thread Christian Ehrhardt

Hi,
while checking latest master we found that there were several tests that
failed sometimes.
  15: bfd - bfd decay
1149: ofproto-dpif - in place modification (vlan)
But one of them seems to reliably fail in i686 mode while working just fine
in a similar amd64 build.

Steps to reproduce:
make check TESTSUITEFLAGS='2214'

I usually use the Debian build using the debian/ subdir packaged in
openvswitch:
dh build --with autoreconf,python2,python3,systemd --parallel

But to make sure I also ran a "normal" clean build with:
./boot.sh && ./configure && make
Built that way it fails just as well.

It seems that the packets were expected twice but only received once:
checking packets in hv1/vif1-tx.pcap against 1.expected:
expected 6 packets, only received 3
../../tests/ovn.at:1326: sort $rcv_text
--- expout 2016-08-05 12:13:09.661688314 +
+++ /<>/_dpdk/tests/testsuite.dir/at-groups/2214/stdout
2016-08-05 12:13:09.661688314 +
@@ -1,6 +1,3 @@
 0100f00202ff
-0100f00303ff
 f001f0020021
-f001f0030031
 f00202ff
-f00303ff

Full log(s) attached

-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

Re: [ovs-dev] [openvswitch 2.5.90] testsuite: 2224 failed

2016-08-05 Thread Ilya Maximets

Exactly same situation with gcc (GCC) 6.1.1 20160510 (Red Hat 6.1.1-2).

On 05.08.2016 14:37, Ilya Maximets wrote:
> There is one interesting bug:
> 
> Test 2224 (ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS) constantly fails
> with 'CFLAGS=-march=native'. All other tests works normally.
> 
> Environment:
> 
>   * OVS current master:
> commit d59831e9b08e ("bridge: No QoS configured is not an error")
>   * Red Hat Enterprise Linux Server release 7.2 (Maipo)
>   * Compiler: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
>   * Intel(R) Xeon(R) CPU E5-2690 v3
> 
> Test scenario:
> 
>   1. Checkout current master branch.
> 
>   2. Configure OVS with default configuration:
> 
>  # ./boot.sh && ./configure && make
> 
>   3. Check test #2224
> 
>  # make check TESTSUITEFLAGS='2224'
>  2224: ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS   ok
> 
>   4. Clean up
> 
>  # make distclean
> 
>   5. Configure OVS with '-march=native':
> 
>  # ./boot.sh && ./configure CFLAGS="-march=native" && make
> 
>   6. Check test #2224
> 
>  # make check TESTSUITEFLAGS='2224'
>  2224: ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS   FAILED 
> (ovn.at:3205)
> 
> Test failed because of bad packet:
> 
> ./ovn.at:3205: cat 1.packets | cut -c 53-
> --- expout  2016-08-05 14:29:47.205360523 +0300
> +++ /ovs/tests/testsuite.dir/at-groups/2224/stdout   2016-08-05 
> 14:29:47.215360172 +0300
> @@ -1 +1 @@
> -0a010a0400430044011c020106006359aa760a04
>  
> f001
>  
> 
>  
> 
>  
> 
>  
> 638253633501020104ff
>  0003040a0136040a0133040e10ff
> +0a010a0400430044011c020106006359aa760a04
>  
> f001
>  
> 
>  
> 
>  
> 
>  
> 6382536335010236040a
>  010104ff0003040a0133040e10ff
> 
> Full log attached.
> 
> Best regards, Ilya Maximets.
> 

[ovs-dev] [openvswitch 2.5.90] testsuite: 2224 failed

2016-08-05 Thread Ilya Maximets

There is one interesting bug:

Test 2224 (ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS) constantly fails
with 'CFLAGS=-march=native'. All other tests work normally.

Environment:

* OVS current master:
  commit d59831e9b08e ("bridge: No QoS configured is not an error")
* Red Hat Enterprise Linux Server release 7.2 (Maipo)
* Compiler: gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
* Intel(R) Xeon(R) CPU E5-2690 v3

Test scenario:

1. Checkout current master branch.

2. Configure OVS with default configuration:

   # ./boot.sh && ./configure && make

3. Check test #2224

   # make check TESTSUITEFLAGS='2224'
   2224: ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS   ok

4. Clean up

   # make distclean

5. Configure OVS with '-march=native':

   # ./boot.sh && ./configure CFLAGS="-march=native" && make

6. Check test #2224

   # make check TESTSUITEFLAGS='2224'
   2224: ovn -- dhcpv4 : 1 HV, 2 LS, 2 LSPs/LS   FAILED 
(ovn.at:3205)

Test failed because of bad packet:

./ovn.at:3205: cat 1.packets | cut -c 53-
--- expout  2016-08-05 14:29:47.205360523 +0300
+++ /ovs/tests/testsuite.dir/at-groups/2224/stdout   2016-08-05 
14:29:47.215360172 +0300
@@ -1 +1 @@
-0a010a0400430044011c020106006359aa760a04
 
f001
 

 

 

 
638253633501020104ff
 0003040a0136040a0133040e10ff
+0a010a0400430044011c020106006359aa760a04
 
f001
 

 

 

 
6382536335010236040a
 010104ff0003040a0133040e10ff

Full log attached.

Best regards, Ilya Maximets.

[ovs-dev] Re: Re: ovs dpdk : userspace connection tracker cannot support L7?

2016-08-05 Thread Yangyongqiang (Tony, Shannon)

Hi,

Is there any mechanism that would let us use the kernel CT from userspace?

I mean that we would send packets to the kernel, and the kernel CT would
return the ct-state.

Or could you give us any comments on userspace CT for L7 protocols?

We would very much appreciate your reply.

thanks

-Original Message-
From: Joe Stringer [mailto:j...@ovn.org]
Sent: 5 August 2016 9:24
To: Yangyongqiang (Tony, Shannon)
Cc: dev@openvswitch.org; huangyongtao (A)
Subject: Re: Re: [ovs-dev] ovs dpdk : userspace connection tracker cannot support L7?

I'm not aware of anyone working on this currently.

On 4 August 2016 at 18:20, Yangyongqiang (Tony, Shannon) 
 wrote:
> Hi Joe,
> Thanks for your reply, this is very helpful.
> Is anybody working on this now? If so, we could work on it together.
>
> thanks
>
> -Original Message-
> From: Joe Stringer [mailto:j...@ovn.org]
> Sent: 5 August 2016 8:57
> To: Yangyongqiang (Tony, Shannon)
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] ovs dpdk : userspace connection tracker cannot
> support L7?
>
> On 2 August 2016 at 19:30, Yangyongqiang (Tony, Shannon) 
>  wrote:
>> Hello,
>>
>> We read the connection tracker code and found that this patch cannot
>> parse the FTP protocol.
>>
>> Does the userspace connection tracker only have L4 features, or does it
>> have L7 features too?
>>
>> If the ct cannot handle L7, then ovs dpdk cannot be used for stateful
>> security groups, so is there a plan for supporting L7?
>
> In the userspace datapath (dpdk), there is no support for IP fragmentation or 
> ALGs today. Only the linux kernel datapath has support for these.

[ovs-dev] [CudaMailTagged] [PATCH 2/2] Tunnel: Fix the issue of tunnel port creation

2016-08-05 Thread Binbin Xu

If a kernel space vxlan port is added first and we then try to add a
user space vxlan port, the user space vxlan port unfortunately cannot
be created.

This commit separates kernel space tunnel ports from user space tunnel
ports, for example:
        kernel_space      user_space
vxlan   vxlan_sys_4789    vxlan_usr_4789
gre     gre_sys           gre_usr
..
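
The naming scheme in the table can be sketched as a standalone C function.
This is a simplified illustration, not the patched
netdev_vport_get_dpif_port(): tunnel_dpif_port_name() is a hypothetical name,
the netdev lookup is replaced by plain parameters, and a 'dst_port' of 0 is
assumed to mean that the tunnel type needs no destination port suffix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds the datapath port name for a tunnel into 'namebuf' and returns it.
 * 'prefix' is the per-tunnel-type base name (for example "vxlan" or "gre"),
 * 'dp_type' is the datapath type, and 'dst_port' is the destination port in
 * host byte order, or 0 if the tunnel type does not use one (an assumption
 * of this sketch). */
static const char *
tunnel_dpif_port_name(const char *prefix, const char *dp_type,
                      int dst_port, char namebuf[], size_t bufsize)
{
    /* Userspace ("netdev") datapath ports get "usr", all others "sys". */
    const char *scope = !strcmp(dp_type, "netdev") ? "usr" : "sys";

    assert(bufsize > 0);

    if (dst_port > 0) {
        snprintf(namebuf, bufsize, "%s_%s_%d", prefix, scope, dst_port);
    } else {
        snprintf(namebuf, bufsize, "%s_%s", prefix, scope);
    }
    return namebuf;
}
```

Because the datapath type is now part of the name, a kernel vxlan port and a
userspace vxlan port on the same destination port no longer map to the same
datapath port name.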

Signed-off-by: Binbin Xu 
---
 lib/dpif-netdev.c  |  3 ++-
 lib/dpif-netlink.c |  2 +-
 lib/netdev-vport.c | 26 +++---
 lib/netdev-vport.h |  2 +-
 lib/netdev.c   | 13 ++---
 lib/netdev.h   |  2 +-
 ofproto/ofproto-dpif.c | 15 ++-
 vswitchd/bridge.c  |  2 +-
 8 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e39362e..3bdf204 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -1343,7 +1343,8 @@ dpif_netdev_port_add(struct dpif *dpif, struct netdev *netdev,
 int error;
 
 ovs_mutex_lock(&dp->port_mutex);
-dpif_port = netdev_vport_get_dpif_port(netdev, namebuf, sizeof namebuf);
+dpif_port = netdev_vport_get_dpif_port(netdev, "netdev",
+   namebuf, sizeof namebuf);
 if (*port_nop != ODPP_NONE) {
 port_no = *port_nop;
 error = dp_netdev_lookup_port(dp, *port_nop) ? EBUSY : 0;
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index a39faa2..1b0cd2c 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -810,7 +810,7 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, struct netdev *netdev,
 {
 const struct netdev_tunnel_config *tnl_cfg;
 char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
-const char *name = netdev_vport_get_dpif_port(netdev,
+const char *name = netdev_vport_get_dpif_port(netdev, "netlink",
   namebuf, sizeof namebuf);
 const char *type = netdev_get_type(netdev);
 struct dpif_netlink_vport request, reply;
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index f5eb76f..e374f9d 100755
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -119,16 +119,19 @@ netdev_vport_class_get_dpif_port(const struct netdev_class *class)
 }
 
 const char *
-netdev_vport_get_dpif_port(const struct netdev *netdev,
+netdev_vport_get_dpif_port(const struct netdev *netdev, const char *dp_type,
char namebuf[], size_t bufsize)
 {
 const struct netdev_class *class = netdev_get_class(netdev);
 const char *dpif_port = netdev_vport_class_get_dpif_port(class);
+char *type;
 
 if (!dpif_port) {
 return netdev_get_name(netdev);
 }
 
+type = !strcmp(dp_type, "netdev") ? "usr" : "sys";
+
 if (netdev_vport_needs_dst_port(netdev)) {
 const struct netdev_vport *vport = netdev_vport_cast(netdev);
 
@@ -138,13 +141,14 @@ netdev_vport_get_dpif_port(const struct netdev *netdev,
  * port numbers but assert just in case.
  */
 BUILD_ASSERT(NETDEV_VPORT_NAME_BUFSIZE >= IFNAMSIZ);
-ovs_assert(strlen(dpif_port) + 6 < IFNAMSIZ);
-snprintf(namebuf, bufsize, "%s_%d", dpif_port,
+ovs_assert(strlen(dpif_port) + 10 < IFNAMSIZ);
+snprintf(namebuf, bufsize, "%s_%s_%d", dpif_port, type,
  ntohs(vport->tnl_cfg.dst_port));
-return namebuf;
 } else {
-return dpif_port;
+snprintf(namebuf, NETDEV_VPORT_NAME_BUFSIZE, "%s_%s",
+dpif_port, type);
 }
+return namebuf;
 }
 
 /* Whenever the route-table change number is incremented,
@@ -890,18 +894,18 @@ netdev_vport_tunnel_register(void)
 /* The name of the dpif_port should be short enough to accomodate adding
  * a port number to the end if one is necessary. */
 static const struct vport_class vport_classes[] = {
-TUNNEL_CLASS("geneve", "genev_sys", netdev_geneve_build_header,
+TUNNEL_CLASS("geneve", "genev", netdev_geneve_build_header,
 netdev_tnl_push_udp_header,
 netdev_geneve_pop_header),
-TUNNEL_CLASS("gre", "gre_sys", netdev_gre_build_header,
+TUNNEL_CLASS("gre", "gre", netdev_gre_build_header,
netdev_gre_push_header,
netdev_gre_pop_header),
-TUNNEL_CLASS("ipsec_gre", "gre_sys", NULL, NULL, NULL),
-TUNNEL_CLASS("vxlan", "vxlan_sys", netdev_vxlan_build_header,
+TUNNEL_CLASS("ipsec_gre", "gre", NULL, NULL, NULL),
+TUNNEL_CLASS("vxlan", "vxlan", netdev_vxlan_build_header,
netdev_tnl_push_udp_header,
netdev_vxlan_pop_header),
-TUNNEL_CLASS("lisp", "lisp_sys", NULL, NULL, NULL),
-TUNNEL_CLASS("stt", "stt_sys", NULL, NULL, NULL),
+TUNNEL_CLASS("lisp", "lisp", NULL, NULL, NULL),
+

[ovs-dev] [PATCH 1/2] netdev-vport: remove unused function

2016-08-05 Thread Binbin Xu

The function netdev_vport_get_dpif_port_strdup is not
used anymore. So we can remove it now.

Signed-off-by: Binbin Xu 
---
 lib/netdev-vport.c | 9 -
 lib/netdev-vport.h | 1 -
 2 files changed, 10 deletions(-)
 mode change 100644 => 100755 lib/netdev-vport.c
 mode change 100644 => 100755 lib/netdev-vport.h

diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
old mode 100644
new mode 100755
index 87a30f8..f5eb76f
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -147,15 +147,6 @@ netdev_vport_get_dpif_port(const struct netdev *netdev,
 }
 }
 
-char *
-netdev_vport_get_dpif_port_strdup(const struct netdev *netdev)
-{
-char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
-
-return xstrdup(netdev_vport_get_dpif_port(netdev, namebuf,
-  sizeof namebuf));
-}
-
 /* Whenever the route-table change number is incremented,
  * netdev_vport_route_changed() should be called to update
  * the corresponding tunnel interface status. */
diff --git a/lib/netdev-vport.h b/lib/netdev-vport.h
old mode 100644
new mode 100755
index be02cb5..048aa6e
--- a/lib/netdev-vport.h
+++ b/lib/netdev-vport.h
@@ -51,6 +51,5 @@ enum { NETDEV_VPORT_NAME_BUFSIZE = 256 };
 const char *netdev_vport_get_dpif_port(const struct netdev *,
char namebuf[], size_t bufsize)
 OVS_WARN_UNUSED_RESULT;
-char *netdev_vport_get_dpif_port_strdup(const struct netdev *);
 
 #endif /* netdev-vport.h */
-- 
1.8.3.1


[ovs-dev] OVS VxLAN over IPSec Support

2016-08-05 Thread Muthukrishnan Thangasamy

Dear Team,

I am using OVS version 2.5.0 on Ubuntu 16.04 for tunnel experimentation.

I am trying to create an interface of type VXLAN over IPsec (ipsec_vxlan),
but OVS says it is not supported, whereas GRE over IPsec (ipsec_gre) is
supported.

Why is VXLAN over IPsec not supported? Is there a reason behind it?

Please let me know if there is any patch or workaround for this.


Thanks

Muthukrishnan

9952012433

Re: [ovs-dev] ovsdb active backup deployment

2016-08-05 Thread Andy Zhou

On Thu, Aug 4, 2016 at 11:14 PM, Russell Bryant  wrote:

>
>
> On Thu, Aug 4, 2016 at 8:17 PM, Andy Zhou  wrote:
>
>>
>> On Wed, Jul 27, 2016 at 1:04 PM, Andy Zhou  wrote:
>>
>>>
>>>
>>> On Tue, Jul 26, 2016 at 6:20 PM, Russell Bryant  wrote:
>>>


 On Tue, Jul 26, 2016 at 3:48 PM, Andy Zhou  wrote:

>
>
> On Tue, Jul 26, 2016 at 11:59 AM, Russell Bryant 
> wrote:
>
>>
>>
>> On Tue, Jul 26, 2016 at 2:41 PM, Andy Zhou  wrote:
>>
>>>
>>>
>>> On Tue, Jul 26, 2016 at 5:37 AM, Russell Bryant 
>>> wrote:
>>>


 On Mon, Jul 25, 2016 at 8:15 PM, Andy Zhou  wrote:

> Hi, Rayn and Russell,
>

 Can we move this discussion to the ovs dev mailing list?  Feel free
 to just add it in a reply if you'd like.

>>> Done.
>>>


> I am wondering how we can actually use the active/backup feature
> that is now part of
> OVSDB to increase OVN availability.
>

 To be clear, I haven't actually tried this yet.  I'm only speaking
 about how I think it should work.


> Specifically:
>
> 1. When the active OVSDB server fails, should the backup server
> take over and allow write transactions? One simpler possibility is to
> allow read-only access to the backup server.
>

 The backup server needs to take over.  It's OK if that requires
 intervention by an HA manager like Pacemaker.  If we can't make the
 passive server take over, I'd say the solution is incomplete.

>>>
>>> O.K. Makes sense.
>>>
>>> One possible issue with the backup server taking over is "split brain".
>>> If a network error disconnects the backup server from the active
>>> server, both servers may think they are the active server.
>>> Does Pacemaker help with solving this issue?
>>>
>>
>> It can, yes.  I would expect Pacemaker to explicitly configure a node
>> to be either the active or passive node.
>>
> Manual switching is more straightforward. I agree.
>
>>

> 2. When a crashed active OVSDB server recovers, should it become
> the new backup, or should it switch back?
>

 Becoming the new backup is fine.  Again, this can be orchestrated
 by an HA manager (Pacemaker).

>>> I am not familiar with Pacemaker. Can I assume it can provide a
>>> correct --sync-from argument (pointing to the backup server) when
>>> relaunching the OVSDB server?
>>>
>>
>> Yes.  I'd have to consult with some Pacemaker experts on exactly what
>> the implementation would look like, but roughly:
>>
>> Pacemaker manages services using "OCF Resource Agents", which are
>> just scripts with a defined set of inputs and outputs for service
>> management.  I would imagine a Pacemaker cluster being told it must have
>> exactly 1 active and 1 passive OVSDB service.  When the passive OVSDB
>> service is started, it would include the "sync-from" argument based on
>> where the active OVSDB service is currently running.
>>
>> We really need to prototype this and document it.  I'm guessing too
>> much.  Pacemaker is frequently used to manage active/passive HA, though.
>>
>> Sounds reasonable. I will work on ovsdb internal changes to support
> manual switching using appctl commands, then look into prototyping with
> HA systems.  I have not used Pacemaker in the past, so it may take some
> time to ramp up.
>

 I should be able to help.  We need to do this work anyway for
 integration into OpenStack deployment tools.  Let me see if I can get some
 helpful examples to follow.

>>>
>>> Thanks for helping out.
>>>
>>> Given that, I now plan to work from bottom up, initially focusing on
>>> ovsdb server changes.
>>>
>>> 1. Add a state in ovsdb-server for it to know whether it is an active
>>> server.  A backup server will not accept any connections.  A server
>>> started with the --sync-from argument will be put in the backup state
>>> by default.
>>>
>>> 2. Add appctl commands to allow switching state manually.
>>>
>>> 3. Add a new table for the backup server to register its address and
>>> ports.  OVSDB clients can learn about them at run time.  The backup
>>> server should issue a transaction to register its address before
>>> issuing the monitoring request.  This feature is not strictly
>>> necessary, and can be pushed to the HA manager, but having it built
>>> into ovsdb-server may make integration simpler.
>>>
>>> What do you think?
>>>
>>>
>>>
>> Russell, Would HA

Re: [ovs-dev] ovsdb active backup deployment

2016-08-05 Thread Russell Bryant

On Thu, Aug 4, 2016 at 8:17 PM, Andy Zhou  wrote:

>
> On Wed, Jul 27, 2016 at 1:04 PM, Andy Zhou  wrote:
>
>>
>>
>> On Tue, Jul 26, 2016 at 6:20 PM, Russell Bryant  wrote:
>>
>>>
>>>
>>> On Tue, Jul 26, 2016 at 3:48 PM, Andy Zhou  wrote:
>>>


 On Tue, Jul 26, 2016 at 11:59 AM, Russell Bryant 
 wrote:

>
>
> On Tue, Jul 26, 2016 at 2:41 PM, Andy Zhou  wrote:
>
>>
>>
>> On Tue, Jul 26, 2016 at 5:37 AM, Russell Bryant 
>> wrote:
>>
>>>
>>>
>>> On Mon, Jul 25, 2016 at 8:15 PM, Andy Zhou  wrote:
>>>
 Hi, Rayn and Russell,

>>>
>>> Can we move this discussion to the ovs dev mailing list?  Feel free
>>> to just add it in a reply if you'd like.
>>>
>> Done.
>>
>>>
>>>
 I am wondering how we can actually use the active/backup feature
 that is now part of
 OVSDB to increase OVN availability.

>>>
>>> To be clear, I haven't actually tried this yet.  I'm only speaking
>>> about how I think it should work.
>>>
>>>
 Specifically:

 1. When the active OVSDB server fails, should the backup server
 take over and allow write transactions? One simpler possibility is to
 allow read-only access to the backup server.

>>>
>>> The backup server needs to take over.  It's OK if that requires
>>> intervention by an HA manager like Pacemaker.  If we can't make the
>>> passive server take over, I'd say the solution is incomplete.
>>>
>>
>> O.K. Makes sense.
>>
>> One possible issue with the backup server taking over is "split brain".
>> If a network error disconnects the backup server from the active
>> server, both servers may think they are the active server.
>> Does Pacemaker help with solving this issue?
>>
>
> It can, yes.  I would expect Pacemaker to explicitly configure a node
> to be either the active or passive node.
>
 Manual switching is more straightforward. I agree.

>
>>>
 2. When a crashed active OVSDB server recovers, should it become
 the new backup, or should it switch back?

>>>
>>> Becoming the new backup is fine.  Again, this can be orchestrated by
>>> an HA manager (Pacemaker).
>>>
>> I am not familiar with Pacemaker. Can I assume it can provide a
>> correct --sync-from argument (pointing to the backup server) when
>> relaunching the OVSDB server?
>>
>
> Yes.  I'd have to consult with some Pacemaker experts on exactly what
> the implementation would look like, but roughly:
>
> Pacemaker manages services using "OCF Resource Agents", which are just
> scripts with a defined set of inputs and outputs for service management.
> I would imagine a Pacemaker cluster being told it must have exactly 1
> active and 1 passive OVSDB service.  When the passive OVSDB service is
> started, it would include the "sync-from" argument based on where the
> active OVSDB service is currently running.
>
> We really need to prototype this and document it.  I'm guessing too
> much.  Pacemaker is frequently used to manage active/passive HA, though.
>
> Sounds reasonable. I will work on ovsdb internal changes to support
 manual switching using appctl commands, then look into prototyping with
 HA systems.  I have not used Pacemaker in the past, so it may take some
 time to ramp up.

>>>
>>> I should be able to help.  We need to do this work anyway for
>>> integration into OpenStack deployment tools.  Let me see if I can get some
>>> helpful examples to follow.
>>>
>>
>> Thanks for helping out.
>>
>> Given that, I now plan to work from bottom up, initially focusing on
>> ovsdb server changes.
>>
>> 1. Add a state in ovsdb-server for it to know whether it is an active
>> server.  A backup server will not accept any connections.  A server
>> started with the --sync-from argument will be put in the backup state
>> by default.
>>
>> 2. Add appctl commands to allow switching state manually.
>>
>> 3. Add a new table for the backup server to register its address and
>> ports.  OVSDB clients can learn about them at run time.  The backup
>> server should issue a transaction to register its address before
>> issuing the monitoring request.  This feature is not strictly
>> necessary, and can be pushed to the HA manager, but having it built
>> into ovsdb-server may make integration simpler.
>>
>> What do you think?
>>
>>
>>
> Russell, would the HA manager also manage ovn-controller switchover?
>

Yes, indirectly.  The way this is typically handled is by using a virtual
IP that moves to whatever host is currently the master.

-- 
Russell Bryant
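
For readers following the thread, the ovsdb-server changes proposed in the
quoted plan (a role state, refusing connections while in the backup role, and
manual switching through control commands) might be sketched roughly as below.
This is only an illustration under stated assumptions: the names server_state,
server_handle_command, "set-active" and "set-backup" are hypothetical and are
not the actual ovsdb-server or ovs-appctl interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical, for illustration only. */
enum server_role { ROLE_ACTIVE, ROLE_BACKUP };

struct server_state {
    enum server_role role;
    const char *sync_from;   /* Server to replicate from, or NULL. */
};

/* Plan item 1: a server started with a sync-from target begins in the
 * backup role; otherwise it begins as the active server. */
static void
server_state_init(struct server_state *s, const char *sync_from)
{
    assert(s != NULL);
    s->sync_from = sync_from;
    s->role = sync_from ? ROLE_BACKUP : ROLE_ACTIVE;
}

/* Plan item 1 (continued): a backup server refuses client connections. */
static bool
server_accepts_connections(const struct server_state *s)
{
    return s->role == ROLE_ACTIVE;
}

/* Plan item 2: manual role switch, as an HA manager might request it.
 * Returns false for an unknown command or a missing argument, leaving
 * the state unchanged. */
static bool
server_handle_command(struct server_state *s, const char *cmd,
                      const char *arg)
{
    if (!strcmp(cmd, "set-active")) {
        s->role = ROLE_ACTIVE;
        s->sync_from = NULL;
        return true;
    } else if (!strcmp(cmd, "set-backup") && arg) {
        s->role = ROLE_BACKUP;
        s->sync_from = arg;
        return true;
    }
    return false;
}
```

In this sketch the HA manager (for example Pacemaker) is the only caller that
changes roles, which matches the preference in the thread for explicit, manual
switching over having the servers decide for themselves.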
