Re: [ovs-dev] [PATCH v1] ofproto: Fix OVS crash when packets hit Openflow rules with certain combinations of nested actions

2019-09-02 Thread Anil Kumar Koli via dev
Hi Ilya,

Sorry for late reply and thanks for the review comments.
Yes, the issue happens only in the userspace datapath and not in kernel 
datapath. I have tested it on kernel datapath and could not see this issue for 
the same build.

Regarding the suggested solution, I still feel like going with recursive mutex 
approach is more cleaner then releasing the lock for 
ofproto_packet_out_finish().  We have the ofproto_dpif_xcache_execute() which 
may lead to race when a flow mod learn is called from another thread. May be 
it could also be possible that in certain combination we may result the crash 
in ofproto_packet_out_start().  May be Ben can share his inputs as well.

Best Regards,Anil 
Kumar

-Original Message-
From: Ilya Maximets 
Sent: Thursday, 29 August, 2019 05:10 PM
To: Anil Kumar Koli ; 
ovs-dev@openvswitch.org
Cc: 'Ben Pfaff' 
Subject: Re: [ovs-dev] [PATCH v1] ofproto: Fix OVS crash when packets hit 
Openflow rules with certain combinations of nested actions

On 29.08.2019 12:52, Anil Kumar Koli wrote:
> Hi Ilya,
>
> Thanks for the review.
> Please find below the stack trace of the crash
>
>  (gdb) bt
> #0  0x7f0a3da46c37 in __GI_raise (sig=sig@entry=6) at
> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x7f0a3da4a028 in __GI_abort () at abort.c:89
> #2  0x0094262e in ovs_abort_valist (err_no=,
> format=, args=args@entry=0x7fffaf59e308) at
> lib/util.c:419
> #3  0x009426b7 in ovs_abort (err_no=,
> format=format@entry=0xb0289f "%s: pthread_%s_%s failed") at
> lib/util.c:411
> #4  0x009104f9 in ovs_mutex_lock_at (l_=l_@entry=0xe47cc0
> , where=where@entry=0xad4199 "ofproto/ofproto.c:5391")
> at lib/ovs-thread.c:75
> #5  0x0083fb1f in ofproto_flow_mod_learn
> (ofm=ofm@entry=0x7fffaf59e620, keep_ref=,
> limit=, below_limitp=below_limitp@entry=0x7fffaf59e540)
> at ofproto/ofproto.c:5391
> #6  0x0085e5d0 in xlate_learn_action
> (ctx=ctx@entry=0x7fffaf5a02e0, learn=learn@entry=0x2b18308) at
> ofproto/ofproto-dpif-xlate.c:5378
> #7  0x008615c3 in do_xlate_actions (ofpacts=,
> ofpacts_len=, ctx=,
> is_last_action=, group_bucket_action=)
> at ofproto/ofproto-dpif-xlate.c:6893
> #8  0x0085edba in xlate_recursively (actions_xlator=0x860730
> , is_last_action=false, deepens=,
> rule=0x2b8e440, ctx=0x7fffaf5a02e0) at
> ofproto/ofproto-dpif-xlate.c:4233
> #9  xlate_table_action (ctx=0x7fffaf5a02e0, in_port=, 
> table_id=, may_packet_in=, 
> honor_table_miss=, with_ct_orig=, 
> is_last_action=false,
> xlator=0x860730 ) at
> ofproto/ofproto-dpif-xlate.c:4361
> #10 0x00861aaa in xlate_ofpact_resubmit (resubmit= out>, resubmit=, resubmit=,
> is_last_action=, ctx=0x7fffaf5a02e0) at
> ofproto/ofproto-dpif-xlate.c:4672
> #11 do_xlate_actions (ofpacts=ofpacts@entry=0x2b654e8, 
> ofpacts_len=ofpacts_len@entry=48, ctx=ctx@entry=0x7fffaf5a02e0, 
> is_last_action=is_last_action@entry=true, 
> group_bucket_action=group_bucket_action@entry=false)
> at ofproto/ofproto-dpif-xlate.c:6773
> #12 0x008696d6 in xlate_actions (xin=xin@entry=0x7fffaf5a0da0,
> xout=xout@entry=0x7fffaf5a11b0) at ofproto/ofproto-dpif-xlate.c:7570
> #13 0x00859b56 in upcall_xlate (wc=0x7fffaf5a23f0,
> odp_actions=0x7fffaf5a1bc0, upcall=0x7fffaf5a1150, udpif=0x2b10670) at
> ofproto/ofproto-dpif-upcall.c:1197
> #14 process_upcall (udpif=udpif@entry=0x2b10670,
> upcall=upcall@entry=0x7fffaf5a1150,
> odp_actions=odp_actions@entry=0x7fffaf5a1bc0,
> wc=wc@entry=0x7fffaf5a23f0) at ofproto/ofproto-dpif-upcall.c:1413
> #15 0x0085a9eb in upcall_cb (packet=, 
> flow=0x7fffaf5a2150, ufid=, pmd_id=, 
> type=, userdata=, actions=0x7fffaf5a1bc0, 
> wc=0x7fffaf5a23f0,
> put_actions=0x7fffaf5a1c00, aux=0x2b10670) at
> ofproto/ofproto-dpif-upcall.c:1315
> #16 0x0088419c in dp_netdev_upcall (pmd=pmd@entry=0x7f0a35f34010, 
> packet_=packet_@entry=0x2b68930, flow=flow@entry=0x7fffaf5a2150, 
> wc=wc@entry=0x7fffaf5a23f0, ufid=ufid@entry=0x7fffaf5a1ba0, 
> type=type@entry=DPIF_UC_MISS,
> userdata=userdata@entry=0x0, actions=actions@entry=0x7fffaf5a1bc0,
> put_actions=put_actions@entry=0x7fffaf5a1c00) at
> lib/dpif-netdev.c:6236
> #17 0x0088d053 in handle_packet_upcall
> (put_actions=0x7fffaf5a1c00, actions=0x7fffaf5a1bc0,
> key=0x7fffaf5a3080, packet=0x2b68930, pmd=0x7f0a35f34010) at
> lib/dpif-netdev.c:6591
> #18 fast_path_processing (pmd=pmd@entry=0x7f0a35f34010, 
> packets_=packets_@entry=0x7fffaf5a35f0, keys=keys@entry=0x7fffaf5a3040, 
> flow_map=flow_map@entry=0x7fffaf5a2ef0, 
> index_map=index_map@entry=0x7fffaf5a2ee0 "",
> in_port=) at lib/dpif-netdev.c:6709
> #19 0x0088e9ce in dp_netdev_input__
> (pmd=pmd@entry=0x7f0a35f34010, packets=packets@entry=0x7fffaf5a35f0,
> md_is_valid=md_is_valid@entry=true, port_no=port_no@entry=0) at
> lib/dpif-netdev.c:6797
> #20 0x00890710 in dp_netdev_recirculate
> (packets=0x7fffaf5a35f0, pmd=0x7f0a35f34010) at lib/dpif-netdev.c:6842
> #21 dp_execute_cb (aux_=aux_@entry=0x7fffa

Re: [ovs-dev] [PATCH ovn 1/2] Add ovn-appctl utility

2019-09-02 Thread 0-day Robot
Bleep bloop.  Greetings Numan Siddique, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#555 FILE: utilities/ovn-appctl.c:90:
  -t, --target=TARGET  pidfile or socket to contact\n\

WARNING: Line lacks whitespace around operator
#557 FILE: utilities/ovn-appctl.c:92:
  list-commands  List commands supported by the target\n\

WARNING: Line lacks whitespace around operator
#559 FILE: utilities/ovn-appctl.c:94:
  vlog/list  List current logging levels\n\

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#560 FILE: utilities/ovn-appctl.c:95:
  vlog/list-pattern  List logging patterns for each destination.\n\

WARNING: Line lacks whitespace around operator
#561 FILE: utilities/ovn-appctl.c:96:
  vlog/set [SPEC]\n\

WARNING: Line lacks whitespace around operator
#566 FILE: utilities/ovn-appctl.c:101:
  vlog/reopenMake the program reopen its log file\n\

WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#568 FILE: utilities/ovn-appctl.c:103:
  --timeout=SECS wait at most SECS seconds for a response\n\

WARNING: Line lacks whitespace around operator
#570 FILE: utilities/ovn-appctl.c:105:
  -V, --version  Display ovs-appctl version information\n",

WARNING: Line is 113 characters long (recommended limit is 79)
#731 FILE: utilities/ovn-ctl:68:
ovn-appctl -t $OVN_RUNDIR/ovnnb_db.ctl 
ovsdb-server/set-active-ovsdb-server `cat $ovnnb_active_conf_file`

WARNING: Line is 87 characters long (recommended limit is 79)
#732 FILE: utilities/ovn-ctl:69:
ovn-appctl -t $OVN_RUNDIR/ovnnb_db.ctl 
ovsdb-server/connect-active-ovsdb-server

WARNING: Line is 113 characters long (recommended limit is 79)
#742 FILE: utilities/ovn-ctl:82:
ovn-appctl -t $OVN_RUNDIR/ovnsb_db.ctl 
ovsdb-server/set-active-ovsdb-server `cat $ovnsb_active_conf_file`

WARNING: Line is 87 characters long (recommended limit is 79)
#743 FILE: utilities/ovn-ctl:83:
ovn-appctl -t $OVN_RUNDIR/ovnsb_db.ctl 
ovsdb-server/connect-active-ovsdb-server

WARNING: Line is 86 characters long (recommended limit is 79)
#752 FILE: utilities/ovn-ctl:92:
ovn-appctl -t $OVN_RUNDIR/ovnnb_db.ctl 
ovsdb-server/disconnect-active-ovsdb-server

WARNING: Line is 86 characters long (recommended limit is 79)
#758 FILE: utilities/ovn-ctl:97:
ovn-appctl -t $OVN_RUNDIR/ovnsb_db.ctl 
ovsdb-server/disconnect-active-ovsdb-server

WARNING: Line is 98 characters long (recommended limit is 79)
#767 FILE: utilities/ovn-ctl:277:
ovn-appctl -t $OVN_RUNDIR/ovn${1}_db.ctl ovsdb-server/sync-status | awk 
'{if(NR==1) print $2}'

Lines checked: 773, Warnings: 18, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 2/2] Generate documentation and manpages for ovn-archicture and ovn-nb/ovn-sb

2019-09-02 Thread nusiddiq
From: Numan Siddique 

This was missing when OVN was split from OVS.

Signed-off-by: Numan Siddique 
---
 Makefile.am | 17 +
 TODO_SPLIT.rst  |  6 ---
 automake.mk | 84 +
 ovn-nb.xml  |  2 +-
 rhel/ovn-fedora.spec.in |  6 +--
 5 files changed, 89 insertions(+), 26 deletions(-)
 create mode 100644 automake.mk

diff --git a/Makefile.am b/Makefile.am
index f3df733a1..97dc309e3 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -488,22 +488,7 @@ dist-docs:
 .PHONY: dist-docs
 
 
-# Version checking for ovn-nb.ovsschema.
-ALL_LOCAL += ovn-nb.ovsschema.stamp
-ovn-nb.ovsschema.stamp: ovn-nb.ovsschema
-   $(srcdir)/build-aux/cksum-schema-check $? $@
-CLEANFILES += ovn-nb.ovsschema.stamp
-
-# Version checking for ovn-sb.ovsschema.
-ALL_LOCAL += ovn-sb.ovsschema.stamp
-ovn-sb.ovsschema.stamp: ovn-sb.ovsschema
-   $(srcdir)/build-aux/cksum-schema-check $? $@
-
-pkgdata_DATA += ovn-nb.ovsschema
-pkgdata_DATA += ovn-sb.ovsschema
-
-CLEANFILES += ovn-sb.ovsschema.stamp
-
+include automake.mk
 include Documentation/automake.mk
 include m4/automake.mk
 include lib/automake.mk
diff --git a/TODO_SPLIT.rst b/TODO_SPLIT.rst
index bb8657eb1..933534084 100644
--- a/TODO_SPLIT.rst
+++ b/TODO_SPLIT.rst
@@ -35,12 +35,6 @@ Immediate tasks
 * Someone with a decent ability to write should give the README.rst file some
   polish (or even just rewrite it. I won't be offended).
 
-* After the split, the below things are missing during compilation
- - OVN northbound/southbound E-R diagram
- - OVN northbound/southbound schema documentation
- - ovn-architecture manpage generation.
-  This needs to be fixed.
-
 * Cleanup the acinclude.m4 and m4 folder
 
 Immediate to Short-term tasks
diff --git a/automake.mk b/automake.mk
new file mode 100644
index 0..ad801f1e5
--- /dev/null
+++ b/automake.mk
@@ -0,0 +1,84 @@
+man_MANS += ovn-architecture.7
+EXTRA_DIST += ovn-architecture.7.xml
+CLEANFILES += ovn-architecture.7
+
+# OVN northbound E-R diagram
+#
+# If "python" or "dot" is not available, then we do not add graphical diagram
+# to the documentation.
+if HAVE_PYTHON
+if HAVE_DOT
+OVSDB_DOT = $(run_python) ${OVSDIR}/ovsdb/ovsdb-dot.in
+ovn-nb.gv: ${OVSDIR}/ovsdb/ovsdb-dot.in $(srcdir)/ovn-nb.ovsschema
+   $(AM_V_GEN)$(OVSDB_DOT) --no-arrows $(srcdir)/ovn-nb.ovsschema > $@
+ovn-nb.pic: ovn-nb.gv ${OVSDIR}/ovsdb/dot2pic
+   $(AM_V_GEN)(dot -T plain < ovn-nb.gv | $(PYTHON) 
${OVSDIR}/ovsdb/dot2pic -f 3) > $@.tmp && \
+   mv $@.tmp $@
+OVN_NB_PIC = ovn-nb.pic
+OVN_NB_DOT_DIAGRAM_ARG = --er-diagram=$(OVN_NB_PIC)
+CLEANFILES += ovn-nb.gv ovn-nb.pic
+endif
+endif
+
+# OVN northbound schema documentation
+EXTRA_DIST += ovn-nb.xml
+CLEANFILES += ovn-nb.5
+man_MANS += ovn-nb.5
+
+OVSDB_DOC = $(run_python) ${OVSDIR}/ovsdb/ovsdb-doc
+ovn-nb.5: \
+   ${OVSDIR}/ovsdb/ovsdb-doc $(srcdir)/ovn-nb.xml 
$(srcdir)/ovn-nb.ovsschema $(OVN_NB_PIC)
+   $(AM_V_GEN)$(OVSDB_DOC) \
+   $(OVN_NB_DOT_DIAGRAM_ARG) \
+   --version=$(VERSION) \
+   $(srcdir)/ovn-nb.ovsschema \
+   $(srcdir)/ovn-nb.xml > $@.tmp && \
+   mv $@.tmp $@
+
+# OVN southbound E-R diagram
+#
+# If "python" or "dot" is not available, then we do not add graphical diagram
+# to the documentation.
+if HAVE_PYTHON
+if HAVE_DOT
+ovn-sb.gv: ${OVSDIR}/ovsdb/ovsdb-dot.in $(srcdir)/ovn-sb.ovsschema
+   $(AM_V_GEN)$(OVSDB_DOT) --no-arrows $(srcdir)/ovn-sb.ovsschema > $@
+ovn-sb.pic: ovn-sb.gv ${OVSDIR}/ovsdb/dot2pic
+   $(AM_V_GEN)(dot -T plain < ovn-sb.gv | $(PYTHON) 
${OVSDIR}/ovsdb/dot2pic -f 3) > $@.tmp && \
+   mv $@.tmp $@
+OVN_SB_PIC = ovn-sb.pic
+OVN_SB_DOT_DIAGRAM_ARG = --er-diagram=$(OVN_SB_PIC)
+CLEANFILES += ovn-sb.gv ovn-sb.pic
+endif
+endif
+
+# OVN southbound schema documentation
+EXTRA_DIST += ovn-sb.xml
+CLEANFILES += ovn-sb.5
+man_MANS += ovn-sb.5
+
+ovn-sb.5: \
+   ${OVSDIR}/ovsdb/ovsdb-doc $(srcdir)/ovn-sb.xml 
$(srcdir)/ovn-sb.ovsschema $(OVN_SB_PIC)
+   $(AM_V_GEN)$(OVSDB_DOC) \
+   $(OVN_SB_DOT_DIAGRAM_ARG) \
+   --version=$(VERSION) \
+   $(srcdir)/ovn-sb.ovsschema \
+   $(srcdir)/ovn-sb.xml > $@.tmp && \
+   mv $@.tmp $@
+
+
+# Version checking for ovn-nb.ovsschema.
+ALL_LOCAL += ovn-nb.ovsschema.stamp
+ovn-nb.ovsschema.stamp: ovn-nb.ovsschema
+   $(srcdir)/build-aux/cksum-schema-check $? $@
+CLEANFILES += ovn-nb.ovsschema.stamp
+
+# Version checking for ovn-sb.ovsschema.
+ALL_LOCAL += ovn-sb.ovsschema.stamp
+ovn-sb.ovsschema.stamp: ovn-sb.ovsschema
+   $(srcdir)/build-aux/cksum-schema-check $? $@
+
+pkgdata_DATA += ovn-nb.ovsschema
+pkgdata_DATA += ovn-sb.ovsschema
+
+CLEANFILES += ovn-sb.ovsschema.stamp
diff --git a/ovn-nb.xml b/ovn-nb.xml
index b99a808b8..afef3e432 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -1527,7 +1527,7 @@
   address.
 
   
-  
+  
 
   Enables/disables IP mul

[ovs-dev] [PATCH ovn 1/2] Add ovn-appctl utility

2019-09-02 Thread nusiddiq
From: Numan Siddique 

Now that OVN has it's own rundir, "ovs-appctl -t ovn-controller/ovn-northd"
doesn't work. To fix this, ovn-appctl utility is added which
looks for the OVN pid/ctl files in the ovn rundir.

The code is taken from ovs-appctl.c and modified to use ovn_rundir()
instead of ovs_rundir().

Signed-off-by: Numan Siddique 
---
 rhel/ovn-fedora.spec.in|   2 +
 utilities/.gitignore   |   2 +
 utilities/automake.mk  |  13 +-
 utilities/ovn-appctl.8.xml | 352 +
 utilities/ovn-appctl.c | 239 +
 utilities/ovn-ctl  |  18 +-
 6 files changed, 615 insertions(+), 11 deletions(-)
 create mode 100644 utilities/ovn-appctl.8.xml
 create mode 100644 utilities/ovn-appctl.c

diff --git a/rhel/ovn-fedora.spec.in b/rhel/ovn-fedora.spec.in
index 14035de9a..9ee807fab 100644
--- a/rhel/ovn-fedora.spec.in
+++ b/rhel/ovn-fedora.spec.in
@@ -430,6 +430,7 @@ fi
 %{_bindir}/ovn-sbctl
 %{_bindir}/ovn-trace
 %{_bindir}/ovn-detrace
+%{_bindir}/ovn-appctl
 %{_datadir}/ovn/scripts/ovn-ctl
 %{_datadir}/ovn/scripts/ovn-lib
 %{_datadir}/ovn/scripts/ovndb-servers.ocf
@@ -440,6 +441,7 @@ fi
 %{_mandir}/man8/ovn-nbctl.8*
 %{_mandir}/man8/ovn-trace.8*
 %{_mandir}/man1/ovn-detrace.1*
+%{_mandir}/man8/ovn-appctl.8*
 #%{_mandir}/man7/ovn-architecture.7* - Uncomment this once the manpage is fixed
 %{_mandir}/man8/ovn-sbctl.8*
 #%{_mandir}/man5/ovn-nb.5* - Uncomment this once the manpage is fixed
diff --git a/utilities/.gitignore b/utilities/.gitignore
index 1d01e0b28..b319e8366 100644
--- a/utilities/.gitignore
+++ b/utilities/.gitignore
@@ -3,6 +3,8 @@
 /ovn-nbctl.8
 /ovn-sbctl
 /ovn-sbctl.8
+/ovn-appctl
+/ovn-appctl.8
 /ovn-trace
 /ovn-trace.8
 /ovn-detrace
diff --git a/utilities/automake.mk b/utilities/automake.mk
index 21dd8ccdf..ab0f6003a 100644
--- a/utilities/automake.mk
+++ b/utilities/automake.mk
@@ -8,7 +8,8 @@ man_MANS += \
 utilities/ovn-nbctl.8 \
 utilities/ovn-sbctl.8 \
 utilities/ovn-trace.8 \
-utilities/ovn-detrace.1
+utilities/ovn-detrace.1 \
+utilities/ovn-appctl.8
 
 MAN_ROOTS += \
 utilities/ovn-sbctl.8.in \
@@ -27,6 +28,7 @@ EXTRA_DIST += \
 utilities/ovn-docker-overlay-driver.in \
 utilities/ovn-docker-underlay-driver.in \
 utilities/ovn-nbctl.8.xml \
+utilities/ovn-appctl.8.xml \
 utilities/ovn-trace.8.xml \
 utilities/ovn-detrace.in \
 utilities/ovndb-servers.ocf \
@@ -49,7 +51,9 @@ CLEANFILES += \
 utilities/ovn-sbctl.8 \
 utilities/ovn-trace.8 \
 utilities/ovn-detrace.1 \
-utilities/ovn-detrace
+utilities/ovn-detrace \
+utilities/ovn-appctl.8 \
+utilities/ovn-appctl
 
 utilities/ovn-lib: $(top_builddir)/config.status
 
@@ -68,4 +72,9 @@ bin_PROGRAMS += utilities/ovn-trace
 utilities_ovn_trace_SOURCES = utilities/ovn-trace.c
 utilities_ovn_trace_LDADD = lib/libovn.la $(OVSDB_LIBDIR)/libovsdb.la 
$(OVS_LIBDIR)/libopenvswitch.la
 
+# ovn-nbctl
+bin_PROGRAMS += utilities/ovn-appctl
+utilities_ovn_appctl_SOURCES = utilities/ovn-appctl.c
+utilities_ovn_appctl_LDADD = lib/libovn.la $(OVSDB_LIBDIR)/libovsdb.la 
$(OVS_LIBDIR)/libopenvswitch.la
+
 include utilities/bugtool/automake.mk
diff --git a/utilities/ovn-appctl.8.xml b/utilities/ovn-appctl.8.xml
new file mode 100644
index 0..32a42a766
--- /dev/null
+++ b/utilities/ovn-appctl.8.xml
@@ -0,0 +1,352 @@
+
+
+Name
+ovn-appctl -- utility for configuring running OVN daemons
+
+Synopsis
+
+   ovn-appctl [--target=target | -t target]
+  [-T secs | --timeout=secs] command [arg...]
+
+ovn-appctl --help 
+ovn-appctl --version 
+
+Description
+
+  OVN daemons accept certain commands at runtime to control their behavior
+  and query their settings. Every daemon accepts a common set of commands
+  documented under COMMON COMMANDS below. Some daemons support additional
+  commands documented in their own manpages.
+
+
+
+   The ovn-appctl program provides a simple way to invoke
+   these commands. The command to be sent is specified on
+   ovn-appctl's command line as non-option arguments.
+   ovn-appctl sends the command and prints the daemon's
+   response on standard output.
+
+
+
+  ovn-ctl is exactly similar to Open vSwitch
+  ovs-appctl utility.
+
+
+Command Commands
+
+  Every OVN daemon supports a common set of commands, which are documented
+  in this section.
+
+
+General Commands
+
+  These commands display daemon-specific commands and the running version.
+  Note that these commands are different from the --help and --version
+  options that return information about the ovn-appctl
+  utility itself.
+
+
+
+  list-commands
+  
+Lists the commands supported by the target.
+  
+
+  version
+  
+Displays the version and compilation date of the target.
+  
+
+
+Logging Commands
+
+  OVN ha

Re: [ovs-dev] ovn-controller is taking 100% CPU all the time in one deployment

2019-09-02 Thread Numan Siddique
On Sat, Aug 31, 2019 at 2:48 AM Han Zhou  wrote:

>
>
> On Thu, Aug 29, 2019 at 11:36 PM Numan Siddique 
> wrote:
> >
> >
> >
> > On Fri, Aug 30, 2019 at 1:04 AM Han Zhou  wrote:
> >>
> >>
> >>
> >> On Thu, Aug 29, 2019 at 12:16 PM Numan Siddique 
> wrote:
> >>>
> >>>
> >>>
> >>> On Fri, Aug 30, 2019 at 12:37 AM Han Zhou  wrote:
> 
> 
> 
>  On Thu, Aug 29, 2019 at 11:40 AM Numan Siddique 
> wrote:
>  >
>  > Hello Everyone,
>  >
>  > In one of the OVN deployments, we are seeing 100% CPU usage by
> ovn-controllers all the time.
>  >
>  > After investigations we found the below
>  >
>  >  - ovn-controller is taking more than 20 seconds to complete full
> loop (mainly in lflow_run() function)
>  >
>  >  - The physical switch is sending GARPs periodically every 10
> seconds.
>  >
>  >  - There is ovn-bridge-mappings configured and these GARP packets
> reaches br-int via the patch port.
>  >
>  >  - We have a flow in router pipeline which applies the action -
> put_arp
>  > if it is arp packet.
>  >
>  >  - ovn-controller pinctrl thread receives these garps, stores the
> learnt mac-ips in the 'put_mac_bindings' hmap and notifies the
> ovn-controller main thread by incrementing the seq no.
>  >
>  >  - In the ovn-controller main thread, after lflow_run() finishes,
> pinctrl_wait() is called. This function calls - poll_immediate_wake() as
> 'put_mac_bindings' hmap is not empty.
>  >
>  > - This causes the ovn-controller poll_block() to not sleep at all
> and this repeats all the time resulting in 100% cpu usage.
>  >
>  > The deployment has OVS/OVN 2.9.  We have back ported the
> pinctrl_thread patch.
>  >
>  > Some time back I had reported an issue about lflow_run() taking lot
> of time -
> https://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360414.html
>  >
>  > I think we need to improve the logical processing sooner or later.
>  >
>  > But to fix this issue urgently, we are thinking of the below
> approach.
>  >
>  >  - pinctrl_thread will locally cache the mac_binding entries (just
> like it caches the dns entries). (Please note pinctrl_thread can not access
> the SB DB IDL).
>  >
>  > - Upon receiving any arp packet (via the put_arp action),
> pinctrl_thread will check the local mac_binding cache and will only wake up
> the main ovn-controller thread only if the mac_binding update is required.
>  >
>  > This approach will solve the issue since the MAC sent by the
> physical switches will not change. So there is no need to wake up
> ovn-controller main thread.
>  >
>  > In the present master/2.12 these GARPs will not cause this 100% cpu
> loop issue because incremental processing will not recompute flows.
>  >
>  > Even though the above approach is not really required for
> master/2.12, I think it is still Ok to have this as there is no harm.
>  >
>  > I would like to know your comments and any concerns if any.
>  >
>  > Thanks
>  > Numan
>  >
> 
>  Hi Numan,
> 
>  I think this approach should work. Just to make sure, to update the
> cache efficiently (to avoid another kind of recompute), it should use ovsdb
> change-tracking to update it incrementally.
> 
>  Regarding master/2.12, it is not harmful except that it will add some
> more code and increase memory footprint. For our current use cases, there
> can be easily 10,000s mac_bindings, but it may still be ok because each
> entry is very small. However, is there any benefit for doing this in
> master/2.12?
> >>>
> >>>
> >>> I don't see much benefit. But I can't submit a patch to branch 2.9
> without the fix getting merged in master first right ?
> >>> May be once it is merged in branch 2.9, we can consider to delete it ?
> >>>
> >> I think it is just about how would you maintain a downstream branch.
> Since it is downstream, I don't think you need a change to be in upstream
> before fixing a problem. In this case it may be *no harm*, but what if the
> upstream is completely changed and incompatible for such a fix any more? It
> shouldn't prevent you from fixing your downstream. (Of course it is better
> to not have downstream at all, but sometimes it is useful to have it for a
> temporary period, and since you (and us, too) are already there ... :)
> >
> >
> > The dowstream 2.9 what we have is - OVS 2.9.0 + a bunch of patches (to
> fix issues) which are already merged upstream (preferably upstream branch
> or at least upstream master).  Any downstream only patch is frowned upon.
> When we updrade to 2.10 or higher versions there is  a risk of functional
> changes if the patch is not upstream.
> >
> > If we have apply the approach I described above to downstream 2.9 then
> there is definitely some functional change. When such GARPs are received,
> in the case of our downstream 2.9 we will not wake up ovn-controller main
> thread
> > but

[ovs-dev] [PATCH ovn] Handle GARP reply packets from provider networks only on gateway chassis

2019-09-02 Thread nusiddiq
From: Numan Siddique 

Suppose there is a provider network (with localnet port) and it is
connected to a logical router via a distributed gateway router port.
When an external switch sends GARP request packets, these packets
enter the br-int (via the patch port from the provider bridge) and
the action - put_arp is applied in the router pipeline only on the
gateway chassis where the distributed gateway router port is scheduled.
Other chassis even if they receive don't apply this action. This functionality
was added by the commit [1].

But that is not the case with GARP reply packets sent by the external
switch. ovn-controllers running on all the chassis configured with
ovn-bridge-mappings receive these packets as packet-ins because
of put_arp action. This results in unnecessary processing of these
packets only to be ignored. pinctrl thread wakes up the main ovn-controller
thread wasting few CPU cycles. It is enough for the gateway chassis to
handle these packets where the distributed router gateway port is scheduled.

This patch achieves the same - similar to GARP request packets. The below
logical flows are added
 - table=1 (lr_in_ip_input), priority=92,
   match=(inport == "lrp" &&
  !is_chassis_resident("cr-lrp") && eth.bcast &&
  arp.op == 2 && arp.spa == PROVIDER_NETWORK_CIDR), action=(drop;)

 - table=1 (lr_in_ip_input), priority=92
   match=(inport == "lr0-public" && is_chassis_resident("cr-lr0-public")
  && eth.bcast && arp.op == 2 && arp.spa == PROVIDER_NETWORK_CIDR),
   action=(put_arp(inport, arp.spa, arp.sha);)

This patch doesn't affect the arp replies from the overlay networks in the
router pipeline. This only handles GARP replies from the provider networks.

In the present master this is not much of any issue even if ovn-controller
wakes up, thanks to incremental processing engine. But in the older
versions - 2.11, 2.10 and 2.9, this results in complete flow calculations
resulting in much more CPU cyles. This patch will definitely help in saving
these CPU cyles if backported.

[1] - 3dd9febe953f("ovn-northd: Support learning neighbor from ARP request.")

CC: Han ZHou 
Signed-off-by: Numan Siddique 
---
 northd/ovn-northd.8.xml | 31 +++
 northd/ovn-northd.c | 35 +++
 2 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 442e899bc..a06e404dd 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1442,10 +1442,33 @@ arp.sha = external_mac;
   
 
   
-ARP reply handling.  This flow uses ARP replies to populate the
-logical router's ARP table.  A priority-90 flow with match arp.op
-== 2 has actions put_arp(inport, arp.spa,
-arp.sha);.
+
+  ARP reply handling.  Following flows are added to handle ARP replies.
+
+
+
+  For each distributed gateway logical router port a priority-92 flow
+  with match inport == P &&
+  is_chassis_resident(cr-P) && eth.bcast &&
+  arp.op == 2 && arp.spa == I with the
+  action put_arp(inport, arp.spa, arp.sha); so that the
+  resident gateway chassis can learn the GARP reply, where
+  P is the distributed gateway router port name,
+  I is the logical router port's network address.
+
+
+
+  For each distributed gateway logical router port a priority-92 flow
+  with match inport == P &&
+  !is_chassis_resident(cr-P) && eth.bcast &&
+  arp.op == 2 && arp.spa == I with the action
+  drop; so that other chassis drop this packet.
+
+
+
+  A priority-90 flow with match arp.op == 2 has actions
+  put_arp(inport, arp.spa, arp.sha);.
+
   
 
   
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 9a282..a83b56362 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -6552,6 +6552,41 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
 
 }
 
+/* Handle GARP reply packets received on a distributed router gateway
+ * port. GARP reply broadcast packets could be sent by external
+ * switches. We don't want them to be handled by all the
+ * ovn-controllers if they receive it. So add a priority-92 flow to
+ * apply the put_arp action on a redirect chassis and drop it on
+ * other chassis.
+ * Note that we are already adding a priority-90 logical flow in the
+ * table S_ROUTER_IN_IP_INPUT to apply the put_arp action if
+ * arp.op == 2.
+ * */
+if (op->od->l3dgw_port && op == op->od->l3dgw_port
+&& op->od->l3redirect_port) {
+for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
+ds_clear(&match);
+ds_put_format(&match,
+  "inport == %s && is_chassis_resident(%s) &

Re: [ovs-dev] [PATCH v2 3/3] dpdk: Use ovs-numa provided functions to manage thread affinity.

2019-09-02 Thread David Marchand
On Mon, Sep 2, 2019 at 3:38 PM Ilya Maximets  wrote:
>
> On 02.09.2019 16:17, David Marchand wrote:
> > On Mon, Sep 2, 2019 at 3:15 PM Ilya Maximets  wrote:
> >>
> >> On 02.09.2019 16:09, David Marchand wrote:
> >>> On Mon, Sep 2, 2019 at 1:27 PM Ilya Maximets  
> >>> wrote:
> 
>  This allows to decrease code duplication and avoid using Linux-specific
>  functions (this might be useful in the future if we'll try to allow
>  running OvS+DPDK on FreeBSD).
> 
>  Signed-off-by: Ilya Maximets 
>  Acked-by: William Tu 
>  ---
>   lib/dpdk.c | 27 ---
>   1 file changed, 12 insertions(+), 15 deletions(-)
> 
>  diff --git a/lib/dpdk.c b/lib/dpdk.c
>  index fc58de55a..6f297d918 100644
>  --- a/lib/dpdk.c
>  +++ b/lib/dpdk.c
>  @@ -275,7 +275,7 @@ dpdk_init__(const struct smap *ovs_other_config)
>   int result;
>   bool auto_determine = true;
>   int err = 0;
> >>>
> >>> Nit: err can be removed.
> >>
> >> It's used 2 times in this function.
> >> We could only avoid initializing it here.
> >
> > Yes, but we do not care about err itself, we are just checking
> > function return values.
>
> I see. It seems for me more like a separate change. This could be
> done, for example, along with improving stat/pdump error messages.

This patch removed the only reason to keep this variable from my pov.
But yes, it can go in a cleanup later.


-- 
David Marchand
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 3/3] dpdk: Use ovs-numa provided functions to manage thread affinity.

2019-09-02 Thread Ilya Maximets
On 02.09.2019 16:17, David Marchand wrote:
> On Mon, Sep 2, 2019 at 3:15 PM Ilya Maximets  wrote:
>>
>> On 02.09.2019 16:09, David Marchand wrote:
>>> On Mon, Sep 2, 2019 at 1:27 PM Ilya Maximets  wrote:

 This allows to decrease code duplication and avoid using Linux-specific
 functions (this might be useful in the future if we'll try to allow
 running OvS+DPDK on FreeBSD).

 Signed-off-by: Ilya Maximets 
 Acked-by: William Tu 
 ---
  lib/dpdk.c | 27 ---
  1 file changed, 12 insertions(+), 15 deletions(-)

 diff --git a/lib/dpdk.c b/lib/dpdk.c
 index fc58de55a..6f297d918 100644
 --- a/lib/dpdk.c
 +++ b/lib/dpdk.c
 @@ -275,7 +275,7 @@ dpdk_init__(const struct smap *ovs_other_config)
  int result;
  bool auto_determine = true;
  int err = 0;
>>>
>>> Nit: err can be removed.
>>
>> It's used 2 times in this function.
>> We could only avoid initializing it here.
> 
> Yes, but we do not care about err itself, we are just checking
> function return values.

I see. It seems for me more like a separate change. This could be
done, for example, along with improving stat/pdump error messages.

> Anyway, this is really minor.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovs-discuss] ovn-controller is taking 100% CPU all the time in one deployment

2019-09-02 Thread Daniel Alvarez Sanchez
Hi Han,

On Fri, Aug 30, 2019 at 10:37 PM Han Zhou  wrote:
>
> On Fri, Aug 30, 2019 at 1:25 PM Numan Siddique  wrote:
> >
> > Hi Han,
> >
> > I am thinking of this approach to solve this problem. I still need to
> test it.
> > If you have any comments or concerns do let me know.
> >
> >
> > **
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 9a282..a83b56362 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -6552,6 +6552,41 @@ build_lrouter_flows(struct hmap *datapaths, struct
> hmap *ports,
> >
> >  }
> >
> > +/* Handle GARP reply packets received on a distributed router
> gateway
> > + * port. GARP reply broadcast packets could be sent by external
> > + * switches. We don't want them to be handled by all the
> > + * ovn-controllers if they receive it. So add a priority-92 flow
> to
> > + * apply the put_arp action on a redirect chassis and drop it on
> > + * other chassis.
> > + * Note that we are already adding a priority-90 logical flow in
> the
> > + * table S_ROUTER_IN_IP_INPUT to apply the put_arp action if
> > + * arp.op == 2.
> > + * */
> > +if (op->od->l3dgw_port && op == op->od->l3dgw_port
> > +&& op->od->l3redirect_port) {
> > +for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
> > +ds_clear(&match);
> > +ds_put_format(&match,
> > +  "inport == %s && is_chassis_resident(%s)
> && "
> > +  "eth.bcast && arp.op == 2 && arp.spa ==
> %s/%u",
> > +  op->json_key,
> op->od->l3redirect_port->json_key,
> > +  op->lrp_networks.ipv4_addrs[i].network_s,
> > +  op->lrp_networks.ipv4_addrs[i].plen);
> > +ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 92,
> > +  ds_cstr(&match),
> > +  "put_arp(inport, arp.spa, arp.sha);");
> > +ds_clear(&match);
> > +ds_put_format(&match,
> > +  "inport == %s && !is_chassis_resident(%s)
> && "
> > +  "eth.bcast && arp.op == 2 && arp.spa ==
> %s/%u",
> > +  op->json_key,
> op->od->l3redirect_port->json_key,
> > +  op->lrp_networks.ipv4_addrs[i].network_s,
> > +  op->lrp_networks.ipv4_addrs[i].plen);
> > +ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_INPUT, 92,
> > +  ds_cstr(&match), "drop;");
> > +}
> > +}
> > +
> >  /* A set to hold all load-balancer vips that need ARP responses.
> */
> >  struct sset all_ips = SSET_INITIALIZER(&all_ips);
> >  int addr_family;
> > *
> >
> > If a physical switch sends GARP request packets we have existing logical
> flows
> > which handle them only on the gateway chassis.
> >
> > But if the physical switch sends GARP reply packets, then these packets
> > are handled by ovn-controllers where bridge mappings are configured.
> > I think its good enough if the gateway chassis handles these packet.
> >
> > In the deployment where we are seeing this issue, the physical switch
> sends GARP reply
> > packets.
> >
> > Thanks
> > Numan
> >
> >
> Hi Numan,
>
> I think both GARP request and reply should be handled on all chassises. It
> should work not only for physical switch, but also for virtual workloads.
> At least our current use cases relies on that.

I believe that Numan's patch will not change the behavior for virtual
(OVN) workloads, does it?

Although I'm in favor of this patch, I still think that it's not
enough for non-Incremental Processing versions of OVS because even
we're going to release pressure on the compute nodes, still on loaded
systems, the gateway nodes are going to be hogging the CPU. Plus, I
think there's value even from a security standpoint in having it on
stable branches as it looks like a simple attack vector.

>
> Thanks,
> Han
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 3/3] dpdk: Use ovs-numa provided functions to manage thread affinity.

2019-09-02 Thread David Marchand
On Mon, Sep 2, 2019 at 3:15 PM Ilya Maximets  wrote:
>
> On 02.09.2019 16:09, David Marchand wrote:
> > On Mon, Sep 2, 2019 at 1:27 PM Ilya Maximets  wrote:
> >>
> >> This allows to decrease code duplication and avoid using Linux-specific
> >> functions (this might be useful in the future if we'll try to allow
> >> running OvS+DPDK on FreeBSD).
> >>
> >> Signed-off-by: Ilya Maximets 
> >> Acked-by: William Tu 
> >> ---
> >>  lib/dpdk.c | 27 ---
> >>  1 file changed, 12 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/lib/dpdk.c b/lib/dpdk.c
> >> index fc58de55a..6f297d918 100644
> >> --- a/lib/dpdk.c
> >> +++ b/lib/dpdk.c
> >> @@ -275,7 +275,7 @@ dpdk_init__(const struct smap *ovs_other_config)
> >>  int result;
> >>  bool auto_determine = true;
> >>  int err = 0;
> >
> > Nit: err can be removed.
>
> It's used 2 times in this function.
> We could only avoid initializing it here.

Yes, but we do not care about err itself, we are just checking
function return values.
Anyway, this is really minor.

-- 
David Marchand
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 3/3] dpdk: Use ovs-numa provided functions to manage thread affinity.

2019-09-02 Thread Ilya Maximets
On 02.09.2019 16:09, David Marchand wrote:
> On Mon, Sep 2, 2019 at 1:27 PM Ilya Maximets  wrote:
>>
>> This allows to decrease code duplication and avoid using Linux-specific
>> functions (this might be useful in the future if we'll try to allow
>> running OvS+DPDK on FreeBSD).
>>
>> Signed-off-by: Ilya Maximets 
>> Acked-by: William Tu 
>> ---
>>  lib/dpdk.c | 27 ---
>>  1 file changed, 12 insertions(+), 15 deletions(-)
>>
>> diff --git a/lib/dpdk.c b/lib/dpdk.c
>> index fc58de55a..6f297d918 100644
>> --- a/lib/dpdk.c
>> +++ b/lib/dpdk.c
>> @@ -275,7 +275,7 @@ dpdk_init__(const struct smap *ovs_other_config)
>>  int result;
>>  bool auto_determine = true;
>>  int err = 0;
> 
> Nit: err can be removed.

It's used 2 times in this function.
We could only avoid initializing it here.

> 
> 
>> -cpu_set_t cpuset;
>> +struct ovs_numa_dump *affinity = NULL;
>>  struct svec args = SVEC_EMPTY_INITIALIZER;
>>
>>  log_stream = fopencookie(NULL, "w+", dpdk_log_func);
>> @@ -357,22 +357,22 @@ dpdk_init__(const struct smap *ovs_other_config)
>>   * lcore for the DPDK Master.
>>   */
>>  if (auto_determine) {
>> +const struct ovs_numa_info_core *core;
>>  int cpu = 0;
>>
>>  /* Get the main thread affinity */
>> -CPU_ZERO(&cpuset);
>> -err = pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t),
>> - &cpuset);
>> -if (!err) {
>> -for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
>> -if (CPU_ISSET(cpu, &cpuset)) {
>> -break;
>> +affinity = ovs_numa_thread_getaffinity_dump();
>> +if (affinity) {
>> +cpu = INT_MAX;
>> +FOR_EACH_CORE_ON_DUMP (core, affinity) {
>> +if (cpu > core->core_id) {
>> +cpu = core->core_id;
>>  }
>>  }
>>  } else {
>>  /* User did not set dpdk-lcore-mask and unable to get current
>>   * thread affintity - default to core #0 */
>> -VLOG_ERR("Thread getaffinity error %d. Using core #0", err);
>> +VLOG_ERR("Thread getaffinity failed. Using core #0");
>>  }
>>  svec_add(&args, "-l");
>>  svec_add_nocopy(&args, xasprintf("%d", cpu));
>> @@ -403,12 +403,9 @@ dpdk_init__(const struct smap *ovs_other_config)
>>  svec_destroy(&args);
>>
>>  /* Set the main thread affinity back to pre rte_eal_init() value */
>> -if (auto_determine && !err) {
>> -err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t),
>> - &cpuset);
>> -if (err) {
>> -VLOG_ERR("Thread setaffinity error %d", err);
>> -}
>> +if (affinity) {
>> +ovs_numa_thread_setaffinity_dump(affinity);
>> +ovs_numa_dump_destroy(affinity);
>>  }
>>
>>  if (result < 0) {
>> --
>> 2.17.1
>>
> 
> Reviewed-by: David Marchand 
> 
> 
> --
> David Marchand
> 
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [ovs-discuss] ovn-controller is taking 100% CPU all the time in one deployment

2019-09-02 Thread Daniel Alvarez Sanchez
On Fri, Aug 30, 2019 at 8:18 PM Han Zhou  wrote:
>
>
>
> On Fri, Aug 30, 2019 at 6:46 AM Mark Michelson  wrote:
> >
> > On 8/30/19 5:39 AM, Daniel Alvarez Sanchez wrote:
> > > On Thu, Aug 29, 2019 at 10:01 PM Mark Michelson  
> > > wrote:
> > >>
> > >> On 8/29/19 2:39 PM, Numan Siddique wrote:
> > >>> Hello Everyone,
> > >>>
> > >>> In one of the OVN deployments, we are seeing 100% CPU usage by
> > >>> ovn-controllers all the time.
> > >>>
> > >>> After investigations we found the below
> > >>>
> > >>>- ovn-controller is taking more than 20 seconds to complete full loop
> > >>> (mainly in lflow_run() function)
> > >>>
> > >>>- The physical switch is sending GARPs periodically every 10 seconds.
> > >>>
> > >>>- There is ovn-bridge-mappings configured and these GARP packets
> > >>> reaches br-int via the patch port.
> > >>>
> > >>>- We have a flow in router pipeline which applies the action - 
> > >>> put_arp
> > >>> if it is arp packet.
> > >>>
> > >>>- ovn-controller pinctrl thread receives these garps, stores the
> > >>> learnt mac-ips in the 'put_mac_bindings' hmap and notifies the
> > >>> ovn-controller main thread by incrementing the seq no.
> > >>>
> > >>>- In the ovn-controller main thread, after lflow_run() finishes,
> > >>> pinctrl_wait() is called. This function calls - poll_immediate_wake() as
> > >>> 'put_mac_bindings' hmap is not empty.
> > >>>
> > >>> - This causes the ovn-controller poll_block() to not sleep at all and
> > >>> this repeats all the time resulting in 100% cpu usage.
> > >>>
> > >>> The deployment has OVS/OVN 2.9.  We have back ported the pinctrl_thread
> > >>> patch.
> > >>>
> > >>> Some time back I had reported an issue about lflow_run() taking lot of
> > >>> time - 
> > >>> https://mail.openvswitch.org/pipermail/ovs-dev/2019-July/360414.html
> > >>>
> > >>> I think we need to improve the logical processing sooner or later.
> > >>
> > >> I agree that this is very important. I know that logical flow processing
> > >> is the biggest bottleneck for ovn-controller, but 20 seconds is just
> > >> ridiculous. In your scale testing, you found that lflow_run() was taking
> > >> 10 seconds to complete.
> > > I support this statement 100% (20 seconds is just ridiculous). To be
> > > precise, in this deployment we see over 23 seconds for the main loop
> > > to process and I've seen even 30 seconds some times. I've been talking
> > > to Numan these days about this issue and I support profiling this
> > > actual deployment so that we can figure out how incremental processing
> > > would help.
> > >
> > >>
> > >> I'm curious if there are any factors in this particular deployment's
> > >> configuration that might contribute to this. For instance, does this
> > >> deployment have a glut of ACLs? Are they not using port groups?
> > > They're not using port groups because it's 2.9 and it is not there.
> > > However, I don't think port groups would make a big difference in
> > > terms of ovn-controller computation. I might be wrong but Port Groups
> > > help reduce the number of ACLs in the NB database while the # of
> > > Logical Flows would still remain the same. We'll try to get the
> > > contents of the NB database and figure out what's killing it.
> > >
> >
> > You're right that port groups won't reduce the number of logical flows.
>
> I think port-group reduces number of logical flows significantly, and also 
> reduces OVS flows when conjunctive matches are effective.

Right, definitely the number of lflows will be much lower. My bad as I
was directly involved in this! :) I was just thinking that the number
of OVS flows will remain the same so the computation for
ovn-controller would be similar but I missed the conjunctive matches
part in my statement.


> Please see my calculation here: 
> https://www.slideshare.net/hanzhou1978/large-scale-overlay-networks-with-ovn-problems-and-solutions/30
>
> > However, it can reduce the computation in ovn-controller. The reason is
> > that the logical flows generated by ACLs that use port groups may result
> > in conjunctive matches being used. If you want a bit more information,
> > see the "Port groups" section of this blog post I wrote:
> >
> > https://developers.redhat.com/blog/2019/01/02/performance-improvements-in-ovn-past-and-future/
> >
> > The TL;DR is that with port groups, I saw the number of OpenFlow flows
> > generated by ovn-controller drop by 3 orders of magnitude. And that
> > meant that flow processing was 99% faster for large networks.
> >
> > You may not see the same sort of improvement for this deployment, mainly
> > because my test case was tailored to illustrate how port groups help.
> > There may be other factors in this deployment that complicate flow
> > processing.
> >
> > >>
> > >> This particular deployment's configuration may give us a good scenario
> > >> for our testing to improve lflow processing time.
> > > Absolutely!
> > >>
> > >>>
> > >>> But to fix this issue urgently, we are thinking of the b

Re: [ovs-dev] [PATCH v2 3/3] dpdk: Use ovs-numa provided functions to manage thread affinity.

2019-09-02 Thread David Marchand
On Mon, Sep 2, 2019 at 1:27 PM Ilya Maximets  wrote:
>
> This allows to decrease code duplication and avoid using Linux-specific
> functions (this might be useful in the future if we'll try to allow
> running OvS+DPDK on FreeBSD).
>
> Signed-off-by: Ilya Maximets 
> Acked-by: William Tu 
> ---
>  lib/dpdk.c | 27 ---
>  1 file changed, 12 insertions(+), 15 deletions(-)
>
> diff --git a/lib/dpdk.c b/lib/dpdk.c
> index fc58de55a..6f297d918 100644
> --- a/lib/dpdk.c
> +++ b/lib/dpdk.c
> @@ -275,7 +275,7 @@ dpdk_init__(const struct smap *ovs_other_config)
>  int result;
>  bool auto_determine = true;
>  int err = 0;

Nit: err can be removed.


> -cpu_set_t cpuset;
> +struct ovs_numa_dump *affinity = NULL;
>  struct svec args = SVEC_EMPTY_INITIALIZER;
>
>  log_stream = fopencookie(NULL, "w+", dpdk_log_func);
> @@ -357,22 +357,22 @@ dpdk_init__(const struct smap *ovs_other_config)
>   * lcore for the DPDK Master.
>   */
>  if (auto_determine) {
> +const struct ovs_numa_info_core *core;
>  int cpu = 0;
>
>  /* Get the main thread affinity */
> -CPU_ZERO(&cpuset);
> -err = pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t),
> - &cpuset);
> -if (!err) {
> -for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
> -if (CPU_ISSET(cpu, &cpuset)) {
> -break;
> +affinity = ovs_numa_thread_getaffinity_dump();
> +if (affinity) {
> +cpu = INT_MAX;
> +FOR_EACH_CORE_ON_DUMP (core, affinity) {
> +if (cpu > core->core_id) {
> +cpu = core->core_id;
>  }
>  }
>  } else {
>  /* User did not set dpdk-lcore-mask and unable to get current
>   * thread affintity - default to core #0 */
> -VLOG_ERR("Thread getaffinity error %d. Using core #0", err);
> +VLOG_ERR("Thread getaffinity failed. Using core #0");
>  }
>  svec_add(&args, "-l");
>  svec_add_nocopy(&args, xasprintf("%d", cpu));
> @@ -403,12 +403,9 @@ dpdk_init__(const struct smap *ovs_other_config)
>  svec_destroy(&args);
>
>  /* Set the main thread affinity back to pre rte_eal_init() value */
> -if (auto_determine && !err) {
> -err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t),
> - &cpuset);
> -if (err) {
> -VLOG_ERR("Thread setaffinity error %d", err);
> -}
> +if (affinity) {
> +ovs_numa_thread_setaffinity_dump(affinity);
> +ovs_numa_dump_destroy(affinity);
>  }
>
>  if (result < 0) {
> --
> 2.17.1
>

Reviewed-by: David Marchand 


--
David Marchand
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 2/3] dpif-netdev-perf: Fix TSC frequency for non-DPDK case.

2019-09-02 Thread David Marchand
On Mon, Sep 2, 2019 at 1:27 PM Ilya Maximets  wrote:
>
> Unlike 'rte_get_tsc_cycles()' which doesn't need any specific
> initialization, 'rte_get_tsc_hz()' could be used only after successfull
> call to 'rte_eal_init()'. 'rte_eal_init()' estimates the TSC frequency
> for later use by 'rte_get_tsc_hz()'.  Fairly said, we're not allowed
> to use 'rte_get_tsc_cycles()' before initializing DPDK too, but it
> works this way for now and provides correct results.
>
> This patch provides TSC frequency estimation code that will be used
> in two cases:
>   * DPDK is not compiled in, i.e. DPDK_NETDEV not defined.
>   * DPDK compiled in but not initialized,
> i.e. other_config:dpdk-init=false
>
> This change is mostly useful for AF_XDP netdev support, i.e. allows
> to use dpif-netdev/pmd-perf-show command and various PMD perf metrics.
>
> Signed-off-by: Ilya Maximets 
> Acked-by: William Tu 

Reviewed-by: David Marchand 

--
David Marchand
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/3] ovs-numa: Add dump based thread affinity functions.

2019-09-02 Thread David Marchand
On Mon, Sep 2, 2019 at 1:27 PM Ilya Maximets  wrote:
>
> New functions to get and set CPU affinity using CPU dumps.
> This will abstract OS specific implementation details from the
> cross-platform code.
>
> Signed-off-by: Ilya Maximets 
> ---
>  lib/ovs-numa.c | 72 ++
>  lib/ovs-numa.h |  2 ++
>  2 files changed, 69 insertions(+), 5 deletions(-)
>
> diff --git a/lib/ovs-numa.c b/lib/ovs-numa.c
> index 24edeab2a..eb608edc4 100644
> --- a/lib/ovs-numa.c
> +++ b/lib/ovs-numa.c
> @@ -532,22 +532,69 @@ ovs_numa_dump_destroy(struct ovs_numa_dump *dump)
>  free(dump);
>  }
>
> -int ovs_numa_thread_setaffinity_core(unsigned core_id OVS_UNUSED)
> +struct ovs_numa_dump *
> +ovs_numa_thread_getaffinity_dump(void)
>  {
>  if (dummy_numa) {
> -/* Nothing to do */
> +/* Nothing to do. */
> +return NULL;
> +}
> +
> +#ifndef __linux__
> +return NULL;
> +#else
> +struct ovs_numa_dump *dump;
> +const struct numa_node *n;
> +cpu_set_t cpuset;
> +int err;
> +
> +CPU_ZERO(&cpuset);
> +err = pthread_getaffinity_np(pthread_self(), sizeof cpuset, &cpuset);
> +if (err) {
> +VLOG_ERR("Thread getaffinity error: %s", ovs_strerror(err));
> +return NULL;
> +}
> +
> +dump = ovs_numa_dump_create();
> +
> +HMAP_FOR_EACH (n, hmap_node, &all_numa_nodes) {
> +const struct cpu_core *core;
> +
> +LIST_FOR_EACH (core, list_node, &n->cores) {
> +if (CPU_ISSET(core->core_id, &cpuset)) {
> +ovs_numa_dump_add(dump, core->numa->numa_id, core->core_id);
> +}
> +}
> +}
> +
> +if (!ovs_numa_dump_count(dump)) {
> +ovs_numa_dump_destroy(dump);
> +return NULL;
> +}
> +return dump;
> +#endif /* __linux__ */
> +}
> +
> +int
> +ovs_numa_thread_setaffinity_dump(struct ovs_numa_dump *dump)

const?


> +{
> +if (!dump || dummy_numa) {
> +/* Nothing to do. */
>  return 0;
>  }
>
>  #ifdef __linux__
> +struct ovs_numa_info_core *core;

const?

>  cpu_set_t cpuset;
>  int err;
>
>  CPU_ZERO(&cpuset);
> -CPU_SET(core_id, &cpuset);
> -err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
> +FOR_EACH_CORE_ON_DUMP (core, dump) {
> +CPU_SET(core->core_id, &cpuset);
> +}
> +err = pthread_setaffinity_np(pthread_self(), sizeof cpuset, &cpuset);
>  if (err) {
> -VLOG_ERR("Thread affinity error %d",err);
> +VLOG_ERR("Thread setaffinity error: %s", ovs_strerror(err));
>  return err;
>  }
>
> @@ -556,3 +603,18 @@ int ovs_numa_thread_setaffinity_core(unsigned core_id 
> OVS_UNUSED)
>  return EOPNOTSUPP;
>  #endif /* __linux__ */
>  }
> +
> +int ovs_numa_thread_setaffinity_core(unsigned core_id)
> +{
> +struct ovs_numa_dump *affinity = ovs_numa_dump_create();
> +struct cpu_core *core = get_core_by_core_id(core_id);

const?

> +int ret = EINVAL;
> +
> +if (core) {
> +ovs_numa_dump_add(affinity, core->numa->numa_id, core->core_id);
> +ret = ovs_numa_thread_setaffinity_dump(affinity);
> +}
> +
> +ovs_numa_dump_destroy(affinity);
> +return ret;
> +}
> diff --git a/lib/ovs-numa.h b/lib/ovs-numa.h
> index 088fcb8c3..88352a93e 100644
> --- a/lib/ovs-numa.h
> +++ b/lib/ovs-numa.h
> @@ -60,6 +60,8 @@ struct ovs_numa_dump *ovs_numa_dump_n_cores_per_numa(int n);
>  bool ovs_numa_dump_contains_core(const struct ovs_numa_dump *,
>   int numa_id, unsigned core_id);
>  size_t ovs_numa_dump_count(const struct ovs_numa_dump *);
> +struct ovs_numa_dump * ovs_numa_thread_getaffinity_dump(void);
> +int ovs_numa_thread_setaffinity_dump(struct ovs_numa_dump *);
>  void ovs_numa_dump_destroy(struct ovs_numa_dump *);
>  int ovs_numa_thread_setaffinity_core(unsigned core_id);
>
> --
> 2.17.1
>

Reviewed-by: David Marchand 

--
David Marchand
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 3/3] dpdk: Use ovs-numa provided functions to manage thread affinity.

2019-09-02 Thread Ilya Maximets
This allows to decrease code duplication and avoid using Linux-specific
functions (this might be useful in the future if we'll try to allow
running OvS+DPDK on FreeBSD).

Signed-off-by: Ilya Maximets 
Acked-by: William Tu 
---
 lib/dpdk.c | 27 ---
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/lib/dpdk.c b/lib/dpdk.c
index fc58de55a..6f297d918 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -275,7 +275,7 @@ dpdk_init__(const struct smap *ovs_other_config)
 int result;
 bool auto_determine = true;
 int err = 0;
-cpu_set_t cpuset;
+struct ovs_numa_dump *affinity = NULL;
 struct svec args = SVEC_EMPTY_INITIALIZER;
 
 log_stream = fopencookie(NULL, "w+", dpdk_log_func);
@@ -357,22 +357,22 @@ dpdk_init__(const struct smap *ovs_other_config)
  * lcore for the DPDK Master.
  */
 if (auto_determine) {
+const struct ovs_numa_info_core *core;
 int cpu = 0;
 
 /* Get the main thread affinity */
-CPU_ZERO(&cpuset);
-err = pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t),
- &cpuset);
-if (!err) {
-for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
-if (CPU_ISSET(cpu, &cpuset)) {
-break;
+affinity = ovs_numa_thread_getaffinity_dump();
+if (affinity) {
+cpu = INT_MAX;
+FOR_EACH_CORE_ON_DUMP (core, affinity) {
+if (cpu > core->core_id) {
+cpu = core->core_id;
 }
 }
 } else {
 /* User did not set dpdk-lcore-mask and unable to get current
  * thread affintity - default to core #0 */
-VLOG_ERR("Thread getaffinity error %d. Using core #0", err);
+VLOG_ERR("Thread getaffinity failed. Using core #0");
 }
 svec_add(&args, "-l");
 svec_add_nocopy(&args, xasprintf("%d", cpu));
@@ -403,12 +403,9 @@ dpdk_init__(const struct smap *ovs_other_config)
 svec_destroy(&args);
 
 /* Set the main thread affinity back to pre rte_eal_init() value */
-if (auto_determine && !err) {
-err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t),
- &cpuset);
-if (err) {
-VLOG_ERR("Thread setaffinity error %d", err);
-}
+if (affinity) {
+ovs_numa_thread_setaffinity_dump(affinity);
+ovs_numa_dump_destroy(affinity);
 }
 
 if (result < 0) {
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 2/3] dpif-netdev-perf: Fix TSC frequency for non-DPDK case.

2019-09-02 Thread Ilya Maximets
Unlike 'rte_get_tsc_cycles()' which doesn't need any specific
initialization, 'rte_get_tsc_hz()' could be used only after successfull
call to 'rte_eal_init()'. 'rte_eal_init()' estimates the TSC frequency
for later use by 'rte_get_tsc_hz()'.  Fairly said, we're not allowed
to use 'rte_get_tsc_cycles()' before initializing DPDK too, but it
works this way for now and provides correct results.

This patch provides TSC frequency estimation code that will be used
in two cases:
  * DPDK is not compiled in, i.e. DPDK_NETDEV not defined.
  * DPDK compiled in but not initialized,
i.e. other_config:dpdk-init=false

This change is mostly useful for AF_XDP netdev support, i.e. allows
to use dpif-netdev/pmd-perf-show command and various PMD perf metrics.

Signed-off-by: Ilya Maximets 
Acked-by: William Tu 
---
 lib/dpdk-stub.c|  6 
 lib/dpdk.c |  6 
 lib/dpdk.h |  1 +
 lib/dpif-netdev-perf.c | 75 --
 lib/dpif-netdev-perf.h |  2 ++
 lib/dpif-netdev.c  |  9 +
 6 files changed, 81 insertions(+), 18 deletions(-)

diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c
index e55be5750..c332c217c 100644
--- a/lib/dpdk-stub.c
+++ b/lib/dpdk-stub.c
@@ -68,6 +68,12 @@ dpdk_per_port_memory(void)
 return false;
 }
 
+bool
+dpdk_available(void)
+{
+return false;
+}
+
 void
 print_dpdk_version(void)
 {
diff --git a/lib/dpdk.c b/lib/dpdk.c
index f31e1580c..fc58de55a 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -518,6 +518,12 @@ dpdk_per_port_memory(void)
 return per_port_memory;
 }
 
+bool
+dpdk_available(void)
+{
+return dpdk_initialized;
+}
+
 void
 dpdk_set_lcore_id(unsigned cpu)
 {
diff --git a/lib/dpdk.h b/lib/dpdk.h
index 7dab83775..736a64279 100644
--- a/lib/dpdk.h
+++ b/lib/dpdk.h
@@ -41,6 +41,7 @@ const char *dpdk_get_vhost_sock_dir(void);
 bool dpdk_vhost_iommu_enabled(void);
 bool dpdk_vhost_postcopy_enabled(void);
 bool dpdk_per_port_memory(void);
+bool dpdk_available(void);
 void print_dpdk_version(void);
 void dpdk_status(const struct ovsrec_open_vswitch *);
 #endif /* dpdk.h */
diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c
index e7ed49e7e..baf90b0f4 100644
--- a/lib/dpif-netdev-perf.c
+++ b/lib/dpif-netdev-perf.c
@@ -17,9 +17,11 @@
 #include 
 #include 
 
+#include "dpdk.h"
 #include "dpif-netdev-perf.h"
 #include "openvswitch/dynamic-string.h"
 #include "openvswitch/vlog.h"
+#include "ovs-numa.h"
 #include "ovs-thread.h"
 #include "timeval.h"
 
@@ -43,21 +45,59 @@ uint64_t iter_cycle_threshold;
 
 static struct vlog_rate_limit latency_rl = VLOG_RATE_LIMIT_INIT(600, 600);
 
-#ifdef DPDK_NETDEV
-static uint64_t
-get_tsc_hz(void)
-{
-return rte_get_tsc_hz();
-}
-#else
-/* This function is only invoked from PMD threads which depend on DPDK.
- * A dummy function is sufficient when building without DPDK_NETDEV. */
-static uint64_t
-get_tsc_hz(void)
+static uint64_t tsc_hz = 1;
+
+void
+pmd_perf_estimate_tsc_frequency(void)
 {
-return 1;
-}
+#ifdef DPDK_NETDEV
+if (dpdk_available()) {
+tsc_hz = rte_get_tsc_hz();
+}
+if (tsc_hz > 1) {
+VLOG_INFO("DPDK provided TSC frequency: %"PRIu64" KHz", tsc_hz / 1000);
+return;
+}
 #endif
+struct ovs_numa_dump *affinity;
+struct pmd_perf_stats s;
+uint64_t start, stop;
+
+/* DPDK is not available or returned unreliable value.
+ * Trying to estimate. */
+affinity = ovs_numa_thread_getaffinity_dump();
+if (affinity) {
+const struct ovs_numa_info_core *core;
+
+FOR_EACH_CORE_ON_DUMP (core, affinity) {
+/* Setting affinity to a single core from the affinity mask to
+ * avoid re-scheduling to another core while sleeping. */
+ovs_numa_thread_setaffinity_core(core->core_id);
+break;
+}
+}
+
+start = cycles_counter_update(&s);
+/* Using xnanosleep as it's interrupt resistant.
+ * Sleeping only 100 ms to avoid holding the main thread for too long. */
+xnanosleep(1E8);
+stop = cycles_counter_update(&s);
+
+if (affinity) {
+/* Restoring previous affinity. */
+ovs_numa_thread_setaffinity_dump(affinity);
+ovs_numa_dump_destroy(affinity);
+}
+
+if (stop <= start) {
+VLOG_WARN("TSC source is unreliable.");
+tsc_hz = 1;
+} else {
+tsc_hz = (stop - start) * 10;
+}
+
+VLOG_INFO("Estimated TSC frequency: %"PRIu64" KHz", tsc_hz / 1000);
+}
 
 /* Histogram functions. */
 
@@ -170,7 +210,6 @@ pmd_perf_format_overall_stats(struct ds *str, struct 
pmd_perf_stats *s,
   double duration)
 {
 uint64_t stats[PMD_N_STATS];
-uint64_t tsc_hz = get_tsc_hz();
 double us_per_cycle = 100.0 / tsc_hz;
 
 if (duration == 0) {
@@ -555,7 +594,7 @@ pmd_perf_end_iteration(struct pmd_perf_stats *s, int 
rx_packets,
 cum_ms->timestamp = now;
 }
 /* Do the next check after 4 us (10K cycles at 2.5 GHz TSC cl

[ovs-dev] [PATCH v2 1/3] ovs-numa: Add dump based thread affinity functions.

2019-09-02 Thread Ilya Maximets
New functions to get and set CPU affinity using CPU dumps.
This will abstract OS specific implementation details from the
cross-platform code.

Signed-off-by: Ilya Maximets 
---
 lib/ovs-numa.c | 72 ++
 lib/ovs-numa.h |  2 ++
 2 files changed, 69 insertions(+), 5 deletions(-)

diff --git a/lib/ovs-numa.c b/lib/ovs-numa.c
index 24edeab2a..eb608edc4 100644
--- a/lib/ovs-numa.c
+++ b/lib/ovs-numa.c
@@ -532,22 +532,69 @@ ovs_numa_dump_destroy(struct ovs_numa_dump *dump)
 free(dump);
 }
 
-int ovs_numa_thread_setaffinity_core(unsigned core_id OVS_UNUSED)
+struct ovs_numa_dump *
+ovs_numa_thread_getaffinity_dump(void)
 {
 if (dummy_numa) {
-/* Nothing to do */
+/* Nothing to do. */
+return NULL;
+}
+
+#ifndef __linux__
+return NULL;
+#else
+struct ovs_numa_dump *dump;
+const struct numa_node *n;
+cpu_set_t cpuset;
+int err;
+
+CPU_ZERO(&cpuset);
+err = pthread_getaffinity_np(pthread_self(), sizeof cpuset, &cpuset);
+if (err) {
+VLOG_ERR("Thread getaffinity error: %s", ovs_strerror(err));
+return NULL;
+}
+
+dump = ovs_numa_dump_create();
+
+HMAP_FOR_EACH (n, hmap_node, &all_numa_nodes) {
+const struct cpu_core *core;
+
+LIST_FOR_EACH (core, list_node, &n->cores) {
+if (CPU_ISSET(core->core_id, &cpuset)) {
+ovs_numa_dump_add(dump, core->numa->numa_id, core->core_id);
+}
+}
+}
+
+if (!ovs_numa_dump_count(dump)) {
+ovs_numa_dump_destroy(dump);
+return NULL;
+}
+return dump;
+#endif /* __linux__ */
+}
+
+int
+ovs_numa_thread_setaffinity_dump(struct ovs_numa_dump *dump)
+{
+if (!dump || dummy_numa) {
+/* Nothing to do. */
 return 0;
 }
 
 #ifdef __linux__
+struct ovs_numa_info_core *core;
 cpu_set_t cpuset;
 int err;
 
 CPU_ZERO(&cpuset);
-CPU_SET(core_id, &cpuset);
-err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+FOR_EACH_CORE_ON_DUMP (core, dump) {
+CPU_SET(core->core_id, &cpuset);
+}
+err = pthread_setaffinity_np(pthread_self(), sizeof cpuset, &cpuset);
 if (err) {
-VLOG_ERR("Thread affinity error %d",err);
+VLOG_ERR("Thread setaffinity error: %s", ovs_strerror(err));
 return err;
 }
 
@@ -556,3 +603,18 @@ int ovs_numa_thread_setaffinity_core(unsigned core_id 
OVS_UNUSED)
 return EOPNOTSUPP;
 #endif /* __linux__ */
 }
+
+int ovs_numa_thread_setaffinity_core(unsigned core_id)
+{
+struct ovs_numa_dump *affinity = ovs_numa_dump_create();
+struct cpu_core *core = get_core_by_core_id(core_id);
+int ret = EINVAL;
+
+if (core) {
+ovs_numa_dump_add(affinity, core->numa->numa_id, core->core_id);
+ret = ovs_numa_thread_setaffinity_dump(affinity);
+}
+
+ovs_numa_dump_destroy(affinity);
+return ret;
+}
diff --git a/lib/ovs-numa.h b/lib/ovs-numa.h
index 088fcb8c3..88352a93e 100644
--- a/lib/ovs-numa.h
+++ b/lib/ovs-numa.h
@@ -60,6 +60,8 @@ struct ovs_numa_dump *ovs_numa_dump_n_cores_per_numa(int n);
 bool ovs_numa_dump_contains_core(const struct ovs_numa_dump *,
  int numa_id, unsigned core_id);
 size_t ovs_numa_dump_count(const struct ovs_numa_dump *);
+struct ovs_numa_dump * ovs_numa_thread_getaffinity_dump(void);
+int ovs_numa_thread_setaffinity_dump(struct ovs_numa_dump *);
 void ovs_numa_dump_destroy(struct ovs_numa_dump *);
 int ovs_numa_thread_setaffinity_core(unsigned core_id);
 
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 0/3] Fix TSC frequency if DPDK is not available.

2019-09-02 Thread Ilya Maximets


Version 2:
  * Removed redundant OVS_UNUSED.
  * 'ovs_numa_thread_setaffinity_core' reimplemented via
'ovs_numa_thread_setaffinity_dump' to reduce code duplication.
  * Added Ack from William to patches #2 and #3 as they wasn't changed.


Ilya Maximets (3):
  ovs-numa: Add dump based thread affinity functions.
  dpif-netdev-perf: Fix TSC frequency for non-DPDK case.
  dpdk: Use ovs-numa provided functions to manage thread affinity.

 lib/dpdk-stub.c|  6 
 lib/dpdk.c | 33 ++-
 lib/dpdk.h |  1 +
 lib/dpif-netdev-perf.c | 75 --
 lib/dpif-netdev-perf.h |  2 ++
 lib/dpif-netdev.c  |  9 +
 lib/ovs-numa.c | 72 +---
 lib/ovs-numa.h |  2 ++
 8 files changed, 162 insertions(+), 38 deletions(-)

-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] flow: fix incorrect padding length checking in ipv6_sanity_check

2019-09-02 Thread Yanqin Wei
The padding length is (packet size - ipv6 header length - ipv6 plen).  This
patch fixes incorrect padding size checking in ipv6_sanity_check.

Reviewed-by: Gavin Hu 
Signed-off-by: Yanqin Wei 
---
 lib/flow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/flow.c b/lib/flow.c
index ac6a4e1..0413c67 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -699,7 +699,7 @@ ipv6_sanity_check(const struct ovs_16aligned_ip6_hdr *nh, 
size_t size)
 return false;
 }
 /* Jumbo Payload option not supported yet. */
-if (OVS_UNLIKELY(size - plen > UINT8_MAX)) {
+if (OVS_UNLIKELY(size - (plen + IPV6_HEADER_LEN) > UINT8_MAX)) {
 return false;
 }
 
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.9] dpdk: Use DPDK 17.11.6 release.

2019-09-02 Thread Lukasz Pawlik
Modify travis linux build script to use the latest DPDK stable release
17.11.6. Update docs for latest DPDK stable releases.

Signed-off-by: Lukasz Pawlik 
---
 .travis/linux-build.sh   | 2 +-
 Documentation/faq/releases.rst   | 2 +-
 Documentation/intro/install/dpdk.rst | 8 
 Documentation/topics/dpdk/vhost-user.rst | 6 +++---
 NEWS | 2 ++
 5 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/.travis/linux-build.sh b/.travis/linux-build.sh
index 796c2f8..b13adb4 100755
--- a/.travis/linux-build.sh
+++ b/.travis/linux-build.sh
@@ -83,7 +83,7 @@ fi
 
 if [ "$DPDK" ]; then
 if [ -z "$DPDK_VER" ]; then
-DPDK_VER="17.11.4"
+DPDK_VER="17.11.6"
 fi
 install_dpdk $DPDK_VER
 if [ "$CC" = "clang" ]; then
diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index acf4c77..0ea4234 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -164,7 +164,7 @@ Q: What DPDK version does each Open vSwitch release work 
with?
 2.6.x16.07.2
 2.7.x16.11.8
 2.8.x17.05.2
-2.9.x17.11.4
+2.9.x17.11.6
  ===
 
 Q: Are all the DPDK releases that OVS versions work with maintained?
diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index 1dafaca..fcbe6b0 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -40,7 +40,7 @@ Build requirements
 In addition to the requirements described in :doc:`general`, building Open
 vSwitch with DPDK will require the following:
 
-- DPDK 17.11.4
+- DPDK 17.11.6
 
 - A `DPDK supported NIC`_
 
@@ -69,9 +69,9 @@ Install DPDK
 #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
 
$ cd /usr/src/
-   $ wget http://fast.dpdk.org/rel/dpdk-17.11.4.tar.xz
-   $ tar xf dpdk-17.11.4.tar.xz
-   $ export DPDK_DIR=/usr/src/dpdk-stable-17.11.4
+   $ wget http://fast.dpdk.org/rel/dpdk-17.11.6.tar.xz
+   $ tar xf dpdk-17.11.6.tar.xz
+   $ export DPDK_DIR=/usr/src/dpdk-stable-17.11.6
$ cd $DPDK_DIR
 
 #. (Optional) Configure DPDK as a shared library
diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
index a4a14eb..197bfac 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -316,9 +316,9 @@ To begin, instantiate a guest as described in 
:ref:`dpdk-vhost-user` or
 DPDK sources to VM and build DPDK::
 
 $ cd /root/dpdk/
-$ wget http://fast.dpdk.org/rel/dpdk-17.11.4.tar.xz
-$ tar xf dpdk-17.11.4.tar.xz
-$ export DPDK_DIR=/root/dpdk/dpdk-stable-17.11.4
+$ wget http://fast.dpdk.org/rel/dpdk-17.11.6.tar.xz
+$ tar xf dpdk-17.11.6.tar.xz
+$ export DPDK_DIR=/root/dpdk/dpdk-stable-17.11.6
 $ export DPDK_TARGET=x86_64-native-linuxapp-gcc
 $ export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
 $ cd $DPDK_DIR
diff --git a/NEWS b/NEWS
index c1bb90f..d8e423f 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,8 @@ v2.9.6 - xx xxx 
 v2.9.5 - 30 Mar 2019
 
- Bug fixes
+   - DPDK
+* OVS validated with DPDK 17.11.6 which is recommended to be used.
 
 v2.9.4 - 20 Feb 2019
 
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH branch-2.10] dpdk: Use DPDK 17.11.6 release.

2019-09-02 Thread Lukasz Pawlik
Modify travis linux build script to use the latest DPDK stable release
17.11.6. Update docs for latest DPDK stable releases.

Signed-off-by: Lukasz Pawlik 
---
 .travis/linux-build.sh   | 2 +-
 Documentation/faq/releases.rst   | 4 ++--
 Documentation/intro/install/dpdk.rst | 8 
 Documentation/topics/dpdk/vhost-user.rst | 6 +++---
 NEWS | 2 ++
 5 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/.travis/linux-build.sh b/.travis/linux-build.sh
index 1fe5bbf..ede62a5 100755
--- a/.travis/linux-build.sh
+++ b/.travis/linux-build.sh
@@ -83,7 +83,7 @@ fi
 
 if [ "$DPDK" ]; then
 if [ -z "$DPDK_VER" ]; then
-DPDK_VER="17.11.4"
+DPDK_VER="17.11.6"
 fi
 install_dpdk $DPDK_VER
 if [ "$CC" = "clang" ]; then
diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 216eb30..8fe16d9 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -167,8 +167,8 @@ Q: What DPDK version does each Open vSwitch release work 
with?
 2.6.x16.07.2
 2.7.x16.11.8
 2.8.x17.05.2
-2.9.x17.11.4
-2.10.x   17.11.4
+2.9.x17.11.6
+2.10.x   17.11.6
  ===
 
 Q: Are all the DPDK releases that OVS versions work with maintained?
diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index 13546bb..5f39263 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -42,7 +42,7 @@ Build requirements
 In addition to the requirements described in :doc:`general`, building Open
 vSwitch with DPDK will require the following:
 
-- DPDK 17.11.4
+- DPDK 17.11.6
 
 - A `DPDK supported NIC`_
 
@@ -71,9 +71,9 @@ Install DPDK
 #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
 
$ cd /usr/src/
-   $ wget http://fast.dpdk.org/rel/dpdk-17.11.4.tar.xz
-   $ tar xf dpdk-17.11.4.tar.xz
-   $ export DPDK_DIR=/usr/src/dpdk-stable-17.11.4
+   $ wget http://fast.dpdk.org/rel/dpdk-17.11.6.tar.xz
+   $ tar xf dpdk-17.11.6.tar.xz
+   $ export DPDK_DIR=/usr/src/dpdk-stable-17.11.6
$ cd $DPDK_DIR
 
 #. (Optional) Configure DPDK as a shared library
diff --git a/Documentation/topics/dpdk/vhost-user.rst 
b/Documentation/topics/dpdk/vhost-user.rst
index 6334590..c5fe9ee 100644
--- a/Documentation/topics/dpdk/vhost-user.rst
+++ b/Documentation/topics/dpdk/vhost-user.rst
@@ -320,9 +320,9 @@ To begin, instantiate a guest as described in 
:ref:`dpdk-vhost-user` or
 DPDK sources to VM and build DPDK::
 
 $ cd /root/dpdk/
-$ wget http://fast.dpdk.org/rel/dpdk-17.11.4.tar.xz
-$ tar xf dpdk-17.11.4.tar.xz
-$ export DPDK_DIR=/root/dpdk/dpdk-stable-17.11.4
+$ wget http://fast.dpdk.org/rel/dpdk-17.11.6.tar.xz
+$ tar xf dpdk-17.11.6.tar.xz
+$ export DPDK_DIR=/root/dpdk/dpdk-stable-17.11.6
 $ export DPDK_TARGET=x86_64-native-linuxapp-gcc
 $ export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
 $ cd $DPDK_DIR
diff --git a/NEWS b/NEWS
index 064550b..92c3f81 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,8 @@ v2.10.3 - xx xxx 
 v2.10.2 - 30 Mar 2019
 
- Bug fixes
+   - DPDK
+* OVS validated with DPDK 17.11.6 which is recommended to be used.
 
 v2.10.1 - 19 Oct 2018
 
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev