[ovs-dev] [PATCH ovn] xml2nroff: Fix typo in generated nroff.

2021-03-24 Thread Ben Pfaff
This is just a typo introduced in a previous fix.  The typo shows up
at the top of generated manpages.

Signed-off-by: Ben Pfaff 
Fixes: 852316e8dc12 ("xml2nroff: Properly support .")
---
 build-aux/xml2nroff | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/build-aux/xml2nroff b/build-aux/xml2nroff
index fc6de0dcbf19..9e781a396dac 100755
--- a/build-aux/xml2nroff
+++ b/build-aux/xml2nroff
@@ -94,10 +94,10 @@ def manpage_to_nroff(xml_file, subst, include_path, 
version=None):
 .  PP
 .  I "\\$1"
 ..
-build/''' % (build.nroff.text_to_nroff(program),
- build.nroff.text_to_nroff(section),
- build.nroff.text_to_nroff(title),
- build.nroff.text_to_nroff(version))
+''' % (build.nroff.text_to_nroff(program),
+   build.nroff.text_to_nroff(section),
+   build.nroff.text_to_nroff(title),
+   build.nroff.text_to_nroff(version))
 
 s += build.nroff.block_xml_to_nroff(doc.childNodes) + "\n"
 
-- 
2.29.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] ovn-nbctl: Fix comment.

2021-03-24 Thread Ben Pfaff
I guess this was a cut-and-paste error from ovs-ofctl.

Signed-off-by: Ben Pfaff 
---
 utilities/ovn-nbctl.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c
index 2c77f4ba7645..18405835699d 100644
--- a/utilities/ovn-nbctl.c
+++ b/utilities/ovn-nbctl.c
@@ -91,8 +91,7 @@ static int leader_only = true;
  * are specified in the connetion method string. */
 static int shuffle_remotes = true;
 
-/* --unixctl-path: Path to use for unixctl server, for "monitor" and "snoop"
- commands. */
+/* --unixctl-path: Path to use for unixctl server socket, for daemon mode. */
 static char *unixctl_path;
 
 static unixctl_cb_func server_cmd_exit;
-- 
2.29.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 1/3] ovn-northd-ddlog: Fix memory leaks.

2021-03-24 Thread Ben Pfaff
On Wed, Mar 24, 2021 at 11:47:38AM +0530, Numan Siddique wrote:
> On Wed, Mar 10, 2021 at 5:55 AM Ben Pfaff  wrote:
> >
> > ddlog_delta_get_table() returns data that must be freed, but the code
> > in northd_update_probe_interval() did not do that.  This fixes it.
> >
> > In addition, the accumulated deltas weren't freed when the daemon exits.
> > This doesn't really matter but it's cleaner to do so, so this commit
> > also does that.
> >
> > Signed-off-by: Ben Pfaff 
> > Reported-by: Numan Siddique 
> 
> Thanks for these fixes and for the cleanup.
> 
> For all the 3 patches in the series.
> Acked-by: Numan Siddique 

Thanks, I applied them all to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd-ddlog: Fix error checking in ddlog_commit().

2021-03-24 Thread Ben Pfaff
On Wed, Mar 24, 2021 at 11:52:04AM +0530, Numan Siddique wrote:
> On Wed, Mar 10, 2021 at 6:15 AM Ben Pfaff  wrote:
> >
> > new_delta has the error status but this function was checking a
> > different variable instead.
> >
> > Found by inspection.;
> >
> > Signed-off-by: Ben Pfaff 
> 
> Acked-by: Numan Siddique 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd-ddlog: Fix minor memory leak.

2021-03-24 Thread Ben Pfaff
On Wed, Mar 24, 2021 at 11:52:47AM +0530, Numan Siddique wrote:
> On Wed, Mar 10, 2021 at 6:06 AM Ben Pfaff  wrote:
> >
> > This string is allocated in northd_ctx_create() but until now it was
> > not freed.
> >
> > Signed-off-by: Ben Pfaff 
> > Reported-by: Numan Siddique 
> 
> Acked-by: Numan Siddique 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] OVN: Multiple distributed gateway port support

2021-03-24 Thread Ben Pfaff
On Wed, Mar 24, 2021 at 08:27:07PM +, Dhathri Purohith wrote:
> Will work on the ddlog changes and update the patch.

I'm available to help with the ddlog changes, especially if you have
trouble figuring it out yourself or if you have any questions.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] binding: Fix the crashes seen when port binding type changes.

2021-03-24 Thread Mark Michelson

Hi Numan,

I have some minor findings below. They're the sort of thing that could 
be folded into the patch when you merge it.


I'm not giving an ack on this. The reason isn't because I don't approve, 
it's more that I'm not comfortable enough with this code to understand 
fully if I should be giving it an ack. The test cases help to make it 
seem good though.


I understand the general gist of this: Create a common local_binding 
type that then contains a list of binding_lports representing the parent 
and child ports. I just don't feel comfortable knowing that every 
possible case has been covered.


On 3/23/21 11:55 AM, num...@ovn.org wrote:

From: Numan Siddique 

When a port binding type changes from type 'A' to type 'B', then
there are many code paths in the existing binding.c which results
in crashes due to use-after-free or NULL references.

Below crashes are seen when a container lport is changed to a normal
lport and then deleted.

***
  (gdb) bt
0  in raise () from /lib64/libc.so.6
1  in abort () from /lib64/libc.so.6
2  ovs_abort_valist ("%s: assertion %s failed in %s()") at lib/util.c:419
3  vlog_abort_valist ("%s: assertion %s failed in %s()") at lib/vlog.c:1249
4  vlog_abort ("%s: assertion %s failed in %s()") at lib/vlog.c:1263
5  ovs_assert_failure (where="lib/ovsdb-idl.c:4653",
   function="ovsdb_idl_txn_write__",
   condition="row->new_datum != NULL") at lib/util.c:86
6  ovsdb_idl_txn_write__ () at lib/ovsdb-idl.c:4695
7  ovsdb_idl_txn_write_clone () at lib/ovsdb-idl.c:4757
8  sbrec_port_binding_set_chassis () at lib/ovn-sb-idl.c:25946
9  release_lport () at controller/binding.c:971
   10  release_local_binding_children () at controller/binding.c:1039
   11  release_local_binding () at controller/binding.c:1056
   12  consider_iface_release (iface_rec=.. 
iface_id="bb43e818-b2ee-4329-b67e-218556580056") at controller/binding.c:1880
   13  binding_handle_ovs_interface_changes () at controller/binding.c:1998
   14  runtime_data_ovs_interface_handler () at controller/ovn-controller.c:1481
   15  engine_compute () at lib/inc-proc-eng.c:306
   16  engine_run_node () at lib/inc-proc-eng.c:352
   17  engine_run () at lib/inc-proc-eng.c:377
   18  main () at controller/ovn-controller.c:2826

The present code creates a 'struct local_binding' instance for a
container lport and adds this object to the parent local binding
children list.  And if the container lport is changed to a normal
vif, then there is no way to access the local binding object created
earlier.  This patch fixes these type of issues by refactoring the
'local binding' code of binding.c.  This patch now creates only one
instance of 'struct local_binding' for every OVS interface with
external_ids:iface-id set.  A new structure 'struct binding_lport' is
added which is created for a VIF, container and virtual port bindings
and is stored in 'binding_lports' shash.  'struct local_binding' now
maintains a list of binding_lports which it maps to.

When a container lport is changed to a normal lport, we now can
easily access the 'binding_lport' object of the container lport
fron the 'binding_lports' shash.

Reported-by: Dumitru Ceara 
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936328
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1936331

Co-authored-by: Dumitru Ceara 
Signed-off-by: Dumitru Ceara 
Signed-off-by: Numan Siddique 
---

v3 -> v4
---
  * -> Addressed some of the review comments from Dumitru.
   Added another test case to test the scenarion when a container
   port changes its parent from one to another port.

v2 -> v3

  * Fixed a crash seen when a container port is changed to normal port
and then deleted in the same transaction.  Added the relevant test case
for this.

v1 -> v2

  * Fixed a crash seen when 2 container ports are changed to normal ports
in the same transaction.  Added the relevant test case for this.

  controller/binding.c| 973 +++-
  controller/binding.h|  32 +-
  controller/ovn-controller.c |  25 +-
  controller/physical.c   |  13 +-
  tests/ovn.at| 304 +++
  5 files changed, 949 insertions(+), 398 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 4e6c756969..0e3e4aad91 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -597,6 +597,23 @@ remove_local_lport_ids(const struct sbrec_port_binding *pb,
  }
  }
  
+/* Corresponds to each Port_Binding.type. */

+enum en_lport_type {
+LP_UNKNOWN,
+LP_VIF,
+LP_CONTAINER,
+LP_PATCH,
+LP_L3GATEWAY,
+LP_LOCALNET,
+LP_LOCALPORT,
+LP_L2GATEWAY,
+LP_VTEP,
+LP_CHASSISREDIRECT,
+LP_VIRTUAL,
+LP_EXTERNAL,
+LP_REMOTE
+};
+
  /* Local bindings. binding.c module binds the logical port (represented by
   * Port_Binding rows) and sets the 'chassis' column when it sees the

Re: [ovs-dev] [PATCH ovn v4] OVN: Multiple distributed gateway port support

2021-03-24 Thread Numan Siddique
On Thu, Mar 4, 2021 at 6:49 AM svc.eng.git-pa...@nutanix.com
 wrote:
>
> From: Ankur Sharma 
>
> By default, OVN support only one distributed gateway
> port (we will call it l3dgw port for further reference)
> per logical router. While a single l3dgw port suffices
> for most of the North South connectivity, however there
> are requirements where a logical router could be connected
> to multiple physical networks and based on routing decision
> packet could go to vlan X or vlan Y. Additionally, packet
> may or may not get NATed based on the configuration.
>
> This patch adds flexibility of having multiple l3dgw ports
> per logical router.
>
> Changes can classified as following:
> a. Data structure changes to allow multiple l3dgw ports per
>ovn_datapath.
>
> b. Consumption of new data structure in logical flows for
>individual features.
>
> c. Features that require changes are:
>i. Regular NS traffic flow.
>   ii. Network Address Translation.
>  iii. Load Balancer
>   iv. Gateway_mtu.
>v. reside-on-redirect-chassis
>   vi. Misc code sections that assumed a single l3dgw port.
>
> d. ovn-nbctl cli change to allow multiple external ips
>for a logical ip for same type.
>
> e. Except for reside-on-redirect-chassis all the other features
>could be extended to multiple l3dgw ports. Reside on redirect
>chassis with its current specification could not be extended
>and hence should be used only with the logical router that
>has a single l3dgw port.
>
> FUTURE WORK:
> CT ZONES are still common for traffic from different physical networks.
> This adds a restriction/assumption that same 5 tuple will not come
> from different l3dgw ports.
> A cleaner approach would be have different ct zones for each l3dgw port.
> Changing the CT ZONE assignment is one of the enhancements we are
> considering as next step.
>
> Signed-off-by: Ankur Sharma 
> Signed-off-by: Dhathri Purohith 
> Co-authored-by: Dhathri Purohith 

Hi Ankur,

Thanks for v4.  I'm sorry that it took a while to review this patch.

This patch needs a rebase. Overall the patch LGTM.  I have few comments.
Please see below. If you can address those and add ddlog support in
v5, I think it is good to go.
)
This patch needs to update the documentation for distributed gateway
ports in ovn-nb.xml
(mainly here - https://github.com/ovn-org/ovn/blob/master/ovn-nb.xml#L2274
and in ovn-architecture.7.xml.  Please do consider adding a section
for multiple gw ports
in ovn-architecture.7.xml if you can.

Please add an entry in the NEWS file about this feature.

Please see below for additional comments.

> ---
>  northd/ovn-northd.c   | 521 +++---
>  ovn-nb.xml|   4 +-
>  tests/ovn-nbctl.at|  38 ++-
>  tests/ovn-northd.at   | 465 -
>  tests/ovn.at  | 310 -
>  utilities/ovn-nbctl.c | 147 +++-
>  6 files changed, 1276 insertions(+), 209 deletions(-)
>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index ac872aade..cc228fd96 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -602,6 +602,19 @@ ovn_mcast_group_allocate_key(struct mcast_info 
> *mcast_info)
>_info->group_tnlid_hint);
>  }
>
> +struct ovn_l3dgw_port {
> +/* OVN northd only needs to know about the logical router gateway port 
> for
> + * NAT on a distributed router.  This "distributed gateway port" is
> + * populated only when there is a gateway chassis or ha chassis group
> + * specified for one of the ports on the logical router. Otherwise this
> + * will be NULL. */
> +struct ovn_port *dgw_port;
> +
> +/* The "derived" OVN port representing the instance of l3dgw_port on
> + * the gateway chassis. */
> +struct ovn_port *redirect_port;
> +};
> +
>  /* The 'key' comes from nbs->header_.uuid or nbr->header_.uuid or
>   * sb->external_ids:logical-switch. */
>  struct ovn_datapath {
> @@ -633,14 +646,9 @@ struct ovn_datapath {
>  /* Multicast data. */
>  struct mcast_info mcast_info;
>
> -/* OVN northd only needs to know about the logical router gateway port 
> for
> - * NAT on a distributed router.  This "distributed gateway port" is
> - * populated only when there is a gateway chassis specified for one of
> - * the ports on the logical router.  Otherwise this will be NULL. */
> -struct ovn_port *l3dgw_port;
> -/* The "derived" OVN port representing the instance of l3dgw_port on
> - * the gateway chassis. */
> -struct ovn_port *l3redirect_port;
> +/* L3 distributed gateway ports */
> +struct ovn_l3dgw_port *l3dgw_ports;
> +size_t n_l3dgw_ports;
>
>  /* NAT entries configured on the router. */
>  struct ovn_nat *nat_entries;
> @@ -863,6 +871,7 @@ ovn_datapath_destroy(struct hmap *datapaths, struct 
> ovn_datapath *od)
>  ovn_destroy_tnlids(>port_tnlids);
>  destroy_ipam_info(>ipam_info);
> 

Re: [ovs-dev] [PATCH ovn v2] pinctrl: Don't send gARPs for localports

2021-03-24 Thread 0-day Robot
Bleep bloop.  Greetings Daniel Alvarez Sanchez, I am a robot and I have tried 
out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Dumitru Ceara 
Lines checked: 107, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] pinctrl: Don't send gARPs for localports

2021-03-24 Thread Daniel Alvarez Sanchez
Ports of type 'localport' are present on every hypervisor and
ovn-controller is sending gARPs for them which makes upstream
switches to see its MAC address flapping.

In order to avoid this behavior, the current patch is skipping
localports when sending gARP/RARP packets.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1939470

Signed-off-by: Daniel Alvarez Sanchez 
Signed-off-by: Dumitru Ceara 
---

v1 -> v2: added test 

 controller/pinctrl.c |  6 +
 tests/ovn.at | 55 
 2 files changed, 61 insertions(+)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index b42288ea5..523a45b9a 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -4240,6 +4240,12 @@ send_garp_rarp_update(struct ovsdb_idl_txn 
*ovnsb_idl_txn,
   struct shash *nat_addresses)
 {
 volatile struct garp_rarp_data *garp_rarp = NULL;
+
+/* Skip localports as they don't need to be announced */
+if (!strcmp(binding_rec->type, "localport")) {
+return;
+}
+
 /* Update GARP for NAT IP if it exists.  Consider port bindings with type
  * "l3gateway" for logical switch ports attached to gateway routers, and
  * port bindings with type "patch" for logical switch ports attached to
diff --git a/tests/ovn.at b/tests/ovn.at
index b751d6db2..8c0146c2d 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -11596,6 +11596,61 @@ OVN_CLEANUP([hv1],[hv2])
 AT_CLEANUP
 ])
 
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn -- localport suppress gARP])
+ovn_start
+
+net_add n1
+sim_add hv1
+as hv1
+check ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+
+check ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+
+check ovn-nbctl ls-add ls \
+-- lsp-add ls lp \
+-- lsp-set-type lp localport \
+-- lsp-set-addresses lp "00:00:00:00:00:01 10.0.0.1" \
+-- lsp-add ls ln \
+-- lsp-set-type ln localnet \
+-- lsp-set-options ln network_name=phys \
+-- lsp-add ls lsp \
+-- lsp-set-addresses lsp "00:00:00:00:00:02 10.0.0.2"
+
+dnl First bind the localport.
+check ovs-vsctl add-port br-int vif1 \
+-- set Interface vif1 external-ids:iface-id=lp
+check ovn-nbctl --wait=hv sync
+
+dnl Then bind the regular vif.
+check ovs-vsctl add-port br-int vif2 \
+-- set Interface vif2 external-ids:iface-id=lsp \
+options:tx_pcap=hv1/vif2-tx.pcap \
+options:rxq_pcap=hv1/vif2-rx.pcap
+
+wait_for_ports_up lsp
+check ovn-nbctl --wait=hv sync
+
+dnl Wait for at least two gARPs from lsp (10.0.0.2).
+lsp_garp=00020806000108000604000100020a020a02
+OVS_WAIT_UNTIL([
+garps=`$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/br-phys-tx.pcap | 
grep ${lsp_garp} -c`
+test $garps -ge 2
+])
+
+dnl At this point it's safe to assume that ovn-controller skipped sending gARP
+dnl for the localport.  Check that there are no other packets than the gARPs
+dnl for the regular vif.
+AT_CHECK([
+pkts=`$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/br-phys-tx.pcap | 
grep -v ${lsp_garp} -c`
+test 0 -eq $pkts
+])
+
+OVN_CLEANUP([hv1])
+AT_CLEANUP
+])
+
 OVN_FOR_EACH_NORTHD([
 AT_SETUP([ovn -- 1 LR with HA distributed router gateway port])
 ovn_start
-- 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] crash in dst_cache_destroy

2021-03-24 Thread Gregory Rose




On 3/24/2021 12:48 AM, wanghan...@corp.netease.com wrote:

Hi ALL,
We found a kernel panic about ovs 2.8.2 with kernel 4.9.65.


That's a very old release on an old kernel.  Can you repro
with something more recent?  I suggest upgrading your OVS
version.

- Greg




[20655819.900423] BUG: unable to handle kernel NULL pointer dereference at 
0062
[20655819.909234] IP: [] dst_release+0x11/0x70
[20655819.915841] PGD 0
[20655819.917963]
[20655819.920779] Oops:  [#1] SMP
...
[20655820.072555] CPU: 13 PID: 585747 Comm: handler146 Tainted: G   O   
 4.9.65-k8s-netease3-1 #1
[20655820.082713] Hardware name: Inspur NF5280M4/YZMB-00689-101, BIOS 4.1.19 
02/27/2018
[20655820.091284] task: 9054dec880c0 task.stack: b77e799ec000
[20655820.098257] RIP: 0010:[]  [] 
dst_release+0x11/0x70
[20655820.107239] RSP: 0018:9034ff943ec8  EFLAGS: 00010202
[20655820.113627] RAX: 9034ff60 RBX:  RCX: 

[20655820.122013] RDX: 47293d0355f0 RSI: 0200 RDI: 
0002
[20655820.130183] RBP: 9034ae6465d8 R08:  R09: 
00ff
[20655820.138360] R10: 9034d9a9cb10 R11:  R12: 
b3d18e00
[20655820.146755] R13: 9034acad3d00 R14: 9054dec880c0 R15: 
9034ff9596c0
[20655820.154969] FS:  7f8a09ffb700() GS:9034ff94() 
knlGS:
[20655820.164098] CS:  0010 DS:  ES:  CR0: 80050033
[20655820.170910] CR2: 0062 CR3: 001fdef6e000 CR4: 
00360670
[20655820.179082] DR0:  DR1:  DR2: 

[20655820.187235] DR3:  DR6: fffe0ff0 DR7: 
0400
[20655820.195445] Stack:
[20655820.198492]   9034ae6465d8 b3529748 
9034ae646500
[20655820.207727]    b350c811 
b3c4fc00
[20655820.216333]  9034ff9596f8 0001 b350cb5a 
b30dd643
[20655820.224998] Call Trace:
[20655820.228487]  
[20655820.230688]  [] ? dst_cache_destroy+0x38/0x70
[20655820.237639]  [] ? dst_destroy+0x21/0x110
[20655820.244225]  [] ? dst_destroy_rcu+0xa/0x20
[20655820.250930]  [] ? rcu_process_callbacks+0x1e3/0x5b0
[20655820.258457]  [] ? __do_softirq+0x105/0x290
[20655820.265757]  [] ? irq_exit+0xae/0xb0
[20655820.272245]  [] ? smp_apic_timer_interrupt+0x3e/0x50
[20655820.280312]  [] ? apic_timer_interrupt+0x96/0xa0
[20655820.287534]  
[20655820.289734]  [] ? mutex_spin_on_owner.isra.1+0x28/0x40
[20655820.297580]  [] ? mutex_optimistic_spin+0x7f/0x1b0
[20655820.305032]  [] ? __mutex_lock_slowpath+0x3f/0x130
[20655820.312464]  [] ? mutex_lock+0x1b/0x30
[20655820.318831]  [] ? pcpu_alloc+0x48f/0x680
[20655820.325376]  [] ? dst_cache_init+0x1d/0x40
[20655820.332126]  [] ? validate_and_copy_set_tun+0xe0/0x360 
[openvswitch]
[20655820.341108]  [] ? validate_set+0x2b3/0x380 [openvswitch]
[20655820.349009]  [] ? __ovs_nla_copy_actions+0x2b1/0x680 
[openvswitch]
[20655820.357765]  [] ? __kmalloc_reserve.isra.35+0x2e/0x80
[20655820.365372]  [] ? kmem_cache_alloc_node+0xd4/0x530
[20655820.372684]  [] ? ovs_nla_copy_actions+0x6e/0x90 
[openvswitch]
[20655820.381064]  [] ? ovs_packet_cmd_execute+0x16d/0x290 
[openvswitch]
[20655820.389745]  [] ? genl_family_rcv_msg+0x1bc/0x360
[20655820.396972]  [] ? genl_family_rcv_msg+0x360/0x360
[20655820.404247]  [] ? genl_rcv_msg+0x82/0xc0
[20655820.410646]  [] ? netlink_rcv_skb+0xa1/0xc0
[20655820.417350]  [] ? genl_rcv+0x24/0x40
[20655820.423927]  [] ? netlink_unicast+0x184/0x230
[20655820.430773]  [] ? netlink_sendmsg+0x2f8/0x3b0
[20655820.437554]  [] ? sock_sendmsg+0x30/0x40
[20655820.443925]  [] ? ___sys_sendmsg+0x2c2/0x2d0
[20655820.450596]  [] ? ep_ptable_queue_proc+0x90/0x90
[20655820.457686]  [] ? 
ep_scan_ready_list.constprop.12+0x208/0x210
[20655820.465888]  [] ? ep_poll+0x192/0x350
[20655820.471919]  [] ? __sys_sendmsg+0x51/0x90
[20655820.478876]  [] ? system_call_fast_compare_end+0xc/0xb7

Regards,
Hanlin
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 2/2] northd: Provide the option to not use ct.inv in lflows.

2021-03-24 Thread Numan Siddique
Hi Ben,

Need your eye on this patch for the ddlog stuff.  I was wondering
if there is a better way to do than what I've done.

Not urgent. Please take your time.

Thanks
Numan

On Mon, Mar 22, 2021 at 4:29 PM  wrote:
>
> From: Numan Siddique 
>
> Presently we add 65535 priority lflows in the stages -
> 'ls_in_acl' and 'ls_out_acl' to drop packets which
> match on 'ct.inv'.
>
> This patch provides an option to not use this field in the
> logical flow matches for the following
> reasons:
>
>  • Some of the smart NICs which support offloading datapath
>flows may not support this field.  In such cases, almost
>all the datapath flows cannot be offloaded if ACLs/load balancers
>are configured on the logical switch datapath.
>
>  • A recent commit in kernel ovs datapath sets the committed
>connection tracking entry to be liberal for out-of-window
>tcp packets (nf_ct_set_tcp_be_liberal()).  Such TCP
>packets will not be marked as invalid.
>
> There are still certain scenarios where a packet can be marked
> invalid.  Like
>  • If the tcp flags are not correct.
>
>  • If there are checksum errors.
>
> In such cases, the packet will be delivered to the destination
> port.  So CMS should evaluate their usecases before enabling
> this option.
>
> Signed-off-by: Numan Siddique 
> ---
>  NEWS |  5 +++
>  northd/lswitch.dl| 13 +++-
>  northd/ovn-northd.c  | 41 ++-
>  northd/ovn_northd.dl | 51 +++--
>  ovn-nb.xml   | 11 +++
>  tests/ovn-northd.at  | 77 
>  6 files changed, 171 insertions(+), 27 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 530c5d42fe..3181649ba8 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -7,6 +7,11 @@ Post-v21.03.0
>  (This may take testing and tuning to be effective.)  This version of OVN
>  requires DDLog 0.36.
>- Introduce ovn-controller incremetal processing engine statistics
> +  - Add a new NB Global option - us_ct_inv_match with the default value of 
> true.
> +If this is set to false, then the logical field - "ct.inv" will not be
> +used in the logical flow matches.  CMS can consider setting this to 
> false,
> +if they want to use smart NICs which don't support offloading datapath 
> flows
> +with this field used.
>
>  OVN v21.03.0 - 12 Mar 2021
>  -
> diff --git a/northd/lswitch.dl b/northd/lswitch.dl
> index 4bf8a5b907..1daf249382 100644
> --- a/northd/lswitch.dl
> +++ b/northd/lswitch.dl
> @@ -197,6 +197,7 @@ relation (
>  ipv6_prefix:   Option,
>  mcast_cfg: Ref,
>  is_vlan_transparent: bool,
> +use_ct_inv_match:  bool,
>
>  /* Does this switch have at least one port with type != "router"? */
>  has_non_router_port: bool
> @@ -223,7 +224,8 @@ function ipv6_parse_prefix(s: string): Option {
>  .ipv6_prefix   = ipv6_prefix,
>  .mcast_cfg = mcast_cfg,
>  .has_non_router_port = has_non_router_port,
> -.is_vlan_transparent = is_vlan_transparent) :-
> +.is_vlan_transparent = is_vlan_transparent,
> +.use_ct_inv_match = use_ct_inv_match) :-
>  nb::Logical_Switch[ls],
>  LogicalSwitchHasStatefulACL(ls._uuid, has_stateful_acl),
>  LogicalSwitchHasLBVIP(ls._uuid, has_lb_vip),
> @@ -232,6 +234,7 @@ function ipv6_parse_prefix(s: string): Option {
>  LogicalSwitchLocalnetPorts(ls._uuid, localnet_ports),
>  LogicalSwitchHasNonRouterPort(ls._uuid, has_non_router_port),
>  mcast_cfg in (.datapath = ls._uuid),
> +UseCtInvMatch(use_ct_inv_match),
>  var subnet =
>  match (ls.other_config.get("subnet")) {
>  None -> None,
> @@ -744,3 +747,11 @@ function put_svc_monitor_mac(options: Map,
>  relation SvcMonitorMac(mac: eth_addr)
>  SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :-
>  nb::NB_Global(._uuid = uuid, .options = options).
> +
> +function can_use_ct_inv_match(options: Map): bool {
> +map_get_bool_def(options, "use_ct_inv_match", true)
> +}
> +
> +relation UseCtInvMatch(use_ct_inv_match: bool)
> +UseCtInvMatch(can_use_ct_inv_match(options)) :-
> +nb::NB_Global(._uuid = uuid, .options = options).
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 3221fb9ff7..6baed2a418 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -4066,6 +4066,10 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct 
> ovn_datapath *od,
>   * logical datapath only by creating a datapath group. */
>  static bool use_logical_dp_groups = false;
>
> +/* If this option is 'true' northd will make use of ct.inv match fields.
> + * Otherwise, it will avoid using it.  The default is true. */
> +static bool use_ct_inv_match = true;
> +
>  /* Adds a row with the specified contents to the Logical_Flow table. */
>  static void
>  ovn_lflow_add_at(struct hmap *lflow_map, struct ovn_datapath *od,
> @@ -5681,12 +5685,13 @@ build_acls(struct ovn_datapath *od, struct 

[ovs-dev] [PATCH] LLDP: add new command to show LLDP neighbor info

2021-03-24 Thread Rick Zhong
Dear OVS reviewers/supervisors:


This patch is related to LLDP which provides a new command "ovs-appctl 
lldp/neighbor" to show LLDP neighbor info when LLDP is enabled on OVS 
interfaces.


With this new command, user is enable to get LLDP neighbor info even if not in 
SPB network.


One limitation is that when multiple peer Management IP addresses are found by 
LLDP, only one Management IP address is displayed by the command.


The patch is well-tested on Linux.


Related commit: af4b3d3 and e4bc70c (add new command to show LLDP neighbor info 
#349)


Signed-off-by: Rick Zhong (winsome8...@163.com)


=
diff --git a/lib/ovs-lldp.c b/lib/ovs-lldp.c
index 05c1dd434..4c8ab9126 100644
--- a/lib/ovs-lldp.c
+++ b/lib/ovs-lldp.c
@@ -324,6 +324,84 @@ aa_print_isid_status(struct ds *ds, struct lldp *lldp) 
OVS_REQUIRES(mutex)
 }
 }
 
+static void
+lldp_print_neighbor_port(struct ds *ds, struct lldpd_hardware *hw)
+{
+struct lldpd_port *port;
+
+LIST_FOR_EACH (port, p_entries, >h_rports) {
+const char *none_str = "";
+char *id = NULL;
+const char *name = NULL;
+const char *port_id = NULL;
+char ipaddress[INET6_ADDRSTRLEN + 1];
+memset(ipaddress, 0, INET6_ADDRSTRLEN + 1);
+
+if (port->p_chassis) {
+if (port->p_chassis->c_id_len > 0) {
+chassisid_to_string(port->p_chassis->c_id,
+port->p_chassis->c_id_len, );
+}
+
+name = port->p_chassis->c_name;
+
+struct lldpd_mgmt *mgmt;
+LIST_FOR_EACH (mgmt, m_entries, >p_chassis->c_mgmt) {
+int af;
+size_t alen;
+switch (mgmt->m_family) {
+case LLDPD_AF_IPV4:
+alen = INET_ADDRSTRLEN + 1;
+af = AF_INET;
+break;
+case LLDPD_AF_IPV6:
+alen = INET6_ADDRSTRLEN + 1;
+af = AF_INET6;
+break;
+default:
+continue;
+}
+
+if (inet_ntop(af, >m_addr, ipaddress, alen) == NULL) {
+continue;
+}
+break;
+}
+}
+
+port_id = port->p_id;
+
+ds_put_format(ds, "  Neighbor Chassis ID: %s\n",
+  id ? id : none_str);
+ds_put_format(ds, "  Neighbor Chassis SysName: %s\n",
+  name ? name : none_str);
+ds_put_format(ds, "  Neighbor Management IP: %s\n",
+  ipaddress);
+ds_put_format(ds, "  Neighbor Port ID: %s\n",
+  port_id ? port_id : none_str);
+
+if (id != NULL) {
+free(id);
+}
+}
+}
+
+static void
+lldp_print_neighbor(struct ds *ds, struct lldp *lldp) OVS_REQUIRES(mutex)
+{
+struct lldpd_hardware *hw;
+
+ds_put_format(ds, "LLDP: %s\n", lldp->name);
+
+if (!lldp->lldpd) {
+return;
+}
+
+LIST_FOR_EACH (hw, h_entries, >lldpd->g_hardware) {
+lldp_print_neighbor_port(ds, hw);
+}
+}
+
 static void
 aa_unixctl_status(struct unixctl_conn *conn, int argc OVS_UNUSED,
   const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
@@ -382,6 +460,25 @@ aa_unixctl_statistics(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
 unixctl_command_reply(conn, ds_cstr());
 }
 
+static void
+lldp_unixctl_show_neighbor(struct unixctl_conn *conn, int argc OVS_UNUSED,
+  const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
+OVS_EXCLUDED(mutex)
+{
+struct lldp *lldp;
+struct ds ds = DS_EMPTY_INITIALIZER;
+
+ovs_mutex_lock();
+
+HMAP_FOR_EACH (lldp, hmap_node, all_lldps) {
+lldp_print_neighbor(, lldp);
+}
+unixctl_command_reply(conn, ds_cstr());
+ds_destroy();
+
+ovs_mutex_unlock();
+}
+
 /* An Auto Attach mapping was configured.  Populate the corresponding
  * structures in the LLDP hardware.
  */
@@ -649,6 +746,8 @@ lldp_init(void)
  aa_unixctl_show_isid, NULL);
 unixctl_command_register("autoattach/statistics", "[bridge]", 0, 1,
  aa_unixctl_statistics, NULL);
+unixctl_command_register("lldp/neighbor", "[bridge]", 0, 1,
+ lldp_unixctl_show_neighbor, NULL);
 }
 
 /* Returns true if 'lldp' should process packets from 'flow'.  Sets
=


Regards,
Rick Zhong
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] Support vlan-passthru for tag=0 logical switch ports

2021-03-24 Thread Ben Pfaff
On Wed, Mar 24, 2021 at 10:30:14AM -0400, Ihar Hrachyshka wrote:
> On Wed, Mar 17, 2021 at 2:01 PM Ben Pfaff  wrote:
> >
> > On Wed, Mar 17, 2021 at 01:47:16PM -0400, Ihar Hrachyshka wrote:
> > > When vlan-passthru is true for LS, for untagged localnet ports, tagged
> > > VLAN traffic originating from VIFs should be passed through intact since
> > > the VLAN header belongs to VIF user, not the localnet port fabric.
> > >
> > > This patch adds multinode test cases to cover scenarios where packets
> > > traverse through fabric layer, for both tagged and untagged (tag=0)
> > > localnet ports.
> > >
> > > Signed-off-by: Ihar Hrachyshka 
> >
> > Thanks for contributing a feature to OVN!  I see that this changes
> > ovn-northd.c in a way that needs a corresponding change to
> > ovn_northd.dl.  I am available to help with that--please let me know!
> >
> 
> I've sent v2 with dl implementation.

Thanks!
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] Support vlan-passthru for tag=0 logical switch ports

2021-03-24 Thread Ben Pfaff
On Tue, Mar 23, 2021 at 09:22:43PM -0400, Ihar Hrachyshka wrote:
> When vlan-passthru is true for LS, for untagged localnet ports, tagged
> VLAN traffic originating from VIFs should be passed through intact since
> the VLAN header belongs to VIF user, not the localnet port fabric.
> 
> This patch adds multinode test cases to cover scenarios where packets
> traverse through fabric layer, for both tagged and untagged (tag=0)
> localnet ports.
> 
> Signed-off-by: Ihar Hrachyshka 

Thanks for adding ddlog support.  It looks correct to me.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] ovn-northd-ddlog scale issues

2021-03-24 Thread Ben Pfaff
Thanks a lot for the report!  I need a new test case so I'll give this
one a shot when I can.

On Wed, Mar 24, 2021 at 04:03:07PM +0100, Dumitru Ceara wrote:
> Hi Ben,
> 
> We discussed a bit about this during one of the recent IRC OVN meetings,
> but I didn't get around to properly reporting this until now.
> 
> I've tried running ovn-northd-ddlog against some large OVN NB/DB
> databases extracted from one of our scale testing runs:
> 
> http://people.redhat.com/~dceara/ovn-northd-ddlog-tests/20210324/existing-nb-sb/
> 
> It seems that ovn-northd-ddlog gets stuck in a busy loop and uses a lot
> of memory:
> 
> 775734 root  10 -10   81.6g  80.8g  22396 S  99.7  64.2   3:50.79 
> ovn-northd-ddlog
> 
> ovn-northd-ddlog is stuck in an (infinite?) loop in
> ddlog_transaction_commit_dump_changes():
> 
> (gdb) bt
> #0  0x7f2762167e92 in pthread_cond_wait@@GLIBC_2.3.2 () from 
> /lib64/libpthread.so.0
> #1  0x00891eb3 in std::thread::park ()
> #2  0x00530963 in crossbeam_channel::context::Context::wait_until ()
> #3  0x00530aa1 in 
> crossbeam_channel::context::Context::with::{{closure}} ()
> #4  0x00531f5c in crossbeam_channel::flavors::list::Channel::recv 
> ()
> #5  0x0075b35d in crossbeam_channel::channel::Receiver::recv ()
> #6  0x00512156 in 
> differential_datalog::program::RunningProgram::flush ()
> #7  0x0050de67 in 
> differential_datalog::program::RunningProgram::transaction_commit ()
> #8  0x0093f17d in  differential_datalog::ddlog::DDlog>::transaction_commit_dump_changes ()
> #9  0x0040cc40 in ddlog_transaction_commit_dump_changes ()
> #10 0x0040a758 in ddlog_commit (ddlog=) at 
> northd/ovn-northd-ddlog.c:435
> #11 northd_parse_update (update=0x3dc0598, ctx=0x3d71880) at 
> northd/ovn-northd-ddlog.c:435
> #12 northd_run (ctx=0x3d71880) at northd/ovn-northd-ddlog.c:512
> #13 0x00408edd in main (argc=, argv=) 
> at northd/ovn-northd-ddlog.c:1203
> 
> For comparison, running the C version of ovn-northd I get:
> 
> 2021-03-24T14:48:06.556Z|00033|timeval|WARN|Unreasonably long 11290ms poll 
> interval (10916ms user, 334ms system)
> 2021-03-24T14:48:14.678Z|00050|timeval|WARN|Unreasonably long 8122ms poll 
> interval (8046ms user, 48ms system)
> 
> But after the northd iteration ends, memory usage is OK:
> 
> 777567 root  10 -10 1657308   1.6g   3496 S   0.0   1.2   0:49.08 
> ovn-northd
> 
> The behavior above is consistent both with current OVN master and also
> when cherry-picking the ddlog-related patches that are pending in
> patchwork:
> http://patchwork.ozlabs.org/project/ovn/list/?series=233075
> http://patchwork.ozlabs.org/project/ovn/list/?series=232480
> http://patchwork.ozlabs.org/project/ovn/list/?series=233079
> http://patchwork.ozlabs.org/project/ovn/list/?series=233080
> 
> I didn't try out the changes from the following series though as I
> understand they need a v2:
> http://patchwork.ozlabs.org/project/ovn/list/?series=232040
> 
> Regards,
> Dumitru
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] rhel: Do not update kmod RPM newer major revision kernels

2021-03-24 Thread Gregory Rose




On 3/22/2021 4:58 AM, Ilya Maximets wrote:

On 2/9/21 8:47 PM, Greg Rose wrote:

The ovs-kmod-manage.sh script will run weak-updates even on newer
release kernels installing a non-compatible or un-runnable kernel
module.

Update the script to never install weak-updates onto kernels with
newer major release versions.

VMware-BZ: #2717283
Signed-off-by: Greg Rose 
---
  rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh 
b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
index f147857e4..66b09472a 100644
--- a/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
+++ b/rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh
@@ -155,6 +155,16 @@ kmod_versions=()
  kversion=$(rpm -ql ${rpmname} | grep '\.ko$' | \
 sed -n -e 's/^\/lib\/modules\/\(.*\)\/extra\/.*$/\1/p' | \
 sort | uniq)
+
+IFS='.\|-' read installed_major installed_minor installed_patch \
+installed_major_rev installed_minor_rev installed_extra <<<"${kversion}"
+
+if [ "$installed_major_rev" -lt "$major_rev" ]; then
+echo Not installing RPM with major revision "$installed_major_rev" \
+to kernel with greater major revision "$major_rev.  Exiting"


I fixed formatting here a bit and applied to master.  Thanks!

Best regards, Ilya Maximets.



Thanks!  I kind of forgot about this one while working on other stuff.
Still need it though so much appreciated.

- Greg
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [OVN Patch v15 1/3] ovn-libs: Add support for parallel processing

2021-03-24 Thread Numan Siddique
On Mon, Mar 1, 2021 at 6:35 PM  wrote:
>
> From: Anton Ivanov 
>
> This adds a set of functions and macros intended to process
> hashes in parallel.
>
> The principles of operation are documented in the ovn-parallel-hmap.h
>
> If these one day go into the OVS tree, the OVS tree versions
> would be used in preference.
>
> Signed-off-by: Anton Ivanov 

Hi Anton,

I tested the first 2 patches of this series and it crashes again for me.

This time I ran tests on a 4 core  machine - Intel(R) Xeon(R) CPU
E3-1220 v5 @ 3.00GHz

The below trace is seen for both gcc and clang.


[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `ovn-northd -vjsonrpc
--ovnnb-db=unix:/mnt/mydisk/myhome/numan_alt/work/ovs_ovn/'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7f27594ae212 in __new_sem_wait_slow.constprop.0 () from
/lib64/libpthread.so.0
[Current thread is 1 (Thread 0x7f2758c68640 (LWP 347378))]
Missing separate debuginfos, use: dnf debuginfo-install
glibc-2.32-3.fc33.x86_64 libcap-ng-0.8-1.fc33.x86_64
libevent-2.1.8-10.fc33.x86_64 openssl-libs-1.1.1i-1.fc33.x86_64
python3-libs-3.9.1-2.fc33.x86_64 unbound-libs-1.10.1-4.fc33.x86_64
zlib-1.2.11-23.fc33.x86_64
(gdb) bt
#0  0x7f27594ae212 in __new_sem_wait_slow.constprop.0 () from
/lib64/libpthread.so.0
#1  0x00422184 in wait_for_work (control=) at
../lib/ovn-parallel-hmap.h:203
#2  build_lflows_thread (arg=0x2538420) at ../northd/ovn-northd.c:11855
#3  0x0049cd12 in ovsthread_wrapper (aux_=) at
../lib/ovs-thread.c:383
#4  0x7f27594a53f9 in start_thread () from /lib64/libpthread.so.0
#5  0x7f2759142903 in clone () from /lib64/libc.so.6
-

I'm not sure why you're not able to reproduce this issue.

All the test cases passed for me. So maybe something's wrong when
ovn-northd exits.
IMHO, these crashes should be addressed before these patches can be considered.

Thanks
Numan

> ---
>  lib/automake.mk |   2 +
>  lib/ovn-parallel-hmap.c | 455 
>  lib/ovn-parallel-hmap.h | 301 ++
>  3 files changed, 758 insertions(+)
>  create mode 100644 lib/ovn-parallel-hmap.c
>  create mode 100644 lib/ovn-parallel-hmap.h
>
> diff --git a/lib/automake.mk b/lib/automake.mk
> index 250c7aefa..781be2109 100644
> --- a/lib/automake.mk
> +++ b/lib/automake.mk
> @@ -13,6 +13,8 @@ lib_libovn_la_SOURCES = \
> lib/expr.c \
> lib/extend-table.h \
> lib/extend-table.c \
> +   lib/ovn-parallel-hmap.h \
> +   lib/ovn-parallel-hmap.c \
> lib/ip-mcast-index.c \
> lib/ip-mcast-index.h \
> lib/mcast-group-index.c \
> diff --git a/lib/ovn-parallel-hmap.c b/lib/ovn-parallel-hmap.c
> new file mode 100644
> index 0..e83ae23cb
> --- /dev/null
> +++ b/lib/ovn-parallel-hmap.c
> @@ -0,0 +1,455 @@
> +/*
> + * Copyright (c) 2020 Red Hat, Inc.
> + * Copyright (c) 2008, 2009, 2010, 2012, 2013, 2015, 2019 Nicira, Inc.
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at:
> + *
> + * http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "fatal-signal.h"
> +#include "util.h"
> +#include "openvswitch/vlog.h"
> +#include "openvswitch/hmap.h"
> +#include "openvswitch/thread.h"
> +#include "ovn-parallel-hmap.h"
> +#include "ovs-atomic.h"
> +#include "ovs-thread.h"
> +#include "ovs-numa.h"
> +#include "random.h"
> +
> +VLOG_DEFINE_THIS_MODULE(ovn_parallel_hmap);
> +
> +#ifndef OVS_HAS_PARALLEL_HMAP
> +
> +#define WORKER_SEM_NAME "%x-%p-%x"
> +#define MAIN_SEM_NAME "%x-%p-main"
> +
> +/* These are accessed under mutex inside add_worker_pool().
> + * They do not need to be atomic.
> + */
> +
> +static atomic_bool initial_pool_setup = ATOMIC_VAR_INIT(false);
> +static bool can_parallelize = false;
> +
> +/* This is set only in the process of exit and the set is
> + * accompanied by a fence. It does not need to be atomic or be
> + * accessed under a lock.
> + */
> +
> +static bool workers_must_exit = false;
> +
> +static struct ovs_list worker_pools = OVS_LIST_INITIALIZER(_pools);
> +
> +static struct ovs_mutex init_mutex = OVS_MUTEX_INITIALIZER;
> +
> +static int pool_size;
> +
> +static int sembase;
> +
> +static void worker_pool_hook(void *aux OVS_UNUSED);
> +static void setup_worker_pools(bool force);
> +static void merge_list_results(struct worker_pool *pool OVS_UNUSED,
> 

[ovs-dev] ovn-northd-ddlog scale issues

2021-03-24 Thread Dumitru Ceara
Hi Ben,

We discussed a bit about this during one of the recent IRC OVN meetings,
but I didn't get around to properly reporting this until now.

I've tried running ovn-northd-ddlog against some large OVN NB/DB
databases extracted from one of our scale testing runs:

http://people.redhat.com/~dceara/ovn-northd-ddlog-tests/20210324/existing-nb-sb/

It seems that ovn-northd-ddlog gets stuck in a busy loop and uses a lot
of memory:

775734 root  10 -10   81.6g  80.8g  22396 S  99.7  64.2   3:50.79 
ovn-northd-ddlog

ovn-northd-ddlog is stuck in an (infinite?) loop in
ddlog_transaction_commit_dump_changes():

(gdb) bt
#0  0x7f2762167e92 in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib64/libpthread.so.0
#1  0x00891eb3 in std::thread::park ()
#2  0x00530963 in crossbeam_channel::context::Context::wait_until ()
#3  0x00530aa1 in 
crossbeam_channel::context::Context::with::{{closure}} ()
#4  0x00531f5c in crossbeam_channel::flavors::list::Channel::recv ()
#5  0x0075b35d in crossbeam_channel::channel::Receiver::recv ()
#6  0x00512156 in differential_datalog::program::RunningProgram::flush 
()
#7  0x0050de67 in 
differential_datalog::program::RunningProgram::transaction_commit ()
#8  0x0093f17d in ::transaction_commit_dump_changes ()
#9  0x0040cc40 in ddlog_transaction_commit_dump_changes ()
#10 0x0040a758 in ddlog_commit (ddlog=) at 
northd/ovn-northd-ddlog.c:435
#11 northd_parse_update (update=0x3dc0598, ctx=0x3d71880) at 
northd/ovn-northd-ddlog.c:435
#12 northd_run (ctx=0x3d71880) at northd/ovn-northd-ddlog.c:512
#13 0x00408edd in main (argc=, argv=) at 
northd/ovn-northd-ddlog.c:1203

For comparison, running the C version of ovn-northd I get:

2021-03-24T14:48:06.556Z|00033|timeval|WARN|Unreasonably long 11290ms poll 
interval (10916ms user, 334ms system)
2021-03-24T14:48:14.678Z|00050|timeval|WARN|Unreasonably long 8122ms poll 
interval (8046ms user, 48ms system)

But after the northd iteration ends, memory usage is OK:

777567 root  10 -10 1657308   1.6g   3496 S   0.0   1.2   0:49.08 ovn-northd

The behavior above is consistent both with current OVN master and also
when cherry-picking the ddlog-related patches that are pending in
patchwork:
http://patchwork.ozlabs.org/project/ovn/list/?series=233075
http://patchwork.ozlabs.org/project/ovn/list/?series=232480
http://patchwork.ozlabs.org/project/ovn/list/?series=233079
http://patchwork.ozlabs.org/project/ovn/list/?series=233080

I didn't try out the changes from the following series though as I
understand they need a v2:
http://patchwork.ozlabs.org/project/ovn/list/?series=232040

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] ovn-northd: bypass ct for allow ACLs

2021-03-24 Thread Ihar Hrachyshka
For allow ACLs, bypass connection tracking by avoiding setting ct
hints for matching traffic. Avoid sending all traffic to ct when a
stateful ACL is present. Before the patch, this unnecessarily hit
performance when mixed ACL action types were used for the same
datapath.

===

For performance measurements, ovn-fake-multinode environment and qperf
were used. Performance measured between two virtual nodes, two ports
that belong to different LSs connected via router. Using qperf,
performance was measured for UDP, TCP, SCTP protocols (using
_lat and _bw tests). The qperf version used:
0.4.9-16.fc31.x86_64.  Each test scenario was executed five times and
averages compared.

Tests were executed with `allow` rules for the tested protocol and
`allow-related` for another protocol set for both ports, both
directions, e.g. for TCP scenario, the following ACLs were defined:

ovn-nbctl acl-add sw0 to-lport 100 tcp allow
ovn-nbctl acl-add sw0 from-lport 100 tcp allow
ovn-nbctl acl-add sw1 to-lport 100 tcp allow
ovn-nbctl acl-add sw1 from-lport 100 tcp allow

ovn-nbctl acl-add sw0 to-lport 100 sctp allow-related
ovn-nbctl acl-add sw0 from-lport 100 sctp allow-related
ovn-nbctl acl-add sw1 to-lport 100 sctp allow-related
ovn-nbctl acl-add sw1 from-lport 100 sctp allow-related

In this particular environment, improvement was seen in send_bw,
latency, and msg_rate measurements, where applicable, for all three
protocols under test.

for UDP, send_bw: 293.6 MB/sec => 313.2 MB/sec (+6.68%)
 latency: 16 us => 14.08 us (-12%)
 msg_rate: 62.56 K/sec => 71.06 K/sec (+13.59%)

for TCP, latency: 18.6 us => 14.88 us (-20%)
 msg_rate: 53.8 K/sec => 67.28 K/sec (+25.06%)

for SCTP, latency: 21.98 us => 19.42 us (-11.65%)
  msg_rate: 45.58 K/sec => 51.54 K/sec (+13.08%)

Interestingly, some performance improvement was also seen for the same
scenarios with no ACLs set at all, albeit significantly more
negligible.

for UDP, send_bw: 320.0 MB/sec => 338.6 MB/sec (+5.81%)
 latency: 13.74 us => 12.88 us (-6.68%)
 msg_rate: 73.02 K/sec => 77.84 K/sec (+6.6%)

for TCP, latency: 15.62 us => 14.26 us (-9.54%)
 msg_rate: 64.02 K/sec => 70.26 K/sec (+9.75%)

for SCTP, latency: 19.56 us => 18.16 us (-7.16%)
  msg_rate: 51.16 K/sec => 55.12 K/sec (+7.74%)

Comparable numbers can be captured with iperf. It may be useful to run
more tests in a more elaborate (bare metal) environment.

===

The patch takes inspiration from a now abandoned patch:

"ovn-northd: Support mixing stateless/stateful ACLs with
Stateless_Filter." by Dumitru Ceara.

Signed-off-by: Ihar Hrachyshka 

---

v1: initial version.
v2: rebased after conflict.
---
 NEWS|   1 +
 northd/ovn-northd.8.xml |   9 +-
 northd/ovn-northd.c | 166 +++
 tests/ovn-northd.at | 287 
 4 files changed, 400 insertions(+), 63 deletions(-)

diff --git a/NEWS b/NEWS
index 530c5d42f..548a45fb7 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,7 @@ Post-v21.03.0
 (This may take testing and tuning to be effective.)  This version of OVN
 requires DDLog 0.36.
   - Introduce ovn-controller incremetal processing engine statistics
+  - Bypass connection tracking for ACL "allow" action processing.
 
 OVN v21.03.0 - 12 Mar 2021
 -
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index a62f5c057..f38d71682 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -419,7 +419,9 @@
   before eventually advancing to ingress table ACLs. If
   special ports such as route ports or localnet ports can't use ct(), a
   priority-110 flow is added to skip over stateful ACLs. IPv6 Neighbor
-  Discovery and MLD traffic also skips stateful ACLs.
+  Discovery and MLD traffic also skips stateful ACLs. For stateless "allow"
+  ACLs, a flow is added to bypass setting the hint for connection tracker
+  processing.
 
 
 
@@ -603,10 +605,7 @@
 
   
 allow ACLs translate into logical flows with
-the next; action.  If there are any stateful ACLs
-on this datapath, then allow ACLs translate to
-ct_commit; next; (which acts as a hint for the next tables
-to commit the connection to conntrack),
+the next; action.
   
   
 allow-related ACLs translate into logical
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 4783e43d7..cd343a3e3 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -4943,7 +4943,58 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct 
ovn_port *op,
 }
 
 static void
-build_pre_acls(struct ovn_datapath *od, struct hmap *lflows)
+build_stateless_filter(struct ovn_datapath *od,
+   const struct nbrec_acl *acl,
+   struct hmap *lflows)
+{
+/* Stateless filters must be applied in both directions so that reply
+ * traffic bypasses conntrack too.
+ */

[ovs-dev] [PATCH ovn] ovn-northd: bypass ct for allow ACLs

2021-03-24 Thread Ihar Hrachyshka
For allow ACLs, bypass connection tracking by avoiding setting ct
hints for matching traffic. Avoid sending all traffic to ct when a
stateful ACL is present. Before the patch, this unnecessarily hit
performance when mixed ACL action types were used for the same
datapath.

===

For performance measurements, ovn-fake-multinode environment and qperf
were used. Performance measured between two virtual nodes, two ports
that belong to different LSs connected via router. Using qperf,
performance was measured for UDP, TCP, SCTP protocols (using
_lat and _bw tests). The qperf version used:
0.4.9-16.fc31.x86_64.  Each test scenario was executed five times and
averages compared.

Tests were executed with `allow` rules for the tested protocol and
`allow-related` for another protocol set for both ports, both
directions, e.g. for TCP scenario, the following ACLs were defined:

ovn-nbctl acl-add sw0 to-lport 100 tcp allow
ovn-nbctl acl-add sw0 from-lport 100 tcp allow
ovn-nbctl acl-add sw1 to-lport 100 tcp allow
ovn-nbctl acl-add sw1 from-lport 100 tcp allow

ovn-nbctl acl-add sw0 to-lport 100 sctp allow-related
ovn-nbctl acl-add sw0 from-lport 100 sctp allow-related
ovn-nbctl acl-add sw1 to-lport 100 sctp allow-related
ovn-nbctl acl-add sw1 from-lport 100 sctp allow-related

In this particular environment, improvement was seen in send_bw,
latency, and msg_rate measurements, where applicable, for all three
protocols under test.

for UDP, send_bw: 293.6 MB/sec => 313.2 MB/sec (+6.68%)
 latency: 16 us => 14.08 us (-12%)
 msg_rate: 62.56 K/sec => 71.06 K/sec (+13.59%)

for TCP, latency: 18.6 us => 14.88 us (-20%)
 msg_rate: 53.8 K/sec => 67.28 K/sec (+25.06%)

for SCTP, latency: 21.98 us => 19.42 us (-11.65%)
  msg_rate: 45.58 K/sec => 51.54 K/sec (+13.08%)

Interestingly, some performance improvement was also seen for the same
scenarios with no ACLs set at all, albeit significantly more
negligible.

for UDP, send_bw: 320.0 MB/sec => 338.6 MB/sec (+5.81%)
 latency: 13.74 us => 12.88 us (-6.68%)
 msg_rate: 73.02 K/sec => 77.84 K/sec (+6.6%)

for TCP, latency: 15.62 us => 14.26 us (-9.54%)
 msg_rate: 64.02 K/sec => 70.26 K/sec (+9.75%)

for SCTP, latency: 19.56 us => 18.16 us (-7.16%)
  msg_rate: 51.16 K/sec => 55.12 K/sec (+7.74%)

Comparable numbers can be captured with iperf. It may be useful to run
more tests in a more elaborate (bare metal) environment.

===

The patch takes inspiration from a now abandoned patch:

"ovn-northd: Support mixing stateless/stateful ACLs with
Stateless_Filter." by Dumitru Ceara.

Signed-off-by: Ihar Hrachyshka 

---

v1: initial version.
v2: rebased after conflict.
---
 NEWS|   1 +
 northd/ovn-northd.8.xml |   9 +-
 northd/ovn-northd.c | 166 +++
 tests/ovn-northd.at | 287 
 4 files changed, 400 insertions(+), 63 deletions(-)

diff --git a/NEWS b/NEWS
index 530c5d42f..548a45fb7 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,7 @@ Post-v21.03.0
 (This may take testing and tuning to be effective.)  This version of OVN
 requires DDLog 0.36.
   - Introduce ovn-controller incremetal processing engine statistics
+  - Bypass connection tracking for ACL "allow" action processing.
 
 OVN v21.03.0 - 12 Mar 2021
 -
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index a62f5c057..f38d71682 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -419,7 +419,9 @@
   before eventually advancing to ingress table ACLs. If
   special ports such as route ports or localnet ports can't use ct(), a
   priority-110 flow is added to skip over stateful ACLs. IPv6 Neighbor
-  Discovery and MLD traffic also skips stateful ACLs.
+  Discovery and MLD traffic also skips stateful ACLs. For stateless "allow"
+  ACLs, a flow is added to bypass setting the hint for connection tracker
+  processing.
 
 
 
@@ -603,10 +605,7 @@
 
   
 allow ACLs translate into logical flows with
-the next; action.  If there are any stateful ACLs
-on this datapath, then allow ACLs translate to
-ct_commit; next; (which acts as a hint for the next tables
-to commit the connection to conntrack),
+the next; action.
   
   
 allow-related ACLs translate into logical
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 4783e43d7..cd343a3e3 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -4943,7 +4943,58 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct 
ovn_port *op,
 }
 
 static void
-build_pre_acls(struct ovn_datapath *od, struct hmap *lflows)
+build_stateless_filter(struct ovn_datapath *od,
+   const struct nbrec_acl *acl,
+   struct hmap *lflows)
+{
+/* Stateless filters must be applied in both directions so that reply
+ * traffic bypasses conntrack too.
+ */

Re: [ovs-dev] [PATCH ovn] Support vlan-passthru for tag=0 logical switch ports

2021-03-24 Thread Ihar Hrachyshka
On Wed, Mar 17, 2021 at 2:01 PM Ben Pfaff  wrote:
>
> On Wed, Mar 17, 2021 at 01:47:16PM -0400, Ihar Hrachyshka wrote:
> > When vlan-passthru is true for LS, for untagged localnet ports, tagged
> > VLAN traffic originating from VIFs should be passed through intact since
> > the VLAN header belongs to VIF user, not the localnet port fabric.
> >
> > This patch adds multinode test cases to cover scenarios where packets
> > traverse through fabric layer, for both tagged and untagged (tag=0)
> > localnet ports.
> >
> > Signed-off-by: Ihar Hrachyshka 
>
> Thanks for contributing a feature to OVN!  I see that this changes
> ovn-northd.c in a way that needs a corresponding change to
> ovn_northd.dl.  I am available to help with that--please let me know!
>

I've sent v2 with dl implementation.

Ihar

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 6/9] tests: Amend release stale port binding test for RBAC

2021-03-24 Thread Frode Nordahl
On Wed, Mar 24, 2021 at 1:54 PM Numan Siddique  wrote:
> I applied the patches 6 and 7 to the main branch.
>
> There are some issues with patch 9.  I didn't apply patch 8 as it
> seems related to patch 9.
>
> If I configure like below and run "make check" it fails for me.  Can
> you please take a look.
>
> $mkdir _gcc
> $cd _gcc
> $../configure --enable-Werror --enable-sparse --with-ovs-source=...
> $cd ..
> $make -C _gcc check
> make[2]: Entering directory /tmp/ovn/_gcc'
> make[2]: 'tests/atlocal' is up to date.
> make[2]: 'tests/testpki-cacert.pem' is up to date.
> make[2]: 'tests/testpki-test-cert.pem' is up to date.
> make[2]: 'tests/testpki-test-privkey.pem' is up to date.
> make[2]: 'tests/testpki-test-req.pem' is up to date.
> make[2]: 'tests/testpki-test2-cert.pem' is up to date.
> make[2]: 'tests/testpki-test2-privkey.pem' is up to date.
> make[2]: 'tests/testpki-test2-req.pem' is up to date.
> cp /tmp/ovn/_gcc/tests/pki/main-cert.pem tests/testpki-main-cert.pem
> cp: cannot stat '/tmp/ovn/_gcc/tests/pki/main-cert.pem': No such file
> or directory
> make[2]: *** [Makefile:3512: tests/testpki-main-cert.pem] Error 1
> make[2]: Leaving directory '/tmp/ovn/_gcc'

Thank you for finding this issue, as it hid itself from me when
checking with a plain `make distcheck`.

I'll investigate and put up a v3.

-- 
Frode Nordahl


> Thanks
> Numan
>
>
> >
> > --
> > Frode Nordahl
> >
> > > Thanks.
> > >
> > > On 3/5/21 7:16 AM, Frode Nordahl wrote:
> > > > The current version of the test attempts to simulate chassis
> > > > registration prior to starting `ovn-controller`, however it does
> > > > not set the `hostname` field.
> > > >
> > > > The RBAC role for `ovn-controller` does not allow for a chassis to
> > > > change its own name or hostname, which makes sense as this is used
> > > > for authentication.
> > > >
> > > > Update the test to set the `hostname` field when simulating chassis
> > > > registration so that `ovn-controller` does not attempt to update it
> > > > and subsequently make the test fail.
> > > >
> > > > Fixes b6b3823d4 ("ovn-controller: Fix I-P for SB Port_Binding and OVS 
> > > > Interface")
> > > >
> > > > Signed-off-by: Frode Nordahl 
> > > > ---
> > > >   tests/ovn.at | 2 +-
> > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/tests/ovn.at b/tests/ovn.at
> > > > index bec593dcc..ca9623fee 100644
> > > > --- a/tests/ovn.at
> > > > +++ b/tests/ovn.at
> > > > @@ -21572,7 +21572,7 @@ ovn-nbctl --wait=sb lsp-add ls1 lsp1
> > > >
> > > >   # Simulate the fact that lsp1 had been previously bound on hv1.
> > > >   ovn-sbctl --id=@e create encap chassis_name=hv1 ip="192.168.0.1" 
> > > > type="geneve" \
> > > > --- --id=@c create chassis name=hv1 encaps=@e \
> > > > +-- --id=@c create chassis hostname=hv1 name=hv1 encaps=@e \
> > > >   -- set Port_Binding lsp1 chassis=@c
> > > >
> > > >   as hv1
> > > >
> > >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >



-- 
Frode Nordahl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-nbctl: update BFD rows in nbctl_lr_route_del routine

2021-03-24 Thread Lorenzo Bianconi
> On 3/24/21 12:26 PM, Numan Siddique wrote:
> > On Tue, Mar 23, 2021 at 12:11 AM Mark Michelson  wrote:
> >>
> >> Correct me if I'm wrong, but couldn't this be done automatically by just
> >> setting "isRoot: false" for the BFD table in ovn-nb.ovsschema?
> > 
> > I think changing to isRoot: false could have upgrade issues.
> > 
> > IMO, since CMS creates the BFD rows, it's better for it to delete when
> > it's not required.  Although I'm not against this patch.
> > 
> 
> One more comment here:  Enforcing this policy (remove entry if there are no
> users) on the ovn-nbctl level will result in different behavior for different
> users.  e.g. if someone will perform the same action (remove static route)
> directly via jsonrpc or some library like go-ovn, they will not have the BFD
> row deleted.  This might complicate things like migration from nbctl to go-ovn
> in ovn-k8s.  And since this behavior is not specified in a DB schema, this
> should be, at least, clearly documented.

ack, I am fine to just drop this patch and keep current behaviour.

Regards,
Lorenzo

> 
> Bets regards, Ilya Maximets.
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v12 00/11] Add offload support for sFlow

2021-03-24 Thread Chris Mi

On 3/24/2021 8:14 PM, Ilya Maximets wrote:

On 3/24/21 10:17 AM, Chris Mi wrote:

On 3/23/2021 10:24 PM, Ilya Maximets wrote:

On 3/5/21 4:27 AM, Chris Mi wrote:

Hi Ilya,

I think about your suggestion recently. But I'm still not very clear about the 
design.
Please see my reply below:

On 3/1/2021 8:48 PM, Ilya Maximets wrote:

On 3/1/21 9:30 AM, Chris Mi wrote:

Hi Simon, Ilya,

Could I know what should we do to make progress for this patch set?
It has been posted in the community for a long time 

In general, the way to get your patches reviewed is to review other
patches.  It's simply because we still have a huge review backlog
(214 patches right now in patchwork and most of them needs review)
and bugfixes usually has a bit higher priority.  By reviewing other
patches you're reducing amount of work for maintainers so they can
get to your patches faster.

OK, I see.

For the series and comments from Eelco:
I didn't read the patches carefully, only a quick glance, but I still
do not understand why we need a separate thread to poll psample events.
Why can't we just allow usual handler threads to do that?

I'm not sure if you are aware of that the psample netlink is different from the 
ovs
netlink. Without offload, kernel sends missed packet and sFlow packet to 
userspace
using the same netlink 'ovs_packet_family'. So they can use the same handler 
thread.
But in order to offload sFlow action, we use psample kernel module to send 
sampled
packets from kernel to userspace. The format for ovs netlink message and psample
netlink messages are different.

Hi.  Sorry for late reply.

Yes, I understand that message format is different, but it doesn't really
matter.  All the message-specific handling and parsing should happen
inside the netdev_offload_tc_recv().  This function should have a prototype
similar to dpif_netlink_recv(), i.e. it should receive a pointer to the
struct dpif_upcall from the caller and fill it with data.  Maybe other
type of a generic data structure if it's really not possible to construct
struct dpif_upcall.



     From the
architecture perspective it's not a good thing to call ofproto code
from the offload-provider.  This introduces lots of complications
and might cause lots of issues in the future.

I'd say that design should look like this:

     handler thread ->
   dpif_recv() ->
     dpif_netlink_recv() ->
   netdev_offload_recv() ->
     netdev_offload_tc_recv() ->
   nl_sock_recv()

In order to use the handler thread, I'm not sure if we should add psample socket
fd to every handler's epoll_fd. If we should do that, we should consider how to
differentiate if the event comes from ovs or psample netlink. Maybe we should
allocate a special port number for psample and assign it to event.data.u32.
Anyway, that's the details. If this is the right direction, I'll think about it.

Last three calls should be implemented.

The better version of this will be to throw away dpif part from
above call chain and make it:

     handler thread ->
   netdev_offload_recv() ->
     netdev_offload_tc_recv() ->
   nl_sock_recv()

If we throw away dpif part, maybe we have to write some duplicate epoll code
for psample only. And we can't block in nl_sock_recv(). Maybe we have to add
psample socket fd to every handler's epoll_fd. So we have to change the dpif
somehow.

There is no need to implement any epoll_fd logic for psample.
You may only use handler with handler_id == 0.  Only this handler will receive
psample upcalls.  netdev_offload_recv_wait() should be implemented similar
to how dpif_netlink_recv_wait() implemented for windows case, i.e. it will
call nl_sock_wait(nl_sock, POLLIN) for the psample netlink socket.
There is no need to block and you're never blocking anywhere because your're
calling nl_sock_recv(..., wait = false) which will use MSG_DONTWAIT while
actually receiving a message.

Hi Ilya,

With this new design, the code will be changed greatly. I'm not unwilling to 
change it.
The effort is not small. So before I start, I want to make sure that it is 
feasible.

Yes, we won't block in nl_sock_recv(), but the handler thread will be blocked 
in poll_block().
If any of the vport netlink socket is readable , it will be waken up by kernel. 
If we don't use
epoll_ctl to add the psample netlink socket to the handler's epoll fd, we can't 
receive the
psample packet as soon as possible. That means we can only receive the sampled 
packet
when there is a miss upcall.

I added some debug message in ovs epoll code and got above conclusion. But I'm 
not
the expert of the epoll API, so I'm not sure if I missed anything.

Handler thread wakes up on POLLIN on epoll_fd itself, not on the event on one
of the file descriptors added to epoll.  epoll_fd is added to a thread's poll
loop by
   poll_fd_wait(handler->epoll_fd, POLLIN);

If you will call nl_sock_wait() on psample socket, this socket will be added
to the same poll loop with:
   

Re: [ovs-dev] [PATCH ovn v2 6/9] tests: Amend release stale port binding test for RBAC

2021-03-24 Thread Numan Siddique
On Tue, Mar 23, 2021 at 11:53 AM Frode Nordahl
 wrote:
>
> On Mon, Mar 22, 2021 at 8:47 PM Mark Michelson  wrote:
> >
> > Hi Frode,
> >
> > For patches 6-9:
> >
> > Acked-by: Mark Michelson 
>
> Thank you for the review, much appreciated.
>
> > Can you specify which of the patches need to be backported and to which
> > branches they need to be backported? I can take care of all the merges
> > at the same time once I get this info.
>
> Numan already did the required backports for patches 1-5 so we should
> be good there. Would be great to get the tests into master.
>
> Cheers!

I applied the patches 6 and 7 to the main branch.

There are some issues with patch 9.  I didn't apply patch 8 as it
seems related to patch 9.

If I configure like below and run "make check" it fails for me.  Can
you please take a look.

$mkdir _gcc
$cd _gcc
$../configure --enable-Werror --enable-sparse --with-ovs-source=...
$cd ..
$make -C _gcc check
make[2]: Entering directory /tmp/ovn/_gcc'
make[2]: 'tests/atlocal' is up to date.
make[2]: 'tests/testpki-cacert.pem' is up to date.
make[2]: 'tests/testpki-test-cert.pem' is up to date.
make[2]: 'tests/testpki-test-privkey.pem' is up to date.
make[2]: 'tests/testpki-test-req.pem' is up to date.
make[2]: 'tests/testpki-test2-cert.pem' is up to date.
make[2]: 'tests/testpki-test2-privkey.pem' is up to date.
make[2]: 'tests/testpki-test2-req.pem' is up to date.
cp /tmp/ovn/_gcc/tests/pki/main-cert.pem tests/testpki-main-cert.pem
cp: cannot stat '/tmp/ovn/_gcc/tests/pki/main-cert.pem': No such file
or directory
make[2]: *** [Makefile:3512: tests/testpki-main-cert.pem] Error 1
make[2]: Leaving directory '/tmp/ovn/_gcc'

Thanks
Numan


>
> --
> Frode Nordahl
>
> > Thanks.
> >
> > On 3/5/21 7:16 AM, Frode Nordahl wrote:
> > > The current version of the test attempts to simulate chassis
> > > registration prior to starting `ovn-controller`, however it does
> > > not set the `hostname` field.
> > >
> > > The RBAC role for `ovn-controller` does not allow for a chassis to
> > > change its own name or hostname, which makes sense as this is used
> > > for authentication.
> > >
> > > Update the test to set the `hostname` field when simulating chassis
> > > registration so that `ovn-controller` does not attempt to update it
> > > and subsequently make the test fail.
> > >
> > > Fixes b6b3823d4 ("ovn-controller: Fix I-P for SB Port_Binding and OVS 
> > > Interface")
> > >
> > > Signed-off-by: Frode Nordahl 
> > > ---
> > >   tests/ovn.at | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/tests/ovn.at b/tests/ovn.at
> > > index bec593dcc..ca9623fee 100644
> > > --- a/tests/ovn.at
> > > +++ b/tests/ovn.at
> > > @@ -21572,7 +21572,7 @@ ovn-nbctl --wait=sb lsp-add ls1 lsp1
> > >
> > >   # Simulate the fact that lsp1 had been previously bound on hv1.
> > >   ovn-sbctl --id=@e create encap chassis_name=hv1 ip="192.168.0.1" 
> > > type="geneve" \
> > > --- --id=@c create chassis name=hv1 encaps=@e \
> > > +-- --id=@c create chassis hostname=hv1 name=hv1 encaps=@e \
> > >   -- set Port_Binding lsp1 chassis=@c
> > >
> > >   as hv1
> > >
> >
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-nbctl: update BFD rows in nbctl_lr_route_del routine

2021-03-24 Thread Ilya Maximets
On 3/24/21 12:26 PM, Numan Siddique wrote:
> On Tue, Mar 23, 2021 at 12:11 AM Mark Michelson  wrote:
>>
>> Correct me if I'm wrong, but couldn't this be done automatically by just
>> setting "isRoot: false" for the BFD table in ovn-nb.ovsschema?
> 
> I think changing to isRoot: false could have upgrade issues.
> 
> IMO, since CMS creates the BFD rows, it's better for it to delete when
> it's not required.  Although I'm not against this patch.
> 

One more comment here:  Enforcing this policy (remove entry if there are no
users) on the ovn-nbctl level will result in different behavior for different
users.  e.g. if someone will perform the same action (remove static route)
directly via jsonrpc or some library like go-ovn, they will not have the BFD
row deleted.  This might complicate things like migration from nbctl to go-ovn
in ovn-k8s.  And since this behavior is not specified in a DB schema, this
should be, at least, clearly documented.

Bets regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v12 00/11] Add offload support for sFlow

2021-03-24 Thread Ilya Maximets
On 3/24/21 10:17 AM, Chris Mi wrote:
> On 3/23/2021 10:24 PM, Ilya Maximets wrote:
>> On 3/5/21 4:27 AM, Chris Mi wrote:
>>> Hi Ilya,
>>>
>>> I think about your suggestion recently. But I'm still not very clear about 
>>> the design.
>>> Please see my reply below:
>>>
>>> On 3/1/2021 8:48 PM, Ilya Maximets wrote:
 On 3/1/21 9:30 AM, Chris Mi wrote:
> Hi Simon, Ilya,
>
> Could I know what should we do to make progress for this patch set?
> It has been posted in the community for a long time 
 In general, the way to get your patches reviewed is to review other
 patches.  It's simply because we still have a huge review backlog
 (214 patches right now in patchwork and most of them needs review)
 and bugfixes usually has a bit higher priority.  By reviewing other
 patches you're reducing amount of work for maintainers so they can
 get to your patches faster.
>>> OK, I see.
 For the series and comments from Eelco:
 I didn't read the patches carefully, only a quick glance, but I still
 do not understand why we need a separate thread to poll psample events.
 Why can't we just allow usual handler threads to do that?
>>> I'm not sure if you are aware of that the psample netlink is different from 
>>> the ovs
>>> netlink. Without offload, kernel sends missed packet and sFlow packet to 
>>> userspace
>>> using the same netlink 'ovs_packet_family'. So they can use the same 
>>> handler thread.
>>> But in order to offload sFlow action, we use psample kernel module to send 
>>> sampled
>>> packets from kernel to userspace. The format for ovs netlink message and 
>>> psample
>>> netlink messages are different.
>> Hi.  Sorry for late reply.
>>
>> Yes, I understand that message format is different, but it doesn't really
>> matter.  All the message-specific handling and parsing should happen
>> inside the netdev_offload_tc_recv().  This function should have a prototype
>> similar to dpif_netlink_recv(), i.e. it should receive a pointer to the
>> struct dpif_upcall from the caller and fill it with data.  Maybe other
>> type of a generic data structure if it's really not possible to construct
>> struct dpif_upcall.
>>
>>
     From the
 architecture perspective it's not a good thing to call ofproto code
 from the offload-provider.  This introduces lots of complications
 and might cause lots of issues in the future.

 I'd say that design should look like this:

     handler thread ->
   dpif_recv() ->
     dpif_netlink_recv() ->
   netdev_offload_recv() ->
     netdev_offload_tc_recv() ->
   nl_sock_recv()
>>> In order to use the handler thread, I'm not sure if we should add psample 
>>> socket
>>> fd to every handler's epoll_fd. If we should do that, we should consider 
>>> how to
>>> differentiate if the event comes from ovs or psample netlink. Maybe we 
>>> should
>>> allocate a special port number for psample and assign it to event.data.u32.
>>> Anyway, that's the details. If this is the right direction, I'll think 
>>> about it.
 Last three calls should be implemented.

 The better version of this will be to throw away dpif part from
 above call chain and make it:

     handler thread ->
   netdev_offload_recv() ->
     netdev_offload_tc_recv() ->
   nl_sock_recv()
>>> If we throw away dpif part, maybe we have to write some duplicate epoll code
>>> for psample only. And we can't block in nl_sock_recv(). Maybe we have to add
>>> psample socket fd to every handler's epoll_fd. So we have to change the dpif
>>> somehow.
>> There is no need to implement any epoll_fd logic for psample.
>> You may only use handler with handler_id == 0.  Only this handler will 
>> receive
>> psample upcalls.  netdev_offload_recv_wait() should be implemented similar
>> to how dpif_netlink_recv_wait() implemented for windows case, i.e. it will
>> call nl_sock_wait(nl_sock, POLLIN) for the psample netlink socket.
>> There is no need to block and you're never blocking anywhere because your're
>> calling nl_sock_recv(..., wait = false) which will use MSG_DONTWAIT while
>> actually receiving a message.
> Hi Ilya,
> 
> With this new design, the code will be changed greatly. I'm not unwilling to 
> change it.
> The effort is not small. So before I start, I want to make sure that it is 
> feasible.
> 
> Yes, we won't block in nl_sock_recv(), but the handler thread will be blocked 
> in poll_block().
> If any of the vport netlink socket is readable , it will be waken up by 
> kernel. If we don't use
> epoll_ctl to add the psample netlink socket to the handler's epoll fd, we 
> can't receive the
> psample packet as soon as possible. That means we can only receive the 
> sampled packet
> when there is a miss upcall.
> 
> I added some debug message in ovs epoll code and got above conclusion. But 
> I'm not
> the expert of the epoll API, so I'm not sure if I 

Re: [ovs-dev] [PATCH ovn] ovn-nbctl: update BFD rows in nbctl_lr_route_del routine

2021-03-24 Thread Numan Siddique
On Tue, Mar 23, 2021 at 12:11 AM Mark Michelson  wrote:
>
> Correct me if I'm wrong, but couldn't this be done automatically by just
> setting "isRoot: false" for the BFD table in ovn-nb.ovsschema?

I think changing to isRoot: false could have upgrade issues.

IMO, since CMS creates the BFD rows, it's better for it to delete when
it's not required.  Although I'm not against this patch.

Thanks
Numan

>
> On 3/1/21 8:12 AM, Lorenzo Bianconi wrote:
> > Remove BFD entry if its the last reference in static_routes table has
> > been removed.
> >
> > Signed-off-by: Lorenzo Bianconi 
> > ---
> >   tests/ovn-nbctl.at| 11 ---
> >   utilities/ovn-nbctl.c | 17 +
> >   2 files changed, 25 insertions(+), 3 deletions(-)
> >
> > diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at
> > index 6d91aa4c5..e9aeaa6e5 100644
> > --- a/tests/ovn-nbctl.at
> > +++ b/tests/ovn-nbctl.at
> > @@ -1633,10 +1633,15 @@ IPv6 Routes
> >   ])
> >
> >   AT_CHECK([ovn-nbctl lrp-add lr0 lr0-p0 00:00:01:01:02:03 192.168.10.1/24])
> > -bfd_uuid=$(ovn-nbctl create bfd logical_port=lr0-p0 dst_ip=100.0.0.50 
> > status=down min_tx=250 min_rx=250 detect_mult=10)
> > -AT_CHECK([ovn-nbctl lr-route-add lr0 100.0.0.0/24 192.168.0.1])
> > +bfd_uuid=$(ovn-nbctl create bfd logical_port=lr0-p0 dst_ip=192.168.10.2 
> > status=down min_tx=250 min_rx=250 detect_mult=10)
> > +AT_CHECK([ovn-nbctl lr-route-add lr0 100.0.0.0/24 192.168.10.2])
> >   route_uuid=$(fetch_column nb:logical_router_static_route _uuid 
> > ip_prefix="100.0.0.0/24")
> > -AT_CHECK([ovn-nbctl set logical_router_static_route $route_uuid 
> > bfd=$bfd_uuid])])
> > +AT_CHECK([ovn-nbctl set logical_router_static_route $route_uuid 
> > bfd=$bfd_uuid])
> > +AT_CHECK([ovn-nbctl --bfd lr-route-add lr0 200.0.0.0/24 192.168.10.2 
> > lr0-p0])
> > +AT_CHECK([ovn-nbctl lr-route-del lr0 100.0.0.0/24 192.168.10.2])
> > +check_row_count nb:BFD 1
> > +AT_CHECK([ovn-nbctl lr-route-del lr0 200.0.0.0/24 192.168.10.2])
> > +check_row_count nb:BFD 0])
> >
> >   dnl -
> >
> > diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c
> > index 2c77f4ba7..5ea6c6249 100644
> > --- a/utilities/ovn-nbctl.c
> > +++ b/utilities/ovn-nbctl.c
> > @@ -4211,6 +4211,23 @@ nbctl_lr_route_del(struct ctl_context *ctx)
> >   }
> >   }
> >
> > +if (lr->static_routes[i]->bfd) {
> > +/* Check if this is the last reference to the BFD entry. */
> > +size_t j;
> > +
> > +for (j = 0; j < lr->n_static_routes; j++) {
> > +if (lr->static_routes[j] == lr->static_routes[i]) {
> > +continue;
> > +}
> > +if (lr->static_routes[j]->bfd == 
> > lr->static_routes[i]->bfd) {
> > +break;
> > +}
> > +}
> > +if (j == lr->n_static_routes) {
> > +nbrec_bfd_delete(lr->static_routes[i]->bfd);
> > +}
> > +}
> > +
> >   /* Everything matched. Removing. */
> >   nbrec_logical_router_update_static_routes_delvalue(
> >   lr, lr->static_routes[i]);
> >
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v1] Static Routes: Add ability to specify "discard" nexthop

2021-03-24 Thread Numan Siddique
On Fri, Mar 5, 2021 at 4:56 PM Mark Gray  wrote:
>
> On 22/02/2021 19:40, svc.eng.git-m...@nutanix.com wrote:
> > From: karthik-kc 
>
> Hi Karthik. Thanks for this. I have some comments below.
>
> Also, just to let you know, its needs a rebase and the UTs passed for me.

Thanks Karthik for the patch.

The patch overall LGTM.  Can you please address Mark G's comments,
add ddlog support and submit v2 ?

If you've any questions on ddlog, feel free to ask here.

Thanks
Numan

> >
> > Physical switches have the ability to specify "discard" or sometimes
> > "NULL interface" as a nexthop for routes. This can be used to prevent
> > routing loops in the network. Add a similar configuration for ovn
> > where nexthop accepts the string "discard". When the nexthop is discard
> > the action in the routing table will be to drop the packets.
> >
> > Signed-off-by: Karthik Chandrashekar 
> > ---
> >  northd/ovn-northd.c   | 126 +++---
> >  ovn-nb.xml|   4 +-
> >  tests/ovn-nbctl.at|  13 +
> >  tests/ovn.at  |  93 +++
> >  utilities/ovn-nbctl.c |  50 -
> >  5 files changed, 215 insertions(+), 71 deletions(-)
> >
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 39d798782..18d0e0b43 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -7749,6 +7749,7 @@ struct parsed_route {
> >  uint32_t hash;
> >  const struct nbrec_logical_router_static_route *route;
> >  bool ecmp_symmetric_reply;
> > +bool is_discard_route;
> >  };
> >
> >  static uint32_t
> > @@ -7768,20 +7769,23 @@ parsed_routes_add(struct ovs_list *routes,
> >  /* Verify that the next hop is an IP address with an all-ones mask. */
> >  struct in6_addr nexthop;
> >  unsigned int plen;
> > -if (!ip46_parse_cidr(route->nexthop, , )) {
> > -static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > -VLOG_WARN_RL(, "bad 'nexthop' %s in static route"
> > - UUID_FMT, route->nexthop,
> > - UUID_ARGS(>header_.uuid));
> > -return NULL;
> > -}
> > -if ((IN6_IS_ADDR_V4MAPPED() && plen != 32) ||
> > -(!IN6_IS_ADDR_V4MAPPED() && plen != 128)) {
> > -static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > -VLOG_WARN_RL(, "bad next hop mask %s in static route"
> > - UUID_FMT, route->nexthop,
> > - UUID_ARGS(>header_.uuid));
> > -return NULL;
> > +bool is_discard_route = !strcmp(route->nexthop, "discard");
> > +if (!is_discard_route) {
> > +if (!ip46_parse_cidr(route->nexthop, , )) {
> > +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > +VLOG_WARN_RL(, "bad 'nexthop' %s in static route"
> > + UUID_FMT, route->nexthop,
> > + UUID_ARGS(>header_.uuid));
> > +return NULL;
> > +}
> > +if ((IN6_IS_ADDR_V4MAPPED() && plen != 32) ||
> > +(!IN6_IS_ADDR_V4MAPPED() && plen != 128)) {
> > +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > +VLOG_WARN_RL(, "bad next hop mask %s in static route"
> > + UUID_FMT, route->nexthop,
> > + UUID_ARGS(>header_.uuid));
> > +return NULL;
> > +}
> >  }
> >
> >  /* Parse ip_prefix */
> > @@ -7795,13 +7799,15 @@ parsed_routes_add(struct ovs_list *routes,
> >  }
> >
> >  /* Verify that ip_prefix and nexthop have same address familiy. */
> > -if (IN6_IS_ADDR_V4MAPPED() != IN6_IS_ADDR_V4MAPPED()) {
> > -static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > -VLOG_WARN_RL(, "Address family doesn't match between 
> > 'ip_prefix' %s"
> > - " and 'nexthop' %s in static route"UUID_FMT,
> > - route->ip_prefix, route->nexthop,
> > - UUID_ARGS(>header_.uuid));
> > -return NULL;
> > +if (!is_discard_route) {
> > +if (IN6_IS_ADDR_V4MAPPED() != 
> > IN6_IS_ADDR_V4MAPPED()) {
> > +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > +VLOG_WARN_RL(, "Address family doesn't match between 
> > 'ip_prefix' %s"
> > + " and 'nexthop' %s in static route"UUID_FMT,
> > + route->ip_prefix, route->nexthop,
> > + UUID_ARGS(>header_.uuid));
> > +return NULL;
> > +}
> >  }
> >
> >  const struct nbrec_bfd *nb_bt = route->bfd;
> > @@ -7832,6 +7838,7 @@ parsed_routes_add(struct ovs_list *routes,
> >  pr->route = route;
> >  pr->ecmp_symmetric_reply = smap_get_bool(>options,
> >   "ecmp_symmetric_reply", 
> > false);
> > +pr->is_discard_route = is_discard_route;
> >  

Re: [ovs-dev] [PATCH ovn v3] ovn-controller: Add 'local_ip' option to tunnel ports for IPsec case

2021-03-24 Thread Numan Siddique
On Tue, Mar 16, 2021 at 2:32 AM Mark Michelson  wrote:
>
> LGMT
>
> Acked-by: Mark Michelson 

Thank you Mark G (and Mark M for the reviews).

I applied this patch to master.

Numan

>
> On 2/16/21 6:55 AM, Mark Gray wrote:
> > If a chassis has multiple interfaces, 'ovn-encap-ip' can be used
> > to specify the IP address of the interface that is used for tunnel
> > traffic. OVN uses that IP address to configure the 'remote_ip' of
> > a tunnel port. OVS tunnel ports also accept 'options:local_ip', which,
> > according to the OVS documentation specifies "the tunnel destination
> > IP that received packets must match. Default is to match all addresses".
> > OVN does not set 'local_ip'.
> >
> > 'ovs-monitor-ipsec' is an OVS daemon that is used to configure and IPsec
> > IKE daemon on the host. In order to correctly specify an IPsec
> > connection, it requires the source and destination IP address of
> > that connection. In the OVN case, as 'local_ip' is not specified, it
> > is unable to infer the IP address of both sides of a tunnel and, therefore,
> > cannot setup an IPsec connection.
> >
> > This patch configures 'local_ip' on tunnel ports when IPsec has
> > been enabled. This allows for OVS/OVN IPsec to work when 'ovn-encap-ip'
> > is not specified as the chassis default gateway interface.
> >
> > This patch also adds some unit tests. The OVS daemon 'ovs-monitor-ipsec'
> > requires a number of options to be configured on OVS tunnel ports in order
> > to function correctly. These unit tests ensure that these options are
> > configured correctly when IPsec has been enabled through the northbound
> > database.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1924041
> > Signed-off-by: Mark Gray 
> > ---
> >
> > v2: Updated topic filter to "PATCH ovn"
> > v3: Rebased due to 0-day bot warning
> >
> >   controller/chassis.c |   5 +++
> >   controller/encaps.c  |  26 ++-
> >   tests/automake.mk|   3 +-
> >   tests/ovn-ipsec.at   | 104 +++
> >   tests/testsuite.at   |   1 +
> >   5 files changed, 137 insertions(+), 2 deletions(-)
> >   create mode 100644 tests/ovn-ipsec.at
> >
> > diff --git a/controller/chassis.c b/controller/chassis.c
> > index 310132d09d2e..9b0a36cf076f 100644
> > --- a/controller/chassis.c
> > +++ b/controller/chassis.c
> > @@ -279,6 +279,11 @@ chassis_parse_ovs_config(const struct 
> > ovsrec_open_vswitch_table *ovs_table,
> >   return false;
> >   }
> >
> > +/* 'ovn-encap-ip' can accept a comma-delimited list of IP addresses 
> > instead
> > + * of a single IP address. Although this is undocumented, it can be 
> > used
> > + * to enable certain hardware-offloaded use cases in which a host has
> > + * multiple NICs and is assigning SR-IOV VFs to a guest (as logical 
> > ports).
> > + */
> >   if (!chassis_parse_ovs_encap_ip(encap_ips, _cfg->encap_ip_set)) {
> >   sset_destroy(_cfg->encap_type_set);
> >   return false;
> > diff --git a/controller/encaps.c b/controller/encaps.c
> > index 7eac4bb064ac..fc93bf1eeb87 100644
> > --- a/controller/encaps.c
> > +++ b/controller/encaps.c
> > @@ -59,6 +59,7 @@ struct tunnel_ctx {
> >
> >   struct ovsdb_idl_txn *ovs_txn;
> >   const struct ovsrec_bridge *br_int;
> > +const struct sbrec_chassis *this_chassis;
> >   };
> >
> >   struct chassis_node {
> > @@ -176,6 +177,28 @@ tunnel_add(struct tunnel_ctx *tc, const struct 
> > sbrec_sb_global *sbg,
> >
> >   /* Add auth info if ipsec is enabled. */
> >   if (sbg->ipsec) {
> > +const struct sbrec_chassis *this_chassis = tc->this_chassis;
> > +const char *local_ip = NULL;
> > +
> > +/* Determine 'ovn-encap-ip' of the local chassis as this will be 
> > the
> > + * tunnel port's 'local_ip'. We do not support the case in which
> > + * 'ovn-encap-ip' holds multiple comma-delimited IP addresses.
> > + */
> > +for (int i = 0; i < this_chassis->n_encaps; i++) {
> > +if (local_ip && strcmp(local_ip, this_chassis->encaps[i]->ip)) 
> > {
> > +VLOG_ERR("ovn-encap-ip has been configured as a list. This 
> > "
> > + "is unsupported for IPsec.");
> > +/* No need to loop further as we know this condition has 
> > been
> > + * hit */
> > +break;
> > +} else {
> > +local_ip = this_chassis->encaps[i]->ip;
> > +}
> > +}
> > +
> > +if (local_ip) {
> > +smap_add(, "local_ip", local_ip);
> > +}
> >   smap_add(, "remote_name", new_chassis_id);
> >   }
> >
> > @@ -310,7 +333,8 @@ encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
> >   struct tunnel_ctx tc = {
> >   .chassis = SHASH_INITIALIZER(),
> >   .port_names = SSET_INITIALIZER(_names),
> > -.br_int = br_int
> > +.br_int = br_int,
> > +.this_chassis = 

Re: [ovs-dev] [PATCH v2] conntrack: Add a test for IPv4 UDP zero checksum

2021-03-24 Thread Paolo Valerio
Aaron Conole  writes:

> Paolo Valerio  writes:
>
>> Hi Aaron,
>>
>> Aaron Conole  writes:
>>
>>> Recently, during some conntrack testing a bug was uncovered in a DPDK
>>> PMD, which doesn't support an IPv4 packet with a zero checksum value.
>>> In order to show that the connection tracking code in userspace
>>> supports IPv4 UDP with a zero checksum, add a test case to enforce
>>> this behavior.
>>>
>>> Reported-at: http://mails.dpdk.org/archives/dev/2021-January/198528.html
>>> Reported-by: Paolo Valerio 
>>> Signed-off-by: Aaron Conole 
>>> ---
>>> Previous version:
>>> https://mail.openvswitch.org/pipermail/ovs-dev/2021-February/380266.html
>>>
>>> v1->v2: - Addressed most of the comments by Flavio & William Tu.
>>> - Added the 0x checksum case
>>> - Added a bad checksum case
>>> - For the single instance of "sleep 1," this needs to be retained
>>>   due to the negative test case (we have no WAIT_UNTIL to rely on)
>>>   but there are other instances of sleep 1 throughout the suite,
>>>   so I guess it should be okay.
>>>
>>>  tests/system-traffic.at | 64 +
>>>  1 file changed, 64 insertions(+)
>>>
>>> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>>> index fb5b9a36d2..5e51c24760 100644
>>> --- a/tests/system-traffic.at
>>> +++ b/tests/system-traffic.at
>>> @@ -5927,6 +5927,69 @@ ovs-appctl dpif/dump-flows br0
>>>  OVS_TRAFFIC_VSWITCHD_STOP
>>>  AT_CLEANUP
>>>  
>>> +
>>> +AT_SETUP([conntrack - IPv4 UDP zero checksum])
>>
>> Don't know if this second new line up here was intentional (mixed style
>> in the file).
>> TBH, I have no strong preference about it, I pointed out just in case.
>
> Ugh... every time I rebase some stray new line makes it in.  It's like I
> have a 4 year old in my lap...
>
>>> +dnl This tracks sending zero checksum packets for udp over ipv4
>>> +CHECK_CONNTRACK()
>>> +OVS_TRAFFIC_VSWITCHD_START()
>>> +OVS_CHECK_CT_CLEAR()
>>> +
>>> +ADD_NAMESPACES(at_ns0, at_ns1)
>>> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "f0:00:00:01:01:01")
>>> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24", "f0:00:00:01:01:02")
>>> +dnl setup ct flows
>>> +AT_DATA([flows.txt], [dnl
>>> +table=0,priority=10  ip,udp,ct_state=-trk action=ct(zone=1,table=1)
>>> +table=0,priority=0   action=drop
>>> +table=1,priority=10  ct_state=-est+trk+new,ip,ct_zone=1,in_port=1 
>>> action=ct(commit,table=2)
>>> +table=1,priority=10  ct_state=+est-new+trk,ct_zone=1,in_port=1 
>>> action=resubmit(,2)
>>> +table=1,priority=0   action=drop
>>> +table=2,priority=10  ct_state=+trk+new,in_port=1 action=2
>>> +table=2,priority=10  ct_state=+trk+est action=2
>>> +])
>>> +
>>> +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
>>> +
>>> +# sending udp pkt with  checksum
>>> +NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 f0 00 00 01
>>> 01 02 f0 00 00 01 01 01 08 00 45 00 00 28 00 01 00 00 40 11 64 c0 0a
>>> 01 01 01 0a 01 01 02 04 d2 04 d2 00 14 00 00 aa aa aa aa aa aa aa aa
>>> aa aa aa aa > /dev/null])
>>> +
>>> +OVS_WAIT_UNTIL([ovs-appctl dpctl/dump-conntrack | grep "udp"])
>>> +
>>> +dnl ensure CT picked up the packet
>>> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1)], [0], [dnl
>>> +udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=,dport=),reply=(src=10.1.1.2,dst=10.1.1.1,sport=,dport=)
>>> +])
>>> +
>>> +dnl clear CT tuples
>>> +AT_CHECK([ovs-appctl dpctl/flush-conntrack])
>>> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "udp"], [1])
>>> +
>>> +dnl send UDP with  checksum
>>> +NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 f0 00 00 01
>>> 01 02 f0 00 00 01 01 01 08 00 45 00 00 40 00 01 00 00 40 11 64 a8 0a
>>> 01 01 01 0a 01 01 02 04 d2 04 d2 00 2c ff ff 00 00 00 00 00 00 00 00
>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> 00 00 00 df ed > /dev/null])
>>> +OVS_WAIT_UNTIL([ovs-appctl dpctl/dump-conntrack | grep "udp"])
>>> +
>>> +dnl ensure CT picked up the packet
>>> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1)], [0], [dnl
>>> +udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=,dport=),reply=(src=10.1.1.2,dst=10.1.1.1,sport=,dport=)
>>> +])
>>> +
>>> +dnl clear CT tuples
>>> +AT_CHECK([ovs-appctl dpctl/flush-conntrack])
>>> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "udp"], [1])
>>> +
>>> +# sending udp pkt with bad checksum
>>> +NS_CHECK_EXEC([at_ns0], [$PYTHON3 $srcdir/sendpkt.py p0 f0 00 00 01
>>> 01 02 f0 00 00 01 01 01 08 00 45 00 00 28 00 01 00 00 40 11 64 c0 0a
>>> 01 01 01 0a 01 01 02 04 d2 04 d2 00 14 11 11 aa aa aa aa aa aa aa aa
>>> aa aa aa aa > /dev/null])
>>> +
>>> +dnl There isn't anything else to do here but call a sleep... we can't use 
>>> a WAIT_UNTIL
>>> +dnl because there isn't a condition to check against - we just expect the 
>>> packet to
>>> +dnl be dropped.
>>> +sleep 1
>>> +
>>> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "udp"], [1])
>>> +
>>
>> what if we restrict a little the pattern here?
>
> Sure, 

Re: [ovs-dev] [PATCH ovn] expr: Combine multiple ipv4 with wildcard mask.

2021-03-24 Thread Dumitru Ceara
On 3/22/21 1:10 PM, gmingchen(陈供明) wrote:
> 
> 
> On 2021/3/19, 9:02 PM, "Dumitru Ceara"  wrote:
> 
> On 3/10/21 2:29 PM, gmingchen(陈供明) wrote:
> > From: Gongming Chen 
> > 
> > This patch merges ipv4 addresses with wildcard masks, and replaces this
> > ipv4 addresses with the combined ip/mask. This will greatly reduce the
> > entries in the ovs security group flow table, especially when the host
> > size is large.
> > 
> > Analysis in the simplest scenario, a network 1.1.1.0/24 network,create 
> 253
> > ports(1.1.1.2-1.1.1.254).
> > Only focus on the number of ip addresses, the original 253 addresses 
> will
> > be combined into 13 addresses.
> > 1.1.1.2/31
> > 1.1.1.4/30
> > 1.1.1.8/29
> > 1.1.1.16/28
> > 1.1.1.32/27
> > 1.1.1.64/26
> > 1.1.1.128/26
> > 1.1.1.192/27
> > 1.1.1.224/28
> > 1.1.1.240/29
> > 1.1.1.248/30
> > 1.1.1.252/31
> > 1.1.1.254
> > 
> > Some scenes are similar to the following:
> > 1.1.1.2, 1.1.1.6
> > After the combine:
> > 1.1.1.2/255.255.255.251
> > You can use ovn-match-ip utility to match ip.
> > such as:
> > ovs-ofctl dump-flows br-int | ovn-match-ip 1.1.1.6
> > 1.1.1.2/255.255.255.251 will show.
> > 
> > Simple description of the algorithm.
> > There are two situations
> > 1. Combine once
> > such as:
> > 1.1.1.0 1.1.1.1 1.0.1.0 1.0.1.1
> > Combined into: 1.1.1.0/31, 1.0.1.0/31
> > 2. Combine multiple times
> > 1.1.1.0 1.1.1.1 1.0.1.0 1.0.1.1
> > Combined into: 1.0.1.0/255.254.255.254
> > 
> > Considering the actual scene and simplicity, the first case is used to
> > combine once.
> > 
> > ...00...
> > ...01...
> > ...10...
> > ...11...
> > "..." means the same value omitted.
> > Obviously, the above value can be expressed as ...00.../11100111. This
> > continuous interval that can be represented by one or several wildcard
> > masks is called a segment.
> > Only if all 2< > exist, can they be combined into 00...(n)/00...( n)
> > 
> > First sort all the values by size. Iterate through each value.
> > 1. Find a new segment, where two values differ only by 1 bit, such as
> > ...0... and ...1...
> > diff = ip_next ^ ip
> > if (diff & (diff-1)) == 0
> > new_segment = true
> > The first non-zero place in the high direction of ip is the end of the
> > segment(segment_end).
> > For example...100... and...101..., the segment_end is ...111...
> > 
> > 2. Count the number of consecutive and less than continuous_size in the
> > segment.
> > diff = ip_next - ip
> > if (diff & (diff-1)) == 0 && ip_next <= segment_end
> > continuous_size++
> > 
> > 3. Combine different ip intervals in the segment according to
> > continuous_size.
> > In continuous_size, from the highest bit of 1 to the lowest bit of 1, in
> > the order of segment start, each bit that is 1 is a different ip 
> interval
> > that can be combined with a wildcard mask.
> > For example, 000, 001, 010:
> > continuous_size: 3 (binary 11), segment_start: 000
> > mask: ~(1 << 1 - 1) = 110; ~(1 << 0 - 1) = 111;
> > Combined to: 000/110, 010/111
> > 
> > 4. The ip that cannot be recorded in a segment will not be combined.
> > 
> > Signed-off-by: Gongming Chen 
> > ---
> 
> Hi Gongming,
> 
> Sorry for the delayed review.
> 
> I have a few general remarks/concerns and some specific comments inline.
>  First, the general remarks.
> 
> I'm wondering if it would make more sense for this wildcard combination
> of IPs to be done outside OVN, in the CMS, for the following reasons:
> 
> - it's IPv4 specific.
> - the CMS usually has more information about when it is matching on IP
> sets and can probably optimize the match it uses in the ACL.
> - the code in expr_const_sets_add() used to be a relatively direct
> translation from sets of constants to sets of port names/IPs; this
> changes with the optimization your proposing.
> - the new utility, ovn-match-ip, would be useful but I'm worried that
> users will often forget to use it.
> 
> I'm curious about your and the maintainers' opinion on these points.
> 
> Hi Dumitru,
> First, thanks for your review.
> 
> On the one hand, I agree with your point of view. The cms side is more
> clear about the details of the ip, and the ip can be better optimized.
> However, on the other hand, I think that ovn, as a network provider, should
> not assume that upper layers such as cms have already made these 
> optimizations.
> In fact, some cms do not have such a function, such as openstack, k8s,
> and they hope that this specific and complex work will be realized by
> the network provider.
> In terms of better serving the upper 

Re: [ovs-dev] [PATCH v3 2/3] ovsdb-idl: Preserve references for deleted rows.

2021-03-24 Thread Dumitru Ceara
On 3/24/21 3:16 AM, Han Zhou wrote:
>> If it's OK with you, I can send a v4 incorporating the changes we
>> discussed until now (no refactoring though).  I'm also adding more test
>> cases.  Hopefully that makes the review easier.
> Sounds good.
> 
> Thanks,
> Han
> 

V4 available for review:
https://patchwork.ozlabs.org/project/openvswitch/list/?series=235642

Thanks,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 3/3] ovsdb-idl: Mark arc sources as updated when destination is deleted.

2021-03-24 Thread Dumitru Ceara
Considering two DB rows, 'a' from table A and 'b' from table B (with
column 'ref_a' a reference to table A):
a = {A._uuid=}
b = {B._uuid=, B.ref_a=}

When the IDL client processes an update that deletes row 'a', row 'b'
is also marked as 'updated' if change tracking is enabled for table B.

Fixes: 102781cc02c6 ("ovsdb-idl: Track changes for table references.")
Signed-off-by: Dumitru Ceara 
---
v4:
- Rebase.
---
 lib/ovsdb-idl.c|3 +++
 tests/ovsdb-idl.at |2 ++
 2 files changed, 5 insertions(+)

diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index 9043de0fab..25a78b5934 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -198,6 +198,8 @@ static void ovsdb_idl_remove_from_indexes(const struct 
ovsdb_idl_row *);
 static int ovsdb_idl_try_commit_loop_txn(struct ovsdb_idl_loop *loop,
  bool *may_need_wakeup);
 
+static void add_tracked_change_for_references(struct ovsdb_idl_row *);
+
 /* Creates and returns a connection to database 'remote', which should be in a
  * form acceptable to jsonrpc_session_open().  The connection will maintain an
  * in-memory replica of the remote database whose schema is described by
@@ -1385,6 +1387,7 @@ ovsdb_idl_reparse_deleted(struct ovsdb_idl *db)
 
 LIST_FOR_EACH_SAFE (row, next, track_node, >deleted_untracked_rows) {
 ovsdb_idl_row_untrack_change(row);
+add_tracked_change_for_references(row);
 ovsdb_idl_row_reparse_backrefs(row);
 
 /* Orphan rows that are still unreferenced or are part of tables that
diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
index 6f09257738..62181dd4de 100644
--- a/tests/ovsdb-idl.at
+++ b/tests/ovsdb-idl.at
@@ -1286,6 +1286,7 @@ OVSDB_CHECK_IDL_TRACK([track, simple idl, initially 
populated, orphan rows, cond
 003: table simple: inserted row: i=0 r=0 b=false s=row0_s u=<2> ia=[] ra=[] 
ba=[] sa=[] ua=[] uuid=<1>
 003: table simple: updated columns: s
 004: change conditions
+005: table simple6: name=first_row weak_ref=[] uuid=<0>
 005: table simple: deleted row: i=0 r=0 b=false s=row0_s u=<2> ia=[] ra=[] 
ba=[] sa=[] ua=[] uuid=<1>
 005: table simple: inserted row: i=0 r=0 b=false s=row1_s u=<2> ia=[] ra=[] 
ba=[] sa=[] ua=[] uuid=<3>
 005: table simple: updated columns: s
@@ -1497,6 +1498,7 @@ OVSDB_CHECK_IDL_TRACK([track, simple idl, initially 
populated, strong references
 001: table simple4: inserted row: name=row0_s4 uuid=<0>
 001: table simple4: updated columns: name
 002: change conditions
+003: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1>
 003: table simple4: deleted row: name=row0_s4 uuid=<0>
 004: {"error":null,"result":[{"uuid":["uuid","<2>"]}]}
 005: table simple3: name=row0_s3 uset=[] uref=[] uuid=<1>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 2/3] ovsdb-idl: Preserve references for deleted rows.

2021-03-24 Thread Dumitru Ceara
Considering two DB rows, 'a' from table A and 'b' from table B (with
column 'ref_a' a reference to table A):
a = {A._uuid=}
b = {B._uuid=, B.ref_a=}

Assuming both records are present in the IDL client's in-memory view of
the database, depending whether row 'b' is also deleted in the same
transaction or not, deletion of row 'a' should generate the following
tracked changes:

1. only row 'a' is deleted:
- for table A:
  - deleted records: a = {A._uuid=}
- for table B:
  - updated records: b = {B._uuid=, B.ref_a=[]}

2. row 'a' and row 'b' are deleted in the same update:
- for table A:
  - deleted records: a = {A._uuid=}
- for table B:
  - deleted records: b = {B._uuid=, B.ref_a=}

To ensure this, we now delay reparsing row backrefs for deleted rows
until all updates in the current run have been processed.

Without this change, in scenario 2 above, the tracked changes for table
B would be:
- deleted records: b = {B._uuid=, B.ref_a=[]}

In particular, for strong references, row 'a' can never be deleted in
a transaction that happens strictly before row 'b' is deleted.  In some
cases [0] both rows are deleted in the same transaction and having
B.ref_a=[] would violate the integrity of the database from client
perspective.  This would force the client to always validate that
strong reference fields are non-NULL.  This is not really an option
because the information in the original reference is required for
incrementally processing the record deletion.

[0] with ovn-monitor-all=true, the following command triggers a crash
in ovn-controller because a strong reference field becomes NULL:
$ ovn-nbctl --wait=hv -- lr-add r -- lrp-add r rp 00:00:00:00:00:01 
1.0.0.1/24
$ ovn-nbctl lr-del r

Reported-at: https://bugzilla.redhat.com/1932642
Fixes: 72aeb243a52a ("ovsdb-idl: Tracking - preserve data for deleted rows.")
Signed-off-by: Dumitru Ceara 
---
v4:
- Rename orphan_rows to deleted_untracked_rows.
- Rename ovsdb_idl_process_orphans() to ovsdb_idl_reparse_deleted().
- Revert changes to ovsdb_idl_row_reparse_backrefs().
- Unified test-ovsdb.c and test-ovsdb.py output for simple3's uset and
  uref columns.
- Added two more tests for deletion of strong references due to monitor
  condition change.
---
 lib/ovsdb-idl.c |  131 -
 tests/ovsdb-idl.at  |  312 ---
 tests/test-ovsdb.c  |   64 ++
 tests/test-ovsdb.py |   28 -
 4 files changed, 481 insertions(+), 54 deletions(-)

diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index 9e1e7876e7..9043de0fab 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -92,6 +92,9 @@ struct ovsdb_idl {
 struct ovsdb_idl_txn *txn;
 struct hmap outstanding_txns;
 bool verify_write_only;
+struct ovs_list deleted_untracked_rows; /* Stores rows deleted in the
+ * current run, that are not yet
+ * added to the track_list. */
 };
 
 static struct ovsdb_cs_ops ovsdb_idl_cs_ops;
@@ -144,6 +147,7 @@ static bool ovsdb_idl_modify_row(struct ovsdb_idl_row *,
  const struct shash *values, bool xor);
 static void ovsdb_idl_parse_update(struct ovsdb_idl *,
const struct ovsdb_cs_update_event *);
+static void ovsdb_idl_reparse_deleted(struct ovsdb_idl *);
 
 static void ovsdb_idl_txn_process_reply(struct ovsdb_idl *,
 const struct jsonrpc_msg *);
@@ -163,6 +167,10 @@ static void ovsdb_idl_row_unparse(struct ovsdb_idl_row *);
 static void ovsdb_idl_row_clear_old(struct ovsdb_idl_row *);
 static void ovsdb_idl_row_clear_new(struct ovsdb_idl_row *);
 static void ovsdb_idl_row_clear_arcs(struct ovsdb_idl_row *, bool 
destroy_dsts);
+static void ovsdb_idl_row_reparse_backrefs(struct ovsdb_idl_row *);
+static void ovsdb_idl_row_track_change(struct ovsdb_idl_row *,
+   enum ovsdb_idl_change);
+static void ovsdb_idl_row_untrack_change(struct ovsdb_idl_row *);
 
 static void ovsdb_idl_txn_abort_all(struct ovsdb_idl *);
 static bool ovsdb_idl_txn_extract_mutations(struct ovsdb_idl_row *,
@@ -248,6 +256,8 @@ ovsdb_idl_create_unconnected(const struct ovsdb_idl_class 
*class,
 .txn = NULL,
 .outstanding_txns = HMAP_INITIALIZER(>outstanding_txns),
 .verify_write_only = false,
+.deleted_untracked_rows
+= OVS_LIST_INITIALIZER(>deleted_untracked_rows),
 };
 
 uint8_t default_mode = (monitor_everything_by_default
@@ -351,6 +361,14 @@ ovsdb_idl_set_leader_only(struct ovsdb_idl *idl, bool 
leader_only)
 static void
 ovsdb_idl_clear(struct ovsdb_idl *db)
 {
+/* Process deleted rows, removing them from the 'deleted_untracked_rows'
+ * list and reparsing their backrefs.
+ */
+ovsdb_idl_reparse_deleted(db);
+
+/* Cleanup all rows; each row gets added to its own table's
+ * 'track_list'.
+ */
 for (size_t i = 0; 

[ovs-dev] [PATCH v4 1/3] ovsdb-idl.at: Make test outputs more predictable.

2021-03-24 Thread Dumitru Ceara
IDL tests need predictable output from test-ovsdb.

This used to be done by first sorting the output of test-ovsdb and then
applying uuidfilt to predictably translate UUIDs.  This was not
reliable enough in case test-ovsdb processes two or more insert/delete
operations in the same iteration because the order of lines in the
output depends on the automatically generated UUID values.

To fix this we change the way test-ovsdb and test-ovsdb.py generate
outputs and prepend the table name and tracking information before
printing the contents of a row.

All existing ovsdb-idl.at and ovsdb-cluster.at tests are updated to
expect the new output format.

Signed-off-by: Dumitru Ceara 
---
Note: the old approach was enough for outputs of the existing tests but
the next patch in this series adds a new test that requires this
change.

v4:
- Rebase.
- Readd the missing uuid to outputs as reported by Ilya.
- Fix indentation in test-ovsdb.c.
v3:
- Changed expected output of ovsdb-cluster.at to reflect the new
  formatting in test-ovsdb output.
- Fixed typo in test-ovsdb.py.
v2:
- Reworked the patch and changed test-ovsdb.c and test-ovsdb.py to
  generate output that can be sorted predictably.
- Rephrased commit message.
---
 lib/ovsdb-idl.c|3 
 lib/ovsdb-idl.h|2 
 tests/ovsdb-cluster.at |2 
 tests/ovsdb-idl.at |  471 +++-
 tests/test-ovsdb.c |  186 +++
 tests/test-ovsdb.py|   91 +
 6 files changed, 396 insertions(+), 359 deletions(-)

diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c
index 2c8a0c9cfe..9e1e7876e7 100644
--- a/lib/ovsdb-idl.c
+++ b/lib/ovsdb-idl.c
@@ -182,7 +182,6 @@ ovsdb_idl_table_from_class(const struct ovsdb_idl *,
 static struct ovsdb_idl_table *
 ovsdb_idl_table_from_class(const struct ovsdb_idl *,
const struct ovsdb_idl_table_class *);
-static bool ovsdb_idl_track_is_set(struct ovsdb_idl_table *table);
 static void ovsdb_idl_track_clear__(struct ovsdb_idl *, bool flush_all);
 
 static void ovsdb_idl_destroy_indexes(struct ovsdb_idl_table *);
@@ -1140,7 +1139,7 @@ ovsdb_idl_track_add_all(struct ovsdb_idl *idl)
 }
 
 /* Returns true if 'table' has any tracked column. */
-static bool
+bool
 ovsdb_idl_track_is_set(struct ovsdb_idl_table *table)
 {
 size_t i;
diff --git a/lib/ovsdb-idl.h b/lib/ovsdb-idl.h
index 05bb48d66c..d93483245e 100644
--- a/lib/ovsdb-idl.h
+++ b/lib/ovsdb-idl.h
@@ -53,6 +53,7 @@ struct ovsdb_datum;
 struct ovsdb_idl_class;
 struct ovsdb_idl_row;
 struct ovsdb_idl_column;
+struct ovsdb_idl_table;
 struct ovsdb_idl_table_class;
 struct uuid;
 
@@ -217,6 +218,7 @@ unsigned int ovsdb_idl_row_get_seqno(
 void ovsdb_idl_track_add_column(struct ovsdb_idl *idl,
 const struct ovsdb_idl_column *column);
 void ovsdb_idl_track_add_all(struct ovsdb_idl *idl);
+bool ovsdb_idl_track_is_set(struct ovsdb_idl_table *table);
 const struct ovsdb_idl_row *ovsdb_idl_track_get_first(
 const struct ovsdb_idl *, const struct ovsdb_idl_table_class *);
 const struct ovsdb_idl_row *ovsdb_idl_track_get_next(const struct 
ovsdb_idl_row *);
diff --git a/tests/ovsdb-cluster.at b/tests/ovsdb-cluster.at
index 92aa427093..cf43e9cf86 100644
--- a/tests/ovsdb-cluster.at
+++ b/tests/ovsdb-cluster.at
@@ -128,7 +128,7 @@ ovsdb_test_cluster_disconnect () {
"rows": [{"i": 1}]}]]' > test-ovsdb.log 2>&1 &
 echo $! > test-ovsdb.pid
 
-OVS_WAIT_UNTIL([grep "000: i=1" test-ovsdb.log])
+OVS_WAIT_UNTIL([grep "000: table simple: i=1" test-ovsdb.log])
 
 # Start collecting raft_is_connected logs for $target before shutting down
 # any servers.
diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
index e00e67e949..f96cc17d46 100644
--- a/tests/ovsdb-idl.at
+++ b/tests/ovsdb-idl.at
@@ -355,28 +355,28 @@ OVSDB_CHECK_IDL([simple idl, initially empty, various 
ops],
 'reconnect']],
   [[000: empty
 001: {"error":null,"result":[{"uuid":["uuid","<0>"]},{"uuid":["uuid","<1>"]}]}
-002: i=0 r=0 b=false s= u=<2> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<1>
-002: i=1 r=2 b=true s=mystring u=<3> ia=[1 2 3] ra=[-0.5] ba=[true] sa=[abc 
def] ua=[<4> <5>] uuid=<0>
+002: table simple: i=0 r=0 b=false s= u=<2> ia=[] ra=[] ba=[] sa=[] ua=[] 
uuid=<1>
+002: table simple: i=1 r=2 b=true s=mystring u=<3> ia=[1 2 3] ra=[-0.5] 
ba=[true] sa=[abc def] ua=[<4> <5>] uuid=<0>
 003: {"error":null,"result":[{"count":2}]}
-004: i=0 r=0 b=true s= u=<2> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<1>
-004: i=1 r=2 b=true s=mystring u=<3> ia=[1 2 3] ra=[-0.5] ba=[true] sa=[abc 
def] ua=[<4> <5>] uuid=<0>
+004: table simple: i=0 r=0 b=true s= u=<2> ia=[] ra=[] ba=[] sa=[] ua=[] 
uuid=<1>
+004: table simple: i=1 r=2 b=true s=mystring u=<3> ia=[1 2 3] ra=[-0.5] 
ba=[true] sa=[abc def] ua=[<4> <5>] uuid=<0>
 005: {"error":null,"result":[{"count":2}]}
-006: i=0 r=123.5 b=true s= u=<2> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<1>
-006: i=1 r=123.5 b=true s=mystring u=<3> ia=[1 2 3] 

[ovs-dev] [PATCH v4 0/3] ovsdb-idl: Preserve references for tracked deleted rows.

2021-03-24 Thread Dumitru Ceara
Patch 1/3 of the series makes the ovsdb-idl tests more future proof
by trying to ensure more predictable output from test-ovsdb.

Paches 2/3 and 3/3 fix problems in the IDL change tracking code.

Changes in v4:
- Patch 1/3:
  - Rebase.
  - Readd UUID to test-ovsdb.py output.
  - Fix indentation in test-ovsdb.c.
- Patch 2/3:
  - Rename orphan_rows to deleted_untracked_rows.
  - Rename ovsdb_idl_process_orphans() to ovsdb_idl_reparse_deleted().
  - Revert changes to ovsdb_idl_row_reparse_backrefs().
  - Unified test-ovsdb.c and test-ovsdb.py output for simple3's uset and
uref columns.
  - Added two more tests for deletion of strong references due to monitor
condition change.
- Patch 3/3:
  - Rebase.

Changes in v3:
- Patch 1/3:
  - Changed expected output of ovsdb-cluster.at to reflect the new
formatting in test-ovsdb output.
  - Fixed typo in test-ovsdb.py.
- Patch 2/3:
  - Rework based on the discussion with Ilya.
  - Added more tests.
- Add patch 3/3:
  - Mark reference sources as "udpated" when destinations are deleted.

Changes in v2:
- Patch 1/2:
  - reworked the patch to improve the output of test-ovsdb.c and
test-ovsdb.py themselves.
- Patch 2/2:
  - added a test for strong references.

Dumitru Ceara (3):
  ovsdb-idl.at: Make test outputs more predictable.
  ovsdb-idl: Preserve references for deleted rows.
  ovsdb-idl: Mark arc sources as updated when destination is deleted.


 lib/ovsdb-idl.c|  137 +++--
 lib/ovsdb-idl.h|2 
 tests/ovsdb-cluster.at |2 
 tests/ovsdb-idl.at |  747 
 tests/test-ovsdb.c |  246 +++-
 tests/test-ovsdb.py|  119 +---
 6 files changed, 861 insertions(+), 392 deletions(-)

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v12 00/11] Add offload support for sFlow

2021-03-24 Thread Chris Mi

On 3/23/2021 10:24 PM, Ilya Maximets wrote:

On 3/5/21 4:27 AM, Chris Mi wrote:

Hi Ilya,

I think about your suggestion recently. But I'm still not very clear about the 
design.
Please see my reply below:

On 3/1/2021 8:48 PM, Ilya Maximets wrote:

On 3/1/21 9:30 AM, Chris Mi wrote:

Hi Simon, Ilya,

Could I know what should we do to make progress for this patch set?
It has been posted in the community for a long time 

In general, the way to get your patches reviewed is to review other
patches.  It's simply because we still have a huge review backlog
(214 patches right now in patchwork and most of them needs review)
and bugfixes usually has a bit higher priority.  By reviewing other
patches you're reducing amount of work for maintainers so they can
get to your patches faster.

OK, I see.

For the series and comments from Eelco:
I didn't read the patches carefully, only a quick glance, but I still
do not understand why we need a separate thread to poll psample events.
Why can't we just allow usual handler threads to do that?

I'm not sure if you are aware of that the psample netlink is different from the 
ovs
netlink. Without offload, kernel sends missed packet and sFlow packet to 
userspace
using the same netlink 'ovs_packet_family'. So they can use the same handler 
thread.
But in order to offload sFlow action, we use psample kernel module to send 
sampled
packets from kernel to userspace. The format for ovs netlink message and psample
netlink messages are different.

Hi.  Sorry for late reply.

Yes, I understand that message format is different, but it doesn't really
matter.  All the message-specific handling and parsing should happen
inside the netdev_offload_tc_recv().  This function should have a prototype
similar to dpif_netlink_recv(), i.e. it should receive a pointer to the
struct dpif_upcall from the caller and fill it with data.  Maybe other
type of a generic data structure if it's really not possible to construct
struct dpif_upcall.



    From the
architecture perspective it's not a good thing to call ofproto code
from the offload-provider.  This introduces lots of complications
and might cause lots of issues in the future.

I'd say that design should look like this:

    handler thread ->
  dpif_recv() ->
    dpif_netlink_recv() ->
  netdev_offload_recv() ->
    netdev_offload_tc_recv() ->
  nl_sock_recv()

In order to use the handler thread, I'm not sure if we should add psample socket
fd to every handler's epoll_fd. If we should do that, we should consider how to
differentiate if the event comes from ovs or psample netlink. Maybe we should
allocate a special port number for psample and assign it to event.data.u32.
Anyway, that's the details. If this is the right direction, I'll think about it.

Last three calls should be implemented.

The better version of this will be to throw away dpif part from
above call chain and make it:

    handler thread ->
  netdev_offload_recv() ->
    netdev_offload_tc_recv() ->
  nl_sock_recv()

If we throw away dpif part, maybe we have to write some duplicate epoll code
for psample only. And we can't block in nl_sock_recv(). Maybe we have to add
psample socket fd to every handler's epoll_fd. So we have to change the dpif
somehow.

There is no need to implement any epoll_fd logic for psample.
You may only use handler with handler_id == 0.  Only this handler will receive
psample upcalls.  netdev_offload_recv_wait() should be implemented similar
to how dpif_netlink_recv_wait() implemented for windows case, i.e. it will
call nl_sock_wait(nl_sock, POLLIN) for the psample netlink socket.
There is no need to block and you're never blocking anywhere because your're
calling nl_sock_recv(..., wait = false) which will use MSG_DONTWAIT while
actually receiving a message.

Hi Ilya,

With this new design, the code will be changed greatly. I'm not 
unwilling to change it.
The effort is not small. So before I start, I want to make sure that it 
is feasible.


Yes, we won't block in nl_sock_recv(), but the handler thread will be 
blocked in poll_block().
If any of the vport netlink socket is readable , it will be waken up by 
kernel. If we don't use
epoll_ctl to add the psample netlink socket to the handler's epoll fd, 
we can't receive the
psample packet as soon as possible. That means we can only receive the 
sampled packet

when there is a miss upcall.

I added some debug message in ovs epoll code and got above conclusion. 
But I'm not

the expert of the epoll API, so I'm not sure if I missed anything.

Thanks,
Chris


So, there should be several call chains:

1. Init.

open_dpif_backer/type_run() ->
  netdev_offload_recv_set() ->
netdev_offload_tc_recv_set(enabled) ->
  if (enable)
psample_sock = nl_sock_create(...);
  else
close(psample_sock);

2. Wait.

udpif_upcall_handler() ->
  netdev_offload_recv_wait() ->
netdev_offload_tc_recv_wait() ->
   

[ovs-dev] crash in dst_cache_destroy

2021-03-24 Thread wanghanlin
Hi ALL,
We found a kernel panic about ovs 2.8.2 with kernel 4.9.65.

[20655819.900423] BUG: unable to handle kernel NULL pointer dereference at 
0062
[20655819.909234] IP: [] dst_release+0x11/0x70
[20655819.915841] PGD 0
[20655819.917963]
[20655819.920779] Oops:  [#1] SMP
...
[20655820.072555] CPU: 13 PID: 585747 Comm: handler146 Tainted: G   O   
 4.9.65-k8s-netease3-1 #1
[20655820.082713] Hardware name: Inspur NF5280M4/YZMB-00689-101, BIOS 4.1.19 
02/27/2018
[20655820.091284] task: 9054dec880c0 task.stack: b77e799ec000
[20655820.098257] RIP: 0010:[]  [] 
dst_release+0x11/0x70
[20655820.107239] RSP: 0018:9034ff943ec8  EFLAGS: 00010202
[20655820.113627] RAX: 9034ff60 RBX:  RCX: 

[20655820.122013] RDX: 47293d0355f0 RSI: 0200 RDI: 
0002
[20655820.130183] RBP: 9034ae6465d8 R08:  R09: 
00ff
[20655820.138360] R10: 9034d9a9cb10 R11:  R12: 
b3d18e00
[20655820.146755] R13: 9034acad3d00 R14: 9054dec880c0 R15: 
9034ff9596c0
[20655820.154969] FS:  7f8a09ffb700() GS:9034ff94() 
knlGS:
[20655820.164098] CS:  0010 DS:  ES:  CR0: 80050033
[20655820.170910] CR2: 0062 CR3: 001fdef6e000 CR4: 
00360670
[20655820.179082] DR0:  DR1:  DR2: 

[20655820.187235] DR3:  DR6: fffe0ff0 DR7: 
0400
[20655820.195445] Stack:
[20655820.198492]   9034ae6465d8 b3529748 
9034ae646500
[20655820.207727]    b350c811 
b3c4fc00
[20655820.216333]  9034ff9596f8 0001 b350cb5a 
b30dd643
[20655820.224998] Call Trace:
[20655820.228487]  
[20655820.230688]  [] ? dst_cache_destroy+0x38/0x70
[20655820.237639]  [] ? dst_destroy+0x21/0x110
[20655820.244225]  [] ? dst_destroy_rcu+0xa/0x20
[20655820.250930]  [] ? rcu_process_callbacks+0x1e3/0x5b0
[20655820.258457]  [] ? __do_softirq+0x105/0x290
[20655820.265757]  [] ? irq_exit+0xae/0xb0
[20655820.272245]  [] ? smp_apic_timer_interrupt+0x3e/0x50
[20655820.280312]  [] ? apic_timer_interrupt+0x96/0xa0
[20655820.287534]  
[20655820.289734]  [] ? mutex_spin_on_owner.isra.1+0x28/0x40
[20655820.297580]  [] ? mutex_optimistic_spin+0x7f/0x1b0
[20655820.305032]  [] ? __mutex_lock_slowpath+0x3f/0x130
[20655820.312464]  [] ? mutex_lock+0x1b/0x30
[20655820.318831]  [] ? pcpu_alloc+0x48f/0x680
[20655820.325376]  [] ? dst_cache_init+0x1d/0x40
[20655820.332126]  [] ? validate_and_copy_set_tun+0xe0/0x360 
[openvswitch]
[20655820.341108]  [] ? validate_set+0x2b3/0x380 [openvswitch]
[20655820.349009]  [] ? __ovs_nla_copy_actions+0x2b1/0x680 
[openvswitch]
[20655820.357765]  [] ? __kmalloc_reserve.isra.35+0x2e/0x80
[20655820.365372]  [] ? kmem_cache_alloc_node+0xd4/0x530
[20655820.372684]  [] ? ovs_nla_copy_actions+0x6e/0x90 
[openvswitch]
[20655820.381064]  [] ? ovs_packet_cmd_execute+0x16d/0x290 
[openvswitch]
[20655820.389745]  [] ? genl_family_rcv_msg+0x1bc/0x360
[20655820.396972]  [] ? genl_family_rcv_msg+0x360/0x360
[20655820.404247]  [] ? genl_rcv_msg+0x82/0xc0
[20655820.410646]  [] ? netlink_rcv_skb+0xa1/0xc0
[20655820.417350]  [] ? genl_rcv+0x24/0x40
[20655820.423927]  [] ? netlink_unicast+0x184/0x230
[20655820.430773]  [] ? netlink_sendmsg+0x2f8/0x3b0
[20655820.437554]  [] ? sock_sendmsg+0x30/0x40
[20655820.443925]  [] ? ___sys_sendmsg+0x2c2/0x2d0
[20655820.450596]  [] ? ep_ptable_queue_proc+0x90/0x90
[20655820.457686]  [] ? 
ep_scan_ready_list.constprop.12+0x208/0x210
[20655820.465888]  [] ? ep_poll+0x192/0x350
[20655820.471919]  [] ? __sys_sendmsg+0x51/0x90
[20655820.478876]  [] ? system_call_fast_compare_end+0xc/0xb7

Regards,
Hanlin
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd-ddlog: Fix minor memory leak.

2021-03-24 Thread Numan Siddique
On Wed, Mar 10, 2021 at 6:06 AM Ben Pfaff  wrote:
>
> This string is allocated in northd_ctx_create() but until now it was
> not freed.
>
> Signed-off-by: Ben Pfaff 
> Reported-by: Numan Siddique 

Acked-by: Numan Siddique 

Numan

> ---
>  northd/ovn-northd-ddlog.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/northd/ovn-northd-ddlog.c b/northd/ovn-northd-ddlog.c
> index 4efdfa38749a..c98bded71b5f 100644
> --- a/northd/ovn-northd-ddlog.c
> +++ b/northd/ovn-northd-ddlog.c
> @@ -200,6 +200,7 @@ northd_ctx_destroy(struct northd_ctx *ctx)
>  if (ctx) {
>  ovsdb_cs_destroy(ctx->cs);
>  json_destroy(ctx->output_only_data);
> +free(ctx->prefix);
>  free(ctx);
>  }
>  }
> --
> 2.29.2
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd-ddlog: Fix error checking in ddlog_commit().

2021-03-24 Thread Numan Siddique
On Wed, Mar 10, 2021 at 6:15 AM Ben Pfaff  wrote:
>
> new_delta has the error status but this function was checking a
> different variable instead.
>
> Found by inspection.;
>
> Signed-off-by: Ben Pfaff 

Acked-by: Numan Siddique 

Numan

> ---
>  northd/ovn-northd-ddlog.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/northd/ovn-northd-ddlog.c b/northd/ovn-northd-ddlog.c
> index 238987410239..81ea8c144a96 100644
> --- a/northd/ovn-northd-ddlog.c
> +++ b/northd/ovn-northd-ddlog.c
> @@ -320,7 +320,7 @@ static int
>  ddlog_commit(ddlog_prog ddlog)
>  {
>  ddlog_delta *new_delta = ddlog_transaction_commit_dump_changes(ddlog);
> -if (!delta) {
> +if (!new_delta) {
>  VLOG_WARN("Transaction commit failed");
>  return -1;
>  }
> --
> 2.29.2
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 1/3] ovn-northd-ddlog: Fix memory leaks.

2021-03-24 Thread Numan Siddique
On Wed, Mar 10, 2021 at 5:55 AM Ben Pfaff  wrote:
>
> ddlog_delta_get_table() returns data that must be freed, but the code
> in northd_update_probe_interval() did not do that.  This fixes it.
>
> In addition, the accumulated deltas weren't freed when the daemon exits.
> This doesn't really matter but it's cleaner to do so, so this commit
> also does that.
>
> Signed-off-by: Ben Pfaff 
> Reported-by: Numan Siddique 

Thanks for these fixes and for the cleanup.

For all the 3 patches in the series.
Acked-by: Numan Siddique 

Numan

> ---
>  northd/ovn-northd-ddlog.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/northd/ovn-northd-ddlog.c b/northd/ovn-northd-ddlog.c
> index 238987410239..be3de5cee29e 100644
> --- a/northd/ovn-northd-ddlog.c
> +++ b/northd/ovn-northd-ddlog.c
> @@ -577,6 +577,7 @@ northd_update_probe_interval(struct northd_ctx *nb, 
> struct northd_ctx *sb)
>  table_id tid = ddlog_get_table_id("Northd_Probe_Interval");
>  ddlog_delta *probe_delta = ddlog_delta_get_table(delta, tid);
>  ddlog_delta_enumerate(probe_delta, northd_update_probe_interval_cb, 
> (uintptr_t) _interval);
> +ddlog_free_delta(probe_delta);
>
>  ovsdb_cs_set_probe_interval(nb->cs, probe_interval);
>  ovsdb_cs_set_probe_interval(sb->cs, probe_interval);
> @@ -1230,6 +1231,7 @@ main(int argc, char *argv[])
>  northd_ctx_destroy(nb_ctx);
>  northd_ctx_destroy(sb_ctx);
>
> +ddlog_free_delta(delta);
>  ddlog_stop(ddlog);
>
>  if (replay_fd >= 0) {
> --
> 2.29.2
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev