Please see below the updated cover letter which has more details.

(Apologies for sending this as an html instead of plain text).


This patch series adds incremental processing in the lflow engine
node to handle changes to northd and other engine nodes.
Changed related to load balancers and NAT are mainly handled in
this patch series.

This patch series can also be found here -
https://github.com/numansiddique/ovn/tree/northd_lflow_ip/v1

Prior to this patch series, most of the changes to northd engine
resulted in full recomputation of logical flows. This series
aims to improve the performance of ovn-northd by adding the I-P
support. In order to add this support, some of the northd engine
node data (from struct ovn_datapath) is split and moved over to
new engine nodes - mainly related to load balancers, NAT and ACLs.

Below are the scale testing results done with these patches applied
using ovn-heater. The test ran the scenario -
ocp-500-density-heavy.yml [1].

With all the lflow I-P patches applied, the resuts are:

-------------------------------------------------------------------------------------------------------------------------------------------------------
Min (s) Median (s) 90%ile (s) 99%ile (s) Max (s) Mean (s) Total (s) Count
Failed
-------------------------------------------------------------------------------------------------------------------------------------------------------
Iteration Total 0.136883 1.129016 1.192001 1.204167 1.212728 0.665017
83.127099 125 0
Namespace.add_ports 0.005216 0.005736 0.007034 0.015486 0.018978 0.006211
0.776373 125 0
WorkerNode.bind_port 0.035030 0.046082 0.052469 0.058293 0.060311 0.045973
11.493259 250 0
WorkerNode.ping_port 0.005057 0.006727 1.047692 1.069253 1.071336 0.266896
66.724094 250 0
-------------------------------------------------------------------------------------------------------------------------------------------------------

The results with the present main are:

-------------------------------------------------------------------------------------------------------------------------------------------------------
Min (s) Median (s) 90%ile (s) 99%ile (s) Max (s) Mean (s) Total (s) Count
Failed
-------------------------------------------------------------------------------------------------------------------------------------------------------
Iteration Total 0.135491 2.223805 3.311270 3.339078 3.345346 1.729172
216.146495 125 0
Namespace.add_ports 0.005380 0.005744 0.006819 0.018773 0.020800 0.006292
0.786532 125 0
WorkerNode.bind_port 0.034179 0.046055 0.053488 0.058801 0.071043 0.046117
11.529311 250 0
WorkerNode.ping_port 0.004956 0.006952 3.086952 3.191743 3.192807 0.791544
197.886026 250 0
-------------------------------------------------------------------------------------------------------------------------------------------------------

[1] -
https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml


v2 -> v3
-------
* Addressed some of the review comments from Han and Dumitru. There
are still a few pending review comments which needs to be addressed
or discussed.

* Renamed the engine node from "lr_lbnat_data" to "lr_stateful"
(v3 patch 5).
* Renamed the engine node from "ls_lbacls" to "ls_stateful" (v3 patch 8).

* Removed v2 patch 2 from the series (northd: Track ovn_datapaths in
northd engine track data."). This patch is now part of v3 patch 7
(northd: Add a new node 'ls_stateful').

* Squashed v2 patch 8 (northd: Don't commit dhcp response flows in
the conntrack.) into v3 patch 7 (northd: Add a new node
'ls_stateful'.)


v1 -> v2
--------
* Now also maintaing array indexes for ls_lbacls, lr_nat and
lr_lb_nat_data tables (similar to ovn_datapaths->array) to
make the lookup effecient. The same ovn_datapath->index
is reused.

* Made some signficant changes to 'struct lflow_ref' in lflow-mgr.c.
In v2 we don't use objdep_mgr to maintain the resource to lflow
references. Instead we maintain the 'struct lflow' pointer.
With this we don't need to maintain additional hmap of lflows.


Below is the brief description of this patch series. (This could help the
reviewers).


This patch series adds 3 new engine nodes
1. lr_nat:
It maintains the NAT related data for each logical router which was earlier
maintained by northd engine node in the 'struct ovn_datapath'

It has only one input node - "northd"

engine_add_input(&en_lr_nat, &en_northd, lr_nat_northd_handler);

Below NAT related data is stored for each logical router

struct lr_nat_record {
...
const struct ovn_datapath *od;

struct ovn_nat *nat_entries;
size_t n_nat_entries;

bool has_distributed_nat;

/* Set of nat external ips on the router. */
struct sset external_ips;

/* Set of nat external macs on the router. */
struct sset external_macs;

/* SNAT IPs owned by the router (shash of 'struct ovn_snat_ip'). */
struct shash snat_ips;

struct lport_addresses dnat_force_snat_addrs;
struct lport_addresses lb_force_snat_addrs;
bool lb_force_snat_router_ip;
...
};

2. lr_stateful:
It maintains the stateful data for each logical router. Stateful data
consists of Load balancers and NAT. The name 'stateful' is chosen since
OVN uses conntrack for Load balancers and NAT.

It has 3 inputs
- northd
- lr_nat
- lb_data

engine_add_input(&en_lr_stateful, &en_northd, lr_stateful_northd_handler);
engine_add_input(&en_lr_stateful, &en_lr_nat, lr_stateful_lr_nat_handler);
engine_add_input(&en_lr_stateful, &en_lb_data,
lr_stateful_lb_data_handler);

The below 'stateful' related data is stored for each logical router

struct lr_stateful_record {
...
const struct lr_nat_record *lrnat_rec;
bool has_lb_vip;

/* Load Balancer vIPs relevant for this datapath. */
struct ovn_lb_ip_set *lb_ips;

/* sset of vips which are also part of lr nats. */
struct sset vip_nats;

/* 'lflow_ref' is used to reference logical flows generated for
* this lr_stateful record. */
struct lflow_ref *lflow_ref;
...
};
"lr_nat" engine node data could have been part of "lr_stateful" engine node.
But it had to be separated out in order to generate "vip_nats" i.e set of
vips which
are also part of logical router NAT external ips.

'lr_stateful' becomes an input to 'lflow' engine node.
engine_add_input(&en_lflow, &en_lr_stateful, lflow_lr_stateful_handler);

References to the stateful logical flows generated for each logical router
are
stored in the lr_stateful_record->lflow_ref similar to the lflow references
stored for each VIF logical port.

3. ls_stateful
It maintains the stateful data for each logical switch. Stateful data
consists of Load balancers and ACLs associated with a logical switch.
The name 'stateful' is chosen since OVN uses conntrack for Load balancers
and ACls.

It has 2 inputs
- northd
- port_group

engine_add_input(&en_ls_stateful, &en_northd, ls_stateful_northd_handler);
engine_add_input(&en_ls_stateful, &en_port_group,
ls_stateful_port_group_handler);

The below 'stateful' related data is stored for each logical switch

struct ls_stateful_record {
...
const struct ovn_datapath *od;
bool has_stateful_acl;
bool has_lb_vip;
bool has_acls;
uint64_t max_acl_tier;

/* 'lflow_ref' is used to reference logical flows generated for
* this ls_stateful record. */
struct lflow_ref *lflow_ref;
...
};

'ls_stateful' becomes an input to 'lflow' engine node.
engine_add_input(&en_lflow, &en_ls_stateful, lflow_ls_stateful_handler);
References to the stateful logical flows generated for each logical switch
are
stored in the ls_stateful_record->lflow_ref.


Lflow references
----------------
Prior to this patch series, lflow I-P support was added for VIF logical
ports
and all the logical flows related to a VIF logical port were stored in the
ovs_list ovn_port->lflows. 'struct lflow_ref_node' was added to store
the logical port to logical flow reference.

-----
/* A node that maintains link between an object (such as an ovn_port) and
* a lflow. */
struct lflow_ref_node {
/* This list follows different lflows referenced by the same object. List
* head is, for example, ovn_port->lflows. */
struct ovs_list lflow_list_node;
/* This list follows different objects that reference the same lflow. List
* head is ovn_lflow->referenced_by. */
struct ovs_list ref_list_node;
/* The lflow. */
struct ovn_lflow *lflow;
};
-----

In this patch series, 'struct lflow_ref_node' is modified and the list
of lflow references for a given resource (logical port, load balancer, etc)
is stored in 'struct lflow_ref'

----
/* lflow ref mgr */
struct lflow_ref {
char *res_name;

/* head of the list 'struct lflow_ref_node'. */
struct ovs_list lflows_ref_list;

/* hmapx_node is 'struct lflow *'. This is used to ensure
* that there are no duplicates in 'lflow_ref_list' above. */
struct hmapx lflows;
};

struct lflow_ref_node {
struct ovs_list ref_list_node;

struct ovn_lflow *lflow;
size_t dp_index;
};
----

A logical flow can be added to the lflow table using lflow_table_add_lflow()
and 'struct lflow_ref' instance can be passed to this function in which case
the lflow is added to this lflow_ref.

All the logical flows pertaining to a given 'ovn_port' are referenced in
ovn_port->lflow_ref.
Similarly all the logical flows pertaining to 'lr_stateful', 'ls_stateful'
and 'lb_datapaths'
are stored in their respective lflow_refs.

This becomes easier to clear the logical flows for a given resource and
rebuild them
when the resource gets modified.

I-P for the lflows
-------------------

Below is how this patch series handles changes to NAT incrementally
starting from
northd engine node to logical flow generation.

Changes to NAT
------------

|
| NAT changes (ovn-nbctl lr-nat-add lr0 ...)
v
+-------------------------------+
| |
| en_northd |
| (northd_handle_lr_changes) |
+-------------------------------+
| northd_data->trk_data.lr_with_changed_nats = [lr0]
|
v
+-------------------------------+
| |
| en_lr_nat |
| (lr_nat_northd_handler) |
+-------------------------------+
| lr_nat_data->trk_data.crupdated = [lr0]
|
v
+-------------------------------+
| |
| en_lr_stateful |
| (lr_stateful_lr_nat_handler) |
+-------------------------------+
|lr_stateful_data->trk_data.crupdated = [lr0]
|
v
+-------------------------------+
| |
| en_lflow |
| (lflow_lr_stateful_handler) |
+-------------------------------+
|
|
v
lflow_lr_stateful_handler() {
for each lr_stateful record 'lr_sful_rec' in the
lr_stateful_data->trk_data.crupdated
do
lflow_ref_clear_lflows(lr_sful_rec->lflow_ref); /* Clear the old lflows. */
build_lr_stateful_flows(..., lr_sful_rec->lflow_ref)
lflow_ref_sync_lflows_to_sb(lr_sful_rec->lflow_ref) /* Sync the lflows to
sb. */
done
}


Below is how this patch series handles changes to LBs incrementally
starting from
lb_data engine node to logical flow generation.

Let's say there is a load balancer 'lb0' associated with a logical switch
'sw0' and logical router 'lr0'
and a new VIP is added to it.

ovn-nbctl set load_balancer lb0 vips:10.0.0.10=20.0.0.20

|
| LB changes (ovn-nbctl set load_balancer lb0 ....)
v
+------------------------------------+
| |
| en_lb_data |
| (lb_data_load_balancer_handler) |
| (lb_data_logical_switch_handler) |
| (lb_data_logical_router_handler) |
+------------------------------------+
| tracked_lb_data->crupdated_lbs = [lb0]
| tracked_lb_data->crupdated_ls_lbs = [ {od:sw0}, {assoc_lbs: [lb0]} ]
| tracked_lb_data->crupdated_lr_lbs = [ {od:lr0}, {assoc_lbs: [lb0]} ]
v
+---------------------------------------------------------------+
| |
| |
v |
+-------------------------------+ |
| | |
| en_northd | |
| (northd_lb_data_handler) | |
+-------------------------------+ |
| |
| tracked_northd_data->trk_lbs.crupdated = [lb0] |
| tracked_northd_data->ls_with_changed_lbs = [sw0] |
| |
+------- |
| |
| |
| tracked_northd_data->ls_with_changed_lbs = [sw0] |
v |
+-------------------------------+ |
| | |
| en_ls_stateful | |
| (ls_stateful_northd_handler) | |
+-------------------------------+ |
| |
| |
v |
ls_stateful_northd_handler() { |
for each lswitch od 'ls' in tracked_northd_data->ls_with_changed_lbs |
do |
ls_sful_rec = ls_stateful_table_find('ls') |
ls_stateful_record_reinit(ls_sful_rec) |
Add ls_sful_rec to tracked data -> ls_stateful_data->trk_data.crupdated |
done |
} |
| |
| |
|
| +---------------------------------- +
|
| | tracked_lb_data->crupdated_lbs = [lb0]
| | tracked_lb_data->crupdated_lr_lbs = [ {od:lr0}, {assoc_lbs: [lb0]} ]
| v
| +------------------------------------+
| | |
| | en_lr_stateful |
| | (lr_stateful_lb_data_handler) |
| +------------------------------------+
| |
| |
| v
| lr_stateful_lb_data_handler() {
| for each 'lr_lb' in tracked_lb_data->crupdated_lr_lbs
| do
| lr = lr_lb['od']
| lr_sful_rec = lr_sful_table_find(lr)
|
| for each 'lb' in lr_lb['assoc_lbs']
| do
| build_lrouter_lb_ips(lr, lb)
| build_lrouter_lb_reachable_ips(lr, lb)
| done
|
| Add lr_sful_rec to tracked data -> ls_stateful_data->trk_data.crupdated
| done
| }
| |
| +--------------------------------+
| |
| |
| |
| |
| |
| |
| |
| |
|ls_sful_data->trk_data.crupdated = [sw0] |
+--------+ |
| |
| lr_sful_data->trk_data.crupdated = [lr0] |
| +--------------------------------------------+
| |
v v
+------------------------------------+
| |
| en_lflow |
| (lflow_ls_stateful_handler) |
| (lflow_lr_stateful_handler) | tracked_northd_data->trk_lbs.crupdated =
[lb0]
| (lflow_northd_handler) <-------|----- (from en_northd)
+------------------------------------+
|
|
|
v
lflow_ls_stateful_handler() {
for each lr_sful_rec in ls_sful_data->trk_data.crupdated
do
lflow_ref_clear_lflows(lr_sful_rec->lflow_ref); /* Clear the old lflows. */
build_ls_stateful_flows(..., ls_sful_rec->lflow_ref)
lflow_ref_sync_lflows_to_sb(ls_sful_rec->lflow_ref) /* Sync the lflows to
sb. */
done
}
lflow_lr_stateful_handler() {
for each lr_sful_rec in the lr_sful_data->trk_data.crupdated
do
lflow_ref_clear_lflows(lr_sful_rec->lflow_ref); /* Clear the old lflows. */
build_lr_stateful_flows(..., lr_sful_rec->lflow_ref)
lflow_ref_sync_lflows_to_sb(lr_sful_rec->lflow_ref) /* Sync the lflows to
sb. */
done
}

lflow_northd_handler() {
for each 'lb_dps' in tracked_northd_data->trk_lbs.crupdated
do
lflow_ref_clear_lflows(lb_dps->lflow_ref); /* Clear the old lflows. */
build_lflows_for_lb_dps(..., lb_dps->lflow_ref);
lflow_ref_sync_lflows_to_sb(lb_dps->lflow_ref) /* Sync the lflows to sb. */
done
}


Thanks
Numan


On Mon, Nov 27, 2023 at 9:39 PM Numan Siddique <num...@ovn.org> wrote:
>
> On Mon, Nov 27, 2023 at 9:34 PM <num...@ovn.org> wrote:
> >
> > From: Numan Siddique <num...@ovn.org>
> >
> > This patch series adds incremental processing in the lflow engine
> > node to handle changes to northd and other engine nodes.
> > Changed related to load balancers and NAT are mainly handled in
> > this patch series.
> >
> > This patch series can also be found here -
https://github.com/numansiddique/ovn/tree/northd_lflow_ip/v1
>
> Correction -
https://github.com/numansiddique/ovn/tree/northd_lbnatacl_lflow/v3
>
> Thanks
> Numan
>
> >
> > Prior to this patch series, most of the changes to northd engine
> > resulted in full recomputation of logical flows.  This series
> > aims to improve the performance of ovn-northd by adding the I-P
> > support.  In order to add this support, some of the northd engine
> > node data (from struct ovn_datapath) is split and moved over to
> > new engine nodes - mainly related to load balancers, NAT and ACLs.
> >
> > Below are the scale testing results done with these patches applied
> > using ovn-heater.  The test ran the scenario  -
> > ocp-500-density-heavy.yml [1].
> >
> > With all the lflow I-P patches applied, the resuts are:
> >
> >
-------------------------------------------------------------------------------------------------------------------------------------------------------
> >                         Min (s)         Median (s)      90%ile (s)
 99%ile (s)      Max (s)         Mean (s)        Total (s)       Count
Failed
> >
-------------------------------------------------------------------------------------------------------------------------------------------------------
> > Iteration Total         0.136883        1.129016        1.192001
 1.204167        1.212728        0.665017        83.127099       125     0
> > Namespace.add_ports     0.005216        0.005736        0.007034
 0.015486        0.018978        0.006211        0.776373        125     0
> > WorkerNode.bind_port    0.035030        0.046082        0.052469
 0.058293        0.060311        0.045973        11.493259       250     0
> > WorkerNode.ping_port    0.005057        0.006727        1.047692
 1.069253        1.071336        0.266896        66.724094       250     0
> >
-------------------------------------------------------------------------------------------------------------------------------------------------------
> >
> > The results with the present main are:
> >
> >
-------------------------------------------------------------------------------------------------------------------------------------------------------
> >                         Min (s)         Median (s)      90%ile (s)
 99%ile (s)      Max (s)         Mean (s)        Total (s)       Count
Failed
> >
-------------------------------------------------------------------------------------------------------------------------------------------------------
> > Iteration Total         0.135491        2.223805        3.311270
 3.339078        3.345346        1.729172        216.146495      125     0
> > Namespace.add_ports     0.005380        0.005744        0.006819
 0.018773        0.020800        0.006292        0.786532        125     0
> > WorkerNode.bind_port    0.034179        0.046055        0.053488
 0.058801        0.071043        0.046117        11.529311       250     0
> > WorkerNode.ping_port    0.004956        0.006952        3.086952
 3.191743        3.192807        0.791544        197.886026      250     0
> >
-------------------------------------------------------------------------------------------------------------------------------------------------------
> >
> > [1] -
https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml
> >
> >
> > Note:  This cover letter will be enhanced with more details in v4.
> >
> > v2 -> v3
> > -------
> >    * Addressed some of the review comments from Han and Dumitru.  There
> >      are still a few pending review comments which needs to be addressed
> >      or discussed.
> >
> >    * Renamed the engine node from "lr_lbnat_data" to "lr_stateful"
> >      (v3 patch 5).
> >
> >    * Renamed the engine node from "ls_lbacls" to "ls_stateful" (v3
patch 8).
> >
> >    * Removed v2 patch 2 from the series (northd: Track ovn_datapaths in
> >      northd engine track data.").  This patch is now part of v3 patch 7
> >      (northd: Add a new node 'ls_stateful').
> >
> >    * Squashed v2 patch 8 (northd: Don't commit dhcp response flows in
> >      the conntrack.) into v3 patch 7 (northd: Add a new node
> >      'ls_stateful'.)
> >
> >
> > v1 -> v2
> > --------
> >    * Now also maintaing array indexes for ls_lbacls, lr_nat and
> >      lr_lb_nat_data tables (similar to ovn_datapaths->array) to
> >      make the lookup effecient.  The same ovn_datapath->index
> >      is reused.
> >
> >    * Made some signficant changes to 'struct lflow_ref' in lflow-mgr.c.
> >      In v2 we don't use objdep_mgr to maintain the resource to lflow
> >      references.  Instead we maintain the 'struct lflow' pointer.
> >      With this we don't need to maintain additional hmap of lflows.
> >
> >
> >
> > Numan Siddique (16):
> >   northd: Refactor the northd change tracking.
> >   tests: Add a couple of tests in ovn-northd for I-P.
> >   northd: Move router ports SB PB options sync to sync_to_sb_pb node.
> >   northd: Add a new engine 'lr_nat' to manage lr NAT data.
> >   northd: Add a new engine 'lr_stateful' to manage lr's stateful data.
> >   northd:  Generate router's stateful flows using lr_stateful data.
> >   northd: Add a new node 'ls_stateful'.
> >   northd: Refactor lflow management into a separate module.
> >   northd: Use lflow_ref when adding all logical flows.
> >   northd:  Move ovn_lb_datapaths from lib to northd module.
> >   northd: Handle lb changes in lflow engine.
> >   northd: Add lr_stateful handler for lflow engine node.
> >   northd: Add ls_stateful handler for lflow engine node.
> >   northd:  Add I-P for NB_Global and SB_Global.
> >   northd: Add a noop handler for northd SB mac binding.
> >   northd: Add northd change handler for sync_to_sb_lb node.
> >
> >  lib/lb.c                  |   96 -
> >  lib/lb.h                  |   57 -
> >  lib/ovn-util.c            |   17 +-
> >  lib/ovn-util.h            |    2 +-
> >  lib/stopwatch-names.h     |    3 +
> >  northd/aging.c            |   21 +-
> >  northd/automake.mk        |   12 +-
> >  northd/en-global-config.c |  588 ++++
> >  northd/en-global-config.h |   65 +
> >  northd/en-lflow.c         |  122 +-
> >  northd/en-lflow.h         |    8 +
> >  northd/en-lr-nat.c        |  423 +++
> >  northd/en-lr-nat.h        |  130 +
> >  northd/en-lr-stateful.c   |  678 ++++
> >  northd/en-lr-stateful.h   |  136 +
> >  northd/en-ls-stateful.c   |  448 +++
> >  northd/en-ls-stateful.h   |  104 +
> >  northd/en-northd.c        |   70 +-
> >  northd/en-northd.h        |    2 +-
> >  northd/en-port-group.h    |    3 +
> >  northd/en-sync-from-sb.c  |    2 +-
> >  northd/en-sync-sb.c       |  510 ++-
> >  northd/inc-proc-northd.c  |   71 +-
> >  northd/lflow-mgr.c        | 1078 ++++++
> >  northd/lflow-mgr.h        |  192 ++
> >  northd/northd.c           | 6510 ++++++++++++++++---------------------
> >  northd/northd.h           |  593 +++-
> >  northd/ovn-northd.c       |    7 +
> >  tests/ovn-macros.at       |   44 +
> >  tests/ovn-northd.at       |  953 +++++-
> >  30 files changed, 8825 insertions(+), 4120 deletions(-)
> >  create mode 100644 northd/en-global-config.c
> >  create mode 100644 northd/en-global-config.h
> >  create mode 100644 northd/en-lr-nat.c
> >  create mode 100644 northd/en-lr-nat.h
> >  create mode 100644 northd/en-lr-stateful.c
> >  create mode 100644 northd/en-lr-stateful.h
> >  create mode 100644 northd/en-ls-stateful.c
> >  create mode 100644 northd/en-ls-stateful.h
> >  create mode 100644 northd/lflow-mgr.c
> >  create mode 100644 northd/lflow-mgr.h
> >
> > --
> > 2.41.0
> >
> > _______________________________________________
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to