Re: [ovs-dev] [PATCH] dpdk: expose cpu usage stats on telemetry socket

2023-09-12 Thread Eelco Chaudron


On 11 Sep 2023, at 12:41, Robin Jarry wrote:

> Hey Kevin,
>
> Kevin Traynor, Sep 07, 2023 at 15:37:
>> This came up in conversation with other maintainers as I mentioned I was
>> reviewing, and the question raised was: why add this? If you want these
>> values exposed, wouldn't it be better to add them to ovsdb?
>
> That's a good point. I had considered using ovsdb but it seemed to me
> less suitable for a few reasons:
>
> * I had understood that ovsdb is a configuration database, not a state
>   reporting database.
>
> * To have reliable and up-to-date numbers, ovs would need to push them
>   at a high rate to the database so that clients do not get outdated
>   cpu usage. The DPDK telemetry socket is real-time; the current
>   numbers are returned on every request.
>
> * I would need to define a custom schema / table to store structured
>   information in the db. The DPDK telemetry socket already has a schema
>   defined for this.
>
> * Accessing ovsdb requires a library, making it more complex to use for
>   telemetry scrapers. The DPDK telemetry socket can be accessed with
>   a standalone python script with no external dependencies[1].
>
> [1]: 
> https://github.com/rjarry/dpdk/blob/main/usertools/prometheus-dpdk-exporter.py#L135-L143
>
> Maybe my observations are wrong; please do correct me if they are.
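To make the point in [1] concrete, a minimal standalone client could look like the sketch below. The default socket path, the JSON greeting with its "max_output_len" field, and the request/response framing are assumptions based on common DPDK telemetry conventions, not taken from the linked script:

```python
import json
import socket

# Default DPDK runtime socket path (assumption; depends on --file-prefix).
TELEMETRY_SOCKET = "/var/run/dpdk/rte/dpdk_telemetry.v2"


def telemetry_query(command, path=TELEMETRY_SOCKET):
    """Send a single command to the DPDK telemetry socket and return the
    parsed JSON reply.  Only the Python standard library is needed."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        # On connect, the server sends a JSON greeting that includes the
        # maximum reply size; use it to size the read buffer.
        greeting = json.loads(sock.recv(1024).decode())
        max_len = greeting["max_output_len"]
        sock.sendall(command.encode())
        return json.loads(sock.recv(max_len).decode())
    finally:
        sock.close()
```

A scraper would call telemetry_query("/eal/lcore/usage") periodically and post-process the cumulative counters it gets back.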

I feel like if we do need another way of getting (real-time) statistics out of 
OVS, we should use the same communication channel as the other ovs-xxx 
utilities are using. But rather than returning text-based responses, we might 
be able to make it JSON (which is already used by the database). I know that 
Adrian is already investigating machine-readable output for some existing 
utilities; maybe it can be extended for the (pmd) statistics use case.

Using something like the DPDK telemetry socket might not work for other use 
cases where DPDK is not in play.

>> Are you looking for individual lcore usage with identification of that
>> pmd? or overall aggregate usage ?
>>
>> I ask because it will report lcore ids, which would need to be mapped to
>> pmd core ids for anything regarding individual pmds.
>>
>> That can be found in ovs-vswitchd.log or checked locally with
>> 'ovs-appctl dpdk/lcore-list', but if those were available, then the
>> user would not be using dpdk telemetry anyway.
>
> I would assume that the important data is the aggregate usage for
> overall monitoring and resource planning. Individual pmd usage can be
> accessed for fine tuning and debugging via appctl.
>
>> These stats are cumulative, so in the absence of 'ovs-appctl
>> dpif-netdev/pmd-stats-clear' that would need to be taken care of with
>> some post-processing by whatever is pulling these stats - otherwise
>> you'll get cumulative stats for an unknown time period and an unknown
>> traffic profile (e.g. it would be counting before any traffic started).
>>
>> These might also be reset with pmd-stats-clear independently, so that
>> would need to be accounted for too.
>
> The only important data point that we need is the ratio between
> busy/(busy + idle) over a specified delta, which any scraper can
> compute. I consider these numbers like any other counter that can
> eventually be reset.
>
> See this reply from Morten Brørup on dpdk-dev for more context:
>
> https://lore.kernel.org/dpdk-dev/98cbd80474fa8b44bf855df32c47dc35d87...@smartserver.smartshare.dk/
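As a sketch of that scraper-side post-processing: given two cumulative samples shaped like a "/eal/lcore/usage" reply (the "busy_cycles" field name is an assumption based on the DPDK schema), the busy fraction over the interval could be computed as:

```python
def lcore_busy_fraction(prev, curr):
    """Compute busy/(busy + idle) per lcore from two cumulative samples.

    prev/curr are dicts with parallel lists of cumulative counters, as
    returned by "/eal/lcore/usage".  A counter reset (e.g. after
    pmd-stats-clear) shows up as a negative delta; skip that interval
    instead of reporting a bogus value.
    """
    usage = {}
    for i, lcore in enumerate(curr["lcore_ids"]):
        d_total = curr["total_cycles"][i] - prev["total_cycles"][i]
        d_busy = curr["busy_cycles"][i] - prev["busy_cycles"][i]
        if d_total <= 0 or d_busy < 0:
            continue  # counter reset or no progress; wait for next sample
        usage[lcore] = d_busy / d_total
    return usage
```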
>
>> Another thing I noticed is that without the pmd-sleep info the stats in
>> isolation can be misleading. Example below:
>>
>> With low rate traffic and clearing stats between 10 sec runs
>>
>> 2023-09-07T13:14:56Z|00158|dpif_netdev|INFO|PMD max sleep request is 0
>> usecs.
>> 2023-09-07T13:14:56Z|00159|dpif_netdev|INFO|PMD load based sleeps are
>> disabled.
>>
>> Time: 13:15:06.842
>> Measurement duration: 10.009 s
>>
>> pmd thread numa_id 0 core_id 8:
>>
>>Iterations: 51712564  (0.19 us/it)
>>- Used TSC cycles:   26021354654  (100.0 % of total cycles)
>>- idle iterations:  51710963  ( 99.9 % of used cycles)
>>- busy iterations:  1601  (  0.1 % of used cycles)
>>- sleep iterations:0  (  0.0 % of iterations)
>> ^^^ can see here that pmd does not sleep and is 0.1% busy
>>
>>Sleep time (us):   0  (  0 us/iteration avg.)
>>Rx packets:37250  (4 Kpps, 866 cycles/pkt)
>>Datapath passes:   37250  (1.00 passes/pkt)
>>- PHWOL hits:  0  (  0.0 %)
>>- MFEX Opt hits:   0  (  0.0 %)
>>- Simple Match hits:   37250  (100.0 %)
>>- EMC hits:0  (  0.0 %)
>>- SMC hits:0  (  0.0 %)
>>- Megaflow hits:   0  (  0.0 %, 0.00 subtbl lookups/hit)
>>- Upcalls: 0  (  0.0 %, 0.0 us/upcall)
>>- Lost upcalls:0  (  0.0 %)
>>Tx packets:0
>>
>> {
>>"/eal/lcore/usage": {
>>  "lcore_ids": [
>>1
>>  ],
>>  "total_cycles":

Re: [ovs-dev] [PATCH net-next] net: dst: remove unnecessary input parameter in dst_alloc and dst_init

2023-09-12 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net-next.git (main)
by Paolo Abeni :

On Mon, 11 Sep 2023 20:50:45 +0800 you wrote:
> Since commit 1202cdd66531 ("Remove DECnet support from kernel") was
> merged, all callers pass an initial_ref value of 1 when they call
> dst_alloc(). Therefore, remove initial_ref from the dst_alloc()
> declaration and replace initial_ref with 1 in dst_alloc().
> Likewise, all callers of dst_init() pass an initial_ref value of 1.
> Therefore, remove the input parameter initial_ref of dst_init() and
> replace initial_ref with the value 1 in dst_init().
> 
> [...]

Here is the summary with links:
  - [net-next] net: dst: remove unnecessary input parameter in dst_alloc and 
dst_init
https://git.kernel.org/netdev/net-next/c/762c8dc7f269

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn branch-21.12 1/2] Split out code to handle port binding db updates

2023-09-12 Thread Ales Musil
From: Ihar Hrachyshka 

This function will later be used to handle port binding updates for
postponed (throttled) bindings.

Conflicts:
controller/binding.c

Signed-off-by: Ihar Hrachyshka 
Acked-by: Mark Michelson 
Signed-off-by: Numan Siddique 
(cherry picked from commit 3103487e087b27b1b3577afba016403fd1ac3093)
(cherry picked from commit 79cba201419cfbfa00876404dc9021b6d135f6da)
Signed-off-by: Mark Michelson 
(cherry picked from commit 3af319c8ef5e732f4d1159ca23c276e0846b9003)
---
 controller/binding.c | 248 ++-
 1 file changed, 129 insertions(+), 119 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 28edde26c..eb9de261c 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -2272,6 +2272,134 @@ consider_patch_port_for_local_datapaths(const struct 
sbrec_port_binding *pb,
 }
 }
 
+static bool
+handle_updated_port(struct binding_ctx_in *b_ctx_in,
+struct binding_ctx_out *b_ctx_out,
+const struct sbrec_port_binding *pb,
+struct hmap *qos_map_ptr)
+{
+/* Loop to handle create and update changes only. */
+if (sbrec_port_binding_is_deleted(pb)) {
+return true;
+}
+
+update_active_pb_ras_pd(pb, b_ctx_out->local_datapaths,
+b_ctx_out->local_active_ports_ipv6_pd,
+"ipv6_prefix_delegation");
+
+update_active_pb_ras_pd(pb, b_ctx_out->local_datapaths,
+b_ctx_out->local_active_ports_ras,
+"ipv6_ra_send_periodic");
+
+enum en_lport_type lport_type = get_lport_type(pb);
+
+struct binding_lport *b_lport =
+binding_lport_find(&b_ctx_out->lbinding_data->lports,
+   pb->logical_port);
+if (b_lport) {
+ovs_assert(b_lport->pb == pb);
+
+if (b_lport->type != lport_type) {
+b_lport->type = lport_type;
+}
+
+if (b_lport->lbinding) {
+if (!local_binding_handle_stale_binding_lports(
+b_lport->lbinding, b_ctx_in, b_ctx_out, qos_map_ptr)) {
+return false;
+}
+}
+}
+
+bool handled = true;
+
+switch (lport_type) {
+case LP_VIF:
+case LP_CONTAINER:
+case LP_VIRTUAL:
+handled = handle_updated_vif_lport(pb, lport_type, b_ctx_in,
+   b_ctx_out, qos_map_ptr);
+break;
+
+case LP_LOCALPORT:
+handled = consider_localport(pb, b_ctx_in, b_ctx_out);
+break;
+
+case LP_PATCH:
+update_related_lport(pb, b_ctx_out);
+consider_patch_port_for_local_datapaths(pb, b_ctx_in, b_ctx_out);
+break;
+
+case LP_VTEP:
+update_related_lport(pb, b_ctx_out);
+/* VTEP lports are claimed/released by ovn-controller-vteps.
+ * We are not sure what changed. */
+b_ctx_out->non_vif_ports_changed = true;
+break;
+
+case LP_L2GATEWAY:
+handled = consider_l2gw_lport(pb, b_ctx_in, b_ctx_out);
+break;
+
+case LP_L3GATEWAY:
+handled = consider_l3gw_lport(pb, b_ctx_in, b_ctx_out);
+break;
+
+case LP_CHASSISREDIRECT:
+handled = consider_cr_lport(pb, b_ctx_in, b_ctx_out);
+if (!handled) {
+break;
+}
+const char *distributed_port = smap_get(&pb->options,
+"distributed-port");
+if (!distributed_port) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(&rl, "No distributed-port option set for "
+ "chassisredirect port %s", pb->logical_port);
+break;
+}
+const struct sbrec_port_binding *distributed_pb
+= lport_lookup_by_name(b_ctx_in->sbrec_port_binding_by_name,
+   distributed_port);
+if (!distributed_pb) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(&rl, "No port binding record for distributed "
+ "port %s referred by chassisredirect port %s",
+ distributed_port, pb->logical_port);
+break;
+}
+consider_patch_port_for_local_datapaths(distributed_pb, b_ctx_in,
+b_ctx_out);
+break;
+
+case LP_EXTERNAL:
+handled = consider_external_lport(pb, b_ctx_in, b_ctx_out);
+update_ld_external_ports(pb, b_ctx_out->local_datapaths);
+break;
+
+case LP_LOCALNET: {
+consider_localnet_lport(pb, b_ctx_in, b_ctx_out, qos_map_ptr);
+
+struct shash bridge_mappings =
+SHASH_INITIALIZER(&bridge_mappings);
+add_ovs_bridge_mappings(b_ctx_in->ovs_table,
+b_ctx_in->bridge_table,
+&bridge_mappi

[ovs-dev] [PATCH ovn branch-21.12 2/2] controller: throttle port claim attempts

2023-09-12 Thread Ales Musil
From: Ihar Hrachyshka 

When multiple chassis are fighting for the same port (requested-chassis
is not set, e.g. for gateway ports), they may produce an unreasonable
number of chassis field updates in a very short time frame (hundreds of
updates in several seconds). This puts unnecessary load on OVN as well
as any db notification consumers trying to keep up with the barrage.

This patch throttles port claim attempts so that they don't happen more
frequently than once per 0.5 seconds.

Conflicts:
  controller/binding.c
  controller/binding.h
  controller/ovn-controller.c

Reported: https://bugzilla.redhat.com/show_bug.cgi?id=1974898
Signed-off-by: Ihar Hrachyshka 
Acked-by: Mark Michelson 
Signed-off-by: Numan Siddique 
(cherry picked from commit 4dc4bc7fdb848bcc626becbd2c80ffef8a39ff9a)
(cherry picked from commit 887a8df4f4aa08a4a87b42f7aa684ed7e9aff9a1)
Signed-off-by: Mark Michelson 
(cherry picked from commit 2c98163e024f0543d84df44f9c0840ce0347e2bc)
---
 controller/binding.c| 121 +++-
 controller/binding.h|  10 +++
 controller/ovn-controller.c |  49 +++
 tests/ovn.at|  41 
 4 files changed, 218 insertions(+), 3 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index eb9de261c..fad84945f 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -48,6 +48,67 @@ VLOG_DEFINE_THIS_MODULE(binding);
 
 #define OVN_QOS_TYPE "linux-htb"
 
+#define CLAIM_TIME_THRESHOLD_MS 500
+
+struct claimed_port {
+long long int last_claimed;
+};
+
+static struct shash _claimed_ports = SHASH_INITIALIZER(&_claimed_ports);
+static struct sset _postponed_ports = SSET_INITIALIZER(&_postponed_ports);
+
+struct sset *
+get_postponed_ports(void)
+{
+return &_postponed_ports;
+}
+
+static long long int
+get_claim_timestamp(const char *port_name)
+{
+struct claimed_port *cp = shash_find_data(&_claimed_ports, port_name);
+return cp ? cp->last_claimed : 0;
+}
+
+static void
+register_claim_timestamp(const char *port_name, long long int t)
+{
+struct claimed_port *cp = shash_find_data(&_claimed_ports, port_name);
+if (!cp) {
+cp = xzalloc(sizeof *cp);
+shash_add(&_claimed_ports, port_name, cp);
+}
+cp->last_claimed = t;
+}
+
+static void
+cleanup_claimed_port_timestamps(void)
+{
+long long int now = time_msec();
+struct shash_node *node;
+SHASH_FOR_EACH_SAFE (node, &_claimed_ports) {
+struct claimed_port *cp = (struct claimed_port *) node->data;
+if (now - cp->last_claimed >= 5 * CLAIM_TIME_THRESHOLD_MS) {
+free(cp);
+shash_delete(&_claimed_ports, node);
+}
+}
+}
+
+/* Schedule any pending binding work. Runs within the main ovn-controller
+ * thread context. */
+void
+binding_wait(void)
+{
+const char *port_name;
+SSET_FOR_EACH (port_name, &_postponed_ports) {
+long long int t = get_claim_timestamp(port_name);
+if (t) {
+poll_timer_wait_until(t + CLAIM_TIME_THRESHOLD_MS);
+}
+}
+}
+
 struct qos_queue {
 struct hmap_node node;
 uint32_t queue_id;
@@ -920,6 +981,21 @@ claimed_lport_set_up(const struct sbrec_port_binding *pb,
 }
 }
 
+static bool
+lport_maybe_postpone(const char *port_name, long long int now,
+ struct sset *postponed_ports)
+{
+long long int last_claimed = get_claim_timestamp(port_name);
+if (now - last_claimed >= CLAIM_TIME_THRESHOLD_MS) {
+return false;
+}
+
+sset_add(postponed_ports, port_name);
+VLOG_DBG("Postponed claim on logical port %s.", port_name);
+
+return true;
+}
+
 /* Returns false if lport is not claimed due to 'sb_readonly'.
  * Returns true otherwise.
  */
@@ -930,7 +1006,8 @@ claim_lport(const struct sbrec_port_binding *pb,
 const struct ovsrec_interface *iface_rec,
 bool sb_readonly, bool notify_up,
 struct hmap *tracked_datapaths,
-struct if_status_mgr *if_mgr)
+struct if_status_mgr *if_mgr,
+struct sset *postponed_ports)
 {
 if (!sb_readonly) {
 claimed_lport_set_up(pb, parent_pb, chassis_rec, notify_up, if_mgr);
@@ -941,7 +1018,12 @@ claim_lport(const struct sbrec_port_binding *pb,
 return false;
 }
 
+long long int now = time_msec();
 if (pb->chassis) {
+if (lport_maybe_postpone(pb->logical_port, now,
+ postponed_ports)) {
+return true;
+}
 VLOG_INFO("Changing chassis for lport %s from %s to %s.",
 pb->logical_port, pb->chassis->name,
 chassis_rec->name);
@@ -957,6 +1039,9 @@ claim_lport(const struct sbrec_port_binding *pb,
 if (tracked_datapaths) {
 update_lport_tracking(pb, tracked_datapaths, true);
 }
+
+register_claim_timestamp(pb->logical_port, now);
+sset_find_and_

Re: [ovs-dev] [PATCH ovn branch-21.12 1/2] Split out code to handle port binding db updates

2023-09-12 Thread 0-day Robot
Bleep bloop.  Greetings Ales Musil, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Numan Siddique , Mark Michelson 

Lines checked: 292, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot


Re: [ovs-dev] [PATCH ovn branch-21.12 2/2] controller: throttle port claim attempts

2023-09-12 Thread 0-day Robot
Bleep bloop.  Greetings Ales Musil, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Numan Siddique , Mark Michelson 

Lines checked: 418, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot


Re: [ovs-dev] [PATCH] utilities: Add kernel_delay.py script to debug a busy Linux kernel.

2023-09-12 Thread Eelco Chaudron


On 11 Sep 2023, at 14:00, Adrian Moreno wrote:

> On 8/21/23 17:41, Eelco Chaudron wrote:
>> This patch adds a utility that can be used to determine if
>> an issue is related to a lack of Linux kernel resources.
>>
>> This tool is also featured in a Red Hat developers blog article:
>>
>> https://developers.redhat.com/articles/2023/07/24/troubleshooting-open-vswitch-kernel-blame
>>
>> Signed-off-by: Eelco Chaudron 
>> ---
>>   utilities/automake.mk   |4
>>   utilities/usdt-scripts/kernel_delay.py  | 1397 
>> +++
>>   utilities/usdt-scripts/kernel_delay.rst |  594 +
>>   3 files changed, 1995 insertions(+)
>>   create mode 100755 utilities/usdt-scripts/kernel_delay.py
>>   create mode 100644 utilities/usdt-scripts/kernel_delay.rst
>>
>
> I have some comments below but overall an awesome script! Thanks Eelco.

Thanks Adrian for the review! I’ve included your comments and will send out 
a v4 soon.

Below are some comments on stuff I kept as is.

Cheers,

Eelco



>> +import pytz
>
> IIRC it's the first time we add this dependency.
> I think usdt-script dependencies are starting to grow. Have you considered 
> adding a requirements.txt to document them all and make it easier for users 
> to consume these scripts? Alternatively, we should maybe add it to the p

I’ve added the dependencies in the header of the script. I do not think a 
requirements.txt will work, as we would need one for each script. Also, the BCC 
dependencies need to be installed manually or through the distribution package.



>> +/*
>> + * For measuring the hard irq time, we need the following.
>> + */
>
> nits:
> - These comments seem to introduce fairly independent sections of ebpf code 
> (which makes reading it much easier), but in this case the above "section" is 
> incomplete without the below tracepoints.

Yes, this is true, but as they record different stats, it was even worse when I 
moved them into one section, so for now I think keeping them separate will be 
more visually appealing ;)




>> +
>> +def _get_kprobe_c_code(self, function_name, function_content):
>
> Argument function_name is unused.
>

Correct, wanted to keep all function definitions the same, see comment below.



>> +if BPF.kernel_struct_has_field(b'task_struct', b'state') == 1:
>> +source = source.replace('', 'state')
>> +else:
>> +source = source.replace('', '__state')
>> +
>
> Not that it makes a huge difference but have you considered using 
> bpf_core_field_exists() directly on the ebpf side?

With bpf_core_field_exists() it would execute code each time, so I decided to 
just make the BPF code static.
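That static approach can be sketched like this; the template, placeholder token, and helper name are illustrative, not the script's actual identifiers:

```python
# Minimal sketch of load-time source specialization: the task_struct
# member name is substituted once, before the eBPF program is compiled,
# so the generated object carries no per-event runtime branch.
EBPF_TEMPLATE = """
static inline long get_task_state(struct task_struct *t)
{
    return t-><STATE_FIELD>;
}
"""


def render_ebpf_source(kernel_has_state_field):
    """Pick the right member name based on what the running kernel's
    task_struct actually has (e.g. via BPF.kernel_struct_has_field)."""
    field = "state" if kernel_has_state_field else "__state"
    return EBPF_TEMPLATE.replace("<STATE_FIELD>", field)
```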



>> index 0..95e98db34
>> --- /dev/null
>> +++ b/utilities/usdt-scripts/kernel_delay.rst
>
> What do you think about moving this file to (or linking it from) the 
> Documentation so it gets published?

I looked at this, but for now, I decided to keep the debug script stuff 
together.



[ovs-dev] [PATCH net-next] net: dst: remove unnecessary input parameter in dst_alloc and dst_init

2023-09-12 Thread Zhengchao Shao via dev
Since commit 1202cdd66531 ("Remove DECnet support from kernel") was
merged, all callers pass an initial_ref value of 1 when they call
dst_alloc(). Therefore, remove initial_ref from the dst_alloc()
declaration and replace initial_ref with 1 in dst_alloc().
Likewise, all callers of dst_init() pass an initial_ref value of 1.
Therefore, remove the input parameter initial_ref of dst_init() and
replace initial_ref with the value 1 in dst_init().

Signed-off-by: Zhengchao Shao 
---
 include/net/dst.h |  4 ++--
 net/core/dst.c| 10 +-
 net/ipv4/route.c  |  6 +++---
 net/ipv6/route.c  |  4 ++--
 net/openvswitch/actions.c |  4 ++--
 net/sched/sch_frag.c  |  4 ++--
 net/xfrm/xfrm_policy.c|  2 +-
 7 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index 78884429deed..f8b8599a0600 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -392,10 +392,10 @@ static inline int dst_discard(struct sk_buff *skb)
 {
return dst_discard_out(&init_net, skb->sk, skb);
 }
-void *dst_alloc(struct dst_ops *ops, struct net_device *dev, int initial_ref,
+void *dst_alloc(struct dst_ops *ops, struct net_device *dev,
int initial_obsolete, unsigned short flags);
 void dst_init(struct dst_entry *dst, struct dst_ops *ops,
- struct net_device *dev, int initial_ref, int initial_obsolete,
+ struct net_device *dev, int initial_obsolete,
  unsigned short flags);
 struct dst_entry *dst_destroy(struct dst_entry *dst);
 void dst_dev_put(struct dst_entry *dst);
diff --git a/net/core/dst.c b/net/core/dst.c
index 980e2fd2f013..6838d3212c37 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -45,7 +45,7 @@ const struct dst_metrics dst_default_metrics = {
 EXPORT_SYMBOL(dst_default_metrics);
 
 void dst_init(struct dst_entry *dst, struct dst_ops *ops,
- struct net_device *dev, int initial_ref, int initial_obsolete,
+ struct net_device *dev, int initial_obsolete,
  unsigned short flags)
 {
dst->dev = dev;
@@ -66,7 +66,7 @@ void dst_init(struct dst_entry *dst, struct dst_ops *ops,
dst->tclassid = 0;
 #endif
dst->lwtstate = NULL;
-   rcuref_init(&dst->__rcuref, initial_ref);
+   rcuref_init(&dst->__rcuref, 1);
INIT_LIST_HEAD(&dst->rt_uncached);
dst->__use = 0;
dst->lastuse = jiffies;
@@ -77,7 +77,7 @@ void dst_init(struct dst_entry *dst, struct dst_ops *ops,
 EXPORT_SYMBOL(dst_init);
 
 void *dst_alloc(struct dst_ops *ops, struct net_device *dev,
-   int initial_ref, int initial_obsolete, unsigned short flags)
+   int initial_obsolete, unsigned short flags)
 {
struct dst_entry *dst;
 
@@ -90,7 +90,7 @@ void *dst_alloc(struct dst_ops *ops, struct net_device *dev,
if (!dst)
return NULL;
 
-   dst_init(dst, ops, dev, initial_ref, initial_obsolete, flags);
+   dst_init(dst, ops, dev, initial_obsolete, flags);
 
return dst;
 }
@@ -270,7 +270,7 @@ static void __metadata_dst_init(struct metadata_dst *md_dst,
struct dst_entry *dst;
 
dst = &md_dst->dst;
-   dst_init(dst, &dst_blackhole_ops, NULL, 1, DST_OBSOLETE_NONE,
+   dst_init(dst, &dst_blackhole_ops, NULL, DST_OBSOLETE_NONE,
 DST_METADATA | DST_NOCOUNT);
memset(dst + 1, 0, sizeof(*md_dst) + optslen - sizeof(*dst));
md_dst->type = type;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 66f419e7f9a7..fb3045692b99 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1630,7 +1630,7 @@ struct rtable *rt_dst_alloc(struct net_device *dev,
 {
struct rtable *rt;
 
-   rt = dst_alloc(&ipv4_dst_ops, dev, 1, DST_OBSOLETE_FORCE_CHK,
+   rt = dst_alloc(&ipv4_dst_ops, dev, DST_OBSOLETE_FORCE_CHK,
   (noxfrm ? DST_NOXFRM : 0));
 
if (rt) {
@@ -1658,7 +1658,7 @@ struct rtable *rt_dst_clone(struct net_device *dev, 
struct rtable *rt)
 {
struct rtable *new_rt;
 
-   new_rt = dst_alloc(&ipv4_dst_ops, dev, 1, DST_OBSOLETE_FORCE_CHK,
+   new_rt = dst_alloc(&ipv4_dst_ops, dev, DST_OBSOLETE_FORCE_CHK,
   rt->dst.flags);
 
if (new_rt) {
@@ -2832,7 +2832,7 @@ struct dst_entry *ipv4_blackhole_route(struct net *net, 
struct dst_entry *dst_or
struct rtable *ort = (struct rtable *) dst_orig;
struct rtable *rt;
 
-   rt = dst_alloc(&ipv4_dst_blackhole_ops, NULL, 1, DST_OBSOLETE_DEAD, 0);
+   rt = dst_alloc(&ipv4_dst_blackhole_ops, NULL, DST_OBSOLETE_DEAD, 0);
if (rt) {
struct dst_entry *new = &rt->dst;
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 9c687b357e6a..9d8dfc7423e4 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -341,7 +341,7 @@ struct rt6_info *ip6_dst_alloc(struct net *net, struct 
net_device *dev,
   int flags)
 {
struct rt6_info *rt = dst

Re: [ovs-dev] [PATCH 1/1] ofproto-dpif-trace: Support detailed output for conjunctive match.

2023-09-12 Thread Simon Horman
On Thu, Sep 07, 2023 at 03:08:41PM +0900, Nobuhiro MIKI wrote:
> A conjunctive flow consists of two or more flows with conjunction
> actions. When input to the ofproto/trace command matches
> a conjunctive flow, it outputs the flows of all dimensions.
> 
> Signed-off-by: Nobuhiro MIKI 

Hi Miki-san,

the CI run for this patch has reported a number of errors.
And I suspect not all of them are transient failures relating
to imperfections in the test-suite. Could you look into this?

https://github.com/ovsrobot/ovs/actions/runs/6106124874/job/16570697146


[ovs-dev] [PATCH v4] utilities: Add kernel_delay.py script to debug a busy Linux kernel.

2023-09-12 Thread Eelco Chaudron
This patch adds a utility that can be used to determine if
an issue is related to a lack of Linux kernel resources.

This tool is also featured in a Red Hat developers blog article:

  
https://developers.redhat.com/articles/2023/07/24/troubleshooting-open-vswitch-kernel-blame

Signed-off-by: Eelco Chaudron 

---
v2: Addressed review comments from Aaron.
v3: Changed wording in documentation.
v4: Addressed review comments from Adrian.

 utilities/automake.mk   |4 
 utilities/usdt-scripts/kernel_delay.py  | 1420 +++
 utilities/usdt-scripts/kernel_delay.rst |  596 +
 3 files changed, 2020 insertions(+)
 create mode 100755 utilities/usdt-scripts/kernel_delay.py
 create mode 100644 utilities/usdt-scripts/kernel_delay.rst

diff --git a/utilities/automake.mk b/utilities/automake.mk
index 37d679f82..9a2114df4 100644
--- a/utilities/automake.mk
+++ b/utilities/automake.mk
@@ -23,6 +23,8 @@ scripts_DATA += utilities/ovs-lib
 usdt_SCRIPTS += \
utilities/usdt-scripts/bridge_loop.bt \
utilities/usdt-scripts/dpif_nl_exec_monitor.py \
+   utilities/usdt-scripts/kernel_delay.py \
+   utilities/usdt-scripts/kernel_delay.rst \
utilities/usdt-scripts/reval_monitor.py \
utilities/usdt-scripts/upcall_cost.py \
utilities/usdt-scripts/upcall_monitor.py
@@ -70,6 +72,8 @@ EXTRA_DIST += \
utilities/docker/debian/build-kernel-modules.sh \
utilities/usdt-scripts/bridge_loop.bt \
utilities/usdt-scripts/dpif_nl_exec_monitor.py \
+   utilities/usdt-scripts/kernel_delay.py \
+   utilities/usdt-scripts/kernel_delay.rst \
utilities/usdt-scripts/reval_monitor.py \
utilities/usdt-scripts/upcall_cost.py \
utilities/usdt-scripts/upcall_monitor.py
diff --git a/utilities/usdt-scripts/kernel_delay.py 
b/utilities/usdt-scripts/kernel_delay.py
new file mode 100755
index 0..636e108be
--- /dev/null
+++ b/utilities/usdt-scripts/kernel_delay.py
@@ -0,0 +1,1420 @@
+#!/usr/bin/env python3
+#
+# Copyright (c) 2022,2023 Red Hat, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at:
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+#
+# Script information:
+# ---
+# This script allows a developer to quickly identify if the issue at hand
+# might be related to the kernel running out of resources or if it really is
+# an Open vSwitch issue.
+#
+# For documentation see the kernel_delay.rst file.
+#
+#
+# Dependencies:
+# -
+#  You need to install the BCC package for your specific platform or build it
+#  yourself using the following instructions:
+#https://raw.githubusercontent.com/iovisor/bcc/master/INSTALL.md
+#
+#  Python needs the following additional packages installed:
+#- pytz
+#- psutil
+#
+#  You can either install your distribution specific package or use pip:
+#pip install pytz psutil
+#
+import argparse
+import datetime
+import os
+import pytz
+import psutil
+import re
+import sys
+import time
+
+import ctypes as ct
+
+try:
+from bcc import BPF, USDT, USDTException
+from bcc.syscall import syscalls, syscall_name
+except ModuleNotFoundError:
+print("ERROR: Can't find the BPF Compiler Collection (BCC) tools!")
+sys.exit(os.EX_OSFILE)
+
+from enum import IntEnum
+
+
+#
+# Actual eBPF source code
+#
+EBPF_SOURCE = """
+#include 
+#include 
+
+#define MONITOR_PID 
+
+enum {
+
+};
+
+struct event_t {
+u64 ts;
+u32 tid;
+u32 id;
+
+int user_stack_id;
+int kernel_stack_id;
+
+u32 syscall;
+u64 entry_ts;
+
+};
+
+BPF_RINGBUF_OUTPUT(events, );
+BPF_STACK_TRACE(stack_traces, );
+BPF_TABLE("percpu_array", uint32_t, uint64_t, dropcnt, 1);
+BPF_TABLE("percpu_array", uint32_t, uint64_t, trigger_miss, 1);
+
+BPF_ARRAY(capture_on, u64, 1);
+static inline bool capture_enabled(u64 pid_tgid) {
+int key = 0;
+u64 *ret;
+
+if ((pid_tgid >> 32) != MONITOR_PID)
+return false;
+
+ret = capture_on.lookup(&key);
+return ret && *ret == 1;
+}
+
+static inline bool capture_enabled__() {
+int key = 0;
+u64 *ret;
+
+ret = capture_on.lookup(&key);
+return ret && *ret == 1;
+}
+
+static struct event_t *get_event(uint32_t id) {
+struct event_t *event = events.ringbuf_reserve(sizeof(struct event_t));
+
+if (!event) {
+dropcnt.increment(0);
+return NULL;
+}
+
+event->id = id;
+event->ts = bpf_ktime_get_ns();
+event->tid = bpf_get_current_pid_tgid();
+
+return event;
+}
+
+static int star

Re: [ovs-dev] [PATCH v4] utilities: Add kernel_delay.py script to debug a busy Linux kernel.

2023-09-12 Thread 0-day Robot
Bleep bloop.  Greetings Eelco Chaudron, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 106 characters long (recommended limit is 79)
#1529 FILE: utilities/usdt-scripts/kernel_delay.rst:54:
  --  


WARNING: Line is 93 characters long (recommended limit is 79)
#1534 FILE: utilities/usdt-scripts/kernel_delay.rst:59:
  NAME NUMBER   COUNT  TOTAL ns 
   MAX ns

WARNING: Line is 93 characters long (recommended limit is 79)
#1535 FILE: utilities/usdt-scripts/kernel_delay.rst:60:
  poll  7   5   184,193,176 
  184,191,520

WARNING: Line is 93 characters long (recommended limit is 79)
#1536 FILE: utilities/usdt-scripts/kernel_delay.rst:61:
  recvmsg        47    494    125,208,756    310,331

WARNING: Line is 93 characters long (recommended limit is 79)
#1537 FILE: utilities/usdt-scripts/kernel_delay.rst:62:
  futex         202      8     18,768,758  4,023,039

WARNING: Line is 93 characters long (recommended limit is 79)
#1538 FILE: utilities/usdt-scripts/kernel_delay.rst:63:
  sendto         44     10        375,861    266,867

WARNING: Line is 93 characters long (recommended limit is 79)
#1539 FILE: utilities/usdt-scripts/kernel_delay.rst:64:
  sendmsg        46      4         43,294     11,213

WARNING: Line is 93 characters long (recommended limit is 79)
#1540 FILE: utilities/usdt-scripts/kernel_delay.rst:65:
  write           1      1          5,949      5,949

WARNING: Line is 93 characters long (recommended limit is 79)
#1541 FILE: utilities/usdt-scripts/kernel_delay.rst:66:
  getrusage      98      1          1,424      1,424

WARNING: Line is 93 characters long (recommended limit is 79)
#1542 FILE: utilities/usdt-scripts/kernel_delay.rst:67:
  read            0      1          1,292      1,292

WARNING: Line is 82 characters long (recommended limit is 79)
#1546 FILE: utilities/usdt-scripts/kernel_delay.rst:71:
  SCHED_CNT    TOTAL ns    MIN ns    MAX ns

WARNING: Line is 86 characters long (recommended limit is 79)
#1554 FILE: utilities/usdt-scripts/kernel_delay.rst:79:
  NAME            COUNT    TOTAL ns    MAX ns

WARNING: Line is 86 characters long (recommended limit is 79)
#1555 FILE: utilities/usdt-scripts/kernel_delay.rst:80:
  eno8303-rx-1        1       3,586     3,586

WARNING: Line is 94 characters long (recommended limit is 79)
#1559 FILE: utilities/usdt-scripts/kernel_delay.rst:84:
  NAME      VECT_NR    COUNT    TOTAL ns    MAX ns

WARNING: Line is 94 characters long (recommended limit is 79)
#1560 FILE: utilities/usdt-scripts/kernel_delay.rst:85:
  net_rx          3        1      17,699    17,699

WARNING: Line is 94 characters long (recommended limit is 79)
#1561 FILE: utilities/usdt-scripts/kernel_delay.rst:86:
  sched           7        6      13,820     3,226

WARNING: Line is 94 characters long (recommended limit is 79)
#1562 FILE: utilities/usdt-scripts/kernel_delay.rst:87:
  rcu             9       16      13,586     1,554

WARNING: Line is 94 characters long (recommended limit is 79)
#1563 FILE: utilities/usdt-scripts/kernel_delay.rst:88:
  timer           1        3      10,259     3,815

WARNING: Line is 106 characters long (recommended limit is 79)
#1664 FILE: utilities/usdt-scripts/kernel_delay.rst:189:
  ----------------------------------------------------------------------------------------------------

WARNING: Line is 87 characters long (recommended limit is 79)
#1679 FILE: utilities/usdt-scripts/kernel_delay.rst:204:
  ENTRY (ns)          EXIT (ns)           TID       COMM            DELTA (us)  SYSCALL

WARNING: Line is 100 characters long (recommended limit is 79)
#1680 FILE: utilities/usdt-scripts/kernel_delay.rst:205:
  ------------------  ------------------  --------  --------------  ----------  -------

WARNING: Line is 89 characters long (recommended limit is 79)
#1681 FILE: utilities/usdt-scripts/kernel_delay.rst:206:
  2161821694935486    2161821695031201    3359699   revalidator14   95          futex

WARNING: Line is 89 characters long (recommended limit is 79)
#1690 FILE: u

Re: [ovs-dev] Scale testing OVN with ovn-heater for OpenStack use cases

2023-09-12 Thread Frode Nordahl
On Thu, Jul 27, 2023 at 1:38 PM Frode Nordahl
 wrote:
>
> On Wed, Jul 12, 2023 at 11:40 AM Frode Nordahl
>  wrote:
> >
> > On Mon, Jul 10, 2023 at 9:42 AM Frode Nordahl
> >  wrote:
> > >
> > > Have now sent out an invite to everyone participating in this thread.
> > >
> > > For the benefit of anyone else wanting to attend I'm also sharing the
> > > video link and other resources here:
> > > Date/Time: Tuesday July 11th 13:30 UTC
> > > Video link: https://meet.google.com/jno-ouvk-rfs
> > > Meeting notes: 
> > > https://docs.google.com/document/d/1IHeWiLuspPiUIzgKZMmV_9i8_0VujUZfw58LFeau8eE
> > >
> > > Agenda:
> > > Introductions / Code of Conduct 10m
> > > approach to development 5m
> > > ovn-heater approach – simulation 10m
> > > simulation of desired state vs. load inducing CMS behavior 10m
> > > first steps 10m
> > > next meeting 5m
> > > AOB
> > >
> > > (Apologies for the top post.)
> >
> > For the benefit of the community, the meeting was recorded and you
> > will find links to the video recording [0] and chat transcript [1]
> > below.
> >
> > 0: https://drive.google.com/file/d/1hNYu9VKF-VJtSa4A-thvVcr1hrDuuFHH/view
> > 1: https://drive.google.com/file/d/1YIPCapwhajvuJjoJCjjct7P018bkSWiP/view
>
> The second meeting was held Tuesday July 25th and for the benefit of
> the community you'll find video recording and chat transcript below:
> Video: https://drive.google.com/file/d/14m0L1ov0Gr6k4eAH9-S0fA5IqE4kd101/view
> Chat transcript:
> https://drive.google.com/file/d/16GKfJx9y_diKQBzWMO8yPOcWbvNB_Vj_/view
> Meeting notes: 
> https://docs.google.com/document/d/1IHeWiLuspPiUIzgKZMmV_9i8_0VujUZfw58LFeau8eE
>
> Many good topics were discussed this time as well; I would like to highlight
> that we agreed to finalize the questionnaire in today's OVN IRC
> meeting, and will follow up on initial results of that in a video
> meeting at the end of August.
>
> In the next meeting I also expect we will discuss concrete tasks for
> completion in the following two-week period. Anyone having an itch to
> scratch for development proposals is free to do so before that as
> well, file issues and post PRs on
> https://github.com/ovn-org/ovn-heater

We had another meeting Tuesday September 5th and for the benefit of
the community you'll find video recording and chat transcript below:
Video: https://drive.google.com/file/d/1kMI1J1FAX3f0s60AXIDQQef26jmFBZfd
Chat transcript:
https://drive.google.com/file/d/1JmlnmowMqxIUHB_jm75v3tgcVphhSD-Z
Meeting notes: 
https://docs.google.com/document/d/1IHeWiLuspPiUIzgKZMmV_9i8_0VujUZfw58LFeau8eE

Highlights are:
* Walkthrough of responses to the questionnaire.
* Volunteers for development (Thanks to Felix Huettner, Martin Kalcok
and Robin Jarry!)
* Focused development of initial OpenStack support is ongoing right
now, and you can follow along in #openvswitch on Libera Chat and in
our in-flight thoughts document [2]. Contributions of any sort are
welcome.

2: 
https://docs.google.com/document/d/1n4_GWf5ztYXOKZBgwEuLJ8A-qGL0diGP1A55plCdUjs


--
Frode Nordahl

> --
> Frode Nordahl
>
> > --
> > Frode Nordahl
> >
> > > --
> > > Frode Nordahl
> > >
> > > On Wed, Jul 5, 2023 at 10:18 AM Haresh Khandelwal  
> > > wrote:
> > > >
> > > > Hi Frode, Please add me (hakha...@redhat.com) as well to the invite 
> > > > list.
> > > >
> > > > Thanks
> > > > -Haresh
> > > >
> > > > On Tue, Jul 4, 2023 at 10:30 PM Roberto Bartzen Acosta via dev <
> > > > ovs-dev@openvswitch.org> wrote:
> > > >
> > > > > I'm interested in attending this meet, Frode. Please include me in the
> > > > > invite list.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Em ter., 4 de jul. de 2023 às 13:06, Numan Siddique 
> > > > > 
> > > > > escreveu:
> > > > >
> > > > > > On Tue, Jul 4, 2023, 8:00 PM Frode Nordahl 
> > > > > > 
> > > > > > wrote:
> > > > > >
> > > > > > > On Tue, Jul 4, 2023 at 4:16 PM Dumitru Ceara 
> > > > > wrote:
> > > > > > > >
> > > > > > > > On 6/30/23 23:07, Terry Wilson wrote:
> > > > > > > > > On Fri, Jun 30, 2023 at 2:26 AM Frode Nordahl
> > > > > > > > >  wrote:
> > > > > > > > >>
> > > > > > > > >> Hello all,
> > > > > > > > >>
> > > > > > > > >> On Tue, May 30, 2023 at 5:16 PM Felix Huettner
> > > > > > > > >>  wrote:
> > > > > > > > >>>
> > > > > > > > >>> Hi Dumitru,
> > > > > > > > >>>
> > > > > > > > >>> On Fri, May 26, 2023 at 01:30:54PM +0200, Dumitru Ceara 
> > > > > > > > >>> wrote:
> > > > > > > >  On 5/24/23 09:37, Felix Huettner wrote:
> > > > > > > > > Hi everyone,
> > > > > > > > 
> > > > > > > >  Hi Felix,
> > > > > > > > 
> > > > > > > > >
> > > > > > > > > Ilya mentioned to me that you will want to bring openstack
> > > > > > > examples to
> > > > > > > > > ovn-heater.
> > > > > > > > >
> > > > > > > > 
> > > > > > > >  Yes, we're discussing that.
> > > > > > > > 
> > > > > > > > > I wanted to ask how to best join this effort. It would be 
> > > > > > > > > great
> > > > > > > for us
> > > > > > > > 
> > > > > > > >  Everyone is we

Re: [ovs-dev] [PATCH] dpdk: expose cpu usage stats on telemetry socket

2023-09-12 Thread Robin Jarry
Eelco Chaudron, Sep 12, 2023 at 09:17:
> I feel like if we do need another way of getting (real time)
> statistics out of OVS, we should use the same communication channel as
> the other ovs-xxx utilities are using. But rather than returning
> text-based responses, we might be able to make it JSON (which is
> already used by the dbase). I know that Adrian is already
> investigating machine-readable output for some existing utilities,
> maybe it can be extended for the (pmd) statistics use case.
>
> Using something like the DPDK telemetry socket, might not work for
> other use cases where DPDK is not in play.

Maybe the telemetry socket code could be reused even when DPDK is not in
play. It already has all the APIs to return structured data and
serialize it to JSON. It would be nice not to have to reinvent the
wheel.
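For reference, the DPDK telemetry socket protocol that makes the standalone-client argument possible is quite small: on connect, the server sends a JSON banner that includes the maximum reply length, after which each command string gets a JSON reply. A minimal sketch, using only the standard library (the socket path assumes DPDK's default runtime directory and "rte" file prefix, which may differ in an OVS deployment):

```python
import json
import socket

# Default DPDK telemetry v2 socket path (assumption: default file prefix).
SOCK_PATH = "/var/run/dpdk/rte/dpdk_telemetry.v2"


def telemetry_query(cmd, sock_path=SOCK_PATH):
    """Send one command to a DPDK telemetry socket and return the JSON reply."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(sock_path)
    # On connect, the server sends an initial JSON banner containing,
    # among other things, the maximum reply length.
    banner = json.loads(sock.recv(1024).decode())
    max_len = banner["max_output_len"]
    sock.send(cmd.encode())
    reply = json.loads(sock.recv(max_len).decode())
    sock.close()
    return reply


# Example: telemetry_query("/") lists the available commands.
```

This is the same pattern the prometheus exporter script linked earlier follows: no DPDK libraries, just a Unix stream socket and JSON.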

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpdk: expose cpu usage stats on telemetry socket

2023-09-12 Thread Eelco Chaudron



On 12 Sep 2023, at 15:19, Robin Jarry wrote:

> Eelco Chaudron, Sep 12, 2023 at 09:17:
>> I feel like if we do need another way of getting (real time)
>> statistics out of OVS, we should use the same communication channel as
>> the other ovs-xxx utilities are using. But rather than returning
>> text-based responses, we might be able to make it JSON (which is
>> already used by the dbase). I know that Adrian is already
>> investigating machine-readable output for some existing utilities,
>> maybe it can be extended for the (pmd) statistics use case.
>>
>> Using something like the DPDK telemetry socket, might not work for
>> other use cases where DPDK is not in play.
>
> Maybe the telemetry socket code could be reused even when DPDK is not in
> play. It already has all the APIs to return structured data and
> serialize it to JSON. It would be nice not to have to reinvent the
> wheel.

But this is a new way of connecting to OVS, and I feel we should keep the
existing infrastructure and not add another connection type. That would make
it easy for existing tools to also benefit from the new format over the
existing connection methods.

Any input from others in the community? Adrian maybe you can share your 
research, ideas?

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] checkpatch: Add checks for the subject line.

2023-09-12 Thread Simon Horman
On Mon, Sep 11, 2023 at 05:06:26PM +0200, Eelco Chaudron wrote:
> This patch adds WARNINGs for the subject line length and the format,
> i.e., the sentence should start with a capital and end with a dot.
> 
> Signed-off-by: Eelco Chaudron 

Acked-by: Simon Horman 
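For illustration, the kind of check being acked could look like the following sketch (hypothetical; the length limit and regex here are assumptions, not the actual checkpatch.py code):

```python
import re

# Hypothetical subject length limit; OVS checkpatch may use a different value.
SUBJECT_LIMIT = 60
# A subject should start with a capital letter and end with a dot.
SUBJECT_RE = re.compile(r"^[A-Z].*\.$")


def check_subject(subject, limit=SUBJECT_LIMIT):
    """Return WARNING strings for a patch subject line."""
    warnings = []
    if len(subject) > limit:
        warnings.append("WARNING: Subject line is too long.")
    if not SUBJECT_RE.match(subject):
        warnings.append("WARNING: Subject should start with a capital "
                        "and end with a dot.")
    return warnings
```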

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] python: idl: Fix last-id update from a monitor reply.

2023-09-12 Thread Simon Horman
On Sat, Sep 09, 2023 at 04:18:36AM +0200, Ilya Maximets wrote:
> While sending a reply to the monitor_cond_since request, server
> includes the last transaction ID.  And it sends new IDs with each
> subsequent update.  Current implementation doesn't use the one
> supplied with a monitor reply, and only takes into account IDs
> provided with monitor updates.  That may cause various issues:
> 
> 1. Performance: During initialization, the last-id is set to zero.
>    If a re-connection happens after receiving a monitor reply,
>but before any monitor update, the client will send a new
>monitor request with an all-zero last-id and will re-download
>the whole database again.
> 
> 2. Data inconsistency: Assuming one of the clients sends a
>transaction, but our python client disconnects before receiving
>    a monitor update for this transaction.  The last-id will point
>to a database state before this transaction.  On re-connection,
>this last-id will be sent and the monitor reply will contain
>a diff with a new data from that transaction.  But if another
>disconnection happens right after that, on second re-connection
>our python client will send another monitor_cond_since with
>exactly the same last-id.  That will cause receiving the same
>set of updates again.  And since it's an update2 message with
>a diff of the data, the client will remove previously applied
>result of the transaction.  At this point it will have a
>    a different database view than the server, potentially leading
>to all sorts of data inconsistency problems.
> 
> Fix that by always updating the last-id from the latest monitor
> reply.
> 
> Fixes: 46d44cf3be0d ("python: idl: Add monitor_cond_since support.")
> Signed-off-by: Ilya Maximets 

Acked-by: Simon Horman 
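Schematically, the fix amounts to taking the transaction ID from the monitor reply as well as from subsequent updates. A sketch with hypothetical helper names (not the actual python/ovs/db/idl.py code):

```python
ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def new_idl_state():
    return {"last_id": ZERO_UUID, "db": {}}


def on_monitor_cond_since_reply(state, result):
    """A monitor_cond_since reply has the shape
    [found, last-txn-id, table-updates2].  The fix: take last-txn-id
    from the reply itself, so a reconnect before the first update does
    not re-send an all-zero last-id and re-download the database."""
    _found, last_txn_id, updates = result
    state["last_id"] = last_txn_id
    state["db"].update(updates)


def on_update3(state, params):
    """An update3 notification carries [monitor-id, last-txn-id,
    table-updates2]; last-id advances with every update as before."""
    _monitor_id, last_txn_id, updates = params
    state["last_id"] = last_txn_id
    state["db"].update(updates)
```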

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v7 1/4] northd: Handle load balancer changes for a logical switch.

2023-09-12 Thread Numan Siddique
On Mon, Sep 11, 2023 at 9:21 PM Han Zhou  wrote:
>
> On Mon, Sep 11, 2023 at 9:01 AM  wrote:
> >
> > From: Numan Siddique 
> >
> > 'lb_data' engine node now also handles logical switch changes.
> > Its data maintains ls to lb related information. i.e if a
> > logical switch sw0 has lb1, lb2 and lb3 associated then
> > it stores this info in its data.  And when a new load balancer
> > lb4 is associated to it, it stores this information in its
> > tracked data so that 'northd' engine node can handle it
> > accordingly.  Tracked data will have information like:
> >   changed ls -> {sw0 : {associated_lbs: [lb4]}
> >
> > The first handler 'northd_lb_data_handler_pre_od' is called before the
> > 'northd_nb_logical_switch_handler' handler and it just creates or
> > deletes the lb_datapaths hmap for the tracked lbs.
> >
> > The northd handler 'northd_lb_data_handler' updates the
> > ovn_lb_datapaths's 'nb_ls_map' bitmap accordingly.
> >
> > Eg.  If the lb_data has the below tracked data:
> >
> > tracked_data = {'crupdated_lbs': [lb1, lb2],
> > 'deleted_lbs': [lb3],
> > 'crupdated_lb_groups': [lbg1, lbg2],
> > 'crupdated_ls_lbs': [{ls: sw0, assoc_lbs: [lb1],
> >  {ls: sw1, assoc_lbs: [lb1, lb2]}
> >
> > The handler northd_lb_data_handler(), creates the
> > ovn_lb_datapaths object for lb1 and lb2 and deletes lb3 from
> > the ovn_lb_datapaths hmap.  It does the same for the created or updated lb
> > groups lbg1 and lbg2 in the ovn_lbgrp_datapaths map.  It also updates the
> > nb_ls_bitmap of lb1 for sw0 and sw1 and nb_ls_bitmap of lb2 for sw1.
> >
> > Reviewed-by: Ales Musil 
> > Acked-by: Mark Michelson 
> > Signed-off-by: Numan Siddique 
> > ---
> >  lib/lb.c |   5 +-
> >  northd/en-lb-data.c  | 176 +++
> >  northd/en-lb-data.h  |  17 
> >  northd/en-lflow.c|   6 ++
> >  northd/en-northd.c   |   6 +-
> >  northd/inc-proc-northd.c |   2 +
> >  northd/northd.c  |  83 +++---
> >  northd/northd.h  |   4 +-
> >  tests/ovn-northd.at  |  56 +
> >  9 files changed, 322 insertions(+), 33 deletions(-)
> >
> > diff --git a/lib/lb.c b/lib/lb.c
> > index 6fd67e2218..e6c9dc2be2 100644
> > --- a/lib/lb.c
> > +++ b/lib/lb.c
> > @@ -1088,7 +1088,10 @@ ovn_lb_datapaths_add_ls(struct ovn_lb_datapaths
> *lb_dps, size_t n,
> >  struct ovn_datapath **ods)
> >  {
> >  for (size_t i = 0; i < n; i++) {
> > -bitmap_set1(lb_dps->nb_ls_map, ods[i]->index);
> > +if (!bitmap_is_set(lb_dps->nb_ls_map, ods[i]->index)) {
> > +bitmap_set1(lb_dps->nb_ls_map, ods[i]->index);
> > +lb_dps->n_nb_ls++;
> > +}
> >  }
> >  }
> >
> > diff --git a/northd/en-lb-data.c b/northd/en-lb-data.c
> > index 8acd9c8cb2..02b1bfd7a4 100644
> > --- a/northd/en-lb-data.c
> > +++ b/northd/en-lb-data.c
> > @@ -39,6 +39,14 @@ static void lb_data_destroy(struct ed_type_lb_data *);
> >  static void build_lbs(const struct nbrec_load_balancer_table *,
> >const struct nbrec_load_balancer_group_table *,
> >struct hmap *lbs, struct hmap *lb_groups);
> > +static void build_od_lb_map(const struct nbrec_logical_switch_table *,
> > + struct hmap *od_lb_map);
> > +static struct od_lb_data *find_od_lb_data(struct hmap *od_lb_map,
> > +  const struct uuid *od_uuid);
> > +static void destroy_od_lb_data(struct od_lb_data *od_lb_data);
> > +static struct od_lb_data *create_od_lb_data(struct hmap *od_lb_map,
> > +const struct uuid *od_uuid);
> > +
> >  static struct ovn_lb_group *create_lb_group(
> >  const struct nbrec_load_balancer_group *, struct hmap *lbs,
> >  struct hmap *lb_groups);
> > @@ -54,6 +62,7 @@ static struct crupdated_lbgrp *
> > struct tracked_lb_data *);
> >  static void add_deleted_lbgrp_to_tracked_data(
> >  struct ovn_lb_group *, struct tracked_lb_data *);
> > +static bool is_ls_lbs_changed(const struct nbrec_logical_switch *nbs);
> >
> >  /* 'lb_data' engine node manages the NB load balancers and load balancer
> >   * groups.  For each NB LB, it creates 'struct ovn_northd_lb' and
> > @@ -80,9 +89,13 @@ en_lb_data_run(struct engine_node *node, void *data)
> >  EN_OVSDB_GET(engine_get_input("NB_load_balancer", node));
> >  const struct nbrec_load_balancer_group_table *nb_lbg_table =
> >  EN_OVSDB_GET(engine_get_input("NB_load_balancer_group", node));
> > +const struct nbrec_logical_switch_table *nb_ls_table =
> > +EN_OVSDB_GET(engine_get_input("NB_logical_switch", node));
> >
> >  lb_data->tracked = false;
> >  build_lbs(nb_lb_table, nb_lbg_table, &lb_data->lbs,
> &lb_data->lbgrps);
> > +build_od_lb_map(nb_ls_table, &lb_data->ls_lb_map);

[ovs-dev] [PATCH ovn v8 0/4] northd: I-P for load balancer and lb groups

2023-09-12 Thread numans
From: Numan Siddique 

This patch series adds the support to handle load balancer and
load balancer group changes incrementally in the "northd" engine
node. Changes to the load balancer and load balancer group columns of
logical switches and routers are also handled incrementally,
provided other columns do not change.

v6 of the series also included lflow I-P handling.  But v7 drops these
patches as there are some concerns with the obj dep mgr usage. lflow I-P
handling patches will be submitted separately.

Below are the scale testing results done with these patches applied
using ovn-heater.  The test ran the scenario  -
ocp-500-density-heavy.yml [1].

With these patches applied (with load balancer I-P handling only in
northd engine node) the results are:

----------------------------------------------------------------------------------------------------------------
                      Min (s)   Median (s)  90%ile (s)  99%ile (s)  Max (s)   Mean (s)  Total (s)   Count  Failed
----------------------------------------------------------------------------------------------------------------
Iteration Total       0.131363  1.189994    3.213526    4.308134    4.394562  1.385713  173.214153  125    0
Namespace.add_ports   0.005176  0.005611    0.006476    0.019867    0.024206  0.006058  0.757188    125    0
WorkerNode.bind_port  0.033776  0.046343    0.054414    0.061719    0.063613  0.046815  11.703773   250    0
WorkerNode.ping_port  0.005156  0.006959    2.044939    3.655328    4.241496  0.627103  156.775832  250    0
----------------------------------------------------------------------------------------------------------------


The results with the present main (Result 3) are:


----------------------------------------------------------------------------------------------------------------
                      Min (s)   Median (s)  90%ile (s)  99%ile (s)  Max (s)   Mean (s)  Total (s)   Count  Failed
----------------------------------------------------------------------------------------------------------------
Iteration Total       3.233795  4.364926    5.400982    6.412803    7.409757  4.792270  599.033790  125    0
Namespace.add_ports   0.005230  0.006564    0.007379    0.019060    0.037490  0.007223  0.902930    125    0
WorkerNode.bind_port  0.033864  0.044052    0.049608    0.054849    0.056196  0.044005  11.001231   250    0
WorkerNode.ping_port  0.005334  2.060477    5.222422    6.267332    7.284001  2.323020  580.754964  250    0
----------------------------------------------------------------------------------------------------------------

v7 -> v8
---
  * Addressed review comments

v6 -> v7
---
  * First 4 patches of v6 are merged in main and branch-23.09 and
patches 9 to 16 are dropped.
  * v7 only has 4 patches now.
  * Addressed review comments.  There is only one handler for lb_data
engine input in northd engine node - northd_handle_lb_data_changes().
In v6 and earlier there were 2
handle functions - northd_handle_lb_data_changes_pre_od() and
northd_handle_lb_data_changes_post_od().

v5 -> v6
---
  * Rebased.  Added 2 more patches (p15 and p16) for LR NAT I-P handling.

v4 -> v5
---
  * 6 new patches are added to the series which handles the LB changes
in the lflow engine node.
v3 -> v4
---
  * Covered more test scenarios.
  * Found a few issues and fixed them.  v3 was not handling the scenario of
a vip getting added or removed from a load balancer.

v2 -> v3

  * v2 was very inefficient in handling the load balancer group changes
and in associating the load balancers of the lb group to the
datapaths. This was the main reason for the regression in the full
recompute time taken.
v3 addressed these by more efficiently handling the lb group changes
incrementally.


Numan Siddique (4):
  northd: Handle load balancer changes for a logical switch.
  northd: Handle load balancer group changes for a logical switch.
  northd: Sync SB Port bindings NAT column in a separate engine node.
  northd: Handle load balancer/group changes for a logical router.

 lib/lb.c |  51 ++-
 lib/lb.h |   9 +
 northd/en-lb-data.c  | 432 -
 northd/en-lb-data.h  |  49 +++
 northd/en-lflow.c|   6 +
 northd/en-northd.c   |  26 +-
 northd/en-northd.h   |   1 +
 northd/en-sync-f

[ovs-dev] [PATCH ovn v8 1/4] northd: Handle load balancer changes for a logical switch.

2023-09-12 Thread numans
From: Numan Siddique 

'lb_data' engine node now also handles logical switch changes.
Its data maintains ls-to-lb related information, i.e., if a
logical switch sw0 has lb1, lb2 and lb3 associated then
it stores this info in its data.  And when a new load balancer
lb4 is associated to it, it stores this information in its
tracked data so that 'northd' engine node can handle it
accordingly.  Tracked data will have information like:
  changed ls -> {sw0 : {associated_lbs: [lb4]}

The first handler 'northd_lb_data_handler_pre_od' is called before the
'northd_nb_logical_switch_handler' handler and it just creates or
deletes the lb_datapaths hmap for the tracked lbs.

The northd handler 'northd_lb_data_handler' updates the
ovn_lb_datapaths's 'nb_ls_map' bitmap accordingly.

Eg.  If the lb_data has the below tracked data:

tracked_data = {'crupdated_lbs': [lb1, lb2],
'deleted_lbs': [lb3],
'crupdated_lb_groups': [lbg1, lbg2],
'crupdated_ls_lbs': [{ls: sw0, assoc_lbs: [lb1],
 {ls: sw1, assoc_lbs: [lb1, lb2]}

The handler northd_lb_data_handler(), creates the
ovn_lb_datapaths object for lb1 and lb2 and deletes lb3 from
the ovn_lb_datapaths hmap.  It does the same for the created or updated lb
groups lbg1 and lbg2 in the ovn_lbgrp_datapaths map.  It also updates the
nb_ls_bitmap of lb1 for sw0 and sw1 and nb_ls_bitmap of lb2 for sw1.

Reviewed-by: Ales Musil 
Acked-by: Mark Michelson 
Signed-off-by: Numan Siddique 
---
 lib/lb.c |   5 +-
 northd/en-lb-data.c  | 176 +++
 northd/en-lb-data.h  |  31 +++
 northd/en-lflow.c|   6 ++
 northd/en-northd.c   |   6 +-
 northd/en-sync-from-sb.c |  10 ++-
 northd/en-sync-sb.c  |  18 ++--
 northd/inc-proc-northd.c |   2 +
 northd/northd.c  |  89 +---
 northd/northd.h  |   4 +-
 tests/ovn-northd.at  |  56 +
 11 files changed, 359 insertions(+), 44 deletions(-)

diff --git a/lib/lb.c b/lib/lb.c
index 6fd67e2218..e6c9dc2be2 100644
--- a/lib/lb.c
+++ b/lib/lb.c
@@ -1088,7 +1088,10 @@ ovn_lb_datapaths_add_ls(struct ovn_lb_datapaths *lb_dps, 
size_t n,
 struct ovn_datapath **ods)
 {
 for (size_t i = 0; i < n; i++) {
-bitmap_set1(lb_dps->nb_ls_map, ods[i]->index);
+if (!bitmap_is_set(lb_dps->nb_ls_map, ods[i]->index)) {
+bitmap_set1(lb_dps->nb_ls_map, ods[i]->index);
+lb_dps->n_nb_ls++;
+}
 }
 }
 
diff --git a/northd/en-lb-data.c b/northd/en-lb-data.c
index 8acd9c8cb2..d854042feb 100644
--- a/northd/en-lb-data.c
+++ b/northd/en-lb-data.c
@@ -39,6 +39,14 @@ static void lb_data_destroy(struct ed_type_lb_data *);
 static void build_lbs(const struct nbrec_load_balancer_table *,
   const struct nbrec_load_balancer_group_table *,
   struct hmap *lbs, struct hmap *lb_groups);
+static void build_od_lb_map(const struct nbrec_logical_switch_table *,
+ struct hmap *od_lb_map);
+static struct od_lb_data *find_od_lb_data(struct hmap *od_lb_map,
+  const struct uuid *od_uuid);
+static void destroy_od_lb_data(struct od_lb_data *od_lb_data);
+static struct od_lb_data *create_od_lb_data(struct hmap *od_lb_map,
+const struct uuid *od_uuid);
+
 static struct ovn_lb_group *create_lb_group(
 const struct nbrec_load_balancer_group *, struct hmap *lbs,
 struct hmap *lb_groups);
@@ -54,6 +62,7 @@ static struct crupdated_lbgrp *
struct tracked_lb_data *);
 static void add_deleted_lbgrp_to_tracked_data(
 struct ovn_lb_group *, struct tracked_lb_data *);
+static bool is_ls_lbs_changed(const struct nbrec_logical_switch *nbs);
 
 /* 'lb_data' engine node manages the NB load balancers and load balancer
  * groups.  For each NB LB, it creates 'struct ovn_northd_lb' and
@@ -80,9 +89,13 @@ en_lb_data_run(struct engine_node *node, void *data)
 EN_OVSDB_GET(engine_get_input("NB_load_balancer", node));
 const struct nbrec_load_balancer_group_table *nb_lbg_table =
 EN_OVSDB_GET(engine_get_input("NB_load_balancer_group", node));
+const struct nbrec_logical_switch_table *nb_ls_table =
+EN_OVSDB_GET(engine_get_input("NB_logical_switch", node));
 
 lb_data->tracked = false;
 build_lbs(nb_lb_table, nb_lbg_table, &lb_data->lbs, &lb_data->lbgrps);
+build_od_lb_map(nb_ls_table, &lb_data->ls_lb_map);
+
 engine_set_node_state(node, EN_UPDATED);
 }
 
@@ -230,18 +243,98 @@ lb_data_load_balancer_group_handler(struct engine_node 
*node, void *data)
 return true;
 }
 
+bool
+lb_data_logical_switch_handler(struct engine_node *node, void *data)
+{
+struct ed_type_lb_data *lb_data = (struct ed_type_lb_data *) data;
+const struct nbrec_logical_switch_table *nb_ls_table =
+EN_OVSDB_GET(e

[ovs-dev] [PATCH ovn v8 2/4] northd: Handle load balancer group changes for a logical switch.

2023-09-12 Thread numans
From: Numan Siddique 

For a given load balancer group 'A', northd engine data maintains
a bitmap of datapaths associated to this lb group.  So when lb group 'A'
gets associated to a logical switch 's1', the bitmap index of 's1' is set
in its bitmap.

In order to handle the load balancer group changes incrementally for a
logical switch, we need to set and clear the bitmap bits accordingly.
And this patch does it.

Reviewed-by: Ales Musil 
Acked-by: Mark Michelson 
Signed-off-by: Numan Siddique 
---
 northd/en-lb-data.c | 102 
 northd/en-lb-data.h |   4 ++
 northd/northd.c |  63 +++
 tests/ovn-northd.at |  30 +
 4 files changed, 155 insertions(+), 44 deletions(-)

diff --git a/northd/en-lb-data.c b/northd/en-lb-data.c
index d854042feb..fd09e719d8 100644
--- a/northd/en-lb-data.c
+++ b/northd/en-lb-data.c
@@ -63,6 +63,7 @@ static struct crupdated_lbgrp *
 static void add_deleted_lbgrp_to_tracked_data(
 struct ovn_lb_group *, struct tracked_lb_data *);
 static bool is_ls_lbs_changed(const struct nbrec_logical_switch *nbs);
+static bool is_ls_lbgrps_changed(const struct nbrec_logical_switch *nbs);
 
 /* 'lb_data' engine node manages the NB load balancers and load balancer
  * groups.  For each NB LB, it creates 'struct ovn_northd_lb' and
@@ -264,12 +265,15 @@ lb_data_logical_switch_handler(struct engine_node *node, 
void *data)
 hmapx_add(&trk_lb_data->deleted_od_lb_data, od_lb_data);
 }
 } else {
-if (!is_ls_lbs_changed(nbs)) {
+bool ls_lbs_changed = is_ls_lbs_changed(nbs);
+bool ls_lbgrps_changed = is_ls_lbgrps_changed(nbs);
+if (!ls_lbs_changed && !ls_lbgrps_changed) {
 continue;
 }
 struct crupdated_od_lb_data *codlb = xzalloc(sizeof *codlb);
 codlb->od_uuid = nbs->header_.uuid;
 uuidset_init(&codlb->assoc_lbs);
+uuidset_init(&codlb->assoc_lbgrps);
 
 struct od_lb_data *od_lb_data =
 find_od_lb_data(&lb_data->ls_lb_map, &nbs->header_.uuid);
@@ -278,38 +282,66 @@ lb_data_logical_switch_handler(struct engine_node *node, 
void *data)
 &nbs->header_.uuid);
 }
 
-struct uuidset *pre_lb_uuids = od_lb_data->lbs;
-od_lb_data->lbs = xzalloc(sizeof *od_lb_data->lbs);
-uuidset_init(od_lb_data->lbs);
-
-for (size_t i = 0; i < nbs->n_load_balancer; i++) {
-const struct uuid *lb_uuid =
-&nbs->load_balancer[i]->header_.uuid;
-uuidset_insert(od_lb_data->lbs, lb_uuid);
+if (ls_lbs_changed) {
+struct uuidset *pre_lb_uuids = od_lb_data->lbs;
+od_lb_data->lbs = xzalloc(sizeof *od_lb_data->lbs);
+uuidset_init(od_lb_data->lbs);
+
+for (size_t i = 0; i < nbs->n_load_balancer; i++) {
+const struct uuid *lb_uuid =
+&nbs->load_balancer[i]->header_.uuid;
+uuidset_insert(od_lb_data->lbs, lb_uuid);
+
+struct uuidset_node *unode = uuidset_find(pre_lb_uuids,
+lb_uuid);
+
+if (!unode || (nbrec_load_balancer_row_get_seqno(
+nbs->load_balancer[i],
+OVSDB_IDL_CHANGE_MODIFY) > 0)) {
+/* Add this lb to the tracked data. */
+uuidset_insert(&codlb->assoc_lbs, lb_uuid);
+changed = true;
+}
+
+if (unode) {
+uuidset_delete(pre_lb_uuids, unode);
+}
+}
+if (!uuidset_is_empty(pre_lb_uuids)) {
+trk_lb_data->has_dissassoc_lbs_from_od = true;
+changed = true;
+}
 
-struct uuidset_node *unode = uuidset_find(pre_lb_uuids,
-  lb_uuid);
+uuidset_destroy(pre_lb_uuids);
+free(pre_lb_uuids);
+}
 
-if (!unode || (nbrec_load_balancer_row_get_seqno(
-nbs->load_balancer[i], OVSDB_IDL_CHANGE_MODIFY) > 0)) {
-/* Add this lb to the tracked data. */
-uuidset_insert(&codlb->assoc_lbs, lb_uuid);
-changed = true;
+if (ls_lbgrps_changed) {
+struct uuidset *pre_lbgrp_uuids = od_lb_data->lbgrps;
+od_lb_data->lbgrps = xzalloc(sizeof *od_lb_data->lbgrps);
+uuidset_init(od_lb_data->lbgrps);
+for (size_t i = 0; i < nbs->n_load_balancer_group; i++) {
+const struct uuid *lbg_uuid =
+&nbs->load

[ovs-dev] [PATCH ovn v8 3/4] northd: Sync SB Port bindings NAT column in a separate engine node.

2023-09-12 Thread numans
From: Numan Siddique 

A new engine node 'sync_to_sb_pb' is added within 'sync_to_sb'
node to sync NAT column of Port bindings table.  This separation
is required in order to add load balancer group I-P handling
in 'northd' engine node (which is handled in the next commit).

'sync_to_sb_pb' engine node can be later expanded to sync other
Port binding columns if required.

Reviewed-by: Ales Musil 
Acked-by: Mark Michelson 
Signed-off-by: Numan Siddique 
---
 northd/en-sync-sb.c  |  52 
 northd/en-sync-sb.h  |   5 +
 northd/inc-proc-northd.c |   8 +-
 northd/northd.c  | 274 +++
 northd/northd.h  |   3 +
 tests/ovn-northd.at  |  17 ++-
 6 files changed, 239 insertions(+), 120 deletions(-)

diff --git a/northd/en-sync-sb.c b/northd/en-sync-sb.c
index 37ec631b6e..aae396a43d 100644
--- a/northd/en-sync-sb.c
+++ b/northd/en-sync-sb.c
@@ -248,6 +248,58 @@ sync_to_sb_lb_northd_handler(struct engine_node *node, 
void *data OVS_UNUSED)
 return true;
 }
 
+/* sync_to_sb_pb engine node functions.
+ * This engine node syncs the SB Port Bindings (partly).
+ * en_northd engine create the SB Port binding rows and
+ * updates most of the columns.
+ * This engine node updates the port binding columns which
+ * needs to be updated after northd engine is run.
+ */
+
+void *
+en_sync_to_sb_pb_init(struct engine_node *node OVS_UNUSED,
+  struct engine_arg *arg OVS_UNUSED)
+{
+return NULL;
+}
+
+void
+en_sync_to_sb_pb_run(struct engine_node *node, void *data OVS_UNUSED)
+{
+const struct engine_context *eng_ctx = engine_get_context();
+struct northd_data *northd_data = engine_get_input_data("northd", node);
+
+sync_pbs(eng_ctx->ovnsb_idl_txn, &northd_data->ls_ports);
+engine_set_node_state(node, EN_UPDATED);
+}
+
+void
+en_sync_to_sb_pb_cleanup(void *data OVS_UNUSED)
+{
+
+}
+
+bool
+sync_to_sb_pb_northd_handler(struct engine_node *node, void *data OVS_UNUSED)
+{
+const struct engine_context *eng_ctx = engine_get_context();
+if (!eng_ctx->ovnsb_idl_txn) {
+return false;
+}
+
+struct northd_data *nd = engine_get_input_data("northd", node);
+if (!nd->change_tracked) {
+return false;
+}
+
+if (!sync_pbs_for_northd_ls_changes(&nd->tracked_ls_changes)) {
+return false;
+}
+
+engine_set_node_state(node, EN_UPDATED);
+return true;
+}
+
 /* static functions. */
 static void
 sync_addr_set(struct ovsdb_idl_txn *ovnsb_txn, const char *name,
diff --git a/northd/en-sync-sb.h b/northd/en-sync-sb.h
index 06d2a57710..f08565eee1 100644
--- a/northd/en-sync-sb.h
+++ b/northd/en-sync-sb.h
@@ -22,4 +22,9 @@ void en_sync_to_sb_lb_run(struct engine_node *, void *data);
 void en_sync_to_sb_lb_cleanup(void *data);
 bool sync_to_sb_lb_northd_handler(struct engine_node *, void *data OVS_UNUSED);
 
+void *en_sync_to_sb_pb_init(struct engine_node *, struct engine_arg *);
+void en_sync_to_sb_pb_run(struct engine_node *, void *data);
+void en_sync_to_sb_pb_cleanup(void *data);
+bool sync_to_sb_pb_northd_handler(struct engine_node *, void *data OVS_UNUSED);
+
 #endif /* end of EN_SYNC_SB_H */
diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
index 303b58d43f..e9e28c4bea 100644
--- a/northd/inc-proc-northd.c
+++ b/northd/inc-proc-northd.c
@@ -144,6 +144,7 @@ static ENGINE_NODE_WITH_CLEAR_TRACK_DATA(port_group, 
"port_group");
 static ENGINE_NODE(fdb_aging, "fdb_aging");
 static ENGINE_NODE(fdb_aging_waker, "fdb_aging_waker");
 static ENGINE_NODE(sync_to_sb_lb, "sync_to_sb_lb");
+static ENGINE_NODE(sync_to_sb_pb, "sync_to_sb_pb");
 static ENGINE_NODE_WITH_CLEAR_TRACK_DATA(lb_data, "lb_data");
 
 void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
@@ -228,15 +229,20 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
  sync_to_sb_lb_northd_handler);
 engine_add_input(&en_sync_to_sb_lb, &en_sb_load_balancer, NULL);
 
+engine_add_input(&en_sync_to_sb_pb, &en_northd,
+ sync_to_sb_pb_northd_handler);
+
 /* en_sync_to_sb engine node syncs the SB database tables from
  * the NB database tables.
  * Right now this engine syncs the SB Address_Set table, Port_Group table
- * SB Meter/Meter_Band tables and SB Load_Balancer table.
+ * SB Meter/Meter_Band tables and SB Load_Balancer table and
+ * (partly) SB Port_Binding table.
  */
 engine_add_input(&en_sync_to_sb, &en_sync_to_sb_addr_set, NULL);
 engine_add_input(&en_sync_to_sb, &en_port_group, NULL);
 engine_add_input(&en_sync_to_sb, &en_sync_meters, NULL);
 engine_add_input(&en_sync_to_sb, &en_sync_to_sb_lb, NULL);
+engine_add_input(&en_sync_to_sb, &en_sync_to_sb_pb, NULL);
 
 engine_add_input(&en_sync_from_sb, &en_northd,
  sync_from_sb_northd_handler);
diff --git a/northd/northd.c b/northd/northd.c
index d500f940b1..f6258d4830 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -3527,8 +3527,6 @@ ovn_po

[ovs-dev] [PATCH ovn v8 4/4] northd: Handle load balancer/group changes for a logical router.

2023-09-12 Thread numans
From: Numan Siddique 

When a logical router gets updated due to load balancer or load balancer
group changes, the change is now incrementally handled first in the
'lb_data' engine node, similar to how logical switch changes are handled.
The tracking data of 'lb_data' is updated similarly so that the northd
engine handler - northd_handle_lb_data_changes() - handles it.

A new handler, northd_handle_lr_changes(), is added in the 'northd' engine
node for logical router changes.  This handler returns true if only the
load balancer or load balancer group columns changed.  It returns
false for any other changes.

northd_handle_lb_data_changes() also sets the logical router
od's lb_ips accordingly.
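The gating behavior of the new handler can be sketched as follows.  This is a
hypothetical simplification with invented names (handle_lr_changes_sketch):
the real handler inspects ovsdb-idl change tracking in C rather than a set of
column names.

```python
# Hypothetical sketch of the northd_handle_lr_changes() gating logic:
# the change is handled incrementally only when every changed
# Logical_Router column is one that the lb_data engine node already
# tracks.  Any other column change falls back to a full recompute
# (the handler returns False).
def handle_lr_changes_sketch(changed_columns):
    tracked = {"load_balancer", "load_balancer_group"}
    return set(changed_columns) <= tracked

print(handle_lr_changes_sketch({"load_balancer"}))           # True
print(handle_lr_changes_sketch({"load_balancer", "ports"}))  # False
```

Returning False here is what triggers the expensive full recompute, which is
why restricting the fallback to non-LB column changes matters for the results
below.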

Below are the scale testing results done with these patches applied
using ovn-heater.  The test ran the scenario  -
ocp-500-density-heavy.yml [1].

With these patches applied (with load balancer I-P handling in the northd
engine node) the results are:

----------------------------------------------------------------------------------------------------------------
                      Min (s)   Median (s)  90%ile (s)  99%ile (s)  Max (s)   Mean (s)  Total (s)   Count  Failed
----------------------------------------------------------------------------------------------------------------
Iteration Total       0.131363  1.189994    3.213526    4.308134    4.394562  1.385713  173.214153  125    0
Namespace.add_ports   0.005176  0.005611    0.006476    0.019867    0.024206  0.006058  0.757188    125    0
WorkerNode.bind_port  0.033776  0.046343    0.054414    0.061719    0.063613  0.046815  11.703773   250    0
WorkerNode.ping_port  0.005156  0.006959    2.044939    3.655328    4.241496  0.627103  156.775832  250    0
----------------------------------------------------------------------------------------------------------------

The results with the present main are:

----------------------------------------------------------------------------------------------------------------
                      Min (s)   Median (s)  90%ile (s)  99%ile (s)  Max (s)   Mean (s)  Total (s)   Count  Failed
----------------------------------------------------------------------------------------------------------------
Iteration Total       3.233795  4.364926    5.400982    6.412803    7.409757  4.792270  599.033790  125    0
Namespace.add_ports   0.005230  0.006564    0.007379    0.019060    0.037490  0.007223  0.902930    125    0
WorkerNode.bind_port  0.033864  0.044052    0.049608    0.054849    0.056196  0.044005  11.001231   250    0
WorkerNode.ping_port  0.005334  2.060477    5.222422    6.267332    7.284001  2.323020  580.754964  250    0
----------------------------------------------------------------------------------------------------------------

A few observations:

 - The total time taken has come down significantly, from 599 seconds to 173.
 - The 99%ile with these patches is 4.3 seconds, compared to 6.4 seconds for
   main.
 - The 90%ile with these patches is 3.2 seconds, compared to 5.4 seconds for
   main.
 - CPU utilization of northd during the test with these patches is between
   100% and 300%, which is almost the same as main.  The main difference is
   that, with these patches, the test duration is shorter and hence the
   overall CPU utilization is lower.

[1] - 
https://github.com/ovn-org/ovn-heater/blob/main/test-scenarios/ocp-500-density-heavy.yml

Reviewed-by: Ales Musil 
Acked-by: Mark Michelson 
Signed-off-by: Numan Siddique 
---
 lib/lb.c |  46 +-
 lib/lb.h |   9 ++
 northd/en-lb-data.c  | 328 +++
 northd/en-lb-data.h  |  14 ++
 northd/en-northd.c   |  20 +++
 northd/en-northd.h   |   1 +
 northd/inc-proc-northd.c |   5 +-
 northd/northd.c  | 245 ++---
 northd/northd.h  |   2 +
 tests/ovn-northd.at  |  42 ++---
 10 files changed, 599 insertions(+), 113 deletions(-)

diff --git a/lib/lb.c b/lib/lb.c
index e6c9dc2be2..d0d562b6fb 100644
--- a/lib/lb.c
+++ b/lib/lb.c
@@ -794,6 +794,7 @@ ovn_lb_group_init(struct ovn_lb_group *lb_group,
 const struct uuid *lb_uuid =
 &nbrec_lb_group->load_balancer[i]->header_.uuid;
 lb_group->lbs[i] = ovn_northd_lb_find(lbs, lb_uuid);
+lb_group->has_routable_lb |= lb_group->lbs[i]->routable;
 }
 }
 
@@ -815,6 +816,7 @@ ovn_lb_group_cleanup(struct ovn_lb_

Re: [ovs-dev] [PATCH ovn v7 1/4] northd: Handle load balancer changes for a logical switch.

2023-09-12 Thread Han Zhou
On Tue, Sep 12, 2023 at 8:38 AM Numan Siddique  wrote:
>
> On Mon, Sep 11, 2023 at 9:21 PM Han Zhou  wrote:
> >
> > On Mon, Sep 11, 2023 at 9:01 AM  wrote:
> > >
> > > From: Numan Siddique 
> > >
> > > 'lb_data' engine node now also handles logical switch changes.
> > > Its data maintains ls to lb related information. i.e if a
> > > logical switch sw0 has lb1, lb2 and lb3 associated then
> > > it stores this info in its data.  And when a new load balancer
> > > lb4 is associated to it, it stores this information in its
> > > tracked data so that 'northd' engine node can handle it
> > > accordingly.  Tracked data will have information like:
> > >   changed ls -> {sw0 : {associated_lbs: [lb4]}
> > >
> > > The first handler 'northd_lb_data_handler_pre_od' is called before the
> > > 'northd_nb_logical_switch_handler' handler and it just creates or
> > > deletes the lb_datapaths hmap for the tracked lbs.
> > >
> > > The northd handler 'northd_lb_data_handler' updates the
> > > ovn_lb_datapaths's 'nb_ls_map' bitmap accordingly.
> > >
> > > Eg.  If the lb_data has the below tracked data:
> > >
> > > tracked_data = {'crupdated_lbs': [lb1, lb2],
> > > 'deleted_lbs': [lb3],
> > > 'crupdated_lb_groups': [lbg1, lbg2],
> > > 'crupdated_ls_lbs': [{ls: sw0, assoc_lbs: [lb1]},
> > >  {ls: sw1, assoc_lbs: [lb1, lb2]}]}
> > >
> > > The handler northd_lb_data_handler() creates the
> > > ovn_lb_datapaths object for lb1 and lb2 and deletes lb3 from
> > > the ovn_lb_datapaths hmap.  It does the same for the created or
> > > updated lb groups lbg1 and lbg2 in the ovn_lbgrp_datapaths map.
> > > It also updates the nb_ls_bitmap of lb1 for sw0 and sw1 and the
> > > nb_ls_bitmap of lb2 for sw1.
> > >
> > > Reviewed-by: Ales Musil 
> > > Acked-by: Mark Michelson 
> > > Signed-off-by: Numan Siddique 
> > > ---
> > >  lib/lb.c |   5 +-
> > >  northd/en-lb-data.c  | 176
+++
> > >  northd/en-lb-data.h  |  17 
> > >  northd/en-lflow.c|   6 ++
> > >  northd/en-northd.c   |   6 +-
> > >  northd/inc-proc-northd.c |   2 +
> > >  northd/northd.c  |  83 +++---
> > >  northd/northd.h  |   4 +-
> > >  tests/ovn-northd.at  |  56 +
> > >  9 files changed, 322 insertions(+), 33 deletions(-)
> > >
> > > diff --git a/lib/lb.c b/lib/lb.c
> > > index 6fd67e2218..e6c9dc2be2 100644
> > > --- a/lib/lb.c
> > > +++ b/lib/lb.c
> > > @@ -1088,7 +1088,10 @@ ovn_lb_datapaths_add_ls(struct ovn_lb_datapaths
> > *lb_dps, size_t n,
> > >  struct ovn_datapath **ods)
> > >  {
> > >  for (size_t i = 0; i < n; i++) {
> > > -bitmap_set1(lb_dps->nb_ls_map, ods[i]->index);
> > > +if (!bitmap_is_set(lb_dps->nb_ls_map, ods[i]->index)) {
> > > +bitmap_set1(lb_dps->nb_ls_map, ods[i]->index);
> > > +lb_dps->n_nb_ls++;
> > > +}
> > >  }
> > >  }
> > >
> > > diff --git a/northd/en-lb-data.c b/northd/en-lb-data.c
> > > index 8acd9c8cb2..02b1bfd7a4 100644
> > > --- a/northd/en-lb-data.c
> > > +++ b/northd/en-lb-data.c
> > > @@ -39,6 +39,14 @@ static void lb_data_destroy(struct ed_type_lb_data
*);
> > >  static void build_lbs(const struct nbrec_load_balancer_table *,
> > >const struct nbrec_load_balancer_group_table *,
> > >struct hmap *lbs, struct hmap *lb_groups);
> > > +static void build_od_lb_map(const struct nbrec_logical_switch_table
*,
> > > + struct hmap *od_lb_map);
> > > +static struct od_lb_data *find_od_lb_data(struct hmap *od_lb_map,
> > > +  const struct uuid
*od_uuid);
> > > +static void destroy_od_lb_data(struct od_lb_data *od_lb_data);
> > > +static struct od_lb_data *create_od_lb_data(struct hmap *od_lb_map,
> > > +const struct uuid
*od_uuid);
> > > +
> > >  static struct ovn_lb_group *create_lb_group(
> > >  const struct nbrec_load_balancer_group *, struct hmap *lbs,
> > >  struct hmap *lb_groups);
> > > @@ -54,6 +62,7 @@ static struct crupdated_lbgrp *
> > > struct tracked_lb_data *);
> > >  static void add_deleted_lbgrp_to_tracked_data(
> > >  struct ovn_lb_group *, struct tracked_lb_data *);
> > > +static bool is_ls_lbs_changed(const struct nbrec_logical_switch
*nbs);
> > >
> > >  /* 'lb_data' engine node manages the NB load balancers and load
balancer
> > >   * groups.  For each NB LB, it creates 'struct ovn_northd_lb' and
> > > @@ -80,9 +89,13 @@ en_lb_data_run(struct engine_node *node, void
*data)
> > >  EN_OVSDB_GET(engine_get_input("NB_load_balancer", node));
> > >  const struct nbrec_load_balancer_group_table *nb_lbg_table =
> > >  EN_OVSDB_GET(engine_get_input("NB_load_balancer_group",
node));
> > > +const struct nbrec_logical_switch_table *nb_ls_tabl

Re: [ovs-dev] [PATCH] python: idl: Fix last-id update from a monitor reply.

2023-09-12 Thread Han Zhou
Thanks Ilya!

On Fri, Sep 8, 2023 at 7:18 PM Ilya Maximets  wrote:
>
> While sending a reply to the monitor_cond_since request, server
> includes the last transaction ID.  And it sends new IDs with each
> subsequent update.  Current implementation doesn't use the one
> supplied with a monitor reply, and only takes into account IDs
> provided with monitor updates.  That may cause various issues:
>
> 1. Performance: During initialization, the last-id is set to zero.
>    If a re-connection happens after receiving a monitor reply,
>    but before any monitor update, the client will send a new
>    monitor request with an all-zero last-id and will re-download
>    the whole database again.
>
> 2. Data inconsistency: Assuming one of the clients sends a
>transaction, but our python client disconnects before receiving
>a monitor update for this transaction.  The las-id will point

nit: s/las-id/last-id

This example clearly shows the problem, but I think it can be simplified
even further: the client doesn't have to send a transaction itself to hit
the problem.  If any transactions (sent by any client) are committed to the
DB between the client's disconnection and reconnection, the same
inconsistency arises when a second disconnection (and later reconnection)
happens immediately after receiving the monitor reply.

Acked-by: Han Zhou 

>to a database state before this transaction.  On re-connection,
>this last-id will be sent and the monitor reply will contain
>a diff with a new data from that transaction.  But if another
>disconnection happens right after that, on second re-connection
>our python client will send another monitor_cond_since with
>exactly the same last-id.  That will cause receiving the same
>set of updates again.  And since it's an update2 message with
>a diff of the data, the client will remove previously applied
>result of the transaction.  At this point it will have a
>different database view with the server potentially leading
>to all sorts of data inconsistency problems.
>
> Fix that by always updating the last-id from the latest monitor
> reply.
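The scenarios above can be modeled in a few lines of Python.  This is a toy
model for illustration only; ToyIdl and its method names are invented and are
not the real ovs.db.idl API.

```python
# Toy model of last-id tracking: a client that records the txn-id only
# from update3 messages (the buggy behavior) vs. one that also records
# the txn-id carried in the monitor_cond_since reply (the fix).
ZERO_UUID = "00000000-0000-0000-0000-000000000000"

class ToyIdl:
    def __init__(self, use_reply_id):
        self.use_reply_id = use_reply_id  # True models the fixed client
        self.last_id = ZERO_UUID

    def handle_monitor_reply(self, result):
        # result mimics msg.result of a monitor_cond_since reply:
        # [found, last-txn-id, updates]
        if self.use_reply_id:
            self.last_id = result[1]

    def handle_update3(self, txn_id):
        # update3 messages always carry a new transaction ID
        self.last_id = txn_id

    def next_request_id(self):
        # last-id that would be sent in the next monitor_cond_since
        return self.last_id

buggy = ToyIdl(use_reply_id=False)
fixed = ToyIdl(use_reply_id=True)
for idl in (buggy, fixed):
    idl.handle_monitor_reply([True, "txn-42", {}])
# Re-connection before any update3 arrives: only the fixed client
# avoids sending an all-zero last-id and re-downloading the database.
print(buggy.next_request_id())  # all-zero UUID
print(fixed.next_request_id())  # txn-42
```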
>
> Fixes: 46d44cf3be0d ("python: idl: Add monitor_cond_since support.")
> Signed-off-by: Ilya Maximets 
> ---
>  python/ovs/db/idl.py |  1 +
>  tests/ovsdb-idl.at   | 22 +-
>  2 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/python/ovs/db/idl.py b/python/ovs/db/idl.py
> index 9fc2159b0..16ece0334 100644
> --- a/python/ovs/db/idl.py
> +++ b/python/ovs/db/idl.py
> @@ -494,6 +494,7 @@ class Idl(object):
>  if not msg.result[0]:
>  self.__clear()
>  self.__parse_update(msg.result[2], OVSDB_UPDATE3)
> +self.last_id = msg.result[1]
>  elif self.state ==
self.IDL_S_DATA_MONITOR_COND_REQUESTED:
>  self.__clear()
>  self.__parse_update(msg.result, OVSDB_UPDATE2)
> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index df5a9d2fd..1028b0237 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -2332,6 +2332,23 @@ CHECK_STREAM_OPEN_BLOCK([Python3], [$PYTHON3
$srcdir/test-stream.py],
>  CHECK_STREAM_OPEN_BLOCK([Python3], [$PYTHON3 $srcdir/test-stream.py],
>  [ssl6], [[[::1]]])
>
> +dnl OVSDB_CLUSTER_CHECK_MONITOR_COND_SINCE_TXN_IDS(LOG)
> +dnl
> +dnl Looks up transaction IDs in the log of OVSDB client application.
> +dnl All-zero UUID should not be sent within a monitor request more than
once,
> +dnl unless some database requests were lost (not replied).
> +m4_define([OVSDB_CLUSTER_CHECK_MONITOR_COND_SINCE_TXN_IDS],
> +[
> +   requests=$(grep -c 'send request' $1)
> +   replies=$(grep -c 'received reply' $1)
> +
> +   if test "$requests" -eq "$replies"; then
> + AT_CHECK([grep 'monitor_cond_since' $1 \
> +| grep -c "----" | tr -d
'\n'],
> +  [0], [1])
> +   fi
> +])
> +
>  # same as OVSDB_CHECK_IDL but uses Python IDL implementation with tcp
>  # with multiple remotes to assert the idl connects to the leader of the
Raft cluster
>  m4_define([OVSDB_CHECK_IDL_LEADER_ONLY_PY],
> @@ -2347,10 +2364,11 @@ m4_define([OVSDB_CHECK_IDL_LEADER_ONLY_PY],
> pids=$(cat s2.pid s3.pid s1.pid | tr '\n' ',')
> echo $pids
> AT_CHECK([$PYTHON3 $srcdir/test-ovsdb.py  -t30 idl-cluster
$srcdir/idltest.ovsschema $remotes $pids $3],
> -[0], [stdout], [ignore])
> +[0], [stdout], [stderr])
> remote=$(ovsdb_cluster_leader $remotes "idltest")
> leader=$(echo $remote | cut -d'|' -f 1)
> AT_CHECK([grep -F -- "${leader}" stdout], [0], [ignore])
> +   OVSDB_CLUSTER_CHECK_MONITOR_COND_SINCE_TXN_IDS([stderr])
> AT_CLEANUP])
>
>  OVSDB_CHECK_IDL_LEADER_ONLY_PY([Check Python IDL connects to leader], 3,
['remote'])
> @@ -2393,6 +2411,7 @@ m4_define([OVSDB_CHECK_CLUSTER_IDL_C],
>

Re: [ovs-dev] [PATCH 1/1] ofproto-dpif-trace: Support detailed output for conjunctive match.

2023-09-12 Thread Nobuhiro MIKI
On 2023/09/12 19:34, Simon Horman wrote:
> On Thu, Sep 07, 2023 at 03:08:41PM +0900, Nobuhiro MIKI wrote:
>> A conjunctive flow consists of two or more flows with
>> conjunction actions. When input to the ofproto/trace command
>> matches a conjunctive flow, it outputs the flows of all dimensions.
>>
>> Signed-off-by: Nobuhiro MIKI 
> 
> Hi Miki-san,
> 
> the CI run for this patch has reported a number of errors.
> And I suspect not all of them are transient failures relating
> to imperfections in the test-suite. Could you look into this?
> 
> https://github.com/ovsrobot/ovs/actions/runs/6106124874/job/16570697146

Hi Simon-san,

Thanks for letting me know.
It looks like there are a variety of reasons for the
failures, but I'm guessing it's due to the fact that
I'm manipulating const variables in this patch.

I would like to implement this feature while respecting
the existing const qualifier, so I will try to find a way.

Best Regards,
Nobuhiro Miki
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 1/2] ovn-ic fix multiple routers in an az

2023-09-12 Thread Felix Huettner via dev
Hi Ales,

thanks for the feedback. That will all be addressed in v3.

Regards
Felix
On Tue, Sep 12, 2023 at 08:18:18AM +0200, Ales Musil wrote:
> On Wed, Aug 30, 2023 at 8:20 AM Felix Huettner via dev <
> ovs-dev@openvswitch.org> wrote:
>
> > previously if multiple routers in the same az are connected to the same
> > transit switch then ovn-ic would only propagate the routes of one of
> > these routers to the ic-sb.
> > This commit fixes this behaviour and allows multiple routers in a single
> > az to use route advertisements.
> >
> > Co-authored-by: Maxim Korezkij 
> > Signed-off-by: Maxim Korezkij 
> > Signed-off-by: Felix Huettner 
> >
>
> Hi Felix,
>
> I have a few minor comments, see below.
>
> ---
> >  ic/ovn-ic.c | 27 +++-
> >  tests/ovn-ic.at | 86 +
> >  2 files changed, 105 insertions(+), 8 deletions(-)
> >
> > diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
> > index db7e86bc1..ec749e25f 100644
> > --- a/ic/ovn-ic.c
> > +++ b/ic/ovn-ic.c
> > @@ -1587,9 +1587,9 @@ build_ts_routes_to_adv(struct ic_context *ctx,
> >  }
> >
> >  static void
> > -advertise_lr_routes(struct ic_context *ctx,
> > -const struct icsbrec_availability_zone *az,
> > -struct ic_router_info *ic_lr)
> > +collect_lr_routes(struct ic_context *ctx,
> > +struct ic_router_info *ic_lr,
> > +struct shash *routes_ad_by_ts)
> >
>
> The arguments are not properly aligned.
>
>
> >  {
> >  const struct nbrec_nb_global *nb_global =
> >  nbrec_nb_global_first(ctx->ovnnb_idl);
> > @@ -1600,7 +1600,7 @@ advertise_lr_routes(struct ic_context *ctx,
> >  struct lport_addresses ts_port_addrs;
> >  const struct icnbrec_transit_switch *key;
> >
> > -struct hmap routes_ad = HMAP_INITIALIZER(&routes_ad);
> > +struct hmap *routes_ad;
> >  for (int i = 0; i < ic_lr->n_isb_pbs; i++) {
> >  isb_pb = ic_lr->isb_pbs[i];
> >  key = icnbrec_transit_switch_index_init_row(
> > @@ -1609,6 +1609,12 @@ advertise_lr_routes(struct ic_context *ctx,
> >  ts_name = icnbrec_transit_switch_index_find(
> >  ctx->icnbrec_transit_switch_by_name, key)->name;
> >  icnbrec_transit_switch_index_destroy_row(key);
> > +routes_ad = shash_find_data(routes_ad_by_ts, ts_name);
> > +if (!routes_ad) {
> > +routes_ad = xzalloc(sizeof *routes_ad);
> > +hmap_init(routes_ad);
> > +shash_add(routes_ad_by_ts, ts_name, routes_ad);
> > +}
> >
> >  if (!extract_lsp_addresses(isb_pb->address, &ts_port_addrs)) {
> >  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
> > @@ -1620,12 +1626,10 @@ advertise_lr_routes(struct ic_context *ctx,
> >  }
> >  lrp_name = get_lrp_name_by_ts_port_name(ctx,
> > isb_pb->logical_port);
> >  route_table = get_route_table_by_lrp_name(ctx, lrp_name);
> > -build_ts_routes_to_adv(ctx, ic_lr, &routes_ad, &ts_port_addrs,
> > +build_ts_routes_to_adv(ctx, ic_lr, routes_ad, &ts_port_addrs,
> > nb_global, route_table);
> > -advertise_routes(ctx, az, ts_name, &routes_ad);
> >  destroy_lport_addresses(&ts_port_addrs);
> >  }
> > -hmap_destroy(&routes_ad);
> >  }
> >
> >  static void
> > @@ -1726,14 +1730,21 @@ route_run(struct ic_context *ctx,
> >  icsbrec_port_binding_index_destroy_row(isb_pb_key);
> >
> >  struct ic_router_info *ic_lr;
> > +struct shash routes_ad_by_ts = SHASH_INITIALIZER(&routes_ad_by_ts);
> >  HMAP_FOR_EACH_SAFE (ic_lr, node, &ic_lrs) {
> > -advertise_lr_routes(ctx, az, ic_lr);
> > +collect_lr_routes(ctx, ic_lr, &routes_ad_by_ts);
> >  sync_learned_routes(ctx, az, ic_lr);
> >  free(ic_lr->isb_pbs);
> >  hmap_destroy(&ic_lr->routes_learned);
> >  hmap_remove(&ic_lrs, &ic_lr->node);
> >  free(ic_lr);
> >  }
> > +struct shash_node *node;
> > +SHASH_FOR_EACH_SAFE (node, &routes_ad_by_ts) {
> >
>
> The SHASH iteration doesn't have to be SAFE, we are not removing anything
> from it during the iteration.
>
>
> > +advertise_routes(ctx, az, node->name, node->data);
> > +hmap_destroy(node->data);
> > +}
> > +shash_destroy_free_data(&routes_ad_by_ts);
> >  hmap_destroy(&ic_lrs);
> >  }
> >
> > diff --git a/tests/ovn-ic.at b/tests/ovn-ic.at
> > index a654e59fe..8ef2362c4 100644
> > --- a/tests/ovn-ic.at
> > +++ b/tests/ovn-ic.at
> > @@ -1164,3 +1164,89 @@ AT_CHECK([ovn_as az2 ovn-nbctl lr-route-list lr12 |
> > grep dst-ip | sort], [0], [d
> >
> >  AT_CLEANUP
> >  ])
> > +
> > +OVN_FOR_EACH_NORTHD([
> > +AT_SETUP([ovn-ic -- route sync -- multiple logical routers])
> > +
> > +ovn_init_ic_db
> > +ovn-ic-nbctl ts-add ts1
> > +
> > +for i in 1 2; do
> > +ovn_start az$i
> > +ovn_as az$i
> > +
> > +# Enable route learning at AZ level
> > +ovn-nbctl set nb_g

[ovs-dev] [PATCH ovn v3 1/2] ovn-ic fix multiple routers in an az

2023-09-12 Thread Felix Huettner via dev
Previously, if multiple routers in the same az were connected to the same
transit switch, then ovn-ic would only propagate the routes of one of
these routers to the ic-sb.
This commit fixes this behaviour and allows multiple routers in a single
az to use route advertisements.
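Conceptually, the fix first collects each router's routes into a per-transit-
switch group and only then advertises each group once, so one router's
advertisement no longer overwrites another's.  A simplified model with
invented names (the real code keys an shash of route hmaps by transit-switch
name):

```python
# Simplified model of collect_lr_routes(): accumulate routes from all
# routers into routes_ad_by_ts, keyed by transit switch, before a
# single advertise step per switch.
from collections import defaultdict

def collect_lr_routes_sketch(routers):
    routes_ad_by_ts = defaultdict(set)
    for ts_routes in routers.values():
        for ts_name, routes in ts_routes.items():
            # Merge instead of replace: routers sharing a transit
            # switch contribute to the same advertised set.
            routes_ad_by_ts[ts_name].update(routes)
    return routes_ad_by_ts

routers = {
    "lr21": {"ts1": {"10.0.1.0/24"}},
    "lr22": {"ts1": {"10.0.2.0/24"}},
}
# Both routers' routes survive for ts1:
print(sorted(collect_lr_routes_sketch(routers)["ts1"]))
# ['10.0.1.0/24', '10.0.2.0/24']
```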

Co-authored-by: Maxim Korezkij 
Signed-off-by: Maxim Korezkij 
Signed-off-by: Felix Huettner 
---
 ic/ovn-ic.c | 27 +++-
 tests/ovn-ic.at | 86 +
 2 files changed, 105 insertions(+), 8 deletions(-)

diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 11b533981..2df1235dc 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -1592,9 +1592,9 @@ build_ts_routes_to_adv(struct ic_context *ctx,
 }

 static void
-advertise_lr_routes(struct ic_context *ctx,
-const struct icsbrec_availability_zone *az,
-struct ic_router_info *ic_lr)
+collect_lr_routes(struct ic_context *ctx,
+  struct ic_router_info *ic_lr,
+  struct shash *routes_ad_by_ts)
 {
 const struct nbrec_nb_global *nb_global =
 nbrec_nb_global_first(ctx->ovnnb_idl);
@@ -1605,7 +1605,7 @@ advertise_lr_routes(struct ic_context *ctx,
 struct lport_addresses ts_port_addrs;
 const struct icnbrec_transit_switch *key;

-struct hmap routes_ad = HMAP_INITIALIZER(&routes_ad);
+struct hmap *routes_ad;
 for (int i = 0; i < ic_lr->n_isb_pbs; i++) {
 isb_pb = ic_lr->isb_pbs[i];
 key = icnbrec_transit_switch_index_init_row(
@@ -1614,6 +1614,12 @@ advertise_lr_routes(struct ic_context *ctx,
 ts_name = icnbrec_transit_switch_index_find(
 ctx->icnbrec_transit_switch_by_name, key)->name;
 icnbrec_transit_switch_index_destroy_row(key);
+routes_ad = shash_find_data(routes_ad_by_ts, ts_name);
+if (!routes_ad) {
+routes_ad = xzalloc(sizeof *routes_ad);
+hmap_init(routes_ad);
+shash_add(routes_ad_by_ts, ts_name, routes_ad);
+}

 if (!extract_lsp_addresses(isb_pb->address, &ts_port_addrs)) {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
@@ -1625,12 +1631,10 @@ advertise_lr_routes(struct ic_context *ctx,
 }
 lrp_name = get_lrp_name_by_ts_port_name(ctx, isb_pb->logical_port);
 route_table = get_route_table_by_lrp_name(ctx, lrp_name);
-build_ts_routes_to_adv(ctx, ic_lr, &routes_ad, &ts_port_addrs,
+build_ts_routes_to_adv(ctx, ic_lr, routes_ad, &ts_port_addrs,
nb_global, route_table);
-advertise_routes(ctx, az, ts_name, &routes_ad);
 destroy_lport_addresses(&ts_port_addrs);
 }
-hmap_destroy(&routes_ad);
 }

 static void
@@ -1731,14 +1735,21 @@ route_run(struct ic_context *ctx,
 icsbrec_port_binding_index_destroy_row(isb_pb_key);

 struct ic_router_info *ic_lr;
+struct shash routes_ad_by_ts = SHASH_INITIALIZER(&routes_ad_by_ts);
 HMAP_FOR_EACH_SAFE (ic_lr, node, &ic_lrs) {
-advertise_lr_routes(ctx, az, ic_lr);
+collect_lr_routes(ctx, ic_lr, &routes_ad_by_ts);
 sync_learned_routes(ctx, az, ic_lr);
 free(ic_lr->isb_pbs);
 hmap_destroy(&ic_lr->routes_learned);
 hmap_remove(&ic_lrs, &ic_lr->node);
 free(ic_lr);
 }
+struct shash_node *node;
+SHASH_FOR_EACH (node, &routes_ad_by_ts) {
+advertise_routes(ctx, az, node->name, node->data);
+hmap_destroy(node->data);
+}
+shash_destroy_free_data(&routes_ad_by_ts);
 hmap_destroy(&ic_lrs);
 }

diff --git a/tests/ovn-ic.at b/tests/ovn-ic.at
index 9a5f3e312..4b1c33c99 100644
--- a/tests/ovn-ic.at
+++ b/tests/ovn-ic.at
@@ -1166,3 +1166,89 @@ AT_CHECK([ovn_as az2 ovn-nbctl lr-route-list lr12 | grep 
dst-ip | sort], [0], [d

 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn-ic -- route sync -- multiple logical routers])
+
+ovn_init_ic_db
+ovn-ic-nbctl ts-add ts1
+
+for i in 1 2; do
+ovn_start az$i
+ovn_as az$i
+
+# Enable route learning at AZ level
+ovn-nbctl set nb_global . options:ic-route-learn=true
+# Enable route advertising at AZ level
+ovn-nbctl set nb_global . options:ic-route-adv=true
+done
+
+# Create new transit switches and LRs. Test topology is next:
+#
+# logical router (lr11) - transit switch (ts1) - logical router (lr21)
+#  \- logical router (lr22)
+#
+# each LR has one connected subnet except TS port
+
+
+# create lr11, lr21, lr22, ts1 and connect them
+ovn-ic-nbctl ts-add ts1
+
+ovn_as az1
+
+lr=lr11
+ovn-nbctl lr-add $lr
+
+lrp=lrp-$lr-ts1
+lsp=lsp-ts1-$lr
+# Create LRP and connect to TS
+ovn-nbctl lrp-add $lr $lrp aa:aa:aa:aa:a1:01 169.254.10.11/24
+ovn-nbctl lsp-add ts1 $lsp \
+-- lsp-set-addresses $lsp router \
+-- lsp-set-type $lsp router \
+-- lsp-set-options $lsp router-port=$lrp
+
+ovn_as az2
+for i in 1 2; do
+lr=lr2$i
+ovn-nbctl lr-add $lr
+
+lrp=lrp-$lr-ts

[ovs-dev] [PATCH ovn v3 2/2] ovn-ic: support learning routes in same AZ

2023-09-12 Thread Felix Huettner via dev
When connecting multiple logical routers to a transit switch per az,
the routers in the same az would previously not learn each other's
routes, while the routers in the other azs would learn all of them.

This is confusing and would require each user to add extra logic that
configures static routing within each az.
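The change in learned-route filtering can be illustrated with a simplified
model.  Names are invented for illustration; the real code matches the
'lr-id' stored in the IC-SB route's external_ids against the local router.

```python
# Old behavior: skip every route advertised by the local AZ.
# New behavior: skip only routes advertised by the router itself, so
# other routers in the same AZ are learned from too.
def routes_to_learn(isb_routes, my_az, my_lr_id, per_router_filter):
    learned = []
    for route in isb_routes:
        if per_router_filter:
            if route["lr-id"] == my_lr_id:
                continue  # skip only our own advertisements
        elif route["az"] == my_az:
            continue      # old behavior: skip the whole local AZ
        learned.append(route["prefix"])
    return learned

routes = [
    {"az": "az2", "lr-id": "lr21", "prefix": "10.0.1.0/24"},
    {"az": "az2", "lr-id": "lr22", "prefix": "10.0.2.0/24"},
]
# lr21 now learns lr22's route, which the old AZ-wide filter dropped:
print(routes_to_learn(routes, "az2", "lr21", per_router_filter=False))
print(routes_to_learn(routes, "az2", "lr21", per_router_filter=True))
```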

Acked-by: Mark Michelson 
Acked-by: Ales Musil 
Co-Authored-By: Maxim Korezkij 
Signed-off-by: Maxim Korezkij 
Signed-off-by: Felix Huettner 
---
 ic/ovn-ic.c | 48 
 tests/ovn-ic.at |  2 ++
 2 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index 2df1235dc..e2023c2ba 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -871,6 +871,8 @@ struct ic_route_info {
 const char *origin;
 const char *route_table;

+const struct nbrec_logical_router *nb_lr;
+
 /* Either nb_route or nb_lrp is set and the other one must be NULL.
  * - For a route that is learned from IC-SB, or a static route that is
  *   generated from a route that is configured in NB, the "nb_route"
@@ -947,7 +949,8 @@ parse_route(const char *s_prefix, const char *s_nexthop,
 /* Return false if can't be added due to bad format. */
 static bool
 add_to_routes_learned(struct hmap *routes_learned,
-  const struct nbrec_logical_router_static_route *nb_route)
+  const struct nbrec_logical_router_static_route *nb_route,
+  const struct nbrec_logical_router *nb_lr)
 {
 struct in6_addr prefix, nexthop;
 unsigned int plen;
@@ -969,6 +972,7 @@ add_to_routes_learned(struct hmap *routes_learned,
 ic_route->nb_route = nb_route;
 ic_route->origin = origin;
 ic_route->route_table = nb_route->route_table;
+ic_route->nb_lr = nb_lr;
 hmap_insert(routes_learned, &ic_route->node,
 ic_route_hash(&prefix, plen, &nexthop, origin,
   nb_route->route_table));
@@ -1109,7 +1113,8 @@ add_to_routes_ad(struct hmap *routes_ad, const struct 
in6_addr prefix,
  unsigned int plen, const struct in6_addr nexthop,
  const char *origin, const char *route_table,
  const struct nbrec_logical_router_port *nb_lrp,
- const struct nbrec_logical_router_static_route *nb_route)
+ const struct nbrec_logical_router_static_route *nb_route,
+ const struct nbrec_logical_router *nb_lr)
 {
 if (route_table == NULL) {
 route_table = "";
@@ -1127,6 +1132,7 @@ add_to_routes_ad(struct hmap *routes_ad, const struct 
in6_addr prefix,
 ic_route->origin = origin;
 ic_route->route_table = route_table;
 ic_route->nb_lrp = nb_lrp;
+ic_route->nb_lr = nb_lr;
 hmap_insert(routes_ad, &ic_route->node, hash);
 } else {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
@@ -1140,6 +1146,7 @@ static void
 add_static_to_routes_ad(
 struct hmap *routes_ad,
 const struct nbrec_logical_router_static_route *nb_route,
+const struct nbrec_logical_router *nb_lr,
 const struct lport_addresses *nexthop_addresses,
 const struct smap *nb_options)
 {
@@ -1182,14 +1189,15 @@ add_static_to_routes_ad(
 }

 add_to_routes_ad(routes_ad, prefix, plen, nexthop, ROUTE_ORIGIN_STATIC,
- nb_route->route_table, NULL, nb_route);
+ nb_route->route_table, NULL, nb_route, nb_lr);
 }

 static void
 add_network_to_routes_ad(struct hmap *routes_ad, const char *network,
  const struct nbrec_logical_router_port *nb_lrp,
  const struct lport_addresses *nexthop_addresses,
- const struct smap *nb_options)
+ const struct smap *nb_options,
+ const struct nbrec_logical_router *nb_lr)
 {
 struct in6_addr prefix, nexthop;
 unsigned int plen;
@@ -1228,7 +1236,7 @@ add_network_to_routes_ad(struct hmap *routes_ad, const 
char *network,

 /* directly-connected routes go to  route table */
 add_to_routes_ad(routes_ad, prefix, plen, nexthop, ROUTE_ORIGIN_CONNECTED,
- NULL, nb_lrp, NULL);
+ NULL, nb_lrp, NULL, nb_lr);
 }

 static bool
@@ -1332,7 +1340,6 @@ lrp_is_ts_port(struct ic_context *ctx, struct 
ic_router_info *ic_lr,

 static void
 sync_learned_routes(struct ic_context *ctx,
-const struct icsbrec_availability_zone *az,
 struct ic_router_info *ic_lr)
 {
 ovs_assert(ctx->ovnnb_txn);
@@ -1355,7 +1362,15 @@ sync_learned_routes(struct ic_context *ctx,

 ICSBREC_ROUTE_FOR_EACH_EQUAL (isb_route, isb_route_key,
   ctx->icsbrec_route_by_ts) {
-if (isb_route->availability_zone == az) {
+const char *lr_id = smap_get(&isb_route->external_ids, "lr-id");
+if (lr_id == NULL) {
+

[ovs-dev] [PATCH ovn 1/2] northd: Fix naming and comments related to HA reference chassis.

2023-09-12 Thread Han Zhou
Minor (non-functional) fixes to commit 4023d6a5fa57.
1. Fix a typo in the function name: collect_lb_groups_for_ha_chassis_groups.
2. Update the comments of the collect_lb_groups_for_ha_chassis_groups
   function to avoid confusion.
3. Rename tmp_ha_chassis to tmp_ha_ref_chassis, since an 'HA chassis' more
   naturally means a chassis belonging to an HA chassis group.

Fixes: 4023d6a5fa57 ("northd: Fix recompute of referenced chassis in HA chassis 
groups.")
Signed-off-by: Han Zhou 
---
 northd/northd.c | 52 +
 1 file changed, 27 insertions(+), 25 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index e9cb906e2aa4..bb56bfac6c0f 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -801,7 +801,7 @@ struct lrouter_group {
 /* Set of ha_chassis_groups which are associated with the router dps. */
 struct sset ha_chassis_groups;
 /* Temporary storage for chassis references while computing HA groups. */
-struct hmapx tmp_ha_chassis;
+struct hmapx tmp_ha_ref_chassis;
 };
 
 static struct ovn_datapath *
@@ -8762,7 +8762,7 @@ build_lrouter_groups(struct hmap *lr_ports, struct 
ovs_list *lr_list)
 od->lr_group->router_dps[0] = od;
 od->lr_group->n_router_dps = 1;
 sset_init(&od->lr_group->ha_chassis_groups);
-hmapx_init(&od->lr_group->tmp_ha_chassis);
+hmapx_init(&od->lr_group->tmp_ha_ref_chassis);
 build_lrouter_groups__(lr_ports, od);
 }
 }
@@ -17250,7 +17250,7 @@ destroy_datapaths_and_ports(struct ovn_datapaths 
*ls_datapaths,
 
 free(lr_group->router_dps);
 sset_destroy(&lr_group->ha_chassis_groups);
-hmapx_destroy(&lr_group->tmp_ha_chassis);
+hmapx_destroy(&lr_group->tmp_ha_ref_chassis);
 free(lr_group);
 }
 }
@@ -17609,27 +17609,28 @@ update_sb_ha_group_ref_chassis(
 hmap_destroy(&ha_ch_grps);
 }
 
-/* This function checks if the port binding 'sb' references
- * a HA chassis group.
- * Eg. Suppose a distributed logical router port - lr0-public
- * uses an HA chassis group - hagrp1 and if hagrp1 has 3 ha
- * chassis - gw1, gw2 and gw3.
+/* This function and the next function build_ha_chassis_group_ref_chassis
+ * build the reference chassis 'ref_chassis' for each HA chassis group.
+ *
+ * Suppose a distributed logical router port - lr0-public uses an HA chassis
+ * group - hagrp1 and if hagrp1 has 3 ha chassis - gw1, gw2 and gw3.
  * Or
- * If the distributed logical router port - lr0-public has
- * 3 gateway chassis - gw1, gw2 and gw3.
- * ovn-northd creates ha chassis group - hagrp1 in SB DB
- * and adds gw1, gw2 and gw3 to its ha_chassis list.
+ * If the distributed logical router port - lr0-public has 3 gateway chassis -
+ * gw1, gw2 and gw3.
+ *
+ * ovn-northd creates ha chassis group - hagrp1 in SB DB and adds gw1, gw2 and
+ * gw3 to its ha_chassis list.
  *
- * If port binding 'sb' represents a logical switch port 'p1'
- * and its logical switch is connected to the logical router
- * 'lr0' directly or indirectly (i.e p1's logical switch is
- *  connected to a router 'lr1' and 'lr1' has a path to lr0 via
- *  transit logical switches) and 'sb' is claimed by chassis - 'c1' then
- * this function adds c1 to the list of the reference chassis
- *  - 'ref_chassis' of hagrp1.
+ * If port binding 'sb' represents a logical switch port 'p1' and its logical
+ * switch is connected to the logical router 'lr0' directly or indirectly (i.e
+ * p1's logical switch is connected to a router 'lr1' and 'lr1' has a path to
+ * lr0 via transit logical switches) and 'sb' is claimed by chassis - 'c1' then
+ * this function adds c1 to the 'tmp_ha_ref_chassis' of lr_group, and later the
+ * function build_ha_chassis_group_ref_chassis will add these chassis to the
+ * list of the reference chassis - 'ref_chassis' of hagrp1.
  */
 static void
-collect_lb_groups_for_ha_chassis_groups(const struct sbrec_port_binding *sb,
+collect_lr_groups_for_ha_chassis_groups(const struct sbrec_port_binding *sb,
 struct ovn_port *op,
 struct hmapx *lr_groups)
 {
@@ -17651,7 +17652,7 @@ collect_lb_groups_for_ha_chassis_groups(const struct 
sbrec_port_binding *sb,
 }
 
 hmapx_add(lr_groups, lr_group);
-hmapx_add(&lr_group->tmp_ha_chassis, sb->chassis);
+hmapx_add(&lr_group->tmp_ha_ref_chassis, sb->chassis);
 }
 
 static void
@@ -17678,11 +17679,12 @@ build_ha_chassis_group_ref_chassis(struct 
ovsdb_idl_index *ha_ch_grp_by_name,
 shash_find_data(ha_ref_chassis_map, sb_ha_chassis_grp->name);
 ovs_assert(ref_ch_info);
 
-add_to_ha_ref_chassis_info(ref_ch_info, &lr_group->tmp_ha_chassis);
+add_to_ha_ref_chassis_info(ref_ch_info,
+   &lr_group->tmp_ha_ref_chassis);
 }
 
-hmapx_destroy(&lr_group->tmp_ha_chassis);
-hmapx_init(&lr_group

[ovs-dev] [PATCH ovn 0/2] Minor improvements of HA reference chassis.

2023-09-12 Thread Han Zhou
Han Zhou (2):
  northd: Fix naming and comments related to HA reference chassis.
  northd: Improve HA reference chassis build logic.

 northd/en-sync-from-sb.c |  6 +---
 northd/northd.c  | 75 +---
 northd/northd.h  |  1 -
 3 files changed, 33 insertions(+), 49 deletions(-)

-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 2/2] northd: Improve HA reference chassis build logic.

2023-09-12 Thread Han Zhou
Every LR datapath references some LR group, but not every LR group
has HA chassis groups. This patch avoids collecting LR groups and
reference chassis for LR groups without HA chassis groups.

In addition, this patch also refactors the function
build_ha_chassis_group_ref_chassis to avoid the unnecessary SB
ha_chassis_group lookup by name, because the only field used is the
name.

Signed-off-by: Han Zhou 
---
 northd/en-sync-from-sb.c |  6 +-
 northd/northd.c  | 23 +--
 northd/northd.h  |  1 -
 3 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/northd/en-sync-from-sb.c b/northd/en-sync-from-sb.c
index 4109aebe4517..2df02ad12067 100644
--- a/northd/en-sync-from-sb.c
+++ b/northd/en-sync-from-sb.c
@@ -53,13 +53,9 @@ en_sync_from_sb_run(struct engine_node *node, void *data OVS_UNUSED)
 EN_OVSDB_GET(engine_get_input("SB_port_binding", node));
 const struct sbrec_ha_chassis_group_table *sb_ha_ch_grp_table =
 EN_OVSDB_GET(engine_get_input("SB_ha_chassis_group", node));
-struct ovsdb_idl_index *sb_ha_ch_grp_by_name =
-engine_ovsdb_node_get_index(
-engine_get_input("SB_ha_chassis_group", node),
-"sbrec_ha_chassis_grp_by_name");
 stopwatch_start(OVNSB_DB_RUN_STOPWATCH_NAME, time_msec());
 ovnsb_db_run(eng_ctx->ovnnb_idl_txn, eng_ctx->ovnsb_idl_txn,
- sb_pb_table, sb_ha_ch_grp_table, sb_ha_ch_grp_by_name,
+ sb_pb_table, sb_ha_ch_grp_table,
  &nd->ls_ports, &nd->lr_ports);
 stopwatch_stop(OVNSB_DB_RUN_STOPWATCH_NAME, time_msec());
 }
diff --git a/northd/northd.c b/northd/northd.c
index bb56bfac6c0f..83f341438c3b 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -17647,7 +17647,7 @@ collect_lr_groups_for_ha_chassis_groups(const struct sbrec_port_binding *sb,
 break;
 }
 
-if (!lr_group) {
+if (!lr_group || sset_is_empty(&lr_group->ha_chassis_groups)) {
 return;
 }
 
@@ -17656,8 +17656,7 @@ collect_lr_groups_for_ha_chassis_groups(const struct sbrec_port_binding *sb,
 }
 
 static void
-build_ha_chassis_group_ref_chassis(struct ovsdb_idl_index *ha_ch_grp_by_name,
-   struct hmapx *lr_groups,
+build_ha_chassis_group_ref_chassis(struct hmapx *lr_groups,
struct shash *ha_ref_chassis_map)
 {
 struct hmapx_node *node;
@@ -17667,16 +17666,8 @@ build_ha_chassis_group_ref_chassis(struct ovsdb_idl_index *ha_ch_grp_by_name,
 const char *ha_group_name;
 
 SSET_FOR_EACH (ha_group_name, &lr_group->ha_chassis_groups) {
-const struct sbrec_ha_chassis_group *sb_ha_chassis_grp;
-
-sb_ha_chassis_grp = ha_chassis_group_lookup_by_name(
-ha_ch_grp_by_name, ha_group_name);
-if (!sb_ha_chassis_grp) {
-continue;
-}
-
 struct ha_ref_chassis_info *ref_ch_info =
-shash_find_data(ha_ref_chassis_map, sb_ha_chassis_grp->name);
+shash_find_data(ha_ref_chassis_map, ha_group_name);
 ovs_assert(ref_ch_info);
 
 add_to_ha_ref_chassis_info(ref_ch_info,
@@ -17711,7 +17702,6 @@ static void
 handle_port_binding_changes(struct ovsdb_idl_txn *ovnsb_txn,
 const struct sbrec_port_binding_table *sb_pb_table,
 const struct sbrec_ha_chassis_group_table *sb_ha_ch_grp_table,
-struct ovsdb_idl_index *sb_ha_ch_grp_by_name,
 struct hmap *ls_ports,
 struct hmap *lr_ports,
 struct shash *ha_ref_chassis_map)
@@ -17784,8 +17774,7 @@ handle_port_binding_changes(struct ovsdb_idl_txn *ovnsb_txn,
 }
 
 /* Update ha chassis group's ref_chassis if required. */
-build_ha_chassis_group_ref_chassis(sb_ha_ch_grp_by_name, &lr_groups,
-   ha_ref_chassis_map);
+build_ha_chassis_group_ref_chassis(&lr_groups, ha_ref_chassis_map);
 hmapx_destroy(&lr_groups);
 }
 
@@ -17795,7 +17784,6 @@ ovnsb_db_run(struct ovsdb_idl_txn *ovnnb_txn,
  struct ovsdb_idl_txn *ovnsb_txn,
  const struct sbrec_port_binding_table *sb_pb_table,
  const struct sbrec_ha_chassis_group_table *sb_ha_ch_grp_table,
- struct ovsdb_idl_index *sb_ha_ch_grp_by_name,
  struct hmap *ls_ports,
  struct hmap *lr_ports)
 {
@@ -17806,8 +17794,7 @@ ovnsb_db_run(struct ovsdb_idl_txn *ovnnb_txn,
 
 struct shash ha_ref_chassis_map = SHASH_INITIALIZER(&ha_ref_chassis_map);
 handle_port_binding_changes(ovnsb_txn, sb_pb_table, sb_ha_ch_grp_table,
-sb_ha_ch_grp_by_name, ls_ports, lr_ports,
-&ha_ref_chassis_map);
+ls_ports, lr_ports, &ha_ref_chassis_map);
 if (ovnsb_txn) {
 update_sb_ha_group_ref_chassis(sb_ha_ch_grp_table,
 

[ovs-dev] [PATCH] netdev-dpdk: fix tso bug

2023-09-12 Thread Dexia Li via dev
When userspace TSO is enabled, setting both the RTE_MBUF_F_TX_TCP_SEG and
RTE_MBUF_F_TX_TCP_CKSUM mbuf flag bits results in a driver hang with the
Intel E810 VF driver (iavf). Following the DPDK csum example,
RTE_MBUF_F_TX_TCP_SEG should only be set when TSO is in use.

Signed-off-by: Dexia Li 
---
 lib/netdev-dpdk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 55700250d..c7cd00fc3 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2461,7 +2461,6 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
 }
 
 mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
-mbuf->ol_flags |= RTE_MBUF_F_TX_TCP_CKSUM;
 mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
 
 if (mbuf->ol_flags & RTE_MBUF_F_TX_IPV4) {
-- 
2.33.0.windows.2
