On 8/30/25 7:00 PM, Rukomoinikova Aleksandra wrote:
> Hi Ilya! Thanks for your answer)
>
> I spent some more time on this issue, and my assumption from the previous
> email turned out to be incorrect.
>
> You asked for a link to the build - here it is. This is exactly two red
> tests I'm talking about:
> https://github.com/Sashhkaa/ovn/actions/runs/17321708600
>
> Here is the current picture I have. I have two Ubuntu 22.04 machines:
>
> * Completely identical build flags for OVS and OVN
> * Identical versions of the compiler (gcc), assembler, and linker
> (libc version 2.35-0ubuntu3.8, gcc 4:11.2.0-1ubuntu1)
> * Absolutely identical versions of all userspace packages
> * Kernel and its configs are identical (5.15.0-113-generic)
> * Accordingly, identical kernel modules
> * Just in case, I made all sysctl parameters identical
>
> The only thing that differs is the architecture - the first virtual machine
> has 2 CPU cores, and the second has only 1 CPU core. This is the only
> difference
> I found. On the machine where the test fails, there is 1 core; on the other
> with
> two cores, the test passes. Honestly, I have absolutely no ideas.
>
> I also built OVS and OVN on Ubuntu 24.04 using the same libc version that I
> had
> on Ubuntu 22 - I did not observe the same behavior there. So most likely, I am
> making a mistake somewhere in my current investigation.
>
> I would be glad if you have time to look into what the issue might be. Or
> maybe
> you have ideas on where else to look for differences between the builds on
> these
> two machines?
Hi. So, I looked at the branch you provided, and one of my previous guesses was
more or less correct. There are indeed two OpenFlow rules with the same
priority,
but with slightly different matches (matches are mutually exclusive, so it is
OK).
Depending on which one is getting hit first during the classifier lookup, you
get
different results. These rules are the following two in table 17:
table=17,
priority=100,ct_state=+new-rpl+trk,ip,reg14=0x2,metadata=0x1,nw_dst=10.0.0.0/24
actions=ct(commit,zone=NXM_NX_REG11[0..15],nat(src),
exec(move:NXM_OF_ETH_SRC[]->NXM_NX_CT_LABEL[32..79],
load:0x2->NXM_NX_CT_MARK[16..31])),
resubmit(,18)
table=17,
priority=100,ct_state=+est-rpl+trk,ip,reg14=0x2,metadata=0x1,nw_dst=10.0.0.0/24
actions=ct(commit,zone=NXM_NX_REG11[0..15],nat(src),
exec(move:NXM_OF_ETH_SRC[]->NXM_NX_CT_LABEL[32..79],
load:0x2->NXM_NX_CT_MARK[16..31])),
resubmit(,18)
They only differ by the +new vs +est flag in the match criteria.
In the successful case, the subtable with the first rule is getting evaluated
first, and if there is a match, then the next subtable is skipped (since it has
the same max priority). This way the new|trk packet only gains +new-rpl+trk
match. Packet for established connection will not match in the first subtable
and will move to the next one and gain both -new and +est, so we get a
combination
-new+est-rpl+trk. The test expects flows with these two sets of matches:
./system-ovn.at:6165: ovs-appctl dpctl/dump-flows | grep
'ct_state(+new-rpl+trk).*ct(.*label=0x1001020400000000/.*)' -c
./system-ovn.at:6165: ovs-appctl dpctl/dump-flows | grep
'ct_state(-new+est+rpl+trk).*ct_label(0x1001020400000000)' -c
However, the order of these subtables is actually undefined, because they have
the same max priority. In the failing case we have the subtable with the est
match evaluated first. So, the new|trk traffic doesn't match in it and gains
-est-rpl+trk match, then in matches in the second subtable gaining the +new,
resulting with +new-est-rpl+trk in the end. The established traffic, OTOH,
matches in the first subtable gaining only +est-rpl+trk.
So, depending on the order in which these subtables are added, there will be
different set of matches. Either +new and -new+est or +new-est and +est.
You can confirm that by hacking the classifier to see which subtables exactly
it is going through (see the the patch below).
Successful:
2025-09-01T07:35:08.719Z|00007|ofproto_dpif(handler14)|INFO|Rule lookup in
table 17
2025-09-01T07:35:08.719Z|00008|classifier(handler14)|INFO|Subtables Considered:
subtable with max-prio 65535, ct_state mask: 0
Before: value: new|trk, mask: 0, together: 0/0
After: value: new|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: new|trk, mask: 0, together: 0/0
After: value: new|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: new|trk, mask: 0, together: 0/0
After: value: new|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: new|trk, mask: 0, together: 0/0
After: value: new|trk, mask: 0, together: 0/0
subtable with max-prio 100, ct_state mask: new|rpl|trk
Before: value: new|trk, mask: 0, together: 0/0
After: value: new|trk, mask: new|rpl|trk, together: +new-rpl+trk
2025-09-01T07:35:08.725Z|00007|ofproto_dpif(handler6)|INFO|Rule lookup in table
17
2025-09-01T07:35:08.725Z|00356|classifier(handler40)|INFO|Subtables Considered:
subtable with max-prio 65535, ct_state mask: 0
Before: value: est|trk, mask: 0, together: 0/0
After: value: est|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: est|trk, mask: 0, together: 0/0
After: value: est|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: est|trk, mask: 0, together: 0/0
After: value: est|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: est|trk, mask: 0, together: 0/0
After: value: est|trk, mask: 0, together: 0/0
subtable with max-prio 100, ct_state mask: new|rpl|trk
Before: value: est|trk, mask: 0, together: 0/0
After: value: est|trk, mask: new|rpl|trk, together: -new-rpl+trk
subtable with max-prio 100, ct_state mask: est|rpl|trk
Before: value: est|trk, mask: new|rpl|trk, together: -new-rpl+trk
After: value: est|trk, mask: new|est|rpl|trk, together: -new+est-rpl+trk
Failing:
2025-09-01T07:32:58.341Z|00533|ofproto_dpif(handler6)|INFO|Rule lookup in table
17
2025-09-01T07:32:58.341Z|00534|classifier(handler6)|INFO|Subtables Considered:
subtable with max-prio 65535, ct_state mask: 0
Before: value: new|trk, mask: 0, together: 0/0
After: value: new|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: new|trk, mask: 0, together: 0/0
After: value: new|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: new|trk, mask: 0, together: 0/0
After: value: new|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: new|trk, mask: 0, together: 0/0
After: value: new|trk, mask: 0, together: 0/0
subtable with max-prio 100, ct_state mask: est|rpl|trk
Before: value: new|trk, mask: 0, together: 0/0
After: value: new|trk, mask: est|rpl|trk, together: -est-rpl+trk
subtable with max-prio 100, ct_state mask: new|rpl|trk
Before: value: new|trk, mask: est|rpl|trk, together: -est-rpl+trk
After: value: new|trk, mask: new|est|rpl|trk, together: +new-est-rpl+trk
2025-09-01T07:32:58.351Z|01531|ofproto_dpif(handler6)|INFO|Rule lookup in table
17
2025-09-01T07:32:58.351Z|01532|classifier(handler6)|INFO|Subtables Considered:
subtable with max-prio 65535, ct_state mask: 0
Before: value: est|trk, mask: 0, together: 0/0
After: value: est|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: est|trk, mask: 0, together: 0/0
After: value: est|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: est|trk, mask: 0, together: 0/0
After: value: est|trk, mask: 0, together: 0/0
subtable with max-prio 65532, ct_state mask: 0
Before: value: est|trk, mask: 0, together: 0/0
After: value: est|trk, mask: 0, together: 0/0
subtable with max-prio 100, ct_state mask: est|rpl|trk
Before: value: est|trk, mask: 0, together: 0/0
After: value: est|trk, mask: est|rpl|trk, together: +est-rpl+trk
The order of subtables is not really defined and depends on how they are placed
in the priority vector after qsort(). And the qsort algorithm is not generally
stable, i.e. doesn't guarantee that elements of the same priority will remain in
the original relative order.
Saying that, the order seem to mostly depend on the order in which
ovn-controller
adds the rule to OVS. The order of subtables seems to mostly match the order
of OF messages, even though that is not a guarantee.
Successful:
2025-09-01T07:35:08.452Z|00994|vconn|DBG|unix#2: received:
OFPT_BUNDLE_ADD_MESSAGE (OF1.5) (xid=0x39a):
bundle_id=0x15 flags=atomic ordered
OFPT_FLOW_MOD (OF1.5) (xid=0x39a): ADD table:17
priority=100,ct_state=+new-rpl+trk,ip,reg14=0x2,metadata=0x1,nw_dst=10.0.0.0/24
...
2025-09-01T07:35:08.452Z|01009|vconn|DBG|unix#2: received:
OFPT_BUNDLE_ADD_MESSAGE (OF1.5) (xid=0x3a9):
bundle_id=0x15 flags=atomic ordered
OFPT_FLOW_MOD (OF1.5) (xid=0x3a9): ADD table:17
priority=100,ct_state=+est-rpl+trk,ip,reg14=0x2,metadata=0x1,nw_dst=10.0.0.0/24
...
Failing:
2025-09-01T07:32:58.159Z|00992|vconn|DBG|unix#2: received:
OFPT_BUNDLE_ADD_MESSAGE (OF1.5) (xid=0x398):
bundle_id=0x15 flags=atomic ordered
OFPT_FLOW_MOD (OF1.5) (xid=0x398): ADD table:17
priority=100,ct_state=+est-rpl+trk,ip,reg14=0x2,metadata=0x1,nw_dst=10.0.0.0/24
...
2025-09-01T07:32:58.160Z|01033|vconn|DBG|unix#2: received:
OFPT_BUNDLE_ADD_MESSAGE (OF1.5) (xid=0x3c1):
bundle_id=0x15 flags=atomic ordered
OFPT_FLOW_MOD (OF1.5) (xid=0x3c1): ADD table:17
priority=100,ct_state=+new-rpl+trk,ip,reg14=0x2,metadata=0x1,nw_dst=10.0.0.0/24
...
You may need to disable rate limit for vconn logs to see these:
ovs-appctl vlog/disable-rate-limit vconn
They also show up in different order in the flow dump.
ovn-controller internally uses hash tables to store the desired OpenFlow rules,
so the order in which these rules are sent to OVS depends on the order inside
the hash table, which is generally not defined and depends on hashing, the table
size and timing of events like database updates.
It is still interesting how it passes or fails relatively consistently depending
on the system, but it may be due to various factors, starting from the hash
function
implementation on a particular CPU and ending with slight performance
differences
and initial entropy of a random number generator gathered at a boot time. So,
I'm not sure if it's worth investigating further. If you want to have some fun
and dig deeper, you may add some more instrumentation into ovn-controller to see
how the hash values are generated for these flows and in which order they are in
the hash table.
The sulution here would be to give these two rules slightly different priorities
or just expect either variant of matches in the test, e.g. by 'sed'ing out the
-est in the first check and the -new in the second. Both rules can also be
probbaly replaced with a single rule that matches on -inv instead, since both
have the same actions, but this may have some impact on hardware offload, if you
care about that.
Best regards, Ilya Maximets.
The classifier hack that I used:
diff --git a/lib/classifier.c b/lib/classifier.c
index 0729bd190..a61a39599 100644
--- a/lib/classifier.c
+++ b/lib/classifier.c
@@ -22,10 +22,13 @@
#include <netinet/in.h>
#include "byte-order.h"
#include "openvswitch/dynamic-string.h"
+#include "openvswitch/vlog.h"
#include "odp-util.h"
#include "packets.h"
#include "util.h"
+VLOG_DEFINE_THIS_MODULE(classifier);
+
struct trie_ctx;
/* A collection of "struct cls_conjunction"s currently embedded into a
@@ -967,7 +970,7 @@ static const struct cls_rule *
classifier_lookup__(const struct classifier *cls, ovs_version_t version,
struct flow *flow, struct flow_wildcards *wc,
bool allow_conjunctive_matches,
- struct hmapx *conj_flows)
+ struct hmapx *conj_flows, struct ds *s)
{
struct trie_ctx trie_ctx[CLS_MAX_TRIES];
const struct cls_match *match;
@@ -1000,10 +1003,37 @@ classifier_lookup__(const struct classifier *cls,
ovs_version_t version,
&cls->subtables) {
struct cls_conjunction_set *conj_set;
+ uint8_t ct_state = MINIFLOW_GET_U8(&subtable->mask.masks, ct_state);
+ ds_put_format(s, "subtable with max-prio %d, ct_state mask: ",
+ subtable->max_priority);
+ if (ct_state) {
+ format_flags(s, ct_state_to_string, ct_state, '|');
+ } else {
+ ds_put_cstr(s, "0"); /* No state. */
+ }
+ ds_put_cstr(s, "\nBefore: value: ");
+ format_flags(s, ct_state_to_string, flow->ct_state, '|');
+ ds_put_cstr(s, ", mask: ");
+ format_flags(s, ct_state_to_string, wc->masks.ct_state, '|');
+ ds_put_cstr(s, ", together: ");
+ format_flags_masked(s, NULL, ct_state_to_string,
+ flow->ct_state, wc->masks.ct_state, UINT16_MAX);
+ ds_put_char(s, '\n');
+
/* Skip subtables with no match, or where the match is lower-priority
* than some certain match we've already found. */
match = find_match_wc(subtable, version, flow, trie_ctx, cls->n_tries,
wc);
+
+ ds_put_cstr(s, "After: value: ");
+ format_flags(s, ct_state_to_string, flow->ct_state, '|');
+ ds_put_cstr(s, ", mask: ");
+ format_flags(s, ct_state_to_string, wc->masks.ct_state, '|');
+ ds_put_cstr(s, ", together: ");
+ format_flags_masked(s, NULL, ct_state_to_string,
+ flow->ct_state, wc->masks.ct_state, UINT16_MAX);
+ ds_put_char(s, '\n');
+
if (!match || match->priority <= hard_pri) {
continue;
}
@@ -1128,7 +1158,7 @@ classifier_lookup__(const struct classifier *cls,
ovs_version_t version,
flow->conj_id = id;
rule = classifier_lookup__(cls, version, flow, wc, false,
- NULL);
+ NULL, s);
flow->conj_id = saved_conj_id;
if (rule) {
@@ -1205,7 +1235,15 @@ classifier_lookup(const struct classifier *cls,
ovs_version_t version,
struct flow *flow, struct flow_wildcards *wc,
struct hmapx *conj_flows)
{
- return classifier_lookup__(cls, version, flow, wc, true, conj_flows);
+ struct ds s = DS_EMPTY_INITIALIZER;
+ const struct cls_rule *result;
+
+ result = classifier_lookup__(cls, version, flow, wc, true, conj_flows, &s);
+
+ VLOG_INFO("Subtables Considered:\n%s", ds_cstr(&s));
+ ds_destroy(&s);
+
+ return result;
}
/* Finds and returns a rule in 'cls' with exactly the same priority and
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index da4cbbc07..48e5be305 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -4412,6 +4412,7 @@ rule_dpif_lookup_in_table(struct ofproto_dpif *ofproto,
ovs_version_t version,
struct hmapx *conj_flows)
{
struct classifier *cls = &ofproto->up.tables[table_id].cls;
+ VLOG_INFO("Rule lookup in table %d", table_id);
return rule_dpif_cast(rule_from_cls_rule(classifier_lookup(cls, version,
flow, wc,
conj_flows)));
---
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev