On Tue, Nov 7, 2023 at 8:06 AM Dumitru Ceara <dce...@redhat.com> wrote: > > On 11/6/23 08:19, Han Zhou wrote: > > On Sun, Nov 5, 2023 at 10:59 PM Ales Musil <amu...@redhat.com> wrote: > >> > >> > >> > >> On Sat, Nov 4, 2023 at 5:45 AM Han Zhou <hz...@ovn.org> wrote: > >>> > >>> > >>> > >>> On Fri, Nov 3, 2023 at 1:08 AM Ales Musil <amu...@redhat.com> wrote: > >>>> > >>>> > >>>> > >>>> On Tue, Oct 24, 2023 at 9:36 PM Han Zhou <hz...@ovn.org> wrote: > >>>>> > >>>>> Enhance MAC_Binding aging to allow CIDR-based threshold > > configurations. > >>>>> This enables distinct threshold settings for different IP ranges, > >>>>> applying the longest prefix matching for overlapping ranges. > >>>>> > >>>>> A common use case involves setting a default threshold for all IPs, > > while > >>>>> disabling aging for a specific range and potentially excluding a > > subset > >>>>> within that range. > >>>>> > >>>>> Signed-off-by: Han Zhou <hz...@ovn.org> > >>>>> --- > >>>> > >>>> > >>>> Hi Han, > >>>> > >>>> thank you for the patch, I have a couple of comments down below. > >>> > >>> Thanks for your review! > >>> > > > Hi Han, > > Thanks for the patch! Not a very in-depth review but I left a bunch of > minor comments and shared my opinion on the configuration format. >
Thanks for your review! > Regards, > Dumitru > > >>>> > >>>>> > >>>>> northd/aging.c | 297 > > +++++++++++++++++++++++++++++++++++++++++++++--- > >>>>> northd/aging.h | 3 + > >>>>> northd/northd.c | 11 +- > >>>>> ovn-nb.xml | 63 +++++++++- > >>>>> tests/ovn.at | 60 ++++++++++ > >>>>> 5 files changed, 413 insertions(+), 21 deletions(-) > >>>>> > >>>>> diff --git a/northd/aging.c b/northd/aging.c > >>>>> index f626c72c8ca3..e5868211a63b 100644 > >>>>> --- a/northd/aging.c > >>>>> +++ b/northd/aging.c > >>>>> @@ -47,12 +47,253 @@ aging_waker_schedule_next_wake(struct > > aging_waker *waker, int64_t next_wake_ms) > >>>>> } > >>>>> } > >>>>> > >>>>> +struct threshold_entry { > >>>>> + union { > >>>>> + ovs_be32 ipv4; > >>>>> + struct in6_addr ipv6; > >>>>> + } prefix; > >>>>> + bool is_v4; > >>>> > >>>> > >>>> We can avoid the is_v4 whatsover by storing the address as mapped v4 > > in 'in6_addr', I saw the concern and we can still have separate arrays for > > v4 and v6. The parsing IMO becomes much simpler if we use the mapped v4 for > > both see down below. > >>>> > >>> > >>> I did consider using in6_addr, but it would make the > > find_threshold_for_ip() slightly more tedious because for every v4 entry, > > even if stored in a separate array, we will still need to call > > in6_addr_get_mapped_ipv4() to convert it. While using the union we can > > directly access what we need (ipv4 or ipv6). > >> > >> > >> We wouldn't have to actually call in6_addr_get_mapped_ipv4() for the v4 > > variant, that's why I suggested that approach. We can adjust the mask and > > then use the same logic as for ipv6 (entry->plen += 96 should do the trick). > >> > > > > Understood, and my reply below was about that: > >>> The search may be simplified a little in find_threshold_for_ip() with > > adjusted prefix for v4, only if we combine the two arrays to a single > > array. Otherwise we will need to have a function to abstract and reuse the > > ipv6 matching logic, while the ipv4 matching is really just 3 lines of code > > :) > > > >>> > >>> The is_v4 member was needed when using a single array, but not necessary > > any more when storing v4 and v6 entries separately, but I kept it just for > > convenience, so that I don't need to use another arg in > > parse_threshold_entry() to return the AF type. Using in6_addr can avoid the > > extra arg, too, but similarly, we would have to always call the macro > > IN6_IS_ADDR_V4MAPPED, which doesn't seem cleaner than the is_v4 field to > > me. Using ip46_parse_cidr does save several lines of code, but that code > > snippet itself is quite small. So with all these being considered, I think > > either approach doesn't really make too much difference. I am totally fine > > to change it if you still think a single in6_addr is better than the union. > >>> > >>>>> > >>>>> + unsigned int plen; > >>>>> + unsigned int threshold; > >>>> > >>>> > >>>> I personally prefer sized integers. > >>>> > >>> I think it may be better to match the type of related helper functions. > >>> For plen, it is ip[v6]_parse_cidr(). > >>> For threshold, it is str_to_uint(), or even smap_get_uint() returns > > unsigned int. > >>> > >>>>> > >>>>> +}; > >>>>> + > >>>>> +/* Contains CIDR-based aging threshold configuration parsed from > >>>>> + * "Logical_Router:options:mac_binding_age_threshold". > >>>>> + * > >>>>> + * This struct is also used for non-CIDR-based threshold, e.g. the > > ones from > >>>>> + * "NB_Global:other_config:fdb_age_threshold" for the common > > aging_context > >>>>> + * interface. > >>>>> + * > >>>>> + * - The arrays `v4_entries` and `v6_entries` are populated with > > parsed entries > >>>>> + * for IPv4 and IPv6 CIDRs, respectively, along with their > > associated > >>>>> + * thresholds. Entries within these arrays are sorted by prefix > > length, > >>>>> + * starting with the longest. > >>>>> + * > >>>>> + * - If a threshold is provided without an accompanying prefix, it's > > captured > >>>>> + * in `default_threshold`. In cases with multiple unprefixed > > thresholds, > >>>>> + * `default_threshold` will only store the last one. */ > >>>>> +struct threshold_config { > >>>>> + struct threshold_entry *v4_entries; > >>>>> + int n_v4_entries; > > Nit: size_t > Ack > >>>>> + struct threshold_entry *v6_entries; > >>>>> + int n_v6_entries; > > Nit: size_t > Ack > >>>>> + unsigned int default_threshold; > >>>>> +}; > >>>>> + > >>>>> +static int > >>>>> +compare_entries_by_prefix_length(const void *a, const void *b) > >>>>> +{ > >>>>> + const struct threshold_entry *entry_a = a; > >>>>> + const struct threshold_entry *entry_b = b; > >>>>> + > >>>>> + return entry_b->plen - entry_a->plen; > >>>>> +} > >>>>> + > >>>>> +/* Parse an ENTRY in the threshold option, with the format: > >>>>> + * [CIDR:]THRESHOLD > >>>>> + * > >>>>> + * Returns true if successful, false if failed. */ > >>>>> +static bool > >>>>> +parse_threshold_entry(const char *str, struct threshold_entry *entry) > >>>>> +{ > >>>>> + char *colon_ptr; > >>>>> + unsigned int value; > >>>>> + const char *threshold_str; > >>>>> + > >>>>> + colon_ptr = strrchr(str, ':'); > >>>> > >>>> > >>>> I find this a bit confusing and probably error prone, could we use > > different delimiters for the threshold? I guess we could change it to > > 'cidr;threshold,cidr;threshold' WDYT? This would allow us to use > > 'strtok_r()' here and we could spare the multiple copies and replace with a > > single one. > >>>> > >>> I thought about that, too, but decided to use ":" instead of ";" because > > "A:B" looks more natural to represent "A has a value B", and ";" is more > > like a delimiter to separate two different elements, just like ",". So I > > feel that ":" might be more friendly to end users. I'd love to hear > > opinions from more people. > >>> If ";" or something else turned out to be a better option for the end > > user, we can use strtok_r() here but I think it shouldn't be a concern if > > we can't use it. > > IMO ":" is better. But that's just personal preference. > OK > >>> > >>>>> > >>>>> + if (!colon_ptr) { > >>>>> + threshold_str = str; > >>>>> + entry->plen = 0; > >>>>> + } else { > >>>>> + threshold_str = colon_ptr + 1; > >>>>> + } > >>>>> + > >>>>> + if (!str_to_uint(threshold_str, 10, &value)) { > >>>>> + return false; > >>>>> + } > >>>>> + entry->threshold = value; > >>>>> + > >>>>> + if (!colon_ptr) { > >>>>> + return true; > >>>>> + } > >>>>> + > >>>>> + /* ":" was found, so parse the string before ":" as a cidr. */ > >>>>> + char ip_cidr[128]; > >>>>> + ovs_strzcpy(ip_cidr, str, MIN(colon_ptr - str + 1, sizeof > > ip_cidr)); > >>>>> + char *error = ip_parse_cidr(ip_cidr, &entry->prefix.ipv4, > > &entry->plen); > >>>> > >>>> > >>>> As mentioned earlier by storing it in 'in6_addr' we could use > > 'ip46_parse_cidr()' which would simplify the logic. We can also adjust the > > prefix for v4 so the search is the same for both. > >>> > >>> The search may be simplified a little in find_threshold_for_ip() with > > adjusted prefix for v4, only if we combine the two arrays to a single > > array. Otherwise we will need to have a function to abstract and reuse the > > ipv6 matching logic, while the ipv4 matching is really just 3 lines of code > > :) > >>> But if you prefer to use a single array for both v4 and v6, I can try > > that in v2. > >>> > >>>> > >>>>> > >>>>> + if (!error) { > >>>>> + entry->is_v4 = true; > >>>>> + return true; > >>>>> + } > >>>>> + free(error); > >>>>> + error = ipv6_parse_cidr(ip_cidr, &entry->prefix.ipv6, > > &entry->plen); > > If I'm not wrong this is ambiguous for cases when the cidr length is not > present. > > Maybe we should add a test for an (explicit and implicit) /128 config. > Both ip_parse_cidr and ipv6_parse_cidr accept different formats including addresses without mask/length. There are actually lots of combinations but I think it should belong to unit tests of those functions instead of testing them everywhere when they are used. I did include random examples in the test case including an IPv4 address without mask and I can add one IPv6 without mask, too, but still, it is not supposed to be a thorough test for those CIDR parsing functions. > >>>>> + if (!error) { > >>>>> + entry->is_v4 = false; > >>>>> + return true; > >>>>> + } > >>>>> + free(error); > >>>>> + return false; > >>>>> +} > >>>>> + > >>>>> +static void > >>>>> +threshold_config_destroy(struct threshold_config *config) > >>>>> +{ > >>>>> + free(config->v4_entries); > >>>>> + free(config->v6_entries); > >>>>> + config->v4_entries = config->v6_entries = NULL; > >>>>> + config->n_v4_entries = config->n_v6_entries = 0; > >>>>> + config->default_threshold = 0; > >>>>> +} > >>>>> + > >>>>> +/* Parse the threshold option string, which has the format: > >>>>> + * ENTRY[;ENTRY[...]] > >>>>> + * > >>>>> + * For the exact format of ENTRY, refer to the function > >>>>> + * `parse_threshold_entry`. > >>>>> + * > >>>>> + * The parsed data is populated to the struct threshold_config. > >>>>> + * See the comments of struct threshold_config for details. > >>>>> + * > >>>>> + * Return Values: > >>>>> + * - Returns `false` if the input does not match the expected format. > >>>>> + * Consequently, no entries will be populated. > >>>>> + * - Returns `true` upon successful parsing. The caller is > > responsible for > >>>>> + * releasing the allocated memory by calling > > threshold_config_destroy. */ > >>>>> +static bool > >>>>> +parse_aging_threshold(const char *opt, > >>>>> + struct threshold_config *config) > >>>>> +{ > >>>>> + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1); > >>>>> + struct threshold_entry e; > >>>>> + const char *start = opt, *end; > >>>>> + memset(config, 0, sizeof *config); > >>>>> + > >>>>> + while (start && *start) { > >>>>> + end = strchr(start, ';'); > >>>> > >>>> > >>>> We should use 'strtok_r()' this again allows us to avoid the constant > > copies replaced by single one. > >>> > >>> Make sense. I will change this in v2. > >>> > >>>> > >>>>> > >>>>> + if (end) { > >>>>> + /* Big enough for any <ipv6>/<ipv6 mask>:<threshold> */ > >>>>> + char buf[128]; > >>>>> + ovs_strzcpy(buf, start, MIN(end - start + 1, sizeof > > buf)); > >>>>> + if (!parse_threshold_entry(buf, &e)) { > >>>>> + VLOG_WARN_RL(&rl, "Parsing aging threshold '%s' > > failed.", buf); > >>>>> + goto err; > >>>>> + } > >>>>> + start = end + 1; > >>>>> + } else { > >>>>> + if (!parse_threshold_entry(start, &e)) { > >>>>> + VLOG_WARN_RL(&rl, "Parsing aging threshold '%s' > > failed.", > >>>>> + start); > >>>>> + goto err; > >>>>> + } > >>>>> + start += strlen(start); > >>>>> + } > >>>>> + if (!e.plen) { > >>>>> + config->default_threshold = e.threshold; > >>>>> + } else if (e.is_v4) { > >>>>> + config->n_v4_entries++; > >>>>> + config->v4_entries = xrealloc(config->v4_entries, > >>>>> + config->n_v4_entries * > > sizeof e); > >>>>> + config->v4_entries[config->n_v4_entries - 1] = e; > > Nit: Maybe use x2nrealloc()? I opted not to use x2nrealloc since it's merely a short configuration string (without much thinking). Using x2nrealloc could give the impression to the code reader that a large number of entries are anticipated. But, I am really fine either way. Best regards, Han > > >>>>> + } else { > >>>>> + config->n_v6_entries++; > >>>>> + config->v6_entries = xrealloc(config->v6_entries, > >>>>> + config->n_v6_entries * > > sizeof e); > >>>>> + config->v6_entries[config->n_v6_entries - 1] = e; > >>>>> + } > >>>>> + } > >>>>> + > >>>>> + if (config->n_v4_entries > 0) { > >>>>> + qsort(config->v4_entries, config->n_v4_entries, sizeof e, > >>>>> + compare_entries_by_prefix_length); > >>>>> + } > >>>>> + > >>>>> + if (config->n_v6_entries > 0) { > >>>>> + qsort(config->v6_entries, config->n_v6_entries, sizeof e, > >>>>> + compare_entries_by_prefix_length); > >>>>> + } > >>>>> + > >>>>> + return true; > >>>>> + > >>>>> +err: > >>>>> + threshold_config_destroy(config); > >>>>> + return false; > >>>>> +} > >>>>> + > >>>>> +static unsigned int > >>>>> +find_threshold_for_ip(const char *ip_str, > >>>>> + const struct threshold_config *config) > >>>>> +{ > >>>>> + if (!ip_str) { > >>>>> + return config->default_threshold; > >>>>> + } > >>>>> + > >>>>> + ovs_be32 ipv4; > >>>>> + struct in6_addr ipv6; > >>>>> + if (ip_parse(ip_str, &ipv4)) { > >>>>> + for (int i = 0; i < config->n_v4_entries; i++) { > >>>>> + ovs_be32 masked_ip = ipv4 & > >>>>> + be32_prefix_mask(config->v4_entries[i].plen); > >>>>> + if (masked_ip == config->v4_entries[i].prefix.ipv4) { > >>>>> + return config->v4_entries[i].threshold; > >>>>> + } > >>>>> + } > >>>>> + } else if (ipv6_parse(ip_str, &ipv6)) { > >>>>> + for (int i = 0; i < config->n_v6_entries; i++) { > >>>>> + struct in6_addr v6_mask = > >>>>> + ipv6_create_mask(config->v6_entries[i].plen); > >>>>> + struct in6_addr masked_ip = ipv6_addr_bitand(&ipv6, > > &v6_mask); > >>>>> + if (ipv6_addr_equals(&masked_ip, > >>>>> + > > &config->v6_entries[i].prefix.ipv6)) { > >>>>> + return config->v6_entries[i].threshold; > >>>>> + } > >>>>> + } > >>>>> + } > >>>>> + return config->default_threshold; > >>>>> +} > >>>>> + > >>>>> +/* Parse the threshold option string (see the comment of the function > >>>>> + * parse_aging_threshold), and returns the smallest threshold. */ > >>>>> +unsigned int > >>>>> +min_mac_binding_age_threshold(const char *opt) > >>>>> +{ > >>>>> + struct threshold_config config; > >>>>> + if (!parse_aging_threshold(opt, &config)) { > >>>>> + return 0; > >>>>> + } > >>>>> + > >>>>> + unsigned int threshold = UINT_MAX; > >>>>> + unsigned int t; > >>>>> + > >>>>> + for (int i = 0; i < config.n_v4_entries; i++) { > >>>>> + t = config.v4_entries[i].threshold; > >>>>> + if (t && t < threshold) { > >>>>> + threshold = t; > >>>>> + } > >>>>> + } > >>>>> + > >>>>> + for (int i = 0; i < config.n_v6_entries; i++) { > >>>>> + t = config.v6_entries[i].threshold; > >>>>> + if (t && t < threshold) { > >>>>> + threshold = t; > >>>>> + } > >>>>> + } > >>>>> + > >>>>> + t = config.default_threshold; > >>>>> + if (t && t < threshold) { > >>>>> + threshold = t; > >>>>> + } > >>>>> + > >>>>> + threshold_config_destroy(&config); > >>>>> + > >>>>> + return threshold == UINT_MAX ? 0 : threshold; > >>>>> +} > >>>>> + > >>>>> struct aging_context { > >>>>> int64_t next_wake_ms; > >>>>> int64_t time_wall_now; > >>>>> uint32_t removal_limit; > >>>>> uint32_t n_removed; > >>>>> - uint64_t threshold; > >>>>> + struct threshold_config *threshold; > >>>>> }; > >>>>> > >>>>> static struct aging_context > >>>>> @@ -63,13 +304,14 @@ aging_context_init(uint32_t removal_limit) > >>>>> .time_wall_now = time_wall_msec(), > >>>>> .removal_limit = removal_limit, > >>>>> .n_removed = 0, > >>>>> - .threshold = 0, > >>>>> + .threshold = NULL, > >>>>> }; > >>>>> return ctx; > >>>>> } > >>>>> > >>>>> static void > >>>>> -aging_context_set_threshold(struct aging_context *ctx, uint64_t > > threshold) > >>>>> +aging_context_set_threshold(struct aging_context *ctx, > >>>>> + struct threshold_config *threshold) > >>>>> { > >>>>> ctx->threshold = threshold; > >>>>> } > >>>>> @@ -81,19 +323,27 @@ aging_context_is_at_limit(struct aging_context > > *ctx) > >>>>> } > >>>>> > >>>>> static bool > >>>>> -aging_context_handle_timestamp(struct aging_context *ctx, int64_t > > timestamp) > >>>>> +aging_context_handle_timestamp(struct aging_context *ctx, int64_t > > timestamp, > >>>>> + const char *ip) > >>>>> { > >>>>> int64_t elapsed = ctx->time_wall_now - timestamp; > >>>>> if (elapsed < 0) { > >>>>> return false; > >>>>> } > >>>>> > >>>>> - if (elapsed >= ctx->threshold) { > >>>>> + ovs_assert(ctx->threshold); > >>>>> + uint64_t threshold = 1000 * find_threshold_for_ip(ip, > > ctx->threshold); > >>>>> + > >>>>> + if (!threshold) { > >>>>> + return false; > >>>>> + } > >>>>> + > >>>>> + if (elapsed >= threshold) { > >>>>> ctx->n_removed++; > >>>>> return true; > >>>>> } > >>>>> > >>>>> - ctx->next_wake_ms = MIN(ctx->next_wake_ms, (ctx->threshold - > > elapsed)); > >>>>> + ctx->next_wake_ms = MIN(ctx->next_wake_ms, (threshold - > > elapsed)); > >>>>> return false; > >>>>> } > >>>>> > >>>>> @@ -121,13 +371,18 @@ mac_binding_aging_run_for_datapath(const struct > > sbrec_datapath_binding *dp, > >>>>> return; > >>>>> } > >>>>> > >>>>> + if (!ctx->threshold->n_v4_entries && > > !ctx->threshold->n_v6_entries > >>>>> + && !ctx->threshold->default_threshold) { > >>>>> + return; > >>>>> + } > >>>>> + > >>>>> struct sbrec_mac_binding *mb_index_row = > >>>>> sbrec_mac_binding_index_init_row(mb_by_datapath); > >>>>> sbrec_mac_binding_index_set_datapath(mb_index_row, dp); > >>>>> > >>>>> const struct sbrec_mac_binding *mb; > >>>>> SBREC_MAC_BINDING_FOR_EACH_EQUAL (mb, mb_index_row, > > mb_by_datapath) { > >>>>> - if (aging_context_handle_timestamp(ctx, mb->timestamp)) { > >>>>> + if (aging_context_handle_timestamp(ctx, mb->timestamp, > > mb->ip)) { > >>>>> sbrec_mac_binding_delete(mb); > >>>>> if (aging_context_is_at_limit(ctx)) { > >>>>> break; > >>>>> @@ -166,13 +421,20 @@ en_mac_binding_aging_run(struct engine_node > > *node, void *data OVS_UNUSED) > >>>>> continue; > >>>>> } > >>>>> > >>>>> - uint64_t threshold = smap_get_uint(&od->nbr->options, > >>>>> - > > "mac_binding_age_threshold", 0); > >>>>> - aging_context_set_threshold(&ctx, threshold * 1000); > >>>>> + struct threshold_config threshold_config; > >>>>> + if (!parse_aging_threshold(smap_get(&od->nbr->options, > >>>>> + > > "mac_binding_age_threshold"), > >>>>> + &threshold_config)) { > >>>>> + return; > >>>>> + } > >>>>> + > >>>>> + aging_context_set_threshold(&ctx, &threshold_config); > >>>>> > >>>>> mac_binding_aging_run_for_datapath(od->sb, > >>>>> > > sbrec_mac_binding_by_datapath, > >>>>> &ctx); > >>>>> + threshold_config_destroy(&threshold_config); > >>>>> + > >>>>> if (aging_context_is_at_limit(&ctx)) { > >>>>> /* Schedule the next run after specified delay. */ > >>>>> ctx.next_wake_ms = AGING_BULK_REMOVAL_DELAY_MSEC; > >>>>> @@ -255,7 +517,7 @@ fdb_run_for_datapath(const struct > > sbrec_datapath_binding *dp, > >>>>> > >>>>> const struct sbrec_fdb *fdb; > >>>>> SBREC_FDB_FOR_EACH_EQUAL (fdb, fdb_index_row, fdb_by_dp_key) { > >>>>> - if (aging_context_handle_timestamp(ctx, fdb->timestamp)) { > >>>>> + if (aging_context_handle_timestamp(ctx, fdb->timestamp, > > NULL)) { > >>>>> sbrec_fdb_delete(fdb); > >>>>> if (aging_context_is_at_limit(ctx)) { > >>>>> break; > >>>>> @@ -293,11 +555,16 @@ en_fdb_aging_run(struct engine_node *node, void > > *data OVS_UNUSED) > >>>>> continue; > >>>>> } > >>>>> > >>>>> - uint64_t threshold = > >>>>> - smap_get_uint(&od->nbs->other_config, > > "fdb_age_threshold", 0); > >>>>> - aging_context_set_threshold(&ctx, threshold * 1000); > >>>>> - > >>>>> + struct threshold_config threshold_config; > >>>>> + if (!parse_aging_threshold(smap_get(&od->nbs->other_config, > >>>>> + "fdb_age_threshold"), > >>>>> + &threshold_config)) { > >>>> > >>>> > >>>> This would mean that fdb aging would accept the same arguments, but > > that doesn't seem right. We should set only the default value instead of > > the whole parsing. > >>>> > >>> Calling parse_aging_threshold here is just a convenient way to > > initialize the other fields of the structure while setting the default > > value. I agree it may be just more clear to explicitly initialize the > > threshold_config and set the default value by calling smap_get_uint(). I > > will do that in v2. > >>> > >>> I will wait for the consensus on the CIDR option's format and the > > separate v.s. combined array before sending v2. > >> > >> > >> I definitely don't want to block this patch on something that is more or > > less cosmetic change so let's see what others think. > >> > > The option format may be more important since it is user facing. With an > > example, we need to choose between: > > > > A: 192.168.0.0/16:300;192.168.10.0/24:0;fe80::/10:600;1200 > > > > or > > > > B: 192.168.0.0/16;300,192.168.10.0/24;0,fe80::/10;600,1200 > > > > Thanks, > > Han > > > >>> > >>> > >>> Thanks, > >>> Han > >>> > >>>>> > >>>>> + return; > >>>>> + } > >>>>> + aging_context_set_threshold(&ctx, &threshold_config); > >>>>> fdb_run_for_datapath(od->sb, sbrec_fdb_by_dp_key, &ctx); > >>>>> + threshold_config_destroy(&threshold_config); > >>>>> + > >>>>> if (aging_context_is_at_limit(&ctx)) { > >>>>> /* Schedule the next run after specified delay. */ > >>>>> ctx.next_wake_ms = AGING_BULK_REMOVAL_DELAY_MSEC; > >>>>> diff --git a/northd/aging.h b/northd/aging.h > >>>>> index 650990e244aa..2cfe2fd44457 100644 > >>>>> --- a/northd/aging.h > >>>>> +++ b/northd/aging.h > >>>>> @@ -30,6 +30,9 @@ void *en_mac_binding_aging_waker_init(struct > > engine_node *node, > >>>>> struct engine_arg *arg); > >>>>> void en_mac_binding_aging_waker_cleanup(void *data); > >>>>> > >>>>> +/* The MAC binding aging helper function. */ > >>>>> +unsigned int min_mac_binding_age_threshold(const char *opt); > >>>>> + > >>>>> /* The FDB aging node functions. */ > >>>>> void en_fdb_aging_run(struct engine_node *node, void *data); > >>>>> void *en_fdb_aging_init(struct engine_node *node, struct engine_arg > > *arg); > >>>>> diff --git a/northd/northd.c b/northd/northd.c > >>>>> index f8b046d83e85..4bc898982cf2 100644 > >>>>> --- a/northd/northd.c > >>>>> +++ b/northd/northd.c > >>>>> @@ -17,6 +17,7 @@ > >>>>> #include <stdlib.h> > >>>>> #include <stdio.h> > >>>>> > >>>>> +#include "aging.h" > >>>>> #include "debug.h" > >>>>> #include "bitmap.h" > >>>>> #include "coverage.h" > >>>>> @@ -1166,8 +1167,14 @@ ovn_datapath_update_external_ids(struct > > ovn_datapath *od) > >>>>> smap_add(&ids, "always_learn_from_arp_request", "false"); > >>>>> } > >>>>> > >>>>> - uint32_t age_threshold = smap_get_uint(&od->nbr->options, > >>>>> - > > "mac_binding_age_threshold", 0); > >>>>> + /* For timestamp refreshing, the smallest threshold of the > > option is > >>>>> + * set to SB to make sure all entries are refreshed in time. > >>>>> + * XXX: This approach simplifies processing in > > ovn-controller, but it > >>>>> + * may be enhanced, if necessary, to parse the complete > > CIDR-based > >>>>> + * threshold configurations to SB to reduce unnecessary > > refreshes. */ > >>>>> + uint32_t age_threshold = min_mac_binding_age_threshold( > >>>>> + smap_get(&od->nbr->options, > >>>>> + > > "mac_binding_age_threshold")); > >>>>> if (age_threshold) { > >>>>> smap_add_format(&ids, "mac_binding_age_threshold", > >>>>> "%u", age_threshold); > >>>>> diff --git a/ovn-nb.xml b/ovn-nb.xml > >>>>> index 1de0c30416ce..80733273484e 100644 > >>>>> --- a/ovn-nb.xml > >>>>> +++ b/ovn-nb.xml > >>>>> @@ -2686,10 +2686,65 @@ or > >>>>> </column> > >>>>> > >>>>> <column name="options" key="mac_binding_age_threshold" > >>>>> - type='{"type": "integer", "minInteger": 0, > > "maxInteger": 4294967295}'> > >>>>> - MAC binding aging <code>threshold</code> value in seconds. > > MAC binding > >>>>> - exceeding this timeout will be automatically removed. The > > value > >>>>> - defaults to 0, which means disabled. > >>>>> + type='{"type": "string"}'> > >>>>> + <p> > >>>>> + Specifies the MAC binding aging thresholds based on > > CIDRs, with the > >>>>> + format: > > <var>entry</var>[<code>;</code><var>entry</var>[...]], > >>>>> + where each <var>entry</var> has the format: > >>>>> + [<var>cidr</var><code>:</code>]<var>threshold</var> > >>>>> + </p> > >>>>> + > >>>>> + <ul> > >>>>> + <li> > >>>>> + <var>cidr</var>: Can be either an IPv4 or IPv6 CIDR. > >>>>> + </li> > >>>>> + <li> > >>>>> + <var>threshold</var>: Threshold value in seconds. MAC > > bindings with > >>>>> + IP addresses matching the specified CIDR that exceed > > this timeout > >>>>> + will be automatically removed. > >>>>> + </li> > >>>>> + </ul> > >>>>> + > >>>>> + <p> > >>>>> + If an <var>entry</var> is provided without an CIDR (just > > the > >>>>> + threshold value), it specifies the default threshold for > > MAC > >>>>> + bindings that don't match any of the given CIDRs. If > > there are > >>>>> + multiple default threshold entries in the option, the > > behavior is > >>>>> + undefined. > >>>>> + </p> > >>>>> + > >>>>> + <p> > >>>>> + If there are multiple CIDRs matching a MAC binding IP, > > the one with > >>>>> + the longest prefix length takes effect. If there are > > multiple > >>>>> + entries with the same CIDR in the option, the behavior is > >>>>> + undefined. > >>>>> + </p> > >>>>> + > >>>>> + <p> > >>>>> + If no matching CIDR is found for a MAC binding IP, and > > no default > >>>>> + threshold is specified, the behavior defaults to the > > original: the > >>>>> + binding will not be removed based on age. > >>>>> + </p> > >>>>> + > >>>>> + <p> > >>>>> + The value can also default to an empty string, which > > means that the > >>>>> + aging threshold is disabled. Any string not in the above > > format is > >>>>> + regarded as invalid and the aging is disabled. > >>>>> + </p> > >>>>> + > >>>>> + <p> > >>>>> + Example: > >>>>> + <code> > > 192.168.0.0/16:300;192.168.10.0/24:0;fe80::/10:600;1200</code> > >>>>> + </p> > >>>>> + > >>>>> + <p> > >>>>> + This sets a threshold of 300 seconds for MAC bindings > > with IP > >>>>> + addresses in the 192.168.0.0/16 range, excluding the > > 192.168.1.0/24 > >>>>> + range (for which the aging is disabled), a threshold of > > 600 seconds > >>>>> + for MAC bindings with IP addresses in the fe80::/10 IPv6 > > range, and > >>>>> + a default threshold of 1200 seconds for all other MAC > > bindings. > >>>>> + </p> > >>>>> + > >>>>> </column> > >>>>> </group> > >>>>> > >>>>> diff --git a/tests/ovn.at b/tests/ovn.at > >>>>> index 637d92bedbc7..07b5c1ec57fc 100644 > >>>>> --- a/tests/ovn.at > >>>>> +++ b/tests/ovn.at > >>>>> @@ -34676,6 +34676,66 @@ OVS_WAIT_UNTIL([ > >>>>> > >>>>> AT_CHECK([as hv1 ovs-ofctl dump-flows br-int table=79 --no-stats | > > strip_cookie], [0], []) > >>>>> > >>>>> +# Test CIDR-based threshold configuration > >>>>> +check ovn-nbctl set logical_router gw > > options:mac_binding_age_threshold=" > > 192.168.10.0/255.255.255.0:2;192.168.10.64/26:0;192.168.10.20:0" > >>>>> +check ovn-nbctl --wait=sb sync > >>>>> +uuid=$(fetch_column datapath _uuid external_ids:name=gw) > >>>>> +AT_CHECK([ovn-sbctl get datapath $uuid > > external_ids:mac_binding_age_threshold], [0], [dnl > >>>>> +"2" > >>>>> +]) > >>>>> + > >>>>> +# Send GARP to populate MAC binding table records > >>>>> +send_garp hv1 ext1 10 # belong to 192.168.10.0/24 > >>>>> +send_garp hv2 ext2 20 # belong to 192.168.10.20/32 > >>>>> +send_garp hv2 ext2 65 # belong to 192.168.10.64/26 > >>>>> + > >>>>> +OVS_WAIT_UNTIL([ovn-sbctl list mac_binding | grep -q > > "192.168.10.10"]) > >>>>> +OVS_WAIT_UNTIL([ovn-sbctl list mac_binding | grep -q > > "192.168.10.20"]) > >>>>> +OVS_WAIT_UNTIL([ovn-sbctl list mac_binding | grep -q > > "192.168.10.65"]) > >>>>> + > >>>>> +OVS_WAIT_UNTIL([ > >>>>> + test "0" = "$(ovn-sbctl list mac_binding | grep -c > > '192.168.10.10')" > >>>>> +]) > >>>>> +# The other two should remain because the corresponding prefixes > > have threshold 0 > >>>>> +AT_CHECK([ovn-sbctl list mac_binding | grep -q "192.168.10.20"]) > >>>>> +AT_CHECK([ovn-sbctl list mac_binding | grep -q "192.168.10.65"]) > >>>>> +check ovn-sbctl --all destroy mac_binding > >>>>> + > >>>>> +# Set the aging threshold mixed with IPv6 prefixes and default > > threshold > >>>>> +check ovn-nbctl set logical_router gw > > options:mac_binding_age_threshold="1; 192.168.10.64/26:0;ff00:1234::/32:888" > >>>>> +check ovn-nbctl --wait=sb sync > >>>>> +uuid=$(fetch_column datapath _uuid external_ids:name=gw) > >>>>> +AT_CHECK([ovn-sbctl get datapath $uuid > > external_ids:mac_binding_age_threshold], [0], [dnl > >>>>> +"1" > >>>>> +]) > >>>>> + > >>>>> +# Send GARP to populate MAC binding table records > >>>>> +send_garp hv1 ext1 10 # belong to default > >>>>> +send_garp hv2 ext2 65 # belong to 192.168.10.64/26 > >>>>> + > >>>>> +OVS_WAIT_UNTIL([ovn-sbctl list mac_binding | grep -q > > "192.168.10.10"]) > >>>>> +OVS_WAIT_UNTIL([ovn-sbctl list mac_binding | grep -q > > "192.168.10.65"]) > >>>>> + > >>>>> +OVS_WAIT_UNTIL([ > >>>>> + test "0" = "$(ovn-sbctl list mac_binding | grep -c > > '192.168.10.10')" > >>>>> +]) > >>>>> +AT_CHECK([ovn-sbctl list mac_binding | grep -q "192.168.10.65"]) > >>>>> +check ovn-sbctl --all destroy mac_binding > >>>>> + > >>>>> +# Set the aging threshold with invalid format > >>>>> +check ovn-nbctl set logical_router gw > > options:mac_binding_age_threshold="1;abc/26:0" > >>>>> +check ovn-nbctl --wait=sb sync > >>>>> +uuid=$(fetch_column datapath _uuid external_ids:name=gw) > >>>>> +AT_CHECK([ovn-sbctl get datapath $uuid > > external_ids:mac_binding_age_threshold], [1], [ignore], [ignore]) > >>>>> + > >>>>> +# Send GARP to populate MAC binding table records > >>>>> +send_garp hv1 ext1 10 > >>>>> + > >>>>> +OVS_WAIT_UNTIL([ovn-sbctl list mac_binding | grep -q > > "192.168.10.10"]) > >>>>> +# The record is not deleted > >>>>> +sleep 5 > >>>>> +AT_CHECK([ovn-sbctl list mac_binding | grep -q "192.168.10.10"]) > >>>>> + > >>>>> OVN_CLEANUP([hv1], [hv2]) > >>>>> AT_CLEANUP > >>>>> ]) > >>>>> -- > >>>>> 2.38.1 > >>>>> > >>>>> _______________________________________________ > >>>>> dev mailing list > >>>>> d...@openvswitch.org > >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >>>>> > >>>> > >>>> Thanks, > >>>> Ales > >>>> > >>>> -- > >>>> > >>>> Ales Musil > >>>> > >>>> Senior Software Engineer - OVN Core > >>>> > >>>> Red Hat EMEA > >>>> > >>>> amu...@redhat.com > >> > >> > >> Thanks, > >> Ales > >> -- > >> > >> Ales Musil > >> > >> Senior Software Engineer - OVN Core > >> > >> Red Hat EMEA > >> > >> amu...@redhat.com > > _______________________________________________ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev