[ovs-dev] [PATCH v3 ovn] northd: add the capability to inherit logical routers lbs on logical switches
Add the capability to automatically deploy a load-balancer on each logical-switch connected to a logical router where the load-balancer has been installed by the CMS. This patch allow to overcome the distributed gw router scenario limitation where a load-balancer must be installed on each datapath to properly reach the load-balancer. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2043543 Signed-off-by: Lorenzo Bianconi --- Changes since v2: - rebase on top of ovn master Changes since v1: - rebase on top of ovn master - add NEWS entry - improve selftests --- NEWS| 5 +++ northd/northd.c | 56 + northd/ovn-northd.8.xml | 8 + tests/ovn-northd.at | 80 + 4 files changed, 149 insertions(+) diff --git a/NEWS b/NEWS index e015ae8e7..8b4f91553 100644 --- a/NEWS +++ b/NEWS @@ -4,6 +4,11 @@ Post v22.06.0 "ovn-encap-df_default" to enable or disable tunnel DF flag. - Add option "localnet_learn_fdb" to LSP that will allow localnet ports to learn MAC addresses and store them in FDB table. + - northd: introduce the capability to automatically deploy a load-balancer +on each logical-switch connected to a logical router where the +load-balancer has been installed by the CMS. In order to enable the +feature the CMS has to set install_ls_lb_from_router to true in option +column of NB_Global table. OVN v22.06.0 - XX XXX -- diff --git a/northd/northd.c b/northd/northd.c index 0207f6ce1..a95a5148e 100644 --- a/northd/northd.c +++ b/northd/northd.c @@ -63,6 +63,8 @@ static bool lflow_hash_lock_initialized = false; static bool check_lsp_is_up; +static bool install_ls_lb_from_router; + /* MAC allocated for service monitor usage. Just one mac is allocated * for this purpose and ovn-controller's on each chassis will make use * of this mac when sending out the packets to monitor the services @@ -4140,6 +4142,55 @@ build_lrouter_lbs_reachable_ips(struct hmap *datapaths, struct hmap *lbs) } } +static void +build_lswitch_lbs_from_lrouter(struct hmap *datapaths, struct hmap *lbs) +{ +if (!install_ls_lb_from_router) { +return; +} + +struct ovn_datapath *od; +HMAP_FOR_EACH (od, key_node, datapaths) { +if (!od->nbs) { +continue; +} + +struct ovn_port *op; +LIST_FOR_EACH (op, dp_node, >port_list) { +if (!lsp_is_router(op->nbsp)) { +continue; +} +if (!op->peer) { +continue; +} + +struct ovn_datapath *peer_od = op->peer->od; +for (size_t i = 0; i < peer_od->nbr->n_load_balancer; i++) { +bool installed = false; +const struct uuid *lb_uuid = +_od->nbr->load_balancer[i]->header_.uuid; +struct ovn_northd_lb *lb = ovn_northd_lb_find(lbs, lb_uuid); +if (!lb) { +continue; +} + +for (size_t j = 0; j < lb->n_nb_ls; j++) { + if (lb->nb_ls[j] == od) { + installed = true; + break; + } +} +if (!installed) { +ovn_northd_lb_add_ls(lb, od); +} +if (lb->nlb) { +od->has_lb_vip |= lb_has_vip(lb->nlb); +} +} +} +} +} + /* This must be called after all ports have been processed, i.e., after * build_ports() because the reachability check requires the router ports * networks to have been parsed. @@ -4152,6 +4203,7 @@ build_lb_port_related_data(struct hmap *datapaths, struct hmap *ports, build_lrouter_lbs_check(datapaths); build_lrouter_lbs_reachable_ips(datapaths, lbs); build_lb_svcs(input_data, ovnsb_txn, ports, lbs); +build_lswitch_lbs_from_lrouter(datapaths, lbs); } /* Syncs relevant load balancers (applied to logical switches) to the @@ -15378,6 +15430,10 @@ ovnnb_db_run(struct northd_input *input_data, "ignore_lsp_down", true); default_acl_drop = smap_get_bool(>options, "default_acl_drop", false); +install_ls_lb_from_router = smap_get_bool(>options, + "install_ls_lb_from_router", + false); + build_chassis_features(input_data, >features); build_datapaths(input_data, ovnsb_txn, >datapaths, >lr_list); build_lbs(input_data, >datapaths, >lbs); diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml index 1f7022490..2a2b33051 100644 --- a/northd/ovn-northd.8.xml +++ b/northd/ovn-northd.8.xml @@ -882,6 +882,10 @@ reg2. For IPv6 traffic the flow also loads the original destination IP and transport port in registers xxreg1 and reg2.
Re: [ovs-dev] [ovs-discuss] Commit 355fef6f2 seems to break connectivity in my setup
On 5/31/22 23:48, Ilya Maximets wrote: > On 5/31/22 21:15, Frode Nordahl wrote: >> On Mon, May 30, 2022 at 5:25 PM Frode Nordahl >> I've pushed the first part of the fix here: >> https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html > > Thanks! I saw that and I tend to think that it is correct. > I'll try to test it and apply in the next couple of days. > > One question about the test above: which entity actually adds > the ct_state to the packet or at which moment that happens? > I see it, but I'm not sure I fully understand that. Looks > like I'm missing smething obvious. I found what is going on there - kernel simply tracks everything if not told to do so, so ICMP packets create the ct entry and subsequent packets re-use it, so icmp replies have +trk set while entering OVS. Let's have some summary of the issues discovered here so far, including a few new issues: 1. ct states set externally are not tracked correctly by OVS. Fix: https://mail.openvswitch.org/pipermail/ovs-dev/2022-May/394450.html Status: LGTM, will apply soon. This fixes the problem originally reported by Liam, IIUC. Right? 2. Kernel ct() actions are trying to re-use the cached connection after the tuple changes. This ends up to be the OVN hairpin issue as reported here: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856 Proposed Fix: https://lore.kernel.org/netdev/20220606221140.488984-1-i.maxim...@ovn.org/T/#u Status: Needs review. 3. ct_clear is not a reversible action, because it required to pass the packet through conntrack again in order to restore the conntrack state. Fix: To move the CT_CLEAR to a group of irreversible actions in the reversible_actions() in ofproto/ofproto-dpif-xlate.c. Status: An easy fix that does nothing useful without fixes for later issues, because we clear the ct_state before the patch_port_output() processing and optimizing out the CT_CLEAR action. 4. (new!) reversible_actions() optimization is logically incorrect. The reason is that reversible_actions() doesn't look further than a list of actions of the first OF rule in the second bridge. If the list of action contains resubmit, there can still be irreversible actions and packet will be irreversibly modified by the patch port processing without cloning. Simple reproducer: diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index dbb3b6dda..be30ad5bf 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -8736,7 +8736,8 @@ OVS_VSWITCHD_START( AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br1 'meter=1 pktps stats bands=type=drop rate=2']) AT_CHECK([ovs-ofctl del-flows br0]) AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br0 in_port=local,ip,actions=2,1]) -AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 in_port=1,ip,actions=meter:1,3]) +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 "in_port=1,ip,actions=resubmit(,7)"]) +AT_CHECK([ovs-ofctl -O OpenFlow13 add-flow br1 table=7,in_port=1,ip,actions=meter:1,3]) AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(100),eth(src=f8:bc:12:44:34:b6,dst=f8:bc:12:46:58:e0),eth_type(0x0800),ipv4(src=10.1.1.22,dst=10.0.0.3,proto=6,tos=0,ttl=64,frag=no),tcp(src=53295,dst=8080)'], [0], [stdout]) AT_CHECK([tail -1 stdout], [0], --- Status: unclear. One idea is to reverse the logic and check datapath actions for being reversible when going up in recursion instead of trying to predict all the future actions while going down. Ideas are welcome. Hammerhead approach: Mark 'resubmit' as irreversible action. But that will effectively mean that we're always cloning on patch port output, which is not great, see the issue #8. 5. Packets after the ct() in a non-forked pipeline must be untracked. Status: unclear. Fix for the issue #2 will cover issues for already tracked packets, but if the packet is supposed to be untracked, but it is tracked, then we must emit the ct_clear action from the userspace. The most viable approach, I guess, is the previously proposed 'pending_ct_clear' fix. Alternative is to always emit ct_clear right after the ct() action and not loose our minds trying to track the pending action in all the weird cases. 6. Conntrack state must not be preserved while going through the patch port. State: unclear. The right solution would be to emit the ct_clear datapath action after cloning for the patch port. Few problems here: - current code in ofproto/ofproto-dpif-xlate.c will not actually allow us to do that, we need to fix the issue #4 first to be able to inject new actions post factum. - ct_clear is non-reversible (issue #3) meaning if we're adding ct-clear, we have to clone even if the patch port processing doesn't have any ct() actions or recirculations. Excessive ct_clear's and clones can be avoided by fixing the issue #4 and
[ovs-dev] [PATCH net] net: openvswitch: fix misuse of the cached connection on tuple changes
If packet headers changed, the cached nfct is no longer relevant for the packet and attempt to re-use it leads to the incorrect packet classification. This issue is causing broken connectivity in OpenStack deployments with OVS/OVN due to hairpin traffic being unexpectedly dropped. The setup has datapath flows with several conntrack actions and tuple changes between them: actions:ct(commit,zone=8,mark=0/0x1,nat(src)), set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)), set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)), ct(zone=8),recirc(0x4) After the first ct() action the packet headers are almost fully re-written. The next ct() tries to re-use the existing nfct entry and marks the packet as invalid, so it gets dropped later in the pipeline. Clearing the cached conntrack entry whenever packet tuple is changed to avoid the issue. The flow key should not be cleared though, because we should still be able to match on the ct_state if the recirculation happens after the tuple change but before the next ct() action. Cc: sta...@vger.kernel.org Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Reported-by: Frode Nordahl Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856 Signed-off-by: Ilya Maximets --- The function ovs_ct_clear() looks a bit differently on older branches, but the change should be exactly the same, i.e. move the ovs_ct_fill_key() under the 'if (key)'. The same behavior for userspace datapath was introduced along with the conntrack caching support here: https://github.com/openvswitch/ovs/commit/594570ea1cdecc7ef7880d707cbc7a4a4ecef09f Interestingly, above commit also introduced the system test that can check the issue for the kernel as well, but the test sends only one packet and this packet goes via upcall to userspace and back to the kernel effectively clearing the cached connection along the way and avoiding the issue. If the test is modified to send more than a few packets [1], it starts to fail without the kernel fix: make check-kernel TESTSUITEFLAGS='-k negative' 142: conntrack - negative test for recirculation optimization FAILED [1] https://pastebin.com/H1YMqaLa net/openvswitch/actions.c | 6 ++ net/openvswitch/conntrack.c | 4 +++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index 1b5d73079dc9..868db4669a29 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -373,6 +373,7 @@ static void set_ip_addr(struct sk_buff *skb, struct iphdr *nh, update_ip_l4_checksum(skb, nh, *addr, new_addr); csum_replace4(>check, *addr, new_addr); skb_clear_hash(skb); + ovs_ct_clear(skb, NULL); *addr = new_addr; } @@ -420,6 +421,7 @@ static void set_ipv6_addr(struct sk_buff *skb, u8 l4_proto, update_ipv6_checksum(skb, l4_proto, addr, new_addr); skb_clear_hash(skb); + ovs_ct_clear(skb, NULL); memcpy(addr, new_addr, sizeof(__be32[4])); } @@ -660,6 +662,7 @@ static int set_nsh(struct sk_buff *skb, struct sw_flow_key *flow_key, static void set_tp_port(struct sk_buff *skb, __be16 *port, __be16 new_port, __sum16 *check) { + ovs_ct_clear(skb, NULL); inet_proto_csum_replace2(check, skb, *port, new_port, false); *port = new_port; } @@ -699,6 +702,7 @@ static int set_udp(struct sk_buff *skb, struct sw_flow_key *flow_key, uh->dest = dst; flow_key->tp.src = src; flow_key->tp.dst = dst; + ovs_ct_clear(skb, NULL); } skb_clear_hash(skb); @@ -761,6 +765,8 @@ static int set_sctp(struct sk_buff *skb, struct sw_flow_key *flow_key, sh->checksum = old_csum ^ old_correct_csum ^ new_csum; skb_clear_hash(skb); + ovs_ct_clear(skb, NULL); + flow_key->tp.src = sh->source; flow_key->tp.dst = sh->dest; diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c index 4a947c13c813..4e70df91d0f2 100644 --- a/net/openvswitch/conntrack.c +++ b/net/openvswitch/conntrack.c @@ -1342,7 +1342,9 @@ int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key) nf_ct_put(ct); nf_ct_set(skb, NULL, IP_CT_UNTRACKED); - ovs_ct_fill_key(skb, key, false); + + if (key) + ovs_ct_fill_key(skb, key, false); return 0; } -- 2.34.3 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v5 2/2] handlers: Fix handlers mapping
The handler and CPU mapping in upcalls are incorrect, and this is specially noticeable systems with cpu isolation enabled. Say we have a 12 core system where only every even number CPU is enabled C0, C2, C4, C6, C8, C10 This means we will create an array of size 6 that will be sent to kernel that is populated with sockets [S0, S1, S2, S3, S4, S5] The problem is when the kernel does an upcall it checks the socket array via the index of the CPU, effectively adding additional load on some CPUs while leaving no work on other CPUs. e.g. C0 indexes to S0 C2 indexes to S2 (should be S1) C4 indexes to S4 (should be S2) Modulo of 6 (size of socket array) is applied, so we wrap back to S0 C6 indexes to S0 (should be S3) C8 indexes to S2 (should be S4) C10 indexes to S4 (should be S5) Effectively sockets S0, S2, S4 get overloaded while sockets S1, S3, S5 get no work assigned to them This leads to the kernel to throw the following message: "openvswitch: cpu_id mismatch with handler threads" Instead we will send the kernel a corrected array of sockets the size of all CPUs in the system. In the above example we would create a corrected array in a round-robin(assuming prime bias) fashion as follows: [S0, S1, S2, S3, S4, S5, S6, S0, S1, S2, S3, S4] Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.") Co-authored-by: Aaron Conole signed-off-by: Aaron Conole Signed-off-by: Michael Santana --- lib/dpif-netlink.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index e948a829b..dad4f684c 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -804,11 +804,11 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const uint32_t *upcall_pids, struct dpif_netlink *dpif = dpif_netlink_cast(dpif_); struct dpif_netlink_dp request, reply; struct ofpbuf *bufp; -int error; -int n_cores; -n_cores = count_cpu_cores(); -ovs_assert(n_cores == n_upcall_pids); +uint32_t *corrected; +int error, i, n_cores; + +n_cores = count_total_cores(); VLOG_DBG("Dispatch mode(per-cpu): Number of CPUs is %d", n_cores); dpif_netlink_dp_init(); @@ -818,7 +818,12 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const uint32_t *upcall_pids, request.user_features = dpif->user_features | OVS_DP_F_DISPATCH_UPCALL_PER_CPU; -request.upcall_pids = upcall_pids; +corrected = xcalloc(n_cores, sizeof(uint32_t)); + +for (i = 0; i < n_cores; i++) { +corrected[i] = upcall_pids[i % n_upcall_pids]; +} +request.upcall_pids = corrected; request.n_upcall_pids = n_cores; error = dpif_netlink_dp_transact(, , ); @@ -826,9 +831,10 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const uint32_t *upcall_pids, dpif->user_features = reply.user_features; ofpbuf_delete(bufp); if (!dpif_netlink_upcall_per_cpu(dpif)) { -return -EOPNOTSUPP; +error = -EOPNOTSUPP; } } +free(corrected); return error; } -- 2.33.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v5 1/2] handlers: Create additional handler threads when using CPU isolation
Additional threads are required to service upcalls when we have CPU isolation (in per-cpu dispatch mode). The reason additional threads are required is because it creates a more fair distribution. With more threads we decrease the load of each thread as more threads would decrease the number of cores each threads is assigned. Adding as many threads are there are cores could have a performance hit when the number of active cores (which all threads have to share) is very low. For this reason we avoid creating as many threads as there are cores and instead meet somewhere in the middle. The formula used to calculate the number of handler threads to create is as follows: handlers_n = min(next_prime(active_cores+1), total_cores) Where next_prime is a function that returns the next numeric prime number that is strictly greater than active_cores Assume default behavior when total_cores <= 2, that is do not create additional threads when we have less than 2 total cores on the system Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.") Signed-off-by: Michael Santana --- lib/dpif-netlink.c | 86 -- lib/ovs-thread.c | 16 + lib/ovs-thread.h | 1 + 3 files changed, 101 insertions(+), 2 deletions(-) diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index 06e1e8ca0..e948a829b 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -31,6 +31,7 @@ #include #include #include +#include #include "bitmap.h" #include "dpif-netlink-rtnl.h" @@ -2506,6 +2507,87 @@ dpif_netlink_handler_uninit(struct dpif_handler *handler) } #endif +/* + * Returns 1 if num is a prime number, + * Otherwise return 0 + */ +static uint32_t +is_prime(uint32_t num) +{ +if ((num < 2) || ((num % 2) == 0)) { +return num == 2; +} + +uint32_t i; +uint32_t limit = sqrt((float) num); +for (i = 3; i <= limit; i += 2) { +if (num % i == 0) { +return 0; +} +} + +return 1; +} + +/* + * Returns num if num is a prime number. Otherwise returns the next + * prime greater than num. Search is limited by UINT32_MAX. + * + * Returns 0 if no prime has been found between num and UINT32_MAX + */ +static uint32_t +next_prime(uint32_t num) +{ +if (num < 2) { +return 2; +} +if (num == 2) { +return 3; +} +if (num % 2 == 0) { +num++; +} + +uint32_t i; +for (i = num; i < UINT32_MAX; i += 2) { +if (is_prime(i)) { +return i; +} +} + +return 0; +} + +/* + * Calcuales and returns the number of handler threads needed based + * the following formula: + * + * handlers_n = min(next_prime(active_cores+1), total_cores) + * + * Where next_prime is a function that returns the next numeric prime + * number that is strictly greater than active_cores + */ +static uint32_t +dpif_netlink_calculate_handlers_num(void) +{ +uint32_t next_prime_num; +uint32_t n_handlers = count_cpu_cores(); +uint32_t total_cores = count_total_cores(); + +/* + * If we have isolated cores, add additional handler threads to + * service inactive cores in the unlikely event that traffic goes + * through inactive cores + */ +if (n_handlers < total_cores && total_cores > 2) { +next_prime_num = next_prime(n_handlers + 1); +n_handlers = next_prime_num >= total_cores ? +total_cores : next_prime_num; +} + +return n_handlers; +} + static int dpif_netlink_refresh_handlers_cpu_dispatch(struct dpif_netlink *dpif) OVS_REQ_WRLOCK(dpif->upcall_lock) @@ -2515,7 +2597,7 @@ dpif_netlink_refresh_handlers_cpu_dispatch(struct dpif_netlink *dpif) uint32_t n_handlers; uint32_t *upcall_pids; -n_handlers = count_cpu_cores(); +n_handlers = dpif_netlink_calculate_handlers_num(); if (dpif->n_handlers != n_handlers) { VLOG_DBG("Dispatch mode(per-cpu): initializing %d handlers", n_handlers); @@ -2755,7 +2837,7 @@ dpif_netlink_number_handlers_required(struct dpif *dpif_, uint32_t *n_handlers) struct dpif_netlink *dpif = dpif_netlink_cast(dpif_); if (dpif_netlink_upcall_per_cpu(dpif)) { -*n_handlers = count_cpu_cores(); +*n_handlers = dpif_netlink_calculate_handlers_num(); return true; } diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c index 805cba622..2172b3d3f 100644 --- a/lib/ovs-thread.c +++ b/lib/ovs-thread.c @@ -663,6 +663,22 @@ count_cpu_cores(void) return n_cores > 0 ? n_cores : 0; } +/* Returns the total number of cores on the system, or 0 if the + * number cannot be determined. */ +int +count_total_cores(void) { +long int n_cores; + +#ifndef _WIN32 +n_cores = sysconf(_SC_NPROCESSORS_CONF); +#else +n_cores = 0; +errno = ENOTSUP; +#endif + +return n_cores > 0 ? n_cores : 0; +} + /* Returns 'true' if current thread is PMD thread. */ bool
Re: [ovs-dev] [PATCH v2 ovn] northd: add the capability to inherit logical routers lbs on logical switches
On Wed, Jun 1, 2022 at 3:43 PM Lorenzo Bianconi wrote: > > Add the capability to automatically deploy a load-balancer on each > logical-switch connected to a logical router where the load-balancer > has been installed by the CMS. This patch allow to overcome the > distributed gw router scenario limitation where a load-balancer must be > installed on each datapath to properly reach the load-balancer. > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2043543 > Signed-off-by: Lorenzo Bianconi Hi Lorenzo, Can you please respin another version as it needs a rebase. Also with this version of the patch, the CI run failed. Probably due to rebase issues and hence can't see the CI test results. Numan > --- > Changes since v1: > - rebase on top of ovn master > - add NEWS entry > - improve selftests > --- > NEWS| 5 +++ > northd/northd.c | 56 + > northd/ovn-northd.8.xml | 8 + > tests/ovn-northd.at | 80 + > 4 files changed, 149 insertions(+) > > diff --git a/NEWS b/NEWS > index 21782cc66..c97807fed 100644 > --- a/NEWS > +++ b/NEWS > @@ -2,6 +2,11 @@ Post v22.06.0 > - >- ovn-controller: Add configuration knob, through OVS external-id > "ovn-encap-df_default" to enable or disable tunnel DF flag. > + - northd: introduce the capability to automatically deploy a load-balancer > +on each logical-switch connected to a logical router where the > +load-balancer has been installed by the CMS. In order to enable the > +feature the CMS has to set install_ls_lb_from_router to true in option > +column of NB_Global table. > > OVN v22.06.0 - XX XXX > -- > diff --git a/northd/northd.c b/northd/northd.c > index 51dec36b3..4eed228b1 100644 > --- a/northd/northd.c > +++ b/northd/northd.c > @@ -63,6 +63,8 @@ static bool lflow_hash_lock_initialized = false; > > static bool check_lsp_is_up; > > +static bool install_ls_lb_from_router; > + > /* MAC allocated for service monitor usage. Just one mac is allocated > * for this purpose and ovn-controller's on each chassis will make use > * of this mac when sending out the packets to monitor the services > @@ -4088,6 +4090,55 @@ build_lrouter_lbs_reachable_ips(struct hmap > *datapaths, struct hmap *lbs) > } > } > > +static void > +build_lswitch_lbs_from_lrouter(struct hmap *datapaths, struct hmap *lbs) > +{ > +if (!install_ls_lb_from_router) { > +return; > +} > + > +struct ovn_datapath *od; > +HMAP_FOR_EACH (od, key_node, datapaths) { > +if (!od->nbs) { > +continue; > +} > + > +struct ovn_port *op; > +LIST_FOR_EACH (op, dp_node, >port_list) { > +if (!lsp_is_router(op->nbsp)) { > +continue; > +} > +if (!op->peer) { > +continue; > +} > + > +struct ovn_datapath *peer_od = op->peer->od; > +for (size_t i = 0; i < peer_od->nbr->n_load_balancer; i++) { > +bool installed = false; > +const struct uuid *lb_uuid = > +_od->nbr->load_balancer[i]->header_.uuid; > +struct ovn_northd_lb *lb = ovn_northd_lb_find(lbs, lb_uuid); > +if (!lb) { > +continue; > +} > + > +for (size_t j = 0; j < lb->n_nb_ls; j++) { > + if (lb->nb_ls[j] == od) { > + installed = true; > + break; > + } > +} > +if (!installed) { > +ovn_northd_lb_add_ls(lb, od); > +} > +if (lb->nlb) { > +od->has_lb_vip |= lb_has_vip(lb->nlb); > +} > +} > +} > +} > +} > + > /* This must be called after all ports have been processed, i.e., after > * build_ports() because the reachability check requires the router ports > * networks to have been parsed. > @@ -4100,6 +4151,7 @@ build_lb_port_related_data(struct hmap *datapaths, > struct hmap *ports, > build_lrouter_lbs_check(datapaths); > build_lrouter_lbs_reachable_ips(datapaths, lbs); > build_lb_svcs(input_data, ovnsb_txn, ports, lbs); > +build_lswitch_lbs_from_lrouter(datapaths, lbs); > } > > /* Syncs relevant load balancers (applied to logical switches) to the > @@ -15262,6 +15314,10 @@ ovnnb_db_run(struct northd_input *input_data, > "ignore_lsp_down", true); > default_acl_drop = smap_get_bool(>options, "default_acl_drop", > false); > > +install_ls_lb_from_router = smap_get_bool(>options, > + "install_ls_lb_from_router", > + false); > + > build_datapaths(input_data, ovnsb_txn, >datapaths, >lr_list); >
Re: [ovs-dev] [PATCH ovn v2 0/2] Add option to enable MAC learning on localnet port
On Mon, Jun 6, 2022 at 6:47 AM Ales Musil wrote: > > Add new option for LSP called 'localnet_learn_fdb', > which will allow localnet LSP to add entries to FDB > when set to true. The default is false. > > --- > v2: Address comments and rebase on main Thanks for the v2. I applied both the patches to the main branch with the below minor changes in patch 2. --- diff --git a/tests/ovn.at b/tests/ovn.at index c0ade387d3..59d51f3e03 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -31973,6 +31973,7 @@ check test "$snat_zone" = "$SNAT_ZONE_REG" OVN_CLEANUP([hv1]) AT_CLEANUP +OVN_FOR_EACH_NORTHD([ AT_SETUP([OVN FDB (MAC learning) - Localnet]) ovn_start @@ -32020,18 +32021,21 @@ AT_CHECK([ovn-nbctl --wait=hv sync]) # Learning is disabled, the table should be empty send_packet 20 +AT_CHECK([ovn-nbctl --wait=hv sync]) AT_CHECK([test $(ovn-sbctl list fdb | grep -c "00:00:00:00:10:20") = 0]) # Enable learning on localnet port AT_CHECK([ovn-nbctl set logical_switch_port ln_port options:localnet_learn_fdb=true]) AT_CHECK([ovn-nbctl --wait=hv sync]) send_packet 20 +AT_CHECK([ovn-nbctl --wait=hv sync]) AT_CHECK([test $(ovn-sbctl list fdb | grep -c "00:00:00:00:10:20") = 1]) # Disable learning on localnet port AT_CHECK([ovn-nbctl set logical_switch_port ln_port options:localnet_learn_fdb=false]) AT_CHECK([ovn-nbctl --wait=hv sync]) send_packet 30 +AT_CHECK([ovn-nbctl --wait=hv sync]) AT_CHECK([test $(ovn-sbctl list fdb | grep -c "00:00:00:00:10:30") = 0]) OVN_CLEANUP([hv1]) --- I added ovn-nbctl --wait=hv sync before the checks to eliminate any test flakes due to timing issues. Numan > > Ales Musil (2): > northd.c: Add lsp_is_localnet helper method > northd.c: Add option to enable MAC learning on localnet > > NEWS| 2 ++ > northd/northd.c | 42 +++-- > ovn-nb.xml | 7 + > ovn-sb.xml | 1 + > tests/ovn-northd.at | 36 + > tests/ovn.at| 65 + > 6 files changed, 139 insertions(+), 14 deletions(-) > > -- > 2.35.3 > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn v4 1/1] IPsec: Add option to force NAT-T encapsulation
Provide options to enforce NAT-T UDP encapsulation. Options are encapsulation=true for libreswan and forceencaps=true for strongswan. This may be required in environments where firewalls drop ESP traffic but where NAT-T detection fails because packets are not subject to NAT. Signed-off-by: Andreas Karis Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2041681 --- Documentation/tutorials/ovn-ipsec.rst | 24 NEWS | 4 controller/encaps.c | 15 +++ tests/ovn-ipsec.at| 3 +++ 4 files changed, 46 insertions(+) diff --git a/Documentation/tutorials/ovn-ipsec.rst b/Documentation/tutorials/ovn-ipsec.rst index 305dd566d..aea7aa309 100644 --- a/Documentation/tutorials/ovn-ipsec.rst +++ b/Documentation/tutorials/ovn-ipsec.rst @@ -93,6 +93,29 @@ database to false:: # systemctl enable firewalld # firewall-cmd --permanent --add-service ipsec +Enforcing IPsec NAT-T UDP encapsulation +--- + +In specific situations, it may be required to enforce NAT-T (RFC3948) UDP +encapsulation unconditionally and to bypass the normal NAT detection mechanism. +For example, this may be required in environments where firewalls drop ESP +traffic, but where NAT-T detection (RFC3947) fails because packets otherwise +are not subject to NAT. +In such scenarios, UDP encapsulation can be enforced with the following. + +For libreswan backends:: + +$ ovn-nbctl set nb_global . options:ipsec_encapsulation=true + +For strongswan backends:: + +$ ovn-nbctl set nb_global . options:ipsec_forceencaps=true + +.. note:: + + Support for this feature is only availably when OVN is used together with + OVS releases that accept IPsec custom tunnel options. + Troubleshooting --- @@ -119,6 +142,7 @@ For example:: Remote name:host_2 CA cert:/path/to/cacert.pem PSK:None + Custom Options: {'encapsulation': 'yes'} < Whether NAT-T is enforced Ofport: 2 <--- Whether ovs-vswitchd has assigned Ofport number to this Tunnel Port CFM state: Disabled <--- Whether CFM declared this tunnel healthy diff --git a/NEWS b/NEWS index 21782cc66..3f474bfcb 100644 --- a/NEWS +++ b/NEWS @@ -2,6 +2,10 @@ Post v22.06.0 - - ovn-controller: Add configuration knob, through OVS external-id "ovn-encap-df_default" to enable or disable tunnel DF flag. + - Added nb_global IPsec options ipsec_encapsulation=true (libreswan) +and ipsec_forceencaps=true (strongswan) to unconditionally enforce +NAT-T UDP encapsulation. Requires OVS support for IPsec custom tunnel +options. OVN v22.06.0 - XX XXX -- diff --git a/controller/encaps.c b/controller/encaps.c index a06aa258c..9647ba507 100644 --- a/controller/encaps.c +++ b/controller/encaps.c @@ -207,6 +207,21 @@ tunnel_add(struct tunnel_ctx *tc, const struct sbrec_sb_global *sbg, if (sbg->ipsec) { set_local_ip = true; smap_add(, "remote_name", new_chassis_id); + +/* Force NAT-T traversal via configuration */ +/* Two ipsec backends are supported: libreswan and strongswan */ +/* libreswan param: encapsulation; strongswan param: forceencaps */ +bool encapsulation; +bool forceencaps; +encapsulation = smap_get_bool(>options, "ipsec_encapsulation", + false); +forceencaps = smap_get_bool(>options, "ipsec_forceencaps", false); +if (encapsulation) { +smap_add(, "ipsec_encapsulation", "yes"); +} +if (forceencaps) { +smap_add(, "ipsec_forceencaps", "yes"); +} } if (set_local_ip) { diff --git a/tests/ovn-ipsec.at b/tests/ovn-ipsec.at index 4c600a9f2..10ef97878 100644 --- a/tests/ovn-ipsec.at +++ b/tests/ovn-ipsec.at @@ -44,15 +44,18 @@ ovs-vsctl \ # Enable IPsec ovn-nbctl set nb_global . ipsec=true +ovn-nbctl set nb_global . options:ipsec_encapsulation=true check ovn-nbctl --wait=hv sync AT_CHECK([as hv2 ovs-vsctl get Interface ovn-hv1-0 options:remote_ip | tr -d '"\n'], [0], [192.168.0.1]) AT_CHECK([as hv2 ovs-vsctl get Interface ovn-hv1-0 options:local_ip | tr -d '"\n'], [0], [192.168.0.2]) AT_CHECK([as hv2 ovs-vsctl get Interface ovn-hv1-0 options:remote_name | tr -d '\n'], [0], [hv1]) +AT_CHECK([as hv2 ovs-vsctl get Interface ovn-hv1-0 options:ipsec_encapsulation | tr -d '\n'], [0], [yes]) AT_CHECK([as hv1 ovs-vsctl get Interface ovn-hv2-0 options:remote_ip | tr -d '"\n'], [0], [192.168.0.2]) AT_CHECK([as hv1 ovs-vsctl get Interface ovn-hv2-0 options:local_ip | tr -d '"\n'], [0], [192.168.0.1]) AT_CHECK([as hv1 ovs-vsctl get Interface ovn-hv2-0 options:remote_name | tr -d '\n'], [0], [hv2]) +AT_CHECK([as hv1 ovs-vsctl get Interface ovn-hv2-0 options:ipsec_encapsulation | tr -d
Re: [ovs-dev] [PATCH v2 ovn 1/2] Handle re-used pids in pidfile_is_running
On 6/2/22 16:34, Terry Wilson wrote: > Since pids can be re-used, it is necessary to check that the > process that is running with a pid matches the one that we expect. > > This adds the ability to optionally pass a 'binary' argument to > pidfile_is_running, and if it is passed to match the binary against > /proc/$pid/exe. > > Signed-off-by: Terry Wilson > --- > utilities/ovn-ctl | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl > index d733aa42d..41fa89770 100755 > --- a/utilities/ovn-ctl > +++ b/utilities/ovn-ctl > @@ -40,9 +40,16 @@ ovn_ic_db_conf_file="$ovn_etcdir/ovn-ic-db-params.conf" > ## start ## > ## - ## > > +pid_exe_matches () { > +pid=$1 > +binary=$2 > +[ -z "$binary" -o `readlink /proc/$pid/exe` = "$binary" ] Hi, Terry. Good catch! Though, I think, it will be better to just check the 'comm' as OVS does instead of comparing the full path to the binary. This will protect us from running the same daemon from different locations in a general case, i.e. during development or an upgrade if the binary suddenly moved. 'readlink' also seems to not be very reliable, because it will add '(deleted)' to the end of the result if the binary was deleted or replaced. That also might be a problem during updates where the binary is updated on the disk but the process is still running. Best regards, Ilya Maximets. > +} > + > pidfile_is_running () { > pidfile=$1 > -test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` && > pid_exists "$pid" > +binary=$2 > +test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` && > pid_exists "$pid" && pid_exe_matches "$pid" "$binary" > } >/dev/null 2>&1 > > stop_nb_ovsdb() { ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 1/1] ovsdb: Fix memory leak on error path in ovsdb_file_read__().
Found by Coverity. Fixes: 1b1d2e6daa56 ("ovsdb: Introduce experimental support for clustered databases.") Signed-off-by: Yunjian Wang --- ovsdb/file.c | 1 + 1 file changed, 1 insertion(+) diff --git a/ovsdb/file.c b/ovsdb/file.c index 9f44007d9..ca80c2823 100644 --- a/ovsdb/file.c +++ b/ovsdb/file.c @@ -524,6 +524,7 @@ ovsdb_file_read__(const char *filename, bool rw, error = ovsdb_txn_replay_commit(txn); if (error) { +ovsdb_error_destroy(error); ovsdb_storage_unread(storage); break; } -- 2.27.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn 2/2] northd.c: Add option to enable MAC learning on localnet
On Fri, Jun 3, 2022 at 11:28 PM Numan Siddique wrote: > On Wed, Jun 1, 2022 at 8:28 AM Ales Musil wrote: > > > > The localnet is excluded from MAC learning for scale > > reason. However there might be a valid workflow > > when yo uwant to enable the learning and benefit > > for that for HW offload. Add option called > > 'localnet_learn_fdb' to LSP, which will enable/disable > > the learning. Setting it as disabled by default. > > > > Reported-at: https://bugzilla.redhat.com/2070529 > > Signed-off-by: Ales Musil > > Thanks for adding the support. > > A few minor comments below. > > Numan > > > --- > > northd/northd.c | 10 -- > > ovn-nb.xml | 7 +++ > > ovn-sb.xml | 1 + > > tests/ovn-northd.at | 37 + > > 4 files changed, 53 insertions(+), 2 deletions(-) > > > > diff --git a/northd/northd.c b/northd/northd.c > > index 16ea7a6aa..b3077714e 100644 > > --- a/northd/northd.c > > +++ b/northd/northd.c > > @@ -5408,8 +5408,14 @@ build_lswitch_learn_fdb_op( > > struct ovn_port *op, struct hmap *lflows, > > struct ds *actions, struct ds *match) > > { > > -if (op->nbsp && !op->n_ps_addrs && !strcmp(op->nbsp->type, "") && > > -op->has_unknown) { > > +if (!op->nbsp) { > > +return; > > +} > > + > > +bool localnet_learn_fdb = smap_get_bool(>nbsp->options, > > +"localnet_learn_fdb", > false); > > I'd suggest to add a helper function for the above i.e to check if > learn fdb is enabled or not for localnet ports > and modify the below 'if' as > > if (!op->n_ps_addrs && op->has_unknown && (!strcmp(op->nbsp->type, "") || > (lsp_is_localnet(op->nbsp) && > localnet_lsp_has_learn_(op->nbsp->options))) > { > /// > } > > This would avoid unnecessary smap_get_bool() call for other lsp types. > > I think it would be good if you can add a new test in ovn.at or > enhance existing OVN FDB (MAC learning) test cases in ovn.at > to cover this scenario. > > You can create a vif in the localnet bridge and inject some packets > from there so that the packet enters the br-int via the patch port > and the test can check if the mac learning is working as expected or not. > > Thanks > Numan > > > +if (!op->n_ps_addrs && op->has_unknown && (!strcmp(op->nbsp->type, > "") || > > +(localnet_learn_fdb && lsp_is_localnet(op->nbsp { > > ds_clear(match); > > ds_clear(actions); > > ds_put_format(match, "inport == %s", op->json_key); > > diff --git a/ovn-nb.xml b/ovn-nb.xml > > index 3e3e142b3..9df6b1aab 100644 > > --- a/ovn-nb.xml > > +++ b/ovn-nb.xml > > @@ -976,6 +976,13 @@ > >headers. Supported values: 802.11q (default), 802.11ad. > > > > > > + > +type='{"type": "boolean"}'> > > + Optional. Allows localnet port to learn MACs and store them > in FDB > > + table if set to true. The default value is > > + false. > > + > > + > > > > > > > > diff --git a/ovn-sb.xml b/ovn-sb.xml > > index 4c35dda36..3d92ba88f 100644 > > --- a/ovn-sb.xml > > +++ b/ovn-sb.xml > > @@ -4602,6 +4602,7 @@ tcp.flags = RST; > > > > > >This table is primarily used to learn the MACs observed on a VIF > > + (or a localnet port with 'localnet_learn_fdb' enabled) > >which belongs to a Logical_Switch_Port record in > >OVN_Northbound whose port security is disabled > >and 'unknown' address set. If port security is disabled on a > > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at > > index 5bd0935e7..89eeb0894 100644 > > --- a/tests/ovn-northd.at > > +++ b/tests/ovn-northd.at > > @@ -7418,3 +7418,40 @@ AT_CHECK([cat sw0flows | grep -e port_sec | sort > | sed 's/table=./table=?/' ], [ > > > > AT_CLEANUP > > ]) > > + > > +OVN_FOR_EACH_NORTHD([ > > +AT_SETUP([Localnet MAC learning option]) > > +ovn_start > > + > > +AT_CHECK([ovn-nbctl ls-add ls0]) > > + > > +AT_CHECK([ovn-nbctl lsp-add ls0 ln_port]) > > +AT_CHECK([ovn-nbctl lsp-set-addresses ln_port unknown]) > > +AT_CHECK([ovn-nbctl lsp-set-type ln_port localnet]) > > +AT_CHECK([ovn-nbctl lsp-set-options ln_port network_name=phys]) > > +AT_CHECK([ovn-nbctl --wait=sb sync]) > > + > > +# Check MAC learning flows with 'localnet_learn_fdb' default (false) > > +AT_CHECK([ovn-sbctl dump-flows ls0 | grep -e > 'ls_in_\(put\|lookup\)_fdb' | sort | sed 's/table=./table=?/'], [0], [dnl > > + table=? (ls_in_lookup_fdb ), priority=0, match=(1), > action=(next;) > > + table=? (ls_in_put_fdb ), priority=0, match=(1), > action=(next;) > > +]) > > + > > +# Enable 'localnet_learn_fdb' and check the flows > > +AT_CHECK([ovn-nbctl --wait=sb lsp-set-options ln_port > localnet_learn_fdb=true]) > > +AT_CHECK([ovn-sbctl dump-flows ls0 | grep -e > 'ls_in_\(put\|lookup\)_fdb' | sort | sed 's/table=./table=?/'], [0], [dnl > > + table=? (ls_in_lookup_fdb ), priority=0
[ovs-dev] [PATCH ovn v2 2/2] northd.c: Add option to enable MAC learning on localnet
The localnet is excluded from MAC learning for scale reason. However there might be a valid workflow when yo uwant to enable the learning and benefit for that for HW offload. Add option called 'localnet_learn_fdb' to LSP, which will enable/disable the learning. Setting it as disabled by default. Reported-at: https://bugzilla.redhat.com/2070529 Signed-off-by: Ales Musil --- v2: Add helper method for getting the option, add test case in ovn.at, rebase on main --- NEWS| 2 ++ northd/northd.c | 14 -- ovn-nb.xml | 7 + ovn-sb.xml | 1 + tests/ovn-northd.at | 36 + tests/ovn.at| 65 + 6 files changed, 123 insertions(+), 2 deletions(-) diff --git a/NEWS b/NEWS index 2ee283a56..e015ae8e7 100644 --- a/NEWS +++ b/NEWS @@ -2,6 +2,8 @@ Post v22.06.0 - - ovn-controller: Add configuration knob, through OVS external-id "ovn-encap-df_default" to enable or disable tunnel DF flag. + - Add option "localnet_learn_fdb" to LSP that will allow localnet +ports to learn MAC addresses and store them in FDB table. OVN v22.06.0 - XX XXX -- diff --git a/northd/northd.c b/northd/northd.c index b78adee11..0207f6ce1 100644 --- a/northd/northd.c +++ b/northd/northd.c @@ -1845,6 +1845,12 @@ lsp_is_localnet(const struct nbrec_logical_switch_port *nbsp) return !strcmp(nbsp->type, "localnet"); } +static bool +localnet_can_learn_mac(const struct nbrec_logical_switch_port *nbsp) +{ +return smap_get_bool(>options, "localnet_learn_fdb", false); +} + static bool lsp_is_type_changed(const struct sbrec_port_binding *sb, const struct nbrec_logical_switch_port *nbsp, @@ -5448,8 +5454,12 @@ build_lswitch_learn_fdb_op( struct ovn_port *op, struct hmap *lflows, struct ds *actions, struct ds *match) { -if (op->nbsp && !op->n_ps_addrs && !strcmp(op->nbsp->type, "") && -op->has_unknown) { +if (!op->nbsp) { +return; +} + +if (!op->n_ps_addrs && op->has_unknown && (!strcmp(op->nbsp->type, "") || +(lsp_is_localnet(op->nbsp) && localnet_can_learn_mac(op->nbsp { ds_clear(match); ds_clear(actions); ds_put_format(match, "inport == %s", op->json_key); diff --git a/ovn-nb.xml b/ovn-nb.xml index c197f431f..618a48b97 100644 --- a/ovn-nb.xml +++ b/ovn-nb.xml @@ -976,6 +976,13 @@ headers. Supported values: 802.11q (default), 802.11ad. + + Optional. Allows localnet port to learn MACs and store them in FDB + table if set to true. The default value is + false. + + diff --git a/ovn-sb.xml b/ovn-sb.xml index 9f47a037e..898f3676a 100644 --- a/ovn-sb.xml +++ b/ovn-sb.xml @@ -4676,6 +4676,7 @@ tcp.flags = RST; This table is primarily used to learn the MACs observed on a VIF + (or a localnet port with 'localnet_learn_fdb' enabled) which belongs to a Logical_Switch_Port record in OVN_Northbound whose port security is disabled and 'unknown' address set. If port security is disabled on a diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at index a071b3689..a94a7d441 100644 --- a/tests/ovn-northd.at +++ b/tests/ovn-northd.at @@ -7577,3 +7577,39 @@ AT_CHECK([ovn-sbctl lflow-list | grep 'ls.*acl.*blocked' ], [0], [dnl AT_CLEANUP ]) + +AT_SETUP([Localnet MAC learning option]) +ovn_start + +AT_CHECK([ovn-nbctl ls-add ls0]) + +AT_CHECK([ovn-nbctl lsp-add ls0 ln_port]) +AT_CHECK([ovn-nbctl lsp-set-addresses ln_port unknown]) +AT_CHECK([ovn-nbctl lsp-set-type ln_port localnet]) +AT_CHECK([ovn-nbctl lsp-set-options ln_port network_name=phys]) +AT_CHECK([ovn-nbctl --wait=sb sync]) + +# Check MAC learning flows with 'localnet_learn_fdb' default (false) +AT_CHECK([ovn-sbctl dump-flows ls0 | grep -e 'ls_in_\(put\|lookup\)_fdb' | sort | sed 's/table=./table=?/'], [0], [dnl + table=? (ls_in_lookup_fdb ), priority=0, match=(1), action=(next;) + table=? (ls_in_put_fdb ), priority=0, match=(1), action=(next;) +]) + +# Enable 'localnet_learn_fdb' and check the flows +AT_CHECK([ovn-nbctl --wait=sb lsp-set-options ln_port localnet_learn_fdb=true]) +AT_CHECK([ovn-sbctl dump-flows ls0 | grep -e 'ls_in_\(put\|lookup\)_fdb' | sort | sed 's/table=./table=?/'], [0], [dnl + table=? (ls_in_lookup_fdb ), priority=0, match=(1), action=(next;) + table=? (ls_in_lookup_fdb ), priority=100 , match=(inport == "ln_port"), action=(reg0[[11]] = lookup_fdb(inport, eth.src); next;) + table=? (ls_in_put_fdb ), priority=0, match=(1), action=(next;) + table=? (ls_in_put_fdb ), priority=100 , match=(inport == "ln_port" && reg0[[11]] == 0), action=(put_fdb(inport, eth.src); next;) +]) + +# Disable 'localnet_learn_fdb' and check the flows +AT_CHECK([ovn-nbctl --wait=sb lsp-set-options ln_port localnet_learn_fdb=false])
[ovs-dev] [PATCH ovn v2 1/2] northd.c: Add lsp_is_localnet helper method
Add helper method for checking if lsp is localnet to reduce the verbosity and increase readability a bit. Reported-at: https://bugzilla.redhat.com/2070529 Signed-off-by: Ales Musil --- v2: Rebase on top of main --- northd/northd.c | 28 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/northd/northd.c b/northd/northd.c index f120c2366..b78adee11 100644 --- a/northd/northd.c +++ b/northd/northd.c @@ -1839,6 +1839,12 @@ lsp_is_remote(const struct nbrec_logical_switch_port *nbsp) return !strcmp(nbsp->type, "remote"); } +static bool +lsp_is_localnet(const struct nbrec_logical_switch_port *nbsp) +{ +return !strcmp(nbsp->type, "localnet"); +} + static bool lsp_is_type_changed(const struct sbrec_port_binding *sb, const struct nbrec_logical_switch_port *nbsp, @@ -2610,7 +2616,7 @@ join_logical_ports(struct northd_input *input_data, ovs_list_push_back(nb_only, >list); } -if (!strcmp(nbsp->type, "localnet")) { +if (lsp_is_localnet(nbsp)) { if (od->n_localnet_ports >= n_allocated_localnet_ports) { od->localnet_ports = x2nrealloc( od->localnet_ports, _allocated_localnet_ports, @@ -3444,7 +3450,7 @@ ovn_port_update_sbrec(struct northd_input *input_data, struct smap options; char *name = ""; -if (!strcmp(op->nbsp->type, "localnet")) { +if (lsp_is_localnet(op->nbsp)) { uuid = >sb->header_.uuid; name = "localnet"; } else if (op->sb->chassis) { @@ -5424,7 +5430,7 @@ build_lswitch_port_sec_op(struct ovn_port *op, struct hmap *lflows, ds_cstr(match), ds_cstr(actions), op->key, >nbsp->header_); -if (!strcmp(op->nbsp->type, "localnet")) { +if (lsp_is_localnet(op->nbsp)) { ds_clear(match); ds_clear(actions); ds_put_format(match, "outport == %s", op->json_key); @@ -7725,15 +7731,13 @@ build_lswitch_arp_nd_responder_skip_local(struct ovn_port *op, struct hmap *lflows, struct ds *match) { -if (op->nbsp) { -if (!strcmp(op->nbsp->type, "localnet")) { -ds_clear(match); -ds_put_format(match, "inport == %s", op->json_key); -ovn_lflow_add_with_lport_and_hint(lflows, op->od, - S_SWITCH_IN_ARP_ND_RSP, 100, - ds_cstr(match), "next;", op->key, - >nbsp->header_); -} +if (op->nbsp && lsp_is_localnet(op->nbsp)) { +ds_clear(match); +ds_put_format(match, "inport == %s", op->json_key); +ovn_lflow_add_with_lport_and_hint(lflows, op->od, + S_SWITCH_IN_ARP_ND_RSP, 100, + ds_cstr(match), "next;", op->key, + >nbsp->header_); } } -- 2.35.3 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn v2 0/2] Add option to enable MAC learning on localnet port
Add new option for LSP called 'localnet_learn_fdb', which will allow localnet LSP to add entries to FDB when set to true. The default is false. --- v2: Address comments and rebase on main Ales Musil (2): northd.c: Add lsp_is_localnet helper method northd.c: Add option to enable MAC learning on localnet NEWS| 2 ++ northd/northd.c | 42 +++-- ovn-nb.xml | 7 + ovn-sb.xml | 1 + tests/ovn-northd.at | 36 + tests/ovn.at| 65 + 6 files changed, 139 insertions(+), 14 deletions(-) -- 2.35.3 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 1/1] packets: Re-calculate IPv6 checksum only for last frags upon modify.
In case of modifying an IPv6 packet src/dst address the checksum should be recalculated only for the last frag. Currently it's done for all frags, leading to incorrect reassembled packet checksum. Fix it by adding a new flag to recalculate the checksum only for last frag. Fixes: bc7a5acdff08 ("datapath: add ipv6 'set' action") Signed-off-by: Salem Sol Acked-by: Mike Pattrick --- lib/packets.c | 20 ++-- tests/system-traffic.at | 35 +++ 2 files changed, 53 insertions(+), 2 deletions(-) diff --git a/lib/packets.c b/lib/packets.c index d0fba8176..b7aef41a5 100644 --- a/lib/packets.c +++ b/lib/packets.c @@ -1148,6 +1148,18 @@ packet_set_ipv4_addr(struct dp_packet *packet, put_16aligned_be32(addr, new_addr); } +static bool +packet_is_last_ipv6_frag(struct dp_packet *packet) +{ +const struct ovs_16aligned_ip6_frag *frag_hdr; +uint8_t *data = dp_packet_l3(packet); + +data += sizeof (struct ovs_16aligned_ip6_hdr); +frag_hdr = ALIGNED_CAST(struct ovs_16aligned_ip6_frag *, data); +return (frag_hdr->ip6f_offlg & IP6F_OFF_MASK) && + !(frag_hdr->ip6f_offlg & IP6F_MORE_FRAG); +} + /* Returns true, if packet contains at least one routing header where * segements_left > 0. * @@ -1334,17 +1346,21 @@ packet_set_ipv6(struct dp_packet *packet, const struct in6_addr *src, { struct ovs_16aligned_ip6_hdr *nh = dp_packet_l3(packet); uint8_t proto = 0; +bool recalc_csum; bool rh_present; rh_present = packet_rh_present(packet, ); +recalc_csum = nh->ip6_nxt == IPPROTO_FRAGMENT ? +packet_is_last_ipv6_frag(packet) : true; if (memcmp(>ip6_src, src, sizeof(ovs_be32[4]))) { -packet_set_ipv6_addr(packet, proto, nh->ip6_src.be32, src, true); +packet_set_ipv6_addr(packet, proto, nh->ip6_src.be32, + src, recalc_csum); } if (memcmp(>ip6_dst, dst, sizeof(ovs_be32[4]))) { packet_set_ipv6_addr(packet, proto, nh->ip6_dst.be32, dst, - !rh_present); + !rh_present && recalc_csum); } packet_set_ipv6_tc(>ip6_flow, key_tc); diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 239105e89..16ba42d84 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -192,6 +192,41 @@ NS_CHECK_EXEC([at_ns0], [ping6 -s 3200 -q -c 3 -i 0.3 -w 2 fc00:1::2 | FORMAT_PI OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP +AT_SETUP([datapath - ping6 between two ports with header modify]) +OVS_TRAFFIC_VSWITCHD_START() + +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) + +ADD_NAMESPACES(at_ns0, at_ns1) + +ADD_VETH(p0, at_ns0, br0, "fc00::1/96", e4:11:22:33:44:55) +ADD_VETH(p1, at_ns1, br0, "fc00::2/96", e4:11:22:33:44:54) +NS_CHECK_EXEC([at_ns0], [ip -6 neigh add fc00::3 lladdr e4:11:22:33:44:54 dev p0]) +NS_CHECK_EXEC([at_ns0], [ip -6 neigh add fc00::2 lladdr e4:11:22:33:44:54 dev p0]) +NS_CHECK_EXEC([at_ns0], [ip -6 neigh add fc00::1 lladdr e4:11:22:33:44:55 dev p0]) + +dnl Linux seems to take a little time to get its IPv6 stack in order. Without +dnl waiting, we get occasional failures due to the following error: +dnl "connect: Cannot assign requested address" +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 1 fc00::2]) + +AT_CHECK([ovs-ofctl del-flows -OOpenFlow15 br0], [], [stdout], [stderr]) +AT_CHECK([ovs-ofctl add-flow -OOpenFlow15 br0 in_port=ovs-p0,ipv6,ipv6_dst=fc00::3,ipv6_src=fc00::1,actions=set_field:fc00::2-\>ipv6_dst,ovs-p1], [], [stdout], [stderr]) +AT_CHECK([ovs-ofctl add-flow -OOpenFlow15 br0 in_port=ovs-p1,ipv6,ipv6_dst=fc00::1,ipv6_src=fc00::2,actions=set_field:fc00::3-\>ipv6_src,ovs-p0], [], [stdout], [stderr]) + +NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 2 fc00::3 | FORMAT_PING], [0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) +NS_CHECK_EXEC([at_ns0], [ping6 -s 1600 -q -c 3 -i 0.3 -w 2 fc00::3 | FORMAT_PING], [0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) +NS_CHECK_EXEC([at_ns0], [ping6 -s 3200 -q -c 3 -i 0.3 -w 2 fc00::3 | FORMAT_PING], [0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +OVS_TRAFFIC_VSWITCHD_STOP +AT_CLEANUP + AT_SETUP([datapath - ping over bond]) OVS_TRAFFIC_VSWITCHD_START() -- 2.21.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v13 ovn] Implement RARP activation strategy for ports
On Fri, Jun 3, 2022 at 1:20 PM Ihar Hrachyshka wrote: > > When options:activation-strategy is set to "rarp" for LSP, when used in > combination with multiple chassis names listed in > options:requested-chassis, additional chassis will install special flows > that would block all ingress and egress traffic for the port until a > special activation event happens. > > For "rarp" strategy, an observation of a RARP packet sent from the port > on the additional chassis is such an event. When it occurs, a special > flow passes control to a controller() action handler that eventually > removes the installed blocking flows and also marks the port as > options:additional-chassis-activated in southbound db. > > This feature is useful in live migration scenarios where it's not > advisable to unlock the destination port location prematurily to avoid > duplicate packets originating from the port. > > Signed-off-by: Ihar Hrachyshka > --- > v13: use resubmit() action to reinject RARP into pipeline. > v13: lock / unlock pinctrl_mutex in functions invoked from main thread. > v13: db_is_port_activated->lport_is_activated_by_activation_strategy. Thanks Ihar for the revision. Sorry that I didn't follow up in time after v6. Please see some of my quick comments for this version regarding I-P: > --- > NEWS| 2 + > controller/lport.c | 22 +++ > controller/lport.h | 3 + > controller/ovn-controller.c | 87 + > controller/physical.c | 94 ++ > controller/pinctrl.c| 145 +- > controller/pinctrl.h| 13 ++ > include/ovn/actions.h | 3 + > northd/northd.c | 10 + > northd/ovn-northd.c | 5 +- > ovn-nb.xml | 11 ++ > ovn-sb.xml | 15 ++ > tests/ovn.at| 365 > 13 files changed, 772 insertions(+), 3 deletions(-) > > diff --git a/NEWS b/NEWS > index 2ee283a56..7c54670ed 100644 > --- a/NEWS > +++ b/NEWS > @@ -29,6 +29,8 @@ OVN v22.06.0 - XX XXX >- Added support for setting the Next server IP in the DHCP header > using the private DHCP option - 253 in native OVN DHCPv4 responder. >- Support list of chassis for Logical_Switch_Port:options:requested-chassis. > + - Support Logical_Switch_Port:options:activation-strategy for live migration > +scenarios. > > OVN v22.03.0 - 11 Mar 2022 > -- > diff --git a/controller/lport.c b/controller/lport.c > index bf55d83f2..add7e91aa 100644 > --- a/controller/lport.c > +++ b/controller/lport.c > @@ -197,3 +197,25 @@ get_peer_lport(const struct sbrec_port_binding *pb, > peer_name); > return (peer && peer->datapath) ? peer : NULL; > } > + > +bool > +lport_is_activated_by_activation_strategy(const struct sbrec_port_binding *pb, > + const struct sbrec_chassis *chassis) > +{ > +const char *activated_chassis = smap_get(>options, > + "additional-chassis-activated"); > +if (activated_chassis) { > +char *save_ptr; > +char *tokstr = xstrdup(activated_chassis); > +for (const char *chassis_name = strtok_r(tokstr, ",", _ptr); > + chassis_name != NULL; > + chassis_name = strtok_r(NULL, ",", _ptr)) { > +if (!strcmp(chassis_name, chassis->name)) { > +free(tokstr); > +return true; > +} > +} > +free(tokstr); > +} > +return false; > +} > diff --git a/controller/lport.h b/controller/lport.h > index 115881655..644c67255 100644 > --- a/controller/lport.h > +++ b/controller/lport.h > @@ -70,4 +70,7 @@ const struct sbrec_port_binding *lport_get_peer( > const struct sbrec_port_binding *lport_get_l3gw_peer( > const struct sbrec_port_binding *, > struct ovsdb_idl_index *sbrec_port_binding_by_name); > +bool > +lport_is_activated_by_activation_strategy(const struct sbrec_port_binding *pb, > + const struct sbrec_chassis *chassis); > #endif /* controller/lport.h */ > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index b597c0e37..a37dfcb78 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -1047,6 +1047,50 @@ en_ofctrl_is_connected_run(struct engine_node *node, void *data) > engine_set_node_state(node, EN_UNCHANGED); > } > > +struct ed_type_activated_ports { > +struct ovs_list *activated_ports; > +}; > + > +static void * > +en_activated_ports_init(struct engine_node *node OVS_UNUSED, > +struct engine_arg *arg OVS_UNUSED) > +{ > +struct ed_type_activated_ports *data = xzalloc(sizeof *data); > +data->activated_ports = NULL; > +return data; > +} > + > +static void > +en_activated_ports_cleanup(void *data_) > +{ > +struct ed_type_activated_ports *data = data_; > + > +struct activated_port *pp; > +
Re: [ovs-dev] [PATCH v4 4/5] system-offloads-traffic: Properly initialize offload before testing.
On 2022-06-03 11:58 AM, Eelco Chaudron wrote: This patch will properly initialize offload, as it requires the setting to be enabled before starting ovs-vswitchd (or do a restart once configured). Signed-off-by: Eelco Chaudron --- v4: - Use the existing dbinit-aux-args argument, rather than creating a new pre-vswitchd command option. v3: - No changes. v2: - Unified all the OVS_TRAFFIC_VSWITCHD_START macro's tests/ofproto-macros.at |3 ++- tests/system-kmod-macros.at |4 ++-- tests/system-offloads-traffic.at |9 +++-- tests/system-tso-macros.at |4 ++-- tests/system-userspace-macros.at |4 ++-- 5 files changed, 11 insertions(+), 13 deletions(-) diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at index b18f0fbc1..af956bdcb 100644 --- a/tests/ofproto-macros.at +++ b/tests/ofproto-macros.at @@ -153,7 +153,7 @@ m4_divert_pop([PREPARE_TESTS]) m4_define([TESTABLE_LOG], [-vPATTERN:ANY:'%c|%p|%m']) -# _OVS_VSWITCHD_START([vswitchd-aux-args]) +# _OVS_VSWITCHD_START([vswitchd-aux-args] [dbinit-aux-args]) # # Creates an empty database and starts ovsdb-server. # Starts ovs-vswitchd, with additional arguments 'vswitchd-aux-args'. @@ -189,6 +189,7 @@ m4_define([_OVS_VSWITCHD_START], /ofproto|INFO|datapath ID changed to fedcba9876543210/d /dpdk|INFO|DPDK Disabled - Use other_config:dpdk-init to enable/d /netlink_socket|INFO|netlink: could not enable listening to all nsid/d +/netdev_offload|INFO|netdev: Flow API Enabled/d /probe tc:/d /setting extended ack support failed/d /tc: Using policy/d']]) diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at index 86d633ac4..b8a6b67c8 100644 --- a/tests/system-kmod-macros.at +++ b/tests/system-kmod-macros.at @@ -4,7 +4,7 @@ # appropriate type and properties m4_define([_ADD_BR], [[add-br $1 -- set Bridge $1 protocols=OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13,OpenFlow14,OpenFlow15 fail-mode=secure ]]) -# OVS_TRAFFIC_VSWITCHD_START([vsctl-args], [vsctl-output], [=override]) +# OVS_TRAFFIC_VSWITCHD_START([vsctl-args], [vsctl-output], [dbinit-aux-args]]) # # Creates a database and starts ovsdb-server, starts ovs-vswitchd # connected to that database, calls ovs-vsctl to create a bridge named @@ -24,7 +24,7 @@ m4_define([OVS_TRAFFIC_VSWITCHD_START], ]) on_exit 'ovs-dpctl del-dp ovs-system' on_exit 'ovs-appctl dpctl/flush-conntrack' - _OVS_VSWITCHD_START([]) + _OVS_VSWITCHD_START([], [$3]) dnl Add bridges, ports, etc. AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| uuidfilt])], [0], [$2]) ]) diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at index 80bc1dd5c..f7373664c 100644 --- a/tests/system-offloads-traffic.at +++ b/tests/system-offloads-traffic.at @@ -39,9 +39,8 @@ AT_CLEANUP AT_SETUP([offloads - ping between two ports - offloads enabled]) -OVS_TRAFFIC_VSWITCHD_START() +OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . other_config:hw-offload=true]) -AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-offload=true]) AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) ADD_NAMESPACES(at_ns0, at_ns1) @@ -97,8 +96,7 @@ AT_CLEANUP AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - offloads enabled]) AT_KEYWORDS([ingress_policing]) AT_SKIP_IF([test $HAVE_TC = "no"]) -OVS_TRAFFIC_VSWITCHD_START() -AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-offload=true]) +OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . other_config:hw-offload=true]) AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) ADD_NAMESPACES(at_ns0) ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") @@ -146,8 +144,7 @@ AT_CLEANUP AT_SETUP([offloads - set ingress_policing_kpkts_rate and ingress_policing_kpkts_burst - offloads enabled]) AT_KEYWORDS([ingress_policing_kpkts]) AT_SKIP_IF([test $SUPPORT_TC_INGRESS_PPS = "no"]) -OVS_TRAFFIC_VSWITCHD_START() -AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:hw-offload=true]) +OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . other_config:hw-offload=true]) AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) ADD_NAMESPACES(at_ns0) ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") diff --git a/tests/system-tso-macros.at b/tests/system-tso-macros.at index 1a8004761..1881c28fb 100644 --- a/tests/system-tso-macros.at +++ b/tests/system-tso-macros.at @@ -4,7 +4,7 @@ # appropriate type and properties m4_define([_ADD_BR], [[add-br $1 -- set Bridge $1 datapath_type="netdev" protocols=OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13,OpenFlow14,OpenFlow15 fail-mode=secure ]]) -# OVS_TRAFFIC_VSWITCHD_START([vsctl-args], [vsctl-output], [=override]) +# OVS_TRAFFIC_VSWITCHD_START([vsctl-args], [vsctl-output], [pre-vswitchd-commands]) # # Creates a database and starts ovsdb-server, starts ovs-vswitchd # connected to that database, calls ovs-vsctl to create a bridge