Re: [ovs-dev] [PATCH v5 1/2] handlers: Create additional handler threads when using CPU isolation

2022-07-05 Thread Michael Santana




On 7/4/22 09:46, Ilya Maximets wrote:

Hi, Michael.  Thanks for the new version!

Hi Ilya, thanks for the response


On 6/6/22 20:59, Michael Santana wrote:

Additional threads are required to service upcalls when we have CPU
isolation (in per-cpu dispatch mode). The reason additional threads
are required is because it creates a more fair distribution.


I think, the description above lacks the information on why we need
to create additional threads, and why we choose the number to be a
prime number.  Yes, you said that it creates a more fair distribution,
but we probably need to describe that more with an example.


With more
threads we decrease the load of each thread as more threads would
decrease the number of cores each thread is assigned.


While that is true, it's not obvious why increasing the number of
threads beyond the number of available cores is good for us.
The key factor is more fair load distribution among threads, so all
cores are utilized, but that is not highlighted.

Yes agreed.

We want additional threads because we want to minimize the number of 
cores assigned to each individual thread. 75 handler threads servicing 
100 cores is better than 50 threads servicing 100 cores: in the 
75-handler case each thread services 1.33 cores on average, whereas in 
the 50-handler case each thread services 2 cores. The fewer cores 
assigned to an individual thread, the less work that thread has to do. 
This example ignores the number of actual cores available for OVS 
userspace.


On the flip side, we do not want to create too many threads, since we 
could end up with many threads contending for too few active cores OVS 
can use.


Which brings me to my last point. The prime scheme is arbitrary (or good 
enough for now). We really just want more threads, and there is no reason 
the prime scheme is better than any other, at least not without knowing 
how many threads we can add before kernel overhead hurts performance. I 
think you mentioned this several times, but we need to do testing to 
figure out exactly how many threads we can add before kernel overhead 
hurts performance. I suspect the prime scheme is on the low end and that 
we can add more threads than it allows without hurting performance. We 
can look at how the upcalls/s rate changes based on the number of 
handlers and the number of active cores. The prime scheme is an easy 
solution, but it is a little like driving blindly without more stats.


On the other hand, stats might differ from one system to another. What is 
good on my test system doesn't necessarily translate to another system. 
Let me know what you think about this and whether it is worth the effort.


The prime scheme is sufficient as is. I just think that we can improve 
it a little bit. But we don't have to go down that road if it isn't needed.








Adding as many
threads as there are cores could have a performance hit when the number
of active cores (which all threads have to share) is very low. For this
reason we avoid creating as many threads as there are cores and instead
meet somewhere in the middle.

The formula used to calculate the number of handler threads to create
is as follows:

handlers_n = min(next_prime(active_cores+1), total_cores)

Where next_prime is a function that returns the smallest prime number
that is strictly greater than active_cores.

Assume default behavior when total_cores <= 2; that is, do not create
additional threads when the system has two or fewer total cores.

Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.")
Signed-off-by: Michael Santana 
---
  lib/dpif-netlink.c | 86 --
  lib/ovs-thread.c   | 16 +
  lib/ovs-thread.h   |  1 +
  3 files changed, 101 insertions(+), 2 deletions(-)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 06e1e8ca0..e948a829b 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -31,6 +31,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include "bitmap.h"

  #include "dpif-netlink-rtnl.h"
@@ -2506,6 +2507,87 @@ dpif_netlink_handler_uninit(struct dpif_handler *handler)
  }
  #endif
  
+/*
+ * Returns 1 if num is a prime number,
+ * Otherwise return 0
+ */
+static uint32_t


This should just return bool.

ACK, thanks



+is_prime(uint32_t num)
+{
+    if ((num < 2) || ((num % 2) == 0)) {


There is no need for so many parentheses.

And it might make sense to just write 3 if blocks
here to make the code simpler, i.e. check for
< 2, == 2 and % 2.

ACK, thanks



+        return num == 2;
+    }
+
+    uint32_t i;
+    uint32_t limit = sqrt((float) num);
+    for (i = 3; i <= limit; i += 2) {


for (uint64_t i = 3; i * i <= num; i += 2) {

There is no real need for sqrt.  I don't think we should be concerned
about 'i * i' performance here.

ACK, I think I got carried away with optimization



+if 

[ovs-dev] [linux-next:master] BUILD REGRESSION 2a2aa3f05338270aecbe2492fda910d6c17e0102

2022-07-05 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: 2a2aa3f05338270aecbe2492fda910d6c17e0102  Add linux-next specific 
files for 20220705

Error/Warning reports:

https://lore.kernel.org/linux-doc/202207051821.3f0erisl-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

Documentation/PCI/endpoint/pci-vntb-function.rst:82: WARNING: Unexpected 
indentation.
Documentation/PCI/endpoint/pci-vntb-howto.rst:131: WARNING: Title underline too 
short.
drivers/pci/endpoint/functions/pci-epf-vntb.c:975:5: warning: no previous 
prototype for 'pci_read' [-Wmissing-prototypes]
drivers/pci/endpoint/functions/pci-epf-vntb.c:984:5: warning: no previous 
prototype for 'pci_write' [-Wmissing-prototypes]

Unverified Error/Warning (likely false positive, please contact us if 
interested):

block/partitions/efi.c:223:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
block/sed-opal.c:427:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
crypto/asymmetric_keys/pkcs7_verify.c:311:1: internal compiler error: in 
arc_ifcvt, at config/arc/arc.c:9637
drivers/ata/libata-core.c:2802:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/ata/libata-eh.c:2842:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/ata/sata_dwc_460ex.c:691:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/base/power/runtime.c:1570:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/block/rbd.c:6142:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/bluetooth/hci_ll.c:588:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/bluetooth/hci_qca.c:2137:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/cdrom/cdrom.c:1041:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/char/ipmi/ipmi_ssif.c:1918:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/char/pcmcia/cm4000_cs.c:922:1: internal compiler error: in arc_ifcvt, 
at config/arc/arc.c:9637
drivers/char/random.c:869:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/char/tpm/tpm_tis_core.c:1122:1: internal compiler error: in arc_ifcvt, 
at config/arc/arc.c:9637
drivers/clk/bcm/clk-iproc-armpll.c:139:1: internal compiler error: in 
arc_ifcvt, at config/arc/arc.c:9637
drivers/clk/clk-bd718x7.c:50:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/clk/clk-lochnagar.c:187:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/crypto/ccree/cc_request_mgr.c:206:1: internal compiler error: in 
arc_ifcvt, at config/arc/arc.c:9637
drivers/crypto/qce/sha.c:73:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/crypto/qce/skcipher.c:61:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/cxl/core/hdm.c:38:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/cxl/core/pci.c:67:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/dma-buf/dma-buf.c:1100:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/firmware/arm_scmi/bus.c:152:1: internal compiler error: in arc_ifcvt, 
at config/arc/arc.c:9637
drivers/firmware/arm_scmi/clock.c:394:1: internal compiler error: in arc_ifcvt, 
at config/arc/arc.c:9637
drivers/firmware/arm_scmi/powercap.c:376:1: internal compiler error: in 
arc_ifcvt, at config/arc/arc.c:9637
drivers/firmware/arm_scmi/sensors.c:673:1: internal compiler error: in 
arc_ifcvt, at config/arc/arc.c:9637
drivers/firmware/arm_scmi/voltage.c:363:1: internal compiler error: in 
arc_ifcvt, at config/arc/arc.c:9637
drivers/fpga/dfl-fme-mgr.c:163:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/gnss/usb.c:68:1: internal compiler error: in arc_ifcvt, at 
config/arc/arc.c:9637
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_debug.c:175:1: internal 
compiler error: in arc_ifcvt, at config/arc/arc.c:9637
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:1006:1: internal 
compiler error: in arc_ifcvt, at config/arc/arc.c:9637
drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_resource.c:1035:1: 
internal compiler error: in arc_ifcvt, at config/arc/arc.c:9637
drivers/gpu/drm/amd/amdgpu/../display/dc/dce112/dce112_resource.c:955:1: 
internal compiler error: in arc_ifcvt, at config/arc/arc.c:9637
drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu7_hwmgr.c:3895:1: internal 
compiler error: in arc_ifcvt, at config/arc/arc.c:9637
drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu8_hwmgr.c:754:1: internal 
compiler error: in arc_ifcvt, at config/arc/arc.c:9637
drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_powertune.c:1214:1: 
internal compiler error: in arc_ifcvt, at config/arc/arc.c:9637
drivers/gpu/drm/amd/amdgpu/../pm/powerplay/smumgr/smu7_smumgr.c:195:1: internal

Re: [ovs-dev] [PATCH ovn 2/3] ovn-controller: Track OVSDB column open_vswitch:external_ids.

2022-07-05 Thread 0-day Robot
Bleep bloop.  Greetings Han Zhou, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Comment with 'xxx' marker
#36 FILE: controller/ovn-controller.c:926:
 * XXX: when there are overlapping columns monitored by different modules,

Lines checked: 80, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 3/3] ovn-controller: Support option ovn-allow-vips-share-hairpin-backend.

2022-07-05 Thread Han Zhou
Add an option for deployments that do not require multiple LB VIPs to
share the same backends for hairpin traffic, so that they can set it to
false and better utilize HW offload capabilities.

Signed-off-by: Han Zhou 
---
 controller/lflow.c  | 46 +
 controller/lflow.h  |  1 +
 controller/ovn-controller.8.xml | 10 +++
 controller/ovn-controller.c | 13 ++
 tests/ovn.at| 10 +++
 5 files changed, 69 insertions(+), 11 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index a44f6d056..35de03a21 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -1939,6 +1939,7 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
  struct ovn_lb_backend *lb_backend,
  uint8_t lb_proto,
  bool use_ct_mark,
+ bool check_ct_dst,
  struct ovn_desired_flow_table *flow_table)
 {
 uint64_t stub[1024 / 8];
@@ -1949,11 +1950,24 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
 put_load(, sizeof value, MFF_LOG_FLAGS,
  MLF_LOOKUP_LB_HAIRPIN_BIT, 1, );
 
-/* Matching on ct_nw_dst()/ct_ipv6_dst()/ct_tp_dst() requires matching
- * on ct_state first.
+/* To detect the hairpin flow accurately, the CT dst IP and ports need to
+ * be matched, to distinguish when more than one VIP shares the same
+ * hairpin backend ip+port.
+ *
+ * Matching on ct_nw_dst()/ct_ipv6_dst()/ct_tp_dst() requires matching
+ * on ct_state == DNAT first.
+ *
+ * However, matching on ct_state == DNAT may prevent the flows from being
+ * offloaded to HW. The option "check_ct_dst" (default true) is provided
+ * to disable this check and the related CT field matches, so that
+ * environments that don't require support for VIPs sharing the same
+ * backends can fully utilize HW offload capability for better performance.
  */
-uint32_t ct_state = OVS_CS_F_TRACKED | OVS_CS_F_DST_NAT;
-match_set_ct_state_masked(_match, ct_state, ct_state);
+
+if (check_ct_dst) {
+uint32_t ct_state = OVS_CS_F_TRACKED | OVS_CS_F_DST_NAT;
+match_set_ct_state_masked(_match, ct_state, ct_state);
+}
 
 if (IN6_IS_ADDR_V4MAPPED(_vip->vip)) {
 ovs_be32 bip4 = in6_addr_get_mapped_ipv4(_backend->ip);
@@ -1965,7 +1979,9 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
 match_set_dl_type(_match, htons(ETH_TYPE_IP));
 match_set_nw_src(_match, bip4);
 match_set_nw_dst(_match, bip4);
-match_set_ct_nw_dst(_match, vip4);
+if (check_ct_dst) {
+match_set_ct_nw_dst(_match, vip4);
+}
 
 add_lb_vip_hairpin_reply_action(NULL, snat_vip4, lb_proto,
 lb_backend->port,
@@ -1980,7 +1996,9 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
 match_set_dl_type(_match, htons(ETH_TYPE_IPV6));
 match_set_ipv6_src(_match, bip6);
 match_set_ipv6_dst(_match, bip6);
-match_set_ct_ipv6_dst(_match, _vip->vip);
+if (check_ct_dst) {
+match_set_ct_ipv6_dst(_match, _vip->vip);
+}
 
 add_lb_vip_hairpin_reply_action(snat_vip6, 0, lb_proto,
 lb_backend->port,
@@ -1991,8 +2009,10 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
 if (lb_backend->port) {
 match_set_nw_proto(_match, lb_proto);
 match_set_tp_dst(_match, htons(lb_backend->port));
-match_set_ct_nw_proto(_match, lb_proto);
-match_set_ct_tp_dst(_match, htons(lb_vip->vip_port));
+if (check_ct_dst) {
+match_set_ct_nw_proto(_match, lb_proto);
+match_set_ct_tp_dst(_match, htons(lb_vip->vip_port));
+}
 }
 
 /* In the original direction, only match on traffic that was already
@@ -2279,7 +2299,7 @@ add_lb_ct_snat_hairpin_flows(struct ovn_controller_lb *lb,
 static void
 consider_lb_hairpin_flows(const struct sbrec_load_balancer *sbrec_lb,
   const struct hmap *local_datapaths,
-  bool use_ct_mark,
+  bool use_ct_mark, bool check_ct_dst,
   struct ovn_desired_flow_table *flow_table,
   struct simap *ids)
 {
@@ -2319,7 +2339,7 @@ consider_lb_hairpin_flows(const struct 
sbrec_load_balancer *sbrec_lb,
 struct ovn_lb_backend *lb_backend = _vip->backends[j];
 
 add_lb_vip_hairpin_flows(lb, lb_vip, lb_backend, lb_proto,
- use_ct_mark, flow_table);
+ use_ct_mark, check_ct_dst, flow_table);
 }
 }
 
@@ -2333,6 +2353,7 @@ consider_lb_hairpin_flows(const struct 
sbrec_load_balancer *sbrec_lb,
 static void
 add_lb_hairpin_flows(const struct sbrec_load_balancer_table *lb_table,
  

[ovs-dev] [PATCH ovn 1/3] Don't save original dst IP and Port to avoid megaflow unwildcarding.

2022-07-05 Thread Han Zhou
The ls_in_pre_stateful priority 120 flow that saves dst IP and Port to
registers is causing a critical dataplane performance impact to
short-lived connections, because it unwildcards megaflows with exact
match on dst IP and L4 ports. Any new connection with a different
client-side L4 port will encounter a datapath flow miss and an upcall to
ovs-vswitchd.

These fields (dst IP and port) were saved to registers to solve a
problem in the LB hairpin use case when different VIPs share
overlapping backend+port [0]. The change [0] might not have had as wide
a performance impact at the time, because one of the match conditions,
"REGBIT_CONNTRACK_NAT == 1", was set only for established and natted
traffic. The impact is more obvious now because REGBIT_CONNTRACK_NAT
has been set for all IP traffic (if any VIP is configured on the LS)
since commit [1], after several other indirectly related optimizations
and refactors.

Since the changes that introduced the performance problem have their
own value (they fix problems or optimize performance), we don't want
to revert any of them (and it is also not straightforward to revert
them, because there have been many changes and refactors on top).

Change [0] itself added an alternative way to solve the overlapping
backends problem, which utilizes ct fields instead of saving dst IP and
port to registers. This patch switches to that approach and removes the
flows/actions that save the dst IP and port, avoiding the dataplane
performance problem for short-lived connections.

(With this approach, the ct_state DNAT match is not HW-offload friendly,
so those flows may not be offloaded; this is supposed to be solved in a
follow-up patch.)

[0] ce0ef8d59850 ("Properly handle hairpin traffic for VIPs with shared 
backends.")
[1] 0038579d1928 ("northd: Optimize ct nat for load balancer traffic.")

Signed-off-by: Han Zhou 
---
 controller/lflow.c   |  74 ++--
 include/ovn/logical-fields.h |   5 -
 lib/lb.c |   3 -
 lib/lb.h |   3 -
 northd/northd.c  |  95 ---
 northd/ovn-northd.8.xml  |  30 +
 tests/ovn-northd.at  |  78 -
 tests/ovn.at | 218 +--
 8 files changed, 141 insertions(+), 365 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index 934b23698..a44f6d056 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -1932,10 +1932,6 @@ add_lb_vip_hairpin_reply_action(struct in6_addr *vip6, 
ovs_be32 vip,
 }
 
 /* Adds flows to detect hairpin sessions.
- *
- * For backwards compatibilty with older ovn-northd versions, uses
- * ct_nw_dst(), ct_ipv6_dst(), ct_tp_dst(), otherwise uses the
- * original destination tuple stored by ovn-northd.
  */
 static void
 add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
@@ -1956,10 +1952,8 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
 /* Matching on ct_nw_dst()/ct_ipv6_dst()/ct_tp_dst() requires matching
  * on ct_state first.
  */
-if (!lb->hairpin_orig_tuple) {
-uint32_t ct_state = OVS_CS_F_TRACKED | OVS_CS_F_DST_NAT;
-match_set_ct_state_masked(_match, ct_state, ct_state);
-}
+uint32_t ct_state = OVS_CS_F_TRACKED | OVS_CS_F_DST_NAT;
+match_set_ct_state_masked(_match, ct_state, ct_state);
 
 if (IN6_IS_ADDR_V4MAPPED(_vip->vip)) {
 ovs_be32 bip4 = in6_addr_get_mapped_ipv4(_backend->ip);
@@ -1971,14 +1965,7 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
 match_set_dl_type(_match, htons(ETH_TYPE_IP));
 match_set_nw_src(_match, bip4);
 match_set_nw_dst(_match, bip4);
-
-if (!lb->hairpin_orig_tuple) {
-match_set_ct_nw_dst(_match, vip4);
-} else {
-match_set_reg(_match,
-  MFF_LOG_LB_ORIG_DIP_IPV4 - MFF_LOG_REG0,
-  ntohl(vip4));
-}
+match_set_ct_nw_dst(_match, vip4);
 
 add_lb_vip_hairpin_reply_action(NULL, snat_vip4, lb_proto,
 lb_backend->port,
@@ -1993,17 +1980,7 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
 match_set_dl_type(_match, htons(ETH_TYPE_IPV6));
 match_set_ipv6_src(_match, bip6);
 match_set_ipv6_dst(_match, bip6);
-
-if (!lb->hairpin_orig_tuple) {
-match_set_ct_ipv6_dst(_match, _vip->vip);
-} else {
-ovs_be128 vip6_value;
-
-memcpy(_value, _vip->vip, sizeof vip6_value);
-match_set_xxreg(_match,
-MFF_LOG_LB_ORIG_DIP_IPV6 - MFF_LOG_XXREG0,
-ntoh128(vip6_value));
-}
+match_set_ct_ipv6_dst(_match, _vip->vip);
 
 add_lb_vip_hairpin_reply_action(snat_vip6, 0, lb_proto,
 lb_backend->port,
@@ -2014,14 +1991,8 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
 

[ovs-dev] [PATCH ovn 2/3] ovn-controller: Track OVSDB column open_vswitch:external_ids.

2022-07-05 Thread Han Zhou
The column was not tracked before, while it should be. The column
includes many ovn-controller global configurations that could impact
the way flows are computed. It worked before because many of the
configurations are also populated to the OVN-SB DB's chassis table; as
a result, the SB DB change would notify the ovn-controller itself,
triggering a recompute that made the configuration change effective.
However, this is not how it is supposed to work. We should track the
open_vswitch:external_ids column directly, so that the I-P engine
immediately recomputes and applies the change.

This patch also adjusts the order of adding tracked and untracked
columns to monitoring, to work around a problem in the OVS IDL that
could end up overwriting the track flag. An XXX comment is added for
future improvement.

Signed-off-by: Han Zhou 
---
 controller/ovn-controller.c | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 69615308e..dacef0204 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -922,23 +922,19 @@ static void
 ctrl_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 {
 /* We do not monitor all tables by default, so modules must register
- * their interest explicitly. */
+ * their interest explicitly.
+ * XXX: when there are overlapping columns monitored by different modules,
+ * there is a chance that the "track" flag added by ovsdb_idl_track_add_column
+ * in one module is overwritten by a subsequent ovsdb_idl_add_column from
+ * another module. Until this is fixed in the OVSDB IDL, we need to be
+ * careful about the order, so that the "track" calls come after the
+ * "non-track" calls. */
 ovsdb_idl_add_table(ovs_idl, _table_open_vswitch);
-ovsdb_idl_add_column(ovs_idl, _open_vswitch_col_external_ids);
 ovsdb_idl_add_column(ovs_idl, _open_vswitch_col_other_config);
 ovsdb_idl_add_column(ovs_idl, _open_vswitch_col_bridges);
 ovsdb_idl_add_column(ovs_idl, _open_vswitch_col_datapaths);
 ovsdb_idl_add_table(ovs_idl, _table_interface);
-ovsdb_idl_track_add_column(ovs_idl, _interface_col_name);
-ovsdb_idl_track_add_column(ovs_idl, _interface_col_bfd);
-ovsdb_idl_track_add_column(ovs_idl, _interface_col_bfd_status);
-ovsdb_idl_track_add_column(ovs_idl, _interface_col_type);
-ovsdb_idl_track_add_column(ovs_idl, _interface_col_options);
-ovsdb_idl_track_add_column(ovs_idl, _interface_col_ofport);
 ovsdb_idl_add_table(ovs_idl, _table_port);
-ovsdb_idl_track_add_column(ovs_idl, _port_col_name);
-ovsdb_idl_track_add_column(ovs_idl, _port_col_interfaces);
-ovsdb_idl_track_add_column(ovs_idl, _port_col_external_ids);
 ovsdb_idl_add_table(ovs_idl, _table_bridge);
 ovsdb_idl_add_column(ovs_idl, _bridge_col_ports);
 ovsdb_idl_add_column(ovs_idl, _bridge_col_name);
@@ -958,6 +954,16 @@ ctrl_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 bfd_register_ovs_idl(ovs_idl);
 physical_register_ovs_idl(ovs_idl);
 vif_plug_register_ovs_idl(ovs_idl);
+ovsdb_idl_track_add_column(ovs_idl, _open_vswitch_col_external_ids);
+ovsdb_idl_track_add_column(ovs_idl, _interface_col_name);
+ovsdb_idl_track_add_column(ovs_idl, _interface_col_bfd);
+ovsdb_idl_track_add_column(ovs_idl, _interface_col_bfd_status);
+ovsdb_idl_track_add_column(ovs_idl, _interface_col_type);
+ovsdb_idl_track_add_column(ovs_idl, _interface_col_options);
+ovsdb_idl_track_add_column(ovs_idl, _interface_col_ofport);
+ovsdb_idl_track_add_column(ovs_idl, _port_col_name);
+ovsdb_idl_track_add_column(ovs_idl, _port_col_interfaces);
+ovsdb_idl_track_add_column(ovs_idl, _port_col_external_ids);
 }
 
 #define SB_NODES \
-- 
2.30.2



Re: [ovs-dev] [RFC PATCH 0/6] Remove OVS kernel driver

2022-07-05 Thread Gregory Rose




On 6/29/2022 3:42 PM, Ilya Maximets wrote:

On 6/10/22 02:31, Frode Nordahl wrote:

On Fri, Jun 3, 2022 at 4:16 PM Frode Nordahl
 wrote:


On Thu, Jun 2, 2022 at 6:58 PM Ilya Maximets  wrote:


On 6/1/22 22:53, Gregory Rose wrote:



On 5/31/2022 12:22 PM, Frode Nordahl wrote:

On Tue, May 31, 2022 at 7:05 PM Ilya Maximets  wrote:


On 5/31/22 17:36, Gregory Rose wrote:



On 5/25/2022 6:47 AM, Flavio Leitner wrote:


Hi Greg,


On Mon, May 23, 2022 at 09:10:36PM +0200, Ilya Maximets wrote:

On 5/19/22 20:04, Gregory Rose wrote:



On 4/15/2022 2:42 PM, Greg Rose wrote:

It is time to remove support for the OVS kernel driver and push
towards use of the upstream Linux openvswitch kernel driver
in its place [1].

This patch series represents a first attempt but there are a few
primary remaining issues that I have yet to address.

A) Removal of debian packing support for the dkms kernel driver
   module. The debian/rules are not well known to me - I've never
   actually made any changes in that area and do not have a
   well-formed understanding of how debian packaging works.  I will
   attempt to fix that up in an upcoming patch series.
B) Figuring out the github workflow - I removed the tests I
   could find that depend on the Linux kernel (i.e., they use the
   install_kernel() function).  Several other tests are failing
   that would not seem to depend on the Linux kernel.  I need to
   read and understand that code better.
C) There are many Linux specific source modules in the datapath that
   will need eventual removal but some headers are still required for
   the userspace code (which seems counterintuitive but...)

Reviews, suggestions, etc. are appreciated!

1.  https://mail.openvswitch.org/pipermail/ovs-dev/2022-April/393292.html


I would like to suggest at this time that rather than removing the OVS
Linux kernel datapath we "freeze" it at Linux 5.8. This will make it
easier for some consumers of OVS that continue to support the
Linux kernel datapath in old distributions.

The ultimate goal of shifting toward DPDK and AFXDP datapaths is still
preserved but we are placing less burden on some consumers of OVS for
older Linux distributions.

Perhaps in suggesting removal of the kernel datapath I was being a bit
overly aggressive.

Thoughts? Concerns? Other suggestions?


Hi.  I think we discussed that before.  Removal from the master branch
doesn't mean that we will stop supporting the kernel module immediately.
It will remain in branch 2.17 which will become our new LTS series soon.
This branch will be supported until 2025.  And we also talked about
possibility of extending the support just for the kernel module on that
branch, if required.  It's not necessary to use the kernel module and
OVS from the same branch, obviously.

Removal from the master branch will just make it possible to remove
the maintenance burden eventually, not right away.

And FWIW, the goal is not to force everyone to use userspace datapath,
but remove a maintenance burden and push users to use a better supported
version of the code.  Frankly, we're not doing a great job supporting the
out-of-tree module these days.  It's getting hard to backport bug fixes.
And will be even harder over time since the code drifts away from the
version in the upstream kernel.  Mainly because we're not backporting
new features for a few years already.

Does that make sense?


Any thoughts on this? The freeze time is approaching, so it would
be great to know your plans for this patch set.

Thanks,
fbl



Hi Flavio and Ilya,

I'll go ahead with the plans as per our previous agreements.  I'm having
issues with removing the debian kernel module support, since I have never
worked with debian rules-style make environments.  I seem to break
something with every attempt, but I will keep at it.

What's my time frame before the freeze?


The "soft-freeze" supposed to be on July 1st.  The branch creation
for a new release - mid July.  It would be great if we can get this
in before the soft freeze, but branching point is also fine.
So, we have about 6 weeks.

If you can think of any part of the work that can be done separately
by someone else, we could try and find someone to help you out.  I'm
not sure if we have experts on debian packaging though.  Maybe we
can ask some folks from Canonical.  They do their own packaging, but
should know a thing or two about packaging in general.


We'd be happy to help out with the packaging bits.

Both Debian and Ubuntu have drifted away from what is currently in the
debian/ folder in the OVS and OVN repositories.  This state is
problematic because from time to time someone tries to build packages
from the OVS/OVN debian package source and then expects that package to
work with upgrades from, or downgrades to, the distro versions.

So we would prefer to either remove what's there and replace it with a
README pointing to Debian and Ubuntu package sources, or update what's
there to match packaging state du 

Re: [ovs-dev] [ovn] bug: load balancer health check status is not updated if port binding is released from chassis

2022-07-05 Thread Vladislav Odintsov
Thanks Numan for the hint.
I’ve submitted the full patch here: 
https://patchwork.ozlabs.org/project/ovn/patch/20220705175154.3095150-1-odiv...@gmail.com/
It would be great if you could find some time to review it.

Thanks.

Regards,
Vladislav Odintsov

> On 5 Jul 2022, at 18:44, Numan Siddique  wrote:
> 
> On Tue, Jul 5, 2022 at 8:32 AM Vladislav Odintsov  > wrote:
>> 
>> Hi Numan,
>> 
>> I’ve send a draft patch: 
>> https://patchwork.ozlabs.org/project/ovn/patch/20220705122031.2568471-1-odiv...@gmail.com/
>> 
>> While implementing I've encountered a problem where the port binding record, 
>> which I’m trying to get from struct ovn_port, has stale state (it doesn’t 
>> reflect the "up" field changes that I see in ovn-northd logs as an update), 
>> and I have no idea how to pull it.
>> 
>> From ovn-northd debug logs:
>> 
>> port binding "up" field update comes from SB:
>> 
>> 2022-07-05T12:17:25.445Z|00412|jsonrpc|DBG|unix:/home/ec2-user/ovn/tests/system-kmod-testsuite.dir/097/ovn-sb/ovn-sb.sock:
>>  received notification, method="update3", 
>> params=[["monid","OVN_Southbound"],"----",{"Port_Binding":{"ec0e94cc-f2d0-44de-8ad0-7fd62b7d50d6":{"modify":{"up":false,"chassis":["set",[]]]
>> 
>> next I print op->sb->up ? "UP" : "DOWN" and it remains "UP":
>> 
> 
> You need to do  :  (op->sb->n_up && op->sb->up[0]) ? "UP" : "DOWN".
> 
> Thanks
> Numan
> 
> 
> 
>> 2022-07-05T12:17:25.446Z|00414|northd|INFO|PORT: sw1-p1, pbid: 
>> ec0e94cc-f2d0-44de-8ad0-7fd62b7d50d6, PB: UP, lspid: 
>> aed98c29-03db-4761-b178-42f288a12692, LSP: UP, svc->logical_port: sw1-p1, 
>> svc->status: online
>> 
>> I guess this is a result of my misunderstanding of the principles of 
>> incremental engine operation.
>> Can you help me understand why the port_binding structure has stale state 
>> and how to "pull changes" for it?
>> 
>> Regards,
>> Vladislav Odintsov
>> 
>>> On 4 Jul 2022, at 22:13, Numan Siddique  wrote:
>>> 
>>> On Mon, Jul 4, 2022 at 12:56 PM Vladislav Odintsov  
>>> wrote:
 
 Thanks Numan,
 
 would you have time to fix it or maybe give an idea how to do it, so I can 
 try?
>>> 
>>> It would be great if you want to give it a try.  I was thinking of 2
>>> possible approaches to fix this issue.
>>> Even though pinctrl.c sets the status to offline, it cannot set the
>>> service monitor to offline when ovn-controller releases the port
>>> binding.
>>> The first option is to handle this in binding.c as you suggested
>>> earlier.  Or ovn-northd can set the status to offline, if the service
>>> monitor's logical port
>>> is no longer claimed by any ovn-controller.  I'm more inclined to
>>> handle this in ovn-northd.   What do you think?
>>> 
>>> Thanks
>>> Numan
>>> 
 
 Regards,
 Vladislav Odintsov
 
> On 4 Jul 2022, at 19:51, Numan Siddique  wrote:
> 
> On Mon, Jul 4, 2022 at 7:48 AM Vladislav Odintsov  > wrote:
>> 
>> Hi,
>> 
>> we’ve found a wrong behaviour with service_monitor record status for a 
>> health-checked Load Balancer.
>> Its status can stay online forever even if the virtual machine is stopped. 
>> This leads to load-balanced traffic being sent to a dead backend.
>> 
>> Below is a script to reproduce the issue, because I have doubts about the 
>> correct place for a possible fix (my guess is it should be fixed in 
>> controller/binding.c in the function binding_lport_set_down, but I’m not 
>> sure how this could affect VM live migration…):
>> 
>> # cat ./repro.sh
>> #!/bin/bash -x
>> 
>> ovn-nbctl ls-add ls1
>> ovn-nbctl lsp-add ls1 lsp1 -- \
>>  lsp-set-addresses lsp1 "00:00:00:00:00:01 192.168.0.10"
>> ovn-nbctl lb-add lb1 192.168.0.100:80 192.168.0.10:80
>> ovn-nbctl set Load_balancer lb1 
>> ip_port_mappings:192.168.0.10=lsp1:192.168.0.8
>> ovn-nbctl --id=@id create Load_Balancer_Health_Check 
>> vip='"192.168.0.100:80"' -- set Load_Balancer lb1 health_check=@id
>> ovn-nbctl ls-lb-add ls1 lb1
>> 
>> ovs-vsctl add-port br-int test-lb -- set interface test-lb type=internal 
>> external_ids:iface-id=lsp1
>> ip li set test-lb addr 00:00:00:00:00:01
>> ip a add 192.168.0.10/24 dev test-lb
>> ip li set test-lb up
>> 
>> # check service_monitor
>> ovn-sbctl list service_mon
>> 
>> # ensure state became offline
>> sleep 4
>> ovn-sbctl list service_mon
>> 
>> # start listen on :80 with netcat
>> ncat -k -l 192.168.0.10 80 &
>> 
>> # ensure state turned to online
>> sleep 4
>> ovn-sbctl list service_mon
>> 
>> # trigger binding release
>> ovs-vsctl remove interface test-lb external_ids iface-id
>> 
>> # ensure state remains online
>> sleep 10
>> ovn-sbctl list service_mon
>> 
>> # ensure OVS group and backend is still in bucket
>> ovs-ofctl dump-groups br-int | grep 192.168.0.10
> 
> 
> 

[ovs-dev] [PATCH ovn] northd: set svc_mon status to offline if port_binding released

2022-07-05 Thread Vladislav Odintsov
This patch fixes a situation where a Load Balancer
has a corresponding Load_Balancer_Health_Check and
the current checks are successful.  This means the
Service_Monitor record in the SBDB has status == "online".

In case a VM (or container) stops and its Port_Binding
is released (for instance, by removing the interface from
OVS, or removing iface-id from the interface's external_ids),
prior to this patch the state remained "online", directing
affected traffic to a dead backend.

The corresponding testcase was updated with the described behaviour.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/395504.html
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2103740
Signed-off-by: Vladislav Odintsov 
---
 northd/northd.c |  7 +++
 tests/ovn-northd.at | 33 -
 tests/system-ovn.at | 11 ++-
 3 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 27cc8775c..63b6a73c8 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -3884,6 +3884,13 @@ ovn_lb_svc_create(struct ovsdb_idl_txn *ovnsb_txn, 
struct ovn_northd_lb *lb,
 backend_nb->svc_mon_src_ip);
 }
 
+if ((!op->sb->n_up || !op->sb->up[0])
+&& mon_info->sbrec_mon->status
+&& !strcmp(mon_info->sbrec_mon->status, "online")) {
+sbrec_service_monitor_set_status(mon_info->sbrec_mon,
+ "offline");
+}
+
 backend_nb->sbrec_monitor = mon_info->sbrec_mon;
 mon_info->required = true;
 }
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index fe97bedad..812c60b53 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1207,6 +1207,13 @@ ovn-nbctl ls-add sw1
 ovn-nbctl --wait=sb lsp-add sw1 sw1-p1 -- lsp-set-addresses sw1-p1 \
 "02:00:00:00:00:03 20.0.0.3"
 
+# service_monitors state online requires corresponding port_binding to be "up"
+ovn-sbctl chassis-add hv1 geneve 127.0.0.1
+ovn-sbctl lsp-bind sw0-p1 hv1
+ovn-sbctl lsp-bind sw1-p1 hv1
+wait_row_count nb:Logical_Switch_Port 1 name=sw0-p1 'up=true'
+wait_row_count nb:Logical_Switch_Port 1 name=sw1-p1 'up=true'
+
 wait_row_count Service_Monitor 0
 
 ovn-nbctl --wait=sb set load_balancer . 
ip_port_mappings:10.0.0.3=sw0-p1:10.0.0.2
@@ -1220,7 +1227,7 @@ check ovn-nbctl --wait=sb ls-lb-add sw0 lb1
 AT_CAPTURE_FILE([sbflows])
 OVS_WAIT_FOR_OUTPUT(
   [ovn-sbctl dump-flows sw0 | tee sbflows | grep 'priority=120.*backends' | 
sed 's/table=..//'], 0, [dnl
-  (ls_in_lb   ), priority=120  , match=(ct.new && ip4.dst == 10.0.0.10 
&& tcp.dst == 80), action=(reg0[[1]] = 0; reg1 = 10.0.0.10; reg2[[0..15]] = 80; 
ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);)
+  (ls_in_lb   ), priority=120  , match=(ct.new && ip4.dst == 10.0.0.10 
&& tcp.dst == 80), action=(reg0[[1]] = 0; reg1 = 10.0.0.10; reg2[[0..15]] = 80; 
ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
 ])
 
 AS_BOX([Delete the Load_Balancer_Health_Check])
@@ -1230,7 +1237,7 @@ wait_row_count Service_Monitor 0
 AT_CAPTURE_FILE([sbflows2])
 OVS_WAIT_FOR_OUTPUT(
   [ovn-sbctl dump-flows sw0 | tee sbflows2 | grep 'priority=120.*backends' | 
sed 's/table=..//'], [0],
-[  (ls_in_lb   ), priority=120  , match=(ct.new && ip4.dst == 
10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; reg1 = 10.0.0.10; 
reg2[[0..15]] = 80; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);)
+[  (ls_in_lb   ), priority=120  , match=(ct.new && ip4.dst == 
10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; reg1 = 10.0.0.10; 
reg2[[0..15]] = 80; ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
 ])
 
 AS_BOX([Create the Load_Balancer_Health_Check again.])
@@ -1242,7 +1249,7 @@ check ovn-nbctl --wait=sb sync
 
 ovn-sbctl dump-flows sw0 | grep backends | grep priority=120 > lflows.txt
 AT_CHECK([cat lflows.txt | sed 's/table=..//'], [0], [dnl
-  (ls_in_lb   ), priority=120  , match=(ct.new && ip4.dst == 10.0.0.10 
&& tcp.dst == 80), action=(reg0[[1]] = 0; reg1 = 10.0.0.10; reg2[[0..15]] = 80; 
ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);)
+  (ls_in_lb   ), priority=120  , match=(ct.new && ip4.dst == 10.0.0.10 
&& tcp.dst == 80), action=(reg0[[1]] = 0; reg1 = 10.0.0.10; reg2[[0..15]] = 80; 
ct_lb(backends=10.0.0.3:80,20.0.0.3:80);)
 ])
 
 AS_BOX([Get the uuid of both the service_monitor])
@@ -1252,7 +1259,7 @@ sm_sw1_p1=$(fetch_column Service_Monitor _uuid 
logical_port=sw1-p1)
 AT_CAPTURE_FILE([sbflows3])
 OVS_WAIT_FOR_OUTPUT(
  [ovn-sbctl dump-flows sw0 | tee sbflows3 | grep 'priority=120.*backends' | 
sed 's/table=..//'], [0],
-[  (ls_in_lb   ), priority=120  , match=(ct.new && ip4.dst == 
10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; reg1 = 10.0.0.10; 
reg2[[0..15]] = 80; ct_lb_mark(backends=10.0.0.3:80,20.0.0.3:80);)
+[  (ls_in_lb   ), priority=120  , match=(ct.new && ip4.dst == 
10.0.0.10 && tcp.dst == 80), action=(reg0[[1]] = 0; reg1 = 10.0.0.10; 
reg2[[0..15]] = 80; 

Re: [ovs-dev] [PATCH v11 0/4] MFEX Optimizations IPv6 + Hashing Optimizations

2022-07-05 Thread Stokes, Ian
> The patchset introduces IPv6-optimized MFEX profiles
> with AVX512 which can deliver up to 20% to 30% gain in
> performance over the existing scalar data-path.
> 
> Hashing optimizations are also included which can further
> improve performance by approximately 10%.
> 
> The patch also removes the magic numbers for MF bits, packet offsets
> and packet lengths.

Thanks for the patch series, Amber & Harry. I have been testing this for the last 
few days and it looks good to me. I've applied the series to master with a few 
modifications (I removed the entry from NEWS as we didn't add an entry for the 
IPv4 optimizations that were merged previously, so I think it makes sense to do 
the same thing here; I also fixed a few typos in commit messages).

Thanks
Ian
 
> 
> ---
> v11:
> - Rebase on top of AVX512 runtime ISA check patch.
> v10:
> - Rebase onto Cian's partial AVX512 build patch.
> v9:
> - Fix UBSan un-aligned memory load.
> v8:
> - Rename packet offset defines to align with the packet struct.
> v7:
> - Remove magic numbers from AVX512 Profiles.
> v5:
> - Add Ipv6 and TCP packet length checks.
> v4:
> - rebase to master.
> - use static key lengths for different packet types.
> v3:
> - rebase to master.
> v2:
> - fix the CI build.
> - fix check-patch for co-author.
> ---
> 
> Kumar Amber (4):
>   mfex_avx512: Calculate pkt offsets at compile time.
>   mfex_avx512: Calculate miniflow_bits at compile time.
>   dpif-netdev/mfex: Add AVX512 ipv6 traffic profiles
>   dpif-netdev/mfex: Add ipv6 profile based hashing.
> 
>  NEWS  |   4 +
>  acinclude.m4  |   1 +
>  lib/automake.mk   |   5 +-
>  lib/dp-packet.h   |  43 
>  lib/dpif-netdev-extract-avx512.c  | 349
> +-
>  lib/dpif-netdev-private-extract.c |  51 -
>  lib/dpif-netdev-private-extract.h |  20 ++
>  lib/flow.c|   4 +
>  8 files changed, 463 insertions(+), 14 deletions(-)
> 
> --
> 2.25.1
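
[Editorial note: the "removes the magic numbers for MF bits, packet offsets and packet lengths" change above can be illustrated with a minimal standalone sketch: deriving packet offsets from packed header structs at compile time instead of hard-coding them. The struct and constant names below are illustrative assumptions, not the actual OVS dp-packet/MFEX definitions.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch, NOT the actual OVS code: instead of hard-coded
 * magic offsets, derive the constants from packed header structs at
 * compile time so they always track the layouts. */
struct eth_hdr {
    uint8_t  dst[6];
    uint8_t  src[6];
    uint16_t type;
} __attribute__((packed));

struct ipv6_hdr {
    uint32_t vtc_flow;
    uint16_t payload_len;
    uint8_t  next_hdr;
    uint8_t  hop_limit;
    uint8_t  src[16];
    uint8_t  dst[16];
} __attribute__((packed));

enum {
    ETH_HDR_LEN    = sizeof(struct eth_hdr),       /* 14 */
    IP6_OFFSET     = ETH_HDR_LEN,                  /* L3 starts after L2 */
    IP6_HDR_LEN    = sizeof(struct ipv6_hdr),      /* 40 */
    L4_OFFSET      = IP6_OFFSET + IP6_HDR_LEN,     /* 54 for plain IPv6 */
    IP6_SRC_OFFSET = IP6_OFFSET + offsetof(struct ipv6_hdr, src),
};
```

Because these are compile-time constants, a mistake (e.g. a struct field added without updating a constant) shows up as a build-time inconsistency rather than a silent wrong offset.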

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] OVN-CI: Add test cases with monitor-all enabled.

2022-07-05 Thread Numan Siddique
On Fri, Jul 1, 2022 at 4:18 PM Dumitru Ceara  wrote:
>
> On 6/30/22 13:59, Mohammad Heib wrote:
> > Currently OVN CI has only one test case with the option
> > ovn-monitor-all enabled; this patch adds one more
> > execution with option ovn-monitor-all=true for each test case that
> > is wrapped by the OVN_FOR_EACH_NORTHD macro.
> >
> > This will more or less double the number of test cases.
> > It is possible to select a reduced set of test cases using -k "keywords".
> > Keywords such as
> > dp-groups=yes
> > dp-groups=no
> > parallelization=yes
> > parallelization=no
> > ovn-northd
> > ovn-northd-ddlog
> > ovn_monitor_all=yes
> > can be used to select a range of tests, as the title is searched as well.
> >
> > For instance, to run ovn-monitor-all tests, with dp-groups enabled and 
> > ddlog disabled:
> > make check TESTSUITEFLAGS="-k 
> > dp-groups=yes,ovn_monitor_all=yes,\!ovn-northd-ddlog"
> >
> > Signed-off-by: Mohammad Heib 
> > ---
>
> Looks good to me, thanks!
>
> Acked-by: Dumitru Ceara 

Thanks Mohammad and Dumitru.  I applied this patch to the main branch.

Numan

>
>


Re: [ovs-dev] [ovn] bug: load balancer health check status is not updated if port binding is released from chassis

2022-07-05 Thread Numan Siddique
On Tue, Jul 5, 2022 at 8:32 AM Vladislav Odintsov  wrote:
>
> Hi Numan,
>
> I’ve sent a draft patch: 
> https://patchwork.ozlabs.org/project/ovn/patch/20220705122031.2568471-1-odiv...@gmail.com/
>
> While implementing it, I've encountered a problem where the port binding record, 
> which I’m trying to get from struct ovn_port, has stale state (it doesn’t 
> reflect "up" field changes, which I see in ovn-northd logs with an update), and 
> I have no idea how to pull them.
>
> From ovn-northd debug logs:
>
> port binding "up" field update comes from SB:
>
> 2022-07-05T12:17:25.445Z|00412|jsonrpc|DBG|unix:/home/ec2-user/ovn/tests/system-kmod-testsuite.dir/097/ovn-sb/ovn-sb.sock:
>  received notification, method="update3", 
> params=[["monid","OVN_Southbound"],"----",{"Port_Binding":{"ec0e94cc-f2d0-44de-8ad0-7fd62b7d50d6":{"modify":{"up":false,"chassis":["set",[]]]
>
> next I print op->sb->up ? "UP" : "DOWN" and it remains "UP":
>

You need to do  :  (op->sb->n_up && op->sb->up[0]) ? "UP" : "DOWN".
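
[Editorial note: the tip above reflects how the OVSDB IDL represents optional columns: the generated struct stores the value as a pointer plus an element count, so presence must be checked before reading. A minimal standalone sketch of that pattern follows; the `pb_model` struct and `pb_is_up()` helper are illustrative, not actual OVN/IDL code.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative model, NOT actual OVN/IDL code: the OVSDB IDL generates an
 * optional boolean column (such as Port_Binding's "up") as a pointer to
 * the values plus an element count, so an unset column has n_up == 0 and
 * the pointer must not be dereferenced. */
struct pb_model {
    bool *up;     /* points to 0 or 1 boolean values */
    size_t n_up;  /* number of values present; 0 when the column is unset */
};

/* Safe check: the port is "UP" only when the column is present AND true. */
static bool
pb_is_up(const struct pb_model *pb)
{
    return pb->n_up && pb->up[0];
}
```

Testing the pointer alone (as in `op->sb->up ? "UP" : "DOWN"`) only asks whether the column is present, which is why the log line kept printing "UP" after the value changed to false.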

Thanks
Numan



> 2022-07-05T12:17:25.446Z|00414|northd|INFO|PORT: sw1-p1, pbid: 
> ec0e94cc-f2d0-44de-8ad0-7fd62b7d50d6, PB: UP, lspid: 
> aed98c29-03db-4761-b178-42f288a12692, LSP: UP, svc->logical_port: sw1-p1, 
> svc->status: online
>
> I guess this is a result of my misunderstanding of the principles of incremental 
> engine operation.
> Can you help me get an idea of why the port_binding structure has stale state and 
> how to "pull changes" for it?
>
> Regards,
> Vladislav Odintsov
>
> > On 4 Jul 2022, at 22:13, Numan Siddique  wrote:
> >
> > On Mon, Jul 4, 2022 at 12:56 PM Vladislav Odintsov  
> > wrote:
> >>
> >> Thanks Numan,
> >>
> >> would you have time to fix it or maybe give an idea how to do it, so I can 
> >> try?
> >
> > It would be great if you want to give it a try.  I was thinking of 2
> > possible approaches to fix this issue.
> > Even though pinctrl.c sets the status to offline, it cannot set the
> > service monitor to offline when ovn-controller releases the port
> > binding.
> > The first option is to handle this in binding.c as you suggested
> > earlier.  Or ovn-northd can set the status to offline, if the service
> > monitor's logical port
> > is no longer claimed by any ovn-controller.  I'm more inclined to
> > handle this in ovn-northd.   What do you think?
> >
> > Thanks
> > Numan
> >
> >>
> >> Regards,
> >> Vladislav Odintsov
> >>
> >>> On 4 Jul 2022, at 19:51, Numan Siddique  wrote:
> >>>
> >>> On Mon, Jul 4, 2022 at 7:48 AM Vladislav Odintsov  >>> > wrote:
> 
>  Hi,
> 
>  we’ve found wrong behaviour with the service_monitor record status for a 
>  health-checked Load Balancer.
>  Its status can stay online forever even if the virtual machine is stopped. 
>  This leads to load-balanced traffic being sent to a dead backend.
> 
>  Below is the script to reproduce the issue, because I'm in doubt about the 
>  correct place for a possible fix (my guess is it should be fixed in 
>  controller/binding.c in function binding_lport_set_down, but I’m not 
>  sure how this can affect VM live migration…):
> 
>  # cat ./repro.sh
>  #!/bin/bash -x
> 
>  ovn-nbctl ls-add ls1
>  ovn-nbctl lsp-add ls1 lsp1 -- \
>    lsp-set-addresses lsp1 "00:00:00:00:00:01 192.168.0.10"
>  ovn-nbctl lb-add lb1 192.168.0.100:80 192.168.0.10:80
>  ovn-nbctl set Load_balancer lb1 
>  ip_port_mappings:192.168.0.10=lsp1:192.168.0.8
>  ovn-nbctl --id=@id create Load_Balancer_Health_Check 
>  vip='"192.168.0.100:80"' -- set Load_Balancer lb1 health_check=@id
>  ovn-nbctl ls-lb-add ls1 lb1
> 
>  ovs-vsctl add-port br-int test-lb -- set interface test-lb type=internal 
>  external_ids:iface-id=lsp1
>  ip li set test-lb addr 00:00:00:00:00:01
>  ip a add 192.168.0.10/24 dev test-lb
>  ip li set test-lb up
> 
>  # check service_monitor
>  ovn-sbctl list service_mon
> 
>  # ensure state became offline
>  sleep 4
>  ovn-sbctl list service_mon
> 
>  # start listen on :80 with netcat
>  ncat -k -l 192.168.0.10 80 &
> 
>  # ensure state turned to online
>  sleep 4
>  ovn-sbctl list service_mon
> 
>  # trigger binding release
>  ovs-vsctl remove interface test-lb external_ids iface-id
> 
>  # ensure state remains online
>  sleep 10
>  ovn-sbctl list service_mon
> 
>  # ensure OVS group and backend is still in bucket
>  ovs-ofctl dump-groups br-int | grep 192.168.0.10
> >>>
> >>>
> >>> Thanks for the bug report.  I could reproduce it locally.  Looks to me
> >>> it should be fixed in pinctrl.c as it sets the service monitor status.
> >>>
> >>> I've also raised a bugzilla here -
> >>> https://bugzilla.redhat.com/show_bug.cgi?id=2103740 
> >>> 
> >>>
> >>> Numan
> >>>
> 
> 
>  
>  Looking forward to hearing any thoughts on this.

Re: [ovs-dev] [ovn] bug: load balancer health check status is not updated if port binding is released from chassis

2022-07-05 Thread Vladislav Odintsov
Hi Numan,

I’ve sent a draft patch: 
https://patchwork.ozlabs.org/project/ovn/patch/20220705122031.2568471-1-odiv...@gmail.com/

While implementing it, I've encountered a problem where the port binding record, which 
I’m trying to get from struct ovn_port, has stale state (it doesn’t reflect "up" 
field changes, which I see in ovn-northd logs with an update), and I have no idea 
how to pull them.

From ovn-northd debug logs:

port binding "up" field update comes from SB:

2022-07-05T12:17:25.445Z|00412|jsonrpc|DBG|unix:/home/ec2-user/ovn/tests/system-kmod-testsuite.dir/097/ovn-sb/ovn-sb.sock:
 received notification, method="update3", 
params=[["monid","OVN_Southbound"],"----",{"Port_Binding":{"ec0e94cc-f2d0-44de-8ad0-7fd62b7d50d6":{"modify":{"up":false,"chassis":["set",[]]]

next I print op->sb->up ? "UP" : "DOWN" and it remains "UP":

2022-07-05T12:17:25.446Z|00414|northd|INFO|PORT: sw1-p1, pbid: 
ec0e94cc-f2d0-44de-8ad0-7fd62b7d50d6, PB: UP, lspid: 
aed98c29-03db-4761-b178-42f288a12692, LSP: UP, svc->logical_port: sw1-p1, 
svc->status: online

I guess this is a result of my misunderstanding of the principles of incremental 
engine operation.
Can you help me get an idea of why the port_binding structure has stale state and 
how to "pull changes" for it?

Regards,
Vladislav Odintsov

> On 4 Jul 2022, at 22:13, Numan Siddique  wrote:
> 
> On Mon, Jul 4, 2022 at 12:56 PM Vladislav Odintsov  wrote:
>> 
>> Thanks Numan,
>> 
>> would you have time to fix it or maybe give an idea how to do it, so I can 
>> try?
> 
> It would be great if you want to give it a try.  I was thinking of 2
> possible approaches to fix this issue.
> Even though pinctrl.c sets the status to offline, it cannot set the
> service monitor to offline when ovn-controller releases the port
> binding.
> The first option is to handle this in binding.c as you suggested
> earlier.  Or ovn-northd can set the status to offline, if the service
> monitor's logical port
> is no longer claimed by any ovn-controller.  I'm more inclined to
> handle this in ovn-northd.   What do you think?
> 
> Thanks
> Numan
> 
>> 
>> Regards,
>> Vladislav Odintsov
>> 
>>> On 4 Jul 2022, at 19:51, Numan Siddique  wrote:
>>> 
>>> On Mon, Jul 4, 2022 at 7:48 AM Vladislav Odintsov >> > wrote:
 
 Hi,
 
 we’ve found wrong behaviour with the service_monitor record status for a 
 health-checked Load Balancer.
 Its status can stay online forever even if the virtual machine is stopped. 
 This leads to load-balanced traffic being sent to a dead backend.
 
 Below is the script to reproduce the issue, because I'm in doubt about the 
 correct place for a possible fix (my guess is it should be fixed in 
 controller/binding.c in function binding_lport_set_down, but I’m not sure 
 how this can affect VM live migration…):
 
 # cat ./repro.sh
 #!/bin/bash -x
 
 ovn-nbctl ls-add ls1
 ovn-nbctl lsp-add ls1 lsp1 -- \
   lsp-set-addresses lsp1 "00:00:00:00:00:01 192.168.0.10"
 ovn-nbctl lb-add lb1 192.168.0.100:80 192.168.0.10:80
 ovn-nbctl set Load_balancer lb1 
 ip_port_mappings:192.168.0.10=lsp1:192.168.0.8
 ovn-nbctl --id=@id create Load_Balancer_Health_Check 
 vip='"192.168.0.100:80"' -- set Load_Balancer lb1 health_check=@id
 ovn-nbctl ls-lb-add ls1 lb1
 
 ovs-vsctl add-port br-int test-lb -- set interface test-lb type=internal 
 external_ids:iface-id=lsp1
 ip li set test-lb addr 00:00:00:00:00:01
 ip a add 192.168.0.10/24 dev test-lb
 ip li set test-lb up
 
 # check service_monitor
 ovn-sbctl list service_mon
 
 # ensure state became offline
 sleep 4
 ovn-sbctl list service_mon
 
 # start listen on :80 with netcat
 ncat -k -l 192.168.0.10 80 &
 
 # ensure state turned to online
 sleep 4
 ovn-sbctl list service_mon
 
 # trigger binding release
 ovs-vsctl remove interface test-lb external_ids iface-id
 
 # ensure state remains online
 sleep 10
 ovn-sbctl list service_mon
 
 # ensure OVS group and backend is still in bucket
 ovs-ofctl dump-groups br-int | grep 192.168.0.10
>>> 
>>> 
>>> Thanks for the bug report.  I could reproduce it locally.  Looks to me
>>> it should be fixed in pinctrl.c as it sets the service monitor status.
>>> 
>>> I've also raised a bugzilla here -
>>> https://bugzilla.redhat.com/show_bug.cgi?id=2103740 
>>> 
>>> 
>>> Numan
>>> 
 
 
 
 Looking forward to hearing any thoughts on this.
 
 PS. don’t forget to kill ncat ;)
 
 
 
 Regards,
 Vladislav Odintsov
 
>> 

[ovs-dev] [PATCH RFC ovn] northd: reset svc_mon status to offline if port_binding is released

2022-07-05 Thread Vladislav Odintsov
Signed-off-by: Vladislav Odintsov 
---
 northd/northd.c | 14 ++
 tests/system-ovn.at | 13 -
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/northd/northd.c b/northd/northd.c
index 964af992f..4f8f1aaf9 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -3876,6 +3876,20 @@ ovn_lb_svc_create(struct ovsdb_idl_txn *ovnsb_txn, 
struct ovn_northd_lb *lb,
 mon_info->sbrec_mon,
 backend_nb->svc_mon_src_ip);
 }
+VLOG_DBG("PORT: %s, pbid: "UUID_FMT", PB: %s, lspid: "UUID_FMT
+ ", LSP: %s, svc->logical_port: %s, svc->status: %s",
+ op->key,
+ UUID_ARGS(&op->sb->header_.uuid),
+ op->sb->up ? "UP" : "DOWN",
+ UUID_ARGS(&op->nbsp->header_.uuid),
+ op->nbsp->up ? "UP" : "DOWN",
+ mon_info->sbrec_mon->logical_port,
+ mon_info->sbrec_mon->status);
+if ((!op->sb || !op->sb->up)
+&& !strcmp(mon_info->sbrec_mon->status, "online")) {
+sbrec_service_monitor_set_status(mon_info->sbrec_mon,
+ "offline");
+}
 
 backend_nb->sbrec_monitor = mon_info->sbrec_mon;
 mon_info->required = true;
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index df2da3408..9a3ed6286 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -4441,7 +4441,7 @@ 
tcp,orig=(src=10.0.0.4,dst=10.0.0.10,sport=,dport=),reply=(src
 
tcp,orig=(src=10.0.0.4,dst=10.0.0.10,sport=,dport=),reply=(src=20.0.0.3,dst=10.0.0.4,sport=,dport=),zone=,mark=2,protoinfo=(state=)
 ])
 
-# Stop webserer in sw0-p1
+# Stop webserver in sw0-p1
 kill `cat $sw0_p1_pid_file`
 
 # Wait until service_monitor for sw0-p1 is set to offline
@@ -4465,6 +4465,17 @@ sed -e 's/zone=[[0-9]]*/zone=/'], [0], [dnl
 
tcp,orig=(src=10.0.0.4,dst=10.0.0.10,sport=,dport=),reply=(src=20.0.0.3,dst=10.0.0.4,sport=,dport=),zone=,mark=2,protoinfo=(state=)
 ])
 
+# trigger port binding release and check if status changed to offline
+ovs-vsctl remove interface ovs-sw1-p1 external_ids iface-id
+sleep 4
+ovn-sbctl list port-b
+ovn-sbctl list service_monitor
+ovn-nbctl list logical-switch-port
+wait_row_count Service_Monitor 2
+wait_row_count Service_Monitor 2 status=offline
+wait_row_count Service_Monitor 0 status=online
+
+
 # Create udp load balancer.
 ovn-nbctl lb-add lb2 10.0.0.10:80 10.0.0.3:80,20.0.0.3:80 udp
 lb_udp=`ovn-nbctl lb-list | grep udp | awk '{print $1}'`
-- 
2.26.3



[ovs-dev] [PATCH ovn 4/4] multicast: Properly flood IGMP queries and reports.

2022-07-05 Thread Dumitru Ceara
RFC 4541 "Considerations for Internet Group Management Protocol (IGMP)
and Multicast Listener Discovery (MLD) Snooping Switches" [0] states:

For IGMP packets:
2.1.1.  IGMP Forwarding Rules
   1) A snooping switch should forward IGMP Membership Reports only to
  those ports where multicast routers are attached.

  Alternatively stated: a snooping switch should not forward IGMP
  Membership Reports to ports on which only hosts are attached.  An
  administrative control may be provided to override this
  restriction, allowing the report messages to be flooded to other
  ports.

  This is the main IGMP snooping functionality for the control path.

  Sending membership reports to other hosts can result, for IGMPv1
  and IGMPv2, in unintentionally preventing a host from joining a
  specific multicast group.

  When an IGMPv1 or IGMPv2 host receives a membership report for a
  group address that it intends to join, the host will suppress its
  own membership report for the same group.  This join or message
  suppression is a requirement for IGMPv1 and IGMPv2 hosts.
  However, if a switch does not receive a membership report from the
  host it will not forward multicast data to it.

In OVN this translates to forwarding reports only on:
a. ports where mrouters have been learned (IGMP queries were received on
   those ports)
b. ports connecting a multicast enabled logical switch to a multicast
   relay enabled logical router (OVN mrouter)
c. ports configured with mcast_flood_reports=true

2.1.2.  Data Forwarding Rules

   1) Packets with a destination IP address outside 224.0.0.X which are
  not IGMP should be forwarded according to group-based port
  membership tables and must also be forwarded on router ports.

In OVN this translates to forwarding traffic for a group G to:
a. all ports where G was learned
b. all ports where an (external or OVN) mrouter was learned.
c. all ports configured with mcast_flood=true

IGMP queries are already snooped by ovn-controller.  Just propagate the
information about where mrouters are to the Southbound DB through means
of a custom IGMP_Group (name="mrouters").

Snooped queries are then re-injected in the OVS pipeline with outport
set to MC_FLOOD_L2 (only flood them in the L2 domain).

Snooped reports are re-injected in the OVS pipeline with outport set to
MC_MROUTER_FLOOD (flood them to snooped mrouters and OVN mrouters).

The same behavior applies to IPv6 too (MLD).

[0] https://datatracker.ietf.org/doc/html/rfc4541
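
[Editorial note: the report-forwarding rules (a)-(c) above amount to a simple per-port predicate. A minimal illustrative model follows; the `mc_port_model` struct and `forward_report_to()` helper are assumptions for illustration, not OVN's actual data structures.]

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model, NOT OVN's actual code: per the rules above, an IGMP
 * Membership Report is forwarded to a port only if a multicast router was
 * learned on it (a query was snooped there), it leads to an OVN mrouter
 * (a multicast-relay-enabled logical router), or it is explicitly
 * configured with mcast_flood_reports=true. */
struct mc_port_model {
    bool mrouter_learned;      /* IGMP query snooped on this port */
    bool to_ovn_mrouter;       /* connects to a mcast-relay logical router */
    bool mcast_flood_reports;  /* administrative override */
};

static bool
forward_report_to(const struct mc_port_model *p)
{
    return p->mrouter_learned || p->to_ovn_mrouter || p->mcast_flood_reports;
}
```

A port attached only to hosts matches none of the three conditions, so reports are not flooded to it and IGMPv1/v2 report suppression cannot be triggered unintentionally.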

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1933990
Reported-at: https://github.com/ovn-org/ovn/issues/126
Signed-off-by: Dumitru Ceara 
---
 controller/ip-mcast.c   |  129 -
 controller/ip-mcast.h   |   14 
 controller/pinctrl.c|  129 +
 lib/ip-mcast-index.h|5 +
 lib/mcast-group-index.h |4 -
 northd/northd.c |   54 +++
 northd/ovn-northd.8.xml |6 --
 tests/ovn-macros.at |   25 +++
 tests/ovn.at|  164 ++-
 9 files changed, 472 insertions(+), 58 deletions(-)

diff --git a/controller/ip-mcast.c b/controller/ip-mcast.c
index 9b0b4465a..a870fb29e 100644
--- a/controller/ip-mcast.c
+++ b/controller/ip-mcast.c
@@ -16,6 +16,7 @@
 #include 
 
 #include "ip-mcast.h"
+#include "ip-mcast-index.h"
 #include "lport.h"
 #include "lib/ovn-sb-idl.h"
 
@@ -27,6 +28,18 @@ struct igmp_group_port {
 const struct sbrec_port_binding *port;
 };
 
+static const struct sbrec_igmp_group *
+igmp_group_lookup_(struct ovsdb_idl_index *igmp_groups,
+   const char *addr_str,
+   const struct sbrec_datapath_binding *datapath,
+   const struct sbrec_chassis *chassis);
+
+static struct sbrec_igmp_group *
+igmp_group_create_(struct ovsdb_idl_txn *idl_txn,
+   const char *addr_str,
+   const struct sbrec_datapath_binding *datapath,
+   const struct sbrec_chassis *chassis);
+
 struct ovsdb_idl_index *
 igmp_group_index_create(struct ovsdb_idl *idl)
 {
@@ -54,17 +67,16 @@ igmp_group_lookup(struct ovsdb_idl_index *igmp_groups,
 return NULL;
 }
 
-struct sbrec_igmp_group *target =
-sbrec_igmp_group_index_init_row(igmp_groups);
-
-sbrec_igmp_group_index_set_address(target, addr_str);
-sbrec_igmp_group_index_set_datapath(target, datapath);
-sbrec_igmp_group_index_set_chassis(target, chassis);
+return igmp_group_lookup_(igmp_groups, addr_str, datapath, chassis);
+}
 
-const struct sbrec_igmp_group *g =
-sbrec_igmp_group_index_find(igmp_groups, target);
-sbrec_igmp_group_index_destroy_row(target);
-return g;
+const struct sbrec_igmp_group *
+igmp_mrouter_lookup(struct ovsdb_idl_index *igmp_groups,
+   

[ovs-dev] [PATCH ovn 3/4] tests: Factor out reset_pcap_file() helper.

2022-07-05 Thread Dumitru Ceara
Signed-off-by: Dumitru Ceara 
---
 tests/ovn-macros.at |   20 +++
 tests/ovn.at|  358 +++
 2 files changed, 43 insertions(+), 335 deletions(-)

diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
index 335f9158c..77e89f6b4 100644
--- a/tests/ovn-macros.at
+++ b/tests/ovn-macros.at
@@ -508,9 +508,13 @@ wait_for_ports_up() {
 fi
 }
 
-# reset_pcap_file iface pcap_file
+# reset_iface_pcap_file iface pcap_file
 # Resets the pcap file associates with OVS interface.  should be used
 # with dummy datapath.
+#
+# XXX: This should actually replace reset_pcap_file() as they do almost
+# exactly the same thing but the "wait while the pcap file has the size
+# of the PCAP header" check causes tests to fail.
 reset_iface_pcap_file() {
 local iface=$1
 local pcap_file=$2
@@ -525,6 +529,20 @@ options:rxq_pcap=${pcap_file}-rx.pcap
 OVS_WAIT_WHILE([test 24 = $(wc -c ${pcap_file}-tx.pcap | cut -d " " -f1)])
 }
 
+# reset_pcap_file iface pcap_file
+# Resets the pcap file associates with OVS interface.  should be used
+# with dummy datapath.
+reset_pcap_file() {
+local iface=$1
+local pcap_file=$2
+check rm -f dummy-*.pcap
+check ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
+options:rxq_pcap=dummy-rx.pcap
+check rm -f ${pcap_file}*.pcap
+check ovs-vsctl -- set Interface $iface 
options:tx_pcap=${pcap_file}-tx.pcap \
+options:rxq_pcap=${pcap_file}-rx.pcap
+}
+
 # Receive a packet on a dummy netdev interface. If we expect packets to be
 # recorded, then wait until the pcap file reflects the change.
 netdev_dummy_receive() {
diff --git a/tests/ovn.at b/tests/ovn.at
index 0e868ae66..9512fd033 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -6479,16 +6479,6 @@ compare_dhcp_packets() {
 fi
 }
 
-reset_pcap_file() {
-local iface=$1
-local pcap_file=$2
-check ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
-options:rxq_pcap=dummy-rx.pcap
-rm -f ${pcap_file}*.pcap
-check ovs-vsctl -- set Interface $iface 
options:tx_pcap=${pcap_file}-tx.pcap \
-options:rxq_pcap=${pcap_file}-rx.pcap
-}
-
 AT_CAPTURE_FILE([sbflows])
 ovn-sbctl dump-flows > sbflows
 
@@ -7074,16 +7064,6 @@ test_dhcpv6() {
 as hv1 ovs-appctl netdev-dummy/receive hv1-vif$inport $request
 }
 
-reset_pcap_file() {
-local iface=$1
-local pcap_file=$2
-ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
-options:rxq_pcap=dummy-rx.pcap
-rm -f ${pcap_file}*.pcap
-ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
-options:rxq_pcap=${pcap_file}-rx.pcap
-}
-
 AT_CAPTURE_FILE([ofctl_monitor0.log])
 as hv1 ovs-ofctl monitor br-int resume --detach --no-chdir \
 --pidfile=ovs-ofctl0.pid 2> ofctl_monitor0.log
@@ -8781,16 +8761,6 @@ OVS_WAIT_UNTIL([
 ])
 ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 
nat-addresses="f0:00:00:00:00:03 192.168.0.3"
 
-reset_pcap_file() {
-local iface=$1
-local pcap_file=$2
-ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
-options:rxq_pcap=dummy-rx.pcap
-rm -f ${pcap_file}*.pcap
-ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
-options:rxq_pcap=${pcap_file}-rx.pcap
-}
-
 reset_pcap_file snoopvif hv1/snoopvif
 OVS_WAIT_UNTIL([test `wc -c < "hv1/snoopvif-tx.pcap"` -ge 140])
 
@@ -8865,20 +8835,8 @@ grep "Port patch-br-int-to-ln_port" | wc -l`])
 # Temporarily remove lr0 chassis
 AT_CHECK([ovn-nbctl remove logical_router lr0 options chassis])
 
-reset_pcap_file() {
-local hv=$1
-local iface=$2
-local pcap_file=$3
-as $hv
-ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \
-options:rxq_pcap=dummy-rx.pcap
-rm -f ${pcap_file}*.pcap
-ovs-vsctl -- set Interface $iface options:tx_pcap=${pcap_file}-tx.pcap \
-options:rxq_pcap=${pcap_file}-rx.pcap
-}
-
-reset_pcap_file hv1 snoopvif hv1/snoopvif
-reset_pcap_file hv2 snoopvif hv2/snoopvif
+as hv1 reset_pcap_file snoopvif hv1/snoopvif
+as hv2 reset_pcap_file snoopvif hv2/snoopvif
 
 hv1_uuid=$(ovn-sbctl --bare --columns _uuid list chassis hv1)
 AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv1])
@@ -8894,8 +8852,8 @@ OVN_CHECK_PACKETS([hv2/snoopvif-tx.pcap], 
[empty_expected])
 # Temporarily remove lr0 chassis
 AT_CHECK([ovn-nbctl remove logical_router lr0 options chassis])
 
-reset_pcap_file hv1 snoopvif hv1/snoopvif
-reset_pcap_file hv2 snoopvif hv2/snoopvif
+as hv1 reset_pcap_file snoopvif hv1/snoopvif
+as hv2 reset_pcap_file snoopvif hv2/snoopvif
 
 hv2_uuid=$(ovn-sbctl --bare --columns _uuid list chassis hv2)
 AT_CHECK([ovn-nbctl set logical_router lr0 options:chassis=hv2])
@@ -8980,16 +8938,6 @@ AT_CHECK([sort packets], [0], [expout])
 # due to GARP backoff
 ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses=""
 
-reset_pcap_file() {
-local iface=$1
-local pcap_file=$2
-ovs-vsctl -- set Interface $iface options:tx_pcap=dummy-tx.pcap \

[ovs-dev] [PATCH ovn 2/4] tests: Add 'IP-multicast' keyword to all relevant tests.

2022-07-05 Thread Dumitru Ceara
Signed-off-by: Dumitru Ceara 
---
 tests/ovn.at|7 ---
 tests/system-ovn.at |2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index 6e35bb5b8..0e868ae66 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -20931,7 +20931,7 @@ AT_CLEANUP
 
 OVN_FOR_EACH_NORTHD([
 AT_SETUP([IGMP snoop/querier/relay])
-AT_KEYWORDS([snoop query querier relay])
+AT_KEYWORDS([IP-multicast snoop query querier relay])
 
 ovn_start
 
@@ -21524,7 +21524,7 @@ AT_CLEANUP
 
 OVN_FOR_EACH_NORTHD([
 AT_SETUP([IGMP relay - distributed gateway port])
-AT_KEYWORDS([snoop relay])
+AT_KEYWORDS([IP-multicast snoop relay])
 
 ovn_start
 
@@ -21673,7 +21673,7 @@ AT_CLEANUP
 
 OVN_FOR_EACH_NORTHD([
 AT_SETUP([MLD snoop/querier/relay])
-AT_KEYWORDS([snoop query querier relay])
+AT_KEYWORDS([IP-multicast snoop query querier relay])
 
 ovn_start
 
@@ -24664,6 +24664,7 @@ AT_CLEANUP
 
 OVN_FOR_EACH_NORTHD([
 AT_SETUP([interconnection - IGMP/MLD multicast])
+AT_KEYWORDS([IP-multicast])
 
 # Logical network:
 #
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index df2da3408..8df26 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -4147,7 +4147,7 @@ AT_CLEANUP
 OVN_FOR_EACH_NORTHD([
 AT_SETUP([2 LSs IGMP and MLD])
 AT_SKIP_IF([test $HAVE_TCPDUMP = no])
-AT_KEYWORDS([ovnigmp])
+AT_KEYWORDS([ovnigmp IP-multicast])
 
 ovn_start
 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn 1/4] multicast: Document reserved multicast groups.

2022-07-05 Thread Dumitru Ceara
Signed-off-by: Dumitru Ceara 
---
 lib/mcast-group-index.h |   26 +-
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/lib/mcast-group-index.h b/lib/mcast-group-index.h
index 5bc725451..e8d82b126 100644
--- a/lib/mcast-group-index.h
+++ b/lib/mcast-group-index.h
@@ -28,11 +28,27 @@ struct sbrec_datapath_binding;
 enum ovn_mcast_tunnel_keys {
 
 OVN_MCAST_FLOOD_TUNNEL_KEY = OVN_MIN_MULTICAST,
-OVN_MCAST_UNKNOWN_TUNNEL_KEY,
-OVN_MCAST_MROUTER_FLOOD_TUNNEL_KEY,
-OVN_MCAST_MROUTER_STATIC_TUNNEL_KEY,
-OVN_MCAST_STATIC_TUNNEL_KEY,
-OVN_MCAST_FLOOD_L2_TUNNEL_KEY,
+OVN_MCAST_UNKNOWN_TUNNEL_KEY,/* For L2 unknown dest traffic. */
+OVN_MCAST_MROUTER_FLOOD_TUNNEL_KEY,  /* For L3 multicast traffic that must
+  * be relayed (multicast routed).
+  */
+OVN_MCAST_MROUTER_STATIC_TUNNEL_KEY, /* For multicast reports that need to
+  * be forwarded statically towards
+  * mrouters.
+  */
+OVN_MCAST_STATIC_TUNNEL_KEY, /* For switches:
+  * - for L3 multicast traffic that
+  *   needs to be forwarded
+  *   statically.
+  * For routers:
+  * - for L3 multicast traffic AND
+  *   reports that need to be
+  *   forwarded statically.
+  */
+OVN_MCAST_FLOOD_L2_TUNNEL_KEY,   /* Logical switch broadcast domain
+  * excluding ports towards logical
+  * routers.
+  */
 OVN_MIN_IP_MULTICAST,
 OVN_MAX_IP_MULTICAST = OVN_MAX_MULTICAST,
 };



[ovs-dev] [PATCH ovn 0/4] Properly forward packets to IP mrouters.

2022-07-05 Thread Dumitru Ceara
The first 3 patches of the series are not necessarily related to the issue
being fixed by patch 4/4 but they're good to have as they improve
code documentation and simplify (a bit) the multicast related tests.  They
can also be applied independently of patch 4/4.

The last patch of the series changes the way OVN floods IP multicast control
packets (IGMP/MLD queries and reports) making it more RFC compliant.


Dumitru Ceara (4):
  multicast: Document reserved multicast groups.
  tests: Add 'IP-multicast' keyword to all relevant tests.
  tests: Factor out reset_pcap_file() helper.
  multicast: Properly flood IGMP queries and reports.


 controller/ip-mcast.c   | 129 --
 controller/ip-mcast.h   |  14 ++
 controller/pinctrl.c| 129 --
 lib/ip-mcast-index.h|   5 +
 lib/mcast-group-index.h |   4 -
 northd/northd.c |  54 ++--
 northd/ovn-northd.8.xml |   6 +-
 tests/ovn-macros.at |  45 +++-
 tests/ovn.at| 529 ++--
 tests/system-ovn.at |   2 +-
 10 files changed, 520 insertions(+), 397 deletions(-)



Re: [ovs-dev] [PATCH v3 4/5] dpif-netdev: Add function pointer for dpif re-circulate.

2022-07-05 Thread Van Haaren, Harry
> -Original Message-
> From: Amber, Kumar 
> Sent: Tuesday, June 28, 2022 9:45 AM
> To: ovs-dev@openvswitch.org
> Cc: echau...@redhat.com; i.maxim...@ovn.org; Ferriter, Cian
> ; Stokes, Ian ; 
> f...@sysclose.org;
> Van Haaren, Harry ; Amber, Kumar
> 
> Subject: [PATCH v3 4/5] dpif-netdev: Add function pointer for dpif 
> re-circulate.
> 
> The patch adds and re-uses the dpif set command to set the
> function pointers to be used to switch between different inner
> dpifs.
> 
> Signed-off-by: Kumar Amber 
> Signed-off-by: Cian Ferriter 
> Co-authored-by: Cian Ferriter 
> 
> ---
> v3:
> - Add description  to the dpif recirc function.
> - Fix use of return value to fall back to scalar dpif.
> ---
> ---
>  lib/dpif-netdev-private-dpif.c   | 57 +++-
>  lib/dpif-netdev-private-dpif.h   | 18 ++
>  lib/dpif-netdev-private-thread.h |  3 ++
>  lib/dpif-netdev.c| 19 +--
>  4 files changed, 87 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/dpif-netdev-private-dpif.c b/lib/dpif-netdev-private-dpif.c
> index 2dc51270a..96bfd4824 100644
> --- a/lib/dpif-netdev-private-dpif.c
> +++ b/lib/dpif-netdev-private-dpif.c
> @@ -27,6 +27,8 @@
>  #include "util.h"
> 
>  VLOG_DEFINE_THIS_MODULE(dpif_netdev_impl);
> +#define DPIF_NETDEV_IMPL_AVX512_CHECK (__x86_64__ && HAVE_AVX512F \
> +&& HAVE_LD_AVX512_GOOD && __SSE4_2__)

This #define rework should be in a different patch; it is not related to "add 
func ptr for dpif recirc".




> +/* Returns the default recirculate DPIF which is first ./configure selected,
> + * but can be overridden at runtime. */
> +dp_netdev_recirc_func dp_netdev_recirc_impl_get_default(void);

Simplify comment?
/* Returns the default implementation to recirculate packets. */

The wording today suggests that the compile-time ./configure selection can be
overridden, which reads oddly and isn't technically true.  Simply state what
the function does instead.



> @@ -8819,7 +8828,10 @@ dp_execute_cb(void *aux_, struct dp_packet_batch
> *packets_,
>  }
> 
>  (*depth)++;
> -dp_netdev_recirculate(pmd, packets_);
> +int ret = pmd->netdev_input_recirc_func(pmd, packets_);
> +if (ret != 0) {
> +dp_netdev_recirculate(pmd, packets_);
> +}

Prefer the simpler if() format below?

if (ret) {

The same pattern exists another time below.



Re: [ovs-dev] [PATCH v3 3/5] dpif-netdev-avx512: Refactor avx512 dpif and create new APIs.

2022-07-05 Thread Van Haaren, Harry
> -Original Message-
> From: Amber, Kumar 
> Sent: Tuesday, June 28, 2022 9:45 AM
> To: ovs-dev@openvswitch.org
> Cc: echau...@redhat.com; i.maxim...@ovn.org; Ferriter, Cian
> ; Stokes, Ian ; 
> f...@sysclose.org;
> Van Haaren, Harry ; Amber, Kumar
> 
> Subject: [PATCH v3 3/5] dpif-netdev-avx512: Refactor avx512 dpif and create 
> new
> APIs.
> 
> Create new APIs for the avx512 DPIF, enabling one baseline
> common code to be specialized into DPIF implementations for
> "outer" processing, and "recirc" processing.
> 
> Signed-off-by: Kumar Amber 
> Signed-off-by: Cian Ferriter 
> Co-authored-by: Cian Ferriter 



> diff --git a/lib/dpif-netdev-avx512.c b/lib/dpif-netdev-avx512.c


> +static int32_t
> +dp_netdev_input_avx512__(struct dp_netdev_pmd_thread *pmd,
> + struct dp_packet_batch *packets,
> + bool md_is_valid, odp_port_t in_port);
> +



> +static inline int32_t ALWAYS_INLINE
> +dp_netdev_input_avx512__(struct dp_netdev_pmd_thread *pmd,
> + struct dp_packet_batch *packets,
> + bool md_is_valid OVS_UNUSED, odp_port_t in_port)
>  {

Hard to see in diffs, but easier in the full file post-patch: there is one
function declared and then defined/implemented right after each other here.
Remove the forward declaration; it's duplicate code.



Re: [ovs-dev] [PATCH v3 0/5] DPIF AVX512 Recirculation

2022-07-05 Thread Van Haaren, Harry
> -Original Message-
> From: Amber, Kumar 
> Sent: Tuesday, June 28, 2022 9:45 AM
> To: ovs-dev@openvswitch.org
> Cc: echau...@redhat.com; i.maxim...@ovn.org; Ferriter, Cian
> ; Stokes, Ian ; 
> f...@sysclose.org;
> Van Haaren, Harry ; Amber, Kumar
> 
> Subject: [PATCH v3 0/5] DPIF AVX512 Recirculation
> 
> The patch adds support for recirculation of packets in AVX512
> DPIF which would allow for processing tunneled packets.

Some general feedback on this patchset, having reviewed the code in detail.

1) We need to ensure that MFEX-avx512 behaves correctly when used from a recirc 
context (today it only handles the outer context).
This can be done using the MFEX autovalidator, but will likely require some 
minor modifications to make it "recirc aware".
At that point, the MFEX autovalidator can correctly validate inner & outer 
packets, which gives confidence in the usage of optimized MFEX implementations
from both inner & outer contexts.

2) Today, the MFEX optimized routines are *not* capable of handling recirc 
packets. That was previously included in this
patchset, but was split out to a separate patchset to ease reviews & streamline 
merging.
(The v1 of this series was called "DPIF+MFEX Inner vxlan opts": 
https://patchwork.ozlabs.org/project/openvswitch/cover/20220321061441.1833575-1-kumar.am...@intel.com/)

Once 1) and 2) are in place, autovalidation should identify that the current 
MFEX cannot operate on inner packets, and the DPIF should not execute the 
optimized MFEX routines for *inner* packets (yet).  Enabling MFEX on inner 
packets will be done for the 2.19 release, improving the optimized avx512 MFEX 
handling for recirculated/inner packets (as was already present in the v1 
linked above).

That's the high-level overview; hope it makes sense.  @Amber, please 
incorporate the above into the next version.

Regards, -Harry


Re: [ovs-dev] [PATCH v4 02/17] python: add mask, ip and eth decoders

2022-07-05 Thread Adrian Moreno



On 6/27/22 13:33, Ilya Maximets wrote:

On 6/27/22 13:25, Ilya Maximets wrote:

On 6/16/22 08:32, Adrian Moreno wrote:

Add more decoders that can be used by KVParser.

For IPv4 and IPv6 addresses, create a new class that wraps
netaddr.IPAddress.
For Ethernet addresses, create a new class that wraps netaddr.EUI.
For integers, create a new class that performs basic bitwise mask
comparisons.
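As a rough illustration of the integer mask decoder described above (a sketch
only; the class name and "value/mask" parsing details are assumptions, not the
exact class from the patch):

```python
class IntMask:
    """Sketch of an integer decoder supporting bitwise mask comparison.

    Parses strings of the form "value" or "value/mask" (any integer base
    accepted by int(x, 0)) and compares other integers under the mask.
    """

    def __init__(self, spec):
        value, _, mask = spec.partition("/")
        self.value = int(value, 0)
        # No mask means an exact match: all bits significant (-1 in
        # two's complement has every bit set).
        self.mask = int(mask, 0) if mask else -1

    def matches(self, other):
        """Return True if 'other' equals our value under the mask."""
        return (other & self.mask) == (self.value & self.mask)
```

For example, `IntMask("0x80/0x80")` matches any integer with bit 7 set.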

Acked-by: Eelco Chaudron 
Signed-off-by: Adrian Moreno 
---
  python/ovs/flow/decoders.py | 398 
  python/setup.py |   2 +-
  2 files changed, 399 insertions(+), 1 deletion(-)

diff --git a/python/ovs/flow/decoders.py b/python/ovs/flow/decoders.py
index 0c2259c76..883e61acf 100644
--- a/python/ovs/flow/decoders.py
+++ b/python/ovs/flow/decoders.py
@@ -5,6 +5,15 @@ A decoder is generally a callable that accepts a string and 
returns the value
  object.
  """
  
+import netaddr





diff --git a/python/setup.py b/python/setup.py
index 7ac3c3662..350ac6056 100644
--- a/python/setup.py
+++ b/python/setup.py
@@ -87,7 +87,7 @@ setup_args = dict(
  ext_modules=[setuptools.Extension("ovs._json", sources=["ovs/_json.c"],
libraries=['openvswitch'])],
  cmdclass={'build_ext': try_build_ext},
-install_requires=['sortedcontainers'],
+install_requires=['sortedcontainers', 'netaddr'],
  extras_require={':sys_platform == "win32"': ['pywin32 >= 1.0']},
  )
  


So, the 'netaddr' is not a test dependency, but a new global dependency.
The tests are failing on patches #11 and #12 for that reason (unable to
import netaddr), it will be installed as a test dependency in patch #13.

We have 2 options here:

1.
   - Document the new global dependency in installation docs.
   - Install python3-netaddr explicitly in CI scripts.
   - Update fedora spec and debian/control with a build dependency and
 runtime dependency for a python package.

2.
   - Make it an optional dependency, by adding to extras_require and
 providing a meaningful error on attempt to import ovs.flow if
 netaddr is not available instead of throwing an import error.
   - Skip python part of tests in patches #11 and #12 if ovs.flow
 import fails. (try/catch + exit(0) in test scripts ?)
   - Document optional dependency for a flow library and update packaging
 with 'Suggests' section for python3-netaddr.


- Install python3-netaddr explicitly in CI scripts (GHA and CirrusCI).

Needed for that option too.

The second option seems better to me as it doesn't force users to install
python3-netaddr if they don't want to parse OVS flows in python.
What do you think?



Thanks Ilya,

I'll implement option 2 and send another version.
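A minimal sketch of what option 2 could look like (the function name and error
text are illustrative assumptions, not the final ovs.flow API):

```python
# Treat netaddr as an optional dependency and raise a meaningful error
# only when flow decoding is actually used, instead of failing at import.
try:
    import netaddr
except ImportError:
    netaddr = None


def decode_ip(value):
    """Decode an IP address string; requires the optional netaddr package."""
    if netaddr is None:
        raise RuntimeError(
            "OVS flow parsing requires the 'netaddr' python package; "
            "install it with 'pip install netaddr'.")
    return netaddr.IPAddress(value)
```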


Best regards, Ilya Maximets.




--
Adrián Moreno



Re: [ovs-dev] [PATCH 0/2] tc: Fixes for tunnel offloading.

2022-07-05 Thread Roi Dayan via dev




On 2022-07-05 1:45 AM, Ilya Maximets wrote:

This started from the issue with the broken ERSPAN tunnel
reported by Eelco, but it turned out to be a much wider problem
with ignoring different tunnel keys and flags while constructing
the TC flower.  Some of that can be fixed; some can't be fixed
due to lack of support from the kernel.  Strictly speaking, we
should prohibit silently ignoring everything including the
DONT_FRAGMENT and CSUM tunnel flags, but that would break
tunnel offloading entirely.  So, we're partially keeping the
incorrect behaviour for now, fixing the most broken bits.

To have a more or less viable solution, TC in the kernel should add
support for TUNNEL_DONT_FRAGMENT in the set tunnel key action
(also, TUNNEL_OAM support would be nice).  And we also need
support to match on them, i.e. support for:

   TCA_FLOWER_KEY_ENC_DONT_FRAGMENT
   TCA_FLOWER_KEY_ENC_NO_CSUM
   TCA_FLOWER_KEY_ENC_OAM

And corresponding _MASK keys as well.
Maybe TCA_FLOWER_KEY_ENC_TUN_FLAGS instead?

With that OVS will have to probe the kernel for support and
use them or fail the offload.

I didn't test these patches much.  Only checked a couple of
system tests.

Ilya Maximets (2):
   netdev-offload-tc: Fix ignoring unknown tunnel keys.
   tc: Support masks for tunnel destination ports.

  lib/dpif-netlink.c  | 14 +-
  lib/netdev-offload-tc.c | 61 +++--
  lib/netdev-offload.h|  3 --
  lib/tc.c| 11 ++--
  4 files changed, 61 insertions(+), 28 deletions(-)




Thanks.  The patches look OK to me, but I'm going to run some tests
before acking today/tomorrow.