Hi Michael,

Great to know that. I will try it on my cluster too. By the way, do you know how to find the compile options of the OVS-DPDK package from the Ubuntu repo?
Best regards.

On Wed, Apr 5, 2023, 1:56 PM Plato, Michael <michael.pl...@tu-berlin.de> wrote:

> Hi,
>
> yes, our k8s cluster is on the same subnet. I stopped one of the etcd nodes
> yesterday, which triggers a lot of reconnection attempts from the other
> cluster members. Still no issues so far and no OVS crashes 😊
>
> Regards
>
> Michael
>
> *From:* Lazuardi Nasution <mrxlazuar...@gmail.com>
> *Sent:* Tuesday, April 4, 2023 09:56
> *To:* Plato, Michael <michael.pl...@tu-berlin.de>
> *Cc:* ovs-discuss@openvswitch.org
> *Subject:* Re: [ovs-discuss] ovs-vswitchd crashes serveral times a day
>
> Hi Michael,
>
> I assume that your k8s cluster is on the same subnet, right? Would you
> mind testing it by shutting down one of the etcd instances to see if this
> bug still exists?
>
> Best regards.
>
> On Tue, Apr 4, 2023 at 2:50 PM Plato, Michael <michael.pl...@tu-berlin.de>
> wrote:
>
> Hi,
>
> from my perspective the patch works for all cases. My test environment
> runs with several k8s clusters and I haven't noticed any etcd failures so
> far.
>
> Best regards
>
> Michael
>
> *From:* Lazuardi Nasution <mrxlazuar...@gmail.com>
> *Sent:* Tuesday, April 4, 2023 09:41
> *To:* Plato, Michael <michael.pl...@tu-berlin.de>
> *Cc:* ovs-discuss@openvswitch.org
> *Subject:* Re: [ovs-discuss] ovs-vswitchd crashes serveral times a day
>
> Hi Michael,
>
> Does your patch also work for unreachable traffic within the same subnet?
> In my case, crashes happen when there are too many unreachable replies,
> even from the same subnet. For example, when one of the etcd instances is
> down, there will be a huge number of reconnection attempts and then
> unreachable replies from the destination VM where the down etcd instance
> runs.
>
> Best regards.
>
> On Tue, Apr 4, 2023 at 1:06 PM Plato, Michael <michael.pl...@tu-berlin.de>
> wrote:
>
> Hi,
>
> I have some news on this topic. Unfortunately I could not find the root
> cause.
> But I managed to implement a workaround (see patch in attachment).
> The basic idea is to mark the NAT flows as invalid if there is no longer an
> associated connection. From my point of view it is a race condition. It can
> be triggered by many short-lived connections. With the patch we no longer
> have any crashes. I can't say if it has any negative effects though, as I'm
> not an expert. So far I haven't found any problems, at least. Without this
> patch we had hundreds of crashes a day :/
>
> Best regards
>
> Michael
>
> *From:* Lazuardi Nasution <mrxlazuar...@gmail.com>
> *Sent:* Monday, April 3, 2023 13:50
> *To:* ovs-discuss@openvswitch.org
> *Cc:* Plato, Michael <michael.pl...@tu-berlin.de>
> *Subject:* Re: [ovs-discuss] ovs-vswitchd crashes serveral times a day
>
> Hi,
>
> Is this related to the following glibc bug? I'm not so sure about this,
> because when I check the glibc source of the installed version (2.35), the
> proposed patch has already been applied.
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=12889
>
> I can confirm that this problem only happens if I use stateful ACLs, which
> are related to conntrack. The race happens when a massive number of
> unreachable replies are received. For example, if I run etcd on VMs but one
> etcd node has been disabled, this causes massive connection attempts and
> unreachable replies.
>
> Best regards.
>
> On Mon, Mar 20, 2023, 10:58 PM Lazuardi Nasution <mrxlazuar...@gmail.com>
> wrote:
>
> Hi Michael,
>
> Have you found a solution for this case? I see the same weird problem,
> without any information about which conntrack entries are causing
> this issue.
>
> I'm using OVS 3.0.1 with DPDK 21.11.2 on Ubuntu 22.04. By the way, this
> problem disappears after I remove some Kubernetes cluster VMs and some DB
> cluster VMs.
>
> Best regards.
>
> Date: Thu, 29 Sep 2022 07:56:32 +0000
> From: "Plato, Michael" <michael.pl...@tu-berlin.de>
> To: "ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>
> Subject: [ovs-discuss] ovs-vswitchd crashes serveral times a day
> Message-ID: <8e53d3d0674049e69b2b7f3c4b0b8...@tu-berlin.de>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi,
>
> we are about to roll out our new OpenStack infrastructure based on Yoga,
> and during our testing we observed that the openvswitch-switch systemd
> unit restarts several times a day, causing network interruptions for all
> VMs on the compute node in question.
> After some research we found that ovs-vswitchd crashes with the
> following assertion failure:
>
> "2022-09-29T06:51:05.195Z|00003|util(pmd-c01/id:8)|EMER|../lib/conntrack.c:1095:
> assertion conn->conn_type == CT_CONN_TYPE_DEFAULT failed in
> conn_update_state()"
>
> To get more information about the connection that leads to this assertion
> failure, I added some debug code to conntrack.c.
> We have seen that we can trigger this issue when trying to connect from a
> VM to a destination which is unreachable, for example with curl
> https://www.google.de:444
>
> Shortly after that we get an assertion failure, and the debug code says:
>
> conn_type=1 (may be CT_CONN_TYPE_UN_NAT) ?
> src ip 172.217.16.67 dst ip 141.23.xx.xx rev src ip 141.23.xx.xx rev dst
> ip 172.217.16.67 src/dst ports 444/46212 rev src/dst ports 46212/444
> zone/rev zone 2/2 nw_proto/rev nw_proto 6/6
>
> ovs-appctl dpctl/dump-conntrack | grep "444"
>
> tcp,orig=(src=141.23.xx.xx,dst=172.217.16.67,sport=46212,dport=444),reply=(src=172.217.16.67,dst=141.23.xx.xx,sport=444,dport=46212),zone=2,protoinfo=(state=SYN_SENT)
>
> Versions:
> ovs-vsctl --version
> ovs-vsctl (Open vSwitch) 2.17.2
> DB Schema 8.3.0
>
> ovn-controller --version
> ovn-controller 22.03.0
> Open vSwitch Library 2.17.0
> OpenFlow versions 0x6:0x6
> SB DB Schema 20.21.0
>
> DPDK 21.11.2
>
> We are now unsure if this is a misconfiguration or if we hit a bug.
>
> Thanks for any feedback
>
> Michael
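Since the crashes in this thread are tied to many connection attempts toward an unreachable destination, it may help to count half-open conntrack entries per destination. The following is a minimal Python sketch, not part of the attached patch or of OVS itself. It assumes the textual entry format shown in the `ovs-appctl dpctl/dump-conntrack` output above; the function names `parse_entry` and `count_syn_sent` are hypothetical, and treating a missing zone field as zone 0 is an assumption.

```python
import re
from collections import Counter

# Patterns for the entry format quoted in the thread. Only entries that
# carry sport/dport (TCP/UDP style) are handled; anything else is skipped.
ORIG_RE = re.compile(
    r"orig=\(src=([^,]+),dst=([^,]+),sport=(\d+),dport=(\d+)\)"
)
ZONE_RE = re.compile(r",zone=(\d+)")
STATE_RE = re.compile(r"protoinfo=\(state=(\w+)\)")


def parse_entry(line):
    """Parse one dump line into a dict, or return None if it does not match."""
    orig = ORIG_RE.search(line)
    if orig is None:
        return None
    zone = ZONE_RE.search(line)
    state = STATE_RE.search(line)
    return {
        "proto": line.split(",", 1)[0].strip(),
        "src": orig.group(1),
        "dst": orig.group(2),
        "sport": int(orig.group(3)),
        "dport": int(orig.group(4)),
        # Assumption: a missing zone field means the default zone 0.
        "zone": int(zone.group(1)) if zone else 0,
        "state": state.group(1) if state else None,
    }


def count_syn_sent(lines):
    """Count SYN_SENT entries per (destination IP, destination port)."""
    counts = Counter()
    for line in lines:
        entry = parse_entry(line)
        if entry and entry["state"] == "SYN_SENT":
            counts[(entry["dst"], entry["dport"])] += 1
    return counts
```

Passing the lines of a saved dump to `count_syn_sent` and looking at the largest counts should point at destinations that never answer, such as the down etcd node or the closed port 444 mentioned in the thread. It only summarizes the dump; it says nothing about which entry triggers the assertion.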
_______________________________________________ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss