Re: [ovs-discuss] Impact of OVN Network Connectivity on Virtual Machine Heartbeat in OVN with External Deployment

2024-07-15 Thread Lazuardi Nasution via discuss
Hi Nash,

I'm interested in your case, but I need the information below to understand
it better. We can set up a collaboration to solve this if needed. I'm using
OVN mostly with OpenStack.

1. May I know your physical and logical topology?
2. What have you seen in packet captures regarding the heartbeat loss?
3. Has this happened ever since the VMs were first connected via OVN?
4. How are your OVN logical networks configured in the NB database? (The
output of the commands below would help, for example.)
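
A minimal sketch of those commands, assuming the stock ovn-nbctl/ovn-sbctl
and ovs-vsctl tools are reachable (paths and daemon placement may differ in
an oVirt deployment):

# On the oVirt engine (where the NB/SB databases run):
ovn-nbctl show          # logical switches, routers and ports
ovn-nbctl list ACL      # any ACLs applied to the heartbeat network
ovn-sbctl show          # chassis list and port bindings

# On each KVM hypervisor:
ovs-vsctl get Open_vSwitch . external_ids:ovn-remote   # how ovn-controller reaches the SB DB
ovs-vsctl show                                         # br-int and the VM ports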

Best regards,

Date: Mon, 15 Jul 2024 10:16:17 +0100
> From: Nash Oudha 
> To: b...@openvswitch.org
> Subject: [ovs-discuss] Impact of OVN Network Connectivity on Virtual
> Machine Heartbeat in OVN with External Deployment
> Message-ID: 
> Content-Type: text/plain; charset="utf-8"; Format="flowed"
>
> Dear Open vSwitch Team,
>
> We're reaching out to inquire about the behavior of OVN networks in a
> specific deployment scenario.
>
> *Our Setup:*
>
>   * We have Open Virtual Network (OVN) deployed within an oVirt
> environment.
>   * The OVN northbound (NB) database, southbound (SB) database, and other
> daemons reside on the oVirt manager/engine, which is located outside
> the KVM hypervisors. This means they are not running on the same
> physical machines as the KVM hypervisors.
>
> *The Issue:*
>
>   * We have virtual machines configured to use OVN networks for their
> cluster network heartbeat.
>   * In a recent incident, the network connection between the oVirt
> manager and the KVM hypervisors was lost. It's important to note that
> the oVirt engine itself was not connected to the heartbeat OVN
> network. Only the management connection between the engine and the
> KVM hypervisors was affected. While the KVM hypervisors should have
> remained able to communicate with each other, we observed the
> following behavior:
>   o As soon as the engine's network connection to the KVM hosts was
> disconnected, virtual machines using the OVN network began to
> panic and reboot.
>   o Other VMs that do not use an OVN network had no issues whatsoever.
>
> *Our Question:*
>
>   * In this scenario, where OVN is deployed externally to the
> hypervisors, does restarting the VM/host that houses the oVirt engine
> and OVN lead to a loss of connection for all OVN networks on the
> hypervisors?
>
> *Additional Context:*
>
>   * We'd appreciate any insights or recommendations you may have to
> mitigate this behavior and ensure the stability of our OVN network
> in such situations.
>
> Thank you for your time and assistance.
>
>
> Thanks
>
> Nash Oudha
>
> Oracle Virtualization Team
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] OVS-DPDK ConnTrack Update Racing Condition

2023-05-25 Thread Lazuardi Nasution via discuss
Hi,

Continuing my posting on "ovs-vswitchd crashes several times a day", it
seems that I have found a race condition in the conntrack update path.
Without enabling debug logging, I frequently see logs like the following.

2023-05-25T12:48:07.270Z|02757|conntrack(pmd-c47/id:101)|WARN|Unable to NAT
due to tuple space exhaustion - if DoS attack, use firewalling and/or zone
partitioning.
2023-05-25T12:48:09.318Z|02758|conntrack(pmd-c47/id:101)|WARN|Unable to NAT
due to tuple space exhaustion - if DoS attack, use firewalling and/or zone
partitioning.

After enabling debug logging, I see logs like the following just before the
messages above.

2023-05-25T12:48:06.979Z|00030|conntrack_tp(pmd-c71/id:103)|DBG|Update
timeout TCP_ESTABLISHED zone=4 with policy id=0 val=86400 sec.

At that time, the conntrack table contains only the entries below, with just
a single ESTABLISHED entry.

root@controller02:~# ovs-appctl dpctl/dump-conntrack -s
icmp,orig=(src=192.168.14.14,dst=8.8.8.8,id=6,type=8,code=0),reply=(src=8.8.8.8,dst=10.10.141.153,id=6,type=0,code=0),zone=4,timeout=29
icmp,orig=(src=192.168.14.11,dst=10.10.41.70,id=4,type=8,code=0),reply=(src=10.10.41.70,dst=10.10.141.153,id=4,type=0,code=0),zone=4,timeout=27
tcp,orig=(src=192.168.14.14,dst=10.10.41.73,sport=49852,dport=3306),reply=(src=10.10.41.73,dst=10.10.141.153,sport=3306,dport=49852),zone=4,timeout=86399,protoinfo=(state=ESTABLISHED)

It seems that OVS fails to look up conntrack entries whenever there is an
update. Any ideas on how to deal with this?
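
For reference, a quick way to rule out genuine conntrack table exhaustion
before suspecting a race (a sketch; availability of these dpctl commands
depends on the datapath type and OVS version, this assumes the
userspace/DPDK datapath):

ovs-appctl dpctl/ct-get-maxconns           # configured connection ceiling
ovs-appctl dpctl/ct-get-limits             # default and per-zone limits and counts
ovs-appctl dpctl/dump-conntrack | wc -l    # entries actually present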

Best regards.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] ovs-vswitchd crashes several times a day

2023-05-23 Thread Lazuardi Nasution via discuss
Hi Paolo, Hi Michael,

I want to confirm that the following patch works on Open vSwitch 3.0.3 and
that the OVS crash no longer happens after patching.

https://patchwork.ozlabs.org/project/openvswitch/patch/168192964823.4031872.3228556334798413886.st...@fed.void/

However, I currently see some logs like the following. I'm not sure whether
they are related to the above patch.

2023-05-23T08:35:18.383Z|5|conntrack(pmd-c49/id:104)|WARN|Unable to NAT
due to tuple space exhaustion - if DoS attack, use firewalling and/or zone
partitioning.

Any ideas?

Best regards.


> Date: Thu, 04 May 2023 19:24:53 +0200
> From: Paolo Valerio 
> To: Lazuardi Nasution 
> Cc: , ovs-discuss@openvswitch.org
> Subject: Re: [ovs-discuss] ovs-vswitchd crashes several times a day
> Message-ID: <871qjwt3fe@fed.void>
> Content-Type: text/plain; charset=utf-8
>
> Lazuardi Nasution  writes:
>
> > Hi Paolo,
> >
> > Should we combine this patch too?
> >
> > https://patchwork.ozlabs.org/project/openvswitch/patch/
> > 168192964823.4031872.3228556334798413886.st...@fed.void/
> >
>
> Hi,
>
> no, it basically does the same thing in a slightly different way
> reducing the need for modification in the case of backporting to
> previous versions.
>
> > Best regards.
> >
> > On Wed, Apr 5, 2023 at 2:51 AM Paolo Valerio  wrote:
> >
> > Hello,
> >
> > thanks for reporting this.
> > I had a look at it, and, although this needs to be confirmed, I
> suspect
> > it's related to nat (CT_CONN_TYPE_UN_NAT) and expired connections
> (but
> > not yet reclaimed).
> >
> > The nat part does not necessarily perform any actual translation, but
> > could still be triggered by ct(nat(src)...) which is the all-zero
> binding
> > to avoid collisions, if any.
> >
> > Is there any chance to test the following patch (targeted for ovs
> 2.17)?
> > This should help to confirm.
> >
> > -- >8 --
> > diff --git a/lib/conntrack.c b/lib/conntrack.c
> > index 08da4ddf7..ba334afb0 100644
> > --- a/lib/conntrack.c
> > +++ b/lib/conntrack.c
> > @@ -94,9 +94,8 @@ static bool valid_new(struct dp_packet *pkt, struct conn_key *);
> >  static struct conn *new_conn(struct conntrack *ct, struct dp_packet *pkt,
> >                               struct conn_key *, long long now,
> >                               uint32_t tp_id);
> > -static void delete_conn_cmn(struct conn *);
> > +static void delete_conn__(struct conn *);
> >  static void delete_conn(struct conn *);
> > -static void delete_conn_one(struct conn *conn);
> >  static enum ct_update_res conn_update(struct conntrack *ct, struct conn *conn,
> >                                        struct dp_packet *pkt,
> >                                        struct conn_lookup_ctx *ctx,
> > @@ -444,14 +443,13 @@ zone_limit_delete(struct conntrack *ct, uint16_t zone)
> >  }
> >
> >  static void
> > -conn_clean_cmn(struct conntrack *ct, struct conn *conn)
> > +conn_clean_cmn(struct conntrack *ct, struct conn *conn, uint32_t hash)
> >      OVS_REQUIRES(ct->ct_lock)
> >  {
> >      if (conn->alg) {
> >          expectation_clean(ct, &conn->key);
> >      }
> >
> > -    uint32_t hash = conn_key_hash(&conn->key, ct->hash_basis);
> >      cmap_remove(&ct->conns, &conn->cm_node, hash);
> >
> >      struct zone_limit *zl = zone_limit_lookup(ct, conn->admit_zone);
> > @@ -467,11 +465,14 @@ conn_clean(struct conntrack *ct, struct conn *conn)
> >      OVS_REQUIRES(ct->ct_lock)
> >  {
> >      ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
> > +    uint32_t conn_hash = conn_key_hash(&conn->key, ct->hash_basis);
> >
> > -    conn_clean_cmn(ct, conn);
> > +    conn_clean_cmn(ct, conn, conn_hash);
> >      if (conn->nat_conn) {
> >          uint32_t hash = conn_key_hash(&conn->nat_conn->key, ct->hash_basis);
> > -        cmap_remove(&ct->conns, &conn->nat_conn->cm_node, hash);
> > +        if (conn_hash != hash) {
> > +            cmap_remove(&ct->conns, &conn->nat_conn->cm_node, hash);
> > +        }
> >      }
> >      ovs_list_remove(&conn->exp_node);
> >      conn->cleaned = true;
> > @@ -479,19 +480,6 @@ conn_clean(struct conntrack *ct, struct conn *conn)
> >      atomic_count_dec(&ct->n_conn);
> >  }
> >
> > -static void
> > -conn_clean_one(struct conntrack *ct, struct conn *conn)
> > -    OVS_REQUIRES(ct->ct_lock)
> > -{
> > -    conn_clean_cmn(ct, conn);
> > -    if (conn->conn_type == CT_CONN_TYPE_DEFAULT) {
> > -        ovs_list_remove(&conn->exp_node);
> > -        conn->cleaned = true;
> > -        atomic_count_dec(&ct->n_conn);
> > -    }
> > -    ovsrcu_postpone(delete_conn_one, conn);
> > -}
> > -
> >  /* Destroys the connection tracker 'ct' and frees all the allocated memory.
> >   * The caller of this function must already have shut down packet input
> >   * and PMD threads 

Re: [ovs-discuss] ovs-vswitchd crashes several times a day

2023-05-04 Thread Lazuardi Nasution via discuss
Hi Paolo,

Should we combine this patch too?

https://patchwork.ozlabs.org/project/openvswitch/patch/168192964823.4031872.3228556334798413886.st...@fed.void/

Best regards.

On Wed, Apr 5, 2023 at 2:51 AM Paolo Valerio  wrote:

> Hello,
>
> thanks for reporting this.
> I had a look at it, and, although this needs to be confirmed, I suspect
> it's related to nat (CT_CONN_TYPE_UN_NAT) and expired connections (but
> not yet reclaimed).
>
> The nat part does not necessarily perform any actual translation, but
> could still be triggered by ct(nat(src)...) which is the all-zero binding
> to avoid collisions, if any.
>
> Is there any chance to test the following patch (targeted for ovs 2.17)?
> This should help to confirm.
>
> -- >8 --
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 08da4ddf7..ba334afb0 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -94,9 +94,8 @@ static bool valid_new(struct dp_packet *pkt, struct conn_key *);
>  static struct conn *new_conn(struct conntrack *ct, struct dp_packet *pkt,
>                               struct conn_key *, long long now,
>                               uint32_t tp_id);
> -static void delete_conn_cmn(struct conn *);
> +static void delete_conn__(struct conn *);
>  static void delete_conn(struct conn *);
> -static void delete_conn_one(struct conn *conn);
>  static enum ct_update_res conn_update(struct conntrack *ct, struct conn *conn,
>                                        struct dp_packet *pkt,
>                                        struct conn_lookup_ctx *ctx,
> @@ -444,14 +443,13 @@ zone_limit_delete(struct conntrack *ct, uint16_t zone)
>  }
>
>  static void
> -conn_clean_cmn(struct conntrack *ct, struct conn *conn)
> +conn_clean_cmn(struct conntrack *ct, struct conn *conn, uint32_t hash)
>      OVS_REQUIRES(ct->ct_lock)
>  {
>      if (conn->alg) {
>          expectation_clean(ct, &conn->key);
>      }
>
> -    uint32_t hash = conn_key_hash(&conn->key, ct->hash_basis);
>      cmap_remove(&ct->conns, &conn->cm_node, hash);
>
>      struct zone_limit *zl = zone_limit_lookup(ct, conn->admit_zone);
> @@ -467,11 +465,14 @@ conn_clean(struct conntrack *ct, struct conn *conn)
>      OVS_REQUIRES(ct->ct_lock)
>  {
>      ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
> +    uint32_t conn_hash = conn_key_hash(&conn->key, ct->hash_basis);
>
> -    conn_clean_cmn(ct, conn);
> +    conn_clean_cmn(ct, conn, conn_hash);
>      if (conn->nat_conn) {
>          uint32_t hash = conn_key_hash(&conn->nat_conn->key, ct->hash_basis);
> -        cmap_remove(&ct->conns, &conn->nat_conn->cm_node, hash);
> +        if (conn_hash != hash) {
> +            cmap_remove(&ct->conns, &conn->nat_conn->cm_node, hash);
> +        }
>      }
>      ovs_list_remove(&conn->exp_node);
>      conn->cleaned = true;
> @@ -479,19 +480,6 @@ conn_clean(struct conntrack *ct, struct conn *conn)
>      atomic_count_dec(&ct->n_conn);
>  }
>
> -static void
> -conn_clean_one(struct conntrack *ct, struct conn *conn)
> -    OVS_REQUIRES(ct->ct_lock)
> -{
> -    conn_clean_cmn(ct, conn);
> -    if (conn->conn_type == CT_CONN_TYPE_DEFAULT) {
> -        ovs_list_remove(&conn->exp_node);
> -        conn->cleaned = true;
> -        atomic_count_dec(&ct->n_conn);
> -    }
> -    ovsrcu_postpone(delete_conn_one, conn);
> -}
> -
>  /* Destroys the connection tracker 'ct' and frees all the allocated memory.
>   * The caller of this function must already have shut down packet input
>   * and PMD threads (which would have been quiesced).  */
> @@ -505,7 +493,10 @@ conntrack_destroy(struct conntrack *ct)
>
>      ovs_mutex_lock(&ct->ct_lock);
>      CMAP_FOR_EACH (conn, cm_node, &ct->conns) {
> -        conn_clean_one(ct, conn);
> +        if (conn->conn_type == CT_CONN_TYPE_UN_NAT) {
> +            continue;
> +        }
> +        conn_clean(ct, conn);
>      }
>      cmap_destroy(&ct->conns);
>
> @@ -1052,7 +1043,10 @@ conn_not_found(struct conntrack *ct, struct dp_packet *pkt,
>          nat_conn->alg = NULL;
>          nat_conn->nat_conn = NULL;
>          uint32_t nat_hash = conn_key_hash(&nat_conn->key, ct->hash_basis);
> -        cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
> +
> +        if (nat_hash != ctx->hash) {
> +            cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
> +        }
>      }
>
>      nc->nat_conn = nat_conn;
> @@ -1080,7 +1074,7 @@ conn_not_found(struct conntrack *ct, struct dp_packet *pkt,
>  nat_res_exhaustion:
>      free(nat_conn);
>      ovs_list_remove(&nc->exp_node);
> -    delete_conn_cmn(nc);
> +    delete_conn__(nc);
>      static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
>      VLOG_WARN_RL(&rl, "Unable to NAT due to tuple space exhaustion - "
>                   "if DoS attack, use firewalling and/or zone partitioning.");
> @@ -2549,7 +2543,7 @@ new_conn(struct conntrack *ct, struct dp_packet *pkt, struct conn_key *key,
>  }
>
>  static void
> -delete_conn_cmn(struct conn *conn)
> +delete_conn__(struct conn *conn)
>  {
>      free(conn->alg);
>      free(conn);
> @@ -2561,17 

Re: [ovs-discuss] ovs-vswitchd crashes several times a day

2023-04-13 Thread Lazuardi Nasution via discuss
Hi Paolo,

I'm interested in your statement about "expired connections (but not yet
reclaimed)". Do you think that shortening the conntrack timeout policy will
help? Or should we make it larger so that there are fewer conntrack table
updates and flush attempts?
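
If tuning the timeouts turns out to be worth trying, zone timeout policies
can be set from ovs-vsctl. A sketch only, assuming the userspace datapath
record is named "netdev" and using the tcp_established key from the
CT_Timeout_Policy schema; the zone numbers here are assigned by
ovn-controller, so this is more of an experiment than a recommendation:

# Shorten the established-TCP timeout for zone 4 (default is 86400 s):
ovs-vsctl add-zone-tp netdev zone=4 tcp_established=3600
# Inspect and revert:
ovs-vsctl list-zone-tp netdev
ovs-vsctl del-zone-tp netdev zone=4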

Best regards.

On Wed, Apr 5, 2023 at 2:51 AM Paolo Valerio  wrote:

> Hello,
>
> thanks for reporting this.
> I had a look at it, and, although this needs to be confirmed, I suspect
> it's related to nat (CT_CONN_TYPE_UN_NAT) and expired connections (but
> not yet reclaimed).
>
> The nat part does not necessarily perform any actual translation, but
> could still be triggered by ct(nat(src)...) which is the all-zero binding
> to avoid collisions, if any.
>
> Is there any chance to test the following patch (targeted for ovs 2.17)?
> This should help to confirm.
>
> -- >8 --
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 08da4ddf7..ba334afb0 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -94,9 +94,8 @@ static bool valid_new(struct dp_packet *pkt, struct conn_key *);
>  static struct conn *new_conn(struct conntrack *ct, struct dp_packet *pkt,
>                               struct conn_key *, long long now,
>                               uint32_t tp_id);
> -static void delete_conn_cmn(struct conn *);
> +static void delete_conn__(struct conn *);
>  static void delete_conn(struct conn *);
> -static void delete_conn_one(struct conn *conn);
>  static enum ct_update_res conn_update(struct conntrack *ct, struct conn *conn,
>                                        struct dp_packet *pkt,
>                                        struct conn_lookup_ctx *ctx,
> @@ -444,14 +443,13 @@ zone_limit_delete(struct conntrack *ct, uint16_t zone)
>  }
>
>  static void
> -conn_clean_cmn(struct conntrack *ct, struct conn *conn)
> +conn_clean_cmn(struct conntrack *ct, struct conn *conn, uint32_t hash)
>      OVS_REQUIRES(ct->ct_lock)
>  {
>      if (conn->alg) {
>          expectation_clean(ct, &conn->key);
>      }
>
> -    uint32_t hash = conn_key_hash(&conn->key, ct->hash_basis);
>      cmap_remove(&ct->conns, &conn->cm_node, hash);
>
>      struct zone_limit *zl = zone_limit_lookup(ct, conn->admit_zone);
> @@ -467,11 +465,14 @@ conn_clean(struct conntrack *ct, struct conn *conn)
>      OVS_REQUIRES(ct->ct_lock)
>  {
>      ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
> +    uint32_t conn_hash = conn_key_hash(&conn->key, ct->hash_basis);
>
> -    conn_clean_cmn(ct, conn);
> +    conn_clean_cmn(ct, conn, conn_hash);
>      if (conn->nat_conn) {
>          uint32_t hash = conn_key_hash(&conn->nat_conn->key, ct->hash_basis);
> -        cmap_remove(&ct->conns, &conn->nat_conn->cm_node, hash);
> +        if (conn_hash != hash) {
> +            cmap_remove(&ct->conns, &conn->nat_conn->cm_node, hash);
> +        }
>      }
>      ovs_list_remove(&conn->exp_node);
>      conn->cleaned = true;
> @@ -479,19 +480,6 @@ conn_clean(struct conntrack *ct, struct conn *conn)
>      atomic_count_dec(&ct->n_conn);
>  }
>
> -static void
> -conn_clean_one(struct conntrack *ct, struct conn *conn)
> -    OVS_REQUIRES(ct->ct_lock)
> -{
> -    conn_clean_cmn(ct, conn);
> -    if (conn->conn_type == CT_CONN_TYPE_DEFAULT) {
> -        ovs_list_remove(&conn->exp_node);
> -        conn->cleaned = true;
> -        atomic_count_dec(&ct->n_conn);
> -    }
> -    ovsrcu_postpone(delete_conn_one, conn);
> -}
> -
>  /* Destroys the connection tracker 'ct' and frees all the allocated memory.
>   * The caller of this function must already have shut down packet input
>   * and PMD threads (which would have been quiesced).  */
> @@ -505,7 +493,10 @@ conntrack_destroy(struct conntrack *ct)
>
>      ovs_mutex_lock(&ct->ct_lock);
>      CMAP_FOR_EACH (conn, cm_node, &ct->conns) {
> -        conn_clean_one(ct, conn);
> +        if (conn->conn_type == CT_CONN_TYPE_UN_NAT) {
> +            continue;
> +        }
> +        conn_clean(ct, conn);
>      }
>      cmap_destroy(&ct->conns);
>
> @@ -1052,7 +1043,10 @@ conn_not_found(struct conntrack *ct, struct dp_packet *pkt,
>          nat_conn->alg = NULL;
>          nat_conn->nat_conn = NULL;
>          uint32_t nat_hash = conn_key_hash(&nat_conn->key, ct->hash_basis);
> -        cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
> +
> +        if (nat_hash != ctx->hash) {
> +            cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
> +        }
>      }
>
>      nc->nat_conn = nat_conn;
> @@ -1080,7 +1074,7 @@ conn_not_found(struct conntrack *ct, struct dp_packet *pkt,
>  nat_res_exhaustion:
>      free(nat_conn);
>      ovs_list_remove(&nc->exp_node);
> -    delete_conn_cmn(nc);
> +    delete_conn__(nc);
>      static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
>      VLOG_WARN_RL(&rl, "Unable to NAT due to tuple space exhaustion - "
>                   "if DoS attack, use firewalling and/or zone partitioning.");
> @@ -2549,7 +2543,7 @@ new_conn(struct conntrack *ct, struct dp_packet *pkt, struct conn_key *key,
>  }
>
>  static void
> -delete_conn_cmn(struct conn *conn)
> 

Re: [ovs-discuss] ovs-vswitchd crashes several times a day

2023-04-05 Thread Lazuardi Nasution via discuss
Hi Michael,

Great to know that. I will try it on my cluster too. By the way, do you know
how to find the compile options of the OVS-DPDK package from the Ubuntu repo?

Best regards.

On Wed, Apr 5, 2023, 1:56 PM Plato, Michael 
wrote:

> Hi,
>
>
>
> yes our k8s cluster is on the same subnet. I stopped one of the etcd nodes
> yesterday which triggers a lot of reconnection attempts from the other
> cluster members. Stilll no issues so far and no ovs crashes 
>
>
>
> Regards
>
>
>
> Michael
>
>
>
> *From:* Lazuardi Nasution 
> *Sent:* Tuesday, April 4, 2023 09:56
> *To:* Plato, Michael 
> *Cc:* ovs-discuss@openvswitch.org
> *Subject:* Re: [ovs-discuss] ovs-vswitchd crashes several times a day
>
>
>
> Hi Michael,
>
>
>
> I assume that your k8s cluster is on the same subnet, right? Would you
> mind testing it by shutting down one of etcd instances and see if this bug
> still exists?
>
>
>
> Best regards.
>
>
>
> On Tue, Apr 4, 2023 at 2:50 PM Plato, Michael 
> wrote:
>
> Hi,
>
> from my perspective the patch works for all cases. My test environment
> runs with several k8s clusters and I haven't noticed any etcd failures so
> far.
>
>
>
> Best regards
>
>
>
> Michael
>
>
>
> *From:* Lazuardi Nasution 
> *Sent:* Tuesday, April 4, 2023 09:41
> *To:* Plato, Michael 
> *Cc:* ovs-discuss@openvswitch.org
> *Subject:* Re: [ovs-discuss] ovs-vswitchd crashes several times a day
>
>
>
> Hi Michael,
>
>
>
> Is your patch working on the same subnet unreachable traffic too. In my
> case, crashes happen when too many unreachable replies even from the same
> subnet. For example, when one of the etcd instances is down, there will be
> huge reconnection attempts and then unreachable replies from the
> destination VM where the down etcd instance exists.
>
>
>
> Best regards.
>
>
>
> On Tue, Apr 4, 2023 at 1:06 PM Plato, Michael 
> wrote:
>
> Hi,
>
> I have some news on this topic. Unfortunately I could not find the root
> cause. But I managed to implement a workaround (see patch in attachment).
> The basic idea is to mark the nat flows as invalid if there is no longer an
> associated connection. From my point of view it is a race condition. It can
> be triggered by many short-lived connections. With the patch we no longer
> have any crashes. I can't say if it has any negative effects though, as I'm
> not an expert. So far I haven't found any problems at least. Without this
> patch we had hundreds of crashes a day :/
>
>
>
> Best regards
>
>
> Michael
>
>
>
> *From:* Lazuardi Nasution 
> *Sent:* Monday, April 3, 2023 13:50
> *To:* ovs-discuss@openvswitch.org
> *Cc:* Plato, Michael 
> *Subject:* Re: [ovs-discuss] ovs-vswitchd crashes several times a day
>
>
>
> Hi,
>
>
>
> Is this related to following glibc bug? I'm not so sure about this because
> when I check the glibc source of installed version (2.35), the proposed
> patch has been applied.
>
>
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=12889
>
>
>
> I can confirm that this problem only happen if I use statefull ACL which
> is related to conntrack. The racing situation happen when massive
> unreachable replies are received. For example, if I run etcd on VMs but one
> etcd node has been disabled which causes massive connection attempts and
> unreachable replies.
>
>
>
> Best regards.
>
>
>
> On Mon, Mar 20, 2023, 10:58 PM Lazuardi Nasution 
> wrote:
>
> Hi Michael,
>
>
>
> Have you found the solution for this case? I find the same weird problem
> without any information about which conntrack entries are causing
> this issue.
>
>
>
> I'm using OVS 3.0.1 with DPDK  21.11.2 on Ubuntu 22.04. By the way, this
> problem is disappear after I remove some Kubernutes cluster VMs and some DB
> cluster VMs.
>
>
>
> Best regards.
>
>
>
> Date: Thu, 29 Sep 2022 07:56:32 +
> From: "Plato, Michael" 
> To: "ovs-discuss@openvswitch.org" 
> Subject: [ovs-discuss] ovs-vswitchd crashes several times a day
> Message-ID: <8e53d3d0674049e69b2b7f3c4b0b8...@tu-berlin.de>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi,
>
> we are about to roll out our new openstack infrastructure based on yoga
> and during our testing we observered that the openvswitch-switch systemd
> unit restarts several times a day, causing network interruptions for all
> VMs on the compute node in question.
> After some research we found that the ovs-vswitchd crashes with the
> following assertion failure:
>
> "2022-09-29T06:51:05.195Z|3|util(pmd-c01/id:8)|EMER|../lib/conntrack.c:1095:
> assertion conn->conn_type == CT_CONN_TYPE_DEFAULT failed in
> conn_update_state()"
>
> To get more information about the connection that leads to this assertion
> failure, I added some debug code to conntrack.c .
> We have seen that we can trigger this issue when trying to connect from a
> VM to a destination which is unreachable. For example curl
> https://www.google.de:444
>
> Shortly after that we get an assertion and the debug code says:
>
> conn_type=1 (may be CT_CONN_TYPE_UN_NAT) ?
> src ip 172.217.16.67 dst ip 141.23.xx.xx 

Re: [ovs-discuss] ovs-vswitchd crashes several times a day

2023-04-04 Thread Lazuardi Nasution via discuss
Hi Paolo,

Would you mind explaining this to me? Currently, I'm still looking for the
compile options of the OVS-DPDK package installed from the Ubuntu repo.
After that, I'll try your patch and compile it with the same options.
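
For the record, a rough sketch of one way to rebuild the Ubuntu package with
the patch while keeping the distro's build options (package names, versions
and the patch file name below are illustrative):

apt-get source openvswitch                   # fetches the packaging, including debian/rules
sudo apt-get build-dep openvswitch           # pulls in the build dependencies
cd openvswitch-*/
grep -A20 dh_auto_configure debian/rules     # shows the configure flags Ubuntu uses
patch -p1 < ../conn_clean_fix.patch          # apply the patch (illustrative file name)
DEB_BUILD_OPTIONS=nocheck dpkg-buildpackage -us -uc -b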

Best regards.

On Wed, Apr 5, 2023, 2:51 AM Paolo Valerio  wrote:

> Hello,
>
> thanks for reporting this.
> I had a look at it, and, although this needs to be confirmed, I suspect
> it's related to nat (CT_CONN_TYPE_UN_NAT) and expired connections (but
> not yet reclaimed).
>
> The nat part does not necessarily perform any actual translation, but
> could still be triggered by ct(nat(src)...) which is the all-zero binding
> to avoid collisions, if any.
>
> Is there any chance to test the following patch (targeted for ovs 2.17)?
> This should help to confirm.
>
> -- >8 --
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 08da4ddf7..ba334afb0 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -94,9 +94,8 @@ static bool valid_new(struct dp_packet *pkt, struct conn_key *);
>  static struct conn *new_conn(struct conntrack *ct, struct dp_packet *pkt,
>                               struct conn_key *, long long now,
>                               uint32_t tp_id);
> -static void delete_conn_cmn(struct conn *);
> +static void delete_conn__(struct conn *);
>  static void delete_conn(struct conn *);
> -static void delete_conn_one(struct conn *conn);
>  static enum ct_update_res conn_update(struct conntrack *ct, struct conn *conn,
>                                        struct dp_packet *pkt,
>                                        struct conn_lookup_ctx *ctx,
> @@ -444,14 +443,13 @@ zone_limit_delete(struct conntrack *ct, uint16_t zone)
>  }
>
>  static void
> -conn_clean_cmn(struct conntrack *ct, struct conn *conn)
> +conn_clean_cmn(struct conntrack *ct, struct conn *conn, uint32_t hash)
>      OVS_REQUIRES(ct->ct_lock)
>  {
>      if (conn->alg) {
>          expectation_clean(ct, &conn->key);
>      }
>
> -    uint32_t hash = conn_key_hash(&conn->key, ct->hash_basis);
>      cmap_remove(&ct->conns, &conn->cm_node, hash);
>
>      struct zone_limit *zl = zone_limit_lookup(ct, conn->admit_zone);
> @@ -467,11 +465,14 @@ conn_clean(struct conntrack *ct, struct conn *conn)
>      OVS_REQUIRES(ct->ct_lock)
>  {
>      ovs_assert(conn->conn_type == CT_CONN_TYPE_DEFAULT);
> +    uint32_t conn_hash = conn_key_hash(&conn->key, ct->hash_basis);
>
> -    conn_clean_cmn(ct, conn);
> +    conn_clean_cmn(ct, conn, conn_hash);
>      if (conn->nat_conn) {
>          uint32_t hash = conn_key_hash(&conn->nat_conn->key, ct->hash_basis);
> -        cmap_remove(&ct->conns, &conn->nat_conn->cm_node, hash);
> +        if (conn_hash != hash) {
> +            cmap_remove(&ct->conns, &conn->nat_conn->cm_node, hash);
> +        }
>      }
>      ovs_list_remove(&conn->exp_node);
>      conn->cleaned = true;
> @@ -479,19 +480,6 @@ conn_clean(struct conntrack *ct, struct conn *conn)
>      atomic_count_dec(&ct->n_conn);
>  }
>
> -static void
> -conn_clean_one(struct conntrack *ct, struct conn *conn)
> -    OVS_REQUIRES(ct->ct_lock)
> -{
> -    conn_clean_cmn(ct, conn);
> -    if (conn->conn_type == CT_CONN_TYPE_DEFAULT) {
> -        ovs_list_remove(&conn->exp_node);
> -        conn->cleaned = true;
> -        atomic_count_dec(&ct->n_conn);
> -    }
> -    ovsrcu_postpone(delete_conn_one, conn);
> -}
> -
>  /* Destroys the connection tracker 'ct' and frees all the allocated memory.
>   * The caller of this function must already have shut down packet input
>   * and PMD threads (which would have been quiesced).  */
> @@ -505,7 +493,10 @@ conntrack_destroy(struct conntrack *ct)
>
>      ovs_mutex_lock(&ct->ct_lock);
>      CMAP_FOR_EACH (conn, cm_node, &ct->conns) {
> -        conn_clean_one(ct, conn);
> +        if (conn->conn_type == CT_CONN_TYPE_UN_NAT) {
> +            continue;
> +        }
> +        conn_clean(ct, conn);
>      }
>      cmap_destroy(&ct->conns);
>
> @@ -1052,7 +1043,10 @@ conn_not_found(struct conntrack *ct, struct dp_packet *pkt,
>          nat_conn->alg = NULL;
>          nat_conn->nat_conn = NULL;
>          uint32_t nat_hash = conn_key_hash(&nat_conn->key, ct->hash_basis);
> -        cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
> +
> +        if (nat_hash != ctx->hash) {
> +            cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
> +        }
>      }
>
>      nc->nat_conn = nat_conn;
> @@ -1080,7 +1074,7 @@ conn_not_found(struct conntrack *ct, struct dp_packet *pkt,
>  nat_res_exhaustion:
>      free(nat_conn);
>      ovs_list_remove(&nc->exp_node);
> -    delete_conn_cmn(nc);
> +    delete_conn__(nc);
>      static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
>      VLOG_WARN_RL(&rl, "Unable to NAT due to tuple space exhaustion - "
>                   "if DoS attack, use firewalling and/or zone partitioning.");
> @@ -2549,7 +2543,7 @@ new_conn(struct conntrack *ct, struct dp_packet *pkt, struct conn_key *key,
>  }
>
>  static void
> -delete_conn_cmn(struct conn *conn)
> +delete_conn__(struct conn *conn)
>  {
> 

Re: [ovs-discuss] ovs-vswitchd crashes several times a day

2023-04-04 Thread Lazuardi Nasution via discuss
Hi Michael,

I assume that your k8s cluster is on the same subnet, right? Would you mind
testing it by shutting down one of the etcd instances and seeing if this bug
still exists?

Best regards.

On Tue, Apr 4, 2023 at 2:50 PM Plato, Michael 
wrote:

> Hi,
>
> from my perspective the patch works for all cases. My test environment
> runs with several k8s clusters and I haven't noticed any etcd failures so
> far.
>
>
>
> Best regards
>
>
>
> Michael
>
>
>
> *From:* Lazuardi Nasution 
> *Sent:* Tuesday, April 4, 2023 09:41
> *To:* Plato, Michael 
> *Cc:* ovs-discuss@openvswitch.org
> *Subject:* Re: [ovs-discuss] ovs-vswitchd crashes several times a day
>
>
>
> Hi Michael,
>
>
>
> Is your patch working on the same subnet unreachable traffic too. In my
> case, crashes happen when too many unreachable replies even from the same
> subnet. For example, when one of the etcd instances is down, there will be
> huge reconnection attempts and then unreachable replies from the
> destination VM where the down etcd instance exists.
>
>
>
> Best regards.
>
>
>
> On Tue, Apr 4, 2023 at 1:06 PM Plato, Michael 
> wrote:
>
> Hi,
>
> I have some news on this topic. Unfortunately I could not find the root
> cause. But I managed to implement a workaround (see patch in attachment).
> The basic idea is to mark the nat flows as invalid if there is no longer an
> associated connection. From my point of view it is a race condition. It can
> be triggered by many short-lived connections. With the patch we no longer
> have any crashes. I can't say if it has any negative effects though, as I'm
> not an expert. So far I haven't found any problems at least. Without this
> patch we had hundreds of crashes a day :/
>
>
>
> Best regards
>
>
> Michael
>
>
>
> *From:* Lazuardi Nasution 
> *Sent:* Monday, April 3, 2023 13:50
> *To:* ovs-discuss@openvswitch.org
> *Cc:* Plato, Michael 
> *Subject:* Re: [ovs-discuss] ovs-vswitchd crashes several times a day
>
>
>
> Hi,
>
>
>
> Is this related to following glibc bug? I'm not so sure about this because
> when I check the glibc source of installed version (2.35), the proposed
> patch has been applied.
>
>
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=12889
>
>
>
> I can confirm that this problem only happen if I use statefull ACL which
> is related to conntrack. The racing situation happen when massive
> unreachable replies are received. For example, if I run etcd on VMs but one
> etcd node has been disabled which causes massive connection attempts and
> unreachable replies.
>
>
>
> Best regards.
>
>
>
> On Mon, Mar 20, 2023, 10:58 PM Lazuardi Nasution 
> wrote:
>
> Hi Michael,
>
>
>
> Have you found the solution for this case? I find the same weird problem
> without any information about which conntrack entries are causing
> this issue.
>
>
>
> I'm using OVS 3.0.1 with DPDK  21.11.2 on Ubuntu 22.04. By the way, this
> problem is disappear after I remove some Kubernutes cluster VMs and some DB
> cluster VMs.
>
>
>
> Best regards.
>
>
>
> Date: Thu, 29 Sep 2022 07:56:32 +
> From: "Plato, Michael" 
> To: "ovs-discuss@openvswitch.org" 
> Subject: [ovs-discuss] ovs-vswitchd crashes several times a day
> Message-ID: <8e53d3d0674049e69b2b7f3c4b0b8...@tu-berlin.de>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi,
>
> we are about to roll out our new openstack infrastructure based on yoga
> and during our testing we observered that the openvswitch-switch systemd
> unit restarts several times a day, causing network interruptions for all
> VMs on the compute node in question.
> After some research we found that the ovs-vswitchd crashes with the
> following assertion failure:
>
> "2022-09-29T06:51:05.195Z|3|util(pmd-c01/id:8)|EMER|../lib/conntrack.c:1095:
> assertion conn->conn_type == CT_CONN_TYPE_DEFAULT failed in
> conn_update_state()"
>
> To get more information about the connection that leads to this assertion
> failure, I added some debug code to conntrack.c .
> We have seen that we can trigger this issue when trying to connect from a
> VM to a destination which is unreachable. For example curl
> https://www.google.de:444
>
> Shortly after that we get an assertion and the debug code says:
>
> conn_type=1 (may be CT_CONN_TYPE_UN_NAT) ?
> src ip 172.217.16.67 dst ip 141.23.xx.xx rev src ip 141.23.xx.xx rev dst
> ip 172.217.16.67 src/dst ports 444/46212 rev src/dst ports 46212/444
> zone/rev zone 2/2 nw_proto/rev nw_proto 6/6
>
> ovs-appctl dpctl/dump-conntrack | grep "444"
>
> tcp,orig=(src=141.23.xx.xx,dst=172.217.16.67,sport=46212,dport=444),reply=(src=172.217.16.67,dst=141.23.xx.xx,sport=444,dport=46212),zone=2,protoinfo=(state=SYN_SENT)
>
> Versions:
> ovs-vsctl --version
> ovs-vsctl (Open vSwitch) 2.17.2
> DB Schema 8.3.0
>
> ovn-controller --version
> ovn-controller 22.03.0
> Open vSwitch Library 2.17.0
> OpenFlow versions 0x6:0x6
> SB DB Schema 20.21.0
>
> DPDK 21.11.2
>
> We are now unsure if this is a misconfiguration or if we hit a bug.
>
> Thanks for any feedback
>
> Michael

Re: [ovs-discuss] ovs-vswitchd crashes several times a day

2023-04-04 Thread Lazuardi Nasution via discuss
Hi Michael,

Does your patch also work for unreachable traffic on the same subnet? In my
case, crashes happen when there are too many unreachable replies, even from
the same subnet. For example, when one of the etcd instances is down, there
are huge numbers of reconnection attempts and then unreachable replies from
the destination VM where the down etcd instance lives.

Best regards.

On Tue, Apr 4, 2023 at 1:06 PM Plato, Michael 
wrote:

> Hi,
>
> I have some news on this topic. Unfortunately I could not find the root
> cause. But I managed to implement a workaround (see patch in attachment).
> The basic idea is to mark the nat flows as invalid if there is no longer an
> associated connection. From my point of view it is a race condition. It can
> be triggered by many short-lived connections. With the patch we no longer
> have any crashes. I can't say if it has any negative effects though, as I'm
> not an expert. So far I haven't found any problems at least. Without this
> patch we had hundreds of crashes a day :/
>
>
>
> Best regards
>
>
> Michael
>
>
>
> *From:* Lazuardi Nasution 
> *Sent:* Monday, April 3, 2023 13:50
> *To:* ovs-discuss@openvswitch.org
> *Cc:* Plato, Michael 
> *Subject:* Re: [ovs-discuss] ovs-vswitchd crashes several times a day
>
>
>
> Hi,
>
>
>
> Is this related to following glibc bug? I'm not so sure about this because
> when I check the glibc source of installed version (2.35), the proposed
> patch has been applied.
>
>
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=12889
>
>
>
> I can confirm that this problem only happen if I use statefull ACL which
> is related to conntrack. The racing situation happen when massive
> unreachable replies are received. For example, if I run etcd on VMs but one
> etcd node has been disabled which causes massive connection attempts and
> unreachable replies.
>
>
>
> Best regards.
>
>
>
> On Mon, Mar 20, 2023, 10:58 PM Lazuardi Nasution 
> wrote:
>
> Hi Michael,
>
>
>
> Have you found the solution for this case? I find the same weird problem
> without any information about which conntrack entries are causing
> this issue.
>
>
>
> I'm using OVS 3.0.1 with DPDK  21.11.2 on Ubuntu 22.04. By the way, this
> problem is disappear after I remove some Kubernutes cluster VMs and some DB
> cluster VMs.
>
>
>
> Best regards.
>
>
>
> Date: Thu, 29 Sep 2022 07:56:32 +
> From: "Plato, Michael" 
> To: "ovs-discuss@openvswitch.org" 
> Subject: [ovs-discuss] ovs-vswitchd crashes several times a day
> Message-ID: <8e53d3d0674049e69b2b7f3c4b0b8...@tu-berlin.de>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi,
>
> we are about to roll out our new openstack infrastructure based on yoga
> and during our testing we observered that the openvswitch-switch systemd
> unit restarts several times a day, causing network interruptions for all
> VMs on the compute node in question.
> After some research we found that the ovs-vswitchd crashes with the
> following assertion failure:
>
> "2022-09-29T06:51:05.195Z|3|util(pmd-c01/id:8)|EMER|../lib/conntrack.c:1095:
> assertion conn->conn_type == CT_CONN_TYPE_DEFAULT failed in
> conn_update_state()"
>
> To get more information about the connection that leads to this assertion
> failure, I added some debug code to conntrack.c .
> We have seen that we can trigger this issue when trying to connect from a
> VM to a destination which is unreachable. For example curl
> https://www.google.de:444
>
> Shortly after that we get an assertion and the debug code says:
>
> conn_type=1 (may be CT_CONN_TYPE_UN_NAT) ?
> src ip 172.217.16.67 dst ip 141.23.xx.xx rev src ip 141.23.xx.xx rev dst
> ip 172.217.16.67 src/dst ports 444/46212 rev src/dst ports 46212/444
> zone/rev zone 2/2 nw_proto/rev nw_proto 6/6
>
> ovs-appctl dpctl/dump-conntrack | grep "444"
>
> tcp,orig=(src=141.23.xx.xx,dst=172.217.16.67,sport=46212,dport=444),reply=(src=172.217.16.67,dst=141.23.xx.xx,sport=444,dport=46212),zone=2,protoinfo=(state=SYN_SENT)
>
> Versions:
> ovs-vsctl --version
> ovs-vsctl (Open vSwitch) 2.17.2
> DB Schema 8.3.0
>
> ovn-controller --version
> ovn-controller 22.03.0
> Open vSwitch Library 2.17.0
> OpenFlow versions 0x6:0x6
> SB DB Schema 20.21.0
>
> DPDK 21.11.2
>
> We are now unsure if this is a misconfiguration or if we hit a bug.
>
> Thanks for any feedback
>
> Michael
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] ovs-dpdk crash

2023-04-03 Thread Lazuardi Nasution via discuss
Hi,

Any update on this issue? It happens with OVS 3.0.1 and DPDK 21.11.3 too.

Best regards,


> Date: Mon, 23 Jan 2023 15:45:37 +0800 (CST)
> From: ?? <13813836...@163.com>
> To: b...@openvswitch.org
> Subject: [ovs-discuss] ovs-dpdk crash
> Message-ID: <6cb12966.2f9.185dd96f9a5.coremail.13813836...@163.com>
> Content-Type: text/plain; charset="gbk"
>
> We use OVS 2.17.2 and DPDK 22.03. After configuring SNAT, we encountered
> coredump problems.
> Please take the trouble to look at these problems, thank you.
>
>
>
> -- next part --
> An HTML attachment was scrubbed...
> URL: <
> http://mail.openvswitch.org/pipermail/ovs-discuss/attachments/20230123/fc24053c/attachment-0001.html
> >
> -- next part --
> An embedded and charset-unspecified text was scrubbed...
> Name: crash1.txt
> URL: <
> http://mail.openvswitch.org/pipermail/ovs-discuss/attachments/20230123/fc24053c/attachment-0002.txt
> >
> -- next part --
> An embedded and charset-unspecified text was scrubbed...
> Name: crash2.txt
> URL: <
> http://mail.openvswitch.org/pipermail/ovs-discuss/attachments/20230123/fc24053c/attachment-0003.txt
> >
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] ovs-vswitchd crashes several times a day

2023-04-03 Thread Lazuardi Nasution via discuss
Hi,

Is this related to the following glibc bug? I'm not so sure, because when I
check the glibc source of the installed version (2.35), the proposed patch
has already been applied.

https://sourceware.org/bugzilla/show_bug.cgi?id=12889

I can confirm that this problem only happens if I use stateful ACLs, which
are related to conntrack. The race happens when massive numbers of
unreachable replies are received; for example, when I run etcd on VMs and
one etcd node has been disabled, which causes massive connection attempts
and unreachable replies.

Best regards.

On Mon, Mar 20, 2023, 10:58 PM Lazuardi Nasution 
wrote:

> Hi Michael,
>
> Have you found the solution for this case? I find the same weird problem
> without any information about which conntrack entries are causing
> this issue.
>
> I'm using OVS 3.0.1 with DPDK  21.11.2 on Ubuntu 22.04. By the way, this
> problem is disappear after I remove some Kubernutes cluster VMs and some DB
> cluster VMs.
>
> Best regards.
>
>
>> Date: Thu, 29 Sep 2022 07:56:32 +
>> From: "Plato, Michael" 
>> To: "ovs-discuss@openvswitch.org" 
>> Subject: [ovs-discuss] ovs-vswitchd crashes several times a day
>> Message-ID: <8e53d3d0674049e69b2b7f3c4b0b8...@tu-berlin.de>
>> Content-Type: text/plain; charset="us-ascii"
>>
>> Hi,
>>
>> we are about to roll out our new openstack infrastructure based on yoga
>> and during our testing we observered that the openvswitch-switch systemd
>> unit restarts several times a day, causing network interruptions for all
>> VMs on the compute node in question.
>> After some research we found that the ovs-vswitchd crashes with the
>> following assertion failure:
>>
>> "2022-09-29T06:51:05.195Z|3|util(pmd-c01/id:8)|EMER|../lib/conntrack.c:1095:
>> assertion conn->conn_type == CT_CONN_TYPE_DEFAULT failed in
>> conn_update_state()"
>>
>> To get more information about the connection that leads to this assertion
>> failure, I added some debug code to conntrack.c .
>> We have seen that we can trigger this issue when trying to connect from a
>> VM to a destination which is unreachable. For example curl
>> https://www.google.de:444
>>
>> Shortly after that we get an assertion and the debug code says:
>>
>> conn_type=1 (may be CT_CONN_TYPE_UN_NAT) ?
>> src ip 172.217.16.67 dst ip 141.23.xx.xx rev src ip 141.23.xx.xx rev dst
>> ip 172.217.16.67 src/dst ports 444/46212 rev src/dst ports 46212/444
>> zone/rev zone 2/2 nw_proto/rev nw_proto 6/6
>>
>> ovs-appctl dpctl/dump-conntrack | grep "444"
>>
>> tcp,orig=(src=141.23.xx.xx,dst=172.217.16.67,sport=46212,dport=444),reply=(src=172.217.16.67,dst=141.23.xx.xx,sport=444,dport=46212),zone=2,protoinfo=(state=SYN_SENT)
>>
>> Versions:
>> ovs-vsctl --version
>> ovs-vsctl (Open vSwitch) 2.17.2
>> DB Schema 8.3.0
>>
>> ovn-controller --version
>> ovn-controller 22.03.0
>> Open vSwitch Library 2.17.0
>> OpenFlow versions 0x6:0x6
>> SB DB Schema 20.21.0
>>
>> DPDK 21.11.2
>>
>> We are now unsure if this is a misconfiguration or if we hit a bug.
>>
>> Thanks for any feedback
>>
>> Michael
>>
>>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] ovs-vswitchd crashes several times a day

2023-03-20 Thread Lazuardi Nasution via discuss
Hi Michael,

Have you found a solution to this case? I see the same weird problem,
without any information about which conntrack entries are causing it.

I'm using OVS 3.0.1 with DPDK 21.11.2 on Ubuntu 22.04. By the way, the
problem disappears after I remove some Kubernetes cluster VMs and some DB
cluster VMs.

Best regards.


> Date: Thu, 29 Sep 2022 07:56:32 +
> From: "Plato, Michael" 
> To: "ovs-discuss@openvswitch.org" 
> Subject: [ovs-discuss] ovs-vswitchd crashes several times a day
> Message-ID: <8e53d3d0674049e69b2b7f3c4b0b8...@tu-berlin.de>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi,
>
> we are about to roll out our new openstack infrastructure based on yoga
> and during our testing we observered that the openvswitch-switch systemd
> unit restarts several times a day, causing network interruptions for all
> VMs on the compute node in question.
> After some research we found that the ovs-vswitchd crashes with the
> following assertion failure:
>
> "2022-09-29T06:51:05.195Z|3|util(pmd-c01/id:8)|EMER|../lib/conntrack.c:1095:
> assertion conn->conn_type == CT_CONN_TYPE_DEFAULT failed in
> conn_update_state()"
>
> To get more information about the connection that leads to this assertion
> failure, I added some debug code to conntrack.c .
> We have seen that we can trigger this issue when trying to connect from a
> VM to a destination which is unreachable. For example curl
> https://www.google.de:444
>
> Shortly after that we get an assertion and the debug code says:
>
> conn_type=1 (may be CT_CONN_TYPE_UN_NAT) ?
> src ip 172.217.16.67 dst ip 141.23.xx.xx rev src ip 141.23.xx.xx rev dst
> ip 172.217.16.67 src/dst ports 444/46212 rev src/dst ports 46212/444
> zone/rev zone 2/2 nw_proto/rev nw_proto 6/6
>
> ovs-appctl dpctl/dump-conntrack | grep "444"
>
> tcp,orig=(src=141.23.xx.xx,dst=172.217.16.67,sport=46212,dport=444),reply=(src=172.217.16.67,dst=141.23.xx.xx,sport=444,dport=46212),zone=2,protoinfo=(state=SYN_SENT)
>
> Versions:
> ovs-vsctl --version
> ovs-vsctl (Open vSwitch) 2.17.2
> DB Schema 8.3.0
>
> ovn-controller --version
> ovn-controller 22.03.0
> Open vSwitch Library 2.17.0
> OpenFlow versions 0x6:0x6
> SB DB Schema 20.21.0
>
> DPDK 21.11.2
>
> We are now unsure if this is a misconfiguration or if we hit a bug.
>
> Thanks for any feedback
>
> Michael
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN Failed Flow Offload

2023-02-21 Thread Lazuardi Nasution via discuss
Any progress on this case?

By the way, is it possible to filter reserved broadcast addresses from
being seen by such a VF? I'm thinking of something like the following rule,
but I know it will never work.

ethtool -N ens4f0np0 flow-type ether dst 01:80:c2:00:00:00 m
ff:ff:ff:ff:ff:00 vf 1 queue -1

Best regards.

On Thu, Feb 16, 2023 at 12:26 PM Lazuardi Nasution 
wrote:

> Hi Ajit,
>
> It seem that the flows which are related with broadcast DMAC are appear on
> bridges which are connected to PF/VF like below.
>
> admin@controller01:~$ sudo ovs-appctl dpif/dump-flows type=all br-provider
> recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:47:2b,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
> packets:0, bytes:0, used:never, actions:drop
> recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:23:97,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
> packets:0, bytes:0, used:never, actions:drop
> recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:47:2a,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
> packets:0, bytes:0, used:never, actions:drop
> recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:23:96,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
> packets:0, bytes:0, used:never, actions:drop
> recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=00:03:0f:bb:80:db,dst=01:80:c2:00:00:02),eth_type(0x8809),
> packets:198547, bytes:24619828, used:0.419s, actions:drop
> recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:2d:62,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
> packets:0, bytes:0, used:never, actions:drop
> recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:2d:63,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
> packets:0, bytes:0, used:never, actions:drop
> recirc_id(0),in_port(8),packet_type(ns=0,id=0),eth(src=00:03:0f:bb:80:db,dst=01:80:c2:00:00:02),eth_type(0x8809),
> packets:198547, bytes:24619828, used:0.419s, actions:drop
>
> admin@controller01:~$ sudo ovs-appctl dpif/dump-flows type=all br-int
> tunnel(tun_id=0x0,src=10.10.203.13,dst=10.10.203.11,geneve(),flags(-df+csum+key)),recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
> packets:226653, bytes:14959098, used:0.370s,
> actions:userspace(pid=0,slow_path(bfd))
> tunnel(tun_id=0x0,src=10.10.203.17,dst=10.10.203.11,geneve(),flags(-df+csum+key)),recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
> packets:156086, bytes:10301676, used:0.088s,
> actions:userspace(pid=0,slow_path(bfd))
> tunnel(tun_id=0x0,src=10.10.203.16,dst=10.10.203.11,geneve(),flags(-df+csum+key)),recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
> packets:225335, bytes:14872110, used:0.449s,
> actions:userspace(pid=0,slow_path(bfd))
> tunnel(tun_id=0x0,src=10.10.203.15,dst=10.10.203.11,geneve(),flags(-df+csum+key)),recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
> packets:224925, bytes:14845050, used:0.766s,
> actions:userspace(pid=0,slow_path(bfd))
> tunnel(tun_id=0x0,src=10.10.203.18,dst=10.10.203.11,geneve(),flags(-df+csum+key)),recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
> packets:226408, bytes:14942928, used:0.093s,
> actions:userspace(pid=0,slow_path(bfd))
>
> admin@controller01:~$ sudo ovs-appctl dpif/dump-flows type=all br-tun
> recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=bc:97:e1:05:60:90,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
> packets:0, bytes:0, used:never, actions:drop
> recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=bc:97:e1:05:60:71,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
> packets:0, bytes:0, used:never, actions:drop
> recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=4e:fd:17:56:20:4d,dst=02:41:ed:5e:e4:4f),eth_type(0x8100),vlan(vid=1203,pcp=0),encap(eth_type(0x0800),ipv4(dst=10.10.203.11,proto=17,frag=no),udp(dst=6081)),
> packets:225346, bytes:27041520, used:0.349s, actions:pop_vlan,tnl_pop(1)
> recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=e4:3d:1a:dc:4c:00,dst=01:80:c2:00:00:02),eth_type(0x8809),
> packets:0, bytes:0, used:never, actions:drop
> recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=26:a0:81:bc:46:41,dst=02:41:ed:5e:e4:4f),eth_type(0x8100),vlan(vid=1203,pcp=0),encap(eth_type(0x0800),ipv4(dst=10.10.203.11,proto=17,frag=no),udp(dst=6081)),
> packets:7170, bytes:860400, used:0.708s, actions:pop_vlan,tnl_pop(1)
> recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=3e:79:4e:a0:a0:4b,dst=02:41:ed:5e:e4:4f),eth_type(0x8100),vlan(vid=1203,pcp=0),encap(eth_type(0x0800),ipv4(dst=10.10.203.11,proto=17,frag=no),udp(dst=6081)),
> packets:226664, bytes:27199680, used:0.530s, actions:pop_vlan,tnl_pop(1)
> recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=bc:97:e1:05:56:50,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
> packets:0, bytes:0, used:never, actions:drop
> 

Re: [ovs-discuss] OVN Failed Flow Offload

2023-02-15 Thread Lazuardi Nasution via discuss
Hi Ajit,

It seems that the flows related to the broadcast DMAC appear on bridges that
are connected to the PF/VF, as shown below.

admin@controller01:~$ sudo ovs-appctl dpif/dump-flows type=all br-provider
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:47:2b,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:23:97,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:47:2a,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:23:96,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=00:03:0f:bb:80:db,dst=01:80:c2:00:00:02),eth_type(0x8809),
packets:198547, bytes:24619828, used:0.419s, actions:drop
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:2d:62,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=84:16:0c:f0:2d:63,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(8),packet_type(ns=0,id=0),eth(src=00:03:0f:bb:80:db,dst=01:80:c2:00:00:02),eth_type(0x8809),
packets:198547, bytes:24619828, used:0.419s, actions:drop

admin@controller01:~$ sudo ovs-appctl dpif/dump-flows type=all br-int
tunnel(tun_id=0x0,src=10.10.203.13,dst=10.10.203.11,geneve(),flags(-df+csum+key)),recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
packets:226653, bytes:14959098, used:0.370s,
actions:userspace(pid=0,slow_path(bfd))
tunnel(tun_id=0x0,src=10.10.203.17,dst=10.10.203.11,geneve(),flags(-df+csum+key)),recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
packets:156086, bytes:10301676, used:0.088s,
actions:userspace(pid=0,slow_path(bfd))
tunnel(tun_id=0x0,src=10.10.203.16,dst=10.10.203.11,geneve(),flags(-df+csum+key)),recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
packets:225335, bytes:14872110, used:0.449s,
actions:userspace(pid=0,slow_path(bfd))
tunnel(tun_id=0x0,src=10.10.203.15,dst=10.10.203.11,geneve(),flags(-df+csum+key)),recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
packets:224925, bytes:14845050, used:0.766s,
actions:userspace(pid=0,slow_path(bfd))
tunnel(tun_id=0x0,src=10.10.203.18,dst=10.10.203.11,geneve(),flags(-df+csum+key)),recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=17,frag=no),udp(dst=3784),
packets:226408, bytes:14942928, used:0.093s,
actions:userspace(pid=0,slow_path(bfd))

admin@controller01:~$ sudo ovs-appctl dpif/dump-flows type=all br-tun
recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=bc:97:e1:05:60:90,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=bc:97:e1:05:60:71,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=4e:fd:17:56:20:4d,dst=02:41:ed:5e:e4:4f),eth_type(0x8100),vlan(vid=1203,pcp=0),encap(eth_type(0x0800),ipv4(dst=10.10.203.11,proto=17,frag=no),udp(dst=6081)),
packets:225346, bytes:27041520, used:0.349s, actions:pop_vlan,tnl_pop(1)
recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=e4:3d:1a:dc:4c:00,dst=01:80:c2:00:00:02),eth_type(0x8809),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=26:a0:81:bc:46:41,dst=02:41:ed:5e:e4:4f),eth_type(0x8100),vlan(vid=1203,pcp=0),encap(eth_type(0x0800),ipv4(dst=10.10.203.11,proto=17,frag=no),udp(dst=6081)),
packets:7170, bytes:860400, used:0.708s, actions:pop_vlan,tnl_pop(1)
recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=3e:79:4e:a0:a0:4b,dst=02:41:ed:5e:e4:4f),eth_type(0x8100),vlan(vid=1203,pcp=0),encap(eth_type(0x0800),ipv4(dst=10.10.203.11,proto=17,frag=no),udp(dst=6081)),
packets:226664, bytes:27199680, used:0.530s, actions:pop_vlan,tnl_pop(1)
recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=bc:97:e1:05:56:50,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=00:03:0f:bb:ce:f4,dst=01:80:c2:00:00:02),eth_type(0x8809),
packets:197484, bytes:24488016, used:0.302s, actions:drop
recirc_id(0),in_port(6),packet_type(ns=0,id=0),eth(src=e4:3d:1a:dc:48:80,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(5),packet_type(ns=0,id=0),eth(src=bc:97:e1:05:60:70,dst=01:80:c2:00:00:0e),eth_type(0x88cc),
packets:0, bytes:0, used:never, actions:drop

Re: [ovs-discuss] OVN Failed Flow Offload

2023-02-14 Thread Lazuardi Nasution via discuss
Hi Ajit,

Is there any update on this? If it is a firmware matter, what is the
suggested firmware for enabling flow offload with OVN?
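
In the meantime, a quick way to see which datapath flows actually made it to
the NIC versus staying in software (a sketch; assumes hardware offload has
been enabled in other_config):

ovs-vsctl get Open_vSwitch . other_config:hw-offload   # should be "true"
ovs-appctl dpctl/dump-flows type=offloaded             # flows accepted by the NIC
ovs-appctl dpctl/dump-flows type=ovs                   # flows kept in software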

Best regards.

On Thu, Feb 9, 2023, 12:17 PM Lazuardi Nasution 
wrote:

> Hi Ajit,
>
> I'm using firmware version 219.0.144.0.of
>
> I'm not sure that the problem is about the capability of the firmware. By
> digging the source code of bnxt PMD, it seems that this problem is related
> to bnxt_validate_and_parse_flow_type() function which throws an error if
> the destination Ethernet address is broadcast Ethernet address. I'm using
> the following URL as reference.
>
> https://github.com/DPDK/dpdk/blob/v21.11/drivers/net/bnxt/bnxt_flow.c#L228
>
> From what I can understand of David statement, it should not throw an RTE
> error but just leave an incompatible flow non-offloaded.
>
> Best regards.
>
> On Thu, Feb 9, 2023 at 12:14 AM Ajit Khaparde 
> wrote:
>
>> Hi,
>> From what I can see, it looks like the offload is being attempted on a
>> card which does not have offload functionality enabled.
>> Can you share the FW version on the NICs?
>>
>> If needed, will it be possible for you to update the firmware on the NICs?
>>
>> For the warning regarding flow control setting, let me check and get back.
>>
>> Thanks
>> Ajit
>>
>> On Wed, Feb 8, 2023 at 4:14 AM Lazuardi Nasution 
>> wrote:
>> >
>> > Hi Ajit,
>> >
>> > Have you find the way to overcome this problem? Would you mind to
>> explain why this reserved Ethernet addresses throw error on offloading the
>> flows and not just make related flows non-offloaded?
>> >
>> > Another think, but not so important is bnxt PMD logs warning about
>> cannot do flow control on VF even though I have used none, true or false of
>> interface flow control setting. This warning always appear on OVS
>> restarting.
>> >
>> > Best regards.
>> >
>> > On Tue, Feb 7, 2023, 12:21 AM Lazuardi Nasution 
>> wrote:
>> >>
>> >> Hi Ajit,
>> >>
>> >> I'm using the following versions.
>> >>
>> >> dpdk_version: "DPDK 21.11.2"
>> >> ovs_version : "3.0.1"
>> >>
>> >> Best regards.
>> >>
>> >> On Tue, Feb 7, 2023 at 12:12 AM Ajit Khaparde <
>> ajit.khapa...@broadcom.com> wrote:
>> >>>
>> >>> On Mon, Feb 6, 2023 at 9:02 AM Lazuardi Nasution <
>> mrxlazuar...@gmail.com> wrote:
>> >>> >
>> >>> > Hi David,
>> >>> >
>> >>> > I think I can understand your opinion. So your target is to prevent
>> frames with those ethernet addresses from reaching CP, right? FYI, I'm
>> using bonded VFs of bonded PFs as OVS-DPDK interfaces, so offcourse LACP
>> should be handled by bonded PFs only.
>> >>> What is the version of DPDK & OVS used here, BTW? Thanks
>> >>>
>> >>> >
>> >>> > Best regards,
>> >>> >
>> >>> > On Mon, Feb 6, 2023 at 11:54 PM David Marchand <
>> david.march...@redhat.com> wrote:
>> >>> >>
>> >>> >> On Mon, Feb 6, 2023 at 5:46 PM Lazuardi Nasution <
>> mrxlazuar...@gmail.com> wrote:
>> >>> >> >
>> >>> >> > HI David,
>> >>> >> >
>> >>> >> > Don't you think that offload of reserved Ethernet address should
>> be disabled by default?
>> >>> >>
>> >>> >> What OVN requests in this trace (dropping) makes sense to me if
>> those
>> >>> >> lacp frames are to be ignored at the CP level.
>> >>> >> I don't see why some ethernet address would require some special
>> >>> >> offloading considerations, but maybe others have a better opinion
>> on
>> >>> >> this topic.
>> >>> >>
>> >>> >>
>> >>> >> --
>> >>> >> David Marchand
>> >>> >>
>>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN Failed Flow Offload

2023-02-08 Thread Lazuardi Nasution via discuss
Hi Ajit,

I'm using firmware version 219.0.144.0.

I'm not sure that the problem is about the capability of the firmware. By
digging into the source code of the bnxt PMD, it seems that this problem is
related to the bnxt_validate_and_parse_flow_type() function, which throws an
error if the destination Ethernet address is a broadcast Ethernet address.
I'm using the following URL as a reference.

https://github.com/DPDK/dpdk/blob/v21.11/drivers/net/bnxt/bnxt_flow.c#L228

From what I can understand of David's statement, it should not throw an RTE
error but just leave an incompatible flow non-offloaded.
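
For clarity, here is a rough C sketch of the kind of check I mean. It only
paraphrases the behaviour described above (it is not the verbatim driver
code, and the exact predicate in bnxt_flow.c may test broadcast or
non-unicast DMACs differently):

/* Illustrative paraphrase only, not verbatim bnxt driver code: a DMAC
 * check that rejects the whole rte_flow request with an error instead of
 * just reporting the flow as not offloadable. LLDP destinations such as
 * 01:80:c2:00:00:0e are non-unicast, so they end up on this path and OVS
 * logs "DMAC is invalid" on every offload attempt. */
#include <errno.h>
#include <rte_ether.h>
#include <rte_flow.h>

static int
check_dmac(const struct rte_flow_item_eth *eth_spec,
           const struct rte_flow_item *item,
           struct rte_flow_error *error)
{
    if (!rte_is_unicast_ether_addr(&eth_spec->hdr.dst_addr)) {
        /* Returns a negative errno; the caller aborts the flow create. */
        return rte_flow_error_set(error, EINVAL,
                                  RTE_FLOW_ERROR_TYPE_ITEM, item,
                                  "DMAC is invalid");
    }
    return 0;
}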

Best regards.

On Thu, Feb 9, 2023 at 12:14 AM Ajit Khaparde 
wrote:

> Hi,
> From what I can see, it looks like the offload is being attempted on a
> card which does not have offload functionality enabled.
> Can you share the FW version on the NICs?
>
> If needed, will it be possible for you to update the firmware on the NICs?
>
> For the warning regarding flow control setting, let me check and get back.
>
> Thanks
> Ajit
>
> On Wed, Feb 8, 2023 at 4:14 AM Lazuardi Nasution 
> wrote:
> >
> > Hi Ajit,
> >
> > Have you find the way to overcome this problem? Would you mind to
> explain why this reserved Ethernet addresses throw error on offloading the
> flows and not just make related flows non-offloaded?
> >
> > Another think, but not so important is bnxt PMD logs warning about
> cannot do flow control on VF even though I have used none, true or false of
> interface flow control setting. This warning always appear on OVS
> restarting.
> >
> > Best regards.
> >
> > On Tue, Feb 7, 2023, 12:21 AM Lazuardi Nasution 
> wrote:
> >>
> >> Hi Ajit,
> >>
> >> I'm using the following versions.
> >>
> >> dpdk_version: "DPDK 21.11.2"
> >> ovs_version : "3.0.1"
> >>
> >> Best regards.
> >>
> >> On Tue, Feb 7, 2023 at 12:12 AM Ajit Khaparde <
> ajit.khapa...@broadcom.com> wrote:
> >>>
> >>> On Mon, Feb 6, 2023 at 9:02 AM Lazuardi Nasution <
> mrxlazuar...@gmail.com> wrote:
> >>> >
> >>> > Hi David,
> >>> >
> >>> > I think I can understand your opinion. So your target is to prevent
> frames with those ethernet addresses from reaching CP, right? FYI, I'm
> using bonded VFs of bonded PFs as OVS-DPDK interfaces, so offcourse LACP
> should be handled by bonded PFs only.
> >>> What is the version of DPDK & OVS used here, BTW? Thanks
> >>>
> >>> >
> >>> > Best regards,
> >>> >
> >>> > On Mon, Feb 6, 2023 at 11:54 PM David Marchand <
> david.march...@redhat.com> wrote:
> >>> >>
> >>> >> On Mon, Feb 6, 2023 at 5:46 PM Lazuardi Nasution <
> mrxlazuar...@gmail.com> wrote:
> >>> >> >
> >>> >> > HI David,
> >>> >> >
> >>> >> > Don't you think that offload of reserved Ethernet address should
> be disabled by default?
> >>> >>
> >>> >> What OVN requests in this trace (dropping) makes sense to me if
> those
> >>> >> lacp frames are to be ignored at the CP level.
> >>> >> I don't see why some ethernet address would require some special
> >>> >> offloading considerations, but maybe others have a better opinion on
> >>> >> this topic.
> >>> >>
> >>> >>
> >>> >> --
> >>> >> David Marchand
> >>> >>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN Failed Flow Offload

2023-02-08 Thread Lazuardi Nasution via discuss
Hi Ajit,

Have you found a way to overcome this problem? Would you mind explaining
why these reserved Ethernet addresses throw an error when offloading the
flows, instead of just leaving the related flows non-offloaded?

Another thing, though less important: the bnxt PMD logs a warning that flow
control cannot be configured on a VF, even though I have tried none, true
and false for the interface flow-control setting. This warning always
appears when OVS restarts.

Best regards.

On Tue, Feb 7, 2023, 12:21 AM Lazuardi Nasution 
wrote:

> Hi Ajit,
>
> I'm using the following versions.
>
> dpdk_version: "DPDK 21.11.2"
> ovs_version : "3.0.1"
>
> Best regards.
>
> On Tue, Feb 7, 2023 at 12:12 AM Ajit Khaparde 
> wrote:
>
>> On Mon, Feb 6, 2023 at 9:02 AM Lazuardi Nasution 
>> wrote:
>> >
>> > Hi David,
>> >
>> > I think I can understand your opinion. So your target is to prevent
>> frames with those ethernet addresses from reaching CP, right? FYI, I'm
>> using bonded VFs of bonded PFs as OVS-DPDK interfaces, so offcourse LACP
>> should be handled by bonded PFs only.
>> What is the version of DPDK & OVS used here, BTW? Thanks
>>
>> >
>> > Best regards,
>> >
>> > On Mon, Feb 6, 2023 at 11:54 PM David Marchand <
>> david.march...@redhat.com> wrote:
>> >>
>> >> On Mon, Feb 6, 2023 at 5:46 PM Lazuardi Nasution <
>> mrxlazuar...@gmail.com> wrote:
>> >> >
>> >> > HI David,
>> >> >
>> >> > Don't you think that offload of reserved Ethernet address should be
>> disabled by default?
>> >>
>> >> What OVN requests in this trace (dropping) makes sense to me if those
>> >> lacp frames are to be ignored at the CP level.
>> >> I don't see why some ethernet address would require some special
>> >> offloading considerations, but maybe others have a better opinion on
>> >> this topic.
>> >>
>> >>
>> >> --
>> >> David Marchand
>> >>
>>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN Failed Flow Offload

2023-02-06 Thread Lazuardi Nasution via discuss
Hi Ajit,

I'm using the following versions.

dpdk_version: "DPDK 21.11.2"
ovs_version : "3.0.1"

Best regards.

On Tue, Feb 7, 2023 at 12:12 AM Ajit Khaparde 
wrote:

> On Mon, Feb 6, 2023 at 9:02 AM Lazuardi Nasution 
> wrote:
> >
> > Hi David,
> >
> > I think I can understand your opinion. So your target is to prevent
> frames with those ethernet addresses from reaching CP, right? FYI, I'm
> using bonded VFs of bonded PFs as OVS-DPDK interfaces, so offcourse LACP
> should be handled by bonded PFs only.
> What is the version of DPDK & OVS used here, BTW? Thanks
>
> >
> > Best regards,
> >
> > On Mon, Feb 6, 2023 at 11:54 PM David Marchand <
> david.march...@redhat.com> wrote:
> >>
> >> On Mon, Feb 6, 2023 at 5:46 PM Lazuardi Nasution <
> mrxlazuar...@gmail.com> wrote:
> >> >
> >> > HI David,
> >> >
> >> > Don't you think that offload of reserved Ethernet address should be
> disabled by default?
> >>
> >> What OVN requests in this trace (dropping) makes sense to me if those
> >> lacp frames are to be ignored at the CP level.
> >> I don't see why some ethernet address would require some special
> >> offloading considerations, but maybe others have a better opinion on
> >> this topic.
> >>
> >>
> >> --
> >> David Marchand
> >>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN Failed Flow Offload

2023-02-06 Thread Lazuardi Nasution via discuss
Hi David,

I think I can understand your opinion. So your target is to prevent frames
with those Ethernet addresses from reaching the CP, right? FYI, I'm using
bonded VFs of bonded PFs as OVS-DPDK interfaces, so of course LACP should
be handled by the bonded PFs only.

Best regards,

On Mon, Feb 6, 2023 at 11:54 PM David Marchand 
wrote:

> On Mon, Feb 6, 2023 at 5:46 PM Lazuardi Nasution 
> wrote:
> >
> > HI David,
> >
> > Don't you think that offload of reserved Ethernet address should be
> disabled by default?
>
> What OVN requests in this trace (dropping) makes sense to me if those
> lacp frames are to be ignored at the CP level.
> I don't see why some ethernet address would require some special
> offloading considerations, but maybe others have a better opinion on
> this topic.
>
>
> --
> David Marchand
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN Failed Flow Offload

2023-02-06 Thread Lazuardi Nasution via discuss
Hi Ajit,

If you think that it should not be offloaded, please suggest to the OVS
development team that reserved Ethernet addresses be excluded from
offloading. As far as I know, LLDP Ethernet addresses and other reserved
Ethernet addresses should be processed by software (the control plane).
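
To illustrate the idea, here is a hypothetical helper (it is not existing
OVS code, just a sketch of the kind of filter I mean); something like it in
the netdev offload path could skip rte_flow offload for such flows and
leave them to the software datapath:

/* Hypothetical helper, illustration only (not existing OVS code).
 * Returns true when the destination MAC falls in the IEEE 802.1D/802.1Q
 * reserved range 01:80:c2:00:00:00 .. 01:80:c2:00:00:0f (STP, LACP,
 * LLDP, ...), which bridges are expected to handle in software. */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static bool
eth_dst_is_reserved(const uint8_t dst[6])
{
    static const uint8_t prefix[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };

    return memcmp(dst, prefix, sizeof prefix) == 0
           && (dst[5] & 0xf0) == 0x00;
}

A caller could then keep such a flow in the software datapath instead of
attempting rte_flow_create() at all when eth_dst_is_reserved() is true.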

Best regards.

On Mon, Feb 6, 2023 at 11:34 PM Ajit Khaparde 
wrote:

> On Mon, Feb 6, 2023 at 7:36 AM David Marchand 
> wrote:
> >
> > Hello,
> >
> > On Mon, Feb 6, 2023 at 4:16 PM Lazuardi Nasution via discuss
> >  wrote:
> > >
> > > Hi,
> > >
> > > I'm deploying OVN with OVS-DPDK. When I'm trying to offload flows by
> using rte_flow, I find the following logs appear every second. How to get
> rid those offload incompatible flows warning/error logs? How can I safely
> select which offload incompatible flow should not be offloaded?
> >
> > I don't think there is a way to select offloaded flows in OVN.
> >
> >
> > >
> > >
> 2023-02-06T15:04:03.588Z|00134|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb201:
> rte_flow creation failed: 13 (DMAC is invalid).
> > >
> 2023-02-06T15:04:03.588Z|00135|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb201:
> Failed flow:   flow create 3 ingress priority 0 group 0 transfer pattern
> eth src is e4:3d:1a:dc:48:80 dst is 01:80:c2:00:00:0e type is 0x88cc
> has_vlan is 0 / end actions count / drop / end
> > >
> 2023-02-06T15:04:03.588Z|00136|dpdk(hw_offload105)|ERR|bnxt_validate_and_parse_flow_type():
> DMAC is invalid!
> >
> > This flow (for lldp frames) is rejected by the DPDK net/bnxt driver.
> > Copying this driver MAINTAINERS, maybe they can give the reason why
> > this driver won't accept such destination mac address.
> Thanks for bringing this to our attention.
> We will take a look at this and get back.
> I don't have the answer on the top of my head right now.
>
> Thanks
> Ajit
>
> >
>
> >
> > --
> > David Marchand
> >
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN Failed Flow Offload

2023-02-06 Thread Lazuardi Nasution via discuss
Hi David,

Don't you think that offloading of reserved Ethernet addresses should be
disabled by default?

Best regards.

On Mon, Feb 6, 2023 at 10:36 PM David Marchand 
wrote:

> Hello,
>
> On Mon, Feb 6, 2023 at 4:16 PM Lazuardi Nasution via discuss
>  wrote:
> >
> > Hi,
> >
> > I'm deploying OVN with OVS-DPDK. When I'm trying to offload flows by
> using rte_flow, I find the following logs appear every second. How to get
> rid those offload incompatible flows warning/error logs? How can I safely
> select which offload incompatible flow should not be offloaded?
>
> I don't think there is a way to select offloaded flows in OVN.
>
>
> >
> >
> 2023-02-06T15:04:03.588Z|00134|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb201:
> rte_flow creation failed: 13 (DMAC is invalid).
> >
> 2023-02-06T15:04:03.588Z|00135|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb201:
> Failed flow:   flow create 3 ingress priority 0 group 0 transfer pattern
> eth src is e4:3d:1a:dc:48:80 dst is 01:80:c2:00:00:0e type is 0x88cc
> has_vlan is 0 / end actions count / drop / end
> >
> 2023-02-06T15:04:03.588Z|00136|dpdk(hw_offload105)|ERR|bnxt_validate_and_parse_flow_type():
> DMAC is invalid!
>
> This flow (for lldp frames) is rejected by the DPDK net/bnxt driver.
> Copying this driver MAINTAINERS, maybe they can give the reason why
> this driver won't accept such destination mac address.
>
>
> --
> David Marchand
>
>
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] OVN Failed Flow Offload

2023-02-06 Thread Lazuardi Nasution via discuss
Hi,

I'm deploying OVN with OVS-DPDK. When I try to offload flows by using
rte_flow, I find that the following logs appear every second. How can I get
rid of these warning/error logs about offload-incompatible flows? How can I
safely select which offload-incompatible flows should not be offloaded? (A
rough standalone reproducer sketch follows the logs below.)

2023-02-06T15:04:03.588Z|00134|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb201:
rte_flow creation failed: 13 (DMAC is invalid).
2023-02-06T15:04:03.588Z|00135|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb201:
Failed flow:   flow create 3 ingress priority 0 group 0 transfer pattern
eth src is e4:3d:1a:dc:48:80 dst is 01:80:c2:00:00:0e type is 0x88cc
has_vlan is 0 / end actions count / drop / end
2023-02-06T15:04:03.588Z|00136|dpdk(hw_offload105)|ERR|bnxt_validate_and_parse_flow_type():
DMAC is invalid!
2023-02-06T15:04:03.588Z|00137|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb201:
rte_flow creation failed: 13 (DMAC is invalid).
2023-02-06T15:04:03.588Z|00138|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb201:
Failed flow:   flow create 3 ingress priority 0 group 0 pattern eth src is
e4:3d:1a:dc:48:80 dst is 01:80:c2:00:00:0e type is 0x88cc has_vlan is 0 /
end actions mark id 1 / rss / end
2023-02-06T15:04:03.588Z|00139|dpdk(hw_offload105)|ERR|bnxt_validate_and_parse_flow_type():
DMAC is invalid!
2023-02-06T15:04:03.588Z|00140|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb200:
rte_flow creation failed: 13 (DMAC is invalid).
2023-02-06T15:04:03.588Z|00141|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb200:
Failed flow:   flow create 2 ingress priority 0 group 0 transfer pattern
eth src is e4:3d:1a:dc:48:81 dst is 01:80:c2:00:00:0e type is 0x88cc
has_vlan is 0 / end actions count / drop / end
2023-02-06T15:04:03.588Z|00142|dpdk(hw_offload105)|ERR|bnxt_validate_and_parse_flow_type():
DMAC is invalid!
2023-02-06T15:04:03.588Z|00143|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb200:
rte_flow creation failed: 13 (DMAC is invalid).
2023-02-06T15:04:03.588Z|00144|netdev_offload_dpdk(hw_offload105)|WARN|dpdkb200:
Failed flow:   flow create 2 ingress priority 0 group 0 pattern eth src is
e4:3d:1a:dc:48:81 dst is 01:80:c2:00:00:0e type is 0x88cc has_vlan is 0 /
end actions mark id 1 / rss / end
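
For reference, below is a rough standalone reproducer. It is hypothetical
code that assumes a DPDK 21.11 application with the bnxt port already
initialised; it builds approximately the same pattern and actions as the
first failed flow above and asks the PMD to validate it, so the "DMAC is
invalid" rejection can be observed outside OVS:

#include <stdio.h>
#include <string.h>
#include <rte_byteorder.h>
#include <rte_ethdev.h>
#include <rte_flow.h>

/* Validate a flow roughly equivalent to:
 *   pattern eth dst is 01:80:c2:00:00:0e type is 0x88cc / end
 *   actions count / drop / end
 * on the given port. Returns 0 if the PMD accepts it. */
static int
validate_lldp_drop_flow(uint16_t port_id)
{
    struct rte_flow_attr attr = { .ingress = 1, .transfer = 1 };
    struct rte_flow_item_eth eth_spec, eth_mask;
    struct rte_flow_item pattern[2];
    struct rte_flow_action actions[3];
    struct rte_flow_error err;
    const uint8_t lldp_dst[6] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x0e };
    int ret;

    memset(&eth_spec, 0, sizeof eth_spec);
    memset(&eth_mask, 0, sizeof eth_mask);
    memset(&err, 0, sizeof err);

    memcpy(eth_spec.hdr.dst_addr.addr_bytes, lldp_dst, sizeof lldp_dst);
    eth_spec.hdr.ether_type = rte_cpu_to_be_16(0x88cc);   /* LLDP */
    memset(eth_mask.hdr.dst_addr.addr_bytes, 0xff, 6);
    eth_mask.hdr.ether_type = RTE_BE16(0xffff);

    pattern[0] = (struct rte_flow_item) {
        .type = RTE_FLOW_ITEM_TYPE_ETH,
        .spec = &eth_spec,
        .mask = &eth_mask,
    };
    pattern[1] = (struct rte_flow_item) { .type = RTE_FLOW_ITEM_TYPE_END };

    actions[0] = (struct rte_flow_action) { .type = RTE_FLOW_ACTION_TYPE_COUNT };
    actions[1] = (struct rte_flow_action) { .type = RTE_FLOW_ACTION_TYPE_DROP };
    actions[2] = (struct rte_flow_action) { .type = RTE_FLOW_ACTION_TYPE_END };

    ret = rte_flow_validate(port_id, &attr, pattern, actions, &err);
    if (ret) {
        printf("flow rejected: %d (%s)\n", ret,
               err.message ? err.message : "no error message");
    }
    return ret;
}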

Best regards.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] OVS-DPDK with Bonded bnxt VFs PMD

2023-01-23 Thread Lazuardi Nasution via discuss
Hi,

I have a problem using bonded bnxt VF PMDs with OVS-DPDK for OVN. I have
successfully attached both VF PMDs to br-int. I have created a br-int.vlanid
internal port and given it an IP address as the GENEVE tunnel address, since
the tunnel should use that VLAN ID. Both respective PFs are LACP-bonded in
the kernel, and the kernel bonding is working. I have tried balance-tcp
(with LACP) and balance-slb (without LACP) modes of DPDK bonding, but the
virtual ports of a logical switch cannot ping each other if they are on
different nodes. I can see that the kernel cannot get the ARP of the remote
tunnel address, and if I add a static ARP entry the tunnel addresses still
cannot ping each other. Does anybody have experience with OVS-DPDK using
bonded bnxt VF PMDs where the respective PFs are LACP-bonded in the kernel?

Best regards.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVS-DPDK in Virtual Machine (KVM)

2022-11-20 Thread Lazuardi Nasution via discuss
Hi,

It seems that you can attach an SR-IOV VF of the NIC to the VM. You may
use a suitable PMD for that NIC for that purpose.

Best regards.

Date: Fri, 18 Nov 2022 08:51:51 +
> From: Rohan Bose 
> To: "ovs-discuss@openvswitch.org" 
> Subject: [ovs-discuss] OVS-DPDK in Virtual Machine (KVM)
> Message-ID: <81078bbb1ac24a0193483afa8e499...@mailbox.tu-dresden.de>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hello all,
>
>
> I need to run OVS-DPDK version inside several virtual machines for testing
> out some scenarios. Is it possible to install OVS-DPDK inside the VM and
> configure one ore more of its interfaces as dpdk ports? If yes what type of
> interfaces do I need to attach to the VM and which pmd drivers can I use
> for that?
>
>
> Thanks and Regards,
>
> Rohan
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss