[ovs-dev] [PATCH v6] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-17 Thread Balazs Nemeth
The only way that stats->{n_packets,n_bytes} would decrease is due to an
overflow, or if there are bugs in how statistics are handled. In the
past, there were multiple issues that caused a jump backward. A
workaround was in place to set the statistics to 0 in that case. When
this happened while the revalidator was under heavy load, the workaround
had an unintended side effect where should_revalidate returned false
causing the flow to be removed because the metric it calculated was
based on a bogus value. Since many of those bugs have now been
identified and resolved, there is no need to set the statistics to 0. In
addition, the (unlikely) overflow still needs to be handled
appropriately. If an unexpected jump does happen, just log it as a
warning.

Signed-off-by: Balazs Nemeth 
---
 ofproto/ofproto-dpif-upcall.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index cd57fdbd9..3cf080c75 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -2372,18 +2372,17 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
*ukey,
 
 push.used = stats->used;
 push.tcp_flags = stats->tcp_flags;
-push.n_packets = (stats->n_packets > ukey->stats.n_packets
-  ? stats->n_packets - ukey->stats.n_packets
-  : 0);
-push.n_bytes = (stats->n_bytes > ukey->stats.n_bytes
-? stats->n_bytes - ukey->stats.n_bytes
-: 0);
+push.n_packets = stats->n_packets - ukey->stats.n_packets;
+push.n_bytes = stats->n_bytes - ukey->stats.n_bytes;
 
 if (stats->n_packets < ukey->stats.n_packets &&
 ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
 /* Report cases where the packet counter is lower than the previous
  * instance, but exclude the potential wrapping of an uint64_t. */
 COVERAGE_INC(ukey_invalid_stat_reset);
+
+VLOG_WARN("Unexpected jump in packet stats from %"PRIu64" to %"PRIu64,
+stats->n_packets, ukey->stats.n_packets);
 }
 
 if (need_revalidate) {
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6] ofproto-dpif-upcall: Don't set statistics to 0 when they jump back

2023-05-17 Thread Ilya Maximets
On 5/17/23 14:50, Balazs Nemeth wrote:
> The only way that stats->{n_packets,n_bytes} would decrease is due to an
> overflow, or if there are bugs in how statistics are handled. In the
> past, there were multiple issues that caused a jump backward. A
> workaround was in place to set the statistics to 0 in that case. When
> this happened while the revalidator was under heavy load, the workaround
> had an unintended side effect where should_revalidate returned false
> causing the flow to be removed because the metric it calculated was
> based on a bogus value. Since many of those bugs have now been
> identified and resolved, there is no need to set the statistics to 0. In
> addition, the (unlikely) overflow still needs to be handled
> appropriately. If an unexpected jump does happen, just log it as a
> warning.
> 
> Signed-off-by: Balazs Nemeth 
> ---
>  ofproto/ofproto-dpif-upcall.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index cd57fdbd9..3cf080c75 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -2372,18 +2372,17 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
> *ukey,
>  
>  push.used = stats->used;
>  push.tcp_flags = stats->tcp_flags;
> -push.n_packets = (stats->n_packets > ukey->stats.n_packets
> -  ? stats->n_packets - ukey->stats.n_packets
> -  : 0);
> -push.n_bytes = (stats->n_bytes > ukey->stats.n_bytes
> -? stats->n_bytes - ukey->stats.n_bytes
> -: 0);
> +push.n_packets = stats->n_packets - ukey->stats.n_packets;
> +push.n_bytes = stats->n_bytes - ukey->stats.n_bytes;
>  
>  if (stats->n_packets < ukey->stats.n_packets &&
>  ukey->stats.n_packets < UINT64_THREE_QUARTERS) {
>  /* Report cases where the packet counter is lower than the previous
>   * instance, but exclude the potential wrapping of an uint64_t. */
>  COVERAGE_INC(ukey_invalid_stat_reset);
> +
> +VLOG_WARN("Unexpected jump in packet stats from %"PRIu64" to 
> %"PRIu64,
> +stats->n_packets, ukey->stats.n_packets);

Hi, Balazs.  We should, probbaly, print out the key and actions,
so we can better understand which flow caused the issue.
We should be able to use odp_flow_key_format() and format_odp_actions()
for this.  ukey contains the key.

What do you think?

We should also rate-limit the log just in case.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev