On Thu, May 23, 2024 at 6:47 AM Roi Dayan via dev
<ovs-dev@openvswitch.org> wrote:
>
> It is observed in some environments that there are much more ukeys than
> actual DP flows. For example:
>
> $ ovs-appctl upcall/show
> system@ovs-system:
> flows : (current 7) (avg 6) (max 117) (limit 2125)
> offloaded flows : 525
> dump duration : 1063ms
> ufid enabled : true
>
> 23: (keys 3612)
> 24: (keys 3625)
> 25: (keys 3485)
>
> The revalidator threads are busy revalidating the stale ukeys leading to
> high CPU and long dump duration.
>
> There are some possible situations that may result in stale ukeys that
> have no corresponding DP flows.
>
> In revalidator, push_dp_ops() doesn't check error if the op type is not
> DEL. It is possible that a PUT(MODIFY) fails, especially for tc offload
> case, where the old flow is deleted first and then the new one is
> created. If the creation fails, the ukey will be stale (no corresponding
> DP flow). This patch adds a warning in such case.
>
> Another possible scenario is in handle_upcalls() if a PUT operation did
> not succeed and op->error attribute was not set correctly it can lead to
> stale ukey in operational state.
>
> This patch adds checks in the sweep phase for such ukeys and move them
> to DELETE so that they can be cleared eventually.
>
> Co-authored-by: Han Zhou <hz...@ovn.org>
> Signed-off-by: Han Zhou <hz...@ovn.org>
> Signed-off-by: Roi Dayan <r...@nvidia.com>
> ---
>  ofproto/ofproto-dpif-upcall.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index 83609ec62b63..e9520ebdf910 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -57,6 +57,7 @@ COVERAGE_DEFINE(dumped_inconsistent_flow);
>  COVERAGE_DEFINE(dumped_new_flow);
>  COVERAGE_DEFINE(handler_duplicate_upcall);
>  COVERAGE_DEFINE(revalidate_missed_dp_flow);
> +COVERAGE_DEFINE(revalidate_missed_dp_flow_del);
>  COVERAGE_DEFINE(ukey_dp_change);
>  COVERAGE_DEFINE(ukey_invalid_stat_reset);
>  COVERAGE_DEFINE(ukey_replace_contention);
> @@ -278,6 +279,7 @@ enum flow_del_reason {
>      FDR_BAD_ODP_FIT,        /* Bad ODP flow fit. */
>      FDR_FLOW_IDLE,          /* Flow idle timeout. */
>      FDR_FLOW_LIMIT,         /* Kill all flows condition reached. */
> +    FDR_FLOW_STALE,         /* Flow stale detected. */
>      FDR_FLOW_WILDCARDED,    /* Flow needs a narrower wildcard mask. */
>      FDR_NO_OFPROTO,         /* Bridge not found. */
>      FDR_PURGE,              /* User requested flow deletion. */
> @@ -2557,6 +2559,10 @@ push_dp_ops(struct udpif *udpif, struct ukey_op *ops, 
> size_t n_ops)
>
>          if (op->dop.type != DPIF_OP_FLOW_DEL) {
>              /* Only deleted flows need their stats pushed. */
> +            if (op->dop.error) {
> +                VLOG_WARN_RL(&rl, "push_dp_ops: error %d in op type %d, ukey 
> "
> +                             "%p", op->dop.error, op->dop.type, op->ukey);
> +            }
>              continue;
>          }
>
> @@ -3027,6 +3033,15 @@ revalidator_sweep__(struct revalidator *revalidator, 
> bool purge)
>                      del_reason = purge ? FDR_PURGE : FDR_UPDATE_FAIL;
>                  } else if (!seq_mismatch) {
>                      result = UKEY_KEEP;
> +                } else if (!ukey->stats.used &&

Would it be possible for stats.used to be set but the dp flow to be
deleted? For example, if a flow is offloaded to TC, but something
external to OVS clears it?


Thanks,
Mike

> +                           udpif_flow_time_delta(udpif, ukey) * 1000 >
> +                           ofproto_max_idle) {
> +                    COVERAGE_INC(revalidate_missed_dp_flow_del);
> +                    VLOG_WARN_RL(&rl, "revalidator_sweep__: Remove stale 
> ukey "
> +                                 "%p delta %llus", ukey,
> +                                 udpif_flow_time_delta(udpif, ukey));
> +                    result = UKEY_DELETE;
> +                    del_reason = FDR_FLOW_STALE;
>                  } else {
>                      struct dpif_flow_stats stats;
>                      COVERAGE_INC(revalidate_missed_dp_flow);
> --
> 2.21.0
>
> _______________________________________________
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to