On 30 Sep 2022, at 13:12, Simon Horman wrote:

> From: Yifan Li <yifan...@corigine.com>
>
> Allow revalidator for hardware offload of meters via OVS-TC.
> Without revalidator, tc meter action can not be deleted while
> flow exists. The revalidator fix this bug by continuously
> checking existing meters and delete the unneeded ones. The
> autotest cases of revalidator are also added.

Hi Yifan,

I understand the problem, but I’m wondering why you decided to revalidate all 
the meters. Would it not be better to just keep track of a list of meters where 
deletion failed, and only handle those? Or maybe even better, check if the 
meter needs cleaning up when a flow with a meter action gets deleted. Maybe 
there are better/other ways but I need to give it some thought. Jainbo?

As mentioned before I think we should avoid doing all this extra work when 
there can be a simpler solution to the problem.


Some comments are below, which was not because I did a full review, but just 
noticed them when glancing over it.

> Signed-off-by: Yifan Li <yifan...@corigine.com>
> Signed-off-by: Simon Horman <simon.hor...@corigine.com>
> ---
>  lib/dpif-netdev.c                |   1 +
>  lib/dpif-netlink.c               | 201 +++++++++++++++++++++++++++++++
>  lib/dpif-netlink.h               |   2 +
>  lib/dpif-provider.h              |   4 +
>  lib/dpif.c                       |   7 ++
>  lib/dpif.h                       |   2 +
>  lib/id-pool.c                    |  13 ++
>  lib/id-pool.h                    |   3 +
>  lib/netdev-linux.c               |   6 +
>  lib/netdev-offload-tc.c          |  11 +-
>  lib/tc.c                         |   5 -
>  lib/tc.h                         |   9 ++
>  ofproto/ofproto-dpif-upcall.c    |   5 +
>  tests/system-offloads-traffic.at |   4 +
>  14 files changed, 267 insertions(+), 6 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index a45b460145c6..365aacadb03a 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -9583,6 +9583,7 @@ const struct dpif_class dpif_netdev_class = {
>      dpif_netdev_meter_set,
>      dpif_netdev_meter_get,
>      dpif_netdev_meter_del,
> +    NULL,                       /* meter_revalidate */
>      dpif_netdev_bond_add,
>      dpif_netdev_bond_del,
>      dpif_netdev_bond_stats_get,
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index a620a6ec52dd..fac3535e1748 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -25,6 +25,7 @@
>  #include <net/if.h>
>  #include <linux/types.h>
>  #include <linux/pkt_sched.h>
> +#include <linux/rtnetlink.h>
>  #include <poll.h>
>  #include <stdlib.h>
>  #include <strings.h>
> @@ -61,6 +62,7 @@
>  #include "packets.h"
>  #include "random.h"
>  #include "sset.h"
> +#include "tc.h"
>  #include "timeval.h"
>  #include "unaligned.h"
>  #include "util.h"
> @@ -4161,6 +4163,191 @@ dpif_netlink_meter_get_features(const struct dpif 
> *dpif_,
>      ofpbuf_delete(msg);
>  }
>
> +static bool dpif_netlink_meter_should_revalidate(struct id_pool *meter_ids,
> +                                                 uint32_t meter_id)
> +{
> +    return !id_pool_id_exist(meter_ids, meter_id);
> +}
> +
> +static void
> +dpif_tc_meter_revalidate(struct dpif *dpif_ OVS_UNUSED,
> +                         struct id_pool *meter_ids, struct ofpbuf *reply)
> +{
> +    static struct nl_policy actions_orders_policy[ACT_MAX_NUM + 1] = {};
> +    struct nlattr *actions_orders[ARRAY_SIZE(actions_orders_policy)];
> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,20);
> +    static const struct nl_policy tca_root_policy[] = {
> +        [TCA_ACT_TAB] = { .type = NL_A_NESTED, .optional = false },
> +        [TCA_ROOT_COUNT] = { .type = NL_A_U32, .optional = false },
> +    };
> +    struct nlattr *action_root_attrs[ARRAY_SIZE(tca_root_policy)];
> +    static const struct nl_policy police_policy[] = {
> +        [TCA_POLICE_TBF] = { .type = NL_A_UNSPEC,
> +                             .min_len = sizeof(struct tc_police),
> +                             .optional = false},
> +    };
> +    struct nlattr *action_police_tab[ARRAY_SIZE(police_policy)];
> +    static const struct nl_policy act_policy[] = {
> +        [TCA_ACT_KIND] = { .type = NL_A_STRING, .optional = false, },
> +        [TCA_ACT_COOKIE] = { .type = NL_A_UNSPEC, .optional = true, },
> +        [TCA_ACT_OPTIONS] = { .type = NL_A_NESTED, .optional = true, },
> +        [TCA_ACT_STATS] = { .type = NL_A_NESTED, .optional = false, },
> +    };
> +    struct nlattr *action_police_attrs[ARRAY_SIZE(act_policy)];
> +    const int max_size = ARRAY_SIZE(actions_orders_policy);
> +    const struct tc_police *tc_police = NULL;
> +    ofproto_meter_id meter_id;
> +    size_t revalidate_num;
> +    size_t act_count;
> +    uint32_t index;
> +    int i;
> +
> +    if (!reply) {
> +        VLOG_ERR_RL(&rl, "Null reply message during meter revalidation");
> +        return;
> +    }
> +
> +    if (reply->size <= NLMSG_ALIGNTO + NLMSG_HDRLEN) {
> +        VLOG_DBG_RL(&rl, "No meters present in tc during meter "
> +                    "revalidation");
> +        return;
> +    }
> +
> +    if (!nl_policy_parse(reply, NLMSG_HDRLEN + sizeof(struct tcamsg),
> +                        tca_root_policy, action_root_attrs,
> +                        ARRAY_SIZE(action_root_attrs))) {
> +        VLOG_ERR_RL(&rl, "Failed to parse reply message during meter "
> +                    "revalidation");
> +        return;
> +    }
> +
> +    act_count = nl_attr_get_u32(action_root_attrs[TCA_ROOT_COUNT]);
> +    if (!act_count) {
> +        VLOG_ERR_RL(&rl, "No police action returned in message during "
> +                    "meter revalidation");
> +        return;
> +    }
> +
> +    for (i = 0; i < max_size; i++) {
> +        actions_orders_policy[i].type = NL_A_NESTED;
> +        actions_orders_policy[i].optional = true;
> +    }
> +
> +    revalidate_num = act_count > ACT_MAX_NUM ?
> +                                (ACT_MAX_NUM + 1) : (act_count + 1);
> +    if (!nl_parse_nested(action_root_attrs[TCA_ACT_TAB], 
> actions_orders_policy,
> +                        actions_orders, revalidate_num)) {
> +        VLOG_ERR_RL(&rl, "Failed to parse TCA_ACT_TAB during meter "
> +                    "revalidation of act_count");
> +        return;
> +    }
> +
> +    for (i = 0; i < revalidate_num; i++) {
> +        if (!actions_orders[i]) {
> +            continue;
> +        }
> +
> +        if (!nl_parse_nested(actions_orders[i], act_policy,
> +                            action_police_attrs, ARRAY_SIZE(act_policy))) {
> +            VLOG_ERR_RL(&rl, "Failed to parse police action during meter "
> +                      "revalidation");
> +            return;
> +        }
> +
> +        if (strcmp(nl_attr_get_string(action_police_attrs[TCA_KIND]),
> +                                    "police")) {
> +            VLOG_EMER("Non-police action found during meter revalidation");
> +            continue;
> +        }
> +
> +        if (!nl_parse_nested(action_police_attrs[TCA_ACT_OPTIONS],
> +                             police_policy, action_police_tab,
> +                             ARRAY_SIZE(action_police_tab))) {
> +            VLOG_ERR_RL(&rl, "Failed to parse the single police action "
> +                        "during meter revalidation");
> +            return;
> +        }
> +
> +        tc_police = nl_attr_get_unspec(action_police_tab[TCA_POLICE_TBF],
> +                                       sizeof *tc_police);
> +        if (!tc_police) {
> +            VLOG_ERR_RL(&rl, "Can not get police struct during meter "
> +                        "revalidation");
> +            continue;
> +        }
> +        index = tc_police->index;
> +        /* The range of meter index is 0x10000000 to 0x1fffffff
> +         * If not meter, continue */
> +        if (!tc_is_meter_index(index)) {
> +            VLOG_DBG_RL(&rl, "Meter index : %d is not meter:"
> +                        "action = %d, rate = %d, burst = %d, mtu = %d",
> +                        index, tc_police->action, tc_police->rate.rate,
> +                        tc_police->burst, tc_police->mtu);
> +            continue;
> +        }
> +
> +        /* transform police index to meter id */
> +        index = POLICY_INDEX_TO_METER_ID(index);
> +        if (dpif_netlink_meter_should_revalidate(meter_ids, index)) {
> +            meter_id.uint32 = index;
> +            VLOG_DBG_RL(&rl, "Revalidate meter id %u for police index "
> +                        "%08x", index, tc_police->index);
> +            meter_offload_del(meter_id, NULL);
> +        }
> +    }
> +}
> +
> +static int
> +dpif_netlink_meter_revalidate__(struct dpif *dpif_ OVS_UNUSED,
> +                                struct id_pool *meter_ids)
> +{
> +    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5,20);
> +    struct nla_bitfield32 dump_flags = { TCA_DUMP_FLAGS_TERSE,
> +                                         TCA_DUMP_FLAGS_TERSE};
> +    struct ofpbuf request;
> +    struct ofpbuf *reply;
> +    struct tcamsg *tcmsg;
> +    size_t total_offset;
> +    size_t act_offset;
> +    int prio = 0;
> +    int error;
> +
> +    if (!netdev_is_flow_api_enabled()) {
> +        return 0;
> +    }
> +    /* Make tc action request */
> +    ofpbuf_init(&request, 16384);
> +    nl_msg_put_nlmsghdr(&request, sizeof *tcmsg, RTM_GETACTION,
> +                        NLM_F_REQUEST | NLM_F_DUMP);
> +    tcmsg = ofpbuf_put_zeros(&request, sizeof *tcmsg);
> +    tcmsg->tca_family = AF_UNSPEC;
> +    if (!tcmsg) {
> +        return ENODEV;
> +    }
> +
> +    /* Data path interface netlink police start */
> +    total_offset = nl_msg_start_nested(&request, TCA_ACT_TAB);
> +    act_offset = nl_msg_start_nested(&request, ++prio);
> +    /* Send request */
> +    nl_msg_put_string(&request, TCA_KIND, "police");
> +
> +    /* Data path interface netlink police end */
> +    nl_msg_end_nested(&request, act_offset);
> +    nl_msg_end_nested(&request, total_offset);
> +
> +    nl_msg_put_unspec(&request, TCA_ROOT_FLAGS, &dump_flags,
> +                      sizeof dump_flags);
> +    error = tc_transact(&request, &reply);
> +    if (error) {
> +        VLOG_ERR_RL(&rl, "Failed to send dump netlink msg for revalidate "
> +                    "error %d", error);
> +        return error;
> +    }
> +    dpif_tc_meter_revalidate(dpif_, meter_ids, reply);
> +    ofpbuf_delete(reply);
> +    return 0;
> +}
> +
>  static int
>  dpif_netlink_meter_set__(struct dpif *dpif_, ofproto_meter_id meter_id,
>                           struct ofputil_meter_config *config)
> @@ -4370,6 +4557,19 @@ dpif_netlink_meter_del(struct dpif *dpif, 
> ofproto_meter_id meter_id,
>      return err;
>  }
>
> +static int
> +dpif_netlink_meter_revalidate(struct dpif *dpif_, struct id_pool *meter_ids)
> +{
> +    int err;
> +
> +    if (probe_broken_meters(dpif_)) {
> +        return ENOMEM;
> +    }
> +    err = dpif_netlink_meter_revalidate__(dpif_, meter_ids);
> +
> +    return err;
> +}
> +
>  static bool
>  probe_broken_meters__(struct dpif *dpif)
>  {
> @@ -4589,6 +4789,7 @@ const struct dpif_class dpif_netlink_class = {
>      dpif_netlink_meter_set,
>      dpif_netlink_meter_get,
>      dpif_netlink_meter_del,
> +    dpif_netlink_meter_revalidate,
>      NULL,                       /* bond_add */
>      NULL,                       /* bond_del */
>      NULL,                       /* bond_stats_get */
> diff --git a/lib/dpif-netlink.h b/lib/dpif-netlink.h
> index 24294bc42dc3..b3b113dc407c 100644
> --- a/lib/dpif-netlink.h
> +++ b/lib/dpif-netlink.h
> @@ -17,6 +17,8 @@
>  #ifndef DPIF_NETLINK_H
>  #define DPIF_NETLINK_H 1
>
> +#include <linux/pkt_cls.h>
> +
>  #include <stdbool.h>
>  #include <stddef.h>
>  #include <stdint.h>
> diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h
> index 12477a24feee..5eb252104ee8 100644
> --- a/lib/dpif-provider.h
> +++ b/lib/dpif-provider.h
> @@ -631,6 +631,10 @@ struct dpif_class {
>      int (*meter_del)(struct dpif *, ofproto_meter_id meter_id,
>                       struct ofputil_meter_stats *, uint16_t n_bands);
>
> +    /* Checks unneeded meters from 'dpif' and removes them. They may
> +     * be caused by deleting in-use meters. */
> +    int (*meter_revalidate)(struct dpif *, struct id_pool *);
> +
>      /* Adds a bond with 'bond_id' and the member-map to 'dpif'. */
>      int (*bond_add)(struct dpif *dpif, uint32_t bond_id,
>                      odp_port_t *member_map);
> diff --git a/lib/dpif.c b/lib/dpif.c
> index 40f5fe44606e..dedb33f71f59 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -2028,6 +2028,13 @@ dpif_meter_del(struct dpif *dpif, ofproto_meter_id 
> meter_id,
>      return error;
>  }
>
> +int
> +dpif_meter_revalidate(struct dpif *dpif, struct id_pool *meter_ids){
> +    return dpif->dpif_class->meter_revalidate
> +           ? dpif->dpif_class->meter_revalidate(dpif, meter_ids)
> +           : EOPNOTSUPP;
> +}
> +
>  int
>  dpif_bond_add(struct dpif *dpif, uint32_t bond_id, odp_port_t *member_map)
>  {
> diff --git a/lib/dpif.h b/lib/dpif.h
> index 6cb4dae6d8d7..dedb97f008f2 100644
> --- a/lib/dpif.h
> +++ b/lib/dpif.h
> @@ -384,6 +384,7 @@
>  #include "ovs-numa.h"
>  #include "packets.h"
>  #include "util.h"
> +#include "id-pool.h"
>
>  #ifdef  __cplusplus
>  extern "C" {
> @@ -905,6 +906,7 @@ int dpif_meter_get(const struct dpif *, ofproto_meter_id 
> meter_id,
>                     struct ofputil_meter_stats *, uint16_t n_bands);
>  int dpif_meter_del(struct dpif *, ofproto_meter_id meter_id,
>                     struct ofputil_meter_stats *, uint16_t n_bands);
> +int dpif_meter_revalidate(struct dpif *dpif, struct id_pool *meter_ids);
>
>  /* Bonding. */
>
> diff --git a/lib/id-pool.c b/lib/id-pool.c
> index 69910ad08e83..64096a184563 100644
> --- a/lib/id-pool.c
> +++ b/lib/id-pool.c
> @@ -155,3 +155,16 @@ id_pool_free_id(struct id_pool *pool, uint32_t id)
>          }
>      }
>  }
> +
> +bool
> +id_pool_id_exist(struct  id_pool *pool, uint32_t id)

Maybe call this id_pool_id_allocated() to be more in line with the 
id_pool_alloc_id() API.

> +{
> +    return !!id_pool_find(pool, id);
> +}
> +
> +uint32_t id_pool_base(struct id_pool *pool){

Don’t think this function is used, so you could remove it.

> +    return pool->base;
> +}

Add newline

> +uint32_t id_pool_n(struct id_pool *pool){

uint32_t and { on it’s own line to be consistent with the other functions in 
this file.

Maybe call it id_pool_size() to be inline with other APIs?

> +    return pool->n_ids;
> +}
> diff --git a/lib/id-pool.h b/lib/id-pool.h
> index 8721f87934bb..c9b1903d8c74 100644
> --- a/lib/id-pool.h
> +++ b/lib/id-pool.h
> @@ -30,6 +30,9 @@ bool id_pool_alloc_id(struct id_pool *, uint32_t *id);
>  void id_pool_free_id(struct id_pool *, uint32_t id);
>  void id_pool_add(struct id_pool *, uint32_t id);
>
No new line needed I guess.

> +bool id_pool_id_exist(struct id_pool *pool, uint32_t id);
> +uint32_t id_pool_base(struct id_pool *pool);
> +uint32_t id_pool_n(struct id_pool *pool);
Newline should be here.
>  /*
>   * ID pool.
>   * ========
> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
> index cdc66246ced3..2bb0eda9d581 100644
> --- a/lib/netdev-linux.c
> +++ b/lib/netdev-linux.c
> @@ -5823,6 +5823,12 @@ tc_del_policer_action(uint32_t index, struct 
> ofputil_meter_stats *stats)
>
>      error = tc_transact(&request, &replyp);
>      if (error) {
> +        if (error == EPERM || error == ENOENT) {
> +            /* EPERM means flow exists, it is right that meter deletion is
> +             * not permited.
> +             * ENOENT means meter has already been deleted. */
> +            return error;
> +        }
>          VLOG_ERR_RL(&rl, "Failed to delete police action (index: %u), 
> err=%d",
>                      index, error);
>          return error;
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index f6f90a741fde..df0247dc264c 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -2953,10 +2953,19 @@ meter_tc_del_policer(ofproto_meter_id meter_id,
>      if (!meter_id_lookup(meter_id.uint32, &police_index)) {
>          err = tc_del_policer_action(police_index, stats);
>          if (err && err != ENOENT) {
> +            if (err == EPERM) {
> +                /* Flow exists, it is right that meter deletion is not
> +                 * permited. */
> +                return err;
> +            }
>              VLOG_ERR_RL(&error_rl,
> -                        "Failed to del police %u for meter %u: %s",
> +                        "Deletion of police %u for meter %u failed: %s",
>                          police_index, meter_id.uint32, ovs_strerror(err));
> +            return err;
>          } else {
> +            VLOG_DBG("Deletion of police %u for meter %u succeeded: %s",
> +                        police_index, meter_id.uint32, ovs_strerror(err));
> +            err = 0;
>              meter_free_police_index(police_index);
>          }
>          meter_id_remove(meter_id.uint32);
> diff --git a/lib/tc.c b/lib/tc.c
> index 94044cde6060..c7d539562fc6 100644
> --- a/lib/tc.c
> +++ b/lib/tc.c
> @@ -51,9 +51,6 @@
>  #define TCM_IFINDEX_MAGIC_BLOCK (0xFFFFFFFFU)
>  #endif
>
> -#ifndef TCA_DUMP_FLAGS_TERSE
> -#define TCA_DUMP_FLAGS_TERSE (1 << 0)
> -#endif
>
>  #if TCA_MAX < 15
>  #define TCA_CHAIN 11
> @@ -1987,8 +1984,6 @@ tc_parse_action_stats(struct nlattr *action, struct 
> ovs_flow_stats *stats_sw,
>                                   stats_hw, stats_dropped);
>  }
>
> -#define TCA_ACT_MIN_PRIO 1
> -
>  static int
>  nl_parse_flower_actions(struct nlattr **attrs, struct tc_flower *flower,
>                          bool terse)
> diff --git a/lib/tc.h b/lib/tc.h
> index 2e64ad372592..fcadc6d8c8a4 100644
> --- a/lib/tc.h
> +++ b/lib/tc.h
> @@ -60,6 +60,10 @@ enum tc_qdisc_hook {
>
>  #define METER_POLICE_IDS_BASE 0x10000000
>  #define METER_POLICE_IDS_MAX  0x1FFFFFFF
> +/* Mapping meter_id.uint32 into a 32-bit integer */
> +#define METER_ID_TO_POLICY_INDEX(meter_id) (meter_id | METER_POLICE_IDS_BASE)
> +/* Mapping policy_index to meter_id */
> +#define POLICY_INDEX_TO_METER_ID(index) (index & ~METER_POLICE_IDS_BASE)

These macro’s are not valid, as the policy ID is allocated with 
id_pool_alloc_id() which will always give the lowest pool id.
Some values could also be reserved as they could not be deleted at OVS startup.

>  static inline bool
>  tc_is_meter_index(uint32_t index) {
> @@ -301,7 +305,12 @@ enum tc_offloaded_state {
>      TC_OFFLOADED_STATE_NOT_IN_HW,
>  };
>
> +#ifndef TCA_DUMP_FLAGS_TERSE
> +#define TCA_DUMP_FLAGS_TERSE (1 << 0)
> +#endif
> +#define ACT_MAX_NUM 1024
>  #define TCA_ACT_MAX_NUM 16
> +#define TCA_ACT_MIN_PRIO 1
>
>  struct tcf_id {
>      enum tc_qdisc_hook hook;
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index 7ad728adffdb..35f898ca25f7 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -2658,6 +2658,7 @@ revalidate(struct revalidator *revalidator)
>
>      struct udpif *udpif = revalidator->udpif;
>      struct dpif_flow_dump_thread *dump_thread;
> +    bool flow_delete_exist = false;
>      uint64_t dump_seq, reval_seq;
>      bool kill_warn_print = true;
>      unsigned int flow_limit;
> @@ -2790,6 +2791,7 @@ revalidate(struct revalidator *revalidator)
>
>              if (result != UKEY_KEEP) {
>                  /* Takes ownership of 'recircs'. */
> +                flow_delete_exist = true;
>                  reval_op_init(&ops[n_ops++], result, udpif, ukey, &recircs,
>                                &odp_actions);
>              }
> @@ -2803,6 +2805,9 @@ revalidate(struct revalidator *revalidator)
>          ovsrcu_quiesce();
>      }
>      dpif_flow_dump_thread_destroy(dump_thread);
> +    if (flow_delete_exist) {
> +        dpif_meter_revalidate(udpif->dpif, udpif->backer->meter_ids);
> +    }
>      ofpbuf_uninit(&odp_actions);
>  }
>
> diff --git a/tests/system-offloads-traffic.at 
> b/tests/system-offloads-traffic.at
> index 1a60570801e1..0ca3897a721f 100644
> --- a/tests/system-offloads-traffic.at
> +++ b/tests/system-offloads-traffic.at
> @@ -273,6 +273,10 @@ meter:1 flow_count:1 packet_in_count:11 
> byte_in_count:377 duration:0.001s bands:
>  0: packet_count:9 byte_count:0
>  ])
>
> +AT_CHECK([ovs-ofctl -O Openflow13 del-meter br0])
> +sleep 10

Do we need such a long delay? It will be a shame to wait 10 seconds if it’s 
done in 5. Maybe you can use OVS_WAIT_UNTIL() here.


> +AT_CHECK([tc actions ls action police], [0], [])
> +
>  OVS_TRAFFIC_VSWITCHD_STOP
>  AT_CLEANUP
>
> -- 
> 2.30.2

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to