On Fri, 2022-04-29 at 20:50 +0200, Ilya Maximets wrote:
> On 4/29/22 20:01, Ilya Maximets wrote:
> > On 4/29/22 05:28, Jianbo Liu wrote:
> > > On Thu, 2022-04-28 at 14:00 +0200, Ilya Maximets wrote:
> > > > On 4/28/22 13:49, Ilya Maximets wrote:
> > > > > On 4/28/22 11:06, Jianbo Liu wrote:
> > > > > > On Wed, 2022-04-27 at 13:37 +0200, Ilya Maximets wrote:
> > > > > > > On 4/27/22 04:41, Jianbo Liu wrote:
> > > > > > > > On Wed, 2022-04-27 at 01:58 +0200, Ilya Maximets wrote:
> > > > > > > > > On 4/26/22 09:14, Simon Horman wrote:
> > > > > > > > > > On Tue, Apr 19, 2022 at 01:38:33AM +0000, Jianbo
> > > > > > > > > > Liu via
> > > > > > > > > > dev
> > > > > > > > > > wrote:
> > > > > > > > > > > On Thu, 2022-04-07 at 09:15 +0000, Jianbo Liu
> > > > > > > > > > > wrote:
> > > > > > > > > > > > This series is to add support for tc offloading
> > > > > > > > > > > > of
> > > > > > > > > > > > ovs
> > > > > > > > > > > > metering, and
> > > > > > > > > > > > enhance OVS to use new kernel feature which
> > > > > > > > > > > > offload
> > > > > > > > > > > > tc
> > > > > > > > > > > > police
> > > > > > > > > > > > action to
> > > > > > > > > > > > hardware.
> > > > > > > > > > > > To do the offloading, new APIs for meter are
> > > > > > > > > > > > added in
> > > > > > > > > > > > netdev-
> > > > > > > > > > > > offload,
> > > > > > > > > > > > and OVS meters are mapped to tc police actions
> > > > > > > > > > > > with
> > > > > > > > > > > > one-to-
> > > > > > > > > > > > one
> > > > > > > > > > > > relationship for dpif-netlink.
> > > > > > > > > > > > 
> > > > > > > > > > > > Notes:
> > > > > > > > > > > >     v3
> > > > > > > > > > > >     - Add netdev-offload APIs for meter.
> > > > > > > > > > > >     - Move the implementation to lower netdev-
> > > > > > > > > > > > offload-tc
> > > > > > > > > > > > layer.
> > > > > > > > > > > > 
> > > > > > > > > > > >     v2
> > > > > > > > > > > >     - Move tc police parse call from last patch
> > > > > > > > > > > > to
> > > > > > > > > > > > 2nd
> > > > > > > > > > > > patch.
> > > > > > > > > > > >       2nd patch is adding the parse lib func
> > > > > > > > > > > > and add
> > > > > > > > > > > > the
> > > > > > > > > > > > call
> > > > > > > > > > > > to it in
> > > > > > > > > > > > that patch.
> > > > > > > > > > > >       Last patch is the put of tc police act.
> > > > > > > > > > > >       In 2nd patch also add empty switch case
> > > > > > > > > > > > for tc
> > > > > > > > > > > > police
> > > > > > > > > > > > act
> > > > > > > > > > > > put as
> > > > > > > > > > > > the impl.
> > > > > > > > > > > >       is done in the last patch when support
> > > > > > > > > > > > for put
> > > > > > > > > > > > is
> > > > > > > > > > > > being
> > > > > > > > > > > > added.
> > > > > > > > > > 
> > > > > > > > > > ...
> > > > > > > > > > 
> > > > > > > > > > > Friendly ping for review.
> > > > > > > > > > 
> > > > > > > > > > Hi Jianbo,
> > > > > > > > > > 
> > > > > > > > > > sorry for the delayed response.
> > > > > > > > > > 
> > > > > > > > > > My colleagues at Corigine have exercised this
> > > > > > > > > > patchset
> > > > > > > > > > and have
> > > > > > > > > > verified
> > > > > > > > > > it for our test-cases.
> > > > > > > > > > 
> > > > > > > > > > Also, I have run the upstream test workflows over
> > > > > > > > > > them,
> > > > > > > > > > and
> > > > > > > > > > they
> > > > > > > > > > look good.
> > > > > > > > > > 
> > > > > > > > > > 1. netdev-offload: Add meter offload provider
> > > > > > > > > >     
> > > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220399423&data=05%7C01%7Cjianbol%40nvidia.com%7Cdffd3b765d664c5cc83e08da2a111571%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637868550123285362%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=lWLyTcDLytPueeTu%2BG9gYizo9TW%2B2rlPrQtwWmLoZck%3D&reserved=0
> > > > > > > > > > 2. tc: Add support parsing tc police action
> > > > > > > > > >     
> > > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220409233&data=05%7C01%7Cjianbol%40nvidia.com%7Cdffd3b765d664c5cc83e08da2a111571%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637868550123285362%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=3kc%2B1Nv6vDXbQTqoXG6TvUNGCN4FA76%2Bo%2B65Id%2FMqS8%3D&reserved=0
> > > > > > > > > > 3. netdev-linux: Refactor put police action netlink
> > > > > > > > > > message
> > > > > > > > > >     
> > > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220516986&data=05%7C01%7Cjianbol%40nvidia.com%7Cdffd3b765d664c5cc83e08da2a111571%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637868550123285362%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=dFHBeqSRK5PbKER90d3EKGXA5jmyjtbNrve5dQodaII%3D&reserved=0
> > > > > > > > > > 4. netdev-linux: Add functions to manipulate tc
> > > > > > > > > > police
> > > > > > > > > > action
> > > > > > > > > >     
> > > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220566774&data=05%7C01%7Cjianbol%40nvidia.com%7Cdffd3b765d664c5cc83e08da2a111571%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637868550123285362%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=J0EO7GdsTuz47RdZ0au9jxwA9HEcYp6IXZ1tGo86Ftc%3D&reserved=0
> > > > > > > > > > 5. netdev-offload-tc: Implement and register meter
> > > > > > > > > > offload API
> > > > > > > > > > for
> > > > > > > > > > tc
> > > > > > > > > >     
> > > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220568491&data=05%7C01%7Cjianbol%40nvidia.com%7Cdffd3b765d664c5cc83e08da2a111571%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637868550123285362%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=vc9r%2FigEJjKpePnS6qD0fYZuAe2T1ugE0QohKMY5hb0%3D&reserved=0
> > > > > > > > > > 6. netdev-offload-tc: Cleanup police actions with
> > > > > > > > > > reserved
> > > > > > > > > > indexes
> > > > > > > > > > on startup
> > > > > > > > > >     
> > > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220569761&data=05%7C01%7Cjianbol%40nvidia.com%7Cdffd3b765d664c5cc83e08da2a111571%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637868550123285362%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=uVZ%2F1tx8N8svXFxr7lQGN8rew7q%2Ft8JP7EcuKWuexYY%3D&reserved=0
> > > > > > > > > > 7. netdev-offload-tc: Offloading rules with police
> > > > > > > > > > actions
> > > > > > > > > >     
> > > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220571039&data=05%7C01%7Cjianbol%40nvidia.com%7Cdffd3b765d664c5cc83e08da2a111571%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637868550123285362%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=%2FxMMpHTGaUDs%2BbroW46ey5482uFlVgdd4RJOOfonM68%3D&reserved=0
> > > > > > > > > > 8. dpif-netlink: Offloading meter to tc police
> > > > > > > > > > action
> > > > > > > > > >     
> > > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220572904&data=05%7C01%7Cjianbol%40nvidia.com%7Cdffd3b765d664c5cc83e08da2a111571%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637868550123285362%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=HzWYm4vZAgdGf5bEWOluX3lev5fXRkYP9M5I29ttykc%3D&reserved=0
> > > > > > > > > > 
> > > > > > > > > > I have also reviewed the patches, and with my OVS
> > > > > > > > > > maintainer
> > > > > > > > > > hat
> > > > > > > > > > on, I'm
> > > > > > > > > > happy to apply them, and plan to do so in the
> > > > > > > > > > coming days
> > > > > > > > > > unless
> > > > > > > > > > there are
> > > > > > > > > > objections.
> > > > > > > > > 
> > > > > > > > > Thanks, Simon, for taking care of this patch set! 
> > > > > > > > > And
> > > > > > > > > sorry for
> > > > > > > > > me
> > > > > > > > > not being
> > > > > > > > > very responsive.
> > > > > > > > > 
> > > > > > > > > I had a quick look over the patch set just now and I
> > > > > > > > > think
> > > > > > > > > that
> > > > > > > > > in
> > > > > > > > > this
> > > > > > > > > version most of my main concerns were addressed and
> > > > > > > > > it
> > > > > > > > > looks good
> > > > > > > > > in
> > > > > > > > > general.
> > > > > > > > > 
> > > > > > > > > One thing I didn't understand is why we need to have
> > > > > > > > > 'struct
> > > > > > > > > meter_offload_api'
> > > > > > > > > as a separate entity?  I mean, it never used if a
> > > > > > > > > normal
> > > > > > > > > offload
> > > > > > > > > API
> > > > > > > > > is
> > > > > > > > > not initialized and it's tied with the particular
> > > > > > > > > offload
> > > > > > > > > API. 
> > > > > > > > > Can
> > > > > > > > > these
> > > > > > > > > callbacks be just part of the normal 'struct
> > > > > > > > > netdev_flow_api' ?
> > > > > > > > > Should not be hard to merge meter_offload_api and
> > > > > > > > > netdev_flow_api
> > > > > > > > > into one.
> > > > > > > > > That should also reduce the amount of code.  Or am I
> > > > > > > > > missing
> > > > > > > > > something?
> > > > > > > > 
> > > > > > > > Conceptually, meter is used globally, and it does not
> > > > > > > > belong
> > > > > > > > to any
> > > > > > > > perticular netdev/flow. All the callbacks in 'struct
> > > > > > > > netdev_flow_api'
> > > > > > > > have a parameter related to netdev, but meter callbacks
> > > > > > > > don't
> > > > > > > > need.
> > > > > > > > The meter_offload_set/get/del can be part of 'struct
> > > > > > > > netdev_flow_api',
> > > > > > > > but meter_offload_init/destory must be placed to
> > > > > > > > elsewhere,
> > > > > > > > to
> > > > > > > > avoid
> > > > > > > > being called more than once.
> > > > > > > > So, why not add new API for meter? But I'd like to
> > > > > > > > change if
> > > > > > > > you
> > > > > > > > insist. What's your decision?
> > > > > > > 
> > > > > > > I agree that meters do not belong to any particular
> > > > > > > netdev,
> > > > > > > that's
> > > > > > > true.  However, netdev_flow_api is also a global entity. 
> > > > > > > All
> > > > > > > netdevs
> > > > > > > are using the same instance of the netdev_fow_api.  And
> > > > > > > most of
> > > > > > > the
> > > > > > > structures in netdev-offload-tc are global (hash maps,
> > > > > > > configurations).
> > > > > > > 'netdev' argument can just be viewed as a guide for these
> > > > > > > callbacks,
> > > > > > > like meter_id for the meter offload API, not something
> > > > > > > necessary.
> > > > > > 
> > > > > > Usually, we get netdev_flow_api from netdev->flow_api, but
> > > > > > dpif_class-
> > > > > > > meter_get/set/del don't have the param of netdev, we must
> > > > > > > add a
> > > > > > > helper
> > > > > > to get any netdev from dpif. The helper is more like:
> > > > > >  
> > > > > > struct netdev *
> > > > > > netdev_get_any(const char *dpif_type)
> > > > > > {
> > > > > >     struct port_to_netdev_data *data;
> > > > > >     struct netdev *dev = NULL;
> > > > > > 
> > > > > >     ovs_rwlock_rdlock(&netdev_hmap_rwlock);
> > > > > >     HMAP_FOR_EACH (data, portno_node, &port_to_netdev) {
> > > > > >         if (netdev_get_dpif_type(data->netdev) ==
> > > > > > dpif_type) {
> > > > > >             dev = netdev_ref(data->netdev);
> > > > > >             break;
> > > > > >         }
> > > > > >     }
> > > > > >     ovs_rwlock_unlock(&netdev_hmap_rwlock);
> > > > > > 
> > > > > >     return dev;
> > > > > > }
> > > > > > 
> > > > > > Do you accept it, or have better solution?
> > > > > 
> > > > > Hmm.  Good point, I missed that.
> > > > > In fact, a single dpif can have ports with different offload
> > > > > API
> > > > > initialized.  For example, linux ports may use TC, while DPDK
> > > > > ports
> > > > > can use DPDK's rte_flow while being in the same dpif-netdev
> > > > > instance.
> > > > > 
> > > > > So, I think, the correct solution is to add a meter to all
> > > > > flow
> > > > > APIs
> > > > > that are initialized for ports within that datapath.  Maybe
> > > > > something
> > > > > like this (not tested, pseudo-code):
> > > > > 
> > > > > diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
> > > > > index fb108c0d5..91504ee15 100644
> > > > > --- a/lib/netdev-offload.c
> > > > > +++ b/lib/netdev-offload.c
> > > > > @@ -79,6 +79,9 @@ struct netdev_registered_flow_api {
> > > > >      /* Number of references: one for the flow_api itself and
> > > > > one
> > > > > for every
> > > > >       * instance of the netdev that uses it. */
> > > > >      struct ovs_refcount refcnt;
> > > > > +
> > > > > +    /* Reference counter for dpif types which has a port
> > > > > with this
> > > > > flow API. */
> > > > > +    struct simap used_by_dpif_types;
> > > > >  };
> > > > >  
> > > > >  static struct netdev_registered_flow_api *
> > > > > @@ -120,6 +123,7 @@ netdev_register_flow_api_provider(const
> > > > > struct
> > > > > netdev_flow_api *new_flow_api)
> > > > >                      hash_string(new_flow_api->type, 0));
> > > > >          rfa->flow_api = new_flow_api;
> > > > >          ovs_refcount_init(&rfa->refcnt);
> > > > > +        simap_init(&rfa->used_by_dpif_types);
> > > > >          VLOG_DBG("netdev: flow API '%s' registered.",
> > > > > new_flow_api->type);
> > > > >      }
> > > > >      ovs_mutex_unlock(&netdev_flow_api_provider_mutex);
> > > > > @@ -154,6 +158,7 @@ netdev_unregister_flow_api_provider(const
> > > > > char
> > > > > *type)
> > > > >      } else  {
> > > > >          cmap_remove(&netdev_flow_apis, &rfa->cmap_node,
> > > > >                      hash_string(rfa->flow_api->type, 0));
> > > > > +        ovsrcu_postpone(simap_destroy, &rfa-
> > > > > >used_by_dpif_types);
> > > > >          ovsrcu_postpone(free, rfa);
> > > > >          error = 0;
> > > > >      }
> > > > > @@ -182,6 +187,8 @@ netdev_assign_flow_api(struct netdev
> > > > > *netdev)
> > > > >      CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
> > > > >          if (!rfa->flow_api->init_flow_api(netdev)) {
> > > > >              ovs_refcount_ref(&rfa->refcnt);
> > > > > +            simap_increase(&rfa->used_by_dpif_types,
> > > > > +                           netdev_get_dpif_type(netdev), 1);
> > > > >              ovsrcu_set(&netdev->flow_api, rfa->flow_api);
> > > > >              VLOG_INFO("%s: Assigned flow API '%s'.",
> > > > >                        netdev_get_name(netdev), rfa-
> > > > > >flow_api-
> > > > > > type);
> > > > > @@ -344,6 +351,8 @@ netdev_uninit_flow_api(struct netdev
> > > > > *netdev)
> > > > >  
> > > > >      ovsrcu_set(&netdev->flow_api, NULL);
> > > > >      rfa = netdev_lookup_flow_api(flow_api->type);
> > > > > +    ovs_assert(simap_increase(&rfa->used_by_dpif_types,
> > > > > +                              netdev_get_dpif_type(netdev),
> > > > > -1) >=
> > > > > 0);
> > > > >      ovs_refcount_unref(&rfa->refcnt);
> > > > >  }
> > > > >  
> > > > > @@ -739,6 +748,20 @@ netdev_ports_get_n_flows(const char
> > > > > *dpif_type, odp_port_t port_no,
> > > > >      return ret;
> > > > >  }
> > > > >  
> > > > > +
> > > > > +void
> > > > > +netdev_offload_meter_set(const char *dpif_type, ...)
> > > > > +{
> > > > > +    struct netdev_registered_flow_api *rfa;
> > > > > +
> > > > > +    CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
> > > > > +        if (simap_get(&rfa->used_by_dpif_types, dpif_type) >
> > > > > 0) {
> > > > > +            rfa->flow_api->meter_set(...);
> > > > > +        }
> > > > > +    }
> > > > > +}
> > > > > +
> > > > > +
> > > > >  odp_port_t
> > > > >  netdev_ifindex_to_odp_port(int ifindex)
> > > > >  {
> > > > > ---
> > > > > 
> > > > > ?
> > > > 
> > > > Thinking more, there is no point in counting netdevs, since
> > > > they
> > > > can be added after meters already created.  So, I guess, we
> > > > just
> > > > have to always add meters to all registered flow API providers,
> > > > i.e. just:
> > > > 
> > > > void
> > > > netdev_offload_meter_set(...)
> > > > {
> > > >     struct netdev_registered_flow_api *rfa;
> > > > 
> > > >     CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
> > > >         rfa->flow_api->meter_set(...);
> > > >     }
> > > > }
> > > 
> > > You know I will put this netdev_offload_meter_set in
> > > dpif_netlink_meter_set. But, in the future, someone may implement
> > > meter_set for netdev_offload_dpdk, so it will be called as well.
> > > It
> > > looks unreasonable as we want to call linux_tc's meter_set only,
> > > dpdk_flow_api's meter_set should be called in dpif_netdev. 
> > 
> > meter_set for netdev_offload_dpdk will not be called unless
> > netdev_offload_dpdk is registered and it will not be registered
> > unless dpdk is explicitely enabled by the user via database knob.
> > And if user enables dpdk, they likely want to use userpsace
> > datapath.  So that doesn't sound like a big problem.
> > 
> > Userspace datapath though is more versatile and it may want
> > to use several different offload APIs at the same time.  And
> > we don't really know for packets from which ports the meter
> > will be used.  And since they may have different offload APIs
> > we kind of have to create meters in all registered.
> 
> Alternative solution is to not create meters anywhere at all,
> but only store them globally inside the netdev_offload module.
> Flow API implementations will look up and create the meter
> on-demand while creating a flow that is going to use that meter.
> Not sure how deletions should look like in this case.
> 

I prefer to the one of calling all of the registered meter APIs.
Will send the update version soon. Thank you very much!

> > 
> > Binding a flow API to a specific datapath doesn't look good to me.
> > 
> > I guess, this also raises the point of using the same meter
> > for flows installed in kernel and TC for dpif-netlink, as well
> > as using the same meter for flows in TC and DPDK with userspace
> > datapath.
> > But that problem seems to not have a clear solution as
> > bandwidth sharing doesn't seem to be possible in such cases.
> > 

Agree.

> > Best regards, Ilya Maximets.
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to