On 4/28/22 13:49, Ilya Maximets wrote: > On 4/28/22 11:06, Jianbo Liu wrote: >> On Wed, 2022-04-27 at 13:37 +0200, Ilya Maximets wrote: >>> On 4/27/22 04:41, Jianbo Liu wrote: >>>> On Wed, 2022-04-27 at 01:58 +0200, Ilya Maximets wrote: >>>>> On 4/26/22 09:14, Simon Horman wrote: >>>>>> On Tue, Apr 19, 2022 at 01:38:33AM +0000, Jianbo Liu via dev >>>>>> wrote: >>>>>>> On Thu, 2022-04-07 at 09:15 +0000, Jianbo Liu wrote: >>>>>>>> This series is to add support for tc offloading of ovs >>>>>>>> metering, and >>>>>>>> enhance OVS to use new kernel feature which offload tc >>>>>>>> police >>>>>>>> action to >>>>>>>> hardware. >>>>>>>> To do the offloading, new APIs for meter are added in >>>>>>>> netdev- >>>>>>>> offload, >>>>>>>> and OVS meters are mapped to tc police actions with one-to- >>>>>>>> one >>>>>>>> relationship for dpif-netlink. >>>>>>>> >>>>>>>> Notes: >>>>>>>> v3 >>>>>>>> - Add netdev-offload APIs for meter. >>>>>>>> - Move the implementation to lower netdev-offload-tc >>>>>>>> layer. >>>>>>>> >>>>>>>> v2 >>>>>>>> - Move tc police parse call from last patch to 2nd >>>>>>>> patch. >>>>>>>> 2nd patch is adding the parse lib func and add the >>>>>>>> call >>>>>>>> to it in >>>>>>>> that patch. >>>>>>>> Last patch is the put of tc police act. >>>>>>>> In 2nd patch also add empty switch case for tc police >>>>>>>> act >>>>>>>> put as >>>>>>>> the impl. >>>>>>>> is done in the last patch when support for put is >>>>>>>> being >>>>>>>> added. >>>>>> >>>>>> ... >>>>>> >>>>>>> Friendly ping for review. >>>>>> >>>>>> Hi Jianbo, >>>>>> >>>>>> sorry for the delayed response. >>>>>> >>>>>> My colleagues at Corigine have exercised this patchset and have >>>>>> verified >>>>>> it for our test-cases. >>>>>> >>>>>> Also, I have run the upstream test workflows over them, and >>>>>> they >>>>>> look good. >>>>>> >>>>>> 1. netdev-offload: Add meter offload provider >>>>>> >>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220399423&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=8lRjZK35bENpZ3vRbMFBfGWJLXX4p%2BfjR57JEhADucw%3D&reserved=0 >>>>>> 2. tc: Add support parsing tc police action >>>>>> >>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220409233&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=iorfBm3gxzS3fILCBF8h18%2FWLSUQXEhc2g7jWMn%2F4ug%3D&reserved=0 >>>>>> 3. netdev-linux: Refactor put police action netlink message >>>>>> >>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220516986&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=9%2BKMNZqAe6CM7ysWXwbeFpIyf%2FF44NdSJj6WVN46IdM%3D&reserved=0 >>>>>> 4. netdev-linux: Add functions to manipulate tc police action >>>>>> >>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220566774&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=wMPlN5cYZJfZloAZ48bKLpS3SKSMTyZAovWlSPICXxY%3D&reserved=0 >>>>>> 5. netdev-offload-tc: Implement and register meter offload API >>>>>> for >>>>>> tc >>>>>> >>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220568491&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=MFKxDa7ZDScho536VVERyLS00uF76BgcbCS1q%2BRQoTI%3D&reserved=0 >>>>>> 6. netdev-offload-tc: Cleanup police actions with reserved >>>>>> indexes >>>>>> on startup >>>>>> >>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220569761&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ef%2BFz7twkOqc9PQC%2FQPkgaqHwreKHpnw4iuPyp%2BdFDQ%3D&reserved=0 >>>>>> 7. netdev-offload-tc: Offloading rules with police actions >>>>>> >>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220571039&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=VjzfzOLeU%2FRgHhFBfGwTAUXvcRjIuYsY4%2BNV4l3OEb8%3D&reserved=0 >>>>>> 8. dpif-netlink: Offloading meter to tc police action >>>>>> >>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220572904&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=HWC5u8pDX3vPRfeoJnXXBUiMlJDNYeApPFNauX2zVbc%3D&reserved=0 >>>>>> >>>>>> I have also reviewed the patches, and with my OVS maintainer >>>>>> hat >>>>>> on, I'm >>>>>> happy to apply them, and plan to do so in the coming days >>>>>> unless >>>>>> there are >>>>>> objections. >>>>> >>>>> Thanks, Simon, for taking care of this patch set! And sorry for >>>>> me >>>>> not being >>>>> very responsive. >>>>> >>>>> I had a quick look over the patch set just now and I think that >>>>> in >>>>> this >>>>> version most of my main concerns were addressed and it looks good >>>>> in >>>>> general. >>>>> >>>>> One thing I didn't understand is why we need to have 'struct >>>>> meter_offload_api' >>>>> as a separate entity? I mean, it never used if a normal offload >>>>> API >>>>> is >>>>> not initialized and it's tied with the particular offload API. >>>>> Can >>>>> these >>>>> callbacks be just part of the normal 'struct netdev_flow_api' ? >>>>> Should not be hard to merge meter_offload_api and netdev_flow_api >>>>> into one. >>>>> That should also reduce the amount of code. Or am I missing >>>>> something? >>>> >>>> Conceptually, meter is used globally, and it does not belong to any >>>> perticular netdev/flow. All the callbacks in 'struct >>>> netdev_flow_api' >>>> have a parameter related to netdev, but meter callbacks don't need. >>>> The meter_offload_set/get/del can be part of 'struct >>>> netdev_flow_api', >>>> but meter_offload_init/destory must be placed to elsewhere, to >>>> avoid >>>> being called more than once. >>>> So, why not add new API for meter? But I'd like to change if you >>>> insist. What's your decision? >>> >>> I agree that meters do not belong to any particular netdev, that's >>> true. However, netdev_flow_api is also a global entity. All netdevs >>> are using the same instance of the netdev_fow_api. And most of the >>> structures in netdev-offload-tc are global (hash maps, >>> configurations). >>> 'netdev' argument can just be viewed as a guide for these callbacks, >>> like meter_id for the meter offload API, not something necessary. >> >> Usually, we get netdev_flow_api from netdev->flow_api, but dpif_class- >>> meter_get/set/del don't have the param of netdev, we must add a helper >> to get any netdev from dpif. The helper is more like: >> >> struct netdev * >> netdev_get_any(const char *dpif_type) >> { >> struct port_to_netdev_data *data; >> struct netdev *dev = NULL; >> >> ovs_rwlock_rdlock(&netdev_hmap_rwlock); >> HMAP_FOR_EACH (data, portno_node, &port_to_netdev) { >> if (netdev_get_dpif_type(data->netdev) == dpif_type) { >> dev = netdev_ref(data->netdev); >> break; >> } >> } >> ovs_rwlock_unlock(&netdev_hmap_rwlock); >> >> return dev; >> } >> >> Do you accept it, or have better solution? > > Hmm. Good point, I missed that. > In fact, a single dpif can have ports with different offload API > initialized. For example, linux ports may use TC, while DPDK ports > can use DPDK's rte_flow while being in the same dpif-netdev instance. > > So, I think, the correct solution is to add a meter to all flow APIs > that are initialized for ports within that datapath. Maybe something > like this (not tested, pseudo-code): > > diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c > index fb108c0d5..91504ee15 100644 > --- a/lib/netdev-offload.c > +++ b/lib/netdev-offload.c > @@ -79,6 +79,9 @@ struct netdev_registered_flow_api { > /* Number of references: one for the flow_api itself and one for every > * instance of the netdev that uses it. */ > struct ovs_refcount refcnt; > + > + /* Reference counter for dpif types which has a port with this flow API. > */ > + struct simap used_by_dpif_types; > }; > > static struct netdev_registered_flow_api * > @@ -120,6 +123,7 @@ netdev_register_flow_api_provider(const struct > netdev_flow_api *new_flow_api) > hash_string(new_flow_api->type, 0)); > rfa->flow_api = new_flow_api; > ovs_refcount_init(&rfa->refcnt); > + simap_init(&rfa->used_by_dpif_types); > VLOG_DBG("netdev: flow API '%s' registered.", new_flow_api->type); > } > ovs_mutex_unlock(&netdev_flow_api_provider_mutex); > @@ -154,6 +158,7 @@ netdev_unregister_flow_api_provider(const char *type) > } else { > cmap_remove(&netdev_flow_apis, &rfa->cmap_node, > hash_string(rfa->flow_api->type, 0)); > + ovsrcu_postpone(simap_destroy, &rfa->used_by_dpif_types); > ovsrcu_postpone(free, rfa); > error = 0; > } > @@ -182,6 +187,8 @@ netdev_assign_flow_api(struct netdev *netdev) > CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) { > if (!rfa->flow_api->init_flow_api(netdev)) { > ovs_refcount_ref(&rfa->refcnt); > + simap_increase(&rfa->used_by_dpif_types, > + netdev_get_dpif_type(netdev), 1); > ovsrcu_set(&netdev->flow_api, rfa->flow_api); > VLOG_INFO("%s: Assigned flow API '%s'.", > netdev_get_name(netdev), rfa->flow_api->type); > @@ -344,6 +351,8 @@ netdev_uninit_flow_api(struct netdev *netdev) > > ovsrcu_set(&netdev->flow_api, NULL); > rfa = netdev_lookup_flow_api(flow_api->type); > + ovs_assert(simap_increase(&rfa->used_by_dpif_types, > + netdev_get_dpif_type(netdev), -1) >= 0); > ovs_refcount_unref(&rfa->refcnt); > } > > @@ -739,6 +748,20 @@ netdev_ports_get_n_flows(const char *dpif_type, > odp_port_t port_no, > return ret; > } > > + > +void > +netdev_offload_meter_set(const char *dpif_type, ...) > +{ > + struct netdev_registered_flow_api *rfa; > + > + CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) { > + if (simap_get(&rfa->used_by_dpif_types, dpif_type) > 0) { > + rfa->flow_api->meter_set(...); > + } > + } > +} > + > + > odp_port_t > netdev_ifindex_to_odp_port(int ifindex) > { > --- > > ?
Thinking more, there is no point in counting netdevs, since they can be added after meters already created. So, I guess, we just have to always add meters to all registered flow API providers, i.e. just: void netdev_offload_meter_set(...) { struct netdev_registered_flow_api *rfa; CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) { rfa->flow_api->meter_set(...); } } > >> >>> (I guess, we could rename netdev_flow_api to just offload_api to make >>> this more clear, but let's not think about that too much for now to >>> not overcomplicate things.) >>> >>> meter_offload_init/destory APIs might not be even needed. >>> >>> meter_tc_init_policer can be called directly from the >>> netdev_tc_init_flow_api, >>> if made re-enter-able (if (meter_police_ids) { return; }), or can be >>> called from under the existing ovsthread_once_start(&once). >>> >>> meter_offload_destory in current implementation only frees the id >>> pool. >>> This might be not necessary for the following reasons: >>> - If datapath is destroyed, but the process is still running, it's >>> likely that we're waiting for the creation of a new datapath that >>> will re-use the id-pool. >>> - If we're going to shut down the OVS, it's nice but not necessary >>> to free all the memory before doing so. Since the id-pool is a >>> global >>> variable, that will not be treated as a memory leak by sanitizers. >>> >>> What do you think? >>> >> >> It makes sense. >> >>> >>> Forgot to mention in a previous reply that the patch 8/8 of the patch >>> set should add a NEWS entry for the new feature (the NEWS file is a >>> bit mangled at the moment, but Alin is going to fix it soon). >>> Some test in the tests/system-offloads-traffic.at would be also nice >>> to have, if possible. >>> >>>> >>>> Thanks! >>>> Jianbo >>>> >>>>> >>>>> Another small nit is that, I guess, lib/tc.c is a better home for >>>>> most of the >>>>> tc_police functions from lib/netdev-linux.c. But I agree that >>>>> they >>>>> are >>>>> already mixed up and it's kind of a separate task to split them. >>>>> So, >>>>> it's >>>>> not really a concern for this patch set. >>>>> >>>>>> >>>>>> One area that we, at Corigine, would like to follow-up on is >>>>>> using >>>>>> a >>>>>> revalidation process rather than evicting rules on start-up. >>>>> >>>>> If TC police deletion failures can happen, that can become a >>>>> problem >>>>> for >>>>> long-running setups indeed. Not sure what the best solution >>>>> should >>>>> be, but >>>>> I agree that this case has to be handled in some way. >>>>> >>>>> Best regards, Ilya Maximets. >>>> >>> >>> _______________________________________________ >>> dev mailing list >>> d...@openvswitch.org >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=dazl4Ccgfrkb7tkne2e6c%2BZTHIjZ2RJqjchon%2BA9l%2BY%3D&reserved=0 >> > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev