On 4/28/22 11:06, Jianbo Liu wrote: > On Wed, 2022-04-27 at 13:37 +0200, Ilya Maximets wrote: >> On 4/27/22 04:41, Jianbo Liu wrote: >>> On Wed, 2022-04-27 at 01:58 +0200, Ilya Maximets wrote: >>>> On 4/26/22 09:14, Simon Horman wrote: >>>>> On Tue, Apr 19, 2022 at 01:38:33AM +0000, Jianbo Liu via dev >>>>> wrote: >>>>>> On Thu, 2022-04-07 at 09:15 +0000, Jianbo Liu wrote: >>>>>>> This series is to add support for tc offloading of ovs >>>>>>> metering, and >>>>>>> enhance OVS to use new kernel feature which offload tc >>>>>>> police >>>>>>> action to >>>>>>> hardware. >>>>>>> To do the offloading, new APIs for meter are added in >>>>>>> netdev- >>>>>>> offload, >>>>>>> and OVS meters are mapped to tc police actions with one-to- >>>>>>> one >>>>>>> relationship for dpif-netlink. >>>>>>> >>>>>>> Notes: >>>>>>> v3 >>>>>>> - Add netdev-offload APIs for meter. >>>>>>> - Move the implementation to lower netdev-offload-tc >>>>>>> layer. >>>>>>> >>>>>>> v2 >>>>>>> - Move tc police parse call from last patch to 2nd >>>>>>> patch. >>>>>>> 2nd patch is adding the parse lib func and add the >>>>>>> call >>>>>>> to it in >>>>>>> that patch. >>>>>>> Last patch is the put of tc police act. >>>>>>> In 2nd patch also add empty switch case for tc police >>>>>>> act >>>>>>> put as >>>>>>> the impl. >>>>>>> is done in the last patch when support for put is >>>>>>> being >>>>>>> added. >>>>> >>>>> ... >>>>> >>>>>> Friendly ping for review. >>>>> >>>>> Hi Jianbo, >>>>> >>>>> sorry for the delayed response. >>>>> >>>>> My colleagues at Corigine have exercised this patchset and have >>>>> verified >>>>> it for our test-cases. >>>>> >>>>> Also, I have run the upstream test workflows over them, and >>>>> they >>>>> look good. >>>>> >>>>> 1. netdev-offload: Add meter offload provider >>>>> >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220399423&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=8lRjZK35bENpZ3vRbMFBfGWJLXX4p%2BfjR57JEhADucw%3D&reserved=0 >>>>> 2. tc: Add support parsing tc police action >>>>> >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220409233&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=iorfBm3gxzS3fILCBF8h18%2FWLSUQXEhc2g7jWMn%2F4ug%3D&reserved=0 >>>>> 3. netdev-linux: Refactor put police action netlink message >>>>> >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220516986&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=9%2BKMNZqAe6CM7ysWXwbeFpIyf%2FF44NdSJj6WVN46IdM%3D&reserved=0 >>>>> 4. netdev-linux: Add functions to manipulate tc police action >>>>> >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220566774&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=wMPlN5cYZJfZloAZ48bKLpS3SKSMTyZAovWlSPICXxY%3D&reserved=0 >>>>> 5. netdev-offload-tc: Implement and register meter offload API >>>>> for >>>>> tc >>>>> >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220568491&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=MFKxDa7ZDScho536VVERyLS00uF76BgcbCS1q%2BRQoTI%3D&reserved=0 >>>>> 6. netdev-offload-tc: Cleanup police actions with reserved >>>>> indexes >>>>> on startup >>>>> >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220569761&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ef%2BFz7twkOqc9PQC%2FQPkgaqHwreKHpnw4iuPyp%2BdFDQ%3D&reserved=0 >>>>> 7. netdev-offload-tc: Offloading rules with police actions >>>>> >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220571039&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=VjzfzOLeU%2FRgHhFBfGwTAUXvcRjIuYsY4%2BNV4l3OEb8%3D&reserved=0 >>>>> 8. dpif-netlink: Offloading meter to tc police action >>>>> >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220572904&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=HWC5u8pDX3vPRfeoJnXXBUiMlJDNYeApPFNauX2zVbc%3D&reserved=0 >>>>> >>>>> I have also reviewed the patches, and with my OVS maintainer >>>>> hat >>>>> on, I'm >>>>> happy to apply them, and plan to do so in the coming days >>>>> unless >>>>> there are >>>>> objections. >>>> >>>> Thanks, Simon, for taking care of this patch set! And sorry for >>>> me >>>> not being >>>> very responsive. >>>> >>>> I had a quick look over the patch set just now and I think that >>>> in >>>> this >>>> version most of my main concerns were addressed and it looks good >>>> in >>>> general. >>>> >>>> One thing I didn't understand is why we need to have 'struct >>>> meter_offload_api' >>>> as a separate entity? I mean, it never used if a normal offload >>>> API >>>> is >>>> not initialized and it's tied with the particular offload API. >>>> Can >>>> these >>>> callbacks be just part of the normal 'struct netdev_flow_api' ? >>>> Should not be hard to merge meter_offload_api and netdev_flow_api >>>> into one. >>>> That should also reduce the amount of code. Or am I missing >>>> something? >>> >>> Conceptually, meter is used globally, and it does not belong to any >>> perticular netdev/flow. All the callbacks in 'struct >>> netdev_flow_api' >>> have a parameter related to netdev, but meter callbacks don't need. >>> The meter_offload_set/get/del can be part of 'struct >>> netdev_flow_api', >>> but meter_offload_init/destory must be placed to elsewhere, to >>> avoid >>> being called more than once. >>> So, why not add new API for meter? But I'd like to change if you >>> insist. What's your decision? >> >> I agree that meters do not belong to any particular netdev, that's >> true. However, netdev_flow_api is also a global entity. All netdevs >> are using the same instance of the netdev_fow_api. And most of the >> structures in netdev-offload-tc are global (hash maps, >> configurations). >> 'netdev' argument can just be viewed as a guide for these callbacks, >> like meter_id for the meter offload API, not something necessary. > > Usually, we get netdev_flow_api from netdev->flow_api, but dpif_class- >> meter_get/set/del don't have the param of netdev, we must add a helper > to get any netdev from dpif. The helper is more like: > > struct netdev * > netdev_get_any(const char *dpif_type) > { > struct port_to_netdev_data *data; > struct netdev *dev = NULL; > > ovs_rwlock_rdlock(&netdev_hmap_rwlock); > HMAP_FOR_EACH (data, portno_node, &port_to_netdev) { > if (netdev_get_dpif_type(data->netdev) == dpif_type) { > dev = netdev_ref(data->netdev); > break; > } > } > ovs_rwlock_unlock(&netdev_hmap_rwlock); > > return dev; > } > > Do you accept it, or have better solution?
Hmm. Good point, I missed that. In fact, a single dpif can have ports with different offload API initialized. For example, linux ports may use TC, while DPDK ports can use DPDK's rte_flow while being in the same dpif-netdev instance. So, I think, the correct solution is to add a meter to all flow APIs that are initialized for ports within that datapath. Maybe something like this (not tested, pseudo-code): diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c index fb108c0d5..91504ee15 100644 --- a/lib/netdev-offload.c +++ b/lib/netdev-offload.c @@ -79,6 +79,9 @@ struct netdev_registered_flow_api { /* Number of references: one for the flow_api itself and one for every * instance of the netdev that uses it. */ struct ovs_refcount refcnt; + + /* Reference counter for dpif types which has a port with this flow API. */ + struct simap used_by_dpif_types; }; static struct netdev_registered_flow_api * @@ -120,6 +123,7 @@ netdev_register_flow_api_provider(const struct netdev_flow_api *new_flow_api) hash_string(new_flow_api->type, 0)); rfa->flow_api = new_flow_api; ovs_refcount_init(&rfa->refcnt); + simap_init(&rfa->used_by_dpif_types); VLOG_DBG("netdev: flow API '%s' registered.", new_flow_api->type); } ovs_mutex_unlock(&netdev_flow_api_provider_mutex); @@ -154,6 +158,7 @@ netdev_unregister_flow_api_provider(const char *type) } else { cmap_remove(&netdev_flow_apis, &rfa->cmap_node, hash_string(rfa->flow_api->type, 0)); + ovsrcu_postpone(simap_destroy, &rfa->used_by_dpif_types); ovsrcu_postpone(free, rfa); error = 0; } @@ -182,6 +187,8 @@ netdev_assign_flow_api(struct netdev *netdev) CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) { if (!rfa->flow_api->init_flow_api(netdev)) { ovs_refcount_ref(&rfa->refcnt); + simap_increase(&rfa->used_by_dpif_types, + netdev_get_dpif_type(netdev), 1); ovsrcu_set(&netdev->flow_api, rfa->flow_api); VLOG_INFO("%s: Assigned flow API '%s'.", netdev_get_name(netdev), rfa->flow_api->type); @@ -344,6 +351,8 @@ netdev_uninit_flow_api(struct netdev *netdev) ovsrcu_set(&netdev->flow_api, NULL); rfa = netdev_lookup_flow_api(flow_api->type); + ovs_assert(simap_increase(&rfa->used_by_dpif_types, + netdev_get_dpif_type(netdev), -1) >= 0); ovs_refcount_unref(&rfa->refcnt); } @@ -739,6 +748,20 @@ netdev_ports_get_n_flows(const char *dpif_type, odp_port_t port_no, return ret; } + +void +netdev_offload_meter_set(const char *dpif_type, ...) +{ + struct netdev_registered_flow_api *rfa; + + CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) { + if (simap_get(&rfa->used_by_dpif_types, dpif_type) > 0) { + rfa->flow_api->meter_set(...); + } + } +} + + odp_port_t netdev_ifindex_to_odp_port(int ifindex) { --- ? > >> (I guess, we could rename netdev_flow_api to just offload_api to make >> this more clear, but let's not think about that too much for now to >> not overcomplicate things.) >> >> meter_offload_init/destory APIs might not be even needed. >> >> meter_tc_init_policer can be called directly from the >> netdev_tc_init_flow_api, >> if made re-enter-able (if (meter_police_ids) { return; }), or can be >> called from under the existing ovsthread_once_start(&once). >> >> meter_offload_destory in current implementation only frees the id >> pool. >> This might be not necessary for the following reasons: >> - If datapath is destroyed, but the process is still running, it's >> likely that we're waiting for the creation of a new datapath that >> will re-use the id-pool. >> - If we're going to shut down the OVS, it's nice but not necessary >> to free all the memory before doing so. Since the id-pool is a >> global >> variable, that will not be treated as a memory leak by sanitizers. >> >> What do you think? >> > > It makes sense. > >> >> Forgot to mention in a previous reply that the patch 8/8 of the patch >> set should add a NEWS entry for the new feature (the NEWS file is a >> bit mangled at the moment, but Alin is going to fix it soon). >> Some test in the tests/system-offloads-traffic.at would be also nice >> to have, if possible. >> >>> >>> Thanks! >>> Jianbo >>> >>>> >>>> Another small nit is that, I guess, lib/tc.c is a better home for >>>> most of the >>>> tc_police functions from lib/netdev-linux.c. But I agree that >>>> they >>>> are >>>> already mixed up and it's kind of a separate task to split them. >>>> So, >>>> it's >>>> not really a concern for this patch set. >>>> >>>>> >>>>> One area that we, at Corigine, would like to follow-up on is >>>>> using >>>>> a >>>>> revalidation process rather than evicting rules on start-up. >>>> >>>> If TC police deletion failures can happen, that can become a >>>> problem >>>> for >>>> long-running setups indeed. Not sure what the best solution >>>> should >>>> be, but >>>> I agree that this case has to be handled in some way. >>>> >>>> Best regards, Ilya Maximets. >>> >> >> _______________________________________________ >> dev mailing list >> d...@openvswitch.org >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=dazl4Ccgfrkb7tkne2e6c%2BZTHIjZ2RJqjchon%2BA9l%2BY%3D&reserved=0 > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev