On 4/28/22 11:06, Jianbo Liu wrote:
> On Wed, 2022-04-27 at 13:37 +0200, Ilya Maximets wrote:
>> On 4/27/22 04:41, Jianbo Liu wrote:
>>> On Wed, 2022-04-27 at 01:58 +0200, Ilya Maximets wrote:
>>>> On 4/26/22 09:14, Simon Horman wrote:
>>>>> On Tue, Apr 19, 2022 at 01:38:33AM +0000, Jianbo Liu via dev
>>>>> wrote:
>>>>>> On Thu, 2022-04-07 at 09:15 +0000, Jianbo Liu wrote:
>>>>>>> This series is to add support for tc offloading of ovs
>>>>>>> metering, and
>>>>>>> enhance OVS to use new kernel feature which offload tc
>>>>>>> police
>>>>>>> action to
>>>>>>> hardware.
>>>>>>> To do the offloading, new APIs for meter are added in
>>>>>>> netdev-
>>>>>>> offload,
>>>>>>> and OVS meters are mapped to tc police actions with one-to-
>>>>>>> one
>>>>>>> relationship for dpif-netlink.
>>>>>>>
>>>>>>> Notes:
>>>>>>>     v3
>>>>>>>     - Add netdev-offload APIs for meter.
>>>>>>>     - Move the implementation to lower netdev-offload-tc
>>>>>>> layer.
>>>>>>>
>>>>>>>     v2
>>>>>>>     - Move tc police parse call from last patch to 2nd
>>>>>>> patch.
>>>>>>>       2nd patch is adding the parse lib func and add the
>>>>>>> call
>>>>>>> to it in
>>>>>>> that patch.
>>>>>>>       Last patch is the put of tc police act.
>>>>>>>       In 2nd patch also add empty switch case for tc police
>>>>>>> act
>>>>>>> put as
>>>>>>> the impl.
>>>>>>>       is done in the last patch when support for put is
>>>>>>> being
>>>>>>> added.
>>>>>
>>>>> ...
>>>>>
>>>>>> Friendly ping for review.
>>>>>
>>>>> Hi Jianbo,
>>>>>
>>>>> sorry for the delayed response.
>>>>>
>>>>> My colleagues at Corigine have exercised this patchset and have
>>>>> verified
>>>>> it for our test-cases.
>>>>>
>>>>> Also, I have run the upstream test workflows over them, and
>>>>> they
>>>>> look good.
>>>>>
>>>>> 1. netdev-offload: Add meter offload provider
>>>>>     
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220399423&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=8lRjZK35bENpZ3vRbMFBfGWJLXX4p%2BfjR57JEhADucw%3D&reserved=0
>>>>> 2. tc: Add support parsing tc police action
>>>>>     
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220409233&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=iorfBm3gxzS3fILCBF8h18%2FWLSUQXEhc2g7jWMn%2F4ug%3D&reserved=0
>>>>> 3. netdev-linux: Refactor put police action netlink message
>>>>>     
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220516986&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=9%2BKMNZqAe6CM7ysWXwbeFpIyf%2FF44NdSJj6WVN46IdM%3D&reserved=0
>>>>> 4. netdev-linux: Add functions to manipulate tc police action
>>>>>     
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220566774&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=wMPlN5cYZJfZloAZ48bKLpS3SKSMTyZAovWlSPICXxY%3D&reserved=0
>>>>> 5. netdev-offload-tc: Implement and register meter offload API
>>>>> for
>>>>> tc
>>>>>     
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220568491&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=MFKxDa7ZDScho536VVERyLS00uF76BgcbCS1q%2BRQoTI%3D&reserved=0
>>>>> 6. netdev-offload-tc: Cleanup police actions with reserved
>>>>> indexes
>>>>> on startup
>>>>>     
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220569761&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ef%2BFz7twkOqc9PQC%2FQPkgaqHwreKHpnw4iuPyp%2BdFDQ%3D&reserved=0
>>>>> 7. netdev-offload-tc: Offloading rules with police actions
>>>>>     
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220571039&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=VjzfzOLeU%2FRgHhFBfGwTAUXvcRjIuYsY4%2BNV4l3OEb8%3D&reserved=0
>>>>> 8. dpif-netlink: Offloading meter to tc police action
>>>>>     
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220572904&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=HWC5u8pDX3vPRfeoJnXXBUiMlJDNYeApPFNauX2zVbc%3D&reserved=0
>>>>>
>>>>> I have also reviewed the patches, and with my OVS maintainer
>>>>> hat
>>>>> on, I'm
>>>>> happy to apply them, and plan to do so in the coming days
>>>>> unless
>>>>> there are
>>>>> objections.
>>>>
>>>> Thanks, Simon, for taking care of this patch set!  And sorry for
>>>> me
>>>> not being
>>>> very responsive.
>>>>
>>>> I had a quick look over the patch set just now and I think that
>>>> in
>>>> this
>>>> version most of my main concerns were addressed and it looks good
>>>> in
>>>> general.
>>>>
>>>> One thing I didn't understand is why we need to have 'struct
>>>> meter_offload_api'
>>>> as a separate entity?  I mean, it never used if a normal offload
>>>> API
>>>> is
>>>> not initialized and it's tied with the particular offload API. 
>>>> Can
>>>> these
>>>> callbacks be just part of the normal 'struct netdev_flow_api' ?
>>>> Should not be hard to merge meter_offload_api and netdev_flow_api
>>>> into one.
>>>> That should also reduce the amount of code.  Or am I missing
>>>> something?
>>>
>>> Conceptually, meter is used globally, and it does not belong to any
>>> perticular netdev/flow. All the callbacks in 'struct
>>> netdev_flow_api'
>>> have a parameter related to netdev, but meter callbacks don't need.
>>> The meter_offload_set/get/del can be part of 'struct
>>> netdev_flow_api',
>>> but meter_offload_init/destory must be placed to elsewhere, to
>>> avoid
>>> being called more than once.
>>> So, why not add new API for meter? But I'd like to change if you
>>> insist. What's your decision?
>>
>> I agree that meters do not belong to any particular netdev, that's
>> true.  However, netdev_flow_api is also a global entity.  All netdevs
>> are using the same instance of the netdev_fow_api.  And most of the
>> structures in netdev-offload-tc are global (hash maps,
>> configurations).
>> 'netdev' argument can just be viewed as a guide for these callbacks,
>> like meter_id for the meter offload API, not something necessary.
> 
> Usually, we get netdev_flow_api from netdev->flow_api, but dpif_class-
>> meter_get/set/del don't have the param of netdev, we must add a helper
> to get any netdev from dpif. The helper is more like:
>  
> struct netdev *
> netdev_get_any(const char *dpif_type)
> {
>     struct port_to_netdev_data *data;
>     struct netdev *dev = NULL;
> 
>     ovs_rwlock_rdlock(&netdev_hmap_rwlock);
>     HMAP_FOR_EACH (data, portno_node, &port_to_netdev) {
>         if (netdev_get_dpif_type(data->netdev) == dpif_type) {
>             dev = netdev_ref(data->netdev);
>             break;
>         }
>     }
>     ovs_rwlock_unlock(&netdev_hmap_rwlock);
> 
>     return dev;
> }
> 
> Do you accept it, or have better solution?

Hmm.  Good point, I missed that.
In fact, a single dpif can have ports with different offload API
initialized.  For example, linux ports may use TC, while DPDK ports
can use DPDK's rte_flow while being in the same dpif-netdev instance.

So, I think, the correct solution is to add a meter to all flow APIs
that are initialized for ports within that datapath.  Maybe something
like this (not tested, pseudo-code):

diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
index fb108c0d5..91504ee15 100644
--- a/lib/netdev-offload.c
+++ b/lib/netdev-offload.c
@@ -79,6 +79,9 @@ struct netdev_registered_flow_api {
     /* Number of references: one for the flow_api itself and one for every
      * instance of the netdev that uses it. */
     struct ovs_refcount refcnt;
+
+    /* Reference counter for dpif types which has a port with this flow API. */
+    struct simap used_by_dpif_types;
 };
 
 static struct netdev_registered_flow_api *
@@ -120,6 +123,7 @@ netdev_register_flow_api_provider(const struct 
netdev_flow_api *new_flow_api)
                     hash_string(new_flow_api->type, 0));
         rfa->flow_api = new_flow_api;
         ovs_refcount_init(&rfa->refcnt);
+        simap_init(&rfa->used_by_dpif_types);
         VLOG_DBG("netdev: flow API '%s' registered.", new_flow_api->type);
     }
     ovs_mutex_unlock(&netdev_flow_api_provider_mutex);
@@ -154,6 +158,7 @@ netdev_unregister_flow_api_provider(const char *type)
     } else  {
         cmap_remove(&netdev_flow_apis, &rfa->cmap_node,
                     hash_string(rfa->flow_api->type, 0));
+        ovsrcu_postpone(simap_destroy, &rfa->used_by_dpif_types);
         ovsrcu_postpone(free, rfa);
         error = 0;
     }
@@ -182,6 +187,8 @@ netdev_assign_flow_api(struct netdev *netdev)
     CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
         if (!rfa->flow_api->init_flow_api(netdev)) {
             ovs_refcount_ref(&rfa->refcnt);
+            simap_increase(&rfa->used_by_dpif_types,
+                           netdev_get_dpif_type(netdev), 1);
             ovsrcu_set(&netdev->flow_api, rfa->flow_api);
             VLOG_INFO("%s: Assigned flow API '%s'.",
                       netdev_get_name(netdev), rfa->flow_api->type);
@@ -344,6 +351,8 @@ netdev_uninit_flow_api(struct netdev *netdev)
 
     ovsrcu_set(&netdev->flow_api, NULL);
     rfa = netdev_lookup_flow_api(flow_api->type);
+    ovs_assert(simap_increase(&rfa->used_by_dpif_types,
+                              netdev_get_dpif_type(netdev), -1) >= 0);
     ovs_refcount_unref(&rfa->refcnt);
 }
 
@@ -739,6 +748,20 @@ netdev_ports_get_n_flows(const char *dpif_type, odp_port_t 
port_no,
     return ret;
 }
 
+
+void
+netdev_offload_meter_set(const char *dpif_type, ...)
+{
+    struct netdev_registered_flow_api *rfa;
+
+    CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
+        if (simap_get(&rfa->used_by_dpif_types, dpif_type) > 0) {
+            rfa->flow_api->meter_set(...);
+        }
+    }
+}
+
+
 odp_port_t
 netdev_ifindex_to_odp_port(int ifindex)
 {
---

?

> 
>> (I guess, we could rename netdev_flow_api to just offload_api to make
>> this more clear, but let's not think about that too much for now to
>> not overcomplicate things.)
>>
>> meter_offload_init/destory APIs might not be even needed.
>>
>> meter_tc_init_policer can be called directly from the
>> netdev_tc_init_flow_api,
>> if made re-enter-able (if (meter_police_ids) { return; }), or can be
>> called from under the existing ovsthread_once_start(&once).
>>
>> meter_offload_destory in current implementation only frees the id
>> pool.
>> This might be not necessary for the following reasons:
>> - If datapath is destroyed, but the process is still running, it's
>>   likely that we're waiting for the creation of a new datapath that
>>   will re-use the id-pool.
>> - If we're going to shut down the OVS, it's nice but not necessary
>>   to free all the memory before doing so.  Since the id-pool is a
>> global
>>   variable, that will not be treated as a memory leak by sanitizers.
>>
>> What do you think?
>>
> 
> It makes sense.
> 
>>
>> Forgot to mention in a previous reply that the patch 8/8 of the patch
>> set should add a NEWS entry for the new feature (the NEWS file is a
>> bit mangled at the moment, but Alin is going to fix it soon).
>> Some test in the tests/system-offloads-traffic.at would be also nice
>> to have, if possible.
>>
>>>
>>> Thanks!
>>> Jianbo 
>>>
>>>>
>>>> Another small nit is that, I guess, lib/tc.c is a better home for
>>>> most of the
>>>> tc_police functions from lib/netdev-linux.c.  But I agree that
>>>> they
>>>> are
>>>> already mixed up and it's kind of a separate task to split them. 
>>>> So,
>>>> it's
>>>> not really a concern for this patch set.
>>>>
>>>>>
>>>>> One area that we, at Corigine, would like to follow-up on is
>>>>> using
>>>>> a
>>>>> revalidation process rather than evicting rules on start-up.
>>>>
>>>> If TC police deletion failures can happen, that can become a
>>>> problem
>>>> for
>>>> long-running setups indeed.  Not sure what the best solution
>>>> should
>>>> be, but
>>>> I agree that this case has to be handled in some way.
>>>>
>>>> Best regards, Ilya Maximets.
>>>
>>
>> _______________________________________________
>> dev mailing list
>> d...@openvswitch.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=05%7C01%7Cjianbol%40nvidia.com%7Ccb6b0d3e0c3c4d3d295c08da284260d3%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637866562838020340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=dazl4Ccgfrkb7tkne2e6c%2BZTHIjZ2RJqjchon%2BA9l%2BY%3D&reserved=0
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to