On 4/29/22 05:28, Jianbo Liu wrote:
> On Thu, 2022-04-28 at 14:00 +0200, Ilya Maximets wrote:
>> On 4/28/22 13:49, Ilya Maximets wrote:
>>> On 4/28/22 11:06, Jianbo Liu wrote:
>>>> On Wed, 2022-04-27 at 13:37 +0200, Ilya Maximets wrote:
>>>>> On 4/27/22 04:41, Jianbo Liu wrote:
>>>>>> On Wed, 2022-04-27 at 01:58 +0200, Ilya Maximets wrote:
>>>>>>> On 4/26/22 09:14, Simon Horman wrote:
>>>>>>>> On Tue, Apr 19, 2022 at 01:38:33AM +0000, Jianbo Liu via
>>>>>>>> dev
>>>>>>>> wrote:
>>>>>>>>> On Thu, 2022-04-07 at 09:15 +0000, Jianbo Liu wrote:
>>>>>>>>>> This series is to add support for tc offloading of
>>>>>>>>>> ovs
>>>>>>>>>> metering, and
>>>>>>>>>> enhance OVS to use new kernel feature which offload
>>>>>>>>>> tc
>>>>>>>>>> police
>>>>>>>>>> action to
>>>>>>>>>> hardware.
>>>>>>>>>> To do the offloading, new APIs for meter are added in
>>>>>>>>>> netdev-
>>>>>>>>>> offload,
>>>>>>>>>> and OVS meters are mapped to tc police actions with
>>>>>>>>>> one-to-
>>>>>>>>>> one
>>>>>>>>>> relationship for dpif-netlink.
>>>>>>>>>>
>>>>>>>>>> Notes:
>>>>>>>>>>     v3
>>>>>>>>>>     - Add netdev-offload APIs for meter.
>>>>>>>>>>     - Move the implementation to lower netdev-
>>>>>>>>>> offload-tc
>>>>>>>>>> layer.
>>>>>>>>>>
>>>>>>>>>>     v2
>>>>>>>>>>     - Move tc police parse call from last patch to
>>>>>>>>>> 2nd
>>>>>>>>>> patch.
>>>>>>>>>>       2nd patch is adding the parse lib func and add
>>>>>>>>>> the
>>>>>>>>>> call
>>>>>>>>>> to it in
>>>>>>>>>> that patch.
>>>>>>>>>>       Last patch is the put of tc police act.
>>>>>>>>>>       In 2nd patch also add empty switch case for tc
>>>>>>>>>> police
>>>>>>>>>> act
>>>>>>>>>> put as
>>>>>>>>>> the impl.
>>>>>>>>>>       is done in the last patch when support for put
>>>>>>>>>> is
>>>>>>>>>> being
>>>>>>>>>> added.
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>>> Friendly ping for review.
>>>>>>>>
>>>>>>>> Hi Jianbo,
>>>>>>>>
>>>>>>>> sorry for the delayed response.
>>>>>>>>
>>>>>>>> My colleagues at Corigine have exercised this patchset
>>>>>>>> and have
>>>>>>>> verified
>>>>>>>> it for our test-cases.
>>>>>>>>
>>>>>>>> Also, I have run the upstream test workflows over them,
>>>>>>>> and
>>>>>>>> they
>>>>>>>> look good.
>>>>>>>>
>>>>>>>> 1. netdev-offload: Add meter offload provider
>>>>>>>>     
>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220399423&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=UmOUgdVlvQuuCPEqlfgRlei5tZtmAv7liSbfVVaWXgE%3D&reserved=0
>>>>>>>> 2. tc: Add support parsing tc police action
>>>>>>>>     
>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220409233&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=3CA7zQl8evhu9RIF6BC5pwiLpASiSLN3jJ7%2B4lANKCo%3D&reserved=0
>>>>>>>> 3. netdev-linux: Refactor put police action netlink
>>>>>>>> message
>>>>>>>>     
>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220516986&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Fv0GEkVRMvT5TQo2wxwB%2FB4FMmUu3d2F5mBRONhen1U%3D&reserved=0
>>>>>>>> 4. netdev-linux: Add functions to manipulate tc police
>>>>>>>> action
>>>>>>>>     
>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220566774&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Nm0T35Z4q5RlkAAAXsiwOqbVo7Utm4nVZYfEtYul4gk%3D&reserved=0
>>>>>>>> 5. netdev-offload-tc: Implement and register meter
>>>>>>>> offload API
>>>>>>>> for
>>>>>>>> tc
>>>>>>>>     
>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220568491&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=f67ByBBmS7sJC8D4kOcgSOTQSAl5NueDo4NhJL2aozU%3D&reserved=0
>>>>>>>> 6. netdev-offload-tc: Cleanup police actions with
>>>>>>>> reserved
>>>>>>>> indexes
>>>>>>>> on startup
>>>>>>>>     
>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220569761&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=1iLtBH6tsyoOSH8HoPzg3o1zcqEkBQUVNjcs%2BHg3%2Fvc%3D&reserved=0
>>>>>>>> 7. netdev-offload-tc: Offloading rules with police
>>>>>>>> actions
>>>>>>>>     
>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220571039&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=i1%2FUfISCkzpZJmSygf1eT2T%2FlMiDFosrcBrIinYJ5x8%3D&reserved=0
>>>>>>>> 8. dpif-netlink: Offloading meter to tc police action
>>>>>>>>     
>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220572904&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=vGH1Em4ddDfnHiayWJXabdyqT7TXNua8awyJoPTD4Fk%3D&reserved=0
>>>>>>>>
>>>>>>>> I have also reviewed the patches, and with my OVS
>>>>>>>> maintainer
>>>>>>>> hat
>>>>>>>> on, I'm
>>>>>>>> happy to apply them, and plan to do so in the coming days
>>>>>>>> unless
>>>>>>>> there are
>>>>>>>> objections.
>>>>>>>
>>>>>>> Thanks, Simon, for taking care of this patch set!  And
>>>>>>> sorry for
>>>>>>> me
>>>>>>> not being
>>>>>>> very responsive.
>>>>>>>
>>>>>>> I had a quick look over the patch set just now and I think
>>>>>>> that
>>>>>>> in
>>>>>>> this
>>>>>>> version most of my main concerns were addressed and it
>>>>>>> looks good
>>>>>>> in
>>>>>>> general.
>>>>>>>
>>>>>>> One thing I didn't understand is why we need to have
>>>>>>> 'struct
>>>>>>> meter_offload_api'
>>>>>>> as a separate entity?  I mean, it never used if a normal
>>>>>>> offload
>>>>>>> API
>>>>>>> is
>>>>>>> not initialized and it's tied with the particular offload
>>>>>>> API. 
>>>>>>> Can
>>>>>>> these
>>>>>>> callbacks be just part of the normal 'struct
>>>>>>> netdev_flow_api' ?
>>>>>>> Should not be hard to merge meter_offload_api and
>>>>>>> netdev_flow_api
>>>>>>> into one.
>>>>>>> That should also reduce the amount of code.  Or am I
>>>>>>> missing
>>>>>>> something?
>>>>>>
>>>>>> Conceptually, meter is used globally, and it does not belong
>>>>>> to any
>>>>>> perticular netdev/flow. All the callbacks in 'struct
>>>>>> netdev_flow_api'
>>>>>> have a parameter related to netdev, but meter callbacks don't
>>>>>> need.
>>>>>> The meter_offload_set/get/del can be part of 'struct
>>>>>> netdev_flow_api',
>>>>>> but meter_offload_init/destory must be placed to elsewhere,
>>>>>> to
>>>>>> avoid
>>>>>> being called more than once.
>>>>>> So, why not add new API for meter? But I'd like to change if
>>>>>> you
>>>>>> insist. What's your decision?
>>>>>
>>>>> I agree that meters do not belong to any particular netdev,
>>>>> that's
>>>>> true.  However, netdev_flow_api is also a global entity.  All
>>>>> netdevs
>>>>> are using the same instance of the netdev_fow_api.  And most of
>>>>> the
>>>>> structures in netdev-offload-tc are global (hash maps,
>>>>> configurations).
>>>>> 'netdev' argument can just be viewed as a guide for these
>>>>> callbacks,
>>>>> like meter_id for the meter offload API, not something
>>>>> necessary.
>>>>
>>>> Usually, we get netdev_flow_api from netdev->flow_api, but
>>>> dpif_class-
>>>>> meter_get/set/del don't have the param of netdev, we must add a
>>>>> helper
>>>> to get any netdev from dpif. The helper is more like:
>>>>  
>>>> struct netdev *
>>>> netdev_get_any(const char *dpif_type)
>>>> {
>>>>     struct port_to_netdev_data *data;
>>>>     struct netdev *dev = NULL;
>>>>
>>>>     ovs_rwlock_rdlock(&netdev_hmap_rwlock);
>>>>     HMAP_FOR_EACH (data, portno_node, &port_to_netdev) {
>>>>         if (netdev_get_dpif_type(data->netdev) == dpif_type) {
>>>>             dev = netdev_ref(data->netdev);
>>>>             break;
>>>>         }
>>>>     }
>>>>     ovs_rwlock_unlock(&netdev_hmap_rwlock);
>>>>
>>>>     return dev;
>>>> }
>>>>
>>>> Do you accept it, or have better solution?
>>>
>>> Hmm.  Good point, I missed that.
>>> In fact, a single dpif can have ports with different offload API
>>> initialized.  For example, linux ports may use TC, while DPDK ports
>>> can use DPDK's rte_flow while being in the same dpif-netdev
>>> instance.
>>>
>>> So, I think, the correct solution is to add a meter to all flow
>>> APIs
>>> that are initialized for ports within that datapath.  Maybe
>>> something
>>> like this (not tested, pseudo-code):
>>>
>>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
>>> index fb108c0d5..91504ee15 100644
>>> --- a/lib/netdev-offload.c
>>> +++ b/lib/netdev-offload.c
>>> @@ -79,6 +79,9 @@ struct netdev_registered_flow_api {
>>>      /* Number of references: one for the flow_api itself and one
>>> for every
>>>       * instance of the netdev that uses it. */
>>>      struct ovs_refcount refcnt;
>>> +
>>> +    /* Reference counter for dpif types which has a port with this
>>> flow API. */
>>> +    struct simap used_by_dpif_types;
>>>  };
>>>  
>>>  static struct netdev_registered_flow_api *
>>> @@ -120,6 +123,7 @@ netdev_register_flow_api_provider(const struct
>>> netdev_flow_api *new_flow_api)
>>>                      hash_string(new_flow_api->type, 0));
>>>          rfa->flow_api = new_flow_api;
>>>          ovs_refcount_init(&rfa->refcnt);
>>> +        simap_init(&rfa->used_by_dpif_types);
>>>          VLOG_DBG("netdev: flow API '%s' registered.",
>>> new_flow_api->type);
>>>      }
>>>      ovs_mutex_unlock(&netdev_flow_api_provider_mutex);
>>> @@ -154,6 +158,7 @@ netdev_unregister_flow_api_provider(const char
>>> *type)
>>>      } else  {
>>>          cmap_remove(&netdev_flow_apis, &rfa->cmap_node,
>>>                      hash_string(rfa->flow_api->type, 0));
>>> +        ovsrcu_postpone(simap_destroy, &rfa->used_by_dpif_types);
>>>          ovsrcu_postpone(free, rfa);
>>>          error = 0;
>>>      }
>>> @@ -182,6 +187,8 @@ netdev_assign_flow_api(struct netdev *netdev)
>>>      CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
>>>          if (!rfa->flow_api->init_flow_api(netdev)) {
>>>              ovs_refcount_ref(&rfa->refcnt);
>>> +            simap_increase(&rfa->used_by_dpif_types,
>>> +                           netdev_get_dpif_type(netdev), 1);
>>>              ovsrcu_set(&netdev->flow_api, rfa->flow_api);
>>>              VLOG_INFO("%s: Assigned flow API '%s'.",
>>>                        netdev_get_name(netdev), rfa->flow_api-
>>>> type);
>>> @@ -344,6 +351,8 @@ netdev_uninit_flow_api(struct netdev *netdev)
>>>  
>>>      ovsrcu_set(&netdev->flow_api, NULL);
>>>      rfa = netdev_lookup_flow_api(flow_api->type);
>>> +    ovs_assert(simap_increase(&rfa->used_by_dpif_types,
>>> +                              netdev_get_dpif_type(netdev), -1) >=
>>> 0);
>>>      ovs_refcount_unref(&rfa->refcnt);
>>>  }
>>>  
>>> @@ -739,6 +748,20 @@ netdev_ports_get_n_flows(const char
>>> *dpif_type, odp_port_t port_no,
>>>      return ret;
>>>  }
>>>  
>>> +
>>> +void
>>> +netdev_offload_meter_set(const char *dpif_type, ...)
>>> +{
>>> +    struct netdev_registered_flow_api *rfa;
>>> +
>>> +    CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
>>> +        if (simap_get(&rfa->used_by_dpif_types, dpif_type) > 0) {
>>> +            rfa->flow_api->meter_set(...);
>>> +        }
>>> +    }
>>> +}
>>> +
>>> +
>>>  odp_port_t
>>>  netdev_ifindex_to_odp_port(int ifindex)
>>>  {
>>> ---
>>>
>>> ?
>>
>> Thinking more, there is no point in counting netdevs, since they
>> can be added after meters already created.  So, I guess, we just
>> have to always add meters to all registered flow API providers,
>> i.e. just:
>>
>> void
>> netdev_offload_meter_set(...)
>> {
>>     struct netdev_registered_flow_api *rfa;
>>
>>     CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
>>         rfa->flow_api->meter_set(...);
>>     }
>> }
> 
> You know I will put this netdev_offload_meter_set in
> dpif_netlink_meter_set. But, in the future, someone may implement
> meter_set for netdev_offload_dpdk, so it will be called as well. It
> looks unreasonable as we want to call linux_tc's meter_set only,
> dpdk_flow_api's meter_set should be called in dpif_netdev. 

meter_set for netdev_offload_dpdk will not be called unless
netdev_offload_dpdk is registered and it will not be registered
unless dpdk is explicitely enabled by the user via database knob.
And if user enables dpdk, they likely want to use userpsace
datapath.  So that doesn't sound like a big problem.

Userspace datapath though is more versatile and it may want
to use several different offload APIs at the same time.  And
we don't really know for packets from which ports the meter
will be used.  And since they may have different offload APIs
we kind of have to create meters in all registered.

Binding a flow API to a specific datapath doesn't look good to me.

I guess, this also raises the point of using the same meter
for flows installed in kernel and TC for dpif-netlink, as well
as using the same meter for flows in TC and DPDK with userspace
datapath.
But that problem seems to not have a clear solution as
bandwidth sharing doesn't seem to be possible in such cases.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to