On 4/29/22 20:01, Ilya Maximets wrote:
> On 4/29/22 05:28, Jianbo Liu wrote:
>> On Thu, 2022-04-28 at 14:00 +0200, Ilya Maximets wrote:
>>> On 4/28/22 13:49, Ilya Maximets wrote:
>>>> On 4/28/22 11:06, Jianbo Liu wrote:
>>>>> On Wed, 2022-04-27 at 13:37 +0200, Ilya Maximets wrote:
>>>>>> On 4/27/22 04:41, Jianbo Liu wrote:
>>>>>>> On Wed, 2022-04-27 at 01:58 +0200, Ilya Maximets wrote:
>>>>>>>> On 4/26/22 09:14, Simon Horman wrote:
>>>>>>>>> On Tue, Apr 19, 2022 at 01:38:33AM +0000, Jianbo Liu via
>>>>>>>>> dev
>>>>>>>>> wrote:
>>>>>>>>>> On Thu, 2022-04-07 at 09:15 +0000, Jianbo Liu wrote:
>>>>>>>>>>> This series is to add support for tc offloading of
>>>>>>>>>>> ovs
>>>>>>>>>>> metering, and
>>>>>>>>>>> enhance OVS to use new kernel feature which offload
>>>>>>>>>>> tc
>>>>>>>>>>> police
>>>>>>>>>>> action to
>>>>>>>>>>> hardware.
>>>>>>>>>>> To do the offloading, new APIs for meter are added in
>>>>>>>>>>> netdev-
>>>>>>>>>>> offload,
>>>>>>>>>>> and OVS meters are mapped to tc police actions with
>>>>>>>>>>> one-to-
>>>>>>>>>>> one
>>>>>>>>>>> relationship for dpif-netlink.
>>>>>>>>>>>
>>>>>>>>>>> Notes:
>>>>>>>>>>>     v3
>>>>>>>>>>>     - Add netdev-offload APIs for meter.
>>>>>>>>>>>     - Move the implementation to lower netdev-
>>>>>>>>>>> offload-tc
>>>>>>>>>>> layer.
>>>>>>>>>>>
>>>>>>>>>>>     v2
>>>>>>>>>>>     - Move tc police parse call from last patch to
>>>>>>>>>>> 2nd
>>>>>>>>>>> patch.
>>>>>>>>>>>       2nd patch is adding the parse lib func and add
>>>>>>>>>>> the
>>>>>>>>>>> call
>>>>>>>>>>> to it in
>>>>>>>>>>> that patch.
>>>>>>>>>>>       Last patch is the put of tc police act.
>>>>>>>>>>>       In 2nd patch also add empty switch case for tc
>>>>>>>>>>> police
>>>>>>>>>>> act
>>>>>>>>>>> put as
>>>>>>>>>>> the impl.
>>>>>>>>>>>       is done in the last patch when support for put
>>>>>>>>>>> is
>>>>>>>>>>> being
>>>>>>>>>>> added.
>>>>>>>>>
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>>> Friendly ping for review.
>>>>>>>>>
>>>>>>>>> Hi Jianbo,
>>>>>>>>>
>>>>>>>>> sorry for the delayed response.
>>>>>>>>>
>>>>>>>>> My colleagues at Corigine have exercised this patchset
>>>>>>>>> and have
>>>>>>>>> verified
>>>>>>>>> it for our test-cases.
>>>>>>>>>
>>>>>>>>> Also, I have run the upstream test workflows over them,
>>>>>>>>> and
>>>>>>>>> they
>>>>>>>>> look good.
>>>>>>>>>
>>>>>>>>> 1. netdev-offload: Add meter offload provider
>>>>>>>>>     
>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220399423&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=UmOUgdVlvQuuCPEqlfgRlei5tZtmAv7liSbfVVaWXgE%3D&reserved=0
>>>>>>>>> 2. tc: Add support parsing tc police action
>>>>>>>>>     
>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220409233&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=3CA7zQl8evhu9RIF6BC5pwiLpASiSLN3jJ7%2B4lANKCo%3D&reserved=0
>>>>>>>>> 3. netdev-linux: Refactor put police action netlink
>>>>>>>>> message
>>>>>>>>>     
>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220516986&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Fv0GEkVRMvT5TQo2wxwB%2FB4FMmUu3d2F5mBRONhen1U%3D&reserved=0
>>>>>>>>> 4. netdev-linux: Add functions to manipulate tc police
>>>>>>>>> action
>>>>>>>>>     
>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220566774&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Nm0T35Z4q5RlkAAAXsiwOqbVo7Utm4nVZYfEtYul4gk%3D&reserved=0
>>>>>>>>> 5. netdev-offload-tc: Implement and register meter
>>>>>>>>> offload API
>>>>>>>>> for
>>>>>>>>> tc
>>>>>>>>>     
>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220568491&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=f67ByBBmS7sJC8D4kOcgSOTQSAl5NueDo4NhJL2aozU%3D&reserved=0
>>>>>>>>> 6. netdev-offload-tc: Cleanup police actions with
>>>>>>>>> reserved
>>>>>>>>> indexes
>>>>>>>>> on startup
>>>>>>>>>     
>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220569761&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=1iLtBH6tsyoOSH8HoPzg3o1zcqEkBQUVNjcs%2BHg3%2Fvc%3D&reserved=0
>>>>>>>>> 7. netdev-offload-tc: Offloading rules with police
>>>>>>>>> actions
>>>>>>>>>     
>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220571039&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=i1%2FUfISCkzpZJmSygf1eT2T%2FlMiDFosrcBrIinYJ5x8%3D&reserved=0
>>>>>>>>> 8. dpif-netlink: Offloading meter to tc police action
>>>>>>>>>     
>>>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms2%2Fovs%2Factions%2Fruns%2F2220572904&data=05%7C01%7Cjianbol%40nvidia.com%7Cd62f3e1865b2471053e608da290eb982%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637867440481236224%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=vGH1Em4ddDfnHiayWJXabdyqT7TXNua8awyJoPTD4Fk%3D&reserved=0
>>>>>>>>>
>>>>>>>>> I have also reviewed the patches, and with my OVS
>>>>>>>>> maintainer
>>>>>>>>> hat
>>>>>>>>> on, I'm
>>>>>>>>> happy to apply them, and plan to do so in the coming days
>>>>>>>>> unless
>>>>>>>>> there are
>>>>>>>>> objections.
>>>>>>>>
>>>>>>>> Thanks, Simon, for taking care of this patch set!  And
>>>>>>>> sorry for
>>>>>>>> me
>>>>>>>> not being
>>>>>>>> very responsive.
>>>>>>>>
>>>>>>>> I had a quick look over the patch set just now and I think
>>>>>>>> that
>>>>>>>> in
>>>>>>>> this
>>>>>>>> version most of my main concerns were addressed and it
>>>>>>>> looks good
>>>>>>>> in
>>>>>>>> general.
>>>>>>>>
>>>>>>>> One thing I didn't understand is why we need to have
>>>>>>>> 'struct
>>>>>>>> meter_offload_api'
>>>>>>>> as a separate entity?  I mean, it never used if a normal
>>>>>>>> offload
>>>>>>>> API
>>>>>>>> is
>>>>>>>> not initialized and it's tied with the particular offload
>>>>>>>> API. 
>>>>>>>> Can
>>>>>>>> these
>>>>>>>> callbacks be just part of the normal 'struct
>>>>>>>> netdev_flow_api' ?
>>>>>>>> Should not be hard to merge meter_offload_api and
>>>>>>>> netdev_flow_api
>>>>>>>> into one.
>>>>>>>> That should also reduce the amount of code.  Or am I
>>>>>>>> missing
>>>>>>>> something?
>>>>>>>
>>>>>>> Conceptually, meter is used globally, and it does not belong
>>>>>>> to any
>>>>>>> perticular netdev/flow. All the callbacks in 'struct
>>>>>>> netdev_flow_api'
>>>>>>> have a parameter related to netdev, but meter callbacks don't
>>>>>>> need.
>>>>>>> The meter_offload_set/get/del can be part of 'struct
>>>>>>> netdev_flow_api',
>>>>>>> but meter_offload_init/destory must be placed to elsewhere,
>>>>>>> to
>>>>>>> avoid
>>>>>>> being called more than once.
>>>>>>> So, why not add new API for meter? But I'd like to change if
>>>>>>> you
>>>>>>> insist. What's your decision?
>>>>>>
>>>>>> I agree that meters do not belong to any particular netdev,
>>>>>> that's
>>>>>> true.  However, netdev_flow_api is also a global entity.  All
>>>>>> netdevs
>>>>>> are using the same instance of the netdev_fow_api.  And most of
>>>>>> the
>>>>>> structures in netdev-offload-tc are global (hash maps,
>>>>>> configurations).
>>>>>> 'netdev' argument can just be viewed as a guide for these
>>>>>> callbacks,
>>>>>> like meter_id for the meter offload API, not something
>>>>>> necessary.
>>>>>
>>>>> Usually, we get netdev_flow_api from netdev->flow_api, but
>>>>> dpif_class-
>>>>>> meter_get/set/del don't have the param of netdev, we must add a
>>>>>> helper
>>>>> to get any netdev from dpif. The helper is more like:
>>>>>  
>>>>> struct netdev *
>>>>> netdev_get_any(const char *dpif_type)
>>>>> {
>>>>>     struct port_to_netdev_data *data;
>>>>>     struct netdev *dev = NULL;
>>>>>
>>>>>     ovs_rwlock_rdlock(&netdev_hmap_rwlock);
>>>>>     HMAP_FOR_EACH (data, portno_node, &port_to_netdev) {
>>>>>         if (netdev_get_dpif_type(data->netdev) == dpif_type) {
>>>>>             dev = netdev_ref(data->netdev);
>>>>>             break;
>>>>>         }
>>>>>     }
>>>>>     ovs_rwlock_unlock(&netdev_hmap_rwlock);
>>>>>
>>>>>     return dev;
>>>>> }
>>>>>
>>>>> Do you accept it, or have better solution?
>>>>
>>>> Hmm.  Good point, I missed that.
>>>> In fact, a single dpif can have ports with different offload API
>>>> initialized.  For example, linux ports may use TC, while DPDK ports
>>>> can use DPDK's rte_flow while being in the same dpif-netdev
>>>> instance.
>>>>
>>>> So, I think, the correct solution is to add a meter to all flow
>>>> APIs
>>>> that are initialized for ports within that datapath.  Maybe
>>>> something
>>>> like this (not tested, pseudo-code):
>>>>
>>>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c
>>>> index fb108c0d5..91504ee15 100644
>>>> --- a/lib/netdev-offload.c
>>>> +++ b/lib/netdev-offload.c
>>>> @@ -79,6 +79,9 @@ struct netdev_registered_flow_api {
>>>>      /* Number of references: one for the flow_api itself and one
>>>> for every
>>>>       * instance of the netdev that uses it. */
>>>>      struct ovs_refcount refcnt;
>>>> +
>>>> +    /* Reference counter for dpif types which has a port with this
>>>> flow API. */
>>>> +    struct simap used_by_dpif_types;
>>>>  };
>>>>  
>>>>  static struct netdev_registered_flow_api *
>>>> @@ -120,6 +123,7 @@ netdev_register_flow_api_provider(const struct
>>>> netdev_flow_api *new_flow_api)
>>>>                      hash_string(new_flow_api->type, 0));
>>>>          rfa->flow_api = new_flow_api;
>>>>          ovs_refcount_init(&rfa->refcnt);
>>>> +        simap_init(&rfa->used_by_dpif_types);
>>>>          VLOG_DBG("netdev: flow API '%s' registered.",
>>>> new_flow_api->type);
>>>>      }
>>>>      ovs_mutex_unlock(&netdev_flow_api_provider_mutex);
>>>> @@ -154,6 +158,7 @@ netdev_unregister_flow_api_provider(const char
>>>> *type)
>>>>      } else  {
>>>>          cmap_remove(&netdev_flow_apis, &rfa->cmap_node,
>>>>                      hash_string(rfa->flow_api->type, 0));
>>>> +        ovsrcu_postpone(simap_destroy, &rfa->used_by_dpif_types);
>>>>          ovsrcu_postpone(free, rfa);
>>>>          error = 0;
>>>>      }
>>>> @@ -182,6 +187,8 @@ netdev_assign_flow_api(struct netdev *netdev)
>>>>      CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
>>>>          if (!rfa->flow_api->init_flow_api(netdev)) {
>>>>              ovs_refcount_ref(&rfa->refcnt);
>>>> +            simap_increase(&rfa->used_by_dpif_types,
>>>> +                           netdev_get_dpif_type(netdev), 1);
>>>>              ovsrcu_set(&netdev->flow_api, rfa->flow_api);
>>>>              VLOG_INFO("%s: Assigned flow API '%s'.",
>>>>                        netdev_get_name(netdev), rfa->flow_api-
>>>>> type);
>>>> @@ -344,6 +351,8 @@ netdev_uninit_flow_api(struct netdev *netdev)
>>>>  
>>>>      ovsrcu_set(&netdev->flow_api, NULL);
>>>>      rfa = netdev_lookup_flow_api(flow_api->type);
>>>> +    ovs_assert(simap_increase(&rfa->used_by_dpif_types,
>>>> +                              netdev_get_dpif_type(netdev), -1) >=
>>>> 0);
>>>>      ovs_refcount_unref(&rfa->refcnt);
>>>>  }
>>>>  
>>>> @@ -739,6 +748,20 @@ netdev_ports_get_n_flows(const char
>>>> *dpif_type, odp_port_t port_no,
>>>>      return ret;
>>>>  }
>>>>  
>>>> +
>>>> +void
>>>> +netdev_offload_meter_set(const char *dpif_type, ...)
>>>> +{
>>>> +    struct netdev_registered_flow_api *rfa;
>>>> +
>>>> +    CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
>>>> +        if (simap_get(&rfa->used_by_dpif_types, dpif_type) > 0) {
>>>> +            rfa->flow_api->meter_set(...);
>>>> +        }
>>>> +    }
>>>> +}
>>>> +
>>>> +
>>>>  odp_port_t
>>>>  netdev_ifindex_to_odp_port(int ifindex)
>>>>  {
>>>> ---
>>>>
>>>> ?
>>>
>>> Thinking more, there is no point in counting netdevs, since they
>>> can be added after meters already created.  So, I guess, we just
>>> have to always add meters to all registered flow API providers,
>>> i.e. just:
>>>
>>> void
>>> netdev_offload_meter_set(...)
>>> {
>>>     struct netdev_registered_flow_api *rfa;
>>>
>>>     CMAP_FOR_EACH (rfa, cmap_node, &netdev_flow_apis) {
>>>         rfa->flow_api->meter_set(...);
>>>     }
>>> }
>>
>> You know I will put this netdev_offload_meter_set in
>> dpif_netlink_meter_set. But, in the future, someone may implement
>> meter_set for netdev_offload_dpdk, so it will be called as well. It
>> looks unreasonable as we want to call linux_tc's meter_set only,
>> dpdk_flow_api's meter_set should be called in dpif_netdev. 
> 
> meter_set for netdev_offload_dpdk will not be called unless
> netdev_offload_dpdk is registered and it will not be registered
> unless dpdk is explicitely enabled by the user via database knob.
> And if user enables dpdk, they likely want to use userpsace
> datapath.  So that doesn't sound like a big problem.
> 
> Userspace datapath though is more versatile and it may want
> to use several different offload APIs at the same time.  And
> we don't really know for packets from which ports the meter
> will be used.  And since they may have different offload APIs
> we kind of have to create meters in all registered.

Alternative solution is to not create meters anywhere at all,
but only store them globally inside the netdev_offload module.
Flow API implementations will look up and create the meter
on-demand while creating a flow that is going to use that meter.
Not sure how deletions should look like in this case.

> 
> Binding a flow API to a specific datapath doesn't look good to me.
> 
> I guess, this also raises the point of using the same meter
> for flows installed in kernel and TC for dpif-netlink, as well
> as using the same meter for flows in TC and DPDK with userspace
> datapath.
> But that problem seems to not have a clear solution as
> bandwidth sharing doesn't seem to be possible in such cases.
> 
> Best regards, Ilya Maximets.

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to