On 2020-12-14 21:35, Cong Wang wrote:
On Mon, Dec 14, 2020 at 7:13 AM Maxim Mikityanskiy <maxi...@nvidia.com> wrote:

On 2020-12-11 21:16, Cong Wang wrote:
On Fri, Dec 11, 2020 at 7:26 AM Maxim Mikityanskiy <maxi...@mellanox.com> wrote:

HTB doesn't scale well because of contention on a single lock, and it
also consumes CPU. This patch adds support for offloading HTB to
hardware that supports hierarchical rate limiting.

This solution addresses two main problems of scaling HTB:

1. Contention by flow classification. Currently the filters are attached
to the HTB instance as follows:

I do not think this is the reason; tcf_classify() has been called under RCU protection only (no qdisc lock) on the ingress side for a rather long time. What contention are you talking about here?

When one attaches filters to HTB, tcf_classify() is called from htb_classify(), which is called from htb_enqueue(), and htb_enqueue() runs with the root spinlock of the qdisc held.
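
To make the locking concrete, here is a condensed paraphrase of that egress path (based on net/core/dev.c and net/sched/sch_htb.c; error handling and most details are elided, so treat it as a sketch rather than the literal code):

static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
                                 struct net_device *dev,
                                 struct netdev_queue *txq)
{
        spinlock_t *root_lock = qdisc_lock(q);
        struct sk_buff *to_free = NULL;
        int rc;

        /* The root lock serializes everything below, including the
         * classification done inside the qdisc's enqueue.
         */
        spin_lock(root_lock);
        rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK; /* htb_enqueue() */
        spin_unlock(root_lock);

        return rc;
}

static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                       struct sk_buff **to_free)
{
        int ret;
        /* htb_classify() ends up in tcf_classify(); at this point the
         * root lock is already held by __dev_xmit_skb() above.
         */
        struct htb_class *cl = htb_classify(skb, sch, &ret);
        /* ... enqueue to the class's internal queue ... */
}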

So it has nothing to do with tcf_classify() itself... :-/

[...]

And doesn't TBF already work with mq? I mean, you can attach it as a leaf to each mq class so that the tree lock will not be shared either, but you'd lose the benefits of a global rate limit too.

Yes, I'd lose not only the global rate limit, but also multi-level
hierarchical limits, which are all provided by this HTB offload - that's
why TBF is not really a replacement for this feature.

Interesting. Please explain how your HTB offload still provides a global rate limit and borrowing across queues.

Sure, I will explain that.

I simply can't see it; all I can see is that you offload HTB into each queue in ->attach(),

In the non-offload mode, the same HTB instance would be attached to all queues. In the offload mode, HTB behaves like MQ: there is a root instance of HTB, but each queue gets a separate simple qdisc (pfifo). Only the root qdisc (HTB) gets offloaded, and when that happens, the NIC creates an object for the QoS root.
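
To illustrate what that ->attach() does in offload mode, here is a rough sketch modeled on mq_attach() (illustrative only, not the literal patch; htb_attach_offload is my name for it): the offloaded root grafts a default pfifo onto every TX queue, while the tree itself lives in the NIC.

static void htb_attach_offload(struct Qdisc *sch)
{
        struct net_device *dev = qdisc_dev(sch);
        unsigned int ntx;

        for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
                struct netdev_queue *txq = netdev_get_tx_queue(dev, ntx);
                struct Qdisc *qdisc, *old;

                /* One trivial per-queue qdisc; the HTB hierarchy itself
                 * is maintained by the NIC, not by these leaves.
                 */
                qdisc = qdisc_create_dflt(txq, &pfifo_qdisc_ops,
                                          TC_H_MAKE(TC_H_MAJ(sch->handle),
                                                    TC_H_MIN(ntx + 1)),
                                          NULL);
                if (!qdisc)
                        continue; /* error handling elided */

                old = dev_graft_qdisc(txq, qdisc);
                if (old)
                        qdisc_put(old);
        }
}

From userspace, this mode is requested with the new flag added in the iproute2 part of the series, along the lines of: tc qdisc replace dev eth0 root handle 1: htb offload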

Then all configuration changes are sent to the driver, and it issues the corresponding firmware commands to replicate the whole hierarchy in the NIC. Leaf classes correspond to queue groups (in this implementation queue groups contain only one queue, but it can be extended), and inner classes correspond to entities called TSARs.
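
The shape of that driver interface is roughly the following (abridged and hedged: the command and field names follow the ndo_setup_tc(TC_SETUP_QDISC_HTB) hook proposed in this series, with several members elided):

enum tc_htb_command {
        TC_HTB_CREATE,           /* root created: allocate the QoS root in the NIC */
        TC_HTB_DESTROY,          /* root destroyed: tear everything down */
        TC_HTB_LEAF_ALLOC_QUEUE, /* new leaf class: allocate a queue (group) */
        TC_HTB_LEAF_TO_INNER,    /* leaf gained a child: convert it to a TSAR */
        TC_HTB_LEAF_DEL,         /* leaf class deleted */
        TC_HTB_NODE_MODIFY,      /* rate/ceil of an existing node changed */
        /* ... */
};

struct tc_htb_qopt_offload {
        enum tc_htb_command command;
        u32 parent_classid;
        u32 classid;
        u64 rate;       /* bytes per second */
        u64 ceil;       /* bytes per second */
        /* ... extack, queue id, etc. elided ... */
};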

The information about rate limits is stored inside the TSARs and queue groups. Queues know which groups they belong to, and groups and TSARs know which TSAR is their parent. The queue is picked in ndo_select_queue by looking at the classification result of clsact. So, when a packet is put onto a queue, the NIC knows that queue's place in the whole hierarchy and can run the HTB algorithm.
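
Here is a hedged sketch of that last step (the example_* names and the htb_maj_id field are illustrative, not the literal mlx5 code): a clsact egress filter stores the HTB leaf classid in skb->priority, and ndo_select_queue maps that classid to the TX queue backing the leaf.

static u16 example_select_queue(struct net_device *dev, struct sk_buff *skb,
                                struct net_device *sb_dev)
{
        struct example_priv *priv = netdev_priv(dev);

        /* If the major number matches the offloaded HTB root, the minor
         * number is the leaf classid set by the clsact filter.
         */
        if (priv->htb_maj_id &&
            TC_H_MAJ(skb->priority) == TC_H_MAJ(priv->htb_maj_id))
                return example_htb_classid_to_qid(priv, TC_H_MIN(skb->priority));

        return netdev_pick_tx(dev, skb, sb_dev); /* default selection */
}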

where I assume the hardware will do rate limiting on each queue,

So, it's not flat in the NIC, and rate limiting is done in a hierarchical way.

if the hardware also has a global control, why is it not reflected on the root qdisc?

I'm not sure I got this last question right. The root qdisc is HTB, and all the configuration of the HTB tree gets reflected in the NIC, as I just explained. I hope it's clearer now, but if you still have questions, I'm glad to explain more details. (Also, I'm ready to respin with the minor fixes for the CI build issue on parisc.)

Thanks,
Max


