Hi guys,
Here is a list of incremental features and improvements we are considering
prototyping and adding to the DPDK hierarchical scheduler SW library. This list is
driven by our findings as well as feedback from various users. Please take a
look and feel free to add more features to this list or comment on the features
below. Of course, all these items are subject to preserving the functional
correctness, existing accuracy and performance of the current implementation.
1. Pipe level: Increase number of traffic classes (TCs). Allow a more flexible
mapping of the pipe queues to traffic classes. Do not allocate memory for
queues that are not needed.
a) Currently, each pipe has 16 queues that are hardwired into 4 TCs scheduled
with strict priority (SP), and each TC has exactly 4 queues that are
scheduled with Weighted Fair Queuing (WFQ). Specifically, TC0 = [Queue 0 ..
Queue 3], TC1 = [Queue 4 .. Queue 7], TC2 = [Queue 8 .. Queue 11], TC3 = [Queue
12 .. Queue 15].
b) The plan is to support up to 16 TCs. All the high priority TCs (TC1, TC2,
...) will have exactly 1 queue, while the lowest priority TC, called Best
Effort (BE), has 1, 4 or 8 queues. This is justified by the fact that typically
all the high priority TCs are fully provisioned (small to medium traffic
rates), while most of the traffic fits into the BE class, which is usually
greatly oversubscribed.
c) This leads to the following valid options for mapping pipe queues to TCs:
i. BE class has 1 queue => Max number of TCs is 16
ii. BE class has 4 queues => Max number of TCs is 13
iii. BE class has 8 queues => Max number of TCs is 9
d) In order to keep implementation complexity under control, it is required
that all pipes from the same subport share the same mapping of pipe queues to
TCs.
e) Currently, all the 16 pipe queues have to be configured (and memory
allocated for them internally), even if not all of them are needed. Going
forward, it shall be allowed to use fewer than 16 queues per pipe when not all
the 16 queues are needed, and no memory shall be allocated for the queues that
are not needed.
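To make the arithmetic behind options c) i-iii and item e) concrete, here is a
minimal sketch in C. It is only an illustration of the counting rule (one queue
per high priority TC, plus 1, 4 or 8 queues for BE); the macro and function
names are hypothetical and are not part of the existing librte_sched API.

```c
#define PIPE_QUEUES_MAX 16 /* queues per pipe, as stated in the proposal */

/* Hypothetical helper: maximum number of TCs for a given number of
 * Best Effort (BE) queues. Returns -1 if the BE queue count is not one
 * of the values allowed by the proposal (1, 4 or 8). */
static int max_tcs_for_be_queues(int n_be_queues)
{
    if (n_be_queues != 1 && n_be_queues != 4 && n_be_queues != 8)
        return -1;

    /* Every high priority TC takes exactly one queue; the BE TC takes
     * n_be_queues queues and counts as one more TC. */
    return (PIPE_QUEUES_MAX - n_be_queues) + 1;
}

/* Hypothetical helper: number of pipe queues that actually need memory
 * when only n_tcs traffic classes are used (BE included).
 * Returns -1 for an invalid combination. */
static int queues_needed(int n_tcs, int n_be_queues)
{
    int max_tcs = max_tcs_for_be_queues(n_be_queues);

    if (max_tcs < 0 || n_tcs < 1 || n_tcs > max_tcs)
        return -1;

    return (n_tcs - 1) + n_be_queues;
}
```

For example, max_tcs_for_be_queues(4) evaluates to 13, matching option ii, and
a pipe that uses 4 TCs with a 4-queue BE class would need only 7 of the 16
queues under this counting rule.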
2. Subport level: Allow different subports of the same port to have different
configuration in terms of number of pipes, pipe queue sizes, pipe queue mapping
to traffic classes, etc.
a) In order to keep the implementation complexity under control, it is required
that all pipes within the same subport share the same configuration for these
parameters.
b) Internal implications: each subport will likely need to have its own
bitmap data structure.
3. Redistribution of unused pipe BW to other pipes within the same subport:
Enable the existing oversubscription mechanism by default.
a) Currently, this mechanism needs to be explicitly enabled at build time.
b) This change is conditional on the performance impact not being significant.
4. Pipe TC level: Improve shaper accuracy.
a) The current pipe TC rate limiting mechanism is not robust and it can result
in deadlock for certain configurations. Currently, the pipe TC credits are
periodically cleared and re-initialized to a fixed value (period is
configurable), which can result in deadlock if the number of pipe TC credits is
smaller than the MTU.
b) The plan is to move the pipe TC rate limiting from the scheduler dequeue
operation (shaping) to the scheduler enqueue operation (metering), by using one
token bucket per pipe TC. Basically, packets that exceed the pipe TC rate will
be detected and dropped earlier rather than later, which should be beneficial
from the perspective of not spending cycles on packets that are going to be
dropped later anyway.
c) Internal implications: The number of token buckets is multiplied by 16. Need
to improve the token bucket performance (e.g. by using branchless code) in
order to get back some of the performance.
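As a rough illustration of item 4 b) and c), here is a minimal sketch of a
per pipe TC token bucket used for metering on the enqueue side, written without
explicit branches in the update path. This is not the existing librte_sched
code: the struct, its fields and the function name are hypothetical, time is
assumed to be a monotonically increasing tick counter, and the refill product
is assumed not to overflow 64 bits. Whether the compiler actually emits
branch-free code depends on the target and the optimization level.

```c
#include <stdint.h>

/* Hypothetical token bucket, one instance per pipe TC. */
struct pipe_tc_tb {
    uint64_t credits_per_tick; /* refill rate, in bytes per tick */
    uint64_t size;             /* bucket depth, in bytes */
    uint64_t credits;          /* credits currently available, in bytes */
    uint64_t time;             /* tick of the last update */
};

/* Returns 1 if the packet conforms (credits are consumed),
 * 0 if it exceeds the pipe TC rate and should be dropped at enqueue. */
static int pipe_tc_tb_meter(struct pipe_tc_tb *tb, uint64_t now,
                            uint32_t pkt_len)
{
    /* Refill according to the time elapsed since the last update. */
    uint64_t credits = tb->credits + (now - tb->time) * tb->credits_per_tick;

    /* Saturate at the bucket size: min(credits, size) using a mask. */
    uint64_t lt = 0 - (uint64_t)(credits < tb->size); /* all ones if less */
    credits = tb->size ^ ((credits ^ tb->size) & lt);

    /* Consume pkt_len credits only when enough are available. */
    uint64_t ok = (uint64_t)(credits >= pkt_len); /* 0 or 1 */
    tb->credits = credits - ((uint64_t)pkt_len & (0 - ok));
    tb->time = now;

    return (int)ok;
}
```

With this kind of scheme, a bucket size smaller than the largest expected
packet would cause such packets to always be dropped, so the configuration
would still need to keep the bucket size at or above the MTU.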
Best regards,
Your faithful DPDK QoS implementers,
Cristian and Jasvinder