On 07/09/2018 05:40 PM, Marcelo Ricardo Leitner wrote:
On Mon, Jul 09, 2018 at 05:03:31PM -0400, Michel Machado wrote:
On 07/09/2018 03:53 PM, Marcelo Ricardo Leitner wrote:
On Mon, Jul 09, 2018 at 02:18:33PM -0400, Michel Machado wrote:
On 07/09/2018 11:44 AM, Marcelo Ricardo Leitner wrote:
On Sat, Jul 07, 2018 at 03:43:55PM +0530, Nishanth Devarajan wrote:
net/sched: add skbprio scheduer

Skbprio (SKB Priority Queue) is a queueing discipline that prioritizes packets
according to their skb->priority field. Under congestion, already-enqueued lower
priority packets will be dropped to make space available for higher priority
packets. Skbprio was conceived as a solution for denial-of-service defenses that
need to route packets with different priorities as a means to overcome DoS
attacks.

Why can't we implement this as a new flag for sch_prio.c?

I don't see why this duplication is needed, especially because it will
only be "slower" (as in, it will do more work) when qdisc is already
full and dropping packets anyway.

     sch_prio.c and skbprio diverge on a number of aspects:

     1. sch_prio.c supports up to 16 priorities whereas skbprio 64. This is
not just a matter of changing a constant since sch_prio.c doesn't use
skb->priority.

Yes it does use skb->priority for classifying into a band:

prio_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr)
{
          struct prio_sched_data *q = qdisc_priv(sch);
          u32 band = skb->priority;
...

    Changing TC_PRIO_MAX from 15 to 63 risks breaking backward compatibility
with applications.

If done, it needs to be done carefully, indeed. I don't know if it's
doable, neither I know how hard is your requirement for 64 different
priorities.

You can get 64 different priorities by stacking sch_prio, btw. And if
you implement drop_from_tail() as part of Qdisc, you can even get it
working for this cascading case too.

A solution would be to add another flag to switch between the current prio_classify() and a new one to just use skb->priority as in skbprio. This way we don't risk breaking applications that rely on tcf_classify() and this odd behavior that I found in prio_classify():

        band = TC_H_MIN(band) - 1;
        if (band >= q->bands)
                return q->queues[q->prio2band[0]];
        return q->queues[band];

When band is zero, it returns q->queues[q->prio2band[0]] instead of q->queues[band] as it would for other bands less than q->bands.

     3. The queues of sch_prio.c are struct Qdisc, which don't have a method
to drop at its tail.

That can be implemented, most likely as prio_tail_drop() as above.

    struct Qdisc represents *all* qdiscs. My knowledge of the other qdiscs is
limited, but not all qdiscs may have a meaningful method to drop at the
tail. For example: a qdisc that works over flows may not know with flow is

True, but it doesn't mean you have to implement it for all available qdiscs.

If it is not implemented for all available qdiscs and the flag to drop at the tail is on, sch_prio.c would need to issue a log message whenever a packet goes into one of the subqueues that don't drop at the tail and have a failsafe behavior.

the tail. Not to mention that this would be a widespread patch to only
support this new prio qdisc. It would be prudent to wait for the production
success of the proposed, self-contained qdisc before making this commitment.

On the other hand, by adding another qdisc you're adding more work
that one needs to do when dealing with qdisc infrastructure, such as
updating enqueue() prototype, for example.

Once this new qdisc is in, it won't be easy to deprecate it.

We need to choose between (1) having skbprio that has some duplicate code with sch_prio.c and (2) adding flags to sch_prio.c and make a major refactoring of the schedule subsystem to add drop an the tail to qdiscs.

I mean major because we are not just talking about adding the method dequeue_tail() to struct Qdisc and adding dequeue_tail() to all qdiscs. One will need to come up with definitions of dequeue_tail() for qdiscs that don't naturally have it and even rewrite the data structures of qdiscs. To substantiate this last point, consider sch_fifo.c, one of the simplest qdiscs available. sch_fifo.c keeps its packets in sch->q, which is of type struct qdisc_skb_head. struct qdisc_skb_head doesn't set skb->prev, so it cannot drop at the tail without walking through its list.

I do understand the motivation for minimizing duplicate code. But the small amount of duplicate code that skbprio adds is cheaper than refactoring the scheduler system to only support this new sch_prio.c.

[ ]'s
Michel Machado

Reply via email to