HTB, CBQ and HFSC pay a very high cost updating the qdisc 'throttled' status that nothing but CBQ seems to use.
CBQ usage is flaky anyway, since no qdisc ->enqueue() updates the 'throttled' qdisc status.

This looks like an 'optimization' that actually costs more than the code without it, and might cause latency issues with CBQ.

In my tests, I measured an 8 % performance increase on a TCP_RR workload through an HTB qdisc in the presence of throttled classes, and 5 % without throttled classes.

Eric Dumazet (4):
  net_sched: sch_plug: use a private throttled status
  net_sched: cbq: remove a flaky use of qdisc_is_throttled()
  net_sched: netem: remove qdisc_is_throttled() use
  net_sched: remove generic throttled management

 include/net/pkt_sched.h   |  4 ++--
 include/net/sch_generic.h | 16 ----------------
 net/sched/sch_api.c       |  7 +------
 net/sched/sch_cbq.c       |  4 +---
 net/sched/sch_fq.c        |  3 +--
 net/sched/sch_hfsc.c      |  1 -
 net/sched/sch_htb.c       |  3 +--
 net/sched/sch_netem.c     |  4 ----
 net/sched/sch_plug.c      | 14 ++++++++------
 net/sched/sch_tbf.c       |  4 +---
 10 files changed, 15 insertions(+), 45 deletions(-)

-- 
2.8.0.rc3.226.g39d4020