Re: [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?)
On Thu, 2016-07-07 at 18:32 +0200, Jiri Kosina wrote: > On Thu, 7 Jul 2016, Eric Dumazet wrote: > > > > @@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, > > > struct sk_buff *skb, > > > { > > > int ret = 0, q_idx = *q_idx_p; > > > struct Qdisc *q; > > > + int b; > > > > > > if (!root) > > > return 0; > > > @@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, > > > struct sk_buff *skb, > > > goto done; > > > q_idx++; > > > } > > > - list_for_each_entry(q, >list, list) { > > > + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) { > > > if (q_idx < s_q_idx) { > > > q_idx++; > > > continue; > > > @@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, > > > struct sk_buff *skb, > > > int *t_p, int s_t) > > > { > > > struct Qdisc *q; > > > + int b; > > > > > > if (!root) > > > return 0; > > > @@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, > > > struct sk_buff *skb, > > > if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0) > > > return -1; > > > > > > - list_for_each_entry(q, >list, list) { > > > + hash_for_each_rcu(qdisc_dev(root)->qdisc_hash, b, q, hash) { > > > if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0) > > > return -1; > > > } > > > > > > Not sure why you used the rcu version here, but the non rcu version in > > tc_dump_qdisc_root() > > Good catch. > > Actually even the current code is odd in this regard -- > qdisc_match_from_root() uses RCU iterator, Because it can be run from qdisc enqueue() dequeue(), not holding RTNL. > while tc_dump_*() use the > non-RCU one; addition and deletion is performed using RCU primitives. It really is protected by RTNL (qdiscs can not change during the dump) > > I haven't got my head around this yet; if it's correct at all, it'd at > least deserve a comment somewhere. > > I'll respin v2 of the patch (there is also a conflict on HASH_SIZE > definition in ip6_tunnel.c, ip6_gre.c and sit.c due to hashtable.h include > in netdevice.h that needs to be resolved as well) that'd make RCU usage > consistent. > > Any other objections/comments? I was namely curious about any opinions > regarding the hashtable size. Well, this is the tricky part, but rhashtable would mean way more changes...
Re: [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?)
On Thu, 7 Jul 2016, Eric Dumazet wrote: > > @@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, > > struct sk_buff *skb, > > { > > int ret = 0, q_idx = *q_idx_p; > > struct Qdisc *q; > > + int b; > > > > if (!root) > > return 0; > > @@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, > > struct sk_buff *skb, > > goto done; > > q_idx++; > > } > > - list_for_each_entry(q, >list, list) { > > + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) { > > if (q_idx < s_q_idx) { > > q_idx++; > > continue; > > @@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, > > struct sk_buff *skb, > >int *t_p, int s_t) > > { > > struct Qdisc *q; > > + int b; > > > > if (!root) > > return 0; > > @@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, > > struct sk_buff *skb, > > if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0) > > return -1; > > > > - list_for_each_entry(q, >list, list) { > > + hash_for_each_rcu(qdisc_dev(root)->qdisc_hash, b, q, hash) { > > if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0) > > return -1; > > } > > > Not sure why you used the rcu version here, but the non rcu version in > tc_dump_qdisc_root() Good catch. Actually even the current code is odd in this regard -- qdisc_match_from_root() uses RCU iterator, while tc_dump_*() use the non-RCU one; addition and deletion is performed using RCU primitives. I haven't got my head around this yet; if it's correct at all, it'd at least deserve a comment somewhere. I'll respin v2 of the patch (there is also a conflict on HASH_SIZE definition in ip6_tunnel.c, ip6_gre.c and sit.c due to hashtable.h include in netdevice.h that needs to be resolved as well) that'd make RCU usage consistent. Any other objections/comments? I was namely curious about any opinions regarding the hashtable size. Thanks, -- Jiri Kosina SUSE Labs
Re: [RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?)
On Thu, 2016-07-07 at 11:04 +0200, Jiri Kosina wrote: > > > From: Jiri Kosina> Subject: [PATCH] net: sched: convert qdisc linked list to hashtable > > Convert the per-device linked list into a hashtable. The primary motivation > for this change is that currently, we're not tracking all the qdiscs in > hierarchy (e.g. excluding default qdiscs), as the lookup performed over the > linked list by qdisc_match_from_root() is rather expensive. ... > } > @@ -1440,6 +1441,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, > struct sk_buff *skb, > { > int ret = 0, q_idx = *q_idx_p; > struct Qdisc *q; > + int b; > > if (!root) > return 0; > @@ -1454,7 +1456,7 @@ static int tc_dump_qdisc_root(struct Qdisc *root, > struct sk_buff *skb, > goto done; > q_idx++; > } > - list_for_each_entry(q, >list, list) { > + hash_for_each(qdisc_dev(root)->qdisc_hash, b, q, hash) { > if (q_idx < s_q_idx) { > q_idx++; > continue; > @@ -1771,6 +1773,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, > struct sk_buff *skb, > int *t_p, int s_t) > { > struct Qdisc *q; > + int b; > > if (!root) > return 0; > @@ -1778,7 +1781,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, > struct sk_buff *skb, > if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0) > return -1; > > - list_for_each_entry(q, >list, list) { > + hash_for_each_rcu(qdisc_dev(root)->qdisc_hash, b, q, hash) { > if (tc_dump_tclass_qdisc(q, skb, tcm, cb, t_p, s_t) < 0) > return -1; > } Not sure why you used the rcu version here, but the non rcu version in tc_dump_qdisc_root() Thanks.
[RFC PATCH] net: sched: convert qdisc linked list to hashtable (was Re: Deleting child qdisc doesn't reset parent to default qdisc?)
On Fri, 15 Apr 2016, Eric Dumazet wrote: > Anyway, we probably need to improve our ability to understand qdisc > hierarchies. Having some hidden qdiscs is the real problem here. > > We need to add some hash table so that qdisc_match_from_root() does not > have to scan hundred of qdiscs. So how about something like the patch below? I already have preliminary patches on top which unhide the default qdiscs, but let's make this one step after the other. Thanks. From: Jiri KosinaSubject: [PATCH] net: sched: convert qdisc linked list to hashtable Convert the per-device linked list into a hashtable. The primary motivation for this change is that currently, we're not tracking all the qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup performed over the linked list by qdisc_match_from_root() is rather expensive. The ultimate goal is to get rid of hidden qdiscs completely, which will bring much more determinism in user experience. Signed-off-by: Jiri Kosina --- include/linux/netdevice.h | 2 ++ include/net/pkt_sched.h | 4 ++-- include/net/sch_generic.h | 2 +- net/core/dev.c| 1 + net/sched/sch_api.c | 23 +-- net/sched/sch_generic.c | 6 +++--- net/sched/sch_mq.c| 2 +- net/sched/sch_mqprio.c| 2 +- 8 files changed, 24 insertions(+), 18 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index f45929c..630838e 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -52,6 +52,7 @@ #include #include #include +#include struct netpoll_info; struct device; @@ -1778,6 +1779,7 @@ struct net_device { unsigned intnum_tx_queues; unsigned intreal_num_tx_queues; struct Qdisc*qdisc; + DECLARE_HASHTABLE (qdisc_hash, 16); unsigned long tx_queue_len; spinlock_t tx_global_lock; int watchdog_timeo; diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h index fea53f4..8ba11b4 100644 --- a/include/net/pkt_sched.h +++ b/include/net/pkt_sched.h @@ -90,8 +90,8 @@ int unregister_qdisc(struct Qdisc_ops *qops); void qdisc_get_default(char *id, size_t len); int qdisc_set_default(const char *id); -void qdisc_list_add(struct Qdisc *q); -void qdisc_list_del(struct Qdisc *q); +void qdisc_hash_add(struct Qdisc *q); +void qdisc_hash_del(struct Qdisc *q); struct Qdisc *qdisc_lookup(struct net_device *dev, u32 handle); struct Qdisc *qdisc_lookup_class(struct net_device *dev, u32 handle); struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r, diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 62d5531..26f5cb3 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -67,7 +67,7 @@ struct Qdisc { u32 limit; const struct Qdisc_ops *ops; struct qdisc_size_table __rcu *stab; - struct list_headlist; + struct hlist_node hash; u32 handle; u32 parent; int (*reshape_fail)(struct sk_buff *skb, diff --git a/net/core/dev.c b/net/core/dev.c index 904ff43..edc1617 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -7511,6 +7511,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name, INIT_LIST_HEAD(>all_adj_list.lower); INIT_LIST_HEAD(>ptype_all); INIT_LIST_HEAD(>ptype_specific); + hash_init(dev->qdisc_hash); dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_XMIT_DST_RELEASE_PERM; setup(dev); diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index ddf047d..82953cb 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include @@ -265,33 +266,33 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle) root->handle == handle) return root; - list_for_each_entry_rcu(q, >list, list) { + hash_for_each_possible_rcu(qdisc_dev(root)->qdisc_hash, q, hash, handle) { if (q->handle == handle) return q; } return NULL; } -void qdisc_list_add(struct Qdisc *q) +void qdisc_hash_add(struct Qdisc *q) { if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) { struct Qdisc *root = qdisc_dev(q)->qdisc; WARN_ON_ONCE(root == _qdisc); ASSERT_RTNL(); - list_add_tail_rcu(>list, >list); + hash_add_rcu(qdisc_dev(q)->qdisc_hash, >hash, q->handle); } } -EXPORT_SYMBOL(qdisc_list_add); +EXPORT_SYMBOL(qdisc_hash_add); -void qdisc_list_del(struct Qdisc *q) +void qdisc_hash_del(struct Qdisc *q) { if ((q->parent != TC_H_ROOT) && !(q->flags & TCQ_F_INGRESS)) {
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Tue, 28 Jun 2016, Cong Wang wrote: > > BTW, I've started to actually work on fixing this, and I've noticed that > > TBF behavior actually violates what's stated in pfifo_fast manpage: > > > > == > > Whenever an interface is created, the pfifo_fast qdisc is > > automatically used as a queue. If another qdisc is > > attached, it preempts the default pfifo_fast, which automatically > > returns to function when an existing qdisc is detached. > > > > In this sense this qdisc is magic, and unlike other qdiscs. > > == > > It is out of date, now default qdisc can be set to any other qdisc > via /proc. Also, probably due to historical reasons, we don't have > a unified default default qdisc, some uses bfifo, some uses pfifo, > we may break some existing script if we change that. While I do understand that reasoning, I'd argue that unpredictable and unexpected behavior of TBF causing systems with non-working networking is much more likely than any userspace having hard dependency on the fact that default (*) qdisc for TBF is noop. (*) where 'default upon creation' != 'default when reset' Thanks, -- Jiri Kosina SUSE Labs
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Tue, Jun 28, 2016 at 8:19 AM, Jiri Kosinawrote: > BTW, I've started to actually work on fixing this, and I've noticed that > TBF behavior actually violates what's stated in pfifo_fast manpage: > > == > Whenever an interface is created, the pfifo_fast qdisc is > automatically used as a queue. If another qdisc is > attached, it preempts the default pfifo_fast, which automatically > returns to function when an existing qdisc is detached. > > In this sense this qdisc is magic, and unlike other qdiscs. > == It is out of date, now default qdisc can be set to any other qdisc via /proc. Also, probably due to historical reasons, we don't have a unified default default qdisc, some uses bfifo, some uses pfifo, we may break some existing script if we change that.
[RFC PATCH] sch_tbf: avoid silent fallback to noop_qdisc (was Re: Deleting child qdisc doesn't reset parent to default qdisc?)
On Fri, 15 Apr 2016, Eric Dumazet wrote: > Then you need to save the initial qdisc (bfifo for TBF) in a special > place, to make sure the delete operation is guaranteed to succeed. > > Or fail the delete if the bfifo can not be allocated. > > I can tell that determinism if far more interesting than usability for > some users occasionally playing with tc. > > Surely the silent fallback to noop_qdisc is wrong. So before we go further and fix the fact that we actually do have hidden qdiscs (by refactoring qdisc_match_from_root() and friends), I'd still like to bring the patch below up for consideration. Thanks. From: Jiri KosinaSubject: [PATCH] sch_tbf: avoid silent fallback to noop_qdisc TBF started its life as a classless qdisc with a single builtin FIFO queue which was being shaped. When it got later turned into classful qdisc, it was written in a way that the fallback qdisc was noop_qdisc, which produces bad user experience (delete of last manually added class doesn't reset it to initial default, but renders networking unusable instead). Switch the default fallback to bfifo; this also mimics how the other guys (HTB, HFSC, CBQ, ...) are behaving. Signed-off-by: Jiri Kosina --- net/sched/sch_tbf.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c index 3161e49..b06dffe 100644 --- a/net/sched/sch_tbf.c +++ b/net/sched/sch_tbf.c @@ -508,8 +508,12 @@ static int tbf_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, { struct tbf_sched_data *q = qdisc_priv(sch); - if (new == NULL) - new = _qdisc; + if (new == NULL) { + /* reset to default qdisc */ + new = qdisc_create_dflt(sch->dev_queue, _qdisc_ops, sch->parent); + if (!new) + return -ENOBUFS; + } *old = qdisc_replace(sch, new, >qdisc); return 0; -- Jiri Kosina SUSE Labs
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Fri, 15 Apr 2016, Eric Dumazet wrote: > > TBF is probably a bad example because it started life as a classless > > qdisc. There was only one built-in fifo queue that was shaped. Then > > someone made it classful and changed this behavior. To me it sounds > > reasonable to have the default behavior restored. At minimal > > consistency. > > > Then you need to save the initial qdisc (bfifo for TBF) in a special > place, to make sure the delete operation is guaranteed to succeed. > > Or fail the delete if the bfifo can not be allocated. > > I can tell that determinism if far more interesting than usability for > some users occasionally playing with tc. BTW, I've started to actually work on fixing this, and I've noticed that TBF behavior actually violates what's stated in pfifo_fast manpage: == Whenever an interface is created, the pfifo_fast qdisc is automatically used as a queue. If another qdisc is attached, it preempts the default pfifo_fast, which automatically returns to function when an existing qdisc is detached. In this sense this qdisc is magic, and unlike other qdiscs. == -- Jiri Kosina SUSE Labs
Re: Deleting child qdisc doesn't reset parent to default qdisc?
From: Eric DumazetDate: Fri, 15 Apr 2016 07:58:48 -0700 > Having some hidden qdiscs is the real problem here. +1
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Fri, 2016-04-15 at 08:42 -0400, Jamal Hadi Salim wrote: > On 16-04-14 01:49 PM, Eric Dumazet wrote: > > > And what would be the chosen behavior ? > > > > TBF is probably a bad example because it started life as > a classless qdisc. There was only one built-in fifo queue > that was shaped. Then someone made it classful and changed > this behavior. To me it sounds reasonable to have the > default behavior restored. At minimal consistency. Then you need to save the initial qdisc (bfifo for TBF) in a special place, to make sure the delete operation is guaranteed to succeed. Or fail the delete if the bfifo can not be allocated. I can tell that determinism if far more interesting than usability for some users occasionally playing with tc. Surely the silent fallback to noop_qdisc is wrong. Anyway, we probably need to improve our ability to understand qdisc hierarchies. Having some hidden qdiscs is the real problem here. We need to add some hash table so that qdisc_match_from_root() does not have to scan hundred of qdiscs.
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On 16-04-14 01:49 PM, Eric Dumazet wrote: And what would be the chosen behavior ? TBF is probably a bad example because it started life as a classless qdisc. There was only one built-in fifo queue that was shaped. Then someone made it classful and changed this behavior. To me it sounds reasonable to have the default behavior restored. At minimal consistency. Relying on TBF installing a bfifo for you at delete would be hazardous. For example CBQ got it differently than HFSC If qdisc_create_dflt() fails in CBQ, we fail the 'delete', while HFSC falls back to noop_qdisc, without warning the user :( At least always using noop_qdisc is consistent. No magic there. Doing 'unification' right now would break existing scripts. This is too late, I am afraid. Sigh. So rant: IMO, we should let any new APIs and API updates stay longer in discussion. Or better mark them as unstable for sometime. The excuse that "it is out in the wild therefore cant be changed" is harmful because the timeline is "forever" whereas patches are applied after a short period of posting and discussions and sometimes not involving the right people. It is like having a jury issuing a death sentence after 1 week of deliberation. You cant take it back after execution. cheers, jamal
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Thu, 2016-04-14 at 18:08 +0200, Jiri Kosina wrote: > On Thu, 14 Apr 2016, Phil Sutter wrote: > > > > > I've came across the behavior where adding a child qdisc and then > > > > deleting > > > > it again makes the networking dysfunctional (I guess that's because all > > > > of > > > > a sudden there is absolutely no working qdisc on the device, although > > > > there originally was a default one in the parent). > > > > > > > > In a nutshell, is this expected behavior or bug? > > > > > > This is the expected behavior. > > > > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default > > one upon deletion instead of noop_qdisc, hence I would describe > > the situation using the words 'inconsistent' and 'accident' rather than > > 'expected'. :) > > Would a patch that'd unify this in a sense that all qdiscs would assign > the default one upon deletion acceptable? > And what would be the chosen behavior ? Relying on TBF installing a bfifo for you at delete would be hazardous. For example CBQ got it differently than HFSC If qdisc_create_dflt() fails in CBQ, we fail the 'delete', while HFSC falls back to noop_qdisc, without warning the user :( At least always using noop_qdisc is consistent. No magic there. Doing 'unification' right now would break existing scripts. This is too late, I am afraid.
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Thu, 2016-04-14 at 18:22 +0200, Phil Sutter wrote: > And those being invisible can be overridden using 'tc qd add', right? > AFAIR they're not listed because they don't properly register, so the > system doesn't care to override them. In this case we could change all > classful qdiscs to restore the default qdisc if a leaf qdisc is being > deleted instead of noop (which is probably not what the user wants > anyway). Even if they properly register, they are not visible. Take a look at https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=95dc19299f741c986227ec33e23cbf9b3321f812 for some context. When a default pfifo is created on say a HTB class, you do not see it by default in a dump. If you have 100 HTB classes, HTB created 100 pfifo just fine, and it works, unless an admin tries to delete them maybe ;)
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Thu, Apr 14, 2016 at 08:44:40AM -0700, Eric Dumazet wrote: > On Thu, 2016-04-14 at 17:34 +0200, Jiri Kosina wrote: > > On Thu, 14 Apr 2016, Phil Sutter wrote: > > > > > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default > > > one upon deletion instead of noop_qdisc, hence I would describe > > > the situation using the words 'inconsistent' and 'accident' rather than > > > 'expected'. :) > > > > Exactly. I'd again like to stress the fact that this configuration works: > > > > jikos:~ # tc qdisc show > > qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat > > 1.0ms > > > > and this (after performing add/delete operation) doesn't: > > > > jikos:~ # tc qdisc show > > qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat > > 1.0ms > > > > It's hard to spot a difference (hint: there is none). > > This is because some qdisc are not visible in the dump. And those being invisible can be overridden using 'tc qd add', right? AFAIR they're not listed because they don't properly register, so the system doesn't care to override them. In this case we could change all classful qdiscs to restore the default qdisc if a leaf qdisc is being deleted instead of noop (which is probably not what the user wants anyway). Cheers, Phil
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Thu, 14 Apr 2016, Phil Sutter wrote: > > > I've came across the behavior where adding a child qdisc and then > > > deleting > > > it again makes the networking dysfunctional (I guess that's because all > > > of > > > a sudden there is absolutely no working qdisc on the device, although > > > there originally was a default one in the parent). > > > > > > In a nutshell, is this expected behavior or bug? > > > > This is the expected behavior. > > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default > one upon deletion instead of noop_qdisc, hence I would describe > the situation using the words 'inconsistent' and 'accident' rather than > 'expected'. :) Would a patch that'd unify this in a sense that all qdiscs would assign the default one upon deletion acceptable? Thanks, -- Jiri Kosina SUSE Labs
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Thu, 2016-04-14 at 17:34 +0200, Jiri Kosina wrote: > On Thu, 14 Apr 2016, Phil Sutter wrote: > > > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default > > one upon deletion instead of noop_qdisc, hence I would describe > > the situation using the words 'inconsistent' and 'accident' rather than > > 'expected'. :) > > Exactly. I'd again like to stress the fact that this configuration works: > > jikos:~ # tc qdisc show > qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat > 1.0ms > > and this (after performing add/delete operation) doesn't: > > jikos:~ # tc qdisc show > qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat > 1.0ms > > It's hard to spot a difference (hint: there is none). This is because some qdisc are not visible in the dump. qdisc_list_add() uses a single list, so adding too much stuff in it could slow down fast path (qdisc_lookup(), called from qdisc_tree_reduce_backlog())
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Thu, 14 Apr 2016, Phil Sutter wrote: > OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default > one upon deletion instead of noop_qdisc, hence I would describe > the situation using the words 'inconsistent' and 'accident' rather than > 'expected'. :) Exactly. I'd again like to stress the fact that this configuration works: jikos:~ # tc qdisc show qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms and this (after performing add/delete operation) doesn't: jikos:~ # tc qdisc show qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms It's hard to spot a difference (hint: there is none). Thanks, -- Jiri Kosina SUSE Labs
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Thu, Apr 14, 2016 at 08:01:39AM -0700, Eric Dumazet wrote: > On Thu, 2016-04-14 at 16:44 +0200, Jiri Kosina wrote: > > Hi, > > > > I've came across the behavior where adding a child qdisc and then deleting > > it again makes the networking dysfunctional (I guess that's because all of > > a sudden there is absolutely no working qdisc on the device, although > > there originally was a default one in the parent). > > > > In a nutshell, is this expected behavior or bug? > > This is the expected behavior. OTOH some qdiscs (CBQ, DRR, DSMARK, HFSC, HTB, QFQ) assign the default one upon deletion instead of noop_qdisc, hence I would describe the situation using the words 'inconsistent' and 'accident' rather than 'expected'. :) Anyhow, the problem with skilled admins is they accept quirks too easily and just build their scripts around them - the same scripts we have to keep compatible to then. Cheers, Phil
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Thu, 2016-04-14 at 16:44 +0200, Jiri Kosina wrote: > Hi, > > I've came across the behavior where adding a child qdisc and then deleting > it again makes the networking dysfunctional (I guess that's because all of > a sudden there is absolutely no working qdisc on the device, although > there originally was a default one in the parent). > > In a nutshell, is this expected behavior or bug? This is the expected behavior. If the kernel was suddenly doing a 'replace' when you ask a delete, then the scripts doing a delete , than a add would break. tc users are skilled admins ;)
Re: Deleting child qdisc doesn't reset parent to default qdisc?
On Thu, 14 Apr 2016, Jiri Kosina wrote: > In a nutshell, is this expected behavior or bug? Just to clarify what seems to suggest to me that this is rather a bug that needs to be fixed (but apparently one that has been there for quite a long time) can be demonstrated by this: > > = > jikos:~ # tc qdisc show > qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms The above configuration works. > jikos:~ # ping -c 1 nix.cz | head -2 > PING nix.cz (195.47.235.3) 56(84) bytes of data. > 64 bytes from info.nix.cz (195.47.235.3): icmp_seq=1 ttl=89 time=1.59 ms > > jikos:~ # tc qdisc add dev eth0 parent 10:1 sfq > jikos:~ # tc qdisc show > qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms > qdisc sfq 8008: dev eth0 parent 10:1 limit 127p quantum 1514b depth 127 > divisor 1024 > > jikos:~ # ping -c 1 nix.cz | head -2 > PING nix.cz (195.47.235.3) 56(84) bytes of data. > 64 bytes from info.nix.cz (195.47.235.3): icmp_seq=1 ttl=89 time=1.67 ms > > jikos:~ # tc qdisc del dev eth0 parent 10:1 sfq > jikos:~ # tc qdisc show > qdisc tbf 10: dev eth0 root refcnt 2 rate 800Mbit burst 131000b lat 1.0ms The above configuration doesn't although it's identical to the working one at the beginning. > jikos:~ # ping -c 1 nix.cz | head -2 > PING nix.cz (195.47.235.3) 56(84) bytes of data. > [ ... nothing happens ... ] > ^C -- Jiri Kosina SUSE Labs