RE: [PATCH] NET: [UPDATED] Multiqueue network device support implementation.

Waskiewicz Jr, Peter P Tue, 10 Apr 2007 09:29:13 -0700

> Peter P Waskiewicz Jr wrote:
> > +   /* To retrieve statistics per subqueue - FOR FUTURE USE */
> > +   struct net_device_stats* (*get_subqueue_stats)(struct 
> net_device *dev,
> > +                                                   int 
> queue_index);
> 
> 
> Please no future use stuff, just add it when you need it.


Gotcha.  I'll remove this.

> 
> > diff --git a/net/core/dev.c b/net/core/dev.c index 219a57f..c11c8fa 
> > 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3326,12 +3328,23 @@ struct net_device *alloc_netdev(int 
> sizeof_priv, const char *name,
> >     if (sizeof_priv)
> >             dev->priv = netdev_priv(dev);
> >  
> > +   alloc_size = (sizeof(struct net_device_subqueue) * queue_count);
> > + 
> > +   p = kzalloc(alloc_size, GFP_KERNEL);
> > +   if (!p) {
> > +           printk(KERN_ERR "alloc_netdev: Unable to 
> allocate queues.\n");
> > +           return NULL;
> 
> 
> This leaks the device. You treat every single-queue device as 
> having a single subqueue. If it doesn't get too ugly it would 
> be nice to avoid this and only allocate the subqueue states 
> for real multiqueue devices.

We went back and forth on this.  The reason we allocate a queue in every
case, even on single-queue devices, was to make the stack not have
branching for multiqueue and non-multiqueue devices.  If we don't have
at least one queue on a device, then we can't have
netif_subqueue_stopped() in the hotpath unless we check if a device is
multiqueue before.  The original patches I released had this branching,
and I was asked to not do that.  I'd also like to see all queue-related
stuff be pulled from net_device and put into net_device_subqueue at some
point, even for single-queue devices.  Thoughts?

> 
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -133,7 +133,8 @@ static inline int qdisc_restart(struct 
> net_device *dev)
> >                     /* And release queue */
> >                     spin_unlock(&dev->queue_lock);
> >  
> > -                   if (!netif_queue_stopped(dev)) {
> > +                   if (!netif_queue_stopped(dev) &&
> > +                       !netif_subqueue_stopped(dev, 
> skb->queue_mapping)) {
> >                             int ret;
> >  
> >                             ret = dev_hard_start_xmit(skb, 
> dev); @@ -149,7 +150,6 @@ static 
> > inline int qdisc_restart(struct net_device *dev)
> >                                     goto collision;
> >                             }
> >                     }
> > -
> 
> 
> Unrelated whitespace change.

I'll fix that.

> 
> >                     /* NETDEV_TX_BUSY - we need to requeue */
> >                     /* Release the driver */
> >                     if (!nolock) {
> > diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c index 
> > 5cfe60b..7365621 100644
> > --- a/net/sched/sch_prio.c
> > +++ b/net/sched/sch_prio.c
> > @@ -43,6 +43,7 @@ struct prio_sched_data
> >     struct tcf_proto *filter_list;
> >     u8  prio2band[TC_PRIO_MAX+1];
> >     struct Qdisc *queues[TCQ_PRIO_BANDS];
> > +   u16 band2queue[TC_PRIO_MAX + 1];
> >  };
> >  
> >  
> > @@ -63,20 +64,26 @@ prio_classify(struct sk_buff *skb, 
> struct Qdisc *sch, int *qerr)
> >             case TC_ACT_SHOT:
> >                     return NULL;
> >             };
> > -
> 
> Same here

I'll fix that too.

> 
> >             if (!q->filter_list ) {
> >  #else
> >             if (!q->filter_list || tc_classify(skb, 
> q->filter_list, &res)) {  
> > #endif
> >                     if (TC_H_MAJ(band))
> >                             band = 0;
> > +                   skb->queue_mapping =
> > +                             
> q->prio2band[q->band2queue[band&TC_PRIO_MAX]];
> > +
> 
> 
> Does this needs to be cleared at some point again? TC actions 
> might redirect or mirror packets to other (multiqueue) devices.

If an skb is redirected to another device, the skb should be filtered
through that device's qdisc, yes?

> 
> > @@ -242,6 +259,30 @@ static int prio_tune(struct Qdisc 
> *sch, struct rtattr *opt)
> >                     }
> >             }
> >     }
> > +   /* setup queue to band mapping */
> > +   if (q->bands < sch->dev->egress_subqueue_count) {
> > +           qmapoffset = 1;
> > +           mod = 0;
> > +   } else {
> > +           mod = q->bands % sch->dev->egress_subqueue_count;
> > +           qmapoffset = q->bands / 
> sch->dev->egress_subqueue_count +
> > +                           ((mod) ? 1 : 0);
> > +   }
> > +
> > +   queue = 0;
> > +   offset = 0;
> > +   for (i = 0; i < q->bands; i++) {
> > +           q->band2queue[i] = queue;
> > +           if ( ((i + 1) - offset) == qmapoffset) {
> > +                   queue++;
> > +                   offset += qmapoffset;
> > +                   if (mod)
> > +                           mod--;
> > +                   qmapoffset = q->bands /
> > +                           sch->dev->egress_subqueue_count +
> > +                           ((mod) ? 1 : 0);
> > +           }
> > +   }
> 
> 
> Besides being quite ugly, I don't think this does what you want.
> For bands < queues we get band2queue[0] = 0, all others map to 1.
> 

I see what's wrong with this.  If I removed "mod = 0;", the algorithm
works as intended.  We also did go through a number of iterations of the
best way to map queues to bands, using a simple hash, a complex
assigment, etc., and this algorithm gives the best distribution given
any combination of queues / bands (minus that small bug).  I'll fix this
and resubmit.

Thanks for the feedback,

-PJ Waskiewicz
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: [PATCH] NET: [UPDATED] Multiqueue network device support implementation.

Reply via email to