On Fri, May 27, 2016 at 10:56:44AM -0700, Cong Wang wrote:

> > Commit 412ca1550cbecb2c ("macvlan: Move broadcasts into a work queue")
> > moved processing of all macvlan multicasts into a work queue.  This
> > causes a noticeable performance regression when there is heavy multicast
> > traffic on the underlying interface for multicast groups that the
> > macvlan subinterfaces are not members of, in which case we end up
> > cloning all those packets and then freeing them again from a work queue
> > without really doing any useful work with them in between.
> 
> But we only queue up to 1000 packets in our backlog.
> 
> How about adding a quick check before cloning it?
> 
> diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
> index cb01023..1c73d0f 100644
> --- a/drivers/net/macvlan.c
> +++ b/drivers/net/macvlan.c
> @@ -315,6 +315,9 @@ static void macvlan_broadcast_enqueue(struct macvlan_port *port,
>         struct sk_buff *nskb;
>         int err = -ENOMEM;
> 
> +       if (skb_queue_len(&port->bc_queue) >= MACVLAN_BC_QUEUE_LEN)
> +               return;
> +
>         nskb = skb_clone(skb, GFP_ATOMIC);
>         if (!nskb)
>                 goto err;

We're not hitting the bc_queue skb limit in our environment, as the
machine can keep up with the traffic -- it's just that taking an
extra clone of the skb and queueing and running the work queue item
to free it again is eating up a lot of cycles.

But doing the queue length check before the clone might not be a bad
idea?  (You'd probably want to atomic_long_inc(&skb->dev->rx_dropped)
before returning, though?)
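
To make the ordering concrete, here is a rough user-space model of
"check the queue length, count the drop, and only then clone".  It is
a sketch, not kernel code: struct model_port, model_broadcast_enqueue()
and MODEL_BC_QUEUE_LEN are hypothetical stand-ins, and the limit is just
the 1000-packet figure mentioned above.

```c
/*
 * User-space model only; not kernel code.  All names below are
 * hypothetical stand-ins for the real macvlan structures.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BC_QUEUE_LEN 1000	/* assumed: the 1000-packet backlog limit */

struct model_port {
	size_t queue_len;		/* models skb_queue_len(&port->bc_queue) */
	unsigned long rx_dropped;	/* models skb->dev->rx_dropped */
	unsigned long clones;		/* number of clones actually taken */
};

/* Returns true if the packet was queued, false if it was dropped. */
bool model_broadcast_enqueue(struct model_port *port)
{
	assert(port != NULL);

	/* Quick check first: a full queue costs a counter bump, not a clone. */
	if (port->queue_len >= MODEL_BC_QUEUE_LEN) {
		port->rx_dropped++;
		return false;
	}

	port->clones++;			/* models skb_clone(skb, GFP_ATOMIC) */
	port->queue_len++;		/* models queueing the clone */
	return true;
}
```

Note that this only avoids the clone once the queue is already full, so
it would not help in our case, where the queue never fills up and every
packet still gets cloned.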
