On Mon, 2017-08-14 at 18:19 +0200, Jesper Dangaard Brouer wrote:
> The output (extracted below) didn't show who called 'do_raw_spin_lock',
> BUT it showed another interesting thing.  The kernel code
> __dev_queue_xmit() might create a route dst-cache problem for itself(?),
> as it will first call skb_dst_force() and then skb_dst_drop() when the
> packet is transmitted on a VLAN.
> 
>  static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
>  {
>  [...]
>       /* If device/qdisc don't need skb->dst, release it right now while
>        * its hot in this cpu cache.
>        */
>       if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
>               skb_dst_drop(skb);
>       else
>               skb_dst_force(skb);

I think that the high impact of the above code in this specific test is
mostly due to the following:

- ingress packets with different RSS rx hashes land on different CPUs
- but they use the same dst entry, since the destination IPs belong to
the same subnet
- the dst refcnt cacheline is contended between all the CPUs
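
The three points above can be illustrated with a toy userspace model. This is
only a sketch and not kernel code: `struct toy_dst`, `toy_rss_cpu()`,
`toy_forward()` and `toy_count_writers()` are hypothetical names, and the
modulo "hash" merely stands in for real RSS. It counts how many distinct CPUs
end up writing the refcount of a single shared dst entry.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4

/* Toy stand-in for a route cache entry; NOT the kernel's struct dst_entry. */
struct toy_dst {
	uint32_t subnet;	/* destination prefix shared by all flows */
	int refcnt;		/* single counter that every CPU has to write */
	unsigned int cpu_mask;	/* bit n is set once CPU n has written refcnt */
};

/* Hypothetical RSS stand-in: spread flows over CPUs by source port. */
static int toy_rss_cpu(uint16_t sport)
{
	return sport % NR_CPUS;
}

/* Take and release one reference on behalf of 'cpu', roughly what happens
 * when the dst has to stay refcounted across the transmit path.
 */
static void toy_forward(struct toy_dst *dst, int cpu)
{
	dst->refcnt++;
	dst->cpu_mask |= 1u << cpu;
	dst->refcnt--;
}

/* Forward 'nflows' flows towards the same subnet and return how many
 * distinct CPUs wrote the shared refcnt.
 */
static int toy_count_writers(struct toy_dst *dst, int nflows)
{
	int i, cpu, writers = 0;

	for (i = 0; i < nflows; i++)
		toy_forward(dst, toy_rss_cpu((uint16_t)(1024 + i)));

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (dst->cpu_mask & (1u << cpu))
			writers++;

	return writers;
}
```

In this model every CPU becomes a writer of the same counter as soon as the
flows are spread out, which is the pattern that makes the cacheline bounce on
real hardware.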

Perhaps we can improve the situation by setting the IFF_XMIT_DST_RELEASE
flag for the vlan device if the underlying device does not have a
(relevant) classifier attached (and clearing it as needed)?
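
A minimal userspace sketch of that idea follows, under loud assumptions:
`struct toy_netdev`, `TOY_XMIT_DST_RELEASE`, the `classifier_needs_dst` field
and both helpers are hypothetical and only model the decision. The real flag is
defined in include/linux/netdevice.h, and how to detect a "relevant" classifier
on the lower device is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit only; NOT the kernel's value of IFF_XMIT_DST_RELEASE. */
#define TOY_XMIT_DST_RELEASE 0x1u

/* Toy stand-in for struct net_device. */
struct toy_netdev {
	unsigned int priv_flags;
	struct toy_netdev *lower;	/* underlying device, NULL if none */
	bool classifier_needs_dst;	/* hypothetical: a classifier reads skb->dst */
};

/* Recompute the vlan flag from the state of the underlying device:
 * set it when nothing below needs the dst, clear it otherwise.
 * Would have to be re-run whenever classifiers are added or removed.
 */
static void toy_vlan_update_dst_release(struct toy_netdev *vlan)
{
	bool lower_needs_dst = vlan->lower && vlan->lower->classifier_needs_dst;

	if (lower_needs_dst)
		vlan->priv_flags &= ~TOY_XMIT_DST_RELEASE;
	else
		vlan->priv_flags |= TOY_XMIT_DST_RELEASE;
}

/* Mirrors the test in __dev_queue_xmit(): true means skb_dst_drop(),
 * false means skb_dst_force().
 */
static bool toy_xmit_drops_dst(const struct toy_netdev *dev)
{
	return (dev->priv_flags & TOY_XMIT_DST_RELEASE) != 0;
}
```

With the flag set, the vlan transmit would take the skb_dst_drop() branch
quoted above and the skb_dst_force() refcount would not be taken at that point.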

Paolo
