On 2017-08-14 at 18:57, Paolo Abeni wrote:
On Mon, 2017-08-14 at 18:19 +0200, Jesper Dangaard Brouer wrote:
The output (extracted below) didn't show who called 'do_raw_spin_lock',
BUT it showed another interesting thing.  The kernel code
__dev_queue_xmit() might create a route dst-cache problem for itself(?),
as it will first call skb_dst_force() and then skb_dst_drop() when the
packet is transmitted on a VLAN.

  static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
  {
  [...]
        /* If device/qdisc don't need skb->dst, release it right now while
         * its hot in this cpu cache.
         */
        if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
                skb_dst_drop(skb);
        else
                skb_dst_force(skb);

I think that the high impact of the above code in this specific test is
mostly due to the following:

- ingress packets with different RSS rx hashes land on different CPUs
Yes, but isn't this normal?
Everybody who wants to balance load across cores will try to use as many
as possible :) With some limitations: 6 to 7 RSS queues work best, so 6 to
7 CPU cores are needed.

- but they use the same dst entry, since the destination IPs belong to
the same subnet
Typical for DDoS: many sources, one destination.


- the dst refcnt cacheline is contended between all the CPUs
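
To make the three points above concrete, here is a minimal user-space sketch of the access pattern, not kernel code. Everything in it is hypothetical: struct fake_dst, worker() and run_shared_dst() are illustration names, and the add/sub pair only approximates what taking and later releasing a dst reference does per packet. Each thread stands in for one CPU forwarding packets that all resolve to the same dst entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a dst entry: only the refcount matters here. */
struct fake_dst {
	atomic_int refcnt;
};

struct worker_arg {
	struct fake_dst *dst;
	long packets;
};

/* One "CPU": for every packet, take a reference on the shared dst and
 * drop it again. Both operations write to the same cacheline. */
static void *worker(void *p)
{
	struct worker_arg *arg = p;

	for (long i = 0; i < arg->packets; i++) {
		atomic_fetch_add(&arg->dst->refcnt, 1);
		atomic_fetch_sub(&arg->dst->refcnt, 1);
	}
	return NULL;
}

/* Run nthreads workers against ONE shared dst. Returns the final refcount
 * (1 when every reference was balanced), or -1 on setup failure. */
static int run_shared_dst(int nthreads, long packets_per_thread)
{
	enum { MAX_THREADS = 64 };
	pthread_t tids[MAX_THREADS];
	struct fake_dst dst;
	struct worker_arg arg;
	int started = 0;
	int failed = 0;

	assert(packets_per_thread >= 0);
	if (nthreads < 0 || nthreads > MAX_THREADS)
		return -1;

	atomic_init(&dst.refcnt, 1);	/* the reference held by the route itself */
	arg.dst = &dst;
	arg.packets = packets_per_thread;

	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[i], NULL, worker, &arg) != 0) {
			failed = 1;
			break;
		}
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	return failed ? -1 : atomic_load(&dst.refcnt);
}
```

Calling run_shared_dst(1, N) and then run_shared_dst(8, N) under a timer should show the per-thread cost going up with the thread count, since every increment and decrement has to pull the same cacheline to the local core. Giving each thread its own fake_dst, padded to a cacheline, would be the uncontended baseline to compare against.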

Perhaps we can improve the situation by setting the IFF_XMIT_DST_RELEASE
flag for vlan if the underlying device does not have a (relevant)
classifier attached? (and clearing it as needed)

Paolo

