On 2017-08-14 at 18:57, Paolo Abeni wrote:
On Mon, 2017-08-14 at 18:19 +0200, Jesper Dangaard Brouer wrote:
The output (extracted below) didn't show who called 'do_raw_spin_lock',
BUT it showed another interesting thing. The kernel function
__dev_queue_xmit() might create a route dst-cache problem for itself(?),
as it will first call skb_dst_force() and then skb_dst_drop() when the
packet is transmitted on a VLAN.
static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
{
[...]
	/* If device/qdisc don't need skb->dst, release it right now while
	 * its hot in this cpu cache.
	 */
	if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
		skb_dst_drop(skb);
	else
		skb_dst_force(skb);
I think that the high impact of the above code in this specific test is
mostly due to the following:
- ingress packets with different RSS rx hashes land on different CPUs
Yes, but isn't this normal?
Everybody who wants to balance load over cores will try to use as many
as possible :)
With some limitations ... 6 to 7 RSS queues work best, so I need to use
6 to 7 CPU cores.
- but they use the same dst entry, since the destination IPs belong to
the same subnet
Typical for DDoS: many sources, one destination.
- the dst refcnt cacheline is contended between all the CPUs
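
To make the cost in the points above concrete, here is a minimal
single-threaded userspace model, not kernel code. All model_* names are
hypothetical stand-ins, and it assumes each packet arrives with a dst that
was attached without taking a reference (noref), which is my understanding
of the forwarding path. It only counts how many atomic read-modify-write
operations hit the shared refcount per packet; on real hardware each of
those operations pulls the cacheline to the issuing CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IFF_XMIT_DST_RELEASE; not the kernel value. */
#define MODEL_XMIT_DST_RELEASE 0x1u

struct model_dst {
	atomic_int refcnt;      /* shared by every packet routed to the subnet */
	unsigned long rmw_ops;  /* atomic RMW operations issued on refcnt */
};

struct model_skb {
	struct model_dst *dst;
	bool noref;             /* dst attached without holding a reference */
};

/* Models skb_dst_drop(): releases a reference only if one is held. */
static void model_dst_drop(struct model_skb *skb)
{
	if (skb->dst && !skb->noref) {
		int old = atomic_fetch_sub(&skb->dst->refcnt, 1);

		assert(old > 0);
		skb->dst->rmw_ops++;
	}
	skb->dst = NULL;
	skb->noref = false;
}

/* Models skb_dst_force(): turns a noref dst into a held reference. */
static void model_dst_force(struct model_skb *skb)
{
	if (skb->dst && skb->noref) {
		atomic_fetch_add(&skb->dst->refcnt, 1);
		skb->dst->rmw_ops++;
		skb->noref = false;
	}
}

/*
 * Send n noref packets through a device with the given flags and return
 * how many atomic operations touched the shared refcount.
 */
static unsigned long model_xmit(struct model_dst *dst,
				unsigned int priv_flags, int n)
{
	unsigned long before = dst->rmw_ops;

	for (int i = 0; i < n; i++) {
		struct model_skb skb = { .dst = dst, .noref = true };

		if (priv_flags & MODEL_XMIT_DST_RELEASE)
			model_dst_drop(&skb);
		else
			model_dst_force(&skb);

		/* Freeing the packet later releases any reference still held. */
		model_dst_drop(&skb);
	}
	return dst->rmw_ops - before;
}
```

In this model a device with the flag set never touches the refcount, while
a device without it pays two atomic operations per packet (one hold, one
release). With packets spread over 6 to 7 CPUs and a single dst entry, that
is the kind of traffic that would bounce the refcnt cacheline.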
Perhaps we can improve the situation by setting the IFF_XMIT_DST_RELEASE
flag for vlan if the underlying device does not have a (relevant)
classifier attached? (and clearing it as needed)
Paolo