> > I have a Linux router with a lot of interfaces (hundreds or
> > thousands of VLANs) and an application that creates AF_PACKET
> > socket per interface and bind()s sockets to interfaces.
...
> > I noticed that box has strange performance problems with
> > most of the CPU time spent in __netif_receive_skb:
> >   86.15%  [k] __netif_receive_skb
> >    1.41%  [k] _raw_spin_lock
> >    1.09%  [k] fib_table_lookup
> >    0.99%  [k] local_bh_enable_ip
...
> > This corresponds to:
> >
> > net/core/dev.c:
> >         type = skb->protocol;
> >         list_for_each_entry_rcu(ptype,
> >                         &ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
> >                 if (ptype->type == type &&
> >                     (ptype->dev == null_or_dev || ptype->dev == skb->dev ||
> >                      ptype->dev == orig_dev)) {
> >                         if (pt_prev)
> >                                 ret = deliver_skb(skb, pt_prev, orig_dev);
> >                         pt_prev = ptype;
> >                 }
> >         }
> >
> > Which works perfectly OK until there are a lot of AF_PACKET sockets, since
> > the socket adds a protocol to ptype list:

Presumably the 'ethertype' is the same for all the sockets?
(And probably the '& PTYPE_HASH_MASK' doesn't separate it from
0800 or 0806 (IIRC IP and ARP).)

How often is that deliver_skb() inside the loop called?
If the code could be arranged so that the scan loop didn't
contain a function call then the loop code would be a lot
faster, since the compiler can cache values in registers.
While that would speed the code up somewhat, there would still
be a significant cost to iterate 1000+ times.

Looks like the ptype_base[] should be per 'dev'?
Or just put entries where ptype->dev != null_or_dev on a
per-interface list and do two searches?

	David
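
For illustration, the "per-interface list plus two searches" idea above
might look roughly like the following userspace model. This is only a
sketch, not kernel code: struct ptype, struct net_dev, the ptype_specific
field, ptype_add(), scan() and deliver() are all hypothetical simplified
stand-ins, the ethertype is kept in host byte order, and orig_dev, RCU
and locking are ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTYPE_HASH_SIZE 16
#define PTYPE_HASH_MASK (PTYPE_HASH_SIZE - 1)

struct net_dev;

/* Hypothetical, simplified stand-in for a protocol handler entry. */
struct ptype {
	uint16_t type;        /* ethertype (host byte order in this sketch) */
	struct net_dev *dev;  /* NULL means "any device" */
	int hits;             /* number of packets delivered to this entry */
	struct ptype *next;
};

/* Hypothetical device with its own list of device-bound handlers. */
struct net_dev {
	struct ptype *ptype_specific;
};

/* Global hash table, now holding only the wildcard (dev == NULL) entries. */
static struct ptype *ptype_base[PTYPE_HASH_SIZE];

/* Bound entries go on their device's list, wildcards on the global table. */
static void ptype_add(struct ptype *pt)
{
	struct ptype **head;

	assert(pt != NULL);
	head = pt->dev ? &pt->dev->ptype_specific
		       : &ptype_base[pt->type & PTYPE_HASH_MASK];
	pt->next = *head;
	*head = pt;
}

/* Walk one list, deliver to matching entries, count entries visited. */
static int scan(struct ptype *head, uint16_t type, int *visited)
{
	int delivered = 0;

	for (struct ptype *pt = head; pt; pt = pt->next) {
		++*visited;
		if (pt->type == type) {
			pt->hits++;
			delivered++;
		}
	}
	return delivered;
}

/* Two searches: the global wildcard bucket, then the receiving device. */
static int deliver(struct net_dev *dev, uint16_t type, int *visited)
{
	int delivered;

	*visited = 0;
	delivered = scan(ptype_base[type & PTYPE_HASH_MASK], type, visited);
	delivered += scan(dev->ptype_specific, type, visited);
	return delivered;
}
```

With one socket bound per interface, a packet arriving on a given
interface then only walks the wildcard bucket and that interface's own
list, instead of every bound entry that hashes to the same bucket.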