I haven't looked for years, but in the dim dark past many NIC drivers
made suboptimal hardware accesses that slowed them down quite a bit.
Profiling helped find some of them. Exactly how many cycles the CPU
stalls during a bus (PCI, PCIe, ISA, VME) reference depends on the CPU,
but a wild guess is 600 or more on a 3GHz CPU. A write reference can
cause barrier stalls as well.
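To put that guess in time units, here is a rough sketch of the
arithmetic in C. The function names are made up for illustration, and
the per-packet access count and packet rate are assumptions to plug in,
not measurements.

```c
#include <assert.h>

/* Nanoseconds the CPU stalls for one bus access.
 * cycles / GHz gives ns, because 1 GHz is one cycle per nanosecond. */
static double stall_ns(double stall_cycles, double cpu_ghz)
{
    assert(cpu_ghz > 0.0);
    return stall_cycles / cpu_ghz;
}

/* Fraction of one CPU (0.0 to 1.0) spent stalled on bus accesses.
 * accesses_per_packet and packets_per_sec are hypothetical inputs. */
static double stall_cpu_fraction(double accesses_per_packet,
                                 double packets_per_sec,
                                 double stall_cycles, double cpu_ghz)
{
    double ns_per_packet =
        accesses_per_packet * stall_ns(stall_cycles, cpu_ghz);
    return ns_per_packet * packets_per_sec / 1e9;
}
```

With the 600-cycle guess on a 3GHz CPU, stall_ns(600, 3.0) comes to
200 ns. If a driver made, say, five such accesses per packet at 100,000
packets per second, that alone would be about 10% of one CPU.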
I was working with FreeBSD at the time. From casual glances the OpenBSD
drivers *were* very similar in their bus access patterns.
Typically a bus access was done inside a per-packet loop when it could
have been hoisted out of the loop. Some reads could be issued earlier,
so that the data wasn't needed until a while later, which gets some
parallelism. These are the standard optimizations for slow references.
One or two truly egregious cases were inline spin waits for talking
through the NIC to the net modem chip(s).
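The hoisting case can be sketched in C roughly as follows. This is not
code from any real driver: fake_nic, nic_read_status and the "link up"
bit are invented stand-ins, and the bus is simulated by a counter so the
difference in access counts is visible. Hoisting is only valid when the
register value cannot change in a way that matters during the batch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a device: every read counts as one slow bus access. */
struct fake_nic {
    uint32_t status;     /* pretend hardware status register */
    unsigned bus_reads;  /* number of simulated bus accesses */
};

static uint32_t nic_read_status(struct fake_nic *nic)
{
    assert(nic != NULL);
    nic->bus_reads++;    /* a real read would stall the CPU here */
    return nic->status;
}

/* Slow pattern: the register is re-read for every packet. */
static unsigned rx_slow(struct fake_nic *nic, const uint16_t *len, size_t n)
{
    unsigned bytes = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t st = nic_read_status(nic);
        if (st & 1u)     /* bit 0: hypothetical "link up" flag */
            bytes += len[i];
    }
    return bytes;
}

/* Hoisted: one read before the loop serves the whole batch. */
static unsigned rx_hoisted(struct fake_nic *nic, const uint16_t *len, size_t n)
{
    uint32_t st = nic_read_status(nic);
    unsigned bytes = 0;
    for (size_t i = 0; i < n; i++) {
        if (st & 1u)
            bytes += len[i];
    }
    return bytes;
}
```

For a batch of four packets, rx_slow makes four simulated bus reads
while rx_hoisted makes one, and both return the same byte count.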
Even longer ago, on a 4.4BSD-based MP system, it helped a lot to allow
interrupts on CPUs other than 0, to take great care to lock the absolute
minimum of data, and so on. If interrupts had to be taken on CPU0, a
task queue that transferred responsibility to the next free processor
without an expensive CPU-to-CPU interrupt helped a great deal.
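The task queue idea might look something like the C sketch below. It is
a guess at the shape, not the code from that system: the names and the
queue depth are hypothetical, and the locking or atomic operations a
real MP kernel would need around the queue are left out to keep it
short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TQ_SLOTS 8           /* hypothetical queue depth */

struct task_queue {
    int    item[TQ_SLOTS];   /* e.g. receive-ring indices to process */
    size_t head;             /* next slot to dequeue */
    size_t count;            /* items currently queued */
};

/* Called from the CPU0 interrupt handler: record the work and return.
 * NOTE: no locking here; a real MP kernel would need it. */
static bool tq_post(struct task_queue *q, int work)
{
    assert(q != NULL);
    if (q->count == TQ_SLOTS)
        return false;        /* full: caller must handle the work itself */
    q->item[(q->head + q->count) % TQ_SLOTS] = work;
    q->count++;
    return true;
}

/* Called by whichever CPU becomes free next: pull the oldest item. */
static bool tq_take(struct task_queue *q, int *work)
{
    assert(q != NULL && work != NULL);
    if (q->count == 0)
        return false;        /* nothing pending */
    *work = q->item[q->head];
    q->head = (q->head + 1) % TQ_SLOTS;
    q->count--;
    return true;
}
```

The point of the design is that CPU0 only pays for tq_post, and a free
processor picks the work up by polling tq_take when it next looks for
something to do, so no CPU-to-CPU interrupt is needed.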
If those have been fixed, I apologize to the developers.
As I said, I haven't looked at NIC code since 3.9 or before.
On 10/27/2011 03:16 PM, tx wrote:
OK. I'm asking about special network tweaks or something like that. For
example, for FreeBSD there are sysctl tweaks that give zero CPU load
at 100 Mbps of traffic with a PRO1000/PT NIC, versus about 10% CPU load
without these tweaks on the same hardware and under the same conditions.