Robert Watson wrote:
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Distributing the interrupts and taskqueues among the available CPUs
gives concurrent forwarding with bi- or multi-directional traffic. All
incoming traffic from any particular interface is still serialized
though.
... although not on multiple input queue-enabled hardware and drivers.
While I've really only focused on local traffic performance with my
10Gbps Chelsio setup, it should be possible to do packet forwarding from
multiple input queues using that hardware and driver today.
I'll update the netisr2 patches, which allow work to be pushed to
multiple CPUs from a single input queue. However, these necessarily
take a cache miss or two on packet header data in order to break out the
packets from the input queue into flows that can be processed
independently without ordering constraints, so if those cache misses on
header data are a big part of the performance of a configuration, load
balancing in this manner may not help. What would be neat is if the
cards without multiple input queues could still tag receive descriptors
with a flow identifier generated from the IP/TCP/etc layers that could
be used for work placement.
The cache miss is really the elephant in the room. If the network card
supports multiple RX rings with separate interrupts and a stable
hash-based distribution (one that includes source and destination IP
address and port), the rings can be bound to different CPUs. It is very
important to maintain packet order for flows that go through the
router; otherwise TCP and VoIP will suffer.
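
To make the "stable hash" requirement concrete, here is a minimal sketch
of mapping a flow to an RX ring. It is not what any particular card or
driver does: struct flow_key, fnv1a() and flow_to_ring() are
hypothetical names, and FNV-1a is only a simple stand-in for whatever
hash the hardware implements. The point is that the ring index depends
only on the 4-tuple, so every packet of a flow lands on the same ring
and is handled in order by the CPU bound to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key: the IP+port 4-tuple, as read from the headers. */
struct flow_key {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* FNV-1a over a byte range; a stand-in for the card's real hash. */
static uint32_t
fnv1a(uint32_t h, const void *data, size_t len)
{
	const uint8_t *p = data;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;		/* 32-bit FNV prime */
	}
	return (h);
}

/*
 * Pick an RX ring for a flow.  The result depends only on the 4-tuple
 * and nrings, so packets of one flow always map to the same ring.
 */
static unsigned int
flow_to_ring(const struct flow_key *k, unsigned int nrings)
{
	uint32_t h = 2166136261u;	/* 32-bit FNV offset basis */

	assert(nrings > 0);
	/* Hash field by field so struct padding never enters the hash. */
	h = fnv1a(h, &k->src_ip, sizeof(k->src_ip));
	h = fnv1a(h, &k->dst_ip, sizeof(k->dst_ip));
	h = fnv1a(h, &k->src_port, sizeof(k->src_port));
	h = fnv1a(h, &k->dst_port, sizeof(k->dst_port));
	return (h % nrings);
}
```

Note that this sketch is direction-sensitive: the two directions of a
connection may map to different rings. That is fine for a router, since
ordering only has to hold within each direction of a flow.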
--
Andre
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net