Robert Watson wrote:

> On Mon, 7 Jul 2008, Andre Oppermann wrote:
>
>> Distributing the interrupts and taskqueues among the available CPUs gives concurrent forwarding with bi- or multi-directional traffic. All incoming traffic from any particular interface is still serialized though.
>
> ... although not on multiple input queue-enabled hardware and drivers. While I've really only focused on local traffic performance with my 10gbps Chelsio setup, it should be possible to do packet forwarding from multiple input queues using that hardware and driver today.
>
> I'll update the netisr2 patches, which allow work to be pushed to multiple CPUs from a single input queue. However, these necessarily take a cache miss or two on packet header data in order to break out the packets from the input queue into flows that can be processed independently without ordering constraints, so if those cache misses on header data are a big part of the performance of a configuration, load balancing in this manner may not help. What would be neat is if the cards without multiple input queues could still tag receive descriptors with a flow identifier generated from the IP/TCP/etc layers that could be used for work placement.

The cache miss is really the elephant in the room.  If the network card
supports multiple RX rings with separate interrupts and a stable
hash-based distribution (one that covers IP+port, src+dst), the rings
can be bound to different CPUs.  It is very important to maintain packet
order for flows that go through the router; otherwise TCP and VoIP will
suffer.
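
To illustrate the point about a stable hash, here is a minimal sketch
(not driver or netisr2 code) of mapping a flow's IP+port 4-tuple to a
queue index.  The names flow_queue, distribute and NUM_QUEUES are made
up for this example, and CRC32 is only a stand-in for whatever hash the
hardware actually implements.  Because the same 4-tuple always lands on
the same queue, packets of one flow stay in arrival order even though
different flows are handled on different CPUs.

```python
import struct
import zlib
from collections import defaultdict
from ipaddress import ip_address

NUM_QUEUES = 4  # assumption: one RX ring, bound to one CPU, per queue


def flow_queue(src_ip, src_port, dst_ip, dst_port, num_queues=NUM_QUEUES):
    """Map a flow's 4-tuple to a queue index; same tuple -> same queue."""
    key = (ip_address(src_ip).packed
           + ip_address(dst_ip).packed
           + struct.pack("!HH", src_port, dst_port))
    # CRC32 is a placeholder hash chosen for this sketch only.
    return zlib.crc32(key) % num_queues


def distribute(packets, num_queues=NUM_QUEUES):
    """Place packets on per-queue lists, keeping arrival order per queue."""
    queues = defaultdict(list)
    for pkt in packets:
        q = flow_queue(pkt["src"], pkt["sport"],
                       pkt["dst"], pkt["dport"], num_queues)
        queues[q].append(pkt)  # append keeps arrival order within a queue
    return queues
```

Each queue can then be drained by its own CPU without reordering any
single flow.  Note that this hash is direction-sensitive: the reverse
direction of a connection may land on a different queue, which is fine
for per-direction ordering.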

--
Andre
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net