On Fri, 18 Aug 2006, Andi Kleen wrote:

> Also I must say it's still not quite clear to me if it's better to place
> network packets on the node the device is connected to or on the 
> node which contains the CPU who processes the packet data 
> For RX this can be three different nodes in the worst case
> (CPU processing is often split on different CPUs between softirq
> and user context), for TX  two. Do you have some experience that shows 
> that a particular placement is better than the other?

The more nodes are involved, the more NUMA traffic there is and the slower
the overall performance. It is best to place all control information on the
node of the network card. The actual data is read via DMA, and one may
place it local to the executing process. The DMA transfer will then have
to cross the NUMA interlink, but that DMA transfer is a linear and
predictable stream that can often be optimized by hardware. If you
created the data on the network node instead, you would have off-node
overhead when creating the data (random access to cache lines?), which
is likely worse.
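
As a rough illustration of that split, here is a minimal sketch in plain C.
The struct and function names are hypothetical and are not kernel APIs; in
the kernel the actual allocations would presumably go through something
like kmalloc_node() with the chosen node.

```c
/* Hypothetical sketch: these names are illustrative, not kernel APIs. */
struct pkt_placement {
    int ctrl_node;  /* descriptors, ring state and other control data */
    int data_node;  /* packet payload buffer */
};

/* Control data stays with the card, payload stays with the process. */
static struct pkt_placement choose_placement(int nic_node, int proc_node)
{
    struct pkt_placement p;

    p.ctrl_node = nic_node;
    p.data_node = proc_node;
    return p;
}

/* Returns 1 if the payload DMA has to cross the NUMA interlink. */
static int dma_crosses_interlink(struct pkt_placement p)
{
    return p.ctrl_node != p.data_node;
}
```

With this policy only the linear DMA stream ever crosses nodes; the random
accesses made while building the data stay local to the process.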

One should be able to place the process near the node that has the network 
card. Our machines typically have multiple network cards. It would be best 
if the network stack could choose the one that is nearest to the process.
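
A sketch of what "nearest" could mean, assuming a flat node distance table
in the style of the ACPI SLIT (10 for local, larger for remote). The struct
nic type and nearest_nic() are made up for illustration; nothing like this
exists in the network stack as far as this mail is concerned.

```c
#include <stddef.h>

/* Hypothetical description of a network card and the node it sits on. */
struct nic {
    const char *name;
    int node;
};

/*
 * Pick the card with the smallest NUMA distance to proc_node.
 * dist is a row-major nr_nodes x nr_nodes table (assumed layout).
 * Returns the index into nics, or -1 if there are no cards.
 * On a tie the first card in the array wins.
 */
static int nearest_nic(const int *dist, int nr_nodes,
                       const struct nic *nics, size_t nr_nics,
                       int proc_node)
{
    const int *row = dist + (size_t)proc_node * (size_t)nr_nodes;
    int best = -1;
    size_t i;

    for (i = 0; i < nr_nics; i++) {
        if (best < 0 || row[nics[i].node] < row[nics[best].node])
            best = (int)i;
    }
    return best;
}
```

For a two-node box with one card per node, a process on node 1 would get
the card on node 1, and a process on node 0 the card on node 0.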
