[dpdk-dev] NUMA CPU Sockets and DPDK

Richardson, Bruce Wed, 12 Feb 2014 11:49:23 +0000

> 
> What has been your experience of using DPDK-based apps in NUMA mode
> with multiple sockets, where some cores are present on one socket and
> other cores on another socket?
> 
> I am migrating my application from one Intel machine with 8 cores, all in
> one socket, to a 32 core machine where 16 cores are in one socket and the
> other 16 cores are in the second socket.
> My core 0 does all initialization for mbufs, NIC ports, queues etc. and uses
> SOCKET_ID_ANY for socket related parameters.



Decide ahead of time which cores on which NUMA socket will run each part of 
your application, and then set up your objects in memory accordingly. 
SOCKET_ID_ANY should only be used to allocate items that are not used in the 
data path, where you therefore don't care about access time. Any rings or 
mempools should be created by specifying the socket on which their memory is 
to be allocated. If you are working across two sockets, in some cases you may 
want to duplicate your data structures, for example, use two memory pools - 
one on each socket - instead of one, so that all data access is local.
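
As a rough illustration of the "one pool per socket" idea, here is a minimal 
sketch in plain C. It makes no DPDK calls: struct pkt_pool, init_pools() and 
local_pool() are hypothetical names made up for this example, and the 
core-to-socket mapping simply assumes the even/odd layout from the lscpu 
output quoted in this thread. In a real application the pools would be 
created through the DPDK mempool API (rte_mempool_create() takes a socket_id 
argument), passing the explicit socket instead of SOCKET_ID_ANY.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SOCKETS 2

/* Hypothetical stand-in for a mempool handle; this is not a DPDK type. */
struct pkt_pool {
    int socket_id;      /* NUMA socket whose memory backs this pool */
    unsigned n_bufs;    /* number of buffers the pool holds */
};

/* One pool per socket, so each data-path core can use local memory. */
struct pkt_pool pools[NUM_SOCKETS];

/* Assumed layout (from the quoted lscpu output): even-numbered CPUs are on
 * node 0, odd-numbered CPUs are on node 1. */
int core_to_socket(unsigned core_id)
{
    return (int)(core_id % NUM_SOCKETS);
}

/* Set up one pool per socket. This is the point where a real application
 * would pass the explicit socket ID to the mempool creation call. */
void init_pools(unsigned bufs_per_socket)
{
    for (int s = 0; s < NUM_SOCKETS; s++) {
        pools[s].socket_id = s;
        pools[s].n_bufs = bufs_per_socket;
    }
}

/* Return the pool that is local to the socket of the given core. */
struct pkt_pool *local_pool(unsigned core_id)
{
    int s = core_to_socket(core_id);
    assert(s >= 0 && s < NUM_SOCKETS);
    return &pools[s];
}
```

With this layout, a data-path thread on core 1 would call local_pool(1) and 
get the socket 1 pool, while core 0 would get the socket 0 pool, so neither 
touches remote memory for its buffers.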

> 
> The usecase works, but I think I am running into performance issues on the
> 32 core machine.
> The lscpu output on my 32 core machine shows the following:
> NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
> NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
> I am using core 1 to lift all the data from a single queue of an 82599EB port
> and I see that the cpu utilization for this core 1 is way too high even for
> lifting traffic of 1 Gbps with packet size of 650 bytes.

How are you measuring the CPU utilization? When using the Intel DPDK, in most 
cases your CPU utilization will always be 100%, because the cores are 
constantly polling, so actual CPU headroom can be hard to judge at times.
Another thing to consider is the NUMA node to which each of your NICs is 
connected. You can check this using rte_eth_dev_socket_id(), assuming a modern 
platform where PCI connects straight to the CPUs. Whichever NUMA node the NIC 
is connected to, you want to run the code that polls the NIC RX queues on 
cores of that node, and do all packet transmission using cores on that node 
as well.
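
To make the core selection concrete, here is a small plain C sketch. It 
assumes you have already obtained the NIC's NUMA node (for example from 
rte_eth_dev_socket_id()) and passes it in as nic_node; the helper names 
core_numa_node(), cores_on_node() and is_local_to_nic() are made up for this 
example, and the even/odd core layout is taken from the lscpu output quoted 
above rather than discovered at run time.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CORES 32

/* Assumed layout (from the quoted lscpu output): even-numbered CPUs are on
 * node 0, odd-numbered CPUs are on node 1. */
int core_numa_node(unsigned core_id)
{
    return (int)(core_id % 2);
}

/* Write up to max_out core IDs that sit on nic_node into out, in ascending
 * order, and return how many were written. */
size_t cores_on_node(int nic_node, unsigned *out, size_t max_out)
{
    size_t n = 0;

    assert(out != NULL);
    for (unsigned c = 0; c < NUM_CORES && n < max_out; c++) {
        if (core_numa_node(c) == nic_node)
            out[n++] = c;
    }
    return n;
}

/* Return 1 if a core is on the same NUMA node as the NIC, 0 otherwise. */
int is_local_to_nic(unsigned core_id, int nic_node)
{
    return core_numa_node(core_id) == nic_node;
}
```

For example, if the port turned out to be attached to node 0, 
is_local_to_nic(1, 0) would return 0, meaning core 1 would be polling the RX 
queue across the socket interconnect, and cores_on_node(0, ...) would list 
the even-numbered cores as the candidates for RX and TX instead.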

> 
> In general, does one need to be careful in working with multiple sockets and
> so forth, any comments would be helpful.

In general, yes, you need to be a bit more careful, but the basic rules as 
outlined above should give you a good start.
