On Mon, 23 Sep 2024 20:31:16 +0000 (UTC)
amit sehas <[email protected]> wrote:
> We are seeing different DPDK threads (launched via rte_eal_remote_launch())
> demonstrate very different performance.
Are the DPDK threads running on isolated cpus?
Are the DPDK threads doing any system calls (use strace to check)?
>
> After placing counters all over the code, we realize that some threads are
> uniformly slow; in other words, there is no application-level issue that is
> throttling one thread over the other. We come to the conclusion that either
> the cores on which they are running are not at the same frequency, which seems
> doubtful, or the threads are not getting a chance to execute on the cores
> uniformly.
>
> It seems that isolcpus has been deprecated in recent versions of linux.
>
> What is the recommended approach to prevent the kernel from utilizing some
> CPU threads for anything other than the threads that are launched on them?
On modern Linux systems, CPU isolation can be achieved with cgroups (the
cpuset controller).
>
> Is there some API in dpdk which also helps us determine which CPU core the
> thread is pinned to?
> I did not find any code in dpdk which actually performed pinning of a thread
> to a CPU core.
It is here in lib/eal/linux/eal.c:

/* Launch threads, called at application init(). */
int
rte_eal_init(int argc, char **argv)
{
	...
	RTE_LCORE_FOREACH_WORKER(i) {
		...
		ret = rte_thread_set_affinity_by_id(lcore_config[i].thread_id,
			&lcore_config[i].cpuset);
		if (ret != 0)
			rte_panic("Cannot set affinity\n");
	}
	...
}
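
If you want to confirm the pinning from inside a worker, here is a minimal
sketch (not DPDK code) using the plain Linux calls sched_getaffinity() and
sched_getcpu(). It assumes Linux with glibc. The helper names
allowed_cpu_count() and report_thread_placement() are made up for
illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: number of CPUs the calling thread may run on,
 * or -1 if the affinity mask cannot be read. */
static int allowed_cpu_count(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	/* pid 0 means "the calling thread" for sched_getaffinity(). */
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	return CPU_COUNT(&set);
}

/* Hypothetical helper: print the CPU the calling thread is executing on
 * right now, followed by every CPU in its affinity mask. */
static void report_thread_placement(const char *tag)
{
	cpu_set_t set;
	int cpu;

	assert(tag != NULL);
	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0) {
		fprintf(stderr, "%s: cannot read affinity\n", tag);
		return;
	}
	printf("%s: running on CPU %d, allowed CPUs:", tag, sched_getcpu());
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			printf(" %d", cpu);
	printf("\n");
}
```

Calling report_thread_placement() at the top of the function passed to
rte_eal_remote_launch() should show exactly one allowed CPU per worker if
the pinning is in effect. Being pinned does not stop other tasks from being
scheduled on that CPU, which is why isolation still matters.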
>
> In our case it is more or less certain that the different threads are simply
> not getting the same CPU core time, as a result some are demonstrating higher
> throughput than the others ...
>
> How do we fix this?
Did you get profiling info? I would start by getting a flame graph using perf.
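
As a cheap cross-check alongside perf, each worker can sample its own
involuntary context switch count, which is one way to test the theory that
some threads are being preempted. This is only a sketch, assuming Linux with
glibc (RUSAGE_THREAD is Linux-specific). The helper names are made up for
illustration and are not DPDK APIs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: involuntary context switches of the calling thread
 * so far, or -1 if the counters cannot be read. */
static long thread_involuntary_switches(void)
{
	struct rusage ru;

	if (getrusage(RUSAGE_THREAD, &ru) != 0)
		return -1;
	return ru.ru_nivcsw;
}

/* Hypothetical helper: print how many times the calling thread was
 * preempted since the earlier sample 'before'. */
static void report_preemptions(const char *tag, long before)
{
	long now = thread_involuntary_switches();

	assert(tag != NULL);
	if (before < 0 || now < 0) {
		fprintf(stderr, "%s: cannot read rusage\n", tag);
		return;
	}
	printf("%s: %ld involuntary context switches\n", tag, now - before);
}
```

Take one sample when the worker starts its measurement window and call
report_preemptions() at the end. A thread that really has its core to itself
would be expected to show a count near zero. A slow thread with a much higher
count than a fast one points at scheduling rather than at the application.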