For a weaker AWS instance this is what we find with cpu_layout.py:

======================================================================
Core and Socket Information (as reported by '/sys/devices/system/cpu')
======================================================================

cores =  [0, 1]
sockets =  [0]
        Socket 0
        --------
Core 0  [0, 2]
Core 1  [1, 3]

So from this we deduce that logical cores 0 and 2 are on the same physical
core, and logical cores 1 and 3 are on the other physical core. Learnt
something valuable today ...

regards

On Thursday, September 26, 2024 at 08:13:48 PM PDT, amit sehas <cu...@yahoo.com> wrote:

Thanks for the suggestion, I didn't even know about cpu_layout.py ... I will
definitely try it. I made some more measurements and so far this is the
hypothesis:

1) 8 hyperthreads are not the same as 8 CPUs; the scale-up is not linear.

2) the vCPU cache allocation per logical CPU thread is also important; if 2
threads are running the same code on the same physical core but 2 different
logical cores then we will not have cores competing with each other.

3) try to run dissimilar code on the logical cores that run on the same
physical core ...

the CPU map is definitely worth figuring out ...

regards

On Thursday, September 26, 2024 at 08:03:30 PM PDT, Stephen Hemminger <step...@networkplumber.org> wrote:

On Thu, 26 Sep 2024 17:03:17 +0000 (UTC)
amit sehas <cu...@yahoo.com> wrote:

> If there is a way to determine:
>
> vCPU thread utilization numbers over a period of time, such as a few hours
>
> or which processes are consuming the most CPU
>
> top always indicates that the server is consuming the most CPU.
>
> Now I am beginning to wonder if 8 vCPU threads really are capable of
> running 6 high-intensity threads or only 4 such threads? Don't know.
>
> Also tried to utilize pthread_setschedparam() explicitly on some of the
> threads; it made no difference to the performance. But if we do it on
> more than 1-2 threads then it hangs the whole system.
>
> This is primarily a matter of CPU scheduling, and if we restrict context
> switching on even 2 critical threads we have a win.

Some other recommendations.
- avoid CPU 0
  you can't isolate it, and it has other stuff that has to run there.
  If you have a main thread that sleeps, and worker threads that poll,
  then go ahead and put main on cpu 0.
- don't put two active polling cores on a shared hyper-thread.
  You can use DPDK's cpu_layout.py script to show this.

For example:
$ ./usertools/cpu_layout.py
======================================================================
Core and Socket Information (as reported by '/sys/devices/system/cpu')
======================================================================

cores =  [0, 1, 2, 3]
sockets =  [0]

        Socket 0
        --------
Core 0  [0, 4]
Core 1  [1, 5]
Core 2  [2, 6]
Core 3  [3, 7]

On this system, don't poll on cores 0 and 4 (system activity).
Use lcores 1, 2, 3.
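To make the advice above concrete, here is a minimal sketch of pinning busy
worker threads so that no two of them land on sibling hyper-threads, leaving
CPU 0 for the main thread and system activity. The lcore numbers (1 and 2),
the worker() function, and the thread count are illustrative assumptions
based on the 2-core/4-thread layout at the top of the thread (Core 0 ->
[0, 2], Core 1 -> [1, 3]); read the sibling map for your own instance from
cpu_layout.py or /sys/devices/system/cpu/cpuN/topology/thread_siblings_list.

/* Sketch only: pin each worker to one logical CPU, chosen so the two
 * workers sit on different physical cores.  Build with: gcc -pthread. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Worker: bind itself to the logical CPU passed in, then (in a real
 * program) run its polling loop there. */
static void *worker(void *arg)
{
    int cpu = *(int *)arg;
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) != 0) {
        fprintf(stderr, "failed to pin thread to cpu %d\n", cpu);
        return NULL;
    }
    printf("worker pinned to logical cpu %d\n", cpu);
    /* ... polling loop would go here ... */
    return NULL;
}

int main(void)
{
    /* Logical CPUs 1 and 2 are on different physical cores in the layout
     * above; CPU 0 is left for the main thread and the OS. */
    int cpus[] = { 1, 2 };
    pthread_t tid[2];

    for (int i = 0; i < 2; i++) {
        if (pthread_create(&tid[i], NULL, worker, &cpus[i]) != 0) {
            fprintf(stderr, "pthread_create failed\n");
            return 1;
        }
    }
    for (int i = 0; i < 2; i++)
        pthread_join(tid[i], NULL);
    return 0;
}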
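On the pthread_setschedparam() point: below is a minimal sketch, assuming
Linux and a caller with CAP_SYS_NICE, of promoting one critical thread to
SCHED_FIFO. The helper name make_realtime and the priority value 10 are
invented for illustration. A FIFO thread that never blocks owns its CPU
outright, so promoting more polling threads than there are spare logical
CPUs can starve the rest of the system, which is consistent with the hang
reported when this was applied to more than 1-2 threads.

/* Sketch only: raise one thread to the SCHED_FIFO real-time policy. */
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Returns 0 on success, otherwise the pthread error code. */
static int make_realtime(pthread_t tid, int priority)
{
    struct sched_param sp;
    int err;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = priority;   /* 1..99 for SCHED_FIFO on Linux */
    err = pthread_setschedparam(tid, SCHED_FIFO, &sp);
    if (err != 0)
        fprintf(stderr, "pthread_setschedparam: %s\n", strerror(err));
    return err;
}

int main(void)
{
    /* Promote only the calling thread; a FIFO thread that never sleeps
     * will monopolise whichever CPU it is pinned to. */
    if (make_realtime(pthread_self(), 10) == 0)
        printf("running under SCHED_FIFO, priority 10\n");
    return 0;
}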