If P=1 and Q=1, you're setting up a 1x1 process grid, which should only need a
single processor. Something tells me you have 4 independent HPL jobs
running, rather than one job using 4 threads. I think you should have a
2x2 grid if you want to use 4 threads. For HPL, P * Q = number of MPI
processes being used (which equals the number of cores when each process
runs single-threaded).
By default, OMPI will bind your procs to a single core. You probably want to at
least bind to socket (for NUMA reasons), or not bind at all if you want to use
all the cores on the node.
So either add "--bind-to socket" or "--bind-to none" to your cmd line.
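As a rough sketch of what those two command lines could look like for the
P=1, Q=1 case: the binary name ./xhpl and the count of 4 OpenMP threads are
assumptions here, so adjust them to your build and node. The script only
prints the commands, so you can check them before launching anything.

```shell
#!/bin/sh
# Sketch only: ./xhpl and threads=4 are assumed values, not taken from a real run.
np=1        # MPI ranks; should match P * Q in HPL.dat
threads=4   # OpenMP threads per rank

# One rank, not bound, so its threads can spread over all cores on the node.
cmd_none="mpirun -np $np --bind-to none -x OMP_NUM_THREADS=$threads ./xhpl"

# One rank confined to a single socket, which keeps memory accesses NUMA-local.
cmd_socket="mpirun -np $np --bind-to socket -x OMP_NUM_THREADS=$threads ./xhpl"

# Printed rather than executed; launch one with, e.g.: eval "$cmd_none"
echo "$cmd_none"
echo "$cmd_socket"
```

Adding --report-bindings to either command makes mpirun print where each rank
ended up, which is a quick way to confirm the binding is what you expected.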
On Aug 3, 2020, at 1:33 AM, John Duff wrote:
Hi
I’m experimenting with hybrid OpenMPI/OpenMP Linpack benchmarks on my small
cluster, and I’m a bit confused as to how to invoke mpirun.
I have compiled/linked HPL-2.3 with OpenMPI and libopenblas-openmp using the
GCC -fopenmp option on Ubuntu 20.04 64-bit.
With P=1 and Q=1 in HPL.dat, if I