Hi,

For testing purposes I am running some MPI+OpenMP benchmarks with
`mpirun -np 1 ./a.out`, using Open MPI 3.1.3.
As far as I understand, `mpirun` sets an affinity mask, and the OpenMP runtime 
(in my case the LLVM OpenMP RT) respects this mask and only sees 1 physical 
core.
I am running on a POWER8 machine, which has 8 logical cores (hardware threads)
per physical core. By default the OpenMP runtime creates one thread per
available logical core (160 on my machine), but because of the mask it creates
only 8 threads.
All of these threads run on the same physical core, which makes the program
slower than it would be with each thread on a different physical core.
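
A minimal sketch for checking which logical CPUs a process is actually allowed
to use, assuming Linux and Python 3 (neither is part of the setup described
above). The file name `check_mask.py` and the helper names are hypothetical;
`os.sched_getaffinity` is the standard-library call that reads the mask.

```python
import os


def visible_cpus():
    """Return the sorted logical CPU ids this process may run on."""
    # os.sched_getaffinity is Linux-only; pid 0 means the calling process.
    return sorted(os.sched_getaffinity(0))


def format_ranges(cpus):
    """Compress a sorted list of ids into ranges, e.g. [0, 1, 2, 8] -> "0-2,8"."""
    ranges = []
    start = prev = None
    for cpu in cpus:
        if start is None:
            start = prev = cpu          # open the first range
        elif cpu == prev + 1:
            prev = cpu                  # extend the current range
        else:
            ranges.append((start, prev))
            start = prev = cpu          # close it and open a new one
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


if __name__ == "__main__":
    cpus = visible_cpus()
    print(f"{len(cpus)} logical CPUs in mask: {format_ranges(cpus)}")
```

Launching it once as `mpirun -np 1 python3 check_mask.py` and once with
`--bind-to none` added should show whether the mask differs between the two
cases. On the machine described above that would presumably be 8 versus 160
logical CPUs, but this has not been verified here.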

So my question is: what is the right way to run a single MPI process so that
the OpenMP threads can run on different physical cores, independently of the
mask set by `mpirun`?

I know about the `--bind-to none` option; with it, all the cores in the system
become available and the OpenMP runtime uses all of them.
Alternatively, from a web search I read that a singleton MPI program should be
run as `OMPI_MCA_ess_singleton_isolated=1 ./a.out`, without `mpirun` at all,
but I couldn't find a good explanation of this.

Could anyone clarify this?

Thank you!
Simone

