Re: [OMPI users] Openmpi 1.10.x, mpirun and Slurm 15.08 problem

2016-09-23 Thread Marcin Krotkiewski
Hi, I have stumbled upon a similar issue, so I wonder if the two might be related. On one of our systems I get the following error message with both OpenMPI 1.8.8 and 1.10.4: $ mpirun -debug-daemons --mca btl tcp,self --mca mca_base_verbose 100 --mca btl_base_verbose 100 ls [...] [compute

Re: [OMPI users] Hybrid OpenMPI+OpenMP tasks using SLURM

2015-10-09 Thread Marcin Krotkiewski
something else did as I seem to recall we handled this okay before (but I could be wrong). Fixing that will take some time that I honestly won’t have for a while. On Oct 9, 2015, at 6:14 AM, Marcin Krotkiewski wrote: Ralph, Here is the result running mpirun --map-by slot:pe=4 -display

Re: [OMPI users] Hybrid OpenMPI+OpenMP tasks using SLURM

2015-10-09 Thread Marcin Krotkiewski
Ralph, Here is the result running mpirun --map-by slot:pe=4 -display-allocation ./affinity
== ALLOCATED NODES ==
c12-29: slots=4 max_slots=0 slots_inuse=0 state=UP
=
rank 0 @ compute-
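
The ./affinity binary in the quoted command is presumably a small probe that prints where each rank and thread ended up; the thread does not include its source, so the following is only a minimal sketch of such a hybrid MPI+OpenMP probe (file name, build line, and output format are assumptions):

/* affinity.c - hypothetical reconstruction of a rank/thread binding probe.
 * Build (assumed):  mpicc -fopenmp affinity.c -o affinity
 * Run   (assumed):  mpirun --map-by slot:pe=4 ./affinity
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, namelen;
    char host[MPI_MAX_PROCESSOR_NAME];
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &namelen);

    /* Each OpenMP thread reports the core it is currently running on. */
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        int cpu = sched_getcpu();
        printf("rank %d thread %d @ %s, cpu %d\n", rank, tid, host, cpu);
    }

    MPI_Finalize();
    return 0;
}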

Re: [OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-08 Thread Marcin Krotkiewski
Hi, Gilles, I have briefly tested your patch with master. So far everything works. I must say what I really like about this version is that with --report-bindings it actually shows what the heterogeneous architecture looks like, i.e., the varying number of cores/sockets per compute node. This

[OMPI users] Process binding with SLURM and 'heterogeneous' nodes

2015-10-02 Thread Marcin Krotkiewski
Hi, I fail to make OpenMPI bind to cores correctly when running from within SLURM-allocated CPU resources spread over a range of compute nodes in an otherwise homogeneous cluster. I have found this thread http://www.open-mpi.org/community/lists/users/2014/06/24682.php and did try to use what
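
Independently of --report-bindings, the effective binding can be checked from inside the job by printing each rank's affinity mask; this is a generic diagnostic sketch, not code from this thread (program name and output format are illustrative):

/* bindcheck.c - prints the CPU affinity mask of every MPI rank.
 * Generic diagnostic; compile with mpicc, run under the SLURM allocation.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sched.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    cpu_set_t mask;
    CPU_ZERO(&mask);
    sched_getaffinity(0, sizeof(mask), &mask);     /* 0 = calling process */

    char list[8192] = "";
    for (int c = 0; c < CPU_SETSIZE; c++) {
        if (CPU_ISSET(c, &mask)) {
            char tmp[16];
            snprintf(tmp, sizeof(tmp), "%d ", c);
            strcat(list, tmp);
        }
    }
    printf("rank %d bound to cpus: %s\n", rank, list);

    MPI_Finalize();
    return 0;
}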

[OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread Marcin Krotkiewski
Hi, I am trying to compile the 2.x branch with libfabric support, but I get this error during configure:
configure:100708: checking rdma/fi_ext_usnic.h presence
configure:100708: gcc -E -I/cluster/software/VERSIONS/openmpi.gnu.2.x/include -I/usit/abel/u1/marcink/software/ompi-release-2.x/opal/

Re: [OMPI users] Using POSIX shared memory as send buffer

2015-09-29 Thread Marcin Krotkiewski
Thanks, Dave. I have verified the memory locality and IB card locality, all's fine. Quite accidentally I have found that there is a huge penalty if I mmap the shm with PROT_READ only. Using PROT_READ | PROT_WRITE yields good results, although I must look at this further. I'll report when I am
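
For reference, a minimal sketch of the fast case described here, i.e. mapping a POSIX shared-memory segment with PROT_READ | PROT_WRITE and using it as an MPI send buffer; the segment name, size, and message pattern are illustrative, not taken from the thread (run with at least 2 ranks; some systems need -lrt for shm_open):

/* shm_send.c - illustrative only: map a POSIX shm segment writably
 * (PROT_READ | PROT_WRITE, the fast case reported above) and use it
 * as an MPI send buffer. Error handling omitted for brevity.
 */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const size_t len = 1 << 20;                    /* 1 MiB, illustrative */
    char name[64];
    snprintf(name, sizeof(name), "/demo_sendbuf_%d", rank);

    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    ftruncate(fd, len);

    /* Writable mapping; mapping with PROT_READ alone was observed to be
     * much slower in this thread. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (rank == 0) {
        memset(buf, 1, len);
        MPI_Send(buf, (int)len, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, (int)len, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    munmap(buf, len);
    close(fd);
    shm_unlink(name);
    MPI_Finalize();
    return 0;
}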

Re: [OMPI users] bug in MPI_Comm_accept?

2015-09-16 Thread Marcin Krotkiewski
But where would I put it? If I put it inside the while(1) loop, then MPI_Comm_accept cannot be called a second time. If I put it outside of the loop, it will never be called. On 09/16/2015 04:18 PM, Jalel Chergui wrote: Can you check with an MPI_Finalize in the receiver? Jalel Le 16/09/2015 16:
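
A minimal sketch of the accept loop under discussion, with MPI_Finalize placed after the loop rather than inside it; the stop condition and the message protocol are illustrative assumptions, not what the original code did:

/* server.c - illustrative: accept client connections in a loop and call
 * MPI_Finalize only after the loop has ended.
 */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    char port[MPI_MAX_PORT_NAME];
    MPI_Open_port(MPI_INFO_NULL, port);
    printf("server listening on port: %s\n", port);

    int keep_running = 1;                  /* illustrative stop condition */
    while (keep_running) {
        MPI_Comm client;
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);

        /* Receive one message; a negative value tells the server to stop
         * accepting further connections. */
        int msg = 0;
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, client, MPI_STATUS_IGNORE);
        if (msg < 0)
            keep_running = 0;

        MPI_Comm_disconnect(&client);
    }

    MPI_Close_port(port);
    MPI_Finalize();                        /* after the loop, not inside it */
    return 0;
}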

Re: [OMPI users] Wrong distance calculations in multi-rail setup?

2015-08-28 Thread Marcin Krotkiewski
Brilliant! Thank you, Rolf. This works: all ranks have reported using the expected port number, and performance is twice what I was observing before :) I can certainly live with this workaround, but I will be happy to do some debugging to find the problem. If you tell me what is needed /