Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Ben Menadue
Hi Gilles, Wow, thanks - that was quick. I'm rebuilding now. Cheers, Ben

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Gilles Gouaillardet
Ben, here is a patch that does fix that. Sorry for the inconvenience, and thanks for your help in understanding this issue. Cheers, Gilles
diff --git a/opal/mca/hwloc/base/hwloc_base_util.c b/opal/mca/hwloc/base/hwloc_base_util.c
index 237c6b0..a4fa193 100644
--- a/opal/mca/hwloc_
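The patch touches the hwloc-based accounting in opal/mca/hwloc/base/hwloc_base_util.c. As a rough illustration of what that accounting has to deal with (a standalone sketch, not the Open MPI code and not the actual fix, assuming hwloc 1.x where the WHOLE_SYSTEM flag keeps disallowed PUs visible), the following program reports, per core, how many hardware threads the cgroup cpuset actually allows:

    /* Illustrative sketch only, NOT the Open MPI patch. Shows the accounting
     * problem in this thread: with a cgroup cpuset that exposes only one
     * hardware thread per core, each core still exists in the topology but
     * only part of it is allowed. Assumes hwloc 1.x. */
    #include <stdio.h>
    #include <hwloc.h>

    int main(void)
    {
        hwloc_topology_t topo;
        hwloc_topology_init(&topo);
        /* keep PUs that the cgroup disallows visible in the topology */
        hwloc_topology_set_flags(topo, HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM);
        hwloc_topology_load(topo);

        hwloc_const_cpuset_t allowed = hwloc_topology_get_allowed_cpuset(topo);
        int ncores = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
        hwloc_bitmap_t avail = hwloc_bitmap_alloc();

        for (int i = 0; i < ncores; i++) {
            hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, i);
            /* hardware threads of this core that the cpuset actually allows */
            hwloc_bitmap_and(avail, core->cpuset, allowed);
            printf("core %d: %d of %d hardware threads allowed\n", i,
                   hwloc_bitmap_weight(avail), hwloc_bitmap_weight(core->cpuset));
        }

        hwloc_bitmap_free(avail);
        hwloc_topology_destroy(topo);
        return 0;
    }

Inside a cpuset that exposes only one hyperthread per core this would print "1 of 2" for every core, which matches the cgroup Ben shows below.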

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Ben Menadue
Yes, I'm able to reproduce it on a single node as well. Actually, even on just a single CPU (and -np 1) - it won't let me launch unless both threads of that core are in the cgroup.

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Gilles Gouaillardet
I was able to reproduce the issue on one node with a cpuset manually set. FWIW, I cannot reproduce the issue using taskset instead of cpuset (!) Cheers, Gilles
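One plausible reason for the taskset/cpuset difference (a hedged observation, not a diagnosis from the thread): taskset only narrows the process's scheduling affinity, while a cgroup cpuset narrows the set of PUs hwloc reports as allowed. A small sketch, again assuming hwloc 1.x, that prints both views:

    /* Illustrative sketch: compare the process binding (what taskset sets)
     * with the allowed cpuset (what a cgroup cpuset restricts). */
    #include <stdio.h>
    #include <hwloc.h>

    int main(void)
    {
        hwloc_topology_t topo;
        hwloc_bitmap_t bind = hwloc_bitmap_alloc();
        char buf[256];

        hwloc_topology_init(&topo);
        hwloc_topology_load(topo);

        /* current affinity mask of this process: this is what taskset changes */
        hwloc_get_cpubind(topo, bind, HWLOC_CPUBIND_PROCESS);
        hwloc_bitmap_snprintf(buf, sizeof(buf), bind);
        printf("process binding (taskset): %s\n", buf);

        /* PUs the process is allowed to use: this is what the cgroup cpuset
         * restricts and what hwloc reports as the allowed cpuset */
        hwloc_bitmap_snprintf(buf, sizeof(buf),
                              hwloc_topology_get_allowed_cpuset(topo));
        printf("allowed cpuset  (cgroup):  %s\n", buf);

        hwloc_bitmap_free(bind);
        hwloc_topology_destroy(topo);
        return 0;
    }

Under taskset the allowed cpuset still covers every online PU and only the binding shrinks, so a detection path keyed off the allowed set would not see the restriction, which may be why the issue does not reproduce that way.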

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Gilles Gouaillardet
Ben, what is the minimum number of nodes required to reproduce the issue? e.g. can you reproduce it with one node? Cheers, Gilles

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Ben Menadue
Hi Gilles, Ralph, Okay, it definitely seems to be due to the cpuset having only one of the hyperthreads of each physical core:
[13:02:13 root@r60:4363542.r-man2] # echo 0-15 > cpuset.cpus
13:03 bjm900@r60 ~ > cat /cgroup/cpuset/pbspro/4363542.r-man2/cpuset.cpus
0-15
13:03 bjm900@r60 ~ > /apps

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Ben Menadue
Hi Gilles,
> with respect to PBS, are both OpenMPI built the same way?
> e.g. configure --with-tm=/opt/pbs/default or something similar
Both are built against TM explicitly using the --with-tm option.
> you can run
> mpirun --mca plm_base_verbose 100 --mca ess_base_verbose 100 --mca ras_base_ve

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Gilles Gouaillardet
Ben, with respect to PBS, are both OpenMPI built the same way? e.g. configure --with-tm=/opt/pbs/default or something similar. You can run
mpirun --mca plm_base_verbose 100 --mca ess_base_verbose 100 --mca ras_base_verbose 100 hostname
and you should see the "tm" module in the logs. I notice

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Ralph Castain
Actually, looking at the output, it appears that we are correctly detecting the cpus. It looks instead like there is some other setting that is overriding the discovery. Is your allocation setting a specific cpuset? Or are you allocating the entire node?

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread tmishima
Hi Ben and Ralph, just a very short comment. The error message shows the hardware detection doesn't work well, because it says the number of cpus is zero.
> >   #cpus-per-proc:  1
> >   number of cpus:  0
> >   map-by:  BYSOCKET:NOOVERSUBSCRIBE
Regards, Tetsuya

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Ben Menadue
Thanks Ralph,
There are no MCA parameters in my environment at all. Here's the contents of openmpi-mca-params.conf:
mpi_leave_pinned = 0
hwloc_base_binding_policy = core
rmaps_base_mapping_policy = core
hwloc_base_mem_alloc_policy = local_only
shmem_mmap_enable_nfs_warning = 0
pml = ^ya

Re: [OMPI users] Any changes to rmaps in 1.10.2?

2016-01-28 Thread Ralph Castain
I'm unaware of any change that would impact you here. For some reason, mpirun believes you are requesting multiple cpus-per-proc, and that seems to be the heart of the problem. Is there an MCA parameter in your environment or default param file, perhaps?

Re: [OMPI users] Using MPI_Type_create_resized is leading to segfault when one-sided communication is used (ungarbled)

2016-01-28 Thread Gilles Gouaillardet
James, for the v1.8/v1.10 series, the fix is available at
https://github.com/ggouaillardet/ompi-release/commit/c301bab8c9aff76eb7a3ee56b965b6ff3cf0073c.diff
FWIW, I ran the test program under the debugger, and the datatype is the same before and after MPI_Type_create_resized (e.g. the compile
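For context, a minimal sketch of the kind of pattern under discussion: a datatype whose extent has been changed with MPI_Type_create_resized, used as the target datatype of an MPI_Put inside a fence epoch. It illustrates the API usage only; it is not James's reproducer or the test program referred to above:

    /* Minimal sketch: a resized datatype as the target type of MPI_Put.
     * Illustration only, not the program from this thread. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        double win_buf[8] = {0};
        MPI_Datatype strided;
        MPI_Win win;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* one double whose extent is stretched to two doubles, so consecutive
         * elements land on every second slot at the target */
        MPI_Type_create_resized(MPI_DOUBLE, 0, 2 * sizeof(double), &strided);
        MPI_Type_commit(&strided);

        MPI_Win_create(win_buf, sizeof(win_buf), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);
        if (rank == 0 && size > 1) {
            double src[4] = {1.0, 2.0, 3.0, 4.0};
            /* four contiguous doubles on the origin, written with the
             * resized type on target rank 1 */
            MPI_Put(src, 4, MPI_DOUBLE, 1, 0, 4, strided, win);
        }
        MPI_Win_fence(0, win);

        if (rank == 1)
            printf("win_buf[0,2,4,6] = %g %g %g %g\n",
                   win_buf[0], win_buf[2], win_buf[4], win_buf[6]);

        MPI_Win_free(&win);
        MPI_Type_free(&strided);
        MPI_Finalize();
        return 0;
    }

Run with at least two ranks, this scatters four doubles into every second slot of rank 1's window; a combination along these lines (resized type plus one-sided communication) is what the reported segfault involved.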