[OMPI users] openmpi-v2.0.x-201709300323-7b3806e

2017-10-02 Thread Siegmar Gross
Hi, I've tried to install openmpi-v2.0.x-201709300323-7b3806e on my "SUSE Linux Enterprise Server 12.2 (x86_64)" with Sun C 5.15 (Oracle Developer Studio 12.6). Unfortunately I still get the following error that I reported some time ago. loki openmpi-v2.0.x-201709300323-7b3806e-Linux.x86_64.64_c

Re: [OMPI users] OpenMPI 3.0.0, compilation using Intel icc 11.1 on Linux, error when compiling pmix_mmap

2017-10-02 Thread Jeff Squyres (jsquyres)
Ralph -- I think he cited a typo in his email. The actual file he is referring to is:

    $ find . -name pmix_mmap.c
    ./opal/mca/pmix/pmix2x/pmix/src/sm/pmix_mmap.c

From his log file, there appear to be two problems:

    sm/pmix_mmap.c(66): warning #266: function "posix_fallocate" dec

Re: [OMPI users] OpenMPI 3.0.0, compilation using Intel icc 11.1 on Linux, error when compiling pmix_mmap

2017-10-02 Thread r...@open-mpi.org
I correctly understood the file and the errors. I’m just pointing out that the referenced file cannot possibly contain a pointer to opal/threads/condition.h. There is no include in that file that can pull in that path.

Re: [OMPI users] OpenMPI 3.0.0, compilation using Intel icc 11.1 on Linux, error when compiling pmix_mmap

2017-10-02 Thread Jeff Squyres (jsquyres)
I think that file does get included indirectly, but the real issue is the old Intel compiler not supporting (struct argparse). I.e., the solution might well be "use a newer compiler."

Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-02 Thread Anthony Thyssen
Update... The problem of all processes running on the first node (an oversubscribed dual-core machine) is NOT resolved. Changing the mpirun command in the Torque batch script ("pbs_hello" - see previous) to

    mpirun --nooversubscribe --display-allocation hostname

then submitting to PBS/Torque using qs

Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-02 Thread r...@open-mpi.org
One thing I can see is that the local host (where mpirun executed) shows as “node21” in the allocation, while all others show their FQDN. This might be causing some confusion. You might try adding "--mca orte_keep_fqdn_hostnames 1" to your cmd line and see if that helps.

Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-02 Thread Anthony Thyssen
I noticed that too. Though the submitting host for Torque is a different host (the main head node, "shrek"), "node21" is the host on which Torque runs the batch script (and the mpirun command), it being the first node in the "dualcore" resource group. Adding option... It fixed the hostname in the alloca

Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-02 Thread Gilles Gouaillardet
Anthony, in your script, can you

    set -x
    env
    pbsdsh hostname
    mpirun --display-map --display-allocation --mca ess_base_verbose 10 --mca plm_base_verbose 10 --mca ras_base_verbose 10 hostname

and then compress and send the output? Cheers, Gilles

Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-02 Thread Anthony Thyssen
The stdin and stdout are saved to separate channels. It is interesting that the output from pbsdsh is node21.emperor 5 times, even though $PBS_NODES is the 5 individual nodes. Attached are the two compressed files, as well as the pbs_hello batch used. Anthony Thyssen ( System Programmer )

Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-02 Thread Gilles Gouaillardet
Anthony, we had a similar issue reported some time ago (e.g. Open MPI ignores torque allocation), and after quite some troubleshooting, we ended up with the same behavior (e.g. pbsdsh is not working as expected). See https://www.mail-archive.com/users@lists.open-mpi.org/msg29952.html for