Re: [OMPI users] KNEM errors when running OMPI 2.0.1

2017-01-18 Thread Gilles Gouaillardet
Juan, You also need to make sure knem and MOFED drivers are loaded on all the compute nodes. You also need to double check the permissions of /dev/knem Cheers, Gilles On Wednesday, January 18, 2017, Juan A. Cordero Varelaq < bioinformatica-i...@us.es> wrote: > Sure, I attach the config.log fro

Re: [OMPI users] Startup limited to 128 remote hosts in some situations?

2017-01-18 Thread William Hay
On Tue, Jan 17, 2017 at 09:56:54AM -0800, r...@open-mpi.org wrote: > As I recall, the problem was that qrsh isn't available on the backend > compute nodes, and so we can't use a tree for launch. If that isn't true, > then we can certainly adjust it. > qrsh should be available on all nodes o

Re: [OMPI users] KNEM errors when running OMPI 2.0.1

2017-01-18 Thread Juan A. Cordero Varelaq
Hi, knem and MOFED drivers are installed in /opt: * /opt/knem-1.1.90mlnx2 * /opt/mellanox/fca * /opt/mellanox/mxm * /opt/mellanox/openshmem However, /dev/knem does not exist. Cheers, Juan On 18/01/17 11:36, Gilles Gouaillardet wrote: Juan, You also need to make sure knem and MOFED driver

Re: [OMPI users] KNEM errors when running OMPI 2.0.1

2017-01-18 Thread Gilles Gouaillardet
Juan, So you need to load the knem module sudo modprobe knem and then you can check it is correctly loaded with lsmod Loading the module should automagically create /dev/knem, but maybe not with the permissions you expect Cheers, Gilles On Wednesday, January 18, 2017, Juan A. Cordero Varelaq <
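A minimal sketch of the check described above, written as a small Python script that could be run on each compute node. It assumes a Linux system where /proc/modules lists loaded modules (the same data lsmod reports); the function name knem_status and its parameters are hypothetical and only for illustration.

```python
import os
import stat


def knem_status(modules_path="/proc/modules", device_path="/dev/knem"):
    """Return (module_loaded, device_mode).

    module_loaded is True if a module named "knem" is listed in modules_path.
    device_mode is a string such as "crw-rw-rw-", or None if the device
    file does not exist or cannot be inspected.
    """
    loaded = False
    try:
        with open(modules_path) as fh:
            # The first whitespace-separated field of each line is the module name.
            loaded = any(line.split()[0] == "knem" for line in fh if line.strip())
    except OSError:
        pass  # Treat an unreadable module list as "not loaded".

    try:
        mode = stat.filemode(os.stat(device_path).st_mode)
    except OSError:
        mode = None  # Device file missing or inaccessible.
    return loaded, mode


if __name__ == "__main__":
    loaded, mode = knem_status()
    print("knem module loaded:", loaded)
    print("/dev/knem permissions:", mode if mode else "device not found")
```

If the module shows as loaded but the permissions do not allow your user to read and write the device, that is the case to fix before retrying the MPI job.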

Re: [OMPI users] KNEM errors when running OMPI 2.0.1

2017-01-18 Thread Juan A. Cordero Varelaq
Hi, when I try sudo modprobe knem, I get: FATAL: Error inserting knem (/lib/modules/3.13.0-37-generic/updates/dkms/knem.ko): Invalid module format Cheers, Juan On 18/01/17 15:08, Gilles Gouaillardet wrote: Juan, So you need to load the knem module sudo modprobe knem and then you can check i

[OMPI users] MPI_File_write_shared() and MPI_MODE_APPEND issue ?

2017-01-18 Thread Nicolas Joly
Hi, We have a tool where all workers will use MPI_File_write_shared() on a file that was opened with MPI_MODE_APPEND, mostly because rank 0 will have written some format specific header data. We recently upgraded our openmpi version from v1.10.4 to v2.0.1. And at that time we noticed a behaviour

Re: [OMPI users] MPI_File_write_shared() and MPI_MODE_APPEND issue ?

2017-01-18 Thread Edgar Gabriel
I will look into this, I have a suspicion on what might be wrong. Give me a day or three. Thanks Edgar On 1/18/2017 9:36 AM, Nicolas Joly wrote: Hi, We have a tool where all workers will use MPI_File_write_shared() on a file that was opened with MPI_MODE_APPEND, mostly because rank 0 will h

Re: [OMPI users] Rounding errors and MPI

2017-01-18 Thread Jeff Hammond
If compiling with -O0 solves the problem, then you should use -assume protect-parens and/or one of the options discussed in the PDF you will find at https://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler. Disabling optimization is a heavy hammer tha
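As background for why optimization level can change results at all: floating-point addition is not associative, so any reordering of a sum (by the compiler, or by a different reduction order across ranks) can change the last bits. A minimal Python sketch of that effect follows; the helper names are hypothetical, and this illustrates the arithmetic only, not what any particular compiler flag does.

```python
import math


def sum_left_to_right(values):
    """Accumulate as ((v0 + v1) + v2) + ..."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_right_to_left(values):
    """Accumulate the same values in the opposite order."""
    total = 0.0
    for v in reversed(values):
        total += v
    return total


values = [0.1, 0.2, 0.3]
forward = sum_left_to_right(values)
backward = sum_right_to_left(values)

# The two orders give different doubles, although they agree to
# within normal rounding tolerance.
print(forward == backward)
print(math.isclose(forward, backward))
```

Comparing results with a tolerance, as math.isclose does here, is usually more robust than expecting bitwise-identical output across builds.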

Re: [OMPI users] Rounding errors and MPI

2017-01-18 Thread Jason Maldonis
Hi Oscar, I have similar issues that I was never able to fully track down in my code, but I think you just identified the real problem. If you figure out the correct options, could you please let me know here? Using the compiler optimizations is important for our code, but if we can solve this is