[OMPI users] Invalid read of size 4 (Valgrind error) with OpenMPI 1.8.7

2015-07-23 Thread Schlottke-Lakemper, Michael
Hi folks, recently we’ve been getting a Valgrind error in PMPI_Init for our suite of regression tests: ==5922== Invalid read of size 4 ==5922==    at 0x61CC5C0: opal_os_dirpath_create (in /aia/opt/openmpi-1.8.7/lib64/libopen-pal.so.6.2.2) ==5922==    by 0x5F207E5: orte_session_dir (in /aia/opt

[OMPI users] File coherence issues with OpenMPI/torque/NFS (?)

2015-07-23 Thread Schlottke-Lakemper, Michael
Hi folks, We are currently encountering a weird file coherence issue when running parallel jobs with OpenMPI (1.8.7) and writing files in parallel to an NFS-mounted file system using Parallel netCDF 1.6.1 (which internally uses MPI-I/O). Sometimes (~30-40% of our samples) we get a file whose co

Re: [OMPI users] File coherence issues with OpenMPI/torque/NFS (?)

2015-07-23 Thread Schlottke-Lakemper, Michael
Hi Gilles, > are you running 1.8.7 or master ? 1.8.7. We recently upgraded our cluster installation from OpenSUSE 11.3/OpenMPI 1.6.5 to OpenSUSE 12.3/OpenMPI 1.8.7. Before the upgrade, we did not encounter this problem. > if not default, which io module are you running ? > (default is ROMIO wit

Re: [OMPI users] File coherence issues with OpenMPI/torque/NFS (?)

2015-07-23 Thread Schlottke-Lakemper, Michael
Hi Dave, > That's probably not a good idea. Have you read about NFS in the romio > README? It's old, but as far as I know, it's still relevant for NFS3. > Maybe Rob Latham will see this and be able to comment on NFS4. No, are you referring to the file openmpi-1.8.7/ompi/mca/io/romio/romio/README

Re: [OMPI users] File coherence issues with OpenMPI/torque/NFS (?)

2015-07-23 Thread Schlottke-Lakemper, Michael
gilles.gouaillar...@gmail.com wrote: Michael, ROMIO is the default in the 1.8 series you can run ompi_info --all | grep io | grep priority ROMIO priority should be 20 and ompio priority should be 10. Cheers, Gilles On Thursday, July 23, 2015, Schlottke-Lakemper, Michael

Re: [OMPI users] Invalid read of size 4 (Valgrind error) with OpenMPI 1.8.7

2015-07-28 Thread Schlottke-Lakemper, Michael
de the space at all times. I’d suppress it On Jul 23, 2015, at 12:47 AM, Schlottke-Lakemper, Michael (m.schlottke-lakem...@aia.rwth-aachen.de) wrote: Hi folks, recently we’ve been getting a Valgrind error in PMPI_Init for our suite of regression tests: ==5922== Invalid read of s

Re: [OMPI users] Invalid read of size 4 (Valgrind error) with OpenMPI 1.8.7

2015-07-29 Thread Schlottke-Lakemper, Michael
and could not find where such a thing can happen Thanks, Gilles On Wednesday, July 29, 2015, Thomas Jahns (ja...@dkrz.de) wrote: Hello, On 07/28/15 17:34, Schlottke-Lakemper, Michael wrote: That’s what I suspected. Thank you for your confirmation. you are mistaken, the allocation

[OMPI users] Oversubscription disabled by default in OpenMPI 1.8.7

2015-08-14 Thread Schlottke-Lakemper, Michael
Hi folks, It seems like oversubscription is disabled by default in OpenMPI 1.8.7, at least when running on a PBS/torque-managed node. When I run a program in parallel on a node with 8 cores, I am not able to use more than 8 ranks: > mic@aia272:~> mpirun --display-allocation -n 9 hostname > > =

Re: [OMPI users] Oversubscription disabled by default in OpenMPI 1.8.7

2015-08-14 Thread Schlottke-Lakemper, Michael
ins so that we don't oversubscribe allocations given to us by resource managers unless specifically told to do so. On Fri, Aug 14, 2015 at 12:52 AM, Schlottke-Lakemper, Michael (m.schlottke-lakem...@aia.rwth-aachen.de) wrote: Hi folks, It seems like oversubscription is disab

Re: [OMPI users] Invalid read of size 4 (Valgrind error) with OpenMPI 1.8.7

2015-09-28 Thread Schlottke-Lakemper, Michael
ooking >> at the code, it sure seems impossible, but maybe there is some strange >> path that would break it. >> >> On Jul 29, 2015, at 6:29 AM, Schlottke-Lakemper, Michael >> wrote: >> If it is helpful, I can try to compile OpenMPI with debug i