Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Gilles Gouaillardet
Ralph, On 2014/10/28 0:46, Ralph Castain wrote: > Actually, I propose to also remove that issue. Simple enough to use a hash_table_32 to handle the jobids, and let that point to a hash_table_32 of vpids. Since we rarely have more than one jobid anyway, the memory overhead actually decreases
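For context, a minimal C sketch of the two-level lookup Ralph describes: an outer table keyed by the 32-bit jobid whose value is an inner table keyed by the 32-bit vpid. Plain arrays with linear search stand in for real hash tables here; this only illustrates the shape of the lookup, not the actual opal_hash_table_* API.

#include <stdint.h>
#include <stdio.h>

typedef struct { uint32_t vpid; const char *proc_name; } vpid_entry_t;
typedef struct { uint32_t jobid; vpid_entry_t procs[4]; int nprocs; } job_entry_t;

/* Outer lookup by jobid, then inner lookup by vpid. */
static const char *find_proc(job_entry_t *jobs, int njobs,
                             uint32_t jobid, uint32_t vpid)
{
    for (int j = 0; j < njobs; j++) {
        if (jobs[j].jobid != jobid) continue;
        for (int v = 0; v < jobs[j].nprocs; v++)
            if (jobs[j].procs[v].vpid == vpid)
                return jobs[j].procs[v].proc_name;
    }
    return NULL;
}

int main(void)
{
    /* Typically only one jobid exists, so the outer table stays tiny. */
    job_entry_t jobs[1] = {
        { .jobid = 1, .procs = { { 0, "rank 0" }, { 1, "rank 1" } }, .nprocs = 2 }
    };
    printf("%s\n", find_proc(jobs, 1, 1, 1)); /* prints "rank 1" */
    return 0;
}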

Re: [OMPI users] OpenMPI 1.8.3 configure fails, Mac OS X 10.9.5, Intel Compilers

2014-10-27 Thread Ralph Castain
FWIW: I just tested with the Intel 15 compilers on Mac 10.10 and it works fine, so apparently the problem has been fixed. You should be able to upgrade to the 15 versions, so that might be the best solution. > On Oct 27, 2014, at 11:06 AM, Bosler, Peter Andrew wrote:

Re: [OMPI users] Problem with Yosemite

2014-10-27 Thread Guillaume Houzeaux
On 24/10/14 18:09, Ralph Castain wrote: I was able to build and run the trunk without problem on Yosemite with: gcc (MacPorts gcc49 4.9.1_0) 4.9.1, GNU Fortran (MacPorts gcc49 4.9.1_0) 4.9.1. Will test the 1.8 branch now, though I believe the Fortran support in 1.8 is up to date. Dear all, I

[OMPI users] OpenMPI 1.8.3 configure fails, Mac OS X 10.9.5, Intel Compilers

2014-10-27 Thread Bosler, Peter Andrew
Good morning, I'm trying to build OpenMPI with the Intel 14.01 compilers, using the following configure line: ./configure --prefix=/opt/openmpi-1.8.3/intel-14.01 CC=icc CXX=icpc FC=ifort on a 6-core 3.5 GHz Intel Xeon E5 Mac Pro running Mac OS X 10.9.5. Configure outputs a pthread error,

Re: [OMPI users] HAMSTER MPI+Yarn

2014-10-27 Thread Ralph Castain
FWIW: the “better” solution is to move Hadoop to an HPC-like RM such as Slurm. We did this at Pivotal as well as at Intel, but in both cases business moves at the very end of the project (Greenplum becoming Pivotal, and Intel moving its Hadoop work into Cloudera) blocked its release.

Re: [OMPI users] HAMSTER MPI+Yarn

2014-10-27 Thread Brock Palen
Thanks, this is good feedback. I was worried that, with the dynamic nature of YARN containers, it would be hard to coordinate wire-up, and you have confirmed that. Thanks Brock Palen www.umich.edu/~brockp CAEN Advanced Computing XSEDE Campus Champion bro...@umich.edu (734)936-1985 > On Oct 27,

Re: [OMPI users] MPI_Init seems to hang, but works after a, minute or two

2014-10-27 Thread maxinator333
Hello, After compiling and running an MPI program, it seems to hang at MPI_Init(), but it eventually works after a minute or two. While the problem occurred on my notebook, it did not on my desktop PC. It can be a timeout on a network interface. I see a similar issue with wireless ON but

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Ralph Castain
> On Oct 26, 2014, at 11:12 PM, Gilles Gouaillardet wrote: > Ralph, > this is also a solution. > the pro is it seems more lightweight than PR #249 > the two cons I can see are: > - opal_process_name_t alignment goes from 64 to 32 bits > - some functions

Re: [OMPI users] HAMSTER MPI+Yarn

2014-10-27 Thread Ralph Castain
> On Oct 26, 2014, at 9:56 PM, Brock Palen wrote: > We are starting to look at supporting MPI on our Hadoop/Spark YARN based cluster. You poor soul… > I found a bunch of references to Hamster, but what I don't find is if it was ever merged into regular OpenMPI, and

Re: [OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Nathan Hjelm
On Mon, Oct 27, 2014 at 02:15:45PM +, michael.rach...@dlr.de wrote: > Dear Gilles, > > This is the system response on the login node of cluster5: > > cluster5:~/dat> mpirun -np 1 df -h > Filesystem Size Used Avail Use% Mounted on > /dev/sda31 228G 5.6G 211G 3% / > udev

Re: [OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Michael.Rachner
Dear Gilles, This is the system response on the login node of cluster5: cluster5:~/dat> mpirun -np 1 df -h Filesystem Size Used Avail Use% Mounted on /dev/sda31 228G 5.6G 211G 3% / udev 32G 232K 32G 1% /dev tmpfs 32G 0 32G 0% /dev/shm

Re: [OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Michael.Rachner
-----Original Message----- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles Gouaillardet Sent: Monday, 27 October 2014 14:49 To: Open MPI Users Subject: Re: [OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation

Re: [OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Gilles Gouaillardet
Michael, The available space must be greater than the requested size + 5%. From the logs, the error message makes sense to me: there is not enough space in /tmp. Since the compute nodes have a lot of memory, you might want to try using /dev/shm instead of /tmp for the backing files. Cheers,

Re: [OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Gilles Gouaillardet
Michael, Could you please run 'mpirun -np 1 df -h' and 'mpirun -np 1 df -hi' on both compute and login nodes? Thanks Gilles michael.rach...@dlr.de wrote: >Dear developers of OPENMPI, >We have now installed and tested the bugfixed OPENMPI Nightly Tarball of >2014-10-24

[OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Michael.Rachner
Dear developers of OPENMPI, We have now installed and tested the bugfixed OPENMPI Nightly Tarball of 2014-10-24 (openmpi-dev-176-g9334abc.tar.gz) on Cluster5. As before (with the OPENMPI-1.8.3 release version), the small Ftn test program runs correctly on the login node. As before, the program
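For reference, a minimal C sketch of the MPI_WIN_ALLOCATE_SHARED pattern under discussion (the actual test program is Fortran; the per-rank size here is an arbitrary example). The shared segment is backed by a file on disk, which is why the free space in the temporary directory matters in this thread.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Group together the ranks that can actually share memory (same node). */
    MPI_Comm shmcomm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &shmcomm);

    /* Each rank contributes a slice of the node-local shared segment. */
    MPI_Aint local_bytes = 100 * 1024 * 1024; /* 100 MB per rank, just an example */
    double  *baseptr;
    MPI_Win  win;
    MPI_Win_allocate_shared(local_bytes, sizeof(double), MPI_INFO_NULL,
                            shmcomm, &baseptr, &win);

    int rank;
    MPI_Comm_rank(shmcomm, &rank);
    printf("rank %d: shared slice allocated at %p\n", rank, (void *)baseptr);

    MPI_Win_free(&win);
    MPI_Comm_free(&shmcomm);
    MPI_Finalize();
    return 0;
}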

Re: [OMPI users] OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-27 Thread Gilles Gouaillardet
Thanks Marco, I could reproduce the issue even with one node sending/receiving to itself. I will investigate this tomorrow. Cheers, Gilles Marco Atzeri wrote: >On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote: >> Hi, >> i tested on a RedHat 6 like linux server

Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Michael.Rachner
Dear Mr. Squyres, We will try to install your bug-fixed nightly tarball of 2014-10-24 on Cluster5 to see whether it works or not. The installation, however, will take some time. I will get back to you when I know more. Let me add the information that on the Laki each node has 16 GB of shared memory

Re: [OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-27 Thread Marco Atzeri
On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote: Hi, I tested on a RedHat 6-like Linux server and could not observe any memory leak. BTW, are you running 32- or 64-bit Cygwin? And what is your configure command line? Thanks, Gilles The problem is present in both versions. Cygwin

Re: [OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-27 Thread Gilles Gouaillardet
Hi, I tested on a RedHat 6-like Linux server and could not observe any memory leak. BTW, are you running 32- or 64-bit Cygwin? And what is your configure command line? Thanks, Gilles On 2014/10/27 18:26, Marco Atzeri wrote: > On 10/27/2014 8:30 AM, maxinator333 wrote: >> Hello, >> I

Re: [OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-27 Thread Marco Atzeri
On 10/27/2014 8:30 AM, maxinator333 wrote: Hello, I noticed this weird behavior because, after a certain time of more than one minute, the transfer rates of MPI_Send and MPI_Recv dropped by a factor of 100+. By chance I saw that my program allocated more and more memory. I have the following

Re: [OMPI users] MPI_Init seems to hang, but works after a minute or two

2014-10-27 Thread Marco Atzeri
On 10/27/2014 8:32 AM, maxinator333 wrote: Hello, After compiling and running an MPI program, it seems to hang at MPI_Init(), but it eventually works after a minute or two. While the problem occurred on my notebook, it did not on my desktop PC. It can be a timeout on a network interface. I

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Oscar Vega-Gisbert
Hi Takahiro, Gilles, Siegmar, Thank you very much for all your fixes. I didn't notice that 'mca_base_var_register' was being called before MPI_Init. I'm sorry for the inconvenience. Regards, Oscar On 27/10/14 07:16, Gilles Gouaillardet wrote: Kawashima-san, thanks a lot for the detailed

[OMPI users] MPI_Init seems to hang, but works after a minute or two

2014-10-27 Thread maxinator333
Hello, After compiling and running an MPI program, it seems to hang at MPI_Init(), but it eventually works after a minute or two. While the problem occurred on my notebook, it did not on my desktop PC. Both run on Win 7, Cygwin 64-bit, OpenMPI version 1.8.3 r32794 (ompi_info), g++ v

[OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-27 Thread maxinator333
Hello, I noticed this weird behavior because, after a certain time of more than one minute, the transfer rates of MPI_Send and MPI_Recv dropped by a factor of 100+. By chance I saw that my program allocated more and more memory. I have the following minimal working example: #include
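The minimal working example is truncated in this digest. The ping-pong pattern being described is roughly the following sketch (buffer size and iteration count are made-up placeholders, not the poster's original code); the reported symptom is that the round-trip rate collapses and memory use grows over time.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    enum { N = 1024, ITERS = 100000 };
    static char buf[N];

    double t0 = MPI_Wtime();
    for (int i = 0; i < ITERS; i++) {
        if (rank == 0) {
            /* Rank 0 sends, then waits for the echo. */
            MPI_Send(buf, N, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, N, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            /* Rank 1 echoes each message back. */
            MPI_Recv(buf, N, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, N, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("%d round trips in %.3f s\n", ITERS, t1 - t0);

    MPI_Finalize();
    return 0;
}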

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Siegmar Gross
Hi Gilles, Oscar, Ralph, Takahiro, thank you very much for all your help and time investigating my problems on Sparc systems. > thanks a lot for the detailed explanation. > FWIW, I was previously testing on Solaris 11, which behaves like Linux: > printf("%s", NULL) outputs '(null)' > vs a

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Gilles Gouaillardet
Kawashima-san, thanks a lot for the detailed explanation. FWIW, I was previously testing on Solaris 11, which behaves like Linux: printf("%s", NULL) outputs '(null)' vs a SIGSEGV on Solaris 10. I committed a16c1e44189366fbc8e967769e050f517a40f3f8 in order to fix this issue (I moved the call to
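The portability issue behind this fix: passing NULL to a %s conversion is undefined behavior; glibc (and apparently Solaris 11) happens to print "(null)", while Solaris 10 libc dereferences the pointer and segfaults. A tiny defensive illustration:

#include <stdio.h>

int main(void)
{
    const char *name = NULL; /* e.g. a value that was never set */

    /* printf("%s\n", name) would crash on Solaris 10; guard the argument. */
    printf("name = %s\n", name ? name : "(null)");
    return 0;
}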

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Gilles Gouaillardet
Ralph, this is also a solution. The pro is that it seems more lightweight than PR #249. The two cons I can see are: - opal_process_name_t alignment goes from 64 to 32 bits - some functions (opal_hash_table_*) take a uint64_t as argument, so we still need to use memcpy in order to * guarantee 64
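A short sketch of the memcpy point: packing a pair of 32-bit identifiers into the uint64_t that a hash-table-style function expects, without casting a possibly 32-bit-aligned struct pointer to a 64-bit pointer (which can fault on strict-alignment platforms such as SPARC). The struct below is only a stand-in, not the actual opal_process_name_t definition.

#include <inttypes.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} process_name_t; /* illustrative stand-in for opal_process_name_t */

/* Copy the 8 bytes of the name into a properly aligned uint64_t key. */
static uint64_t name_to_key(const process_name_t *name)
{
    uint64_t key;
    memcpy(&key, name, sizeof(key));
    return key;
}

int main(void)
{
    process_name_t name = { .jobid = 1, .vpid = 3 };
    printf("key = 0x%016" PRIx64 "\n", name_to_key(&name));
    return 0;
}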

[OMPI users] Java FAQ Page out of date

2014-10-27 Thread Brock Palen
I think a lot of the information on this page: http://www.open-mpi.org/faq/?category=java is out of date with the 1.8 release. Brock Palen www.umich.edu/~brockp CAEN Advanced Computing XSEDE Campus Champion bro...@umich.edu (734)936-1985

[OMPI users] HAMSTER MPI+Yarn

2014-10-27 Thread Brock Palen
We are starting to look at supporting MPI on our Hadoop/Spark YARN-based cluster. I found a bunch of references to Hamster, but what I can't find is whether it was ever merged into regular OpenMPI, and if so, is it just another RM integration? Or does it need more setup? I found this: