Re: [OMPI devel] Master hangs in opal_fifo test

2015-02-03 Thread Nathan Hjelm
That's the second report involving icc 14. I will dig into this later this week. -Nathan On Mon, Feb 02, 2015 at 11:03:41PM -0800, Paul Hargrove wrote: >I have seen opal_fifo hang on 2 distinct systems > + Linux/ppc32 with xlc-11.1 > + Linux/x86-64 with icc-14.0.1.106 >I have no e

Re: [OMPI devel] HELP in OpenMPI - for PH.D research

2015-02-03 Thread Cyrille DIBAMOU MBEUYO
Thank you, I'll look for it. Best regards. 2015-02-03 14:57 UTC+01:00, Jeff Squyres (jsquyres) : > Have you looked at the "self" CR module? > >> On Feb 3, 2015, at 8:46 AM, Cyrille DIBAMOU MBEUYO >> wrote: >> >> 2015-02-02 22:08 UTC+01:00, Jeff Squyres (jsquyres) : >>> On Jan 25, 2015, at 1:06 P

Re: [OMPI devel] open mpi

2015-02-03 Thread Ralph Castain
Let's see if I understand you correctly. You are running "mpirun" on the master node, with your applications running on other nodes in the cluster. In that situation, mpirun is using TCP sockets to communicate with the OMPI daemons on the remote nodes, and you would like to know which Ethernet inte

Re: [OMPI devel] HELP in OpenMPI - for PH.D research

2015-02-03 Thread Jeff Squyres (jsquyres)
Have you looked at the "self" CR module? > On Feb 3, 2015, at 8:46 AM, Cyrille DIBAMOU MBEUYO wrote: > > 2015-02-02 22:08 UTC+01:00, Jeff Squyres (jsquyres) : >> On Jan 25, 2015, at 1:06 PM, Cyrille DIBAMOU MBEUYO >> wrote: >>> >>> Good afternoon development team, >>> >>> I have a small probl

Re: [OMPI devel] HELP in OpenMPI - for PH.D research

2015-02-03 Thread Cyrille DIBAMOU MBEUYO
2015-02-02 22:08 UTC+01:00, Jeff Squyres (jsquyres) : > On Jan 25, 2015, at 1:06 PM, Cyrille DIBAMOU MBEUYO > wrote: >> >> Good afternoon development team, >> >> I have a small problem in OpenMPI to achieve my Ph.D research >> >> My problem is that : >> >> while saving the context.PID of a process

Re: [OMPI devel] Master hangs in opal_LIFO test

2015-02-03 Thread Gilles Gouaillardet
Paul, George and I were able to reproduce this issue with icc 14.0 but not with icc 14.3 and later. I am trying to see how the difference/bug could be automatically handled. Cheers, Gilles On 2015/02/03 16:18, Paul Hargrove wrote: > CORRECTION: > > It is the opal_lifo (not fifo) test which hung

Re: [OMPI devel] Master hangs in opal_LIFO test

2015-02-03 Thread Adrian Reber
There is right now another bug report concerning opal_lifo and ppc64 here: https://github.com/open-mpi/ompi/issues/371 and there were hangs on ppc64 a few weeks ago in opal_lifo which Nathan fixed with additional barriers. On Mon, Feb 02, 2015 at 11:18:43PM -0800, Paul Hargrove wrote: > CORRECTI

Re: [OMPI devel] Master hangs in opal_LIFO test

2015-02-03 Thread Paul Hargrove
CORRECTION: It is the opal_lifo (not fifo) test which hung on both systems. -Paul On Mon, Feb 2, 2015 at 11:03 PM, Paul Hargrove wrote: > I have seen opal_fifo hang on 2 distinct systems > + Linux/ppc32 with xlc-11.1 > + Linux/x86-64 with icc-14.0.1.106 > > I have no explanation to offer for

[OMPI devel] Master build broken libfabrics + PGI

2015-02-03 Thread Paul Hargrove
On a Linux/x86_64 system with PGI-14.3 I have configured a current master tarball with the following: --prefix=... --enable-debug CC=pgcc CXX=pgCC FC=pgfortran I see "make V=1" fail as shown below. This does NOT occur with GNU or Intel compilers on the same system. Initial guess is mis-ordered

[OMPI devel] Master hangs in opal_fifo test

2015-02-03 Thread Paul Hargrove
I have seen opal_fifo hang on 2 distinct systems + Linux/ppc32 with xlc-11.1 + Linux/x86-64 with icc-14.0.1.106 I have no explanation to offer for either hang. No "weird" configure options were passed to either. -Paul -- Paul H. Hargrove phhargr...@lbl.gov Computer La

[OMPI devel] failed to open libltdl.so

2015-02-03 Thread Paul Hargrove
I found another failure mode for non-embedded libltdl. On a system with libltdl.so on the login node but NOT the compute nodes I encountered the following, once per rank, at job launch: /home/phhargrove/OMPI/openmpi-libltdl-linux-x86_64 psm/INST/bin/orted: error while loading shared libraries: li

[OMPI devel] open mpi

2015-02-03 Thread khushi popat
Hello, can anyone tell me how to find out which interface of the master node is being used while I am running an Open MPI program on the cluster from the master node? Thanking you, khushi

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-03 Thread Paul Hargrove
On Mon, Feb 2, 2015 at 9:26 PM, Paul Hargrove wrote: > I am now going to see about a PGI compiler on a system at another center > (or two?) in order to see how universal the problem is. That was a dead-end. Of the many non-NERSC non-Cray institutions where I have accounts, I could only find on

Re: [OMPI devel] RFC: Remove embedded libltdl

2015-02-03 Thread Paul Hargrove
On Mon, Feb 2, 2015 at 5:47 PM, Paul Hargrove wrote: > I'll report my test results more completely later, but all 4 PGI-based > builds I have results for so far have failed with libtool replacing > "-lltdl" in link command line with "/usr/lib/libltdl.so" rather than the > correct "/usr/lib64/lib

Re: [OMPI devel] Master build failure on Mac OS 10.8 with --enable-static/--disable-shared

2015-02-03 Thread Ralph Castain
Scratching my head over this one - I can replicate it, but need to think a bit on how to solve it. On Mon, Feb 2, 2015 at 7:08 PM, Paul Hargrove wrote: > I have a Mac OSX 10.8 system, where cc is clang. > I have no problems with a default build from the current master tarball. > However, a stat