[OMPI users] help: seg fault when freeing communicator

2009-04-13 Thread Graham Mark



This has me stumped. My code seg faults (sometimes) while
it's attempting to free a communicator--at least, that's what the
stack trace indicates, and that's what Totalview also shows.

This happens when I run the program with 27 processes. If I run with 8,
the program finishes without error. (The program requires that the number of
processes be a perfect cube.) It happens on two different machines.
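
As an aside on the perfect-cube requirement, here is a minimal sketch of how a launcher script might validate the process count before starting a run. The function name `is_perfect_cube` is hypothetical and is not taken from the program discussed in this message.

```python
def is_perfect_cube(n):
    """Return True if n is a positive perfect cube (1, 8, 27, 64, ...)."""
    if n < 1:
        return False
    # Floating-point cube roots can be off by one, so test the neighbours too.
    root = round(n ** (1.0 / 3.0))
    return any((root + d) ** 3 == n for d in (-1, 0, 1))


if __name__ == "__main__":
    # 8 and 27 are the process counts mentioned above; 30 is a counterexample.
    for nprocs in (8, 27, 30):
        print(nprocs, is_perfect_cube(nprocs))
```

The check uses integer cubes for the final comparison, so it does not depend on the exact rounding of the floating-point cube root.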

The program reads input files and creates a 1-D circular MPI topology
in order to pass input data round robin to all processes. When that is
done, each process does some computation and writes out a file. Then
the program finishes. The seg fault occurs when the communicator
associated with the topology is supposedly being freed as the program
ends.
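
The round-robin pattern described above can be illustrated without MPI. The sketch below only simulates the data movement on a periodic 1-D ring; it is not the program's code (which is not shown in this message), and the names `ring_neighbors` and `circulate` are hypothetical. In an MPI program the neighbours would typically come from a periodic Cartesian communicator (MPI_Cart_create and MPI_Cart_shift), which is the kind of communicator that is later released with MPI_Comm_free.

```python
def ring_neighbors(rank, size):
    """Return (source, dest) for a shift of +1 on a periodic 1-D ring."""
    return (rank - 1) % size, (rank + 1) % size


def circulate(inputs):
    """Simulate size-1 shift steps; return the items each rank has seen."""
    size = len(inputs)
    held = list(inputs)                 # item currently held by each rank
    seen = [[item] for item in inputs]  # everything each rank has received
    for _ in range(size - 1):
        # Every rank receives from its left neighbour in the same step.
        held = [held[ring_neighbors(r, size)[0]] for r in range(size)]
        for r in range(size):
            seen[r].append(held[r])
    return seen


if __name__ == "__main__":
    # 27 ranks, matching the failing run; the input names are placeholders.
    result = circulate(["input%d" % r for r in range(27)])
    print(len(result[0]), "items seen by rank 0")
```

After size-1 shifts every rank has seen every input exactly once, which is the property the round-robin distribution relies on.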

The openmpi help web page lists information that should be included in
a help request. I'm attaching all of that that I could find: my
command to run the program, the stack trace, the outputs of
'ompi_info', 'limit', 'ibv_devinfo', 'ifconfig', 'uname' and values of my
PATH and LD_LIBRARY_PATH.

Thanks for your help.

Graham Mark








==

Graham Mark
CCS-3
Information Sciences
Los Alamos National Laboratory
505-667-8147





Re: [OMPI users] shared libraries issue compiling 1.3.1/intel10.1.022

2009-04-13 Thread Francesco Pietra
I knew that but have considered it again. I wonder whether the info at
the end of this mail suggests how to operate from the viewpoint of
openmpi in compiling a code.

In trying to compile openmpi-1.3.1 on debian amd64 lenny, the intel
10.1.022 compilers do not see their library libimf.so, which is on the
unix path as required by your reference. A mixed compilation gcc g++ ifort
only succeeded with a Tyan S2895, not with four-socket Supermicro boards,
which are what I need.

The problem was solved with gcc g++ gfortran. The openmpi-1.3.1
examples run correctly and Amber10 sander.MPI could be built plainly.

What remains unsolved - along similar lines - is the compilation of
Amber9 sander.MPI, which I need. Installing bison satisfied the
yacc requirement, and serial compilation passed.

The info alluded to above is:

"make clean" after serial compilation, ended with (between ==):
===
Making `clean' in directory /usr/local/amber9/src/netcdf/src/cxx

make[3]: Entering directory `/usr/local/amber9/src/netcdf/src/cxx'
rm -f *.o *.a *.so *.sl *.i *.Z core nctst test.out example.nc *.cps
*.dvi *.fns *.log *~ *.gs *.aux *.cp *.fn *.ky *.pg *.toc *.tp *.vr
make[3]: Leaving directory `/usr/local/amber9/src/netcdf/src/cxx'

Returning to directory /usr/local/amber9/src/netcdf/src

make[2]: Leaving directory `/usr/local/amber9/src/netcdf/src'
rm -f *.o *.a *.so *.sl *.i *.Z core
make[1]: Leaving directory `/usr/local/amber9/src/netcdf/src'
cd netcdf/lib && rm -f libnetcdf.a
/bin/sh: line 0: cd: netcdf/lib: No such file or directory
make: [clean] Error 1 (ignored)
cd netcdf/include && rm -f *.mod
/bin/sh: line 0: cd: netcdf/include: No such file or directory
make: [clean] Error 1 (ignored)


./configure -openmpi gfortran
gave no error.

"make parallel" returned, in full (between the xxx lines):
xxx
Starting installation of Amber9 (parallel) at Mon Apr 13 17:36:19 CEST 2009.
cd sander; make parallel
make[1]: Entering directory `/usr/local/amber9/src/sander'
./checkparconf
cpp -traditional -I/usr/local/include -P -DMPI -xassembler-with-cpp
-Dsecond=ambsecond  evb_vars.f > _evb_vars.f
gfortran -c -O0 -fno-second-underscore -march=nocona  -ffree-form  -o
evb_vars.o _evb_vars.f
cpp -traditional -I/usr/local/include -P -DMPI -xassembler-with-cpp
-Dsecond=ambsecond  evb_input.f > _evb_input.f
gfortran -c -O0 -fno-second-underscore -march=nocona  -ffree-form  -o
evb_input.o _evb_input.f
cpp -traditional -I/usr/local/include -P -DMPI -xassembler-with-cpp
-Dsecond=ambsecond  evb_init.f > _evb_init.f
gfortran -c -O0 -fno-second-underscore -march=nocona  -ffree-form  -o
evb_init.o _evb_init.f
Error: Can't open included file 'mpif-common.h'
_evb_init.f:372.67:

 call mpi_bcast ( xdat_dia(n)% filename, 512, MPI_CHARACTER, 0, commwor
  1
Error: Symbol 'mpi_character' at (1) has no IMPLICIT type
_evb_init.f:367.68:

 call mpi_bcast ( xdat_dia(n)% q, ndim, MPI_DOUBLE_PRECISION, 0, commwo
   1
Error: Symbol 'mpi_double_precision' at (1) has no IMPLICIT type
_evb_init.f:327.40:

   call mpi_bcast ( ndim, 1, MPI_INTEGER, 0, commworld, ierr )
   1
Error: Symbol 'mpi_integer' at (1) has no IMPLICIT type
make[1]: *** [evb_init.o] Error 1
make[1]: Leaving directory `/usr/local/amber9/src/sander'
make: *** [parallel] Error 2
xx
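
One thing visible in the log above is that the cpp commands carry -I/usr/local/include while the gfortran commands carry no -I option, so one possibility (an assumption, not a confirmed diagnosis) is that 'mpif-common.h' is simply not found on the search path gfortran uses. A minimal sketch for checking which directories actually contain the header follows; the helper name `find_header` is hypothetical, and the directory list is only the one taken from the cpp command line.

```python
import os


def find_header(name, include_dirs):
    """Return the first path where `name` exists in include_dirs, else None."""
    for directory in include_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # /usr/local/include is the -I directory from the cpp command in the log.
    for header in ("mpif.h", "mpif-common.h"):
        print(header, "->", find_header(header, ["/usr/local/include"]))
```

If the header turns up in a directory that is not passed to the Fortran compiler, adding that directory to the compile flags would be the natural next thing to try.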

I can't apply to the amber site because they have declined interest in
adapting Amber9 to present software. Unfortunately I don't have two
sufficiently powerful computers for present and vintage status.

Thanks a lot for considering my mail

francesco pietra





On Fri, Apr 10, 2009 at 6:24 PM, Jeff Squyres wrote:
> See this FAQ entry:
>
>    http://www.open-mpi.org/faq/?category=running#intel-compilers-static
>
>
>
> On Apr 10, 2009, at 12:16 PM, Francesco Pietra wrote:
>
>> Hi Gus:
>>
>> If you feel that the observations below are not relevant to openmpi,
>> please disregard the message. You have already kindly devoted so much
>> time to my problems.
>>
>> The "limits.h" issue is solved with 10.1.022 intel compilers: as I
>> felt, the problem was with the pre-10.1.021 version of the intel C++
>> and ifort compilers, a subtle bug observed also by gentoo people (web
>> intel). There remains an orted issue.
>>
>> The openmpi 1.3.1 installation was able to compile connectivity_c.c
>> and hello_c.c, though, running mpirun (output below between ===):
>>
>> =
>> /usr/local/bin/mpirun -host -n 4 connectivity_c 2>&1 | tee
>> connectivity.out
>> /usr/local/bin/orted: error while loading shared libraries: libimf.so:
>> cannot open shared object file: No such file or directory
>> --
>> A daemon (pid 8472) died unexpectedly with status 127 while attempting
>> to launch so we are aborting.
>>
>> There may be more information reported by the e

Re: [OMPI users] Problem with running openMPI program

2009-04-13 Thread Gus Correa

Hi Ankush

To test if OpenMPI works, compile and run the examples (hello_c, etc)
in the examples/ directory (in the directory where you decompressed
the OpenMPI tarball, not where you installed OpenMPI).
Compile them with mpicc, etc, and run them with mpiexec,
all from OpenMPI.
Using full path names helps avoid confusion with other
MPI flavors.

One MPI benchmark is available free from Intel:

http://www.intel.com/cd/software/products/asmo-na/eng/219848.htm

There may be others though.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-


Ankush Kaul wrote:
Can you please suggest some simple benchmarking software? Is any GUI
benchmarking software available?


On Tue, Apr 7, 2009 at 2:29 PM, Ankush Kaul wrote:


Thank you sir, thanks a lot.
 
The information you provided helped us a lot. Am currently going
through the OpenMPI FAQ and will contact you in case of any doubts.
 
Regards,

Ankush Kaul





___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Problem with running openMPI program

2009-04-13 Thread Gus Correa

Ankush Kaul wrote:


I am able to run the program on the server node, but on the compute node
the program only runs in the directory on which /work is mounted
(/work on the server contains the Pi program).


Also, while running Pi, top shows the process running only on the server,
not on the compute node.



Hi Ankush, list

I am not sure I understand your machine setup,
but maybe it is a "server" machine and a "compute node"
somehow connected through a network
(or directly by an Ethernet cable), right?

If that is the case, yes you will be able to launch a program with
mpirun on the server machine, but it will only run on the compute node
if the work directory is mounted by the compute node.
This is the preferred way to run MPI programs.

If you want to run on a directory that is not exported to and mounted on
the compute node, you have to copy over all files (executable, input 
files, etc) to that directory on the compute node.

This is not as comfortable a way to run MPI programs as the alternative
above.

Moreover, you need to tell mpiexec where you want the processes to run.
There are two basic ways to do this.
You can specify the nodes on the command line with the -host option,
or you can specify them in a file with the -hostfile option.
Do "mpiexec --help" to learn the details.
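
To make the -hostfile route concrete, here is a small sketch that builds the text of a hostfile and the matching mpiexec argument list. The host names "server" and "compute1", the slot counts, the file name "my_hosts" and the program name "./pi" are all assumptions made up for illustration; the "hostname slots=N" line format is the one Open MPI hostfiles use. Check "mpiexec --help" on your installation before relying on the exact option spelling.

```python
def hostfile_text(slots_by_host):
    """Format one 'hostname slots=N' line per host, in insertion order."""
    return "".join(
        "%s slots=%d\n" % (host, slots) for host, slots in slots_by_host.items()
    )


def mpiexec_command(program, nprocs, hostfile):
    """Build the argument list for launching `program` on the listed hosts."""
    return ["mpiexec", "-hostfile", hostfile, "-np", str(nprocs), program]


if __name__ == "__main__":
    # Hypothetical two-machine setup: one server and one compute node.
    print(hostfile_text({"server": 2, "compute1": 2}), end="")
    print(" ".join(mpiexec_command("./pi", 4, "my_hosts")))
```

With processes assigned to both hosts, top on the compute node should show the program running there as well, provided the executable is reachable at the same path on that node.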

I hope this helps.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-


On Sat, Apr 11, 2009 at 1:34 PM, Ankush Kaul wrote:


Can you please suggest some simple benchmarking software? Is any GUI
benchmarking software available?


On Tue, Apr 7, 2009 at 2:29 PM, Ankush Kaul wrote:

Thank you sir, thanks a lot.
 
The information you provided helped us a lot. Am currently going
through the OpenMPI FAQ and will contact you in case of any doubts.
 
Regards,

Ankush Kaul










[OMPI users] PGI Fortran pthread support

2009-04-13 Thread Orion Poplawski
Seeing the following building openmpi 1.3.1 on CentOS 5.3 with PGI pgf90 
8.0-5 fortran compiler:


checking if C compiler and POSIX threads work with -Kthread... no
checking if C compiler and POSIX threads work with -kthread... no
checking if C compiler and POSIX threads work with -pthread... yes
checking if C++ compiler and POSIX threads work with -Kthread... no
checking if C++ compiler and POSIX threads work with -kthread... no
checking if C++ compiler and POSIX threads work with -pthread... yes
checking if F77 compiler and POSIX threads work with -Kthread... no
checking if F77 compiler and POSIX threads work with -kthread... no
checking if F77 compiler and POSIX threads work with -pthread... no
checking if F77 compiler and POSIX threads work with -pthreads... no
checking if F77 compiler and POSIX threads work with -mt... no
checking if F77 compiler and POSIX threads work with -mthreads... no
checking if F77 compiler and POSIX threads work with -lpthreads... no
checking if F77 compiler and POSIX threads work with -llthread... no
checking if F77 compiler and POSIX threads work with -lpthread... no
checking for PTHREAD_MUTEX_ERRORCHECK_NP... yes
checking for PTHREAD_MUTEX_ERRORCHECK... yes
checking for working POSIX threads package... no
checking if C compiler and Solaris threads work... no
checking if C++ compiler and Solaris threads work... no
checking if F77 compiler and Solaris threads work... no
checking for working Solaris threads package... no
checking for type of thread support... none found



Open MPI was unable to find threading support on your system.  The
OMPI development team is considering requiring threading support for
proper OMPI execution.  This is in part because we are not aware of
any OpenFabrics users that do not have thread support -- so we need
you to e-mail the Open MPI Users mailing list to tell us if this is a
problem for you.




Is there any way to get the PGI Fortran compiler to support threads for 
openmpi?
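
For readers skimming long configure output like the above, a minimal sketch that summarizes which thread flags each compiler accepted may help. It only parses lines of the exact form shown in this message; the function name `working_thread_flags` is hypothetical and is not part of Open MPI or its configure script.

```python
import re

# Matches lines such as:
#   checking if F77 compiler and POSIX threads work with -pthread... no
PATTERN = re.compile(
    r"checking if (\S+) compiler and POSIX threads work with (\S+)\.\.\. (yes|no)"
)


def working_thread_flags(log_text):
    """Map each compiler name to the list of flags configure reported as 'yes'."""
    result = {}
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            compiler, flag, answer = match.groups()
            result.setdefault(compiler, [])
            if answer == "yes":
                result[compiler].append(flag)
    return result


if __name__ == "__main__":
    sample = (
        "checking if C compiler and POSIX threads work with -pthread... yes\n"
        "checking if F77 compiler and POSIX threads work with -pthread... no\n"
    )
    print(working_thread_flags(sample))
```

Applied to the log above, a compiler that maps to an empty list is one for which configure found no working flag, which here would be the F77 compiler only.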


--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA/CoRA Division  FAX: 303-415-9702
3380 Mitchell Lane  or...@cora.nwra.com
Boulder, CO 80301  http://www.cora.nwra.com


Re: [OMPI users] PGI Fortran pthread support

2009-04-13 Thread Jeff Squyres

Please send all the information listed here:

http://www.open-mpi.org/community/help/

Thanks.


On Apr 13, 2009, at 6:48 PM, Orion Poplawski wrote:

Seeing the following building openmpi 1.3.1 on CentOS 5.3 with PGI pgf90
8.0-5 fortran compiler:

checking if C compiler and POSIX threads work with -Kthread... no
checking if C compiler and POSIX threads work with -kthread... no
checking if C compiler and POSIX threads work with -pthread... yes
checking if C++ compiler and POSIX threads work with -Kthread... no
checking if C++ compiler and POSIX threads work with -kthread... no
checking if C++ compiler and POSIX threads work with -pthread... yes
checking if F77 compiler and POSIX threads work with -Kthread... no
checking if F77 compiler and POSIX threads work with -kthread... no
checking if F77 compiler and POSIX threads work with -pthread... no
checking if F77 compiler and POSIX threads work with -pthreads... no
checking if F77 compiler and POSIX threads work with -mt... no
checking if F77 compiler and POSIX threads work with -mthreads... no
checking if F77 compiler and POSIX threads work with -lpthreads... no
checking if F77 compiler and POSIX threads work with -llthread... no
checking if F77 compiler and POSIX threads work with -lpthread... no
checking for PTHREAD_MUTEX_ERRORCHECK_NP... yes
checking for PTHREAD_MUTEX_ERRORCHECK... yes
checking for working POSIX threads package... no
checking if C compiler and Solaris threads work... no
checking if C++ compiler and Solaris threads work... no
checking if F77 compiler and Solaris threads work... no
checking for working Solaris threads package... no
checking for type of thread support... none found



Open MPI was unable to find threading support on your system.  The
OMPI development team is considering requiring threading support for
proper OMPI execution.  This is in part because we are not aware of
any OpenFabrics users that do not have thread support -- so we need
you to e-mail the Open MPI Users mailing list to tell us if this is a
problem for you.




Is there any way to get the PGI Fortran compiler to support threads for
openmpi?

--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA/CoRA Division  FAX: 303-415-9702
3380 Mitchell Lane  or...@cora.nwra.com
Boulder, CO 80301  http://www.cora.nwra.com



--
Jeff Squyres
Cisco Systems