Re: [OMPI users] Problem installing Dalton with OpenMPI over PelicanHPC

2009-05-18 Thread Jeff Squyres

On May 15, 2009, at 1:23 AM, Silviu Groza wrote:


I have still not solved these errors.
I need help in order to install the Dalton quantum chemistry code with OpenMPI.
Thank you.

---> Linking sequential dalton.x ...
mpif77.openmpi -march=x86-64 -O3 -ffast-math -fexpensive-optimizations -funroll-loops -fno-range-check -fsecond-underscore \


I notice the "-fsecond-underscore" option here; do you know who is  
insertting this option?  If I had to guess, I'd say that that is  
forcing the Fortran linker to change its native name mangling scheme,  
and it therefore does not match the Fortran name mangling scheme that  
Open MPI was created with...?
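
One way to see what that flag does to symbol names (a sketch only -- it assumes
the wrapped compiler is gfortran, and "my_sub" is just a throwaway test routine;
neither comes from the original post):

    # compile a trivial Fortran routine with and without -fsecond-underscore
    printf '      subroutine my_sub()\n      end\n' > mangling_test.f
    gfortran -c mangling_test.f                     && nm mangling_test.o | grep -i my_sub
    gfortran -fsecond-underscore -c mangling_test.f && nm mangling_test.o | grep -i my_sub
    # if the two nm lines differ (e.g. my_sub_ vs my_sub__), the flag changes the
    # mangling, and it then has to match what Open MPI's libmpi_f77 was built with

If the mangled names do not match the Fortran symbols that Open MPI exports,
you would expect exactly this kind of link error.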




-o /root/Fig/dalton-2.0/bin/dalton.x abacus/dalton.o cc/crayio.o abacus/linux_mem_allo.o \
abacus/herpar.o eri/eri2par.o amfi/amfi.o amfi/symtra.o gp/mpi_dummy.o -Labacus -labacus -Lrsp -lrsp -Lsirius -lsirius -labacus \
-Leri -leri -Ldensfit -ldensfit -Lcc -lcc -Ldft -ldft -Lgp -lgp -Lpdpack -lpdpack -L/usr/lib -llapack -lblas

dft/libdft.a(general.o): In function `mpi_sync_data':


What happens if you copy/paste this entire "mpif77.openmpi" command  
line and add "--showme" to the end of it?  If you chase down the  
libmpi.so that is used in that command line and run nm on it, do you  
see ompi_mpi_comm_world (and friends) listed?
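
Concretely, the check might look like this (a sketch; the libmpi.so path is
only an example -- use whatever the --showme output actually points at):

    # append --showme to the exact failing link line to print the underlying
    # compiler/linker command without executing it
    mpif77.openmpi ... -o /root/Fig/dalton-2.0/bin/dalton.x ... --showme
    # then inspect the MPI library that link line pulls in
    nm -D /usr/lib/openmpi/lib/libmpi.so | grep ompi_mpi_comm_world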


I'm afraid that I'm not familiar with Dalton and PelicanHPC -- are all
of the support libraries listed above part of Dalton or PelicanHPC?
Were they all compiled with Open MPI?  More specifically: is it
possible that they were compiled with a different MPI implementation?


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] OpenMPI 1.3.2 with PathScale 3.2

2009-05-18 Thread Jeff Squyres

Hah; this is probably at least tangentially related to


http://www.open-mpi.org/faq/?category=building#pathscale-broken-with-mpi-c++-api

I'll be kind and say that Pathscale has been "unwilling to help on  
these kinds of issues" with me in the past as well.  :-)


It's not entirely clear from the text, but I guess that sounds like  
Pathscale is unsupported on GCC 3.x systems...?  Is that what you  
parse his answer to mean?





On May 18, 2009, at 5:09 PM, Joshua Bernstein wrote:


Well,

I spoke with Gautam Chakrabarti at Pathscale. It seems the long
and short of it is that using OpenMP with C++ under a GNU3.3 (RHEL4)
frontend creates some limitations inside of pathCC. On a RHEL4 system,
the compiler activates the frontend that matches GCC 3.3, and this is
what causes the crash. As suggested, I forced the compiler to use the
newer frontend with the -gnu4 option, and the build completes without
an issue. It is sad, though, that they aren't trying to be backwards
compatible, or even testing on RHEL4 systems. I imagine there is still a
large group of people using RHEL4.

Perhaps this is an OMPI FAQ entry?

The full response from Pathscale appears below:

---SNIP---
It appears you are using the compiler on a relatively old linux distribution
which has a default GCC compiler based on version 3.3. Our compiler has a
front-end that is activated on such systems, and a different newer improved
front-end which is activated on the newer GCC4-based systems. Our compiler is
tested on GCC-based systems with versions up to 4.2. I see that you are using
OpenMP (using -mp). C++ OpenMP has limitations when being used with the GNU3.3
based front-end, and is only fully supported when on a GNU4 based system.


You can invoke the newer front-end by the option -gnu4 on a GNU3 based system.
While compiling this particular file may work with -gnu4 on a GNU3 based system,
it is generally not safe to use this option for C++ on a GNU3 based system due
to incompatibility issues.

The ideal fix would be to try your compilation on a GNU4 based linux
distribution.

---END SNIP---

-Joshua Bernstein
Software Engineer
Penguin Computing

Jeff Squyres wrote:
> FWIW, I'm able to duplicate the error.  Looks definitely like a[nother]
> pathscale bug to me.
>
> Perhaps David's suggestions to disable some of the optimizations may
> help; otherwise, you can disable that entire chunk of code with the
> following:
>
>--enable-contrib-no-build=vt
>
> (as Ralph mentioned, this VampirTrace code is an add-on to Open MPI;
> it's not part of core OMPI itself)
>
>
> On May 15, 2009, at 9:17 AM, David O. Gunter wrote:
>
>> Pathscale supports -O3 (at least as of the 3.1 line).  Here are some
>> suggestions from the 3.2 Users Manual you may also want to try.
>>
>> -david
>>
>>
>> If there are numerical problems with -O3 -OPT:Ofast, then try either
>> of the following:
>>
>>   -O3 -OPT:Ofast:ro=1
>>   -O3 -OPT:Ofast:div_split=OFF
>>
>> Note that 'ro' is short for roundoff.
>>
>> -Ofast is equivalent to -O3 -ipa -OPT:Ofast -fno-math-errno -ffast-math
>> so similar cautions apply to it as to -O3 -OPT:Ofast.
>>
>> To use interprocedural analysis without the "Ofast-type" optimizations,
>> use either of the following:
>>   -O3 -ipa
>>   -O2 -ipa
>>
>> Testing different optimizations can be automated by pathopt2. This program
>> compiles and runs your program with a variety of compiler options and
>> creates a sorted list of the execution times for each run.
>>
>> --
>> David Gunter
>> Los Alamos National Laboratory
>>
>> > Last I checked when we were building here, I'm not sure Pathscale
>> > supports -O3. IIRC, O2 is the max supported value, though it has been
>> > awhile since I played with it.
>> >
>> > Have you checked the man page for it?
>> >
>> > It could also be something in the VampirTrace code since that is where
>> > you are failing. That is a contributed code - not part of OMPI itself
>> > - so we would have to check with those developers.
>> >
>> >
>> > On May 14, 2009, at 2:49 PM, Åke Sandgren wrote:
>> >
>> >> On Thu, 2009-05-14 at 13:35 -0700, Joshua Bernstein wrote:
>> >>> Greetings All,
>> >>>
>> >>> I'm trying to build OpenMPI 1.3.2 with the Pathscale compiler,
>> >>> version 3.2. A
>> >>> bit of the way through the build the compiler dies with what it
>> >>> thinks is a bad
>> >>> optimization. Has anybody else seen this, or know a work around for
>> >>> it? I'm
>> >>> going to take it up with Pathscale of course, but I thought I'd
>> >>> throw it out here:
>> >>>
>> >>> ---SNIP---
>> >>> /opt/pathscale/bin/pathCC -DHAVE_CONFIG_H -I. -I../.. -I../../
>> >>> extlib/otf/otflib
>> >>> -I../../extlib/otf/otflib -I../../vtlib/ -I../../vtlib  -
>> >>> D_GNU_SOURCE -mp
>> >>> -DVT_OMP -O3 -DNDEBUG -finline-functions -pthread -MT vtfilter-
>> >>> vt_tracefilter.o
>> >>> -MD -MP -MF .deps/vtfilter-vt_tracefilter.Tpo -c -o vtfilter-
>> >>> vt_tracefilter.o
>> >>> `test -f 'vt_

Re: [OMPI users] MPI processes hang when using OpenMPI 1.3.2 and Gcc-4.4.0

2009-05-18 Thread Eugene Loh

Simone Pellegrini wrote:

Sorry for the delay, but I did some additional experiments to find out
whether the problem was openmpi or gcc!


The program just hangs... and never terminates! I am running on an SMP
machine with 32 cores; actually it is a Sun Fire X4600 X2 (8
quad-core Barcelona AMD chips), the OS is CentOS 5 and the kernel is
2.6.18-92.el5.src-PAPI (patched with PAPI).
I use an N of 1024, and if I print out the value of the iterator i,
sometimes it stops around 165, other times around 520... and it
doesn't make any sense.


If I run the program (and it's important to notice I don't recompile 
it, I just use another mpirun from a different mpi version) the 
program works fine. I did some experiments during the weekend and if I 
use openmpi-1.3.2 compiled with gcc433 everything works fine.


So I really think the problem is strictly related to the usage of
gcc-4.4.0! ...and it doesn't depend on the OpenMPI version, as the program hangs
even when I use openmpi 1.3.1 compiled with gcc 4.4!


I finally got GCC 4.4, but was unable to reproduce the problem.  How 
small can you make np (number of MPI processes) and still see the 
problem?  How reproducible is the problem?  When it hangs, can you get 
stack traces of all the processes?  We're trying to hunt down some 
similar behavior, but I think yours is of a different flavor.


Re: [OMPI users] OpenMPI deadlocks and race conditions ?

2009-05-18 Thread Eugene Loh




François PELLEGRINI wrote:

  users-requ...@open-mpi.org wrote:
  
  
Date: Thu, 14 May 2009 17:06:07 -0700
From: Eugene Loh 
Subject: Re: [OMPI users] OpenMPI deadlocks and race conditions ?
To: Open MPI Users 

François PELLEGRINI wrote:


  I sometimes run into deadlocks in OpenMPI (1.3.3a1r21206), when
running my MPI+threaded PT-Scotch software.
  

So, are there multiple threads per process that perform message-passing 
operations?

  
  
Yes. I use the MPI_THREAD_MULTIPLE level of MPI.

You mentioned that the problem was pretty easy to reproduce.  Could you
send a simple test case (simple means few lines of code and doesn't
take a large system to run)?
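
(Not part of the original exchange, but one quick sanity check for
MPI_THREAD_MULTIPLE setups: the 1.3-series Open MPI generally needs to be
configured with --enable-mpi-threads for that level to really be available,
and ompi_info reports what the installed build supports.)

    # report the thread-support level this Open MPI installation was built with;
    # the exact wording of the line varies a bit between versions
    ompi_info | grep -i thread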




Re: [OMPI users] OpenMPI 1.3.2 with PathScale 3.2

2009-05-18 Thread Joshua Bernstein

Well,

	I spoke with Gautam Chakrabarti at Pathscale. It seems the long and short of it is
that using OpenMP with C++ under a GNU3.3 (RHEL4) frontend creates some
limitations inside of pathCC. On a RHEL4 system, the compiler activates the
frontend that matches GCC 3.3, and this is what causes the crash. As suggested, I
forced the compiler to use the newer frontend with the -gnu4 option, and the
build completes without an issue. It is sad, though, that they aren't trying to be
backwards compatible, or even testing on RHEL4 systems. I imagine there is still a
large group of people using RHEL4.


Perhaps this is an OMPI FAQ entry?

The full response from Pathscale appears below:

---SNIP---
It appears you are using the compiler on a relatively old linux distribution 
which has a default GCC compiler based on version 3.3. Our compiler has a 
front-end that is activated on such systems, and a different newer improved 
front-end which is activated on the newer GCC4-based systems. Our compiler is 
tested on GCC-based systems with versions up to 4.2. I see that you are using 
OpenMP (using -mp). C++ OpenMP has limitations when being used with the GNU3.3 
based front-end, and is only fully supported when on a GNU4 based system.


You can invoke the newer front-end by the option -gnu4 on a GNU3 based system. 
While compiling this particular file may work with -gnu4 on a GNU3 based system, 
it is generally not safe to use this option for C++ on a GNU3 based system due 
to incompatibility issues.


The ideal fix would be to try your compilation on a GNU4 based linux 
distribution.
---END SNIP---

-Joshua Bernstein
Software Engineer
Penguin Computing

Jeff Squyres wrote:
FWIW, I'm able to duplicate the error.  Looks definitely like a[nother] 
pathscale bug to me.


Perhaps David's suggestions to disable some of the optimizations may 
help; otherwise, you can disable that entire chunk of code with the 
following:


   --enable-contrib-no-build=vt

(as Ralph mentioned, this VampirTrace code is an add-on to Open MPI; 
it's not part of core OMPI itself)
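
For reference, that option goes on the configure line when building Open MPI;
a minimal sketch (the install prefix and the PathScale compiler names are
assumptions, not taken from the original report):

    ./configure --prefix=/opt/openmpi-1.3.2 \
        CC=pathcc CXX=pathCC F77=pathf90 FC=pathf90 \
        --enable-contrib-no-build=vt
    make all install

This simply skips building the VampirTrace contrib package, so the file that
triggers the compiler crash is never compiled.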



On May 15, 2009, at 9:17 AM, David O. Gunter wrote:


Pathscale supports -O3 (at least as of the 3.1 line).  Here are some
suggestions from the 3.2 Users Manual you may also want to try.

-david


If there are numerical problems with -O3 -OPT:Ofast, then try either 
of the

following:

  -O3 -OPT:Ofast:ro=1
  -O3 -OPT:Ofast:div_split=OFF

Note that 'ro' is short for roundoff.

-Ofast is equivalent to -O3 -ipa -OPT:Ofast -fno-math-errno -ffast-math
so similar cautions apply to it as to -O3 -OPT:Ofast.

To use interprocedural analysis without the "Ofast-type" optimizations,
use either of the following:
  -O3 -ipa
  -O2 -ipa

Testing different optimizations can be automated by pathopt2. This 
program

compiles and runs your program with a variety of compiler options and
creates a sorted list of the execution times for each run.

--
David Gunter
Los Alamos National Laboratory

> Last I checked when we were building here, I'm not sure Pathscale
> supports -O3. IIRC, O2 is the max supported value, though it has been
> awhile since I played with it.
>
> Have you checked the man page for it?
>
> It could also be something in the VampirTrace code since that is where
> you are failing. That is a contributed code - not part of OMPI itself
> - so we would have to check with those developers.
>
>
> On May 14, 2009, at 2:49 PM, Åke Sandgren wrote:
>
>> On Thu, 2009-05-14 at 13:35 -0700, Joshua Bernstein wrote:
>>> Greetings All,
>>>
>>> I'm trying to build OpenMPI 1.3.2 with the Pathscale compiler,
>>> version 3.2. A
>>> bit of the way through the build the compiler dies with what it
>>> thinks is a bad
>>> optimization. Has anybody else seen this, or know a work around for
>>> it? I'm
>>> going to take it up with Pathscale of course, but I thought I'd
>>> throw it out here:
>>>
>>> ---SNIP---
>>> /opt/pathscale/bin/pathCC -DHAVE_CONFIG_H -I. -I../.. -I../../
>>> extlib/otf/otflib
>>> -I../../extlib/otf/otflib -I../../vtlib/ -I../../vtlib  -
>>> D_GNU_SOURCE -mp
>>> -DVT_OMP -O3 -DNDEBUG -finline-functions -pthread -MT vtfilter-
>>> vt_tracefilter.o
>>> -MD -MP -MF .deps/vtfilter-vt_tracefilter.Tpo -c -o vtfilter-
>>> vt_tracefilter.o
>>> `test -f 'vt_tracefilter.cc' || echo './'`vt_tracefilter.cc
>>> Signal: Segmentation fault in Global Optimization -- Dead Store
>>> Elimination phase.
>>> Error: Signal Segmentation fault in phase Global Optimization --
>>> Dead Store
>>> Elimination -- processing aborted
>>> *** Internal stack backtrace:
>>> pathCC INTERNAL ERROR: /opt/pathscale/lib/3.2/be died due to signal 4
>>
>> Haven't seen it. But I'm only using -O2 when building openmpi.
>> Report it quickly, if we're lucky they might get a fix into the 3.3
>> release that is due out very soon. (I just got the beta yesterday)
>>
>> --
>> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
>> Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
>> Mobile: +46 70 7716134 WWW: h

[OMPI users] CP2K mpi hang

2009-05-18 Thread Noam Bernstein
Hi all - I have a bizarre OpenMPI hanging problem.  I'm running an MPI code
called CP2K (related to, but not the same as cpmd).  The complications of the
software aside, here are the observations:

At the base is a serial code that uses system() calls to repeatedly invoke
   mpirun cp2k.popt.
When I run from my NFS mounted home directory, everything appears to be
fine.  When I run from a scratch directory local to each node, it hangs on
the _third_ invocation of CP2K (the 1st and 3rd invocations do computationally
expensive stuff; the 2nd uses the code in a different mode which does a rather
different and quicker computation).  These behaviors are quite repeatable.

Run from NFS mounted home dir - no problem.  Run from node-local scratch
directory - hang.  The hang is always in the same place (as far as the output of
the code, anyway).

The underlying system is Linux with a 2.6.18-128.1.6.el5 kernel (CentOS 5.3)
on a dual single-core Opteron system with Mellanox Infiniband SDR cards.
One note of caution is that I'm running OFED 1.4.1-rc4, because I need 1.4.1
for compatibility with this kernel as far as I can tell.

The code is complicated and the input files are big and lead to long computation
times, so I don't think I'll be able to make a simple test case.  Instead
I attached to the hanging processes (all 8 of them) with gdb during the hang.
The stack trace is below.  Nodes seem to spend most of their time in
btl_openib_component_progress(), and occasionally in mca_pml_ob1_progress() --
i.e. not completely stuck, but not making progress.
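
For reference, the attach/backtrace step is just a batch gdb invocation,
roughly like this (a sketch; the process-matching pattern is an assumption
based on the executable name above):

    # dump a backtrace from every hung cp2k.popt rank on this node
    for pid in $(pgrep cp2k.popt); do
        gdb -batch -p "$pid" -ex "thread apply all bt" > bt.$pid.txt 2>&1
    done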


Does anyone have any ideas what could be wrong?


Noam

P.S. I get a similar hang with MVAPICH, in a nearby but different part of the
code (on an MPI_Bcast, specifically), increasing my tendency to believe
that it's OFED's fault.  But maybe the stack trace will suggest to someone
where it might be stuck, and therefore perhaps an mca flag to try?


#0  0x2ac2d19d7733 in btl_openib_component_progress () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/openmpi/mca_btl_openib.so
#1  0x2ac2cdd4daea in opal_progress () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/libopen-pal.so.0
#2  0x2ac2cd887e55 in ompi_request_default_wait_all () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/libmpi.so.0
#3  0x2ac2d2eb544f in ompi_coll_tuned_allreduce_intra_recursivedoubling () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/openmpi/mca_coll_tuned.so
#4  0x2ac2cd89b867 in PMPI_Allreduce () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/libmpi.so.0
#5  0x2ac2cd6429b5 in pmpi_allreduce__ () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/libmpi_f77.so.0

#6  0x0077e7db in message_passing_mp_mp_sum_r1_ ()
#7  0x00be67dd in sparse_matrix_types_mp_cp_sm_sm_trace_scalar_ ()
#8  0x0160b68c in qs_initial_guess_mp_calculate_first_density_matrix_ ()

#9  0x00a7ec05 in qs_scf_mp_scf_env_initial_rho_setup_ ()
#10 0x00a79fca in qs_scf_mp_init_scf_run_ ()
#11 0x00a659fd in qs_scf_mp_scf_ ()
#12 0x008c5713 in qs_energy_mp_qs_energies_ ()
#13 0x008d469e in qs_force_mp_qs_forces_ ()
#14 0x005368bb in force_env_methods_mp_force_env_calc_energy_force_ ()
#15 0x0053620e in force_env_methods_mp_force_env_calc_energy_force_ ()

#16 0x00742724 in md_run_mp_qs_mol_dyn_ ()
#17 0x00489c42 in cp2k_runs_mp_cp2k_run_ ()
#18 0x0048878a in cp2k_runs_mp_run_input_ ()
#19 0x00487669 in MAIN__ ()
#20 0x0048667c in main ()






#0  0x2b4d0b57bf09 in mca_pml_ob1_progress () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/openmpi/mca_pml_ob1.so
#1  0x2b4d08538aea in opal_progress () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/libopen-pal.so.0
#2  0x2b4d08072e55 in ompi_request_default_wait_all () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/libmpi.so.0
#3  0x2b4d0d6a044f in ompi_coll_tuned_allreduce_intra_recursivedoubling () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/openmpi/mca_coll_tuned.so
#4  0x2b4d08086867 in PMPI_Allreduce () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/libmpi.so.0
#5  0x2b4d07e2d9b5 in pmpi_allreduce__ () from /share/apps/mpi/openmpi-1.3.2/intel-11.0.083/lib/libmpi_f77.so.0

#6  0x0077e7db in message_passing_mp_mp_sum_r1_ ()
#7  0x00be67dd in sparse_matrix_types_mp_cp_sm_sm_trace_scalar_ ()
#8  0x0160b68c in qs_initial_guess_mp_calculate_first_density_matrix_ ()

#9  0x00a7ec05 in qs_scf_mp_scf_env_initial_rho_setup_ ()
#10 0x00a79fca in qs_scf_mp_init_scf_run_ ()
#11 0x00a659fd in qs_scf_mp_scf_ ()
#12 0x008c5713 in qs_energy_mp_qs_energies_ ()
#13 0x008d469e in qs_force_mp_qs_forces_ ()
#14 0x005368

Re: [OMPI users] scaling problem with openmpi

2009-05-18 Thread Gus Correa

Hi Pavel

This is not my league, but here are some
CPMD helpful links (code, benchmarks):

http://www.cpmd.org/
http://www.cpmd.org/cpmd_thecode.html
http://www.theochem.ruhr-uni-bochum.de/~axel.kohlmeyer/cpmd-bench.html

IHIH
Gus Correa

Noam Bernstein wrote:


On May 18, 2009, at 12:50 PM, Pavel Shamis (Pasha) wrote:


Roman,
Can you please share with us the MVAPICH numbers that you get?  Also, what
MVAPICH version do you use?
The default MVAPICH and Open MPI IB tuning is very similar, so it is
strange to see such a big difference. Do you know what kind of collective
operations are used in this specific application?


This code does a bunch of parallel things in various different places
(mostly dense matrix math, and some FFT stuff that may or may not
be parallelized).  In the standard output there's a summary of the time
taken by various MPI routines.  Perhaps Roman can send them?  The
code also uses ScaLAPACK, but I'm not sure how CP2K labels the
timing for those routines in the output.

Noam




Re: [OMPI users] could oversubscription clobber an executable?

2009-05-18 Thread Jeff Squyres
There is another option here -- Fortran compilers can aggressively
move code around, particularly when they don't know about MPI inter-function
dependencies.

This usually only happens with non-blocking MPI communication
functions, though.  Are you using those, perchance?



On May 18, 2009, at 11:51 AM, Iain Bason wrote:



On May 14, 2009, at 3:20 PM, Valmor de Almeida wrote:

> I guess another way to ask is: is it guaranteed that A and B are
> contiguous?

Yes.

> and the MPI communication correctly sends the data?

I'm not sure what you're asking, but the code looks as though it ought
to work.

Iain





--
Jeff Squyres
Cisco Systems



Re: [OMPI users] scaling problem with openmpi

2009-05-18 Thread Noam Bernstein


On May 18, 2009, at 12:50 PM, Pavel Shamis (Pasha) wrote:


Roman,
Can you please share with us the MVAPICH numbers that you get?  Also,
what MVAPICH version do you use?
The default MVAPICH and Open MPI IB tuning is very similar, so it is
strange to see such a big difference. Do you know what kind of
collective operations are used in this specific application?


This code does a bunch of parallel things in various different places
(mostly dense matrix math, and some FFT stuff that may or may not
be parallelized).  In the standard output there's a summary of the time
taken by various MPI routines.  Perhaps Roman can send them?  The
code also uses ScaLAPACK, but I'm not sure how CP2K labels the
timing for those routines in the output.

Noam


Re: [OMPI users] scaling problem with openmpi

2009-05-18 Thread Pavel Shamis (Pasha)

Roman,
Can you please share with us the MVAPICH numbers that you get?  Also, what
MVAPICH version do you use?
The default MVAPICH and Open MPI IB tuning is very similar, so it is strange
to see such a big difference. Do you know what kind of collective operations
are used in this specific application?


Pasha.

Roman Martonak wrote:

I've been using --mca mpi_paffinity_alone 1 in all simulations. Concerning
"-mca mpi_leave_pinned 1", I tried it with openmpi 1.2.X versions and it
makes no difference.

Best regards

Roman

On Mon, May 18, 2009 at 4:57 PM, Pavel Shamis (Pasha)  wrote:
  

1) I was told to add "-mca mpi_leave_pinned 0" to avoid problems with
Infiniband.  This was with OpenMPI 1.3.1.  Not


Actually, for the 1.2.X versions I would recommend that you enable leave_pinned: "-mca
mpi_leave_pinned 1"


sure if the problems were fixed on 1.3.2, but I am hanging on to that
setting just in case.


We had a data corruption issue in 1.3.1, but it was resolved in 1.3.2. In the 1.3.2
version leave_pinned is enabled by default.

If I remember correctly, mvapich enables affinity mode by default, so I can
recommend that you try to enable it too:
"--mca mpi_paffinity_alone 1". For more details please check the FAQ -
http://www.open-mpi.org/faq/?category=tuning#using-paffinity

Thanks,
Pasha.




  




Re: [OMPI users] scaling problem with openmpi

2009-05-18 Thread Gus Correa

Hi Roman

Note that in 1.3.0 and 1.3.1 the default ("-mca mpi_leave_pinned 1")
had a glitch.  In my case it appeared as a memory leak.

See this:

http://www.open-mpi.org/community/lists/users/2009/05/9173.php
http://www.open-mpi.org/community/lists/announce/2009/03/0029.php

One workaround is to revert to
"-mca mpi_leave_pinned 0" (which is what I suggested to you)
when using 1.3.0 or 1.3.1.
The solution advocated by OpenMPI is to upgrade to 1.3.2.

You reported you used "1.3", besides 1.2.6 and 1.2.8.
If this means that you are using 1.3.0 or 1.3.1,
you may want to try the workaround or the upgrade,
regardless of any scaling performance expectations.
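
For concreteness, both of these MCA parameters go on the mpirun command line;
a sketch (the process count and executable name are placeholders):

    mpirun -np 32 --mca mpi_leave_pinned 0 --mca mpi_paffinity_alone 1 ./your_app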

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-




Roman Martonak wrote:

I've been using --mca mpi_paffinity_alone 1 in all simulations. Concerning
"-mca mpi_leave_pinned 1", I tried it with openmpi 1.2.X versions and it
makes no difference.

Best regards

Roman

On Mon, May 18, 2009 at 4:57 PM, Pavel Shamis (Pasha)  wrote:

1) I was told to add "-mca mpi_leave_pinned 0" to avoid problems with
Infiniband.  This was with OpenMPI 1.3.1.  Not

Actually, for the 1.2.X versions I would recommend that you enable leave_pinned: "-mca
mpi_leave_pinned 1"

sure if the problems were fixed on 1.3.2, but I am hanging on to that
setting just in case.

We had a data corruption issue in 1.3.1, but it was resolved in 1.3.2. In the 1.3.2
version leave_pinned is enabled by default.

If I remember correctly, mvapich enables affinity mode by default, so I can
recommend that you try to enable it too:
"--mca mpi_paffinity_alone 1". For more details please check the FAQ -
http://www.open-mpi.org/faq/?category=tuning#using-paffinity

Thanks,
Pasha.







Re: [OMPI users] scaling problem with openmpi

2009-05-18 Thread Roman Martonak
I've been using --mca mpi_paffinity_alone 1 in all simulations. Concerning
"-mca mpi_leave_pinned 1", I tried it with openmpi 1.2.X versions and it
makes no difference.

Best regards

Roman

On Mon, May 18, 2009 at 4:57 PM, Pavel Shamis (Pasha)  wrote:
>
>>
>> 1) I was told to add "-mca mpi_leave_pinned 0" to avoid problems with
>> Infiniband.  This was with OpenMPI 1.3.1.  Not
>
> Actually, for the 1.2.X versions I would recommend that you enable leave_pinned: "-mca
> mpi_leave_pinned 1"
>>
>> sure if the problems were fixed on 1.3.2, but I am hanging on to that
>> setting just in case.
>
> We had a data corruption issue in 1.3.1, but it was resolved in 1.3.2. In the 1.3.2
> version leave_pinned is enabled by default.
>
> If I remember correctly, mvapich enables affinity mode by default, so I can
> recommend that you try to enable it too:
> "--mca mpi_paffinity_alone 1". For more details please check the FAQ -
> http://www.open-mpi.org/faq/?category=tuning#using-paffinity
>
> Thanks,
> Pasha.



Re: [OMPI users] could oversubscription clobber an executable?

2009-05-18 Thread Iain Bason


On May 14, 2009, at 3:20 PM, Valmor de Almeida wrote:


I guess another way to ask is: is it guaranteed that A and B are
contiguous?


Yes.


and the MPI communication correctly sends the data?


I'm not sure what you're asking, but the code looks as though it ought  
to work.


Iain



Re: [OMPI users] Openmpi -MacOSX-mpif90 won't compile

2009-05-18 Thread John Boccio

Thanks for that comment.

I thought that is what I was doing when I used the full path name

/usr/local/openmpi-1.3/bin/mpif90

Is that not true?

John Boccio
On May 18, 2009, at 11:31 AM, Jeff Squyres wrote:

Check first to make sure you're using the mpif90 in /usr/local/openmpi-1.3/bin
-- OS X ships with an Open MPI installation that does not include F90 support.
The default OS X Open MPI install may be in your PATH before the Open MPI you
just installed in /usr/local.



On May 18, 2009, at 10:13 AM, John Boccio wrote:


Hi,

I need to use mpif90 for some work on a parallel cluster for galaxy-galaxy
collision research.
I am certainly not an expert in using UNIX to compile big packages like openmpi.


I have listed below all (I hope) relevant information and included
output files (compressed) as an attachment.


Thanks for any help,

John Boccio
boc...@swarthmore.edu
Department of Physics
Swarthmore College


Here is g95 and xcode info.
Using openmpi-1.3

Mac OSX Leopard 10.5.7

g95 from www.g95.com

g95 -v
Using built-in specs.
Target:
Configured with: ../configure --enable-languages=c
Thread model: posix
gcc version 4.0.3 (g95 0.92!) Oct 18 2008

xcode311_2517_developerdvd.dmg

openmpi-1.3

sudo ./configure --enable-mpi-f77 --enable-mpi-f90 F77="/usr/bin/g95" FC="/usr/bin/g95" > config.out


sudo make clean

sudo make clean prefix=/usr/local/openmpi-1.3

sudo make > make.out

sudo make install prefix=/usr/local/openmpi-1.3 > make-install.out

/usr/local/openmpi-1.3/bin/mpif90

--
Unfortunately, this installation of Open MPI was not compiled with
Fortran 90 support.  As such, the mpif90 compiler is non-functional.
--

files included in attachment   ompi-output.tar.gz :

config.out
config.status
config.log
Makefile
make.out
make-install.out





--
Jeff Squyres
Cisco Systems





Re: [OMPI users] Openmpi -MacOSX-mpif90 won't compile

2009-05-18 Thread Jeff Squyres
Check first to make sure you're using the mpif90 in /usr/local/openmpi-1.3/bin
-- OS X ships with an Open MPI installation that does not include F90 support.
The default OS X Open MPI install may be in your PATH before the Open MPI you
just installed in /usr/local.
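
A quick way to verify both points (a sketch; ompi_info's exact field names
vary slightly between versions):

    # which mpif90 is first in the PATH?
    which mpif90
    # does the installation in /usr/local actually have the Fortran bindings?
    /usr/local/openmpi-1.3/bin/ompi_info | grep -i bindings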



On May 18, 2009, at 10:13 AM, John Boccio wrote:


Hi,

I need to use mpif90 for some work on a parallel cluster for galaxy-galaxy
collision research.
I am certainly not an expert in using UNIX to compile big packages like openmpi.


I have listed below all (I hope) relevant information and included
output files (compressed) as an attachment.


Thanks for any help,

John Boccio
boc...@swarthmore.edu
Department of Physics
Swarthmore College


Here is g95 and xcode info.
Using openmpi-1.3

Mac OSX Leopard 10.5.7

g95 from www.g95.com

g95 -v
Using built-in specs.
Target:
Configured with: ../configure --enable-languages=c
Thread model: posix
gcc version 4.0.3 (g95 0.92!) Oct 18 2008

xcode311_2517_developerdvd.dmg

openmpi-1.3

sudo ./configure --enable-mpi-f77 --enable-mpi-f90 F77="/usr/bin/g95" FC="/usr/bin/g95" > config.out


sudo make clean

sudo make clean prefix=/usr/local/openmpi-1.3

sudo make > make.out

sudo make install prefix=/usr/local/openmpi-1.3 > make-install.out

/usr/local/openmpi-1.3/bin/mpif90

--
Unfortunately, this installation of Open MPI was not compiled with
Fortran 90 support.  As such, the mpif90 compiler is non-functional.
--

files included in attachment   ompi-output.tar.gz :

config.out
config.status
config.log
Makefile
make.out
make-install.out





--
Jeff Squyres
Cisco Systems



Re: [OMPI users] scaling problem with openmpi

2009-05-18 Thread Pavel Shamis (Pasha)




1) I was told to add "-mca mpi_leave_pinned 0" to avoid problems with
Infiniband.  This was with OpenMPI 1.3.1.  Not

Actually, for the 1.2.X versions I would recommend that you enable leave_pinned:
"-mca mpi_leave_pinned 1"

sure if the problems were fixed on 1.3.2, but I am hanging on to that
setting just in case.

We had a data corruption issue in 1.3.1, but it was resolved in 1.3.2. In the
1.3.2 version leave_pinned is enabled by default.


If I remember correctly, mvapich enables affinity mode by default, so I can
recommend that you try to enable it too:
"--mca mpi_paffinity_alone 1". For more details please check the FAQ -
http://www.open-mpi.org/faq/?category=tuning#using-paffinity


Thanks,
Pasha.


[OMPI users] Openmpi -MacOSX-mpif90 won't compile

2009-05-18 Thread John Boccio
Hi,

I need to use mpif90 for some work on a parallel cluster for galaxy-galaxy collision research.
I am certainly not an expert in using UNIX to compile big packages like openmpi.
I have listed below all (I hope) relevant information and included output files (compressed) as an attachment.

Thanks for any help,

John Boccio
boc...@swarthmore.edu
Department of Physics
Swarthmore College

Here is g95 and xcode info.
Using openmpi-1.3

Mac OSX Leopard 10.5.7

g95 from www.g95.com

g95 -v
Using built-in specs.
Target:
Configured with: ../configure --enable-languages=c
Thread model: posix
gcc version 4.0.3 (g95 0.92!) Oct 18 2008

xcode311_2517_developerdvd.dmg

openmpi-1.3

sudo ./configure --enable-mpi-f77 --enable-mpi-f90 F77="/usr/bin/g95" FC="/usr/bin/g95" > config.out
sudo make clean
sudo make clean prefix=/usr/local/openmpi-1.3
sudo make > make.out
sudo make install prefix=/usr/local/openmpi-1.3 > make-install.out
/usr/local/openmpi-1.3/bin/mpif90

--
Unfortunately, this installation of Open MPI was not compiled with
Fortran 90 support.  As such, the mpif90 compiler is non-functional.
--

files included in attachment ompi-output.tar.gz:

config.out
config.status
config.log
Makefile
make.out
make-install.out

ompi-output.tar.gz
Description: GNU Zip compressed data
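
As a side note on the build recipe quoted above (a sketch only, and not
necessarily the cause of the missing F90 support): the usual Open MPI sequence
fixes the installation prefix at configure time rather than passing prefix=
to make install:

    ./configure --prefix=/usr/local/openmpi-1.3 --enable-mpi-f77 --enable-mpi-f90 \
        F77=/usr/bin/g95 FC=/usr/bin/g95 2>&1 | tee config.out
    make 2>&1 | tee make.out
    sudo make install 2>&1 | tee make-install.out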