[OMPI users] Help: HPL Compiled Problem

2009-07-22 Thread Lee Amy
Hi,

I'm going to compile HPL using OpenMPI-1.2.4. Here's my
Make.Linux_ATHLON_CBLAS file.

# ##
#
# --
# - shell --
# --
#
SHELL= /bin/sh
#
CD   = cd
CP   = cp
LN_S = ln -s
MKDIR= mkdir
RM   = /bin/rm -f
TOUCH= touch
#
# --
# - Platform identifier 
# --
#
ARCH = Linux_ATHLON_CBLAS
#
# --
# - HPL Directory Structure / HPL library --
# --
#
TOPdir   = /ma/hpl-2.0
INCdir   = $(TOPdir)/include
BINdir   = $(TOPdir)/bin/$(ARCH)
LIBdir   = $(TOPdir)/lib/$(ARCH)
#
HPLlib   = $(LIBdir)/libhpl.a
#
# --
# - MPI directories - library --
# --
# MPinc tells the  C  compiler where to find the Message Passing library
# header files,  MPlib  is defined  to be the name of  the library to be
# used. The variable MPdir is only used for defining MPinc and MPlib.
#
MPdir= /ma/openmpi-1.2.4
MPinc= -I$(MPdir)/include
MPlib= $(MPdir)/lib/libmpi.so
#
# --
# - Linear Algebra library (BLAS or VSIPL) -
# --
# LAinc tells the  C  compiler where to find the Linear Algebra  library
# header files,  LAlib  is defined  to be the name of  the library to be
# used. The variable LAdir is only used for defining LAinc and LAlib.
#
LAdir= /ma/GotoBLAS-1.26
LAinc=
LAlib= $(LAdir)/libgoto.a
#
# --
# - F77 / C interface --
# --
# You can skip this section  if and only if  you are not planning to use
# a  BLAS  library featuring a Fortran 77 interface.  Otherwise,  it  is
# necessary  to  fill out the  F2CDEFS  variable  with  the  appropriate
# options.  **One and only one**  option should be chosen in **each** of
# the 3 following categories:
#
# 1) name space (How C calls a Fortran 77 routine)
#
# -DAdd_  : all lower case and a suffixed underscore  (Suns,
#   Intel, ...),   [default]
# -DNoChange  : all lower case (IBM RS6000),
# -DUpCase: all upper case (Cray),
# -DAdd__ : the FORTRAN compiler in use is f2c.
#
# 2) C and Fortran 77 integer mapping
#
# -DF77_INTEGER=int   : Fortran 77 INTEGER is a C int, [default]
# -DF77_INTEGER=long  : Fortran 77 INTEGER is a C long,
# -DF77_INTEGER=short : Fortran 77 INTEGER is a C short.
#
# 3) Fortran 77 string handling
#
# -DStringSunStyle: The string address is passed at the string loca-
#   tion on the stack, and the string length is then
#   passed as  an  F77_INTEGER  after  all  explicit
#   stack arguments,   [default]
# -DStringStructPtr   : The address  of  a  structure  is  passed  by  a
#   Fortran 77  string,  and the structure is of the
#   form: struct {char *cp; F77_INTEGER len;},
# -DStringStructVal   : A structure is passed by value for each  Fortran
#   77 string,  and  the  structure is  of the form:
#   struct {char *cp; F77_INTEGER len;},
# -DStringCrayStyle   : Special option for  Cray  machines,  which  uses
#   Cray  fcd  (fortran  character  descriptor)  for
#   interoperation.
#
F2CDEFS  =
#
# --
# - HPL includes / libraries / specifics ---
# --
#
HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)
HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib)
#
# - Compile time options ---
#
# -DHPL_COPY_L   force the copy of the panel L before bcast;
# -DHPL_CALL_CBLAS   call the cblas interface;
# -DHPL_CALL_VSIPL   call the vsip  library;
# -DHPL_DETAILED_TIMING  enable detailed timers;
#
# By default HPL will:
#*) n

Re: [OMPI users] Help: HPL Compiled Problem

2009-07-22 Thread Daniël Mantione


On Wed, 22 Jul 2009, Lee Amy wrote:

> Hi,
> 
> I'm going to compile HPL by using OpenMPI-1.2.4. Here's my
> Make.Linux_ATHLON_CBLAS file.

GotoBLAS needs to be called as Fortran BLAS, so you need to switch from 
CBLAS to FBLAS.
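
In the HPL Make file that switch typically comes down to two things (a
sketch, assuming GotoBLAS's usual lowercase-with-underscore Fortran
symbols and default integer sizes): drop -DHPL_CALL_CBLAS from the
compile-time options and describe the Fortran calling convention in
F2CDEFS, e.g.

F2CDEFS  = -DAdd_ -DF77_INTEGER=int -DStringSunStyle
HPL_OPTS =            # no -DHPL_CALL_CBLAS when calling the Fortran BLAS
LAlib    = $(LAdir)/libgoto.a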

Daniël Mantione

Re: [OMPI users] Help: HPL Compiled Problem

2009-07-22 Thread Lee Amy
On Wed, Jul 22, 2009 at 2:20 PM, Daniël
Mantione wrote:
>
>
> On Wed, 22 Jul 2009, Lee Amy wrote:
>
>> Hi,
>>
>> I'm going to compile HPL by using OpenMPI-1.2.4. Here's my
>> Make.Linux_ATHLON_CBLAS file.
>
> GotoBLAS needs to be called as Fortran BLAS, so you need to switch from
> CBLAS to FBLAS.
>
> Daniël Mantione
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

Dear sir,

Thank you very much. I have compiled HPL successfully. But when I
start the xhpl program I encounter the following problem.

[node101:15416] *** Process received signal ***
[node101:15418] *** Process received signal ***
[node101:15418] Signal: Segmentation fault (11)
[node101:15418] Signal code: Address not mapped (1)
[node101:15418] Failing at address: 0x7fff
[node101:15416] Signal: Segmentation fault (11)
[node101:15416] Signal code: Address not mapped (1)
[node101:15416] Failing at address: 0x7fff
[node101:15418] [ 0] /lib64/libc.so.6 [0x2b7e20aa1c30]
[node101:15418] [ 1] xhpl [0x4259f0]
[node101:15418] *** End of error message ***
[node101:15416] [ 0] /lib64/libc.so.6 [0x2aacfce93c30]
[node101:15416] [ 1] xhpl [0x4259f0]
[node101:15416] *** End of error message ***
mpirun noticed that job rank 0 with PID 15416 on node node101 exited
on signal 11 (Segmentation fault).

Here's the uname -a output.

Linux node101 2.6.16.60-0.21-smp #1 SMP Tue May 6 12:41:02 UTC 2008
x86_64 x86_64 x86_64 GNU/Linux

Here's the lsb_release output.

LSB Version:
core-2.0-noarch:core-3.0-noarch:core-2.0-x86_64:core-3.0-x86_64:desktop-3.1-amd64:desktop-3.1-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.1-amd64:graphics-3.1-noarch

Could you tell me how to fix that?

Thank you very much.

Amy



Re: [OMPI users] Help: HPL Compiled Problem

2009-07-22 Thread Daniël Mantione


On Wed, 22 Jul 2009, Lee Amy wrote:

> Dear sir,
> 
> Thank you very much. I have compiled HPL successfully. But when I
> start up xhpl program I encountered such problem.
> 
> mpirun noticed that job rank 0 with PID 15416 on node node101 exited
> on signal 11 (Segmentation fault).
> 
> Could you tell me how to fix that?

That error message gives very little information to diagnose the problem.
Maybe you can recompile with debug information; then it will print a more
meaningful backtrace.
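
One way to do that (a sketch; flags and targets may need adjusting for
your compiler and HPL version) is to add -g to the compile flags in the
Make.<arch> file, rebuild, and read the stack trace from the core file:

CCFLAGS  = $(HPL_DEFS) -g -O0       # in Make.Linux_ATHLON_CBLAS

make arch=Linux_ATHLON_CBLAS clean_arch_all
make arch=Linux_ATHLON_CBLAS
ulimit -c unlimited
mpirun -np 2 bin/Linux_ATHLON_CBLAS/xhpl
gdb bin/Linux_ATHLON_CBLAS/xhpl core    # "bt" shows the backtrace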

Also, please compare your Makefile with the attached one.

Daniël Mantione

#
#  -- High Performance Computing Linpack Benchmark (HPL)
# HPL - 1.0a - January 20, 2004  
# Antoine P. Petitet
# University of Tennessee, Knoxville
# Innovative Computing Laboratories 
# (C) Copyright 2000-2004 All Rights Reserved   
#   
#  -- Copyright notice and Licensing terms: 
#   
#  Redistribution  and  use in  source and binary forms, with or without
#  modification, are  permitted provided  that the following  conditions
#  are met: 
#   
#  1. Redistributions  of  source  code  must retain the above copyright
#  notice, this list of conditions and the following disclaimer.
#   
#  2. Redistributions in binary form must reproduce  the above copyright
#  notice, this list of conditions,  and the following disclaimer in the
#  documentation and/or other materials provided with the distribution. 
#   
#  3. All  advertising  materials  mentioning  features  or  use of this
#  software must display the following acknowledgement: 
#  This  product  includes  software  developed  at  the  University  of
#  Tennessee, Knoxville, Innovative Computing Laboratories. 
#   
#  4. The name of the  University,  the name of the  Laboratory,  or the
#  names  of  its  contributors  may  not  be used to endorse or promote
#  products  derived   from   this  software  without  specific  written
#  permission.  
#   
#  -- Disclaimer:   
#   
#  THIS  SOFTWARE  IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
#  ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,  INCLUDING,  BUT NOT
#  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
#  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE UNIVERSITY
#  OR  CONTRIBUTORS  BE  LIABLE FOR ANY  DIRECT,  INDIRECT,  INCIDENTAL,
#  SPECIAL,  EXEMPLARY,  OR  CONSEQUENTIAL DAMAGES  (INCLUDING,  BUT NOT
#  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
#  DATA OR PROFITS; OR BUSINESS INTERRUPTION)  HOWEVER CAUSED AND ON ANY
#  THEORY OF LIABILITY, WHETHER IN CONTRACT,  STRICT LIABILITY,  OR TORT
#  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
#  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
# ##
#  
# --
# - shell --
# --
#
SHELL= /bin/sh
#
CD   = cd
CP   = cp
LN_S = ln -s
MKDIR= mkdir
RM   = /bin/rm -f
TOUCH= touch
#
# --
# - Platform identifier 
# --
#
ARCH = clustervision-openmpi-intel
#
# --
# - HPL Directory Structure / HPL library --
# --
#
TOPdir   = $(HOME)/hpl
INCdir   = $(TOPdir)/include
BINdir   = $(TOPdir)/bin/$(ARCH)
LIBdir   = $(TOPdir)/lib/$(ARCH)
#
HPLlib   = $(LIBdir)/libhpl.a 
#
# --
# - Message Passing library (MPI) --
# ---

Re: [OMPI users] Help: HPL Compiled Problem

2009-07-22 Thread Lee Amy
On Wed, Jul 22, 2009 at 2:53 PM, Daniël
Mantione wrote:
>
>
> On Wed, 22 Jul 2009, Lee Amy wrote:
>
>> Dear sir,
>>
>> Thank you very much. I have compiled HPL successfully. But when I
>> start up xhpl program I encountered such problem.
>>
>> mpirun noticed that job rank 0 with PID 15416 on node node101 exited
>> on signal 11 (Segmentation fault).
>>
>> Could you tell me how to fix that?
>
> That error message gives very little information to diagnose the problem.
> Maybe you can recompile with debug information, then it will print a more
> meaningfull backtrace.
>
> Also, please compare your Makefile with the attached one.
>
> Daniël Mantione
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

Thanks. I have used your Makefile to recompile. However, I still
encounter an odd problem.

I have attached the make output and Makefile.

Thank you very much.

Amy


make_1
Description: Binary data


Make.Linux_PII_FBLAS
Description: Binary data


[OMPI users] Warning: declaration ‘struct MPI::Grequest_intercept_t’ does not declare anything

2009-07-22 Thread Alexey Sokolov
Hi

I faced a warning "declaration ‘struct MPI::Grequest_intercept_t’ does
not declare anything" using openmpi 1.2.4 (compiling under Fedora 10
with mpic++ wrapper over gcc 4.3.2) and don't know how to solve it.
Browsing the Internet i've found an advise just to ignore it, but i
don't think it is impossible to solve it in another way.

I have a correctly working single-threaded program. When I simply include
mpi.h and compile, I get this:

In file included
from /usr/include/openmpi/1.2.4-gcc/openmpi/ompi/mpi/cxx/mpicxx.h:246,
 from /usr/include/openmpi/1.2.4-gcc/mpi.h:1783,

from 
/home/user/NetBeansProjects/Correlation_orig/Correlation/Correlation.cpp:2:
/usr/include/openmpi/1.2.4-gcc/openmpi/ompi/mpi/cxx/request_inln.h:347: 
warning: declaration ‘struct MPI::Grequest_intercept_t’ does not declare 
anything

The program still works correctly, but this warning makes me nervous.

Sincerely yours, Alexey.



Re: [OMPI users] Warning: declaration ‘struct MPI::Grequest_intercept_t’ does not declare anything

2009-07-22 Thread jody
Hi Alexey

I don't know how this warning comes about,
but have you ever considered using a newer version of Open MPI?
1.2.4 is quite ancient; the current version is 1.3.3:
   http://www.open-mpi.org/software/ompi/v1.3/
Jody



On Wed, Jul 22, 2009 at 9:17 AM, Alexey Sokolov wrote:
> Hi
>
> I faced a warning "declaration ‘struct MPI::Grequest_intercept_t’ does
> not declare anything" using openmpi 1.2.4 (compiling under Fedora 10
> with mpic++ wrapper over gcc 4.3.2) and don't know how to solve it.
> Browsing the Internet i've found an advise just to ignore it, but i
> don't think it is impossible to solve it in another way.
>
> I have a correct working single thread program. Then i just include
> mpi.h, compile and get this:
>
>        In file included
>        from /usr/include/openmpi/1.2.4-gcc/openmpi/ompi/mpi/cxx/mpicxx.h:246,
>                         from /usr/include/openmpi/1.2.4-gcc/mpi.h:1783,
>
>        from 
> /home/user/NetBeansProjects/Correlation_orig/Correlation/Correlation.cpp:2:
>        
> /usr/include/openmpi/1.2.4-gcc/openmpi/ompi/mpi/cxx/request_inln.h:347: 
> warning: declaration ‘struct MPI::Grequest_intercept_t’ does not declare 
> anything
>
> The program is still works correctly but this warning makes me nervous.
>
> Sincerely yours, Alexey.
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Warning: declaration ‘struct MPI::Grequest_intercept_t’ does not declare anything

2009-07-22 Thread Alexey Sokolov
Hi Jody

As I'm new to Linux it was much simpler for me to use the default Fedora
yum installer, and the latest version available that way is still 1.2.4.

I've installed the latest 1.3.3 version as you advised and that warning
disappeared. I still don't know how or why, but the problem is now solved.
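
For anyone who has to stay on 1.2.x: assuming the program only uses the
C MPI API and never touches the MPI:: C++ bindings, one possible
workaround is to skip the C++ bindings when including the header, e.g.

#define OMPI_SKIP_MPICXX 1
#include <mpi.h>

which avoids pulling in the C++ inline code that triggers the warning.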

Sincerely yours, Alexey.

On Wed, 2009-07-22 at 09:55 +0200, jody wrote: 
> Hi Alexey
> 
> I don't know how this error messgae comes about,
> but have you ever considered using a newer version of Open MPI?
> 1.2.4 is quite ancient, the current version is 1.3.3
>http://www.open-mpi.org/software/ompi/v1.3/
> Jody
> 
> 
> 
> On Wed, Jul 22, 2009 at 9:17 AM, Alexey Sokolov wrote:
> > Hi
> >
> > I faced a warning "declaration ‘struct MPI::Grequest_intercept_t’ does
> > not declare anything" using openmpi 1.2.4 (compiling under Fedora 10
> > with mpic++ wrapper over gcc 4.3.2) and don't know how to solve it.
> > Browsing the Internet i've found an advise just to ignore it, but i
> > don't think it is impossible to solve it in another way.
> >
> > I have a correct working single thread program. Then i just include
> > mpi.h, compile and get this:
> >
> >In file included
> >from 
> > /usr/include/openmpi/1.2.4-gcc/openmpi/ompi/mpi/cxx/mpicxx.h:246,
> > from /usr/include/openmpi/1.2.4-gcc/mpi.h:1783,
> >
> >from 
> > /home/user/NetBeansProjects/Correlation_orig/Correlation/Correlation.cpp:2:
> >
> > /usr/include/openmpi/1.2.4-gcc/openmpi/ompi/mpi/cxx/request_inln.h:347: 
> > warning: declaration ‘struct MPI::Grequest_intercept_t’ does not declare 
> > anything
> >
> > The program is still works correctly but this warning makes me nervous.
> >
> > Sincerely yours, Alexey.
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Help: HPL Compiled Problem

2009-07-22 Thread Daniël Mantione


On Wed, 22 Jul 2009, Lee Amy wrote:

> Thanks. I have use your Makefile to recompile. However, I still
> encounter some odd problem.
> 
> I have attached the make output and Makefile.

I see nothing wrong with the make output?

Daniël Mantione

Re: [OMPI users] Help: HPL Compiled Problem

2009-07-22 Thread Lee Amy
On Wed, Jul 22, 2009 at 4:41 PM, Daniël
Mantione wrote:
>
>
> On Wed, 22 Jul 2009, Lee Amy wrote:
>
>> Thanks. I have use your Makefile to recompile. However, I still
>> encounter some odd problem.
>>
>> I have attached the make output and Makefile.
>
> I see nothing wrong with the make output?
>
> Daniël Mantione
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
Thanks, I have solved the problem. It turned out to be an issue in the
GotoBLAS library.

Thank you very much,

Amy



Re: [OMPI users] Network connection check

2009-07-22 Thread Jeff Squyres
I'm not sure what you mean.  Open MPI uses the hostname of the machine
for general identification purposes.  That may or may not be the same
as the resolved name that comes back for a given IP interface.


What are you trying to check, exactly?


On Jul 16, 2009, at 1:56 AM, vipin kumar wrote:


Hi all,

Is there any way to check network connection using HostName in  
OpenMPI ?



Thanks and Regards,
--
Vipin K.
Research Engineer,
C-DOTB, India
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] Network connection check

2009-07-22 Thread vipin kumar
Hi Jeff,

Thanks for your response.

Actually, the requirement is that a C/C++ program running on the "master"
node should find out whether a "slave" node is reachable or not (the way
we would check it with the "ping" command). Because the IP address may
change at any time, I am trying to achieve this using the "host name" of
the "slave" node. How can this be done?


Thanks & Regards,

On Wed, Jul 22, 2009 at 6:54 PM, Jeff Squyres  wrote:

> I'm not sure what you mean.  Open MPI uses the hostname of the machine for
> general identification purposes.  That may be the same (or not) from the
> resolved name that comes back for a given IP interface.
>
> What are you trying to check, exactly?
>
>
>
> On Jul 16, 2009, at 1:56 AM, vipin kumar wrote:
>
>  Hi all,
>>
>> Is there any way to check network connection using HostName in OpenMPI ?
>>
>>
>> Thanks and Regards,
>> --
>> Vipin K.
>> Research Engineer,
>> C-DOTB, India
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Vipin K.
Research Engineer,
C-DOTB, India


Re: [OMPI users] Network connection check

2009-07-22 Thread Jeff Squyres

On Jul 22, 2009, at 10:05 AM, vipin kumar wrote:

Actually requirement is how a C/C++ program running in "master" node  
should find out whether "slave" node is reachable (as we check this  
using "ping" command) or not ? Because IP address may change at any  
time, that's why I am trying to achieve this using "host name" of  
the "slave" node. How this can be done?



Are you asking to find out this information before issuing "mpirun"?   
Open MPI does assume that the nodes you are trying to use are reachable.
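
If the check has to happen from inside a C/C++ program (as asked above),
a minimal sketch -- nothing Open MPI specific, and it assumes the slave
node runs some TCP service that can be probed, sshd on port 22 here --
would be:

/* reach.c - resolve a hostname and try a TCP connection to decide
 * whether the node is currently reachable. Hypothetical example. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

int host_reachable(const char *host, const char *port)
{
    struct addrinfo hints, *res, *rp;
    int reachable = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_UNSPEC;     /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;   /* TCP */
    if (getaddrinfo(host, port, &hints, &res) != 0)
        return 0;                      /* name did not resolve */
    for (rp = res; rp != NULL && !reachable; rp = rp->ai_next) {
        int fd = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, rp->ai_addr, rp->ai_addrlen) == 0)
            reachable = 1;             /* TCP handshake succeeded */
        close(fd);
    }
    freeaddrinfo(res);
    return reachable;
}

int main(int argc, char **argv)
{
    const char *host = (argc > 1) ? argv[1] : "node101";
    printf("%s is %sreachable\n", host,
           host_reachable(host, "22") ? "" : "not ");
    return 0;
}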


--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] [Open MPI Announce] Open MPI v1.3.3 released

2009-07-22 Thread Jeff Squyres

On Jul 20, 2009, at 9:03 AM, Dave Love wrote:


> Hmmm...there should be messages on both the user and devel lists
> regarding binary compatibility at the MPI level being promised for
> 1.3.2 and beyond.

This is confusing.  As I read the quotes below, recompilation is
necessary, and the announcement has items which suggest at least some
of the ABI has changed.



The MPI ABI has not changed since 1.3.2.  We started making MPI ABI
promises with v1.3.2 -- so any version prior to that (including 1.3.0
and 1.3.1) is not guaranteed to be ABI compatible.  To be clear: you
should be able to mpicc/mpif77/etc. an MPI application with Open MPI  
v1.3.2 and then be able to run it against an Open MPI v1.3.3  
installation (e.g., change your LD_LIBRARY_PATH to point to an OMPI  
v1.3.3 installation).
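
In practice (the install paths here are hypothetical) that workflow
looks something like:

# build against the 1.3.2 installation
/opt/openmpi-1.3.2/bin/mpicc -o hello hello.c

# later, run the same binary against a 1.3.3 installation
export LD_LIBRARY_PATH=/opt/openmpi-1.3.3/lib:$LD_LIBRARY_PATH
/opt/openmpi-1.3.3/bin/mpirun -np 4 ./hello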


Note that our internal API's are *not* guaranteed to be ABI compatible  
between releases (we try hard to keep them stable between releases in  
a single series, but it doesn't always work).  We're only providing an  
ABI guarantee for the official MPI API.



Could the promise also specify that future ABI changes will result in
ELF version changes to avoid any more of the mess with the 1.2 and 1.3
libraries wrongly appearing as compatible to the dynamic linker?  It
should just be a question of managing changes and doing the right
thing with libtool.



Yes, we should.  This issue has come up before, but it's gotten  
muddied by some other (uninteresting) technical issues.  I'll bring it  
up again with the rest of the developers.


--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] ifort and gfortran module

2009-07-22 Thread Jeff Squyres

On Jul 20, 2009, at 9:09 AM, Dave Love wrote:


> you should compile openmpi with each of intel and gfortran separately
> and install each of them in a separate location, and use mpi-selector
> to select one.

What, precisely, requires that, at least if you can recompile the MPI
program with appropriate options?  (Presumably it's features of the
Fortran/C interfacing and/or Fortran runtime, but the former may be
influenced by compilation options, and I'd hope the glue didn't
require the compiler runtime -- the Intel compiler is on the list to
check.)



See https://svn.open-mpi.org/source/xref/ompi_1.3/README#257.

It's obviously of interest to those of us facing combinatorial
explosion of libraries we're expected to install.



Indeed.  In OMPI, we tried to make this as simple as possible.  But  
unless you use specific compiler options to hide their differences, it  
isn't possible and is beyond our purview to fix.  :-(  (similar  
situation with the C++ bindings)



Also, is there any reason to use mpi-selector rather than switcher?




Nope -- they do about the same thing.
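
Either way, the day-to-day usage is similar; with mpi-selector (the
registered name below is hypothetical) it is along the lines of:

mpi-selector --list                   # show the registered MPI stacks
mpi-selector --set openmpi-1.3.3-gcc  # pick the default for new logins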

--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] ifort and gfortran module

2009-07-22 Thread Jeff Squyres

Yep, that works.

I'm glad that our txt files and "look at argv[0]" scheme was useful in  
the real world!  (we designed it with uses almost exactly like this in  
mind)



On Jul 20, 2009, at 1:47 PM, Martin Siegert wrote:


Hi,

I want to avoid separate MPI distributions since we compile many
MPI software packages. Having more than one MPI distribution
(at least) doubles the amount of work.

For now I came up with the following solution:

1. compile openmpi using gfortran as the Fortran compiler
   and install it in /usr/local/openmpi
2. move the Fortran module to the directory
   /usr/local/openmpi/include/gfortran. In that directory
   create softlinks to the files in /usr/local/openmpi/include.
3. compile openmpi using ifort and install the Fortran module in
   /usr/local/openmpi/include.
4. in /usr/local/openmpi/bin create softlinks mpif90.ifort
   and mpif90.gfortran pointing to opal_wrapper. Remove the
   mpif90 softlink.
5. Move /usr/local/openmpi/share/openmpi/mpif90-wrapper-data.txt
   to /usr/local/openmpi/share/openmpi/mpif90.ifort-wrapper-data.txt.
   Copy the file to
   /usr/local/openmpi/share/openmpi/mpif90.gfortran-wrapper-data.txt
   and change the line includedir=${includedir} to
   includedir=${includedir}/gfortran
6. Create a wrapper script /usr/local/openmpi/bin/mpif90:

#!/bin/bash
# Dispatch to the compiler-specific wrapper chosen via $OMPI_FC:
# mpif90.gfortran when OMPI_FC names gfortran, mpif90.ifort otherwise.
OMPI_WRAPPER_FC=`basename $OMPI_FC 2> /dev/null`
if [ "$OMPI_WRAPPER_FC" = 'gfortran' ]; then
   exec $0.gfortran "$@"
else
   exec $0.ifort "$@"
fi
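
With this in place, selecting the module/compiler pair from a user's
shell is just a matter of setting OMPI_FC, e.g. (source file name is
hypothetical):

export OMPI_FC=gfortran   # picks mpif90.gfortran and the gfortran mpi.mod
mpif90 -c my_prog.f90
export OMPI_FC=ifort      # or leave OMPI_FC unset: falls through to ifort
mpif90 -c my_prog.f90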

The reason we use gfortran in step 1 is that otherwise you get those
irritating error messages from the Intel libraries, cf.
http://www.open-mpi.org/faq/?category=building#intel-compiler-wrapper-compiler-warnings

Cheers,
Martin

--
Martin Siegert
Head, Research Computing
WestGrid Site Lead
IT Services                phone: 778 782-4691
Simon Fraser University    fax:   778 782-4242
Burnaby, British Columbia  email: sieg...@sfu.ca
Canada  V5A 1S6

On Sat, Jul 18, 2009 at 10:03:50AM +0330, rahmani wrote:
> Hi,
> you should compile openmpi with each of intel and gfortran separately
> and install each of them in a separate location, and use mpi-selector
> to select one.
> if you don't use mpi-selector, use the full path of the compiler (for
> example /usr/local/openmpi/intel/bin/mpif90) and add the
> corresponding library to your LD_LIBRARY_PATH

> Mahdi Rahmani
>
> - Original Message -
> From: "Jim Kress" 
> To: "Open MPI Users" 
> Sent: Saturday, July 18, 2009 5:43:20 AM (GMT+0330) Asia/Tehran
> Subject: Re: [OMPI users] ifort and gfortran module
>
> Why not generate an ifort version with a prefix of <prefix you want for
> openmpi>_intel and the gfortran version with a prefix of <prefix for
> openmpi>_gcc?
>
> That's what I do and then use mpi-selector to switch between versions
> as required.
>
> Jim
>
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-bounces@open-mpi.org]
> On Behalf Of Martin Siegert
> Sent: Friday, July 17, 2009 3:29 PM
> To: Open MPI Users
> Subject: [OMPI users] ifort and gfortran module
>
> Hi,
>
> I am wondering whether it is possible to support both the Intel
> compiler ifort and gfortran within a single compiled version of
> openmpi.
> E.g.,
> 1. compile openmpi ifort as the Fortran compiler and install it
>in /usr/local/openmpi-1.3.3
> 2. compile openmpi using gfortran, but do not install it; only
>copy mpi.mod to /usr/local/openmpi-1.3.3/include/gfortran
>
> Is there a way to cause mpif90 to include
> /usr/local/openmpi-1.3.3/include/gfortran
> before including /usr/local/openmpi-1.3.3/include if OMPI_FC is
> set to gfortran (more precisely if `basename $OMPI_FC` = gfortran)?
>
> Or is there another way of accomplishing this?
>
> Cheers,
> Martin
>
> --
> Martin Siegert
> Head, Research Computing
> WestGrid Site Lead
> IT Servicesphone: 778 782-4691
> Simon Fraser Universityfax:   778 782-4242
> Burnaby, British Columbia  email: sieg...@sfu.ca
> Canada  V5A 1S6
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] ifort and gfortran module

2009-07-22 Thread Jeff Squyres

On Jul 22, 2009, at 1:37 PM, Jeff Squyres (jsquyres) wrote:


Yep, that works.



I should clarify -- that *probably* works.

The .mod files are essentially precompiled headers.  Assuming that all
the data types and sizes are the same between gfortran and ifort, you  
should be ok.  Many of OMPI's F90 functions are implemented by  
directly calling the back-end F77 functions, but some of them have  
thin F90 wrappers before calling the back-end F77 functions.


If the calling conventions, parameter sizes, and constant values (see  
that README that I cited earlier in this thread) are all the same,  
then you should be ok using a single back-end libmpi_f77 and  
libmpi_f90 with 2 different .mod files.  But this is not something I  
have tested extensively, so I can't give you a definite "this will  
always work" ruling.


I *think* that there are compiler flags that you can use with ifort to  
make it behave similarly to gfortran in terms of sizes and constant  
values, etc.  These may or may not be necessary...?  You might want to  
check into this, but I'd be interested to hear what you find.


--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] Open-MPI-1.3.2 compatibility with old torque?

2009-07-22 Thread Ralph Castain

mpirun --display-allocation --display-map

Run a batch job that just prints out $PBS_NODEFILE. I'll bet that it  
isn't what we are expecting, and that the problem comes from it.


In a Torque environment, we read that file to get the list of nodes  
and #slots/node that are allocated to your job. We then filter that  
through any hostfile you provide. So all the nodes have to be in the  
$PBS_NODEFILE, which has to be in the expected format.
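
A trivial job to check this (the resource request below is just an
example) would be:

#!/bin/sh
#PBS -l nodes=2:ppn=4
#PBS -l walltime=00:01:00
echo "nodefile is $PBS_NODEFILE"
cat $PBS_NODEFILE

With Torque the output is normally one hostname per allocated slot,
e.g. node0005 repeated four times followed by node0006 repeated four
times; anything substantially different would explain why we mis-parse
the allocation.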


I'm a little suspicious, though, because of your reported error. It  
sounds like we are indeed trying to launch a daemon on a known node. I  
can only surmise a couple of possible reasons for the failure:


1. this is a node that is not allocated for your use. Was node0006 in  
your allocation?? If not, then the launch would fail. This would  
indicate we are not parsing the nodefile correctly.


2. if the node is in your allocation, then I would wonder if you have  
a TCP connection between that node and the one where mpirun exists. Is  
there a firewall in the way? Or something that would preclude a  
connection? Frankly, I doubt this possibility because it works when  
run manually.


My money is on option #1. :-)

If it is #1 and you send me a copy of a sample $PBS_NODEFILE on your  
system, I can create a way to parse it so we can provide support for  
that older version.


Ralph


On Jul 21, 2009, at 4:44 PM, Song, Kai Song wrote:


Hi Ralph,

Thanks a lot for the fast response.

Could you give me more instructions on which command I should put
"--display-allocation" and "--display-map" with? mpirun? ./configure? ...

Also, we have tested that in our PBS script, if we request node=1, the
helloworld program works. But when I request node=2 or more, it hangs
until timeout, and the error message is something like:

node0006 - daemon did not report back when launched

However, if we don't go through the scheduler and run MPI manually,
everything works fine:

/home/software/ompi/1.3.2-pgi/bin/mpirun -machinefile ./nodes -np 16 ./a.out


What do you think the problem might be? It's not a network issue,
because running MPI manually works. That is why we question the
torque compatibility.


Thanks again,

Kai


Kai Song
 1.510.486.4894
High Performance Computing Services (HPCS) Intern
Lawrence Berkeley National Laboratory - http://scs.lbl.gov


- Original Message -
From: Ralph Castain 
Date: Tuesday, July 21, 2009 12:12 pm
Subject: Re: [OMPI users] Open-MPI-1.3.2 compatibility with old  
torque?

To: Open MPI Users 

I'm afraid I have no idea - I've never seen a Torque version that old,
however, so it is quite possible that we don't work with it. It also
looks like it may have been modified (given the p2-aspen3 on the end),
so I have no idea how the system would behave.

First thing you could do is verify that the allocation is being read
correctly. Add a --display-allocation to the cmd line and see what we
think Torque gave us. Then add --display-map to see where it plans to
place the processes.

If all that looks okay, and if you allow ssh, then try -mca plm rsh on
the cmd line and see if that works.

HTH
Ralph


On Tue, Jul 21, 2009 at 12:57 PM, Song, Kai Song 
wrote:

Hi All,

I am building open-mpi-1.3.2 on centos-3.4, with torque-1.1.0p2-aspen3
and myrinet. I compiled it just fine with this configuration:

./configure --prefix=/home/software/ompi/1.3.2-pgi --with-gm=/usr/local/ \
  --with-gm-libdir=/usr/local/lib64/ --enable-static --disable-shared \
  --with-tm=/usr/ --without-threads CC=pgcc CXX=pgCC FC=pgf90 F77=pgf77 \
  LDFLAGS=-L/usr/lib64/torque/


However, when I submit jobs for 2 or more nodes through the torque
scheduler, the jobs just hang there. They show the RUN state, but there
is no communication between the nodes, and the jobs then die with a
timeout.

We have confirmed that the myrinet is working because our lam-mpi-7.1
works just fine. We are having a really hard time determining the
causes of this problem. So, we suspect it's because our torque is too
old.

What is the lowest version requirement of torque for open-mpi-1.3.2?
The README file didn't specify this detail. Does anyone know more
about it?


Thanks in advance,

Kai

Kai Song
 1.510.486.4894
High Performance Computing Services (HPCS) Intern
Lawrence Berkeley National Laboratory - http://scs.lbl.gov

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users







[OMPI users] Tuned collectives: How to choose them dynamically? (-mca coll_tuned_dynamic_rules_filename dyn_rules)"

2009-07-22 Thread Gus Correa

Dear OpenMPI experts

I would like to experiment with the OpenMPI tuned collectives,
hoping to improve the performance of some programs we run
in production mode.

However, I could not find any documentation on how to select the
different collective algorithms and other parameters.
In particular, I would love to read an explanation clarifying
the syntax and meaning of the lines in the "dyn_rules"
file that is passed to
"-mca coll_tuned_dynamic_rules_filename ./dyn_rules".

Recently there was an interesting discussion on the list
about this topic.  It showed that choosing the right collective
algorithm can make a big difference in overall performance:

http://www.open-mpi.org/community/lists/users/2009/05/9355.php
http://www.open-mpi.org/community/lists/users/2009/05/9399.php
http://www.open-mpi.org/community/lists/users/2009/05/9401.php
http://www.open-mpi.org/community/lists/users/2009/05/9419.php

However, the thread was concentrated on "MPI_Alltoall".
Nothing was said about other collective functions.
Not much was said about the
"tuned collective dynamic rules" file syntax,
the meaning of its parameters, etc.

Is there any source of information about that which I missed?
Thank you for any pointers or clarifications.
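
For reference, the tuned component is normally driven through MCA
parameters on the mpirun command line; assuming the parameter names
below are still current, usage looks roughly like:

# list the tuned-collective parameters and their registered algorithms
ompi_info --param coll tuned

# force a particular alltoall algorithm, or hand over a rules file
mpirun --mca coll_tuned_use_dynamic_rules 1 \
       --mca coll_tuned_alltoall_algorithm 3 -np 16 ./a.out
mpirun --mca coll_tuned_use_dynamic_rules 1 \
       --mca coll_tuned_dynamic_rules_filename ./dyn_rules -np 16 ./a.out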

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-