Re: [OMPI users] Problem launching onto Bourne shell

2008-10-07 Thread Jeff Squyres

On Oct 7, 2008, at 4:19 PM, Hahn Kim wrote:

you probably want to set the LD_LIBRARY_PATH (and PATH, likely, and  
possibly others, such as that LICENSE key, etc.) regardless of  
whether it's an interactive or non-interactive login.


Right, that's exactly what I want to do.  I was hoping that mpirun  
would run .profile as the FAQ page stated, but the -x fix works for  
now.


If you're using Bash, it should be running .bashrc.  But it looks like  
you did identify a bug that we're *not* running .profile.  I have a  
Mercurial branch up with a fix if you want to give it a spin:


http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/sh-profile-fixes/
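A rough sketch of trying such a branch, assuming Mercurial and the usual Open MPI developer-tree build steps (the autogen.sh step and the install prefix here are assumptions, not taken from the message):

  hg clone http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/sh-profile-fixes/ sh-profile-fixes
  cd sh-profile-fixes
  ./autogen.sh                                  # developer tree: needs recent autotools
  ./configure --prefix=$HOME/ompi-sh-profile-fixes
  make all install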

I just realized that I'm using .bash_profile on the x86 and need to  
move its contents into .bashrc and call .bashrc from .bash_profile,  
since eventually I will also be launching MPI jobs onto other x86  
processors.
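A minimal sketch of that arrangement, assuming the standard Bash startup-file names (the PATH entry is only an example of what would move into .bashrc):

  # ~/.bash_profile -- read by login shells only; just hand off to .bashrc
  if [ -f "$HOME/.bashrc" ]; then
      . "$HOME/.bashrc"
  fi

  # ~/.bashrc -- read by interactive non-login shells (and sourced above),
  # so settings kept here are seen in both cases
  export PATH=/tools/openmpi-1.2.5/bin:$PATH    # example entry only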


Thanks to everyone for their help.

Hahn

On Oct 7, 2008, at 2:16 PM, Jeff Squyres wrote:


On Oct 7, 2008, at 12:48 PM, Hahn Kim wrote:


Regarding 1., we're actually using 1.2.5.  We started using Open MPI
last winter and just stuck with it.  For now, using the -x flag with
mpirun works.  If this really is a bug in 1.2.7, then I think we'll
stick with 1.2.5 for now, then upgrade later when it's fixed.


It looks like this behavior has been the same throughout the entire
1.2 series.


Regarding 2., are you saying I should run the commands you suggest
from the x86 node running bash, so that ssh logs into the Cell node
running Bourne?


I'm saying that if "ssh othernode env" gives different answers than
"ssh othernode"/"env", then your .bashrc or .profile or whatever is
dumping out early depending on whether you have an interactive login
or not.  This is the real cause of the error -- you probably want to
set the LD_LIBRARY_PATH (and PATH, likely, and possibly others, such
as that LICENSE key, etc.) regardless of whether it's an interactive
or non-interactive login.
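A minimal sketch of what that looks like in a startup file (.profile or .bashrc), assuming the file currently does its work only for interactive logins; the paths and license key come from the environment shown below, and the PS1 line is just a placeholder for interactive-only settings:

  # ~/.profile (or ~/.bashrc) -- sketch only
  # Environment that every shell needs, interactive or not:
  LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
  MCS_LICENSE_PATH=/opt/MultiCorePlus/mcf.key
  PATH=/tools/openmpi-1.2.5/bin:$PATH
  export LD_LIBRARY_PATH MCS_LICENSE_PATH PATH

  # Interactive-only settings (prompts, aliases, ...) go inside the guard:
  case $- in
      *i*)
          PS1='cell$ '
          ;;
  esac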



When I run "ssh othernode env" from the x86 node, I get the
following vanilla environment:

USER=ha17646
HOME=/home/ha17646
LOGNAME=ha17646
SHELL=/bin/sh
PWD=/home/ha17646

When I run "ssh othernode" from the x86 node, then run "env" on the
Cell, I get the following:

USER=ha17646
LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
HOME=/home/ha17646
MCS_LICENSE_PATH=/opt/MultiCorePlus/mcf.key
LOGNAME=ha17646
TERM=xterm-color
PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/tools/openmpi-1.2.5/bin:/tools/cmake-2.4.7/bin:/tools
SHELL=/bin/sh
PWD=/home/ha17646
TZ=EST5EDT

Hahn

On Oct 7, 2008, at 12:07 PM, Jeff Squyres wrote:


Ralph and I just talked about this a bit:

1. In all released versions of OMPI, we *do* source the .profile file
on the target node if it exists (because vanilla Bourne shells do not
source anything on remote nodes -- Bash does, though, per the FAQ).
However, looking in 1.2.7, it looks like it might not be executing
that code -- there *may* be a bug in this area.  We're checking
into it.

2. You might want to check your configuration to see if your .bashrc
is dumping out early because it's a non-interactive shell.  Check the
output of:

ssh othernode env
vs.
ssh othernode
env

(i.e., a non-interactive running of "env" vs. an interactive login
and
running "env")



On Oct 7, 2008, at 8:53 AM, Ralph Castain wrote:


I am unaware of anything in the code that would "source .profile"
for you. I believe the FAQ page is in error here.

Ralph

On Oct 6, 2008, at 7:47 PM, Hahn Kim wrote:

Great, that worked, thanks!  However, it still concerns me that the
FAQ page says that mpirun will execute .profile which doesn't seem
to work for me.  Are there any configuration issues that could
possibly be preventing mpirun from doing this?  It would certainly
be more convenient if I could maintain my environment in a
single .profile file instead of adding what could potentially be a
lot of -x arguments to my mpirun command.

Hahn

On Oct 6, 2008, at 5:44 PM, Aurélien Bouteiller wrote:

You can forward your local env with mpirun -x LD_LIBRARY_PATH. As an
alternative you can set specific values with mpirun -x
LD_LIBRARY_PATH=/some/where:/some/where/else. More information with
mpirun --help (or man mpirun).
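As a concrete illustration using the library path and host name quoted elsewhere in this thread (./my_app stands in for the real binary):

  mpirun -np 1 --host cab0 \
      -x LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32 \
      ./my_app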

Aurelien



Le 6 oct. 08 à 16:06, Hahn Kim a écrit :


Hi,

I'm having difficulty launching an Open MPI job onto a machine
that
is running the Bourne shell.

Here's my basic setup.  I have two machines, one is an x86-based
machine running bash and the other is a Cell-based machine running
Bourne shell.  I'm running mpirun from the x86 machine, which
launches a C++ MPI application onto the Cell machine.  I get the
following error:

error while loading shared libraries: libstdc++.so.6: cannot open
shared object file: No such file or directory

The basic problem is that LD_LIBRARY_PATH needs to be set to the
directory that contains libstdc++.so.6 for the Cell.  I set the
following line in .profile:

export 

Re: [OMPI users] Problem launching onto Bourne shell

2008-10-07 Thread Hahn Kim
you probably want to set the LD_LIBRARY_PATH (and PATH, likely, and  
possibly others, such as that LICENSE key, etc.) regardless of  
whether it's an interactive or non-interactive login.



Right, that's exactly what I want to do.  I was hoping that mpirun  
would run .profile as the FAQ page stated, but the -x fix works for now.


I just realized that I'm using .bash_profile on the x86 and need to  
move its contents into .bashrc and call .bashrc from .bash_profile,  
since eventually I will also be launching MPI jobs onto other x86  
processors.


Thanks to everyone for their help.

Hahn

On Oct 7, 2008, at 2:16 PM, Jeff Squyres wrote:


On Oct 7, 2008, at 12:48 PM, Hahn Kim wrote:


Regarding 1., we're actually using 1.2.5.  We started using Open MPI
last winter and just stuck with it.  For now, using the -x flag with
mpirun works.  If this really is a bug in 1.2.7, then I think we'll
stick with 1.2.5 for now, then upgrade later when it's fixed.


It looks like this behavior has been the same throughout the entire
1.2 series.


Regarding 2., are you saying I should run the commands you suggest
from the x86 node running bash, so that ssh logs into the Cell node
running Bourne?


I'm saying that if "ssh othernode env" gives different answers than
"ssh othernode"/"env", then your .bashrc or .profile or whatever is
dumping out early depending on whether you have an interactive login
or not.  This is the real cause of the error -- you probably want to
set the LD_LIBRARY_PATH (and PATH, likely, and possibly others, such
as that LICENSE key, etc.) regardless of whether it's an interactive
or non-interactive login.



When I run "ssh othernode env" from the x86 node, I get the
following vanilla environment:

USER=ha17646
HOME=/home/ha17646
LOGNAME=ha17646
SHELL=/bin/sh
PWD=/home/ha17646

When I run "ssh othernode" from the x86 node, then run "env" on the
Cell, I get the following:

USER=ha17646
LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
HOME=/home/ha17646
MCS_LICENSE_PATH=/opt/MultiCorePlus/mcf.key
LOGNAME=ha17646
TERM=xterm-color
PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/tools/openmpi-1.2.5/bin:/tools/cmake-2.4.7/bin:/tools
SHELL=/bin/sh
PWD=/home/ha17646
TZ=EST5EDT

Hahn

On Oct 7, 2008, at 12:07 PM, Jeff Squyres wrote:


Ralph and I just talked about this a bit:

1. In all released versions of OMPI, we *do* source the .profile file
on the target node if it exists (because vanilla Bourne shells do not
source anything on remote nodes -- Bash does, though, per the FAQ).
However, looking in 1.2.7, it looks like it might not be executing
that code -- there *may* be a bug in this area.  We're checking
into it.

2. You might want to check your configuration to see if your .bashrc
is dumping out early because it's a non-interactive shell.  Check the
output of:

ssh othernode env
vs.
ssh othernode
env

(i.e., a non-interactive running of "env" vs. an interactive login
and
running "env")



On Oct 7, 2008, at 8:53 AM, Ralph Castain wrote:


I am unaware of anything in the code that would "source .profile"
for you. I believe the FAQ page is in error here.

Ralph

On Oct 6, 2008, at 7:47 PM, Hahn Kim wrote:

Great, that worked, thanks!  However, it still concerns me that the
FAQ page says that mpirun will execute .profile which doesn't seem
to work for me.  Are there any configuration issues that could
possibly be preventing mpirun from doing this?  It would certainly
be more convenient if I could maintain my environment in a
single .profile file instead of adding what could potentially be a
lot of -x arguments to my mpirun command.

Hahn

On Oct 6, 2008, at 5:44 PM, Aurélien Bouteiller wrote:

You can forward your local env with mpirun -x LD_LIBRARY_PATH. As an
alternative you can set specific values with mpirun -x
LD_LIBRARY_PATH=/some/where:/some/where/else. More information with
mpirun --help (or man mpirun).

Aurelien



Le 6 oct. 08 à 16:06, Hahn Kim a écrit :


Hi,

I'm having difficulty launching an Open MPI job onto a machine
that
is running the Bourne shell.

Here's my basic setup.  I have two machines, one is an x86-based
machine running bash and the other is a Cell-based machine
running
Bourne shell.  I'm running mpirun from the x86 machine, which
launches a C++ MPI application onto the Cell machine.  I get the
following error:

error while loading shared libraries: libstdc++.so.6: cannot open
shared object file: No such file or directory

The basic problem is that LD_LIBRARY_PATH needs to be set to the
directory that contains libstdc++.so.6 for the Cell.  I set the
following line in .profile:

export LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32

which is the path to the PPC libraries for Cell.

Now if I log directly into the Cell machine and run the program
directly from the command line, I don't get the above error.  But
mpirun still fails, even after setting LD_LIBRARY_PATH in .profile.

As a sanity check, I did the following.  I ran the following
command

Re: [OMPI users] Problem launching onto Bourne shell

2008-10-07 Thread Jeff Squyres

On Oct 7, 2008, at 12:48 PM, Hahn Kim wrote:

Regarding 1., we're actually using 1.2.5.  We started using Open MPI  
last winter and just stuck with it.  For now, using the -x flag with  
mpirun works.  If this really is a bug in 1.2.7, then I think we'll  
stick with 1.2.5 for now, then upgrade later when it's fixed.


It looks like this behavior has been the same throughout the entire  
1.2 series.


Regarding 2., are you saying I should run the commands you suggest  
from the x86 node running bash, so that ssh logs into the Cell node  
running Bourne?


I'm saying that if "ssh othernode env" gives different answers than  
"ssh othernode"/"env", then your .bashrc or .profile or whatever is  
dumping out early depending on whether you have an interactive login  
or not.  This is the real cause of the error -- you probably want to  
set the LD_LIBRARY_PATH (and PATH, likely, and possibly others, such  
as that LICENSE key, etc.) regardless of whether it's an interactive  
or non-interactive login.




When I run "ssh othernode env" from the x86 node, I get the  
following vanilla environment:


USER=ha17646
HOME=/home/ha17646
LOGNAME=ha17646
SHELL=/bin/sh
PWD=/home/ha17646

When I run "ssh othernode" from the x86 node, then run "env" on the  
Cell, I get the following:


USER=ha17646
LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
HOME=/home/ha17646
MCS_LICENSE_PATH=/opt/MultiCorePlus/mcf.key
LOGNAME=ha17646
TERM=xterm-color
PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/tools/openmpi-1.2.5/bin:/tools/cmake-2.4.7/bin:/tools

SHELL=/bin/sh
PWD=/home/ha17646
TZ=EST5EDT

Hahn

On Oct 7, 2008, at 12:07 PM, Jeff Squyres wrote:


Ralph and I just talked about this a bit:

1. In all released versions of OMPI, we *do* source the .profile file
on the target node if it exists (because vanilla Bourne shells do not
source anything on remote nodes -- Bash does, though, per the FAQ).
However, looking in 1.2.7, it looks like it might not be executing
that code -- there *may* be a bug in this area.  We're checking  
into it.


2. You might want to check your configuration to see if your .bashrc
is dumping out early because it's a non-interactive shell.  Check the
output of:

ssh othernode env
vs.
ssh othernode
env

(i.e., a non-interactive running of "env" vs. an interactive login and
running "env")



On Oct 7, 2008, at 8:53 AM, Ralph Castain wrote:


I am unaware of anything in the code that would "source .profile"
for you. I believe the FAQ page is in error here.

Ralph

On Oct 6, 2008, at 7:47 PM, Hahn Kim wrote:


Great, that worked, thanks!  However, it still concerns me that the
FAQ page says that mpirun will execute .profile which doesn't seem
to work for me.  Are there any configuration issues that could
possibly be preventing mpirun from doing this?  It would certainly
be more convenient if I could maintain my environment in a
single .profile file instead of adding what could potentially be a
lot of -x arguments to my mpirun command.

Hahn

On Oct 6, 2008, at 5:44 PM, Aurélien Bouteiller wrote:


You can forward your local env with mpirun -x LD_LIBRARY_PATH. As an
alternative you can set specific values with mpirun -x
LD_LIBRARY_PATH=/some/where:/some/where/else. More information with
mpirun --help (or man mpirun).

Aurelien



Le 6 oct. 08 à 16:06, Hahn Kim a écrit :


Hi,

I'm having difficulty launching an Open MPI job onto a machine that
is running the Bourne shell.

Here's my basic setup.  I have two machines, one is an x86-based
machine running bash and the other is a Cell-based machine running
Bourne shell.  I'm running mpirun from the x86 machine, which
launches a C++ MPI application onto the Cell machine.  I get the
following error:

error while loading shared libraries: libstdc++.so.6: cannot open
shared object file: No such file or directory

The basic problem is that LD_LIBRARY_PATH needs to be set to the
directory that contains libstdc++.so.6 for the Cell.  I set the
following line in .profile:

export LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32

which is the path to the PPC libraries for Cell.

Now if I log directly into the Cell machine and run the program
directly from the command line, I don't get the above error.  But
mpirun still fails, even after setting LD_LIBRARY_PATH  
in .profile.


As a sanity check, I did the following.  I ran the following
command
from the x86 machine:

mpirun -np 1 --host cab0 env

which, among others things, shows me the following value:

LD_LIBRARY_PATH=/tools/openmpi-1.2.5/lib:

If I log into the Cell machine and run env directly from the
command
line, I get the following value:

LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32

So it appears that .profile gets sourced when I log in but not when
mpirun runs.

However, according to the OpenMPI FAQ 
(http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path
), mpirun is supposed to directly call .profile since Bourne shell
doesn't automatically call it for 

[OMPI users] OMPI link error with petsc 2.3.3

2008-10-07 Thread Terry Dontje

Yann,

How were you trying to link your code with PETSc?  Did you use the mpif90 or mpif77 wrappers, or were you using the cc or mpicc wrappers?  I ran some basic tests that exercise MPI_STATUS_IGNORE using mpif90 (and mpif77) and they work fine.  However, I was able to generate a similar error to yours when I tried to link things with the cc program.


If you are using cc to link could you possibly try to use mpif90 to link your 
code?
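To make that concrete, a hedged sketch of the two link styles (the object files, application name, and explicit library list are placeholders, not the actual build line):

  # linking with the plain C compiler -- Fortran symbols such as
  # mpi_fortran_status_ignore_ may be resolved against the wrong library:
  cc -o petsc_app main.o solver.o -L/opt/SUNWhpc/HPC8.0/lib/amd64 -lmpi -lmpi_f90

  # linking with the Fortran wrapper instead, which adds the MPI
  # libraries itself:
  mpif90 -o petsc_app main.o solver.o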

--td

List-Post: users@lists.open-mpi.org
Date: Tue, 07 Oct 2008 16:55:14 +0200
From: "Yann JOBIC" 
Subject: [OMPI users] OMPI link error with petsc 2.3.3
To: Open MPI Users 
Message-ID: <48eb7852.6070...@polytech.univ-mrs.fr>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hello,

I'm using openmpi 1.3r19400 (ClusterTools 8.0), with sun studio 12, and 
solaris 10u5


I've got this error when linking a PETSc code :
ld: warning: symbol `mpi_fortran_status_ignore_' has differing sizes:
   (file /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so value=0x8; file 
/opt/SUNWhpc/HPC8.0/lib/amd64/libmpi_f90.so value=0x14);

   /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so definition taken


Isn't it very strange ?

Have you got any idea on the way to solve it ?

Many thanks,

Yann

  




Re: [OMPI users] Problem launching onto Bourne shell

2008-10-07 Thread Hahn Kim

Thanks for the feedback.

Regarding 1., we're actually using 1.2.5.  We started using Open MPI  
last winter and just stuck with it.  For now, using the -x flag with  
mpirun works.  If this really is a bug in 1.2.7, then I think we'll  
stick with 1.2.5 for now, then upgrade later when it's fixed.


Regarding 2., are you saying I should run the commands you suggest  
from the x86 node running bash, so that ssh logs into the Cell node  
running Bourne?


When I run "ssh othernode env" from the x86 node, I get the following  
vanilla environment:


USER=ha17646
HOME=/home/ha17646
LOGNAME=ha17646
SHELL=/bin/sh
PWD=/home/ha17646

When I run "ssh othernode" from the x86 node, then run "env" on the  
Cell, I get the following:


USER=ha17646
LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32
HOME=/home/ha17646
MCS_LICENSE_PATH=/opt/MultiCorePlus/mcf.key
LOGNAME=ha17646
TERM=xterm-color
PATH=/usr/local/bin:/usr/bin:/sbin:/bin:/tools/openmpi-1.2.5/bin:/tools/cmake-2.4.7/bin:/tools

SHELL=/bin/sh
PWD=/home/ha17646
TZ=EST5EDT

Hahn

On Oct 7, 2008, at 12:07 PM, Jeff Squyres wrote:


Ralph and I just talked about this a bit:

1. In all released versions of OMPI, we *do* source the .profile file
on the target node if it exists (because vanilla Bourne shells do not
source anything on remote nodes -- Bash does, though, per the FAQ).
However, looking in 1.2.7, it looks like it might not be executing
that code -- there *may* be a bug in this area.  We're checking into  
it.


2. You might want to check your configuration to see if your .bashrc
is dumping out early because it's a non-interactive shell.  Check the
output of:

ssh othernode env
vs.
ssh othernode
env

(i.e., a non-interactive running of "env" vs. an interactive login and
running "env")



On Oct 7, 2008, at 8:53 AM, Ralph Castain wrote:


I am unaware of anything in the code that would "source .profile"
for you. I believe the FAQ page is in error here.

Ralph

On Oct 6, 2008, at 7:47 PM, Hahn Kim wrote:


Great, that worked, thanks!  However, it still concerns me that the
FAQ page says that mpirun will execute .profile which doesn't seem
to work for me.  Are there any configuration issues that could
possibly be preventing mpirun from doing this?  It would certainly
be more convenient if I could maintain my environment in a
single .profile file instead of adding what could potentially be a
lot of -x arguments to my mpirun command.

Hahn

On Oct 6, 2008, at 5:44 PM, Aurélien Bouteiller wrote:


You can forward your local env with mpirun -x LD_LIBRARY_PATH. As an
alternative you can set specific values with mpirun -x
LD_LIBRARY_PATH=/some/where:/some/where/else. More information with
mpirun --help (or man mpirun).

Aurelien



Le 6 oct. 08 à 16:06, Hahn Kim a écrit :


Hi,

I'm having difficulty launching an Open MPI job onto a machine that
is running the Bourne shell.

Here's my basic setup.  I have two machines, one is an x86-based
machine running bash and the other is a Cell-based machine running
Bourne shell.  I'm running mpirun from the x86 machine, which
launches a C++ MPI application onto the Cell machine.  I get the
following error:

error while loading shared libraries: libstdc++.so.6: cannot open
shared object file: No such file or directory

The basic problem is that LD_LIBRARY_PATH needs to be set to the
directory that contains libstdc++.so.6 for the Cell.  I set the
following line in .profile:

export LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32

which is the path to the PPC libraries for Cell.

Now if I log directly into the Cell machine and run the program
directly from the command line, I don't get the above error.  But
mpirun still fails, even after setting LD_LIBRARY_PATH  
in .profile.


As a sanity check, I did the following.  I ran the following
command
from the x86 machine:

mpirun -np 1 --host cab0 env

which, among others things, shows me the following value:

LD_LIBRARY_PATH=/tools/openmpi-1.2.5/lib:

If I log into the Cell machine and run env directly from the
command
line, I get the following value:

LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32

So it appears that .profile gets sourced when I log in but not when
mpirun runs.

However, according to the OpenMPI FAQ 
(http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path
), mpirun is supposed to directly call .profile since Bourne shell
doesn't automatically call it for non-interactive shells.

Does anyone have any insight as to why my environment isn't being
set properly?  Thanks!

Hahn

--
Hahn Kim, h...@ll.mit.edu
MIT Lincoln Laboratory
244 Wood St., Lexington, MA 02420
Tel: 781-981-0940, Fax: 781-981-5255






___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
* Dr. Aurélien Bouteiller
* Sr. Research Associate at Innovative Computing Laboratory
* University of Tennessee
* 1122 Volunteer Boulevard, suite 350
* Knoxville, TN 37996
* 865 

Re: [OMPI users] OMPI link error with petsc 2.3.3

2008-10-07 Thread Yann JOBIC

Terry Dontje wrote:

Yann,

I'll take a look at this; it looks like there definitely is an issue
between our libmpi.so and libmpi_f90.so files.


I noticed that the linkage message is a warning. Does the code actually
fail when running?


--td

Thanks for your fast answer.
No, the program is running and gives good results (so far, for some
small cases).

However, I don't know whether we'll see some strange behavior in some cases.

Yann


Date: Tue, 07 Oct 2008 16:55:14 +0200
From: "Yann JOBIC" 
Subject: [OMPI users] OMPI link error with petsc 2.3.3
To: Open MPI Users 
Message-ID: <48eb7852.6070...@polytech.univ-mrs.fr>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hello,

I'm using openmpi 1.3r19400 (ClusterTools 8.0), with sun studio 12, 
and solaris 10u5


I've got this error when linking a PETSc code :
ld: warning: symbol `mpi_fortran_status_ignore_' has differing sizes:
   (file /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so value=0x8; file 
/opt/SUNWhpc/HPC8.0/lib/amd64/libmpi_f90.so value=0x14);

   /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so definition taken


Isn't it very strange ?

Have you got any idea on the way to solve it ?

Many thanks,

Yann

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
___

Yann JOBIC
HPC engineer
Polytech Marseille DME
IUSTI-CNRS UMR 6595
Technopôle de Château Gombert
5 rue Enrico Fermi
13453 Marseille cedex 13
Tel : (33) 4 91 10 69 39
 ou  (33) 4 91 10 69 43
Fax : (33) 4 91 10 69 69 


Re: [OMPI users] Problem launching onto Bourne shell

2008-10-07 Thread Jeff Squyres

Ralph and I just talked about this a bit:

1. In all released versions of OMPI, we *do* source the .profile file  
on the target node if it exists (because vanilla Bourne shells do not  
source anything on remote nodes -- Bash does, though, per the FAQ).   
However, looking in 1.2.7, it looks like it might not be executing  
that code -- there *may* be a bug in this area.  We're checking into it.


2. You might want to check your configuration to see if your .bashrc  
is dumping out early because it's a non-interactive shell.  Check the  
output of:


ssh othernode env
vs.
ssh othernode
env

(i.e., a non-interactive running of "env" vs. an interactive login and  
running "env")




On Oct 7, 2008, at 8:53 AM, Ralph Castain wrote:

I am unaware of anything in the code that would "source .profile"  
for you. I believe the FAQ page is in error here.


Ralph

On Oct 6, 2008, at 7:47 PM, Hahn Kim wrote:

Great, that worked, thanks!  However, it still concerns me that the  
FAQ page says that mpirun will execute .profile which doesn't seem  
to work for me.  Are there any configuration issues that could  
possibly be preventing mpirun from doing this?  It would certainly  
be more convenient if I could maintain my environment in a  
single .profile file instead of adding what could potentially be a  
lot of -x arguments to my mpirun command.


Hahn

On Oct 6, 2008, at 5:44 PM, Aurélien Bouteiller wrote:

You can forward your local env with mpirun -x LD_LIBRARY_PATH. As an
alternative you can set specific values with mpirun -x
LD_LIBRARY_PATH=/some/where:/some/where/else. More information with
mpirun --help (or man mpirun).

Aurelien



Le 6 oct. 08 à 16:06, Hahn Kim a écrit :


Hi,

I'm having difficulty launching an Open MPI job onto a machine that
is running the Bourne shell.

Here's my basic setup.  I have two machines, one is an x86-based
machine running bash and the other is a Cell-based machine running
Bourne shell.  I'm running mpirun from the x86 machine, which
launches a C++ MPI application onto the Cell machine.  I get the
following error:

error while loading shared libraries: libstdc++.so.6: cannot open
shared object file: No such file or directory

The basic problem is that LD_LIBRARY_PATH needs to be set to the
directory that contains libstdc++.so.6 for the Cell.  I set the
following line in .profile:

export LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32

which is the path to the PPC libraries for Cell.

Now if I log directly into the Cell machine and run the program
directly from the command line, I don't get the above error.  But
mpirun still fails, even after setting LD_LIBRARY_PATH in .profile.

As a sanity check, I did the following.  I ran the following command
from the x86 machine:

mpirun -np 1 --host cab0 env

which, among others things, shows me the following value:

LD_LIBRARY_PATH=/tools/openmpi-1.2.5/lib:

If I log into the Cell machine and run env directly from the command
line, I get the following value:

LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32

So it appears that .profile gets sourced when I log in but not when
mpirun runs.

However, according to the OpenMPI FAQ 
(http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path
), mpirun is supposed to directly call .profile since Bourne shell
doesn't automatically call it for non-interactive shells.

Does anyone have any insight as to why my environment isn't being
set properly?  Thanks!

Hahn

--
Hahn Kim, h...@ll.mit.edu
MIT Lincoln Laboratory
244 Wood St., Lexington, MA 02420
Tel: 781-981-0940, Fax: 781-981-5255






___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
* Dr. Aurélien Bouteiller
* Sr. Research Associate at Innovative Computing Laboratory
* University of Tennessee
* 1122 Volunteer Boulevard, suite 350
* Knoxville, TN 37996
* 865 974 6321





___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
Hahn Kim
MIT Lincoln Laboratory   Phone: (781) 981-0940
244 Wood Street, S2-252  Fax: (781) 981-5255
Lexington, MA 02420  E-mail: h...@ll.mit.edu




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems




Re: [OMPI users] OMPI link error with petsc 2.3.3

2008-10-07 Thread Terry Dontje

Yann,

I'll take a look at this; it looks like there definitely is an issue between our
libmpi.so and libmpi_f90.so files.

I noticed that the linkage message is a warning. Does the code actually fail
when running?

--td

List-Post: users@lists.open-mpi.org
Date: Tue, 07 Oct 2008 16:55:14 +0200
From: "Yann JOBIC" 
Subject: [OMPI users] OMPI link error with petsc 2.3.3
To: Open MPI Users 
Message-ID: <48eb7852.6070...@polytech.univ-mrs.fr>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hello,

I'm using openmpi 1.3r19400 (ClusterTools 8.0), with sun studio 12, and 
solaris 10u5


I've got this error when linking a PETSc code :
ld: warning: symbol `mpi_fortran_status_ignore_' has differing sizes:
   (file /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so value=0x8; file 
/opt/SUNWhpc/HPC8.0/lib/amd64/libmpi_f90.so value=0x14);

   /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so definition taken


Isn't it very strange ?

Have you got any idea on the way to solve it ?

Many thanks,

Yann



Re: [OMPI users] OMPI link error with petsc 2.3.3

2008-10-07 Thread Rolf Vandevaart


This is strange.  We need to look into this a little more.  However, you 
may be OK as the warning says it is taking the value from libmpi.so 
which I believe is the correct one.  Does your program run OK?


Rolf

On 10/07/08 10:57, Doug Reeder wrote:

Yann,

It looks like somehow the libmpi and libmpi_f90 have different values 
for the variable mpi_fortran_status_ignore. It sounds like a configure 
problem. You might check the mpi include files to see if you can see 
where the different values are coming from.


Doug Reeder
On Oct 7, 2008, at 7:55 AM, Yann JOBIC wrote:


Hello,

I'm using openmpi 1.3r19400 (ClusterTools 8.0), with sun studio 12, 
and solaris 10u5


I've got this error when linking a PETSc code :
ld: warning: symbol `mpi_fortran_status_ignore_' has differing sizes:
   (file /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so value=0x8; file 
/opt/SUNWhpc/HPC8.0/lib/amd64/libmpi_f90.so value=0x14);

   /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so definition taken


Isn't it very strange ?

Have you got any idea on the way to solve it ?

Many thanks,

Yann
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--

=
rolf.vandeva...@sun.com
781-442-3043
=


[OMPI users] build failed using intel compilers on mac os x

2008-10-07 Thread Massimo Cafaro


openmpi build
Description: Binary data



Dear all,

I tried to build the latest v1.2.7 open-mpi version on Mac OS X 10.5.5  
using the intel c, c++ and fortran compilers v10.1.017 (the latest  
ones released by intel). Before starting the build I have properly  
configured the CC, CXX, F77 and FC environment variables (to icc and  
ifort). The build failed due to undefined symbols.


I am attaching a log of the failed build process.
Any clue? Am I doing something wrong?

Also, to build a 64-bit version, is it enough to supply the -m64 option
in the corresponding environment variables?
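One common way to do that is to put -m64 into the compiler flag variables passed to configure rather than into CC/CXX/F77/FC themselves; a hedged sketch, with example compiler names and an example prefix:

  ./configure CC=icc CXX=icpc F77=ifort FC=ifort \
      CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64 \
      --prefix=$HOME/openmpi-1.2.7-64
  make all install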

Thank you in advance and best regards,

Massimo


--

***

 Massimo Cafaro, Ph.D.  Additional  
affiliations:
 Assistant Professor National  
Nanotechnology Laboratory (NNL/CNR-INFM)
 Dept. of Engineering for Innovation Euro-Mediterranean  
Centre for Climate Change

 University of Salento, Lecce, ItalySPACI Consortium
 Via per Monteroni
 73100 Lecce, Italy
 Voice  +39 0832 297371
 Fax +39 0832 298173
 Web http://sara.unile.it/~cafaro
 E-mail massimo.caf...@unile.it
  caf...@cacr.caltech.edu

***





Re: [OMPI users] OMPI link error with petsc 2.3.3

2008-10-07 Thread Doug Reeder

Yann,

It looks like somehow the libmpi and libmpi_f90 have different values  
for the variable mpi_fortran_status_ignore. It sounds like a  
configure problem. You might check the mpi include files to see if  
you can see where the different values are coming from.
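A quick, portable way to search the headers of this install for the symbol (the install path is the one from the linker warning; the exact spelling may differ between the C and Fortran headers):

  find /opt/SUNWhpc/HPC8.0/include -type f | xargs grep -i mpi_fortran_status_ignore
  find /opt/SUNWhpc/HPC8.0/include -type f | xargs grep -i MPI_STATUS_IGNORE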


Doug Reeder
On Oct 7, 2008, at 7:55 AM, Yann JOBIC wrote:


Hello,

I'm using openmpi 1.3r19400 (ClusterTools 8.0), with sun studio 12,  
and solaris 10u5


I've got this error when linking a PETSc code :
ld: warning: symbol `mpi_fortran_status_ignore_' has differing sizes:
   (file /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so value=0x8;  
file /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi_f90.so value=0x14);

   /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so definition taken


Isn't it very strange ?

Have you got any idea on the way to solve it ?

Many thanks,

Yann
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




[OMPI users] OMPI link error with petsc 2.3.3

2008-10-07 Thread Yann JOBIC

Hello,

I'm using openmpi 1.3r19400 (ClusterTools 8.0), with sun studio 12, and 
solaris 10u5


I've got this error when linking a PETSc code :
ld: warning: symbol `mpi_fortran_status_ignore_' has differing sizes:
   (file /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so value=0x8; file 
/opt/SUNWhpc/HPC8.0/lib/amd64/libmpi_f90.so value=0x14);

   /opt/SUNWhpc/HPC8.0/lib/amd64/libmpi.so definition taken


Isn't it very strange ?

Have you got any idea on the way to solve it ?

Many thanks,

Yann


Re: [OMPI users] Problem launching onto Bourne shell

2008-10-07 Thread Ralph Castain
I am unaware of anything in the code that would "source .profile" for  
you. I believe the FAQ page is in error here.


Ralph

On Oct 6, 2008, at 7:47 PM, Hahn Kim wrote:

Great, that worked, thanks!  However, it still concerns me that the  
FAQ page says that mpirun will execute .profile which doesn't seem  
to work for me.  Are there any configuration issues that could  
possibly be preventing mpirun from doing this?  It would certainly  
be more convenient if I could maintain my environment in a  
single .profile file instead of adding what could potentially be a  
lot of -x arguments to my mpirun command.


Hahn

On Oct 6, 2008, at 5:44 PM, Aurélien Bouteiller wrote:


You can forward your local env with mpirun -x LD_LIBRARY_PATH. As an
alternative you can set specific values with mpirun -x
LD_LIBRARY_PATH=/some/where:/some/where/else. More information with
mpirun --help (or man mpirun).

Aurelien



Le 6 oct. 08 à 16:06, Hahn Kim a écrit :


Hi,

I'm having difficulty launching an Open MPI job onto a machine that
is running the Bourne shell.

Here's my basic setup.  I have two machines, one is an x86-based
machine running bash and the other is a Cell-based machine running
Bourne shell.  I'm running mpirun from the x86 machine, which
launches a C++ MPI application onto the Cell machine.  I get the
following error:

 error while loading shared libraries: libstdc++.so.6: cannot open
shared object file: No such file or directory

The basic problem is that LD_LIBRARY_PATH needs to be set to the
directory that contains libstdc++.so.6 for the Cell.  I set the
following line in .profile:

 export LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32

which is the path to the PPC libraries for Cell.

Now if I log directly into the Cell machine and run the program
directly from the command line, I don't get the above error.  But
mpirun still fails, even after setting LD_LIBRARY_PATH in .profile.

As a sanity check, I did the following.  I ran the following command
from the x86 machine:

 mpirun -np 1 --host cab0 env

which, among others things, shows me the following value:

 LD_LIBRARY_PATH=/tools/openmpi-1.2.5/lib:

If I log into the Cell machine and run env directly from the command
line, I get the following value:

 LD_LIBRARY_PATH=/opt/cell/toolchain/lib/gcc/ppu/4.1.1/32

So it appears that .profile gets sourced when I log in but not when
mpirun runs.

However, according to the OpenMPI FAQ 
(http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path
), mpirun is supposed to directly call .profile since Bourne shell
doesn't automatically call it for non-interactive shells.

Does anyone have any insight as to why my environment isn't being
set properly?  Thanks!

Hahn

--
Hahn Kim, h...@ll.mit.edu
MIT Lincoln Laboratory
244 Wood St., Lexington, MA 02420
Tel: 781-981-0940, Fax: 781-981-5255






___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
* Dr. Aurélien Bouteiller
* Sr. Research Associate at Innovative Computing Laboratory
* University of Tennessee
* 1122 Volunteer Boulevard, suite 350
* Knoxville, TN 37996
* 865 974 6321





___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
Hahn Kim
MIT Lincoln Laboratory   Phone: (781) 981-0940
244 Wood Street, S2-252  Fax: (781) 981-5255
Lexington, MA 02420  E-mail: h...@ll.mit.edu




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] OpenMPI with openib partitions

2008-10-07 Thread Jeff Squyres
FWIW, if this configuration is for all of your users, you might want  
to specify these MCA params in the default MCA param file, or the  
environment, ...etc.  Just so that you don't have to specify it on  
every mpirun command line.


See http://www.open-mpi.org/faq/?category=tuning#setting-mca-params.
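As a hedged illustration, using the values from the command line quoted below ($prefix/etc/openmpi-mca-params.conf is the usual location of the default param file; adjust for your install):

  # $prefix/etc/openmpi-mca-params.conf
  btl = openib,self
  btl_openib_of_pkey_val = 0x8109
  btl_openib_of_pkey_ix = 1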


On Oct 7, 2008, at 5:43 AM, Lenny Verkhovsky wrote:


Sorry, I misunderstood the question.

Thanks to Pasha, the right command line will be

-mca btl openib,self -mca btl_openib_of_pkey_val 0x8109 -mca  
btl_openib_of_pkey_ix 1


ex.

#mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca btl_openib_of_pkey_val 0x8001 -mca btl_openib_of_pkey_ix 1 ./mpi_p1_4_TRUNK -t lt

LT (2) (size min max avg) 1 3.443480 3.443480 3.443480


Best regards

Lenny.


On 10/6/08, Jeff Squyres wrote:

On Oct 5, 2008, at 1:22 PM, Lenny Verkhovsky wrote:


you should probably use -mca tcp,self  -mca btl_openib_if_include  
ib0.8109



Really?  I thought we only took OpenFabrics device names in the  
openib_if_include MCA param...?  It looks like ib0.8109 is an IPoIB  
device name.




Lenny.


On 10/3/08, Matt Burgess  wrote:
Hi,


I'm trying to get openmpi working over openib partitions. On this  
cluster, the partition number is 0x109. The ib interfaces are  
pingable over the appropriate ib0.8109 interface:


d2:/opt/openmpi-ib # ifconfig ib0.8109
ib0.8109  Link encap:UNSPEC  HWaddr 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00

 inet addr:10.21.48.2  Bcast:10.21.255.255  Mask:255.255.0.0
 inet6 addr: fe80::202:c902:26:ca01/64 Scope:Link
 UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
 RX packets:16811 errors:0 dropped:0 overruns:0 frame:0
 TX packets:15848 errors:0 dropped:1 overruns:0 carrier:0
 collisions:0 txqueuelen:256
 RX bytes:102229428 (97.4 Mb)  TX bytes:102324172 (97.5 Mb)


I have tried the following:

/opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca
btl openib,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val
0x8109 -mca btl_openib_ib_pkey_ix 1 /cluster/pallas/x86_64-ib/IMB-MPI1


but I just get a RETRY EXCEEDED ERROR. Is there a MCA parameter I am  
missing?


I was successful using tcp only:

/opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca  
btl tcp,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val  
0x8109 /cluster/pallas/x86_64-ib/IMB-MPI1




Thanks,
Matt Burgess

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
Cisco Systems


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
Jeff Squyres
Cisco Systems



Re: [OMPI users] OpenMPI with openib partitions

2008-10-07 Thread Pavel Shamis (Pasha)

Matt,
I guess that you have some problem with your partition configuration.
Can you share with us your partition configuration file (by default
opensm uses /etc/opensm/partitions.conf) and the GUIDs from your
machines (ibstat | grep GUID)?


Regards,
Pasha

Matt Burgess wrote:

Hi,


I'm trying to get openmpi working over openib partitions. On this 
cluster, the partition number is 0x109. The ib interfaces are pingable 
over the appropriate ib0.8109 interface:


d2:/opt/openmpi-ib # ifconfig ib0.8109
ib0.8109  Link encap:UNSPEC  HWaddr 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
  inet addr:10.21.48.2  Bcast:10.21.255.255  Mask:255.255.0.0
  inet6 addr: fe80::202:c902:26:ca01/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
  RX packets:16811 errors:0 dropped:0 overruns:0 frame:0
  TX packets:15848 errors:0 dropped:1 overruns:0 carrier:0
  collisions:0 txqueuelen:256
  RX bytes:102229428 (97.4 Mb)  TX bytes:102324172 (97.5 Mb)


I have tried the following:

/opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca 
btl openib,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 
0x8109 -mca btl_openib_ib_pkey_ix 1 /cluster/pallas/x86_64-ib/IMB-MPI1


but I just get a RETRY EXCEEDED ERROR. Is there a MCA parameter I am 
missing?


I was successful using tcp only:

/opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca 
btl tcp,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 
0x8109 /cluster/pallas/x86_64-ib/IMB-MPI1




Thanks,
Matt Burgess


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
--
Pavel Shamis (Pasha)
Mellanox Technologies LTD.



Re: [OMPI users] OpenMPI with openib partitions

2008-10-07 Thread Lenny Verkhovsky
Sorry, I misunderstood the question.

Thanks to Pasha, the right command line will be

-mca btl openib,self -mca btl_openib_of_pkey_val 0x8109 -mca
btl_openib_of_pkey_ix 1

ex.

#mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca
btl_openib_of_pkey_val 0x8001 -mca btl_openib_of_pkey_ix 1 ./mpi_p1_4_TRUNK
-t lt
LT (2) (size min max avg) 1 3.443480 3.443480 3.443480

Best regards

Lenny.

On 10/6/08, Jeff Squyres  wrote:
>
> On Oct 5, 2008, at 1:22 PM, Lenny Verkhovsky wrote:
>
>  you should probably use -mca tcp,self  -mca btl_openib_if_include ib0.8109
>>
>>
> Really?  I thought we only took OpenFabrics device names in the
> openib_if_include MCA param...?  It looks like ib0.8109 is an IPoIB device
> name.
>
>
>  Lenny.
>>
>>
>> On 10/3/08, Matt Burgess  wrote:
>> Hi,
>>
>>
>> I'm trying to get openmpi working over openib partitions. On this cluster,
>> the partition number is 0x109. The ib interfaces are pingable over the
>> appropriate ib0.8109 interface:
>>
>> d2:/opt/openmpi-ib # ifconfig ib0.8109
>> ib0.8109  Link encap:UNSPEC  HWaddr
>> 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
>>  inet addr:10.21.48.2  Bcast:10.21.255.255  Mask:255.255.0.0
>>  inet6 addr: fe80::202:c902:26:ca01/64 Scope:Link
>>  UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
>>  RX packets:16811 errors:0 dropped:0 overruns:0 frame:0
>>  TX packets:15848 errors:0 dropped:1 overruns:0 carrier:0
>>  collisions:0 txqueuelen:256
>>  RX bytes:102229428 (97.4 Mb)  TX bytes:102324172 (97.5 Mb)
>>
>>
>> I have tried the following:
>>
>> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
>> openib,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
>> -mca btl_openib_ib_pkey_ix 1 /cluster/pallas/x86_64-ib/IMB-MPI1
>>
>> but I just get a RETRY EXCEEDED ERROR. Is there a MCA parameter I am
>> missing?
>>
>> I was successful using tcp only:
>>
>> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
>> tcp,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
>> /cluster/pallas/x86_64-ib/IMB-MPI1
>>
>>
>>
>> Thanks,
>> Matt Burgess
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>