Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-06 Thread Josh Hursey
As an alternative technique for distributing the binary, you could ask  
Open MPI's runtime to do it for you (made available in the v1.3  
series). You still need to make sure that the same version of Open MPI is  
installed on all nodes, but if you pass the --preload-binary option to  
mpirun the runtime environment will distribute the binary across the  
machine (staging it to a temporary directory) before launching it.


You can do the same with any arbitrary set of files or directories  
(comma separated) using the --preload-files option as well.


If you type 'mpirun --help' the options that you are looking for are:

   --preload-files
   --preload-files-dest-dir
                         [...] with --preload-files. By default the
                         absolute and relative paths provided by
                         --preload-files are [...]
-s|--preload-binary      Preload the binary on the remote machine before
                         [...]
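
As a rough sketch of how these options might be combined with the machinefile and example binary used elsewhere in this thread: the two data file names below are made up for illustration, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: assemble an mpirun command line that asks the runtime to
# stage the binary, plus some extra files, onto the remote nodes.
MACHINEFILE=machine.linux          # machinefile name used in this thread
BINARY=./hello_c.out               # example binary used in this thread
EXTRA_FILES=input.dat,params.cfg   # hypothetical files, comma separated

CMD="mpirun --machinefile $MACHINEFILE -np 2 --preload-binary --preload-files $EXTRA_FILES $BINARY"

# Print the command for inspection; run it with: eval "$CMD"
echo "$CMD"
```

Open MPI itself still has to be installed on every node; only the application binary and the listed files are staged.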



-- Josh

On Nov 5, 2009, at 6:56 PM, Terry Frankcombe wrote:

> For small ad hoc COWs I'd vote for sshfs too.  It may well be as slow as
> a dog, but it actually has some security, unlike NFS, and is a doddle to
> make work with no superuser access on the server, unlike NFS.
>
> On Thu, 2009-11-05 at 17:53 -0500, Jeff Squyres wrote:
> > On Nov 5, 2009, at 5:34 PM, Douglas Guptill wrote:
> >
> > > I am currently using sshfs to mount both OpenMPI and my application on
> > > the "other" computers/nodes.  The advantage to this is that I have
> > > only one copy of OpenMPI and my application.  There may be a
> > > performance penalty, but I haven't seen it yet.
> >
> > For a small number of nodes (where small <=32 or sometimes even <=64),
> > I find that simple NFS works just fine.  If your apps aren't IO
> > intensive, that can greatly simplify installation and deployment of
> > both Open MPI and your MPI applications IMNSHO.
> >
> > But -- every app is different.  :-)  YMMV.

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Terry Frankcombe
For small ad hoc COWs I'd vote for sshfs too.  It may well be as slow as
a dog, but it actually has some security, unlike NFS, and is a doddle to
make work with no superuser access on the server, unlike NFS.


On Thu, 2009-11-05 at 17:53 -0500, Jeff Squyres wrote:
> On Nov 5, 2009, at 5:34 PM, Douglas Guptill wrote:
> 
> > I am currently using sshfs to mount both OpenMPI and my application on
> > the "other" computers/nodes.  The advantage to this is that I have
> > only one copy of OpenMPI and my application.  There may be a
> > performance penalty, but I haven't seen it yet.
> >
> 
> 
> For a small number of nodes (where small <=32 or sometimes even <=64),  
> I find that simple NFS works just fine.  If your apps aren't IO  
> intensive, that can greatly simplify installation and deployment of  
> both Open MPI and your MPI applications IMNSHO.
> 
> But -- every app is different.  :-)  YMMV.
> 



Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Jeff Squyres

On Nov 5, 2009, at 5:34 PM, Douglas Guptill wrote:

> I am currently using sshfs to mount both OpenMPI and my application on
> the "other" computers/nodes.  The advantage to this is that I have
> only one copy of OpenMPI and my application.  There may be a
> performance penalty, but I haven't seen it yet.

For a small number of nodes (where small <=32 or sometimes even <=64),  
I find that simple NFS works just fine.  If your apps aren't IO  
intensive, that can greatly simplify installation and deployment of  
both Open MPI and your MPI applications IMNSHO.


But -- every app is different.  :-)  YMMV.

--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Douglas Guptill
On Thu, Nov 05, 2009 at 03:15:33PM -0600, Qing Pang wrote:

> Thank you Jeff! That solves the problem. :-) You are the lifesaver!
> So does that mean I always need to copy my application to all the  
> nodes? Or should I give the pathname of my executable in a different  
> way to avoid this? Do I need a network file system for that?

I am currently using sshfs to mount both OpenMPI and my application on
the "other" computers/nodes.  The advantage to this is that I have
only one copy of OpenMPI and my application.  There may be a
performance penalty, but I haven't seen it yet.

Douglas.


Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Jeff Squyres

On Nov 5, 2009, at 4:15 PM, Qing Pang wrote:

> Thank you Jeff! That solves the problem. :-) You are the lifesaver!
> So does that mean I always need to copy my application to all the
> nodes? Or should I give the pathname of my executable in a different
> way to avoid this? Do I need a network file system for that?

Your executable needs to be available on all nodes, yes, whether you  
have copied it out there or whether you use a network filesystem.  For  
a small number of nodes, using a network filesystem is likely much  
more convenient.


See http://www.open-mpi.org/faq/?category=running#do-i-need-a-common-filesystem
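
For the copy-it-yourself route, a minimal sketch along these lines may help. It assumes a machinefile with one host per line (optionally followed by fields such as slots=N) and passwordless ssh/scp as described in this thread. The function names and the DRY_RUN switch are illustrative only and are not part of Open MPI.

```shell
#!/bin/sh
# list_hosts FILE: print the first field of each line that is neither
# blank nor a comment.
list_hosts() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# push_binary FILE MACHINEFILE: copy FILE into the same directory on each
# host, so that one absolute path is valid everywhere.
# With DRY_RUN=1 (the default here) the scp commands are only printed.
push_binary() {
    bin=$1
    dir=$(cd "$(dirname "$bin")" && pwd)
    for host in $(list_hosts "$2"); do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "scp $bin $host:$dir/"
        else
            ssh "$host" "mkdir -p '$dir'" && scp "$bin" "$host:$dir/"
        fi
    done
}

# Example: push_binary ./hello_c.out machine.linux
```

Setting DRY_RUN=0 performs the copies; the local node can be left out of the machinefile passed here if the binary is already in place.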

--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Qing Pang

Thank you Jeff! That solves the problem. :-) You are the lifesaver!
So does that mean I always need to copy my application to all the 
nodes? Or should I give the pathname of my executable in a different 
way to avoid this? Do I need a network file system for that?





Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Jeff Squyres
The short version of the answer is to check to see that the executable
is in the same location on both nodes (apparently:
/home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out).  Open MPI is
complaining that it can't find that specific executable on the .194 node.


See below for more detail.


On Nov 5, 2009, at 3:19 PM, qing pang wrote:

> 1) I'm trying to run OpenMPI with the following setting:
>
> 1 PC (as master node) and 1 notebook (as client node) connected to an
> ethernet router through ethernet cable. Both running Ubuntu 8.10.
> There's no other connections. - Is this setting OK to run OpenMPI?

Yes.

> 2) Prerequisites
>
> SSH has been set up so that the master node can access the client node
> through passwordless ssh. I do notice that it takes 10~15 seconds
> between me entering '>ssh 'command and getting onto
> the client node.
> --- Could this be too slow for openmpi to run properly?

Nope -- should be ok.

> I do not have programs like network file system, network time protocol,
> resource management, scheduler, etc installed.
> --- Does OpenMPI need any prerequisites other than passwordless ssh?

Not in this case, no.

> 3) OpenMPI is installed on both nodes - downloaded from open-mpi.org,
> and do configure/make all using Default Settings.
>
> 4) PATH and LD_LIBRARY_PATH
> On both nodes,
> PATH is
> /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games,
> which is the default setting in ubuntu.
> LD_LIBRARY_PATH is set in ~/.bashrc - I added one line at the end of the
> file, 'export LD_LIBRARY_PATH=usr/local/lib:usr/lib'
> So when I echo them on both nodes, I get:
>  >echo $PATH
>  >/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
>
>  >echo $LD_LIBRARY_PATH
>  >usr/local/lib:usr/lib
>
> But, if I do
>  >ssh  'echo $LD_LIBRARY_PATH'
> nothing comes back.
>
> while
>  >ssh  'echo $PATH'
> comes back with the right path.
>
> Is that a problem?

No.
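
As an aside on the value quoted above: `usr/local/lib:usr/lib` has no leading slashes, so those entries would be interpreted relative to the current working directory rather than as /usr/local/lib and /usr/lib. That is separate from the ssh question answered here. A small sketch for spotting such entries follows; the function name is made up for illustration.

```shell
#!/bin/sh
# relative_entries LIST: print every entry of a colon-separated path list
# that does not start with "/".
relative_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -v '^/'
}

# Value as quoted in this thread; both entries are reported.
relative_entries "usr/local/lib:usr/lib"
```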


> 4) Problem:
> I compiled the example Hello_c using
>  >mpicc hello_c.c -o hello_c.out
> and run them on both nodes locally, everything works fine.
>
> But when I tried to run it on 2 nodes (-np 2)
>  >mpirun -machinefile machine.linux -np 2 $(pwd)/hello_c.out
> I got the following error:
>
> gordon@gordon-desktop:~/Desktop/openmpi-1.3.3/examples$ mpirun
> --machinefile machine.linux -np 2 $(pwd)/hello_c.out
> --
> mpirun was unable to launch the specified application as it could not access
> or execute an executable:
>
> Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
> Node: 192.168.0.194


You are giving an absolute pathname in the mpirun command line:

mpirun -machinefile machine.linux -np 2 $(pwd)/hello_c.out


Hence, it's looking for exactly
/home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out on both nodes.
If the executable is in a different directory on the other node, that's
where you're probably running into the problem.
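
One way to confirm this before launching is to test for the executable at that exact path on each node. The sketch below assumes passwordless ssh as described above. `check_exec`, `report`, and the `REMOTE_SH` override are hypothetical helpers, not Open MPI features.

```shell
#!/bin/sh
# check_exec HOST PATH: succeed if PATH exists and is executable on HOST.
# REMOTE_SH can be overridden (e.g. for testing); it defaults to ssh.
check_exec() {
    ${REMOTE_SH:-ssh} "$1" "test -x '$2'"
}

# report HOST PATH: print a one-line verdict for HOST.
report() {
    if check_exec "$1" "$2"; then
        echo "$1: ok"
    else
        echo "$1: cannot execute $2"
    fi
}

# Example (host and path taken from the error message in this thread):
# report 192.168.0.194 /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
```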


--
Jeff Squyres
jsquy...@cisco.com



[OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread qing pang

Dear Sir/Madam,

I'm having a problem running the example program. Please kindly help --- I've 
been fooling with it for days and am getting kind of lost.


-
MPIRUN fails on example hello program
-unable to launch the specified application on client node
-

1) I'm trying to run OpenMPI with the following setting:

1 PC (as master node) and 1 notebook (as client node) connected to an 
ethernet router through ethernet cable. Both running Ubuntu 8.10. 
There's no other connections. - Is this setting OK to run OpenMPI?


2) Prerequisites

SSH has been set up so that the master node can access the client node 
through passwordless ssh. I do notice that it takes 10~15 seconds 
between me entering '>ssh 'command and getting onto 
the client node.

--- Could this be too slow for openmpi to run properly?

I do not have programs like network file system, network time protocol, 
resource management, scheduler, etc installed.

--- Does OpenMPI need any prerequisites other than passwordless ssh?

3) OpenMPI is installed on both nodes - downloaded from open-mpi.org, 
and do configure/make all using Default Settings.


4) PATH and LD_LIBRARY_PATH
On both nodes,
PATH is 
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games, 
which is the default setting in ubuntu.
LD_LIBRARY_PATH is set in ~/.bashrc - I added one line at the end of the 
file, 'export LD_LIBRARY_PATH=usr/local/lib:usr/lib'

So when I echo them on both nodes, I get:
>echo $PATH
>/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
>echo $LD_LIBRARY_PATH
>usr/local/lib:usr/lib

But, if I do
>ssh  'echo $LD_LIBRARY_PATH'
nothing comes back.

while
>ssh  'echo $PATH'
comes back with the right path.

Is that a problem?

4) Problem:
I compiled the example Hello_c using
>mpicc hello_c.c -o hello_c.out
and run them on both nodes locally, everything works fine.

But when I tried to run it on 2 nodes (-np 2)
>mpirun -machinefile machine.linux -np 2 $(pwd)/hello_c.out
I got the following error:


gordon@gordon-desktop:~/Desktop/openmpi-1.3.3/examples$ mpirun 
--machinefile machine.linux -np 2 $(pwd)/hello_c.out

--
mpirun was unable to launch the specified application as it could not access
or execute an executable:

Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
Node: 192.168.0.194

while attempting to start process rank 1.
--

Sometimes I get one other error message after that:
--
[gordon-desktop:30748] [[25975,0],0]-[[25975,1],0] mca_oob_tcp_msg_recv: 
readv failed: Connection reset by peer (104)

--

5) Information attached:
ifconfig_masternode - output of ifconfig on masternode
ifconfig_slavenode - output of ifconfig on slavenode
ompi_info.txt - output of ompi_info -all
config.log - OpenMPI logfile
machine.linux - the machinefile used in mpirun command

--
Sincerely,
Qing Pang
(601) 979 0270



mpirun_info.tar.gz
Description: application/gzip