[OMPI users] Naming MPI_Spawn children

2012-06-18 Thread Jaison Paul Mulerikkal
Hi,

I'm running Open MPI on the Rackspace cloud over the Internet using
MPI_Comm_spawn. That is, I run the parent on my PC and the children on
Rackspace cloud machines. Rackspace provides direct IP addresses for the
machines (no NAT), which is why this is possible.

There is a communicator that involves only the children, and some operations
communicate only among the children (all on the Rackspace cloud in this
scenario). In our experiments, this child-to-child communication showed longer
delays than we expected.

My assumption is that Open MPI looks at the direct IP addresses in the
hostfile and tries to communicate between the Rackspace children over the
Internet. What I want is for the Rackspace children to communicate among
themselves internally, using the internal Rackspace hostnames. Rackspace
provides internal IP addresses, but if I use those in the hostfile on my home
PC, the parent won't be able to reach the children (there is also a
communicator involving the parent and the children).

Is there any way to tell Open MPI to use the internal IP addresses of the
Rackspace machines (through another hostfile, maybe) for the sub-group
(communicator) involving only the Rackspace children? I expect that would
improve performance.
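
As an illustration of the public/internal distinction above, here is a minimal
sketch in plain C that classifies an IPv4 address as private (RFC 1918) or not.
It assumes that the provider's internal addresses fall in the RFC 1918 ranges,
which the post does not state. The helper is_private_ipv4 is hypothetical and
is not part of Open MPI.

```c
#include <assert.h>
#include <arpa/inet.h>   /* inet_pton, ntohl (POSIX) */
#include <netinet/in.h>  /* struct in_addr */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if `text` is an IPv4 address inside an
 * RFC 1918 private range, 0 if it is any other valid IPv4 address, and
 * -1 if it cannot be parsed as dotted-quad IPv4. */
int is_private_ipv4(const char *text)
{
    struct in_addr addr;
    uint32_t ip;

    assert(text != NULL);
    if (inet_pton(AF_INET, text, &addr) != 1)
        return -1;

    ip = ntohl(addr.s_addr);                           /* host byte order */
    if ((ip & 0xFF000000u) == 0x0A000000u) return 1;   /* 10.0.0.0/8 */
    if ((ip & 0xFFF00000u) == 0xAC100000u) return 1;   /* 172.16.0.0/12 */
    if ((ip & 0xFFFF0000u) == 0xC0A80000u) return 1;   /* 192.168.0.0/16 */
    return 0;
}
```

Calling is_private_ipv4("10.1.2.3") returns 1, while a public address such as
"67.202.48.118" returns 0.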

Thanks in advance for your valuable suggestions.

Jaison
Australian National University. 




[OMPI users] Accessing OpenMPI processes on EC2 machine over Internet using ssh

2011-12-18 Thread Jaison Paul
We have reported this before, and we still cannot get it fully working.

We are now partially successful, however. We used a machine with a static IP
address and changed the router settings to open all ssh ports. The master runs
on this machine and the slaves run on EC2.

We can now run "Hello world" over the Internet using ssh. It starts the MPI
executables on EC2 (we can see them in 'top'), and they print "hello" back to
our home/master machine.

But send/recv doesn't work: it hangs between the master (home PC) and the
slave (EC2), in both directions.

What are the port settings for send/recv? Do we need to modify anything?

Any help is very much appreciated. 

Jaison
Australian National Uni



Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Jaison Paul
Jeff Squyres <...@cisco.com> writes:

> 
> On Nov 30, 2011, at 6:03 AM, Jaison Paul wrote:
> 
> > Yes, we have set up .ssh file on remote EC2 hosts. Is there anything else
> > that we should be taking care of when dealing with EC2?
> 
> I have heard that Open MPI's TCP latency on EC2 is horrid.  I actually talked
> with some Amazon / EC2 folks about it at SC'11 a few weeks ago; we set a date
> to dive into it a bit deeper in December.
> 
> No promises on when/if the TCP latency will improve, but it's definitely
> something that we're looking at.
> My first *guess* is that it might have something to do with specifying
> btl_tcp_if_include / oob_tcp_if_include improperly (or not at all) -- but
> that's a SWAG.
> 

I have tried a little more.

I set the MCA parameters as follows:
mpirun -np 1 --mca btl tcp,self --mca btl_tcp_if_exclude lo,eth0 -hostfile hostinfo nbs-client -bynode

But it still failed with the following error:

Permission denied (publickey).
--
A daemon (pid 24744) died unexpectedly with status 255 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
mpirun: clean termination accomplished


I don't understand the "Permission denied (publickey)" error. I access the EC2
instance using password-less ssh as follows:

ssh ubuntu@ec2-67-202-**-***.compute-1.amazonaws.com

So, what went wrong?

The hostinfo file is:

[jmulerik@jaison Client]$ cat hostinfo 
localhost
ubuntu@ec2-67-202-48-118.compute-1.amazonaws.com

Jaison






Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Jaison Paul
Ralph Castain <...@open-mpi.org> writes:

> 
> This has come up before - I would suggest doing a quick search of "ec2" on our
> user list. Here is one solution:
> On Jun 14, 2011, at 10:50 AM, Barnet Wagman wrote:
> I've put together a simple system for running OMPI on EC2 (Amazon's cloud
> computing service).  If you're interested, see
> 
> http://norbl.com/ppe-ompi.html
> 

Thank you, Barnet. We are currently using some scripts to configure EC2 nodes
with Open MPI easily, and will try this one. But this sets up a network of
Open MPI hosts within EC2, right? It does not support a client outside EC2
with the slaves inside EC2?

Jaison

> 
> 
> Barnet Wagman
> 
> 
> 
> 



Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Jaison Paul
Jeff Squyres <...@cisco.com> writes:

> 
> On Nov 30, 2011, at 6:03 AM, Jaison Paul wrote:
> 
> > Yes, we have set up .ssh file on remote EC2 hosts. Is there anything else
> > that we should be taking care of when dealing with EC2?
> 
> I have heard that Open MPI's TCP latency on EC2 is horrid.  I actually talked
> with some Amazon / EC2 folks about it at SC'11 a few weeks ago; we set a date
> to dive into it a bit deeper in December.
> 
> No promises on when/if the TCP latency will improve, but it's definitely
> something that we're looking at.
> My first *guess* is that it might have something to do with specifying
> btl_tcp_if_include / oob_tcp_if_include improperly (or not at all) -- but
> that's a SWAG.
> 


Yes Jeff,

We are not setting --mca btl_tcp_if_include / --mca oob_tcp_if_include at all
at the moment. What would be the best values for --mca btl_tcp_if_include /
--mca oob_tcp_if_include when accessing EC2 hosts over the Internet? I don't
understand the --mca options very well.

Thanks, Jaison





Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-30 Thread Jaison Paul

Ralph Castain <...@open-mpi.org> writes:

> 
> 
> On Nov 24, 2011, at 2:00 AM, Reuti wrote:
> 


Thanks a lot to Ralph and Reuti.

Actually, we are trying to use EC2 nodes as compute nodes and my local PC as
the host node.

Happy to know that it is OK to use user@somehost.com.

We used that but it failed. We will try again.

Yes, we have set up the .ssh files on the remote EC2 hosts. Is there anything
else that we should take care of when dealing with EC2?

Jaison


> > Hi,
> > 
> > On 24.11.2011 at 05:26, Jaison Paul wrote:
> > 
> >> I am trying to access OpenMPI processes over Internet using ssh and not
> >> quite successful, yet. I believe that I should be able to do it.
> >> 
> >> I have to run one process on my PC and the rest on a remote cluster over
> >> internet. I have set the public keys (at .ssh/authorized_keys) to access
> >> remote nodes without a password.
> >> 
> >> I use hostfile to run mpi. It will read something like:
> >> -
> >> localhost
> >> user@remotehost.com
> > 
> > this is not a valid syntax for Open MPI.
> 
> This isn't correct - we have long supported that syntax in a hostfile, and
> there is no issue with having a different user name at each node.
> 
> Jaison: are you sure your nodes are setup for password-less ssh? In other
> words, have you setup your .ssh files on the remote nodes so they will allow
> us to ssh a process on them without providing a password? This is the typical
> problem we see.
> 
> > 
> > 
> >> -
> >> But it fails.
> >> 
> >> The issue seems to be the user! That is, the user on my PC is different to
> >> that of user at remotehosts. That's my assumption.
> >> 
> >> Is this the problem? Is there any work-around to solve this issue? Do I
> >> need to have same username at all nodes to solve this issue?
> > 
> > You can define nicknames for an ssh connection in a file ~/.ssh/config like:
> > 
> > Host foobar
> >     User baz
> >     Hostname the.remote.server.demo
> >     Port 1234
> > 
> > While this will work with any nickname for an ssh connection, in your case
> > the nickname must match the one specified in the hostfile, as Open MPI won't
> > use this lookup file:
> > 
> > Host remotehost.com
> >     User user
> > 
> > ssh should then use the entries therein to initiate the connection. For
> > details you can have a look at `man ssh_config`.
> > 
> > -- Reuti
> > ___
> > users mailing list
> > users@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 






[OMPI users] Accessing OpenMPI processes over Internet using ssh

2011-11-23 Thread Jaison Paul

Hi all,

I am trying to access Open MPI processes over the Internet using ssh, so far
without much success. I believe that I should be able to do it.


I have to run one process on my PC and the rest on a remote cluster over the
Internet. I have set up the public keys (in .ssh/authorized_keys) to access
the remote nodes without a password.


I use a hostfile to run MPI. It reads something like:
-
localhost
user@remotehost.com
-
But it fails.

The issue seems to be the user! That is, the user name on my PC is different
from the user name on the remote hosts. That's my assumption.


Is this the problem? Is there any workaround? Do I need to have the same user
name on all nodes?


Jaison, ANU




Re: [OMPI users] How to start MPI_Spawn child processes early?

2010-01-27 Thread Jaison Paul
Hi, I am just reposting my earlier query. If anyone can give a hint, that
would be great.


Thanks, Jaison
ANU





[OMPI users] Can I start MPI_Spawn child processes early?

2010-01-25 Thread Jaison Paul

Hi All,

I am trying to use MPI for scientific high-performance computing (HPC)
applications. I use MPI_Comm_spawn to create child processes. Is there a way
to start the child processes earlier than the parent process when using
MPI_Comm_spawn?


I want this because my experiments showed that the parent takes too long to
spawn the children for HPC applications, which slows down the whole run. If
the children were already running when the parent application looks for them,
that initial delay could be avoided. Is there a way to do that?


Thanks in advance,

Jaison
Australian National University


Re: [OMPI users] Fails to run "MPI_Comm_spawn" on remote host

2009-09-16 Thread Jaison Paul

Hi Ralph,

Thank you so much for your reply. Your tips worked! The idea is to declare
the hosts first and then pick one using the reserved 'host' key in MPI_Info.
Great! Thanks a ton. I used the "-host" option of mpirun like this:


 mpirun --prefix /opt/mpi/ompi-1.3.2/ -np 1 -host myhost1,myhost2 spawner


and set the reserved MPI_Info key 'host' to select the remote host like this:

  MPI_Info hostinfo;
  MPI_Info_create(&hostinfo);
  MPI_Info_set(hostinfo, "host", "myhost2");
  MPI_Info_set(hostinfo, "wdir", "/home/jaison/mpi/advanced_MPI/spawn/lib");


Now I can run child processes on the remote host, myhost2. I shall also try
the "add-hostfile" option.
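
When several of the hosts declared up front should be eligible for a spawn,
the value string for the 'host' key can be built at run time. The sketch below
is plain C and joins host names with commas, the same form as the
-host myhost1,myhost2 list above. The function name join_hosts is
hypothetical, and it is an assumption (not confirmed in this thread) that the
'host' info key accepts a comma-separated list the way mpirun's -host option
does; the MPI_Comm_spawn man page for the installed version is the place to
check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: join `count` host names with commas into `out`,
 * which has room for `cap` bytes including the terminating NUL.
 * Returns 0 on success, or -1 (with `out` set to "") if it would not fit. */
int join_hosts(const char *const hosts[], size_t count, char *out, size_t cap)
{
    size_t used = 0;
    size_t i;

    assert(out != NULL && cap > 0);
    out[0] = '\0';

    for (i = 0; i < count; i++) {
        size_t len = strlen(hosts[i]);
        size_t need = len + (i > 0 ? 1 : 0);   /* name plus leading comma */

        if (used + need + 1 > cap) {           /* +1 for the NUL */
            out[0] = '\0';
            return -1;
        }
        if (i > 0)
            out[used++] = ',';
        memcpy(out + used, hosts[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

With the two hosts from the mpirun line above, the buffer ends up holding
"myhost1,myhost2", which could then be passed as the value in
MPI_Info_set(hostinfo, "host", buf).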


By the way, the MPI_Comm_spawn man page does not give as much detail as you
have just given.


Jaison
http://cs.anu.edu.au/~Jaison.Mulerikkal/Home.html




On 16/09/2009, at 12:39 PM, Ralph Castain wrote:

We don't support the ability to add a new host during a comm_spawn  
call in the 1.3 series. This is a feature that is being added for  
the upcoming new feature series release (tagged 1.5).


There are two solutions to this problem in 1.3:

1. declare all hosts at the beginning of the job. You can then  
specify which one to use with the "host" key.


2. you -can- add a hostfile to the job during a comm_spawn. This is  
done with the "add-hostfile" key. All the hosts in the hostfile  
will be added to the job. You can then specify which host(s) to use  
for this particular comm_spawn with the "host" key.


All of this is documented - you should see it with a "man  
MPI_Comm_spawn" command.


If you need to dynamically add a host via "host" before then, you  
could try downloading a copy of the developer's trunk from the OMPI  
web site. It is implemented there at this time - and also  
documented via the man page.


Ralph



Re: [OMPI users] Fails to run "MPI_Comm_spawn" on remote host

2009-09-15 Thread Jaison Paul

Hi All,

I am still waiting for some input on my query. I just wanted to know whether
I can run dynamic child processes on remote hosts using 'MPI_Comm_spawn' (in
Open MPI 1.3.2). Has anyone done that successfully? Or has Open MPI not
implemented it yet?


Please help.

Jaison
http://cs.anu.edu.au/~Jaison.Mulerikkal/Home.html








[OMPI users] Fails to run "MPI_Comm_spawn" on remote host

2009-09-13 Thread Jaison Paul

Hi,

I am trying to create a library using Open MPI for an SOA middleware as part
of my PhD research, and "MPI_Comm_spawn" is the call I need. I have a sample
working, but only on the local host. Whenever I try to run the spawned
children on a remote host, the parent cannot launch them and I get the
following error message:


--BEGIN MPIRUN AND ERROR MSG
mpirun --prefix /opt/mpi/ompi-1.3.2/ --mca btl_tcp_if_include eth0 -np 1 /home/jaison/mpi/advanced_MPI/spawn/manager

Manager code started - host headnode -- myid & world_size 0 1
Host is: myhost
WorkDir is: /home/jaison/mpi/advanced_MPI/spawn/lib
--
There are no allocated resources for the application
  /home/jaison/mpi/advanced_MPI/spawn//lib
that match the requested mapping:

Verify that you have mapped the allocated resources properly using the
--host or --hostfile specification.
--
--
A daemon (pid unknown) died unexpectedly on signal 1  while attempting to
launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
mpirun: clean termination accomplished
--END OF ERROR MSG---


I use the reserved keys - 'host' & 'wdir' - to set the remote host  
and work directory using MPI_Info. Here is the code snippet:


--BEGIN Code Snippet---

  MPI_Info hostinfo;
  MPI_Info_create(&hostinfo);
  MPI_Info_set(hostinfo, "host", "myhost");
  MPI_Info_set(hostinfo, "wdir", "/home/jaison/mpi/advanced_MPI/spawn/lib");


  // Checking for 'hostinfo'. The results are okay (see above)
  // Note: the '&' arguments were lost in the archive; 'flag' and 'intercomm'
  // below are placeholder names, not the original identifiers.
  int test0 = MPI_Info_get(hostinfo, "host", valuelen, value, &flag);
  int test = MPI_Info_get(hostinfo, "wdir", valuelen, value1, &flag);
  printf("Host is: %s\n", value);
  printf("WorkDir is: %s\n", value1);

  sprintf( launched_program, "launched_program" );

  MPI_Comm_spawn( launched_program, MPI_ARGV_NULL , number_to_spawn,
  hostinfo, 0, MPI_COMM_SELF, &intercomm,
  MPI_ERRCODES_IGNORE );

--END OF Code Snippet---


I've set LD_LIBRARY_PATH correctly. Is "MPI_Comm_spawn" implemented in
Open MPI (I am using version 1.3.2)? If so, where am I going wrong? Any input
will be very much appreciated.


Thanking you in advance.

Jaison
jmule...@cs.anu.edu.au
http://cs.anu.edu.au/~Jaison.Mulerikkal/Home.html