Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
As I said, please do a quick search on the "user" mailing list. There are numerous discussions there about how to do this. Here is another one that dealt with getting through the Amazon firewall:

http://www.open-mpi.org/community/lists/users/2011/02/15646.php

On Nov 30, 2011, at 1:58 PM, Jaison Paul wrote:

> Ralph Castain writes:
>
>> This has come up before - I would suggest doing a quick search of "ec2" on
>> our user list. Here is one solution:
>>
>> On Jun 14, 2011, at 10:50 AM, Barnet Wagman wrote:
>>> I've put together a simple system for running OMPI on EC2 (Amazon's cloud
>>> computing service). If you're interested, see
>>>
>>> http://norbl.com/ppe-ompi.html
>
> I have tried little bit more:
>
> I have set the MCA parameters as follows:
>
>     mpirun -np 1 --mca btl tcp,self --mca btl_tcp_if_exclude lo,eth0 -hostfile hostinfo nbs-client -bynode
>
> But still failed and got the following error:
>
>     Permission denied (publickey).
>     --
>     A daemon (pid 24744) died unexpectedly with status 255 while attempting
>     to launch so we are aborting.
>
>     There may be more information reported by the environment (see above).
>
>     This may be because the daemon was unable to find all the needed shared
>     libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>     location of the shared libraries on the remote nodes and this will
>     automatically be forwarded to the remote nodes.
>     --
>     mpirun: clean termination accomplished
>
> I dont understand the "Permission denied (publickey)" error. I access the EC2
> instance using password-less ssh as follows:
>
>     ssh ubuntu@ec2-67-202-**-***.compute-1.amazonaws.com
>
> So, what went wrong?
>
> hostinfo file is:
>
>     [jmulerik@jaison Client]$ cat hostinfo
>     localhost
>     ubuntu@ec2-67-202-48-118.compute-1.amazonaws.com
>
> Jaison
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
Ralph Castain writes:

> This has come up before - I would suggest doing a quick search of "ec2" on our user list. Here is one solution:
>
> On Jun 14, 2011, at 10:50 AM, Barnet Wagman wrote:
>> I've put together a simple system for running OMPI on EC2 (Amazon's cloud computing service). If you're interested, see
>>
>> http://norbl.com/ppe-ompi.html

I have tried a little bit more.

I have set the MCA parameters as follows:

    mpirun -np 1 --mca btl tcp,self --mca btl_tcp_if_exclude lo,eth0 -hostfile hostinfo nbs-client -bynode

But it still failed, and I got the following error:

    Permission denied (publickey).
    --
    A daemon (pid 24744) died unexpectedly with status 255 while attempting
    to launch so we are aborting.

    There may be more information reported by the environment (see above).

    This may be because the daemon was unable to find all the needed shared
    libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
    location of the shared libraries on the remote nodes and this will
    automatically be forwarded to the remote nodes.
    --
    mpirun: clean termination accomplished

I don't understand the "Permission denied (publickey)" error. I access the EC2 instance using password-less ssh as follows:

    ssh ubuntu@ec2-67-202-**-***.compute-1.amazonaws.com

So, what went wrong?

The hostinfo file is:

    [jmulerik@jaison Client]$ cat hostinfo
    localhost
    ubuntu@ec2-67-202-48-118.compute-1.amazonaws.com

Jaison
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
Jeff Squyres writes:

> On Nov 30, 2011, at 6:03 AM, Jaison Paul wrote:
>
>> Yes, we have set up .ssh file on remote EC2 hosts. Is there anything else that we should be taking care of when dealing with EC2?
>
> I have heard that Open MPI's TCP latency on EC2 is horrid. I actually talked with some Amazon / EC2 folks about it at SC'11 a few weeks ago; we set a date to dive into it a bit deeper in December.
>
> No promises on when/if the TCP latency will improve, but it's definitely something that we're looking at. My first *guess* is that it might have something to do with specifying btl_tcp_if_include / oob_tcp_if_include improperly (or not at all) -- but that's a SWAG.

I have tried a little bit more.

I have set the MCA parameters as follows:

    mpirun -np 1 --mca btl tcp,self --mca btl_tcp_if_exclude lo,eth0 -hostfile hostinfo nbs-client -bynode

But it still failed, and I got the following error:

    Permission denied (publickey).
    --
    A daemon (pid 24744) died unexpectedly with status 255 while attempting
    to launch so we are aborting.

    There may be more information reported by the environment (see above).

    This may be because the daemon was unable to find all the needed shared
    libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
    location of the shared libraries on the remote nodes and this will
    automatically be forwarded to the remote nodes.
    --
    mpirun: clean termination accomplished

I don't understand the "Permission denied (publickey)" error. I access the EC2 instance using password-less ssh as follows:

    ssh ubuntu@ec2-67-202-**-***.compute-1.amazonaws.com

So, what went wrong?

The hostinfo file is:

    [jmulerik@jaison Client]$ cat hostinfo
    localhost
    ubuntu@ec2-67-202-48-118.compute-1.amazonaws.com

Jaison
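A note on the "Permission denied (publickey)" error above: mpirun starts its remote daemon through a non-interactive ssh, so a login that works by hand can still fail if it relies on a passphrase prompt or on a key given with `-i` on the command line. Whether that is the cause here is only a guess. The sketch below checks whether a host accepts a key-based login with no prompting. `BatchMode=yes` and `ConnectTimeout` are standard OpenSSH options; the function name `check_passwordless` and the `SSH_CMD` override are made up for illustration, and the loop assumes a hostfile named hostinfo with one bare host (or user@host) per line.

```shell
#!/bin/sh
# check_passwordless HOST: succeed only if ssh can log in to HOST without
# prompting. BatchMode=yes makes ssh fail instead of asking for a password.
# SSH_CMD is a hypothetical override so the check can be dry-run offline.
check_passwordless() {
    host=$1
    # stdin is redirected so ssh cannot swallow the hostfile being read below
    ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=10 "$host" true < /dev/null
}

# Test every non-local entry of the hostfile, if one is present.
# Assumption: no slots= fields or other trailing keywords on the lines.
if [ -f hostinfo ]; then
    while IFS= read -r host; do
        case $host in
            ''|localhost|'#'*) continue ;;
        esac
        check_passwordless "$host" || echo "no password-less login: $host"
    done < hostinfo
fi
```

If a host is reported here, the usual things to look at are the key ssh offers by default and the contents of ~/.ssh/authorized_keys on the remote side.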
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
On Nov 30, 2011, at 3:02 PM, Jaison Paul wrote:

> We are not setting up --mca btl_tcp_if_include / --mca oob_tcp_if_include at
> all at the moment. What will be the best setup to access EC2 hosts over
> internet for --mca btl_tcp_if_include / --mca oob_tcp_if_include? I dont
> understand --mca very well.

I don't know; I've never run on EC2 before. My meeting with the EC2 folks is next week, so that's the earliest possibility of me gaining a little knowledge into what the Right way is to run with OMPI on EC2 (where "Right" = "run without horrid latency"). It may take a bit longer than that, though, depending on my time availability.

The two parameters I'm referring to simply limit the Ethernet interfaces that are used for Open MPI's MPI messaging and out-of-band messaging. For example, on a commodity Linux system, you could run with:

    mpirun --mca oob_tcp_if_include eth0 \
           --mca btl_tcp_if_include eth1 ...

Here eth0 will be used for OMPI's control traffic (e.g., perhaps it's a commodity 1 Gb network) and eth1 will be used for OMPI's MPI traffic (e.g., perhaps it's a 10 Gb network).

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
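As a small illustration of the two parameters Jeff describes, the sketch below only assembles and prints an mpirun command line; it does not run anything. The helper name `build_mpirun_cmd`, the process count and `./my_app` are hypothetical, and eth0/eth1 are simply the interface names from his example. Which interfaces are right on EC2 is not settled in this thread.

```shell
#!/bin/sh
# build_mpirun_cmd OOB_IF BTL_IF [ARGS...]: print an mpirun command line that
# restricts out-of-band (control) traffic to OOB_IF and MPI traffic to BTL_IF.
# Nothing is executed; the command is only printed for inspection.
build_mpirun_cmd() {
    oob_if=$1
    btl_if=$2
    shift 2
    printf 'mpirun --mca oob_tcp_if_include %s --mca btl_tcp_if_include %s %s\n' \
        "$oob_if" "$btl_if" "$*"
}

# Control traffic on eth0, MPI traffic on eth1, as in the message above.
build_mpirun_cmd eth0 eth1 -np 4 ./my_app
```

Printing the command first makes it easy to check the interface names against the output of `ifconfig` or `ip addr` on each node before launching.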
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
Ralph Castain writes:

> This has come up before - I would suggest doing a quick search of "ec2" on our user list. Here is one solution:
>
> On Jun 14, 2011, at 10:50 AM, Barnet Wagman wrote:
>> I've put together a simple system for running OMPI on EC2 (Amazon's cloud computing service). If you're interested, see
>>
>> http://norbl.com/ppe-ompi.html
>>
>> Barnet Wagman

Thank you, Barnet. We are using some scripts at the moment to easily configure EC2 nodes with OMPI. Will try this one. But this is to set up a network of OMPI hosts within EC2, right? It does not support a client outside EC2 with the slaves inside EC2?

Jaison
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
Jeff Squyres writes:

> On Nov 30, 2011, at 6:03 AM, Jaison Paul wrote:
>
>> Yes, we have set up .ssh file on remote EC2 hosts. Is there anything else that we should be taking care of when dealing with EC2?
>
> I have heard that Open MPI's TCP latency on EC2 is horrid. I actually talked with some Amazon / EC2 folks about it at SC'11 a few weeks ago; we set a date to dive into it a bit deeper in December.
>
> No promises on when/if the TCP latency will improve, but it's definitely something that we're looking at. My first *guess* is that it might have something to do with specifying btl_tcp_if_include / oob_tcp_if_include improperly (or not at all) -- but that's a SWAG.

Yes Jeff, we are not setting up --mca btl_tcp_if_include / --mca oob_tcp_if_include at all at the moment. What would be the best setup of --mca btl_tcp_if_include / --mca oob_tcp_if_include for accessing EC2 hosts over the internet? I don't understand --mca very well.

Thanks,
Jaison
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
This has come up before - I would suggest doing a quick search of "ec2" on our user list. Here is one solution:

On Jun 14, 2011, at 10:50 AM, Barnet Wagman wrote:

> I've put together a simple system for running OMPI on EC2 (Amazon's cloud
> computing service). If you're interested, see
>
> http://norbl.com/ppe-ompi.html
>
> Barnet Wagman

On Nov 30, 2011, at 4:03 AM, Jaison Paul wrote:

> Ralph Castain writes:
>
>> On Nov 24, 2011, at 2:00 AM, Reuti wrote:
>
> Thanks a lot to Ralph and Reuti.
>
> Actually we are trying to use EC2 nodes as compute nodes and my local PC as
> host node.
>
> Happy to know that it is OK to use user@somehost.com
>
> We used that but failed. Would try again.
>
> Yes, we have set up .ssh file on remote EC2 hosts. Is there anything else that
> we should be taking care of when dealing with EC2?
>
> Jaison
>
>>> Hi,
>>>
>>> On 24.11.2011 at 05:26, Jaison Paul wrote:
>>>
>>>> I am trying to access OpenMPI processes over Internet using ssh and not quite successful, yet. I believe that I should be able to do it.
>>>>
>>>> I have to run one process on my PC and the rest on a remote cluster over internet. I have set the public keys (at .ssh/authorized_keys) to access remote nodes without a password.
>>>>
>>>> I use hostfile to run mpi. It will read something like:
>>>> -
>>>> localhost
>>>> user@remotehost.com
>>>
>>> this is not a valid syntax for Open MPI.
>>
>> This isn't correct - we have long supported that syntax in a hostfile, and there is no issue with having a different user name at each node.
>>
>> Jaison: are you sure your nodes are setup for password-less ssh? In other words, have you setup your .ssh files on the remote nodes so they will allow us to ssh a process on them without providing a password? This is the typical problem we see.
>>
>>>> -
>>>> But it fails.
>>>>
>>>> The issue seems to be the user! That is, the user on my PC is different to that of user at remotehosts. That's my assumption.
>>>>
>>>> Is this the problem? Is there any work-around to solve this issue? Do I need to have same username at all nodes to solve this issue?
>>>
>>> You can define nicknames for an ssh connection in a file ~/.ssh/config like:
>>>
>>>     Host foobar
>>>         User baz
>>>         Hostname the.remote.server.demo
>>>         Port 1234
>>>
>>> While this will work with any nickname for an ssh connection, in your case the nickname must match the one specified in the hostfile, as Open MPI won't use this lookup file:
>>>
>>>     Host remotehost.com
>>>         User user
>>>
>>> ssh should then use the entries therein to initiate the connection. For details you can have a look at `man ssh_config`.
>>>
>>> -- Reuti
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
On Nov 30, 2011, at 6:03 AM, Jaison Paul wrote:

> Yes, we have set up .ssh file on remote EC2 hosts. Is there anything else
> that we should be taking care of when dealing with EC2?

I have heard that Open MPI's TCP latency on EC2 is horrid. I actually talked with some Amazon / EC2 folks about it at SC'11 a few weeks ago; we set a date to dive into it a bit deeper in December.

No promises on when/if the TCP latency will improve, but it's definitely something that we're looking at. My first *guess* is that it might have something to do with specifying btl_tcp_if_include / oob_tcp_if_include improperly (or not at all) -- but that's a SWAG.

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
Ralph Castain writes:

> On Nov 24, 2011, at 2:00 AM, Reuti wrote:

Thanks a lot to Ralph and Reuti.

Actually, we are trying to use EC2 nodes as compute nodes and my local PC as the host node.

Happy to know that it is OK to use user@somehost.com. We used that but it failed. We will try again.

Yes, we have set up the .ssh file on the remote EC2 hosts. Is there anything else that we should be taking care of when dealing with EC2?

Jaison

>> Hi,
>>
>> On 24.11.2011 at 05:26, Jaison Paul wrote:
>>
>>> I am trying to access OpenMPI processes over Internet using ssh and not quite successful, yet. I believe that I should be able to do it.
>>>
>>> I have to run one process on my PC and the rest on a remote cluster over internet. I have set the public keys (at .ssh/authorized_keys) to access remote nodes without a password.
>>>
>>> I use hostfile to run mpi. It will read something like:
>>> -
>>> localhost
>>> user@remotehost.com
>>
>> this is not a valid syntax for Open MPI.
>
> This isn't correct - we have long supported that syntax in a hostfile, and there is no issue with having a different user name at each node.
>
> Jaison: are you sure your nodes are setup for password-less ssh? In other words, have you setup your .ssh files on the remote nodes so they will allow us to ssh a process on them without providing a password? This is the typical problem we see.
>
>>> -
>>> But it fails.
>>>
>>> The issue seems to be the user! That is, the user on my PC is different to that of user at remotehosts. That's my assumption.
>>>
>>> Is this the problem? Is there any work-around to solve this issue? Do I need to have same username at all nodes to solve this issue?
>>
>> You can define nicknames for an ssh connection in a file ~/.ssh/config like:
>>
>>     Host foobar
>>         User baz
>>         Hostname the.remote.server.demo
>>         Port 1234
>>
>> While this will work with any nickname for an ssh connection, in your case the nickname must match the one specified in the hostfile, as Open MPI won't use this lookup file:
>>
>>     Host remotehost.com
>>         User user
>>
>> ssh should then use the entries therein to initiate the connection. For details you can have a look at `man ssh_config`.
>>
>> -- Reuti
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
On Nov 25, 2011, at 3:42 AM, Reuti wrote:

> Hi Ralph,
>
> On 25.11.2011 at 03:47, Ralph Castain wrote:
>
>> On Nov 24, 2011, at 2:00 AM, Reuti wrote:
>>
>>> Hi,
>>>
>>> On 24.11.2011 at 05:26, Jaison Paul wrote:
>>>
>>>> I am trying to access OpenMPI processes over Internet using ssh and not quite successful, yet. I believe that I should be able to do it.
>>>>
>>>> I have to run one process on my PC and the rest on a remote cluster over internet. I have set the public keys (at .ssh/authorized_keys) to access remote nodes without a password.
>>>>
>>>> I use hostfile to run mpi. It will read something like:
>>>> -
>>>> localhost
>>>> user@remotehost.com
>>>
>>> this is not a valid syntax for Open MPI.
>>
>> This isn't correct
>
> I'm completely sorry about this, it wasn't my intention to misguide anyone.

Not a problem at all!

> But this syntax isn't something I would have expected to work, nor is it
> documented in `man mpiexec` AFAICS. I suggest to add it there or at
> http://www.open-mpi.org/faq/?category=running. Or maybe a complete new man
> page for "hostfile", where also slots= and max_slots= are explained in one
> location.

Yeah, our documentation is somewhat out-of-date in that area. The best explanation is on the wiki:

https://svn.open-mpi.org/trac/ompi/wiki/HostFilePlan

That was the design document I used when I wrote the code.

> NB: Checking orte/util/hostfile/hostfile.c even ^ to exclude hosts is
> supported, but from which initial list will they be excluded? In the `man
> orte_hosts` I find --default-hostfile which could be the initial list, but
> --default-hostfile isn't in mpirun's man page.

The "initial list" is whatever hostfile you provided - either the default hostfile or one specified on the cmd line. Remember, we use a progression here:

1. If a default hostfile exists, get our allocation from it. Any other hostfiles specified on the cmd line are then used to filter hosts from the default hostfile - i.e., we will ignore any hostname given in the cmd line hostfile if it wasn't included in the default hostfile. The "exclude" option applies in both cases. Any exclude directive in the default hostfile will ensure that host isn't included in the allocation. An excluded host in the cmd line hostfile will ensure that host is removed from the final allocation, should it have been present in the default hostfile.

2. If a default hostfile doesn't exist, then cycle across all hostfiles given on the cmd line and use the aggregate list as the allocation. I believe any exclude option here would apply only to the individual hostfile - i.e., if one hostfile includes a node and another excludes it, I suspect the node will wind up in the allocation.

Once we have that global allocation, the nodes used for launch of each app_context are filtered from that global allocation using the hostfile specified for that app_context. So any exclude in that hostfile will impact only the associated app_context.

Confusing and complex, I know - unfortunately, that is what I was told the community would want. :-/

HTH
Ralph

> -- Reuti
>
>> - we have long supported that syntax in a hostfile, and there is no issue
>> with having a different user name at each node.
>>
>> Jaison: are you sure your nodes are setup for password-less ssh? In other
>> words, have you setup your .ssh files on the remote nodes so they will allow
>> us to ssh a process on them without providing a password? This is the
>> typical problem we see.
>>
>>>> -
>>>> But it fails.
>>>>
>>>> The issue seems to be the user! That is, the user on my PC is different to that of user at remotehosts. That's my assumption.
>>>>
>>>> Is this the problem? Is there any work-around to solve this issue? Do I need to have same username at all nodes to solve this issue?
>>>
>>> You can define nicknames for an ssh connection in a file ~/.ssh/config like:
>>>
>>>     Host foobar
>>>         User baz
>>>         Hostname the.remote.server.demo
>>>         Port 1234
>>>
>>> While this will work with any nickname for an ssh connection, in your case
>>> the nickname must match the one specified in the hostfile, as Open MPI
>>> won't use this lookup file:
>>>
>>>     Host remotehost.com
>>>         User user
>>>
>>> ssh should then use the entries therein to initiate the connection. For
>>> details you can have a look at `man ssh_config`.
>>>
>>> -- Reuti
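To make case 1 of Ralph's progression concrete, here is a toy model of the filtering he describes. It is not Open MPI's code and makes no claim about how hostfile.c is written. It assumes each hostfile holds one bare hostname per line (no slots= fields) and that an exclusion is written as the hostname prefixed with `^`; the function name `filter_allocation` and the node names are invented for the example.

```shell
#!/bin/sh
# filter_allocation DEFAULT_HOSTFILE CMDLINE_HOSTFILE
# Toy model of case 1: start from the default hostfile, drop any host that
# either file excludes with a leading '^', and keep only the hosts that the
# command-line hostfile also names.
filter_allocation() {
    default_file=$1
    cmdline_file=$2
    # Candidate hosts: default-hostfile lines that are not '^' exclusions.
    grep -v '^\^' "$default_file" | while IFS= read -r host; do
        [ -n "$host" ] || continue
        # -F makes the caret literal, -x requires a whole-line match.
        if grep -qxF -e "^$host" "$default_file" "$cmdline_file"; then
            continue    # excluded in one of the two files
        fi
        if grep -qxF -e "$host" "$cmdline_file"; then
            printf '%s\n' "$host"
        fi
    done
}

# Demo: node2 is excluded on the command line, node4 is not in the default file.
tmp=$(mktemp -d)
printf '%s\n' node1 node2 node3 > "$tmp/default"
printf '%s\n' node1 '^node2' node3 node4 > "$tmp/cmdline"
filter_allocation "$tmp/default" "$tmp/cmdline"    # prints node1 and node3
rm -r "$tmp"
```

Case 2 (no default hostfile) and the per-app_context filtering are left out, since Ralph himself is unsure how excludes combine there.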
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
Hi Ralph,

On 25.11.2011 at 03:47, Ralph Castain wrote:

> On Nov 24, 2011, at 2:00 AM, Reuti wrote:
>
>> Hi,
>>
>> On 24.11.2011 at 05:26, Jaison Paul wrote:
>>
>>> I am trying to access OpenMPI processes over Internet using ssh and not
>>> quite successful, yet. I believe that I should be able to do it.
>>>
>>> I have to run one process on my PC and the rest on a remote cluster over
>>> internet. I have set the public keys (at .ssh/authorized_keys) to access
>>> remote nodes without a password.
>>>
>>> I use hostfile to run mpi. It will read something like:
>>> -
>>> localhost
>>> user@remotehost.com
>>
>> this is not a valid syntax for Open MPI.
>
> This isn't correct

I'm very sorry about this; it wasn't my intention to misguide anyone. But this syntax isn't something I would have expected to work, nor is it documented in `man mpiexec` AFAICS. I suggest adding it there or at http://www.open-mpi.org/faq/?category=running. Or maybe a completely new man page for "hostfile", where slots= and max_slots= are also explained in one location.

NB: Checking orte/util/hostfile/hostfile.c, even ^ to exclude hosts is supported, but from which initial list will they be excluded? In `man orte_hosts` I find --default-hostfile, which could be the initial list, but --default-hostfile isn't in mpirun's man page.

-- Reuti

> - we have long supported that syntax in a hostfile, and there is no issue
> with having a different user name at each node.
>
> Jaison: are you sure your nodes are setup for password-less ssh? In other
> words, have you setup your .ssh files on the remote nodes so they will allow
> us to ssh a process on them without providing a password? This is the typical
> problem we see.
>
>>> -
>>> But it fails.
>>>
>>> The issue seems to be the user! That is, the user on my PC is different to
>>> that of user at remotehosts. That's my assumption.
>>>
>>> Is this the problem? Is there any work-around to solve this issue? Do I
>>> need to have same username at all nodes to solve this issue?
>>
>> You can define nicknames for an ssh connection in a file ~/.ssh/config like:
>>
>>     Host foobar
>>         User baz
>>         Hostname the.remote.server.demo
>>         Port 1234
>>
>> While this will work with any nickname for an ssh connection, in your case
>> the nickname must match the one specified in the hostfile, as Open MPI won't
>> use this lookup file:
>>
>>     Host remotehost.com
>>         User user
>>
>> ssh should then use the entries therein to initiate the connection. For
>> details you can have a look at `man ssh_config`.
>>
>> -- Reuti
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
On Nov 24, 2011, at 2:00 AM, Reuti wrote:

> Hi,
>
> On 24.11.2011 at 05:26, Jaison Paul wrote:
>
>> I am trying to access OpenMPI processes over Internet using ssh and not
>> quite successful, yet. I believe that I should be able to do it.
>>
>> I have to run one process on my PC and the rest on a remote cluster over
>> internet. I have set the public keys (at .ssh/authorized_keys) to access
>> remote nodes without a password.
>>
>> I use hostfile to run mpi. It will read something like:
>> -
>> localhost
>> user@remotehost.com
>
> this is not a valid syntax for Open MPI.

This isn't correct - we have long supported that syntax in a hostfile, and there is no issue with having a different user name at each node.

Jaison: are you sure your nodes are set up for password-less ssh? In other words, have you set up your .ssh files on the remote nodes so they will allow us to ssh a process on them without providing a password? This is the typical problem we see.

>> -
>> But it fails.
>>
>> The issue seems to be the user! That is, the user on my PC is different to
>> that of user at remotehosts. That's my assumption.
>>
>> Is this the problem? Is there any work-around to solve this issue? Do I need
>> to have same username at all nodes to solve this issue?
>
> You can define nicknames for an ssh connection in a file ~/.ssh/config like:
>
>     Host foobar
>         User baz
>         Hostname the.remote.server.demo
>         Port 1234
>
> While this will work with any nickname for an ssh connection, in your case
> the nickname must match the one specified in the hostfile, as Open MPI won't
> use this lookup file:
>
>     Host remotehost.com
>         User user
>
> ssh should then use the entries therein to initiate the connection. For
> details you can have a look at `man ssh_config`.
>
> -- Reuti
Re: [OMPI users] Accessing OpenMPI processes over Internet using ssh
Hi,

On 24.11.2011 at 05:26, Jaison Paul wrote:

> I am trying to access OpenMPI processes over Internet using ssh and not quite
> successful, yet. I believe that I should be able to do it.
>
> I have to run one process on my PC and the rest on a remote cluster over
> internet. I have set the public keys (at .ssh/authorized_keys) to access
> remote nodes without a password.
>
> I use hostfile to run mpi. It will read something like:
> -
> localhost
> user@remotehost.com

This is not a valid syntax for Open MPI.

> -
> But it fails.
>
> The issue seems to be the user! That is, the user on my PC is different to
> that of user at remotehosts. That's my assumption.
>
> Is this the problem? Is there any work-around to solve this issue? Do I need
> to have same username at all nodes to solve this issue?

You can define nicknames for an ssh connection in a file ~/.ssh/config like:

    Host foobar
        User baz
        Hostname the.remote.server.demo
        Port 1234

While this will work with any nickname for an ssh connection, in your case the nickname must match the one specified in the hostfile, as Open MPI won't use this lookup file:

    Host remotehost.com
        User user

ssh should then use the entries therein to initiate the connection. For details you can have a look at `man ssh_config`.

-- Reuti
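For anyone who wants to script the ~/.ssh/config entry Reuti describes, the sketch below prints a stanza in that form. `Host`, `User` and `IdentityFile` are standard ssh_config keywords; the helper name `ssh_host_block` and the optional key-file argument are additions for illustration and were not part of his suggestion.

```shell
#!/bin/sh
# ssh_host_block HOST USER [IDENTITY_FILE]: print an ssh_config stanza that
# maps HOST to a login name and, optionally, to a specific private key.
ssh_host_block() {
    printf 'Host %s\n    User %s\n' "$1" "$2"
    if [ -n "${3:-}" ]; then
        printf '    IdentityFile %s\n' "$3"
    fi
}

# The stanza from the message above; the Host name must match the hostfile entry.
ssh_host_block remotehost.com user
```

To use it, append the output to the config file, for example `ssh_host_block remotehost.com user >> ~/.ssh/config`, and list the host in the hostfile as plain `remotehost.com`.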