I don't really know what the problem is. It seems like you're doing things correctly. I'm almost sure you've done all of the following, but just to be sure: each machine's ssh public key is in the other machine's authorized_keys file, and the ssh keys were generated without passphrases.
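In case it helps, here is a rough sketch of how those two points could be checked from the shell. The function names are made up for illustration, and the key path and host name in the usage comments are simply taken from your messages, so adjust them as needed. It only assumes the standard OpenSSH tools (ssh, ssh-keygen) plus awk and grep.

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical helpers, not part of
# OpenSSH or Open MPI.

# Succeeds if the private key file $1 can be read with an empty passphrase.
key_has_no_passphrase() {
    ssh-keygen -y -P "" -f "$1" >/dev/null 2>&1
}

# Succeeds if the public half of private key $1 is listed in the
# authorized_keys file $2 (run this on the machine you want to log in to).
key_is_authorized() {
    blob=$(ssh-keygen -y -P "" -f "$1" 2>/dev/null | awk '{print $2}')
    [ -n "$blob" ] && grep -qF -- "$blob" "$2"
}

# Succeeds if ssh can run a command on host $2 with key $1 without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password.
can_login_without_prompt() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 -i "$1" "$2" true
}

# Example usage (paths and host names are assumptions based on your mail):
#   key_has_no_passphrase ~/.ssh/tsakai && echo "no passphrase"
#   key_is_authorized ~/.ssh/tsakai ~/.ssh/authorized_keys && echo "authorized"
#   can_login_without_prompt ~/.ssh/tsakai domU-12-31-39-0C-C8-01 && echo "ok"
```

One more thing to keep in mind: as far as I know, mpirun starts its remote daemons through plain ssh and will not add -i for you, so a key that only works when you pass -i by hand would also need an IdentityFile entry in ~/.ssh/config (or an ssh-agent) before mpirun can use it.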

On Wed, Feb 9, 2011 at 10:08 PM, Tena Sakai <tsa...@gallo.ucsf.edu> wrote:

> Hi,
>
> I have made a bit of progress(?)...
> I made a config file in my .ssh directory on the cloud. It looks like:
> # machine A
> Host domU-12-31-39-07-35-21.compute-1.internal
> HostName domU-12-31-39-07-35-21
> BatchMode yes
> IdentityFile /home/tsakai/.ssh/tsakai
> ChallengeResponseAuthentication no
> IdentitiesOnly yes
>
> # machine B
> Host domU-12-31-39-06-74-E2.compute-1.internal
> HostName domU-12-31-39-06-74-E2
> BatchMode yes
> IdentityFile /home/tsakai/.ssh/tsakai
> ChallengeResponseAuthentication no
> IdentitiesOnly yes
>
> This file exists on both machine A and machine B.
>
> Now when I issue the mpirun command as below:
> [tsakai@domU-12-31-39-06-74-E2 ~]$ mpirun -app app.ac2
>
> It hangs. I control-C out of it and I get:
>
> mpirun: killing job...
>
>
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
>
> --------------------------------------------------------------------------
>
> --------------------------------------------------------------------------
> mpirun was unable to cleanly terminate the daemons on the nodes shown
> below. Additional manual cleanup may be required - please refer to
> the "orte-clean" tool for assistance.
>
> --------------------------------------------------------------------------
> domU-12-31-39-07-35-21.compute-1.internal - daemon did not report
> back when launched
>
> Am I making progress?
>
> Does this mean I am past authentication and something else is the problem?
> Does someone have an example .ssh/config file I can look at? There are so
> many keyword-argument pairs for this config file and I would like to look at
> a very basic one that works.
>
>
> Thank you.
>
> Tena Sakai
> tsa...@gallo.ucsf.edu
>
> On 2/9/11 7:52 PM, "Tena Sakai" <tsa...@gallo.ucsf.edu> wrote:
>
> Hi
>
> I have an app.ac1 file like below:
> [tsakai@vixen local]$ cat app.ac1
> -H vixen.egcrc.org -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 5
> -H vixen.egcrc.org -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 6
> -H blitzen.egcrc.org -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 7
> -H blitzen.egcrc.org -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 8
>
> The program I run is
> Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R x
> where x is [5..8]. The machines vixen and blitzen each do 2 runs.
>
> Here’s the program fib.R:
> [tsakai@vixen local]$ cat fib.R
> # fib() computes, given index n, fibonacci number iteratively
> # here's the first dozen sequence (indexed from 0..11)
> # 1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89
>
> fib <- function( n ) {
>     a <- 0
>     b <- 1
>     for ( i in 1:n ) {
>         t <- b
>         b <- a
>         a <- a + t
>     }
>     a
> }
>
> arg <- commandArgs( TRUE )
> myHost <- system( 'hostname', intern=TRUE )
> cat( fib(arg), myHost, '\n' )
>
> It reads an argument from the command line and produces the fibonacci number
> that corresponds to that index, followed by the machine name. Pretty simple
> stuff.
>
> Here’s the run output:
> [tsakai@vixen local]$ mpirun -app app.ac1
> 5 vixen.egcrc.org
> 8 vixen.egcrc.org
> 13 blitzen.egcrc.org
> 21 blitzen.egcrc.org
>
> Which is exactly what I expect. So far so good.
>
> Now I want to run the same thing on the cloud. I launch 2 instances of the
> same virtual machine, which I get to by:
> [tsakai@vixen local]$ ssh -A -i ~/.ssh/tsakai machine-instance-A-public-dns
>
> Now I am on machine A:
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ # and I can go to machine B without password authentication,
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ # i.e., use public/private key
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
> domU-12-31-39-00-D1-F2
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ ssh -i .ssh/tsakai domU-12-31-39-0C-C8-01
> Last login: Wed Feb 9 20:51:48 2011 from 10.254.214.4
> [tsakai@domU-12-31-39-0C-C8-01 ~]$
> [tsakai@domU-12-31-39-0C-C8-01 ~]$ # I am now on machine B
> [tsakai@domU-12-31-39-0C-C8-01 ~]$ hostname
> domU-12-31-39-0C-C8-01
> [tsakai@domU-12-31-39-0C-C8-01 ~]$
> [tsakai@domU-12-31-39-0C-C8-01 ~]$ # now show I can get to machine A without using password
> [tsakai@domU-12-31-39-0C-C8-01 ~]$
> [tsakai@domU-12-31-39-0C-C8-01 ~]$ ssh -i .ssh/tsakai domU-12-31-39-00-D1-F2
> The authenticity of host 'domu-12-31-39-00-d1-f2 (10.254.214.4)' can't be established.
> RSA key fingerprint is e3:ad:75:b1:a4:63:7f:0f:c4:0b:10:71:f3:2f:21:81.
> Are you sure you want to continue connecting (yes/no)? yes
> Warning: Permanently added 'domu-12-31-39-00-d1-f2' (RSA) to the list of known hosts.
> Last login: Wed Feb 9 20:49:34 2011 from 10.215.203.239
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
> domU-12-31-39-00-D1-F2
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ exit
> logout
> Connection to domU-12-31-39-00-D1-F2 closed.
> [tsakai@domU-12-31-39-0C-C8-01 ~]$
> [tsakai@domU-12-31-39-0C-C8-01 ~]$ exit
> logout
> Connection to domU-12-31-39-0C-C8-01 closed.
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ # back at machine A
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
> domU-12-31-39-00-D1-F2
>
> As you can see, neither machine uses a password for authentication; it uses
> public/private key pairs. There is no problem (that I can see) for ssh
> invocation from one machine to the other. This is so because I have a copy
> of the public key and a copy of the private key on each instance.
>
> The app.ac file is identical, except for the node names:
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ cat app.ac1
> -H domU-12-31-39-00-D1-F2 -np 1 Rscript /home/tsakai/fib.R 5
> -H domU-12-31-39-00-D1-F2 -np 1 Rscript /home/tsakai/fib.R 6
> -H domU-12-31-39-0C-C8-01 -np 1 Rscript /home/tsakai/fib.R 7
> -H domU-12-31-39-0C-C8-01 -np 1 Rscript /home/tsakai/fib.R 8
>
> Here’s what happens with mpirun:
>
> [tsakai@domU-12-31-39-00-D1-F2 ~]$ mpirun -app app.ac1
> tsakai@domu-12-31-39-0c-c8-01's password:
> Permission denied, please try again.
> tsakai@domu-12-31-39-0c-c8-01's password: mpirun: killing job...
>
>
> --------------------------------------------------------------------------
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
>
> --------------------------------------------------------------------------
>
> mpirun: clean termination accomplished
>
> [tsakai@domU-12-31-39-00-D1-F2 ~]$
>
> Mpirun (or somebody else?) asks me for a password, which I don’t have.
> I end up typing control-C.
>
> Here’s my question:
> How can I get past authentication by mpirun where there is no password?
>
> I would appreciate your help/insight greatly.
>
> Thank you.
>
> Tena Sakai
> tsa...@gallo.ucsf.edu
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

--
David Zhang
University of California, San Diego