Hi David,

Thank you for your reply.

> just to be sure:
> having the ssh public keys in other computer's authorized_keys file.
> ssh keys generated without passphrases

Yes.  As my session transcript shows, invoking ssh manually is
not a problem.  What I cannot do is use the mpirun command (which I
believe uses ssh underneath) in the same setting, i.e., with a private
key and a public key, the latter in the destination’s authorized_keys
file.

Regards,

Tena


On 2/9/11 10:58 PM, "David Zhang" <solarbik...@gmail.com> wrote:

I don't really know what the problem is.  It seems like you're doing things
correctly.  I'm almost sure you've done all of the following, but just to be
sure:
having the ssh public keys in other computer's authorized_keys file.
ssh keys generated without passphrases

On Wed, Feb 9, 2011 at 10:08 PM, Tena Sakai <tsa...@gallo.ucsf.edu> wrote:
Hi,

I have made a bit of progress(?)...
I made a config file in my .ssh directory on the cloud.  It looks like:
    # machine A
    Host domU-12-31-39-07-35-21.compute-1.internal
    HostName domU-12-31-39-07-35-21
    BatchMode yes
    IdentityFile /home/tsakai/.ssh/tsakai
    ChallengeResponseAuthentication no
    IdentitiesOnly yes

    # machine B
    Host domU-12-31-39-06-74-E2.compute-1.internal
    HostName domU-12-31-39-06-74-E2
    BatchMode yes
    IdentityFile /home/tsakai/.ssh/tsakai
    ChallengeResponseAuthentication no
    IdentitiesOnly yes

This file exists on both machine A and machine B.
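
For comparison, a minimal sketch of a pared-down config, written as a small shell helper so the same stanzas can be generated on both machines. It uses only standard ssh_config keywords (Host, IdentityFile, IdentitiesOnly, BatchMode). The function names print_host_stanza and print_config are hypothetical, and the sketch assumes the Host alias is exactly the name later given to ssh or to mpirun's -H option, since ssh matches Host patterns against the name as typed on the command line.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenSSH or Open MPI): prints one
# ~/.ssh/config stanza for a host, using only standard ssh_config keywords.
print_host_stanza() {
    host_alias=$1   # assumed to be the name passed to ssh / mpirun -H
    key_file=$2     # private key to offer, e.g. /home/tsakai/.ssh/tsakai
    printf 'Host %s\n' "$host_alias"
    printf '    IdentityFile %s\n' "$key_file"
    printf '    IdentitiesOnly yes\n'
    printf '    BatchMode yes\n'
}

# Prints one stanza per host, separated by blank lines.
# First argument is the key file; remaining arguments are host aliases.
print_config() {
    key_file=$1
    shift
    for host_alias in "$@"; do
        print_host_stanza "$host_alias" "$key_file"
        printf '\n'
    done
}
```

Calling print_config /home/tsakai/.ssh/tsakai followed by the two node names prints both stanzas to standard output, which can be reviewed before being saved as ~/.ssh/config.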

Now when I issue the mpirun command as below:
    [tsakai@domU-12-31-39-06-74-E2 ~]$ mpirun -app app.ac2

It hangs.  I control-C out of it and I get:

    mpirun: killing job...

    --------------------------------------------------------------------------
    mpirun noticed that the job aborted, but has no info as to the process
    that caused that situation.
    --------------------------------------------------------------------------
    --------------------------------------------------------------------------
    mpirun was unable to cleanly terminate the daemons on the nodes shown
    below. Additional manual cleanup may be required - please refer to
    the "orte-clean" tool for assistance.
    --------------------------------------------------------------------------
        domU-12-31-39-07-35-21.compute-1.internal - daemon did not report back when launched

Am I making progress?

Does this mean I am past authentication and something else is the problem?
Does someone have an example .ssh/config file I can look at?  There are so
many keyword-argument pairs for this config file, and I would like to look at
a very basic one that works.


Thank you.

Tena Sakai
tsa...@gallo.ucsf.edu

On 2/9/11 7:52 PM, "Tena Sakai" <tsa...@gallo.ucsf.edu> wrote:

Hi

I have an app.ac1 file like below:
    [tsakai@vixen local]$ cat app.ac1
    -H vixen.egcrc.org    -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 5
    -H vixen.egcrc.org    -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 6
    -H blitzen.egcrc.org  -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 7
    -H blitzen.egcrc.org  -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 8

The program I run is
    Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R x
where x is in [5..8].  The machines vixen and blitzen each handle 2 runs.

Here’s the program fib.R:
    [tsakai@vixen local]$ cat fib.R
    # fib() computes, given index n, fibonacci number iteratively
    # here's the first dozen sequence (indexed from 0..11)
    # 1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89

    fib <- function( n ) {
        a <- 0
        b <- 1
        for ( i in 1:n ) {
            t <- b
            b <- a
            a <- a + t
        }
        a
    }

    arg <- commandArgs( TRUE )
    myHost <- system( 'hostname', intern=TRUE )
    cat( fib(arg), myHost, '\n' )

It reads an argument from the command line and produces the fibonacci number
that corresponds to that index, followed by the machine name.  Pretty simple
stuff.

Here’s the run output:
    [tsakai@vixen local]$ mpirun -app app.ac1
    5 vixen.egcrc.org
    8 vixen.egcrc.org
    13 blitzen.egcrc.org
    21 blitzen.egcrc.org

Which is exactly what I expect.  So far so good.

Now I want to run the same thing on the cloud.  I launch 2 instances of the
same virtual machine, which I get to by:
    [tsakai@vixen local]$ ssh -A -i ~/.ssh/tsakai machine-instance-A-public-dns

Now I am on machine A:
    [tsakai@domU-12-31-39-00-D1-F2 ~]$
    [tsakai@domU-12-31-39-00-D1-F2 ~]$ # and I can go to machine B without password authentication,
    [tsakai@domU-12-31-39-00-D1-F2 ~]$ # i.e., use public/private key
    [tsakai@domU-12-31-39-00-D1-F2 ~]$
    [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
    domU-12-31-39-00-D1-F2
    [tsakai@domU-12-31-39-00-D1-F2 ~]$ ssh -i .ssh/tsakai domU-12-31-39-0C-C8-01
    Last login: Wed Feb  9 20:51:48 2011 from 10.254.214.4
    [tsakai@domU-12-31-39-0C-C8-01 ~]$
    [tsakai@domU-12-31-39-0C-C8-01 ~]$ # I am now on machine B
    [tsakai@domU-12-31-39-0C-C8-01 ~]$ hostname
    domU-12-31-39-0C-C8-01
    [tsakai@domU-12-31-39-0C-C8-01 ~]$
    [tsakai@domU-12-31-39-0C-C8-01 ~]$ # now show I can get to machine A without using password
    [tsakai@domU-12-31-39-0C-C8-01 ~]$
    [tsakai@domU-12-31-39-0C-C8-01 ~]$ ssh -i .ssh/tsakai domU-12-31-39-00-D1-F2
    The authenticity of host 'domu-12-31-39-00-d1-f2 (10.254.214.4)' can't be established.
    RSA key fingerprint is e3:ad:75:b1:a4:63:7f:0f:c4:0b:10:71:f3:2f:21:81.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'domu-12-31-39-00-d1-f2' (RSA) to the list of known hosts.
    Last login: Wed Feb  9 20:49:34 2011 from 10.215.203.239
    [tsakai@domU-12-31-39-00-D1-F2 ~]$
    [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
    domU-12-31-39-00-D1-F2
    [tsakai@domU-12-31-39-00-D1-F2 ~]$
    [tsakai@domU-12-31-39-00-D1-F2 ~]$ exit
    logout
    Connection to domU-12-31-39-00-D1-F2 closed.
    [tsakai@domU-12-31-39-0C-C8-01 ~]$
    [tsakai@domU-12-31-39-0C-C8-01 ~]$ exit
    logout
    Connection to domU-12-31-39-0C-C8-01 closed.
    [tsakai@domU-12-31-39-00-D1-F2 ~]$
    [tsakai@domU-12-31-39-00-D1-F2 ~]$ # back at machine A
    [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
    domU-12-31-39-00-D1-F2

As you can see, neither machine uses a password for authentication; each uses
public/private key pairs.  There is no problem (that I can see) with ssh
invocation from one machine to the other.  This is because I have a copy of
the public key and a copy of the private key on each instance.
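
The manual logins above all pass -i .ssh/tsakai explicitly.  A minimal sketch of a check that omits -i follows, on the assumption (not confirmed anywhere in this thread) that mpirun's launcher calls plain ssh without that flag.  BatchMode=yes and ConnectTimeout are standard ssh options; check_host and the SSH_CMD override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: tries a non-interactive login WITHOUT -i.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a failure here suggests a launcher calling plain ssh would be
# prompted as well (assumption: the launcher does not pass -i).
# SSH_CMD is an illustrative override; it defaults to the real ssh.
check_host() {
    remote_host=$1
    if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=10 \
            "$remote_host" hostname >/dev/null 2>&1; then
        printf '%s: key login works without -i\n' "$remote_host"
    else
        printf '%s: login without -i failed\n' "$remote_host"
        return 1
    fi
}
```

Running check_host once per node name from the app file, on each machine, shows whether the key is picked up without being named on the command line.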

The app.ac file is identical, except for the node names:
    [tsakai@domU-12-31-39-00-D1-F2 ~]$ cat app.ac1
    -H domU-12-31-39-00-D1-F2 -np 1 Rscript /home/tsakai/fib.R 5
    -H domU-12-31-39-00-D1-F2 -np 1 Rscript /home/tsakai/fib.R 6
    -H domU-12-31-39-0C-C8-01 -np 1 Rscript /home/tsakai/fib.R 7
    -H domU-12-31-39-0C-C8-01 -np 1 Rscript /home/tsakai/fib.R 8

Here’s what happens with mpirun:

    [tsakai@domU-12-31-39-00-D1-F2 ~]$ mpirun -app app.ac1
    tsakai@domu-12-31-39-0c-c8-01's password:
    Permission denied, please try again.
    tsakai@domu-12-31-39-0c-c8-01's password: mpirun: killing job...

    --------------------------------------------------------------------------
    mpirun noticed that the job aborted, but has no info as to the process
    that caused that situation.
    --------------------------------------------------------------------------

    mpirun: clean termination accomplished

    [tsakai@domU-12-31-39-00-D1-F2 ~]$

mpirun (or something else?) asks me for a password, which I don’t have.
I end up typing control-C.

Here’s my question:
How can I get past authentication by mpirun when there is no password?

I would appreciate your help/insight greatly.

Thank you.

Tena Sakai
tsa...@gallo.ucsf.edu






_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

