[
https://issues.apache.org/jira/browse/SPARK-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicholas Chammas updated SPARK-5473:
------------------------------------
Description:
If there is some fatal problem with launching a cluster, `spark-ec2` just hangs
without giving the user useful feedback on what the problem is.
This PR exposes the output of the SSH calls to the user if the SSH test fails
during cluster launch for any reason even though the instance status checks are
all green.
For example:
```
$ ./ec2/spark-ec2 -k key -i /incorrect/path/identity.pem --instance-type m3.medium --slaves 1 --zone us-east-1c launch "spark-test"
Setting up security groups...
Searching for existing cluster spark-test...
Spark AMI: ami-35b1885c
Launching instances...
Launched 1 slaves in us-east-1c, regid = r-7dadd096
Launched master in us-east-1c, regid = r-fcadd017
Waiting for cluster to enter 'ssh-ready' state...
Warning: SSH connection error. (This could be temporary.)
Host: 127.0.0.1
SSH return code: 255
SSH output: Warning: Identity file /incorrect/path/identity.pem not accessible: No such file or directory.
Warning: Permanently added '127.0.0.1' (RSA) to the list of known hosts.
Permission denied (publickey).
```
This should give users enough information to recognize an unrecoverable error
during launch and abort it. This will help avoid
situations like the ones reported [here on Stack
Overflow](http://stackoverflow.com/q/28002443/) and [here on the user
list](http://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%[email protected]%3E),
where the users couldn't tell what the problem was because it was being hidden
by `spark-ec2`.
This is a usability improvement that should be backported to 1.2.
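
The following is a minimal Python sketch of the idea, not the actual `spark-ec2` change. It runs a test command, and if the exit code is non-zero it prints the host, the return code, and the captured output in the same shape as the example above. The function names `ssh_command` and `is_ssh_available`, the `print_output` parameter, and the choice of SSH options are assumptions made for illustration.

```python
import subprocess


def ssh_command(host, user, identity_file):
    """Build a non-interactive ssh invocation that only runs `true` remotely.

    Hypothetical helper: the option set here is an assumption, not
    necessarily what spark-ec2 passes.
    """
    return [
        "ssh",
        "-o", "StrictHostKeyChecking=no",
        "-o", "ConnectTimeout=3",
        "-o", "BatchMode=yes",  # fail instead of prompting for a password
        "-i", identity_file,
        "{}@{}".format(user, host),
        "true",
    ]


def is_ssh_available(host, cmd, print_output=True):
    """Run `cmd`; return True on exit code 0, otherwise report why it failed."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # ssh writes its diagnostics to stderr
        universal_newlines=True,
    )
    if proc.returncode != 0 and print_output:
        # Surface the failure to the user instead of silently retrying.
        print("Warning: SSH connection error. (This could be temporary.)")
        print("Host: {}".format(host))
        print("SSH return code: {}".format(proc.returncode))
        print("SSH output: {}".format(proc.stdout.strip()))
    return proc.returncode == 0
```

A launch loop could call `is_ssh_available(host, ssh_command(host, "root", identity_file))` on each pass and enable `print_output` only once the instance status checks are green, so that expected early connection failures stay quiet. The `"root"` user here is likewise an assumption.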
> Expose SSH failures after status checks pass
> --------------------------------------------
>
> Key: SPARK-5473
> URL: https://issues.apache.org/jira/browse/SPARK-5473
> Project: Spark
> Issue Type: Improvement
> Components: EC2
> Affects Versions: 1.2.0
> Reporter: Nicholas Chammas
> Assignee: Nicholas Chammas
> Priority: Minor
> Fix For: 1.3.0