[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Chammas reopened SPARK-4325:
-------------------------------------

Hey [~joshrosen], though [#3195|https://github.com/apache/spark/pull/3195] 
relates to this JIRA issue, it does not resolve it completely. There are 
several other improvements described here that have not been implemented yet.

In the future, should we try to have one PR match one JIRA issue? This issue 
could easily be an umbrella issue spanning several sub-tasks, one of which has 
been taken care of by the aforementioned PR.

> Improve spark-ec2 cluster launch times
> --------------------------------------
>
>                 Key: SPARK-4325
>                 URL: https://issues.apache.org/jira/browse/SPARK-4325
>             Project: Spark
>          Issue Type: Improvement
>          Components: EC2
>            Reporter: Nicholas Chammas
>            Assignee: Nicholas Chammas
>            Priority: Minor
>             Fix For: 1.3.0
>
>
> There are several optimizations we know we can make to 
> [{{setup.sh}}|https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to speed up 
> cluster launches.
> There are also some improvements to the AMIs that will help a lot.
> Potential improvements:
> * Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This 
> will reduce or eliminate SSH wait time and Ganglia init time.
> * Replace instances of {{download; rsync to rest of cluster}} with parallel 
> downloads on all nodes of the cluster (see the {{pssh}} sketch below).
> * Replace instances of
> {code}
> for node in $NODES; do
>   command &    # some per-node action (e.g. ssh/scp to $node), run in the background
>   sleep 0.3    # throttle how quickly connections are opened
> done
> wait           # block until all backgrounded commands finish
> {code}
>  with simpler calls to {{pssh}} (see the sketch below).
> * Remove the [linear backoff|https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665]
>  when we wait for SSH availability, now that we are already waiting for EC2 
> status checks to clear before testing SSH (a fixed-interval polling sketch follows below).
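>
> A rough sketch of what the {{pssh}}-based replacements could look like. This is 
> only an illustration, not the actual {{setup.sh}} code; the hosts file path, the 
> ganglia package name, and the download URL below are placeholders:
> {code}
> # Run the same command on every slave in parallel, instead of looping with sleeps.
> # /root/spark-ec2/slaves is assumed to hold one slave hostname per line.
> pssh -h /root/spark-ec2/slaves -p 50 -t 0 -i "yum install -y ganglia"
>
> # Likewise, "download on the master; rsync to the slaves" becomes a parallel
> # download on every node (the URL is only an example):
> pssh -h /root/spark-ec2/slaves -p 50 -t 0 -i \
>   "wget -q http://example.com/spark.tgz -O /root/spark.tgz && tar xzf /root/spark.tgz -C /root"
> {code}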
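>
> Once the EC2 status checks have passed, a short, constant polling interval should 
> be enough while waiting for SSH. A bash sketch of the idea ({{$HOST}} and 
> {{$KEY_FILE}} are placeholders, not variables from {{spark_ec2.py}}):
> {code}
> # Poll SSH at a constant interval instead of backing off linearly; the
> # instance is already known to be running and reachable at this point.
> while ! ssh -o ConnectTimeout=3 -o StrictHostKeyChecking=no \
>     -i "$KEY_FILE" "root@$HOST" true 2> /dev/null; do
>   sleep 5
> done
> {code}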



