[jira] [Resolved] (SPARK-4325) Improve spark-ec2 cluster launch times
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-4325.
------------------------------
    Resolution: Done

> Improve spark-ec2 cluster launch times
> --------------------------------------
>
>                 Key: SPARK-4325
>                 URL: https://issues.apache.org/jira/browse/SPARK-4325
>             Project: Spark
>          Issue Type: Umbrella
>          Components: EC2
>            Reporter: Nicholas Chammas
>            Assignee: Nicholas Chammas
>            Priority: Minor
>
> This is an umbrella task to capture several pieces of work related to
> significantly improving spark-ec2 cluster launch times.
> There are several optimizations we know we can make to
> [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh]
> to make cluster launches faster.
> There are also some improvements to the AMIs that will help a lot.
> Potential improvements:
> * Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This
> will reduce or eliminate SSH wait time and Ganglia init time.
> * Replace instances of {{download; rsync to rest of cluster}} with parallel
> downloads on all nodes of the cluster.
> * Replace instances of
> {code}
> for node in $NODES; do
>   command
>   sleep 0.3
> done
> wait
> {code}
> with simpler calls to {{pssh}}.
> * Remove the
> [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665]
> when we wait for SSH availability now that we are already waiting for EC2
> status checks to clear before testing SSH.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
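For readers following the thread, here is a rough sketch of the idea behind the "parallel downloads" and {{pssh}} bullets in the issue description: run the same step on every node at once instead of looping over nodes one at a time. It is written in Python rather than shell, and it is not taken from spark-ec2. The names `run_on_all` and `ssh_command` are hypothetical, and a real setup script would more likely call {{pssh}} directly.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def run_on_all(nodes, command, max_workers=32):
    """Call command(node) for every node concurrently.

    Returns a dict mapping each node to the value command returned for it.
    Hypothetical helper for illustration only.
    """
    if not nodes:
        return {}
    workers = min(max_workers, len(nodes))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as `nodes`.
        results = pool.map(command, nodes)
        return dict(zip(nodes, results))


def ssh_command(remote_cmd):
    """Build a per-node callable that runs remote_cmd over ssh.

    Assumes password-less ssh to each node is already configured.
    """
    def run(node):
        # Returns the exit status of the ssh process.
        return subprocess.call(["ssh", node, remote_cmd])
    return run
```

With this shape, something like `run_on_all(nodes, ssh_command("some setup step"))` takes roughly as long as the slowest node rather than the sum over all nodes, which is the effect the description is after.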
[jira] [Resolved] (SPARK-4325) Improve spark-ec2 cluster launch times
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-4325.
------------------------------
    Resolution: Fixed

Looks like sub-tasks are all resolved

> Improve spark-ec2 cluster launch times
> --------------------------------------
>
>                 Key: SPARK-4325
>                 URL: https://issues.apache.org/jira/browse/SPARK-4325
>             Project: Spark
>          Issue Type: Improvement
>          Components: EC2
>            Reporter: Nicholas Chammas
>            Assignee: Nicholas Chammas
>            Priority: Minor
>             Fix For: 1.3.0
>
>
> There are several optimizations we know we can make to
> [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh]
> to make cluster launches faster.
> There are also some improvements to the AMIs that will help a lot.
> Potential improvements:
> * Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This
> will reduce or eliminate SSH wait time and Ganglia init time.
> * Replace instances of {{download; rsync to rest of cluster}} with parallel
> downloads on all nodes of the cluster.
> * Replace instances of
> {code}
> for node in $NODES; do
>   command
>   sleep 0.3
> done
> wait
> {code}
> with simpler calls to {{pssh}}.
> * Remove the
> [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665]
> when we wait for SSH availability now that we are already waiting for EC2
> status checks to clear before testing SSH.
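As a footnote to the last bullet in the description, this is a minimal sketch of why dropping a linear backoff can shorten the SSH wait. It is an illustration under assumptions, not the code in `spark_ec2.py`: the function names, the default interval, and the "n-th retry sleeps n * step seconds" backoff shape are all hypothetical.

```python
import time


def wait_fixed(is_ready, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Poll is_ready() at a constant interval.

    Returns the number of polls it took to succeed, or raises TimeoutError
    once max_attempts polls have all failed.
    """
    for attempt in range(1, max_attempts + 1):
        if is_ready():
            return attempt
        sleep(interval)
    raise TimeoutError("not ready after %d attempts" % max_attempts)


def linear_backoff_total(failed_attempts, step):
    """Total seconds slept if the n-th failed attempt sleeps n * step seconds."""
    return sum(step * n for n in range(1, failed_attempts + 1))


def fixed_total(failed_attempts, interval):
    """Total seconds slept with a constant interval between attempts."""
    return failed_attempts * interval
```

Under these assumptions, five failed attempts with a 10-second step cost 150 seconds of sleeping with the linear backoff and 50 seconds with a fixed interval. Once the EC2 status checks have already cleared, SSH is likely to be close to ready, so the growing sleeps mostly add dead time.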
[jira] [Resolved] (SPARK-4325) Improve spark-ec2 cluster launch times
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Rosen resolved SPARK-4325.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0
         Assignee: Nicholas Chammas

Fixed by https://github.com/apache/spark/pull/3195 in 1.3.0.