[jira] [Updated] (SPARK-4325) Improve spark-ec2 cluster launch times
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-4325:

    Fix Version/s: (was: 1.3.0)

Improve spark-ec2 cluster launch times

Key: SPARK-4325
URL: https://issues.apache.org/jira/browse/SPARK-4325
Project: Spark
Issue Type: Umbrella
Components: EC2
Reporter: Nicholas Chammas
Assignee: Nicholas Chammas
Priority: Minor

This is an umbrella task to capture several pieces of work related to significantly improving spark-ec2 cluster launch times.

There are several optimizations we know we can make to [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches faster.

There are also some improvements to the AMIs that will help a lot.

Potential improvements:
* Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This will reduce or eliminate SSH wait time and Ganglia init time.
* Replace instances of {{download; rsync to rest of cluster}} with parallel downloads on all nodes of the cluster.
* Replace instances of
{code}
for node in $NODES; do
  command
  sleep 0.3
done
wait
{code}
with simpler calls to {{pssh}}.
* Remove the [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665] when we wait for SSH availability now that we are already waiting for EC2 status checks to clear before testing SSH.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
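The last bullet in the description proposes dropping the linear backoff from the SSH wait. As a rough illustration only, the sketch below polls at a constant interval instead of lengthening the sleep on every attempt. It is not the code in spark_ec2.py: the names wait_for_ssh and tcp_port_open, the default interval, and the attempt limit are all assumptions made for this example, and the probe only checks that a TCP connection to port 22 succeeds, which is weaker than a real SSH login.

```python
import socket
import time
from typing import Callable, Iterable


def tcp_port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Hypothetical probe: True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ssh(
    hosts: Iterable[str],
    is_ssh_available: Callable[[str], bool] = tcp_port_open,
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until every host passes the probe; return the number of rounds used.

    Raises TimeoutError if any host still fails after max_attempts rounds.
    """
    pending = list(hosts)
    for attempt in range(1, max_attempts + 1):
        # Re-check only the hosts that were unreachable in the previous round.
        pending = [host for host in pending if not is_ssh_available(host)]
        if not pending:
            return attempt
        if attempt < max_attempts:
            # Constant delay. A linear backoff would instead sleep for
            # attempt * interval_seconds, growing with every round.
            sleep(interval_seconds)
    raise TimeoutError("SSH still unavailable on: " + ", ".join(pending))
```

The probe and the sleep function are passed in as parameters so the loop can be exercised without real instances. The reasoning in the description is that once EC2 status checks have cleared, SSH should come up soon afterwards, so a short fixed interval wastes less time than an ever-growing one.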
[jira] [Updated] (SPARK-4325) Improve spark-ec2 cluster launch times
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Chammas updated SPARK-4325:

    Description:
This is an umbrella task to capture several pieces of work related to significantly improving spark-ec2 cluster launch times.

There are several optimizations we know we can make to [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches faster.

There are also some improvements to the AMIs that will help a lot.

Potential improvements:
* Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This will reduce or eliminate SSH wait time and Ganglia init time.
* Replace instances of {{download; rsync to rest of cluster}} with parallel downloads on all nodes of the cluster.
* Replace instances of
{code}
for node in $NODES; do
  command
  sleep 0.3
done
wait
{code}
with simpler calls to {{pssh}}.
* Remove the [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665] when we wait for SSH availability now that we are already waiting for EC2 status checks to clear before testing SSH.

    was:
There are several optimizations we know we can make to [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches faster.

There are also some improvements to the AMIs that will help a lot.

Potential improvements:
* Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This will reduce or eliminate SSH wait time and Ganglia init time.
* Replace instances of {{download; rsync to rest of cluster}} with parallel downloads on all nodes of the cluster.
* Replace instances of
{code}
for node in $NODES; do
  command
  sleep 0.3
done
wait
{code}
with simpler calls to {{pssh}}.
* Remove the [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665] when we wait for SSH availability now that we are already waiting for EC2 status checks to clear before testing SSH.

Improve spark-ec2 cluster launch times

Key: SPARK-4325
URL: https://issues.apache.org/jira/browse/SPARK-4325
Project: Spark
Issue Type: Umbrella
Components: EC2
Reporter: Nicholas Chammas
Assignee: Nicholas Chammas
Priority: Minor
Fix For: 1.3.0

This is an umbrella task to capture several pieces of work related to significantly improving spark-ec2 cluster launch times.

There are several optimizations we know we can make to [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches faster.

There are also some improvements to the AMIs that will help a lot.

Potential improvements:
* Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This will reduce or eliminate SSH wait time and Ganglia init time.
* Replace instances of {{download; rsync to rest of cluster}} with parallel downloads on all nodes of the cluster.
* Replace instances of
{code}
for node in $NODES; do
  command
  sleep 0.3
done
wait
{code}
with simpler calls to {{pssh}}.
* Remove the [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665] when we wait for SSH availability now that we are already waiting for EC2 status checks to clear before testing SSH.
[jira] [Updated] (SPARK-4325) Improve spark-ec2 cluster launch times
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Chammas updated SPARK-4325:

    Description:
There are several optimizations we know we can make to [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches faster.

There are also some improvements to the AMIs that will help a lot.

Potential improvements:
* Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This will reduce or eliminate SSH wait time and Ganglia init time.
* Replace instances of {{download; rsync to rest of cluster}} with parallel downloads on all nodes of the cluster.
* Replace instances of
{code}
for node in $NODES; do
  command
  sleep 0.3
done
wait
{code}
with simpler calls to {{pssh}}.
* Remove the [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665] when we wait for SSH availability now that we are already waiting for EC2 status checks to clear before testing SSH.

    was:
There are several optimizations we know we can make to [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches faster.

There are also some improvements to the AMIs that will help a lot.

Improve spark-ec2 cluster launch times

Key: SPARK-4325
URL: https://issues.apache.org/jira/browse/SPARK-4325
Project: Spark
Issue Type: Improvement
Components: EC2
Reporter: Nicholas Chammas
Priority: Minor

There are several optimizations we know we can make to [{{setup.sh}} | https://github.com/mesos/spark-ec2/blob/v4/setup.sh] to make cluster launches faster.

There are also some improvements to the AMIs that will help a lot.

Potential improvements:
* Upgrade the Spark AMIs and pre-install tools like Ganglia on them. This will reduce or eliminate SSH wait time and Ganglia init time.
* Replace instances of {{download; rsync to rest of cluster}} with parallel downloads on all nodes of the cluster.
* Replace instances of
{code}
for node in $NODES; do
  command
  sleep 0.3
done
wait
{code}
with simpler calls to {{pssh}}.
* Remove the [linear backoff | https://github.com/apache/spark/blob/b32734e12d5197bad26c080e529edd875604c6fb/ec2/spark_ec2.py#L665] when we wait for SSH availability now that we are already waiting for EC2 status checks to clear before testing SSH.
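Two of the bullets above (parallel downloads on every node, and replacing the staggered shell loop with pssh) come down to the same idea: start one command per node concurrently and collect the results. The sketch below shows that pattern with a thread pool. It is not pssh and not part of spark-ec2; the helper names ssh_command and run_on_all, the root@ login, and the worker limit are assumptions made for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def ssh_command(node: str, remote_command: str) -> List[str]:
    """Build an ssh argv for one node (the root@ login is an assumption)."""
    return ["ssh", f"root@{node}", remote_command]


def run_on_all(
    nodes: Iterable[str],
    build_command: Callable[[str], List[str]],
    max_workers: int = 32,
) -> Dict[str, subprocess.CompletedProcess]:
    """Run build_command(node) for every node concurrently.

    Returns a mapping from node name to its CompletedProcess, so callers can
    inspect returncode, stdout, and stderr per node.
    """
    node_list = list(nodes)
    if not node_list:
        return {}

    def run_one(node: str) -> subprocess.CompletedProcess:
        # Capture output per node instead of interleaving it on the terminal.
        return subprocess.run(build_command(node), capture_output=True, text=True)

    # One thread per node up to max_workers; each thread just waits on a
    # child process, so threads are sufficient here.
    with ThreadPoolExecutor(max_workers=min(max_workers, len(node_list))) as pool:
        results = list(pool.map(run_one, node_list))
    return dict(zip(node_list, results))
```

A call such as run_on_all(nodes, lambda n: ssh_command(n, "uptime")) would run the same command on each node at once. A parallel download would pass a fetch command in place of "uptime", so that every node pulls the file itself rather than the master downloading it once and rsyncing it out.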