[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512215#comment-14512215 ]

Xiangrui Meng commented on SPARK-3213:
--------------------------------------

We reverted this change in https://github.com/apache/spark/pull/2225, so we should be able to use Launch More Like This now. The major benefits of using Launch More Like This are spot instances and persisted slaves.

spark_ec2.py cannot find slave instances launched with Launch More Like This
----------------------------------------------------------------------------

Key: SPARK-3213
URL: https://issues.apache.org/jira/browse/SPARK-3213
Project: Spark
Issue Type: Improvement
Components: EC2
Affects Versions: 1.1.0
Reporter: Joseph K. Bradley
Priority: Critical
Fix For: 1.1.0
Attachments: Screen Shot 2014-08-25 at 6.45.35 PM.png, Screen Shot 2014-08-26 at 1.22.32 PM.png

spark_ec2.py cannot find all slave instances. In particular:
* I created a master and a slave and configured them.
* I created new slave instances from the original slave (Launch More Like This).
* I tried to relaunch the cluster, and it could only find the original slave.

Old versions of the script worked. The latest working commit which edited that .py script is: a0bcbc159e89be868ccc96175dbf1439461557e1

There may be a problem with this PR: [https://github.com/apache/spark/pull/1899].

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512233#comment-14512233 ]

Nicholas Chammas commented on SPARK-3213:
-----------------------------------------

Thanks for the background, Joseph and Xiangrui.

[~mengxr] - What do you mean by persisted slave?
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512188#comment-14512188 ]

Nicholas Chammas commented on SPARK-3213:
-----------------------------------------

Hey people, is the main motivation for using Launch more like this the fact that spark-ec2 does not support directly adding slaves to an existing cluster ([SPARK-2008])?
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512208#comment-14512208 ]

Joseph K. Bradley commented on SPARK-3213:
------------------------------------------

When I made this issue, it was because I wanted to create new slaves with the same non-standard libraries already installed. I hardly ever need to use that functionality though.
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14112652#comment-14112652 ]

Vida Ha commented on SPARK-3213:
--------------------------------

I have a pull request that fixes the issue by copying the tags from the spot request to the instances. Joseph - would you mind verifying if this change solves the problem? I've tried myself and it works, but can't hurt to have another person try it too.

https://github.com/apache/spark/pull/2163
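
The idea in the comment above, copying the tags from each spot request onto the instance that fulfilled it, can be illustrated roughly as follows. This is only a sketch and not the code from the pull request: the record layout (plain dicts with `instance_id` and `tags` keys), the function name `copy_spot_request_tags`, and the sample tag value are all assumptions made up for illustration, and the real script would read and write these tags through the EC2 API rather than in memory.

```python
# Editorial sketch only; not taken from the pull request.
# Assumed (hypothetical) data layout:
#   spot_requests: list of {"instance_id": str or None, "tags": {key: value}}
#   instance_tags: {instance_id: {key: value}}


def copy_spot_request_tags(spot_requests, instance_tags):
    """Return a new {instance_id: tags} map with spot-request tags merged in.

    Tags already present on an instance are kept; only missing keys are
    filled in from the spot request. The inputs are not modified.
    """
    merged = {iid: dict(tags) for iid, tags in instance_tags.items()}
    for request in spot_requests:
        instance_id = request.get("instance_id")
        if instance_id is None:
            # The request has not been fulfilled, so there is nothing to tag.
            continue
        target = merged.setdefault(instance_id, {})
        for key, value in request.get("tags", {}).items():
            target.setdefault(key, value)
    return merged


# Made-up sample data: "i-2" came from a spot request and has no tags yet.
spot_requests = [
    {"instance_id": "i-2", "tags": {"Name": "demo-cluster-slave"}},
    {"instance_id": None, "tags": {"Name": "demo-cluster-slave"}},
]
instance_tags = {"i-1": {"Name": "demo-cluster-slave"}, "i-2": {}}

merged = copy_spot_request_tags(spot_requests, instance_tags)
```

After the merge, a lookup by the `Name` tag would see both `i-1` and `i-2`, which is the behaviour the comment describes wanting.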
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14112748#comment-14112748 ]

Joseph K. Bradley commented on SPARK-3213:
------------------------------------------

Testing now...
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110353#comment-14110353 ]

Patrick Wendell commented on SPARK-3213:
----------------------------------------

Hey I don't think we previously supported adding slaves like this, so I'm renaming this from a bug to a feature :)
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111249#comment-14111249 ]

Vida Ha commented on SPARK-3213:
--------------------------------

Okay, I'm able to reproduce now. It occurs when you use Launch More Like This and request spot instances. If you use Launch More Like This without spot instances, it's okay.

Why is it necessary to use Launch More Like This? Why not always use the scripts? Are there use cases that can't be covered by the scripts?
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111264#comment-14111264 ]

Vida Ha commented on SPARK-3213:
--------------------------------

FYI - more info on the bug:

Amazon seems to be copying the tags to the spot requests rather than the instances themselves - see the image above. It may be possible to detect these instances by cycling through spot requests and checking those names...

-Vida
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109697#comment-14109697 ]

Joseph K. Bradley commented on SPARK-3213:
------------------------------------------

[~vidaha] Please take a look. Thanks!
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109700#comment-14109700 ]

Joseph K. Bradley commented on SPARK-3213:
------------------------------------------

The security group name I was using was joseph-r3.2xlarge-slaves.

It may be a regex/matching issue.
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109818#comment-14109818 ]

Vida Ha commented on SPARK-3213:
--------------------------------

Joseph, Josh, and I discussed in person. There is a quick workaround:

1) Use an old version of the spark_ec2 scripts that uses security groups to identify the slaves, if using Launch More Like This.

But now I need to investigate: when using Launch More Like This, it does seem like Amazon tries to reuse the tags, but I'm wondering if it doesn't like having multiple machines with the same Name tag. I will try using a different tag, like spark-ec2-cluster-id or something like that, to identify the machine. If that tag does copy over, then we can properly support Launch More Like This.
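
To make the difference between the two identification strategies concrete, here is a small sketch. It is not taken from spark_ec2.py: the instance records are plain dicts invented for illustration, the helper names are hypothetical, the `<cluster>-slaves` group naming is inferred from the group name quoted earlier in this thread, and `spark-ec2-cluster-id` is only the tag name floated in the comment above.

```python
# Editorial sketch only; the records and helper names are hypothetical.
# Each instance is assumed to look like:
#   {"id": str, "groups": [security group names], "tags": {key: value}}


def find_slaves_by_group(instances, cluster_name):
    """Identify slaves by membership in the '<cluster>-slaves' security group."""
    group_name = cluster_name + "-slaves"
    return [inst["id"] for inst in instances if group_name in inst["groups"]]


def find_slaves_by_tag(instances, cluster_name, tag_key="spark-ec2-cluster-id"):
    """Identify slaves by a cluster tag; misses instances whose tags were not copied."""
    return [
        inst["id"] for inst in instances
        if inst["tags"].get(tag_key) == cluster_name
    ]


# Made-up inventory: "i-clone" stands for a slave whose tags did not carry over.
instances = [
    {"id": "i-original", "groups": ["joseph-r3.2xlarge-slaves"],
     "tags": {"spark-ec2-cluster-id": "joseph-r3.2xlarge"}},
    {"id": "i-clone", "groups": ["joseph-r3.2xlarge-slaves"], "tags": {}},
    {"id": "i-other", "groups": ["other-cluster-slaves"],
     "tags": {"spark-ec2-cluster-id": "other-cluster"}},
]

by_group = find_slaves_by_group(instances, "joseph-r3.2xlarge")
by_tag = find_slaves_by_tag(instances, "joseph-r3.2xlarge")
```

With this sample data the group-based lookup returns both slaves while the tag-based lookup returns only the original one, which mirrors the symptom reported in the issue. The exact string comparison on the group name is a deliberate choice in the sketch, so that a name containing a dot, such as `joseph-r3.2xlarge`, is never interpreted as a pattern.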
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109828#comment-14109828 ]

Vida Ha commented on SPARK-3213:
--------------------------------

Can someone rename this issue to:

spark_ec2.py cannot find slave instances launched with Launch More Like This

I think that's more indicative of the issue - it's not wider than that.
[jira] [Commented] (SPARK-3213) spark_ec2.py cannot find slave instances launched with Launch More Like This
[ https://issues.apache.org/jira/browse/SPARK-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110149#comment-14110149 ]

Vida Ha commented on SPARK-3213:
--------------------------------

Hi Joseph,

Can you tell me more about how you launched these, without copying the tags? I used Launch More Like This, and the name and tags were copied over correctly. I'm wondering whether, when you were using EC2, you could have been so unlucky as to have triggered a temporary outage in copying tags...

Let's sync up in person tomorrow and figure out if this was a one time problem or happens each time Launch