[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-09-07 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19046 I'm going to close this; when I find some free time I might take a closer look at the issue described in Wilfred's message.

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-09-06 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19046 Yeah, if we are releasing and then reacquiring right away over and over again, that would be bad, but I don't know when we would do that, so more details would be great if possible.

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-09-06 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19046 I was told that the preemption issue was fixed in YARN (that's YARN-6210); so I don't think there's a need for this code currently (just use a fixed YARN if you want reliable preemption).

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-09-06 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19046 It might help if you can give an exact scenario where you see the issue, and perhaps configs if those matter. Meaning, do you have some set of minimum containers, a small idle timeout, etc.?

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-09-06 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19046 Unfortunately that isn't clear to me as to the cause or what the issue is with the yarn side. I'm not sure what he means by "AM is releasing and then acquiring the reservations again and

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-09-06 Thread Tagar
Github user Tagar commented on the issue: https://github.com/apache/spark/pull/19046 @tgravescs, here's a quote from Wilfred Spiegelenburg; I hope it answers both of your questions. > The behaviour I discussed earlier around the Spark AM reservations is not optimal. It turns

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-09-06 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19046 @Tagar can you be more specific about the problems you are seeing? How does this affect preemption? Why don't you see the same issues on MapReduce/Tez?

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-09-06 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19046 I'm in general not a fan of adding more and more config options; the end result is that most people won't enable it until they run into problems, and to me that's too late. I'm still

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-09-06 Thread Tagar
Github user Tagar commented on the issue: https://github.com/apache/spark/pull/19046 > This could just be an adhoc queue but the spark users would lose out to tez/mapreduce users. I'm pretty positive this will hurt spark users on some of our clusters so would want performance numbers

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-29 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19046 As I said, I do believe this is first and foremost a YARN problem and this was just trying to make Spark not trigger that. The things you're worried about can be adjusted (e.g. instead of using the

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-29 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19046 > The MR AM does something similar. Can you be more specific here? Which exact code are you referring to? The MR AM does do some headroom calculation for things like slow start and

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-28 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19046 BTW, in spite of the above, if you really feel strongly about it I might just drop this and tell the YARN folks to make their code scale better. But I really don't see the downsides you seem worried

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-28 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/19046 This has been a long-standing request from our YARN team; they've had users run into issues with YARN and traced them back to Spark making a large number of container requests. You can argue that it's an

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-28 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19046 There are other options, like changing the default for max executors to something reasonable. I'm not sure if it's related or not, but we are also looking at adding a config to limit # of

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-28 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19046 I would like to clarify why we are doing this. The jira has some discussion on it but I would like to know exactly what we are improving/fixing? If we do this change, I definitely

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19046 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19046 **[Test build #81108 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81108/testReport)** for PR 19046 at commit

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19046 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81108/

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19046 **[Test build #81108 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81108/testReport)** for PR 19046 at commit

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19046 Merged build finished. Test PASSed.

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19046 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81105/

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19046 **[Test build #81105 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81105/testReport)** for PR 19046 at commit

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19046 **[Test build #81105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81105/testReport)** for PR 19046 at commit

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19046 Merged build finished. Test PASSed.

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19046 **[Test build #81104 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81104/testReport)** for PR 19046 at commit

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19046 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81104/

[GitHub] spark issue #19046: [SPARK-18769][yarn] Limit resource requests based on RM'...

2017-08-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19046 **[Test build #81104 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81104/testReport)** for PR 19046 at commit