[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-10-02 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2485

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-10-02 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57683434 I committed this. I missed that there wasn't a JIRA here, so I filed https://issues.apache.org/jira/browse/SPARK-3768.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-10-02 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57705659 Thanks @tgravescs

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-10-01 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57518030 retest this please

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-30 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57326448 @andrewor14 did you have any further comments on this?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-30 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57341483 I think this is fine. I spotted one semicolon but I'll let that go. LGTM.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-30 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57360651 Semicolon removed (nice catch)

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56979002 It seems a bit much to have 2 configs to do essentially the same thing. I see it leading to confusion and just extra overhead.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56987453 Yes I'm inclined towards having only one config, and beefing up the documentation and comments on the existing one.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57001251 @andrewor14 I had added a few comments in code/docs yesterday. Not sure if you got a chance to take a look. If there is anything specific (in terms of comments) you'd

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57001360 Hey @nishkamravi2 yes I just looked at the latest changes and the existing comments are good. Any other comments by others?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57001461 retest this please

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread brndnmtthws
Github user brndnmtthws commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57001671 Can you refactor this to be non-YARN specific? It would be good to share code between this and #2401.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57001868 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20866/consoleFull) for PR 2485 at commit

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread brndnmtthws
Github user brndnmtthws commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57001899 In particular, look at how I put the logic into a common function, `calculateTotalMemory`.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57002615 Not sure what you mean by non-yarn specific. The two code bases are quite different and memoryOverhead is too small of an intersection to try and unify them. I

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57002734 Calculate totalMemory can be differently defined for the two code paths. The overhead percentage will have to be different too. As long as they follow the same

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread brndnmtthws
Github user brndnmtthws commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57004118 Why can't they both share the same config parameters, for example? I understand the implementation differences, but we shouldn't need to have distinct config

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57004722 For one, it would mean a change in the UI, which breaks existing deployments, and there should be a compelling reason to do so.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread brndnmtthws
Github user brndnmtthws commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57004940 So I guess there's nothing to do.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57005579 I think PR #2401 can be modeled after this one. Instead of defining overhead as a percentage, it could (and probably should) be defined as an absolute value. Also,

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread brndnmtthws
Github user brndnmtthws commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57006489 Naturally you wouldn't want to have to change yours. I'll drop the `.minimum` thing, and prefix the config params with `.mesos`, like you've done for yarn.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57007876 Hey I just talked to @pwendell about this. I think it's better for us to have a yarn config and a mesos config, but not generalize this to use a common

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread brndnmtthws
Github user brndnmtthws commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57008082 That's fair. I'm updating the PR to make that Mesos specific now.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57008999 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20866/consoleFull) for PR 2485 at commit

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57009006 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57009912 retest this please

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57010357 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20870/consoleFull) for PR 2485 at commit

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57017843 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20870/consoleFull) for PR 2485 at commit

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57017849 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57018879 Need some help interpreting the test results. Not clear which one is failing.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57029756 It's the python ones. This is unlikely to be related to your patch. Let's retest this please.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57030100 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20887/consoleFull) for PR 2485 at commit

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57034392 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20887/consoleFull) for PR 2485 at commit

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-57034396 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56816028 Jenkins, retest this please

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56818674 @JoshRosen would you mind kicking Jenkins again now that it's upmerged?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56843859 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/153/consoleFull) for PR 2485 at commit

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56853597 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/153/consoleFull) for PR 2485 at commit

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2485#discussion_r18049289 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -64,14 +64,18 @@ private[spark] trait ClientBase extends Logging

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2485#discussion_r18049314 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -64,14 +64,18 @@ private[spark] trait ClientBase extends Logging

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2485#discussion_r18049333 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala --- @@ -64,14 +64,18 @@ private[spark] trait ClientBase extends Logging

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2485#discussion_r18049458 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala --- @@ -39,15 +39,19 @@ private[spark] class

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2485#discussion_r18049502 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala --- @@ -117,9 +118,10 @@ private[yarn] abstract class YarnAllocator(

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56863597 Have we ever come to a consensus on whether 0.07 is an appropriate default? Under the current settings this means anything above ~5.5G of executor / driver memory
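A minimal sketch of the arithmetic behind this question, not the code from the PR. The 0.07 factor comes from the thread; the 384 MB floor is an assumption, chosen because it is consistent with the ~5.5G figure quoted above (384 / 0.07 is roughly 5486 MB). The function names are hypothetical.

```python
import math

# 0.07 is the factor discussed in this thread.
MEMORY_OVERHEAD_FACTOR = 0.07
# Assumed floor in MB; not stated in the thread, but it matches the ~5.5G figure.
MEMORY_OVERHEAD_MIN_MB = 384


def default_overhead_mb(memory_mb: int) -> int:
    """Overhead as a fraction of memory, never below the assumed floor."""
    return max(int(MEMORY_OVERHEAD_FACTOR * memory_mb), MEMORY_OVERHEAD_MIN_MB)


def break_even_mb() -> int:
    """Smallest memory size (MB) at which the multiplier reaches the floor."""
    return math.ceil(MEMORY_OVERHEAD_MIN_MB / MEMORY_OVERHEAD_FACTOR)


if __name__ == "__main__":
    for mb in (1024, 4096, 8192, 16384):
        print(mb, default_overhead_mb(mb))
    print("break-even:", break_even_mb())
```

Under these assumptions, small containers keep the fixed floor and only containers past the break-even point see the overhead grow with their memory.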

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56865310 @nishkamravi2 arrived at this through experimentation. He had a few details on his experiments on the previous incarnation of this PR #1391 . If anything, I think 0.07 is

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56875798 Updated as per @andrewor14 's comments.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56876371 As Sandy points out, 7% is on the conservative side, in the interest of minimizing memory waste while covering the common cases (as per our experiments). Anything

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2485#discussion_r18057379 --- Diff: out --- @@ -0,0 +1 @@ +Already up-to-date. --- End diff -- Wait, can you delete this file?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56878538 I see. Maybe it makes sense to at least add a comment (in the documentation and in the code) to explain how we arrived at these numbers. This config can't be

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56887022 @andrewor14 Which config are we referring to? spark.yarn.*.memoryOverhead is configurable.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56889393 The static value (MB) is configurable, but the user can't specify 15% or 20% and is stuck with 7%. This is probably fine.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56891420 I see. Yeah, we have the choice to expose the config parameter relative to the container size or as an absolute value. Since memoryOverhead as a function of the

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56892181 We could also expose both and make memoryOverhead override the other. I think this could be reasonable because the scale is most likely to be set in spark-defaults.conf
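The precedence proposed here can be sketched as follows. This illustrates the suggestion only, not what was merged; other reviewers in the thread leaned toward keeping a single config. The key `spark.yarn.executor.memoryOverhead` follows the `spark.yarn.*.memoryOverhead` pattern named in the thread, while `spark.yarn.executor.memoryOverheadFactor`, the 384 MB floor, and the function name are hypothetical.

```python
# Hypothetical key for a user-settable multiplier; not a confirmed Spark setting.
FACTOR_KEY = "spark.yarn.executor.memoryOverheadFactor"
# Absolute overhead in MB, following the spark.yarn.*.memoryOverhead pattern.
ABSOLUTE_KEY = "spark.yarn.executor.memoryOverhead"

DEFAULT_FACTOR = 0.07  # factor discussed in the thread
ASSUMED_MIN_MB = 384   # assumed floor, not stated in the thread


def resolve_overhead_mb(conf: dict, executor_memory_mb: int) -> int:
    """An explicit absolute overhead wins; otherwise scale by the factor."""
    explicit = conf.get(ABSOLUTE_KEY)
    if explicit is not None:
        return int(explicit)
    factor = float(conf.get(FACTOR_KEY, DEFAULT_FACTOR))
    return max(int(factor * executor_memory_mb), ASSUMED_MIN_MB)


if __name__ == "__main__":
    print(resolve_overhead_mb({}, 8192))
    print(resolve_overhead_mb({FACTOR_KEY: "0.15"}, 8192))
    print(resolve_overhead_mb({ABSOLUTE_KEY: "1024", FACTOR_KEY: "0.15"}, 8192))
```

In this sketch a cluster-wide factor could live in spark-defaults.conf while an individual job overrides it with an absolute value.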

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-25 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56893439 Not a bad suggestion Sandy, but I would be wary of the potential confusion it may create. Ideally this parameter should not be exposed as a config parameter at all

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-24 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56706929 @pwendell @mateiz @andrewor14 can any of you kick jenkins?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-24 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56707385 I just kicked it from the `spark-prs` parameterized build trigger; let's wait and see if it starts...

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56707584 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/146/consoleFull) for PR 2485 at commit

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-24 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56707989 ah sorry, looks like something conflicts now and it needs to be upmerged. @nishkamravi2 can you please upmerge

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-24 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56720302 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/146/consoleFull) for PR 2485 at commit

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-24 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56762805 Updated and merged.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-24 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56763002 ClientBase changes are now distributed over ClientBase and ClientArguments.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-23 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56480298 This looks good to me.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-23 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56526845 Jenkins, retest this please.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-23 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56522491 Jenkins, test this please

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-23 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56532051 @JoshRosen Any idea why Jenkins isn't running on this? Could you kick it manually?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread nishkamravi2
GitHub user nishkamravi2 opened a pull request: https://github.com/apache/spark/pull/2485 Modify default YARN memory_overhead-- from an additive constant to a multiplier Redone against the recent master branch (https://github.com/apache/spark/pull/1391)

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56334768 Can one of the admins verify this patch?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-56334843 Have redone the PR against the recent master branch, which has undergone significant structural changes for Yarn. Addressed review comments and changed the

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-56342496 If #2485 is the replacement, can we close this one out?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread sryza
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/2485#discussion_r17837060 --- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala --- @@ -117,9 +118,10 @@ private[yarn] abstract class YarnAllocator(

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56343225 It would also be nice to log what it is when we fail to get a container large enough or when it fails because the cluster max allocation limit was hit. @tgravescs I

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56345871 Updated as per @sryza 's comments

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-56346539 Shall we let this linger on for just a bit until the other one gets merged?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread nishkamravi2
Github user nishkamravi2 closed the pull request at: https://github.com/apache/spark/pull/1391

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-56347057 Noticed that we have a reference to this one in 2485, closing it out.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56371027 yes it would be nice to tell the user what the overhead limit is calculated to be as I might not realize there is overhead and that it's dependent upon the multiplier.
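One way to surface the calculated overhead to the user, sketched under assumptions. The function name, message wording, and the decision to raise on an oversized request are all hypothetical and do not reflect the log output added in the PR.

```python
def describe_container_request(
    executor_memory_mb: int, overhead_mb: int, max_allocation_mb: int
) -> str:
    """Return a log-friendly summary, or raise if the request cannot fit."""
    total_mb = executor_memory_mb + overhead_mb
    summary = (
        f"Will request containers with {total_mb} MB "
        f"({executor_memory_mb} MB executor memory + {overhead_mb} MB overhead)"
    )
    if total_mb > max_allocation_mb:
        # Spell out the overhead so the user can see why the request is too large.
        raise ValueError(
            f"{summary}, which exceeds the cluster max allocation of "
            f"{max_allocation_mb} MB"
        )
    return summary


if __name__ == "__main__":
    print(describe_container_request(8192, 573, 16384))
```

Including both terms in the message makes it visible that the container size is larger than the memory the user asked for.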

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-22 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/2485#issuecomment-56457534 Updated as per @tgravescs 's comments

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-19 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-56142506 @sryza Thanks Sandy. Will do.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-19 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-56174217 @mridulm any comments? I'm ok with it if it's a consistent problem for users. One thing we definitely need to do is document it and possibly look at including

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-18 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1391#discussion_r17762675 --- Diff: yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala --- @@ -92,7 +92,7 @@ private[yarn] class

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-18 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-56119506 What is the current state of this PR? @tgravescs @mridulm any more thoughts about the current approach? This is a related PR for mesos and I'm wondering if we can use

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-18 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-56120931 Updated as per @andrewor14 's comments

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-18 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-56132524 @nishkamravi2 mind resolving the merge conflicts?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-18 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-56132497 These changes look good to me. This addresses what continues to be the #1 issue that we see in Cloudera customer YARN deployments. It's worth considering boosting this

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-09-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-54694595 Can one of the admins verify this patch?

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-18 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-49480064 I'll let mridul comment on this but I think adding a comment where 0.06 came from would be useful.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-18 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-49483642 6% was experimentally obtained (with the goal of keeping the bound as tight as possible without the containers crashing). Three workloads were experimented with:

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-17 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-49348179 Bringing the discussion back online. Thanks for all the input so far. Ran a few experiments yesterday and today. Number of executors (which was the other main

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835312 We have gone over this in the past .. it is suboptimal to make it a linear function of executor/driver memory. Overhead is a function of number of executors,

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835447 That makes sense, but then it doesn't explain why a constant amount works for a given job when executor memory is low, and then doesn't work when it is high. This has

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835560 Sean, the memory_overhead is fairly substantial. More than 2GB for a 30GB executor. Less than 400MB for a 2GB executor.

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835566 The default constant is actually a lower bound to account for other overheads (since YARN will aggressively kill tasks)... Unfortunately we have not sized this

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835618 That would be a function of your jobs. Other apps would have drastically different characteristics ... Which is why we can't generalize to a simple fraction of

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835656 The basic issue is you are trying to model overhead using the wrong variable... It has no correlation with executor memory actually (other than vm overheads as heap

Re: [GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread Mridul Muralidharan
You are lucky :-) for some of our jobs, in an 8gb container, overhead is 1.8gb! On 13-Jul-2014 2:40 pm, nishkamravi2 g...@git.apache.org wrote: Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835560 Sean, the

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835727 Yes of course, lots of settings' best or even usable values are ultimately app-specific. Ideally, defaults work for lots of cases. A flat value is the simplest of models,

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835769 You are lucky :-) for some of our jobs, in an 8gb container, overhead is 1.8gb! On 13-Jul-2014 2:41 pm, nishkamravi2 notificati...@github.com wrote:

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835852 Experimented with three different workloads and noticed common patterns of proportionality. Other parameters were left unchanged and only executor size was

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-13 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-48835881 That's why the parameter is configurable. If you have jobs that cause 20-25% memory_overhead, default values will not help.
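
For readers following the arithmetic in this thread, here is a rough sketch of what a proportional default would yield. It is not the code from the pull request. The 0.06 fraction is the value discussed above; the 384 MB floor, the use of max() to combine the two, and the name overhead_mb are assumptions made only for illustration.

```python
def overhead_mb(executor_memory_mb, fraction=0.06, floor_mb=384):
    """Illustrative default overhead: a fraction of executor memory, with a floor.

    fraction=0.06 comes from the thread; floor_mb=384 and the max() rule
    are assumptions for this sketch, not taken from the patch itself.
    """
    proportional = int(executor_memory_mb * fraction)  # truncate to whole MB
    return max(floor_mb, proportional)


if __name__ == "__main__":
    # Executor sizes mentioned in the thread: 2 GB, 8 GB and 30 GB.
    for size_gb in (2, 8, 30):
        print(size_gb, "GB executor:", overhead_mb(size_gb * 1024), "MB overhead")
```

Under these assumptions an 8 GB executor would get roughly 0.5 GB of overhead, well short of the 1.8 GB reported for some jobs above, which is consistent with the point that the value has to stay configurable.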
