[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-20 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18487 @tgravescs Thanks for merging this. I have created a PR for 2.2 (https://github.com/apache/spark/pull/18691). I had to remove a couple of newer config entries which landed while resolving a

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-20 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18487 @cloud-fan replied to your comments. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-19 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18487 @rxin it's kind of a stability fix (it makes the shuffle service more stable), so I'm OK to backport if the conflict is small.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-19 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/18487 Hm, is this a bug fix? If not, we shouldn't cherry-pick it.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-19 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18487 The cherry-pick to 2.2 wasn't clean, so can you please put up a separate PR against branch 2.2?

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79760/

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Merged build finished. Test PASSed.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79760 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79760/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-19 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18487 +1, pending the Jenkins build. If there are no further comments, I'm going to commit this later today.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79760 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79760/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-19 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18487 retest this please

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-17 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18487 @jinxing64 We have the default set to the int max so that, by default, there is no performance penalty for users. We have done some testing, as Dhruve mentioned, but we don't regularly hit the issue.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-17 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18487 @jinxing64 I performed a few runs to see if we were observing any performance issues with the change. I ran a simple word count job over a random set of text (3 TB). I couldn't get hundreds of executors

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-14 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18487 With `maxReqsInFlight` and `maxBytesInFlight` alone, it is hard to control the number of blocks in a single request. When the number of maps is very high, this change can alleviate the pressure on the shuffle server. @dhruve
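
The point about block counts can be illustrated with a minimal sketch. It is not Spark's fetch code: `split_requests`, `target_request_bytes`, and `max_blocks_per_request` are hypothetical names chosen for illustration, and the sketch only models the per-request grouping described in the comment above, assuming requests are sized by bytes. It shows that a byte target alone lets thousands of tiny blocks land in one request, while an additional block cap bounds that number. The cap defaults to the int max, so by default nothing changes.

```python
from typing import List, Tuple

# Mirrors a 32-bit int max default, i.e. effectively "no limit" (assumption for this sketch).
INT_MAX = 2**31 - 1

# A block is modelled as (hypothetical block id, size in bytes).
Block = Tuple[str, int]


def split_requests(
    blocks: List[Block],
    target_request_bytes: int,
    max_blocks_per_request: int = INT_MAX,
) -> List[List[Block]]:
    """Group the blocks destined for one remote address into fetch requests.

    A request is closed as soon as it reaches the byte target or the block cap,
    whichever happens first.
    """
    requests: List[List[Block]] = []
    current: List[Block] = []
    current_bytes = 0
    for block in blocks:
        current.append(block)
        current_bytes += block[1]
        if (current_bytes >= target_request_bytes
                or len(current) >= max_blocks_per_request):
            requests.append(current)
            current, current_bytes = [], 0
    if current:
        # Flush the final, partially filled request.
        requests.append(current)
    return requests


if __name__ == "__main__":
    # 10,000 tiny map outputs of 1 KiB each, all served by the same address.
    blocks = [(f"block_{i}", 1024) for i in range(10_000)]
    uncapped = split_requests(blocks, target_request_bytes=10 * 1024 * 1024)
    capped = split_requests(blocks, 10 * 1024 * 1024, max_blocks_per_request=1000)
    print(len(uncapped), max(len(r) for r in uncapped))
    print(len(capped), max(len(r) for r in capped))
```

With only the byte target, all 10,000 blocks fit under 10 MiB and end up in a single request; with the cap, no request carries more than 1,000 blocks, at the cost of more round trips.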

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79614/

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Merged build finished. Test PASSed.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79614 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79614/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79614/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-11 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18487 @jiangxb1987 Thanks for the review. @cloud-fan #18388

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18487 Will this be covered by https://github.com/apache/spark/pull/18388 ? Another concern is how we expect users to tune this config. Can users just tune `spark.reducer.maxReqsInFlight`

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/18487 cc @cloud-fan

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79477/

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Merged build finished. Test PASSed.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79477 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79477/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79477 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79477/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18487 Jenkins, test this please

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Merged build finished. Test FAILed.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79469/

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79469 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79469/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18487 @jiangxb1987 I have resolved the merge conflicts and reworded the config to make it clearer.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79469/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-09 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/18487 This LGTM. @dhruve, could you rebase it onto the master branch, please?

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-07 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18487 @jiangxb1987 I have made the changes requested. Can you have a look? Thanks.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Merged build finished. Test PASSed.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79292/

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79292 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79292/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Merged build finished. Test PASSed.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79290/

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79290 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79290/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79292 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79292/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #79290 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79290/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-07-05 Thread dhruve
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/18487 @rxin @cloud-fan Can you review this PR?

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Merged build finished. Test PASSed.

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18487 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/78984/

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-06-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #78984 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78984/testReport)** for PR 18487 at commit

[GitHub] spark issue #18487: [SPARK-21243][Core] Limit no. of map outputs in a shuffl...

2017-06-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18487 **[Test build #78984 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/78984/testReport)** for PR 18487 at commit