[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18388 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18388 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79642/ Test PASSed.

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18388 **[Test build #79642 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79642/testReport)** for PR 18388 at commit [`3cc29a7`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18388 Merged build finished. Test PASSed.

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18388 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79640/ Test PASSed.

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18388 **[Test build #79640 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79640/testReport)** for PR 18388 at commit [`d7c71a4`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18388 **[Test build #79642 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79642/testReport)** for PR 18388 at commit [`3cc29a7`](https://github.com/apache/spark/commit/3c

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs Thanks a lot for helping with this PR! I changed this PR. In the current change: the shuffle server will track the number of chunks being transferred. Connection will be closed whe
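
The idea in this comment, a server-side count of chunks currently being transferred, can be illustrated with a minimal Python sketch. This is not the PR's code: `ChunkTracker`, `max_chunks_in_flight`, and the rule that a request arriving while the count is at the limit is refused (so the caller would close the connection) are all assumptions made for illustration. The exact closing condition is not stated here.

```python
import threading


class ChunkTracker:
    """Hypothetical server-wide counter of chunks currently being transferred."""

    def __init__(self, max_chunks_in_flight: int):
        # Assumed limit; the name and semantics are illustrative only.
        self.max_chunks_in_flight = max_chunks_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def chunk_requested(self) -> bool:
        """Register a new chunk transfer.

        Returns False when the assumed limit is already reached, in which
        case the caller would close the connection instead of serving it.
        """
        with self._lock:
            if self._in_flight >= self.max_chunks_in_flight:
                return False
            self._in_flight += 1
            return True

    def chunk_sent(self) -> None:
        """Mark one chunk transfer as finished."""
        with self._lock:
            if self._in_flight > 0:
                self._in_flight -= 1

    @property
    def in_flight(self) -> int:
        with self._lock:
            return self._in_flight


if __name__ == "__main__":
    tracker = ChunkTracker(max_chunks_in_flight=2)
    # Third request is refused because two chunks are already in flight.
    print([tracker.chunk_requested() for _ in range(3)])
```

A single shared counter keeps the check cheap and independent of how many reducers are connected, which matches the motivation given elsewhere in the thread for doing the control on the server side.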

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18388 **[Test build #79640 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79640/testReport)** for PR 18388 at commit [`d7c71a4`](https://github.com/apache/spark/commit/d7

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-14 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18388 I don't think we reject any requests at this point. So yes you could still run into an issue. Generally I think limiting the # of blocks you are fetching at once will solve this also, but it

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-13 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs Thanks a lot for the advice. > the flow control part should allow everyone to start fetching without rejecting a bunch, especially if the network can't push it out that fast anyway.

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18388 +1 on both. We have to do it on the server side, because there may be a lot of reducers fetching data from one shuffle service at the same time, which may OOM the server even if each reducer has

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-12 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18388 Ideally we do both. I think https://github.com/apache/spark/pull/18487 already will help you on the reducer side. It allows you to limit the # of blocks it's fetching in one call. So you

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 I think it could be more efficient to do the control on the shuffle service side.

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-12 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18388 Shall we control it on the reducer side, the shuffle service side, or both?

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-12 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 Previously I said that I have 200k+ connections to one shuffle service. I'm sorry about this; the information is wrong. It turns out that each of our `NodeManager`s has two auxiliary shuffle ser

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-11 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388
> Is this all a single application?
No, it's a data warehouse; there are thousands of ETLs.
> You say 6000 nodes with 64 executors on each host, how many cores per executor?
1

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-11 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18388 @jinxing64 I'm still a bit curious about a few of my previous questions to get exact usage here: Is this all a single application? You say 6000 nodes with 64 executors on each host, how many

[GitHub] spark issue #18388: [SPARK-21175][WIP] Reject OpenBlocks when memory shortag...

2017-07-10 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18388 @tgravescs Thanks a lot for your advice :) very helpful. I will try more on this.