Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18388
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
e
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18388
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79642/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18388
**[Test build #79642 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79642/testReport)**
for PR 18388 at commit
[`3cc29a7`](https://github.com/apache/spark/commit/3
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18388
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18388
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/79640/
Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18388
**[Test build #79640 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79640/testReport)**
for PR 18388 at commit
[`d7c71a4`](https://github.com/apache/spark/commit/d
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18388
**[Test build #79642 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79642/testReport)**
for PR 18388 at commit
[`3cc29a7`](https://github.com/apache/spark/commit/3c
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18388
@tgravescs
Thanks a lot for helping with this PR!
I have updated this PR. In the current change:
The shuffle server will track the number of chunks being transferred.
The connection will be closed whe
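A minimal sketch of the chunk tracking described in the comment above, written in Python rather than Spark's own Java/Scala for brevity. The comment is cut off, so the closing condition used here (more chunks in flight than a configured limit) is an assumption. `ChunkTracker`, `max_chunks_in_flight`, and the method names are hypothetical and are not Spark APIs.

```python
import threading


class ChunkTracker:
    """Hypothetical server-side counter of chunks currently being transferred."""

    def __init__(self, max_chunks_in_flight):
        # Assumed threshold; the original comment does not state the condition.
        self.max_chunks_in_flight = max_chunks_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def chunk_started(self):
        # Called when the server begins sending a chunk.
        with self._lock:
            self._in_flight += 1

    def chunk_finished(self):
        # Called when a chunk has been fully sent (or the send failed).
        with self._lock:
            self._in_flight -= 1

    def should_close_connection(self):
        # True once more chunks are in flight than the assumed limit allows.
        with self._lock:
            return self._in_flight > self.max_chunks_in_flight
```

A server handler could consult `should_close_connection()` before serving a new chunk request and drop the connection when it returns `True`.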
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18388
**[Test build #79640 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/79640/testReport)**
for PR 18388 at commit
[`d7c71a4`](https://github.com/apache/spark/commit/d7
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/18388
I don't think we reject any requests at this point. So yes, you could still
run into an issue. Generally I think limiting the # of blocks you are fetching
at once will also solve this, but it
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18388
@tgravescs
Thanks a lot for the advice.
> the flow control part should allow everyone to start fetching without
rejecting a bunch, especially if the network can't push it out that fast anyway.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18388
+1 on both.
We have to do it on the server side, because there may be a lot of reducers
fetching data from one shuffle service at the same time, which may OOM the
server even if each reducer has
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/18388
Ideally we do both.
I think https://github.com/apache/spark/pull/18487 will already help you on
the reducer side. It allows you to limit the # of blocks it's fetching in one
call. So you
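As a rough illustration of limiting the number of blocks fetched in one call on the reducer side, the sketch below splits a list of block IDs into requests of bounded size. It is written in Python for brevity; `split_into_requests` and `max_blocks_per_request` are hypothetical names, not the API introduced by the linked PR.

```python
def split_into_requests(block_ids, max_blocks_per_request):
    """Group block IDs so that no single fetch request exceeds the limit."""
    if max_blocks_per_request < 1:
        raise ValueError("max_blocks_per_request must be at least 1")
    return [
        block_ids[i:i + max_blocks_per_request]
        for i in range(0, len(block_ids), max_blocks_per_request)
    ]
```

Each inner list would then be sent as a separate fetch request, so the server never has to serve more than `max_blocks_per_request` blocks for one call.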
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18388
I think it could be more efficient to do the control on the shuffle service
side.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18388
Shall we control it on the reducer side, the shuffle service side, or both?
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18388
Previously I said that I had 200k+ connections to one shuffle
service. I'm sorry, that information was wrong. It turns out that
each of our `NodeManager`s has two auxiliary shuffle ser
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18388
>Is this all a single application?
No, it's a data warehouse; there are thousands of ETLs
>You say 6000 nodes with 64 executors on each host, how many cores per
executor?
1
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/18388
@jinxing64 I'm still a bit curious about a few of my previous questions, to
understand the exact usage here:
Is this all a single application? You say 6000 nodes with 64 executors on
each host, how many
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18388
@tgravescs
Thanks a lot for your advice :) very helpful.
I will try more on this.