Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19046
I'm going to close this; when I find some free time I might take a closer
look at the issue described in Wilfred's message.
---
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/19046
Yeah, if we are releasing and then reacquiring right away over and over
again, that would be bad, but I don't know when we would do that, so more
details would be great if possible.
---
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19046
I was told that the preemption issue was fixed in YARN (that's YARN-6210);
so I don't think there's a need for this code currently (just use a fixed YARN
if you want reliable preemption).
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/19046
It might help if you can give an exact scenario where you see the issue, and
perhaps configs if those matter. Meaning, do you have some set of minimum
containers, a small idle timeout, etc.?
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/19046
Unfortunately, that doesn't make clear to me what the cause is or what the
issue is on the YARN side.
I'm not sure what he means by "AM is releasing and then acquiring the
reservations again and
Github user Tagar commented on the issue:
https://github.com/apache/spark/pull/19046
@tgravescs, here's a quote from Wilfred Spiegelenburg; I hope it answers
both of your questions.
> The behaviour I discussed earlier around the Spark AM reservations is not
optimal. It turns
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/19046
@Tagar, can you be more specific about the problems you are seeing? How
does this affect preemption? Why don't you see the same issues on
MapReduce/Tez?
---
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19046
I'm in general not a fan of adding more and more config options; the end
result is that most people won't enable it until they run into problems, and to
me that's too late.
I'm still
Github user Tagar commented on the issue:
https://github.com/apache/spark/pull/19046
> This could just be an adhoc queue but the spark users would lose out to
tez/mapreduce users. I'm pretty positive this will hurt spark users on some of
our cluster so would want performance numbers
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19046
As I said, I do believe this is first and foremost a YARN problem and this
was just trying to make Spark not trigger that. The things you're worried about
can be adjusted (e.g. instead of using the
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/19046
> The MR AM does something similar.
Can you be more specific here? Which exact code are you referring to? The MR
AM does do some headroom calculation for things like slow start and
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19046
BTW, in spite of the above, if you really feel strongly about it I might
just drop this and tell the YARN folks to make their code scale better. But I
really don't see the downsides you seem worried
Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/19046
This has been a long time request from our YARN team; they've had users run
into issues with YARN and traced it back to Spark making a large number of
container requests. You can argue that it's an
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/19046
There are other options, like changing the default for max executors to
something reasonable.
I'm not sure if it's related or not, but we are also looking at adding a
config to limit # of
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/19046
I would like to clarify why we are doing this. The jira has some discussion
on it but I would like to know exactly what we are improving/fixing?
If we do this change, I definitely
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19046
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19046
**[Test build #81108 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81108/testReport)**
for PR 19046 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19046
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81108/
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19046
**[Test build #81108 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81108/testReport)**
for PR 19046 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19046
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19046
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81105/
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19046
**[Test build #81105 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81105/testReport)**
for PR 19046 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19046
**[Test build #81105 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81105/testReport)**
for PR 19046 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19046
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19046
**[Test build #81104 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81104/testReport)**
for PR 19046 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19046
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81104/
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19046
**[Test build #81104 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81104/testReport)**
for PR 19046 at commit