Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/14039
I haven't got anything more concrete to offer at this time than the
descriptions in the relevant JIRAs, but I do have this running in production
with 1.6, and it does work. Essentially, you
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14039
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14039
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61738/
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14039
**[Test build #61738 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61738/consoleFull)**
for PR 14039 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14039
@markhamstra Thanks for the comment. I think the reuse of fragments depends heavily
on users' queries, the Catalyst optimizer, cluster resources... Reusing
`ShuffledRowRDD` shuffle data in a single job
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14039
@srowen My understanding is that shuffle data in stages can be shared
within a job. However, once the job is finished, the current implementation cannot
reuse the shuffle data anymore. So, we can
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14039
**[Test build #61738 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61738/consoleFull)**
for PR 14039 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14039
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61717/
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14039
Merged build finished. Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14039
**[Test build #61717 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61717/consoleFull)**
for PR 14039 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14039
**[Test build #61717 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61717/consoleFull)**
for PR 14039 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14039
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61715/
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14039
Merged build finished. Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14039
**[Test build #61715 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61715/consoleFull)**
for PR 14039 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14039
**[Test build #61715 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61715/consoleFull)**
for PR 14039 at commit
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/14039
Actually, they can be reused -- not in Spark as distributed, but it is an
open question whether reusing shuffle files within Spark SQL is something that
we should be doing and want to support.
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14039
@srowen thanks for the comment. Yea, I noticed that and I'm fixing this to
remove only shuffle files generated by `ShuffleExchange`.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/14039
I don't think we do this in general. The shuffle files are supposed to
remain to potentially be reused if the stage needs to be re-executed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14039
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14039
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61702/
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14039
**[Test build #61702 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61702/consoleFull)**
for PR 14039 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14039
**[Test build #61702 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61702/consoleFull)**
for PR 14039 at commit