Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18866
@cloud-fan
Thanks for the reply. It looks like #19001 continues this work and is more
comprehensive, so I will close this PR for now.
---
If your project is set up for it, you can reply to thi
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18866
Does this work with append? Even if you shuffle the data before writing, we
may still have multiple files for one bucket.
Is it possible to generalize this patch to the data source level? The cur
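A minimal sketch of the append concern raised above, in plain Python rather than Spark code. The `shuffled_write` helper, the modulo bucketing, and the file names are hypothetical and only model the idea: even if every write job shuffles so that it emits one file per bucket, a second append job adds another file to the same bucket.

```python
from collections import defaultdict


def shuffled_write(keys, num_buckets, job_id):
    """Toy model of one write job that shuffles by bucket id first.

    Returns one (hypothetical) file name per non-empty bucket.
    """
    buckets = sorted({k % num_buckets for k in keys})
    return {b: f"job{job_id}_bucket-{b}" for b in buckets}


# bucket id -> files currently present in the table
table = defaultdict(list)

# Two write jobs in append mode: files from the first job are kept.
for job_id, keys in enumerate([[0, 1, 2, 3], [4, 5, 6]]):
    for bucket, file_name in shuffled_write(keys, 4, job_id).items():
        table[bucket].append(file_name)

for bucket in sorted(table):
    print(bucket, table[bucket])
```

In this toy model, buckets 0 to 2 end up with two files each after the second job, so a reader cannot assume one file per bucket unless appends are handled separately.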
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18866
@cloud-fan
Could you give some advice on this, so I know whether I'm heading in the right
direction? I can keep working on it :)
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18866
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80597/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18866
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18866
**[Test build #80597 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80597/testReport)**
for PR 18866 at commit
[`19f880b`](https://github.com/apache/spark/commit/1
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18866
**[Test build #80597 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80597/testReport)**
for PR 18866 at commit
[`19f880b`](https://github.com/apache/spark/commit/19
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/18866
The hash function is not the only issue. One important difference is that Hive
shuffles before writing and makes sure each bucket has only one file. Spark
doesn't shuffle, and each write task may write a
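The difference described above can be sketched in plain Python (this is not Spark or Hive code). The modulo `bucket_id`, the two writer functions, and the file names are all hypothetical stand-ins; real engines use their own hash functions and file naming.

```python
from collections import defaultdict

NUM_BUCKETS = 4


def bucket_id(key, num_buckets=NUM_BUCKETS):
    # Toy hash for illustration only.
    return key % num_buckets


def write_without_shuffle(partitions):
    """Each write task emits one file per bucket it happens to see."""
    files = defaultdict(list)  # bucket id -> file names
    for task_id, rows in enumerate(partitions):
        for b in sorted({bucket_id(k) for k in rows}):
            files[b].append(f"part-{task_id:05d}_bucket-{b}")
    return dict(files)


def write_with_shuffle(partitions):
    """Rows are regrouped by bucket id first, so one task owns each bucket."""
    by_bucket = defaultdict(list)
    for rows in partitions:
        for k in rows:
            by_bucket[bucket_id(k)].append(k)
    return {b: [f"part-{b:05d}_bucket-{b}"] for b in sorted(by_bucket)}


# Three input partitions, i.e. three write tasks when no shuffle happens.
partitions = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
no_shuffle = write_without_shuffle(partitions)
shuffled = write_with_shuffle(partitions)

print({b: len(f) for b, f in no_shuffle.items()})
print({b: len(f) for b, f in shuffled.items()})
```

Without the shuffle, bucket 0 receives a file from each of the three tasks in this example; with the shuffle, every bucket has exactly one file.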
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18866
cc @cloud-fan
Would you mind giving some comments? I can keep working on this :)
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18866
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18866
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80349/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18866
**[Test build #80349 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80349/testReport)**
for PR 18866 at commit
[`9765a48`](https://github.com/apache/spark/commit/9
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18866
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80348/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18866
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18866
**[Test build #80348 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80348/testReport)**
for PR 18866 at commit
[`8e4a9ea`](https://github.com/apache/spark/commit/8
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18866
**[Test build #80349 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80349/testReport)**
for PR 18866 at commit
[`9765a48`](https://github.com/apache/spark/commit/97
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18866
@viirya
Please take another look when you have time. I've already updated the PR :)
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18866
**[Test build #80348 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80348/testReport)**
for PR 18866 at commit
[`8e4a9ea`](https://github.com/apache/spark/commit/8e
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18866
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18866
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80341/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18866
**[Test build #80341 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80341/testReport)**
for PR 18866 at commit
[`51d2c11`](https://github.com/apache/spark/commit/5
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18866
**[Test build #80341 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80341/testReport)**
for PR 18866 at commit
[`51d2c11`](https://github.com/apache/spark/commit/51
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18866
Jenkins, retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18866
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80322/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18866
**[Test build #80322 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80322/testReport)**
for PR 18866 at commit
[`51d2c11`](https://github.com/apache/spark/commit/5
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18866
Merged build finished. Test FAILed.
---
Github user jinxing64 commented on the issue:
https://github.com/apache/spark/pull/18866
I added a unit test, referring to
(https://github.com/apache/hive/blob/branch-1/ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractBucketJoinProc.java#L393).
Hive will sort bucket files by f
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18866
**[Test build #80322 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80322/testReport)**
for PR 18866 at commit
[`51d2c11`](https://github.com/apache/spark/commit/51