Github user shivaram commented on the issue:
https://github.com/apache/spark/pull/16739
Agree with @jkbradley on this one. We should avoid adding functions that
are completely new in a patch release, given that the time between minor
versions and patch releases isn't that long. As w
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16739
I've commented elsewhere, but wanted to here just to make more people
aware: Let's refrain from backporting new APIs into patch versions unless they
are really critical. We do not do this elsewhe
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/16739
Thank YOU, always! :)
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wis
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16739
@dongjoon-hyun my apologies, thanks for bringing this to my attention. I
had to hand merge and didn't realize the mismatch. Opened a new PR to fix that.
---
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/16739
Hi, @felixcheung .
While backporting,
https://github.com/apache/spark/commit/6c35399068f1035fec6d5f909a83a5b1683702e0#diff-3d2a6b9d2b7d84ae179d7ea0f9eca696R1232
seems to break the build of
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16739
merged to master and branch-2.1
@gatorsmile thanks - please feel free to update or remove unneeded test
cases.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72929/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72929 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72929/testReport)**
for PR 16739 at commit
[`bf2373f`](https://github.com/apache/spark/commit/b
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72929 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72929/testReport)**
for PR 16739 at commit
[`bf2373f`](https://github.com/apache/spark/commit/bf
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16739
Jenkins, retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72925/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72925 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72925/testReport)**
for PR 16739 at commit
[`bf2373f`](https://github.com/apache/spark/commit/bf
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/16739
The issue is fixed in https://github.com/apache/spark/pull/16933. If this
is merged first, I will fix the test case in this PR. Thanks! :)
---
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16739
great, looking forward to that.
I'm going to merge this unless anyone has a concern?
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/16739
Let me rewrite the test cases in Scala.
```Scala
val df = spark.range(0, 1, 1, 5)
assert(df.rdd.getNumPartitions == 5)
assert(df.coalesce(3).rdd.getNumPartition
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72790/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72790 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72790/testReport)**
for PR 16739 at commit
[`55b99df`](https://github.com/apache/spark/commit/5
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72791/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72791 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72791/testReport)**
for PR 16739 at commit
[`a0fe134`](https://github.com/apache/spark/commit/a
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72791 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72791/testReport)**
for PR 16739 at commit
[`a0fe134`](https://github.com/apache/spark/commit/a0
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72790 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72790/testReport)**
for PR 16739 at commit
[`55b99df`](https://github.com/apache/spark/commit/55
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16739
hmm, not as far as I can see:
```
> df2 <- repartition(df1, 10)
> getNumPartitions(df2) # right after repartition the number of partition is greater than the original numSlices
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/16739
: ) This might be caused by the optimizer rule `CollapseRepartition`. Can
you output the plan by `explain(true)`?
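
For readers following along, here is a minimal Scala sketch of how the plan could be inspected. It assumes a throwaway local `SparkSession` (the `master` and `appName` values are placeholders, not taken from this thread) and chains `repartition` and `coalesce` so the optimizer has adjacent repartition operators to consider. The exact plan text printed by `explain(true)` varies by Spark version.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a throwaway local session; master and appName are placeholders.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("collapse-repartition-check")
  .getOrCreate()

// Five input partitions, mirroring the Scala test case quoted in this thread.
val df = spark.range(0, 100, 1, 5)

// Chain a repartition and a coalesce so the optimizer sees two adjacent
// repartition operators.
val chained = df.repartition(10).coalesce(3)

// Prints the parsed, analyzed, and optimized logical plans plus the physical plan.
chained.explain(true)

// The partition count actually produced once the plan is executed.
val observed = chained.rdd.getNumPartitions
println(s"partitions after repartition(10).coalesce(3): $observed")

spark.stop()
```

In `spark-shell`, where `spark` is already defined, the builder lines and the final `spark.stop()` can be dropped.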
---
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16739
@gatorsmile thanks for commenting. `coalesce` currently accepts a number
even if it is larger than the current number of partitions - I guess we didn't
want to throw an exception in that case?
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/16739
`coalesce` is used to decrease the number of partitions in the RDD, but
when you are setting it to a number that is larger than the number of the
current RDD partitions, the result is not predica
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16739
yap, https://github.com/apache/spark/pull/16739#issuecomment-276739220 -
only RDD has `coalesce(.. shuffle)`, in Dataset, it's `coalesce` and
`repartition`
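
To make the API distinction concrete, here is a short Scala sketch, again assuming a local `SparkSession` whose settings are placeholders. It only illustrates the point made above: on an RDD, `coalesce` takes an optional `shuffle` flag, while on a Dataset there is no such flag and `repartition` is the shuffling counterpart. The variable names are illustrative only.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a throwaway local session; master and appName are placeholders.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("coalesce-vs-repartition")
  .getOrCreate()

// A Dataset with five partitions.
val df = spark.range(0, 100, 1, 5)

// Dataset API: coalesce has no shuffle parameter; repartition always shuffles.
val narrowed = df.coalesce(3).rdd.getNumPartitions
val widened = df.repartition(10).rdd.getNumPartitions

// RDD API: coalesce accepts a shuffle flag (default false).
val rdd = df.rdd
val rddNoShuffle = rdd.coalesce(10).getNumPartitions
val rddShuffle = rdd.coalesce(10, shuffle = true).getNumPartitions

println(s"Dataset coalesce(3): $narrowed, Dataset repartition(10): $widened")
println(s"RDD coalesce(10): $rddNoShuffle, RDD coalesce(10, shuffle = true): $rddShuffle")

spark.stop()
```

Without a shuffle, an RDD `coalesce` cannot increase the partition count, which is why asking for more partitions needs either `shuffle = true` on the RDD or `repartition` on the Dataset.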
---
Github user holdenk commented on the issue:
https://github.com/apache/spark/pull/16739
@felixcheung I was referring to the ` * However, if you're doing a drastic
coalesce, e.g. to numPartitions = 1,
* this may result in your computation taking place on fewer nodes than
*
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16739
and actually I find the current behavior a bit hard to explain, could
someone perhaps enlighten me if this is intentional and how best, if we are to,
document this behavior?
```
df <- a
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16739
surely, i think you mean
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L428
we will need to update this to say `use repartition() if you want
Github user shivaram commented on the issue:
https://github.com/apache/spark/pull/16739
Thanks @felixcheung - I think these changes look good.
cc @gatorsmile / @holdenk for doc changes in SQL, Python
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72240/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72240 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72240/testReport)**
for PR 16739 at commit
[`3ed835a`](https://github.com/apache/spark/commit/3
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72240 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72240/testReport)**
for PR 16739 at commit
[`3ed835a`](https://github.com/apache/spark/commit/3e
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72232/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72232 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72232/testReport)**
for PR 16739 at commit
[`1bd7163`](https://github.com/apache/spark/commit/1
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72232 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72232/testReport)**
for PR 16739 at commit
[`1bd7163`](https://github.com/apache/spark/commit/1b
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72166/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72166 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72166/testReport)**
for PR 16739 at commit
[`938c2ce`](https://github.com/apache/spark/commit/9
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72166 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72166/testReport)**
for PR 16739 at commit
[`938c2ce`](https://github.com/apache/spark/commit/93
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72149 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72149/testReport)**
for PR 16739 at commit
[`50ab563`](https://github.com/apache/spark/commit/5
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72149/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72149 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72149/testReport)**
for PR 16739 at commit
[`50ab563`](https://github.com/apache/spark/commit/50
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/16739
Jenkins, retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16739
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72147/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16739
**[Test build #72147 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72147/testReport)**
for PR 16739 at commit
[`50ab563`](https://github.com/apache/spark/commit/50