Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156266057
Sounds good.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
en
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156264807
No, `min` and `max` can be used on string and other types so should not
return `NaN`. We should follow the SQL standard here.
---
If your project is set up for it, yo
Github user JihongMA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156252110
@mengxr do we want to change the behavior for min, max as well?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub
Github user JihongMA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156249080
@mengxr sure, will take care mean via seperate PR.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If yo
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156247920
LGTM. Merged into master and branch-1.6. Thanks! Btw, there is a minor
style issue I marked inline. @JihongMA Could you submit another PR to change
the output of `mean([]
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/9380
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enab
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156234588
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/4
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156234586
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156234403
**[Test build #45755 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45755/consoleFull)**
for PR 9380 at commit
[`7a239ec`](https://git
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156201232
**[Test build #45755 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45755/consoleFull)**
for PR 9380 at commit
[`7a239ec`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156198432
Merged build started.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156198404
Merged build triggered.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not h
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156198135
test this please
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this featur
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156189548
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/4
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156189544
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156189442
**[Test build #45749 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45749/consoleFull)**
for PR 9380 at commit
[`7a239ec`](https://git
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156181656
**[Test build #45749 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45749/consoleFull)**
for PR 9380 at commit
[`7a239ec`](https://gith
Github user yu-iskw commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156181157
@JihongMA thanks for the update! Could you revert `Skewness.scala` and
`Kurtosis.scala`. Since I don't think the change relates to the issue. I know
this is a minor thin
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156178213
Merged build started.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156178175
Merged build triggered.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not h
Github user yu-iskw commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156177206
Jenkins, test this please.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user JihongMA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156171034
@AmplabJenkins please retest the change.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your projec
Github user felixcheung commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-156024999
@JihongMA yap that should fix them
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project do
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155988221
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user JihongMA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155988170
@felixcheung Thank you! this is the change I have made to make it pass
for R. I am not familiar with R .
df3 <- agg(gd, age = "stddev")
expect_is(df3, "
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155988223
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/4
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155987720
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155987721
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/4
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155986712
Merged build triggered.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not h
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155986721
Merged build started.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155986198
Merged build started.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155986161
Merged build triggered.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not h
Github user felixcheung commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155970380
These SparkR support has just been added so this change breaks tests
```
1. Failure (at test_sparkSQL.R#1010): group by, agg functions
--
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155956026
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155955969
**[Test build #45678 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45678/consoleFull)**
for PR 9380 at commit
[`dc0558b`](https://git
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155956027
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/4
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155927889
**[Test build #45678 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45678/consoleFull)**
for PR 9380 at commit
[`dc0558b`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155927086
Merged build triggered.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not h
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155927109
Merged build started.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155630541
@JihongMA Could you merge the current master? There are some merge
conflicts.
For `NaN` vs. `null`, we had some discussion in
https://issues.apache.org/jira/brow
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155228360
But eval only happens once per group.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155226869
The eval call is boxing which you aren't going to see without a groupby.
---
If your project is set up for it, you can reply to this email and have your
reply appear on
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-155224857
Which part is boxing the result? I tested the following on master with
changes from #9480:
~~~scala
val df = sqlContext.range(1)
df.select(var_sam
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154593369
Even though its simple, I think this implementation is boxing the result,
which could result in slower performance on real workloads (but is harder to
see in micro benc
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154579413
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154579243
**[Test build #45256 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45256/consoleFull)**
for PR 9380 at commit
[`b69d1e6`](https://git
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154546741
**[Test build #45256 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45256/consoleFull)**
for PR 9380 at commit
[`b69d1e6`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154544406
Merged build started.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154544379
Merged build triggered.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not h
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154509839
@rxin We still need to run some benchmark. The formulation is simple and
fixed, and hence codegen won't bring much performance gain.
@JihongMA Could you address t
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154209278
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154209280
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154209027
**[Test build #45135 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45135/consoleFull)**
for PR 9380 at commit
[`e3417aa`](https://git
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154168859
**[Test build #45135 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45135/consoleFull)**
for PR 9380 at commit
[`e3417aa`](https://gith
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154168774
BTW with https://github.com/apache/spark/pull/9480, we might not need to
replace it with imperative aggregate anymore.
---
If your project is set up for it, you can reply
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154163473
add to whitelist
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this featur
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154166042
+1 on @yu-iskw 's suggestion. Let's keep the changes in this PR minimal.
Just replace `StdDev` implementation by imperative agg.
---
If your project is set up for it, yo
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154165567
Merged build triggered.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not h
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154165662
Merged build started.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have
Github user yu-iskw commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154164156
@JihongMA I don't know if there are any strong reasons in terms of
catalyst. However, personally I think we should separate changing the return
type and `null` from the
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-154163504
ok to test
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enab
Github user JihongMA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-153847977
@mengxr rebased with the changes @rxin [SPARK-11490], stddev / variance
mapped to the corresponding sample stddev / variance. I checked Hive doesn't
support this mappi
Github user JihongMA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-153775399
@mengxr Please take another look.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does
Github user JihongMA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-153475180
getStatistics() will continue to return Double value for normal cases,
changing it to return null only for edge cases. is there a strong reason to
return Double.NaN? wh
Github user yu-iskw commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-153462689
@JihongMA I'm not sure about that. I don't think we should return `null`,
instead of `Double.NaN`. Why do we need to change the return type?
---
If your project is set
Github user JihongMA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-153453307
I propose to return null for all cases which currently Double.NaN is
returned. and change getStatistics() to return Any instead of Double.
---
If your project is set
Github user JihongMA commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-153451606
so for skewness and kurtosis in case of count =1, we want to return null
instead of 0. I can address it, but instead of returning Double.NaN, should we
return null f
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-153412815
@JihongMA Could you address @sethah 's comment and rebase master? There are
some merge conflicts.
---
If your project is set up for it, you can reply to this email and h
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/9380#discussion_r43772715
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -1135,7 +992,76 @@ abstract class CentralMome
Github user sethah commented on a diff in the pull request:
https://github.com/apache/spark/pull/9380#discussion_r43662570
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -1135,7 +992,76 @@ abstract class CentralMome
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9380#issuecomment-152594052
Can one of the admins verify this patch?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your pr
71 matches
Mail list logo