[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-14 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r133115762 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,593 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-14 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r133120700 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,593 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-14 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r133120739 --- Diff: mllib/src/test/scala/org/apache/spark/ml/stat/SummarizerSuite.scala --- @@ -0,0 +1,619 @@ +/* + * Licensed to the Apache Software Found

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-14 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/18798 @WeichenXu123 I left some minor comments, otherwise, LGTM. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project

[GitHub] spark issue #18798: [SPARK-19634][ML] Multivariate summarizer - dataframes A...

2017-08-14 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/18798 @yanboliang I will update ASAP, thanks!

[GitHub] spark pull request #18926: [SPARK-21712] [PySpark] Clarify type error for Co...

2017-08-14 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18926#discussion_r133121408 --- Diff: python/pyspark/sql/column.py --- @@ -406,7 +406,13 @@ def substr(self, startPos, length): [Row(col=u'Ali'), Row(col=u'Bob')]

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-14 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/18902 @hhbyyh Good Idea! We can also use this trick to compute median, because method `multipleApproxQuantiles`[https://github.com/apache/spark/blob/0e80ecae300f3e2033419b2d98da8bf092c105bb/sql/core/

[GitHub] spark issue #18938: [SPARK-21363][SQL] Prevent name duplication in (global/l...

2017-08-14 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18938 We still support this for 3.x releases? If so, I'll close jira as "Won't fix".

[GitHub] spark issue #18926: [SPARK-21712] [PySpark] Clarify type error for Column.su...

2017-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18926 It sounds like the comment hides. Could you address the comment https://github.com/apache/spark/pull/18926#discussion_r133121408?

[GitHub] spark pull request #18798: [SPARK-19634][ML] Multivariate summarizer - dataf...

2017-08-14 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/18798#discussion_r133121659 --- Diff: mllib/src/main/scala/org/apache/spark/ml/stat/Summarizer.scala --- @@ -0,0 +1,593 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] spark issue #18853: [SPARK-21646][SQL] BinaryComparison shouldn't auto cast ...

2017-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18853 Currently, the type casting has a few issues when types are different. So far, we do not have any good option to resolve all the issues. Thus, we are hesitant to introduce any behavior change unl

[GitHub] spark issue #18938: [SPARK-21363][SQL] Prevent name duplication in (global/l...

2017-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18938 Not sure. You can bring it up when we decide to jump to 3.x. Thanks!

[GitHub] spark issue #18938: [SPARK-21363][SQL] Prevent name duplication in (global/l...

2017-08-14 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18938 ok, I'll close for now. If you get time, could you set 3.0 at the jira target? Thx!

[GitHub] spark pull request #18938: [SPARK-21363][SQL] Prevent name duplication in (g...

2017-08-14 Thread maropu
Github user maropu closed the pull request at: https://github.com/apache/spark/pull/18938

[GitHub] spark issue #18942: [BACKPORT-2.1][SPARK-19372][SQL] Fix throwing a Java exc...

2017-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18942 This might be too risky to be merged to 2.1.1.

[GitHub] spark pull request #18926: [SPARK-21712] [PySpark] Clarify type error for Co...

2017-08-14 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18926#discussion_r133122900 --- Diff: python/pyspark/sql/column.py --- @@ -406,7 +406,13 @@ def substr(self, startPos, length): [Row(col=u'Ali'), Row(col=u'Bob')]

[GitHub] spark issue #18946: [SPARK-19471][SQL]AggregationIterator does not initializ...

2017-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18946 **[Test build #80659 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80659/testReport)** for PR 18946 at commit [`9f2ec8f`](https://github.com/apache/spark/commit/9f

[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18931 **[Test build #80654 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80654/testReport)** for PR 18931 at commit [`c04da15`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18931 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80654/ Test PASSed.

[GitHub] spark issue #18931: [SPARK-21717][SQL][WIP] Decouple consume functions of ph...

2017-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18931 Merged build finished. Test PASSed.

[GitHub] spark issue #18926: [SPARK-21712] [PySpark] Clarify type error for Column.su...

2017-08-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18926 For ^, I want to make this separate if possible. Do you guys strongly feel about supporting `long` (and namely "mixed" types) here - @gatorsmile and @ueshin?

[GitHub] spark issue #18934: [SPARK-21721][SQL] Clear FileSystem deleteOnExit cache w...

2017-08-14 Thread yzheng616
Github user yzheng616 commented on the issue: https://github.com/apache/spark/pull/18934 Please try to fix it in 2.1 too. We have a product running on this version Spark. Thanks a lot!

[GitHub] spark issue #18926: [SPARK-21712] [PySpark] Clarify type error for Column.su...

2017-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18926 Even if we plan to drop `long` in this PR, [the checking](https://github.com/nchammas/spark/blob/fc1d84f002f5bd66bcad038a5581a05ade8dbc35/python/pyspark/sql/column.py#L408) looks weird to me. Bas

[GitHub] spark issue #18944: [SPARK-21732][SQL]Lazily init hive metastore client

2017-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18944 **[Test build #80655 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80655/testReport)** for PR 18944 at commit [`9eb9149`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #18944: [SPARK-21732][SQL]Lazily init hive metastore client

2017-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18944 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80655/ Test PASSed.

[GitHub] spark issue #18944: [SPARK-21732][SQL]Lazily init hive metastore client

2017-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18944 Merged build finished. Test PASSed.

[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer

2017-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18902 **[Test build #80660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80660/testReport)** for PR 18902 at commit [`fd1eb43`](https://github.com/apache/spark/commit/fd

[GitHub] spark issue #18939: [SPARK-21724][SQL][DOC] Adds since information in the do...

2017-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18939 LGTM

[GitHub] spark issue #18939: [SPARK-21724][SQL][DOC] Adds since information in the do...

2017-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18939 Thanks! Merging to master.

[GitHub] spark pull request #18939: [SPARK-21724][SQL][DOC] Adds since information in...

2017-08-14 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18939

[GitHub] spark issue #18944: [SPARK-21732][SQL]Lazily init hive metastore client

2017-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18944 Thanks! Merging to master.

[GitHub] spark pull request #18944: [SPARK-21732][SQL]Lazily init hive metastore clie...

2017-08-14 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18944

[GitHub] spark issue #18934: [SPARK-21721][SQL] Clear FileSystem deleteOnExit cache w...

2017-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18934 cc @viirya

[GitHub] spark issue #18315: [SPARK-21108] [ML] [WIP] convert LinearSVC to aggregator...

2017-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18315 **[Test build #80661 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80661/testReport)** for PR 18315 at commit [`94e0250`](https://github.com/apache/spark/commit/94
