[GitHub] spark issue #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to replace ...

2017-08-03 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18820 What if the field is not nullable? I did a test: ``` val rows = spark.sparkContext.parallelize(Seq( Row("Bravo", 28, 183.5), Row("Jessie", 18, 165.8))) val sche

[GitHub] spark pull request #18746: [ML][Python] UnaryTransformer in Python

2017-08-03 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18746#discussion_r131223795 --- Diff: python/pyspark/ml/base.py --- @@ -116,3 +121,53 @@ class Model(Transformer): """ __metaclass__ = ABCMeta + +

[GitHub] spark pull request #18746: [ML][Python] UnaryTransformer in Python

2017-08-03 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18746#discussion_r131222861 --- Diff: python/pyspark/ml/base.py --- @@ -116,3 +121,53 @@ class Model(Transformer): """ __metaclass__ = ABCMeta + +

[GitHub] spark pull request #18746: [ML][Python] UnaryTransformer in Python

2017-08-03 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18746#discussion_r131258120 --- Diff: python/pyspark/ml/tests.py --- @@ -1957,6 +1987,24 @@ def test_chisquaretest(self): self.assertTrue(all(field in fieldNames for fiel

[GitHub] spark pull request #18746: [ML][Python] UnaryTransformer in Python

2017-08-03 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18746#discussion_r13190 --- Diff: python/pyspark/ml/base.py --- @@ -116,3 +121,53 @@ class Model(Transformer): """ __metaclass__ = ABCMeta + +

[GitHub] spark pull request #18746: [ML][Python] UnaryTransformer in Python

2017-08-03 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18746#discussion_r13122 --- Diff: python/pyspark/ml/base.py --- @@ -116,3 +121,53 @@ class Model(Transformer): """ __metaclass__ = ABCMeta + +

[GitHub] spark pull request #18746: [ML][Python] UnaryTransformer in Python

2017-08-03 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18746#discussion_r131258476 --- Diff: python/pyspark/ml/tests.py --- @@ -1957,6 +1987,24 @@ def test_chisquaretest(self): self.assertTrue(all(field in fieldNames for fiel

[GitHub] spark pull request #18746: [ML][Python] UnaryTransformer in Python

2017-08-03 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18746#discussion_r131257864 --- Diff: python/pyspark/ml/tests.py --- @@ -1957,6 +1987,24 @@ def test_chisquaretest(self): self.assertTrue(all(field in fieldNames for fiel

[GitHub] spark issue #18831: [SPARK-21622][ML][SparkR] Support offset in SparkR GLM

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18831 **[Test build #80218 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80218/testReport)** for PR 18831 at commit [`dc8ccbc`](https://github.com/apache/spark/commit/dc

[GitHub] spark issue #18824: [SPARK-21617][SQL] Store correct metadata in Hive for al...

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18824 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #18824: [SPARK-21617][SQL] Store correct metadata in Hive for al...

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18824 **[Test build #80217 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80217/testReport)** for PR 18824 at commit [`cc7cd95`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #18824: [SPARK-21617][SQL] Store correct metadata in Hive for al...

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18824 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80217/ Test FAILed.

[GitHub] spark pull request #18828: [SPARK-21619][SQL] Fail the execution of canonica...

2017-08-03 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/18828#discussion_r131250896 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala --- @@ -181,17 +181,38 @@ abstract class QueryPlan[PlanType <:

[GitHub] spark issue #18833: [SPARK-21625][SQL] sqrt(negative number) should be null.

2017-08-03 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/18833 @maropu that only works for literals. I am sort-of in favor of the Hive default; it seems kinda bad to bring down a job because of a negative value.
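
A minimal sketch of the trade-off raised in this comment, in plain Python rather than Spark or Hive code: one variant returns a null (`None`) for a negative input, the other raises and would abort the caller. The function names `sqrt_or_null` and `sqrt_strict` are hypothetical and only illustrate the two behaviours.

```python
import math
from typing import Optional


def sqrt_or_null(x: Optional[float]) -> Optional[float]:
    """Null-returning variant: a negative (or missing) input yields None."""
    if x is None or x < 0:
        return None
    return math.sqrt(x)


def sqrt_strict(x: float) -> float:
    """Failing variant: a negative input raises instead of returning a null."""
    if x < 0:
        raise ValueError(f"sqrt of negative number: {x}")
    return math.sqrt(x)


values = [4.0, -1.0, 9.0]
# The null-returning variant processes every row, including the negative one.
nullable_results = [sqrt_or_null(v) for v in values]
```

With the strict variant, the single negative entry in `values` would stop the whole loop, which is the behaviour the comment describes as bringing down a job.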

[GitHub] spark pull request #18836: Update SortMergeJoinExec.scala

2017-08-03 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/18836#discussion_r131239029 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala --- @@ -82,7 +82,7 @@ case class SortMergeJoinExec(

[GitHub] spark issue #18833: [SPARK-21625][SQL] sqrt(negative number) should be null.

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18833 Merged build finished. Test PASSed.

[GitHub] spark issue #18833: [SPARK-21625][SQL] sqrt(negative number) should be null.

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18833 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80214/ Test PASSed.

[GitHub] spark issue #18833: [SPARK-21625][SQL] sqrt(negative number) should be null.

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18833 **[Test build #80214 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80214/testReport)** for PR 18833 at commit [`3e9ec8c`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to replace ...

2017-08-03 Thread bravo-zhang
Github user bravo-zhang commented on the issue: https://github.com/apache/spark/pull/18820 Hey @nchammas I made the logic much simpler.

[GitHub] spark issue #18824: [SPARK-21617][SQL] Store correct metadata in Hive for al...

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18824 **[Test build #80217 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80217/testReport)** for PR 18824 at commit [`cc7cd95`](https://github.com/apache/spark/commit/cc

[GitHub] spark issue #18824: [SPARK-21617][SQL] Store correct metadata in Hive for al...

2017-08-03 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18824 I reworked the patch to try to merge the "create table" and "alter table" paths, so they both do the translation the same way. There are still some test failures but I wanted to get this up h

[GitHub] spark pull request #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Tim...

2017-08-03 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18664#discussion_r131227296 --- Diff: python/pyspark/sql/tests.py --- @@ -3036,6 +3052,9 @@ def test_toPandas_arrow_toggle(self): pdf = df.toPandas() self.s

[GitHub] spark issue #18790: [SPARK-21587][SS] Added pushdown through watermarks.

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18790 Merged build finished. Test FAILed.

[GitHub] spark issue #18790: [SPARK-21587][SS] Added pushdown through watermarks.

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18790 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80215/ Test FAILed.

[GitHub] spark issue #18790: [SPARK-21587][SS] Added pushdown through watermarks.

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18790 **[Test build #80215 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80215/testReport)** for PR 18790 at commit [`8c73117`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #18797: [SPARK-21523][ML] update breeze to 0.13.2 for an emergen...

2017-08-03 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18797 Yeah, the only issue is that the test set is generated and used in several tests. Maybe we can just see if changing it works for all callers.

[GitHub] spark issue #18797: [SPARK-21523][ML] update breeze to 0.13.2 for an emergen...

2017-08-03 Thread BenFradet
Github user BenFradet commented on the issue: https://github.com/apache/spark/pull/18797 @srowen there shouldn't be any issue with removing the first row of the test data afaict.

[GitHub] spark issue #18836: Update SortMergeJoinExec.scala

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18836 @BoleynSu Sure, I can do it. We will give all the credit to you. Please continue to help us by reporting new issues or fixes. Thanks!

[GitHub] spark issue #18836: Update SortMergeJoinExec.scala

2017-08-03 Thread BoleynSu
Github user BoleynSu commented on the issue: https://github.com/apache/spark/pull/18836 @gatorsmile I am not familiar with the PR process, so it is great that you can take it over. Thanks.

[GitHub] spark pull request #18819: [SPARK-20713][Spark Core] Convert CommitDenied to...

2017-08-03 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18819

[GitHub] spark issue #18836: Update SortMergeJoinExec.scala

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18836 @BoleynSu Do you want to continue the PR, or do you want us to take it over?

[GitHub] spark issue #18836: Update SortMergeJoinExec.scala

2017-08-03 Thread BoleynSu
Github user BoleynSu commented on the issue: https://github.com/apache/spark/pull/18836 A test case to make the existing code fail. @srowen I am sorry that this pull request is not well formatted but I just want to help. ```scala import org.apache.spark.sql.SparkSession

[GitHub] spark issue #18836: Update SortMergeJoinExec.scala

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18836 Thanks for fixing this. Please follow the contribution guideline. Also, you need to add a test case. You can follow what we did in this PR: https://github.com/apache/spark/pull/17339

[GitHub] spark issue #18819: [SPARK-20713][Spark Core] Convert CommitDenied to TaskKi...

2017-08-03 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18819 +1

[GitHub] spark issue #18499: [SPARK-21176][WEB UI] Use a single ProxyServlet to proxy...

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18499 **[Test build #80216 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80216/testReport)** for PR 18499 at commit [`45dd13c`](https://github.com/apache/spark/commit/45

[GitHub] spark issue #18499: [SPARK-21176][WEB UI] Use a single ProxyServlet to proxy...

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18499 retest this please

[GitHub] spark issue #18835: [SPARK-21628][BUILD] Explicitly specify Java version in ...

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18835 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80212/ Test FAILed.

[GitHub] spark issue #18836: Update SortMergeJoinExec.scala

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18836 Can one of the admins verify this patch?

[GitHub] spark issue #18835: [SPARK-21628][BUILD] Explicitly specify Java version in ...

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18835 Merged build finished. Test FAILed.

[GitHub] spark issue #18835: [SPARK-21628][BUILD] Explicitly specify Java version in ...

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18835 **[Test build #80212 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80212/testReport)** for PR 18835 at commit [`3c8e473`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-08-03 Thread kevinyu98
Github user kevinyu98 commented on the issue: https://github.com/apache/spark/pull/12646 @gatorsmile Hello Xiao, can you help retest this? Thanks

[GitHub] spark issue #18836: Update SortMergeJoinExec.scala

2017-08-03 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18836 You didn't read the link above, I take it? http://spark.apache.org/contributing.html

[GitHub] spark pull request #18836: Update SortMergeJoinExec.scala

2017-08-03 Thread BoleynSu
GitHub user BoleynSu opened a pull request: https://github.com/apache/spark/pull/18836 Update SortMergeJoinExec.scala fix a bug in outputOrdering ## What changes were proposed in this pull request? Change `case Inner` to `case _: InnerLike` so that Cross will be han
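
The PR description above is truncated, but the difference between matching one specific join type and matching a whole family of join types can be modelled in plain Python. This is only a toy sketch: the classes below are stand-ins written for illustration, not Spark's own `JoinType` hierarchy, and the helper names `matches_inner_only` and `matches_inner_like` are hypothetical.

```python
class JoinType:
    """Toy base class for join types (a stand-in, not Spark's class)."""


class InnerLike(JoinType):
    """Toy family of joins that behave like an inner join."""


class Inner(InnerLike):
    pass


class Cross(InnerLike):
    pass


class LeftOuter(JoinType):
    pass


def matches_inner_only(join_type: JoinType) -> bool:
    # Rough analogue of Scala's `case Inner`: only the exact type matches.
    return type(join_type) is Inner


def matches_inner_like(join_type: JoinType) -> bool:
    # Rough analogue of Scala's `case _: InnerLike`: any subtype matches.
    return isinstance(join_type, InnerLike)


# Cross is skipped by the exact match but covered by the family match.
cross_exact = matches_inner_only(Cross())
cross_family = matches_inner_like(Cross())
```

In this model `cross_exact` is `False` and `cross_family` is `True`, while a `LeftOuter` join is rejected by both helpers.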

[GitHub] spark pull request #18281: [SPARK-21027][ML][PYTHON] Added tunable paralleli...

2017-08-03 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18281#discussion_r131215571 --- Diff: python/pyspark/ml/param/_shared_params_code_gen.py --- @@ -152,6 +152,8 @@ def get$Name(self): ("varianceCol", "column name for th

[GitHub] spark issue #18795: Fix Java SimpleApp spark application

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18795 **[Test build #3877 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3877/testReport)** for PR 18795 at commit [`7471781`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #18797: [SPARK-21523][ML] update breeze to 0.13.2 for an emergen...

2017-08-03 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/18797 Thanks! Waiting for the AFT test code author to figure out how to modify the test case.

[GitHub] spark pull request #18831: [SPARK-21622][ML][SparkR] Support offset in Spark...

2017-08-03 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/18831#discussion_r131209057 --- Diff: R/pkg/R/mllib_regression.R --- @@ -125,7 +127,7 @@ setClass("IsotonicRegressionModel", representation(jobj = "jobj")) #' @seealso \link{g

[GitHub] spark pull request #18831: [SPARK-21622][ML][SparkR] Support offset in Spark...

2017-08-03 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/18831#discussion_r131211844 --- Diff: R/pkg/R/mllib_regression.R --- @@ -159,10 +161,16 @@ setMethod("spark.glm", signature(data = "SparkDataFrame", formula = "formula"),

[GitHub] spark pull request #18824: [SPARK-21617][SQL] Store correct metadata in Hive...

2017-08-03 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/18824#discussion_r131211342 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -413,7 +414,10 @@ private[hive] class HiveClientImpl(

[GitHub] spark pull request #18824: [SPARK-21617][SQL] Store correct metadata in Hive...

2017-08-03 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/18824#discussion_r131210013 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -616,15 +616,24 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #18820: [SPARK-14932][SQL] Allow DataFrame.replace() to r...

2017-08-03 Thread nchammas
Github user nchammas commented on a diff in the pull request: https://github.com/apache/spark/pull/18820#discussion_r131208895 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1423,8 +1434,9 @@ def all_of_(xs): subset = [subset] # Verify we were

[GitHub] spark issue #18786: [SPARK-21584][SQL][SparkR] Update R method for summary t...

2017-08-03 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18786 Is it too late to change the Scala side output format? I suspect it doesn't matter too much on Scala/Python which order they are in and preserving the existing order in R could be helpful.

[GitHub] spark issue #18786: [SPARK-21584][SQL][SparkR] Update R method for summary t...

2017-08-03 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/18786 I see. I recall the method name discussion; though changing API and/or output format is something we generally want to avoid. Something like this has been called out in past releases as we shoul

[GitHub] spark issue #17849: [SPARK-10931][ML][PYSPARK] PySpark Models Copy Param Val...

2017-08-03 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/17849 If params are defined in the PySpark model, when that model is fit a Scala version is created then the PySpark model is wrapped around it. The param values from the Scala version are never tran

[GitHub] spark issue #18831: [SPARK-21622][ML][SparkR] Support offset in SparkR GLM

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18831 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80213/ Test FAILed.

[GitHub] spark issue #18831: [SPARK-21622][ML][SparkR] Support offset in SparkR GLM

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18831 **[Test build #80213 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80213/testReport)** for PR 18831 at commit [`6ec068e`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #18831: [SPARK-21622][ML][SparkR] Support offset in SparkR GLM

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18831 Merged build finished. Test FAILed.

[GitHub] spark pull request #18779: [SPARK-21580][SQL]Integers in aggregation express...

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18779#discussion_r131200152 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/SubstituteUnresolvedOrdinals.scala --- @@ -1,54 +0,0 @@ -/* - * Lic

[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18814 Merged build finished. Test PASSed.

[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18814 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80211/ Test PASSed.

[GitHub] spark issue #18814: [SPARK-21608][SPARK-9221][SQL] Window rangeBetween() API...

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18814 **[Test build #80211 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80211/testReport)** for PR 18814 at commit [`f247191`](https://github.com/apache/spark/commit/f

[GitHub] spark pull request #18824: [SPARK-21617][SQL] Store correct metadata in Hive...

2017-08-03 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/18824#discussion_r131198143 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -413,7 +414,10 @@ private[hive] class HiveClientImpl(

[GitHub] spark issue #18832: [SPARK-21623][ML]fix RF doc

2017-08-03 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/18832 If you want to change it, that's fine. I think it's fine either way.

[GitHub] spark issue #18833: [SPARK-21625][SQL] sqrt(negative number) should be null.

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18833 **[Test build #80214 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80214/testReport)** for PR 18833 at commit [`3e9ec8c`](https://github.com/apache/spark/commit/3e

[GitHub] spark issue #18790: [SPARK-21587][SS] Added pushdown through watermarks.

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18790 **[Test build #80215 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80215/testReport)** for PR 18790 at commit [`8c73117`](https://github.com/apache/spark/commit/8c

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

2017-08-03 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18779 It might help to document this in the Dataset `groupBy` comment.

[GitHub] spark pull request #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties...

2017-08-03 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/18668#discussion_r131194926 --- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala --- @@ -50,6 +50,7 @@ private[hive] objec

[GitHub] spark pull request #18825: [SPARK-12717][PYTHON][BRANCH-2.1] Adding thread-s...

2017-08-03 Thread BryanCutler
Github user BryanCutler closed the pull request at: https://github.com/apache/spark/pull/18825

[GitHub] spark pull request #18823: [SPARK-12717][PYTHON][BRANCH-2.2] Adding thread-s...

2017-08-03 Thread BryanCutler
Github user BryanCutler closed the pull request at: https://github.com/apache/spark/pull/18823

[GitHub] spark issue #18779: [SPARK-21580][SQL]Integers in aggregation expressions ar...

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18779 Unfortunately, our Dataset APIs support it. We have to keep the support.

[GitHub] spark issue #18819: [SPARK-20713][Spark Core] Convert CommitDenied to TaskKi...

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80210/ Test PASSed.

[GitHub] spark issue #18819: [SPARK-20713][Spark Core] Convert CommitDenied to TaskKi...

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18819 Merged build finished. Test PASSed.

[GitHub] spark issue #18819: [SPARK-20713][Spark Core] Convert CommitDenied to TaskKi...

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18819 **[Test build #80210 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80210/testReport)** for PR 18819 at commit [`f975922`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #18804: [SPARK-21599][SQL] Collecting column statistics for data...

2017-08-03 Thread dilipbiswal
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/18804 @gatorsmile Thank you very much!! Sure, I will submit a backport to 2.2.

[GitHub] spark issue #18804: [SPARK-21599][SQL] Collecting column statistics for data...

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18804 Merged to master. Could you submit a backport PR to 2.2?

[GitHub] spark issue #18832: [SPARK-21623][ML]fix RF doc

2017-08-03 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/18832 I agree with you. Do you think we should update the comment to help others understand the code, since parentStats is updated and used in each iteration? Thanks.

[GitHub] spark pull request #18804: [SPARK-21599][SQL] Collecting column statistics f...

2017-08-03 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18804

[GitHub] spark issue #18804: [SPARK-21599][SQL] Collecting column statistics for data...

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18804 Thanks! LGTM

[GitHub] spark pull request #18804: [SPARK-21599][SQL] Collecting column statistics f...

2017-08-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18804#discussion_r131190694 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala --- @@ -117,6 +125,72 @@ class StatisticsSuite extends StatisticsColle

[GitHub] spark issue #18832: [SPARK-21623][ML]fix RF doc

2017-08-03 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/18832 No, I don't think so. Computing parent stats is a very small fraction of the time and memory compared with the overall `allStats` array. That's why we decided to just add it in the first place.

[GitHub] spark pull request #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties...

2017-08-03 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/18668#discussion_r131188713 --- Diff: docs/configuration.md --- @@ -2326,7 +2326,7 @@ from this directory. # Inheriting Hadoop Cluster Configuration If you plan to r

[GitHub] spark pull request #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties...

2017-08-03 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/18668#discussion_r131187701 --- Diff: docs/configuration.md --- @@ -2335,5 +2335,61 @@ The location of these configuration files varies across Hadoop versions, but a common lo

[GitHub] spark pull request #18668: [SPARK-21451][SQL]get `spark.hadoop.*` properties...

2017-08-03 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/18668#discussion_r131185632 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala --- @@ -404,6 +404,13 @@ private[spark] object HiveUtils extends Logging {

[GitHub] spark issue #18832: [SPARK-21623][ML]fix RF doc

2017-08-03 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/18832 I understand your point. I am confused because the code doesn't work that way. The code updates parentStats in each iteration. Actually, we only need to update parentStats for the first iteration. So

[GitHub] spark issue #18831: [SPARK-21622][ML][SparkR] Support offset in SparkR GLM

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18831 **[Test build #80213 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80213/testReport)** for PR 18831 at commit [`6ec068e`](https://github.com/apache/spark/commit/6e

[GitHub] spark issue #18831: [SPARK-21622][ML][SparkR] Support offset in SparkR GLM

2017-08-03 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18831 Jenkins, retest this please

[GitHub] spark issue #18835: [SPARK-21628][BUILD] Explicitly specify Java version in ...

2017-08-03 Thread aray
Github user aray commented on the issue: https://github.com/apache/spark/pull/18835 Thanks, I see it now.

[GitHub] spark pull request #18835: [SPARK-21628][BUILD] Explicitly specify Java vers...

2017-08-03 Thread aray
Github user aray closed the pull request at: https://github.com/apache/spark/pull/18835

[GitHub] spark issue #18630: [SPARK-12559][SPARK SUBMIT] fix --packages for stand-alo...

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18630 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80208/ Test PASSed.

[GitHub] spark issue #18630: [SPARK-12559][SPARK SUBMIT] fix --packages for stand-alo...

2017-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18630 Merged build finished. Test PASSed.

[GitHub] spark issue #18630: [SPARK-12559][SPARK SUBMIT] fix --packages for stand-alo...

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18630 **[Test build #80208 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80208/testReport)** for PR 18630 at commit [`70649e2`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #18835: [SPARK-21628][BUILD] Explicitly specify Java version in ...

2017-08-03 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18835 This has been reported a few times and is already fixed; you can close this. (It needs to use java.version anyway.)

[GitHub] spark issue #18832: [SPARK-21623][ML]fix RF doc

2017-08-03 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/18832 I don't agree the comment is _misleading_. It might be confusing, but that's something different. The reason that the `DTStatsAggregator` needs to keep track of `parentStats` is so that we c

[GitHub] spark issue #18835: [SPARK-21628][BUILD] Explicitly specify Java version in ...

2017-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18835 **[Test build #80212 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80212/testReport)** for PR 18835 at commit [`3c8e473`](https://github.com/apache/spark/commit/3c

[GitHub] spark pull request #18835: [SPARK-21628][BUILD] Explicitly specify Java vers...

2017-08-03 Thread aray
GitHub user aray opened a pull request: https://github.com/apache/spark/pull/18835 [SPARK-21628][BUILD] Explicitly specify Java version in maven compiler plugin so IntelliJ imports project correctly ## What changes were proposed in this pull request? Explicitly specify Java

[GitHub] spark issue #18832: [SPARK-21623][ML]fix RF doc

2017-08-03 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/18832 parentStats is used in this code: binAggregates.getParentImpurityCalculator(), which is used in every iteration. So that comment seems very misleading. `} else if (binAggregates.metadata.isUnor

[GitHub] spark issue #18832: [SPARK-21623][ML]fix RF doc

2017-08-03 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/18832 node.stats is ImpurityStats and parentStats is Array[Double]; they are different. Maybe this comment should apply to node.stats rather than to parentStats. Is my understanding wrong?

[GitHub] spark pull request #18499: [SPARK-21176][WEB UI] Use a single ProxyServlet t...

2017-08-03 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/18499#discussion_r131119963 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -194,30 +194,26 @@ private[spark] object JettyUtils extends Logging {

[GitHub] spark pull request #18499: [SPARK-21176][WEB UI] Use a single ProxyServlet t...

2017-08-03 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/18499#discussion_r131085567 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -194,30 +194,26 @@ private[spark] object JettyUtils extends Logging {

[GitHub] spark pull request #18499: [SPARK-21176][WEB UI] Use a single ProxyServlet t...

2017-08-03 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/18499#discussion_r131118306 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -194,30 +194,26 @@ private[spark] object JettyUtils extends Logging {
