[GitHub] spark issue #17938: [SPARK-20694][DOCS][SQL] Document DataFrameWriter partit...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17938 @zero323 We already support it for data source tables. Below is just an example.
```SQL
CREATE TABLE tbl(a INT, b INT) USING parquet CLUSTERED BY (a) SORTED BY (b) INTO 5 BUCKETS
```

[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r116351461 --- Diff: python/pyspark/sql/context.py --- @@ -232,6 +232,23 @@ def registerJavaFunction(self, name, javaClassName, returnType=None):

[GitHub] spark issue #17308: [SPARK-19968][SS] Use a cached instance of `KafkaProduce...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17308 **[Test build #76890 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76890/testReport)** for PR 17308 at commit

[GitHub] spark issue #17971: Branch 0.5

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17971 @peterpi0915 this looks like it was opened by mistake. Could you close it, please? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request #16598: [SPARK-19236][Core] Added createOrReplaceGlobalTe...

2017-05-12 Thread arman1371
Github user arman1371 commented on a diff in the pull request: https://github.com/apache/spark/pull/16598#discussion_r116350696 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2603,6 +2603,21 @@ class Dataset[T] private[sql]( def

[GitHub] spark issue #17970: [SPARK-20730][SQL] Add an optimizer rule to combine nest...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17970 Merged build finished. Test PASSed.

[GitHub] spark issue #17970: [SPARK-20730][SQL] Add an optimizer rule to combine nest...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17970 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76889/ Test PASSed. ---

[GitHub] spark issue #17970: [SPARK-20730][SQL] Add an optimizer rule to combine nest...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17970 **[Test build #76889 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76889/testReport)** for PR 17970 at commit

[GitHub] spark issue #17966: [SPARK-20727] Skip tests that use Hadoop utils on CRAN W...

2017-05-12 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/17966 So I'd propose this
```
is_cran <- function() { !identical(Sys.getenv("NOT_CRAN"), "true") }
is_windows <- function() { .Platform$OS.type == "windows" }
hadoop_home_set <-
```

[GitHub] spark pull request #17965: [SPARK-20726][SPARKR] wrapper for SQL broadcast

2017-05-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17965#discussion_r116350427 --- Diff: R/pkg/R/context.R --- @@ -258,15 +258,15 @@ includePackage <- function(sc, pkg) { #' #' # Large Matrix object that we want to

[GitHub] spark pull request #17965: [SPARK-20726][SPARKR] wrapper for SQL broadcast

2017-05-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17965#discussion_r116350400 --- Diff: R/pkg/R/generics.R --- @@ -799,6 +799,10 @@ setGeneric("write.df", function(df, path = NULL, ...) { standardGeneric("write.d #' @export

[GitHub] spark pull request #17965: [SPARK-20726][SPARKR] wrapper for SQL broadcast

2017-05-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17965#discussion_r116350359 --- Diff: R/pkg/R/DataFrame.R --- @@ -3769,3 +3769,33 @@ setMethod("alias", sdf <- callJMethod(object@sdf, "alias", data)

[GitHub] spark pull request #17965: [SPARK-20726][SPARKR] wrapper for SQL broadcast

2017-05-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17965#discussion_r116350178 --- Diff: R/pkg/R/DataFrame.R --- @@ -3769,3 +3769,33 @@ setMethod("alias", sdf <- callJMethod(object@sdf, "alias", data)

[GitHub] spark pull request #17965: [SPARK-20726][SPARKR] wrapper for SQL broadcast

2017-05-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17965#discussion_r116350368 --- Diff: R/pkg/R/context.R --- @@ -258,15 +258,15 @@ includePackage <- function(sc, pkg) { #' #' # Large Matrix object that we want to

[GitHub] spark pull request #17965: [SPARK-20726][SPARKR] wrapper for SQL broadcast

2017-05-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17965#discussion_r116350189 --- Diff: R/pkg/R/DataFrame.R --- @@ -3769,3 +3769,33 @@ setMethod("alias", sdf <- callJMethod(object@sdf, "alias", data)

[GitHub] spark pull request #17965: [SPARK-20726][SPARKR] wrapper for SQL broadcast

2017-05-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17965#discussion_r116350376 --- Diff: R/pkg/R/generics.R --- @@ -799,6 +799,10 @@ setGeneric("write.df", function(df, path = NULL, ...) { standardGeneric("write.d #' @export

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116350163 --- Diff: R/pkg/R/mllib_wrapper.R --- @@ -0,0 +1,61 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] spark issue #17971: Branch 0.5

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17971 Can one of the admins verify this patch?

[GitHub] spark pull request #17971: Branch 0.5

2017-05-12 Thread peterpi0915
GitHub user peterpi0915 opened a pull request: https://github.com/apache/spark/pull/17971 Branch 0.5 ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this

[GitHub] spark issue #17955: [SPARK-20715] Store MapStatuses only in MapOutputTracker...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17955 Merged build finished. Test PASSed.

[GitHub] spark issue #17955: [SPARK-20715] Store MapStatuses only in MapOutputTracker...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17955 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76886/ Test PASSed. ---

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17644 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76887/ Test PASSed. ---

[GitHub] spark issue #17955: [SPARK-20715] Store MapStatuses only in MapOutputTracker...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17955 **[Test build #76886 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76886/testReport)** for PR 17955 at commit

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17644 Merged build finished. Test PASSed.

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17644 **[Test build #76887 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76887/testReport)** for PR 17644 at commit

[GitHub] spark issue #17970: [SPARK-20730][SQL] Add an optimizer rule to combine nest...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17970 **[Test build #76889 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76889/testReport)** for PR 17970 at commit

[GitHub] spark issue #17758: [SPARK-20460][SQL] Make it more consistent to handle col...

2017-05-12 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/17758 @gatorsmile Could you check this and give me advice on it? Thanks!

[GitHub] spark pull request #17970: [SPARK-20730][SQL] Add an optimizer rule to combi...

2017-05-12 Thread maropu
GitHub user maropu opened a pull request: https://github.com/apache/spark/pull/17970 [SPARK-20730][SQL] Add an optimizer rule to combine nested Concat ## What changes were proposed in this pull request? This PR adds a new optimizer rule to combine nested Concat expressions. The master
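The idea behind combining nested Concat can be illustrated with a small, self-contained sketch. This is not the Catalyst rule from the pull request (which is written in Scala and is not shown in this archive); the `Concat` class and the `combine_concats` function below are hypothetical stand-ins, written in Python only to show the tree flattening that the PR title describes.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Concat:
    """Hypothetical stand-in for a Concat expression node (not Spark's class)."""
    children: Tuple["Expr", ...]


# In this sketch a leaf expression is just a string (e.g. a column name).
Expr = Union[str, Concat]


def combine_concats(expr: Expr) -> Expr:
    """Flatten Concat(Concat(a, b), c) into Concat(a, b, c), recursively."""
    if not isinstance(expr, Concat):
        return expr  # leaves are returned unchanged
    flat = []
    for child in expr.children:
        child = combine_concats(child)  # flatten the subtree first
        if isinstance(child, Concat):
            flat.extend(child.children)  # splice the nested children in place
        else:
            flat.append(child)
    return Concat(tuple(flat))


nested = Concat((Concat((Concat(("a", "b")), "c")), "d"))
print(combine_concats(nested))
```

The flattened node preserves child order, so the concatenated result is unchanged while the expression tree becomes one level deep.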

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17644 Build finished. Test PASSed.

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17644 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76884/ Test PASSed. ---

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17644 **[Test build #76884 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76884/testReport)** for PR 17644 at commit

[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17969 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76888/ Test PASSed. ---

[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17969 **[Test build #76888 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76888/testReport)** for PR 17969 at commit

[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17969 Merged build finished. Test PASSed.

[GitHub] spark issue #17966: [SPARK-20727] Skip tests that use Hadoop utils on CRAN W...

2017-05-12 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/17966 Probably - but how to check for Hadoop? See if HADOOP_HOME is set? We don't need to set that on *nix though, I think
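A minimal sketch of the check being discussed, translated to Python for illustration only (the thread's own helpers are R, and the definition of `hadoop_home_set` is cut off in this archive). Two things below are assumptions, not taken from the thread: that "Hadoop is available" means a non-empty `HADOOP_HOME`, and that tests are skipped only when all three conditions hold (CRAN, Windows, no Hadoop), which is how the PR title reads.

```python
import os
import sys


def is_cran(env=os.environ):
    # Mirrors the R helper from the thread: anything other than
    # NOT_CRAN == "true" is treated as a CRAN run.
    return env.get("NOT_CRAN") != "true"


def is_windows(platform=sys.platform):
    # Rough equivalent of .Platform$OS.type == "windows".
    return platform.startswith("win")


def hadoop_home_set(env=os.environ):
    # Assumption: a non-empty HADOOP_HOME means Hadoop utilities are available.
    return bool(env.get("HADOOP_HOME"))


def should_skip_hadoop_tests(env=os.environ, platform=sys.platform):
    # Assumption: skip only on CRAN + Windows when HADOOP_HOME is missing;
    # on *nix the variable is not required, so it is not consulted there.
    return is_cran(env) and is_windows(platform) and not hadoop_home_set(env)
```

Passing the environment and platform as parameters keeps the predicates easy to exercise without touching the real process environment.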

[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17969 **[Test build #76888 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76888/testReport)** for PR 17969 at commit

[GitHub] spark issue #17966: [SPARK-20727] Skip tests that use Hadoop utils on CRAN W...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17966 Just FYI, closing and reopening a PR is a workaround to re-trigger the build in AppVeyor since (I assume) none of us currently has permission to do so via the AppVeyor web UI.

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread zero323
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116346161 --- Diff: R/pkg/R/generics.R --- @@ -1535,9 +1535,7 @@ setGeneric("spark.freqItemsets", function(object) { standardGeneric("spark.freqI #' @export

[GitHub] spark issue #17966: [SPARK-20727] Skip tests that use Hadoop utils on CRAN W...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17966 Thank you for cc'ing me. I think it is primarily because a single AppVeyor account is shared across several Apache projects but, to my knowledge, only one job can run at a time. So, it

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread zero323
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116346059 --- Diff: R/pkg/R/mllib_wrapper.R --- @@ -0,0 +1,61 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread zero323
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345958 --- Diff: R/pkg/DESCRIPTION --- @@ -42,6 +42,7 @@ Collate: 'functions.R' 'install.R' 'jvm.R' +'mllib_wrapper.R' ---

[GitHub] spark pull request #16985: [SPARK-19122][SQL] Unnecessary shuffle+sort added...

2017-05-12 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/16985#discussion_r116345617 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/ReorderJoinPredicates.scala --- @@ -0,0 +1,93 @@ +/* + * Licensed to

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345383 --- Diff: R/pkg/DESCRIPTION --- @@ -42,6 +42,7 @@ Collate: 'functions.R' 'install.R' 'jvm.R' +'mllib_wrapper.R'

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345166 --- Diff: R/pkg/R/mllib_regression.R --- @@ -360,6 +338,7 @@ setMethod("spark.isoreg", signature(data = "SparkDataFrame", formula = "formula"

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345323 --- Diff: R/pkg/R/mllib_wrapper.R --- @@ -0,0 +1,61 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345209 --- Diff: R/pkg/R/mllib_wrapper.R --- @@ -0,0 +1,61 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116345283 --- Diff: R/pkg/R/mllib_wrapper.R --- @@ -0,0 +1,61 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +#

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116344992 --- Diff: R/pkg/R/mllib_classification.R --- @@ -22,29 +22,36 @@ #' #' @param jobj a Java object reference to the backing Scala

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/17969#discussion_r116344933 --- Diff: R/pkg/R/generics.R --- @@ -1535,9 +1535,7 @@ setGeneric("spark.freqItemsets", function(object) { standardGeneric("spark.freqI #'

[GitHub] spark issue #17956: [SPARK-18772][SQL] Avoid unnecessary conversion try and ...

2017-05-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17956 1. It is not good to hold off PRs. If a PR looks good and coherent, I think we could merge. 2. "How many JSON writers write a float "INF" as a string?", if it is your worry, I will

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17644 **[Test build #76887 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76887/testReport)** for PR 17644 at commit

[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17969 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76885/ Test PASSed. ---

[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17969 **[Test build #76885 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76885/testReport)** for PR 17969 at commit

[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17969 Merged build finished. Test PASSed.

[GitHub] spark issue #16743: [SPARK-19379][CORE] SparkAppHandle.getState not register...

2017-05-12 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/16743 This PR has been silent for months, so unless the owner of the PR replies it should be closed. If you want to pick up the work, you'll have to open a new PR anyway.

[GitHub] spark pull request #17955: [SPARK-20715] Store MapStatuses only in MapOutput...

2017-05-12 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/17955#discussion_r116344085 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -495,106 +532,153 @@ private[spark] class MapOutputTrackerMaster(conf:

[GitHub] spark pull request #17965: [SPARK-20726][SPARKR] wrapper for SQL broadcast

2017-05-12 Thread zero323
GitHub user zero323 reopened a pull request: https://github.com/apache/spark/pull/17965 [SPARK-20726][SPARKR] wrapper for SQL broadcast ## What changes were proposed in this pull request? - Adds R wrapper for `o.a.s.sql.functions.broadcast`. - Renames `broadcast` to

[GitHub] spark pull request #17965: [SPARK-20726][SPARKR] wrapper for SQL broadcast

2017-05-12 Thread zero323
Github user zero323 closed the pull request at: https://github.com/apache/spark/pull/17965

[GitHub] spark issue #17955: [SPARK-20715] Store MapStatuses only in MapOutputTracker...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17955 **[Test build #76886 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76886/testReport)** for PR 17955 at commit

[GitHub] spark issue #16743: [SPARK-19379][CORE] SparkAppHandle.getState not register...

2017-05-12 Thread adamstatdna
Github user adamstatdna commented on the issue: https://github.com/apache/spark/pull/16743 Yes, I believe it is still active. The solution stated by Marcelo of detecting the exit code in local mode would be a solution for my purposes of testing where you want to do end-to-end

[GitHub] spark issue #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17969 **[Test build #76885 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76885/testReport)** for PR 17969 at commit

[GitHub] spark pull request #17969: [SPARK-20729][SPARKR][ML] Reduce boilerplate in S...

2017-05-12 Thread zero323
GitHub user zero323 opened a pull request: https://github.com/apache/spark/pull/17969 [SPARK-20729][SPARKR][ML] Reduce boilerplate in Spark ML models ## What changes were proposed in this pull request? - Add `JavaModel` and `JavaMLWritable` S4 classes and mix them with

[GitHub] spark pull request #17644: [SPARK-17729] [SQL] Enable creating hive bucketed...

2017-05-12 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request: https://github.com/apache/spark/pull/17644#discussion_r116342178 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -171,8 +172,7 @@ private[hive] class

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17644 **[Test build #76884 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76884/testReport)** for PR 17644 at commit

[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Merged build finished. Test FAILed.

[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #76883 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76883/testReport)** for PR 17222 at commit

[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76883/ Test FAILed. ---

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula consistent with R when handli...

2017-05-12 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/17967 @yanboliang @MLnick @HyukjinKwon @jkbradley @sethah

[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...

2017-05-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r116340261 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala --- @@ -20,16 +20,19 @@ package

[GitHub] spark issue #17957: [SPARK-20717][SS] Minor tweaks to the MapGroupsWithState...

2017-05-12 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/17957 LGTM

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17644 Build finished. Test PASSed.

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17644 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76882/ Test PASSed. ---

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17644 **[Test build #76882 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76882/testReport)** for PR 17644 at commit

[GitHub] spark pull request #17960: [SPARK-20719] [SQL] Support LIMIT ALL

2017-05-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17960

[GitHub] spark issue #17960: [SPARK-20719] [SQL] Support LIMIT ALL

2017-05-12 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/17960 LGTM - merging to master. Thanks!

[GitHub] spark issue #17966: [SPARK-20727] Skip tests that use Hadoop utils on CRAN W...

2017-05-12 Thread shivaram
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/17966 @HyukjinKwon Do we know why things sometimes queue for a long time on AppVeyor? For example, this PR has been queued for around 5 hours now.

[GitHub] spark issue #17960: [SPARK-20719] [SQL] Support LIMIT ALL

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17960 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76881/ Test PASSed. ---

[GitHub] spark issue #17960: [SPARK-20719] [SQL] Support LIMIT ALL

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17960 Merged build finished. Test PASSed.

[GitHub] spark issue #17960: [SPARK-20719] [SQL] Support LIMIT ALL

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17960 **[Test build #76881 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76881/testReport)** for PR 17960 at commit

[GitHub] spark issue #17955: [SPARK-20715] Store MapStatuses only in MapOutputTracker...

2017-05-12 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/17955 The `MapOutputTrackerSuite` `remote fetch` test case failed as of that last commit because I didn't faithfully replicate the behavior of `clearEpoch()` / `incrementEpoch()`. In the old

[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #76883 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76883/testReport)** for PR 17222 at commit

[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Merged build finished. Test PASSed.

[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17222 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76879/ Test PASSed. ---

[GitHub] spark pull request #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJav...

2017-05-12 Thread zjffdu
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/17222#discussion_r116325723 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala --- @@ -491,20 +491,42 @@ class UDFRegistration private[sql] (functionRegistry:

[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17222 **[Test build #76879 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76879/testReport)** for PR 17222 at commit

[GitHub] spark issue #17958: [SPARK-20716][SS] StateStore.abort() should not throw ex...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17958 Merged build finished. Test PASSed.

[GitHub] spark issue #17958: [SPARK-20716][SS] StateStore.abort() should not throw ex...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17958 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76880/ Test PASSed.

[GitHub] spark issue #17958: [SPARK-20716][SS] StateStore.abort() should not throw ex...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17958 **[Test build #76880 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76880/testReport)** for PR 17958 at commit

[GitHub] spark issue #17644: [SPARK-17729] [SQL] Enable creating hive bucketed tables

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17644 **[Test build #76882 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76882/testReport)** for PR 17644 at commit

[GitHub] spark pull request #15074: [SPARK-17520] Implement a better __eq__ for Spars...

2017-05-12 Thread gglanzani
Github user gglanzani commented on a diff in the pull request: https://github.com/apache/spark/pull/15074#discussion_r116317288 --- Diff: python/pyspark/mllib/linalg/__init__.py --- @@ -1296,9 +1296,19 @@ def asML(self): return newlinalg.SparseMatrix(self.numRows,

[GitHub] spark pull request #15074: [SPARK-17520] Implement a better __eq__ for Spars...

2017-05-12 Thread gglanzani
Github user gglanzani closed the pull request at: https://github.com/apache/spark/pull/15074

[GitHub] spark issue #15074: [SPARK-17520] Implement a better __eq__ for SparseMatrix

2017-05-12 Thread gglanzani
Github user gglanzani commented on the issue: https://github.com/apache/spark/pull/15074 @HyukjinKwon So I think I will close this one. The issue is that there is never a shorter path. Take for example this code ```python C = SparseMatrix(2, 2, [0, 0, 2], [1],

[GitHub] spark issue #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17968 Can one of the admins verify this patch?

[GitHub] spark pull request #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-05-12 Thread gglanzani
GitHub user gglanzani opened a pull request: https://github.com/apache/spark/pull/17968 [SPARK-9792] Make DenseMatrix equality semantical Before, you could have this code ``` A = SparseMatrix(2, 2, [0, 2, 3], [0], [2]) B = DenseMatrix(2, 2, [2, 0, 0, 0]) B

[GitHub] spark issue #17960: [SPARK-20719] [SQL] Support LIMIT ALL

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17960 **[Test build #76881 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76881/testReport)** for PR 17960 at commit

[GitHub] spark pull request #17964: [SPARK-20725][SQL] partial aggregate should behav...

2017-05-12 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/17964#discussion_r116306324 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala --- @@ -429,17 +429,13 @@ object QueryPlan { * with

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula allows to drop the same categ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17967 Merged build finished. Test PASSed.

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula allows to drop the same categ...

2017-05-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17967 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76878/ Test PASSed.

[GitHub] spark issue #17967: [SPARK-14659][ML] RFormula allows to drop the same categ...

2017-05-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17967 **[Test build #76878 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76878/testReport)** for PR 17967 at commit

[GitHub] spark pull request #15074: [SPARK-17520] Implement a better __eq__ for Spars...

2017-05-12 Thread gglanzani
Github user gglanzani commented on a diff in the pull request: https://github.com/apache/spark/pull/15074#discussion_r116302904 --- Diff: python/pyspark/mllib/linalg/__init__.py --- @@ -1296,9 +1296,19 @@ def asML(self): return newlinalg.SparseMatrix(self.numRows,
