[GitHub] spark pull request: [SPARK-14763] [SQL] fix subquery resolution

2016-04-21 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12539#discussion_r60697951 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -862,28 +862,68 @@ class Analyzer( object ResolveSu

[GitHub] spark pull request: [SPARK-14819] [SQL] Improve SET / SET -v comma...

2016-04-21 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12583#issuecomment-213292518 @bomeng while you are at this, do you mind implementing https://issues.apache.org/jira/browse/SPARK-14806 together? --- If your project is set up for it, you can reply t

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/12599#discussion_r60697763 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -83,6 +83,16 @@ case class SerializeFromObject(

[GitHub] spark pull request: [SPARK-14763] [SQL] fix subquery resolution

2016-04-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12539#discussion_r60697737 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -862,28 +862,68 @@ class Analyzer( object Resol

[GitHub] spark pull request: [SPARK-14763] [SQL] fix subquery resolution

2016-04-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12539#discussion_r60697769 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -862,28 +862,68 @@ class Analyzer( object Resol

[GitHub] spark pull request: [SPARK-14763] [SQL] fix subquery resolution

2016-04-21 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12539#discussion_r60697711 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -862,28 +862,68 @@ class Analyzer( object ResolveSu

[GitHub] spark pull request: [SPARK-14763] [SQL] fix subquery resolution

2016-04-21 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12539#discussion_r60697650 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1536,17 +1560,22 @@ object RewritePredicateSubquery ext

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213291287 **[Test build #56667 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56667/consoleFull)** for PR 12588 at commit [`f16dbd2`](https://gi

[GitHub] spark pull request: [SPARK-14819] [SQL] Improve SET / SET -v comma...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12583#issuecomment-213291210 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14819] [SQL] Improve SET / SET -v comma...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12583#issuecomment-213291213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14819] [SQL] Improve SET / SET -v comma...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12583#issuecomment-213290728 **[Test build #56653 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56653/consoleFull)** for PR 12583 at commit [`99ff8a5`](https://g

[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...

2016-04-21 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12494#discussion_r60697339 --- Diff: python/pyspark/sql/readwriter.py --- @@ -367,16 +370,19 @@ def format(self, source): @since(1.5) def option(self, key, value):

[GitHub] spark pull request: [SPARK-14489][SPARK-14153][ML][PYSPARK] Suppor...

2016-04-21 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12577#issuecomment-213289938 @sethah you're correct, this is a bit of a band-aid fix. However, the real fix is getting CrossValidator to handle cases like this in a principled and generic way (and

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12599#discussion_r60697122 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -83,6 +83,16 @@ case class SerializeFromObject(

[GitHub] spark pull request: [SPARK-14489][SPARK-14153][ML][PYSPARK] Suppor...

2016-04-21 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12577#issuecomment-213289486 @holdenk Yes, I think it makes sense to add something to the docs on cross-val to illustrate use cases.

[GitHub] spark pull request: [SPARK-14489][SPARK-14153][ML][PYSPARK] Suppor...

2016-04-21 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12577#discussion_r60697050 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/RegressionEvaluator.scala --- @@ -69,7 +69,27 @@ final class RegressionEvaluator @Since("1.4.0"

[GitHub] spark pull request: [SPARK-14489][SPARK-14153][ML][PYSPARK] Suppor...

2016-04-21 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12577#discussion_r60696956 --- Diff: python/pyspark/ml/evaluation.py --- @@ -187,6 +187,13 @@ class RegressionEvaluator(JavaEvaluator, HasLabelCol, HasPredictionCol): 0.993...

[GitHub] spark pull request: [SPARK-14489][SPARK-14153][ML][PYSPARK] Suppor...

2016-04-21 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/12577#discussion_r60696894 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/RegressionEvaluator.scala --- @@ -86,8 +106,9 @@ final class RegressionEvaluator @Since("1.4.0"

[GitHub] spark pull request: [SPARK-9778][SQL] remove unnecessary evaluatio...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8066#issuecomment-213289056 **[Test build #5 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/5/consoleFull)** for PR 8066 at commit [`4baaf55`](https://gith

[GitHub] spark pull request: [SPARK-14842][SQL] Implement view creation in ...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12603#issuecomment-213289054 **[Test build #56664 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56664/consoleFull)** for PR 12603 at commit [`71f258e`](https://gi

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213289068 **[Test build #56665 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56665/consoleFull)** for PR 12599 at commit [`9d2033f`](https://gi

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/12599#discussion_r60696638 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -83,6 +83,16 @@ case class SerializeFromObject(

[GitHub] spark pull request: [SPARK-14763] [SQL] fix subquery resolution

2016-04-21 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/12539#discussion_r60696584 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -862,28 +862,68 @@ class Analyzer( object ResolveSu

[GitHub] spark pull request: [SPARK-14582] [SQL] increase parallelism for s...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12344#issuecomment-213288687 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14763] [SQL] fix subquery resolution

2016-04-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12539#discussion_r60696559 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1536,17 +1560,22 @@ object RewritePredicateSubquery

[GitHub] spark pull request: [SPARK-14582] [SQL] increase parallelism for s...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12344#issuecomment-213288686 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14842][SQL] Implement view creation in ...

2016-04-21 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/12603 [SPARK-14842][SQL] Implement view creation in sql/core ## What changes were proposed in this pull request? This patch re-implements view creation command in sql/core, based on the pre-existing vie

[GitHub] spark pull request: [SPARK-14582] [SQL] increase parallelism for s...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12344#issuecomment-213288588 **[Test build #56658 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56658/consoleFull)** for PR 12344 at commit [`f688bf1`](https://g

[GitHub] spark pull request: [SPARK-14791] [SQL] fix risk condition between...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12600#issuecomment-213287306 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14791] [SQL] fix risk condition between...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12600#issuecomment-213287309 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-13857][ML][WIP] Add "recommend all" fun...

2016-04-21 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12574#issuecomment-213287357 The test failure seems to be caused by the issue in #12599.

[GitHub] spark pull request: [SPARK-14791] [SQL] fix risk condition between...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12600#issuecomment-213286426 **[Test build #56648 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56648/consoleFull)** for PR 12600 at commit [`e5bd221`](https://g

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-21 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/12560#discussion_r60696182 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -607,7 +611,8 @@ object ALS extends DefaultParamsReadable[ALS] with Lo

[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...

2016-04-21 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-213285149 @davies Any more comments regarding this PR?

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213285109 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213285113 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213284793 **[Test build #56652 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56652/consoleFull)** for PR 12588 at commit [`21595e9`](https://g

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213284497 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14763] [SQL] fix subquery resolution

2016-04-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12539#discussion_r60696040 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -862,28 +862,68 @@ class Analyzer( object Resol

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213284493 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213284105 **[Test build #56650 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56650/consoleFull)** for PR 12599 at commit [`8e0541c`](https://g

[GitHub] spark pull request: [SPARK-14763] [SQL] fix subquery resolution

2016-04-21 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/12539#discussion_r60695940 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -862,28 +862,68 @@ class Analyzer( object Resol

[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-213283714 **[Test build #56663 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56663/consoleFull)** for PR 12493 at commit [`75dae85`](https://gi

[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-213283133 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-213283127 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-213282207 **[Test build #56649 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56649/consoleFull)** for PR 12494 at commit [`9ab2c01`](https://g

[GitHub] spark pull request: [DOCS][MINOR] Accumulators

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12569#issuecomment-213282258 @jaceklaskowski Would you accept my PR if I fixed a bug in datasource in Spark SQL and named it "datasource"?

[GitHub] spark pull request: [SPARK-14369][SQL] Locality support for FileSc...

2016-04-21 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/12527#discussion_r60695719 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala --- @@ -131,4 +134,23 @@ class FileScanRDD( }

[GitHub] spark pull request: [SPARK-14826][SQL] Remove HiveQueryExecution

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12588#issuecomment-213281485 **[Test build #2855 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2855/consoleFull)** for PR 12588 at commit [`21595e9`](https://

[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...

2016-04-21 Thread sun-rui
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r60695623 --- Diff: R/pkg/R/DataFrame.R --- @@ -1137,11 +1137,22 @@ setMethod("summarize", #' @rdname dapply #' @name dapply #' @export +#' @examples

[GitHub] spark pull request: [DOCS][MINOR] Accumulators

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12569#issuecomment-213280716 @jaceklaskowski Because I thought it was obviously not clear. To me, it sounds like adding a whole document for Accumulators. As you just said, I think "Added scr

[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...

2016-04-21 Thread sun-rui
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r60695610 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/r/MapPartitionsRWrapper.scala --- @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache S

[GitHub] spark pull request: [SPARK-14841][SQL] Move SQLBuilder into sql/co...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12602#issuecomment-213280524 **[Test build #56662 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56662/consoleFull)** for PR 12602 at commit [`e931362`](https://gi

[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...

2016-04-21 Thread sun-rui
Github user sun-rui commented on a diff in the pull request: https://github.com/apache/spark/pull/12493#discussion_r60695589 --- Diff: R/pkg/inst/tests/testthat/test_sparkSQL.R --- @@ -1964,6 +1964,38 @@ test_that("Method str()", { expect_equal(capture.output(utils:::str(iris

[GitHub] spark pull request: [SPARK-14841][SQL] Move SQLBuilder into sql/co...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12602#issuecomment-213278547 **[Test build #56661 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56661/consoleFull)** for PR 12602 at commit [`6b1da0f`](https://gi

[GitHub] spark pull request: [SPARK-14736][core] Deadlock in registering ap...

2016-04-21 Thread nirandaperera
Github user nirandaperera commented on the pull request: https://github.com/apache/spark/pull/12506#issuecomment-213277995 Great! Can we get this PR merged then? Is there anything else I should do to get it merged?

[GitHub] spark pull request: [SPARK-14557][SQL] Reading textfile (created t...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12356#issuecomment-213276919 **[Test build #56660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56660/consoleFull)** for PR 12356 at commit [`ca9a160`](https://gi

[GitHub] spark pull request: [SPARK-14841][SQL] Move SQLBuilder into sql/co...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12602#issuecomment-213276878 **[Test build #56659 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56659/consoleFull)** for PR 12602 at commit [`2c3dddb`](https://gi

[GitHub] spark pull request: [SPARK-14841][SQL] Move SQLBuilder into sql/co...

2016-04-21 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/12602 [SPARK-14841][SQL] Move SQLBuilder into sql/core ## What changes were proposed in this pull request? This patch moves SQLBuilder into sql/core so we can in the future move view generation also int

[GitHub] spark pull request: [DOCS][MINOR] Accumulators

2016-04-21 Thread jaceklaskowski
Github user jaceklaskowski commented on the pull request: https://github.com/apache/spark/pull/12569#issuecomment-213275339 @HyukjinKwon In that case I'd ask for an alternative, as I currently have no idea how to make it clearer (it wasn't me who said "the title is not clear" :)) What d

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213275326 @cloud-fan If we do not produce objects, it should work. Otherwise, we will hit the exception when the parent node calculates the statistics: https://github.com/apac

[GitHub] spark pull request: [SPARK-14557][SQL] Reading textfile (created t...

2016-04-21 Thread kasjain
Github user kasjain commented on the pull request: https://github.com/apache/spark/pull/12356#issuecomment-213275310 Resolved the merge conflicts for easy merging

[GitHub] spark pull request: [SPARK-14387][SQL] Exceptions thrown when quer...

2016-04-21 Thread rajeshbalamohan
Github user rajeshbalamohan commented on the pull request: https://github.com/apache/spark/pull/12293#issuecomment-213275126 cc @liancheng

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213274784 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213274786 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213274657 **[Test build #56647 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56647/consoleFull)** for PR 12599 at commit [`84207c7`](https://g

[GitHub] spark pull request: [SPARK-14525][SQL] Make DataFrameWrite.save wo...

2016-04-21 Thread JustinPihony
GitHub user JustinPihony opened a pull request: https://github.com/apache/spark/pull/12601 [SPARK-14525][SQL] Make DataFrameWrite.save work for jdbc ## What changes were proposed in this pull request? This change modifies the implementation of DataFrameWriter.save such that

[GitHub] spark pull request: [SPARK-14525][SQL] Make DataFrameWrite.save wo...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12601#issuecomment-213274569 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-14433][PySpark][ML]:PySpark ml Gaussian...

2016-04-21 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/12402#discussion_r60694336 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala --- @@ -104,6 +105,17 @@ class GaussianMixtureModel private[ml] (

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213272475 @viirya, yes, you are right: `ObjectConsumer` may also produce objects, so we should implement `statistic` in `SerializeFromObject`. @gatorsmile I may misunde

[GitHub] spark pull request: [SPARK-14521][SQL]StackOverflowError in Kryo w...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12598#discussion_r60693434 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala --- @@ -324,8 +324,8 @@ private[joins] object UnsafeHashedRe

[GitHub] spark pull request: [SPARK-14521][SQL]StackOverflowError in Kryo w...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12598#discussion_r60693402 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala --- @@ -325,7 +325,7 @@ private[joins] object UnsafeHashedRe

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213269638 The problem is the parent node calls the defaultSize of its child's output. ```scala val childRowSize = child.output.map(_.dataType.defaultSize).sum +

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-213268198 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-213268088 **[Test build #56655 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56655/consoleFull)** for PR 12560 at commit [`d3e4fe2`](https://g

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-213268202 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213267994 @viirya Nope. Actually, I did that before. It does not work. The issue is that its parent node's statistics calculation triggers the exception.

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213265799 @gatorsmile Look at the `SerializeFromObject` in your plan. If we implement `statistics` in it, we can skip estimating the size of `MapGroups`, which produces domain objects

[GitHub] spark pull request: [SPARK-14582] [SQL] increase parallelism for s...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12344#issuecomment-213266271 **[Test build #56658 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56658/consoleFull)** for PR 12344 at commit [`f688bf1`](https://gi

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213265498 When we calculate the statistics of `Filter`, we hit the issue caused by the `UnaryNode`'s default statistics calculation, right?

[GitHub] spark pull request: [SPARK-14582] [SQL] increase parallelism for s...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12344#issuecomment-213265119 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14582] [SQL] increase parallelism for s...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12344#issuecomment-213265121 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14582] [SQL] increase parallelism for s...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12344#issuecomment-213265117 **[Test build #56657 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56657/consoleFull)** for PR 12344 at commit [`6cb931a`](https://g

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread gatorsmile
Github user gatorsmile commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213265063 ``` == Optimized Logical Plan == Project [user#7,recommendations#48 AS prediction#77,actual#65 AS label#78] +- Join Inner, Some((user#7 = id#64)) :-

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-213264908 **[Test build #56655 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56655/consoleFull)** for PR 12560 at commit [`d3e4fe2`](https://gi

[GitHub] spark pull request: [SPARK-14582] [SQL] increase parallelism for s...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12344#issuecomment-213264912 **[Test build #56657 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56657/consoleFull)** for PR 12344 at commit [`6cb931a`](https://gi

[GitHub] spark pull request: [SPARK-14669] [SQL] Fix some SQL metrics in co...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12425#issuecomment-213264910 **[Test build #56656 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56656/consoleFull)** for PR 12425 at commit [`1076c75`](https://gi

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213264884 @cloud-fan Is there a guarantee that an `ObjectConsumer` can't produce domain objects? If not, I think it is safer to implement `statistics` in `SerializeFromObject`, instea

[GitHub] spark pull request: [SPARK-14669] [SQL] Fix some SQL metrics in co...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12425#issuecomment-213264628 Merged build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-14669] [SQL] Fix some SQL metrics in co...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12425#issuecomment-213264623 **[Test build #56654 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56654/consoleFull)** for PR 12425 at commit [`cc65830`](https://g

[GitHub] spark pull request: [SPARK-14669] [SQL] Fix some SQL metrics in co...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12425#issuecomment-213264631 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213264562 Yea I see.

[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-213264603 Filed in [SPARK-14839](https://issues.apache.org/jira/browse/SPARK-14839). I might be able to give it a shot **if** nobody else gives it a try.

[GitHub] spark pull request: [SPARK-14669] [SQL] Fix some SQL metrics in co...

2016-04-21 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/12425#issuecomment-213264516 @zsxwing Addressed your comments.

[GitHub] spark pull request: [SPARK-14571][ML]Log instrumentation in ALS

2016-04-21 Thread wangmiao1981
Github user wangmiao1981 commented on the pull request: https://github.com/apache/spark/pull/12560#issuecomment-213264393 retest this please

[GitHub] spark pull request: [SPARK-14731][shuffle]Revert SPARK-12130 to ma...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12568#issuecomment-213264162 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/

[GitHub] spark pull request: [SPARK-14838][SQL] Skip automatically broadcas...

2016-04-21 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12599#issuecomment-213264156 The object operators are really special: they break the contract that an operator will always produce unsafe rows, so their usage is quite limited. Generally speaking, an

[GitHub] spark pull request: [SPARK-14731][shuffle]Revert SPARK-12130 to ma...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12568#issuecomment-213264159 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-14669] [SQL] Fix some SQL metrics in co...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12425#issuecomment-213263958 **[Test build #56654 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56654/consoleFull)** for PR 12425 at commit [`cc65830`](https://gi

[GitHub] spark pull request: [SPARK-14731][shuffle]Revert SPARK-12130 to ma...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12568#issuecomment-213263681 **[Test build #56641 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56641/consoleFull)** for PR 12568 at commit [`f10a23c`](https://g
