[GitHub] spark issue #16636: [SPARK-19279] [SQL] Block Creating a Hive Table With an ...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16636 **[Test build #71795 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71795/testReport)** for PR 16636 at commit

[GitHub] spark issue #16228: [SPARK-17076] [SQL] Cardinality estimation for join base...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16228 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71794/ Test PASSed. ---

[GitHub] spark issue #16228: [SPARK-17076] [SQL] Cardinality estimation for join base...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16228 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #16228: [SPARK-17076] [SQL] Cardinality estimation for join base...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16228 **[Test build #71794 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71794/testReport)** for PR 16228 at commit

[GitHub] spark issue #16671: [SPARK-19327][SparkSQL] a better balance partition metho...

2017-01-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16671 So far, the best workaround is the predicate-based JDBC API; otherwise, as I mentioned above, we need to use sampling to find the boundary of each block. > In one embodiment, a
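The two options mentioned in this comment (sampling to find block boundaries, then reading by predicate) can be sketched roughly as follows. This is a minimal illustration in plain Python, not code from the PR: the helper names `sampled_bounds` and `range_predicates` and the column name `id` are hypothetical.

```python
import random


def sampled_bounds(values, num_partitions, sample_size, seed=0):
    """Estimate split points as quantiles of a random sample of the keys."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(values, min(sample_size, len(values))))
    step = len(sample) / num_partitions
    return [sample[int(step * i)] for i in range(1, num_partitions)]


def range_predicates(column, bounds):
    """Turn split points into non-overlapping WHERE-clause predicates.

    Duplicate split points are dropped so that no empty range is emitted.
    NULL keys are routed to the first range so that no row is lost.
    """
    bounds = sorted(set(bounds))
    if not bounds:
        return ["1 = 1"]
    preds = [f"{column} < {bounds[0]} OR {column} IS NULL"]
    preds += [
        f"{column} >= {lo} AND {column} < {hi}"
        for lo, hi in zip(bounds, bounds[1:])
    ]
    preds.append(f"{column} >= {bounds[-1]}")
    return preds


# Hypothetical key column "id"; a full sample is used here for determinism.
bounds = sampled_bounds(list(range(100)), num_partitions=4, sample_size=100)
predicates = range_predicates("id", bounds)
```

With PySpark, a list like `predicates` could presumably be passed as the `predicates` argument of `spark.read.jdbc(url, table, predicates=..., properties=...)`, so that each predicate becomes one partition. In practice the sample would come from a query against the source table rather than from an in-memory list.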

[GitHub] spark issue #16657: [SPARK-19306][Core] Fix inconsistent state in DiskBlockO...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16657 **[Test build #71800 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71800/testReport)** for PR 16657 at commit

[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double numeric d...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15314 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71798/ Test PASSed. ---

[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double numeric d...

2017-01-21 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/15314 re-ping @jkbradley

[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double numeric d...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15314 **[Test build #71798 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71798/testReport)** for PR 15314 at commit

[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double numeric d...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15314 Merged build finished. Test PASSed.

[GitHub] spark pull request #16670: [SPARK-19324][SPARKR] Spark VJM stdout output is ...

2017-01-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16670#discussion_r97214965 --- Diff: R/pkg/inst/tests/testthat/test_Windows.R --- @@ -20,7 +20,7 @@ test_that("sparkJars tag in SparkContext", { if (.Platform$OS.type !=

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16579 Hi, @gatorsmile. This is the original PR, which now has the two fixes together.

[GitHub] spark issue #16624: [WIP] Fix `SET -v` not to raise exceptions for configs w...

2017-01-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16624 Hi, @gatorsmile. I tested here and applied it to #16579. PR #16579 has two fixes. After merging #16579, I'm going to close this one.

[GitHub] spark pull request #16609: [SPARK-8480] [CORE] [PYSPARK] [SPARKR] Add setNam...

2017-01-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16609#discussion_r97214415 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -172,11 +172,23 @@ class Dataset[T] private[sql](

[GitHub] spark pull request #16609: [SPARK-8480] [CORE] [PYSPARK] [SPARKR] Add setNam...

2017-01-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16609#discussion_r97214402 --- Diff: python/pyspark/sql/dataframe.py --- @@ -85,17 +85,20 @@ def rdd(self): self._lazy_rdd = RDD(jrdd, self.sql_ctx._sc,

[GitHub] spark pull request #16609: [SPARK-8480] [CORE] [PYSPARK] [SPARKR] Add setNam...

2017-01-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16609#discussion_r97214411 --- Diff: python/pyspark/sql/dataframe.py --- @@ -85,17 +85,20 @@ def rdd(self): self._lazy_rdd = RDD(jrdd, self.sql_ctx._sc,

[GitHub] spark pull request #16657: [SPARK-19306][Core] Fix inconsistent state in Dis...

2017-01-21 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/16657#discussion_r97214376 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockObjectWriter.scala --- @@ -206,18 +209,22 @@ private[spark] class DiskBlockObjectWriter(

[GitHub] spark pull request #16670: [SPARK-19324][SPARKR] Spark VJM stdout output is ...

2017-01-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16670#discussion_r97214373 --- Diff: R/pkg/inst/tests/testthat/test_Windows.R --- @@ -20,7 +20,7 @@ test_that("sparkJars tag in SparkContext", { if (.Platform$OS.type !=

[GitHub] spark pull request #15040: [WIP] [SPARK-17487] [SQL] Configurable bucketing ...

2017-01-21 Thread tejasapatil
Github user tejasapatil closed the pull request at: https://github.com/apache/spark/pull/15040

[GitHub] spark issue #16654: [SPARK-19303][ML][WIP] Add evaluate method in clustering...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16654 **[Test build #71799 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71799/testReport)** for PR 16654 at commit

[GitHub] spark issue #16671: [SPARK-19327][SparkSQL] a better balance partition metho...

2017-01-21 Thread djvulee
Github user djvulee commented on the issue: https://github.com/apache/spark/pull/16671 Using the *predicates* parameter to split the table seems reasonable, but in my opinion it pushes work that Spark should do onto users. Users need to know how to split the table

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71792/ Test PASSed. ---

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Merged build finished. Test PASSed.

[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16642 Merged build finished. Test PASSed.

[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16642 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71793/ Test PASSed. ---

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #71792 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71792/testReport)** for PR 15505 at commit

[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16642 **[Test build #71793 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71793/testReport)** for PR 16642 at commit

[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double numeric d...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15314 **[Test build #71798 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71798/testReport)** for PR 15314 at commit

[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double numeric d...

2017-01-21 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/15314 jenkins, retest this please

[GitHub] spark issue #16671: [SPARK-19327][SparkSQL] a better balance partition metho...

2017-01-21 Thread djvulee
Github user djvulee commented on the issue: https://github.com/apache/spark/pull/16671 Yes, this solution is not suitable for large tables, but I cannot find a better one; this is the best optimisation I can find. So just add it as a choice, and let the users know what he is

[GitHub] spark issue #16654: [SPARK-19303][ML][WIP] Add evaluate method in clustering...

2017-01-21 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/16654 I think clustering metrics are currently not that general compared with classification/regression metrics: WSSSE only applies to `KMeans` and `BiKMeans`; log-likelihood only applies to `GMM`

[GitHub] spark pull request #16626: [SPARK-19261][SQL] Alter add columns for Hive tab...

2017-01-21 Thread xwu0226
Github user xwu0226 commented on a diff in the pull request: https://github.com/apache/spark/pull/16626#discussion_r97213829 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -168,6 +168,43 @@ case class AlterTableRenameCommand( }

[GitHub] spark issue #16624: [WIP] Fix `SET -v` not to raise exceptions for configs w...

2017-01-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16624 Please update the PR description.

[GitHub] spark pull request #16626: [SPARK-19261][SQL] Alter add columns for Hive tab...

2017-01-21 Thread xwu0226
Github user xwu0226 commented on a diff in the pull request: https://github.com/apache/spark/pull/16626#discussion_r97213702 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -584,14 +593,18 @@ private[spark] class

[GitHub] spark issue #16636: [SPARK-19279] [SQL] Block Creating a Hive Table With an ...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16636 **[Test build #71796 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71796/testReport)** for PR 16636 at commit

[GitHub] spark issue #16587: [SPARK-19229] [SQL] Disallow Creating Hive Source Tables...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16587 **[Test build #71797 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71797/testReport)** for PR 16587 at commit

[GitHub] spark pull request #16626: [SPARK-19261][SQL] Alter add columns for Hive tab...

2017-01-21 Thread xwu0226
Github user xwu0226 commented on a diff in the pull request: https://github.com/apache/spark/pull/16626#discussion_r97213578 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java --- @@ -107,7 +107,13 @@ public

[GitHub] spark pull request #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture s...

2017-01-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16646

[GitHub] spark issue #16636: [SPARK-19279] [SQL] Block Creating a Hive Table With an ...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16636 **[Test build #71795 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71795/testReport)** for PR 16636 at commit

[GitHub] spark issue #16587: [SPARK-19229] [SQL] Disallow Creating Hive Source Tables...

2017-01-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16587 retest this please

[GitHub] spark issue #16646: [SPARK-19291][SPARKR][ML] spark.gaussianMixture supports...

2017-01-21 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/16646 Merged into master. If there are new comments about the model persistence compatibility issue, we can address them in follow-up work. Thanks for all your reviews.

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16579 For `SET -v` without sorting, please refer to #16624, too.

[GitHub] spark pull request #16516: [SPARK-19155][ML] MLlib GeneralizedLinearRegressi...

2017-01-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16516

[GitHub] spark issue #16516: [SPARK-19155][ML] MLlib GeneralizedLinearRegression fami...

2017-01-21 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/16516 Merged into master, branch-2.1 and branch-2.0. Thanks for all your reviewing.

[GitHub] spark issue #16671: [SparkSQL] a better balance partition method for jdbc AP...

2017-01-21 Thread djvulee
Github user djvulee commented on the issue: https://github.com/apache/spark/pull/16671 @gatorsmile can you take a look?

[GitHub] spark issue #16671: [SparkSQL] a better balance partition method for jdbc AP...

2017-01-21 Thread djvulee
Github user djvulee commented on the issue: https://github.com/apache/spark/pull/16671 Table2 with about 5M rows, 200 partitions by SparkSQL. (The table uses MySQL sharding, and every partition returns at most 10K rows.) Old partition result (elements in

[GitHub] spark issue #16661: [SPARK-19313][ML][MLLIB] GaussianMixture should limit th...

2017-01-21 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/16661 ping @yanboliang

[GitHub] spark issue #16228: [SPARK-17076] [SQL] Cardinality estimation for join base...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16228 **[Test build #71794 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71794/testReport)** for PR 16228 at commit

[GitHub] spark issue #16671: [SparkSQL] a better balance partition method for jdbc AP...

2017-01-21 Thread djvulee
Github user djvulee commented on the issue: https://github.com/apache/spark/pull/16671 Here is the real data test result: Table with 1.2 million rows, 50 partitions by SparkSQL. Old partition result (elements in each partition)

[GitHub] spark issue #16671: [SparkSQL] a better balance partition method for jdbc AP...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16671 Can one of the admins verify this patch?

[GitHub] spark pull request #16671: [SparkSQL] a better balance partition method for ...

2017-01-21 Thread djvulee
GitHub user djvulee opened a pull request: https://github.com/apache/spark/pull/16671 [SparkSQL] a better balance partition method for jdbc API ## What changes were proposed in this pull request? The partition method in `jdbc` uses an equal step, which can lead to skew
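To illustrate the skew described in this PR summary, here is a small sketch in plain Python of what equal-step range partitioning does to unevenly distributed keys. It only mimics the general idea of splitting a numeric range into equal-width strides; it is not Spark's implementation, and the function names and the sample data are made up for the example.

```python
from bisect import bisect_right


def equal_step_bounds(lower, upper, num_partitions):
    """Interior split points for equal-width ranges over [lower, upper)."""
    stride = (upper - lower) / num_partitions
    return [lower + stride * i for i in range(1, num_partitions)]


def partition_sizes(values, bounds):
    """Count how many values fall into each range defined by sorted bounds."""
    sizes = [0] * (len(bounds) + 1)
    for v in values:
        # A value equal to a split point goes to the range above it.
        sizes[bisect_right(bounds, v)] += 1
    return sizes


# Skewed keys: 900 rows packed into ids 0..89, 100 rows spread over 100..991.
keys = [i % 90 for i in range(900)] + [100 + 9 * i for i in range(100)]

bounds = equal_step_bounds(0, 1000, 4)   # [250.0, 500.0, 750.0]
sizes = partition_sizes(keys, bounds)    # [917, 28, 28, 27]
```

The first range ends up with 917 of the 1000 rows while the other three hold fewer than 30 each, which is the kind of imbalance an equal step produces when key values cluster in one part of the range.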

[GitHub] spark pull request #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16594#discussion_r97212822 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -649,6 +649,14 @@ object SQLConf { .doubleConf

[GitHub] spark issue #16611: [SPARK-17967][SPARK-17878][SQL][PYTHON] Support for arra...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16611 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71788/ Test PASSed. ---

[GitHub] spark issue #16611: [SPARK-17967][SPARK-17878][SQL][PYTHON] Support for arra...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16611 Merged build finished. Test PASSed.

[GitHub] spark issue #16611: [SPARK-17967][SPARK-17878][SQL][PYTHON] Support for arra...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16611 **[Test build #71788 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71788/testReport)** for PR 16611 at commit

[GitHub] spark pull request #16670: [SPARK-19324][SPARKR] Spark VJM stdout output is ...

2017-01-21 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/16670#discussion_r97212748 --- Diff: R/pkg/inst/tests/testthat/test_Windows.R --- @@ -20,7 +20,7 @@ test_that("sparkJars tag in SparkContext", { if (.Platform$OS.type !=

[GitHub] spark issue #16669: [SPARK-16101][SQL] Refactoring CSV read path to be consi...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16669 Merged build finished. Test PASSed.

[GitHub] spark issue #16669: [SPARK-16101][SQL] Refactoring CSV read path to be consi...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71789/ Test PASSed. ---

[GitHub] spark issue #16669: [SPARK-16101][SQL] Refactoring CSV read path to be consi...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16669 **[Test build #71789 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71789/testReport)** for PR 16669 at commit

[GitHub] spark issue #16624: [WIP] Fix `SET -v` not to raise exceptions for configs w...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16624 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71791/ Test PASSed. ---

[GitHub] spark issue #16624: [WIP] Fix `SET -v` not to raise exceptions for configs w...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16624 Merged build finished. Test PASSed.

[GitHub] spark issue #16624: [WIP] Fix `SET -v` not to raise exceptions for configs w...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16624 **[Test build #71791 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71791/testReport)** for PR 16624 at commit

[GitHub] spark pull request #16626: [SPARK-19261][SQL] Alter add columns for Hive tab...

2017-01-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16626#discussion_r97212553 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -736,6 +736,22 @@ class SparkSqlAstBuilder(conf: SQLConf)

[GitHub] spark pull request #16659: [SPARK-19309][SQL] disable common subexpression e...

2017-01-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16659#discussion_r97212331 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala --- @@ -67,28 +67,33 @@ class

[GitHub] spark pull request #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable wor...

2017-01-21 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16593

[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16642 **[Test build #71793 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71793/testReport)** for PR 16642 at commit

[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

2017-01-21 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16593 LGTM, merging to master!

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #71792 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71792/testReport)** for PR 15505 at commit

[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16552 Merged build finished. Test PASSed.

[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16552 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71785/ Test PASSed. ---

[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16552 **[Test build #71785 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71785/testReport)** for PR 16552 at commit

[GitHub] spark issue #15505: [SPARK-18890][CORE] Move task serialization from the Tas...

2017-01-21 Thread witgo
Github user witgo commented on the issue: https://github.com/apache/spark/pull/15505 @squito My understanding is that the TaskSchedulerImpl class contains many synchronized statements (synchronized methods). If a synchronized statement's execution time is very long, it will

[GitHub] spark pull request #15505: [SPARK-18890][CORE] Move task serialization from ...

2017-01-21 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/15505#discussion_r97211797 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -602,6 +619,20 @@ class

[GitHub] spark issue #16245: [SPARK-18824][SQL] Add optimizer rule to reorder Filter ...

2017-01-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16245 It is true, of course, that you can construct a combination of complex string operations and compare it with a simple Scala UDF. But as you said, the previous claim is true most of the time. I also think

[GitHub] spark pull request #16245: [SPARK-18824][SQL] Add optimizer rule to reorder ...

2017-01-21 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/16245#discussion_r97211716 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -514,6 +514,34 @@ case class

[GitHub] spark issue #16670: [SPARK-19324][SPARKR] Spark VJM stdout output is getting...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16670 Merged build finished. Test PASSed.

[GitHub] spark issue #16670: [SPARK-19324][SPARKR] Spark VJM stdout output is getting...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16670 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71790/ Test PASSed. ---

[GitHub] spark issue #16670: [SPARK-19324][SPARKR] Spark VJM stdout output is getting...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16670 **[Test build #71790 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71790/testReport)** for PR 16670 at commit

[GitHub] spark issue #16245: [SPARK-18824][SQL] Add optimizer rule to reorder Filter ...

2017-01-21 Thread chenghao-intel
Github user chenghao-intel commented on the issue: https://github.com/apache/spark/pull/16245 I think `Scala UDF needs extra conversion between internal format and external format on input and out` is true most of the time, but not all of the time; for example, some built-in string

[GitHub] spark pull request #16245: [SPARK-18824][SQL] Add optimizer rule to reorder ...

2017-01-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16245#discussion_r97211599 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -514,6 +514,34 @@ case class OptimizeCodegen(conf:

[GitHub] spark issue #15219: [SPARK-14098][SQL] Generate Java code to build CachedCol...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15219 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71787/ Test FAILed. ---

[GitHub] spark issue #15219: [SPARK-14098][SQL] Generate Java code to build CachedCol...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15219 **[Test build #71787 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71787/testReport)** for PR 15219 at commit

[GitHub] spark issue #15219: [SPARK-14098][SQL] Generate Java code to build CachedCol...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15219 Merged build finished. Test FAILed.

[GitHub] spark pull request #16245: [SPARK-18824][SQL] Add optimizer rule to reorder ...

2017-01-21 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/16245#discussion_r97211489 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -514,6 +514,34 @@ case class

[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16593 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71784/ Test PASSed. ---

[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

2017-01-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16593 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #16593: [SPARK-19153][SQL]DataFrameWriter.saveAsTable work with ...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16593 **[Test build #71784 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71784/testReport)** for PR 16593 at commit

[GitHub] spark issue #16245: [SPARK-18824][SQL] Add optimizer rule to reorder Filter ...

2017-01-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16245 I think most of the time it should be, as a Scala UDF needs extra conversion between internal format and external format on input and output. --- If your project is set up for it, you can reply to this email

[GitHub] spark issue #16245: [SPARK-18824][SQL] Add optimizer rule to reorder Filter ...

2017-01-21 Thread chenghao-intel
Github user chenghao-intel commented on the issue: https://github.com/apache/spark/pull/16245 Actually, I doubt this is really an optimization, as the assumption that a Scala UDF is slower than a non-Scala UDF is probably not always true. --- If your project is set up for it, you can

[GitHub] spark pull request #16245: [SPARK-18824][SQL] Add optimizer rule to reorder ...

2017-01-21 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16245#discussion_r97211371 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -514,6 +514,34 @@ case class OptimizeCodegen(conf:

[GitHub] spark issue #16596: [SPARK-19237][SPARKR][WIP] R should check for java when ...

2017-01-21 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16596 I've found the root cause through investigation, but I need to test the fix across platforms. --- If your project is set up for it, you can reply to this email and have your reply appear on

[GitHub] spark pull request #16245: [SPARK-18824][SQL] Add optimizer rule to reorder ...

2017-01-21 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/16245#discussion_r97211330 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -514,6 +514,34 @@ case class

[GitHub] spark issue #16624: [WIP] Fix `SET -v` not to raise exceptions for configs w...

2017-01-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16624 The final failure, `HiveSparkSubmitSuite.dir`, is irrelevant to this PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark issue #16624: [WIP] Fix `SET -v` not to raise exceptions for configs w...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16624 **[Test build #71791 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71791/testReport)** for PR 16624 at commit

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16579 Hi, @srowen and @gatorsmile. This PR has finally resolved all issues. Could you review it again? --- If your project is set up for it, you can reply to this email and have your reply

[GitHub] spark pull request #16670: [SPARK-19324][SPARKR] Spark VJM stdout output is ...

2017-01-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16670#discussion_r97211236 --- Diff: R/pkg/R/utils.R --- @@ -756,12 +756,12 @@ varargsToJProperties <- function(...) { props } -launchScript <-

[GitHub] spark issue #16624: [WIP] Fix `SET -v` not to raise exceptions for configs w...

2017-01-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16624 Retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and

[GitHub] spark issue #16670: [SPARK-19324][SPARKR] Spark VJM stdout output is getting...

2017-01-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16670 **[Test build #71790 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71790/testReport)** for PR 16670 at commit

[GitHub] spark pull request #16670: [SPARK-19324][SPARKR] Spark VJM stdout output is ...

2017-01-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/16670#discussion_r97211194 --- Diff: R/pkg/R/utils.R --- @@ -756,12 +756,12 @@ varargsToJProperties <- function(...) { props } -launchScript <-

[GitHub] spark pull request #16670: [SPARK-19324][SPARKR] Spark VJM stdout output is ...

2017-01-21 Thread felixcheung
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/16670 [SPARK-19324][SPARKR] Spark VJM stdout output is getting dropped in SparkR ## What changes were proposed in this pull request? This mostly affects running a job from the driver in client
