[GitHub] spark pull request #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedIm...

2016-11-15 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15703#discussion_r88141512 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDAFSuite.scala --- @@ -0,0 +1,152 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedIm...

2016-11-15 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15703#discussion_r88140933 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala --- @@ -365,4 +380,66 @@ private[hive] case class HiveUDAFFunction( val

[GitHub] spark pull request #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedIm...

2016-11-15 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15703#discussion_r88141492 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDAFSuite.scala --- @@ -0,0 +1,152 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedIm...

2016-11-15 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15703#discussion_r88140713 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala --- @@ -289,73 +302,75 @@ private[hive] case class HiveUDAFFunction

[GitHub] spark pull request #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedIm...

2016-11-15 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15703#discussion_r88141072 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala --- @@ -365,4 +380,66 @@ private[hive] case class HiveUDAFFunction( val

[GitHub] spark pull request #15703: [SPARK-18186] Migrate HiveUDAFFunction to TypedIm...

2016-11-15 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15703#discussion_r88140381 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala --- @@ -289,73 +302,75 @@ private[hive] case class HiveUDAFFunction

[GitHub] spark issue #15857: [SPARK-18300][SQL] Do not apply foldable propagation wit...

2016-11-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15857 Seems this breaks the scala 2.10 build? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #15857: [SPARK-18300][SQL] Do not apply foldable propagation wit...

2016-11-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15857 ``` [error] [warn] /home/jenkins/workspace/spark-master-compile-sbt-scala-2.10/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala:439: Cannot check match

spark git commit: [SPARK-18379][SQL] Make the parallelism of parallelPartitionDiscovery configurable.

2016-11-15 Thread yhuai
Repository: spark Updated Branches: refs/heads/master f14ae4900 -> 745ab8bc5 [SPARK-18379][SQL] Make the parallelism of parallelPartitionDiscovery configurable. ## What changes were proposed in this pull request? The largest parallelism in PartitioningAwareFileIndex

[GitHub] spark issue #15829: [SPARK-18379][SQL] Make the parallelism of parallelParti...

2016-11-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15829 lgtm. Merging to master.

spark git commit: [SPARK-18368][SQL] Fix regexp replace when serialized

2016-11-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 c8628e877 -> 6e7310590 [SPARK-18368][SQL] Fix regexp replace when serialized ## What changes were proposed in this pull request? This makes the result value both transient and lazy, so that if the RegExpReplace object is initialized

spark git commit: [SPARK-18368][SQL] Fix regexp replace when serialized

2016-11-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.1 626f6d6d4 -> 80f58510a [SPARK-18368][SQL] Fix regexp replace when serialized ## What changes were proposed in this pull request? This makes the result value both transient and lazy, so that if the RegExpReplace object is initialized

spark git commit: [SPARK-18368][SQL] Fix regexp replace when serialized

2016-11-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 47636618a -> d4028de97 [SPARK-18368][SQL] Fix regexp replace when serialized ## What changes were proposed in this pull request? This makes the result value both transient and lazy, so that if the RegExpReplace object is initialized then

[GitHub] spark issue #15834: [SPARK-18368] [SQL] Fix regexp replace when serialized

2016-11-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15834 Great. Thanks!

[GitHub] spark issue #15834: [SPARK-18368] [SQL] Fix regexp replace when serialized

2016-11-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15834 ![image](https://cloud.githubusercontent.com/assets/2072857/20150618/9d871bf2-a66b-11e6-8d21-1a9bb6eb27d7.png) Since tests have already passed, I am merging this PR to master/branch-2.1

[GitHub] spark issue #15834: [SPARK-18368] [SQL] Fix regexp replace when serialized

2016-11-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15834 Awesome! btw looks like your original changes in `ExpressionEvalHelper.scala` (https://github.com/apache/spark/pull/15816/files#diff-41747ec3f56901eb7bfb95d2a217e94d) uncovered issues with other

[GitHub] spark pull request #15829: [SPARK-18379][SQL] Make the parallelism of parall...

2016-11-09 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15829#discussion_r87253993 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -396,6 +396,13 @@ object SQLConf { .intConf

[GitHub] spark issue #15816: SPARK-18368: Fix regexp_replace with task serialization.

2016-11-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15816 @rdblue Can you send a new pr with the fix? Thanks!

[GitHub] spark issue #15816: SPARK-18368: Fix regexp_replace with task serialization.

2016-11-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15816 Reverted from master/branch-2.1/branch-2.0.

spark git commit: Revert "[SPARK-18368] Fix regexp_replace with task serialization."

2016-11-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 06a13ecca -> 47636618a Revert "[SPARK-18368] Fix regexp_replace with task serialization." This reverts commit b9192bb3ffc319ebee7dbd15c24656795e454749. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

[GitHub] spark issue #15816: SPARK-18368: Fix regexp_replace with task serialization.

2016-11-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15816 oh, seems the last commit did not pass build. Sorry. I am going to revert this patch.

[GitHub] spark issue #15816: SPARK-18368: Fix regexp_replace with task serialization.

2016-11-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15816 I am wondering if it breaks some tests? ``` org.apache.spark.sql.catalyst.expressions.MathExpressionsSuite.e org.apache.spark.sql.catalyst.expressions.MathExpressionsSuite.pi

spark git commit: [SPARK-18338][SQL][TEST-MAVEN] Fix test case initialization order under Maven builds

2016-11-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 02c5325b8 -> 205e6d586 [SPARK-18338][SQL][TEST-MAVEN] Fix test case initialization order under Maven builds ## What changes were proposed in this pull request? Test case initialization order under Maven and SBT are different. Maven

[GitHub] spark issue #15802: [SPARK-18338][SQL][test-maven] Fix test case initializat...

2016-11-09 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15802 I am going to merge this to fix maven build.

[GitHub] spark issue #15812: [SPARK-18360][SQL] warehouse path config should work for...

2016-11-08 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15812 Yea. Warehouse location should not be session specific. Since we will propagate it to hive, it is shared by all sessions.

[GitHub] spark issue #15802: [SPARK-18338][SQL] Fix test case initialization order un...

2016-11-07 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15802 Seems it does not work with sbt?

spark git commit: [SPARK-18256] Improve the performance of event log replay in HistoryServer

2016-11-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 4cee2ce25 -> 0e3312ee7 [SPARK-18256] Improve the performance of event log replay in HistoryServer ## What changes were proposed in this pull request? This patch significantly improves the performance of event log replay in the

[GitHub] spark issue #15756: [SPARK-18256] Improve the performance of event log repla...

2016-11-04 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15756 Cool. Merging to master!

spark git commit: [SPARK-18167] Re-enable the non-flaky parts of SQLQuerySuite

2016-11-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.1 e51978c3d -> 0a303a694 [SPARK-18167] Re-enable the non-flaky parts of SQLQuerySuite ## What changes were proposed in this pull request? It seems the proximate cause of the test failures is that `cast(str as decimal)` in derby will

spark git commit: [SPARK-18167] Re-enable the non-flaky parts of SQLQuerySuite

2016-11-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 550cd56e8 -> 4cee2ce25 [SPARK-18167] Re-enable the non-flaky parts of SQLQuerySuite ## What changes were proposed in this pull request? It seems the proximate cause of the test failures is that `cast(str as decimal)` in derby will raise

[GitHub] spark issue #15725: [SPARK-18167] Re-enable the non-flaky parts of SQLQueryS...

2016-11-04 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15725 lgtm. merging to master and branch 2.1.

[GitHub] spark issue #15756: [SPARK-18256] Improve the performance of event log repla...

2016-11-04 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15756 LGTM

[GitHub] spark pull request #14750: [SPARK-17183][SQL] put hive serde table schema to...

2016-11-03 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14750#discussion_r86470657 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -417,11 +429,12 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #14750: [SPARK-17183][SQL] put hive serde table schema to...

2016-11-03 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14750#discussion_r86469470 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -95,8 +95,11 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #14750: [SPARK-17183][SQL] put hive serde table schema to...

2016-11-03 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14750#discussion_r86471858 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -537,22 +559,11 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #14750: [SPARK-17183][SQL] put hive serde table schema to...

2016-11-03 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14750#discussion_r86469682 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -255,6 +267,12 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #14750: [SPARK-17183][SQL] put hive serde table schema to...

2016-11-03 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14750#discussion_r86471149 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -475,18 +490,27 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #14750: [SPARK-17183][SQL] put hive serde table schema to...

2016-11-03 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/14750#discussion_r86472353 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -620,7 +667,9 @@ private[spark] class HiveExternalCatalog(conf

spark git commit: [SPARK-17949][SQL] A JVM object based aggregate operator

2016-11-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 66a99f4a4 -> 27daf6bcd [SPARK-17949][SQL] A JVM object based aggregate operator ## What changes were proposed in this pull request? This PR adds a new hash-based aggregate operator named `ObjectHashAggregateExec` that supports

[GitHub] spark issue #15590: [SPARK-17949][SQL] A JVM object based aggregate operator

2016-11-03 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15590 LGTM. Merging to master.

[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-11-02 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15024 OK. Let's get https://github.com/apache/spark/pull/14750 updated to fix SPARK-17183.

spark git commit: [SPARK-17470][SQL] unify path for data source table and locationUri for hive serde table

2016-11-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.1 2aff2ea81 -> 5ea2f9e5e [SPARK-17470][SQL] unify path for data source table and locationUri for hive serde table ## What changes were proposed in this pull request? Due to a limitation of hive metastore(table location must be

spark git commit: [SPARK-17470][SQL] unify path for data source table and locationUri for hive serde table

2016-11-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/master fd90541c3 -> 3a1bc6f47 [SPARK-17470][SQL] unify path for data source table and locationUri for hive serde table ## What changes were proposed in this pull request? Due to a limitation of hive metastore(table location must be directory

[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-11-02 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15024 LGTM. Merging to master and branch 2.1.

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-02 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86273774 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -517,15 +517,15 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-02 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86272489 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -189,66 +188,39 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86070024 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -91,7 +73,8 @@ case class CreateTableLikeCommand

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86070455 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -207,6 +207,9 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86069832 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -85,14 +86,7 @@ case class

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86069964 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -665,15 +665,7 @@ case class AlterTableSetLocationCommand

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86070624 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -383,8 +389,22 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86070959 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -513,6 +555,16 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86070329 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/PathOptionSuite.scala --- @@ -0,0 +1,97 @@ +/* +* Licensed to the Apache Software

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86070568 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -259,10 +266,9 @@ private[spark] class HiveExternalCatalog(conf

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86070169 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -541,3 +434,123 @@ case class DataSource

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-11-01 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r86066888 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/InMemoryCatalog.scala --- @@ -196,18 +196,32 @@ class InMemoryCatalog

[GitHub] spark issue #15725: [SPARK-18167] Print out spark confs, and hive confs when...

2016-11-01 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15725 lgtm

[GitHub] spark issue #15701: [SPARK-18167] [SQL] Also log all partitions when the SQL...

2016-10-31 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15701 merging to master. Thanks

spark git commit: [SPARK-18167][SQL] Also log all partitions when the SQLQuerySuite test flakes

2016-10-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master de3f87fa7 -> 6633b97b5 [SPARK-18167][SQL] Also log all partitions when the SQLQuerySuite test flakes ## What changes were proposed in this pull request? One possibility for this test flaking is that we have corrupted the partition schema

[GitHub] spark issue #15701: [SPARK-18167] [SQL] Also log all partitions when the SQL...

2016-10-31 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15701 lgtm

spark git commit: [SPARK-17972][SQL] Add Dataset.checkpoint() to truncate large query plans

2016-10-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 26b07f190 -> 8bfc3b7aa [SPARK-17972][SQL] Add Dataset.checkpoint() to truncate large query plans ## What changes were proposed in this pull request? ### Problem Iterative ML code may easily create query plans that grow exponentially. We

[GitHub] spark issue #15651: [SPARK-17972][SQL] Add Dataset.checkpoint() to truncate ...

2016-10-31 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15651 lgtm pending jenkins

[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-10-31 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15024 > 3. before passing storage properties to DataSource as data source options, add locationUri as path option, to keep the previous behaviour, i.e. the path option always exists(even users did

[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-10-31 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15024 What do we do for data source tables if the path is a single file?

[GitHub] spark pull request #15651: [SPARK-17972][SQL] Add Dataset.checkpoint() to tr...

2016-10-29 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15651#discussion_r85645164 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -130,17 +130,40 @@ case class ExternalRDDScanExec[T

[GitHub] spark pull request #15651: [SPARK-17972][SQL] Add Dataset.checkpoint() to tr...

2016-10-29 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15651#discussion_r85645138 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala --- @@ -130,17 +130,40 @@ case class ExternalRDDScanExec[T

[GitHub] spark pull request #15676: [SPARK-18167] [SQL] Add debug code for SQLQuerySu...

2016-10-28 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15676#discussion_r85627218 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala --- @@ -585,7 +586,19 @@ private[client] class Shim_v0_13 extends Shim_v0_12

[GitHub] spark issue #15657: [DO NOT MERGE] Test partition

2016-10-27 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15657 test this please

[1/2] spark git commit: [SPARK-17970][SQL] store partition spec in metastore for data source table

2016-10-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 79fd0cc05 -> ccb115430 http://git-wip-us.apache.org/repos/asf/spark/blob/ccb11543/sql/hive/src/test/scala/org/apache/spark/sql/hive/PartitionProviderCompatibilitySuite.scala

[2/2] spark git commit: [SPARK-17970][SQL] store partition spec in metastore for data source table

2016-10-27 Thread yhuai
[SPARK-17970][SQL] store partition spec in metastore for data source table ## What changes were proposed in this pull request? We should follow hive table and also store partition spec in metastore for data source table. This brings 2 benefits: 1. It's more flexible to manage the table data

[GitHub] spark issue #15515: [SPARK-17970][SQL] store partition spec in metastore for...

2016-10-27 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15515 Cool. I am merging this pr to unblock other tasks.

[GitHub] spark pull request #15515: [SPARK-17970][SQL] store partition spec in metast...

2016-10-27 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15515#discussion_r85420521 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -50,7 +50,8 @@ case class AnalyzeColumnCommand

[GitHub] spark pull request #15515: [SPARK-17970][SQL] store partition spec in metast...

2016-10-27 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15515#discussion_r85415502 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -387,7 +388,15 @@ final class DataFrameWriter[T] private[sql](ds: Dataset

[GitHub] spark issue #15515: [SPARK-17970][SQL] store partition spec in metastore for...

2016-10-27 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15515 Looks good. I left a few questions. Let me know if you want to address them in follow-up prs.

[GitHub] spark pull request #15515: [SPARK-17970][SQL] store partition spec in metast...

2016-10-27 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15515#discussion_r85421683 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -531,6 +529,11 @@ case class AlterTableRecoverPartitionsCommand

[GitHub] spark pull request #15515: [SPARK-17970][SQL] store partition spec in metast...

2016-10-27 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15515#discussion_r85421410 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -232,6 +238,15 @@ case class

[GitHub] spark issue #15661: [SPARK-16963][SQL]Fix test "metadata log should contain ...

2016-10-27 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15661 LGTM

[GitHub] spark issue #15657: [DO NOT MERGE] Test partition

2016-10-27 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15657 test this please

[GitHub] spark issue #15657: [DO NOT MERGE] Test partition

2016-10-27 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15657 test this please

spark git commit: [SPARK-18132] Fix checkstyle

2016-10-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 dcf2f090c -> 1a4be51d6 [SPARK-18132] Fix checkstyle This PR fixes checkstyle. Author: Yin Huai <yh...@databricks.com> Closes #15656 from yhuai/fix-format. (cherry picked from commit d3b4831d009905185ad74096ce3ecfa934bc191

spark git commit: [SPARK-18132] Fix checkstyle

2016-10-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/master dd4f088c1 -> d3b4831d0 [SPARK-18132] Fix checkstyle This PR fixes checkstyle. Author: Yin Huai <yh...@databricks.com> Closes #15656 from yhuai/fix-format. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: h

[GitHub] spark issue #15656: [SPARK-18132] Fix checkstyle

2016-10-26 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15656 merging to master and branch 2.0.

[GitHub] spark pull request #15656: [SPARK-18132] Fix checkstyle

2016-10-26 Thread yhuai
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/15656 [SPARK-18132] Fix checkstyle This PR fixes checkstyle. You can merge this pull request into a Git repository by running: $ git pull https://github.com/yhuai/spark fix-format

[GitHub] spark pull request #15651: [SPARK-17972][SQL] Add Dataset.checkpoint() to tr...

2016-10-26 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15651#discussion_r85259326 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala --- @@ -919,6 +922,44 @@ class DatasetSuite extends QueryTest with SharedSQLContext

[GitHub] spark issue #15520: [SPARK-13747][SQL]Fix concurrent executions in ForkJoinP...

2016-10-26 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15520 lgtm

[GitHub] spark issue #15590: [SPARK-17949][SQL] A JVM object based aggregate operator

2016-10-25 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15590 lgtm1

[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-10-25 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15024 For a CatalogTable, will its option still have `path` set?

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-10-25 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r84984048 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/InMemoryCatalog.scala --- @@ -196,18 +196,30 @@ class InMemoryCatalog

spark git commit: [SPARK-18070][SQL] binary operator should not consider nullability when comparing input types

2016-10-25 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 1c1e847bc -> 7c8d9a557 [SPARK-18070][SQL] binary operator should not consider nullability when comparing input types ## What changes were proposed in this pull request? Binary operator requires its inputs to be of same type, but it

spark git commit: [SPARK-18070][SQL] binary operator should not consider nullability when comparing input types

2016-10-25 Thread yhuai
Repository: spark Updated Branches: refs/heads/master c5fe3dd4f -> a21791e31 [SPARK-18070][SQL] binary operator should not consider nullability when comparing input types ## What changes were proposed in this pull request? Binary operator requires its inputs to be of same type, but it
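
The commit title above says a binary operator should not consider nullability when comparing input types. A minimal sketch of that idea follows, as a toy model in Python rather than Spark's actual code: the classes `IntType`, `StringType`, `ArrayType` and the helper `same_type` are hypothetical names invented for illustration, and only array nesting is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntType:
    """Toy leaf type (hypothetical, not Spark's class)."""


@dataclass(frozen=True)
class StringType:
    """Toy leaf type (hypothetical, not Spark's class)."""


@dataclass(frozen=True)
class ArrayType:
    """Toy array type carrying a nullability flag for its elements."""
    element: object
    contains_null: bool


def same_type(a, b):
    """Compare two toy types structurally, ignoring nullability flags."""
    if isinstance(a, ArrayType) and isinstance(b, ArrayType):
        # Recurse into the element type; contains_null is deliberately skipped.
        return same_type(a.element, b.element)
    return a == b


left = ArrayType(IntType(), contains_null=True)
right = ArrayType(IntType(), contains_null=False)

# Strict equality sees two different types because the flags differ ...
strict_match = left == right
# ... while the nullability-insensitive check treats them as the same type.
relaxed_match = same_type(left, right)
```

Under this sketch a strict equality check would reject operands that differ only in a nullability flag, while the relaxed check accepts them and still rejects genuinely different element types.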

[GitHub] spark issue #15606: [SPARK-18070][SQL] binary operator should not consider n...

2016-10-25 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15606 LGTM. Merging to master and branch 2.0.

[GitHub] spark issue #15394: [SPARK-17748][ML] One pass solver for Weighted Least Squ...

2016-10-25 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15394 This breaks the scala 2.10 build. Can you fix the problem? ``` [error] /home/jenkins/workspace/spark-master-compile-sbt-scala-2.10/mllib/src/test/scala/org/apache/spark/ml/optim

[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-10-24 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15024 Can you update the description to explain how we handle sources that are not file-based (e.g. jdbc)?

[GitHub] spark pull request #15520: [SPARK-13747][SQL]Fix concurrent executions in Fo...

2016-10-21 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/15520#discussion_r84563401 --- Diff: scalastyle-config.xml --- @@ -200,6 +200,7 @@ This file is divided into 3 sections: // scalastyle:off awaitresult Await.result

spark git commit: [SPARK-17926][SQL][STREAMING] Added json for statuses

2016-10-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 78458a7eb -> af2e6e0c9 [SPARK-17926][SQL][STREAMING] Added json for statuses ## What changes were proposed in this pull request? StreamingQueryStatus exposed through StreamingQueryListener often needs to be recorded (similar to

spark git commit: [SPARK-17926][SQL][STREAMING] Added json for statuses

2016-10-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master e371040a0 -> 7a531e305 [SPARK-17926][SQL][STREAMING] Added json for statuses ## What changes were proposed in this pull request? StreamingQueryStatus exposed through StreamingQueryListener often needs to be recorded (similar to

[GitHub] spark issue #15476: [SPARK-17926][SQL][STREAMING] Added json for statuses

2016-10-21 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15476 lgtm

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-20 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15575 > I felt that there are numerous places where child's output ordering could be used but the operators don't set it Can you list them here?

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-20 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15575 Our planner decides whether to add a `ShuffleExchange` by considering `outputPartitioning` and `requiredDistribution` together. If the `outputPartitioning` of the child does not satisfy

[GitHub] spark issue #15575: [SPARK-18038] [SQL] Move output partitioning definition ...

2016-10-20 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15575 `outputPartitioning` of a node will only be changed if this node shuffles data. Right now, only `ShuffleExchange` shuffles data.
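
The comments on #15575 describe the planner comparing a child's `outputPartitioning` with the required distribution and adding a `ShuffleExchange` only when the two do not match. A minimal sketch of that decision follows, as a toy model in Python rather than Spark's planner: `PlanNode`, `HashPartitioning`, `satisfies`, and `ensure_distribution` are hypothetical names, and the "partitioning keys are a subset of the required keys" rule is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class HashPartitioning:
    """Toy partitioning: rows are hashed on these column names."""
    keys: Tuple[str, ...]


@dataclass(frozen=True)
class PlanNode:
    """Toy physical plan node (hypothetical, not Spark's SparkPlan)."""
    name: str
    output_partitioning: Optional[HashPartitioning]
    child: Optional["PlanNode"] = None


def satisfies(partitioning, required_keys):
    """Assumed rule: hashing on a non-empty subset of the required keys is enough."""
    if required_keys is None:
        return True  # the parent has no distribution requirement
    if partitioning is None or not partitioning.keys:
        return False  # unknown partitioning cannot satisfy a requirement
    return set(partitioning.keys) <= set(required_keys)


def ensure_distribution(child, required_keys):
    """Return the child unchanged, or wrap it in a shuffle when it does not satisfy."""
    if satisfies(child.output_partitioning, required_keys):
        return child
    # Only the shuffle node introduces a new output partitioning.
    return PlanNode("ShuffleExchange", HashPartitioning(tuple(required_keys)), child)


scan = PlanNode("Scan", None)
shuffled = ensure_distribution(scan, ("a",))
# A non-shuffling operator simply reports its child's partitioning.
filtered = PlanNode("Filter", shuffled.output_partitioning, shuffled)
reused = ensure_distribution(filtered, ("a", "b"))
reshuffled = ensure_distribution(filtered, ("c",))
```

In this sketch the `Filter` node reuses the partitioning produced by the shuffle beneath it, so a later requirement on `("a", "b")` adds no second shuffle, while a requirement on `("c",)` does.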
