[GitHub] spark pull request #13766: [SPARK-16036][SPARK-16037][SPARK-16034][SQL] Foll...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13766#discussion_r67610305 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -245,29 +245,17 @@ final class DataFrameWriter[T] private[sql](ds

[GitHub] spark pull request #13766: [SPARK-16036][SPARK-16037][SPARK-16034][SQL] Foll...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13766#discussion_r67610302 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -369,6 +369,8 @@ case class

[GitHub] spark pull request #13766: [SPARK-16036][SPARK-16037][SPARK-16034][SQL] Foll...

2016-06-18 Thread yhuai
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/13766 [SPARK-16036][SPARK-16037][SPARK-16034][SQL] Follow up clean up and ## What changes were proposed in this pull request? This PR is the follow-up PR for https://github.com/apache/spark/pull

[GitHub] spark pull request #13754: [SPARK-16036][SPARK-16037][SQL] fix various table...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13754#discussion_r67609956 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -325,27 +325,6 @@ class InsertIntoHiveTableSuite extends

[GitHub] spark pull request #13749: [SPARK-16034][SQL] Checks the partition columns w...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13749#discussion_r67609776 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -435,26 +435,25 @@ case class DataSource

[GitHub] spark pull request #13749: [SPARK-16034][SQL] Checks the partition columns w...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13749#discussion_r67609434 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -435,26 +435,25 @@ case class DataSource

spark git commit: [SPARK-16034][SQL] Checks the partition columns when calling dataFrame.write.mode("append").saveAsTable

2016-06-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 8d2fc010b -> ee6eea644 [SPARK-16034][SQL] Checks the partition columns when calling dataFrame.write.mode("append").saveAsTable ## What changes were proposed in this pull request? `DataFrameWriter` can be used to append data to

spark git commit: [SPARK-16034][SQL] Checks the partition columns when calling dataFrame.write.mode("append").saveAsTable

2016-06-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 3d010c837 -> ce3b98bae [SPARK-16034][SQL] Checks the partition columns when calling dataFrame.write.mode("append").saveAsTable ## What changes were proposed in this pull request? `DataFrameWriter` can be used to append data to existing

[GitHub] spark issue #13749: [SPARK-16034][SQL] Checks the partition columns when cal...

2016-06-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13749 LGTM. Let's address the case-sensitivity issue in a separate PR (together with the issue found in https://github.com/apache/spark/pull/13754). I will take care of the minor comments (i.e. variable naming

[GitHub] spark pull request #13749: [SPARK-16034][SQL] Checks the partition columns w...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13749#discussion_r67604063 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -1317,4 +1317,28 @@ class DDLSuite extends QueryTest

[GitHub] spark pull request #13749: [SPARK-16034][SQL] Checks the partition columns w...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13749#discussion_r67604030 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -435,26 +435,25 @@ case class DataSource

[GitHub] spark pull request #13749: [SPARK-16034][SQL] Checks the partition columns w...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13749#discussion_r67604016 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -435,26 +435,25 @@ case class DataSource

[GitHub] spark pull request #13749: [SPARK-16034][SQL] Checks the partition columns w...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13749#discussion_r67604004 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala --- @@ -242,8 +242,13 @@ case class

spark git commit: [SPARK-16036][SPARK-16037][SQL] fix various table insertion problems

2016-06-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master e574c9973 -> 3d010c837 [SPARK-16036][SPARK-16037][SQL] fix various table insertion problems ## What changes were proposed in this pull request? The current table insertion has some weird behaviours: 1. inserting into a partitioned table

spark git commit: [SPARK-16036][SPARK-16037][SQL] fix various table insertion problems

2016-06-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 f159eb521 -> 8d2fc010b [SPARK-16036][SPARK-16037][SQL] fix various table insertion problems ## What changes were proposed in this pull request? The current table insertion has some weird behaviours: 1. inserting into a partitioned

[GitHub] spark issue #13754: [SPARK-16036][SPARK-16037][SQL] fix various table insert...

2016-06-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13754 Overall LGTM. I am merging this to master and branch 2.0. I will take care of those comments in my PR. --- If your project is set up for it, you can reply to this email and have your reply appear

[GitHub] spark pull request #13754: [SPARK-16036][SPARK-16037][SQL] fix various table...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13754#discussion_r67603928 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -1684,4 +1684,36 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #13754: [SPARK-16036][SPARK-16037][SQL] fix various table...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13754#discussion_r67603919 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -1684,4 +1684,36 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #13754: [SPARK-16036][SPARK-16037][SQL] fix various table...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13754#discussion_r67603894 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveQuerySuite.scala --- @@ -1033,41 +1033,6 @@ class HiveQuerySuite extends

[GitHub] spark pull request #13754: [SPARK-16036][SPARK-16037][SQL] fix various table...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13754#discussion_r67603845 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala --- @@ -325,27 +325,6 @@ class InsertIntoHiveTableSuite extends

[GitHub] spark pull request #13754: [SPARK-16036][SPARK-16037][SQL] fix various table...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13754#discussion_r67603805 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala --- @@ -62,53 +62,79 @@ private[sql] class ResolveDataSource

[GitHub] spark pull request #13754: [SPARK-16036][SPARK-16037][SQL] fix various table...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13754#discussion_r67603755 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala --- @@ -62,53 +62,79 @@ private[sql] class ResolveDataSource

[GitHub] spark pull request #13754: [SPARK-16036][SPARK-16037][SQL] fix various table...

2016-06-18 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13754#discussion_r67603346 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -369,10 +369,8 @@ case class

spark git commit: [SPARK-16033][SQL] insertInto() can't be used together with partitionBy()

2016-06-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ebb9a3b6f -> 10b671447 [SPARK-16033][SQL] insertInto() can't be used together with partitionBy() ## What changes were proposed in this pull request? When inserting into an existing partitioned table, partitioning columns should always be

[GitHub] spark issue #13747: [SPARK-16033][SQL] insertInto() can't be used together w...

2016-06-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13747 LGTM. Merging to master and branch 2.0.

[GitHub] spark pull request #13746: [SPARK-16030] [SQL] Allow specifying static parti...

2016-06-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13746#discussion_r67592834 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala --- @@ -43,11 +44,94 @@ import

[GitHub] spark pull request #13746: [SPARK-16030] [SQL] Allow specifying static parti...

2016-06-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13746#discussion_r67592468 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala --- @@ -320,6 +320,19 @@ trait CheckAnalysis extends

[GitHub] spark pull request #13746: [SPARK-16030] [SQL] Allow specifying static parti...

2016-06-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13746#discussion_r67592296 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala --- @@ -284,4 +285,78 @@ class InsertSuite extends DataSourceTest

[GitHub] spark pull request #13747: [SPARK-16033][SQL] insertInto() can't be used tog...

2016-06-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13747#discussion_r67591585 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -243,7 +241,15 @@ final class DataFrameWriter[T] private[sql](ds: Dataset

[GitHub] spark pull request #13746: [SPARK-16030] [SQL] Allow specifying static parti...

2016-06-17 Thread yhuai
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/13746 [SPARK-16030] [SQL] Allow specifying static partitions when inserting to data source tables ## What changes were proposed in this pull request? This PR adds the static partition support

spark git commit: [SPARK-15706][SQL] Fix Wrong Answer when using IF NOT EXISTS in INSERT OVERWRITE for DYNAMIC PARTITION

2016-06-16 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 5ada60614 -> e5d703bca [SPARK-15706][SQL] Fix Wrong Answer when using IF NOT EXISTS in INSERT OVERWRITE for DYNAMIC PARTITION What changes were proposed in this pull request? `IF NOT EXISTS` in `INSERT OVERWRITE` should not support

spark git commit: [SPARK-15706][SQL] Fix Wrong Answer when using IF NOT EXISTS in INSERT OVERWRITE for DYNAMIC PARTITION

2016-06-16 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 3994372f4 -> b82abde06 [SPARK-15706][SQL] Fix Wrong Answer when using IF NOT EXISTS in INSERT OVERWRITE for DYNAMIC PARTITION What changes were proposed in this pull request? `IF NOT EXISTS` in `INSERT OVERWRITE` should not

[GitHub] spark issue #13447: [SPARK-15706] [SQL] Fix Wrong Answer when using IF NOT E...

2016-06-16 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13447 Thanks. LGTM. Merging to master and branch 2.0.

svn commit: r1748776 [2/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2016-06-16 Thread yhuai
Modified: spark/site/news/spark-tips-from-quantifind.html URL: http://svn.apache.org/viewvc/spark/site/news/spark-tips-from-quantifind.html?rev=1748776&r1=1748775&r2=1748776&view=diff

svn commit: r1748776 [1/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2016-06-16 Thread yhuai
Author: yhuai Date: Thu Jun 16 22:14:05 2016 New Revision: 1748776 URL: http://svn.apache.org/viewvc?rev=1748776&view=rev Log: Add a new news for CFP of Spark Summit 2016 EU Added: spark/news/_posts/2016-06-16-submit-talks-to-spark-summit-eu-2016.md spark/site/news/submit-talks-to-spark

[GitHub] spark pull request #13711: [SPARK-15991] SparkContext.hadoopConfiguration sh...

2016-06-16 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13711#discussion_r67431239 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala --- @@ -49,7 +49,7 @@ private[sql] class SessionState(sparkSession

[GitHub] spark pull request #13711: [SPARK-15991] SparkContext.hadoopConfiguration sh...

2016-06-16 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13711#discussion_r67430972 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -43,23 +43,17 @@ private[sql] class SharedState(val sparkContext

[GitHub] spark issue #13711: [SPARK-15991] SparkContext.hadoopConfiguration should be...

2016-06-16 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13711 We will document the change in the release notes.

[GitHub] spark pull request #13711: [SPARK-15991] SparkContext.hadoopConfiguration sh...

2016-06-16 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13711#discussion_r67393518 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -43,23 +43,17 @@ private[sql] class SharedState(val sparkContext

[GitHub] spark pull request #13711: [SPARK-15991] SparkContext.hadoopConfiguration sh...

2016-06-16 Thread yhuai
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/13711 [SPARK-15991] SparkContext.hadoopConfiguration should be always the base of hadoop conf created by SessionState ## What changes were proposed in this pull request? Before this patch, after

[GitHub] spark issue #13371: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-06-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13371 Yea. Since this one was closed by asfgit, I am not sure you can reopen it.

[GitHub] spark pull request #13542: [SPARK-15730][SQL] Respect the --hiveconf in the ...

2016-06-15 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13542#discussion_r67270459 --- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala --- @@ -91,6 +91,8 @@ class CliSuite extends

[GitHub] spark issue #13679: [SPARK-15959] [SQL] Add the support of hive.metastore.wa...

2016-06-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13679 @gatorsmile What are other confs?

[GitHub] spark pull request #13679: [SPARK-15959] [SQL] Add the support of hive.metas...

2016-06-15 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13679#discussion_r67107457 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -66,6 +67,30 @@ private[sql] class SharedState(val sparkContext

[GitHub] spark issue #13679: [SPARK-15959] [SQL] Add the support of hive.metastore.wa...

2016-06-15 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13679 cc @rxin @gatorsmile @andrewor14 for review

[GitHub] spark pull request #13679: [SPARK-15959] [SQL] Add the support of hive.metas...

2016-06-15 Thread yhuai
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/13679 [SPARK-15959] [SQL] Add the support of hive.metastore.warehouse.dir back ## What changes were proposed in this pull request? This PR adds the support of conf `hive.metastore.warehouse.dir` back

spark git commit: [SPARK-15247][SQL] Set the default number of partitions for reading parquet schemas

2016-06-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 24539223b -> 9adba414c [SPARK-15247][SQL] Set the default number of partitions for reading parquet schemas ## What changes were proposed in this pull request? This PR sets the default number of partitions when reading parquet schemas.

spark git commit: [SPARK-15247][SQL] Set the default number of partitions for reading parquet schemas

2016-06-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/master bd39ffe35 -> dae4d5db2 [SPARK-15247][SQL] Set the default number of partitions for reading parquet schemas ## What changes were proposed in this pull request? This PR sets the default number of partitions when reading parquet schemas.

[GitHub] spark issue #13137: [SPARK-15247][SQL] Set the default number of partitions ...

2016-06-14 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13137 Thanks. Merging to master and branch 2.0.

spark git commit: [SPARK-15895][SQL] Filters out metadata files while doing partition discovery

2016-06-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/master df4ea6614 -> bd39ffe35 [SPARK-15895][SQL] Filters out metadata files while doing partition discovery ## What changes were proposed in this pull request? Take the following directory layout as an example: ``` dir/ +- p0=0/ |-_metadata

spark git commit: [SPARK-15895][SQL] Filters out metadata files while doing partition discovery

2016-06-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 515937046 -> e03c25193 [SPARK-15895][SQL] Filters out metadata files while doing partition discovery ## What changes were proposed in this pull request? Take the following directory layout as an example: ``` dir/ +- p0=0/

[GitHub] spark issue #13623: [SPARK-15895][SQL] Filters out metadata files while doin...

2016-06-14 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13623 LGTM. Merging to master and branch 2.0.

[GitHub] spark pull request #13651: [SPARK-15776][SQL] Divide Expression inside Aggre...

2016-06-14 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13651#discussion_r67011472 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala --- @@ -213,7 +213,7 @@ case class Multiply(left: Expression

[GitHub] spark pull request #13651: [SPARK-15776][SQL] Divide Expression inside Aggre...

2016-06-14 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13651#discussion_r67008784 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2847,4 +2847,15 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #13651: [SPARK-15776][SQL] Divide Expression inside Aggre...

2016-06-14 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13651#discussion_r67008385 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -525,7 +525,7 @@ object TypeCoercion { def

[GitHub] spark issue #13651: [SPARK-15776][SQL] Divide Expression inside Aggregation ...

2016-06-14 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13651 I think we need to explain the inconsistent behavior in the PR description, since it is the main reason for making this change. (Right now, if we run `select 1/2`, the result is 0.5. However, `select sum

[GitHub] spark pull request #13280: [SPARK-9876][SQL]: Update Parquet to 1.8.1.

2016-06-14 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13280#discussion_r67006064 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystSchemaConverter.scala --- @@ -538,6 +538,22 @@ private[parquet

[GitHub] spark issue #13137: [SPARK-15247][SQL] Set the default number of partitions ...

2016-06-14 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13137 LGTM pending jenkins

spark git commit: [SPARK-15914][SQL] Add deprecated method back to SQLContext for backward source code compatibility

2016-06-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 e90ba2287 -> 73beb9fb3 [SPARK-15914][SQL] Add deprecated method back to SQLContext for backward source code compatibility ## What changes were proposed in this pull request? Revert partial changes in SPARK-12600, and add some

[GitHub] spark issue #13637: [SPARK-15914][SQL] Add deprecated method back to SQLCont...

2016-06-14 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13637 Thanks. Merging to master and branch 2.0.

[GitHub] spark issue #13371: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-06-14 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13371 Can you add results showing that there are skipped row groups with this change (and before this patch all row groups are loaded)? For those results, let's also put them in the description

[GitHub] spark pull request #13137: [SPARK-15247][SQL] Set the default number of part...

2016-06-14 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13137#discussion_r66999843 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala --- @@ -795,11 +795,16 @@ private[sql] object

spark git commit: [SPARK-15663][SQL] SparkSession.catalog.listFunctions shouldn't include the list of built-in functions

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 1a57bf0f4 -> 2841bbac4 [SPARK-15663][SQL] SparkSession.catalog.listFunctions shouldn't include the list of built-in functions ## What changes were proposed in this pull request? SparkSession.catalog.listFunctions currently returns all

[GitHub] spark issue #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunctions sh...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13413 Thanks. Merging to master and branch 2.0.

[GitHub] spark issue #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunctions sh...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13413 lgtm pending jenkins

spark git commit: [SPARK-15808][SQL] File Format Checking When Appending Data

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 774014250 -> 55c1fac21 [SPARK-15808][SQL] File Format Checking When Appending Data What changes were proposed in this pull request? **Issue:** Got wrong results or strange errors when appending data to a table with mismatched file

spark git commit: [SPARK-15808][SQL] File Format Checking When Appending Data

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 7b9071eea -> 5827b65e2 [SPARK-15808][SQL] File Format Checking When Appending Data What changes were proposed in this pull request? **Issue:** Got wrong results or strange errors when appending data to a table with mismatched file

[GitHub] spark issue #13546: [SPARK-15808] [SQL] File Format Checking When Appending ...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13546 Thanks. Merging to master and branch 2.0.

[GitHub] spark issue #13546: [SPARK-15808] [SQL] File Format Checking When Appending ...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13546 test this please

[GitHub] spark issue #13546: [SPARK-15808] [SQL] File Format Checking When Appending ...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13546 LGTM pending jenkins

[GitHub] spark pull request #13623: [SPARK-15895][SQL] Filters out metadata files whi...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13623#discussion_r66884268 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala --- @@ -890,4 +892,26 @@ class

[GitHub] spark pull request #13623: [SPARK-15895][SQL] Filters out metadata files whi...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13623#discussion_r66884199 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala --- @@ -890,4 +892,26 @@ class

[GitHub] spark pull request #13623: [SPARK-15895][SQL] Filters out metadata files whi...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13623#discussion_r66884135 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileCatalog.scala --- @@ -197,4 +201,9 @@ abstract class

[GitHub] spark pull request #13623: [SPARK-15895][SQL] Filters out metadata files whi...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13623#discussion_r66884097 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileCatalog.scala --- @@ -96,7 +96,11 @@ abstract class

[GitHub] spark pull request #13623: [SPARK-15895][SQL] Filters out metadata files whi...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13623#discussion_r66883639 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala --- @@ -83,8 +83,9 @@ class ListingFileCatalog

[GitHub] spark issue #13546: [SPARK-15808] [SQL] File Format Checking When Appending ...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13546 I think we cannot ban the format when the append is used in `DataFrameWriter`. For example, when I use `createDF(10, 19).write.mode(SaveMode.Append).format("text").saveAsTable("a

spark git commit: [SPARK-15887][SQL] Bring back the hive-site.xml support for Spark 2.0

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 97fe1d8ee -> b148b0364 [SPARK-15887][SQL] Bring back the hive-site.xml support for Spark 2.0 ## What changes were proposed in this pull request? Right now, Spark 2.0 does not load hive-site.xml. Based on users' feedback, it seems

spark git commit: [SPARK-15887][SQL] Bring back the hive-site.xml support for Spark 2.0

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master c654ae214 -> c4b1ad020 [SPARK-15887][SQL] Bring back the hive-site.xml support for Spark 2.0 ## What changes were proposed in this pull request? Right now, Spark 2.0 does not load hive-site.xml. Based on users' feedback, it seems make

[GitHub] spark issue #13611: [SPARK-15887][SQL] Bring back the hive-site.xml support ...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13611 Thanks. Merging to master and branch 2.0.

spark git commit: [SPARK-15530][SQL] Set #parallelism for file listing in listLeafFilesInParallel

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 8c4050a5a -> d9db8a9c8 [SPARK-15530][SQL] Set #parallelism for file listing in listLeafFilesInParallel ## What changes were proposed in this pull request? This PR sets the parallelism to prevent file listing in

spark git commit: [SPARK-15530][SQL] Set #parallelism for file listing in listLeafFilesInParallel

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 3b7fb84cf -> 5ad4e32d4 [SPARK-15530][SQL] Set #parallelism for file listing in listLeafFilesInParallel ## What changes were proposed in this pull request? This PR sets the parallelism to prevent file listing in

[GitHub] spark issue #13444: [SPARK-15530][SQL] Set #parallelism for file listing in ...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13444 Thanks! Merging to master and branch 2.0.

[GitHub] spark pull request #13137: [SPARK-15247][SQL] Set the default number of part...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13137#discussion_r66862256 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala --- @@ -795,11 +795,15 @@ private[sql] object

spark git commit: [SPARK-15676][SQL] Disallow Column Names as Partition Columns For Hive Tables

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 2a0da84dc -> 8c4050a5a [SPARK-15676][SQL] Disallow Column Names as Partition Columns For Hive Tables What changes were proposed in this pull request? When creating a Hive Table (not data source tables), a common error users might

spark git commit: [SPARK-15676][SQL] Disallow Column Names as Partition Columns For Hive Tables

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master a6a18a457 -> 3b7fb84cf [SPARK-15676][SQL] Disallow Column Names as Partition Columns For Hive Tables What changes were proposed in this pull request? When creating a Hive Table (not data source tables), a common error users might

[GitHub] spark issue #13415: [SPARK-15676] [SQL] Disallow Column Names as Partition C...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13415 Thanks. Merging to master and branch 2.0.

[GitHub] spark issue #13611: [SPARK-15887][SQL] Bring back the hive-site.xml support ...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13611 LGTM pending jenkins.

[GitHub] spark issue #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunctions sh...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13413 https://github.com/apache/spark/pull/13413/files#r66858447 is my last comment. Other parts look good.

[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13413#discussion_r66858447 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -187,28 +187,35 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #13611: [SPARK-15887][SQL] Bring back the hive-site.xml s...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13611#discussion_r66856578 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -41,9 +43,22 @@ private[sql] class SharedState(val sparkContext

[GitHub] spark issue #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunctions sh...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13413 test this please

[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13413#discussion_r66846408 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala --- @@ -396,6 +396,11 @@ object FunctionRegistry

[GitHub] spark issue #13611: [SPARK-15887][SQL] Bring back the hive-site.xml support ...

2016-06-13 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13611 Can we also add a test in sql/hive to make sure that confs in hive-site.xml file get propagated into metadataHive's hiveConf (HiveClientImpl has a `getConf` method, which can be used to check

[GitHub] spark pull request #13611: [SPARK-15887][SQL] Bring back the hive-site.xml s...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13611#discussion_r66820717 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -40,10 +42,16 @@ private[sql] class SharedState(val sparkContext

[GitHub] spark pull request #13611: [SPARK-15887][SQL] Bring back the hive-site.xml s...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13611#discussion_r66820184 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -40,10 +42,16 @@ private[sql] class SharedState(val sparkContext

[GitHub] spark pull request #13611: [SPARK-15887][SQL] Bring back the hive-site.xml s...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13611#discussion_r66819063 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala --- @@ -40,10 +42,16 @@ private[sql] class SharedState(val sparkContext

[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-12 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13413#discussion_r66729858 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala --- @@ -89,6 +89,10 @@ class SimpleFunctionRegistry

[GitHub] spark pull request #13611: [SPARK-15887][SQL] Bring back the hive-site.xml s...

2016-06-10 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13611#discussion_r66690402 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSharedState.scala --- @@ -40,17 +42,22 @@ private[hive] class HiveSharedState(override val

[GitHub] spark issue #13447: [SPARK-15706] [SQL] Fix Wrong Answer when using IF NOT E...

2016-06-10 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/13447 Seems the behavior of the following query is weird (if it works). (the case without IF NOT EXISTS) ``` INSERT OVERWRITE TABLE table_with_partition partition (p1='a',p2) SELECT 'blarr3

[GitHub] spark pull request #13444: [SPARK-15530][SQL] Set #parallelism for file list...

2016-06-10 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/13444#discussion_r66683717 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala --- @@ -419,56 +419,79 @@ private[sql] object
