spark git commit: [SPARK-13922][SQL] Filter rows with null attributes in vectorized parquet reader

2016-03-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 4ce2d24e2 -> b90c0206f [SPARK-13922][SQL] Filter rows with null attributes in vectorized parquet reader # What changes were proposed in this pull request? It's common for many SQL operators to not care about reading `null` values for

spark git commit: [SPARK-13719][SQL] Parse JSON rows having an array type and a struct type in the same field

2016-03-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ca9ef86c8 -> 917f4000b [SPARK-13719][SQL] Parse JSON rows having an array type and a struct type in the same field ## What changes were proposed in this pull request? This https://github.com/apache/spark/pull/2400 added the support to

spark git commit: [SPARK-12721][SQL] SQL Generation for Script Transformation

2016-03-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 1d1de28a3 -> c4bd57602 [SPARK-12721][SQL] SQL Generation for Script Transformation What changes were proposed in this pull request? This PR is to convert to SQL from analyzed logical plans containing operator `ScriptTransformation`.

[2/2] spark git commit: [SPARK-13923][SQL] Implement SessionCatalog

2016-03-20 Thread yhuai
[SPARK-13923][SQL] Implement SessionCatalog ## What changes were proposed in this pull request? As part of the effort to merge `SQLContext` and `HiveContext`, this patch implements an internal catalog called `SessionCatalog` that handles temporary functions and tables and delegates metastore

spark git commit: [SPARK-14016][SQL] Support high-precision decimals in vectorized parquet reader

2016-03-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 43ef1e52b -> 729996165 [SPARK-14016][SQL] Support high-precision decimals in vectorized parquet reader ## What changes were proposed in this pull request? This patch adds support for reading `DecimalTypes` with high (> 18) precision in

spark git commit: [SPARK-14015][SQL] Support TimestampType in vectorized parquet reader

2016-03-23 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 02d9c352c -> 0a64294fc [SPARK-14015][SQL] Support TimestampType in vectorized parquet reader ## What changes were proposed in this pull request? This PR adds support for TimestampType in the vectorized parquet reader ## How was this

spark git commit: [SPARK-13456][SQL] fix creating encoders for case classes defined in Spark shell

2016-03-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 060a28c63 -> 43ebf7a9c [SPARK-13456][SQL] fix creating encoders for case classes defined in Spark shell ## What changes were proposed in this pull request? case classes defined in REPL are wrapped by line classes, and we have a trick for

spark git commit: [SPARK-14161][SQL] Native Parsing for DDL Command Drop Database

2016-03-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 20c0bcd97 -> 8989d3a39 [SPARK-14161][SQL] Native Parsing for DDL Command Drop Database ### What changes were proposed in this pull request? Based on the Hive DDL document https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL

spark git commit: [SPARK-14144][SQL] Explicitly identify/catch UnsupportedOperationException during parquet reader initialization

2016-03-25 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 43b15e01c -> b5f8c36e3 [SPARK-14144][SQL] Explicitly identify/catch UnsupportedOperationException during parquet reader initialization ## What changes were proposed in this pull request? This PR is a minor cleanup task as part of

spark git commit: [SPARK-14157][SQL] Parse Drop Function DDL command

2016-03-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b547de8a6 -> bc925b73a [SPARK-14157][SQL] Parse Drop Function DDL command ## What changes were proposed in this pull request? JIRA: https://issues.apache.org/jira/browse/SPARK-14157 We only parse create function command. In order to

spark git commit: [SPARK-14177][SQL] Native Parsing for DDL Command "Describe Database" and "Alter Database"

2016-03-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/master bc925b73a -> a01b6a92b [SPARK-14177][SQL] Native Parsing for DDL Command "Describe Database" and "Alter Database" What changes were proposed in this pull request? This PR is to provide native parsing support for two DDL commands:

spark git commit: [SPARK-14061][SQL] implement CreateMap

2016-03-25 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 6603d9f7e -> 43b15e01c [SPARK-14061][SQL] implement CreateMap ## What changes were proposed in this pull request? As we have `CreateArray` and `CreateStruct`, we should also have `CreateMap`. This PR adds the `CreateMap` expression, and
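
For readers skimming the archive, a minimal spark-shell style sketch of what the new expression enables through the SQL `map(...)` function; the column names and the `spark` session are illustrative assumptions, not part of the patch:

```scala
import spark.implicits._  // spark: the SparkSession provided by spark-shell

// map(k1, v1, k2, v2, ...) is backed by the CreateMap expression, analogous
// to array(...) -> CreateArray and struct(...) -> CreateStruct.
val df = Seq((1, "a"), (2, "b")).toDF("id", "name")
df.selectExpr("map('id', id, 'twice', id * 2) AS m").show(truncate = false)
```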

spark git commit: [SPARK-13415][SQL] Visualize subquery in SQL web UI

2016-03-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ad0de99f3 -> b373a8886 [SPARK-13415][SQL] Visualize subquery in SQL web UI ## What changes were proposed in this pull request? This PR supports visualization for subqueries in the SQL web UI and also improves the explain output for subqueries, especially

spark git commit: [SPARK-12941][SQL][MASTER] Spark-SQL JDBC Oracle dialect fails to map string datatypes to Oracle VARCHAR datatype mapping

2016-03-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 15d57f9c2 -> f6ac7c30d [SPARK-12941][SQL][MASTER] Spark-SQL JDBC Oracle dialect fails to map string datatypes to Oracle VARCHAR datatype mapping ## What changes were proposed in this pull request? A test suite added for the bug fix -SPARK

spark git commit: [SPARK-12941][SQL][MASTER] Spark-SQL JDBC Oracle dialect fails to map string datatypes to Oracle VARCHAR datatype mapping

2016-03-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 5a2712952 -> 528e37352 [SPARK-12941][SQL][MASTER] Spark-SQL JDBC Oracle dialect fails to map string datatypes to Oracle VARCHAR datatype mapping A test suite added for the bug fix -SPARK 12941; for the mapping of the StringType to

spark git commit: [SPARK-13495][SQL] Add Null Filters in the query plan for Filters/Joins based on their data constraints

2016-03-07 Thread yhuai
ted ExpressionTrees via `OrcFilterSuite` 3. Test filter source pushdown logic via `SimpleTextHadoopFsRelationSuite` cc yhuai nongli Author: Sameer Agarwal <sam...@databricks.com> Closes #11372 from sameeragarwal/gen-isnotnull. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Com

spark git commit: [SPARK-14278][SQL] Initialize columnar batch with proper memory mode

2016-03-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 8d6207206 -> 358692932 [SPARK-14278][SQL] Initialize columnar batch with proper memory mode ## What changes were proposed in this pull request? Fixes a minor bug in the record reader constructor that was possibly introduced during

spark git commit: [SPARK-14263][SQL] Benchmark Vectorized HashMap for GroupBy Aggregates

2016-03-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 8b207f3b6 -> 8d6207206 [SPARK-14263][SQL] Benchmark Vectorized HashMap for GroupBy Aggregates ## What changes were proposed in this pull request? This PR proposes a new data-structure based on a vectorized hashmap that can be potentially

spark git commit: [SPARK-14244][SQL] Don't use SizeBasedWindowFunction.n created on executor side when evaluating window functions

2016-04-01 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 4fc35e6f5 -> 27e71a2cd [SPARK-14244][SQL] Don't use SizeBasedWindowFunction.n created on executor side when evaluating window functions ## What changes were proposed in this pull request? `SizeBasedWindowFunction.n` is a global singleton

spark git commit: [SPARK-14156][SQL] Use executedPlan in HiveComparisonTest for the messages of computed tables

2016-03-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 4a7636f2d -> 1528ff4c9 [SPARK-14156][SQL] Use executedPlan in HiveComparisonTest for the messages of computed tables ## What changes were proposed in this pull request? JIRA: https://issues.apache.org/jira/browse/SPARK-14156 In

spark git commit: [SPARK-14388][SQL] Implement CREATE TABLE

2016-04-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 1018a1c1e -> 7d2ed8cc0 [SPARK-14388][SQL] Implement CREATE TABLE ## What changes were proposed in this pull request? This patch implements the `CREATE TABLE` command using the `SessionCatalog`. Previously we handled only `CTAS` and

spark git commit: [SPARK-14892][SQL][TEST] Disable the HiveCompatibilitySuite test case for INPUTDRIVER and OUTPUTDRIVER.

2016-04-25 Thread yhuai
Repository: spark Updated Branches: refs/heads/master c7758ba38 -> 88e54218d [SPARK-14892][SQL][TEST] Disable the HiveCompatibilitySuite test case for INPUTDRIVER and OUTPUTDRIVER. What changes were proposed in this pull request? Disable the test case involving INPUTDRIVER and

spark git commit: [SPARK-13643][SQL] Implement SparkSession

2016-04-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 8e1bb0456 -> a2e8d4fdd [SPARK-13643][SQL] Implement SparkSession ## What changes were proposed in this pull request? After removing most of `HiveContext` in 8fc267ab3322e46db81e725a5cb1adb5a71b2b4d we can now move existing functionality

spark git commit: [SPARK-14954] [SQL] Add PARTITION BY and BUCKET BY clause for data source CTAS syntax

2016-04-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master f405de87c -> 24bea0004 [SPARK-14954] [SQL] Add PARTITION BY and BUCKET BY clause for data source CTAS syntax Currently, we can only create persisted partitioned and/or bucketed data source tables using the Dataset API but not using SQL
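
A hedged sketch of the SQL-level equivalent this aims at; the exact keyword spelling is an assumption based on the Dataset API's `partitionBy`/`bucketBy`, and the table and column names are made up:

```scala
// SQL counterpart of df.write.partitionBy("country").bucketBy(4, "id").saveAsTable(...)
// (spark is a spark-shell SparkSession; schema and names are illustrative)
spark.sql("""
  CREATE TABLE events_by_country
  USING parquet
  PARTITIONED BY (country)
  CLUSTERED BY (id) INTO 4 BUCKETS
  AS SELECT id, payload, country FROM raw_events
""")
```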

spark git commit: [SPARK-13023][PROJECT INFRA][BRANCH-1.6] Fix handling of root module in modules_to_test()

2016-04-27 Thread yhuai
est` instead of `changed_modules`. Author: Yin Huai <yh...@databricks.com> Closes #12743 from yhuai/1.6build. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f4af6a8b Tree: http://git-wip-us.apache.org/repos/asf/spark/tree

spark git commit: [SPARK-14944][SPARK-14943][SQL] Remove HiveConf from HiveTableScanExec, HiveTableReader, and ScriptTransformation

2016-04-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b2a456064 -> d73d67f62 [SPARK-14944][SPARK-14943][SQL] Remove HiveConf from HiveTableScanExec, HiveTableReader, and ScriptTransformation ## What changes were proposed in this pull request? This patch removes HiveConf from

[1/2] spark git commit: [SPARK-14872][SQL] Restructure command package

2016-04-23 Thread yhuai
Repository: spark Updated Branches: refs/heads/master fddd3aee0 -> 5c8a0ec99 http://git-wip-us.apache.org/repos/asf/spark/blob/5c8a0ec9/sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala --

spark git commit: [SPARK-14871][SQL] Disable StatsReportListener to declutter output

2016-04-23 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ee6b209a9 -> fddd3aee0 [SPARK-14871][SQL] Disable StatsReportListener to declutter output ## What changes were proposed in this pull request? Spark SQL inherited the use of StatsReportListener from Shark. Unfortunately this clutters the

spark git commit: [SPARK-14865][SQL] Better error handling for view creation.

2016-04-23 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 890abd127 -> e3c1366bb [SPARK-14865][SQL] Better error handling for view creation. ## What changes were proposed in this pull request? This patch improves error handling in view creation. CreateViewCommand itself will analyze the view SQL

[2/2] spark git commit: [SPARK-14872][SQL] Restructure command package

2016-04-23 Thread yhuai
[SPARK-14872][SQL] Restructure command package ## What changes were proposed in this pull request? This patch restructures sql.execution.command package to break the commands into multiple files, in some logical organization: databases, tables, views, functions. I also renamed

spark git commit: [SPARK-14877][SQL] Remove HiveMetastoreTypes class

2016-04-23 Thread yhuai
Repository: spark Updated Branches: refs/heads/master e3c1366bb -> 162e12b08 [SPARK-14877][SQL] Remove HiveMetastoreTypes class ## What changes were proposed in this pull request? It is unnecessary as DataType.catalogString largely replaces the need for this class. ## How was this patch

spark git commit: [SPARK-14917][SQL] Enable some ORC compressions tests for writing

2016-04-29 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 09da43d51 -> d7755cfd0 [SPARK-14917][SQL] Enable some ORC compressions tests for writing ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-14917 As it is described in the JIRA, it seems Hive

spark git commit: [SPARK-15011][SQL][TEST] Ignore org.apache.spark.sql.hive.StatisticsSuite.analyze MetastoreRelation

2016-04-29 Thread yhuai
lem. Author: Yin Huai <yh...@databricks.com> Closes #12783 from yhuai/SPARK-15011-ignore. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ac115f66 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ac115f66 Diff: h

spark git commit: [SPARK-15028][SQL] Remove HiveSessionState.setDefaultOverrideConfs

2016-04-30 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b3ea57931 -> 8dc3987d0 [SPARK-15028][SQL] Remove HiveSessionState.setDefaultOverrideConfs ## What changes were proposed in this pull request? This patch removes some code that are no longer relevant -- mainly

spark git commit: Revert "[SPARK-14613][ML] Add @Since into the matrix and vector classes in spark-mllib-local"

2016-04-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 2398e3d69 -> 9c7c42bc6 Revert "[SPARK-14613][ML] Add @Since into the matrix and vector classes in spark-mllib-local" This reverts commit dae538a4d7c36191c1feb02ba87ffc624ab960dc. Project:

spark git commit: [SPARK-14912][SQL] Propagate data source options to Hadoop configuration

2016-04-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 92f66331b -> 5cb03220a [SPARK-14912][SQL] Propagate data source options to Hadoop configuration ## What changes were proposed in this pull request? We currently have no way for users to propagate options to the underlying library that

spark git commit: [SPARK-14879][SQL] Move CreateMetastoreDataSource and CreateMetastoreDataSourceAsSelect to sql/core

2016-04-23 Thread yhuai
rce and CreateMetastoreDataSourceAsSelect are not Hive-specific. So, this PR moves them from sql/hive to sql/core. Also, I am adding `Command` suffix to these two classes. ## How was this patch tested? Existing tests. Author: Yin Huai <yh...@databricks.com> Closes #12645 from yhuai/moveCreateDataSource. Project: http:

spark git commit: [SPARK-15192][SQL] null check for SparkSession.createDataFrame

2016-05-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 36acf8856 -> f5784459e [SPARK-15192][SQL] null check for SparkSession.createDataFrame ## What changes were proposed in this pull request? This PR adds null check in `SparkSession.createDataFrame`, so that we can make sure the passed

spark git commit: [SPARK-15192][SQL] null check for SparkSession.createDataFrame

2016-05-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 32be51fba -> ebfe3a1f2 [SPARK-15192][SQL] null check for SparkSession.createDataFrame ## What changes were proposed in this pull request? This PR adds null check in `SparkSession.createDataFrame`, so that we can make sure the passed in

spark git commit: [SPARK-15280][Input/Output] Refactored OrcOutputWriter and moved serialization to a new class.

2016-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 4148a9c2c -> 6871deb93 [SPARK-15280][Input/Output] Refactored OrcOutputWriter and moved serialization to a new class. ## What changes were proposed in this pull request? Refactoring: Separated ORC serialization logic from

spark git commit: [SPARK-15280][Input/Output] Refactored OrcOutputWriter and moved serialization to a new class.

2016-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 201a51f36 -> c18fa464f [SPARK-15280][Input/Output] Refactored OrcOutputWriter and moved serialization to a new class. ## What changes were proposed in this pull request? Refactoring: Separated ORC serialization logic from OrcOutputWriter

spark git commit: [SPARK-14127][SQL] Makes 'DESC [EXTENDED|FORMATTED] <table>' support data source tables

2016-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 29bc8d2ec -> de6afc887 [SPARK-14127][SQL] Makes 'DESC [EXTENDED|FORMATTED] <table>' support data source tables ## What changes were proposed in this pull request? This is a follow-up of PR #12844. It makes the newly updated

spark git commit: [SPARK-14127][SQL] Makes 'DESC [EXTENDED|FORMATTED] <table>' support data source tables

2016-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b1e01fd51 -> 671b382a8 [SPARK-14127][SQL] Makes 'DESC [EXTENDED|FORMATTED] <table>' support data source tables ## What changes were proposed in this pull request? This is a follow-up of PR #12844. It makes the newly updated

spark git commit: [SPARK-15173][SQL] DataFrameWriter.insertInto should work with datasource table stored in hive

2016-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master c3e23bc0c -> 2adb11f6d [SPARK-15173][SQL] DataFrameWriter.insertInto should work with datasource table stored in hive When we parse `CREATE TABLE USING`, we should build a `CreateTableUsing` plan with the `managedIfNoPath` set to true.

spark git commit: [SPARK-15173][SQL] DataFrameWriter.insertInto should work with datasource table stored in hive

2016-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 40d24686a -> bf53b96b5 [SPARK-15173][SQL] DataFrameWriter.insertInto should work with datasource table stored in hive When we parse `CREATE TABLE USING`, we should build a `CreateTableUsing` plan with the `managedIfNoPath` set to

spark git commit: [SPARK-15025][SQL] fix duplicate of PATH key in datasource table options

2016-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 3323d0f93 -> 980bba0dc [SPARK-15025][SQL] fix duplicate of PATH key in datasource table options ## What changes were proposed in this pull request? The issue is that when the user provides the path option with uppercase "PATH" key,

spark git commit: [SPARK-15025][SQL] fix duplicate of PATH key in datasource table options

2016-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 6a5ec08ea -> 1bcbf6157 [SPARK-15025][SQL] fix duplicate of PATH key in datasource table options ## What changes were proposed in this pull request? The issue is that when the user provides the path option with uppercase "PATH" key,

spark git commit: [SPARK-14986][SQL] Return correct result for empty LATERAL VIEW OUTER

2016-05-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 5a4a188fe -> 0ab195886 [SPARK-14986][SQL] Return correct result for empty LATERAL VIEW OUTER ## What changes were proposed in this pull request? A Generate with the `outer` flag enabled should always return one or more rows for every

spark git commit: [SPARK-14986][SQL] Return correct result for empty LATERAL VIEW OUTER

2016-05-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 89f73f674 -> d28c67544 [SPARK-14986][SQL] Return correct result for empty LATERAL VIEW OUTER ## What changes were proposed in this pull request? A Generate with the `outer` flag enabled should always return one or more rows for every
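
A small spark-shell style sketch of the behavior being fixed; the table and column names are illustrative:

```scala
import spark.implicits._  // spark: SparkSession from spark-shell

Seq((1, Seq("a", "b")), (2, Seq.empty[String])).toDF("id", "xs")
  .createOrReplaceTempView("t")

// With OUTER, id = 2 should still produce one row with x = NULL instead of
// being dropped when explode(xs) generates no output rows.
spark.sql("SELECT id, x FROM t LATERAL VIEW OUTER explode(xs) v AS x").show()
```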

spark git commit: [SPARK-14933][HOTFIX] Replace `sqlContext` with `spark`.

2016-05-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 0858a82c1 -> 403ba6513 [SPARK-14933][HOTFIX] Replace `sqlContext` with `spark`. ## What changes were proposed in this pull request? This fixes compile errors. ## How was this patch tested? Pass the Jenkins tests. Author: Dongjoon

spark git commit: [SPARK-14933][HOTFIX] Replace `sqlContext` with `spark`.

2016-05-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/master a5f9fdbba -> e1576478b [SPARK-14933][HOTFIX] Replace `sqlContext` with `spark`. ## What changes were proposed in this pull request? This fixes compile errors. ## How was this patch tested? Pass the Jenkins tests. Author: Dongjoon Hyun

spark git commit: [SPARK-14837][SQL][STREAMING] Added support in file stream source for reading new files added to subdirs

2016-05-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 f021f3460 -> d8c2da9a4 [SPARK-14837][SQL][STREAMING] Added support in file stream source for reading new files added to subdirs ## What changes were proposed in this pull request? Currently, file stream source can only find new files

spark git commit: [SPARK-14837][SQL][STREAMING] Added support in file stream source for reading new files added to subdirs

2016-05-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 86475520f -> d9ca9fd3e [SPARK-14837][SQL][STREAMING] Added support in file stream source for reading new files added to subdirs ## What changes were proposed in this pull request? Currently, file stream source can only find new files if

spark git commit: [SPARK-15257][SQL] Require CREATE EXTERNAL TABLE to specify LOCATION

2016-05-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 40ba87f76 -> 8881765ac [SPARK-15257][SQL] Require CREATE EXTERNAL TABLE to specify LOCATION ## What changes were proposed in this pull request? Before: ```sql -- uses warehouse dir anyway CREATE EXTERNAL TABLE my_tab -- doesn't actually

spark git commit: [SPARK-15257][SQL] Require CREATE EXTERNAL TABLE to specify LOCATION

2016-05-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 b1e14d9bf -> 4e56857ca [SPARK-15257][SQL] Require CREATE EXTERNAL TABLE to specify LOCATION ## What changes were proposed in this pull request? Before: ```sql -- uses warehouse dir anyway CREATE EXTERNAL TABLE my_tab -- doesn't
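
In concrete terms, a hedged sketch of the before/after behavior; it assumes a Hive-enabled spark-shell session, the table name and path are illustrative, and the exact error raised is an assumption:

```scala
// Before this change the first statement silently created the table under the
// warehouse directory; afterwards it should be rejected at analysis time.
spark.sql("CREATE EXTERNAL TABLE my_tab (id INT)")                          // expected to fail
spark.sql("CREATE EXTERNAL TABLE my_tab (id INT) LOCATION '/data/my_tab'")  // OK
```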

spark git commit: [SPARK-14346] SHOW CREATE TABLE for data source tables

2016-05-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 b2b04c6da -> 0b14b3f13 [SPARK-14346] SHOW CREATE TABLE for data source tables ## What changes were proposed in this pull request? This PR adds native `SHOW CREATE TABLE` DDL command for data source tables. Support for Hive tables
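
A quick spark-shell style illustration of the new command on a data source table; the table name is made up:

```scala
// Create a simple Parquet-backed data source table, then ask Spark to
// reconstruct its DDL with the new command.
spark.range(10).write.format("parquet").saveAsTable("ds_tab")
spark.sql("SHOW CREATE TABLE ds_tab").show(truncate = false)
```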

spark git commit: [SPARK-15094][SPARK-14803][SQL] Remove extra Project added in EliminateSerialization

2016-05-12 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 b3f145442 -> 68617e1ad [SPARK-15094][SPARK-14803][SQL] Remove extra Project added in EliminateSerialization ## What changes were proposed in this pull request? We will eliminate the pair of `DeserializeToObject` and

spark git commit: [SPARK-15094][SPARK-14803][SQL] Remove extra Project added in EliminateSerialization

2016-05-12 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 5bb62b893 -> 470de743e [SPARK-15094][SPARK-14803][SQL] Remove extra Project added in EliminateSerialization ## What changes were proposed in this pull request? We will eliminate the pair of `DeserializeToObject` and `SerializeFromObject`

spark git commit: [SPARK-15160][SQL] support data source table in InMemoryCatalog

2016-05-12 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 9e266d07a -> 46991448a [SPARK-15160][SQL] support data source table in InMemoryCatalog ## What changes were proposed in this pull request? This PR adds a new rule to convert `SimpleCatalogRelation` to data source table if its table

spark git commit: [SPARK-15072][SQL][PYSPARK][HOT-FIX] Remove SparkSession.withHiveSupport from readwrite.py

2016-05-11 Thread yhuai
mit/db573fc743d12446dd0421fb45d00c2f541eaf9a did not remove withHiveSupport from readwrite.py Author: Yin Huai <yh...@databricks.com> Closes #13069 from yhuai/fixPython. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ba5487c0 Tree: http:

spark git commit: [SPARK-15072][SQL][PYSPARK][HOT-FIX] Remove SparkSession.withHiveSupport from readwrite.py

2016-05-11 Thread yhuai
mit/db573fc743d12446dd0421fb45d00c2f541eaf9a did not remove withHiveSupport from readwrite.py Author: Yin Huai <yh...@databricks.com> Closes #13069 from yhuai/fixPython. (cherry picked from commit ba5487c061168627b27af2fa9610d53791fcc90d) Signed-off-by: Yin Huai <yh...@databricks.com> Project:

spark git commit: [SPARK-15160][SQL] support data source table in InMemoryCatalog

2016-05-12 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 86acb5efd -> beda3938c [SPARK-15160][SQL] support data source table in InMemoryCatalog ## What changes were proposed in this pull request? This PR adds a new rule to convert `SimpleCatalogRelation` to data source table if its table

spark git commit: [SPARK-14346][SQL][FOLLOW-UP] add tests for CREATE TABLE USING with partition and bucket

2016-05-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 110876b9a -> adc1c2685 [SPARK-14346][SQL][FOLLOW-UP] add tests for CREATE TABLE USING with partition and bucket ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/12781 introduced PARTITIONED BY,

spark git commit: [SPARK-11735][CORE][SQL] Add a check in the constructor of SQLContext/SparkSession to make sure its SparkContext is not stopped

2016-05-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 0f576a574 -> 8e8bc9f95 [SPARK-11735][CORE][SQL] Add a check in the constructor of SQLContext/SparkSession to make sure its SparkContext is not stopped ## What changes were proposed in this pull request? Add a check in the constructor of

spark git commit: [SPARK-11735][CORE][SQL] Add a check in the constructor of SQLContext/SparkSession to make sure its SparkContext is not stopped

2016-05-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 c0bb77132 -> 7b62b7c11 [SPARK-11735][CORE][SQL] Add a check in the constructor of SQLContext/SparkSession to make sure its SparkContext is not stopped ## What changes were proposed in this pull request? Add a check in the constructor
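
The failure mode this guards against, sketched for spark-shell; the exact exception type is an assumption, the point is that construction now fails fast:

```scala
import org.apache.spark.sql.SQLContext

sc.stop()                     // sc: the shell's SparkContext
val ctx = new SQLContext(sc)  // should now fail immediately with a clear error
                              // instead of misbehaving later in the session
```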

spark git commit: [SPARK-14346][SQL] Native SHOW CREATE TABLE for Hive tables/views

2016-05-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 8e8bc9f95 -> b674e67c2 [SPARK-14346][SQL] Native SHOW CREATE TABLE for Hive tables/views ## What changes were proposed in this pull request? This is a follow-up of #12781. It adds native `SHOW CREATE TABLE` support for Hive tables and

spark git commit: [SPARK-14346][SQL] Native SHOW CREATE TABLE for Hive tables/views

2016-05-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 7b62b7c11 -> 2dddec40d [SPARK-14346][SQL] Native SHOW CREATE TABLE for Hive tables/views ## What changes were proposed in this pull request? This is a follow-up of #12781. It adds native `SHOW CREATE TABLE` support for Hive tables

spark git commit: [SPARK-14346] Fix scala-2.10 build

2016-05-17 Thread yhuai
; Closes #13157 from yhuai/SPARK-14346-fix-scala2.10. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2a5db9c1 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2a5db9c1 Diff: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [SPARK-14346] Fix scala-2.10 build

2016-05-17 Thread yhuai
ks.com> Closes #13157 from yhuai/SPARK-14346-fix-scala2.10. (cherry picked from commit 2a5db9c140b9d60a5ec91018be19bec7b80850ee) Signed-off-by: Yin Huai <yh...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commi

spark git commit: [SPARK-14541][SQL] Support IFNULL, NULLIF, NVL and NVL2

2016-05-12 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 d73ce364e -> 51706f8a4 [SPARK-14541][SQL] Support IFNULL, NULLIF, NVL and NVL2 ## What changes were proposed in this pull request? This patch adds support for a few SQL functions to improve compatibility with other databases: IFNULL,

spark git commit: [SPARK-14541][SQL] Support IFNULL, NULLIF, NVL and NVL2

2016-05-12 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ba169c323 -> eda2800d4 [SPARK-14541][SQL] Support IFNULL, NULLIF, NVL and NVL2 ## What changes were proposed in this pull request? This patch adds support for a few SQL functions to improve compatibility with other databases: IFNULL,
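
For reference, the intended semantics of the four functions sketched as a single query; the expected results in the comments reflect the usual definitions of these functions in other databases, and the `spark` session is the spark-shell one:

```scala
spark.sql("""
  SELECT
    ifnull(NULL, 'fallback')    AS a,  -- 'fallback': second arg when first is NULL
    nullif(1, 1)                AS b,  -- NULL: the two arguments are equal
    nvl(NULL, 'x')              AS c,  -- 'x': behaves like coalesce of two args
    nvl2(NULL, 'set', 'unset')  AS d   -- 'unset': third arg when first is NULL
""").show()
```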

spark git commit: [SPARK-15231][SQL] Document the semantic of saveAsTable and insertInto and don't drop columns silently

2016-05-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 a8637f4ac -> 2d3c69a02 [SPARK-15231][SQL] Document the semantic of saveAsTable and insertInto and don't drop columns silently ## What changes were proposed in this pull request? This PR adds documents about the different behaviors

spark git commit: [SPARK-15231][SQL] Document the semantic of saveAsTable and insertInto and don't drop columns silently

2016-05-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 007882c7e -> 875ef7642 [SPARK-15231][SQL] Document the semantic of saveAsTable and insertInto and don't drop columns silently ## What changes were proposed in this pull request? This PR adds documents about the different behaviors
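
A hedged sketch of the documented difference as I read the PR summary: `insertInto` resolves columns by position, while appending with `saveAsTable` resolves them by name. Table and column names are illustrative:

```scala
import spark.implicits._  // spark-shell session assumed

Seq(("x1", "y1")).toDF("c1", "c2").write.saveAsTable("t_semantics")

val swapped = Seq(("y2", "x2")).toDF("c2", "c1")
swapped.write.insertInto("t_semantics")                   // positional: "y2" lands in c1
swapped.write.mode("append").saveAsTable("t_semantics")   // by name: c1 gets "x2", c2 gets "y2"
```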

spark git commit: [SPARK-15248][SQL] Make MetastoreFileCatalog consider directories from partition specs of a partitioned metastore table

2016-05-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 89e67d666 -> 81c68eceb [SPARK-15248][SQL] Make MetastoreFileCatalog consider directories from partition specs of a partitioned metastore table Table partitions can be added with locations different from default warehouse location of a

spark git commit: [SPARK-15248][SQL] Make MetastoreFileCatalog consider directories from partition specs of a partitioned metastore table

2016-05-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 56e1e2f17 -> 6b36185d0 [SPARK-15248][SQL] Make MetastoreFileCatalog consider directories from partition specs of a partitioned metastore table Table partitions can be added with locations different from default warehouse location of

spark git commit: [SPARK-13749][SQL][FOLLOW-UP] Faster pivot implementation for many distinct values with two phase aggregation

2016-05-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/master bb9ab56b9 -> d8f528ceb [SPARK-13749][SQL][FOLLOW-UP] Faster pivot implementation for many distinct values with two phase aggregation ## What changes were proposed in this pull request? This is a follow up PR for #11583. It makes 3 lazy

spark git commit: [SPARK-13749][SQL][FOLLOW-UP] Faster pivot implementation for many distinct values with two phase aggregation

2016-05-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 a7e8cfa64 -> 52308103e [SPARK-13749][SQL][FOLLOW-UP] Faster pivot implementation for many distinct values with two phase aggregation ## What changes were proposed in this pull request? This is a follow up PR for #11583. It makes 3

spark git commit: [SPARK-15108][SQL] Describe Permanent UDTF

2016-05-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 76ad04d9a -> 5c8fad7b9 [SPARK-15108][SQL] Describe Permanent UDTF What changes were proposed in this pull request? When describing a UDTF, the command returns a wrong result. The command is unable to find the function, which has been

spark git commit: [SPARK-14997][SQL] Fixed FileCatalog to return correct set of files when there is no partitioning scheme in the given paths

2016-05-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/master e20cd9f4c -> f7b7ef416 [SPARK-14997][SQL] Fixed FileCatalog to return correct set of files when there is no partitioning scheme in the given paths ## What changes were proposed in this pull request? Let's say there are json files in the

spark git commit: [SPARK-13749][SQL] Faster pivot implementation for many distinct values with two phase aggregation

2016-05-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 eb7336a75 -> 08ae32e61 [SPARK-13749][SQL] Faster pivot implementation for many distinct values with two phase aggregation ## What changes were proposed in this pull request? The existing implementation of pivot translates into a

spark git commit: [SPARK-13749][SQL] Faster pivot implementation for many distinct values with two phase aggregation

2016-05-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 0a3026990 -> 992744186 [SPARK-13749][SQL] Faster pivot implementation for many distinct values with two phase aggregation ## What changes were proposed in this pull request? The existing implementation of pivot translates into a single

spark git commit: [SPARK-6339][SQL] Supports CREATE TEMPORARY VIEW tableIdentifier AS query

2016-05-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/master fa79d346e -> 8fb1463d6 [SPARK-6339][SQL] Supports CREATE TEMPORARY VIEW tableIdentifier AS query ## What changes were proposed in this pull request? This PR supports the new SQL syntax CREATE TEMPORARY VIEW. Like: ``` CREATE TEMPORARY VIEW
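
A minimal end-to-end sketch of the new syntax, assuming a spark-shell session; the view and table names are illustrative:

```scala
spark.range(100).createOrReplaceTempView("numbers")
spark.sql("CREATE TEMPORARY VIEW evens AS SELECT id FROM numbers WHERE id % 2 = 0")
spark.sql("SELECT count(*) FROM evens").show()  // expect 50
```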

spark git commit: [SPARK-14993][SQL] Fix Partition Discovery Inconsistency when Input is a Path to Parquet File

2016-05-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 8fb1463d6 -> ef55e46c9 [SPARK-14993][SQL] Fix Partition Discovery Inconsistency when Input is a Path to Parquet File What changes were proposed in this pull request? When we load a dataset, if we set the path to ```/path/a=1```, we

spark git commit: [SPARK-14993][SQL] Fix Partition Discovery Inconsistency when Input is a Path to Parquet File

2016-05-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 d90359d63 -> 689b0fc81 [SPARK-14993][SQL] Fix Partition Discovery Inconsistency when Input is a Path to Parquet File What changes were proposed in this pull request? When we load a dataset, if we set the path to ```/path/a=1```,

spark git commit: [SPARK-6339][SQL] Supports CREATE TEMPORARY VIEW tableIdentifier AS query

2016-05-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 fa3c5507f -> d90359d63 [SPARK-6339][SQL] Supports CREATE TEMPORARY VIEW tableIdentifier AS query ## What changes were proposed in this pull request? This PR supports the new SQL syntax CREATE TEMPORARY VIEW. Like: ``` CREATE TEMPORARY VIEW

spark git commit: [SPARK-14127][SQL] "DESC <table>": Extracts schema information from table properties for data source tables

2016-05-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/master aab99d31a -> 8a12580d2 [SPARK-14127][SQL] "DESC <table>": Extracts schema information from table properties for data source tables ## What changes were proposed in this pull request? This is a follow-up of #12934 and #12844. This PR adds a set
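
An illustration of what the follow-up makes work, as a hedged spark-shell sketch; the table name is made up:

```scala
// Persist a data source table, then describe it; the columns shown should come
// from the schema stored in the table properties rather than a placeholder.
spark.range(5).selectExpr("id", "id * 2 AS doubled")
  .write.format("parquet").saveAsTable("ds_described")

spark.sql("DESC ds_described").show(truncate = false)
spark.sql("DESC EXTENDED ds_described").show(truncate = false)
```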

spark git commit: [SPARK-14677][SQL] Make the max number of iterations configurable for Catalyst

2016-04-15 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b2dfa8495 -> f4be0946a [SPARK-14677][SQL] Make the max number of iterations configurable for Catalyst ## What changes were proposed in this pull request? We currently hard code the max number of optimizer/analyzer iterations to 100. This

spark git commit: [SPARK-14620][SQL] Use/benchmark a better hash in VectorizedHashMap

2016-04-15 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 8028a2888 -> 4df65184b [SPARK-14620][SQL] Use/benchmark a better hash in VectorizedHashMap ## What changes were proposed in this pull request? This PR uses a better hashing algorithm while probing the AggregateHashMap: ```java long h = 0

svn commit: r1739799 [1/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2016-04-18 Thread yhuai
Author: yhuai Date: Mon Apr 18 17:56:44 2016 New Revision: 1739799 URL: http://svn.apache.org/viewvc?rev=1739799&view=rev Log: Add news for Spark Summit (June 6, 2016) agenda Added: spark/news/_posts/2016-04-17-submit-summit-agenda-posted.md Modified: spark/site/community.html spark/site

svn commit: r1739799 [2/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2016-04-18 Thread yhuai
Modified: spark/site/news/spark-version-0-6-0-released.html URL: http://svn.apache.org/viewvc/spark/site/news/spark-version-0-6-0-released.html?rev=1739799&r1=1739798&r2=1739799&view=diff == ---

svn commit: r1739801 [1/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2016-04-18 Thread yhuai
Author: yhuai Date: Mon Apr 18 18:04:55 2016 New Revision: 1739801 URL: http://svn.apache.org/viewvc?rev=1739801&view=rev Log: Fix the link for previous commit (Add news for Spark Summit (June 6, 2016) agenda) Added: spark/news/_posts/2016-04-17-spark-summit-june-2016-agenda-posted.md Modified

spark git commit: [SPARK-14647][SQL] Group SQLContext/HiveContext state into SharedState

2016-04-18 Thread yhuai
<yh...@databricks.com> Closes #12463 from yhuai/sharedState. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/28ee1570 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/28ee1570 Diff: http://git-wip-us.apache.org/repos/asf/s

svn commit: r1739801 [2/2] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2016-04-18 Thread yhuai
Modified: spark/site/releases/spark-release-1-2-2.html URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-1-2-2.html?rev=1739801&r1=1739800&r2=1739801&view=diff == --- spark/site/releases/spark-release-1-2-2.html

svn commit: r1739802 - /spark/site/news/spark-summit-june-2016-agenda-posted.html

2016-04-18 Thread yhuai
Author: yhuai Date: Mon Apr 18 18:05:47 2016 New Revision: 1739802 URL: http://svn.apache.org/viewvc?rev=1739802&view=rev Log: Fix the link for previous commit (Add news for Spark Summit (June 6, 2016) agenda) again Added: spark/site/news/spark-summit-june-2016-agenda-posted.html Added: spark

spark git commit: [SPARK-13681][SPARK-14458][SPARK-14566][SQL] Add back once removed CommitFailureTestRelationSuite and SimpleTextHadoopFsRelationSuite

2016-04-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 3d46d796a -> 5e360c93b [SPARK-13681][SPARK-14458][SPARK-14566][SQL] Add back once removed CommitFailureTestRelationSuite and SimpleTextHadoopFsRelationSuite ## What changes were proposed in this pull request? These test suites were

spark git commit: [SPARK-14675][SQL] ClassFormatError when using Seq as Aggregator buffer type

2016-04-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 947b9020b -> 5cb2e3360 [SPARK-14675][SQL] ClassFormatError when using Seq as Aggregator buffer type ## What changes were proposed in this pull request? After https://github.com/apache/spark/pull/12067, we now use expressions to do the

spark git commit: [SPARK-14647][SQL] Group SQLContext/HiveContext state into SharedState

2016-04-16 Thread yhuai
rew Or <and...@databricks.com> Author: Yin Huai <yh...@databricks.com> Closes #12447 from yhuai/sharedState. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5cefecc9 Tree: http://git-wip-us.apache.org/repos/asf/spar

spark git commit: [SPARK-14672][SQL] Move HiveContext analyze logic to AnalyzeTable

2016-04-16 Thread yhuai
How was this patch tested? Existing tests. Closes #12429 Author: Yin Huai <yh...@databricks.com> Author: Andrew Or <and...@databricks.com> Closes #12448 from yhuai/analyzeTable. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/a

spark git commit: [SPARK-14677][SQL] follow up: make max iter num config internal

2016-04-16 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 36da5e323 -> 7319fcc1c [SPARK-14677][SQL] follow up: make max iter num config internal ## What changes were proposed in this pull request? This is a follow-up to make the max iteration number an internal config. ## How was this patch

spark git commit: [SPARK-14407][SQL] Hides HadoopFsRelation related data source API into execution/datasources package #12178

2016-04-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 366414235 -> 10f273d8d [SPARK-14407][SQL] Hides HadoopFsRelation related data source API into execution/datasources package #12178 ## What changes were proposed in this pull request? This PR moves `HadoopFsRelation` related data source
