spark git commit: [SPARK-8532] [SQL] In Python's DataFrameWriter, save/saveAsTable/json/parquet/jdbc always override mode

2015-06-22 Thread yhuai
actions (i.e. `save/saveAsTable/json/parquet/jdbc`) always override mode. Second, it adds input argument `partitionBy` to `save/saveAsTable/parquet`. Author: Yin Huai yh...@databricks.com Closes #6937 from yhuai/SPARK-8532 and squashes the following commits: f972d5d [Yin Huai] davies's comment
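The fix makes each save action apply its `mode` argument unconditionally instead of deferring to a previously configured default. A minimal Python sketch of the corrected behavior (a toy writer, not the actual PySpark `DataFrameWriter`):

```python
class ToyWriter:
    """Toy stand-in for a DataFrameWriter: every save action takes an
    explicit mode and always overrides the stored one (SPARK-8532)."""
    def __init__(self):
        self._mode = "error"          # default: fail if data already exists
        self._partition_by = None

    def save(self, path, mode="error", partitionBy=None):
        # After the fix, the mode argument always wins, even when a mode
        # was configured earlier; partitionBy is the newly added argument.
        self._mode = mode
        self._partition_by = partitionBy
        return (path, self._mode, self._partition_by)

w = ToyWriter()
print(w.save("/tmp/out", mode="overwrite", partitionBy=["year"]))
```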

spark git commit: [SPARK-7859] [SQL] Collect_set() behavior differences which fails the unit test under jdk8

2015-06-22 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 994abbaeb - d73900a90 [SPARK-7859] [SQL] Collect_set() behavior differences which fails the unit test under jdk8 To reproduce that: ``` JAVA_HOME=/home/hcheng/Java/jdk1.8.0_45 | build/sbt -Phadoop-2.3 -Phive 'test-only
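The JDK8 failure comes from a unit test that compared `collect_set` output order-sensitively; a hash set's iteration order is not stable across JVM versions. A small Python sketch of the robust way to assert on such results (toy `collect_set`, not Spark's):

```python
def collect_set(values):
    """Toy collect_set: distinct values, in an order that (like a Java
    HashSet's iteration order) callers must not rely on."""
    return list(dict.fromkeys(values))

result = collect_set([3, 1, 3, 2])
# Fragile: assert result == [3, 1, 2]   -- order can differ across JVMs.
# Robust: compare order-insensitively.
assert sorted(result) == [1, 2, 3]
```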

spark git commit: [HOTFIX] Hotfix branch-1.4 building by removing avgMetrics in CrossValidatorSuite

2015-06-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 2a7ea31a9 - b836bac3f [HOTFIX] Hotfix branch-1.4 building by removing avgMetrics in CrossValidatorSuite Ref. #6905 ping yhuai Author: Liang-Chi Hsieh vii...@gmail.com Closes #6929 from viirya/hot_fix_cv_test and squashes

spark git commit: [SPARK-8420] [SQL] Fix comparison of timestamps/dates with strings

2015-06-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 9814b971f - a333a72e0 [SPARK-8420] [SQL] Fix comparison of timestamps/dates with strings In earlier versions of Spark SQL we cast `TimestampType` and `DateType` to `StringType` when it was involved in a binary comparison with a
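Casting the timestamp side to a string gives lexicographic comparison semantics; the fix is to cast the string side to a timestamp instead. A sketch of the fixed semantics in plain Python (illustrative, not Spark's cast machinery; the format string is an assumption):

```python
from datetime import datetime

def compare_ts_to_string(ts: datetime, s: str) -> bool:
    """Fixed semantics: parse the string side into a timestamp and
    compare timestamps, rather than stringifying the timestamp."""
    return ts == datetime.strptime(s, "%Y-%m-%d %H:%M:%S")

assert compare_ts_to_string(datetime(2015, 6, 19, 0, 0, 0), "2015-06-19 00:00:00")
```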

spark git commit: [SPARK-8093] [SQL] Remove empty structs inferred from JSON documents

2015-06-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 1a6b51078 - 0131142d9 [SPARK-8093] [SQL] Remove empty structs inferred from JSON documents Author: Nathan Howell nhow...@godaddy.com Closes #6799 from NathanHowell/spark-8093 and squashes the following commits: 76ac3e8 [Nathan
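Schema inference over JSON documents can produce struct fields with no members (e.g. from `{"b": {}}`), which are useless downstream. A toy Python model of pruning them, with schemas as nested dicts (illustrative only, not Spark's `StructType` code):

```python
def prune_empty_structs(schema):
    """Recursively drop struct fields whose struct type ends up with no
    fields. Toy model of the SPARK-8093 cleanup."""
    if not isinstance(schema, dict):
        return schema                  # leaf type, e.g. "long" or "string"
    pruned = {}
    for name, typ in schema.items():
        typ = prune_empty_structs(typ)
        if typ == {}:                  # empty struct: drop the field entirely
            continue
        pruned[name] = typ
    return pruned

inferred = {"a": "long", "b": {}, "c": {"d": {}, "e": "string"}}
print(prune_empty_structs(inferred))   # {'a': 'long', 'c': {'e': 'string'}}
```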

spark git commit: [SPARK-8093] [SQL] Remove empty structs inferred from JSON documents

2015-06-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 1fa29c2df - 9814b971f [SPARK-8093] [SQL] Remove empty structs inferred from JSON documents Author: Nathan Howell nhow...@godaddy.com Closes #6799 from NathanHowell/spark-8093 and squashes the following commits: 76ac3e8 [Nathan Howell]

spark git commit: [HOT-FIX] Fix compilation (caused by 0131142d98b191f6cc112d383aa10582a3ac35bf)

2015-06-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 0131142d9 - 2510365fa [HOT-FIX] Fix compilation (caused by 0131142d98b191f6cc112d383aa10582a3ac35bf) Author: Yin Huai yh...@databricks.com Closes #6913 from yhuai/branch-1.4-hotfix and squashes the following commits: 7f91fa0 [Yin

spark git commit: [SPARK-8498] [SQL] Add regression test for SPARK-8470

2015-06-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 2510365fa - 2248ad8b7 [SPARK-8498] [SQL] Add regression test for SPARK-8470 **Summary of the problem in SPARK-8470.** When using `HiveContext` to create a data frame of a user case class, Spark throws

spark git commit: [SPARK-8498] [SQL] Add regression test for SPARK-8470

2015-06-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b305e377f - 093c34838 [SPARK-8498] [SQL] Add regression test for SPARK-8470 **Summary of the problem in SPARK-8470.** When using `HiveContext` to create a data frame of a user case class, Spark throws

spark git commit: [SPARK-8121] [SQL] Fixes InsertIntoHadoopFsRelation job initialization for Hadoop 1.x

2015-06-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ed5c2dccd - bbdfc0a40 [SPARK-8121] [SQL] Fixes InsertIntoHadoopFsRelation job initialization for Hadoop 1.x For Hadoop 1.x, `TaskAttemptContext` constructor clones the `Configuration` argument, thus configurations done in

spark git commit: [SPARK-8121] [SQL] Fixes InsertIntoHadoopFsRelation job initialization for Hadoop 1.x (branch 1.4 backport based on https://github.com/apache/spark/pull/6669)

2015-06-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 a3afc2cba - 69197c3e3 [SPARK-8121] [SQL] Fixes InsertIntoHadoopFsRelation job initialization for Hadoop 1.x (branch 1.4 backport based on https://github.com/apache/spark/pull/6669) Project:

spark git commit: [SPARK-7747] [SQL] [DOCS] spark.sql.planner.externalSort

2015-06-05 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 4f16d3fe2 - 4060526cd [SPARK-7747] [SQL] [DOCS] spark.sql.planner.externalSort Add documentation for spark.sql.planner.externalSort Author: Luca Martinetti l...@luca.io Closes #6272 from lucamartinetti/docs-externalsort and squashes the

spark git commit: [SPARK-7747] [SQL] [DOCS] spark.sql.planner.externalSort

2015-06-05 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 200c980a1 - 94f65bcce [SPARK-7747] [SQL] [DOCS] spark.sql.planner.externalSort Add documentation for spark.sql.planner.externalSort Author: Luca Martinetti l...@luca.io Closes #6272 from lucamartinetti/docs-externalsort and squashes

spark git commit: [SPARK-6964] [SQL] Support Cancellation in the Thrift Server

2015-06-05 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 6ebe419f3 - eb19d3f75 [SPARK-6964] [SQL] Support Cancellation in the Thrift Server Support runInBackground in SparkExecuteStatementOperation, and add cancellation Author: Dong Wang d...@databricks.com Closes #6207 from

spark git commit: [SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.

2015-06-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 ee7f365bd - 54a4ea407 [SPARK-7973] [SQL] Increase the timeout of two CliSuite tests. https://issues.apache.org/jira/browse/SPARK-7973 Author: Yin Huai yh...@databricks.com Closes #6525 from yhuai/SPARK-7973 and squashes the following

spark git commit: [SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.

2015-06-03 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 28dbde387 - f1646e102 [SPARK-7973] [SQL] Increase the timeout of two CliSuite tests. https://issues.apache.org/jira/browse/SPARK-7973 Author: Yin Huai yh...@databricks.com Closes #6525 from yhuai/SPARK-7973 and squashes the following

spark git commit: [SPARK-8020] [SQL] Spark SQL conf in spark-defaults.conf makes metadataHive get constructed too early

2015-06-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/master bcb47ad77 - 7b7f7b6c6 [SPARK-8020] [SQL] Spark SQL conf in spark-defaults.conf makes metadataHive get constructed too early https://issues.apache.org/jira/browse/SPARK-8020 Author: Yin Huai yh...@databricks.com Closes #6571 from yhuai

spark git commit: [SPARK-8020] [SQL] Spark SQL conf in spark-defaults.conf makes metadataHive get constructed too early

2015-06-02 Thread yhuai
yhuai/SPARK-8020-1 and squashes the following commits: 0398f5b [Yin Huai] First populate the SQLConf and then construct executionHive and metadataHive. (cherry picked from commit 7b7f7b6c6fd903e2ecfc886d29eaa9df58adcfc3) Signed-off-by: Yin Huai yh...@databricks.com Project: http://git-wip

spark git commit: [SPARK-8014] [SQL] Avoid premature metadata discovery when writing a HadoopFsRelation with a save mode other than Append

2015-06-02 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 815e05654 - cbaf59544 [SPARK-8014] [SQL] Avoid premature metadata discovery when writing a HadoopFsRelation with a save mode other than Append The current code references the schema of the DataFrame to be written before checking save

spark git commit: [HOT-FIX] Add EvaluatedType back to RDG

2015-06-02 Thread yhuai
from yhuai/getBackEvaluatedType and squashes the following commits: 618c2eb [Yin Huai] Add EvaluatedType back. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8c3fc3a6 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree

spark git commit: [SPARK-7950] [SQL] Sets spark.sql.hive.version in HiveThriftServer2.startWithContext()

2015-05-29 Thread yhuai
Repository: spark Updated Branches: refs/heads/master a51b133de - e7b617755 [SPARK-7950] [SQL] Sets spark.sql.hive.version in HiveThriftServer2.startWithContext() When starting `HiveThriftServer2` via `startWithContext`, property `spark.sql.hive.version` isn't set. This causes Simba ODBC

spark git commit: [SPARK-7950] [SQL] Sets spark.sql.hive.version in HiveThriftServer2.startWithContext()

2015-05-29 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 23bd05fff - caea7a618 [SPARK-7950] [SQL] Sets spark.sql.hive.version in HiveThriftServer2.startWithContext() When starting `HiveThriftServer2` via `startWithContext`, property `spark.sql.hive.version` isn't set. This causes Simba

spark git commit: [SPARK-7853] [SQL] Fix HiveContext in Spark Shell

2015-05-28 Thread yhuai
Context fails to create in spark shell because of the class loader issue. Author: Yin Huai yh...@databricks.com Closes #6459 from yhuai/SPARK-7853 and squashes the following commits: 37ad33e [Yin Huai] Do not use hiveQlTable at all. 47cdb6d [Yin Huai] Move hiveconf.set to the end of setConf

spark git commit: [SPARK-7853] [SQL] Fix HiveContext in Spark Shell

2015-05-28 Thread yhuai
that Hive Context fails to create in spark shell because of the class loader issue. Author: Yin Huai yh...@databricks.com Closes #6459 from yhuai/SPARK-7853 and squashes the following commits: 37ad33e [Yin Huai] Do not use hiveQlTable at all. 47cdb6d [Yin Huai] Move hiveconf.set to the end of setConf

spark git commit: [SPARK-7847] [SQL] Fixes dynamic partition directory escaping

2015-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 90525c9ba - a25ce91f9 [SPARK-7847] [SQL] Fixes dynamic partition directory escaping Please refer to [SPARK-7847] [1] for details. [1]: https://issues.apache.org/jira/browse/SPARK-7847 Author: Cheng Lian l...@databricks.com Closes
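Dynamic partition values become directory names, so characters that are illegal or ambiguous in paths must be percent-escaped, Hive-style. A minimal sketch of that escaping in Python (the exact unsafe-character set here is an assumption for illustration):

```python
def escape_partition_value(value: str) -> str:
    """Escape characters unsafe in a partition directory name as %XX,
    in the style Hive uses. Illustrative character set, not Hive's exact one."""
    unsafe = set('\\/:*?"<>|%=')
    return "".join(f"%{ord(c):02X}" if c in unsafe or ord(c) < 32 else c
                   for c in value)

print(escape_partition_value("a/b=c"))   # a%2Fb%3Dc
```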

spark git commit: [SPARK-7847] [SQL] Fixes dynamic partition directory escaping

2015-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ff0ddff46 - 15459db4f [SPARK-7847] [SQL] Fixes dynamic partition directory escaping Please refer to [SPARK-7847] [1] for details. [1]: https://issues.apache.org/jira/browse/SPARK-7847 Author: Cheng Lian l...@databricks.com Closes #6389

spark git commit: [SPARK-7790] [SQL] date and decimal conversion for dynamic partition key

2015-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 6fec1a940 - 8161562ea [SPARK-7790] [SQL] date and decimal conversion for dynamic partition key Author: Daoyuan Wang daoyuan.w...@intel.com Closes #6318 from adrian-wang/dynpart and squashes the following commits: ad73b61 [Daoyuan Wang]

spark git commit: [SPARK-7684] [SQL] Refactoring MetastoreDataSourcesSuite to workaround SPARK-7684

2015-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 d33142fd8 - 89fe93fc3 [SPARK-7684] [SQL] Refactoring MetastoreDataSourcesSuite to workaround SPARK-7684 As stated in SPARK-7684, currently `TestHive.reset` has some execution order specific bug, which makes running specific test

spark git commit: [SPARK-7853] [SQL] Fixes a class loader issue in Spark SQL

2015-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 89fe93fc3 - e07b71560 [SPARK-7853] [SQL] Fixes a class loader issue in Spark SQL This PR is based on PR #6396 authored by chenghao-intel. Essentially, Spark SQL should use context classloader to load SerDe classes. yhuai helped

spark git commit: [SPARK-7853] [SQL] Fixes a class loader issue in Spark SQL

2015-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b97ddff00 - db3fd054f [SPARK-7853] [SQL] Fixes a class loader issue in Spark SQL This PR is based on PR #6396 authored by chenghao-intel. Essentially, Spark SQL should use context classloader to load SerDe classes. yhuai helped updating

spark git commit: [SPARK-7907] [SQL] [UI] Rename tab ThriftServer to SQL.

2015-05-27 Thread yhuai
`; and 3. Renaming the title of the session page from `ThriftServer` to `JDBC/ODBC Session`. https://issues.apache.org/jira/browse/SPARK-7907 Author: Yin Huai yh...@databricks.com Closes #6448 from yhuai/JDBCServer and squashes the following commits: eadcc3d [Yin Huai] Update test. 9168005 [Yin Huai

spark git commit: [SPARK-7907] [SQL] [UI] Rename tab ThriftServer to SQL.

2015-05-27 Thread yhuai
. Renaming the title of the session page from `ThriftServer` to `JDBC/ODBC Session`. https://issues.apache.org/jira/browse/SPARK-7907 Author: Yin Huai yh...@databricks.com Closes #6448 from yhuai/JDBCServer and squashes the following commits: eadcc3d [Yin Huai] Update test. 9168005 [Yin Huai] Use

spark git commit: [SPARK-7868] [SQL] Ignores _temporary directories in HadoopFsRelation

2015-05-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 faadbd4d9 - d0bd68ff8 [SPARK-7868] [SQL] Ignores _temporary directories in HadoopFsRelation So that potential partial/corrupted data files left by failed tasks/jobs won't affect normal data scan. Author: Cheng Lian

spark git commit: [SPARK-7805] [SQL] Move SQLTestUtils.scala and ParquetTest.scala to src/test

2015-05-24 Thread yhuai
`SQLTestUtils` and `ParquetTest` in `src/main`. We should only add stuff that will be needed by `sql/console` or Python tests (for Python, we need it in `src/main`, right? davies). Author: Yin Huai yh...@databricks.com Closes #6334 from yhuai/SPARK-7805 and squashes the following commits

spark git commit: [SPARK-7845] [BUILD] Bump Hadoop 1 tests to version 1.2.1

2015-05-24 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 947d700ec - 11d998eb7 [SPARK-7845] [BUILD] Bump Hadoop 1 tests to version 1.2.1 https://issues.apache.org/jira/browse/SPARK-7845 Author: Yin Huai yh...@databricks.com Closes #6384 from yhuai/hadoop1Test and squashes the following

spark git commit: [SPARK-7845] [BUILD] Bump Hadoop 1 tests to version 1.2.1

2015-05-24 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 3c1a2d049 - bfbc0df72 [SPARK-7845] [BUILD] Bump Hadoop 1 tests to version 1.2.1 https://issues.apache.org/jira/browse/SPARK-7845 Author: Yin Huai yh...@databricks.com Closes #6384 from yhuai/hadoop1Test and squashes the following commits

spark git commit: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related updates

2015-05-23 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ad0badba1 - efe3bfdf4 [SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related updates 1. ntile should take an integer as parameter. 2. Added Python API (based on #6364) 3. Update documentation of various DataFrame
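The first item above, `ntile` taking an integer, follows standard SQL semantics: rows are assigned to `n` buckets as evenly as possible, with earlier buckets receiving the extra rows. A sketch of those semantics in plain Python (not Spark's window-function implementation):

```python
def ntile(n_rows: int, buckets: int):
    """Assign rows 1..n_rows to `buckets` groups as evenly as possible;
    earlier buckets get the remainder rows (standard SQL ntile)."""
    if not isinstance(buckets, int) or buckets <= 0:
        raise TypeError("ntile takes a positive integer")
    base, extra = divmod(n_rows, buckets)
    out = []
    for b in range(1, buckets + 1):
        out.extend([b] * (base + (1 if b <= extra else 0)))
    return out

print(ntile(7, 3))   # [1, 1, 1, 2, 2, 3, 3]
```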

spark git commit: [SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related updates

2015-05-23 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 ea9db50bc - d1515381c [SPARK-7322, SPARK-7836, SPARK-7822][SQL] DataFrame window function related updates 1. ntile should take an integer as parameter. 2. Added Python API (based on #6364) 3. Update documentation of various DataFrame

spark git commit: [SPARK-7654] [SQL] Move insertInto into reader/writer interface.

2015-05-23 Thread yhuai
Closes #6366 from yhuai/insert and squashes the following commits: 3d717fb [Yin Huai] Use insertInto to handle the case when table exists and Append is used for saveAsTable. 56d2540 [Yin Huai] Add PreWriteCheck to HiveContext's analyzer. c636e35 [Yin Huai] Remove unnecessary empty lines

spark git commit: [SPARK-7654] [SQL] Move insertInto into reader/writer interface.

2015-05-23 Thread yhuai
...@databricks.com Closes #6366 from yhuai/insert and squashes the following commits: 3d717fb [Yin Huai] Use insertInto to handle the case when table exists and Append is used for saveAsTable. 56d2540 [Yin Huai] Add PreWriteCheck to HiveContext's analyzer. c636e35 [Yin Huai] Remove unnecessary empty lines

spark git commit: [SPARK-7749] [SQL] Fixes partition discovery for non-partitioned tables

2015-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 13348e21b - 8730fbb47 [SPARK-7749] [SQL] Fixes partition discovery for non-partitioned tables When no partition columns can be found, we should have an empty `PartitionSpec`, rather than a `PartitionSpec` with empty partition columns.
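The distinction matters because a `PartitionSpec` with empty partition columns still looks partitioned to downstream code, while an empty spec does not. A toy Python model of the fixed discovery logic (hypothetical names; Spark's real discovery works on Hadoop paths):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PartitionSpec:
    """Toy model: partition columns plus the discovered partitions."""
    partition_columns: List[str] = field(default_factory=list)
    partitions: List[Dict[str, str]] = field(default_factory=list)

def discover_partitions(paths):
    """If no path segment encodes key=value, return an empty
    PartitionSpec rather than a spec with empty partition columns
    (the SPARK-7749 fix, sketched)."""
    cols = sorted({seg.split("=")[0]
                   for p in paths for seg in p.split("/") if "=" in seg})
    if not cols:
        return PartitionSpec()         # non-partitioned data: empty spec
    parts = [{seg.split("=")[0]: seg.split("=")[1]
              for seg in p.split("/") if "=" in seg} for p in paths]
    return PartitionSpec(cols, parts)

print(discover_partitions(["data/file1.parquet", "data/file2.parquet"]))
```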

spark git commit: [SPARK-7749] [SQL] Fixes partition discovery for non-partitioned tables

2015-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 b97a8053a - 70d9839cf [SPARK-7749] [SQL] Fixes partition discovery for non-partitioned tables When no partition columns can be found, we should have an empty `PartitionSpec`, rather than a `PartitionSpec` with empty partition columns.

spark git commit: [SPARK-7763] [SPARK-7616] [SQL] Persists partition columns into metastore

2015-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 33e0e - 96c82515b [SPARK-7763] [SPARK-7616] [SQL] Persists partition columns into metastore Author: Yin Huai yh...@databricks.com Author: Cheng Lian l...@databricks.com Closes #6285 from liancheng/spark-7763 and squashes the

spark git commit: [SPARK-7763] [SPARK-7616] [SQL] Persists partition columns into metastore

2015-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 311fab6f1 - 30f3f556f [SPARK-7763] [SPARK-7616] [SQL] Persists partition columns into metastore Author: Yin Huai yh...@databricks.com Author: Cheng Lian l...@databricks.com Closes #6285 from liancheng/spark-7763 and squashes the following

spark git commit: [SPARK-7718] [SQL] Speed up partitioning by avoiding closure cleaning

2015-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 c9a80fc40 - ba04b5236 [SPARK-7718] [SQL] Speed up partitioning by avoiding closure cleaning According to yhuai we spent 6-7 seconds cleaning closures in a partitioning job that takes 12 seconds. Since we provide these closures

spark git commit: [SPARK-7718] [SQL] Speed up partitioning by avoiding closure cleaning

2015-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 6b18cdc1b - 5287eec5a [SPARK-7718] [SQL] Speed up partitioning by avoiding closure cleaning According to yhuai we spent 6-7 seconds cleaning closures in a partitioning job that takes 12 seconds. Since we provide these closures in Spark we

spark git commit: [SPARK-7565] [SQL] fix MapType in JsonRDD

2015-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master feb3a9d3f - a25c1ab8f [SPARK-7565] [SQL] fix MapType in JsonRDD The key of Map in JsonRDD should be converted into UTF8String (also failed records), Thanks to yhuai viirya Closes #6084 Author: Davies Liu dav...@databricks.com Closes

spark git commit: [SPARK-7565] [SQL] fix MapType in JsonRDD

2015-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 f0e421351 - 3aa618510 [SPARK-7565] [SQL] fix MapType in JsonRDD The key of Map in JsonRDD should be converted into UTF8String (also failed records), Thanks to yhuai viirya Closes #6084 Author: Davies Liu dav...@databricks.com

spark git commit: [SPARK-7320] [SQL] [Minor] Move the testData into beforeAll()

2015-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 1ee8eb431 - feb3a9d3f [SPARK-7320] [SQL] [Minor] Move the testData into beforeAll() Follow up of #6340, to avoid the test report missing once it fails. Author: Cheng Hao hao.ch...@intel.com Closes #6312 from chenghao-intel/rollup_minor

spark git commit: [SPARK-7320] [SQL] [Minor] Move the testData into beforeAll()

2015-05-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 f08c6f319 - f0e421351 [SPARK-7320] [SQL] [Minor] Move the testData into beforeAll() Follow up of #6340, to avoid the test report missing once it fails. Author: Cheng Hao hao.ch...@intel.com Closes #6312 from

spark git commit: [SPARK-7713] [SQL] Use shared broadcast hadoop conf for partitioned table scan.

2015-05-20 Thread yhuai
).explain(true) ``` In our master `explain` takes 40s on my laptop. With this PR, `explain` takes 14s. Author: Yin Huai yh...@databricks.com Closes #6252 from yhuai/broadcastHadoopConf and squashes the following commits: 6fa73df [Yin Huai] Address comments of Josh and Andrew. 807fbf9 [Yin Huai] Make

spark git commit: [SPARK-7713] [SQL] Use shared broadcast hadoop conf for partitioned table scan.

2015-05-20 Thread yhuai
).explain(true) ``` In our master `explain` takes 40s on my laptop. With this PR, `explain` takes 14s. Author: Yin Huai yh...@databricks.com Closes #6252 from yhuai/broadcastHadoopConf and squashes the following commits: 6fa73df [Yin Huai] Address comments of Josh and Andrew. 807fbf9 [Yin Huai

spark git commit: [SPARK-7320] [SQL] Add Cube / Rollup for dataframe

2015-05-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 b6182ce89 - 4fd674336 [SPARK-7320] [SQL] Add Cube / Rollup for dataframe This is a follow up for #6257, which broke the maven test. Add cube rollup for DataFrame For example: ```scala testData.rollup($a + $b, $b).agg(sum($a - $b))

spark git commit: [SPARK-7320] [SQL] Add Cube / Rollup for dataframe

2015-05-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 895baf8f7 - 42c592adb [SPARK-7320] [SQL] Add Cube / Rollup for dataframe This is a follow up for #6257, which broke the maven test. Add cube rollup for DataFrame For example: ```scala testData.rollup($a + $b, $b).agg(sum($a - $b))
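`rollup` aggregates over successively shorter prefixes of the grouping columns, ending with the grand total; `cube` would use every subset. A sketch of the rollup grouping sets in plain Python (semantics only, not the DataFrame API):

```python
def rollup_grouping_sets(cols):
    """Grouping sets produced by ROLLUP(c1..cn): each prefix of the
    column list, longest first, down to the empty (grand-total) set."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]

print(rollup_grouping_sets(["a", "b"]))   # [('a', 'b'), ('a',), ()]
```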

spark git commit: [SPARK-7673] [SQL] WIP: HadoopFsRelation and ParquetRelation2 performance optimizations

2015-05-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 530397ba2 - 9dadf019b [SPARK-7673] [SQL] WIP: HadoopFsRelation and ParquetRelation2 performance optimizations This PR introduces several performance optimizations to `HadoopFsRelation` and `ParquetRelation2`: 1. Moving `FileStatus`

spark git commit: [SQL] [MINOR] use catalyst type converter in ScalaUdf

2015-05-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 e0632ffaf - be66d1924 [SQL] [MINOR] use catalyst type converter in ScalaUdf It's a follow-up of https://github.com/apache/spark/pull/5154; we can speed up Scala UDF evaluation by creating the type converter in advance. Author: Wenchen Fan

spark git commit: [SQL] [MINOR] use catalyst type converter in ScalaUdf

2015-05-17 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ca4257aec - 2f22424e9 [SQL] [MINOR] use catalyst type converter in ScalaUdf It's a follow-up of https://github.com/apache/spark/pull/5154; we can speed up Scala UDF evaluation by creating the type converter in advance. Author: Wenchen Fan

spark git commit: [SPARK-7375] [SQL] Avoid row copying in exchange when sort.serializeMapOutputs takes effect

2015-05-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 0a901dd3a - cde548388 [SPARK-7375] [SQL] Avoid row copying in exchange when sort.serializeMapOutputs takes effect This patch refactors the SQL `Exchange` operator's logic for determining whether map outputs need to be copied before being

spark git commit: [SPARK-7375] [SQL] Avoid row copying in exchange when sort.serializeMapOutputs takes effect

2015-05-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 448ff333f - 21212a27c [SPARK-7375] [SQL] Avoid row copying in exchange when sort.serializeMapOutputs takes effect This patch refactors the SQL `Exchange` operator's logic for determining whether map outputs need to be copied before

spark git commit: [SPARK-7470] [SQL] Spark shell SQLContext crashes without hive

2015-05-07 Thread yhuai
sqlContext import sqlContext.sql ^ ``` yhuai marmbrus Author: Andrew Or and...@databricks.com Closes #5997 from andrewor14/sql-shell-crash and squashes the following commits: 61147e6 [Andrew Or] Also expect NoClassDefFoundError Project: http://git-wip-us.apache.org/repos/asf

spark git commit: [SPARK-7232] [SQL] Add a Substitution batch for spark sql analyzer

2015-05-07 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 1a3e9e982 - bb5872f2d [SPARK-7232] [SQL] Add a Substitution batch for spark sql analyzer Added a new batch named `Substitution` before `Resolution` batch. The motivation for this is there are kind of cases we want to do some

spark git commit: [SPARK-7232] [SQL] Add a Substitution batch for spark sql analyzer

2015-05-07 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 714db2ef5 - f496bf3c5 [SPARK-7232] [SQL] Add a Substitution batch for spark sql analyzer Added a new batch named `Substitution` before `Resolution` batch. The motivation for this is there are kind of cases we want to do some

[2/2] spark git commit: [SPARK-6908] [SQL] Use isolated Hive client

2015-05-07 Thread yhuai
[SPARK-6908] [SQL] Use isolated Hive client This PR switches Spark SQL's Hive support to use the isolated hive client interface introduced by #5851, instead of directly interacting with the client. By using this isolated client we can now allow users to dynamically configure the version of

[1/2] spark git commit: [SPARK-6908] [SQL] Use isolated Hive client

2015-05-07 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 2e8a141b5 - 05454fd8a http://git-wip-us.apache.org/repos/asf/spark/blob/05454fd8/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/ClientWrapper.scala -- diff

[1/2] spark git commit: [SPARK-6908] [SQL] Use isolated Hive client

2015-05-07 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 22ab70e06 - cd1d4110c http://git-wip-us.apache.org/repos/asf/spark/blob/cd1d4110/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/ClientWrapper.scala -- diff --git

spark git commit: [SPARK-6986] [SQL] Use Serializer2 in more cases.

2015-05-07 Thread yhuai
to determine it is handling a key-value pair, a key, or a value. It is safe to use `SparkSqlSerializer2` in more cases. Author: Yin Huai yh...@databricks.com Closes #5849 from yhuai/serializer2MoreCases and squashes the following commits: 53a5eaa [Yin Huai] Josh's comments. 487f540 [Yin Huai

spark git commit: [SPARK-6986] [SQL] Use Serializer2 in more cases.

2015-05-07 Thread yhuai
to determine it is handling a key-value pair, a key, or a value. It is safe to use `SparkSqlSerializer2` in more cases. Author: Yin Huai yh...@databricks.com Closes #5849 from yhuai/serializer2MoreCases and squashes the following commits: 53a5eaa [Yin Huai] Josh's comments. 487f540 [Yin Huai] Use

spark git commit: [SPARK-7470] [SQL] Spark shell SQLContext crashes without hive

2015-05-07 Thread yhuai
sqlContext import sqlContext.sql ^ ``` yhuai marmbrus Author: Andrew Or and...@databricks.com Closes #5997 from andrewor14/sql-shell-crash and squashes the following commits: 61147e6 [Andrew Or] Also expect NoClassDefFoundError (cherry picked from commit

spark git commit: [SPARK-7330] [SQL] avoid NPE at jdbc rdd

2015-05-07 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 4f87e9562 - ed9be06a4 [SPARK-7330] [SQL] avoid NPE at jdbc rdd Thanks to nadavoosh for pointing this out in #5590 Author: Daoyuan Wang daoyuan.w...@intel.com Closes #5877 from adrian-wang/jdbcrdd and squashes the following commits: cc11900
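The class of bug being fixed is dereferencing a SQL NULL read from a JDBC result set. A sketch of null-safe column access in plain Python, with rows modeled as dicts (illustrative; JdbcRDD itself works against a `ResultSet`):

```python
def get_nullable_long(row, column):
    """Read a possibly-NULL numeric column without raising: map SQL NULL
    to None instead of converting it (sketch of the NPE fix)."""
    value = row.get(column)
    return None if value is None else int(value)

assert get_nullable_long({"n": None}, "n") is None
assert get_nullable_long({"n": "42"}, "n") == 42
```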

spark git commit: [SPARK-7330] [SQL] avoid NPE at jdbc rdd

2015-05-07 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 91ce13109 - 84ee348bc [SPARK-7330] [SQL] avoid NPE at jdbc rdd Thanks to nadavoosh for pointing this out in #5590 Author: Daoyuan Wang daoyuan.w...@intel.com Closes #5877 from adrian-wang/jdbcrdd and squashes the following commits: cc11900

spark git commit: [SPARK-7330] [SQL] avoid NPE at jdbc rdd

2015-05-07 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.3 cbf232daa - edcd3643a [SPARK-7330] [SQL] avoid NPE at jdbc rdd Thanks to nadavoosh for pointing this out in #5590 Author: Daoyuan Wang daoyuan.w...@intel.com Closes #5877 from adrian-wang/jdbcrdd and squashes the following commits: cc11900

spark git commit: [HOT-FIX] Move HiveWindowFunctionQuerySuite.scala to hive compatibility dir.

2015-05-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 845d1d4d0 - 774099670 [HOT-FIX] Move HiveWindowFunctionQuerySuite.scala to hive compatibility dir. Author: Yin Huai yh...@databricks.com Closes #5951 from yhuai/fixBuildMaven and squashes the following commits: fdde183 [Yin Huai] Move

spark git commit: [HOT-FIX] Move HiveWindowFunctionQuerySuite.scala to hive compatibility dir.

2015-05-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 2163367ea - 14bcb84e8 [HOT-FIX] Move HiveWindowFunctionQuerySuite.scala to hive compatibility dir. Author: Yin Huai yh...@databricks.com Closes #5951 from yhuai/fixBuildMaven and squashes the following commits: fdde183 [Yin Huai

spark git commit: [SPARK-6201] [SQL] promote string and do widen types for IN

2015-05-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 150f671c2 - c3eb441f5 [SPARK-6201] [SQL] promote string and do widen types for IN huangjs Actually Spark SQL will first go through the analysis phase, in which we do widen types and promote strings, and then optimization, where constant IN
