spark git commit: [SPARK-22939][PYSPARK] Support Spark UDF in registerFunction

2018-01-04 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.3 eb99b8ade -> 1f5e3540c [SPARK-22939][PYSPARK] Support Spark UDF in registerFunction ## What changes were proposed in this pull request? ```Python import random from pyspark.sql.functions import udf from pyspark.sql.types import

spark git commit: [SPARK-22939][PYSPARK] Support Spark UDF in registerFunction

2018-01-04 Thread lixiao
Repository: spark Updated Branches: refs/heads/master d5861aba9 -> 5aadbc929 [SPARK-22939][PYSPARK] Support Spark UDF in registerFunction ## What changes were proposed in this pull request? ```Python import random from pyspark.sql.functions import udf from pyspark.sql.types import
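The quoted snippet breaks off mid-import. A minimal PySpark sketch of what the change enables (registering an existing UDF object for SQL use), assuming Spark 2.3+; this is not the PR's exact test case:

```Python
import random

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.getOrCreate()

# A nondeterministic Python UDF, in the spirit of the truncated example above.
random_udf = udf(lambda: random.randint(0, 100), IntegerType()).asNondeterministic()

# SPARK-22939 lets an already-built UDF object be registered for SQL use,
# not just a plain Python callable; spark.udf.register is the equivalent
# entry point shown here.
spark.udf.register("random_udf", random_udf)
spark.sql("SELECT random_udf()").show()
```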

spark git commit: [SPARK-22944][SQL] improve FoldablePropagation

2018-01-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.3 a51212b64 -> f51c8fde8 [SPARK-22944][SQL] improve FoldablePropagation ## What changes were proposed in this pull request? `FoldablePropagation` is a little tricky as it needs to handle attributes that are mis-derived from children,

spark git commit: [SPARK-22944][SQL] improve FoldablePropagation

2018-01-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/master b29702913 -> 7d045c5f0 [SPARK-22944][SQL] improve FoldablePropagation ## What changes were proposed in this pull request? `FoldablePropagation` is a little tricky as it needs to handle attributes that are mis-derived from children, e.g.

[1/2] spark git commit: [SPARK-20960][SQL] make ColumnVector public

2018-01-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.3 79f7263da -> a51212b64 http://git-wip-us.apache.org/repos/asf/spark/blob/a51212b6/sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnarRow.java -- diff --git

[2/2] spark git commit: [SPARK-20960][SQL] make ColumnVector public

2018-01-03 Thread lixiao
[SPARK-20960][SQL] make ColumnVector public ## What changes were proposed in this pull request? move `ColumnVector` and related classes to `org.apache.spark.sql.vectorized`, and improve the document. ## How was this patch tested? existing tests. Author: Wenchen Fan

[2/2] spark git commit: [SPARK-20960][SQL] make ColumnVector public

2018-01-03 Thread lixiao
[SPARK-20960][SQL] make ColumnVector public ## What changes were proposed in this pull request? move `ColumnVector` and related classes to `org.apache.spark.sql.vectorized`, and improve the document. ## How was this patch tested? existing tests. Author: Wenchen Fan

[1/2] spark git commit: [SPARK-20960][SQL] make ColumnVector public

2018-01-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 9a2b65a3c -> b29702913 http://git-wip-us.apache.org/repos/asf/spark/blob/b2970291/sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnarRow.java -- diff --git

spark git commit: [SPARK-22932][SQL] Refactor AnalysisContext

2018-01-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.3 b96248862 -> 27c949d67 [SPARK-22932][SQL] Refactor AnalysisContext ## What changes were proposed in this pull request? Add a `reset` function to ensure the state in `AnalysisContext ` is per-query. ## How was this patch tested? The

spark git commit: [SPARK-20236][SQL] dynamic partition overwrite

2018-01-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.3 a05e85ecb -> b96248862 [SPARK-20236][SQL] dynamic partition overwrite ## What changes were proposed in this pull request? When overwriting a partitioned table with dynamic partition columns, the behavior is different between data

spark git commit: [SPARK-20236][SQL] dynamic partition overwrite

2018-01-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 1a87a1609 -> a66fe36ce [SPARK-20236][SQL] dynamic partition overwrite ## What changes were proposed in this pull request? When overwriting a partitioned table with dynamic partition columns, the behavior is different between data source
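As a hedged sketch of the feature this commit adds (the conf name is the one introduced for SPARK-20236; the table name is hypothetical and assumed to already exist, partitioned by `dt`):

```Python
from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# DYNAMIC mode: only the partitions present in the written data are
# replaced; STATIC (the old behavior) wipes all matching partitions first.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

df = spark.createDataFrame([(1, "2018-01-01"), (2, "2018-01-02")], ["id", "dt"])
df.write.mode("overwrite").insertInto("sales_partitioned")  # hypothetical table
```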

spark git commit: [SPARK-22934][SQL] Make optional clauses order insensitive for CREATE TABLE SQL statement

2018-01-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.3 b96a21324 -> a05e85ecb [SPARK-22934][SQL] Make optional clauses order insensitive for CREATE TABLE SQL statement ## What changes were proposed in this pull request? Currently, our CREATE TABLE syntax requires the EXACT order of

spark git commit: [SPARK-22934][SQL] Make optional clauses order insensitive for CREATE TABLE SQL statement

2018-01-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 247a08939 -> 1a87a1609 [SPARK-22934][SQL] Make optional clauses order insensitive for CREATE TABLE SQL statement ## What changes were proposed in this pull request? Currently, our CREATE TABLE syntax requires the EXACT order of clauses. It
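A short sketch of the relaxed grammar (assuming a SparkSession named `spark`; table names are illustrative): after this change both statements below parse, even though their optional clauses appear in different orders.

```Python
spark.sql("""
  CREATE TABLE t1 (id INT, dt STRING)
  USING parquet
  PARTITIONED BY (dt)
  COMMENT 'clauses in one order'
""")

spark.sql("""
  CREATE TABLE t2 (id INT, dt STRING)
  USING parquet
  COMMENT 'clauses in another order'
  PARTITIONED BY (dt)
""")
```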

spark git commit: [SPARK-22932][SQL] Refactor AnalysisContext

2018-01-01 Thread lixiao
Repository: spark Updated Branches: refs/heads/master e734a4b9c -> e0c090f22 [SPARK-22932][SQL] Refactor AnalysisContext ## What changes were proposed in this pull request? Add a `reset` function to ensure the state in `AnalysisContext ` is per-query. ## How was this patch tested? The

spark git commit: [SPARK-22895][SQL] Push down the deterministic predicates that are after the first non-deterministic

2017-12-30 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ee3af15fe -> cfbe11e81 [SPARK-22895][SQL] Push down the deterministic predicates that are after the first non-deterministic ## What changes were proposed in this pull request? Currently, we do not guarantee an order evaluation of

spark git commit: [SPARK-22363][SQL][TEST] Add unit test for Window spilling

2017-12-30 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ea0a5eef2 -> ee3af15fe [SPARK-22363][SQL][TEST] Add unit test for Window spilling ## What changes were proposed in this pull request? There is already test using window spilling, but the test coverage is not ideal. In this PR the already

spark git commit: [TEST][MINOR] remove redundant `EliminateSubqueryAliases` in test code

2017-12-30 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 14c4a62c1 -> 234d9435d [TEST][MINOR] remove redundant `EliminateSubqueryAliases` in test code ## What changes were proposed in this pull request? The `analyze` method in `implicit class DslLogicalPlan` already includes

spark git commit: [SPARK-22771][SQL] Concatenate binary inputs into a binary output

2017-12-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 2ea17afb6 -> f2b3525c1 [SPARK-22771][SQL] Concatenate binary inputs into a binary output ## What changes were proposed in this pull request? This pr modified `concat` to concat binary inputs into a single binary output. `concat` in the
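A small PySpark sketch of the behavior change (bytearray columns are inferred as BinaryType; `spark` is assumed to be an existing SparkSession):

```Python
from pyspark.sql.functions import col, concat

df = spark.createDataFrame([(bytearray(b"ab"), bytearray(b"cd"))], ["x", "y"])

# After SPARK-22771, concat of binary inputs yields a binary column instead
# of being coerced to string.
df.select(concat(col("x"), col("y")).alias("xy")).printSchema()
```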

spark git commit: [SPARK-22916][SQL] shouldn't bias towards build right if user does not specify

2017-12-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 224375c55 -> cc30ef800 [SPARK-22916][SQL] shouldn't bias towards build right if user does not specify ## What changes were proposed in this pull request? When there are no broadcast hints, the current spark strategies will prefer to

spark git commit: [SPARK-22891][SQL] Make hive client creation thread safe

2017-12-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 796e48c60 -> 67ea11ea0 [SPARK-22891][SQL] Make hive client creation thread safe ## What changes were proposed in this pull request? This is to work around the Hive issue: https://issues.apache.org/jira/browse/HIVE-11935 ## How was this

spark git commit: [SPARK-22818][SQL] csv escape of quote escape

2017-12-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/master cfcd74668 -> ffe6fd77a [SPARK-22818][SQL] csv escape of quote escape ## What changes were proposed in this pull request? Escape of escape should be considered when using the UniVocity csv encoding/decoding library. Ref:
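A hedged sketch of the reader option this change wires through to UniVocity (option name as documented for Spark's CSV source; the input path is hypothetical, and `spark` is an existing SparkSession):

```Python
df = (spark.read
      .option("quote", '"')
      .option("escape", "\\")
      # character used to escape the quote-escape character itself
      .option("charToEscapeQuoteEscaping", "\\")
      .csv("/data/quoted.csv"))
df.show()
```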

spark git commit: [SPARK-22890][TEST] Basic tests for DateTimeOperations

2017-12-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 9c21ece35 -> 613b71a12 [SPARK-22890][TEST] Basic tests for DateTimeOperations ## What changes were proposed in this pull request? Test coverage for `DateTimeOperations`; this is a sub-task of

spark git commit: [SPARK-20392][SQL][FOLLOWUP] should not add extra AnalysisBarrier

2017-12-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 1eebfbe19 -> 755f2f518 [SPARK-20392][SQL][FOLLOWUP] should not add extra AnalysisBarrier ## What changes were proposed in this pull request? I found this problem while auditing the analyzer code. It's dangerous to introduce extra

spark git commit: [SPARK-22904][SQL] Add tests for decimal operations and string casts

2017-12-27 Thread lixiao
Repository: spark Updated Branches: refs/heads/master b8bfce51a -> 774715d5c [SPARK-22904][SQL] Add tests for decimal operations and string casts ## What changes were proposed in this pull request? Test coverage for arithmetic operations leading to: 1. Precision loss 2. Overflow

spark git commit: [SPARK-22894][SQL] DateTimeOperations should accept SQL like string type

2017-12-26 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 9348e6842 -> 91d1b300d [SPARK-22894][SQL] DateTimeOperations should accept SQL like string type ## What changes were proposed in this pull request? `DateTimeOperations` accept

spark git commit: [SPARK-22833][EXAMPLE] Improvement SparkHive Scala Examples

2017-12-26 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ff48b1b33 -> 9348e6842 [SPARK-22833][EXAMPLE] Improvement SparkHive Scala Examples ## What changes were proposed in this pull request? Some improvements: 1. Point out we are using both Spark SQL native syntax and HQL syntax in the example

spark git commit: [SPARK-22893][SQL][HOTFIX] Fix a error message of VersionsSuite

2017-12-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 12d20dd75 -> be03d3ad7 [SPARK-22893][SQL][HOTFIX] Fix a error message of VersionsSuite ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/20064 breaks Jenkins tests because it missed to update one

[2/2] spark git commit: [SPARK-22893][SQL] Unified the data type mismatch message

2017-12-25 Thread lixiao
[SPARK-22893][SQL] Unified the data type mismatch message ## What changes were proposed in this pull request? We should use `dataType.simpleString` to unify the data type mismatch message: Before: ``` spark-sql> select cast(1 as binary); Error in query: cannot resolve 'CAST(1 AS BINARY)' due

[1/2] spark git commit: [SPARK-22893][SQL] Unified the data type mismatch message

2017-12-25 Thread lixiao
Repository: spark Updated Branches: refs/heads/master fba03133d -> 33ae2437b http://git-wip-us.apache.org/repos/asf/spark/blob/33ae2437/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/inConversion.sql.out --

spark git commit: [SPARK-22862] Docs on lazy elimination of columns missing from an encoder

2017-12-21 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.1 0f6862106 -> 1df8020e1 [SPARK-22862] Docs on lazy elimination of columns missing from an encoder This behavior has confused some users, so let's clarify it. Author: Michael Armbrust Closes #20048 from

spark git commit: [SPARK-22862] Docs on lazy elimination of columns missing from an encoder

2017-12-21 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 1e4cca02f -> 1cf3e3a26 [SPARK-22862] Docs on lazy elimination of columns missing from an encoder This behavior has confused some users, so let's clarify it. Author: Michael Armbrust Closes #20048 from

spark git commit: [SPARK-22862] Docs on lazy elimination of columns missing from an encoder

2017-12-21 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 22e1849bc -> 8df1da396 [SPARK-22862] Docs on lazy elimination of columns missing from an encoder This behavior has confused some users, so let's clarify it. Author: Michael Armbrust Closes #20048 from

spark git commit: [SPARK-22042][FOLLOW-UP][SQL] ReorderJoinPredicates can break when child's partitioning is not decided

2017-12-21 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 4e107fdb7 -> fe65361b0 [SPARK-22042][FOLLOW-UP][SQL] ReorderJoinPredicates can break when child's partitioning is not decided ## What changes were proposed in this pull request? This is a followup PR of

[3/3] spark git commit: [SPARK-22822][TEST] Basic tests for WindowFrameCoercion and DecimalPrecision

2017-12-21 Thread lixiao
[SPARK-22822][TEST] Basic tests for WindowFrameCoercion and DecimalPrecision ## What changes were proposed in this pull request? Test coverage for `WindowFrameCoercion` and `DecimalPrecision`; this is a sub-task of [SPARK-22722](https://issues.apache.org/jira/browse/SPARK-22722). ## How was

[2/3] spark git commit: [SPARK-22822][TEST] Basic tests for WindowFrameCoercion and DecimalPrecision

2017-12-21 Thread lixiao
http://git-wip-us.apache.org/repos/asf/spark/blob/4e107fdb/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/decimalPrecision.sql.out -- diff --git

[1/3] spark git commit: [SPARK-22822][TEST] Basic tests for WindowFrameCoercion and DecimalPrecision

2017-12-21 Thread lixiao
Repository: spark Updated Branches: refs/heads/master d3a1d9527 -> 4e107fdb7 http://git-wip-us.apache.org/repos/asf/spark/blob/4e107fdb/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/windowFrameCoercion.sql.out

spark git commit: [SPARK-22849] ivy.retrieve pattern should also consider `classifier`

2017-12-20 Thread lixiao
Repository: spark Updated Branches: refs/heads/master d762d110d -> c89b43118 [SPARK-22849] ivy.retrieve pattern should also consider `classifier` ## What changes were proposed in this pull request? In the previous PR https://github.com/apache/spark/pull/5755#discussion_r157848354, we dropped

spark git commit: [SPARK-22649][PYTHON][SQL] Adding localCheckpoint to Dataset API

2017-12-19 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 6e36d8d56 -> 13268a58f [SPARK-22649][PYTHON][SQL] Adding localCheckpoint to Dataset API ## What changes were proposed in this pull request? This change adds local checkpoint support to datasets and respective bind from Python Dataframe
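A minimal sketch of the new API from the Python side (assuming Spark 2.3+):

```Python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.range(0, 1000)

# localCheckpoint truncates the lineage like checkpoint(), but stores the
# data on the executors rather than in a reliable checkpoint directory.
df = df.localCheckpoint()   # eager by default
print(df.count())
```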

spark git commit: [SPARK-22829] Add new built-in function date_trunc()

2017-12-19 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 3a7494dfe -> 6e36d8d56 [SPARK-22829] Add new built-in function date_trunc() ## What changes were proposed in this pull request? Adding date_trunc() as a built-in function. `date_trunc` is common in other databases, but Spark or Hive does
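A quick sketch of the new built-in (assuming a SparkSession named `spark`):

```Python
# date_trunc(fmt, ts) truncates a timestamp to the given unit
# ('year', 'quarter', 'month', 'week', 'day', 'hour', ...) and
# returns a timestamp.
spark.sql(
    "SELECT date_trunc('month', timestamp '2017-12-19 10:30:45') AS m"
).show(truncate=False)   # 2017-12-01 00:00:00
```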

[2/2] spark git commit: [SPARK-22821][TEST] Basic tests for WidenSetOperationTypes, BooleanEquality, StackCoercion and Division

2017-12-19 Thread lixiao
[SPARK-22821][TEST] Basic tests for WidenSetOperationTypes, BooleanEquality, StackCoercion and Division ## What changes were proposed in this pull request? Test coverage for `WidenSetOperationTypes`, `BooleanEquality`, `StackCoercion` and `Division`; this is a sub-task of

[1/2] spark git commit: [SPARK-22821][TEST] Basic tests for WidenSetOperationTypes, BooleanEquality, StackCoercion and Division

2017-12-19 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ef10f452e -> 6129ffa11 http://git-wip-us.apache.org/repos/asf/spark/blob/6129ffa1/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/division.sql.out -- diff

spark git commit: [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict caused by InferFiltersFromConstraints

2017-12-19 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ee56fc343 -> ef10f452e [SPARK-21652][SQL][FOLLOW-UP] Fix rule conflict caused by InferFiltersFromConstraints ## What changes were proposed in this pull request? The optimizer rule `InferFiltersFromConstraints` could trigger our batch

[3/3] spark git commit: [SPARK-22816][TEST] Basic tests for PromoteStrings and InConversion

2017-12-17 Thread lixiao
[SPARK-22816][TEST] Basic tests for PromoteStrings and InConversion ## What changes were proposed in this pull request? Test coverage for `PromoteStrings` and `InConversion`; this is a sub-task of [SPARK-22722](https://issues.apache.org/jira/browse/SPARK-22722). ## How was this patch tested?

[1/3] spark git commit: [SPARK-22816][TEST] Basic tests for PromoteStrings and InConversion

2017-12-17 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 77988a9d0 -> 7f6d10a73 http://git-wip-us.apache.org/repos/asf/spark/blob/7f6d10a7/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/promoteStrings.sql.out

[2/3] spark git commit: [SPARK-22816][TEST] Basic tests for PromoteStrings and InConversion

2017-12-17 Thread lixiao
http://git-wip-us.apache.org/repos/asf/spark/blob/7f6d10a7/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/inConversion.sql.out -- diff --git

[2/2] spark git commit: [SPARK-22762][TEST] Basic tests for IfCoercion and CaseWhenCoercion

2017-12-15 Thread lixiao
[SPARK-22762][TEST] Basic tests for IfCoercion and CaseWhenCoercion ## What changes were proposed in this pull request? Basic tests for IfCoercion and CaseWhenCoercion ## How was this patch tested? N/A Author: Yuming Wang Closes #19949 from wangyum/SPARK-22762. Project:

[1/2] spark git commit: [SPARK-22762][TEST] Basic tests for IfCoercion and CaseWhenCoercion

2017-12-15 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 9fafa8209 -> 46776234a http://git-wip-us.apache.org/repos/asf/spark/blob/46776234/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/ifCoercion.sql.out --

spark git commit: [SPARK-22800][TEST][SQL] Add a SSB query suite

2017-12-15 Thread lixiao
Repository: spark Updated Branches: refs/heads/master e58f27567 -> 9fafa8209 [SPARK-22800][TEST][SQL] Add a SSB query suite ## What changes were proposed in this pull request? Add a test suite to ensure all the [SSB (Star Schema Benchmark)](https://www.cs.umb.edu/~poneil/StarSchemaB.PDF)

[spark] Git Push Summary

2017-12-15 Thread lixiao
Repository: spark Updated Branches: refs/heads/revert19961 [deleted] e58f27567 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org

spark git commit: Revert "[SPARK-22496][SQL] thrift server adds operation logs"

2017-12-15 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 3775dd31e -> e58f27567 Revert "[SPARK-22496][SQL] thrift server adds operation logs" This reverts commit 0ea2d8c12e49e30df6bbfa57d74134b25f96a196. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: Revert "[SPARK-22496][SQL] thrift server adds operation logs"

2017-12-15 Thread lixiao
Repository: spark Updated Branches: refs/heads/revert19961 [created] e58f27567 Revert "[SPARK-22496][SQL] thrift server adds operation logs" This reverts commit 0ea2d8c12e49e30df6bbfa57d74134b25f96a196. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-22753][SQL] Get rid of dataSource.writeAndRead

2017-12-14 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 3fea5c4f1 -> 3775dd31e [SPARK-22753][SQL] Get rid of dataSource.writeAndRead ## What changes were proposed in this pull request? As the discussion in https://github.com/apache/spark/pull/16481 and

spark git commit: [SPARK-22787][TEST][SQL] Add a TPC-H query suite

2017-12-14 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 0ea2d8c12 -> 3fea5c4f1 [SPARK-22787][TEST][SQL] Add a TPC-H query suite ## What changes were proposed in this pull request? Add a test suite to ensure all the TPC-H queries can be successfully analyzed, optimized and compiled without

spark git commit: [SPARK-22496][SQL] thrift server adds operation logs

2017-12-14 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 59daf91b7 -> 0ea2d8c12 [SPARK-22496][SQL] thrift server adds operation logs ## What changes were proposed in this pull request? since Hive 2.0+ upgrades log4j to log4j2, a lot of

spark git commit: [SPARK-16496][SQL] Add wholetext as option for reading text in SQL.

2017-12-14 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 606ae491e -> 40de176c9 [SPARK-16496][SQL] Add wholetext as option for reading text in SQL. ## What changes were proposed in this pull request? In multiple text analysis problems, it is not often desirable for the rows to be split by
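A short sketch of the option (the path is hypothetical; `spark` is an existing SparkSession):

```Python
# With wholetext=true each input file becomes a single row instead of one
# row per line.
docs = spark.read.option("wholetext", "true").text("/data/articles/")
docs.selectExpr("length(value) AS chars_per_file").show()
```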

spark git commit: [SPARK-22779][SQL] Resolve default values for fallback configs.

2017-12-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/master f8c7c1f21 -> c3dd2a26d [SPARK-22779][SQL] Resolve default values for fallback configs. SQLConf allows some callers to define a custom default value for configs, and that complicates a little bit the handling of fallback config entries,

spark git commit: [SPARK-22600][SQL][FOLLOW-UP] Fix a compilation error in TPCDS q75/q77

2017-12-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/master a83e8e6c2 -> ef9299965 [SPARK-22600][SQL][FOLLOW-UP] Fix a compilation error in TPCDS q75/q77 ## What changes were proposed in this pull request? This PR fixes a compilation error in TPCDS `q75`/`q77` caused by #19813; ```

spark git commit: [SPARK-22772][SQL] Use splitExpressionsWithCurrentInputs to split codes in elt

2017-12-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 0bdb4e516 -> ba0e79f57 [SPARK-22772][SQL] Use splitExpressionsWithCurrentInputs to split codes in elt ## What changes were proposed in this pull request? In SPARK-22550 which fixes 64KB JVM bytecode limit problem with elt,

spark git commit: [SPARK-22763][CORE] SHS: Ignore unknown events and parse through the file

2017-12-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/master c5a4701ac -> 1abcbed67 [SPARK-22763][CORE] SHS: Ignore unknown events and parse through the file ## What changes were proposed in this pull request? As Spark code changes, new events appear in the event log (#19649), and we used to

spark git commit: Revert "[SPARK-21417][SQL] Infer join conditions using propagated constraints"

2017-12-13 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 8eb5609d8 -> c5a4701ac Revert "[SPARK-21417][SQL] Infer join conditions using propagated constraints" This reverts commit 6ac57fd0d1c82b834eb4bf0dd57596b92a99d6de. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-22042][SQL] ReorderJoinPredicates can break when child's partitioning is not decided

2017-12-12 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 874350905 -> 682eb4f2e [SPARK-22042][SQL] ReorderJoinPredicates can break when child's partitioning is not decided ## What changes were proposed in this pull request? See the JIRA description for the bug:

spark git commit: [SPARK-22759][SQL] Filters can be combined iff both are deterministic

2017-12-12 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 6b80ce4fb -> 13e489b67 [SPARK-22759][SQL] Filters can be combined iff both are deterministic ## What changes were proposed in this pull request? The query execution/optimization does not guarantee the expressions are evaluated in order.

spark git commit: [SPARK-19809][SQL][TEST][FOLLOWUP] Move the test case to HiveOrcQuerySuite

2017-12-12 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 0e36ba621 -> 6b80ce4fb [SPARK-19809][SQL][TEST][FOLLOWUP] Move the test case to HiveOrcQuerySuite ## What changes were proposed in this pull request? As a follow-up of #19948 , this PR moves the test case and adds comments. ## How was

spark git commit: Revert "[SPARK-22574][MESOS][SUBMIT] Check submission request parameters"

2017-12-12 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 728a45e5a -> 0230515a2 Revert "[SPARK-22574][MESOS][SUBMIT] Check submission request parameters" This reverts commit 728a45e5a68a20bdd17227edc70e6a38d178af1c. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: Revert "[SPARK-22574][MESOS][SUBMIT] Check submission request parameters"

2017-12-12 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 7a51e7135 -> 704af4bd6 Revert "[SPARK-22574][MESOS][SUBMIT] Check submission request parameters" This reverts commit 7a51e71355485bb176a1387d99ec430c5986cbec. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-22729][SQL] Add getTruncateQuery to JdbcDialect

2017-12-12 Thread lixiao
Repository: spark Updated Branches: refs/heads/master d5007734b -> e6dc5f280 [SPARK-22729][SQL] Add getTruncateQuery to JdbcDialect In order to enable truncate for PostgreSQL databases in Spark JDBC, a change is needed to the query used for truncating a PostgreSQL table. By default,
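`getTruncateQuery` itself is a Scala/Java dialect hook, but its user-visible effect is the JDBC writer's `truncate` option; a hedged Python sketch (URL and table name are hypothetical):

```Python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(0, 10)

# With truncate=true, overwrite issues the dialect's TRUNCATE statement
# (the query is now customizable per dialect) instead of dropping and
# recreating the table.
(df.write
   .format("jdbc")
   .option("url", "jdbc:postgresql://dbhost:5432/appdb")
   .option("dbtable", "public.target_table")
   .option("truncate", "true")
   .mode("overwrite")
   .save())
```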

spark git commit: [SPARK-20557][SQL] Only support TIMESTAMP WITH TIME ZONE for Oracle Dialect

2017-12-11 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 3d82f6eb7 -> a4002651a [SPARK-20557][SQL] Only support TIMESTAMP WITH TIME ZONE for Oracle Dialect ## What changes were proposed in this pull request? In the previous PRs, https://github.com/apache/spark/pull/17832 and

[2/2] spark git commit: [SPARK-22726][TEST] Basic tests for Binary Comparison and ImplicitTypeCasts

2017-12-11 Thread lixiao
[SPARK-22726][TEST] Basic tests for Binary Comparison and ImplicitTypeCasts ## What changes were proposed in this pull request? Before we deliver the Hive compatibility mode, we plan to write a set of test cases that can be easily run in both Spark and Hive sides. We can easily compare whether

[1/2] spark git commit: [SPARK-22726][TEST] Basic tests for Binary Comparison and ImplicitTypeCasts

2017-12-11 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 3f4060c34 -> 3d82f6eb7 http://git-wip-us.apache.org/repos/asf/spark/blob/3d82f6eb/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/implicitTypeCasts.sql.out

spark git commit: [SPARK-22746][SQL] Avoid the generation of useless mutable states by SortMergeJoin

2017-12-11 Thread lixiao
Repository: spark Updated Branches: refs/heads/master a04f2bea6 -> c235b5f97 [SPARK-22746][SQL] Avoid the generation of useless mutable states by SortMergeJoin ## What changes were proposed in this pull request? This PR reduces the number of global mutable variables in generated code of

spark git commit: Revert "[SPARK-22496][SQL] thrift server adds operation logs"

2017-12-11 Thread lixiao
Repository: spark Updated Branches: refs/heads/master bf20abb2d -> a04f2bea6 Revert "[SPARK-22496][SQL] thrift server adds operation logs" This reverts commit 4289ac9d8dbbc45fc2ee6d0250a2113107bf08d0. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-22496][SQL] thrift server adds operation logs

2017-12-10 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ab1b6ee73 -> 4289ac9d8 [SPARK-22496][SQL] thrift server adds operation logs ## What changes were proposed in this pull request? since Hive 2.0+ upgrades log4j to log4j2, a lot of

spark git commit: [SPARK-22279][SQL] Turn on spark.sql.hive.convertMetastoreOrc by default

2017-12-07 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 18b75d465 -> aa1764ba1 [SPARK-22279][SQL] Turn on spark.sql.hive.convertMetastoreOrc by default ## What changes were proposed in this pull request? Like Parquet, this PR aims to turn on `spark.sql.hive.convertMetastoreOrc` by default.

spark git commit: [SPARK-22719][SQL] Refactor ConstantPropagation

2017-12-07 Thread lixiao
Repository: spark Updated Branches: refs/heads/master f41c0a93f -> 18b75d465 [SPARK-22719][SQL] Refactor ConstantPropagation ## What changes were proposed in this pull request? The current time complexity of ConstantPropagation is O(n^2), which can be slow when the query is complex.

spark git commit: [SPARK-22688][SQL] Upgrade Janino version to 3.0.8

2017-12-06 Thread lixiao
Repository: spark Updated Branches: refs/heads/master f110a7f88 -> 8ae004b46 [SPARK-22688][SQL] Upgrade Janino version to 3.0.8 ## What changes were proposed in this pull request? This PR upgrades Janino to version 3.0.8. [Janino

spark git commit: [SPARK-22693][SQL] CreateNamedStruct and InSet should not use global variables

2017-12-06 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 9948b860a -> f110a7f88 [SPARK-22693][SQL] CreateNamedStruct and InSet should not use global variables ## What changes were proposed in this pull request? CreateNamedStruct and InSet are using a global variable which is not needed. This

spark git commit: [SPARK-22720][SS] Make EventTimeWatermark Extend UnaryNode

2017-12-06 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 51066b437 -> effca9868 [SPARK-22720][SS] Make EventTimeWatermark Extend UnaryNode ## What changes were proposed in this pull request? Our Analyzer and Optimizer have multiple rules for `UnaryNode`. After making `EventTimeWatermark` extend

spark git commit: [SPARK-22710] ConfigBuilder.fallbackConf should trigger onCreate function

2017-12-06 Thread lixiao
Repository: spark Updated Branches: refs/heads/master e98f9647f -> 4286cba7d [SPARK-22710] ConfigBuilder.fallbackConf should trigger onCreate function ## What changes were proposed in this pull request? I was looking at the config code today and found that configs defined using

spark git commit: [SPARK-20392][SQL] Set barrier to prevent re-entering a tree

2017-12-05 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 82183f7b5 -> 00d176d2f [SPARK-20392][SQL] Set barrier to prevent re-entering a tree ## What changes were proposed in this pull request? The SQL `Analyzer` goes through the whole query plan even when most of it is already analyzed. This increases

spark git commit: [SPARK-22662][SQL] Failed to prune columns after rewriting predicate subquery

2017-12-05 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 132a3f470 -> 1e17ab83d [SPARK-22662][SQL] Failed to prune columns after rewriting predicate subquery ## What changes were proposed in this pull request? As a simple example: ``` spark-sql> create table base (a int, b int) using parquet;

spark git commit: [SPARK-22500][SQL][FOLLOWUP] cast for struct can split code even with whole stage codegen

2017-12-05 Thread lixiao
Repository: spark Updated Branches: refs/heads/master ced6ccf0d -> 132a3f470 [SPARK-22500][SQL][FOLLOWUP] cast for struct can split code even with whole stage codegen ## What changes were proposed in this pull request? A followup of https://github.com/apache/spark/pull/19730, we can split

spark git commit: [SPARK-22701][SQL] add ctx.splitExpressionsWithCurrentInputs

2017-12-05 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 03fdc92e4 -> ced6ccf0d [SPARK-22701][SQL] add ctx.splitExpressionsWithCurrentInputs ## What changes were proposed in this pull request? This pattern appears many times in the codebase: ``` if (ctx.INPUT_ROW == null || ctx.currentVars !=

spark git commit: [SPARK-22665][SQL] Avoid repartitioning with empty list of expressions

2017-12-04 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 1d5597b40 -> 3887b7eef [SPARK-22665][SQL] Avoid repartitioning with empty list of expressions ## What changes were proposed in this pull request? Repartitioning by empty set of expressions is currently possible, even though it is a case

spark git commit: [SPARK-22626][SQL][FOLLOWUP] improve documentation and simplify test case

2017-12-04 Thread lixiao
Repository: spark Updated Branches: refs/heads/master e1dd03e42 -> 1d5597b40 [SPARK-22626][SQL][FOLLOWUP] improve documentation and simplify test case ## What changes were proposed in this pull request? This PR improves documentation for not using zero `numRows` statistics and simplifies

spark git commit: [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior changes in migration section

2017-12-03 Thread lixiao
Repository: spark Updated Branches: refs/heads/master dff440f1e -> 4131ad03f [SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior changes in migration section ## What changes were proposed in this pull request? Update broadcast behavior changes in migration section. ## How was this patch

spark git commit: [SPARK-22601][SQL] Data load is getting displayed successful on providing non existing nonlocal file path

2017-11-30 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 af8a692d6 -> ba00bd961 [SPARK-22601][SQL] Data load is getting displayed successful on providing non existing nonlocal file path ## What changes were proposed in this pull request? When a user tries to load data with a non-existent HDFS

spark git commit: [SPARK-22601][SQL] Data load is getting displayed successful on providing non existing nonlocal file path

2017-11-30 Thread lixiao
Repository: spark Updated Branches: refs/heads/master dc365422b -> 16adaf634 [SPARK-22601][SQL] Data load is getting displayed successful on providing non existing nonlocal file path ## What changes were proposed in this pull request? When a user tries to load data with a non-existent HDFS

spark git commit: [SPARK-22614] Dataset API: repartitionByRange(...)

2017-11-30 Thread lixiao
Repository: spark Updated Branches: refs/heads/master bcceab649 -> f5f8e84d9 [SPARK-22614] Dataset API: repartitionByRange(...) ## What changes were proposed in this pull request? This PR introduces a way to explicitly range-partition a Dataset. So far, only round-robin and hash
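The commit targets the Scala Dataset API; the PySpark binding shown below arrived in a later release, so treat this as an illustrative sketch only:

```Python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.range(0, 100)

# Range partitioning splits rows into contiguous, sorted key ranges
# rather than hashing them.
ranged = df.repartitionByRange(4, "id")
print(ranged.rdd.getNumPartitions())   # 4
```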

spark git commit: [SPARK-22489][SQL] Shouldn't change broadcast join buildSide if user clearly specified

2017-11-30 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 6ac57fd0d -> bcceab649 [SPARK-22489][SQL] Shouldn't change broadcast join buildSide if user clearly specified ## What changes were proposed in this pull request? How to reproduce: ```scala import
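A minimal sketch of the user-facing side of this fix: an explicit broadcast hint, which the planner should now honor as the build side (assuming a SparkSession named `spark`).

```Python
from pyspark.sql.functions import broadcast

small = spark.range(0, 10)
large = spark.range(0, 1000000)

# The hint pins the broadcast (build) side; SPARK-22489 stops the planner
# from overriding it when the user specified one explicitly.
large.join(broadcast(small), "id").explain()
```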

spark git commit: [SPARK-21417][SQL] Infer join conditions using propagated constraints

2017-11-30 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 999ec137a -> 6ac57fd0d [SPARK-21417][SQL] Infer join conditions using propagated constraints ## What changes were proposed in this pull request? This PR adds an optimization rule that infers join conditions using propagated constraints.

spark git commit: [SPARK-22615][SQL] Handle more cases in PropagateEmptyRelation

2017-11-29 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 20b239845 -> 57687280d [SPARK-22615][SQL] Handle more cases in PropagateEmptyRelation ## What changes were proposed in this pull request? Currently, in the optimizer rule `PropagateEmptyRelation`, the following cases are not handled: 1.

spark git commit: [SPARK-22637][SQL] Only refresh a logical plan once.

2017-11-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 eef72d3f0 -> 38a0532cf [SPARK-22637][SQL] Only refresh a logical plan once. ## What changes were proposed in this pull request? `CatalogImpl.refreshTable` uses `foreach(..)` to refresh all tables in a view. This traverses all nodes in

spark git commit: [SPARK-22637][SQL] Only refresh a logical plan once.

2017-11-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/master a10b328db -> 475a29f11 [SPARK-22637][SQL] Only refresh a logical plan once. ## What changes were proposed in this pull request? `CatalogImpl.refreshTable` uses `foreach(..)` to refresh all tables in a view. This traverses all nodes in the

spark git commit: [SPARK-22515][SQL] Estimation relation size based on numRows * rowSize

2017-11-28 Thread lixiao
Repository: spark Updated Branches: refs/heads/master b70e483cb -> da3557429 [SPARK-22515][SQL] Estimation relation size based on numRows * rowSize ## What changes were proposed in this pull request? Currently, relation size is computed as the sum of file size, which is error-prone because

spark git commit: [SPARK-22602][SQL] remove ColumnVector#loadBytes

2017-11-26 Thread lixiao
Repository: spark Updated Branches: refs/heads/master d49d9e403 -> 5a02e3a2a [SPARK-22602][SQL] remove ColumnVector#loadBytes ## What changes were proposed in this pull request? `ColumnVector#loadBytes` is only used as an optimization for reading UTF8String in `WritableColumnVector`, this

spark git commit: [SPARK-22604][SQL] remove the get address methods from ColumnVector

2017-11-24 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 70221903f -> e3fd93f14 [SPARK-22604][SQL] remove the get address methods from ColumnVector ## What changes were proposed in this pull request? `nullsNativeAddress` and `valuesNativeAddress` are only used in tests and benchmark, no need

spark git commit: [SPARK-22596][SQL] set ctx.currentVars in CodegenSupport.consume

2017-11-24 Thread lixiao
Repository: spark Updated Branches: refs/heads/master a1877f45c -> 70221903f [SPARK-22596][SQL] set ctx.currentVars in CodegenSupport.consume ## What changes were proposed in this pull request? `ctx.currentVars` means the input variables for the current operator, which is already decided in

spark git commit: [SPARK-22592][SQL] cleanup filter converting for hive

2017-11-23 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 42f83d7c4 -> c1217565e [SPARK-22592][SQL] cleanup filter converting for hive ## What changes were proposed in this pull request? We have two different methods to convert filters for Hive, depending on a config. This introduces duplicated and

spark git commit: [SPARK-22543][SQL] fix java 64kb compile error for deeply nested expressions

2017-11-22 Thread lixiao
Repository: spark Updated Branches: refs/heads/master 327d25fe1 -> 0605ad761 [SPARK-22543][SQL] fix java 64kb compile error for deeply nested expressions ## What changes were proposed in this pull request? A frequently reported issue of Spark is the Java 64kb compile error. This is because

spark git commit: [SPARK-17920][SPARK-19580][SPARK-19878][SQL] Backport PR 19779 to branch-2.2 - Support writing to Hive table which uses Avro schema url 'avro.schema.url'

2017-11-22 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 df9228b49 -> b17f4063c [SPARK-17920][SPARK-19580][SPARK-19878][SQL] Backport PR 19779 to branch-2.2 - Support writing to Hive table which uses Avro schema url 'avro.schema.url' ## What changes were proposed in this pull request? >
