spark git commit: [SPARK-12184][PYTHON] Make Python API doc for pivot consistent with Scala doc

2015-12-07 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 84b809445 -> 36282f78b [SPARK-12184][PYTHON] Make Python API doc for pivot consistent with Scala doc In SPARK-11946 the pivot API was changed slightly and its doc was updated, but the doc changes were not made for the Python API. This PR
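
For reference, a minimal Scala sketch of the pivot API whose Python docs this change aligns (data and column names are illustrative):

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("pivot-example"))
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

val courses = Seq(
  (2012, "dotNET", 10000), (2012, "Java", 20000),
  (2013, "dotNET", 48000), (2013, "Java", 30000)
).toDF("year", "course", "earnings")

// Listing the pivot values up front avoids an extra pass over the data
// to compute them.
courses.groupBy("year").pivot("course", Seq("dotNET", "Java")).sum("earnings").show()
```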

spark git commit: [SPARK-12184][PYTHON] Make Python API doc for pivot consistent with Scala doc

2015-12-07 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 c8aa5f201 -> cdeb89b34 [SPARK-12184][PYTHON] Make Python API doc for pivot consistent with Scala doc In SPARK-11946 the pivot API was changed slightly and its doc was updated, but the doc changes were not made for the Python API. This

spark git commit: [SPARK-11678][SQL][DOCS] Document basePath in the programming guide.

2015-12-09 Thread yhuai
https://cloud.githubusercontent.com/assets/2072857/11673132/1ba01192-9dcb-11e5-98d9-ac0b4e92e98c.png) JIRA: https://issues.apache.org/jira/browse/SPARK-11678 Author: Yin Huai <yh...@databricks.com> Closes #10211 from yhuai/basePathDoc. (cherry picked from commit ac8cdf1cdc148bd21290ecf4d4f9874
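
The option documented by this guide change can be used as in the following sketch (paths are illustrative):

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("base-path"))
val sqlContext = new SQLContext(sc)

// With basePath pointing at the table root, the partition column gender
// is still discovered even though only one partition directory is loaded.
val males = sqlContext.read
  .option("basePath", "file:/data/table")
  .parquet("file:/data/table/gender=male")
```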

spark git commit: [SPARK-11678][SQL][DOCS] Document basePath in the programming guide.

2015-12-09 Thread yhuai
https://cloud.githubusercontent.com/assets/2072857/11673132/1ba01192-9dcb-11e5-98d9-ac0b4e92e98c.png) JIRA: https://issues.apache.org/jira/browse/SPARK-11678 Author: Yin Huai <yh...@databricks.com> Closes #10211 from yhuai/basePathDoc. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Com

spark git commit: [SPARK-12250][SQL] Allow users to define a UDAF without providing details of its inputSchema

2015-12-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d9d354ed4 -> bc5f56aa6 [SPARK-12250][SQL] Allow users to define a UDAF without providing details of its inputSchema https://issues.apache.org/jira/browse/SPARK-12250 Author: Yin Huai <yh...@databricks.com> Closes #10236 from yh
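
For context, a minimal UDAF sketch (class and column names are hypothetical); per this change, the declared inputSchema no longer has to spell out the exact input details for type checking:

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

class LongSum extends UserDefinedAggregateFunction {
  def inputSchema: StructType = new StructType().add("value", LongType)
  def bufferSchema: StructType = new StructType().add("sum", LongType)
  def dataType: DataType = LongType
  def deterministic: Boolean = true
  def initialize(buffer: MutableAggregationBuffer): Unit = buffer(0) = 0L
  def update(buffer: MutableAggregationBuffer, input: Row): Unit =
    if (!input.isNullAt(0)) buffer(0) = buffer.getLong(0) + input.getLong(0)
  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit =
    buffer1(0) = buffer1.getLong(0) + buffer2.getLong(0)
  def evaluate(buffer: Row): Any = buffer.getLong(0)
}

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("udaf-example"))
val sqlContext = new SQLContext(sc)
sqlContext.udf.register("long_sum", new LongSum)
```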

spark git commit: [SPARK-12250][SQL] Allow users to define a UDAF without providing details of its inputSchema

2015-12-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 e541f703d -> 594fafc61 [SPARK-12250][SQL] Allow users to define a UDAF without providing details of its inputSchema https://issues.apache.org/jira/browse/SPARK-12250 Author: Yin Huai <yh...@databricks.com> Closes #10236 f

spark git commit: [SPARK-12228][SQL] Try to run execution hive's derby in memory.

2015-12-10 Thread yhuai
new one. It is possible that it can reduce the flakiness of our tests that need to create HiveContext (e.g. HiveSparkSubmitSuite). I will test it more. https://issues.apache.org/jira/browse/SPARK-12228 Author: Yin Huai <yh...@databricks.com> Closes #10204 from yhuai/derbyInMemory. Proj

[2/2] spark git commit: [SPARK-8641][SQL] Native Spark Window functions

2015-12-17 Thread yhuai
in Window functions. cc rxin / yhuai Author: Herman van Hovell <hvanhov...@questtec.nl> Closes #9819 from hvanhovell/SPARK-8641-2. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/658f66e6 Tree: http://git-wip-us.apache.org/rep

spark git commit: [SPARK-12258][SQL] passing null into ScalaUDF

2015-12-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 5d3722f8e -> d09af2cb4 [SPARK-12258][SQL] passing null into ScalaUDF Check nullability and pass values into ScalaUDF accordingly. Closes #10249 Author: Davies Liu Closes #10259 from davies/udf_null. (cherry picked from
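
A sketch of the situation the fix addresses (column and UDF names are hypothetical): a UDF over a primitive parameter cannot observe SQL NULL, so ScalaUDF must account for input nullability instead of feeding null into the primitive slot.

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.udf

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("udf-null"))
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

// A nullable Int column: the second row is NULL.
val df = Seq(Tuple1(Option(1)), Tuple1(Option.empty[Int])).toDF("x")

// Primitive-typed UDF; the fix makes ScalaUDF respect the column's
// nullability rather than passing null into `i: Int`.
val plusOne = udf((i: Int) => i + 1)
df.select(plusOne($"x")).show()
```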

spark git commit: [SPARK-12012][SQL][BRANCH-1.6] Show more comprehensive PhysicalRDD metadata when visualizing SQL query plan

2015-12-10 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 93ef24638 -> e541f703d [SPARK-12012][SQL][BRANCH-1.6] Show more comprehensive PhysicalRDD metadata when visualizing SQL query plan This PR backports PR #10004 to branch-1.6 It adds a private[sql] method metadata to SparkPlan, which

spark git commit: [SPARK-12275][SQL] No plan for BroadcastHint in some condition

2015-12-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 834e71489 -> ed87f6d3b [SPARK-12275][SQL] No plan for BroadcastHint in some condition When SparkStrategies.BasicOperators' "case BroadcastHint(child) => apply(child)" is hit, it only recursively invokes BasicOperators.apply with this
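
For context, a sketch of how a BroadcastHint node enters a plan (table shapes are illustrative):

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.broadcast

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("bcast-hint"))
val sqlContext = new SQLContext(sc)

val large = sqlContext.range(0L, 1000000L).toDF("id")
val small = sqlContext.range(0L, 100L).toDF("id")

// broadcast() wraps small's plan in a BroadcastHint; the fix ensures the
// planner still produces a physical plan when the hinted child is not an
// operator BasicOperators handles directly.
val joined = large.join(broadcast(small), "id")
```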

spark git commit: [SPARK-12275][SQL] No plan for BroadcastHint in some condition

2015-12-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 fbf16da2e -> 94ce5025f [SPARK-12275][SQL] No plan for BroadcastHint in some condition When SparkStrategies.BasicOperators' "case BroadcastHint(child) => apply(child)" is hit, it only recursively invokes BasicOperators.apply with

[2/2] spark git commit: [SPARK-12213][SQL] use multiple partitions for single distinct query

2015-12-13 Thread yhuai
yhuai nongli marmbrus Author: Davies Liu <dav...@databricks.com> Closes #10228 from davies/single_distinct. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/834e7148 Tree: http://git-wip-us.apache.org/repos/asf/spar

[1/2] spark git commit: [SPARK-12213][SQL] use multiple partitions for single distinct query

2015-12-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 2aecda284 -> 834e71489 http://git-wip-us.apache.org/repos/asf/spark/blob/834e7148/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala -- diff --git

spark git commit: [SPARK-12298][SQL] Fix infinite loop in DataFrame.sortWithinPartitions

2015-12-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/master a0ff6d16e -> 1e799d617 [SPARK-12298][SQL] Fix infinite loop in DataFrame.sortWithinPartitions Modifies the String overload to call the Column overload and ensures this is called in a test. Author: Ankur Dave Closes
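
The two overloads in question, in a minimal sketch (column name illustrative):

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("swp"))
val sqlContext = new SQLContext(sc)

val df = sqlContext.range(0L, 100L).toDF("key")

df.sortWithinPartitions("key")      // String overload: previously recursed into itself
df.sortWithinPartitions(df("key"))  // Column overload it now delegates to
```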

spark git commit: [SPARK-12298][SQL] Fix infinite loop in DataFrame.sortWithinPartitions

2015-12-11 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 c2f20469d -> 03d801587 [SPARK-12298][SQL] Fix infinite loop in DataFrame.sortWithinPartitions Modifies the String overload to call the Column overload and ensures this is called in a test. Author: Ankur Dave

spark git commit: [SPARK-12579][SQL] Force user-specified JDBC driver to take precedence

2016-01-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 8f659393b -> 6c83d938c [SPARK-12579][SQL] Force user-specified JDBC driver to take precedence Spark SQL's JDBC data source allows users to specify an explicit JDBC driver to load (using the `driver` argument), but in the current code it's
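
A sketch of the usage affected (URL, table, and driver class are illustrative):

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("jdbc-driver"))
val sqlContext = new SQLContext(sc)

// With the fix, the `driver` option takes precedence over whatever
// driver DriverManager would otherwise select for this URL.
val accounts = sqlContext.read.format("jdbc")
  .option("url", "jdbc:postgresql://dbhost:5432/mydb")
  .option("dbtable", "public.accounts")
  .option("driver", "org.postgresql.Driver")
  .load()
```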

spark git commit: [SPARK-12579][SQL] Force user-specified JDBC driver to take precedence

2016-01-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 b5a1f564a -> 7f37c1e45 [SPARK-12579][SQL] Force user-specified JDBC driver to take precedence Spark SQL's JDBC data source allows users to specify an explicit JDBC driver to load (using the `driver` argument), but in the current code

spark git commit: [SPARK-12589][SQL] Fix UnsafeRowParquetRecordReader to properly set the row length.

2016-01-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 1005ee396 -> 8ac919809 [SPARK-12589][SQL] Fix UnsafeRowParquetRecordReader to properly set the row length. The reader was previously not setting the row length, meaning it was wrong when there were variable-length columns. This problem

spark git commit: [SPARK-12589][SQL] Fix UnsafeRowParquetRecordReader to properly set the row length.

2016-01-04 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d084a2de3 -> 34de24abb [SPARK-12589][SQL] Fix UnsafeRowParquetRecordReader to properly set the row length. The reader was previously not setting the row length, meaning it was wrong when there were variable-length columns. This problem does

spark git commit: [SPARK-12530][BUILD] Fix build break at Spark-Master-Maven-Snapshots from #1293

2015-12-29 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d80cc90b5 -> 8e629b10c [SPARK-12530][BUILD] Fix build break at Spark-Master-Maven-Snapshots from #1293 A compilation error was caused by string concatenations that are not constant. Use a raw string literal to avoid string concatenations

spark git commit: Revert "[SPARK-12006][ML][PYTHON] Fix GMM failure if initialModel is not None"

2016-01-06 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.4 bc397753c -> d4914647a Revert "[SPARK-12006][ML][PYTHON] Fix GMM failure if initialModel is not None" This reverts commit fcd013cf70e7890aa25a8fe3cb6c8b36bf0e1f04. Author: Yin Huai <yh...@databricks.com> Close

spark git commit: [SPARK-12647][SQL] Fix o.a.s.sqlexecution.ExchangeCoordinatorSuite.determining the number of reducers: aggregate operator

2016-01-05 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 d9e4438b5 -> 5afa62b20 [SPARK-12647][SQL] Fix o.a.s.sqlexecution.ExchangeCoordinatorSuite.determining the number of reducers: aggregate operator change expected partition sizes Author: Pete Robbins Closes

spark git commit: [SPARK-8641][SPARK-12455][SQL] Native Spark Window functions - Follow-up (docs & tests)

2015-12-30 Thread yhuai
to add the licenses of these two projects to the licenses directory. They are both under the ASL. srowen any thoughts? cc yhuai Author: Herman van Hovell <hvanhov...@questtec.nl> Closes #10402 from hvanhovell/SPARK-8641-docs. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commi

spark git commit: [SPARK-12580][SQL] Remove string concatenations from usage and extended in @ExpressionDescription

2016-01-07 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 834651835 -> 34dbc8af2 [SPARK-12580][SQL] Remove string concatenations from usage and extended in @ExpressionDescription Use multi-line string literals for ExpressionDescription with ``// scalastyle:off line.size.limit`` and ``//

[1/2] spark git commit: [SPARK-12593][SQL] Converts resolved logical plan back to SQL

2016-01-08 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 659fd9d04 -> d9447cac7 http://git-wip-us.apache.org/repos/asf/spark/blob/d9447cac/sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUDFs.scala -- diff --git

[2/2] spark git commit: [SPARK-12593][SQL] Converts resolved logical plan back to SQL

2016-01-08 Thread yhuai
[SPARK-12593][SQL] Converts resolved logical plan back to SQL This PR tries to enable Spark SQL to convert resolved logical plans back to SQL query strings. For now, the major use case is to canonicalize Spark SQL native view support. The major entry point is `SQLBuilder.toSQL`, which returns

spark git commit: [SPARK-12102][SQL] Cast a non-nullable struct field to a nullable field during analysis

2015-12-22 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 575a13279 -> b374a2583 [SPARK-12102][SQL] Cast a non-nullable struct field to a nullable field during analysis Compare both left and right side of the case expression ignoring nullablity when checking for type equality. Author: Dilip
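
A hypothetical illustration of the pattern this fix allows: the two CASE branches produce structs that differ only in field nullability.

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("case-structs"))
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

// `a` is non-nullable, `b` is nullable.
Seq((1, Option(2))).toDF("a", "b").registerTempTable("t")

// Both branches are struct<x:int>, differing only in the nullability of
// x; type checking now ignores that difference.
sqlContext.sql(
  "SELECT CASE WHEN a > 0 THEN named_struct('x', a) ELSE named_struct('x', b) END FROM t")
```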

spark git commit: [SPARK-11619][SQL] cannot use UDTF in DataFrame.selectExpr

2015-12-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 278281828 -> ee444fe4b [SPARK-11619][SQL] cannot use UDTF in DataFrame.selectExpr Description of the problem from cloud-fan. Actually this line:
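
A minimal sketch of the call that used to fail (column names illustrative):

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("udtf"))
val sqlContext = new SQLContext(sc)

// explode() is a generator (UDTF); selectExpr can now resolve it.
val df = sqlContext.range(0L, 1L).selectExpr("array(1, 2, 3) AS xs")
df.selectExpr("explode(xs) AS x").show()
```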

spark git commit: [SPARK-12218][SQL] Invalid splitting of nested AND expressions in Data Source filter API

2015-12-18 Thread yhuai
nested AND expressions partially. Author: Yin Huai <yh...@databricks.com> Closes #10362 from yhuai/SPARK-12218. (cherry picked from commit 41ee7c57abd9f52065fd7ffb71a8af229603371d) Signed-off-by: Yin Huai <yh...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/re

spark git commit: [SPARK-12218][SQL] Invalid splitting of nested AND expressions in Data Source filter API

2015-12-18 Thread yhuai
nested AND expressions partially. Author: Yin Huai <yh...@databricks.com> Closes #10362 from yhuai/SPARK-12218. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/41ee7c57 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree

spark git commit: [SPARK-12218][SQL] Invalid splitting of nested AND expressions in Data Source filter API

2015-12-18 Thread yhuai
nested AND expressions partially. Author: Yin Huai <yh...@databricks.com> Closes #10362 from yhuai/SPARK-12218. (cherry picked from commit 41ee7c57abd9f52065fd7ffb71a8af229603371d) Signed-off-by: Yin Huai <yh...@databricks.com> Conflicts: sql/core/src/test/scala/org/apache/spar

spark git commit: [SPARK-12218] Fixes ORC conjunction predicate push down

2015-12-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 8d4940092 -> 8e23d8db7 [SPARK-12218] Fixes ORC conjunction predicate push down This PR is a follow-up of PR #10362. Two major changes: 1. The fix introduced in #10362 is OK for Parquet, but may disable ORC PPD in many cases. PR

spark git commit: [SPARK-11394][SQL] Throw IllegalArgumentException for unsupported types in postgresql

2015-12-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 1a91be807 -> 73862a1eb [SPARK-11394][SQL] Throw IllegalArgumentException for unsupported types in postgresql If a DataFrame has BYTE-typed columns, writing to PostgreSQL throws an exception: org.postgresql.util.PSQLException: ERROR: type "byte" does not exist Author:

spark git commit: [SPARK-11394][SQL] Throw IllegalArgumentException for unsupported types in postgresql

2015-12-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 fd202485a -> 85a871818 [SPARK-11394][SQL] Throw IllegalArgumentException for unsupported types in postgresql If a DataFrame has BYTE-typed columns, writing to PostgreSQL throws an exception: org.postgresql.util.PSQLException: ERROR: type "byte" does not exist

spark git commit: [SPARK-11783][SQL] Fixes execution Hive client when using remote Hive metastore

2015-11-24 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 34ca392da -> c7f95df5c [SPARK-11783][SQL] Fixes execution Hive client when using remote Hive metastore When using a remote Hive metastore, `hive.metastore.uris` is set to the metastore URI. However, it overrides

spark git commit: [SPARK-11783][SQL] Fixes execution Hive client when using remote Hive metastore

2015-11-24 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 015569341 -> 3f15ad783 [SPARK-11783][SQL] Fixes execution Hive client when using remote Hive metastore When using a remote Hive metastore, `hive.metastore.uris` is set to the metastore URI. However, it overrides

spark git commit: [SPARK-11998][SQL][TEST-HADOOP2.0] When downloading Hadoop artifacts from maven, we need to try to download the version that is used by Spark

2015-11-26 Thread yhuai
ive-thriftserver -Phive build/sbt -Pyarn -Phadoop-2.2 -Pkinesis-asl -Phive-thriftserver -Phive build/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive-thriftserver -Phive ``` Author: Yin Huai <yh...@databricks.com> Closes #9979 from yhuai/versionsSuite. (cherry picked fr

spark git commit: [SPARK-11998][SQL][TEST-HADOOP2.0] When downloading Hadoop artifacts from maven, we need to try to download the version that is used by Spark

2015-11-26 Thread yhuai
ver -Phive build/sbt -Pyarn -Phadoop-2.2 -Pkinesis-asl -Phive-thriftserver -Phive build/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive-thriftserver -Phive ``` Author: Yin Huai <yh...@databricks.com> Closes #9979 from yhuai/versionsSuite. Project: http://git-wip-us.a

spark git commit: [SPARK-12020][TESTS][TEST-HADOOP2.0] PR builder cannot trigger hadoop 2.0 test

2015-11-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master f57e6c9ef -> b9921524d [SPARK-12020][TESTS][TEST-HADOOP2.0] PR builder cannot trigger hadoop 2.0 test https://issues.apache.org/jira/browse/SPARK-12020 Author: Yin Huai <yh...@databricks.com> Closes #10010 from yhuai/SP

spark git commit: [SPARK-12039] [SQL] Ignore HiveSparkSubmitSuite's "SPARK-9757 Persist Parquet relation with decimal column".

2015-11-29 Thread yhuai
tests, we can disable it while we are investigating the cause. Author: Yin Huai <yh...@databricks.com> Closes #10035 from yhuai/SPARK-12039-ignore. (cherry picked from commit 0ddfe7868948e302858a2b03b50762eaefbeb53e) Signed-off-by: Yin Huai <yh...@databricks.com> Project: http://gi

spark git commit: [SPARK-12039] [SQL] Ignore HiveSparkSubmitSuite's "SPARK-9757 Persist Parquet relation with decimal column".

2015-11-29 Thread yhuai
tests, we can disable it while we are investigating the cause. Author: Yin Huai <yh...@databricks.com> Closes #10035 from yhuai/SPARK-12039-ignore. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0ddfe786 Tree: http://gi

spark git commit: [SPARK-11949][SQL] Set field nullable property for GroupingSets to get correct results for null values

2015-12-01 Thread yhuai
Repository: spark Updated Branches: refs/heads/master a0af0e351 -> c87531b76 [SPARK-11949][SQL] Set field nullable property for GroupingSets to get correct results for null values JIRA: https://issues.apache.org/jira/browse/SPARK-11949 The result of a cube plan uses an incorrect schema. The

spark git commit: Revert "[SPARK-11544][SQL] sqlContext doesn't use PathFilter"

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 6d0848b53 -> 9c0654d36 Revert "[SPARK-11544][SQL] sqlContext doesn't use PathFilter" This reverts commit 54db79702513e11335c33bcf3a03c59e965e6f16. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: Revert "[SPARK-11544][SQL] sqlContext doesn't use PathFilter"

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 cfdd8a1a3 -> 19f4f26f3 Revert "[SPARK-11544][SQL] sqlContext doesn't use PathFilter" This reverts commit 54db79702513e11335c33bcf3a03c59e965e6f16. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11544][SQL] sqlContext doesn't use PathFilter

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 603a721c2 -> 54db79702 [SPARK-11544][SQL] sqlContext doesn't use PathFilter Apply the user-supplied PathFilter while retrieving the files from the filesystem. Author: Dilip Biswal Closes #9652 from dilipbiswal/spark-11544.
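
A sketch of the user-supplied filter being honored (the filter class and `.tmp` suffix are illustrative):

```
import org.apache.hadoop.fs.{Path, PathFilter}
import org.apache.spark.{SparkConf, SparkContext}

// Skip temporary files when listing input paths.
class TmpFileFilter extends PathFilter {
  override def accept(path: Path): Boolean = !path.getName.endsWith(".tmp")
}

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("path-filter"))
sc.hadoopConfiguration.setClass(
  "mapreduce.input.pathFilter.class", classOf[TmpFileFilter], classOf[PathFilter])
// With the fix, sqlContext reads consult TmpFileFilter while listing files.
```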

spark git commit: [SPARK-11544][SQL] sqlContext doesn't use PathFilter

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 464b2d421 -> e8390e1ab [SPARK-11544][SQL] sqlContext doesn't use PathFilter Apply the user-supplied PathFilter while retrieving the files from the filesystem. Author: Dilip Biswal Closes #9652 from dilipbiswal/spark-11544.

spark git commit: [SPARK-11614][SQL] serde parameters should be set only when all params are ready

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 4b4a6bf5c -> 59eaec2d4 [SPARK-11614][SQL] serde parameters should be set only when all params are ready see HIVE-7975 and HIVE-12373. With the changed semantics of setters in Thrift objects in Hive, setters should be called only after all

spark git commit: [SPARK-11614][SQL] serde parameters should be set only when all params are ready

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 67c75828f -> fc3f77b42 [SPARK-11614][SQL] serde parameters should be set only when all params are ready see HIVE-7975 and HIVE-12373. With the changed semantics of setters in Thrift objects in Hive, setters should be called only after all

spark git commit: [SPARK-11817][SQL] Truncating the fractional seconds to prevent inserting a NULL

2015-11-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.5 6fe1ce6ab -> 9a906c1c3 [SPARK-11817][SQL] Truncating the fractional seconds to prevent inserting a NULL JIRA: https://issues.apache.org/jira/browse/SPARK-11817 Instead of returning None, we should truncate the fractional seconds to

spark git commit: [SPARK-11817][SQL] Truncating the fractional seconds to prevent inserting a NULL

2015-11-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/master bef361c58 -> 60bfb1133 [SPARK-11817][SQL] Truncating the fractional seconds to prevent inserting a NULL JIRA: https://issues.apache.org/jira/browse/SPARK-11817 Instead of returning None, we should truncate the fractional seconds to prevent
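
A sketch of the behavior change (timestamp literal illustrative): extra precision is truncated instead of yielding NULL.

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("frac-sec"))
val sqlContext = new SQLContext(sc)

// Nanosecond digits beyond microsecond precision are now truncated
// rather than turning the whole value into NULL.
sqlContext.sql("SELECT CAST('2015-11-20 10:00:00.123456789' AS TIMESTAMP)").show(false)
```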

spark git commit: [SPARK-11817][SQL] Truncating the fractional seconds to prevent inserting a NULL

2015-11-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 3662b9f4c -> 119f92b4e [SPARK-11817][SQL] Truncating the fractional seconds to prevent inserting a NULL JIRA: https://issues.apache.org/jira/browse/SPARK-11817 Instead of returning None, we should truncate the fractional seconds to

spark git commit: [SPARK-11724][SQL] Change casting between int and timestamp to consistently treat int in seconds.

2015-11-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 652def318 -> 9ed4ad426 [SPARK-11724][SQL] Change casting between int and timestamp to consistently treat int in seconds. Hive has since changed this behavior as well. https://issues.apache.org/jira/browse/HIVE-3454 Author: Nong Li
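
A sketch of the now-consistent interpretation (display assumes the session time zone):

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("ts-cast"))
val sqlContext = new SQLContext(sc)

// Ints cast to timestamps are treated as seconds since the epoch, and
// the reverse cast yields seconds again.
sqlContext.sql("SELECT CAST(60 AS TIMESTAMP)").show(false)          // one minute past the epoch
sqlContext.sql("SELECT CAST(CAST(60 AS TIMESTAMP) AS INT)").show()  // 60
```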

spark git commit: [SPARK-11724][SQL] Change casting between int and timestamp to consistently treat int in seconds.

2015-11-20 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 6fc968754 -> 9c8e17984 [SPARK-11724][SQL] Change casting between int and timestamp to consistently treat int in seconds. Hive has since changed this behavior as well. https://issues.apache.org/jira/browse/HIVE-3454 Author: Nong Li

spark git commit: [SPARK-11840][SQL] Restore the 1.5's behavior of planning a single distinct aggregation.

2015-11-19 Thread yhuai
plan. Author: Yin Huai <yh...@databricks.com> Closes #9828 from yhuai/distinctRewriter. (cherry picked from commit 962878843b611fa6229e3ee67bb22e2a4bc283cd) Signed-off-by: Yin Huai <yh...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wi

spark git commit: [SPARK-11840][SQL] Restore the 1.5's behavior of planning a single distinct aggregation.

2015-11-19 Thread yhuai
plan. Author: Yin Huai <yh...@databricks.com> Closes #9828 from yhuai/distinctRewriter. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/96287884 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/96287884 Diff: http:

spark git commit: [SPARK-11275][SQL] Incorrect results when using rollup/cube

2015-11-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 baae1ccc9 -> 70d4edda8 [SPARK-11275][SQL] Incorrect results when using rollup/cube Fixes a bug with grouping sets (including cube/rollup) where aggregates that included grouping expressions would return the wrong (null) result. Also

spark git commit: [SPARK-11275][SQL] Incorrect results when using rollup/cube

2015-11-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 01403aa97 -> 37cff1b1a [SPARK-11275][SQL] Incorrect results when using rollup/cube Fixes a bug with grouping sets (including cube/rollup) where aggregates that included grouping expressions would return the wrong (null) result. Also

spark git commit: [SPARK-11544][SQL][TEST-HADOOP1.0] sqlContext doesn't use PathFilter

2015-11-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 8b34fb0b8 -> a936fa5c5 [SPARK-11544][SQL][TEST-HADOOP1.0] sqlContext doesn't use PathFilter Apply the user-supplied PathFilter while retrieving the files from the filesystem. Author: Dilip Biswal Closes #9830 from

spark git commit: [SPARK-11544][SQL][TEST-HADOOP1.0] sqlContext doesn't use PathFilter

2015-11-19 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ee2140774 -> 7ee7d5a3c [SPARK-11544][SQL][TEST-HADOOP1.0] sqlContext doesn't use PathFilter Apply the user-supplied PathFilter while retrieving the files from the filesystem. Author: Dilip Biswal Closes #9830 from

spark git commit: [SPARK-11628][SQL] support column datatype of char(x) to recognize HiveChar

2015-11-23 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 bad93d9f3 -> 9ce4e97db [SPARK-11628][SQL] support column datatype of char(x) to recognize HiveChar Can someone review my code to make sure I'm not missing anything? Thanks! Author: Xiu Guo Author: Xiu Guo

spark git commit: [SPARK-12558][SQL] AnalysisException when multiple functions applied in GROUP BY clause

2016-01-12 Thread yhuai
Repository: spark Updated Branches: refs/heads/master f14922cff -> dc7b3870f [SPARK-12558][SQL] AnalysisException when multiple functions applied in GROUP BY clause cloud-fan Can you please take a look? In this case, we are failing during check analysis while validating the aggregation
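
A hypothetical query shape that previously tripped check analysis: more than one function call among the grouping expressions.

```
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("groupby-fns"))
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

Seq((1.2, 1L), (3.4, 2L)).toDF("a", "b").registerTempTable("t")

// Two function applications in GROUP BY no longer raise AnalysisException.
sqlContext.sql("SELECT floor(a), ceil(a), sum(b) FROM t GROUP BY floor(a), ceil(a)")
```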

spark git commit: [SPARK-12558][SQL] AnalysisException when multiple functions applied in GROUP BY clause

2016-01-12 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 f71e5cc12 -> dcdc864cf [SPARK-12558][SQL] AnalysisException when multiple functions applied in GROUP BY clause cloud-fan Can you please take a look? In this case, we are failing during check analysis while validating the

spark git commit: Revert "[SPARK-12645][SPARKR] SparkR support hash function"

2016-01-12 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 94b39f777 -> 03e523e52 Revert "[SPARK-12645][SPARKR] SparkR support hash function" This reverts commit 8b5f23043322254c725c703c618ba3d3cc4a4240. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-15839] Fix Maven doc-jar generation when JAVA_7_HOME is set

2016-06-09 Thread yhuai
`mvn clean install -DskipTests=true` when `JAVA_7_HOME` was set. Also manually inspected the effective POM diff to verify that the final POM changes were scoped correctly: https://gist.github.com/JoshRosen/f889d1c236fad14fa25ac4be01654653 /cc vanzin and yhuai for review. Author: Josh Rosen <

spark git commit: [SPARK-15839] Fix Maven doc-jar generation when JAVA_7_HOME is set

2016-06-09 Thread yhuai
`mvn clean install -DskipTests=true` when `JAVA_7_HOME` was set. Also manually inspected the effective POM diff to verify that the final POM changes were scoped correctly: https://gist.github.com/JoshRosen/f889d1c236fad14fa25ac4be01654653 /cc vanzin and yhuai for review. Author: Josh Rosen <

spark git commit: [SPARK-15914][SQL] Add deprecated method back to SQLContext for backward source code compatibility

2016-06-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 e90ba2287 -> 73beb9fb3 [SPARK-15914][SQL] Add deprecated method back to SQLContext for backward source code compatibility ## What changes were proposed in this pull request? Revert partial changes in SPARK-12600, and add some

spark git commit: [SPARK-15247][SQL] Set the default number of partitions for reading parquet schemas

2016-06-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/master bd39ffe35 -> dae4d5db2 [SPARK-15247][SQL] Set the default number of partitions for reading parquet schemas ## What changes were proposed in this pull request? This PR sets the default number of partitions when reading parquet schemas.

spark git commit: [SPARK-15247][SQL] Set the default number of partitions for reading parquet schemas

2016-06-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 24539223b -> 9adba414c [SPARK-15247][SQL] Set the default number of partitions for reading parquet schemas ## What changes were proposed in this pull request? This PR sets the default number of partitions when reading parquet schemas.

spark git commit: [SPARK-15895][SQL] Filters out metadata files while doing partition discovery

2016-06-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 515937046 -> e03c25193 [SPARK-15895][SQL] Filters out metadata files while doing partition discovery ## What changes were proposed in this pull request? Take the following directory layout as an example: ``` dir/ +- p0=0/

spark git commit: [SPARK-15895][SQL] Filters out metadata files while doing partition discovery

2016-06-14 Thread yhuai
Repository: spark Updated Branches: refs/heads/master df4ea6614 -> bd39ffe35 [SPARK-15895][SQL] Filters out metadata files while doing partition discovery ## What changes were proposed in this pull request? Take the following directory layout as an example: ``` dir/ +- p0=0/ |-_metadata
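
A sketch of the filtering rule this change applies during partition discovery (helper name hypothetical): bookkeeping files are not treated as leaf data files.

```
// Names starting with "_" or "." (e.g. _metadata, _common_metadata) are
// summary/bookkeeping files, not data files.
def isDataFile(name: String): Boolean =
  !name.startsWith("_") && !name.startsWith(".")

Seq("part-00000.gz.parquet", "_metadata", "_common_metadata", ".DS_Store")
  .filter(isDataFile)  // keeps only the data file
```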

spark git commit: [SPARK-15887][SQL] Bring back the hive-site.xml support for Spark 2.0

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master c654ae214 -> c4b1ad020 [SPARK-15887][SQL] Bring back the hive-site.xml support for Spark 2.0 ## What changes were proposed in this pull request? Right now, Spark 2.0 does not load hive-site.xml. Based on users' feedback, it seems make

spark git commit: [SPARK-15887][SQL] Bring back the hive-site.xml support for Spark 2.0

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 97fe1d8ee -> b148b0364 [SPARK-15887][SQL] Bring back the hive-site.xml support for Spark 2.0 ## What changes were proposed in this pull request? Right now, Spark 2.0 does not load hive-site.xml. Based on users' feedback, it seems

spark git commit: [SPARK-15530][SQL] Set #parallelism for file listing in listLeafFilesInParallel

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 3b7fb84cf -> 5ad4e32d4 [SPARK-15530][SQL] Set #parallelism for file listing in listLeafFilesInParallel ## What changes were proposed in this pull request? This PR sets the degree of parallelism to prevent file listing in

spark git commit: [SPARK-15530][SQL] Set #parallelism for file listing in listLeafFilesInParallel

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 8c4050a5a -> d9db8a9c8 [SPARK-15530][SQL] Set #parallelism for file listing in listLeafFilesInParallel ## What changes were proposed in this pull request? This PR sets the degree of parallelism to prevent file listing in

spark git commit: [SPARK-15676][SQL] Disallow Column Names as Partition Columns For Hive Tables

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 2a0da84dc -> 8c4050a5a [SPARK-15676][SQL] Disallow Column Names as Partition Columns For Hive Tables What changes were proposed in this pull request? When creating a Hive table (not data source tables), a common error users might

spark git commit: [SPARK-15676][SQL] Disallow Column Names as Partition Columns For Hive Tables

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master a6a18a457 -> 3b7fb84cf [SPARK-15676][SQL] Disallow Column Names as Partition Columns For Hive Tables What changes were proposed in this pull request? When creating a Hive table (not data source tables), a common error users might
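
A sketch of the now-rejected statement (table and column names illustrative):

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]").appName("hive-partitions").enableHiveSupport().getOrCreate()

// Repeating a data column as a partition column on a Hive table now
// fails analysis with an explicit error instead of a confusing one later.
spark.sql("CREATE TABLE tab1 (id int, name string) PARTITIONED BY (id string)")
```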

spark git commit: [SPARK-15663][SQL] SparkSession.catalog.listFunctions shouldn't include the list of built-in functions

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 1a57bf0f4 -> 2841bbac4 [SPARK-15663][SQL] SparkSession.catalog.listFunctions shouldn't include the list of built-in functions ## What changes were proposed in this pull request? SparkSession.catalog.listFunctions currently returns all

spark git commit: [SPARK-15808][SQL] File Format Checking When Appending Data

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 7b9071eea -> 5827b65e2 [SPARK-15808][SQL] File Format Checking When Appending Data What changes were proposed in this pull request? **Issue:** Got wrong results or strange errors when appending data to a table with a mismatched file
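
A sketch of the mismatch that is now caught up front (table name and formats illustrative):

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]").appName("format-check").enableHiveSupport().getOrCreate()

spark.range(10).write.format("parquet").saveAsTable("events")
// Appending with a different source format is now rejected early instead
// of producing wrong results or strange errors.
spark.range(10).write.format("json").mode("append").saveAsTable("events")
```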

spark git commit: [SPARK-15808][SQL] File Format Checking When Appending Data

2016-06-13 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 774014250 -> 55c1fac21 [SPARK-15808][SQL] File Format Checking When Appending Data What changes were proposed in this pull request? **Issue:** Got wrong results or strange errors when appending data to a table with a mismatched file

spark git commit: [SPARK-15636][SQL] Make aggregate expressions more concise in explain

2016-05-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 74c1b79f3 -> 472f16181 [SPARK-15636][SQL] Make aggregate expressions more concise in explain ## What changes were proposed in this pull request? This patch reduces the verbosity of aggregate expressions in explain (but does not actually

spark git commit: [SPARK-15636][SQL] Make aggregate expressions more concise in explain

2016-05-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 a2f68ded2 -> f3570bcea [SPARK-15636][SQL] Make aggregate expressions more concise in explain ## What changes were proposed in this pull request? This patch reduces the verbosity of aggregate expressions in explain (but does not

spark git commit: [SPARK-15594][SQL] ALTER TABLE SERDEPROPERTIES does not respect partition spec

2016-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 776d183c8 -> 4a2fb8b87 [SPARK-15594][SQL] ALTER TABLE SERDEPROPERTIES does not respect partition spec ## What changes were proposed in this pull request? These commands ignore the partition spec and change the storage properties of the
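
A sketch of the command with a partition spec (table, partition, and property are illustrative):

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]").appName("serde-props").enableHiveSupport().getOrCreate()

// With the fix, only this partition's storage properties change; the
// spec is no longer silently ignored.
spark.sql("ALTER TABLE boxes PARTITION (width=3) SET SERDEPROPERTIES ('compress'='true')")
```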

spark git commit: [SPARK-15594][SQL] ALTER TABLE SERDEPROPERTIES does not respect partition spec

2016-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 dc6e94157 -> 80a40e8e2 [SPARK-15594][SQL] ALTER TABLE SERDEPROPERTIES does not respect partition spec ## What changes were proposed in this pull request? These commands ignore the partition spec and change the storage properties of

spark git commit: [SPARK-15658][SQL] UDT serializer should declare its data type as udt instead of udt.sqlType

2016-05-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d67c82e4b -> 2bfed1a0c [SPARK-15658][SQL] UDT serializer should declare its data type as udt instead of udt.sqlType ## What changes were proposed in this pull request? When we build a serializer for a UDT object, we should declare its data

spark git commit: [SPARK-15658][SQL] UDT serializer should declare its data type as udt instead of udt.sqlType

2016-05-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 6347ff512 -> 29b94fdb3 [SPARK-15658][SQL] UDT serializer should declare its data type as udt instead of udt.sqlType ## What changes were proposed in this pull request? When we build a serializer for a UDT object, we should declare its

spark git commit: [SPARK-15622][SQL] Wrap the parent classloader of Janino's classloader in the ParentClassLoader.

2016-05-31 Thread yhuai
iff-bb538fda94224dd0af01d0fd7e1b4ea0R81) and `test-only *ReplSuite -- -z "SPARK-2576 importing implicits"` still passes the test (without the change in `CodeGenerator`, this test does not pass with the change in `ExecutorClassLoader`). Author: Yin Huai <yh...@databricks.com> Closes #13366 from yhuai/SPARK-156

spark git commit: [SPARK-15565][SQL] Add the File Scheme to the Default Value of WAREHOUSE_PATH

2016-05-27 Thread yhuai
user.dir")/spark-warehouse`. Since `System.getProperty("user.dir")` is a local dir, we should explicitly set the scheme to local filesystem. cc yhuai How was this patch tested? Added two test cases Author: gatorsmile <gatorsm...@gmail.com> Closes #13348 from gatorsmile/ad

spark git commit: [SPARK-15565][SQL] Add the File Scheme to the Default Value of WAREHOUSE_PATH

2016-05-27 Thread yhuai
user.dir")/spark-warehouse`. Since `System.getProperty("user.dir")` is a local dir, we should explicitly set the scheme to local filesystem. cc yhuai How was this patch tested? Added two test cases Author: gatorsmile <gatorsm...@gmail.com>

spark git commit: [SPARK-15431][SQL][BRANCH-2.0-TEST] rework the clisuite test cases

2016-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 dcf498e8a -> 9c137b2e3 [SPARK-15431][SQL][BRANCH-2.0-TEST] rework the clisuite test cases ## What changes were proposed in this pull request? This PR reworks the CliSuite test cases for `LIST FILES/JARS` commands. CC yhuai Tha

spark git commit: [SPARK-15431][SQL][BRANCH-2.0-TEST] rework the clisuite test cases

2016-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 21b2605dc -> 019afd9c7 [SPARK-15431][SQL][BRANCH-2.0-TEST] rework the clisuite test cases ## What changes were proposed in this pull request? This PR reworks the CliSuite test cases for `LIST FILES/JARS` commands. CC yhuai Tha

spark git commit: [SPARK-15431][SQL][HOTFIX] ignore 'list' command testcase from CliSuite for now

2016-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 b3845fede -> b430aa98c [SPARK-15431][SQL][HOTFIX] ignore 'list' command testcase from CliSuite for now ## What changes were proposed in this pull request? The test cases for `list` command added in `CliSuite` by PR #13212 cannot run

spark git commit: [SPARK-15431][SQL][HOTFIX] ignore 'list' command testcase from CliSuite for now

2016-05-27 Thread yhuai
Repository: spark Updated Branches: refs/heads/master d5911d117 -> 6f95c6c03 [SPARK-15431][SQL][HOTFIX] ignore 'list' command testcase from CliSuite for now ## What changes were proposed in this pull request? The test cases for `list` command added in `CliSuite` by PR #13212 cannot run in

spark git commit: [SPARK-12988][SQL] Can't drop top level columns that contain dots

2016-05-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 0f2471346 -> 06514d689 [SPARK-12988][SQL] Can't drop top level columns that contain dots ## What changes were proposed in this pull request? Fixes "Can't drop top level columns that contain dots". This work is based on dilipbiswal's
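
A sketch of the previously broken call (column name illustrative; `a.b` is a literal top-level name, not a nested field):

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("drop-dots").getOrCreate()

val df = spark.range(1).toDF("a.b")
df.drop("a.b").printSchema()  // previously a silent no-op; now drops the column
```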

spark git commit: [SPARK-15532][SQL] SQLContext/HiveContext's public constructors should use SparkSession.build.getOrCreate

2016-05-26 Thread yhuai
public constructor to use SparkSession.build.getOrCreate and removes isRootContext from SQLContext. ## How was this patch tested? Existing tests. Author: Yin Huai <yh...@databricks.com> Closes #13310 from yhuai/SPARK-15532. (cherry picked from commit 3ac2363d757cc9cebc627974f17ecda3a263efdf) S

spark git commit: [SPARK-15583][SQL] Disallow altering datasource properties

2016-05-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 702755f92 -> 8e26b74fc [SPARK-15583][SQL] Disallow altering datasource properties ## What changes were proposed in this pull request? Certain table properties (and SerDe properties) are in the protected namespace

spark git commit: [SPARK-15583][SQL] Disallow altering datasource properties

2016-05-26 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 6ab973ec5 -> 3fca635b4 [SPARK-15583][SQL] Disallow altering datasource properties ## What changes were proposed in this pull request? Certain table properties (and SerDe properties) are in the protected namespace `spark.sql.sources.`,

spark git commit: [SPARK-15596][SPARK-15635][SQL] ALTER TABLE RENAME fixes

2016-06-01 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 46d5f7f38 -> 44052a707 [SPARK-15596][SPARK-15635][SQL] ALTER TABLE RENAME fixes ## What changes were proposed in this pull request? **SPARK-15596**: Even after we renamed a cached table, the plan would remain in the cache with the

spark git commit: [SPARK-15596][SPARK-15635][SQL] ALTER TABLE RENAME fixes

2016-06-01 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 5b08ee639 -> 9e2643b21 [SPARK-15596][SPARK-15635][SQL] ALTER TABLE RENAME fixes ## What changes were proposed in this pull request? **SPARK-15596**: Even after we renamed a cached table, the plan would remain in the cache with the old

spark git commit: [SPARK-15622][SQL] Wrap the parent classloader of Janino's classloader in the ParentClassLoader.

2016-05-31 Thread yhuai
R81) and `test-only *ReplSuite -- -z "SPARK-2576 importing implicits"` still passes the test (without the change in `CodeGenerator`, this test does not pass with the change in `ExecutorClassLoader`). Author: Yin Huai <yh...@databricks.com> Closes #13366 from yhuai/SPARK-15622.
