spark git commit: [SQL] [MINOR] remove internalRowRDD in DataFrame

2015-07-01 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master fc3a6fe67 - 0eee06158 [SQL] [MINOR] remove internalRowRDD in DataFrame Developers are already familiar with `queryExecution.toRDD` as the internal row RDD, and we should not add a new concept. Author: Wenchen Fan cloud0...@outlook.com Closes

spark git commit: [SPARK-8628] [SQL] Race condition in AbstractSparkSQLParser.parse

2015-06-30 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master c1befd780 - b8e5bb6fc [SPARK-8628] [SQL] Race condition in AbstractSparkSQLParser.parse Made lexical initialization a lazy val Author: Vinod K C vinod...@huawei.com Closes #7015 from vinodkc/handle_lexical_initialize_schronization and
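
The commit message above only says that initialization became a `lazy val`. As a minimal sketch of why that can remove a race (this is not the Spark parser; `SketchParser`, `keywords`, and `initCount` are hypothetical names used for illustration), Scala synchronizes the first evaluation of a `lazy val`, so concurrent callers see a fully initialized value and the initializer runs once per instance:

```scala
import java.util.concurrent.atomic.AtomicInteger

object LazyInitSketch {
  // Hypothetical stand-in for a parser with one-time lexer setup.
  class SketchParser {
    // Counts how often the initializer body below actually runs.
    val initCount = new AtomicInteger(0)

    // The compiler guards the first evaluation of a lazy val, so threads
    // racing on the first access still run this block only once.
    lazy val keywords: Set[String] = {
      initCount.incrementAndGet()
      Set("SELECT", "FROM", "WHERE")
    }

    // Returns true if any whitespace-separated token is a known keyword.
    def parse(input: String): Boolean =
      input.split("\\s+").exists(tok => keywords.contains(tok.toUpperCase))
  }

  def main(args: Array[String]): Unit = {
    val parser = new SketchParser
    val threads = (1 to 8).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = parser.parse("select * from t")
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    println(s"initializer ran ${parser.initCount.get} time(s)")
  }
}
```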

spark git commit: [SPARK-8628] [SQL] Race condition in AbstractSparkSQLParser.parse

2015-06-30 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 f9cd5cc1b - 80b0fe200 [SPARK-8628] [SQL] Race condition in AbstractSparkSQLParser.parse Made lexical initialization a lazy val Author: Vinod K C vinod...@huawei.com Closes #7015 from vinodkc/handle_lexical_initialize_schronization

spark git commit: [SPARK-6785] [SQL] fix DateTimeUtils for dates before 1970

2015-06-30 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master d16a94437 - 1e1f33997 [SPARK-6785] [SQL] fix DateTimeUtils for dates before 1970 Hi Michael, this Pull-Request is a follow-up to [PR-6242](https://github.com/apache/spark/pull/6242). I removed the two obsolete test cases from the
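
The excerpt above does not show the fix itself. Purely as an illustration of why dates before 1970 are easy to get wrong (an assumption about the general pitfall, not a description of the `DateTimeUtils` change), converting epoch milliseconds to a day number with truncating division rounds negative values toward zero. The names `EpochDaySketch`, `truncatedDays`, and `flooredDays` are hypothetical, and time zones are ignored (UTC assumed):

```scala
object EpochDaySketch {
  val MillisPerDay: Long = 24L * 60 * 60 * 1000

  // Truncating division rounds toward zero, so an instant shortly before
  // the epoch (negative millis) lands on day 0 instead of day -1.
  def truncatedDays(millis: Long): Long = millis / MillisPerDay

  // Floor division rounds toward negative infinity, which keeps
  // pre-1970 instants on the correct calendar day.
  def flooredDays(millis: Long): Long = Math.floorDiv(millis, MillisPerDay)
}
```

For one millisecond before the epoch, `truncatedDays(-1L)` yields 0 while `flooredDays(-1L)` yields -1; for non-negative inputs the two agree.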

spark git commit: [SPARK-8589] [SQL] cleanup DateTimeUtils

2015-06-29 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4b497a724 - 881662e9c [SPARK-8589] [SQL] cleanup DateTimeUtils Move date-time-related operations into `DateTimeUtils` and rename some methods to make them clearer. Author: Wenchen Fan cloud0...@outlook.com Closes #6980 from

[4/4] spark git commit: [SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF instead of Udf

2015-06-29 Thread marmbrus
[SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF instead of Udf Follow-up of #6902 for being coherent between ```Udf``` and ```UDF``` Author: BenFradet benjamin.fra...@gmail.com Closes #6920 from BenFradet/SPARK-8478 and squashes the following commits: c500f29 [BenFradet]

[2/4] spark git commit: [SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF instead of Udf

2015-06-29 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/931da5c8/sql/core/src/main/scala/org/apache/spark/sql/execution/pythonUDFs.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/pythonUDFs.scala

[3/4] spark git commit: [SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF instead of Udf

2015-06-29 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/931da5c8/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ScalaUdf.scala -- diff --git

[1/4] spark git commit: [SPARK-8478] [SQL] Harmonize UDF-related code to use uniformly UDF instead of Udf

2015-06-29 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master c8ae887ef - 931da5c8a http://git-wip-us.apache.org/repos/asf/spark/blob/931da5c8/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala -- diff

spark git commit: [SPARK-7862] [SQL] Disable the error message redirect to stderr

2015-06-29 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 637b4eeda - c6ba2ea34 [SPARK-7862] [SQL] Disable the error message redirect to stderr This is a follow-up to #6404. ScriptTransformation prints error messages directly to stderr, which could be a disaster for the application log. Author:

spark git commit: [SPARK-8669] [SQL] Fix crash with BINARY (ENUM) fields with Parquet 1.7

2015-06-29 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master ecacb1e88 - 4915e9e3b [SPARK-8669] [SQL] Fix crash with BINARY (ENUM) fields with Parquet 1.7 Patch to fix crash with BINARY fields with ENUM original types. Author: Steven She ste...@canopylabs.com Closes #7048 from

spark git commit: [SPARK-7289] handle project - limit - sort efficiently

2015-06-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b84d4b4df - f04b5672c [SPARK-7289] handle project - limit - sort efficiently make the `TakeOrdered` strategy and operator more general, such that it can optionally handle a projection when necessary Author: Wenchen Fan

spark git commit: [SPARK-8075] [SQL] apply type check interface to more expressions

2015-06-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 7daa70292 - b71d3254e [SPARK-8075] [SQL] apply type check interface to more expressions A follow-up to https://github.com/apache/spark/pull/6405. Note: it's not a big change; much of the diff is because I swapped some code in

spark git commit: [SPARK-7088] [SQL] Fix analysis for 3rd party logical plan.

2015-06-24 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 43e66192f - b84d4b4df [SPARK-7088] [SQL] Fix analysis for 3rd party logical plan. ResolveReferences analysis rule now does not throw when it cannot resolve references in a self-join. Author: Santiago M. Mola sm...@stratio.com Closes

spark git commit: [SPARK-7235] [SQL] Refactor the grouping sets

2015-06-23 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4f7fbefb8 - 7b1450b66 [SPARK-7235] [SQL] Refactor the grouping sets The logical plan `Expand` takes the `output` as a constructor argument, which breaks the reference chain. We need to refactor the code, as well as the column pruning.

spark git commit: [SPARK-8432] [SQL] fix hashCode() and equals() of BinaryType in Row

2015-06-23 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 7b1450b66 - 6f4cadf5e [SPARK-8432] [SQL] fix hashCode() and equals() of BinaryType in Row Also added more tests in LiteralExpressionSuite Author: Davies Liu dav...@databricks.com Closes #6876 from davies/fix_hashcode and squashes the
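
The excerpt does not say how the fix works. As a hedged sketch of the underlying JVM behaviour only: `Array[Byte]` inherits identity-based `equals` and `hashCode`, so two arrays with the same contents compare unequal unless the comparison goes through `java.util.Arrays`. The wrapper `BinaryCell` below is a hypothetical name for illustration and is not Spark's `Row`:

```scala
import java.util.Arrays

object BinaryEqualitySketch {
  // Hypothetical wrapper giving a byte array content-based equality.
  final class BinaryCell(val bytes: Array[Byte]) {
    override def equals(other: Any): Boolean = other match {
      case that: BinaryCell => Arrays.equals(bytes, that.bytes)
      case _                => false
    }

    // Must agree with equals: equal contents give equal hash codes.
    override def hashCode(): Int = Arrays.hashCode(bytes)
  }
}
```

With this wrapper, two cells built from separate arrays holding the same bytes are equal and hash alike, whereas calling `equals` on the raw arrays compares references.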

spark git commit: [SPARK-8300] DataFrame hint for broadcast join.

2015-06-23 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master f0dcbe8a7 - 6ceb16960 [SPARK-8300] DataFrame hint for broadcast join. Users can now do ```scala left.join(broadcast(right), joinKey) ``` to give the query planner a hint that the right DataFrame is small and should be broadcast. Author:

spark git commit: [SPARK-8356] [SQL] Reconcile callUDF and callUdf

2015-06-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b1f3a489e - 50d3242d6 [SPARK-8356] [SQL] Reconcile callUDF and callUdf Deprecates ```callUdf``` in favor of ```callUDF```. Author: BenFradet benjamin.fra...@gmail.com Closes #6902 from BenFradet/SPARK-8356 and squashes the following

spark git commit: [SPARK-7153] [SQL] support all integral type ordinal in GetArrayItem

2015-06-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 1dfb0f7b2 - 860a49ef2 [SPARK-7153] [SQL] support all integral type ordinal in GetArrayItem first convert `ordinal` to `Number`, then convert to int type. Author: Wenchen Fan cloud0...@outlook.com Closes #5706 from cloud-fan/7153 and
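
The two-step conversion described above (to `Number`, then to int) can be sketched in plain Scala. This is an illustration under assumptions, not the `GetArrayItem` code: `OrdinalSketch`, `toIndex`, and `getItem` are hypothetical names, and returning `None` for an out-of-range index is a choice made for this sketch only:

```scala
object OrdinalSketch {
  // Any boxed integral value (Byte, Short, Int, Long) is a java.lang.Number,
  // so one cast followed by intValue covers all of them.
  def toIndex(ordinal: Any): Int = ordinal.asInstanceOf[Number].intValue()

  // Looks up an element by an ordinal of any integral type.
  def getItem[T](items: Seq[T], ordinal: Any): Option[T] = {
    val i = toIndex(ordinal)
    if (i >= 0 && i < items.length) Some(items(i)) else None
  }
}
```

Note that `intValue()` narrows silently, so a `Long` ordinal outside the `Int` range would wrap rather than fail.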

spark git commit: [SPARK-8104] [SQL] auto alias expressions in analyzer

2015-06-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 5d89d9f00 - da7bbb943 [SPARK-8104] [SQL] auto alias expressions in analyzer Currently we auto-alias expressions in the parser. However, during the parsing phase we don't have enough information to choose the right alias. For example, Generator that

spark git commit: [SPARK-8368] [SPARK-8058] [SQL] HiveContext may override the context class loader of the current thread

2015-06-19 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4be53d039 - c5876e529 [SPARK-8368] [SPARK-8058] [SQL] HiveContext may override the context class loader of the current thread https://issues.apache.org/jira/browse/SPARK-8368 Also, I add tests according

spark git commit: [SPARK-8368] [SPARK-8058] [SQL] HiveContext may override the context class loader of the current thread (branch 1.4)

2015-06-19 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 4b2c793a2 - 9ac839366 [SPARK-8368] [SPARK-8058] [SQL] HiveContext may override the context class loader of the current thread (branch 1.4) This is for 1.4 branch (based on https://github.com/apache/spark/pull/6891). Author: Yin Huai

spark git commit: [SPARK-8446] [SQL] Add helper functions for testing SparkPlan physical operators

2015-06-18 Thread marmbrus
an extra column which isn't part of the sort ae1896b [Josh Rosen] Provide implicits automatically a80f9b0 [Josh Rosen] Merge pull request #4 from marmbrus/pr/6885 d9ab1e4 [Michael Armbrust] Add simple resolver c60a44d [Josh Rosen] Manually bind references 996332a [Josh Rosen] Add types so that tests

spark git commit: [SPARK-8077] [SQL] Optimization for TreeNodes with large numbers of children

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 50a0496a4 - 0c1b2df04 [SPARK-8077] [SQL] Optimization for TreeNodes with large numbers of children For example, large IN clauses. Large IN clauses are parsed very slowly. For example, the SQL below (10K items in IN) takes 45-50s. s"SELECT *

spark git commit: [SPARK-8397] [SQL] Allow custom configuration for TestHive

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master a06d9c8e7 - d1069cba4 [SPARK-8397] [SQL] Allow custom configuration for TestHive We encourage people to use TestHive in unit tests, because it's impossible to create more than one HiveContext within one process. The current implementation

spark git commit: [SPARK-8010] [SQL] Promote types to StringType as implicit conversion in non-binary expression of HiveTypeCoercion

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master a46594435 - 98ee3512b [SPARK-8010] [SQL] Promote types to StringType as implicit conversion in non-binary expression of HiveTypeCoercion 1. The query `select coalesce(null, 1, '1') from dual` causes an exception:

spark git commit: [SPARK-7067] [SQL] fix bug when use complex nested fields in ORDER BY

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master a411a40de - 7f05b1fe6 [SPARK-7067] [SQL] fix bug when use complex nested fields in ORDER BY This PR is an improvement on https://github.com/apache/spark/pull/5189. The resolution rule for ORDER BY is: first resolve based on what comes

spark git commit: [SPARK-8306] [SQL] AddJar command needs to set the new class loader to the HiveConf inside executionHive.state.

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 7f05b1fe6 - 302556ff9 [SPARK-8306] [SQL] AddJar command needs to set the new class loader to the HiveConf inside executionHive.state. https://issues.apache.org/jira/browse/SPARK-8306 I will try to add a test later. marmbrus aarondav

spark git commit: [SPARK-8306] [SQL] AddJar command needs to set the new class loader to the HiveConf inside executionHive.state.

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 5aedfa2ce - 73cf5def0 [SPARK-8306] [SQL] AddJar command needs to set the new class loader to the HiveConf inside executionHive.state. https://issues.apache.org/jira/browse/SPARK-8306 I will try to add a test later. marmbrus aarondav

spark git commit: [SPARK-6782] add sbt-revolver plugin

2015-06-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master f005be027 - a46594435 [SPARK-6782] add sbt-revolver plugin to make it easier to start stop http servers in sbt https://issues.apache.org/jira/browse/SPARK-6782 Author: Imran Rashid iras...@cloudera.com Closes #5426 from

spark git commit: [SPARK-8156] [SQL] create table to specific database by 'use dbname'

2015-06-16 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master ca998757e - 0b8c8fdc1 [SPARK-8156] [SQL] create table to specific database by 'use dbname' When I test the following code: hiveContext.sql("use testdb") val df = (1 to 3).map(i => (i, s"val_$i", i * 2)).toDF("a", "b", "c") df.write .format("parquet")

spark git commit: [SPARK-6583] [SQL] Support aggregate functions in ORDER BY

2015-06-15 Thread marmbrus
#5290. Author: Yadong Qi qiyadong2...@gmail.com Author: Michael Armbrust mich...@databricks.com Closes #6816 from marmbrus/pr/5290 and squashes the following commits: 3226a97 [Michael Armbrust] consistent ordering eb8938d [Michael Armbrust] no vars c8b25c1 [Yadong Qi] move the test data. 7f9b736

spark git commit: [SPARK-8358] [SQL] Wait for child resolution when resolving generators

2015-06-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master ea7fd2ff6 - 9073a426e [SPARK-8358] [SQL] Wait for child resolution when resolving generators Author: Michael Armbrust mich...@databricks.com Closes #6811 from marmbrus/aliasExplodeStar and squashes the following commits: fbd2065 [Michael

spark git commit: [SPARK-8358] [SQL] Wait for child resolution when resolving generators

2015-06-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 4634be5a7 - 2805d145e [SPARK-8358] [SQL] Wait for child resolution when resolving generators Author: Michael Armbrust mich...@databricks.com Closes #6811 from marmbrus/aliasExplodeStar and squashes the following commits: fbd2065

spark git commit: [SPARK-8362] [SQL] Add unit tests for +, -, *, /, %

2015-06-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 9073a426e - 53c16b92a [SPARK-8362] [SQL] Add unit tests for +, -, *, /, % Added unit tests for all supported data types for: - Add - Subtract - Multiply - Divide - UnaryMinus - Remainder Fixed bugs caught by the unit tests. Author:

spark git commit: [SPARK-8065] [SQL] Add support for Hive 0.14 metastores

2015-06-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master f3f2a4397 - 4eb48ed1d [SPARK-8065] [SQL] Add support for Hive 0.14 metastores This change has two parts. The first one gets rid of ReflectionMagic. That worked well for the differences between 0.12 and 0.13, but breaks in 0.14, since

spark git commit: [SPARK-8349] [SQL] Use expression constructors (rather than apply) in FunctionRegistry

2015-06-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master a13895339 - 2d71ba4c8 [SPARK-8349] [SQL] Use expression constructors (rather than apply) in FunctionRegistry Author: Reynold Xin r...@databricks.com Closes #6806 from rxin/gs and squashes the following commits: ed1aebb [Reynold Xin]

spark git commit: [SPARK-7915] [SQL] Support specifying the column list for target table in CTAS

2015-06-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master c8d551d54 - 040f223c5 [SPARK-7915] [SQL] Support specifying the column list for target table in CTAS ``` create table t1 (a int, b string) as select key, value from src; desc t1; key int NULL value string NULL ``` Thus Hive

spark git commit: [SPARK-7444] [TESTS] Eliminate noisy css warn/error logs for UISeleniumSuite

2015-06-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 040f223c5 - 95690a17d [SPARK-7444] [TESTS] Eliminate noisy css warn/error logs for UISeleniumSuite Eliminate the following noisy logs for `UISeleniumSuite`: ``` 15/05/07 10:09:50.196 pool-1-thread-1-ScalaTest-running-UISeleniumSuite WARN

spark git commit: [SPARK-7862] [SQL] Fix the deadlock in script transformation for stderr

2015-06-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b9d177c51 - 2dd7f9308 [SPARK-7862] [SQL] Fix the deadlock in script transformation for stderr [Related PR SPARK-7044] (https://github.com/apache/spark/pull/5671) Author: zhichao.li zhichao...@intel.com Closes #6404 from

spark git commit: [SPARK-8317] [SQL] Do not push sort into shuffle in Exchange operator

2015-06-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 767cc94ca - b9d177c51 [SPARK-8317] [SQL] Do not push sort into shuffle in Exchange operator In some cases, Spark SQL pushes sorting operations into the shuffle layer by specifying a key ordering as part of the shuffle dependency. I think

spark git commit: [SPARK-7824] [SQL] Collapse operator reordering and constant folding into a single batch.

2015-06-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 7d669a56f - 7914c720b [SPARK-7824] [SQL] Collapse operator reordering and constant folding into a single batch. SQL ``` select * from tableA join tableB on (a 3 and b = d) or (a 3 and b = e) ``` Plan before modify ``` == Optimized

spark git commit: [SPARK-7158] [SQL] Fix bug of cached data cannot be used in collect() after cache()

2015-06-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 337c16d57 - 767cc94ca [SPARK-7158] [SQL] Fix bug of cached data cannot be used in collect() after cache() When `df.cache()` is called, the `withCachedData` of `QueryExecution` has already been created, which means it will not look up the cached

spark git commit: [SPARK-7637] [SQL] O(N) merge implementation for StructType merge

2015-05-26 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 0463428b6 - 03668348e [SPARK-7637] [SQL] O(N) merge implementation for StructType merge Contribution is my original work and I license the work to the project under the projects open source license. Author: rowan
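
The excerpt gives only the title. As a rough sketch of what a linear-time merge of two field lists can look like (an assumption for illustration, not the `StructType` implementation; `Field` and `merge` are hypothetical, and a type conflict simply throws here), a hash lookup by name replaces the nested scan that would make the merge quadratic:

```scala
object MergeSketch {
  // Hypothetical, flattened stand-in for a struct field.
  final case class Field(name: String, dataType: String)

  // Merges two field lists in expected O(N) time using hash lookups.
  def merge(left: Seq[Field], right: Seq[Field]): Seq[Field] = {
    val rightByName = right.map(f => f.name -> f).toMap
    val leftNames = left.map(_.name).toSet

    // Keep left fields in order, checking each against its right counterpart.
    val mergedLeft = left.map { l =>
      rightByName.get(l.name) match {
        case Some(r) if r.dataType != l.dataType =>
          throw new IllegalArgumentException(
            s"Conflicting types for field ${l.name}")
        case _ => l
      }
    }

    // Append right-only fields, preserving their original order.
    mergedLeft ++ right.filterNot(f => leftNames.contains(f.name))
  }
}
```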

spark git commit: [SPARK-7758] [SQL] Override more configs to avoid failure when connect to a postgre sql

2015-05-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master eac00691d - 31d5d463e [SPARK-7758] [SQL] Override more configs to avoid failure when connect to a postgre sql https://issues.apache.org/jira/browse/SPARK-7758 When initializing `executionHive`, we only mask

spark git commit: [SPARK-7724] [SQL] Support Intersect/Except in Catalyst DSL.

2015-05-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 40989cea0 - e18d623d9 [SPARK-7724] [SQL] Support Intersect/Except in Catalyst DSL. Author: Santiago M. Mola sa...@mola.io Closes #6327 from smola/feature/catalyst-dsl-set-ops and squashes the following commits: 11db778 [Santiago M.

spark git commit: [SPARK-7724] [SQL] Support Intersect/Except in Catalyst DSL.

2015-05-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 31d5d463e - e4aef91fe [SPARK-7724] [SQL] Support Intersect/Except in Catalyst DSL. Author: Santiago M. Mola sa...@mola.io Closes #6327 from smola/feature/catalyst-dsl-set-ops and squashes the following commits: 11db778 [Santiago M.

spark git commit: [SPARK-7834] [SQL] Better window error messages

2015-05-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 821254fb9 - 3c1305107 [SPARK-7834] [SQL] Better window error messages Author: Michael Armbrust mich...@databricks.com Closes #6363 from marmbrus/windowErrors and squashes the following commits: 516b02d [Michael Armbrust] [SPARK-7834

spark git commit: [SPARK-7834] [SQL] Better window error messages

2015-05-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 afde4019b - d7660dc2f [SPARK-7834] [SQL] Better window error messages Author: Michael Armbrust mich...@databricks.com Closes #6363 from marmbrus/windowErrors and squashes the following commits: 516b02d [Michael Armbrust] [SPARK-7834

spark git commit: [SPARK-6743] [SQL] Fix empty projections of cached data

2015-05-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4e5220c31 - 3b68cb043 [SPARK-6743] [SQL] Fix empty projections of cached data Author: Michael Armbrust mich...@databricks.com Closes #6165 from marmbrus/wrongColumn and squashes the following commits: 4fad158 [Michael Armbrust] Merge

spark git commit: [SPARK-6743] [SQL] Fix empty projections of cached data

2015-05-22 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 1a284743e - 427dc04c1 [SPARK-6743] [SQL] Fix empty projections of cached data Author: Michael Armbrust mich...@databricks.com Closes #6165 from marmbrus/wrongColumn and squashes the following commits: 4fad158 [Michael Armbrust] Merge

spark git commit: [SQL] [TEST] udf_java_method failed due to jdk version

2015-05-21 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 4f572008f - f6c486aa4 [SQL] [TEST] udf_java_method failed due to jdk version `java.lang.Math.exp(1.0)` returns different results across JDK versions, so do not use createQueryTest; write a separate test for it instead. ``` jdk version result

spark git commit: [SPARK-7656] [SQL] use CatalystConf in FunctionRegistry

2015-05-19 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 2ef04a162 - 86893390c [SPARK-7656] [SQL] use CatalystConf in FunctionRegistry follow up for #5806 Author: scwf wangf...@huawei.com Closes #6164 from scwf/FunctionRegistry and squashes the following commits: 15e6697 [scwf] use

spark git commit: [SPARK-7656] [SQL] use CatalystConf in FunctionRegistry

2015-05-19 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 386052063 - 60336e3bc [SPARK-7656] [SQL] use CatalystConf in FunctionRegistry follow up for #5806 Author: scwf wangf...@huawei.com Closes #6164 from scwf/FunctionRegistry and squashes the following commits: 15e6697 [scwf] use

spark git commit: [SPARK-7662] [SQL] Resolve correct names for generator in projection

2015-05-19 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 87fa8ccd2 - 62b4c7392 [SPARK-7662] [SQL] Resolve correct names for generator in projection ``` select explode(map(value, key)) from src; ``` Throws exception ``` org.apache.spark.sql.AnalysisException: The number of aliases supplied in

spark git commit: [SPARK-6888] [SQL] Make the jdbc driver handling user-definable

2015-05-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 a0ae8ce01 - b41301a13 [SPARK-6888] [SQL] Make the jdbc driver handling user-definable Replace the DriverQuirks with JdbcDialect(s) (and MySQLDialect/PostgresDialect) and allow developers to change the dialects on the fly (for new

spark git commit: [SPARK-6888] [SQL] Make the jdbc driver handling user-definable

2015-05-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 563bfcc1a - e1ac2a955 [SPARK-6888] [SQL] Make the jdbc driver handling user-definable Replace the DriverQuirks with JdbcDialect(s) (and MySQLDialect/PostgresDialect) and allow developers to change the dialects on the fly (for new JDBCRRDs

spark git commit: [SPARK-7631] [SQL] treenode argString should not print children

2015-05-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master aa31e431f - fc2480ed1 [SPARK-7631] [SQL] treenode argString should not print children spark-sql explain extended select * from ( select key from src union all select key from src) t; now the spark plan will print children in argString

spark git commit: [SPARK-7631] [SQL] treenode argString should not print children

2015-05-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 65d71bd9f - dbd4ec807 [SPARK-7631] [SQL] treenode argString should not print children spark-sql explain extended select * from ( select key from src union all select key from src) t; now the spark plan will print children in

spark git commit: [SPARK-7269] [SQL] Incorrect analysis for aggregation(use semanticEquals)

2015-05-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 dbd4ec807 - d6f5f3791 [SPARK-7269] [SQL] Incorrect analysis for aggregation(use semanticEquals) A modified version of https://github.com/apache/spark/pull/6110, use `semanticEquals` to make it more efficient. Author: Wenchen Fan

spark git commit: [SPARK-7269] [SQL] Incorrect analysis for aggregation(use semanticEquals)

2015-05-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master fc2480ed1 - 103c863c2 [SPARK-7269] [SQL] Incorrect analysis for aggregation(use semanticEquals) A modified version of https://github.com/apache/spark/pull/6110, use `semanticEquals` to make it more efficient. Author: Wenchen Fan

spark git commit: [SPARK-7570] [SQL] Ignores _temporary during partition discovery

2015-05-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master e1ac2a955 - 010a1c278 [SPARK-7570] [SQL] Ignores _temporary during partition discovery

spark git commit: [SPARK-2883] [SQL] ORC data source for Spark SQL

2015-05-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 cf4e04a0c - 65d71bd9f [SPARK-2883] [SQL] ORC data source for Spark SQL This PR updates PR #6135 authored by zhzhan from Hortonworks. This PR implements a Spark SQL data source for accessing ORC files. **NOTE** Although ORC

spark git commit: [SPARK-2883] [SQL] ORC data source for Spark SQL

2015-05-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 9c7e802a5 - aa31e431f [SPARK-2883] [SQL] ORC data source for Spark SQL This PR updates PR #6135 authored by zhzhan from Hortonworks. This PR implements a Spark SQL data source for accessing ORC files. **NOTE** Although ORC is

spark git commit: [SPARK-7567] [SQL] [follow-up] Use a new flag to set output committer based on mapreduce apis

2015-05-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 d6f5f3791 - a385f4b8d [SPARK-7567] [SQL] [follow-up] Use a new flag to set output committer based on mapreduce apis cc liancheng marmbrus Author: Yin Huai yh...@databricks.com Closes #6130 from yhuai/directOutput and squashes

spark git commit: [SPARK-7491] [SQL] Allow configuration of classloader isolation for hive

2015-05-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 564562874 - 2ca60ace8 [SPARK-7491] [SQL] Allow configuration of classloader isolation for hive Author: Michael Armbrust mich...@databricks.com Closes #6167 from marmbrus/configureIsolation and squashes the following commits: 6147cbe

spark git commit: [SPARK-7491] [SQL] Allow configuration of classloader isolation for hive

2015-05-17 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 53d6ab51b - a8556086d [SPARK-7491] [SQL] Allow configuration of classloader isolation for hive Author: Michael Armbrust mich...@databricks.com Closes #6167 from marmbrus/configureIsolation and squashes the following commits: 6147cbe

spark git commit: [SPARK-7548] [SQL] Add explode function for DataFrames

2015-05-14 Thread marmbrus
...@databricks.com Closes #6107 from marmbrus/explodeFunction and squashes the following commits: 7ee2c87 [Michael Armbrust] whitespace 6f80ba3 [Michael Armbrust] Update dataframe.py c176c89 [Michael Armbrust] Merge remote-tracking branch 'origin/master' into explodeFunction 81b5da3 [Michael

spark git commit: [SPARK-7595] [SQL] Window will cause resolve failed with self join

2015-05-14 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 9ab4db29f - c80e0cff2 [SPARK-7595] [SQL] Window will cause resolve failed with self join for example: table: src(key string, value string) sql: with v1 as(select key, count(value) over (partition by key) cnt_val from src), v2

spark git commit: [SQL] Move some classes into packages that are more appropriate.

2015-05-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 d5c52d9ac - acd872bbd [SQL] Move some classes into packages that are more appropriate. JavaTypeInference into catalyst types.DateUtils into catalyst CacheManager into execution DefaultParserDialect into catalyst Author: Reynold Xin

spark git commit: [SQL] Move some classes into packages that are more appropriate.

2015-05-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 59250fe51 - e683182c3 [SQL] Move some classes into packages that are more appropriate. JavaTypeInference into catalyst types.DateUtils into catalyst CacheManager into execution DefaultParserDialect into catalyst Author: Reynold Xin

spark git commit: [SPARK-7303] [SQL] push down project if possible when the child is sort

2015-05-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 51230f2a9 - d5c52d9ac [SPARK-7303] [SQL] push down project if possible when the child is sort Optimize the case of `project(_, sort)`; an example is: `select key from (select * from testData order by key) t` before this PR: ``` ==

spark git commit: [SPARK-7303] [SQL] push down project if possible when the child is sort

2015-05-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master df2fb1305 - 59250fe51 [SPARK-7303] [SQL] push down project if possible when the child is sort Optimize the case of `project(_, sort)`; an example is: `select key from (select * from testData order by key) t` before this PR: ``` == Parsed

spark git commit: [HOTFIX] Use 'new Job' in fsBasedParquet.scala

2015-05-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 aec83949a - d518c0369 [HOTFIX] Use 'new Job' in fsBasedParquet.scala Same issue as #6095 cc liancheng Author: zsxwing zsxw...@gmail.com Closes #6136 from zsxwing/hotfix and squashes the following commits: 4beea54 [zsxwing] Use 'new

[1/2] spark git commit: [SPARK-7567] [SQL] Migrating Parquet data source to FSBasedRelation

2015-05-13 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master bec938f77 - 7ff16e8ab http://git-wip-us.apache.org/repos/asf/spark/blob/7ff16e8a/sql/core/src/test/scala/org/apache/spark/sql/parquet/ParquetFilterSuite.scala -- diff --git

spark git commit: [SPARK-7276] [DATAFRAME] speed up DataFrame.select by collapsing Project

2015-05-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 a23610458 - 8be43f897 [SPARK-7276] [DATAFRAME] speed up DataFrame.select by collapsing Project Author: Wenchen Fan cloud0...@outlook.com Closes #5831 from cloud-fan/7276 and squashes the following commits: ee4a1e1 [Wenchen Fan] fix

spark git commit: [SPARK-7276] [DATAFRAME] speed up DataFrame.select by collapsing Project

2015-05-12 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 65697bbea - 4e290522c [SPARK-7276] [DATAFRAME] speed up DataFrame.select by collapsing Project Author: Wenchen Fan cloud0...@outlook.com Closes #5831 from cloud-fan/7276 and squashes the following commits: ee4a1e1 [Wenchen Fan] fix

spark git commit: [SPARK-7331] [SQL] Re-use HiveConf in HiveQl

2015-05-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.3 2de111a8b - b152c6cc2 [SPARK-7331] [SQL] Re-use HiveConf in HiveQl Author: nitin2goyal nitin2go...@gmail.com Closes #6037 from nitin2goyal/dev-nitin-1.3 and squashes the following commits: 414b80a [nitin2goyal] [SPARK-7331][SQL]

spark git commit: [SPARK-7331] [SQL] Re-use HiveConf in HiveQl

2015-05-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.2 c0bd415bd - d3b7a8b1d [SPARK-7331] [SQL] Re-use HiveConf in HiveQl Re-use HiveConf in HiveQl Author: nitin2goyal nitin2go...@gmail.com Closes #6036 from nitin2goyal/dev-nitin-1.2 and squashes the following commits: 7ff1f9e

spark git commit: [SPARK-7324] [SQL] DataFrame.dropDuplicates

2015-05-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master f9c7580ad - b6bf4f76c [SPARK-7324] [SQL] DataFrame.dropDuplicates This should also close https://github.com/apache/spark/pull/5870 Author: Reynold Xin r...@databricks.com Closes #6066 from rxin/dropDups and squashes the following

spark git commit: [SPARK-7437] [SQL] Fold literal in (item1, item2, ..., literal, ...) into true or false directly

2015-05-11 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 1a664a0d5 - c30982dd8 [SPARK-7437] [SQL] Fold literal in (item1, item2, ..., literal, ...) into true or false directly SQL ``` select key from src where 3 in (4, 5); ``` Before ``` == Optimized Logical Plan == Project [key#12] Filter

spark git commit: [BUILD] Reference fasterxml.jackson.version in sql/core/pom.xml

2015-05-09 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 6c5b9ffda - 5110f3efe [BUILD] Reference fasterxml.jackson.version in sql/core/pom.xml Author: tedyu yuzhih...@gmail.com Closes #6031 from tedyu/master and squashes the following commits: 5c2580c [tedyu] Reference

spark git commit: [BUILD] Reference fasterxml.jackson.version in sql/core/pom.xml

2015-05-09 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3071aac38 - bd74301ff [BUILD] Reference fasterxml.jackson.version in sql/core/pom.xml Author: tedyu yuzhih...@gmail.com Closes #6031 from tedyu/master and squashes the following commits: 5c2580c [tedyu] Reference

spark git commit: Upgrade version of jackson-databind in sql/core/pom.xml

2015-05-09 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 7d0f17208 - 3071aac38 Upgrade version of jackson-databind in sql/core/pom.xml Currently version of jackson-databind in sql/core/pom.xml is 2.3.0 This is older than the version specified in root pom.xml This PR upgrades the version in

spark git commit: [SPARK-7133] [SQL] Implement struct, array, and map field accessor

2015-05-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master a1ec08f7e - 2d05f325d [SPARK-7133] [SQL] Implement struct, array, and map field accessor It's the first step: generalize UnresolvedGetField to support all map, struct, and array TODO: add `apply` in Scala and `__getitem__` in Python, and

spark git commit: [SPARK-7133] [SQL] Implement struct, array, and map field accessor

2015-05-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 0b2c252d0 - f8468c451 [SPARK-7133] [SQL] Implement struct, array, and map field accessor It's the first step: generalize UnresolvedGetField to support all map, struct, and array TODO: add `apply` in Scala and `__getitem__` in Python,

spark git commit: [SPARK-4699] [SQL] Make caseSensitive configurable in spark sql analyzer

2015-05-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 5205eb4c2 - 21bd7222e [SPARK-4699] [SQL] Make caseSensitive configurable in spark sql analyzer based on #3558 Author: Jacky Li jacky.li...@huawei.com Author: wangfei wangf...@huawei.com Author: scwf wangf...@huawei.com Closes #5806

spark git commit: [SPARK-4699] [SQL] Make caseSensitive configurable in spark sql analyzer

2015-05-08 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 90527f560 - 6dad76e5e [SPARK-4699] [SQL] Make caseSensitive configurable in spark sql analyzer based on #3558 Author: Jacky Li jacky.li...@huawei.com Author: wangfei wangf...@huawei.com Author: scwf wangf...@huawei.com Closes #5806 from

spark git commit: [SPARK-5213] [SQL] Remove the duplicated SparkSQLParser

2015-05-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master dec8f5371 - 074d75d4c [SPARK-5213] [SQL] Remove the duplicated SparkSQLParser This is a follow up of #5827 to remove the additional `SparkSQLParser` Author: Cheng Hao hao.ch...@intel.com Closes #5965 from

spark git commit: [SPARK-5213] [SQL] Remove the duplicated SparkSQLParser

2015-05-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 86f141c90 - 2b0c42385 [SPARK-5213] [SQL] Remove the duplicated SparkSQLParser This is a follow up of #5827 to remove the additional `SparkSQLParser` Author: Cheng Hao hao.ch...@intel.com Closes #5965 from

spark git commit: [SPARK-7116] [SQL] [PYSPARK] Remove cache() causing memory leak

2015-05-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 9dcf4f78f - 86f141c90 [SPARK-7116] [SQL] [PYSPARK] Remove cache() causing memory leak This patch simply removes a `cache()` on an intermediate RDD when evaluating Python UDFs. Author: ksonj k...@siberie.de Closes #5973 from

spark git commit: [SPARK-1442] [SQL] [FOLLOW-UP] Address minor comments in Window Function PR (#5604).

2015-05-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 ef835dc52 - 9dcf4f78f [SPARK-1442] [SQL] [FOLLOW-UP] Address minor comments in Window Function PR (#5604). Address marmbrus and scwf's comments in #5604. Author: Yin Huai yh...@databricks.com Closes #5945 from yhuai/windowFollowup

spark git commit: [SPARK-1442] [SQL] [FOLLOW-UP] Address minor comments in Window Function PR (#5604).

2015-05-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 1712a7c70 - 5784c8d95 [SPARK-1442] [SQL] [FOLLOW-UP] Address minor comments in Window Function PR (#5604). Address marmbrus and scwf's comments in #5604. Author: Yin Huai yh...@databricks.com Closes #5945 from yhuai/windowFollowup

spark git commit: [SPARK-5281] [SQL] Registering table on RDD is giving MissingRequirementError

2015-05-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master ea3077f19 - 937ba798c [SPARK-5281] [SQL] Registering table on RDD is giving MissingRequirementError Go through the context classloader when reflecting on user types in ScalaReflection. Replaced calls to `typeOf` with

spark git commit: [SPARK-5281] [SQL] Registering table on RDD is giving MissingRequirementError

2015-05-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 7064ea0cd - 9fd25f7a3 [SPARK-5281] [SQL] Registering table on RDD is giving MissingRequirementError Go through the context classloader when reflecting on user types in ScalaReflection. Replaced calls to `typeOf` with

spark git commit: [SPARK-2155] [SQL] [WHEN D THEN E] [ELSE F] add CaseKeyWhen for CASE a WHEN b THEN c * END

2015-05-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 937ba798c - 35f0173b8 [SPARK-2155] [SQL] [WHEN D THEN E] [ELSE F] add CaseKeyWhen for CASE a WHEN b THEN c * END Avoid translating to CaseWhen and evaluate the key expression many times. Author: Wenchen Fan cloud0...@outlook.com Closes

spark git commit: [SPARK-2155] [SQL] [WHEN D THEN E] [ELSE F] add CaseKeyWhen for CASE a WHEN b THEN c * END

2015-05-07 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.4 9fd25f7a3 - 622a0c51c [SPARK-2155] [SQL] [WHEN D THEN E] [ELSE F] add CaseKeyWhen for CASE a WHEN b THEN c * END Avoid translating to CaseWhen and evaluate the key expression many times. Author: Wenchen Fan cloud0...@outlook.com

[04/13] spark git commit: [SPARK-1442] [SQL] Window Function Support for Spark SQL

2015-05-06 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/b521a3b0/sql/hive/src/test/resources/golden/windowing_windowspec.q (deterministic)-1-6378faf36ffd3f61e61cee6c0cb70e6 -- diff --git

[11/13] spark git commit: [SPARK-1442] [SQL] Window Function Support for Spark SQL

2015-05-06 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/b521a3b0/sql/hive/src/test/resources/golden/windowing.q -- 43. testUnboundedFollowingForRange-0-3cd04e5f2398853c4850f4f86142bb39 -- diff --git

[09/13] spark git commit: [SPARK-1442] [SQL] Window Function Support for Spark SQL

2015-05-06 Thread marmbrus
http://git-wip-us.apache.org/repos/asf/spark/blob/b521a3b0/sql/hive/src/test/resources/golden/windowing_ntile.q (deterministic)-1-a3d352560ac835993001665db6954965 -- diff --git
