spark git commit: [SPARK-15424][SPARK-15437][SPARK-14807][SQL] Revert Create a hivecontext-compatibility module

2016-05-20 Thread rxin
<r...@databricks.com> Closes #13207 from rxin/SPARK-15424. (cherry picked from commit 45b7557e61d440612d4ce49c31b5ef242fdefa54) Signed-off-by: Reynold Xin <r...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c0cc

spark git commit: [SPARK-15424][SPARK-15437][SPARK-14807][SQL] Revert Create a hivecontext-compatibility module

2016-05-20 Thread rxin
ks.com> Closes #13207 from rxin/SPARK-15424. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/45b7557e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/45b7557e Diff: http://git-wip-us.apache.org/repos/asf/spark/diff

spark git commit: [SPARK-15454][SQL] Filter out files starting with _

2016-05-20 Thread rxin
How was this patch tested? Added a unit test case. Author: Reynold Xin <r...@databricks.com> Closes #13227 from rxin/SPARK-15454. (cherry picked from commit dcac8e6f49918a809fb3f2b8bf666582c479a6eb) Signed-off-by: Reynold Xin <r...@databricks.com> Project: http://git-wip-us.apach

spark git commit: [SPARK-15454][SQL] Filter out files starting with _

2016-05-20 Thread rxin
How was this patch tested? Added a unit test case. Author: Reynold Xin <r...@databricks.com> Closes #13227 from rxin/SPARK-15454. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dcac8e6f Tree: http://git-wip-us.apache.org/
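
As a rough illustration of the filtering rule named in the subject line, the sketch below drops file names that begin with an underscore. The helper name `visible_data_files` and the sample file names are hypothetical. This is not Spark's code, and the real rule may cover cases the truncated message does not show.

```python
def visible_data_files(names):
    """Return the names that do not start with an underscore.

    Hypothetical helper for illustration only; not part of Spark.
    """
    return [name for name in names if not name.startswith("_")]


# Sample listing (made-up names): marker files start with "_".
listing = ["part-00000", "_SUCCESS", "_metadata", "part-00001"]
data_files = visible_data_files(listing)
```

Here `data_files` keeps only the two `part-` entries, in their original order.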

spark git commit: [SPARK-15438][SQL] improve explain of whole stage codegen

2016-05-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 3ed9ba6e1 -> 89e29870b [SPARK-15438][SQL] improve explain of whole stage codegen ## What changes were proposed in this pull request? Currently, the explain of a query with whole-stage codegen looks like this ``` >>> df =

spark git commit: [SPARK-15438][SQL] improve explain of whole stage codegen

2016-05-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2ba3ff044 -> 0e70fd61b [SPARK-15438][SQL] improve explain of whole stage codegen ## What changes were proposed in this pull request? Currently, the explain of a query with whole-stage codegen looks like this ``` >>> df =

spark git commit: [SPARK-15400][SQL] CreateNamedStruct and CreateNamedStructUnsafe should preserve metadata of value expressions if it is NamedExpression.

2016-05-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 a879e7c32 -> 0dd3bdc27 [SPARK-15400][SQL] CreateNamedStruct and CreateNamedStructUnsafe should preserve metadata of value expressions if it is NamedExpression. ## What changes were proposed in this pull request? `CreateNamedStruct`

[2/2] spark git commit: [SPARK-15435][SQL] Append Command to all commands

2016-05-20 Thread rxin
was this patch tested? Updated test cases to reflect the renames. Author: Reynold Xin <r...@databricks.com> Closes #13215 from rxin/SPARK-15435. (cherry picked from commit e8adc552df80af413e1d31b020489612d13a8770) Signed-off-by: Reynold Xin <r...@databricks.com> Project: http://git-wip-u

spark git commit: [SPARK-15400][SQL] CreateNamedStruct and CreateNamedStructUnsafe should preserve metadata of value expressions if it is NamedExpression.

2016-05-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master e8adc552d -> 2cbe96e64 [SPARK-15400][SQL] CreateNamedStruct and CreateNamedStructUnsafe should preserve metadata of value expressions if it is NamedExpression. ## What changes were proposed in this pull request? `CreateNamedStruct` and

[1/2] spark git commit: [SPARK-15435][SQL] Append Command to all commands

2016-05-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master d2e1aa97e -> e8adc552d http://git-wip-us.apache.org/repos/asf/spark/blob/e8adc552/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveDDLCommandSuite.scala -- diff --git

[1/2] spark git commit: [SPARK-15435][SQL] Append Command to all commands

2016-05-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 78c8825bd -> a879e7c32 http://git-wip-us.apache.org/repos/asf/spark/blob/a879e7c3/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveDDLCommandSuite.scala -- diff

[2/2] spark git commit: [SPARK-15435][SQL] Append Command to all commands

2016-05-20 Thread rxin
was this patch tested? Updated test cases to reflect the renames. Author: Reynold Xin <r...@databricks.com> Closes #13215 from rxin/SPARK-15435. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e8adc552 Tree: http://g

spark git commit: [SPARK-15308][SQL] RowEncoder should preserve nested column name.

2016-05-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 0066d35cc -> 78c8825bd [SPARK-15308][SQL] RowEncoder should preserve nested column name. ## What changes were proposed in this pull request? The following code generates wrong schema: ``` val schema = new StructType().add(

spark git commit: [SPARK-15308][SQL] RowEncoder should preserve nested column name.

2016-05-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9a9c6f5c2 -> d2e1aa97e [SPARK-15308][SQL] RowEncoder should preserve nested column name. ## What changes were proposed in this pull request? The following code generates wrong schema: ``` val schema = new StructType().add( "struct",

spark git commit: [SPARK-15335][SQL] Implement TRUNCATE TABLE Command

2016-05-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master d5e1c5acd -> 09a00510c [SPARK-15335][SQL] Implement TRUNCATE TABLE Command ## What changes were proposed in this pull request? Like TRUNCATE TABLE Command in Hive, TRUNCATE TABLE is also supported by Hive. See the link:

spark git commit: [SPARK-15335][SQL] Implement TRUNCATE TABLE Command

2016-05-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 612866473 -> 47feebd13 [SPARK-15335][SQL] Implement TRUNCATE TABLE Command ## What changes were proposed in this pull request? Like TRUNCATE TABLE Command in Hive, TRUNCATE TABLE is also supported by Hive. See the link:

spark git commit: [SPARK-15313][SQL] EmbedSerializerInFilter rule should keep exprIds of output of surrounded SerializeFromObject.

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 f8d0177c3 -> 2ef645724 [SPARK-15313][SQL] EmbedSerializerInFilter rule should keep exprIds of output of surrounded SerializeFromObject. ## What changes were proposed in this pull request? The following code: ``` val ds = Seq(("a",

spark git commit: [SPARK-15313][SQL] EmbedSerializerInFilter rule should keep exprIds of output of surrounded SerializeFromObject.

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/master e384c7fbb -> d5e1c5acd [SPARK-15313][SQL] EmbedSerializerInFilter rule should keep exprIds of output of surrounded SerializeFromObject. ## What changes were proposed in this pull request? The following code: ``` val ds = Seq(("a", 1),

spark git commit: Revert "[SPARK-15392][SQL] fix default value of size estimation of logical plan"

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 dd0c7fb39 -> f8d0177c3 Revert "[SPARK-15392][SQL] fix default value of size estimation of logical plan" This reverts commit fc29b896dae08b957ed15fa681b46162600a4050. (cherry picked from commit 84b23453ddb0a97e3d81306de0a5dcb64f88bdd0)

spark git commit: Revert "[HOTFIX] Test compilation error from 52b967f"

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1fc0f95eb -> dd0c7fb39 Revert "[HOTFIX] Test compilation error from 52b967f" This reverts commit 1fc0f95eb8abbb9cc8ede2139670e493e6939317. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [CORE][MINOR] Remove redundant set master in OutputCommitCoordinatorIntegrationSuite

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 642f00980 -> 2126fb0c2 [CORE][MINOR] Remove redundant set master in OutputCommitCoordinatorIntegrationSuite Remove redundant set master in OutputCommitCoordinatorIntegrationSuite, as we are already setting it in SparkContext below on

spark git commit: [MINOR] Fix Typos

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1dc30f189 -> 642f00980 [MINOR] Fix Typos 1. Rename matrix args in BreezeUtil to upper to match the doc 2. Fix several typos in ML and SQL manual tests Author: Zheng RuiFeng Closes #13078 from

spark git commit: [DOC][MINOR] ml.feature Scala and Python API sync

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 dcf36ad54 -> 1dc30f189 [DOC][MINOR] ml.feature Scala and Python API sync I reviewed Scala and Python APIs for ml.feature and corrected discrepancies. Built docs locally, ran style checks Author: Bryan Cutler

spark git commit: [SPARK-15057][GRAPHX] Remove stale TODO comment for making `enum` in GraphGenerators

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 7bb33352f -> dcf36ad54 [SPARK-15057][GRAPHX] Remove stale TODO comment for making `enum` in GraphGenerators This PR removes a stale TODO comment in `GraphGenerators.scala` Just comment removed. Author: Dongjoon Hyun

spark git commit: [SPARK-14261][SQL] Memory leak in Spark Thrift Server

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 fd2da7b91 -> 7200e6b54 [SPARK-14261][SQL] Memory leak in Spark Thrift Server Fixed memory leak (HiveConf in the CommandProcessorFactory) Author: Oleg Danilov Closes #12932 from dosoft/SPARK-14261. (cherry

spark git commit: [SPARK-14261][SQL] Memory leak in Spark Thrift Server

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 c08739afb -> 7bb33352f [SPARK-14261][SQL] Memory leak in Spark Thrift Server Fixed memory leak (HiveConf in the CommandProcessorFactory) Author: Oleg Danilov Closes #12932 from dosoft/SPARK-14261. (cherry

spark git commit: [SPARK-14261][SQL] Memory leak in Spark Thrift Server

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3ba34d435 -> e384c7fbb [SPARK-14261][SQL] Memory leak in Spark Thrift Server Fixed memory leak (HiveConf in the CommandProcessorFactory) Author: Oleg Danilov Closes #12932 from dosoft/SPARK-14261. Project:

spark git commit: [SPARK-14990][SQL] Fix checkForSameTypeInputExpr (ignore nullability)

2016-05-19 Thread rxin
768. Author: Reynold Xin <r...@databricks.com> Author: Oleg Danilov <oleg.dani...@wandisco.com> Closes #13208 from rxin/SPARK-14990. (cherry picked from commit 3ba34d435c1f61435c2dddc28650cd111e7c1f33) Signed-off-by: Reynold Xin <r...@databricks.com> Project: http://git-wip-us

spark git commit: [SPARK-14990][SQL] Fix checkForSameTypeInputExpr (ignore nullability)

2016-05-19 Thread rxin
768. Author: Reynold Xin <r...@databricks.com> Author: Oleg Danilov <oleg.dani...@wandisco.com> Closes #13208 from rxin/SPARK-14990. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3ba34d43 Tree: http://git-wip-us.apach

[2/2] spark git commit: [SPARK-15075][SPARK-15345][SQL] Clean up SparkSession builder and propagate config options to existing sessions if specified

2016-05-19 Thread rxin
, and also introduced a new SparkSessionBuilderSuite that should cover all the branches. Author: Reynold Xin <r...@databricks.com> Closes #13200 from rxin/SPARK-15075. (cherry picked from commit f2ee0ed4b7ecb2855cc4928a9613a07d45446f4e) Signed-off-by: Reynold Xin <r...@databricks.com> P

[1/2] spark git commit: [SPARK-15075][SPARK-15345][SQL] Clean up SparkSession builder and propagate config options to existing sessions if specified

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 e6810e9cd -> 52b967fe6 http://git-wip-us.apache.org/repos/asf/spark/blob/52b967fe/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --

[2/2] spark git commit: [SPARK-15075][SPARK-15345][SQL] Clean up SparkSession builder and propagate config options to existing sessions if specified

2016-05-19 Thread rxin
, and also introduced a new SparkSessionBuilderSuite that should cover all the branches. Author: Reynold Xin <r...@databricks.com> Closes #13200 from rxin/SPARK-15075. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f2ee0ed

[1/2] spark git commit: [SPARK-15075][SPARK-15345][SQL] Clean up SparkSession builder and propagate config options to existing sessions if specified

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/master 17591d90e -> f2ee0ed4b http://git-wip-us.apache.org/repos/asf/spark/blob/f2ee0ed4/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala -- diff

spark git commit: [SPARK-15375][SQL][STREAMING] Add ConsoleSink to structure streaming

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 758253f7c -> 2c939e541 [SPARK-15375][SQL][STREAMING] Add ConsoleSink to structure streaming ## What changes were proposed in this pull request? Add ConsoleSink to structured streaming; users can use it to display dataframes on the

spark git commit: [SPARK-15375][SQL][STREAMING] Add ConsoleSink to structure streaming

2016-05-19 Thread rxin
Repository: spark Updated Branches: refs/heads/master ef43a5fe5 -> dcf407de6 [SPARK-15375][SQL][STREAMING] Add ConsoleSink to structure streaming ## What changes were proposed in this pull request? Add ConsoleSink to structured streaming; users can use it to display dataframes on the

spark git commit: [SPARK-14463][SQL] Document the semantics for read.text

2016-05-18 Thread rxin
ion to clarify the semantics of read.text with respect to partitioning. ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #13184 from rxin/SPARK-14463. (cherry picked from commit 4987f39ac7a694e1c8b8b82246eb4fbd863201c4) Signed-off-by: Reynold Xin <r...@da

spark git commit: [SPARK-14463][SQL] Document the semantics for read.text

2016-05-18 Thread rxin
ify the semantics of read.text with respect to partitioning. ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #13184 from rxin/SPARK-14463. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spa

spark git commit: [SPARK-15323][SPARK-14463][SQL] Fix reading of partitioned format=text datasets

2016-05-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 4c0af3bbd -> 36acf8856 [SPARK-15323][SPARK-14463][SQL] Fix reading of partitioned format=text datasets https://issues.apache.org/jira/browse/SPARK-15323 I was using partitioned text datasets in Spark 1.6.1 but it broke in Spark

spark git commit: [SPARK-15323][SPARK-14463][SQL] Fix reading of partitioned format=text datasets

2016-05-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 84b23453d -> 32be51fba [SPARK-15323][SPARK-14463][SQL] Fix reading of partitioned format=text datasets https://issues.apache.org/jira/browse/SPARK-15323 I was using partitioned text datasets in Spark 1.6.1 but it broke in Spark 2.0.0.

spark git commit: [SPARK-15392][SQL] fix default value of size estimation of logical plan

2016-05-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master cc6a47dd8 -> fc29b896d [SPARK-15392][SQL] fix default value of size estimation of logical plan ## What changes were proposed in this pull request? We use autoBroadcastJoinThreshold + 1L as the default value of size estimation, that is

spark git commit: [SPARK-15392][SQL] fix default value of size estimation of logical plan

2016-05-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 d65707b7f -> 4c0af3bbd [SPARK-15392][SQL] fix default value of size estimation of logical plan ## What changes were proposed in this pull request? We use autoBroadcastJoinThreshold + 1L as the default value of size estimation, that

spark git commit: Prepare branch for 2.0.0-preview.

2016-05-17 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 5f5270ead -> c8be3da66 Prepare branch for 2.0.0-preview. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c8be3da6 Tree:

spark git commit: [TRIVIAL] Add () to SparkSession's builder function

2016-05-13 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 d3110d8b9 -> 78bf9a1aa [TRIVIAL] Add () to SparkSession's builder function Was trying out `SparkSession` for the first time and the given class doc (when copied as is) did not work over Spark shell: ``` scala>

spark git commit: [TRIVIAL] Add () to SparkSession's builder function

2016-05-13 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3ded5bc4d -> 4210e2a6b [TRIVIAL] Add () to SparkSession's builder function ## What changes were proposed in this pull request? Was trying out `SparkSession` for the first time and the given class doc (when copied as is) did not work over

spark git commit: [SPARK-15267][SQL] Refactor options for JDBC and ORC data sources and change default compression for ORC

2016-05-13 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1390eca2c -> d3110d8b9 [SPARK-15267][SQL] Refactor options for JDBC and ORC data sources and change default compression for ORC ## What changes were proposed in this pull request? Currently, Parquet, JSON and CSV data sources have a

spark git commit: [SPARK-15267][SQL] Refactor options for JDBC and ORC data sources and change default compression for ORC

2016-05-13 Thread rxin
Repository: spark Updated Branches: refs/heads/master 10a838967 -> 3ded5bc4d [SPARK-15267][SQL] Refactor options for JDBC and ORC data sources and change default compression for ORC ## What changes were proposed in this pull request? Currently, Parquet, JSON and CSV data sources have a

[1/2] spark git commit: [SPARK-15310][SQL] Rename HiveTypeCoercion -> TypeCoercion

2016-05-13 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 86b8f8a9a -> 43570c576 http://git-wip-us.apache.org/repos/asf/spark/blob/43570c57/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercionSuite.scala

[2/2] spark git commit: [SPARK-15310][SQL] Rename HiveTypeCoercion -> TypeCoercion

2016-05-13 Thread rxin
mes it TypeCoercion. ## How was this patch tested? Updated unit tests to reflect the rename. Author: Reynold Xin <r...@databricks.com> Closes #13091 from rxin/SPARK-15310. (cherry picked from commit e1dc853737fc1739fbb5377ffe31fb2d89935b1f) Signed-off-by: Reynold Xin <r...@databricks.com> Proj

[2/2] spark git commit: [SPARK-15310][SQL] Rename HiveTypeCoercion -> TypeCoercion

2016-05-13 Thread rxin
mes it TypeCoercion. ## How was this patch tested? Updated unit tests to reflect the rename. Author: Reynold Xin <r...@databricks.com> Closes #13091 from rxin/SPARK-15310. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e1dc8537 T

[1/2] spark git commit: [SPARK-15310][SQL] Rename HiveTypeCoercion -> TypeCoercion

2016-05-13 Thread rxin
Repository: spark Updated Branches: refs/heads/master 31f1aebbe -> e1dc85373 http://git-wip-us.apache.org/repos/asf/spark/blob/e1dc8537/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercionSuite.scala

spark git commit: [SPARK-15306][SQL] Move object expressions into expressions.objects package

2016-05-12 Thread rxin
age, for better code organization. ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #13085 from rxin/SPARK-15306. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ba169c32 Tree: http:

spark git commit: [SPARK-15306][SQL] Move object expressions into expressions.objects package

2016-05-12 Thread rxin
cts package, for better code organization. ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #13085 from rxin/SPARK-15306. (cherry picked from commit ba169c3230e7d6cb192ec4bd567a1fef7b93b29f) Signed-off-by: Reynold Xin <r...@databricks.com> Project:

spark git commit: [SPARK-10605][SQL] Create native collect_list/collect_set aggregates

2016-05-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 ac6e9a8d9 -> 31ea3c7bd [SPARK-10605][SQL] Create native collect_list/collect_set aggregates ## What changes were proposed in this pull request? We currently use the Hive implementations for the collect_list/collect_set aggregate

spark git commit: [SPARK-12200][SQL] Add __contains__ implementation to Row

2016-05-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master bb88ad4e0 -> 7ecd49688 [SPARK-12200][SQL] Add __contains__ implementation to Row https://issues.apache.org/jira/browse/SPARK-12200 Author: Maciej Brynski Author: Maciej Bryński

spark git commit: [SPARK-12200][SQL] Add __contains__ implementation to Row

2016-05-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 83050ddb8 -> 6e08eb469 [SPARK-12200][SQL] Add __contains__ implementation to Row https://issues.apache.org/jira/browse/SPARK-12200 Author: Maciej Brynski Author: Maciej Bryński
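
A minimal sketch of what a `__contains__` method on a row-like class can look like, assuming membership is meant to test field names. `MiniRow` is a hypothetical stand-in written for this note, not PySpark's `Row`, and the truncated message above does not confirm the exact semantics.

```python
class MiniRow(tuple):
    """Hypothetical tuple-backed row with named fields (not pyspark.sql.Row)."""

    def __new__(cls, **fields):
        row = super().__new__(cls, fields.values())
        row._fields = tuple(fields.keys())
        return row

    def __contains__(self, item):
        # Assumption: `in` checks field names rather than stored values.
        return item in self._fields


row = MiniRow(name="Alice", age=11)
has_name = "name" in row    # looks up a field name
has_alice = "Alice" in row  # a stored value, not a field name
```

Under that assumption, `in` answers "does this row have such a column?" rather than "does this row hold such a value?", which differs from plain tuple membership.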

[3/5] spark git commit: [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact

2016-05-11 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/89e67d66/external/kafka-0-8/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala -- diff --git

[2/5] spark git commit: [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact

2016-05-11 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/89e67d66/external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala -- diff --git

[2/5] spark git commit: [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact

2016-05-11 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/56e1e2f1/external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala -- diff --git

[4/5] spark git commit: [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact

2016-05-11 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/56e1e2f1/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala -- diff --git

[5/5] spark git commit: [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact

2016-05-11 Thread rxin
[SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact ## What changes were proposed in this pull request? Renaming the streaming-kafka artifact to include kafka version, in anticipation of needing a different artifact for later kafka versions ## How was this patch tested? Unit tests

[3/5] spark git commit: [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact

2016-05-11 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/56e1e2f1/external/kafka-0-8/src/test/scala/org/apache/spark/streaming/kafka/DirectKafkaStreamSuite.scala -- diff --git

[1/5] spark git commit: [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact

2016-05-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 e3703c411 -> 56e1e2f17 http://git-wip-us.apache.org/repos/asf/spark/blob/56e1e2f1/external/kafka/src/test/java/org/apache/spark/streaming/kafka/JavaKafkaRDDSuite.java

[5/5] spark git commit: [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact

2016-05-11 Thread rxin
[SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact ## What changes were proposed in this pull request? Renaming the streaming-kafka artifact to include kafka version, in anticipation of needing a different artifact for later kafka versions ## How was this patch tested? Unit tests

[1/5] spark git commit: [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact

2016-05-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6d0368ab8 -> 89e67d666 http://git-wip-us.apache.org/repos/asf/spark/blob/89e67d66/external/kafka/src/test/java/org/apache/spark/streaming/kafka/JavaKafkaRDDSuite.java --

[4/5] spark git commit: [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact

2016-05-11 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/89e67d66/external/kafka-0-8/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala -- diff --git

spark git commit: [SPARK-15235][WEBUI] Corresponding row cannot be highlighted even though cursor is on the job on Web UI's timeline

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9f0a642f8 -> ba181c0c7 [SPARK-15235][WEBUI] Corresponding row cannot be highlighted even though cursor is on the job on Web UI's timeline ## What changes were proposed in this pull request? To extract job descriptions and stage name,

spark git commit: [SPARK-15235][WEBUI] Corresponding row cannot be highlighted even though cursor is on the job on Web UI's timeline

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 d9288b804 -> ca5ce5365 [SPARK-15235][WEBUI] Corresponding row cannot be highlighted even though cursor is on the job on Web UI's timeline ## What changes were proposed in this pull request? To extract job descriptions and stage name,

spark git commit: [SPARK-15246][SPARK-4452][CORE] Fix code style and improve volatile for

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1fbe2785d -> 9f0a642f8 [SPARK-15246][SPARK-4452][CORE] Fix code style and improve volatile for ## What changes were proposed in this pull request? 1. Fix code style 2. remove volatile of elementsRead method because there is only one thread

spark git commit: [SPARK-15246][SPARK-4452][CORE] Fix code style and improve volatile for

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1b446a461 -> d9288b804 [SPARK-15246][SPARK-4452][CORE] Fix code style and improve volatile for ## What changes were proposed in this pull request? 1. Fix code style 2. remove volatile of elementsRead method because there is only one

spark git commit: [SPARK-15255][SQL] limit the length of name for cached DataFrame

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 665545960 -> 1fbe2785d [SPARK-15255][SQL] limit the length of name for cached DataFrame ## What changes were proposed in this pull request? We use the tree string of a SparkPlan as the name of the cached DataFrame, which could be very long,

spark git commit: [SPARK-15255][SQL] limit the length of name for cached DataFrame

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 a675f5e1d -> 1b446a461 [SPARK-15255][SQL] limit the length of name for cached DataFrame ## What changes were proposed in this pull request? We use the tree string of a SparkPlan as the name of the cached DataFrame, which could be very

spark git commit: [SPARK-15265][SQL][MINOR] Fix Union query error message indentation

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3ff012051 -> 665545960 [SPARK-15265][SQL][MINOR] Fix Union query error message indentation ## What changes were proposed in this pull request? This issue fixes the error message indentation consistently with other set queries

spark git commit: [SPARK-15265][SQL][MINOR] Fix Union query error message indentation

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 0ecc105d2 -> a675f5e1d [SPARK-15265][SQL][MINOR] Fix Union query error message indentation ## What changes were proposed in this pull request? This issue fixes the error message indentation consistently with other set queries

spark git commit: [SPARK-15250][SQL] Remove deprecated json API in DataFrameReader

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5a5b83c97 -> 3ff012051 [SPARK-15250][SQL] Remove deprecated json API in DataFrameReader ## What changes were proposed in this pull request? This PR removes the old `json(path: String)` API which is covered by the new `json(paths:

spark git commit: [SPARK-15250][SQL] Remove deprecated json API in DataFrameReader

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 03dfe7830 -> 0ecc105d2 [SPARK-15250][SQL] Remove deprecated json API in DataFrameReader ## What changes were proposed in this pull request? This PR removes the old `json(path: String)` API which is covered by the new `json(paths:

spark git commit: [SPARK-15261][SQL] Remove experimental tag from DataFrameReader/Writer

2016-05-10 Thread rxin
ter, and explicitly tags a few methods added for structured streaming as experimental. ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #13038 from rxin/SPARK-15261. (cherry picked from commit 5a5b83c97bbab1d717dcc30b09aafb7c0ed85069) Signed-off-by: Reyno

spark git commit: [SPARK-15261][SQL] Remove experimental tag from DataFrameReader/Writer

2016-05-10 Thread rxin
tly tags a few methods added for structured streaming as experimental. ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #13038 from rxin/SPARK-15261. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/

spark git commit: [SPARK-14476][SQL] Improve the physical plan visualization by adding meta info like table name and file path for data source.

2016-05-10 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 d8c2da9a4 -> 5e3192a9a [SPARK-14476][SQL] Improve the physical plan visualization by adding meta info like table name and file path for data source. ## What changes were proposed in this pull request? Improve the physical plan

spark git commit: [SPARK-15229][SQL] Make case sensitivity setting internal

2016-05-09 Thread rxin
age users from turning it on, effectively making Spark always case insensitive. ## How was this patch tested? N/A - a small config documentation change. Author: Reynold Xin <r...@databricks.com> Closes #13011 from rxin/SPARK-15229. (cherry picked from commit 4b4344a81331e48b0a00032ec8285f3

spark git commit: [SPARK-15229][SQL] Make case sensitivity setting internal

2016-05-09 Thread rxin
ers from turning it on, effectively making Spark always case insensitive. ## How was this patch tested? N/A - a small config documentation change. Author: Reynold Xin <r...@databricks.com> Closes #13011 from rxin/SPARK-15229. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Com

spark git commit: [SPARK-15234][SQL] Fix spark.catalog.listDatabases.show()

2016-05-09 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1bcbf6157 -> 036c22494 [SPARK-15234][SQL] Fix spark.catalog.listDatabases.show() ## What changes were proposed in this pull request? Before: ``` scala> spark.catalog.listDatabases.show() ++---+---+

spark git commit: [SPARK-15234][SQL] Fix spark.catalog.listDatabases.show()

2016-05-09 Thread rxin
Repository: spark Updated Branches: refs/heads/master 980bba0dc -> 8f932fb88 [SPARK-15234][SQL] Fix spark.catalog.listDatabases.show() ## What changes were proposed in this pull request? Before: ``` scala> spark.catalog.listDatabases.show() ++---+---+ |

spark git commit: [SPARK-15178][CORE] Remove LazyFileRegion instead use netty's DefaultFileRegion

2016-05-07 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 9560bad2d -> 69f3edc32 [SPARK-15178][CORE] Remove LazyFileRegion instead use netty's DefaultFileRegion ## What changes were proposed in this pull request? Remove LazyFileRegion and use netty's DefaultFileRegion instead, since it was

spark git commit: [SPARK-15178][CORE] Remove LazyFileRegion instead use netty's DefaultFileRegion

2016-05-07 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5d188a697 -> 6e268b9ee [SPARK-15178][CORE] Remove LazyFileRegion instead use netty's DefaultFileRegion ## What changes were proposed in this pull request? Remove LazyFileRegion and use netty's DefaultFileRegion instead, since it was created

spark git commit: [SPARK-15148][SQL] Upgrade Univocity library from 2.0.2 to 2.1.0

2016-05-05 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 346811141 -> 4ec5d9345 [SPARK-15148][SQL] Upgrade Univocity library from 2.0.2 to 2.1.0 ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-15148 Mainly it improves the performance roughly

spark git commit: [SPARK-15132][MINOR][SQL] Debug log for generated code should be printed with proper indentation

2016-05-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 2023faf6c -> 0914296cb [SPARK-15132][MINOR][SQL] Debug log for generated code should be printed with proper indentation ## What changes were proposed in this pull request? Similar to #11990, GenerateOrdering and

spark git commit: [SPARK-15132][MINOR][SQL] Debug log for generated code should be printed with proper indentation

2016-05-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master 428374195 -> 1a9b34158 [SPARK-15132][MINOR][SQL] Debug log for generated code should be printed with proper indentation ## What changes were proposed in this pull request? Similar to #11990, GenerateOrdering and GenerateColumnAccessor

[1/2] spark git commit: [SPARK-15115][SQL] Reorganize whole stage codegen benchmark suites

2016-05-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 54d90bd3e -> c59615432 http://git-wip-us.apache.org/repos/asf/spark/blob/c5961543/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/SortBenchmark.scala

[2/2] spark git commit: [SPARK-15115][SQL] Reorganize whole stage codegen benchmark suites

2016-05-04 Thread rxin
takes forever to run. ## How was this patch tested? This is a test only change. Author: Reynold Xin <r...@databricks.com> Closes #12891 from rxin/SPARK-15115. (cherry picked from commit 6274a520fa743b7d079fde4a3033da5c3a2532a1) Signed-off-by: Reynold Xin <r...@databricks.com> P

[2/2] spark git commit: [SPARK-15115][SQL] Reorganize whole stage codegen benchmark suites

2016-05-04 Thread rxin
takes forever to run. ## How was this patch tested? This is a test only change. Author: Reynold Xin <r...@databricks.com> Closes #12891 from rxin/SPARK-15115. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6274a52

spark git commit: [SPARK-15109][SQL] Accept Dataset[_] in joins

2016-05-04 Thread rxin
How was this patch tested? N/A. Author: Reynold Xin <r...@databricks.com> Closes #12886 from rxin/SPARK-15109. (cherry picked from commit d864c55cf8c92466336e796d0c98d83230e330af) Signed-off-by: Reynold Xin <r...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/re

spark git commit: [SPARK-15109][SQL] Accept Dataset[_] in joins

2016-05-04 Thread rxin
How was this patch tested? N/A. Author: Reynold Xin <r...@databricks.com> Closes #12886 from rxin/SPARK-15109. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d864c55c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree

spark git commit: [SPARK-15029] improve error message for Generate

2016-05-04 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 64ad9ba27 -> b99f715e8 [SPARK-15029] improve error message for Generate ## What changes were proposed in this pull request? This PR improves the error message for `Generate` in 3 cases: 1. generator is nested in expressions, e.g.

spark git commit: [SPARK-15029] improve error message for Generate

2016-05-04 Thread rxin
Repository: spark Updated Branches: refs/heads/master bc3760d40 -> 6c12e801e [SPARK-15029] improve error message for Generate ## What changes were proposed in this pull request? This PR improves the error message for `Generate` in 3 cases: 1. generator is nested in expressions, e.g. `SELECT

spark git commit: [SPARK-15107][SQL] Allow varying # iterations by test case in Benchmark

2016-05-03 Thread rxin
gen on. I also updated some results. N/A - this is a test util. Author: Reynold Xin <r...@databricks.com> Closes #12884 from rxin/SPARK-15107. (cherry picked from commit 695f0e9195209c75bfc62fc70bfc6d7d9f1047b3) Signed-off-by: Reynold Xin <r...@databricks.com> Project:

spark git commit: [SPARK-15107][SQL] Allow varying # iterations by test case in Benchmark

2016-05-03 Thread rxin
ole stage codegen off, and 5 for whole stage codegen on. I also updated some results. ## How was this patch tested? N/A - this is a test util. Author: Reynold Xin <r...@databricks.com> Closes #12884 from rxin/SPARK-15107. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: h

spark git commit: [SPARK-15095][SQL] remove HiveSessionHook from ThriftServer

2016-05-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 940b8f60b -> fd3accca6 [SPARK-15095][SQL] remove HiveSessionHook from ThriftServer ## What changes were proposed in this pull request? Remove HiveSessionHook ## How was this patch tested? No tests needed. Author: Davies Liu

spark git commit: [SPARK-15095][SQL] remove HiveSessionHook from ThriftServer

2016-05-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6ba17cd14 -> 348c13898 [SPARK-15095][SQL] remove HiveSessionHook from ThriftServer ## What changes were proposed in this pull request? Remove HiveSessionHook ## How was this patch tested? No tests needed. Author: Davies Liu

spark git commit: [SPARK-15104] Fix spacing in log line

2016-05-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 51bb0bcc8 -> c212307b9 [SPARK-15104] Fix spacing in log line Otherwise get logs that look like this (note no space before NODE_LOCAL) ``` INFO [2016-05-03 21:18:51,477] org.apache.spark.scheduler.TaskSetManager: Starting task 0.0 in

spark git commit: [SPARK-15104] Fix spacing in log line

2016-05-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 028c6a5db -> dbacd9998 [SPARK-15104] Fix spacing in log line Otherwise get logs that look like this (note no space before NODE_LOCAL) ``` INFO [2016-05-03 21:18:51,477] org.apache.spark.scheduler.TaskSetManager: Starting task 0.0 in
