spark git commit: [SPARK-15646][SQL] When spark.sql.hive.convertCTAS is true, the conversion rule needs to respect TEXTFILE/SEQUENCEFILE format and the user-defined location

2016-06-01 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-2.0 35195f6ce -> 5a835b99f [SPARK-15646][SQL] When spark.sql.hive.convertCTAS is true, the conversion rule needs to respect TEXTFILE/SEQUENCEFILE format and the user-defined location ## What changes were proposed in this pull request?

spark git commit: [SPARK-15646][SQL] When spark.sql.hive.convertCTAS is true, the conversion rule needs to respect TEXTFILE/SEQUENCEFILE format and the user-defined location

2016-06-01 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master c8fb776d4 -> 6dddb70c3 [SPARK-15646][SQL] When spark.sql.hive.convertCTAS is true, the conversion rule needs to respect TEXTFILE/SEQUENCEFILE format and the user-defined location ## What changes were proposed in this pull request? When

spark git commit: [SPARK-15692][SQL] Improves the explain output of several physical plans by displaying embedded logical plan in tree style

2016-06-01 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.0 91812226f -> 35195f6ce [SPARK-15692][SQL] Improves the explain output of several physical plans by displaying embedded logical plan in tree style ## What changes were proposed in this pull request? Improves the explain output of

spark git commit: [SPARK-15441][SQL] support null object in Dataset outer-join

2016-06-01 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 8cdc0d4da -> 91812226f [SPARK-15441][SQL] support null object in Dataset outer-join ## What changes were proposed in this pull request? Currently we can't encode top level null object into internal row, as Spark SQL doesn't allow row

spark git commit: [SPARK-15441][SQL] support null object in Dataset outer-join

2016-06-01 Thread lian
Repository: spark Updated Branches: refs/heads/master 7bb64aae2 -> 8640cdb83 [SPARK-15441][SQL] support null object in Dataset outer-join ## What changes were proposed in this pull request? Currently we can't encode top level null object into internal row, as Spark SQL doesn't allow row to

spark git commit: [SPARK-9876] [BRANCH-2.0] Revert "[SPARK-9876][SQL] Update Parquet to 1.8.1."

2016-06-01 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 e033fd50f -> 8cdc0d4da [SPARK-9876] [BRANCH-2.0] Revert "[SPARK-9876][SQL] Update Parquet to 1.8.1." ## What changes were proposed in this pull request? Since we are pretty late in the 2.0 release cycle, it is not clear if this

spark git commit: [SPARK-15269][SQL] Removes unexpected empty table directories created while creating external Spark SQL data source tables.

2016-06-01 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 44052a707 -> e033fd50f [SPARK-15269][SQL] Removes unexpected empty table directories created while creating external Spark SQL data source tables. This PR is an alternative to #13120 authored by xwu0226. ## What changes were

spark git commit: [SPARK-15269][SQL] Removes unexpected empty table directories created while creating external Spark SQL data source tables.

2016-06-01 Thread lian
Repository: spark Updated Branches: refs/heads/master 9e2643b21 -> 7bb64aae2 [SPARK-15269][SQL] Removes unexpected empty table directories created while creating external Spark SQL data source tables. This PR is an alternative to #13120 authored by xwu0226. ## What changes were proposed in

spark git commit: [SPARK-15596][SPARK-15635][SQL] ALTER TABLE RENAME fixes

2016-06-01 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.0 46d5f7f38 -> 44052a707 [SPARK-15596][SPARK-15635][SQL] ALTER TABLE RENAME fixes ## What changes were proposed in this pull request? **SPARK-15596**: Even after we renamed a cached table, the plan would remain in the cache with the

spark git commit: [SPARK-15596][SPARK-15635][SQL] ALTER TABLE RENAME fixes

2016-06-01 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 5b08ee639 -> 9e2643b21 [SPARK-15596][SPARK-15635][SQL] ALTER TABLE RENAME fixes ## What changes were proposed in this pull request? **SPARK-15596**: Even after we renamed a cached table, the plan would remain in the cache with the old

spark git commit: [SPARK-15671] performance regression CoalesceRDD.pickBin with large #…

2016-06-01 Thread davies
Repository: spark Updated Branches: refs/heads/branch-2.0 47902d4bc -> 46d5f7f38 [SPARK-15671] performance regression CoalesceRDD.pickBin with large #… I was running a 15TB join job with 202000 partitions. It looks like the changes I made to CoalesceRDD in pickBin() are really slow with

spark git commit: [SPARK-15702][DOCUMENTATION] Update document programming-guide accumulator section

2016-06-01 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 beb4ea0b4 -> 47902d4bc [SPARK-15702][DOCUMENTATION] Update document programming-guide accumulator section ## What changes were proposed in this pull request? Update document programming-guide accumulator section (scala language) java

spark git commit: [SPARK-15702][DOCUMENTATION] Update document programming-guide accumulator section

2016-06-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 07a98ca4c -> 2402b9146 [SPARK-15702][DOCUMENTATION] Update document programming-guide accumulator section ## What changes were proposed in this pull request? Update document programming-guide accumulator section (scala language) java and

spark git commit: [SPARK-15587][ML] ML 2.0 QA: Scala APIs audit for ml.feature

2016-06-01 Thread mlnick
Repository: spark Updated Branches: refs/heads/master a71d1364a -> 07a98ca4c [SPARK-15587][ML] ML 2.0 QA: Scala APIs audit for ml.feature ## What changes were proposed in this pull request? ML 2.0 QA: Scala APIs audit for ml.feature. Mainly include: * Remove seed for

spark git commit: [SPARK-15587][ML] ML 2.0 QA: Scala APIs audit for ml.feature

2016-06-01 Thread mlnick
Repository: spark Updated Branches: refs/heads/branch-2.0 71e8aaeaa -> beb4ea0b4 [SPARK-15587][ML] ML 2.0 QA: Scala APIs audit for ml.feature ## What changes were proposed in this pull request? ML 2.0 QA: Scala APIs audit for ml.feature. Mainly include: * Remove seed for

spark git commit: [SPARK-6320][SQL] Move planLater method into GenericStrategy.

2016-06-01 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 a780848af -> 71e8aaeaa [SPARK-6320][SQL] Move planLater method into GenericStrategy. ## What changes were proposed in this pull request? This PR is the minimal version of #13147 for `branch-2.0`. ## How was this patch tested? Picked

[1/2] spark git commit: [SPARK-15686][SQL] Move user-facing streaming classes into sql.streaming

2016-06-01 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-2.0 9406a3c9a -> a780848af http://git-wip-us.apache.org/repos/asf/spark/blob/a780848a/sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala

[2/2] spark git commit: [SPARK-15686][SQL] Move user-facing streaming classes into sql.streaming

2016-06-01 Thread marmbrus
[SPARK-15686][SQL] Move user-facing streaming classes into sql.streaming ## What changes were proposed in this pull request? This patch moves all user-facing structured streaming classes into sql.streaming. As part of this, I also added some since version annotation to methods and classes that

[1/2] spark git commit: [SPARK-15686][SQL] Move user-facing streaming classes into sql.streaming

2016-06-01 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master d5012c274 -> a71d1364a http://git-wip-us.apache.org/repos/asf/spark/blob/a71d1364/sql/core/src/main/scala/org/apache/spark/sql/util/ContinuousQueryListener.scala -- diff

spark git commit: [SPARK-15495][SQL] Improve the explain output for Aggregation operator

2016-06-01 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.0 cb254ecb1 -> 9406a3c9a [SPARK-15495][SQL] Improve the explain output for Aggregation operator ## What changes were proposed in this pull request? This PR improves the explain output of Aggregator operator. SQL:

spark git commit: [SPARK-15495][SQL] Improve the explain output for Aggregation operator

2016-06-01 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 1f43562da -> d5012c274 [SPARK-15495][SQL] Improve the explain output for Aggregation operator ## What changes were proposed in this pull request? This PR improves the explain output of Aggregator operator. SQL:

spark git commit: [SPARK-14343][SQL] Proper column pruning for text data source

2016-06-01 Thread lian
Repository: spark Updated Branches: refs/heads/branch-2.0 8fb125bdf -> cb254ecb1 [SPARK-14343][SQL] Proper column pruning for text data source ## What changes were proposed in this pull request? Text data source ignores requested schema, and may give wrong result when the only data column

spark git commit: [SPARK-14343][SQL] Proper column pruning for text data source

2016-06-01 Thread lian
Repository: spark Updated Branches: refs/heads/master 6563d72b1 -> 1f43562da [SPARK-14343][SQL] Proper column pruning for text data source ## What changes were proposed in this pull request? Text data source ignores requested schema, and may give wrong result when the only data column is

spark git commit: [SPARK-15664][MLLIB] Replace FileSystem.get(conf) with path.getFileSystem(conf) when removing CheckpointFile in MLlib

2016-06-01 Thread srowen
Repository: spark Updated Branches: refs/heads/master e4ce1bc4f -> 6563d72b1 [SPARK-15664][MLLIB] Replace FileSystem.get(conf) with path.getFileSystem(conf) when removing CheckpointFile in MLlib ## What changes were proposed in this pull request? If `sparkContext.setCheckpointDir` is set to another

spark git commit: [SPARK-15664][MLLIB] Replace FileSystem.get(conf) with path.getFileSystem(conf) when removing CheckpointFile in MLlib

2016-06-01 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.0 29a1cdfc4 -> 8fb125bdf [SPARK-15664][MLLIB] Replace FileSystem.get(conf) with path.getFileSystem(conf) when removing CheckpointFile in MLlib ## What changes were proposed in this pull request? If `sparkContext.setCheckpointDir` is set to

spark git commit: [SPARK-15659][SQL] Ensure FileSystem is gotten from path

2016-06-01 Thread srowen
Repository: spark Updated Branches: refs/heads/master 1dd925644 -> e4ce1bc4f [SPARK-15659][SQL] Ensure FileSystem is gotten from path ## What changes were proposed in this pull request? Currently `spark.sql.warehouse.dir` is pointed to local dir by default, which will throw exception when