spark git commit: [SPARK-18244][SQL] Rename partitionProviderIsHive -> tracksPartitionsInCatalog

2016-11-03 Thread rxin
The old name was too Hive specific. ## How was this patch tested? Should be covered by existing tests. Author: Reynold Xin <r...@databricks.com> Closes #15750 from rxin/SPARK-18244. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/

spark git commit: [SQL] minor - internal doc improvement for InsertIntoTable.

2016-11-03 Thread rxin
Reynold Xin <r...@databricks.com> Closes #15749 from rxin/doc-improvement. (cherry picked from commit 0ea5d5b24c1f7b29efeac0e72d271aba279523f7) Signed-off-by: Reynold Xin <r...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.o

spark git commit: [SQL] minor - internal doc improvement for InsertIntoTable.

2016-11-03 Thread rxin
Reynold Xin <r...@databricks.com> Closes #15749 from rxin/doc-improvement. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0ea5d5b2 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0ea5d5b2 Diff: http://git-wip-us.apache.org/r

spark git commit: [SPARK-18219] Move commit protocol API (internal) from sql/core to core module

2016-11-03 Thread rxin
we can use it in the future in the RDD API. As part of this patch, I also moved the specification of the random uuid for the write path out of the commit protocol, and instead pass in a job id. ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #15731

spark git commit: [SPARK-18219] Move commit protocol API (internal) from sql/core to core module

2016-11-03 Thread rxin
can use it in the future in the RDD API. As part of this patch, I also moved the specification of the random uuid for the write path out of the commit protocol, and instead pass in a job id. ## How was this patch tested? N/A Author: Reynold Xin <r...@databricks.com> Closes #15731 from r

spark git commit: [SPARK-18200][GRAPHX] Support zero as an initial capacity in OpenHashSet

2016-11-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 2cf39d638 -> 965c964c2 [SPARK-18200][GRAPHX] Support zero as an initial capacity in OpenHashSet ## What changes were proposed in this pull request? [SPARK-18200](https://issues.apache.org/jira/browse/SPARK-18200) reports Apache Spark

spark git commit: [SPARK-18200][GRAPHX] Support zero as an initial capacity in OpenHashSet

2016-11-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 3253ae7f7 -> dae1581d9 [SPARK-18200][GRAPHX] Support zero as an initial capacity in OpenHashSet ## What changes were proposed in this pull request? [SPARK-18200](https://issues.apache.org/jira/browse/SPARK-18200) reports Apache Spark

spark git commit: [SPARK-18200][GRAPHX] Support zero as an initial capacity in OpenHashSet

2016-11-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9ddec8636 -> d24e73647 [SPARK-18200][GRAPHX] Support zero as an initial capacity in OpenHashSet ## What changes were proposed in this pull request? [SPARK-18200](https://issues.apache.org/jira/browse/SPARK-18200) reports Apache Spark 2.x
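
The SPARK-18200 entries above concern a hash set that rejected zero as an initial capacity. A minimal, hypothetical Java sketch of the sizing logic (not Spark's actual OpenHashSet code): capacity is rounded up to the next power of two, and zero must map to a small minimum instead of failing a positivity check.

```java
public class CapacityDemo {
    // Hypothetical sketch: round a requested capacity up to the next power of
    // two. A naive implementation that requires capacity > 0 rejects zero;
    // accepting zero and mapping it to the minimum size fixes the bug class.
    static int nextPowerOf2(int n) {
        if (n <= 1) return 1;                       // zero (and one) map to the minimum size
        int highBit = Integer.highestOneBit(n);     // largest power of two <= n
        return (highBit == n) ? n : highBit << 1;   // n itself if already a power of two
    }

    public static void main(String[] args) {
        System.out.println(nextPowerOf2(0));  // 1
        System.out.println(nextPowerOf2(5));  // 8
        System.out.println(nextPowerOf2(8));  // 8
    }
}
```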

spark git commit: [SPARK-18214][SQL] Simplify RuntimeReplaceable type coercion

2016-11-02 Thread rxin
Closes #15723 from rxin/SPARK-18214. (cherry picked from commit fd90541c35af2bccf0155467bec8cea7c8865046) Signed-off-by: Reynold Xin <r...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2aff2ea8 Tree:

spark git commit: [SPARK-18214][SQL] Simplify RuntimeReplaceable type coercion

2016-11-02 Thread rxin
Closes #15723 from rxin/SPARK-18214. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fd90541c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fd90541c Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fd90541c Bra

spark git commit: [SPARK-17058][BUILD] Add maven snapshots-and-staging profile to build/test against staging artifacts

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 bd3ea6595 -> 1eef8e5cd [SPARK-17058][BUILD] Add maven snapshots-and-staging profile to build/test against staging artifacts ## What changes were proposed in this pull request? Adds a `snapshots-and-staging profile` so that RCs of

spark git commit: [SPARK-17058][BUILD] Add maven snapshots-and-staging profile to build/test against staging artifacts

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3c24299b7 -> 37d95227a [SPARK-17058][BUILD] Add maven snapshots-and-staging profile to build/test against staging artifacts ## What changes were proposed in this pull request? Adds a `snapshots-and-staging profile` so that RCs of

spark git commit: [SPARK-18111][SQL] Wrong approximate quantile answer when multiple records have the minimum value (for branch 2.0)

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 1696bcfad -> 3253ae7f7 [SPARK-18111][SQL] Wrong approximate quantile answer when multiple records have the minimum value (for branch 2.0) ## What changes were proposed in this pull request? When multiple records have the minimum value,

spark git commit: [SPARK-17895] Improve doc for rangeBetween and rowsBetween

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4af0ce2d9 -> 742e0fea5 [SPARK-17895] Improve doc for rangeBetween and rowsBetween ## What changes were proposed in this pull request? Copied description for row and range based frame boundary from

spark git commit: [SPARK-14393][SQL] values generated by non-deterministic functions shouldn't change after coalesce or union

2016-11-02 Thread rxin
on for proper initialization. ## How was this patch tested? Unit tests. (Actually I'm not very confident that this PR fixed all issues without introducing new ones ...) cc: rxin davies Author: Xiangrui Meng <m...@databricks.com> Closes #15567 from mengxr/SPARK-14393. (cherry

spark git commit: [SPARK-14393][SQL] values generated by non-deterministic functions shouldn't change after coalesce or union

2016-11-02 Thread rxin
initialization. ## How was this patch tested? Unit tests. (Actually I'm not very confident that this PR fixed all issues without introducing new ones ...) cc: rxin davies Author: Xiangrui Meng <m...@databricks.com> Closes #15567 from mengxr/SPARK-14393. Project: http://git-wip-us.apache.

spark git commit: [SPARK-17895] Improve doc for rangeBetween and rowsBetween

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 9be069125 -> a885d5bbc [SPARK-17895] Improve doc for rangeBetween and rowsBetween ## What changes were proposed in this pull request? Copied description for row and range based frame boundary from

spark git commit: [SPARK-17683][SQL] Support ArrayType in Literal.apply

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 41491e540 -> 9be069125 [SPARK-17683][SQL] Support ArrayType in Literal.apply ## What changes were proposed in this pull request? This pr is to add pattern-matching entries for array data in `Literal.apply`. ## How was this patch

spark git commit: [SPARK-17683][SQL] Support ArrayType in Literal.apply

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master f151bd1af -> 4af0ce2d9 [SPARK-17683][SQL] Support ArrayType in Literal.apply ## What changes were proposed in this pull request? This pr is to add pattern-matching entries for array data in `Literal.apply`. ## How was this patch tested?

spark git commit: [SPARK-17532] Add lock debugging info to thread dumps.

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 4c4bf87ac -> 3b624bedf [SPARK-17532] Add lock debugging info to thread dumps. ## What changes were proposed in this pull request? This adds information to the web UI thread dump page about the JVM locks held by threads and the locks

spark git commit: [SPARK-17532] Add lock debugging info to thread dumps.

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 85c5424d4 -> 2dc048081 [SPARK-17532] Add lock debugging info to thread dumps. ## What changes were proposed in this pull request? This adds information to the web UI thread dump page about the JVM locks held by threads and the locks that

spark git commit: [SPARK-18192] Support all file formats in structured streaming

2016-11-02 Thread rxin
lly a very small change thanks to all the previous refactoring done using the new internal commit protocol API. ## How was this patch tested? Updated FileStreamSinkSuite to add test cases for json, text, and parquet. Author: Reynold Xin <r...@databricks.com> Closes #15711 from rxin/SP

spark git commit: [SPARK-18192] Support all file formats in structured streaming

2016-11-02 Thread rxin
very small change thanks to all the previous refactoring done using the new internal commit protocol API. ## How was this patch tested? Updated FileStreamSinkSuite to add test cases for json, text, and parquet. Author: Reynold Xin <r...@databricks.com> Closes #15711 from rxin/SPARK-18192.

spark git commit: [SPARK-18183][SPARK-18184] Fix INSERT [INTO|OVERWRITE] TABLE ... PARTITION for Datasource tables

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 39d2fdb51 -> e6509c245 [SPARK-18183][SPARK-18184] Fix INSERT [INTO|OVERWRITE] TABLE ... PARTITION for Datasource tables There are a couple issues with the current 2.1 behavior when inserting into Datasource tables with partitions

spark git commit: [SPARK-17475][STREAMING] Delete CRC files if the filesystem doesn't use checksum files

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 1bbf9ff63 -> 39d2fdb51 [SPARK-17475][STREAMING] Delete CRC files if the filesystem doesn't use checksum files ## What changes were proposed in this pull request? When the metadata logs for various parts of Structured Streaming are

spark git commit: [SPARK-17475][STREAMING] Delete CRC files if the filesystem doesn't use checksum files

2016-11-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1bbf9ff63 -> 620da3b48 [SPARK-17475][STREAMING] Delete CRC files if the filesystem doesn't use checksum files ## What changes were proposed in this pull request? When the metadata logs for various parts of Structured Streaming are stored

[spark] Git Push Summary

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 [created] 1bbf9ff63 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org

spark git commit: [SPARK-17992][SQL] Return all partitions from HiveShim when Hive throws a metastore exception when attempting to fetch partitions by filter

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1ecfafa08 -> 1bbf9ff63 [SPARK-17992][SQL] Return all partitions from HiveShim when Hive throws a metastore exception when attempting to fetch partitions by filter (Link to Jira issue: https://issues.apache.org/jira/browse/SPARK-17992) ##

spark git commit: [SPARK-18216][SQL] Make Column.expr public

2016-11-01 Thread rxin
up, similar to how we use QueryExecution. ## How was this patch tested? N/A - this is a simple visibility change. Author: Reynold Xin <r...@databricks.com> Closes #15724 from rxin/SPARK-18216. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/

spark git commit: [SPARK-18182] Expose ReplayListenerBus.read() overload which takes string iterator

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6e6298154 -> b929537b6 [SPARK-18182] Expose ReplayListenerBus.read() overload which takes string iterator The `ReplayListenerBus.read()` method is used when implementing a custom `ApplicationHistoryProvider`. The current interface only

spark git commit: [SPARK-17350][SQL] Disable default use of KryoSerializer in Thrift Server

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 01dd00830 -> 6e6298154 [SPARK-17350][SQL] Disable default use of KryoSerializer in Thrift Server In SPARK-4761 / #3621 (December 2014) we enabled Kryo serialization by default in the Spark Thrift Server. However, I don't think that the

spark git commit: [SPARK-18114][HOTFIX] Fix line-too-long style error from backport of SPARK-18114

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 4176da8be -> a01b95060 [SPARK-18114][HOTFIX] Fix line-too-long style error from backport of SPARK-18114 ## What changes were proposed in this pull request? Fix style error introduced in cherry-pick of

spark git commit: [SPARK-18167] Disable flaky SQLQuerySuite test

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master d0272b436 -> cfac17ee1 [SPARK-18167] Disable flaky SQLQuerySuite test We now know it's a persistent environmental issue that is causing this test to sometimes fail. One hypothesis is that some configuration is leaked from another suite,

spark git commit: [SPARK-18148][SQL] Misleading Error Message for Aggregation Without Window/GroupBy

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8a538c97b -> d0272b436 [SPARK-18148][SQL] Misleading Error Message for Aggregation Without Window/GroupBy ## What changes were proposed in this pull request? Aggregation Without Window/GroupBy expressions will fail in `checkAnalysis`,

spark git commit: [SPARK-18189][SQL] Fix serialization issue in KeyValueGroupedDataset

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 4d2672a40 -> 58655f51f [SPARK-18189][SQL] Fix serialization issue in KeyValueGroupedDataset ## What changes were proposed in this pull request? Likewise

spark git commit: [SPARK-18189][SQL] Fix serialization issue in KeyValueGroupedDataset

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8cdf143f4 -> 8a538c97b [SPARK-18189][SQL] Fix serialization issue in KeyValueGroupedDataset ## What changes were proposed in this pull request? Likewise

spark git commit: [SPARK-18103][FOLLOW-UP][SQL][MINOR] Rename `MetadataLogFileCatalog` to `MetadataLogFileIndex`

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8ac09108f -> 8cdf143f4 [SPARK-18103][FOLLOW-UP][SQL][MINOR] Rename `MetadataLogFileCatalog` to `MetadataLogFileIndex` ## What changes were proposed in this pull request? This is a follow-up to https://github.com/apache/spark/pull/15634.

spark git commit: [SPARK-18107][SQL] Insert overwrite statement runs much slower in spark-sql than it does in hive-client

2016-11-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master d9d146500 -> dd85eb544 [SPARK-18107][SQL] Insert overwrite statement runs much slower in spark-sql than it does in hive-client ## What changes were proposed in this pull request? As reported on the jira, insert overwrite statement runs

spark git commit: [SPARK-18024][SQL] Introduce an internal commit protocol API

2016-10-31 Thread rxin
ks.com> Author: Eric Liang <e...@databricks.com> Closes #15707 from rxin/SPARK-18024-2. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d9d14650 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d9d14650 Diff:

spark git commit: [SPARK-18167][SQL] Retry when the SQLQuerySuite test flakes

2016-10-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master efc254a82 -> 7d6c87155 [SPARK-18167][SQL] Retry when the SQLQuerySuite test flakes ## What changes were proposed in this pull request? This will re-run the flaky test a few times after it fails. This will help determine if it's due to

spark git commit: [SPARK-18087][SQL] Optimize insert to not require REPAIR TABLE

2016-10-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6633b97b5 -> efc254a82 [SPARK-18087][SQL] Optimize insert to not require REPAIR TABLE ## What changes were proposed in this pull request? When inserting into datasource tables with partitions managed by the hive metastore, we need to

spark git commit: [SPARK-18143][SQL] Ignore Structured Streaming event logs to avoid breaking history server (branch 2.0)

2016-10-31 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 9f924747d -> 300d596a5 [SPARK-18143][SQL] Ignore Structured Streaming event logs to avoid breaking history server (branch 2.0) ## What changes were proposed in this pull request? Backport #15663 to branch-2.0 and fixed conflicts in

[1/2] spark git commit: [SPARK-18103][SQL] Rename *FileCatalog to *FileIndex

2016-10-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3ad99f166 -> 90d3b91f4 http://git-wip-us.apache.org/repos/asf/spark/blob/90d3b91f/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala -- diff --git

[2/2] spark git commit: [SPARK-18103][SQL] Rename *FileCatalog to *FileIndex

2016-10-30 Thread rxin
[SPARK-18103][SQL] Rename *FileCatalog to *FileIndex ## What changes were proposed in this pull request? To reduce the number of components in SQL named *Catalog, rename *FileCatalog to *FileIndex. A FileIndex is responsible for returning the list of partitions / files to scan given a

spark git commit: [SPARK-18167][SQL] Add debug code for SQLQuerySuite flakiness when metastore partition pruning is enabled

2016-10-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 59cccbda4 -> d2d438d1d [SPARK-18167][SQL] Add debug code for SQLQuerySuite flakiness when metastore partition pruning is enabled ## What changes were proposed in this pull request? org.apache.spark.sql.hive.execution.SQLQuerySuite is

spark git commit: [SPARK-18094][SQL][TESTS] Move group analytics test cases from `SQLQuerySuite` into a query file test.

2016-10-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master dcdda1978 -> 5b7d403c1 [SPARK-18094][SQL][TESTS] Move group analytics test cases from `SQLQuerySuite` into a query file test. ## What changes were proposed in this pull request? Currently we have several test cases for group

spark git commit: [SPARK-18063][SQL] Failed to infer constraints over multiple aliases

2016-10-26 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 773fbfef1 -> 5b81b0102 [SPARK-18063][SQL] Failed to infer constraints over multiple aliases ## What changes were proposed in this pull request? The `UnaryNode.getAliasedConstraints` function fails to replace all expressions by their

spark git commit: [SPARK-18063][SQL] Failed to infer constraints over multiple aliases

2016-10-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7ac70e7ba -> fa7d9d708 [SPARK-18063][SQL] Failed to infer constraints over multiple aliases ## What changes were proposed in this pull request? The `UnaryNode.getAliasedConstraints` function fails to replace all expressions by their

spark git commit: [SPARK-17698][SQL] Join predicates should not contain filter clauses

2016-10-22 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 b959dab32 -> 3d5878751 [SPARK-17698][SQL] Join predicates should not contain filter clauses ## What changes were proposed in this pull request? This is a backport of https://github.com/apache/spark/pull/15272 to 2.0 branch. Jira :

spark git commit: [SPARK-928][CORE] Add support for Unsafe-based serializer in Kryo

2016-10-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4f1dcd3dc -> bc167a2a5 [SPARK-928][CORE] Add support for Unsafe-based serializer in Kryo ## What changes were proposed in this pull request? Now since we have migrated to Kryo-3.0.0 in https://issues.apache.org/jira/browse/SPARK-11416, we

spark git commit: [SPARK-18051][SPARK CORE] fix bug of custom PartitionCoalescer causing serialization exception

2016-10-22 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5fa9f8795 -> 4f1dcd3dc [SPARK-18051][SPARK CORE] fix bug of custom PartitionCoalescer causing serialization exception ## What changes were proposed in this pull request? add a require check in `CoalescedRDD` to make sure the passed in
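
The SPARK-18051 snippet above describes adding a require check so that a non-serializable custom coalescer fails fast. A minimal Java sketch of that guard pattern, with hypothetical names (not Spark's actual CoalescedRDD code): reject the strategy object at construction rather than failing later with an opaque serialization exception when tasks are shipped to executors.

```java
import java.io.Serializable;

public class RequireSerializable {
    // Hypothetical stand-in for a user-supplied strategy interface.
    interface PartitionCoalescer { void coalesce(); }

    // Fail fast if the supplied object cannot be serialized.
    static void requireSerializable(PartitionCoalescer c) {
        if (!(c instanceof Serializable)) {
            throw new IllegalArgumentException(
                "The custom PartitionCoalescer must implement java.io.Serializable");
        }
    }

    public static void main(String[] args) {
        // A serializable implementation passes the check.
        requireSerializable((PartitionCoalescer & Serializable) () -> { });
        // A non-serializable one is rejected immediately with a clear message.
        try {
            requireSerializable(() -> { });
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```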

spark git commit: [SPARK-16606][MINOR] Tiny follow-up to , to correct more instances of the same log message typo

2016-10-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 d3c78c4f3 -> a0c03c925 [SPARK-16606][MINOR] Tiny follow-up to , to correct more instances of the same log message typo ## What changes were proposed in this pull request? Tiny follow-up to SPARK-16606 /

spark git commit: [SPARK-16606][MINOR] Tiny follow-up to , to correct more instances of the same log message typo

2016-10-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3fbf5a58c -> 7178c5643 [SPARK-16606][MINOR] Tiny follow-up to , to correct more instances of the same log message typo ## What changes were proposed in this pull request? Tiny follow-up to SPARK-16606 /

spark git commit: [SPARK-18021][SQL] Refactor file name specification for data sources

2016-10-20 Thread rxin
ter clarity - Renamed "path" in multiple places to "stagingDir", to more accurately reflect its meaning ## How was this patch tested? This should be covered by existing data source tests. Author: Reynold Xin <r...@databricks.com> Closes #15562 from rxin/SPARK-180

spark git commit: [SPARK-15780][SQL] Support mapValues on KeyValueGroupedDataset

2016-10-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master fb0894b3a -> 84b245f2d [SPARK-15780][SQL] Support mapValues on KeyValueGroupedDataset ## What changes were proposed in this pull request? Add mapValues to KeyValueGroupedDataset ## How was this patch tested? New test in DatasetSuite for

spark git commit: [SPARK-17698][SQL] Join predicates should not contain filter clauses

2016-10-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master e895bc254 -> fb0894b3a [SPARK-17698][SQL] Join predicates should not contain filter clauses ## What changes were proposed in this pull request? Jira : https://issues.apache.org/jira/browse/SPARK-17698 `ExtractEquiJoinKeys` is incorrectly

spark git commit: [SPARK-17991][SQL] Enable metastore partition pruning by default.

2016-10-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master 39755169f -> 4bd17c460 [SPARK-17991][SQL] Enable metastore partition pruning by default. ## What changes were proposed in this pull request? This should apply to non-converted metastore relations. WIP to see if this causes any test

spark git commit: [SPARK-18003][SPARK CORE] Fix bug of RDD zipWithIndex & zipWithUniqueId index value overflowing

2016-10-20 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 995f602d2 -> 4131623a8 [SPARK-18003][SPARK CORE] Fix bug of RDD zipWithIndex & zipWithUniqueId index value overflowing ## What changes were proposed in this pull request? - Fix bug of RDD `zipWithIndex` generating wrong result when

spark git commit: [SPARK-18003][SPARK CORE] Fix bug of RDD zipWithIndex & zipWithUniqueId index value overflowing

2016-10-20 Thread rxin
Repository: spark Updated Branches: refs/heads/master f313117bc -> 39755169f [SPARK-18003][SPARK CORE] Fix bug of RDD zipWithIndex & zipWithUniqueId index value overflowing ## What changes were proposed in this pull request? - Fix bug of RDD `zipWithIndex` generating wrong result when one
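
The SPARK-18003 overflow above is the classic int-arithmetic trap: a global element index built from per-partition counts silently wraps once it exceeds Integer.MAX_VALUE. A minimal Java illustration, with hypothetical names and partition math (not Spark's actual zipWithIndex code):

```java
public class ZipWithIndexOverflow {
    // Buggy form: the multiply happens in int arithmetic and wraps before
    // the result is widened to long.
    static long globalIndexWrong(int partition, int elementsPerPartition, int offset) {
        return partition * elementsPerPartition + offset;
    }

    // Fixed form: promote to long before multiplying.
    static long globalIndexRight(int partition, int elementsPerPartition, int offset) {
        return (long) partition * elementsPerPartition + offset;
    }

    public static void main(String[] args) {
        // 1024 partitions x 3,000,000 elements ~ 3.07 billion, past the int range.
        System.out.println(globalIndexWrong(1024, 3_000_000, 0));  // wrapped, negative
        System.out.println(globalIndexRight(1024, 3_000_000, 0));  // 3072000000
    }
}
```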

spark git commit: [SPARK-16078][SQL] Backport: from_utc_timestamp/to_utc_timestamp should not depend on local timezone

2016-10-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 b95ac0d00 -> 82e98f126 [SPARK-16078][SQL] Backport: from_utc_timestamp/to_utc_timestamp should not depend on local timezone ## What changes were proposed in this pull request? Back-port of https://github.com/apache/spark/pull/13784

spark git commit: [SPARK-17989][SQL] Check ascendingOrder type in sort_array function rather than throwing ClassCastException

2016-10-19 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 cdd2570e6 -> 995f602d2 [SPARK-17989][SQL] Check ascendingOrder type in sort_array function rather than throwing ClassCastException ## What changes were proposed in this pull request? This PR proposes to check the second argument,

spark git commit: [SPARK-17989][SQL] Check ascendingOrder type in sort_array function rather than throwing ClassCastException

2016-10-19 Thread rxin
Repository: spark Updated Branches: refs/heads/master 444c2d22e -> 4b2011ec9 [SPARK-17989][SQL] Check ascendingOrder type in sort_array function rather than throwing ClassCastException ## What changes were proposed in this pull request? This PR proposes to check the second argument,
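
The SPARK-17989 entries above replace a raw ClassCastException with an up-front type check. A minimal, hypothetical Java sketch of the pattern (not Spark's actual sort_array analysis code): validate the argument's type and report a readable error, instead of casting blindly and letting the cast failure surface at runtime.

```java
public class SortArrayCheck {
    // Check the sort-order argument explicitly; a blind (Boolean) cast would
    // throw an unhelpful ClassCastException for a non-boolean value.
    static boolean ascending(Object ascendingOrder) {
        if (!(ascendingOrder instanceof Boolean)) {
            throw new IllegalArgumentException(
                "Sort order must be a boolean literal, got: " + ascendingOrder);
        }
        return (Boolean) ascendingOrder;
    }

    public static void main(String[] args) {
        System.out.println(ascending(Boolean.TRUE));   // true
        try {
            ascending("yes");                          // rejected with a clear message
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```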

spark git commit: [SPARK-18001][DOCUMENT] fix broke link to SparkDataFrame

2016-10-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 3796a98cf -> cdd2570e6 [SPARK-18001][DOCUMENT] fix broke link to SparkDataFrame ## What changes were proposed in this pull request? In http://spark.apache.org/docs/latest/sql-programming-guide.html, Section "Untyped Dataset

spark git commit: [SPARK-18001][DOCUMENT] fix broke link to SparkDataFrame

2016-10-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4329c5cea -> f39852e59 [SPARK-18001][DOCUMENT] fix broke link to SparkDataFrame ## What changes were proposed in this pull request? In http://spark.apache.org/docs/latest/sql-programming-guide.html, Section "Untyped Dataset Operations

spark git commit: [SPARK-17841][STREAMING][KAFKA] drain commitQueue

2016-10-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 6ef923137 -> f6b87939c [SPARK-17841][STREAMING][KAFKA] drain commitQueue ## What changes were proposed in this pull request? Actually drain commit queue rather than just iterating it. iterator() on a concurrent linked queue won't

spark git commit: [SPARK-17841][STREAMING][KAFKA] drain commitQueue

2016-10-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master cd662bc7a -> cd106b050 [SPARK-17841][STREAMING][KAFKA] drain commitQueue ## What changes were proposed in this pull request? Actually drain commit queue rather than just iterating it. iterator() on a concurrent linked queue won't remove
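
The SPARK-17841 snippets above hinge on a ConcurrentLinkedQueue detail: iterator() visits elements but removes none of them, so a for-each "drain" leaves the queue full, while poll() removes as it reads. A minimal Java demonstration (names are illustrative, not Spark's Kafka code):

```java
import java.util.concurrent.ConcurrentLinkedQueue;

public class DrainDemo {
    // Looks like a drain, but only reads: the queue is left untouched.
    static int drainByIteration(ConcurrentLinkedQueue<String> queue) {
        int seen = 0;
        for (String s : queue) seen++;
        return seen;
    }

    // Actually drains: poll() removes the head element on each call.
    static int drainByPolling(ConcurrentLinkedQueue<String> queue) {
        int drained = 0;
        while (queue.poll() != null) drained++;
        return drained;
    }

    public static void main(String[] args) {
        ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();
        queue.add("offset-1");
        queue.add("offset-2");
        System.out.println(drainByIteration(queue) + " seen, " + queue.size() + " left");  // 2 seen, 2 left
        System.out.println(drainByPolling(queue) + " drained, " + queue.size() + " left"); // 2 drained, 0 left
    }
}
```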

spark git commit: Revert "[SPARK-17985][CORE] Bump commons-lang3 version to 3.5."

2016-10-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master b3130c7b6 -> cd662bc7a Revert "[SPARK-17985][CORE] Bump commons-lang3 version to 3.5." This reverts commit bfe7885aee2f406c1bbde08e30809a0b4bb070d2. The commit caused build failures on Hadoop 2.2 profile: ``` [error] /scrat

spark git commit: [SPARK-17955][SQL] Make DataFrameReader.jdbc call DataFrameReader.format("jdbc").load

2016-10-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4518642ab -> b3130c7b6 [SPARK-17955][SQL] Make DataFrameReader.jdbc call DataFrameReader.format("jdbc").load ## What changes were proposed in this pull request? This PR proposes to make `DataFrameReader.jdbc` call

spark git commit: [MINOR][DOC] Add more built-in sources in sql-programming-guide.md

2016-10-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master bfe7885ae -> 20dd11096 [MINOR][DOC] Add more built-in sources in sql-programming-guide.md ## What changes were proposed in this pull request? Add more built-in sources in sql-programming-guide.md. ## How was this patch tested? Manually.

spark git commit: [MINOR][DOC] Add more built-in sources in sql-programming-guide.md

2016-10-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 26e978a93 -> 6ef923137 [MINOR][DOC] Add more built-in sources in sql-programming-guide.md ## What changes were proposed in this pull request? Add more built-in sources in sql-programming-guide.md. ## How was this patch tested?

spark git commit: [SPARK-17985][CORE] Bump commons-lang3 version to 3.5.

2016-10-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4ef39c2f4 -> bfe7885ae [SPARK-17985][CORE] Bump commons-lang3 version to 3.5. ## What changes were proposed in this pull request? `SerializationUtils.clone()` of commons-lang3 (<3.5) has a bug that breaks thread safety, which gets stack
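
For context on the SPARK-17985 entry: SerializationUtils.clone() performs essentially a serialize-then-deserialize round trip. The sketch below reproduces that round trip with only JDK streams; the commons-lang3 (< 3.5) thread-safety bug referenced above lived inside that library's implementation, not in the round trip as such.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

public class CloneDemo {
    // Deep-clone via Java serialization: write the object graph to bytes,
    // then read it back as an independent copy.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepClone(T obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>();
        original.add("a");
        original.add("b");
        ArrayList<String> copy = deepClone(original);
        System.out.println(copy.equals(original) + " " + (copy != original));  // true true
    }
}
```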

spark git commit: [SPARK-17899][SQL][FOLLOW-UP] debug mode should work for corrupted table

2016-10-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master a9e79a41e -> e59df62e6 [SPARK-17899][SQL][FOLLOW-UP] debug mode should work for corrupted table ## What changes were proposed in this pull request? Debug mode should work for corrupted table, so that we can really debug ## How was this

spark git commit: Revert "[SPARK-17974] Refactor FileCatalog classes to simplify the inheritance tree"

2016-10-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8daa1a29b -> 1c5a7d7f6 Revert "[SPARK-17974] Refactor FileCatalog classes to simplify the inheritance tree" This reverts commit 8daa1a29b65a9b5337518458e9ece1619e8a01e3. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-17974] Refactor FileCatalog classes to simplify the inheritance tree

2016-10-17 Thread rxin
Repository: spark Updated Branches: refs/heads/master 813ab5e02 -> 8daa1a29b [SPARK-17974] Refactor FileCatalog classes to simplify the inheritance tree ## What changes were proposed in this pull request? This renames `BasicFileCatalog => FileCatalog`, combines `SessionFileCatalog` with

spark git commit: [MINOR][SQL] Add prettyName for current_database function

2016-10-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master e18d02c5a -> 56b0f5f4d [MINOR][SQL] Add prettyName for current_database function ## What changes were proposed in this pull request? Added a `prettyname` for current_database function. ## How was this patch tested? Manually. Before: ```

spark git commit: [MINOR][SQL] Add prettyName for current_database function

2016-10-16 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 3cc2fe5b9 -> ca66f52ff [MINOR][SQL] Add prettyName for current_database function ## What changes were proposed in this pull request? Added a `prettyname` for current_database function. ## How was this patch tested? Manually. Before:

[spark] Git Push Summary

2016-10-16 Thread rxin
Repository: spark Updated Tags: refs/tags/v1.6.3-rc1 [created] 7375bb0c8

[spark] Git Push Summary

2016-10-16 Thread rxin
Repository: spark Updated Tags: refs/tags/v1.6.3 [deleted] 7375bb0c8

spark git commit: Prepare branch-1.6 for 1.6.3 release.

2016-10-16 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 745c5e70f -> 0f577857c Prepare branch-1.6 for 1.6.3 release. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0f577857 Tree:

spark git commit: [SPARK-17819][SQL] Support default database in connection URIs for Spark Thrift Server

2016-10-16 Thread rxin
Repository: spark Updated Branches: refs/heads/master 72a6e7a57 -> 59e3eb5af [SPARK-17819][SQL] Support default database in connection URIs for Spark Thrift Server ## What changes were proposed in this pull request? Currently, Spark Thrift Server ignores the default database in URI. This PR

spark git commit: Revert "[SPARK-17637][SCHEDULER] Packed scheduling for Spark tasks across executors"

2016-10-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master ed1463341 -> 72a6e7a57 Revert "[SPARK-17637][SCHEDULER] Packed scheduling for Spark tasks across executors" This reverts commit ed1463341455830b8867b721a1b34f291139baf3. The patch as merged had obvious quality and documentation issues. The

spark git commit: [SPARK-17953][DOCUMENTATION] Fix typo in SparkSession scaladoc

2016-10-15 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 c53b83749 -> 2a1b10b64 [SPARK-17953][DOCUMENTATION] Fix typo in SparkSession scaladoc ## What changes were proposed in this pull request? ### Before: ```scala SparkSession.builder() .master("local") .appName("Word Count")

spark git commit: [SPARK-17953][DOCUMENTATION] Fix typo in SparkSession scaladoc

2016-10-15 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6ce1b675e -> 36d81c2c6 [SPARK-17953][DOCUMENTATION] Fix typo in SparkSession scaladoc ## What changes were proposed in this pull request? ### Before: ```scala SparkSession.builder() .master("local") .appName("Word Count")

[2/2] spark git commit: [SPARK-16980][SQL] Load only catalog table partition metadata required to answer a query

2016-10-14 Thread rxin
[SPARK-16980][SQL] Load only catalog table partition metadata required to answer a query (This PR addresses https://issues.apache.org/jira/browse/SPARK-16980.) ## What changes were proposed in this pull request? In a new Spark session, when a partitioned Hive table is converted to use Spark's
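The idea behind this change, sketched in plain Python (names invented for illustration): rather than loading metadata for every partition up front, fetch only the partitions whose spec satisfies the query's partition predicate.

```python
# Illustrative sketch of partition-metadata pruning; the real change
# pushes the predicate into the catalog's partition lookup.
def prune_partitions(partition_specs, predicate):
    """Return only the partition specs the query actually needs."""
    return [spec for spec in partition_specs if predicate(spec)]

specs = [{"ds": "2016-10-14"}, {"ds": "2016-10-15"}, {"ds": "2016-10-16"}]
needed = prune_partitions(specs, lambda s: s["ds"] >= "2016-10-15")
print(len(needed))  # 2 of the 3 partitions survive the filter
```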

[1/2] spark git commit: [SPARK-16980][SQL] Load only catalog table partition metadata required to answer a query

2016-10-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2d96d35dc -> 6ce1b675e http://git-wip-us.apache.org/repos/asf/spark/blob/6ce1b675/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClient.scala -- diff --git

spark git commit: [SPARK-17946][PYSPARK] Python crossJoin API similar to Scala

2016-10-14 Thread rxin
Repository: spark Updated Branches: refs/heads/master 72adfbf94 -> 2d96d35dc [SPARK-17946][PYSPARK] Python crossJoin API similar to Scala ## What changes were proposed in this pull request? Add a crossJoin function to the DataFrame API similar to that in Scala. Joins with no condition
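The semantics of `crossJoin` can be sketched in plain Python as a Cartesian product over rows (illustrative only; the actual PySpark API is `df1.crossJoin(df2)` on DataFrames):

```python
from itertools import product

# Illustrative semantics of DataFrame.crossJoin: every row of the left
# side paired with every row of the right side, no join condition.
def cross_join(left, right):
    return [l + r for l, r in product(left, right)]

left = [("a", 1), ("b", 2)]
right = [("x",), ("y",)]
rows = cross_join(left, right)
print(len(rows))  # 2 * 2 = 4 rows
```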

spark git commit: [SPARK-17884][SQL] To resolve Null pointer exception when casting from empty string to interval type

2016-10-14 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 18b173cfc -> 745c5e70f [SPARK-17884][SQL] To resolve Null pointer exception when casting from empty string to interval type ## What changes were proposed in this pull request? This change adds a check in castToInterval method of Cast
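The shape of the guard this commit adds can be sketched as follows (hedged: the real fix lives in `Cast.castToInterval` in Scala, and `cast_to_interval` with its toy parser below is an invented stand-in): an empty string should cast to SQL NULL rather than trigger a NullPointerException.

```python
import re

# Toy sketch of the empty-string guard; only the check on the first two
# lines of the function corresponds to the actual fix.
def cast_to_interval(s):
    if s is None or s.strip() == "":
        return None  # behave like SQL NULL instead of crashing
    m = re.match(r"interval (\d+) seconds?", s.strip())
    return int(m.group(1)) if m else None

print(cast_to_interval(""))                    # None, no crash
print(cast_to_interval("interval 5 seconds"))  # 5
```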

spark git commit: [SPARK-17661][SQL] Consolidate various listLeafFiles implementations

2016-10-13 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7106866c2 -> adc112429 [SPARK-17661][SQL] Consolidate various listLeafFiles implementations ## What changes were proposed in this pull request? There are 4 listLeafFiles-related functions in Spark: - ListingFileCatalog.listLeafFiles
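Conceptually, every one of those implementations performs the same operation, sketched minimally below (assumption: this plain-`pathlib` version ignores the Hadoop FileSystem handling, hidden-file filters, and parallel listing the real Spark code needs):

```python
from pathlib import Path

# Minimal sketch of "list leaf files": recursively enumerate the
# non-directory entries under a root path.
def list_leaf_files(root):
    return sorted(p for p in Path(root).rglob("*") if p.is_file())
```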

spark git commit: minor doc fix for Row.scala

2016-10-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 064d6650e -> 7222a25a1 minor doc fix for Row.scala ## What changes were proposed in this pull request? minor doc fix for "getAnyValAs" in class Row ## How was this patch tested? None. (If this patch involves UI changes, please attach a

spark git commit: minor doc fix for Row.scala

2016-10-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 ab00e410c -> d38f38a09 minor doc fix for Row.scala ## What changes were proposed in this pull request? minor doc fix for "getAnyValAs" in class Row ## How was this patch tested? None. (If this patch involves UI changes, please

spark git commit: [SPARK-16827][BRANCH-2.0] Avoid reporting spill metrics as shuffle metrics

2016-10-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 050b8177e -> 5903dabc5 [SPARK-16827][BRANCH-2.0] Avoid reporting spill metrics as shuffle metrics ## What changes were proposed in this pull request? Fix a bug where spill metrics were being reported as shuffle metrics. Eventually

spark git commit: [SPARK-17840][DOCS] Add some pointers for wiki/CONTRIBUTING.md in README.md and some warnings in PULL_REQUEST_TEMPLATE

2016-10-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5cc503f4f -> f8062b63f [SPARK-17840][DOCS] Add some pointers for wiki/CONTRIBUTING.md in README.md and some warnings in PULL_REQUEST_TEMPLATE ## What changes were proposed in this pull request? Link to contributing wiki in PR template,

spark git commit: [SPARK-17884][SQL] To resolve Null pointer exception when casting from empty string to interval type.

2016-10-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 4dcbde48d -> 5451541d1 [SPARK-17884][SQL] To resolve Null pointer exception when casting from empty string to interval type. ## What changes were proposed in this pull request? This change adds a check in castToInterval method of Cast

spark git commit: [SPARK-17884][SQL] To resolve Null pointer exception when casting from empty string to interval type.

2016-10-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8880fd13e -> d5580ebaa [SPARK-17884][SQL] To resolve Null pointer exception when casting from empty string to interval type. ## What changes were proposed in this pull request? This change adds a check in castToInterval method of Cast

spark git commit: [SPARK-14761][SQL] Reject invalid join methods when join columns are not specified in PySpark DataFrame join.

2016-10-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8d33e1e5b -> 8880fd13e [SPARK-14761][SQL] Reject invalid join methods when join columns are not specified in PySpark DataFrame join. ## What changes were proposed in this pull request? In PySpark, the invalid join type will not throw
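The validation being added here can be sketched like this (hedged: the set of join types below is illustrative, not the exact list PySpark accepts):

```python
# Sketch of fail-fast validation of the `how` argument: an unknown join
# type should raise a clear error instead of being silently ignored.
_VALID_JOIN_TYPES = {"inner", "outer", "left_outer", "right_outer", "leftsemi"}

def validate_join_type(how):
    if how not in _VALID_JOIN_TYPES:
        raise ValueError(f"Unsupported join type: {how!r}; "
                         f"expected one of {sorted(_VALID_JOIN_TYPES)}")
    return how
```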

spark git commit: [SPARK-17853][STREAMING][KAFKA][DOC] make it clear that reusing group.id is bad

2016-10-12 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 f3d82b53c -> f12b74c02 [SPARK-17853][STREAMING][KAFKA][DOC] make it clear that reusing group.id is bad ## What changes were proposed in this pull request? Documentation fix to make it clear that reusing group id for different streams

spark git commit: [SPARK-17853][STREAMING][KAFKA][DOC] make it clear that reusing group.id is bad

2016-10-12 Thread rxin
Repository: spark Updated Branches: refs/heads/master b512f04f8 -> c264ef9b1 [SPARK-17853][STREAMING][KAFKA][DOC] make it clear that reusing group.id is bad ## What changes were proposed in this pull request? Documentation fix to make it clear that reusing group id for different streams is

spark git commit: [SPARK-17880][DOC] The url linking to `AccumulatorV2` in the document is incorrect.

2016-10-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 e68e95e94 -> f3d82b53c [SPARK-17880][DOC] The url linking to `AccumulatorV2` in the document is incorrect. ## What changes were proposed in this pull request? In `programming-guide.md`, the url which links to `AccumulatorV2` says

spark git commit: [SPARK-17880][DOC] The url linking to `AccumulatorV2` in the document is incorrect.

2016-10-11 Thread rxin
Repository: spark Updated Branches: refs/heads/master 299eb04ba -> b512f04f8 [SPARK-17880][DOC] The url linking to `AccumulatorV2` in the document is incorrect. ## What changes were proposed in this pull request? In `programming-guide.md`, the url which links to `AccumulatorV2` says

spark git commit: Fix hadoop.version in building-spark.md

2016-10-11 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.0 5ec3e6680 -> e68e95e94 Fix hadoop.version in building-spark.md A couple of mvn build examples use `-Dhadoop.version=VERSION` instead of an actual version number. Author: Alexander Pivovarov Closes #15440 from
