spark git commit: [SPARK-20590][SQL] Use Spark internal datasource if multiples are found for the same shorten name

2017-05-09 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.2 6a996b362 -> 7b6f3a118 [SPARK-20590][SQL] Use Spark internal datasource if multiples are found for the same shorten name ## What changes were proposed in this pull request? One of the common usability problems around reading data in

spark git commit: [SPARK-20590][SQL] Use Spark internal datasource if multiples are found for the same shorten name

2017-05-09 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 771abeb46 -> 3d2131ab4 [SPARK-20590][SQL] Use Spark internal datasource if multiples are found for the same shorten name ## What changes were proposed in this pull request? One of the common usability problems around reading data in

spark git commit: [SPARK-17685][SQL] Make SortMergeJoinExec's currentVars is null when calling createJoinKey

2017-05-09 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.1 12c937ede -> 50f28dfe4 [SPARK-17685][SQL] Make SortMergeJoinExec's currentVars is null when calling createJoinKey ## What changes were proposed in this pull request? The following SQL query causes an `IndexOutOfBoundsException` issue

spark git commit: [SPARK-17685][SQL] Make SortMergeJoinExec's currentVars is null when calling createJoinKey

2017-05-09 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.2 7600a7ab6 -> 6a996b362 [SPARK-17685][SQL] Make SortMergeJoinExec's currentVars is null when calling createJoinKey ## What changes were proposed in this pull request? The following SQL query causes an `IndexOutOfBoundsException` issue

spark git commit: [SPARK-17685][SQL] Make SortMergeJoinExec's currentVars is null when calling createJoinKey

2017-05-09 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master c0189abc7 -> 771abeb46 [SPARK-17685][SQL] Make SortMergeJoinExec's currentVars is null when calling createJoinKey ## What changes were proposed in this pull request? The following SQL query causes an `IndexOutOfBoundsException` issue when

spark git commit: [SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataFrame.withWatermark()` does not execute

2017-05-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.2 d191b962d -> 7600a7ab6 [SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataFrame.withWatermark()` does not execute ## What changes were proposed in this pull request? Any Dataset/DataFrame batch query with the operation

spark git commit: [SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataFrame.withWatermark()` does not execute

2017-05-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master f79aa285c -> c0189abc7 [SPARK-20373][SQL][SS] Batch queries with 'Dataset/DataFrame.withWatermark()` does not execute ## What changes were proposed in this pull request? Any Dataset/DataFrame batch query with the operation

spark git commit: Revert "[SPARK-20311][SQL] Support aliases for table value functions"

2017-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-2.2 9e8d23b3a -> d191b962d Revert "[SPARK-20311][SQL] Support aliases for table value functions" This reverts commit 714811d0b5bcb5d47c39782ff74f898d276ecc59. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: Revert "[SPARK-20311][SQL] Support aliases for table value functions"

2017-05-09 Thread yhuai
Repository: spark Updated Branches: refs/heads/master ac1ab6b9d -> f79aa285c Revert "[SPARK-20311][SQL] Support aliases for table value functions" This reverts commit 714811d0b5bcb5d47c39782ff74f898d276ecc59. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: Revert "[SPARK-12297][SQL] Hive compatibility for Parquet Timestamps"

2017-05-09 Thread rxin
Repository: spark Updated Branches: refs/heads/master 1b85bcd92 -> ac1ab6b9d Revert "[SPARK-12297][SQL] Hive compatibility for Parquet Timestamps" This reverts commit 22691556e5f0dfbac81b8cc9ca0a67c70c1711ca. See JIRA ticket for more information. Project:

spark git commit: [SPARK-20627][PYSPARK] Drop the hadoop distirbution name from the Python version

2017-05-09 Thread holden
Repository: spark Updated Branches: refs/heads/branch-2.2 c7bd909f6 -> 9e8d23b3a [SPARK-20627][PYSPARK] Drop the hadoop distirbution name from the Python version ## What changes were proposed in this pull request? Drop the hadoop distribution name from the Python version (PEP440 -

spark git commit: [SPARK-20627][PYSPARK] Drop the hadoop distirbution name from the Python version

2017-05-09 Thread holden
Repository: spark Updated Branches: refs/heads/branch-2.1 f7a91a17e -> 12c937ede [SPARK-20627][PYSPARK] Drop the hadoop distirbution name from the Python version ## What changes were proposed in this pull request? Drop the hadoop distribution name from the Python version (PEP440 -

spark git commit: [SPARK-20627][PYSPARK] Drop the hadoop distirbution name from the Python version

2017-05-09 Thread holden
Repository: spark Updated Branches: refs/heads/master 25ee816e0 -> 1b85bcd92 [SPARK-20627][PYSPARK] Drop the hadoop distirbution name from the Python version ## What changes were proposed in this pull request? Drop the hadoop distribution name from the Python version (PEP440 -

spark git commit: [SPARK-19876][BUILD] Move Trigger.java to java source hierarchy

2017-05-09 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.2 73aa23b8e -> c7bd909f6 [SPARK-19876][BUILD] Move Trigger.java to java source hierarchy ## What changes were proposed in this pull request? Simply moves `Trigger.java` to `src/main/java` from `src/main/scala` See

spark git commit: [SPARK-19876][BUILD] Move Trigger.java to java source hierarchy

2017-05-09 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master d099f414d -> 25ee816e0 [SPARK-19876][BUILD] Move Trigger.java to java source hierarchy ## What changes were proposed in this pull request? Simply moves `Trigger.java` to `src/main/java` from `src/main/scala` See

spark git commit: [SPARK-20674][SQL] Support registering UserDefinedFunction as named UDF

2017-05-09 Thread lixiao
Repository: spark Updated Branches: refs/heads/branch-2.2 08e1b78f0 -> 73aa23b8e [SPARK-20674][SQL] Support registering UserDefinedFunction as named UDF ## What changes were proposed in this pull request? For some reason we don't have an API to register UserDefinedFunction as named UDF. It

spark git commit: [SPARK-20674][SQL] Support registering UserDefinedFunction as named UDF

2017-05-09 Thread lixiao
Repository: spark Updated Branches: refs/heads/master f561a76b2 -> d099f414d [SPARK-20674][SQL] Support registering UserDefinedFunction as named UDF ## What changes were proposed in this pull request? For some reason we don't have an API to register UserDefinedFunction as named UDF. It is a

spark git commit: [SPARK-20548][FLAKY-TEST] share one REPL instance among REPL test cases

2017-05-09 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.2 272d2a10d -> 08e1b78f0 [SPARK-20548][FLAKY-TEST] share one REPL instance among REPL test cases `ReplSuite.newProductSeqEncoder with REPL defined class` was flaky and throws OOM exception frequently. By analyzing the heap dump, we

spark git commit: [SPARK-20548][FLAKY-TEST] share one REPL instance among REPL test cases

2017-05-09 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 181261a81 -> f561a76b2 [SPARK-20548][FLAKY-TEST] share one REPL instance among REPL test cases ## What changes were proposed in this pull request? `ReplSuite.newProductSeqEncoder with REPL defined class` was flaky and throws OOM

spark git commit: [SPARK-20355] Add per application spark version on the history server headerpage

2017-05-09 Thread tgraves
Repository: spark Updated Branches: refs/heads/master 714811d0b -> 181261a81 [SPARK-20355] Add per application spark version on the history server headerpage ## What changes were proposed in this pull request? Spark Version for a specific application is not displayed on the history page

spark git commit: [SPARK-20311][SQL] Support aliases for table value functions

2017-05-09 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.2 b3309676b -> 272d2a10d [SPARK-20311][SQL] Support aliases for table value functions ## What changes were proposed in this pull request? This PR added parsing rules to support aliases in table value functions. ## How was this patch

spark git commit: [SPARK-20311][SQL] Support aliases for table value functions

2017-05-09 Thread wenchen
Repository: spark Updated Branches: refs/heads/master 0d00c768a -> 714811d0b [SPARK-20311][SQL] Support aliases for table value functions ## What changes were proposed in this pull request? This PR added parsing rules to support aliases in table value functions. ## How was this patch tested?

spark git commit: [SPARK-20667][SQL][TESTS] Cleanup the cataloged metadata after completing the package of sql/core and sql/hive

2017-05-09 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.2 4b7aa0b1d -> b3309676b [SPARK-20667][SQL][TESTS] Cleanup the cataloged metadata after completing the package of sql/core and sql/hive ## What changes were proposed in this pull request? So far, we do not drop all the cataloged

spark git commit: [SPARK-20667][SQL][TESTS] Cleanup the cataloged metadata after completing the package of sql/core and sql/hive

2017-05-09 Thread wenchen
Repository: spark Updated Branches: refs/heads/master b8733e0ad -> 0d00c768a [SPARK-20667][SQL][TESTS] Cleanup the cataloged metadata after completing the package of sql/core and sql/hive ## What changes were proposed in this pull request? So far, we do not drop all the cataloged objects

spark git commit: [SPARK-20606][ML] ML 2.2 QA: Remove deprecated methods for ML

2017-05-09 Thread yliang
Repository: spark Updated Branches: refs/heads/branch-2.2 4bbfad44e -> 4b7aa0b1d [SPARK-20606][ML] ML 2.2 QA: Remove deprecated methods for ML ## What changes were proposed in this pull request? Remove ML methods we deprecated in 2.1. ## How was this patch tested? Existing tests. Author:

spark git commit: [SPARK-20606][ML] ML 2.2 QA: Remove deprecated methods for ML

2017-05-09 Thread yliang
Repository: spark Updated Branches: refs/heads/master be53a7835 -> b8733e0ad [SPARK-20606][ML] ML 2.2 QA: Remove deprecated methods for ML ## What changes were proposed in this pull request? Remove ML methods we deprecated in 2.1. ## How was this patch tested? Existing tests. Author: Yanbo

spark-website git commit: Direct 2.1.0, 2.0.1 downloads to archive; use https links for download; Apache Hadoop; remove stale download logic

2017-05-09 Thread srowen
Repository: spark-website Updated Branches: refs/heads/asf-site 7b32b181f -> b54c4f3fa Direct 2.1.0, 2.0.1 downloads to archive; use https links for download; Apache Hadoop; remove stale download logic Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo Commit:

spark git commit: [SPARK-20615][ML][TEST] SparseVector.argmax throws IndexOutOfBoundsException

2017-05-09 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.2 ca3f7edba -> 4bbfad44e [SPARK-20615][ML][TEST] SparseVector.argmax throws IndexOutOfBoundsException ## What changes were proposed in this pull request? Added a check for the number of defined values. Previously the argmax

spark git commit: [SPARK-20615][ML][TEST] SparseVector.argmax throws IndexOutOfBoundsException

2017-05-09 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-2.1 a1112c615 -> f7a91a17e [SPARK-20615][ML][TEST] SparseVector.argmax throws IndexOutOfBoundsException ## What changes were proposed in this pull request? Added a check for the number of defined values. Previously the argmax

spark git commit: [SPARK-20615][ML][TEST] SparseVector.argmax throws IndexOutOfBoundsException

2017-05-09 Thread srowen
Repository: spark Updated Branches: refs/heads/master 10b00abad -> be53a7835 [SPARK-20615][ML][TEST] SparseVector.argmax throws IndexOutOfBoundsException ## What changes were proposed in this pull request? Added a check for the number of defined values. Previously the argmax function

spark git commit: [SPARK-20587][ML] Improve performance of ML ALS recommendForAll

2017-05-09 Thread mlnick
Repository: spark Updated Branches: refs/heads/branch-2.2 72fca9a0a -> ca3f7edba [SPARK-20587][ML] Improve performance of ML ALS recommendForAll This PR is a `DataFrame` version of #17742 for [SPARK-11968](https://issues.apache.org/jira/browse/SPARK-11968), for improving the performance of

spark git commit: [SPARK-20587][ML] Improve performance of ML ALS recommendForAll

2017-05-09 Thread mlnick
Repository: spark Updated Branches: refs/heads/master 807942476 -> 10b00abad [SPARK-20587][ML] Improve performance of ML ALS recommendForAll This PR is a `DataFrame` version of #17742 for [SPARK-11968](https://issues.apache.org/jira/browse/SPARK-11968), for improving the performance of

spark git commit: [SPARK-11968][MLLIB] Optimize MLLIB ALS recommendForAll

2017-05-09 Thread mlnick
Repository: spark Updated Branches: refs/heads/master b952b44af -> 807942476 [SPARK-11968][MLLIB] Optimize MLLIB ALS recommendForAll The recommendForAll of MLLIB ALS is very slow. GC is a key problem of the current method. The task uses the following code to keep temp result: val output = new

spark git commit: [SPARK-11968][MLLIB] Optimize MLLIB ALS recommendForAll

2017-05-09 Thread mlnick
Repository: spark Updated Branches: refs/heads/branch-2.2 54e074349 -> 72fca9a0a [SPARK-11968][MLLIB] Optimize MLLIB ALS recommendForAll The recommendForAll of MLLIB ALS is very slow. GC is a key problem of the current method. The task uses the following code to keep temp result: val output =