spark git commit: [DOC] bucketing is applicable to all file-based data sources

2016-12-21 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7c5b7b3a2 -> 2e861df96 [DOC] bucketing is applicable to all file-based data sources ## What changes were proposed in this pull request? Starting with Spark 2.1.0, the bucketing feature is available for all file-based data sources. This patch fixes
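Since the summary is truncated, here is a minimal sketch of what bucketing with a generic file-based source looks like; the table and column names are illustrative, not from the commit:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: bucketing with a file-based source (Parquet here).
val spark = SparkSession.builder().master("local[*]").getOrCreate()
val df = spark.range(1000).selectExpr("id % 10 AS user_id", "id AS value")
df.write
  .bucketBy(8, "user_id")          // hash rows into 8 buckets by user_id
  .sortBy("user_id")               // keep rows sorted within each bucket
  .format("parquet")               // any file-based source works as of 2.1.0
  .saveAsTable("events_bucketed")  // bucketing requires saveAsTable
```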

spark git commit: [DOC] bucketing is applicable to all file-based data sources

2016-12-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 def3690f6 -> ec0d6e21e [DOC] bucketing is applicable to all file-based data sources ## What changes were proposed in this pull request? Starting with Spark 2.1.0, the bucketing feature is available for all file-based data sources. This patch

spark git commit: [SQL] Minor readability improvement for partition handling code

2016-12-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 07e2a17d1 -> def3690f6 [SQL] Minor readability improvement for partition handling code This patch includes minor changes to improve the readability of the partition handling code. I'm in the middle of implementing a new feature and found

spark git commit: [SQL] Minor readability improvement for partition handling code

2016-12-21 Thread wenchen
Repository: spark Updated Branches: refs/heads/master ff7d82a20 -> 7c5b7b3a2 [SQL] Minor readability improvement for partition handling code ## What changes were proposed in this pull request? This patch includes minor changes to improve the readability of the partition handling code. I'm in the

spark git commit: [SPARK-18908][SS] Creating StreamingQueryException should check if logicalPlan is created

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 9a3c5bd70 -> 07e2a17d1 [SPARK-18908][SS] Creating StreamingQueryException should check if logicalPlan is created ## What changes were proposed in this pull request? This PR audits places using `logicalPlan` in StreamExecution and
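The summary is cut off, but the gist is defensive handling of a lazily created plan when constructing an exception. A self-contained sketch of that pattern, with illustrative names rather than the actual StreamExecution code:

```scala
// Only attach the plan to the exception if it was actually created,
// so constructing the exception cannot itself throw.
class StreamRunner(makePlan: () => String) {
  @volatile private var planCreated = false
  private lazy val plan: String = {
    val p = makePlan()
    planCreated = true
    p
  }

  def toException(cause: Throwable): RuntimeException = {
    val planInfo = if (planCreated) plan else "<logical plan not yet created>"
    new RuntimeException(s"Streaming query failed. Plan: $planInfo", cause)
  }
}
```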

spark git commit: [BUILD] make-distribution should find JAVA_HOME for non-RHEL systems

2016-12-21 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master afe36516e -> e1b43dc45 [BUILD] make-distribution should find JAVA_HOME for non-RHEL systems ## What changes were proposed in this pull request? make-distribution.sh should find JAVA_HOME for Ubuntu, macOS, and other non-RHEL systems ## How

spark git commit: [FLAKY-TEST] InputStreamsSuite.socket input stream

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 021952d58 -> 9a3c5bd70 [FLAKY-TEST] InputStreamsSuite.socket input stream ## What changes were proposed in this pull request?

spark git commit: [FLAKY-TEST] InputStreamsSuite.socket input stream

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master 7e8994ffd -> afe36516e [FLAKY-TEST] InputStreamsSuite.socket input stream ## What changes were proposed in this pull request?

spark git commit: [SPARK-18903][SPARKR] Add API to get SparkUI URL

2016-12-21 Thread felixcheung
Repository: spark Updated Branches: refs/heads/master b41ec9977 -> 7e8994ffd [SPARK-18903][SPARKR] Add API to get SparkUI URL ## What changes were proposed in this pull request? Adds an API to get the SparkUI URL from the SparkContext ## How was this patch tested? manual, unit tests Author: Felix Cheung
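The R API presumably mirrors the JVM-side accessor; for reference, a sketch using the Scala `SparkContext.uiWebUrl` (an Option because the UI may be disabled):

```scala
import org.apache.spark.sql.SparkSession

// The UI can be disabled (spark.ui.enabled=false), hence the Option.
val spark = SparkSession.builder().master("local[*]").getOrCreate()
spark.sparkContext.uiWebUrl.foreach(url => println(s"Spark UI at $url"))
```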

spark git commit: [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer

2016-12-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.0 53cd99f65 -> 080ac37fb [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer ## What changes were proposed in this pull request? This PR fixes a `NullPointerException` caused by a following `limit +
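The summary truncates the problematic query, but the shape is a `limit` feeding an aggregation. A hedged reconstruction; the exact repro conditions may differ from the actual bug:

```scala
import org.apache.spark.sql.SparkSession

// Hedged reconstruction of the failing shape: limit followed by aggregation.
val spark = SparkSession.builder().master("local[*]").getOrCreate()
val df = spark.range(100).selectExpr("id % 10 AS k", "id AS v")
df.limit(10).groupBy("k").sum("v").show()
```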

spark git commit: [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer

2016-12-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.1 60e02a173 -> 021952d58 [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer ## What changes were proposed in this pull request? This PR fixes a `NullPointerException` caused by a following `limit +

spark git commit: [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer

2016-12-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 83a6ace0d -> b41ec9977 [SPARK-18528][SQL] Fix a bug to initialise an iterator of aggregation buffer ## What changes were proposed in this pull request? This PR fixes a `NullPointerException` caused by a following `limit +

spark git commit: [SPARK-18234][SS] Made update mode public

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 17ef57fe8 -> 60e02a173 [SPARK-18234][SS] Made update mode public ## What changes were proposed in this pull request? Made update mode public. As part of that, here are the changes: - Updated `DataStreamWriter` to accept "update" - Changed
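A hedged sketch of using the newly public mode; the socket source and console sink are just convenient stand-ins:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").getOrCreate()
val lines = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)
  .load()
val counts = lines.groupBy("value").count()
val query = counts.writeStream
  .outputMode("update")   // emit only rows changed since the last trigger
  .format("console")
  .start()
```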

spark git commit: [SPARK-18234][SS] Made update mode public

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master afd9bc1d8 -> 83a6ace0d [SPARK-18234][SS] Made update mode public ## What changes were proposed in this pull request? Made update mode public. As part of that, here are the changes: - Updated `DataStreamWriter` to accept "update" - Changed

spark git commit: [SPARK-17807][CORE] split test-tags into test-JAR

2016-12-21 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 95efc895e -> afd9bc1d8 [SPARK-17807][CORE] split test-tags into test-JAR Remove spark-tag's compile-scope dependency (and, indirectly, spark-core's compile-scope transitive-dependency) on scalatest by splitting test-oriented tags into

spark git commit: [SPARK-18588][SS][KAFKA] Create a new KafkaConsumer when error happens to fix the flaky test

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 0e51bb085 -> 17ef57fe8 [SPARK-18588][SS][KAFKA] Create a new KafkaConsumer when error happens to fix the flaky test ## What changes were proposed in this pull request? When KafkaSource fails on Kafka errors, we should create a new
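A generic sketch of the recreate-on-error idea; this is illustrative and not the actual Kafka source code:

```scala
// Run `use` with a consumer; on failure, discard the possibly-corrupted
// consumer and retry once with a freshly created one.
def withFreshOnError[T](create: () => AutoCloseable)(use: AutoCloseable => T): T = {
  val first = create()
  val attempt =
    try Right(use(first))
    catch { case e: Exception => Left(e) }
  first.close()
  attempt match {
    case Right(v) => v
    case Left(_) =>
      val fresh = create()   // new consumer after the error
      try use(fresh) finally fresh.close()
  }
}
```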

spark git commit: [SPARK-18588][SS][KAFKA] Create a new KafkaConsumer when error happens to fix the flaky test

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master 354e93618 -> 95efc895e [SPARK-18588][SS][KAFKA] Create a new KafkaConsumer when error happens to fix the flaky test ## What changes were proposed in this pull request? When KafkaSource fails on Kafka errors, we should create a new

spark git commit: [SPARK-18775][SQL] Limit the max number of records written per file

2016-12-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/master 078c71c2d -> 354e93618 [SPARK-18775][SQL] Limit the max number of records written per file ## What changes were proposed in this pull request? Currently, Spark writes a single file out per task, sometimes leading to very large files. It
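If memory serves, the knob this change introduced is the `spark.sql.files.maxRecordsPerFile` configuration; treat the exact key as an assumption:

```scala
import org.apache.spark.sql.SparkSession

// Assumed config key (from memory of this change): cap each output file
// at 10,000 records so one task cannot emit a single huge file.
val spark = SparkSession.builder().master("local[*]").getOrCreate()
spark.conf.set("spark.sql.files.maxRecordsPerFile", 10000L)
spark.range(100000L).write.mode("overwrite").parquet("/tmp/capped_output")
```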

spark git commit: [SPARK-18949][SQL][BACKPORT-2.1] Add recoverPartitions API to Catalog

2016-12-21 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-2.1 318483421 -> 0e51bb085 [SPARK-18949][SQL][BACKPORT-2.1] Add recoverPartitions API to Catalog ### What changes were proposed in this pull request? This PR is to backport https://github.com/apache/spark/pull/16356 to Spark 2.1.1
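The backported API is a one-liner to use; the table name below is illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Re-discover partition directories written outside of Spark and add
// them to the catalog (the table name is illustrative).
val spark = SparkSession.builder().master("local[*]").getOrCreate()
spark.catalog.recoverPartitions("my_partitioned_table")
```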

spark git commit: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for each table's relation in cache

2016-12-21 Thread hvanhovell
Repository: spark Updated Branches: refs/heads/branch-2.0 5f8c0b742 -> 53cd99f65 [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for each table's relation in cache ## What changes were proposed in this pull request? Backport of #16135 to branch-2.0 ## How was this patch tested? Because
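The StripedLock here is presumably Guava's `Striped`; a generic sketch of per-table-name locking with it, not the actual catalog code:

```scala
import com.google.common.util.concurrent.Striped

// Striped hands out one of N locks per key, bounding memory while still
// serializing concurrent work on the same table name.
val tableLocks = Striped.lazyWeakLock(64)

def withTableLock[T](tableName: String)(body: => T): T = {
  val lock = tableLocks.get(tableName)
  lock.lock()
  try body finally lock.unlock()
}
```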

spark git commit: [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 ef206ace2 -> 5f8c0b742 [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window ## What changes were proposed in this pull request? The issue in this test is that the cleanup of RDDs may not

spark git commit: [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 162bdb910 -> 318483421 [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window ## What changes were proposed in this pull request? The issue in this test is that the cleanup of RDDs may not

spark git commit: [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master ccfe60a83 -> 078c71c2d [SPARK-18954][TESTS] Fix flaky test: o.a.s.streaming.BasicOperationsSuite rdd cleanup - map and window ## What changes were proposed in this pull request? The issue in this test is that the cleanup of RDDs may not be

spark git commit: [SPARK-18031][TESTS] Fix flaky test ExecutorAllocationManagerSuite.basic functionality

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-2.1 3c8861d92 -> 162bdb910 [SPARK-18031][TESTS] Fix flaky test ExecutorAllocationManagerSuite.basic functionality ## What changes were proposed in this pull request? The test fails because `test("basic functionality")` doesn't

spark git commit: [SPARK-18031][TESTS] Fix flaky test ExecutorAllocationManagerSuite.basic functionality

2016-12-21 Thread tdas
Repository: spark Updated Branches: refs/heads/master 607a1e63d -> ccfe60a83 [SPARK-18031][TESTS] Fix flaky test ExecutorAllocationManagerSuite.basic functionality ## What changes were proposed in this pull request? The test fails because `test("basic functionality")` doesn't block

spark git commit: [SPARK-18894][SS] Fix event time watermark delay threshold specified in months or years

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 bc54a14b4 -> 3c8861d92 [SPARK-18894][SS] Fix event time watermark delay threshold specified in months or years ## What changes were proposed in this pull request? Two changes: - Fix how delays specified in months and years are
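A hedged sketch of the calendar-interval delays the fix addresses; the source and column names are illustrative:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.current_timestamp

val spark = SparkSession.builder().master("local[*]").getOrCreate()
val events = spark.readStream
  .format("socket").option("host", "localhost").option("port", 9999)
  .load()
  .withColumn("eventTime", current_timestamp())
// "1 month" is exactly the kind of threshold whose translation this fixes.
val withWm = events.withWatermark("eventTime", "1 month")
```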

spark git commit: [SPARK-18894][SS] Fix event time watermark delay threshold specified in months or years

2016-12-21 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 1a6438897 -> 607a1e63d [SPARK-18894][SS] Fix event time watermark delay threshold specified in months or years ## What changes were proposed in this pull request? Two changes: - Fix how delays specified in months and years are translated

spark git commit: [SPARK-18951] Upgrade com.thoughtworks.paranamer/paranamer to 2.6

2016-12-21 Thread yhuai
Repository: spark Updated Branches: refs/heads/master b7650f11c -> 1a6438897 [SPARK-18951] Upgrade com.thoughtworks.paranamer/paranamer to 2.6 ## What changes were proposed in this pull request? I recently hit a bug in com.thoughtworks.paranamer/paranamer, which causes Jackson to fail to handle

spark git commit: [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables

2016-12-21 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.0 2aae220b5 -> ef206ace2 [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables ## What changes were proposed in this pull request? It's a huge waste to call `Catalog.listTables` in `SQLContext.tableNames`, which

spark git commit: [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables

2016-12-21 Thread wenchen
Repository: spark Updated Branches: refs/heads/branch-2.1 063a98e52 -> bc54a14b4 [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables ## What changes were proposed in this pull request? It's a huge waste to call `Catalog.listTables` in `SQLContext.tableNames`, which

spark git commit: [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables

2016-12-21 Thread wenchen
Repository: spark Updated Branches: refs/heads/master ba4468bb2 -> b7650f11c [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables ## What changes were proposed in this pull request? It's a huge waste to call `Catalog.listTables` in `SQLContext.tableNames`, which only

spark git commit: [SPARK-18923][DOC][BUILD] Support skipping R/Python API docs

2016-12-21 Thread srowen
Repository: spark Updated Branches: refs/heads/master 24c0c9412 -> ba4468bb2 [SPARK-18923][DOC][BUILD] Support skipping R/Python API docs ## What changes were proposed in this pull request? We can build Python API docs by `cd ./python/docs && make html` and R API docs by `cd ./R