spark git commit: [SPARK-15131][SQL] Shutdown StateStore management thread when SparkContext has been shutdown

2016-05-04 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master ef55e46c9 -> bde27b89a [SPARK-15131][SQL] Shutdown StateStore management thread when SparkContext has been shutdown ## What changes were proposed in this pull request? Make sure that whenever the StateStoreCoordinator cannot be

spark git commit: [SPARK-15022][SPARK-15023][SQL][STREAMING] Add support for testing against the `ProcessingTime(intervalMS > 0)` trigger and `ManualClock`

2016-05-04 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master a45647746 -> e597ec6f1 [SPARK-15022][SPARK-15023][SQL][STREAMING] Add support for testing against the `ProcessingTime(intervalMS > 0)` trigger and `ManualClock` ## What changes were proposed in this pull request? Currently in

spark git commit: [SPARK-15022][SPARK-15023][SQL][STREAMING] Add support for testing against the `ProcessingTime(intervalMS > 0)` trigger and `ManualClock`

2016-05-04 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 ae79032dc -> 343c28504 [SPARK-15022][SPARK-15023][SQL][STREAMING] Add support for testing against the `ProcessingTime(intervalMS > 0)` trigger and `ManualClock` ## What changes were proposed in this pull request? Currently in

spark git commit: [SPARK-14884][SQL][STREAMING][WEBUI] Fix call site for continuous queries

2016-05-03 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5503e453b -> 5bd9a2f69 [SPARK-14884][SQL][STREAMING][WEBUI] Fix call site for continuous queries ## What changes were proposed in this pull request? Since we've been processing continuous queries in separate threads, the call sites are

spark git commit: [SPARK-14884][SQL][STREAMING][WEBUI] Fix call site for continuous queries

2016-05-03 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 17996e7d0 -> 45bc65519 [SPARK-14884][SQL][STREAMING][WEBUI] Fix call site for continuous queries ## What changes were proposed in this pull request? Since we've been processing continuous queries in separate threads, the call sites

spark git commit: [SPARK-14473][SQL] Define analysis rules to catch operations not supported in streaming

2016-04-18 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 432d1399c -> 775cf17ea [SPARK-14473][SQL] Define analysis rules to catch operations not supported in streaming ## What changes were proposed in this pull request? There are many operations that are currently not supported in the

spark git commit: [SPARK-13904] Add exit code parameter to exitExecutor()

2016-04-19 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 9ee95b6ec -> e89633605 [SPARK-13904] Add exit code parameter to exitExecutor() ## What changes were proposed in this pull request? This PR adds exit code parameter to exitExecutor() so that caller can specify different exit code. ## How

spark git commit: [SPARK-14713][TESTS] Fix the flaky test NettyBlockTransferServiceSuite

2016-04-18 Thread zsxwing
634 and 27634 to reduce the possibility of port conflicts. - Make `service1` use `service0.port` to bind to avoid the above race condition. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #12477 from zsxwing/SPARK-14713. Proj

spark git commit: [SPARK-16715][TESTS] Fix a potential ExprId conflict for SubexpressionEliminationSuite."Semantic equals and hash"

2016-07-25 Thread zsxwing
lict happens. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #14350 from zsxwing/SPARK-16715. (cherry picked from commit 12f490b5c85cdee26d47eb70ad1a1edd00504f21) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: h

spark git commit: [SPARK-16715][TESTS] Fix a potential ExprId conflict for SubexpressionEliminationSuite."Semantic equals and hash"

2016-07-25 Thread zsxwing
lict happens. ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #14350 from zsxwing/SPARK-16715. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/12f490b5 Tree: http://gi

spark git commit: [SPARK-15590][WEBUI] Paginate Job Table in Jobs tab

2016-07-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master c979c8bba -> db36e1e75 [SPARK-15590][WEBUI] Paginate Job Table in Jobs tab ## What changes were proposed in this pull request? This patch adds pagination support for the Job Tables in the Jobs tab. Pagination is provided for all of the

spark git commit: [SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if there is an exception while creating an Executor

2016-07-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 e833c906f -> 34ac45a34 [SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if there is an exception while creating an Executor ## What changes were proposed in this pull request? With the fix from SPARK-13112, I see that

spark git commit: [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master d59ba8e30 -> ae14f3623 [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID ## What changes were proposed in this pull request? Previously, the TaskLocation implementation would not allow for executor ids

spark git commit: [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 c86d29b2e -> 5c9555e11 [SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the Executor ID ## What changes were proposed in this pull request? Previously, the TaskLocation implementation would not allow for executor ids

spark git commit: [SPARK-16266][SQL][STREAING] Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 153c2f9ac -> f454a7f9f [SPARK-16266][SQL][STREAING] Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming ## What changes were proposed in this pull request? - Moved DataStreamReader/Writer from pyspark.sql to

[2/2] spark git commit: [SPARK-16266][SQL][STREAING] Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming

2016-06-29 Thread zsxwing
[SPARK-16266][SQL][STREAING] Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming ## What changes were proposed in this pull request? - Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming to make them consistent with scala packaging - Exposed the

[1/2] spark git commit: [SPARK-16259][PYSPARK] cleanup options in DataFrame read/write API

2016-06-29 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 22b4072e7 -> 6650c0533 [SPARK-16259][PYSPARK] cleanup options in DataFrame read/write API ## What changes were proposed in this pull request? There are some duplicated code for options in DataFrame reader/writer API, this PR clean

spark git commit: [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter`

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 3554713a1 -> 5545b7910 [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter` ## What changes were proposed in this pull request? Fixes a couple old references to `DataFrameWriter.startStream`

spark git commit: [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter`

2016-06-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 5fb7804e5 -> 52c9d69f7 [MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around `DataFrameWriter` and `DataStreamWriter` ## What changes were proposed in this pull request? Fixes a couple old references to

spark git commit: Revert "[SPARK-16372][MLLIB] Retag RDD to tallSkinnyQR of RowMatrix"

2016-07-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 45dda9221 -> bb92788f9 Revert "[SPARK-16372][MLLIB] Retag RDD to tallSkinnyQR of RowMatrix" This reverts commit 45dda92214191310a56333a2085e2343eba170cd. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-16350][SQL] Fix support for incremental planning in wirteStream.foreach()

2016-07-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master a04cab8f1 -> 0f7175def [SPARK-16350][SQL] Fix support for incremental planning in wirteStream.foreach() ## What changes were proposed in this pull request? There are cases where `complete` output mode does not output updated aggregated

spark git commit: [SPARK-16350][SQL] Fix support for incremental planning in wirteStream.foreach()

2016-07-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 24933355c -> cbfd94eac [SPARK-16350][SQL] Fix support for incremental planning in wirteStream.foreach() ## What changes were proposed in this pull request? There are cases where `complete` output mode does not output updated

spark git commit: [SPARK-15591][WEBUI] Paginate Stage Table in Stages tab

2016-07-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 21eadd1d8 -> 478b71d02 [SPARK-15591][WEBUI] Paginate Stage Table in Stages tab ## What changes were proposed in this pull request? This patch adds pagination support for the Stage Tables in the Stage tab. Pagination is provided for all

spark git commit: [SPARK-15869][STREAMING] Fix a potential NPE in StreamingJobProgressListener.getBatchUIData

2016-08-01 Thread zsxwing
ted? Existing unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #14443 from zsxwing/SPARK-15869. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/03d46aaf Tree: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [SPARK-15869][STREAMING] Fix a potential NPE in StreamingJobProgressListener.getBatchUIData

2016-08-01 Thread zsxwing
tch tested? Existing unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #14443 from zsxwing/SPARK-15869. (cherry picked from commit 03d46aafe561b03e25f4e25cf01e631c18dd827c) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/

spark git commit: [SPARK-16236][SQL][FOLLOWUP] Add Path Option back to Load API in DataFrameReader

2016-06-29 Thread zsxwing
sue, zsxwing ! Below is an example: ```Python spark.read.format('json').load('python/test_support/sql/people.json') ``` How was this patch tested? Existing test cases cover the changes by this PR Author: gatorsmile <gatorsm...@gmail.com> Closes #13965 from gatorsmile/optionPaths. (cher

spark git commit: [SPARK-16236][SQL][FOLLOWUP] Add Path Option back to Load API in DataFrameReader

2016-06-29 Thread zsxwing
sue, zsxwing ! Below is an example: ```Python spark.read.format('json').load('python/test_support/sql/people.json') ``` How was this patch tested? Existing test cases cover the changes by this PR Author: gatorsmile <gatorsm...@gmail.com> Closes #13965 from gatorsmile/optionPaths. Project: h

spark git commit: [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc

2016-06-20 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 6df8e3886 -> b99129cc4 [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc ## What changes were proposed in this pull request? Issues with current reader behavior. - `text()`

spark git commit: [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc

2016-06-20 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 8159da20e -> 54001cb12 [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc ## What changes were proposed in this pull request? Issues with current reader behavior. -

spark git commit: [SPARK-19432][CORE] Fix an unexpected failure when connecting timeout

2017-02-01 Thread zsxwing
e$1.apply(Future.scala:136) at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32) ``` It's better to provide a meaningful message. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #16773 from zsxwing/connect-timeout. Proj

spark git commit: [SPARK-19432][CORE] Fix an unexpected failure when connecting timeout

2017-02-01 Thread zsxwing
fun$onFailure$1.apply(Future.scala:136) at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32) ``` It's better to provide a meaningful message. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #16773 from zsxwing/connect-timeout. (cher

spark git commit: [SPARK-19437] Rectify spark executor id in HeartbeatReceiverSuite.

2017-02-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 1d5d2a9d0 -> c86a57f4d [SPARK-19437] Rectify spark executor id in HeartbeatReceiverSuite. ## What changes were proposed in this pull request? The current code in `HeartbeatReceiverSuite`, executorId is set as below: ``` private val

spark git commit: [SPARK-19407][SS] defaultFS is used FileSystem.get instead of getting it from uri scheme

2017-02-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 f55bd4c73 -> 62fab5bee [SPARK-19407][SS] defaultFS is used FileSystem.get instead of getting it from uri scheme ## What changes were proposed in this pull request? ``` Caused by: java.lang.IllegalArgumentException: Wrong FS:

spark git commit: [SPARK-19407][SS] defaultFS is used FileSystem.get instead of getting it from uri scheme

2017-02-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master fab0d62a7 -> 7a0a630e0 [SPARK-19407][SS] defaultFS is used FileSystem.get instead of getting it from uri scheme ## What changes were proposed in this pull request? ``` Caused by: java.lang.IllegalArgumentException: Wrong FS:

[2/2] spark git commit: [SPARK-18682][SS] Batch Source for Kafka

2017-02-07 Thread zsxwing
[SPARK-18682][SS] Batch Source for Kafka ## What changes were proposed in this pull request? Today, you can start a stream that reads from kafka. However, given kafka's configurable retention period, it seems like sometimes you might just want to read all of the data that is available now. As

[1/2] spark git commit: [SPARK-18682][SS] Batch Source for Kafka

2017-02-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 73ee73945 -> 8df03 http://git-wip-us.apache.org/repos/asf/spark/blob/8df0/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala

[2/2] spark git commit: [SPARK-18682][SS] Batch Source for Kafka

2017-02-07 Thread zsxwing
[SPARK-18682][SS] Batch Source for Kafka Today, you can start a stream that reads from kafka. However, given kafka's configurable retention period, it seems like sometimes you might just want to read all of the data that is available now. As such we should add a version that works with

[1/2] spark git commit: [SPARK-18682][SS] Batch Source for Kafka

2017-02-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 dd1abef13 -> e642a07d5 http://git-wip-us.apache.org/repos/asf/spark/blob/e642a07d/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala

spark git commit: [SPARK-19413][SS] MapGroupsWithState for arbitrary stateful operations for branch-2.1

2017-02-08 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 71b6eacf7 -> 502c927b8 [SPARK-19413][SS] MapGroupsWithState for arbitrary stateful operations for branch-2.1 This is a follow up PR for merging #16758 to spark 2.1 branch ## What changes were proposed in this pull request?

spark git commit: [SPARK-19377][WEBUI][CORE] Killed tasks should have the status as KILLED

2017-02-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 61cdc8c7c -> f94646415 [SPARK-19377][WEBUI][CORE] Killed tasks should have the status as KILLED ## What changes were proposed in this pull request? Copying of the killed status was missing while getting the newTaskInfo object by

spark git commit: [SPARK-19377][WEBUI][CORE] Killed tasks should have the status as KILLED

2017-02-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5ed397baa -> df4a27cc5 [SPARK-19377][WEBUI][CORE] Killed tasks should have the status as KILLED ## What changes were proposed in this pull request? Copying of the killed status was missing while getting the newTaskInfo object by dropping

[2/2] spark git commit: [SPARK-19139][CORE] New auth mechanism for transport library.

2017-01-24 Thread zsxwing
[SPARK-19139][CORE] New auth mechanism for transport library. This change introduces a new auth mechanism to the transport library, to be used when users enable strong encryption. This auth mechanism has better security than the currently used DIGEST-MD5. The new protocol uses symmetric key

[1/2] spark git commit: [SPARK-19139][CORE] New auth mechanism for transport library.

2017-01-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master d9783380f -> 8f3f73abc http://git-wip-us.apache.org/repos/asf/spark/blob/8f3f73ab/common/network-common/src/test/java/org/apache/spark/network/crypto/AuthIntegrationSuite.java

spark git commit: [SPARK-19268][SS] Disallow adaptive query execution for streaming queries

2017-01-23 Thread zsxwing
hes, it may break streaming queries. Hence, we should disallow this feature in Structured Streaming. ## How was this patch tested? `test("SPARK-19268: Adaptive query execution should be disallowed")`. Author: Shixiong Zhu <shixi...@databricks.com> Closes #16683 from zsxwing/SP

spark git commit: [SPARK-19268][SS] Disallow adaptive query execution for streaming queries

2017-01-23 Thread zsxwing
hes, it may break streaming queries. Hence, we should disallow this feature in Structured Streaming. ## How was this patch tested? `test("SPARK-19268: Adaptive query execution should be disallowed")`. Author: Shixiong Zhu <shixi...@databricks.com> Closes #16683 from zsxwing/SPA

spark git commit: [SPARK-19365][CORE] Optimize RequestMessage serialization

2017-01-27 Thread zsxwing
2679 5 2760 6 2710 7 2747 8 2793 9 2679 10 2651 ``` I also captured the TCP packets for this test. Before this patch, the total size of TCP packets is ~1.5GB. After it, it reduces to ~1.2GB. Author: Shixiong Zhu <shixi...@databricks.com> Closes #16

spark git commit: [SPARK-19330][DSTREAMS] Also show tooltip for successful batches

2017-01-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 15ef3740d -> 40a4cfc7c [SPARK-19330][DSTREAMS] Also show tooltip for successful batches ## What changes were proposed in this pull request? ### Before

spark git commit: [SPARK-19330][DSTREAMS] Also show tooltip for successful batches

2017-01-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 b94fb284b -> c13378796 [SPARK-19330][DSTREAMS] Also show tooltip for successful batches ## What changes were proposed in this pull request? ### Before

spark git commit: [SPARK-19603][SS] Fix StreamingQuery explain command

2017-02-15 Thread zsxwing
ks.com> Closes #16934 from zsxwing/SPARK-19603. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fc02ef95 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fc02ef95 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff

spark git commit: [SPARK-19603][SS] Fix StreamingQuery explain command

2017-02-15 Thread zsxwing
<shixi...@databricks.com> Closes #16934 from zsxwing/SPARK-19603. (cherry picked from commit fc02ef95cdfc226603b52dc579b7133631f7143d) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/

spark git commit: [SPARK-19517][SS] KafkaSource fails to initialize partition offsets

2017-02-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 6e3abed8f -> b083ec511 [SPARK-19517][SS] KafkaSource fails to initialize partition offsets ## What changes were proposed in this pull request? This patch fixes a bug in `KafkaSource` with the (de)serialization of the length of the

spark git commit: [SPARK-19517][SS] KafkaSource fails to initialize partition offsets

2017-02-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 4cc06f4eb -> 1a3f5f8c5 [SPARK-19517][SS] KafkaSource fails to initialize partition offsets ## What changes were proposed in this pull request? This patch fixes a bug in `KafkaSource` with the (de)serialization of the length of the JSON

spark git commit: [SPARK-19617][SS] Fix the race condition when starting and stopping a query quickly

2017-02-17 Thread zsxwing
lem after we fix the race condition. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #16947 from zsxwing/SPARK-19617. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/15b144d2 T

spark git commit: [SPARK-17714][CORE][TEST-MAVEN][TEST-HADOOP2.6] Avoid using ExecutorClassLoader to load Netty generated classes

2017-02-13 Thread zsxwing
s Author: Shixiong Zhu <shixi...@databricks.com> Closes #16859 from zsxwing/SPARK-17714. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/905fdf0c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/905fdf0c Diff: http:

spark git commit: [SPARK-17714][CORE][TEST-MAVEN][TEST-HADOOP2.6] Avoid using ExecutorClassLoader to load Netty generated classes

2017-02-13 Thread zsxwing
s Author: Shixiong Zhu <shixi...@databricks.com> Closes #16859 from zsxwing/SPARK-17714. (cherry picked from commit 905fdf0c243e1776c54c01a25b17878361400225) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: htt

spark git commit: [HOTFIX][SPARK-19542][SS]Fix the missing import in DataStreamReaderWriterSuite

2017-02-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 328b22984 -> 2968d8c06 [HOTFIX][SPARK-19542][SS]Fix the missing import in DataStreamReaderWriterSuite Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2968d8c0

spark git commit: [SPARK-19564][SPARK-19559][SS][KAFKA] KafkaOffsetReader's consumers should not be in the same group

2017-02-12 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master bc0a0e639 -> 2bdbc8705 [SPARK-19564][SPARK-19559][SS][KAFKA] KafkaOffsetReader's consumers should not be in the same group ## What changes were proposed in this pull request? In `KafkaOffsetReader`, when error occurs, we abort the

spark git commit: [SPARK-19564][SPARK-19559][SS][KAFKA] KafkaOffsetReader's consumers should not be in the same group

2017-02-12 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 06e77e009 -> fe4fcc570 [SPARK-19564][SPARK-19559][SS][KAFKA] KafkaOffsetReader's consumers should not be in the same group ## What changes were proposed in this pull request? In `KafkaOffsetReader`, when error occurs, we abort the

spark git commit: [SPARK-19599][SS] Clean up HDFSMetadataLog

2017-02-15 Thread zsxwing
> Unit` and just call `serialize` directly. - Remove catching FileNotFoundException. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #16932 from zsxwing/metadata-cleanup. (cherry picked from commit 21b4ba2d6f21a9759af879471715c123073bd67a

spark git commit: [SPARK-19599][SS] Clean up HDFSMetadataLog

2017-02-15 Thread zsxwing
> Unit` and just call `serialize` directly. - Remove catching FileNotFoundException. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #16932 from zsxwing/metadata-cleanup. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: ht

spark git commit: [SPARK-19168][STRUCTURED STREAMING] StateStore should be aborted upon error

2017-01-18 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master c050c1227 -> 569e50680 [SPARK-19168][STRUCTURED STREAMING] StateStore should be aborted upon error ## What changes were proposed in this pull request? We should call `StateStore.abort()` when there should be any error before the store is

spark git commit: [SPARK-19168][STRUCTURED STREAMING] StateStore should be aborted upon error

2017-01-18 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 047506bae -> 4cff0b504 [SPARK-19168][STRUCTURED STREAMING] StateStore should be aborted upon error ## What changes were proposed in this pull request? We should call `StateStore.abort()` when there should be any error before the

spark git commit: [SPARK-19113][SS][TESTS] Ignore StreamingQueryException thrown from awaitInitialization to avoid breaking tests

2017-01-18 Thread zsxwing
tch `StreamingQueryException` as well. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #16567 from zsxwing/SPARK-19113-2. (cherry picked from commit c050c12274fba2ac4c4938c4724049a47fa59280) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project:

spark git commit: [SPARK-19113][SS][TESTS] Ignore StreamingQueryException thrown from awaitInitialization to avoid breaking tests

2017-01-18 Thread zsxwing
tch `StreamingQueryException` as well. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #16567 from zsxwing/SPARK-19113-2. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c050c122 Tree: http:

spark git commit: [SPARK-19182][DSTREAM] Optimize the lock in StreamingJobProgressListener to not block UI when generating Streaming jobs

2017-01-18 Thread zsxwing
ter to fetch metadata). It's better to optimize the locks in DStreamGraph and StreamingJobProgressListener to make the UI not block by job generation. ## How was this patch tested? existing ut cc zsxwing Author: uncleGen <husty...@gmail.com> Closes #16601 from uncleGen/SPARK-19182. Proj

spark git commit: [SPARK-18905][STREAMING] Fix the issue of removing a failed jobset from JobScheduler.jobSets

2017-01-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master c84f7d3e1 -> f8db8945f [SPARK-18905][STREAMING] Fix the issue of removing a failed jobset from JobScheduler.jobSets ## What changes were proposed in this pull request? the current implementation of Spark streaming considers a batch is

spark git commit: [SPARK-18905][STREAMING] Fix the issue of removing a failed jobset from JobScheduler.jobSets

2017-01-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 975890507 -> f4317be66 [SPARK-18905][STREAMING] Fix the issue of removing a failed jobset from JobScheduler.jobSets ## What changes were proposed in this pull request? the current implementation of Spark streaming considers a batch

spark git commit: [SPARK-19749][SS] Name socket source with a meaningful name

2017-02-27 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 16d8472f7 -> 735303835 [SPARK-19749][SS] Name socket source with a meaningful name ## What changes were proposed in this pull request? Name socket source with a meaningful name ## How was this patch tested? Jenkins Author: uncleGen

spark git commit: [SPARK-19594][STRUCTURED STREAMING] StreamingQueryListener fails to handle QueryTerminatedEvent if more then one listeners exists

2017-02-26 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 68f2142cf -> 9f8e39215 [SPARK-19594][STRUCTURED STREAMING] StreamingQueryListener fails to handle QueryTerminatedEvent if more then one listeners exists ## What changes were proposed in this pull request? currently if multiple streaming

spark git commit: [SPARK-19594][STRUCTURED STREAMING] StreamingQueryListener fails to handle QueryTerminatedEvent if more then one listeners exists

2017-02-26 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 20a432951 -> 04fbb9e09 [SPARK-19594][STRUCTURED STREAMING] StreamingQueryListener fails to handle QueryTerminatedEvent if more then one listeners exists ## What changes were proposed in this pull request? currently if multiple

spark git commit: [SPARK-19677][SS] Committing a delta file atop an existing one should not fail on HDFS

2017-02-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.1 4b4c3bf3f -> 947c0cd90 [SPARK-19677][SS] Committing a delta file atop an existing one should not fail on HDFS ## What changes were proposed in this pull request? HDFSBackedStateStoreProvider fails to rename files on HDFS but not on

spark git commit: [SPARK-19677][SS] Committing a delta file atop an existing one should not fail on HDFS

2017-02-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 7c7fc30b4 -> 9734a928a [SPARK-19677][SS] Committing a delta file atop an existing one should not fail on HDFS ## What changes were proposed in this pull request? HDFSBackedStateStoreProvider fails to rename files on HDFS but not on the

spark git commit: [SPARK-19677][SS] Committing a delta file atop an existing one should not fail on HDFS

2017-02-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 a6af60f25 -> dcfb05c86 [SPARK-19677][SS] Committing a delta file atop an existing one should not fail on HDFS ## What changes were proposed in this pull request? HDFSBackedStateStoreProvider fails to rename files on HDFS but not on

spark git commit: [SPARK-19633][SS] FileSource read from FileSink

2017-02-28 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 89cd3845b -> 4913c92c2 [SPARK-19633][SS] FileSource read from FileSink ## What changes were proposed in this pull request? Right now file source always uses `InMemoryFileIndex` to scan files from a given path. But when reading the

spark git commit: [SPARK-17316][CORE] Make CoarseGrainedSchedulerBackend.removeExecutor non-blocking

2016-09-02 Thread zsxwing
lue). ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #14882 from zsxwing/SPARK-17316. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b84a92c2 Tree: http:

spark git commit: [SPARK-17316][CORE] Fix the 'ask' type parameter in 'removeExecutor'

2016-09-06 Thread zsxwing
not cast java.lang.Boolean to scala.runtime.Nothing$` ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #14983 from zsxwing/SPARK-17316-3. (cherry picked from commit 175b4344112b376cbbbd05265125ed0e1b87d507) Signed-off-by: Shixiong Z

spark git commit: [SPARK-17318][TESTS] Fix ReplSuite replicating blocks of object with class defined in repl again

2016-09-01 Thread zsxwing
est. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #14905 from zsxwing/SPARK-17318-2. (cherry picked from commit 21c0a4fe9d8e21819ba96e7dc2b1f2999d3299ae) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.a

spark git commit: [SPARK-17318][TESTS] Fix ReplSuite replicating blocks of object with class defined in repl again

2016-09-01 Thread zsxwing
est. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #14905 from zsxwing/SPARK-17318-2. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/21c0a4fe Tree: http://git-wip-us.apache.

spark git commit: [SPARK-17314][CORE] Use Netty's DefaultThreadFactory to enable its fast ThreadLocal impl

2016-08-30 Thread zsxwing
Zhu <shixi...@databricks.com> Closes #14879 from zsxwing/netty-thread. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/02ac379e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/02ac379e Diff: http://git-wip-us.apache.

spark git commit: [SPARK-17318][TESTS] Fix ReplSuite replicating blocks of object with class defined in repl

2016-08-30 Thread zsxwing
<shixi...@databricks.com> Closes #14884 from zsxwing/SPARK-17318. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/231f9732 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/231f9732 Diff: http://git-wip-us.apache.org/

spark git commit: [SPARK-17486] Remove unused TaskMetricsUIData.updatedBlockStatuses field

2016-09-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 d293062a4 -> 30521522d [SPARK-17486] Remove unused TaskMetricsUIData.updatedBlockStatuses field The `TaskMetricsUIData.updatedBlockStatuses` field is assigned to but never read, increasing the memory consumption of the web UI. We

spark git commit: [SPARK-17486] Remove unused TaskMetricsUIData.updatedBlockStatuses field

2016-09-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 767d48076 -> 72eec70bd [SPARK-17486] Remove unused TaskMetricsUIData.updatedBlockStatuses field The `TaskMetricsUIData.updatedBlockStatuses` field is assigned to but never read, increasing the memory consumption of the web UI. We should

spark git commit: [SPARK-15487][WEB UI] Spark Master UI to reverse proxy Application and Workers UI

2016-09-08 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 722afbb2b -> 92ce8d484 [SPARK-15487][WEB UI] Spark Master UI to reverse proxy Application and Workers UI ## What changes were proposed in this pull request? This pull request adds the functionality to enable accessing worker and

spark git commit: [SPARK-17451][CORE] CoarseGrainedExecutorBackend should inform driver before self-kill

2016-09-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 2ad276954 -> b47927814 [SPARK-17451][CORE] CoarseGrainedExecutorBackend should inform driver before self-kill ## What changes were proposed in this pull request? Jira : https://issues.apache.org/jira/browse/SPARK-17451

spark git commit: [SPARK-17379][BUILD] Upgrade netty-all to 4.0.41 final for bug fixes

2016-09-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master b47927814 -> 0ad8eeb4d [SPARK-17379][BUILD] Upgrade netty-all to 4.0.41 final for bug fixes ## What changes were proposed in this pull request? Upgrade netty-all to latest in the 4.0.x line which is 4.0.41, mentions several bug fixes and

spark git commit: [SPARK-15703][SCHEDULER][CORE][WEBUI] Make ListenerBus event queue size configurable (branch 2.0)

2016-09-23 Thread zsxwing
ted? Jenkins. Author: Dhruve Ashar <dhruveas...@gmail.com> Closes #15222 from zsxwing/SPARK-15703-2.0. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9e91a100 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9e91

spark git commit: [SPARK-17649][CORE] Log how many Spark events got dropped in LiveListenerBus

2016-09-26 Thread zsxwing
get insights on how to set a correct event queue size. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #15220 from zsxwing/SPARK-17649. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spa

spark git commit: [SPARK-17649][CORE] Log how many Spark events got dropped in LiveListenerBus

2016-09-26 Thread zsxwing
get insights on how to set a correct event queue size. ## How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #15220 from zsxwing/SPARK-17649. (cherry picked from commit bde85f8b70138a51052b613664facbc981378c38) Signed-off-by: Shixiong Z

spark git commit: [SPARK-17649][CORE] Log how many Spark events got dropped in AsynchronousListenerBus (branch 1.6)

2016-09-26 Thread zsxwing
ted? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #15226 from zsxwing/SPARK-17649-branch-1.6. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7aded55e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree

spark git commit: [SPARK-17650] malformed url's throw exceptions before bricking Executors

2016-09-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master de333d121 -> 59d87d240 [SPARK-17650] malformed url's throw exceptions before bricking Executors ## What changes were proposed in this pull request? When a malformed URL was sent to Executors through `sc.addJar` and `sc.addFile`, the

spark git commit: [SPARK-17650] malformed url's throw exceptions before bricking Executors

2016-09-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 ed545763a -> 88ba2e1d0 [SPARK-17650] malformed url's throw exceptions before bricking Executors ## What changes were proposed in this pull request? When a malformed URL was sent to Executors through `sc.addJar` and `sc.addFile`, the

spark git commit: [SPARK-17778][TESTS] Mock SparkContext to reduce memory usage of BlockManagerSuite

2016-10-05 Thread zsxwing
How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #15350 from zsxwing/SPARK-17778. (cherry picked from commit 221b418b1c9db7b04c600b6300d18b034a4f444e) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos

spark git commit: [SPARK-17778][TESTS] Mock SparkContext to reduce memory usage of BlockManagerSuite

2016-10-05 Thread zsxwing
How was this patch tested? Jenkins Author: Shixiong Zhu <shixi...@databricks.com> Closes #15350 from zsxwing/SPARK-17778. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/221b418b Tree: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [SPARK-17643] Remove comparable requirement from Offset (backport for branch-2.0)

2016-10-05 Thread zsxwing
mit/988c71457354b0a443471f501cef544a85b1a76a to branch-2.0 ## How was this patch tested? Jenkins Author: Michael Armbrust <mich...@databricks.com> Closes #15362 from zsxwing/SPARK-17643-2.0. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1c2d

spark git commit: [SPARK-17707][WEBUI] Web UI prevents spark-submit application to be finished

2016-10-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 3487b0203 -> 9f2eb27a4 [SPARK-17707][WEBUI] Web UI prevents spark-submit application to be finished This expands calls to Jetty's simple `ServerConnector` constructor to explicitly specify a `ScheduledExecutorScheduler` that makes

spark git commit: [SPARK-17707][WEBUI] Web UI prevents spark-submit application to be finished

2016-10-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master dd16b52cf -> cff560755 [SPARK-17707][WEBUI] Web UI prevents spark-submit application to be finished ## What changes were proposed in this pull request? This expands calls to Jetty's simple `ServerConnector` constructor to explicitly

spark git commit: [SPARK-17782][STREAMING][BUILD] Add Kafka 0.10 project to build modules

2016-10-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 f460a199e -> a84d8ef37 [SPARK-17782][STREAMING][BUILD] Add Kafka 0.10 project to build modules ## What changes were proposed in this pull request? This PR adds the Kafka 0.10 subproject to the build infrastructure. This makes sure

[1/2] spark git commit: [SPARK-17346][SQL][TEST-MAVEN] Add Kafka source for Structured Streaming (branch 2.0)

2016-10-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 9f2eb27a4 -> f460a199e http://git-wip-us.apache.org/repos/asf/spark/blob/f460a199/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala

[2/2] spark git commit: [SPARK-17346][SQL][TEST-MAVEN] Add Kafka source for Structured Streaming (branch 2.0)

2016-10-07 Thread zsxwing
/b678e465afa417780b54db0fbbaa311621311f15 into branch 2.0. The only difference is the Spark version in pom file. ## How was this patch tested? Jenkins. Author: Shixiong Zhu <shixi...@databricks.com> Closes #15367 from zsxwing/kafka-source-branch-2.0. Project: http://git-wip-us.apache.org/repos/asf/spar
