spark git commit: [SPARK-12021][STREAMING][TESTS] Fix the potential dead-lock in StreamingListenerSuite

2015-11-27 Thread zsxwing
lls `ssc.stop()`, `StreamingContextStoppingCollector` may call `ssc.stop()` in the listener bus thread, which is a dead-lock. This PR updated `StreamingContextStoppingCollector` to only call `ssc.stop()` in the first batch to avoid the dead-lock. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10011 fr
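The deadlock pattern behind this fix can be sketched in plain Java (the names below are illustrative, not Spark's actual classes): a bus whose `stop()` waits for its dispatch thread must never be stopped from a listener callback running on that same thread, since the thread would end up waiting for itself. One defensive shape is to detect the re-entrant call and fail fast.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch (not Spark's actual classes): stop() waits for the
// dispatch thread to drain, so a listener callback running *on* that thread
// must never call stop() directly -- the thread would wait for itself.
// A guard that detects the re-entrant call fails fast instead of hanging.
public class ListenerBusSketch {
    private static final String THREAD_NAME = "listener-bus";

    private final ExecutorService dispatcher =
        Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, THREAD_NAME);
            t.setDaemon(true);
            return t;
        });

    /** Runs a listener callback on the single dispatch thread. */
    public Future<?> post(Runnable callback) {
        return dispatcher.submit(callback);
    }

    /** Safe only from threads other than the dispatch thread. */
    public void stop() {
        if (THREAD_NAME.equals(Thread.currentThread().getName())) {
            throw new IllegalStateException(
                "stop() must not be called from the listener thread");
        }
        dispatcher.shutdown();
    }
}
```

The test-suite fix above takes the complementary approach: the listener only triggers the stop once, and not in a way that blocks the bus thread on itself.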
spark git commit: [SPARK-12074] Avoid memory copy involving ByteBuffer.wrap(ByteArrayOutputStream.toByteArray)

2015-12-08 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 6cb06e871 -> 75c60bf4b [SPARK-12074] Avoid memory copy involving ByteBuffer.wrap(ByteArrayOutputStream.toByteArray) SPARK-12060 fixed JavaSerializerInstance.serialize This PR applies the same technique on two other classes. zsxw
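The copy being avoided is easy to see with plain JDK classes: `ByteArrayOutputStream.toByteArray()` returns a fresh copy of the internal buffer, so `ByteBuffer.wrap(baos.toByteArray())` pays for one extra copy of the serialized bytes. A minimal sketch of the technique (the class name here is illustrative, not Spark's actual class) wraps the internal buffer directly:

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// ByteArrayOutputStream.toByteArray() copies the internal buffer, so
// ByteBuffer.wrap(baos.toByteArray()) materializes the serialized data twice.
// A small subclass can wrap the internal buffer without copying; the caller
// must not write to the stream afterwards, since the ByteBuffer aliases it.
class ExposedByteArrayOutputStream extends ByteArrayOutputStream {
    ByteBuffer toByteBuffer() {
        // 'buf' and 'count' are protected fields of ByteArrayOutputStream:
        // wrap only the bytes written so far, without copying them.
        return ByteBuffer.wrap(buf, 0, count);
    }
}
```

The trade-off is aliasing: the returned buffer shares storage with the stream, which is safe only when the stream is discarded (or never written again) after serialization completes.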

spark git commit: Revert "[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize"

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 81db8d086 -> 21909b8ac Revert "[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize" This reverts commit 9b99b2b46c452ba396e922db5fc7eec02c45b158. Project: http://git-wip-us.apache.org/repos/asf/spark/repo

spark git commit: Revert "[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize"

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 60b541ee1 -> 328b757d5 Revert "[SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize" This reverts commit 1401166576c7018c5f9c31e0a6703d5fb16ea339. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-12002][STREAMING][PYSPARK] Fix python direct stream checkpoint recovery issue

2015-12-01 Thread zsxwing
hu <shixi...@databricks.com> Closes #10074 from zsxwing/review-pr10017. (cherry picked from commit f292018f8e57779debc04998456ec875f628133b) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apach

spark git commit: [SPARK-12002][STREAMING][PYSPARK] Fix python direct stream checkpoint recovery issue

2015-12-01 Thread zsxwing
hu <shixi...@databricks.com> Closes #10074 from zsxwing/review-pr10017. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f292018f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f292018f Diff: http://git-wip-us.apach

spark git commit: [SPARK-12101][CORE] Fix thread pools that cannot cache tasks in Worker and AppClient

2015-12-03 Thread zsxwing
xed `ThreadUtils.newDaemonCachedThreadPool`. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10108 from zsxwing/fix-threadpool. (cherry picked from commit 649be4fa4532dcd3001df8345f9f7e970a3fbc65) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/

spark git commit: [SPARK-12101][CORE] Fix thread pools that cannot cache tasks in Worker and AppClient

2015-12-03 Thread zsxwing
xed `ThreadUtils.newDaemonCachedThreadPool`. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10108 from zsxwing/fix-threadpool. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/649be4fa Tree: http://git-wip-us.apache.org/repos/asf/s
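The underlying JDK pitfall these thread-pool fixes address can be shown with plain `java.util.concurrent` classes (a sketch of the general technique, not Spark's exact code): a `ThreadPoolExecutor` with `corePoolSize = 0` and an unbounded `LinkedBlockingQueue` never runs more than one thread, because extra threads are created only when the queue rejects a task, and an unbounded queue never rejects.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CachedPoolSketch {
    /** The broken shape: at most one thread ever processes tasks. */
    public static ThreadPoolExecutor brokenCachedPool(int maxThreads) {
        // maxThreads is never reached: the unbounded queue absorbs every task.
        return new ThreadPoolExecutor(
            0, maxThreads, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    }

    /** A bounded "cached" pool: all threads are core threads, idle ones time out. */
    public static ThreadPoolExecutor fixedCachedPool(int maxThreads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            maxThreads, maxThreads, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(true); // keeps "cached" behavior when idle
        return pool;
    }
}
```

Making every thread a core thread lets the pool actually reach its bound, and `allowCoreThreadTimeOut(true)` preserves the cached-pool property of shrinking when idle.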

spark git commit: [SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize

2015-12-01 Thread zsxwing
ong Zhu <shixi...@databricks.com> Closes #10051 from zsxwing/SPARK-12060. (cherry picked from commit 1401166576c7018c5f9c31e0a6703d5fb16ea339) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.a

spark git commit: [SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize

2015-12-01 Thread zsxwing
Zhu <shixi...@databricks.com> Closes #10051 from zsxwing/SPARK-12060. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/14011665 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/14011665 Diff: http://git-wip-us.a

spark git commit: [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 96691feae -> 8a75a3049 [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently in multiple places: * The JobConf is updated by

spark git commit: [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.5 4f07a590c -> 0d57a4ae1 [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently in multiple places: * The JobConf is updated by

spark git commit: [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 a5743affc -> 1f42295b5 [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently in multiple places: * The JobConf is updated by

spark git commit: [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles

2015-12-01 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.4 f5af299ab -> b6ba2dab2 [SPARK-12087][STREAMING] Create new JobConf for every batch in saveAsHadoopFiles The JobConf object created in `DStream.saveAsHadoopFiles` is used concurrently in multiple places: * The JobConf is updated by
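The principle behind creating a new JobConf per batch can be sketched with plain maps (illustrative only, not Hadoop's actual `JobConf`): when concurrently running batches share one mutable configuration object, one batch's writes leak into another's, so each batch should derive its own copy from an immutable template.

```java
import java.util.HashMap;
import java.util.Map;

// Per-batch copies instead of one shared mutable config: mutation is
// confined to the batch that made it.
public class PerBatchConf {
    private final Map<String, String> template;

    public PerBatchConf(Map<String, String> template) {
        this.template = Map.copyOf(template); // immutable template
    }

    /** Each batch gets its own mutable copy of the template. */
    public Map<String, String> newBatchConf(long batchTime) {
        Map<String, String> conf = new HashMap<>(template);
        conf.put("batch.time", Long.toString(batchTime));
        return conf;
    }
}
```

The cost of one small copy per batch buys the absence of cross-batch races, which is the same trade the fix makes.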

spark git commit: [SPARK-12060][CORE] Avoid memory copy in JavaSerializerInstance.serialize

2015-12-07 Thread zsxwing
;shixi...@databricks.com> Closes #10167 from zsxwing/merge-SPARK-12060. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3f4efb5c Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3f4efb5c Diff: http://git-wip-us.apache.

spark git commit: [SPARK-12101][CORE] Fix thread pools that cannot cache tasks in Worker and AppClient (backport 1.5)

2015-12-07 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.5 93a0510a5 -> 3868ab644 [SPARK-12101][CORE] Fix thread pools that cannot cache tasks in Worker and AppClient (backport 1.5) backport #10108 to branch 1.5 Author: Shixiong Zhu <shixi...@databricks.com> Closes #10135 from zs

[1/2] spark git commit: [SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState and change tracking function signature

2015-12-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 699f497cf -> f6d866173 http://git-wip-us.apache.org/repos/asf/spark/blob/f6d86617/streaming/src/test/java/org/apache/spark/streaming/JavaTrackStateByKeySuite.java --

[2/2] spark git commit: [SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState and change tracking function signature

2015-12-09 Thread zsxwing
[SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState and change tracking function signature SPARK-12244: Based on feedback from early users and personal experience attempting to explain it, the name trackStateByKey had two problems. "trackState" is a completely new term

[1/2] spark git commit: [SPARK-12244][SPARK-12245][STREAMING] Rename trackStateByKey to mapWithState and change tracking function signature

2015-12-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 2166c2a75 -> bd2cd4f53 http://git-wip-us.apache.org/repos/asf/spark/blob/bd2cd4f5/streaming/src/test/java/org/apache/spark/streaming/JavaTrackStateByKeySuite.java -- diff
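The semantics behind the rename can be illustrated in plain Java (a hedged sketch with no Spark dependency; names and shapes are simplified, not Spark's actual API): the user function sees a key, a new value, and that key's running state, updates the state, and returns a *mapped* record, which is why "mapWithState" describes it better than "trackStateByKey".

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of per-key stateful mapping: update the state for a key,
// emit a mapped output value derived from it.
public class MapWithStateSketch {
    private final Map<String, Integer> state = new HashMap<>();

    /** Updates per-key state and returns the mapped output record. */
    public String mapWithState(String key, int value) {
        int total = state.merge(key, value, Integer::sum); // state update
        return key + ":" + total;                          // mapped value
    }
}
```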

spark git commit: [SPARK-12273][STREAMING] Make Spark Streaming web UI list Receivers in order

2015-12-11 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master aa305dcaf -> 713e6959d [SPARK-12273][STREAMING] Make Spark Streaming web UI list Receivers in order Currently the Streaming web UI does NOT list Receivers in order; however, it seems more convenient for the users if Receivers are listed

spark git commit: [STREAMING][MINOR] Fix typo in function name of StateImpl

2015-12-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master c59df8c51 -> bc1ff9f4a [STREAMING][MINOR] Fix typo in function name of StateImpl cc tdas zsxwing, please review. Thanks a lot. Author: jerryshao <ss...@hortonworks.com> Closes #10305 from jerryshao/fix-typo-state-impl. Proj

spark git commit: [STREAMING][MINOR] Fix typo in function name of StateImpl

2015-12-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 352a0c80f -> 23c884605 [STREAMING][MINOR] Fix typo in function name of StateImpl cc tdas zsxwing, please review. Thanks a lot. Author: jerryshao <ss...@hortonworks.com> Closes #10305 from jerryshao/fix-typo-state-impl.

spark git commit: [MINOR] Add missing interpolation in NettyRPCEnv

2015-12-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 552b38f87 -> 638b89bc3 [MINOR] Add missing interpolation in NettyRPCEnv ``` Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in ${timeout.duration}. This timeout is controlled by

spark git commit: [MINOR] Add missing interpolation in NettyRPCEnv

2015-12-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 27b98e99d -> 861549acd [MINOR] Add missing interpolation in NettyRPCEnv ``` Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in ${timeout.duration}. This timeout is controlled by

spark git commit: [SPARK-11904][PYSPARK] reduceByKeyAndWindow does not require checkpointing when invFunc is None

2015-12-16 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 97678edea -> 437583f69 [SPARK-11904][PYSPARK] reduceByKeyAndWindow does not require checkpointing when invFunc is None when invFunc is None, `reduceByKeyAndWindow(func, None, winsize, slidesize)` is equivalent to
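The reason checkpointing is unnecessary without an inverse function is worth spelling out, sketched here in plain Java (illustrative, not Spark's implementation): with an inverse function the new window is derived incrementally from the *previous* window's result (previous + entering batch - leaving batch), so recovery needs that result checkpointed; without one, each window is simply recomputed from the batches it contains.

```java
import java.util.List;

public class WindowSumSketch {
    /** Recompute the window sum from scratch: no prior state needed. */
    public static int recompute(List<Integer> batchesInWindow) {
        return batchesInWindow.stream().mapToInt(Integer::intValue).sum();
    }

    /** Incremental update: depends on the previous window's result. */
    public static int incremental(int prevWindowSum, int entering, int leaving) {
        return prevWindowSum + entering - leaving;
    }
}
```

Both paths produce the same answer; only the incremental path carries state across windows, and only carried state needs a checkpoint to survive failure.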

spark git commit: [SPARK-12267][CORE] Store the remote RpcEnv address to send the correct disconnetion message

2015-12-12 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 98b212d36 -> 8af2f8c61 [SPARK-12267][CORE] Store the remote RpcEnv address to send the correct disconnetion message Author: Shixiong Zhu <shixi...@databricks.com> Closes #10261 from zsxwing/SPARK-12267. Project: http:

spark git commit: [SPARK-12267][CORE] Store the remote RpcEnv address to send the correct disconnetion message

2015-12-12 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 e05364baa -> d7e3bfd7d [SPARK-12267][CORE] Store the remote RpcEnv address to send the correct disconnetion message Author: Shixiong Zhu <shixi...@databricks.com> Closes #10261 from zsxwing/SPARK-12267. (cherry picked fr

spark git commit: [SPARK-12281][CORE] Fix a race condition when reporting ExecutorState in the shutdown hook

2015-12-13 Thread zsxwing
ava:745) ``` Author: Shixiong Zhu <shixi...@databricks.com> Closes #10269 from zsxwing/executor-state. (cherry picked from commit 2aecda284e22ec608992b6221e2f5ffbd51fcd24) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/re

spark git commit: [SPARK-12281][CORE] Fix a race condition when reporting ExecutorState in the shutdown hook

2015-12-13 Thread zsxwing
745) ``` Author: Shixiong Zhu <shixi...@databricks.com> Closes #10269 from zsxwing/executor-state. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2aecda28 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2aec

spark git commit: [SPARK-12220][CORE] Make Utils.fetchFile support files that contain special characters

2015-12-17 Thread zsxwing
0208 from zsxwing/uri. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/86e405f3 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/86e405f3 Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/86e405f3 Branch: refs/hea

spark git commit: [SPARK-12220][CORE] Make Utils.fetchFile support files that contain special characters

2015-12-17 Thread zsxwing
es #10208 from zsxwing/uri. (cherry picked from commit 86e405f357711ae93935853a912bc13985c259db) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1fbca412 Tree:
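The class of bug being fixed here can be demonstrated with JDK classes alone (a sketch of the general technique, not Spark's `Utils.fetchFile` code): building a URI by string concatenation breaks on paths containing spaces or other reserved characters, whereas `File.toURI()` percent-encodes the path correctly and round-trips.

```java
import java.io.File;
import java.net.URI;

public class FileUriSketch {
    /** Percent-encodes special characters in the path, e.g. space -> %20. */
    public static URI encode(String absolutePath) {
        return new File(absolutePath).toURI();
    }

    /** Decodes back to the raw filesystem path. */
    public static String decode(URI uri) {
        return new File(uri).getPath();
    }
}
```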

spark git commit: [SPARK-12410][STREAMING] Fix places that use '.' and '|' directly in split

2015-12-17 Thread zsxwing
..@databricks.com> Closes #10361 from zsxwing/reg-bug. (cherry picked from commit 540b5aeadc84d1a5d61bda4414abd6bf35dc7ff9) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/c

spark git commit: [SPARK-12376][TESTS] Spark Streaming Java8APISuite fails in assertOrderInvariantEquals method

2015-12-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master e096a652b -> ed6ebda5c [SPARK-12376][TESTS] Spark Streaming Java8APISuite fails in assertOrderInvariantEquals method org.apache.spark.streaming.Java8APISuite.java is failing due to trying to sort immutable list in

spark git commit: [SPARK-12376][TESTS] Spark Streaming Java8APISuite fails in assertOrderInvariantEquals method

2015-12-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 48dcee484 -> 4df1dd403 [SPARK-12376][TESTS] Spark Streaming Java8APISuite fails in assertOrderInvariantEquals method org.apache.spark.streaming.Java8APISuite.java is failing due to trying to sort immutable list in

spark git commit: [SPARK-12410][STREAMING] Fix places that use '.' and '|' directly in split

2015-12-17 Thread zsxwing
bricks.com> Closes #10361 from zsxwing/reg-bug. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/540b5aea Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/540b5aea Diff: http://git-wip-us.apache.org/repos/asf/spark/diff
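The bug class this commit fixes is a classic Java pitfall: `String.split` takes a *regular expression*, so a bare `"."` matches every character and `"|"` matches the empty string. A short sketch of the wrong and right calls:

```java
import java.util.regex.Pattern;

public class SplitSketch {
    // Regex '.' matches any character, so every char becomes a delimiter
    // and the result collapses to an empty array.
    public static String[] wrong(String s)  { return s.split("."); }

    // Escape the metacharacter to split on a literal dot.
    public static String[] right(String s)  { return s.split("\\."); }

    // Or quote the delimiter so it is always treated literally.
    public static String[] quoted(String s) { return s.split(Pattern.quote(".")); }
}
```

`Pattern.quote` is the safer habit when the delimiter comes from a variable, since it works for any metacharacter, including `|`.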

spark git commit: [SPARK-12304][STREAMING] Make Spark Streaming web UI display more fri…

2015-12-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master ca0690b5e -> d52bf47e1 [SPARK-12304][STREAMING] Make Spark Streaming web UI display more fri… …endly Receiver graphs Currently, the Spark Streaming web UI uses the same maxY when displays 'Input Rate Times& Histograms' and

spark git commit: [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc

2015-12-10 Thread zsxwing
ies compared to Scala/Java, so here changing the description to make it more precise. zsxwing tdas , please review, thanks a lot. Author: jerryshao <ss...@hortonworks.com> Closes #10246 from jerryshao/direct-kafka-doc-update. (cherry picked from commit 24d3357d66e14388faf8709b368edca70ea96432) S

spark git commit: [STREAMING][DOC][MINOR] Update the description of direct Kafka stream doc

2015-12-10 Thread zsxwing
red to Scala/Java, so here changing the description to make it more precise. zsxwing tdas , please review, thanks a lot. Author: jerryshao <ss...@hortonworks.com> Closes #10246 from jerryshao/direct-kafka-doc-update. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http:

spark git commit: [SPARK-12608][STREAMING] Remove submitJobThreadPool since submitJob doesn't create a separate thread to wait for the job result

2016-01-04 Thread zsxwing
ult. `submitJobThreadPool` was a workaround in `ReceiverTracker` to run these waiting-job-result threads. Now #9264 has been merged to master and resolved this blocking issue, `submitJobThreadPool` can be removed now. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10560 from zsxwi

spark git commit: [SPARK-12673][UI] Add missing uri prepending for job description

2016-01-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 11b901b22 -> 94af69c9b [SPARK-12673][UI] Add missing uri prepending for job description Otherwise the url will be failed to proxy to the right one if in YARN mode. Here is the screenshot: ![screen shot 2016-01-06 at 5 28 26

spark git commit: [SPARK-12673][UI] Add missing uri prepending for job description

2016-01-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.5 5e86c0cce -> f2bc02ec4 [SPARK-12673][UI] Add missing uri prepending for job description Otherwise the url will be failed to proxy to the right one if in YARN mode. Here is the screenshot: ![screen shot 2016-01-06 at 5 28 26

spark git commit: [SPARK-12673][UI] Add missing uri prepending for job description

2016-01-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 8e19c7663 -> 174e72cec [SPARK-12673][UI] Add missing uri prepending for job description Otherwise the url will be failed to proxy to the right one if in YARN mode. Here is the screenshot: ![screen shot 2016-01-06 at 5 28 26

spark git commit: [SPARK-12510][STREAMING] Refactor ActorReceiver to support Java

2016-01-07 Thread zsxwing
ver` for Scala and `JavaActorReceiver` for Java 4. Add `JavaActorWordCount` example Author: Shixiong Zhu <shixi...@databricks.com> Closes #10457 from zsxwing/java-actor-stream. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit

spark git commit: [SPARK-12701][CORE] FileAppender should use join to ensure writing thread completion

2016-01-08 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master cfe1ba56e -> ea104b8f1 [SPARK-12701][CORE] FileAppender should use join to ensure writing thread completion Changed Logging FileAppender to use join in `awaitTermination` to ensure that thread is properly finished before returning.
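The idea behind using `join` can be sketched with JDK classes alone (illustrative, not Spark's `FileAppender` code): waiting for the writing thread with `Thread.join()` guarantees `run()` has fully completed, with every buffered line written, before `awaitTermination` returns, whereas polling a volatile "done" flag can observe the flag before the final writes happen.

```java
import java.util.ArrayList;
import java.util.List;

public class AppenderSketch {
    private final List<String> sink = new ArrayList<>();
    private final Thread writer;

    public AppenderSketch(List<String> lines) {
        this.writer = new Thread(() -> {
            for (String line : lines) {
                sink.add(line); // stands in for writing to the log file
            }
        });
        writer.start();
    }

    /** Blocks until the writing thread has completely finished. */
    public List<String> awaitTermination() throws InterruptedException {
        writer.join(); // happens-before: all of run()'s writes are visible
        return sink;
    }
}
```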

spark git commit: [SPARK-12617][PYSPARK] Move Py4jCallbackConnectionCleaner to Streaming

2016-01-06 Thread zsxwing
;shixi...@databricks.com> Closes #10621 from zsxwing/SPARK-12617-2. (cherry picked from commit 1e6648d62fb82b708ea54c51cd23bfe4f542856e) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/

spark git commit: [SPARK-12617][PYSPARK] Move Py4jCallbackConnectionCleaner to Streaming

2016-01-06 Thread zsxwing
;shixi...@databricks.com> Closes #10621 from zsxwing/SPARK-12617-2. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1e6648d6 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1e6648d6 Diff: http://git-wip-us.apache.org/

spark git commit: Revert "[SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url."

2016-01-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 8f0ead3e7 -> 39b0a3480 Revert "[SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url." This reverts commit 8f0ead3e79beb2c5f2731ceaa34fe1c133763386. Will merge #10618

spark git commit: Revert "[SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url."

2016-01-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 19e4e9feb -> cbaea9591 Revert "[SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url." This reverts commit 19e4e9febf9bb4fd69f6d7bc13a54844e4e096f1. Will merge #10618 instead.

spark git commit: Revert "[SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url."

2016-01-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.5 fb421af08 -> d10b9d572 Revert "[SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url." This reverts commit fb421af08de73e4ae6b04a576721109cae561865. Will merge #10618

spark git commit: [SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url.

2016-01-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 d821fae0e -> 8f0ead3e7 [SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url. Author: huangzhaowei Closes #10617 from SaintBacchus/SPARK-12672.

spark git commit: [SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url.

2016-01-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.5 598a5c2cc -> fb421af08 [SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url. Author: huangzhaowei Closes #10617 from SaintBacchus/SPARK-12672.

spark git commit: [SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url.

2016-01-06 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 1e6648d62 -> 19e4e9feb [SPARK-12672][STREAMING][UI] Use the uiRoot function instead of default root path to gain the streaming batch url. Author: huangzhaowei Closes #10617 from SaintBacchus/SPARK-12672.

spark git commit: [SPARK-11985][STREAMING][KINESIS][DOCS] Update Kinesis docs

2015-12-18 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 6eba65525 -> 2377b707f [SPARK-11985][STREAMING][KINESIS][DOCS] Update Kinesis docs - Provide example on `message handler` - Provide bit on KPL record de-aggregation - Fix typos Author: Burak Yavuz Closes #9970 from

spark git commit: [SPARK-11985][STREAMING][KINESIS][DOCS] Update Kinesis docs

2015-12-18 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 bd33d4ee8 -> eca401ee5 [SPARK-11985][STREAMING][KINESIS][DOCS] Update Kinesis docs - Provide example on `message handler` - Provide bit on KPL record de-aggregation - Fix typos Author: Burak Yavuz Closes #9970

spark git commit: [SPARK-12489][CORE][SQL][MLIB] Fix minor issues found by FindBugs

2015-12-28 Thread zsxwing
zed` and `ReentrantLock`. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10440 from zsxwing/findbugs. (cherry picked from commit 710b41172958a0b3a2b70c48821aefc81893731b) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-12489][CORE][SQL][MLIB] Fix minor issues found by FindBugs

2015-12-28 Thread zsxwing
ock`. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10440 from zsxwing/findbugs. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/710b4117 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/710b4117 Diff: h

spark git commit: [SPARK-12490][CORE] Limit the css style scope to fix the Streaming UI

2015-12-29 Thread zsxwing
f8b-39df08426bf8.png;> This PR just added a class for the new style and only applied them to the paged tables. Author: Shixiong Zhu <shixi...@databricks.com> Closes #10517 from zsxwing/fix-streaming-ui. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-u

spark git commit: [SPARK-11749][STREAMING] Duplicate creating the RDD in file stream when recovering from checkpoint data

2015-12-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 658f66e62 -> f4346f612 [SPARK-11749][STREAMING] Duplicate creating the RDD in file stream when recovering from checkpoint data Add a transient flag `DStream.restoredFromCheckpointData` to control the restore processing in DStream to

spark git commit: [SPARK-11749][STREAMING] Duplicate creating the RDD in file stream when recovering from checkpoint data

2015-12-17 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 4df1dd403 -> 9177ea383 [SPARK-11749][STREAMING] Duplicate creating the RDD in file stream when recovering from checkpoint data Add a transient flag `DStream.restoredFromCheckpointData` to control the restore processing in DStream to

spark git commit: [MINOR] Hide the error logs for 'SQLListenerMemoryLeakSuite'

2015-12-17 Thread zsxwing
ks.com> Closes #10363 from zsxwing/hide-log. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0370abdf Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0370abdf Diff: http://git-wip-us.apache.org/repos/asf/spark/diff

spark git commit: [SPARK-11872] Prevent the call to SparkContext#stop() in the listener bus's thread

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 19530da69 -> 81012546e [SPARK-11872] Prevent the call to SparkContext#stop() in the listener bus's thread This is continuation of SPARK-11761 Andrew suggested adding this protection. See tail of https://github.com/apache/spark/pull/9741

spark git commit: [SPARK-11999][CORE] Fix the issue that ThreadUtils.newDaemonCachedThreadPool doesn't cache any task

2015-11-25 Thread zsxwing
;shixi...@databricks.com> Closes #9978 from zsxwing/cached-threadpool. (cherry picked from commit d3ef693325f91a1ed340c9756c81244a80398eb2) Signed-off-by: Shixiong Zhu <shixi...@databricks.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/a

spark git commit: [SPARK-11999][CORE] Fix the issue that ThreadUtils.newDaemonCachedThreadPool doesn't cache any task

2015-11-25 Thread zsxwing
;shixi...@databricks.com> Closes #9978 from zsxwing/cached-threadpool. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d3ef6933 Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d3ef6933 Diff: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [STREAMING][FLAKY-TEST] Catch execution context race condition in `FileBasedWriteAheadLog.close()`

2015-11-24 Thread zsxwing
nit/org.apache.spark.streaming.util/BatchedWriteAheadLogWithCloseFileAfterWriteSuite/BatchedWriteAheadLog___clean_old_logs/ The reason the test fails is in `afterEach`, `writeAheadLog.close` is called, and there may still be async deletes in flight. tdas zsxwing Author: Burak Yavuz <brk...@gmail.com> Clo

spark git commit: [STREAMING][FLAKY-TEST] Catch execution context race condition in `FileBasedWriteAheadLog.close()`

2015-11-24 Thread zsxwing
nit/org.apache.spark.streaming.util/BatchedWriteAheadLogWithCloseFileAfterWriteSuite/BatchedWriteAheadLog___clean_old_logs/ The reason the test fails is in `afterEach`, `writeAheadLog.close` is called, and there may still be async deletes in flight. tdas zsxwing Author: Burak Yavuz <brk...@gmail.com>

spark git commit: [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 151d7c2ba -> 216988688 [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file This solves the following exception caused when empty state RDD is checkpointed and recovered. The root cause

spark git commit: [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file

2015-11-24 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-1.6 68bcb9b33 -> 7f030aa42 [SPARK-11979][STREAMING] Empty TrackStateRDD cannot be checkpointed and recovered from checkpoint file This solves the following exception caused when empty state RDD is checkpointed and recovered. The root

spark git commit: [SPARK-12058][HOTFIX] Disable KinesisStreamTests

2015-11-30 Thread zsxwing
es #10047 from zsxwing/disable-python-kinesis-test. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/edb26e7f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/edb26e7f Diff: http://git-wip-us.apache.org/repos/asf/s

spark git commit: [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 8900c8d8f -> 9aff6f3b1 [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files What changes were proposed in this pull request? This PR is to address the following issues: - **ISSUE 1:** For ORC source format, we are

spark git commit: [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 cd7bf4b8e -> 32b025e94 [SPARK-15515][SQL] Error Handling in Running SQL Directly On Files What changes were proposed in this pull request? This PR is to address the following issues: - **ISSUE 1:** For ORC source format, we are

spark git commit: [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter`

2016-06-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 c1390ccbb -> f15d641e2 [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter` ## What changes were proposed in this pull request? It doesn't make sense to specify partitioning parameters, when we write data out from

spark git commit: [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter`

2016-06-10 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5c16ad0d5 -> fb219029d [SPARK-15871][SQL] Add `assertNotPartitioned` check in `DataFrameWriter` ## What changes were proposed in this pull request? It doesn't make sense to specify partitioning parameters, when we write data out from

spark git commit: [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests.

2016-06-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master aa0364510 -> 83070cd1d [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests. Description from JIRA. In ReplSuite, a test that can be tested well in just local mode should not have to start a local-cluster. And

spark git commit: [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests.

2016-06-09 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 b2d076c35 -> 3119d8eef [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests. Description from JIRA. In ReplSuite, a test that can be tested well in just local mode should not have to start a local-cluster.

spark git commit: [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change.

2016-05-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 f63ba2210 -> 69327667d [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change. ## What changes were proposed in this pull request? 1. Making 'name' field of RDDInfo mutable. 2. In StorageListener: catching the fact

spark git commit: [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change.

2016-05-25 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 4f27b8dd5 -> b120fba6a [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change. ## What changes were proposed in this pull request? 1. Making 'name' field of RDDInfo mutable. 2. In StorageListener: catching the fact that

spark git commit: [SPARK-15508][STREAMING][TESTS] Fix flaky test: JavaKafkaStreamSuite.testKafkaStream

2016-05-24 Thread zsxwing
ark-tests.appspot.com/tests/org.apache.spark.streaming.kafka.JavaKafkaStreamSuite/testKafkaStream ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13281 from zsxwing/flaky-kafka-test. Project: http://git-wip-us.apache.org/repos/

spark git commit: [SPARK-15508][STREAMING][TESTS] Fix flaky test: JavaKafkaStreamSuite.testKafkaStream

2016-05-24 Thread zsxwing
ttp://spark-tests.appspot.com/tests/org.apache.spark.streaming.kafka.JavaKafkaStreamSuite/testKafkaStream ## How was this patch tested? Jenkins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13281 from zsxwing/flaky-kafka-test. (cherry picked fr

spark git commit: [SPARK-15697][REPL] Unblock some of the useful repl commands.

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 938434dc7 -> 4134653e5 [SPARK-15697][REPL] Unblock some of the useful repl commands. ## What changes were proposed in this pull request? Unblock some of the useful repl commands, like "implicits", "javap", "power", "type", "kind". As

spark git commit: [SPARK-15697][REPL] Unblock some of the useful repl commands.

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 c01dc815d -> 413826d40 [SPARK-15697][REPL] Unblock some of the useful repl commands. ## What changes were proposed in this pull request? Unblock some of the useful repl commands, like "implicits", "javap", "power", "type", "kind".

spark git commit: [SPARK-15935][PYSPARK] Fix a wrong format tag in the error message

2016-06-14 Thread zsxwing
ins unit tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13665 from zsxwing/fix. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0ee9fd9e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0ee9fd9e D
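The class of bug fixed in this entry — a wrong printf-style conversion tag in an error message — is easy to reproduce: with `%d` applied to a string, building the "helpful" message itself throws a `TypeError`, masking the real error. A minimal sketch with a hypothetical message string (not Spark's actual text):

```python
value = "cluster"

# Buggy: %d demands a number, so raising the error message itself fails.
try:
    raise ValueError("Unsupported mode: %d" % value)
except TypeError:
    failed_to_format = True

assert failed_to_format

# Fixed: %s accepts any value, so the intended message reaches the user.
try:
    raise ValueError("Unsupported mode: %s" % value)
except ValueError as e:
    message = str(e)

assert message == "Unsupported mode: cluster"
```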

[1/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master d30b7e669 -> 9a5071996 http://git-wip-us.apache.org/repos/asf/spark/blob/9a507199/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamReaderWriterSuite.scala

[2/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
http://git-wip-us.apache.org/repos/asf/spark/blob/9a507199/sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala -- diff --git

[3/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
[SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery Renamed for simplicity, so that it's obvious that it's related to streaming. Existing unit tests. Author: Tathagata Das Closes #13673 from tdas/SPARK-15953. (cherry picked from commit

[2/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
http://git-wip-us.apache.org/repos/asf/spark/blob/885e74a3/sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala -- diff --git

[1/3] spark git commit: [SPARK-15953][WIP][STREAMING] Renamed ContinuousQuery to StreamingQuery

2016-06-15 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 4c950a757 -> 885e74a38 http://git-wip-us.apache.org/repos/asf/spark/blob/885e74a3/sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamReaderWriterSuite.scala

spark git commit: [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 d9db8a9c8 -> 97fe1d8ee [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery ## What changes were proposed in this pull request? ContinuousQueries have names that are unique across all the active ones. However, when

spark git commit: [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery

2016-06-13 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 5ad4e32d4 -> c654ae214 [SPARK-15889][SQL][STREAMING] Add a unique id to ContinuousQuery ## What changes were proposed in this pull request? ContinuousQueries have names that are unique across all the active ones. However, when queries

spark git commit: Revert "[SPARK-11753][SQL][TEST-HADOOP2.2] Make allowNonNumericNumbers option work

2016-05-31 Thread zsxwing
ent a PR to run Jenkins tests due to the revert conflicts of `dev/deps/spark-deps-hadoop*`. ## How was this patch tested? Jenkins unit tests, integration tests, manual tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13417 from zsxwing/revert-SPARK-11753. (cherry pick

spark git commit: Revert "[SPARK-11753][SQL][TEST-HADOOP2.2] Make allowNonNumericNumbers option work

2016-05-31 Thread zsxwing
R to run Jenkins tests due to the revert conflicts of `dev/deps/spark-deps-hadoop*`. ## How was this patch tested? Jenkins unit tests, integration tests, manual tests. Author: Shixiong Zhu <shixi...@databricks.com> Closes #13417 from zsxwing/revert-SPARK-11753. Project: http://git-wip

spark git commit: [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/master 63b7f127c -> 7c07d176f [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks ## What changes were proposed in this pull request? Set minimum number of dispatcher threads to 3 to avoid deadlocks on machines with only

spark git commit: [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks

2016-06-02 Thread zsxwing
Repository: spark Updated Branches: refs/heads/branch-2.0 fe639adea -> 18d613a4d [SPARK-15606][CORE] Use non-blocking removeExecutor call to avoid deadlocks ## What changes were proposed in this pull request? Set minimum number of dispatcher threads to 3 to avoid deadlocks on machines with
