[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/4135 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-113335605 @tdas @uncleGen any updates? If not, I would recommend that we close this PR for now and reopen it later if there is more activity. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-91069166 On closer inspection, my [previous comment](https://github.com/apache/spark/pull/4135#discussion_r26727249) about testing still stands > I notice that this test harness the logic from the real receiver tracker. Is this test suite actually covering the changes made to ReceiverTracker? In other words, if I were to go in and delete the callback registration code that you added to the real tracker, would that cause any tests to fail? If not, then I think we need to do additional work to test this, since the test isn't really testing anything if it can't expose bugs in the real code. > It seems like we'd like to test against a real ReceiverTracker using mock implementations of ReceiverSupervisor and Receiver. I'm not super-familiar with this corner of the streaming code, though, so maybe @tdas can chime in with suggestions on how to test this? Since the changes in this patch modify ReceiverTracker, it makes sense to test the actual ReceiverTracker while mocking other components. The updated test still doesn't exercise the real receiver tracker and thus isn't actually covering the streaming changes in this patch. I understand that properly testing ReceiverTracker might not be the easiest test to write, so perhaps this would require refactoring of ReceiverTracker to make the individual components more testable. I'll let @tdas chime in to decide on the importance of writing a real test for this. Sorry to be pedantic, but I'm a bit of a stickler for the importance of writing tests that actually exercise the code being modified, since I think it's very rare that it's truly impossible to test changes like this. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r28024130 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala --- @@ -295,6 +334,27 @@ class ReceiverSuite extends TestSuiteBase with Timeouts with Serializable { } /** + * An implementation of Task running a NetworkReceiver used for testing whether + * we can start/stop receiver properly. + */ + class FakeReceiverTask( +stageId: Int, +executor: FakeReceiverSupervisor, --- End diff -- How about calling this `supervisor` instead of `executor`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r28024103 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala --- @@ -295,6 +334,27 @@ class ReceiverSuite extends TestSuiteBase with Timeouts with Serializable { } /** + * An implementation of Task running a NetworkReceiver used for testing whether + * we can start/stop receiver properly. + */ + class FakeReceiverTask( +stageId: Int, +executor: FakeReceiverSupervisor, +prefLocs: Seq[TaskLocation] = Nil) extends Task[Unit](stageId, 0) { --- End diff -- It looks like we never call this with a non-Nil value of `prefLocs`, so we can probably drop the constructor argument and just hardcode `Nil` into the `preferredLocations` method. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r28024057 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala --- @@ -295,6 +334,27 @@ class ReceiverSuite extends TestSuiteBase with Timeouts with Serializable { } /** + * An implementation of Task running a NetworkReceiver used for testing whether + * we can start/stop receiver properly. + */ + class FakeReceiverTask( +stageId: Int, --- End diff -- Style nit: these arguments should be indented another two spaces. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r28023730 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala --- @@ -129,6 +130,44 @@ class ReceiverSuite extends TestSuiteBase with Timeouts with Serializable { } } + test("check the state of `Receiver` and `ReceiverSupervisor` when receivers are killed") { --- End diff -- I'd add `(SPARK-5205)` to the name of this test, if it will fit, or just add a comment on the next line referencing the JIRA number. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r28023642 --- Diff: project/MimaExcludes.scala --- @@ -64,6 +64,10 @@ object MimaExcludes { // SPARK-6492 Fix deadlock in SparkContext.stop() ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.SparkContext.org$" + "apache$spark$SparkContext$$SPARK_CONTEXT_CONSTRUCTOR_LOCK") + ) ++ Seq( --- End diff -- The indentation here looks a bit off. Mind indenting this back a bit so that it lines up with the other closing braces (e.g. line 63)? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r28023348 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskContextSuite.scala --- @@ -64,6 +64,53 @@ class TaskContextSuite extends FunSuite with BeforeAndAfter with LocalSparkConte verify(listener, times(1)).onTaskCompletion(any()) } + test("calls TaskInterruptionListener when task killed") { +TaskContextSuite.interrupted = false +sc = new SparkContext("local", "test") +val rdd = new RDD[String](sc, List()) { + override def getPartitions = Array[Partition](StubPartition(0)) + override def compute(split: Partition, context: TaskContext) = { +context.addTaskInterruptionListener(context => TaskContextSuite.interrupted = true) +Thread.sleep(1000) +Iterator("test") + } +} +val closureSerializer = SparkEnv.get.closureSerializer.newInstance() +val func = (c: TaskContext, i: Iterator[String]) => i.next() +val task = new ResultTask[String, String]( + 0, sc.broadcast(closureSerializer.serialize((rdd, func)).array), rdd.partitions(0), Seq(), 0) + +task.kill(false) +assert(TaskContextSuite.interrupted === false) + +val taskThread = new Thread("Task") { + override def run(): Unit = { +task.run(0, 0) + } +} +taskThread.start() +Thread.sleep(500) --- End diff -- If possible, we should avoid adding any new Thread.sleep() calls in our tests, since they slow things down and have been a major source of flakiness in the past. How about adding a CountdownLatch to the TaskContexSuite companion object and using that to block the main thread until the body of the task has started executing and the interruption listener has been registered? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r28023099 --- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskContextSuite.scala --- @@ -64,6 +64,53 @@ class TaskContextSuite extends FunSuite with BeforeAndAfter with LocalSparkConte verify(listener, times(1)).onTaskCompletion(any()) } + test("calls TaskInterruptionListener when task killed") { +TaskContextSuite.interrupted = false +sc = new SparkContext("local", "test") +val rdd = new RDD[String](sc, List()) { + override def getPartitions = Array[Partition](StubPartition(0)) + override def compute(split: Partition, context: TaskContext) = { +context.addTaskInterruptionListener(context => TaskContextSuite.interrupted = true) +Thread.sleep(1000) +Iterator("test") + } +} +val closureSerializer = SparkEnv.get.closureSerializer.newInstance() +val func = (c: TaskContext, i: Iterator[String]) => i.next() +val task = new ResultTask[String, String]( + 0, sc.broadcast(closureSerializer.serialize((rdd, func)).array), rdd.partitions(0), Seq(), 0) + +task.kill(false) --- End diff -- The task isn't running at this point, so what's the purpose of calling `kill` here? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r28022977 --- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala --- @@ -106,6 +105,20 @@ abstract class TaskContext extends Serializable { def addOnCompleteCallback(f: () => Unit) /** + * Add a (Java friendly) listener to be executed on task interruption. We add this + * listener for some more clean works. An example use is to stop `receiver supervisor` + * properly. + */ + def addTaskInterruptionListener(listener: TaskInterruptionListener): TaskContext + + /** + * Add a listener in the form of a Scala closure to be executed on task interruption. + * We add this listener for some more clean works. An example use is to stop `receiver --- End diff -- I'd reword this slightly to be a bit more explicit: > Add a listener in the form of a Scala closure to be executed on task interruption. An example use is to cleanly shut down long-running receiver tasks in Spark Streaming. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r28022727 --- Diff: core/src/main/scala/org/apache/spark/util/TaskInterruptionListener.scala --- @@ -0,0 +1,33 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.util + +import java.util.EventListener + +import org.apache.spark.TaskContext +import org.apache.spark.annotation.DeveloperApi + +/** + * :: DeveloperApi :: + * + * Listener providing a callback function to invoke when interrupt a task. --- End diff -- "when interrupt" -> "when interrupting" --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-90476533 Ha~llo~, @JoshRosen --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-90455427 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29784/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-90455405 [Test build #29784 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29784/consoleFull) for PR 4135 at commit [`debfbac`](https://github.com/apache/spark/commit/debfbacd088d910351aa3fb0070e9ad3ea8ffce3). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. * This patch does not change any dependencies. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-90405475 [Test build #29784 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29784/consoleFull) for PR 4135 at commit [`debfbac`](https://github.com/apache/spark/commit/debfbacd088d910351aa3fb0070e9ad3ea8ffce3). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-88342325 @JoshRosen Your comments are reasonable, and I have improved the relevant codes just what you pointed. And more, I add some unit tests about the behavior about `TaskInterruptionListener`. In the same time, I left some unit test in `ReceiverSuite` to test the behavior about `Receiver` and `ReceiverSupervisor`. Maybe, @tdas has some suggestions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-88089553 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29478/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-88089542 [Test build #29478 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29478/consoleFull) for PR 4135 at commit [`3a68742`](https://github.com/apache/spark/commit/3a68742a5631e9492ebdf29813ba547e1e4842bf). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait TaskInterruptionListener extends EventListener ` * `class TaskInterruptionListenerException(errorMessages: Seq[String]) extends Exception ` * This patch does not change any dependencies. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-88064582 [Test build #29478 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29478/consoleFull) for PR 4135 at commit [`3a68742`](https://github.com/apache/spark/commit/3a68742a5631e9492ebdf29813ba547e1e4842bf). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-88013418 wait for a moment --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-87990165 [Test build #29469 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29469/consoleFull) for PR 4135 at commit [`e26a207`](https://github.com/apache/spark/commit/e26a207b1f53ce426677221a03fe100fd0a72bf4). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait TaskInterruptionListener extends EventListener ` * `class TaskInterruptionListenerException(errorMessages: Seq[String]) extends Exception ` * This patch does not change any dependencies. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-87990173 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29469/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-87987214 [Test build #29469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29469/consoleFull) for PR 4135 at commit [`e26a207`](https://github.com/apache/spark/commit/e26a207b1f53ce426677221a03fe100fd0a72bf4). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
GitHub user uncleGen reopened a pull request: https://github.com/apache/spark/pull/4135 [SPARK-5205][Streaming]:Inconsistent behaviour between Streaming job and others, when click kill link in WebUI The "kill" link is used to kill a stage in job. It works in any kinds of Spark job but Spark Streaming. To be specific, we can only kill the stage which is used to run "Receiver", but not kill the "Receivers". Well, the stage can be killed and cleaned from the ui, but the receivers are still alive and receiving data. I think it dose not fit with the common sense. IMHO, killing the "receiver" stage means kill the "receivers" and stopping receiving data. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark master-clean-150121 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4135.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4135 commit c90a288ca6fde01008ff3ab5d04970c8f120c4b1 Author: uncleGen Date: 2015-01-21T08:58:10Z BUG FIX: Inconsistent behaviour between Streaming job and others, when click kill link in WebUI commit fc3be2c0d3cd3e8f2f15a4a4f0fa504b28164d38 Author: uncleGen Date: 2015-01-21T09:16:12Z fix commit d85dede41dce4a32748037871241da1620ada564 Author: uncleGen Date: 2015-01-21T09:23:59Z style fix commit f3162618d3eb1de3e6690e1403bf1943145d316a Author: uncleGen Date: 2015-02-03T02:36:52Z resolve merge conflicts commit 168115cb364285867bc9e7af400d48922712c71c Author: uncleGen Date: 2015-03-02T03:46:42Z resolve merge conflicts commit 047931c531b7d307de201ba74ed5dcb7da7c6559 Author: uncleGen Date: 2015-03-02T05:24:18Z resolve merge conflicts commit d8b57dfff760ce7a3b2413c3dcada3e1572fb7e8 Author: uncleGen Date: 2015-03-09T03:28:37Z Merge branch 'master-clean' into master-clean-150121 commit 4417ff06aeeea63b1a2f4e8a12d736a050abf057 Author: uncleGen Date: 2015-03-09T08:19:02Z update commit 963556d4b64c81e56913a2860f596c6b25b1286d Author: uncleGen Date: 2015-03-10T03:14:15Z add unit test commit f997698a0233d351941fd7ddd9577a659806281e Author: uncleGen Date: 2015-03-10T03:42:03Z fix unit test commit 705118453ab6f9869ac090ca2bac009f167869cd Author: uncleGen Date: 2015-03-10T08:57:21Z minor fix commit 92fb864e384e298eead53f10844365ea1887a929 Author: uncleGen Date: 2015-03-12T11:41:33Z roll back commit 98166e7e8b43d1a912b2b6a468eb6d8f1f0297f2 Author: uncleGen Date: 2015-03-12T11:44:41Z minor fix commit baa175898df8d36b36ec9b9494ebbd117acfae87 Author: uncleGen Date: 2015-03-12T11:47:23Z minor fix commit fe3e5d52bf8680a021b00dd598cdc5a3f1c7df7e Author: uncleGen Date: 2015-03-12T11:51:11Z minor fix commit fb9716d3feee4523a3ee4cddd0a9c0926e228099 Author: uncleGen Date: 2015-03-12T11:52:33Z minor fix commit 02bf9a936f798b7eb40dcd1183ea14b5c2125deb Author: uncleGen Date: 2015-03-12T12:17:46Z minor fix commit 2bac6564a0a1153ec1366bfa4da085a6e89d07d5 Author: uncleGen Date: 2015-03-12T13:49:21Z minor fix commit 2544f8e5ad0e205e4043bd47646f57d5ed1f42a8 Author: uncleGen Date: 2015-03-17T13:52:36Z roll back to original approach commit b5237d86ddfcfd0b1b9b5bcd6952902ee1d843b6 Author: uncleGen Date: 2015-03-17T14:03:10Z resolve merge conflict commit 96f21820250bec9c7286489baa0f7dbd96758df6 Author: uncleGen Date: 2015-03-18T02:38:58Z resolve scala style error commit b2916fb7cdb6047eb618c2da767d18eb5a4e0caf Author: uncleGen Date: 2015-03-18T09:20:55Z minor fix commit 77df65a1d96bb98ee7617590825d63c8ee6a26f9 Author: uncleGen Date: 2015-03-31T06:30:53Z Merge branch 'master-clean-tmp' into master-clean-150121 Conflicts: streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceiverSupervisor.scala streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala commit e6724ecac594e818973a6d813c4d012fca1bd06d Author: uncleGen Date: 2015-03-31T06:40:01Z minor fix commit e05e87f3a5f3f7325f7e31591d7556afdef8b8dd Author: uncleGen Date: 2015-03-31T06:56:56Z minor fix commit e26a207b1f53ce426677221a03fe100fd0a72bf4 Author: uncleGen Date: 2015-03-31T07:55:45Z update test suite --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/4135 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-87967820 @JoshRosen Your comments are reasonable, and I have improved related code just as what you pointed. For the test suite, I just check if the state of `Receiver` and `ReceiverSupervisor` are correct. In the previous way, killing `task` could not stop `Receiver` and `ReceiverSupervisor` properly. I think it is enough. @tdas, what is your opinion? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-87967183 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29461/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-87967179 [Test build #29461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29461/consoleFull) for PR 4135 at commit [`e6724ec`](https://github.com/apache/spark/commit/e6724ecac594e818973a6d813c4d012fca1bd06d). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait TaskInterruptionListener extends EventListener ` * `class TaskInterruptionListenerException(errorMessages: Seq[String]) extends Exception ` * `class UDFRegistration(object):` * This patch does not change any dependencies. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-87964217 [Test build #29461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/29461/consoleFull) for PR 4135 at commit [`e6724ec`](https://github.com/apache/spark/commit/e6724ecac594e818973a6d813c4d012fca1bd06d). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
GitHub user uncleGen reopened a pull request: https://github.com/apache/spark/pull/4135 [SPARK-5205][Streaming]:Inconsistent behaviour between Streaming job and others, when click kill link in WebUI The "kill" link is used to kill a stage in job. It works in any kinds of Spark job but Spark Streaming. To be specific, we can only kill the stage which is used to run "Receiver", but not kill the "Receivers". Well, the stage can be killed and cleaned from the ui, but the receivers are still alive and receiving data. I think it dose not fit with the common sense. IMHO, killing the "receiver" stage means kill the "receivers" and stopping receiving data. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark master-clean-150121 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4135.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4135 commit c90a288ca6fde01008ff3ab5d04970c8f120c4b1 Author: uncleGen Date: 2015-01-21T08:58:10Z BUG FIX: Inconsistent behaviour between Streaming job and others, when click kill link in WebUI commit fc3be2c0d3cd3e8f2f15a4a4f0fa504b28164d38 Author: uncleGen Date: 2015-01-21T09:16:12Z fix commit d85dede41dce4a32748037871241da1620ada564 Author: uncleGen Date: 2015-01-21T09:23:59Z style fix commit f3162618d3eb1de3e6690e1403bf1943145d316a Author: uncleGen Date: 2015-02-03T02:36:52Z resolve merge conflicts commit 168115cb364285867bc9e7af400d48922712c71c Author: uncleGen Date: 2015-03-02T03:46:42Z resolve merge conflicts commit 047931c531b7d307de201ba74ed5dcb7da7c6559 Author: uncleGen Date: 2015-03-02T05:24:18Z resolve merge conflicts commit d8b57dfff760ce7a3b2413c3dcada3e1572fb7e8 Author: uncleGen Date: 2015-03-09T03:28:37Z Merge branch 'master-clean' into master-clean-150121 commit 4417ff06aeeea63b1a2f4e8a12d736a050abf057 Author: uncleGen Date: 2015-03-09T08:19:02Z update commit 963556d4b64c81e56913a2860f596c6b25b1286d Author: uncleGen Date: 2015-03-10T03:14:15Z add unit test commit f997698a0233d351941fd7ddd9577a659806281e Author: uncleGen Date: 2015-03-10T03:42:03Z fix unit test commit 705118453ab6f9869ac090ca2bac009f167869cd Author: uncleGen Date: 2015-03-10T08:57:21Z minor fix commit 92fb864e384e298eead53f10844365ea1887a929 Author: uncleGen Date: 2015-03-12T11:41:33Z roll back commit 98166e7e8b43d1a912b2b6a468eb6d8f1f0297f2 Author: uncleGen Date: 2015-03-12T11:44:41Z minor fix commit baa175898df8d36b36ec9b9494ebbd117acfae87 Author: uncleGen Date: 2015-03-12T11:47:23Z minor fix commit fe3e5d52bf8680a021b00dd598cdc5a3f1c7df7e Author: uncleGen Date: 2015-03-12T11:51:11Z minor fix commit fb9716d3feee4523a3ee4cddd0a9c0926e228099 Author: uncleGen Date: 2015-03-12T11:52:33Z minor fix commit 02bf9a936f798b7eb40dcd1183ea14b5c2125deb Author: uncleGen Date: 2015-03-12T12:17:46Z minor fix commit 2bac6564a0a1153ec1366bfa4da085a6e89d07d5 Author: uncleGen Date: 2015-03-12T13:49:21Z minor fix commit 2544f8e5ad0e205e4043bd47646f57d5ed1f42a8 Author: uncleGen Date: 2015-03-17T13:52:36Z roll back to original approach commit b5237d86ddfcfd0b1b9b5bcd6952902ee1d843b6 Author: uncleGen Date: 2015-03-17T14:03:10Z resolve merge conflict commit 96f21820250bec9c7286489baa0f7dbd96758df6 Author: uncleGen Date: 2015-03-18T02:38:58Z resolve scala style error commit b2916fb7cdb6047eb618c2da767d18eb5a4e0caf Author: uncleGen Date: 2015-03-18T09:20:55Z minor fix commit 77df65a1d96bb98ee7617590825d63c8ee6a26f9 Author: uncleGen Date: 2015-03-31T06:30:53Z Merge branch 'master-clean-tmp' into master-clean-150121 Conflicts: streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceiverSupervisor.scala streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala commit e6724ecac594e818973a6d813c4d012fca1bd06d Author: uncleGen Date: 2015-03-31T06:40:01Z minor fix --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/4135 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-87918010 @uncleGen No worries; I'm not in a huge rush, but just wanted to check in to see if you planned to update this. Sorry that the review process has been so slow; thanks for your patience! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-87908164 @JoshRosen Sorry for my laziness. I will update it as soon as possible --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-87870087 Any update on this? This patch looked good to me overall; the main thrust of my feedback was that the tests here didn't seem sufficient to actually exercise the code added in this patch, but I think that shouldn't be too hard to fix. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-83291328 I left a pass of comments, mostly about naming / documentation things. The overall approach looks good, but I'm not sure if this is the right / best unit test for this logic because the test harness seems to duplicate most of the logic being tested. I'll let @tdas chime in regarding tests. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26727249 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala --- @@ -296,6 +335,27 @@ class ReceiverSuite extends TestSuiteBase with Timeouts with Serializable { } /** + * An implementation of Task running a NetworkReceiver used for testing whether + * we can start/stop receiver properly. + */ + class FakeReceiverTask( +stageId: Int, +executor: FakeReceiverSupervisor, +prefLocs: Seq[TaskLocation] = Nil) extends Task[Unit](stageId, 0) { +override def runTask(context: TaskContext): Unit = { + context.addTaskInterruptedListener { context => --- End diff -- I notice that this test harness the logic from the real receiver tracker. Is this test suite actually covering the changes made to ReceiverTracker? In other words, if I were to go in and delete the callback registration code that you added to the real tracker, would that cause any tests to fail? If not, then I think we need to do additional work to test this, since the test isn't really testing anything if it can't expose bugs in the real code. It seems like we'd like to test against a real ReceiverTracker using mock implementations of ReceiverSupervisor and Receiver. I'm not super-familiar with this corner of the streaming code, though, so maybe @tdas can chime in with suggestions on how to test this? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26726883 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala --- @@ -295,6 +296,14 @@ class ReceiverTracker(ssc: StreamingContext, skipReceiverLaunch: Boolean = false val receiver = iterator.next() val supervisor = new ReceiverSupervisorImpl( receiver, SparkEnv.get, serializableHadoopConf.value, checkpointDirOption) + +// Due to SPARK-5205, `Receiver` stage can not be stopped properly, we need to add a --- End diff -- It's fine to be a bit more explicit / verbose about what's going on here, so I'd maybe update this comment to say something like this: > To ensure that receivers are stopped properly when their tasks / stages are killed, register a callback to stop the receiver when its task is interrupted. See SPARK-5205 for more details. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26726784 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceiverSupervisor.scala --- @@ -23,9 +23,9 @@ import scala.collection.mutable.ArrayBuffer import org.apache.spark.{Logging, SparkConf} import org.apache.spark.storage.StreamBlockId -import java.util.concurrent.CountDownLatch import scala.concurrent._ import ExecutionContext.Implicits.global +import java.util.concurrent.CountDownLatch --- End diff -- We don't actually have to modify this file as part of this patch, though, so it'd be fine to just roll it back to what's in master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26726766 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceiverSupervisor.scala --- @@ -23,9 +23,9 @@ import scala.collection.mutable.ArrayBuffer import org.apache.spark.{Logging, SparkConf} import org.apache.spark.storage.StreamBlockId -import java.util.concurrent.CountDownLatch import scala.concurrent._ import ExecutionContext.Implicits.global +import java.util.concurrent.CountDownLatch --- End diff -- The import ordering in this file was a bit of a mess to begin with, but if you're going to change it would you mind bringing it up to date with the ordering described in the style guide (e.g. java imports, then Scala, then third-party packages, then Spark)? https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26726692 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -82,9 +97,27 @@ private[spark] class TaskContextImpl( } } - /** Marks the task for interruption, i.e. cancellation. */ + /** + * Marks the task as interruption, i.e. cancellation. We add this + * method for some more clean works. For example, we need to register + * an "interruption" callback to completely stop a receiver supervisor. + */ private[spark] def markInterrupted(): Unit = { interrupted = true +val errorMsgs = new ArrayBuffer[String](2) +// Process interruption callbacks in the reverse order of registration +onInterruptedCallbacks.reverse.foreach { listener => + try { +listener.onTaskInterrupted(this) + } catch { +case e: Throwable => --- End diff -- It might be dangerous for this code to catch all Throwables, since this could also catch irrecoverable exceptions like OutOfMemoryError. Instead, it might be safer to use `case NonFatal(e) =>`. If you agree, then let's also update the original code in `markTaskCompleted()` to catch NonFatal as well. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26726595 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -82,9 +97,27 @@ private[spark] class TaskContextImpl( } } - /** Marks the task for interruption, i.e. cancellation. */ + /** + * Marks the task as interruption, i.e. cancellation. We add this --- End diff -- I'd reword this comment slightly to say something like "Marks the task for interruption (i.e. cancellation) and calls any task interruption callbacks registered for this task." We already document the sample use-case on the public addTaskInterruptionListener callback, so I don't think we need to do it here as well. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26726510 --- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala --- @@ -106,6 +105,20 @@ abstract class TaskContext extends Serializable { def addOnCompleteCallback(f: () => Unit) /** + * Add a (Java friendly) listener to be executed on task interruption. We add this + * listener for some more clean works. An example use is to stop `receiver supervisor` + * properly. + */ + def addTaskInterruptedListener(listener: TaskInterruptionListener): TaskContext --- End diff -- For consistency, I think we should call this `addTaskInterruptionListener`, since the actual listener class is named `TaskInterruptionListener` and for consistency with the other listener, which is named "completion" rather than "complete". --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82889057 [Test build #28788 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28788/consoleFull) for PR 4135 at commit [`b2916fb`](https://github.com/apache/spark/commit/b2916fb7cdb6047eb618c2da767d18eb5a4e0caf). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait TaskInterruptionListener extends EventListener ` * `class TaskInterruptionListenerException(errorMessages: Seq[String]) extends Exception ` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82889063 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28788/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82844745 @JoshRosen Could you please take a look again, thank you! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82844809 [Test build #28788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28788/consoleFull) for PR 4135 at commit [`b2916fb`](https://github.com/apache/spark/commit/b2916fb7cdb6047eb618c2da767d18eb5a4e0caf). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82729606 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28765/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82729600 [Test build #28765 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28765/consoleFull) for PR 4135 at commit [`96f2182`](https://github.com/apache/spark/commit/96f21820250bec9c7286489baa0f7dbd96758df6). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait TaskInterruptionListener extends EventListener ` * `class TaskInterruptionListenerException(errorMessages: Seq[String]) extends Exception ` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82696976 [Test build #28765 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28765/consoleFull) for PR 4135 at commit [`96f2182`](https://github.com/apache/spark/commit/96f21820250bec9c7286489baa0f7dbd96758df6). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82372334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28723/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82372328 [Test build #28723 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28723/consoleFull) for PR 4135 at commit [`b5237d8`](https://github.com/apache/spark/commit/b5237d86ddfcfd0b1b9b5bcd6952902ee1d843b6). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `trait TaskInterruptionListener extends EventListener ` * `class TaskInterruptionListenerException(errorMessages: Seq[String]) extends Exception ` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82371995 [Test build #28723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28723/consoleFull) for PR 4135 at commit [`b5237d8`](https://github.com/apache/spark/commit/b5237d86ddfcfd0b1b9b5bcd6952902ee1d843b6). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
GitHub user uncleGen reopened a pull request: https://github.com/apache/spark/pull/4135 [SPARK-5205][Streaming]:Inconsistent behaviour between Streaming job and others, when click kill link in WebUI The "kill" link is used to kill a stage in job. It works in any kinds of Spark job but Spark Streaming. To be specific, we can only kill the stage which is used to run "Receiver", but not kill the "Receivers". Well, the stage can be killed and cleaned from the ui, but the receivers are still alive and receiving data. I think it dose not fit with the common sense. IMHO, killing the "receiver" stage means kill the "receivers" and stopping receiving data. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark master-clean-150121 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4135.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4135 commit c90a288ca6fde01008ff3ab5d04970c8f120c4b1 Author: uncleGen Date: 2015-01-21T08:58:10Z BUG FIX: Inconsistent behaviour between Streaming job and others, when click kill link in WebUI commit fc3be2c0d3cd3e8f2f15a4a4f0fa504b28164d38 Author: uncleGen Date: 2015-01-21T09:16:12Z fix commit d85dede41dce4a32748037871241da1620ada564 Author: uncleGen Date: 2015-01-21T09:23:59Z style fix commit f3162618d3eb1de3e6690e1403bf1943145d316a Author: uncleGen Date: 2015-02-03T02:36:52Z resolve merge conflicts commit 168115cb364285867bc9e7af400d48922712c71c Author: uncleGen Date: 2015-03-02T03:46:42Z resolve merge conflicts commit 047931c531b7d307de201ba74ed5dcb7da7c6559 Author: uncleGen Date: 2015-03-02T05:24:18Z resolve merge conflicts commit d8b57dfff760ce7a3b2413c3dcada3e1572fb7e8 Author: uncleGen Date: 2015-03-09T03:28:37Z Merge branch 'master-clean' into master-clean-150121 commit 4417ff06aeeea63b1a2f4e8a12d736a050abf057 Author: uncleGen Date: 2015-03-09T08:19:02Z update commit 963556d4b64c81e56913a2860f596c6b25b1286d Author: uncleGen Date: 2015-03-10T03:14:15Z add unit test commit f997698a0233d351941fd7ddd9577a659806281e Author: uncleGen Date: 2015-03-10T03:42:03Z fix unit test commit 705118453ab6f9869ac090ca2bac009f167869cd Author: uncleGen Date: 2015-03-10T08:57:21Z minor fix commit 92fb864e384e298eead53f10844365ea1887a929 Author: uncleGen Date: 2015-03-12T11:41:33Z roll back commit 98166e7e8b43d1a912b2b6a468eb6d8f1f0297f2 Author: uncleGen Date: 2015-03-12T11:44:41Z minor fix commit baa175898df8d36b36ec9b9494ebbd117acfae87 Author: uncleGen Date: 2015-03-12T11:47:23Z minor fix commit fe3e5d52bf8680a021b00dd598cdc5a3f1c7df7e Author: uncleGen Date: 2015-03-12T11:51:11Z minor fix commit fb9716d3feee4523a3ee4cddd0a9c0926e228099 Author: uncleGen Date: 2015-03-12T11:52:33Z minor fix commit 02bf9a936f798b7eb40dcd1183ea14b5c2125deb Author: uncleGen Date: 2015-03-12T12:17:46Z minor fix commit 2bac6564a0a1153ec1366bfa4da085a6e89d07d5 Author: uncleGen Date: 2015-03-12T13:49:21Z minor fix commit 2544f8e5ad0e205e4043bd47646f57d5ed1f42a8 Author: uncleGen Date: 2015-03-17T13:52:36Z roll back to original approach commit b5237d86ddfcfd0b1b9b5bcd6952902ee1d843b6 Author: uncleGen Date: 2015-03-17T14:03:10Z resolve merge conflict --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-82363317 close and resolve the merge conflict. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/4135 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78736965 @JoshRosen OK, I will roll back to the original approach and do some impromvements :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78550324 I chatted with @tdas about this and I think we've arrived at a better understanding of this problem: - The `onTaskCompletion` approach that I suggested will not work because that callback will only be invoked after the task has finished running, but the task will not be able to finish until we stop the receiver. - We could modify the receiver to respond to `TaskContext.isInterrupted()`, but the current implementation here isn't ideal since it involves busy-looping. It sounds like your original approach of adding a new callback on interruption is the right fix, since that callback can just call `Receiver.stop()` and we can continue to use the countdown latch to avoid polling / busy-waiting. However, I think that the original name for the callback was misleading / confusing, which led me onto this derail: the name `addTaskKilledListener` is confusing because it sounds like it would be subsumed by `onCompleteCallback`. Besides this naming confusion, though, I think the original approach is actually fine. Therefore, my suggestion is to keep the unit tests that you've added (which look good to me) and revert back to the `onKilledCallback` approach, but change the name to be `addTaskInterruptionListener`. Because the existing methods / naming was confusing, it might also be nice to add a state transition diagram somewhere which explains the order in which the callbacks are invoked (or some much more explicit comments). Sorry for the derail and thanks for being patient with me! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26325661 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceiverSupervisor.scala --- @@ -161,22 +163,45 @@ private[streaming] abstract class ReceiverSupervisor( } } - /** Check if receiver has been marked for stopping */ + /** Check if receiver has been marked for stopping. */ def isReceiverStarted() = { logDebug("state = " + receiverState) receiverState == Started } - /** Check if receiver has been marked for stopping */ + /** Check if receiver has been marked for stopping. */ def isReceiverStopped() = { logDebug("state = " + receiverState) receiverState == Stopped } + /** Due to SPARK-5205, we need `TaskContext` to check if the task is interrupted. */ + def setTaskContext(context: TaskContext) = { --- End diff -- Instead of adding a setter method, why not pass the TaskContext into the ReceiverSupervistor's constructor? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78500578 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28514/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78500558 [Test build #28514 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28514/consoleFull) for PR 4135 at commit [`2bac656`](https://github.com/apache/spark/commit/2bac6564a0a1153ec1366bfa4da085a6e89d07d5). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78482924 [Test build #28514 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28514/consoleFull) for PR 4135 at commit [`2bac656`](https://github.com/apache/spark/commit/2bac6564a0a1153ec1366bfa4da085a6e89d07d5). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78477226 [Test build #28513 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28513/consoleFull) for PR 4135 at commit [`02bf9a9`](https://github.com/apache/spark/commit/02bf9a936f798b7eb40dcd1183ea14b5c2125deb). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78477235 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28513/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78474524 [Test build #28512 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28512/consoleFull) for PR 4135 at commit [`fb9716d`](https://github.com/apache/spark/commit/fb9716d3feee4523a3ee4cddd0a9c0926e228099). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78474531 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28512/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78469849 [Test build #28513 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28513/consoleFull) for PR 4135 at commit [`02bf9a9`](https://github.com/apache/spark/commit/02bf9a936f798b7eb40dcd1183ea14b5c2125deb). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
GitHub user uncleGen reopened a pull request: https://github.com/apache/spark/pull/4135 [SPARK-5205][Streaming]:Inconsistent behaviour between Streaming job and others, when click kill link in WebUI The "kill" link is used to kill a stage in job. It works in any kinds of Spark job but Spark Streaming. To be specific, we can only kill the stage which is used to run "Receiver", but not kill the "Receivers". Well, the stage can be killed and cleaned from the ui, but the receivers are still alive and receiving data. I think it dose not fit with the common sense. IMHO, killing the "receiver" stage means kill the "receivers" and stopping receiving data. You can merge this pull request into a Git repository by running: $ git pull https://github.com/uncleGen/spark master-clean-150121 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4135.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4135 commit c90a288ca6fde01008ff3ab5d04970c8f120c4b1 Author: uncleGen Date: 2015-01-21T08:58:10Z BUG FIX: Inconsistent behaviour between Streaming job and others, when click kill link in WebUI commit fc3be2c0d3cd3e8f2f15a4a4f0fa504b28164d38 Author: uncleGen Date: 2015-01-21T09:16:12Z fix commit d85dede41dce4a32748037871241da1620ada564 Author: uncleGen Date: 2015-01-21T09:23:59Z style fix commit f3162618d3eb1de3e6690e1403bf1943145d316a Author: uncleGen Date: 2015-02-03T02:36:52Z resolve merge conflicts commit 168115cb364285867bc9e7af400d48922712c71c Author: uncleGen Date: 2015-03-02T03:46:42Z resolve merge conflicts commit 047931c531b7d307de201ba74ed5dcb7da7c6559 Author: uncleGen Date: 2015-03-02T05:24:18Z resolve merge conflicts commit d8b57dfff760ce7a3b2413c3dcada3e1572fb7e8 Author: uncleGen Date: 2015-03-09T03:28:37Z Merge branch 'master-clean' into master-clean-150121 commit 4417ff06aeeea63b1a2f4e8a12d736a050abf057 Author: uncleGen Date: 2015-03-09T08:19:02Z update commit 963556d4b64c81e56913a2860f596c6b25b1286d Author: uncleGen Date: 2015-03-10T03:14:15Z add unit test commit f997698a0233d351941fd7ddd9577a659806281e Author: uncleGen Date: 2015-03-10T03:42:03Z fix unit test commit 705118453ab6f9869ac090ca2bac009f167869cd Author: uncleGen Date: 2015-03-10T08:57:21Z minor fix commit 92fb864e384e298eead53f10844365ea1887a929 Author: uncleGen Date: 2015-03-12T11:41:33Z roll back commit 98166e7e8b43d1a912b2b6a468eb6d8f1f0297f2 Author: uncleGen Date: 2015-03-12T11:44:41Z minor fix commit baa175898df8d36b36ec9b9494ebbd117acfae87 Author: uncleGen Date: 2015-03-12T11:47:23Z minor fix commit fe3e5d52bf8680a021b00dd598cdc5a3f1c7df7e Author: uncleGen Date: 2015-03-12T11:51:11Z minor fix commit fb9716d3feee4523a3ee4cddd0a9c0926e228099 Author: uncleGen Date: 2015-03-12T11:52:33Z minor fix commit 02bf9a936f798b7eb40dcd1183ea14b5c2125deb Author: uncleGen Date: 2015-03-12T12:17:46Z minor fix --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen closed the pull request at: https://github.com/apache/spark/pull/4135 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78468550 @JoshRosen, thanks for your patience. It occurred to me that we may check when to terminate the `receiver` in `ReceiverSupervisor`. Then the condition to stop the `receiver` is satisfied when `TaskContext.isInterrupted()` or `isReceiverStopped`. So I removed the `stopLatch` lock and just `checkTermination()` once per 500ms. Well, a little ugly and any idea? Now, we may just mark the task for interruption and trigger callback processing inside of markTaskCompleted. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78466921 [Test build #28511 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28511/consoleFull) for PR 4135 at commit [`fe3e5d5`](https://github.com/apache/spark/commit/fe3e5d52bf8680a021b00dd598cdc5a3f1c7df7e). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78466925 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28511/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78466744 [Test build #28512 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28512/consoleFull) for PR 4135 at commit [`fb9716d`](https://github.com/apache/spark/commit/fb9716d3feee4523a3ee4cddd0a9c0926e228099). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78466160 [Test build #28511 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28511/consoleFull) for PR 4135 at commit [`fe3e5d5`](https://github.com/apache/spark/commit/fe3e5d52bf8680a021b00dd598cdc5a3f1c7df7e). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78465782 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28510/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78465778 [Test build #28510 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28510/consoleFull) for PR 4135 at commit [`92fb864`](https://github.com/apache/spark/commit/92fb864e384e298eead53f10844365ea1887a929). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78465032 [Test build #28510 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28510/consoleFull) for PR 4135 at commit [`92fb864`](https://github.com/apache/spark/commit/92fb864e384e298eead53f10844365ea1887a929). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26237008 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala --- @@ -129,6 +130,41 @@ class ReceiverSuite extends TestSuiteBase with Timeouts with Serializable { } } + test("kill receiver") { +val sparkConf = new SparkConf() + .setMaster("local[4]") + .setAppName(framework) +val batchDuration = Milliseconds(1000) +withStreamingContext(new StreamingContext(sparkConf, batchDuration)) { ssc => + val receiver = new FakeReceiver + val executor = new FakeReceiverSupervisor(receiver) + val task = new FakeReceiverTask(0, executor) + + new Thread("Receiver Task") { +override def run(): Unit = { + task.run(0, 0) +} + }.start() + Thread.sleep(2000) + + // Verify that receiver was started + assert(receiver.onStartCalled) + assert(executor.isReceiverStarted) + assert(receiver.isStarted) + assert(!receiver.isStopped()) + assert(receiver.otherThread.isAlive) + + task.kill(false) + --- End diff -- There's some asynchrony in task killing, so these checks that the receiver was stopped need to be wrapped in an `eventually` block or some use some other mechanism to avoid race conditions. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26236001 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -66,8 +66,21 @@ private[spark] class TaskContextImpl( /** Marks the task as completed and triggers the listeners. */ private[spark] def markTaskCompleted(): Unit = { completed = true +processCallBacks() + } + + /** Marks the task for interruption, i.e. cancellation. */ + private[spark] def markInterrupted(): Unit = { +interrupted = true +processCallBacks() --- End diff -- I noticed that the old code did not process callbacks on interruption. What is the API contract here for markTaskCompleted and markInterrupted()? Are they mutually-exclusive, e.g. a task can be marked as completed or interrupted, but never both? I think that we should add assertions to determine if this is the case, since I'm worried that this change might lead to duplicate calling of callbacks if this assumption isn't true. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26237173 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala --- @@ -129,6 +130,41 @@ class ReceiverSuite extends TestSuiteBase with Timeouts with Serializable { } } + test("kill receiver") { +val sparkConf = new SparkConf() + .setMaster("local[4]") + .setAppName(framework) +val batchDuration = Milliseconds(1000) +withStreamingContext(new StreamingContext(sparkConf, batchDuration)) { ssc => + val receiver = new FakeReceiver + val executor = new FakeReceiverSupervisor(receiver) + val task = new FakeReceiverTask(0, executor) + + new Thread("Receiver Task") { +override def run(): Unit = { + task.run(0, 0) +} + }.start() + Thread.sleep(2000) + + // Verify that receiver was started + assert(receiver.onStartCalled) + assert(executor.isReceiverStarted) + assert(receiver.isStarted) + assert(!receiver.isStopped()) + assert(receiver.otherThread.isAlive) + + task.kill(false) + --- End diff -- In fact, there's an example of using `eventually` in this file; see line 112. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26237198 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala --- @@ -296,6 +332,27 @@ class ReceiverSuite extends TestSuiteBase with Timeouts with Serializable { } /** + * An implementation of Task running a NetworkReceiver used for testing whether + * we can start/stop receiver properly. + */ + class FakeReceiverTask( +stageId: Int, +executor: FakeReceiverSupervisor, +prefLocs: Seq[TaskLocation] = Nil) extends Task[Unit](stageId, 0) { +override def runTask(context: TaskContext): Unit = { + context.addTaskCompletionListener { context => +if(context.isInterrupted()) { --- End diff -- Style nit: space after `if`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26236828 --- Diff: streaming/src/test/scala/org/apache/spark/streaming/ReceiverSuite.scala --- @@ -129,6 +130,41 @@ class ReceiverSuite extends TestSuiteBase with Timeouts with Serializable { } } + test("kill receiver") { +val sparkConf = new SparkConf() + .setMaster("local[4]") + .setAppName(framework) +val batchDuration = Milliseconds(1000) +withStreamingContext(new StreamingContext(sparkConf, batchDuration)) { ssc => + val receiver = new FakeReceiver + val executor = new FakeReceiverSupervisor(receiver) + val task = new FakeReceiverTask(0, executor) + + new Thread("Receiver Task") { +override def run(): Unit = { + task.run(0, 0) +} + }.start() + Thread.sleep(2000) --- End diff -- If we can help it, we shouldn't be writing any new streaming tests that rely on Thread.sleep, since this has been a major cause of test flakiness. See the use of ManualClock in the other streaming tests for examples of how to write tests which aren't dependent on wall-clock time. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26236688 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala --- @@ -295,6 +297,14 @@ class ReceiverTracker(ssc: StreamingContext, skipReceiverLaunch: Boolean = false val receiver = iterator.next() val supervisor = new ReceiverSupervisorImpl( receiver, SparkEnv.get, serializableHadoopConf.value, checkpointDirOption) + +// Due to SPARK-5205, `Receiver` stage can not be stopped properly, we need to add a +// callback to do some more clean works. +context.addTaskCompletionListener { context => + if(context.isInterrupted()) { --- End diff -- Code style: space after `if`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26236670 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala --- @@ -23,7 +23,9 @@ import scala.language.existentials import akka.actor._ -import org.apache.spark.{Logging, SerializableWritable, SparkEnv, SparkException} +import scala.Some --- End diff -- This import should be grouped with the other Scala imports on line 22. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/4135#discussion_r26236576 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -66,8 +66,21 @@ private[spark] class TaskContextImpl( /** Marks the task as completed and triggers the listeners. */ private[spark] def markTaskCompleted(): Unit = { completed = true +processCallBacks() + } + + /** Marks the task for interruption, i.e. cancellation. */ + private[spark] def markInterrupted(): Unit = { +interrupted = true +processCallBacks() --- End diff -- Actually, tale a look at the definition of `Task.run`: ```scala final def run(attemptId: Long): T = { context = new TaskContextImpl(stageId, partitionId, attemptId, false) TaskContextHelper.setTaskContext(context) context.taskMetrics.hostname = Utils.localHostName() taskThread = Thread.currentThread() if (_killed) { kill(interruptThread = false) } try { runTask(context) } finally { context.markTaskCompleted() TaskContextHelper.unset() } } ``` If `_killed` is true, then we'll call `kill()` which marks the task context as interrupted, and then still call `runTask`, so the `finally` block will mark the task as completed after it's already been interrupted. Therefore, I don't think that triggering call backprocessing inside of `markInterrupted` is the right approach, especially since `Task.kill()` is asynchronous, so the task could still be running at this point. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78209621 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28459/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78209607 **[Test build #28459 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28459/consoleFull)** for PR 4135 at commit [`7051184`](https://github.com/apache/spark/commit/705118453ab6f9869ac090ca2bac009f167869cd) after a configured wait of `120m`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78201355 [Test build #28459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28459/consoleFull) for PR 4135 at commit [`7051184`](https://github.com/apache/spark/commit/705118453ab6f9869ac090ca2bac009f167869cd). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78201163 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78017037 [Test build #28426 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28426/consoleFull) for PR 4135 at commit [`7051184`](https://github.com/apache/spark/commit/705118453ab6f9869ac090ca2bac009f167869cd). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78017042 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28426/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78016846 [Test build #28426 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28426/consoleFull) for PR 4135 at commit [`7051184`](https://github.com/apache/spark/commit/705118453ab6f9869ac090ca2bac009f167869cd). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78015735 Some timeout errors happened. Could you please take a time to check and review it again? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78010542 **[Test build #28423 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28423/consoleFull)** for PR 4135 at commit [`f997698`](https://github.com/apache/spark/commit/f997698a0233d351941fd7ddd9577a659806281e) after a configured wait of `120m`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78010548 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28423/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78000862 [Test build #28423 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28423/consoleFull) for PR 4135 at commit [`f997698`](https://github.com/apache/spark/commit/f997698a0233d351941fd7ddd9577a659806281e). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-78000658 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-77999456 **[Test build #28422 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28422/consoleFull)** for PR 4135 at commit [`f997698`](https://github.com/apache/spark/commit/f997698a0233d351941fd7ddd9577a659806281e) after a configured wait of `120m`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-77999462 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28422/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4135#issuecomment-77992381 [Test build #28421 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/28421/consoleFull) for PR 4135 at commit [`963556d`](https://github.com/apache/spark/commit/963556d4b64c81e56913a2860f596c6b25b1286d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org