[jira] [Commented] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262131#comment-14262131 ] Ilayaperumal Gopinathan commented on SPARK-2892:

[~joshrosen] This indeed fixes the issue with stopping the receiver. Thanks!

P.S.: I deleted my prior comment from the JIRA, as the error I noticed was due to a setup issue in my cluster environment.

> Socket Receiver does not stop when streaming context is stopped
> ---
>
> Key: SPARK-2892
> URL: https://issues.apache.org/jira/browse/SPARK-2892
> Project: Spark
> Issue Type: Bug
> Components: Streaming
> Affects Versions: 1.0.2
> Reporter: Tathagata Das
> Assignee: Tathagata Das
> Priority: Critical
>
> Running NetworkWordCount with
> {quote}
> ssc.start(); Thread.sleep(1); ssc.stop(stopSparkContext = false); Thread.sleep(6)
> {quote}
> gives the following error
> {quote}
> 14/08/06 18:37:13 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 10047 ms on localhost (1/1)
> 14/08/06 18:37:13 INFO DAGScheduler: Stage 0 (runJob at ReceiverTracker.scala:275) finished in 10.056 s
> 14/08/06 18:37:13 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
> 14/08/06 18:37:13 INFO SparkContext: Job finished: runJob at ReceiverTracker.scala:275, took 10.179263 s
> 14/08/06 18:37:13 INFO ReceiverTracker: All of the receivers have been terminated
> 14/08/06 18:37:13 WARN ReceiverTracker: All of the receivers have not deregistered, Map(0 -> ReceiverInfo(0,SocketReceiver-0,null,false,localhost,Stopped by driver,))
> 14/08/06 18:37:13 INFO ReceiverTracker: ReceiverTracker stopped
> 14/08/06 18:37:13 INFO JobGenerator: Stopping JobGenerator immediately
> 14/08/06 18:37:13 INFO RecurringTimer: Stopped timer for JobGenerator after time 1407375433000
> 14/08/06 18:37:13 INFO JobGenerator: Stopped JobGenerator
> 14/08/06 18:37:13 INFO JobScheduler: Stopped JobScheduler
> 14/08/06 18:37:13 INFO StreamingContext: StreamingContext stopped successfully
> 14/08/06 18:37:43 INFO SocketReceiver: Stopped receiving
> 14/08/06 18:37:43 INFO SocketReceiver: Closed socket to localhost:
> {quote}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
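The 30-second gap between "StreamingContext stopped successfully" (18:37:13) and "SocketReceiver: Stopped receiving" (18:37:43) in the log above is consistent with a receiver loop that only notices a stop flag between blocking reads, so the driver-side stop returns long before the receive loop actually exits. A minimal stdlib-only Scala sketch of that shape (plain threads and a queue, no Spark; `PollingReceiver` and all names here are illustrative, not the actual SocketReceiver code):

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingQueue, TimeUnit}

// Illustrative model: a receiver that blocks on input and only checks
// its stop flag between reads, so stop() returns before the loop exits.
class PollingReceiver {
  private val queue = new LinkedBlockingQueue[String]()
  @volatile private var stopped = false
  @volatile var loopExited = false
  private val exited = new CountDownLatch(1)

  private val thread = new Thread(() => {
    while (!stopped) {
      // Blocks for up to 100 ms, like a socket read with a timeout;
      // the stop flag is only re-checked after each poll returns.
      queue.poll(100, TimeUnit.MILLISECONDS)
    }
    loopExited = true
    exited.countDown()
  })

  def start(): Unit = thread.start()

  // Driver-side view: set the flag and return immediately.
  def stop(): Unit = { stopped = true }

  // Wait for the receive loop itself to finish.
  def awaitExit(ms: Long): Boolean = exited.await(ms, TimeUnit.MILLISECONDS)
}

val r = new PollingReceiver
r.start()
r.stop()                              // returns right away
val exitedInTime = r.awaitExit(2000)  // loop notices the flag at its next poll
```

Under this model, the "stopped successfully" message corresponds to `stop()` returning, while "Stopped receiving" corresponds to the loop finally observing the flag, which explains why the two log lines can be far apart.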
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262078#comment-14262078 ] Ilayaperumal Gopinathan commented on SPARK-2892:

[~joshrosen] Yeah, serialization could be the real issue here. But after trying the suggested change in my environment (where I call streamingContext stop, but the active streaming job is only cancelled upon SparkContext shutdown), I see the following:

ERROR akka.actor.default-dispatcher-19 scheduler.TaskSchedulerImpl - Lost executor 0 on 192.168.2.6: remote Akka client disassociated
WARN akka.actor.default-dispatcher-20 remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://sparkExecutor@192.168.2.6:54259] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
WARN akka.actor.default-dispatcher-19 scheduler.TaskSetManager - Lost task 0.0 in stage 8.0 (TID 76, 192.168.2.6): ExecutorLostFailure (executor 0 lost)
ERROR akka.actor.default-dispatcher-19 cluster.SparkDeploySchedulerBackend - Asked to remove non-existent executor 0
ERROR akka.actor.default-dispatcher-19 cluster.SparkDeploySchedulerBackend - Asked to remove non-existent executor 0
WARN main-EventThread scheduler.ReceiverTracker - All of the receivers have not deregistered, Map(0 -> ReceiverInfo(0,MessageBusReceiver-0,Actor[akka.tcp://sparkExecutor@192.168.2.6:54279/user/Receiver-0-1420019455242#1553271505],true,192.168.2.6,,))
Exception in thread "Thread-42" org.apache.spark.SparkException: Job cancelled because SparkContext was shut down
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:702)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:701)
  at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
  at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:701)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.postStop(DAGScheduler.scala:1428)
  at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundPostStop(DAGScheduler.scala:1375)
  at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
  at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
  at akka.actor.ActorCell.terminate(ActorCell.scala:369)
  at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
  at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
  at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
  at akka.dispatch.Mailbox.run(Mailbox.scala:219)
  at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
  at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
  at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Let me know what you think.
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262050#comment-14262050 ] Josh Rosen commented on SPARK-2892:

I think that I've found the reason that this works in local mode and not on a real cluster; see SPARK-5035 / https://github.com/apache/spark/pull/3857
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261788#comment-14261788 ] Tathagata Das commented on SPARK-2892:

[~ilayaperumalg] Ping, any thoughts?
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258553#comment-14258553 ] Tathagata Das commented on SPARK-2892:

[~ilayaperumalg] Now that SPARK-4802 has been solved, could you check whether this issue has been resolved?
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241951#comment-14241951 ] Ilayaperumal Gopinathan commented on SPARK-2892:

To add more info: when the ReceiverTracker sends the "StopReceiver" message to the receiver actor at the executor, its ReceiverLauncher thread always times out, and I notice the corresponding job is cancelled only because the DAGScheduler is stopped. The driver throws the exception [1], while on the executor side the worker node logs the info [2].

Exception [1]:

INFO sparkDriver-akka.actor.default-dispatcher-14 cluster.SparkDeploySchedulerBackend - Asking each executor to shut down
15:06:53,783 1.1.0.SNAP INFO Thread-40 scheduler.DAGScheduler - Job 1 failed: start at SparkDriver.java:109, took 72.739141 s
Exception in thread "Thread-40" org.apache.spark.SparkException: Job cancelled because SparkContext was shut down
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:702)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:701)
  at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
  at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:701)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.postStop(DAGScheduler.scala:1428)
  at akka.actor.Actor$class.aroundPostStop(Actor.scala:475)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundPostStop(DAGScheduler.scala:1375)
  at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
  at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
  at akka.actor.ActorCell.terminate(ActorCell.scala:369)
  at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:462)
  at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)
  at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263)
  at akka.dispatch.Mailbox.run(Mailbox.scala:219)
  at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
  at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
  at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

info [2]:

INFO LocalActorRef: Message [akka.remote.transport.AssociationHandle$Disassociated] from Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40127.0.0.1%3A52219-2#1619424601] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/12/10 15:06:53 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@localhost:51262] <- [akka.tcp://sparkExecutor@localhost:52217]: Error [Shut down address: akka.tcp://sparkExecutor@localhost:52217] [
akka.remote.ShutDownAssociation: Shut down address: akka.tcp://sparkExecutor@localhost:52217
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.

Please note that I am running everything on localhost, and the same thing works fine in "local" mode; the issue only arises in "cluster" mode. I tried changing the hostname to 127.0.0.1 but saw the same behavior. Any clues on what might be going on here would help a lot. Thanks!
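The "ReceiverLauncher thread always times out" behavior described above matches the generic shape of an ask that never receives a reply: the driver waits on a future for the StopReceiver acknowledgement, and if the executor-side actor is unreachable the wait ends in a TimeoutException while the executor tears down independently. A stdlib-only Scala sketch of that pattern (no Akka or Spark involved; `stopAck` is an illustrative name, not a real Spark API):

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._

// The driver sends StopReceiver and waits on a promise that the
// executor side is supposed to complete with an acknowledgement.
val stopAck = Promise[String]()

// Here the ack never arrives (as when the message cannot reach the
// executor-side actor), so the wait times out instead of completing.
val timedOut =
  try {
    Await.result(stopAck.future, 200.millis)
    false
  } catch {
    case _: TimeoutException => true
  }
```

Once the wait times out, the only thing left that terminates the receiver job is the SparkContext shutdown itself, which is consistent with "Job cancelled because SparkContext was shut down" in the trace above.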
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241362#comment-14241362 ] Mark Fisher commented on SPARK-2892:

[~srowen] SPARK-4802 is only related to the receiverInfo not being removed. This issue is actually much more critical, given that Receivers do not seem to stop other than in local mode. Please reopen.
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240437#comment-14240437 ] Ilayaperumal Gopinathan commented on SPARK-2892:

It looks like this issue and the one mentioned in SPARK-4802 (ReceiverInfo removal at the ReceiverTracker upon deregistering a receiver) are related. I believe the following warning message is the result of the receiverInfo not being removed at the ReceiverTracker by the ReceiverTrackerActor when the corresponding receiver is deregistered:

"WARN ReceiverTracker: All of the receivers have not deregistered, Map(0 -> ReceiverInfo(0,SocketReceiver-0,null,false,localhost,Stopped by driver,))"

From what I can see so far, closing the streaming context stops the receiver only in "local" mode. In "cluster" mode, using the Spark standalone cluster, I noticed that when the ReceiverTracker at the driver sends the "StopReceiver" message as a result of the streaming context close, it could not reach the ReceiverSupervisorImpl's actor running at the executor node. At the same time, the ReceiverSupervisorImpl at the executor could still send messages such as RegisterReceiver and AddBlock back to the ReceiverTrackerActor at the driver. It would be great if someone could explain what might be going on between the ReceiverTracker and the ReceiverSupervisorImpl actor at the executor when the stop signal is sent in the distributed case. Thanks!
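The WARN message quoted in this comment is the ReceiverTracker observing leftover entries in its receiverInfo map at stop time: if the DeregisterReceiver message never arrives, the entry is never removed and the warning fires. A minimal stdlib Scala sketch of that bookkeeping (illustrative names and simplified types, not the actual ReceiverTracker code):

```scala
import scala.collection.mutable

// Tracker-side view: register on receiver start, remove on deregister,
// and warn at stop if any entries remain in the map.
val receiverInfo = mutable.Map.empty[Int, String]

def register(id: Int, name: String): Unit = receiverInfo(id) = name
def deregister(id: Int): Unit = receiverInfo.remove(id)

// Returns the warning text, or None if everything deregistered cleanly.
def stopTracker(): Option[String] =
  if (receiverInfo.nonEmpty)
    Some(s"All of the receivers have not deregistered, $receiverInfo")
  else None

register(0, "SocketReceiver-0")
// If the DeregisterReceiver message never reaches the tracker,
// the entry is still present when the tracker stops:
val warning = stopTracker()
```

Under this model, fixing either side (delivering DeregisterReceiver reliably, or removing the entry on deregistration as SPARK-4802 does) makes `stopTracker()` return None, which matches the observation that the two issues are related.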
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224818#comment-14224818 ] Gino Bustelo commented on SPARK-2892:

Update?
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139061#comment-14139061 ] Gino Bustelo commented on SPARK-2892:

Any update on this? Will it get fixed for 1.0.3 or 1.0.1?
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123253#comment-14123253 ] Tathagata Das commented on SPARK-2892:

I intended it to be ERROR to catch such issues where receivers don't stop properly. If a program is supposed to shut down after stopping the streaming context, then this is probably not much of a problem, as everything in Spark is torn down anyway. But if the SparkContext is going to be reused, then this is indeed a problem.
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121561#comment-14121561 ] Helena Edelson commented on SPARK-2892:

I wonder if the ERROR should be a WARN or INFO, since it occurs as a result of ReceiverSupervisorImpl receiving a StopReceiver, and "Deregistered receiver for stream" seems like the expected behavior.
[jira] [Commented] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121324#comment-14121324 ] Helena Edelson commented on SPARK-2892:
---
I see the same with 1.0.2 streaming:
ERROR 08:26:21,139 Deregistered receiver for stream 0: Stopped by driver
WARN 08:26:21,211 Stopped executor without error
WARN 08:26:21,213 All of the receivers have not deregistered, Map(0 -> ReceiverInfo(0,ActorReceiver-0,null,false,host,Stopped by driver,))
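The timestamps in the quoted log hint at the underlying race: the driver reports "StreamingContext stopped successfully" at 18:37:13, but the SocketReceiver only logs "Stopped receiving" thirty seconds later, at 18:37:43 — the receiver cannot observe the stop signal while it is inside a blocking call. A minimal plain-Scala sketch of that pattern (no Spark dependency; `StopRaceSketch` and its loop are illustrative, not Spark's actual receiver code):

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Sketch: a receiver loop that checks its stop flag only *between*
// blocking calls cannot exit until the current blocking call returns.
// In Spark's case the blocking call is a socket read, which is why the
// receiver lingers long after the driver considers itself stopped.
object StopRaceSketch {
  def main(args: Array[String]): Unit = {
    val stopped = new AtomicBoolean(false)
    val receiver = new Thread(() => {
      while (!stopped.get()) {
        Thread.sleep(100) // stands in for a blocking socket read
      }
      println("receiver: stopped")
    })
    receiver.start()
    stopped.set(true)   // driver-side stop request
    receiver.join(5000) // receiver exits only after the blocking call returns
    assert(!receiver.isAlive, "receiver thread should have terminated")
  }
}
```

Here the "blocking" sleep is bounded, so shutdown lags by at most 100 ms; with an unbounded socket read, the lag lasts until the socket unblocks or is closed out from under the reader — which is why the fix needs to actively interrupt the receiver rather than wait for it to poll a flag.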