[ https://issues.apache.org/jira/browse/SPARK-10638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mamdouh Alramadan updated SPARK-10638:
--------------------------------------
Description:

With Spark 1.4 on a Mesos cluster, I am trying to stop the context with a graceful shutdown. I have seen the mailing-list thread that [~tdas] addressed, http://mail-archives.apache.org/mod_mbox/incubator-spark-commits/201505.mbox/%3c176cb228a2704ab996839fb97fa90...@git.apache.org%3E , which introduces a new config that was not documented. However, even with it set, the streaming job stops correctly but the process never dies afterwards, i.e. the SparkContext is still running. The Mesos UI still shows the framework, which is still holding all the cores it was allocated.

The code used for the shutdown hook is:

{code:title=Start.scala|borderStyle=solid}
sys.ShutdownHookThread {
  logInfo("Received SIGTERM, calling streaming stop")
  streamingContext.stop(stopSparkContext = true, stopGracefully = true)
  logInfo("Application Stopped")
}
{code}

The logs for this process are:

{code:title=SparkLogs|borderStyle=solid}
15/09/16 16:37:51 INFO Start: Received SIGTERM, calling streaming stop
15/09/16 16:37:51 INFO JobGenerator: Stopping JobGenerator gracefully
15/09/16 16:37:51 INFO JobGenerator: Waiting for all received blocks to be consumed for job generation
15/09/16 16:37:51 INFO JobGenerator: Waited for all received blocks to be consumed for job generation
15/09/16 16:37:51 INFO StreamingContext: Invoking stop(stopGracefully=true) from shutdown hook
15/09/16 16:38:00 INFO RecurringTimer: Stopped timer for JobGenerator after time 1442421480000
15/09/16 16:38:00 INFO JobScheduler: Starting job streaming job 1442421480000 ms.0 from job set of time 1442421480000 ms
15/09/16 16:38:00 INFO JobGenerator: Stopped generation timer
15/09/16 16:38:00 INFO JobGenerator: Waiting for jobs to be processed and checkpoints to be written
15/09/16 16:38:00 INFO JobScheduler: Added jobs for time 1442421480000 ms
15/09/16 16:38:00 INFO JobGenerator: Checkpointing graph for time 1442421480000 ms
15/09/16 16:38:00 INFO DStreamGraph: Updating checkpoint data for time 1442421480000 ms
15/09/16 16:38:00 INFO DStreamGraph: Updated checkpoint data for time 1442421480000 ms
15/09/16 16:38:00 INFO SparkContext: Starting job: foreachRDD at StreamDigest.scala:21
15/09/16 16:38:00 INFO DAGScheduler: Got job 12 (foreachRDD at StreamDigest.scala:21) with 1 output partitions (allowLocal=true)
15/09/16 16:38:00 INFO DAGScheduler: Final stage: ResultStage 12(foreachRDD at StreamDigest.scala:21)
15/09/16 16:38:00 INFO DAGScheduler: Parents of final stage: List()
15/09/16 16:38:00 INFO CheckpointWriter: Saving checkpoint for time 1442421480000 ms to file 'hdfs://EMRURL/sparkStreaming/checkpoint/checkpoint-1442421480000'
15/09/16 16:38:00 INFO DAGScheduler: Missing parents: List()
.
.
.
15/09/16 16:38:00 INFO JobGenerator: Waited for jobs to be processed and checkpoints to be written
15/09/16 16:38:00 INFO CheckpointWriter: CheckpointWriter executor terminated ? true, waited for 1 ms.
15/09/16 16:38:00 INFO JobGenerator: Stopped JobGenerator
15/09/16 16:38:00 INFO JobScheduler: Stopped JobScheduler
{code}

And in my spark-defaults.conf I included:

{code:title=spark-defaults.conf|borderStyle=solid}
spark.streaming.stopGracefullyOnShutdown true
{code}

was: With spark 1.4 on Mesos cluster, I am trying to stop the context with graceful shutdown, I have seen this mailing list that [~tdas] addressed http://mail-archives.apache.org/mod_mbox/incubator-spark-commits/201505.mbox/%3c176cb228a2704ab996839fb97fa90...@git.apache.org%3E which introduces a new config that was not documented, however, even with including it, the streaming job still stops correctly but the process doesn't die after all e.g. the Spark Context still running.
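For what it's worth, one pattern that sidesteps this class of problem is to never call stop() from inside a JVM shutdown hook at all (Spark 1.4 registers its own shutdown hook, and the two can interfere), and instead trigger the graceful stop from the driver's main thread when an external signal appears. A minimal sketch, assuming a marker file on HDFS; the path and check interval below are illustrative, not part of the original report:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.streaming.StreamingContext

// Instead of blocking forever in awaitTermination(), poll for an external
// stop signal (here: a hypothetical marker file on HDFS) from an ordinary
// driver thread. When the marker appears, stop() runs outside any JVM
// shutdown hook, so it cannot race with Spark's own shutdown hook.
def awaitStopSignal(ssc: StreamingContext,
                    markerPath: String,          // e.g. "/sparkStreaming/stop-marker" (illustrative)
                    checkIntervalMs: Long = 10000L): Unit = {
  val fs = FileSystem.get(ssc.sparkContext.hadoopConfiguration)
  var stopRequested = false
  while (!stopRequested) {
    // awaitTerminationOrTimeout returns true if the context already stopped
    stopRequested = ssc.awaitTerminationOrTimeout(checkIntervalMs) ||
      fs.exists(new Path(markerPath))
  }
  // Graceful stop: drain received blocks, then tear down the SparkContext
  // so the framework deregisters and Mesos releases the cores.
  ssc.stop(stopSparkContext = true, stopGracefully = true)
}
```

Shutdown is then requested from outside the driver, e.g. `hadoop fs -touchz /sparkStreaming/stop-marker`, rather than by sending SIGTERM to a process that has its own competing hook.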
> spark streaming stop gracefully keeps the spark context
> -------------------------------------------------------
>
>                 Key: SPARK-10638
>                 URL: https://issues.apache.org/jira/browse/SPARK-10638
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.4.0
>            Reporter: Mamdouh Alramadan
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)