[jira] [Comment Edited] (SPARK-12009) Avoid re-allocate yarn container while driver want to stop all Executors
[ https://issues.apache.org/jira/browse/SPARK-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051607#comment-16051607 ]

JiYeon OH edited comment on SPARK-12009 at 6/16/17 8:50 AM:

I'm having the same problem with Spark 2.1.0. I have several jobs running exactly the same code, and a few of them failed. In the jobs that finished successfully, this message appeared after the job finished:

17/06/15 00:26:02 INFO YarnAllocator: Driver requested a total number of 0 executor(s).

But in the jobs that failed, this message appeared instead:

17/06/16 14:31:14 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(10,WrappedArray())

I'm guessing the YarnAllocator must have requested some executors after the Spark job finished, but I can't find out why. Does anyone know why the YarnAllocator requests executors after the job has finished?

was (Author: ogcheeze):
I'm having the same problem with Spark 2.1.0 I have some jobs with exact same code and had a few jobs failed. In the jobs that finished successfully, there was this message after the job finished
17/06/15 00:26:02 INFO YarnAllocator: Driver requested a total number of 0 executor(s).
But in the jobs that filaed, there was this message instead
17/06/16 14:31:14 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(10,WrappedArray())
I'm guessing the YarnAllocator must have requested some executors after spark job was finished, but can't' find out why. and why is YarnAllocator requesting executors after job finished Does anyone know why??
> Avoid re-allocate yarn container while driver want to stop all Executors
>
> Key: SPARK-12009
> URL: https://issues.apache.org/jira/browse/SPARK-12009
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Affects Versions: 1.5.2
> Reporter: SuYan
> Assignee: SuYan
> Priority: Minor
> Fix For: 2.0.0
>
> Log based 1.4.0
> 2015-11-26,03:05:16,176 WARN org.spark-project.jetty.util.thread.QueuedThreadPool: 8 threads could not be stopped
> 2015-11-26,03:05:16,177 INFO org.apache.spark.ui.SparkUI: Stopped Spark web UI at http://
> 2015-11-26,03:05:16,401 INFO org.apache.spark.scheduler.DAGScheduler: Stopping DAGScheduler
> 2015-11-26,03:05:16,450 INFO org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend: Shutting down all executors
> 2015-11-26,03:05:16,525 INFO org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend: Asking each executor to shut down
> 2015-11-26,03:05:16,791 INFO org.apache.spark.deploy.yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. XX.XX.XX.XX:38734
> 2015-11-26,03:05:16,847 ERROR org.apache.spark.scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(164,WrappedArray())
> 2015-11-26,03:05:27,242 INFO org.apache.spark.deploy.yarn.YarnAllocator: Will request 13 executor containers, each with 1 cores and 4608 MB memory including 1024 MB overhead

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-12009) Avoid re-allocate yarn container while driver want to stop all Executors
[ https://issues.apache.org/jira/browse/SPARK-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902634#comment-15902634 ]

Swaranga Sarma edited comment on SPARK-12009 at 3/9/17 7:59 AM:

The JIRA says that the issue is fixed, but I still see this error in Spark 2.1.0.

{code}
try (JavaSparkContext sc = new JavaSparkContext(new SparkConf())) {
    // run the job
}
{code}

was (Author: swaranga):
The JIRA says that the issue is fixed but I still see this error in Spark 2.1.0
{code}
try (JavaSparkContext sc = new JavaSparkContext(new SparkConf()) {
//run the job
}
{code}
[jira] [Comment Edited] (SPARK-12009) Avoid re-allocate yarn container while driver want to stop all Executors
[ https://issues.apache.org/jira/browse/SPARK-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15031298#comment-15031298 ]

SuYan edited comment on SPARK-12009 at 11/30/15 3:59 AM:

I still think it is better to only stop requesting new containers, and to try to find a more general way.

was (Author: suyan):
I still think it is better to only stop to request new containers
[jira] [Comment Edited] (SPARK-12009) Avoid re-allocate yarn container while driver want to stop all Executors
[ https://issues.apache.org/jira/browse/SPARK-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029416#comment-15029416 ]

Saisai Shao edited comment on SPARK-12009 at 11/27/15 3:30 AM:

So which version of Spark are you actually running, 1.4.0 or 1.5.2?

was (Author: jerryshao):
So what actually version of Spark you're running? 1.4.0 or 1.5.2?
[jira] [Comment Edited] (SPARK-12009) Avoid re-allocate yarn container while driver want to stop all Executors
[ https://issues.apache.org/jira/browse/SPARK-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15029431#comment-15029431 ]

Saisai Shao edited comment on SPARK-12009 at 11/27/15 3:50 AM:

Alright, my code is from the master branch. Anyway, I understand your issue now. But I don't think the current fix is a solid one; it would be better to fix this in the AM. I think what you need to do is interrupt the {{reportThread}} in the AM in {{onDisconnected}}, so that YarnAllocator will not sync with the RM to request new containers. You could give it a try.

was (Author: jerryshao):
Alright, my code is master branch. Anyway I understood your issue now. But I don't think current fix is a solid fix, it would be better to fix in the AM. I think what you need to do is to interrupt the {{reportThread}} in AM in {{onDisconnected}}, so that YarnAllocator will not sync with RM to request new containers.
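The suggestion above (interrupt the reporting thread from the disconnect handler so that no further sync with the RM happens) can be sketched roughly as follows. This is only an illustration of the pattern, not Spark's ApplicationMaster code: the class `ReporterThreadSketch`, the methods `syncWithResourceManager` and `awaitFirstSync`, and the interval parameter are all hypothetical stand-ins.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an application master that owns a reporter thread.
// It is NOT Spark's ApplicationMaster; it only illustrates the interrupt pattern.
class ReporterThreadSketch {
    // Counts simulated sync rounds; in a real AM each round could request containers.
    final AtomicInteger syncRounds = new AtomicInteger();
    private final CountDownLatch firstSync = new CountDownLatch(1);
    private final Thread reporterThread;

    ReporterThreadSketch(long intervalMillis) {
        reporterThread = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                syncWithResourceManager();
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // Interrupted while sleeping: restore the flag so the loop exits.
                    Thread.currentThread().interrupt();
                }
            }
        }, "reporter");
        reporterThread.setDaemon(true);
    }

    void start() {
        reporterThread.start();
    }

    // Placeholder for the periodic allocate/heartbeat call to the resource manager.
    private void syncWithResourceManager() {
        syncRounds.incrementAndGet();
        firstSync.countDown();
    }

    // Blocks until the reporter thread has completed at least one sync round.
    void awaitFirstSync() {
        try {
            firstSync.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Disconnect handler: stop the reporter first, so no sync can run during shutdown.
    void onDisconnected() {
        reporterThread.interrupt();
        try {
            reporterThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean reporterAlive() {
        return reporterThread.isAlive();
    }
}
```

Because `onDisconnected` joins the thread after interrupting it, the sync counter cannot advance once the handler returns. Whether the real AM should block on the join inside its RPC handler is a separate design question that this sketch does not settle.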
[jira] [Comment Edited] (SPARK-12009) Avoid re-allocate yarn container while driver want to stop all Executors
[ https://issues.apache.org/jira/browse/SPARK-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028363#comment-15028363 ]

SuYan edited comment on SPARK-12009 at 11/26/15 8:42 AM:

The AM has not exited; it will exit when the driver completes its user code in userThread. The log line below shows that an executor was terminated.

2015-11-26,03:05:16,791 INFO org.apache.spark.deploy.yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. XX.XX.XX.XX:38734

was (Author: suyan):
AM is not exit, it will exit while driver execute its usercode in userThread. the below logs tell that the a executor is terminated.
2015-11-26,03:05:16,791 INFO org.apache.spark.deploy.yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. XX.XX.XX.XX:38734
[jira] [Comment Edited] (SPARK-12009) Avoid re-allocate yarn container while driver want to stop all Executors
[ https://issues.apache.org/jira/browse/SPARK-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028363#comment-15028363 ]

SuYan edited comment on SPARK-12009 at 11/26/15 8:42 AM:

The AM has not exited; it will exit when the driver completes its user code in userThread. The log line below shows that an executor was terminated.

2015-11-26,03:05:16,791 INFO org.apache.spark.deploy.yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. XX.XX.XX.XX:38734

was (Author: suyan):
AM is not exit, it will exit while driver complete its usercode in userThread. the below logs tell that the a executor is terminated.
2015-11-26,03:05:16,791 INFO org.apache.spark.deploy.yarn.ApplicationMaster$AMEndpoint: Driver terminated or disconnected! Shutting down. XX.XX.XX.XX:38734
[jira] [Comment Edited] (SPARK-12009) Avoid re-allocate yarn container while driver want to stop all Executors
[ https://issues.apache.org/jira/browse/SPARK-12009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15028325#comment-15028325 ]

Saisai Shao edited comment on SPARK-12009 at 11/26/15 7:58 AM:

An interesting thing is that the AM starts shutting down at {{2015-11-26,03:05:16}}, but YarnAllocator still requests 13 executors 11 seconds later. It looks like the AM does not exit quickly enough, which is why YarnAllocator is still requesting new containers. Normally, if the AM exits as soon as it receives the disconnected message, YarnAllocator has no time to request containers.

was (Author: jerryshao):
A interesting thing is that AM is shutting down at time {{2015-11-26,03:05:16}}, but YarnAllocator still request 13 executors after 11 seconds. Looks like AM is not exited so fast, that's why YarnAllocator is still requesting new containers. Normally if AM is exited as fast as it receive disconnected message, there will be not container requesting for YarnAllocator.
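The gap of roughly 11 seconds can be checked against the two timestamps in the log from the issue description: the "Driver terminated or disconnected" line at 2015-11-26,03:05:16,791 and the "Will request 13 executor containers" line at 2015-11-26,03:05:27,242. A small Java sketch of that arithmetic follows; the class and method names are made up for illustration, and the timestamp pattern is inferred from those log lines.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for measuring the gap between two log lines.
class LogGapSketch {
    // Timestamp layout as it appears in the quoted log, e.g. 2015-11-26,03:05:16,791
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("uuuu-MM-dd,HH:mm:ss,SSS");

    // Returns the number of milliseconds from the earlier to the later timestamp.
    static long gapMillis(String earlier, String later) {
        LocalDateTime start = LocalDateTime.parse(earlier, LOG_TIME);
        LocalDateTime end = LocalDateTime.parse(later, LOG_TIME);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        long gap = gapMillis("2015-11-26,03:05:16,791", "2015-11-26,03:05:27,242");
        System.out.println(gap + " ms"); // prints "10451 ms"
    }
}
```

The result is 10451 ms, about 10.5 seconds between the disconnect message and the new container request.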