[ https://issues.apache.org/jira/browse/SLIDER-479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gour Saha updated SLIDER-479:
-----------------------------
    Fix Version/s: Slider 2.0.0

> Provide a slider command to kill all stranded containers continuing to run post stop command
> --------------------------------------------------------------------------------------------
>
>                 Key: SLIDER-479
>                 URL: https://issues.apache.org/jira/browse/SLIDER-479
>             Project: Slider
>          Issue Type: Bug
>            Reporter: Gour Saha
>             Fix For: Slider 2.0.0
>
>
> A container can continue to run even after a slider stop command has been
> issued. One such scenario is when the NM of a non-Slider-AM node is lost and
> a slider stop command is issued before the Slider-AM can clean up the
> stranded agent (and the application processes). In such a scenario, even if
> the NM is brought back up, it will not kill these containers.
> In a large cluster with several applications deployed and managed by slider,
> there could easily be numerous such stranded containers.
> The Slider client could expose a "stop-all" command, or perhaps an option
> "stop --clean" (or anything appropriate for this task), to do the cleanup.
> It can bring up the Slider-AM in a clean mode (say), which will not start any
> application but will simply register with ZK and wait for agents to
> heart-beat into it. Each of these agents will receive the terminate command
> from the AM, perform the necessary cleanup, and shut down.
> This new command can be issued only after an application has been stopped.
> When invoked while the application is running, the command should fail and
> provide relevant information. The command can also provide a summary of how
> many stranded containers it cleaned up.
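
The proposed cleanup flow can be sketched as a small simulation. This is only an illustration of the behaviour described in the issue, not Slider code: the names CleanModeAM, stop_clean, ApplicationRunningError, the "STOPPED" state string, and the "TERMINATE" reply are all hypothetical, and the agent heartbeats are modelled as a plain list of container IDs.

```python
from dataclasses import dataclass, field


class ApplicationRunningError(Exception):
    """Raised when cleanup is requested while the application is still running."""


@dataclass
class CleanModeAM:
    """Hypothetical AM in 'clean mode': it starts no application and only
    answers agent heartbeats with a terminate command."""
    terminated: set = field(default_factory=set)

    def on_heartbeat(self, container_id: str) -> str:
        # The application is already stopped, so any agent that still
        # heartbeats in belongs to a stranded container.
        self.terminated.add(container_id)
        return "TERMINATE"


def stop_clean(app_state: str, heartbeats: list) -> int:
    """Simulate the cleanup command and return how many stranded
    containers were told to terminate."""
    if app_state != "STOPPED":
        # The command must fail with relevant information if the app is running.
        raise ApplicationRunningError(
            f"cleanup requires a stopped application, current state: {app_state}"
        )
    am = CleanModeAM()
    for container_id in heartbeats:
        am.on_heartbeat(container_id)
    # Repeated heartbeats from the same container are counted once.
    return len(am.terminated)
```

Calling stop_clean("STOPPED", ["c1", "c2", "c1"]) in this sketch reports two cleaned-up containers, while any other state raises ApplicationRunningError instead of touching a live application.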