[jira] [Updated] (MESOS-6933) Executor does not respect grace period
[ https://issues.apache.org/jira/browse/MESOS-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deshi Xiao updated MESOS-6933:
------------------------------
    Attachment: 屏幕快照 2017-07-17 下午2.19.03.png

Please check the screenshot. [~janisz]

> Executor does not respect grace period
> --------------------------------------
>
>                 Key: MESOS-6933
>                 URL: https://issues.apache.org/jira/browse/MESOS-6933
>             Project: Mesos
>          Issue Type: Bug
>          Components: executor
>            Reporter: Tomasz Janiszewski
>         Attachments: 屏幕快照 2017-07-17 下午2.19.03.png
>
>
> The Mesos Command Executor tries to support a grace period with escalation,
> but unfortunately it does not work. It launches {{command}} by wrapping it in
> {{sh -c}}, which causes the process tree to look like this:
> {code}
> Received killTask
> Shutting down
> Sending SIGTERM to process tree at pid 18
> Sent SIGTERM to the following process trees:
> [
> -+- 18 sh -c cd offer-i18n-0.1.24 && LD_PRELOAD=../librealresources.so ./bin/offer-i18n -e prod -p $PORT0
>  \--- 19 command...
> ]
> Command terminated with signal Terminated (pid: 18)
> {code}
> This causes {{sh}} to exit immediately, and the executor with it, while the
> wrapped {{command}} might need more time to finish. As a result, the executor
> thinks the command terminated gracefully, so it won't
> [escalate|https://github.com/apache/mesos/blob/1.1.0/src/launcher/executor.cpp#L695]
> to SIGKILL.
> This causes leaks when the POSIX containerizer is used, because if the command
> ignores SIGTERM it will be reparented to init and never get killed. Using
> pid/namespace only masks the problem, because the hanging process is captured
> before it can shut down gracefully.
> The fix is to send SIGTERM only to the children of {{sh}}. {{sh}} will exit
> when all child processes finish. If they do not, they will be killed by
> escalation to SIGKILL.
> All versions from 0.20 onward are affected.
> This test should pass:
> [src/tests/command_executor_tests.cpp:342|https://github.com/apache/mesos/blob/2c856178b59593ff8068ea8d6c6593943c33008c/src/tests/command_executor_tests.cpp#L342-L343]
> [Mailing list thread|https://lists.apache.org/thread.html/1025dca0cf4418aee50b14330711500af864f08b53eb82d10cd5c04c@%3Cuser.mesos.apache.org%3E]

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
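
The difference between the reported behaviour and the proposed fix can be illustrated with a small toy model. This is only a sketch of the reasoning in the description, not Mesos code: the `Child` class, the `exit_after_term` field, and both function names are hypothetical, and no real signals are sent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Child:
    """A process started by the `sh -c` wrapper (hypothetical model)."""
    name: str
    # Seconds the process needs to exit after SIGTERM.
    # None means it ignores SIGTERM entirely.
    exit_after_term: Optional[float]


def signal_whole_tree(children: List[Child]) -> Dict[str, List[str]]:
    """Reported behaviour: SIGTERM reaches `sh` and its children at once.

    `sh` dies immediately, the executor considers the task finished and
    never escalates. Children that need time are no longer supervised,
    and children that ignore SIGTERM are never killed.
    """
    unsupervised = [c.name for c in children
                    if c.exit_after_term is None or c.exit_after_term > 0]
    leaked = [c.name for c in children if c.exit_after_term is None]
    return {"unsupervised": unsupervised, "leaked": leaked}


def signal_children_only(children: List[Child],
                         grace: float) -> Dict[str, List[str]]:
    """Proposed fix: SIGTERM only the children of `sh`.

    `sh` stays alive until its children exit, so the executor keeps
    waiting and escalates to SIGKILL once the grace period is over.
    """
    graceful: List[str] = []
    sigkilled: List[str] = []
    for c in children:
        if c.exit_after_term is not None and c.exit_after_term <= grace:
            graceful.append(c.name)
        else:
            sigkilled.append(c.name)
    return {"graceful": graceful, "sigkilled": sigkilled}
```

For example, with children `fast` (exits at once), `slow` (needs 5 seconds) and `stubborn` (ignores SIGTERM), the first function reports `stubborn` as leaked, while the second with `grace=10.0` lets `fast` and `slow` finish and marks only `stubborn` for SIGKILL.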
[jira] [Updated] (MESOS-6933) Executor does not respect grace period
[ https://issues.apache.org/jira/browse/MESOS-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

haosdent updated MESOS-6933:
----------------------------
    Description: 
Mesos Command Executor try to support grace period with escalate but unfortunately it does not work. It launches {{command}} by wrapping it in {{sh -c}} this cause process tree to look like this
{code}
Received killTask
Shutting down
Sending SIGTERM to process tree at pid 18
Sent SIGTERM to the following process trees:
[
-+- 18 sh -c cd offer-i18n-0.1.24 && LD_PRELOAD=../librealresources.so ./bin/offer-i18n -e prod -p $PORT0
 \--- 19 command...
]
Command terminated with signal Terminated (pid: 18)
{code}
This cause {{sh}} to immediately close and so executor, while wrapped {{command}} might need some more time to finish. Finally, executor thinks command executed gracefully so it won't [escalate|https://github.com/apache/mesos/blob/1.1.0/src/launcher/executor.cpp#L695] to SIGKILL.
This cause leaks when POSIX containerizer is used because if command ignores SIGTERM it will be attached to initialize and never get killed. Using pid/namespace only masks the problem because hanging process is captured before it can gracefully shutdown.
Fix for this is to sent SIGTERM only to {{sh}} children. {{sh}} will exit when all children processes finish. If not they will be killed by escalation to SIGKILL.
All versions from 0.20 are affected.
This test should pass [src/tests/command_executor_tests.cpp:342|https://github.com/apache/mesos/blob/2c856178b59593ff8068ea8d6c6593943c33008c/src/tests/command_executor_tests.cpp#L342-L343]
[Mailing list thread|https://lists.apache.org/thread.html/1025dca0cf4418aee50b14330711500af864f08b53eb82d10cd5c04c@%3Cuser.mesos.apache.org%3E]

  was:
Mesos Command Executor try to support grace period with escalate but unfortunately it does not work. It launches {{command}} by wrapping it in {{sh -c}} this cause process tree to look like this
{code}
Received killTask
Shutting down
Sending SIGTERM to process tree at pid 18
Sent SIGTERM to the following process trees:
[
-+- 18 sh -c cd offer-i18n-0.1.24 && LD_PRELOAD=../librealresources.so ./bin/offer-i18n -e prod -p $PORT0
 \--- 19 command...
]
Command terminated with signal Terminated (pid: 18)
{code}
This cause {{sh}} to immediately close and so executor, while wrapped {{command}} might need some more time to finish. Finally, executor thinks command executed gracefully so it won't [escalate|https://github.com/apache/mesos/blob/1.1.0/src/launcher/executor.cpp#L695] to SIGKILL.
This cause leaks when POSIX contenerizer is used because if command ignores SIGTERM it will be attached to init and never get killed. Using pid/namespace only masks the problem because hanging process is cpatured before it can gracefully shutdown.
Fix for this is to sent SIGTERM only to {{sh}} children. {{sh}} will exit when all sub processes finish. If not they will be killed by escalation to SIGKILL.
All versions from: 0.20 are affected.
This test should pass [src/tests/command_executor_tests.cpp:342|https://github.com/apache/mesos/blob/2c856178b59593ff8068ea8d6c6593943c33008c/src/tests/command_executor_tests.cpp#L342-L343]
[Mailing list thread|https://lists.apache.org/thread.html/1025dca0cf4418aee50b14330711500af864f08b53eb82d10cd5c04c@%3Cuser.mesos.apache.org%3E]
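
The "escalation to SIGKILL" that the description relies on can be sketched roughly as follows. This is a minimal illustration in Python, not the executor's C++ implementation: `kill_with_grace` and `spawn_stubborn_child` are hypothetical names, the child is signalled directly rather than through an `sh -c` wrapper, and a POSIX system is assumed.

```python
import signal
import subprocess
import sys


def kill_with_grace(proc: subprocess.Popen, grace_seconds: float) -> str:
    """Send SIGTERM, wait up to grace_seconds, then escalate to SIGKILL.

    Returns "graceful" if the process exited within the grace period,
    otherwise "escalated".
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return "graceful"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught or ignored
        proc.wait()  # reap the child so it does not linger as a zombie
        return "escalated"


def spawn_stubborn_child() -> subprocess.Popen:
    """Start a child that ignores SIGTERM (stand-in for a misbehaving task)."""
    code = (
        "import signal, time\n"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
        "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.PIPE, text=True)
    proc.stdout.readline()  # block until SIGTERM is actually being ignored
    return proc
```

Calling `kill_with_grace(spawn_stubborn_child(), 0.3)` escalates, because the child ignores SIGTERM. The point of the report is that this escalation only helps if the process being waited on is the one doing the work; when the wait targets a wrapper shell that dies on the first SIGTERM, the timeout never fires.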
[jira] [Updated] (MESOS-6933) Executor does not respect grace period
[ https://issues.apache.org/jira/browse/MESOS-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

haosdent updated MESOS-6933:
----------------------------
    Description: 
Mesos Command Executor try to support grace period with escalate but unfortunately it does not work. It launches {{command}} by wrapping it in {{sh -c}} this cause process tree to look like this
{code}
Received killTask
Shutting down
Sending SIGTERM to process tree at pid 18
Sent SIGTERM to the following process trees:
[
-+- 18 sh -c cd offer-i18n-0.1.24 && LD_PRELOAD=../librealresources.so ./bin/offer-i18n -e prod -p $PORT0
 \--- 19 command...
]
Command terminated with signal Terminated (pid: 18)
{code}
This cause {{sh}} to immediately close and so executor, while wrapped {{command}} might need some more time to finish. Finally, executor thinks command executed gracefully so it won't [escalate|https://github.com/apache/mesos/blob/1.1.0/src/launcher/executor.cpp#L695] to SIGKILL.
This cause leaks when POSIX contenerizer is used because if command ignores SIGTERM it will be attached to init and never get killed. Using pid/namespace only masks the problem because hanging process is cpatured before it can gracefully shutdown.
Fix for this is to sent SIGTERM only to {{sh}} children. {{sh}} will exit when all sub processes finish. If not they will be killed by escalation to SIGKILL.
All versions from: 0.20 are affected.
This test should pass [src/tests/command_executor_tests.cpp:342|https://github.com/apache/mesos/blob/2c856178b59593ff8068ea8d6c6593943c33008c/src/tests/command_executor_tests.cpp#L342-L343]
[Mailing list thread|https://lists.apache.org/thread.html/1025dca0cf4418aee50b14330711500af864f08b53eb82d10cd5c04c@%3Cuser.mesos.apache.org%3E]

  was:
Mesos Defult Executor try to support grace period with escalate but unfortunately it does not work. It launches {{command}} by wrapping it in {{sh -c}} this cause process tree to look like this
{code}
Received killTask
Shutting down
Sending SIGTERM to process tree at pid 18
Sent SIGTERM to the following process trees:
[
-+- 18 sh -c cd offer-i18n-0.1.24 && LD_PRELOAD=../librealresources.so ./bin/offer-i18n -e prod -p $PORT0
 \--- 19 command...
]
Command terminated with signal Terminated (pid: 18)
{code}
This cause {{sh}} to immediately close and so executor, while wrapped {{command}} might need some more time to finish. Finally, executor thinks command executed gracefully so it won't [escalate|https://github.com/apache/mesos/blob/1.1.0/src/launcher/executor.cpp#L695] to SIGKILL.
This cause leaks when POSIX contenerizer is used because if command ignores SIGTERM it will be attached to init and never get killed. Using pid/namespace only masks the problem because hanging process is cpatured before it can gracefully shutdown.
Fix for this is to sent SIGTERM only to {{sh}} children. {{sh}} will exit when all sub processes finish. If not they will be killed by escalation to SIGKILL.
All versions from: 0.20 are affected.
This test should pass [src/tests/command_executor_tests.cpp:342|https://github.com/apache/mesos/blob/2c856178b59593ff8068ea8d6c6593943c33008c/src/tests/command_executor_tests.cpp#L342-L343]
[Mailing list thread|https://lists.apache.org/thread.html/1025dca0cf4418aee50b14330711500af864f08b53eb82d10cd5c04c@%3Cuser.mesos.apache.org%3E]