Mohit Jaggi created AURORA-1960:
-----------------------------------

             Summary: Aurora should use SHUTDOWN instead of KILL
                 Key: AURORA-1960
                 URL: https://issues.apache.org/jira/browse/AURORA-1960
             Project: Aurora
          Issue Type: Task
    Affects Versions: 0.19.0, 0.18.1
            Reporter: Mohit Jaggi
            Assignee: Mohit Jaggi
            Priority: Minor
             Fix For: 0.19.0, 0.18.1

Aurora is using KILL as the default method of terminating executors. SHUTDOWN is better suited to get rid of executors more reliably.

---- mailing list discussion ----

> The new API is present in Aurora in a compatibility layer

Aha! I had not explored that code yet.

It does seem that SHUTDOWN provides the behavior that we aim for when killing tasks. The global executor shutdown timeout (--executor_shutdown_grace_period) potentially interferes with our graceful_shutdown_wait_secs job-level configuration. However, an operator could use the former as an upper limit to the latter.

From what i see, i'd support a patch to switch to SHUTDOWN when using DriverKind.V0_DRIVER or DriverKind.V1_DRIVER.

On Sat, Dec 9, 2017 at 4:27 PM, David McLaughlin <dmclaugh...@apache.org> wrote:

The new API is present in Aurora in a compatibility layer, but the HTTP performance issues still exist so we can't make it the default.

On Sat, Dec 9, 2017 at 4:24 PM, Bill Farner <wfar...@apache.org> wrote:

Aurora pre-dates SHUTDOWN by several years, so the option was not present. Additionally, the SHUTDOWN call is not available in the API used by Aurora. Last i knew, Aurora could not use the "new" API because of performance issues in the implementation, but i do not know where that stands today.

https://mesos.apache.org/documentation/latest/scheduler-http-api/#shutdown

NOTE: This is a new call that was not present in the old API

On Sat, Dec 9, 2017 at 4:11 PM, Mohit Jaggi <mohit.ja...@uber.com> wrote:

Folks,
Our Mesos team is wondering why Aurora chose KILL over SHUTDOWN for killing tasks. As Aurora has an executor per task, won't SHUTDOWN work better? It will avoid zombie executors.

Mohit.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)