Right. To keep the current abstraction in Aurora (which supports both APIs), we have to bind to the lowest-common-denominator API methods. So the only way to integrate with SHUTDOWN will be to fix the performance issues so we can switch to the new API.
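To make the constraint concrete, here is a rough sketch of the shape of the problem. Every name in it (CommonDriver, OldApiDriver, NewApiDriver, Scheduler, shutdownExecutor) is hypothetical and chosen for illustration; none of this is Aurora's or Mesos's actual code. It only assumes that the old API has no SHUTDOWN call while the new one does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical common abstraction: it can only expose calls that BOTH
// underlying APIs are able to serve.
interface CommonDriver {
  void killTask(String taskId);
}

// Hypothetical binding to the old API, which has no SHUTDOWN call.
final class OldApiDriver implements CommonDriver {
  final List<String> sent = new ArrayList<>();

  @Override
  public void killTask(String taskId) {
    sent.add("KILL " + taskId);
  }
}

// Hypothetical binding to the new API, which additionally offers SHUTDOWN.
final class NewApiDriver implements CommonDriver {
  final List<String> sent = new ArrayList<>();

  @Override
  public void killTask(String taskId) {
    sent.add("KILL " + taskId);
  }

  // Exists on this binding only, so it is invisible through CommonDriver.
  void shutdownExecutor(String executorId, String agentId) {
    sent.add("SHUTDOWN " + executorId + "@" + agentId);
  }
}

// The scheduler codes against the common abstraction, so KILL is all it can
// issue, whichever binding is injected.
final class Scheduler {
  private final CommonDriver driver;

  Scheduler(CommonDriver driver) {
    this.driver = driver;
  }

  void stopTask(String taskId) {
    driver.killTask(taskId);
  }
}
```

In this sketch, even a Scheduler handed a NewApiDriver ends up sending KILL, because SHUTDOWN is not part of the shared interface. Adding it there would leave the old binding with nothing to implement, which is why dropping the old API has to come first.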
The performance issue we ran into at Twitter was that, at a status update volume similar to our production load, updates started to get dropped, and tasks ended up being LOST and unnecessarily killed. So it's a definite blocker for us to adopt the new API in its current state. We have someone who has the Mesos-side fix in their backlog, but it's currently not the highest priority for us.

On Thu, Jan 11, 2018 at 1:45 PM, Renan DelValle <renanidelva...@gmail.com> wrote:

> The HTTP API is what is used under the hood for V0 and V1 (instead of
> libmesos), I believe that's what David was referencing when he mentioned
> the HTTP performance issues. Here's a better explanation from the original
> patch submitted by Zameer:
> https://github.com/apache/aurora/commit/705dbc7cd7c3ff477bcf766cdafe49a68ab47dee#diff-75bd5a98db87502a2332e9110d2eafc6
>
> I'm not sure about the Shutdown call, as you mentioned, the versioned
> driver seems to have the method but the driver interface does not. This
> might get tricky from here on in since Mesos has V1 only compatible calls.
>
> On Thu, Jan 11, 2018 at 1:24 PM, Mohit Jaggi <mohit.ja...@uber.com> wrote:
>
>> Thanks Renan. I saw that code. "Driver" interface does not have
>> SHUTDOWN...so it is not "compatible". I was trying to change to
>> VersionedSchedulerDriverService all over the code (that wreaks havoc
>> across the tests!) but Mesos's Java wrapper doesn't seem to have that
>> call either. Perhaps, that is why David referred to the HTTP API.
>>
>> On Thu, Jan 11, 2018 at 1:14 PM, Renan DelValle <renanidelva...@gmail.com> wrote:
>>
>>> https://github.com/apache/aurora/blob/aae2b0dc73b7534c66982ed07b1f029150e245de/src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverModule.java
>>>
>>> https://github.com/apache/aurora/blob/aae2b0dc73b7534c66982ed07b1f029150e245de/src/main/java/org/apache/aurora/scheduler/mesos/VersionedSchedulerDriverService.java#L50
>>>
>>> On Tue, Jan 9, 2018 at 1:21 PM, Mohit Jaggi <mohit.ja...@uber.com> wrote:
>>>
>>>> David,
>>>> Where can I find this code?
>>>>
>>>> Mohit.
>>>>
>>>> On Sat, Dec 9, 2017 at 4:27 PM, David McLaughlin <dmclaugh...@apache.org> wrote:
>>>>
>>>>> The new API is present in Aurora in a compatibility layer, but the
>>>>> HTTP performance issues still exist so we can't make it the default.
>>>>>
>>>>> On Sat, Dec 9, 2017 at 4:24 PM, Bill Farner <wfar...@apache.org> wrote:
>>>>>
>>>>>> Aurora pre-dates SHUTDOWN by several years, so the option was not
>>>>>> present. Additionally, the SHUTDOWN call is not available in the API
>>>>>> used by Aurora. Last i knew, Aurora could not use the "new" API
>>>>>> because of performance issues in the implementation, but i do not
>>>>>> know where that stands today.
>>>>>>
>>>>>> https://mesos.apache.org/documentation/latest/scheduler-http-api/#shutdown
>>>>>>
>>>>>>> NOTE: This is a new call that was not present in the old API
>>>>>>
>>>>>> On Sat, Dec 9, 2017 at 4:11 PM, Mohit Jaggi <mohit.ja...@uber.com> wrote:
>>>>>>
>>>>>>> Folks,
>>>>>>> Our Mesos team is wondering why Aurora chose KILL over SHUTDOWN for
>>>>>>> killing tasks. As Aurora has an executor per task, won't SHUTDOWN work
>>>>>>> better? It will avoid zombie executors.
>>>>>>>
>>>>>>> Mohit.