Thanks Stephan. Please read inline.

On Sat, Jan 20, 2018 at 5:03 AM, Stephan Erb <s...@apache.org> wrote:

> Q1: Does Aurora use COMMAND or DEFAULT executor?
>
>
> Aurora is currently using neither. In Mesos terms Thermos is a CUSTOM
> executor. On top, Aurora supports alternative custom executors [1] such as
> the Docker compose executor [2].
>
> Mesos seems to be betting on the new DEFAULT executor. It should be
> possible to make Thermos fit the DEFAULT executor model (as it supports
> task groups), but I have no real estimate of how much refactoring this
> would require.
>
>
This was about a point Bill made earlier. I am wondering if "without an
executor" is COMMAND or DEFAULT.
> But do we really need the command line option?
>
> *Aurora can run tasks without an executor.*  I'm assuming the shutdown call
> is incompatible with that mode.



>
> Q2: I think that this is ok as Aurora's reconciliation will still work...
> Right?
>
>
> Aurora assumes a correspondence of one task per executor, so I believe
> this is correct.
>
>
Great.
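
To make the Q2 reasoning concrete, here is a minimal sketch in Java of "resend SHUTDOWN until the task state says it worked". Everything in it is an assumption for illustration: `ShutdownSender`, `ShutdownRetrySketch`, and the `maxAttempts` cap are hypothetical names, not Aurora or Mesos APIs. In the scheduler itself the retries would presumably be driven by the existing reconciliation and kill-retry machinery rather than a tight loop.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for whatever sends the SHUTDOWN call to Mesos.
interface ShutdownSender {
  void shutdown(String executorId, String agentId);
}

class ShutdownRetrySketch {
  /**
   * Sends SHUTDOWN until the task is reported terminal or the attempt cap is hit.
   * Task state is used as a proxy for executor state, which Mesos does not reconcile.
   * Returns the number of SHUTDOWN calls that were sent.
   */
  static int shutdownUntilTerminal(
      ShutdownSender sender,
      BooleanSupplier taskIsTerminal,
      String executorId,
      String agentId,
      int maxAttempts) {

    int attempts = 0;
    while (attempts < maxAttempts && !taskIsTerminal.getAsBoolean()) {
      // Best effort: the call may be lost, so it is simply sent again next round.
      sender.shutdown(executorId, agentId);
      attempts++;
    }
    return attempts;
  }
}
```

The sketch only works because of the one-task-per-executor assumption: a terminal task state then implies the executor is gone.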


> Q3: Does thermos executor need any changes to respond to SHUTDOWN or does
> it already handle that?
>
>
> I have never tried it, but I believe it should work out of the box [3].
>

Indeed looks like it is already handled.


> [1] https://github.com/apache/aurora/blob/master/docs/
> features/custom-executors.md
> [2] https://github.com/mesos/docker-compose-executor
> [3] https://github.com/apache/aurora/blob/8af269f52f162faa36cd2778979626
> eefcbe8181/src/main/python/apache/aurora/executor/aurora_
> executor.py#L301-L313
>
>
> Best regards,
> Stephan
>
>
> On Wed, 2018-01-17 at 16:45 -0800, Mohit Jaggi wrote:
>
> FYI....I had a quick chat with Vinod from the Mesos team. I have some
> questions for Aurora users inline:
>
>
> *Originally the default was the COMMAND executor. In this world the
> scheduler has no visibility into the command executor.*
> *More recently, we added a DEFAULT executor which is used by frameworks
> when they want to launch pod like task groups*
>
> *The SHUTDOWN executor call is only applicable if a scheduler uses CUSTOM
> or DEFAULT executor *and* uses v1 scheduler API.*
>
> Q1: Does Aurora use COMMAND or DEFAULT executor?
>
>
> *note that SHUTDOWN is not as robust as you might think
> :slightly_smiling_face:*
> *for one, there is no reconciliation API for the executor state. it is
> very much best effort. *
> *KILL is more robust for killing tasks, because task status updates are
> reliably delivered and there is reconciliation API*
>
> Q2: I think that this is ok as Aurora's reconciliation will still work as
> we don't have "executor state". "task state" will be a good and correct
> proxy for that. Aurora will send SHUTDOWN again and again until it succeeds
> in the same way as it does now with KILL. Right?
>
> Q3: Does thermos executor need any changes to respond to SHUTDOWN or does
> it already handle that?
>
>
>
>
> On Tue, Jan 16, 2018 at 4:48 PM, Mohit Jaggi <mohit.ja...@uber.com> wrote:
>
> So that is pretty much what I proposed...
>
> If the method signature has to change, we can keep the executorId as it
> is, unless we want to take this opportunity to clean that up. I will check
> if the SHUTDOWN works in non-executor cases also.
>
> On Tue, Jan 16, 2018 at 3:03 PM, Bill Farner <wfar...@apache.org> wrote:
>
> We still need "Agent ID" for the shutdown call.
>
>
> Darn.  In that case, how about we change the method signature in Driver to
> accept agentId and ignore that param in MesosSchedulerDriver.
>
> But do we really need the command line option?
>
>
> Aurora can run tasks without an executor.  I'm assuming the shutdown call
> is incompatible with that mode.
>
> On Tue, Jan 16, 2018 at 1:57 PM, Mohit Jaggi <mohit.ja...@uber.com> wrote:
>
> We still need "Agent ID" for the shutdown call.
>
> On Tue, Jan 16, 2018 at 1:57 PM, Mohit Jaggi <mohit.ja...@uber.com> wrote:
>
> Sounds good. But do we really need the command line option? One can use an
> older Driver if KILL is preferred for some reason.
>
> On Tue, Jan 16, 2018 at 1:51 PM, Bill Farner <wfar...@apache.org> wrote:
>
> This situation is much simpler if task ID == executor ID.  I can't come up
> with a good reason why this is not the case today.  Our executor IDs
> originally included a static prefix, though I do not recall any rationale
> for this.  When Renan added custom executor support, this static prefix was
> made configurable.  Again, I do not believe there was any rationale for the
> utility of executor IDs.
>
> I propose the following:
> - Change relevant code in MesosTaskFactory to
> setExecutorId(task.getTaskId())
> - Add a command line parameter (default false) to toggle use of executor
> shutdown in VersionedSchedulerDriverService.killTask
>
> Does anyone see an issue with this approach?
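
A minimal Java sketch of the two-part proposal above, assuming executor ID == task ID and a boolean flag that defaults to false. All names here (`SketchDriver`, `RecordingDriver`, `KillTaskSketch`, `useExecutorShutdown`) are hypothetical stand-ins for illustration, not the actual Aurora classes or the real flag name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver abstraction; not Aurora's or Mesos's real interface.
interface SketchDriver {
  void kill(String taskId);
  void shutdown(String executorId, String agentId);
}

// Records calls so the chosen code path can be inspected.
class RecordingDriver implements SketchDriver {
  final List<String> calls = new ArrayList<>();

  @Override
  public void kill(String taskId) {
    calls.add("KILL " + taskId);
  }

  @Override
  public void shutdown(String executorId, String agentId) {
    calls.add("SHUTDOWN " + executorId + " on " + agentId);
  }
}

class KillTaskSketch {
  private final SketchDriver driver;
  // Hypothetical command line toggle; false keeps today's KILL behavior.
  private final boolean useExecutorShutdown;

  KillTaskSketch(SketchDriver driver, boolean useExecutorShutdown) {
    this.driver = driver;
    this.useExecutorShutdown = useExecutorShutdown;
  }

  void killTask(String taskId, String agentId) {
    if (useExecutorShutdown) {
      // Assumes executor ID == task ID, so no separate executor ID lookup is needed.
      driver.shutdown(taskId, agentId);
    } else {
      driver.kill(taskId);
    }
  }
}
```

With the flag off the call path is unchanged, which is what keeps the change backward compatible. The agent ID still has to be passed in for the shutdown path.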
>
> On Tue, Jan 16, 2018 at 11:15 AM, Mohit Jaggi <mohit.ja...@uber.com>
> wrote:
>
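> To do this in a backward compatible manner, one way is:
>
> ```
> void destroy(taskId, executorId, agentId) {
>   // Use SHUTDOWN only when the driver speaks the new (versioned) API.
>   if (driver instanceof Versioned....) {
>     ((Versioned...) driver).shutdown(executorId, agentId);
>   } else {
>     // Older drivers keep the existing KILL behavior.
>     driver.kill(taskId);
>   }
> }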
> ```
>
> Any other opinions?
>
> On Tue, Jan 16, 2018 at 11:12 AM, David McLaughlin <dmclaugh...@apache.org
> > wrote:
>
> Nope, I support getting SHUTDOWN in for users of the new API.
>
> On Tue, Jan 16, 2018 at 11:06 AM, Mohit Jaggi <mohit.ja...@uber.com>
> wrote:
>
> Are you suggesting that we delay the switch to SHUTDOWN call until this
> working group can resolve the API perf issue?
>
> On Mon, Jan 15, 2018 at 3:55 PM, David McLaughlin <dmclaugh...@apache.org>
> wrote:
>
> We are working with Mesos folks to resolve it. There is a Mesos
> performance working group that folks can join if they'd like to contribute:
> http://mesos.apache.org/blog/performance-working-group-progress-report/
>
> I'm not sure what you mean by branch. Everything we used to scale test is
> on master.
>
> On Mon, Jan 15, 2018 at 10:08 AM, Meghdoot bhattacharya <
> meghdoo...@yahoo.com> wrote:
>
> David, should Twitter test against Mesos 1.5 to see if things are better
> with the new API instead of libmesos? Otherwise this will become a drift
> over time that stops us from adopting new features.
>
> If the tests were run some time ago, it would be good to rerun them and
> open a ticket in Mesos if issues exist. All Aurora users can then push for
> resolution.
>
> Also, could you share details on the branch etc. that has the API
> integration?
>
> Thx
>
> On Jan 12, 2018, at 11:39 AM, David McLaughlin <dmclaugh...@apache.org>
> wrote:
>
> I'm not sure I agree with the summary. Bill's proposal was to use shutdown
> only when using the new API. I would also support this if it's possible.
>
> On Fri, Jan 12, 2018 at 11:14 AM, Mohit Jaggi <mohit.ja...@uber.com>
> wrote:
>
> Summary so far:
> - Bill supports making this change
> - This change cannot be made in a backward compatible manner
> - David (Twitter) does not want to use HTTP APIs due to performance
> concerns. I conclude that folks from Twitter don't support this change
>
> Question:
> - Are there other users that want this change?
>