Hi Benjamin,

Usually, for us, if a task runs longer than a certain period of time, it means that something has gone wrong and we should just abort and try again.
David (also at Yelp)

On Fri, Mar 23, 2018 at 7:14 PM, Benjamin Mahler <[email protected]> wrote:

> Ah, I was more curious about why they need to be killed after a timeout.
> E.g. After a particular deadline the work is useless (in Zhitao's case).
>
> On Fri, Mar 23, 2018 at 6:22 PM Sagar Sadashiv Patwardhan <[email protected]>
> wrote:
>
>> Hi Benjamin,
>> We have a few tasks that should be killed after some timeout. We
>> currently have some logic in our scheduler to kill these tasks. Would
>> be nice to delegate this to the executor.
>>
>> - Sagar
>>
>> On Fri, Mar 23, 2018 at 3:29 PM, Benjamin Mahler <[email protected]>
>> wrote:
>>
>> > Sagar, could you share your use case? Or is it exactly the same as
>> > Zhitao's?
>> >
>> > On Fri, Mar 23, 2018 at 3:15 PM, Sagar Sadashiv Patwardhan
>> > <[email protected]> wrote:
>> >
>> > > +1
>> > >
>> > > This will be useful for us (Yelp) as well.
>> > >
>> > > On Fri, Mar 23, 2018 at 1:31 PM, Benjamin Mahler <[email protected]>
>> > > wrote:
>> > >
>> > > > Also, it's advantageous for mesos to be aware of a hard deadline
>> > > > when it comes to resource allocation. We know that some resources
>> > > > will free up and can make better decisions when it comes to
>> > > > pre-emption, for example. Currently, mesos doesn't know if a task
>> > > > will run forever or will run to completion.
>> > > >
>> > > > On Fri, Mar 23, 2018 at 10:07 AM, James Peach <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > > On Mar 23, 2018, at 9:57 AM, Renan DelValle
>> > > > > > <[email protected]> wrote:
>> > > > > >
>> > > > > > Hi Zhitao,
>> > > > > >
>> > > > > > Since this is something that could potentially be handled by
>> > > > > > the executor and/or framework, I was wondering if you could
>> > > > > > speak to the advantages of making this a TaskInfo primitive
>> > > > > > vs having the executor (or even the framework) handle it.
>> > > > >
>> > > > > There's some discussion around this on
>> > > > > https://issues.apache.org/jira/browse/MESOS-8725.
>> > > > >
>> > > > > My take is that delegating too much to the scheduler makes
>> > > > > schedulers harder to write and exacerbates the complexity of
>> > > > > the system. If 4 different schedulers implement this feature,
>> > > > > operators are likely to need to understand 4 different ways of
>> > > > > doing the same thing, which would be unfortunate.
>> > > > >
>> > > > > J
