> On Oct 9, 2017, at 1:27 PM, Vinod Kone <vinodk...@apache.org> wrote:
> 
>> In the case that a task is killed because it violated a resource
>> constraint (ie. the reason field is REASON_CONTAINER_LIMITATION,
>> REASON_CONTAINER_LIMITATION_DISK or REASON_CONTAINER_LIMITATION_MEMORY),
>> this field may be populated with the resource that triggered the
>> limitation. This is intended to give better information to schedulers about
>> task resource failures, in the expectation that it will help them bubble
>> useful information up to the user or a monitoring system.
>> 
> 
> Can you elaborate what schedulers are expected to do with this information?
> Looking for some concrete use cases if you can.

There's no concrete use case here; it's just a matter of propagating 
information we know in a structured way.

If we assume that the scheduler knows about some sort of monitoring system or 
has a UI, we can present this to the user or a system that can take action on 
it. The status quo is that the raw message string is dumped to logs, and has to 
be manually interpreted. 

Additionally, this can pave the way to getting rid of 
REASON_CONTAINER_LIMITATION_DISK and REASON_CONTAINER_LIMITATION_MEMORY. All 
you really need is REASON_CONTAINER_LIMITATION plus the resource information.

J

Reply via email to