[ 
https://issues.apache.org/jira/browse/YARN-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128826#comment-16128826
 ] 

Jason Lowe commented on YARN-7019:
----------------------------------

I'm totally OK if we want to generalize this to a preemption score or whatever 
as long as applications have the ability to inform YARN in some way.

bq. In YARN-3784, we were trying to send preemption timeout to AMs, so could AM 
take an immediate action about checkpointing OR even send a feedback to RM with 
alternative container to preempt instead of selected one?

The problem with relying on the AM is that the feedback loop can be too long for 
some use cases.  For example, YARN-1011 is proposing to have the NM preempt 
containers on its own in order to preserve the health of the node when too many 
containers end up on the node and resource utilization is at critical levels.  
In those cases the NM doesn't have time to tell the RM about it, have the RM 
wait for the AM to heartbeat in so it can tell the AM about it, wait for the AM 
to respond with a preemption preference, then heartbeat in again so the RM can 
relay the priority.  It would be nice if 
the NM could be proactively told during container execution when the cost of 
preemption changes so it can make better decisions on its own when pressed for 
time.
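
As a rough sketch of what that could look like on the NM side: the NM keeps a 
per-container "last checkpoint" time that the application updates whenever the 
container is reused, and picks the victim with the least lost work using only 
local state.  All of the names below (PreemptionSketch, ContainerInfo, 
markCheckpoint, pickVictim) are hypothetical and are not existing YARN APIs; 
how the reuse signal would actually reach the NM is left open here.

```java
import java.util.List;

/** Hypothetical sketch only; none of these names are YARN APIs. */
final class PreemptionSketch {

    /** Per-container state the NM would track locally (assumed). */
    static final class ContainerInfo {
        final String id;
        final long startMillis;
        // Last time the application reported a reuse/checkpoint.
        // Starts at launch time, so an unreported container counts its full age.
        long lastCheckpointMillis;

        ContainerInfo(String id, long startMillis) {
            this.id = id;
            this.startMillis = startMillis;
            this.lastCheckpointMillis = startMillis;
        }

        /** Called when the application signals that the container was reused. */
        void markCheckpoint(long nowMillis) {
            lastCheckpointMillis = Math.max(lastCheckpointMillis, nowMillis);
        }

        /** Work lost if preempted now: time since the last checkpoint. */
        long lostWorkMillis(long nowMillis) {
            return Math.max(0L, nowMillis - lastCheckpointMillis);
        }
    }

    /** Returns the container with the least lost work, or null if the list is empty. */
    static ContainerInfo pickVictim(List<ContainerInfo> containers, long nowMillis) {
        ContainerInfo best = null;
        for (ContainerInfo c : containers) {
            if (best == null
                    || c.lostWorkMillis(nowMillis) < best.lostWorkMillis(nowMillis)) {
                best = c;
            }
        }
        return best;
    }
}
```

With something like this, a long-running container that was just reused looks 
cheaper to preempt than a younger container in the middle of its first task, 
and the NM never has to wait on a heartbeat round trip to find that out.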

> Ability for applications to notify YARN about container reuse
> -------------------------------------------------------------
>
>                 Key: YARN-7019
>                 URL: https://issues.apache.org/jira/browse/YARN-7019
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Jason Lowe
>
> During preemption calculations YARN can try to reduce the amount of work lost 
> by considering how long a container has been running.  However when an 
> application framework like Tez reuses a container across multiple tasks it 
> changes the work lost calculation since the container has essentially 
> checkpointed between task assignments.  It would be nice if applications 
> could inform YARN when a container has been reused/checkpointed and therefore 
> is a better candidate for preemption, in terms of lost work, than other, younger 
> containers.


