[ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294613#comment-14294613 ]
Chris Douglas commented on YARN-1039:
-------------------------------------

[~cwelch] YARN shouldn't understand the lifecycle for a service or the progress/dependencies for task containers. As proposed, an AM will receive a lease on a container for some duration. Before the lease expires, it can relinquish the lease or request that it be renewed. While this adds some complexity to the AM implementation (it needs to track and renew its container leases), it's mostly library code that admits straightforward, naive implementations. The most obvious strawman would request all resources at the longest possible duration and always renew.

Mapping an enumeration expressing an AM lifecycle into a policy for requesting, refreshing, and managing resources is an excellent client-side abstraction. Even if an implementation of YARN only receives (and only issues) leases from a fixed set of values, the underlying abstraction can admit arbitrary durations. An enumeration is a good API for applications, but the RM framework could have a more fine-grained substrate.

Leases actually help services run under YARN. By way of example, refusing to renew a lease could signal that the node will be decommissioned, or that some cluster-wide invariant, such as balanced utilization or fairness, is better met by (re)moving that container. Refusing to renew a lease, or renewing it for a shorter period, could signal the service to request new containers.

> Add parameter for YARN resource requests to indicate "long lived"
> -----------------------------------------------------------------
>
>                 Key: YARN-1039
>                 URL: https://issues.apache.org/jira/browse/YARN-1039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Steve Loughran
>            Assignee: Craig Welch
>         Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch
>
>
> A container request could support a new parameter "long-lived".
> This could be used by a scheduler that would know not to host the service on
> a transient (cloud: spot priced) node.
> Schedulers could also decide whether or not to allocate multiple long-lived
> containers on the same node



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
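
For illustration only, the lease idea in the comment above might be sketched on the AM side roughly as follows. Nothing here is an existing YARN API: LeaseSketch, Lifecycle, RenewalOutcome, Action, Lease, and the duration values are all hypothetical names and numbers chosen for the example. The sketch shows an enumeration mapped to arbitrary durations on the client side, the "longest duration, always renew" strawman, and one possible reaction to a refused or shortened renewal.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;

/** Illustrative sketch only; none of these types exist in YARN. */
final class LeaseSketch {

    /** Hypothetical application-facing lifecycle hint. */
    enum Lifecycle { SHORT_TASK, BATCH, SERVICE }

    /** Hypothetical RM answers to a renewal request. */
    enum RenewalOutcome { RENEWED, SHORTENED, REFUSED }

    /** What the AM decides to do for the affected container. */
    enum Action { KEEP, REQUEST_REPLACEMENT }

    /** A lease on one container; the expiry is an arbitrary instant. */
    static final class Lease {
        final String containerId;
        final Instant expiry;

        Lease(String containerId, Instant expiry) {
            this.containerId = containerId;
            this.expiry = expiry;
        }

        /** True once 'now' is within 'margin' of the expiry (or past it). */
        boolean needsRenewal(Instant now, Duration margin) {
            return !now.plus(margin).isBefore(expiry);
        }
    }

    // Client-side policy: the enum is only an API; the substrate is a Duration.
    // The concrete values are made up for this example.
    private static final Map<Lifecycle, Duration> REQUESTED =
            new EnumMap<>(Lifecycle.class);
    static {
        REQUESTED.put(Lifecycle.SHORT_TASK, Duration.ofMinutes(10));
        REQUESTED.put(Lifecycle.BATCH, Duration.ofHours(2));
        REQUESTED.put(Lifecycle.SERVICE, Duration.ofHours(24));
    }

    /** Duration an AM with the given lifecycle would ask for. */
    static Duration requestedDuration(Lifecycle lifecycle) {
        return REQUESTED.get(lifecycle);
    }

    /** The naive strawman: always ask for the longest duration on offer. */
    static Duration strawmanDuration() {
        return Collections.max(REQUESTED.values());
    }

    /** A refused or shortened renewal prompts a request for a new container. */
    static Action react(RenewalOutcome outcome) {
        return outcome == RenewalOutcome.RENEWED
                ? Action.KEEP
                : Action.REQUEST_REPLACEMENT;
    }
}
```

An AM library built along these lines would periodically check needsRenewal for each held lease, send a renewal request when it returns true, and pass the RM's answer to react. How the RM chooses between renewing, shortening, and refusing is deliberately left out, since the comment treats that as scheduler policy.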