[ 
https://issues.apache.org/jira/browse/YARN-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873247#comment-15873247
 ] 

Arun Suresh commented on YARN-5501:
-----------------------------------

Great discussion.

I would suggest we make the following simplifying assumptions for an initial cut.

h4. 1. Concept of detach and attach.
From the doc, it looks like "detach" implies removing the pre-initialized 
container from the pool and "attach" refers to associating an app with a 
pooled container. It might be simpler if we treat the operation as atomic. In 
that sense, we can make do with just an "attach" or "lease", where a 
pre-initialized container is associated with an app.

h4. 2. Use once and throw away.
For the sake of simplicity, maybe we should assume that once an application is 
assigned a container from the pool and has "attached" to it, it is the 
application's container and the pooling framework relinquishes ownership of it. 
The container then completes normally and all resource accounting is billed 
against the app. The pool of containers can be re-populated externally by the 
pool manager component in the RM (which is out of scope for now).
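
Points 1 and 2 together can be sketched roughly as below. This is only an illustration of the intended semantics, not a proposed implementation: {{PooledContainer}}, {{ContainerPoolSketch}} and {{lease}} are hypothetical names and do not exist in YARN. The pool exposes a single atomic "lease" and deliberately has no way to hand a container back.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical stand-in for a pre-initialized container; not a YARN class. */
final class PooledContainer {
    final String id;
    private String ownerAppId; // null while the pool still owns the container

    PooledContainer(String id) {
        this.id = id;
    }

    String owner() {
        return ownerAppId;
    }

    void setOwner(String appId) {
        this.ownerAppId = appId;
    }
}

/** Hypothetical pool offering one atomic "lease" instead of detach + attach. */
final class ContainerPoolSketch {
    private final ConcurrentLinkedQueue<PooledContainer> idle =
            new ConcurrentLinkedQueue<>();

    /** Called by whatever re-populates the pool (assumed to live outside this class). */
    void add(PooledContainer c) {
        idle.add(c);
    }

    int size() {
        return idle.size();
    }

    /**
     * Removes one container from the pool and associates it with the app.
     * The container is never returned: the pool gives up ownership for good.
     */
    Optional<PooledContainer> lease(String appId) {
        PooledContainer c = idle.poll(); // atomic removal; null if the pool is empty
        if (c == null) {
            return Optional.empty();
        }
        c.setOwner(appId);
        return Optional.of(c);
    }
}
```

Since only the caller that wins the {{poll()}} ever sees the container, no separate "detach" state needs to be tracked.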

h4. 3. Resource accounting.
This is one of the reasons why I feel generalized resources would be useful 
here. Assume initially we have a cluster with resources <10 vcores, 10 GB> 
spread equally across 2 NMs. Let's say we allocate 4 pre-initialized containers 
(via the pooling component in the RM) of type *foo*, each with <1 vcore, 1 GB>, 
and distribute them equally across the NMs. Once the pre-initialized 
containers have started, the total cluster resources would be <6 vcores, 6 GB, 
4 foo>.
Each NM would have <3 vcores, 3 GB, 2 foo> available resources. Now if an app 
asks for <0 vcores, 0 GB, 1 foo>, it will be allocated 1 pooled container and 
the resources associated with 1 foo <1 vcore, 1 GB> can be accounted against 
the app. The app could also ask for <1 vcore, 1 GB, 1 foo>, in which case 
it will still be assigned one of the pooled containers, with the assumption 
that the container's size can expand by <1 vcore, 1 GB> if required. 
Cgroups/JobObjects would be used to enforce this.
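
The arithmetic above can be checked with a small sketch. {{ResVec}} and {{PoolAccountingSketch}} are hypothetical helpers, not YARN's Resource classes. The {{billed}} rule (explicit vcores/GB plus what each requested foo already holds) is my reading of the expansion case and is an assumption, not something settled here.

```java
/** Hypothetical resource vector <vcores, GB, foo>; not YARN's Resource class. */
final class ResVec {
    final int vcores;
    final int gb;
    final int foo;

    ResVec(int vcores, int gb, int foo) {
        this.vcores = vcores;
        this.gb = gb;
        this.foo = foo;
    }

    ResVec minus(ResVec o) {
        return new ResVec(vcores - o.vcores, gb - o.gb, foo - o.foo);
    }

    /** True if every component of the ask is available. */
    boolean fits(ResVec ask) {
        return ask.vcores <= vcores && ask.gb <= gb && ask.foo <= foo;
    }

    @Override
    public String toString() {
        return "<" + vcores + " vcores, " + gb + " GB, " + foo + " foo>";
    }
}

final class PoolAccountingSketch {
    // Each pre-initialized container of type foo holds <1 vcore, 1 GB>.
    static final int FOO_VCORES = 1;
    static final int FOO_GB = 1;

    /** Resources left to advertise after starting n pooled containers. */
    static ResVec afterPooling(ResVec capacity, int n) {
        return new ResVec(capacity.vcores - n * FOO_VCORES,
                          capacity.gb - n * FOO_GB,
                          capacity.foo + n);
    }

    /** Assumed billing rule: explicit vcores/GB plus what each requested foo holds. */
    static ResVec billed(ResVec ask) {
        return new ResVec(ask.vcores + ask.foo * FOO_VCORES,
                          ask.gb + ask.foo * FOO_GB,
                          0);
    }

    public static void main(String[] args) {
        ResVec cluster = afterPooling(new ResVec(10, 10, 0), 4);
        ResVec nm = afterPooling(new ResVec(5, 5, 0), 2);
        System.out.println("cluster: " + cluster);
        System.out.println("per NM:  " + nm);

        ResVec ask = new ResVec(1, 1, 1); // one foo plus room to expand
        if (nm.fits(ask)) {
            System.out.println("billed:  " + billed(ask));
            System.out.println("NM left: " + nm.minus(ask));
        }
    }
}
```

Under these assumptions, an ask of <0 vcores, 0 GB, 1 foo> is billed as <1 vcore, 1 GB>, matching the accounting described above.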

h4. 4. AM Container communication.
As raised by [~jlowe], it is currently unclear what happens if the app 
framework requires an umbilical connection back to the AM: how does the 
pre-initialized container know where that AM is and when to connect? Currently, 
the *ContainerLaunchContext* should contain all context required by the 
container to operate; this includes the location of the AM and how to talk to 
it. This is usually application specific (e.g. the *TaskUmbilical* protocol used 
by MR). If the container is pre-initialized, this implies that the 
container is in some stand-by state waiting for this context to be passed to 
it. We should call this out in the design doc:
# The "attach" process will pass the application's ContainerLaunchContext to 
the pre-initialized container.
# The feature requires some smarts in the ContainerExecutor, which must know how to 
pass the LaunchContext specific to the "type" of pre-initialized container to 
the container, which itself should somehow know that it is pre-initialized and 
in some stand-by state. We can leverage some of the *Container Runtime* 
features for this.
# TODO later: Introduce an NM/Executor <-> container protocol to formalize the 
above, which may be useful for long running containers.
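
To make the stand-by idea concrete, here is a minimal in-process sketch. All names are hypothetical: {{LaunchContextSketch}} is a toy stand-in for the real ContainerLaunchContext, and the {{AM_ADDRESS}} key is an assumption. A real design would need an actual NM/Executor <-> container channel rather than a shared object.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical, simplified launch context; the real one carries much more. */
final class LaunchContextSketch {
    final Map<String, String> environment;

    LaunchContextSketch(Map<String, String> environment) {
        this.environment = environment;
    }
}

/** Hypothetical pre-initialized container that idles until attach() delivers a context. */
final class StandbyContainerSketch {
    private final CompletableFuture<LaunchContextSketch> attached =
            new CompletableFuture<>();

    /** Called from the NM/executor side; returns false if already attached. */
    boolean attach(LaunchContextSketch ctx) {
        return attached.complete(ctx);
    }

    /**
     * Called inside the container: blocks in stand-by until attached, then
     * returns where the AM is so the umbilical connection can be opened.
     */
    String awaitAmAddress() {
        LaunchContextSketch ctx = attached.join();
        return ctx.environment.get("AM_ADDRESS"); // hypothetical key
    }
}
```

The second {{attach}} on the same container returns false, which fits the "use once" assumption in point 2.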

Thoughts?

> Container Pooling in YARN
> -------------------------
>
>                 Key: YARN-5501
>                 URL: https://issues.apache.org/jira/browse/YARN-5501
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Arun Suresh
>            Assignee: Hitesh Sharma
>         Attachments: Container Pooling in YARN.pdf, Container Pooling - one 
> pager.pdf
>
>
> This JIRA proposes a method for reducing the container launch latency in 
> YARN. It introduces a notion of pooling *Unattached Pre-Initialized 
> Containers*.
> Proposal in brief:
> * Have a *Pre-Initialized Container Factory* service within the NM to create 
> these unattached containers.
> * The NM would then advertise these containers as special resource types 
> (this should be possible via YARN-3926).
> * When a start container request is received by the node manager for 
> launching a container requesting this specific type of resource, it will take 
> one of these unattached pre-initialized containers from the pool, and use it 
> to service the container request.
> * Once the request is complete, the pre-initialized container would be 
> released and ready to serve another request.
> This capability would help reduce container launch latencies and thereby 
> allow for development of more interactive applications on YARN.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
