[ 
https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216921#comment-15216921
 ] 

Konstantinos Karanasos commented on YARN-2883:
----------------------------------------------

Thanks for the comments, [~kasha].

bq. What happens if there is a delay in killing the opportunistic containers? 
If we launch the opportunistic containers with the right restrictions (cgroups 
in Linux), do we really need to hold off on launching guaranteed?
I think that in this initial version we should wait for the containers to be 
killed, as we currently do, since cgroups are often not used in practice (at 
least not yet). As a later improvement, we could allow the ContainerManager to 
send a kill event that kills the opportunistic container immediately (without 
waiting for a clean exit), although we need to discuss further whether that 
would be the correct behavior.

bq. If the ResourceUtilization methods are unrelated, let us do it on a 
separate JIRA so people are not confused when they see this commit.
Done. I created YARN-4895 and posted the patch.

bq. In case of guaranteed containers, ContainerManagerImpl receives and 
launches the containers, and eventually hands the monitoring task to 
ContainersMonitorImpl. Using ContainerStopMonitoringEvent to figure out when to 
start the next queued container seems confusing. Is there no other way around 
this? How about having ContainerImpl#sendFinishedEvents notify 
ContainerManagerImpl so it can consider launching a queued container?
Indeed, I overloaded the {{ContainerStopMonitoringEvent}}. We could create a 
separate event in the ContainerImpl, but in the current implementation that 
would simply duplicate the {{ContainerStopMonitoringEvent}}. Moreover, given 
that it is the {{ContainersMonitorImpl}} that does the bookkeeping of the 
resources used by the containers, I think it is the one that should receive 
the event (as currently happens in the patch), rather than the 
{{ContainerManagerImpl}}. Otherwise, the latter would simply have to pass a 
dummy event to the Monitor.
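
To make the argument concrete, here is a minimal sketch (not the code in the 
patch) of a component that owns the resource bookkeeping and therefore reacts 
to the stop-monitoring notification itself. The class {{MonitorSketch}}, its 
methods, and the memory-only accounting are hypothetical simplifications 
introduced for illustration; the real {{ContainersMonitorImpl}} differs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical stand-in for the monitor; NOT the actual ContainersMonitorImpl. */
class MonitorSketch {
  private final int capacityMb;
  private int usedMb = 0;
  // Memory held by each running container (the "bookkeeping").
  private final Map<String, Integer> running = new HashMap<>();
  // FIFO queue of container ids waiting for resources, plus their requests.
  private final Queue<String> queuedIds = new ArrayDeque<>();
  private final Map<String, Integer> queuedMb = new HashMap<>();

  MonitorSketch(int capacityMb) {
    this.capacityMb = capacityMb;
  }

  /** Starts the container if it fits and nothing is queued ahead of it; otherwise queues it. */
  boolean startOrQueue(String id, int mb) {
    if (queuedIds.isEmpty() && usedMb + mb <= capacityMb) {
      running.put(id, mb);
      usedMb += mb;
      return true;
    }
    queuedIds.add(id);
    queuedMb.put(id, mb);
    return false;
  }

  /**
   * Assumed handler for a stop-monitoring notification: releases the resources
   * of the finished container, then starts queued containers from the head of
   * the queue while they fit. Returns the ids that were started.
   */
  List<String> onStopMonitoring(String id) {
    List<String> started = new ArrayList<>();
    Integer freed = running.remove(id);
    if (freed == null) {
      return started; // unknown container, nothing to release
    }
    usedMb -= freed;
    while (!queuedIds.isEmpty()
        && usedMb + queuedMb.get(queuedIds.peek()) <= capacityMb) {
      String next = queuedIds.poll();
      int mb = queuedMb.remove(next);
      running.put(next, mb);
      usedMb += mb;
      started.add(next);
    }
    return started;
  }

  int usedMb() {
    return usedMb;
  }
}
```

Because the same object both releases the resources and decides what to start 
next, no second component has to forward a notification to it.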

bq. ContainerManagerImpl starts containers synchronously, without any events. I 
would expect QueuingContainerManagerImpl to queue containers the same way - 
synchronously. That would save us one set of event and event-type classes.
I agree it would indeed spare us an extra event type. However, the real 
benefit of using events is that we avoid synchronization problems (no 
additional locking of data structures is needed), which makes the code easier 
to follow without adding significant overhead.
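
As a rough illustration of that point (a generic sketch, not the patch and 
not YARN's dispatcher), the following shows several threads submitting 
requests as events while a single dispatcher thread is the only one that 
touches the container queue, so the queue itself needs no lock. The class 
{{QueuingDispatcherSketch}} and all of its members are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of event-based queuing; NOT the QueuingContainerManagerImpl. */
class QueuingDispatcherSketch {
  private static final String STOP = "__STOP__"; // poison pill that ends the loop

  // Thread-safe hand-off point between submitters and the dispatcher.
  private final BlockingQueue<String> events = new LinkedBlockingQueue<>();
  // Plain, unsynchronized queue: only the dispatcher thread mutates it.
  private final Queue<String> queuedContainers = new ArrayDeque<>();
  private final Thread dispatcher = new Thread(this::drain, "dispatcher");

  void start() {
    dispatcher.start();
  }

  /** Called from any thread; only enqueues an event. */
  void submit(String containerId) {
    events.add(containerId);
  }

  /** Stops the dispatcher and returns how many containers it queued. */
  int stopAndCount() {
    events.add(STOP);
    try {
      dispatcher.join(); // join() makes the dispatcher's writes visible here
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return queuedContainers.size();
  }

  private void drain() {
    try {
      while (true) {
        String event = events.take();
        if (event.equals(STOP)) {
          return;
        }
        queuedContainers.add(event); // single-threaded access, no lock needed
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Runs several concurrent submitters and returns the number of queued containers. */
  static int runDemo(int producers, int perProducer) {
    QueuingDispatcherSketch sketch = new QueuingDispatcherSketch();
    sketch.start();
    Thread[] threads = new Thread[producers];
    for (int p = 0; p < producers; p++) {
      final int id = p;
      threads[p] = new Thread(() -> {
        for (int i = 0; i < perProducer; i++) {
          sketch.submit("container_" + id + "_" + i);
        }
      });
      threads[p].start();
    }
    try {
      for (Thread t : threads) {
        t.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return sketch.stopAndCount();
  }
}
```

The synchronous alternative would have every submitting thread modify 
{{queuedContainers}} directly, which would require guarding it with a lock or 
switching to a concurrent collection.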

Let me know what you think. I will now post a new version of the patch fixing 
the remaining checkstyle and other issues.

> Queuing of container requests in the NM
> ---------------------------------------
>
>                 Key: YARN-2883
>                 URL: https://issues.apache.org/jira/browse/YARN-2883
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Konstantinos Karanasos
>            Assignee: Konstantinos Karanasos
>         Attachments: YARN-2883-trunk.004.patch, YARN-2883-trunk.005.patch, 
> YARN-2883-trunk.006.patch, YARN-2883-yarn-2877.001.patch, 
> YARN-2883-yarn-2877.002.patch, YARN-2883-yarn-2877.003.patch, 
> YARN-2883-yarn-2877.004.patch
>
>
> We propose to add a queue in each NM, where queueable container requests can 
> be held.
> Based on the available resources in the node and the containers in the queue, 
> the NM will decide when to allow the execution of a queued container.
> In order to ensure the instantaneous start of a guaranteed-start container, 
> the NM may decide to pre-empt/kill running queueable containers.



