[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216921#comment-15216921 ]
Konstantinos Karanasos commented on YARN-2883:
----------------------------------------------

Thanks for the comments, [~kasha].

bq. What happens if there is a delay in killing the opportunistic containers? If we launch the opportunistic containers with the right restrictions (cgroups in Linux), do we really need to hold off on launching guaranteed?

I think that in this initial version we should wait for the containers to be killed, as we currently do, since cgroups are often not used in practice (at least not yet). As a later improvement, we can allow the ContainerManager to send a kill event that kills the opportunistic container immediately (without waiting for a clean exit), although we need to discuss further whether that would be the correct behavior.

bq. If the ResourceUtilization methods are unrelated, let us do it on a separate JIRA so people are not confused when they see this commit.

Done. I created YARN-4895 and posted the patch.

bq. In case of guaranteed containers, ContainerManagerImpl receives and launches the containers, and eventually handles the monitoring task to ContainersMonitorImpl. Using ContainerStopMonitoringEvent to figure out when to start the next queued container seems confusing. Is there no other way around this? How about having ContainerImpl#sendFinishedEvents notify ContainerManagerImpl so it can consider launching a queued container?

Indeed, I overloaded the {{ContainerStopMonitoringEvent}}. We could create a separate event in the ContainerImpl, but in the current implementation it would simply duplicate the {{ContainerStopMonitoringEvent}}. Moreover, given that the {{ContainersMonitorImpl}} is the one doing the bookkeeping of the resources used by the containers, I think it should be the one to receive the event (as currently happens in the patch), rather than the {{ContainerManagerImpl}}. Otherwise, the latter would simply have to pass a dummy event to the Monitor.

bq. ContainerManagerImpl starts containers synchronously, without any events. I would expect QueuingContainerManagerImpl to queue containers the same way - synchronously. That would save us one set of event and event-type classes.

I agree that it would spare us an extra event type. However, the real benefit of using events is that we avoid synchronization problems (no additional locking of data structures is needed), which makes the code easier to follow without adding significant overhead. Let me know what you think.

I will now post a new version of the patch fixing the remaining checkstyle and other issues.

> Queuing of container requests in the NM
> ---------------------------------------
>
>                 Key: YARN-2883
>                 URL: https://issues.apache.org/jira/browse/YARN-2883
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Konstantinos Karanasos
>            Assignee: Konstantinos Karanasos
>         Attachments: YARN-2883-trunk.004.patch, YARN-2883-trunk.005.patch,
> YARN-2883-trunk.006.patch, YARN-2883-yarn-2877.001.patch,
> YARN-2883-yarn-2877.002.patch, YARN-2883-yarn-2877.003.patch,
> YARN-2883-yarn-2877.004.patch
>
>
> We propose to add a queue in each NM, where queueable container requests can
> be held.
> Based on the available resources in the node and the containers in the queue,
> the NM will decide when to allow the execution of a queued container.
> In order to ensure the instantaneous start of a guaranteed-start container,
> the NM may decide to pre-empt/kill running queueable containers.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
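
For readers skimming the thread, the queuing behavior described in the issue summary above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not code from the patch: the class {{NodeQueueSketch}}, its {{Request}} type, and all method names are hypothetical, resources are reduced to a single vcore count, and a kill is modeled as taking effect immediately, whereas the comment above proposes waiting for the killed containers to exit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: none of these names come from the YARN-2883 patch.
class NodeQueueSketch {
    enum Type { GUARANTEED, OPPORTUNISTIC }

    static final class Request {
        final String id;
        final Type type;
        final int vcores;

        Request(String id, Type type, int vcores) {
            this.id = id;
            this.type = type;
            this.vcores = vcores;
        }
    }

    private final int capacity;                        // total vcores on the node
    private int used;                                  // vcores held by running containers
    private final List<Request> running = new ArrayList<>();
    private final Deque<Request> queue = new ArrayDeque<>();

    NodeQueueSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Starts the request if it fits; otherwise queues it (after pre-empting, if guaranteed). */
    void submit(Request r) {
        if (r.type == Type.GUARANTEED) {
            // Kill the most recently started queueable containers until r fits.
            // Assumption: the kill frees resources at once (no wait for a clean exit).
            for (int i = running.size() - 1; i >= 0 && used + r.vcores > capacity; i--) {
                if (running.get(i).type == Type.OPPORTUNISTIC) {
                    used -= running.remove(i).vcores;
                }
            }
        }
        if (used + r.vcores <= capacity) {
            start(r);
        } else if (r.type == Type.GUARANTEED) {
            queue.addFirst(r);   // ahead of any queued opportunistic requests
        } else {
            queue.addLast(r);
        }
    }

    /** Frees a finished container's resources, then starts queued requests that now fit. */
    void onFinished(String id) {
        for (int i = 0; i < running.size(); i++) {
            if (running.get(i).id.equals(id)) {
                used -= running.remove(i).vcores;
                break;
            }
        }
        while (!queue.isEmpty() && used + queue.peekFirst().vcores <= capacity) {
            start(queue.pollFirst());
        }
    }

    private void start(Request r) {
        running.add(r);
        used += r.vcores;
    }

    List<String> runningIds() {
        List<String> ids = new ArrayList<>();
        for (Request r : running) {
            ids.add(r.id);
        }
        return ids;
    }

    int queuedCount() {
        return queue.size();
    }
}
```

In this sketch, {{onFinished}} plays the role that the thread assigns to the stop-monitoring event: the component that tracks resource usage learns that a container has finished and uses that moment to decide whether a queued request can start.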