[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666401#comment-15666401 ]
Konstantinos Karanasos commented on YARN-4597: ---------------------------------------------- Thanks for the updated patch, [~asuresh]. Looks good to me. Some final comments below... All are minor, so up to you to address (I would only "insist" about the last one). - In the {{ContainerScheduler}}: -- In the comment for the runningContainers, let's mention that these are the running containers, including the containers that are in the process of transitioning from the SCHEDULED to the RUNNING state. I think the rest are details that might be confusing. -- In the {{updateQueuingLimit}}, you can do an extra check of the form {{if (this.queuingLimit.getMaxQueueLength() < queuedOpportunisticContainers.size())}} to avoid calling the shedding if the queue is not long enough. This might often be the case if the NM has imposed a small queue size. -- I was thinking that, although less likely than before, the fields of the {{OpportunisticContainersStatus()}} can still be updated during the {{getOpportunisticContainersStatus()}}. To avoid synchronization, we could set the fields using an event, and then in the {{getOpportunisticContainersStatus()}} we would just return the object. But if you think it is too much, we can leave it as is. -- In the {{onContainerCompleted}}, a container can belong either to queued guaranteed, to queued opportunistic or to running. So, you could avoid doing the remove from all lists once found in one of them. - In the {{YarnConfiguration}}, let's include in a comment that the max queue length coming from the RM is the globally max queue length. - In the {{SchedulerNode}}, I still suggest to put the {{++numContainers}} and the {{--numContainers}} inside the if statements. If I remember well, these fields are used for the web UI, so there will be a disconnect between the resources used (referring only to guaranteed containers) and the number of containers (referring to both guaranteed and opportunistic at the moment). The stats for the opportunistic containers are carried by the opportunisticContainersStatus, so we are good with reporting them too. Again, all comments are minor. +1 for the patch and thanks for all the work! > Add SCHEDULE to NM container lifecycle > -------------------------------------- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager > Reporter: Chris Douglas > Assignee: Arun Suresh > Labels: oct16-hard > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, > YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, > YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch, > YARN-4597.012.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org