[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun Suresh updated YARN-4597:
------------------------------
    Attachment: YARN-4597.010.patch

Updating the patch to address [~kkaranasos]'s and [~kasha]'s comments. I've responded to Karthik's comments directly on GitHub.

[~kkaranasos],

bq. queuedOpportunisticContainers will have concurrency issues. We are updating it when containers arrive but also in the shedQueuedOpportunisticContainers.

Good catch. I've fixed it in the latest patch by sending a 'SHED_QUEUED_CONTAINERS' event to the ContainerScheduler when the Node HB response from the RM asks to shed queued containers. Besides preserving the invariant that ContainerScheduler operations are serialized, this also ensures that the Node HB thread is not blocked (see the first sketch below).

bq. shedQueuedOpportunisticContainers: numAllowed is the number of allowed containers in the queue. Instead, we are killing numAllowed containers. In other words, we should not kill numAllowed, but queuedOpportunisticContainers.size() - numAllowed.

Even though I agreed with you offline, I took another look, and the logic is actually correct. The {{numAllowed}} counter is initialized to maxAllowed and then decremented in the loop; containers are killed only AFTER its value goes <= 0 (see the second sketch below). In any case, I've added a test case in 'TestContainerSchedulerQueuing' to verify that this actually works.

bq. line 252, indeed we can now do extraOpportContainersToKill -> opportContainersToKill, as Karthik mentioned in a comment.

I think 'extra' is still apt: as I mentioned to Karthik, these are 'extra' opportunistic containers over and above what is already present in 'oppContainersToKill'.

bq. queuedGuaranteedContainers and queuedOpportunisticContainers: I think we should use queues. I don't think we retrieve the container by the key anywhere either way.

Karthik mentioned this in his comments too. A LinkedHashMap is essentially an indexed queue. Additionally, there is actually one case where we need to retrieve by key: when the AM asks to kill a container that is still queued (see the third sketch below). Furthermore, queue re-ordering etc. might be easier with a map. Let's keep it as a LinkedHashMap unless we find it is detrimental in some way.

bq. oppContainersMarkedForKill: could be a Set, right?

Technically yes, but I would have to modify Container to add equals and hashCode too, which I felt was too much of a hassle. I prefer to keep it as it is.

bq. fields of the opportunisticContainersStatus are set in different places. Due to that, when we call getOpportunisticContainersStatus() we may see an inconsistent object. Let's set the fields only in the getOpportunisticContainersStatus().

Agreed; the fields are now populated only in getOpportunisticContainersStatus() (see the fourth sketch below). I've also added the oppMem, oppCores & numOpp values to the NodeManagerMetrics.

bq. There seem to be two redundant parameters at YarnConfiguration at the moment: NM_CONTAINER_QUEUING_MIN_QUEUE_LENGTH and NM_OPPORTUNISTIC_CONTAINERS_MAX_QUEUE_LENGTH. If I am not missing something, we should keep one of the two.

Actually, they are a bit different. NM_CONTAINER_QUEUING_MIN_QUEUE_LENGTH and the corresponding max value are used by the RM to calculate a limit value for the queue, which the queue can momentarily exceed. NM_OPPORTUNISTIC_CONTAINERS_MAX_QUEUE_LENGTH, on the other hand, is enforced in the NM to prevent queuing beyond that value: it is a configured hard limit ([~kasha] had requested it). The last sketch below contrasts the two.
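To make the heartbeat-triggered shedding concrete, here is a minimal sketch of the hand-off described above. The class and method names are illustrative stand-ins (the patch wires this through the NM dispatcher), not the exact code:

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: the Node HB thread only enqueues a shed request; the scheduler's
// single event thread performs the actual queue mutation. This keeps all
// ContainerScheduler operations serialized and never blocks the HB thread.
class ContainerSchedulerSketch {
  private final ExecutorService eventThread = Executors.newSingleThreadExecutor();

  // Called from the Node HB thread when the RM response asks to shed.
  void onShedQueuedContainersEvent(final int maxAllowedQueueLength) {
    eventThread.submit(() -> shedQueuedOpportunisticContainers(maxAllowedQueueLength));
  }

  // Runs only on the event thread, like every other scheduler operation,
  // so no extra locking of the queues is needed.
  private void shedQueuedOpportunisticContainers(int maxAllowed) {
    // ... kill queued opportunistic containers beyond maxAllowed ...
  }
}
{code}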
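Second, a sketch of the {{numAllowed}} counter logic, using plain stand-in types (the real method operates on the scheduler's queues and sends kill events):

{code:java}
import java.util.Iterator;
import java.util.LinkedHashMap;

class ShedLogicSketch {
  // numAllowed starts at maxAllowed and is decremented once per queued
  // container; a container is killed only after the counter has dropped
  // to zero, so exactly size() - maxAllowed containers are killed.
  static <K, C> void shedQueued(LinkedHashMap<K, C> queued, int maxAllowed) {
    int numAllowed = maxAllowed;
    Iterator<C> it = queued.values().iterator();
    while (it.hasNext()) {
      C container = it.next();
      if (numAllowed <= 0) {
        it.remove();
        kill(container); // only the containers past the allowed count
      }
      numAllowed--;
    }
  }

  static <C> void kill(C container) { /* send a kill event */ }
}
{code}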
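Third, on the LinkedHashMap point, a small sketch of the 'indexed queue' behaviour (ContainerId/Container below are stand-ins for the YARN types):

{code:java}
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

class QueuedContainersSketch {
  static class ContainerId {} // stand-in
  static class Container {}   // stand-in

  private final LinkedHashMap<ContainerId, Container> queued = new LinkedHashMap<>();

  // Tail of the "queue": LinkedHashMap iteration preserves insertion order.
  void enqueue(ContainerId id, Container c) {
    queued.put(id, c);
  }

  // Head of the "queue": the first entry in iteration order.
  Container dequeueOldest() {
    Iterator<Map.Entry<ContainerId, Container>> it = queued.entrySet().iterator();
    if (!it.hasNext()) {
      return null;
    }
    Container head = it.next().getValue();
    it.remove();
    return head;
  }

  // The keyed access a plain queue cannot do cheaply: the AM asking to
  // kill a container that is still queued.
  Container removeQueued(ContainerId id) {
    return queued.remove(id);
  }
}
{code}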
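Fourth, the status-object consistency point is simply that all fields are populated in one place; a sketch with a simplified stand-in for OpportunisticContainersStatus:

{code:java}
class StatusSketch {
  static class OpportunisticContainersStatus {
    long oppMemUsedMB;
    int oppCoresUsed;
    int runningOppContainers;
  }

  private long oppMemUsedMB;
  private int oppCoresUsed;
  private int runningOppContainers;

  // All fields are set here, in one place, so a caller never observes an
  // object whose fields were populated at different times. Updaters of the
  // source fields would synchronize on the same monitor.
  synchronized OpportunisticContainersStatus getOpportunisticContainersStatus() {
    OpportunisticContainersStatus status = new OpportunisticContainersStatus();
    status.oppMemUsedMB = oppMemUsedMB;
    status.oppCoresUsed = oppCoresUsed;
    status.runningOppContainers = runningOppContainers;
    return status;
  }
}
{code}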
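Finally, a sketch contrasting the two configuration knobs. The formula deriving the RM-side target from load is invented for illustration; only the split of responsibilities (RM-computed soft target vs. NM-enforced hard cap) reflects the discussion above:

{code:java}
class QueueLimitSketch {
  // RM side: NM_CONTAINER_QUEUING_MIN_QUEUE_LENGTH and its max counterpart
  // bound a target queue length the RM computes and ships to NMs in the
  // heartbeat response. A queue may momentarily exceed this target.
  static int rmTargetQueueLength(int min, int max, double loadFactor) {
    int target = (int) (min + loadFactor * (max - min)); // illustrative formula
    return Math.max(min, Math.min(max, target));
  }

  // NM side: NM_OPPORTUNISTIC_CONTAINERS_MAX_QUEUE_LENGTH is a hard limit
  // checked before a container is ever queued.
  static boolean canQueue(int currentQueueLength, int nmHardMax) {
    return currentQueueLength < nmHardMax;
  }
}
{code}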
> Add SCHEDULE to NM container lifecycle
> --------------------------------------
>
>                 Key: YARN-4597
>                 URL: https://issues.apache.org/jira/browse/YARN-4597
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Chris Douglas
>            Assignee: Arun Suresh
>              Labels: oct16-hard
>         Attachments: YARN-4597.001.patch, YARN-4597.002.patch, YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, YARN-4597.009.patch, YARN-4597.010.patch
>
> Currently, the NM immediately launches containers after resource localization. Several features could be more cleanly implemented if the NM included a separate stage for reserving resources.