[ https://issues.apache.org/jira/browse/YARN-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606205#comment-14606205 ]
Chris Douglas commented on YARN-3784:
-------------------------------------

Minor:
- Docs for timeout don't include units
- Many whitespace changes in {{FiCaSchedulerApp}}
- change nested if to {{&&}} at:
{noformat}
+    if (this.preemptionTimeout != 0) {
+      if (timeout > this.preemptionTimeout) {
{noformat}
- Would it be possible to test more than the timeout reported is non-zero? If this used a {{Clock}} instead of calling {{System.currentTimeMillis}} directly, the unit test could be easier to write...

If containers are preempted for multiple causes (e.g., over-capacity, NM decommission), then the time to preempt could vary widely. The ProportionalCPP also limits the preempted capacity per round, so a global timeout will be very pessimistic. Would it make sense to change {{timeout}} to be {{nextkill}}? More general solutions would be significantly more work...

> Indicate preemption timout along with the list of containers to AM (preemption message)
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-3784
>                 URL: https://issues.apache.org/jira/browse/YARN-3784
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3784.patch
>
>
> Currently during preemption, AM is notified with a list of containers which
> are marked for preemption. Introducing a timeout duration also along with
> this container list so that AM can know how much time it will get to do a
> graceful shutdown to its containers (assuming one of preemption policy is
> loaded in AM).
> This will help in decommissioning NM scenarios, where NM will be
> decommissioned after a timeout (also killing containers on it). This timeout
> will be helpful to indicate AM that those containers can be killed by RM
> forcefully after the timeout.
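
For illustration, the two code-level suggestions in the comment (collapsing the nested if into a single {{&&}} condition, and reading time from an injected clock rather than {{System.currentTimeMillis}}) could be sketched roughly as below. This is not the patch's code: the class {{PreemptionTimeoutTracker}}, its methods, the use of {{LongSupplier}} as a stand-in for a clock, and the meaning given to the comparison (elapsed time exceeding the configured limit) are all assumptions made for the example.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch; names and semantics are illustrative and are
 * not taken from 0001-YARN-3784.patch.
 */
class PreemptionTimeoutTracker {
    // Injected time source in milliseconds (stand-in for a Clock).
    private final LongSupplier clock;
    // Configured limit in milliseconds; 0 is assumed to mean "not set".
    private final long preemptionTimeout;
    // Time at which the container was marked for preemption.
    private final long markedAt;

    PreemptionTimeoutTracker(LongSupplier clock, long preemptionTimeoutMs) {
        this.clock = clock;
        this.preemptionTimeout = preemptionTimeoutMs;
        this.markedAt = clock.getAsLong();
    }

    /** Milliseconds elapsed since the container was marked. */
    long elapsedMs() {
        return clock.getAsLong() - markedAt;
    }

    /** One && condition replaces the nested if from the review comment. */
    boolean isExpired() {
        long timeout = elapsedMs();
        return this.preemptionTimeout != 0 && timeout > this.preemptionTimeout;
    }
}
```

With the time source injected, a unit test can pass in a value it controls (for example {{someAtomicLong::get}}) and assert on exact elapsed values instead of only checking that the reported timeout is non-zero; production code would pass {{System::currentTimeMillis}}.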