[ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066518#comment-15066518 ]
Junping Du commented on YARN-914:
---------------------------------

Hi [~danzhi], thanks for sharing the information above, and welcome to the Apache Hadoop contributor community.

bq. Our implementation is much in sync with the architecture and idea in the JIRA design document.

Good to hear that we are on the same page. One thing we need to pay attention to: many patches have already been committed to trunk/branch-2.8. Since this is a continuing development effort on YARN, you need to remove code from your internal implementation that duplicates existing functionality or APIs before contributing. Otherwise reviewers and committers will have to spend extra effort working out which functionality and APIs are duplicated and which are not, which usually takes much longer.

bq. On the other hand, there are additional details and component level designs that the JIRA design document not necessarily discuss or touch. These details naturally surfaced up during the development iterations and the corresponding design became matured and stabilized.

I agree that the design document may miss some implementation details. However, more background and detail can be found in the JIRA discussion and in the patches themselves. Let me explain below.

bq. One example is the DecommissioningNodeWatcher, which embedded in ResourceTrackingService, tracks DECOMMISSIONING nodes status automatically and asynchronously after client/admin made the graceful decommission request. Another example is per node decommission timeout support, which is useful to decommission node that will be terminated soon.

Actually, our current design and committed patches already support the timeout feature. There are basically two ways to handle the timeout: on the RM side or on the CLI side. Both have pros and cons.
Per discussions above (https://issues.apache.org/jira/browse/YARN-914?focusedCommentId=14312677&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14312677 and https://issues.apache.org/jira/browse/YARN-914?focusedCommentId=14312677&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14312677), we (Jason, Vinod and I) all agreed to go with the CLI approach first, and we have already implemented it in a sub-JIRA (YARN-3225), which has been committed. Of course, we are open to the other implementation, but we want it to be controlled by an on/off configuration switch so that it doesn't affect the currently preferred option that we have already implemented.

bq. Are you able to share these details in an "augmented" design doc? Agreeing on the design would greatly help with review/commits later.

I would prefer that the effort focus on abstracting the different implementations for tracking/handling the timeout. This doesn't sound like an overall "augmented" design, given the earlier statement that your implementation is "much in sync" with the current architecture and design. Also, it is more appropriate to create a sub-JIRA to discuss your ideas and attach your document there, given that we already have a very long discussion here on the overall design.

bq. As far as implementation goes, it is recommended to create subtasks as you see fit. Note that it is easier to review smaller chunks of code. Also, since you guys have implemented it already, can you comment on how much of the code changes are in frequently updated parts? If not much, it might make sense to develop on a branch and merge it to trunk.

I would say most parts of YARN-914 are either already committed or have a patch available. Enhancing the timeout tracking/handling here doesn't sound like a massive amount of work, so a dedicated development branch seems unnecessary to me.
However, I would prefer to create a sub-JIRA to discuss the idea/scope and take a look at your demo code (with the duplicated code/features that are already committed or publicly available as patches removed) before making any judgement/decision.

[~danzhi], the concrete steps I would suggest for now are:
1. Review all JIRA discussions, design docs, and implementations under this umbrella JIRA so far, and understand the scope and the gap relative to your current internal implementation.
2. Raise a sub-JIRA to present your ideas/design and highlight the different options for discussion. If possible, attach a demo patch with any code or features similar to existing patches removed, for better understanding.

We can discuss later how to bring in your patch contribution. Make sense?

> (Umbrella) Support graceful decommission of nodemanager
> -------------------------------------------------------
>
>                 Key: YARN-914
>                 URL: https://issues.apache.org/jira/browse/YARN-914
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: graceful
>    Affects Versions: 2.0.4-alpha
>            Reporter: Luke Lu
>            Assignee: Junping Du
>         Attachments: Gracefully Decommission of NodeManager (v1).pdf,
> Gracefully Decommission of NodeManager (v2).pdf,
> GracefullyDecommissionofNodeManagerv3.pdf
>
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.),
> it's desirable to minimize the impact to running applications.
> Currently if a NM is decommissioned, all running containers on the NM need to
> be rescheduled on other NMs. Further more, for finished map tasks, if their
> map output are not fetched by the reducers of the job, these map tasks will
> need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a
> node manager.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)