[ https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841600#comment-13841600 ]

Suresh Srinivas edited comment on HDFS-4114 at 12/6/13 7:54 PM:
----------------------------------------------------------------

bq. As it stands today BackupNode is the only extension of the NameNode in the current code base.

I do not think that is a sufficient reason to retain BackupNode. If you really want to show how Namenode can be extended, you could contribute another simpler, easier-to-maintain example that extends Namenode. In fact, some of the constructs that are used only by BackupNode are, I reckon, not what extensions of Namenode would use. Some examples:
# It uses JournalProtocol and NamenodeProtocol to co-ordinate checkpointing. This is no longer necessary with the improvements in edits, where checkpointing can be done at any time without the need to roll, start checkpoint, and end checkpoint.
# A lot of code in FSImage and FSEditLog caters to this, just for BackupNode. This code is not well documented and adds unnecessary complexity.

As you can see from the early patch, we can remove approximately 5000 lines of code. This code belongs to functionality that no one tests or uses. In fact, I would not be surprised if there are bugs lurking in that functionality that might cause major issues for a misguided user who ends up using it. Given that, I believe BackupNode should be removed.

As for whether any code that helps extend the Namenode is being removed, I would like to see a proposal on what extending a Namenode means and which functionality relevant to that is being removed in my patch.

bq. You are right it's been a while and I have a debt to provide proper ones, which is on my todo list.

I fail to understand what the plans for BackupNode are and why it is relevant anymore. Describing that would help.

bq. If you wish we can assign this issue to me so that I could take care of it in the future.

I wish just assigning a bug to you would have been that easy.
When making changes in the code with a feature in mind, there is a lot of this unused code, and there are tests, that also need to change. This is currently a tax that feature developers are paying. The folks working on a feature have a time frame that they are working towards. Having to depend on you for related changes means having to co-ordinate the work with you and get it done within that timeline. This will be work not only for you, but also for the people working on features. It is hard for me to see why we should spend all that effort.

I can give you a few examples where folks had to do all this unnecessary work:
- When we did protobuf support, we had to add support for all the protocols that are only used by BackupNode.
- In HA, considerable coding and testing effort went into supporting BackupNode.
- Recently, when I worked on the retry cache, I spent a lot of time just understanding how all this works and added support for retriability.
- I also know that [~jingzhao] and [~wheat9] spent time on BackupNode-specific functionality when working on http policy and https support related cleanup.

Unless there are justified reasons for retaining this functionality, regular contributors to HDFS will have to continue to pay this cost. We have waited almost a year for a plan for taking BackupNode forward. With Namenode HA stabilizing, I am also not sure how relevant such a plan would be, even if there is one.

One suggestion is to move this functionality to GitHub, where you could maintain it as HDFS changes. This is in essence equivalent to involving you in maintaining BackupNode-related functionality for features added to HDFS, without the cost of co-ordination.


> Deprecate the BackupNode and CheckpointNode in 2.0
> --------------------------------------------------
>
>                 Key: HDFS-4114
>                 URL: https://issues.apache.org/jira/browse/HDFS-4114
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Eli Collins
>            Assignee: Suresh Srinivas
>         Attachments: HDFS-4114.patch
>
>
> Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the
> BackupNode and CheckpointNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)