[jira] [Comment Edited] (HDFS-4114) Deprecate the BackupNode and CheckpointNode in 2.0

Suresh Srinivas (JIRA) Fri, 06 Dec 2013 11:56:02 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841600#comment-13841600
 ]


Suresh Srinivas edited comment on HDFS-4114 at 12/6/13 7:54 PM:
----------------------------------------------------------------

bq. As it stands today BackupNode is the only extension of the NameNode in the 
current code base.
I do not think that is a sufficient reason to retain BackupNode. If you really 
want to show how Namenode can be extended, you could contribute another 
simpler, easier-to-maintain example that extends Namenode. In fact, some of the 
constructs that are used only by BackupNode are not, I reckon, what extensions 
of Namenode would use. Some examples:
# It uses JournalProtocol and NamenodeProtocol to co-ordinate checkpointing. 
This is no longer necessary with the improvements in edits, where checkpointing 
can be done at any time without the need to roll, start checkpoint, and end 
checkpoint.
# A lot of code in FSImage and FSEditLog caters to this, just for BackupNode. 
This code is not well documented and adds unnecessary complexity.

As you can see from the early patch, we can remove approximately 5000 lines of 
code. This code belongs to functionality that no one tests or uses. In fact, I 
would not be surprised if there are bugs lurking in that functionality that 
might cause major issues for a misguided user who ends up using it.

Given that, I believe BackupNode should be removed. As for whether any code 
that helps in extending Namenode is being removed, I would like to see a 
proposal on what extending a Namenode means and which functionality relevant 
to that is being removed in my patch.

bq. You are right it's been a while and I have a debt to provide proper ones, 
which is on my todo list.
I fail to understand what the plans for BackupNode are and why it is still 
relevant. Describing that would help.

bq.  If you wish we can assign this issue to me so that I could take care of it 
in the future.
I wish it were as easy as just assigning a bug to you. When making changes in 
the code with a feature in mind, there is a lot of unused code and tests that 
also need to change. This is currently a tax that feature developers are 
paying. The folks working on a feature have a time frame that they are working 
towards. Having to depend on you for related changes means having to 
co-ordinate the work with you and getting the work done within the timeline. 
This will be work not only for you, but also for the people working on 
features. It is hard for me to see why we should spend all that effort.

I can give you a few examples where folks had to do all this unnecessary work:
- When we did protobuf support, we had to add support for all the protocols 
that are only used by BackupNode.
- In HA, considerable coding and testing effort went into supporting BackupNode.
- Recently, when I worked on retry cache, I spent a lot of time just 
understanding how all this works and adding support for retriability.
- I also know that [~jingzhao] and [~wheat9] spent time on BackupNode-specific 
functionality when working on http policy and https support related cleanup.

Unless there are justified reasons for retaining this functionality, regular 
contributors to HDFS will have to continue to pay this cost. We have waited 
almost a year for a plan for taking BackupNode forward. With Namenode HA 
stabilizing, I am also not sure how relevant such a plan would be, even if 
there is one.

A suggestion is to move this functionality to GitHub, where you could maintain 
it as HDFS changes. This is in essence equivalent to involving you in 
maintaining BackupNode-related functionality for features added to HDFS, 
without the cost of co-ordination.


> Deprecate the BackupNode and CheckpointNode in 2.0
> --------------------------------------------------
>
>                 Key: HDFS-4114
>                 URL: https://issues.apache.org/jira/browse/HDFS-4114
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Eli Collins
>            Assignee: Suresh Srinivas
>         Attachments: HDFS-4114.patch
>
>
> Per the thread on hdfs-dev@ (http://s.apache.org/tMT) let's remove the 
> BackupNode and CheckpointNode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
