[ https://issues.apache.org/jira/browse/HADOOP-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12484541 ]

Raghu Angadi commented on HADOOP-1134:
--------------------------------------


Upgrading from previous versions requires quite a bit of interaction with the 
namenode. This results in quite a bit of upgrade-specific code in both the 
namenode and the datanode, and will add a few new RPCs too. I propose to 
clearly mark such code as temporary and remove it in the next major version. 
If anyone tries to upgrade directly from a pre-block-crc version to a 
post-block-crc version, the namenode and datanode will exit with a clear 
message that DFS first needs to be upgraded to the block-crc version.
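As a rough illustration of the exit-with-a-clear-message behavior, the 
startup check could look like the sketch below. The class name, method, and 
version constant are hypothetical placeholders, not the actual HADOOP-1134 
code; the sketch assumes the usual HDFS convention that layout versions are 
negative integers that grow more negative with each layout change.

  // Hypothetical sketch of the proposed startup check; names and the
  // version value are illustrative, not the real HADOOP-1134 code.
  public class LayoutVersionCheck {

    // Hypothetical layout version at which block-level CRCs appear.
    static final int BLOCK_CRC_LAYOUT_VERSION = -6;

    /**
     * Called during namenode/datanode startup. If the on-disk layout
     * predates the block-crc upgrade, exit with a clear message instead
     * of attempting a multi-version upgrade.
     */
    static void checkLayoutVersion(int storedVersion) {
      // Layout versions are negative; an older layout has a larger value.
      if (storedVersion > BLOCK_CRC_LAYOUT_VERSION) {
        System.err.println(
            "Stored layout version " + storedVersion + " predates " +
            "block-level CRCs. Please upgrade DFS to the block-crc " +
            "release first, then upgrade to this version.");
        System.exit(-1);
      }
    }
  }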

Regarding getting rid of .crc files: 

This will be done as part of the upgrade. The namenode enters a special safe 
mode during the upgrade. 

  1) It keeps track of all the blocks that belong to non-".crc" files. 
  2) Once all the replicas of a file's blocks are upgraded, it marks the 
corresponding .crc file to be deleted at the end of the special mode. 
  3) At the end of the special mode, it deletes all the .crc files that are 
enqueued (see the sketch below).
  4) When everything goes well, there should not be any .crc files left. The 
namenode prints all the .crc files still left in the system to the log.
  5) Users will be advised to delete the extra .crc files manually.

Another option is to let the namenode delete all ".crc" files at the end of 
the upgrade.
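For concreteness, here is a rough sketch of the namenode-side bookkeeping 
for steps 1-3 above. All names are hypothetical; the real data structures 
would live inside the namenode and be driven by datanode upgrade reports. 
The dot-file path construction follows the existing ChecksumFileSystem 
convention of storing the checksum for "foo" as ".foo.crc" next to it.

  import java.util.*;

  // Hypothetical sketch of the upgrade bookkeeping described in the
  // steps above; not the actual HADOOP-1134 implementation.
  class CrcUpgradeTracker {

    // Non-".crc" files whose blocks still await upgraded replicas.
    private final Map<String, Set<Long>> pendingBlocks = new HashMap<>();
    // .crc files queued for deletion once the special mode ends.
    private final List<String> crcFilesToDelete = new ArrayList<>();

    // Step 1: track all the blocks that belong to non-".crc" files.
    void addFile(String path, Set<Long> blockIds) {
      pendingBlocks.put(path, new HashSet<>(blockIds));
    }

    // Called when all replicas of a block are reported upgraded.
    void blockUpgraded(String path, long blockId) {
      Set<Long> remaining = pendingBlocks.get(path);
      if (remaining != null && remaining.remove(blockId)
          && remaining.isEmpty()) {
        pendingBlocks.remove(path);
        // Step 2: mark the corresponding .crc file for deletion.
        crcFilesToDelete.add(crcPathFor(path));
      }
    }

    // Step 3: at the end of the special mode, delete everything enqueued.
    List<String> drainCrcDeletions() {
      List<String> result = new ArrayList<>(crcFilesToDelete);
      crcFilesToDelete.clear();
      return result;
    }

    // Checksum file for "/a/b/foo" is "/a/b/.foo.crc".
    private static String crcPathFor(String path) {
      int slash = path.lastIndexOf('/') + 1;
      return path.substring(0, slash) + "." + path.substring(slash) + ".crc";
    }
  }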

During the upgrade, datanodes do not delete the blocks that belong to .crc 
files. These blocks will be deleted when the cluster resumes normal 
functionality (during a block report, or if the blocks are read by a 
client). These blocks will not have corresponding crc files on the datanode.
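As a minimal illustration of that filtering, assuming the datanode knows 
which block ids back .crc files (how it learns this is a hypothetical 
detail, not spelled out here):

  import java.util.Set;

  // Hypothetical helper: during the upgrade the datanode skips blocks
  // that back ".crc" files instead of upgrading or deleting them. Normal
  // block reports later let the namenode invalidate them once the .crc
  // files are gone from the namespace.
  class CrcBlockFilter {
    static boolean upgradeThisBlock(long blockId, Set<Long> crcFileBlockIds) {
      // Blocks of .crc files get no per-block checksum metadata and are
      // left in place; everything else goes through the crc upgrade.
      return !crcFileBlockIds.contains(blockId);
    }
  }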

Also note that the namenode and datanodes may be restarted at any time 
during the upgrade process. So all the code will be written assuming the 
Java process could be killed at any time.
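One common way to get that kind of crash-safety is to commit upgrade 
progress with a write-then-rename, as in the hypothetical sketch below. The 
file name and format are made up, and the atomic-commit property of the 
rename assumes a POSIX filesystem.

  import java.io.*;

  // Hypothetical sketch of crash-safe progress recording for the upgrade;
  // the state file name and its contents are illustrative only.
  class UpgradeStateRecorder {
    private final File stateFile;

    UpgradeStateRecorder(File storageDir) {
      this.stateFile = new File(storageDir, "blockcrc.upgrade.state");
    }

    // Write to a temp file and rename, so a kill at any point leaves
    // either the old state or the new state, never a torn file.
    void save(String state) throws IOException {
      File tmp = new File(stateFile.getParent(), stateFile.getName() + ".tmp");
      try (Writer w = new FileWriter(tmp)) {
        w.write(state);
      }
      if (!tmp.renameTo(stateFile)) {
        throw new IOException("could not commit upgrade state");
      }
    }

    // On restart, resume from the last committed state (null = start fresh).
    String load() throws IOException {
      if (!stateFile.exists()) return null;
      try (BufferedReader r = new BufferedReader(new FileReader(stateFile))) {
        return r.readLine();
      }
    }
  }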

When is the upgrade considered complete?
----------------------------------------

I propose that the following conditions be met:

1) All the datanodes that have registered should report completion of their 
upgrade. If the namenode restarts, each datanode will re-register and report 
its completion again.
2) Similar to the current safe mode, "dfs.safemode.threshold.pct" of the 
blocks that belong to non-.crc files should have at least 
"dfs.replication.min" replicas reported as upgraded (see the sketch below).
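Put together, the completion test could look like the sketch below. Method 
and parameter names are illustrative; only the two quoted configuration 
keys come from the proposal above.

  // Hypothetical sketch combining the two completion conditions.
  class UpgradeCompletion {
    static boolean isUpgradeComplete(int registeredDatanodes,
                                     int datanodesReportedDone,
                                     long nonCrcBlocks,
                                     long blocksWithMinReplicasUpgraded,
                                     float safemodeThresholdPct) {
      // Condition 1: every datanode that registered has reported its
      // upgrade done (re-registration after a namenode restart re-reports).
      boolean allDatanodesDone =
          datanodesReportedDone >= registeredDatanodes;
      // Condition 2: "dfs.safemode.threshold.pct" of the non-.crc blocks
      // have at least "dfs.replication.min" replicas reported upgraded.
      boolean enoughBlocksDone = blocksWithMinReplicasUpgraded
          >= (long) (safemodeThresholdPct * nonCrcBlocks);
      return allDatanodesDone && enoughBlocksDone;
    }
  }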








> Block level CRCs in HDFS
> ------------------------
>
>                 Key: HADOOP-1134
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1134
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Raghu Angadi
>         Assigned To: Raghu Angadi
>
> Currently CRCs are handled at the FileSystem level and are transparent to 
> core HDFS. See the recent improvement HADOOP-928 (which can add checksums 
> to a given filesystem) for more about it. Though this has served us well, 
> there are a few disadvantages:
> 1) This doubles the namespace in HDFS (or other filesystem 
> implementations). In many cases, it nearly doubles the number of blocks. 
> Taking CRCs out of the namenode would nearly double namespace performance, 
> both in terms of CPU and memory.
> 2) Since CRCs are transparent to HDFS, it cannot actively detect corrupted 
> blocks. With block-level CRCs, the datanode can periodically verify the 
> checksums and report corruptions to the namenode so that new replicas can 
> be created.
> We propose to have CRCs maintained for all HDFS data in much the same way 
> as in GFS. I will update the jira with detailed requirements and design. 
> This will include the same guarantees provided by the current 
> implementation and will include an upgrade of current data.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
