[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271737#comment-15271737
 ] 

Konstantin Shvachko commented on HDFS-10301:
--------------------------------------------

??In the short term, however, I would prefer the current patch, since it 
involves no RPC changes, and doesn't require all the DataNodes to be upgraded 
before it can work.??

* I don't think my approach requires an RPC change, since the block-report RPC 
message already has all the required structures in place. It should require 
only a change in the processing logic.
* DataNodes will indeed need to be upgraded, but only if they split their 
block reports into multiple RPCs, because a full report already lists all 
storages. And even in the multi-RPC case it only means that zombie storages 
will not be removed until those DataNodes are upgraded.
* Colin, it would have been good to have an interim solution, but it does not 
seem reasonable to commit a patch that fixes one bug while introducing 
another.
I traced back a series of jiras related to this problem. It looks like 
multiple storages were not thoroughly thought through in the beginning, and 
that for a while people were trying to solve problems as they appeared. Feels 
like the time for the right fix.
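
A minimal sketch of one reading of the first two points, assuming a simplified 
model rather than the actual NameNode code: if a full block report already 
lists every storage the DataNode has, zombie storages could be derived by 
comparing that list with the storages the NameNode knows about, without 
relying on a per-storage report id. The class name, method name, and storage 
ids below are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ZombieByStorageList {
    // Hypothetical helper: storages the NameNode knows for a DataNode,
    // minus the storages listed in a full block report from that DataNode.
    static Set<String> findZombies(Set<String> knownStorages,
                                   Collection<String> reportedStorages) {
        Set<String> zombies = new TreeSet<>(knownStorages);
        zombies.removeAll(reportedStorages);
        return zombies;
    }

    public static void main(String[] args) {
        Set<String> known = new TreeSet<>(Arrays.asList("DS-1", "DS-2", "DS-3"));
        // The full report lists only DS-1 and DS-3, so DS-2 is no longer
        // present on the DataNode.
        List<String> fullReport = Arrays.asList("DS-1", "DS-3");
        System.out.println(findZombies(known, fullReport)); // prints [DS-2]
    }
}
```

Because the result depends only on the contents of a single report, the order 
in which concurrent reports are processed does not affect it in this model.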

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10301
>                 URL: https://issues.apache.org/jira/browse/HDFS-10301
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.1
>            Reporter: Konstantin Shvachko
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>         Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.01.patch, HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report. 
> Then it sends the block report again. The NameNode, while processing these 
> two reports at the same time, can then interleave the processing of storages 
> from the different reports. This screws up the blockReportId field, which 
> makes the NameNode think that some storages are zombies. Replicas from 
> zombie storages are immediately removed, causing missing blocks.
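
The interleaving described in the issue summary can be illustrated with a 
small simulation. This is a simplified model, not the actual NameNode code: it 
assumes each storage remembers the id of the last report that touched it, and 
that after the last storage of a report is processed, any storage carrying a 
different id is treated as a zombie. All class, method, and storage names are 
hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class InterleavedReports {
    // Simplified stand-in for per-storage state: storage id -> id of the
    // last block report that was processed for that storage.
    static final Map<String, Long> lastReportId = new TreeMap<>();

    // Process one storage's portion of the block report with the given id.
    static void processStorage(String storage, long reportId) {
        lastReportId.put(storage, reportId);
    }

    // Run after the last storage of a report: every storage not stamped
    // with this report's id looks like a zombie.
    static Set<String> zombiesAfter(long reportId) {
        Set<String> zombies = new TreeSet<>();
        for (Map.Entry<String, Long> e : lastReportId.entrySet()) {
            if (e.getValue() != reportId) {
                zombies.add(e.getKey());
            }
        }
        return zombies;
    }

    // Report 1 is the original, report 2 is the retransmission; both list
    // the same two healthy storages, but their processing is interleaved.
    static Set<String> simulate() {
        lastReportId.clear();
        processStorage("DS-1", 1L); // original, first storage
        processStorage("DS-1", 2L); // retransmission, first storage
        processStorage("DS-2", 2L); // retransmission, last storage
        zombiesAfter(2L);           // empty: both storages carry id 2
        processStorage("DS-2", 1L); // original, last storage
        return zombiesAfter(1L);    // DS-1 still carries id 2
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // prints [DS-1]
    }
}
```

In this model DS-1 is reported in both RPCs and is perfectly healthy, yet it 
is flagged as a zombie purely because of the processing order.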



