[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321613#comment-15321613
 ] 

Konstantin Shvachko commented on HDFS-10301:
--------------------------------------------

Sounds like you have been on a -1 spree lately, [~cmccabe]. Hope you are alright.

Here is why I think we should not commit your patch.
# The whole approach of keeping the state for the block report processing on 
the NameNode is error-prone. It assumes at-once execution, and therefore when 
block reports interleave the BR-state gets _messed up_. In particular, the 
BitSet used to mark storages that have been processed can be reset multiple 
times during interleaving and cannot be used to count storages in the report. 
In the current implementation the _messing-up_ of the BR-state leads to 
false-positive detection of a zombie storage and removal of a perfectly valid 
one.
# Your patch leaves the _messing-up_ of the BR-state in place (the BitSet is 
still inconsistent). It only tweaks it to avoid the false positive. It still 
allows false negatives, where a zombie storage goes undetected even though it 
is actually present.
# So the correct solution for the problem is to remove the BR-state altogether, 
which is achieved in Vinita's patch. And if we have a better solution, why 
settle for a temporary workaround? It may be a bigger change, but only because 
it removes the invalid logic related to the BR-state.
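
To make the BitSet argument in the first point concrete, here is a minimal sketch. It is not the actual NameNode code: the class {{BrStateSketch}}, its fields, and the reset-on-new-report-id rule are hypothetical stand-ins for the BR-state, chosen only to show how interleaving breaks the count.

```java
import java.util.BitSet;

// Hypothetical stand-in for the per-DataNode BR-state kept on the NameNode.
final class BrStateSketch {
    private long curReportId = -1;
    private final BitSet processed = new BitSet();

    // Marks one storage of a report as processed and returns how many
    // storages the state believes it has seen for that report so far.
    int processStorage(long reportId, int storageIndex) {
        if (reportId != curReportId) {
            // A different report id arrives: the state is reset (assumed rule).
            curReportId = reportId;
            processed.clear();
        }
        processed.set(storageIndex);
        return processed.cardinality();
    }

    public static void main(String[] args) {
        BrStateSketch state = new BrStateSketch();
        // Report 1 and its retransmission (report 2) each cover storages 0 and 1,
        // but their per-storage RPCs are processed interleaved.
        state.processStorage(1L, 0);
        state.processStorage(2L, 0);            // reset
        state.processStorage(1L, 1);            // reset again
        int seen = state.processStorage(2L, 1); // and again
        System.out.println("storages counted = " + seen); // prints 1, not 2
    }
}
```

Without interleaving the same calls count both storages; with interleaving every RPC resets the BitSet, so its cardinality says nothing about how many storages the report contained.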

It seems that you don't, or don't want to, understand the reasoning behind 
adding a separate storage reporting RPC call. At least you addressed it only by 
repeating your -1, for the third time, and did not respond to [~zhz]'s proposal 
to merge the storage reporting RPC into one of the storage reports in the next 
jira.
Given that, and in order to move forward, we should look into changing the 
last BR RPC call so that it also reports all storages.
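
As a rough illustration of that direction (a sketch under assumptions, not the patch itself): if the last BR RPC carries the full set of storage IDs the DataNode currently has, zombie detection becomes a plain set difference with no BR-state on the NameNode. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stateless zombie check: no per-report state is kept.
final class StatelessZombieCheckSketch {
    // Storages the NameNode knows about that the DataNode did not list
    // in its (assumed) full storage list are the zombies.
    static Set<String> findZombies(Set<String> knownOnNameNode,
                                   Set<String> reportedByDataNode) {
        Set<String> zombies = new TreeSet<>(knownOnNameNode);
        zombies.removeAll(reportedByDataNode);
        return zombies;
    }

    public static void main(String[] args) {
        Set<String> known = new TreeSet<>(Arrays.asList("s0", "s1", "s2"));
        Set<String> reported = new TreeSet<>(Arrays.asList("s0", "s1"));
        System.out.println(findZombies(known, reported)); // prints [s2]
    }
}
```

Because the result depends only on the two sets passed in, the order in which earlier storage RPCs were processed, or whether they were retransmitted, cannot change it.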

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10301
>                 URL: https://issues.apache.org/jira/browse/HDFS-10301
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.1
>            Reporter: Konstantin Shvachko
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>         Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.01.patch, 
> HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report. It 
> then sends the block report again. While processing these two reports at the 
> same time, the NameNode can interleave processing of storages from different 
> reports. This corrupts the blockReportId field, which makes the NameNode think 
> that some storages are zombies. Replicas from zombie storages are immediately 
> removed, causing missing blocks.
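
The failure described in the summary can be reproduced with a toy model. This is a simplified sketch, not the HDFS implementation: it assumes each storage is sent in its own RPC, that every storage remembers the id of the last report that touched it, and that the zombie check runs when the last RPC of a report is processed. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of zombie detection driven by a per-storage "last report id".
final class ZombieDetectionSketch {
    // storage id -> id of the last block report that touched it
    private final Map<String, Long> lastReportId = new LinkedHashMap<>();

    // Processes one storage RPC. On the last RPC of a report, every storage
    // whose recorded id differs from this report's id is declared a zombie.
    List<String> processStorageRpc(long reportId, String storage, boolean lastRpc) {
        lastReportId.put(storage, reportId);
        List<String> zombies = new ArrayList<>();
        if (lastRpc) {
            for (Map.Entry<String, Long> e : lastReportId.entrySet()) {
                if (e.getValue().longValue() != reportId) {
                    zombies.add(e.getKey());
                }
            }
        }
        return zombies;
    }

    public static void main(String[] args) {
        ZombieDetectionSketch nn = new ZombieDetectionSketch();
        // Report 1 timed out and was retransmitted as report 2; both cover s0 and s1.
        nn.processStorageRpc(1L, "s0", false);
        nn.processStorageRpc(2L, "s0", false);                      // retransmission overtakes
        List<String> zombies = nn.processStorageRpc(1L, "s1", true); // last RPC of report 1
        System.out.println(zombies); // prints [s0]: a healthy storage flagged as zombie
    }
}
```

Storage s0 is present in both reports, yet it is flagged because the retransmitted report overwrote its id before the original report's final RPC ran the check.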



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
