[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403045#comment-15403045
 ] 

Konstantin Shvachko commented on HDFS-10301:
--------------------------------------------

We are actively looking into a possible problem with this change. LMK if the 
revert fixes the problem. Just to clarify: are you using per-storage reports on 
your cluster?
In the meantime, here are answers to your questions, Daryn.

??Why is this patch changing per-storage reports when it's the single-rpc 
report that is the problem???
The problem is with both single-rpc and per-storage reports. In the multi-rpc 
case, DNs can send repeated RPCs for each storage, and this will cause incorrect 
zombie detection if the RPCs are processed out of order.
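
The out-of-order case can be illustrated with a small sketch. This is a simplified model, not the actual NameNode code: the class {{ZombieDetectionSketch}}, the method {{processStorageReport}}, and the storage IDs are hypothetical names chosen for illustration. It assumes each storage is stamped with the ID of the last report that touched it, and that the final RPC of a report treats every storage carrying a different ID as a zombie.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of report-ID-based zombie detection.
// This is NOT the actual NameNode code; all names are illustrative.
class ZombieDetectionSketch {
    // Last block report ID seen for each storage of one DataNode.
    private final Map<String, Long> lastReportIdByStorage = new LinkedHashMap<>();

    // Handles one per-storage RPC and returns the storages declared zombie.
    List<String> processStorageReport(String storageId, long reportId, boolean lastRpcOfReport) {
        lastReportIdByStorage.put(storageId, reportId);
        List<String> zombies = new ArrayList<>();
        if (lastRpcOfReport) {
            // Any storage not stamped with this report's ID is assumed to be gone.
            for (Map.Entry<String, Long> e : lastReportIdByStorage.entrySet()) {
                if (e.getValue() != reportId) {
                    zombies.add(e.getKey());
                }
            }
            lastReportIdByStorage.keySet().removeAll(zombies);
        }
        return zombies;
    }

    public static void main(String[] args) {
        ZombieDetectionSketch nn = new ZombieDetectionSketch();
        // Report 1 is the original; report 2 is the retransmission after a timeout.
        nn.processStorageReport("DS-A", 1L, false);
        nn.processStorageReport("DS-A", 2L, false);
        nn.processStorageReport("DS-B", 2L, true);   // report 2 completes, no zombies
        // The delayed last RPC of report 1 arrives after report 2 has finished.
        List<String> zombies = nn.processStorageReport("DS-B", 1L, true);
        System.out.println("Falsely declared zombie: " + zombies);
    }
}
```

In this model the healthy storage DS-A ends up in the zombie list only because the last RPC of the older report is handled after the newer report has already re-stamped it. When the same four RPCs are processed in report order, no storage is flagged.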

??Is this change compatible???
Yes. The compatibility issues were discussed earlier in this issue.

??What does an old NN do if it gets this pseudo-report???
According to the [Rolling upgrade 
documentation|https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html],
 we first upgrade NameNodes, then DataNodes. So in practice new DNs don't talk 
to old NNs.

??What does a new NN do when it gets old style reports? Will it remove all but 
the last storage???
As mentioned in [this 
comment|https://issues.apache.org/jira/browse/HDFS-10301?focusedCommentId=15271737&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15271737],
 old DataNodes' reports will be processed as regular reports. The only 
difference is that zombie storages will not be removed until the DNs are 
upgraded. During the upgrade no storages are removed.

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10301
>                 URL: https://issues.apache.org/jira/browse/HDFS-10301
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.1
>            Reporter: Konstantin Shvachko
>            Assignee: Vinitha Reddy Gankidi
>            Priority: Critical
>             Fix For: 2.7.4
>
>         Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, 
> HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.009.patch, 
> HDFS-10301.01.patch, HDFS-10301.010.patch, HDFS-10301.011.patch, 
> HDFS-10301.012.patch, HDFS-10301.branch-2.7.patch, HDFS-10301.branch-2.patch, 
> HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report. It 
> then sends the block report again. While processing these two reports at the 
> same time, the NameNode can interleave processing of storages from different 
> reports. This screws up the blockReportId field, which makes the NameNode 
> think that some storages are zombies. Replicas from zombie storages are 
> immediately removed, causing missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
