[ https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729578#comment-17729578 ]
ASF GitHub Bot commented on HDFS-17003: --------------------------------------- zhangshuyan0 commented on PR #5643: URL: https://github.com/apache/hadoop/pull/5643#issuecomment-1577869018 I think the added UT still has a problem. I'm not sure if you saw this comment from me: > Should we skip dnd and dnd2 when turn on heartbeat here? Otherwise the corrupt replicas may be deleted from InvalidateBlock if delete commands are sent. > Erasure coding: invalidate wrong block after reporting bad blocks from > datanode > ------------------------------------------------------------------------------- > > Key: HDFS-17003 > URL: https://issues.apache.org/jira/browse/HDFS-17003 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: farmmamba > Assignee: farmmamba > Priority: Critical > Labels: pull-request-available > > After receiving reportBadBlocks RPC from datanode, NameNode compute wrong > block to invalidate. It is a dangerous behaviour and may cause data loss. > Some logs in our production as below: > > NameNode log: > {code:java} > 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* > reportBadBlocks for block: > BP-932824627-xxxx-1680179358678:blk_-9223372036848404320_1471186 on datanode: > datanode1:50010 > 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* > reportBadBlocks for block: > BP-932824627-xxxx-1680179358678:blk_-9223372036848404319_1471186 on datanode: > datanode2:50010{code} > datanode1 log: > {code:java} > 2023-05-08 21:23:49,088 WARN > org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad > BP-932824627-xxxx-1680179358678:blk_-9223372036848404320_1471186 on > /data7/hadoop/hdfs/datanode > 2023-05-08 21:24:00,509 INFO > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed > to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not > found.{code} > > This phenomenon can be reproduced. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org