[ https://issues.apache.org/jira/browse/HDFS-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16661307#comment-16661307 ]
Xiao Chen commented on HDFS-14021:
----------------------------------

Attached a sample failure report, and a patch to fix.

{noformat}
2018-10-15 23:02:12,834 [Block report processor] DEBUG blockmanagement.BlockManager (BlockManager.java:processAndHandleReportedBlock(3839)) - In memory blockUCState = UNDER_CONSTRUCTION
2018-10-15 23:02:12,836 [Block report processor] DEBUG BlockStateChange (BlockManager.java:addStoredBlock(3148)) - BLOCK* addStoredBlock: 127.0.0.1:38427 is added to blk_-9223372036854775792_1001 (size=0)
2018-10-15 23:02:12,837 [Block report processor] DEBUG BlockStateChange (BlockManager.java:processIncrementalBlockReport(3949)) - BLOCK* block RECEIVED_BLOCK: blk_-9223372036854775785_1001 is received from 127.0.0.1:38427
2018-10-15 23:02:12,837 [Block report processor] DEBUG BlockStateChange (BlockManager.java:processIncrementalBlockReport(3952)) - *BLOCK* NameNode.processIncrementalBlockReport: from 127.0.0.1:38427 receiving: 0, received: 1, deleted: 0
---> 2018-10-15 23:02:12,840 [IPC Server handler 7 on 35885] DEBUG BlockStateChange (LowRedundancyBlocks.java:add(293)) - BLOCK* NameSystem.LowRedundancyBlock.add: blk_-9223372036854775792_1001 has only 8 replicas and need 9 replicas so is added to neededReconstructions at priority level 2
2018-10-15 23:02:12,840 [IPC Server handler 7 on 35885] INFO hdfs.StateChange (FSNamesystem.java:completeFile(2830)) - DIR* completeFile: /foo is closed by DFSClient_NONMAPREDUCE_-442030319_1
2018-10-15 23:02:12,841 [Block report processor] DEBUG blockmanagement.BlockManager (BlockManager.java:processAndHandleReportedBlock(3816)) - Reported block blk_-9223372036854775784_1001 on 127.0.0.1:44904 size 2097152 replicaState = FINALIZED
2018-10-15 23:02:12,841 [Block report processor] DEBUG blockmanagement.BlockManager (BlockManager.java:processAndHandleReportedBlock(3839)) - In memory blockUCState = COMPLETE
2018-10-15 23:02:12,841 [Block report processor] DEBUG BlockStateChange (BlockManager.java:addStoredBlock(3148)) - BLOCK* addStoredBlock: 127.0.0.1:44904 is added to blk_-9223372036854775792_1001 (size=12582912)
2018-10-15 23:02:12,841 [main] INFO hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1965)) - Shutting down the Mini HDFS Cluster
---> 2018-10-15 23:02:12,842 [Block report processor] DEBUG BlockStateChange ---(LowRedundancyBlocks.java:remove(387)) - BLOCK* NameSystem.LowRedundancyBlock.remove: Removing block blk_-9223372036854775792_1001 from priority queue 2
2018-10-15 23:02:12,842 [main] INFO hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdownDataNode(2013)) - Shutting down DataNode 8
2018-10-15 23:02:12,842 [Block report processor] DEBUG BlockStateChange (BlockManager.java:processIncrementalBlockReport(3949)) - BLOCK* block RECEIVED_BLOCK: blk_-9223372036854775784_1001 is received from 127.0.0.1:44904
2018-10-15 23:02:12,844 [Block report processor] DEBUG BlockStateChange (BlockManager.java:processIncrementalBlockReport(3952)) - *BLOCK* NameNode.processIncrementalBlockReport: from 127.0.0.1:44904 receiving: 0, received: 1, deleted: 0
2018-10-15 23:02:12,843 [org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@62e7dffa] INFO datanode.DataNode (DataXceiverServer.java:closeAllPeers(281)) - Closing all peers.
2018-10-15 23:02:12,843 [main] WARN datanode.DirectoryScanner (DirectoryScanner.java:shutdown(340)) - DirectoryScanner: shutdown has been called
{noformat}

It appears to be a race condition between the block reports and the test's check of {{numOfUnderReplicatedBlocks}}. In the lines marked with ---> above, blk_-9223372036854775792_1001 is added to neededReconstructions at 23:02:12,840 and removed from it only at 23:02:12,842, so a check made in between still sees one low-redundancy block.
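For illustration only, one common way to make a test tolerant of this kind of race is to poll for the expected value with a timeout instead of asserting it once. The sketch below is not the content of HDFS-14021.01.patch; it is a self-contained simulation in which the class name, the {{waitFor}} helper, the {{runScenario}} method and the counter are all hypothetical stand-ins (the counter plays the role of the low-redundancy block count, and a scheduled task plays the role of the last block report being processed).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class WaitForLowRedundancy {

  /**
   * Hypothetical helper: polls the condition every intervalMs until it
   * returns true, or throws TimeoutException once timeoutMs has elapsed.
   */
  public static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  /**
   * Simulates the race: the counter starts at 1 (block queued for
   * reconstruction) and a background task resets it to 0 after
   * reportDelayMs, like a late block report. Returns the value the
   * "test" observes after waiting.
   */
  public static int runScenario(long reportDelayMs) throws Exception {
    AtomicInteger lowRedundancyBlocks = new AtomicInteger(1);
    ScheduledExecutorService reportProcessor =
        Executors.newSingleThreadScheduledExecutor();
    try {
      reportProcessor.schedule(
          () -> { lowRedundancyBlocks.set(0); }, reportDelayMs, TimeUnit.MILLISECONDS);
      // An immediate check here could still see 1; polling absorbs the delay.
      waitFor(() -> lowRedundancyBlocks.get() == 0, 10, 5000);
      return lowRedundancyBlocks.get();
    } finally {
      reportProcessor.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("low-redundancy blocks after wait: " + runScenario(50));
  }
}
```

Hadoop's own tests typically use {{GenericTestUtils.waitFor}} for this purpose; whether the attached patch takes that route is not stated in this comment.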
> TestReconstructStripedBlocksWithRackAwareness#testReconstructForNotEnoughRacks fails intermittently
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14021
>                 URL: https://issues.apache.org/jira/browse/HDFS-14021
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding, test
>    Affects Versions: 3.0.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>            Priority: Major
>         Attachments: HDFS-14021.01.patch, TEST-org.apache.hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness.xml
>
>
> The test sometimes fails with:
> {noformat}
> java.lang.AssertionError: expected:<0> but was:<1>
>
> 	at org.apache.hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness.testReconstructForNotEnoughRacks(TestReconstructStripedBlocksWithRackAwareness.java:171)
> {noformat}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)