[ https://issues.apache.org/jira/browse/HDFS-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882590#comment-16882590 ]
Wei-Chiu Chuang commented on HDFS-14642:
----------------------------------------

Good catch. Patch LGTM. Test failures seem unrelated.

> processMisReplicatedBlocks does not return correct processed count
> ------------------------------------------------------------------
>
>                 Key: HDFS-14642
>                 URL: https://issues.apache.org/jira/browse/HDFS-14642
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.2.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-14642.001.patch
>
>
> HDFS-14053 introduced a method "processMisReplicatedBlocks" to the
> blockManager, and it is used by fsck to schedule mis-replicated blocks for
> replication.
> The method should return the number of blocks it processed, but it always
> returns zero as "processed" is never incremented in the method.
> It should also drop and re-take the write lock every "numBlocksPerIteration"
> blocks, but as "processed" is never incremented, it will never drop and
> re-take the write lock, giving potential for holding the write lock for a
> long time.
> {code:java}
> public int processMisReplicatedBlocks(List<BlockInfo> blocks) {
>   int processed = 0;
>   Iterator<BlockInfo> iter = blocks.iterator();
>   try {
>     while (isPopulatingReplQueues() && namesystem.isRunning()
>         && !Thread.currentThread().isInterrupted()
>         && iter.hasNext()) {
>       int limit = processed + numBlocksPerIteration;
>       namesystem.writeLockInterruptibly();
>       try {
>         while (iter.hasNext() && processed < limit) {
>           BlockInfo blk = iter.next();
>           MisReplicationResult r = processMisReplicatedBlock(blk);
>           LOG.debug("BLOCK* processMisReplicatedBlocks: " +
>               "Re-scanned block {}, result is {}", blk, r);
>         }
>       } finally {
>         namesystem.writeUnlock();
>       }
>     }
>   } catch (InterruptedException ex) {
>     LOG.info("Caught InterruptedException while scheduling replication work" +
>         " for mis-replicated blocks");
>     Thread.currentThread().interrupt();
>   }
>   return processed;
> }{code}
> Due to this, fsck causes a warning to be logged in the NN for every
> mis-replicated file it schedules replication for, as it checks the processed
> count:
> {code:java}
> 2019-07-10 15:46:14,790 WARN namenode.NameNode: Fsck: Block manager is able
> to process only 0 mis-replicated blocks (Total count : 1 ) for path /...{code}
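For illustration only, the batching behaviour described in the report can be sketched outside of Hadoop as below. This is not the contents of HDFS-14642.001.patch; it assumes the fix amounts to incrementing "processed" once per block, and the class BatchedProcessingSketch, its ReentrantLock, and the handle() method are hypothetical stand-ins for the BlockManager, the namesystem write lock, and processMisReplicatedBlock.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the BlockManager; not Hadoop code.
class BatchedProcessingSketch {
  // Hypothetical stand-in for the namesystem write lock.
  private final ReentrantLock writeLock = new ReentrantLock();
  private final int numBlocksPerIteration;
  private int lockAcquisitions = 0;

  BatchedProcessingSketch(int numBlocksPerIteration) {
    this.numBlocksPerIteration = numBlocksPerIteration;
  }

  // Processes all blocks, releasing and re-taking the lock after each batch.
  int processAll(List<String> blocks) {
    int processed = 0;
    Iterator<String> iter = blocks.iterator();
    while (iter.hasNext()) {
      int limit = processed + numBlocksPerIteration;
      writeLock.lock();
      lockAcquisitions++;
      try {
        while (iter.hasNext() && processed < limit) {
          handle(iter.next());
          // Assumed fix: without this increment the inner loop never reaches
          // "limit", so the lock is held until the iterator is exhausted and
          // the method returns 0.
          processed++;
        }
      } finally {
        writeLock.unlock();
      }
    }
    return processed;
  }

  // Placeholder for the per-block work (processMisReplicatedBlock in the report).
  private void handle(String block) {
    // Intentionally empty in this sketch.
  }

  int getLockAcquisitions() {
    return lockAcquisitions;
  }

  public static void main(String[] args) {
    BatchedProcessingSketch sketch = new BatchedProcessingSketch(2);
    int processed =
        sketch.processAll(List.of("blk_1", "blk_2", "blk_3", "blk_4", "blk_5"));
    System.out.println("processed=" + processed
        + ", lock acquisitions=" + sketch.getLockAcquisitions());
  }
}
```

With five blocks and a batch size of two, the sketch takes the lock three times and returns 5. If the increment is removed, it takes the lock once and returns 0, which matches the symptom in the fsck warning quoted above.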