[ https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039156#comment-17039156 ]
Stephen O'Donnell commented on HDFS-15177:
------------------------------------------

The first screenshot you shared shows the BPOfferService heartbeat thread attempting to acquire the lock in order to queue the Invalidate command. If the datanode is busy, the heartbeat thread can be blocked for some time. HDFS-14997 attempts to resolve this by moving the command processing to another thread.

The second screenshot shows a thread blocked while attempting to start a read on a block, but it's not clear what is holding the lock at that time. Was the lock held by block deletion?

Which version are these screenshots from? I do not see the getBlockFileNoExistsCheck() method on trunk.

> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too
> much time.
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-15177
>                 URL: https://issues.apache.org/jira/browse/HDFS-15177
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: zhuqi
>            Assignee: zhuqi
>            Priority: Major
>         Attachments: image-2020-02-18-22-39-00-642.png,
> image-2020-02-18-22-51-28-624.png, image-2020-02-18-22-52-59-202.png,
> image-2020-02-18-22-55-38-661.png
>
>
> In our cluster, the datanode receive the delete command with too many blocks
> deletion when we have many blockpools sharing the same datanode and the
> datanode with about 30 storage dirs, it will cause the FsDatasetImpl lock too
> much time.
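
For illustration only, the idea in the issue title (splitting a large invalidate request so that the dataset lock is not held for the whole deletion) might look roughly like the sketch below. This is not the HDFS-15177 patch and not Hadoop code: the class BatchedInvalidateSketch, its fields, and the batch size of 1000 are all hypothetical, and a plain ReentrantLock merely stands in for the FsDatasetImpl lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: none of these names come from the Hadoop code base.
class BatchedInvalidateSketch {
    // Stands in for the FsDatasetImpl lock. Fair mode is an assumption made here
    // so that threads already waiting are served before the deleter re-acquires.
    private final ReentrantLock datasetLock = new ReentrantLock(true);
    private final List<Long> deleted = new ArrayList<>();
    private int lockAcquisitions = 0;

    /** Removes the given block IDs, taking the lock once per batch rather than once for the whole list. */
    void invalidate(List<Long> blockIds, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int start = 0; start < blockIds.size(); start += batchSize) {
            int end = Math.min(start + batchSize, blockIds.size());
            datasetLock.lock();
            try {
                lockAcquisitions++;
                // Placeholder for the real per-block bookkeeping done under the lock.
                deleted.addAll(blockIds.subList(start, end));
            } finally {
                datasetLock.unlock();
            }
            // The lock is released between batches, so a waiting reader or the
            // heartbeat thread has a chance to acquire it before the next batch.
        }
    }

    int lockAcquisitions() {
        return lockAcquisitions;
    }

    List<Long> deleted() {
        return deleted;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 2500; i++) {
            ids.add(i);
        }
        BatchedInvalidateSketch dataset = new BatchedInvalidateSketch();
        dataset.invalidate(ids, 1000);
        System.out.println(dataset.deleted().size() + " blocks removed in "
                + dataset.lockAcquisitions() + " lock acquisitions");
    }
}
```

The trade-off in this sketch is that the total work is unchanged, but the longest single hold of the lock is bounded by the batch size instead of by the size of the delete command.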