[ 
https://issues.apache.org/jira/browse/HDFS-15177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039146#comment-17039146
 ] 

zhuqi edited comment on HDFS-15177 at 2/18/20 2:56 PM:
-------------------------------------------------------

cc [~sodonnell] , 

Thanks for your reply. Deletion on our cluster is very heavy: there are many 
namespaces and about 30 storage dirs on each datanode. All namespaces call the 
same FsDatasetImpl function.

I am on the 2.x branch (cdh5.16.1). What I mean is that BPOfferService 
receives too many blocks waiting to be deleted. This puts too many items in 
Block invalidBlks[], so the synchronized FsDatasetImpl section, which includes 
the volumeMap removal, iterates a great many times while holding the lock. It 
even blocked the heartbeat, which also uses the synchronized FsDatasetImpl. 
After we removed the FsDatasetImpl synchronization from the heartbeat, 
following HDFS-7060, the heartbeat returned to normal. However, heavy deletion 
still sometimes blocks the synchronized FsDatasetImpl, which affects other 
operations that need the same lock.
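
To make the idea in the issue title concrete, here is a minimal sketch of 
splitting one large invalidation into batches so that the dataset lock is 
released between batches. It is not the actual FsDatasetImpl code: the class 
BatchedInvalidateSketch, its methods, the batchSize parameter and the 
lock-acquisition counter are all hypothetical names used for illustration, and 
the volumeMap here is a plain HashMap standing in for the real structure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a dataset guarded by a single monitor lock.
class BatchedInvalidateSketch {
    // Simplified volumeMap: block id -> on-disk path (assumption for this sketch).
    private final Map<Long, String> volumeMap = new HashMap<>();
    // Counts how many times the lock was taken for deletion; only for demonstration.
    private int lockAcquisitions = 0;

    synchronized void addBlock(long blockId, String path) {
        volumeMap.put(blockId, path);
    }

    synchronized int size() {
        return volumeMap.size();
    }

    synchronized int lockAcquisitions() {
        return lockAcquisitions;
    }

    // Removes one slice [from, to) of the ids while holding the dataset lock.
    private synchronized List<String> removeBatch(long[] ids, int from, int to) {
        lockAcquisitions++;
        List<String> removed = new ArrayList<>();
        for (int i = from; i < to; i++) {
            String path = volumeMap.remove(ids[i]);
            if (path != null) {
                removed.add(path); // unknown block ids are skipped
            }
        }
        return removed;
    }

    // Processes invalidBlks in batches. The lock is held per batch, not for the
    // whole array, so other threads waiting on the monitor have a chance to run
    // between batches (monitor acquisition is not fair, so this is not a guarantee).
    List<String> invalidate(long[] invalidBlks, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> toDelete = new ArrayList<>();
        for (int from = 0; from < invalidBlks.length; from += batchSize) {
            int to = Math.min(from + batchSize, invalidBlks.length);
            toDelete.addAll(removeBatch(invalidBlks, from, to));
        }
        return toDelete; // files could then be deleted outside the lock
    }
}
```

With a batch size of 3 and 8 block ids, the lock is taken three times for 
short periods instead of once for the whole array. The trade-off is that the 
deletion as a whole is no longer atomic with respect to other lock holders.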

An example of a blocked stack:

!image-2020-02-18-22-39-00-642.png|width=843,height=115!

It also affects reads and writes:

!image-2020-02-18-22-55-38-661.png|width=960,height=122!

!image-2020-02-18-22-51-28-624.png|width=891,height=120!

!image-2020-02-18-22-52-59-202.png|width=996,height=120!


> Split datanode invalide block deletion, to avoid the FsDatasetImpl lock too 
> much time.
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-15177
>                 URL: https://issues.apache.org/jira/browse/HDFS-15177
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: zhuqi
>            Assignee: zhuqi
>            Priority: Major
>         Attachments: image-2020-02-18-22-39-00-642.png, 
> image-2020-02-18-22-51-28-624.png, image-2020-02-18-22-52-59-202.png, 
> image-2020-02-18-22-55-38-661.png
>
>
> In our cluster, a datanode receives delete commands containing too many 
> blocks when many block pools share the same datanode and the datanode has 
> about 30 storage dirs. This causes the FsDatasetImpl lock to be held for too 
> long.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
