[jira] Commented: (HDFS-611) Heartbeats times from Datanodes increase when there are plenty of blocks to delete

Zheng Shao (JIRA) Mon, 02 Nov 2009 16:31:24 -0800

    [ https://issues.apache.org/jira/browse/HDFS-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772783#action_12772783 ]


Zheng Shao commented on HDFS-611:
---------------------------------

In light of some other discussions related to async deletion in JobTracker when 
it restarts, I will rename BlockFileDeleter to AsyncDiskService, and move 
BlockFileDeleteTask inside the FSDataset class. In this way, it will be easier 
to move AsyncDiskService to common and reuse it in the JobTracker in the future.

> Traditionally hdfs did not distinguish between blocks and their replicas, 
> which we found very confusing while implementing append and tried to call new 
> classes Replica*. So yes you see a lot of Block* classes, but it would be 
> really good to turn this in the right direction. Wouldn't you agree that 
> "replica" is a more precise term for a copy of a block on a specific 
> data-node?

I agree. My initial thought was that we should make that change in a single 
transaction for everything; otherwise, having both the old and the new 
conventions will make things even more confusing. 

> I think the abstraction should provide an api to delete replica files 
> independently of whether it is multi-threaded or single-threaded, so it makes 
> sense to me to keep the implementation details concealed in the deleter.

Callers might need to know whether an operation is synchronous or 
asynchronous, so I will add a separate deleteAsync method.
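
To make the sync/async distinction concrete, here is a minimal sketch of what 
such an API could look like. The class name ReplicaDeleterSketch, the 
single-thread executor, and the Future return type are assumptions made for 
illustration; only the deleteAsync name comes from the comment above, and this 
is not the code in the attached patches.

```java
import java.io.File;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not the HDFS-611 patch.
class ReplicaDeleterSketch {
    // Assumption: one background thread is enough for the illustration.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Synchronous: deletes on the calling thread and reports the result. */
    boolean delete(File replicaFile) {
        return replicaFile.delete();
    }

    /** Asynchronous: queues the deletion and returns immediately. */
    Future<Boolean> deleteAsync(final File replicaFile) {
        Callable<Boolean> task = () -> delete(replicaFile);
        return executor.submit(task);
    }

    /** Stops accepting new work; already queued deletions still run. */
    void shutdown() {
        executor.shutdown();
    }
}
```

With two separate methods, a caller such as the heartbeat thread can pick 
deleteAsync and never block on disk I/O, while a caller that needs the result 
right away can still use delete.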

> allowCoreThreadTimeOut 

Agree. I will add that.
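
For reference, a small sketch of how allowCoreThreadTimeOut is typically 
enabled on a java.util.concurrent.ThreadPoolExecutor. The class name 
DeleterPoolSketch, the helper newPool, and the pool sizes are illustrative 
assumptions, not values from the patch.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not the HDFS-611 patch.
class DeleterPoolSketch {
    /**
     * Builds a fixed-size pool whose idle threads exit after the keep-alive
     * time. keepAliveSeconds must be greater than zero, otherwise
     * allowCoreThreadTimeOut(true) throws IllegalArgumentException.
     */
    static ThreadPoolExecutor newPool(int threads, long keepAliveSeconds) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            threads, threads,                      // core size == max size
            keepAliveSeconds, TimeUnit.SECONDS,    // idle time before a thread exits
            new LinkedBlockingQueue<Runnable>());  // unbounded work queue
        // Without this call, core threads stay alive even when idle.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

The effect is that a datanode with nothing to delete holds no idle deleter 
threads, and threads are recreated on demand when new tasks arrive.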

> The key for the HashMap of threads is the reference to the volume

That makes sense. I will change the key of the HashMap to File, which 
represents the currentDir of the FSVolume.
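
A rough sketch of the File-keyed map, assuming one single-thread executor per 
volume directory. The class name VolumeExecutorsSketch and its methods are 
hypothetical; the point is only that java.io.File implements equals and 
hashCode on the abstract pathname, so any File with the same path finds the 
same entry without holding a reference to the volume object.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch, not the HDFS-611 patch.
class VolumeExecutorsSketch {
    // Key: the volume's current directory. Value: that volume's worker.
    private final Map<File, ExecutorService> executors =
        new HashMap<File, ExecutorService>();

    VolumeExecutorsSketch(File[] volumeDirs) {
        for (File dir : volumeDirs) {
            executors.put(dir, Executors.newSingleThreadExecutor());
        }
    }

    /** Runs the task on the executor that belongs to the given volume. */
    void execute(File volumeDir, Runnable task) {
        ExecutorService executor = executors.get(volumeDir);
        if (executor == null) {
            throw new IllegalArgumentException("Unknown volume: " + volumeDir);
        }
        executor.execute(task);
    }

    /** Stops all per-volume executors; queued tasks still run. */
    void shutdown() {
        for (ExecutorService executor : executors.values()) {
            executor.shutdown();
        }
    }
}
```

Because the map no longer depends on FSVolume, a class like this could be 
moved to common and reused by other components that only know directories.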


> Heartbeats times from Datanodes increase when there are plenty of blocks to 
> delete
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-611
>                 URL: https://issues.apache.org/jira/browse/HDFS-611
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Zheng Shao
>             Fix For: 0.20.2, 0.21.0, 0.22.0
>
>         Attachments: HDFS-611.branch-19.patch, HDFS-611.branch-19.v2.patch, 
> HDFS-611.branch-20.patch, HDFS-611.branch-20.v2.patch, HDFS-611.trunk.patch, 
> HDFS-611.trunk.v2.patch, HDFS-611.trunk.v3.patch
>
>
> I am seeing that when we delete a large directory that has plenty of blocks, 
> the heartbeat times from datanodes increase significantly from the normal 
> value of 3 seconds to as large as 50 seconds or so. The heartbeat thread in 
> the Datanode deletes a bunch of blocks sequentially, which causes the 
> heartbeat times to increase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
