[ 
https://issues.apache.org/jira/browse/HDFS-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772113#action_12772113
 ] 

Konstantin Shvachko commented on HDFS-611:
------------------------------------------

> This in turn maps to the capability of serving ~3000 nodes with 3 second 
> heartbeat latency. 

This makes sense. I hope it won't come to the point where every heartbeat 
removes 1000 blocks on every node.

> Most of the classes in DataNode are named Block* 

Traditionally HDFS did not distinguish between blocks and their replicas, which 
we found very confusing while implementing append, so we tried to name the new 
classes Replica*. So yes, you see a lot of Block* classes, but it would be 
really good to turn this in the right direction. Wouldn't you agree that 
"replica" is a more precise term for a copy of a block on a specific data-node?

> I like the abstraction of creating a task using the 5 arguments, and then do 
> "execute(Task)". 

I think the abstraction should provide an API to delete replica files 
independently of whether it is multi-threaded or single-threaded, so it makes 
sense to me to keep the implementation details concealed in the deleter.

Looked into the implementation details a bit. By default ThreadPoolExecutor 
sets allowCoreThreadTimeOut to false, which means the threads never shut down 
even if there are no deletes. I would rather pay the price of restarting 
threads when new deletes arrive than keep those threads running forever; 
data-nodes already spawn a lot of threads besides deletion. This will also 
automatically take care of the case when a volume dies and we remove it from 
FSVolumeSet. It would be a waste to keep a thread around for a dead volume.
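A minimal sketch of the timeout idea (class and method names here are illustrative, not from the patch): a single-thread executor per volume whose core thread is allowed to time out, so it exits when no deletes are pending instead of living forever.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical per-volume deleter: one worker thread that dies after
// 60 seconds of idleness and is restarted when new deletes arrive.
public class AsyncVolumeDeleter {
    private final ThreadPoolExecutor executor;

    public AsyncVolumeDeleter() {
        executor = new ThreadPoolExecutor(
            1, 1,                        // core == max == one thread per volume
            60L, TimeUnit.SECONDS,       // idle timeout before the thread exits
            new LinkedBlockingQueue<Runnable>());
        // By default core threads never time out; enable it so the thread
        // shuts down when there are no pending deletion tasks.
        executor.allowCoreThreadTimeOut(true);
    }

    public void execute(Runnable deletionTask) {
        executor.execute(deletionTask);
    }

    public void shutdown() {
        executor.shutdown();
    }

    // Convenience wrapper so callers need not handle InterruptedException.
    public boolean awaitTermination(long millis) {
        try {
            return executor.awaitTermination(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The trade-off is exactly the one described above: a rarely-used volume pays a small thread-startup cost on its next delete, instead of holding an idle thread indefinitely.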

The key for the HashMap of threads is the reference to the volume. This relies 
on the fact that you do not explicitly define equals() and hashCode() for 
FSVolume. Currently we do not alter instances of volumes, but if we ever do, 
this could be a problem. Maybe it is better to use the volume's directory as 
the key in the HashMap.
You still need to remove the HashMap entry when the volume is removed from the 
system.
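A sketch of what keying by directory could look like (again, names are illustrative, not from the patch): the map is keyed by the volume's directory path, which is stable under identity-based equals()/hashCode() concerns, and removeVolume() both removes the entry and shuts down its executor so nothing lingers for a dead volume.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical registry of per-volume deletion executors, keyed by the
// volume's directory path instead of the FSVolume object reference.
public class VolumeDeleters {
    private final Map<String, ExecutorService> deleters = new HashMap<>();

    public synchronized void addVolume(String volumeDir) {
        deleters.computeIfAbsent(volumeDir,
            dir -> Executors.newSingleThreadExecutor());
    }

    public synchronized void deleteAsync(String volumeDir, Runnable task) {
        ExecutorService e = deleters.get(volumeDir);
        if (e == null) {
            throw new IllegalStateException("Unknown volume: " + volumeDir);
        }
        e.execute(task);
    }

    // Must be called when the volume is removed from the volume set;
    // otherwise the map entry (and its thread) leaks for a dead volume.
    public synchronized void removeVolume(String volumeDir) {
        ExecutorService e = deleters.remove(volumeDir);
        if (e != null) {
            e.shutdown();
        }
    }

    public synchronized int size() {
        return deleters.size();
    }
}
```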

> Heartbeats times from Datanodes increase when there are plenty of blocks to 
> delete
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-611
>                 URL: https://issues.apache.org/jira/browse/HDFS-611
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Zheng Shao
>             Fix For: 0.20.2, 0.21.0, 0.22.0
>
>         Attachments: HDFS-611.branch-19.patch, HDFS-611.branch-19.v2.patch, 
> HDFS-611.branch-20.patch, HDFS-611.branch-20.v2.patch, HDFS-611.trunk.patch, 
> HDFS-611.trunk.v2.patch, HDFS-611.trunk.v3.patch
>
>
> I am seeing that when we delete a large directory that has plenty of blocks, 
> the heartbeat times from datanodes increase significantly from the normal 
> value of 3 seconds to as large as 50 seconds or so. The heartbeat thread in 
> the Datanode deletes a bunch of blocks sequentially, this causes the 
> heartbeat times to increase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
