[ 
https://issues.apache.org/jira/browse/HDFS-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692870#comment-14692870
 ] 

Yi Liu commented on HDFS-8863:
------------------------------

{quote}
What if we let it check against storage type level sum and also make sure there 
is at least one storage with enough space?
{quote}
That still has a potential issue. For example, suppose we have a datanode dn0 with 
three storages (s1, s2, s3) of the required storage type. s1 and s3 each have 2/3 
of a block size of remaining space, and s2 has 1 + 2/3 block sizes remaining. We 
just scheduled one block on dn0, which must land on s2. Now a new block is being 
added and block placement checks dn0: with the current patch it will see that the 
maximum remaining space is 1 + 2/3 block sizes (s2) and that the sum is also 
sufficient, so it treats dn0 as a good target, but actually it is not.

I am thinking we can do the following: still sum at the storage type level, but 
for each storage only count the remaining space in whole multiples of the block 
size. For the example above, the remaining space of s1 and s3 is counted as 0 and 
s2 as 1, so the sum is 1 and dn0 is not a good target.
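
A minimal sketch of that idea (hypothetical code, not the current 
BlockPlacementPolicyDefault logic; the helper name and the {{requiredType}}, 
{{blockSize}}, {{alreadyScheduled}} parameters are placeholders for illustration):

{code:java}
// Hypothetical helper, not the existing API: only whole block-size multiples
// per storage are counted toward the storage-type-level sum.
static boolean hasRoomForOneMoreBlock(DatanodeDescriptor dn,
    StorageType requiredType, long blockSize, int alreadyScheduled) {
  long schedulableBlocks = 0;
  for (DatanodeStorageInfo s : dn.getStorageInfos()) {
    if (s.getStorageType() != requiredType) {
      continue;                      // only the required storage type counts
    }
    // integer division drops the partial-block part:
    // 2/3 of a block counts as 0, 1 + 2/3 blocks counts as 1
    schedulableBlocks += s.getRemaining() / blockSize;
  }
  // dn0 above: s1 -> 0, s2 -> 1, s3 -> 0, sum = 1; one block is already
  // scheduled on dn0, so a second block does not fit.
  return schedulableBlocks >= alreadyScheduled + 1;
}
{code}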

{quote}
Datanodes only care about the storage type, so checking a particular storage 
won't do any good. It will just cause block placement to re-pick target more.
{quote}
You are right. I also had another meaning: when iterating the storages, the check 
is against the remaining space of the storage type, but some of those storages may 
be {{State.FAILED}} or {{State.READ_ONLY_SHARED}}, and their remaining space is 
still counted, right?  So I think you can do this check in {{getRemaining}}.  See 
my JIRA HDFS-8884, which is related to this: I do a fast-fail check for the 
datanode there. Of course, I can do this part in my JIRA if you don't do it here.
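
For reference, a rough sketch of what I mean for {{getRemaining}} (again 
hypothetical, only to illustrate skipping unusable storages before summing):

{code:java}
// Hypothetical sketch of DatanodeDescriptor#getRemaining(StorageType):
// storages in FAILED or READ_ONLY_SHARED state should not contribute space.
long getRemaining(StorageType type) {
  long remaining = 0;
  for (DatanodeStorageInfo s : getStorageInfos()) {
    DatanodeStorage.State state = s.getState();
    if (state == DatanodeStorage.State.FAILED
        || state == DatanodeStorage.State.READ_ONLY_SHARED) {
      continue;                      // unusable storage, do not count its space
    }
    if (s.getStorageType() == type) {
      remaining += s.getRemaining();
    }
  }
  return remaining;
}
{code}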

> The remaining space check in BlockPlacementPolicyDefault is flawed
> ------------------------------------------------------------------
>
>                 Key: HDFS-8863
>                 URL: https://issues.apache.org/jira/browse/HDFS-8863
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>              Labels: 2.6.1-candidate
>         Attachments: HDFS-8863.patch, HDFS-8863.v2.patch
>
>
> The block placement policy calls 
> {{DatanodeDescriptor#getRemaining(StorageType)}} to check whether the block 
> is going to fit. Since the method is adding up all remaining spaces, namenode 
> can allocate a new block on a full node. This causes pipeline construction 
> failure and {{abandonBlock}}. If the cluster is nearly full, the client might 
> hit this multiple times and the write can fail permanently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
