haiyang1987 opened a new pull request, #6112:
URL: https://github.com/apache/hadoop/pull/6112

   ### Description of PR
   
   https://issues.apache.org/jira/browse/HDFS-17205
   
   Currently, when allocating a new block, the NameNode chooses a datanode and 
then chooses a suitable storage of the given storage type on that datanode. The 
relevant code is DatanodeDescriptor#chooseStorage4Block, which calculates the 
space required for the write operation:
   requiredSize = blockSize * HdfsServerConstants.MIN_BLOCKS_FOR_WRITE (default 
is 1).
   ```
   public DatanodeStorageInfo chooseStorage4Block(StorageType t,
       long blockSize) {
     final long requiredSize =
         blockSize * HdfsServerConstants.MIN_BLOCKS_FOR_WRITE;
     final long scheduledSize = blockSize * getBlocksScheduled(t);
     long remaining = 0;
     DatanodeStorageInfo storage = null;
     for (DatanodeStorageInfo s : getStorageInfos()) {
       if (s.getState() == State.NORMAL && s.getStorageType() == t) {
         if (storage == null) {
           storage = s;
         }
         long r = s.getRemaining();
         if (r >= requiredSize) {
           remaining += r;
         }
       }
     }
     if (requiredSize > remaining - scheduledSize) {
       BlockPlacementPolicy.LOG.debug(
           "The node {} does not have enough {} space (required={},"
           + " scheduled={}, remaining={}).",
           this, t, requiredSize, scheduledSize, remaining);
       return null;
     }
     return storage;
   }
   ```
   However, multiple NameSpaces may select the storage of the same datanode to 
write blocks at the same time.
   In the extreme case where the storage has room for only one more block, each 
selection can pass the check above, yet there is not enough free space for every 
writer to write its data.
   
   A log message similar to the following appears:
   `The volume [file:/disk1/] with the available space (=21129618 B) is less 
than the block size (=268435456 B).  `
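
The arithmetic behind this failure can be sketched as follows. This is a simplified stand-in for the check in `chooseStorage4Block`, reduced to a single storage, and it assumes each namespace's NameNode counts only the blocks it has scheduled itself. The class name `StorageCheckSketch` and the method `hasEnoughSpace` are hypothetical and are not part of Hadoop.

```java
public class StorageCheckSketch {

    /**
     * Simplified version of the space check for one storage.
     * Hypothetical helper: it mirrors only the comparison
     * "requiredSize > remaining - scheduledSize" from the method above.
     */
    static boolean hasEnoughSpace(long remaining, long blockSize,
                                  long blocksScheduled, int minBlocksForWrite) {
        final long requiredSize = blockSize * minBlocksForWrite;
        final long scheduledSize = blockSize * blocksScheduled;
        return requiredSize <= remaining - scheduledSize;
    }

    public static void main(String[] args) {
        final long blockSize = 268435456L; // 256 MiB, the block size in the log
        final long remaining = blockSize;  // exactly one block of space left

        // Assumption: two NameNodes evaluate the same storage at the same
        // time, and neither sees the block the other is about to schedule.
        boolean ns1 = hasEnoughSpace(remaining, blockSize, 0, 1);
        boolean ns2 = hasEnoughSpace(remaining, blockSize, 0, 1);
        System.out.println("min=1: ns1=" + ns1 + ", ns2=" + ns2);

        // Requiring two blocks of headroom makes both checks reject the
        // storage in this scenario.
        boolean ns1Strict = hasEnoughSpace(remaining, blockSize, 0, 2);
        boolean ns2Strict = hasEnoughSpace(remaining, blockSize, 0, 2);
        System.out.println("min=2: ns1=" + ns1Strict + ", ns2=" + ns2Strict);
    }
}
```

With a minimum of 1, both checks pass even though only one block fits, so one of the two writers later runs out of space. A larger minimum trades some usable capacity for headroom against this race.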
   
   To avoid this case, HdfsServerConstants.MIN_BLOCKS_FOR_WRITE should be 
configurable, so that the value can be adjusted in larger clusters.
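
A minimal sketch of what a configurable value could look like is shown here. It uses `java.util.Properties` instead of Hadoop's configuration classes so that it stays self-contained. The key name `example.min-blocks-for-write`, the class `MinBlocksForWriteConfig`, and the fallback behaviour for invalid values are all assumptions made for illustration, not the names or semantics used by this PR.

```java
import java.util.Properties;

public class MinBlocksForWriteConfig {

    // Hypothetical key name, chosen only for this sketch.
    static final String MIN_BLOCKS_FOR_WRITE_KEY = "example.min-blocks-for-write";
    // Default of 1, matching the default stated in the description.
    static final int MIN_BLOCKS_FOR_WRITE_DEFAULT = 1;

    /** Reads the setting, falling back to the default if it is missing or invalid. */
    static int getMinBlocksForWrite(Properties conf) {
        String raw = conf.getProperty(MIN_BLOCKS_FOR_WRITE_KEY);
        if (raw == null) {
            return MIN_BLOCKS_FOR_WRITE_DEFAULT;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            // Assumption: non-positive values are treated as invalid.
            return value > 0 ? value : MIN_BLOCKS_FOR_WRITE_DEFAULT;
        } catch (NumberFormatException e) {
            return MIN_BLOCKS_FOR_WRITE_DEFAULT;
        }
    }

    /** Space a storage must have free before it is chosen for a write. */
    static long requiredSize(long blockSize, int minBlocksForWrite) {
        return blockSize * minBlocksForWrite;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(MIN_BLOCKS_FOR_WRITE_KEY, "3");

        int minBlocks = getMinBlocksForWrite(conf);
        long required = requiredSize(268435456L, minBlocks);
        System.out.println("minBlocksForWrite=" + minBlocks
            + ", requiredSize=" + required + " B");
    }
}
```

Keeping the default at 1 preserves the existing behaviour for clusters that do not set the value.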
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

