[jira] [Updated] (HDFS-17205) HdfsServerConstants.MIN_BLOCKS_FOR_WRITE should be configurable

2024-01-27 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-17205:
--
  Component/s: hdfs
 Target Version/s: 3.4.0
Affects Version/s: 3.4.0

> HdfsServerConstants.MIN_BLOCKS_FOR_WRITE should be configurable
> ---
>
> Key: HDFS-17205
> URL: https://issues.apache.org/jira/browse/HDFS-17205
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Haiyang Hu
>Assignee: Haiyang Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Currently, when allocating a new block, the NameNode chooses a datanode and 
> then a suitable storage of the given storage type on that datanode. The 
> relevant code is DatanodeDescriptor#chooseStorage4Block, which calculates the 
> space required for the write as 
> requiredSize = blockSize * HdfsServerConstants.MIN_BLOCKS_FOR_WRITE (default 
> is 1).
> {code:java}
> public DatanodeStorageInfo chooseStorage4Block(StorageType t,
> long blockSize) {
>   final long requiredSize =
>   blockSize * HdfsServerConstants.MIN_BLOCKS_FOR_WRITE;
>   final long scheduledSize = blockSize * getBlocksScheduled(t);
>   long remaining = 0;
>   DatanodeStorageInfo storage = null;
>   for (DatanodeStorageInfo s : getStorageInfos()) {
> if (s.getState() == State.NORMAL && s.getStorageType() == t) {
>   if (storage == null) {
> storage = s;
>   }
>   long r = s.getRemaining();
>   if (r >= requiredSize) {
> remaining += r;
>   }
> }
>   }
>   if (requiredSize > remaining - scheduledSize) {
> BlockPlacementPolicy.LOG.debug(
> "The node {} does not have enough {} space (required={},"
> + " scheduled={}, remaining={}).",
> this, t, requiredSize, scheduledSize, remaining);
> return null;
>   }
>   return storage;
> }
> {code}
> However, multiple namespaces may select a storage on the same datanode to 
> write blocks at the same time. 
> In extreme cases, if only one block's worth of space is left in the current 
> storage, the writer may not have enough free space to write its data.
> A log similar to the following then appears:
> {code:java}
> The volume [file:/disk1/] with the available space (=21129618 B) is less than 
> the block size (=268435456 B).  
> {code}
> To avoid this case, HdfsServerConstants.MIN_BLOCKS_FOR_WRITE should be 
> configurable, so that the parameter can be increased in larger clusters.
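The proposal can be sketched as follows. This is a minimal illustration of making the space-check multiplier configurable, not the actual HDFS-17205 patch; the standalone class and the idea of reading the value from configuration are assumptions for illustration only.

```java
// Minimal sketch (not the actual HDFS-17205 patch): making the space-check
// multiplier configurable instead of hard-coding MIN_BLOCKS_FOR_WRITE = 1.
public class MinBlocksForWriteSketch {

    // Mirrors the check in chooseStorage4Block: a storage qualifies only if
    // its remaining space covers blockSize * minBlocksForWrite.
    static boolean hasEnoughSpace(long remaining, long blockSize,
                                  int minBlocksForWrite) {
        return remaining >= blockSize * minBlocksForWrite;
    }

    public static void main(String[] args) {
        long blockSize = 268435456L; // 256 MB, as in the log above
        long remaining = 300000000L; // a storage with ~286 MB free

        // With the current hard-coded value (1), two namespaces checking the
        // same storage concurrently both see "enough space", although only one
        // block of that size actually fits there.
        boolean ns1 = hasEnoughSpace(remaining, blockSize, 1);
        boolean ns2 = hasEnoughSpace(remaining, blockSize, 1);
        System.out.println(ns1 && ns2); // both admitted; a later write fails

        // Raising the (hypothetically configurable) value to 2 keeps headroom,
        // so the nearly full storage is rejected up front.
        System.out.println(hasEnoughSpace(remaining, blockSize, 2));
    }
}
```

In a real patch the multiplier would be read from the NameNode configuration with a default of 1, preserving today's behavior unless an operator raises it.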



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17205) HdfsServerConstants.MIN_BLOCKS_FOR_WRITE should be configurable

2023-09-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17205:
--
Labels: pull-request-available  (was: )







[jira] [Updated] (HDFS-17205) HdfsServerConstants.MIN_BLOCKS_FOR_WRITE should be configurable

2023-09-23 Thread Haiyang Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haiyang Hu updated HDFS-17205:
--
Description: 


