[jira] [Updated] (HDFS-17191) HDFS: Delete operation adds a thread to collect blocks asynchronously
[ https://issues.apache.org/jira/browse/HDFS-17191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangyi Zhu updated HDFS-17191:
-------------------------------
Description:
When we delete a large directory, collecting the blocks in the deleted subtree is time-consuming. Block collection is currently executed while holding the write lock, so deleting a large directory can block other RPCs for a long time. Asynchronous deletion of the collected blocks has already been implemented; see https://issues.apache.org/jira/browse/HDFS-16043.

In fact, collecting the blocks does not require the lock: once the subtree has been detached, it can no longer be reached by other RPCs, so we can collect the deleted subtree asynchronously and without locking. There may, however, be some problems:
1. When an ancestor of the subtree has a quota configured, the quota update is no longer synchronous and lags slightly behind the deletion.
2. Because the root directory always has the DirectoryWithQuotaFeature attribute, we must update the root directory's quotaUsage in any case. Since no quota limit can be configured on the root directory, I think the delayed quota update can be ignored for the root.

To address the first problem, we can check whether any ancestor directory of the subtree has a quota configured; if none does, use asynchronous collection. A configuration option can also let users decide whether to enable this quota check.

was: (the same description, without the reference to HDFS-16043)

> HDFS: Delete operation adds a thread to collect blocks asynchronously
> ---------------------------------------------------------------------
>
> Key: HDFS-17191
> URL: https://issues.apache.org/jira/browse/HDFS-17191
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Affects Versions: 3.4.0
> Reporter: Xiangyi Zhu
> Assignee: Xiangyi Zhu
> Priority: Major

--
This message was sent by Atlassian Jira (v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
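To make the decision described above concrete, here is a minimal, hypothetical sketch of the ancestor quota check that chooses between synchronous and asynchronous block collection. The class and field names (`Dir`, `quotaSet`, `chooseCollection`) are illustrative stand-ins, not the actual HDFS internals; in real code the quota flag would correspond to a DirectoryWithQuotaFeature with configured limits.

```java
// Hypothetical sketch: collect blocks without the write lock only when no
// ancestor directory (other than the root, whose quota feature is always
// present but carries no limit) enforces a quota.
final class DeleteDispatcher {
    static final class Dir {
        final Dir parent;
        final boolean quotaSet; // stand-in for a configured DirectoryWithQuotaFeature
        Dir(Dir parent, boolean quotaSet) { this.parent = parent; this.quotaSet = quotaSet; }
    }

    /** True if any ancestor of the subtree root, excluding "/", has a quota configured. */
    static boolean ancestorHasQuota(Dir subtree) {
        for (Dir d = subtree.parent; d != null && d.parent != null; d = d.parent) {
            if (d.quotaSet) return true;
        }
        return false; // the root's DirectoryWithQuotaFeature is deliberately ignored
    }

    static String chooseCollection(Dir subtree, boolean quotaCheckEnabled) {
        if (quotaCheckEnabled && ancestorHasQuota(subtree)) {
            return "sync";  // quota must stay accurate: collect under the lock
        }
        return "async";     // safe to collect blocks without the write lock
    }
}
```

The `quotaCheckEnabled` flag models the proposed configuration switch that lets users disable the check entirely.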
[jira] [Created] (HDFS-17191) HDFS: Delete operation adds a thread to collect blocks asynchronously
Xiangyi Zhu created HDFS-17191:
-------------------------------
Summary: HDFS: Delete operation adds a thread to collect blocks asynchronously
Key: HDFS-17191
URL: https://issues.apache.org/jira/browse/HDFS-17191
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Affects Versions: 3.4.0
Reporter: Xiangyi Zhu
Assignee: Xiangyi Zhu

When we delete a large directory, collecting the blocks in the deleted subtree is time-consuming. Block collection is currently executed while holding the write lock, so deleting a large directory can block other RPCs for a long time. Asynchronous deletion of the collected blocks has already been implemented; we can refer to that work.

In fact, collecting the blocks does not require the lock: once the subtree has been detached, it can no longer be reached by other RPCs, so we can collect the deleted subtree asynchronously and without locking. There may, however, be some problems:
1. When an ancestor of the subtree has a quota configured, the quota update is no longer synchronous and lags slightly behind the deletion.
2. Because the root directory always has the DirectoryWithQuotaFeature attribute, we must update the root directory's quotaUsage in any case. Since no quota limit can be configured on the root directory, I think the delayed quota update can be ignored for the root.

To address the first problem, we can check whether any ancestor directory of the subtree has a quota configured; if none does, use asynchronous collection. A configuration option can also let users decide whether to enable this quota check.
[jira] [Updated] (HDFS-16000) HDFS: Rename performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangyi Zhu updated HDFS-16000:
-------------------------------
Description:
Moving a large directory with rename takes a long time; for example, moving a directory of 1000W (10 million) entries takes about 40 seconds. When a large amount of data is deleted to the trash, such a large-directory move happens when the trash checkpoint runs. A user may also trigger a large-directory move directly, which makes the NameNode hold the lock so long that it is killed by ZKFC. A flame graph shows that most of the time is spent creating EnumCounters objects.

h3. Rename logic optimization:
* A rename currently computes the quota counts three times, regardless of how the source and target directories are configured: the first time to check whether the moved directory exceeds the target directory's quota, the second time to compute the moved directory's counts in order to update the source directory's quota, and the third time to compute them again in order to update the target directory's quota.
* I think some of these three quota calculations are unnecessary. For example, if no ancestor of the source or target directory has a quota configured, there is no need to compute the quota counts at all. Even when both the source and the target use quotas, three calculations are not needed: the first and third compute the same thing and only need to be done once.

was: (the same introduction and "Rename logic optimization" section as above, plus the following section that was removed)

h3. I think the following two points can optimize the efficiency of rename execution

h3. QuotaCount calculation time-consuming optimization:
* Create one QuotaCounts object for the directory quota-count calculation and pass it to each subsequent calculation function as a parameter, so that a new EnumCounters object is not allocated for every calculation.
* In addition, the flame graph shows that modifying QuotaCounts through a lambda takes longer than a plain method call, so plain methods are used to update the QuotaCounts counters.

> HDFS: Rename performance optimization
> --------------------------------------
>
> Key: HDFS-16000
> URL: https://issues.apache.org/jira/browse/HDFS-16000
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, namenode
> Affects Versions: 3.1.4, 3.3.1
> Reporter: Xiangyi Zhu
> Assignee: Xiangyi Zhu
> Priority: Major
> Labels: pull-request-available
> Attachments: 20210428-143238.svg, 20210428-171635-lambda.svg, HDFS-16000.patch
> Time Spent: 50m
> Remaining Estimate: 0h
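The allocation optimization in the earlier description can be illustrated with a small sketch: instead of allocating a fresh counters object for every directory visited, a single accumulator is passed down the recursion and updated with plain field writes (no lambda). The types below (`Node`, `Counts`) are stand-ins for the real QuotaCounts/EnumCounters machinery, not Hadoop classes.

```java
// Illustrative sketch of "pass the quota counts down as a parameter":
// one accumulator for the whole subtree walk, zero per-node allocations.
final class QuotaWalk {
    static final class Node {
        final long nsDelta;   // namespace contribution of this inode
        final long ssDelta;   // storagespace contribution of this inode
        final Node[] children;
        Node(long ns, long ss, Node... children) {
            this.nsDelta = ns; this.ssDelta = ss; this.children = children;
        }
    }

    static final class Counts { long namespace; long storagespace; }

    /** Accumulate into a caller-supplied Counts; no object created per directory. */
    static void computeQuota(Node node, Counts acc) {
        acc.namespace += node.nsDelta;     // plain field update, not a lambda
        acc.storagespace += node.ssDelta;
        for (Node child : node.children) {
            computeQuota(child, acc);      // same accumulator threaded down
        }
    }
}
```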
[jira] [Commented] (HDFS-16214) Asynchronously collect blocks and update quota when deleting
[ https://issues.apache.org/jira/browse/HDFS-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17506225#comment-17506225 ]

Xiangyi Zhu commented on HDFS-16214:
------------------------------------
[~hexiaoqiao] Thanks for your comment.
{quote}A. when clean inode being deleted in the set.{quote}
Adding inodes to and removing inodes from the set both happen inside the lock, so these operations are thread-safe.
{quote}B. it seems not only `create file` should be considered, other operations such as renameTo also need to check, right?{quote}
1. When /a/file is deleted, its inode is removed from the parent's children list inside the lock. If /a/file is renamed to /a/file1 while its blocks are being collected, the rename returns false, because the file is no longer among the children of the /a directory.
2. If /dir is renamed to /dir1 while /a/file is in the block-collection stage, the quota calculated for the rename does not include /a/file. When the block collection for /a/file finishes, the parent directory can be found via the IIP and its quota updated, so the quota information is eventually consistent.
{quote}C. what will happen if we delete parent inode during step 2?{quote}
The inodes whose blocks are to be collected are put into a queue, and a single dedicated thread collects the blocks, which guarantees the order of block collection and quota updates. Deleting the parent node during step 2 is therefore handled normally.
{quote}D. do you mean the delete logic will be the same at the Active and Standby side for an HA setup? If so, does checkpoint have to wait for every deletion to complete? In some corner cases, will that postpone the checkpoint, or will the checkpoint period no longer be under control?{quote}
In addition, to avoid deleted files still being able to allocate blocks across active/standby switchovers, the active node waits for all deletions to complete when it transitions to standby.

> Asynchronously collect blocks and update quota when deleting
> ------------------------------------------------------------
>
> Key: HDFS-16214
> URL: https://issues.apache.org/jira/browse/HDFS-16214
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 3.4.0
> Reporter: Xiangyi Zhu
> Assignee: Xiangyi Zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> The cost of deletion mainly comes from three pieces of logic: collecting blocks, deleting inodes from the InodeMap, and deleting blocks. The current deletion is divided into two major steps: step 1 acquires the lock, collects the blocks and inodes, deletes the inodes, and releases the lock; step 2 acquires the lock, deletes the blocks, and releases the lock.
> Step 2 already deletes blocks in batches, which bounds the lock hold time; the blocks could also be deleted asynchronously. Step 1, however, still holds the lock for a long time.
> For step 1, we can collect the blocks without holding the lock. The process is as follows. Step 1: acquire the lock, call parent.removeChild, write the editLog, release the lock. Step 2: collect the blocks, with no lock held. Step 3: acquire the lock, update the quota, release the lease, release the lock. Step 4: acquire the lock, delete the inodes from the InodeMap, release the lock. Step 5: acquire the lock, delete the blocks, release the lock.
> This process may cause some problems:
> 1. Suppose the file /a/b/c is being written and the directory /a/b is deleted. If the deletion has reached the block-collection stage when the client issues complete or addBlock for /a/b/c, that step is not locked and the delete of /a/b has already been written to the editLog, so the editLog order is delete /a/b followed by complete /a/b/c. When the standby node replays the editLog, /a/b/c has already been deleted, and replaying complete /a/b/c fails.
> *The process is as follows:*
> *write editLog order: delete /a/b/c -> delete /a/b -> complete /a/b/c*
> *replay editLog order: delete /a/b/c -> delete /a/b -> complete /a/b/c (not found)*
> 2. If a delete operation has reached the block-collection stage when the administrator runs saveNamespace and then restarts the NameNode, inodes that have already been removed from their parent's children list may remain in the InodeMap.
> To solve these problems, in step 1 we add the inodes being deleted to a set. When a write op is logged for a file (logAllocateBlockId/logCloseFile editLog), we check whether the file or one of its ancestor inodes is in the set, and throw a FileNotFoundException if it is.
> In addition, the execution of saveNamespace needs to wait for all inodes in the set to be removed before it proceeds.
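The set-based guard proposed in the issue (and discussed in answers A and B above) can be sketched as follows. This is an illustrative model only: the method names, the use of inode ids, and the leaf-first ancestor array are assumptions, not the real FSNamesystem API; in HDFS the set would be maintained under the namesystem lock, so the concurrent set here is just for self-containment.

```java
// Hypothetical sketch: inodes under asynchronous deletion are tracked in a
// set; write-path edit-log ops (allocate block / close file) fail with
// FileNotFoundException if the file or any ancestor is in the set, and
// saveNamespace waits until the set is empty.
import java.io.FileNotFoundException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class DeletingInodeGuard {
    private final Set<Long> deleting = ConcurrentHashMap.newKeySet();

    void markDeleting(long inodeId) { deleting.add(inodeId); }     // step 1, under lock
    void doneDeleting(long inodeId) { deleting.remove(inodeId); }  // after collection
    boolean isQuiescent()           { return deleting.isEmpty(); } // saveNamespace gate

    /** Called before logging allocateBlock/closeFile; ids are the file and its ancestors. */
    void checkNotDeleting(long[] fileAndAncestorIds) throws FileNotFoundException {
        for (long id : fileAndAncestorIds) {
            if (deleting.contains(id)) {
                throw new FileNotFoundException("inode " + id + " is being deleted");
            }
        }
    }
}
```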
[jira] [Updated] (HDFS-16214) Asynchronously collect blocks and update quota when deleting
[ https://issues.apache.org/jira/browse/HDFS-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangyi Zhu updated HDFS-16214:
-------------------------------
Summary: Asynchronously collect blocks and update quota when deleting (was: Lock optimization for large deletes, no locks on the collection block)

> Asynchronously collect blocks and update quota when deleting
> ------------------------------------------------------------
>
> Key: HDFS-16214
> URL: https://issues.apache.org/jira/browse/HDFS-16214
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 3.4.0
> Reporter: Xiangyi Zhu
> Assignee: Xiangyi Zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
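The five-step delete flow proposed above (short locked phases around one unlocked block-collection phase) can be outlined roughly as below. The lock and the phase bodies are stand-ins; real code would also have to handle the editLog-ordering and saveNamespace issues the description raises.

```java
// Hypothetical outline of the proposed five-step delete: only steps 1, 3, 4
// and 5 hold the write lock; step 2 (the expensive block collection) does not.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class FivePhaseDelete {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final List<String> phases = new ArrayList<>(); // records what ran, for illustration

    void delete(String path) {
        lock.writeLock().lock();               // step 1: detach subtree, log the edit
        try { phases.add("removeChild+editLog"); } finally { lock.writeLock().unlock(); }

        phases.add("collectBlocks (no lock)"); // step 2: walk the detached subtree

        lock.writeLock().lock();               // step 3: quota and lease cleanup
        try { phases.add("updateQuota+releaseLease"); } finally { lock.writeLock().unlock(); }

        lock.writeLock().lock();               // step 4: remove inodes from InodeMap
        try { phases.add("removeFromInodeMap"); } finally { lock.writeLock().unlock(); }

        lock.writeLock().lock();               // step 5: delete blocks (batched in HDFS)
        try { phases.add("deleteBlocks"); } finally { lock.writeLock().unlock(); }
    }
}
```

Breaking one long critical section into four short ones is the whole point: each locked phase is O(1) or batched, so no single lock hold grows with the size of the deleted subtree.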
[jira] [Commented] (HDFS-16214) Lock optimization for large deletes, no locks on the collection block
[ https://issues.apache.org/jira/browse/HDFS-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17478544#comment-17478544 ]

Xiangyi Zhu commented on HDFS-16214:
------------------------------------
[~John Smith] This issue aims to solve the long lock hold time while collecting blocks when deleting large directories. [HDFS-16043|https://issues.apache.org/jira/browse/HDFS-16043] implements asynchronous deletion of the blocks themselves. The two issues are not the same.

> Lock optimization for large deletes, no locks on the collection block
> ----------------------------------------------------------------------
>
> Key: HDFS-16214
> URL: https://issues.apache.org/jira/browse/HDFS-16214
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 3.4.0
> Reporter: Xiangyi Zhu
> Assignee: Xiangyi Zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
[jira] [Updated] (HDFS-16043) Add markedDeleteBlockScrubberThread to delete blocks asynchronously
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangyi Zhu updated HDFS-16043:
-------------------------------
Description: Add markedDeleteBlockScrubberThread to delete blocks asynchronously.

was: Deleting a large directory caused the NameNode to hold the lock for too long, so our NameNode was killed by ZKFC. A flame graph shows that the main cost is the QuotaCount calculation when deleting inodes and the removeBlocks(toRemovedBlocks) call, with removeBlocks(toRemovedBlocks) taking the larger share of the time.

h3. Solution:
1. Process removeBlocks asynchronously: start a thread in the BlockManager to process the deleted blocks and bound the lock hold time.
2. Optimize the QuotaCount calculation; this is similar to the optimization in HDFS-16000.

h3. Comparison before and after optimization (deleting 1000W (10 million) inodes and 1000W blocks):
*before:* remove inode elapsed time: 7691 ms; remove block elapsed time: 11107 ms
*after:* remove inode elapsed time: 4149 ms; remove block elapsed time: 0 ms

> Add markedDeleteBlockScrubberThread to delete blocks asynchronously
> -------------------------------------------------------------------
>
> Key: HDFS-16043
> URL: https://issues.apache.org/jira/browse/HDFS-16043
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, namenode
> Affects Versions: 3.4.0
> Reporter: Xiangyi Zhu
> Assignee: Xiangyi Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
> Attachments: 20210527-after.svg, 20210527-before.svg
> Time Spent: 12.5h
> Remaining Estimate: 0h
>
> Add markedDeleteBlockScrubberThread to delete blocks asynchronously.
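The scrubber-thread idea above can be sketched as a background consumer of a queue of collected block lists, deleting in bounded batches so that no single lock hold grows with directory size. The names here (`markedDeleteQueue`, the batch size, the `removed` list standing in for BlockManager removal) are assumptions for illustration, not the actual HDFS-16043 code.

```java
// Minimal sketch of a "marked delete" scrubber: delete RPCs only enqueue the
// collected block list; a dedicated thread removes blocks in bounded batches.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class MarkedDeleteScrubber implements Runnable {
    private final BlockingQueue<List<Long>> markedDeleteQueue = new LinkedBlockingQueue<>();
    private final int batchSize;
    private volatile boolean running = true;
    final List<Long> removed = new ArrayList<>(); // stand-in for BlockManager removal

    MarkedDeleteScrubber(int batchSize) { this.batchSize = batchSize; }

    void enqueue(List<Long> collectedBlocks) { markedDeleteQueue.add(collectedBlocks); }
    void shutdown() { running = false; }

    @Override public void run() {
        // Drain until shut down AND the queue is empty, so nothing is lost.
        while (running || !markedDeleteQueue.isEmpty()) {
            List<Long> blocks = markedDeleteQueue.poll();
            if (blocks == null) continue;
            for (int i = 0; i < blocks.size(); i += batchSize) {
                int end = Math.min(i + batchSize, blocks.size());
                // real code would take/release the namesystem write lock per batch
                removed.addAll(blocks.subList(i, end));
            }
        }
    }
}
```

In a real deployment the scrubber would run on its own thread (`new Thread(scrubber).start()`); the batching is what turns one long lock hold into many short ones.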
[jira] [Commented] (HDFS-16214) Lock optimization for large deletes, no locks on the collection block
[ https://issues.apache.org/jira/browse/HDFS-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476944#comment-17476944 ]

Xiangyi Zhu commented on HDFS-16214:
------------------------------------
[~hexiaoqiao] [~weichiu] [~sodonnell] If the block collection is done without the lock, or processed asynchronously, the quota update will not be accurate in real time, but it will be eventually consistent. I think it is acceptable to sacrifice real-time quota accuracy for the performance improvement. I look forward to your reply.

> Lock optimization for large deletes, no locks on the collection block
> ----------------------------------------------------------------------
>
> Key: HDFS-16214
> URL: https://issues.apache.org/jira/browse/HDFS-16214
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 3.4.0
> Reporter: Xiangyi Zhu
> Assignee: Xiangyi Zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
[jira] [Updated] (HDFS-16214) Lock optimization for large deletes, no locks on the collection block
[ https://issues.apache.org/jira/browse/HDFS-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangyi Zhu updated HDFS-16214:
-------------------------------
Description:
The cost of deletion mainly comes from three pieces of logic: collecting blocks, deleting inodes from the InodeMap, and deleting blocks. The current deletion is divided into two major steps: step 1 acquires the lock, collects the blocks and inodes, deletes the inodes, and releases the lock; step 2 acquires the lock, deletes the blocks, and releases the lock.

Step 2 already deletes blocks in batches, which bounds the lock hold time; the blocks could also be deleted asynchronously. Step 1, however, still holds the lock for a long time.

For step 1, we can collect the blocks without holding the lock. The process is as follows. Step 1: acquire the lock, call parent.removeChild, write the editLog, release the lock. Step 2: collect the blocks, with no lock held. Step 3: acquire the lock, update the quota, release the lease, release the lock. Step 4: acquire the lock, delete the inodes from the InodeMap, release the lock. Step 5: acquire the lock, delete the blocks, release the lock.

This process may cause some problems:

1. Suppose the file /a/b/c is being written and the directory /a/b is deleted. If the deletion has reached the block-collection stage when the client issues complete or addBlock for /a/b/c, that step is not locked and the delete of /a/b has already been written to the editLog, so the editLog order is delete /a/b followed by complete /a/b/c. When the standby node replays the editLog, /a/b/c has already been deleted, and replaying complete /a/b/c fails.

*The process is as follows:*
*write editLog order: delete /a/b/c -> delete /a/b -> complete /a/b/c*
*replay editLog order: delete /a/b/c -> delete /a/b -> complete /a/b/c (not found)*

2. If a delete operation has reached the block-collection stage when the administrator runs saveNamespace and then restarts the NameNode, inodes that have already been removed from their parent's children list may remain in the InodeMap.

To solve these problems, in step 1 we add the inodes being deleted to a set. When a write op is logged for a file (logAllocateBlockId/logCloseFile editLog), we check whether the file or one of its ancestor inodes is in the set, and throw a FileNotFoundException if it is.

In addition, the execution of saveNamespace needs to wait for all inodes in the set to be removed before it proceeds.

was: (the same description, with garbled bold/color markup around "(not found)" in the replay-order line)
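The five-step, segmented-lock flow proposed for HDFS-16214 above can be sketched as follows. This is a minimal illustration only: the `Subtree` type, the helper comments, and the lock structure are stand-ins for the NameNode internals, not the actual HDFS APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the proposed delete flow: only steps 1 and 3-5 take the
// namesystem write lock; block collection (step 2) runs unlocked because
// the detached subtree is no longer reachable by other RPCs.
public class SegmentedDelete {
    // Stand-in for a detached directory subtree and the block ids under it.
    static class Subtree {
        final List<Long> blockIds = new ArrayList<>();
    }

    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    final List<Long> removedBlocks = new ArrayList<>();

    public void delete(Subtree subtree) {
        // Step 1: under the write lock, detach the subtree from its parent
        // and persist the delete to the editLog.
        underWriteLock(() -> {
            // parent.removeChild(subtree); editLog.logDelete(path);
        });

        // Step 2: NO lock held. The subtree is unreachable, so its blocks
        // can be collected without blocking other RPCs.
        List<Long> collected = new ArrayList<>(subtree.blockIds);

        // Step 3: re-acquire the lock to update quota and release leases.
        underWriteLock(() -> { /* updateQuota(); releaseLeases(); */ });
        // Step 4: remove the subtree's inodes from the InodeMap.
        underWriteLock(() -> { /* inodeMap.remove(...); */ });
        // Step 5: delete the collected blocks (batched in practice).
        underWriteLock(() -> removedBlocks.addAll(collected));
    }

    private void underWriteLock(Runnable body) {
        fsLock.writeLock().lock();
        try {
            body.run();
        } finally {
            fsLock.writeLock().unlock();
        }
    }
}
```

The point of the segmentation is that the potentially huge traversal in step 2 no longer sits inside any critical section; each locked step is short and bounded.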
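The "Set of inodes being deleted" guard described in the HDFS-16214 proposal could look roughly like the sketch below. The class and method names are hypothetical; the real change would hook into the editLog write path and saveNamespace.

```java
import java.io.FileNotFoundException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative guard: while a subtree is in the unlocked block-collection
// phase, its inode ids live in this set. Write ops against any file whose
// path crosses the set are rejected, and saveNamespace waits for the set
// to drain before taking an image.
public class DeletionGuard {
    private final Set<Long> deleting = ConcurrentHashMap.newKeySet();

    void markDeleting(long inodeId) {
        deleting.add(inodeId);
    }

    void unmark(long inodeId) {
        deleting.remove(inodeId);
    }

    // Called before logging logAllocateBlockId/logCloseFile: reject the op
    // if the file or any of its ancestors is currently being deleted.
    void checkWriteAllowed(long[] inodeIdsOnPath) throws FileNotFoundException {
        for (long id : inodeIdsOnPath) {
            if (deleting.contains(id)) {
                throw new FileNotFoundException(
                    "inode " + id + " is being deleted");
            }
        }
    }

    // saveNamespace must not run while deletions are in flight.
    void awaitQuiescent() throws InterruptedException {
        while (!deleting.isEmpty()) {
            Thread.sleep(10);
        }
    }
}
```

Rejecting the write before the edit is logged is what restores a consistent editLog order for the standby: a `complete /a/b/c` can no longer be logged after a `delete /a/b`.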
[jira] [Updated] (HDFS-16421) Remove RouterRpcFairnessPolicyController ConcurrentNS to avoid renewLease being unavailable
[ https://issues.apache.org/jira/browse/HDFS-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu updated HDFS-16421: --- Summary: Remove RouterRpcFairnessPolicyController ConcurrentNS to avoid renewLease being unavailable (was: RouterRpcFairnessPolicyController remove ConcurrentNS ) > Remove RouterRpcFairnessPolicyController ConcurrentNS to avoid renewLease > being unavailable > --- > > Key: HDFS-16421 > URL: https://issues.apache.org/jira/browse/HDFS-16421 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > > When using the RouterRpcFairnessConstants strategy, if a NameNode's rpc is > slow or unresponsive, it is easy to exhaust the available concurrent > handlers, and the client will no longer be able to renewLease normally. > I think CONCURRENT_NS can be removed. For a CONCURRENT rpc, we can > traverse each NS and acquire a handler from that NS's own pool, instead of > acquiring just one handler from the CONCURRENT pool. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16421) RouterRpcFairnessPolicyController remove ConcurrentNS
[ https://issues.apache.org/jira/browse/HDFS-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu updated HDFS-16421: --- Summary: RouterRpcFairnessPolicyController remove ConcurrentNS (was: RouterRpcFairnessConstants remove ConcurrentNS ) > RouterRpcFairnessPolicyController remove ConcurrentNS > -- > > Key: HDFS-16421 > URL: https://issues.apache.org/jira/browse/HDFS-16421 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > > When using the RouterRpcFairnessConstants strategy, if a NameNode's rpc is > slow or unresponsive, it is easy to exhaust the available concurrent > handlers, and the client will no longer be able to renewLease normally. > I think CONCURRENT_NS can be removed. For a CONCURRENT rpc, we can > traverse each NS and acquire a handler from that NS's own pool, instead of > acquiring just one handler from the CONCURRENT pool. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16421) RouterRpcFairnessConstants remove ConcurrentNS
Xiangyi Zhu created HDFS-16421: -- Summary: RouterRpcFairnessConstants remove ConcurrentNS Key: HDFS-16421 URL: https://issues.apache.org/jira/browse/HDFS-16421 Project: Hadoop HDFS Issue Type: Improvement Components: rbf Affects Versions: 3.4.0 Reporter: Xiangyi Zhu Assignee: Xiangyi Zhu When using the RouterRpcFairnessConstants strategy, if a NameNode's rpc is slow or unresponsive, it is easy to exhaust the available concurrent handlers, and the client will no longer be able to renewLease normally. I think CONCURRENT_NS can be removed. For a CONCURRENT rpc, we can traverse each NS and acquire a handler from that NS's own pool, instead of acquiring just one handler from the CONCURRENT pool. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
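The per-NS acquisition idea in HDFS-16421 can be sketched with plain semaphores. This is an illustration of the scheme, not the Router's actual fairness controller API: a fan-out (CONCURRENT) call acquires one permit from each target nameservice's own pool, backing out cleanly if any pool is exhausted, so no shared CONCURRENT_NS pool is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Sketch: one handler-permit pool per nameservice. A concurrent (fan-out)
// rpc must hold a permit from every NS it touches; if any NS's pool is
// empty, already-acquired permits are released and the call is rejected.
public class PerNsPermits {
    private final Map<String, Semaphore> perNs = new ConcurrentHashMap<>();

    public PerNsPermits(Map<String, Integer> handlersPerNs) {
        handlersPerNs.forEach((ns, n) -> perNs.put(ns, new Semaphore(n)));
    }

    public boolean tryAcquireAll(List<String> nameservices) {
        List<Semaphore> held = new ArrayList<>();
        for (String ns : nameservices) {
            Semaphore s = perNs.get(ns);
            if (s == null || !s.tryAcquire()) {
                held.forEach(Semaphore::release); // back out on failure
                return false;
            }
            held.add(s);
        }
        return true;
    }

    public void releaseAll(List<String> nameservices) {
        nameservices.forEach(ns -> perNs.get(ns).release());
    }
}
```

With this shape, a slow NS only drains its own pool, so rpcs that touch other nameservices (such as renewLease fan-outs to healthy NSs) are not starved by a shared concurrent pool.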
[jira] [Resolved] (HDFS-16019) HDFS: Inode CheckPoint
[ https://issues.apache.org/jira/browse/HDFS-16019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu resolved HDFS-16019. Resolution: Later > HDFS: Inode CheckPoint > --- > > Key: HDFS-16019 > URL: https://issues.apache.org/jira/browse/HDFS-16019 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namanode >Affects Versions: 3.3.1 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > > *Background* > The OIV IMAGE analysis tool has brought us many benefits, such as file size > distribution, cold and hot data, and abnormal-growth directory analysis. But in > my opinion it is too slow, especially for a big IMAGE. > After Hadoop 2.3, the format of the IMAGE changed. The OIV tool must > load the entire IMAGE into memory to output the inode > information in text format. For a large IMAGE, this process takes a long > time, consumes significant resources, and requires a machine with a large amount of memory. > HDFS does provide the dfs.namenode.legacy-oiv-image.dir parameter to > produce the old IMAGE format through CheckPoint, and parsing the old IMAGE does > not require many resources, but we still need to parse the IMAGE again with > the hdfs oiv_legacy command to get the text form of the inodes, which > is relatively time-consuming. > *Solution* > We can have the standby node periodically checkpoint the inodes and serialize > them in text form. For output, different FileSystems can be used according > to the configuration, such as the local file system or the HDFS file system. > The advantage of the HDFS file system is that we can analyze the inodes > directly with Spark/Hive. I think the block information for each inode is > not of much use; the size of the file and the number of > replicas are more useful to us. > In addition, sequential output of the inodes is not necessary. We can > speed up the inode CheckPoint by partitioning the serialized inodes into > different output files: a producer thread puts inodes into a queue, and > multiple consumer threads drain the queue and write to > different partition files. The output files can also be compressed to > reduce disk IO. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
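The producer/consumer partitioning described in HDFS-16019 can be sketched as below. This is a simplified illustration under stated assumptions: records are plain "id,size" strings, partitions are in-memory lists chosen by inode id, and a real implementation would write per-partition (optionally compressed) files instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: a producer enqueues serialized inode records; several consumer
// threads drain the queue and append each record to the partition selected
// by inode id. A poison pill per consumer signals end of input.
public class PartitionedInodeDump {
    private static final String POISON = "__END__";
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    final List<List<String>> partitions = new ArrayList<>();

    public PartitionedInodeDump(int numPartitions) {
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(Collections.synchronizedList(new ArrayList<>()));
        }
    }

    // Producer side: called while walking the inode table.
    public void produce(long inodeId, long fileSize) {
        queue.add(inodeId + "," + fileSize);
    }

    // Consumer side: run after producing finishes in this simplified sketch.
    public void drain(int numConsumers) throws InterruptedException {
        for (int i = 0; i < numConsumers; i++) {
            queue.add(POISON); // one pill per consumer, after all records
        }
        List<Thread> consumers = new ArrayList<>();
        for (int i = 0; i < numConsumers; i++) {
            Thread t = new Thread(() -> {
                try {
                    for (String rec = queue.take(); !rec.equals(POISON); rec = queue.take()) {
                        long id = Long.parseLong(rec.split(",")[0]);
                        partitions.get((int) (id % partitions.size())).add(rec);
                    }
                } catch (InterruptedException ignored) {
                }
            });
            t.start();
            consumers.add(t);
        }
        for (Thread t : consumers) {
            t.join();
        }
    }
}
```

Because partition assignment depends only on the inode id, output order within a partition does not matter, which is exactly why the sequential dump can be parallelized.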
[jira] [Commented] (HDFS-16043) Add markedDeleteBlockScrubberThread to delete blocks asynchronously
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17470561#comment-17470561 ] Xiangyi Zhu commented on HDFS-16043: [~mofei] I think even if block collection is made asynchronous, as long as it still holds the lock, the big-lock problem remains. The idea in HDFS-16214 is to collect blocks without holding the lock. I will submit the code for this next week. > Add markedDeleteBlockScrubberThread to delete blocks asynchronously > --- > > Key: HDFS-16043 > URL: https://issues.apache.org/jira/browse/HDFS-16043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namanode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > Labels: pull-request-available > Attachments: 20210527-after.svg, 20210527-before.svg > > Time Spent: 6h 50m > Remaining Estimate: 0h > > Deleting a large directory caused the NN to hold the lock for too long, > which caused our NameNode to be killed by ZKFC. > The flame graph shows that the main time cost is the QuotaCount > calculation when removing blocks (toRemovedBlocks) and deleting > inodes, with removeBlocks(toRemovedBlocks) taking the larger share. > h3. Solution: > 1. RemoveBlocks is processed asynchronously. A thread is started in the > BlockManager to process the deleted blocks and bound the lock time. > 2. QuotaCount calculation optimization, similar to the optimization > in HDFS-16000. > h3. Comparison before and after optimization: > Test: delete 10 million inodes and 10 million blocks. > *before:* > remove inode elapsed time: 7691 ms > remove block elapsed time: 11107 ms > *after:* > remove inode elapsed time: 4149 ms > remove block elapsed time: 0 ms -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
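The asynchronous deletion that HDFS-16043 describes can be sketched as follows. This is an illustrative stand-in for the real `markedDeleteBlockScrubberThread`: `scrubOnce()` is one iteration of the loop such a thread would run, and the lock/BlockManager interaction is reduced to comments.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: the delete RPC only enqueues its collected block list (cheap,
// no long lock hold); a background scrubber removes blocks in bounded
// batches so each lock acquisition stays short.
public class MarkedDeleteScrubber {
    private static final int BATCH = 1000;
    private final BlockingQueue<List<Long>> markedDeleteQueue = new LinkedBlockingQueue<>();
    final AtomicLong removed = new AtomicLong();

    // Called by the delete path after collecting blocks.
    public void markBlocksAsDeleted(List<Long> blocks) {
        markedDeleteQueue.add(blocks);
    }

    // One iteration of the scrubber thread's loop.
    public void scrubOnce() {
        List<Long> blocks = markedDeleteQueue.poll();
        if (blocks == null) {
            return;
        }
        for (int from = 0; from < blocks.size(); from += BATCH) {
            int to = Math.min(from + BATCH, blocks.size());
            // In the real NameNode: acquire the write lock, remove
            // blocks[from..to) from the BlockManager, release the lock.
            removed.addAndGet(to - from);
        }
    }
}
```

This is why the "remove block elapsed time" in the benchmark drops to ~0 ms from the RPC's point of view: the caller only pays for the enqueue, and the actual removal happens off the RPC path.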
[jira] [Commented] (HDFS-16043) Add markedDeleteBlockScrubberThread to delete blocks asynchronously
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17470363#comment-17470363 ] Xiangyi Zhu commented on HDFS-16043: [~mofei] Thanks a lot for your feedback. > Add markedDeleteBlockScrubberThread to delete blocks asynchronously > --- > > Key: HDFS-16043 > URL: https://issues.apache.org/jira/browse/HDFS-16043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namanode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > Labels: pull-request-available > Attachments: 20210527-after.svg, 20210527-before.svg > > Time Spent: 5h 50m > Remaining Estimate: 0h > > Deleting a large directory caused the NN to hold the lock for too long, > which caused our NameNode to be killed by ZKFC. > The flame graph shows that the main time cost is the QuotaCount > calculation when removing blocks (toRemovedBlocks) and deleting > inodes, with removeBlocks(toRemovedBlocks) taking the larger share. > h3. Solution: > 1. RemoveBlocks is processed asynchronously. A thread is started in the > BlockManager to process the deleted blocks and bound the lock time. > 2. QuotaCount calculation optimization, similar to the optimization > in HDFS-16000. > h3. Comparison before and after optimization: > Test: delete 10 million inodes and 10 million blocks. > *before:* > remove inode elapsed time: 7691 ms > remove block elapsed time: 11107 ms > *after:* > remove inode elapsed time: 4149 ms > remove block elapsed time: 0 ms -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16412) Add metrics to support obtaining file size distribution
[ https://issues.apache.org/jira/browse/HDFS-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu updated HDFS-16412: --- Description: Use a RangeMap (fileSizeRange) to store a counter per file-size interval: each key of the RangeMap is an interval, and its value is the counter for that interval. *Counter update:* When a file's size changes or the file is deleted, the file size is obtained and the counter for the corresponding interval is updated. *Interval division:* By default the following intervals are initialized at startup; they can also be initialized through the configuration file. 0MB 0-16MB 16-32MB 32-64MB 64-128MB 128-256MB 256-512MB >512MB was: Use a RangeMap (fileSizeRange) to store a counter per file-size interval: each key of the RangeMap is an interval, and its value is the counter for that interval. *Counter update:* When a file's size changes or the file is deleted, the file size is obtained and the counter for the corresponding interval is updated. *Interval division:* By default the following intervals are initialized at startup; they can also be initialized through the configuration file. 0MB 0-16MB 16-32MB 32-64MB 64-128MB 128-256MB 256-512MB >512MB > Add metrics to support obtaining file size distribution > --- > > Key: HDFS-16412 > URL: https://issues.apache.org/jira/browse/HDFS-16412 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Minor > > Use a RangeMap (fileSizeRange) to store a counter per file-size interval: each key > is an interval, and its value is the counter for that interval. > *Counter update:* > When a file's size changes or the file is deleted, the file size is obtained, > and the counter for the corresponding interval is updated. > *Interval division:* > By default the following intervals are initialized at startup; they can also > be initialized through the configuration file. > 0MB > 0-16MB > 16-32MB > 32-64MB > 64-128MB > 128-256MB > 256-512MB > >512MB -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
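The interval counters described in HDFS-16412 can be sketched with a sorted map instead of Guava's RangeMap; all names here are illustrative. Each bucket is keyed by its lower bound, and floorEntry() finds the bucket for a given size in O(log n).

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.LongAdder;

// Sketch of the file-size distribution metric: buckets keyed by lower
// bound, matching the default intervals 0MB, 0-16MB, ..., >512MB from the
// issue. Buckets here are half-open except the exact-0 and >512MB ones.
public class FileSizeDistribution {
    private static final long MB = 1L << 20;
    private final TreeMap<Long, LongAdder> buckets = new TreeMap<>();

    public FileSizeDistribution() {
        // lower bounds for: 0MB, 0-16MB, 16-32MB, ..., 256-512MB, >512MB
        for (long lo : new long[] {0, 1, 16 * MB, 32 * MB, 64 * MB,
                                   128 * MB, 256 * MB, 512 * MB + 1}) {
            buckets.put(lo, new LongAdder());
        }
    }

    // Called when a file's size changes; pass oldSize < 0 for a new file.
    public void onSizeChanged(long oldSize, long newSize) {
        if (oldSize >= 0) {
            buckets.floorEntry(oldSize).getValue().decrement();
        }
        buckets.floorEntry(newSize).getValue().increment();
    }

    // Called when a file is deleted.
    public void onDeleted(long size) {
        buckets.floorEntry(size).getValue().decrement();
    }

    // Current count of the bucket that `size` falls into.
    public long countFor(long size) {
        return buckets.floorEntry(size).getValue().sum();
    }
}
```

LongAdder keeps the update path cheap under concurrent RPC handlers, which matters since every size change and delete touches a counter.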
[jira] [Created] (HDFS-16412) Add metrics to support obtaining file size distribution
Xiangyi Zhu created HDFS-16412: -- Summary: Add metrics to support obtaining file size distribution Key: HDFS-16412 URL: https://issues.apache.org/jira/browse/HDFS-16412 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 3.4.0 Reporter: Xiangyi Zhu Assignee: Xiangyi Zhu Use RangeMapRange "Map fileSizeRange" to store counters at different intervals. RangeMap key is a specific interval, and value is the counter corresponding to the interval. ** *Counter update:* When the file size changes or the file is deleted, the file size is obtained, and the counter in the corresponding interval is called to update the counter. ** *Interval division:* The default is to initialize the startup according to the following interval, or it can be initialized through the configuration file. 0MB 0-16MB 16-32MB 32-64MB 64-128MB 128-256MB 256-512MB >512MB -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-16276) RBF: Remove the useless configuration of rpc isolation in md
[ https://issues.apache.org/jira/browse/HDFS-16276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu reassigned HDFS-16276: -- Assignee: Xiangyi Zhu > RBF: Remove the useless configuration of rpc isolation in md > - > > Key: HDFS-16276 > URL: https://issues.apache.org/jira/browse/HDFS-16276 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > > The *dfs.federation.router.fairness.enable* configuration is not used in the > code, but it still appears in the md documentation, so we should delete it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16276) RBF: Remove the useless configuration of rpc isolation in md
Xiangyi Zhu created HDFS-16276: -- Summary: RBF: Remove the useless configuration of rpc isolation in md Key: HDFS-16276 URL: https://issues.apache.org/jira/browse/HDFS-16276 Project: Hadoop HDFS Issue Type: Improvement Components: rbf Affects Versions: 3.4.0 Reporter: Xiangyi Zhu The *dfs.federation.router.fairness.enable* configuration is not used in the code, but it still appears in the md documentation, so we should delete it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16273) RBF: RouterRpcFairnessPolicyController add availableHandleOnPerNs metrics
Xiangyi Zhu created HDFS-16273: -- Summary: RBF: RouterRpcFairnessPolicyController add availableHandleOnPerNs metrics Key: HDFS-16273 URL: https://issues.apache.org/jira/browse/HDFS-16273 Project: Hadoop HDFS Issue Type: Improvement Components: rbf Affects Versions: 3.4.0 Reporter: Xiangyi Zhu Add the availableHandlerOnPerNs metrics to monitor whether the number of handlers configured for each NS is reasonable when using RouterRpcFairnessPolicyController. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-16273) RBF: RouterRpcFairnessPolicyController add availableHandleOnPerNs metrics
[ https://issues.apache.org/jira/browse/HDFS-16273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu reassigned HDFS-16273: -- Assignee: Xiangyi Zhu > RBF: RouterRpcFairnessPolicyController add availableHandleOnPerNs metrics > - > > Key: HDFS-16273 > URL: https://issues.apache.org/jira/browse/HDFS-16273 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > > Add the availableHandlerOnPerNs metrics to monitor whether the number of > handlers configured for each NS is reasonable when using > RouterRpcFairnessPolicyController. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16043) Add markedDeleteBlockScrubberThread to delete blocks asynchronously
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu updated HDFS-16043: --- Summary: Add markedDeleteBlockScrubberThread to delete blocks asynchronously (was: HDFS : Add markedDeleteBlockScrubberThread to delete blcoks asynchronously) > Add markedDeleteBlockScrubberThread to delete blocks asynchronously > --- > > Key: HDFS-16043 > URL: https://issues.apache.org/jira/browse/HDFS-16043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namanode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > Labels: pull-request-available > Attachments: 20210527-after.svg, 20210527-before.svg > > Time Spent: 5h 10m > Remaining Estimate: 0h > > Deleting a large directory caused the NN to hold the lock for too long, > which caused our NameNode to be killed by ZKFC. > The flame graph shows that the main time cost is the QuotaCount > calculation when removing blocks (toRemovedBlocks) and deleting > inodes, with removeBlocks(toRemovedBlocks) taking the larger share. > h3. Solution: > 1. RemoveBlocks is processed asynchronously. A thread is started in the > BlockManager to process the deleted blocks and bound the lock time. > 2. QuotaCount calculation optimization, similar to the optimization > in HDFS-16000. > h3. Comparison before and after optimization: > Test: delete 10 million inodes and 10 million blocks. > *before:* > remove inode elapsed time: 7691 ms > remove block elapsed time: 11107 ms > *after:* > remove inode elapsed time: 4149 ms > remove block elapsed time: 0 ms -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-16043) HDFS : Add markedDeleteBlockScrubberThread to delete blcoks asynchronously
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu updated HDFS-16043: --- Summary: HDFS : Add markedDeleteBlockScrubberThread to delete blcoks asynchronously (was: HDFS : Delete performance optimization) > HDFS : Add markedDeleteBlockScrubberThread to delete blcoks asynchronously > -- > > Key: HDFS-16043 > URL: https://issues.apache.org/jira/browse/HDFS-16043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namanode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > Labels: pull-request-available > Attachments: 20210527-after.svg, 20210527-before.svg > > Time Spent: 5h > Remaining Estimate: 0h > > Deleting a large directory caused the NN to hold the lock for too long, > which caused our NameNode to be killed by ZKFC. > The flame graph shows that the main time cost is the QuotaCount > calculation when removing blocks (toRemovedBlocks) and deleting > inodes, with removeBlocks(toRemovedBlocks) taking the larger share. > h3. Solution: > 1. RemoveBlocks is processed asynchronously. A thread is started in the > BlockManager to process the deleted blocks and bound the lock time. > 2. QuotaCount calculation optimization, similar to the optimization > in HDFS-16000. > h3. Comparison before and after optimization: > Test: delete 10 million inodes and 10 million blocks. > *before:* > remove inode elapsed time: 7691 ms > remove block elapsed time: 11107 ms > *after:* > remove inode elapsed time: 4149 ms > remove block elapsed time: 0 ms -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-16214) Lock optimization for large deleteing, no locks on the collection block
Xiangyi Zhu created HDFS-16214: -- Summary: Lock optimization for large deleteing, no locks on the collection block Key: HDFS-16214 URL: https://issues.apache.org/jira/browse/HDFS-16214 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 3.4.0 Reporter: Xiangyi Zhu The time-consuming deletion is mainly reflected in three logics , collecting blocks, deleting Inode from InodeMap, and deleting blocks. The current deletion is divided into two major steps. Step 1 acquires the lock, collects the block and inode, deletes the inode, and releases the lock. Step 2 Acquire the lock and delete the block to release the lock. Phase 2 is currently deleting blocks in batches, which can control the lock holding time. Here we can also delete blocks asynchronously. Now step 1 still has the problem of holding the lock for a long time. For stage 1, we can make the collection block not hold the lock. The process is as follows, step 1 obtains the lock, parent.removeChild, writes to editLog, releases the lock. Step 2 no lock, collects the block. Step 3 acquire lock, update quota, release lease, release lock. Step 4 acquire lock, delete Inode from InodeMap, release lock. Step 5 acquire lock, delete block to release lock. There may be some problems following the above process: 1. When the /a/b/c file is open, then delete the /a/b directory. If the deletion is performed to the collecting block stage, the client writes complete or addBlock to the /a/b/c file at this time. This step is not locked and delete /a/b and editLog has been written successfully. In this case, the order of editLog is delete /a/c and complete /a/b/c. In this case, the standby node playback editLog /a/b/c file has been deleted, and then go to complete /a/b/c file will be abnormal. 2. If a delete operation is executed to the stage of collecting block, then the administrator executes saveNameSpace, and then restarts Namenode. 
This can leave inodes that were already removed from the parent's childList still present in the InodeMap. To solve these problems, step 1 adds each inode being deleted to a Set. When a file write op is logged (logAllocateBlockId/logCloseFile editLog), check whether the file or one of its parent inodes is in the Set, and if so throw a FileNotFoundException. In addition, saveNamespace must wait until all inodes have been removed from the Set before it executes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
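The guard this proposal describes — a set of inodes under deletion that write ops consult before logging, and that saveNamespace drains before running — could look roughly like the following. This is an illustrative sketch only; the class and method names are hypothetical, not the actual NameNode code:

```java
import java.io.FileNotFoundException;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of the "deleting inode" guard: subtree roots detached
 * in step 1 are registered here while their blocks are collected without the
 * namesystem lock. Write ops check the set; saveNamespace waits for it to drain.
 */
public class DeletingInodeGuard {
    private final Set<Long> deleting = ConcurrentHashMap.newKeySet();

    /** Step 1: after parent.removeChild and the editLog write, mark the subtree root. */
    public void markDeleting(long inodeId) {
        deleting.add(inodeId);
    }

    /** After step 4, once the inodes are gone from the InodeMap. */
    public void doneDeleting(long inodeId) {
        deleting.remove(inodeId);
    }

    /**
     * Called before logging a file write op (logAllocateBlockId/logCloseFile):
     * if the file or any of its ancestors is being deleted, fail the op so the
     * editLog never records a write after the delete.
     */
    public void checkNotDeleting(List<Long> pathInodeIds) throws FileNotFoundException {
        for (long id : pathInodeIds) {
            if (deleting.contains(id)) {
                throw new FileNotFoundException("inode " + id + " is being deleted");
            }
        }
    }

    /** saveNamespace must wait until every pending delete has finished. */
    public void awaitEmpty() throws InterruptedException {
        while (!deleting.isEmpty()) {
            Thread.sleep(10); // simple polling keeps the sketch short
        }
    }
}
```

In the real NameNode, the check would run while holding the lock that serializes edit-log writes, so a write op can never slip in between the check and the log record.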
[jira] [Assigned] (HDFS-16214) Lock optimization for large deleting, no locks on the collection block
[ https://issues.apache.org/jira/browse/HDFS-16214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu reassigned HDFS-16214: -- Assignee: Xiangyi Zhu > Lock optimization for large deleteing, no locks on the collection block > --- > > Key: HDFS-16214 > URL: https://issues.apache.org/jira/browse/HDFS-16214 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > > The time-consuming deletion is mainly reflected in three logics , collecting > blocks, deleting Inode from InodeMap, and deleting blocks. The current > deletion is divided into two major steps. Step 1 acquires the lock, collects > the block and inode, deletes the inode, and releases the lock. Step 2 Acquire > the lock and delete the block to release the lock. > Phase 2 is currently deleting blocks in batches, which can control the lock > holding time. Here we can also delete blocks asynchronously. > Now step 1 still has the problem of holding the lock for a long time. > For stage 1, we can make the collection block not hold the lock. The process > is as follows, step 1 obtains the lock, parent.removeChild, writes to > editLog, releases the lock. Step 2 no lock, collects the block. Step 3 > acquire lock, update quota, release lease, release lock. Step 4 acquire lock, > delete Inode from InodeMap, release lock. Step 5 acquire lock, delete block > to release lock. > There may be some problems following the above process: > 1. When the /a/b/c file is open, then delete the /a/b directory. If the > deletion is performed to the collecting block stage, the client writes > complete or addBlock to the /a/b/c file at this time. This step is not locked > and delete /a/b and editLog has been written successfully. In this case, the > order of editLog is delete /a/c and complete /a/b/c. 
In this case, the > standby node playback editLog /a/b/c file has been deleted, and then go to > complete /a/b/c file will be abnormal. > 2. If a delete operation is executed to the stage of collecting block, then > the administrator executes saveNameSpace, and then restarts Namenode. This > situation may cause the Inode that has been deleted from the parent childList > to remain in the InodeMap. > To solve the above problem, in step 1, add the inode being deleted to the > Set. When there is a file WriteFileOp (logAllocateBlockId/logCloseFile > EditLog), check whether there is this file and one of its parent Inodes in > the Set, and throw it if there is. An exception FileNotFoundException > occurred. > In addition, the execution of saveNamespace needs to wait for all iNodes in > Set to be removed before execution.
[jira] [Commented] (HDFS-16095) Add lsQuotaList command and getQuotaListing api for hdfs quota
[ https://issues.apache.org/jira/browse/HDFS-16095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17375286#comment-17375286 ] Xiangyi Zhu commented on HDFS-16095: [~weichiu],[~ayushtkn],[~hexiaoqiao],[~kihwal] Looking forward to your comments. > Add lsQuotaList command and getQuotaListing api for hdfs quota > -- > > Key: HDFS-16095 > URL: https://issues.apache.org/jira/browse/HDFS-16095 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Currently hdfs does not support obtaining all quota information. The > administrator may need to check which quotas have been added to a certain > directory, or the quotas of the entire cluster.
[jira] [Created] (HDFS-16096) Delete useless method DirectoryWithQuotaFeature#setQuota
Xiangyi Zhu created HDFS-16096: -- Summary: Delete useless method DirectoryWithQuotaFeature#setQuota Key: HDFS-16096 URL: https://issues.apache.org/jira/browse/HDFS-16096 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs Reporter: Xiangyi Zhu Fix For: 3.4.0 Delete useless method DirectoryWithQuotaFeature#setQuota.
[jira] [Created] (HDFS-16095) Add lsQuotaList command and getQuotaListing api for hdfs quota
Xiangyi Zhu created HDFS-16095: -- Summary: Add lsQuotaList command and getQuotaListing api for hdfs quota Key: HDFS-16095 URL: https://issues.apache.org/jira/browse/HDFS-16095 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs Affects Versions: 3.4.0 Reporter: Xiangyi Zhu Currently, HDFS does not support listing all quota information. An administrator may need to check which quotas have been set on a certain directory, or the quotas across the entire cluster.
[jira] [Commented] (HDFS-16043) HDFS : Delete performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356302#comment-17356302 ] Xiangyi Zhu commented on HDFS-16043: [~hexiaoqiao] Thanks for your comment. Correct: the modification here does not affect block cleanup on the SBN, which works normally in the HA scenario. I will address those check errors and add unit tests. The second optimization overlaps with HDFS-16000; I want to open an issue to optimize the time-consuming QuotaCount calculation. > HDFS : Delete performance optimization > -- > > Key: HDFS-16043 > URL: https://issues.apache.org/jira/browse/HDFS-16043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namanode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > Labels: pull-request-available > Attachments: 20210527-after.svg, 20210527-before.svg > > Time Spent: 20m > Remaining Estimate: 0h > > The deletion of the large directory caused NN to hold the lock for too long, > which caused our NameNode to be killed by ZKFC. > Through the flame graph, it is found that its main time-consuming > calculation is QuotaCount when removingBlocks(toRemovedBlocks) and deleting > inodes, and removeBlocks(toRemovedBlocks) takes a higher proportion of time. > h3. solution: > 1. RemoveBlocks is processed asynchronously. A thread is started in the > BlockManager to process the deleted blocks and control the lock time. > 2. QuotaCount calculation optimization, this is similar to the optimization > of this Issue HDFS-16000. > h3. Comparison before and after optimization: > Delete 1000w Inode and 1000w block test.
> *before:* > remove inode elapsed time: 7691 ms > remove block elapsed time :11107 ms > *after:* > remove inode elapsed time: 4149 ms > remove block elapsed time :0 ms
[jira] [Commented] (HDFS-16043) HDFS : Delete performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354754#comment-17354754 ] Xiangyi Zhu commented on HDFS-16043: [~hexiaoqiao],[~vjasani] Looking forward to your comments. > HDFS : Delete performance optimization > -- > > Key: HDFS-16043 > URL: https://issues.apache.org/jira/browse/HDFS-16043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namanode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > Labels: pull-request-available > Attachments: 20210527-after.svg, 20210527-before.svg > > Time Spent: 20m > Remaining Estimate: 0h > > The deletion of the large directory caused NN to hold the lock for too long, > which caused our NameNode to be killed by ZKFC. > Through the flame graph, it is found that its main time-consuming > calculation is QuotaCount when removingBlocks(toRemovedBlocks) and deleting > inodes, and removeBlocks(toRemovedBlocks) takes a higher proportion of time. > h3. solution: > 1. RemoveBlocks is processed asynchronously. A thread is started in the > BlockManager to process the deleted blocks and control the lock time. > 2. QuotaCount calculation optimization, this is similar to the optimization > of this Issue HDFS-16000. > h3. Comparison before and after optimization: > Delete 1000w Inode and 1000w block test. > *before:* > remove inode elapsed time: 7691 ms > remove block elapsed time :11107 ms > *after:* > remove inode elapsed time: 4149 ms > remove block elapsed time :0 ms
[jira] [Commented] (HDFS-16045) FileSystem.CACHE memory leak
[ https://issues.apache.org/jira/browse/HDFS-16045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17352937#comment-17352937 ] Xiangyi Zhu commented on HDFS-16045: [~hexiaoqiao] Thank you very much for your comment. Presto creates UGIs with the "UserGroupInformation#createProxyUser" method, which requires the real superuser information (this should be related to Kerberos), while the "FileSystem#get" API creates UGIs with "UserGroupInformation#createRemoteUser". The latter does not require real user information; the UGI it creates contains only username-related information, so UGIs created for the same user are effectively interchangeable. I think they can share the same FileSystem instance. This consideration may not be comprehensive; discussion is welcome.
{code:java}
public static UserGroupInformation createRemoteUser(String user, AuthMethod authMethod) {
  if (user == null || user.isEmpty()) {
    throw new IllegalArgumentException("Null user");
  }
  Subject subject = new Subject();
  subject.getPrincipals().add(new User(user));
  UserGroupInformation result = new UserGroupInformation(subject);
  result.setAuthenticationMethod(authMethod);
  return result;
}{code}
> FileSystem.CACHE memory leak > > > Key: HDFS-16045 > URL: https://issues.apache.org/jira/browse/HDFS-16045 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Priority: Major > > {code:java} > FileSystem get(final URI uri, final Configuration conf, > final String user){code} > When the client turns on the cache and uses the above API to specify the user > to create a Filesystem instance, the cache will be invalid. > The specified user creates a new UGI every time he creates a Filesystem > instance, and cache compares it according to UGI.
> {code:java} > public int hashCode() { > return (scheme + authority).hashCode() + ugi.hashCode() + (int)unique; > }{code} > Whether you can use username to replace UGI to make a comparison, and whether > there are other risks. >
[jira] [Created] (HDFS-16045) FileSystem.CACHE memory leak
Xiangyi Zhu created HDFS-16045: -- Summary: FileSystem.CACHE memory leak Key: HDFS-16045 URL: https://issues.apache.org/jira/browse/HDFS-16045 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Affects Versions: 3.4.0 Reporter: Xiangyi Zhu
{code:java}
FileSystem get(final URI uri, final Configuration conf, final String user){code}
When the client has the cache enabled and uses the above API to create a FileSystem instance for a specified user, the cache is ineffective: every call creates a new UGI for that user, and the cache compares entries by UGI.
{code:java}
public int hashCode() {
  return (scheme + authority).hashCode() + ugi.hashCode() + (int)unique;
}{code}
Could the username be used for the comparison instead of the UGI, and would that introduce other risks?
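To illustrate why the cache misses, here is a minimal, self-contained model of the behavior described above. The classes are stand-ins, not the actual org.apache.hadoop.fs.FileSystem code: two lookups for the same user build distinct UGI objects, so the keys never compare equal and entries accumulate.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal model of the reported FileSystem.CACHE behavior. Each call to the
 * user-string API builds a fresh UGI, and the cache key compares by UGI
 * equality (identity by default), so "the same user" never hits the cache.
 */
public class CacheLeakDemo {
    /** Stand-in for UserGroupInformation: no equals/hashCode override, so
     *  two UGIs for the same username are never equal. */
    static final class Ugi {
        final String user;
        Ugi(String user) { this.user = user; }
    }

    /** Stand-in for FileSystem.Cache.Key (scheme+authority plus ugi). */
    static final class Key {
        final String schemeAuthority;
        final Ugi ugi;
        Key(String schemeAuthority, Ugi ugi) {
            this.schemeAuthority = schemeAuthority;
            this.ugi = ugi;
        }
        @Override public int hashCode() {
            return schemeAuthority.hashCode() + ugi.hashCode();
        }
        @Override public boolean equals(Object o) {
            return o instanceof Key
                && ((Key) o).schemeAuthority.equals(schemeAuthority)
                && ((Key) o).ugi.equals(ugi);
        }
    }

    static final Map<Key, Object> cache = new HashMap<>();

    /** Models FileSystem.get(uri, conf, user): a brand-new UGI on every call. */
    static Object get(String schemeAuthority, String user) {
        return cache.computeIfAbsent(new Key(schemeAuthority, new Ugi(user)),
                                     k -> new Object());
    }
}
```

Keying on the username instead of the UGI would make the two lookups collide, which is exactly the trade-off the issue asks about.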
[jira] [Updated] (HDFS-16043) HDFS : Delete performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu updated HDFS-16043: --- Description: The deletion of the large directory caused NN to hold the lock for too long, which caused our NameNode to be killed by ZKFC. Through the flame graph, it is found that its main time-consuming calculation is QuotaCount when removingBlocks(toRemovedBlocks) and deleting inodes, and removeBlocks(toRemovedBlocks) takes a higher proportion of time. h3. solution: 1. RemoveBlocks is processed asynchronously. A thread is started in the BlockManager to process the deleted blocks and control the lock time. 2. QuotaCount calculation optimization, this is similar to the optimization of this Issue HDFS-16000. h3. Comparison before and after optimization: Delete 1000w Inode and 1000w block test. *before:* remove inode elapsed time: 7691 ms remove block elapsed time :11107 ms *after:* remove inode elapsed time: 4149 ms remove block elapsed time :0 ms was: The deletion of the large directory caused NN to hold the lock for too long, which caused our NameNode to be killed by ZKFC. Through the flame graph, it is found that its main time-consuming calculation is QuotaCount when removingBlocks(toRemovedBlocks) and deleting inodes, and removeBlocks(toRemovedBlocks) takes a higher proportion of time. h3. solution: 1. RemoveBlocks is processed asynchronously. A thread is started in the BlockManager to process the deleted blocks and control the lock time. 2. QuotaCount calculation optimization, this is similar to the optimization of this Issue HDFS-16000. h3. Comparison before and after optimization: Delete 1000w Inode and 1000w block test. 
*before:* Before optimization: remove inode elapsed time: 7691 ms remove block elapsed time :11107 ms *after:* remove inode elapsed time: 4149 ms remove block elapsed time :0 ms > HDFS : Delete performance optimization > -- > > Key: HDFS-16043 > URL: https://issues.apache.org/jira/browse/HDFS-16043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namanode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > Attachments: 20210527-after.svg, 20210527-before.svg > > > The deletion of the large directory caused NN to hold the lock for too long, > which caused our NameNode to be killed by ZKFC. > Through the flame graph, it is found that its main time-consuming > calculation is QuotaCount when removingBlocks(toRemovedBlocks) and deleting > inodes, and removeBlocks(toRemovedBlocks) takes a higher proportion of time. > h3. solution: > 1. RemoveBlocks is processed asynchronously. A thread is started in the > BlockManager to process the deleted blocks and control the lock time. > 2. QuotaCount calculation optimization, this is similar to the optimization > of this Issue HDFS-16000. > h3. Comparison before and after optimization: > Delete 1000w Inode and 1000w block test. > *before:* > remove inode elapsed time: 7691 ms > remove block elapsed time :11107 ms > *after:* > remove inode elapsed time: 4149 ms > remove block elapsed time :0 ms
[jira] [Updated] (HDFS-16043) HDFS : Delete performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu updated HDFS-16043: --- Attachment: 20210527-before.svg 20210527-after.svg > HDFS : Delete performance optimization > -- > > Key: HDFS-16043 > URL: https://issues.apache.org/jira/browse/HDFS-16043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namanode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Priority: Major > Attachments: 20210527-after.svg, 20210527-before.svg > > > The deletion of the large directory caused NN to hold the lock for too long, > which caused our NameNode to be killed by ZKFC. > Through the flame graph, it is found that its main time-consuming > calculation is QuotaCount when removingBlocks(toRemovedBlocks) and deleting > inodes, and removeBlocks(toRemovedBlocks) takes a higher proportion of time. > h3. solution: > 1. RemoveBlocks is processed asynchronously. A thread is started in the > BlockManager to process the deleted blocks and control the lock time. > 2. QuotaCount calculation optimization, this is similar to the optimization > of this Issue HDFS-16000. > h3. Comparison before and after optimization: > Delete 1000w Inode and 1000w block test. > *before:* > Before optimization: remove inode elapsed time: 7691 ms > remove block elapsed time :11107 ms > *after:* > remove inode elapsed time: 4149 ms > remove block elapsed time :0 ms
[jira] [Assigned] (HDFS-16043) HDFS : Delete performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu reassigned HDFS-16043: -- Assignee: Xiangyi Zhu > HDFS : Delete performance optimization > -- > > Key: HDFS-16043 > URL: https://issues.apache.org/jira/browse/HDFS-16043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namanode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > Attachments: 20210527-after.svg, 20210527-before.svg > > > The deletion of the large directory caused NN to hold the lock for too long, > which caused our NameNode to be killed by ZKFC. > Through the flame graph, it is found that its main time-consuming > calculation is QuotaCount when removingBlocks(toRemovedBlocks) and deleting > inodes, and removeBlocks(toRemovedBlocks) takes a higher proportion of time. > h3. solution: > 1. RemoveBlocks is processed asynchronously. A thread is started in the > BlockManager to process the deleted blocks and control the lock time. > 2. QuotaCount calculation optimization, this is similar to the optimization > of this Issue HDFS-16000. > h3. Comparison before and after optimization: > Delete 1000w Inode and 1000w block test. > *before:* > Before optimization: remove inode elapsed time: 7691 ms > remove block elapsed time :11107 ms > *after:* > remove inode elapsed time: 4149 ms > remove block elapsed time :0 ms
[jira] [Created] (HDFS-16043) HDFS : Delete performance optimization
Xiangyi Zhu created HDFS-16043: -- Summary: HDFS : Delete performance optimization Key: HDFS-16043 URL: https://issues.apache.org/jira/browse/HDFS-16043 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs, namanode Affects Versions: 3.4.0 Reporter: Xiangyi Zhu Deleting a large directory caused the NN to hold the lock for too long, which caused our NameNode to be killed by ZKFC. The flame graph shows that most of the time is spent calculating QuotaCounts while removing blocks (removeBlocks(toRemovedBlocks)) and deleting inodes, with removeBlocks(toRemovedBlocks) taking the larger share. h3. solution: 1. Process removeBlocks asynchronously: start a thread in the BlockManager to process the deleted blocks and bound the lock hold time. 2. Optimize the QuotaCount calculation, similar to the optimization in [HDFS-16000|https://issues.apache.org/jira/browse/HDFS-16000]. h3. Comparison before and after optimization: Test: delete 10,000,000 inodes and 10,000,000 blocks. *before:* remove inode elapsed time: 7691 ms remove block elapsed time: 11107 ms *after:* remove inode elapsed time: 4149 ms remove block elapsed time: 0 ms
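The first optimization — handing the collected blocks to a background thread so the delete RPC returns without walking the whole list under the lock — can be sketched as below. The queue, batch size, and thread are illustrative, not the actual BlockManager code:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Illustrative sketch of asynchronous block removal: the delete path enqueues
 * the collected blocks and returns, and a daemon thread drains the queue in
 * bounded batches so no single lock acquisition covers a huge block list.
 */
public class AsyncBlockRemover {
    private final BlockingQueue<List<Long>> toRemove = new LinkedBlockingQueue<>();
    private final int batchSize;
    private volatile long removed; // total blocks processed, for observation

    public AsyncBlockRemover(int batchSize) {
        this.batchSize = batchSize;
        Thread worker = new Thread(this::run, "async-block-remover");
        worker.setDaemon(true);
        worker.start();
    }

    /** Called by the delete path after block collection; does not block. */
    public void enqueue(List<Long> collectedBlocks) {
        toRemove.add(collectedBlocks);
    }

    private void run() {
        try {
            while (true) {
                List<Long> blocks = toRemove.take();
                for (int i = 0; i < blocks.size(); i += batchSize) {
                    int end = Math.min(i + batchSize, blocks.size());
                    // The real code would acquire the namesystem write lock here,
                    // remove only this batch from the BlocksMap, then release it,
                    // so other RPCs can run between batches.
                    removed += end - i;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public long removedCount() { return removed; }
}
```

The batching mirrors what step 2 of the existing delete already does; the new part is only that it happens off the RPC thread.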
[jira] [Updated] (HDFS-16043) HDFS : Delete performance optimization
[ https://issues.apache.org/jira/browse/HDFS-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangyi Zhu updated HDFS-16043: --- Description: The deletion of the large directory caused NN to hold the lock for too long, which caused our NameNode to be killed by ZKFC. Through the flame graph, it is found that its main time-consuming calculation is QuotaCount when removingBlocks(toRemovedBlocks) and deleting inodes, and removeBlocks(toRemovedBlocks) takes a higher proportion of time. h3. solution: 1. RemoveBlocks is processed asynchronously. A thread is started in the BlockManager to process the deleted blocks and control the lock time. 2. QuotaCount calculation optimization, this is similar to the optimization of this Issue HDFS-16000. h3. Comparison before and after optimization: Delete 1000w Inode and 1000w block test. *before:* Before optimization: remove inode elapsed time: 7691 ms remove block elapsed time :11107 ms *after:* remove inode elapsed time: 4149 ms remove block elapsed time :0 ms was: The deletion of the large directory caused NN to hold the lock for too long, which caused our NameNode to be killed by ZKFC. Through the flame graph, it is found that its main time-consuming calculation is QuotaCount when removingBlocks(toRemovedBlocks) and deleting inodes, and removeBlocks(toRemovedBlocks) takes a higher proportion of time. h3. solution: 1. RemoveBlocks is processed asynchronously. A thread is started in the BlockManager to process the deleted blocks and control the lock time. 2. QuotaCount calculation optimization, this is similar to the optimization of this Issue [HDFS-16000|https://issues.apache.org/jira/browse/HDFS-16000]. h3. Comparison before and after optimization: Delete 1000w Inode and 1000w block test. 
*before:* Before optimization: remove inode elapsed time: 7691 ms remove block elapsed time :11107 ms *after:* remove inode elapsed time: 4149 ms remove block elapsed time :0 ms > HDFS : Delete performance optimization > -- > > Key: HDFS-16043 > URL: https://issues.apache.org/jira/browse/HDFS-16043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namanode >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Priority: Major > > The deletion of the large directory caused NN to hold the lock for too long, > which caused our NameNode to be killed by ZKFC. > Through the flame graph, it is found that its main time-consuming > calculation is QuotaCount when removingBlocks(toRemovedBlocks) and deleting > inodes, and removeBlocks(toRemovedBlocks) takes a higher proportion of time. > h3. solution: > 1. RemoveBlocks is processed asynchronously. A thread is started in the > BlockManager to process the deleted blocks and control the lock time. > 2. QuotaCount calculation optimization, this is similar to the optimization > of this Issue HDFS-16000. > h3. Comparison before and after optimization: > Delete 1000w Inode and 1000w block test. > *before:* > Before optimization: remove inode elapsed time: 7691 ms > remove block elapsed time :11107 ms > *after:* > remove inode elapsed time: 4149 ms > remove block elapsed time :0 ms
[jira] [Commented] (HDFS-16032) DFSClient#delete supports Trash
[ https://issues.apache.org/jira/browse/HDFS-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17351669#comment-17351669 ] Xiangyi Zhu commented on HDFS-16032: [~ayushtkn],[~sodonnell] Thanks a lot for your comments; I will use your suggestions to improve it. > DFSClient#delete supports Trash > > > Key: HDFS-16032 > URL: https://issues.apache.org/jira/browse/HDFS-16032 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hadoop-client, hdfs >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Currently, HDFS can only move deleted data to Trash through Shell commands. > In actual scenarios, most of the data is deleted through DFSClient Api. I > think it should support Trash.
[jira] [Commented] (HDFS-16032) DFSClient#delete supports Trash
[ https://issues.apache.org/jira/browse/HDFS-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17351454#comment-17351454 ] Xiangyi Zhu commented on HDFS-16032: [~hexiaoqiao],[~ayushtkn],[Stephen O'Donnell|https://issues.apache.org/jira/secure/ViewProfile.jspa?name=sodonnell] Looking forward to your comments. > DFSClient#delete supports Trash > > > Key: HDFS-16032 > URL: https://issues.apache.org/jira/browse/HDFS-16032 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hadoop-client, hdfs >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Currently, HDFS can only move deleted data to Trash through Shell commands. > In actual scenarios, most of the data is deleted through DFSClient Api. I > think it should support Trash.
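A trash-aware client delete amounts to renaming the path into the caller's trash directory instead of deleting it (in Hadoop, the shell does this via org.apache.hadoop.fs.Trash). A minimal, self-contained model of the target-path computation — the method below is illustrative, not the real Trash implementation:

```java
/**
 * Illustrative model of where a trashed path lands: HDFS trash places a
 * deleted path under /user/<user>/.Trash/Current/<original absolute path>.
 * A trash-aware client delete would rename() the path to this target
 * instead of calling delete() directly.
 */
public class TrashPath {
    public static String trashTarget(String user, String absolutePath) {
        if (!absolutePath.startsWith("/")) {
            throw new IllegalArgumentException("path must be absolute: " + absolutePath);
        }
        // Concatenating keeps the full original hierarchy under Current/,
        // which is what lets restores preserve the directory structure.
        return "/user/" + user + "/.Trash/Current" + absolutePath;
    }
}
```

Real client code would check fs.trash.interval > 0 before taking the trash path, falling back to a plain delete when trash is disabled.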
[jira] [Commented] (HDFS-16039) RBF: Some indicators of RBFMetrics count inaccurately
[ https://issues.apache.org/jira/browse/HDFS-16039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17350835#comment-17350835 ] Xiangyi Zhu commented on HDFS-16039: [~elgoiri] Looking forward to your comments. > RBF: Some indicators of RBFMetrics count inaccurately > -- > > Key: HDFS-16039 > URL: https://issues.apache.org/jira/browse/HDFS-16039 > Project: Hadoop HDFS > Issue Type: Bug > Components: rbf >Affects Versions: 3.4.0 >Reporter: Xiangyi Zhu >Assignee: Xiangyi Zhu >Priority: Major > > RBFMetrics#getNumLiveNodes, getNumNamenodes, getTotalCapacity > The current statistical algorithm is to accumulate all Nn indicators, which > will lead to inaccurate counting. I think that the same ClusterID only needs > to take one Max and then do the accumulation.
[jira] [Created] (HDFS-16039) RBF: Some indicators of RBFMetrics count inaccurately
Xiangyi Zhu created HDFS-16039: -- Summary: RBF: Some indicators of RBFMetrics count inaccurately Key: HDFS-16039 URL: https://issues.apache.org/jira/browse/HDFS-16039 Project: Hadoop HDFS Issue Type: Bug Components: rbf Affects Versions: 3.4.0 Reporter: Xiangyi Zhu Assignee: Xiangyi Zhu RBFMetrics#getNumLiveNodes, getNumNamenodes, and getTotalCapacity currently accumulate the metrics reported by every NameNode, which leads to inaccurate counts. I think that for each ClusterID we should take the maximum value among its NameNodes and then accumulate across clusters.
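The proposed fix — deduplicate per ClusterID by taking the maximum reported value, then sum across clusters — can be sketched as follows. The class and record shape are illustrative, not the actual RBFMetrics code:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch of the proposed aggregation: multiple NameNodes of the
 * same cluster report the same total capacity, so summing every report
 * double-counts. Taking the max per ClusterID and summing across clusters
 * counts each cluster exactly once.
 */
public class ClusterMetricAggregator {
    /** One membership record: the reporting cluster and its reported value. */
    static final class Report {
        final String clusterId;
        final long totalCapacity;
        Report(String clusterId, long totalCapacity) {
            this.clusterId = clusterId;
            this.totalCapacity = totalCapacity;
        }
    }

    public static long totalCapacity(List<Report> reports) {
        // Keep only the largest value reported for each cluster...
        Map<String, Long> maxPerCluster = new HashMap<>();
        for (Report r : reports) {
            maxPerCluster.merge(r.clusterId, r.totalCapacity, Math::max);
        }
        // ...then accumulate across distinct clusters.
        long total = 0;
        for (long v : maxPerCluster.values()) {
            total += v;
        }
        return total;
    }
}
```

The same pattern would apply to getNumLiveNodes and getNumNamenodes, with the per-cluster reduction chosen to match each metric's semantics.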