[jira] [Commented] (HDFS-17430) RecoveringBlock will skip no live replicas when get block recovery command.
[ https://issues.apache.org/jira/browse/HDFS-17430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828589#comment-17828589 ]

ASF GitHub Bot commented on HDFS-17430:
---

dineshchitlangia commented on PR #6635:
URL: https://github.com/apache/hadoop/pull/6635#issuecomment-2008619571

@ZanderXu since you posted the first set of suggestions, could you confirm whether they have been addressed? We can merge once we have your +1.

> RecoveringBlock will skip no live replicas when get block recovery command.
> ---
>
> Key: HDFS-17430
> URL: https://issues.apache.org/jira/browse/HDFS-17430
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haiyang Hu
> Assignee: Haiyang Hu
> Priority: Major
> Labels: pull-request-available
>
> When building the block recovery command, RecoveringBlock may fail to skip replicas that are no longer live.
>
> *Issue:*
> Currently, the following scenario can cause the datanode's BlockRecoveryWorker to fail, leaving the file unclosed for a long time.
> *t1.* Block blk_xxx_xxx has two replicas [dn1, dn2]; the dn1 machine shuts down and the node becomes dead, while dn2 stays live.
> *t2.* Block recovery starts.
> related logs:
> {code:java}
> 2024-03-13 21:58:00.651 WARN hdfs.StateChange DIR* NameSystem.internalReleaseLease: File /xxx/file has not been closed. Lease recovery is in progress. RecoveryId = 28577373754 for block blk_xxx_xxx
> {code}
> *t3.* dn2 is chosen for block recovery.
> dn1 is marked as stale (it is actually dead) at this point, so recoveryLocations has size 1. Under the logic below, both dn1 and dn2 are then chosen to participate in block recovery.
> DatanodeManager#getBlockRecoveryCommand
> {code:java}
>     // Skip stale nodes during recovery
>     final List<DatanodeStorageInfo> recoveryLocations =
>         new ArrayList<>(storages.length);
>     final List<Integer> storageIdx = new ArrayList<>(storages.length);
>     for (int i = 0; i < storages.length; ++i) {
>       if (!storages[i].getDatanodeDescriptor().isStale(staleInterval)) {
>         recoveryLocations.add(storages[i]);
>         storageIdx.add(i);
>       }
>     }
>     ...
>     // If we only get 1 replica after eliminating stale nodes, choose all
>     // replicas for recovery and let the primary data node handle failures.
>     DatanodeInfo[] recoveryInfos;
>     if (recoveryLocations.size() > 1) {
>       if (recoveryLocations.size() != storages.length) {
>         LOG.info("Skipped stale nodes for recovery : "
>             + (storages.length - recoveryLocations.size()));
>       }
>       recoveryInfos = DatanodeStorageInfo.toDatanodeInfos(recoveryLocations);
>     } else {
>       // If too many replicas are stale, then choose all replicas to
>       // participate in block recovery.
>       recoveryInfos = DatanodeStorageInfo.toDatanodeInfos(storages);
>     }
> {code}
> {code:java}
> 2024-03-13 21:58:01,425 INFO datanode.DataNode (BlockRecoveryWorker.java:logRecoverBlock(563)) [org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@54e291ac] - BlockRecoveryWorker: NameNode at xxx:8040 calls recoverBlock(BP-xxx:blk_xxx_xxx, targets=[DatanodeInfoWithStorage[dn1:50010,null,null], DatanodeInfoWithStorage[dn2:50010,null,null]], newGenerationStamp=28577373754, newBlock=null, isStriped=false)
> {code}
> *t4.* When dn2 executes BlockRecoveryWorker#recover, it calls initReplicaRecovery on dn1. However, since the dn1 machine is down at this point, the call takes a very long time to time out: the default number of retries when establishing a server connection is 45.
> related logs:
> {code:java}
> 2024-03-13 21:59:31,518 INFO ipc.Client (Client.java:handleConnectionTimeout(904)) [org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@54e291ac] - Retrying connect to server: dn1:8010. Already tried 0 time(s); maxRetries=45
> ...
> 2024-03-13 23:05:35,295 INFO ipc.Client (Client.java:handleConnectionTimeout(904)) [org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@54e291ac] - Retrying connect to server: dn2:8010. Already tried 44 time(s); maxRetries=45
> 2024-03-13 23:07:05,392 WARN protocol.InterDatanodeProtocol (BlockRecoveryWorker.java:recover(170)) [org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@54e291ac] - Failed to recover block (block=BP-xxx:blk_xxx_xxx, datanode=DatanodeInfoWithStorage[dn1:50010,null,null])
> org.apache.hadoop.net.ConnectTimeoutException: Call From dn2 to dn1:8010 failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 9 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel
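The stale-replica fallback described in t3 above can be illustrated with a small, self-contained sketch. This is a simplified model of the quoted DatanodeManager logic, not the actual Hadoop code: replicas are reduced to a staleness flag per index, and the method returns which indices end up as recovery targets. It shows how the "too many stale replicas" branch re-adds the dead dn1.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the replica selection in DatanodeManager#getBlockRecoveryCommand. */
public class RecoverySelection {

    /**
     * Given one staleness flag per stored replica, returns the indices chosen
     * for block recovery, mirroring the fallback in the quoted snippet.
     */
    public static List<Integer> chooseRecoveryTargets(boolean[] stale) {
        // Skip stale nodes during recovery.
        List<Integer> nonStale = new ArrayList<>();
        for (int i = 0; i < stale.length; i++) {
            if (!stale[i]) {
                nonStale.add(i);
            }
        }
        if (nonStale.size() > 1) {
            return nonStale; // enough fresh replicas: stale ones really are skipped
        }
        // Fallback: if at most one replica is non-stale, use ALL replicas.
        // This is the branch that re-adds the dead dn1 from the scenario above.
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < stale.length; i++) {
            all.add(i);
        }
        return all;
    }

    public static void main(String[] args) {
        // t3 scenario: dn1 (index 0) is stale/dead, dn2 (index 1) is live.
        // Only one non-stale replica remains, so both indices are chosen.
        System.out.println(chooseRecoveryTargets(new boolean[] {true, false}));
    }
}
```

With `[stale, live]` the fallback fires and both replicas are returned, so the primary (dn2) is later forced to contact the dead dn1 and sits through the 45 connect retries described in t4.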
[jira] [Commented] (HDFS-17430) RecoveringBlock will skip no live replicas when get block recovery command.
[ https://issues.apache.org/jira/browse/HDFS-17430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828586#comment-17828586 ]

ASF GitHub Bot commented on HDFS-17430:
---

haiyang1987 commented on PR #6635:
URL: https://github.com/apache/hadoop/pull/6635#issuecomment-2008615230

Hi @dineshchitlangia @ayushtkn, would you mind reviewing this PR when you have free time? Thank you so much.
[jira] [Resolved] (HDFS-17431) Fix log format for BlockRecoveryWorker#recoverBlocks
[ https://issues.apache.org/jira/browse/HDFS-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Chitlangia resolved HDFS-17431.
--
Fix Version/s: 3.4.1
Resolution: Fixed

Thanks [~haiyang Hu] for the improvement.

> Fix log format for BlockRecoveryWorker#recoverBlocks
> ---
>
> Key: HDFS-17431
> URL: https://issues.apache.org/jira/browse/HDFS-17431
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haiyang Hu
> Assignee: Haiyang Hu
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.1
>
> Fix log format for BlockRecoveryWorker#recoverBlocks
>
> As seen in PR [https://github.com/apache/hadoop/pull/6635] the additional {} is moot.
>
> 2024-03-13 23:07:05,401 WARN datanode.DataNode (BlockRecoveryWorker.java:run(623)) [org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@54e291ac] - recover Block: RecoveringBlock\{BP-xxx:blk_xxx_xxx; getBlockSize()=0; corrupt=false; offset=-1; locs=[DatanodeInfoWithStorage[dn1:50010,null,null], DatanodeInfoWithStorage[dn2:50010,null,null]]; cachedLocs=[]}
> FAILED:
> *{}*
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): The recovery id 28577373754 does not match current recovery id 28578772548 for block BP-xxx:blk_xxx_xxx
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4129)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:1184)
> at

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-17431) Fix log format for BlockRecoveryWorker#recoverBlocks
[ https://issues.apache.org/jira/browse/HDFS-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia updated HDFS-17431: - Description: Fix log format for BlockRecoveryWorker#recoverBlocks As seen in PR [https://github.com/apache/hadoop/pull/6635] the additional {} is moot. 2024-03-13 23:07:05,401 WARN datanode.DataNode (BlockRecoveryWorker.java:run(623)) [org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@54e291ac] - recover Block: RecoveringBlock\{BP-xxx:blk_xxx_xxx; getBlockSize()=0; corrupt=false; offset=-1; locs=[DatanodeInfoWithStorage[dn1:50010,null,null], DatanodeInfoWithStorage[dn2:50010,null,null]]; cachedLocs=[]} FAILED: *{}* org.apache.hadoop.ipc.RemoteException(java.io.IOException): The recovery id 28577373754 does not match current recovery id 28578772548 for block BP-xxx:blk_xxx_xxx at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4129) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:1184) at was:Fix log format for BlockRecoveryWorker#recoverBlocks > Fix log format for BlockRecoveryWorker#recoverBlocks > > > Key: HDFS-17431 > URL: https://issues.apache.org/jira/browse/HDFS-17431 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > > Fix log format for BlockRecoveryWorker#recoverBlocks > > As seen in PR [https://github.com/apache/hadoop/pull/6635] the additional {} > is moot. 
[jira] [Commented] (HDFS-17431) Fix log format for BlockRecoveryWorker#recoverBlocks
[ https://issues.apache.org/jira/browse/HDFS-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828583#comment-17828583 ]

ASF GitHub Bot commented on HDFS-17431:
---

haiyang1987 commented on PR #6643:
URL: https://github.com/apache/hadoop/pull/6643#issuecomment-2008608210

Thanks @dineshchitlangia @ayushtkn @wzk784533 for your review and merge~
[jira] [Commented] (HDFS-17431) Fix log format for BlockRecoveryWorker#recoverBlocks
[ https://issues.apache.org/jira/browse/HDFS-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828581#comment-17828581 ]

ASF GitHub Bot commented on HDFS-17431:
---

dineshchitlangia merged PR #6643:
URL: https://github.com/apache/hadoop/pull/6643
[jira] [Commented] (HDFS-17431) Fix log format for BlockRecoveryWorker#recoverBlocks
[ https://issues.apache.org/jira/browse/HDFS-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828580#comment-17828580 ]

ASF GitHub Bot commented on HDFS-17431:
---

haiyang1987 commented on PR #6643:
URL: https://github.com/apache/hadoop/pull/6643#issuecomment-2008603273

Thanks @wzk784533 @ayushtkn @dineshchitlangia for your review.

I found that this log format causes problems, such as the one mentioned in https://github.com/apache/hadoop/pull/6635:
```
2024-03-13 23:07:05,401 WARN datanode.DataNode (BlockRecoveryWorker.java:run(623)) [org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1@54e291ac] - recover Block: RecoveringBlock{BP-xxx:blk_xxx_xxx; getBlockSize()=0; corrupt=false; offset=-1; locs=[DatanodeInfoWithStorage[dn1:50010,null,null], DatanodeInfoWithStorage[dn2:50010,null,null]]; cachedLocs=[]} FAILED: {}
org.apache.hadoop.ipc.RemoteException(java.io.IOException): The recovery id 28577373754 does not match current recovery id 28578772548 for block BP-xxx:blk_xxx_xxx
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitBlockSynchronization(FSNamesystem.java:4129)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.commitBlockSynchronization(NameNodeRpcServer.java:1184)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolServerSideTranslatorPB.java:310)
    at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:34391)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:635)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:603)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:587)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1137)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1236)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1134)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2005)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3360)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1579)
    at org.apache.hadoop.ipc.Client.call(Client.java:1511)
    at org.apache.hadoop.ipc.Client.call(Client.java:1402)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:268)
    at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:142)
    at com.sun.proxy.$Proxy17.commitBlockSynchronization(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.commitBlockSynchronization(DatanodeProtocolClientSideTranslatorPB.java:342)
    at org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskContiguous.syncBlock(BlockRecoveryWorker.java:334)
    at org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskContiguous.recover(BlockRecoveryWorker.java:189)
    at org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:620)
    at java.lang.Thread.run(Thread.java:748)
```

In `LOG.warn("recover Block: {} FAILED: {}", b, e);` the call passes `e` as the last argument, which prints the entire stack trace, so the second placeholder is meaningless. I think we should either remove the second placeholder or change `e` to `e.toString()`.

Hi @ayushtkn @dineshchitlangia @wzk784533, what do you think? Thanks~
[jira] [Commented] (HDFS-17433) metrics sumOfActorCommandQueueLength should only record valid commands
[ https://issues.apache.org/jira/browse/HDFS-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828579#comment-17828579 ]

ASF GitHub Bot commented on HDFS-17433:
---

hfutatzhanghb commented on PR #6644:
URL: https://github.com/apache/hadoop/pull/6644#issuecomment-2008595490

> +1 LGTM, pending CI @hfutatzhanghb thanks for finding this issue and contributing the fix.

Sir, thanks a lot for reviewing.

> metrics sumOfActorCommandQueueLength should only record valid commands
> ---
>
> Key: HDFS-17433
> URL: https://issues.apache.org/jira/browse/HDFS-17433
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 3.4.0
> Reporter: farmmamba
> Assignee: farmmamba
> Priority: Minor
> Labels: pull-request-available
[jira] [Updated] (HDFS-17433) metrics sumOfActorCommandQueueLength should only record valid commands
[ https://issues.apache.org/jira/browse/HDFS-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDFS-17433:
--
Labels: pull-request-available (was: )
[jira] [Commented] (HDFS-17433) metrics sumOfActorCommandQueueLength should only record valid commands
[ https://issues.apache.org/jira/browse/HDFS-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828574#comment-17828574 ]

ASF GitHub Bot commented on HDFS-17433:
---

hfutatzhanghb opened a new pull request, #6644:
URL: https://github.com/apache/hadoop/pull/6644

### Description of PR

We have a phone alarm on the sumOfActorCommandQueueLength metric that fires when it goes beyond 3000. Recently the alarm fired, and we found that `DatanodeCommand[] cmds` arrays with length 0 were still being put into the queue and counted via incrActorCmdQueueLength. When processedCommandsOpAvgTime is high, these empty cmds are enqueued on every heartbeat interval. sumOfActorCommandQueueLength should only record valid (non-empty) commands.
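The counting rule proposed in the PR description can be sketched as follows. This is a hypothetical, self-contained model (class and method names are invented for illustration, not the actual BPServiceActor/metrics code): empty command arrays are neither enqueued nor counted, so they stop inflating the metric.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Hypothetical sketch of the proposed fix: the metric is only bumped for
 * non-empty command arrays, so empty heartbeat responses no longer inflate
 * sumOfActorCommandQueueLength.
 */
public class ActorCommandQueue {
    private final Queue<String[]> queue = new ArrayDeque<>();
    private long sumOfActorCommandQueueLength = 0;

    /** Returns true if the commands were enqueued and counted. */
    public boolean enqueue(String[] cmds) {
        if (cmds == null || cmds.length == 0) {
            // Invalid/empty batch: skip both the queue and the metric.
            return false;
        }
        queue.add(cmds);
        sumOfActorCommandQueueLength++;
        return true;
    }

    public long metricValue() {
        return sumOfActorCommandQueueLength;
    }
}
```

Before the fix, the `cmds.length == 0` guard would be absent and every heartbeat response, however empty, would bump the counter; with it, only batches that actually carry commands are recorded.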
[jira] [Created] (HDFS-17433) metrics sumOfActorCommandQueueLength should only record valid commands
farmmamba created HDFS-17433:
---
Summary: metrics sumOfActorCommandQueueLength should only record valid commands
Key: HDFS-17433
URL: https://issues.apache.org/jira/browse/HDFS-17433
Project: Hadoop HDFS
Issue Type: Improvement
Components: datanode
Affects Versions: 3.4.0
Reporter: farmmamba
Assignee: farmmamba
[jira] [Commented] (HDFS-17431) Fix log format for BlockRecoveryWorker#recoverBlocks
[ https://issues.apache.org/jira/browse/HDFS-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828490#comment-17828490 ]

ASF GitHub Bot commented on HDFS-17431:
---

ayushtkn commented on code in PR #6643:
URL: https://github.com/apache/hadoop/pull/6643#discussion_r1530927655

## hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockRecoveryWorker.java:
```
@@ -628,7 +628,7 @@ public void run() {
           new RecoveryTaskContiguous(b).recover();
         }
       } catch (IOException e) {
-        LOG.warn("recover Block: {} FAILED: {}", b, e);
+        LOG.warn("recover Block: {} FAILED: ", b, e);
```

Review Comment:
What's wrong here? The number of placeholders is correct; for the second one it will invoke e.toString(), and now you are changing it to print the entire trace. I don't think it is broken; it looks like it was intentional.
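For reference on the behavior being debated: in SLF4J 1.6.0 and later, a trailing Throwable argument is treated as the log event's exception rather than as placeholder data, so a leftover `{}` stays literal in the message (which matches the `FAILED: {}` plus full stack trace seen in the quoted log). Below is a toy re-implementation of that substitution rule, a simulation for illustration only, not SLF4J's actual MessageFormatter:

```java
/** Minimal simulation of SLF4J's (>= 1.6.0) trailing-throwable rule; not the real formatter. */
public class Slf4jStyleFormat {

    public static String format(String pattern, Object... args) {
        Object[] fillers = args;
        if (args.length > 0 && args[args.length - 1] instanceof Throwable) {
            // A trailing Throwable is consumed as the log's exception and is
            // never substituted into a placeholder; any leftover "{}" stays literal.
            fillers = new Object[args.length - 1];
            System.arraycopy(args, 0, fillers, 0, fillers.length);
        }
        StringBuilder out = new StringBuilder();
        int argIdx = 0;
        int i = 0;
        while (i < pattern.length()) {
            int brace = pattern.indexOf("{}", i);
            if (brace < 0 || argIdx >= fillers.length) {
                out.append(pattern.substring(i)); // no more placeholders or args
                break;
            }
            out.append(pattern, i, brace).append(fillers[argIdx++]);
            i = brace + 2;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Throwable e = new RuntimeException("recovery id mismatch");
        // Old pattern: the throwable fills neither placeholder, so "{}" survives.
        System.out.println(format("recover Block: {} FAILED: {}", "blk_x", e)); // recover Block: blk_x FAILED: {}
        // Patched pattern: trailing throwable, no dangling placeholder.
        System.out.println(format("recover Block: {} FAILED: ", "blk_x", e));   // recover Block: blk_x FAILED:
    }
}
```

Under this rule, the patched pattern simply drops the dangling `{}` while the exception is still attached to the event, which is the cleanup the PR performs.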
[jira] [Commented] (HDFS-17431) Fix log format for BlockRecoveryWorker#recoverBlocks
[ https://issues.apache.org/jira/browse/HDFS-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828479#comment-17828479 ]

ASF GitHub Bot commented on HDFS-17431:
---

hadoop-yetus commented on PR #6643:
URL: https://github.com/apache/hadoop/pull/6643#issuecomment-2007841730

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 35s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 47m 17s | | trunk passed |
| +1 :green_heart: | compile | 1m 29s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 1m 20s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 1m 18s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 29s | | trunk passed |
| +1 :green_heart: | javadoc | 1m 8s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 46s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 30s | | trunk passed |
| +1 :green_heart: | shadedclient | 38m 29s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 17s | | the patch passed |
| +1 :green_heart: | compile | 1m 15s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 1m 15s | | the patch passed |
| +1 :green_heart: | compile | 1m 13s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 1m 13s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 1m 4s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 18s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 58s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 38s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 3m 25s | | the patch passed |
| +1 :green_heart: | shadedclient | 38m 41s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 :green_heart: | unit | 228m 41s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 45s | | The patch does not generate ASF License warnings. |
| | | 379m 37s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6643/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6643 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 28104aeadaba 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0ce2a9e09116ee8807a24c37e87595b52f3713da |
| Default Java | Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6643/1/testReport/ |
| Max. process+thread count | 4053 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6643/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
[jira] [Commented] (HDFS-17413) [FGL] CacheReplicationMonitor supports fine-grained lock
[ https://issues.apache.org/jira/browse/HDFS-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828425#comment-17828425 ]

ASF GitHub Bot commented on HDFS-17413:
---

hadoop-yetus commented on PR #6641:
URL: https://github.com/apache/hadoop/pull/6641#issuecomment-2007547095

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 12m 26s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ HDFS-17384 Compile Tests _ |
| +1 :green_heart: | mvninstall | 43m 55s | | HDFS-17384 passed |
| +1 :green_heart: | compile | 1m 21s | | HDFS-17384 passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 1m 16s | | HDFS-17384 passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 1m 13s | | HDFS-17384 passed |
| +1 :green_heart: | mvnsite | 1m 24s | | HDFS-17384 passed |
| +1 :green_heart: | javadoc | 1m 11s | | HDFS-17384 passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 41s | | HDFS-17384 passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 3m 17s | | HDFS-17384 passed |
| +1 :green_heart: | shadedclient | 35m 33s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 12s | | the patch passed |
| +1 :green_heart: | compile | 1m 11s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 1m 11s | | the patch passed |
| +1 :green_heart: | compile | 1m 8s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 1m 8s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 1m 0s | | the patch passed |
| +1 :green_heart: | mvnsite | 1m 10s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 52s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 32s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 3m 15s | | the patch passed |
| +1 :green_heart: | shadedclient | 35m 11s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| -1 :x: | unit | 229m 58s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6641/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | asflicense | 0m 46s | | The patch does not generate ASF License warnings. |
| | | 381m 28s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hdfs.server.datanode.TestLargeBlockReport |
| | hadoop.hdfs.tools.TestDFSAdmin |
| | hadoop.hdfs.protocol.TestBlockListAsLongs |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6641/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6641 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux dbd98b8aab75 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | HDFS-17384 / 901fff7cbf4ac90b8be0b4799ea19426eff89a20 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b
[jira] [Commented] (HDFS-17431) Fix log format for BlockRecoveryWorker#recoverBlocks
[ https://issues.apache.org/jira/browse/HDFS-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828325#comment-17828325 ]

ASF GitHub Bot commented on HDFS-17431:
---------------------------------------

wzk784533 commented on PR #6643:
URL: https://github.com/apache/hadoop/pull/6643#issuecomment-2007159204

   LGTM

> Fix log format for BlockRecoveryWorker#recoverBlocks
> ----------------------------------------------------
>
>                 Key: HDFS-17431
>                 URL: https://issues.apache.org/jira/browse/HDFS-17431
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Haiyang Hu
>            Assignee: Haiyang Hu
>            Priority: Major
>              Labels: pull-request-available
>
> Fix log format for BlockRecoveryWorker#recoverBlocks

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-17431) Fix log format for BlockRecoveryWorker#recoverBlocks
[ https://issues.apache.org/jira/browse/HDFS-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828293#comment-17828293 ]

ASF GitHub Bot commented on HDFS-17431:
---------------------------------------

haiyang1987 opened a new pull request, #6643:
URL: https://github.com/apache/hadoop/pull/6643

   ### Description of PR
   https://issues.apache.org/jira/browse/HDFS-17431
   Fix log format for BlockRecoveryWorker#recoverBlocks
[jira] [Updated] (HDFS-17431) Fix log format for BlockRecoveryWorker#recoverBlocks
[ https://issues.apache.org/jira/browse/HDFS-17431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDFS-17431:
----------------------------------
    Labels: pull-request-available  (was: )
[jira] [Updated] (HDFS-17413) [FGL] CacheReplicationMonitor supports fine-grained lock
[ https://issues.apache.org/jira/browse/HDFS-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDFS-17413:
----------------------------------
    Labels: pull-request-available  (was: )

> [FGL] CacheReplicationMonitor supports fine-grained lock
> --------------------------------------------------------
>
>                 Key: HDFS-17413
>                 URL: https://issues.apache.org/jira/browse/HDFS-17413
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>
> * addCacheDirective
> * modifyCacheDirective
> * removeCacheDirective
> * listCacheDirectives
> * addCachePool
> * modifyCachePool
> * removeCachePool
> * listCachePools
> * cacheReport
> * CacheManager
> * CacheReplicationMonitor
[jira] [Commented] (HDFS-17413) [FGL] CacheReplicationMonitor supports fine-grained lock
[ https://issues.apache.org/jira/browse/HDFS-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828241#comment-17828241 ]

ASF GitHub Bot commented on HDFS-17413:
---------------------------------------

ZanderXu opened a new pull request, #6641:
URL: https://github.com/apache/hadoop/pull/6641

   Use the FSLock to make cache-pool and cache-directive state thread safe, since clients will access or modify this information, and it has nothing to do with blocks.

   Use the BMLock to make cachedBlock state thread safe, since the related logic will access block information and modify the cache-related information of a DataNode.
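The two-lock split described in the PR comment above can be sketched roughly as follows. This is only an illustrative model, not Hadoop's actual implementation: the class name `CacheState`, the fields `fsLock`/`bmLock`, and all methods are hypothetical stand-ins for the FSLock/BMLock separation, using plain `ReentrantReadWriteLock`s.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the fine-grained locking split: cache-pool and
// cache-directive metadata is guarded by an FS-level lock, while cached-block
// state (which touches block and DataNode information) is guarded by a
// separate BM-level lock, so the two kinds of operations no longer serialize
// on a single global lock.
public class CacheState {
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
  private final ReentrantReadWriteLock bmLock = new ReentrantReadWriteLock();

  // Pool name -> owner; guarded by fsLock.
  private final Map<String, String> cachePools = new HashMap<>();
  // Cached block ids; guarded by bmLock.
  private final List<Long> cachedBlocks = new ArrayList<>();

  // Client RPCs that only touch pool/directive metadata take the FS lock.
  public void addCachePool(String name, String owner) {
    fsLock.writeLock().lock();
    try {
      cachePools.put(name, owner);
    } finally {
      fsLock.writeLock().unlock();
    }
  }

  public List<String> listCachePools() {
    fsLock.readLock().lock();
    try {
      return new ArrayList<>(cachePools.keySet());
    } finally {
      fsLock.readLock().unlock();
    }
  }

  // Paths that touch block / per-DataNode cache state take the BM lock
  // instead, so a long block-cache scan does not block pool operations.
  public void cacheBlock(long blockId) {
    bmLock.writeLock().lock();
    try {
      cachedBlocks.add(blockId);
    } finally {
      bmLock.writeLock().unlock();
    }
  }

  public int cachedBlockCount() {
    bmLock.readLock().lock();
    try {
      return cachedBlocks.size();
    } finally {
      bmLock.readLock().unlock();
    }
  }
}
```

The point of the split is that a reader of one structure (e.g. `listCachePools`) never contends with a writer of the other (e.g. `cacheBlock`), which is the benefit the FGL work is after.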
[jira] [Updated] (HDFS-17413) [FGL] CacheReplicationMonitor supports fine-grained lock
[ https://issues.apache.org/jira/browse/HDFS-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZanderXu updated HDFS-17413:
----------------------------
    Description: 
* addCacheDirective
* modifyCacheDirective
* removeCacheDirective
* listCacheDirectives
* addCachePool
* modifyCachePool
* removeCachePool
* listCachePools
* cacheReport
* CacheManager
* CacheReplicationMonitor

  was:
Client RPCs involving Cache supports fine-grained lock.
* addCacheDirective
* modifyCacheDirective
* removeCacheDirective
* listCacheDirectives
* addCachePool
* modifyCachePool
* removeCachePool
* listCachePools
[jira] [Updated] (HDFS-17413) [FGL] CacheReplicationMonitor supports fine-grained lock
[ https://issues.apache.org/jira/browse/HDFS-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZanderXu updated HDFS-17413:
----------------------------
    Summary: [FGL] CacheReplicationMonitor supports fine-grained lock  (was: [FGL] Client RPCs involving Cache supports fine-grained lock)