[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365399#comment-17365399 ]

Haibin Huang commented on HDFS-13671:
--------------------------------------

[~zhaojk] I will cherry-pick this to branch-3.1 later. You can update just your NameNode; the DataNodes do not need to be updated at the same time. In my company we updated only the NameNode, and it is compatible with DataNodes that still use FoldedTreeSet, but it is better to update your DataNodes later when you have time.

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> ----------------------------------------------------------------------
>
>                 Key: HDFS-13671
>                 URL: https://issues.apache.org/jira/browse/HDFS-13671
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.3
>            Reporter: Yiqun Lin
>            Assignee: Haibin Huang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0, 3.3.2
>
>         Attachments: HDFS-13671-001.patch, image-2021-06-10-19-28-18-373.png, image-2021-06-10-19-28-58-359.png, image-2021-06-18-15-46-46-052.png, image-2021-06-18-15-47-04-037.png
>
>          Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>         at org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>         at org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>         at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take more time. However, we now always see the NN hang during the remove-block operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get better performance when handling FBRs/IBRs. But compared with the earlier implementation of the remove-block logic, {{FoldedTreeSet}} is slower, since it takes additional time to rebalance tree nodes. When a large number of blocks must be removed/deleted, this looks bad.
> For get-type operations in {{DatanodeStorageInfo}}, we only provide {{getBlockIterator}} to return a block iterator; there is no other get operation for a specific block. Do we still need {{FoldedTreeSet}} in {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits gets, not updates. Maybe we can revert this to the earlier implementation.
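To make the cost difference concrete, here is a minimal, self-contained micro-comparison. It is not Hadoop code: java.util.TreeSet stands in for the tree-based {{FoldedTreeSet}}, and java.util.HashSet stands in for a hash-based set in the spirit of LightWeightGSet.

{code:java}
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class RemoveCostSketch {
    public static void main(String[] args) {
        final int n = 2_000_000;
        Set<Long> tree = new TreeSet<>();
        Set<Long> hash = new HashSet<>();
        for (long i = 0; i < n; i++) {
            tree.add(i);
            hash.add(i);
        }

        long t0 = System.nanoTime();
        for (long i = 0; i < n; i++) {
            tree.remove(i); // O(log n) search plus rebalancing on every remove
        }
        long t1 = System.nanoTime();
        for (long i = 0; i < n; i++) {
            hash.remove(i); // amortized O(1), no rebalancing
        }
        long t2 = System.nanoTime();

        System.out.printf("tree removes: %d ms, hash removes: %d ms%n",
            (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
}
{code}

With millions of block entries per storage, this per-remove rebalancing overhead is what shows up in the {{removeAndGet}} frames of the stack trace above.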
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365319#comment-17365319 ]

Haibin Huang commented on HDFS-13671:
--------------------------------------

[~tomscut] There are 2 blocks on one disk, and each DataNode has 12 disks.
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365296#comment-17365296 ]

Haibin Huang commented on HDFS-13671:
--------------------------------------

[~tomscut] You are right, it does affect the performance of handling block reports. In my company's cluster, which has over 300 nodes, the AvgProcessTime of block reports increased by about 70 percent; but since the QPS of block reports is very low, I think that is acceptable. Meanwhile, the p99 RPC time on the HDFS client is reduced by about 85% when the NameNode performs a big delete operation, so the revert is worth doing.

!image-2021-06-18-15-46-46-052.png!

!image-2021-06-18-15-47-04-037.png!
[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibin Huang updated HDFS-13671:
--------------------------------
    Attachment: image-2021-06-18-15-47-04-037.png
[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibin Huang updated HDFS-13671:
--------------------------------
    Attachment: image-2021-06-18-15-46-46-052.png
[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibin Huang updated HDFS-13671:
--------------------------------
    Attachment: image-2021-06-10-19-28-58-359.png
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360763#comment-17360763 ]

Haibin Huang commented on HDFS-13671:
--------------------------------------

We applied this patch in our company's cluster, which has over 300 nodes and empties the trash every 6 hours. Here is the p99 RPC time on the HDFS client; the y-axis unit is ms. Before applying this patch, the RPC time exceeded 1,000 ms while the NameNode was doing delete work; after applying it, the time returned to normal.

!image-2021-06-10-19-28-18-373.png!

!image-2021-06-10-19-28-58-359.png!
[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibin Huang updated HDFS-13671:
--------------------------------
    Attachment: image-2021-06-10-19-28-18-373.png
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355679#comment-17355679 ]

Haibin Huang commented on HDFS-13671:
--------------------------------------

Thanks for the reminder, [~ferhui]. I have updated the PR, and the failing tests pass in my local environment; I don't know why they fail in CI.
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355019#comment-17355019 ]

Haibin Huang commented on HDFS-13671:
--------------------------------------

Thanks [~ferhui], I have created a new PR: https://github.com/apache/hadoop/pull/3065
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350752#comment-17350752 ]

Haibin Huang commented on HDFS-13671:
--------------------------------------

[^HDFS-13671-001.patch] is based on [HDFS-9260|https://issues.apache.org/jira/browse/HDFS-9260], which reverted FoldedTreeSet to LightWeightResizableGSet in org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaMap. This patch works well in my company; I will submit a test report later.
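As a rough illustration of the kind of structure the patch reverts to, here is a stripped-down, intrusive hash set in the spirit of LightWeightResizableGSet. The names and details are simplified assumptions, not the actual Hadoop implementation; the real class also resizes its bucket array and stores the replica objects themselves.

{code:java}
// Illustrative sketch only: a simplified intrusive hash set in the spirit of
// LightWeightResizableGSet. NOT the actual Hadoop implementation.
public class GSetSketch<E extends GSetSketch.Element> {

    /** Stored elements carry their own next pointer, so no per-entry node
     *  objects are allocated; this is what "intrusive" buys us. */
    public interface Element {
        long getId();
        Element getNext();
        void setNext(Element next);
    }

    private final Element[] buckets;

    public GSetSketch(int capacity) {
        buckets = new Element[capacity];
    }

    private int index(long id) {
        long h = (id ^ (id >>> 32)) & 0x7fffffffL;
        return (int) (h % buckets.length);
    }

    /** O(1): link the element at the head of its bucket chain. */
    public void put(E e) {
        int i = index(e.getId());
        e.setNext(buckets[i]);
        buckets[i] = e;
    }

    /** Expected O(1): unlink from the bucket chain. Unlike a tree remove,
     *  there is no rebalancing work at all. */
    @SuppressWarnings("unchecked")
    public E remove(long id) {
        int i = index(id);
        Element prev = null;
        for (Element cur = buckets[i]; cur != null; cur = cur.getNext()) {
            if (cur.getId() == id) {
                if (prev == null) {
                    buckets[i] = cur.getNext();
                } else {
                    prev.setNext(cur.getNext());
                }
                cur.setNext(null);
                return (E) cur;
            }
            prev = cur;
        }
        return null;
    }
}
{code}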
[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibin Huang updated HDFS-13671:
--------------------------------
    Attachment: HDFS-13671-001.patch
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347348#comment-17347348 ]

Haibin Huang commented on HDFS-13671:
--------------------------------------

Thanks [~ferhui] and [~LiJinglun] for involving me here; I will submit a patch later.
[jira] [Commented] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable
[ https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309362#comment-17309362 ]

Haibin Huang commented on HDFS-15745:
--------------------------------------

Thanks for the comment, [~prasad-acit]. I have updated this patch for branch-3.1, branch-3.2 and branch-3.3. [~ayushtkn], would you mind committing them?

> Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-15745
>                 URL: https://issues.apache.org/jira/browse/HDFS-15745
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Haibin Huang
>            Assignee: Haibin Huang
>            Priority: Major
>             Fix For: 3.4.0
>
>         Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, HDFS-15745-003.patch, HDFS-15745-branch-3.1.001.patch, HDFS-15745-branch-3.2.001.patch, HDFS-15745-branch-3.3.001.patch, image-2020-12-22-17-00-50-796.png
>
> When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found many reported slow peers whose ReportingNodes' averageDelay was very low; those slow-peer nodes were actually normal. I think the reason so many slow peers are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too small (only 5 ms), and it is not configurable. The default slow-IO warning log threshold is 300 ms, i.e. DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300 ms; otherwise the NameNode will get a lot of invalid slow-peer information.
> !image-2020-12-22-17-00-50-796.png!
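For illustration, here is a hedged sketch of why the low threshold matters. The names are assumptions, not the actual DataNodePeerMetrics code: delays below the floor are dropped before outlier detection, so a 5 ms floor admits healthy peers while a 300 ms floor (matching the slow-IO warning default) filters the noise.

{code:java}
import java.util.Map;
import java.util.stream.Collectors;

public class LowThresholdSketch {
    // Keep only peers whose average delay is at or above the floor; everything
    // below is treated as normal and never reaches outlier detection.
    static Map<String, Double> candidateSlowPeers(
            Map<String, Double> avgDelayMsByPeer, double lowThresholdMs) {
        return avgDelayMsByPeer.entrySet().stream()
            .filter(e -> e.getValue() >= lowThresholdMs)
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        Map<String, Double> delays = Map.of("dn1", 6.0, "dn2", 12.0, "dn3", 450.0);
        System.out.println(candidateSlowPeers(delays, 5.0));   // keeps dn1, dn2, dn3: noisy
        System.out.println(candidateSlowPeers(delays, 300.0)); // keeps only dn3: a real slow peer
    }
}
{code}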
[jira] [Updated] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable
[ https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibin Huang updated HDFS-15745:
--------------------------------
    Attachment: HDFS-15745-branch-3.3.001.patch
                HDFS-15745-branch-3.2.001.patch
                HDFS-15745-branch-3.1.001.patch
[jira] [Commented] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection
[ https://issues.apache.org/jira/browse/HDFS-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268996#comment-17268996 ] Haibin Huang commented on HDFS-15744: - [~ayushtkn] [~aajisaka] [~elgoiri] [~hexiaoqiao] would you mind taking a look at this? We use this approach to detect slow disks in our company, and the accuracy of finding bad disks is over 90%. > Use cumulative counting way to improve the accuracy of slow disk detection > -- > > Key: HDFS-15744 > URL: https://issues.apache.org/jira/browse/HDFS-15744 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-15744-001.patch, image-2020-12-22-11-37-14-734.png, > image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png > > > HDFS has supported datanode disk outlier detection since > [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461]; we can use it > to find slow disks via the SlowDiskReport > ([HDFS-11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, > I found the slow disk information may not be accurate enough in practice, > because a large number of short-term writes can lead to miscalculation. Here > is an example: this disk is healthy, but when it encounters a lot of writes in > a few minutes, its write I/O does get slow and it will be considered a slow > disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep > it until the information becomes invalid. This scenario confuses us, since we > want to use the SlowDiskReport to detect the real bad disks. > !image-2020-12-22-11-37-14-734.png! > !image-2020-12-22-11-37-35-280.png! > To improve the detection accuracy, we use a cumulative counting way to detect > slow disks. If, within the reportValidityMs interval, a disk is considered an > outlier over 50% of the time, then it should be a real bad disk. > Here is an example: if reportValidityMs is one hour and the detection interval > is five minutes, there will be 12 disk outlier detections in one hour. If a > disk is considered an outlier more than 6 times, it should be a real bad disk. > We use this approach to detect bad disks in our cluster, and it reaches over > 90% accuracy. > !image-2020-12-22-11-46-48-817.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
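To make the counting rule above concrete, here is a minimal, self-contained sketch of the cumulative-counting idea, assuming a fixed detection interval and validity window; the class and method names are illustrative, not taken from the attached patch.

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

public class CumulativeSlowDiskDetector {
  private final int detectionsPerWindow;   // e.g. 1h validity / 5min interval = 12
  private final Deque<Boolean> outcomes = new ArrayDeque<>();
  private int outlierCount = 0;

  public CumulativeSlowDiskDetector(long reportValidityMs, long detectionIntervalMs) {
    this.detectionsPerWindow = (int) (reportValidityMs / detectionIntervalMs);
  }

  /** Record one detection round; returns true once the disk looks genuinely bad. */
  public boolean record(boolean isOutlierThisRound) {
    outcomes.addLast(isOutlierThisRound);
    if (isOutlierThisRound) {
      outlierCount++;
    }
    // Slide the window so only the last detectionsPerWindow rounds count.
    if (outcomes.size() > detectionsPerWindow && outcomes.removeFirst()) {
      outlierCount--;
    }
    // Over 50% of rounds flagged as outlier => treat as a real bad disk.
    return outlierCount * 2 > detectionsPerWindow;
  }
}
{code}

With the one-hour/five-minute numbers from the description, detectionsPerWindow is 12 and a disk is reported only after being an outlier in at least 7 of the last 12 rounds, so a brief burst of writes no longer pins the disk in the report.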
[jira] [Commented] (HDFS-15666) add average latency information to the SlowPeerReport
[ https://issues.apache.org/jira/browse/HDFS-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268985#comment-17268985 ] Haibin Huang commented on HDFS-15666: - [~ayushtkn] [~aajisaka] [~elgoiri] [~hexiaoqiao] would you mind taking a look at this? This improvement has had a good effect in my company. > add average latency information to the SlowPeerReport > - > > Key: HDFS-15666 > URL: https://issues.apache.org/jira/browse/HDFS-15666 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Minor > Attachments: HDFS-15666-003.patch, HDFS-15666-004.patch, > HDFS-15666.001.patch, HDFS-15666.002.patch > > > In the namenode's JMX, there is a SlowDisksReport like this: > {code:java} > [{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}] > {code} > So we can learn the disk I/O latency from this report. However, the > SlowPeersReport doesn't have the average latency: > {code:java} > [{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}] > {code} > I think we should add the average latency to the report, which can be obtained > from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers. > After adding the average latency, the SlowPeersReport can look like this: > {code:java} > [{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
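For illustration, the enriched entry could be modeled and serialized roughly as below. This is a sketch assuming Jackson (which Hadoop already ships; records need jackson-databind 2.12+ and Java 16+); the class names are made up for the example and are not the patch's types.

{code:java}
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;

public class SlowPeerReportSketch {
  // One reporting datanode together with the average latency it observed.
  public record ReportingNode(String nodeId, double averageLatency) {}
  // One slow-node entry of the SlowPeersReport array.
  public record SlowNodeEntry(String slowNode, List<ReportingNode> reportingNodes) {}

  public static void main(String[] args) throws JsonProcessingException {
    SlowNodeEntry entry = new SlowNodeEntry("node2",
        List.of(new ReportingNode("node1", 2000.0),
                new ReportingNode("node3", 1000.0)));
    // Prints JSON shaped like the proposed report in the description.
    System.out.println(new ObjectMapper().writeValueAsString(List.of(entry)));
  }
}
{code}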
[jira] [Commented] (HDFS-15758) Fix typos in MutableMetric
[ https://issues.apache.org/jira/browse/HDFS-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265667#comment-17265667 ] Haibin Huang commented on HDFS-15758: - [~ayushtkn] would you mind taking a look at this? > Fix typos in MutableMetric > -- > > Key: HDFS-15758 > URL: https://issues.apache.org/jira/browse/HDFS-15758 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Minor > Attachments: HDFS-15758-001.patch > > > Currently the javadoc of MutableMetric#changed may cause misunderstanding; it > needs to be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15758) Fix typos in MutableMetric
[ https://issues.apache.org/jira/browse/HDFS-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15758: Status: Patch Available (was: Open) > Fix typos in MutableMetric > -- > > Key: HDFS-15758 > URL: https://issues.apache.org/jira/browse/HDFS-15758 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Minor > Attachments: HDFS-15758-001.patch > > > Currently the javadoc of MutableMetric#changed may cause misunderstanding; it > needs to be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15758) Fix typos in MutableMetric
[ https://issues.apache.org/jira/browse/HDFS-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15758: Attachment: HDFS-15758-001.patch > Fix typos in MutableMetric > -- > > Key: HDFS-15758 > URL: https://issues.apache.org/jira/browse/HDFS-15758 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Minor > Attachments: HDFS-15758-001.patch > > > Currently the javadoc of MutableMetric#changed may cause misunderstanding; it > needs to be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15758) Fix typos in MutableMetric
[ https://issues.apache.org/jira/browse/HDFS-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15758: Priority: Minor (was: Major) > Fix typos in MutableMetric > -- > > Key: HDFS-15758 > URL: https://issues.apache.org/jira/browse/HDFS-15758 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Minor > > Currently the javadoc of MutableMetric#changed may cause misunderstanding; it > needs to be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15758) Fix typos in MutableMetric
Haibin Huang created HDFS-15758: --- Summary: Fix typos in MutableMetric Key: HDFS-15758 URL: https://issues.apache.org/jira/browse/HDFS-15758 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haibin Huang Assignee: Haibin Huang Currently the javadoc of MutableMetric#changed may cause misunderstanding; it needs to be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable
[ https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257080#comment-17257080 ] Haibin Huang commented on HDFS-15745: - Thanks [~ayushtkn], I have updated the patch. > Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES > configurable > -- > > Key: HDFS-15745 > URL: https://issues.apache.org/jira/browse/HDFS-15745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, > HDFS-15745-003.patch, image-2020-12-22-17-00-50-796.png > > > When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found > a lot of slow peers even though the ReportingNodes' averageDelay was very low > and these slow peer nodes were normal. I think the reason so many slow peers > are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is > too small (only 5ms) and it is not configurable. The default value of the slow > I/O warning log threshold is 300ms, i.e. > DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so > DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms, otherwise > the namenode will get a lot of invalid slow peer information. > !image-2020-12-22-17-00-50-796.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable
[ https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15745: Attachment: HDFS-15745-003.patch > Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES > configurable > -- > > Key: HDFS-15745 > URL: https://issues.apache.org/jira/browse/HDFS-15745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, > HDFS-15745-003.patch, image-2020-12-22-17-00-50-796.png > > > When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found > a lot of slow peers even though the ReportingNodes' averageDelay was very low > and these slow peer nodes were normal. I think the reason so many slow peers > are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is > too small (only 5ms) and it is not configurable. The default value of the slow > I/O warning log threshold is 300ms, i.e. > DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so > DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms, otherwise > the namenode will get a lot of invalid slow peer information. > !image-2020-12-22-17-00-50-796.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable
[ https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256966#comment-17256966 ] Haibin Huang commented on HDFS-15745: - Thanks [~ayushtkn] for the review, I have updated the patch; please take a look. > Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES > configurable > -- > > Key: HDFS-15745 > URL: https://issues.apache.org/jira/browse/HDFS-15745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, > image-2020-12-22-17-00-50-796.png > > > When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found > a lot of slow peers even though the ReportingNodes' averageDelay was very low > and these slow peer nodes were normal. I think the reason so many slow peers > are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is > too small (only 5ms) and it is not configurable. The default value of the slow > I/O warning log threshold is 300ms, i.e. > DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so > DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms, otherwise > the namenode will get a lot of invalid slow peer information. > !image-2020-12-22-17-00-50-796.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable
[ https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15745: Attachment: HDFS-15745-002.patch > Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES > configurable > -- > > Key: HDFS-15745 > URL: https://issues.apache.org/jira/browse/HDFS-15745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, > image-2020-12-22-17-00-50-796.png > > > When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found > a lot of slow peers even though the ReportingNodes' averageDelay was very low > and these slow peer nodes were normal. I think the reason so many slow peers > are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is > too small (only 5ms) and it is not configurable. The default value of the slow > I/O warning log threshold is 300ms, i.e. > DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so > DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms, otherwise > the namenode will get a lot of invalid slow peer information. > !image-2020-12-22-17-00-50-796.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14789) namenode should avoid slow node while chooseTarget in BlockPlacementPolicyDefault
[ https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14789: Attachment: image-2020-12-22-22-15-17-703.png > namenode should avoid slow node while chooseTarget in > BlockPlacementPolicyDefault > - > > Key: HDFS-14789 > URL: https://issues.apache.org/jira/browse/HDFS-14789 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-14789, HDFS-14789-001.patch, > image-2020-12-22-22-15-17-703.png > > > With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and > SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while > choosing targets in BlockPlacementPolicyDefault, because if there is a slow > node in the pipeline, the client might write very slowly. > I use an invalidityTime to keep the namenode from choosing a slow node before > the invalidity window expires. After the invalidityTime, if the slow node has > returned to normal, the namenode can choose it again; if it is still very slow, > the invalidityTime will be updated and the node will keep being excluded. > I also consider the fallback: if the namenode can't choose any normal node, > chooseTarget will throw NotEnoughReplicasException and retry, this time not > avoiding slow nodes. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14789) namenode should avoid slow node while chooseTarget in BlockPlacementPolicyDefault
[ https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14789: Description: With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while choosing targets in BlockPlacementPolicyDefault, because if there is a slow node in the pipeline, the client might write very slowly. I use an invalidityTime to keep the namenode from choosing a slow node before the invalidity window expires. After the invalidityTime, if the slow node has returned to normal, the namenode can choose it again; if it is still very slow, the invalidityTime will be updated and the node will keep being excluded. I also consider the fallback: if the namenode can't choose any normal node, chooseTarget will throw NotEnoughReplicasException and retry, this time not avoiding slow nodes. !image-2020-12-22-22-15-17-703.png|width=969,height=322! was: With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while choosing targets in BlockPlacementPolicyDefault, because if there is a slow node in the pipeline, the client might write very slowly. I use an invalidityTime to keep the namenode from choosing a slow node before the invalidity window expires. After the invalidityTime, if the slow node has returned to normal, the namenode can choose it again; if it is still very slow, the invalidityTime will be updated and the node will keep being excluded. I also consider the fallback: if the namenode can't choose any normal node, chooseTarget will throw NotEnoughReplicasException and retry, this time not avoiding slow nodes. > namenode should avoid slow node while chooseTarget in > BlockPlacementPolicyDefault > - > > Key: HDFS-14789 > URL: https://issues.apache.org/jira/browse/HDFS-14789 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-14789, HDFS-14789-001.patch, > image-2020-12-22-22-15-17-703.png > > > With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and > SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while > choosing targets in BlockPlacementPolicyDefault, because if there is a slow > node in the pipeline, the client might write very slowly. > I use an invalidityTime to keep the namenode from choosing a slow node before > the invalidity window expires. After the invalidityTime, if the slow node has > returned to normal, the namenode can choose it again; if it is still very slow, > the invalidityTime will be updated and the node will keep being excluded. > I also consider the fallback: if the namenode can't choose any normal node, > chooseTarget will throw NotEnoughReplicasException and retry, this time not > avoiding slow nodes. > > !image-2020-12-22-22-15-17-703.png|width=969,height=322! > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
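A minimal sketch of the invalidityTime bookkeeping described above, assuming the wall clock as the time source; the class and method names are illustrative, while the real change would live in BlockPlacementPolicyDefault.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SlowNodeExclusion {
  private final long invalidityTimeMs;
  // datanode UUID -> wall-clock time when the node may be chosen again
  private final Map<String, Long> excludedUntil = new ConcurrentHashMap<>();

  public SlowNodeExclusion(long invalidityTimeMs) {
    this.invalidityTimeMs = invalidityTimeMs;
  }

  /** Called while the latest SlowPeersReport still flags the node as slow. */
  public void markSlow(String datanodeUuid) {
    // Re-marking a still-slow node pushes its window forward, keeping it excluded.
    excludedUntil.put(datanodeUuid, System.currentTimeMillis() + invalidityTimeMs);
  }

  /**
   * chooseTarget consults this; the retry after NotEnoughReplicasException
   * passes avoidSlowNodes=false so the write still succeeds as a fallback.
   */
  public boolean isChoosable(String datanodeUuid, boolean avoidSlowNodes) {
    if (!avoidSlowNodes) {
      return true; // fallback: accept slow nodes rather than fail the write
    }
    Long until = excludedUntil.get(datanodeUuid);
    return until == null || System.currentTimeMillis() >= until;
  }
}
{code}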
[jira] [Updated] (HDFS-14789) namenode should avoid slow node while chooseTarget in BlockPlacementPolicyDefault
[ https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14789: Attachment: HDFS-14789-001.patch > namenode should avoid slow node while chooseTarget in > BlockPlacementPolicyDefault > - > > Key: HDFS-14789 > URL: https://issues.apache.org/jira/browse/HDFS-14789 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-14789, HDFS-14789-001.patch > > > With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and > SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while > choosing targets in BlockPlacementPolicyDefault, because if there is a slow > node in the pipeline, the client might write very slowly. > I use an invalidityTime to keep the namenode from choosing a slow node before > the invalidity window expires. After the invalidityTime, if the slow node has > returned to normal, the namenode can choose it again; if it is still very slow, > the invalidityTime will be updated and the node will keep being excluded. > I also consider the fallback: if the namenode can't choose any normal node, > chooseTarget will throw NotEnoughReplicasException and retry, this time not > avoiding slow nodes. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14789) namenode should avoid slow node while chooseTarget in BlockPlacementPolicyDefault
[ https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14789: Summary: namenode should avoid slow node while chooseTarget in BlockPlacementPolicyDefault (was: namenode should avoid slow node when choose target in BlockPlacementPolicyDefault) > namenode should avoid slow node while chooseTarget in > BlockPlacementPolicyDefault > - > > Key: HDFS-14789 > URL: https://issues.apache.org/jira/browse/HDFS-14789 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-14789 > > > With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and > SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while > choosing targets in BlockPlacementPolicyDefault, because if there is a slow > node in the pipeline, the client might write very slowly. > I use an invalidityTime to keep the namenode from choosing a slow node before > the invalidity window expires. After the invalidityTime, if the slow node has > returned to normal, the namenode can choose it again; if it is still very slow, > the invalidityTime will be updated and the node will keep being excluded. > I also consider the fallback: if the namenode can't choose any normal node, > chooseTarget will throw NotEnoughReplicasException and retry, this time not > avoiding slow nodes. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14789) namenode should avoid slow node when choose target in BlockPlacementPolicyDefault
[ https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14789: Description: With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while choosing targets in BlockPlacementPolicyDefault, because if there is a slow node in the pipeline, the client might write very slowly. I use an invalidityTime to keep the namenode from choosing a slow node before the invalidity window expires. After the invalidityTime, if the slow node has returned to normal, the namenode can choose it again; if it is still very slow, the invalidityTime will be updated and the node will keep being excluded. I also consider the fallback: if the namenode can't choose any normal node, chooseTarget will throw NotEnoughReplicasException and retry, this time not avoiding slow nodes. was: With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while choosing targets, since we can find slow nodes through the namenode's JMX. So I think the namenode should check these slow nodes when assigning a node for writing a block. If the namenode chooses a node in org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*, we should check whether it belongs to the slow nodes, because choosing a slow one to write data may take a long time, which can cause a client to write data very slowly and even encounter a socket timeout exception like this: {code:java} 2019-08-19,17:16:41,181 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception java.net.SocketTimeoutException: 495000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/xxx:xxx remote=/xxx:xxx] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at java.io.DataOutputStream.write(DataOutputStream.java:107) at org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:328) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:653){code} I use *maxChosenCount* to keep the datanode-choosing task from taking too long; it is calculated from the logarithm of the probability, and it also guarantees that the probability of choosing a slow node to write a block is less than 0.01%. Finally, I use an expiry time so the namenode doesn't choose these slow nodes within a specified period, because these slow nodes may have returned to normal after the period and can be used to write blocks again. > namenode should avoid slow node when choose target in > BlockPlacementPolicyDefault > - > > Key: HDFS-14789 > URL: https://issues.apache.org/jira/browse/HDFS-14789 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-14789 > > > With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and > SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while > choosing targets in BlockPlacementPolicyDefault, because if there is a slow > node in the pipeline, the client might write very slowly. > I use an invalidityTime to keep the namenode from choosing a slow node before > the invalidity window expires. After the invalidityTime, if the slow node has > returned to normal, the namenode can choose it again; if it is still very slow, > the invalidityTime will be updated and the node will keep being excluded. > I also consider the fallback: if the namenode can't choose any normal node, > chooseTarget will throw NotEnoughReplicasException and retry, this time not > avoiding slow nodes. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14789) namenode should avoid slow node when choose target in BlockPlacementPolicyDefault
[ https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14789: Description: With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while choosing targets, since we can find slow nodes through the namenode's JMX. So I think the namenode should check these slow nodes when assigning a node for writing a block. If the namenode chooses a node in org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*, we should check whether it belongs to the slow nodes, because choosing a slow one to write data may take a long time, which can cause a client to write data very slowly and even encounter a socket timeout exception like this: {code:java} 2019-08-19,17:16:41,181 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception java.net.SocketTimeoutException: 495000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/xxx:xxx remote=/xxx:xxx] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at java.io.DataOutputStream.write(DataOutputStream.java:107) at org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:328) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:653){code} I use *maxChosenCount* to keep the datanode-choosing task from taking too long; it is calculated from the logarithm of the probability, and it also guarantees that the probability of choosing a slow node to write a block is less than 0.01%. Finally, I use an expiry time so the namenode doesn't choose these slow nodes within a specified period, because these slow nodes may have returned to normal after the period and can be used to write blocks again. was: With HDFS-11194 and HDFS-11551, we can find slow nodes through the namenode's JMX. So I think the namenode should check these slow nodes when assigning a node for writing a block. If the namenode chooses a node in org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*, we should check whether it belongs to the slow nodes, because choosing a slow one to write data may take a long time, which can cause a client to write data very slowly and even encounter a socket timeout exception like this: {code:java} 2019-08-19,17:16:41,181 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception java.net.SocketTimeoutException: 495000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/xxx:xxx remote=/xxx:xxx] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at java.io.DataOutputStream.write(DataOutputStream.java:107) at org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:328) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:653){code} I use *maxChosenCount* to keep the datanode-choosing task from taking too long; it is calculated from the logarithm of the probability, and it also guarantees that the probability of choosing a slow node to write a block is less than 0.01%. Finally, I use an expiry time so the namenode doesn't choose these slow nodes within a specified period, because these slow nodes may have returned to normal after the period and can be used to write blocks again. > namenode should avoid slow node when choose target in > BlockPlacementPolicyDefault > - > > Key: HDFS-14789 > URL: https://issues.apache.org/jira/browse/HDFS-14789 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-14789 > > > With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and > SlowDisksReport in JMX. I think the namenode can avoid these slow nodes while > choosing targets, since we can find slow nodes through the namenode's JMX. So > I think the namenode should check these slow nodes when assigning a node for > writing a block. If the namenode chooses a node in > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*, > we should check whether it belongs to the slow nodes, because choosing a slow > one to write data may take a long time, which can cause a client to write data > very slowly and even encounter a socket timeout exception.
[jira] [Updated] (HDFS-14789) namenode should avoid slow node when choose target in BlockPlacementPolicyDefault
[ https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14789: Summary: namenode should avoid slow node when choose target in BlockPlacementPolicyDefault (was: namenode should check slow node when assigning a node for writing block ) > namenode should avoid slow node when choose target in > BlockPlacementPolicyDefault > - > > Key: HDFS-14789 > URL: https://issues.apache.org/jira/browse/HDFS-14789 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-14789 > > > With HDFS-11194 and HDFS-11551, we can find slow nodes through the namenode's > JMX. So I think the namenode should check these slow nodes when assigning a > node for writing a block. If the namenode chooses a node in > org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*, > we should check whether it belongs to the slow nodes, because choosing a slow > one to write data may take a long time, which can cause a client to write data > very slowly and even encounter a socket timeout exception like this: > > {code:java} > 2019-08-19,17:16:41,181 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer > Exception java.net.SocketTimeoutException: 495000 millis timeout while waiting > for channel to be ready for write. ch : > java.nio.channels.SocketChannel[connected local=/xxx:xxx remote=/xxx:xxx] at > org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) > at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at > java.io.DataOutputStream.write(DataOutputStream.java:107) at > org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:328) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:653){code} > > I use *maxChosenCount* to keep the datanode-choosing task from taking too > long; it is calculated from the logarithm of the probability, and it also > guarantees that the probability of choosing a slow node to write a block is > less than 0.01%. > Finally, I use an expiry time so the namenode doesn't choose these slow nodes > within a specified period, because these slow nodes may have returned to > normal after the period and can be used to write blocks again. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
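The maxChosenCount arithmetic hinted at above can be reconstructed as follows: if a fraction p of candidate datanodes is slow and each random pick is independent, the probability that k picks in a row all land on slow nodes is p^k, so the smallest k with p^k <= 0.0001 is k = ceil(ln 0.0001 / ln p). A hedged sketch of that derivation; the method name and the independence assumption are mine, not taken from the attached patch.

{code:java}
public class MaxChosenCount {
  /** Picks needed so the chance of only ever seeing slow nodes stays below 0.01%. */
  static int maxChosenCount(double slowNodeFraction) {
    final double target = 1e-4;              // 0.01% residual probability
    if (slowNodeFraction <= 0.0) {
      return 1;                              // no slow nodes: one pick suffices
    }
    // Smallest k with p^k <= target  <=>  k >= ln(target) / ln(p)
    return (int) Math.ceil(Math.log(target) / Math.log(slowNodeFraction));
  }

  public static void main(String[] args) {
    // If 10% of the cluster is slow, 4 random picks bound the risk to 0.01%.
    System.out.println(maxChosenCount(0.10)); // prints 4
  }
}
{code}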
[jira] [Updated] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable
[ https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15745: Attachment: HDFS-15745-001.patch Status: Patch Available (was: Open) > Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES > configurable > -- > > Key: HDFS-15745 > URL: https://issues.apache.org/jira/browse/HDFS-15745 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-15745-001.patch, image-2020-12-22-17-00-50-796.png > > > When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found > a lot of slow peers even though the ReportingNodes' averageDelay was very low > and these slow peer nodes were normal. I think the reason so many slow peers > are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is > too small (only 5ms) and it is not configurable. The default value of the slow > I/O warning log threshold is 300ms, i.e. > DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so > DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms, otherwise > the namenode will get a lot of invalid slow peer information. > !image-2020-12-22-17-00-50-796.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable
Haibin Huang created HDFS-15745: --- Summary: Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable Key: HDFS-15745 URL: https://issues.apache.org/jira/browse/HDFS-15745 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haibin Huang Assignee: Haibin Huang Attachments: image-2020-12-22-17-00-50-796.png When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found a lot of slow peers even though the ReportingNodes' averageDelay was very low and these slow peer nodes were normal. I think the reason so many slow peers are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too small (only 5ms) and it is not configurable. The default value of the slow I/O warning log threshold is 300ms, i.e. DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms, otherwise the namenode will get a lot of invalid slow peer information. !image-2020-12-22-17-00-50-796.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection
[ https://issues.apache.org/jira/browse/HDFS-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15744: Status: Patch Available (was: Open) > Use cumulative counting way to improve the accuracy of slow disk detection > -- > > Key: HDFS-15744 > URL: https://issues.apache.org/jira/browse/HDFS-15744 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-15744-001.patch, image-2020-12-22-11-37-14-734.png, > image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png > > > HDFS has supported datanode disk outlier detection since > [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461]; we can use it > to find slow disks via the SlowDiskReport > ([HDFS-11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, > I found the slow disk information may not be accurate enough in practice, > because a large number of short-term writes can lead to miscalculation. Here > is an example: this disk is healthy, but when it encounters a lot of writes in > a few minutes, its write I/O does get slow and it will be considered a slow > disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep > it until the information becomes invalid. This scenario confuses us, since we > want to use the SlowDiskReport to detect the real bad disks. > !image-2020-12-22-11-37-14-734.png! > !image-2020-12-22-11-37-35-280.png! > To improve the detection accuracy, we use a cumulative counting way to detect > slow disks. If, within the reportValidityMs interval, a disk is considered an > outlier over 50% of the time, then it should be a real bad disk. > Here is an example: if reportValidityMs is one hour and the detection interval > is five minutes, there will be 12 disk outlier detections in one hour. If a > disk is considered an outlier more than 6 times, it should be a real bad disk. > We use this approach to detect bad disks in our cluster, and it reaches over > 90% accuracy. > !image-2020-12-22-11-46-48-817.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection
[ https://issues.apache.org/jira/browse/HDFS-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15744: Attachment: HDFS-15744-001.patch > Use cumulative counting way to improve the accuracy of slow disk detection > -- > > Key: HDFS-15744 > URL: https://issues.apache.org/jira/browse/HDFS-15744 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-15744-001.patch, image-2020-12-22-11-37-14-734.png, > image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png > > > HDFS has supported datanode disk outlier detection since > [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461]; we can use it > to find slow disks via the SlowDiskReport > ([HDFS-11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, > I found the slow disk information may not be accurate enough in practice, > because a large number of short-term writes can lead to miscalculation. Here > is an example: this disk is healthy, but when it encounters a lot of writes in > a few minutes, its write I/O does get slow and it will be considered a slow > disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep > it until the information becomes invalid. This scenario confuses us, since we > want to use the SlowDiskReport to detect the real bad disks. > !image-2020-12-22-11-37-14-734.png! > !image-2020-12-22-11-37-35-280.png! > To improve the detection accuracy, we use a cumulative counting way to detect > slow disks. If, within the reportValidityMs interval, a disk is considered an > outlier over 50% of the time, then it should be a real bad disk. > Here is an example: if reportValidityMs is one hour and the detection interval > is five minutes, there will be 12 disk outlier detections in one hour. If a > disk is considered an outlier more than 6 times, it should be a real bad disk. > We use this approach to detect bad disks in our cluster, and it reaches over > 90% accuracy. > !image-2020-12-22-11-46-48-817.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection
[ https://issues.apache.org/jira/browse/HDFS-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15744: Description: HDFS has supported datanode disk outlier detection since [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461]; we can use it to find slow disks via the SlowDiskReport ([HDFS-11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, I found the slow disk information may not be accurate enough in practice, because a large number of short-term writes can lead to miscalculation. Here is an example: this disk is healthy, but when it encounters a lot of writes in a few minutes, its write I/O does get slow and it will be considered a slow disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep it until the information becomes invalid. This scenario confuses us, since we want to use the SlowDiskReport to detect the real bad disks. !image-2020-12-22-11-37-14-734.png! !image-2020-12-22-11-37-35-280.png! To improve the detection accuracy, we use a cumulative counting way to detect slow disks. If, within the reportValidityMs interval, a disk is considered an outlier over 50% of the time, then it should be a real bad disk. Here is an example: if reportValidityMs is one hour and the detection interval is five minutes, there will be 12 disk outlier detections in one hour. If a disk is considered an outlier more than 6 times, it should be a real bad disk. We use this approach to detect bad disks in our cluster, and it reaches over 90% accuracy. !image-2020-12-22-11-46-48-817.png! was: HDFS has supported datanode disk outlier detection since [11461|https://issues.apache.org/jira/browse/HDFS-11461]; we can use it to find slow disks via the SlowDiskReport (11551). However, I found the slow disk information may not be accurate enough in practice, because a large number of short-term writes can lead to miscalculation. Here is an example: this disk is healthy, but when it encounters a lot of writes in a few minutes, its write I/O does get slow and it will be considered a slow disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep it until the information becomes invalid. This scenario confuses us, since we want to use the SlowDiskReport to detect the real bad disks. !image-2020-12-22-11-37-14-734.png! !image-2020-12-22-11-37-35-280.png! To improve the detection accuracy, we use a cumulative counting way to detect slow disks. If, within the reportValidityMs interval, a disk is considered an outlier over 50% of the time, then it should be a real bad disk. Here is an example: if reportValidityMs is one hour and the detection interval is five minutes, there will be 12 disk outlier detections in one hour. If a disk is considered an outlier more than 6 times, it should be a real bad disk. We use this approach to detect bad disks in our cluster, and it reaches over 90% accuracy. !image-2020-12-22-11-46-48-817.png! > Use cumulative counting way to improve the accuracy of slow disk detection > -- > > Key: HDFS-15744 > URL: https://issues.apache.org/jira/browse/HDFS-15744 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: image-2020-12-22-11-37-14-734.png, > image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png > > > HDFS has supported datanode disk outlier detection since > [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461]; we can use it > to find slow disks via the SlowDiskReport > ([HDFS-11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, > I found the slow disk information may not be accurate enough in practice, > because a large number of short-term writes can lead to miscalculation. Here > is an example: this disk is healthy, but when it encounters a lot of writes in > a few minutes, its write I/O does get slow and it will be considered a slow > disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep > it until the information becomes invalid. This scenario confuses us, since we > want to use the SlowDiskReport to detect the real bad disks. > !image-2020-12-22-11-37-14-734.png! > !image-2020-12-22-11-37-35-280.png! > To improve the detection accuracy, we use a cumulative counting way to detect > slow disks. If, within the reportValidityMs interval, a disk is considered an > outlier over 50% of the time, then it should be a real bad disk. > Here is an example: if reportValidityMs is one hour and the detection interval > is five minutes, there will be 12 disk outlier detections in one hour. If a > disk is considered an outlier more than 6 times, it should be a real bad disk. > We use this approach to detect bad disks in our cluster, and it reaches over > 90% accuracy. > !image-2020-12-22-11-46-48-817.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection
[ https://issues.apache.org/jira/browse/HDFS-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15744: Description: HDFS has supported datanode disk outlier detection since [11461|https://issues.apache.org/jira/browse/HDFS-11461]; we can use it to find slow disks via the SlowDiskReport (11551). However, I found the slow disk information may not be accurate enough in practice, because a large number of short-term writes can lead to miscalculation. Here is an example: this disk is healthy, but when it encounters a lot of writes in a few minutes, its write I/O does get slow and it will be considered a slow disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep it until the information becomes invalid. This scenario confuses us, since we want to use the SlowDiskReport to detect the real bad disks. !image-2020-12-22-11-37-14-734.png! !image-2020-12-22-11-37-35-280.png! To improve the detection accuracy, we use a cumulative counting way to detect slow disks. If, within the reportValidityMs interval, a disk is considered an outlier over 50% of the time, then it should be a real bad disk. Here is an example: if reportValidityMs is one hour and the detection interval is five minutes, there will be 12 disk outlier detections in one hour. If a disk is considered an outlier more than 6 times, it should be a real bad disk. We use this approach to detect bad disks in our cluster, and it reaches over 90% accuracy. !image-2020-12-22-11-46-48-817.png! was: HDFS-11461 supports datanode disk outlier detection; we can use it to find slow disks via the SlowDiskReport (HDFS-11551). However, I found the slow disk information may not be accurate enough in practice, because a large number of short-term writes can lead to miscalculation. Here is an example: this disk is healthy, but when it encounters a lot of writes in a few minutes, its write I/O does get slow and it will be considered a slow disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep it until the information becomes invalid. This scenario confuses us, since we want to use the SlowDiskReport to detect the real bad disks. !image-2020-12-22-11-37-14-734.png! !image-2020-12-22-11-37-35-280.png! To improve the detection accuracy, we use a cumulative counting way to detect slow disks. If, within the reportValidityMs interval, a disk is considered an outlier over 50% of the time, then it should be a real bad disk. Here is an example: if reportValidityMs is one hour and the detection interval is five minutes, there will be 12 disk outlier detections in one hour. If a disk is considered an outlier more than 6 times, it should be a real bad disk. We use this approach to detect bad disks in our cluster, and it reaches over 90% accuracy. !image-2020-12-22-11-46-48-817.png! > Use cumulative counting way to improve the accuracy of slow disk detection > -- > > Key: HDFS-15744 > URL: https://issues.apache.org/jira/browse/HDFS-15744 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: image-2020-12-22-11-37-14-734.png, > image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png > > > HDFS has supported datanode disk outlier detection since > [11461|https://issues.apache.org/jira/browse/HDFS-11461]; we can use it to > find slow disks via the SlowDiskReport (11551). However, I found the slow disk > information may not be accurate enough in practice, because a large number of > short-term writes can lead to miscalculation. Here is an example: this disk is > healthy, but when it encounters a lot of writes in a few minutes, its write > I/O does get slow and it will be considered a slow disk. The disk is only slow > for a few minutes, but the SlowDiskReport will keep it until the information > becomes invalid. This scenario confuses us, since we want to use the > SlowDiskReport to detect the real bad disks. > !image-2020-12-22-11-37-14-734.png! > !image-2020-12-22-11-37-35-280.png! > To improve the detection accuracy, we use a cumulative counting way to detect > slow disks. If, within the reportValidityMs interval, a disk is considered an > outlier over 50% of the time, then it should be a real bad disk. > Here is an example: if reportValidityMs is one hour and the detection interval > is five minutes, there will be 12 disk outlier detections in one hour. If a > disk is considered an outlier more than 6 times, it should be a real bad disk. > We use this approach to detect bad disks in our cluster, and it reaches over > 90% accuracy. > !image-2020-12-22-11-46-48-817.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection
[ https://issues.apache.org/jira/browse/HDFS-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15744: Description: HDFS has supported datanode disk outlier detection since [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461]; we can use it to find slow disks via the SlowDiskReport ([11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, I found the slow disk information may not be accurate enough in practice, because a large number of short-term writes can lead to miscalculation. Here is an example: this disk is healthy, but when it encounters a lot of writes in a few minutes, its write I/O does get slow and it will be considered a slow disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep it until the information becomes invalid. This scenario confuses us, since we want to use the SlowDiskReport to detect the real bad disks. !image-2020-12-22-11-37-14-734.png! !image-2020-12-22-11-37-35-280.png! To improve the detection accuracy, we use a cumulative counting way to detect slow disks. If, within the reportValidityMs interval, a disk is considered an outlier over 50% of the time, then it should be a real bad disk. Here is an example: if reportValidityMs is one hour and the detection interval is five minutes, there will be 12 disk outlier detections in one hour. If a disk is considered an outlier more than 6 times, it should be a real bad disk. We use this approach to detect bad disks in our cluster, and it reaches over 90% accuracy. !image-2020-12-22-11-46-48-817.png! was: HDFS-11461 supports datanode disk outlier detection; we can use it to find slow disks via the SlowDiskReport (HDFS-11551). However, I found the slow disk information may not be accurate enough in practice, because a large number of short-term writes can lead to miscalculation. Here is an example: this disk is healthy, but when it encounters a lot of writes in a few minutes, its write I/O does get slow and it will be considered a slow disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep it until the information becomes invalid. This scenario confuses us, since we want to use the SlowDiskReport to detect the real bad disks. !image-2020-12-22-11-37-14-734.png! !image-2020-12-22-11-37-35-280.png! To improve the detection accuracy, we use a cumulative counting way to detect slow disks. If, within the reportValidityMs interval, a disk is considered an outlier over 50% of the time, then it should be a real bad disk. Here is an example: if reportValidityMs is one hour and the detection interval is five minutes, there will be 12 disk outlier detections in one hour. If a disk is considered an outlier more than 6 times, it should be a real bad disk. We use this approach to detect bad disks in our cluster, and it reaches over 90% accuracy. !image-2020-12-22-11-46-48-817.png! > Use cumulative counting way to improve the accuracy of slow disk detection > -- > > Key: HDFS-15744 > URL: https://issues.apache.org/jira/browse/HDFS-15744 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: image-2020-12-22-11-37-14-734.png, > image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png > > > HDFS has supported datanode disk outlier detection since > [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461]; we can use it > to find slow disks via the SlowDiskReport > ([11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, I found > the slow disk information may not be accurate enough in practice, because a > large number of short-term writes can lead to miscalculation. Here is an > example: this disk is healthy, but when it encounters a lot of writes in a few > minutes, its write I/O does get slow and it will be considered a slow disk. > The disk is only slow for a few minutes, but the SlowDiskReport will keep it > until the information becomes invalid. This scenario confuses us, since we > want to use the SlowDiskReport to detect the real bad disks. > !image-2020-12-22-11-37-14-734.png! > !image-2020-12-22-11-37-35-280.png! > To improve the detection accuracy, we use a cumulative counting way to detect > slow disks. If, within the reportValidityMs interval, a disk is considered an > outlier over 50% of the time, then it should be a real bad disk. > Here is an example: if reportValidityMs is one hour and the detection interval > is five minutes, there will be 12 disk outlier detections in one hour. If a > disk is considered an outlier more than 6 times, it should be a real bad disk. > We use this approach to detect bad disks in our cluster, and it reaches over > 90% accuracy. > !image-2020-12-22-11-46-48-817.png! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection
[ https://issues.apache.org/jira/browse/HDFS-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15744:
Description:
HDFS-11461 supports datanode disk outlier detection, and we can use it to find slow disks via the SlowDiskReport (HDFS-11551). However, I found the slow disk information may not be accurate enough in practice, because a large number of short-term writes can lead to miscalculation. Here is an example: this disk is healthy, but when it encounters a lot of writing within a few minutes, its write I/O does get slow and it will be considered a slow disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep it until the information becomes invalid. This scenario confuses us, since we want to use the SlowDiskReport to detect really bad disks. !image-2020-12-22-11-37-14-734.png! !image-2020-12-22-11-37-35-280.png!
To improve the detection accuracy, we use a cumulative counting way to detect slow disks: if, within the reportValidityMs interval, a disk is considered an outlier in more than 50% of the detection rounds, then it should be a really bad disk. Here is an example: if reportValidityMs is one hour and the detection interval is five minutes, there will be 12 rounds of disk outlier detection in one hour; if a disk is considered an outlier more than 6 times, it should be a really bad disk. We use this approach to detect bad disks in our cluster, and it reaches over 90% accuracy. !image-2020-12-22-11-46-48-817.png!

was: [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461] supports datanode disk outlier detection, and we can use it to find slow disks via the SlowDiskReport ([HDFS-11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, I found the slow disk information may not be accurate enough in practice, because a large number of short-term writes can lead to miscalculation. Here is an example: this disk is healthy, but when it encounters a lot of writing within a few minutes, its write I/O does get slow and it will be considered a slow disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep it until the information becomes invalid. This scenario confuses us, since we want to use the SlowDiskReport to detect really bad disks. !image-2020-12-22-11-37-14-734.png! !image-2020-12-22-11-37-35-280.png! To improve the detection accuracy, we use a cumulative counting way to detect slow disks: if, within the reportValidityMs interval, a disk is considered an outlier in more than 50% of the detection rounds, then it should be a really bad disk. Here is an example: if reportValidityMs is one hour and the detection interval is five minutes, there will be 12 rounds of disk outlier detection in one hour; if a disk is considered an outlier more than 6 times, it should be a really bad disk. We use this approach to detect bad disks in our cluster, and it reaches over 90% accuracy. !image-2020-12-22-11-46-48-817.png!

> Use cumulative counting way to improve the accuracy of slow disk detection
> --
>
> Key: HDFS-15744
> URL: https://issues.apache.org/jira/browse/HDFS-15744
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: image-2020-12-22-11-37-14-734.png, image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png
>
> HDFS-11461 supports datanode disk outlier detection, and we can use it to find slow disks via the SlowDiskReport (HDFS-11551). However, I found the slow disk information may not be accurate enough in practice, because a large number of short-term writes can lead to miscalculation.
> Here is an example: this disk is healthy, but when it encounters a lot of writing within a few minutes, its write I/O does get slow and it will be considered a slow disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep it until the information becomes invalid. This scenario confuses us, since we want to use the SlowDiskReport to detect really bad disks.
> !image-2020-12-22-11-37-14-734.png!
> !image-2020-12-22-11-37-35-280.png!
> To improve the detection accuracy, we use a cumulative counting way to detect slow disks: if, within the reportValidityMs interval, a disk is considered an outlier in more than 50% of the detection rounds, then it should be a really bad disk.
> Here is an example: if reportValidityMs is one hour and the detection interval is five minutes, there will be 12 rounds of disk outlier detection in one hour; if a disk is considered an outlier more than 6 times, it should be a really bad disk. We use this approach to detect bad disks in our cluster, and it reaches over 90% accuracy.
> !image-2020-12-22-11-46-48-817.png!
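For illustration, here is a minimal sketch of the cumulative counting idea described above. It is not the actual patch: the class and method names (OutlierTracker, recordDetection, isBadDisk) are hypothetical, and the thresholds mirror the example in the description (one-hour reportValidityMs, five-minute detection interval, flagged in more than half of the rounds).
{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of cumulative slow-disk counting (not the committed change). */
public class OutlierTracker {
  private final long reportValidityMs;     // e.g. 3_600_000 ms (one hour)
  private final long detectionIntervalMs;  // e.g. 300_000 ms (five minutes)
  // Timestamps of detection rounds in which this disk was flagged as an outlier.
  private final Deque<Long> outlierTimestamps = new ArrayDeque<>();

  public OutlierTracker(long reportValidityMs, long detectionIntervalMs) {
    this.reportValidityMs = reportValidityMs;
    this.detectionIntervalMs = detectionIntervalMs;
  }

  /** Called once per detection round for this disk. */
  public synchronized void recordDetection(boolean isOutlier, long nowMs) {
    if (isOutlier) {
      outlierTimestamps.addLast(nowMs);
    }
    // Drop flags that have aged out of the validity window.
    while (!outlierTimestamps.isEmpty()
        && nowMs - outlierTimestamps.peekFirst() > reportValidityMs) {
      outlierTimestamps.removeFirst();
    }
  }

  /** A disk is really bad if it was flagged in more than half of the rounds. */
  public synchronized boolean isBadDisk() {
    long roundsPerWindow = reportValidityMs / detectionIntervalMs; // 12 in the example
    return outlierTimestamps.size() > roundsPerWindow / 2;         // > 6 of 12
  }
}
{code}
With one round every five minutes, a disk that is slow only during a short write burst accumulates one or two flags and never crosses the 6-of-12 threshold, while a genuinely failing disk keeps accumulating flags across the whole window.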
[jira] [Created] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection
Haibin Huang created HDFS-15744:
---
Summary: Use cumulative counting way to improve the accuracy of slow disk detection
Key: HDFS-15744
URL: https://issues.apache.org/jira/browse/HDFS-15744
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Haibin Huang
Assignee: Haibin Huang
Attachments: image-2020-12-22-11-37-14-734.png, image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png

HDFS-11461 supports datanode disk outlier detection, and we can use it to find slow disks via the SlowDiskReport (HDFS-11551). However, I found the slow disk information may not be accurate enough in practice, because a large number of short-term writes can lead to miscalculation. Here is an example: this disk is healthy, but when it encounters a lot of writing within a few minutes, its write I/O does get slow and it will be considered a slow disk. The disk is only slow for a few minutes, but the SlowDiskReport will keep it until the information becomes invalid. This scenario confuses us, since we want to use the SlowDiskReport to detect really bad disks. !image-2020-12-22-11-37-14-734.png! !image-2020-12-22-11-37-35-280.png!
To improve the detection accuracy, we use a cumulative counting way to detect slow disks: if, within the reportValidityMs interval, a disk is considered an outlier in more than 50% of the detection rounds, then it should be a really bad disk. Here is an example: if reportValidityMs is one hour and the detection interval is five minutes, there will be 12 rounds of disk outlier detection in one hour; if a disk is considered an outlier more than 6 times, it should be a really bad disk. We use this approach to detect bad disks in our cluster, and it reaches over 90% accuracy. !image-2020-12-22-11-46-48-817.png!
[jira] [Updated] (HDFS-15666) add average latency information to the SlowPeerReport
[ https://issues.apache.org/jira/browse/HDFS-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15666:
Attachment: HDFS-15666-004.patch

> add average latency information to the SlowPeerReport
> -
>
> Key: HDFS-15666
> URL: https://issues.apache.org/jira/browse/HDFS-15666
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15666-003.patch, HDFS-15666-004.patch, HDFS-15666.001.patch, HDFS-15666.002.patch
>
> In the namenode's JMX, there is a SlowDisksReport like this:
> {code:java}
> [{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
> {code}
> So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
> {code}
> I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers.
> After adding the average latency, the SlowPeersReport can look like this:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]
> {code}
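As a rough illustration of the proposed report shape, the following sketch assembles one enriched SlowNode entry. The class SlowPeerJsonSketch and its helper are hypothetical; they only mimic the JSON that would be built from the per-reporter latencies in SlowPeerReports#slowPeers, not the actual serialization code in Hadoop.
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch: render a SlowPeersReport entry that carries average latency. */
public class SlowPeerJsonSketch {
  /** Build one SlowNode entry from a map of reporting node -> observed average latency. */
  static String slowNodeToJson(String slowNode, Map<String, Double> reporterLatencies) {
    StringJoiner reporters = new StringJoiner(",");
    for (Map.Entry<String, Double> e : reporterLatencies.entrySet()) {
      // Each reporter now carries the latency it measured toward the slow node.
      reporters.add(String.format("{\"nodeId\":\"%s\",\"averageLatency\":%.1f}",
          e.getKey(), e.getValue()));
    }
    return String.format("{\"SlowNode\":\"%s\",\"ReportingNodes\":[%s]}",
        slowNode, reporters);
  }

  public static void main(String[] args) {
    Map<String, Double> reporters = new LinkedHashMap<>();
    reporters.put("node1", 2000.0);
    reporters.put("node3", 1000.0);
    System.out.println(slowNodeToJson("node2", reporters));
    // {"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]}
  }
}
{code}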
[jira] [Updated] (HDFS-15666) add average latency information to the SlowPeerReport
[ https://issues.apache.org/jira/browse/HDFS-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15666:
Attachment: HDFS-15666-003.patch

> add average latency information to the SlowPeerReport
> -
>
> Key: HDFS-15666
> URL: https://issues.apache.org/jira/browse/HDFS-15666
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15666-003.patch, HDFS-15666.001.patch, HDFS-15666.002.patch
>
> In the namenode's JMX, there is a SlowDisksReport like this:
> {code:java}
> [{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
> {code}
> So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
> {code}
> I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers.
> After adding the average latency, the SlowPeersReport can look like this:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]
> {code}
[jira] [Commented] (HDFS-15733) Add seqno in log when BlockReceiver receive packet
[ https://issues.apache.org/jira/browse/HDFS-15733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251761#comment-17251761 ] Haibin Huang commented on HDFS-15733: - Thanks [~elgoiri] for the comment, I have updated the patch; now the debug message looks like this:
{code:java}
2020-12-18 21:18:10,356 DEBUG datanode.DataNode (BlockReceiver.java:receivePacket(541)) - Receiving one packet for block BP-985744046-192.168.0.100-1608297475921:blk_1073741828_0 seqno:100 header:PacketHeader with packetLen=8 header data: offsetInBlock: 0
{code}

> Add seqno in log when BlockReceiver receive packet
> --
>
> Key: HDFS-15733
> URL: https://issues.apache.org/jira/browse/HDFS-15733
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15733-001.patch, HDFS-15733-002.patch
>
> There is a debug log printed when BlockReceiver receives a new packet; however, we can't tell which packet this debug log belongs to. I think it would be better to add a sequence number to the log.
> Currently the debug log looks like this, missing the seqno of the packet:
> {code:java}
> 2020-12-11,16:26:30,518 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving one packet for block BP-XXX:blk_XXX: PacketHeader with packetLen=2559 header data: offsetInBlock: 1
> {code}
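A minimal sketch of the kind of log statement that would produce the message above. The helper below is hypothetical (the real change lives in BlockReceiver.receivePacket()); it only illustrates carrying the seqno alongside the existing header fields.
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Illustrative sketch only: log the packet seqno in BlockReceiver-style code. */
public class PacketLogSketch {
  private static final Logger LOG = LoggerFactory.getLogger(PacketLogSketch.class);

  static void logPacket(String block, long seqno, String headerSummary) {
    // Including the seqno lets a client-side slow-write line (which carries
    // "ack: seqno: N") be matched with the datanode-side receive of the same packet.
    LOG.debug("Receiving one packet for block {} seqno:{} header:{}",
        block, seqno, headerSummary);
  }
}
{code}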
[jira] [Updated] (HDFS-15733) Add seqno in log when BlockReceiver receive packet
[ https://issues.apache.org/jira/browse/HDFS-15733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15733:
Attachment: HDFS-15733-002.patch

> Add seqno in log when BlockReceiver receive packet
> --
>
> Key: HDFS-15733
> URL: https://issues.apache.org/jira/browse/HDFS-15733
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15733-001.patch, HDFS-15733-002.patch
>
> There is a debug log printed when BlockReceiver receives a new packet; however, we can't tell which packet this debug log belongs to. I think it would be better to add a sequence number to the log.
> Currently the debug log looks like this, missing the seqno of the packet:
> {code:java}
> 2020-12-11,16:26:30,518 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving one packet for block BP-XXX:blk_XXX: PacketHeader with packetLen=2559 header data: offsetInBlock: 1
> {code}
[jira] [Updated] (HDFS-15666) add average latency information to the SlowPeerReport
[ https://issues.apache.org/jira/browse/HDFS-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15666:
Attachment: HDFS-15666.002.patch

> add average latency information to the SlowPeerReport
> -
>
> Key: HDFS-15666
> URL: https://issues.apache.org/jira/browse/HDFS-15666
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15666.001.patch, HDFS-15666.002.patch
>
> In the namenode's JMX, there is a SlowDisksReport like this:
> {code:java}
> [{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
> {code}
> So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
> {code}
> I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers.
> After adding the average latency, the SlowPeersReport can look like this:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]
> {code}
[jira] [Updated] (HDFS-15733) Add seqno in log when BlockReceiver receive packet
[ https://issues.apache.org/jira/browse/HDFS-15733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15733:
Attachment: HDFS-15733-001.patch

> Add seqno in log when BlockReceiver receive packet
> --
>
> Key: HDFS-15733
> URL: https://issues.apache.org/jira/browse/HDFS-15733
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15733-001.patch
>
> There is a debug log printed when BlockReceiver receives a new packet; however, we can't tell which packet this debug log belongs to. I think it would be better to add a sequence number to the log.
> Currently the debug log looks like this, missing the seqno of the packet:
> {code:java}
> 2020-12-11,16:26:30,518 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving one packet for block BP-XXX:blk_XXX: PacketHeader with packetLen=2559 header data: offsetInBlock: 1
> {code}
[jira] [Updated] (HDFS-15733) Add seqno in log when BlockReceiver receive packet
[ https://issues.apache.org/jira/browse/HDFS-15733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15733:
Status: Patch Available (was: Open)

> Add seqno in log when BlockReceiver receive packet
> --
>
> Key: HDFS-15733
> URL: https://issues.apache.org/jira/browse/HDFS-15733
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15733-001.patch
>
> There is a debug log printed when BlockReceiver receives a new packet; however, we can't tell which packet this debug log belongs to. I think it would be better to add a sequence number to the log.
> Currently the debug log looks like this, missing the seqno of the packet:
> {code:java}
> 2020-12-11,16:26:30,518 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving one packet for block BP-XXX:blk_XXX: PacketHeader with packetLen=2559 header data: offsetInBlock: 1
> {code}
[jira] [Created] (HDFS-15733) Add seqno in log when BlockReceiver receive packet
Haibin Huang created HDFS-15733:
---
Summary: Add seqno in log when BlockReceiver receive packet
Key: HDFS-15733
URL: https://issues.apache.org/jira/browse/HDFS-15733
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Haibin Huang
Assignee: Haibin Huang

There is a debug log printed when BlockReceiver receives a new packet; however, we can't tell which packet this debug log belongs to. I think it would be better to add a sequence number to the log.
Currently the debug log looks like this, missing the seqno of the packet:
{code:java}
2020-12-11,16:26:30,518 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving one packet for block BP-XXX:blk_XXX: PacketHeader with packetLen=2559 header data: offsetInBlock: 1
{code}
[jira] [Updated] (HDFS-15666) add average latency information to the SlowPeerReport
[ https://issues.apache.org/jira/browse/HDFS-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15666:
Attachment: HDFS-15666.001.patch

> add average latency information to the SlowPeerReport
> -
>
> Key: HDFS-15666
> URL: https://issues.apache.org/jira/browse/HDFS-15666
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15666.001.patch
>
> In the namenode's JMX, there is a SlowDisksReport like this:
> {code:java}
> [{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
> {code}
> So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
> {code}
> I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers.
> After adding the average latency, the SlowPeersReport can look like this:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]
> {code}
[jira] [Updated] (HDFS-15666) add average latency information to the SlowPeerReport
[ https://issues.apache.org/jira/browse/HDFS-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15666:
Status: Patch Available (was: Open)

> add average latency information to the SlowPeerReport
> -
>
> Key: HDFS-15666
> URL: https://issues.apache.org/jira/browse/HDFS-15666
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15666.001.patch
>
> In the namenode's JMX, there is a SlowDisksReport like this:
> {code:java}
> [{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
> {code}
> So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
> {code}
> I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers.
> After adding the average latency, the SlowPeersReport can look like this:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]
> {code}
[jira] [Updated] (HDFS-15666) add average latency information to the SlowPeerReport
[ https://issues.apache.org/jira/browse/HDFS-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15666:
Description:
In the namenode's JMX, there is a SlowDisksReport like this:
{code:java}
[{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
{code}
So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
{code:java}
[{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
{code}
I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers. After adding the average latency, the SlowPeersReport can look like this:
{code:java}
[{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]{code}

was: In the namenode's JMX, there is a SlowDisksReport like this:
{code:java}
[{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
{code}
So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
{code:java}
[{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
{code}
I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers. After adding the average latency, the SlowPeersReport can look like this:
{code:java}
[{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]{code}

> add average latency information to the SlowPeerReport
> -
>
> Key: HDFS-15666
> URL: https://issues.apache.org/jira/browse/HDFS-15666
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
>
> In the namenode's JMX, there is a SlowDisksReport like this:
> {code:java}
> [{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
> {code}
> So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
> {code}
> I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers.
> After adding the average latency, the SlowPeersReport can look like this:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]{code}
[jira] [Updated] (HDFS-15666) add average latency information to the SlowPeerReport
[ https://issues.apache.org/jira/browse/HDFS-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15666:
Description:
In the namenode's JMX, there is a SlowDisksReport like this:
{code:java}
[{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
{code}
So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
{code:java}
[{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
{code}
I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers. After adding the average latency, the SlowPeersReport can look like this:
{code:java}
[{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]{code}

was: In the namenode's JMX, there is a SlowDisksReport like this:
{code:java}
[{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
{code}
So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
{code:java}
[{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
{code}
I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers

> add average latency information to the SlowPeerReport
> -
>
> Key: HDFS-15666
> URL: https://issues.apache.org/jira/browse/HDFS-15666
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
>
> In the namenode's JMX, there is a SlowDisksReport like this:
> {code:java}
> [{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
> {code}
> So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
> {code}
> I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers.
> After adding the average latency, the SlowPeersReport can look like this:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]{code}
[jira] [Updated] (HDFS-15666) add average latency information to the SlowPeerReport
[ https://issues.apache.org/jira/browse/HDFS-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15666:
Description:
In the namenode's JMX, there is a SlowDisksReport like this:
{code:java}
[{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
{code}
So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
{code:java}
[{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
{code}
I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers

was: In the namenode's JMX, there is a SlowDisksReport like this:
{code:java}
[{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
{code}
So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:

> add average latency information to the SlowPeerReport
> -
>
> Key: HDFS-15666
> URL: https://issues.apache.org/jira/browse/HDFS-15666
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
>
> In the namenode's JMX, there is a SlowDisksReport like this:
> {code:java}
> [{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
> {code}
> So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
> {code}
> I think we should add the average latency to the report, which can be obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers
[jira] [Created] (HDFS-15666) add average latency information to the SlowPeerReport
Haibin Huang created HDFS-15666:
---
Summary: add average latency information to the SlowPeerReport
Key: HDFS-15666
URL: https://issues.apache.org/jira/browse/HDFS-15666
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Reporter: Haibin Huang
Assignee: Haibin Huang

In the namenode's JMX, there is a SlowDisksReport like this:
{code:java}
[{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
{code}
So we can know the disk I/O latency from this report. However, the SlowPeersReport doesn't have average latency:
[jira] [Updated] (HDFS-15629) Add seqno when warning slow mirror/disk in BlockReceiver
[ https://issues.apache.org/jira/browse/HDFS-15629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15629:
Assignee: Haibin Huang
Status: Patch Available (was: Open)

> Add seqno when warning slow mirror/disk in BlockReceiver
>
> Key: HDFS-15629
> URL: https://issues.apache.org/jira/browse/HDFS-15629
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-15629-001.patch
>
> When a client writes slowly, it will print a slow log from DataStreamer:
> {code:java}
> if (ack.getSeqno() != DFSPacket.HEART_BEAT_SEQNO) {
>   Long begin = packetSendTime.get(ack.getSeqno());
>   if (begin != null) {
>     long duration = Time.monotonicNow() - begin;
>     if (duration > dfsclientSlowLogThresholdMs) {
>       LOG.info("Slow ReadProcessor read fields for block " + block
>           + " took " + duration + "ms (threshold="
>           + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
>           + ", targets: " + Arrays.asList(targets));
>     }
>   }
> }
> {code}
> Here is an example:
> Slow ReadProcessor read fields for block BP-XXX:blk_XXX took 2756ms (threshold=100ms); ack: seqno: 3341 status: SUCCESS status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 2751531959 4: "\000\000\000", targets: [XXX, XXX, XXX]
> There is an ack seqno in the log, so we can find which packet caused the slow write. However, the datanode doesn't print the seqno in its slow logs, so we can't know in which stage this packet was slow.
> HDFS-11603 and HDFS-12814 added some slow warnings in BlockReceiver; I think we should add the seqno to these slow warnings, in order to find in which stage the corresponding packet was slow.
[jira] [Updated] (HDFS-15629) Add seqno when warning slow mirror/disk in BlockReceiver
[ https://issues.apache.org/jira/browse/HDFS-15629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15629:
Attachment: HDFS-15629-001.patch

> Add seqno when warning slow mirror/disk in BlockReceiver
>
> Key: HDFS-15629
> URL: https://issues.apache.org/jira/browse/HDFS-15629
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haibin Huang
> Priority: Major
> Attachments: HDFS-15629-001.patch
>
> When a client writes slowly, it will print a slow log from DataStreamer:
> {code:java}
> if (ack.getSeqno() != DFSPacket.HEART_BEAT_SEQNO) {
>   Long begin = packetSendTime.get(ack.getSeqno());
>   if (begin != null) {
>     long duration = Time.monotonicNow() - begin;
>     if (duration > dfsclientSlowLogThresholdMs) {
>       LOG.info("Slow ReadProcessor read fields for block " + block
>           + " took " + duration + "ms (threshold="
>           + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
>           + ", targets: " + Arrays.asList(targets));
>     }
>   }
> }
> {code}
> Here is an example:
> Slow ReadProcessor read fields for block BP-XXX:blk_XXX took 2756ms (threshold=100ms); ack: seqno: 3341 status: SUCCESS status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 2751531959 4: "\000\000\000", targets: [XXX, XXX, XXX]
> There is an ack seqno in the log, so we can find which packet caused the slow write. However, the datanode doesn't print the seqno in its slow logs, so we can't know in which stage this packet was slow.
> HDFS-11603 and HDFS-12814 added some slow warnings in BlockReceiver; I think we should add the seqno to these slow warnings, in order to find in which stage the corresponding packet was slow.
[jira] [Updated] (HDFS-15629) Add seqno when warning slow mirror/disk in BlockReceiver
[ https://issues.apache.org/jira/browse/HDFS-15629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15629:
Description:
When a client writes slowly, it will print a slow log from DataStreamer:
{code:java}
if (ack.getSeqno() != DFSPacket.HEART_BEAT_SEQNO) {
  Long begin = packetSendTime.get(ack.getSeqno());
  if (begin != null) {
    long duration = Time.monotonicNow() - begin;
    if (duration > dfsclientSlowLogThresholdMs) {
      LOG.info("Slow ReadProcessor read fields for block " + block
          + " took " + duration + "ms (threshold="
          + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
          + ", targets: " + Arrays.asList(targets));
    }
  }
}
{code}
Here is an example:
Slow ReadProcessor read fields for block BP-XXX:blk_XXX took 2756ms (threshold=100ms); ack: seqno: 3341 status: SUCCESS status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 2751531959 4: "\000\000\000", targets: [XXX, XXX, XXX]
There is an ack seqno in the log, so we can find which packet caused the slow write. However, the datanode doesn't print the seqno in its slow logs, so we can't know in which stage this packet was slow. HDFS-11603 and HDFS-12814 added some slow warnings in BlockReceiver; I think we should add the seqno to these slow warnings, in order to find in which stage the corresponding packet was slow.

was: When a client writes slowly, it will print a slow log from DataStreamer:
{code:java}
if (ack.getSeqno() != DFSPacket.HEART_BEAT_SEQNO) {
  Long begin = packetSendTime.get(ack.getSeqno());
  if (begin != null) {
    long duration = Time.monotonicNow() - begin;
    if (duration > dfsclientSlowLogThresholdMs) {
      LOG.info("Slow ReadProcessor read fields for block " + block
          + " took " + duration + "ms (threshold="
          + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
          + ", targets: " + Arrays.asList(targets));
    }
  }
}
{code}
Here is an example:
Slow ReadProcessor read fields for block BP-XXX:blk_XXX took 2756ms (threshold=100ms); ack: seqno: 3341 status: SUCCESS status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 2751531959 4: "\000\000\000", targets: [XXX, XXX, XXX][XXX, XXX, XXX]
There is an ack seqno in the log, so we can find which packet caused the slow write. However, the datanode doesn't print the seqno in its slow logs, so we can't know in which stage this packet was slow. HDFS-11603 and HDFS-12814 added some slow warnings in BlockReceiver; I think we should add the seqno to these slow warnings, in order to find in which stage the corresponding packet was slow.

> Add seqno when warning slow mirror/disk in BlockReceiver
>
> Key: HDFS-15629
> URL: https://issues.apache.org/jira/browse/HDFS-15629
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haibin Huang
> Priority: Major
>
> When a client writes slowly, it will print a slow log from DataStreamer:
> {code:java}
> if (ack.getSeqno() != DFSPacket.HEART_BEAT_SEQNO) {
>   Long begin = packetSendTime.get(ack.getSeqno());
>   if (begin != null) {
>     long duration = Time.monotonicNow() - begin;
>     if (duration > dfsclientSlowLogThresholdMs) {
>       LOG.info("Slow ReadProcessor read fields for block " + block
>           + " took " + duration + "ms (threshold="
>           + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
>           + ", targets: " + Arrays.asList(targets));
>     }
>   }
> }
> {code}
> Here is an example:
> Slow ReadProcessor read fields for block BP-XXX:blk_XXX took 2756ms (threshold=100ms); ack: seqno: 3341 status: SUCCESS status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 2751531959 4: "\000\000\000", targets: [XXX, XXX, XXX]
> There is an ack seqno in the log, so we can find which packet caused the slow write. However, the datanode doesn't print the seqno in its slow logs, so we can't know in which stage this packet was slow.
> HDFS-11603 and HDFS-12814 added some slow warnings in BlockReceiver; I think we should add the seqno to these slow warnings, in order to find in which stage the corresponding packet was slow.
[jira] [Created] (HDFS-15629) Add seqno when warning slow mirror/disk in BlockReceiver
Haibin Huang created HDFS-15629:
---
Summary: Add seqno when warning slow mirror/disk in BlockReceiver
Key: HDFS-15629
URL: https://issues.apache.org/jira/browse/HDFS-15629
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Haibin Huang

When a client writes slowly, it will print a slow log from DataStreamer:
{code:java}
if (ack.getSeqno() != DFSPacket.HEART_BEAT_SEQNO) {
  Long begin = packetSendTime.get(ack.getSeqno());
  if (begin != null) {
    long duration = Time.monotonicNow() - begin;
    if (duration > dfsclientSlowLogThresholdMs) {
      LOG.info("Slow ReadProcessor read fields for block " + block
          + " took " + duration + "ms (threshold="
          + dfsclientSlowLogThresholdMs + "ms); ack: " + ack
          + ", targets: " + Arrays.asList(targets));
    }
  }
}
{code}
Here is an example:
Slow ReadProcessor read fields for block BP-XXX:blk_XXX took 2756ms (threshold=100ms); ack: seqno: 3341 status: SUCCESS status: SUCCESS status: SUCCESS downstreamAckTimeNanos: 2751531959 4: "\000\000\000", targets: [XXX, XXX, XXX][XXX, XXX, XXX]
There is an ack seqno in the log, so we can find which packet caused the slow write. However, the datanode doesn't print the seqno in its slow logs, so we can't know in which stage this packet was slow. HDFS-11603 and HDFS-12814 added some slow warnings in BlockReceiver; I think we should add the seqno to these slow warnings, in order to find in which stage the corresponding packet was slow.
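To make the proposal concrete, here is a hedged sketch of what a BlockReceiver-style slow-mirror warning carrying the seqno could look like. The class and method names are hypothetical and the message format is illustrative, not the committed patch.
{code:java}
import java.util.Arrays;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Illustrative sketch of HDFS-15629: include the packet seqno in slow warnings. */
public class SlowMirrorWarnSketch {
  private static final Logger LOG = LoggerFactory.getLogger(SlowMirrorWarnSketch.class);

  static void warnIfSlow(long durationMs, long thresholdMs, long seqno,
      String[] downstreams) {
    if (durationMs > thresholdMs) {
      // With the seqno present, this warning can be correlated with the client's
      // "Slow ReadProcessor ... ack: seqno: N" line to locate the slow stage.
      LOG.warn("Slow BlockReceiver write packet to mirror took {}ms (threshold={}ms), "
          + "seqno: {}, downstream DNs: {}",
          durationMs, thresholdMs, seqno, Arrays.toString(downstreams));
    }
  }
}
{code}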
[jira] [Resolved] (HDFS-15257) Fix spelling mistake in DataXceiverServer
[ https://issues.apache.org/jira/browse/HDFS-15257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang resolved HDFS-15257.
-
Resolution: Fixed

> Fix spelling mistake in DataXceiverServer
> -
>
> Key: HDFS-15257
> URL: https://issues.apache.org/jira/browse/HDFS-15257
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15257-001.patch
>
> There is a spelling mistake in DataXceiverServer; this patch fixes it.
[jira] [Updated] (HDFS-15257) Fix spelling mistake in DataXceiverServer
[ https://issues.apache.org/jira/browse/HDFS-15257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15257:
Status: Open (was: Patch Available)

> Fix spelling mistake in DataXceiverServer
> -
>
> Key: HDFS-15257
> URL: https://issues.apache.org/jira/browse/HDFS-15257
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15257-001.patch
>
> There is a spelling mistake in DataXceiverServer; this patch fixes it.
[jira] [Updated] (HDFS-15257) Fix spelling mistake in DataXceiverServer
[ https://issues.apache.org/jira/browse/HDFS-15257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15257:
Attachment: HDFS-15257-001.patch

> Fix spelling mistake in DataXceiverServer
> -
>
> Key: HDFS-15257
> URL: https://issues.apache.org/jira/browse/HDFS-15257
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15257-001.patch
>
> There is a spelling mistake in DataXceiverServer; this patch fixes it.
[jira] [Updated] (HDFS-15257) Fix spelling mistake in DataXceiverServer
[ https://issues.apache.org/jira/browse/HDFS-15257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15257:
Status: Patch Available (was: Open)

> Fix spelling mistake in DataXceiverServer
> -
>
> Key: HDFS-15257
> URL: https://issues.apache.org/jira/browse/HDFS-15257
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Minor
> Attachments: HDFS-15257-001.patch
>
> There is a spelling mistake in DataXceiverServer; this patch fixes it.
[jira] [Created] (HDFS-15257) Fix spelling mistake in DataXceiverServer
Haibin Huang created HDFS-15257:
---
Summary: Fix spelling mistake in DataXceiverServer
Key: HDFS-15257
URL: https://issues.apache.org/jira/browse/HDFS-15257
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Haibin Huang
Assignee: Haibin Huang

There is a spelling mistake in DataXceiverServer; this patch fixes it.
[jira] [Updated] (HDFS-14783) Expired SampleStat should ignore when generating SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:
Description:
The SlowPeersReport is generated from the SampleStat between two DNs, and it is presented in the NN's JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque<SumAndCount>; the deque is then used to generate the SlowPeersReport, and old members of the deque are not removed until the queue is full. However, if dn1 doesn't send any packets to dn2 during the last 36*300_000 ms, the deque will be filled with old members, because the numbers in the last SampleStat never change. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.

was: The SlowPeersReport is generated from the SampleStat between two DNs, and it is presented in the NN's JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque<SumAndCount>; the deque is then used to generate the SlowPeersReport, and old members of the deque are not removed until the queue is full. However, if dn1 doesn't send any packets to dn2 during the last 36*300_000 ms, the deque will be filled with old members, because the numbers in the last SampleStat never change. I think this old SampleStat should be considered expired and ignored when the SampleStat is stored in a LinkedBlockingDeque; it won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue, and will be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.

> Expired SampleStat should ignore when generating SlowPeersReport
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
> The SlowPeersReport is generated from the SampleStat between two DNs, and it is presented in the NN's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque<SumAndCount>; the deque is then used to generate the SlowPeersReport, and old members of the deque are not removed until the queue is full. However, if dn1 doesn't send any packets to dn2 during the last 36*300_000 ms, the deque will be filled with old members, because the numbers in the last SampleStat never change. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
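A minimal sketch of the expiry idea, with hypothetical names (TimedSumAndCount, averageIgnoringExpired): each rolled-over sample carries the time it was recorded, and entries older than the validity window are skipped when averaging. The timestamp field is the assumed addition; this is not the actual MutableRollingAverages code.
{code:java}
import java.util.concurrent.LinkedBlockingDeque;

/** Hypothetical sketch of HDFS-14783: skip stale SumAndCount entries when averaging. */
public class RollingAverageSketch {
  /** A rolled-over sample plus the time it was recorded (the assumed addition). */
  static class TimedSumAndCount {
    final double sum;
    final long count;
    final long timestampMs;
    TimedSumAndCount(double sum, long count, long timestampMs) {
      this.sum = sum; this.count = count; this.timestampMs = timestampMs;
    }
  }

  // 36 slots, mirroring the 36*300_000 ms window mentioned in the description.
  private final LinkedBlockingDeque<TimedSumAndCount> deque = new LinkedBlockingDeque<>(36);

  void rollOver(double sum, long count, long nowMs) {
    if (deque.remainingCapacity() == 0) {
      deque.pollFirst();                 // evict the oldest once the window is full
    }
    deque.offerLast(new TimedSumAndCount(sum, count, nowMs));
  }

  /** Average over the deque, ignoring entries older than validityMs; NaN if none are fresh. */
  double averageIgnoringExpired(long nowMs, long validityMs) {
    double sum = 0;
    long count = 0;
    for (TimedSumAndCount sc : deque) {
      if (nowMs - sc.timestampMs > validityMs) {
        continue;                        // expired: the reporter sent no packets recently
      }
      sum += sc.sum;
      count += sc.count;
    }
    return count == 0 ? Double.NaN : sum / count;
  }
}
{code}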
[jira] [Commented] (HDFS-14783) Expired SampleStat should ignore when generating SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17070685#comment-17070685 ] Haibin Huang commented on HDFS-14783: - Thanks [~elgoiri], I have updated the title and description; if necessary I will move this Jira to a new one based on Hadoop Common.

> Expired SampleStat should ignore when generating SlowPeersReport
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
> The SlowPeersReport is generated from the SampleStat between two DNs, and it is presented in the NN's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque<SumAndCount>; the deque is then used to generate the SlowPeersReport, and old members of the deque are not removed until the queue is full. However, if dn1 doesn't send any packets to dn2 during the last 36*300_000 ms, the deque will be filled with old members, because the numbers in the last SampleStat never change. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Updated] (HDFS-14783) Expired SampleStat should ignore when generating SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:
Description:
The SlowPeersReport is generated from the SampleStat between two DNs, and it is presented in the NN's JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque<SumAndCount>; the deque is then used to generate the SlowPeersReport, and old members of the deque are not removed until the queue is full. However, if dn1 doesn't send any packets to dn2 during the last 36*300_000 ms, the deque will be filled with old members, because the numbers in the last SampleStat never change. I think this old SampleStat should be considered expired and ignored when the SampleStat is stored in a LinkedBlockingDeque; it won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue, and will be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.

was: The SlowPeersReport is generated from the SampleStat between two DNs, and it is presented in the NN's JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque<SumAndCount>; the deque is then used to generate the SlowPeersReport, and old members of the deque are not removed until the queue is full. However, if dn1 doesn't send any packets to dn2 during the last 36*300_000 ms, the deque will be filled with old members, because the SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue, and will be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.

> Expired SampleStat should ignore when generating SlowPeersReport
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
> The SlowPeersReport is generated from the SampleStat between two DNs, and it is presented in the NN's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque<SumAndCount>; the deque is then used to generate the SlowPeersReport, and old members of the deque are not removed until the queue is full. However, if dn1 doesn't send any packets to dn2 during the last 36*300_000 ms, the deque will be filled with old members, because the numbers in the last SampleStat never change. I think this old SampleStat should be considered expired and ignored when the SampleStat is stored in a LinkedBlockingDeque; it won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue, and will be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Updated] (HDFS-14783) Expired SampleStat should ignore when generating SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783: Description: SlowPeersReport is generated by the SampleStat between tow dn, so it can present on nn's jmx like this: {code:java} "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}] {code} In each period, MutableRollingAverages will do a rollOverAvgs(), it will generate a SumAndCount object which is based on SampleStat, and store it in a LinkedBlockingDeque, the deque will be used to generate SlowPeersReport. And the old member of deque won't be removed until the queue is full. However, if dn1 don't send any packet to dn2 in the last of 36*300_000 ms, the deque will be filled with an old member, because the SampleStat is stored in a LinkedBlockingDeque, it won't be removed until the queue is full and a newest one is generated. Therefore, if dn1 don't send any packet to dn2 for a long time, the old SampleStat will keep staying in the queue, and will be used to calculated slowpeer.I think these old SampleStats should be considered as expired message and ignore them when generating a new SlowPeersReport. was: SlowPeersReport is generated by the SampleStat between tow dn, so it can present on nn's jmx like this: {code:java} "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}] {code} In each period, MutableRollingAverages will do a rollOverAvgs(), it will generate a SumAndCount object which is based on SampleStat, and store it in a LinkedBlockingDeque, the deque will be used to generate SlowPeersReport. And the member of deque won't be removed until the queue is full. However, if dn1 don't send any packet to dn2 in the last of the SampleStat is stored in a LinkedBlockingDeque, it won't be removed until the queue is full and a newest one is generated. Therefore, if dn1 don't send any packet to dn2 for a long time, the old SampleStat will keep staying in the queue, and will be used to calculated slowpeer.I think these old SampleStats should be considered as expired message and ignore them when generating a new SlowPeersReport. > Expired SampleStat should ignore when generating SlowPeersReport > > > Key: HDFS-14783 > URL: https://issues.apache.org/jira/browse/HDFS-14783 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Major > Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, > HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch > > > SlowPeersReport is generated by the SampleStat between tow dn, so it can > present on nn's jmx like this: > {code:java} > "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}] > {code} > In each period, MutableRollingAverages will do a rollOverAvgs(), it will > generate a SumAndCount object which is based on SampleStat, and store it in a > LinkedBlockingDeque, the deque will be used to generate > SlowPeersReport. And the old member of deque won't be removed until the queue > is full. However, if dn1 don't send any packet to dn2 in the last of > 36*300_000 ms, the deque will be filled with an old member, because > the SampleStat is stored in a LinkedBlockingDeque, it won't be > removed until the queue is full and a newest one is generated. 
> Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
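For illustration, the deque behavior described above can be shown with a minimal, self-contained sketch (plain Java, not Hadoop code; the capacity of 3 and the string entries are invented for demonstration):
{code:java}
import java.util.concurrent.LinkedBlockingDeque;

public class DequeRollSketch {
  public static void main(String[] args) {
    // Bounded deque, like the per-peer window MutableRollingAverages keeps.
    LinkedBlockingDeque<String> deque = new LinkedBlockingDeque<>(3);
    for (String stat : new String[] {"s1", "s2", "s3", "s4"}) {
      // Append the newest stat; once the deque is full, evict the oldest.
      if (!deque.offerLast(stat)) {
        deque.pollFirst();
        deque.offerLast(stat);
      }
    }
    // Prints [s2, s3, s4]: entries only leave when newer ones arrive,
    // which is why a stale SampleStat can linger when no packets flow.
    System.out.println(deque);
  }
}
{code}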
[jira] [Updated] (HDFS-14783) Expired SampleStat should ignore when generating SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:

Description:
SlowPeersReport is generated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque, and the deque is then used to generate the SlowPeersReport. A member of the deque won't be removed until the queue is full. However, if dn1 doesn't send any packets to dn2, the SampleStat stored in the LinkedBlockingDeque won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.

was:
SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque. The SampleStat stored in the LinkedBlockingDeque won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.

> Expired SampleStat should ignore when generating SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
> SlowPeersReport is generated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque, and the deque is then used to generate the SlowPeersReport. A member of the deque won't be removed until the queue is full. However, if dn1 doesn't send any packets to dn2, the SampleStat stored in the LinkedBlockingDeque won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Updated] (HDFS-14783) Expired SampleStat should ignore when generating SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:

Description:
SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque. The SampleStat stored in the LinkedBlockingDeque won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.

was:
SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.

> Expired SampleStat should ignore when generating SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> In each period, MutableRollingAverages does a rollOverAvgs(): it generates a SumAndCount object based on the SampleStat and stores it in a LinkedBlockingDeque. The SampleStat stored in the LinkedBlockingDeque won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Updated] (HDFS-14783) Expired SampleStat should ignore when generating SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:

Summary: Expired SampleStat should ignore when generating SlowPeersReport (was: Expired SampleStat needs to be removed from SlowPeersReport)

> Expired SampleStat should ignore when generating SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Commented] (HDFS-14783) Expired SampleStat needs to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066638#comment-17066638 ] Haibin Huang commented on HDFS-14783: - Thanks [~elgoiri]. Can this patch be committed to trunk, or does it need another reviewer?

> Expired SampleStat needs to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Commented] (HDFS-14783) Expired SampleStat needs to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064491#comment-17064491 ] Haibin Huang commented on HDFS-14783: - [~elgoiri], I'm sorry, the newest patch is [^HDFS-14783-005.patch], and I don't know why it doesn't trigger Hadoop QA any more.

> Expired SampleStat needs to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Commented] (HDFS-14783) Expired SampleStat needs to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064157#comment-17064157 ] Haibin Huang commented on HDFS-14783: - [~elgoiri] thanks for the suggestion, I have updated the patch; please take a look.

> Expired SampleStat needs to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Updated] (HDFS-14783) Expired SampleStat needs to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:

Attachment: HDFS-14783-005.patch

> Expired SampleStat needs to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Comment Edited] (HDFS-14783) Expired SampleStat needs to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063802#comment-17063802 ] Haibin Huang edited comment on HDFS-14783 at 3/21/20, 7:28 AM:
---
Thanks [~elgoiri] for the suggestion. I think changing the behavior of SampleStat is not good either, so I removed the timestamp and used another way to judge an expired SampleStat in DataNodePeerMetrics. When dn1 doesn't send any packets to dn2 for a long time, the SampleStat of DataNodePeerMetrics won't change, so the same metrics info will be generated every time org.apache.hadoop.metrics2.lib.MutableRollingAverages#rollOverAvgs() runs:
{code:java}
final SumAndCount sumAndCount = new SumAndCount(
    rate.lastStat().total(),
    rate.lastStat().numSamples());
/* put newest sum and count to the end */
if (!deque.offerLast(sumAndCount)) {
  deque.pollFirst();
  deque.offerLast(sumAndCount);
}
{code}
This fills the deque with identical SumAndCount objects, so we only need to check whether all members of the deque are the same to see that the SampleStat hasn't changed in the last 36*300_000 ms. I think we can use this approach to judge an expired SampleStat in DataNodePeerMetrics.

was (Author: huanghaibin):
Thanks [~elgoiri] for the suggestion. I think changing the behavior of SampleStat is not good either, so I removed the timestamp and used another way to judge an expired SampleStat in DataNodePeerMetrics. When dn1 doesn't send any packets to dn2 for a long time, the SampleStat of DataNodePeerMetrics won't change, so the same metrics info will be generated every time org.apache.hadoop.metrics2.lib.MutableRollingAverages#rollOverAvgs() runs:
{code:java}
final SumAndCount sumAndCount = new SumAndCount(
    rate.lastStat().total(),
    rate.lastStat().numSamples());
/* put newest sum and count to the end */
if (!deque.offerLast(sumAndCount)) {
  deque.pollFirst();
  deque.offerLast(sumAndCount);
}
{code}
This fills the deque with identical SumAndCount objects, so we only need to check whether all members of the deque are the same to see whether the SampleStat has changed in the last 36*300_000 ms. I think we can use this approach to judge an expired SampleStat in DataNodePeerMetrics.

> Expired SampleStat needs to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
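The identical-members check described in the comment above could be expressed along the lines of the following sketch; the helper name and generic signature are invented for illustration and are not the actual patch:
{code:java}
import java.util.Deque;
import java.util.Objects;

public final class ExpiryCheckSketch {
  /**
   * True when the rolling window is full and every member is identical,
   * i.e. the underlying SampleStat has not changed for the whole window
   * (36 periods of 300_000 ms in the scenario discussed above).
   */
  static <T> boolean isExpired(Deque<T> deque, int windowSize) {
    if (deque.size() < windowSize) {
      return false; // window still warming up, nothing to expire
    }
    T first = deque.peekFirst();
    for (T entry : deque) {
      if (!Objects.equals(entry, first)) {
        return false; // at least one fresh sample in the window
      }
    }
    return true; // all members equal: treat as expired, skip in the report
  }
}
{code}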
[jira] [Commented] (HDFS-14783) Expired SampleStat needs to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063802#comment-17063802 ] Haibin Huang commented on HDFS-14783: - Thanks [~elgoiri] for the suggestion. I think changing the behavior of SampleStat is not good either, so I removed the timestamp and used another way to judge an expired SampleStat in DataNodePeerMetrics. When dn1 doesn't send any packets to dn2 for a long time, the SampleStat of DataNodePeerMetrics won't change, so the same metrics info will be generated every time org.apache.hadoop.metrics2.lib.MutableRollingAverages#rollOverAvgs() runs:
{code:java}
final SumAndCount sumAndCount = new SumAndCount(
    rate.lastStat().total(),
    rate.lastStat().numSamples());
/* put newest sum and count to the end */
if (!deque.offerLast(sumAndCount)) {
  deque.pollFirst();
  deque.offerLast(sumAndCount);
}
{code}
This fills the deque with identical SumAndCount objects, so we only need to check whether all members of the deque are the same to see whether the SampleStat has changed in the last 36*300_000 ms. I think we can use this approach to judge an expired SampleStat in DataNodePeerMetrics.

> Expired SampleStat needs to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Updated] (HDFS-14783) Expired SampleStat needs to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:

Attachment: HDFS-14783-004.patch

> Expired SampleStat needs to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch, HDFS-14783-004.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Commented] (HDFS-14783) expired SampleStat need to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060902#comment-17060902 ] Haibin Huang commented on HDFS-14783: - Updated the patch: fixed a bug and checkstyle issues.

> expired SampleStat need to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Updated] (HDFS-14783) expired SampleStat need to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:

Attachment: HDFS-14783-003.patch

> expired SampleStat need to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, HDFS-14783-003.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Commented] (HDFS-14783) expired SampleStat need to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060623#comment-17060623 ] Haibin Huang commented on HDFS-14783: - [~ayushtkn] [~elgoiri], could you take a look at this? Thanks.

> expired SampleStat need to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
[jira] [Updated] (HDFS-14783) expired SampleStat need to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:

Description:
SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.

was:
SlowPeersReport in the NameNode's JMX can tell us which DataNode is a slow node; it is calculated from the average duration of packet sends between two DataNodes. For example, if packets sent from dn1 to dn2 take too long on average (over the *upperLimitLatency*), you will see a SlowPeersReport in the NameNode's JMX like this:
{code:java}
"SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
{code}
However, if dn1 sends some packets to dn2 slowly at the beginning and then doesn't send any packets to dn2 for a long time, the abovementioned SlowPeersReport will keep staying on the NameNode's JMX. I think this SlowPeersReport may be an expired message, because the network between dn1 and dn2 may have returned to normal, but the SlowPeersReport stays on the NameNode's JMX until the next time dn1 sends a packet to dn2. So I use a timestamp to record when an *org.apache.hadoop.metrics2.util.SampleStat* is created, and calculate the average duration using only the valid *SampleStat*s, as judged by their timestamps.

> expired SampleStat need to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it can appear on the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and won't be removed until the queue is full and a newer one is generated. Therefore, if dn1 doesn't send any packets to dn2 for a long time, the old SampleStat will keep staying in the queue and will still be used to calculate slow peers. I think these old SampleStats should be considered expired and ignored when generating a new SlowPeersReport.
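For comparison, the timestamp idea from the earlier description amounts to something like the following sketch (field and method names invented for illustration; per the later comments, this approach was dropped in favor of the identical-members check):
{code:java}
public class TimestampedStatSketch {
  // Hypothetical creation timestamp recorded when the stat is built.
  private final long createTimeMs = System.currentTimeMillis();

  /** True once this stat is older than the given validity window. */
  boolean isExpired(long nowMs, long validWindowMs) {
    return nowMs - createTimeMs > validWindowMs;
  }
}
{code}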
[jira] [Updated] (HDFS-14783) expired SampleStat need to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:

Attachment: HDFS-14783-002.patch

> expired SampleStat need to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch
>
> SlowPeersReport in the NameNode's JMX can tell us which DataNode is a slow node; it is calculated from the average duration of packet sends between two DataNodes. For example, if packets sent from dn1 to dn2 take too long on average (over the *upperLimitLatency*), you will see a SlowPeersReport in the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> However, if dn1 sends some packets to dn2 slowly at the beginning and then doesn't send any packets to dn2 for a long time, the abovementioned SlowPeersReport will keep staying on the NameNode's JMX. I think this SlowPeersReport may be an expired message, because the network between dn1 and dn2 may have returned to normal, but the SlowPeersReport stays on the NameNode's JMX until the next time dn1 sends a packet to dn2. So I use a timestamp to record when an *org.apache.hadoop.metrics2.util.SampleStat* is created, and calculate the average duration using only the valid *SampleStat*s, as judged by their timestamps.
[jira] [Updated] (HDFS-14783) expired SampleStat need to be removed from SlowPeersReport
[ https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14783:

Summary: expired SampleStat need to be removed from SlowPeersReport (was: expired SlowPeersReport will keep staying on namenode's jmx)

> expired SampleStat need to be removed from SlowPeersReport
> --
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch
>
> SlowPeersReport in the NameNode's JMX can tell us which DataNode is a slow node; it is calculated from the average duration of packet sends between two DataNodes. For example, if packets sent from dn1 to dn2 take too long on average (over the *upperLimitLatency*), you will see a SlowPeersReport in the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> However, if dn1 sends some packets to dn2 slowly at the beginning and then doesn't send any packets to dn2 for a long time, the abovementioned SlowPeersReport will keep staying on the NameNode's JMX. I think this SlowPeersReport may be an expired message, because the network between dn1 and dn2 may have returned to normal, but the SlowPeersReport stays on the NameNode's JMX until the next time dn1 sends a packet to dn2. So I use a timestamp to record when an *org.apache.hadoop.metrics2.util.SampleStat* is created, and calculate the average duration using only the valid *SampleStat*s, as judged by their timestamps.
[jira] [Commented] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used
[ https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058374#comment-17058374 ] Haibin Huang commented on HDFS-15155: - [~ayushtkn], you are right, it can directly use
{code:java}
out.hflush();
{code}
I have updated the patch; can you take a look, please?

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch, HDFS-15155.003.patch, HDFS-15155.004.patch, HDFS-15155.005.patch
>
> There is some incorrect object usage in DataNodeVolumeMetrics: writeIoRate is never used, and syncIoRate should be replaced by writeIoRate in the following code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
>
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
>
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
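For clarity, the fix under review amounts to pointing the write-I/O getters at writeIoRate instead of syncIoRate, along the lines of this sketch (mirroring the snippet quoted in the description; not necessarily the exact patch):
{code:java}
// Based on writeIoRate (sketch of the corrected getters)
public long getWriteIoSampleCount() {
  return writeIoRate.lastStat().numSamples();
}

public double getWriteIoMean() {
  return writeIoRate.lastStat().mean();
}

public double getWriteIoStdDev() {
  return writeIoRate.lastStat().stddev();
}
{code}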
[jira] [Updated] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used
[ https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15155:

Attachment: HDFS-15155.005.patch

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch, HDFS-15155.003.patch, HDFS-15155.004.patch, HDFS-15155.005.patch
>
> There is some incorrect object usage in DataNodeVolumeMetrics: writeIoRate is never used, and syncIoRate should be replaced by writeIoRate in the following code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
>
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
>
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
[jira] [Commented] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used
[ https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058357#comment-17058357 ] Haibin Huang commented on HDFS-15155: - [~elgoiri], can this patch be committed to trunk?

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch, HDFS-15155.003.patch, HDFS-15155.004.patch
>
> There is some incorrect object usage in DataNodeVolumeMetrics: writeIoRate is never used, and syncIoRate should be replaced by writeIoRate in the following code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
>
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
>
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
[jira] [Commented] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat
[ https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056599#comment-17056599 ] Haibin Huang commented on HDFS-14612: - [~elgoiri] [~hanishakoneru], I have updated the test part; can you take a look, please?

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> --
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, HDFS-14612-006.patch, HDFS-14612-007.patch, HDFS-14612-008.patch, HDFS-14612.patch
>
> I found that SlowDiskReport won't update when slowDisks is always empty in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave an outdated SlowDiskReport staying on the NameNode's JMX until the next time slowDisks isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should be called first whenever we want to get the JMX information about the SlowDiskReport; this keeps the SlowDiskReport on JMX always valid.
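The proposed call order can be sketched as follows; *checkAndUpdateReportIfNecessary()* is named in the description, while the tracker field and JSON accessor names are assumptions for illustration only:
{code:java}
// Sketch only: refresh the slow-disk report before exposing it over JMX,
// so an outdated report cannot linger when heartbeats carry no slow disks.
public String getSlowDisksReport() {
  slowDiskTracker.checkAndUpdateReportIfNecessary();      // refresh first
  return slowDiskTracker.getSlowDiskReportAsJsonString(); // assumed accessor
}
{code}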
[jira] [Updated] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat
[ https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14612:

Attachment: HDFS-14612-008.patch

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> --
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, HDFS-14612-006.patch, HDFS-14612-007.patch, HDFS-14612-008.patch, HDFS-14612.patch
>
> I found that SlowDiskReport won't update when slowDisks is always empty in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave an outdated SlowDiskReport staying on the NameNode's JMX until the next time slowDisks isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should be called first whenever we want to get the JMX information about the SlowDiskReport; this keeps the SlowDiskReport on JMX always valid.
[jira] [Commented] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat
[ https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055005#comment-17055005 ] Haibin Huang commented on HDFS-14612: - [~elgoiri], how about patch v7? When the NN handles a new heartbeat from a DN, it triggers checkAndUpdateReportIfNecessary and checks whether it is time to update the slow disk report.

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> --
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, HDFS-14612-006.patch, HDFS-14612-007.patch, HDFS-14612.patch
>
> I found that SlowDiskReport won't update when slowDisks is always empty in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave an outdated SlowDiskReport staying on the NameNode's JMX until the next time slowDisks isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should be called first whenever we want to get the JMX information about the SlowDiskReport; this keeps the SlowDiskReport on JMX always valid.
[jira] [Updated] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat
[ https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14612:

Attachment: HDFS-14612-007.patch

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> --
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, HDFS-14612-006.patch, HDFS-14612-007.patch, HDFS-14612.patch
>
> I found that SlowDiskReport won't update when slowDisks is always empty in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave an outdated SlowDiskReport staying on the NameNode's JMX until the next time slowDisks isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should be called first whenever we want to get the JMX information about the SlowDiskReport; this keeps the SlowDiskReport on JMX always valid.
[jira] [Commented] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat
[ https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052683#comment-17052683 ] Haibin Huang commented on HDFS-14612: - [~ayushtkn] [~elgoiri], would you please review this patch?

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> --
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, HDFS-14612-006.patch, HDFS-14612.patch
>
> I found that SlowDiskReport won't update when slowDisks is always empty in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave an outdated SlowDiskReport staying on the NameNode's JMX until the next time slowDisks isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should be called first whenever we want to get the JMX information about the SlowDiskReport; this keeps the SlowDiskReport on JMX always valid.
[jira] [Updated] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat
[ https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14612:

Description:
I found that SlowDiskReport won't update when slowDisks is always empty in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave an outdated SlowDiskReport staying on the NameNode's JMX until the next time slowDisks isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should be called first whenever we want to get the JMX information about the SlowDiskReport; this keeps the SlowDiskReport on JMX always valid.

was:
I found that SlowDiskReport won't update when slowDisks is always empty in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave an outdated SlowDiskReport staying on the NameNode's JMX until the next time slowDisks isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should be called first whenever we want to get the JMX information about the SlowDiskReport; this keeps the SlowDiskReport on JMX always valid.

There is also some incorrect object referencing in org.apache.hadoop.hdfs.server.datanode.fsdataset.*DataNodeVolumeMetrics*:
{code:java}
// Based on writeIoRate
public long getWriteIoSampleCount() {
  return syncIoRate.lastStat().numSamples();
}

public double getWriteIoMean() {
  return syncIoRate.lastStat().mean();
}

public double getWriteIoStdDev() {
  return syncIoRate.lastStat().stddev();
}
{code}

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> --
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, HDFS-14612-006.patch, HDFS-14612.patch
>
> I found that SlowDiskReport won't update when slowDisks is always empty in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave an outdated SlowDiskReport staying on the NameNode's JMX until the next time slowDisks isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should be called first whenever we want to get the JMX information about the SlowDiskReport; this keeps the SlowDiskReport on JMX always valid.
[jira] [Updated] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat
[ https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-14612:

Attachment: HDFS-14612-006.patch

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> --
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, HDFS-14612-006.patch, HDFS-14612.patch
>
> I found that SlowDiskReport won't update when slowDisks is always empty in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave an outdated SlowDiskReport staying on the NameNode's JMX until the next time slowDisks isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should be called first whenever we want to get the JMX information about the SlowDiskReport; this keeps the SlowDiskReport on JMX always valid.
>
> There is also some incorrect object referencing in org.apache.hadoop.hdfs.server.datanode.fsdataset.*DataNodeVolumeMetrics*:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
>
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
>
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
[jira] [Updated] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used
[ https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15155:

Attachment: HDFS-15155.004.patch

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch, HDFS-15155.003.patch, HDFS-15155.004.patch
>
> There is some incorrect object usage in DataNodeVolumeMetrics: writeIoRate is never used, and syncIoRate should be replaced by writeIoRate in the following code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
>
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
>
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
[jira] [Commented] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used
[ https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17050193#comment-17050193 ] Haibin Huang commented on HDFS-15155: - [~elgoiri] I have updated the patch. Please review it.

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch, HDFS-15155.003.patch
>
> There is some incorrect object usage in DataNodeVolumeMetrics: writeIoRate is never used, and syncIoRate should be replaced by writeIoRate in the following code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
>
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
>
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
[jira] [Updated] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used
[ https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibin Huang updated HDFS-15155:

Attachment: HDFS-15155.003.patch

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch, HDFS-15155.003.patch
>
> There is some incorrect object usage in DataNodeVolumeMetrics: writeIoRate is never used, and syncIoRate should be replaced by writeIoRate in the following code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
>
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
>
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
[jira] [Commented] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used
[ https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17047182#comment-17047182 ] Haibin Huang commented on HDFS-15155: - OK, I will update the patch soon.

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch
>
> There is some incorrect object usage in DataNodeVolumeMetrics: writeIoRate is never used, and syncIoRate should be replaced by writeIoRate in the following code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
>
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
>
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
[jira] [Comment Edited] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used
[ https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17047171#comment-17047171 ] Haibin Huang edited comment on HDFS-15155 at 2/28/20 3:05 AM:
--
[~elgoiri], thanks for reviewing this patch. I think TestDataNodeVolumeMetrics#testVolumeMetrics already checks the writeIo metrics, and we need to remove this line before building the MiniDFSCluster:
{code:java}
SimulatedFSDataset.setFactory(conf);
{code}
Then you will see the difference in TestDataNodeVolumeMetrics#verifyDataNodeVolumeMetrics after applying this patch; you can focus on these output lines:
{code:java}
LOG.info("writeIoSampleCount : " + metrics.getWriteIoSampleCount());
LOG.info("writeIoMean : " + metrics.getWriteIoMean());
LOG.info("writeIoStdDev : " + metrics.getWriteIoStdDev());
{code}
If more asserts are needed, I will update soon.

was (Author: huanghaibin):
I think TestDataNodeVolumeMetrics#testVolumeMetrics already checks the writeIo metrics, and we need to remove this line before building the MiniDFSCluster:
{code:java}
SimulatedFSDataset.setFactory(conf);
{code}
Then you will see the difference in TestDataNodeVolumeMetrics#verifyDataNodeVolumeMetrics after applying this patch; you can focus on these output lines:
{code:java}
LOG.info("writeIoSampleCount : " + metrics.getWriteIoSampleCount());
LOG.info("writeIoMean : " + metrics.getWriteIoMean());
LOG.info("writeIoStdDev : " + metrics.getWriteIoStdDev());
{code}
If more asserts are needed, I will update soon. Thanks for reviewing.

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Reporter: Haibin Huang
> Assignee: Haibin Huang
> Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch
>
> There is some incorrect object usage in DataNodeVolumeMetrics: writeIoRate is never used, and syncIoRate should be replaced by writeIoRate in the following code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
>
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
>
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
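If more asserts do get added, they might look like this sketch for TestDataNodeVolumeMetrics#verifyDataNodeVolumeMetrics (hypothetical asserts, assuming real writes to non-simulated volumes have occurred; not part of any posted patch):
{code:java}
// Hypothetical asserts: once writeIoRate is wired up correctly and the
// SimulatedFSDataset factory line is removed, write I/O samples should
// be recorded by the real volumes.
assertTrue("expected write I/O samples to be recorded",
    metrics.getWriteIoSampleCount() > 0);
assertTrue("mean write I/O latency should be non-negative",
    metrics.getWriteIoMean() >= 0.0);
{code}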