[jira] [Updated] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection

2020-12-21 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-15744:

Attachment: HDFS-15744-001.patch

> Use cumulative counting way to improve the accuracy of slow disk detection
> --
>
> Key: HDFS-15744
> URL: https://issues.apache.org/jira/browse/HDFS-15744
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15744-001.patch, image-2020-12-22-11-37-14-734.png, 
> image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png
>
>
> HDFS has supported datanode disk outlier detection since 
> [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461], and we can use 
> it to find slow disks via the SlowDiskReport 
> ([HDFS-11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, I 
> found that the slow disk information may not be accurate enough in practice, 
> because a large number of short-term writes can lead to miscalculation. Here 
> is an example: the disk is healthy, but when it encounters heavy writing for a 
> few minutes, its write I/O does get slow and it is flagged as a slow disk. The 
> disk is only slow for a few minutes, yet the SlowDiskReport keeps it until the 
> information becomes invalid. This scenario confuses us, since we want to use 
> the SlowDiskReport to detect genuinely bad disks.
> !image-2020-12-22-11-37-14-734.png!
> !image-2020-12-22-11-37-35-280.png!
> To improve the detection accuracy, we use a cumulative counting approach to 
> detect slow disks. If, within the reportValidityMs interval, a disk is flagged 
> as an outlier more than 50% of the time, then it should be a genuinely bad 
> disk.
> Here is an example: if reportValidityMs is one hour and the detection interval 
> is five minutes, there are 12 disk outlier detections per hour. If a disk is 
> flagged as an outlier more than 6 times, it should be a genuinely bad disk. We 
> use this approach to detect bad disks in our cluster, and it reaches over 90% 
> accuracy.
> !image-2020-12-22-11-46-48-817.png!
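As a concrete illustration of the counting scheme above, here is a minimal
sketch (an illustration only, not the attached patch; the class and member
names are invented for this example) of tracking outlier flags inside the
validity window:

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Sketch: a disk is reported as bad only if it was flagged as an outlier
 * in more than 50% of the detection rounds inside the reportValidityMs
 * window.
 */
class CumulativeSlowDiskTracker {
  private final long reportValidityMs;    // e.g. 60 * 60 * 1000 (one hour)
  private final long detectionIntervalMs; // e.g. 5 * 60 * 1000 (five minutes)
  // timestamps of the rounds in which this disk was flagged as an outlier
  private final Deque<Long> outlierTimes = new ArrayDeque<>();

  CumulativeSlowDiskTracker(long reportValidityMs, long detectionIntervalMs) {
    this.reportValidityMs = reportValidityMs;
    this.detectionIntervalMs = detectionIntervalMs;
  }

  /** Called once per detection round; returns true if the disk looks bad. */
  synchronized boolean recordAndCheck(boolean isOutlier, long nowMs) {
    if (isOutlier) {
      outlierTimes.addLast(nowMs);
    }
    // drop flags that have fallen out of the validity window
    while (!outlierTimes.isEmpty()
        && nowMs - outlierTimes.peekFirst() > reportValidityMs) {
      outlierTimes.removeFirst();
    }
    long roundsPerWindow = reportValidityMs / detectionIntervalMs;
    return outlierTimes.size() > roundsPerWindow / 2; // strictly more than 50%
  }
}
{code}

With a one-hour window and a five-minute interval, roundsPerWindow is 12, so a
disk must be flagged more than 6 times before it is reported, matching the
example in the description.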






[jira] [Updated] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection

2020-12-21 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-15744:

Status: Patch Available  (was: Open)

> Use cumulative counting way to improve the accuracy of slow disk detection
> --
>
> Key: HDFS-15744
> URL: https://issues.apache.org/jira/browse/HDFS-15744
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15744-001.patch, image-2020-12-22-11-37-14-734.png, 
> image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png
>
>
> HDFS has supported datanode disk outlier detection since 
> [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461], and we can use 
> it to find slow disks via the SlowDiskReport 
> ([HDFS-11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, I 
> found that the slow disk information may not be accurate enough in practice, 
> because a large number of short-term writes can lead to miscalculation. Here 
> is an example: the disk is healthy, but when it encounters heavy writing for a 
> few minutes, its write I/O does get slow and it is flagged as a slow disk. The 
> disk is only slow for a few minutes, yet the SlowDiskReport keeps it until the 
> information becomes invalid. This scenario confuses us, since we want to use 
> the SlowDiskReport to detect genuinely bad disks.
> !image-2020-12-22-11-37-14-734.png!
> !image-2020-12-22-11-37-35-280.png!
> To improve the detection accuracy, we use a cumulative counting approach to 
> detect slow disks. If, within the reportValidityMs interval, a disk is flagged 
> as an outlier more than 50% of the time, then it should be a genuinely bad 
> disk.
> Here is an example: if reportValidityMs is one hour and the detection interval 
> is five minutes, there are 12 disk outlier detections per hour. If a disk is 
> flagged as an outlier more than 6 times, it should be a genuinely bad disk. We 
> use this approach to detect bad disks in our cluster, and it reaches over 90% 
> accuracy.
> !image-2020-12-22-11-46-48-817.png!






[jira] [Created] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable

2020-12-22 Thread Haibin Huang (Jira)
Haibin Huang created HDFS-15745:
---

 Summary: Make DataNodePeerMetrics#LOW_THRESHOLD_MS and 
MIN_OUTLIER_DETECTION_NODES configurable
 Key: HDFS-15745
 URL: https://issues.apache.org/jira/browse/HDFS-15745
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haibin Huang
Assignee: Haibin Huang
 Attachments: image-2020-12-22-17-00-50-796.png

When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found 
many slow peers whose ReportingNodes' averageDelay was very low, and these slow 
peer nodes were actually normal. I think the reason so many slow peers are 
generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too 
small (only 5ms) and it is not configurable. The default slow IO warning log 
threshold is 300ms, i.e. 
DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so 
DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise 
the namenode will get a lot of invalid slow peer information.

!image-2020-12-22-17-00-50-796.png!
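A minimal sketch of what making the two constants configurable could look like
(the config key names and the MIN_OUTLIER_DETECTION_NODES default below are
assumptions for illustration, not necessarily what the patch introduces):

{code:java}
// Hypothetical keys; the LOW_THRESHOLD_MS default is aligned with the
// 300ms slow-IO warning threshold mentioned above.
public static final String DFS_DATANODE_PEER_METRICS_LOW_THRESHOLD_MS_KEY =
    "dfs.datanode.peer.metrics.low.threshold.ms";
public static final long DFS_DATANODE_PEER_METRICS_LOW_THRESHOLD_MS_DEFAULT =
    300;
public static final String DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY =
    "dfs.datanode.min.outlier.detection.nodes";
public static final long DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_DEFAULT = 10;

// In DataNodePeerMetrics, read the values from the Hadoop Configuration
// (conf) instead of hard-coding the constants:
long lowThresholdMs = conf.getLong(
    DFS_DATANODE_PEER_METRICS_LOW_THRESHOLD_MS_KEY,
    DFS_DATANODE_PEER_METRICS_LOW_THRESHOLD_MS_DEFAULT);
long minOutlierDetectionNodes = conf.getLong(
    DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_KEY,
    DFS_DATANODE_MIN_OUTLIER_DETECTION_NODES_DEFAULT);
{code}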






[jira] [Updated] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable

2020-12-22 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-15745:

Attachment: HDFS-15745-001.patch
Status: Patch Available  (was: Open)

> Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES 
> configurable
> --
>
> Key: HDFS-15745
> URL: https://issues.apache.org/jira/browse/HDFS-15745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15745-001.patch, image-2020-12-22-17-00-50-796.png
>
>
> When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found 
> many slow peers whose ReportingNodes' averageDelay was very low, and these 
> slow peer nodes were actually normal. I think the reason so many slow peers 
> are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too 
> small (only 5ms) and it is not configurable. The default slow IO warning log 
> threshold is 300ms, i.e. 
> DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so 
> DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise 
> the namenode will get a lot of invalid slow peer information.
> !image-2020-12-22-17-00-50-796.png!






[jira] [Updated] (HDFS-14789) namenode should avoid slow node when choose target in BlockPlacementPolicyDefault

2020-12-22 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14789:

Summary: namenode should avoid slow node when choose target in 
BlockPlacementPolicyDefault  (was: namenode should check slow node when 
assigning a node for writing block )

> namenode should avoid slow node when choose target in 
> BlockPlacementPolicyDefault
> -
>
> Key: HDFS-14789
> URL: https://issues.apache.org/jira/browse/HDFS-14789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14789
>
>
> With HDFS-11194 and HDFS-11551, we can find slow nodes through the namenode's 
> jmx. So I think the namenode should check these slow nodes when assigning a 
> node for writing a block. If the namenode chooses a node at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*,
>  we should check whether it belongs to the slow nodes, because choosing a slow 
> one to write data may take a long time, which can cause a client to write data 
> very slowly and even hit a socket timeout exception like this:
>  
> {code:java}
> 2019-08-19,17:16:41,181 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
> Exceptionjava.net.SocketTimeoutException: 495000 millis timeout while waiting 
> for channel to be ready for write. ch : 
> java.nio.channels.SocketChannel[connected local=/xxx:xxx remote=/xxx:xxx] at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) 
> at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) 
> at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) 
> at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at 
> java.io.DataOutputStream.write(DataOutputStream.java:107) at 
> org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:328)
>  at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:653){code}
>  
> I use *maxChosenCount* to keep the datanode-choosing task from taking too 
> long. It is calculated from the logarithm of the probability, and it also 
> guarantees that the probability of choosing a slow node to write a block is 
> less than 0.01%.
> Finally, I use an expire time so that the namenode does not choose these slow 
> nodes within a specified period, because the slow nodes may have returned to 
> normal after that period and can be used to write blocks again.
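To make the 0.01% bound concrete: assuming each *chooseRandom()* pick is
independent and a fraction p of candidate nodes is currently marked slow, the
chance that k consecutive picks all land on slow nodes is p^k, so taking
maxChosenCount = ceil(log(1e-4) / log(p)) attempts bounds the failure
probability at 0.01% (e.g. p = 0.1 gives k = 4). A sketch under those
assumptions (the method name follows the description; the signature is
invented):

{code:java}
// Bound on chooseRandom() attempts so that
// P(every attempt hits a slow node) = p^k stays below 1e-4 (0.01%).
static int maxChosenCount(double slowNodeFraction) {
  if (slowNodeFraction <= 0.0) {
    return 1;                 // no slow nodes: a single pick suffices
  }
  if (slowNodeFraction >= 1.0) {
    return Integer.MAX_VALUE; // every node is slow: no bound is possible
  }
  // p^k <= 1e-4  <=>  k >= log(1e-4) / log(p)  (both logs are negative)
  return (int) Math.ceil(Math.log(1e-4) / Math.log(slowNodeFraction));
}
{code}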






[jira] [Updated] (HDFS-14789) namenode should avoid slow node when choose target in BlockPlacementPolicyDefault

2020-12-22 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14789:

Description: 
With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
SlowDisksReport in jmx. I think the namenode can use this slow node information 
to avoid slow nodes while choosing targets in BlockPlacementPolicyDefault.

We can find slow nodes through the namenode's jmx, so I think the namenode 
should check these slow nodes when assigning a node for writing a block. If the 
namenode chooses a node at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*,
 we should check whether it belongs to the slow nodes, because choosing a slow 
one to write data may take a long time, which can cause a client to write data 
very slowly and even hit a socket timeout exception like this:

 
{code:java}
2019-08-19,17:16:41,181 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
Exceptionjava.net.SocketTimeoutException: 495000 millis timeout while waiting 
for channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/xxx:xxx remote=/xxx:xxx] at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at 
java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at 
java.io.DataOutputStream.write(DataOutputStream.java:107) at 
org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:328) 
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:653){code}
 

I use *maxChosenCount* to keep the datanode-choosing task from taking too long. 
It is calculated from the logarithm of the probability, and it also guarantees 
that the probability of choosing a slow node to write a block is less than 
0.01%.

Finally, I use an expire time so that the namenode does not choose these slow 
nodes within a specified period, because the slow nodes may have returned to 
normal after that period and can be used to write blocks again.

  was:
With HDFS-11194 and HDFS-11551, we can find slow nodes through the namenode's 
jmx. So I think the namenode should check these slow nodes when assigning a 
node for writing a block. If the namenode chooses a node at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*,
 we should check whether it belongs to the slow nodes, because choosing a slow 
one to write data may take a long time, which can cause a client to write data 
very slowly and even hit a socket timeout exception like this:

 
{code:java}
2019-08-19,17:16:41,181 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
Exceptionjava.net.SocketTimeoutException: 495000 millis timeout while waiting 
for channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/xxx:xxx remote=/xxx:xxx] at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at 
java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at 
java.io.DataOutputStream.write(DataOutputStream.java:107) at 
org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:328) 
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:653){code}
 

I use *maxChosenCount* to keep the datanode-choosing task from taking too long. 
It is calculated from the logarithm of the probability, and it also guarantees 
that the probability of choosing a slow node to write a block is less than 
0.01%.

Finally, I use an expire time so that the namenode does not choose these slow 
nodes within a specified period, because the slow nodes may have returned to 
normal after that period and can be used to write blocks again.


> namenode should avoid slow node when choose target in 
> BlockPlacementPolicyDefault
> -
>
> Key: HDFS-14789
> URL: https://issues.apache.org/jira/browse/HDFS-14789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14789
>
>
> With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
> SlowDisksReport in jmx. I think the namenode can use this slow node 
> information to avoid slow nodes while choosing targets in 
> BlockPlacementPolicyDefault.
> We can find slow nodes through the namenode's jmx, so I think the namenode 
> should check these slow nodes when assigning a node for writing a block. If 
> the namenode chooses a node at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*,
>  we should check whether it belongs to the slow nodes, because choosing a slow 
> one to write data may take a long time, which can cause a client to write data 
> very slowly and even hit a socket timeout exception.

[jira] [Updated] (HDFS-14789) namenode should avoid slow node when choose target in BlockPlacementPolicyDefault

2020-12-22 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14789:

Description: 
With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
SlowDisksReport in jmx. I think the namenode can avoid these slow nodes in 
chooseTarget in BlockPlacementPolicyDefault, because if there is a slow node in 
the pipeline, the client might write very slowly.

I use an invalidityTime to keep the namenode from choosing a slow node before 
the invalidity window expires. After the invalidityTime, if the slow node has 
returned to normal, the namenode can choose it again; if it is still very slow, 
the invalidityTime is renewed and the node keeps being skipped.

I also consider the fallback: if the namenode can't choose any normal node, 
chooseTarget will throw NotEnoughReplicasException and retry, this time without 
avoiding slow nodes.
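A sketch of the invalidityTime bookkeeping described above (the class and
method names are invented for illustration; the real change would hook into
BlockPlacementPolicyDefault):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** A slow node is skipped until its invalidity window expires; if it is
 *  reported slow again, the window is renewed. */
class SlowNodeExcluder {
  private final long invalidityTimeMs;
  private final Map<String, Long> excludedUntil = new ConcurrentHashMap<>();

  SlowNodeExcluder(long invalidityTimeMs) {
    this.invalidityTimeMs = invalidityTimeMs;
  }

  /** Called whenever a node shows up in the slow peer report. */
  void markSlow(String datanodeUuid, long nowMs) {
    excludedUntil.put(datanodeUuid, nowMs + invalidityTimeMs);
  }

  /** chooseTarget consults this before accepting a candidate. */
  boolean shouldAvoid(String datanodeUuid, long nowMs) {
    Long until = excludedUntil.get(datanodeUuid);
    if (until == null) {
      return false;
    }
    if (nowMs >= until) {
      excludedUntil.remove(datanodeUuid); // window expired: usable again
      return false;
    }
    return true;
  }
}
{code}

On the fallback path, chooseTarget catches NotEnoughReplicasException and
retries with this check disabled, as described above.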

 

 

  was:
With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
SlowDisksReport in jmx. I think the namenode can use this slow node information 
to avoid slow nodes while choosing targets in BlockPlacementPolicyDefault.

We can find slow nodes through the namenode's jmx, so I think the namenode 
should check these slow nodes when assigning a node for writing a block. If the 
namenode chooses a node at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault#*chooseRandom()*,
 we should check whether it belongs to the slow nodes, because choosing a slow 
one to write data may take a long time, which can cause a client to write data 
very slowly and even hit a socket timeout exception like this:

 
{code:java}
2019-08-19,17:16:41,181 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer 
Exceptionjava.net.SocketTimeoutException: 495000 millis timeout while waiting 
for channel to be ready for write. ch : 
java.nio.channels.SocketChannel[connected local=/xxx:xxx remote=/xxx:xxx] at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) at 
java.io.BufferedOutputStream.write(BufferedOutputStream.java:122) at 
java.io.DataOutputStream.write(DataOutputStream.java:107) at 
org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:328) 
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:653){code}
 

I use *maxChosenCount* to keep the datanode-choosing task from taking too long. 
It is calculated from the logarithm of the probability, and it also guarantees 
that the probability of choosing a slow node to write a block is less than 
0.01%.

Finally, I use an expire time so that the namenode does not choose these slow 
nodes within a specified period, because the slow nodes may have returned to 
normal after that period and can be used to write blocks again.


> namenode should avoid slow node when choose target in 
> BlockPlacementPolicyDefault
> -
>
> Key: HDFS-14789
> URL: https://issues.apache.org/jira/browse/HDFS-14789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14789
>
>
> With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
> SlowDisksReport in jmx. I think the namenode can avoid these slow nodes in 
> chooseTarget in BlockPlacementPolicyDefault, because if there is a slow node 
> in the pipeline, the client might write very slowly.
> I use an invalidityTime to keep the namenode from choosing a slow node before 
> the invalidity window expires. After the invalidityTime, if the slow node has 
> returned to normal, the namenode can choose it again; if it is still very 
> slow, the invalidityTime is renewed and the node keeps being skipped.
> I also consider the fallback: if the namenode can't choose any normal node, 
> chooseTarget will throw NotEnoughReplicasException and retry, this time 
> without avoiding slow nodes.
>  
>  






[jira] [Updated] (HDFS-14789) namenode should avoid slow node while chooseTarget in BlockPlacementPolicyDefault

2020-12-22 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14789:

Summary: namenode should avoid slow node while chooseTarget in 
BlockPlacementPolicyDefault  (was: namenode should avoid slow node when choose 
target in BlockPlacementPolicyDefault)

> namenode should avoid slow node while chooseTarget in 
> BlockPlacementPolicyDefault
> -
>
> Key: HDFS-14789
> URL: https://issues.apache.org/jira/browse/HDFS-14789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14789
>
>
> With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
> SlowDisksReport in jmx. I think the namenode can avoid these slow nodes in 
> chooseTarget in BlockPlacementPolicyDefault, because if there is a slow node 
> in the pipeline, the client might write very slowly.
> I use an invalidityTime to keep the namenode from choosing a slow node before 
> the invalidity window expires. After the invalidityTime, if the slow node has 
> returned to normal, the namenode can choose it again; if it is still very 
> slow, the invalidityTime is renewed and the node keeps being skipped.
> I also consider the fallback: if the namenode can't choose any normal node, 
> chooseTarget will throw NotEnoughReplicasException and retry, this time 
> without avoiding slow nodes.
>  
>  






[jira] [Updated] (HDFS-14789) namenode should avoid slow node while chooseTarget in BlockPlacementPolicyDefault

2020-12-22 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14789:

Attachment: HDFS-14789-001.patch

> namenode should avoid slow node while chooseTarget in 
> BlockPlacementPolicyDefault
> -
>
> Key: HDFS-14789
> URL: https://issues.apache.org/jira/browse/HDFS-14789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14789, HDFS-14789-001.patch
>
>
> With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
> SlowDisksReport in jmx. I think the namenode can avoid these slow nodes in 
> chooseTarget in BlockPlacementPolicyDefault, because if there is a slow node 
> in the pipeline, the client might write very slowly.
> I use an invalidityTime to keep the namenode from choosing a slow node before 
> the invalidity window expires. After the invalidityTime, if the slow node has 
> returned to normal, the namenode can choose it again; if it is still very 
> slow, the invalidityTime is renewed and the node keeps being skipped.
> I also consider the fallback: if the namenode can't choose any normal node, 
> chooseTarget will throw NotEnoughReplicasException and retry, this time 
> without avoiding slow nodes.
>  
>  






[jira] [Updated] (HDFS-14789) namenode should avoid slow node while chooseTarget in BlockPlacementPolicyDefault

2020-12-22 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14789:

Description: 
With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
SlowDisksReport in jmx. I think the namenode can avoid these slow nodes in 
chooseTarget in BlockPlacementPolicyDefault, because if there is a slow node in 
the pipeline, the client might write very slowly.

I use an invalidityTime to keep the namenode from choosing a slow node before 
the invalidity window expires. After the invalidityTime, if the slow node has 
returned to normal, the namenode can choose it again; if it is still very slow, 
the invalidityTime is renewed and the node keeps being skipped.

I also consider the fallback: if the namenode can't choose any normal node, 
chooseTarget will throw NotEnoughReplicasException and retry, this time without 
avoiding slow nodes.

 

!image-2020-12-22-22-15-17-703.png|width=969,height=322!

 

 

  was:
With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
SlowDisksReport in jmx. I think the namenode can avoid these slow nodes in 
chooseTarget in BlockPlacementPolicyDefault, because if there is a slow node in 
the pipeline, the client might write very slowly.

I use an invalidityTime to keep the namenode from choosing a slow node before 
the invalidity window expires. After the invalidityTime, if the slow node has 
returned to normal, the namenode can choose it again; if it is still very slow, 
the invalidityTime is renewed and the node keeps being skipped.

I also consider the fallback: if the namenode can't choose any normal node, 
chooseTarget will throw NotEnoughReplicasException and retry, this time without 
avoiding slow nodes.

 

 


> namenode should avoid slow node while chooseTarget in 
> BlockPlacementPolicyDefault
> -
>
> Key: HDFS-14789
> URL: https://issues.apache.org/jira/browse/HDFS-14789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14789, HDFS-14789-001.patch, 
> image-2020-12-22-22-15-17-703.png
>
>
> With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
> SlowDisksReport in jmx. I think the namenode can avoid these slow nodes in 
> chooseTarget in BlockPlacementPolicyDefault, because if there is a slow node 
> in the pipeline, the client might write very slowly.
> I use an invalidityTime to keep the namenode from choosing a slow node before 
> the invalidity window expires. After the invalidityTime, if the slow node has 
> returned to normal, the namenode can choose it again; if it is still very 
> slow, the invalidityTime is renewed and the node keeps being skipped.
> I also consider the fallback: if the namenode can't choose any normal node, 
> chooseTarget will throw NotEnoughReplicasException and retry, this time 
> without avoiding slow nodes.
>  
> !image-2020-12-22-22-15-17-703.png|width=969,height=322!
>  
>  






[jira] [Updated] (HDFS-14789) namenode should avoid slow node while chooseTarget in BlockPlacementPolicyDefault

2020-12-22 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14789:

Attachment: image-2020-12-22-22-15-17-703.png

> namenode should avoid slow node while chooseTarget in 
> BlockPlacementPolicyDefault
> -
>
> Key: HDFS-14789
> URL: https://issues.apache.org/jira/browse/HDFS-14789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14789, HDFS-14789-001.patch, 
> image-2020-12-22-22-15-17-703.png
>
>
> With HDFS-11194 and HDFS-11551, the namenode can show the SlowPeersReport and 
> SlowDisksReport in jmx. I think the namenode can avoid these slow nodes in 
> chooseTarget in BlockPlacementPolicyDefault, because if there is a slow node 
> in the pipeline, the client might write very slowly.
> I use an invalidityTime to keep the namenode from choosing a slow node before 
> the invalidity window expires. After the invalidityTime, if the slow node has 
> returned to normal, the namenode can choose it again; if it is still very 
> slow, the invalidityTime is renewed and the node keeps being skipped.
> I also consider the fallback: if the namenode can't choose any normal node, 
> chooseTarget will throw NotEnoughReplicasException and retry, this time 
> without avoiding slow nodes.
>  
>  






[jira] [Updated] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable

2020-12-31 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-15745:

Attachment: HDFS-15745-002.patch

> Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES 
> configurable
> --
>
> Key: HDFS-15745
> URL: https://issues.apache.org/jira/browse/HDFS-15745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, 
> image-2020-12-22-17-00-50-796.png
>
>
> When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found 
> many slow peers whose ReportingNodes' averageDelay was very low, and these 
> slow peer nodes were actually normal. I think the reason so many slow peers 
> are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too 
> small (only 5ms) and it is not configurable. The default slow IO warning log 
> threshold is 300ms, i.e. 
> DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so 
> DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise 
> the namenode will get a lot of invalid slow peer information.
> !image-2020-12-22-17-00-50-796.png!






[jira] [Commented] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable

2020-12-31 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256966#comment-17256966
 ] 

Haibin Huang commented on HDFS-15745:
-

Thanks [~ayushtkn] for the review. I have updated the patch; please take a look.

> Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES 
> configurable
> --
>
> Key: HDFS-15745
> URL: https://issues.apache.org/jira/browse/HDFS-15745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, 
> image-2020-12-22-17-00-50-796.png
>
>
> When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found 
> many slow peers whose ReportingNodes' averageDelay was very low, and these 
> slow peer nodes were actually normal. I think the reason so many slow peers 
> are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too 
> small (only 5ms) and it is not configurable. The default slow IO warning log 
> threshold is 300ms, i.e. 
> DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so 
> DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise 
> the namenode will get a lot of invalid slow peer information.
> !image-2020-12-22-17-00-50-796.png!






[jira] [Updated] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable

2020-12-31 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-15745:

Attachment: HDFS-15745-003.patch

> Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES 
> configurable
> --
>
> Key: HDFS-15745
> URL: https://issues.apache.org/jira/browse/HDFS-15745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, 
> HDFS-15745-003.patch, image-2020-12-22-17-00-50-796.png
>
>
> When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found 
> many slow peers whose ReportingNodes' averageDelay was very low, and these 
> slow peer nodes were actually normal. I think the reason so many slow peers 
> are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too 
> small (only 5ms) and it is not configurable. The default slow IO warning log 
> threshold is 300ms, i.e. 
> DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so 
> DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise 
> the namenode will get a lot of invalid slow peer information.
> !image-2020-12-22-17-00-50-796.png!






[jira] [Commented] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable

2020-12-31 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17257080#comment-17257080
 ] 

Haibin Huang commented on HDFS-15745:
-

Thanks [~ayushtkn], I have updated the patch.

> Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES 
> configurable
> --
>
> Key: HDFS-15745
> URL: https://issues.apache.org/jira/browse/HDFS-15745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, 
> HDFS-15745-003.patch, image-2020-12-22-17-00-50-796.png
>
>
> When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found 
> many slow peers whose ReportingNodes' averageDelay was very low, and these 
> slow peer nodes were actually normal. I think the reason so many slow peers 
> are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too 
> small (only 5ms) and it is not configurable. The default slow IO warning log 
> threshold is 300ms, i.e. 
> DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so 
> DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise 
> the namenode will get a lot of invalid slow peer information.
> !image-2020-12-22-17-00-50-796.png!






[jira] [Created] (HDFS-15758) Fix typos in MutableMetric

2021-01-03 Thread Haibin Huang (Jira)
Haibin Huang created HDFS-15758:
---

 Summary: Fix typos in MutableMetric
 Key: HDFS-15758
 URL: https://issues.apache.org/jira/browse/HDFS-15758
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haibin Huang
Assignee: Haibin Huang


The Javadoc of MutableMetric#changed may cause misunderstanding; it needs to be 
fixed.






[jira] [Updated] (HDFS-15758) Fix typos in MutableMetric

2021-01-03 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-15758:

Priority: Minor  (was: Major)

> Fix typos in MutableMetric
> --
>
> Key: HDFS-15758
> URL: https://issues.apache.org/jira/browse/HDFS-15758
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Minor
>
> The Javadoc of MutableMetric#changed may cause misunderstanding; it needs to 
> be fixed.






[jira] [Updated] (HDFS-15758) Fix typos in MutableMetric

2021-01-03 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-15758:

Attachment: HDFS-15758-001.patch

> Fix typos in MutableMetric
> --
>
> Key: HDFS-15758
> URL: https://issues.apache.org/jira/browse/HDFS-15758
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Minor
> Attachments: HDFS-15758-001.patch
>
>
> The Javadoc of MutableMetric#changed may cause misunderstanding; it needs to 
> be fixed.






[jira] [Updated] (HDFS-15758) Fix typos in MutableMetric

2021-01-03 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-15758:

Status: Patch Available  (was: Open)

> Fix typos in MutableMetric
> --
>
> Key: HDFS-15758
> URL: https://issues.apache.org/jira/browse/HDFS-15758
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Minor
> Attachments: HDFS-15758-001.patch
>
>
> The Javadoc of MutableMetric#changed may cause misunderstanding; it needs to 
> be fixed.






[jira] [Commented] (HDFS-15758) Fix typos in MutableMetric

2021-01-14 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17265667#comment-17265667
 ] 

Haibin Huang commented on HDFS-15758:
-

[~ayushtkn] would you mind taking a look at this?

> Fix typos in MutableMetric
> --
>
> Key: HDFS-15758
> URL: https://issues.apache.org/jira/browse/HDFS-15758
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Minor
> Attachments: HDFS-15758-001.patch
>
>
> The Javadoc of MutableMetric#changed may cause misunderstanding; it needs to 
> be fixed.






[jira] [Commented] (HDFS-15666) add average latency information to the SlowPeerReport

2021-01-20 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268985#comment-17268985
 ] 

Haibin Huang commented on HDFS-15666:
-

[~ayushtkn] [~aajisaka] [~elgoiri] [~hexiaoqiao] would you mind taking a look 
at this? This improvement has worked well at my company.

> add average latency information to the SlowPeerReport
> -
>
> Key: HDFS-15666
> URL: https://issues.apache.org/jira/browse/HDFS-15666
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Minor
> Attachments: HDFS-15666-003.patch, HDFS-15666-004.patch, 
> HDFS-15666.001.patch, HDFS-15666.002.patch
>
>
> In the namenode's jmx, there is a SlowDisksReport like this:
> {code:java}
> [{"SlowDiskID":"dn3:disk1","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn2:disk2","Latencies":{"WRITE":1000.1}},{"SlowDiskID":"dn1:disk2","Latencies":{"READ":1000.3}},{"SlowDiskID":"dn1:disk1","Latencies":{"METADATA":1000.1,"READ":1000.8}}]
> {code}
> So we can learn the disk I/O latency from this report. However, the 
> SlowPeersReport doesn't include the average latency:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]
> {code}
> I think we should add the average latency to the report, which can be 
> obtained from org.apache.hadoop.hdfs.server.protocol.SlowPeerReports#slowPeers.
> After adding the average latency, the SlowPeerReport can look like this:
> {code:java}
> [{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageLatency":2000.0},{"nodeId":"node3","averageLatency":1000.0}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageLatency":2000.0}]}]{code}






[jira] [Commented] (HDFS-15744) Use cumulative counting way to improve the accuracy of slow disk detection

2021-01-20 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268996#comment-17268996
 ] 

Haibin Huang commented on HDFS-15744:
-

[~ayushtkn] [~aajisaka] [~elgoiri] [~hexiaoqiao] would you mind taking a look 
at this? We use this approach to detect slow disks at my company, and the 
accuracy of finding bad disks is over 90%.

> Use cumulative counting way to improve the accuracy of slow disk detection
> --
>
> Key: HDFS-15744
> URL: https://issues.apache.org/jira/browse/HDFS-15744
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15744-001.patch, image-2020-12-22-11-37-14-734.png, 
> image-2020-12-22-11-37-35-280.png, image-2020-12-22-11-46-48-817.png
>
>
> HDFS has supported datanode disk outlier detection since 
> [HDFS-11461|https://issues.apache.org/jira/browse/HDFS-11461], and we can use 
> it to find slow disks via the SlowDiskReport 
> ([HDFS-11551|https://issues.apache.org/jira/browse/HDFS-11551]). However, I 
> found that the slow disk information may not be accurate enough in practice, 
> because a large number of short-term writes can lead to miscalculation. Here 
> is an example: the disk is healthy, but when it encounters heavy writing for a 
> few minutes, its write I/O does get slow and it is flagged as a slow disk. The 
> disk is only slow for a few minutes, yet the SlowDiskReport keeps it until the 
> information becomes invalid. This scenario confuses us, since we want to use 
> the SlowDiskReport to detect genuinely bad disks.
> !image-2020-12-22-11-37-14-734.png!
> !image-2020-12-22-11-37-35-280.png!
> To improve the detection accuracy, we use a cumulative counting approach to 
> detect slow disks. If, within the reportValidityMs interval, a disk is flagged 
> as an outlier more than 50% of the time, then it should be a genuinely bad 
> disk.
> Here is an example: if reportValidityMs is one hour and the detection interval 
> is five minutes, there are 12 disk outlier detections per hour. If a disk is 
> flagged as an outlier more than 6 times, it should be a genuinely bad disk. We 
> use this approach to detect bad disks in our cluster, and it reaches over 90% 
> accuracy.
> !image-2020-12-22-11-46-48-817.png!






[jira] [Updated] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable

2021-03-26 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-15745:

Attachment: HDFS-15745-branch-3.3.001.patch
HDFS-15745-branch-3.2.001.patch
HDFS-15745-branch-3.1.001.patch

> Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES 
> configurable
> --
>
> Key: HDFS-15745
> URL: https://issues.apache.org/jira/browse/HDFS-15745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, 
> HDFS-15745-003.patch, HDFS-15745-branch-3.1.001.patch, 
> HDFS-15745-branch-3.2.001.patch, HDFS-15745-branch-3.3.001.patch, 
> image-2020-12-22-17-00-50-796.png
>
>
> When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found 
> many slow peers whose ReportingNodes' averageDelay was very low, and these 
> slow peer nodes were actually normal. I think the reason so many slow peers 
> are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too 
> small (only 5ms) and it is not configurable. The default slow IO warning log 
> threshold is 300ms, i.e. 
> DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so 
> DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise 
> the namenode will get a lot of invalid slow peer information.
> !image-2020-12-22-17-00-50-796.png!






[jira] [Commented] (HDFS-15745) Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable

2021-03-26 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309362#comment-17309362
 ] 

Haibin Huang commented on HDFS-15745:
-

Thanks [~prasad-acit] for the comment. I have updated this patch for branches 
3.1, 3.2, and 3.3; [~ayushtkn], would you mind committing them?

> Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES 
> configurable
> --
>
> Key: HDFS-15745
> URL: https://issues.apache.org/jira/browse/HDFS-15745
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch, 
> HDFS-15745-003.patch, HDFS-15745-branch-3.1.001.patch, 
> HDFS-15745-branch-3.2.001.patch, HDFS-15745-branch-3.3.001.patch, 
> image-2020-12-22-17-00-50-796.png
>
>
> When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found 
> many slow peers whose ReportingNodes' averageDelay was very low, and these 
> slow peer nodes were actually normal. I think the reason so many slow peers 
> are generated is that the value of DataNodePeerMetrics#LOW_THRESHOLD_MS is too 
> small (only 5ms) and it is not configurable. The default slow IO warning log 
> threshold is 300ms, i.e. 
> DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so 
> DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise 
> the namenode will get a lot of invalid slow peer information.
> !image-2020-12-22-17-00-50-796.png!






[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-05-19 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347348#comment-17347348
 ] 

Haibin Huang commented on HDFS-13671:
-

Thanks [~ferhui] and [~LiJinglun] for involving me here; I will submit a patch 
later.

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Priority: Major
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> Actually the first step should be the more expensive operation and should 
> take more time. However, we now always see the NN hang during the 
> remove-block operation.
> Looking into this, we introduced a new structure, {{FoldedTreeSet}}, for 
> better performance in handling FBRs/IBRs. But compared with the early 
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower, 
> since it takes additional time to rebalance tree nodes. When there are many 
> blocks to be removed/deleted, it looks bad.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide 
> {{getBlockIterator}} to return a block iterator, and no other get operation 
> for a specified block. Do we still need to use {{FoldedTreeSet}} in 
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not 
> Update. Maybe we can revert this to the early implementation.
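To see the tradeoff in isolation, here is a small stand-alone benchmark sketch
that uses the JDK's TreeSet (a balanced tree, standing in for
{{FoldedTreeSet}}) and HashSet (a hash table, standing in for
{{LightWeightResizableGSet}}); it illustrates the argument rather than
measuring the actual Hadoop classes:

{code:java}
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class RemoveBench {
  public static void main(String[] args) {
    final long n = 2_000_000;
    Set<Long> tree = new TreeSet<>();
    Set<Long> hash = new HashSet<>();
    for (long i = 0; i < n; i++) {
      tree.add(i);
      hash.add(i);
    }

    long t0 = System.nanoTime();
    for (long i = 0; i < n; i++) {
      tree.remove(i); // O(log n) per removal, plus rebalancing work
    }
    long t1 = System.nanoTime();
    for (long i = 0; i < n; i++) {
      hash.remove(i); // amortized O(1) per removal
    }
    long t2 = System.nanoTime();

    System.out.printf("tree remove: %d ms, hash remove: %d ms%n",
        (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
  }
}
{code}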






[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-05-24 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-13671:

Attachment: HDFS-13671-001.patch

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-13671-001.patch
>
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> Actually the first step should be the more expensive operation and should 
> take more time. However, we now always see the NN hang during the 
> remove-block operation.
> Looking into this, we introduced a new structure, {{FoldedTreeSet}}, for 
> better performance in handling FBRs/IBRs. But compared with the early 
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower, 
> since it takes additional time to rebalance tree nodes. When there are many 
> blocks to be removed/deleted, it looks bad.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide 
> {{getBlockIterator}} to return a block iterator, and no other get operation 
> for a specified block. Do we still need to use {{FoldedTreeSet}} in 
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not 
> Update. Maybe we can revert this to the early implementation.






[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-05-24 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350752#comment-17350752
 ] 

Haibin Huang commented on HDFS-13671:
-

[^HDFS-13671-001.patch] is based on 
[HDFS-9260|https://issues.apache.org/jira/browse/HDFS-9260], which reverts 
FoldedTreeSet to LightWeightResizableGSet in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaMap. This patch 
works well at my company; I will submit a test report later.

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-13671-001.patch
>
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-06-01 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355019#comment-17355019
 ] 

Haibin Huang commented on HDFS-13671:
-

Thanks [~ferhui], I have created a new PR:

https://github.com/apache/hadoop/pull/3065

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13671-001.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-06-02 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355679#comment-17355679
 ] 

Haibin Huang commented on HDFS-13671:
-

Thanks for [~ferhui]'s reminder; I have updated the PR. The failed tests pass 
in my local environment; I don't know why they fail in CI.

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13671-001.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-06-10 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-13671:

Attachment: image-2021-06-10-19-28-18-373.png

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13671-001.patch, image-2021-06-10-19-28-18-373.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-06-10 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360763#comment-17360763
 ] 

Haibin Huang commented on HDFS-13671:
-

We applied this patch in our company's cluster, which has over 300 nodes and 
empties trash every 6 hours. Below is the p99 rpc time on the hdfs-client; the 
axis unit is ms. Before applying this patch, the rpc time exceeded 1,000 ms 
while the namenode was doing delete work; after applying this patch, it 
returned to normal.

!image-2021-06-10-19-28-18-373.png!

!image-2021-06-10-19-28-58-359.png!

 

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13671-001.patch, image-2021-06-10-19-28-18-373.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-06-10 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-13671:

Attachment: image-2021-06-10-19-28-58-359.png

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-13671-001.patch, image-2021-06-10-19-28-18-373.png, 
> image-2021-06-10-19-28-58-359.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-06-18 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-13671:

Attachment: image-2021-06-18-15-46-46-052.png

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
> Attachments: HDFS-13671-001.patch, image-2021-06-10-19-28-18-373.png, 
> image-2021-06-10-19-28-58-359.png, image-2021-06-18-15-46-46-052.png
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-06-18 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-13671:

Attachment: image-2021-06-18-15-47-04-037.png

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
> Attachments: HDFS-13671-001.patch, image-2021-06-10-19-28-18-373.png, 
> image-2021-06-10-19-28-58-359.png, image-2021-06-18-15-46-46-052.png, 
> image-2021-06-18-15-47-04-037.png
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-06-18 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365296#comment-17365296
 ] 

Haibin Huang commented on HDFS-13671:
-

[~tomscut] You are right, it will affect the performance of handling block 
reports. In my company's cluster, which has over 300 nodes, the AvgProcessTime 
of block reports increases by about 70 percent, but since the qps of block 
reports is very low, I think it is acceptable. And the p99 rpc time on the 
hdfs-client can be reduced by 85% when the namenode does a big delete 
operation, so the revert is worth doing.

!image-2021-06-18-15-46-46-052.png!

!image-2021-06-18-15-47-04-037.png!

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
> Attachments: HDFS-13671-001.patch, image-2021-06-10-19-28-18-373.png, 
> image-2021-06-10-19-28-58-359.png, image-2021-06-18-15-46-46-052.png, 
> image-2021-06-18-15-47-04-037.png
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-06-18 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365319#comment-17365319
 ] 

Haibin Huang commented on HDFS-13671:
-

[~tomscut] there are 2 blocks in one disk, and each datanode has 12 disks

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
> Attachments: HDFS-13671-001.patch, image-2021-06-10-19-28-18-373.png, 
> image-2021-06-10-19-28-58-359.png, image-2021-06-18-15-46-46-052.png, 
> image-2021-06-18-15-47-04-037.png
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet

2021-06-18 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17365399#comment-17365399
 ] 

Haibin Huang commented on HDFS-13671:
-

[~zhaojk] I will cherry-pick this to branch-3.1 later. You can update just 
your namenode and don't need to update the datanodes together; in my company I 
updated only the namenode, and it's compatible with datanodes that still have 
FoldedTreeSet. But it is better to update your datanodes later if you have 
time.

> Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
> --
>
> Key: HDFS-13671
> URL: https://issues.apache.org/jira/browse/HDFS-13671
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Yiqun Lin
>Assignee: Haibin Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
> Attachments: HDFS-13671-001.patch, image-2021-06-10-19-28-18-373.png, 
> image-2021-06-10-19-28-58-359.png, image-2021-06-18-15-46-46-052.png, 
> image-2021-06-18-15-47-04-037.png
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> NameNode hung when deleting large files/blocks. The stack info:
> {code}
> "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 
> tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849)
>   at 
> org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> {code}
> In the current deletion logic in the NameNode, there are mainly two steps:
> * Collect the INodes and all blocks to be deleted, then delete the INodes.
> * Remove the blocks chunk by chunk in a loop.
> The first step should actually be the more expensive operation and take
> more time. However, we always see the NN hang during the remove-block
> operation.
> Looking into this: we introduced the new structure {{FoldedTreeSet}} to get
> better performance when handling FBRs/IBRs. But compared with the earlier
> implementation of the remove-block logic, {{FoldedTreeSet}} seems slower,
> since it takes additional time to rebalance tree nodes. When there are many
> blocks to be removed/deleted, it performs badly.
> For the get-type operations in {{DatanodeStorageInfo}}, we only provide
> {{getBlockIterator}} to return a block iterator; there is no other get
> operation for a specified block. Do we still need to use {{FoldedTreeSet}} in
> {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not
> Update. Maybe we can revert this to the earlier implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat

2019-10-10 Thread HaiBin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HaiBin Huang updated HDFS-14612:

Attachment: HDFS-14612-002.patch

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> ---
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: HaiBin Huang
>Assignee: HaiBin Huang
>Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, 
> HDFS-14612.patch
>
>
> I found that the SlowDiskReport won't update when slowDisks is always empty
> in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*; this may
> lead to an outdated SlowDiskReport staying in the namenode's jmx until the
> next time slowDisks isn't empty. So I think the method
> *checkAndUpdateReportIfNecessary()* should be called first whenever we want
> to get the SlowDiskReport information from jmx; this keeps the
> SlowDiskReport in jmx always valid.
>  
> There are also some incorrect object references in
> org.apache.hadoop.hdfs.server.datanode.fsdataset.*DataNodeVolumeMetrics*:
> the write-IO getters below delegate to syncIoRate instead of writeIoRate:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  
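
Two short sketches of the fixes suggested above; both are illustrative, not the
actual patch. First, the call-order fix: refresh the report before serving it
from jmx, so a stale report gets pruned even when heartbeats keep carrying an
empty slowDisks list. *checkAndUpdateReportIfNecessary()* is the method named
in the description; the getter name and the tracker field are assumptions.

{code:java}
// Hypothetical jmx getter on the namenode side.
public String getSlowDisksReport() {
  // Prune outdated entries before building the report.
  slowDiskTracker.checkAndUpdateReportIfNecessary();
  return slowDiskTracker.getSlowDiskReportAsJsonString();
}
{code}

Second, the *DataNodeVolumeMetrics* fix: assuming the "// Based on writeIoRate"
comment is authoritative, the getters should presumably delegate to writeIoRate
rather than syncIoRate:

{code:java}
// Based on writeIoRate (presumed fix: use writeIoRate, not syncIoRate).
public long getWriteIoSampleCount() {
  return writeIoRate.lastStat().numSamples();
}

public double getWriteIoMean() {
  return writeIoRate.lastStat().mean();
}

public double getWriteIoStdDev() {
  return writeIoRate.lastStat().stddev();
}
{code}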



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat

2019-10-10 Thread HaiBin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949074#comment-16949074
 ] 

HaiBin Huang commented on HDFS-14612:
-

update patch v2, add a new test

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> ---
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: HaiBin Huang
>Assignee: HaiBin Huang
>Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, 
> HDFS-14612.patch
>
>
> I found that the SlowDiskReport won't update when slowDisks is always empty
> in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*; this may
> lead to an outdated SlowDiskReport staying in the namenode's jmx until the
> next time slowDisks isn't empty. So I think the method
> *checkAndUpdateReportIfNecessary()* should be called first whenever we want
> to get the SlowDiskReport information from jmx; this keeps the
> SlowDiskReport in jmx always valid.
>  
> There are also some incorrect object references in
> org.apache.hadoop.hdfs.server.datanode.fsdataset.*DataNodeVolumeMetrics*:
> the write-IO getters below delegate to syncIoRate instead of writeIoRate:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat

2019-10-10 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14612:

Attachment: HDFS-14612-003.patch

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> ---
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, 
> HDFS-14612-003.patch, HDFS-14612.patch
>
>
> I found that the SlowDiskReport won't update when slowDisks is always empty
> in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*; this may
> lead to an outdated SlowDiskReport staying in the namenode's jmx until the
> next time slowDisks isn't empty. So I think the method
> *checkAndUpdateReportIfNecessary()* should be called first whenever we want
> to get the SlowDiskReport information from jmx; this keeps the
> SlowDiskReport in jmx always valid.
>  
> There are also some incorrect object references in
> org.apache.hadoop.hdfs.server.datanode.fsdataset.*DataNodeVolumeMetrics*:
> the write-IO getters below delegate to syncIoRate instead of writeIoRate:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14780) add ReportingNode to make the information of SlowPeersReport in namenode's jmx more specific

2019-10-11 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14780:

Summary: add ReportingNode to make the information of SlowPeersReport in 
namenode's jmx more specific  (was: make the information of SlowPeersReport in 
namenode's jmx more specific)

> add ReportingNode to make the information of SlowPeersReport in namenode's 
> jmx more specific
> 
>
> Key: HDFS-14780
> URL: https://issues.apache.org/jira/browse/HDFS-14780
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14780
>
>
> I found that the *SlowPeersReport* in the namenode's jmx is too simple, so I
> made an inner class called
> org.apache.hadoop.hdfs.server.blockmanagement.SlowPeerTracker.*ReportingNode*
> to make the SlowPeersReport's information more specific. Here is an example;
> the old SlowPeersReport may look like this:
>  
> {code:java}
> "SlowPeersReport" 
> :[{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]{code}
>  
> we can see that the old SlowPeersReport can only tell you who is reporting
> the slow node; actually, we can get more information by using the inner
> class *ReportingNode*:
>  
> {code:java}
> "SlowPeersReport" 
> :[{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 
> 2019"}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 
> 2019"},{"nodeId":"node3","averageDelay":1000.0,"reportTime":"Tue Aug 27 
> 16:44:49 CST 
> 2019"}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 2019"}]}]{code}
>  
> we can know the *averageDelay* of the reporting node sending packets to the
> slow node, and the *reportTime* tells us when this message was reported to
> the namenode. I think this information will be helpful for analyzing
> slow-node problems.
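
A sketch of what such an inner class could look like; the field names are taken
from the JSON above, while the class shape and accessors are illustrative
assumptions rather than the actual patch:

{code:java}
// Hypothetical shape of SlowPeerTracker.ReportingNode.
public static class ReportingNode {
  private final String nodeId;       // datanode that reported the slow peer
  private final double averageDelay; // average packet-send delay to the slow node, in ms
  private final String reportTime;   // when the report reached the namenode

  public ReportingNode(String nodeId, double averageDelay, String reportTime) {
    this.nodeId = nodeId;
    this.averageDelay = averageDelay;
    this.reportTime = reportTime;
  }

  public String getNodeId() { return nodeId; }
  public double getAverageDelay() { return averageDelay; }
  public String getReportTime() { return reportTime; }
}
{code}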



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14780) add ReportingNode to make the information of SlowPeersReport in namenode's jmx more specific

2019-10-11 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14780:

Attachment: HDFS-14780.001.patch

> add ReportingNode to make the information of SlowPeersReport in namenode's 
> jmx more specific
> 
>
> Key: HDFS-14780
> URL: https://issues.apache.org/jira/browse/HDFS-14780
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14780, HDFS-14780.001.patch
>
>
> I found that the *SlowPeersReport* in the namenode's jmx is too simple, so I
> made an inner class called
> org.apache.hadoop.hdfs.server.blockmanagement.SlowPeerTracker.*ReportingNode*
> to make the SlowPeersReport's information more specific. Here is an example;
> the old SlowPeersReport may look like this:
>  
> {code:java}
> "SlowPeersReport" 
> :[{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]{code}
>  
> we can see that the old SlowPeersReport can only tell you who is reporting
> the slow node; actually, we can get more information by using the inner
> class *ReportingNode*:
>  
> {code:java}
> "SlowPeersReport" 
> :[{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 
> 2019"}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 
> 2019"},{"nodeId":"node3","averageDelay":1000.0,"reportTime":"Tue Aug 27 
> 16:44:49 CST 
> 2019"}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 2019"}]}]{code}
>  
> we can know the *averageDelay* of the reporting node sending packets to the
> slow node, and the *reportTime* tells us when this message was reported to
> the namenode. I think this information will be helpful for analyzing
> slow-node problems.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14780) add ReportingNode to make the information of SlowPeersReport in namenode's jmx more specific

2019-10-11 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949292#comment-16949292
 ] 

Haibin Huang edited comment on HDFS-14780 at 10/11/19 9:00 AM:
---

Updated the patch and added a new test.


was (Author: huanghaibin):
updata patch ,add a new test

> add ReportingNode to make the information of SlowPeersReport in namenode's 
> jmx more specific
> 
>
> Key: HDFS-14780
> URL: https://issues.apache.org/jira/browse/HDFS-14780
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14780, HDFS-14780.001.patch
>
>
> I found that the *SlowPeersReport* in the namenode's jmx is too simple, so I 
> added an inner class called 
> org.apache.hadoop.hdfs.server.blockmanagement.SlowPeerTracker.*ReportingNode* 
> to make the SlowPeersReport more specific. Here is an example; the old 
> SlowPeersReport may look like this:
>  
> {code:java}
> "SlowPeersReport" 
> :[{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]{code}
>  
> We can see that the old SlowPeersReport only tells you which nodes are 
> reporting the slow node; we can expose more information by using the inner 
> class *ReportingNode*:
>  
> {code:java}
> "SlowPeersReport" 
> :[{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 
> 2019"}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 
> 2019"},{"nodeId":"node3","averageDelay":1000.0,"reportTime":"Tue Aug 27 
> 16:44:49 CST 
> 2019"}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 2019"}]}]{code}
>  
> Here we can see the *averageDelay* of the reporting node sending packets to 
> the slow node, and the *reportTime* tells us when this report reached the 
> namenode. I think this information will help us analyze slow-node problems.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14780) add ReportingNode to make the information of SlowPeersReport in namenode's jmx more specific

2019-10-11 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949292#comment-16949292
 ] 

Haibin Huang commented on HDFS-14780:
-

Updated the patch and added a new test.

> add ReportingNode to make the information of SlowPeersReport in namenode's 
> jmx more specific
> 
>
> Key: HDFS-14780
> URL: https://issues.apache.org/jira/browse/HDFS-14780
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14780, HDFS-14780.001.patch
>
>
> I found that the *SlowPeersReport* in the namenode's jmx is too simple, so I 
> added an inner class called 
> org.apache.hadoop.hdfs.server.blockmanagement.SlowPeerTracker.*ReportingNode* 
> to make the SlowPeersReport more specific. Here is an example; the old 
> SlowPeersReport may look like this:
>  
> {code:java}
> "SlowPeersReport" 
> :[{"SlowNode":"node4","ReportingNodes":["node1"]},{"SlowNode":"node2","ReportingNodes":["node1","node3"]},{"SlowNode":"node1","ReportingNodes":["node2"]}]{code}
>  
> We can see that the old SlowPeersReport only tells you which nodes are 
> reporting the slow node; we can expose more information by using the inner 
> class *ReportingNode*:
>  
> {code:java}
> "SlowPeersReport" 
> :[{"SlowNode":"node4","ReportingNodes":[{"nodeId":"node1","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 
> 2019"}]},{"SlowNode":"node2","ReportingNodes":[{"nodeId":"node1","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 
> 2019"},{"nodeId":"node3","averageDelay":1000.0,"reportTime":"Tue Aug 27 
> 16:44:49 CST 
> 2019"}]},{"SlowNode":"node1","ReportingNodes":[{"nodeId":"node2","averageDelay":2000.0,"reportTime":"Tue
>  Aug 27 16:44:49 CST 2019"}]}]{code}
>  
> Here we can see the *averageDelay* of the reporting node sending packets to 
> the slow node, and the *reportTime* tells us when this report reached the 
> namenode. I think this information will help us analyze slow-node problems.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14783) expired SlowPeersReport will keep staying on namenode's jmx

2019-10-11 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14783:

Attachment: HDFS-14783-001.patch

> expired SlowPeersReport will keep staying on namenode's jmx
> ---
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch
>
>
> The SlowPeersReport in the namenode's jmx tells us which datanode is a slow 
> node; it is calculated from the average duration of packet transfers between 
> two datanodes. For example, if dn1 takes too long on average to send packets 
> to dn2 (over the *upperLimitLatency*), you will see a SlowPeersReport in the 
> namenode's jmx like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> However, if dn1 sends some packets to dn2 slowly at the beginning and then 
> doesn't send any packets to dn2 for a long time, the above SlowPeersReport 
> will keep staying in the namenode's jmx. I think this SlowPeersReport is an 
> expired message: the network between dn1 and dn2 may have returned to 
> normal, but the report remains in the namenode's jmx until the next time dn1 
> sends packets to dn2. So I use a timestamp to record when an 
> *org.apache.hadoop.metrics2.util.SampleStat* is created, and calculate the 
> average duration only from *SampleStat* instances whose timestamps are still 
> valid.
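>  
> A minimal sketch of the idea follows; the wrapper type and field names here 
> are assumptions for illustration, not the actual HDFS-14783 patch:
> {code:java}
> import java.util.List;
> 
> // Sketch: pair each stat with its creation time, and only average over
> // stats that are still inside the validity window.
> class SlowPeerAverage {
>   static class TimestampedStat { // hypothetical holder
>     long createTimeMs;  // when the underlying SampleStat was created
>     double mean;        // average packet-send duration in ms
>     long numSamples;
>   }
> 
>   static double averageDelay(List<TimestampedStat> stats, long validityMs) {
>     long now = System.currentTimeMillis();
>     double sum = 0;
>     long samples = 0;
>     for (TimestampedStat s : stats) {
>       if (now - s.createTimeMs <= validityMs) { // skip expired stats
>         sum += s.mean * s.numSamples;
>         samples += s.numSamples;
>       }
>     }
>     return samples > 0 ? sum / samples : 0.0;
>   }
> }
> {code}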



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14783) expired SlowPeersReport will keep staying on namenode's jmx

2019-10-11 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949410#comment-16949410
 ] 

Haibin Huang commented on HDFS-14783:
-

Updated the patch and added a new test.

> expired SlowPeersReport will keep staying on namenode's jmx
> ---
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch
>
>
> The SlowPeersReport in the namenode's jmx tells us which datanode is a slow 
> node; it is calculated from the average duration of packet transfers between 
> two datanodes. For example, if dn1 takes too long on average to send packets 
> to dn2 (over the *upperLimitLatency*), you will see a SlowPeersReport in the 
> namenode's jmx like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> However, if dn1 sends some packets to dn2 slowly at the beginning and then 
> doesn't send any packets to dn2 for a long time, the above SlowPeersReport 
> will keep staying in the namenode's jmx. I think this SlowPeersReport is an 
> expired message: the network between dn1 and dn2 may have returned to 
> normal, but the report remains in the namenode's jmx until the next time dn1 
> sends packets to dn2. So I use a timestamp to record when an 
> *org.apache.hadoop.metrics2.util.SampleStat* is created, and calculate the 
> average duration only from *SampleStat* instances whose timestamps are still 
> valid.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat

2019-10-11 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14612:

Attachment: HDFS-14612-004.patch

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> ---
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, 
> HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612.patch
>
>
> I found that the SlowDiskReport won't update when slowDisks is always empty 
> in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*; this may 
> leave an outdated SlowDiskReport staying in the namenode's jmx until the 
> next time slowDisks isn't empty. So I think the method 
> *checkAndUpdateReportIfNecessary()* should be called first whenever we fetch 
> the SlowDiskReport through jmx, as sketched below; this keeps the 
> SlowDiskReport in jmx always valid.
>  
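> A rough sketch of that call ordering (the getter name and serialization here 
> are assumptions for illustration, not the actual patch):
> {code:java}
> // Sketch: refresh the report before serving it over jmx, so stale entries
> // are pruned even when heartbeats never carry slow disks.
> public String getSlowDisksReport() {
>   checkAndUpdateReportIfNecessary(); // prune expired entries first
>   return JSON.toString(slowDisksReport); // serialization assumed
> }
> {code}
>  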
> There are also some incorrect object references in 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.*DataNodeVolumeMetrics*: 
> the write-io getters below read from *syncIoRate* instead of *writeIoRate*:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  
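> The presumed fix is a one-line change per getter, pointing them at 
> *writeIoRate* (a sketch; the actual patch may differ):
> {code:java}
> // Corrected to read from the matching rate object.
> public long getWriteIoSampleCount() {
>   return writeIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return writeIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return writeIoRate.lastStat().stddev();
> }
> {code}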



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat

2019-11-12 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972486#comment-16972486
 ] 

Haibin Huang commented on HDFS-14612:
-

[~weichiu], I have updated this patch; would you help review it? Thank you.

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> ---
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, 
> HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612.patch
>
>
> I found that the SlowDiskReport won't update when slowDisks is always empty 
> in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*; this may 
> leave an outdated SlowDiskReport staying in the namenode's jmx until the 
> next time slowDisks isn't empty. So I think the method 
> *checkAndUpdateReportIfNecessary()* should be called first whenever we fetch 
> the SlowDiskReport through jmx; this keeps the SlowDiskReport in jmx always 
> valid.
>  
> There are also some incorrect object references in 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.*DataNodeVolumeMetrics*: 
> the write-io getters below read from *syncIoRate* instead of *writeIoRate*:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat

2019-11-12 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14612:

Attachment: HDFS-14612-005.patch

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> ---
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, 
> HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, 
> HDFS-14612.patch
>
>
> I found that the SlowDiskReport won't update when slowDisks is always empty 
> in org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*; this may 
> leave an outdated SlowDiskReport staying in the namenode's jmx until the 
> next time slowDisks isn't empty. So I think the method 
> *checkAndUpdateReportIfNecessary()* should be called first whenever we fetch 
> the SlowDiskReport through jmx; this keeps the SlowDiskReport in jmx always 
> valid.
>  
> There are also some incorrect object references in 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.*DataNodeVolumeMetrics*: 
> the write-io getters below read from *syncIoRate* instead of *writeIoRate*:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


