[jira] [Comment Edited] (HDFS-15618) Improve datanode shutdown latency

2020-10-19 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17216873#comment-17216873
 ] 

Ahmed Hussein edited comment on HDFS-15618 at 10/19/20, 4:31 PM:
-

These are the Jiras filed for the failing test cases.

HDFS-15461: TestDFSClientRetries#testGetFileChecksum fails intermittently
 HDFS-15643: TestFileChecksumCompositeCrc fails on trunk
 HDFS-15308: TestReconstructStripedFile.testNNSendsErasureCodingTasks is flaky
 HDFS-12449: TestReconstructStripedFile.testNNSendsErasureCodingTasks randomly 
cannot finish in 60s
 HADOOP-16780: Track unstable tests according to aarch CI due to OOM. It seems 
this Jira should be reopened.


was (Author: ahussein):
Those are liras filed for the failing test cases.

HDFS-15461: TestDFSClientRetries#testGetFileChecksum fails intermittently
 HDFS-15643: TestFileChecksumCompositeCrc fails on trunk
 HDFS-15308: TestReconstructStripedFile.testNNSendsErasureCodingTasks is flaky
 HDFS-12449: TestReconstructStripedFile.testNNSendsErasureCodingTasks randomly 
cannot finish in 60s
 HADOOP-16780: Track unstable tests according to aarch CI due to OOM . It seems 
this Mira should be reopened.

> Improve datanode shutdown latency
> -
>
> Key: HDFS-15618
> URL: https://issues.apache.org/jira/browse/HDFS-15618
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: HDFS-15618.001.patch, HDFS-15618.002.patch, 
> HDFS-15618.003.patch, HDFS-15618.004.patch
>
>
> Shutting down a Datanode takes a very long time. The block scanner waits up to 5 
> minutes to join on each VolumeScanner thread.
> Since the scanners are daemon threads and do not alter the block content, it 
> is safe to skip this wait when the Datanode shuts down.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15618) Improve datanode shutdown latency

2020-10-09 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211046#comment-17211046
 ] 

Ahmed Hussein edited comment on HDFS-15618 at 10/9/20, 2:23 PM:


I added a configuration key {{dfs.block.scanner.volume.join.timeout.ms}} that 
controls how long the calling thread waits to join on a {{VolumeScanner}} 
thread before timing out. This value is only used within 
{{BlockScanner.removeAllVolumeScanners()}}.
This parameter can be used to switch between "fast mode" and "slow mode".
A small value guarantees that the {{Datanode}} proceeds to shut down without 
waiting for the {{VolumeScanner}} to finish.

The default value is set to 1 minute. I tried setting the default to 500 ms, 
but that breaks some test cases that expect certain timing behavior.
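
To illustrate the idea of a bounded join, here is a minimal, self-contained sketch in plain Java. It is not the actual {{BlockScanner}} code: the class {{BoundedJoinSketch}}, the methods {{startStubbornWorker}} and {{stopAll}}, and the parameter {{joinTimeoutMs}} are all hypothetical names chosen for this example. The stubborn worker stands in for a scanner thread that is slow to stop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class BoundedJoinSketch {

    // Hypothetical stand-in for a scanner thread that is slow to stop:
    // it ignores interrupts and keeps running for roughly workMillis.
    static Thread startStubbornWorker(String name, long workMillis) {
        final long deadline =
            System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(workMillis);
        Thread t = new Thread(() -> {
            while (System.nanoTime() - deadline < 0) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException ignored) {
                    // Deliberately keep going to model a slow shutdown.
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
        return t;
    }

    // Interrupts every worker, then waits at most joinTimeoutMs for each one.
    // Returns how many workers were still alive when the wait ended.
    static int stopAll(List<Thread> workers, long joinTimeoutMs) {
        if (joinTimeoutMs <= 0) {
            // Thread.join(0) means "wait forever", which defeats the purpose.
            throw new IllegalArgumentException("joinTimeoutMs must be positive");
        }
        for (Thread t : workers) {
            t.interrupt();
        }
        int stillAlive = 0;
        for (Thread t : workers) {
            try {
                t.join(joinTimeoutMs);
            } catch (InterruptedException e) {
                // Restore the flag; later joins then return immediately.
                Thread.currentThread().interrupt();
            }
            if (t.isAlive()) {
                stillAlive++;
            }
        }
        return stillAlive;
    }

    public static void main(String[] args) {
        List<Thread> workers = new ArrayList<>();
        workers.add(startStubbornWorker("worker-1", 2000));
        workers.add(startStubbornWorker("worker-2", 2000));

        long start = System.nanoTime();
        int stillAlive = stopAll(workers, 100);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        System.out.println("still alive: " + stillAlive
            + ", waited about " + elapsedMs + " ms");
    }
}
```

Because the join is applied to each thread in turn, the worst-case wait in this sketch grows with the number of threads multiplied by the timeout, which is why a smaller timeout shortens shutdown.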


was (Author: ahussein):
I added a configuration key {{dfs.block.scanner.volume.join.timeout.ms}} that 
controls the duration the thread times out waiting to join on a 
{{VolumeScanner}} thread. This value is only used within 
{{BlockScanner.removeAllVolumeScanners()}}.
This parameter can be used to switch between "fast mode" vs "slow mode".
A small value guarantees that the {{Datanode}} will proceed to shutdown without 
waiting for the {{VolumeScanner}} to finish.

> Improve datanode shutdown latency
> -
>
> Key: HDFS-15618
> URL: https://issues.apache.org/jira/browse/HDFS-15618
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
> Attachments: HDFS-15618.001.patch
>
>
> Shutting down a Datanode takes a very long time. The block scanner waits up to 5 
> minutes to join on each VolumeScanner thread.
> Since the scanners are daemon threads and do not alter the block content, it 
> is safe to skip this wait when the Datanode shuts down.


