[ https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-8163:
--------------------------------
    Description:
{{BPServiceActor#blockReport}} has the following check:

{code}
  List<DatanodeCommand> blockReport() throws IOException {
    // send block report if timer has expired.
    final long startTime = monotonicNow();
    if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
      return null;
    }
{code}

Many tests trigger an immediate block report via {{BPServiceActor#triggerBlockReportForTests}} which sets {{lastBlockReport = 0}}. However if the machine was restarted recently then startTime will be less than {{dnConf.blockReportInterval}} and the block report is not sent.

{{Time#monotonicNow}} uses {{System#nanoTime}} which represents time elapsed since an arbitrary origin. The time should be used only for comparison with other values returned by {{System#nanoTime}}.

  was:
{{BPServiceActor#blockReport}} has the following check:

{code}
  List<DatanodeCommand> blockReport() throws IOException {
    // send block report if timer has expired.
    final long startTime = monotonicNow();
    if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
      return null;
    }
{code}

Many tests set lastBlockReport to zero to trigger an immediate block report via {{BPServiceActor#triggerBlockReportForTests}}. However if the machine was restarted recently then this startTime could be less than {{dnConf.blockReportInterval}} and the block report is not sent.

{{Time#monotonicNow}} uses {{System#nanoTime}} which represents time elapsed since an arbitrary origin. The time should be used only for comparison with values returned by {{System#nanoTime}}.

> Using monotonicNow for block report scheduling causes test failures on recently restarted systems
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8163
>                 URL: https://issues.apache.org/jira/browse/HDFS-8163
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>   Affects Versions: 2.6.1
>            Reporter: Arpit Agarwal
>            Priority: Blocker
>
> {{BPServiceActor#blockReport}} has the following check:
> {code}
>   List<DatanodeCommand> blockReport() throws IOException {
>     // send block report if timer has expired.
>     final long startTime = monotonicNow();
>     if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
>       return null;
>     }
> {code}
> Many tests trigger an immediate block report via {{BPServiceActor#triggerBlockReportForTests}} which sets {{lastBlockReport = 0}}. However if the machine was restarted recently then startTime will be less than {{dnConf.blockReportInterval}} and the block report is not sent.
> {{Time#monotonicNow}} uses {{System#nanoTime}} which represents time elapsed since an arbitrary origin. The time should be used only for comparison with other values returned by {{System#nanoTime}}.
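
The failure mode in the description can be reproduced with plain arithmetic. The following is a minimal sketch, not the HDFS code: the class name, the method names, and the 6-hour interval are assumptions chosen for illustration, and only the comparison itself mirrors the quoted check. It also shows one possible way to force a report without treating zero as a sentinel; this is not necessarily the fix adopted for this issue.

```java
import java.util.concurrent.TimeUnit;

public class BlockReportTimerSketch {
    // Hypothetical stand-in for dnConf.blockReportInterval, in milliseconds.
    // The 6-hour value is an assumption used only for this illustration.
    static final long BLOCK_REPORT_INTERVAL_MS = TimeUnit.HOURS.toMillis(6);

    // Mirrors the quoted check: a report is sent only when more than one
    // interval has elapsed between lastBlockReportMs and nowMs.
    static boolean isReportDue(long nowMs, long lastBlockReportMs) {
        return nowMs - lastBlockReportMs > BLOCK_REPORT_INTERVAL_MS;
    }

    // Hypothetical alternative trigger: compute the "last report" time
    // relative to the current monotonic reading, so the result does not
    // depend on where the monotonic clock's origin happens to be.
    static long lastReportTimeThatForcesReport(long nowMs) {
        return nowMs - BLOCK_REPORT_INTERVAL_MS - 1;
    }

    public static void main(String[] args) {
        // Assume the monotonic clock's origin is the last boot (a common
        // but unspecified behavior of System.nanoTime).
        long tenMinutesAfterOrigin = TimeUnit.MINUTES.toMillis(10);
        long sevenHoursAfterOrigin = TimeUnit.HOURS.toMillis(7);

        // lastBlockReport = 0 shortly after the origin: the report is skipped.
        System.out.println(isReportDue(tenMinutesAfterOrigin, 0L));   // false

        // The same trigger long after the origin: the report is sent.
        System.out.println(isReportDue(sevenHoursAfterOrigin, 0L));   // true

        // A relative trigger behaves the same in both situations.
        System.out.println(isReportDue(tenMinutesAfterOrigin,
            lastReportTimeThatForcesReport(tenMinutesAfterOrigin)));  // true
    }
}
```

The relative trigger works because both operands of the subtraction come from the same clock, which is the only kind of comparison {{System#nanoTime}} values are meant for.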