[jira] [Updated] (HDFS-8163) Using monotonicNow for block report scheduling causes test failures on recently restarted systems
[ https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-8163:
-----------------------------
    Fix Version/s: 2.8.0

> Using monotonicNow for block report scheduling causes test failures on recently restarted systems
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8163
>                 URL: https://issues.apache.org/jira/browse/HDFS-8163
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.1
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>            Priority: Blocker
>             Fix For: 2.8.0, 2.7.1, 3.0.0-alpha1
>
>         Attachments: HDFS-8163.01.patch, HDFS-8163.02.patch, HDFS-8163.03.patch
>
>
> {{BPServiceActor#blockReport}} has the following check:
> {code}
>   List<DatanodeCommand> blockReport() throws IOException {
>     // send block report if timer has expired.
>     final long startTime = monotonicNow();
>     if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
>       return null;
>     }
> {code}
> Many tests trigger an immediate block report via {{BPServiceActor#triggerBlockReportForTests}} which sets {{lastBlockReport = 0}}. However if the machine was restarted recently then startTime may be less than {{dnConf.blockReportInterval}} and the block report is not sent.
> {{Time#monotonicNow}} uses {{System#nanoTime}} which represents time elapsed since an arbitrary origin. The time should be used only for comparison with other values returned by {{System#nanoTime}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
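Editorial illustration of the failure described in the issue; it is not code from any of the attached patches. The class and method names are hypothetical, the six-hour interval and the ten-minute clock reading are assumed values, and the clock reading is passed in as a parameter so the behaviour can be reproduced on any machine. The sketch assumes, as the report implies, that the {{System#nanoTime}} origin can be close to the last restart.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not taken from the HDFS-8163 patches.
final class BlockReportTimerSketch {

    /** Milliseconds derived from System.nanoTime(); only differences between readings are meaningful. */
    static long monotonicNow() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
    }

    /** The check quoted in the issue, with the clock reading passed in. */
    static boolean isDueByLastReport(long now, long lastBlockReport, long intervalMs) {
        return now - lastBlockReport > intervalMs;
    }

    /** Assumed alternative: keep the next due time, always derived from the same clock. */
    static boolean isDueByNextTime(long now, long nextBlockReportTime) {
        return now - nextBlockReportTime >= 0;
    }

    public static void main(String[] args) {
        long intervalMs = TimeUnit.HOURS.toMillis(6);  // assumed block report interval
        long now = TimeUnit.MINUTES.toMillis(10);      // assumed clock reading soon after a restart

        // Test trigger from the issue: lastBlockReport = 0 is meant to read as "long ago".
        System.out.println(isDueByLastReport(now, 0L, intervalMs)); // false: 10 min is not > 6 h

        // Trigger expressed relative to the clock: "due now" is nextBlockReportTime = now.
        System.out.println(isDueByNextTime(now, now));              // true

        // The same holds for a live reading, whatever the origin happens to be.
        long live = monotonicNow();
        System.out.println(isDueByNextTime(live, live));            // true
    }
}
```

The point of the second form is that the constant 0 never enters the comparison, so the result does not depend on where the clock's origin lies.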
[jira] [Updated] (HDFS-8163) Using monotonicNow for block report scheduling causes test failures on recently restarted systems
[ https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-8163:
--------------------------------
          Resolution: Fixed
       Fix Version/s: 2.7.1
    Target Version/s:   (was: 2.7.1)
        Hadoop Flags: Reviewed
              Status: Resolved  (was: Patch Available)

Thanks for the review Jing. Fixed the formatting and committed to trunk, branch-2 and branch-2.7.

Here is the delta:
{code:java}
-// assigned/read by the actor thread. Thus they should be declared as vol
-// to make sure the happens-before consistency.
-@VisibleForTesting volatile long nextBlockReportTime = monotonicNow();
-@VisibleForTesting volatile long nextHeartbeatTime = monotonicNow();
-@VisibleForTesting boolean resetBlockReportTime = true;
+// assigned/read by the actor thread.
+@VisibleForTesting
+volatile long nextBlockReportTime = monotonicNow();
+
+@VisibleForTesting
+volatile long nextHeartbeatTime = monotonicNow();
+
+@VisibleForTesting
+boolean resetBlockReportTime = true;
{code}
[jira] [Updated] (HDFS-8163) Using monotonicNow for block report scheduling causes test failures on recently restarted systems
[ https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-8163:
--------------------------------
    Attachment: HDFS-8163.03.patch

Updated v03 patch to handle negative values and overflows. Split out the timestamp manipulation into a nested class Scheduler.

Reviews welcome!
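Editorial illustration of what handling negative values and overflows can look like for timestamps derived from {{System#nanoTime}}. The contents of the v03 patch are not shown in this thread, so everything below is an assumption: the class name, its fields and its methods are hypothetical and are not the Scheduler class from the patch. The sketch compares timestamps by their difference, which stays correct under Java's wrapping long arithmetic as long as the two readings are less than 2^63 apart.

```java
// Hypothetical sketch; not the Scheduler class from HDFS-8163.03.patch.
final class SchedulerSketch {
    private final long blockReportIntervalMs;
    private long nextBlockReportTime;

    /** Starts with a block report due immediately at the given clock reading. */
    SchedulerSketch(long blockReportIntervalMs, long now) {
        this.blockReportIntervalMs = blockReportIntervalMs;
        this.nextBlockReportTime = now;
    }

    /** Compares by difference, so negative readings and wrapped sums behave correctly. */
    boolean isBlockReportDue(long now) {
        return nextBlockReportTime - now <= 0;
    }

    /** The sum may wrap past Long.MAX_VALUE; the difference in isBlockReportDue still works. */
    void scheduleNextBlockReport(long now) {
        nextBlockReportTime = now + blockReportIntervalMs;
    }

    /** Test hook: "due now" is expressed relative to the clock, never as the constant 0. */
    void triggerBlockReportForTests(long now) {
        nextBlockReportTime = now;
    }

    public static void main(String[] args) {
        long interval = 10;
        long now = Long.MAX_VALUE - 5;            // clock reading close to overflow

        SchedulerSketch s = new SchedulerSketch(interval, now);
        s.scheduleNextBlockReport(now);           // next due time wraps to a negative value
        System.out.println(s.isBlockReportDue(now));            // false: 10 ms still to go
        System.out.println(s.isBlockReportDue(now + interval)); // true: interval has elapsed

        SchedulerSketch negative = new SchedulerSketch(interval, -100L);
        System.out.println(negative.isBlockReportDue(-100L));   // true: starts out due
    }
}
```

A direct comparison such as `now >= nextBlockReportTime` would report the first case as due, because the wrapped due time is negative while the clock reading is still large and positive.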
[jira] [Updated] (HDFS-8163) Using monotonicNow for block report scheduling causes test failures on recently restarted systems
[ https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-8163:
--------------------------------
    Attachment: HDFS-8163.01.patch

Preliminary v01 patch for Jenkins. Not ready for review yet.
[jira] [Updated] (HDFS-8163) Using monotonicNow for block report scheduling causes test failures on recently restarted systems
[ https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-8163:
--------------------------------
    Description: 
{{BPServiceActor#blockReport}} has the following check:

{code}
  List<DatanodeCommand> blockReport() throws IOException {
    // send block report if timer has expired.
    final long startTime = monotonicNow();
    if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
      return null;
    }
{code}

Many tests trigger an immediate block report via {{BPServiceActor#triggerBlockReportForTests}} which sets {{lastBlockReport = 0}}. However if the machine was restarted recently then startTime may be less than {{dnConf.blockReportInterval}} and the block report is not sent.

{{Time#monotonicNow}} uses {{System#nanoTime}} which represents time elapsed since an arbitrary origin. The time should be used only for comparison with other values returned by {{System#nanoTime}}.

  was:
{{BPServiceActor#blockReport}} has the following check:

{code}
  List<DatanodeCommand> blockReport() throws IOException {
    // send block report if timer has expired.
    final long startTime = monotonicNow();
    if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
      return null;
    }
{code}

Many tests trigger an immediate block report via {{BPServiceActor#triggerBlockReportForTests}} which sets {{lastBlockReport = 0}}. However if the machine was restarted recently then startTime will be less than {{dnConf.blockReportInterval}} and the block report is not sent.

{{Time#monotonicNow}} uses {{System#nanoTime}} which represents time elapsed since an arbitrary origin. The time should be used only for comparison with other values returned by {{System#nanoTime}}.
[jira] [Updated] (HDFS-8163) Using monotonicNow for block report scheduling causes test failures on recently restarted systems
[ https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-8163:
--------------------------------
    Description: 
{{BPServiceActor#blockReport}} has the following check:

{code}
  List<DatanodeCommand> blockReport() throws IOException {
    // send block report if timer has expired.
    final long startTime = monotonicNow();
    if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
      return null;
    }
{code}

Many tests trigger an immediate block report via {{BPServiceActor#triggerBlockReportForTests}} which sets {{lastBlockReport = 0}}. However if the machine was restarted recently then startTime will be less than {{dnConf.blockReportInterval}} and the block report is not sent.

{{Time#monotonicNow}} uses {{System#nanoTime}} which represents time elapsed since an arbitrary origin. The time should be used only for comparison with other values returned by {{System#nanoTime}}.

  was:
{{BPServiceActor#blockReport}} has the following check:

{code}
  List<DatanodeCommand> blockReport() throws IOException {
    // send block report if timer has expired.
    final long startTime = monotonicNow();
    if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
      return null;
    }
{code}

Many tests set lastBlockReport to zero to trigger an immediate block report via {{BPServiceActor#triggerBlockReportForTests}}. However if the machine was restarted recently then this startTime could be less than {{dnConf.blockReportInterval}} and the block report is not sent.

{{Time#monotonicNow}} uses {{System#nanoTime}} which represents time elapsed since an arbitrary origin. The time should be used only for comparison with values returned by {{System#nanoTime}}.
[jira] [Updated] (HDFS-8163) Using monotonicNow for block report scheduling causes test failures on recently restarted systems
[ https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-8163:
--------------------------------
    Status: Patch Available  (was: In Progress)
[jira] [Updated] (HDFS-8163) Using monotonicNow for block report scheduling causes test failures on recently restarted systems
[ https://issues.apache.org/jira/browse/HDFS-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arpit Agarwal updated HDFS-8163:
--------------------------------
    Attachment: HDFS-8163.02.patch