[ https://issues.apache.org/jira/browse/HDFS-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lukas Majercak updated HDFS-11576:
----------------------------------
    Attachment: HDFS-11576.repro.patch

Patch for reproducing the issue

> Block recovery will fail indefinitely if recovery time > heartbeat interval
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11576
>                 URL: https://issues.apache.org/jira/browse/HDFS-11576
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs, namenode
>    Affects Versions: 2.7.1, 2.7.2, 2.7.3, 3.0.0-alpha1, 3.0.0-alpha2
>            Reporter: Lukas Majercak
>            Assignee: Lukas Majercak
>            Priority: Critical
>         Attachments: HDFS-11576.repro.patch
>
>
> Block recovery will fail indefinitely if the time to recover a block is
> always longer than the heartbeat interval. Scenario:
> 1. DN sends a heartbeat
> 2. NN sends a recovery command to DN, recoveryID=X
> 3. DN starts recovery
> 4. DN sends another heartbeat
> 5. NN sends a recovery command to DN, recoveryID=X+1
> 6. DN finishes the first recovery and calls commitBlockSynchronization on
> the NN, which fails because X < X+1
> ...
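
The scenario above can be illustrated with a small toy simulation. This is only a sketch of the race as described in the report, not HDFS code: `ToyNameNode`, `simulate`, and the integer time steps are hypothetical names and simplifications, and the model assumes the NN issues a fresh recovery ID on every heartbeat while the block is unrecovered and rejects any commit carrying an older ID.

```python
from dataclasses import dataclass


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for the NN: tracks the latest recovery ID of one block."""
    latest_recovery_id: int = 0
    recovered: bool = False

    def on_heartbeat(self):
        """Issue a new recovery ID while the block is unrecovered, else None."""
        if self.recovered:
            return None
        self.latest_recovery_id += 1
        return self.latest_recovery_id

    def commit_block_synchronization(self, recovery_id):
        """Accept the commit only if it carries the latest recovery ID."""
        if recovery_id < self.latest_recovery_id:
            return False  # stale ID: a newer recovery was already issued
        self.recovered = True
        return True


def simulate(recovery_time, heartbeat_interval, max_time):
    """Return the time step at which a commit succeeds, or None if none does."""
    nn = ToyNameNode()
    in_flight = []  # (finish_time, recovery_id) for recoveries the DN is running
    for now in range(max_time + 1):
        # The DN reports every recovery that finishes at this time step.
        for finish, rid in list(in_flight):
            if finish == now:
                in_flight.remove((finish, rid))
                if nn.commit_block_synchronization(rid):
                    return now
        # The DN heartbeats; the NN may answer with a new recovery command.
        if now % heartbeat_interval == 0:
            rid = nn.on_heartbeat()
            if rid is not None:
                in_flight.append((now + recovery_time, rid))
    return None


# Recovery slower than the heartbeat interval: every commit arrives stale.
print(simulate(recovery_time=5, heartbeat_interval=3, max_time=100))  # None
# Recovery faster than the heartbeat interval: the first commit is accepted.
print(simulate(recovery_time=2, heartbeat_interval=3, max_time=100))  # 2
```

In this model, raising `max_time` does not help the first case: each commit carries an ID that is already one behind by the time the recovery finishes, which matches the "fail indefinitely" behaviour in the report.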