[ https://issues.apache.org/jira/browse/HDFS-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HDFS-13709:
-----------------------------------
    Fix Version/s: 3.1.3
                   3.2.1
                   3.3.0
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

Pushed 005 to trunk, branch-3.2 and branch-3.1.
There are conflicts cherry-picking the commit into branch-2 and lower. [~zhangchen], if you are interested, please provide a branch-2 patch.

Thanks [~zhangchen] and [~sodonnell]!

> Report bad block to NN when transfer block encounter EIO exception
> ------------------------------------------------------------------
>
>                 Key: HDFS-13709
>                 URL: https://issues.apache.org/jira/browse/HDFS-13709
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Chen Zhang
>            Assignee: Chen Zhang
>            Priority: Major
>             Fix For: 3.3.0, 3.2.1, 3.1.3
>
>         Attachments: HDFS-13709.002.patch, HDFS-13709.003.patch,
> HDFS-13709.004.patch, HDFS-13709.005.patch, HDFS-13709.patch
>
>
> In our online cluster, the BlockPoolSliceScanner is turned off, and sometimes
> a bad disk track can cause data loss.
> For example, suppose there are 3 replicas on 3 machines A/B/C. If a bad track
> occurs on A's replica data, and someday B and C crash at the same time, the NN
> will try to replicate the data from A and fail. The block is now corrupt, but
> no one knows, because the NN thinks there is at least 1 healthy replica and
> keeps trying to replicate it.
> When reading a replica that has data on a bad track, the OS returns an EIO
> error. If the DN reports the bad block as soon as it gets an EIO, we can
> detect this case ASAP and try to avoid data loss.
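
The behaviour proposed in the description can be illustrated with a minimal sketch. It is written in Python for brevity and is not the committed Hadoop (Java) patch. Every name in it (`BadBlockReporter`, `report_bad_block`, `transfer_block`, and the `read_replica` and `send` callbacks) is hypothetical and exists only for illustration. The sketch assumes a bad track surfaces as an `OSError` whose `errno` is `EIO`.

```python
import errno


class BadBlockReporter:
    """Hypothetical stand-in for the DN -> NN bad-block report channel."""

    def __init__(self):
        self.reported = []

    def report_bad_block(self, block_id):
        # A real DataNode would notify the NameNode; here we only record it.
        self.reported.append(block_id)


def transfer_block(block_id, read_replica, send, reporter):
    """Read a local replica and send it to the target.

    If the read fails with EIO (assumed here to indicate a bad disk track),
    report the block as bad immediately and return False. Any other error
    is re-raised unchanged. Returns True when the transfer succeeds.
    """
    try:
        data = read_replica(block_id)
    except OSError as e:
        if e.errno == errno.EIO:
            reporter.report_bad_block(block_id)
            return False
        raise
    send(data)
    return True
```

Reporting at the moment of the failed read means the NN learns that the replica on A is unusable without waiting for a background scanner, which is the case the description is concerned with.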