[ https://issues.apache.org/jira/browse/HDFS-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715564#comment-14715564 ]
Yongjun Zhang commented on HDFS-8960:
-------------------------------------

Hi [~tsuna],

Thanks for reporting the issue. While the reason the different DNs failed is still being investigated, would you please try setting {{dfs.client.block.write.replace-datanode-on-failure.best-effort}} to {{true}} in hdfs-site.xml and restarting the HBase RS? For some more details, see http://blog.cloudera.com/blog/2015/03/understanding-hdfs-recovery-processes-part-2/

Thanks.

> DFS client says "no more good datanodes being available to try" on a single
> drive failure
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-8960
>                 URL: https://issues.apache.org/jira/browse/HDFS-8960
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.7.1
>         Environment: openjdk version "1.8.0_45-internal"
> OpenJDK Runtime Environment (build 1.8.0_45-internal-b14)
> OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode)
>            Reporter: Benoit Sigoure
>
> Since we upgraded to 2.7.1 we regularly see single-drive failures cause
> widespread problems at the HBase level (with the default 3x replication
> target).
> Here's an example. This HBase RegionServer is r12s16 (172.24.32.16) and is
> writing its WAL to [172.24.32.16:10110, 172.24.32.8:10110,
> 172.24.32.13:10110], as can be seen from the following occasional messages:
> {code}
> 2015-08-23 06:28:40,272 INFO [sync.3] wal.FSHLog: Slow sync cost: 123 ms,
> current pipeline: [172.24.32.16:10110, 172.24.32.8:10110, 172.24.32.13:10110]
> {code}
> A bit later, the second node in the pipeline above is going to experience an
> HDD failure:
> {code}
> 2015-08-23 07:21:58,720 WARN [DataStreamer for file
> /hbase/WALs/r12s16.sjc.aristanetworks.com,9104,1439917659071/r12s16.sjc.aristanetworks.com%2C9104%2C1439917659071.default.1440314434998
> block BP-1466258523-172.24.32.1-1437768622582:blk_1073817519_77099]
> hdfs.DFSClient: Error Recovery for block
> BP-1466258523-172.24.32.1-1437768622582:blk_1073817519_77099 in pipeline
> 172.24.32.16:10110, 172.24.32.13:10110, 172.24.32.8:10110: bad datanode
> 172.24.32.8:10110
> {code}
> And then HBase will go like "omg I can't write to my WAL, let me commit
> suicide":
> {code}
> 2015-08-23 07:22:26,060 FATAL
> [regionserver/r12s16.sjc.aristanetworks.com/172.24.32.16:9104.append-pool1-t1]
> wal.FSHLog: Could not append. Requesting close of wal
> java.io.IOException: Failed to replace a bad datanode on the existing
> pipeline due to no more good datanodes being available to try. (Nodes:
> current=[172.24.32.16:10110, 172.24.32.13:10110],
> original=[172.24.32.16:10110, 172.24.32.13:10110]). The current failed
> datanode replacement policy is DEFAULT, and a client may configure this via
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its
> configuration.
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:969)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1035)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1184)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
> {code}
> Whereas this should be mostly a non-event, as the DFS client should just drop
> the bad replica from the write pipeline.
> This is a small cluster, but it has 16 DNs, so the failed DN in the pipeline
> should be easily replaced. I didn't set
> {{dfs.client.block.write.replace-datanode-on-failure.policy}} (so it's still
> {{DEFAULT}}) and didn't set
> {{dfs.client.block.write.replace-datanode-on-failure.enable}} (so it's still
> {{true}}).
> I don't see anything noteworthy in the NN log around the time of the failure;
> it just seems like the DFS client gave up, or threw an exception back to HBase
> that it wasn't throwing before, or something else, and that made this
> single-drive failure lethal.
> We've occasionally been "unlucky" enough to have a single-drive failure cause
> multiple RegionServers to commit suicide because they had their WALs on that
> drive.
> We upgraded from 2.7.0 about a month ago, and I'm not sure whether we were
> seeing this with 2.7 or not; prior to that we were running in a quite
> different environment, but this is a fairly new deployment.
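For illustration only, here is a minimal sketch of the client-side settings discussed in this issue, assuming they go into the hdfs-site.xml read by the HBase RegionServer. The property names come from the comment and the exception above; the first two entries merely restate the defaults the reporter says are in effect, and only the best-effort flag is the suggested change.

{code}
<!-- hdfs-site.xml on the HBase RegionServer host (client-side settings; sketch only) -->
<configuration>

  <!-- Reported as left at their defaults in this issue. -->
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>DEFAULT</value>
  </property>

  <!-- Suggested change: if no replacement datanode can be found during
       pipeline recovery, keep writing to the remaining datanodes instead
       of failing the write. The RegionServer must be restarted to pick
       this up. -->
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
    <value>true</value>
  </property>

</configuration>
{code}

With best-effort enabled, a client that cannot find a replacement datanode continues writing to the remaining replicas instead of failing the write, at the cost of temporarily reduced replication for the block being written; the Cloudera blog post linked in the comment above covers the recovery behaviour in more detail.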