[ https://issues.apache.org/jira/browse/HDFS-8960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715564#comment-14715564 ]

Yongjun Zhang commented on HDFS-8960:
-------------------------------------

Hi [~tsuna],

Thanks for reporting the issue. While the reason for the failures on the 
different DNs is still to be determined, would you please try setting 
{{dfs.client.block.write.replace-datanode-on-failure.best-effort}} to true in 
hdfs-site.xml and restarting the HBase RS?
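
For reference, a minimal client-side hdfs-site.xml sketch (the property name 
is the one above; placing it on each RegionServer host is an assumption about 
your deployment):

{code}
<!-- Sketch only: client-side hdfs-site.xml on each HBase RegionServer host.
     With best-effort enabled, the client keeps writing with the remaining
     datanodes when it cannot find a replacement, instead of failing the
     pipeline recovery. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>
</property>
{code}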

For more details, see 
http://blog.cloudera.com/blog/2015/03/understanding-hdfs-recovery-processes-part-2/

Thanks.



> DFS client says "no more good datanodes being available to try" on a single 
> drive failure
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-8960
>                 URL: https://issues.apache.org/jira/browse/HDFS-8960
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.7.1
>         Environment: openjdk version "1.8.0_45-internal"
> OpenJDK Runtime Environment (build 1.8.0_45-internal-b14)
> OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode)
>            Reporter: Benoit Sigoure
>
> Since we upgraded to 2.7.1 we regularly see single-drive failures cause 
> widespread problems at the HBase level (with the default 3x replication 
> target).
> Here's an example.  This HBase RegionServer is r12s16 (172.24.32.16) and is 
> writing its WAL to [172.24.32.16:10110, 172.24.32.8:10110, 
> 172.24.32.13:10110], as shown by occasional messages like the following:
> {code}
> 2015-08-23 06:28:40,272 INFO  [sync.3] wal.FSHLog: Slow sync cost: 123 ms, 
> current pipeline: [172.24.32.16:10110, 172.24.32.8:10110, 172.24.32.13:10110]
> {code}
> A bit later, the second node in the pipeline above (172.24.32.8:10110) 
> experiences an HDD failure.
> {code}
> 2015-08-23 07:21:58,720 WARN  [DataStreamer for file 
> /hbase/WALs/r12s16.sjc.aristanetworks.com,9104,1439917659071/r12s16.sjc.aristanetworks.com%2C9104%2C1439917659071.default.1440314434998
>  block BP-1466258523-172.24.32.1-1437768622582:blk_1073817519_77099] 
> hdfs.DFSClient: Error Recovery for block 
> BP-1466258523-172.24.32.1-1437768622582:blk_1073817519_77099 in pipeline 
> 172.24.32.16:10110, 172.24.32.13:10110, 172.24.32.8:10110: bad datanode 
> 172.24.32.8:10110
> {code}
> And then HBase decides it can no longer write to its WAL and commits 
> suicide:
> {code}
> 2015-08-23 07:22:26,060 FATAL 
> [regionserver/r12s16.sjc.aristanetworks.com/172.24.32.16:9104.append-pool1-t1]
>  wal.FSHLog: Could not append. Requesting close of wal
> java.io.IOException: Failed to replace a bad datanode on the existing 
> pipeline due to no more good datanodes being available to try. (Nodes: 
> current=[172.24.32.16:10110, 172.24.32.13:10110], 
> original=[172.24.32.16:10110, 172.24.32.13:10110]). The current failed 
> datanode replacement policy is DEFAULT, and a client may configure this via 
> 'dfs.client.block.write.replace-datanode-on-failure.policy' in its 
> configuration.
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:969)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1035)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1184)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
>         at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
> {code}
> This should be mostly a non-event, as the DFS client should just drop the 
> bad replica from the write pipeline.
> This is a small cluster, but it has 16 DNs, so the failed DN in the pipeline 
> should be easily replaced.  I didn't set 
> {{dfs.client.block.write.replace-datanode-on-failure.policy}} (so it's still 
> {{DEFAULT}}) and didn't set 
> {{dfs.client.block.write.replace-datanode-on-failure.enable}} (so it's still 
> {{true}}).
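> For reference, those settings correspond to the following defaults (a sketch, 
> not copied from the actual config; best-effort defaults to false unless 
> overridden):
> {code}
> <!-- Sketch of the effective replace-datanode-on-failure defaults. -->
> <property>
>   <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
>   <value>true</value>
> </property>
> <property>
>   <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
>   <value>DEFAULT</value>
> </property>
> <property>
>   <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
>   <value>false</value>
> </property>
> {code}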
> I don't see anything noteworthy in the NN log around the time of the failure; 
> it just seems like the DFS client gave up, or threw an exception back to HBase 
> that it wasn't throwing before, or something else changed, and that made this 
> single-drive failure lethal.
> We've occasionally been "unlucky" enough to have a single-drive failure cause 
> multiple RegionServers to commit suicide because they had their WALs on that 
> drive.
> We upgraded from 2.7.0 about a month ago, and I'm not sure whether we were 
> seeing this with 2.7.0 or not; prior to that we were running in a quite 
> different environment, but this is a fairly new deployment.



