Nicholas reviewed the hdfs-630 patch and suggested some improvements. Cosmin, the patch's author, obliged. After chatting with Nicholas and Cosmin, I will revert the hdfs-630 patch that is in TRUNK and, if the new patch passes Hudson, apply the new one instead. I will then put up a new vote to have the improved patch applied to 0.21.

Thanks to all who voted.
St.Ack

On Mon, Dec 14, 2009 at 9:56 PM, stack <st...@duboce.net> wrote:

> I'd like to propose a vote on having hdfs-630 committed to 0.21 (It's
> already been committed to TRUNK).
>
> hdfs-630 adds having the dfsclient pass the namenode the names of datanodes
> it's determined dead because it got a failed connection when it tried to
> contact them, etc. This is useful in the interval between a datanode dying
> and the namenode timing out its lease. Without this fix, the namenode can
> often give out the dead datanode as a host for a block. If the cluster is
> small, less than 5 or 6 nodes, then it's very likely the namenode will give
> out the dead datanode as a block host.
>
> Small clusters are common in hbase, especially when folks are starting out
> or evaluating hbase. They'll start with three or four nodes carrying both
> datanodes+hbase regionservers. They'll experiment killing one of the slaves
> -- datanode and regionserver -- and watch what happens. What follows is a
> struggling dfsclient trying to create replicas where one of the datanodes
> passed to us by the namenode is dead. DFSClient will fail and then go back
> to the namenode again, etc. (See
> https://issues.apache.org/jira/browse/HBASE-1876 for a more detailed
> blow-by-blow). HBase operation will be held up during this time and
> eventually a regionserver will shut itself down to protect itself against
> data loss if we can't successfully write to HDFS.
>
> Thanks all,
> St.Ack