Re: [VOTE CANCELLED] Commit hdfs-630 to 0.21?

stack Sun, 20 Dec 2009 16:49:32 -0800

Nicholas reviewed the hdfs-630 patch and made some suggestions for improvements.
Cosmin, the patch writer, obliged.  After chatting with Nicholas and
Cosmin, I will revert the hdfs-630 patch that is in TRUNK and, if the new
patch passes hudson, will apply it instead.  I will then put up a new vote
to have the improved patch applied to 0.21.


Thanks to all who voted.
St.Ack


On Mon, Dec 14, 2009 at 9:56 PM, stack <[email protected]> wrote:

> I'd like to propose a vote on having hdfs-630 committed to 0.21 (it's
> already been committed to TRUNK).
>
> hdfs-630 has the dfsclient pass the namenode the names of datanodes it
> has determined to be dead, for example because a connection failed when it
> tried to contact them.  This is useful in the interval between a datanode
> dying and the namenode timing out its lease.  Without this fix, the namenode
> can often give out the dead datanode as a host for a block.  If the cluster
> is small, fewer than 5 or 6 nodes, then it's very likely the namenode will
> give out the dead datanode as a block host.
>
> Small clusters are common in hbase, especially when folks are starting out
> or evaluating hbase.  They'll start with three or four nodes carrying both
> datanodes+hbase regionservers.  They'll experiment with killing one of the
> slaves -- datanode and regionserver -- and watch what happens.  What follows
> is a struggling dfsclient trying to create replicas where one of the
> datanodes passed to us by the namenode is dead.  DFSClient will fail and
> then go back to the namenode again, etc. (See
> https://issues.apache.org/jira/browse/HBASE-1876 for a more detailed
> blow-by-blow).  HBase operation will be held up during this time and
> eventually a regionserver will shut itself down to protect itself against
> data loss if we can't successfully write to HDFS.
>
> Thanks all,
> St.Ack
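
The behaviour described in the quoted message can be illustrated with a toy
model.  This is only a sketch under stated assumptions, not the hdfs-630 patch
or the HDFS API: the class ExcludeSketch, the methods chooseTargets and
allocateWithRetry, and the node names dn1..dn4 are all hypothetical.  The
"namenode" side skips any node the client has reported as dead, and the
"client" side remembers each node it failed to reach and passes that list back
on its next request.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model, NOT the HDFS API: all names here are illustrative.
public class ExcludeSketch {

    // "Namenode" side: pick up to `replicas` hosts in order, skipping any
    // node the client has reported as dead.
    public static List<String> chooseTargets(List<String> knownNodes,
                                             Set<String> excluded,
                                             int replicas) {
        List<String> targets = new ArrayList<>();
        for (String node : knownNodes) {
            if (targets.size() == replicas) {
                break;
            }
            if (!excluded.contains(node)) {
                targets.add(node);
            }
        }
        return targets;
    }

    // "Client" side: ask for targets, and if one of them turns out to be
    // unreachable, add it to the excluded set and ask again.
    public static List<String> allocateWithRetry(List<String> knownNodes,
                                                 Set<String> actuallyDead,
                                                 int replicas,
                                                 int maxRetries) {
        Set<String> excluded = new LinkedHashSet<>();
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            List<String> targets = chooseTargets(knownNodes, excluded, replicas);
            String failed = null;
            for (String t : targets) {
                if (actuallyDead.contains(t)) { // stands in for a failed connection
                    failed = t;
                    break;
                }
            }
            if (failed == null) {
                return targets; // every offered node was reachable
            }
            excluded.add(failed); // report this node on the next request
        }
        throw new IllegalStateException("could not allocate a pipeline");
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("dn1", "dn2", "dn3", "dn4");
        // dn2 is dead but the "namenode" has not noticed yet.
        System.out.println(allocateWithRetry(nodes, Set.of("dn2"), 3, 3));
        // prints [dn1, dn3, dn4]
    }
}
```

Without the excluded set, chooseTargets would hand back dn2 on every attempt
in this toy model, which is the retry loop the quoted message describes on
small clusters.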
