On 08/21/2012 02:29 AM, MORITA Kazutaka wrote:
> I think delaying recovery for a few seconds always is useful for many
> users. Under heavy network load, sheep can wrongly detect node
> failure and node membership can change frequently. Delaying recovery
> for a short time makes Sheepdog tolerant against such situation.
I think your example is very vague. Which cluster driver are you using? Sheep itself does not sense membership; it relies on the cluster driver to maintain it. Could you detail exactly how this happens in a real case?

If you are talking about the network partition problem, I don't think delaying recovery will help solve it. We have met network partitions with the corosync driver; with the zookeeper driver we haven't met one yet (I guess we won't, since zookeeper acts as a central membership control).

Suppose we have a 6-node cluster A, B, C, D, E, F with one copy and epoch = 1. At time t1 the network partitions and three partitions show up: c1(A,B,C), c2(D,E), c3(F). The epochs for the three partitions become epoch(c1) = 4, epoch(c2) = 5, epoch(c3) = 6, and all three partitions proceed to recover and apply updates to their own local objects. Now suppose, as in your example, these three partitions automatically merge back into one. After merging:

1) epoch(c1) = 7, epoch(c2) = 9, epoch(c3) = 11
2) there is no code to handle the differing versions of an object, and every node thinks its own local version is the correct one.

So I think we have to handle the epoch mismatch and object multi-version problems before evaluating delayed recovery for network partitions.

If you are not talking about the network partition problem, then the only case I can see is stopping/restarting a node for manual maintenance, where I think manual recovery could really be helpful.

Thanks,
Yuan
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog
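P.S. To make the divergence scenario above concrete, here is a minimal toy model in Python. This is purely illustrative and hypothetical, not Sheepdog code: the Partition class, the epoch numbers, and the object versions just follow the A..F example from the mail. It shows why an automatic merge is unsafe today: each partition writes the same object under its own epoch, and on merge there is no rule to pick a winner.

```python
# Toy model of epoch divergence under a network partition.
# Hypothetical sketch following the example in this mail; not real Sheepdog code.

class Partition:
    def __init__(self, nodes, epoch):
        self.nodes = nodes
        self.epoch = epoch
        self.objects = {}  # object id -> (epoch, data) written locally

    def write(self, oid, data):
        # each partition stamps writes with its own, independently bumped epoch
        self.objects[oid] = (self.epoch, data)

# epoch = 1 cluster splits into three partitions with diverged epochs
c1 = Partition(["A", "B", "C"], epoch=4)
c2 = Partition(["D", "E"], epoch=5)
c3 = Partition(["F"], epoch=6)

# each partition recovers and updates the same object on its own
for part, data in ((c1, "v-c1"), (c2, "v-c2"), (c3, "v-c3")):
    part.write("obj1", data)

def merge(parts, oid):
    """Collect every partition's local version of an object.

    There is no winner-selection rule: every partition believes its own
    local version is correct, so divergent versions simply conflict.
    """
    versions = {tuple(p.nodes): p.objects[oid] for p in parts}
    conflicting = len(set(versions.values())) > 1
    return versions, conflicting

versions, conflicting = merge([c1, c2, c3], "obj1")
# conflicting is True: three divergent versions of obj1 survive the merge,
# which is exactly the multi-version problem described above
```

The point of the sketch is only that detecting the conflict is easy; resolving it needs a policy (and code) that Sheepdog does not have yet.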