Re: [sheepdog] [PATCH 1/3] collie: add delay_recovery {start|stop} command

Yunkai Zhang Mon, 30 Jul 2012 01:50:19 -0700

On Mon, Jul 30, 2012 at 4:35 PM, Yunkai Zhang <[email protected]> wrote:
> On Mon, Jul 30, 2012 at 4:24 PM, Liu Yuan <[email protected]> wrote:
>> On 07/30/2012 04:17 PM, Yunkai Zhang wrote:
>>> Can you show me more information? It works well in my testing, and
>>
>> What kind of information? I just asked whether your patch set can work
>> with the following situation:
>>
>>   while you do the manual recovery (be it group join or group kill),
>> some of the other nodes fail unexpectedly. What is the result then? E.g.
>>   0 we have 3 nodes with 2 copies (d0,d1,d2)
>>   1 start manual group add, add nodes x1,x2
>>   2 nodes d1,d2 go down in the meantime <-- is no membership event
>> propagated to the cluster? If not, how do we handle the IO routed to the
>> failed nodes x1, x2?
>
> Good question. To simplify this patch set, I'll let sheep continue
> to process LEAVE events even if we have started delayed recovery.


Or just let sheep retry until recovery has finished. I need more testing
for this complicated situation.
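
To make the first option concrete, here is a toy model in Python. It is
only a sketch and not sheepdog code: the Cluster class and its method
names are made up for illustration. It assumes that JOIN events are
queued while recovery is delayed, that LEAVE events are always processed
at once, and that every processed membership change bumps the epoch.

```python
class Cluster:
    """Toy membership model (hypothetical, not sheepdog's data structures)."""

    def __init__(self, nodes):
        self.epochs = [tuple(nodes)]  # epochs[0] holds the members of epoch 1
        self.delayed = False          # True between delay_recovery start/stop
        self.pending_joins = []       # JOINs held back while delayed

    @property
    def members(self):
        return self.epochs[-1]

    def _commit(self, nodes):
        # Assumption: each processed membership change creates a new epoch.
        self.epochs.append(tuple(nodes))

    def delay_recovery_start(self):
        self.delayed = True

    def join(self, node):
        if self.delayed:
            self.pending_joins.append(node)
        else:
            self._commit(self.members + (node,))

    def leave(self, node):
        # Assumption: LEAVE is processed even while recovery is delayed.
        self.pending_joins = [n for n in self.pending_joins if n != node]
        if node in self.members:
            self._commit(tuple(n for n in self.members if n != node))

    def delay_recovery_stop(self):
        self.delayed = False
        if self.pending_joins:
            self._commit(self.members + tuple(self.pending_joins))
            self.pending_joins = []


# Yuan's scenario: steps 0 to 3.
c = Cluster(["d0", "d1", "d2"])
c.delay_recovery_start()
c.join("x1")
c.join("x2")
c.leave("d1")
c.leave("d2")
c.delay_recovery_stop()

for number, nodes in enumerate(c.epochs, start=1):
    print("epoch %d: %s" % (number, nodes))
```

Under these assumptions the final membership is (d0, x1, x2), but the
history contains intermediate epochs for the two LEAVE events instead of
going straight from epoch 1 to the final one.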

>
>>   3 stop manual group add.
>>
>> the expected result is (d0, x1, x2). What does the epoch look like? Like
>> the following?
>>
>>  epoch 1: (d0, d1, d2)
>>  epoch 2: (d0, x1, x2)
>>
>> Thanks,
>> Yuan
>
>
>
> --
> Yunkai Zhang
> Work at Taobao



-- 
Yunkai Zhang
Work at Taobao
-- 
sheepdog mailing list
[email protected]
http://lists.wpkg.org/mailman/listinfo/sheepdog
