>>Yes, in the case that Ivan described, the ledger was "leaked", probably 
>>created by a process that got restarted before using the ledger. These 
>>ledgers will be left in OPEN state forever (or at least until some admin 
>>tool can decide that the ledger was leaked and will remove it) as the writer 
>>was already gone.
>>The only issue, since there's no data loss (and no data to be lost), is the 
>>infinite looping of auto replication workers over it.

You have probably observed one specific case where the admin can safely 
remove the ledger. But in general, as we discussed earlier, this would be a 
tough decision for the admin (determining whether the ledger is empty or 
contains entries), since we have lost quorum.


One more point to add to this discussion: I faced a similar issue, 
BOOKKEEPER-733, a few days back, and there too it is a kind of infinite hang. 
My point is that there could well be several cases where the RW can enter 
an infinite loop, although today we know of only two. 


Hi All, I have added an initial draft proposal that came to mind to address 
these cases together. Kindly take a look; I would like to hear your responses. 
Thanks


-Rakesh

-----Original Message-----
From: Matteo Merli [mailto:[email protected]] 
Sent: 07 March 2014 04:17
To: [email protected]
Subject: Re: Problem in rereplication algorithm

On Mar 6, 2014, at 10:49 AM, Sijie Guo <[email protected]> wrote:
> Just curious: isn't changing the ensemble handled by the writer? 
> Unless the ledger is idle and no longer being used.

>>Yes, in the case that Ivan described, the ledger was "leaked", probably 
>>created by a process that got restarted before using the ledger. These 
>>ledgers will be left in OPEN state forever (or at least until some admin 
>>tool can decide that the ledger was leaked and will remove it) as the writer 
>>was already gone.
>>The only issue, since there's no data loss (and no data to be lost), is the 
>>infinite looping of auto replication workers over it.

>>Matteo


On Thu, Mar 6, 2014 at 8:29 AM, Ivan Kelly <[email protected]> wrote:

> > OK, this comment is not entirely clear to me. I thought in your 
> > example you had ensemble 3, quorum 2, and you had lost both B2 and 
> > B3. In that case, you already lost quorum. Not for L1, but at that 
> > point there are cases in which you don't know if you've lost a 
> > record. In the specific scenario you describe, we know there is no 
> > record 1 because there is no record 0, fine. But, if you had a 
> > record 0, then we wouldn't know if we lost a record and consequently 
> > the ledger is broken. We may be able to fix this particular case by 
> > simply (not) replicating what we have and declaring success, but it 
> > is not a general solution, I'm afraid.
> After we lose the first bookie, B3, we are able to detect that the 
> ledger is empty and that a bookie is down. However, we don't do 
> anything at this point, because the bookie which is down isn't in the 
> quorum for the first entry of the ledger. The problem is that we only 
> ever start to perceive the problem when the second bookie, B2 goes 
> down.
>
> My point is that we need to deal with the issue when the first bookie 
> goes down.
>

Just curious: isn't changing the ensemble handled by the writer? Unless 
the ledger is idle and no longer being used.
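
For reference, the ensemble 3 / quorum 2 scenario quoted above can be sketched as follows. This is only an illustration and not BookKeeper code: the class and method names are hypothetical, and it assumes round-robin striping, where entry e goes to the bookies at positions e mod E through (e + Q - 1) mod E of the ensemble. Under that assumption, B3 is not in the quorum for entry 0, and losing B2 and B3 removes every replica of a would-be entry 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not BookKeeper code: round-robin striping is assumed.
class QuorumSketch {
    // Ensemble positions of the bookies assumed to hold the given entry.
    static List<Integer> quorumFor(long entryId, int ensembleSize, int quorumSize) {
        List<Integer> bookies = new ArrayList<>();
        for (int i = 0; i < quorumSize; i++) {
            bookies.add((int) ((entryId + i) % ensembleSize));
        }
        return bookies;
    }

    // True if every bookie assumed to hold the entry is in the failed set.
    static boolean allReplicasLost(long entryId, int ensembleSize, int quorumSize,
                                   Set<Integer> failed) {
        return failed.containsAll(quorumFor(entryId, ensembleSize, quorumSize));
    }

    public static void main(String[] args) {
        // Ensemble {B1, B2, B3} as positions 0, 1, 2; quorum 2.
        System.out.println("entry 0 -> " + quorumFor(0, 3, 2));
        System.out.println("entry 1 -> " + quorumFor(1, 3, 2));

        Set<Integer> failed = Set.of(1, 2); // B2 and B3 are down
        System.out.println("entry 0 lost: " + allReplicasLost(0, 3, 2, failed));
        System.out.println("entry 1 lost: " + allReplicasLost(1, 3, 2, failed));
    }
}
```

With B2 and B3 down, B1 can still answer for entry 0, but nothing can answer for entry 1, which is why the empty-ledger case is decidable here while the general case is not.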


>
> >
> > >>
> > >>
> > >>>> the postponing is already there, since the ledger couldn't be opened and fenced.
> > >>
> > >> Yeah Sijie you are right, it will postpone to next cycle.
> > >> AFAIK AutoRecovery feature will keep on trying to open it again 
> > >> and again, this cycle will never end. It is a kind of hanging too.
> > > Actually, it's a little worse than that. The recovery worker will 
> > > acquire the lock on the unreplicated node, try to open, release 
> > > the lock, and repeat ad infinitum, without any pause between 
> > > loops. This will create a lot of write traffic on zookeeper for the locks.
> >
> >
> > Ok, thanks for the clarification. Having an unbounded number of 
> > attempts is definitely not good. Independent of how we solve this 
> > problem, I was thinking about keeping track of the number of 
> > attempts.
> Ya, adding a ratelimiter would probably be enough.
>
>
> -Ivan
>
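
The two suggestions at the end of the thread (keeping track of the number of attempts, and rate-limiting the retries) could be combined roughly as below. This is a minimal sketch under stated assumptions, not the actual replication worker: the class name, method names, and the exponential backoff policy are all hypothetical. The caller would sleep for the returned delay before trying to acquire the lock on the same ledger again.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the BookKeeper replication worker.
class RetryBackoffSketch {
    private final long baseDelayMs;
    private final long maxDelayMs;
    // Number of consecutive failed attempts per ledger id.
    private final Map<Long, Integer> failedAttempts = new HashMap<>();

    RetryBackoffSketch(long baseDelayMs, long maxDelayMs) {
        this.baseDelayMs = baseDelayMs;
        this.maxDelayMs = maxDelayMs;
    }

    // Record a failed attempt and return how long to wait before the next one.
    long recordFailure(long ledgerId) {
        int attempts = failedAttempts.merge(ledgerId, 1, Integer::sum);
        int shift = Math.min(attempts - 1, 20); // bound the shift to avoid overflow
        return Math.min(maxDelayMs, baseDelayMs << shift);
    }

    // Number of consecutive failures seen so far for this ledger.
    int attempts(long ledgerId) {
        return failedAttempts.getOrDefault(ledgerId, 0);
    }

    // Forget the ledger once it has been opened or replicated successfully.
    void recordSuccess(long ledgerId) {
        failedAttempts.remove(ledgerId);
    }

    public static void main(String[] args) {
        RetryBackoffSketch backoff = new RetryBackoffSketch(1000, 60000);
        for (int i = 0; i < 8; i++) {
            long delay = backoff.recordFailure(42L);
            System.out.println("attempt " + backoff.attempts(42L) + ", wait " + delay + " ms");
        }
    }
}
```

The attempt count also gives an admin tool something to act on: a ledger that has failed many times in a row is a candidate for the "leaked ledger" check discussed earlier, though where to set that threshold is left open here.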
