[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234418#comment-13234418
 ] 

Sijie Guo commented on BOOKKEEPER-112:
--------------------------------------

Yeah, fencing by fragment id is a very good solution, although I am not so 
clear on how fencing handles the following case.

We have 5 bookies, bk[1-5]. Suppose bk5 is down and ledger l is open, with its 
last fragment on (bk3, bk4, bk5). The recovery tool fences bk3 & bk4, so 
further write attempts on ledger l will force it to build a new ensemble, say 
(bk1, bk2, bk3). bk3 is common to the old ensemble and the new ensemble. How do 
we deal with writes to such a bookie? Because the bookie server has no 
knowledge of ledger distribution info, it doesn't know whether a write is for 
the old ensemble or the new one, unless we also send the fragment id in the 
addEntry request.
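
To make the idea concrete, here is a rough sketch of what sending the fragment 
id in addEntry could look like on the bookie side. Everything in it is 
hypothetical and only for illustration: the class names, the fenceFragment and 
addEntry signatures, and the use of a plain long as the fragment id are my 
assumptions, not the existing bookie API or wire protocol.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: none of these names are the real BookKeeper API.
class FragmentFencingSketch {

    // Toy in-memory bookie that fences per (ledgerId, fragmentId)
    // instead of fencing the whole ledger.
    static class Bookie {
        // ledgerId -> fragment ids that recovery has fenced on this bookie
        private final Map<Long, Set<Long>> fencedFragments = new HashMap<>();
        // "ledgerId:entryId" -> payload
        private final Map<String, byte[]> entries = new HashMap<>();

        void fenceFragment(long ledgerId, long fragmentId) {
            fencedFragments.computeIfAbsent(ledgerId, k -> new HashSet<>()).add(fragmentId);
        }

        // Returns true if the write is accepted, false if its fragment is fenced.
        boolean addEntry(long ledgerId, long fragmentId, long entryId, byte[] data) {
            Set<Long> fenced = fencedFragments.get(ledgerId);
            if (fenced != null && fenced.contains(fragmentId)) {
                return false;
            }
            entries.put(ledgerId + ":" + entryId, data);
            return true;
        }
    }

    public static void main(String[] args) {
        Bookie bk3 = new Bookie();
        long ledger = 1L;
        long oldFragment = 0L;   // assumed id of the fragment on (bk3, bk4, bk5)
        long newFragment = 10L;  // assumed id of the fragment on (bk1, bk2, bk3)

        // The recovery tool fences only the old fragment on bk3.
        bk3.fenceFragment(ledger, oldFragment);

        // A write addressed to the old fragment is rejected.
        System.out.println(bk3.addEntry(ledger, oldFragment, 9L, new byte[] {1}));  // false
        // A write addressed to the new fragment on the same bookie is accepted.
        System.out.println(bk3.addEntry(ledger, newFragment, 10L, new byte[] {2})); // true
    }
}
```

With the fragment id in the request, bk3 can reject writes for the old 
ensemble while still accepting writes for the new one. Without it, the two 
cases look identical to the bookie.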
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that 
> ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, 
> BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, 
> BOOKKEEPER-112.patch_v5
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will 
> not get notified of this update, so it will try to write out its own ledger 
> metadata, only to fail with KeeperException.BadVersion. This effectively 
> fences all write operations on the LedgerHandle (close and addEntry). close 
> will fail for obvious reasons. addEntry will fail once it gets to the failed 
> bookie in the schedule, tries to write, fails, selects a new bookie and tries 
> to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when 
> done
> Also, uncomment addEntry in 
> TestFencing#testFencingInteractionWithBookieRecovery()

