[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13533772#comment-13533772
 ] 

Rakesh R commented on BOOKKEEPER-355:
-------------------------------------

Thanks guys for the good progress. Sorry for pitching late, I'm busy with our 
internal releases and other schedules.

{quote}
c1: ensemble N => (A, B, C) is changed to N => (A, B, D) and M => (A, B, E) (N 
< M)
c2: ensemble N => (A, B, C) is changed to N => (A, B, E)
{quote}
I've gone through the latest code. As Ivan mentioned, concurrent closes have 
Ledger#resolveConflicts() mechanism and would help to handle most of the cases. 
I could also see one possible case of concurrency and what you guys feel about 
the following race condition:

1) Assume, c1 & c2 both initiates recovery and sees the failure of C. 

2) Say c1 would try replacing C with D and at same time c2 would try replacing 
C with E. 

3) Again assume, c1 & c2 sends the metadata update request simulataneously.

Consider c1 would succeeded and c2 recevies the metadataversion exception. Now 
c2 will enters into LedgerHandle#resolveconflicts(), here its checking whether 
C (failed bookie) exists in the new zkmetadata. As c1 would previously modified 
the metadata (C with D), c2 will be thinking the failed bookie is replaced with 
E and go ahead with unnecessary writes to E. Below is existing code:
{code}
             metadata.setVersion(newMeta.getVersion());
            // Resolve the conflicts if zk metadata still contains failed bookie
            if (newMeta.currentEnsemble.get(ensembleInfo.bookieIndex).equals(
                    ensembleInfo.addr)) {
            } else {
                // the failed bookie has been replaced
                blockAddCompletions.decrementAndGet();
                unsetSuccessAndSendWriteRequest(ensembleInfo.bookieIndex);
            }
{code}


IMHO, we would add the following checks in the LedgerHandle#resolveConflicts() 
for "failed bookie has been replaced in ZK":

{code}
            if (newMeta.currentEnsemble.get(ensembleInfo.bookieIndex).equals(
                    ensembleInfo.addr)) {
                 .....
            }else if (metadata.currentEnsemble.get(ensembleInfo.bookieIndex)
                    .equals(newMeta.currentEnsemble
                            .get(ensembleInfo.bookieIndex))) {
                // the failed bookie has been replaced
                blockAddCompletions.decrementAndGet();
                unsetSuccessAndSendWriteRequest(ensembleInfo.bookieIndex);
            }else {
                // some other client would have replaced the failed
                // bookie with new one. Hence failing!
                ......
            }
{code}

-Rakesh
                
> Ledger recovery will mark ledger as closed with -1, in case of slow bookie is 
> added to ensemble during  recovery add
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-355
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-355
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-server
>    Affects Versions: 4.1.0, 4.2.0
>            Reporter: Vinay
>            Assignee: Ivan Kelly
>             Fix For: 4.2.0
>
>         Attachments: 
> 0001-BOOKKEEPER-355-Ledger-recovery-will-mark-ledger-as-c.patch, 
> 0001-BOOKKEEPER-355-Ledger-recovery-will-mark-ledger-as-c.patch, 
> BOOKKEEPER-355.patch, BOOKKEEPER-355.patch
>
>
> Scenario:
> ------------
> 1. Ledger is created with ensemble and quorum size as 2, written with one 
> entry
> 2. Now first bookie is in the ensemble is made down.
> 3. Another client fence and trying to recover the same ledger
> 4. During this time ensemble change will happen and new bookie will be added. 
> But this bookie is not able to connect.
> 5. This recovery will fail.
> 7. Now previously added bookie came up.
> 8. Another client trying to recover the same ledger.
> 9. Since new bookie is first in the ensemble, doRecoveryRead() is reading 
> from that bookie and getting NoSuchLedgerException and closing the ledger 
> with -1
> i.e. Marking the ledger as empty, even though first client had successfully 
> written one entry.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to