[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994616#comment-13994616
 ] 

Rakesh R commented on BOOKKEEPER-745:
-------------------------------------

Thanks [~ikelly], the patch looks fine overall. I have a few suggestions:
# Can we move waitIfLedgerReplicationDisabled() above 
generateBookie2LedgersIndex()? That way the bookie-to-ledger index is built 
after replication is enabled; otherwise the auditor may continue with the old 
index. I also feel this will help later with the IP-to-hostname metadata changes.
# Could you replace 'children.get(0)' with the constant AUDITOR_INDEX?
# Please increase the test timeout; the test already sleeps for 4 seconds:
@Test(timeout=5000)
# A few cleanups in the tests:
- Please remove the following unused variables in AuditorRollingRestartTest.java:
{code}
    private final static Logger LOG = LoggerFactory
            .getLogger(AuditorPeriodicBookieCheckTest.class);

    private final static int CHECK_INTERVAL = 1; // run every second

    final int numLedgers = 1;
{code}
- Please remove unused imports
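
To illustrate the ordering point in the first suggestion, here is a minimal, self-contained sketch. Only the two method names come from the patch; their bodies, the class name, the bookie names, and the scenario are hypothetical stand-ins rather than the actual Auditor code. The blocking wait is modelled as a Runnable during which the ledger metadata changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class AuditorOrderingSketch {

    /** Hypothetical stand-in: inverts ledger -> bookies metadata into a bookie -> ledgers index. */
    static Map<String, Set<Long>> generateBookie2LedgersIndex(Map<Long, List<String>> ledgerMetadata) {
        Map<String, Set<Long>> index = new TreeMap<>();
        for (Map.Entry<Long, List<String>> entry : ledgerMetadata.entrySet()) {
            for (String bookie : entry.getValue()) {
                index.computeIfAbsent(bookie, b -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return index;
    }

    /**
     * Returns the index the audit would use. The Runnable stands in for
     * waitIfLedgerReplicationDisabled(); here it simply runs whatever
     * happens while replication is disabled.
     */
    static Map<String, Set<Long>> indexForAudit(Map<Long, List<String>> ledgerMetadata,
                                                Runnable waitIfLedgerReplicationDisabled,
                                                boolean waitFirst) {
        if (waitFirst) {
            // Suggested order: wait, then index the current metadata.
            waitIfLedgerReplicationDisabled.run();
            return generateBookie2LedgersIndex(ledgerMetadata);
        }
        // Other order: the index is a snapshot taken before the wait.
        Map<String, Set<Long>> index = generateBookie2LedgersIndex(ledgerMetadata);
        waitIfLedgerReplicationDisabled.run();
        return index;
    }

    /** Assumed scenario: while replication is disabled, ledger 1 moves from bookie-a to bookie-c. */
    static Map<String, Set<Long>> runScenario(boolean waitFirst) {
        Map<Long, List<String>> metadata = new TreeMap<>();
        metadata.put(1L, new ArrayList<>(Arrays.asList("bookie-a", "bookie-b")));
        Runnable wait = () -> metadata.put(1L, new ArrayList<>(Arrays.asList("bookie-b", "bookie-c")));
        return indexForAudit(metadata, wait, waitFirst);
    }

    public static void main(String[] args) {
        System.out.println("index first: " + runScenario(false).keySet()); // [bookie-a, bookie-b]
        System.out.println("wait first:  " + runScenario(true).keySet());  // [bookie-b, bookie-c]
    }
}
```

With the index built first, the audit would still look for bookie-a, which no longer holds the ledger in this scenario; waiting first picks up the updated metadata.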

-Rakesh

> Fix for false reports of ledger unreplication during rolling restarts.
> ----------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-745
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-745
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-auto-recovery
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 4.3.0, 4.2.3
>
>         Attachments: 
> 0001-Fix-for-false-reports-of-ledger-unreplication-.trunk.patch, 
> 0002-Fix-for-false-reports-of-ledger-unreplication-.trunk.patch, 
> 0004-Fix-for-false-reports-of-ledger-unreplication-.trunk.patch, 
> 0006-Fix-for-false-reports-of-ledger-unreplicat.branch4.2.patch
>
>
> The bug occurred because there was no check of whether rereplication was 
> enabled when the auditor came online. When the auditor comes online, it 
> checks which bookies are up and marks the ledgers on missing bookies as 
> underreplicated. In the false-report case, the auditor ran after each 
> bookie was bounced, due to the way leader election for the auditor works. 
> Since one bookie is down while its server is being bounced, all ledgers on 
> that bookie get marked as underreplicated.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
