[ https://issues.apache.org/jira/browse/BOOKKEEPER-530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540818#comment-13540818 ]

Sijie Guo commented on BOOKKEEPER-530:
--------------------------------------

{code}
    /** 
     * Scanner used to do entry log compaction
     */
    class EntryLogCompactionScanner implements EntryLogger.EntryLogScanner {
        @Override
        public boolean accept(long ledgerId) {
            // bookie has no knowledge about which ledger is deleted
            // so just accept all ledgers.
            return true;
        }   

        @Override
        public void process(long ledgerId, long offset, ByteBuffer buffer)
            throws IOException {
            addEntry(buffer);
        }   
    }
{code}

[~fpj], in the compaction scanner we just call LedgerStorage#addEntry to move 
each entry from the old entry log file to the new one. We don't add the entry 
to the journal again while moving it, so the move gets no journal durability 
guarantee. That is what I mean.
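FWIW, the second fix proposed in the description (forcing a ledger storage flush before removing the old entry log) comes down to an ordering constraint: the new log must be durable on disk before the old one is deleted. A minimal, self-contained sketch of that ordering, using plain java.nio files rather than the real EntryLogger/LedgerStorage classes (all names here are illustrative, not BookKeeper API):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative sketch only. It demonstrates the ordering the proposed fix
// enforces: entries are copied to the new entry log and forced to stable
// storage BEFORE the old log is removed, so a crash at any point leaves
// at least one durable copy of the data.
public class CompactionOrderingSketch {
    public static void main(String[] args) throws IOException {
        Path oldLog = Files.createTempFile("entrylog-old", ".log");
        Path newLog = Files.createTempFile("entrylog-new", ".log");
        Files.write(oldLog, "entry-1\nentry-2\n".getBytes());

        // 1. move the live entries into the new entry log
        Files.write(newLog, Files.readAllBytes(oldLog));

        // 2. force the new log to disk; if the process crashed before this
        //    point, the old log would still hold the entries
        try (FileChannel ch = FileChannel.open(newLog, StandardOpenOption.WRITE)) {
            ch.force(true);
        }

        // 3. only now is it safe to remove the old entry log
        Files.delete(oldLog);

        System.out.println(new String(Files.readAllBytes(newLog)).contains("entry-2"));
    }
}
```

If the delete in step 3 ran before the force in step 2, a crash in between would leave neither a flushed new log nor the old one, which is exactly the loss window described below.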
                
> data might be lost during compaction.
> -------------------------------------
>
>                 Key: BOOKKEEPER-530
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-530
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-server
>    Affects Versions: 4.1.0
>            Reporter: Sijie Guo
>             Fix For: 4.2.0
>
>
> {code}
>         try {
>             entryLogger.scanEntryLog(entryLogId,
>                     new CompactionScanner(entryLogMeta));
>             // after moving entries to new entry log, remove this old one
>             removeEntryLog(entryLogId);
>         } catch (IOException e) {
>             LOG.info("Premature exception when compacting " + entryLogId, e); 
>         } finally {
>             // clear compacting flag
>             compacting.set(false);
>         }
> {code}
> The current compaction code has a problem: as the code above shows, the old 
> entry log is removed after its entries are added to the new entry log, but 
> the new entry log might not have been flushed yet. If a failure happens after 
> the removal but before the flush, data is lost.
> When I implemented the compaction feature in BOOKKEEPER-160, I remember 
> taking care to let entries go back through the normal addEntry flow so the 
> journal and index were updated. But it seems this addEntry doesn't go through 
> the journal; it just moves entries between entry log files without any flush 
> guarantee.
> There are two ideas for a solution. The simple one is to route compaction 
> through the normal addEntry flow (adding the entry to ledger storage and 
> putting it in the journal). The other is for the GC thread to either wait 
> one flush interval for the ledger storage to be flushed by the sync thread, 
> or force a ledger storage flush before removing the entry log files.
> BTW, it was hard to design a test case that simulates a bookie shutting down 
> abnormally after the entry log files are removed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
