[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sijie Guo updated BOOKKEEPER-1044:
----------------------------------
    Priority: Blocker  (was: Critical)

> Entrylogger is not readding rolled logs back to the logChannelsToFlush list 
> when exception happens while trying to flush rolled logs
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-1044
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-1044
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Charan Reddy Guttapalem
>            Assignee: Charan Reddy Guttapalem
>            Priority: Blocker
>             Fix For: 4.5.0
>
>
> SyncThread.checkpoint(Checkpoint checkpoint), which SyncThread's executor 
> calls once every flushInterval, ultimately calls 
> EntryLogger.flushRotatedLogs.  
> EntryLogger.flushRotatedLogs first sets 'logChannelsToFlush' to null and only 
> then tries to flush and close each rolled log file. If an IOException occurs 
> while flushing or closing a log channel, it is rethrown as is and propagates 
> up to SyncThread.checkpoint, which catches it, logs it, and returns without 
> calling checkpointComplete. By then the reference to 'logChannelsToFlush' 
> (the rolled logs that are yet to be closed) is lost, because it was set to 
> null before the individual rolled logs were flushed and closed. The next 
> execution of 'checkpoint' (after flushInterval) does not know about the 
> rolled logs it failed to flush/close the previous time and flushes only the 
> newly rolled logs. So the failure to flush/close the previous rolled logs 
> goes completely unnoticed. 
> in EntryLogger.java
>     void flushRotatedLogs() throws IOException {
>         List<BufferedLogChannel> channels = null;
>         long flushedLogId = INVALID_LID;
>         synchronized (this) {
>             channels = logChannelsToFlush;
>             // <--- here we set 'logChannelsToFlush' to null before it tries
>             // to flush/close rolledlogs
>             logChannelsToFlush = null;
>         }
>         if (null == channels) {
>             return;
>         }
>         for (BufferedLogChannel channel : channels) {
>             // <--- IOException can happen here or in the following
>             // closeFileChannel call
>             channel.flush(true);
>             // since this channel is only used for writing, after flushing the channel,
>             // we had to close the underlying file channel. Otherwise, we might end up
>             // leaking fds which cause the disk spaces could not be reclaimed.
>             closeFileChannel(channel);
>             if (channel.getLogId() > flushedLogId) {
>                 flushedLogId = channel.getLogId();
>             }
>             LOG.info("Synced entry logger {} to disk.", channel.getLogId());
>         }
>         // move the leastUnflushedLogId ptr
>         leastUnflushedLogId = flushedLogId + 1;
>     }
> in SyncThread.java
>     public void checkpoint(Checkpoint checkpoint) {
>         try {
>             checkpoint = ledgerStorage.checkpoint(checkpoint);
>         } catch (NoWritableLedgerDirException e) {
>             LOG.error("No writeable ledger directories", e);
>             dirsListener.allDisksFull();
>             return;
>         } catch (IOException e) {
>             // <--- that IOException gets propagated to this method, and
>             // here it is caught and not dealt with appropriately
>             LOG.error("Exception flushing ledgers", e);
>             return;
>         }
>         try {
>             checkpointSource.checkpointComplete(checkpoint, true);
>         } catch (IOException e) {
>             LOG.error("Exception marking checkpoint as complete", e);
>             dirsListener.allDisksFull();
>         }
>     }
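
One possible way to avoid losing the rolled logs, sketched below, is to put every channel that was not successfully flushed and closed back onto the pending list before the exception propagates, so that the next checkpoint retries it. This is a minimal, self-contained illustration and not the actual BookKeeper patch: LogChannel, FlakyChannel, RotatedLogFlusherSketch, addRotatedLog, pendingCount and getLeastUnflushedLogId are hypothetical names, and flushAndClose() stands in for the channel.flush(true) plus closeFileChannel(channel) pair quoted above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for BufferedLogChannel; not the BookKeeper class.
interface LogChannel {
    long getLogId();
    // Stands in for channel.flush(true) followed by closeFileChannel(channel).
    void flushAndClose() throws IOException;
}

// Test double that fails a configurable number of times before succeeding.
class FlakyChannel implements LogChannel {
    private final long logId;
    private int failuresLeft;
    boolean closed = false;

    FlakyChannel(long logId, int failuresLeft) {
        this.logId = logId;
        this.failuresLeft = failuresLeft;
    }

    public long getLogId() { return logId; }

    public void flushAndClose() throws IOException {
        if (failuresLeft > 0) {
            failuresLeft--;
            throw new IOException("simulated flush failure for log " + logId);
        }
        closed = true;
    }
}

class RotatedLogFlusherSketch {
    static final long INVALID_LID = -1L;

    private List<LogChannel> logChannelsToFlush;
    private long leastUnflushedLogId = 0;

    synchronized void addRotatedLog(LogChannel channel) {
        if (logChannelsToFlush == null) {
            logChannelsToFlush = new ArrayList<>();
        }
        logChannelsToFlush.add(channel);
    }

    synchronized int pendingCount() {
        return logChannelsToFlush == null ? 0 : logChannelsToFlush.size();
    }

    synchronized long getLeastUnflushedLogId() { return leastUnflushedLogId; }

    void flushRotatedLogs() throws IOException {
        List<LogChannel> channels;
        synchronized (this) {
            channels = logChannelsToFlush;
            logChannelsToFlush = null;
        }
        if (channels == null) {
            return;
        }
        long flushedLogId = INVALID_LID;
        int done = 0; // number of channels flushed and closed so far
        try {
            for (LogChannel channel : channels) {
                channel.flushAndClose();
                flushedLogId = Math.max(flushedLogId, channel.getLogId());
                done++;
            }
        } finally {
            synchronized (this) {
                if (done < channels.size()) {
                    // Re-add the failed channel and everything after it, ahead of
                    // any logs that were rolled while this flush was running.
                    List<LogChannel> pending =
                            new ArrayList<>(channels.subList(done, channels.size()));
                    if (logChannelsToFlush != null) {
                        pending.addAll(logChannelsToFlush);
                    }
                    logChannelsToFlush = pending;
                }
                // Advance the pointer only past logs that really reached disk.
                if (flushedLogId != INVALID_LID) {
                    leastUnflushedLogId = flushedLogId + 1;
                }
            }
        }
    }

    public static void main(String[] args) {
        RotatedLogFlusherSketch flusher = new RotatedLogFlusherSketch();
        flusher.addRotatedLog(new FlakyChannel(0, 0));
        flusher.addRotatedLog(new FlakyChannel(1, 1)); // fails on the first attempt
        try {
            flusher.flushRotatedLogs();
        } catch (IOException e) {
            System.out.println("first attempt failed: " + e.getMessage());
        }
        System.out.println("still pending: " + flusher.pendingCount());
    }
}
```

With this shape, the IOException still reaches the caller (so SyncThread.checkpoint can skip checkpointComplete as it does today), but the unflushed channels remain reachable for the next run. Whether a failed channel can safely be flushed again depends on the real BufferedLogChannel semantics, which this sketch does not model.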



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
