> On Sept. 7, 2012, 4:49 p.m., Hari Shreedharan wrote:
> > There is one issue with this, which I missed in yesterday's review. If the 
> > following series of events(in the V3 case) occurs, the channel is in a bad 
> > state:
> > 
> > Checkpoint 1
> > <normal operation>
> > Checkpoint 2:
> >   * Checkpoint 2 a: Metadata written
> > <crash>
> > 
> > At this point, the checkpoint_incomplete header is not necessarily written 
> > to disk (does not look like it is guaranteed that a put is written 
> > immediately to disk - agreed we already have this situation, but it is ok 
> > since we just end up having an older checkpoint and the replay still won't 
> > screw up). So we now have metadata for a checkpoint which is actually newer 
> > than the checkpoint file itself. This means the head, size etc in the 
> > checkpoint metadata would be inaccurate and would lead to the queue being 
> > processed incorrectly during replay. I'd recommend putting the writeOrderID 
> > in both the checkpoint and the metadata file, so we can compare the two 
> > during replay(if they don't match, ask the user to delete the checkpoint 
> > dir) and thus we can eliminate this situation. I think it is ok to have log 
> > metadata and log file itself mismatched if we die in the middle of a 
> > checkpoint, we just end up replaying more events, but I think we need to 
> > ensure we can match the checkpoint metadata to the checkpoint.
> > 
> > 
> >

Agreed, will do.


- Brock


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6680/#review11168
-----------------------------------------------------------


On Sept. 5, 2012, 7:57 p.m., Brock Noland wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6680/
> -----------------------------------------------------------
> 
> (Updated Sept. 5, 2012, 7:57 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Description
> -------
> 
> Very sorry for the enormous patch. I don't see a way around a large patch due 
> to the nature of this JIRA. This is a first crack at a patch. I will be 
> reviewing it again and possibly providing an updated patch without any 
> reviews. However, I wanted to get feedback early.
> 
> This change creates file format versions 3, which allows field updates such 
> as FLUME-1485 to be done without changing the format version and as such 
> impact users.
> 
> This new format is entirely packed by Protocol Buffers with the exception of 
> the memory mapped file we use as the queue. This last file has a checkpoint 
> in progress marker the remaining portion is a memory mapped long buffer.
> 
> All data metadata was moved to seperate files so it can be updated without 
> random writes.
> 
> Where as before, given a checkpoint directory and log directory we would have:
> 
> chkpt/checkpoint
> log/log-1
> log/log-2
> 
> Where as now we have:
> 
> chkpt/checkpoint
> chkpt/checkpoint.meta
> log/log-1
> log/log-1.meta
> log/log-2
> log/log-2.meta
> 
> The .meta files contain all metadata and the files no longer contain a 
> VERSION prefix. The .meta files do contain the a version identifer.
> 
> This patch is backwards compatible in that when it starts up:
> 
> -Old checkpoint is upgraded
> -Only new v3 log files are written
> -Old v2 log files are still readable
> 
> This process is tested during unit tests to the regression tests added in 
> FLUME-1431. The long term plan would be to remove the V2 code after a couple 
> releases.
> 
> To support this funcationality, the following were changed:
> 
> -IO was removed from FlumeEventQueue and operates on an abstract class 
> EventQueueBackingStore which has a contrete implementation for V2 and V3.
> -Checkpoint/Log file readers/writes are not obtained from factories which 
> choose the correct object or in the case of the event queue execute the 
> upgrade process
> 
> 
> This addresses bug FLUME-1487.
>     https://issues.apache.org/jira/browse/FLUME-1487
> 
> 
> Diffs
> -----
> 
>   flume-ng-channels/flume-file-channel/pom.xml 7556f9f 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/CheckpointRebuilder.java
>  32b5324 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Commit.java
>  2c92d28 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStore.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFactory.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFile.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFileV2.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFileV3.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventPointer.java
>  e40cd8c 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java
>  8085d22 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java
>  e13ecc4 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFile.java
>  2867fc7 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileFactory.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileV2.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileV3.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogUtils.java
>  09956a4 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Put.java
>  c3fda6b 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/ReplayHandler.java
>  4e908ec 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Rollback.java
>  6e4e1fc 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Serialization.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Take.java
>  cb575f9 
>   
> flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/TransactionEventRecord.java
>  67c68f8 
>   flume-ng-channels/flume-file-channel/src/main/proto/filechannel.proto 
> PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestCheckpoint.java
>  2893538 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestEventQueueBackingStoreFactory.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannel.java
>  35521d1 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFlumeEventQueue.java
>  d09ddde 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLog.java
>  e923a30 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLogFile.java
>  193cd2b 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLogRecord.java
>  9f6adc7 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecord.java
>  c73b11b 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecordV2.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecordV3.java
>  PRE-CREATION 
>   
> flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestUtils.java
>  e64f856 
>   pom.xml 52a842d 
> 
> Diff: https://reviews.apache.org/r/6680/diff/
> 
> 
> Testing
> -------
> 
> Unit tests pass. I will be running some manual tests as well and will update 
> when I do so.
> 
> 
> Thanks,
> 
> Brock Noland
> 
>

Reply via email to