----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6680/#review11168 -----------------------------------------------------------
There is one issue with this, which I missed in yesterday's review. If the following series of events(in the V3 case) occurs, the channel is in a bad state: Checkpoint 1 <normal operation> Checkpoint 2: * Checkpoint 2 a: Metadata written <crash> At this point, the checkpoint_incomplete header is not necessarily written to disk (does not look like it is guaranteed that a put is written immediately to disk - agreed we already have this situation, but it is ok since we just end up having an older checkpoint and the replay still won't screw up). So we now have metadata for a checkpoint which is actually newer than the checkpoint file itself. This means the head, size etc in the checkpoint metadata would be inaccurate and would lead to the queue being processed incorrectly during replay. I'd recommend putting the writeOrderID in both the checkpoint and the metadata file, so we can compare the two during replay(if they don't match, ask the user to delete the checkpoint dir) and thus we can eliminate this situation. I think it is ok to have log metadata and log file itself mismatched if we die in the middle of a checkpoint, we just end up replaying more events, but I think we need to ensure we can match the checkpoint metadata to the checkpoint. - Hari Shreedharan On Sept. 5, 2012, 7:57 p.m., Brock Noland wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/6680/ > ----------------------------------------------------------- > > (Updated Sept. 5, 2012, 7:57 p.m.) > > > Review request for Flume. > > > Description > ------- > > Very sorry for the enormous patch. I don't see a way around a large patch due > to the nature of this JIRA. This is a first crack at a patch. I will be > reviewing it again and possibly providing an updated patch without any > reviews. However, I wanted to get feedback early. > > This change creates file format versions 3, which allows field updates such > as FLUME-1485 to be done without changing the format version and as such > impact users. > > This new format is entirely packed by Protocol Buffers with the exception of > the memory mapped file we use as the queue. This last file has a checkpoint > in progress marker the remaining portion is a memory mapped long buffer. > > All data metadata was moved to seperate files so it can be updated without > random writes. > > Where as before, given a checkpoint directory and log directory we would have: > > chkpt/checkpoint > log/log-1 > log/log-2 > > Where as now we have: > > chkpt/checkpoint > chkpt/checkpoint.meta > log/log-1 > log/log-1.meta > log/log-2 > log/log-2.meta > > The .meta files contain all metadata and the files no longer contain a > VERSION prefix. The .meta files do contain the a version identifer. > > This patch is backwards compatible in that when it starts up: > > -Old checkpoint is upgraded > -Only new v3 log files are written > -Old v2 log files are still readable > > This process is tested during unit tests to the regression tests added in > FLUME-1431. The long term plan would be to remove the V2 code after a couple > releases. > > To support this funcationality, the following were changed: > > -IO was removed from FlumeEventQueue and operates on an abstract class > EventQueueBackingStore which has a contrete implementation for V2 and V3. > -Checkpoint/Log file readers/writes are not obtained from factories which > choose the correct object or in the case of the event queue execute the > upgrade process > > > This addresses bug FLUME-1487. > https://issues.apache.org/jira/browse/FLUME-1487 > > > Diffs > ----- > > flume-ng-channels/flume-file-channel/pom.xml 7556f9f > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/CheckpointRebuilder.java > 32b5324 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Commit.java > 2c92d28 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStore.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFactory.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFile.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFileV2.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/EventQueueBackingStoreFileV3.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventPointer.java > e40cd8c > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java > 8085d22 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java > e13ecc4 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFile.java > 2867fc7 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileFactory.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileV2.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFileV3.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogUtils.java > 09956a4 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Put.java > c3fda6b > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/ReplayHandler.java > 4e908ec > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Rollback.java > 6e4e1fc > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Serialization.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Take.java > cb575f9 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/TransactionEventRecord.java > 67c68f8 > flume-ng-channels/flume-file-channel/src/main/proto/filechannel.proto > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestCheckpoint.java > 2893538 > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestEventQueueBackingStoreFactory.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannel.java > 35521d1 > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFlumeEventQueue.java > d09ddde > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLog.java > e923a30 > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLogFile.java > 193cd2b > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLogRecord.java > 9f6adc7 > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecord.java > c73b11b > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecordV2.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecordV3.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestUtils.java > e64f856 > pom.xml 52a842d > > Diff: https://reviews.apache.org/r/6680/diff/ > > > Testing > ------- > > Unit tests pass. I will be running some manual tests as well and will update > when I do so. > > > Thanks, > > Brock Noland > >