-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5436/
-----------------------------------------------------------
Review request for Flume and Brock Noland.
Description
-------
The file channel uses a disk-serialized in-memory checkpointing mechanism. When
the channel is full and the capacity is large, these checkpoints take a long
time to serialize and deserialize. For example, a channel with 1M entries could
take many minutes to boot up. Similarly, a boot up of a largely full channel
would require the replay of all log events to reconstruct the correct state.
Due to these latency issues and the failure interaction of the channel with the
LifeCycleSupervisor, the system could easily get into an unusable state, as
evidenced by FLUME-1232.
This patch modifies the checkpointing mechanism as follows:
* The FlumeEventQueue itself represents a checkpoint that is maintained as a
memory mapped file.
* During checkpointing, a marker is introduced in the active logs, which is
used to skip records during replay.
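
To illustrate the first point, here is a minimal sketch of a fixed-capacity
queue of event pointers backed by a memory mapped file. It is not the code in
this patch: the class name MappedQueueSketch, the two-slot header layout, and
the use of plain long values as event pointers are all assumptions made for
illustration. The idea it shows is that the queue state lives in the mapped
file, so a restart can reopen the file instead of deserializing a checkpoint.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.LongBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not the FlumeEventQueue from this patch.
class MappedQueueSketch implements AutoCloseable {
  // Assumed layout: slot 0 holds the head index, slot 1 holds the size,
  // and the remaining slots form a circular buffer of event pointers.
  private static final int HEAD = 0;
  private static final int SIZE = 1;
  private static final int HEADER_SLOTS = 2;

  private final RandomAccessFile file;
  private final MappedByteBuffer mapped;
  private final LongBuffer slots;
  private final int capacity;

  MappedQueueSketch(Path path, int capacity) throws IOException {
    this.capacity = capacity;
    this.file = new RandomAccessFile(path.toFile(), "rw");
    long bytes = (long) (HEADER_SLOTS + capacity) * Long.BYTES;
    // Mapping in READ_WRITE mode grows a shorter file to the mapped length.
    this.mapped = file.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, bytes);
    this.slots = mapped.asLongBuffer();
  }

  int size() {
    return (int) slots.get(SIZE);
  }

  // Appends a pointer at the tail; returns false when the queue is full.
  boolean addTail(long pointer) {
    int size = size();
    if (size == capacity) {
      return false;
    }
    int tail = (int) ((slots.get(HEAD) + size) % capacity);
    slots.put(HEADER_SLOTS + tail, pointer);
    slots.put(SIZE, size + 1);
    return true;
  }

  // Removes and returns the pointer at the head, or null when empty.
  Long removeHead() {
    int size = size();
    if (size == 0) {
      return null;
    }
    int head = (int) slots.get(HEAD);
    long pointer = slots.get(HEADER_SLOTS + head);
    slots.put(HEAD, (head + 1) % capacity);
    slots.put(SIZE, size - 1);
    return pointer;
  }

  // Flushes the mapped region to disk; in this sketch that is the checkpoint.
  void checkpoint() {
    mapped.force();
  }

  @Override
  public void close() throws IOException {
    file.close();
  }

  public static void main(String[] args) throws IOException {
    Path path = Files.createTempFile("queue", ".ckpt");
    try (MappedQueueSketch queue = new MappedQueueSketch(path, 4)) {
      queue.addTail(10L);
      queue.addTail(20L);
      queue.checkpoint();
    }
    // Reopening the same file restores the queue without any replay.
    try (MappedQueueSketch queue = new MappedQueueSketch(path, 4)) {
      System.out.println(queue.size() + " entries, head = " + queue.removeHead());
    }
  }
}
```

In this sketch the capacity determines the file layout, so reopening the file
with a different capacity would misread the circular buffer. That is one
plausible reason a fixed layout makes dynamic resizing awkward.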
In order to ensure correctness, a reader/writer lock is used where the reader
lock is used by consumers operating against the channel while the writer lock
is used to facilitate checkpointing. Some limitations of this approach are:
* The total number of active log files is now limited to a maximum of 1024.
* Dynamic resizing of the channel capacity is no longer allowed unless the
checkpoint is rebuilt from scratch, which can cause a significant delay in startup.
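
The reader/writer locking described above can be sketched roughly as follows.
This is an illustration under assumptions, not the code in this patch: the
class CheckpointLockSketch and its put, take, and checkpoint methods are
hypothetical, and an in-memory deque stands in for the event queue. Channel
operations share the read lock, so they do not block one another on it, while
a checkpoint takes the write lock and therefore sees a quiescent queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; names and structure are assumptions.
class CheckpointLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // Stand-in for the event queue; holds event pointers as plain longs.
  private final Deque<Long> queue = new ArrayDeque<>();

  // Channel operation: holds the shared read lock for its duration.
  void put(long pointer) {
    lock.readLock().lock();
    try {
      // Read-lock holders run concurrently, so the queue still needs
      // its own mutual exclusion.
      synchronized (queue) {
        queue.addLast(pointer);
      }
    } finally {
      lock.readLock().unlock();
    }
  }

  // Channel operation: returns the oldest pointer, or null when empty.
  Long take() {
    lock.readLock().lock();
    try {
      synchronized (queue) {
        return queue.pollFirst();
      }
    } finally {
      lock.readLock().unlock();
    }
  }

  // Checkpoint: the exclusive write lock keeps all channel operations out
  // while the queue state is captured. Returns the captured queue size.
  int checkpoint() {
    lock.writeLock().lock();
    try {
      return queue.size();
    } finally {
      lock.writeLock().unlock();
    }
  }

  public static void main(String[] args) {
    CheckpointLockSketch channel = new CheckpointLockSketch();
    channel.put(1L);
    channel.put(2L);
    System.out.println("took " + channel.take());
    System.out.println("checkpointed size " + channel.checkpoint());
  }
}
```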
This addresses bug FLUME-1232.
https://issues.apache.org/jira/browse/FLUME-1232
Diffs
-----
/trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Checkpoint.java
1351988
/trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java
1351988
/trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannelConfiguration.java
1351988
/trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java
1351988
/trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java
1351988
/trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFile.java
1351988
/trunk/flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/ReplayHandler.java
1351988
/trunk/flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestCheckpoint.java
1351988
/trunk/flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannel.java
1351988
/trunk/flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFlumeEventQueue.java
1351988
/trunk/flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLog.java
1351988
/trunk/flume-ng-core/src/main/java/org/apache/flume/sink/LoggerSink.java
1351988
Diff: https://reviews.apache.org/r/5436/diff/
Testing
-------
Ran all tests. Did some manual testing. Will be doing more manual testing and
cleanup as necessary while the review is underway.
Thanks,
Arvind Prabhakar