> On 2012-04-19 09:12:40, Arvind Prabhakar wrote: > > Thanks for the patch Brock. The implementation looks great to me! Some > > feedback follows: > > > > Some convention nits: > > * Please make sure the line lengths are under 80 chars. > > * Please use all caps for variables that are static final. > > > > One high-level consideration: Since it is possible to have multiple file > > channels within the same agent, we should either make file channel > > multi-tenant capable like the jdbc channel, or instead have the ability to > > associate directory locks for the file channel instance for checkpoint and > > data directories. Doing this will ensure that there is no corruption and > > the system exits cleanly in case of misconfiguration. > > > > If you chose to implement the non-multitenant system, it will be preferable > > to use the channel name within the default paths for checkpoint and data > > directories. This will ensure minimum configuration necessary even when > > using multiple file channels within the same agent. > > > > > > Brock Noland wrote: > I added locks to ensure only one channel has access to the directories at > a given time. > > This is a concern I have a about JDBC Channel. I am not sold on using the > channel name from the configuration as that is a symbolic name, users will > likely think they can change it. However, if there are events stored in that > channel and they change the name, all their events are gone. If we used the > name this same thing would occur. Today with JDBC Channel and this > implementation of FileChannel, it can be multi-tenant if configured to use > different directories for each Channel. I feel like this is a better approach. > > Thoughts?
For JDBC channel, this is a necessary evil. If the channel was not multi-tenant capable, it would mean a new connection pool, statement caches, cursors, etc for every channel instance. In which case the agent would become extremely resource heavy and would require extensive configuration to work out of the box. So instead we chose to use the channel name to disambiguate. I agree that this needs to be clearly documented and that users should know that the events will be lost if they change the channel name. The good news is that this is recoverable and even manually fixable. Since the file channel does not suffer from similare heavy resource requirements, it is good to have locks to ensure there is no corruption; even though that means that for an agent only one channel instance can exist with default configuration. > On 2012-04-19 09:12:40, Arvind Prabhakar wrote: > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java, > > line 198 > > <https://reviews.apache.org/r/4661/diff/5/?file=101000#file101000line198> > > > > The initialization of the Log instance along with replaying it should > > be done in the start() method. Correspondingly, the stop() method should > > shutdown the log and release all associated resources. > > Brock Noland wrote: > Agreed, note that the channel's start method is currently not called. > FLUME-1121 addresses this. Thanks for pointing that out. I will prioritize the review of 1121 to get that in before this patch. > On 2012-04-19 09:12:40, Arvind Prabhakar wrote: > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java, > > lines 289-290 > > <https://reviews.apache.org/r/4661/diff/5/?file=101000#file101000line289> > > > > wish: it would be better to have this as a thread local to safeguard > > against transaction leaks. > > Brock Noland wrote: > This was addressed in the latest patch. However, I wonder if we should be > doing this as BasicChannelSemantics.getTransaction() already does and users > cannot call createTransaction() themselves. Agreed. Makes more sense to do it in the BasicChannelSemantics implementation. We should perhaps file another Jira just for that work, and during its implementation revert the logic you have added in your patch to avoid unnecessary duplication. > On 2012-04-19 09:12:40, Arvind Prabhakar wrote: > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java, > > line 46 > > <https://reviews.apache.org/r/4661/diff/6/?file=102330#file102330line46> > > > > Not a request, but something to keep an eye on: this implementation > > uses the modulo operation heavily to calculate the queue positions. The > > alternative way of doing this would be to maintain head, size and > > optionally tail pointers. That will reduce the number of overall modulo > > operations needed. > > > > The only reason I point this out is that modulo is CPU intensive for > > large numbers and therefore may become a concern for performance. > > Brock Noland wrote: > I added a note, if we find this is troublesome we do a later JIRA. Since > we will be writing to disk every time we do modulus I think it's unlikely to > be a performance problem. Makes sense. > On 2012-04-19 09:12:40, Arvind Prabhakar wrote: > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java, > > line 214 > > <https://reviews.apache.org/r/4661/diff/6/?file=102330#file102330line214> > > > > From cursory analysis it seems that this will likely not be able to > > handle the wrap-around logic correctly. For example, if the capacity is 10, > > next is 9 and size/index is 1: the calculated index will be 10, when it > > should be 0. > > > > One way to address this would be modulo the converted value before > > return: > > > > return (next + index % elements.capacity()) % elements.capacity(); > > > > Brock Noland wrote: > I am working on the rest of the feedback, but I just wanted to say "big > thank you" for finding this bug. I hit it last night while stress testing and > was planning on spending much of my day finding it. > > Brock Noland wrote: > I have never written a data structure like this one, so I encourage a > thorough review of it. Will do my best. Although, I have never reviewed a data structure like this before too :) Regardless, if there are issues we can always do follow-up jiras. - Arvind ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4661/#review6786 ----------------------------------------------------------- On 2012-04-19 20:10:53, Brock Noland wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/4661/ > ----------------------------------------------------------- > > (Updated 2012-04-19 20:10:53) > > > Review request for Flume. > > > Summary > ------- > > This patch implements a durable file channel. It does by writing all > transaction events to disk and syncing to disk when a commit occurs. It does > have a memory component in that pointers to the event on disk are kept in > memory. This will consume 8 bytes of direct memory (non-heap) per event. Some > basic calculations are in the FileChannel java docs. > > > This addresses bug FLUME-1085. > https://issues.apache.org/jira/browse/FLUME-1085 > > > Diffs > ----- > > flume-ng-channels/flume-file-channel/pom.xml e8155be > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Checkpoint.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Commit.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannel.java > a279453 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FileChannelConfiguration.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEvent.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventPointer.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogFile.java > PRE-CREATION > flume-ng-sdk/src/test/java/org/apache/flume/api/TestNettyAvroRpcClient.java > 0b8a2c0 > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/LogUtils.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Pair.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Put.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/ReplayHandler.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Rollback.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Take.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/TransactionEventRecord.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/CountingSinkRunner.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/CountingSourceRunner.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestCheckpoint.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFileChannel.java > ab66998 > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFlumeEvent.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFlumeEventPointer.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestFlumeEventQueue.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLog.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestLogFile.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestTransactionEventRecord.java > PRE-CREATION > > flume-ng-channels/flume-file-channel/src/test/java/org/apache/flume/channel/file/TestUtils.java > PRE-CREATION > flume-ng-channels/flume-file-channel/src/test/resources/log4j.properties > 739ecc8 > > flume-ng-channels/flume-recoverable-memory-channel/src/main/java/org/apache/flume/channel/recoverable/memory/RecoverableMemoryChannel.java > 0622f27 > > flume-ng-channels/flume-recoverable-memory-channel/src/main/java/org/apache/flume/channel/recoverable/memory/wal/SequenceIDBuffer.java > fa63b73 > flume-ng-core/src/main/java/org/apache/flume/channel/ChannelProcessor.java > eb6460e > flume-ng-core/src/main/java/org/apache/flume/channel/ChannelType.java > d8419e8 > flume-ng-core/src/main/java/org/apache/flume/sink/NullSink.java c812851 > flume-ng-core/src/main/java/org/apache/flume/source/StressSource.java > PRE-CREATION > flume-ng-core/src/main/java/org/apache/flume/tools/DirectMemoryUtils.java > PRE-CREATION > flume-ng-sdk/src/test/java/org/apache/flume/api/TestFailoverRpcClient.java > 225cd34 > > Diff: https://reviews.apache.org/r/4661/diff > > > Testing > ------- > > Unit tests and integration tests added to cover obvious cases. > > > Thanks, > > Brock > >
