> On Aug. 5, 2012, 2:32 p.m., Brock Noland wrote: > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java, > > line 726 > > <https://reviews.apache.org/r/6387/diff/3/?file=134246#file134246line726> > > > > Can you explain why you are doing this? I don't follow the comment. > > SInce it's a set we can add as many duplicates as we want and they will get > > removed. In practice this set will be quite small. > > Hari Shreedharan wrote: > This is not because of duplicates. The reason I am doing is this is that > in the current set of log files, there maybe files not queue.fileIDs - we > need to add those to the set of files we need to keep, and also be sure we > keep everything from queue.fileIDs. Finally we need to verify every file in > queue.fileIDs was updated. > > This can be achieved by adding to a new set the fileIDs if that fileID is > already in queue.fileIDs and then making sure that the new set and > queue.fileIDs is equal. Instead I felt it would be cheaper to first > clone,queue.fileIDs and remove IDs from queue.fileIDs, and finally verify > that queue.fileIDs is empty. The reason this is cheaper is that each insert > into a treeset is expensive, while a clone is relatively cheap. Also we need > to do a full set comparison, which also might be expensive -- which we can > avoid by doing this. > > Hope that makes sense.
Edit: The reason I am doing is this is that in the current set of log files, there maybe files not in queue.fileIDs - Hari ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/6387/#review9860 ----------------------------------------------------------- On Aug. 5, 2012, 12:26 a.m., Hari Shreedharan wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/6387/ > ----------------------------------------------------------- > > (Updated Aug. 5, 2012, 12:26 a.m.) > > > Review request for Flume, Arvind Prabhakar and Brock Noland. > > > Description > ------- > > Currently, the background worker will delete logs marked as inactive even if > there is no checkpoint. This can lead to checkpoints referencing logs which > were deleted after the checkpoint was done - causing the channel to not > startup unless the checkpoint itself is deleted. This patch collects all > fileIDs which are present in a checkpoint and deletes logs which were not > active at the time of the checkpoint. > > removeOldLogs is always called from the background worker and thus does not > need locking. Replay will not call this, since the channel is not open at the > time of replay. > > > This addresses bug FLUME-1417. > https://issues.apache.org/jira/browse/FLUME-1417 > > > Diffs > ----- > > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/FlumeEventQueue.java > 64d3dec > > flume-ng-channels/flume-file-channel/src/main/java/org/apache/flume/channel/file/Log.java > 64a70c8 > > Diff: https://reviews.apache.org/r/6387/diff/ > > > Testing > ------- > > All unit tests pass. Did functional testing. > > > Thanks, > > Hari Shreedharan > >
