[ https://issues.apache.org/jira/browse/FLUME-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492923#comment-13492923 ]
Roshan Naik commented on FLUME-1227:
------------------------------------

The parent channel's put()/take() will be the source/sink for its overflow channel. For the special case of supporting this only in the memory channel, I think it could easily employ whatever policy the SpillableChannel would have used.

For the more general case of making this a cross-cutting feature available to all channels, with the ability to chain them, I would conjecture that it may be possible to use the same policy at each level of the chain. This policy could then be pushed into the common base class for channels.

> Introduce some sort of SpillableChannel
> ---------------------------------------
>
>                 Key: FLUME-1227
>                 URL: https://issues.apache.org/jira/browse/FLUME-1227
>             Project: Flume
>          Issue Type: New Feature
>          Components: Channel
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Patrick Wendell
>
> I would like to introduce a new channel that would behave similarly to
> scribe (https://github.com/facebook/scribe). It would be something between
> the memory and file channels. Input events would be saved directly to memory
> (only) and would be served from there. If the memory became full, we would
> offload the events to a file.
>
> Let me describe the use case behind this request. We have plenty of frontend
> servers that are generating events. We want to send all events to just a
> limited number of machines, from where we would send the data to HDFS (some
> sort of staging layer). The reason for this second layer is our need to
> decouple event aggregation and frontend code onto separate machines. Using
> the memory channel is fully sufficient, as we can survive the loss of some
> portion of the events. However, in order to sustain maintenance windows or
> networking issues, we would end up having to assign a lot of memory to those
> "staging" machines. The referenced "scribe" deals with this problem by
> implementing the following logic: events are saved in memory, similarly to
> our MemoryChannel.
> However, if the memory gets full (because of maintenance, networking
> issues, ...), it will spill data to disk, where the data will sit until
> everything starts working again.
>
> I would like to introduce a channel that would implement similar logic. Its
> durability guarantees would be the same as MemoryChannel's: if someone pulled
> the power cord, this channel would lose data. Based on the discussion in
> FLUME-1201, I would propose making the implementation completely independent
> of any other channel's internal code.
>
> Jarcec
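
To make the chaining idea from the comment concrete, here is a minimal sketch in plain Java. It is not Flume code: `EventStore`, `BoundedStore`, and `SpillChain` are hypothetical names invented for illustration, the "disk" level is simulated by a second in-memory queue, and transactions and thread safety are deliberately left out. The only point it illustrates is that one overflow policy (try the parent first, spill to the overflow when the parent is full) can be reused at every level, because a chain is itself a store.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical minimal store interface; this is NOT Flume's Channel API. */
interface EventStore<E> {
    boolean put(E event); // returns false when the store cannot accept the event
    E take();             // returns null when the store is empty
    boolean isEmpty();
}

/** Bounded FIFO store; stands in for a memory channel (or a file channel in this sketch). */
final class BoundedStore<E> implements EventStore<E> {
    private final Deque<E> queue = new ArrayDeque<>();
    private final int capacity;

    BoundedStore(int capacity) { this.capacity = capacity; }

    @Override public boolean put(E event) {
        if (queue.size() >= capacity) return false; // full: caller must spill
        queue.addLast(event);
        return true;
    }

    @Override public E take() { return queue.pollFirst(); }

    @Override public boolean isEmpty() { return queue.isEmpty(); }
}

/** A parent store whose put()/take() feed and drain an overflow store. */
final class SpillChain<E> implements EventStore<E> {
    private final EventStore<E> primary;
    private final EventStore<E> overflow;

    SpillChain(EventStore<E> primary, EventStore<E> overflow) {
        this.primary = primary;
        this.overflow = overflow;
    }

    @Override public boolean put(E event) {
        // While spilled events exist, keep appending to the overflow so that
        // newer events never overtake older, already-spilled ones.
        if (overflow.isEmpty() && primary.put(event)) return true;
        return overflow.put(event);
    }

    @Override public E take() {
        // The primary always holds the oldest events, so drain it first.
        E event = primary.take();
        return event != null ? event : overflow.take();
    }

    @Override public boolean isEmpty() { return primary.isEmpty() && overflow.isEmpty(); }
}

class SpillDemo {
    public static void main(String[] args) {
        // Two-slot "memory" level backed by a larger simulated "disk" level.
        EventStore<String> channel =
            new SpillChain<>(new BoundedStore<>(2), new BoundedStore<>(100));
        for (String e : new String[] {"e1", "e2", "e3", "e4"}) channel.put(e);
        for (String e = channel.take(); e != null; e = channel.take()) {
            System.out.println(e);
        }
    }
}
```

Because `SpillChain` implements the same interface as its parts, a chain can itself serve as the overflow of another chain, which is the "same policy at each level" conjecture in miniature. A real implementation would also have to decide how this interacts with channel transactions, which the sketch ignores.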