[ https://issues.apache.org/jira/browse/FLUME-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857070#comment-13857070 ]
Brock Noland commented on FLUME-1227:
-------------------------------------

Thank you for addressing the feedback! I am OK with your reasoning regarding adding dual checkpointing to the example. I haven't looked at this code and review in detail. It looks like Hari has, so I think he'll have to make the call on when to commit.

Thank you for your hard work, Roshan!

> Introduce some sort of SpillableChannel
> ---------------------------------------
>
>                 Key: FLUME-1227
>                 URL: https://issues.apache.org/jira/browse/FLUME-1227
>             Project: Flume
>          Issue Type: New Feature
>          Components: Channel
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Roshan Naik
>         Attachments: 1227.patch.1, FLUME-1227.v2.patch, FLUME-1227.v5.patch,
> FLUME-1227.v6.patch, FLUME-1227.v7.patch, FLUME-1227.v8.patch,
> FLUME-1227.v9.patch, SpillableMemory Channel Design 2.pdf,
> SpillableMemory Channel Design.pdf
>
>
> I would like to introduce a new channel that would behave similarly to
> scribe (https://github.com/facebook/scribe). It would be something between
> the memory and file channels. Input events would be saved directly to
> memory (only) and would be served from there. If the memory became full, we
> would spill the events to a file.
> Let me describe the use case behind this request. We have plenty of
> frontend servers that are generating events. We want to send all events to
> just a limited number of machines, from which we would send the data to
> HDFS (some sort of staging layer). The reason for this second layer is our
> need to decouple event aggregation from the frontend code by putting them
> on separate machines. The memory channel is fully sufficient, as we can
> survive the loss of some portion of the events. However, in order to ride
> out maintenance windows or networking issues, we would end up having to
> assign a lot of memory to those "staging" machines. The referenced "scribe"
> deals with this problem by implementing the following logic: events are
> saved in memory, similarly to our MemoryChannel. However, if the memory
> gets full (because of maintenance, networking issues, ...), it spills data
> to disk, where it sits until everything starts working again.
> I would like to introduce a channel that implements similar logic. Its
> durability guarantees would be the same as MemoryChannel's: if someone
> pulled the power cord, this channel would lose data. Based on the
> discussion in FLUME-1201, I propose keeping the implementation completely
> independent of any other channel's internal code.
> Jarcec



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
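
For illustration, the spill logic described in the issue (serve events from memory, overflow to a file once memory is full, and give no durability guarantee beyond MemoryChannel's) could be sketched roughly as below. This is not the code in the FLUME-1227 patches and does not implement Flume's Channel interface. The class name SpillableQueueSketch and its methods are hypothetical, events are plain strings, and a temporary file stands in for the on-disk overflow.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;

/** Hypothetical sketch of a memory-first queue that spills to disk when full. */
final class SpillableQueueSketch implements AutoCloseable {
    private final int memoryCapacity;
    private final ArrayDeque<String> memory = new ArrayDeque<>();
    private final RandomAccessFile overflow; // temp file, discarded on exit
    private long readPos = 0;                // offset of the next spilled event
    private int spilled = 0;                 // spilled events not yet taken

    SpillableQueueSketch(int memoryCapacity) {
        this.memoryCapacity = memoryCapacity;
        try {
            File file = File.createTempFile("spill", ".dat");
            file.deleteOnExit(); // no durability: same guarantee as memory
            overflow = new RandomAccessFile(file, "rw");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stores in memory when possible, otherwise appends to the overflow file. */
    synchronized void put(String event) {
        // Once anything is on disk, newer events must follow it to keep FIFO order.
        if (spilled == 0 && memory.size() < memoryCapacity) {
            memory.addLast(event);
            return;
        }
        try {
            overflow.seek(overflow.length());
            overflow.writeUTF(event);
            spilled++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the oldest event, or null when the queue is empty. */
    synchronized String take() {
        if (!memory.isEmpty()) {
            return memory.pollFirst(); // memory always holds the oldest events
        }
        if (spilled == 0) {
            return null;
        }
        try {
            overflow.seek(readPos);
            String event = overflow.readUTF();
            readPos = overflow.getFilePointer();
            if (--spilled == 0) {
                overflow.setLength(0); // disk drained: reclaim the space
                readPos = 0;
            }
            return event;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    synchronized int spilledCount() {
        return spilled;
    }

    @Override
    public synchronized void close() {
        try {
            overflow.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a memory capacity of 2, putting "a", "b", "c" leaves one event on disk, and take() still returns them in arrival order. A real channel would also need transactions, byte-based capacity limits, and a policy for a full disk, none of which this sketch attempts.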