[jira] [Commented] (FLUME-1227) Introduce some sort of SpillableChannel

Brock Noland (JIRA) Thu, 26 Dec 2013 12:58:40 -0800

    [ 
https://issues.apache.org/jira/browse/FLUME-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857070#comment-13857070
 ]


Brock Noland commented on FLUME-1227:
-------------------------------------

Thank you for addressing the feedback!  I am OK with your reasoning regarding 
adding dual checkpointing to the example. I haven't looked at this code and 
review in detail. It looks like Hari has, so I think he'll have to make the 
call on when to commit.

Thank you for your hard work, Roshan!

> Introduce some sort of SpillableChannel
> ---------------------------------------
>
>                 Key: FLUME-1227
>                 URL: https://issues.apache.org/jira/browse/FLUME-1227
>             Project: Flume
>          Issue Type: New Feature
>          Components: Channel
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Roshan Naik
>         Attachments: 1227.patch.1, FLUME-1227.v2.patch, FLUME-1227.v5.patch, 
> FLUME-1227.v6.patch, FLUME-1227.v7.patch, FLUME-1227.v8.patch, 
> FLUME-1227.v9.patch, SpillableMemory Channel Design 2.pdf, SpillableMemory 
> Channel Design.pdf
>
>
> I would like to introduce a new channel that would behave similarly to 
> scribe (https://github.com/facebook/scribe). It would be something between 
> the memory and file channels. Incoming events would be saved directly to 
> memory (only) and would be served from there. If the memory became full, we 
> would spill the events to a file.
> Let me describe the use case behind this request. We have plenty of frontend 
> servers that are generating events. We want to send all events to just a 
> limited number of machines, from which we would send the data to HDFS (some 
> sort of staging layer). The reason for this second layer is our need to 
> decouple event aggregation and frontend code onto separate machines. Using 
> the memory channel is fully sufficient, as we can survive the loss of some 
> portion of the events. However, in order to sustain maintenance windows or 
> networking issues, we would end up having to assign a lot of memory to those 
> "staging" machines. The referenced "scribe" deals with this problem by 
> implementing the following logic: events are saved in memory, similarly to 
> our MemoryChannel. However, if the memory gets full (because of maintenance, 
> networking issues, ...), it spills data to disk, where it sits until 
> everything starts working again.
> I would like to introduce a channel that would implement similar logic. Its 
> durability guarantees would be the same as MemoryChannel's: if someone 
> pulled the power cord, this channel would lose data. Based on the 
> discussion in FLUME-1201, I would propose making the implementation 
> completely independent of any other channel's internal code.
> Jarcec
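
The spill-on-full behaviour described in the issue can be illustrated with a
small sketch. This is not the attached patch and does not use Flume's Channel
API; the class name SpillableQueueSketch, its methods, and the length-prefixed
temp-file layout are all assumptions made for illustration. Events go to
memory while there is room, overflow to a temporary file once memory is full,
and are served oldest first. As with MemoryChannel, nothing is recovered after
a crash.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;

/** Hypothetical sketch of a memory-first queue that spills to disk when full. */
class SpillableQueueSketch implements AutoCloseable {
    private final int memoryCapacity;                  // max events held in memory
    private final ArrayDeque<byte[]> memory = new ArrayDeque<>();
    private final RandomAccessFile overflow;           // length-prefixed records
    private long readPos = 0;                          // next record to read
    private long writePos = 0;                         // end of written records
    private int spilledCount = 0;                      // records still on disk

    SpillableQueueSketch(int memoryCapacity) {
        this.memoryCapacity = memoryCapacity;
        try {
            File f = File.createTempFile("spill", ".bin");
            f.deleteOnExit();                          // no durability intended
            this.overflow = new RandomAccessFile(f, "rw");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stores in memory if possible; otherwise appends to the overflow file. */
    synchronized void put(byte[] event) {
        // While anything is on disk, new events must also go to disk,
        // otherwise a newer event could overtake an older spilled one.
        if (spilledCount == 0 && memory.size() < memoryCapacity) {
            memory.addLast(event);
            return;
        }
        try {
            overflow.seek(writePos);
            overflow.writeInt(event.length);
            overflow.write(event);
            writePos = overflow.getFilePointer();
            spilledCount++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the oldest event, or null if the queue is empty. */
    synchronized byte[] take() {
        if (!memory.isEmpty()) {
            return memory.pollFirst();                 // memory holds the oldest events
        }
        if (spilledCount == 0) {
            return null;
        }
        try {
            overflow.seek(readPos);
            byte[] event = new byte[overflow.readInt()];
            overflow.readFully(event);
            readPos = overflow.getFilePointer();
            if (--spilledCount == 0) {                 // disk drained: reclaim space
                overflow.setLength(0);
                readPos = 0;
                writePos = 0;
            }
            return event;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    synchronized int spilledCount() {
        return spilledCount;
    }

    @Override
    public synchronized void close() {
        try {
            overflow.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a memory capacity of 2, putting four events leaves two in memory and two
on disk, and take() still returns them in insertion order. A real channel
would additionally need transactions, capacity limits on the disk side, and
batching, none of which this sketch attempts.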



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
