[ 
https://issues.apache.org/jira/browse/FLUME-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609911#comment-13609911
 ] 

Hari Shreedharan commented on FLUME-1227:
-----------------------------------------

Hi Juhani,

Thanks for your comments. I agree with most of what you have mentioned.
{quote}
As to lifecycle management, I don't necessarily feel that having a channel own 
its sub-channels is a particularly good precedent. I think it would be 
preferable that we allow the lifecycle manager to return interfaces rather than 
having components creating other components explicitly. Configuration would 
have to have some grasp of dependencies though... Sub-channels would need to be 
instantiated before the "owner"
{quote}

I agree with your last statement. Configuration will also need to detect cycles 
so that components cannot depend on each other in a loop. I don't particularly 
like the idea of passing references to existing channels into another channel 
to use as sub-channels, but I won't block it since there seems to have been 
some consensus on this earlier. Frankly, I think wrapping 2 channels inside a 
single one is overkill. I think this channel can be easily implemented with an 
mmap-ed file that is never explicitly fsync-ed. This might cause some page 
faults, but page cache management is usually smart enough that this does not 
affect performance much, and this implementation is likely to be faster too 
(in fact, it is very similar to the File Channel checkpoint class). Using the 
mapped file as a cyclic buffer would probably be as good, and gives the same 
guarantees as the memory channel (which is what we are targeting in this jira, 
I suppose?). 
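
To make the mmap idea concrete, here is a rough sketch of a cyclic buffer over a 
memory-mapped file. This is only an illustration under my own assumptions, not 
code from the attached patch or from the File Channel: the class name 
MappedRingBuffer, the length-prefixed record layout and the fixed capacity are 
all hypothetical. The read/write positions live on the heap, so nothing is 
recoverable after a restart, which matches the memory channel guarantees.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: fixed-capacity cyclic buffer backed by an mmap-ed file. */
final class MappedRingBuffer {
    private final MappedByteBuffer map;
    private final int capacity;
    private int head; // index of the next byte to read
    private int tail; // index of the next byte to write
    private int used; // number of bytes currently stored

    private MappedRingBuffer(MappedByteBuffer map, int capacity) {
        this.map = map;
        this.capacity = capacity;
    }

    static MappedRingBuffer open(Path file, int capacity) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, capacity);
            return new MappedRingBuffer(map, capacity);
        }
    }

    /** Stores one length-prefixed event; returns false if it does not fit. */
    synchronized boolean offer(byte[] event) {
        int needed = Integer.BYTES + event.length;
        if (needed > capacity - used) {
            return false;
        }
        writeInt(event.length);
        for (byte b : event) {
            writeByte(b);
        }
        // Deliberately no map.force() here: write-back is left to the page cache.
        return true;
    }

    /** Returns the oldest event, or null if the buffer is empty. */
    synchronized byte[] poll() {
        if (used == 0) {
            return null;
        }
        byte[] event = new byte[readInt()];
        for (int i = 0; i < event.length; i++) {
            event[i] = readByte();
        }
        return event;
    }

    private void writeByte(byte b) {
        map.put(tail, b);
        tail = (tail + 1) % capacity; // wrap around at the end of the file
        used++;
    }

    private byte readByte() {
        byte b = map.get(head);
        head = (head + 1) % capacity;
        used--;
        return b;
    }

    private void writeInt(int v) {
        for (int shift = 24; shift >= 0; shift -= 8) {
            writeByte((byte) (v >>> shift));
        }
    }

    private int readInt() {
        int v = 0;
        for (int i = 0; i < Integer.BYTES; i++) {
            v = (v << 8) | (readByte() & 0xFF);
        }
        return v;
    }
}
```

A real channel would also need transactions and a policy for what happens when 
offer() returns false; the sketch leaves both out.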

Also, I like the implementation you have mentioned above, though this can be 
quite tricky to get right. 
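
Coming back to the configuration point: instantiating sub-channels before their 
"owner" and rejecting cycles is essentially a topological sort over the 
declared dependencies. A minimal sketch, assuming the configuration can be 
reduced to a map from component name to the names it depends on (the class 
ComponentOrder and that map are hypothetical, not existing Flume API):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: order components so that dependencies come first. */
final class ComponentOrder {

    /**
     * @param deps component name -> names of the components it depends on
     * @return instantiation order, dependencies before their dependents
     * @throws IllegalStateException if the dependencies contain a cycle
     */
    static List<String> instantiationOrder(Map<String, List<String>> deps) {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new LinkedHashSet<>();
        for (String name : deps.keySet()) {
            visit(name, deps, done, inProgress, order);
        }
        return order;
    }

    private static void visit(String name, Map<String, List<String>> deps,
            Set<String> done, Set<String> inProgress, List<String> order) {
        if (done.contains(name)) {
            return;
        }
        // Seeing a component that is still being visited means we looped back.
        if (!inProgress.add(name)) {
            throw new IllegalStateException(
                "Dependency cycle involving " + name + " via " + inProgress);
        }
        for (String dep : deps.getOrDefault(name, Collections.emptyList())) {
            visit(dep, deps, done, inProgress, order);
        }
        inProgress.remove(name);
        done.add(name);
        order.add(name); // all dependencies are already in the list
    }
}
```

With a spillable channel declared as depending on a memory and a file channel, 
the two sub-channels come out ahead of the spillable one, and a pair of 
channels that list each other fails at configuration time.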
                
> Introduce some sort of SpillableChannel
> ---------------------------------------
>
>                 Key: FLUME-1227
>                 URL: https://issues.apache.org/jira/browse/FLUME-1227
>             Project: Flume
>          Issue Type: New Feature
>          Components: Channel
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Roshan Naik
>         Attachments: 1227.patch.1, SpillableMemory Channel Design.pdf
>
>
> I would like to introduce new channel that would behave similarly as scribe 
> (https://github.com/facebook/scribe). It would be something between memory 
> and file channel. Input events would be saved directly to the memory (only) 
> and would be served from there. In case that the memory would be full, we 
> would outsource the events to file.
> Let me describe the use case behind this request. We have plenty of frontend 
> servers that are generating events. We want to send all events to just 
> limited number of machines from where we would send the data to HDFS (some 
> sort of staging layer). Reason for this second layer is our need to decouple 
> event aggregation and front end code to separate machines. Using memory 
> channel is fully sufficient as we can survive loss of some portion of the 
> events. However in order to sustain maintenance windows or networking issues 
> we would have to end up with a lot of memory assigned to those "staging" 
> machines. Referenced "scribe" is dealing with this problem by implementing 
> following logic - events are saved in memory similarly as our MemoryChannel. 
> However in case that the memory gets full (because of maintenance, networking 
> issues, ...) it will spill data to disk where they will be sitting until 
> everything starts working again.
> I would like to introduce a channel that would implement similar logic. Its 
> durability guarantees would be same as MemoryChannel - in case that someone 
> would remove power cord, this channel would lose data. Based on the 
> discussion in FLUME-1201, I would propose to have the implementation 
> completely independent of any other channel's internal code.
> Jarcec

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
