[ 
https://issues.apache.org/jira/browse/FLUME-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143240#comment-16143240
 ] 

Bessenyei Balázs Donát commented on FLUME-3149:
-----------------------------------------------

Hi [~zyfo2],

Thank you for the patch and the pull request!

I have skimmed through the change and your comments. The idea is great and I do 
get your point. However, as [~fszabo] has said, channels should be 
source-agnostic and sources should be channel-agnostic. To summarize: if you are 
looking for reliability, the memory channel in its current design is probably 
not the way to go.
If you can think of a way to achieve the performance boost while keeping the 
source and channel "separate" from each other, it would be an awesome change.

I see two options here: either a new channel, as you have mentioned, or a way 
of making the memory channel "reliable" without adding source-specific code.

What do you think?


Thank you,

Donat

> reduce cpu cost for file source transfer while still maintaining reliability
> ----------------------------------------------------------------------------
>
>                 Key: FLUME-3149
>                 URL: https://issues.apache.org/jira/browse/FLUME-3149
>             Project: Flume
>          Issue Type: Improvement
>          Components: File Channel
>            Reporter: will zhang
>
> The file channel tracks transferred events and uses a transactional mechanism 
> to make transfers recoverable. However, this increases CPU cost because of 
> frequent system calls such as write and read. The CPU cost can be very high 
> when the transfer rate is high. In contrast, the memory channel has no such 
> issue and requires only about 10% of the CPU cost in the same environment, but 
> its events are not recovered if the system goes down unexpectedly.
> For sources like taildir/spooldir, I propose that we track file offsets and 
> store them locally to achieve reliability while still using the memory 
> channel to reduce CPU cost. I have already implemented this feature by 
> storing the offsets in event headers, passing them to my own 
> "offsetMemoryChannel", and persisting these offsets to local disk. In our 
> production environment this reduces CPU cost by about 90 percent.
> Please let me know if it's worthwhile to have this feature in the community 
> version. Thank you.
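
For illustration only, the offset-tracking idea in the description could be sketched roughly as below. This is not the code from the patch: the class name OffsetCheckpoint, the header names "file" and "offset", and the tab-separated checkpoint format are all assumptions made for this sketch, and it deliberately uses plain header maps instead of Flume types so that it stays source-agnostic.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of Flume): remembers the highest delivered
// offset per file and checkpoints that map to a local file.
class OffsetCheckpoint {
    // Header names are assumptions made for this sketch.
    static final String FILE_HEADER = "file";
    static final String OFFSET_HEADER = "offset";

    private final Path checkpointFile;
    private final Map<String, Long> offsets = new TreeMap<>();

    // Loads any previously flushed offsets so a restart can resume from them.
    OffsetCheckpoint(Path checkpointFile) throws IOException {
        this.checkpointFile = checkpointFile;
        if (Files.exists(checkpointFile)) {
            for (String line : Files.readAllLines(checkpointFile, StandardCharsets.UTF_8)) {
                int tab = line.lastIndexOf('\t');
                if (tab > 0) {
                    offsets.put(line.substring(0, tab),
                                Long.parseLong(line.substring(tab + 1)));
                }
            }
        }
    }

    // Intended to be called once the event carrying these headers has been
    // delivered downstream; events without both headers are ignored.
    void recordDelivered(Map<String, String> headers) {
        String file = headers.get(FILE_HEADER);
        String offset = headers.get(OFFSET_HEADER);
        if (file == null || offset == null) {
            return;
        }
        // Keep the largest offset seen so out-of-order calls never move it back.
        offsets.merge(file, Long.parseLong(offset), Math::max);
    }

    // Offset to resume reading from; 0 if the file has never been recorded.
    long committedOffset(String file) {
        return offsets.getOrDefault(file, 0L);
    }

    // Writes to a temporary file and renames it over the checkpoint, so that
    // on platforms where the rename is atomic a crash does not leave a
    // half-written checkpoint behind.
    void flush() throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> e : offsets.entrySet()) {
            lines.add(e.getKey() + "\t" + e.getValue());
        }
        Path tmp = checkpointFile.resolveSibling(checkpointFile.getFileName() + ".tmp");
        Files.write(tmp, lines, StandardCharsets.UTF_8);
        Files.move(tmp, checkpointFile,
                   StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

With a scheme like this, events delivered after the last flush() would be read again after a crash, so delivery is at-least-once rather than exactly-once. How often to flush is the trade-off between CPU cost and the amount of re-sent data; where such a component would be invoked is exactly the open design question in this issue.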



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
