We have a legacy system which writes events to a file (the existing log file).
This will continue. If I used a FileChannel, I would be doubling the number of IO
operations (writes to the legacy log file, plus writes to the WAL).


________________________________
 From: Brock Noland <br...@cloudera.com>
To: user@flume.apache.org; Rahul Ravindran <rahu...@yahoo.com> 
Sent: Tuesday, November 6, 2012 1:38 PM
Subject: Re: Guarantees of the memory channel for delivering to sink
 
You're still going to be writing out all events, no? So how would the file
channel do more IO than that?

On Tue, Nov 6, 2012 at 3:32 PM, Rahul Ravindran <rahu...@yahoo.com> wrote:
> Hi,
>    I am very new to Flume and we are hoping to use it for our log
> aggregation into HDFS. I have a few questions below:
>
> FileChannel will double our disk IO, which will affect IO performance on
> certain performance sensitive machines. Hence, I was hoping to write a
> custom Flume source which will use a memory channel, and which will perform
> checkpointing. The checkpoint will be updated each time we perform a
> successful insertion into the memory channel. (I realize that this results
> in a risk of data loss, the maximum size of which is the capacity of the
> memory channel).
>
>    As long as there is capacity in the memory channel buffers, does the
> memory channel guarantee delivery to a sink (does it wait for
> acknowledgements, and retry failed packets)? This would mean that we need to
> ensure that we do not exceed the channel capacity.
>
> I am writing a custom source which will use the memory channel, and which
> will catch a ChannelException to identify any channel capacity issues (that
> is, the buffer used by the memory channel is full because of lagging sinks,
> network issues, etc.). Is that a reasonable assumption to make?
>
> Thanks,
> ~Rahul.
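
The approach described in the quoted message (insert into a bounded in-memory channel, advance a checkpoint only after a successful insert, and treat an exception from the channel as a "full" signal) could be sketched roughly as below. This is a minimal, self-contained illustration and does not use the Flume API: BoundedMemoryChannel, ChannelFullException and CheckpointingSource are hypothetical names invented for this sketch, with ChannelFullException standing in for the ChannelException mentioned above.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in for the exception raised when the channel is full.
// This is NOT Flume's ChannelException, only an illustration of the idea.
class ChannelFullException extends RuntimeException {
    ChannelFullException(String message) {
        super(message);
    }
}

// Hypothetical bounded in-memory channel (assumption: fixed event capacity).
class BoundedMemoryChannel {
    private final Queue<String> events = new ArrayDeque<>();
    private final int capacity;

    BoundedMemoryChannel(int capacity) {
        this.capacity = capacity;
    }

    // Rejects the event with an exception when the buffer is already full.
    synchronized void put(String event) {
        if (events.size() >= capacity) {
            throw new ChannelFullException("channel full, capacity=" + capacity);
        }
        events.add(event);
    }

    // Called by the sink side; returns null when the channel is empty.
    synchronized String take() {
        return events.poll();
    }
}

// Hypothetical source that reads lines from the legacy log and checkpoints
// its position only after the channel has accepted the line.
class CheckpointingSource {
    private final BoundedMemoryChannel channel;
    private long checkpoint = 0; // number of log lines accepted so far

    CheckpointingSource(BoundedMemoryChannel channel) {
        this.channel = channel;
    }

    // Returns true if the line was accepted; false means the channel is full
    // and the caller should back off and retry the same line later.
    boolean offer(String line) {
        try {
            channel.put(line);
            checkpoint++; // advance only after a successful insertion
            return true;
        } catch (ChannelFullException e) {
            return false; // checkpoint unchanged, nothing is skipped
        }
    }

    long checkpoint() {
        return checkpoint;
    }
}

class CheckpointSketch {
    public static void main(String[] args) {
        BoundedMemoryChannel channel = new BoundedMemoryChannel(2);
        CheckpointingSource source = new CheckpointingSource(channel);
        for (String line : new String[] {"e1", "e2", "e3"}) {
            System.out.println(line + " accepted=" + source.offer(line)
                    + " checkpoint=" + source.checkpoint());
        }
    }
}
```

Note that in this sketch the checkpoint moves forward as soon as the channel accepts an event, not when a sink has delivered it, so anything still buffered at the time of a crash is lost. That is the same bounded risk (at most the channel capacity) acknowledged in the quoted message.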



-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
