Ping on the below questions about the new Spool Directory source:
If we choose to use the memory channel with this source, to an Avro sink on a
remote box, do we risk data loss in the event of a network partition or a slow
network, or if the Flume agent on the source box dies?
If we choose to use
Hi,
Yes, if you use the memory channel, you can lose data. To not lose data, the
file channel needs to write to disk...
Brock
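For reference, the trade-off described above comes down to the channel type set in the agent's properties file. The following is only a sketch: the agent name `agent`, the channel names, the capacity values, and the directory paths are illustrative assumptions, not settings taken from this thread.

```properties
# Hypothetical agent name "agent"; channel names, sizes, and paths are placeholders.

# Memory channel: events are held in RAM, so they are lost if the agent dies.
agent.channels = memCh
agent.channels.memCh.type = memory
agent.channels.memCh.capacity = 10000
agent.channels.memCh.transactionCapacity = 1000

# File channel alternative: events are written to disk and survive an agent restart.
# agent.channels = fileCh
# agent.channels.fileCh.type = file
# agent.channels.fileCh.checkpointDir = /var/flume/checkpoint
# agent.channels.fileCh.dataDirs = /var/flume/data
```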
On Wed, Nov 7, 2012 at 1:29 PM, Rahul Ravindran rahu...@yahoo.com wrote:
Ping on the below questions about the new Spool Directory source:
If we choose to use the memory channel
Hi,
Thanks for the response.
Does the memory channel provide transactional guarantees? In the event of a
network packet loss, does it retry sending the packet? If we ensure that we do
not exceed the capacity for the memory channel, does it continue retrying to
send an event to the remote
The memory channel doesn't know about networks. Sources and sinks like
avrosource/avrosink do. They operate over TCP/IP, and when there is an error
sending data downstream they roll the transaction back so that no data is
lost. I believe the docs cover this here
Hi,
I am very new to Flume and we are hoping to use it for our log aggregation
into HDFS. I have a few questions below:
FileChannel will double our disk IO, which will affect IO performance on
certain performance sensitive machines. Hence, I was hoping to write a custom
Flume source which
You're still going to be writing out all events, no? So how would the file
channel do more IO than that?
On Tue, Nov 6, 2012 at 3:32 PM, Rahul Ravindran rahu...@yahoo.com wrote:
Hi,
I am very new to Flume and we are hoping to use it for our log
aggregation into HDFS. I have a few questions below:
But in your architecture you are going to write the contents of the
memory channel out? Or did I miss something?
The checkpoint will be updated each time we perform a successive
insertion into the memory channel.
On Tue, Nov 6, 2012 at 3:43 PM, Rahul Ravindran rahu...@yahoo.com wrote:
We have a
We will update the checkpoint each time (we may tune this to be periodic) but
the contents of the memory channel will be in the legacy logs which are
currently being generated.
Additionally, the sink for the memory channel will send to an Avro source on
another machine.
Does that clear things up?
This use case sounds like a perfect use for the Spool Directory source,
which will be in the upcoming 1.3 release.
Brock
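As a rough illustration of the setup being suggested, an agent that reads files from a spool directory and forwards events to a remote Avro source might be configured along these lines. This is a sketch under assumptions: the agent name `agent`, the component names, the directory paths, and the host and port are all hypothetical placeholders. A file channel is shown for durability; a memory channel could be swapped in at the cost of losing buffered events if the agent dies.

```properties
# Hypothetical agent name "agent"; component names, paths, host, and port are placeholders.
agent.sources = spool
agent.channels = ch
agent.sinks = avroOut

# Spool Directory source: picks up files placed in spoolDir.
agent.sources.spool.type = spooldir
agent.sources.spool.spoolDir = /var/log/app/spool
agent.sources.spool.channels = ch

# File channel: buffers events on disk between the source and the sink.
agent.channels.ch.type = file
agent.channels.ch.checkpointDir = /var/flume/checkpoint
agent.channels.ch.dataDirs = /var/flume/data

# Avro sink: sends events to an Avro source on the remote collector.
agent.sinks.avroOut.type = avro
agent.sinks.avroOut.hostname = collector.example.com
agent.sinks.avroOut.port = 4141
agent.sinks.avroOut.channel = ch
```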
On Tue, Nov 6, 2012 at 4:53 PM, Rahul Ravindran rahu...@yahoo.com wrote:
We will update the checkpoint each time (we may tune this to be periodic)
but the contents of the