Re: Flume-ng 1.6 reliable setup

Gonzalo Herreros Mon, 19 Oct 2015 02:29:20 -0700

Why don't you use a Kafka channel?
It would be simpler, and it would meet your initial requirement of a
fault-tolerant channel.
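
As a rough sketch, assuming the Flume 1.6.0 property names for the Kafka
channel (the host names and the topic below are placeholders I made up, not
taken from your setup), the channel definition would look something like:

```properties
# Kafka channel: events are buffered in a Kafka topic rather than in
# local memory or on the local disk.
# kafka1..3, zk1 and the topic name are hypothetical placeholders.
agent1.channels = ch1
agent1.channels.ch1.type = org.apache.flume.channel.kafka.KafkaChannel
agent1.channels.ch1.brokerList = kafka1:9092,kafka2:9092,kafka3:9092
agent1.channels.ch1.zookeeperConnect = zk1:2181
agent1.channels.ch1.topic = flume-channel
agent1.channels.ch1.groupId = flume
```

Buffered events then live in the topic, so they should survive a crash of
the Flume node itself.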


Regards,
Gonzalo

On 19 October 2015 at 10:23, Simone Roselli <[email protected]>
wrote:

> However,
>
> since the arrival order on Kafka (the main sink) is not a particular problem
> for me, my current solution would be:
>
>  * memory channel
>  * sinkgroup with 2 sinks:
>    ** Kafka
>    ** File_roll (writes events to the '/data/x' directory in case Kafka is
> down)
>  * periodically check for files in '/data/x' and, if there are any,
> re-push them to Kafka
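>
> A minimal sketch of that layout, assuming Flume 1.6 property names; the
> names k1/k2/g1, the broker hosts and the topic are placeholders:
>
> ```properties
> # Memory channel drained by a failover sink group:
> # Kafka is preferred, file_roll is the fallback.
> agent1.channels = ch1
> agent1.sinks = k1 k2
> agent1.sinkgroups = g1
>
> agent1.channels.ch1.type = memory
>
> # Primary sink: Kafka (hosts and topic are hypothetical)
> agent1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
> agent1.sinks.k1.brokerList = kafka1:9092,kafka2:9092,kafka3:9092
> agent1.sinks.k1.topic = events
> agent1.sinks.k1.channel = ch1
>
> # Fallback sink: roll events into files under /data/x
> agent1.sinks.k2.type = file_roll
> agent1.sinks.k2.sink.directory = /data/x
> agent1.sinks.k2.channel = ch1
>
> # Failover processor: the sink with the larger priority value is tried first
> agent1.sinkgroups.g1.sinks = k1 k2
> agent1.sinkgroups.g1.processor.type = failover
> agent1.sinkgroups.g1.processor.priority.k1 = 10
> agent1.sinkgroups.g1.processor.priority.k2 = 5
> ```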
>
> I still don't know whether it is possible to re-push File_roll files to
> Kafka using bin/flume-ng
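>
> One possibility I can imagine, as an untested sketch: a second agent with a
> spooling directory source feeding a Kafka sink. It assumes the finished
> files are first moved out of '/data/x' into a separate directory (here
> '/data/x-done', a made-up path), since the spooling directory source
> expects files not to change once they appear. The agent name, hosts and
> topic are placeholders as well:
>
> ```properties
> # Hypothetical "repush" agent: read completed file_roll output line by
> # line and send each line to Kafka as one event.
> repush.sources = s1
> repush.channels = c1
> repush.sinks = k1
>
> repush.sources.s1.type = spooldir
> repush.sources.s1.spoolDir = /data/x-done
> repush.sources.s1.channels = c1
>
> repush.channels.c1.type = memory
>
> repush.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
> repush.sinks.k1.brokerList = kafka1:9092,kafka2:9092,kafka3:9092
> repush.sinks.k1.topic = events
> repush.sinks.k1.channel = c1
> ```
>
> It would be started with something like
> 'bin/flume-ng agent --name repush --conf conf --conf-file repush.conf'.
> As far as I understand, File_roll writes only the event bodies by default,
> so any event headers would be lost on the way.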
>
> Any hints would be appreciated.
>
> Many thanks
>
> On Fri, Oct 16, 2015 at 4:32 PM, Simone Roselli <[email protected]>
> wrote:
>
>> Hi Phil,
>>
>> thanks for your reply.
>>
>> Yes, with a file channel configured, Flume consumes up to 80-90% CPU.
>>
>> My settings:
>> # Channel configuration
>> agent1.channels.ch1.type = file
>> agent1.channels.ch1.checkpointDir = /opt/flume-ng/chekpoint
>> agent1.channels.ch1.dataDirs = /opt/flume-ng/data
>> agent1.channels.ch1.capacity = 1000000
>> agent1.channels.ch1.transactionCapacity = 10000
>>
>> # flume-env.sh
>> export JAVA_OPTS="-Xms512m -Xmx2048m"
>>
>> # top
>> 22079 flume-ng  20   0 6924752 785536  17132 S  83.7%  2.4   3:53.19 java
>>
>> Do you have any GC tuning to suggest?
>>
>> Thanks
>> Simone
>>
>>
>>
>> On Thu, Oct 15, 2015 at 7:59 PM, Phil Scala <[email protected]>
>> wrote:
>>
>>> Hi Simone
>>>
>>>
>>>
>>> I wonder why you’re seeing 90% CPU use when you use a file channel.  I
>>> would expect high disk I/O instead.  As a counterexample, I have 4 spooling
>>> directory sources on a single server, each going to a separate file
>>> channel, also on an SSD-based server.  I do not see any notable CPU or
>>> even disk I/O utilization.  I am pushing about 10 million events per day
>>> across all 4 sources, and it has been running reliably for 2 years now.
>>>
>>>
>>>
>>> I would always use a file channel; any memory channel runs the risk of
>>> data loss if the node were to fail.  I would be at least as worried about
>>> the local node failing, seeing that a 3-node Kafka cluster would have to
>>> lose 2 nodes before it lost quorum.
>>>
>>>
>>>
>>> Not sure what your data source is; if you can add more Flume nodes, that
>>> would of course help.
>>>
>>>
>>>
>>> Have you given it ample heap space?  Maybe GC is causing the high CPU.
>>>
>>>
>>>
>>>
>>>
>>> Phil
>>>
>>>
>>>
>>>
>>>
>>> *From:* Simone Roselli [mailto:[email protected]]
>>> *Sent:* Friday, October 09, 2015 12:33 AM
>>> *To:* [email protected]
>>> *Subject:* Flume-ng 1.6 reliable setup
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>>
>>> I'm currently planning to migrate from Flume 0.9 to Flume-ng 1.6, but I'm
>>> having trouble finding a reliable setup for it.
>>>
>>>
>>>
>>> My sink is a 3-node Kafka cluster. I must avoid *losing events in
>>> case the main sink is down*, broken or unreachable for a while.
>>>
>>>
>>>
>>> In Flume 0.9, I use a memory channel with the *store on failure* feature,
>>> which starts writing events to the local disk in case the target sink is
>>> not available.
>>>
>>>
>>>
>>> In Flume-ng 1.6 the same behaviour would be accomplished by setting up a
>>> *Spillable memory channel*, but the problem with this solution is stated
>>> at the end of the channel's description: "*This channel is currently
>>> experimental and not recommended for use in production.*"
>>>
>>>
>>>
>>> In Flume-ng 1.6, it's possible to set up a pool of *Failover sinks*. So
>>> I was thinking of configuring a *File Roll* as the Secondary sink in case
>>> the Primary is down. However, once the Primary sink is back online, the
>>> data placed on the Secondary sink (local disk) won't be automatically
>>> pushed to the Primary one.
>>>
>>>
>>>
>>> Another option would be setting up a *file channel*: write each event
>>> to the disk and then sink it. Besides the fact that I don't love the idea
>>> of continuously writing/deleting every single event on an SSD, this setup
>>> is taking 90% of CPU. Exactly the same configuration with a memory channel
>>> takes 3%.
>>>
>>>
>>>
>>> Are there other solutions to evaluate?
>>>
>>>
>>>
>>> Simone
>>>
>>>
>>>
>>
>>
>
