Hi, ... because a Kafka channel would lead me to the same problem, no?

I have 200 nodes, each one with a Flume-ng agent aboard. I cannot lose a
single event. With a memory/file channel, in case Kafka is
down/broken/buggy, I could still take care of events (Spillable memory,
File roll, other sinks...). In case of a Kafka channel (another, separate
Kafka cluster) I would rely exclusively on that Kafka cluster, which was
my initial non-ideal situation, having it as a sink.

Thanks
Simone

On Mon, Oct 19, 2015 at 11:28 AM, Gonzalo Herreros <[email protected]> wrote:

> Why don't you use a Kafka channel?
> It would be simpler and it would meet your initial requirement of having
> channel fault tolerance.
>
> Regards,
> Gonzalo
>
> On 19 October 2015 at 10:23, Simone Roselli <[email protected]>
> wrote:
>
>> However,
>>
>> since the arrival order on Kafka (main sink) is not a particular
>> problem for me, my current solution would be:
>>
>> * memory channel
>> * sinkgroup with 2 sinks:
>> ** Kafka
>> ** File_roll (write events to the '/data/x' directory, in case Kafka
>> is down)
>> * periodically check for files in '/data/x' and, if present, re-push
>> them to Kafka
>>
>> I still don't know whether it is possible to re-push File-roll files to
>> Kafka using bin/flume-ng; a sketch of what I have in mind is below.
>>
>> Any hints would be appreciated.
>>
>> Many thanks
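>>
>> (Rough sketch of the re-push part: a second, small agent whose spooldir
>> source reads back the rolled files and whose sink is Kafka. All names,
>> paths and brokers below are invented, and I haven't tested this:)
>>
>> repush.sources = src1
>> repush.channels = mem1
>> repush.sinks = kafka1
>>
>> # Read back the files written by file_roll. Move them here only once
>> # they are closed: a spooldir source requires immutable files.
>> repush.sources.src1.type = spooldir
>> repush.sources.src1.spoolDir = /data/x/ready
>> repush.sources.src1.deletePolicy = immediate
>> repush.sources.src1.channels = mem1
>>
>> repush.channels.mem1.type = memory
>> repush.channels.mem1.capacity = 10000
>>
>> # Flume 1.6 Kafka sink
>> repush.sinks.kafka1.type = org.apache.flume.sink.kafka.KafkaSink
>> repush.sinks.kafka1.brokerList = kafka1:9092,kafka2:9092,kafka3:9092
>> repush.sinks.kafka1.topic = events
>> repush.sinks.kafka1.channel = mem1
>>
>> It would be launched with something like:
>> bin/flume-ng agent --conf conf --conf-file repush.conf --name repush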
>>
>> On Fri, Oct 16, 2015 at 4:32 PM, Simone Roselli <
>> [email protected]> wrote:
>>
>>> Hi Phil,
>>>
>>> thanks for your reply.
>>>
>>> Yes, the file-channel configuration is consuming up to 80/90% CPU.
>>>
>>> My settings:
>>>
>>> # Channel configuration
>>> agent1.channels.ch1.type = file
>>> agent1.channels.ch1.checkpointDir = /opt/flume-ng/chekpoint
>>> agent1.channels.ch1.dataDirs = /opt/flume-ng/data
>>> agent1.channels.ch1.capacity = 1000000
>>> agent1.channels.ch1.transactionCapacity = 10000
>>>
>>> # flume-env.sh
>>> export JAVA_OPTS="-Xms512m -Xmx2048m"
>>>
>>> # top
>>> 22079 flume-ng 20 0 6924752 785536 17132 S 83.7% 2.4 3:53.19 java
>>>
>>> Do you have any tuning for the GC?
>>>
>>> Thanks
>>> Simone
>>>
>>> On Thu, Oct 15, 2015 at 7:59 PM, Phil Scala <[email protected]>
>>> wrote:
>>>
>>>> Hi Simone
>>>>
>>>> I wonder why you're seeing 90% CPU use when you use a file channel; I
>>>> would expect high disk I/O instead. As a counterpoint, I have a
>>>> single server with 4 spooldir sources, each going to a separate file
>>>> channel, also on an SSD-based server. I do not see any notable CPU or
>>>> disk I/O utilization. I am pushing about 10 million events per day
>>>> across all 4 sources, and it has been running reliably for 2 years
>>>> now.
>>>>
>>>> I would always use a file channel: any memory channel runs the risk
>>>> of data loss if the node fails. I would be as worried about the local
>>>> node failing as about the 3-node Kafka cluster, which would have to
>>>> lose 2 nodes before losing quorum.
>>>>
>>>> Not sure what your data source is, but if you can add more Flume
>>>> nodes, that would of course help.
>>>>
>>>> Have you given it ample heap space? Maybe GCs are causing the high
>>>> CPU.
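>>>>
>>>> (A cheap experiment, with no claim it fits your workload: pin the
>>>> heap so it never resizes, try CMS, and turn on GC logging so you can
>>>> see whether collections really eat the CPU. The sizes are made up:)
>>>>
>>>> # flume-env.sh
>>>> export JAVA_OPTS="-Xms2048m -Xmx2048m -XX:+UseConcMarkSweepGC \
>>>>   -XX:+CMSParallelRemarkEnabled -verbose:gc -XX:+PrintGCDetails"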
>>>>
>>>> Phil
>>>>
>>>> *From:* Simone Roselli [mailto:[email protected]]
>>>> *Sent:* Friday, October 09, 2015 12:33 AM
>>>> *To:* [email protected]
>>>> *Subject:* Flume-ng 1.6 reliable setup
>>>>
>>>> Hi,
>>>>
>>>> I'm currently planning to migrate from Flume 0.9 to Flume-ng 1.6, but
>>>> I'm having trouble finding a reliable setup for the latter.
>>>>
>>>> My sink is a 3-node Kafka cluster. I must avoid *losing events in
>>>> case the main sink is down*, broken or unreachable for a while.
>>>>
>>>> In Flume 0.9, I use a memory channel with the *store on failure*
>>>> feature, which starts writing events to the local disk in case the
>>>> target sink is not available.
>>>>
>>>> In Flume-ng 1.6 the same behaviour would be accomplished by setting
>>>> up a *Spillable memory channel*, but the problem with this solution
>>>> is stated at the end of the channel's description: "*This channel is
>>>> currently experimental and not recommended for use in production.*"
>>>>
>>>> In Flume-ng 1.6 it's also possible to set up a pool of *Failover
>>>> sinks*, so I was thinking of hypothetically configuring a *File Roll*
>>>> sink as the secondary, in case the primary is down (rough sketch
>>>> below). However, once the primary sink came back online, the data
>>>> placed on the secondary sink (local disk) would not be automatically
>>>> pushed to the primary one.
>>>>
>>>> Another option would be setting up a *file channel*: write each event
>>>> to disk, then sink it. Leaving aside that I don't love the idea of
>>>> continuously writing/deleting every single event on an SSD, this
>>>> setup takes 90% of the CPU. The exact same configuration with a
>>>> memory channel takes 3%.
>>>>
>>>> Other solutions to evaluate?
>>>>
>>>> Simone
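>>>>
>>>> (The failover pool sketch, purely hypothetical; sink names, the topic
>>>> and the directory are invented, and both sinks drain the same
>>>> channel:)
>>>>
>>>> agent1.sinks = kafka1 roll1
>>>> agent1.sinkgroups = g1
>>>>
>>>> # Failover processor: Kafka wins while healthy, File Roll takes over
>>>> agent1.sinkgroups.g1.sinks = kafka1 roll1
>>>> agent1.sinkgroups.g1.processor.type = failover
>>>> agent1.sinkgroups.g1.processor.priority.kafka1 = 10
>>>> agent1.sinkgroups.g1.processor.priority.roll1 = 5
>>>> agent1.sinkgroups.g1.processor.maxpenalty = 10000
>>>>
>>>> # Primary sink: Kafka (Flume 1.6 syntax)
>>>> agent1.sinks.kafka1.type = org.apache.flume.sink.kafka.KafkaSink
>>>> agent1.sinks.kafka1.brokerList = kafka1:9092,kafka2:9092,kafka3:9092
>>>> agent1.sinks.kafka1.topic = events
>>>> agent1.sinks.kafka1.channel = ch1
>>>>
>>>> # Secondary sink: dump events to local disk while Kafka is down
>>>> agent1.sinks.roll1.type = file_roll
>>>> agent1.sinks.roll1.sink.directory = /data/x
>>>> agent1.sinks.roll1.sink.rollInterval = 300
>>>> agent1.sinks.roll1.channel = ch1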
