Hi Hari, I think it's actually more than enough. Given the purpose and architecture of Flume NG, it feels wrong to let it become an all-purpose queue. The more you need to buffer on a single node, the more vulnerable your system becomes to a full-stop situation, and that's an obvious sign the infrastructure architecture is flawed to begin with. Instead, invest in infrastructure: add redundant Flume agents and do round-robin load balancing across them (a minimal sketch follows below).
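By Hari's own figures below (the checkpoint is an int-indexed mmap at 16 bytes per event), the hard ceiling works out to roughly 2^31 / 16, i.e. about 134 million events per file channel, so spreading the load is the safer path anyway. As a rough sketch only, not something taken from Jeff's setup: a sink group with the load_balance processor and the round_robin selector fans a single channel out to two collectors. The sink names, hostnames and ports below are made up for the example.

agent.sinks = bc-avro-1 bc-avro-2
agent.sinkgroups = bc-collectors
agent.sinkgroups.bc-collectors.sinks = bc-avro-1 bc-avro-2
agent.sinkgroups.bc-collectors.processor.type = load_balance
# round_robin is the default selector; spelled out here for clarity
agent.sinkgroups.bc-collectors.processor.selector = round_robin
# back off a collector that fails instead of retrying it on every batch
agent.sinkgroups.bc-collectors.processor.backoff = true

# two Avro sinks draining the same file channel, one per downstream collector
agent.sinks.bc-avro-1.type = avro
agent.sinks.bc-avro-1.channel = bluecoat-channel
agent.sinks.bc-avro-1.hostname = collector-1.example.com
agent.sinks.bc-avro-1.port = 4141

agent.sinks.bc-avro-2.type = avro
agent.sinks.bc-avro-2.channel = bluecoat-channel
agent.sinks.bc-avro-2.hostname = collector-2.example.com
agent.sinks.bc-avro-2.port = 4141

Keep in mind that a sink group still runs its sinks one at a time, so this buys failover and load spreading rather than extra parallelism; real redundancy also means more than one agent box, which is an infrastructure change, not a config change. There is also a short note on the file-channel free-space and replay settings after the quoted thread, below.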
On Thu, Nov 12, 2015 at 8:19 PM, Hari Shreedharan <[email protected]> wrote:

> So there are a couple of issues related to int overflows. Basically, the checkpoint file is mmap-ed, so indexing is done with an integer, and since we read 16 bytes per event, the total number of events can be about 2 billion / 16 or so (give or take), so your channel capacity needs to be below that. I have not looked at the exact numbers, but this is an approximate range. If this is something that concerns you, please file a JIRA. I wanted to get to this at some point, but didn't see the urgency.
>
> Thanks,
> Hari Shreedharan
>
> On Nov 12, 2015, at 8:39 AM, Jeff Alfeld <[email protected]> wrote:
>
> Now that the channels are working again, it raises the question of why this occurred. If there is a theoretical limit on a file channel's size beyond disk space limitations, what is that limit?
>
> Jeff
>
> On Thu, Nov 12, 2015 at 10:23 AM Jeff Alfeld <[email protected]> wrote:
>
>> Thanks for the assist; it seems that clearing the directories once more and lowering the capacity of the channel has allowed the service to start successfully on this server.
>>
>> Jeff
>>
>> On Thu, Nov 12, 2015 at 10:03 AM Ahmed Vila <[email protected]> wrote:
>>
>>> A channel capacity of 100 million seems exaggerated to me; try lowering it. Please also check that you have at least 512 MB of free space on the device where you're storing the channel data and checkpoint.
>>>
>>> To me, it looks like Flume tries to replay the channel log but encounters an EOF. Please make sure there are no hidden files in there. Removing the settings for the data and checkpoint dirs might be the best bet to try first, so that it creates ~/.flume/file-channel/checkpoint and ~/.flume/file-channel/data.
>>>
>>> As a last resort, you might want to try setting use-fast-replay or even use-log-replay-v1 to true.
>>>
>>> On Tue, Nov 10, 2015 at 5:38 PM, Jeff Alfeld <[email protected]> wrote:
>>>
>>>> I am having an issue on a server that I am standing up to forward log data from a spooling directory to our Hadoop cluster. I am receiving the following errors when Flume is starting up:
>>>>
>>>> 10 Nov 2015 16:13:25,751 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:145) - Starting Channel bluecoat-channel
>>>> 10 Nov 2015 16:13:25,751 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.start:269) - Starting FileChannel bluecoat-channel { dataDirs: [/Dropbox/flume_tmp/bluecoat-channel/data] }...
>>>> 10 Nov 2015 16:13:25,751 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:145) - Starting Channel fs-channel
>>>> 10 Nov 2015 16:13:25,751 INFO [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.FileChannel.start:269) - Starting FileChannel fs-channel { dataDirs: [/Dropbox/flume_tmp/fs-channel/data] }...
>>>> 10 Nov 2015 16:13:25,778 INFO [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.Log.<init>:336) - Encryption is not enabled
>>>> 10 Nov 2015 16:13:25,778 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.<init>:336) - Encryption is not enabled
>>>> 10 Nov 2015 16:13:25,779 INFO [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.Log.replay:382) - Replay started
>>>> 10 Nov 2015 16:13:25,779 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.replay:382) - Replay started
>>>> 10 Nov 2015 16:13:25,780 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.replay:394) - Found NextFileID 0, from []
>>>> 10 Nov 2015 16:13:25,780 INFO [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.Log.replay:394) - Found NextFileID 0, from []
>>>> 10 Nov 2015 16:13:25,784 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.Log.replay:492) - Failed to initialize Log on [channel=bluecoat-channel]
>>>> java.io.EOFException
>>>>     at java.io.RandomAccessFile.readInt(RandomAccessFile.java:827)
>>>>     at java.io.RandomAccessFile.readLong(RandomAccessFile.java:860)
>>>>     at org.apache.flume.channel.file.EventQueueBackingStoreFactory.get(EventQueueBackingStoreFactory.java:80)
>>>>     at org.apache.flume.channel.file.Log.replay(Log.java:426)
>>>>     at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:290)
>>>>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> 10 Nov 2015 16:13:25,786 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.start:301) - Failed to start the file channel [channel=bluecoat-channel]
>>>> java.io.EOFException
>>>>     at java.io.RandomAccessFile.readInt(RandomAccessFile.java:827)
>>>>     at java.io.RandomAccessFile.readLong(RandomAccessFile.java:860)
>>>>     at org.apache.flume.channel.file.EventQueueBackingStoreFactory.get(EventQueueBackingStoreFactory.java:80)
>>>>     at org.apache.flume.channel.file.Log.replay(Log.java:426)
>>>>     at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:290)
>>>>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> 10 Nov 2015 16:13:25,784 ERROR [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.Log.replay:492) - Failed to initialize Log on [channel=fs-channel]
>>>> java.io.EOFException
>>>>     at java.io.RandomAccessFile.readInt(RandomAccessFile.java:827)
>>>>     at java.io.RandomAccessFile.readLong(RandomAccessFile.java:860)
>>>>     at org.apache.flume.channel.file.EventQueueBackingStoreFactory.get(EventQueueBackingStoreFactory.java:80)
>>>>     at org.apache.flume.channel.file.Log.replay(Log.java:426)
>>>>     at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:290)
>>>>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>> 10 Nov 2015 16:13:25,787 ERROR [lifecycleSupervisor-1-2] (org.apache.flume.channel.file.FileChannel.start:301) - Failed to start the file channel [channel=fs-channel]
>>>> java.io.EOFException
>>>>     at java.io.RandomAccessFile.readInt(RandomAccessFile.java:827)
>>>>     at java.io.RandomAccessFile.readLong(RandomAccessFile.java:860)
>>>>     at org.apache.flume.channel.file.EventQueueBackingStoreFactory.get(EventQueueBackingStoreFactory.java:80)
>>>>     at org.apache.flume.channel.file.Log.replay(Log.java:426)
>>>>     at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:290)
>>>>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>
>>>> Any suggestions on why this is occurring? I have tried stopping the service and clearing the contents of the data and checkpoint directories with no change. I have verified that the flume daemon user account has full permissions to the checkpoint and data directories also.
>>>>
>>>> Below is the config that I am currently trying to use:
>>>>
>>>> #global
>>>> agent.sources = bluecoat-src fs-src
>>>> agent.channels = bluecoat-channel fs-channel
>>>> agent.sinks = bc-avro fs-avro
>>>>
>>>> #kc bluecoat logs
>>>> agent.sources.bluecoat-src.type = spooldir
>>>> agent.sources.bluecoat-src.channels = bluecoat-channel
>>>> agent.sources.bluecoat-src.spoolDir = /Dropbox/flume
>>>> agent.sources.bluecoat-src.basenameHeader = true
>>>> agent.sources.bluecoat-src.basenameHeaderKey = basename
>>>> agent.sources.bluecoat-src.deserializer = line
>>>> agent.sources.bluecoat-src.deserializer.maxLineLength = 32000
>>>> agent.sources.bluecoat-src.deletePolicy = immediate
>>>> agent.sources.bluecoat-src.decodeErrorPolicy = IGNORE
>>>> agent.sources.bluecoat-src.maxBackoff = 10000
>>>>
>>>> agent.channels.bluecoat-channel.type = file
>>>> agent.channels.bluecoat-channel.capacity = 100000000
>>>> agent.channels.bluecoat-channel.checkpointDir = /Dropbox/flume_tmp/bluecoat-channel/checkpoint
>>>> agent.channels.bluecoat-channel.dataDirs = /Dropbox/flume_tmp/bluecoat-channel/data
>>>>
>>>> agent.sinks.bc-avro.type = avro
>>>> agent.sinks.bc-avro.channel = bluecoat-channel
>>>> agent.sinks.bc-avro.hostname = {destination server address}
>>>> agent.sinks.bc-avro.port = 4141
>>>> agent.sinks.bc-avro.batch-size = 250
>>>> agent.sinks.bc-avro.compression-type = deflate
>>>> agent.sinks.bc-avro.compression-level = 9
>>>>
>>>> #kc fs logs
>>>> agent.sources.fs-src.type = spooldir
>>>> agent.sources.fs-src.channels = fs-channel
>>>> agent.sources.fs-src.spoolDir = /Dropbox/fs
>>>> agent.sources.fs-src.deserializer = line
>>>> agent.sources.fs-src.deserializer.maxLineLength = 32000
>>>> agent.sources.fs-src.deletePolicy = immediate
>>>> agent.sources.fs-src.decodeErrorPolicy = IGNORE
>>>> agent.sources.fs-src.maxBackoff = 10000
>>>>
>>>> agent.channels.fs-channel.type = file
>>>> agent.channels.fs-channel.capacity = 100000000
>>>> agent.channels.fs-channel.checkpointDir = /Dropbox/flume_tmp/fs-channel/checkpoint
>>>> agent.channels.fs-channel.dataDirs = /Dropbox/flume_tmp/fs-channel/data
>>>>
>>>> agent.sinks.fs-avro.type = avro
>>>> agent.sinks.fs-avro.channel = fs-channel
>>>> agent.sinks.fs-avro.hostname = {destination server address}
>>>> agent.sinks.fs-avro.port = 4145
>>>> agent.sinks.fs-avro.batch-size = 250
>>>> agent.sinks.fs-avro.compression-type = deflate
>>>> agent.sinks.fs-avro.compression-level = 9
>>>>
>>>> Thanks!
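P.S. For anyone who hits the same EOFException: the free-space and replay knobs from my earlier reply are ordinary FileChannel properties. A rough sketch below, applied to the bluecoat-channel from Jeff's config; the capacity value is purely illustrative, and the expert replay flags should normally stay at their defaults.

# With checkpointDir / dataDirs left unset, the channel falls back to the defaults:
# ~/.flume/file-channel/checkpoint and ~/.flume/file-channel/data
agent.channels.bluecoat-channel.type = file
agent.channels.bluecoat-channel.capacity = 1000000
# Stop accepting events once free space on the data volume drops below this many bytes (default ~500 MB)
agent.channels.bluecoat-channel.minimumRequiredSpace = 524288000
# Expert options: replay without using the queue, or fall back to the old v1 replay logic
agent.channels.bluecoat-channel.use-fast-replay = true
agent.channels.bluecoat-channel.use-log-replay-v1 = false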
--
Best regards,
Ahmed Vila | Senior software developer
DevLogic | Sarajevo | Bosnia and Herzegovina

Office : +387 33 942 123
Mobile: +387 62 139 348

Website: www.devlogic.eu
E-mail : [email protected]
