Brock,

Thanks for the sample! Starting to see a bit more light; things are making a little more sense now.

If you wouldn't mind and have a couple of minutes to spare... I'm getting the error below and I'm not sure how to make it go away. Note that I can't use hadoop for storage; I'm using FILE_ROLL instead (ultimately the logs will need to be processed further in plain text). I'm just not sure why this is happening.

The error follows, and my conf is further down.
12 Sep 2012 13:18:54,120 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.start:211) - Starting FileChannel fileChannel { dataDirs: [/tmp/flume/data1, /tmp/flume/data2, /tmp/flume/data3] }...
12 Sep 2012 13:18:54,124 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.start:234) - Failed to start the file channel [channel=fileChannel]
java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at org.apache.flume.channel.file.Log$Builder.build(Log.java:144)
    at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:223)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.Writable
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 24 more
12 Sep 2012 13:18:54,126 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run:238) - Unable to start FileChannel fileChannel { dataDirs: [/tmp/flume/data1, /tmp/flume/data2, /tmp/flume/data3] } - Exception follows.
java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at org.apache.flume.channel.file.Log$Builder.build(Log.java:144)
    at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:223)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.Writable
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 24 more
12 Sep 2012 13:18:54,127 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.FileChannel.stop:249) - Stopping FileChannel fileChannel { dataDirs: [/tmp/flume/data1, /tmp/flume/data2, /tmp/flume/data3] }...
12 Sep 2012 13:18:54,127 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run:249) - Unsuccessful attempt to shutdown component: {} due to missing dependencies. Please shutdown the agent or disable this component, or the agent will be in an undefined state.
java.lang.IllegalStateException: Channel closed [channel=fileChannel]
    at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
    at org.apache.flume.channel.file.FileChannel.getDepth(FileChannel.java:282)
    at org.apache.flume.channel.file.FileChannel.stop(FileChannel.java:250)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:244)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
12 Sep 2012 13:18:54,622 INFO [conf-file-poller-0] (org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents:141) - Starting Sink filesink1
12 Sep 2012 13:18:54,624 INFO [conf-file-poller-0] (org.apache.flume.node.nodemanager.DefaultLogicalNodeManager.startAllComponents:152) - Starting Source avroSource
12 Sep 2012 13:18:54,626 INFO [lifecycleSupervisor-1-1] (org.apache.flume.source.AvroSource.start:138) - Starting Avro source avroSource: { bindAddress: 0.0.0.0, port: 9432 }...
12 Sep 2012 13:18:54,641 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160) - Unable to deliver event. Exception follows.
java.lang.IllegalStateException: Channel closed [channel=fileChannel]
    at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
    at org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:267)
    at org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:118)
    at org.apache.flume.sink.RollingFileSink.process(RollingFileSink.java:172)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Using your config as a template, this is my starting point (trying to get it functioning on a single host first):
node105.sources = tailsource
node105.channels = fileChannel
node105.sinks = avroSink
node105.sources.tailsource.type = exec
node105.sources.tailsource.command = tail -F /root/Desktop/apache-flume-1.3.0-SNAPSHOT/test.log
#node105.sources.stressSource.batchSize = 1000
node105.sources.tailsource.channels = fileChannel
## Sink sends avro messages to node103.bashkew.com port 9432
node105.sinks.avroSink.type = avro
node105.sinks.avroSink.batch-size = 1000
node105.sinks.avroSink.channel = fileChannel
node105.sinks.avroSink.hostname = localhost
node105.sinks.avroSink.port = 9432
node105.channels.fileChannel.type = file
node105.channels.fileChannel.checkpointDir = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/tmp/flume/checkpoint
node105.channels.fileChannel.dataDirs = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/tmp/flume/tmp/flume/data
node105.channels.fileChannel.capacity = 10000
node105.channels.fileChannel.checkpointInterval = 3000
node105.channels.fileChannel.maxFileSize = 5242880
node102.sources = avroSource
node102.channels = fileChannel
node102.sinks = filesink1
## Source listens for avro messages on port 9432 on all ips
node102.sources.avroSource.type = avro
node102.sources.avroSource.channels = fileChannel
node102.sources.avroSource.bind = 0.0.0.0
node102.sources.avroSource.port = 9432
node102.sinks.filesink1.type = FILE_ROLL
node102.sinks.filesink1.batchSize = 1000
node102.sinks.filesink1.channel = fileChannel
node102.sinks.filesink1.sink.directory = /root/Desktop/apache-flume-1.3.0-SNAPSHOT/logs/rhel5/
node102.channels.fileChannel.type = file
node102.channels.fileChannel.checkpointDir = /tmp/flume/checkpoints
node102.channels.fileChannel.dataDirs = /tmp/flume/data1,/tmp/flume/data2,/tmp/flume/data3
node102.channels.fileChannel.capacity = 5000
node102.channels.fileChannel.checkpointInterval = 45000
node102.channels.fileChannel.maxFileSize = 5242880
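In case it matters, I'm starting both agents on this one box from the same conf file, with something along these lines (the conf file name here is just whatever I happened to save it as):

```shell
# Receiver first: the agent named "node102" listens on 9432
# and rolls events out to plain-text files via FILE_ROLL.
bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name node102

# Then the sender: the agent named "node105" tails test.log
# and forwards events to localhost:9432 over Avro.
bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name node105
```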
Thanks!
Dave
-----Original Message-----
From: Brock Noland [mailto:[email protected]]
Sent: Wed 9/12/2012 9:11 AM
To: [email protected]
Subject: Re: splitting functions
Hi,
Below is a config I use to test out the FileChannel. See the comments
"##" for how messages are sent from one host to another.
node105.sources = stressSource
node105.channels = fileChannel
node105.sinks = avroSink
node105.sources.stressSource.type = org.apache.flume.source.StressSource
node105.sources.stressSource.batchSize = 1000
node105.sources.stressSource.channels = fileChannel
## Sink sends avro messages to node103.bashkew.com port 9432
node105.sinks.avroSink.type = avro
node105.sinks.avroSink.batch-size = 1000
node105.sinks.avroSink.channel = fileChannel
node105.sinks.avroSink.hostname = node102.bashkew.com
node105.sinks.avroSink.port = 9432
node105.channels.fileChannel.type = file
node105.channels.fileChannel.checkpointDir = /tmp/flume/checkpoints
node105.channels.fileChannel.dataDirs = /tmp/flume/data1,/tmp/flume/data2,/tmp/flume/data3
node105.channels.fileChannel.capacity = 10000
node105.channels.fileChannel.checkpointInterval = 3000
node105.channels.fileChannel.maxFileSize = 5242880
node102.sources = avroSource
node102.channels = fileChannel
node102.sinks = nullSink
## Source listens for avro messages on port 9432 on all ips
node102.sources.avroSource.type = avro
node102.sources.avroSource.channels = fileChannel
node102.sources.avroSource.bind = 0.0.0.0
node102.sources.avroSource.port = 9432
node102.sinks.nullSink.type = null
node102.sinks.nullSink.batchSize = 1000
node102.sinks.nullSink.channel = fileChannel
node102.channels.fileChannel.type = file
node102.channels.fileChannel.checkpointDir = /tmp/flume/checkpoints
node102.channels.fileChannel.dataDirs = /tmp/flume/data1,/tmp/flume/data2,/tmp/flume/data3
node102.channels.fileChannel.capacity = 5000
node102.channels.fileChannel.checkpointInterval = 45000
node102.channels.fileChannel.maxFileSize = 5242880
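For the two-host case, the same config file can live on both boxes; each agent only picks up the section matching the name it is started with. Roughly (the conf file path is just a placeholder):

```shell
# On node105.bashkew.com: the "node105" agent generates events with
# StressSource, buffers them in the file channel, and sends them over
# Avro to node102.bashkew.com:9432.
bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name node105

# On node102.bashkew.com: the "node102" agent binds 0.0.0.0:9432,
# receives the Avro events, and drains them through the null sink.
bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name node102
```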
On Wed, Sep 12, 2012 at 10:06 AM, Cochran, David M (Contractor)
<[email protected]> wrote:
> Okay folks, after spending the better part of a week reading the docs and
> experimenting I'm lost. I have flume 1.3.x working pretty much as expected
> on a single host. It tails a log file and writes it to another rolling log
> file via flume. No problem there; it seems to work flawlessly. Where I'm stuck
> is in trying to break the functions apart across multiple hosts... a single
> host listening for others to send their logs to. All of my efforts have
> resulted in little more than headaches.
>
> I can't even see the specified port open on what should be the logging host.
> I've tried the basic examples posted on different docs but can't seem to get
> things working across multiple hosts.
>
> Would someone post a working example of the conf's needed to get me started?
> Something simple that works, so I can then pick it apart to gain more
> understanding. Apparently, I just don't have a firm enough grasp on all the
> pieces yet, but I want to learn!
>
> Thanks in advance!
> Dave
>
>
--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
