Oh, I see - so the current version "just works" for you. I misinterpreted
your last email as saying you had customized something yourself to fix it.
The script must have been fixed in an earlier JIRA. Thanks for your
thorough explanation.
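For the record, the cleanup in the newer script boils down to stripping the
property name that GetJavaProperty prints in front of the value. A minimal
sketch of the idea in bash (not the exact code from the 1.2 release; in the
real script this sits inside a function, hence the "local"):

    # GetJavaProperty prints "java.library.path=<value>" rather than the
    # bare value, so ask Hadoop for the property first...
    local HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$FLUME_CLASSPATH" \
        ${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
        java.library.path 2>/dev/null)
    # ...then drop the leading "java.library.path=", leaving only the path
    HADOOP_JAVA_LIBRARY_PATH=${HADOOP_JAVA_LIBRARY_PATH#java.library.path=}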
- Patrick

On Fri, Aug 3, 2012 at 11:21 AM, Eran Kutner <[email protected]> wrote:
> Hi Patrick,
>
> With the new script it's working OK.
> The old script (the one that comes with CDH4) had a problem.
>
> This code in the old script:
>
>   local HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$FLUME_CLASSPATH" \
>     ${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
>     java.library.path 2>/dev/null)
>
> would set HADOOP_JAVA_LIBRARY_PATH to
> "java.library.path=//usr/lib/hadoop/lib/native",
> which would end up setting FLUME_JAVA_LIBRARY_PATH to
> ":java.library.path=//usr/lib/hadoop/lib/native".
> That would then be used to start the processes with
> "-Djava.library.path=:java.library.path=//usr/lib/hadoop/lib/native",
> which is obviously wrong.
>
> The new script has code to clean up the extra "java.library.path"
> returned by the code above.
> So there are two options: either the implementation of
> org.apache.flume.tools.GetJavaProperty changed, so that it now returns
> the property name as well and the script therefore had to strip it, or
> the version included in CDH4 had a bug that was fixed in 1.2.
>
> -eran
>
> On Fri, Aug 3, 2012 at 7:43 PM, Patrick Wendell <[email protected]> wrote:
>>
>> Hey Eran,
>>
>> So the flume-ng script works by trying to figure out what library path
>> Hadoop is using and then replicating that for Flume. If
>> HADOOP_JAVA_LIBRARY_PATH is set, it will try to use that. Otherwise it
>> tries to infer the path based on what the hadoop script itself
>> determines.
>>
>> What is the path getting set to in your case, and how does that differ
>> from expectations? Just trying to figure out what the bug is.
>>
>> - Patrick
>>
>> On Fri, Aug 3, 2012 at 9:25 AM, Eran Kutner <[email protected]> wrote:
>> > Thanks Patrick, that helped me figure out the problem, and it looks
>> > like a bug in the "flume-ng" file provided with CDH4 - it was mangling
>> > the library path.
>> > I copied the file included in the Flume 1.2.0 distribution and it now
>> > works OK.
>> >
>> > Thanks for your help.
>> >
>> > -eran
>> >
>> > On Fri, Aug 3, 2012 at 6:36 PM, Patrick Wendell <[email protected]> wrote:
>> >>
>> >> Hey Eran,
>> >>
>> >> You need to make sure the Flume JVM gets passed
>> >> -Djava.library.path=XXX with the correct path to where your native
>> >> snappy libraries are located.
>> >>
>> >> You can set this by adding the option directly to the flume-ng runner
>> >> script.
>> >>
>> >> - Patrick
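[Side note for anyone finding this thread later: instead of editing the
runner script, the option can also be passed in through the environment,
since flume-ng picks up JAVA_OPTS from conf/flume-env.sh. A sketch - the
native-library location below assumes a stock CDH4 install:

    # conf/flume-env.sh
    # appended to the JVM arguments by the flume-ng launcher script
    export JAVA_OPTS="$JAVA_OPTS -Djava.library.path=/usr/lib/hadoop/lib/native"
]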
>> >> On Fri, Aug 3, 2012 at 7:33 AM, Eran Kutner <[email protected]> wrote:
>> >> > Hi,
>> >> > I'm trying to use the snappy codec but keep getting "native snappy
>> >> > library not available" errors.
>> >> > I'm using CDH4 but replaced the Flume 1.1 JARs included with that
>> >> > distribution with Flume 1.2 JARs.
>> >> > I've tried everything I can think of, including symlinking the
>> >> > hadoop native library under the flume-ng/lib/ directory, but
>> >> > nothing helps.
>> >> > Any idea how to resolve this?
>> >> >
>> >> > This is the error:
>> >> > 2012-08-03 10:23:30,598 WARN util.NativeCodeLoader: Unable to load
>> >> > native-hadoop library for your platform... using builtin-java
>> >> > classes where applicable
>> >> > 2012-08-03 10:23:35,670 WARN hdfs.HDFSEventSink: HDFS IO error
>> >> > java.io.IOException: java.lang.RuntimeException: native snappy
>> >> > library not available
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:202)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:48)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:155)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:152)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:125)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:152)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:307)
>> >> >   at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:717)
>> >> >   at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:714)
>> >> >   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >> >   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >> >   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >> >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >> >   at java.lang.Thread.run(Thread.java:662)
>> >> > Caused by: java.lang.RuntimeException: native snappy library not available
>> >> >   at org.apache.hadoop.io.compress.SnappyCodec.createCompressor(SnappyCodec.java:135)
>> >> >   at org.apache.hadoop.io.compress.SnappyCodec.createOutputStream(SnappyCodec.java:84)
>> >> >   at org.apache.flume.sink.hdfs.HDFSCompressedDataStream.open(HDFSCompressedDataStream.java:70)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:195)
>> >> >   ... 13 more
>> >> >
>> >> > And my sink configuration:
>> >> > flume05.sinks.hdfsSink.type = hdfs
>> >> > #flume05.sinks.hdfsSink.type = logger
>> >> > flume05.sinks.hdfsSink.channel = memoryChannel
>> >> > flume05.sinks.hdfsSink.hdfs.path=hdfs://hadoop2-m1:8020/test-events/%Y-%m-%d
>> >> > flume05.sinks.hdfsSink.hdfs.filePrefix=raw-events.avro
>> >> > flume05.sinks.hdfsSink.hdfs.rollInterval=60
>> >> > flume05.sinks.hdfsSink.hdfs.rollCount=0
>> >> > flume05.sinks.hdfsSink.hdfs.rollSize=0
>> >> > flume05.sinks.hdfsSink.hdfs.fileType=CompressedStream
>> >> > flume05.sinks.hdfsSink.hdfs.codeC=snappy
>> >> > flume05.sinks.hdfsSink.hdfs.writeFormat=Text
>> >> > flume05.sinks.hdfsSink.hdfs.batchSize=1000
>> >> > flume05.sinks.hdfsSink.serializer = avro_event
>> >> >
>> >> > Thanks.
>> >> >
>> >> > -eran
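[One more sanity check worth running before starting the agent - assuming
the stock CDH4 locations for the native libraries:

    # libhadoop and libsnappy both need to be loadable from the directory
    # that java.library.path points at
    ls -l /usr/lib/hadoop/lib/native/libhadoop.so* \
          /usr/lib/hadoop/lib/native/libsnappy.so*
]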
