Oh, I see - so the current version "just works" for you. I misinterpreted
your last email as saying you had customized something yourself to fix it.
The script must have been fixed in an earlier JIRA. Thanks for your
thorough explanation.
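For the record, the cleanup in the newer script boils down to stripping the
property name that GetJavaProperty prints in front of the value. A minimal
sketch of the idea in bash (not the exact code from the 1.2 release; in the
real script this sits inside a function, hence the "local"):

    # GetJavaProperty prints "java.library.path=<value>" rather than the
    # bare value, so ask Hadoop for the property first...
    local HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$FLUME_CLASSPATH" \
        ${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
        java.library.path 2>/dev/null)
    # ...then drop the leading "java.library.path=", leaving only the path
    HADOOP_JAVA_LIBRARY_PATH=${HADOOP_JAVA_LIBRARY_PATH#java.library.path=}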
- Patrick

On Fri, Aug 3, 2012 at 11:21 AM, Eran Kutner <[email protected]> wrote:
> Hi Patrick,
>
> With the new script it's working OK.
> The old script (the one that comes with CDH4) had a problem.
>
> This code in the old script:
>
>   local HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$FLUME_CLASSPATH" \
>     ${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
>     java.library.path 2>/dev/null)
>
> would set HADOOP_JAVA_LIBRARY_PATH to
> "java.library.path=//usr/lib/hadoop/lib/native",
> which would end up setting FLUME_JAVA_LIBRARY_PATH to
> ":java.library.path=//usr/lib/hadoop/lib/native".
> That would then be used to start the processes with
> "-Djava.library.path=:java.library.path=//usr/lib/hadoop/lib/native",
> which is obviously wrong.
>
> The new script has code to clean up the extra "java.library.path"
> returned by the code above.
> So there are two options: either the implementation of
> org.apache.flume.tools.GetJavaProperty changed, so that it now returns
> the property name as well and the script therefore had to strip it, or
> the version included in CDH4 had a bug that was fixed in 1.2.
>
> -eran
>
> On Fri, Aug 3, 2012 at 7:43 PM, Patrick Wendell <[email protected]> wrote:
>>
>> Hey Eran,
>>
>> So the flume-ng script works by trying to figure out what library path
>> Hadoop is using and then replicating that for Flume. If
>> HADOOP_JAVA_LIBRARY_PATH is set, it will try to use that. Otherwise it
>> tries to infer the path based on what the hadoop script itself
>> determines.
>>
>> What is the path getting set to in your case, and how does that differ
>> from expectations? Just trying to figure out what the bug is.
>>
>> - Patrick
>>
>> On Fri, Aug 3, 2012 at 9:25 AM, Eran Kutner <[email protected]> wrote:
>> > Thanks Patrick, that helped me figure out the problem, and it looks
>> > like a bug in the "flume-ng" file provided with CDH4 - it was mangling
>> > the library path.
>> > I copied the file included in the Flume 1.2.0 distribution and it now
>> > works OK.
>> >
>> > Thanks for your help.
>> >
>> > -eran
>> >
>> > On Fri, Aug 3, 2012 at 6:36 PM, Patrick Wendell <[email protected]> wrote:
>> >>
>> >> Hey Eran,
>> >>
>> >> You need to make sure the Flume JVM gets passed
>> >> -Djava.library.path=XXX with the correct path to where your native
>> >> snappy libraries are located.
>> >>
>> >> You can set this by adding the option directly to the flume-ng runner
>> >> script.
>> >>
>> >> - Patrick
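[Side note for anyone finding this thread later: instead of editing the
runner script, the option can also be passed in through the environment,
since flume-ng picks up JAVA_OPTS from conf/flume-env.sh. A sketch - the
native-library location below assumes a stock CDH4 install:

    # conf/flume-env.sh
    # appended to the JVM arguments by the flume-ng launcher script
    export JAVA_OPTS="$JAVA_OPTS -Djava.library.path=/usr/lib/hadoop/lib/native"
]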
>> >> On Fri, Aug 3, 2012 at 7:33 AM, Eran Kutner <[email protected]> wrote:
>> >> > Hi,
>> >> > I'm trying to use the snappy codec but keep getting "native snappy
>> >> > library not available" errors.
>> >> > I'm using CDH4 but replaced the Flume 1.1 JARs included with that
>> >> > distribution with Flume 1.2 JARs.
>> >> > I've tried everything I can think of, including symlinking the
>> >> > hadoop native library under the flume-ng/lib/ directory, but
>> >> > nothing helps.
>> >> > Any idea how to resolve this?
>> >> >
>> >> > This is the error:
>> >> > 2012-08-03 10:23:30,598 WARN util.NativeCodeLoader: Unable to load
>> >> > native-hadoop library for your platform... using builtin-java
>> >> > classes where applicable
>> >> > 2012-08-03 10:23:35,670 WARN hdfs.HDFSEventSink: HDFS IO error
>> >> > java.io.IOException: java.lang.RuntimeException: native snappy
>> >> > library not available
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:202)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.access$000(BucketWriter.java:48)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:155)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter$1.run(BucketWriter.java:152)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:125)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:152)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:307)
>> >> >   at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:717)
>> >> >   at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:714)
>> >> >   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >> >   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >> >   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >> >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >> >   at java.lang.Thread.run(Thread.java:662)
>> >> > Caused by: java.lang.RuntimeException: native snappy library not available
>> >> >   at org.apache.hadoop.io.compress.SnappyCodec.createCompressor(SnappyCodec.java:135)
>> >> >   at org.apache.hadoop.io.compress.SnappyCodec.createOutputStream(SnappyCodec.java:84)
>> >> >   at org.apache.flume.sink.hdfs.HDFSCompressedDataStream.open(HDFSCompressedDataStream.java:70)
>> >> >   at org.apache.flume.sink.hdfs.BucketWriter.doOpen(BucketWriter.java:195)
>> >> >   ... 13 more
>> >> >
>> >> > And my sink configuration:
>> >> > flume05.sinks.hdfsSink.type = hdfs
>> >> > #flume05.sinks.hdfsSink.type = logger
>> >> > flume05.sinks.hdfsSink.channel = memoryChannel
>> >> > flume05.sinks.hdfsSink.hdfs.path=hdfs://hadoop2-m1:8020/test-events/%Y-%m-%d
>> >> > flume05.sinks.hdfsSink.hdfs.filePrefix=raw-events.avro
>> >> > flume05.sinks.hdfsSink.hdfs.rollInterval=60
>> >> > flume05.sinks.hdfsSink.hdfs.rollCount=0
>> >> > flume05.sinks.hdfsSink.hdfs.rollSize=0
>> >> > flume05.sinks.hdfsSink.hdfs.fileType=CompressedStream
>> >> > flume05.sinks.hdfsSink.hdfs.codeC=snappy
>> >> > flume05.sinks.hdfsSink.hdfs.writeFormat=Text
>> >> > flume05.sinks.hdfsSink.hdfs.batchSize=1000
>> >> > flume05.sinks.hdfsSink.serializer = avro_event
>> >> >
>> >> > Thanks.
>> >> >
>> >> > -eran
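[One more sanity check worth running before starting the agent - assuming
the stock CDH4 locations for the native libraries:

    # libhadoop and libsnappy both need to be loadable from the directory
    # that java.library.path points at
    ls -l /usr/lib/hadoop/lib/native/libhadoop.so* \
          /usr/lib/hadoop/lib/native/libsnappy.so*
]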
