Re: buildSupportsSnappy exception when reading the snappy file in Spark

2015-09-08 Thread dong.yajun
hi Akhil,

I just set LD_LIBRARY_PATH in conf/spark-env.sh instead of
SPARK_LIBRARY_PATH, pointing it at the Hadoop native library directory,
and it works now.
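
For reference, the change amounts to a line like this in conf/spark-env.sh
(a minimal sketch, reusing the native path from my earlier mail):

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/app/install/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native

An alternative I have not tried would be the per-application settings
spark.driver.extraLibraryPath and spark.executor.extraLibraryPath in
spark-defaults.conf.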

thanks.

On Tue, Sep 8, 2015 at 6:14 PM, Akhil Das wrote:

> Looks like you have different versions of the snappy library on the
> classpath. Here's a similar discussion, if you haven't seen it already:
> http://stackoverflow.com/questions/22150417/hadoop-mapreduce-java-lang-unsatisfiedlinkerror-org-apache-hadoop-util-nativec
>
> Thanks
> Best Regards
>
> On Mon, Sep 7, 2015 at 7:41 AM, dong.yajun  wrote:
>
>> hi all,
>>
>> I hit a problem where I can't read snappy-encoded files from HDFS in
>> Spark 1.4.1.
>>
>> I configured SPARK_LIBRARY_PATH in conf/spark-env.sh to point to the
>> Hadoop native library directory and restarted the Spark cluster:
>>
>> SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:/opt/app/install/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native
>>
>>
>> the relevant part of the exception:
>>
>> Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file
>> hdfs://nameservice1/hbase/data/default/IM_ItemBase/02296539242087aea77877dced9ba3d5/BaseInfo/9fe36f74334c4d30ba1bfc17bbd717f5
>>   at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:478)
>>   at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:521)
>>   at com.newegg.ec.bigdata.dump.CombineHFileRecordReader.<init>(CombineHFileRecordReader.java:33)
>>   ... 19 more
>> Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
>>   at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
>>   at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63)
>>   at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:192)
>>   at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176)
>>   at org.apache.hadoop.hbase.io.compress.Compression$Algorithm.getDecompressor(Compression.java:328)
>>   at org.apache.hadoop.hbase.io.compress.Compression.decompress(Compression.java:423)
>>   at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultDecodingContext.prepareDecoding(HFileBlockDefaultDecodingContext.java:90)
>>
>> --
>> *Ric Dong*
>>
>>
>


-- 
*Ric Dong*


buildSupportsSnappy exception when reading the snappy file in Spark

2015-09-06 Thread dong.yajun
hi all,

I hit a problem where I can't read snappy-encoded files from HDFS in
Spark 1.4.1.

I configured SPARK_LIBRARY_PATH in conf/spark-env.sh to point to the
Hadoop native library directory and restarted the Spark cluster:

SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:/opt/app/install/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native
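
As a sanity check, one way to verify whether the native snappy library is
visible on a node (assuming the Hadoop CLI is installed there) is:

hadoop checknative -a

which should report "snappy: true" with the resolved library path when the
native library can be loaded.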


the relevant part of the exception:

Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file
hdfs://nameservice1/hbase/data/default/IM_ItemBase/02296539242087aea77877dced9ba3d5/BaseInfo/9fe36f74334c4d30ba1bfc17bbd717f5
  at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:478)
  at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:521)
  at com.newegg.ec.bigdata.dump.CombineHFileRecordReader.<init>(CombineHFileRecordReader.java:33)
  ... 19 more
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
  at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
  at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63)
  at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:192)
  at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176)
  at org.apache.hadoop.hbase.io.compress.Compression$Algorithm.getDecompressor(Compression.java:328)
  at org.apache.hadoop.hbase.io.compress.Compression.decompress(Compression.java:423)
  at org.apache.hadoop.hbase.io.encoding.HFileBlockDefaultDecodingContext.prepareDecoding(HFileBlockDefaultDecodingContext.java:90)

-- 
*Ric Dong*


Re: How can I save the RDD result as Orcfile with spark1.3?

2015-08-24 Thread dong.yajun
We plan to upgrade our Spark cluster to 1.4, and I just ran a test in
local mode following the example here:
http://hortonworks.com/blog/bringing-orc-support-into-apache-spark/

but an exception was thrown when running the example; the stack trace is below:

*Exception in thread "main" java.lang.NoSuchFieldError: defaultVal*
  at org.apache.spark.sql.hive.HiveContext$$anonfun$newTemporaryConfiguration$1.apply(HiveContext.scala:536)
  at org.apache.spark.sql.hive.HiveContext$$anonfun$newTemporaryConfiguration$1.apply(HiveContext.scala:534)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105)
  at org.apache.spark.sql.hive.HiveContext$.newTemporaryConfiguration(HiveContext.scala:534)
  at org.apache.spark.sql.hive.HiveContext.executionHive$lzycompute(HiveContext.scala:165)
  at org.apache.spark.sql.hive.HiveContext.executionHive(HiveContext.scala:161)
  at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:168)
  *at com.newegg.ec.bigdata.ORCSpark$.main(ORCSpark.scala:24)*
  at com.newegg.ec.bigdata.ORCSpark.main(ORCSpark.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

The code at line 24 (ORCSpark.scala:24 in the trace) is:
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
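
For context, the test follows the pattern from the blog post above; a minimal
sketch of it (data and paths are illustrative, not my actual code):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(
  new SparkConf().setAppName("orc-test").setMaster("local[2]"))
// line 24, where the NoSuchFieldError is thrown
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
import sqlContext.implicits._

val df = sc.parallelize(Seq((1, "a"), (2, "b"))).toDF("id", "name")
df.write.format("orc").save("/tmp/orc-demo")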

I'm using Spark core 1.4.1 and Hive 1.1.0-cdh5.4.0.


On Sat, Aug 22, 2015 at 11:18 PM, Ted Yu  wrote:

> In Spark 1.4, there was considerable refactoring around interaction with
> Hive, such as SPARK-7491.
>
> It would not be straightforward to port ORC support to 1.3.
>
> FYI
>
> On Fri, Aug 21, 2015 at 10:21 PM, dong.yajun  wrote:
>
>> hi Ted,
>>
>> thanks for your reply, is there any other way to do this with Spark 1.3,
>> such as writing the ORC file manually in a foreachPartition call?
>>
>> On Sat, Aug 22, 2015 at 12:19 PM, Ted Yu  wrote:
>>
>>> ORC support was added in Spark 1.4
>>> See SPARK-2883
>>>
>>> On Fri, Aug 21, 2015 at 7:36 PM, dong.yajun  wrote:
>>>
>>>> Hi list,
>>>>
>>>> Is there a way to save an RDD result as an ORC file in Spark 1.3? For
>>>> various reasons we can't upgrade our Spark version to 1.4 right now.
>>>>
>>>> --
>>>> *Ric Dong*
>>>>
>>>>
>>>
>>
>>
>> --
>> *Ric Dong*
>>
>>
>


-- 
*Ric Dong*


Re: How can I save the RDD result as Orcfile with spark1.3?

2015-08-21 Thread dong.yajun
hi Ted,

thanks for your reply, is there any other way to do this with Spark 1.3,
such as writing the ORC file manually in a foreachPartition call?
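
To make the question concrete, I was imagining something along these lines
with hive-exec's ORC writer. This is only a rough, untested sketch;
MyRecord stands for a hypothetical Java-bean-style record class:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hive.ql.io.orc.OrcFile
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory

rdd.foreachPartition { rows =>
  val conf = new Configuration()
  // reflection-based inspector mapping MyRecord fields to ORC columns
  val inspector = ObjectInspectorFactory.getReflectionObjectInspector(
    classOf[MyRecord], ObjectInspectorFactory.ObjectInspectorOptions.JAVA)
  // one ORC file per partition, uniquely named to avoid collisions
  val writer = OrcFile.createWriter(
    new Path("/tmp/orc-out/part-" + java.util.UUID.randomUUID),
    OrcFile.writerOptions(conf).inspector(inspector))
  rows.foreach(r => writer.addRow(r))
  writer.close()
}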

On Sat, Aug 22, 2015 at 12:19 PM, Ted Yu  wrote:

> ORC support was added in Spark 1.4
> See SPARK-2883
>
> On Fri, Aug 21, 2015 at 7:36 PM, dong.yajun  wrote:
>
>> Hi list,
>>
>> Is there a way to save an RDD result as an ORC file in Spark 1.3? For
>> various reasons we can't upgrade our Spark version to 1.4 right now.
>>
>> --
>> *Ric Dong*
>>
>>
>


-- 
*Ric Dong*


How can I save the RDD result as Orcfile with spark1.3?

2015-08-21 Thread dong.yajun
Hi list,

Is there a way to save an RDD result as an ORC file in Spark 1.3? For
various reasons we can't upgrade our Spark version to 1.4 right now.

-- 
*Ric Dong*