Here is the error log, abridged as follows:

INFO [binaryTest---main]: before first
WARN [org.apache.spark.scheduler.TaskSetManager---Result resolver thread-0]: Lost task 0.0 in stage 0.0 (TID 0, spark-dev136): org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
    at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:236)
    at org.xerial.snappy.Snappy.<clinit>(Snappy.java:48)
    at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:351)
    at org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:159)
    at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:142)
    at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2288)
WARN [org.apache.spark.scheduler.TaskSetManager---Result resolver thread-1]: Lost task 0.1 in stage 0.0 (TID 2, spark-dev136): java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
    at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:351)
    at org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:159)
    at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:142)
    at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2288)
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2301)
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2772)
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:778)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:278)
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.<init>(JavaSerializer.scala:57)

ERROR [org.apache.spark.scheduler.TaskSchedulerImpl---sparkDriver-akka.actor.default-dispatcher-17]: Lost executor 1 on spark-dev136: remote Akka client disassociated

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 4, spark-dev134): java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
    at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:351)
    at org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:159)
    at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:142)
    at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2288)

That is the error log from the console. My test code follows; it runs correctly on my laptop. I know something is wrong with the Spark cluster, so I want to avoid Snappy compression as a workaround.
val conf = new SparkConf().setAppName("binary")
conf.set("spark.io.compression.codec", "org.apache.spark.io.LZ4CompressionCodec")
val sc = new SparkContext(conf)  // was: new SparkContext() -- the conf, and with it the codec setting, was never passed in
val arr = Array("One, two, buckle my shoe",
                "Three, four, shut the door",
                "Five, six, pick up sticks",
                "Seven, eight, lay them straight",
                "Nine, ten, a big fat hen")
val pairs = arr.indices zip arr
// implicit def int2IntWritable(fint: Int): IntWritable = new IntWritable()
// implicit def string2Writable(fstring: String): Text = new Text()
val rdd = sc.makeRDD(pairs)
logInfo("before first")
println(rdd.first())
logInfo("after first")
val seq = new SequenceFileRDDFunctions(rdd)
seq.saveAsSequenceFile(args(0))

Thanks

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-avoid-use-snappy-compression-when-saveAsSequenceFile-tp17350p17424.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
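For what it's worth, there are two separate compression knobs in play here, and a sketch of both may help (this assumes Spark 1.x; the short codec name "lz4" and the optional codec argument to saveAsSequenceFile should be checked against your Spark/Hadoop versions). The SnappyInputStream frames in the trace come from Spark's internal compression (spark.io.compression.codec, used for broadcast/shuffle/result streams), which is why the conf must actually reach the SparkContext; SequenceFile output compression is a separate Hadoop-level choice:

```scala
import org.apache.hadoop.io.compress.DefaultCodec
import org.apache.spark.{SparkConf, SparkContext}

// spark.io.compression.codec controls Spark's *internal* compression
// (broadcast variables, shuffle blocks, serialized results) -- the codec
// that the SnappyInputStream stack trace above is coming from.
val conf = new SparkConf()
  .setAppName("binary")
  .set("spark.io.compression.codec", "lz4") // short name; the full class name also works
val sc = new SparkContext(conf) // the conf has to be passed in, or the setting is ignored

val rdd = sc.makeRDD(Seq((0, "one"), (1, "two")))

// SequenceFile *output* compression is chosen separately: pass a Hadoop
// CompressionCodec explicitly instead of relying on cluster defaults.
// DefaultCodec (zlib) needs no native Snappy library.
rdd.saveAsSequenceFile("/tmp/out", Some(classOf[DefaultCodec]))
```

With the codec passed explicitly, nothing in the save path should touch snappy-java even if the cluster's Hadoop configuration defaults to Snappy.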