Re: java.lang.IllegalStateException: unread block data
Look in the worker logs and see what's going on.

Thanks
Best Regards

On Tue, Jul 14, 2015 at 4:02 PM, Arthur Chan arthur.hk.c...@gmail.com wrote:
[quoted message trimmed; the full original post follows below]
java.lang.IllegalStateException: unread block data
Hi,

I use Spark 1.4. When saving the model to HDFS, I get an error. Please help!

Regards

My Scala command:

    sc.makeRDD(model.clusterCenters, 10).saveAsObjectFile("/tmp/tweets/model")

The error log:

15/07/14 18:27:40 INFO SequenceFileRDDFunctions: Saving as sequence file of type (NullWritable,BytesWritable)
15/07/14 18:27:40 INFO SparkContext: Starting job: saveAsObjectFile at <console>:45
15/07/14 18:27:40 INFO DAGScheduler: Got job 110 (saveAsObjectFile at <console>:45) with 10 output partitions (allowLocal=false)
15/07/14 18:27:40 INFO DAGScheduler: Final stage: ResultStage 174 (saveAsObjectFile at <console>:45)
15/07/14 18:27:40 INFO DAGScheduler: Parents of final stage: List()
15/07/14 18:27:40 INFO DAGScheduler: Missing parents: List()
15/07/14 18:27:40 INFO DAGScheduler: Submitting ResultStage 174 (MapPartitionsRDD[258] at saveAsObjectFile at <console>:45), which has no missing parents
15/07/14 18:27:40 INFO MemoryStore: ensureFreeSpace(135360) called with curMem=14724380, maxMem=280248975
15/07/14 18:27:40 INFO MemoryStore: Block broadcast_256 stored as values in memory (estimated size 132.2 KB, free 253.1 MB)
15/07/14 18:27:40 INFO MemoryStore: ensureFreeSpace(46231) called with curMem=14859740, maxMem=280248975
15/07/14 18:27:40 INFO MemoryStore: Block broadcast_256_piece0 stored as bytes in memory (estimated size 45.1 KB, free 253.1 MB)
15/07/14 18:27:40 INFO BlockManagerInfo: Added broadcast_256_piece0 in memory on localhost:52681 (size: 45.1 KB, free: 263.1 MB)
15/07/14 18:27:40 INFO SparkContext: Created broadcast 256 from broadcast at DAGScheduler.scala:874
15/07/14 18:27:40 INFO DAGScheduler: Submitting 10 missing tasks from ResultStage 174 (MapPartitionsRDD[258] at saveAsObjectFile at <console>:45)
15/07/14 18:27:40 INFO TaskSchedulerImpl: Adding task set 174.0 with 10 tasks
15/07/14 18:27:40 INFO TaskSetManager: Starting task 0.0 in stage 174.0 (TID 4513, localhost, PROCESS_LOCAL, 9486 bytes)
15/07/14 18:27:40 INFO TaskSetManager: Starting task 1.0 in stage 174.0 (TID 4514, localhost, PROCESS_LOCAL, 9486 bytes)
15/07/14 18:27:40 INFO TaskSetManager: Starting task 2.0 in stage 174.0 (TID 4515, localhost, PROCESS_LOCAL, 9486 bytes)
15/07/14 18:27:40 INFO TaskSetManager: Starting task 3.0 in stage 174.0 (TID 4516, localhost, PROCESS_LOCAL, 9486 bytes)
15/07/14 18:27:40 INFO TaskSetManager: Starting task 4.0 in stage 174.0 (TID 4517, localhost, PROCESS_LOCAL, 9486 bytes)
15/07/14 18:27:40 INFO TaskSetManager: Starting task 5.0 in stage 174.0 (TID 4518, localhost, PROCESS_LOCAL, 9486 bytes)
15/07/14 18:27:40 INFO TaskSetManager: Starting task 6.0 in stage 174.0 (TID 4519, localhost, PROCESS_LOCAL, 9486 bytes)
15/07/14 18:27:40 INFO TaskSetManager: Starting task 7.0 in stage 174.0 (TID 4520, localhost, PROCESS_LOCAL, 9486 bytes)
15/07/14 18:27:40 INFO TaskSetManager: Starting task 8.0 in stage 174.0 (TID 4521, localhost, PROCESS_LOCAL, 9486 bytes)
15/07/14 18:27:40 INFO TaskSetManager: Starting task 9.0 in stage 174.0 (TID 4522, localhost, PROCESS_LOCAL, 9486 bytes)
15/07/14 18:27:40 INFO Executor: Running task 0.0 in stage 174.0 (TID 4513)
15/07/14 18:27:40 INFO Executor: Running task 1.0 in stage 174.0 (TID 4514)
15/07/14 18:27:40 INFO Executor: Running task 2.0 in stage 174.0 (TID 4515)
15/07/14 18:27:40 INFO Executor: Running task 3.0 in stage 174.0 (TID 4516)
15/07/14 18:27:40 INFO Executor: Running task 4.0 in stage 174.0 (TID 4517)
15/07/14 18:27:40 INFO Executor: Running task 5.0 in stage 174.0 (TID 4518)
15/07/14 18:27:40 INFO Executor: Running task 6.0 in stage 174.0 (TID 4519)
15/07/14 18:27:40 INFO Executor: Running task 7.0 in stage 174.0 (TID 4520)
15/07/14 18:27:40 INFO Executor: Running task 8.0 in stage 174.0 (TID 4521)
15/07/14 18:27:40 ERROR Executor: Exception in task 1.0 in stage 174.0 (TID 4514)
java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2424)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1383)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:69)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:95)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run
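For reference, MLlib in Spark 1.4 ships built-in persistence for k-means models, which sidesteps hand-rolled object files. A minimal sketch, assuming model is an org.apache.spark.mllib.clustering.KMeansModel and sc is the active SparkContext:

    import org.apache.spark.mllib.clustering.KMeansModel

    // Save the whole model (cluster centers included) and read it back;
    // the path is the one from the post above.
    model.save(sc, "/tmp/tweets/model")
    val restored = KMeansModel.load(sc, "/tmp/tweets/model")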
Re: java.lang.IllegalStateException: unread block data
Hi,

Below is the log from the worker.

15/07/14 17:18:56 ERROR FileAppender: Error writing stream to file /spark/app-20150714171703-0004/5/stderr
java.io.IOException: Stream closed
    at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:170)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:283)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
    at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
    at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
    at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1772)
    at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
15/07/14 17:18:57 INFO Worker: Executor app-20150714171703-0004/5 finished with state KILLED exitStatus 143
15/07/14 17:18:57 INFO Worker: Cleaning up local directories for application app-20150714171703-0004
15/07/14 17:18:57 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkExecutor@10.10.10.1:52635] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
Re: java.lang.IllegalStateException: unread block data
Someone else also reported this error with Spark 1.4.0.

Thanks
Best Regards

On Tue, Jul 14, 2015 at 6:57 PM, Arthur Chan arthur.hk.c...@gmail.com wrote:
[quoted message trimmed; see the worker log in the previous message]
Re: java.lang.IllegalStateException: unread block data
I found the reason; it is about sc.

Thanks

On Tue, Jul 14, 2015 at 9:45 PM, Akhil Das ak...@sigmoidanalytics.com wrote:
[quoted thread trimmed; see the previous messages]
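The post does not say what was wrong with sc. One common way to hit "unread block data" during task deserialization is a SparkContext whose configuration does not ship the application jar to the executors, so this is only a guess. A minimal sketch of an explicit setup; the app name, master URL and jar path are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("tweets-kmeans")               // placeholder app name
      .setMaster("spark://master:7077")          // placeholder master URL
      .setJars(Seq("/path/to/app-assembly.jar")) // ship application classes to executors
    val sc = new SparkContext(conf)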
Re: java.lang.IllegalStateException: unread block data
I got the same problem; maybe the Java serializer is unstable.
Re: java.lang.IllegalStateException: unread block data
I found a solution. I had HADOOP_MAPRED_HOME set in my environment, which clashes with Spark. After I set HADOOP_MAPRED_HOME to empty, Spark started working.
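A minimal sketch of the fix described above; clearing the variable in the launching shell (or in conf/spark-env.sh) is one way to do it, and the exact placement depends on the install:

    export HADOOP_MAPRED_HOME=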
Re: java.lang.IllegalStateException: unread block data
Same issue here. Can anyone help, please?
Re: java.lang.IllegalStateException: unread block data
When you say "restored", do you mean the internal/public IPs remained unchanged, or did you change them accordingly? (I'm assuming you are using a cloud service like AWS, GCE or Azure.) What is the serializer that you are using? Try setting the following before creating the SparkContext; it might help with serialization:

    System.setProperty("spark.serializer", "spark.KryoSerializer")
    System.setProperty("spark.kryo.registrator", "com.sigmoidanalytics.MyRegistrator")

Morbious wrote:
Hi,

Recently I installed Cloudera Hadoop 5.1.1 with Spark. I shut down the slave servers and then restored them. After this operation I was trying to run any task, but each task with a file bigger than a few megabytes ended with errors:

14/12/12 20:25:02 WARN scheduler.TaskSetManager: Lost TID 61 (task 1.0:61)
14/12/12 20:25:02 WARN scheduler.TaskSetManager: Loss was due to java.lang.IllegalStateException
java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.scheduler.ShuffleMapTask.readExternal(ShuffleMapTask.scala:140)
    at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:169)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
14/12/12 20:25:02 WARN scheduler.TaskSetManager: Lost TID 62 (task 1.0:62)
14/12/12 20:25:02 INFO scheduler.TaskSetManager: Loss was due to java.lang.IllegalStateException: unread block data [duplicate 1]
14/12/12 20:25:02 WARN scheduler.TaskSetManager: Lost TID 63 (task 1.0:63)
14/12/12 20:25:02 INFO scheduler.TaskSetManager: Loss was due to java.lang.IllegalStateException: unread block data [duplicate 2]
14/12/12 20:25:02 WARN scheduler.TaskSetManager: Lost TID 64 (task 1.0:64)
14/12/12 20:25:02 INFO scheduler.TaskSetManager: Loss was due to java.lang.IllegalStateException: unread block data [duplicate 3]
14/12/12 20:25:02 WARN scheduler.TaskSetManager: Lost TID 60 (task 1.0:60)

I checked the security limits but everything seems to be OK. Before the restart I was able to run word count on a 100 GB file; now it can be done only on a file of a few MB.

Best regards,
Morbious
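In Spark 1.x these settings are more commonly supplied through SparkConf than through system properties. A minimal sketch; the registrator class is the placeholder from the reply above, not a real library class, and the serializer's fully qualified name is used:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", "com.sigmoidanalytics.MyRegistrator") // placeholder
    val sc = new SparkContext(conf)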
Re: java.lang.IllegalStateException: unread block data
"Restored" meant rebooting the slave nodes with their IPs unchanged. The funny thing is that Spark works fine for small files. I also checked Hadoop with HDFS and I'm able to run word count on it without any problems (e.g. on a file about 50 GB in size).
Re: stage failure: java.lang.IllegalStateException: unread block data
Hi,

I get exactly the same error. It runs on my local machine but not on the cluster. I am running the pi.py example.

Best,
Tassilo
stage failure: java.lang.IllegalStateException: unread block data
Hi,

Got this error when running Spark 1.1.0 to read HBase 0.98.1 through simple Python code on an EC2 cluster. The same program runs correctly in local mode, so this error only happens when running on a real cluster. Here's what I got:

14/10/30 17:51:53 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 1, node001, ANY, 1265 bytes)
14/10/30 17:51:53 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1) on executor node001: java.lang.IllegalStateException (unread block data) [duplicate 1]
14/10/30 17:51:53 INFO TaskSetManager: Starting task 0.2 in stage 0.0 (TID 2, node001, ANY, 1265 bytes)
14/10/30 17:51:53 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 2) on executor node001: java.lang.IllegalStateException (unread block data) [duplicate 2]
14/10/30 17:51:53 INFO TaskSetManager: Starting task 0.3 in stage 0.0 (TID 3, node001, ANY, 1265 bytes)
14/10/30 17:51:53 INFO TaskSetManager: Lost task 0.3 in stage 0.0 (TID 3) on executor node001: java.lang.IllegalStateException (unread block data) [duplicate 3]
14/10/30 17:51:53 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
14/10/30 17:51:53 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/10/30 17:51:53 INFO TaskSchedulerImpl: Cancelling stage 0
14/10/30 17:51:53 INFO DAGScheduler: Failed to run first at SerDeUtil.scala:70
Traceback (most recent call last):
  File "/root/workspace/test/sparkhbase.py", line 22, in <module>
    conf=conf2)
  File "/root/spark-1.1.0/python/pyspark/context.py", line 471, in newAPIHadoopRDD
    jconf, batchSize)
  File "/root/spark-1.1.0/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/root/spark-1.1.0/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, node001): java.lang.IllegalStateException: unread block data
    java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2399)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1378)
    java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1969)
    java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
    java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1776)
    java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1346)
    java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
    org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:159)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    java.lang.Thread.run(Thread.java:679)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run
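For comparison, a hedged Scala equivalent of the failing newAPIHadoopRDD call. It assumes the HBase 0.98 client jars are present on both the driver and executor classpaths, which is the usual gap when a job works in local mode but fails on a real cluster; the table name is a placeholder:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat

    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "mytable") // placeholder table name
    // Argument order: (conf, inputFormatClass, keyClass, valueClass),
    // mirroring what the Python wrapper passes through Py4J.
    val rdd = sc.newAPIHadoopRDD(hbaseConf, classOf[TableInputFormat],
      classOf[ImmutableBytesWritable], classOf[Result])
    println(rdd.count())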
Re: stage failure: java.lang.IllegalStateException: unread block data
The worker side has the following error message:

14/10/30 18:29:00 INFO Worker: Asked to launch executor app-20141030182900-0006/0 for testspark_v1
14/10/30 18:29:01 INFO ExecutorRunner: Launch command: java -cp ::/root/spark-1.1.0/conf:/root/spark-1.1.0/assembly/target/scala-2.10/spark-assembly-1.1.0-hadoop2.3.0.jar -XX:MaxPermSize=128m -Dspark.driver.port=52552 -Xms512M -Xmx512M org.apache.spark.executor.CoarseGrainedExecutorBackend akka.tcp://sparkDriver@master:52552/user/CoarseGrainedScheduler 0 node001 4 akka.tcp://sparkWorker@node001:60184/user/Worker app-20141030182900-0006
14/10/30 18:29:03 INFO Worker: Asked to kill executor app-20141030182900-0006/0
14/10/30 18:29:03 INFO ExecutorRunner: Runner thread for executor app-20141030182900-0006/0 interrupted
14/10/30 18:29:03 INFO ExecutorRunner: Killing process!
14/10/30 18:29:03 ERROR FileAppender: Error writing stream to file /root/spark-1.1.0/work/app-20141030182900-0006/0/stderr
java.io.IOException: Stream Closed
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:214)
    at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
    at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
    at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
    at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1311)
    at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
14/10/30 18:29:04 INFO Worker: Executor app-20141030182900-0006/0 finished with state KILLED exitStatus 143
14/10/30 18:29:04 INFO LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%4010.180.49.228%3A52120-22#1336571562] was not delivered. [6] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/10/30 18:29:04 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@node001:60184] -> [akka.tcp://sparkExecutor@node001:37697]: Error [Association failed with [akka.tcp://sparkExecutor@node001:37697]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@node001:37697]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: node001/10.180.49.228:37697
]
(the same AssociationError was logged three times)

Thanks!
Re: java.lang.IllegalStateException: unread block data while running the sample WordCount program from Eclipse
Did you ever find a solution to this problem? I'm having similar issues.
Apache Spark Throws java.lang.IllegalStateException: unread block data
What we are doing is:

1. Installing Spark 0.9.1 according to the documentation on the website, along with CDH4 (and another cluster with CDH5) distros of Hadoop/HDFS.
2. Building a fat jar with a Spark app with sbt, then trying to run it on the cluster.

I've also included code snippets and sbt deps at the bottom. When I've Googled this, there seem to be two somewhat vague responses:

a) Mismatching Spark versions on nodes/user code
b) Need to add more jars to the SparkConf

Now I know that (b) is not the problem, having successfully run the same code on other clusters while only including one jar (it's a fat jar). But I have no idea how to check for (a); it appears Spark doesn't have any version checks or anything. It would be nice if it checked versions and threw a mismatching-version exception: "you have user code using version X and node Y has version Z".

I would be very grateful for advice on this. I've submitted a bug report, because there has to be something wrong with the Spark documentation: I've seen two independent sysadmins get the exact same problem with different versions of CDH on different clusters. https://issues.apache.org/jira/browse/SPARK-1867

The exception:

Exception in thread "main" org.apache.spark.SparkException: Job aborted: Task 0.0:1 failed 32 times (most recent failure: Exception failure: java.lang.IllegalStateException: unread block data)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14/05/16 18:05:31 INFO scheduler.TaskSetManager: Loss was due to java.lang.IllegalStateException: unread block data [duplicate 59]

My code snippet:

    val conf = new SparkConf()
      .setMaster(clusterMaster)
      .setAppName(appName)
      .setSparkHome(sparkHome)
      .setJars(SparkContext.jarOfClass(this.getClass))
    println("count = " + new SparkContext(conf).textFile(someHdfsPath).count())

My SBT dependencies:

    // relevant
    "org.apache.spark" % "spark-core_2.10" % "0.9.1",
    "org.apache.hadoop" % "hadoop-client" % "2.3.0-mr1-cdh5.0.0",

    // standard, probably unrelated
    "com.github.seratch" %% "awscala" % "[0.2,)",
    "org.scalacheck" %% "scalacheck" % "1.10.1" % "test",
    "org.specs2" %% "specs2" % "1.14" % "test",
    "org.scala-lang" % "scala-reflect" % "2.10.3",
    "org.scalaz" %% "scalaz-core" % "7.0.5",
    "net.minidev" % "json-smart" % "1.2"
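On point (a), one quick way to surface version skew is to compare the version string seen by the driver with what the executors report. A minimal sketch, assuming a Spark release new enough to expose sc.version and org.apache.spark.SPARK_VERSION (roughly 1.1+), so it would need adapting for 0.9.1:

    // Collect the distinct Spark version strings observed on executors
    // and compare them with the driver's version.
    val driverVersion = sc.version
    val executorVersions = sc.parallelize(1 to 100)
      .map(_ => org.apache.spark.SPARK_VERSION)
      .distinct()
      .collect()
    println("driver: " + driverVersion + ", executors: " + executorVersions.mkString(", "))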