Re: stage failure: java.lang.IllegalStateException: unread block data
Hi,

I get exactly the same error. It runs on my local machine but not on the cluster. I am running the pi.py example.

Best,
Tassilo

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/stage-failure-java-lang-IllegalStateException-unread-block-data-tp17751p17889.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
stage failure: java.lang.IllegalStateException: unread block data
Hi,

I got this error when running Spark 1.1.0 to read HBase 0.98.1 through simple Python code on an EC2 cluster. The same program runs correctly in local mode, so the error only happens when running on a real cluster. Here's what I got:

14/10/30 17:51:53 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 1, node001, ANY, 1265 bytes)
14/10/30 17:51:53 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1) on executor node001: java.lang.IllegalStateException (unread block data) [duplicate 1]
14/10/30 17:51:53 INFO TaskSetManager: Starting task 0.2 in stage 0.0 (TID 2, node001, ANY, 1265 bytes)
14/10/30 17:51:53 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 2) on executor node001: java.lang.IllegalStateException (unread block data) [duplicate 2]
14/10/30 17:51:53 INFO TaskSetManager: Starting task 0.3 in stage 0.0 (TID 3, node001, ANY, 1265 bytes)
14/10/30 17:51:53 INFO TaskSetManager: Lost task 0.3 in stage 0.0 (TID 3) on executor node001: java.lang.IllegalStateException (unread block data) [duplicate 3]
14/10/30 17:51:53 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
14/10/30 17:51:53 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
14/10/30 17:51:53 INFO TaskSchedulerImpl: Cancelling stage 0
14/10/30 17:51:53 INFO DAGScheduler: Failed to run first at SerDeUtil.scala:70
Traceback (most recent call last):
  File "/root/workspace/test/sparkhbase.py", line 22, in <module>
    conf=conf2)
  File "/root/spark-1.1.0/python/pyspark/context.py", line 471, in newAPIHadoopRDD
    jconf, batchSize)
  File "/root/spark-1.1.0/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/root/spark-1.1.0/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, node001): java.lang.IllegalStateException: unread block data
        java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2399)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1378)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1969)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1776)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1346)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)
        org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
        org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:159)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:679)
Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at
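
When triaging a driver log like the one above, it can help to tally which executor lost which tasks and with what exception. The following is only a minimal sketch: the helper name summarize_lost_tasks and the regular expression are illustrative assumptions, not part of Spark, and the pattern is written against the "Lost task" line format that appears in this log only.

```python
import re
from collections import Counter

# Hypothetical helper, not a Spark API. The pattern assumes the
# TaskSetManager "Lost task" line format seen in the driver log above.
LOST_TASK = re.compile(
    r"Lost task (?P<task>\d+\.\d+) in stage (?P<stage>\d+\.\d+) "
    r"\(TID (?P<tid>\d+)\) on executor (?P<host>[^:]+): "
    r"(?P<exc>[\w.$]+) \((?P<msg>[^)]*)\)"
)


def summarize_lost_tasks(log_text):
    """Count lost tasks per (host, exception class, message)."""
    counts = Counter()
    for m in LOST_TASK.finditer(log_text):
        counts[(m.group("host"), m.group("exc"), m.group("msg"))] += 1
    return counts


# Two lines copied from the driver log above.
sample = (
    "14/10/30 17:51:53 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1) "
    "on executor node001: java.lang.IllegalStateException (unread block data) [duplicate 1]\n"
    "14/10/30 17:51:53 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 2) "
    "on executor node001: java.lang.IllegalStateException (unread block data) [duplicate 2]\n"
)

summary = summarize_lost_tasks(sample)
for (host, exc, msg), n in summary.items():
    print(f"{host}: {exc} ({msg}) x{n}")
```

If every failure lands on the same exception and message, as it does here, the problem is more likely systematic (for example, in how tasks are deserialized on the executors) than a one-off node fault.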
Re: stage failure: java.lang.IllegalStateException: unread block data
The worker side shows this error message:

14/10/30 18:29:00 INFO Worker: Asked to launch executor app-20141030182900-0006/0 for testspark_v1
14/10/30 18:29:01 INFO ExecutorRunner: Launch command: java -cp ::/root/spark-1.1.0/conf:/root/spark-1.1.0/assembly/target/scala-2.10/spark-assembly-1.1.0-hadoop2.3.0.jar -XX:MaxPermSize=128m -Dspark.driver.port=52552 -Xms512M -Xmx512M org.apache.spark.executor.CoarseGrainedExecutorBackend akka.tcp://sparkDriver@master:52552/user/CoarseGrainedScheduler 0 node001 4 akka.tcp://sparkWorker@node001:60184/user/Worker app-20141030182900-0006
14/10/30 18:29:03 INFO Worker: Asked to kill executor app-20141030182900-0006/0
14/10/30 18:29:03 INFO ExecutorRunner: Runner thread for executor app-20141030182900-0006/0 interrupted
14/10/30 18:29:03 INFO ExecutorRunner: Killing process!
14/10/30 18:29:03 ERROR FileAppender: Error writing stream to file /root/spark-1.1.0/work/app-20141030182900-0006/0/stderr
java.io.IOException: Stream Closed
        at java.io.FileInputStream.readBytes(Native Method)
        at java.io.FileInputStream.read(FileInputStream.java:214)
        at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
        at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
        at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
        at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1311)
        at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
14/10/30 18:29:04 INFO Worker: Executor app-20141030182900-0006/0 finished with state KILLED exitStatus 143
14/10/30 18:29:04 INFO LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%4010.180.49.228%3A52120-22#1336571562] was not delivered. [6] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/10/30 18:29:04 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@node001:60184] - [akka.tcp://sparkExecutor@node001:37697]: Error [Association failed with [akka.tcp://sparkExecutor@node001:37697]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@node001:37697]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: node001/10.180.49.228:37697
]
14/10/30 18:29:04 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@node001:60184] - [akka.tcp://sparkExecutor@node001:37697]: Error [Association failed with [akka.tcp://sparkExecutor@node001:37697]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@node001:37697]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: node001/10.180.49.228:37697
]
14/10/30 18:29:04 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@node001:60184] - [akka.tcp://sparkExecutor@node001:37697]: Error [Association failed with [akka.tcp://sparkExecutor@node001:37697]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@node001:37697]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: node001/10.180.49.228:37697
]

Thanks!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/stage-failure-java-lang-IllegalStateException-unread-block-data-tp17751p17755.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
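
One thing that can be read directly off the worker log is what the executor JVM was given on its classpath. The sketch below pulls the -cp argument out of a launch command and lists its entries. The function name executor_classpath is a hypothetical helper, the command string is a shortened copy of the one logged above, and whether a missing HBase jar on the executors is actually the cause of the error is not established anywhere in this thread; the check is only a starting point.

```python
import shlex


def executor_classpath(launch_command):
    """Return the non-empty entries of the -cp/-classpath argument
    of a java launch command (hypothetical helper, not a Spark API)."""
    tokens = shlex.split(launch_command)
    for flag in ("-cp", "-classpath"):
        if flag in tokens:
            idx = tokens.index(flag)
            if idx + 1 < len(tokens):
                # Empty entries (from "::") are dropped.
                return [e for e in tokens[idx + 1].split(":") if e]
    return []


# Shortened copy of the "Launch command" line from the worker log above.
command = (
    "java -cp ::/root/spark-1.1.0/conf:/root/spark-1.1.0/assembly/target/"
    "scala-2.10/spark-assembly-1.1.0-hadoop2.3.0.jar -XX:MaxPermSize=128m "
    "-Xms512M -Xmx512M org.apache.spark.executor.CoarseGrainedExecutorBackend"
)

entries = executor_classpath(command)
has_hbase = any("hbase" in e.lower() for e in entries)

for e in entries:
    print(e)
print("HBase entry on executor classpath:", has_hbase)
```

For the command logged above, the classpath holds only the Spark conf directory and the Spark assembly jar, so comparing it with the classpath used in local mode may be a reasonable next step.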