[ https://issues.apache.org/jira/browse/PIG-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157756#comment-14157756 ]
Ángel Álvarez commented on PIG-4173:
------------------------------------

I'm getting this error whenever I try to load any file from HDFS:

2014-10-02 17:44:19,592 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, tldam4602.lda): java.lang.IllegalStateException: unread block data
        java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2421)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1382)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)

It only fails when I run my two-line script (LOAD + DUMP) on the cluster. I think it might have something to do with my client libraries or their order.

> Move to Spark 1.x
> -----------------
>
>                 Key: PIG-4173
>                 URL: https://issues.apache.org/jira/browse/PIG-4173
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: bc Wong
>            Assignee: Richard Ding
>         Attachments: PIG-4173.patch, PIG-4173_2.patch, PIG-4173_3.patch, TEST-org.apache.pig.spark.TestSpark.txt
>
>
> The Spark branch is using Spark 0.9: https://github.com/apache/pig/blob/spark/ivy.xml#L438. We should probably switch to Spark 1.x asap, due to Spark interface changes since 1.0.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)