Out of curiosity - are you guys using speculation, shuffle
consolidation, or any other non-default options? If so, that would
help narrow down what's causing this corruption.
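
For reference, these are the options I mean; if they're enabled
anywhere, it would be in driver code (or spark-defaults) along these
lines (both default to false in 1.0):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.speculation", "true")              // speculative task execution
      .set("spark.shuffle.consolidateFiles", "true") // shuffle file consolidation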

On Tue, Jun 17, 2014 at 10:40 AM, Surendranauth Hiraman
<suren.hira...@velos.io> wrote:
> Matt/Ryan,
>
> Did you make any headway on this? My team is running into this also.
> Doesn't happen on smaller datasets. Our input set is about 10 GB but we
> generate 100s of GBs in the flow itself.
>
> -Suren
>
>
>
>
> On Fri, Jun 6, 2014 at 5:19 PM, Ryan Compton <compton.r...@gmail.com> wrote:
>
>> Just ran into this today myself. I'm on branch-1.0 using a CDH3
>> cluster (no modifications to Spark or its dependencies). The error
>> appeared when running GraphX's .connectedComponents() on a ~200GB
>> edge list (GraphX worked beautifully on smaller data).
>>
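>> Roughly what I'm running, for context (the input path below is a
>> placeholder; everything else is stock GraphX):
>>
>>     import org.apache.spark.{SparkConf, SparkContext}
>>     import org.apache.spark.graphx.GraphLoader
>>
>>     val sc = new SparkContext(new SparkConf().setAppName("cc"))
>>     // Placeholder path standing in for the ~200GB edge list:
>>     val graph = GraphLoader.edgeListFile(sc, "hdfs:///path/to/edges")
>>     val cc = graph.connectedComponents().vertices  // fails partway through
>>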
>> Here's the stacktrace (it's quite similar to yours:
>> https://imgur.com/7iBA4nJ).
>>
>> 14/06/05 20:02:28 ERROR scheduler.TaskSetManager: Task 5.599:39 failed 4 times; aborting job
>> 14/06/05 20:02:28 INFO scheduler.DAGScheduler: Failed to run reduce at VertexRDD.scala:100
>> Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 5.599:39 failed 4 times, most recent failure: Exception failure in TID 29735 on host node18: java.io.StreamCorruptedException: invalid type code: AC
>>         java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1355)
>>         java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
>>         org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
>>         org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:125)
>>         org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
>>         scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>         org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:30)
>>         org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>         scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>         scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>         scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>         org.apache.spark.graphx.impl.VertexPartitionBaseOps.innerJoinKeepLeft(VertexPartitionBaseOps.scala:192)
>>         org.apache.spark.graphx.impl.EdgePartition.updateVertices(EdgePartition.scala:78)
>>         org.apache.spark.graphx.impl.ReplicatedVertexView$$anonfun$2$$anonfun$apply$1.apply(ReplicatedVertexView.scala:75)
>>         org.apache.spark.graphx.impl.ReplicatedVertexView$$anonfun$2$$anonfun$apply$1.apply(ReplicatedVertexView.scala:73)
>>         scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>         scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>         scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>         scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>         org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
>>         org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
>>         org.apache.spark.scheduler.Task.run(Task.scala:51)
>>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>>         java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>         java.lang.Thread.run(Thread.java:662)
>> Driver stacktrace:
>> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
>> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>> at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
>> at scala.Option.foreach(Option.scala:236)
>> at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
>> at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>> 14/06/05 20:02:28 INFO scheduler.TaskSchedulerImpl: Cancelling stage 5
>>
>> On Wed, Jun 4, 2014 at 7:50 AM, Sean Owen <so...@cloudera.com> wrote:
>> > On Wed, Jun 4, 2014 at 3:33 PM, Matt Kielo <mki...@oculusinfo.com> wrote:
>> >> I'm trying to run some Spark code on a cluster, but I keep running
>> >> into a "java.io.StreamCorruptedException: invalid type code: AC"
>> >> error. My task involves analyzing ~50GB of data (some operations
>> >> involve sorting) and then writing it out to a JSON file. I'm
>> >> running the analysis on each of the data's ~10 columns and have
>> >> never had a successful run. The program runs for a varying amount
>> >> of time each time (roughly 5-30 minutes), but it always terminates
>> >> with this error.
>> >
>> > I can tell you that this usually means that somewhere, something
>> > wrote objects to the same OutputStream with multiple
>> > ObjectOutputStreams. AC is the first byte of the serialization
>> > stream header (0xACED), so the reader is hitting a second header
>> > where it expects a type code.
>> >
>> > I don't obviously see where/how that could happen, but maybe it
>> > rings a bell for someone. This could happen if an OutputStream is
>> > reused across object serializations but new ObjectOutputStreams are
>> > opened over it, for example.
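>> >
>> > A minimal standalone sketch of that failure mode, in case it helps
>> > anyone recognize it (plain java.io, nothing Spark-specific):
>> >
>> >     import java.io._
>> >
>> >     val bytes = new ByteArrayOutputStream()
>> >
>> >     // The first ObjectOutputStream writes the stream header
>> >     // (0xAC 0xED) followed by one object.
>> >     val out1 = new ObjectOutputStream(bytes)
>> >     out1.writeObject("first")
>> >     out1.flush()
>> >
>> >     // A second ObjectOutputStream over the SAME underlying stream
>> >     // writes another header into the middle of the byte stream.
>> >     val out2 = new ObjectOutputStream(bytes)
>> >     out2.writeObject("second")
>> >     out2.flush()
>> >
>> >     // A single ObjectInputStream reads "first", then trips over
>> >     // the second header: 0xAC is not a valid type code.
>> >     val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
>> >     in.readObject()  // "first"
>> >     in.readObject()  // java.io.StreamCorruptedException: invalid type code: AC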
>>
>
>
>
> --
>
> SUREN HIRAMAN, VP TECHNOLOGY
> Velos
> Accelerating Machine Learning
>
> 440 NINTH AVENUE, 11TH FLOOR
> NEW YORK, NY 10001
> O: (917) 525-2466 ext. 105
> F: 646.349.4063
> E: suren.hiraman@velos.io
> W: www.velos.io
