"[T]his sort of Exception" is deeply misleading since what Michael posted
is just the very tail-end of the process of Spark shutting down when an
unhandled exception is thrown somewhere else.  "[T]his sort of Exception"
is not the root cause of the problem, but rather will be the common outcome
from numerous root causes.  To find the root cause and what sort of
Exception is actually being thrown to cause Spark to shut down, you can't
look just at the tail-end of the Driver logs but must dig further back in
this and/or other logs that Spark nodes are generating.
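For what it's worth, one complementary step is to wrap the failing action in
the driver and print the whole cause chain before the JVM exits.  This is a
minimal sketch (the job body is a placeholder, not Michael's code), and it
still won't show exceptions that only ever reach the executor or YARN
container logs, which is why digging through those logs is usually the
decisive step:

    import org.apache.spark.{SparkConf, SparkContext}

    object Diagnose {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("diagnose"))
        try {
          sc.textFile(args(0)).count()   // placeholder for the real job
        } catch {
          case e: Throwable =>
            // Print the entire cause chain, not just the final
            // "SparkContext was shut down" wrapper.
            var cur: Throwable = e
            while (cur != null) {
              System.err.println("Caused by: " + cur)
              cur = cur.getCause
            }
            throw e
        } finally {
          sc.stop()
        }
      }
    }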

On Sun, Jan 4, 2015 at 10:21 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> What are you trying to do? Can you paste the whole code? I used to see
> this sort of Exception when I closed the fs object inside map/mapPartitions
> etc.
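The usual reason that pattern bites is that FileSystem.get(conf) hands back
a cached, JVM-wide instance, so closing it inside a task can invalidate the
handle everything else is still using.  A rough illustration (the helper
names are made up, and it assumes a Hadoop-compatible filesystem); one
workaround is FileSystem.newInstance(), which returns a private instance
that is safe to close:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Risky: FileSystem.get() returns the shared, cached instance.
    def writeRisky(path: String, bytes: Array[Byte]): Unit = {
      val fs = FileSystem.get(new Configuration())
      val out = fs.create(new Path(path))
      try out.write(bytes) finally out.close()
      fs.close()   // closes the shared instance; this is the problematic part
    }

    // Safer: a private, non-cached FileSystem that is yours to close
    // (or simply don't close the cached one at all).
    def writeSafer(path: String, bytes: Array[Byte]): Unit = {
      val fs = FileSystem.newInstance(new Configuration())
      val out = fs.create(new Path(path))
      try out.write(bytes) finally { out.close(); fs.close() }
    }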
>
> Thanks
> Best Regards
>
> On Mon, Jan 5, 2015 at 6:43 AM, Michael Albert <
> m_albert...@yahoo.com.invalid> wrote:
>
>> Greetings!
>>
>> So, I think I have the data saved so that each partition (part-r-00000,
>> etc.) is exactly what I want to translate into an output file of a format
>> not related to Hadoop.
>>
>> I believe I've figured out how to tell Spark to read the data set without
>> re-partitioning (I mentioned this in another post -- I have a
>> non-splittable InputFormat).
>>
>> I do something like
>>
>>     mapPartitionsWithIndex { (partId, iter) =>
>>       val conf = new Configuration()
>>       val fs = FileSystem.get(conf)
>>       val strm = fs.create(new Path(...))
>>       try {
>>         // write data to the stream
>>       } finally {
>>         strm.close()
>>       }
>>     }
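For reference, a self-contained version of that pattern might look roughly
like the following.  This is a sketch, not the code from the thread; the
output path layout, the element type, and the use of FileSystem.newInstance()
are all assumptions:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.spark.rdd.RDD

    def writePartitions(rdd: RDD[Array[Byte]], outDir: String): Unit = {
      rdd.mapPartitionsWithIndex { (partId, iter) =>
        // A private FileSystem instance, safe to close without affecting
        // the cached one that Spark itself may be using.
        val fs = FileSystem.newInstance(new Configuration())
        val strm = fs.create(new Path(f"$outDir/part-$partId%05d.bin"))
        try {
          iter.foreach(bytes => strm.write(bytes))   // write this partition
        } finally {
          strm.close()
          fs.close()
        }
        Iterator.single(partId)   // the closure must return an iterator
      }.count()                   // an action is needed to actually run the job
    }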
>>
>> This runs for a few hundred input files (so each executor sees tens of
>> files), and it chugs along nicely, then suddenly everything shuts down.
>> I can restart (telling it to skip the partIds it has already completed),
>> and it chugs along again for a while (going past the previous stopping
>> point) and then dies again.
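How the skip works isn't shown in the thread; a hypothetical version of that
restart filter (the alreadyDone set, and where it comes from, are invented
for illustration) could be:

    import org.apache.spark.rdd.RDD

    def writeRemaining(rdd: RDD[Array[Byte]], outDir: String,
                       alreadyDone: Set[Int]): Unit = {
      rdd.mapPartitionsWithIndex { (partId, iter) =>
        if (alreadyDone.contains(partId)) {
          Iterator.empty             // finished in an earlier run; skip it
        } else {
          // ... open the output file under outDir and write the partition
          //     exactly as in the sketch above ...
          Iterator.single(partId)
        }
      }.count()
    }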
>>
>> I am at a loss.  This works for the first tens of files (so it runs for
>> about 1 hr), then quits, and I see no useful error information (no
>> Exceptions except the stuff below).
>> I'm not shutting it down.
>>
>> Any idea what I might check? I've bumped up the memory multiple times
>> (16G currently)
>> and fiddled with increasing other parameters.
>>
>> Thanks.
>> Exception in thread "main" org.apache.spark.SparkException: Job
>> cancelled because SparkContext was shut down
>>     at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:694)
>>     at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:693)
>>     at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
>>     at
>> org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:693)
>>     at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessActor.postStop(DAGScheduler.scala:1399)
>>     at
>> akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:201)
>>     at
>> akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:163)
>>     at akka.actor.ActorCell.terminate(ActorCell.scala:338)
>>     at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:431)
>>     at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447)
>>     at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262)
>>
>>
>>
>
>
