---------- Forwarded message ----------
From: Ritesh Kumar Singh <riteshoneinamill...@gmail.com>
Date: Mon, Nov 10, 2014 at 10:52 PM
Subject: Re: Executor Lost Failure
To: Akhil Das <ak...@sigmoidanalytics.com>


Tasks are now getting submitted, but many of them produce no visible output.
For example, after opening the spark-shell, I load a text file from disk and
try printing its contents as:

> sc.textFile("/path/to/file").foreach(println)

It does not give me any output. While running this:

> sc.textFile("/path/to/file").count

gives me the right number of lines in the text file.
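One likely explanation (my guess): `foreach` runs on the executors, so the
`println` output goes to the executors' stdout logs, not to the spark-shell
(driver) console, while `count` sends its result back to the driver. A minimal
sketch of printing on the driver instead, assuming the file is small enough to
fit in driver memory:

```scala
// foreach(println) executes on the executors; the lines land in the
// executors' stdout logs rather than the driver console.
// To see the contents in the spark-shell, bring the data to the driver first:

// print everything (only safe if the whole file fits in driver memory)
sc.textFile("/path/to/file").collect().foreach(println)

// or print just a sample of lines
sc.textFile("/path/to/file").take(10).foreach(println)
```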
I'm not sure what the error is, but here is the console output for the print
case:

14/11/10 22:48:02 INFO MemoryStore: ensureFreeSpace(215230) called with
curMem=709528, maxMem=463837593
14/11/10 22:48:02 INFO MemoryStore: Block broadcast_6 stored as values in
memory (estimated size 210.2 KB, free 441.5 MB)
14/11/10 22:48:02 INFO MemoryStore: ensureFreeSpace(17239) called with
curMem=924758, maxMem=463837593
14/11/10 22:48:02 INFO MemoryStore: Block broadcast_6_piece0 stored as
bytes in memory (estimated size 16.8 KB, free 441.5 MB)
14/11/10 22:48:02 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory
on gonephishing.local:42648 (size: 16.8 KB, free: 442.3 MB)
14/11/10 22:48:02 INFO BlockManagerMaster: Updated info of block
broadcast_6_piece0
14/11/10 22:48:02 INFO FileInputFormat: Total input paths to process : 1
14/11/10 22:48:02 INFO SparkContext: Starting job: foreach at <console>:13
14/11/10 22:48:02 INFO DAGScheduler: Got job 3 (foreach at <console>:13)
with 2 output partitions (allowLocal=false)
14/11/10 22:48:02 INFO DAGScheduler: Final stage: Stage 3(foreach at
<console>:13)
14/11/10 22:48:02 INFO DAGScheduler: Parents of final stage: List()
14/11/10 22:48:02 INFO DAGScheduler: Missing parents: List()
14/11/10 22:48:02 INFO DAGScheduler: Submitting Stage 3 (Desktop/mnd.txt
MappedRDD[7] at textFile at <console>:13), which has no missing parents
14/11/10 22:48:02 INFO MemoryStore: ensureFreeSpace(2504) called with
curMem=941997, maxMem=463837593
14/11/10 22:48:02 INFO MemoryStore: Block broadcast_7 stored as values in
memory (estimated size 2.4 KB, free 441.4 MB)
14/11/10 22:48:02 INFO MemoryStore: ensureFreeSpace(1602) called with
curMem=944501, maxMem=463837593
14/11/10 22:48:02 INFO MemoryStore: Block broadcast_7_piece0 stored as
bytes in memory (estimated size 1602.0 B, free 441.4 MB)
14/11/10 22:48:02 INFO BlockManagerInfo: Added broadcast_7_piece0 in memory
on gonephishing.local:42648 (size: 1602.0 B, free: 442.3 MB)
14/11/10 22:48:02 INFO BlockManagerMaster: Updated info of block
broadcast_7_piece0
14/11/10 22:48:02 INFO DAGScheduler: Submitting 2 missing tasks from Stage
3 (Desktop/mnd.txt MappedRDD[7] at textFile at <console>:13)
14/11/10 22:48:02 INFO TaskSchedulerImpl: Adding task set 3.0 with 2 tasks
14/11/10 22:48:02 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID
6, gonephishing.local, PROCESS_LOCAL, 1216 bytes)
14/11/10 22:48:02 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID
7, gonephishing.local, PROCESS_LOCAL, 1216 bytes)
14/11/10 22:48:02 INFO BlockManagerInfo: Added broadcast_7_piece0 in memory
on gonephishing.local:48857 (size: 1602.0 B, free: 442.3 MB)
14/11/10 22:48:02 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory
on gonephishing.local:48857 (size: 16.8 KB, free: 442.3 MB)
14/11/10 22:48:02 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID
6) in 308 ms on gonephishing.local (1/2)
14/11/10 22:48:02 INFO DAGScheduler: Stage 3 (foreach at <console>:13)
finished in 0.321 s
14/11/10 22:48:02 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID
7) in 315 ms on gonephishing.local (2/2)
14/11/10 22:48:02 INFO SparkContext: Job finished: foreach at <console>:13,
took 0.376602079 s
14/11/10 22:48:02 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks
have all completed, from pool

=======================================================================



On Mon, Nov 10, 2014 at 8:01 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> Try adding the following configurations as well; they might work.
>
>  spark.rdd.compress true
>
>       spark.storage.memoryFraction 1
>       spark.core.connection.ack.wait.timeout 600
>       spark.akka.frameSize 50
>
> Thanks
> Best Regards
>
> On Mon, Nov 10, 2014 at 6:51 PM, Ritesh Kumar Singh <
> riteshoneinamill...@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to submit my application using spark-submit, with the following
>> spark-defaults.conf params:
>>
>> spark.master                     spark://<master-ip>:7077
>> spark.eventLog.enabled           true
>> spark.serializer
>> org.apache.spark.serializer.KryoSerializer
>> spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value
>> -Dnumbers="one two three"
>>
>> ===============================================================
>> But every time I get this error:
>>
>> 14/11/10 18:39:17 ERROR TaskSchedulerImpl: Lost executor 1 on aa.local:
>> remote Akka client disassociated
>> 14/11/10 18:39:17 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1,
>> aa.local): ExecutorLostFailure (executor lost)
>> 14/11/10 18:39:17 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
>> aa.local): ExecutorLostFailure (executor lost)
>> 14/11/10 18:39:20 ERROR TaskSchedulerImpl: Lost executor 2 on aa.local:
>> remote Akka client disassociated
>> 14/11/10 18:39:20 WARN TaskSetManager: Lost task 0.1 in stage 0.0 (TID 2,
>> aa.local): ExecutorLostFailure (executor lost)
>> 14/11/10 18:39:20 WARN TaskSetManager: Lost task 1.1 in stage 0.0 (TID 3,
>> aa.local): ExecutorLostFailure (executor lost)
>> 14/11/10 18:39:26 ERROR TaskSchedulerImpl: Lost executor 4 on aa.local:
>> remote Akka client disassociated
>> 14/11/10 18:39:26 WARN TaskSetManager: Lost task 0.2 in stage 0.0 (TID 5,
>> aa.local): ExecutorLostFailure (executor lost)
>> 14/11/10 18:39:26 WARN TaskSetManager: Lost task 1.2 in stage 0.0 (TID 4,
>> aa.local): ExecutorLostFailure (executor lost)
>> 14/11/10 18:39:29 ERROR TaskSchedulerImpl: Lost executor 5 on aa.local:
>> remote Akka client disassociated
>> 14/11/10 18:39:29 WARN TaskSetManager: Lost task 0.3 in stage 0.0 (TID 7,
>> aa.local): ExecutorLostFailure (executor lost)
>> 14/11/10 18:39:29 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4
>> times; aborting job
>> 14/11/10 18:39:29 WARN TaskSetManager: Lost task 1.3 in stage 0.0 (TID 6,
>> aa.local): ExecutorLostFailure (executor lost)
>> Exception in thread "main" org.apache.spark.SparkException: Job aborted
>> due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent
>> failure: Lost task 0.3 in stage 0.0 (TID 7, gonephishing.local):
>> ExecutorLostFailure (executor lost)
>> Driver stacktrace:
>> at org.apache.spark.scheduler.DAGScheduler.org
>> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
>> at
>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>> at
>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
>> at scala.Option.foreach(Option.scala:236)
>> at
>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
>> at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>> at
>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> at
>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> at
>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at
>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>> =================================================================
>> Any fixes?
>>
>
>
