On Mon, Nov 10, 2014 at 10:52 PM, Ritesh Kumar Singh <
riteshoneinamill...@gmail.com> wrote:

> Tasks are now getting submitted, but many of them don't seem to do anything.
> For example, after opening the spark-shell, I load a text file from disk and
> try printing its contents as:
>
> >sc.textFile("/path/to/file").foreach(println)
>
> It does not give me any output. While running this:
>
> >sc.textFile("/path/to/file").count
>
> gives me the right number of lines in the text file.
> Not sure what the error is, but here is the console output for the print
> case:
>
> 14/11/10 22:48:02 INFO MemoryStore: ensureFreeSpace(215230) called with
> curMem=709528, maxMem=463837593
> 14/11/10 22:48:02 INFO MemoryStore: Block broadcast_6 stored as values in
> memory (estimated size 210.2 KB, free 441.5 MB)
> 14/11/10 22:48:02 INFO MemoryStore: ensureFreeSpace(17239) called with
> curMem=924758, maxMem=463837593
> 14/11/10 22:48:02 INFO MemoryStore: Block broadcast_6_piece0 stored as
> bytes in memory (estimated size 16.8 KB, free 441.5 MB)
> 14/11/10 22:48:02 INFO BlockManagerInfo: Added broadcast_6_piece0 in
> memory on gonephishing.local:42648 (size: 16.8 KB, free: 442.3 MB)
> 14/11/10 22:48:02 INFO BlockManagerMaster: Updated info of block
> broadcast_6_piece0
> 14/11/10 22:48:02 INFO FileInputFormat: Total input paths to process : 1
> 14/11/10 22:48:02 INFO SparkContext: Starting job: foreach at <console>:13
> 14/11/10 22:48:02 INFO DAGScheduler: Got job 3 (foreach at <console>:13)
> with 2 output partitions (allowLocal=false)
> 14/11/10 22:48:02 INFO DAGScheduler: Final stage: Stage 3(foreach at
> <console>:13)
> 14/11/10 22:48:02 INFO DAGScheduler: Parents of final stage: List()
> 14/11/10 22:48:02 INFO DAGScheduler: Missing parents: List()
> 14/11/10 22:48:02 INFO DAGScheduler: Submitting Stage 3 (Desktop/mnd.txt
> MappedRDD[7] at textFile at <console>:13), which has no missing parents
> 14/11/10 22:48:02 INFO MemoryStore: ensureFreeSpace(2504) called with
> curMem=941997, maxMem=463837593
> 14/11/10 22:48:02 INFO MemoryStore: Block broadcast_7 stored as values in
> memory (estimated size 2.4 KB, free 441.4 MB)
> 14/11/10 22:48:02 INFO MemoryStore: ensureFreeSpace(1602) called with
> curMem=944501, maxMem=463837593
> 14/11/10 22:48:02 INFO MemoryStore: Block broadcast_7_piece0 stored as
> bytes in memory (estimated size 1602.0 B, free 441.4 MB)
> 14/11/10 22:48:02 INFO BlockManagerInfo: Added broadcast_7_piece0 in
> memory on gonephishing.local:42648 (size: 1602.0 B, free: 442.3 MB)
> 14/11/10 22:48:02 INFO BlockManagerMaster: Updated info of block
> broadcast_7_piece0
> 14/11/10 22:48:02 INFO DAGScheduler: Submitting 2 missing tasks from Stage
> 3 (Desktop/mnd.txt MappedRDD[7] at textFile at <console>:13)
> 14/11/10 22:48:02 INFO TaskSchedulerImpl: Adding task set 3.0 with 2 tasks
> 14/11/10 22:48:02 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID
> 6, gonephishing.local, PROCESS_LOCAL, 1216 bytes)
> 14/11/10 22:48:02 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID
> 7, gonephishing.local, PROCESS_LOCAL, 1216 bytes)
> 14/11/10 22:48:02 INFO BlockManagerInfo: Added broadcast_7_piece0 in
> memory on gonephishing.local:48857 (size: 1602.0 B, free: 442.3 MB)
> 14/11/10 22:48:02 INFO BlockManagerInfo: Added broadcast_6_piece0 in
> memory on gonephishing.local:48857 (size: 16.8 KB, free: 442.3 MB)
> 14/11/10 22:48:02 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID
> 6) in 308 ms on gonephishing.local (1/2)
> 14/11/10 22:48:02 INFO DAGScheduler: Stage 3 (foreach at <console>:13)
> finished in 0.321 s
> 14/11/10 22:48:02 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID
> 7) in 315 ms on gonephishing.local (2/2)
> 14/11/10 22:48:02 INFO SparkContext: Job finished: foreach at
> <console>:13, took 0.376602079 s
> 14/11/10 22:48:02 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks
> have all completed, from pool
>
> =======================================================================
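> A likely explanation for the silent foreach, assuming a standard standalone
> cluster: the closure passed to foreach runs as tasks on the executors, so
> println writes to the executors' stdout rather than to the driver's shell,
> while count returns its result to the driver. A minimal sketch (same path as
> in the example above):

```scala
// Runs on the executors: output goes to each executor's stdout
// (visible in the worker logs), not to this shell
sc.textFile("/path/to/file").foreach(println)

// Bring the lines back to the driver first, then print locally.
// collect() is safe only if the whole file fits in driver memory:
sc.textFile("/path/to/file").collect().foreach(println)

// Or print just a sample without loading everything:
sc.textFile("/path/to/file").take(10).foreach(println)
```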
>
>
>
> On Mon, Nov 10, 2014 at 8:01 PM, Akhil Das <ak...@sigmoidanalytics.com>
> wrote:
>
>> Try adding the following configurations as well; they might help:
>>
>> spark.rdd.compress true
>> spark.storage.memoryFraction 1
>> spark.core.connection.ack.wait.timeout 600
>> spark.akka.frameSize 50
>>
>> Thanks
>> Best Regards
>>
>> On Mon, Nov 10, 2014 at 6:51 PM, Ritesh Kumar Singh <
>> riteshoneinamill...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am trying to submit my application using spark-submit, with the
>>> following spark-defaults.conf params:
>>>
>>> spark.master                     spark://<master-ip>:7077
>>> spark.eventLog.enabled           true
>>> spark.serializer
>>> org.apache.spark.serializer.KryoSerializer
>>> spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value
>>> -Dnumbers="one two three"
>>>
>>> ===============================================================
>>> But every time I am getting this error:
>>>
>>> 14/11/10 18:39:17 ERROR TaskSchedulerImpl: Lost executor 1 on aa.local:
>>> remote Akka client disassociated
>>> 14/11/10 18:39:17 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID
>>> 1, aa.local): ExecutorLostFailure (executor lost)
>>> 14/11/10 18:39:17 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID
>>> 0, aa.local): ExecutorLostFailure (executor lost)
>>> 14/11/10 18:39:20 ERROR TaskSchedulerImpl: Lost executor 2 on aa.local:
>>> remote Akka client disassociated
>>> 14/11/10 18:39:20 WARN TaskSetManager: Lost task 0.1 in stage 0.0 (TID
>>> 2, aa.local): ExecutorLostFailure (executor lost)
>>> 14/11/10 18:39:20 WARN TaskSetManager: Lost task 1.1 in stage 0.0 (TID
>>> 3, aa.local): ExecutorLostFailure (executor lost)
>>> 14/11/10 18:39:26 ERROR TaskSchedulerImpl: Lost executor 4 on aa.local:
>>> remote Akka client disassociated
>>> 14/11/10 18:39:26 WARN TaskSetManager: Lost task 0.2 in stage 0.0 (TID
>>> 5, aa.local): ExecutorLostFailure (executor lost)
>>> 14/11/10 18:39:26 WARN TaskSetManager: Lost task 1.2 in stage 0.0 (TID
>>> 4, aa.local): ExecutorLostFailure (executor lost)
>>> 14/11/10 18:39:29 ERROR TaskSchedulerImpl: Lost executor 5 on aa.local:
>>> remote Akka client disassociated
>>> 14/11/10 18:39:29 WARN TaskSetManager: Lost task 0.3 in stage 0.0 (TID
>>> 7, aa.local): ExecutorLostFailure (executor lost)
>>> 14/11/10 18:39:29 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4
>>> times; aborting job
>>> 14/11/10 18:39:29 WARN TaskSetManager: Lost task 1.3 in stage 0.0 (TID
>>> 6, aa.local): ExecutorLostFailure (executor lost)
>>> Exception in thread "main" org.apache.spark.SparkException: Job aborted
>>> due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent
>>> failure: Lost task 0.3 in stage 0.0 (TID 7, gonephishing.local):
>>> ExecutorLostFailure (executor lost)
>>> Driver stacktrace:
>>> at org.apache.spark.scheduler.DAGScheduler.org
>>> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
>>> at
>>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
>>> at scala.Option.foreach(Option.scala:236)
>>> at
>>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
>>> at
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>> at
>>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>> at
>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>> at
>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>> at
>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>
>>> =================================================================
>>> Any fixes?
>>>
>>
>>
>
