Nothing much useful. The following are the interpreter logs I can see before the
job hangs:
INFO [2018-11-23 14:25:58,517] ({pool-2-thread-3}
SchedulerFactory.java[jobStarted]:109) - Job 20181123-112240_1827913615 started
by scheduler interpreter_34089707
INFO [2018-11-23 14:25:59,241] ({pool-2-thread-3}
FileInputFormat.java[listStatus]:253) - Total input paths to process : 1
INFO [2018-11-23 14:25:59,299] ({pool-2-thread-3} Logging.scala[logInfo]:54) -
Starting job: take at <console>:28
INFO [2018-11-23 14:25:59,317] ({dag-scheduler-event-loop}
Logging.scala[logInfo]:54) - Got job 0 (take at <console>:28) with 1 output
partitions
INFO [2018-11-23 14:25:59,319] ({dag-scheduler-event-loop}
Logging.scala[logInfo]:54) - Final stage: ResultStage 0 (take at <console>:28)
INFO [2018-11-23 14:25:59,320] ({dag-scheduler-event-loop}
Logging.scala[logInfo]:54) - Parents of final stage: List()
INFO [2018-11-23 14:25:59,323] ({dag-scheduler-event-loop}
Logging.scala[logInfo]:54) - Missing parents: List()
INFO [2018-11-23 14:25:59,328] ({dag-scheduler-event-loop}
Logging.scala[logInfo]:54) - Submitting ResultStage 0
(/tmp/earthquake/GEM-GHEC-v1_2.txt MapPartitionsRDD[1] at textFile at
<console>:25), which has no missing parents
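For reference, this is a minimal sketch of the paragraph that hangs, assuming `sc` is the SparkContext provided by the Zeppelin %spark interpreter and using the input path that appears in the logs above:

```scala
// Sketch of the hanging paragraph (run inside a Zeppelin %spark paragraph,
// where `sc` is the interpreter-provided SparkContext).
val batchData = sc.textFile("/tmp/earthquake/GEM-GHEC-v1_2.txt")

// take(10) is an action, so this is the point where the job is actually
// submitted to the scheduler -- and where it gets stuck in PENDING.
batchData.take(10).foreach(println)
```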
Nabeel
> On Nov 23, 2018, at 12:33 PM, 王刚 <[email protected]> wrote:
>
> Is there any useful information in your local Spark process log?
>
>> On Nov 23, 2018, at 4:20 PM, Nabeel Imtiaz <[email protected]> wrote:
>>
>> Hi,
>>
>>
>> When I try to simply take the first 10 lines of a file (e.g.
>> ```batchData.take(10).foreach(println _)```) from the sparkContext, the
>> paragraph hangs.
>>
>> If I inspect the job in the Spark console, it shows the job in the PENDING
>> state. I checked that I have more than enough memory available in the system.
>>
>> Is this a known issue? Are there any fixes or workarounds?
>>
>>
>>
>> Nabeel
>