Looks like you were running on YARN. What Hadoop version are you using?
Can you capture a few stack traces of the AppMaster during the delay and pastebin them?

Thanks

On Thu, Jan 21, 2016 at 8:08 AM, Sanders, Isaac B <sande...@rose-hulman.edu> wrote:

> The Spark version is 1.4.1.
>
> The logs are full of standard fare, nothing like an exception or even
> interesting [INFO] lines.
>
> Here is the script I am using:
> https://gist.github.com/isaacsanders/660f480810fbc07d4df2
>
> Thanks
> Isaac
>
> On Jan 21, 2016, at 11:03 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> Can you provide a bit more information?
>
> - command line for submitting the Spark job
> - version of Spark
> - anything interesting from driver / executor logs?
>
> Thanks
>
> On Thu, Jan 21, 2016 at 7:35 AM, Sanders, Isaac B <
> sande...@rose-hulman.edu> wrote:
>
>> Hey all,
>>
>> I am a CS student in the United States working on my senior thesis.
>>
>> My thesis uses Spark, and I am encountering some trouble.
>>
>> I am using https://github.com/alitouka/spark_dbscan, and to determine
>> parameters, I am using the utility class they supply,
>> org.alitouka.spark.dbscan.exploratoryAnalysis.DistanceToNearestNeighborDriver.
>>
>> I am on a 10-node cluster: one machine with 8 cores and 32G of memory,
>> and nine machines with 6 cores and 16G of memory.
>>
>> I have 442M of data, which seems like it would be a joke, but the job
>> stalls at the last stage.
>>
>> It was stuck in Scheduler Delay for 10 hours overnight, and I have tried
>> a number of things for the last couple of days, but nothing seems to be
>> helping.
>>
>> I have tried:
>> - Increasing heap sizes and numbers of cores
>> - More/fewer executors with different amounts of resources
>> - Kryo serialization
>> - FAIR scheduling
>>
>> It doesn't seem like it should require this much. Any ideas?
>>
>> - Isaac
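[For readers following along: the Kryo serialization and FAIR scheduling attempts mentioned above are normally enabled through standard Spark configuration properties. A minimal sketch, assuming Spark 1.4.1 defaults; the exact values used in the original job are not shown in this thread:]

```
# spark-defaults.conf (or pass each line via --conf key=value on spark-submit)
spark.serializer        org.apache.spark.serializer.KryoSerializer
spark.scheduler.mode    FAIR
```

[The stack traces requested above are usually captured by running `jstack <pid>` against the ApplicationMaster's JVM process a few times during the stall, on whichever node YARN scheduled it.]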