Re: 10hrs of Scheduler Delay

Ted Yu Thu, 21 Jan 2016 19:09:13 -0800

You may have noticed the following in your stack trace. Does it indicate
prolonged computation in your code?


org.apache.commons.math3.util.MathArrays.distance(MathArrays.java:205)
org.apache.commons.math3.ml.distance.EuclideanDistance.compute(EuclideanDistance.java:34)
org.alitouka.spark.dbscan.spatial.DistanceCalculation$class.calculateDistance(DistanceCalculation.scala:15)
org.alitouka.spark.dbscan.exploratoryAnalysis.DistanceToNearestNeighborDriver$.calculateDistance(DistanceToNearestNeighborDriver.scala:16)
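
As a rough illustration of why frames like these can dominate a job, here is
a minimal Python sketch of a brute-force nearest-neighbor distance pass. It
is not the spark_dbscan code: the function names, the tiny random data set,
and the assumption that every point is compared against every other point
are all hypothetical. It only shows that the number of distance calls grows
quadratically with the number of points being compared.

```python
import math
import random


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor_distances(points):
    """Brute force (an assumption, not the library's algorithm):
    for each point, scan every other point and keep the smallest distance.

    Returns (list_of_nearest_distances, number_of_distance_calls).
    """
    calls = 0
    result = []
    for i, p in enumerate(points):
        best = math.inf
        for j, q in enumerate(points):
            if i == j:
                continue  # skip comparing a point with itself
            calls += 1
            best = min(best, euclidean(p, q))
        result.append(best)
    return result, calls


def make_points(n, dims=2, seed=0):
    """Hypothetical test data: n random points in the unit cube."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(dims)) for _ in range(n)]


if __name__ == "__main__":
    # Doubling n roughly quadruples the work: calls == n * (n - 1).
    for n in (100, 200, 400):
        _, calls = nearest_neighbor_distances(make_points(n))
        print(n, calls)  # 100 9900, 200 39800, 400 159600
```

If the real job does something similar within a large partition, a handful
of long-running tasks spending their time in the distance function would be
consistent with the frames above.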


On Thu, Jan 21, 2016 at 5:13 PM, Sanders, Isaac B <sande...@rose-hulman.edu>
wrote:

> Hadoop is: HDP 2.3.2.0-2950
>
> Here is a gist (pastebin) of my versions en masse and a stacktrace:
> https://gist.github.com/isaacsanders/2e59131758469097651b
>
> Thanks
>
> On Jan 21, 2016, at 7:44 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> Looks like you were running on YARN.
>
> What Hadoop version are you using?
>
> Can you capture a few stack traces of the AppMaster during the delay and
> pastebin them ?
>
> Thanks
>
> On Thu, Jan 21, 2016 at 8:08 AM, Sanders, Isaac B <
> sande...@rose-hulman.edu> wrote:
>
>> The Spark Version is 1.4.1
>>
>> The logs are full of standard fare, nothing like an exception or even
>> interesting [INFO] lines.
>>
>> Here is the script I am using:
>> https://gist.github.com/isaacsanders/660f480810fbc07d4df2
>>
>> Thanks
>> Isaac
>>
>> On Jan 21, 2016, at 11:03 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> Can you provide a bit more information?
>>
>> - command line for submitting the Spark job
>> - version of Spark
>> - anything interesting from the driver / executor logs
>>
>> Thanks
>>
>> On Thu, Jan 21, 2016 at 7:35 AM, Sanders, Isaac B <
>> sande...@rose-hulman.edu> wrote:
>>
>>> Hey all,
>>>
>>> I am a CS student in the United States working on my senior thesis.
>>>
>>> My thesis uses Spark, and I am encountering some trouble.
>>>
>>> I am using https://github.com/alitouka/spark_dbscan, and to determine
>>> parameters, I am using the utility class they supply,
>>> org.alitouka.spark.dbscan.exploratoryAnalysis.DistanceToNearestNeighborDriver.
>>>
>>> I am on a 10 node cluster with one machine with 8 cores and 32G of
>>> memory and nine machines with 6 cores and 16G of memory.
>>>
>>> I have 442M of data, which seems like it would be a joke, but the job
>>> stalls at the last stage.
>>>
>>> It was stuck in Scheduler Delay for 10 hours overnight, and I have tried
>>> a number of things over the last couple of days, but nothing seems to
>>> help.
>>>
>>> I have tried:
>>> - Increasing heap sizes and numbers of cores
>>> - More/less executors with different amounts of resources.
>>> - Kryo serialization
>>> - FAIR Scheduling
>>>
>>> It doesn’t seem like it should require this much. Any ideas?
>>>
>>> - Isaac
