Thanks; I tried looking at the thread dumps for the driver and the one executor 
that had that option in the UI, but I'm afraid I don't know how to interpret 
what I saw. I don't think it could be my code directly, since, as far as I 
can tell, all of my code has completed by this point. Could GC be taking that long? 
(I could also try grabbing the thread dumps and pasting them here, if that 
would help?)

    On Sunday, November 6, 2016 8:36 AM, Aniket Bhatnagar 
<aniket.bhatna...@gmail.com> wrote:
 

To find out what's going on, you can study the thread dumps, either from the 
Spark UI or with any other thread dump analysis tool.
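If the dumps are hard to read by eye, one thing worth checking is which non-daemon threads are still alive, since a JVM does not exit while any of those are running. Below is a rough sketch, not a definitive tool: it assumes the dump was saved as plain text in the HotSpot `jstack <pid>` layout (a quoted thread name on a header line, an optional `daemon` token, then a `java.lang.Thread.State:` line and `at ...` frames). The function names `parse_threads` and `non_daemon_threads` are made up for this example, and the thread dump table in the Spark UI is laid out differently, so it would need its own parsing.

```python
import re

# Header line of a thread in jstack-style output, for example:
# "main" #1 prio=5 os_prio=0 tid=0x... nid=0x... waiting on condition [0x...]
HEADER = re.compile(r'^"(?P<name>[^"]*)"\s+(?P<rest>.*)$')
# State line that follows the header of a Java thread.
STATE = re.compile(r'^\s+java\.lang\.Thread\.State:\s+(?P<state>\w+)')


def parse_threads(dump_text):
    """Parse jstack-style text into a list of thread dicts (hypothetical helper)."""
    threads = []
    current = None
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = {
                "name": header.group("name"),
                # Daemon threads carry a bare "daemon" token on the header line.
                "daemon": "daemon" in header.group("rest").split(),
                "state": None,
                "frames": [],
            }
            threads.append(current)
            continue
        if current is None:
            continue
        state = STATE.match(line)
        if state:
            current["state"] = state.group("state")
        elif line.strip().startswith("at "):
            current["frames"].append(line.strip()[3:])
    return threads


def non_daemon_threads(dump_text):
    """Return Java threads that are not daemons.

    Entries without a Thread.State line (JVM-internal threads such as
    "VM Thread") are skipped, since they are not Java threads.
    """
    return [
        t for t in parse_threads(dump_text)
        if not t["daemon"] and t["state"] is not None
    ]
```

For example, after saving the output of `jstack` to a file, passing its contents to `non_daemon_threads` and printing each thread's `name`, `state`, and first few `frames` should show what is still running after the last log statement.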
Thanks,
Aniket
On Sun, Nov 6, 2016 at 1:31 PM Michael Johnson 
<mjjohnson....@yahoo.com.invalid> wrote:

I'm doing some processing and then clustering of a small dataset (~150 MB). 
Everything seems to work fine until the very end. The last few lines of my 
program are log statements, but after those are printed, nothing seems to 
happen for a long time (many minutes). I'm not usually patient enough to let it 
run, but the one time I did just wait, I think it took over an hour, and it did 
eventually exit on its own. Any ideas on what's happening, or how to troubleshoot?
(This happens both when running locally, using the localhost mode, as well as 
on a small cluster with four 4-processor nodes each with 15GB of RAM; in both 
cases the executors have 2GB+ of RAM, and none of the inputs/outputs on any of 
the stages is more than 75 MB...)
Thanks,
Michael


   
