Thanks; I tried looking at the thread dumps for the driver and for the one executor that had that option in the UI, but I'm afraid I don't know how to interpret what I saw... I don't think it could be my code directly, since by this point all of my code has completed. Could GC be taking that long? (I could also grab the thread dumps and paste them here, if that would help.)
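For anyone reading along who wants a simpler view than the full dump, here is a minimal Java sketch. It is not something posted in this thread, and it assumes you can run a few lines inside the driver JVM right after the final log statements. The class name LiveThreadReport and the method describeThreads are hypothetical. It lists each live thread with its daemon flag, state, and top few stack frames. Non-daemon threads other than main are worth looking at first, because the JVM does not exit while any of them is still running.

```java
import java.util.Map;

// Hypothetical helper, not part of Spark: summarizes the live threads in the current JVM.
class LiveThreadReport {

    // Returns one text block per live thread: name, daemon flag, state,
    // followed by at most maxFrames stack frames.
    static String describeThreads(int maxFrames) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append(String.format("%s daemon=%b state=%s%n",
                    t.getName(), t.isDaemon(), t.getState()));
            StackTraceElement[] frames = entry.getValue();
            for (int i = 0; i < Math.min(maxFrames, frames.length); i++) {
                out.append("    at ").append(frames[i]).append(System.lineSeparator());
            }
        }
        return out.toString();
    }

    // Counts live non-daemon threads; these are the ones that can keep the JVM from exiting.
    static long countNonDaemonThreads() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> !t.isDaemon())
                .count();
    }

    public static void main(String[] args) {
        System.out.print(describeThreads(5));
        System.out.println("non-daemon threads: " + countNonDaemonThreads());
    }
}
```

If a thread in the RUNNABLE state shows the same top frames across several reports taken a minute apart, that is probably where the time is going. This only covers the JVM it runs in, so the executors would still need the thread dump option in the UI.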

On Sunday, November 6, 2016 8:36 AM, Aniket Bhatnagar <aniket.bhatna...@gmail.com> wrote:

In order to know what's going on, you can study the thread dumps either from the Spark UI or from any other thread dump analysis tool.

Thanks,
Aniket

On Sun, Nov 6, 2016 at 1:31 PM Michael Johnson <mjjohnson....@yahoo.com.invalid> wrote:

I'm doing some processing and then clustering of a small dataset (~150 MB). Everything seems to work fine, until the end; the last few lines of my program are log statements, but after printing those, nothing seems to happen for a long time... many minutes; I'm not usually patient enough to let it go, but I think one time when I did just wait, it took over an hour (and did eventually exit on its own). Any ideas on what's happening, or how to troubleshoot?

(This happens both when running locally, using the localhost mode, as well as on a small cluster with four 4-processor nodes each with 15GB of RAM; in both cases the executors have 2GB+ of RAM, and none of the inputs/outputs on any of the stages is more than 75 MB...)

Thanks,
Michael