Re: Very long pause/hang at end of execution

2016-11-16 Thread Michael Johnson
On Wed, Nov 16, 2016 at 10:44 AM Aniket Bhatnagar wrote: Thanks for sharing the thread dump. I had a look at them and couldn't find anything unusual. Is there anything in the logs (driver + executor) that suggests what's going on? Also, what does the spark job do

Re: Very long pause/hang at end of execution

2016-11-16 Thread Aniket Bhatnagar
Also, how are you launching the application? Through spark-submit or by creating a SparkContext in your app? Thanks, Aniket On Wed, Nov 16, 2016 at 10:44 AM Aniket Bhatnagar < aniket.bhatna...@gmail.com> wrote: > Thanks for sharing the thread dump. I had a look at them and couldn't find > anything

Re: Very long pause/hang at end of execution

2016-11-16 Thread Aniket Bhatnagar
Thanks for sharing the thread dump. I had a look at them and couldn't find anything unusual. Is there anything in the logs (driver + executor) that suggests what's going on? Also, what does the spark job do and what is the version of spark and hadoop you are using? Thanks, Aniket On Wed, Nov 16,

Re: Very long pause/hang at end of execution

2016-11-16 Thread Pietro Pugni
I have the same issue with Spark 2.0.1, Java 1.8.x and pyspark. I also use SparkSQL and JDBC. My application runs locally. It happens only if I connect to the UI during Spark execution, and even if I close the browser before the execution ends. I observed this behaviour both on macOS Sierra and Red

Re: Very long pause/hang at end of execution

2016-11-06 Thread Michael Johnson
Hm. Something must have changed, as it was happening quite consistently and now I can't get it to reproduce. Thank you for the offer, and if it happens again I will try grabbing thread dumps and I will see if I can figure out what is going on. On Sunday, November 6, 2016 10:02 AM, Aniket

Re: Very long pause/hang at end of execution

2016-11-06 Thread Gourav Sengupta
Hi, In case your process finishes after a lag, then please check whether you are writing by converting to Pandas or using coalesce (in which case entire traffic is being directed to a single node) or writing over S3, in which case there can be lags. Regards, Gourav On Sun, Nov 6, 2016 at 1:28

Re: Very long pause/hang at end of execution

2016-11-06 Thread Aniket Bhatnagar
I doubt it's GC as you mentioned that the pause is several minutes. Since it's reproducible in local mode, can you run the spark application locally and once your job is complete (and application appears paused), can you take 5 thread dumps (using jstack or jcmd on the local spark JVM process)
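
As a rough illustration of the suggestion above, here is a minimal Python sketch that takes several thread dumps of a local JVM in a row. It relies on the standard JDK commands `jstack <pid>` and `jcmd <pid> Thread.print`. The helper names, the 10-second default interval and the injectable `run`/`sleep` parameters are assumptions made for this example and are not from the thread; the pid has to be looked up separately (for instance with `jps`).

```python
import subprocess
import time


def thread_dump_command(pid, tool="jstack"):
    """Build the command line that prints a thread dump for a JVM pid."""
    if tool == "jstack":
        return ["jstack", str(pid)]
    if tool == "jcmd":
        return ["jcmd", str(pid), "Thread.print"]
    raise ValueError("tool must be 'jstack' or 'jcmd'")


def collect_thread_dumps(pid, count=5, interval_s=10.0, tool="jstack",
                         run=None, sleep=time.sleep):
    """Take `count` thread dumps, pausing `interval_s` seconds between them.

    `run` and `sleep` are hypothetical hooks so the function can be tried
    without a real JVM; by default the JDK tool is invoked via subprocess.
    """
    if run is None:
        def run(cmd):
            # Requires the JDK tools to be on PATH.
            result = subprocess.run(cmd, capture_output=True, text=True, check=True)
            return result.stdout

    cmd = thread_dump_command(pid, tool)
    dumps = []
    for i in range(count):
        dumps.append(run(cmd))
        if i < count - 1:  # no need to wait after the last dump
            sleep(interval_s)
    return dumps
```

Taking several dumps a few seconds apart makes it easier to tell a thread that is genuinely stuck (same stack every time) from one that is merely busy.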

Re: Very long pause/hang at end of execution

2016-11-06 Thread Michael Johnson
Thanks; I tried looking at the thread dumps for the driver and the one executor that had that option in the UI, but I'm afraid I don't know how to interpret what I saw...  I don't think it could be my code directly, since at this point my code has all completed? Could GC be taking that long?

Re: Very long pause/hang at end of execution

2016-11-06 Thread Aniket Bhatnagar
In order to know what's going on, you can study the thread dumps either from spark UI or from any other thread dump analysis tool. Thanks, Aniket On Sun, Nov 6, 2016 at 1:31 PM Michael Johnson wrote: > I'm doing some processing and then clustering of a small
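
For readers unsure how to start reading a dump, the following is a minimal sketch of one possible first pass, not a full analysis tool. It assumes the text layout that `jstack` prints on Java 8 (quoted thread name, then `#<n>`, an optional `daemon` marker, and a `java.lang.Thread.State:` line underneath). It counts threads per state and lists non-daemon Java threads, since a live non-daemon thread is what keeps a JVM from exiting. The function name is hypothetical.

```python
import re
from collections import Counter

# Header line of a thread entry: "thread name" followed by its attributes.
_HEADER = re.compile(r'^"(?P<name>[^"]+)"(?P<rest>.*)$')
# State line printed under each Java thread header.
_STATE = re.compile(r'^\s*java\.lang\.Thread\.State: (?P<state>[A-Z_]+)')


def summarize_thread_dump(text):
    """Return (Counter of thread states, list of non-daemon Java thread names)."""
    states = Counter()
    non_daemon = []
    for line in text.splitlines():
        header = _HEADER.match(line)
        if header:
            attrs = header.group("rest").split()
            # Java threads carry a "#<n>" id; VM-internal threads do not
            # and are skipped here.
            if attrs and attrs[0].startswith("#") and "daemon" not in attrs:
                non_daemon.append(header.group("name"))
            continue
        state = _STATE.match(line)
        if state:
            states[state.group("state")] += 1
    return states, non_daemon
```

If the application appears paused after its last log statement, the non-daemon threads other than `main` are a reasonable place to look first, along with whatever the `RUNNABLE` threads are doing across successive dumps.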

Very long pause/hang at end of execution

2016-11-06 Thread Michael Johnson
I'm doing some processing and then clustering of a small dataset (~150 MB). Everything seems to work fine, until the end; the last few lines of my program are log statements, but after printing those, nothing seems to happen for a long time...many minutes; I'm not usually patient enough to let