Could you send a PR to fix it? Thanks!

Best Regards,
Shixiong Zhu
2015-12-08 13:31 GMT-08:00 Richard Marscher <rmarsc...@localytics.com>:

> Alright, I was able to work through the problem.
>
> The owning thread was one from the executor task launch worker, which at
> least in local mode runs the task and the related user code of the task.
> After judiciously naming every thread in the pools in the user code (with
> a custom ThreadFactory), I was able to trace the leak down to a couple of
> thread pools that were not shut down properly, by noticing the named
> threads accumulating in thread dumps of the JVM process.
>
> On Mon, Dec 7, 2015 at 6:41 PM, Richard Marscher <rmarsc...@localytics.com> wrote:
>
>> Thanks for the response.
>>
>> The version is Spark 1.5.2.
>>
>> Some examples of the thread names:
>>
>> pool-1061-thread-1
>> pool-1059-thread-1
>> pool-1638-thread-1
>>
>> Hundreds, then thousands, of these become stranded in WAITING.
>>
>> I added logging to try to track the lifecycle of the thread pool in
>> Executor, as mentioned before. Here is an excerpt, but everything seems
>> fine there: every executor that starts is also shut down, and it appears
>> to shut down cleanly.
>>
>> 15/12/07 23:30:21 WARN o.a.s.e.Executor: Threads finished in executor driver. pool shut down java.util.concurrent.ThreadPoolExecutor@e5d036b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
>> 15/12/07 23:30:28 WARN o.a.s.e.Executor: Executor driver created, thread pool: java.util.concurrent.ThreadPoolExecutor@3bc41ae3[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
>> 15/12/07 23:31:06 WARN o.a.s.e.Executor: Threads finished in executor driver. pool shut down java.util.concurrent.ThreadPoolExecutor@3bc41ae3[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 36]
>> 15/12/07 23:31:11 WARN o.a.s.e.Executor: Executor driver created, thread pool: java.util.concurrent.ThreadPoolExecutor@6e85ece4[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
>> 15/12/07 23:34:35 WARN o.a.s.e.Executor: Threads finished in executor driver. pool shut down java.util.concurrent.ThreadPoolExecutor@6e85ece4[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 288]
>>
>> Also, here is an example thread dump of such a thread:
>>
>> "pool-493-thread-1" prio=10 tid=0x00007f0e60612800 nid=0x18c4 waiting on condition [0x00007f0c33c3e000]
>>    java.lang.Thread.State: WAITING (parking)
>>         at sun.misc.Unsafe.park(Native Method)
>>         - parking to wait for <0x00007f10b3e8fb60> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>>         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:745)
>>
>> On Mon, Dec 7, 2015 at 6:23 PM, Shixiong Zhu <zsxw...@gmail.com> wrote:
>>
>>> Which version are you using? Could you post these thread names here?
>>>
>>> Best Regards,
>>> Shixiong Zhu
>>>
>>> 2015-12-07 14:30 GMT-08:00 Richard Marscher <rmarsc...@localytics.com>:
>>>
>>>> Hi,
>>>>
>>>> I've been running benchmarks against Spark in local mode in a
>>>> long-running process, and I'm seeing threads leak each time it runs a
>>>> job. It doesn't matter whether I recycle the SparkContext constantly or
>>>> keep one context alive for the entire application lifetime.
>>>>
>>>> I see an ongoing accumulation of "pool-xxxx-thread-1" threads whose
>>>> creating thread is "Executor task launch worker-xx", where the x's are
>>>> numbers. The number of leaks per launch worker varies, but is usually
>>>> one to a few.
>>>>
>>>> Searching the Spark code, the pool is created in the Executor class,
>>>> and `shutdown()` is called on it in the executor's stop. I've wired up
>>>> logging and also tried shutdownNow() and awaitTermination on the pools.
>>>> Everything seems okay there for every Executor whose `stop()` is
>>>> called, but I'm not yet sure whether every Executor is stopped that
>>>> way, which I am looking into now.
>>>>
>>>> What I'm curious to know is whether anyone has seen a similar issue?
>>>>
>>>> --
>>>> *Richard Marscher*
>>>> Software Engineer
>>>> Localytics
>>>> Localytics.com <http://localytics.com/> | Our Blog <http://localytics.com/blog> | Twitter <http://twitter.com/localytics> | Facebook <http://facebook.com/localytics> | LinkedIn <http://www.linkedin.com/company/1148792?trk=tyah>
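For readers hitting a similar leak, here is a minimal sketch of the thread-naming approach Richard describes. The `NamedThreadFactory` class and its `poolName` parameter are illustrative (not code from this thread): the point is that a pool built with a named factory produces threads whose names identify their owner in a jstack dump, instead of the anonymous `pool-NNN-thread-1` default that made these leaks hard to attribute.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory: tags every thread with a pool name so leaked
// workers are identifiable in thread dumps (vs. "pool-1061-thread-1").
public class NamedThreadFactory implements ThreadFactory {
    private final String poolName;
    private final AtomicInteger counter = new AtomicInteger(1);

    public NamedThreadFactory(String poolName) {
        this.poolName = poolName;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, poolName + "-thread-" + counter.getAndIncrement());
        t.setDaemon(true); // daemon so a leaked pool cannot keep the JVM alive
        return t;
    }

    public static void main(String[] args) throws Exception {
        // "benchmark-io" is an illustrative pool name.
        ExecutorService pool =
            Executors.newFixedThreadPool(2, new NamedThreadFactory("benchmark-io"));
        pool.submit(() -> System.out.println(Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```

With every pool named this way, a growing count of, say, `benchmark-io-thread-*` entries in successive thread dumps points straight at the pool that is never shut down.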
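The eventual fix (shutting down the user-code pools properly) follows the standard `java.util.concurrent` shutdown pattern that the thread's `shutdown()`/`shutdownNow()`/`awaitTermination` discussion alludes to. This is a generic sketch, not Spark's actual Executor code; the 10-second timeout is an arbitrary choice for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolShutdown {
    // Orderly shutdown, then escalate: wait for tasks to drain, and if they
    // don't, interrupt them so no worker is left parked forever in WAITING.
    static void shutdownAndAwait(ExecutorService pool) {
        pool.shutdown(); // stop accepting new tasks
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // cancel queued tasks, interrupt running ones
                if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                    System.err.println("Pool did not terminate");
                }
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        pool.submit(() -> System.out.println("task ran"));
        shutdownAndAwait(pool);
        System.out.println("terminated: " + pool.isTerminated());
        // → task ran
        // → terminated: true
    }
}
```

A pool that is merely dereferenced, without `shutdown()`, leaves its idle workers blocked in `LinkedBlockingQueue.take`, which is exactly the WAITING stack shown in the thread dump above.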