Fixed a typo in the code to avoid any confusion. Please comment on the code below.

dstream.map(p -> {
    ThreadLocal<SomeClass> d = new ThreadLocal<SomeClass>() {
        public SomeClass initialValue() { return new SomeClass(); }
    };
    somefunc(p, d.get());
    d.remove();
    return p;
});

On Fri, Jan 29, 2016 at 4:32 PM, N B <nb.nos...@gmail.com> wrote:

> So this use of ThreadLocal will be inside the code of a function executing
> on the workers i.e. within a call from one of the lambdas. Would it just
> look like this then:
>
> dstream.map( p -> { ThreadLocal<Data> d = new ThreadLocal<>() {
>     public SomeClass initialValue() { return new SomeClass(); }
> };
> somefunc(p, d.get());
> d.remove();
> return p;
> }; );
>
> Will this make sure that all threads inside the worker clean up the
> ThreadLocal once they are done with processing this task?
>
> Thanks
> NB
>
>
> On Fri, Jan 29, 2016 at 1:00 PM, Shixiong(Ryan) Zhu <
> shixi...@databricks.com> wrote:
>
>> Spark Streaming uses threadpools so you need to remove ThreadLocal when
>> it's not used.
>>
>> On Fri, Jan 29, 2016 at 12:55 PM, N B <nb.nos...@gmail.com> wrote:
>>
>>> Thanks for the response Ryan. So I would say that it is in fact the
>>> purpose of a ThreadLocal i.e. to have a copy of the variable as long as the
>>> thread lives. I guess my concern is around usage of threadpools and whether
>>> Spark streaming will internally create many threads that rotate between
>>> tasks on purpose thereby holding onto ThreadLocals that may actually never
>>> be used again.
>>>
>>> Thanks
>>>
>>> On Fri, Jan 29, 2016 at 12:12 PM, Shixiong(Ryan) Zhu <
>>> shixi...@databricks.com> wrote:
>>>
>>>> Of course. If you use a ThreadLocal in a long living thread and forget
>>>> to remove it, it's definitely a memory leak.
>>>>
>>>> On Thu, Jan 28, 2016 at 9:31 PM, N B <nb.nos...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> Does anyone know if there are any potential pitfalls associated with
>>>>> using ThreadLocal variables in a Spark streaming application? One thing I
>>>>> have seen mentioned in the context of app servers that use thread pools is
>>>>> that ThreadLocals can leak memory. Could this happen in Spark streaming
>>>>> also?
>>>>>
>>>>> Thanks
>>>>> Nikunj
>>>>>
>>>>
>>>
>>
>
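
For reference, here is a minimal sketch of the cleanup pattern discussed in this thread, written in plain Java with no Spark dependency. It assumes a plain fixed-size thread pool as a stand-in for the worker threads; Scratch, SCRATCH, and process are hypothetical names standing in for SomeClass and the body of the map() lambda. Two things differ from the snippet in the thread: the ThreadLocal is a single static field rather than a new instance created for every record (a ThreadLocal created inside the lambda is never seen by any later call, so it gives no per-thread reuse), and remove() sits in a finally block so it still runs if the work throws.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadLocalCleanup {
    // Hypothetical stand-in for SomeClass: a per-thread scratch object.
    static final class Scratch {
        final StringBuilder buf = new StringBuilder();
    }

    // One shared ThreadLocal; each pool thread lazily gets its own Scratch.
    private static final ThreadLocal<Scratch> SCRATCH =
            ThreadLocal.withInitial(Scratch::new);

    // Hypothetical stand-in for the body of the map() lambda.
    static String process(String p) {
        try {
            Scratch s = SCRATCH.get();
            s.buf.append(p).append('!');
            return s.buf.toString();
        } finally {
            // Runs even if the work above throws, so the pooled thread
            // does not keep a reference to the Scratch after the task.
            SCRATCH.remove();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a small fixed pool models long-lived, reused worker threads.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<String>> results = new ArrayList<>();
            for (String p : new String[] {"a", "b", "c", "d"}) {
                results.add(pool.submit(() -> process(p)));
            }
            for (Future<String> f : results) {
                System.out.println(f.get());
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Because remove() runs after every call, each call starts from a fresh Scratch even when the same pool thread handles several records. If the remove() were dropped, the buffer would keep growing for as long as the thread lives, which is the leak raised earlier in the thread.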