Fixed a typo in the code to avoid any confusion.... Please comment on the
code below...

dstream.map( p -> { ThreadLocal<SomeClass> d = new ThreadLocal<>() {
         public SomeClass initialValue() { return new SomeClass(); }
    };
    somefunc(p, d.get());
    d.remove();
    return p;
}; );

On Fri, Jan 29, 2016 at 4:32 PM, N B <nb.nos...@gmail.com> wrote:

> So this use of ThreadLocal will be inside the code of a function executing
> on the workers i.e. within a call from one of the lambdas. Would it just
> look like this then:
>
> dstream.map( p -> { ThreadLocal<Data> d = new ThreadLocal<>() {
>          public SomeClass initialValue() { return new SomeClass(); }
>     };
>     somefunc(p, d.get());
>     d.remove();
>     return p;
> }; );
>
> Will this make sure that all threads inside the worker clean up the
> ThreadLocal once they are done with processing this task?
>
> Thanks
> NB
>
>
> On Fri, Jan 29, 2016 at 1:00 PM, Shixiong(Ryan) Zhu <
> shixi...@databricks.com> wrote:
>
>> Spark Streaming uses threadpools so you need to remove ThreadLocal when
>> it's not used.
>>
>> On Fri, Jan 29, 2016 at 12:55 PM, N B <nb.nos...@gmail.com> wrote:
>>
>>> Thanks for the response Ryan. So I would say that it is in fact the
>>> purpose of a ThreadLocal i.e. to have a copy of the variable as long as the
>>> thread lives. I guess my concern is around usage of threadpools and whether
>>> Spark streaming will internally create many threads that rotate between
>>> tasks on purpose thereby holding onto ThreadLocals that may actually never
>>> be used again.
>>>
>>> Thanks
>>>
>>> On Fri, Jan 29, 2016 at 12:12 PM, Shixiong(Ryan) Zhu <
>>> shixi...@databricks.com> wrote:
>>>
>>>> Of cause. If you use a ThreadLocal in a long living thread and forget
>>>> to remove it, it's definitely a memory leak.
>>>>
>>>> On Thu, Jan 28, 2016 at 9:31 PM, N B <nb.nos...@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> Does anyone know if there are any potential pitfalls associated with
>>>>> using ThreadLocal variables in a Spark streaming application? One things I
>>>>> have seen mentioned in the context of app servers that use thread pools is
>>>>> that ThreadLocals can leak memory. Could this happen in Spark streaming
>>>>> also?
>>>>>
>>>>> Thanks
>>>>> Nikunj
>>>>>
>>>>>
>>>>
>>>
>>
>

Reply via email to