This looks odd. Why not just pass "new SomeClass()" to "somefunc"?
You don't need a ThreadLocal if your code doesn't use multiple threads.
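
To make the two options concrete, here is a minimal sketch in plain Java. It
makes several assumptions: SomeClass and somefunc are hypothetical stand-ins
for the ones in your snippet, and a fixed thread pool stands in for the pooled
task threads, so no Spark API is involved. It shows passing a fresh instance
directly, and, if you do keep a ThreadLocal, declaring it once and calling
remove() in a finally block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadLocalSketch {
    // Hypothetical stand-in for SomeClass from the thread.
    static final class SomeClass {
        int uses = 0;
    }

    // Hypothetical stand-in for somefunc(p, helper).
    static String somefunc(String p, SomeClass helper) {
        helper.uses++;
        return p.toUpperCase();
    }

    // Option 1: no ThreadLocal at all, just a fresh helper per record.
    static String withFreshInstance(String p) {
        return somefunc(p, new SomeClass());
    }

    // Option 2: a single ThreadLocal declared once, not once per record.
    static final ThreadLocal<SomeClass> HELPER =
            ThreadLocal.withInitial(SomeClass::new);

    static String withThreadLocal(String p) {
        try {
            return somefunc(p, HELPER.get());
        } finally {
            // Pool threads outlive the task, so drop the value explicitly.
            HELPER.remove();
        }
    }

    // Runs each record on a small pool (an assumption standing in for
    // pooled worker threads) and returns results in submission order.
    static List<String> runOnPool(List<String> records) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String p : records) {
                Callable<String> task = () -> withThreadLocal(p);
                futures.add(pool.submit(task));
            }
            List<String> out = new ArrayList<>();
            for (Future<String> f : futures) {
                out.add(f.get());
            }
            return out;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(withFreshInstance("a"));              // prints "A"
        System.out.println(runOnPool(Arrays.asList("b", "c")));  // prints "[B, C]"
    }
}
```

Note that calling remove() after every record means the helper is rebuilt on
the next get(), so option 2 ends up doing the same work as option 1 here. A
ThreadLocal only pays off if the value is kept across several records on the
same thread and removed at a later, well-defined point.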

On Fri, Jan 29, 2016 at 4:39 PM, N B <nb.nos...@gmail.com> wrote:

> Fixed a typo in the code to avoid any confusion.... Please comment on the
> code below...
>
> dstream.map( p -> {
>     ThreadLocal<SomeClass> d = new ThreadLocal<SomeClass>() {
>         @Override
>         protected SomeClass initialValue() { return new SomeClass(); }
>     };
>     somefunc(p, d.get());
>     d.remove();
>     return p;
> });
>
> On Fri, Jan 29, 2016 at 4:32 PM, N B <nb.nos...@gmail.com> wrote:
>
>> So this use of ThreadLocal will be inside the code of a function
>> executing on the workers i.e. within a call from one of the lambdas. Would
>> it just look like this then:
>>
>> dstream.map( p -> { ThreadLocal<Data> d = new ThreadLocal<>() {
>>          public SomeClass initialValue() { return new SomeClass(); }
>>     };
>>     somefunc(p, d.get());
>>     d.remove();
>>     return p;
>> });
>>
>> Will this make sure that all threads inside the worker clean up the
>> ThreadLocal once they are done with processing this task?
>>
>> Thanks
>> NB
>>
>>
>> On Fri, Jan 29, 2016 at 1:00 PM, Shixiong(Ryan) Zhu <
>> shixi...@databricks.com> wrote:
>>
>>> Spark Streaming uses thread pools, so you need to remove the ThreadLocal
>>> when it is no longer used.
>>>
>>> On Fri, Jan 29, 2016 at 12:55 PM, N B <nb.nos...@gmail.com> wrote:
>>>
>>>> Thanks for the response Ryan. So I would say that it is in fact the
>>>> purpose of a ThreadLocal i.e. to have a copy of the variable as long as the
>>>> thread lives. I guess my concern is around usage of threadpools and whether
>>>> Spark streaming will internally create many threads that rotate between
>>>> tasks on purpose thereby holding onto ThreadLocals that may actually never
>>>> be used again.
>>>>
>>>> Thanks
>>>>
>>>> On Fri, Jan 29, 2016 at 12:12 PM, Shixiong(Ryan) Zhu <
>>>> shixi...@databricks.com> wrote:
>>>>
>>>>> Of course. If you use a ThreadLocal in a long-living thread and forget
>>>>> to remove it, it's definitely a memory leak.
>>>>>
>>>>> On Thu, Jan 28, 2016 at 9:31 PM, N B <nb.nos...@gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Does anyone know if there are any potential pitfalls associated with
>>>>>> using ThreadLocal variables in a Spark Streaming application? One thing
>>>>>> I have seen mentioned in the context of app servers that use thread
>>>>>> pools is that ThreadLocals can leak memory. Could this happen in Spark
>>>>>> Streaming also?
>>>>>>
>>>>>> Thanks
>>>>>> Nikunj
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
