Using synchronization is a poor/invalid substitute for thread locals in many cases.

The point of the thread local in these referenced cases is to allow streaming reads on a file descriptor. If you use a shared file descriptor/buffer, you are going to continually invalidate the buffer.

On Jul 8, 2008, at 5:12 AM, Michael McCandless wrote:


Well ... SegmentReader uses ThreadLocal to hold a thread-private instance of TermVectorsReader, to avoid synchronizing per-document when loading term vectors.

Clearing this ThreadLocal value per call to SegmentReader's methods that load term vectors would defeat its purpose.

Though, of course, we then synchronize on the underlying file (when using FSDirectory), so perhaps we are really not saving much by using ThreadLocal here. But we are looking to relax that low level synchronization with LUCENE-753.

Maybe we could make our own ThreadLocal that just uses a HashMap, which we'd have to synchronize on when getting the per-thread instances. Or, go back to sharing a single TermVectorsReader and synchronize per-document.
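A minimal sketch of the HashMap-based idea above, assuming a hypothetical class named MapBackedThreadLocal (not an existing Lucene or JDK class). The per-thread values live in a map owned by the object itself, so they become unreachable together with their owner rather than lingering in each Thread's ThreadLocalMap. The cost is one synchronized call per lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical ThreadLocal replacement backed by a synchronized HashMap. */
final class MapBackedThreadLocal<T> {
    // Keyed by thread; guarded by this object's monitor.
    private final Map<Thread, T> values = new HashMap<>();
    private final Supplier<T> factory;

    MapBackedThreadLocal(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the calling thread's instance, creating it on first use. */
    synchronized T get() {
        return values.computeIfAbsent(Thread.currentThread(), t -> factory.get());
    }

    /** Drops every per-thread instance, e.g. when the owning reader is closed. */
    synchronized void clear() {
        values.clear();
    }

    /** Number of threads that currently hold an instance. */
    synchronized int size() {
        return values.size();
    }
}
```

Note that this map holds strong references to the Thread keys until clear() is called, so the owner would need to clear it on close; whether a weakly keyed map is a better fit is left open here.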

Jason has suggested moving to a model where you ask the IndexReader for an object that can return term vectors / stored fields / etc, and then you interact with that many times to retrieve each doc. We could then synchronize only on retrieving that object, and provide a thread-private instance.
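The model Jason suggested could look roughly like the sketch below. SketchReader and Accessor are hypothetical names used only for illustration, not Lucene's API: the reader synchronizes once when handing out an accessor, and the caller then uses that private accessor for many documents without further locking.

```java
/** Hypothetical stand-in for a reader; not Lucene's actual API. */
final class SketchReader {
    private final String[] docs;
    private int opened; // guarded by this object's monitor

    SketchReader(String... docs) {
        this.docs = docs.clone();
    }

    /** The only synchronized step: hand out a caller-private accessor. */
    synchronized Accessor newAccessor() {
        opened++;
        return new Accessor();
    }

    synchronized int accessorsOpened() {
        return opened;
    }

    /** Owned by a single caller, so per-document calls take no lock. */
    final class Accessor {
        private int reads; // state private to this accessor

        String document(int n) {
            reads++;
            return docs[n];
        }

        int reads() {
            return reads;
        }
    }
}
```

The caller decides how long the accessor lives, so nothing is parked in a ThreadLocal and the accessor is collected once the caller drops it.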

It seems like we should move away from using ThreadLocal in Lucene and do "normal" synchronization instead.

Mike

Adrian Tarau wrote:

Usually ThreadLocal.remove() should be called at the end (in a finally block), before the current call leaves your code.

Ex: if ThreadLocal is used during searching, every search(..) method should clean up any ThreadLocal variables, or the cleanup should happen even deeper in the implementation. When the call leaves Lucene, any used ThreadLocal should be cleaned up.
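As a rough illustration of the cleanup pattern described above, the sketch below sets a thread-local value on entry and removes it in a finally block before returning. SearchSketch, SCRATCH, and search(String) are hypothetical names, not Lucene code.

```java
/** Hypothetical example of per-call ThreadLocal cleanup; not Lucene code. */
final class SearchSketch {
    // Per-thread scratch buffer, populated only for the duration of a call.
    private static final ThreadLocal<StringBuilder> SCRATCH = new ThreadLocal<>();

    static String search(String query) {
        SCRATCH.set(new StringBuilder());
        try {
            // Work that relies on the thread-private buffer.
            return SCRATCH.get().append("results for ").append(query).toString();
        } finally {
            // Remove the entry so the thread keeps no reference after the call.
            SCRATCH.remove();
        }
    }

    /** True if the calling thread still holds a scratch buffer. */
    static boolean scratchIsSet() {
        return SCRATCH.get() != null;
    }
}
```

The trade-off is that the value is rebuilt on every call, so this only pays off when creating it is cheap compared with the work done inside the call.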

Michael McCandless wrote:

ThreadLocal, which we use in several places in Lucene, causes a leak in app servers because the classloader never fully deallocates Lucene's classes because the ThreadLocal is holding strong references.

Yet, ThreadLocal is very convenient for avoiding synchronization.

Does anyone have any ideas on how to solve this w/o falling back to "normal" synchronization?

Mike

Begin forwarded message:

From: "Yonik Seeley" <[EMAIL PROTECTED]>
Date: July 7, 2008 3:30:28 PM EDT
To: [EMAIL PROTECTED]
Subject: Re: ThreadLocal in SegmentReader
Reply-To: [EMAIL PROTECTED]

On Mon, Jul 7, 2008 at 2:43 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
So now I'm confused: the SegmentReader itself should no longer be reachable,
assuming you are not holding any references to your IndexReader.

Which means the ThreadLocal instance should no longer be reachable.

It will still be referenced from the ThreadLocalMap of the Thread(s).
The key (the ThreadLocal) will be weakly referenced, but the values
(now stale) are strongly referenced and won't actually be removed
until the table is resized (under the Java6 impl at least).
Nice huh?

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


