OK so we compact the list (removing dead threads) every time we add a
new entry to the list. This way for a long lived SegmentReader but
short lived threads, the list keeps only live threads.
We do need sync access to the list, but that's only on binding a new
thread. Retrieving an existing thread has no sync.
Mike
robert engels wrote:
You still need to sync access to the list, and how would it be
removed from the list prior to close? That is you need one per
thread, but you can have the reader shared across all threads. So if
threads were created and destroyed without ever closing the reader,
the list would grow unbounded.
On Sep 11, 2008, at 9:20 AM, Michael McCandless wrote:
I don't need it by thread, because I would still use ThreadLocal to
retrieve the SegmentTermEnum. This avoids any sync during get.
The list is just a "fallback" to hold a hard reference to the
SegmentTermEnum to keep it alive. That's it's only purpose. Then,
when SegmentReader is closed this list is cleared and GC is free to
reclaim all SegmentTermEnums.
Mike
robert engels wrote:
But you need it by thread, so it can't be a list.
You could have a HashMap of <Thread,ThreadState> in FieldsReader,
and when SegmentReader is closed, FieldsReader is closed, which
clears the map, and not use thread locals at all. The difference
being you would need a sync'd map.
On Sep 11, 2008, at 4:56 AM, Michael McCandless wrote:
What if we wrap the value in a WeakReference, but secondarily
hold a hard reference to it in a "normal" list?
Then, when TermInfosReader is closed we clear that list of all
its hard references, at which point GC will be free to reclaim
the object out from under the ThreadLocal even before the
ThreadLocal purges its stale entries.
Mike
robert engels wrote:
You can't hold the ThreadLocal value in a WeakReference, because
there is no hard reference between enumeration calls (so it
would be cleared out from under you while enumerating).
All of this occurs because you have some objects (readers/
segments etc.) that are shared across all threads, but these
contain objects that are 'thread/search state' specific. These
latter objects are essentially "cached" for performance (so you
don't need to seek and read, sequential buffer access, etc.)
A sometimes better solution is to have the state returned to the
caller, and require the caller to pass/use the state later -
then you don't need thread locals.
You can accomplish a similar solution by returning a
"SessionKey" object, and have the caller pass this later. You
can then have a WeakHashMap of SessionKey,SearchState that the
code can use. When the SessionKey is destroyed (no longer
referenced), the state map can be cleaned up automatically.
On Sep 10, 2008, at 11:30 PM, Noble Paul നോബിള്
नोब्ळ् wrote:
When I look at the reference tree That is the feeling I get. if
you
held a WeakReference it would get released .
|- base of org.apache.lucene.index.CompoundFileReader
$CSIndexInput
|- input of org.apache.lucene.index.SegmentTermEnum
|- value of java.lang.ThreadLocal$ThreadLocalMap
$Entry
On Wed, Sep 10, 2008 at 8:39 PM, Chris Lu <[EMAIL PROTECTED]>
wrote:
Does this make any difference?
If I intentionally close the searcher and reader failed to
release the
memory, I can not rely on some magic of JVM to release it.
--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per
request) got
2.6 Million Euro funding!
On Wed, Sep 10, 2008 at 4:03 AM, Noble Paul
നോബിള് नोब्ळ्
<[EMAIL PROTECTED]> wrote:
Why do you need to keep a strong reference?
Why not a WeakReference ?
--Noble
On Wed, Sep 10, 2008 at 12:27 AM, Chris Lu
<[EMAIL PROTECTED]> wrote:
The problem should be similar to what's talked about on this
discussion.
http://lucene.markmail.org/message/keosgz2c2yjc7qre?q=ThreadLocal
There is a memory leak for Lucene search from Lucene-1195.
(svn r659602,
May23,2008)
This patch brings in a ThreadLocal cache to TermInfosReader.
It's usually recommended to keep the reader open, and reuse
it when
possible. In a common J2EE application, the http requests
are usually
handled by different threads. But since the cache is
ThreadLocal, the
cache
are not really usable by other threads. What's worse, the
cache can not
be
cleared by another thread!
This leak is not so obvious usually. But my case is using
RAMDirectory,
having several hundred megabytes. So one un-released
resource is obvious
to
me.
Here is the reference tree:
org.apache.lucene.store.RAMDirectory
|- directory of org.apache.lucene.store.RAMFile
|- file of org.apache.lucene.store.RAMInputStream
|- base of
org.apache.lucene.index.CompoundFileReader$CSIndexInput
|- input of org.apache.lucene.index.SegmentTermEnum
|- value of java.lang.ThreadLocal
$ThreadLocalMap$Entry
After I switched back to svn revision 659601, right before
this patch is
checked in, the memory leak is gone.
Although my case is RAMDirectory, I believe this will affect
disk based
index also.
--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per
request)
got
2.6 Million Euro funding!
--
--Noble Paul
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: java-dev-
[EMAIL PROTECTED]
--
--Noble Paul
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]