That is correct.ANY library that uses thread locals is going to have the same issue, and in many cases the containers themselves use static thread locals.
This is just another reason why (I think) a better solution when using Lucene as a service is to run it in its own process, and communicate with the process from the web container. You have better memory/request control - all of the recent memory caching enhancements are worthless when running Lucene along side other apps in the same web container - future JSRs might make this feasible, but not yet. It allows for better process monitoring, etc. So many advantages - very few downsides
On Jul 14, 2008, at 8:38 AM, Roman Puchkovskiy wrote:
1. It is not obligatory for servlet container to deploy applications from the same thread. It easily can happen, that first instance of webapp is loaded by the main thread at container start-up, then repedloy happens because a hot deployer discovers that WAR has changed (and this hot deployer easily may work in another thread, not in the main one), and after that the webapp is redeployed in response to HTTP request (Tomcat manager is example, whichis webapp itself and may redeploy other applications). Here a bunch of threads deploy webapp, and they are all different.2. What if a user just wants to undeploy the webapp, without the redeploy?He expects the memory to be released, but it will not be. Robert Engels wrote:If you attempting to restart the web app context, then when restarted the Thread will still be the same and as it runs, new ThreadLocals will be created which will cause the old to be removed, eventually allowing the class loader to be unloaded. On Jul 14, 2008, at 12:31 AM, Roman Puchkovskiy wrote:Oops, sent too early, excuse me :) One more to say: yes, Yonik is correct. Values remain strong-referenced until the stale values are removed from the table. Just a quote from the ThreadLocals' javadoc: "However, since reference queues are not used, stale entries are guaranteed to be removed only when the table starts running out of space." How do you suggest to force the removal of stale entries to the Lucene user? Robert Engels wrote:You are mistaken - Yonik's comment in that thread is correct(although it is not just at table resize - any time a ThreadLocal isadded, and any time the ThreadLocal is not found in its direct hash it might clear entries).The ThreadLocals map only has a WeakReference to the ThreadLocal, so if that is the only reference, it will be GC'd - eventually, and thenit will be cleared as new ThreadLocals are created.With a static reference, the thread can reference the ThreadLocal atany time, and thus the WeakReference will not be cleared.If the object is VERY large, and new ThreadLocals are not created itcould cause a problem, but I don't think that is the case with Lucene, as the objects stored in ThreadLocals are designed to live for the life of the SegmentReader/IndexReader and thread. On Jul 12, 2008, at 2:12 AM, Roman Puchkovskiy wrote:Well, possibly I'm mistaken, but it seems that this affects non- static fields too. Please see http://www.nabble.com/ThreadLocal-in-SegmentReader-to18306230.html where the use case is described in the details. In short: it seems that the scope of ThreadLocals does not matter. What really matters is that they are referenced by ThreadLocals map in the thread which is still alive. Robert Engels wrote:This is only an issue for static ThreadLocals ... On Jul 11, 2008, at 11:32 PM, Roman Puchkovskiy wrote:The problem here is not because ThreadLocal instances are not GC'd(they are GC'd, and your test shows this clearly). But even one instance which is not removed from its Thread is enough to prevent the classloader from being unloaded, and that's the problem. Michael McCandless-2 wrote:OK, I created a simple test to test this (attached). The test just runs 10 threads, each one creating a 100 KB byte array which is stored into a ThreadLocal, and then periodically the ThreadLocal is replaced with a new one. This is to test whether GC of a ThreadLocal, even though the thread is still alive, in fact leads to GC of the objects held in the ThreadLocal. Indeed on Sun JRE 1.4, 1.5 and 1.6 it appears that the objects are in fact properly collected. So this is not a leak but rather a "delayed collection" issue. Java'sGC is never guaranteed to be immediate, and apparently when usingThreadLocals it's even less immediate than "normal". In the original issue, if other things create ThreadLocals, then eventually Lucene's unreferenced ThreadLocals would be properly collected. So I think we continue to use non-static ThreadLocals in Lucene... Mike robert engels wrote:Once again, these are "static" thread locals. A completely different issue. Since the object is available statically, the weak reference cannot be cleared so stale entries will never be cleared as long as the thread is alive. On Jul 9, 2008, at 4:46 PM, Adrian Tarau wrote:Just a few examples of "problems" using ThreadLocals. http://opensource.atlassian.com/projects/hibernate/browse/ HHH-2481 http://www.theserverside.com/news/thread.tss?thread_id=41473Once again, I'm not pointing to Lucene SegmentReader as a "bad"implementation, and maybe the current "problems" of ThreadLocals are not a problem for SegmentReader but it seems safer to useThreadLocals to pass context information which is cleared whenthe call exits instead of storing long-lived objects. robert engels wrote:Aside from the pre-1.5 thread local "perceived leak", there are no issues with ThreadLocals if used properly. There is no need for try/finally blocks, unless you MUST releaseresources immediately, usually this is not the case, which iswhy a ThreadLocal is used in the first place. From the ThreadLocalMap javadoc... /** * ThreadLocalMap is a customized hash map suitable only for * maintaining thread local values. No operations are exported* outside of the ThreadLocal class. The class is packageprivate to * allow declaration of fields in class Thread. To help deal with * very large and long-lived usages, the hash table entries use * WeakReferences for keys. However, since reference queues are not * used, stale entries are guaranteed to be removed only when * the table starts running out of space. */ /** * Heuristically scan some cells looking for stale entries. * This is invoked when either a new element is added, or* another stale one has been expunged. It performs a* logarithmic number of scans, as a balance between no* scanning (fast but retains garbage) and a number ofscans * proportional to number of elements, that would find all * garbage but would cause some insertions to take O (n) time. * * @param i a position known NOT to hold a stale entry. The * scan starts at the element after i. * * @param n scan control: <tt>log2(n)</tt> cells are scanned, * unless a stale entry one is found, in which case* <tt>log2(table.length)-1</tt> additional cells arescanned.* When called from insertions, this parameter is thenumber * of elements, but when from replaceStaleEntry, it is the * table length. (Note: all this could be changed to be either * more or less aggressive by weighting n instead of just * using straight log n. But this version is simple, fast, and * seems to work well.) * * @return true if any stale entries have been removed. */The instance ThreadLocals (and what the refer to) will be GC'dwhen the containing Object is GC'd. There IS NO MEMORY LEAK in ThreadLocal. If the ThreadLocal refers to an object that has native resources (e.g. file handles), it may not be released until other thread locals are created by the thread (or the thread terminates). You can avoid this "delay" by calling remove(), but in most applications it should never be necessary - unless a very strange usage... On Jul 9, 2008, at 2:37 PM, Adrian Tarau wrote:From what I know, storing objects in ThreadLocal is safe as longas you release the object within a try {} finall {} block orstore objects which are independent of the rest of the code (no dependencies).Otherwise it can get pretty tricky(memory leaks, classloader problems) after awhile. It is pretty convenient to pass HTTP request information with aThreadLocal in a servlet(but you should cleanup the variablebefore leaving the servlet) but I'm not sure how safe it is in this case. robert engels wrote:Using synchronization is a poor/invalid substitute for thread locals in many cases. The point of the thread local in these referenced cases is too allow streaming reads on a file descriptor. if you use a shared file descriptor/buffer you are going to continually invalidate the buffer. On Jul 8, 2008, at 5:12 AM, Michael McCandless wrote:Well ... SegmentReader uses ThreadLocal to hold a thread- private instance of TermVectorsReader, to avoid synchronizing per-document when loading term vectors.Clearing this ThreadLocal value per call to SegmentReader'smethods that load term vectors would defeat its purpose. Though, of course, we then synchronize on the underlying file (when using FSDirectory), so perhaps we are really not saving much by using ThreadLocal here. But we are looking to relax that low level synchronization with LUCENE-753. Maybe we could make our own ThreadLocal that just uses aHashMap, which we'd have to synchronize on when getting theper- thread instances. Or, go back to sharing a single TermVectorsReader and synchronize per-document. Jason has suggested moving to a model where you ask the IndexReader for an object that can return term vectors / storedfields / etc, and then you interact with that many times toretrieve each doc. We could then synchronize only on retrieving that object, and provide a thread-private instance.It seems like we should move away from using ThreadLocal inLucene and do "normal" synchronization instead. Mike Adrian Tarau wrote:Usually ThreadLocal.remove() should be called at the end (in afinally block), before the current call leaves your code.Ex : if during searching ThreadLocal is used, every search(..) method should cleanup any ThreadLocal variables, or evendeeper in the implementation. When the call leaves Luceneany used ThreadLocal should be cleaned up. Michael McCandless wrote:ThreadLocal, which we use in several places in Lucene, causesa leak in app servers because the classloader never fullydeallocates Lucene's classes because the ThreadLocal is holding strong references. Yet, ThreadLocal is very convenient for avoiding synchronization. Does anyone have any ideas on how to solve this w/o falling back to "normal" synchronization? Mike Begin forwarded message:From: "Yonik Seeley" <[EMAIL PROTECTED]> Date: July 7, 2008 3:30:28 PM EDT To: [EMAIL PROTECTED] Subject: Re: ThreadLocal in SegmentReader Reply-To: [EMAIL PROTECTED] On Mon, Jul 7, 2008 at 2:43 PM, Michael McCandless <[EMAIL PROTECTED]> wrote:So now I'm confused: the SegmentReader itself should nolonger be reachable, assuming you are not holding any references to your IndexReader. Which means the ThreadLocal instance should no longer be reachable.It will still be referenced from the Thread(s) ThreadLocalMap The key (the ThreadLocal) will be weakly referenced, but the values (now stale) are strongly referenced and won't be actually removed until the table is resized (under the Java6 impl at least). Nice huh? -Yonik------------------------------------------------------ ---- -- --------- To unsubscribe, e-mail: java-user- [EMAIL PROTECTED] For additional commands, e-mail: java-user- [EMAIL PROTECTED]------------------------------------------------------- ---- -- -------- To unsubscribe, e-mail: java-dev- [EMAIL PROTECTED] For additional commands, e-mail: java-dev- [EMAIL PROTECTED]-------------------------------------------------------- ---- -- ------- To unsubscribe, e-mail: java-dev- [EMAIL PROTECTED] For additional commands, e-mail: java-dev- [EMAIL PROTECTED]--------------------------------------------------------- ---- -- ------ To unsubscribe, e-mail: java-dev- [EMAIL PROTECTED] For additional commands, e-mail: java-dev- [EMAIL PROTECTED]---------------------------------------------------------- ---- -- ----- To unsubscribe, e-mail: java-dev- [EMAIL PROTECTED] For additional commands, e-mail: java-dev- [EMAIL PROTECTED]----------------------------------------------------------- ---- -- ---- To unsubscribe, e-mail: java-dev- [EMAIL PROTECTED] For additional commands, e-mail: java-dev- [EMAIL PROTECTED]--------------------------------------------------------------- ---- -- To unsubscribe, e-mail: [EMAIL PROTECTED]For additional commands, e-mail: java-dev- [EMAIL PROTECTED]-- View this message in context: http://www.nabble.com/Fwd%3A- ThreadLocal-in-SegmentReader-tp18326720p18416022.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.---------------------------------------------------------------- ---- - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]----------------------------------------------------------------- ---- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]-- View this message in context: http://www.nabble.com/Fwd%3A- ThreadLocal-in-SegmentReader-tp18326720p18416841.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.------------------------------------------------------------------ --- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]------------------------------------------------------------------- --To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]-- View this message in context: http://www.nabble.com/Fwd%3A- ThreadLocal-in-SegmentReader-tp18326720p18437716.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.-------------------------------------------------------------------- -To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]--View this message in context: http://www.nabble.com/Fwd%3A- ThreadLocal-in-SegmentReader-tp18326720p18444095.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
