[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread Erik Hatcher (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377933 ] Erik Hatcher commented on LUCENE-436: - Please, everyone, let's keep this discussion technical and factual and avoid making degrading statements to one another. It doesn't

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread kieran (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377905 ] kieran commented on LUCENE-436: --- Robert Engels's description of the ThreadLocal issue in JDK 1.4.2 provides a very plausible explanation for why this is not a bug, either in Lucen

Re: Statistical evaluation of modifications to a Lucene query based on search logs

2006-05-04 Thread Chris Hostetter
: It's got one difference from yours, in that the terms are allowed to : occur in any order in the sub-phrases (so phrase "C B" from your : original example is scored like "B C"). There's a much bigger difference, in that your technique won't reward documents where B and C are "near" each other, b

Re: Statistical evaluation of modifications to a Lucene query based on search logs

2006-05-04 Thread Robin H. Johnson
On Thu, May 04, 2006 at 10:52:46AM -0400, Daniel Shane wrote: > I'm developing a new type of Query, called a SubPhraseQuery. I have sent > a message to the list regarding this and Doug was kind enough to put me > on the right track. The query is simply a PhraseQuery where all terms > are searched,

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread robert engels (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377865 ] robert engels commented on LUCENE-436: -- Then you have some other problem. The ONLY way the ThreadLocal issue is an ISSUE is if VERY LARGE objects are referenced by the T

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread Nicholaus Shupe (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377863 ] Nicholaus Shupe commented on LUCENE-436: I am not using a RAMDirectory. I never indicated as such. Here's the way I load my index: reader = IndexReader.open(dic

[jira] Resolved: (LUCENE-529) TermInfosReader and other + instance ThreadLocal => transient/odd memory leaks => OutOfMemoryException

2006-05-04 Thread Otis Gospodnetic (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-529?page=all ] Otis Gospodnetic resolved LUCENE-529: - Resolution: Duplicate Duplicate of LUCENE-436 > TermInfosReader and other + instance ThreadLocal => transient/odd memory > leaks => OutOfMemor

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread robert engels (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377858 ] robert engels commented on LUCENE-436: -- I attached a better version of TermInfosReader that avoids multiple ThreadLocal.get() during get(Term r) which can be expensive.
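The idea in this comment, looking up the ThreadLocal once per call instead of repeatedly, can be illustrated with a minimal sketch. This is not the attached TermInfosReader.java: the class `TermLookup`, its `cursor` field, and both methods are hypothetical names, and the code uses modern Java syntax rather than the JDK 1.4 of the original discussion.

```java
/** Hypothetical class; it only illustrates caching the result of ThreadLocal.get(). */
class TermLookup {
    // Per-thread mutable state, standing in for a per-thread term enumerator.
    private final ThreadLocal<int[]> cursor = ThreadLocal.withInitial(() -> new int[1]);

    /** Calls ThreadLocal.get() three times; each call is a map lookup keyed on the current thread. */
    int advanceTwiceNaive() {
        cursor.get()[0]++;
        cursor.get()[0]++;
        return cursor.get()[0];
    }

    /** Calls ThreadLocal.get() once and reuses the local reference. */
    int advanceTwiceCached() {
        int[] c = cursor.get();
        c[0]++;
        c[0]++;
        return c[0];
    }
}
```

Both methods leave the per-thread state in the same condition; the second simply avoids the repeated lookups.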

[jira] Updated: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread robert engels (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=all ] robert engels updated LUCENE-436: - Attachment: TermInfosReader.java Better version of TermInfosReader that avoids multiple ThreadLocal.get() calls > [PATCH] TermInfosReader, SegmentTermEnum Ou

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread robert engels (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377855 ] robert engels commented on LUCENE-436: -- Are you always loading your indexes using RAMDirectories? I use multi-hundred-megabyte indexes, with 196 max heap, and it runs fo

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread Nicholaus Shupe (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377850 ] Nicholaus Shupe commented on LUCENE-436: How this problem is classified is ultimately irrelevant to me. Bug in the JDK, bug in Lucene, not a bug, it's all the sa

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread robert engels (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377842 ] robert engels commented on LUCENE-436: -- Actually, the last simple "fix" only works well for single threaded applications, so it is not much of a "fix". Use FixedthreadLo

Re: Why ThreadLocal?

2006-05-04 Thread Yonik Seeley
On 5/4/06, Robert Engels <[EMAIL PROTECTED]> wrote: In reviewing the code for bug 436 (http://issues.apache.org/jira/browse/LUCENE-436) Why are we using a ThreadLocal for the enumeration at all? Since terms(), and terms(Term t) return new instances anyway, why not just have them clone the neede

Why ThreadLocal?

2006-05-04 Thread Robert Engels
In reviewing the code for bug 436 (http://issues.apache.org/jira/browse/LUCENE-436) Why are we using a ThreadLocal for the enumeration at all? Since terms(), and terms(Term t) return new instances anyway, why not just have them clone the needed data structures? Seems like the code could be much
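The alternative raised in this message, handing each caller its own clone instead of caching per thread, might look roughly like the sketch below. It is an illustration only, not Lucene code: `TermCursor`, `CloningReader`, and the `position` field are hypothetical stand-ins, and a real enumerator would also need to clone any underlying input stream.

```java
/** Hypothetical enumerator state; a shallow clone is enough for this single-field sketch. */
class TermCursor implements Cloneable {
    int position;

    @Override
    public TermCursor clone() {
        try {
            return (TermCursor) super.clone();
        } catch (CloneNotSupportedException e) {
            // Cannot happen: this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}

/** Hypothetical reader that keeps one prototype and never caches per thread. */
class CloningReader {
    private final TermCursor prototype = new TermCursor();

    /** Every call returns an independent copy, so no ThreadLocal is involved. */
    TermCursor terms() {
        return prototype.clone();
    }
}
```

The trade-off, under these assumptions, is an allocation per call in exchange for having no per-thread entries to clean up.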

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread robert engels (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377829 ] robert engels commented on LUCENE-436: -- The best solution is this: move the enumerators.set(null); call to the TermInfosReader close(), and remove the finalize(). Ever
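A minimal sketch of the suggestion in this comment, clearing the ThreadLocal in close() rather than relying on finalize(), is shown below. Only the field name `enumerators` and the `set(null)` call come from the comment; the class `CachingReader`, its methods, and the use of a `StringBuilder` as the cached object are hypothetical, and the syntax is modern Java rather than JDK 1.4.

```java
import java.io.Closeable;

/** Hypothetical stand-in for a reader that caches one enumerator per thread. */
class CachingReader implements Closeable {
    // Instance (not static) ThreadLocal, the usage pattern discussed in the issue.
    private final ThreadLocal<StringBuilder> enumerators = new ThreadLocal<>();

    /** Returns the calling thread's cached object, creating it on first use. */
    StringBuilder getEnum() {
        StringBuilder e = enumerators.get();
        if (e == null) {
            e = new StringBuilder("enum");
            enumerators.set(e);
        }
        return e;
    }

    /** True if the calling thread currently has a cached object. */
    boolean hasCachedEnum() {
        return enumerators.get() != null;
    }

    /** Explicit cleanup; note that set(null) only clears the calling thread's entry. */
    @Override
    public void close() {
        enumerators.set(null);
    }
}
```

Because `set(null)` affects only the thread that calls it, this sketch releases the cached object for the closing thread and leaves entries held by other threads in place.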

Re: Statistical evaluation of modifications to a Lucene query based on search logs

2006-05-04 Thread Otis Gospodnetic
Hi, I haven't done this type of analysis and my guess is no commercial search engine would be willing to share this data. However, I think it's intuitive that longer phrase matches would look more promising. I think one way you can test this is by showing KWIC snippets (using Highlighter) and

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread robert engels (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377826 ] robert engels commented on LUCENE-436: -- Oops. Last comment was not quite correct. The reason the finalize() methods are worthless for clearing the ThreadLocal entries, i

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread robert engels (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377815 ] robert engels commented on LUCENE-436: -- To restate: It is NOT a bug. It is a design decision in the JDK between performance and memory use. This decision changed a bit

Statistical evaluation of modifications to a Lucene query based on search logs

2006-05-04 Thread Daniel Shane
Hi! I'm developing a new type of Query, called a SubPhraseQuery. I have sent a message to the list regarding this and Doug was kind enough to put me on the right track. The query is simply a PhraseQuery where all terms are searched, but, if any of the subphrases are found, it boosts the results
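To make the notion of "subphrases" concrete, the sketch below enumerates the contiguous sub-phrases of a term list, longest first. It is an assumption about what the message means, not the poster's SubPhraseQuery: the class `SubPhrases` and the method `contiguous` are hypothetical, and no Lucene API is used.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists contiguous sub-phrases of at least two terms. */
class SubPhrases {
    static List<List<String>> contiguous(List<String> terms) {
        List<List<String>> out = new ArrayList<>();
        // Longest sub-phrases first, so a caller could assign them larger boosts.
        for (int len = terms.size(); len >= 2; len--) {
            for (int start = 0; start + len <= terms.size(); start++) {
                out.add(new ArrayList<>(terms.subList(start, start + len)));
            }
        }
        return out;
    }
}
```

For the terms A, B, C this yields the full phrase "A B C" followed by "A B" and "B C"; how each sub-phrase is turned into a query clause and boosted is left open here.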

[jira] Commented: (LUCENE-436) [PATCH] TermInfosReader, SegmentTermEnum Out Of Memory Exception

2006-05-04 Thread Andy Hind (JIRA)
[ http://issues.apache.org/jira/browse/LUCENE-436?page=comments#action_12377764 ] Andy Hind commented on LUCENE-436: -- I agree this is not strictly an issue with Lucene, but Lucene has an unusual use pattern for thread locals (instance vs static mem