Re: ThreadLocal in SegmentReader

2008-07-07 Thread Michael McCandless
Well ... if the thread dies, the value in its ThreadLocal should be GC'd. If the thread does not die (eg thread pool in an app server) then the ThreadLocal value remains, but that value is a shallow clone of the original TermVectorsReader and should not be consuming that much RAM per th

Re: ThreadLocal in SegmentReader

2008-07-07 Thread Roman Puchkovskiy
Unfortunately, sometimes it's not OK. For instance, when Lucene is loaded by a web application from its WEB-INF/lib and SegmentReader is initialized during application start-up (i.e. in a thread which will never die), this causes problems with unloading of a classloader of t

Re: ThreadLocal in SegmentReader

2008-07-07 Thread Michael McCandless
Hmmm, I see... you're right. ThreadLocal is dangerous. So how would you recommend fixing it? One thing we can do, in SegmentReader.close, is to call termVectorsLocal.set(null). We do this eg in FieldsReader.close, which uses a ThreadLocal to hold thread-private clones of the fieldsStrea
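
[Editor's sketch] The set(null)-in-close idea above can be illustrated without Lucene. The class below is a hypothetical stand-in, not SegmentReader or FieldsReader: the names ThreadLocalCloseSketch, getClone, and currentThreadHoldsClone are invented for illustration. It shows one property of the approach that is certain from the ThreadLocal contract: set(null) clears the value only for the thread that calls close(), so any other live thread keeps its own value.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a reader that keeps a per-thread clone in a ThreadLocal.
final class ThreadLocalCloseSketch {
    private final ThreadLocal<Object> cloneLocal = new ThreadLocal<Object>();

    // Lazily creates the calling thread's private clone.
    Object getClone() {
        Object c = cloneLocal.get();
        if (c == null) {
            c = new Object();
            cloneLocal.set(c);
        }
        return c;
    }

    boolean currentThreadHoldsClone() {
        return cloneLocal.get() != null;
    }

    // Mirrors the suggestion in the thread: clears the value for the CALLING thread only.
    void close() {
        cloneLocal.set(null);
    }

    // Returns {closingThreadStillHoldsClone, workerThreadStillHoldsClone}.
    static boolean[] demo() {
        final ThreadLocalCloseSketch reader = new ThreadLocalCloseSketch();
        final CountDownLatch workerReady = new CountDownLatch(1);
        final CountDownLatch closed = new CountDownLatch(1);
        final boolean[] result = new boolean[2];

        Thread worker = new Thread(new Runnable() {
            public void run() {
                reader.getClone();            // worker stores its own clone
                workerReady.countDown();
                try {
                    closed.await();           // wait until the other thread has called close()
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                result[1] = reader.currentThreadHoldsClone();
            }
        });

        try {
            worker.start();
            reader.getClone();                // closing thread stores its own clone too
            workerReady.await();
            reader.close();                   // set(null) runs on this thread
            result[0] = reader.currentThreadHoldsClone();
            closed.countDown();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("closing thread still holds clone: " + r[0]);
        System.out.println("worker thread still holds clone:  " + r[1]);
    }
}
```

On Java 5 and later, ThreadLocal.remove() could replace set(null) in close(); it has the same per-thread scope, so the worker's value would survive either way.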

Re: ThreadLocal in SegmentReader

2008-07-07 Thread Roman Puchkovskiy
Yes, calling set(null) does not seem like a good fix. As for setting the reference to termVectorsLocal to null, I'm not sure whether that would help, as this ThreadLocal will still be referenced by the thread (or threads). Anyway, I will try to test this approach and post the results here. Michael McCandles

Re: Reg : lucene RemoteSearchable Object

2008-07-07 Thread Yonik Seeley
The files are still open by the process (deletion doesn't change this) and hence the local IndexReader can still read the index. -Yonik On Sun, Jul 6, 2008 at 9:59 AM, saikrishna venkata pendyala <[EMAIL PROTECTED]> wrote: > Hi all, > > I am currently developing a distributed search engine using

Re: ThreadLocal in SegmentReader

2008-07-07 Thread Roman Puchkovskiy
I've tested a little, and it seems that assigning null is not sufficient. As expected... I don't see another way to fix this, but I'm not a Lucene developer :) Fortunately, there's the workaround with a temporary thread. Michael McCandless-2 wrote: > > > Hmmm, I see... you're right. Thr

disable boolean operators ?

2008-07-07 Thread Balthasar Schopman
Hiya, Is there a way to disable boolean operators in the Lucene engine? The reason for this question is the mysterious / unexpected exception I encounter when parsing a query containing many words. I query on a single field with a query containing 2243 words (14,742 characters). I haven't ha

can the boolean operators be disabled ?

2008-07-07 Thread Balthasar Schopman
Hiya, Is there a way to disable boolean operators in the Lucene engine? The reason for this question is the mysterious / unexpected exception I encounter when parsing a query containing many words. I query on a single field with a query containing 2243 words (14,742 characters). I haven't ha

Re: disable boolean operators ?

2008-07-07 Thread Erick Erickson
I think you're off base a little. Lucene defaults to 1,024 boolean clauses as the maximum number allowed. There is an implied boolean between each term no matter what. Removing OR, AND, NOT doesn't change that in the least. That is, term1 term2 term3 is equivalent to term1 OR term2 OR term3 So rem
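
[Editor's sketch] The point above, that every term becomes a clause whether or not an operator is written, can be checked with a small count. The code below is a simplified illustration and does not call Lucene or reproduce its query parser: ClauseLimitSketch, impliedClauseCount, and exceedsLimit are hypothetical names, and splitting on whitespace is an assumption that ignores analyzers, phrases, and nesting. In Lucene itself the limit is adjusted with the static BooleanQuery.setMaxClauseCount(int), and exceeding it raises BooleanQuery.TooManyClauses.

```java
// Simplified model: each whitespace-separated term contributes one boolean clause.
final class ClauseLimitSketch {
    // The default limit quoted in the thread.
    static final int DEFAULT_MAX_CLAUSES = 1024;

    // Counts terms, skipping the explicit operator keywords, which are not clauses themselves.
    static int impliedClauseCount(String query) {
        String trimmed = query.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (String token : trimmed.split("\\s+")) {
            boolean isOperator = token.equals("AND") || token.equals("OR") || token.equals("NOT");
            if (!isOperator) {
                count++;
            }
        }
        return count;
    }

    static boolean exceedsLimit(String query, int maxClauses) {
        return impliedClauseCount(query) > maxClauses;
    }

    public static void main(String[] args) {
        // Both forms yield the same number of clauses.
        System.out.println(impliedClauseCount("term1 term2 term3"));
        System.out.println(impliedClauseCount("term1 OR term2 OR term3"));

        // A query the size of the one described in the thread (2243 words).
        StringBuilder big = new StringBuilder();
        for (int i = 0; i < 2243; i++) {
            big.append("w").append(i).append(' ');
        }
        System.out.println(exceedsLimit(big.toString(), DEFAULT_MAX_CLAUSES));
    }
}
```

Under this model a 2243-word query is over the default limit regardless of whether operators are present, which matches the explanation given in the reply.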

Re: disable boolean operators ?

2008-07-07 Thread Balthasar Schopman
That explains. Thanks a lot! On Jul 7, 2008, at 6:19 PM, Erick Erickson wrote: I think you're off base a little. Lucene defaults to 1,024 boolean clauses as the maximum number allowed. There is an implied boolean between each term no matter what. Removing OR, AND, NOT doesn't change that in th

Re: Reg : lucene RemoteSearchable Object

2008-07-07 Thread saikrishna venkata pendyala
Yes, I didn't reopen the index. It's working fine now :) On Mon, Jul 7, 2008 at 6:30 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > The files are still open by the process (deletion doesn't change this) > and hence the local IndexReader can still read the index. > > -Yonik > > On Sun, Jul 6, 2008

Re: ThreadLocal in SegmentReader

2008-07-07 Thread Michael McCandless
So now I'm confused: the SegmentReader itself should no longer be reachable, assuming you are not holding any references to your IndexReader. Which means the ThreadLocal instance should no longer be reachable. Which means it should be GC'd and everything it's holding should be GC'd as we

Re: ThreadLocal in SegmentReader

2008-07-07 Thread Yonik Seeley
On Mon, Jul 7, 2008 at 2:43 PM, Michael McCandless <[EMAIL PROTECTED]> wrote: > So now I'm confused: the SegmentReader itself should no longer be reachable, > assuming you are not holding any references to your IndexReader. > > Which means the ThreadLocal instance should no longer be reachable. It

Re: ThreadLocal in SegmentReader

2008-07-07 Thread Michael McCandless
Ugh! I'll move this to java-dev to brainstorm fixes... Mike Yonik Seeley wrote: On Mon, Jul 7, 2008 at 2:43 PM, Michael McCandless <[EMAIL PROTECTED]> wrote: So now I'm confused: the SegmentReader itself should no longer be reachable, assuming you are not holding any references to your In

How to make documents clustering and topic classification with lucene

2008-07-07 Thread Ariel
Hi everybody: Do you have any idea how to do document clustering and topic classification using Lucene? Is there any way to do this? Please, I need help. Thanks everybody. Ariel