Re: Lucene Memory Leak

2008-09-02 Thread 장용석
I think that every time your doQuery method runs, new Directory and Analyzer instances are created. If the index files are very large, creating a new Directory instance puts pressure on the JVM and takes a long time. I suggest modifying the code so that the Analyzer class and D
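
The suggestion above (create the expensive objects once and reuse them across queries) can be sketched roughly as follows. This is only an illustration under stated assumptions: ExpensiveResource and SearchService are hypothetical stand-ins, not Lucene classes, and the original doQuery code is not reproduced here.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for something costly to open, such as an index
// directory. This is NOT a Lucene class.
final class ExpensiveResource {
    // Counts how many times the resource has been "opened".
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();
    final String path;

    ExpensiveResource(String path) {
        this.path = path;
        OPEN_COUNT.incrementAndGet();
    }
}

// Hypothetical service that opens the resource once and reuses it.
final class SearchService {
    private final String indexPath;
    private ExpensiveResource resource; // created lazily, then shared

    SearchService(String indexPath) {
        this.indexPath = indexPath;
    }

    // synchronized so that two concurrent callers cannot both open it
    private synchronized ExpensiveResource resource() {
        if (resource == null) {
            resource = new ExpensiveResource(indexPath);
        }
        return resource;
    }

    // Every query goes through the same shared instance.
    String doQuery(String queryString) {
        return queryString + "@" + resource().path;
    }

    public static void main(String[] args) {
        SearchService service = new SearchService("/tmp/index");
        service.doQuery("first");
        service.doQuery("second");
        System.out.println("opens: " + ExpensiveResource.OPEN_COUNT.get());
    }
}
```

The point of the sketch is only that the number of opens stays constant no matter how many queries run; whether this addresses the reported out-of-memory problem is not something the preview above confirms.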

Re: Lucene Memory Leak

2008-09-02 Thread Grant Ingersoll
Closing the directory seems a bit strange, why are you doing that (other than it is a public method), especially since you say you are keeping the IndexSearcher around? Also, you probably shouldn't open a new searcher every time. Are your queries on different directories every time? Wha

Re: Lucene Memory Leak

2008-09-02 Thread Andy33
As stated in my original message, I am closing the IndexSearcher elsewhere. I don't close it in the method I copied because otherwise I lose access to the Hits that come back. You should really close the IndexSearcher rather than the directory. Andy33 wrote: > I have a memory leak in my lucen

Re: Lucene Memory Leak

2008-09-02 Thread Mark Miller
You should really close the IndexSearcher rather than the directory. Andy33 wrote: I have a memory leak in my lucene search code. I am able to run a few queries fine, but I eventually run out of memory. Please note that I do close and set to null the ivIndexSearcher object elsewhere. Here is the

Re: Lucene Memory Leak

2008-09-02 Thread Andy33

Lucene Memory Leak

2008-09-02 Thread Andy33
I have a memory leak in my lucene search code. I am able to run a few queries fine, but I eventually run out of memory. Please note that I do close and set to null the ivIndexSearcher object elsewhere. Here is the code I am using... private synchronized Hits doQuery(String field, String querySt

Re: Beginner: Specific indexing

2008-09-02 Thread Raymond Balmès
OK, not clear enough. I have documents in which I'm looking for 3 consecutive elements : <#1> <#2> (string1 is a predefined list) I want to disregard those without this sequence and reverse index those with these markers... it looks to me that parsing won't do the job since my documents are unst

Re: Performance, yet again

2008-09-02 Thread Andre Rubin
I've tested ConstantScorePrefixQuery and it hit the nail right on the head. It's now mind-bogglingly fast! Even a query that has 200.000 matches was under 0.5 seconds! Thanks! :)) Andre On Tue, Sep 2, 2008 at 10:44 AM, Mark Miller <[EMAIL PROTECTED]> wrote: > Andre Rubin wrote: > >> On Tue, Sep 2, 2008

Re: Test. Please ignore that. Fwd: How to send mail to java user

2008-09-02 Thread Leonid Maslov
+1. I've got On Tue, Sep 2, 2008 at 6:31 PM, Raymond Balmès <[EMAIL PROTECTED]>wrote: > I'm getting plenty of message but do you receive mine... please someone > give > me reply > > On Tue, Sep 2, 2008 at 11:36 AM, Leonid Maslov <[EMAIL PROTECTED]> wrote: > > > -- Forwarded message --

Re: Performance, yet again

2008-09-02 Thread Chris Hostetter
: I was trying, before, to use it, but it doesn't seem as straightforward as : Hits. Is there an example code, somewhere? "SearchFiles.java" in the Lucene demo was updated to use TopDocCollector when Hits was deprecated. : > Is it possible to pre-sort the index, so I don't have to every time I
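
As a rough illustration of what collecting only the best N results means (as opposed to holding on to every hit), here is a minimal sketch using a bounded min-heap. It is plain Java, not Lucene code: TopNSketch and topN are hypothetical names, and the float array stands in for per-document scores. For the real API, the SearchFiles.java demo mentioned above is the reference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: keep only the n best-scoring document ids.
final class TopNSketch {

    // scores[doc] is the score of document number doc.
    // Returns at most n document ids, best score first.
    static List<Integer> topN(final float[] scores, int n) {
        List<Integer> best = new ArrayList<Integer>();
        if (n <= 0) {
            return best;
        }
        // Min-heap ordered by score: the weakest kept document is on top.
        PriorityQueue<Integer> heap = new PriorityQueue<Integer>(
                n, (Integer a, Integer b) -> Float.compare(scores[a], scores[b]));
        for (int doc = 0; doc < scores.length; doc++) {
            if (heap.size() < n) {
                heap.add(doc);
            } else if (scores[doc] > scores[heap.peek()]) {
                heap.poll();   // drop the current weakest
                heap.add(doc); // keep the better document instead
            }
        }
        best.addAll(heap);
        // Order the survivors from highest to lowest score.
        best.sort((Integer a, Integer b) -> Float.compare(scores[b], scores[a]));
        return best;
    }

    public static void main(String[] args) {
        float[] scores = {0.1f, 0.9f, 0.5f, 0.7f};
        System.out.println(topN(scores, 2));
    }
}
```

Memory use in this sketch is bounded by n rather than by the total number of matches, which is the design reason for preferring a top-N collector when a query can match tens of thousands of documents.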

Re: Question: Lucene MoreLikeThis score values all the same:

2008-09-02 Thread Chris Hostetter
: 1. Looking at the hits, they have the same score. I'd expect them to be : different, based on their relevance to the source document. Any ideas? ... : This is my output. I can paste my source code in too if needed. The output of arbitrary "secret" code isn't really very useful for the

Re: Performance, yet again

2008-09-02 Thread Mark Miller
Andre Rubin wrote: On Tue, Sep 2, 2008 at 10:16 AM, Mark Miller <[EMAIL PROTECTED]> wrote: Andre Rubin wrote: Hi all, Most of our queries are very simple, of the type: Query query = new PrefixQuery(new Term(LABEL_FIELD, prefix)); Hits hits = searcher.search(query, new Sort(new SortF

Re: Beginner: Specific indexing

2008-09-02 Thread Chris Hostetter
I may be misunderstanding your question, but i wouldn't attempt to tackle this with a TokenFilter unless you want both the "tag" and the numbers to appear in the same field. i think what you want to do is first parse whatever file format you are dealing with, then build Documents based on the

Re: Injecting additional tokens

2008-09-02 Thread Chris Hostetter
: Is my subscription working... I got no reply on my previous question. : Sorry the disturbance. 1) if you see your message show up in one of the archives, that's a pretty good indication that your post made it to the list... http://www.nabble.com/forum/Search.jtp?query=Raymond+Balm%C3%A8s&local=

Re: Performance, yet again

2008-09-02 Thread Andre Rubin
On Tue, Sep 2, 2008 at 10:16 AM, Mark Miller <[EMAIL PROTECTED]> wrote: > Andre Rubin wrote: > >> Hi all, >> >> Most of our queries are very simple, of the type: >> >> Query query = new PrefixQuery(new Term(LABEL_FIELD, prefix)); >> Hits hits = searcher.search(query, new Sort(new SortField(LABEL_F

Re: Performance, yet again

2008-09-02 Thread Mark Miller
Andre Rubin wrote: Hi all, Most of our queries are very simple, of the type: Query query = new PrefixQuery(new Term(LABEL_FIELD, prefix)); Hits hits = searcher.search(query, new Sort(new SortField(LABEL_FIELD))) You might want to check out Solr's ConstantScorePrefixQuery and compare performa

Performance, yet again

2008-09-02 Thread Andre Rubin
Hi all, Most of our queries are very simple, of the type: Query query = new PrefixQuery(new Term(LABEL_FIELD, prefix)); Hits hits = searcher.search(query, new Sort(new SortField(LABEL_FIELD))) which sometimes results in 10, 20, sometimes 40 thousand hits. I get good performance if hits.length is

Hits document offset information

2008-09-02 Thread Nabil BOUZERNA
Hi All, I have the following problem. All threads similar to my problem are old, so I am trying a new post. When I execute a search (Lucene Core 2.3.2) I receive the list of document Hits. Then, I call the current highlighter (2.3.2) to get the best fragments : getBestFragments(tokenStream, texte,1,".

Re: Test. Please ignore that. Fwd: How to send mail to java user

2008-09-02 Thread Raymond Balmès
I'm getting plenty of messages, but do you receive mine? Please, someone give me a reply. On Tue, Sep 2, 2008 at 11:36 AM, Leonid Maslov <[EMAIL PROTECTED]> wrote: > -- Forwarded message -- > From: Sankari Palanisamy <[EMAIL PROTECTED]> > Date: Tue, Sep 2, 2008 at 12:32 PM > Subject:

Test. Please ignore that. Fwd: How to send mail to java user

2008-09-02 Thread Leonid Maslov
-- Forwarded message -- From: Sankari Palanisamy <[EMAIL PROTECTED]> Date: Tue, Sep 2, 2008 at 12:32 PM Subject: Re: How to send mail to java user To: Leonid Maslov <[EMAIL PROTECTED]> Hi, Thanks for your response. I am still getting the following message. What should I do next? Thank

Re: Newbie question: using Lucene to index hierarchical information.

2008-09-02 Thread Karsten F.
Hi Leonid, what kind of query is your use case? Complex scenario: You need all the hierarchical structure information in one query. This means you want to search with xpath in a real xml-Database. (like: All Documents with a subtitle XY which contains directly after this subtitle a table with the

Re: reusing Document with multiple fields in lucene 2.3

2008-09-02 Thread Michael McCandless
Maybe the complexity caused by reuse in this case (a pool of Field instances) is not offset by the performance gains of avoiding GC? You could code up a quick test and see what performance gains it gives you. Reuse works very well when your documents are extremely regular. Mike Juyal

Re: getTimestamp method in IndexCommit

2008-09-02 Thread Michael McCandless
Are you thinking this would just fall back to Directory.fileModified on the segments_N file for that commit? You could actually do that without any API change, because IndexCommit exposes a getSegmentsFileName(). Mike Akshay wrote: Hi, We need a feature for time based cleanup of IndexC

search for empty field?

2008-09-02 Thread Chris Lu
Is it possible to query for documents that have empty values for a field? Say I need to find documents whose category is empty. I tried a negative query: -category:* But it returns 0 documents. I think "category:*" is basically match all, so this "-category:*" doesn't work. Thanks! -- Chris Lu
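
One way to picture why a purely negative query can come back empty: if a prohibited clause only removes documents from whatever the positive clauses matched, then with no positive clause there is nothing to remove from. The sketch below models that with plain sets. It is an assumption-laden illustration, not Lucene code: NegativeClauseSketch and evaluate are hypothetical names, and the document ids are made up. In Lucene itself, the usual thing to look at is pairing the prohibited clause with a clause that matches everything (for example MatchAllDocsQuery), but that is not confirmed by the message above.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of "positive clauses select, prohibited clauses subtract".
final class NegativeClauseSketch {

    // positive:   ids matched by the positive part of the query
    // prohibited: ids matched by the prohibited (minus) clause
    static Set<Integer> evaluate(Set<Integer> positive, Set<Integer> prohibited) {
        Set<Integer> result = new TreeSet<Integer>(positive);
        result.removeAll(prohibited); // subtraction only; never adds documents
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> allDocs = new TreeSet<Integer>(Arrays.asList(0, 1, 2, 3));
        Set<Integer> withCategory = new TreeSet<Integer>(Arrays.asList(0, 2));
        // Only a prohibited clause: the positive set is empty, so is the result.
        System.out.println(evaluate(new TreeSet<Integer>(), withCategory));
        // Match everything, then subtract: documents without a category remain.
        System.out.println(evaluate(allDocs, withCategory));
    }
}
```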

Re: Injecting additional tokens

2008-09-02 Thread Karsten F.
Hi Markus, hopefully someone will tell you the predefined Filter for this. I only want to agree that a filter is the correct place for this, and that you should be aware of the Token positions (after your filter you must have two Tokens on the same position). I think "WordDelimiterFilter" is a