Java Heap Space -Out Of Memory Error

2007-09-04 Thread Sebastin
Hi All, i used to search 3 Lucene Index store of size 6 GB,10 GB,10 GB of records using MultiReader class. here is the following code snippet: Directory indexDir2 = FSDirectory.getDirectory(indexSourceDir02,false);

Re: how to implement searching in time efficiently

2007-09-04 Thread Sebastin
Hi Erick, help me for this search in time efficiently. Erick Erickson wrote: This topic has been discussed a number of times, I suggest you search the mail archives as that will get you very complete answers more quickly. See http://www.gossamer-threads.com/lists/lucene/java-user/

Re: Java Heap Space -Out Of Memory Error

2007-09-04 Thread testn
I think you store dateSc with full precision, i.e. with time. You should consider indexing just the date part, or only the resolution you really need. That should reduce the memory used when constructing the DateRangeQuery, and it will improve search performance as well. Sebastin wrote: Hi All,
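
The resolution advice above can be illustrated without Lucene. The sketch below is only an assumption-laden illustration, not code from the thread: it uses plain `java.time`, and the class and method names (`DateResolutionSketch`, `distinctTerms`) are hypothetical. It counts how many distinct index terms a date field would produce at minute resolution versus day resolution. Fewer distinct terms generally means fewer terms for a range query to enumerate.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: it only counts distinct term strings.
class DateResolutionSketch {
    // Term formats at two resolutions; both sort lexicographically by time.
    static final DateTimeFormatter MINUTE = DateTimeFormatter.ofPattern("yyyyMMddHHmm");
    static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");

    // Formats one timestamp per minute, starting at `start`, and returns
    // the number of distinct terms produced at the given resolution.
    static int distinctTerms(LocalDateTime start, int minutes, DateTimeFormatter resolution) {
        Set<String> terms = new HashSet<>();
        for (int i = 0; i < minutes; i++) {
            terms.add(start.plusMinutes(i).format(resolution));
        }
        return terms.size();
    }
}
```

For two full days of per-minute updates, calling `distinctTerms` with `MINUTE` yields one term per minute, while `DAY` collapses them to one term per calendar day.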

Data in the Index [was: JdbcDirectory]

2007-09-04 Thread Guilherme Barile
So, Anyone ever stored the data in the index also ? What are your experiences ? Thanks a lot Gui On Sep 3, 2007, at 3:47 PM, Guilherme Barile wrote: Storing the data in the index, mainly for non-structured data. We plan to implement something like this ThingDB from http://

Re: Data in the Index [was: JdbcDirectory]

2007-09-04 Thread Patrick Turcotte
Hi, At first, we thought we would use a dual approach, a Lucene index and an RDBMS for storage. While prototyping, for simplicity's sake, we used the Lucene index as storage, thinking we could easily replace it later. So far, speed is satisfying enough that we are going to keep data there until

Re: Java Heap Space -Out Of Memory Error

2007-09-04 Thread Sebastin
Hi testn, i index the dateSc as 070904(2007/09/04) format.i am not using any timestamp here.how can we effectively reopen the IndexSearcher for an hour and save the memory because my index gets updated every minute. testn wrote: Check out Wiki for more information at

Look for strange encodings -- tokenization

2007-09-04 Thread poeta simbolista
Hi all, I'd want to know the best way to look for strange encodings on a Lucene index. i have several inputs where input can have been encoded on different sets. I not always know if my guess about the encoding has been ok. Hence, I'd thought of querying the index for some typical strings that

Re: Lockless read-only deletions in IndexReader?

2007-09-04 Thread Michael McCandless
Excellent, a much simpler approach! I think it should work? Maybe override numDocs() as well? Mike Karl Wettin [EMAIL PROTECTED] wrote: 20 aug 2007 kl. 14.33 skrev Michael McCandless: karl wettin [EMAIL PROTECTED] wrote: I want to set documents in my IndexReader as deleted, but I

Re: Data in the Index [was: JdbcDirectory]

2007-09-04 Thread Chris Lu
I store Lucene index outside database, and run indexing periodically to get the latest updates, not depending on ORM APIs. In general, search data can be slower to update unless some realtime requirements. Storing data in index saves trips to databases. This usually is a huge difference on

Re: Java Heap Space -Out Of Memory Error

2007-09-04 Thread testn
Can you provide more info about your index? How many documents, fields and what is the average document length? Sebastin wrote: Hi testn, i index the dateSc as 070904(2007/09/04) format.i am not using any timestamp here.how can we effectively reopen the IndexSearcher for an

open file descriptors for deleted index files

2007-09-04 Thread Tony Qian
All, I'm facing an issue in which the file descriptors are not closed for deleted index files. I searched the mailing list and didn't find the solution. Here is some info: java 21488 wppd 139r REG 8,7 152456865 571208 /data/index/_idx.cfs (deleted) java 21488

Re: open file descriptors for deleted index files

2007-09-04 Thread Bill Au
Closing the old IndexSearcher should take care of this problem for you. Take a look at Solr. It opens a new IndexSearcher and directs all requests to the new one. It then closes the old IndexSearcher when all the requests it is serving have completed. Bill On 9/4/07, Tony Qian [EMAIL PROTECTED]
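
The swap-then-close pattern described above can be sketched with reference counting. This is a minimal illustration under stated assumptions, not Solr's actual implementation: `RefCounted` and `SearcherHolder` are hypothetical names, and any `Closeable` stands in for an IndexSearcher so that the example needs only the JDK.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: counts users of a resource and closes it when the last one leaves.
class RefCounted<T extends Closeable> {
    private final T resource;
    // Starts at 1: the holder's own reference.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCounted(T resource) { this.resource = resource; }

    T get() { return resource; }

    // Takes a reference unless the count has already dropped to zero.
    boolean tryAcquire() {
        int n;
        do {
            n = refs.get();
            if (n == 0) return false;
        } while (!refs.compareAndSet(n, n + 1));
        return true;
    }

    // Drops a reference; whoever drops the last one closes the resource.
    void release() {
        if (refs.decrementAndGet() == 0) {
            try {
                resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}

// Hypothetical holder: hands out the current resource and swaps in a fresh one.
class SearcherHolder<T extends Closeable> {
    private volatile RefCounted<T> current;

    SearcherHolder(T initial) { current = new RefCounted<>(initial); }

    // Each request acquires before searching and calls release() when done.
    RefCounted<T> acquire() {
        while (true) {
            RefCounted<T> c = current;
            if (c.tryAcquire()) return c;
            // Lost a race with swap(); retry against the new current.
        }
    }

    // New requests see `fresh` immediately; the old resource is closed
    // once every in-flight request has released it.
    synchronized void swap(T fresh) {
        RefCounted<T> old = current;
        current = new RefCounted<>(fresh);
        old.release(); // drops the holder's reference
    }
}
```

A request would call `acquire()`, search with `get()`, and call `release()` in a `finally` block. A periodic task that notices index changes would call `swap(...)` with a newly opened searcher, so descriptors for deleted files are released once the last old request finishes.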

Extract terms not by reader, but by documents

2007-09-04 Thread Rafael Rossini
Hi all, In some custom highlighting, I often write a code like this: Set<Term> matchedTerms = new HashSet<Term>(); query.rewrite(reader).extractTerms(matchedTerms); With this code the Term Set gets populated by the matched query in your whole index. Is it possible to this with

Re: Extract terms not by reader, but by documents

2007-09-04 Thread Grant Ingersoll
Not sure if I am understanding what you are trying to do. I think you are trying to find out which terms occurred in a particular document, correct? I also am not sure about your first example. My understanding of extractTerms is that it just gives you back the set of all terms that