Index strategy for tagged documents where tags can change often

2010-07-28 Thread Hans Merkl
Hi, in addition to text content, my documents have tags that can be searched too. The problem is that the tags change quite often, and every time a tag is added or removed I have to call UpdateDocument, which is quite slow when done for hundreds of documents. Are there any well performing str

Re: understanding lucene

2010-07-28 Thread Erick Erickson
That code has way too much stuff in it for your first application. Hibernate is in there and, from the description, it looks like it tries to search your database. I'd *strongly* recommend that you don't go there. Try looking at http://wiki.apache.org/lucene-java/LuceneFAQ#How_do_I_start_using

Term browsing much slower in Lucene 3.x.x

2010-07-28 Thread Nader, John P
We recently upgraded from Lucene 2.4.0 to Lucene 3.0.2. Our load testing revealed a serious performance drop specific to traversing the list of terms and their associated documents for a given indexed field. Our code looks something like this: for(Term term : terms) { TermDocs termDocs = inde

Re: Get all terms of a specific field

2010-07-28 Thread Philippe
Hi Grant, thanks for the ideas. I implemented my own Collector, which returns all docIDs. In the next step I collect all terms using a customised FieldSelector. This implementation is about 2 to 3 times faster than my previous implementation, which used only a customised FieldSelector. Howeve

Re: Using lucene for substring matching

2010-07-28 Thread Ian Lea
You could also look at MemoryIndex or InstantiatedIndex, both in Lucene's contrib area. I was also wondering whether you might gain from using TermDocs or TermVectors or something directly. -- Ian. On Tue, Jul 27, 2010 at 9:34 PM, Geir Gullestad Pettersen wrote: > Thanks for your fee