Re: index architectures

2006-10-24 Thread Doron Cohen
Perhaps another comment on the same line - I think you would be able to get more from your system by bounding the number of open searchers to 2: - old, serving 'old' queries, would be soon closed; - new, being opened and warmed up, and then serving 'new' queries; Because... - if I understood ho
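The old/new searcher hand-over described in this message could be sketched roughly as below. This is only an illustration in modern Java, not code from the thread: `SearcherHolder`, `search` and `swap` are hypothetical names, and the searcher type is left generic (any `AutoCloseable`) so the sketch does not depend on a particular Lucene version. It assumes the new searcher has already been opened and warmed up before `swap` is called, and it does not by itself stop a caller from swapping again while an old searcher is still draining.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical holder: one shared "current" searcher, plus an old one that drains and closes. */
final class SearcherHolder<S extends AutoCloseable> {

    /** A searcher plus a count of its users; the holder itself counts as one user. */
    private static final class Ref<S extends AutoCloseable> {
        final S searcher;
        final AtomicInteger users = new AtomicInteger(1);

        Ref(S searcher) { this.searcher = searcher; }

        /** Drop one user; the last one out closes the searcher. */
        void release() {
            if (users.decrementAndGet() == 0) {
                try {
                    searcher.close();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
        }
    }

    private Ref<S> current;

    SearcherHolder(S initial) { current = new Ref<>(initial); }

    /** Run one query against the current searcher; concurrent callers share the same instance. */
    <R> R search(Function<S, R> query) {
        Ref<S> ref;
        synchronized (this) {
            ref = current;
            ref.users.incrementAndGet(); // safe: the holder still owns one reference here
        }
        try {
            return query.apply(ref.searcher);
        } finally {
            ref.release();
        }
    }

    /** Publish an already warmed-up searcher; the old one closes once its last query finishes. */
    void swap(S warmed) {
        Ref<S> old;
        synchronized (this) {
            old = current;
            current = new Ref<>(warmed);
        }
        old.release(); // drop the holder's own reference to the old searcher
    }
}
```

In a Lucene setting, `S` would be a small wrapper around an `IndexSearcher` whose `close()` closes the underlying searcher; that wrapper is an assumption and is not shown here. New queries see the new searcher as soon as `swap` returns, while queries already running keep the old one until they finish.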

Re: index architectures

2006-10-20 Thread Paul Waite
Doron wrote: > Not sure if this is the case, but you said "searchers", so might be it - > you can (and should) reuse searchers for multiple/concurrent queries. > IndexSearcher is thread-safe, so no need to have a different searcher for > each query. Keep using this searcher until you decide to ope

Re: index architectures

2006-10-18 Thread Doron Cohen
Not sure if this is the case, but you said "searchers", so might be it - you can (and should) reuse searchers for multiple/concurrent queries. IndexSearcher is thread-safe, so no need to have a different searcher for each query. Keep using this searcher until you decide to open a new searcher - act

Re: index architectures

2006-10-18 Thread Paul Waite
Some excellent feedback guys - thanks heaps. On my OOM issue, I think Hoss has nailed it here: > That said: if you are seeing OOM errors when you sort by a field (but > not when you use the docId ordering, or sort by score) then it sounds > like you are keeping references to IndexReaders around

Re: index architectures

2006-10-18 Thread Michael D. Curtin
On Wed, 2006-10-18 at 19:05 +1300, Paul Waite wrote: No they don't want that. They just want a small number. What happens is they enter some silly query, like searching for all stories with a single common non-stop-word in them, and with the usual sort criterion of by date (ie. a field) descendi

Re: index architectures

2006-10-18 Thread Joe Shaw
Hi, On Wed, 2006-10-18 at 19:05 +1300, Paul Waite wrote: > No they don't want that. They just want a small number. What happens is > they enter some silly query, like searching for all stories with a single > common non-stop-word in them, and with the usual sort criterion of by date > (ie. a field

Re: index architectures

2006-10-18 Thread Chris Hostetter
: I *think* that if you reduce your result set by, say, a filter, you might : drastically reduce what gets sorted. I'm thinking of something like this : BooleanQuery bq = new BooleanQuery(); : bq.add(Filter for the last N days wrapped in a ConstantScoreQuery, MUST) : bq.add(all the rest of your st

Re: index architectures

2006-10-18 Thread Erick Erickson
No, you've got that right. But there's something I think you might be able to try. Fair warning, I'm remembering things I've read on this list and my memory isn't what it used to be. I *think* that if you reduce your result set by, say, a filter, you might drastically reduce what gets sorted.

Re: index architectures

2006-10-17 Thread Paul Waite
Many thanks to Erik and Ollie for responding - a lot of ideas and I'll have my work cut out grokking them properly and thinking about what to do. I'll respond further as that develops. One quick thing though - Erik wrote: > So, I wonder if your out of memory issue is really related to the number

RE: index architectures

2006-10-17 Thread Oliver Hutchison
> I've been curious for a while about this scheme, and I'm > hoping you implement it and tell me if it works. In > truth, my data is pretty static so I haven't had to worry

Re: index architectures

2006-10-17 Thread Erick Erickson
I've been curious for a while about this scheme, and I'm hoping you implement it and tell me if it works. In truth, my data is pretty static so I haven't had to worry about it much. That said... Would it do (and, perhaps, be less complex) to have a FSDirectory and a RAMDirectory that you search?

Re: index architectures

2006-10-17 Thread Paul Waite
Hi chaps, Just looking for some ideas/experience as to how to improve our current architecture. We have a single-index system containing approx. 2.5 million docs of about 1-3k each. The Lucene implementation is a daemon and it services requests on a port in multi-threaded manner, and it runs on