Re: Prevent duplicate results?

2004-02-25 Thread Doug Cutting
How could Lucene know that something is duplicate but older? Sounds like an application-specific thing. Doug Kevin A. Burton wrote: Is there any way to prevent Lucene from returning duplicate (but 'older') results within a search result? Kevin
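If "duplicate but older" is defined by the application, one possible approach is to post-process the hits after the search. The sketch below is only an illustration under assumptions: the `Hit` class, its `key` field (whatever the application treats as identifying a duplicate) and its `timestamp` field are hypothetical and are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NewestPerKey {
    /** Hypothetical search hit; not a Lucene class. */
    public static final class Hit {
        public final String key;      // application-defined duplicate key (assumption)
        public final long timestamp;  // larger means newer (assumption)
        public final String title;

        public Hit(String key, long timestamp, String title) {
            this.key = key;
            this.timestamp = timestamp;
            this.title = title;
        }
    }

    /**
     * Keeps only the newest hit for each key. A key keeps the position
     * at which it first appeared in the incoming hit list.
     */
    public static List<Hit> keepNewest(List<Hit> hits) {
        Map<String, Hit> newest = new LinkedHashMap<>();
        for (Hit h : hits) {
            Hit seen = newest.get(h.key);
            if (seen == null || h.timestamp > seen.timestamp) {
                // Replacing a value in a LinkedHashMap does not move the key.
                newest.put(h.key, h);
            }
        }
        return new ArrayList<>(newest.values());
    }
}
```

Filtering after the search keeps the duplicate rule in application code, which matches the point that Lucene itself cannot know which documents count as duplicates.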

Re: MoreLikeThis Query generator - Re: code for more like this query expansion - was - Re: setMaxClauseCount ??

2004-02-25 Thread Bruce Ritchie
David Spencer wrote: Code rewritten, automagically chooses lots of defaults, lets you override the defs thru the static vars at the bottom or the non-static vars also at the bottom. I've taken the liberty to update this code to handle multiple fields and use the new term vector support in CVS

segments question

2004-02-25 Thread sam xia
Hi, My pages can be sorted into about 1 sub categories. Each category could have up to 1 million HTML pages. (Of course, right now I do not have this yet. I am at an early stage of thinking...) The index will be stored on hard disk. A user may be interested in 10 out of the 1 sub

Re: MoreLikeThis Query generator - Re: code for more like this query expansion - was - Re: setMaxClauseCount ??

2004-02-25 Thread David Spencer
Bruce Ritchie wrote: David Spencer wrote: Code rewritten, automagically chooses lots of defaults, lets you override the defs thru the static vars at the bottom or the non-static vars also at the bottom. I've taken the liberty to update this code to handle multiple fields and use the new

Re: segments question

2004-02-25 Thread Erik Hatcher
On Feb 25, 2004, at 4:01 PM, sam xia wrote: Or should I build the whole thing into one big segment and use the filter to do this? There is a DateFilter. Is there a way to implement a category filter? What is the best way to accomplish this? I'd recommend a pool of filters for each category.

Re: segments question

2004-02-25 Thread sam xia
I'd recommend a pool of filters for each category. Regenerate them when the index changes, otherwise leave the instances alive and reuse them for queries - this will speed things up pretty dramatically I'd guess. There is a QueryFilter you could use, or write a custom one that

Re: segments question

2004-02-25 Thread Erik Hatcher
On Feb 25, 2004, at 7:58 PM, sam xia wrote: I'd recommend a pool of filters for each category. Regenerate them when the index changes, otherwise leave the instances alive and reuse them for queries - this will speed things up pretty dramatically I'd guess. There is a QueryFilter you could use,
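The filter-pool suggestion in this thread (regenerate when the index changes, otherwise reuse the instances) can be sketched roughly as follows. This is a minimal sketch under assumptions, not code from the thread: `FilterPool`, the `factory` function and the `indexVersion` argument are hypothetical names, and the filter type is left generic so the example runs without Lucene on the classpath. In a real application the factory would build something like a `QueryFilter` per category, and the caller would supply whatever change indicator its index exposes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Caches one filter per category until the index version changes. */
public class FilterPool<F> {
    private final Function<String, F> factory;   // builds a filter for a category (assumption)
    private final Map<String, F> byCategory = new HashMap<>();
    private long builtForVersion = -1;           // version the cached filters belong to

    public FilterPool(Function<String, F> factory) {
        this.factory = factory;
    }

    /**
     * Returns the cached filter for the category, building it on first use.
     * All cached filters are dropped when indexVersion differs from the
     * version they were built for.
     */
    public synchronized F get(String category, long indexVersion) {
        if (indexVersion != builtForVersion) {
            byCategory.clear();
            builtForVersion = indexVersion;
        }
        return byCategory.computeIfAbsent(category, factory);
    }
}
```

Building filters lazily means only the categories that users actually query pay the construction cost after each index change.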

Database

2004-02-25 Thread Parminder Singh
I've a CMS application that deploys metadata to a database. Is it possible to use Lucene to search this database instead of its (Lucene's) index? If you could tell me the steps that would be involved in doing this, it'd be a great help. I'm new to Lucene. Thank You. Parminder Singh

Iterating TermEnum backwards

2004-02-25 Thread Matt Quail
Hi all, Is there any way to iterate through a TermEnum backwards? Okay, I know that there isn't a way to do this via the TermEnum class, but is it implementable on top of the underlying Lucene datastore? My particular problem is this: I have an index of documents, each document has a date field
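Since the enumeration only moves forward, one simple workaround is to buffer the terms and then walk the buffer in reverse. The sketch below assumes the terms have already been pulled into a plain `Iterator<String>`; the class and method names are hypothetical, and no Lucene API is used. It holds every term in memory, so it may only be practical for fields with a modest number of distinct terms.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class ReverseTerms {
    /**
     * Drains a forward-only iterator and returns its elements in reverse order.
     * Memory use grows with the number of terms.
     */
    public static List<String> reversed(Iterator<String> forwardOnly) {
        Deque<String> stack = new ArrayDeque<>();
        while (forwardOnly.hasNext()) {
            stack.push(forwardOnly.next()); // push adds at the head of the deque
        }
        // Iterating from the head yields the most recently pushed term first.
        return new ArrayList<>(stack);
    }
}
```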