+1
-Grant
On Sep 30, 2007, at 4:47 PM, markharw00d wrote:
I've put together a new Filter and JUnit test for eliminating
duplicates from search results.
The typical usage scenario is where multiple documents exist in the
index which share an untokenized field value (e.g. the same
primary key or URL). It is desirable to keep copies in the index
because some searches wish to see the multiple versions (e.g. to
view a revision history for a document). However, when a search is
done which needs to return only one version of each document (often
the latest version) this filter can be used as an efficient means
of filtering results. The bitset produced marks ALL the "master"
docs in an index for a field and this filter can be safely cached
for reuse with any query.
DuplicateFilter df=new DuplicateFilter(KEY_FIELD_NAME);
df.setKeepMode(DuplicateFilter.KM_USE_LAST_OCCURRENCE);
Hits h = searcher.search(query,df);
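
To illustrate the "keep last occurrence" idea in isolation, here is a minimal sketch that does not use Lucene at all. It assumes the key field values are available as a plain array indexed by document id; the class name LastOccurrenceSketch and the method keepLastOccurrence are hypothetical and are not part of DuplicateFilter, whose real implementation works against the index and may differ.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class LastOccurrenceSketch {

    // keys[docId] holds the untokenized key value for that document
    // (an assumed in-memory layout, used only for illustration).
    static BitSet keepLastOccurrence(String[] keys) {
        BitSet masters = new BitSet(keys.length);
        Map<String, Integer> lastSeen = new HashMap<>();
        for (int docId = 0; docId < keys.length; docId++) {
            // Remember this doc as the newest holder of its key.
            Integer previous = lastSeen.put(keys[docId], docId);
            if (previous != null) {
                // An earlier copy of the same key is no longer the master.
                masters.clear(previous);
            }
            masters.set(docId);
        }
        return masters;
    }

    public static void main(String[] args) {
        String[] keys = {"urlA", "urlB", "urlA", "urlC", "urlB"};
        // Docs 0 and 1 are superseded by docs 2 and 4.
        System.out.println(keepLastOccurrence(keys)); // prints {2, 3, 4}
    }
}
```

Because the resulting bitset depends only on the key field and not on any query, it can be computed once and reused, which is the property that makes the filter cacheable.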
If anyone else finds this useful I'll commit it.
Cheers
Mark
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]