+1

-Grant
On Sep 30, 2007, at 4:47 PM, markharw00d wrote:

I've put together a new Filter and JUnit test for eliminating duplicates from search results.

The typical usage scenario is an index containing multiple documents that share an untokenized field value (e.g. the same primary key or URL). Keeping the copies in the index is desirable because some searches need to see every version (e.g. to view a document's revision history). However, when a search needs to return only one version of each document (often the latest), this filter is an efficient way to drop the rest. The bitset it produces marks ALL the "master" docs in the index for the given field, so it does not depend on the query and the filter can safely be cached and reused with any query.

       DuplicateFilter df = new DuplicateFilter(KEY_FIELD_NAME);
       df.setKeepMode(DuplicateFilter.KM_USE_LAST_OCCURRENCE);
       Hits h = searcher.search(query, df);
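
To make the "master docs" bitset idea concrete, here is a rough plain-Java sketch. It is not the DuplicateFilter source and uses no Lucene classes: the class name DuplicateBitsSketch, both method names, and the array-of-keys stand-in for the index are made up for illustration. It assumes keys[i] holds the untokenized key field value of document i, and sets one bit per distinct key, for either the last or the first document carrying that key.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: models the index as an array where
// keys[docId] is the untokenized key field value of that document.
public class DuplicateBitsSketch {

    // Marks, for each distinct key, the highest doc id that carries it.
    public static BitSet lastOccurrences(String[] keys) {
        Map<String, Integer> docForKey = new HashMap<>();
        for (int doc = 0; doc < keys.length; doc++) {
            docForKey.put(keys[doc], doc); // later docs overwrite earlier ones
        }
        return toBits(docForKey, keys.length);
    }

    // Marks, for each distinct key, the lowest doc id that carries it.
    public static BitSet firstOccurrences(String[] keys) {
        Map<String, Integer> docForKey = new HashMap<>();
        for (int doc = 0; doc < keys.length; doc++) {
            docForKey.putIfAbsent(keys[doc], doc); // keep the earliest doc only
        }
        return toBits(docForKey, keys.length);
    }

    // Turns the chosen doc id per key into a bitset over all doc ids.
    private static BitSet toBits(Map<String, Integer> docForKey, int numDocs) {
        BitSet bits = new BitSet(numDocs);
        for (int doc : docForKey.values()) {
            bits.set(doc);
        }
        return bits;
    }

    public static void main(String[] args) {
        String[] keys = {"url-a", "url-b", "url-a", "url-c", "url-b"};
        System.out.println(lastOccurrences(keys));  // prints {2, 3, 4}
        System.out.println(firstOccurrences(keys)); // prints {0, 1, 3}
    }
}
```

Because the bits depend only on the key field and not on any query, the same bitset can be combined with the hits of any search, which is why caching it is safe.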


If anyone else finds this useful I'll commit it.

Cheers
Mark


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




