Re: how to use DuplicateFilter to get unique documents based on a fieldName

Ian Lea Thu, 04 Mar 2010 07:08:15 -0800

If the field you want to use for deduping is ISBN, create a
DuplicateFilter using whatever your ISBN field name is as the field
name and pass that to one of the search methods that takes a filter.


If your index is large, I'd be worried about performance and would look
at deduping at indexing time, i.e. having one Lucene document per ISBN.
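
A minimal sketch of the index-time approach, in plain Java with no Lucene
calls: collapse the per-category database rows into one entry per ISBN
before building documents. The names Row, Book and mergeByIsbn are
hypothetical, and the row shape (ISBN, title, category) is only an
assumption about what the database returns.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IsbnDedup {

    /** One database row: a book listed under a single category (assumed shape). */
    static final class Row {
        final String isbn;
        final String title;
        final String category;

        Row(String isbn, String title, String category) {
            this.isbn = isbn;
            this.title = title;
            this.category = category;
        }
    }

    /** One merged entry per ISBN, carrying every category the book appears in. */
    static final class Book {
        final String isbn;
        final String title;
        final Set<String> categories = new LinkedHashSet<>();

        Book(String isbn, String title) {
            this.isbn = isbn;
            this.title = title;
        }
    }

    /** Groups rows by ISBN, keeping first-seen order and collecting categories. */
    static List<Book> mergeByIsbn(List<Row> rows) {
        Map<String, Book> byIsbn = new LinkedHashMap<>();
        for (Row row : rows) {
            Book book = byIsbn.get(row.isbn);
            if (book == null) {
                // First time this ISBN is seen: take the title from this row.
                book = new Book(row.isbn, row.title);
                byIsbn.put(row.isbn, book);
            }
            book.categories.add(row.category);
        }
        return new ArrayList<>(byIsbn.values());
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("isbn-1", "First Book", "fiction"));
        rows.add(new Row("isbn-2", "Second Book", "history"));
        rows.add(new Row("isbn-1", "First Book", "classics"));

        // Each merged Book would become exactly one indexed document.
        for (Book book : mergeByIsbn(rows)) {
            System.out.println(book.isbn + " " + book.title + " " + book.categories);
        }
    }
}
```

Each merged Book would then be indexed as a single document, for example
with the category added as a repeated field, so the total hit count
reflects unique books without needing a filter at search time.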


--
Ian.


On Thu, Mar 4, 2010 at 12:43 PM, ani...@ekkitab <ani...@ekkitab.com> wrote:
>
> Hi there, could someone help me with the usage of DuplicateFilter? Here is
> my problem.
>
> I have created a search index on book ID, title, and author from a database
> of books which fall under various categories. Some books fall under more
> than one category. Now, when I issue a search, I get back 'X' books matching
> the search criteria, some of which are repeated because those books are in
> different documents, which is the expected behaviour.
>
> I use TopFieldDocCollector.getTotalHits() to get the total count, but
> this includes the repeats mentioned above, so it is not the actual
> count. Hence, when I issue a search on title or author, I want to get a unique
> count / list of books. How do I use DuplicateFilter to achieve this?
>
> Please help
>
> Regards
> Anish
> --
> View this message in context: 
> http://old.nabble.com/how-to-use-DuplicateFilter-to-get-unique-documents-based-on-a-fieldName-tp27780251p27780251.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>
