If the field you want to use for deduping is ISBN, create a DuplicateFilter using whatever your ISBN field name is as the field name and pass that to one of the search methods that takes a filter.
If your index is large I'd be worried about performance and would look at deduping at indexing time i.e. have one lucene document per ISBN. -- Ian. On Thu, Mar 4, 2010 at 12:43 PM, ani...@ekkitab <ani...@ekkitab.com> wrote: > > Hi there, Could someone help me with the usage of DuplicateFilters. Here is > my problem > > I have created a search index on book Id , title ,and author from a database > of books which fall under various categories. Some books fall under more > than one category. Now, when i issue a search, I get back 'X' books matching > the search criteria, some of which are repeated, because that books are in > different documents and its the expected behaviour. > > I use the  TopFieldDocCollector . getTotalHits() to get the total count. But > this includes the repeats as mentioned above. This count is not the actual > count, Hence when I issue a search on title or author i want to get a unique > count / list of books. How do I use DuplicateFilter to acheive this. > > Please help > > Regards > Anish > -- > View this message in context: > http://old.nabble.com/how-to-use-DuplicateFilter-to-get-unique-documents-based-on-a-fieldName-tp27780251p27780251.html > Sent from the Lucene - Java Users mailing list archive at Nabble.com. > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org