Is the field that you are using to dedupe stored as a docvalue?

From: java-user@lucene.apache.org
At: 10/09/20 12:18:04
To: java-user@lucene.apache.org
Subject: Deduplication of search result with custom with custom sort
Hi, I need to deduplicate search results by a specific field, and I have no idea how to implement this properly. I have tried grouping with setGroupDocsLimit(1), which gives me the expected results but does not perform very well. I think I need something like DiversifiedTopDocsCollector, but suitable for collecting TopFieldDocs. Is it possible to achieve deduplication with existing Lucene components, or do I need to implement my own DiversifiedTopFieldsCollector?
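
To make the idea concrete, here is a rough sketch in plain Java of the selection logic such a collector might use: keep at most one hit per dedupe key while collecting the top N under a custom sort. This is not Lucene API. The names DedupTopN, Hit, key, and sortValue are hypothetical, and in a real collector the key and sort value would presumably be read from the index (for example from doc values) rather than passed in.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch, not part of Lucene: top-N selection with one hit per key.
final class DedupTopN {

    // Stand-in for a collected document: id, dedupe key, and one sort value.
    static final class Hit {
        final int doc;
        final String key;
        final long sortValue;

        Hit(int doc, String key, long sortValue) {
            this.doc = doc;
            this.key = key;
            this.sortValue = sortValue;
        }
    }

    // Returns the best n hits under 'order' (smaller compares first = better),
    // keeping only the best hit for each distinct key.
    static List<Hit> topN(Iterable<Hit> hits, int n, Comparator<Hit> order) {
        List<Hit> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        // Tie-break on doc id (assumed unique) so distinct hits never compare equal.
        Comparator<Hit> total = order.thenComparingInt(h -> h.doc);
        TreeSet<Hit> queue = new TreeSet<>(total);   // current top hits, best first
        Map<String, Hit> bestByKey = new HashMap<>(); // key -> its hit in the queue

        for (Hit hit : hits) {
            Hit current = bestByKey.get(hit.key);
            if (current != null) {
                // Key already represented: replace only if the new hit sorts better.
                if (total.compare(hit, current) >= 0) {
                    continue;
                }
                queue.remove(current);
            } else if (queue.size() == n) {
                // Queue is full: the new key must beat the current worst hit.
                Hit worst = queue.last();
                if (total.compare(hit, worst) >= 0) {
                    continue;
                }
                queue.remove(worst);
                bestByKey.remove(worst.key);
            }
            queue.add(hit);
            bestByKey.put(hit.key, hit);
        }
        result.addAll(queue);
        return result;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit(0, "a", 5));
        hits.add(new Hit(1, "b", 3));
        hits.add(new Hit(2, "a", 1));
        Comparator<Hit> bySortValue = Comparator.comparingLong(h -> h.sortValue);
        for (Hit h : topN(hits, 2, bySortValue)) {
            System.out.println(h.doc + " " + h.key + " " + h.sortValue);
        }
    }
}
```

The work per hit is one map lookup plus an O(log N) queue update, and memory stays bounded by N, which is the property that makes this kind of approach attractive compared with building full groups. Whether it would actually outperform grouping in Lucene depends on how cheaply the key can be read per document.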