Hello all, I have a collection of a few million documents, many of which are duplicates. They have been clustered with a simple algorithm, and each document has a field called 'duplicate' which is 0 or 1, plus text fields called 'description', 'tags', and 'meta'. Documents are clustered on different criteria, and the text I search against can be very different among members of a cluster.
I'm currently using a dismax handler to search across the text fields with different boosts, and a filter query to restrict results to masters (duplicate:0). My question is: how do I best query for documents that are masters OR that match the text but are not included in the matched set of masters? Does this make sense?
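
For concreteness, the current setup could be sketched roughly as below. The boost values, the example search text, and the helper name are made up for illustration; only the dismax handler, the three text fields, and the duplicate:0 filter come from the description above.

```python
from urllib.parse import urlencode


def build_master_only_params(user_text):
    """Build dismax request parameters restricted to master documents.

    Hypothetical helper: the boosts below are placeholders, not the
    values actually in use.
    """
    return {
        "defType": "dismax",                        # use the dismax query parser
        "q": user_text,                             # the user's search text
        "qf": "description^2.0 tags^1.5 meta^1.0",  # assumed boosts per text field
        "fq": "duplicate:0",                        # filter query: masters only
    }


params = build_master_only_params("red bicycle")
query_string = urlencode(params)  # ready to append to a /select URL
```

With this filter in place, a duplicate whose own text matches is never returned, even when its master does not match, which is the case the question is about.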