Hi, I'm considering writing a component for diversifying the results. I know that diversification can be achieved with grouping, but I'm thinking of something different and query-biased. The idea is to have something that is applied after normal retrieval and selects the k most diverse documents from the top results, based on some distance metric:
Example: imagine that you ask for 10 rows and set diversify.rows=3 diversity.metric=tfidf diversify.field=body. Solr would retrieve the top 10 rows as usual, extract tf-idf vectors for the bodies, and select the 3 stories that are most distant from each other according to cosine similarity. This would differ from grouping because whether documents are 'collapsed' depends on the subset of documents retrieved for the query. Do you think it would make sense to have this as a component? Any feedback or ideas?
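
To make the idea a bit more concrete, here is a rough sketch of the selection step in plain Python. It is not Solr code and uses no Solr API. The function names (tfidf_vectors, cosine_similarity, diversify) are hypothetical, the whitespace tokenizer stands in for whatever analysis chain the field really has, and the greedy "farthest-first" strategy is just one assumption about how "most distant" could be computed. It always keeps the top-ranked document, then repeatedly adds the document whose highest cosine similarity to the already selected ones is lowest.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse tf-idf vector (dict: term -> weight) per document.

    Assumption: naive lowercase/whitespace tokenization and a smoothed idf,
    computed only over the retrieved documents (hence query-biased).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many retrieved documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def diversify(docs, k):
    """Return the indices of up to k mutually distant documents.

    docs is the field text of the retrieved rows, in rank order.
    Ties are broken in favour of the higher-ranked document.
    """
    if not docs or k <= 0:
        return []
    vectors = tfidf_vectors(docs)
    selected = [0]  # always keep the best-ranked hit
    while len(selected) < min(k, len(docs)):
        best, best_score = None, None
        for i in range(len(docs)):
            if i in selected:
                continue
            # How close is this candidate to its nearest selected document?
            max_sim = max(cosine_similarity(vectors[i], vectors[j]) for j in selected)
            if best is None or max_sim < best_score:
                best, best_score = i, max_sim
        selected.append(best)
    return selected


if __name__ == "__main__":
    bodies = [
        "solr search engine",
        "solr search engine",
        "apple pie recipe",
        "solr search server",
        "banana bread loaf",
    ]
    print(diversify(bodies, 3))  # prints [0, 2, 4]
```

In a real component the vectors would presumably come from the index (for example from stored term vectors) rather than from re-tokenizing the text, and the returned indices would be used to filter the result list before it is written to the response.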