Hello, I would like to use the distribution of Solr scores to select the most relevant documents from a search result. Rather than taking the top n results, I want to keep only the documents that are most relevant according to the statistical distribution of the scores.
A brief study of some sample searches (the most frequently searched terms) on my dataset shows that the mode and median scores coincide or are very close together. Is this a trend that is generally observed in Solr (I understand that specific searches will vary)? Given this, I am considering using the statistical mode as the threshold and keeping only the documents in the result that score above it. Has anyone done something like this before, or would anyone like to critique my approach?

Regards,
Ashish
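
P.S. To make the idea concrete, here is a rough Python sketch of the filtering step I have in mind. It is only an illustration under some assumptions: the documents have already been fetched with the score included (for example with fl=*,score), each document is a dict with a "score" key, and because scores are continuous I estimate the mode by binning them. The function names and the bin_width parameter are my own placeholders, not anything provided by Solr.

```python
from collections import Counter


def mode_threshold(scores, bin_width=0.1):
    """Estimate the mode of the scores as the centre of the fullest bin."""
    if not scores:
        raise ValueError("scores must not be empty")
    # Scores are continuous, so group them into fixed-width bins first.
    bins = Counter(int(s // bin_width) for s in scores)
    top_count = max(bins.values())
    # Assumption: on a tie, take the lowest bin so that more documents are kept.
    mode_bin = min(b for b, count in bins.items() if count == top_count)
    return (mode_bin + 0.5) * bin_width


def filter_above_mode(docs, bin_width=0.1):
    """Keep only the documents whose score is above the estimated mode."""
    threshold = mode_threshold([d["score"] for d in docs], bin_width)
    return [d for d in docs if d["score"] > threshold]


if __name__ == "__main__":
    # Made-up scores purely for illustration, not real Solr output.
    sample = [
        {"id": "a", "score": 0.2},
        {"id": "b", "score": 0.3},
        {"id": "c", "score": 0.4},
        {"id": "d", "score": 1.2},
        {"id": "e", "score": 2.7},
    ]
    for doc in filter_above_mode(sample, bin_width=0.5):
        print(doc["id"], doc["score"])
```

The result depends heavily on the bin width, and the threshold cuts through the middle of the fullest bin, so this is one of the parts I would especially welcome comments on.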