: I give Solr documents to index whose uniqueKey field is based on the URL
: and the time at which the crawler downloaded it, so the uniqueKey is a
: number obtained like this: MyAlgo(Url+Time). The problem occurs at search
: time: Solr returns results that contain duplicates. For example, the
: first 10 results correspond to the same web page with the same content,
: because in fact it is the same URL. So I want to remove this duplication
: by adding a parameter to the Solr request, for example permitdupp,
: which takes the values true or false. If permitdupp=true, I keep the
: default Solr behaviour, but if permitdupp=false, I want to remove all the
: duplicate documents and keep just the most recently indexed document (to get

This sounds like the exact use case for "Field Collapsing", which is not
yet part of Solr but does have an open issue with some patches you may
want to try:

https://issues.apache.org/jira/browse/SOLR-236

-Hoss
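
Until collapsing is available server-side, the behaviour asked for with permitdupp=false can be roughly approximated in the client after the results come back. The following is only a minimal sketch of that idea, not the SOLR-236 patch or any Solr API: the function name collapse_by_url and the field names "url" and "fetch_time" are hypothetical, and it assumes each indexed document stores its URL and crawl time as separate fields alongside the uniqueKey.

```python
from typing import Any, Dict, Iterable, List


def collapse_by_url(
    docs: Iterable[Dict[str, Any]],
    url_field: str = "url",          # hypothetical stored field holding the page URL
    time_field: str = "fetch_time",  # hypothetical stored field holding the crawl time
) -> List[Dict[str, Any]]:
    """Keep only the most recently fetched document for each URL."""
    docs = list(docs)  # materialise so the results can be scanned twice

    # First pass: remember the newest document seen for every URL.
    newest: Dict[Any, Dict[str, Any]] = {}
    for doc in docs:
        kept = newest.get(doc[url_field])
        if kept is None or doc[time_field] > kept[time_field]:
            newest[doc[url_field]] = doc

    # Second pass: emit the survivors in their original ranking order.
    return [doc for doc in docs if newest[doc[url_field]] is doc]


if __name__ == "__main__":
    results = [
        {"id": "a1", "url": "http://example.com/a", "fetch_time": 1},
        {"id": "b1", "url": "http://example.com/b", "fetch_time": 5},
        {"id": "a2", "url": "http://example.com/a", "fetch_time": 3},
    ]
    for doc in collapse_by_url(results):
        print(doc["id"], doc["url"])
```

Note that filtering on the client only removes duplicates within the page of results already fetched, so a page may come back shorter than requested; that limitation is one reason to prefer collapsing inside Solr once it is available.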