We want to configure Solr so that fields are indexed with a maximum term frequency and a minimum document length. If a term appears more than N times in a field, it will be treated as having appeared only N times. If a document is shorter than M terms, it will be treated as having exactly M terms. We have done this in the past in raw Lucene by writing a Similarity class like this:

    public class LimitingSimilarity extends DefaultSimilarity {
        // Treat documents shorter than minNumTerms as having exactly minNumTerms terms.
        public float lengthNorm(String fieldName, int numTerms) {
            return super.lengthNorm(fieldName, Math.max(minNumTerms, numTerms));
        }

        // Treat term frequencies above maxTermFrequency as exactly maxTermFrequency.
        public float tf(float freq) {
            freq = Math.min(maxTermFrequency, freq);
            return super.tf(freq);
        }
    }

Is there a better way to do this within the Solr configuration files?

Thanks,
Jonah
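
P.S. To make the intended behaviour concrete, here is a minimal standalone sketch of the two limits that does not depend on Lucene. It assumes the DefaultSimilarity formulas tf = sqrt(freq) and lengthNorm = 1/sqrt(numTerms). The class name LimitingSketch and the values 50 and 10 are hypothetical and chosen only for illustration; they are not our real settings.

```java
// Standalone sketch of the clamping logic; no Lucene dependency.
class LimitingSketch {
    // Hypothetical limits, for illustration only.
    static final int MIN_NUM_TERMS = 50;
    static final float MAX_TERM_FREQUENCY = 10f;

    // Cap the raw frequency, then apply the assumed sqrt(freq) formula.
    static float tf(float freq) {
        return (float) Math.sqrt(Math.min(MAX_TERM_FREQUENCY, freq));
    }

    // Raise the term count to the floor, then apply the assumed 1/sqrt(numTerms) formula.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(Math.max(MIN_NUM_TERMS, numTerms)));
    }

    public static void main(String[] args) {
        // A term seen 100 times scores the same as one seen 10 times.
        System.out.println(tf(100f) == tf(10f));
        // A 5-term document gets the same norm as a 50-term document.
        System.out.println(lengthNorm(5) == lengthNorm(50));
    }
}
```

With these limits, frequencies below the cap and lengths above the floor are scored exactly as they would be without the limits; only the extremes are flattened.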