A useful relevance "feature" is the number of terms in a field of a document: basically the term length discounted for overlaps, or else the total number of positions (the position length). org.apache.lucene.search.similarities.Similarity#computeNorm receives this information, applies a Similarity-dependent formula, and the result is written to the norms on disk. The Similarity API does not provide a way to reverse this, even though each Similarity has the formula for going in the forward direction. Wouldn't such an API be nice? WDYT? The ultimate goal would be to provide a ValueSource for accessing this length. There is something similar, NormValueSource, but that yields the decoded norm rather than the term length (AKA position length), and it's limited to TFIDFSimilarity.
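To make the idea concrete, here is a rough, self-contained Java sketch of what "reversing" a norm could look like. It deliberately does not use Lucene classes, and the encoding is a made-up lossy one-byte scheme, not the formula of any real Similarity. The class name NormLengthSketch and the methods computeNorm, encode, decode and lengthFromNorm are all hypothetical names chosen for illustration. The only point is that the forward formula is lossy, so the reverse direction can only return an approximate length, which a small lookup table makes cheap.

```java
/**
 * Hypothetical sketch only: NOT Lucene's norm encoding and not an existing API.
 * It shows a lossy "length -> one byte" formula plus the reverse lookup that
 * the post wishes the Similarity API exposed.
 */
final class NormLengthSketch {

    /** Largest code the forward formula can produce (for Integer.MAX_VALUE). */
    static final int MAX_CODE = encode(Integer.MAX_VALUE);

    /** Reverse table: code -> smallest length that encodes to that code. */
    private static final int[] LENGTH_BY_CODE = new int[MAX_CODE + 1];

    static {
        for (int code = 0; code <= MAX_CODE; code++) {
            LENGTH_BY_CODE[code] = decode(code);
        }
    }

    /** Forward direction (assumed stand-in for computeNorm): discount overlaps, then quantize. */
    static byte computeNorm(int length, int numOverlaps) {
        return (byte) encode(length - numOverlaps);
    }

    /** Lengths below 16 are kept exactly; larger ones keep only their top 4 bits. */
    static int encode(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be >= 0");
        }
        if (length < 16) {
            return length;
        }
        int shift = 31 - Integer.numberOfLeadingZeros(length) - 3;
        int mantissa = (length >>> shift) & 0x07; // 3 bits below the leading 1
        return ((shift + 1) << 3) | mantissa;
    }

    /** Inverse of encode, returning the lower bound of the code's bucket. */
    static int decode(int code) {
        if (code < 16) {
            return code;
        }
        int shift = (code >>> 3) - 1;
        int mantissa = code & 0x07;
        return (8 | mantissa) << shift;
    }

    /** Reverse direction: what a length-returning ValueSource could read per document. */
    static int lengthFromNorm(byte norm) {
        int code = norm & 0xFF; // treat the stored byte as unsigned
        if (code > MAX_CODE) {
            throw new IllegalArgumentException("not a code produced by encode: " + code);
        }
        return LENGTH_BY_CODE[code];
    }
}
```

With this toy scheme, a field with 12 positions and 2 overlaps round-trips exactly to 10, while a field of length 100 comes back as 96. A real implementation would have to invert whatever formula the configured Similarity actually applies, and would be approximate in the same way.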
~ David Smiley Apache Lucene/Solr Search Developer http://www.linkedin.com/in/davidwsmiley