I am dissatisfied with the document scores that I'm getting. If a document is short and has one occurrence of the search term, it is ranked higher than a longer document with two occurrences of the term. This makes little sense to me, and I'd like the longer document with more occurrences to be ranked higher. I figured I have to override the scoring method, but I can't find where Lucene actually does the scoring. This is not an uncommon problem for me: I find the API confusing to peruse because it lacks comprehensive Javadoc documentation. (Something that even Sun doesn't spend much time on.) I attempt to read the code, but the variable names are terse and there's a dearth of comments, which makes it fairly unfathomable.
This is the code that I'm using. Am I doing the right thing in using the Query object, or should I be using a different one, such as TermQuery? Does TermQuery score differently, so that I might be happier with its behavior? If not, where might I find the method that actually computes the Document's score, so that I may modify it?

    Hits find(String string_searchString, String string_indexPath) {
        Searcher indexSearcher;
        Analyzer analyzer;
        Query query;
        QueryParser queryParser;
        Hits searchResults_Hits;

        try {
            indexSearcher = new IndexSearcher(string_indexPath);
            analyzer = new SimpleAnalyzer();
            query = QueryParser.parse(string_searchString, "DocumentText", analyzer);
            searchResults_Hits = indexSearcher.search(query);
            return searchResults_Hits;
        }
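
For reference, the ranking described above can be reproduced with a little arithmetic. This is only a sketch, and it rests on an assumption: that the default Similarity computes the term-frequency factor as sqrt(freq) and the field-length normalization as 1/sqrt(numTerms). The class name LengthNormSketch, the helper weight() and the document sizes (10 and 100 terms) are made up for illustration, and the other score factors (idf, boosts, coord) are left out because they are the same for both documents here.

```java
public class LengthNormSketch {

    // Assumed term-frequency factor: square root of the raw occurrence count.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Assumed length normalization: 1 / sqrt(number of terms in the field).
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // The part of the score that differs between the two documents.
    static double weight(int freq, int numTerms) {
        return tf(freq) * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // Hypothetical short document: 1 occurrence in 10 terms.
        double shortDoc = weight(1, 10);
        // Hypothetical long document: 2 occurrences in 100 terms.
        double longDoc = weight(2, 100);

        System.out.println("short doc weight: " + shortDoc);
        System.out.println("long doc weight:  " + longDoc);
    }
}
```

Under these assumptions the short document gets a weight of roughly 0.32 and the long one roughly 0.14, so the short document wins even though it has fewer occurrences. If that is indeed what happens, the place to look would be the Similarity class (its tf and lengthNorm methods) rather than Query or TermQuery. Length norms appear to be computed at indexing time, so a changed lengthNorm would likely need a re-index to take effect.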