> What I'd like to do is get a relevancy-based order in which (a) longer
> documents tend to get more weight than shorter ones, (b) a document body
> with 'X' instances of a query term gets a higher ranking than one with fewer
> than 'X' instances. and (c) a term found in the headline (usually in
> addition to finding the same term in the body) is more highly ranked than
> one with the term only in the body.
>
> But that's not what happens with the default scoring, and I'd like to change
> that.

I am not a Lucene developer, but:

1) Lucene uses the vector space model. If you want to use a different model, you must understand what you are doing and you must change the similarity calculations. AFAIK you would set the normalization factor to a constant value (1.0 or so), so that long documents are no longer scored down relative to short ones.

2) You are trying to search for DATA, not INFORMATION. That is a big difference. For your task, you might be better off with a simpler engine based on an RDBMS and B+ tree indexes.

-g-
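
P.S. To make point 1 concrete, here is a minimal sketch in Python of what a constant normalization factor changes. It is a toy scorer, not Lucene's actual formula or API: the function `score`, the parameters `headline_boost` and `length_norm`, and the square-root term-frequency weighting are all assumptions made for illustration. Note that a constant norm only removes the penalty on long documents; it does not by itself favour them.

```python
import math
from collections import Counter


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def score(query_terms, headline, body, headline_boost=2.0, length_norm=None):
    """Toy term-frequency score over two fields (hypothetical, not Lucene's formula).

    length_norm=None  -> per-field norm of 1/sqrt(field length), which
                         penalizes long fields, as a vector model typically does.
    length_norm=1.0   -> constant norm, so field length no longer matters.
    """
    fields = ((headline, headline_boost), (body, 1.0))
    total = 0.0
    for text, boost in fields:
        tokens = tokenize(text)
        counts = Counter(tokens)
        if length_norm is not None:
            norm = length_norm
        else:
            norm = 1.0 / math.sqrt(len(tokens)) if tokens else 0.0
        for term in query_terms:
            tf = counts[term.lower()]
            # More occurrences score higher; headline hits are boosted.
            total += math.sqrt(tf) * boost * norm
    return total


short_body = "lucene search"
long_body = "lucene search engine library for java text"

# Default norm: the longer document scores lower for the same single hit.
print(score(["lucene"], "", short_body), score(["lucene"], "", long_body))

# Constant norm: both documents score the same.
print(score(["lucene"], "", short_body, length_norm=1.0),
      score(["lucene"], "", long_body, length_norm=1.0))
```

With the constant norm in place, more occurrences in the body and a hit in the headline both raise the score, which covers (b) and (c) from your list. For (a) you would still need an explicit factor that grows with document length.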