Another approach is to change the document length normalization formula.
See Similarity.lengthNorm() in Lucene.
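To make that concrete, here is a rough Python sketch of the idea, not Lucene code. Lucene's DefaultSimilarity computes the length norm as 1/sqrt(numTerms), so a one-word field gets the largest possible norm. The second function is only an assumed alternative: it treats every field shorter than a hypothetical `min_terms` threshold as if it were `min_terms` long, so very short fields stop being rewarded. The threshold value is invented for illustration.

```python
import math


def default_length_norm(num_terms):
    """Length norm in the style of Lucene's DefaultSimilarity: 1/sqrt(numTerms)."""
    return 1.0 / math.sqrt(max(num_terms, 1))


def flattened_length_norm(num_terms, min_terms=50):
    """Hypothetical variant: fields shorter than min_terms get no extra credit.

    min_terms is an illustrative assumption, not a Lucene parameter.
    """
    return 1.0 / math.sqrt(max(num_terms, min_terms))


if __name__ == "__main__":
    for n in (1, 10, 50, 500):
        print(n, default_length_norm(n), flattened_length_norm(n))
```

In a real deployment this would be a Java Similarity subclass configured in the schema, and because norms are computed at index time the documents would have to be reindexed for the change to take effect.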
wunder
On Oct 15, 2009, at 12:45 AM, Andrea D'Ippolito wrote:
I've read (correct me if I'm wrong)
that one way to achieve that is to overboost all the other fields,
but I guess this only works easily if you have few fields indexed ;)
bye
2009/10/15 Simon Wistow <si...@thegestalt.org>
Our index has some items in it which basically contain a title and a
single word body.
If the user searches for a word in the title (especially if the title
itself is only one word) then that doc will get scored quite highly,
despite the fact that, in this case, it's not really relevant.
I've tried something like
qf=title^2.0 content^0.5
bf=num_pages
but that disproportionately boosts long documents to the detriment of
relevancy.
bf=product(num_pages,0.05)
has no effect but
bf=product(num_pages,0.06)
returns a bunch of long documents which don't seem to return any
highlighted fields, plus the short document with only the query in the
title, which is progress in that it's almost exactly the opposite of
what I want.
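One possible reading of the jump between 0.05 and 0.06 is that bf is added to the text relevance score rather than multiplied into it, so there is a narrow weight at which the page count starts to outweigh the text match. The Python sketch below illustrates this under a simplified model that ignores query normalization. The two documents and their scores are made-up numbers, not values from this index.

```python
def combined_score(text_score, num_pages, weight):
    """Simplified additive model: text relevance plus weight * num_pages."""
    return text_score + weight * num_pages


# Hypothetical documents: name -> (text relevance score, num_pages).
DOCS = {
    "short_title_only": (3.0, 1),
    "long_document": (0.5, 45),
}


def rank(weight):
    """Return document names ordered by combined score, best first."""
    def score(name):
        text_score, num_pages = DOCS[name]
        return combined_score(text_score, num_pages, weight)

    return sorted(DOCS, key=score, reverse=True)


if __name__ == "__main__":
    for weight in (0.05, 0.06):
        print(weight, rank(weight))
```

With these invented numbers the short title-only document ranks first at 0.05 and the long document overtakes it at 0.06, which is the kind of abrupt flip an additive boost can produce.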
Any suggestions? Am I going to need to reindex and add the length in
bytes or characters of the document?
Simon