Another approach is to change the document length normalization formula.

See Similarity.lengthNorm() in Lucene.
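
For illustration only, here is a minimal sketch of the arithmetic in Python rather than Lucene's Java API. The first function reproduces the 1/sqrt(numTokens) formula that Lucene's DefaultSimilarity uses for lengthNorm. The second function, flattened_length_norm, and its min_tokens parameter are hypothetical names made up for this example: it shows one possible way to stop very short fields from getting the largest norm.

```python
import math


def default_length_norm(num_tokens: int) -> float:
    """1 / sqrt(num_tokens), the formula used by Lucene's DefaultSimilarity."""
    if num_tokens <= 0:
        return 0.0
    return 1.0 / math.sqrt(num_tokens)


def flattened_length_norm(num_tokens: int, min_tokens: int = 10) -> float:
    """Hypothetical variant: fields shorter than min_tokens are treated as
    if they had min_tokens, so a one-word field no longer gets the top norm."""
    return 1.0 / math.sqrt(max(num_tokens, min_tokens))


if __name__ == "__main__":
    for n in (1, 4, 10, 100):
        print(n, default_length_norm(n), flattened_length_norm(n))
```

With the default formula a one-word field gets a norm of 1.0, while with the flattened variant it gets the same norm as a ten-word field. In Lucene the norm is computed and stored at index time, so a change like this only takes effect after reindexing.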

wunder

On Oct 15, 2009, at 12:45 AM, Andrea D'Ippolito wrote:

I've read (correct me if I'm wrong)
that one way to achieve that is to overboost all the other fields,
but I guess this only works easily if you have few fields indexed ;)

bye

2009/10/15 Simon Wistow <si...@thegestalt.org>

Our index has some items in it which basically contain a title and a
single word body.

If the user searches for a word in the title (especially if the title is
itself only one word) then that doc will get scored quite highly,
despite the fact that, in this case, it's not really relevant.

I've tried something like

qf=title^2.0 content^0.5
bf=num_pages

but that disproportionately boosts long documents to the detriment of
relevancy.

bf=product(num_pages,0.05)

has no effect but

bf=product(num_pages,0.06)


has a bunch of long documents which don't seem to return any highlighted
fields, plus the short document with only the query in the title, which is
progress in that it's almost exactly the opposite of what I want.
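
As a rough illustration of why a linear bf can swamp relevancy: dismax adds the bf value to the text score, so a large num_pages can outweigh any difference in text relevance. The sketch below is Python, not Solr. The document names, text scores and page counts are invented for the example, and it ignores query normalization, so the numbers only show the shape of the problem.

```python
# Hypothetical documents: name -> (text relevance score, num_pages).
DOCS = {
    "short_title_only": (2.0, 1),
    "long_relevant": (1.2, 40),
    "long_irrelevant": (0.3, 400),
}


def rank(docs, factor):
    """Rank documents by text score plus an additive boost of
    num_pages * factor, roughly what bf=product(num_pages,factor) adds."""
    scored = {
        name: text_score + num_pages * factor
        for name, (text_score, num_pages) in docs.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


if __name__ == "__main__":
    print(rank(DOCS, 0.0))   # text score only
    print(rank(DOCS, 1.0))   # equivalent of bf=num_pages
    print(rank(DOCS, 0.06))  # equivalent of bf=product(num_pages,0.06)
```

With these made-up numbers the ordering flips completely as soon as the boost is applied, because the boost grows linearly with length while the text scores stay within a narrow range.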

Any suggestions? Am I going to need to reindex and add the length in
bytes or characters of the document?

Simon





