Re: 'Down' boosting shorter docs

Walter Underwood Thu, 15 Oct 2009 06:36:23 -0700

Another approach is to change the document length normalization formula.


See Similarity.lengthNorm() in Lucene.
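
A rough sketch of the idea, in plain Java with no Lucene dependency so it can be run on its own. The class name, the method names, and the minTerms floor of 100 are all hypothetical, chosen only for illustration. The first method mirrors the 1/sqrt(numTerms) formula used by Lucene's DefaultSimilarity. The second is one assumed alternative that treats very short fields as if they had at least minTerms terms, so a one-word body no longer gets the largest possible norm.

```java
// Sketch only: names and the floor value are assumptions, not Lucene API.
public class LengthNormSketch {

    // Same formula as Lucene's DefaultSimilarity: shorter fields get a larger norm.
    public static float defaultLengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Assumed alternative: any field shorter than minTerms is scored
    // as if it contained minTerms terms, which removes the short-doc advantage.
    public static float flooredLengthNorm(int numTerms, int minTerms) {
        return (float) (1.0 / Math.sqrt(Math.max(numTerms, minTerms)));
    }

    public static void main(String[] args) {
        int minTerms = 100; // hypothetical floor, tune against real data
        for (int n : new int[] {1, 10, 100, 1000}) {
            System.out.printf("%5d terms: default=%.4f floored=%.4f%n",
                    n, defaultLengthNorm(n), flooredLengthNorm(n, minTerms));
        }
    }
}
```

In a real setup the second formula would go into a Similarity subclass that overrides lengthNorm(). Norms are computed at index time, so a change like this only takes effect after reindexing.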

wunder

On Oct 15, 2009, at 12:45 AM, Andrea D'Ippolito wrote:

I've read (correct me if I'm wrong)
that one way to achieve that is to overboost all the other fields,
but I guess this only works easily if you have few fields indexed ;)

bye

2009/10/15 Simon Wistow <si...@thegestalt.org>

Our index has some items in it which basically contain a title and a
single word body.

If the user searches for a word in the title (especially if the title is of
itself only one word) then that doc will get scored quite highly,
despite the fact that, in this case, it's not really relevant.

I've tried something like

qf=title^2.0 content^0.5
bf=num_pages

but that disproportionally boosts long documents to the detriment of
relevancy

bf=product(num_pages,0.05)

has no effect but

bf=product(num_pages,0.06)

has a bunch of long documents which don't seem to return any highlighted
fields plus the short document with only the query in the title, which is
progress in that it's almost exactly the opposite of what I want.

Any suggestions? Am I going to need to reindex and add the length in
bytes or characters of the document?

Simon
