Re: Strategy for making short documents not bubble to the top?

yahootintin . 11533894 Wed, 29 Jun 2005 15:39:07 -0700

Hi Jian,

Thanks for the reply.  The problem with that approach is that it
completely ignores document length.  A book that mentions "frog" 5 times
in its 2,000 pages should be less relevant than a book that mentions
"frog" 4 times in its 4 pages.
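
To make the frog example concrete, here is a rough sketch comparing a length-blind score with a length-aware one. It assumes a sqrt(tf) term-frequency factor and a 1/sqrt(length) norm, which is the general shape of Lucene's default scoring as far as I know. The function names are made up for illustration, and length is counted in pages here rather than tokens.

```python
import math


def score_ignoring_length(term_count: int) -> float:
    """Length-blind score: only the term count matters."""
    return math.sqrt(term_count)


def score_with_length(term_count: int, length: int) -> float:
    """sqrt(tf) scaled by an assumed 1/sqrt(length) norm."""
    return math.sqrt(term_count) / math.sqrt(length)


long_book = (5, 2000)   # 5 mentions in 2,000 pages
short_book = (4, 4)     # 4 mentions in 4 pages

for name, (tf, length) in (("long", long_book), ("short", short_book)):
    print(name, score_ignoring_length(tf), score_with_length(tf, length))
```

Without any length factor the 2,000-page book ranks first simply because 5 > 4; with the norm, the 4-page book wins by a wide margin.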


I really want to lower the weight given to document length rather than
remove it completely.  Any ideas how to do that?
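
One way I could imagine doing this is to soften the length norm rather than drop it. The sketch below is only an illustration under my own assumptions: `length_norm`, `log_length_norm` and `penalty_ratio` are hypothetical helpers, not Lucene calls. An exponent of 0.5 gives the usual 1/sqrt(length) shape, 0.0 ignores length entirely, and values in between (or the log variant) keep a length penalty but flatten it.

```python
import math


def length_norm(num_tokens: int, exponent: float = 0.5) -> float:
    """Power-law norm: length ** -exponent (0.5 = 1/sqrt, 0.0 = no penalty)."""
    return float(max(num_tokens, 1) ** -exponent)


def log_length_norm(num_tokens: int) -> float:
    """Logarithmic norm: 1 / (1 + ln(length)), a gentler penalty."""
    return 1.0 / (1.0 + math.log(max(num_tokens, 1)))


def penalty_ratio(norm, short_len: int, long_len: int) -> float:
    """How many times larger the short document's norm is than the long one's."""
    return norm(short_len) / norm(long_len)


if __name__ == "__main__":
    short_len, long_len = 4, 2000
    print("sqrt   :", penalty_ratio(length_norm, short_len, long_len))
    print("quarter:", penalty_ratio(lambda n: length_norm(n, 0.25), short_len, long_len))
    print("log    :", penalty_ratio(log_length_norm, short_len, long_len))
```

In Lucene terms I would expect this to correspond to overriding lengthNorm in a custom Similarity, but I have not tried that here. Since norms are stored at index time, the index would presumably need to be rebuilt after changing the formula.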

Thanks.

--- [email protected] wrote:
> Hi,
>
> I would use a pure span or cover density based ranking algorithm, which
> does not take document length into consideration (tweaking whatever is
> currently in the standard Lucene distribution?).
>
> For example, when searching for the keywords "beautiful house", span/cover
> ranking will give a long document and a short document the same
> rank as long as they have the same number of spans/covers (for
> example, "beautiful xxxxxx house" is one cover) and, within each
> span/cover, the edit distance between the keywords is the same.
>
> Just my 2 cents,
>
> Cheers,
>
> Jian
> 
> On 29 Jun 2005 20:30:49 -0000, [EMAIL PROTECTED]
> <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Short documents bubble to the top of the results because the field
> > length is short.  Does anyone have a good strategy for working around
> > this?  Will doing something like log(document length) flatten out my
> > results while still making them meaningful?  I'm going to try some
> > different approaches but any advice is appreciated.
> >
> > Thanks.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
