*From:* Mark Bennett
*To:* dev@lucene.apache.org
*Sent:* Fri, 10 September, 2010 18:44:31
*Subject:* Re: Relevancy, Phrase Boosting, Shingles and Long Tail Curves
Thanks Mark H,
Maybe I'll look at MLT (More Like This) again. I'll also c
tersectionPercent;
// so here we take an average of the two:
return (termBIntersectionPercent + overallIntersectionPercent) / 2;
}
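The averaging step in the fragment above can be tried out in a standalone sketch. Only the final line (the average of the two percentages) comes from the fragment. The class name, the method name, the count-based inputs, and the way each percentage is derived are assumptions made for illustration, not the original code.

```java
// A minimal sketch, not the original method: only the final averaging line
// is taken from the fragment; everything else here is hypothetical.
class IntersectionScore {

    /**
     * @param docFreqA  number of documents containing term A (assumed input)
     * @param docFreqB  number of documents containing term B (assumed input)
     * @param bothCount number of documents containing both terms (assumed input)
     * @return the average of the two intersection percentages, in [0, 1]
     */
    public static double score(int docFreqA, int docFreqB, int bothCount) {
        // Documents containing at least one of the two terms.
        int unionCount = docFreqA + docFreqB - bothCount;
        if (docFreqB <= 0 || unionCount <= 0 || bothCount < 0) {
            return 0.0; // nothing to compare against
        }
        // Assumption: share of term B's documents that also contain term A.
        double termBIntersectionPercent = (double) bothCount / docFreqB;
        // Assumption: share of all documents with either term that contain both.
        double overallIntersectionPercent = (double) bothCount / unionCount;
        // From the fragment: take an average of the two.
        return (termBIntersectionPercent + overallIntersectionPercent) / 2;
    }

    public static void main(String[] args) {
        // Term A in 100 docs, term B in 50 docs, 25 docs contain both.
        System.out.println(score(100, 50, 25));
    }
}
```

Averaging the two figures keeps a rare term that always co-occurs with a common one from scoring as highly as two terms that overlap almost completely.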
's another topic.
> BTW, the Luke tool has a "Zipf" plugin that you may find useful in
> examining index term distributions in Lucene indexes.
>
> Cheers
> Mark
>
> --
> *From:* Mark Bennett
> *To:* java-...@lucene.apache.org
>
From: Mark Bennett
To: java-...@lucene.apache.org
Sent: Fri, 10 September, 2010 1:42:11
Subject: Relevancy, Phrase Boosting, Shingles and Long Tail Curves
I want to boost the relevancy of some Question and Answer content. I'm using
stop words, Dismax, and I'm already a fan of Phrase Boosting and have
cranked that up a bit. But I'm considering using long Shingles to make use
of some of the normally stopped out "junk words" in the content to help
relev