Hi Ken,

I found this entry in the Lucene FAQ: http://wiki.apache.org/lucene-java/LuceneFAQ#head-912c1f237bb00259185353182948e5935f0c2f03
In practice you sometimes need to apply a cut-off or a boost factor after tf-idf scoring. The way I've been going about it is by picking values and seeing whether the results are better. I'm sure there is a deep information theory problem there.

M

On Wed, Feb 25, 2009 at 8:38 AM, Ken Williams <ken.willi...@thomsonreuters.com> wrote:

> Hi all,
>
> I didn't get a response to this - not sure whether the question was
> ill-posed, or too-frequently-asked, or just not interesting. But if anyone
> could take a stab at it or let me know a different place to look, I'd
> really appreciate it.
>
> Thanks,
>
> -Ken
>
> On 2/20/09 12:00 PM, "Ken Williams" <ken.willi...@thomsonreuters.com> wrote:
>
> > Hi,
> >
> > Has there been any work done on getting confidence scores at runtime, so
> > that scores of documents can be compared across queries? I found one
> > reference in the mailing list to some work in 2003, but couldn't find any
> > follow-up:
> >
> > http://osdir.com/ml/jakarta.lucene.user/2003-12/msg00093.html
> >
> > Thanks.
>
> --
> Ken Williams
> Research Scientist
> The Thomson Reuters Corporation
> Eagan, MN
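
For what it's worth, the "pick values and see if the results are better" approach in the reply above could look roughly like the sketch below. It is plain Python and does not use any Lucene API. The function names, the per-document boost map, and the use of F1 against a hand-labelled set of relevant documents are all assumptions made for illustration, not something from the FAQ or the thread.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Hit = Tuple[str, float]  # (doc_id, raw score as returned by the search engine)


def apply_cutoff_and_boost(
    hits: List[Hit], cutoff: float, boosts: Optional[Dict[str, float]] = None
) -> List[Hit]:
    """Multiply each score by an optional per-document boost, drop hits
    whose boosted score is below the cut-off, and sort best-first."""
    boosts = boosts or {}
    rescored = [(doc_id, score * boosts.get(doc_id, 1.0)) for doc_id, score in hits]
    kept = [(doc_id, score) for doc_id, score in rescored if score >= cutoff]
    return sorted(kept, key=lambda hit: hit[1], reverse=True)


def f1(hits: List[Hit], relevant: Set[str]) -> float:
    """F1 of the kept hits against a hand-labelled set of relevant doc ids."""
    true_positives = sum(1 for doc_id, _ in hits if doc_id in relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(hits)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)


def pick_cutoff(hits: List[Hit], relevant: Set[str], candidates: Iterable[float]) -> float:
    """Try each candidate cut-off and keep the one with the best F1
    (the first such candidate wins on ties)."""
    return max(candidates, key=lambda c: f1(apply_cutoff_and_boost(hits, c), relevant))
```

A cut-off chosen this way is only as good as the labelled queries it was tuned on, and raw scores from different queries are not on a common scale, so a value tuned on one query may not carry over to another.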