Thank you, Robert.
--- On Wed, 2/10/10, Robert Muir wrote:
> From: Robert Muir
> Subject: Re: TREC Data and Topic-Specific Index
> To: java-user@lucene.apache.org
> Date: Wednesday, February 10, 2010, 9:23 AM
> Hi, so you mean around 15% and 24%
> respectively? i think you
tcSimilarity gets us more improvement:
> 0.175-0.141=0.034.
>
> Thanks,
>
> Ivan
>
> --- On Sun, 2/7/10, Robert Muir wrote:
>
> > From: Robert Muir
> > Subject: Re: TREC Data and Topic-Specific Index
> > To: java-user@lucene.apache.org
> > Date: Sunda
Muir wrote:
> From: Robert Muir
> Subject: Re: TREC Data and Topic-Specific Index
> To: java-user@lucene.apache.org
> Date: Sunday, February 7, 2010, 10:59 PM
You should do (a), and pretend you know nothing about the relevance
judgements up front.

It is true you might make some change to your search engine and wonder: how
is it fair that I am bringing back possibly relevant docs that were never
judged (and thus scored implicitly as non-relevant)? i.e. t
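
To illustrate the point about unjudged documents, here is a minimal sketch of average precision for a single topic. It is not the official trec_eval code; the function name `average_precision` and the document ids are hypothetical. The only assumption it encodes is the one described above: a retrieved document with no relevance judgement is scored exactly like one judged non-relevant.

```python
def average_precision(ranked_doc_ids, judgments):
    """Average precision for one topic.

    judgments maps doc id -> True (judged relevant) or False (judged
    non-relevant). A doc id missing from judgments was never judged and
    is treated as non-relevant, which is the behaviour discussed above.
    """
    total_relevant = sum(1 for is_relevant in judgments.values() if is_relevant)
    if total_relevant == 0:
        return 0.0

    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        # .get(..., False): unjudged documents count as non-relevant.
        if judgments.get(doc_id, False):
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_relevant


# Hypothetical judgements and two hypothetical runs for one topic.
judgments = {"d1": True, "d2": False, "d3": True}
run_with_judged_miss = ["d1", "d2", "d3"]    # d2 was judged non-relevant
run_with_unjudged_doc = ["d1", "d9", "d3"]   # d9 was never judged

print(average_precision(run_with_judged_miss, judgments))
print(average_precision(run_with_unjudged_doc, judgments))
```

Both runs get the same score, because the unjudged `d9` is penalised the same way as the judged non-relevant `d2`.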
Robert,
We are using TREC-3 data and Ad Hoc topics 151-200. The relevance judgments
list contains 97,319 entries, of which 68,559 are unique document IDs. The
TIPSTER collection, which was used in TREC-3, contains around 750,000 documents.
Should we (a) index the entire 750,000 document collection
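
For anyone wanting to reproduce the entry and unique-document counts quoted above, a rough sketch follows. It assumes the usual four-column TREC qrels layout (topic, iteration, document id, relevance); the function name `qrels_stats` and the sample lines are made up for illustration and are not taken from the real TREC-3 judgments.

```python
def qrels_stats(lines):
    """Return (entries, unique_doc_ids, topics) for qrels-style lines.

    Assumes each line has four whitespace-separated fields:
    topic, iteration, document id, relevance. Other lines are skipped.
    """
    entries = 0
    doc_ids = set()
    topics = set()
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # blank or malformed line
        topic, _iteration, doc_id, _relevance = parts
        entries += 1
        doc_ids.add(doc_id)
        topics.add(topic)
    return entries, len(doc_ids), len(topics)


# Made-up sample: the same document is judged for two different topics,
# which is why entries can outnumber unique document ids.
sample = [
    "151 0 DOC-0001 1",
    "151 0 DOC-0002 0",
    "152 0 DOC-0001 0",
]
print(qrels_stats(sample))
```

With a real qrels file, the same function could be called as `qrels_stats(open(path))`, where `path` is wherever the judgments file lives.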