Erick Erickson wrote:
I don't believe you can compare scores across queries in any meaningful
way.
I actually investigated this to some degree in my thesis, comparing
different participating systems from the TREC campaigns. It turns out
that some systems' scores (e.g. the top scores for a gi
On 8/21/07, Heba Farouk <[EMAIL PROTECTED]> wrote:
> The documents are not duplicated; I mean the hits. (Assume that 2 documents
> have the same subject but different authors, so if I'm searching on the
> subject, the returned hits will contain duplicates.)
> I was asking if I can remove the duplicates.
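One way to picture the question above is as a post-processing step over the ranked hits. The following is only a minimal sketch in plain Python, not Lucene code: the hits are assumed to have been copied into dictionaries already, and the names `dedupe_hits`, `key_field`, and the sample field values are hypothetical.

```python
def dedupe_hits(hits, key_field):
    """Keep only the first hit seen for each distinct value of key_field.

    Assumes `hits` is already in ranked order (best first), so the hit
    that survives for each key is the highest-ranked one.
    """
    seen = set()
    unique = []
    for hit in hits:
        key = hit[key_field]
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique


# Hypothetical sample data: two hits share a subject but differ in author.
hits = [
    {"id": 1, "subject": "lucene scoring", "author": "A", "score": 0.9},
    {"id": 2, "subject": "lucene scoring", "author": "B", "score": 0.8},
    {"id": 3, "subject": "index merging", "author": "C", "score": 0.5},
]
unique_hits = dedupe_hits(hits, "subject")
```

Filtering after the search keeps the index untouched, at the cost of possibly returning fewer hits than were requested.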
Hello,
I'm looking to extract significant terms characterizing a set of
documents (which in turn relate to a topic).
This basically comes down to functionality similar to determining the
terms with the greatest offer weight (as used for blind relevance
feedback), or maximizing tf.idf (as is
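To make the tf.idf variant of this idea concrete, here is a rough sketch in plain Python rather than against the Lucene API. It assumes documents are already tokenized into lists of terms and that the document set is a subset of the collection; `top_tfidf_terms` and the toy data are hypothetical, and the weighting (summed term frequency times log(N/df)) is just one common tf.idf form, not necessarily the one meant in the message.

```python
import math
from collections import Counter


def top_tfidf_terms(doc_set, collection, k=5):
    """Rank the terms of doc_set by summed tf * log(N / df).

    tf is counted over doc_set only; df and N come from the whole collection.
    """
    n_docs = len(collection)

    # Document frequency: number of collection documents containing each term.
    df = Counter()
    for doc in collection:
        df.update(set(doc))

    # Term frequency summed over the documents of interest.
    tf = Counter()
    for doc in doc_set:
        tf.update(doc)

    scores = {t: tf[t] * math.log(n_docs / df[t]) for t in tf if df[t] > 0}
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy collection; the first two documents form the topic set.
collection = [
    ["lucene", "index", "lucene", "the"],
    ["lucene", "score", "the"],
    ["cooking", "recipe", "the"],
    ["travel", "guide", "the"],
]
doc_set = collection[:2]
significant = top_tfidf_terms(doc_set, collection, k=3)
```

A term that occurs in every document of the collection (here "the") gets a weight of zero, which is what pushes it out of the top of the list.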
Hi,
I have a Lucene index with a few million entries, and I will need to
add batches of a few hundred thousand or a few million additional
entries. Unfortunately, I absolutely need to have all indexed entries
available when inserting a new one, even within one batch, in order to
do some duplicat