Doug Cutting wrote:

> We should be careful not to tune things too much for any one application
> and/or dataset.  Tools to perform evaluation would clearly be valuable.
>   But changes that improve Lucene's results on TREC data may or may not
> be of general utility.  The best way to tune an application is to sample
> its query stream and evaluate these against its documents.

I agree - it may very well be that what is found to help scoring in one
application (e.g. TREC) turns out to be less helpful for another
collection/application, either because the data is different, or perhaps
even because the relevance assessments were not perfect. In particular for
TREC data, I've read a comparison (can't find the link now) of the
performance of a few systems, concluding that for that specific collection
the probability of a document being relevant correlates with its length -
longer documents are more likely to be relevant, so a system that penalizes
long documents too heavily would get poorer results.
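
Just to illustrate the kind of knob this touches, here is a minimal sketch
of dampening the length penalty by overriding DefaultSimilarity. It assumes
the lengthNorm(String, int) extension point, and the 0.25 exponent is an
arbitrary example value, not a recommendation:

  import org.apache.lucene.search.DefaultSimilarity;

  /**
   * Sketch only: penalize long documents less than the default
   * 1/sqrt(numTerms) normalization by using a smaller exponent.
   */
  public class GentleLengthNormSimilarity extends DefaultSimilarity {
    public float lengthNorm(String fieldName, int numTerms) {
      // Default is 1/sqrt(numTerms); a smaller exponent flattens the
      // penalty for long documents.
      return (float) (1.0 / Math.pow(numTerms, 0.25));
    }
  }

It would have to be set via setSimilarity() on both the IndexWriter (norms
are computed at indexing time) and the Searcher, and whether it actually
helps should of course be measured per collection.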

My take from this is that, yes, we should be very careful about overly
specific tuning. At the same time, if we find that certain general logic
that makes sense and improves results in some cases is not possible with
the current Lucene API/design, we should consider making it possible.

> That said, Lucene's scoring method has never been systematically tuned,
> and some judicious tuning based on TREC results would probably benefit a
> majority of Lucene applications.  Ideally we can develop evaluation
> tools, use them on a variety of datasets to find better defaults for
> Lucene, and make the tools available so that folks can fine-tune things
> for their particular applications.

I submitted a first version patch for this in LUCENE-836 - comments are
appreciated...
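
Just to show the kind of measure such tools would report (an illustration
only, not the API in the patch): average precision for a single query can
be computed from the ranked result ids and the set of judged-relevant ids
roughly like this:

  import java.util.List;
  import java.util.Set;

  public class AveragePrecision {
    /** Sketch: average precision of one ranked result list against judgments. */
    public static double compute(List<String> rankedDocIds, Set<String> relevantDocIds) {
      if (relevantDocIds.isEmpty()) {
        return 0.0;
      }
      int relevantSeen = 0;
      double precisionSum = 0.0;
      for (int rank = 1; rank <= rankedDocIds.size(); rank++) {
        if (relevantDocIds.contains(rankedDocIds.get(rank - 1))) {
          relevantSeen++;
          precisionSum += (double) relevantSeen / rank; // precision at this cut-off
        }
      }
      return precisionSum / relevantDocIds.size();
    }
  }

Averaging that over the query set gives MAP, which is the usual headline
number for TREC-style runs.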

Thanks for your comments,
Doron

