Mark,

In one project (with Lucene rather than Solr), I also use a smallish unit-test
sample index and run a set of queries against it.
It is very limited, but it is automatable.
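
To give an idea, here is roughly what such a test looks like. This is only a
minimal sketch against the Lucene 3.x API; the field name, sample document and
query are invented for illustration:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;
import org.junit.Test;
import static org.junit.Assert.*;

public class SampleQueriesTest {

    @Test
    public void criticalQueryFindsTheTheorem() throws Exception {
        // index the small, hand-picked sample
        RAMDirectory dir = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_31);
        IndexWriter w = new IndexWriter(dir,
            new IndexWriterConfig(Version.LUCENE_31, analyzer));
        Document d = new Document();
        d.add(new Field("title", "pythagorean theorem",
            Field.Store.YES, Field.Index.ANALYZED));
        w.addDocument(d);
        w.close();

        // run one of the critical queries and check the first hit
        IndexSearcher s = new IndexSearcher(dir);
        TopDocs hits = s.search(
            new QueryParser(Version.LUCENE_31, "title", analyzer).parse("theorem"), 10);
        assertTrue(hits.totalHits >= 1);
        assertEquals("pythagorean theorem",
            s.doc(hits.scoreDocs[0].doc).get("title"));
        s.close();
    }
}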

A better way, I find, is to have precision and recall measured against real
users' judgments, release after release.
Sadly, I have never yet managed to apply this fully on a recurring basis.

In my ideal world, the search sample is small enough that users can restrict
their searches to it.
Users then check the correctness of each result (say, the first 10) for each
query, from which precision and recall can be read off. Users often add
comments along the way, e.g. about missing matches. All of this is collected
on a wiki page.
The first samples generally do not exercise enough of the features; this gets
adjusted in a dialogue with the users.

As a developer, I review each run of the test suite and plan the next
adjustments.
The numeric approach makes it easy to compute mean precision and mean recall,
which is good for reporting.
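
For the record, the arithmetic behind the report is nothing more than this.
A plain-Java sketch with invented query names and document ids; "returned" is
the first 10 hits per query, "relevant" is what the users marked as correct:

import java.util.*;

public class MeanPrecisionRecall {
    public static void main(String[] args) {
        // first 10 ids returned per query (invented data)
        Map<String, List<String>> returned = new HashMap<String, List<String>>();
        returned.put("pythagoras", Arrays.asList("doc1", "doc7", "doc9"));
        // ids the users judged relevant per query (invented data)
        Map<String, Set<String>> relevant = new HashMap<String, Set<String>>();
        relevant.put("pythagoras",
            new HashSet<String>(Arrays.asList("doc1", "doc9", "doc4")));

        double sumPrecision = 0, sumRecall = 0;
        for (String q : returned.keySet()) {
            List<String> res = returned.get(q);
            Set<String> rel = relevant.get(q);
            int good = 0;
            for (String id : res) {
                if (rel.contains(id)) good++;
            }
            sumPrecision += res.isEmpty() ? 0 : (double) good / res.size(); // precision over the first 10
            sumRecall    += rel.isEmpty() ? 0 : (double) good / rel.size(); // recall w.r.t. the judged set
        }
        System.out.println("mean precision: " + sumPrecision / returned.size());
        System.out.println("mean recall:    " + sumRecall / returned.size());
    }
}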

My best reference for precision-and-recall testing and other forms of testing
is Kavi Mahesh's "Text Retrieval Quality: A Primer":
http://www.oracle.com/technetwork/database/enterprise-edition/imt-quality-092464.html

I would love to hear more about what other users have been doing.

paul


On 6 Apr 2011, at 08:10, Mark Mandel wrote:

> Hey guys,
> 
> I'm wondering how people are managing regression testing, in particular with
> things like text based search.
> 
> I.e. if you change how fields are indexed or change boosts in dismax,
> ensuring that doesn't mean that critical queries are showing bad data.
> 
> The obvious answer to me was using unit tests. These may be brittle as some
> index data can change over time, but I couldn't think of a better way.
> 
> How is everyone else solving this problem?
> 
> Cheers,
> 
> Mark
> 
> -- 
> E: mark.man...@gmail.com
> T: http://www.twitter.com/neurotic
> W: www.compoundtheory.com
> 
> cf.Objective(ANZ) - Nov 17, 18 - Melbourne Australia
> http://www.cfobjective.com.au
> 
> Hands-on ColdFusion ORM Training
> www.ColdFusionOrmTraining.com
