Dear All,

as far as the production of a large number of relevance assessments is
concerned, you could have a look at the TREC Million Query Track (
http://ciir.cs.umass.edu/research/million/, http://trec.nist.gov/pubs/trec17/t17_proceedings.html).

As far as the production of a test collection in an interactive way is concerned, you could look at:

Cormack et al., "Efficient construction of large test collections", SIGIR 1998, http://doi.acm.org/10.1145/290941.291009

Sanderson & Joho, "Forming test collections with no system pooling", SIGIR 2004, http://doi.acm.org/10.1145/1008992.1009001

Regarding the creation of pools (and the sampling of collections) targeted at a specific metric, you could have a look at:

Aslam et al., "A statistical method for system evaluation using incomplete judgments", SIGIR 2006, http://doi.acm.org/10.1145/1148170.1148263

Finally, a system that may be of interest to you is DIRECT (Distributed Information Retrieval Evaluation Campaign Tool), which we built to manage the CLEF evaluation campaigns. Among other things, it supports interactive topic creation by searching the document collections (incidentally, we use Lucene for this) and interactive relevance assessment. You can find some information about DIRECT at: http://www.trebleclef.eu/getfile.php?id=75

All the best,
Nicola Ferro

----------------------------------------------------------------------------------
      Nicola Ferro   -   Ph.D. in Computer Science
      Assistant Professor

      Department of Information Engineering (DEI)
      University of Padua
      Via Gradenigo, 6/A  -  35131 Padova - Italy
      Tel +39 049 827 7939  Fax: +39 049 827 7799

      skype: nicola.ferro
      e-mail: [email protected]
      home page: http://ims.dei.unipd.it/members/ferro/
----------------------------------------------------------------------------------
