Re: Benchmarking on GOV2

2006-05-29 Thread Sebastiano Vigna
On Mon, 2006-05-29 at 14:35 -1000, Chuck Williams wrote:
> I'm not sure what form you would like that help to take, but here are a
> couple high-level points imho:

Help in configuring Lucene so that it uses all resources available, and so that the results returned are identical to all other engin

Re: Benchmarking on GOV2

2006-05-29 Thread Sebastiano Vigna
On Mon, 2006-05-29 at 17:33 +0800, Dave Kor wrote:
> I was wondering if you have seen the TREC 2004 paper by Giuseppe
> Attardi, Andrea Esuli and Chirag Pate from the University of Pisa,
> Italy, titled "Using Clustering and Blade Clusters in the TeraByte
> task"? http://trec.nist.gov/pubs/trec13/

Benchmarking on GOV2

2006-05-29 Thread Sebastiano Vigna
Dear Lucene developers,

I'd be interested in doing some benchmarking on (at least) Lucene, Egothor and MG4J. There is no actual data around on publicly available collections, and it would be nice to have some more objective data on efficiency for a significantly large collection. We have GOV2 (25M