On Dec 7, 2007, at 3:01 PM, Mark Miller wrote:

Yes, and even if they did not use the stock defaults, I would bet there would be complaints about what was done wrong at every turn. This seems like a very difficult thing to do. How long does it take to fully learn how to correctly use each search engine for the task at hand? Longer, I am sure, than these busy men could possibly spare. It seems that such a comparison could only be done legitimately if experts for each search engine set up the indexing/searching processes. Even then, the results seem like they could be difficult to measure. For example, was each search engine configured to break only on spaces when indexing and to do nothing else special at all? So many small settings, and so much knowledge, are needed to ensure each engine is on level ground...

This is why I have called on NIST/TREC to open source their collections. Until then, Lucene and the other open source search engines will be reliant on those contributors who have access to them, which is spotty at best. (And, yes, I know, TREC is not the be-all and end-all of IR evaluations, but it is a common ground for doing research.) See http://www.gossamer-threads.com/lists/lucene/java-dev/52022?search_string=TREC;#52022

-Grant

