A few of us who are interested in an Open Relevance assessment project (à la TREC) have started to put some thoughts down on "paper" over at http://wiki.apache.org/lucene-java/OpenRelevance

If you'd like to participate (what that actually means is still TBD) in developing a set of open collections, queries, and assessments for relevance testing, let's discuss here and on that Wiki page.

The basic gist is that we'd like to crawl Creative Commons and/or other free content and redistribute it along with queries and judgments. That would fuel the testing needed to further improve Lucene's search quality and, of course, provide the means for a completely open assessment process in which anyone can participate without having to fork over money to license 20-year-old copyrighted news articles that have no value other than for testing.

At this point, we're open to a lot of ideas. Once things solidify a bit, we'd like to make it an official Lucene subproject and get our own resources, as well as figure out how to crawl and host the content using ASF infrastructure (without upsetting the ASF infra team!).

