On Jan 28, 2010, at 11:00 AM, Robert Muir wrote: > in addition to what Grant said, even if your documents are similar, what > about queries? > > For example, if only a few trec queries contain proper names, acronyms, > abbreviations, or whatever, but your users frequently input things like > this, it won't be representative.
+1 > > i will disagree with him on a few things though, I would rather have less > queries (25 or so), but more judgements, definitely a lot more than 10. > Maybe your users only care about the top-10 results but its crucial to judge > some lower-ranking docs too, especially if you have recall problems... Perfectly reasonable as well. I've seen some people who only care about p...@5 and even p...@1 and others who do much more. The important thing is to think about what makes sense for your application and users. Much of this can be found through basic log analysis (assuming an existing system) or some reasoning about use cases (new system) and users (how sophisticated they are, etc.) -Grant --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org