Paul Libbrecht <p...@hoplahup.net> wrote:

> I did several changes of this sort and the precision and recall
> measures went better in particular in presence of language-indication
> failure which happened to be very common in our authoring environment.

There are two kinds of failures: no language, or wrong language. For no
language, I fall back to StandardAnalyzer, so I should have results
similar to yours. For wrong language, well, I'm using off-the-shelf (OTS)
trigram-based language guessers, and they're pretty good these days.

> >> Wouldn't it be better to prefer precise matches (a field that is
> >> analyzed with StandardAnalyzer for example) but also allow matches
> >> that are stemmed?

Yes, I think it might improve things, but again, by how much? Stemming is
better than no stemming, in terms of recall. But this approach would also
improve precision.

Bill
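
To make the "prefer precise matches, but also allow stemmed ones" idea
concrete, here is a toy sketch in plain Python. It does not use the Lucene
API. The names tokenize, toy_stem, analyze and score are hypothetical, the
suffix stripper is only a stand-in for a real stemmer, the query is assumed
to be English, and the 2.0 boost is an arbitrary illustrative value. When
no language is known, the "stemmed" field simply holds the unstemmed
tokens, which mimics falling back to a language-neutral analyzer.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_stem(token):
    """Crude suffix stripper; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text, language):
    """Return (exact_terms, stemmed_terms) for one document.

    With an unknown language (None) the stemmed field falls back to the
    unstemmed tokens.
    """
    tokens = tokenize(text)
    exact = set(tokens)
    if language == "en":
        stemmed = {toy_stem(t) for t in tokens}
    else:
        stemmed = set(exact)
    return exact, stemmed


def score(query, doc_text, doc_language, exact_boost=2.0):
    """Score a document: exact hits count more than stemmed-only hits."""
    exact, stemmed = analyze(doc_text, doc_language)
    total = 0.0
    for token in tokenize(query):
        if token in exact:
            total += exact_boost      # precise match, preferred
        elif toy_stem(token) in stemmed:
            total += 1.0              # stemmed match, still allowed
    return total


query = "indexing documents"
print(score(query, "Indexing documents with Lucene", "en"))
print(score(query, "How to index a document", "en"))
print(score(query, "Indexes documents quickly", "en"))
print(score(query, "Indexes documents quickly", None))
```

In this sketch the document that contains the query words verbatim ranks
first, documents that match only after stemming still get a lower score,
and the document with no language indication keeps its exact match but
loses the stemmed one.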