On Nov 13, 2006, at 1:51 PM, Chris Hostetter wrote:
That reminds me ... i seem to remember someone saying once that Nutch also
builds word-based n-grams out of its stop words, so searches on "the"
or "on" won't match anything because those words are never indexed as
single tokens, but if a document contains "the dog in the house" it would
match a search on "in the" because the Analyzer would treat that as a
single token "in_the".


Yup.... we covered this in LIA:

        <http://lucenebook.com/search?query=nutch+stop+words>
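
For anyone who wants to see the idea in miniature, here is a rough sketch of the behavior Chris describes. It is not Nutch's actual Analyzer: the stop list, the function names (index_tokens, matches), and the rule that a pair is emitted whenever either neighbor is a stop word are all assumptions made for illustration.

```python
# Hypothetical stop list, for illustration only (not Nutch's real list).
STOP_WORDS = {"the", "in", "on", "a", "of"}


def index_tokens(text, stop_words=STOP_WORDS):
    """Return the tokens that would be indexed for `text`.

    Stop words are never emitted on their own. Any adjacent pair of
    words in which at least one is a stop word is emitted as a single
    joined token such as "in_the" (an assumed pairing rule).
    """
    words = text.lower().split()
    tokens = []
    for i, word in enumerate(words):
        if word not in stop_words:
            tokens.append(word)  # ordinary words are indexed as-is
        if i + 1 < len(words):
            nxt = words[i + 1]
            if word in stop_words or nxt in stop_words:
                tokens.append(word + "_" + nxt)  # word-based bigram
    return tokens


def matches(query, document):
    """True if every token produced for the query occurs in the document."""
    doc_tokens = set(index_tokens(document))
    query_tokens = index_tokens(query)
    # A query made up of a lone stop word produces no tokens, so it matches nothing.
    return bool(query_tokens) and all(t in doc_tokens for t in query_tokens)


if __name__ == "__main__":
    doc = "the dog in the house"
    print(index_tokens(doc))
    print(matches("the", doc), matches("in the", doc))
```

With this sketch, a search on "the" or "on" produces no tokens and so matches nothing, while "in the" becomes the single token "in_the" and matches the example document.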

