On 10/24/06, Andreas Korth <[EMAIL PROTECTED]> wrote:
>
> On 24.10.2006, at 23:28, Scott Persinger wrote:
>
> > I am seeing trouble with searches for 'you' not returning anything. It
> > appears that 'you' is a stop word to the standard analyzer:
>
> > I assumed from the docs that StandardAnalyzer was using stop words
> > as defined by:
> >
> > Ferret::Analysis::ENGLISH_STOP_WORDS
> >
> > I don't see 'you' in there.
>
> StandardAnalyzer actually uses
> Ferret::Analysis::FULL_ENGLISH_STOP_WORDS by default. (Note the 'FULL_')

My apologies. This was fixed in the documentation a while ago. I just
haven't updated the docs on the Ferret homepage for a while.

> > Supplying my own stop words seems to fix the problem:
>
> Standard stop words are just a one-size-fits-all reasonable default.
> For maximum control you should always supply your own list of stop
> words.
>
> > I am running the latest Windows build, but I've seen the same behavior
> > on Linux with the latest builds. I am happy with my solution, but it
> > seems odd that 'you' should be a standard stop word.
>
> Depends on how you look at it. 'You' is definitely not the least
> adequate candidate for a stop word. Then again, it's not included in
> Ferret::Analysis::ENGLISH_STOP_WORDS.
>
> Cheers,
> Andy

Thanks Andy. Actually, the reason there are two English stop-word lists
is that they come from two different sources. ENGLISH_STOP_WORDS is the
list taken from Lucene. FULL_ENGLISH_STOP_WORDS is taken from Martin
Porter's website[1]. I hope that clears things up a little. You are
quite right in saying you should probably use your own list of stop
words for best results.

Cheers,
Dave

[1] http://snowball.tartarus.org/
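
To illustrate why the choice of stop-word list decides whether a search for 'you' can match, here is a minimal sketch in plain Ruby. It is not Ferret's implementation: the `analyze` helper and both word lists are hypothetical, heavily abbreviated stand-ins for the two constants discussed above, assuming only that the fuller list contains 'you' and the shorter one does not.

```ruby
# Hypothetical, abbreviated lists for illustration only. These are NOT the
# real contents of Ferret::Analysis::ENGLISH_STOP_WORDS or
# Ferret::Analysis::FULL_ENGLISH_STOP_WORDS.
SHORT_STOP_WORDS = %w[a an and the of to].freeze
FULL_STOP_WORDS  = (SHORT_STOP_WORDS + %w[you your i me we]).freeze

# Hypothetical helper: lower-case the text, split it into word tokens,
# and drop every token that appears in the given stop-word list.
def analyze(text, stop_words)
  text.downcase.scan(/[a-z0-9']+/).reject { |token| stop_words.include?(token) }
end

query = "You and the index"

p analyze(query, FULL_STOP_WORDS)   # 'you' is filtered out along with 'and', 'the'
p analyze(query, SHORT_STOP_WORDS)  # 'you' survives and can be searched for
```

With the fuller list the token 'you' never reaches the index or the query, so a search for it returns nothing. Supplying your own list, as suggested above, is what keeps it searchable.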

