On 10/24/06, Andreas Korth <[EMAIL PROTECTED]> wrote:
>
> On 24.10.2006, at 23:28, Scott Persinger wrote:
>
> > I am seeing trouble with searches for 'you' not returning anything. It
> > appears that 'you' is a stop word to the standard analyzer:
>
> > I assumed from the docs that StandardAnalyzer was using stop words
> > as defined by:
> >
> > Ferret::Analysis::ENGLISH_STOP_WORDS
> >
> > I don't see 'you' in there.
>
> StandardAnalyzer actually uses
> Ferret::Analysis::FULL_ENGLISH_STOP_WORDS by default. (Note the 'FULL_')

My apologies. This was fixed in the documentation a while ago. I
just haven't updated the docs on the Ferret homepage for a while.

> > Supplying my own stop words seems to fix the problem:
>
> Standard stop words are just a one-size-fits-all reasonable default.
> For maximum control you should always supply your own list of stop
> words.

> > I am running the latest Windows build, but I've seen the same behavior
> > on Linux with the latest builds. I am happy with my solution, but it
> > seems odd that 'you' should be a standard stop word.
>
> Depends on how you look at it. 'You' is definitely not the least
> adequate candidate for a stop word. Then again, it's not included in
> Ferret::Analysis::ENGLISH_STOP_WORDS.
>
> Cheers,
> Andy

Thanks Andy. Actually the reason for the two English stop-word lists
is that they come from two different sources. ENGLISH_STOP_WORDS is
the list taken from Lucene. FULL_ENGLISH_STOP_WORDS is taken from
Martin Porter's website[1]. I hope that clears things up a little. You
are quite right in saying you should probably use your own list of
stop words for best results.
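
To illustrate why the choice of list matters, here is a rough sketch in
plain Ruby. It does not call Ferret and is not how Ferret's analyzers are
implemented. SHORT_STOP_WORDS, LONG_STOP_WORDS and analyze are
hypothetical names, and the word lists are abbreviated stand-ins for
ENGLISH_STOP_WORDS and FULL_ENGLISH_STOP_WORDS, not copies of them.

```ruby
require 'set'

# Hypothetical, abbreviated lists for illustration only. The real
# constants in Ferret::Analysis are longer and may differ.
SHORT_STOP_WORDS = Set.new(%w[a an and are as at be but by for if in is it of on or the to with])
LONG_STOP_WORDS  = SHORT_STOP_WORDS | Set.new(%w[i me my we our you your])

# Lowercase the text, split it into word tokens, and drop every token
# that appears in the given stop-word set.
def analyze(text, stop_words)
  text.downcase.scan(/[a-z0-9']+/).reject { |token| stop_words.include?(token) }
end

text = "Thank you for the fish"

p analyze(text, SHORT_STOP_WORDS)  # 'you' is kept as a token
p analyze(text, LONG_STOP_WORDS)   # 'you' is dropped before indexing
p analyze(text, Set.new)           # empty custom list: nothing is dropped
```

A token that is dropped at analysis time never reaches the index, so a
search for it cannot match anything. Passing your own list to the
analyzer avoids that surprise.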

Cheers,
Dave

[1] http://snowball.tartarus.org/
