On Jun 28, 2005, at 5:54 PM, Paul Libbrecht wrote:
Hi,
I've been looking around at analyzers for use in Lucene.
Among the contributions, the Snowball projects' output seem quite
nicely usable.
However, right in the box of lucene-1.4.3.jar, there's a
GermanAnalyzer, using a stemmer, and a RussianAnalyzer. Several
other languages can be found in the contribs directory.
Any reason I cannot find an "EnglishAnalyzer" and an EnglishStemmer ?
I don't think the other analyzers I could find (e.g.
StandardAnalyzer) are based on stemmers.
Paul - if stemming is what you're looking for, then grab the
SnowballAnalyzer code from Subversion under contrib/snowball. Or you
could get a binary copy of the JAR from the source code distribution
of Lucene in Action at http://www.lucenebook.com (and see examples of
its use in the sample code there). Lucene's JAR does have a built-in
PorterStemFilter, but none of the built-in analyzers use it, and the
SnowballAnalyzer is more commonly used anyway.
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]