On Oct 28, 2009, at 12:09, Bill Janssen <jans...@parc.com> wrote:

Andi Vajda <va...@apache.org> wrote:

The snowball JAR comes from this statement in the Makefile:
SNOWBALL_JAR=$(LUCENE)/build/contrib/snowball/lucene-snowball-$(LUCENE_VER).jar

Which means that it's whatever corresponds to the Lucene version
checked out. For PyLucene 2.9.0, that is:
  http://svn.apache.org/repos/asf/lucene/java/tags/lucene_2_9_0

In other words, this is a question best asked on the
java-u...@lucene.apache.org mailing list as PyLucene doesn't do
anything different (at least intentionally).

I've looked through that set of APIs, and don't see anything useful.
This was more of a brainstorming question for the list...

What could we do in Python to enumerate the list?

import lucene
lucene.initVM(classpath=lucene.CLASSPATH)
for n, v in lucene.__dict__.items():
    if n.endswith("Stemmer"):
        print n, lucene.SnowballProgram.instance_(v)

That checks whether a class object is itself an instance of SnowballProgram, which is probably not what you want. Maybe use isAssignableFrom()?
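
To illustrate the difference between the two checks, here is a minimal plain-Python sketch that does not need PyLucene. The classes are hypothetical stand-ins, not the real Snowball wrappers: isinstance() plays the role of instance_(), and issubclass() plays the role of a class-level test such as isAssignableFrom(). With JCC the real check might look like SnowballProgram.class_.isAssignableFrom(v.class_), but that is an untested assumption.

```python
# Hypothetical stand-ins for the wrapped Java classes; these are NOT
# the real PyLucene/Snowball classes, just an analogy in plain Python.
class SnowballProgram(object):
    pass

class EnglishStemmer(SnowballProgram):
    pass

class StandardAnalyzer(object):
    pass

# Stand-in for lucene.__dict__: names mapped to class objects.
namespace = {
    "EnglishStemmer": EnglishStemmer,
    "StandardAnalyzer": StandardAnalyzer,
}

def is_instance_of_base(obj):
    # Analogous to SnowballProgram.instance_(v): asks whether obj is an
    # *instance* of the base class. A class object is not an instance.
    return isinstance(obj, SnowballProgram)

def is_subclass_of_base(cls):
    # Analogous to an isAssignableFrom()-style check: asks whether cls
    # is the base class or one of its subclasses.
    return isinstance(cls, type) and issubclass(cls, SnowballProgram)

results = {}
for name, value in sorted(namespace.items()):
    results[name] = (is_instance_of_base(value), is_subclass_of_base(value))
    print(name, results[name])
```

The instance check is False for every class object, including the stemmer, which matches the all-False output in the original loop. Only the subclass check tells the stemmer apart from the unrelated class.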

There may be an API in the Snowball library to do this enumeration. I don't know and that's why I suggested asking java-user. Nothing wrong with brainstorming here, of course.

Andi..



ItalianStemmer False
FrenchStemmer False
HungarianStemmer False
LovinsStemmer False
RussianStemmer False
FinnishStemmer False
PortugueseStemmer False
KpStemmer False
BrazilianStemmer False
DanishStemmer False
TurkishStemmer False
DutchStemmer False
SwedishStemmer False
German2Stemmer False
EnglishStemmer False
GermanStemmer False
RomanianStemmer False
PorterStemmer False
NorwegianStemmer False
SpanishStemmer False

Seems to me that this should give different results. Am I using the JCC
"instance_" method improperly?

Bill
