On Jun 1, 2008, at 10:53 PM, "Cloud Zhang" <[EMAIL PROTECTED]> wrote:
Thanks a lot for this very detailed guide. I'll forward it to the
Chinese Python community, since the first thing a Chinese developer
looks for in Lucene is a tokenizer for Chinese, and they then get stuck
importing a jar...
Isn't there a Chinese analyzer already shipped with Java Lucene in
contrib/analyzers?
That contrib is already built into PyLucene.
Andi..
On Mon, Jun 2, 2008 at 1:15 PM, Andi Vajda <[EMAIL PROTECTED]>
wrote:
On Mon, 2 Jun 2008, Cloud Zhang wrote:
Adding a new analyzer (in jar form) in Java is really straightforward,
but when I tried to add one to PyLucene, I found no way to refer to the
jar package.
I went through the build process of PyLucene and guess that maybe I
could:
* put the analyzer source under
  PyLucene-2.3.2-1/lucene-java-2.3.2/contrib/analyzers/src/java/, and
  recompile Lucene, then PyLucene
or
* put the analyzer jar somewhere in the build folder, add it to the
  Makefile, then recompile PyLucene
Could these work? Or is there another solution that is as
straightforward as setting CLASSPATH in Java?
To access your class(es) by name from Python, you must have JCC
generate wrappers for it (them). This is what is done from line 177
onward in PyLucene's Makefile. The easiest way for you to add your own
Java classes to PyLucene is to create another jar file with your own
analyzer classes and code and add it to the JCC invocation there.
For example, the Makefile snippet in question currently says:
GENERATE=$(JCC) $(foreach jar,$(JARS),--jar $(jar)) \
    --package java.lang java.lang.System \
                        java.lang.Runtime \
    --package java.util \
              java.text.SimpleDateFormat \
    --package java.io java.io.StringReader \
                      java.io.InputStreamReader \
                      java.io.FileInputStream \
    --exclude org.apache.lucene.queryParser.Token \
    --exclude org.apache.lucene.queryParser.TokenMgrError \
    --exclude org.apache.lucene.queryParser.QueryParserTokenManager \
    --exclude org.apache.lucene.queryParser.ParseException \
    --python lucene \
    --mapping org.apache.lucene.document.Document 'get:(Ljava/lang/String;)Ljava/lang/String;' \
    --mapping java.util.Properties 'getProperty:(Ljava/lang/String;)Ljava/lang/String;' \
    --sequence org.apache.lucene.search.Hits 'length:()I' 'doc:(I)Lorg/apache/lucene/document/Document;' \
    --version $(LUCENE_VER) \
    --files $(NUM_FILES)
change the first line to say:
GENERATE=$(JCC) $(foreach jar,$(JARS),--jar $(jar)) --jar myjar.jar \
...
and rebuild PyLucene. That should be all you need to do. Your jar
file will be installed along with Lucene's in the lucene egg and will
be put on lucene.CLASSPATH, which you use with lucene.initVM().
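As a rough illustration of the Python side once PyLucene has been rebuilt, a minimal sketch might look like the following. The function name start_vm_and_load and the class name MyChineseAnalyzer are hypothetical placeholders (they are not part of PyLucene), and the sketch assumes the rebuilt lucene module is importable.

```python
def start_vm_and_load(class_name="MyChineseAnalyzer"):
    """Start the JVM with PyLucene's classpath and return a wrapped class.

    class_name is a hypothetical simple name of a class from the jar that
    was added to the JCC invocation with --jar.
    """
    # Imported lazily so that this sketch can be loaded without PyLucene.
    import lucene

    # lucene.CLASSPATH is expected to include the jars installed in the
    # lucene egg, including the extra jar given to JCC at build time.
    lucene.initVM(lucene.CLASSPATH)

    # Assumption: wrapped classes are reachable as attributes of the lucene
    # module under their simple (unqualified) names.
    return getattr(lucene, class_name)
```

Calling start_vm_and_load() should then return the wrapper class for the analyzer, which can be instantiated like any other PyLucene class; if the class was not wrapped, getattr raises AttributeError.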
Your classes can be declared in any Java package you want. Just make
sure that their names don't clash with other Lucene class names that
you also need to use, as the class namespace is flattened in PyLucene.
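To check for such clashes before rebuilding, something like the sketch below could help. It only compares simple class names in a list you supply and does not inspect any jar; the com.example class names are hypothetical.

```python
from collections import defaultdict


def find_flattened_clashes(qualified_names):
    """Return simple class names shared by more than one qualified name."""
    by_simple = defaultdict(set)
    for name in qualified_names:
        # The simple name is the part after the last dot of the package path.
        simple = name.rsplit(".", 1)[-1]
        by_simple[simple].add(name)
    # Keep only simple names that map to two or more distinct classes.
    return {s: sorted(n) for s, n in by_simple.items() if len(n) > 1}


names = [
    "org.apache.lucene.analysis.Token",
    "org.apache.lucene.analysis.standard.StandardAnalyzer",
    "com.example.analysis.Token",            # hypothetical class in your jar
    "com.example.analysis.ChineseAnalyzer",  # hypothetical class in your jar
]
clashes = find_flattened_clashes(names)
print(clashes)
```

Here the two Token classes would collide in a flattened namespace, so one of them would need to be renamed or excluded.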
For more information about JCC and its command line args see JCC's
README file at [1].
Andi..
[1] http://svn.osafoundation.org/pylucene/trunk/jcc/jcc/README
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
--
Cheers,
Cloud