I get an OutOfMemoryError: Java heap space after indexing less than 40,000 documents. Here are the details:

    PyLucene 2.2.0-2 (JCC)
    Ubuntu 7.10 64-bit, 4GB, Core 2 Duo
    Python 2.5.1
I am starting Lucene with the following:

    lucene.initVM(lucene.CLASSPATH, maxheap='2048m')

For mergeFactor I've tried everything from 10 to 10,000; maxMergeDocs and maxBufferedDocs are at their defaults.

I believe the problem somehow stems from a filter I've written that turns tokens into bigrams: for each input token it returns two tokens, the original token and a new token created by concatenating the text of the current and previous tokens. These bigrams add a lot of unique terms, but I didn't think that would be a problem (aren't they all flushed out to disk?).

Any ideas or suggestions would be greatly appreciated.

-brian
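In case it helps, the filter does roughly the following (a simplified sketch rather than the actual code; it assumes PyLucene's PythonTokenFilter/PythonAnalyzer extension classes and the old Token-returning next() API, and the class names and '_' joiner are just illustrative):

    from lucene import PythonTokenFilter, PythonAnalyzer, Token
    from lucene import StandardTokenizer, LowerCaseFilter

    class BigramFilter(PythonTokenFilter):
        # Emits each token, then a second token made from the previous
        # token's text concatenated with the current one.

        def __init__(self, tokenStream):
            super(BigramFilter, self).__init__(tokenStream)
            self.stream = tokenStream
            self.prevText = None   # text of the previous token
            self.pending = None    # bigram token queued for the next call

        def next(self):
            # Hand out a queued bigram before pulling more input.
            if self.pending is not None:
                token, self.pending = self.pending, None
                return token

            token = self.stream.next()
            if token is None:
                return None

            text = token.termText()
            if self.prevText is not None:
                bigram = Token(self.prevText + '_' + text,
                               token.startOffset(), token.endOffset())
                bigram.setPositionIncrement(0)   # stack on the same position
                self.pending = bigram
            self.prevText = text
            return token

    class BigramAnalyzer(PythonAnalyzer):
        # How the filter is wired into the indexing chain.
        def tokenStream(self, fieldName, reader):
            return BigramFilter(LowerCaseFilter(StandardTokenizer(reader)))

In the sketch the bigram token gets position increment 0, so it stacks on the same position as the original token.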
