Hi,

Is there any way to get more details of a JavaError in PyLucene? It seems that the method getJavaException() only returns the exception message, but not the full stack trace.
Background: we noticed some strange behaviour in PyLucene (JCC version 2.2) when trying to index an RTF (rich text) document. PyLucene reported

  JavaError: java.lang.IllegalArgumentException: term length 55296 exceeds max term length 16383

and (more remarkably) all attempts to index further documents in that batch job resulted in

  JavaError: java.lang.NullPointerException

in writer.addDocument(doc).

The cause is obviously the content of the RTF document (very long strings), which should have been converted to plain text before indexing. Having looked at the Java source of Lucene, it seems the exception is raised in DocumentsWriter, where the token length is checked. I understand that it doesn't make sense to index terms above a certain length, but I'd expect that either those terms are silently ignored, or at least that indexing further documents still works.

Has anyone encountered this yet or found a workaround? E.g. is it possible to configure Lucene to ignore terms above a specified length altogether (without raising an exception)?

Our current workaround is to catch the JavaError and close the writer before adding further documents. That way the index seems to stay in a valid state.

I'm not sure whether this is a PyLucene-related question, so please excuse me if it is not (I should probably post to the Lucene mailing list in that case). In any case, the JavaError question is certainly PyLucene-related.

Kind regards
Thomas Koch
--
OrbiTeam Software GmbH & Co. KG
http://www.orbiteam.de
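As an illustration of the "ignore over-long terms" idea asked about above, here is a minimal sketch in plain Python that drops over-long tokens from the text before it is handed to the indexer. It is not a Lucene or PyLucene feature: the helper name drop_long_tokens is hypothetical, the limit 16383 is simply taken from the error message and may differ in other Lucene versions, and the sketch assumes the analyzer never produces a term longer than a whitespace-separated token. Python's len() may also count characters differently from Lucene, so a lower limit could be safer.

```python
# Limit copied from the error message in this mail; an assumption for
# any other Lucene version.
MAX_TERM_LENGTH = 16383


def drop_long_tokens(text, max_len=MAX_TERM_LENGTH):
    """Return text without whitespace-separated tokens longer than max_len.

    Hypothetical pre-filter, applied before the text is put into a field.
    Runs of whitespace are collapsed to single spaces as a side effect.
    """
    kept = [tok for tok in text.split() if len(tok) <= max_len]
    return " ".join(kept)
```

The filtered string would then be used as the field content before calling writer.addDocument(doc). This can drop more than strictly necessary, since an analyzer might have split such a token into shorter terms anyway.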
