Hello,
I have a "funny" problem with the Jackrabbit indexer mechanism. We store
text files with a file size between 2 and 170 MB. Xmx is set to 850 MB, and the
indexer works without problems when the stream is stored in the Jackrabbit
data store (FileDataStore). Up to this point everything is fine.
But if I delete the workspace index directory so that Jackrabbit rebuilds it on
the next startup, the indexer starts, processes a few files, and then throws
a java.lang.OutOfMemoryError: Java heap space.
Can someone tell me what the difference is between "indexing while storing"
and "(re)indexing at repository startup"?
Thank you very much for any hints,
Best regards,
Ulrich
The stack trace is appended below:
2011-03-16 11:04:36,916 WARN : [LazyTextExtractorField] Failed to extract text from a binary property
java.lang.OutOfMemoryError: Java heap space
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
    at java.lang.StringBuilder.append(StringBuilder.java:190)
    at org.apache.jackrabbit.core.query.lucene.LazyTextExtractorField$ParsingTask.characters(LazyTextExtractorField.java:191)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:153)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:39)
    at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:61)
    at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:113)
    at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:151)
    at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:261)
    at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:132)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:137)
    at org.apache.jackrabbit.core.query.lucene.JackrabbitParser.parse(JackrabbitParser.java:192)
    at org.apache.jackrabbit.core.query.lucene.LazyTextExtractorField$ParsingTask.run(LazyTextExtractorField.java:174)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
    at java.util.concurrent.FutureTask.run(FutureTask.java:123)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:65)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:168)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
    at java.lang.Thread.run(Thread.java:595)
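
For reference, the top frames of the trace show
LazyTextExtractorField$ParsingTask.characters appending the extracted text to a
StringBuilder. Java chars take two bytes each, and a growing builder briefly
holds both the old and the enlarged array, so the text of a 170 MB file may
need several hundred MB of heap for this buffer alone. Below is a minimal
sketch of how such a buffer could be bounded, assuming a plain SAX handler. The
class name BoundedTextHandler and the maxChars limit are hypothetical
illustrations and are not part of Jackrabbit or Tika.

```java
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX handler that collects at most maxChars characters
// instead of buffering the complete extracted text in memory.
final class BoundedTextHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final int maxChars;   // assumed limit, chosen by the caller
    private boolean truncated;

    BoundedTextHandler(int maxChars) {
        if (maxChars < 0) {
            throw new IllegalArgumentException("maxChars must be >= 0");
        }
        this.maxChars = maxChars;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        int remaining = maxChars - buffer.length();
        if (remaining <= 0) {
            // Limit already reached: drop further text and remember that.
            truncated = truncated || length > 0;
            return;
        }
        int n = Math.min(remaining, length);
        buffer.append(ch, start, n);
        if (n < length) {
            truncated = true;
        }
    }

    String getText() {
        return buffer.toString();
    }

    boolean isTruncated() {
        return truncated;
    }
}
```

With a limit in place the indexed text of very large files is cut off, so
full-text queries would only match terms within the first maxChars characters.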