Hi,

We're using Solr running on Tomcat with 1GB in production, and lately we've been getting a large number of OutOfMemory errors. From what I can tell, they come from the Tika extraction of the content. I've processed the Java heap dump with a memory analyzer, and the result is pretty clear, at least as to the class involved. It looks like a leak to me: we don't parse any files larger than 20MB, yet these objects are taking up ~700MB.
I've attached two screenshots from the tool (not sure if the list passes attachments through). To summarize:

Class                                            Objects    Used Heap Size   Retained Heap Size
org.apache.xmlbeans.impl.store.Xob$ElementXObj   838,993    80,533,728       604,606,040
org.apache.poi.openxml4j.opc.ZipPackage          2          112              87,009,848
char[]                                           587        32,216,960       38,216,950

We're really desperate to find a solution to this. Any ideas or help would be greatly appreciated.

Wayne