Solr - Tika(?) memory leak

Wayne W Fri, 13 Jan 2012 23:54:37 -0800

Hi,

We're running Solr on Tomcat with 1GB in production, and of late
we've been having a huge number of OutOfMemory issues. From what I
can tell, this is coming from the Tika extraction of the content.
I've processed the Java dump file using a memory analyzer and it's
pretty clear, at least as to the class involved. It seems like a leak
to me, as we don't parse any files larger than 20M, and these objects
are taking up ~700M.


I've attached 2 screen shots from the tool (not sure if you receive
attachments).

But to summarize:

Class                                            Objects   Used Heap Size   Retained Heap Size
org.apache.xmlbeans.impl.store.Xob$ElementXObj   838,993       80,533,728          604,606,040
org.apache.poi.openxml4j.opc.ZipPackage                2              112           87,009,848
char[]                                               587       32,216,960           38,216,950

We're really desperate to find a solution to this - any ideas or help
is greatly appreciated.
Wayne
