Re: java.lang.OutOfMemoryError indexing xlsm and xlsx file
Hi Mario, could you please share your settings (e.g. OS, JVM memory, System memory)? Andrea On 27/07/18 11:36, Bisonti Mario wrote: Hallo I obtain the error indexing a .xlsm or .xlsx file of 11 MB What could I do? Thanks a lot Mario 2018-07-27 11:08:25.634 WARN (qtp1521083627-99) [ x:core_share] o.e.j.s.HttpChannel /solr/core_share/update/extract java.lang.OutOfMemoryError at java.base/java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:188) at java.base/java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:180) at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:147) at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:660) at java.base/java.lang.StringBuilder.append(StringBuilder.java:195) at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:302) at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270) at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46) at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82) at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140) at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287) at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279) at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306) at org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler.run(OOXMLTikaBodyPartHandler.java:147) at org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.handleEndOfRun(OOXMLWordAndPowerPointTextHandler.java:468) at org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.endElement(OOXMLWordAndPowerPointTextHandler.java:450) at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136) at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136) at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1714) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2879) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:532) at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888) at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824) at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213) at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635) at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:324) at java.xml/javax.xml.parsers.SAXParser.parse(SAXParser.java:197) at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleGeneralTextContainingPart(AbstractOOXMLExtractor.java:506) at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.processShapes(XSSFExcelExtractorDecorator.java:279) at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:185) at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:135) at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.getXHTML(XSSFExcelExtractorDecorator.java:120) at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:143) at org.apac
java.lang.OutOfMemoryError indexing xlsm and xlsx file
Hallo I obtain the error indexing a .xlsm or .xlsx file of 11 MB What could I do? Thanks a lot Mario 2018-07-27 11:08:25.634 WARN (qtp1521083627-99) [ x:core_share] o.e.j.s.HttpChannel /solr/core_share/update/extract java.lang.OutOfMemoryError at java.base/java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:188) at java.base/java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:180) at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:147) at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:660) at java.base/java.lang.StringBuilder.append(StringBuilder.java:195) at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:302) at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270) at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46) at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82) at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140) at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287) at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279) at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306) at org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler.run(OOXMLTikaBodyPartHandler.java:147) at org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.handleEndOfRun(OOXMLWordAndPowerPointTextHandler.java:468) at org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.endElement(OOXMLWordAndPowerPointTextHandler.java:450) at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136) at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136) at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1714) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2879) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:532) at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888) at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824) at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213) at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635) at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:324) at java.xml/javax.xml.parsers.SAXParser.parse(SAXParser.java:197) at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleGeneralTextContainingPart(AbstractOOXMLExtractor.java:506) at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.processShapes(XSSFExcelExtractorDecorator.java:279) at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:185) at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:135) at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.getXHTML(XSSFExcelExtractorDecorator.java:120) at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:143) at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:106) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.ti