Hello,

You were right, Karl. The Tika people helped me: they patched the Tika jar in the Solr installation, and the problem was solved!
I have now fixed it by using a nightly build of Tika 1.19.

Thanks a lot.

From: Karl Wright <[email protected]>
Sent: Friday, 27 July 2018 12:39
To: [email protected]
Subject: Re: Job stuck internal http error 500

I am afraid you will need to open a Tika ticket, and be prepared to attach your file to it.

Thanks,
Karl

On Fri, Jul 27, 2018 at 6:04 AM Bisonti Mario <[email protected]> wrote:

It isn't a memory problem, because bigger xls files (30MB) have been processed.
This xlsm file, with many colors etc., hangs.
I suppose it is a Tika/Solr error, but I don't know how to solve it ☹

Subject: Re: Job stuck internal http error 500

Yes, I am using:
/opt/manifoldcf/multiprocess-file-example-proprietary

I set:
sudo nano options.env.unix
-Xms2048m
-Xmx2048m

But I get the same error.
I suspect it could be a Solr/Tika problem. What could I do?
I restricted the scan to a single file and I get the same error.

From: Karl Wright <[email protected]>
Sent: Friday, 27 July 2018 11:36
To: [email protected]
Subject: Re: Job stuck internal http error 500

I am presuming you are using the examples. If so, edit the options file to grant more memory to your agents process by increasing the Xmx value.

Karl

On Fri, Jul 27, 2018, 3:04 AM Bisonti Mario <[email protected]> wrote:

Hello.

My job gets stuck while indexing an xlsx file of 38MB.
What could I do to solve my problem?
The error is the following:

2018-07-27 08:55:15.562 WARN (qtp1521083627-52) [ x:core_share] o.e.j.s.HttpChannel /solr/core_share/update/extract
java.lang.OutOfMemoryError
    at java.base/java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:188)
    at java.base/java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:180)
    at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:147)
    at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:660)
    at java.base/java.lang.StringBuilder.append(StringBuilder.java:195)
    at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:302)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
    at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
    at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
    at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
    at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
    at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279)
    at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler.run(OOXMLTikaBodyPartHandler.java:147)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.handleEndOfRun(OOXMLWordAndPowerPointTextHandler.java:468)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.endElement(OOXMLWordAndPowerPointTextHandler.java:450)
    at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
    at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1714)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2879)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
    at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:532)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
    at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:324)
    at java.xml/javax.xml.parsers.SAXParser.parse(SAXParser.java:197)
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleGeneralTextContainingPart(AbstractOOXMLExtractor.java:506)
    at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.processShapes(XSSFExcelExtractorDecorator.java:279)
    at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:185)
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:135)
    at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.getXHTML(XSSFExcelExtractorDecorator.java:120)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:143)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:106)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.eclipse.jetty.server.Server.handle(Server.java:530)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
    at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
    at java.base/java.lang.Thread.run(Thread.java:844)
