Thanks for the update! Did the Tika people say when 1.19 will be released?
Karl On Wed, Aug 8, 2018 at 8:29 AM Bisonti Mario <mario.biso...@vimar.com> wrote: > Hallo > > You had right, Karl. > > > > I have been helped by the tika people and they patched the tika jar of the > solr installation and the problem was solved! > > > > Now I solved using the tika 1.19 versions nightly build. > > > > > > Thanks a lot. > > > > > > > > *Da:* Karl Wright <daddy...@gmail.com> > *Inviato:* venerdì 27 luglio 2018 12:39 > *A:* user@manifoldcf.apache.org > *Oggetto:* Re: Job stuck internal http error 500 > > > > I am afraid you will need to open a Tika ticket, and be prepared to attach > your file to it. > > > > Thanks, > > > > Karl > > > > > > On Fri, Jul 27, 2018 at 6:04 AM Bisonti Mario <mario.biso...@vimar.com> > wrote: > > It isn’t a memory problem because xls file bigger (30MB) have been > processed. > > > > This file xlsm with many colors etc hang > > I could suppose that it is a tika/solr erro but I don’t know how to solve > it > > ☹ > > > > *Oggetto:* R: Job stuck internal http error 500 > > > > Yes, I am using: > /opt/manifoldcf/multiprocess-file-example-proprietary > I set: > > sudo nano options.env.unix > > -Xms2048m > > -Xmx2048m > > > > But I obtain the same error. > > My doubt is that it could be a solr/tika problem. > > What could I do? > > I restrict the scan to a single file and I obtain the same error > > > > > > > > *Da:* Karl Wright <daddy...@gmail.com> > *Inviato:* venerdì 27 luglio 2018 11:36 > *A:* user@manifoldcf.apache.org > *Oggetto:* Re: Job stuck internal http error 500 > > > > I am presuming you are using the examples. If so, edit the options file > to grant more memory to you agents process by increasing the Xmx value. > > > > Karl > > > > On Fri, Jul 27, 2018, 3:04 AM Bisonti Mario <mario.biso...@vimar.com> > wrote: > > Hallo. > > My job is stucking indexing an xlsx file of 38MB > > > > What could I do to solve my problem? > > > > In the following there is the error: > 2018-07-27 08:55:15.562 WARN (qtp1521083627-52) [ x:core_share] > o.e.j.s.HttpChannel /solr/core_share/update/extract > > java.lang.OutOfMemoryError > > at > java.base/java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:188) > > at > java.base/java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:180) > > at > java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:147) > > at > java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:660) > > at java.base/java.lang.StringBuilder.append(StringBuilder.java:195) > > at > org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:302) > > at > org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) > > at > org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270) > > at > org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) > > at > org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) > > at > org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) > > at > org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46) > > at > org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82) > > at > org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140) > > at > org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287) > > at > org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279) > > at > org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306) > > at > org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler.run(OOXMLTikaBodyPartHandler.java:147) > > at > org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.handleEndOfRun(OOXMLWordAndPowerPointTextHandler.java:468) > > at > org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.endElement(OOXMLWordAndPowerPointTextHandler.java:450) > > at > org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136) > > at > org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136) > > at > java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609) > > at > java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1714) > > at > java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2879) > > at > java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602) > > at > java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112) > > at > java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:532) > > at > java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888) > > at > java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824) > > at > java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) > > at > java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213) > > at > java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635) > > at > java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:324) > > at java.xml/javax.xml.parsers.SAXParser.parse(SAXParser.java:197) > > at > org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleGeneralTextContainingPart(AbstractOOXMLExtractor.java:506) > > at > org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.processShapes(XSSFExcelExtractorDecorator.java:279) > > at > org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:185) > > at > org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:135) > > at > org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.getXHTML(XSSFExcelExtractorDecorator.java:120) > > at > org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:143) > > at > org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:106) > > at > org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) > > at > org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) > > at > org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) > > at > org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228) > > at > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68) > > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195) > > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503) > > at > org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711) > > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517) > > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384) > > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330) > > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629) > > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533) > > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) > > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) > > at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190) > > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595) > > at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188) > > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253) > > at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168) > > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473) > > at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564) > > at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166) > > at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155) > > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) > > at > org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219) > > at > org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126) > > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) > > at > org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335) > > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) > > at org.eclipse.jetty.server.Server.handle(Server.java:530) > > at > org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347) > > at > org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256) > > at > org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279) > > at > org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102) > > at > org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124) > > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247) > > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140) > > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) > > at > org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382) > > at > org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708) > > at > org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626) > > at java.base/java.lang.Thread.run(Thread.java:844) > > > > > >