That looks like a static, final hard-coded limit in UnpackerResource.
Please open an issue to make it configurable.
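
For context, here is a rough sketch of the kind of check I mean. This is not the actual Tika source: the class, constant, and method names below are made up for illustration, and the exception type is simplified to a plain IOException. The one thing taken from your trace is the number: 100 * 1024 * 1024 = 104857600 bytes, so a 132 MB embedded file would trip a cap like this, while the same file sent on its own never goes through the unpacker's embedded-file path.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only; names are not taken from Tika.
class BoundedCopySketch {

    // 100 MiB, i.e. the 104857600 reported in the stack trace.
    static final long MAX_BYTES = 100L * 1024L * 1024L;

    // Reads the whole stream into memory, failing once the byte count
    // goes past maxBytes. The message reports maxBytes + 1 because the
    // reader stops at the bound and never learns the real size.
    static byte[] readBounded(InputStream in, long maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("Tried to allocate " + (maxBytes + 1)
                        + " bytes, but " + maxBytes + " is the maximum allowed.");
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

If the real check works along these lines, the "104857601" in your error is just the limit plus one and not the size of your file. Making the limit configurable would mean turning that constant into a setting.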

‪On Fri, May 12, 2023 at 9:02 AM ‫שי ברק‬‎ <shai...@gmail.com> wrote:‬
>
> Hey,
>
> I have an MP4 file that is 132 MB. When I send it to the Tika server, I get 
> back the data successfully.
> However, when I wrap the same file in a RAR archive, I get the following error:
>
> org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from 
> org.apache.tika.parser.pkg.UnrarParser@36d35f86
>         at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:304) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:195) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.server.core.resource.TikaResource.parse(TikaResource.java:352)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.server.core.resource.UnpackerResource.process(UnpackerResource.java:145)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.server.core.resource.UnpackerResource.unpackAll(UnpackerResource.java:109)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) ~[?:?]
>         at 
> jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  ~[?:?]
>         at 
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:?]
>         at java.lang.reflect.Method.invoke(Method.java:568) ~[?:?]
>         at 
> org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at org.eclipse.jetty.server.Server.handle(Server.java:516) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at java.lang.Thread.run(Thread.java:833) ~[?:?]
> Caused by: java.io.IOException: 
> org.apache.tika.exception.TikaMemoryLimitException: Tried to allocate 
> 104857601 bytes, but 104857600 is the maximum allowed. Please open an issue 
> https://issues.apache.org/jira/projects/TIKA if you believe this file is not 
> corrupt.
>         at 
> org.apache.tika.server.core.resource.UnpackerResource$MyEmbeddedDocumentExtractor.parseEmbedded(UnpackerResource.java:184)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.parser.pkg.UnrarParser.processFile(UnrarParser.java:136) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.parser.pkg.UnrarParser.processDirectory(UnrarParser.java:121) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at org.apache.tika.parser.pkg.UnrarParser.parse(UnrarParser.java:105) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         ... 39 more
> Caused by: org.apache.tika.exception.TikaMemoryLimitException: Tried to 
> allocate 104857601 bytes, but 104857600 is the maximum allowed. Please open 
> an issue https://issues.apache.org/jira/projects/TIKA if you believe this 
> file is not corrupt.
>         at 
> org.apache.tika.server.core.resource.UnpackerResource$MyEmbeddedDocumentExtractor.parseEmbedded(UnpackerResource.java:184)
>  ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.parser.pkg.UnrarParser.processFile(UnrarParser.java:136) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.parser.pkg.UnrarParser.processDirectory(UnrarParser.java:121) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at org.apache.tika.parser.pkg.UnrarParser.parse(UnrarParser.java:105) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>         at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298) 
> ~[tika-server-standard-2.7.0.jar:2.7.0]
>
> I attached my tika-config.xml and would like ideas on what I should do to 
> solve this issue.
>
> Thx,
> Shay.
