[ https://issues.apache.org/jira/browse/TIKA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17682185#comment-17682185 ]
Josh Burchard commented on TIKA-3961:
-------------------------------------

I attached the particular file that I reproduced the problem with.

> When a parser exception happens, the "resourceName" key becomes "esourceName"
> -----------------------------------------------------------------------------
>
> Key: TIKA-3961
> URL: https://issues.apache.org/jira/browse/TIKA-3961
> Project: Tika
> Issue Type: Bug
> Components: core
> Affects Versions: 2.4.1
> Environment: Windows 10. Tika 2.4.1. Tika server.
> Reporter: Josh Burchard
> Priority: Major
> Attachments: encrypted.docx
>
>
> Test env: Windows 10
> Tika 2.4.1, tika server
>
> In my config I've specified:
> <metadataFilter
> class="org.apache.tika.metadata.filter.IncludeFieldMetadataFilter">
> <params>
> <include>
> <field>X-TIKA:content</field>
> <field>dc:creator</field>
> <field>dc:title</field>
> <field>resourceName</field>
> <field>X-TIKA:EXCEPTION:container_exception</field>
> </include>
> </params>
> </metadataFilter>
>
> For a password-protected docx file Tika returns the following (see bold txt
> at the bottom):
> [{"X-TIKA:EXCEPTION:container_exception":"org.apache.poi.EncryptedDocumentException:
> java.security.NoSuchAlgorithmException: Cannot find any provider supporting
> AES/CBC/NoPadding\r\n\tat
> org.apache.poi.poifs.crypt.CryptoFunctions.getCipher(CryptoFunctions[7B14:0002-7080]
> java:274)\r\n\tat
> org.apache.poi.poifs.crypt.CryptoFunctions.getCipher(CryptoFunctions.java:223)\r\n\tat
>
> org.apache.poi.poifs.crypt.agile.AgileDecryptor.hashInput(AgileDecryptor.java:196)\r\n\tat
>
> org.apache.poi.poifs.crypt.agile.AgileDecryptor.verifyPasswrd(AgileDecryptor.java:102)\r\n\tat
>
> org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:261)\r\n\tat
>
> org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:175)\r\n\tat
>
> org.apache.tika.parser.CompositeParser.parse(CompositParser.java:298)\r\n\tat
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)\r\n\tat
>
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:167)\r\n\tat
>
> org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWraper.java:163)\r\n\tat
>
> org.apache.tika.server.core.resource.TikaResource.parse(TikaResource.java:352)\r\n\tat
>
> org.apache.tika.server.core.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:78)\r\n\tat
>
> org.apache.tika.server.cor.resource.RecursiveMetadataResource.parseMetadataToMetadataList(RecursiveMetadataResource.java:190)\r\n\tat
>
> org.apache.tika.server.core.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:179)\r\n\tat
> sun.reflect.GeneratedMethodAcessor7.invoke(Unknown Source)\r\n\tat
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\r\n\tat
> java.lang.reflect.Method.invoke(Method.java:498)\r\n\tat
> org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(bstractInvoker.java:179)\r\n\tat
>
> org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96)\r\n\tat
> org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:201)\r\n\tat
> org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:104)r\n\tat
> org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59)\r\n\tat
>
> org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96)\r\n\tat
>
> org.apache.cxf.phase.PhaseInterceptrChain.doIntercept(PhaseInterceptorChain.java:307)\r\n\tat
>
> org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)\r\n\tat
>
> org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265)\\n\tat
>
> org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247)\r\n\tat
>
> org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79)\r\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.andle(HandlerWrapper.java:127)\r\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)\r\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)\r\n\tat
>
> org.eclipse.jetty.server.handler.ScpedHandler.nextScope(ScopedHandler.java:190)\r\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)\r\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\r\n\tat
>
> org.eclipse.jetty.server.hndler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)\r\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\r\n\tat
> org.eclipse.jetty.server.Server.handle(Server.java:516)\r\n\tat
> org.eclipse.jetty.servr.HttpChannel.lambda$handle$1(HttpChannel.java:487)\r\n\tat
> org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)\r\n\tat
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)\r\n\tat
> org.eclipse.jetty.server.HttpConnection.onFilable(HttpConnection.java:277)\r\n\tat
>
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)\r\n\tat
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)\r\n\tat
> org.eclipse.jetty.io.ChannelEndPoint$1.run(hannelEndPoint.java:104)\r\n\tat
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)\r\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)\r\n\tat
>
> org.eclipse.jetty.util.thread.trategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)\r\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)\r\n\tat
>
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.jaa:409)\r\n\tat
>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)\r\n\tat
>
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)\r\n\tat
> java.lang.Thread.run(Thread.java:827)\r\nCaused by:
> java.ecurity.NoSuchAlgorithmException: Cannot find any provider supporting
> AES/CBC/NoPadding\r\n\tat
> javax.crypto.Cipher.getInstance(Cipher.java:543)\r\n\tat
> org.apache.poi.poifs.crypt.CryptoFunctions.getCipher(CryptoFunctions.java:258)\r\n\t...
> 51 more\r\n",{*}"esourceName":"encrypted.docx"{*}}]
>
> If I disable return of the exception meta, then resourceName is returned
> correctly:
> [8D84:0002-60C4] 01/26/2023 05:45:58 PM DEBUG_TIKA write_callback - ptr = t:
> [\{"resourceName":"encrypted.docx"}]
>
> Believe this is reproducible with any password-protected docx file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
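
For anyone trying to reproduce this, a small client-side check may help narrow down where the key gets mangled. The sketch below is only an illustration: it assumes the response body is the JSON list shown in the description, it takes the expected field names from the IncludeFieldMetadataFilter config quoted there, and the helper names `unexpected_keys` and `truncated_candidates` are hypothetical (they are not part of Tika). It reports keys outside the include list and which expected name each could be with exactly one character missing.

```python
import json

# Field names copied from the <include> list in the quoted config.
EXPECTED_FIELDS = {
    "X-TIKA:content",
    "dc:creator",
    "dc:title",
    "resourceName",
    "X-TIKA:EXCEPTION:container_exception",
}


def unexpected_keys(response_text, expected=EXPECTED_FIELDS):
    """Return the sorted keys of a JSON metadata list that are not in `expected`."""
    metadata_list = json.loads(response_text)
    found = set()
    for entry in metadata_list:
        found.update(key for key in entry if key not in expected)
    return sorted(found)


def truncated_candidates(key, expected=EXPECTED_FIELDS):
    """Return expected names that turn into `key` when exactly one character is deleted."""
    matches = []
    for name in expected:
        if len(name) != len(key) + 1:
            continue
        if any(name[:i] + name[i + 1:] == key for i in range(len(name))):
            matches.append(name)
    return sorted(matches)


if __name__ == "__main__":
    body = '[{"esourceName":"encrypted.docx"}]'
    for key in unexpected_keys(body):
        print(key, "could be a truncated", truncated_candidates(key))
```

Running the same check on the raw HTTP response body and on whatever the client callback receives would show on which side the character goes missing.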