Hi guys,

This is wrong:

...
<parent>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parent</artifactId>
  <version>1.16-SNAPSHOT</version>
  <relativePath>tika-parent/pom.xml</relativePath>
</parent>
If you clone the project, the top-level pom contains this. The fix is to change 1.16-SNAPSHOT to 1.15.

What I did was:

git clone https://github.com/apache/tika.git

Any suggestions?

BR,
Oleg

On Tue, May 23, 2017 at 3:01 PM, Allison, Timothy B. <talli...@mitre.org> wrote:

> I _think_ it is included. See below for the two options for parsing
> testZipEncrypted.zip.
>
> Are you not seeing this behavior? Were you expecting different behavior?
>
>
> 1) RecursiveParserWrapper
>
> List<Metadata> metadataList = getRecursiveMetadata("testZipEncrypted.zip");
> debug(metadataList);
>
> yields:
>
> 0: X-Parsed-By : org.apache.tika.parser.DefaultParser
> 0: X-Parsed-By : org.apache.tika.parser.pkg.PackageParser
> 0: X-TIKA:EXCEPTION:embedded_stream_exception : org.apache.tika.exception.EncryptedDocumentException: stream (encrypted.txt) is encrypted
>     at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:306)
>     at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:230)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
>     at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:158)
>     at org.apache.tika.TikaTest.getRecursiveMetadata(TikaTest.java:221)
>     at org.apache.tika.TikaTest.getRecursiveMetadata(TikaTest.java:213)
>     at org.apache.tika.parser.pkg.ZipParserTest.testZipEncrypted(ZipParserTest.java:213)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>     at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>     at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
>     at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>     at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
>
> 0: X-TIKA:parse_time_millis : 34
> 0: X-TIKA:content : <html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
> <meta name="X-Parsed-By" content="org.apache.tika.parser.pkg.PackageParser" />
> <meta name="Content-Type" content="application/zip" />
> <title></title>
> </head>
> <body><div class="embedded" id="unencrypted.txt" />
> <div class="package-entry"><h1>unencrypted.txt</h1>
> </div>
> <p>encrypted.txt</p>
> </body></html>
> 0: Content-Type : application/zip
> 1: date : 2017-03-21T13:07:48Z
> 1: X-Parsed-By : org.apache.tika.parser.DefaultParser
> 1: X-Parsed-By : org.apache.tika.parser.txt.TXTParser
> 1: resourceName : unencrypted.txt
> 1: dcterms:modified : 2017-03-21T13:07:48Z
> 1: Last-Modified : 2017-03-21T13:07:48Z
> 1: Last-Save-Date : 2017-03-21T13:07:48Z
> 1: embeddedRelationshipId : unencrypted.txt
> 1: meta:save-date : 2017-03-21T13:07:48Z
> 1: Content-Encoding : windows-1252
> 1: X-TIKA:parse_time_millis : 3
> 1: modified : 2017-03-21T13:07:48Z
> 1: X-TIKA:content : <html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta name="date" content="2017-03-21T13:07:48Z" />
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
> <meta name="X-Parsed-By" content="org.apache.tika.parser.txt.TXTParser" />
> <meta name="resourceName" content="unencrypted.txt" />
> <meta name="dcterms:modified" content="2017-03-21T13:07:48Z" />
> <meta name="Last-Modified" content="2017-03-21T13:07:48Z" />
> <meta name="Last-Save-Date" content="2017-03-21T13:07:48Z" />
> <meta name="embeddedRelationshipId" content="unencrypted.txt" />
> <meta name="meta:save-date" content="2017-03-21T13:07:48Z" />
> <meta name="Content-Encoding" content="windows-1252" />
> <meta name="modified" content="2017-03-21T13:07:48Z" />
> <meta name="Content-Length" content="13" />
> <meta name="X-TIKA:embedded_resource_path" content="/unencrypted.txt" />
> <meta name="Content-Type" content="text/plain; charset=windows-1252" />
> <title></title>
> </head>
> <body><p>hello world
> </p>
> </body></html>
> 1: Content-Length : 13
> 1: X-TIKA:embedded_resource_path : /unencrypted.txt
> 1: Content-Type : text/plain; charset=windows-1252
>
> 2) Classic XML:
>
> XMLResult r = getXML("testZipEncrypted.zip");
> for (String n : r.metadata.names()) {
>     for (String v : r.metadata.getValues(n)) {
>         System.out.println("meta: "+n + " : "+v);
>     }
> }
> System.out.println(r.xml);
>
> Yields:
> meta: X-Parsed-By : org.apache.tika.parser.DefaultParser
> meta: X-Parsed-By : org.apache.tika.parser.pkg.PackageParser
> meta: X-TIKA:EXCEPTION:embedded_stream_exception : org.apache.tika.exception.EncryptedDocumentException: stream (encrypted.txt) is encrypted
>     at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:306)
>     at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:230)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
>     at org.apache.tika.TikaTest.getXML(TikaTest.java:205)
>     at org.apache.tika.TikaTest.getXML(TikaTest.java:191)
>     at org.apache.tika.parser.pkg.ZipParserTest.testZipEncrypted(ZipParserTest.java:206)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>     at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>     at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
>     at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>     at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
>
> meta: Content-Type : application/zip
> <html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />
> <meta name="X-Parsed-By" content="org.apache.tika.parser.pkg.PackageParser" />
> <meta name="Content-Type" content="application/zip" />
> <title></title>
> </head>
> <body><div class="embedded" id="unencrypted.txt" />
> <div class="package-entry"><h1>unencrypted.txt</h1>
> <p>hello world
> </p>
>
> </div>
> <p>encrypted.txt</p>
> </body></html>
>
> -----Original Message-----
> From: Aeham Abushwashi [mailto:aeham.abushwa...@exonar.com]
> Sent: Tuesday, May 23, 2017 3:47 AM
> To: u...@tika.apache.org; Tim Allison <talli...@apache.org>
> Cc: dev@tika.apache.org
> Subject: Re: [VOTE] Release Apache Tika 1.15 Candidate #1
>
> Thanks Tim and apologies if this isn't the right thread to ask this
> question... any reason TIKA-2300 is not included despite FixVersions=1.15
> on the ticket?
>
> On 22 May 2017 at 20:25, Tim Allison <talli...@apache.org> wrote:
>
> > A candidate for the Tika 1.15 release is available at:
> > https://dist.apache.org/repos/dist/dev/tika/
> >
> > The release candidate is a zip archive of the sources in:
> > https://github.com/apache/tika/tree/1.15-rc1
> >
> > The SHA1 checksum of the archive is
> > e82697a6804373367fbba98d47426ab74e036eb1.
> >
> > In addition, a staged maven repository is available here:
> > https://repository.apache.org/content/repositories/orgapachetika-1022
> >
> > Please vote on releasing this package as Apache Tika 1.15.
> > The vote is open for the next 72 hours and passes if a majority of at
> > least three +1 Tika PMC votes are cast.
> >
> > [ ] +1 Release this package as Apache Tika 1.15
> > [ ] -1 Do not release this package because...
> >
> > ***This is my first time as release manager.
> > Please kick the tires thoroughly.***
> >
> > This is my +1.
> >
> > Cheers,
> >
> > Tim
> >
>
> --
> Aeham Abushwashi
> Head of Engineering
> Exonar
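
For anyone checking the release candidate quoted above, here is a minimal sketch of comparing an archive's SHA-1 digest against the value given in the vote email. It uses only the JDK (`java.security.MessageDigest`). The class name `Sha1Check` and its methods are hypothetical names for illustration; they are not part of Tika or of its release tooling, and the archive path is whatever file you downloaded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Tika: computes a SHA-1 digest of a stream
// and compares it with an expected hex string.
class Sha1Check {

    // Reads the stream to its end and returns the SHA-1 digest as lowercase hex.
    static String sha1Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            // Mask to an unsigned value so each byte prints as two hex digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // True if the stream's digest equals the expected hex value, ignoring case
    // and surrounding whitespace in the expected value.
    static boolean matches(InputStream in, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        return sha1Hex(in).equalsIgnoreCase(expectedHex.trim());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java Sha1Check <archive> <expected-sha1>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(matches(in, args[1]) ? "checksum OK" : "checksum MISMATCH");
        }
    }
}
```

It would be run as `java Sha1Check <downloaded-archive> e82697a6804373367fbba98d47426ab74e036eb1`, with the archive path supplied by you. Streaming the file through the digest keeps memory use flat regardless of archive size.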