Hi Annie,

On Mon, May 12, 2014 at 12:01 PM, <dev-digest-h...@tika.apache.org> wrote:

snip...

> testiBooksParser(org.apache.tika.parser.ibooks.iBooksParserTest):
> Premature end of file.
>
> Tests run: 506, Failures: 0, Errors: 1, Skipped: 1
>
>

snip...

> Java version: 1.8.0_05, vendor: Oracle Corporation
>
>

Your Java version _may_ be the problem. Has Tika been tested against Oracle JDK8? I am not sure that it has... the nightly builds do not run on JDK8; they run on the latest JDK6:
https://builds.apache.org/view/All/job/Tika-trunk/
Please see my other thread for discussion on changing the Jenkins configuration for the trunk job.

> Does anyone have any insight as to why this is failing at
> 'iBooksParserTest'?
> Thanks!
> Annie
>
>

Yes I do... Building the project with the following

Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9; 2014-02-14T09:37:52-08:00)
Maven home: /usr/local/apache/apache-maven-3.2.1
Java version: 1.8.0_05, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.9.2", arch: "x86_64", family: "mac"

I can replicate your error with the same test...

Running org.apache.tika.parser.ibooks.iBooksParserTest
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.008 sec <<< FAILURE!

please also note the other issues with tests; most noticeably with

Running org.apache.tika.parser.hdf.HDFParserTest
WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= *refno=53 tag= VG (1965) Vgroup length=34 class= Dim0.0 name= Longitude using data 52
WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= *refno=55 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= Latitude using data 54
WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= *refno=57 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim2 using data 56
WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= *refno=59 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim3 using data 58
WARN [main] (H4header.java:832) - data tag missing vgroup= 70 Sea Surface Temperature
WARN [main] (H4header.java:832) - data tag missing vgroup= 73 Number of Observations per Bin

and

Running org.apache.tika.parser.pdf.PDFParserTest
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 26441
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 2314576
WARN [main] (COSDocument.java:303) - java.lang.ClassCastException: org.apache.pdfbox.cos.COSString cannot be cast to org.apache.pdfbox.cos.COSName
java.lang.ClassCastException: org.apache.pdfbox.cos.COSString cannot be cast to org.apache.pdfbox.cos.COSName
    at org.apache.pdfbox.cos.COSDocument.getObjectsByType(COSDocument.java:295)
    at org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:657)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:244)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1239)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1204)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:118)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.TikaTest.getText(TikaTest.java:125)
    at org.apache.tika.TikaTest.getText(TikaTest.java:133)
    at org.apache.tika.parser.pdf.PDFParserTest.testSequentialParser(PDFParserTest.java:548)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:236)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:134)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
    at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:103)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74)
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 12324
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 5969
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 5500
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 5592
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 5592
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 5592
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 5687
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object xref at offset 8777

The stack trace for the failing iBooksParserTest looks like the following

testiBooksParser(org.apache.tika.parser.ibooks.iBooksParserTest) Time elapsed: 0.008 sec <<< ERROR!
org.xml.sax.SAXParseException; Premature end of file.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)
    at org.apache.tika.parser.epub.EpubContentParser.parse(EpubContentParser.java:72)
    at org.apache.tika.parser.epub.EpubParser.parse(EpubParser.java:104)
    at org.apache.tika.parser.ibooks.iBooksParserTest.testiBooksParser(iBooksParserTest.java:40)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:236)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:134)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
    at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:103)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74)

_downgrading_ to the following fixes everything (note PDF parser stack trace as above is still present)

Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9; 2014-02-14T09:37:52-08:00)
Maven home: /usr/local/apache/apache-maven-3.2.1
Java version: 1.7.0_55, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.7.0_55.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.9.2", arch: "x86_64", family: "mac"

...

[INFO] Reactor Summary:
[INFO]
[INFO] Apache Tika parent ................................ SUCCESS [ 1.046 s]
[INFO] Apache Tika core .................................. SUCCESS [ 12.017 s]
[INFO] Apache Tika parsers ............................... SUCCESS [ 46.203 s]
[INFO] Apache Tika XMP ................................... SUCCESS [ 1.734 s]
[INFO] Apache Tika application ........................... SUCCESS [ 13.228 s]
[INFO] Apache Tika OSGi bundle ........................... SUCCESS [ 17.832 s]
[INFO] Apache Tika server ................................ SUCCESS [ 30.877 s]
[INFO] Apache Tika Java-7 Components ..................... SUCCESS [ 3.157 s]
[INFO] Apache Tika ....................................... SUCCESS [ 0.023 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:06 min
[INFO] Finished at: 2014-05-14T08:14:00-08:00
[INFO] Final Memory: 158M/1049M

So you have an option... downgrade to JDK7 as above or else take on the task of making a stable build with JDK8 :)

hth
Lewis
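
P.S. In case it helps anyone picking up the JDK8 task: "Premature end of file" is typically what a SAX parser reports when its input ends before any XML document has started, for example when it is handed an empty stream. The sketch below only illustrates that general behaviour with the JDK's default JAXP parser; the class and method names (PrematureEofDemo, parseError) are made up for illustration, it does not use Tika or the test's EPUB file, and it does not explain why EpubContentParser ends up with no content under JDK8.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Tika.
class PrematureEofDemo {

    // Parses the given bytes as XML with the default JAXP SAX parser.
    // Returns the SAXParseException message, or null if parsing succeeds.
    static String parseError(byte[] xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            // DefaultHandler rethrows fatal errors as SAXParseException.
            parser.parse(new ByteArrayInputStream(xml), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Empty input: the parser reaches end of stream before any markup.
        System.out.println("empty input -> " + parseError(new byte[0]));
        // Minimal well-formed document: no error expected.
        System.out.println("<a/>        -> "
                + parseError("<a/>".getBytes(StandardCharsets.UTF_8)));
    }
}
```

If the empty-input case reports the same kind of fatal error as the trace above, a reasonable next step would be to check what bytes actually reach SAXParser.parse from EpubContentParser when the test runs under JDK8.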