Hi Annie,

On Mon, May 12, 2014 at 12:01 PM, <dev-digest-h...@tika.apache.org> wrote:

snip...


>   testiBooksParser(org.apache.tika.parser.ibooks.iBooksParserTest):
> Premature end of file.
>
> Tests run: 506, Failures: 0, Errors: 1, Skipped: 1
>
>
>
snip...


> Java version: 1.8.0_05, vendor: Oracle Corporation
>
>
Your Java version _may_ be the problem. Has Tika been tested against Oracle
JDK8?
I am not sure that it has. The nightly builds do not run on JDK8; they run on
the latest JDK6:
https://builds.apache.org/view/All/job/Tika-trunk/
Please see my other thread for discussion on changing the Jenkins
configuration for the trunk job.



> Does anyone have any insight as to why this is failing at
> 'iBooksParserTest'?
> Thanks!
> Annie
>
>
Yes, I do.
Building the project with the following setup:

Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9;
2014-02-14T09:37:52-08:00)
Maven home: /usr/local/apache/apache-maven-3.2.1
Java version: 1.8.0_05, vendor: Oracle Corporation
Java home:
/Library/Java/JavaVirtualMachines/jdk1.8.0_05.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.9.2", arch: "x86_64", family: "mac"

I can replicate your error with the same test:

Running org.apache.tika.parser.ibooks.iBooksParserTest
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.008 sec
<<< FAILURE!

Please also note the other issues with the tests, most noticeably with

Running org.apache.tika.parser.hdf.HDFParserTest
 WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
*refno=53 tag= VG (1965) Vgroup length=34 class= Dim0.0 name= Longitude
using data 52
 WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
*refno=55 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= Latitude
using data 54
 WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
*refno=57 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim2
using data 56
 WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup=
*refno=59 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim3
using data 58
 WARN [main] (H4header.java:832) - data tag missing vgroup= 70 Sea Surface
Temperature
 WARN [main] (H4header.java:832) - data tag missing vgroup= 73 Number of
Observations per Bin

and

Running org.apache.tika.parser.pdf.PDFParserTest
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 26441
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 2314576
 WARN [main] (COSDocument.java:303) - java.lang.ClassCastException:
org.apache.pdfbox.cos.COSString cannot be cast to
org.apache.pdfbox.cos.COSName
java.lang.ClassCastException: org.apache.pdfbox.cos.COSString cannot be
cast to org.apache.pdfbox.cos.COSName
    at
org.apache.pdfbox.cos.COSDocument.getObjectsByType(COSDocument.java:295)
    at
org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:657)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:244)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1239)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1204)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:118)
    at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.TikaTest.getText(TikaTest.java:125)
    at org.apache.tika.TikaTest.getText(TikaTest.java:133)
    at
org.apache.tika.parser.pdf.PDFParserTest.testSequentialParser(PDFParserTest.java:548)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
    at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
    at
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
    at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:236)
    at
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:134)
    at
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
    at
org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
    at
org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
    at
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:103)
    at
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74)
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 12324
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 5969
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 5500
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 5592
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 5592
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 5592
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 5687
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 116
ERROR [main] (NonSequentialPDFParser.java:1887) - Can't find the object
xref at offset 8777

The stack trace for the failing iBooksParserTest looks like the following:

testiBooksParser(org.apache.tika.parser.ibooks.iBooksParserTest)  Time
elapsed: 0.008 sec  <<< ERROR!
org.xml.sax.SAXParseException; Premature end of file.
        at
org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown
Source)
        at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown
Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown
Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown
Source)
        at
org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown
Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown
Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown
Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown
Source)
        at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:195)
        at
org.apache.tika.parser.epub.EpubContentParser.parse(EpubContentParser.java:72)
        at org.apache.tika.parser.epub.EpubParser.parse(EpubParser.java:104)
        at
org.apache.tika.parser.ibooks.iBooksParserTest.testiBooksParser(iBooksParserTest.java:40)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
        at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
        at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
        at
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
        at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
        at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
        at
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:236)
        at
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:134)
        at
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:113)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
        at
org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
        at
org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
        at
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:103)
        at
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74)
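
For context, "Premature end of file" is what a JAXP SAX parser typically
reports when the input ends before any XML content has been read, for
example when the stream is empty. The sketch below reproduces that kind of
failure with a plain SAXParser and an empty byte array. It is only an
illustration: the class name EmptyXmlDemo and the method parseAndDescribe
are made up for this example, and it does not explain why
EpubContentParser ends up with such a stream under JDK8.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Tika.
final class EmptyXmlDemo {

    // Parses the given bytes as XML and returns a short description of the outcome.
    static String parseAndDescribe(byte[] xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            // DefaultHandler rethrows fatal errors as SAXParseException.
            parser.parse(new ByteArrayInputStream(xml), new DefaultHandler());
            return "parsed OK";
        } catch (SAXParseException e) {
            return "SAXParseException: " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Empty input: the parser has nothing to read.
        System.out.println(parseAndDescribe(new byte[0]));
        // Minimal well-formed document for comparison.
        System.out.println(parseAndDescribe("<a/>".getBytes("UTF-8")));
    }
}
```

If the test input really is arriving empty, the next thing to check would
be how the stream handed to the parser is produced, rather than the XML
parser itself.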

_Downgrading_ to the following makes the build pass (note that the PDF
parser stack trace shown above is still logged):

Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9;
2014-02-14T09:37:52-08:00)
Maven home: /usr/local/apache/apache-maven-3.2.1
Java version: 1.7.0_55, vendor: Oracle Corporation
Java home:
/Library/Java/JavaVirtualMachines/jdk1.7.0_55.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.9.2", arch: "x86_64", family: "mac"
...
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Tika parent ................................ SUCCESS [  1.046 s]
[INFO] Apache Tika core .................................. SUCCESS [ 12.017 s]
[INFO] Apache Tika parsers ............................... SUCCESS [ 46.203 s]
[INFO] Apache Tika XMP ................................... SUCCESS [  1.734 s]
[INFO] Apache Tika application ........................... SUCCESS [ 13.228 s]
[INFO] Apache Tika OSGi bundle ........................... SUCCESS [ 17.832 s]
[INFO] Apache Tika server ................................ SUCCESS [ 30.877 s]
[INFO] Apache Tika Java-7 Components ..................... SUCCESS [  3.157 s]
[INFO] Apache Tika ....................................... SUCCESS [  0.023 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:06 min
[INFO] Finished at: 2014-05-14T08:14:00-08:00
[INFO] Final Memory: 158M/1049M

So you have two options: downgrade to JDK7 as above, or else take on the
task of making a stable build with JDK8 :)
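
If you do look into the JDK8 route, it may help to compare what each JDK
actually resolves at runtime. The following is a minimal sketch, not
something from the Tika build: the class name JaxpEnvReport and the method
describe are hypothetical. It prints the Java version and vendor along
with the SAXParserFactory implementation that JAXP picks up, which can be
run under both JDK7 and JDK8 with the same classpath as the tests.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper, not part of Tika.
final class JaxpEnvReport {

    // Builds a one-line summary of the JVM and the resolved SAX parser factory.
    static String describe() {
        String factory = SAXParserFactory.newInstance().getClass().getName();
        return "java.version=" + System.getProperty("java.version")
                + ", java.vendor=" + System.getProperty("java.vendor")
                + ", SAXParserFactory=" + factory;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The stack trace above goes through org.apache.xerces classes, so the
factory name is worth checking on both JDKs before assuming the two runs
use the same parser.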

hth
Lewis
