[ https://issues.apache.org/jira/browse/TIKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142309#comment-14142309 ]
Chris A. Mattmann commented on TIKA-1421: ----------------------------------------- Here's how it fails when Tesseract is installed: {noformat} [INFO] Surefire report directory: /data/home/mattmann/src/tika/tika-parsers/target/surefire-reports ------------------------------------------------------- T E S T S ------------------------------------------------------- Running org.apache.tika.parser.audio.AudioParserTest Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.827 sec Running org.apache.tika.parser.audio.MidiParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.015 sec Running org.apache.tika.parser.microsoft.ooxml.OOXMLContainerExtractionTest Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.483 sec Running org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest Tests run: 34, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.665 sec Running org.apache.tika.parser.microsoft.VisioParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.047 sec Running org.apache.tika.parser.microsoft.PowerPointParserTest Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.296 sec Running org.apache.tika.parser.microsoft.POIContainerExtractionTest Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.127 sec Running org.apache.tika.parser.microsoft.ExcelParserTest Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.793 sec Running org.apache.tika.parser.microsoft.WriteProtectedParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.091 sec Running org.apache.tika.parser.microsoft.OfficeParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec Running org.apache.tika.parser.microsoft.ProjectParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec Running org.apache.tika.parser.microsoft.WordParserTest Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.466 sec Running 
org.apache.tika.parser.microsoft.OutlookParserTest Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.418 sec Running org.apache.tika.parser.microsoft.PublisherParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 sec Running org.apache.tika.parser.microsoft.TNEFParserTest Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.049 sec Running org.apache.tika.parser.xml.EmptyAndDuplicateElementsXMLParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec Running org.apache.tika.parser.xml.FictionBookParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.025 sec Running org.apache.tika.parser.xml.DcXMLParserTest Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec Running org.apache.tika.parser.iwork.IWorkParserTest Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.634 sec Running org.apache.tika.parser.iwork.AutoPageNumberUtilsTest Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec Running org.apache.tika.parser.asm.ClassParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.01 sec Running org.apache.tika.parser.chm.TestPmglHeader Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec Running org.apache.tika.parser.chm.TestChmItspHeader Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.01 sec Running org.apache.tika.parser.chm.TestPmgiHeader Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 sec Running org.apache.tika.parser.chm.TestChmExtractor Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.185 sec Running org.apache.tika.parser.chm.TestChmLzxState Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec Running org.apache.tika.parser.chm.TestChmLzxcResetTable Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec Running org.apache.tika.parser.chm.TestChmExtraction Tests run: 3, 
Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.938 sec Running org.apache.tika.parser.chm.TestDirectoryListingEntry Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec Running org.apache.tika.parser.chm.TestChmLzxcControlData Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec Running org.apache.tika.parser.chm.TestChmBlockInfo Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec Running org.apache.tika.parser.chm.TestChmItsfHeader Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 sec Running org.apache.tika.parser.txt.TXTParserTest Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec Running org.apache.tika.parser.txt.CharsetDetectorTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec Running org.apache.tika.parser.image.xmp.JempboxExtractorTest Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 sec Running org.apache.tika.parser.image.PSDParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec Running org.apache.tika.parser.image.ImageParserTest Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.034 sec Running org.apache.tika.parser.image.ImageMetadataExtractorTest Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.233 sec Running org.apache.tika.parser.image.MetadataFieldsTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec Running org.apache.tika.parser.image.TiffParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec Running org.apache.tika.parser.font.FontParsersTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.183 sec Running org.apache.tika.parser.mp4.MP4ParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.059 sec Running org.apache.tika.parser.mp3.Mp3ParserTest Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.035 sec Running 
org.apache.tika.parser.mp3.MpegStreamTest Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec Running org.apache.tika.parser.dwg.DWGParserTest Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec Running org.apache.tika.parser.pkg.GzipParserTest Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.186 sec Running org.apache.tika.parser.pkg.Seven7ParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.314 sec Running org.apache.tika.parser.pkg.TarParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.082 sec Running org.apache.tika.parser.pkg.Bzip2ParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.166 sec Running org.apache.tika.parser.pkg.ArParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.016 sec Running org.apache.tika.parser.pkg.ZipParserTest Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.216 sec Running org.apache.tika.parser.video.FLVParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec Running org.apache.tika.parser.solidworks.SolidworksParserTest Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec Running org.apache.tika.parser.ibooks.iBooksParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 sec Running org.apache.tika.parser.ParsingReaderTest Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 sec Running org.apache.tika.parser.mail.RFC822ParserTest Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.46 sec Running org.apache.tika.parser.mbox.MboxParserTest Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.026 sec Running org.apache.tika.parser.mbox.OutlookPSTParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.072 sec Running org.apache.tika.parser.jpeg.JpegParserTest Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.178 
sec Running org.apache.tika.parser.executable.ExecutableParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec Running org.apache.tika.parser.rtf.RTFParserTest Tests run: 31, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.12 sec Running org.apache.tika.parser.fork.ForkParserIntegrationTest Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.299 sec Running org.apache.tika.parser.envi.EnviHeaderParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec Running org.apache.tika.parser.AutoDetectParserTest Tests run: 22, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.884 sec Running org.apache.tika.parser.epub.EpubParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec Running org.apache.tika.parser.code.SourceCodeParserTest Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.065 sec Running org.apache.tika.parser.netcdf.NetCDFParserTest Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.127 sec Running org.apache.tika.parser.pdf.PDFParserTest WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317 INFO [main] (PDFParser.java:248) - Document is encrypted ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 116 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 5592 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't 
find the object xref at offset 116 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 5592 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 12324 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 116 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 5969 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 116 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 5687 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785 WARN [main] (FontManager.java:312) - Font not found: Times New Roman WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785 WARN [main] (FontManager.java:312) - Font not found: Times New Roman WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 116 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 26441 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 116 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 5592 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 116 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 8777 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 2314576 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229 ERROR [main] 
(NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 116 ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref at offset 5500 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851 INFO [main] (PDFParser.java:248) - Document is encrypted INFO [main] (PDFParser.java:248) - Document is encrypted Tests run: 27, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.361 sec Running org.apache.tika.parser.RecursiveParserWrapperTest Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.243 sec Running org.apache.tika.parser.prt.PRTParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.011 sec Running org.apache.tika.parser.html.HtmlParserTest Tests run: 38, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.133 sec Running org.apache.tika.parser.mat.MatParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.536 sec Running org.apache.tika.parser.feed.FeedParserTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec Running org.apache.tika.parser.ocr.TesseractOCRTest Tests run: 3, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 0.272 sec <<< FAILURE! 
Running org.apache.tika.parser.odf.ODFParserTest Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.097 sec Running org.apache.tika.parser.hdf.HDFParserTest WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= *refno=53 tag= VG (1965) Vgroup length=34 class= Dim0.0 name= Longitude using data 52 WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= *refno=55 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= Latitude using data 54 WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= *refno=57 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim2 using data 56 WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= *refno=59 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim3 using data 58 WARN [main] (H4header.java:844) - data tag missing vgroup= 70 Sea Surface Temperature WARN [main] (H4header.java:844) - data tag missing vgroup= 73 Number of Observations per Bin Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.087 sec Running org.apache.tika.embedder.ExternalEmbedderTest Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 sec Running org.apache.tika.mime.MimeTypesTest Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec Running org.apache.tika.mime.TestMimeTypes Tests run: 47, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.126 sec Running org.apache.tika.mime.MimeTypeTest Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec Running org.apache.tika.detect.TestContainerAwareDetector Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.262 sec Running org.apache.tika.TestParsers WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229 WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785 WARN [main] (FontManager.java:312) - Font not found: Times New Roman Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.186 sec Results : 
Failed tests: testPPTXOCR(org.apache.tika.parser.ocr.TesseractOCRTest): Check for the image's text. testDOCXOCR(org.apache.tika.parser.ocr.TesseractOCRTest) testPDFOCR(org.apache.tika.parser.ocr.TesseractOCRTest) Tests run: 538, Failures: 3, Errors: 0, Skipped: 1 {noformat} > Tika-Parsers tests fail on CentOS6 if tesseract isn't installed > --------------------------------------------------------------- > > Key: TIKA-1421 > URL: https://issues.apache.org/jira/browse/TIKA-1421 > Project: Tika > Issue Type: Bug > Components: parser > Environment: CentOS6 AWS VM for DARPA Memex > Reporter: Chris A. Mattmann > Assignee: Chris A. Mattmann > Fix For: 1.7 > > > While testing TIKA-93 on CentOS6, I ran into some test failing issues on a > 1.7-trunk fresh install of tika in tika-parsers: > {noformat} > Running org.apache.tika.parser.chm.TestChmLzxcControlData > Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 sec > Running org.apache.tika.parser.chm.TestChmBlockInfo > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec > Running org.apache.tika.parser.chm.TestChmItsfHeader > Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 sec > Running org.apache.tika.parser.txt.TXTParserTest > Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.016 sec > Running org.apache.tika.parser.txt.CharsetDetectorTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec > Running org.apache.tika.parser.image.xmp.JempboxExtractorTest > Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec > Running org.apache.tika.parser.image.PSDParserTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec > Running org.apache.tika.parser.image.ImageParserTest > Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.034 sec > Running org.apache.tika.parser.image.ImageMetadataExtractorTest > Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.241 
sec > Running org.apache.tika.parser.image.MetadataFieldsTest > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec > Running org.apache.tika.parser.image.TiffParserTest > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec > Running org.apache.tika.parser.font.FontParsersTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.192 sec > Running org.apache.tika.parser.mp4.MP4ParserTest > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.07 sec > Running org.apache.tika.parser.mp3.Mp3ParserTest > Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.046 sec > Running org.apache.tika.parser.mp3.MpegStreamTest > Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec > Running org.apache.tika.parser.dwg.DWGParserTest > Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec > Running org.apache.tika.parser.pkg.GzipParserTest > Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.252 sec > Running org.apache.tika.parser.pkg.Seven7ParserTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.37 sec > Running org.apache.tika.parser.pkg.TarParserTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.118 sec > Running org.apache.tika.parser.pkg.Bzip2ParserTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.233 sec > Running org.apache.tika.parser.pkg.ArParserTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec > Running org.apache.tika.parser.pkg.ZipParserTest > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.302 sec > Running org.apache.tika.parser.video.FLVParserTest > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.026 sec > Running org.apache.tika.parser.solidworks.SolidworksParserTest > Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.019 sec > Running org.apache.tika.parser.ibooks.iBooksParserTest > 
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.019 sec > Running org.apache.tika.parser.ParsingReaderTest > Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.018 sec > Running org.apache.tika.parser.mail.RFC822ParserTest > Tests run: 8, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 0.31 sec <<< > FAILURE! > Running org.apache.tika.parser.mbox.MboxParserTest > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.026 sec > Running org.apache.tika.parser.mbox.OutlookPSTParserTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.094 sec > Running org.apache.tika.parser.jpeg.JpegParserTest > Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.153 sec > Running org.apache.tika.parser.executable.ExecutableParserTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec > Running org.apache.tika.parser.rtf.RTFParserTest > Tests run: 31, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.221 sec > Running org.apache.tika.parser.fork.ForkParserIntegrationTest > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.322 sec > Running org.apache.tika.parser.envi.EnviHeaderParserTest > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec > Running org.apache.tika.parser.AutoDetectParserTest > Tests run: 22, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.439 sec > <<< FAILURE! 
> Running org.apache.tika.parser.epub.EpubParserTest > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 sec > Running org.apache.tika.parser.code.SourceCodeParserTest > Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.069 sec > Running org.apache.tika.parser.netcdf.NetCDFParserTest > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.125 sec > Running org.apache.tika.parser.pdf.PDFParserTest > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317 > INFO [main] (PDFParser.java:248) - Document is encrypted > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 116 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 5592 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 116 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 5592 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 12324 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 116 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 5969 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 116 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 5687 > 
WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785 > WARN [main] (FontManager.java:312) - Font not found: Times New Roman > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785 > WARN [main] (FontManager.java:312) - Font not found: Times New Roman > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 116 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 26441 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 116 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 5592 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 205317 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 116 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 8777 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 2314576 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 116 > ERROR [main] (NonSequentialPDFParser.java:1904) - Can't find the object xref > at offset 5500 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 56931 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 51851 > INFO [main] (PDFParser.java:248) - Document is encrypted > INFO [main] (PDFParser.java:248) - Document is encrypted > Tests run: 27, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 14.305 sec > <<< FAILURE! 
> Running org.apache.tika.parser.RecursiveParserWrapperTest > Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.233 sec > Running org.apache.tika.parser.prt.PRTParserTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec > Running org.apache.tika.parser.html.HtmlParserTest > Tests run: 38, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.162 sec > Running org.apache.tika.parser.mat.MatParserTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.543 sec > Running org.apache.tika.parser.feed.FeedParserTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.011 sec > Running org.apache.tika.parser.ocr.TesseractOCRTest > Tests run: 3, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 0.007 sec > Running org.apache.tika.parser.odf.ODFParserTest > Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.098 sec > Running org.apache.tika.parser.hdf.HDFParserTest > WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= > *refno=53 tag= VG (1965) Vgroup length=34 class= Dim0.0 name= Longitude using > data 52 > WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= > *refno=55 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= Latitude using > data 54 > WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= > *refno=57 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim2 using > data 56 > WARN [main] (H4header.java:392) - **dimension length=0 for TagVGroup= > *refno=59 tag= VG (1965) Vgroup length=33 class= Dim0.0 name= fakeDim3 using > data 58 > WARN [main] (H4header.java:844) - data tag missing vgroup= 70 Sea Surface > Temperature > WARN [main] (H4header.java:844) - data tag missing vgroup= 73 Number of > Observations per Bin > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.087 sec > Running org.apache.tika.embedder.ExternalEmbedderTest > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec > 
Running org.apache.tika.mime.MimeTypesTest > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec > Running org.apache.tika.mime.TestMimeTypes > Tests run: 47, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.163 sec > Running org.apache.tika.mime.MimeTypeTest > Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 sec > Running org.apache.tika.detect.TestContainerAwareDetector > Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.277 sec > Running org.apache.tika.TestParsers > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 68229 > WARN [main] (PDFParser.java:757) - Count in xref table is 0 at offset 44785 > WARN [main] (FontManager.java:312) - Font not found: Times New Roman > Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.2 sec > Results : > Failed tests: testMultipart(org.apache.tika.parser.mail.RFC822ParserTest): > Exception thrown: TIKA-198: Illegal IOException from > org.apache.tika.parser.ocr.TesseractOCRParser@2657d8a0 > testInlineSelector(org.apache.tika.parser.pdf.PDFParserTest): expected:<2> > but was:<0> > testInlineConfig(org.apache.tika.parser.pdf.PDFParserTest): expected:<2> > but was:<0> > testEmbeddedFilesInChildren(org.apache.tika.parser.pdf.PDFParserTest): > expected:<5> but was:<3> > Tests in error: > testUnusualFromAddress(org.apache.tika.parser.mail.RFC822ParserTest): > TIKA-198: Illegal IOException from > org.apache.tika.parser.ocr.TesseractOCRParser@1574a7af > testImages(org.apache.tika.parser.AutoDetectParserTest): TIKA-198: Illegal > IOException from org.apache.tika.parser.ocr.TesseractOCRParser@107aac4a > Tests run: 538, Failures: 4, Errors: 2, Skipped: 4 > {noformat} > I tried installing Tesseract here: > http://pkgs.org/centos-6/naulinux-school-x86_64/tesseract-3.01-2.el6.x86_64.rpm.html > However, installing that causes the other tests to pass, but the Tesseract > ones to fail (I think there is something wrong with the English config and am 
> looking into it).
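
For reference, the "TIKA-198: Illegal IOException from org.apache.tika.parser.ocr.TesseractOCRParser" errors in the quoted run are consistent with the parser trying to launch a tesseract binary that is not on the machine. Below is a minimal sketch of how a test or parser could probe for the executable first and skip OCR when it is missing. This is not Tika's actual code: the class name TesseractProbe and the method canRun are hypothetical, and the sketch assumes the binary is named "tesseract", is on the PATH, and accepts "-v" to print its version.

```java
import java.io.IOException;
import java.io.InputStream;

public class TesseractProbe {

    /**
     * Hypothetical helper: returns true if the command could be started
     * and ran to completion, false if it could not be launched at all.
     */
    static boolean canRun(String... command) {
        try {
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true) // merge stderr into stdout
                    .start();
            try (InputStream out = process.getInputStream()) {
                byte[] buffer = new byte[4096];
                while (out.read(buffer) != -1) {
                    // Discard output; draining keeps the child from
                    // blocking on a full pipe.
                }
            }
            process.waitFor();
            return true;
        } catch (IOException e) {
            // Thrown when the executable is missing or cannot be executed.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: "tesseract -v" prints version information and exits.
        boolean available = canRun("tesseract", "-v");
        System.out.println(available
                ? "tesseract found: OCR tests can run"
                : "tesseract not found: OCR tests should be skipped");
    }
}
```

A test could call such a probe in an assumption (for example JUnit's Assume.assumeTrue) so that a missing binary shows up as "Skipped" rather than as a failure or error. The probe only shows that the binary starts, so it would not catch the second case reported above, where Tesseract is installed but the OCR tests still fail.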