[jira] [Commented] (TIKA-1421) Tika-Parsers tests fail on CentOS6 if tesseract isn't installed
[ https://issues.apache.org/jira/browse/TIKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143041#comment-14143041 ]

Hong-Thai Nguyen commented on TIKA-1421:

Not only on CentOS; this test also failed on my Windows machine without Tesseract installed.

Tika-Parsers tests fail on CentOS6 if tesseract isn't installed

Key: TIKA-1421
URL: https://issues.apache.org/jira/browse/TIKA-1421
Project: Tika
Issue Type: Bug
Components: parser
Environment: CentOS6 AWS VM for DARPA Memex
Reporter: Chris A. Mattmann
Assignee: Chris A. Mattmann
Fix For: 1.7

While testing TIKA-93 on CentOS6, I ran into some test failures on a 1.7-trunk fresh install of tika in tika-parsers:

{noformat}
Running org.apache.tika.parser.chm.TestChmLzxcControlData
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 sec
Running org.apache.tika.parser.chm.TestChmBlockInfo
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec
Running org.apache.tika.parser.chm.TestChmItsfHeader
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 sec
Running org.apache.tika.parser.txt.TXTParserTest
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.016 sec
Running org.apache.tika.parser.txt.CharsetDetectorTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec
Running org.apache.tika.parser.image.xmp.JempboxExtractorTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.014 sec
Running org.apache.tika.parser.image.PSDParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec
Running org.apache.tika.parser.image.ImageParserTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.034 sec
Running org.apache.tika.parser.image.ImageMetadataExtractorTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.241 sec
Running org.apache.tika.parser.image.MetadataFieldsTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
Running org.apache.tika.parser.image.TiffParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec
Running org.apache.tika.parser.font.FontParsersTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.192 sec
Running org.apache.tika.parser.mp4.MP4ParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.07 sec
Running org.apache.tika.parser.mp3.Mp3ParserTest
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.046 sec
Running org.apache.tika.parser.mp3.MpegStreamTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec
Running org.apache.tika.parser.dwg.DWGParserTest
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec
Running org.apache.tika.parser.pkg.GzipParserTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.252 sec
Running org.apache.tika.parser.pkg.Seven7ParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.37 sec
Running org.apache.tika.parser.pkg.TarParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.118 sec
Running org.apache.tika.parser.pkg.Bzip2ParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.233 sec
Running org.apache.tika.parser.pkg.ArParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec
Running org.apache.tika.parser.pkg.ZipParserTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.302 sec
Running org.apache.tika.parser.video.FLVParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.026 sec
Running org.apache.tika.parser.solidworks.SolidworksParserTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.019 sec
Running org.apache.tika.parser.ibooks.iBooksParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.019 sec
Running org.apache.tika.parser.ParsingReaderTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.018 sec
Running org.apache.tika.parser.mail.RFC822ParserTest
Tests run: 8, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 0.31 sec FAILURE!
Running org.apache.tika.parser.mbox.MboxParserTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.026 sec
Running org.apache.tika.parser.mbox.OutlookPSTParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.094 sec
Running org.apache.tika.parser.jpeg.JpegParserTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.153 sec
Running org.apache.tika.parser.executable.ExecutableParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec
Running
[jira] [Commented] (TIKA-1421) Tika-Parsers tests fail on CentOS6 if tesseract isn't installed
[ https://issues.apache.org/jira/browse/TIKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143273#comment-14143273 ]

Tyler Palsulich commented on TIKA-1421:

I commented on list, but here is a proposed solution.

TesseractOCRParser is the default parser for image types (by nature of how Parsers are dynamically loaded). If Tesseract is installed, that's the only Parser that gets called. If not, ImageParser is called as a fallback.

Tests for TesseractOCRParser will check before calling TesseractOCRParser.parse whether it's installed and skip it otherwise. Tests for ImageParser will create the ImageParser directly.

Tika-Parsers tests fail on CentOS6 if tesseract isn't installed

Key: TIKA-1421
URL: https://issues.apache.org/jira/browse/TIKA-1421
Project: Tika
Issue Type: Bug
Components: parser
Environment: CentOS6 AWS VM for DARPA Memex
Reporter: Chris A. Mattmann
Assignee: Chris A. Mattmann
Priority: Blocker
Fix For: 1.7
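The check-then-fall-back idea proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that was committed to TesseractOCRParser: the class `OcrFallbackSketch` and the methods `canRun` and `chooseImageParser` are hypothetical names, the parser choice is represented by a plain string rather than real Tika Parser objects, and probing with `tesseract -v` assumes the binary is on the PATH under that name.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

public class OcrFallbackSketch {

    /**
     * Hypothetical helper: returns true if the given command could be
     * started on this machine. The exit code is deliberately ignored.
     */
    public static boolean canRun(String... command) {
        Process process = null;
        try {
            process = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .start();
            // Give the tool a few seconds to exit on its own.
            process.waitFor(5, TimeUnit.SECONDS);
            return true;
        } catch (IOException e) {
            // Typically means the executable was not found or is not runnable.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            if (process != null) {
                process.destroy();
            }
        }
    }

    /**
     * Hypothetical stand-in for parser selection: OCR when Tesseract is
     * available, the plain image parser otherwise.
     */
    public static String chooseImageParser(boolean tesseractAvailable) {
        return tesseractAvailable ? "TesseractOCRParser" : "ImageParser";
    }

    public static void main(String[] args) {
        // Assumption: the Tesseract binary is called "tesseract" and is on the PATH.
        boolean available = canRun("tesseract", "-v");
        System.out.println("tesseract available: " + available);
        System.out.println("parser for images: " + chooseImageParser(available));
        if (!available) {
            System.out.println("OCR tests would be skipped on this machine");
        }
    }
}
```

In a JUnit 4 test, a boolean like the one returned by `canRun` could be passed to `org.junit.Assume.assumeTrue(...)` so that the OCR tests are reported as skipped rather than failed on machines without Tesseract.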
[jira] [Commented] (TIKA-1421) Tika-Parsers tests fail on CentOS6 if tesseract isn't installed
[ https://issues.apache.org/jira/browse/TIKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144299#comment-14144299 ]

Hudson commented on TIKA-1421:

SUCCESS: Integrated in tika-trunk-jdk1.6 #203 (See [https://builds.apache.org/job/tika-trunk-jdk1.6/203/])
Fix for TIKA-1421 Check if Tesseract is installed before attempting OCR Contributed by tpalsulich,mattmann. (mattmann: http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1626932)
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRParser.java
[jira] [Commented] (TIKA-1421) Tika-Parsers tests fail on CentOS6 if tesseract isn't installed
[ https://issues.apache.org/jira/browse/TIKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144330#comment-14144330 ]

Hudson commented on TIKA-1421:

SUCCESS: Integrated in tika-trunk-jdk1.7 #225 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/225/])
Fix for TIKA-1421 Check if Tesseract is installed before attempting OCR Contributed by tpalsulich,mattmann. (mattmann: http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1626932)
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRParser.java
[jira] [Commented] (TIKA-1421) Tika-Parsers tests fail on CentOS6 if tesseract isn't installed
[ https://issues.apache.org/jira/browse/TIKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142309#comment-14142309 ]

Chris A. Mattmann commented on TIKA-1421:

Here's how it fails when Tesseract is installed:

{noformat}
[INFO] Surefire report directory: /data/home/mattmann/src/tika/tika-parsers/target/surefire-reports
---
 T E S T S
---
Running org.apache.tika.parser.audio.AudioParserTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.827 sec
Running org.apache.tika.parser.audio.MidiParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.015 sec
Running org.apache.tika.parser.microsoft.ooxml.OOXMLContainerExtractionTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.483 sec
Running org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest
Tests run: 34, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.665 sec
Running org.apache.tika.parser.microsoft.VisioParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.047 sec
Running org.apache.tika.parser.microsoft.PowerPointParserTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.296 sec
Running org.apache.tika.parser.microsoft.POIContainerExtractionTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.127 sec
Running org.apache.tika.parser.microsoft.ExcelParserTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.793 sec
Running org.apache.tika.parser.microsoft.WriteProtectedParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.091 sec
Running org.apache.tika.parser.microsoft.OfficeParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.apache.tika.parser.microsoft.ProjectParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec
Running org.apache.tika.parser.microsoft.WordParserTest
Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.466 sec
Running org.apache.tika.parser.microsoft.OutlookParserTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.418 sec
Running org.apache.tika.parser.microsoft.PublisherParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 sec
Running org.apache.tika.parser.microsoft.TNEFParserTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.049 sec
Running org.apache.tika.parser.xml.EmptyAndDuplicateElementsXMLParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.apache.tika.parser.xml.FictionBookParserTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.025 sec
Running org.apache.tika.parser.xml.DcXMLParserTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec
Running org.apache.tika.parser.iwork.IWorkParserTest
Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.634 sec
Running org.apache.tika.parser.iwork.AutoPageNumberUtilsTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0 sec
Running org.apache.tika.parser.asm.ClassParserTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.01 sec
Running org.apache.tika.parser.chm.TestPmglHeader
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec
Running org.apache.tika.parser.chm.TestChmItspHeader
Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.01 sec
Running org.apache.tika.parser.chm.TestPmgiHeader
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 sec
Running org.apache.tika.parser.chm.TestChmExtractor
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.185 sec
Running org.apache.tika.parser.chm.TestChmLzxState
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.006 sec
Running org.apache.tika.parser.chm.TestChmLzxcResetTable
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02 sec
Running org.apache.tika.parser.chm.TestChmExtraction
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.938 sec
Running org.apache.tika.parser.chm.TestDirectoryListingEntry
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.apache.tika.parser.chm.TestChmLzxcControlData
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec
Running org.apache.tika.parser.chm.TestChmBlockInfo
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec
Running org.apache.tika.parser.chm.TestChmItsfHeader
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 sec
Running org.apache.tika.parser.txt.TXTParserTest
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec
Running org.apache.tika.parser.txt.CharsetDetectorTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.02
[jira] [Commented] (TIKA-1421) Tika-Parsers tests fail on CentOS6 if tesseract isn't installed
[ https://issues.apache.org/jira/browse/TIKA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142310#comment-14142310 ]

Tyler Palsulich commented on TIKA-1421:

bq. Here's how it fails when Tesseract is installed:

Is the English language data pack installed?