Thanks Dave. Yes, I have Tesseract enabled and this is on my MacBook.

Thanks for looking into it, Dave…

 

Cheers,

Chris

From: "loo...@gmail.com" <loo...@gmail.com>
Reply-To: "dev@tika.apache.org" <dev@tika.apache.org>
Date: Thursday, May 24, 2018 at 11:34 AM
To: "dev@tika.apache.org" <dev@tika.apache.org>
Subject: Re: Branch_1x build broke?

 

Hey Chris,

 

This is happening to me with Tesseract enabled but only on my MacBook.

 

Are you running this on OSX?

 

I've been trying to find some time to dig into it, as it works perfectly on my Windows and Linux setups.

 

Cheers,

Dave

On Thu, 24 May 2018, 17:09 Chris Mattmann, <mattm...@apache.org> wrote:

 

Tim,

 

 

 

Are you seeing this?

 

 

 

Results :

 

 

 

Failed tests:

 

 

PDFParserTest.testEmbeddedDocsWithOCROnly:1250->TikaTest.assertContains:103 pdf_haystack not found in:

 

<html xmlns="http://www.w3.org/1999/xhtml">

 

<head>

 

<meta name="date" content="2013-05-23T18:30:00Z" />

 

<meta name="cp:revision" content="1" />

 

<meta name="extended-properties:AppVersion" content="14.0000" />

 

<meta name="meta:paragraph-count" content="1" />

 

<meta name="meta:word-count" content="16" />

 

<meta name="extended-properties:Company" content="" />

 

<meta name="Word-Count" content="16" />

 

<meta name="dcterms:created" content="2013-05-23T18:30:00Z" />

 

<meta name="meta:line-count" content="1" />

 

<meta name="Last-Modified" content="2013-05-23T18:30:00Z" />

 

<meta name="dcterms:modified" content="2013-05-23T18:30:00Z" />

 

<meta name="Last-Save-Date" content="2013-05-23T18:30:00Z" />

 

<meta name="meta:character-count" content="96" />

 

<meta name="Template" content="Normal.dotm" />

 

<meta name="Line-Count" content="1" />

 

<meta name="Paragraph-Count" content="1" />

 

<meta name="meta:save-date" content="2013-05-23T18:30:00Z" />

 

<meta name="meta:character-count-with-spaces" content="111" />

 

<meta name="Application-Name" content="Microsoft Office Word" />

 

<meta name="modified" content="2013-05-23T18:30:00Z" />

 

<meta name="Content-Type" content="application/vnd.openxmlformats-officedocument.wordprocessingml.document" />

 

<meta name="X-Parsed-By" content="org.apache.tika.parser.DefaultParser" />

 

<meta name="X-Parsed-By" content="org.apache.tika.parser.microsoft.ooxml.OOXMLParser" />

 

<meta name="meta:creation-date" content="2013-05-23T18:30:00Z" />

 

<meta name="extended-properties:Application" content="Microsoft Office Word" />

 

<meta name="Creation-Date" content="2013-05-23T18:30:00Z" />

 

<meta name="xmpTPg:NPages" content="1" />

 

<meta name="Character-Count-With-Spaces" content="111" />

 

<meta name="Character Count" content="96" />

 

<meta name="Page-Count" content="1" />

 

<meta name="Revision-Number" content="1" />

 

<meta name="Application-Version" content="14.0000" />

 

<meta name="extended-properties:Template" content="Normal.dotm" />

 

<meta name="publisher" content="" />

 

<meta name="meta:page-count" content="1" />

 

<meta name="dc:publisher" content="" />

 

<title></title>

 

</head>

 

<body><p class="header" />

 

<p class="header" />

 

<p class="header" />

 

<p>Outer_haystack</p>

 

<p>Outer_haystack</p>

 

<p><div class="embedded" id="rId8" />

 

</p>

 

<p>Outer_haystack</p>

 

<p />

 

<p>Outer_haystack</p>

 

<p />

 

<p>Outer_haystack</p>

 

<p><a name="_GoBack" /></p>

 

<p class="footer" />

 

<p class="footer" />

 

<p class="footer" />

 

<p>attached.pdf</p>

 

<div class="page"><div class="ocr">dehayslack dehaystack dehayslack

dehaystack dehaystack dehaystack pd'

 

 

 

</div>

 

</div>

 

<p class="header" />

 

 

 

<p class="header" />

 

 

 

<p class="header" />

 

 

 

<p>Haystack</p>

 

 

 

<p>Needle</p>

 

 

 

<p>Haystack</p>

 

 

 

<p><a name="_GoBack" /></p>

 

 

 

<p class="footer" />

 

 

 

<p class="footer" />

 

 

 

<p class="footer" />

 

 

 

<div source="attachment" class="embedded" id="Test.docx" />

 

</body></html>

 

 

 

Tests run: 1009, Failures: 1, Errors: 0, Skipped: 30

 

 

 

[INFO] ------------------------------------------------------------------------

 

[INFO] Reactor Summary:

 

[INFO]

 

[INFO] Apache Tika parent ................................. SUCCESS [ 1.565 s]

 

[INFO] Apache Tika core ................................... SUCCESS [ 32.977 s]

 

[INFO] Apache Tika parsers ................................ FAILURE [05:52 min]

 

[INFO] Apache Tika XMP .................................... SKIPPED

 

[INFO] Apache Tika serialization .......................... SKIPPED

 

[INFO] Apache Tika batch .................................. SKIPPED

 

[INFO] Apache Tika language detection ..................... SKIPPED

 

[INFO] Apache Tika application ............................ SKIPPED

 

[INFO] Apache Tika OSGi bundle ............................ SKIPPED

 

[INFO] Apache Tika translate .............................. SKIPPED

 

[INFO] Apache Tika server ................................. SKIPPED

 

[INFO] Apache Tika examples ............................... SKIPPED

 

[INFO] Apache Tika Java-7 Components ...................... SKIPPED

 

[INFO] Apache Tika eval ................................... SKIPPED

 

[INFO] Apache Tika Deep Learning (powered by DL4J) ........ SKIPPED

 

[INFO] Apache Tika Natural Language Processing ............ SKIPPED

 

[INFO] Apache Tika ........................................ SKIPPED

 

[INFO] ------------------------------------------------------------------------

 

[INFO] BUILD FAILURE

 

[INFO] ------------------------------------------------------------------------

 

[INFO] Total time: 06:27 min

 

[INFO] Finished at: 2018-05-24T09:04:59-07:00

 

[INFO] Final Memory: 72M/1029M

 

[INFO] ------------------------------------------------------------------------

 

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on project tika-parsers: There are test failures.

 

[ERROR]

 

[ERROR] Please refer to /Users/mattmann/tmp/tika2.0.0/tika-parsers/target/surefire-reports for the individual test results.

 

[ERROR] -> [Help 1]

 

[ERROR]

 

[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.

 

[ERROR] Re-run Maven using the -X switch to enable full debug logging.

 

[ERROR]

 

[ERROR] For more information about the errors and possible solutions, please read the following articles:

 

[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

 

[ERROR]

 

[ERROR] After correcting the problems, you can resume the build with the command

 

[ERROR]   mvn <goals> -rf :tika-parsers

 

 

 

Keeps failing for me.

 

nonas:tika2.0.0 mattmann$ java -version

 

java version "1.8.0_144"

 

Java(TM) SE Runtime Environment (build 1.8.0_144-b01)

 

Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

 

nonas:tika2.0.0 mattmann$

 

 

 

Any ideas?

 

 

 

Cheers,

 

Chris

 

 

 

 

 
