[jira] [Comment Edited] (TIKA-1422) org.apache.tika.parser.mail.RFC822ParserTest fails

2014-10-21 Thread Oleg Tikhonov (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178018#comment-14178018
 ] 

Oleg Tikhonov edited comment on TIKA-1422 at 10/21/14 6:19 AM:
---

The imports of the image parsers were missing in the TesseractOCRParser unit test.

Env:
Windows 7, PE, x64. 
java version 1.7.0_11
Java(TM) SE Runtime Environment (build 1.7.0_11-b21)
Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)

Output:
After importing the image parsers:
[INFO] 
[INFO] Building Apache Tika 1.7-SNAPSHOT
[INFO] 
[INFO]
[INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @ tika ---
[INFO] Deleting E:\work_dir\tika\tika-site\target
[INFO]
[INFO] --- maven-remote-resources-plugin:1.2.1:process (default) @ tika ---
[INFO]
[INFO] --- maven-site-plugin:3.0:attach-descriptor (attach-descriptor) @ tika ---
[INFO]
[INFO] --- maven-install-plugin:2.3.1:install (default-install) @ tika ---
[INFO] Installing E:\work_dir\tika\tika-site\pom.xml to \.m2\repository\org\apache\tika\tika\1.7-SNAPSHOT\tika-1.7-SNAPSHOT.pom
[INFO] 
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Tika parent ................. SUCCESS [1.093s]
[INFO] Apache Tika core ................... SUCCESS [14.594s]
[INFO] Apache Tika parsers ................ SUCCESS [49.359s]
[INFO] Apache Tika XMP .................... SUCCESS [1.161s]
[INFO] Apache Tika serialization .......... SUCCESS [1.311s]
[INFO] Apache Tika application ............ SUCCESS [11.725s]
[INFO] Apache Tika OSGi bundle ............ SUCCESS [19.826s]
[INFO] Apache Tika server ................. SUCCESS [15.705s]
[INFO] Apache Tika translate .............. SUCCESS [1.476s]
[INFO] Apache Tika examples ............... SUCCESS [2.231s]
[INFO] Apache Tika Java-7 Components ...... SUCCESS [1.429s]
[INFO] Apache Tika ........................ SUCCESS [0.029s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 2:00.578s
[INFO] Finished at: Tue Oct 21 08:12:17 IST 2014
[INFO] Final Memory: 67M/1156M
[INFO] 
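
For readers following the quoted surefire report: the assertion that failed counts startElement calls for div ("Wanted 4 times but was 5"), and the trace shows the extra call arriving through TesseractOCRParser.extractOutput. The sketch below only illustrates that counting mismatch. It uses the JDK's SAX classes and no Tika or Mockito code; DivCountSketch, DivCounter, emitPart and countDivs are hypothetical names made up for this illustration, not the actual test or fix.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class DivCountSketch {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    /** Counts startElement calls for XHTML div; stands in for the mocked handler. */
    static class DivCounter extends DefaultHandler {
        int divs = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (XHTML.equals(uri) && "div".equals(localName)) {
                divs++;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // nothing to count on close
        }
    }

    /** Hypothetical stand-in for one parser wrapping its output in a single div. */
    static void emitPart(DivCounter handler) {
        handler.startElement(XHTML, "div", "div", new AttributesImpl());
        handler.endElement(XHTML, "div", "div");
    }

    /** Emits one div per expected part, plus one more if an extra parser is active. */
    static int countDivs(int expectedParts, boolean extraParserActive) {
        DivCounter counter = new DivCounter();
        for (int i = 0; i < expectedParts; i++) {
            emitPart(counter);
        }
        if (extraParserActive) {
            emitPart(counter); // the undesired fifth invocation in the report
        }
        return counter.divs;
    }

    public static void main(String[] args) {
        System.out.println(countDivs(4, false)); // prints 4
        System.out.println(countDivs(4, true));  // prints 5
    }
}
```

A test that pins the count to exactly 4 therefore depends on which parsers are active in the environment where it runs.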



was (Author: olegt):
Were missing imports of image parsers in the TesseractOCRParser unit test.

 org.apache.tika.parser.mail.RFC822ParserTest fails
 --

 Key: TIKA-1422
 URL: https://issues.apache.org/jira/browse/TIKA-1422
 Project: Tika
  Issue Type: Bug
  Components: parser
Reporter: Chris A. Mattmann
Assignee: Chris A. Mattmann
 Fix For: 1.7

 Attachments: TIKA-1422.Mattmann.100114.patch.txt, 
 TIKA-1422.Mattmann.100414.patch.txt, TIKA-1422.oleg.20141021.patch, 
 TIKA-1422.palsulich.100414.patch, TIKA-1422.palsulich.100714.patch


 I'm seeing test failures from:
 {noformat}
 Results :
 Failed tests:   testMultipart(org.apache.tika.parser.mail.RFC822ParserTest): 
 (..)
 Tests run: 538, Failures: 1, Errors: 0, Skipped: 1
 {noformat}
 CentOS6 VM image, running:
 {noformat}
 [mattmann@memex tika]$ java -version
 java version 1.7.0_67
 Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
 Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
 [mattmann@memex tika]$ mvn -version
 Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9; 
 2014-02-14T09:37:52-08:00)
 Maven home: /usr/share/apache-maven
 Java version: 1.7.0_65, vendor: Oracle Corporation
 Java home: /data/home/mattmann/dist/jdk1.7.0_65/jre
 Default locale: en_US, platform encoding: UTF-8
 OS name: linux, version: 2.6.32-431.23.3.el6.centos.plus.x86_64, arch: 
 amd64, family: unix
 [mattmann@memex tika]$ 
 {noformat}
 Here are the surefire reports - no clue what's up here:
 {noformat}
 [mattmann@memex tika]$ more 
 tika-parsers/target/surefire-reports/org.apache.tika.parser.mail.RFC822ParserTest.txt
  
 ---
 Test set: org.apache.tika.parser.mail.RFC822ParserTest
 ---
 Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.699 sec  
 FAILURE!
 testMultipart(org.apache.tika.parser.mail.RFC822ParserTest)  Time elapsed: 
 0.152 sec   FAILURE!
 org.mockito.exceptions.verification.TooManyActualInvocations: 
 xHTMLContentHandler.startElement(
 http://www.w3.org/1999/xhtml,
 div,
   

[jira] [Comment Edited] (TIKA-1422) org.apache.tika.parser.mail.RFC822ParserTest fails

2014-10-21 Thread Hong-Thai Nguyen (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178186#comment-14178186
 ] 

Hong-Thai Nguyen edited comment on TIKA-1422 at 10/21/14 9:48 AM:
--

Applied the latest fix in r1633325 and r161 with some formatting. Thanks.


was (Author: thaichat04):
Applied latest fix on r1633325 with some formatting. Thank

 org.apache.tika.parser.mail.RFC822ParserTest fails
 --

 Key: TIKA-1422
 URL: https://issues.apache.org/jira/browse/TIKA-1422
 Project: Tika
  Issue Type: Bug
  Components: parser
Reporter: Chris A. Mattmann
Assignee: Chris A. Mattmann
 Fix For: 1.7

 Attachments: TIKA-1422.Mattmann.100114.patch.txt, 
 TIKA-1422.Mattmann.100414.patch.txt, TIKA-1422.oleg.20141021.patch, 
 TIKA-1422.palsulich.100414.patch, TIKA-1422.palsulich.100714.patch


 I'm seeing test failures from:
 {noformat}
 Results :
 Failed tests:   testMultipart(org.apache.tika.parser.mail.RFC822ParserTest): 
 (..)
 Tests run: 538, Failures: 1, Errors: 0, Skipped: 1
 {noformat}
 CentOS6 VM image, running:
 {noformat}
 [mattmann@memex tika]$ java -version
 java version 1.7.0_67
 Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
 Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
 [mattmann@memex tika]$ mvn -version
 Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9; 
 2014-02-14T09:37:52-08:00)
 Maven home: /usr/share/apache-maven
 Java version: 1.7.0_65, vendor: Oracle Corporation
 Java home: /data/home/mattmann/dist/jdk1.7.0_65/jre
 Default locale: en_US, platform encoding: UTF-8
 OS name: linux, version: 2.6.32-431.23.3.el6.centos.plus.x86_64, arch: 
 amd64, family: unix
 [mattmann@memex tika]$ 
 {noformat}
 Here are the surefire reports - no clue what's up here:
 {noformat}
 [mattmann@memex tika]$ more 
 tika-parsers/target/surefire-reports/org.apache.tika.parser.mail.RFC822ParserTest.txt
  
 ---
 Test set: org.apache.tika.parser.mail.RFC822ParserTest
 ---
 Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.699 sec  
 FAILURE!
 testMultipart(org.apache.tika.parser.mail.RFC822ParserTest)  Time elapsed: 
 0.152 sec   FAILURE!
 org.mockito.exceptions.verification.TooManyActualInvocations: 
 xHTMLContentHandler.startElement(
 http://www.w3.org/1999/xhtml,
 div,
 div,
 isA(org.xml.sax.Attributes)
 );
 Wanted 4 times but was 5
   at 
 org.apache.tika.parser.mail.RFC822ParserTest.testMultipart(RFC822ParserTest.java:87)
 Caused by: org.mockito.exceptions.cause.UndesiredInvocation: 
 Undesired invocation:
   at 
 org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
   at 
 org.apache.tika.sax.SafeContentHandler.startElement(SafeContentHandler.java:264)
   at 
 org.apache.tika.sax.XHTMLContentHandler.startElement(XHTMLContentHandler.java:254)
   at 
 org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
   at 
 org.apache.tika.sax.xpath.MatchingContentHandler.startElement(MatchingContentHandler.java:60)
   at 
 org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
   at 
 org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
   at 
 org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
   at 
 org.apache.tika.sax.ContentHandlerDecorator.startElement(ContentHandlerDecorator.java:126)
   at 
 org.apache.tika.sax.SafeContentHandler.startElement(SafeContentHandler.java:264)
   at 
 org.apache.tika.sax.XHTMLContentHandler.startElement(XHTMLContentHandler.java:254)
   at 
 org.apache.tika.sax.XHTMLContentHandler.startElement(XHTMLContentHandler.java:284)
   at 
 org.apache.tika.parser.ocr.TesseractOCRParser.extractOutput(TesseractOCRParser.java:243)
   at 
 org.apache.tika.parser.ocr.TesseractOCRParser.parse(TesseractOCRParser.java:155)
   at 
 org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:247)
   at 
 org.apache.tika.parser.mail.MailContentHandler.body(MailContentHandler.java:102)
   at 
 org.apache.james.mime4j.parser.MimeStreamParser.parse(MimeStreamParser.java:133)
   at org.apache.tika.parser.mail.RFC822Parser.parse(RFC822Parser.java:76)
   at 
 org.apache.tika.parser.mail.RFC822ParserTest.testMultipart(RFC822ParserTest.java:84)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at