[jira] [Commented] (TIKA-93) OCR support

2014-02-08 Thread frank (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895500#comment-13895500 ] frank commented on TIKA-93: --- BTW, does this feature support .TIFF format? we have a lot of files sc

[jira] [Commented] (TIKA-93) OCR support

2014-02-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895514#comment-13895514 ] Grant Ingersoll commented on TIKA-93: - It can, via some ancient JavaIO stuff, which, in s

[jira] [Commented] (TIKA-1232) Add PDF version to PDFParser output

2014-02-08 Thread Thomas Ledoux (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895528#comment-13895528 ] Thomas Ledoux commented on TIKA-1232: - Regarding XMP output from Tika and the inclusion

Re: [jira] [Commented] (TIKA-93) OCR support

2014-02-08 Thread Oleg Tikhonov
Hi Grant, what you're doing seems great. I've checked Tess4j (http://tess4j.sourceforge.net/); it is released and distributed under the Apache License, v2.0. Hope it helps. BR, Oleg On Sat, Feb 8, 2014 at 1:14 PM, Grant Ingersoll (JIRA) wrote

[jira] [Updated] (TIKA-93) OCR support

2014-02-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated TIKA-93: Attachment: TIKA-93.patch Here is a _very_ early stage patch that creates a JavaOCR parser. It is not

[jira] [Updated] (TIKA-93) OCR support

2014-02-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated TIKA-93: Attachment: TIKA-93.patch Tests for the JavaOCRParser. Next step is to start integrating into various

[jira] [Assigned] (TIKA-93) OCR support

2014-02-08 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann reassigned TIKA-93: - Assignee: Chris A. Mattmann > OCR support > --- > > Key: TIKA-93 >

[jira] [Commented] (TIKA-93) OCR support

2014-02-08 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895698#comment-13895698 ] Chris A. Mattmann commented on TIKA-93: --- Hey Grant, patch is looking good! I will need

[jira] [Updated] (TIKA-93) OCR support

2014-02-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated TIKA-93: Attachment: TIKA-93.patch This shows what I am thinking for integration with PDFParser. Not sure if i

[jira] [Commented] (TIKA-93) OCR support

2014-02-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895718#comment-13895718 ] Grant Ingersoll commented on TIKA-93: - bq. what is the dependency on jacoco in tika-paren

Re: [jira] [Commented] (TIKA-93) OCR support

2014-02-08 Thread Oleg Tikhonov
Hi, There is another code coverage Maven plug-in, called Cobertura. If you run *mvn clean install cobertura:cobertura*, there is no need to put it in the pom. Hope it helps. On Sat, Feb 8, 2014 at 10:17 PM, Grant Ingersoll (JIRA) wrote: > > [ > https://issues.apache.org/jira/browse/TIKA-93?page=com