[ https://issues.apache.org/jira/browse/TIKA-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996866#comment-15996866 ]
ASF GitHub Bot commented on TIKA-2293:
--------------------------------------

tballison commented on issue #158: TIKA-2293 - Tess4jOCRParser - A simpler Java version of TesseractOCRParser
URL: https://github.com/apache/tika/pull/158#issuecomment-299211242

   See the discussion here: https://issues.apache.org/jira/browse/TIKA-2293 . I think there's consensus that this doesn't buy us enough and actually adds some complexity to our current setup. I proposed moving this into a standalone project/parser that we can mention.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Tess4jOCRParser - A simpler Java version of TesseractOCRParser
> ---------------------------------------------------------------
>
>                 Key: TIKA-2293
>                 URL: https://issues.apache.org/jira/browse/TIKA-2293
>             Project: Tika
>          Issue Type: Improvement
>          Components: ocr
>            Reporter: Thejan Wijesinghe
>             Fix For: 1.15
>
>
> Right now, TesseractOCRParser calls tesseract and imagemagick from the command
> line. The intention of this new parser, "Tess4jOCRParser", is to use the Tess4J API
> instead of the Runtime.exec approach of executing tesseract out of process.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)