There is one (GPL) that I've been playing with: http://javadjvu.foxtrottechnologies.com/
However, in order to extract text/context from images, we have to find a suitable OCR implementation.

On Fri, Oct 7, 2011 at 11:02 AM, Jukka Zitting (Commented) (JIRA) <j...@apache.org> wrote:

> [ https://issues.apache.org/jira/browse/TIKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122639#comment-13122639 ]
>
> Jukka Zitting commented on TIKA-513:
> ------------------------------------
>
> Is there a DjVu parser we could use?
>
> > Support of Deja Vu (DjVu) format
> > --------------------------------
> >
> > Key: TIKA-513
> > URL: https://issues.apache.org/jira/browse/TIKA-513
> > Project: Tika
> > Issue Type: New Feature
> > Components: parser
> > Reporter: Oleg Tikhonov
> >
> > It might be great if Tika could provide such a parser. Any suggestions/thoughts?
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira