Re: [jira] [Commented] (TIKA-93) OCR support

Oleg Tikhonov Sat, 08 Feb 2014 07:20:32 -0800

Hi Grant,
what you're doing seems great.
I've checked Tess4J (http://tess4j.sourceforge.net/); it is released and
distributed under the Apache License, v2.0
<http://www.apache.org/licenses/LICENSE-2.0.html>.


Hope it helps.

BR,
Oleg



On Sat, Feb 8, 2014 at 1:14 PM, Grant Ingersoll (JIRA) <[email protected]> wrote:

>
>     [
> https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895514#comment-13895514]
>
> Grant Ingersoll commented on TIKA-93:
> -------------------------------------
>
> It can, via some ancient JavaIO stuff, which, in some cases, has some
> weird dependencies.  Still working this out, but the way this is shaping up
> is that it is all going to have to be very pluggable to avoid any of these
> cases.  If anyone is up for lobbying the Tess4J team to remove
> GPL/LGPL/viral dependencies, we'd be in much better shape.
>
> > OCR support
> > -----------
> >
> >                 Key: TIKA-93
> >                 URL: https://issues.apache.org/jira/browse/TIKA-93
> >             Project: Tika
> >          Issue Type: New Feature
> >          Components: parser
> >            Reporter: Jukka Zitting
> >            Priority: Minor
> >
> > I don't know of any decent open source pure Java OCR libraries, but
> > there are command line OCR tools like Tesseract
> > (http://code.google.com/p/tesseract-ocr/) that could be invoked by Tika to
> > extract text content (where available) from image files.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.1.5#6160)
>
