Re: OCR Form tools

2011-12-08 Thread Ryan Coleman
of forms equivalent to invoices that I'd like to put into > a database. I'm thinking I would like to have some OCR app/tool scan these > forms, and then generate a CSV with each field. Does anyone have > recommendations on software for this?

OCR Form tools

2011-12-08 Thread Adam Vande More
I have thousands of forms equivalent to invoices that I'd like to put into a database. I'm thinking I would like to have some OCR app/tool scan these forms, and then generate a CSV with each field. Does anyone have recommendations on software for this? -- Adam

Re: OCR...

2009-01-29 Thread Andrew Gould
On Thu, Jan 29, 2009 at 7:11 AM, Reko Turja wrote: > > -- > From: "Gary Kline" > Sent: Thursday, January 29, 2009 4:23 AM > To: "Andrew Gould" > Cc: "Reko Turja" ; "FreeBSD Mailing List"

Re: OCR...

2009-01-29 Thread Reko Turja
-- From: "Gary Kline" Sent: Thursday, January 29, 2009 4:23 AM To: "Andrew Gould" Cc: "Reko Turja" ; "FreeBSD Mailing List" Subject: Re: OCR... On Wed, Jan 28, 2009 at 07:33:41PM -0600, Andrew Goul

Re: OCR...

2009-01-29 Thread Andrew Gould
o with. gOCR > > > > > > >looks > > > > > > >best so far to me. > > > > > > > > > > > > AABBYY Finereader - Omnipage haven't been able to catch it in > several > > >

Re: OCR...

2009-01-28 Thread Gary Kline
gt; > > AABBYY Finereader - Omnipage haven't been able to catch it in several > > > > > years either feature or qualitywise. No idea if Finereader runs under > > > > > emulator though. If the file is already a PDF and 72 DPI with text > > as > > &

Re: OCR...

2009-01-28 Thread Andrew Gould
itywise. No idea if Finereader runs under > > > > emulator though. If the file is already a PDF and 72 DPI with text > as > > > > graphics most of the damage has already been done, and it will be > > > > extremely hard to OCR. > > > > > >

Re: OCR...

2009-01-28 Thread Gary Kline
> > >best so far to me. > > > > > > AABBYY Finereader - Omnipage haven't been able to catch it in several > > > years either feature or qualitywise. No idea if Finereader runs under > > > emulator though. If the file is already a PDF and 72 DPI with tex

Re: OCR...

2009-01-28 Thread Andrew Gould
it in several > > years either feature or qualitywise. No idea if Finereader runs under > > emulator though. If the file is already a PDF and 72 DPI with text as > > graphics most of the damage has already been done, and it will be > > extremely hard to OCR. > > >

Re: OCR...

2009-01-28 Thread Gary Kline
or though. If the file is already a PDF and 72 DPI with text as > graphics most of the damage has already been done, and it will be > extremely hard to OCR. > well, damage is probably done. how can i check the resolution? i tried to increase it by creating huge ppm and

Re: OCR...

2009-01-28 Thread Reko Turja
ly hard to OCR. -Reko ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"

Re: OCR...

2009-01-28 Thread Michel Talon
Gary Kline wrote: > well, i'm ashamed to admit that i've put at least a dozen hours in > trying, then re-re-retrying to OCR a imaged pdf file with as many > open source ocr packages as i can find. I have seen good results with tesseract which is in the ports and free. Othe

OCR...

2009-01-27 Thread Gary Kline
guys, well, i'm ashamed to admit that i've put at least a dozen hours in trying, then re-re-retrying to OCR a imaged pdf file with as many open source ocr packages as i can find. before i quit for supper tonight, i finally threw in the towel. realized that i would have been THROUG

Re: any way to turn a pdf file into something OCR-able?

2008-12-02 Thread Gary Kline
> > I wrote some code using Python PDF library 'pypdf' to split a multipage > PDF scan into individual pages, then used the tesseract OCR to convert > to text. Not 100% of course, and it really got confused by pages that > were not right-side-up, but not a bad start fo

Re: any way to turn a pdf file into something OCR-able?

2008-12-02 Thread Gary Kline
On Tue, Dec 02, 2008 at 02:07:30AM +0100, Roland Smith wrote: > On Mon, Dec 01, 2008 at 03:14:43PM -0800, Gary Kline wrote: > > pdftotext fail on the large [32MB] file I've got. Is there any > > other way I can translate this huge textfile to ascii or html or > > text? > > Please defi

Re: any way to turn a pdf file into something OCR-able?

2008-12-02 Thread Roland Smith
> > there is no text for pdftotext to convert => epic fail. > > In this case "convert" from the ImageMagick port will get you a > series of .jpg/.gif/.. Read the manual carefully before > attempting; also note this can be a slow process. Which still doesn't give pla

Re: any way to turn a pdf file into something OCR-able?

2008-12-01 Thread Olivier Nicole
> > 1) Some PDFs are just wrappers around JPEG images. In this case > > there is no text for pdftotext to convert => epic fail. > > In this case "convert" from the ImageMagick port will get you a > series of .jpg/.gif/.. Read the manual carefully before > attempting; also note this can be

Re: any way to turn a pdf file into something OCR-able?

2008-12-01 Thread Robert Huff
Roland Smith writes: > >pdftotext fail on the large [32MB] file I've got. Is there any > >other way I can translate this huge textfile to ascii or html or > >text? > > Please define "fail" in this context? I've used pdftotext on > documents exceeding 40MB. However there are of

Re: any way to turn a pdf file into something OCR-able?

2008-12-01 Thread Roland Smith
On Mon, Dec 01, 2008 at 03:14:43PM -0800, Gary Kline wrote: > pdftotext fail on the large [32MB] file I've got. Is there any > other way I can translate this huge textfile to ascii or html or > text? Please define "fail" in this context? I've used pdftotext on documents exceeding

any way to turn a pdf file into something OCR-able?

2008-12-01 Thread Gary Kline
Guys, pdftotext fail on the large [32MB] file I've got. Is there any other way I can translate this huge textfile to ascii or html or text? thanks, gary -- Gary Kline [EMAIL PROTECTED] http://www.thought.org Public Service Unix http://jot

Re: best OCR scanner??

2005-09-02 Thread Nikolas Britton
nt and out-of-copyright > > > book (from 1913) and need to know what the best scanner is > > > and if there has been substantial improvement in OCR > > > software in recent years. This book has few footnotes > > > or differen

Re: best OCR scanner??

2005-09-02 Thread Bill Campbell
n hold the page flat while it's being photographed, with something to keep the opposite page out of the camera's way. I have to admit that I do all my scanning and OCR on an OS X system, only marginally related to FreeBSD. I use an older HP Scanjet with automatic document feeder (ADF), an

Re: best OCR scanner??

2005-09-02 Thread Roland Smith
fine. > > and if there has been substantial improvement in OCR > > software in recent years. This book has few footnotes > > or different typefaces, so it should make things easier. There are several free OCR programs. I've used gocr (http://joc

Re: best OCR scanner??

2005-09-02 Thread Gary Kline
to know what the best scanner is > > and if there has been substantial improvement in OCR > > software in recent years. This book has few footnotes > > or different typefaces, so it should make things easier. > > > > Oh, an if there is somet

Re: best OCR scanner??

2005-09-02 Thread Gary Kline
canner is > >and if there has been substantial improvement in OCR > >software in recent years. This book has few footnotes > >or different typefaces, so it should make things easier. > > > >Oh, an if there is something that plugs into DOS

Re: best OCR scanner??

2005-09-02 Thread Roger Merritt
At 08:07 PM 9/1/2005 -0700, Gary Kline wrote: People, I want to scan ~400 pp of an out-of-print and out-of-copyright book (from 1913) and need to know what the best scanner is and if there has been substantial improvement in OCR software in recent years

Re: best OCR scanner??

2005-09-02 Thread Nikolas Britton
On 9/1/05, Gary Kline <[EMAIL PROTECTED]> wrote: > People, > > I want to scan ~400 pp of an out-of-print and out-of-copyright > book (from 1913) and need to know what the best scanner is > and if there has been substantial improvement in OCR

Re: best OCR scanner??

2005-09-02 Thread Gary Kline
On Thu, Sep 01, 2005 at 08:07:26PM -0700, Gary Kline wrote: > People, > > I want to scan ~400 pp of an out-of-print and out-of-copyright > book (from 1913) and need to know what the best scanner is > and if there has been substantial imp

best OCR scanner??

2005-09-01 Thread Gary Kline
People, I want to scan ~400 pp of an out-of-print and out-of-copyright book (from 1913) and need to know what the best scanner is and if there has been substantial improvement in OCR software in recent years. This book has few footnotes or