UCLA had developed a very good scanning OCR solution, but I don't
think it was pure FOSS. Will ask.

Joseph

Tim Churches wrote:
> Tim Churches wrote:
>> Karsten Hilbert wrote:
>>> Well, the path of least resistance here is to scan it and
>>> use it as a background image in some text editor or other so
>>> that what you type appears to be written into the fields
>>> while it is (technically) written on top of the background
>>> image. We then save the result as any other old document
>>> tied into the medical record.
>> No, we need the data in computable form for epidemiological (aggregate)
>> analysis - images of numbers and characters must be converted to ASCII or
>> Unicode bytes. There is a commercial product, Teleform, which does this
>> reasonably well - see
>> http://www.cardiff.com/products/teleform/index.html - and we may just
>> provide an interface which can load data which has been scanned off
>> hand-written forms using that, but gee, an open source solution would be
>> nice. Suggestions very welcome.
> 
> A few months ago Google released Tesseract OCR, an OCR engine developed
> in the 1990s by Hewlett-Packard. Apparently it was state-of-the-art in
> 1995, but that's over a decade ago, and it has not been developed since.
> There don't seem to be any other open source OCR engines around that are
> being actively developed or which are anything more than demos or
> proofs-of-concept. And Teleform seems to have the OCR-from-paper-forms
> market almost to itself. I think we'll have to build a batch input
> interface that Teleform can be plugged into - I think it exports to XML,
> or at the very least CSV files.
> 
> But if anyone can suggest an alternative for turning data recorded on
> paper forms into data (as opposed to raster image) files, we'd love to
> hear of it.
> 
> Tim C
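
For what it's worth, the batch input interface Tim describes could start out as something like the sketch below, assuming the scanning product can export one CSV row per form. The column names (form_id, age, weight_kg) and the load_batch function are made up for illustration; they are not taken from Teleform or any other product, and a real export would define its own layout.

```python
import csv
import io

# Hypothetical numeric columns; a real export would define its own layout.
NUMERIC_FIELDS = ("age", "weight_kg")


def load_batch(csv_file):
    """Read scanned-form rows from an open CSV file.

    Returns (records, rejects): rows whose numeric fields parsed cleanly,
    and (line_number, reason) pairs for rows that need manual review.
    """
    records, rejects = [], []
    reader = csv.DictReader(csv_file)
    # Data rows start at line 2 because line 1 is the header
    # (assumes no field contains an embedded newline).
    for line_no, row in enumerate(reader, start=2):
        try:
            for name in NUMERIC_FIELDS:
                row[name] = float(row[name])
        except (KeyError, TypeError, ValueError) as exc:
            # Missing column, short row, or a misread character such as "3x".
            rejects.append((line_no, str(exc)))
            continue
        records.append(row)
    return records, rejects


# Small in-memory example standing in for an exported file.
sample = io.StringIO("form_id,age,weight_kg\nA1,34,70.5\nA2,3x,81\n")
records, rejects = load_batch(sample)
```

Keeping the rejected rows separate rather than dropping them matters here, since OCR of handwriting will misread some characters and those forms need to go back to a person for checking.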
