> -----Original Message-----
> From: Simh [mailto:simh-boun...@trailing-edge.com] On Behalf Of Paul
> Koning
> Sent: 06 February 2016 19:01
> To: Timothe Litt <l...@ieee.org>
> Cc: simh@trailing-edge.com
> Subject: Re: [Simh] OSs with accessible documentation
> 
> 
> > On Feb 5, 2016, at 6:10 PM, Timothe Litt <l...@ieee.org> wrote:
> >
> > Some of the PDFs on bitsavers are searchable.  It would be a good
> > project to OCR the rest into searchable pdfs - as that also means that
> > the text can be extracted.   OCR is getting good enough (finally) that
> > it's feasible.  I'm sure that they'd be accepted back into bitsavers
> > - searchable is good for everyone.
> 
> Some disapprove of OCR for reasons I don't really understand.

It depends on how you build the PDF. If you replace the images with the OCR
text, which seems to be the default, then you introduce errors.
If you leave the images in place and put the text behind them, I can't see
what the problem is.
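
For anyone wondering what "text behind the images" means in practice, here is
a rough sketch of the idea, not what any particular OCR package actually does.
It builds the content stream for one PDF page: the OCR words are written with
text rendering mode 3 (invisible), then the untouched scan is painted over the
whole page. The function names, the resource names Im0 and F1, and the
(x, y, size, text) word tuples are all made up for illustration, and the rest
of the PDF (objects, fonts, the image itself) is left out.

```python
def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def page_content_stream(width, height, words, image_name="Im0", font_name="F1"):
    """Return PDF content-stream operators for one scanned page.

    words is a list of (x, y, font_size, text) tuples in page units,
    a hypothetical stand-in for whatever the OCR engine reports.
    """
    ops = ["BT", "3 Tr"]  # begin text; mode 3 = neither fill nor stroke
    for x, y, size, text in words:
        ops.append(f"/{font_name} {size} Tf")       # select font and size
        ops.append(f"1 0 0 1 {x} {y} Tm")           # position the word
        ops.append(f"({escape_pdf_string(text)}) Tj")
    ops.append("ET")
    # Paint the original scan over the full page, so what the reader sees
    # is still the image and OCR mistakes only affect searching.
    ops += ["q", f"{width} 0 0 {height} 0 0 cm", f"/{image_name} Do", "Q"]
    return "\n".join(ops)


if __name__ == "__main__":
    print(page_content_stream(612, 792, [(72, 700, 12, "PDP-11 (rev A)")]))
```

The point is that the page still displays the scan exactly as it was, while
search and copy work against the hidden text.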


> 
> A problem with OCR is that it's hard to find a good one.  I dabbled with an
> OCR plugin that Adobe once offered (free, and worth about that).  I also
> once tried an open source OCR, which was vastly inferior still.
> 

> But commercial OCR programs exist that do a decent job, especially if the
> scanned material is clean as is the case for much of what is on Bitsavers.  I
> use Abbyy FineReader which I rather like, but I expect there are other good
> ones out there too.
> 

I also use a copy of ABBYY FineReader Pro that I got from a magazine cover
disk. It seems to work well, and can be tweaked.

> One key point is that you typically need to spend some time "training" the
> program on the particular type of material -- typeface etc. -- that you're
> working with.  The default settings are rarely adequate.
> 

FineReader Pro is OK if the scans are good. My new scanner is quicker and
produces better scans. It also has a sheet feeder.

>       paul
> 
> _______________________________________________
> Simh mailing list
> Simh@trailing-edge.com
> http://mailman.trailing-edge.com/mailman/listinfo/simh

Dave
G4UGM

