Michel Jullian wrote:

> Jed, it seems the way it performs depends on the original document's
> characteristics (resolution, fonts, multiplicity of fonts maybe?).


Yes. There is remarkable variability between different kinds of documents
and different versions of Acrobat. I cannot figure out what all of the
controlling parameters are. It also depends a lot on the number and size of
the figures, and the amount of noise in the scan (extraneous dots).



> In the case of the Feynman Lectures on Physics, volume 3 (quantum
> mechanics), of which I made a searchable backup of my print version
> from an image format pdf found on scribd :
>

> 1/ The ClearScan'd pdf file size was several times *smaller* than the
> original image-only pdf found on the web
>

What is the URL of that file? I will run it through a variety of different
Acrobat programs. If ClearScan reduced the size, I expect the original was
made a long time ago with an early version of Acrobat.



> 3/ The OCR quality is much better than what you got with your EPRI
> document, without any "touching up" I got, for page 1-1 :
>

That's probably a function of the quality of the scan. A good-quality scan
of cleanly printed text, without skew and without much noise, will OCR far
better than an old one like the EPRI document.

- Jed
