Michel Jullian wrote:

> Jed, it seems the way it performs depends on the original document's
> characteristics (resolution, fonts, multiplicity of fonts maybe?).

Yes. There is remarkable variability between different kinds of documents and different versions of Acrobat. I cannot figure out what all of the controlling parameters are. It also depends a lot on the number and size of the figures, and the amount of noise in the scan (extraneous dots).

> In the case of the Feynman Lectures on Physics, volume 3 (quantum
> mechanics), of which I made a searchable backup of my print version
> from an image format pdf found on scribd :
> 1/ The ClearScan'd pdf file size was several times *smaller* than the
> original image-only pdf found on the web

What is the URL of that file? I will run it through a variety of different Acrobat programs. If ClearScan reduced the size, I expect the original was made a long time ago with an early version of Acrobat.

> 3/ The OCR quality is much better than what you got with your EPRI
> document, without any "touching up" I got, for page 1-1 :

That's probably a function of the quality of the scan. A good-quality scan of cleanly printed text, without skew and without much noise, will OCR far better than an old one like the EPRI document.

- Jed