On 3 Jun 2011, at 23:01, Jed Rothwell wrote:

> Jones Beene <jone...@pacbell.net> wrote:
>
> I thought OCR would make a more compact file, but apparently not.
>
>
> This is an OCR layer added underneath an image file. The image file is
> intact, so the whole thing is bigger.
>
> This is a messy way to do things. To do it properly you dump the image of the
> text and replace them with ASCII. You preserve only the figures in image
> format. You redo the entire document in Microsoft Word, and then create a
> fresh PDF. That is what I did for hundreds of documents. I am sick of doing
> it. It is no longer as necessary because people nowadays have fast
> connections and they can download huge files.

Another proper way of doing it is to use the DjVu format (http://djvu.org/), which was designed up front for this kind of thing.

Joe