Frank Cox wrote:
> On Tue, 20 Mar 2007 00:56:21 -0700
> "Brian Burger" <blurdesign at gmail.com> wrote:
>
>> Is it possible to produce per-page diffs of two PDFs?
>
> You could use something like pdftk to tear the files down into individual
> pages and then use cmp to determine which pages are identical.

I think this is worth a try, which is not to negate any of Craig's concerns. One of the issues you're dealing with is the scale of the job. Using the 'burst' command in pdftk you can split each file into individual pages, then see what a diff gets you. There are probably pages you can ignore because they are very obviously different. If you can at least carve the job down to a more manageable size, you're ahead. Be prepared for 500 pages of pdftk output to take up a load of memory, much more than the original file.

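Here is a rough sketch of that burst-then-compare idea in Python, in case it helps. It assumes pdftk is installed and on your PATH; the function names (burst, compare_page_dirs) and the pg_%04d.pdf file naming are just my choices for illustration. The byte-for-byte comparison plays the role of cmp. Keep in mind that two pages that look the same may still differ at the byte level, so treat "different" as "needs a closer look" rather than a final answer.

```python
import filecmp
import subprocess
from pathlib import Path


def burst(pdf_path, out_dir):
    """Split one PDF into per-page files using pdftk's 'burst' operation."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # pdftk fills in the printf-style page number in the output pattern.
    subprocess.run(
        ["pdftk", str(pdf_path), "burst",
         "output", str(out_dir / "pg_%04d.pdf")],
        check=True,
    )


def compare_page_dirs(dir_a, dir_b):
    """Byte-compare same-named page files found in two directories."""
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    pages_a = {p.name for p in dir_a.glob("pg_*.pdf")}
    pages_b = {p.name for p in dir_b.glob("pg_*.pdf")}
    result = {"identical": [], "different": [], "unmatched": []}
    for name in sorted(pages_a | pages_b):
        if name not in pages_a or name not in pages_b:
            # Page exists in only one document (page counts differ).
            result["unmatched"].append(name)
        elif filecmp.cmp(dir_a / name, dir_b / name, shallow=False):
            # shallow=False forces a real content comparison, like cmp.
            result["identical"].append(name)
        else:
            result["different"].append(name)
    return result
```

With two hypothetical files old.pdf and new.pdf, you would call burst("old.pdf", "old_pages") and burst("new.pdf", "new_pages"), then look at compare_page_dirs("old_pages", "new_pages")["different"] to get the short list of pages worth diffing by hand.
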
I have yet to see text stripped from a PDF come out very well: lots of mistakes, and spaces in unusual places.

Greg
