Hi all!
A client has landed us with a bunch of CDs containing several large multipage TIFF images. I ran them through our usual conversion tools (namely a find script that passes the images through tiffsplit), and all seemed fine and dandy.
However, on inspecting the output, it seems that half of the files are JPEG-encoded in a TIFF wrapper, which tiffsplit can't handle, and neither can our £5000-worth of Adobe Capture. The result is a garbled TIFF that I can't even render, let alone OCR.
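For anyone wanting to check which of their own files are affected, here is a minimal sketch in Python that reads the Compression tag (tag 259) of every page without decoding any image data, so it should stay light even on large files. It assumes classic TIFF files (not BigTIFF) that are not truncated; the names `page_compressions` and `JPEG_SCHEMES` are made up for illustration. In the TIFF specification, a Compression value of 6 means old-style JPEG (TIFF 6.0) and 7 means the newer JPEG scheme.

```python
import struct

COMPRESSION_TAG = 259  # TIFF "Compression" tag number

# Compression values that indicate JPEG data inside the TIFF wrapper.
JPEG_SCHEMES = {6: "old-style JPEG (TIFF 6.0)", 7: "JPEG (new-style)"}


def page_compressions(path):
    """Return the Compression tag value of each page (IFD) in a classic TIFF."""
    values = []
    with open(path, "rb") as f:
        header = f.read(8)
        if len(header) < 8 or header[:2] not in (b"II", b"MM"):
            raise ValueError("not a TIFF file: %s" % path)
        # "II" means little-endian, "MM" means big-endian.
        endian = "<" if header[:2] == b"II" else ">"
        magic, offset = struct.unpack(endian + "HI", header[2:8])
        if magic != 42:
            raise ValueError("not a classic TIFF (BigTIFF not handled): %s" % path)
        seen = set()  # guards against a looping chain of IFD offsets
        while offset and offset not in seen:
            seen.add(offset)
            f.seek(offset)
            (count,) = struct.unpack(endian + "H", f.read(2))
            compression = 1  # TIFF default when the tag is absent: uncompressed
            for _ in range(count):
                entry = f.read(12)  # tag, type, count, value-or-offset
                (tag,) = struct.unpack(endian + "H", entry[:2])
                if tag == COMPRESSION_TAG:
                    # A single SHORT is stored in the first 2 bytes of the value field.
                    (compression,) = struct.unpack(endian + "H", entry[8:10])
            values.append(compression)
            (offset,) = struct.unpack(endian + "I", f.read(4))  # next IFD, 0 = end
    return values
```

Calling `page_compressions("scan.tif")` (a hypothetical file name) returns one value per page; any page whose value appears in `JPEG_SCHEMES` is one of the JPEG-in-TIFF pages.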
A Google search turned up the following snippet from the libtiff mailing list:
==========
2004.02.26 10:27 "Re: tiffsplit & JPEG compression", by Andrey Kiselev
On Wed, Feb 25, 2004 at 06:13:04PM +0300, Artem Mirolubov wrote:
> tiffsplit dont copy pages with JPEG compression.
> what tags i must copy with CopyField, to add such support?
I have fixed that problem, thank you for report. We need to copy contents of the TIFFTAG_JPEGTABLES tag.
> And what tags i must copy, to add support of TIFF files with JPEG
> compression version 6.0 specification (Plz dont tell i dont need it. I
> really need it! And i defined "never" in "tif_ojpeg.c":) ?
Well, if you have enough sample files you can experiment with all tags, defined in ojpegFieldInfo (see tif_ojpeg.c file).
Best regards, Andrey
==========
A lot of that is Greek to me, but the gist seems to be that tiffsplit has been fixed to handle JPEG-compressed TIFFs by copying the TIFFTAG_JPEGTABLES tag.
Does anyone know if these changes have made it into the current versions of the Debian TIFF utils? Or do I need to build myself a customised TIFF library (argh!)? Failing that, does anyone know of an alternative way to batch-convert JPEG-TIFFs (preferably in Linux)? I've already tried using ImageMagick, but it has some serious problems dealing with multipage TIFFs: it tries to process all 80 MB of a file at once, which runs the system into the ground.
If anyone out there has more of a clue than me, I'd be much obliged!