tiffany 0.3 released

Christian Tismer Thu, 07 Jun 2012 15:54:39 -0700

Tiffany - Read/Write Multipage-Tiff with PIL without PIL
========================================================


Tiffany stands for any tiff. The tiny module solves a large set of
problems, has no dependencies and just works wherever Python works.
Tiffany was developed in the course of the *DiDoCa* project and is
now available on PyPI.

Abstract
========

During the development of *DiDoCa* (Distributed Document Capture) we faced
the problem of reading multipage Tiff scans. The GUI toolkit *PySide (Qt)*
does support Tiff, but only shows the first page. We also had to support
Fax compression (CCITT G3/G4), which *Qt* does support.

As a first approach we copied single pages out of multi-page tiff files
using *tiffcp* or *tiffutil* (OS X) as a temp file for display. A sub-optimum
solution, especially for data security reasons.

The second approach replaced this by a tiny modification of the linkage of
the tiff directories (IFD). This way, a tiff file could be patched in memory
with the wanted page offset and then be shown without any files involved.
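
The in-memory patch described above can be sketched roughly as follows. This
is not the code used in *DiDoCa*; it only assumes the classic tiff layout
(byte-order mark, magic number 42, then a 4-byte offset to the first IFD, with
each IFD ending in a 4-byte offset to the next one). The function names
``ifd_offsets`` and ``select_page`` are made up for this illustration.

```python
import struct


def _byte_order(data):
    """Return the struct prefix matching the file's byte-order mark."""
    mark = bytes(data[:2])
    if mark not in (b"II", b"MM"):
        raise ValueError("not a tiff file")
    return "<" if mark == b"II" else ">"


def ifd_offsets(data):
    """Return the byte offsets of all IFDs (pages) in a classic tiff."""
    order = _byte_order(data)
    magic, offset = struct.unpack(order + "HL", data[2:8])
    if magic != 42:
        raise ValueError("not a classic tiff file")
    offsets = []
    # Follow the linked list of IFDs; stop at 0 or if the chain loops.
    while offset and offset not in offsets:
        offsets.append(offset)
        (count,) = struct.unpack(order + "H", data[offset:offset + 2])
        # Each directory entry is 12 bytes; the next-IFD offset follows them.
        next_pos = offset + 2 + 12 * count
        (offset,) = struct.unpack(order + "L", data[next_pos:next_pos + 4])
    return offsets


def select_page(data, page):
    """Return a copy of data whose header points at the IFD of `page`."""
    order = _byte_order(data)
    patched = bytearray(data)
    # Bytes 4..7 of the header hold the offset of the first IFD.
    patched[4:8] = struct.pack(order + "L", ifd_offsets(data)[page])
    return bytes(patched)
```

A viewer that only displays the first page then shows the selected page when
it is given the patched bytes, and nothing has to be written to disk.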

Unfortunately, this solution was not satisfactory either:

- our tiff files have anomalies in their tiff tags, like too many null-bytes
  and wrong tag order,

- Qt's implementation of tiff is over-pedantic and ignores all tags after the
  smallest error.

Having been a good friend of *Fredrik Lundh* and his *PIL* for years, I tried
to attack the problem using it. Sadly, Fredrik hasn't worked much on it since
2006, and the situation is slightly messed up:

*PIL* has a clean-up of tiff tags, but cannot cope with fax compression by
default. A patch has existed for many years, but it complicates the build
process and pulls in a lot of dependencies with *libtiff*.

Furthermore, *PIL* is unable to write fax-compressed files; it blows the data
up to the full size, making this approach only half a solution as well.

After a longer odyssey I then saw the light of a Tiffany lamp:

I use only a handful of *PIL*'s files, without any modification, and pretend
to unpack a tiff file, but I am actually cheating. Only the tiff tags are
nicely processed and streamlined, while the compressed data is taken
unmodified, as-is.

When writing a tiff page out, the existing data is just assembled in the
correct order.

For many projects like *didoca* that process tiff files without editing
their contents, this is a complete solution to their tiff problem. The
dependencies of the project stay minimal, no binaries are required, and
Tiffany is, with less than 300 lines, remarkably small.

Because just 5 files from *PIL* are used and the _imaging module is not
compiled at all, I'm talking about "PIL without PIL" ;-)

Tiffany is a stand-alone module and has no interference with *PIL*.

You can see this by looking at ``import_mapper.py``. This module modifies
``__import__`` so that the *PIL* modules appear as top-level internally, but
become sub-modules of tiffany in ``sys.modules``.

Please let me know if this stuff works for you, and send requests to
<tis...@stackless.com> or use the links in the bitbucket website:

https://bitbucket.org/didoca/tiffany

cheers -- Chris

--
Christian Tismer             :^)<mailto:tis...@stackless.com>
tismerysoft GmbH             :     Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121     :    *Starship* http://starship.python.net/
14482 Potsdam                :     PGP key ->  http://pgp.uni-mainz.de
work +49 173 24 18 776  mobile +49 173 24 18 776  fax n.a.
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/
