[RFC] Dynamic image generator handler

Michael A Nachbaur Thu, 09 May 2002 22:54:36 -0700

This is a request for comments.  If you don't care about dynamic image
generation with mod_perl, or don't care about offering or reading about
suggestions, you can safely ignore this.  Also, be forewarned, this was
written in StarOffice, and then copied/pasted into my email program, and
hand tweaked, so some things may not have made the transition properly.


Dynamic Image Manipulator
-------------------------

*) Overview

This is a mod_perl handler, not directly tied in with my content
management system, but is/will be used extensively by it. The premise is
to dynamically generate images, cache them, and present them to browser
clients. The URI, as well as Apache configuration directives, is used to
determine what is to be generated.

*) Basic Uses

The most basic use of this application will be to dynamically generate
TTF text for titles, buttons, sidebars, etc. The current version of the
code does this, and quite well. It supports foreground and background
colors, font name (with bold/italic support), font size, image size (or
automatic sizing based on the size of the text), and rotation.

The basic text support could be extended to allow images to be
overlaid on the text (or placed under it), or to stretch images
similarly to how Enlightenment displays window manager themes.

Other planned uses involve manipulating existing images. For instance,
if an image on a website needs thumbnail, medium-size and full-size
views, normally a person must make all versions by hand. Any further
formatting, such as borders or drop shadows, adds to the work. If
instead a person could just drop an image in a directory and link to
it, the image could be resized automatically, with borders and drop
shadows added. The resulting image would then be cached and sent to the
browser.

*) URI Arguments

Information about what is to be done is passed through the URI. This
works for simple tasks like text display, but if anything more
complicated is to be done, external configuration files must be used.
We'll get to that in a bit.

Essentially, arguments are passed as extra path information after the
handler's location (what CGI calls PATH_INFO; it is not an HTTP
header). We want the browser to think this is an actual file, instead
of a dynamically generated image, so that the browser is more inclined
to cache the content. So, a typical request would be:

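  http://localhost/genText/font=ArialBold;size=24;fgcolor=%23ffffff;
  bgcolor=%23000000;rotate=90;text=This+Is+The+Text

(The "#" of each color value is written as "%23", because a literal "#"
would start the URI fragment, which browsers do not send to the server.)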

Resizing an image could be accomplished by doing:

  http://localhost/genImage/source=/images/ducks.jpg;scale-ratio=1:1;
  width=120;height=80;border-size=1px;border-color=%23000000;
  shadow-color=%23000000;shadow-angle=270;shadow-distance=5px

This would resize an image to the indicated width/height. The
"scale-ratio" argument would limit the width/height ratio, so the
maximum dimension would be used. The other attributes are obvious.
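
As a rough illustration of the argument format above, here is a minimal
sketch of splitting such a path into key/value pairs. It is written in
Python purely for illustration (the handler itself is mod_perl), the
function name parse_path_args is made up, and it assumes that values
arrive URL-encoded, with "+" for spaces and "%23" for "#".

```python
from urllib.parse import unquote_plus

def parse_path_args(path_info):
    """Split 'key=value;key=value' path info into a dict of decoded strings."""
    args = {}
    for pair in path_info.lstrip("/").split(";"):
        if not pair:
            continue  # tolerate a trailing or doubled ';'
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"argument without '=': {pair!r}")
        # unquote_plus turns '+' into a space and '%23' into '#'
        args[key] = unquote_plus(value)
    return args

args = parse_path_args(
    "/font=ArialBold;size=24;fgcolor=%23ffffff;bgcolor=%23000000;"
    "rotate=90;text=This+Is+The+Text"
)
print(args["text"])     # This Is The Text
print(args["fgcolor"])  # #ffffff
```

All values come back as strings; validating them (numeric size, known
font, sane dimensions) would be a separate step.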

*) Configuration Files

Let's assume that we are going to scale an image, add borders made up
of other images, and add text captions over the image. This would
result in such a long URI that browsers would probably truncate it. In
addition, exposing every option directly to the browser opens up
possibilities for DoS attacks. Therefore, a configuration file should be
used. The config file must be flexible enough to let a web page supply
various inputs, while fixing certain defaults and restricting what can
be changed.

The proposed solution would be to have a config file that has preset
default templates that the input arguments augment. For instance:

<genimage>
  <preset name="thumbnail-image">
    <image>
      <style> 
        border-style: solid;
        border-color: #000000;
        border-width: 1px;
        shadow-distance: 5px;
        shadow-angle: 270; <!-- or something like 1.2rad -->
        shadow-color: #000000;
      </style>
      <content name="src"/>
    </image>
    <image>
      <style href="/css/watermark.css"/>
      <!-- The above-referenced file has the following contents:
        opacity: 80%;
        position: top;
      -->
      <content>/images/watermark.gif</content>
    </image>
    <text>
      <style>
        font-face: Arial;
        font-size: 10px;
        color: #000000;
        border-style: solid;
        border-color: #ffffff;
        opacity: 80%;
        text-align: right;
      </style>
      <content>Copyright &#169; 2002 Foo Bar Industries</content>
    </text>
    <text>
      <style>
        font-face: Arial;
        font-size: 14px;
        color: #ffffff;
        text-align: left;
        position: top;
      </style>
      <content name="date"/>
    </text>
  </preset>
</genimage>

As you can see, the above configuration file uses CSS syntax for its
style rules. It makes sense to leverage that, although I'm not certain
how difficult it would be to interface with CSS files. As far as I know,
there are Perl CSS parsers, but I have yet to use them. The elements of
a preset are layered: the earlier an element is defined, the lower its
layer. The really important part here is the "name" attribute of an
element, as this marks where input can be supplied. The above preset
could be used by invoking the following URI:

  http://localhost/genImage/preset=thumbnail-image;src=/images/ducks.jpg

As you can see, the preset is invoked by passing its name as an
argument, and the value of any element that has a name attribute can be
provided in the URI. If an element has both a value and a name
attribute, the value in the config file serves as the default.
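
To make the override rule concrete, here is a minimal sketch of how a
preset might be resolved against the URI arguments. It is in Python
purely for illustration (the handler itself is mod_perl); the function
name resolve_preset and the cut-down sample preset are made up, and the
<style> elements are left out entirely.

```python
import xml.etree.ElementTree as ET

# Cut-down, hypothetical preset: one input slot, one slot with a default,
# and one fixed caption that the URI cannot change.
PRESET_XML = """
<genimage>
  <preset name="thumbnail-image">
    <image><content name="src"/></image>
    <text><content name="date">undated</content></text>
    <text><content>Fixed caption</content></text>
  </preset>
</genimage>
"""

def resolve_preset(config_xml, args):
    """Return the layers of the requested preset, bottom layer first."""
    root = ET.fromstring(config_xml)
    wanted = args["preset"]
    preset = next(
        (p for p in root.findall("preset") if p.get("name") == wanted), None
    )
    if preset is None:
        raise KeyError(f"unknown preset: {wanted}")
    layers = []
    for layer in preset:  # document order = stacking order
        content = layer.find("content")
        if content is None:
            continue
        name = content.get("name")
        default = (content.text or "").strip()
        # Only a named element may be overridden from the URI.
        value = args.get(name, default) if name else default
        if not value:
            raise ValueError(f"no value supplied for {name!r}")
        layers.append((layer.tag, value))
    return layers

print(resolve_preset(PRESET_XML, {"preset": "thumbnail-image",
                                  "src": "/images/ducks.jpg"}))
# [('image', '/images/ducks.jpg'), ('text', 'undated'), ('text', 'Fixed caption')]
```

Because unnamed elements ignore the URI, a request cannot replace the
fixed caption, which is the kind of restriction the config file is
meant to provide.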

*) Caching Schemes

A caching scheme similar to AxKit's could be used. The current module
takes all the input arguments (including the values that were not
provided, for completeness), sorts them, and takes their MD5 checksum.
That becomes the image's filename on the system. The file is placed in a
temporary directory, and for any further request to the same URI it is
pulled from the filesystem without regenerating the image. Further, the
code has been blatantly ripped off from AxKit, which splits the cache
directory into two sub-levels to avoid the performance problems of
having too many files in one directory.
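
The scheme just described can be sketched roughly as follows. This is
an illustration in Python, not the module's actual Perl code: the
DEFAULTS table, the function name cache_path and the choice of the
first four hex digits for the two directory levels are all assumptions,
and AxKit's exact layout may differ.

```python
import hashlib
import os

# Hypothetical: every argument the handler understands, with its default.
DEFAULTS = {"font": "Arial", "size": "12", "fgcolor": "#000000",
            "bgcolor": "#ffffff", "rotate": "0", "text": ""}

def cache_path(cache_dir, args):
    """Map a set of request arguments to a file path under cache_dir."""
    full = {**DEFAULTS, **args}  # fill in values that were not provided
    # Sorting makes the key independent of the order used in the URI.
    canonical = ";".join(f"{k}={full[k]}" for k in sorted(full))
    digest = hashlib.md5(canonical.encode("utf-8")).hexdigest()
    # Two directory levels taken from the digest keep each directory small.
    return os.path.join(cache_dir, digest[0:2], digest[2:4], digest)

a = cache_path("/tmp/genimage", {"text": "Hi", "size": "24"})
b = cache_path("/tmp/genimage", {"size": "24", "text": "Hi"})
print(a == b)  # True
```

Filling in the defaults before hashing means that a request which
spells out a default value and one which omits it share one cache
entry.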

Note: To keep the filesystem from filling up, for instance due to a DoS
attack, it may be prudent to have a cron job periodically cull the files
with the oldest access times.
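
Such a cleanup job might look roughly like the sketch below, which
deletes the least recently accessed files until the cache is back under
a size limit. It is in Python purely for illustration; the function
name cull_cache and the max_bytes limit are assumptions rather than
part of the current module. It also relies on access times being
recorded, which is not the case on filesystems mounted with "noatime".

```python
import os

def cull_cache(cache_dir, max_bytes):
    """Delete least-recently-accessed files until the cache fits in max_bytes."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(cache_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            entries.append((st.st_atime, st.st_size, path))
    total = sum(size for _atime, size, _path in entries)
    removed = []
    for _atime, size, path in sorted(entries):  # oldest access time first
        if total <= max_bytes:
            break
        os.remove(path)
        total -= size
        removed.append(path)
    return removed
```

Run from cron, this only needs the cache directory and a limit, for
example cull_cache("/tmp/genimage", 500 * 1024 * 1024).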

*) Image Manipulation Modules

My current code uses GD for text writing, and I'm quite happy with it.
It is extremely fast, and creates nice text output when compiled with a
TTF font engine. Looking forward, however, it may not be as suitable if
things like drop shadows are to be done. GD can work with multiple
images, can resize them, etc., but whether it can handle the more
advanced features is still unknown.

*) File Expiration Headers and Browser Caching

With my current code, it seems that browsers are reluctant to cache
these dynamically generated images. I have sent Expires: headers to
tell the browser to cache the file for a long period of time (2+
weeks), but without success. I know the caching headers are complex,
and that more than one simple header is needed, but fixing this has
moved to the back burner of my project. However, if more complicated
processing is to be done, and with more images, it will be crucial to
make browsers cache these images.
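
As one possibility (not a diagnosis of the problem above), a common
combination is to send Last-Modified and Cache-Control alongside
Expires, and to answer conditional requests with "304 Not Modified".
The sketch below is in Python purely for illustration; the function
names and the two-week lifetime constant are assumptions, and wiring it
into the mod_perl handler is left out.

```python
from email.utils import formatdate, parsedate_to_datetime

TWO_WEEKS = 14 * 24 * 60 * 60  # cache lifetime in seconds

def cache_headers(file_mtime, now):
    """Response headers for a cached image; times are Unix timestamps."""
    return {
        "Last-Modified": formatdate(file_mtime, usegmt=True),
        "Expires": formatdate(now + TWO_WEEKS, usegmt=True),
        "Cache-Control": f"public, max-age={TWO_WEEKS}",
    }

def is_not_modified(if_modified_since, file_mtime):
    """True if the client's copy is current and a 304 can be sent instead."""
    if not if_modified_since:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since).timestamp()
    except (TypeError, ValueError):
        return False  # unparsable date: just send the image
    return int(file_mtime) <= since

print(cache_headers(0, 0)["Expires"])  # Thu, 15 Jan 1970 00:00:00 GMT
```

Here file_mtime would be the modification time of the cached file, and
if_modified_since the value of the request's If-Modified-Since header,
if there is one.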

*) Current and Future Uses

Currently, the few production sites running my TTF text image code
perform quite well, if a little slowly when loading toolbar images and
the like; this should improve with proper HTTP caching headers. The
code has proven very convenient. I build sites for customers using XML and
XSL, with an XML configuration file describing the "sitemap" layout.
This way, a document can have a tag indicating which sitemap location it
occupies, and breadcrumb trails, sidebar sitemap trees, title bar text
and page titles all can be displayed without any per-page editing. The
addition of the TTF image code is wonderful, because now I can even have
section titles, often placed in toolbars or headers, generated
dynamically.

I have recently released version 1.0 of an all-XML content management
system (although it is not yet available for public use...sorry), and
this means people can edit their entire site, not just plain text. They
can completely change their sitemap structure interactively, and all
site images relating to the sitemap automagically change.

Looking forward, I would like to be able to use this for more than just
text. I'd like my customers to be able to select an image, set the
maximum size, check a few boxes or radio buttons to say what formatting
options they'd like, and the image is automagically resized without
wasting disk space or taking a long time to download. It also saves me,
as a web developer, from having to mess around with adding borders, drop
shadows, and other menial changes to an image every time the customer
gets it into their head to add a new picture.

I'm sure there are plenty of other uses for this, but I'll leave it at
this.

I have another, similar image module, which generates web-based graphs
using the same sort of URI structure, but it is severely limited. It is
a component of my content management system used to show a user how much
space of their quota is used (3D pie chart).

*) Summary

I have to make some subtle feature additions to this code within the
next few weeks (months?), and instead of just performing a knee-jerk
reaction and hacking some cruft onto my existing module, I wanted to
open this up to the mod_perl developer community as a whole, to see if
this is something others can use. If you like the idea, good for you. If
you have constructive suggestions, I'd like to hear them (no "Images
suck, use Lynx" comments please). If you would like to help, have the
time to help, and have the skills necessary, please let me know because
many eyes make all bugs shallow.

-- 

-man
Michael A Nachbaur

PGP Public Key at http://www.nachbaur.com/pgpkey.asc
PGP Key fingerprint = 83DC 7C3A 3084 6A21 9A3F  801E D974 AFB4 BFD7 2B6F
