Hello. This is my Hello message. :-)

I've recently become aware of gnu-pdf, and so I wanted to make the project aware of my open source PDF inspection/manipulation software, qpdf. qpdf is released under the terms of version 2.0 of the Artistic license, but I would be supportive of inclusion of any of its code in gnu-pdf, or of use of any of its code as a source of ideas for gnu-pdf. qpdf is written in C++ and certainly has a different underlying architecture, so most of the code probably won't "drop in" in its present form, but it may still be useful.

qpdf is a library whose focus is structural reorganization of PDF files. It is also something of a PDF hacker's toolkit, which is probably where it will be most useful to people working on gnu-pdf. It has a PDF structure checker and can also do a number of content-preserving transformations. Here is a partial list:

 * Linearization
 * Conversion to or from object streams
 * Encryption/Decryption (R=2,3,4 including AES)

One of the features of qpdf that I find most useful (and the original reason I wrote the software) is what I call "QDF" form. This is a form intended to help PDF experts look at and work with PDF files in an ordinary text editor. QDF files are fully valid PDF files that are laid out in a particular way and have some extra comments in them that help with reconstruction of the cross reference table (or stream) after the file is manually edited. When qpdf writes QDF files, it also uncompresses all streams that it knows how to uncompress.

There's a companion perl script called "fix-qdf" that reads a QDF file as input and writes a new one as output with a corrected cross reference table and, if object streams are in use, corrected offset tables at the beginnings of the object streams. It also fixes all stream lengths. This makes it possible to generate a QDF file, hack away at the content streams or other structures, repair the damage, and convert back to a normal PDF file with compressed content streams, etc. It can be a big help when learning about PDF or when experimenting with ideas that you may want to build into the code.

QPDF also has the ability to automatically recover from several common forms of damage to PDF files, and when it can't recover, it gives detailed developer-oriented error messages that can help you manually recover broken files.

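To give a rough idea of the kind of bookkeeping involved in repairing a cross reference table after hand editing, here is a minimal Python sketch. It is not fix-qdf's actual code (fix-qdf is a perl script and relies on the extra comments qpdf writes into QDF files). The function names find_object_offsets and build_xref_table are hypothetical, and the sketch assumes LF line endings, objects numbered contiguously from 1, and no stream data that happens to contain a line looking like an object header.

```python
import re

# Matches an indirect object header at the start of a line, e.g. b"12 0 obj".
# Assumption of this sketch: lines end with LF, so "^" finds every header.
OBJ_HEADER = re.compile(rb"^(\d+) (\d+) obj\b", re.MULTILINE)


def find_object_offsets(data):
    """Return {object number: (byte offset, generation)} for each object header."""
    offsets = {}
    for match in OBJ_HEADER.finditer(data):
        obj_num = int(match.group(1))
        generation = int(match.group(2))
        offsets[obj_num] = (match.start(), generation)
    return offsets


def build_xref_table(data):
    """Build a classic (non-stream) cross reference table for the objects in data."""
    offsets = find_object_offsets(data)
    size = len(offsets) + 1  # object 0 plus objects 1..N
    if sorted(offsets) != list(range(1, size)):
        # Keep the sketch simple: no free-list handling for missing objects.
        raise ValueError("objects must be numbered contiguously from 1")

    lines = [b"xref\n", b"0 %d\n" % size]
    # Entry 0 is always the head of the free list, with generation 65535.
    lines.append(b"0000000000 65535 f \n")
    for obj_num in range(1, size):
        offset, generation = offsets[obj_num]
        # Each entry is exactly 20 bytes: 10-digit offset, 5-digit generation,
        # the in-use marker "n", and a two-byte end of line (space + LF).
        lines.append(b"%010d %05d n \n" % (offset, generation))
    return b"".join(lines)
```

A complete repair would also have to update the startxref offset, stream lengths, and object stream offset tables, which this sketch leaves out.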
I have manually rescued many PDF files that couldn't be opened by any PDF software, and qpdf's automatic recovery often works better than Adobe Reader's automatic recovery, though I'm sure there are also cases where other readers will do a better job.

QPDF can decode the following filters: Flate, LZW, ASCII85, ASCIIHex. It can also encode with Flate. The Flate decode filter is implemented using zlib. The others are hand-coded inside QPDF. QPDF uses a simple pipeline system I put together for these and also for encryption, which makes it pretty easy to work with chains of filters.

If you're interested in more information, I encourage you to download qpdf and look at its documentation or read the comments in the public header files. You can also find a documentation link on qpdf's main website, http://qpdf.sourceforge.net.

For those of you using Debian GNU/Linux (or Ubuntu), qpdf is available in the archive. I just uploaded the newly-released 2.1 on Friday, so for now that version can only be found in Debian unstable. The older 2.0.6 release is in Debian Lenny and in the current Ubuntu release. It doesn't support R=4 encryption and its recovery features are not quite as good, but it has the other features I mentioned.

Unfortunately, I doubt that I will have much time available to contribute code to the GNU PDF project, at least for now, but as someone who is pretty familiar with the PDF specification (at least at the structural level), I may lurk on the list and chime in when I feel that I have something to offer.

Again, I encourage you to make use of qpdf in whatever way you can, whether by taking code or ideas from it, or by just using it as a tool to help you look at and experiment with PDF files. Whether you use qpdf or not, I wish everyone the best of success on the GNU PDF project. I think it's an important contribution to the overall suite of available tools.

-- 
Jay Berkenbilt <[email protected]>
