At 02:07 PM 3/7/2006, Petter Nyström wrote:
I hope this is correct. Writing these bytes to a file will however not produce anything useful.

        That is correct, in many cases.

Images in PDF files are either in JPEG/JFIF format OR they are a raw "array of pixels". In the first case, you can indeed just "dump the data to disk" and get a usable image. In the second, you will need to convert it to some standard image format.


None of my image viewing programs will understand it. Neither would ppmtojpeg or a host of other conversion programs that I've tried on it. :p

ImageMagick will understand it as a raw bitmap format if you provide the necessary values (colorspace, etc.), but that won't help your users.


So there I am. Reading the dictionary of the PRStream I learn that the colorspace is of the type /DeviceCMYK. And while I think I have a (very) basic understanding of what that means, I am far from ready to manipulate the byte stream into a recognizable image format myself!

Yes, you have a LOT to learn to get from PDF image data to usable image formats. You need to learn about colorspaces (and conversion thereof), bit depths, indexed vs. "true color", etc.


I feel I am moving outside the scope of iText here...

Somewhat, but as long as it is PDF related we can help you for a bit...


But could someone give any pointers to what tools exist, both within and outside iText, for dealing with a byte stream of this type? I want this stream turned into some sort of standard image format to work with.

        So you'll need the following components.

1) Image buffer management
2) Colorspace handling and/or conversion
3) File format writing
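
As a rough illustration of how the three components fit together, here is a minimal sketch that uses only the JDK (BufferedImage for the buffer, a hand-written conversion for the colorspace, ImageIO for the file format). It is not part of iText. The class and method names are made up for this example, and it assumes the stream has already been decoded to 8-bit /DeviceCMYK samples (four bytes per pixel, rows from top to bottom), with the width and height taken from the stream dictionary. The conversion is the naive formula without ICC profiles, so the colors will only be approximate.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not an iText class.
public class CmykToPng {

    /**
     * Converts decoded 8-bit DeviceCMYK samples (C, M, Y, K per pixel,
     * row by row from the top) into an RGB image.
     * Assumption: naive conversion, no ICC profile, no /Decode array.
     */
    public static BufferedImage cmykToRgb(byte[] cmyk, int width, int height) {
        if (cmyk.length < width * height * 4) {
            throw new IllegalArgumentException("not enough sample data");
        }
        // 1) Image buffer management
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        int i = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int c = cmyk[i++] & 0xFF;
                int m = cmyk[i++] & 0xFF;
                int ye = cmyk[i++] & 0xFF;
                int k = cmyk[i++] & 0xFF;
                // 2) Colorspace conversion (naive CMYK to RGB)
                int r = (255 - c) * (255 - k) / 255;
                int g = (255 - m) * (255 - k) / 255;
                int b = (255 - ye) * (255 - k) / 255;
                img.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return img;
    }

    /** 3) File format writing: saves the image as a PNG file. */
    public static void writePng(BufferedImage img, File out) throws IOException {
        if (!ImageIO.write(img, "png", out)) {
            throw new IOException("no PNG writer available");
        }
    }
}
```

Other bit depths, indexed colorspaces, and ICC-based colorspaces each need their own handling; this sketch covers only the one case described above.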


Are there other troubles I should watch out for?

        Sure, a bunch.


Such as byte streams being encrypted, compressed, etc.?

Encryption is handled automatically by iText - PROVIDED that you have the rights to extract the data from the PDF. Remember, not all authors grant such rights.

Decompression will be handled by iText as well - BUT you DO NOT always want to do it. Images compressed with DCTDecode (aka JPEG) should be treated as UNCOMPRESSED: leave the stream bytes as they are, since they already form a usable JPEG.
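
To make that decision concrete, here is a small sketch in plain Java of how you might choose what to do based on the stream's /Filter entry. The class, enum, and method names are invented for this example and are not iText API; it assumes you have already read the filter names from the stream dictionary into a list, in the order they appear (without the leading slash).

```java
import java.util.List;

// Hypothetical helper for illustration; not an iText class.
public class ImageStreamPlan {

    public enum Action {
        RAW_PIXELS,        // no filter: the bytes are already an array of pixels
        JPEG_AS_IS,        // only DCTDecode: write the undecoded bytes to a .jpg file
        UNWRAP_THEN_JPEG,  // other filters wrap a JPEG: undo those, keep the JPEG data
        DECODE_TO_PIXELS   // no JPEG involved: decode fully, then convert the pixels
    }

    /** Picks an action from the filter names listed in the /Filter entry. */
    public static Action plan(List<String> filters) {
        if (filters.isEmpty()) {
            return Action.RAW_PIXELS;
        }
        // Filters are undone in listed order, so the image encoding comes last.
        boolean jpegLast = "DCTDecode".equals(filters.get(filters.size() - 1));
        if (jpegLast) {
            return filters.size() == 1 ? Action.JPEG_AS_IS : Action.UNWRAP_THEN_JPEG;
        }
        return Action.DECODE_TO_PIXELS;
    }
}
```

For JPEG_AS_IS you would ask iText for the stream bytes without decoding them (the iText releases I know of offer PdfReader.getStreamBytesRaw for this, alongside PdfReader.getStreamBytes for the decoded form; check the version you are using).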


Leonard

---------------------------------------------------------------------------
Leonard Rosenthol                            <mailto:[EMAIL PROTECTED]>
Chief Technical Officer                      <http://www.pdfsages.com>
PDF Sages, Inc.                              215-938-7080 (voice)
                                             215-938-0880 (fax)



_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions
