Yes, AI beginning with version 10 is PDF. That's what brought me to PDFBox in the first place ... 3 years ago I think.
What is the PDFDebugger tool you're using? That sounds like a tremendous
help!

FlateDecode with predictor 15 ... right I see that.

The "getColorSpace() returned NULL" is from
PDXObjectImage.getColorSpace(). There's a branch there for a null CS
when the filter is CCITTFAX_DECODE. In that case it assumes
PDDeviceGray. I tried the same thing, but just got exceptions so thought
I was headed down the wrong road.

Do you have any ideas about what should be there?

Thanks!

Daniel Wilson

On Thu, Apr 16, 2009 at 3:04 AM, Jeremias Maerki <d...@jeremias-maerki.ch> wrote:

> Hehe, I didn't make the link that a *.ai file could also be a PDF file.
> So now I know where I have to look. I learned something myself just now.
>
> So, I opened ArchiveRGB.ai with the PDFDebugger and took a look at the
> image contained in the file. There are two images, both of which are
> 1bit black/white images which are configured as image masks (which I
> find strange). The images may well come from a TIFF file originally but
> what ends up in the PDF has nothing to do with TIFF. The compression
> used is not even T.4 or T.6 (the most common compression scheme for
> bi-level images), the FlateDecode filter with predictor 15 (PNG optimum)
> is used. TIFF 6.0 doesn't even support that compression type. So Adobe
> Illustrator loaded the TIFF file but put it in the PDF as something else.
> Predictor 15 should be supported by PDFBox if I interpret the source
> code correctly. So the problem to display that PDF probably doesn't lie
> with decompressing the image but with painting it. At any rate, I
> suggest you should not get yourself distracted by the fact the original
> image might have been a TIFF once. Another problem (and probably the key)
> will be the "getColorSpace() returned NULL" error on the log. The two
> images seem to be using a Separation color space, which means that the
> black pixels in the bi-level images are not painted in black but in the
> color specified in the separation color space.
> I'd start looking in PDPixelMap and why it gets null as the color
> space.
>
> HTH
>
> On 16.04.2009 02:31:19 Daniel Wilson wrote:
> > Thanks for the explanation, Jeremias, but I'm not sure you're correct.
> >
> > Here's what the customer who created the artwork told me:
> >
> > Daniel,
> > To your question below, are you asking "what process is used" for the
> > background color that you are displaying as bright lime green? That
> > is a normal CMYK spot color as far as the background goes. The next
> > layer is the <text> that is a dark green color and on top of it is a
> > tiff scan to give it that faded look. Then the final layer is the
> > text in white.
> > Matt
> >
> > That helps I think, Matt.
> >
> > I think the "faded look" is what I'm having trouble with. I'm seeing a
> > faded look on another one that almost looks like water droplets or
> > something.
> >
> > Is that also a tiff scan?
> >
> > On that tiff scan, are you using that tiff semi-transparently? I think
> > I'm either not using the tiff, setting it to fully transparent so it's
> > not seen at all, or moving it behind other layers.
> >
> > Thanks for the help.
> >
> > Daniel Wilson
> >
> > Yes I believe we have a scan that looks like water droplets or
> > something similar, we have about every type of scan possible. Yes I'm
> > sure the scan has to be transparent to give it that effect.
> >
> > So ... they claim they are putting a TIFF into the PDF file. And ...
> > isn't that what PDInlinedImage and PDXObjectImage are all about?
> >
> > I have gotten permission and posted another example (red instead of
> > green) in the trunk\test\input\rendering folder as ArchiveRGB.ai.
> >
> > Thanks!
> >
> > Daniel Wilson
> >
> >
> > On Wed, Apr 15, 2009 at 4:53 PM, Jeremias Maerki <d...@jeremias-maerki.ch> wrote:
> >
> > > Daniel, I think there's a misunderstanding. PDF doesn't contain TIFF
> > > files. And not PNGs.
> > > In a way you could say PDFs can contain JPEGs as the DCTDecode
> > > filter handles practically the same as a raw JPEG file. In FOP we
> > > can basically embed a baseline JPEG file 1:1 without decompression
> > > in a PDF. But the same is not true for TIFF and PNG. What PDF uses
> > > are the PNG predictors to increase image compression over plain
> > > deflate. But that's not the same as PNG. I've tried to embed PNGs in
> > > PDFs without decompressing them in FOP and I didn't manage for some
> > > reason.
> > >
> > > By transparent TIFF, you mean black/white 1bit images. Is that right?
> > > When you're talking about TIFF, are you not rather talking about the
> > > CCITTFaxDecode filter which uses the compression algorithms defined in
> > > the ITU T.4 and T.6 specifications (CCITT Fax Group 3 and 4)? Like PDF,
> > > TIFF uses those algorithms, but that's not the same as embedding TIFF.
> > > In FOP, we can transfer CCITT encoded image data extracted from TIFF
> > > into PDFs without decompression, much like JPEG data.
> > >
> > > I've just had a closer look at CCITTFaxDecodeFilter in PDFBox. If I
> > > interpret the code correctly, it actually just embeds the stream data
> > > in a TIFF wrapper which is loaded (now by ImageIO?) somewhere else.
> > > Not what I expected. This was probably a work-around to make use of
> > > JAI's codec for this kind of image. I guess what you're really
> > > looking for is a decompressor (and eventually a compressor) for ITU
> > > T.4 and T.6.
> > >
> > > A suitably licensed decompressor can be found in Apache XML Graphics
> > > Commons:
> > >
> > > http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/codec/tiff/TIFFFaxDecoder.java?view=markup
> > >
> > > This could be integrated into an implementation of the CCITTFaxDecode
> > > filter in PDFBox.
> > >
> > > For the compression side, I'm currently working on a T.4/T.6 compressor
> > > but that's not finished, yet.
> > > I need that for FOP's PDF, PS, AFP and TIFF output. I'm implementing
> > > it as an OutputStream subclass so it can easily be integrated in
> > > Sanselan, PDFBox or whatever. The only two problems left are finding
> > > the right place to put it in the end and for me to find time to
> > > finish it.
> > >
> > > BTW, Sanselan doesn't have a CCITT/T.4/T.6 implementation, yet, so it
> > > won't be a help right now.
> > >
> > > So if I got this right, a full TIFF codec is only needed if you wanted
> > > PDFBox to be able to read TIFFs when embedding them in a new PDF or
> > > when you extract bitmaps from a PDF and want to save them as external
> > > image files. For PDF viewing, you only need the T.4/T.6 decompressor.
> > >
> > > I hope I'm making sense.
> > >
> > > On 15.04.2009 21:52:25 Daniel Wilson wrote:
> > > > Some PDFs have transparent TIFFs in them.
> > > >
> > > > They come into the PDXObject arena ... as a PDPixelMap. But there
> > > > we are best prepared to handle JPEGs and PNGs.
> > > >
> > > > Most sources on rendering a TIFF in Java say to use JAI. Someone
> > > > (Jukka I think) went to a good deal of trouble to excise JAI from
> > > > PDFBox due to licensing restrictions.
> > > >
> > > > Lizardworks' TIFF library is about 10 years old, lacks Deflate
> > > > decompression, and is licensed under the Library GPL.
> > > > http://www.lizardworks.com/libs.html
> > > >
> > > > So I don't think it is an option.
> > > >
> > > > The TIFF spec is 121 pages long in its own right.
> > > > http://partners.adobe.com/public/developer/en/tiff/TIFF6.pdf
> > > > That's a lot simpler than PDF, but doing our own implementation
> > > > would be a non-trivial undertaking.
> > > >
> > > > Any ideas on how to proceed?
> > > >
> > > > Thanks.
> > > >
> > > > Daniel Wilson
> > > >
> > >
> > > Jeremias Maerki
> >
> Jeremias Maerki
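
The thread mentions FlateDecode with predictor 15 and PNG predictors being applied on top of plain deflate. As a rough illustration of what undoing such a predictor involves, here is a minimal Java sketch. It is not PDFBox's actual code, and the class and method names are hypothetical. It assumes the stream has already been inflated and that every row begins with a PNG filter-type byte (0 to 4).

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration only; not part of PDFBox.
final class PngPredictorSketch {

    // Undoes PNG row filtering on already-inflated data.
    // Each input row is one filter-type byte followed by rowBytes filtered bytes.
    static byte[] undoPngPredictor(byte[] data, int rowBytes, int bytesPerPixel) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] prior = new byte[rowBytes]; // previous decoded row, zeros for the first row
        byte[] cur = new byte[rowBytes];   // row currently being decoded
        int pos = 0;
        while (pos + 1 + rowBytes <= data.length) {
            int filter = data[pos++] & 0xFF;
            for (int i = 0; i < rowBytes; i++) {
                int raw = data[pos + i] & 0xFF;
                int left = i >= bytesPerPixel ? cur[i - bytesPerPixel] & 0xFF : 0;
                int up = prior[i] & 0xFF;
                int upLeft = i >= bytesPerPixel ? prior[i - bytesPerPixel] & 0xFF : 0;
                int predicted;
                switch (filter) {
                    case 0: predicted = 0; break;                        // None
                    case 1: predicted = left; break;                     // Sub
                    case 2: predicted = up; break;                       // Up
                    case 3: predicted = (left + up) / 2; break;          // Average
                    case 4: predicted = paeth(left, up, upLeft); break;  // Paeth
                    default:
                        throw new IllegalArgumentException("Unknown PNG filter type " + filter);
                }
                cur[i] = (byte) (raw + predicted); // addition is modulo 256
            }
            out.write(cur, 0, rowBytes);
            pos += rowBytes;
            // The row just decoded becomes the prior row for the next iteration.
            byte[] tmp = prior;
            prior = cur;
            cur = tmp;
        }
        return out.toByteArray();
    }

    // Paeth predictor as defined in the PNG specification.
    static int paeth(int a, int b, int c) {
        int p = a + b - c;
        int pa = Math.abs(p - a);
        int pb = Math.abs(p - b);
        int pc = Math.abs(p - c);
        if (pa <= pb && pa <= pc) {
            return a;
        }
        return pb <= pc ? b : c;
    }
}
```

For a 1-bit image such as the ones discussed above, bytesPerPixel would be 1 and rowBytes would be the image width in bits rounded up to whole bytes.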