Re: PDF's, TIFF's and JAI, oh my!

Daniel Wilson Thu, 16 Apr 2009 04:25:58 -0700

Yes, AI beginning with version 10 is PDF.  That's what brought me to PDFBox
in the first place ... 3 years ago I think.


What is the PDFDebugger tool you're using?  That sounds like a tremendous
help!

FlateDecode with predictor 15 ... right, I see that.  The "getColorSpace()
returned NULL" message comes from PDXObjectImage.getColorSpace().  There's
a branch there for a null color space when the filter is CCITTFAX_DECODE;
in that case it assumes PDDeviceGray.  I tried the same fallback, but just
got exceptions, so I thought I was headed down the wrong road.
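
For what it's worth, a PDF image flagged as an image mask (/ImageMask true)
carries no /ColorSpace entry of its own; it acts as a 1-bit stencil through
which the current fill color is painted. That may be why a null color space
shows up here. Below is a minimal sketch of that painting step in plain Java.
The names StencilMaskSketch and paintMask are hypothetical and not part of
PDFBox, and the paint color is assumed to have been resolved elsewhere (for
example from the Separation color space).

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

// Hypothetical helper, not PDFBox API: paints a 1-bit stencil mask.
class StencilMaskSketch {

    /**
     * bits   - decoded mask data, rows padded to whole bytes, MSB first
     * paint  - color to apply where the mask marks the page (assumed opaque)
     * invert - false for the default Decode [0 1], where sample 0 is painted
     *          and sample 1 leaves the page untouched; true for Decode [1 0]
     */
    static BufferedImage paintMask(byte[] bits, int width, int height,
                                   Color paint, boolean invert) {
        BufferedImage out =
                new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        int rowBytes = (width + 7) / 8;           // each row starts on a byte boundary
        int argb = paint.getRGB() | 0xFF000000;   // force full opacity
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int b = bits[y * rowBytes + x / 8] & 0xFF;
                int sample = (b >> (7 - (x % 8))) & 1;
                boolean marked = invert ? (sample == 1) : (sample == 0);
                // Unmarked pixels stay fully transparent so lower layers show through.
                out.setRGB(x, y, marked ? argb : 0x00000000);
            }
        }
        return out;
    }
}
```

The resulting ARGB image could then be drawn over the page with
Graphics2D.drawImage, so the layers underneath stay visible wherever the
mask is not set.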

Do you have any ideas about what should be there?

Thanks!

Daniel Wilson

On Thu, Apr 16, 2009 at 3:04 AM, Jeremias Maerki <d...@jeremias-maerki.ch> wrote:

> Hehe, I didn't make the link that a *.ai file could also be a PDF file.
> So now I know where I have to look. I learned something myself just now.
>
> So, I opened ArchiveRGB.ai with the PDFDebugger and took a look at the
> image contained in the file. There are two images, both of which are
> 1bit black/white images which are configured as image masks (which I
> find strange). The images may well come from a TIFF file originally but
> what ends up in the PDF has nothing to do with TIFF. The compression
> used is not even T.4 or T.6 (the most common compression schemes for
> bi-level images); instead, the FlateDecode filter with predictor 15
> (PNG optimum) is used. TIFF 6.0 doesn't even support that compression
> type. So Adobe Illustrator loaded the TIFF file but put it in the PDF as
> something else.
>
> Predictor 15 should be supported by PDFBox if I interpret the source
> code correctly. So the problem displaying that PDF probably doesn't lie
> with decompressing the image but with painting it. At any rate, I
> suggest you not get distracted by the fact that the original image
> might have been a TIFF once. Another problem (and probably the key)
> will be the "getColorSpace() returned NULL" error in the log. The two
> images seem to be using a Separation color space, which means that the
> black pixels in the bi-level images are not painted in black but in the
> color specified in the separation color space. I'd start by looking in
> PDPixelMap to see why it gets null as the color space. HTH
>
> On 16.04.2009 02:31:19 Daniel Wilson wrote:
> > Thanks for the explanation, Jeremias, but I'm not sure you're correct.
> >
> > Here's what the customer who created the artwork told me:
> >
> > Daniel,
> > To your question below, are you asking "what process is used" for the
> > background color that you are displaying as bright lime green? That is a
> > normal CMYK spot color as far as the background goes. The next layer is
> > the <text> that is a dark green color and on top of it is a tiff scan to
> > give it that faded look. Then the final layer is the text in white.
> > Matt
> >
> > That helps I think, Matt.
> >
> > I think the "faded look" is what I'm having trouble with.  I'm seeing a
> > faded look on another one that almost looks like water droplets or
> > something.
> >
> > Is that also a tiff scan?
> >
> > On that tiff scan, are you using that tiff semi-transparently?  I think
> > I'm either not using the tiff, setting it to fully transparent so it's
> > not seen at all, or moving it behind other layers.
> >
> > Thanks for the help.
> >
> > Daniel Wilson
> >
> > Yes, I believe we have a scan that looks like water droplets or something
> > similar; we have about every type of scan possible. Yes, I'm sure the scan
> > has to be transparent to give it that effect.
> >
> > So ... they claim they are putting a TIFF into the PDF file.  And ...
> > isn't that what PDInlinedImage and PDXObjectImage are all about?
> >
> > I have gotten permission and posted another example (red instead of
> > green) in the trunk\test\input\rendering folder as ArchiveRGB.ai.
> >
> > Thanks!
> >
> > Daniel Wilson
> >
> >
> > On Wed, Apr 15, 2009 at 4:53 PM, Jeremias Maerki <d...@jeremias-maerki.ch> wrote:
> >
> > > Daniel, I think there's a misunderstanding. PDF doesn't contain TIFF
> > > files, nor PNGs. In a way you could say PDFs can contain JPEGs, as
> > > the DCTDecode filter handles practically the same data as a raw JPEG
> > > file. In FOP we can basically embed a baseline JPEG file 1:1 without
> > > decompression in a PDF. But the same is not true for TIFF and PNG. What
> > > PDF uses are the PNG predictors to increase image compression over
> > > plain deflate. But that's not the same as PNG. I've tried to embed PNGs
> > > in PDFs without decompressing them in FOP and I didn't manage for some
> > > reason.
> > >
> > > By transparent TIFF, you mean black/white 1bit images. Is that right?
> > > When you're talking about TIFF, are you not rather talking about the
> > > CCITTFaxDecode filter which uses the compression algorithms defined in
> > > the ITU T.4 and T.6 specifications (CCITT Fax Group 3 and 4)? Like PDF,
> > > TIFF uses those algorithms, but that's not the same as embedding TIFF.
> > > In FOP, we can transfer CCITT encoded image data extracted from TIFF
> > > into PDFs without decompression, much like JPEG data.
> > >
> > > I've just had a closer look at CCITTFaxDecodeFilter in PDFBox. If I
> > > interpret the code correctly, it actually just embeds the stream data
> > > in a TIFF wrapper which is loaded (now by ImageIO?) somewhere else. Not
> > > what I expected. This was probably a work-around to make use of JAI's
> > > codec for this kind of image. I guess what you're really looking for is
> > > a decompressor (and eventually a compressor) for ITU T.4 and T.6.
> > >
> > > A suitably licensed decompressor can be found in Apache XML Graphics
> > > Commons:
> > >
> > >
> > > http://svn.apache.org/viewvc/xmlgraphics/commons/trunk/src/java/org/apache/xmlgraphics/image/codec/tiff/TIFFFaxDecoder.java?view=markup
> > >
> > > This could be integrated into an implementation of the CCITTFaxDecode
> > > filter in PDFBox.
> > >
> > > For the compression side, I'm currently working on a T.4/T.6 compressor
> > > but that's not finished yet. I need that for FOP's PDF, PS, AFP and
> > > TIFF output. I'm implementing it as an OutputStream subclass so it can
> > > easily be integrated in Sanselan, PDFBox or whatever. The only two
> > > problems left are finding the right place to put it in the end and
> > > finding the time to finish it.
> > >
> > > BTW, Sanselan doesn't have a CCITT/T.4/T.6 implementation, yet, so it
> > > won't be a help right now.
> > >
> > > So if I got this right, a full TIFF codec is only needed if you wanted
> > > PDFBox to be able to read TIFFs when embedding them in a new PDF or
> > > when you extract bitmaps from a PDF and want to save them as external
> > > image files. For PDF viewing, you only need the T.4/T.6 decompressor.
> > >
> > > I hope I'm making sense.
> > >
> > > On 15.04.2009 21:52:25 Daniel Wilson wrote:
> > > > Some PDF's have transparent TIFF's in them.
> > > >
> > > > They come into the PDXObject arena ... as a PDPixelMap.  But there we
> > > > are best prepared to handle JPEG's and PNG's.
> > > >
> > > > Most sources on rendering a TIFF in Java say to use JAI.  Someone
> > > > (Jukka I think) went to a good deal of trouble to excise JAI from
> > > > PDFBox due to licensing restrictions.
> > > >
> > > > Lizardworks' TIFF library is about 10 years old, lacks Deflate
> > > > decompression, and is licensed under the Library GPL.
> > > > http://www.lizardworks.com/libs.html
> > > >
> > > > So I don't think it is an option.
> > > >
> > > > The TIFF spec is 121 pages long in its own right.
> > > > http://partners.adobe.com/public/developer/en/tiff/TIFF6.pdf
> > > > That's a lot simpler than PDF, but doing our own implementation would
> > > > be a non-trivial undertaking.
> > > >
> > > > Any ideas on how to proceed?
> > > >
> > > > Thanks.
> > > >
> > > > Daniel Wilson
> > >
> > >
> > >
> > >
> > > Jeremias Maerki
> > >
> > >
>
>
>
>
> Jeremias Maerki
>
>

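For reference, the quoted thread mentions FlateDecode with predictor 15 (PNG
optimum). With the PNG predictors, each row of the inflated data starts with
one tag byte selecting the filter used for that row (0 None, 1 Sub, 2 Up,
3 Average, 4 Paeth). A minimal sketch of undoing that step follows. The class
name PngPredictorSketch and its method signatures are hypothetical and this is
not PDFBox's own implementation; the columns, colors and bitsPerComponent
arguments are assumed to come from the stream's /DecodeParms.

```java
// Hypothetical helper, not PDFBox API: reverses PNG row filters
// on data that has already been inflated.
class PngPredictorSketch {

    static byte[] undo(byte[] data, int columns, int colors, int bitsPerComponent) {
        // Distance in bytes to the "left" neighbor; at least 1 for sub-byte pixels.
        int bpp = Math.max(1, (colors * bitsPerComponent + 7) / 8);
        int rowLen = (columns * colors * bitsPerComponent + 7) / 8;
        int rows = data.length / (rowLen + 1);    // +1 for the per-row tag byte
        byte[] out = new byte[rows * rowLen];

        for (int r = 0; r < rows; r++) {
            int in = r * (rowLen + 1);
            int filter = data[in] & 0xFF;
            int cur = r * rowLen;
            int prev = cur - rowLen;
            for (int i = 0; i < rowLen; i++) {
                int raw = data[in + 1 + i] & 0xFF;
                // a = left, b = above, c = above-left; 0 outside the image.
                int a = (i >= bpp) ? (out[cur + i - bpp] & 0xFF) : 0;
                int b = (r > 0) ? (out[prev + i] & 0xFF) : 0;
                int c = (r > 0 && i >= bpp) ? (out[prev + i - bpp] & 0xFF) : 0;
                int pred;
                switch (filter) {
                    case 0: pred = 0; break;               // None
                    case 1: pred = a; break;               // Sub
                    case 2: pred = b; break;               // Up
                    case 3: pred = (a + b) / 2; break;     // Average
                    case 4: pred = paeth(a, b, c); break;  // Paeth
                    default:
                        throw new IllegalArgumentException(
                                "Unknown PNG filter type " + filter);
                }
                out[cur + i] = (byte) (raw + pred);        // addition modulo 256
            }
        }
        return out;
    }

    // Picks whichever of a, b, c is closest to a + b - c, preferring a, then b.
    static int paeth(int a, int b, int c) {
        int p = a + b - c;
        int pa = Math.abs(p - a);
        int pb = Math.abs(p - b);
        int pc = Math.abs(p - c);
        if (pa <= pb && pa <= pc) return a;
        if (pb <= pc) return b;
        return c;
    }
}
```

For a 1-bit image, bpp works out to 1 and rowLen to (columns + 7) / 8, so the
filters operate on whole bytes of packed pixels.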