On Wed, 2009-07-15 at 23:52 +0100, Thach Tran wrote:
> Hi all,
>
> I'm doing some parsing on PDF's content stream with PoDoFo and came
> across this situation where the page content stream contains inline
> image. The binary image data lies between ID and EI keyword causes the
> tokenizer to choke (ePdfError_InvalidDataType is thrown). I know PDF
> files nowadays rarely contain inline image but still, I would like to
> know is there any ways to get around this problem (e.g. skip the
> binary data and keep on parsing). You can have a look at the sample
> PDF enclosed.

PdfContentsTokenizer will need to be enhanced to recognise the ID/EI
keywords and maintain internal state indicating whether or not it's
currently reading binary image data. This shouldn't be too tricky to
add - patches accepted ;-)

Inline images should be reasonably small (small enough, at least, not to
be a memory burden) so you should just be able to read the whole inline
image and return the image data in a PdfVariant from the
PdfContentsTokenizer::ReadNext call like usual. I think using the
internal variant type PdfData would be appropriate for the returned
data. I'd suggest adding a new value to the EPdfContentsType enum,
something like ePdfContentsType_ImageData, just to help the caller know
what they're getting, though the returned "ID" keyword should've been a
hint.

If you do implement this, PLEASE do so against svn trunk not against a
release branch, and send us a patch. If you don't know how to produce a
patch, send the modified file(s) containing ONLY the changes required to
implement inline images in PdfContentsTokenizer.

It'd be nice to use your PDF as a test case for the class, too. OK by
you?

--
Craig Ringer
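
The ID/EI handling described above could look roughly like the sketch
below. It is a standalone toy tokenizer, not PoDoFo code: the names
TokenType, Token, Tokenize and FindEndOfImage are hypothetical, and
TokenType::ImageData only stands in for the proposed
ePdfContentsType_ImageData value. It assumes tokens are separated by
whitespace only (no string or array parsing), and it finds the end of
the image with a heuristic: the first "EI" that has whitespace before it
and whitespace or end of stream after it. Binary data can contain that
byte pattern by chance, so a real implementation would probably need
something more robust.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token kinds for this sketch only.
enum class TokenType { Keyword, Operand, ImageData };

struct Token {
    TokenType type;
    std::string text;  // keyword/operand text, or the raw image bytes
};

// The six whitespace characters defined for PDF syntax.
static bool IsPdfWhitespace(unsigned char c) {
    return c == 0 || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
}

// Heuristic: returns the index of the first "EI" at or after `from` that is
// preceded by whitespace and followed by whitespace or end of stream.
static std::size_t FindEndOfImage(const std::string& s, std::size_t from) {
    for (std::size_t i = from; i + 1 < s.size(); ++i) {
        if (s[i] != 'E' || s[i + 1] != 'I') continue;
        const bool wsBefore = i > 0 && IsPdfWhitespace(static_cast<unsigned char>(s[i - 1]));
        const bool wsAfter = i + 2 == s.size() ||
                             IsPdfWhitespace(static_cast<unsigned char>(s[i + 2]));
        if (wsBefore && wsAfter) return i;
    }
    return std::string::npos;
}

// Splits a content stream into tokens. After an "ID" keyword the tokenizer
// switches to binary mode and emits the image bytes as a single token.
std::vector<Token> Tokenize(const std::string& s) {
    std::vector<Token> out;
    std::size_t pos = 0;
    while (pos < s.size()) {
        if (IsPdfWhitespace(static_cast<unsigned char>(s[pos]))) { ++pos; continue; }
        const std::size_t start = pos;
        while (pos < s.size() && !IsPdfWhitespace(static_cast<unsigned char>(s[pos]))) ++pos;
        const std::string word = s.substr(start, pos - start);
        assert(!word.empty());
        // Toy classification: anything starting with a letter is a keyword.
        const bool isKeyword = std::isalpha(static_cast<unsigned char>(word[0])) != 0;
        out.push_back({isKeyword ? TokenType::Keyword : TokenType::Operand, word});

        if (word == "ID") {
            if (pos < s.size()) ++pos;  // skip the single whitespace after ID
            const std::size_t ei = FindEndOfImage(s, pos);
            if (ei == std::string::npos)
                throw std::runtime_error("inline image data without EI");
            // Drop the whitespace separating the data from EI.
            const std::size_t dataEnd = ei > pos ? ei - 1 : pos;
            out.push_back({TokenType::ImageData, s.substr(pos, dataEnd - pos)});
            out.push_back({TokenType::Keyword, "EI"});
            pos = ei + 2;
        }
    }
    return out;
}
```

With this scheme a caller sees the "ID" keyword, then exactly one
ImageData token holding the whole image, then "EI", and parsing carries
on normally with whatever follows.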
