Hi,

I have quickly put together what you suggested earlier. The patch is attached to this message; feel free to give it a go. I'm really new to the PoDoFo code base, so mistakes are quite likely. Please send me some feedback if that's OK. As for using my sample PDF as a test case, I'm totally fine with it. I know it takes a bit of effort to find or generate a PDF that contains an inline image these days :-)
Cheers, Thach
inline-img.patch
Description: Binary data
On 20 Jul 2009, at 03:31, Craig Ringer wrote:
> On Wed, 2009-07-15 at 23:52 +0100, Thach Tran wrote:
>> Hi all,
>>
>> I'm doing some parsing on PDF's content stream with PoDoFo and came across this situation where the page content stream contains inline image. The binary image data lies between ID and EI keyword causes the tokenizer to choke (ePdfError_InvalidDataType is thrown). I know PDF files nowadays rarely contain inline image but still, I would like to know is there any ways to get around this problem (e.g. skip the binary data and keep on parsing). You can have a look at the sample PDF enclosed.
>
> PdfContentsTokenizer will need to be enhanced to recognise the ID/EI keywords and maintain internal state indicating whether or not it's currently reading binary image data. This shouldn't be too tricky to add - patches accepted ;-)
>
> Inline images should be reasonably small (small enough, at least, not to be a memory burden) so you should just be able to read the whole inline image and return the image data in a PdfVariant from the PdfContentsTokenizer::ReadNext call like usual. I think using the internal variant type PdfData would be appropriate for the returned data. I'd suggest adding a new value to the EPdfContentsType enum, something like ePdfContentsType_ImageData, just to help the caller know what they're getting, though the returned "ID" keyword should've been a hint.
>
> If you do implement this, PLEASE do so against svn trunk not against a release branch, and send us a patch. If you don't know how to produce a patch, send the modified file(s) containing ONLY the changes required to implement inline images in PdfContentsTokenizer.
>
> It'd be nice to use your PDF as a test case for the class, too. OK by you?
>
> -- Craig Ringer
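
For anyone following the thread without the attachment, here is a minimal sketch of the ID/EI handling discussed above. It is not the attached patch and it does not use any PoDoFo classes: InlineImageData, IsPdfWhitespace and ReadInlineImageData are hypothetical names made up for illustration. It assumes the whole content stream is already in memory, that a single whitespace byte follows ID, and that the data ends at the first "EI" delimited by whitespace (or end of stream) on both sides.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch (not a PoDoFo type).
struct InlineImageData {
    bool found;         // true if a terminating EI keyword was located
    std::string bytes;  // raw image bytes between ID and EI
    std::size_t next;   // offset of the first byte after "EI"
};

// The six whitespace bytes used by PDF syntax.
static bool IsPdfWhitespace(unsigned char c) {
    return c == 0x00 || c == 0x09 || c == 0x0A ||
           c == 0x0C || c == 0x0D || c == 0x20;
}

// Hypothetical helper: 'afterId' is the offset just past the "ID" keyword.
// Assumes exactly one whitespace byte separates "ID" from the image data.
InlineImageData ReadInlineImageData(const std::string& stream,
                                    std::size_t afterId) {
    assert(afterId <= stream.size());
    InlineImageData result{false, std::string(), afterId};
    if (afterId >= stream.size() || !IsPdfWhitespace(stream[afterId])) {
        return result;  // malformed: nothing usable after ID
    }
    const std::size_t begin = afterId + 1;
    // Look for <whitespace> 'E' 'I' followed by whitespace or end of stream.
    for (std::size_t i = begin; i + 2 < stream.size(); ++i) {
        if (!IsPdfWhitespace(stream[i])) continue;
        if (stream[i + 1] != 'E' || stream[i + 2] != 'I') continue;
        const std::size_t after = i + 3;
        if (after == stream.size() || IsPdfWhitespace(stream[after])) {
            result.found = true;
            result.bytes = stream.substr(begin, i - begin);
            result.next = after;
            return result;
        }
    }
    return result;  // no EI found; the caller should report an error
}
```

A tokenizer could call something like this right after returning the ID keyword, hand the bytes back to the caller as the image data, and resume normal tokenizing at `next`. Note that this is only a heuristic: binary data can contain a whitespace-delimited "EI" by chance, so a more careful implementation might also use the image dictionary (width, height, filter) to check the length.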
