I've done battle with the PDXObjectImage, but it has usually defeated me! Sections 4.7 and 4.8 of the PDF spec address it.
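
For what it's worth, the palette symptom in the quoted message can be reproduced and fixed without touching PDFBox. Below is a minimal sketch that uses only the JDK's java.awt.image classes: it pairs raw 8-bit index data with an explicit palette, so the resulting image carries its own color table when written out. The class name PaletteSketch and the method withPalette are hypothetical, and how you obtain the index bytes and palette entries from a PDXObjectImage is not shown here; that part is an assumption you would need to fill in from the PDFBox side.

```java
import java.awt.image.BufferedImage;
import java.awt.image.IndexColorModel;

// Hypothetical helper, not part of PDFBox: attaches an explicit palette
// to raw 8-bit indexed pixel data using only standard JDK classes.
public class PaletteSketch {

    /**
     * Builds an indexed image whose color table is the supplied palette.
     *
     * @param indices one palette index per pixel, row by row (width * height bytes)
     * @param r       red components of the palette entries
     * @param g       green components of the palette entries
     * @param b       blue components of the palette entries
     */
    public static BufferedImage withPalette(byte[] indices, int width, int height,
                                            byte[] r, byte[] g, byte[] b) {
        if (indices.length != width * height) {
            throw new IllegalArgumentException("expected width * height index bytes");
        }
        // 8 bits per pixel, one fully opaque palette entry per element of r/g/b.
        IndexColorModel palette = new IndexColorModel(8, r.length, r, g, b);
        BufferedImage image =
                new BufferedImage(width, height, BufferedImage.TYPE_BYTE_INDEXED, palette);
        // Copy the indices straight into the raster; no color conversion happens here.
        image.getRaster().setDataElements(0, 0, width, height, indices);
        return image;
    }

    public static void main(String[] args) {
        // Two-entry palette: index 0 is red, index 1 is blue.
        byte[] r = {(byte) 255, 0};
        byte[] g = {0, 0};
        byte[] b = {0, (byte) 255};
        byte[] indices = {0, 1, 1, 0}; // a 2x2 image
        BufferedImage image = withPalette(indices, 2, 2, r, g, b);
        // Pixels are resolved through the attached palette.
        System.out.println(Integer.toHexString(image.getRGB(0, 0)));
        System.out.println(Integer.toHexString(image.getRGB(1, 0)));
    }
}
```

An image built this way can be handed to javax.imageio.ImageIO.write, and the palette travels with it instead of being replaced by a default one.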

Daniel

On Tue, Sep 15, 2009 at 6:01 PM, Martinez, Mel <[email protected]> wrote:

> I've been playing with extracting images.
>
> I've found a few 'weirdnesses' (I know, that's not a real word) in the
> org.apache.pdfbox.ExtractText class and if I can clear some time, I'll try
> to submit something on that.
>
> Ignoring the 'weirdnesses' (which have more to do with options parsing and
> file naming), it does successfully extract images to separate files.
>
> However, the color table is apparently not being handled properly.
>
> All the images end up displaying with the default Windows palette, which
> tells me that they are probably missing their own.
>
> I assume that what probably needs to be done is that the color space needs
> to be rebuilt and reset on each image object prior to writing the image out
> to file, but I'm not entirely certain how to proceed with that.
>
> Does anybody have any familiarity with the PDXObjectImage and its related
> APIs?
>
> If someone can point me in the right direction, I don't mind doing the work
> of fixing this.
>
> Mel
>
> Dr. Mel Martinez
> [email protected]
