[ 
https://issues.apache.org/jira/browse/PDFBOX-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152023#comment-13152023
 ] 

Pontus Hulin commented on PDFBOX-1154:
--------------------------------------

I have taken a look at the PDF files that display this problem, and this is 
what I have found: 
all of them seem to contain ImageI (the procedure set for indexed images) in 
the ProcSet in the page Resources. 

Each PDF also seems to contain an /Indexed object for each image that is part 
of the large image. The /Indexed object looks like it is a color space.

Example:
71 0 obj
[/Indexed/DeviceCMYK 33 145 0 R]
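
For reference, a PDF Indexed color space array has the form 
[/Indexed base hival lookup], so the example above would describe a palette of 
34 DeviceCMYK entries (hival + 1) stored in object 145. Below is a minimal 
Python sketch of reading such an array from its text form. The names 
parse_indexed, INDEXED_RE and BASE_COMPONENTS are made up for illustration and 
are not part of PDFBox, and the sketch assumes the lookup table is given as an 
indirect reference, as it is here.

```python
import re

# Components per sample for the device base color spaces.
BASE_COMPONENTS = {"DeviceGray": 1, "DeviceRGB": 3, "DeviceCMYK": 4}

# Matches text such as "[/Indexed/DeviceCMYK 33 145 0 R]".
# Assumption: the lookup table is an indirect reference ("145 0 R"),
# not an inline string, and the base is a plain name.
INDEXED_RE = re.compile(
    r"\[\s*/Indexed\s*/(\w+)\s+(\d+)\s+(\d+)\s+(\d+)\s+R\s*\]"
)


def parse_indexed(array_text):
    """Return the parts of an Indexed color space array, or None if no match."""
    match = INDEXED_RE.fullmatch(array_text.strip())
    if match is None:
        return None
    base = match.group(1)
    hival = int(match.group(2))
    lookup_ref = (int(match.group(3)), int(match.group(4)))
    entries = hival + 1  # palette indices run from 0 to hival
    components = BASE_COMPONENTS.get(base)
    return {
        "base": base,
        "hival": hival,
        "lookup_ref": lookup_ref,
        "palette_entries": entries,
        # Expected size of the lookup table, when the base is a device space.
        "lookup_bytes": entries * components if components else None,
    }


if __name__ == "__main__":
    print(parse_indexed("[/Indexed/DeviceCMYK 33 145 0 R]"))
```

If the expected lookup size matches the length of the decoded stream in the 
referenced object, that would support reading the object as a color space.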

So my conclusion is that if an image references a color space that is Indexed, 
we should not bother with it. 
Does anyone else know whether this is the case?
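
To make the proposed filter concrete, here is a small Python sketch under 
stated assumptions. It is not PDFBox code: the images are represented as a 
plain dict from a hypothetical image name to its color space, where a color 
space is either a name string or a list whose first element is the family 
name. Whether skipping Indexed images is actually the right rule is still the 
open question above.

```python
def is_indexed(colorspace):
    """True if the color space is an array whose first element is 'Indexed'."""
    return (
        isinstance(colorspace, (list, tuple))
        and len(colorspace) > 0
        and colorspace[0] == "Indexed"
    )


def split_images(images):
    """Split image names into (kept, skipped) by the Indexed rule.

    `images` maps a hypothetical image name to its color space description.
    """
    kept, skipped = [], []
    for name, colorspace in images.items():
        if is_indexed(colorspace):
            skipped.append(name)
        else:
            kept.append(name)
    return kept, skipped


if __name__ == "__main__":
    sample = {
        "Im0": "DeviceRGB",
        "Im1": ["Indexed", "DeviceCMYK", 33, "145 0 R"],
    }
    print(split_images(sample))
```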

I will do some more testing and post the results here.

Best regards
/ Pontus
                
> pdfbox exports 1200+ images from a pdf instead of one
> -----------------------------------------------------
>
>                 Key: PDFBOX-1154
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1154
>             Project: PDFBox
>          Issue Type: New Feature
>    Affects Versions: 1.6.0
>         Environment: Mac OS X 10.6
>            Reporter: Pontus Hulin
>              Labels: extract, images
>         Attachments: testfile.pdf
>
>
> I have a pdf that I export all images from. My problem is that I get 1290 
> images after export. If I export all images from the pdf in Acrobat Pro, I 
> get only one. There must be some way that the pdf composes these images 
> together, but I can't figure out how. 
> The pdf was probably made from an ad pdf placed in an InDesign CS4 document 
> and exported as a pdf by InDesign Server 6.x.
> I don't need to compose all the images into one, I just want to filter out the 
> ones that "belong" together. I will attach the pdf to this Issue.
> Does anyone know how to do that?
> Best regards
> / Pontus Hulin


        
