Ravi created PDFBOX-3727:
----------------------------

             Summary: "premature EOF, image will be incomplete"
                 Key: PDFBOX-3727
                 URL: https://issues.apache.org/jira/browse/PDFBOX-3727
             Project: PDFBox
          Issue Type: Bug
          Components: Parsing, Text extraction
    Affects Versions: 2.0.5, 2.0.4
         Environment: Windows 10/X64
            Reporter: Ravi
             Fix For: 2.0.4


I am trying to extract all the embedded images from a PDF file, but sometimes 
the extraction logs the warning below.

[main] WARN  o.a.p.p.g.image.SampledImageReader - premature EOF, image will be incomplete

The extracted images are half-complete (half greyed out).
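
One plausible reading of the warning is that the image stream holds fewer bytes than the image dimensions call for, so the remaining rows are left unfilled. The sketch below only illustrates that arithmetic (sample rows in a PDF image are padded to a whole byte); the class and method names are hypothetical and this is not PDFBox's actual implementation.

```java
public class SampledImageSize {
    // Bytes needed for one row of samples; each row is padded to a whole byte.
    static long bytesPerRow(int width, int components, int bitsPerComponent) {
        long bitsPerRow = (long) width * components * bitsPerComponent;
        return (bitsPerRow + 7) / 8;
    }

    // Total bytes a fully decoded image of these dimensions should contain.
    static long expectedBytes(int width, int height, int components, int bitsPerComponent) {
        return bytesPerRow(width, components, bitsPerComponent) * height;
    }

    // Number of complete rows that the available decoded bytes can fill.
    static long completeRows(long availableBytes, int width, int components, int bitsPerComponent) {
        return availableBytes / bytesPerRow(width, components, bitsPerComponent);
    }

    public static void main(String[] args) {
        // Hypothetical 100x50 RGB image with 8 bits per component.
        long expected = expectedBytes(100, 50, 3, 8);
        long available = 7500; // assumed: only half of the data decoded
        System.out.println("expected bytes: " + expected);
        System.out.println("complete rows:  " + completeRows(available, 100, 3, 8) + " of 50");
    }
}
```

Comparing the decoded stream length against the expected size for a failing image would show whether the data in the file itself is short.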

I would like to know whether any solution is available for this. My code 
snippet is below.

Any help is greatly appreciated.

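        import java.io.File;
        import javax.imageio.ImageIO;
        import org.apache.pdfbox.cos.COSName;
        import org.apache.pdfbox.pdmodel.PDDocument;
        import org.apache.pdfbox.pdmodel.PDPage;
        import org.apache.pdfbox.pdmodel.PDResources;
        import org.apache.pdfbox.pdmodel.graphics.PDXObject;
        import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

        // fileName (the path to the input PDF) is defined elsewhere.
        public static void testPDFBoxExtractImages() throws Exception {
            // try-with-resources closes the document once extraction is done
            try (PDDocument document = PDDocument.load(new File(fileName))) {
                for (PDPage page : document.getPages()) {
                    PDResources pdResources = page.getResources();
                    System.out.println(page.getRotation());
                    // Write every image XObject on the page to its own PNG file
                    for (COSName c : pdResources.getXObjectNames()) {
                        PDXObject o = pdResources.getXObject(c);
                        if (o instanceof PDImageXObject) {
                            File file = new File("C:/temp/" + System.nanoTime() + ".png");
                            ImageIO.write(((PDImageXObject) o).getImage(), "png", file);
                        }
                    }
                }
            }
        }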




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
