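PDF images are NOT stored in a standard image file format. An image is an 
"array of color values" in a specific colorspace, with a certain number of 
bits per component, and potentially processed with one or more "filters". The 
common exception is the DCTDecode filter, whose stream data is JPEG, which is 
why your code only works for JPEG images. Details are described in ISO 
32000-1.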
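
To make the "array of color values" idea concrete, here is a minimal sketch 
(plain Java, no iText) of how many bytes the decoded sample data of an image 
should occupy. The class and method names are made up for illustration; the 
width, height, number of components and bits per component would come from the 
image dictionary (/Width, /Height, /ColorSpace, /BitsPerComponent). It relies 
on the rule in ISO 32000-1 that each row of samples starts on a byte boundary.

```java
// Hypothetical helper, not part of iText: size of decoded PDF image samples.
final class SampleLayout {

    /** Bytes per row: samples are bit-packed and each row is padded to a whole byte. */
    static int bytesPerRow(int width, int components, int bitsPerComponent) {
        long bits = (long) width * components * bitsPerComponent;
        return (int) ((bits + 7) / 8);
    }

    /** Total number of bytes expected once all filters have been undone. */
    static long expectedLength(int width, int height, int components, int bitsPerComponent) {
        return (long) bytesPerRow(width, components, bitsPerComponent) * height;
    }

    public static void main(String[] args) {
        // 100 x 50 image, DeviceRGB (3 components), 8 bits per component
        System.out.println(expectedLength(100, 50, 3, 8)); // prints 15000
        // 10 x 4 image, DeviceGray (1 component), 1 bit per component
        System.out.println(expectedLength(10, 4, 1, 1));   // prints 8
    }
}
```

If the decoded stream length does not match this number, the image is probably 
using a colorspace or filter that the simple case does not cover.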

As such, you will need to pass the image stream to an image processing 
library that knows what to do with the various structures and that can then, 
possibly, save the result out to various image formats. JAI (Java Advanced 
Imaging) is probably a good place to look.
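
As a rough illustration of that step, the sketch below handles only the 
simplest case with the plain JDK (javax.imageio rather than JAI): a stream 
with a single FlateDecode filter and no predictor, DeviceRGB, 8 bits per 
component. Everything in it (class name, method names) is hypothetical, and 
other colorspaces, bit depths and filters need more work or a real imaging 
library.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of iText.
final class RawRgbToPng {

    /** Undo a FlateDecode filter (zlib data). Assumes no predictor is used. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported data");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid zlib data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Interpret decoded samples as 8-bit DeviceRGB: R, G, B per pixel, row by row. */
    static BufferedImage toImage(byte[] samples, int width, int height) {
        if (samples.length < (long) width * height * 3) {
            throw new IllegalArgumentException("not enough sample data");
        }
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        int p = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int r = samples[p++] & 0xFF;
                int g = samples[p++] & 0xFF;
                int b = samples[p++] & 0xFF;
                img.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return img;
    }

    /** Encode the image as PNG, a standard format that other tools can read. */
    static byte[] toPng(BufferedImage img) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(img, "png", out)) {
                throw new IllegalStateException("no PNG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With iText, the input would be the raw stream bytes from 
PdfReader.getStreamBytesRaw(...), as in your code. As far as I know, 
PdfReader.getStreamBytes(...) will undo FlateDecode for you, in which case the 
inflate step above is not needed.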

-----Original Message-----
From: java.geek [mailto:java.g...@rediffmail.com] 
Sent: Sunday, November 29, 2009 9:12 AM
To: itext-questions@lists.sourceforge.net
Subject: [iText-questions] Extract PDF embedded images using iText


Hi All, I am trying to extract images from a PDF document using the iText 
library.

I am able to create an Image instance only for JPEG images (*.jpg, *.jpeg, 
*.jpe):
**** Image imageObject = Image.getInstance(image); ****
It does not work for images in other formats that are embedded in the PDF 
document.


// Assumes iText 2.x package names (com.lowagie.*).
import com.lowagie.text.Image;
import com.lowagie.text.pdf.PRStream;
import com.lowagie.text.pdf.PdfName;
import com.lowagie.text.pdf.PdfObject;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfStream;

public void extractImagesInfo() {
    try {
        PdfReader chartReader = new PdfReader("MyPdf.pdf");
        // Walk every indirect object and pick out the image streams.
        for (int i = 0; i < chartReader.getXrefSize(); i++) {
            PdfObject pdfobj = chartReader.getPdfObject(i);
            if (pdfobj != null && pdfobj.isStream()) {
                PdfStream stream = (PdfStream) pdfobj;
                PdfObject pdfsubtype = stream.get(PdfName.SUBTYPE);
                //System.out.println("Stream subType: " + pdfsubtype);
                if (PdfName.IMAGE.equals(pdfsubtype)) {
                    // Raw stream bytes, with the stream's filters still applied.
                    byte[] image = PdfReader.getStreamBytesRaw((PRStream) stream);
                    Image imageObject = Image.getInstance(image);
                    System.out.println("Resolution: " + imageObject.getDpiX());
                    System.out.println("Height: " + imageObject.getHeight());
                    System.out.println("Width: " + imageObject.getWidth());
                }
            }
        }
        chartReader.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
-- 
View this message in context: 
http://old.nabble.com/Extract-PDF-embedded-images-using-iText-tp26562385p26562385.html
Sent from the iText - General mailing list archive at Nabble.com.


------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions

Buy the iText book: http://www.1t3xt.com/docs/book.php
Check the site with examples before you ask questions: 
http://www.1t3xt.info/examples/
You can also search the keywords list: http://1t3xt.info/tutorials/keywords/
