Converting PDF to image

Tilen Bobek Tue, 17 Mar 2009 12:26:32 -0700

Hello everyone!

I tried to convert each page of a PDF document to a BufferedImage and save
each image to disk.


Steps:

- I checked out PDFBox from svn, built it with ant, and created a jar from
the classes (with the Resources added to the jar)
- I'm using that jar in a NetBeans project
- I tried to call convertToImage() on a PDPage instance and got an exception
that a class could not be found, so I downloaded FontBox-1.0.1.jar and added
it to the NetBeans project
- the following code snippet still throws an exception, saying it can't find
a method in the CMapParser class from the FontBox-1.0.1.jar library

Code snippet from an example I wrote (important: the path stored in filePath
points to an existing PDF):


        try {

            // load PDF document
            PDDocument document = PDDocument.load(new File(filePath));

            // get all pages
            List<PDPage> pages = document.getDocumentCatalog().getAllPages();

            // for each page
            for (int i = 0; i < pages.size(); i++) {
                // single page
                PDPage singlePage = pages.get(i);

                // to BufferedImage
                BufferedImage buffImage = singlePage.convertToImage(); // <-- THE EXCEPTION BELOW IS THROWN HERE

                // write image to disk
                // ImageIO.write expects a format name such as "png", not a MIME type
                ImageIO.write(buffImage, "png",
                        new File("C:\\Users\\Funky\\Desktop\\page" + i + ".png"));
            }

        } catch (IOException ex) {
            ex.printStackTrace();
        }

The exception I get:

Exception in thread "main" java.lang.NoSuchMethodError: org.fontbox.cmap.CMapParser.parse(Ljava/lang/String;Ljava/io/InputStream;)Lorg/fontbox/cmap/CMap;
        at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:513)
        at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:367)
        at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:325)
        at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:66)
        at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:491)
        at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:214)
        at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:173)
        at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:88)
        at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:669)
        at PDFBox.Example.<init>(Example.java:45)
        at PDFBox.Example.main(Example.java:63)
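
For reference, the signature in the first line of the trace is written in the
JVM's internal descriptor notation. Below is a minimal sketch of a decoder that
turns it into readable Java form. It uses only the JDK; the class name
DescriptorDecoder and its method names are made up for this illustration, and
it assumes a well-formed descriptor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox or FontBox.
class DescriptorDecoder {

    /** Reads one JVM type starting at pos[0], advances pos[0], returns its Java name. */
    private static String readType(String d, int[] pos) {
        int dims = 0;
        while (d.charAt(pos[0]) == '[') { // each '[' is one array dimension
            dims++;
            pos[0]++;
        }
        char c = d.charAt(pos[0]++);
        String name;
        switch (c) {
            case 'B': name = "byte"; break;
            case 'C': name = "char"; break;
            case 'D': name = "double"; break;
            case 'F': name = "float"; break;
            case 'I': name = "int"; break;
            case 'J': name = "long"; break;
            case 'S': name = "short"; break;
            case 'Z': name = "boolean"; break;
            case 'V': name = "void"; break;
            case 'L': // object type: L<internal/class/Name>;
                int end = d.indexOf(';', pos[0]);
                name = d.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                break;
            default:
                throw new IllegalArgumentException("Bad descriptor: " + d);
        }
        StringBuilder sb = new StringBuilder(name);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    /** Decodes a method descriptor such as "(Ljava/lang/String;I)V". */
    static String decode(String methodName, String descriptor) {
        int[] pos = {1}; // skip the opening '('
        List<String> params = new ArrayList<>();
        while (descriptor.charAt(pos[0]) != ')') {
            params.add(readType(descriptor, pos));
        }
        pos[0]++; // skip the closing ')'
        String returnType = readType(descriptor, pos);
        return returnType + " " + methodName + "(" + String.join(", ", params) + ")";
    }

    public static void main(String[] args) {
        System.out.println(decode("parse",
                "(Ljava/lang/String;Ljava/io/InputStream;)Lorg/fontbox/cmap/CMap;"));
    }
}
```

Run on the descriptor from the trace, this prints
"org.fontbox.cmap.CMap parse(java.lang.String, java.io.InputStream)", i.e. the
calling code expects a parse method taking a String and an InputStream.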

I looked in the PDFont.java source file and saw that the method
CMapParser.parse(...) takes two String arguments.
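
A NoSuchMethodError generally means that the class found at run time does not
have the method the calling code was compiled against. A small check, assuming
the FontBox jar is on the classpath when it runs, can show which jar
CMapParser is loaded from and which public parse overloads it offers.
ClasspathCheck and its helper names are hypothetical; only standard JDK
reflection calls are used.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of PDFBox or FontBox.
class ClasspathCheck {

    /** Returns the jar or directory a class was loaded from, if the JVM reports it. */
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap/unknown)";
        }
        return src.getLocation().toString();
    }

    /** Lists the public methods of cls (including inherited ones) with the given name. */
    static List<String> overloadsOf(Class<?> cls, String methodName) {
        List<String> found = new ArrayList<>();
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(methodName)) {
                found.add(m.toString());
            }
        }
        return found;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults match the class and method named in the stack trace.
        String className = args.length > 0 ? args[0] : "org.fontbox.cmap.CMapParser";
        String methodName = args.length > 1 ? args[1] : "parse";

        Class<?> cls = Class.forName(className);
        System.out.println(className + " loaded from: " + locationOf(cls));
        for (String signature : overloadsOf(cls, methodName)) {
            System.out.println("  " + signature);
        }
    }
}
```

If none of the listed overloads takes (String, InputStream), the jar on the
classpath does not match the version the PDFBox classes were built against.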

What can I do to make the PDPage.convertToImage() function work properly?

Thanks for any help!

Regards.

Tilen
