Hello everyone!
I'm trying to convert each page of a PDF document to a BufferedImage and
save each image to disk.
Steps:
- I checked out PDFBox from svn, built it with ant, and created a jar from
the classes (with Resources added to the jar)
- I'm using that jar in a NetBeans project
- I called convertToImage() on a PDPage instance and got an exception that
a class could not be found, so I downloaded FontBox-1.0.1.jar and added it
to the NetBeans project
- the following code snippet still throws an exception: it can't find a
method in the CMapParser class from the FontBox-1.0.1.jar library
Code snippet from an example I wrote (important: the path stored in filePath
points to an existing PDF):
try {
    // load the PDF document
    PDDocument document = PDDocument.load(new File(filePath));
    // get all pages
    List<PDPage> pages = document.getDocumentCatalog().getAllPages();
    // for each page
    for (int i = 0; i < pages.size(); i++) {
        // single page
        PDPage singlePage = pages.get(i);
        // to BufferedImage
        // <-- THE EXCEPTION SHOWN BELOW IS THROWN FROM THE NEXT LINE
        BufferedImage buffImage = singlePage.convertToImage();
        // write image to disk; ImageIO expects the informal format name
        // "png" here, not the MIME type "image/png"
        ImageIO.write(buffImage, "png",
                new File("C:\\Users\\Funky\\Desktop\\page" + i + ".png"));
    }
    // release the document once all pages are written
    document.close();
} catch (IOException ex) {
    ex.printStackTrace();
}
The exception I get:
Exception in thread "main" java.lang.NoSuchMethodError: org.fontbox.cmap.CMapParser.parse(Ljava/lang/String;Ljava/io/InputStream;)Lorg/fontbox/cmap/CMap;
    at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:513)
    at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:367)
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:325)
    at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:66)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:491)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:214)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:173)
    at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:88)
    at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:669)
    at PDFBox.Example.<init>(Example.java:45)
    at PDFBox.Example.main(Example.java:63)
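The signature in the NoSuchMethodError message is written as a JVM method descriptor, which is hard to read at a glance. As a rough aid, the sketch below turns such a descriptor into Java type names. DescriptorDecoder and its method names are made up for illustration and are not part of PDFBox or FontBox; it only handles the descriptor part in parentheses plus the return type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox or FontBox.
public class DescriptorDecoder {

    /** Reads one type descriptor starting at pos[0] and advances pos[0] past it. */
    public static String readType(String d, int[] pos) {
        char c = d.charAt(pos[0]++);
        switch (c) {
            case 'L': {
                // Object type: L<binary name with slashes>;
                int end = d.indexOf(';', pos[0]);
                String name = d.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                return name;
            }
            case '[': return readType(d, pos) + "[]"; // array of the following type
            case 'Z': return "boolean";
            case 'B': return "byte";
            case 'C': return "char";
            case 'S': return "short";
            case 'I': return "int";
            case 'J': return "long";
            case 'F': return "float";
            case 'D': return "double";
            case 'V': return "void";
            default:
                throw new IllegalArgumentException("Unexpected descriptor char: " + c);
        }
    }

    /** Decodes "(params)return" into "returnType (param1, param2, ...)". */
    public static String decode(String descriptor) {
        int[] pos = {0};
        if (descriptor.charAt(pos[0]++) != '(') {
            throw new IllegalArgumentException("Descriptor must start with '('");
        }
        List<String> params = new ArrayList<>();
        while (descriptor.charAt(pos[0]) != ')') {
            params.add(readType(descriptor, pos));
        }
        pos[0]++; // skip ')'
        String returnType = readType(descriptor, pos);
        return returnType + " (" + String.join(", ", params) + ")";
    }

    public static void main(String[] args) {
        // Descriptor copied from the exception message above.
        System.out.println(
            decode("(Ljava/lang/String;Ljava/io/InputStream;)Lorg/fontbox/cmap/CMap;"));
        // prints: org.fontbox.cmap.CMap (java.lang.String, java.io.InputStream)
    }
}
```

So the PDFBox code at PDFont.java:513 expects a parse method taking a String and an InputStream and returning org.fontbox.cmap.CMap.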
I looked in the PDFont.java source file and saw that the function
CMapParser.parse(...) takes two String arguments.
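To compare that with what is actually on the classpath at runtime, a small reflection check could list the parse overloads the loaded CMapParser really has and the jar it came from. This is only a sketch: ClasspathCheck and its helper names are hypothetical, the default class name is taken from the exception message, and only standard JDK reflection calls are used.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of PDFBox or FontBox.
public class ClasspathCheck {

    /** Returns where a class was loaded from, or a note if the JVM reports no location. */
    public static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
    }

    /** Returns the full signatures of all methods with the given name declared by the class. */
    public static List<String> signaturesOf(Class<?> type, String methodName) {
        List<String> result = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                result.add(m.toString());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Defaults assume the class and method named in the exception message.
        String className = args.length > 0 ? args[0] : "org.fontbox.cmap.CMapParser";
        String methodName = args.length > 1 ? args[1] : "parse";
        try {
            // Load without running static initializers; only metadata is needed.
            Class<?> type = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            System.out.println("Loaded from: " + locationOf(type));
            for (String signature : signaturesOf(type, methodName)) {
                System.out.println("  " + signature);
            }
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

Run with the same classpath as the NetBeans project. If no listed overload takes (String, InputStream), the FontBox jar on the classpath does not match the version the PDFBox classes were compiled against.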
What can I do to make the PDPage.convertToImage() function work properly?
Thanks for any help!
Regards.
Tilen