Counting pages of a PDF gives OutOfMemoryError
-----------------------------------------------
Key: PDFBOX-1226
URL: https://issues.apache.org/jira/browse/PDFBOX-1226
Project: PDFBox
Issue Type: Bug
Components: PDFReader
Affects Versions: 1.6.0
Environment: Windows 7 / Windows XP
Reporter: Anca Zapuc
I have a PDF (397 MB) and I am trying to count its pages.
I am able to open the PDF with Adobe Reader 9, but not with Foxit Reader.
Code:
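import java.io.File;
import org.apache.pdfbox.io.RandomAccessFile; // PDFBox's class, not java.io.RandomAccessFile
import org.apache.pdfbox.pdmodel.PDDocument;

// 'file' refers to the input PDF and is declared elsewhere
PDDocument doc = null;
File temp = null;
RandomAccessFile rand = null;
int nr = 0;
try {
    // create the temporary scratch file PDFBox needs when dealing with very large PDFs
    temp = new File("e:/temp.tmp");
    // wrap the scratch file in a random access file, needed for very large PDFs
    rand = new RandomAccessFile(temp, "rw");
    doc = PDDocument.load(file, rand);
    nr = doc.getNumberOfPages();
} catch (Exception e) {
    e.printStackTrace();
}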
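The report does not state the JVM heap setting, so when reproducing it may help to print the maximum heap next to the file size. The following is only a minimal sketch using the standard library; the class name HeapCheck, the helper exceedsMaxHeap, and the default path "large.pdf" are hypothetical and not part of the original code.

```java
import java.io.File;

public class HeapCheck {
    /** Returns true when the file is larger than the maximum heap the JVM may use. */
    public static boolean exceedsMaxHeap(long fileSizeBytes, long maxHeapBytes) {
        return fileSizeBytes > maxHeapBytes;
    }

    public static void main(String[] args) {
        // Hypothetical default path; pass the real PDF as the first argument.
        File pdf = new File(args.length > 0 ? args[0] : "large.pdf");
        // Upper bound on the heap this JVM will try to use (controlled by -Xmx).
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("PDF size (bytes): " + pdf.length());
        System.out.println("Max heap (bytes): " + maxHeap);
        System.out.println("File larger than heap: " + exceedsMaxHeap(pdf.length(), maxHeap));
    }
}
```

Running it with the same JVM options as the failing program shows whether the 397 MB file could fit in the heap at all if any part of the parser buffered it in memory.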
I got the following exception:
org.apache.pdfbox.exceptions.WrappedIOException
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:240)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1069)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1022)
at PDFBoxExample.getHugeNrOfFiles(PDFBoxExample.java:36)
at PDFBoxExample.main(PDFBoxExample.java:258)
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:45)
at java.lang.StringBuffer.<init>(StringBuffer.java:79)
at org.apache.pdfbox.pdfparser.BaseParser.readString(BaseParser.java:1121)
at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:402)
at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:552)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:184)
... 4 more
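The trace shows the heap running out while a StringBuffer is constructed in BaseParser.readString, called from parseCOSStream. The sketch below is not PDFBox code, and whether it resembles what readString does internally is an assumption. It only illustrates the general idea of capping how much a token reader may accumulate in memory, so that malformed or very large input fails with an IOException instead of exhausting the heap. The class BoundedTokenReader and the method readToken are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class BoundedTokenReader {
    /**
     * Reads bytes up to the next whitespace or end of stream and returns them as a string.
     * Throws an IOException instead of letting the buffer grow past maxLength characters.
     */
    public static String readToken(InputStream in, int maxLength) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && !Character.isWhitespace(c)) {
            if (sb.length() >= maxLength) {
                throw new IOException("Token longer than " + maxLength + " characters");
            }
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "endstream endobj".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(readToken(in, 16)); // prints "endstream"
    }
}
```

With a cap in place, the caller sees a recoverable IOException for oversized tokens and can decide whether to skip the object or abort parsing.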