Counting pages of a PDF gives OutOfMemoryError
----------------------------------------------

                 Key: PDFBOX-1226
                 URL: https://issues.apache.org/jira/browse/PDFBOX-1226
             Project: PDFBox
          Issue Type: Bug
          Components: PDFReader
    Affects Versions: 1.6.0
         Environment: Windows 7 / Windows XP
            Reporter: Anca Zapuc


I have a PDF (397 MB) and I am trying to count its pages.
I am able to open the PDF with Adobe Reader 9, but not with Foxit Reader.
Code:
  PDDocument doc = null;
  File temp = null;
  RandomAccessFile rand = null;
  int nr = 0;
  try {
      // create the temporary file PDFBox needs when dealing with very large PDFs
      temp = new File("e:/temp.tmp");
      // back the parser with a random access file, needed for very large PDFs
      rand = new RandomAccessFile(temp, "rw");
      doc = PDDocument.load(file, rand);
      nr = doc.getNumberOfPages();
  } catch (Exception e) {
      e.printStackTrace();
  } finally {
      // release the document even when parsing fails
      if (doc != null) {
          try { doc.close(); } catch (IOException ignored) { }
      }
  }

I got the following exception:
org.apache.pdfbox.exceptions.WrappedIOException
        at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:240)
        at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1069)
        at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1022)
        at PDFBoxExample.getHugeNrOfFiles(PDFBoxExample.java:36)
        at PDFBoxExample.main(PDFBoxExample.java:258)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:45)
        at java.lang.StringBuffer.<init>(StringBuffer.java:79)
        at org.apache.pdfbox.pdfparser.BaseParser.readString(BaseParser.java:1121)
        at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:402)
        at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:552)
        at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:184)
        ... 4 more
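
The trace shows the heap running out inside BaseParser.readString, called
from parseCOSStream, so the failure happens while parsing and before
getNumberOfPages() is reached. As a rough cross-check that does not involve
PDFBox at all, the sketch below streams the file byte by byte and counts
"/Type /Page" dictionary entries. This is only a heuristic, not what PDFBox
does: it assumes the page dictionaries are stored uncompressed (no object
streams), and it can overcount if the file contains incremental updates or
if the same bytes occur inside stream data. The class name RoughPageCounter
and its methods are made up for this illustration.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: heuristic page count using constant memory. */
public class RoughPageCounter {
    private static final byte[] TYPE = "/Type".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] PAGE = "/Page".getBytes(StandardCharsets.US_ASCII);

    // PDF whitespace characters: NUL, TAB, LF, FF, CR, SPACE
    private static boolean isWhitespace(int b) {
        return b == 0 || b == 9 || b == 10 || b == 12 || b == 13 || b == 32;
    }

    // True if b terminates a PDF name token (whitespace or a delimiter)
    private static boolean endsName(int b) {
        return isWhitespace(b) || "()<>[]{}/%".indexOf(b) >= 0;
    }

    /** Counts "/Type /Page" entries; "/Type /Pages" is not counted. */
    static int countPages(InputStream in) throws IOException {
        int count = 0;
        int phase = 0; // 0: match /Type, 1: skip whitespace, 2: match /Page, 3: check next byte
        int pos = 0;   // position inside the keyword currently being matched
        int b;
        while ((b = in.read()) != -1) {
            boolean retry = true;
            while (retry) {
                retry = false;
                if (phase == 0) {
                    if (b == TYPE[pos]) {
                        if (++pos == TYPE.length) { phase = 1; pos = 0; }
                    } else if (pos > 0) {
                        pos = 0;
                        retry = true; // this byte may start a new "/Type"
                    }
                } else if (phase == 1) {
                    if (!isWhitespace(b)) { phase = 2; pos = 0; retry = true; }
                } else if (phase == 2) {
                    if (b == PAGE[pos]) {
                        if (++pos == PAGE.length) { phase = 3; pos = 0; }
                    } else {
                        phase = 0; pos = 0; retry = true;
                    }
                } else {
                    // "/Page" only counts if the name ends here ("/Pages" does not)
                    if (endsName(b)) { count++; }
                    phase = 0; pos = 0; retry = true;
                }
            }
        }
        if (phase == 3) { count++; } // "/Page" was the last token in the input
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the PDF to inspect
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
            System.out.println("Approximate page count: " + countPages(in));
        }
    }
}
```

Because only one byte and a few counters are held at a time, memory use does
not depend on the size of the PDF. If the result is 0 for a file that clearly
has pages, the page objects are probably inside compressed object streams and
this shortcut does not apply.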


        
