Hello,

I'm trying to understand why I'm getting a NullPointerException when I 
merely load one particular PDF and then call getNumberOfPages(). 
The core problem seems to be that the "Pages" entry in the document catalog 
references an object which doesn't seem to exist.  Here's the catalog 
dictionary from the PDF:
<</Metadata 178 0 R/PageLayout/OneColumn/Pages 186 0 R/Type/Catalog>>

I searched for "186" with a text editor and it doesn't appear anywhere 
else in the PDF.  This explains why cat.getPages() (in 
PDDocument.getNumberOfPages()) returns null, which then causes the NPE.
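
A text-editor search can miss an object that is stored inside a compressed 
object stream (PDF 1.5 and later), so "186 doesn't appear anywhere" may not 
mean the object is truly missing.  Below is a minimal sketch of a slightly 
more careful check using only the JDK.  The class and method names are made 
up for illustration, and reading the file as ISO-8859-1 text is an 
assumption that only holds for the uncompressed parts of the file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
class PdfObjectScan {

    // True if "<objNum> <gen> obj" appears as an uncompressed object header.
    static boolean hasPlainObject(String raw, int objNum, int gen) {
        Pattern header = Pattern.compile(
                "(^|\\s)" + objNum + "\\s+" + gen + "\\s+obj\\b");
        return header.matcher(raw).find();
    }

    // True if the file declares at least one object stream (/Type /ObjStm).
    // Objects stored inside such a stream are usually compressed, so their
    // numbers are not visible in a plain text search.
    static boolean hasObjectStreams(String raw) {
        return raw.contains("/ObjStm");
    }

    public static void main(String[] args) throws IOException {
        // ISO-8859-1 maps every byte to one char, so nothing is lost.
        String raw = new String(Files.readAllBytes(Paths.get(args[0])),
                StandardCharsets.ISO_8859_1);
        System.out.println("186 0 obj present as plain object: "
                + hasPlainObject(raw, 186, 0));
        System.out.println("file contains object streams:      "
                + hasObjectStreams(raw));
    }
}
```

If the first line prints false and the second prints true, object 186 may 
simply live inside an object stream rather than being absent.  That would 
need to be confirmed with a real PDF parser.
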

Code:
    doc = PDDocument.load(inputFile);
    System.out.println("Number of pages = " + doc.getNumberOfPages());

Stacktrace:
    java.lang.NullPointerException
        at org.apache.pdfbox.pdmodel.PDPageNode.getCount(PDPageNode.java:102)
        at org.apache.pdfbox.pdmodel.PDDocument.getNumberOfPages(PDDocument.java:931)
        at com.xldynamics.common.PdfBoxTest.main(PdfBoxTest.java:30)

I can open this same file in Adobe Acrobat and Adobe Reader with no 
problem.  If those programs can open it, I think PDFBox should be able to 
as well.  I'm using HEAD tag (revision 937546), Windows Vista 32-bit, Java 
1.5.0_06.

I suspect this may be happening because of the owner password (which I 
don't know, by the way).  However, I didn't think the owner password would 
prevent something as simple as getting a page count.  So my questions are:
1.) Is this NullPointerException caused by the owner password?
2.) How can I process this file (or any file with an owner password, if 
that is the issue)?
3.) I'm not sure whether this is a bug in the lib or not, but should I open 
a ticket in JIRA anyway so I can attach the PDF for reference (since I 
can't attach it to the mailing list)?

I remember someone once suggesting decrypting PDFs with a null password or 
an empty string to work around some crypto problem.  I'm not sure that's a 
logical thing to do in my particular case, but I tried it anyway.  That 
resulted in a different stacktrace, but I may be going in completely the 
wrong direction here...  The reason for this stacktrace is that lastByte 
(BaseParser.java line 1254) was -1 on the first iteration of the loop, 
which left intBuffer empty.  Integer.parseInt() then throws an exception, 
resulting in the following stacktrace:
    java.io.IOException: Error: Expected an integer type, actual=''
        at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1275)
        at org.apache.pdfbox.pdfparser.PDFObjectStreamParser.parse(PDFObjectStreamParser.java:81)
        at org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:449)
        at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1100)
        at com.xldynamics.common.PdfBoxTest.main(PdfBoxTest.java:32)
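
To make the failure mode described above concrete, here is a simplified 
stand-in that reproduces it in isolation.  It is not PDFBox's actual 
readInt(); the class is hypothetical and only mirrors the behaviour I 
observed: the first read returns -1, the buffer stays empty, and 
Integer.parseInt("") fails.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the PDFBox implementation.
class ReadIntSketch {

    // Collects consecutive ASCII digits and parses them as an int.
    static int readInt(InputStream in) throws IOException {
        StringBuilder intBuffer = new StringBuilder();
        int lastByte;
        // If the stream is already exhausted, read() returns -1 right away
        // and the loop body never runs, leaving intBuffer empty.
        while ((lastByte = in.read()) != -1
                && lastByte >= '0' && lastByte <= '9') {
            intBuffer.append((char) lastByte);
        }
        try {
            return Integer.parseInt(intBuffer.toString());
        } catch (NumberFormatException e) {
            // parseInt("") throws NumberFormatException, rewrapped here.
            throw new IOException(
                    "Error: Expected an integer type, actual='" + intBuffer + "'");
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] ok = "42 0".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readInt(new ByteArrayInputStream(ok)));
        try {
            readInt(new ByteArrayInputStream(new byte[0]));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

An empty input here plays the role of an object stream that yields no 
readable bytes, which is what I would expect if the stream were not 
decrypted or decoded correctly.  That last part is a guess on my side.
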

If anyone has any suggestions on where I should go next, I'd be most 
grateful.  Just for the record, this issue is not related to either 
PDFBOX-699 or PDFBOX-700, which I opened yesterday.

Thanks,
Adam

