RE: Why would PDFTextStripper.getText() generate a NullPointerException ?

2010-05-11 Thread Adam
aPage.findCropBox() in PDFStreamEngine.java line 202 (on HEAD tag) is returning null. That null is passed into the PDGraphicsState constructor and throws the exception you're seeing. I'm guessing that the page doesn't have a crop box, nor does its parent, nor does it have a media box. This may n
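
The lookup order guessed at above (the page's own crop box, then an ancestor's, then a media box) can be sketched in plain Java. This is only an illustration of the fallback idea, not PDFBox code: the Node class, the resolveBox method, and the choice of a US Letter default are all hypothetical stand-ins chosen for the example.

```java
class BoxFallbackSketch {
    /** Hypothetical stand-in for a page-tree node; this is NOT a PDFBox class. */
    static final class Node {
        final Node parent;      // null for the root of the page tree
        final float[] cropBox;  // {llx, lly, urx, ury}, or null if absent
        final float[] mediaBox; // {llx, lly, urx, ury}, or null if absent

        Node(Node parent, float[] cropBox, float[] mediaBox) {
            this.parent = parent;
            this.cropBox = cropBox;
            this.mediaBox = mediaBox;
        }
    }

    /** Assumed last-resort default: US Letter in points (612 x 792). */
    static final float[] US_LETTER = {0f, 0f, 612f, 792f};

    /** Walks from the node up to the root and returns the first box found, or null. */
    static float[] findInherited(Node node, boolean wantCropBox) {
        for (Node n = node; n != null; n = n.parent) {
            float[] box = wantCropBox ? n.cropBox : n.mediaBox;
            if (box != null) {
                return box;
            }
        }
        return null;
    }

    /** Crop box first, then media box, then the default, so the result is never null. */
    static float[] resolveBox(Node page) {
        float[] box = findInherited(page, true);
        if (box == null) {
            box = findInherited(page, false);
        }
        return box != null ? box : US_LETTER;
    }

    public static void main(String[] args) {
        Node root = new Node(null, null, null);
        Node page = new Node(root, null, null); // no boxes anywhere in the chain
        float[] box = resolveBox(page);
        System.out.println(box[2] + " x " + box[3]);
    }
}
```

With a guard like this in front of whatever consumes the box, a page with no boxes at all would yield the default rectangle rather than a null. Whether a default page size is appropriate for text extraction is a separate design decision.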

RE: Why would PDFTextStripper.getText() generate a NullPointerException ?

2010-05-11 Thread Lupton, Chris B.
"RE: Why would PDFTextStripper.getText() generate a NullPointerException ?" Follow Up on my previous post: The Stack Trace that I am getting: === Exception in thread "main" java.lang.NullPointerException at org.apache.pdfbox.pdmodel.graph

Re: Why would PDFTextStripper.getText() generate a NullPointerException ?

2010-05-11 Thread Hannes Erven
Chris, > For my project, I have found a few PDF Documents that generate a Null > Pointer Exception when invoking the getText() method of a > PDFTextStripper Instance. If you are unable to share such a document, please post the exact stack trace of that exception. -hannes

Why would PDFTextStripper.getText() generate a NullPointerException ?

2010-05-11 Thread Lupton, Chris B.
I am trying to set up a very simple Java test to learn how to use PDFBox. I found a few examples on the mailing lists that illustrate how to use the PDFTextStripper class. For my project, I have found a few PDF documents that generate a NullPointerException when invoking the getText() method of a

Illegible decoding in some pdf documents

2010-05-11 Thread Thomas Fischer
Hello, I sent this note last week and didn't receive any response, here is an updated version with some additional information. To explain the context a little: I tried to extract text from 5091 mathematical PDF files. While I got some messages like "You do not have permission to extract text",