Here's an additional error:
WARNING: java.lang.NullPointerException
    at org.apache.pdfbox.util.TextPosition.<init>(TextPosition.java:95)
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:443)
    at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:50)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:493)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:214)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:173)
    at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:358)
    at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:282)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:238)
16 Apr 2009 11:16:28 PM org.apache.pdfbox.pdfparser.BaseParser parseCOSArray
WARNING: Corrupt object reference
Jamie Band wrote:
I am also getting the following:
java.lang.System.arraycopy(Object, int, Object, int, int) at
org.apache.pdfbox.util.PDFTextStripper.writeText(PDDocument, Writer)
[ARRAY INDEX OUT OF BOUNDS]
Jamie Band wrote:
Hi There
When calling PDFBox to extract text from PDF documents, I find it
prudent to wrap the calls in a try/catch block that catches Throwable,
since PDFBox frequently throws NullPointerException and
ClassCastException.
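For readers who want to see what such a wrapper might look like, here is a minimal sketch. It deliberately avoids calling PDFBox directly so that it stays self-contained: the class name SafeExtraction, the method tryExtract, and the documentName parameter are all hypothetical names made up for illustration. The actual PDFBox call (for example, the PDFTextStripper.writeText call seen in the traces) would go inside the Callable.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of PDFBox: isolates one extraction attempt
// so that a failure on a single document does not stop a long-running job.
final class SafeExtraction {
    private static final Logger LOG =
            Logger.getLogger(SafeExtraction.class.getName());

    private SafeExtraction() {
    }

    // Runs the supplied extraction step. Returns the extracted text, or an
    // empty Optional if the step threw anything at all (or returned null).
    static Optional<String> tryExtract(String documentName,
                                       Callable<String> extractor) {
        try {
            return Optional.ofNullable(extractor.call());
        } catch (Throwable t) {
            // Catching Throwable also swallows Errors such as
            // OutOfMemoryError; that is a trade-off of this approach.
            LOG.log(Level.WARNING,
                    "Text extraction failed for " + documentName, t);
            return Optional.empty();
        }
    }
}
```

A caller would pass the extraction as a lambda, e.g. SafeExtraction.tryExtract("report.pdf", () -> extractWithPdfBox(file)), where extractWithPdfBox stands in for whatever code invokes PDFBox. Logging the Throwable with its stack trace keeps the diagnostic information that would otherwise be lost when the exception is swallowed.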
Occasionally, I receive exceptions in the following methods:
org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(List, COSDictionary,
boolean) (the method calls itself recursively) [NULL POINTER]
org.apache.pdfbox.encryption.DocumentEncryption.decryptDocument(String)
[CLASS CAST EXCEPTION]
I am using the latest checkout from svn.
I am sorry I don't have more information than this, since I obtained the
exceptions from a long-running application.
Regards,
Jamie