Jorge Spinsanti created PDFBOX-3967:
---------------------------------------

             Summary: IllegalArgumentException: Illegal Capacity: -1
                 Key: PDFBOX-3967
                 URL: https://issues.apache.org/jira/browse/PDFBOX-3967
             Project: PDFBox
          Issue Type: Bug
          Components: Parsing
    Affects Versions: 2.0.7
            Reporter: Jorge Spinsanti


I got an exception when extracting TXT or HTML from a PDF file:

{code}
org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser@429568e8
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    ...
Caused by: java.lang.IllegalArgumentException: Illegal Capacity: -1
    at java.util.ArrayList.<init>(ArrayList.java:142)
    at org.apache.pdfbox.pdfparser.PDFObjectStreamParser.parse(PDFObjectStreamParser.java:72)
    at org.apache.pdfbox.pdfparser.COSParser.parseObjectStream(COSParser.java:845)
    at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:748)
    at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:673)
    at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:633)
    at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:241)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:276)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1143)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1077)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:149)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    ... 25 more
{code}
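
The "Caused by" frame shows that a capacity of -1 reached the java.util.ArrayList constructor from PDFObjectStreamParser.parse. One plausible cause, which this report does not confirm, is an object count that is missing or malformed in the object stream and is therefore read as -1. The sketch below reproduces the same exception message in plain Java and shows a defensive check that could turn it into a checked parsing error. The class IllegalCapacityDemo and the helpers getInt, allocateUnchecked and allocateChecked are hypothetical names for illustration only. They are not PDFBox APIs and this is not the project's fix.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IllegalCapacityDemo {

    // Hypothetical stand-in for a dictionary lookup that reports -1
    // when the requested key is absent (an assumption about the cause).
    static int getInt(Map<String, Integer> dict, String key) {
        Integer value = dict.get(key);
        return value != null ? value : -1;
    }

    // Unguarded: the count goes straight into the ArrayList constructor,
    // which rejects negative capacities with IllegalArgumentException.
    static List<Long> allocateUnchecked(int count) {
        return new ArrayList<>(count);
    }

    // Guarded: a negative count is reported as a checked I/O error
    // instead of escaping as a RuntimeException.
    static List<Long> allocateChecked(int count) throws IOException {
        if (count < 0) {
            throw new IOException("Invalid object count in stream: " + count);
        }
        return new ArrayList<>(count);
    }

    public static void main(String[] args) {
        // A dictionary without an "N" entry, so the lookup yields -1.
        Map<String, Integer> streamDict = new HashMap<>();
        int count = getInt(streamDict, "N");

        try {
            allocateUnchecked(count);
        } catch (IllegalArgumentException e) {
            System.out.println("unchecked: " + e.getMessage());
        }

        try {
            allocateChecked(count);
        } catch (IOException e) {
            System.out.println("checked: " + e.getMessage());
        }
    }
}
```

With the guard in place, a caller that already handles IOException from document loading would see a parsing failure rather than an unexpected RuntimeException.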