[ https://issues.apache.org/jira/browse/TIKA-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347377#comment-14347377 ]
Tilman Hausherr edited comment on TIKA-1038 at 3/4/15 6:59 PM:
---------------------------------------------------------------

[~talli...@mitre.org] are you watching this one? I made a (hopefully useful) response in PDFBOX-1835.

was (Author: tilman):
[~talli...@mitre.org]are you watching this one? I made a (hopefully useful) response in PDFBOX-1835.

> Parsing PDF with StackOverlowError
> -----------------------------------
>
>                 Key: TIKA-1038
>                 URL: https://issues.apache.org/jira/browse/TIKA-1038
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.2
>            Reporter: Konstantin Privezentsev
>
> Tika corrupt with StackOverflowError on some pdf documents:
> http://www.ellipse-labo.com/fiches/1303214351.pdf
> http://downloads.joomlacode.org/frsrelease/5/4/0/54089/handbuch_ckforms-DE-1.3.2.pdf
> Code:
> {code:java}
> AutoDetectParser parser = new AutoDetectParser(
>         new TypeDetector(),
>         new PDFParser(),
>         new OfficeParser(),
>         new HtmlParser(),
>         new RTFParser(),
>         new OOXMLParser());
> WriteOutContentHandler contentHandler = new WriteOutContentHandler();
> Metadata metadata = new Metadata();
> parser.parse(contentStream, new BodyContentHandler(contentHandler), metadata,
>         new ParseContext());
> {code}
> Stack trace:
> {code}
> java.lang.StackOverflowError
>     at java.util.LinkedHashMap$LinkedHashIterator.<init>(LinkedHashMap.java:345)
>     at java.util.LinkedHashMap$LinkedHashIterator.<init>(LinkedHashMap.java:345)
>     at java.util.LinkedHashMap$KeyIterator.<init>(LinkedHashMap.java:383)
>     at java.util.LinkedHashMap$KeyIterator.<init>(LinkedHashMap.java:383)
>     at java.util.LinkedHashMap.newKeyIterator(LinkedHashMap.java:396)
>     at java.util.HashMap$KeySet.iterator(HashMap.java:874)
>     at org.apache.pdfbox.cos.COSDictionary.toString(COSDictionary.java:1416)
>     at org.apache.pdfbox.cos.COSDictionary.toString(COSDictionary.java:1421)
>     at org.apache.pdfbox.cos.COSDictionary.toString(COSDictionary.java:1421)
>     at org.apache.pdfbox.cos.COSDictionary.toString(COSDictionary.java:1421)
>     at org.apache.pdfbox.cos.COSDictionary.toString(COSDictionary.java:1421)
>     ...
> {code}
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
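
The repeated COSDictionary.toString frames in the stack trace above would be consistent with a dictionary that refers back to itself, directly or through a child (PDF page trees, for example, link pages to parents and parents to kids). The following is only a minimal sketch of one way a recursive toString() can guard against such cycles. The Dict class and its methods are hypothetical stand-ins written for illustration; this is not the PDFBox COSDictionary API and not the fix discussed in PDFBOX-1835.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical dictionary-like node (assumption for illustration, not PDFBox code).
final class Dict {
    private final Map<String, Object> items = new LinkedHashMap<>();

    void put(String key, Object value) {
        items.put(key, value);
    }

    @Override
    public String toString() {
        // Track dictionaries currently being printed, compared by identity.
        Set<Dict> inProgress = Collections.newSetFromMap(new IdentityHashMap<>());
        StringBuilder sb = new StringBuilder();
        appendTo(sb, inProgress);
        return sb.toString();
    }

    private void appendTo(StringBuilder sb, Set<Dict> inProgress) {
        if (!inProgress.add(this)) {
            // Already on the current path: emit a marker instead of recursing again.
            sb.append("{...}");
            return;
        }
        sb.append('{');
        boolean first = true;
        for (Map.Entry<String, Object> e : items.entrySet()) {
            if (!first) {
                sb.append(", ");
            }
            first = false;
            sb.append(e.getKey()).append(": ");
            Object v = e.getValue();
            if (v instanceof Dict) {
                ((Dict) v).appendTo(sb, inProgress);
            } else {
                sb.append(v);
            }
        }
        sb.append('}');
        // Leaving this dictionary: it may legitimately appear again on another branch.
        inProgress.remove(this);
    }
}
```

With a page whose "Parent" entry points to a dictionary whose "Kids" entry points back to the page, toString() terminates and prints a marker at the point where the cycle closes, rather than recursing until the stack is exhausted.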