[ https://issues.apache.org/jira/browse/TIKA-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851037#comment-15851037 ]
Sharath Kumar commented on TIKA-2258:
-------------------------------------

Thanks Tim.
https://bz.apache.org/bugzilla/show_bug.cgi?id=60685

> Unable to parse .pub files -java.lang.ArrayIndexOutOfBoundsException: 88
> ------------------------------------------------------------------------
>
> Key: TIKA-2258
> URL: https://issues.apache.org/jira/browse/TIKA-2258
> Project: Tika
> Issue Type: Bug
> Components: core, parser
> Affects Versions: 1.13
> Environment: Windows 7
> Reporter: Sharath Kumar
> Attachments: Roc.pub
>
>
> When I try to parse the attached .pub file, it fails with the exception below:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 88
>     at org.apache.poi.util.LittleEndian.getUShort(LittleEndian.java:343)
>     at org.apache.poi.hpbf.model.qcbits.QCPLCBit$Type12.<init>(QCPLCBit.java:215)
>     at org.apache.poi.hpbf.model.qcbits.QCPLCBit$Type12.<init>(QCPLCBit.java:176)
>     at org.apache.poi.hpbf.model.qcbits.QCPLCBit.createQCPLCBit(QCPLCBit.java:90)
>     at org.apache.poi.hpbf.model.QuillContents.<init>(QuillContents.java:71)
>     at org.apache.poi.hpbf.HPBFDocument.<init>(HPBFDocument.java:67)
>     at org.apache.poi.hpbf.extractor.PublisherTextExtractor.<init>(PublisherTextExtractor.java:45)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:141)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:117)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     ... 28 more

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)