[ https://issues.apache.org/jira/browse/TIKA-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988044#action_12988044 ]
Maxim Valyanskiy commented on TIKA-589:
---------------------------------------

There is an invalid style declaration in this document. I think that we can just ignore such references, but I'm not sure that this is the right way to fix the issue.

Please create a bug in the POI Bugzilla under the HWPF component.

> NPE with POI when parsing word docs
> -----------------------------------
>
>                 Key: TIKA-589
>                 URL: https://issues.apache.org/jira/browse/TIKA-589
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.9
>            Reporter: John Wang
>
> I think this is a POI issue, but dunno where to file it...
> stacktrace:
> Caused by: java.lang.NullPointerException
>     at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP(ParagraphSprmUncompressor.java:47)
>     at org.apache.poi.hwpf.model.StyleSheet.createPap(StyleSheet.java:241)
>     at org.apache.poi.hwpf.model.StyleSheet.<init>(StyleSheet.java:116)
>     at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:229)
>     at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:131)
>     at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:61)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:182)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
> The word file I am trying to parse is:
> http://www2.ed.gov/programs/titleiparta/parentinvguid.doc

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.