[ https://issues.apache.org/jira/browse/TIKA-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770044#comment-15770044 ]

Jorge Spinsanti commented on TIKA-2225:
---------------------------------------

I created an issue on POI too: 
https://bz.apache.org/bugzilla/show_bug.cgi?id=60484

> Parse DOCX file due to NullPointerException on POI code
> -------------------------------------------------------
>
>                 Key: TIKA-2225
>                 URL: https://issues.apache.org/jira/browse/TIKA-2225
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: Jorge Spinsanti
>
> I'm trying to extract text from a DOCX file, but I get an exception caused by a
> NullPointerException in POI code. Stack trace:
> {code}
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@4f5692fe
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>       ... 16 more
> Caused by: java.lang.NullPointerException
>       at org.apache.poi.hwpf.usermodel.Picture.getRawContent(Picture.java:422)
>       at org.apache.poi.hwpf.usermodel.Picture.fillImageContent(Picture.java:131)
>       at org.apache.poi.hwpf.usermodel.Picture.getContent(Picture.java:286)
>       at org.apache.tika.parser.microsoft.WordExtractor.handlePictureCharacterRun(WordExtractor.java:609)
>       at org.apache.tika.parser.microsoft.WordExtractor.handleSpecialCharacterRuns(WordExtractor.java:517)
>       at org.apache.tika.parser.microsoft.WordExtractor.handleParagraph(WordExtractor.java:346)
>       at org.apache.tika.parser.microsoft.WordExtractor.handleParagraph(WordExtractor.java:273)
>       at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:179)
>       at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:169)
>       at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:130)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
