Failing word parse with tika 0.9

2011-06-23 Thread Tom Gross
Hi, I have a Word document:

maie.doc: CDF V2 Document, Little Endian, Os: Windows, Version 5.1, Code page: 1252, Title: Modul: Unternehmungsf\177hrung 5, Author: APO, Template: Normal.dot, Last Saved By: APO, Revision Number: 8, Name of Creating Application: Microsoft Office Word, Last Printed:

Re: Failing word parse with tika 0.9

2011-06-23 Thread Nick Burch
On Thu, 23 Jun 2011, Tom Gross wrote:

which tika 0.9 can't parse. It fails with:

Caused by: java.lang.NullPointerException
    at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP(ParagraphSprmUncompressor.java:47)
    at org.apache.poi.hwpf.model.PAPX.getParagraphProperti

Re: Failing word parse with tika 0.9

2011-06-23 Thread Tom Gross
Upgrading to POI 3.8beta3 fixed the issue. Thanks, Nick!

On 06/23/2011 07:24 PM, Nick Burch wrote:

On Thu, 23 Jun 2011, Tom Gross wrote:

which tika 0.9 can't parse. It fails with:

Caused by: java.lang.NullPointerException
    at org.apache.poi.hwpf.sprm.ParagraphSprmUncompressor.uncompressPAP(Para
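
Since the fix here was a newer POI jar, it can help to confirm which POI and Tika jars the JVM actually loads after the upgrade. The following is a minimal sketch using only the standard library; the class name `PoiVersionCheck` and the helper `describe` are made up for illustration, and the reported version is only as good as what the jar's manifest declares (it may well be missing, in which case "unknown" is printed).

```java
import java.security.CodeSource;

public class PoiVersionCheck {

    /**
     * Hypothetical helper: reports the manifest version (if any) and the
     * jar or directory a class was loaded from, without initializing it.
     */
    static String describe(String className) {
        Class<?> cls;
        try {
            cls = Class.forName(className, false,
                    PoiVersionCheck.class.getClassLoader());
        } catch (ClassNotFoundException | LinkageError e) {
            return className + ": not on classpath";
        }

        // Implementation-Version comes from the jar manifest and may be absent.
        Package pkg = cls.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();

        // The code source is null for classes loaded by the bootstrap loader.
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        String location = (src == null || src.getLocation() == null)
                ? "unknown location"
                : src.getLocation().toString();

        return className + ": version=" + (version == null ? "unknown" : version)
                + ", loaded from " + location;
    }

    public static void main(String[] args) {
        // HWPF is the POI component named in the stack trace above.
        System.out.println(describe("org.apache.poi.hwpf.HWPFDocument"));
        System.out.println(describe("org.apache.tika.Tika"));
    }
}
```

Run it with the same classpath as the application that calls Tika. If the printed location still points at an older POI jar, the upgrade has not taken effect for that classpath.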