On Mon, 5 Sep 2011, Jukka Zitting wrote:
Hm, that is strange: the current version of OfficeParser.POIFSDocumentType.detectType() treats a "CONTENTS" entry as identifying the POI filesystem as an MS Works document. Maybe this is not right.

I think we have some MS Works test files that do contain the
"CONTENTS" entry, though I'm not sure if that's the best possible
heuristic for detecting MS Works documents.

I've checked a few sample files, and they have both CONTENTS and SPELLING entries, so I tweaked the rule to look for both.
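
As a rough illustration of that rule (a sketch only, not the actual Tika code: the class and method names here are made up, and it assumes the top-level entry names of the POI filesystem have already been collected into a set):

```java
import java.util.Set;

// Hypothetical helper, not part of Tika or POI.
class WorksEntryHeuristic {
    // Treat the container as MS Works only when BOTH entries are present.
    // Checking "CONTENTS" alone is what caused the false positives discussed above.
    static boolean looksLikeWorks(Set<String> entryNames) {
        return entryNames.contains("CONTENTS")
                && entryNames.contains("SPELLING");
    }
}
```

With this, a container that has only a CONTENTS entry no longer matches, while the sample Works files (which have both entries) still do.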

My fix in revision 1165259 also checks for the presence of explicit OLE entries, which I believe should help prevent collisions with actual embedded MS Works documents.

I think we might want different types for OLE1 native and general OLE2, as the detector currently won't let us tell them apart.

Nick
