On 20/12/11 15:55, amy...@apache.org wrote:
 <mime-type type="application/vnd.ms-works">
+<magic priority="50">
+<match value="0xd0cf11e0a1b11ae1" type="string" offset="0:8">
+<match value="M\x00a\x00t\x00O\x00S\x00T" type="string" offset="1152:4096" />
+</match>
+</magic>
 <glob pattern="*.wps"/>
--- tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/POIFSContainerDetector.java (original)
+++ tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/POIFSContainerDetector.java Tue Dec 20 15:55:48 2011
@@ -164,6 +164,9 @@ public class POIFSContainerDetector impl
 return VSD;
 } else if (names.contains("\u0001Ole10Native")) {
 return OLE10_NATIVE;
+ } else if (names.contains("MatOST")) {
+ // this occurs on older Works Word Processor files (versions 3.0 and 4.0)
+ return WPS;
 } else if (names.contains("CONTENTS")&& names.contains("SPELLING")) {
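For reference, the new magic entry amounts to roughly the check below. This is only a standalone sketch and not Tika's code: the class and method names are made up for illustration, and I'm assuming the offset range 1152:4096 means the match may start at any position in that range.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Tika: mirrors the two nested <match> rules.
final class WorksMagicSketch {
    // OLE2 compound document signature from the outer <match> (offset 0:8).
    static final byte[] OLE2_SIGNATURE = {
        (byte) 0xd0, (byte) 0xcf, 0x11, (byte) 0xe0,
        (byte) 0xa1, (byte) 0xb1, 0x1a, (byte) 0xe1
    };

    // "M\x00a\x00t\x00O\x00S\x00T": "MatOST" in UTF-16LE without the final 0x00.
    static final byte[] MATOST =
        Arrays.copyOf("MatOST".getBytes(StandardCharsets.UTF_16LE), 11);

    // Assumed meaning of offset="1152:4096": allowed start positions, inclusive.
    static final int SEARCH_FROM = 1152;
    static final int SEARCH_TO = 4096;

    // True when the buffer starts with the OLE2 signature and the MatOST
    // entry name begins somewhere in the search window.
    static boolean looksLikeOldWorksWps(byte[] data) {
        if (!matchesAt(data, 0, OLE2_SIGNATURE)) {
            return false;
        }
        for (int start = SEARCH_FROM; start <= SEARCH_TO; start++) {
            if (matchesAt(data, start, MATOST)) {
                return true;
            }
        }
        return false;
    }

    // Compares pattern against data at the given offset, with bounds checking.
    static boolean matchesAt(byte[] data, int offset, byte[] pattern) {
        if (offset < 0 || offset + pattern.length > data.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (data[offset + i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The nesting matters: the inner match is only tried once the outer OLE2 signature has matched, which is why the sketch returns early on a signature mismatch.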
Can you check your indenting settings? There seems to be something a bit odd with the indenting in several parts of this commit.
Cheers
Nick