[ https://issues.apache.org/jira/browse/TIKA-821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173267#comment-13173267 ]
Antoni Mylka commented on TIKA-821:
-----------------------------------

Committed in r1221323

> Support detecting old Microsoft Works Word Processor formats
> ------------------------------------------------------------
>
> Key: TIKA-821
> URL: https://issues.apache.org/jira/browse/TIKA-821
> Project: Tika
> Issue Type: Improvement
> Components: mime
> Affects Versions: 1.1
> Reporter: Antoni Mylka
> Assignee: Antoni Mylka
>
> An issue similar to TIKA-812. This time it's about old Works Word Processor
> formats. They use an OLE2 structure, but the top-level entry is called
> "MatOST", and they are not supported by the OfficeParser. I would like to:
> # Add a magic to tika-mimetypes.xml to mark the file as ms-works if "MatOST"
> is found. (After TIKA-806 we officially like those).
> # Add an 'if' to POIFSContainerDetector to look for MatOST.
> I'm not creating a separate media type for this (like I did in TIKA-812)
> because no parser supports it anyway. In TIKA-812 it was necessary, because
> ExcelParser can't work with all vnd.ms-works files but can work with 7.0
> spreadsheets. In this case there is no gain in a separate mime type.
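
For illustration only, the 'if' proposed for the container detector might look roughly like the sketch below. This is not the code committed in r1221323. The class name WorksContainerSketch, the detect signature, and the use of plain strings in place of Tika's MediaType are all assumptions made to keep the example self-contained. The fallback type application/x-tika-msoffice is likewise assumed to stand for "generic OLE2 container".

```java
import java.util.Set;

/**
 * Hypothetical, simplified stand-in for the name check described in the issue.
 * It looks only at the names of the top-level entries of an OLE2 container.
 */
final class WorksContainerSketch {
    // Assumed fallback for an OLE2 file whose contents are not recognised.
    static final String GENERIC_OLE2 = "application/x-tika-msoffice";
    // Existing Works media type; no separate type is introduced for old WPS files.
    static final String MS_WORKS = "application/vnd.ms-works";

    /** Returns the media type implied by the top-level entry names. */
    static String detect(Set<String> topLevelNames) {
        if (topLevelNames == null || topLevelNames.isEmpty()) {
            return GENERIC_OLE2;
        }
        // Old Works Word Processor files carry a top-level entry named "MatOST".
        if (topLevelNames.contains("MatOST")) {
            return MS_WORKS;
        }
        return GENERIC_OLE2;
    }
}
```

Under these assumptions, a container whose top-level entries include "MatOST" maps to application/vnd.ms-works, and anything else falls through to the generic OLE2 type. That matches the decision in the issue not to add a separate media type.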