[ https://issues.apache.org/jira/browse/TIKA-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157197#comment-13157197 ]
Nick Burch commented on TIKA-790:
---------------------------------

One possible solution for the few extra types that POIFSDocumentType has (such as Encrypted) is to add a parameter to the mimetype returned by POIFSContainerDetector, e.g. for an Encrypted file return "application/x-tika-msoffice; format=encrypted"

> Reduce duplication between POIFSDocumentType (in OfficeParser) and POIFSContainerDetector
> -----------------------------------------------------------------------------------------
>
>                 Key: TIKA-790
>                 URL: https://issues.apache.org/jira/browse/TIKA-790
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.0
>            Reporter: Nick Burch
>            Assignee: Nick Burch
>
> For historical reasons, we now have two parts of Tika that try to
> identify the type of an OLE2 based file.
> POIFSDocumentType is able to detect a few kinds of files that
> POIFSContainerDetector is not able to (e.g. Encrypted and OLE Native), most
> of which may not map well onto mimetypes. POIFSDocumentType also lacks some of
> the logic in the main detector, and only handles the files that the office
> parser supports.
> We should probably try to reduce the duplication. One option is to add the
> extra few types into the Detector somehow; the other is to use the detector
> first and do additional specific checks afterwards.
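
To make the mimetype-parameter suggestion in the comment concrete, here is a minimal sketch of how a "format" parameter could be attached to, and read back from, a detected type. It is only an illustration under assumptions: FormatParameterSketch and its methods are hypothetical names that are not part of Tika, and plain strings are used so the example has no dependencies. A real change would more likely go through Tika's own MediaType class, which already carries parameters.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, NOT part of Tika: models the "format" parameter idea
// with plain strings so the sketch stays dependency-free.
final class FormatParameterSketch {
    // Base type taken from the example in the comment.
    static final String OLE2_BASE = "application/x-tika-msoffice";

    // Builds e.g. "application/x-tika-msoffice; format=encrypted".
    static String withFormat(String baseType, String format) {
        return baseType + "; format=" + format;
    }

    // Returns the type with any parameters stripped, lower-cased.
    static String baseTypeOf(String mediaType) {
        int semi = mediaType.indexOf(';');
        String base = semi < 0 ? mediaType : mediaType.substring(0, semi);
        return base.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the value of the "format" parameter, if one is present.
    static Optional<String> formatOf(String mediaType) {
        String[] parts = mediaType.split(";");
        // parts[0] is the base type; the rest are name=value parameters.
        for (int i = 1; i < parts.length; i++) {
            String[] kv = parts[i].trim().split("=", 2);
            if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("format")) {
                return Optional.of(kv[1].trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // What a detector might return for an encrypted OLE2 file (assumption).
        String detected = withFormat(OLE2_BASE, "encrypted");
        System.out.println(detected);
        System.out.println(baseTypeOf(detected));
        System.out.println(formatOf(detected).orElse("(none)"));
    }
}
```

With this shape, code that only cares about the container type can compare on the base type and ignore the parameter, while the office parser could inspect the parameter for cases such as Encrypted that do not map cleanly onto a mimetype of their own.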