[ 
https://issues.apache.org/jira/browse/TIKA-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157197#comment-13157197
 ] 

Nick Burch commented on TIKA-790:
---------------------------------

One possible solution for the few extra types that POIFSDocumentType has (such 
as Encrypted) is to add a parameter to the mimetype returned by 
POIFSContainerDetector, e.g. for an Encrypted file return 
"application/x-tika-msoffice; format=encrypted".
                
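As a rough illustration of the parameter idea, here is a minimal standalone 
sketch of how calling code might split such a mimetype string into its base 
type and parameters. It uses only the JDK and none of Tika's own classes. The 
class name MimeParamSketch, its helper methods, and the "format" parameter 
name are assumptions made for illustration, not existing Tika API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Tika: splits a mimetype string such as
// "application/x-tika-msoffice; format=encrypted" into base type and parameters.
public class MimeParamSketch {

    // Returns everything before the first ';', trimmed.
    static String baseType(String mime) {
        int semi = mime.indexOf(';');
        return (semi < 0 ? mime : mime.substring(0, semi)).trim();
    }

    // Returns the "name=value" pairs that follow the base type, in order.
    static Map<String, String> parameters(String mime) {
        Map<String, String> params = new LinkedHashMap<>();
        String[] parts = mime.split(";");
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair, ignore it
            }
            String name = parts[i].substring(0, eq).trim();
            String value = parts[i].substring(eq + 1).trim();
            if (!name.isEmpty()) {
                params.put(name, value);
            }
        }
        return params;
    }

    // Assumed convention from the comment above: format=encrypted marks an encrypted file.
    static boolean isEncrypted(String mime) {
        return "encrypted".equals(parameters(mime).get("format"));
    }

    public static void main(String[] args) {
        String detected = "application/x-tika-msoffice; format=encrypted";
        System.out.println(baseType(detected));    // application/x-tika-msoffice
        System.out.println(parameters(detected));  // {format=encrypted}
        System.out.println(isEncrypted(detected)); // true
    }
}
```

With this approach, code that only cares about the base type keeps working 
unchanged, while code that needs the extra distinction can look at the 
parameter.
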
> Reduce duplication between POIFSDocumentType (in OfficeParser) and 
> POIFSContainerDetector
> -----------------------------------------------------------------------------------------
>
>                 Key: TIKA-790
>                 URL: https://issues.apache.org/jira/browse/TIKA-790
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.0
>            Reporter: Nick Burch
>            Assignee: Nick Burch
>
> For historical reasons, we now have two parts of Tika that try to identify 
> the type of an OLE2 based file.
> POIFSDocumentType is able to detect a few kinds of files that 
> POIFSContainerDetector is not able to (e.g. Encrypted and OLE Native), most 
> of which may not map well onto mimetypes. POIFSDocumentType also lacks some 
> of the logic in the main detector, and only handles the files supported by 
> the office parser.
> We should probably try to reduce the duplication. One option is to add the 
> few extra types into the Detector somehow; the other is to use the detector 
> first and do additional specific checks afterwards.
