[ https://issues.apache.org/jira/browse/TIKA-447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting resolved TIKA-447.
--------------------------------

       Resolution: Fixed
    Fix Version/s: 1.0

As suggested above, I moved the detector classes from o.a.t.detect to o.a.t.parser subpackages in revision 1159985. That should complete the last remaining open issue with this feature, so resolving as fixed.

> Container aware mimetype detection
> ----------------------------------
>
>                 Key: TIKA-447
>                 URL: https://issues.apache.org/jira/browse/TIKA-447
>             Project: Tika
>          Issue Type: New Feature
>          Components: mime
>    Affects Versions: 0.7
>            Reporter: Nick Burch
>             Fix For: 1.0
>
>         Attachments: TIKA-447-TikaInputStream.patch,
> TikaContainerDetection.patch
>
>
> As discussed on the dev list, Tika should ideally have a configurable way to
> process container based formats (eg zip files and ole2 files) when trying to
> detect the correct mime type for a document.
> This needs to be configurable, because some people won't want Tika to have to
> do all the work of parsing the whole file when they're not interested in
> knowing exactly what's in it
> Once we have gone to the trouble of opening and parsing the container file,
> we should try to keep the open container around to speed up parsing of the
> contents

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira