[ https://issues.apache.org/jira/browse/TIKA-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145308#comment-13145308 ]
Alex Ott commented on TIKA-697:
-------------------------------

I think the following magic in tika-mimetypes.xml will be enough (instead of modifying Tika's code):

<mime-type type="application/x-unix-archive">
  <magic priority="50">
    <match value="0x213C617263683E0A" type="string" offset="0" />
  </magic>
  <glob pattern="*.a"/>
</mime-type>

> Tika reports the content type of AR archives as "text/plain"
> ------------------------------------------------------------
>
>                 Key: TIKA-697
>                 URL: https://issues.apache.org/jira/browse/TIKA-697
>             Project: Tika
>          Issue Type: Bug
>         Environment: Linux (CentOS 5.6)
>            Reporter: PNS
>            Priority: Trivial
>
> The Tika.detect(InputStream) method returns "text/plain" for AR archives
> created with the Linux "Create Archive" option of Nautilus (available via
> right-clicking on a file).
> The Apache Commons Compress "autodetection" code of the ArchiveStreamFactory
> looks at the first 12 bytes of the stream and correctly identifies the type
> as AR.
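
For readers checking the magic value proposed in the comment above, the following is a minimal sketch (not Tika code) that decodes the hex string and applies the same offset-0 comparison to a byte buffer. The names `MAGIC_HEX`, `AR_MAGIC`, and `looks_like_ar` are hypothetical and exist only for this illustration; the sample header bytes are made up.

```python
# Hex digits copied from the proposed <match> element, without the "0x" prefix.
MAGIC_HEX = "213C617263683E0A"

# Decode the hex digits into the raw bytes that the match compares against.
AR_MAGIC = bytes.fromhex(MAGIC_HEX)


def looks_like_ar(header: bytes) -> bool:
    """Return True if the buffer starts with the magic bytes at offset 0."""
    return header.startswith(AR_MAGIC)


if __name__ == "__main__":
    print(AR_MAGIC)                                   # b'!<arch>\n'
    print(looks_like_ar(b"!<arch>\nexample.txt/  "))  # True
    print(looks_like_ar(b"just some plain text"))     # False
```

The decoded value is the 8-byte string `!<arch>\n`, so the proposed match looks for that byte sequence at the very start of the stream.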