[ 
https://issues.apache.org/jira/browse/TIKA-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158070#comment-13158070
 ] 

Nick Burch commented on TIKA-697:
---------------------------------

Thanks for this.

I've tweaked the existing mime magic in r1206896, which should now correctly 
detect the file format (the previous one had an erroneous = at the start, and 
lacked the \n). I've also added the alternate extension and alternate mimetype.

In r1206898 I've also added mime magic for .deb, based on the working one for 
AR archives. Ideally we should also add a very small .deb file to the test suite.
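
For illustration only, here is a minimal Java sketch of the byte-level check that the corrected magic amounts to. It assumes the standard ar global header, the 8 bytes "!<arch>\n", and that a .deb is an ar archive whose first member is named "debian-binary". The class name, method names and return labels are hypothetical. This is not Tika's implementation, which declares its magic in the mime types XML rather than in code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Tika or Commons Compress.
final class ArMagicSketch {
    // ar global header: "!<arch>" followed by a newline, 8 bytes in total.
    static final byte[] AR_MAGIC =
            "!<arch>\n".getBytes(StandardCharsets.US_ASCII);

    // Assumption: a .deb starts with the ar header followed directly by the
    // name of its first member, "debian-binary".
    static final byte[] DEB_MAGIC =
            "!<arch>\ndebian-binary".getBytes(StandardCharsets.US_ASCII);

    // True if data begins with the given prefix bytes.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    // Returns an illustrative label, not a real mime type.
    // The more specific .deb check must run before the generic ar check.
    static String detect(byte[] head) {
        if (startsWith(head, DEB_MAGIC)) {
            return "deb";
        }
        if (startsWith(head, AR_MAGIC)) {
            return "ar";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        byte[] sample = "!<arch>\nfoo.txt/".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(sample));
    }
}
```

A header with a stray leading = or without the trailing newline does not match either prefix, which is the kind of mismatch the old magic had.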
                
> Tika reports the content type of AR archives as "text/plain"
> ------------------------------------------------------------
>
>                 Key: TIKA-697
>                 URL: https://issues.apache.org/jira/browse/TIKA-697
>             Project: Tika
>          Issue Type: Bug
>         Environment: Linux (CentOS 5.6)
>            Reporter: PNS
>            Priority: Trivial
>             Fix For: 1.1
>
>         Attachments: tika-697.diff
>
>
> The Tika.detect(InputStream) method returns "text/plain" for AR archives 
> created with the Linux "Create Archive" option of Nautilus (available via 
> right-clicking on a file).
> The Apache Commons Compress "autodetection" code of the ArchiveStreamFactory 
> looks at the first 12 bytes of the stream and correctly identifies the type 
> as AR.
