[ 
https://issues.apache.org/jira/browse/TIKA-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182146#comment-15182146
 ] 

Nick Burch commented on TIKA-1882:
----------------------------------

Just because other people think it's a magic doesn't necessarily mean it is - 
many others just blindly find a few bytes that look common without trying to 
understand the underlying format, and consequently can get it wrong...

As the QuickTime container is the base for MP4, and our MP4 Video mime type 
declares QuickTime Video as its parent, if the patterns are common to both then 
QuickTime is the right place to put them.

I've had a go in bee1a87d7d9ad3a1c5f45cf65082b9505dbe9fc0 at better expressing 
the QuickTime/MP4 relationship in the mime types hierarchy. If you could merge 
that and re-test, and all tests pass, and you also switch the hex strings to 
text where possible (see the pull request comments), then I think we should be 
fine to apply.
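
To illustrate the hex-to-text point, here is a minimal Python sketch that checks whether a hex magic value is really just printable ASCII, and so could be written as a text string instead. The helper name hex_to_text is hypothetical, and the sample values are simply the hex encodings of the "free" and "wide" patterns named in the issue, not values copied from the pull request.

```python
# Illustration only: hex_to_text is a hypothetical helper, and the sample
# values are the hex encodings of "free" and "wide" from the issue text,
# not strings taken from the pull request itself.
def hex_to_text(hex_value):
    """Return the ASCII text for a hex magic such as '0x66726565',
    or None if any byte is not printable ASCII."""
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if all(0x20 <= b < 0x7F for b in raw):
        return raw.decode("ascii")
    return None


print(hex_to_text("0x66726565"))  # free
print(hex_to_text("0x77696465"))  # wide
```

A value that comes back as None contains non-printable bytes and is probably best left in hex.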

> Updating the tika-mimetypes.xml for new mime magic patterns
> -----------------------------------------------------------
>
>                 Key: TIKA-1882
>                 URL: https://issues.apache.org/jira/browse/TIKA-1882
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>    Affects Versions: 1.11
>            Reporter: Manisha Kampasi
>            Priority: Minor
>              Labels: patch
>
> The following mime magic can be added to better detect the below mime-types:
> 1. application/vnd.ms-cab-compressed (.cab files) - pattern "MSCF" in the first 4 bytes
> 2. application/vnd.xara (.xar files) - pattern "xar!" in the first 4 bytes
> 3. application/x-mobipocket-ebook (.mobi files) - pattern "BOOKMOBI" starting 
> at byte position 60
> 4. video/quicktime (.mov files) - patterns "free" and "wide" seen starting at 
> byte position 4
> The changes can be seen here:
> https://github.com/mkampasi/tika/commit/f7433daf434a44937ba3ae8b15813a768f95e334
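
For readers unfamiliar with offset-based magic, the proposed patterns can be sketched as a small lookup in Python. This is only an illustration of matching a fixed byte string at a fixed offset; it is not how Tika itself evaluates magic (which also involves priorities and the mime type hierarchy), and the names CANDIDATE_MAGICS and detect are hypothetical.

```python
# Hypothetical sketch: (mime type, byte offset, pattern) triples based on the
# issue description. Cabinet files start with the signature "MSCF".
CANDIDATE_MAGICS = [
    ("application/vnd.ms-cab-compressed", 0, b"MSCF"),
    ("application/vnd.xara", 0, b"xar!"),
    ("application/x-mobipocket-ebook", 60, b"BOOKMOBI"),
    ("video/quicktime", 4, b"free"),
    ("video/quicktime", 4, b"wide"),
]


def detect(data):
    """Return the first mime type whose pattern occurs at its offset, else None."""
    for mime, offset, pattern in CANDIDATE_MAGICS:
        if data[offset:offset + len(pattern)] == pattern:
            return mime
    return None


# A 4-byte size field followed by a "wide" atom type, then padding.
sample = b"\x00\x00\x00\x08wide" + b"\x00" * 8
print(detect(sample))  # video/quicktime
```

Because "free" and "wide" are generic atom names in the QuickTime container, a match at offset 4 on its own says little about whether the file is a .mov or an MP4, which is why where the pattern sits in the hierarchy matters.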



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
