[ 
https://issues.apache.org/jira/browse/TIKA-86?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jukka Zitting resolved TIKA-86.
-------------------------------

    Resolution: Won't Fix

Agreed with the points above, so resolving as Won't Fix. Let's follow up in 
separate issues on more actionable tasks.

I looked at magic file parsing on a few occasions, but as noted, most of the 
magic files out there are targeted at human-readable output and don't 
contain very comprehensive or accurate media type information. Adapting such 
input to the needs of Tika seems more trouble than it's worth.

That said, some of the more complicated detection rules (like the regexp 
patterns mentioned above) could well be useful for Tika. I'd love to see 
contributions in that area! That would allow us to mine some of the larger 
magic files for specific complex patterns for reuse in our type database.
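
To make the idea concrete, here is a rough sketch of what a regexp-based 
detection rule could look like. This is only an illustration under my own 
assumptions: the class name RegexMagicSketch, the Rule type, the two example 
patterns and the fallback type are all hypothetical and are not part of 
Tika's API or of any existing magic file.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

public class RegexMagicSketch {

    // Hypothetical rule: a regexp applied to the first bytes of a document,
    // paired with the media type to report when it matches.
    static final class Rule {
        final Pattern pattern;
        final String mediaType;

        Rule(String regex, String mediaType) {
            this.pattern = Pattern.compile(regex);
            this.mediaType = mediaType;
        }
    }

    // Example rules only; real rules would be mined from existing magic files.
    static final Rule[] RULES = {
        new Rule("^%PDF-\\d\\.\\d", "application/pdf"),
        new Rule("^\\s*<\\?xml\\s", "application/xml"),
    };

    // Returns the media type of the first matching rule, or a generic fallback.
    static String detect(byte[] prefix) {
        // ISO-8859-1 maps every byte to exactly one char, so the regexp
        // sees the raw bytes without any decoding errors.
        String text = new String(prefix, StandardCharsets.ISO_8859_1);
        for (Rule rule : RULES) {
            if (rule.pattern.matcher(text).find()) {
                return rule.mediaType;
            }
        }
        return "application/octet-stream";
    }
}
```

Calling RegexMagicSketch.detect() with the first few hundred bytes of a 
stream would then return the type of the first matching rule. Rules are tried 
in order, so more specific patterns should come first.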
                
> Support magic(5) files
> ----------------------
>
>                 Key: TIKA-86
>                 URL: https://issues.apache.org/jira/browse/TIKA-86
>             Project: Tika
>          Issue Type: New Feature
>          Components: general
>            Reporter: Jukka Zitting
>
> Tika should have a parser for the magic(5) file format used by the file(1) 
> command. Then we could use existing magic rules from places like 
> http://svn.apache.org/repos/asf/httpd/httpd/trunk/docs/conf/magic.


        
