[ https://issues.apache.org/jira/browse/NUTCH-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507590#comment-14507590 ]

Chris A. Mattmann commented on NUTCH-1991:
------------------------------------------

Will try this out today. Thanks, Iain!

> Tika mime detection not using Nutch supplied tika-mimetypes.xml for content 
> based detection
> -------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-1991
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1991
>             Project: Nutch
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.2, 2.3, 1.8, 2.4, 1.9, 2.2.1, 1.10, 1.11, 2.3.1
>            Reporter: Iain Lopata
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>         Attachments: NUTCH-1991-1.6.patch
>
>
> From Nutch 1.5 onwards, the MimeUtil.java class, which acts as a facade 
> to Tika for MIME type detection, attempts a match using the MIME type 
> returned by the server, the filename, and the content. NUTCH-1045 added 
> support for an external tika-mimetypes.xml file that configures this 
> process. However, the content-based detection did not use this file and 
> instead fell back to the configuration bundled with the Tika library. 
> Consequently, any content-based match rules added to the Nutch version 
> of the configuration file were not used.
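
For illustration, a content-based match rule of the kind the description
refers to might look like the fragment below. This is only a sketch in the
Tika mime-info format: the type name application/x-example, the magic
string EXMP, and the glob *.exmp are hypothetical and are not taken from
the issue or the attached patch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
  <!-- Hypothetical custom type; the name is an assumption for illustration -->
  <mime-type type="application/x-example">
    <!-- Content-based rule: match the bytes "EXMP" at the start of the content -->
    <magic priority="50">
      <match value="EXMP" type="string" offset="0"/>
    </magic>
    <!-- Filename-based rule for the same type -->
    <glob pattern="*.exmp"/>
  </mime-type>
</mime-info>
```

Per the description above, a <magic> rule like this placed in the Nutch
copy of the file would have had no effect on content-based detection,
because that step fell back to the configuration bundled with the Tika
library.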



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
