[ https://issues.apache.org/jira/browse/NUTCH-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512588#comment-14512588 ]
Hudson commented on NUTCH-1991:
-------------------------------

FAILURE: Integrated in Nutch-trunk #3089 (See [https://builds.apache.org/job/Nutch-trunk/3089/])
Fix for NUTCH-1991 Tika mime detection not using Nutch supplied tika-mimetypes.xml for content based detection contributed by Iain Lopata and Sebastien Nagel. (mattmann: http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1676028)
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/src/java/org/apache/nutch/util/MimeUtil.java


> Tika mime detection not using Nutch supplied tika-mimetypes.xml for content
> based detection
> -------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-1991
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1991
>             Project: Nutch
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 2.2, 2.3, 1.8, 2.4, 1.9, 2.2.1, 1.10, 1.11, 2.3.1
>            Reporter: Iain Lopata
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 1.10
>
>         Attachments: NUTCH-1991-1.6.patch, NUTCH-1991-trunk.v2.patch
>
>
> From Nutch Version 1.5 onwards the MimeUtil.java class that acts as a facade
> to Tika to perform mime type detection uses a process that attempts a match
> using the mimetype returned by the server, the filename and the content.
> NUTCH-1045 provided for the use of an external tika-mimetypes.xml file which
> provides the configuration for this process. However, the content based
> detection did not use this file, but instead reverted to using the
> configuration included in the tika library. Consequently, any content based
> match rules added to the nutch version of the configuration file were not
> used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)