[ 
https://issues.apache.org/jira/browse/NUTCH-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601519#action_12601519
 ] 

Chris A. Mattmann commented on NUTCH-618:
-----------------------------------------

Dennis Kubes tested this patch for me. According to Dennis, two log warnings
still came up:

1. For the alias:

<alias type="application/x-dosexec;exe" />

removing the ";exe" suffix removed one of the warnings.

2. Removing the sub-class-of element from:

<mime-type type="application/xhtml+xml">
  <sub-class-of type="text/xml" />
  <glob pattern="*.xhtml" />
  <root-XML namespaceURI='http://www.w3.org/1999/xhtml'
            localName='html' />
</mime-type>

removes the second warning.
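
For anyone who wants to look for similar entries in their own copy of the
file, here is a minimal Python sketch. It is not Tika's validation logic,
only a rough lint under two assumptions: that an alias name carrying a
";parameter" suffix is suspect, and that the same alias name appearing more
than once (or matching a declared type) is suspect. The function name
lint_aliases, the sample XML, and the application/x-msdownload parent type
are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample; only the alias strings echo the entries quoted above.
SAMPLE = """<mime-info>
  <mime-type type="application/x-msdownload">
    <alias type="application/x-dosexec;exe" />
  </mime-type>
  <mime-type type="application/xml">
    <alias type="text/xml" />
  </mime-type>
  <mime-type type="application/xhtml+xml">
    <alias type="text/xml" />
    <glob pattern="*.xhtml" />
  </mime-type>
</mime-info>"""


def lint_aliases(xml_text):
    """Return (problem, alias_name) pairs for suspicious <alias> entries.

    This is an illustrative check, not a reimplementation of what
    org.apache.tika.mime.MimeTypesReader does.
    """
    root = ET.fromstring(xml_text)
    # Every type declared by a <mime-type> element.
    declared = {mt.get("type") for mt in root.iter("mime-type")}
    # Every alias name, skipping <alias> elements with no type attribute.
    aliases = [a.get("type") for a in root.iter("alias") if a.get("type")]
    counts = Counter(aliases)

    problems = []
    for name in sorted(counts):
        if ";" in name:
            problems.append(("has-parameter", name))
        if counts[name] > 1:
            problems.append(("duplicate-alias", name))
        if name in declared:
            problems.append(("alias-shadows-type", name))
    return problems


if __name__ == "__main__":
    for problem, name in lint_aliases(SAMPLE):
        print(problem, name)
```

To run it against a real file, read the file contents and pass them to
lint_aliases() in place of SAMPLE.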

I am going to attach an updated patch that addresses these issues.

Thanks,
 Chris


> Tika error "Media type alias already exists"
> --------------------------------------------
>
>                 Key: NUTCH-618
>                 URL: https://issues.apache.org/jira/browse/NUTCH-618
>             Project: Nutch
>          Issue Type: Bug
>          Components: mime_type_detector
>    Affects Versions: 1.0.0
>            Reporter: Andrzej Bialecki 
>            Assignee: Chris A. Mattmann
>         Attachments: NUTCH-618.Mattmann.patch.060108.2.txt, 
> NUTCH-618.Mattmann.patch.060108.txt
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> After the upgrade to the latest Tika jar we see a lot of errors like this:
> 2008-03-06 08:07:20,659 WARN org.apache.tika.mime.MimeTypesReader: Invalid media type alias: text/xml
> org.apache.tika.mime.MimeTypeException: Media type alias already exists: text/xml
>       at org.apache.tika.mime.MimeTypes.addAlias(MimeTypes.java:312)
>       at org.apache.tika.mime.MimeType.addAlias(MimeType.java:238)
>       at org.apache.tika.mime.MimeTypesReader.readMimeType(MimeTypesReader.java:168)
>       at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:138)
>       at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:121)
>       at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:56)
>       at org.apache.nutch.util.MimeUtil.<init>(MimeUtil.java:58)
>       at org.apache.nutch.protocol.Content.<init>(Content.java:85)
>       at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:226)
>       at org.apache.nutch.fetcher.Fetcher2$FetcherThread.run(Fetcher2.java:523)
> This is most likely caused by a duplicate tika-mimetypes.xml file: one 
> copy is embedded inside the Tika jar, the other is in the Nutch conf/ 
> directory. The one inside the jar seems to be more recent, so I propose to 
> simply remove the one we have in conf.
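
The duplicate-copy theory in the description can be checked with a short
script. The following is a minimal Python sketch that lists every copy of
tika-mimetypes.xml it can find in a given jar and a given conf directory.
The function name find_mimetype_copies and both path arguments are
placeholders for illustration; the entry path inside the jar is not assumed,
so the script matches on the file name only.

```python
import os
import zipfile

RESOURCE_NAME = "tika-mimetypes.xml"


def find_mimetype_copies(jar_path, conf_dir):
    """List copies of tika-mimetypes.xml in a jar and in a conf directory.

    jar_path and conf_dir are supplied by the caller; nothing here knows
    where a particular Nutch install keeps them.
    """
    copies = []
    # A jar is a zip archive, so the stdlib zipfile module can list it.
    with zipfile.ZipFile(jar_path) as jar:
        for entry in jar.namelist():
            if entry == RESOURCE_NAME or entry.endswith("/" + RESOURCE_NAME):
                copies.append("jar:" + entry)
    # A loose copy sitting directly in the conf directory.
    conf_copy = os.path.join(conf_dir, RESOURCE_NAME)
    if os.path.isfile(conf_copy):
        copies.append(conf_copy)
    return copies
```

If the returned list has more than one entry, both copies are present,
which matches the situation described above. Whether both are actually
loaded depends on how the application builds its classpath, which this
script does not inspect.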

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
