Thanks, that helped solve my problem too.

Does this mean we need to update the config file in the trunk?

> I get the same error (same setup as you: no changes to the Nutch
> defaults).  I doubt it's the right way to do it, but I just found that if
> I extract tika-mimetypes.xml from the jar file and copy it over the
> one in nutch-trunk/conf, at least I don't see those errors any more.
>
> Emmanuel wrote:
>> Hi Chris,
>>
>> FYI, I used the version provided by Nutch without changing it.
>>
>> Anyway please find it attached.
>>
>> Thanks,
>> E
>>  > Hi Emmanuel,
>>  >
>>  > Could you please post your
>>  > /data/sengine/search/conf/tika-mimetypes.xml file?
>>  >
>>  > Thanks,
>>  >  Chris
>>  >
>>  >
>>  >
>>  > On 2/14/08 6:07 AM, "Emmanuel" <[EMAIL PROTECTED]> wrote:
>>  >
>>  >> Hi Guys,
>>  >>
>>  >> I've updated my Nutch version to use the latest trunk with the new
>>  >> Tika jar.
>>  >>
>>  >> I ran a crawl and got a lot of errors like these:
>>  >> 2008-02-14 22:02:51,494 INFO  conf.Configuration - found resource tika-mimetypes.xml at file:/data/sengine/search/conf/tika-mimetypes.xml
>>  >> 2008-02-14 22:02:51,499 WARN  mime.MimeTypesReader - Invalid media type alias: text/xml
>>  >> org.apache.tika.mime.MimeTypeException: Media type alias already exists: text/xml
>>  >>         at org.apache.tika.mime.MimeTypes.addAlias(MimeTypes.java:312)
>>  >>         at org.apache.tika.mime.MimeType.addAlias(MimeType.java:238)
>>  >>         at org.apache.tika.mime.MimeTypesReader.readMimeType(MimeTypesReader.java:168)
>>  >>         at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:138)
>>  >>         at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:121)
>>  >>         at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:56)
>>  >>         at org.apache.nutch.util.MimeUtil.<init>(MimeUtil.java:58)
>>  >>         at org.apache.nutch.protocol.Content.<init>(Content.java:85)
>>  >>         at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:226)
>>  >>         at org.apache.nutch.fetcher.Fetcher2$FetcherThread.run(Fetcher2.java:523)
>>  >> 2008-02-14 22:02:51,500 WARN  mime.MimeTypesReader - Invalid media type alias: application/x-dosexec;exe
>>  >> org.apache.tika.mime.MimeTypeException: Invalid media type alias: application/x-dosexec;exe
>>  >>         at org.apache.tika.mime.MimeType.addAlias(MimeType.java:242)
>>  >>         at org.apache.tika.mime.MimeTypesReader.readMimeType(MimeTypesReader.java:168)
>>  >>         at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:138)
>>  >>         at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:121)
>>  >>         at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:56)
>>  >>         at org.apache.nutch.util.MimeUtil.<init>(MimeUtil.java:58)
>>  >>         at org.apache.nutch.protocol.Content.<init>(Content.java:85)
>>  >>         at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:226)
>>  >>         at org.apache.nutch.fetcher.Fetcher2$FetcherThread.run(Fetcher2.java:523)
>>  >>
>>  >> Is that normal?
>>  >> Am I missing something?
>>  >
>>  > ______________________________________________
>>  > Chris Mattmann, Ph.D.
>>  > [EMAIL PROTECTED]
>>  > Cognizant Development Engineer
>>  > Early Detection Research Network Project
>>  > _________________________________________________
>>  > Jet Propulsion Laboratory            Pasadena, CA
>>  > Office: 171-266B                     Mailstop:  171-246
>>  > _______________________________________________________
>>  >
>>  > Disclaimer:  The opinions presented within are my own and do not reflect
>>  > those of either NASA, JPL, or the California Institute of Technology.
>>  >
>>  >
>>  >
>>
>
>
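
The workaround described in the quoted reply above (pull tika-mimetypes.xml out of the Tika jar and copy it over the one in the Nutch conf directory) could be scripted roughly as below. This is only a sketch of that workaround, which its author himself doubted is the right fix. The function name extract_mimetypes and the ".bak" backup are made up for illustration, and the jar path and conf directory are whatever your checkout uses. The location of the file inside the jar is not assumed: the code searches the archive by file name.

```python
import shutil
import zipfile
from pathlib import Path

MIME_FILE = "tika-mimetypes.xml"


def extract_mimetypes(jar_path, conf_dir, name=MIME_FILE):
    """Copy the first jar entry whose file name is `name` into conf_dir.

    Hypothetical helper: any existing copy in conf_dir is first saved
    as <name>.bak so the change can be reverted. Returns the path written.
    """
    target = Path(conf_dir) / name
    with zipfile.ZipFile(jar_path) as jar:  # a jar is an ordinary zip archive
        # Match on the file name only; the directory inside the jar is not assumed.
        matches = [e for e in jar.namelist() if e.rsplit("/", 1)[-1] == name]
        if not matches:
            raise FileNotFoundError(f"{name} not found in {jar_path}")
        if target.exists():
            # Keep the old config next to the new one.
            shutil.copy2(target, target.with_name(target.name + ".bak"))
        with jar.open(matches[0]) as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return target
```

It would be called as extract_mimetypes("path/to/tika.jar", "nutch-trunk/conf"), where the jar path is a placeholder. Keeping the backup makes it easy to diff the two files and see which alias entries differ.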
