Hi,

I am not too familiar with Nutch these days, but my guess is that a Solr
analyser is getting applied. To keep a field exactly as is, use the string
fieldtype in Solr's schema.xml rather than the text fieldtype.
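
For instance, something along these lines in schema.xml. This is only a
sketch: I am assuming the content-type ends up in a field called "type" and
that your schema already defines the stock "string" type, so check the actual
names in your own schema.

```xml
<!-- Inside <types>: the untokenized string type (solr.StrField stores and
     indexes the value verbatim, with no analyser applied). -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

<!-- Inside <fields>: the field name "type" is an assumption; use whatever
     field your content-type is actually indexed into. -->
<field name="type" type="string" stored="true" indexed="true"/>
```

After changing a field's type you would need to restart Solr and reindex the
documents for it to take effect.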

Regards,
Gora
On 05-Aug-2011 6:35 PM, "Marek Bachmann" <[email protected]> wrote:
> Hello people,
>
> I was just wondering how to keep the content-type string from being split
> into multiple values.
> For example: If a document has the content-type: "Application/pdf" it is
> broken into three pieces "Application/pdf", "Application", "pdf" in the
> solr field type.
>
> I am not sure if this is done by nutch, or if it is an index topic in
> solr.
>
> Sure someone knows the answer to that.
>
> Thank you.
