Add the "passthrough" dynamic field to your Solr schema, and then see what fields get passed through to Solr from Nutch. Then, add the missing fields to your Solr schema and remove the passthrough.

<dynamicField name="*" type="string" indexed="true" stored="true" multiValued="true" />

Or, add Solr <copyField> directives to place fields in existing named fields.
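
As a rough sketch of the copyField option: the fragment below copies two incoming fields into an existing catch-all field. The source names "title" and "content" and the destination "text" are assumptions for illustration only; substitute whatever fields the passthrough shows arriving from Nutch and whatever named fields your schema already defines. The source of each copyField has to be known to the schema, either as an explicit field or through a dynamicField pattern.

```xml
<!-- Hypothetical field names: "title" and "content" stand in for whatever
     Nutch actually sends; "text" stands in for an existing field in your schema. -->
<copyField source="title" dest="text"/>
<copyField source="content" dest="text"/>
```

If several values can land in the destination, it most likely needs to be declared multiValued="true".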

Or... talk to the Nutch people about how to do field name mapping on the Nutch side of the fence.

Hold off on UUIDs until you figure all of the above out and everything is working without them.

-- Jack Krupansky

-----Original Message-----
From: Joe Zhang
Sent: Sunday, June 23, 2013 2:35 PM
To: solr-user@lucene.apache.org
Subject: Re: document id in nutch/solr

Can somebody help with this one, please?


On Fri, Jun 21, 2013 at 10:36 PM, Joe Zhang <smartag...@gmail.com> wrote:

A quite standard configuration of nutch seems to automatically map "url"
to "id". Two questions:

- Where is such mapping defined? I can't find it anywhere in
nutch-site.xml or schema.xml. The latter does define the "id" field as well
as its uniqueness, but not the mapping.

- Given that nutch has already defined such an id, can I ask solr to
redefine id as UUID?
<field name="id" type="uuid" indexed="true" stored="true" default="NEW"/>

- This leads to a related question: do solr and nutch have to have
IDENTICAL schema.xml?

