Hello, my name is Antony and I'm new to Apache Nutch and Solr.

I want to crawl my website, so I downloaded Nutch to do this.
This works fine. But now I would like to integrate Nutch with Solr. I'm
running this on my Unix system.
I'm trying to follow this tutorial:
But it won't work for me. Running Solr without Nutch is no problem. I can post
documents to Solr with post.jar. But what I want to do is post my Nutch
crawl to Solr.
Now if I copy the schema.xml from Nutch to the
apache-solr-4.0.0/example/solr/collection1/conf directory and restart Solr
(java -jar start.jar), I get the schema parsing errors below, but Solr still
starts. (Is this the correct directory to copy my schema to?)

Nov 8, 2012 9:40:33 AM org.apache.solr.schema.IndexSchema readSchema
INFO: Schema name=nutch
Nov 8, 2012 9:40:33 AM org.apache.solr.core.CoreContainer create
SEVERE: Unable to create core: collection1
org.apache.solr.common.SolrException: Schema Parsing Failed: multiple points
        at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)

Nov 8, 2012 9:40:33 AM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Schema Parsing Failed: multiple points
        at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:846)
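
A possible cause worth checking (an assumption, not something the log above confirms): "multiple points" is the message Java gives when it parses a number containing more than one decimal point, so a version attribute such as "1.5.1" on the root element of the copied schema.xml could trigger it. Below is a minimal Python sketch for inspecting that attribute. The helper name schema_version_is_numeric and the sample schema header are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET


def schema_version_is_numeric(schema_xml: str) -> bool:
    """Return True if the root element's version attribute parses as a float."""
    root = ET.fromstring(schema_xml.strip())
    # Treat a missing attribute as "1.0" (an assumption made for this sketch).
    version = root.get("version", "1.0")
    try:
        float(version)
        return True
    except ValueError:
        # A value such as "1.5.1" has two decimal points and is not a number.
        return False


# Hypothetical schema header, not copied from a real Nutch release.
sample = '<schema name="nutch" version="1.5.1"><fields/></schema>'
print(schema_version_is_numeric(sample))
```

If the check fails for the real file, changing the attribute to a plain decimal value (for example "1.5") and restarting Solr would be one thing to try.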

Now if I don't copy the schema and push my Nutch crawl to Solr, I get the
following error:

SolrIndexer: starting at 2012-11-08 10:49:02
Indexing 5 documents
java.io.IOException: Job failed!
SolrDeleteDuplicates: starting at 2012-11-08 10:49:47
SolrDeleteDuplicates: Solr url: http://photon:8983/solr/

And this is taken from the log:
org.apache.solr.common.SolrException: ERROR: [doc=http://e-docs/infrastructure/cpuload_monitor.html] unknown field 'host'
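
The "unknown field" message suggests that the schema Solr is running with does not declare a field named host. Below is a minimal Python sketch for listing the fields a schema.xml declares, assuming the usual <field name="..."/> layout. The function name declared_fields and the sample XML are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET


def declared_fields(schema_xml: str) -> set:
    """Collect the name attribute of every <field> element in a schema."""
    root = ET.fromstring(schema_xml.strip())
    # iter() walks the whole tree, so fields nested under <fields> are found too.
    return {f.get("name") for f in root.iter("field") if f.get("name")}


# Hypothetical schema content, not the real Solr example schema.
sample = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="title" type="text_general" indexed="true" stored="true"/>
  </fields>
</schema>
"""

fields = declared_fields(sample)
print(sorted(fields))
print("host" in fields)
```

Running the same function on the contents of the schema.xml in the collection1/conf directory would show whether host is among the declared fields.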

What should I do or what am I missing?

I hope you can help me.
Best regards
