Hi,

I'm new to nutch and I'm trying to do a basic crawl and index to solr. I run

./bin/crawl ./urls/seed.txt Test http://localhost:8983/solr/ 2

The crawl generates the 'segments', 'linkdb' and 'crawldb' dirs, and they contain 
data. The indexing step gives me:

2013-10-05 22:18:50.529 java[40459:1203] Unable to load realm info from 
SCDynamicStore
Indexer: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
        at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:123)
        at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:185)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:195)
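In case it's useful, this is how I understand the indexing step could be re-run on its own to get a fuller error (assuming the Nutch 1.x solrindex syntax; the paths are just from my local setup above):

```shell
# Re-run only the indexing step against the existing crawl data
# ("Test" is the crawl dir from my bin/crawl invocation)
./bin/nutch solrindex http://localhost:8983/solr/ Test/crawldb \
    -linkdb Test/linkdb Test/segments/*

# The underlying cause usually ends up in the Hadoop log rather than stdout
tail -n 50 logs/hadoop.log
```

I may well be holding it wrong, so corrections to the above are welcome too.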

I've tried both the latest Solr and 3.4: copying over schema.xml and 
apache-solr-solrj-3.4.0.jar, and doing the equivalent with solr-4.4.0. I've 
followed the Apache tutorials and some other helpful blogs. I assume there's 
something simple I'm missing. Any thoughts would be super welcome :)

Best,
Olle
