Actually, I do get something in the Hadoop log:

java.lang.Exception: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr: Expected mime type application/octet-stream but got text/html.
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /solr/update. Reason:
<pre> Not Found</pre></p><hr><i><small>Powered by Jetty://</small></i><hr/>
</body>
</html>
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr: Expected mime type application/octet-stream but got text/html. <html>

Googling turns up suggestions that /solr/update is the wrong URL and that it needs to include the Nutch core name. But where does that URL need to be configured? I don't believe it's on the command line.

On Fri, Oct 6, 2017 at 5:28 PM, Sol Lederman <sol.leder...@gmail.com> wrote:

> Hi,
>
> I've got Nutch 1.13 and Solr 5.5.0. When I try to index some documents I
> get an error:
>
> % bin/nutch index -D solr.server.url=http://localhost:8983/solr
> crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/20170910201610/ -filter
> -normalize -deleteGone
>
> Indexing 20/20 documents
> Deleting 0 documents
> Indexing 20/20 documents
> Deleting 0 documents
> Indexer: java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
>     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:147)
>     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:230)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:239)
>
> I found an article on StackOverflow that suggested comparing the fields in
> schema.xml. So, I compared
> /home/me/apache-solr/solr-5.5.0/server/solr/configsets/nutch/conf/schema.xml
> and
> /home/me/apache-nutch/apache-nutch-1.13/conf/schema.xml
>
> There are no differences in fields. And there is no further info in the
> Nutch log.
>
> How can I debug this?
>
> Thanks!
> Sol
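
One thing that may be worth trying: the index command quoted above already passes solr.server.url with -D, so the core name could be appended there. The sketch below only builds and prints such a command for review. The core name "nutch" is an assumption (guessed from the configsets/nutch directory), and the variable names are just for illustration.

```shell
#!/bin/sh
# Assumption: the Solr core is called "nutch" (guessed from the configsets/nutch
# directory); substitute the real core name if it differs.
SOLR_BASE="http://localhost:8983/solr"
SOLR_CORE="nutch"
SOLR_URL="${SOLR_BASE}/${SOLR_CORE}"

# Same arguments as the failing run; only solr.server.url changes.
INDEX_CMD="bin/nutch index -D solr.server.url=${SOLR_URL} crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/20170910201610/ -filter -normalize -deleteGone"

# Print the command instead of running it, so it can be checked first.
echo "$INDEX_CMD"
```

If the core has a different name, only SOLR_CORE needs to change; the printed command can then be run from the Nutch directory.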