[ https://issues.apache.org/jira/browse/NUTCH-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julien Nioche resolved NUTCH-1773.
----------------------------------

    Resolution: Not a Problem

As Lewis pointed out, you need to specify the SOLR URL with the solrindex command or, if using the index command directly, pass it (solr.server.url) either on the command line (-D solr.server.url=MYSOLRURL) or via nutch-site.xml.

> Solr Indexer fails
> ------------------
>
>                 Key: NUTCH-1773
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1773
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 2.3
>         Environment: Ubuntu 12.04 LTS, java version "1.7.0_55" - Hbase-0.90.6 (pseudo dist), Hadoop 1.2.1, Solr 4.6
>            Reporter: Ralf
>            Priority: Critical
>             Fix For: 2.3
>
>
> When using crawl script or solrindexer by itself (/bin/nutch solrindex) in localmode it fails with:
> hduser@bl4ck1c3:~/nutch-2.3/runtime/local$ bin/nutch solrindex TestCrawl18 -reindex
> IndexingJob: starting
> Active IndexWriters :
> SOLRIndexWriter
>     solr.server.url : URL of the SOLR instance (mandatory)
>     solr.commit.size : buffer size when sending to SOLR (default 1000)
>     solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
>     solr.auth : use authentication (default false)
>     solr.auth.username : use authentication (default false)
>     solr.auth : username for authentication
>     solr.auth.password : password for authentication
> SolrIndexerJob: java.lang.IllegalStateException: Target host must not be null, or set in parameters.
>     at org.apache.http.impl.client.DefaultRequestDirector.determineRoute(DefaultRequestDirector.java:787)
>     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:414)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:393)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:197)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>     at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:168)
>     at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:146)
>     at org.apache.nutch.indexwriter.solr.SolrIndexWriter.commit(SolrIndexWriter.java:146)
>     at org.apache.nutch.indexer.IndexWriters.commit(IndexWriters.java:127)
>     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:171)
>     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:187)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:196)
>
> when using the new INDEX command it finishes, but nothing is added to Solr:
> hduser@bl4ck1c3:~/nutch-2.3/runtime/local$ bin/nutch index TestCrawl18 -reindex
> IndexingJob: starting
> Active IndexWriters :
> SOLRIndexWriter
>     solr.server.url : URL of the SOLR instance (mandatory)
>     solr.commit.size : buffer size when sending to SOLR (default 1000)
>     solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
>     solr.auth : use authentication (default false)
>     solr.auth.username : use authentication (default false)
>     solr.auth : username for authentication
>     solr.auth.password : password for authentication
>
> Log shows:
> 2014-05-13 03:01:13,781 INFO indexer.IndexingJob - IndexingJob: starting
> 2014-05-13 03:01:14,108 INFO indexer.IndexingFilters - Adding org.apache.nutch.analysis.lang.LanguageIndexingFilter
> 2014-05-13 03:01:14,109 INFO basic.BasicIndexingFilter - Maximum title length for indexing set to: 100
> 2014-05-13 03:01:14,109 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
> 2014-05-13 03:01:14,335 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.more.MoreIndexingFilter
> 2014-05-13 03:01:14,336 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
> 2014-05-13 03:01:14,336 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> 2014-05-13 03:01:14,620 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:14,768 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:14,968 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,243 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,276 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,326 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,386 INFO indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
> 2014-05-13 03:01:15,403 INFO solr.SolrMappingReader - source: content dest: content
> 2014-05-13 03:01:15,403 INFO solr.SolrMappingReader - source: title dest: title
> 2014-05-13 03:01:15,403 INFO solr.SolrMappingReader - source: host dest: host
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: batchId dest: batchId
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: boost dest: boost
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: digest dest: digest
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding org.apache.nutch.analysis.lang.LanguageIndexingFilter
> 2014-05-13 03:01:15,405 INFO basic.BasicIndexingFilter - Maximum title length for indexing set to: 100
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.more.MoreIndexingFilter
> 2014-05-13 03:01:15,405 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> 2014-05-13 03:01:15,426 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,442 WARN mapred.FileOutputCommitter - Output path is null in cleanup
> 2014-05-13 03:01:16,144 INFO indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
> 2014-05-13 03:01:16,144 INFO indexer.IndexingJob - Active IndexWriters :
> SOLRIndexWriter
>     solr.server.url : URL of the SOLR instance (mandatory)
>     solr.commit.size : buffer size when sending to SOLR (default 1000)
>     solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
>     solr.auth : use authentication (default false)
>     solr.auth.username : use authentication (default false)
>     solr.auth : username for authentication
>     solr.auth.password : password for authentication
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: content dest: content
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: title dest: title
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: host dest: host
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: batchId dest: batchId
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: boost dest: boost
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: digest dest: digest
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
> 2014-05-13 03:01:16,338 INFO solr.SolrIndexWriter - Total 0 document is added.
> 2014-05-13 03:01:16,338 INFO indexer.IndexingJob - IndexingJob: done.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
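
For reference, the nutch-site.xml route mentioned in the resolution could look roughly like the fragment below. This is only a sketch: the property name solr.server.url is taken from the message, while the URL value is a placeholder assumption and has to be replaced with the address of the Solr instance actually in use.

```xml
<?xml version="1.0"?>
<!-- Sketch of a conf/nutch-site.xml override; merge the property into the
     existing <configuration> element rather than replacing the file. -->
<configuration>
  <property>
    <name>solr.server.url</name>
    <!-- Placeholder value (assumption): use the URL of your own Solr instance. -->
    <value>http://localhost:8983/solr</value>
  </property>
</configuration>
```

With the property set this way, the value no longer needs to be supplied through -D solr.server.url=MYSOLRURL on the command line.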