[ https://issues.apache.org/jira/browse/NUTCH-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998333#comment-13998333 ]
Lewis John McGibbney commented on NUTCH-1773:
---------------------------------------------

bq. hduser@bl4ck1c3:~/nutch-2.3/runtime/local$ bin/nutch solrindex TestCrawl18 -reindex

There is no $SOLR_URL passed as an argument here!
Is anyone else getting issues with this? Can we reproduce?

> Solr Indexer fails
> ------------------
>
> Key: NUTCH-1773
> URL: https://issues.apache.org/jira/browse/NUTCH-1773
> Project: Nutch
> Issue Type: Bug
> Components: indexer
> Affects Versions: 2.3
> Environment: Ubuntu 12.04 LTS, java version "1.7.0_55" - Hbase-0.90.6 (pseudo dist), Hadoop 1.2.1, Solr 4.6
> Reporter: Ralf
> Priority: Critical
> Fix For: 2.3
>
>
> When using crawl script or solrindexer by itself (/bin/nutch solrindex) in localmode it fails with:
> hduser@bl4ck1c3:~/nutch-2.3/runtime/local$ bin/nutch solrindex TestCrawl18 -reindex
> IndexingJob: starting
> Active IndexWriters :
> SOLRIndexWriter
>     solr.server.url : URL of the SOLR instance (mandatory)
>     solr.commit.size : buffer size when sending to SOLR (default 1000)
>     solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
>     solr.auth : use authentication (default false)
>     solr.auth.username : use authentication (default false)
>     solr.auth : username for authentication
>     solr.auth.password : password for authentication
> SolrIndexerJob: java.lang.IllegalStateException: Target host must not be null, or set in parameters.
>     at org.apache.http.impl.client.DefaultRequestDirector.determineRoute(DefaultRequestDirector.java:787)
>     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:414)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:393)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:197)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>     at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:168)
>     at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:146)
>     at org.apache.nutch.indexwriter.solr.SolrIndexWriter.commit(SolrIndexWriter.java:146)
>     at org.apache.nutch.indexer.IndexWriters.commit(IndexWriters.java:127)
>     at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:171)
>     at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:187)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:196)
> when using the new INDEX command it finishes, but nothing is added to Solr:
> hduser@bl4ck1c3:~/nutch-2.3/runtime/local$ bin/nutch index TestCrawl18 -reindex
> IndexingJob: starting
> Active IndexWriters :
> SOLRIndexWriter
>     solr.server.url : URL of the SOLR instance (mandatory)
>     solr.commit.size : buffer size when sending to SOLR (default 1000)
>     solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
>     solr.auth : use authentication (default false)
>     solr.auth.username : use authentication (default false)
>     solr.auth : username for authentication
>     solr.auth.password : password for authentication
>
> Log shows:
> 2014-05-13 03:01:13,781 INFO indexer.IndexingJob - IndexingJob: starting
> 2014-05-13 03:01:14,108 INFO indexer.IndexingFilters - Adding org.apache.nutch.analysis.lang.LanguageIndexingFilter
> 2014-05-13 03:01:14,109 INFO basic.BasicIndexingFilter - Maximum title length for indexing set to: 100
> 2014-05-13 03:01:14,109 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
> 2014-05-13 03:01:14,335 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.more.MoreIndexingFilter
> 2014-05-13 03:01:14,336 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
> 2014-05-13 03:01:14,336 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> 2014-05-13 03:01:14,620 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:14,768 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:14,968 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,243 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,276 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,326 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,386 INFO indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
> 2014-05-13 03:01:15,403 INFO solr.SolrMappingReader - source: content dest: content
> 2014-05-13 03:01:15,403 INFO solr.SolrMappingReader - source: title dest: title
> 2014-05-13 03:01:15,403 INFO solr.SolrMappingReader - source: host dest: host
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: batchId dest: batchId
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: boost dest: boost
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: digest dest: digest
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding org.apache.nutch.analysis.lang.LanguageIndexingFilter
> 2014-05-13 03:01:15,405 INFO basic.BasicIndexingFilter - Maximum title length for indexing set to: 100
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.more.MoreIndexingFilter
> 2014-05-13 03:01:15,405 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> 2014-05-13 03:01:15,426 WARN zookeeper.ClientCnxnSocket - Connected to an old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,442 WARN mapred.FileOutputCommitter - Output path is null in cleanup
> 2014-05-13 03:01:16,144 INFO indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
> 2014-05-13 03:01:16,144 INFO indexer.IndexingJob - Active IndexWriters :
> SOLRIndexWriter
>     solr.server.url : URL of the SOLR instance (mandatory)
>     solr.commit.size : buffer size when sending to SOLR (default 1000)
>     solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
>     solr.auth : use authentication (default false)
>     solr.auth.username : use authentication (default false)
>     solr.auth : username for authentication
>     solr.auth.password : password for authentication
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: content dest: content
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: title dest: title
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: host dest: host
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: batchId dest: batchId
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: boost dest: boost
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: digest dest: digest
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
> 2014-05-13 03:01:16,338 INFO solr.SolrIndexWriter - Total 0 document is added.
> 2014-05-13 03:01:16,338 INFO indexer.IndexingJob - IndexingJob: done.

--
This message was sent by Atlassian JIRA (v6.2#6252)