[ https://issues.apache.org/jira/browse/NUTCH-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496426#comment-14496426 ]
Michael Joyce commented on NUTCH-1987:
--------------------------------------

Hi folks,

I'll have a patch up in a bit for this. To minimize the number of changes that I'm shoving into a single patch, my current plan is to:
* Add solr.server.url to nutch-default and set the value to some sane default (http://127.0.0.1:8983/solr/)
* Make the 'index' calls in the bin/nutch script generic and slightly change the call format.
* Update some variable names and echos in the bin/crawl script so it doesn't only mention Solr and confuse people

I envision a call being something similar to this after these changes:
{code}
# Run the indexer
bin/crawl urls/ crawl/ "run_indexer" 1

# Don't run the indexer
bin/crawl urls/ crawl/ 1
{code}

I don't think this is necessarily the ideal solution, but it minimizes changes to the calling format for people with existing setups and only really requires that a single configuration value is added/updated.

Note, this change obviously requires some/many documentation updates. I'm more than happy to help with those as well, but I'm not including them in this ticket.

Thoughts?

> Make bin/crawl indexer agnostic
> -------------------------------
>
>                 Key: NUTCH-1987
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1987
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.9
>            Reporter: Michael Joyce
>             Fix For: 1.10
>
>
> The crawl script makes it a bit challenging to use an indexer that isn't
> Solr. For instance, when I want to use the indexer-elastic plugin I still
> need to call the crawler script with a fake Solr URL, otherwise it will skip
> the indexing step altogether.
> {code}
> bin/crawl urls/ crawl/ "http://fakeurl.com:9200" 1
> {code}
> It would be nice to keep configuration for the Solr indexer in the conf files
> (to mirror the Elasticsearch indexer conf and others) and to make the
> indexing parameter simply toggle whether indexing does or doesn't occur
> instead of also trying to configure the indexer at the same time.
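
As an illustration of the call format proposed in the comment above, here is a minimal sketch of how a script could treat the optional "run_indexer" argument as a plain on/off toggle. It is not the actual bin/crawl code or the eventual patch: the function name parse_crawl_args and the variables SEED_DIR, CRAWL_DIR, RUN_INDEXER and NUM_ROUNDS are hypothetical names chosen for this example, and rejecting any other third argument is an assumption about how stale Solr-URL invocations might be handled.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; names and behavior are assumptions, not Nutch code.
# Usage: parse_crawl_args <seed_dir> <crawl_dir> ["run_indexer"] <num_rounds>
# Sets SEED_DIR, CRAWL_DIR, RUN_INDEXER (true/false) and NUM_ROUNDS.
parse_crawl_args() {
  RUN_INDEXER=false
  case $# in
    3)
      # No toggle given: crawl without indexing.
      SEED_DIR=$1; CRAWL_DIR=$2; NUM_ROUNDS=$3
      ;;
    4)
      # The third argument only switches indexing on; the indexer itself
      # would be configured in the conf files (e.g. solr.server.url).
      if [ "$3" != "run_indexer" ]; then
        echo "Unknown third argument: $3 (expected \"run_indexer\")" >&2
        return 1
      fi
      SEED_DIR=$1; CRAWL_DIR=$2; RUN_INDEXER=true; NUM_ROUNDS=$4
      ;;
    *)
      echo "Usage: crawl <seed_dir> <crawl_dir> [run_indexer] <num_rounds>" >&2
      return 1
      ;;
  esac
}

parse_crawl_args urls/ crawl/ "run_indexer" 1
echo "indexing enabled: $RUN_INDEXER, rounds: $NUM_ROUNDS"
```

With a toggle like this, the script would only decide whether the indexing step runs; which indexer plugin is used and where it sends documents would come from configuration rather than from the command line.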
-- This message was sent by Atlassian JIRA (v6.3.4#6332)