artodeto created NUTCH-2507:
-------------------------------

             Summary: NutchTutorial wiki page has a lot of outdated command line calls from the point where the Solr interaction starts
                 Key: NUTCH-2507
                 URL: https://issues.apache.org/jira/browse/NUTCH-2507
             Project: Nutch
          Issue Type: Bug
          Components: documentation
    Affects Versions: 1.14
            Reporter: artodeto

h2. Section "Step-by-Step: Indexing into Apache Solr"

replace:
{code:java}
Example: bin/nutch index http://localhost:8983/solr crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/20131108063838/ -filter -normalize -deleteGone
{code}
with:
{code:java}
Example: bin/nutch index -Dsolr.server.url=http://localhost:8983/solr/nutch ${NUTCH_RUNTIME_HOME}/crawl/crawldb/ -linkdb ${NUTCH_RUNTIME_HOME}/crawl/linkdb/ ${NUTCH_RUNTIME_HOME}/crawl/segments/20131108063838/ -filter -normalize -deleteGo
{code}

h2. Section "Step-by-Step: Deleting Duplicates"

replace:
{code:java}
Usage: bin/nutch dedup <solr url>
Example: /bin/nutch dedup http://localhost:8983/solr
{code}
with:
{code:java}
Usage: bin/nutch dedup <path to the crawldb> <solr url>
Example: /bin/nutch dedup ${NUTCH_RUNTIME_HOME}/crawl/crawldb/ http://localhost:8983/sol
{code}

h2. Section "Step-by-Step: Cleaning Solr"

replace:
{code:java}
Usage: bin/nutch clean -Dsolr.server.url=<solr index url> <crawldb>
Example: /bin/nutch clean -Dsolr.server.url=http://localhost:8983/solr/nutch crawl/crawldb/
{code}
with:
{code}
Usage: bin/nutch clean -Dsolr.server.url=<solr index url> <crawldb>
Example: /bin/nutch clean -Dsolr.server.url=http://localhost:8983/solr/nutch ${NUTCH_RUNTIME_HOME}/crawl/crawldb/
{code}
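
For anyone who wants to try the three proposed calls together, here is a rough sketch of a wrapper script. It is only an illustration of the command shapes suggested above, not something taken from the wiki. {{NUTCH_RUNTIME_HOME}} is the variable used in this issue; {{SOLR_URL}}, {{SEGMENT}}, {{DRY_RUN}} and the function names are hypothetical and exist only for this example. The {{dedup}} argument order simply follows the usage line proposed above and has not been checked against Nutch 1.14. By default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: wraps the three commands proposed in this issue.
# NUTCH_RUNTIME_HOME comes from the issue text; SOLR_URL, SEGMENT and
# DRY_RUN are hypothetical variables introduced for this example.
: "${NUTCH_RUNTIME_HOME:=.}"
: "${SOLR_URL:=http://localhost:8983/solr/nutch}"
: "${SEGMENT:=20131108063838}"
: "${DRY_RUN:=1}"

# Print the command when DRY_RUN=1 (the default); otherwise execute
# bin/nutch relative to the current directory, as the tutorial does.
run_nutch() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'bin/nutch %s\n' "$*"
  else
    bin/nutch "$@"
  fi
}

# Indexing into Solr, with the Solr URL passed as a -D property.
index_solr() {
  crawl="${NUTCH_RUNTIME_HOME}/crawl"
  run_nutch index -Dsolr.server.url="$SOLR_URL" \
    "$crawl/crawldb/" -linkdb "$crawl/linkdb/" \
    "$crawl/segments/$SEGMENT/" -filter -normalize -deleteGone
}

# Deleting duplicates. Assumption: crawldb first, then the Solr URL,
# as in the usage line proposed in this issue (unverified).
dedup_solr() {
  run_nutch dedup "${NUTCH_RUNTIME_HOME}/crawl/crawldb/" "$SOLR_URL"
}

# Cleaning Solr, using the crawldb under NUTCH_RUNTIME_HOME.
clean_solr() {
  run_nutch clean -Dsolr.server.url="$SOLR_URL" \
    "${NUTCH_RUNTIME_HOME}/crawl/crawldb/"
}

index_solr
dedup_solr
clean_solr
```

Running it with {{DRY_RUN=0}} from the Nutch runtime directory would execute the calls for real; the segment name and Solr core would need to match the local crawl.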