Dear Wiki user, You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The "bin/nutch solrclean" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/bin/nutch%20solrclean

Comment: Update to reflect Nutch 1.3 API

New page:

Solrclean is an alias for org.apache.nutch.indexer.solr.SolrClean

The class scans a crawldb directory for entries with status DB_GONE (404) and sends delete requests to Solr for those documents. Once Solr receives the requests, it deletes the documents, so the index no longer holds pages that have disappeared.

Usage:
{{{
bin/nutch solrclean <crawldb> <solrurl>
}}}
'''<crawldb>''': The path to a crawldb directory. It is scanned for 404 URLs so that the Solr index can be updated accordingly.

'''<solrurl>''': The Solr instance to update and remove 404 pages from, e.g. ''http://localhost:8983/solr/''

CommandLineOptions
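
As a rough illustration of the usage above, the following sketch wraps the command in a small shell function. The function name `solrclean`, the `DRY_RUN` variable and the `crawl/crawldb` path are assumptions made for this example and are not part of Nutch; only the `bin/nutch solrclean <crawldb> <solrurl>` form and the example Solr URL come from the page itself.

```shell
#!/bin/sh
# Hypothetical wrapper around "bin/nutch solrclean".
# The function name, DRY_RUN and the example crawldb path are illustrative only.

solrclean() {
    crawldb="$1"
    solrurl="$2"

    # Both arguments are required, mirroring the documented usage.
    if [ -z "$crawldb" ] || [ -z "$solrurl" ]; then
        echo "Usage: solrclean <crawldb> <solrurl>" >&2
        return 1
    fi

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only print the command that would be run.
        echo "bin/nutch solrclean $crawldb $solrurl"
    else
        # Run from the Nutch installation directory.
        bin/nutch solrclean "$crawldb" "$solrurl"
    fi
}

# Dry run against an assumed crawldb location and the example Solr URL.
solrclean crawl/crawldb http://localhost:8983/solr/
```

By default the function only prints the command. Setting `DRY_RUN=0` makes it invoke `bin/nutch` for real, which is worth doing only after checking that the crawldb path and Solr URL are the intended ones, since the deletions are applied to the live index.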