Here is my output:
[Gavin@Gavin local]$ bin/nutch inject urls
InjectorJob: starting at 2014-02-12 17:16:20
InjectorJob: Injecting urlDir: urls
InjectorJob: Using class org.apache.gora.hbase.store.HBaseStore as the Gora storage class.
InjectorJob: total number of urls rejected by filters: 0
InjectorJob: total number of urls injected after normalization and filtering: 1
Injector: finished at 2014-02-12 17:16:25, elapsed: 00:00:04

[Gavin@Gavin local]$ bin/nutch generate -topN 5
GeneratorJob: starting at 2014-02-12 17:16:46
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: true
GeneratorJob: normalizing: true
GeneratorJob: topN: 5
GeneratorJob: finished at 2014-02-12 17:16:51, time elapsed: 00:00:05
GeneratorJob: generated batch id: 1392196606-229189632

[Gavin@Gavin local]$ bin/nutch fetch -all
FetcherJob: starting
FetcherJob: fetching all
FetcherJob: threads: 10
FetcherJob: parsing: false
FetcherJob: resuming: false
FetcherJob : timelimit set for : -1
Using queue mode : byHost
Fetcher: threads: 10
QueueFeeder finished: total 5 records.
Hit by time limit :0
Fetcher: throughput threshold: -1
Fetcher: throughput threshold sequence: 5
fetching http://www.163.com/ (queue crawl delay=5000ms)
fetching http://nutch.apache.org/ (queue crawl delay=5000ms)
fetching http://www.tianya.cn/ (queue crawl delay=5000ms)
fetching http://www.taobao.com/ (queue crawl delay=5000ms)
-finishing thread FetcherThread5, activeThreads=8
-finishing thread FetcherThread6, activeThreads=8
-finishing thread FetcherThread4, activeThreads=7
-finishing thread FetcherThread3, activeThreads=6
-finishing thread FetcherThread2, activeThreads=5
fetching http://www.hao123.com/ (queue crawl delay=5000ms)
-finishing thread FetcherThread0, activeThreads=4
-finishing thread FetcherThread7, activeThreads=3
-finishing thread FetcherThread1, activeThreads=2
-finishing thread FetcherThread8, activeThreads=1
-finishing thread FetcherThread9, activeThreads=0
0/0 spinwaiting/active, 4 pages, 0 errors, 0.8 1 pages/s, 242 242 kb/s, 0 URLs in 0 queues
-activeThreads=0
FetcherJob: done

[Gavin@Gavin local]$ bin/nutch parse -all
ParserJob: starting
ParserJob: resuming: false
ParserJob: forced reparse: false
ParserJob: parsing all
Parsing http://www.tianya.cn/
Parsing http://www.163.com/
Parsing http://www.hao123.com/
Parsing http://www.taobao.com/
Parsing http://nutch.apache.org/
ParserJob: success

[Gavin@Gavin local]$ bin/nutch solrindex http://127.0.0.1:8983/solr -all
SolrIndexerJob: starting
SolrIndexerJob: done.

Thank you!

------------------ Original ------------------
From: "d_k";<[email protected]>;
Date: Wed, Feb 12, 2014 04:58 PM
To: "user"<[email protected]>;
Subject: Re: Nutch 2.2.1 can not index to solr

What is the output of each of the steps when you execute them separately?
Did you edit regex-urlfilter.txt accordingly?

$ bin/nutch inject urls
$ bin/nutch generate -topN 5
$ bin/nutch fetch -all
$ bin/nutch parse -all

Taken from here:
https://github.com/renepickhardt/metalcon/wiki/simpleNutchSolrSetup

On Wed, Feb 12, 2014 at 10:33 AM, Gavin <[email protected]> wrote:

> I compiled nutch in eclipse. My storage is hbase.
> After I run bin/crawl, there are two tables in hbase: "webpage" and
> "%crawl_ID%webpage"
> but there is no data in solr and no exception.
> Why?
>
> (I can crawl and index to the solr server using nutch1.7.bin, so I think my solr
> server is ok)

