You should check your conf file; I have had a similar error! (A couple of things worth checking are below the quoted mail.)

2009/6/5 Xudong Du <[email protected]>
> Hi, all.
> When I run Nutch 1.0 to crawl on Hadoop 0.19.1 (configured through
> nutch-site.config), I run into the following problem:
>
> 2009-06-05 06:46:31,012 WARN crawl.Generator - Generator: 0 records selected for fetching, exiting ...
> 2009-06-05 06:46:31,028 INFO crawl.Crawl - Stopping at depth=0 - no more URLs to fetch.
> 2009-06-05 06:46:31,028 WARN crawl.Crawl - No URLs to fetch - check your seed list and URL filters.
>
> I ran "bin/hadoop -put urls urls" to put the seed directory on DFS and
> "bin/hadoop -get urls ." to confirm that seed.txt exists in the urls
> directory and is not blank. I also edited crawl-urlfilter.txt to replace
> the "my.domain.com" placeholder. When I set nutch-site.config so the crawl
> runs locally instead of on Hadoop, it works; however, when running on
> Hadoop, it ends with "No URLs to fetch".
>
> While searching for the cause, I found that Nutch 0.9 used to have a bug
> that could produce this problem, but when I checked the patch file I found
> that Nutch 1.0 already includes that patch.
>
> I am very confused and looking forward to your help.
>
> Thank you very much.
>
> --
> Yours sincerely,
> Xudong Du
> Zijing 2# 305A
> Tsinghua University, Beijing, China, 100084
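To expand on "check your conf file": the stock conf/crawl-urlfilter.txt only accepts URLs that match its domain rule, and every seed is silently dropped until that rule matches the hosts in your seed list. A minimal sketch of the relevant lines (assuming the default Nutch 1.0 template; the domain below is just an example):

    # conf/crawl-urlfilter.txt
    # accept anything within the seed domain
    # (replace my.domain.com with the host(s) that appear in urls/seed.txt)
    +^http://([a-z0-9]*\.)*my.domain.com/
    # the catch-all rule at the end of the file rejects everything else
    -.

Also worth remembering: when the crawl runs on Hadoop, the configuration is read from the Nutch job file submitted to the cluster, not from your local conf/ directory, so edits to crawl-urlfilter.txt only take effect after the job file is rebuilt (in the Nutch source tree, something like "ant job", if I remember the target name correctly) and the crawl is restarted from the rebuilt job. Local mode reads conf/ directly, which would explain why it works locally but not on the cluster.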
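You can also double-check the seeds and filters from the command line. A rough sketch, assuming the usual bin/hadoop and bin/nutch wrapper scripts and that your build includes the URLFilterChecker tool (the -allCombined switch runs each URL through all configured filters, at least in the versions I have used):

    # confirm the seed list really is on DFS and non-empty
    bin/hadoop dfs -ls urls
    bin/hadoop dfs -cat urls/seed.txt

    # run the seeds through the configured URL filters;
    # accepted URLs come back with a leading '+', rejected ones with '-'
    bin/nutch org.apache.nutch.net.URLFilterChecker -allCombined < urls/seed.txt

If every seed comes back with a '-', the filter configuration that the job actually sees is rejecting them, and the Generator will report "0 records selected for fetching" exactly as in your log.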
