Hi all. When I run a Nutch 1.0 crawl on Hadoop 0.19.1, configured through nutch-site.xml, I hit the following problem:
2009-06-05 06:46:31,012 WARN crawl.Generator - Generator: 0 records selected for fetching, exiting ...
2009-06-05 06:46:31,028 INFO crawl.Crawl - Stopping at depth=0 - no more URLs to fetch.
2009-06-05 06:46:31,028 WARN crawl.Crawl - No URLs to fetch - check your seed list and URL filters.

I ran "bin/hadoop -put urls urls" to copy the seed directory into DFS, and "bin/hadoop -get urls ." to confirm that seed.txt exists in the urls directory and is not blank. I also edited crawl-urlfilter.txt to replace "my.domain.com" with my own domain (a sketch of this setup follows below my signature).

When I set nutch-site.xml so that the crawl runs locally instead of on Hadoop, it works; but when running on Hadoop, it ends with "no URLs to fetch". While searching for the cause, I found that Nutch 0.9 used to have a bug that could cause this problem, but when I checked the patch file, I found that Nutch 1.0 already includes that patch. I am very confused and looking forward to your help. Thank you very much.

--
Yours sincerely,
Xudong Du
Zijing 2# 305A, Tsinghua University, Beijing, China, 100084
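
P.S. In case it helps, here is a minimal sketch of the seed list and filter change I describe above, assuming the standard Nutch 1.0 layout; the domain and exact commands below are placeholders standing in for my real setup, not copied verbatim from it:

    # local seed directory: urls/seed.txt, one URL per line
    echo "http://my.domain.com/" > urls/seed.txt
    bin/hadoop dfs -put urls urls        # copy the seed list into DFS
    bin/hadoop dfs -cat urls/seed.txt    # confirm it is present and not empty

    # the accept rule I changed in conf/crawl-urlfilter.txt
    +^http://([a-z0-9]*\.)*my.domain.com/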
