You should check your conf file; I have had a similar error!
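
For what it's worth, this is roughly what I double-checked when I hit it
(the file and property names are the standard Nutch 1.0 ones; the agent
name and domain below are only placeholders, use your own):

conf/nutch-site.xml: the crawler has to be given a name, otherwise nothing
gets fetched:

    <configuration>
      <property>
        <!-- any non-empty agent name; "MyTestCrawler" is just an example -->
        <name>http.agent.name</name>
        <value>MyTestCrawler</value>
      </property>
    </configuration>

conf/crawl-urlfilter.txt: the accept rule has to really match your seed
URLs (here the seeds are assumed to live under example.com):

    # accept everything under the seed domain
    +^http://([a-z0-9]*\.)*example.com/
    # reject everything else
    -.

One more thing that bit me in distributed mode: if I remember correctly,
the job Hadoop runs uses the conf files packed into the nutch-1.0.job
file, so edits made under conf/ after that file was built do not take
effect until you rebuild it (the "ant job" target) and redeploy. With a
stale job file the filter still contains the default MY.DOMAIN.NAME rule,
every seed URL gets rejected, and the Generator reports 0 records selected.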

2009/6/5 Xudong Du <[email protected]>

> Hi, all.
> When I run Nutch 1.0 to crawl on Hadoop 0.19.1 with nutch-site.xml
> configured for distributed mode, I hit the following problem:
>
> 2009-06-05 06:46:31,012 WARN  crawl.Generator - Generator: 0 records
> selected for fetching, exiting ...
> 2009-06-05 06:46:31,028 INFO  crawl.Crawl - Stopping at depth=0 - no more
> URLs to fetch.
> 2009-06-05 06:46:31,028 WARN  crawl.Crawl - No URLs to fetch - check your
> seed list and URL filters.
>
> I ran "bin/hadoop dfs -put urls urls" to copy the seed directory into DFS
> and "bin/hadoop dfs -get urls ." to confirm that seed.txt exists in the
> urls directory and is not blank. I also edited crawl-urlfilter.txt to
> replace "my.domain.com" with my own domain.
> When I configure nutch-site.xml to run locally instead of on Hadoop, it
> works; however, when running on Hadoop, it ends with "No URLs to fetch".
>
> While searching for the cause, I found that Nutch 0.9 used to have a bug
> that could cause this problem, but when I checked the patch file, I found
> that Nutch 1.0 already includes that patch.
>
> I am very confused and am looking forward to your help.
>
> Thank you very much.
>
>
> --
> Yours Sincerely
> Xudong Du
> Zijing 2# 305A
> Tsinghua University, Beijing, China, 100084
>