Ian,
Can you please help with this? I have upgraded to Nutch 0.9. I can run
Nutch in standalone mode, i.e., without Hadoop, but with Hadoop I get the
error "Generator: 0 records selected for fetching, exiting ...".
I have already performed this step: bin/hadoop dfs -put urls urls. And when
I run bin/hadoop dfs -ls, I see that urls is present in the DFS:
bin/hadoop dfs -ls
Found 2 items
/user/nutch/crawl <dir>
/user/nutch/urls <dir>
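In case it is relevant, I can also inspect the seed list directly in DFS
with something like the following (urls.txt below is just a placeholder
for the actual seed file name):

# List the files inside the urls directory in DFS
bin/hadoop dfs -ls urls
# Dump a seed file to confirm the start URLs are actually there
# (urls.txt is a placeholder -- substitute the real file name)
bin/hadoop dfs -cat urls/urls.txt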
Output of the crawl:
crawl started in: crawl
rootUrlDir = urls
threads = 10
depth = 3
topN = 50
Injector: starting
Injector: crawlDb: crawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: Merging injected urls into crawl db.
Injector: done
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: crawl/segments/20070419134155
Generator: filtering: false
Generator: topN: 50
Generator: 0 records selected for fetching, exiting ...
Stopping at depth=0 - no more URLs to fetch.
No URLs to fetch - check your seed list and URL filters.
crawl finished: crawl
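For reference, the crawl above was started with the stock crawl command,
along these lines (reconstructed from the parameter values shown in the
output, assuming the standard bin/nutch crawl usage):

# Seed dir urls, output under crawl/, 10 fetcher threads, depth 3, top 50 per round
bin/nutch crawl urls -dir crawl -threads 10 -depth 3 -topN 50

Any idea why the generator selects 0 records even though the urls
directory is in DFS?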