Ian,
 
Can you please help with this? I have upgraded to Nutch 0.9. I am able to
run Nutch in standalone mode, i.e. without Hadoop, but with Hadoop I get the
error "Generator: 0 records selected for fetching, exiting ...".

I have performed this step: bin/hadoop dfs -put urls urls. Upon running
bin/hadoop dfs -ls, I see that urls is there in the DFS:
 
bin/hadoop dfs -ls
Found 2 items
/user/nutch/crawl        <dir>
/user/nutch/urls         <dir>
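
To double-check the seed list itself, my understanding is that it can be
inspected directly in DFS. A sketch, assuming the seed file inside urls is
named urls.txt (the actual file name may differ):

bin/hadoop dfs -lsr urls            # list the seed directory recursively
bin/hadoop dfs -cat urls/urls.txt   # should print one URL per line,
                                    # e.g. http://lucene.apache.org/nutch/

If the file is empty, or the entries are not full URLs (missing the http://
prefix), the injector adds nothing to the crawl db and the generator then
selects 0 records.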

Output of the crawl:
 
crawl started in: crawl
rootUrlDir = urls
threads = 10
depth = 3
topN = 50
Injector: starting
Injector: crawlDb: crawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: Merging injected urls into crawl db.
Injector: done
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: crawl/segments/20070419134155
Generator: filtering: false
Generator: topN: 50
Generator: 0 records selected for fetching, exiting ...
Stopping at depth=0 - no more URLs to fetch.
No URLs to fetch - check your seed list and URL filters.
crawl finished: crawl
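
From what I understand, this error means the crawldb contained no unfetched
URLs at generate time, so either the inject step added nothing or the URL
filters rejected every seed. Two things I plan to check; this is a sketch
based on the Nutch tutorial, not verified output:

bin/nutch readdb crawl/crawldb -stats   # if inject worked, db_unfetched
                                        # should be greater than 0

Also, the crawl command applies conf/crawl-urlfilter.txt, and the default
file only accepts URLs matching the MY.DOMAIN.NAME placeholder, so every
seed gets filtered at inject time unless that line is edited, for example
(to accept apache.org hosts):

+^http://([a-z0-9]*\.)*apache.org/

(The "Generator: filtering: false" line above only means the filter was
skipped at the generate step; the injector still applies it.)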




