Hi, I'm trying to get the Nutch/Hadoop example from http://wiki.apache.org/nutch/NutchHadoopTutorial running.
I've set up the urllist.txt and the crawl-urlfilter.xml as instructed in the tutorial, but whenever I run the crawl it reports either

Generator: 0 records selected for fetching, exiting ...
Stopping at depth=1 - no more URLs to fetch.

or

Generator: 0 records selected for fetching, exiting ...
Stopping at depth=0 - no more URLs to fetch.

I can't tell whether the crawler has managed to fetch any data. How can I extract whatever data it has downloaded?

Thanks,
Barry