Hi! I'm having problems with crawling. Mainly, I cannot even start a crawl. I downloaded the latest Nutch source, and after 3 hours of struggling with the config files, I gave up. I have some questions:

1) What is Hadoop and how can I use it? I searched for information about Hadoop and found that it is no longer integrated in Nutch. It's a separate project. But in the lib folder I found a corresponding hadoop-0.1-dev.jar file. What does it do?

2) How can I crawl? :) When I run the crawl command I get the following exception:

No input directories specified in: Configuration: defaults: hadoop-default.xml , mapred-default.xml , /tmp/hadoop/mapred/local/localRunner/job_vpit8j.xmlfinal: hadoop-site.xml
    at org.apache.hadoop.mapred.InputFormatBase.listFiles(InputFormatBase.java:84)
    at org.apache.hadoop.mapred.InputFormatBase.getSplits(InputFormatBase.java:94)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:70)
060222 131857 map 0% reduce 0%
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:310)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:114)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)

I put the following in hadoop-site.xml:

<property>
  <name>fs.default.name</name>
  <value>localhost:9000</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

But I don't know what it means (I just copied it from the Hadoop website). So, how can I crawl using nutch-0.8?

3) Where is ./nutch ndfs? When I execute this command I get:

Exception in thread "main" java.lang.NoClassDefFoundError: ndfs

I had no problems with version 0.7. I decided to move to 0.8 because of the parse-swf plugin, since I couldn't compile it. Could you please describe how to use the new Nutch? Or what do I need in order to compile the parse-swf plugin?