In 0.8 you submit a _directory_ containing urls.txt, not the file itself. So remove the /urls.txt part from your command line and it should work.
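A rough sketch of what that looks like, assuming your seed directory is called urls and sits next to bin/ (the directory name and the example.com seed URL are placeholders I made up; the -dir and -depth values are the ones from your command):

```shell
# Assumed layout: a seed directory "urls" that holds urls.txt (names are placeholders).
seed_dir="urls"
mkdir -p "$seed_dir"

# Placeholder seed URL, one per line; replace with your own.
echo "http://example.com/" > "$seed_dir/urls.txt"

# Hand the directory, not the file inside it, to the crawl command.
if [ -d "$seed_dir" ]; then
  crawl_cmd="bin/nutch crawl $seed_dir -dir newcrawled -depth 2"
  echo "$crawl_cmd"
fi
```

The snippet only prints the command; run the printed line from your Nutch install directory to start the crawl.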
-- Sami Siren

BDalton wrote:
> I get this error,
>
> bin/nutch crawl url.txt -dir newcrawled -depth 2 >& crawl.log
>
> Exception in thread "main" java.io.IOException: Input directory
> d:/nutch3/urls/urls.txt in local is invalid.
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
>     at org.apache.nutch.crawl.Injector.inject(Injector.java:138)
>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:105)
