On 2/21/07, Oleg V. Konovalov <[EMAIL PROTECTED]> wrote:
> Thanx, but... As I wrote earlier, - I've tried MANY WAYS, including
> recommended.
>
> For example:
>
> bin/nutch generate /nutch/filesystem/crawl/crawldb /nutch/filesystem/crawl/segments
> Generator: starting
> Generator: segment: /nutch/filesystem/crawl/segments/20070221175753
> Generator: Selecting best-scoring urls due for fetch.
> Exception in thread "main" java.io.IOException: Input directory /nutch/filesystem/crawl/crawldb/current in localhost:9000 is invalid.
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
>         at org.apache.nutch.crawl.Generator.generate(Generator.java:319)
>         at org.apache.nutch.crawl.Generator.main(Generator.java:395)
>
> /nutch/filesystem/crawl/crawldb/current EXISTS!

Very strange. I am not sure what the problem is, then. Can you include
the output of these commands:

hadoop dfs -ls /nutch/filesystem/crawl/
hadoop dfs -ls /nutch/filesystem/crawl/crawldb

>
> Any other ideas?
>
> --
> Oleg.
>

--
Doğacan Güney
