On 2/21/07, Oleg V. Konovalov <[EMAIL PROTECTED]> wrote:
> Thanx, but... As I wrote earlier, - I've tried MANY WAYS, including 
> recommended.
>
> For example:
>
> bin/nutch generate /nutch/filesystem/crawl/crawldb /nutch/filesystem/crawl/segments
> Generator: starting
> Generator: segment: /nutch/filesystem/crawl/segments/20070221175753
> Generator: Selecting best-scoring urls due for fetch.
> Exception in thread "main" java.io.IOException: Input directory /nutch/filesystem/crawl/crawldb/current in localhost:9000 is invalid.
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
>         at org.apache.nutch.crawl.Generator.generate(Generator.java:319)
>         at org.apache.nutch.crawl.Generator.main(Generator.java:395)
>
> /nutch/filesystem/crawl/crawldb/current EXISTS!

Very strange. I am not sure what the problem is, then. Can you include
the output of these commands:

hadoop dfs -ls /nutch/filesystem/crawl/
hadoop dfs -ls /nutch/filesystem/crawl/crawldb
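
One more thing that may be worth checking (a guess, not a diagnosis): the
exception mentions localhost:9000, so the job seems to be looking for the
directory in DFS, not on the local disk. Below is a rough sketch that shows
the same path from both sides. The variable p and the helper local_status
are made-up names for illustration only, and the hadoop call is skipped if
the command is not on the PATH.

```shell
#!/bin/sh
# Path taken from the error message above.
p=/nutch/filesystem/crawl/crawldb/current

# Hypothetical helper: says whether a path exists on the local disk.
local_status() {
    if [ -e "$1" ]; then
        echo "local: $1 exists"
    else
        echo "local: $1 not found"
    fi
}

local_status "$p"

# The same path as seen through DFS (only if hadoop is available).
if command -v hadoop >/dev/null 2>&1; then
    hadoop dfs -ls "$p"
else
    echo "dfs: hadoop command not found on PATH"
fi
```

If the directory shows up locally but not in the DFS listing, that would
at least explain why it "exists" for you but not for the Generator.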

>
> Any other ideas?
>
> --
> Oleg.
>
>


-- 
Doğacan Güney
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
