On Wed, 21 Feb 2007 16:45:39 +0200
"Doğacan Güney" <[EMAIL PROTECTED]> wrote:

> Hi,
> [snip]
> > OK, next, "generate":
> You configured nutch to look for HDFS at localhost:9000. If default fs
> is configured to be HDFS and you give a relative path to any nutch
> command (like crawl/crawldb) then nutch (actually hadoop) will assume
> that you are accessing /user/<username>/<relative_path>. You either
> have to put your crawldb there or configure nutch to use local fs or
> change generate's arguments.
> [snip]
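
As an illustration of the path resolution described in the quoted reply, here is a minimal sketch in Python. It is a simplified model, not Hadoop code: the function name `qualify` and the username `oleg` are hypothetical, and the default filesystem is assumed to be hdfs://localhost:9000 as in the configuration mentioned above. It shows that a relative path lands under /user/<username>/, and that an absolute path without a scheme is still looked up on the default filesystem rather than on the local disk.

```python
import posixpath


def qualify(path, default_fs="hdfs://localhost:9000", user="oleg"):
    """Model how a path argument is resolved against the default filesystem.

    Simplified illustration only; `user` is a hypothetical username.
    """
    # A path that already names a scheme (hdfs://, file://) is used as given.
    if "://" in path:
        return path
    # A relative path is resolved against the user's home directory.
    if not path.startswith("/"):
        path = posixpath.join("/user", user, path)
    # A scheme-less path is looked up on the default filesystem.
    return default_fs.rstrip("/") + posixpath.normpath(path)


print(qualify("crawl/crawldb"))
print(qualify("/nutch/filesystem/crawl/crawldb"))
print(qualify("file:///nutch/filesystem/crawl/crawldb"))
```

Under this model, a directory that exists only on the local disk would not be found when addressed by a bare absolute path, because the lookup happens on localhost:9000. Whether that is the cause of the error reported below is not confirmed in this message.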

Thanks, but as I wrote earlier, I've tried MANY WAYS, including the recommended ones.

For example:

bin/nutch generate /nutch/filesystem/crawl/crawldb /nutch/filesystem/crawl/segments
Generator: starting
Generator: segment: /nutch/filesystem/crawl/segments/20070221175753
Generator: Selecting best-scoring urls due for fetch.
Exception in thread "main" java.io.IOException: Input directory /nutch/filesystem/crawl/crawldb/current in localhost:9000 is invalid.
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
        at org.apache.nutch.crawl.Generator.generate(Generator.java:319)
        at org.apache.nutch.crawl.Generator.main(Generator.java:395)

/nutch/filesystem/crawl/crawldb/current EXISTS!

Any other ideas?

--
Oleg.


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
