Hi all,
When I run Nutch in Eclipse following the "RunNutchInEclipse1.0" guide, I run
into some problems.
The program arguments are: -dir crawl -threads 5 -depth 3 -topN 3 rooturl
But when I print args.length, it is 74! The args[i] values repeat the
parameters: " crawl 5 3 3 rooturl "
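
To see exactly what the JVM receives, the arguments can be dumped one per line with their indexes. This is only a minimal sketch: ArgDump and describe are hypothetical names, not part of Nutch, and the class would be run with the same Eclipse launch configuration as the crawl.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical debugging helper (not part of Nutch): lists each argument
// with its index so that repeated or missing values are easy to spot.
public class ArgDump {

    // Builds one line per argument; brackets make stray spaces visible.
    static List<String> describe(String[] args) {
        List<String> lines = new ArrayList<String>();
        for (int i = 0; i < args.length; i++) {
            lines.add("args[" + i + "] = [" + args[i] + "]");
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("args.length = " + args.length);
        for (String line : describe(args)) {
            System.out.println(line);
        }
    }
}
```

If the dump shows the values without their -dir, -threads, -depth and -topN flags, the problem is in how the launch configuration passes the arguments rather than in the crawl itself.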
I created a folder named rooturl; it contains a text file that lists
some URLs.
But I ran into a problem. The output is:
crawl started in: crawl-20090423113556
rootUrlDir = crawl
threads = 5
depth = 3
topN = 3
crawl
Injector: starting
Injector: crawlDb: crawl-20090423113556/crawldb
Injector: urlDir: crawl
Injector: Converting injected urls to crawl db entries.
Injector: crawlDb: crawl-20090423113556/crawldb
Exception in thread "main" java.io.IOException: Not a file: file:/nutch-1.0/crawl/crawldb
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:195)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:797)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
at org.apache.nutch.crawl.Injector.inject(Injector.java:164)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:122)
"Not a file: file:/nutch-1.0/crawl/crawldb"??? I checked the folder, and it
does exist!
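
The log above prints "Injector: urlDir: crawl", so the path being read appears to be crawl rather than rooturl. Whether each entry under it is a plain file or a directory can be checked with a sketch like the one below. PathCheck and kind are hypothetical names, and reading "Not a file" as "a directory was found where a plain file was expected" is an assumption based on the wording of the exception.

```java
import java.io.File;

// Hypothetical helper (not part of Nutch): reports whether a path is a
// plain file, a directory, or missing, and does the same for its children.
public class PathCheck {

    static String kind(File f) {
        if (!f.exists()) return "missing";
        if (f.isDirectory()) return "directory";
        if (f.isFile()) return "file";
        return "other";
    }

    public static void main(String[] args) {
        // Defaults to "crawl", the urlDir value shown in the log above.
        File dir = new File(args.length > 0 ? args[0] : "crawl");
        System.out.println(dir.getPath() + " -> " + kind(dir));

        File[] children = dir.listFiles(); // null when dir is not a directory
        if (children != null) {
            for (File child : children) {
                System.out.println("  " + child.getName() + " -> " + kind(child));
            }
        }
    }
}
```

A folder that exists still reports "directory" here, not "file", which would be consistent with the folder being present and the exception being raised anyway.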
I am running Eclipse on CentOS 5.2 inside a VMware virtual machine; the
VMware host is running Windows.
Does anyone know how to solve this problem?
Thanks, experts!