Re: run nutch on eclipse problem?

Raymond Balmès Thu, 23 Apr 2009 01:18:37 -0700

Not sure if it helps, but I needed admin rights to run the Nutch steps. I'm
using Vista. Do you have admin rights when you run under Eclipse?
As recommended in other threads, I run Nutch from Cygwin.


-Ray-

2009/4/23 askNutch <[email protected]>

>
> Hi all,
>  When I run Nutch in Eclipse following "RunNutchInEclipse1.0", I run into
> some problems.
>  The main args are:  -dir crawl -threads 5 -depth 3 -topN 3 rooturl
>  But when I print args.length, it is 74! And the args[i] values repeat the
> parameters: " crawl  5 3  3 rooturl "
>  I created a folder named rooturl with a text file in it that lists
> some URLs,
>  but I still hit the problem. The output is:
>
> crawl started in: crawl-20090423113556
> rootUrlDir = crawl
> threads = 5
> depth = 3
> topN = 3
> crawl
> Injector: starting
> Injector: crawlDb: crawl-20090423113556/crawldb
> Injector: urlDir: crawl
> Injector: Converting injected urls to crawl db entries.
> Injector: crawlDb: crawl-20090423113556/crawldb
> Exception in thread "main" java.io.IOException: Not a file: file:/nutch-1.0/crawl/crawldb
>        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:195)
>        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:797)
>        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
>        at org.apache.nutch.crawl.Injector.inject(Injector.java:164)
>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:122)
>
> "Not a file: file:/nutch-1.0/crawl/crawldb"??? I checked the folder, and it
> does exist!
>
> I run Eclipse on CentOS 5.2 inside a VMware machine. The VMware
> machine runs on Windows.
> Does anyone know how to solve the problem?
> Thanks, experts!
> --
> View this message in context:
> http://www.nabble.com/run-nutch-on-eclipse-problem--tp23190944p23190944.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
>
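
To narrow down the argument problem described in the quoted message (args.length
of 74, and rootUrlDir printed as "crawl" instead of "rooturl"), it may help to
dump exactly what the JVM receives before Nutch sees it. The following is only a
minimal sketch, not part of Nutch: the class name CrawlArgsCheck and both helper
methods are made up for illustration, and it assumes the same program arguments
are pasted into an Eclipse run configuration for this class.

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical diagnostic helper, not part of Nutch.
public class CrawlArgsCheck {

    // Summarize what the JVM actually passed to main().
    public static String describeArgs(String[] args) {
        return args.length + " argument(s): " + Arrays.toString(args);
    }

    // True if dir exists, is a directory, and holds at least one regular file.
    public static boolean looksLikeSeedDir(File dir) {
        File[] children = dir.listFiles(); // null if not a directory or unreadable
        if (children == null) {
            return false;
        }
        for (File child : children) {
            if (child.isFile()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(describeArgs(args));
        for (String arg : args) {
            if (arg.startsWith("-")) {
                continue; // skip option flags such as -dir or -depth
            }
            File f = new File(arg);
            System.out.println(arg + " -> exists=" + f.exists()
                    + ", directory=" + f.isDirectory()
                    + ", seedDirCandidate=" + looksLikeSeedDir(f));
        }
    }
}
```

If the printed list is much longer than what was typed into the run
configuration, the duplication happens in the launch settings rather than in
Nutch. Paths are resolved against the working directory of the run
configuration, so a relative name like rooturl may point somewhere unexpected.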
