I think Nutch is using the "crawl" dir as the urlDir:

Injector: urlDir: crawl
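
That would also explain the "Not a file" exception, as far as I can tell: the old Hadoop FileInputFormat.getSplits rejects any directory it finds among the job's input entries, and your crawl folder contains a crawldb subdirectory. Below is a rough plain-Java sketch (no Hadoop involved; the class name UrlDirCheck and its method are made up for illustration) that lists the subdirectories of a would-be url dir, so you can check that it holds only plain files:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class UrlDirCheck {

    /** Returns the entries directly under urlDir that are directories, not regular files. */
    public static List<Path> subdirectories(Path urlDir) throws IOException {
        List<Path> dirs = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(urlDir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    dirs.add(entry);
                }
            }
        }
        return dirs;
    }

    public static void main(String[] args) throws IOException {
        // "rooturl" is only a fallback for this sketch; pass the real url dir as the first argument.
        Path urlDir = Paths.get(args.length > 0 ? args[0] : "rooturl");
        for (Path dir : subdirectories(urlDir)) {
            System.out.println("directory found in url dir: " + dir);
        }
    }
}
```

If this prints anything for the directory you pass as the url dir, the injector's input contains a directory and I would expect it to fail in the same way.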



Try putting the url dir first in the program arguments: rooturl -dir crawl -threads 5 -depth 3 -topN 3
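
To illustrate why the argument order can matter, here is a simplified sketch of a flag-then-value argument loop in which anything that is not a recognized flag is taken as the URL directory. It is my own illustration, not the Nutch source; the class CrawlArgsSketch, the Parsed holder, and the default values are assumptions:

```java
public class CrawlArgsSketch {

    /** Holds the parsed values; field names and defaults are illustrative only. */
    public static final class Parsed {
        public String rootUrlDir;          // positional argument: directory with the seed URL files
        public String dir;                 // value of -dir: where the crawl output goes
        public int threads = 10;           // placeholder default
        public int depth = 5;              // placeholder default
        public long topN = Long.MAX_VALUE; // placeholder default
    }

    public static Parsed parse(String[] args) {
        Parsed p = new Parsed();
        for (int i = 0; i < args.length; i++) {
            String a = args[i];
            boolean hasValue = i + 1 < args.length;
            if ("-dir".equals(a) && hasValue) {
                p.dir = args[++i];
            } else if ("-threads".equals(a) && hasValue) {
                p.threads = Integer.parseInt(args[++i]);
            } else if ("-depth".equals(a) && hasValue) {
                p.depth = Integer.parseInt(args[++i]);
            } else if ("-topN".equals(a) && hasValue) {
                p.topN = Long.parseLong(args[++i]);
            } else if (a != null) {
                // Not a recognized flag with a value: treat it as the URL directory.
                // If several such arguments appear, the last one wins.
                p.rootUrlDir = a;
            }
        }
        return p;
    }

    public static void main(String[] args) {
        Parsed p = parse(args);
        System.out.println("rootUrlDir = " + p.rootUrlDir);
        System.out.println("dir = " + p.dir);
        System.out.println("threads = " + p.threads);
        System.out.println("depth = " + p.depth);
        System.out.println("topN = " + p.topN);
    }
}
```

Run with the arguments suggested above, this sketch reports rooturl as the URL dir and crawl as the output dir. Printing the parsed values like this in your Eclipse run configuration is a quick way to see what the program really receives.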



On Thu, Apr 23, 2009 at 11:48 AM, askNutch <[email protected]> wrote:

>
> Thank you, but I am running as root!
>
> Raymond Balmès wrote:
> >
> > Not sure if it helps, but I needed admin rights to run the Nutch steps. I'm
> > using Vista. Do you have them when you run under Eclipse?
> > As recommended in different threads, I run Nutch out of Cygwin.
> >
> > -Ray-
> >
> > 2009/4/23 askNutch <[email protected]>
> >
> >>
> >> Hi all,
> >>  When I run Nutch in Eclipse following "RunNutchInEclipse1.0", I run into
> >> some problems.
> >>  The main args are:  -dir crawl -threads 5 -depth 3 -topN 3 rooturl
> >>  But when I print args.length, it is 74, and args[i] repeats the
> >> parameters: " crawl  5 3  3 rooturl "
> >>  I created a folder named rooturl containing a text file that lists
> >> some URLs.
> >>  But I hit a problem. The output is:
> >>
> >> crawl started in: crawl-20090423113556
> >> rootUrlDir = crawl
> >> threads = 5
> >> depth = 3
> >> topN = 3
> >> crawl
> >> Injector: starting
> >> Injector: crawlDb: crawl-20090423113556/crawldb
> >> Injector: urlDir: crawl
> >> Injector: Converting injected urls to crawl db entries.
> >> Injector: crawlDb: crawl-20090423113556/crawldb
> >> Exception in thread "main" java.io.IOException: Not a file: file:/nutch-1.0/crawl/crawldb
> >>        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:195)
> >>        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:797)
> >>        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1142)
> >>        at org.apache.nutch.crawl.Injector.inject(Injector.java:164)
> >>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:122)
> >>
> >> "Not a file: file:/nutch-1.0/crawl/crawldb"??? I checked, and the folder
> >> does exist!
> >>
> >> I run Eclipse on CentOS 5.2 inside a VMware machine, which runs on Windows.
> >> Does anyone know how to solve this problem?
> >> Thanks, experts!
> >> --
> >> View this message in context:
> >> http://www.nabble.com/run-nutch-on-eclipse-problem--tp23190944p23190944.html
> >> Sent from the Nutch - User mailing list archive at Nabble.com.
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/run-nutch-on-eclipse-problem--tp23190944p23193715.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
>
