I see now what's causing the error: urls/nutch is a file, but you have to
give only the urls folder as input, not the file as I did ;)
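
For anyone hitting the same error, here is a minimal sketch of the fix. It assumes the seed list sits at urls/nutch as in my original command. The helper name seed_dir is my own invention and not part of Nutch; it just turns a seed file path into the directory that contains it.

```shell
# Hypothetical helper (not part of Nutch): print the path to hand to the
# crawl command. A directory is passed through unchanged; a regular file
# (the seed list itself) is replaced by its parent directory.
seed_dir() {
  if [ -d "$1" ]; then
    printf '%s\n' "$1"
  elif [ -f "$1" ]; then
    dirname "$1"
  else
    echo "seed path not found: $1" >&2
    return 1
  fi
}

# Usage sketch, run from the Nutch install directory:
#   bin/nutch crawl "$(seed_dir urls/nutch)" -dir crawl.test -depth 10
```

With urls/nutch being a file, this passes urls to the crawl command, which is the directory the Injector expects.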

PS: Is there an IRC channel for Nutch, or 'only' the mailing list?

thx
martin

Quoting Briggs <[EMAIL PROTECTED]>:

> is urls/nutch a file or directory?
>
> On 6/6/07, Martin Kammerlander <[EMAIL PROTECTED]> wrote:
> > Hi
> >
> > I wanted to start a crawl like it is done in the nutch 0.8.x tutorial.
> > Unfortunately I get the following error:
> >
> > [EMAIL PROTECTED] nutch-0.8.1]$ bin/nutch crawl urls/nutch -dir crawl.test -depth 10
> > crawl started in: crawl.test
> > rootUrlDir = urls/nutch
> > threads = 10
> > depth = 10
> > Injector: starting
> > Injector: crawlDb: crawl.test/crawldb
> > Injector: urlDir: urls/nutch
> > Injector: Converting injected urls to crawl db entries.
> > Exception in thread "main" java.io.IOException: Input directory /scratch/nutch-0.8.1/urls/nutch in local is invalid.
> >         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
> >         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
> >         at org.apache.nutch.crawl.Injector.inject(Injector.java:138)
> >         at org.apache.nutch.crawl.Crawl.main(Crawl.java:105)
> >
> > Any ideas what is causing that?
> >
> > regards
> > martin
> >
>
>
> --
> "Conscious decisions by conscious minds are what make reality real"
>




_______________________________________________
Nutch-general mailing list
Nutch-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-general
