Re: Regarding Lucene & Nutch

2007-09-09 Thread MOHIT GOYAL
I'm using Nutch to crawl a local directory on my system. I have modified all the conf files, such as default.xml and crawl-urlfilter. I have also modified HttpResponse.java, but it is still skipping all the URLs. Please help.

Re: in nutch0.9 I cant create a CrawlDb

2007-08-31 Thread goyal
> I want to use nutch0.9 for whole-web crawling, but nutch0.9 does not have the admin command to create a crawldb. When I just run nutch, the command-line usage it displays says nothing about how to create a crawldb.
> I can't find any tutorial for nutch0.9 either, so I hope somebody can tell me how to create a crawldb.

Re: protocol not found for url=file

2007-08-24 Thread MOHIT GOYAL
: org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=file
fetching file:///root/Desktop/csiro-split/CSIRO002

MOHIT GOYAL
CSE 200502013