Re: crawling in any depth until no new pages were found

2011-07-24 Thread Markus Jelsma
You don't need to check manually if you use the generator's return code. It returns a non-zero value if no fetch list is generated, which usually happens when there is nothing left to crawl at the moment. Hi all, does anyone have suggestions for how I could solve the following task: I want to crawl a
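
A minimal sketch of how the return-code check described above might be used in a bash loop. The paths in `NUTCH`, `CRAWLDB` and `SEGMENTS` and the function name `crawl_until_done` are assumptions for illustration, not part of the original message. The separate `parse` step is also an assumption, and can be dropped if your fetcher is configured to parse while fetching.

```shell
#!/usr/bin/env bash
# Sketch only: repeat generate/fetch/parse/updatedb until the generator
# reports that it produced no fetch list.
# NUTCH, CRAWLDB and SEGMENTS are assumed locations; adjust to your setup.
NUTCH="${NUTCH:-bin/nutch}"
CRAWLDB="${CRAWLDB:-crawl/crawldb}"
SEGMENTS="${SEGMENTS:-crawl/segments}"

crawl_until_done() {
  local round=0 segment
  # The generator exits non-zero when no fetch list is generated,
  # which ends the loop without any manual check.
  while "$NUTCH" generate "$CRAWLDB" "$SEGMENTS"; do
    round=$((round + 1))
    # Segment directories are named by timestamp, so the newest sorts last.
    segment=$(ls -d "$SEGMENTS"/* | sort | tail -n 1)
    "$NUTCH" fetch "$segment" || return 1
    "$NUTCH" parse "$segment" || return 1   # assumed step, see note above
    "$NUTCH" updatedb "$CRAWLDB" "$segment" || return 1
  done
  echo "finished after $round round(s)"
}
```

Calling `crawl_until_done` after injecting seed URLs into the crawl database would then keep crawling round by round, with each round fetching the pages discovered in the previous one, until nothing is left to fetch at that moment.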

Re: crawling in any depth until no new pages were found

2011-07-20 Thread lewis john mcgibbney
Hi Marek, As we're talking about automating the task, we're immediately looking at implementing a bash script. In the situation we have described, we wish Nutch to adopt breadth-first search (BFS) behaviour when crawling. Between us, can we suggest any best-practice methods relating to BFS? As