Re: Problems on Crawling

2005-09-17 Thread Daniele Menozzi
On 11:44:00 17/Sep, Piotr Kosiorowski wrote:
> Yes, depth in fact means the number of iterations of the
> generate/fetch/update cycle.
OK, now it's clear :)
> nutch generate will include already fetched pages in a new segment for
> fetching after some time (I think the default is 30 days and you can

Re: Problems on Crawling

2005-09-17 Thread Piotr Kosiorowski
Daniele Menozzi wrote:
> OK, so the depth value is only used to stop the crawling at a certain point and proceed with the indexing, right?
Yes, depth in fact means the number of iterations of the generate/fetch/update cycle.
> But, another thing: how can I refresh old pages? What class do I have to
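A minimal sketch of what "depth" amounts to, under stated assumptions. Only `bin/nutch updatedb db $s1` is quoted in this thread; the `generate db segments` and `fetch <segment>` forms and the `segments/2*` directory naming are assumed from the Nutch tutorial of that period, and `crawl_rounds`, `NUTCH`, and the `<newest-segment>` placeholder are hypothetical names for illustration. By default the script only prints the commands (dry run).

```shell
#!/bin/sh
# Sketch: "depth" = number of generate/fetch/update rounds.
# NUTCH defaults to "echo bin/nutch", so commands are printed, not run.
# Set NUTCH=bin/nutch to execute them (assumption: run from the Nutch root).
NUTCH=${NUTCH:-"echo bin/nutch"}

# crawl_rounds DEPTH: repeat the cycle DEPTH times (hypothetical helper).
crawl_rounds() {
  depth=$1
  i=1
  while [ "$i" -le "$depth" ]; do
    # 1. generate: create a new segment listing the pages to fetch
    $NUTCH generate db segments
    # Pick the newest segment directory (assumed naming: segments/2...);
    # fall back to a placeholder when none exists, e.g. in a dry run.
    s=$(ls -d segments/2* 2>/dev/null | tail -1)
    s=${s:-"<newest-segment>"}
    # 2. fetch: download the pages listed in that segment
    $NUTCH fetch "$s"
    # 3. update: add the links found in that segment to WebDB
    $NUTCH updatedb db "$s"
    i=$((i + 1))
  done
}

crawl_rounds 3
```

With the default dry-run setting, `crawl_rounds 3` prints three generate/fetch/updatedb triples, one per round, which is the sense in which depth bounds the crawl.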

Re: Problems on Crawling

2005-09-16 Thread Daniele Menozzi
On 19:33:57 16/Sep, Piotr Kosiorowski wrote:
> The bin/nutch updatedb db $s1
> command updates WebDB with the links you fetched in segment $s1.
OK, so the depth value is only used to stop the crawling at a certain point and proceed with the indexing, right? But, another thing: how can I refresh old p

Re: Problems on Crawling

2005-09-16 Thread Piotr Kosiorowski
The bin/nutch updatedb db $s1 command updates WebDB with the links you fetched in segment $s1.
Regards
Piotr
Daniele Menozzi wrote:
> Hi all, I have questions regarding org.apache.nutch.tools.CrawlTool: I have not really understood what the relationship is between depth, segments, and fetching. Take for ex

Re: Problems on Crawling

2005-09-16 Thread Michael Ji
Take a look at this good Nutch doc: http://wiki.apache.org/nutch/DissectingTheNutchCrawler
Michael Ji
--- Daniele Menozzi <[EMAIL PROTECTED]> wrote:
> Hi all, I have questions regarding org.apache.nutch.tools.CrawlTool: I have
> not really understood what the relationship is between depth,s

Problems on Crawling

2005-09-16 Thread Daniele Menozzi
Hi all, I have questions regarding org.apache.nutch.tools.CrawlTool: I have not really understood what the relationship is between depth, segments, and fetching. Take for example the tutorial; I understand these 2 steps:
bin/nutch admin db -create
bin/nutch inject db -dmozfile conte