Hello,

For example, I have a single seed URL, say "http://nutch.apache.org/", and
I crawl it "n" times. At the end of the crawl, I have 1220 new URLs
generated/fetched/updated from that single seed URL. Looking at these 1220
new URLs, I would like to know how a particular one, e.g. "www.abc/xy.com",
was reached. A better way to put the question: in what sequence of links
did the crawler work its way from the seed to "www.abc/xy.com"?
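
To make the question concrete, here is a minimal sketch of the kind of answer I am after. It does not use any Nutch API. It assumes the page-to-page links have already been exported somehow into a plain dict (`outlinks`, a hypothetical name), and the intermediate URLs are made up for illustration. It returns one shortest chain of links from the seed to the target, which is not necessarily the order in which the crawler actually fetched them.

```python
from collections import deque

def crawl_path(outlinks, seed, target):
    """Return one shortest chain of URLs from seed to target, or None."""
    parent = {seed: None}          # url -> url it was first discovered from
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url == target:
            # Walk the parent pointers back to the seed, then reverse.
            path = []
            while url is not None:
                path.append(url)
                url = parent[url]
            return path[::-1]
        for nxt in outlinks.get(url, []):
            if nxt not in parent:  # skip URLs that were already discovered
                parent[nxt] = url
                queue.append(nxt)
    return None

# Hypothetical link data; only the seed and target come from my example.
links = {
    "http://nutch.apache.org/": ["http://example.org/a", "http://example.org/b"],
    "http://example.org/a": ["www.abc/xy.com"],
}
print(crawl_path(links, "http://nutch.apache.org/", "www.abc/xy.com"))
```

What I am missing is how to get this kind of chain out of the crawl data itself.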

Thanks for your help!
