Dennis,

I am facing the same dilemma. Here are my thoughts:

1. Write a plugin to do it. The plugin could be adapted to the site map and the levels.
2. Modify the Fetcher itself. The drawback is that merging in the latest fixes and enhancements contributed by the community would then be very hard.
3. Write a prefetcher that fetches all the URLs from a site and populates a file with them. The Nutch crawler can then be triggered to crawl the prefetched URLs. Any unnecessary URLs found within the prefetched pages would have to be ignored.

I am still looking for a way to do this. Please share your thoughts.

Thanks
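P.S. To make the prefetcher idea a bit more concrete, here is a rough sketch of how the levels might be assigned once the links of a site are known. It is plain Java and does not use any Nutch API. The class name LinkLevels, the method assignLevels, and the in-memory map of outlinks are all made up for illustration, and "same site" is assumed to mean "same host". The homepage gets level one, and a breadth-first walk gives every other page the level of the first page that links to it, plus one.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Nutch.
public class LinkLevels {

    // Assigns a level to every page reachable from the homepage through
    // links that stay on the homepage's host. The homepage is level 1.
    public static Map<String, Integer> assignLevels(
            String homepage, Map<String, List<String>> outlinks) {
        String host = URI.create(homepage).getHost();
        Map<String, Integer> levels = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();

        levels.put(homepage, 1);
        queue.add(homepage);

        while (!queue.isEmpty()) {
            String page = queue.poll();
            int nextLevel = levels.get(page) + 1;
            for (String link : outlinks.getOrDefault(page, List.of())) {
                // Assumption: "same site" means "same host".
                String linkHost = URI.create(link).getHost();
                boolean sameSite = host != null && host.equalsIgnoreCase(linkHost);
                // Breadth-first order means the first visit is the shallowest.
                if (sameSite && !levels.containsKey(link)) {
                    levels.put(link, nextLevel);
                    queue.add(link);
                }
            }
        }
        return levels;
    }

    public static void main(String[] args) {
        // Made-up link graph standing in for the prefetched pages.
        Map<String, List<String>> outlinks = Map.of(
            "http://example.com/",
                List.of("http://example.com/a", "http://other.org/x"),
            "http://example.com/a",
                List.of("http://example.com/b", "http://example.com/"));
        System.out.println(assignLevels("http://example.com/", outlinks));
    }
}
```

The off-site link is skipped and the link back to the homepage does not change its level. How the resulting levels would be carried into the Nutch parse metadata is still the open question.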
Dennis Kubes <[EMAIL PROTECTED]> wrote:

I am trying to modify Nutch to add a level to the website parse data. What I mean by this is: suppose you start parsing a website at its homepage; that would be level one. Any links within the same site from the homepage would be level two, links from those pages would be level three, and so on. I am only counting links within the same site.

How would I go about modifying Nutch to handle this? I was thinking that I would have to modify the Fetcher to do this, adding the level to the parse metadata. What I am not getting is how I would obtain the link level initially. I was thinking I would have to modify something in the generator but didn't know what.

Dennis

Sudhi Seshachala
http://sudhilogs.blogspot.com/
