You can disable this in the url-filter file; it is disabled by default. You ran into a loop on that site.
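
To make the loop concrete, here is a rough sketch in Python (not Nutch code and not url-filter syntax) of the kind of check that would stop it: spot a path with consecutive slashes, and collapse it so that every variant maps to one URL. The helper names has_repeated_slashes and collapse_slashes are made up for this example, and I am assuming the site serves the same page no matter how many slashes are in the path.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical helpers for illustration only; not part of Nutch.
_REPEATED_SLASHES = re.compile(r"/{2,}")


def has_repeated_slashes(url: str) -> bool:
    """Return True if the URL path contains two or more consecutive slashes."""
    return _REPEATED_SLASHES.search(urlsplit(url).path) is not None


def collapse_slashes(url: str) -> str:
    """Collapse runs of slashes in the path only, leaving scheme and host alone."""
    parts = urlsplit(url)
    clean_path = _REPEATED_SLASHES.sub("/", parts.path)
    return urlunsplit(parts._replace(path=clean_path))


if __name__ == "__main__":
    seen = set()
    for u in [
        "http://www.me.washington.edu//people/faculty/wang/",
        "http://www.me.washington.edu///people/faculty/wang/",
        "http://www.me.washington.edu////people/faculty/wang/",
    ]:
        # Every variant collapses to the same key, so only one would be fetched.
        seen.add(collapse_slashes(u))
    print(seen)
```

Either approach breaks the cycle: rejecting such URLs outright drops them from the crawl, while collapsing them keeps the page but fetches it once.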

On Wed, Apr 8, 2009 at 7:32 AM, yanky young <[email protected]> wrote:
> Hi guys:
>
> I am using nutch in a project. But I found that nutch repeat fetching some
> pages. For example:
>
> http://www.me.washington.edu//people/faculty/wang/
>
> this is a page fetched. But also, there are some urls like this in
> commandline output:
>
> http://www.me.washington.edu//people/faculty/wang/
> http://www.me.washington.edu///people/faculty/wang/
> http://www.me.washington.edu////people/faculty/wang/
> ......
> http://www.me.washington.edu////////////people/faculty/wang/
>
> it seems nutch will repeat this process for ever. Why is that?
>
> any help is appreciated!
>
> yanky
>
