Yes, the limit comes from the db.max.outlinks.per.page setting in nutch-default.xml. If you want Nutch to fetch more outlinks per page than the default (which is 100, if I remember correctly), override the setting with a larger value in nutch-site.xml.
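
A minimal sketch of what that override might look like in nutch-site.xml. The value 500 is only an illustrative assumption; anything comfortably above the largest number of links on one of your index pages should do.

```xml
<?xml version="1.0"?>
<!-- nutch-site.xml: values set here override those in nutch-default.xml -->
<configuration>
  <property>
    <name>db.max.outlinks.per.page</name>
    <!-- 500 is an example value, not a recommendation -->
    <value>500</value>
  </property>
</configuration>
```

Pages that were already fetched and parsed under the old limit would presumably need to be re-crawled before the extra links show up.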

Rgrds,
Thomas

On 3/15/06, Aled Jones <[EMAIL PROTECTED]> wrote:
> Hi
>
> Does Nutch have a limit on the number of links it will fetch per page?
>
> I have a directory-like structure to my web pages, with each subfolder
> having its own index page. Some index pages have a lot of links, up to
> about 200 in some cases.
>
> There are some index pages where it isn't fetching all the links at the
> bottom. It will do the first 100 or so, but there's no sign of the
> rest.
>
> Any ideas?
>
> Thanks
> Aled
