Yes, there is a limit. The db.max.outlinks.per.page property, defined in
nutch-default.xml, caps the number of outlinks Nutch processes per page.
To follow more outlinks per page, override it with a higher value in
nutch-site.xml. The default is 100, if I remember correctly, which would
match what you are seeing.
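
Something like the following in nutch-site.xml should do it. This is only a
sketch: the value 500 is an arbitrary example I picked to cover your
200-link pages, and the property element goes inside the root element your
nutch-site.xml already has.

```xml
<!-- Override of the default outlink limit; 500 is an example value -->
<property>
  <name>db.max.outlinks.per.page</name>
  <value>500</value>
</property>
```

I would expect pages that were already fetched to need re-crawling before
the extra links show up, but I have not verified that.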

Rgrds, Thomas





On 3/15/06, Aled Jones <[EMAIL PROTECTED]> wrote:
>
> Hi
>
> Does nutch have a limit on the number of links it will fetch per page?
>
> I have a directory-like structure to my web pages, with each subfolder
> having its own index page.  Some index pages have a lot of links, up to
> about 200 in some cases.
>
> There are some index pages where it isn't fetching all the links at the
> bottom.  It will do the first 100 or so, but there's no sign of the
> rest.
>
> Any ideas?
>
> Thanks
> Aled
