Hi,

On 6/12/07, patrik <[EMAIL PROTECTED]> wrote:
>
> I'm running Nutch 0.8.1 on 3 servers. Everything works fine, but I'm
> confused about some Fetcher behavior. I'll generate a list of 100k urls
> to fetch, that works fine. However, only 1 server in the cluster
> actually fetches a reasonable number. 2 out of three go get at most 20
> pages. I've gotta believe I'm just missing some important configuration
> settings.

When the generator runs in distributed mode, it partitions URLs into
separate map tasks according to their hosts. This way, URLs under the
same host end up in the same map task (which is necessary for
politeness). So, in your case, either you have very few hosts (one of
which has almost 100K URLs) or there is a problem with
partitioning.
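
To check which of the two it is, it may help to count URLs per host in
the generated list. The following is only a rough sketch of the idea,
not Nutch's actual partitioner: the function names are made up for
illustration, the CRC32 hash is an arbitrary stand-in for whatever hash
the real partitioner uses, and the example URLs are invented.

```python
from collections import Counter
from urllib.parse import urlparse
import zlib


def partition_for(url, num_partitions):
    """Choose a partition from the URL's host, so that every URL of a
    given host lands in the same partition (hypothetical helper)."""
    host = urlparse(url).hostname or url
    # crc32 is stable across runs, unlike Python's built-in hash() on str.
    return zlib.crc32(host.encode("utf-8")) % num_partitions


def partition_sizes(urls, num_partitions):
    """Return the number of URLs that would fall into each partition."""
    counts = Counter(partition_for(u, num_partitions) for u in urls)
    return [counts.get(i, 0) for i in range(num_partitions)]


def host_counts(urls):
    """Return a Counter of URLs per host, to spot a dominant host."""
    return Counter(urlparse(u).hostname or u for u in urls)


if __name__ == "__main__":
    # Invented, heavily skewed list: one host owns nearly all URLs.
    urls = ["http://big.example.com/page%d" % i for i in range(1000)]
    urls += ["http://a.example.org/x", "http://b.example.net/y"]

    print(host_counts(urls).most_common(3))
    print(partition_sizes(urls, 3))
```

With a list like this, one partition receives at least the 1000 URLs of
the dominant host no matter how many map tasks there are, which would
match the behavior you describe. If the host counts come out evenly
spread instead, that points to a partitioning problem.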

>
> Patrik
>
[...snip...]


-- 
Doğacan Güney
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
