Hi,

On 6/12/07, patrik <[EMAIL PROTECTED]> wrote:
> I'm running Nutch 0.8.1 on 3 servers. Everything works fine, but I'm
> confused about some Fetcher behavior. I'll generate a list of 100k urls
> to fetch, that works fine. However, only 1 server in the cluster
> actually fetches a reasonable number. 2 out of three go get at most 20
> pages. I've gotta believe I'm just missing some important configuration
> settings.

When the generator runs in distributed mode, it partitions URLs into separate map tasks according to their hosts. This way, URLs under the same host end up in the same map task (which is necessary for politeness). So, in your case, either you have very few hosts (one of which has almost 100K URLs) or there is a problem with partitioning.

> Patrik
> [...snip...]

--
Doğacan Güney

Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
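
To illustrate the host-based partitioning described above, here is a minimal sketch in Python. It is not Nutch's actual code: the function names, the CRC32 hash, and the sample URLs are all assumptions made for illustration. It only shows why a fetch list dominated by one host leaves most tasks nearly idle, and how counting URLs per host can confirm that.

```python
from collections import Counter
from urllib.parse import urlsplit
import zlib


def partition_for(url: str, num_tasks: int) -> int:
    """Pick a task index from the URL's host (hypothetical hash choice)."""
    host = urlsplit(url).hostname or ""
    # CRC32 is used only because it is deterministic across runs;
    # the real partitioner may hash the host differently.
    return zlib.crc32(host.encode("utf-8")) % num_tasks


def partition_sizes(urls, num_tasks):
    """Return how many URLs each task would receive."""
    counts = Counter(partition_for(u, num_tasks) for u in urls)
    return [counts.get(i, 0) for i in range(num_tasks)]


def urls_per_host(urls):
    """Count URLs per host, largest first, to spot a dominant host."""
    return Counter(urlsplit(u).hostname for u in urls).most_common()


# Made-up fetch list: one large host plus a handful of small ones.
urls = [f"http://big.example.com/page{i}" for i in range(1000)]
urls += [f"http://small{j}.example.org/" for j in range(20)]

sizes = partition_sizes(urls, 3)
print(sizes)                   # one task holds at least 1000 URLs
print(urls_per_host(urls)[0])  # ('big.example.com', 1000)
```

Running the same kind of per-host count over the generated fetch list should show whether the 100K URLs really come from only a few hosts, or whether the hosts are varied and the partitioning itself needs a closer look.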
