Re: generate.max.per.host is per reduce task

2006-05-07 Thread Doug Cutting

Chris Schneider wrote:
> I just noticed that the generate.max.per.host property is only enforced
> on a "per reduce task" basis during the first generate job (see
> Generator.Selector.reduce for details). At a minimum, it should probably
> be documented this way in nutch-default.xml.template.


Yes, but all URLs with the same host go to a single reduce task, since
the job generates host-disjoint fetch lists. So a per-reduce-task limit
is, in effect, a per-host limit across the whole job.
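
To illustrate the point, here is a minimal sketch, not the actual Generator code. The class and method names (HostPartitionSketch, partitionFor, selectWithinPartition, generate) are made up for this example, and hashing the host name is only an assumed way of getting host-disjoint partitions. It shows that if the partition depends on the host alone, a cap applied inside each reduce task independently still caps each host for the job as a whole.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: these names are hypothetical and are not Nutch APIs.
class HostPartitionSketch {

    // Pick a partition from the host alone, so every URL of a given host
    // lands in the same reduce task (host-disjoint partitions).
    static int partitionFor(String url, int numReduceTasks) {
        String host = URI.create(url).getHost();
        return (host.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Stand-in for the per-task limit: within one partition, keep at most
    // maxPerHost URLs for each host and drop the rest.
    static List<String> selectWithinPartition(List<String> urls, int maxPerHost) {
        Map<String, Integer> hostCounts = new HashMap<>();
        List<String> selected = new ArrayList<>();
        for (String url : urls) {
            String host = URI.create(url).getHost();
            int seen = hostCounts.merge(host, 1, Integer::sum);
            if (seen <= maxPerHost) {
                selected.add(url);
            }
        }
        return selected;
    }

    // Partition by host, then apply the cap in each partition independently,
    // with no shared state between partitions.
    static List<String> generate(List<String> urls, int numReduceTasks, int maxPerHost) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String url : urls) {
            partitions.get(partitionFor(url, numReduceTasks)).add(url);
        }
        List<String> out = new ArrayList<>();
        for (List<String> partition : partitions) {
            out.addAll(selectWithinPartition(partition, maxPerHost));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            urls.add("http://a.example/" + i);
            urls.add("http://b.example/" + i);
        }
        // Three reduce tasks, at most two URLs per host.
        for (String url : generate(urls, 3, 2)) {
            System.out.println(url);
        }
    }
}
```

Whichever partitions the two hosts hash to, each host contributes at most two URLs overall, because no host is ever split across reduce tasks.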


Doug


generate.max.per.host is per reduce task

2006-05-07 Thread Chris Schneider

Gang,

I just noticed that the generate.max.per.host property is only 
enforced on a "per reduce task" basis during the first generate job 
(see Generator.Selector.reduce for details). At a minimum, it should 
probably be documented this way in nutch-default.xml.template.


Thoughts?

- Chris
--

Chris Schneider
TransPac Software, Inc.
[EMAIL PROTECTED]