Hello people,
can someone explain to me how the generator generates the fetch lists?
In particular:
I don't understand why it generates fetch lists with very different
amounts of urls.
Sometimes it generates > 25k urls and sometimes only ~ 1k.
In every case there were more than 25k unfetched urls in the crawldb.
So I was expecting that it always generates ~ 25k urls. But as I said
before, sometimes only ~ 1k.
In my nutch-site.xml I have defined the following values:
<property>
<name>generate.max.count</name>
<value>-1</value>
<description>The maximum number of urls in a single
fetchlist. -1 if unlimited. The urls are counted according
to the value of the parameter generator.count.mode.
</description>
</property>
Any ideas?
Thanks