The generate.max.per.host property is deprecated, but it is still used
inside the Generator logic.
In Generator.java:
    if (maxCount == -1 && oldMaxPerHost != -1) {
      maxCount = oldMaxPerHost;
      byDomain = false;
    }
("generate.max.count" is stored in maxCount and "generate.max.per.host" is
stored in oldMaxPerHost.)
So although "generate.max.count" was -1 in the config file, the Generator
fell back to "generate.max.per.host" and internally used 100.
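
To make the fallback easier to see, here is a minimal standalone sketch of the same check. It is not the actual Nutch code: the class MaxCountFallbackSketch, the Limits holder, the resolve() method and the use of a plain Map in place of the Hadoop Configuration are all made up for illustration, and the initial byDomain value is simply passed in as a parameter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Nutch.
public class MaxCountFallbackSketch {

    // Holds the effective settings after the fallback has been applied.
    static final class Limits {
        final int maxCount;
        final boolean byDomain;

        Limits(int maxCount, boolean byDomain) {
            this.maxCount = maxCount;
            this.byDomain = byDomain;
        }
    }

    // Reads an int property from the map, or returns the default if unset.
    static int getInt(Map<String, String> conf, String key, int defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    // Mirrors the check quoted from Generator.java: when the new property is
    // unlimited (-1) and the deprecated one is set, the deprecated value wins.
    static Limits resolve(Map<String, String> conf, boolean byDomain) {
        int maxCount = getInt(conf, "generate.max.count", -1);
        int oldMaxPerHost = getInt(conf, "generate.max.per.host", -1);
        if (maxCount == -1 && oldMaxPerHost != -1) {
            maxCount = oldMaxPerHost;
            byDomain = false;
        }
        return new Limits(maxCount, byDomain);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("generate.max.count", "-1");      // default: unlimited
        conf.put("generate.max.per.host", "100");  // deprecated, but still read
        Limits limits = resolve(conf, false);
        System.out.println("effective maxCount = " + limits.maxCount);
    }
}
```

With the deprecated property removed from the map, the same call leaves maxCount at -1, which matches what removing it from nutch-site.xml does.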
Thanks,
Tejas Patil
On Sat, Jan 5, 2013 at 6:19 PM, Bayu Widyasanyata wrote:
> Problem fixed :)
>
> Many thanks!
>
> On Sun, Jan 6, 2013 at 9:15 AM, Bayu Widyasanyata wrote:
>
> > I think that was the problem. My nutch-site.xml contains:
> >
> > <property>
> > <name>generate.max.per.host</name>
> > <value>100</value>
> > </property>
> >
> > even though it's deprecated.
> > OK, I will remove it from nutch-site.xml and try to recrawl.
> >
> > Thanks Tejas!
> >
> >
> > On Sun, Jan 6, 2013 at 8:59 AM, Tejas Patil wrote:
> >
> >> Which properties have you set in nutch-site.xml?
> >>
> >> Thanks,
> >> Tejas Patil
> >>
> >>
> >> On Sat, Jan 5, 2013 at 5:31 PM, Bayu Widyasanyata wrote:
> >>
> >> > Hi,
> >> >
> >> > I got a warning message from Nutch:
> >> >
> >> > "Host or domain example.com has more than 100 URLs for all 1 segments.
> >> > Additional URLs won't be included in the fetchlist."
> >> >
> >> > The generate.max.count property in nutch-default.xml is still at its
> >> > default value, which is -1 (unlimited).
> >> > Why does this warning still appear?
> >> >
> >> > I use Nutch 1.6 with Solr 4.0.
> >> >
> >> > Thanks,
> >> >
> >> > --
> >> > wassalam,
> >> > [bayu]
> >> >
> >>
> >
> >
> >
> > --
> > wassalam,
> > [bayu]
>
>
>
>
> --
> wassalam,
> [bayu]
>