Nice one, Tejas. I will try to review it today.
Thanks
Lewis
On Jan 6, 2013 7:11 AM, "Tejas Patil" <[email protected]> wrote:

> Hi Lewis,
> I have created a jira [0] for this and uploaded the patch.
>
> [0] : https://issues.apache.org/jira/browse/NUTCH-1514
>
> Thanks,
> Tejas
>
>
> On Sun, Jan 6, 2013 at 12:01 AM, Tejas Patil <[email protected]> wrote:
>
> > Hey Lewis,
> >
> > Yes. That's a good idea. There are so many properties in nutch-default.xml
> > and having the deprecated ones adds to the confusion.
> >
> > Thanks,
> > Tejas Patil
> >
> >
> > On Sat, Jan 5, 2013 at 11:12 PM, Lewis John Mcgibbney <
> > [email protected]> wrote:
> >
> >> I think it would be good to phase out some of the deprecated configuration
> >> properties if possible. We have had several stable releases with these
> >> props included...
> >> Lewis
> >> On Jan 5, 2013 6:22 PM, "Tejas Patil" <[email protected]> wrote:
> >>
> >> > The generate.max.per.host property is deprecated but is still used inside
> >> > the Generator logic.
> >> > In Generator.java:
> >> >
> >> >       if (maxCount==-1 && oldMaxPerHost!=-1){
> >> >         maxCount = oldMaxPerHost;
> >> >         byDomain = false;
> >> >       }
> >> >
> >> > ("generate.max.count" is stored in maxCount and "generate.max.per.host"
> >> > is stored in oldMaxPerHost.)
> >> > So despite "generate.max.count" being -1 in the config file, the Generator
> >> > was internally using 100, the value of the deprecated property.
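
To make the effect concrete, here is a minimal, self-contained sketch of the fallback quoted above. Only the if-condition and its two assignments come from the Generator.java snippet in this thread; the class name, the resolve method, and the Effective holder are hypothetical and are not part of Nutch.

```java
// Sketch only: mirrors the fallback quoted from Generator.java in this thread.
// The class, method, and holder type below are hypothetical, not Nutch APIs.
final class MaxCountFallbackSketch {

    /** Effective settings after the deprecated property has been considered. */
    static final class Effective {
        final int maxCount;      // effective URL limit, -1 means unlimited
        final boolean byDomain;  // true = count per domain, false = count per host

        Effective(int maxCount, boolean byDomain) {
            this.maxCount = maxCount;
            this.byDomain = byDomain;
        }
    }

    static Effective resolve(int maxCount, int oldMaxPerHost, boolean byDomain) {
        // Same condition as the quoted snippet: the deprecated value is used
        // only when the new one is unset (-1) and the old one is set.
        if (maxCount == -1 && oldMaxPerHost != -1) {
            maxCount = oldMaxPerHost;
            byDomain = false;
        }
        return new Effective(maxCount, byDomain);
    }

    public static void main(String[] args) {
        // Values from this thread: generate.max.count = -1, generate.max.per.host = 100.
        Effective e = resolve(-1, 100, false);
        System.out.println(e.maxCount + " " + e.byDomain); // prints "100 false"
    }
}
```

With both properties left at -1, or with generate.max.count set explicitly, the condition is false and the deprecated value is ignored.
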
> >> >
> >> > Thanks,
> >> > Tejas Patil
> >> >
> >> >
> >> > On Sat, Jan 5, 2013 at 6:19 PM, Bayu Widyasanyata
> >> > <[email protected]>wrote:
> >> >
> >> > > Problem fixed :)
> >> > >
> >> > > Many thanks!
> >> > >
> >> > > On Sun, Jan 6, 2013 at 9:15 AM, Bayu Widyasanyata
> >> > > <[email protected]>wrote:
> >> > >
> >> > > > I think this was the problem. In my nutch-site.xml I had:
> >> > > >
> >> > > >    <property>
> >> > > >        <name>generate.max.per.host</name>
> >> > > >        <value>100</value>
> >> > > >    </property>
> >> > > >
> >> > > > even though it's deprecated.
> >> > > > OK, I will remove it (from nutch-site.xml) and try to recrawl.
> >> > > >
> >> > > > Thanks Tejas!
> >> > > >
> >> > > >
> >> > > > On Sun, Jan 6, 2013 at 8:59 AM, Tejas Patil <[email protected]> wrote:
> >> > > >
> >> > > >> Which properties have you set in nutch-site.xml?
> >> > > >>
> >> > > >> Thanks,
> >> > > >> Tejas Patil
> >> > > >>
> >> > > >>
> >> > > >> On Sat, Jan 5, 2013 at 5:31 PM, Bayu Widyasanyata
> >> > > >> <[email protected]>wrote:
> >> > > >>
> >> > > >> > Hi,
> >> > > >> >
> >> > > >> > I got a warning message from Nutch:
> >> > > >> >
> >> > > >> > "Host or domain example.com has more than 100 URLs for all 1 segments.
> >> > > >> > Additional URLs won't be included in the fetchlist."
> >> > > >> >
> >> > > >> > The generate.max.count property in nutch-default.xml still has its
> >> > > >> > default value, which is -1 (unlimited).
> >> > > >> > Why does this warning still appear?
> >> > > >> >
> >> > > >> > I use Nutch 1.6 with Solr 4.0.
> >> > > >> >
> >> > > >> > Thanks,
> >> > > >> >
> >> > > >> > --
> >> > > >> > wassalam,
> >> > > >> > [bayu]
> >> > > >> >
> >> > > >>
> >> > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > wassalam,
> >> > > > [bayu]
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > wassalam,
> >> > > [bayu]
> >> > >
> >> >
> >>
> >
> >
>
