Nice one Tejas
I will try to review today.
Thanks
Lewis

On Jan 6, 2013 7:11 AM, "Tejas Patil" <[email protected]> wrote:
> Hi Lewis,
> I have created a jira [0] for this and uploaded the patch.
>
> [0] : https://issues.apache.org/jira/browse/NUTCH-1514
>
> Thanks,
> Tejas
>
>
> On Sun, Jan 6, 2013 at 12:01 AM, Tejas Patil <[email protected]> wrote:
>
> > Hey Lewis,
> >
> > Yes. That's a good idea. There are so many properties in nutch-default.xml
> > and having the deprecated ones adds to the confusion.
> >
> > Thanks,
> > Tejas Patil
> >
> >
> > On Sat, Jan 5, 2013 at 11:12 PM, Lewis John Mcgibbney
> > <[email protected]> wrote:
> >
> >> I think it would be good to phase out some of the deprecated configuration
> >> properties if possible. We have had several stable releases with these
> >> props included...
> >> Lewis
> >> On Jan 5, 2013 6:22 PM, "Tejas Patil" <[email protected]> wrote:
> >>
> >> > The generate.max.per.host property is deprecated but is still used
> >> > inside the Generator logic.
> >> > In Generator.java:
> >> >
> >> >     if (maxCount==-1 && oldMaxPerHost!=-1){
> >> >       maxCount = oldMaxPerHost;
> >> >       byDomain = false;
> >> >     }
> >> >
> >> > ("generate.max.count" is stored in maxCount and "generate.max.per.host"
> >> > is stored in oldMaxPerHost.)
> >> > So despite having "generate.max.count" as -1 in the config file,
> >> > internally it was using 100.
> >> >
> >> > Thanks,
> >> > Tejas Patil
> >> >
> >> >
> >> > On Sat, Jan 5, 2013 at 6:19 PM, Bayu Widyasanyata
> >> > <[email protected]> wrote:
> >> >
> >> > > Problem fixed :)
> >> > >
> >> > > Many thanks!
> >> > >
> >> > > On Sun, Jan 6, 2013 at 9:15 AM, Bayu Widyasanyata
> >> > > <[email protected]> wrote:
> >> > >
> >> > > > I think this was the problem. In my nutch-site.xml I had
> >> > > >
> >> > > > <property>
> >> > > >   <name>generate.max.per.host</name>
> >> > > >   <value>100</value>
> >> > > > </property>
> >> > > >
> >> > > > even though it's deprecated.
> >> > > > OK, I will remove it (from nutch-site.xml) and try to recrawl.
> >> > > >
> >> > > > Thanks Tejas!
> >> > > >
> >> > > >
> >> > > > On Sun, Jan 6, 2013 at 8:59 AM, Tejas Patil
> >> > > > <[email protected]> wrote:
> >> > > >
> >> > > >> Which properties have you set in nutch-site.xml?
> >> > > >>
> >> > > >> Thanks,
> >> > > >> Tejas Patil
> >> > > >>
> >> > > >>
> >> > > >> On Sat, Jan 5, 2013 at 5:31 PM, Bayu Widyasanyata
> >> > > >> <[email protected]> wrote:
> >> > > >>
> >> > > >> > Hi,
> >> > > >> >
> >> > > >> > I got a warning message from Nutch:
> >> > > >> >
> >> > > >> > "Host or domain example.com has more than 100 URLs for all 1 segments.
> >> > > >> > Additional URLs won't be included in the fetchlist."
> >> > > >> >
> >> > > >> > The generate.max.count property in nutch-default.xml still has its
> >> > > >> > default value, which is -1 (unlimited).
> >> > > >> > Why does this error still appear?
> >> > > >> >
> >> > > >> > I use Nutch 1.6 with Solr 4.0.
> >> > > >> >
> >> > > >> > Thanks,
> >> > > >> >
> >> > > >> > --
> >> > > >> > wassalam,
> >> > > >> > [bayu]
> >> > > >> >
> >> > > >>
> >> > > >
> >> > > > --
> >> > > > wassalam,
> >> > > > [bayu]
> >> > >
> >> > > --
> >> > > wassalam,
> >> > > [bayu]
> >> > >
> >> >
> >>
> >
>
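
For anyone reading this thread later, the fallback Tejas quotes from Generator.java can be reproduced in isolation. The following is only a minimal sketch, not the actual Nutch source: the class name GeneratorLimitSketch, the Limits holder, and the resolve method are hypothetical names made up for illustration. Only the if-condition and the variable names maxCount, oldMaxPerHost and byDomain come from the snippet in the thread.

```java
/**
 * Hypothetical sketch (NOT the actual Nutch Generator code) of how the
 * deprecated generate.max.per.host value can override an "unlimited"
 * generate.max.count, as described in the thread.
 */
final class GeneratorLimitSketch {

    /** Effective settings after the fallback has been applied. */
    static final class Limits {
        final int maxCount;      // -1 means unlimited
        final boolean byDomain;  // false means URLs are counted per host

        Limits(int maxCount, boolean byDomain) {
            this.maxCount = maxCount;
            this.byDomain = byDomain;
        }
    }

    /**
     * maxCount      value of generate.max.count (-1 = unlimited)
     * oldMaxPerHost value of the deprecated generate.max.per.host (-1 = unset)
     * byDomain      whether counting was configured per domain
     */
    static Limits resolve(int maxCount, int oldMaxPerHost, boolean byDomain) {
        // Same condition as the snippet quoted from Generator.java:
        // the deprecated value is used only when the new one is unlimited.
        if (maxCount == -1 && oldMaxPerHost != -1) {
            maxCount = oldMaxPerHost;
            byDomain = false;
        }
        return new Limits(maxCount, byDomain);
    }
}
```

With the settings from this thread (generate.max.count left at -1 and generate.max.per.host set to 100 in nutch-site.xml), resolve(-1, 100, ...) gives an effective limit of 100 counted per host, which matches the warning Bayu saw. Once the deprecated property is removed, both values are -1 and no limit is applied.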

