Re: [Nutch-dev] lib-http crawl-delay problem

2007-02-15 Thread ogjunk-nutch
Hi,

I think the robots.txt example you used was invalid (no path for that last Disallow rule).

Small patch indeed, but sticking it in JIRA would still make sense because:
- it leaves a good record of the bug + fix
- it could be used for release notes/changelog

Not trying to be picky, just pointi

Re: [Nutch-dev] lib-http crawl-delay problem

2007-02-15 Thread rubdabadub
Thanks for the link!

On 2/15/07, Doğacan Güney <[EMAIL PROTECTED]> wrote:
> rubdabadub wrote:
> > Hi:
> >
> > I am unable to get the attached patch via mail. It's better if you
> > create a JIRA issue and attach the patch there.
> >
> > Thank you.
> >
>
> I don't know, this bug seems too minor

Re: [Nutch-dev] lib-http crawl-delay problem

2007-02-15 Thread Doğacan Güney
rubdabadub wrote:
> Hi:
>
> I am unable to get the attached patch via mail. It's better if you
> create a JIRA issue and attach the patch there.
>
> Thank you.
>

I don't know, this bug seems too minor to require its own JIRA issue. So I put the patch at http://www.ceng.metu.edu.tr/~e1345172/crawl

Re: [Nutch-dev] lib-http crawl-delay problem

2007-02-15 Thread rubdabadub
Hi:

I am unable to get the attached patch via mail. It's better if you create a JIRA issue and attach the patch there.

Thank you.

On 2/15/07, Doğacan Güney <[EMAIL PROTECTED]> wrote:
> Hi,
>
> There seem to be two small bugs in lib-http's RobotRulesParser.
>
> First is about reading crawl-dela

[Nutch-dev] lib-http crawl-delay problem

2007-02-15 Thread Doğacan Güney
Hi,

There seem to be two small bugs in lib-http's RobotRulesParser.

The first is about reading crawl-delay. The code doesn't check addRules, so the Nutch bot picks up a crawl-delay value that robots.txt specifies for another robot. Let me try to be clearer:

User-agent: foobot
Crawl-delay:
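
To illustrate the first bug, here is a minimal Java sketch of the kind of check being described. It is not the actual RobotRulesParser code or the patch from this thread: the class name CrawlDelaySketch, the method parseCrawlDelay, and the sample robots.txt contents in the usage note are hypothetical, and only the addRules flag name is taken from the message above. The idea is that a Crawl-delay line is honored only while addRules is true, that is, while the parser is inside a User-agent block that applies to our own agent.

```java
import java.util.Locale;

public class CrawlDelaySketch {

    /**
     * Returns the Crawl-delay in milliseconds that applies to agentName,
     * or -1 if no matching User-agent block sets one.
     * Simplification (assumption): the last matching block wins, with no
     * special precedence between "*" and a specific agent name.
     */
    public static long parseCrawlDelay(String robotsTxt, String agentName) {
        String agent = agentName.toLowerCase(Locale.ROOT);
        boolean addRules = false;     // true while the current block applies to us
        boolean inAgentLines = false; // true while reading consecutive User-agent lines
        long crawlDelay = -1;

        for (String rawLine : robotsTxt.split("\\r?\\n")) {
            String line = rawLine;
            int hash = line.indexOf('#');          // strip comments
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            int colon = line.indexOf(':');
            if (line.isEmpty() || colon < 0) {
                continue;
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                if (!inAgentLines) {
                    // A new block starts: forget the previous block's match.
                    addRules = false;
                    inAgentLines = true;
                }
                String v = value.toLowerCase(Locale.ROOT);
                if (v.equals("*") || (!v.isEmpty() && agent.contains(v))) {
                    addRules = true;
                }
            } else {
                inAgentLines = false;
                // The check the message says is missing: only read
                // Crawl-delay when the current block applies to this agent.
                if (field.equals("crawl-delay") && addRules) {
                    try {
                        crawlDelay = (long) (Double.parseDouble(value) * 1000);
                    } catch (NumberFormatException e) {
                        // Ignore a malformed delay value.
                    }
                }
            }
        }
        return crawlDelay;
    }
}
```

With a made-up robots.txt in which only foobot declares a Crawl-delay of 5 and a separate block for nutch has only a Disallow rule, parseCrawlDelay(robots, "nutch") returns -1 and parseCrawlDelay(robots, "foobot") returns 5000. Dropping the `&& addRules` condition would make the nutch call return foobot's delay, which is the behavior reported in this thread.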