Add site fetcher.max.crawl.delay as log output by default.
----------------------------------------------------------

                 Key: NUTCH-1284
                 URL: https://issues.apache.org/jira/browse/NUTCH-1284
             Project: Nutch
          Issue Type: New Feature
          Components: fetcher
    Affects Versions: nutchgora, 1.5
            Reporter: Lewis John McGibbney
            Priority: Trivial
             Fix For: nutchgora, 1.5


Currently, when manually scanning the log output, we cannot tell which pages
are subject to a crawl delay between successive fetch attempts within a given
site. The delay value should be included in the fetch log line, for example:

{code}
2012-02-19 12:33:33,031 INFO  fetcher.Fetcher - fetching http://nutch.apache.org/ (crawl.delay=XXXms)
{code}

This way we can quickly determine from the log whether the fetcher is having
to apply a crawl delay or not.
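
As a rough illustration of the proposed message format only, here is a minimal
Java sketch. The class FetchLogMessage and its format method are hypothetical
names invented for this example and are not part of Nutch. How the fetcher
obtains the delay value is not shown here and is left as an assumption, as is
the choice of milliseconds as a plain long.

```java
// Hypothetical helper, not part of Nutch: builds the message text proposed
// in this issue. The timestamp, level and logger name are added by the
// logging framework, so only the message body is produced here.
final class FetchLogMessage {
    private FetchLogMessage() {}

    /** Appends the crawl delay (assumed to be in milliseconds) to the usual "fetching <url>" text. */
    static String format(String url, long crawlDelayMs) {
        return "fetching " + url + " (crawl.delay=" + crawlDelayMs + "ms)";
    }

    public static void main(String[] args) {
        // 5000 ms is an arbitrary example value, not a Nutch default.
        System.out.println(format("http://nutch.apache.org/", 5000L));
    }
}
```

In the fetcher itself, the resulting string would presumably be passed to the
existing INFO-level log call in place of the current "fetching <url>" message.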


        
