Adaptive re-fetch interval. Detecting unmodified content
--------------------------------------------------------

         Key: NUTCH-61
         URL: http://issues.apache.org/jira/browse/NUTCH-61
     Project: Nutch
        Type: New Feature
  Components: fetcher  
    Reporter: Andrzej Bialecki 
 Assigned to: Andrzej Bialecki  


Currently Nutch does not automatically adjust its re-fetch period, regardless of
whether individual pages change seldom or frequently. The goal of these changes
is to extend the current codebase to support various adjustments to re-fetch
times and intervals, and specifically to add a re-fetch schedule that tries to
adapt the period between consecutive fetches to the rate at which the content
changes.
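
To make the idea of an adaptive schedule concrete, here is a minimal sketch of
one possible adjustment rule: shorten the interval when a page was found
changed, lengthen it when the page was unchanged, and clamp the result. The
class name, method name, rates and bounds below are all assumptions chosen for
illustration; they are not taken from the patches.

```java
public class AdaptiveIntervalSketch {
    // Illustrative assumptions only: none of these values come from the patches.
    static final double DEC_RATE = 0.2;              // shrink by 20% when content changed
    static final double INC_RATE = 0.2;              // grow by 20% when content is unchanged
    static final long MIN_INTERVAL_SEC = 60L;        // lower bound on the interval
    static final long MAX_INTERVAL_SEC = 365L * 24 * 3600; // upper bound on the interval

    /** Returns the next re-fetch interval in seconds (hypothetical helper). */
    static long nextInterval(long currentSec, boolean changed) {
        double next = changed
                ? currentSec * (1.0 - DEC_RATE)   // page changes often: fetch sooner
                : currentSec * (1.0 + INC_RATE);  // page is stable: fetch later
        long rounded = Math.round(next);
        // Clamp so the interval can neither collapse to zero nor grow without bound.
        return Math.max(MIN_INTERVAL_SEC, Math.min(MAX_INTERVAL_SEC, rounded));
    }

    public static void main(String[] args) {
        long interval = 30L * 24 * 3600; // start from a 30-day interval
        boolean[] observed = {true, true, false};
        for (boolean changed : observed) {
            interval = nextInterval(interval, changed);
            System.out.println((changed ? "changed" : "unchanged") + " -> " + interval + " s");
        }
    }
}
```

A multiplicative rule like this converges toward the page's observed change
rate over several fetches; other rules (additive steps, or estimates based on
the time since the last detected modification) would fit the same interface.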

These patches also implement a check of whether the content has changed since
the last fetch. Protocol plugins are changed to make use of this information,
so that unmodified content does not have to be fetched and processed again.
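
As a rough illustration of how unmodified content might be detected, the sketch
below combines two common signals: an HTTP 304 (Not Modified) status returned
for a conditional request, and a comparison of a stored content digest with the
digest of newly fetched bytes. Whether the patches use either mechanism is an
assumption here; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class UnmodifiedCheckSketch {
    /** Computes an MD5 digest of the content (hypothetical choice of signature). */
    static byte[] signature(byte[] content) {
        try {
            return MessageDigest.getInstance("MD5").digest(content);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Decides whether a page can be treated as unmodified.
     * Assumption: the caller sent a conditional request, so the server may
     * answer 304 without a body; otherwise digests are compared.
     */
    static boolean isUnmodified(int httpStatus, byte[] oldSignature, byte[] newContent) {
        if (httpStatus == 304) {
            return true; // server reports no change; nothing was downloaded
        }
        if (oldSignature == null || newContent == null) {
            return false; // nothing to compare against: treat as modified
        }
        return Arrays.equals(oldSignature, signature(newContent));
    }

    public static void main(String[] args) {
        byte[] stored = signature("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(isUnmodified(304, stored, null));
        System.out.println(isUnmodified(200, stored, "hello".getBytes(StandardCharsets.UTF_8)));
        System.out.println(isUnmodified(200, stored, "hello!".getBytes(StandardCharsets.UTF_8)));
    }
}
```

The 304 path saves both the download and the processing; the digest path only
saves the processing, but also works for servers that ignore conditional
requests.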




_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
