[ http://issues.apache.org/jira/browse/NUTCH-61?page=comments#action_12361131 ]
raghavendra prabhu commented on NUTCH-61:
-----------------------------------------

Will the same approach work for a file system?

For a file system, we can read the file's modification date directly and store it in the db. The plugins then compare the stored date with the file's current date: if it is different, they fetch and index the file again; otherwise they do not fetch it.

This could be a solution for file-based content. (It does away with the fetch interval entirely and makes the decision based only on the file modification date.)

> Adaptive re-fetch interval. Detecting unmodified content
> --------------------------------------------------------
>
>          Key: NUTCH-61
>          URL: http://issues.apache.org/jira/browse/NUTCH-61
>      Project: Nutch
>         Type: New Feature
>   Components: fetcher
>     Reporter: Andrzej Bialecki
>     Assignee: Andrzej Bialecki
>  Attachments: 20050606.diff
>
> Currently Nutch doesn't adjust automatically its re-fetch period, no matter
> if individual pages change seldom or frequently. The goal of these changes is
> to extend the current codebase to support various possible adjustments to
> re-fetch times and intervals, and specifically a re-fetch schedule which
> tries to adapt the period between consecutive fetches to the period of
> content changes.
> Also, these patches implement checking if the content has changed since last
> fetching; protocol plugins are also changed to make use of this information,
> so that if content is unmodified it doesn't have to be fetched and processed.
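The file-system idea in the comment above (store the modification date, then re-index only when it differs) could be sketched roughly as follows. This is only an illustration, not code from the NUTCH-61 patch: the class ModifiedTimeCheck, its methods, and the in-memory map standing in for the db are all hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Nutch): decides re-indexing from the file modification date alone. */
final class ModifiedTimeCheck {
    // Assumption: an in-memory map stands in for the db; key = path, value = last seen mtime in millis.
    private final Map<String, Long> lastSeen = new HashMap<>();

    /**
     * Records the given modification time for the key and returns true when the
     * key is new or the time differs from the previously stored one.
     */
    boolean isModified(String key, long modifiedMillis) {
        Long previous = lastSeen.put(key, modifiedMillis);
        return previous == null || previous != modifiedMillis;
    }

    /** Reads the file's current modification time and applies the same check. */
    boolean needsIndexing(Path file) throws IOException {
        return isModified(file.toString(), Files.getLastModifiedTime(file).toMillis());
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("nutch61-demo", ".txt");
        try {
            ModifiedTimeCheck check = new ModifiedTimeCheck();
            System.out.println(check.needsIndexing(file)); // first time the file is seen
            System.out.println(check.needsIndexing(file)); // modification date unchanged
            // Simulate an edit by moving the modification date forward by ten seconds.
            long bumped = Files.getLastModifiedTime(file).toMillis() + 10_000L;
            Files.setLastModifiedTime(file, FileTime.fromMillis(bumped));
            System.out.println(check.needsIndexing(file)); // modification date changed
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The sketch compares for "different" rather than "newer", following the wording of the comment, so a file restored from an older backup would also be re-indexed. No fetch interval is consulted anywhere, which is the trade-off the comment points out.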
_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
