Markus Jelsma created NUTCH-1430:
------------------------------------

             Summary: Freegenerator records overwrite CrawlDB records with AdaptiveFetchSchedule
                 Key: NUTCH-1430
                 URL: https://issues.apache.org/jira/browse/NUTCH-1430
             Project: Nutch
          Issue Type: Bug
          Components: crawldb
    Affects Versions: 1.5
            Reporter: Markus Jelsma
            Priority: Critical
             Fix For: 1.6

Steps to reproduce:

Without AdaptiveFetchSchedule:

{code}
$ bin/nutch readdb crawl/crawldb/ -url http://www.openindex.io/en/home.html
URL: http://www.openindex.io/en/home.html
Version: 7
Status: 2 (db_fetched)
Fetch time: Thu Aug 16 13:58:23 CEST 2012
Modified time: Thu Jan 01 01:00:00 CET 1970
Retries since fetch: 0
Retry interval: 2592000 seconds (30 days)
Score: 0.0
Signature: c2601ca503f2fc5edcb286501d7fb271
Metadata: Content-Type: text/html_pst_: success(1), lastModified=0
{code}

With AdaptiveFetchSchedule:

{code}
$ bin/nutch readdb crawl/crawldb/ -url http://www.openindex.io/en/home.html
URL: http://www.openindex.io/en/home.html
Version: 7
Status: 2 (db_fetched)
Fetch time: Tue Jul 17 13:56:33 CEST 2012
Modified time: Tue Jul 17 13:55:33 CEST 2012
Retries since fetch: 0
Retry interval: 60 seconds (0 days)
Score: 0.0
Signature: 23567bb52ee8b905b8649c4305ed82ee
Metadata: Content-Type: text/html_pst_: success(1), lastModified=0
{code}
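
To make the difference between the two dumps easier to see, the records above can be compared field by field. The following is only a minimal sketch and not part of Nutch: the helper names (`parse_record`, `parse_time`, `retry_seconds`, `diff_records`) are hypothetical, and it assumes the `readdb -url` output has one `Key: value` pair per line, as pasted in this report.

```python
from datetime import datetime

# The two records as pasted in this report (command line omitted).
DEFAULT = """\
URL: http://www.openindex.io/en/home.html
Version: 7
Status: 2 (db_fetched)
Fetch time: Thu Aug 16 13:58:23 CEST 2012
Modified time: Thu Jan 01 01:00:00 CET 1970
Retries since fetch: 0
Retry interval: 2592000 seconds (30 days)
Score: 0.0
Signature: c2601ca503f2fc5edcb286501d7fb271
Metadata: Content-Type: text/html_pst_: success(1), lastModified=0
"""

ADAPTIVE = """\
URL: http://www.openindex.io/en/home.html
Version: 7
Status: 2 (db_fetched)
Fetch time: Tue Jul 17 13:56:33 CEST 2012
Modified time: Tue Jul 17 13:55:33 CEST 2012
Retries since fetch: 0
Retry interval: 60 seconds (0 days)
Score: 0.0
Signature: 23567bb52ee8b905b8649c4305ed82ee
Metadata: Content-Type: text/html_pst_: success(1), lastModified=0
"""


def parse_record(text):
    """Turn a readdb dump into a dict, splitting each line on the first ': '."""
    record = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(": ")
        record[key.strip()] = value.strip()
    return record


def parse_time(value):
    """Parse e.g. 'Tue Jul 17 13:56:33 CEST 2012', ignoring the zone token.

    Dropping the zone is only safe when the compared times share a zone.
    """
    parts = value.split()
    del parts[4]  # remove 'CEST' / 'CET'
    return datetime.strptime(" ".join(parts), "%a %b %d %H:%M:%S %Y")


def retry_seconds(record):
    """Extract the integer number of seconds from the 'Retry interval' field."""
    return int(record["Retry interval"].split()[0])


def diff_records(a, b):
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    return {k: (a.get(k), b.get(k)) for k in a if a.get(k) != b.get(k)}


default = parse_record(DEFAULT)
adaptive = parse_record(ADAPTIVE)

changed = diff_records(default, adaptive)
for field, (before, after) in changed.items():
    print(f"{field}: {before!r} -> {after!r}")

# Seconds between 'Modified time' and 'Fetch time' in the adaptive record.
gap = (parse_time(adaptive["Fetch time"])
       - parse_time(adaptive["Modified time"])).total_seconds()
print("adaptive fetch gap (s):", gap, "retry interval (s):", retry_seconds(adaptive))
```

On the data above, the fields that differ are the fetch time, modified time, retry interval and signature. In the second record the fetch time is exactly one retry interval (60 seconds) after the modified time, whereas the first record carries the 30-day interval.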