[ https://issues.apache.org/jira/browse/NUTCH-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440189#comment-13440189 ]
Lewis John McGibbney commented on NUTCH-1430:
---------------------------------------------

Hi Markus, yeah, you are right (although I am not using FreeGenerator), this is a bad one. The last thing we want is for the default interval to be overwritten and our ModifiedTime to space-hop back to the 70's... not a good thought, although I do like Led Zeppelin. Anyway, I'm +1 for this. Although you've had it running in production, it would be real nice to add a test for it. Basically I'm +1.
Thanks

> Freegenerator records overwrite CrawlDB records with AdaptiveFetchSchedule
> --------------------------------------------------------------------------
>
>                 Key: NUTCH-1430
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1430
>             Project: Nutch
>          Issue Type: Bug
>          Components: crawldb
>    Affects Versions: 1.5
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Critical
>             Fix For: 1.6
>
>         Attachments: NUTCH-1430-1.6-1.patch
>
>
> Steps to reproduce:
> Without AdaptiveFetchSchedule:
> {code}
> $ bin/nutch readdb crawl/crawldb/ -url http://www.openindex.io/en/home.html
> URL: http://www.openindex.io/en/home.html
> Version: 7
> Status: 2 (db_fetched)
> Fetch time: Thu Aug 16 13:58:23 CEST 2012
> Modified time: Thu Jan 01 01:00:00 CET 1970
> Retries since fetch: 0
> Retry interval: 2592000 seconds (30 days)
> Score: 0.0
> Signature: c2601ca503f2fc5edcb286501d7fb271
> Metadata: Content-Type: text/html_pst_: success(1), lastModified=0
> {code}
> With AdaptiveFetchSchedule:
> {code}
> $ bin/nutch readdb crawl/crawldb/ -url http://www.openindex.io/en/home.html
> URL: http://www.openindex.io/en/home.html
> Version: 7
> Status: 2 (db_fetched)
> Fetch time: Tue Jul 17 13:56:33 CEST 2012
> Modified time: Tue Jul 17 13:55:33 CEST 2012
> Retries since fetch: 0
> Retry interval: 60 seconds (0 days)
> Score: 0.0
> Signature: 23567bb52ee8b905b8649c4305ed82ee
> Metadata: Content-Type: text/html_pst_: success(1), lastModified=0
> {code}
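
A side note on the readdb output quoted above: a modified time of 0 (lastModified=0) is the Unix epoch, which a host in the CET zone shows as 01:00 on 1 January 1970, and the 2592000-second retry interval is exactly 30 days. The following is only a minimal Java sketch of that arithmetic. The class and method names are hypothetical, it does not use any Nutch API, and Europe/Amsterdam is an assumed stand-in for the reporter's CET zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class EpochModifiedTime {

    // Converts epoch milliseconds to local wall-clock time in the given zone.
    // A value of 0 means "never set" in the quoted output and maps to the epoch.
    static LocalDateTime modifiedTime(long epochMillis, String zoneId) {
        return Instant.ofEpochMilli(epochMillis)
                .atZone(ZoneId.of(zoneId))
                .toLocalDateTime();
    }

    // Converts a retry interval in days to seconds, the unit shown by readdb.
    static long daysToSeconds(long days) {
        return days * 24L * 60L * 60L;
    }

    public static void main(String[] args) {
        // Assumed zone: Europe/Amsterdam (UTC+1 in January 1970).
        System.out.println(modifiedTime(0L, "Europe/Amsterdam")); // 1970-01-01T01:00
        System.out.println(daysToSeconds(30));                    // 2592000
    }
}
```

A record whose modified time reads as 1970 has therefore not regressed to some old value; its timestamp is simply zero.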