[ https://issues.apache.org/jira/browse/NUTCH-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699034#comment-13699034 ]
lufeng commented on NUTCH-1600:
-------------------------------

Tested; works fine. +1

> Injector overwrite does not always work properly
> ------------------------------------------------
>
> Key: NUTCH-1600
> URL: https://issues.apache.org/jira/browse/NUTCH-1600
> Project: Nutch
> Issue Type: Bug
> Components: injector
> Affects Versions: 1.7
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Fix For: 1.8
>
> Attachments: NUTCH-1600-1.8.patch
>
>
> db.injector.update works as it should, but db.injector.overwrite doesn't
> always seem to properly overwrite the record. This issue has existed for some
> time and we've already fixed it in our dist of Nutch.
> This record has just been updated (interval).
> {code}
> Injector: starting at 2013-07-03 10:34:15
> Injector: crawlDb: crawl/crawldb
> Injector: urlDir: seeds
> Injector: Converting injected urls to crawl db entries.
> Injector: total number of urls rejected by filters: 0
> Injector: total number of urls injected after normalization and filtering: 9
> Injector: Merging injected urls into crawl db.
> Injector: finished at 2013-07-03 10:34:21, elapsed: 00:00:05
> URL: url
> Version: 7
> Status: 2 (db_fetched)
> Fetch time: Fri Jul 05 12:11:44 CEST 2013
> Modified time: Fri Jun 28 12:11:44 CEST 2013
> Retries since fetch: 0
> Retry interval: 604800 seconds (7 days)
> Score: 0.0
> Signature: ba29ef3e680323a6d0da74c156800e03
> Metadata: Content-Type: text/html_pst_: success(1), lastModified=0
> {code}
> If we now overwrite the record, nothing happens. With this patch installed it
> overwrites the record as it should and also logs the update and overwrite
> switches to the console:
> {code}
> Injector: starting at 2013-07-03 10:36:30
> Injector: crawlDb: crawl/crawldb
> Injector: urlDir: seeds
> Injector: Converting injected urls to crawl db entries.
> Injector: total number of urls rejected by filters: 0
> Injector: total number of urls injected after normalization and filtering: 9
> Injector: Merging injected urls into crawl db.
> Injector: overwrite: true
> Injector: update: false
> Injector: finished at 2013-07-03 10:36:36, elapsed: 00:00:05
> URL: url
> Version: 7
> Status: 1 (db_unfetched)
> Fetch time: Wed Jul 03 10:36:30 CEST 2013
> Modified time: Thu Jan 01 01:00:00 CET 1970
> Retries since fetch: 0
> Retry interval: 14000 seconds (0 days)
> Score: 1.0
> Signature: null
> Metadata: fixedInterval: 14000.0
> {code}
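
For readers who want the expected behaviour spelled out, here is a minimal sketch of how the two switches could decide the merge result for a URL that is both in the crawl db and in the seed list. It is not the code from NUTCH-1600-1.8.patch and does not use the Nutch API: the class InjectMergeSketch, the Record type and the merge method are hypothetical names for illustration. The overwrite branch follows the issue description (the injected record replaces the existing one). The update branch (keep the existing status, take interval, score and metadata from the injected record) is only an assumption about what db.injector.update does.

```java
import java.util.HashMap;
import java.util.Map;

public class InjectMergeSketch {

    /** Simplified stand-in for a crawl db entry; NOT the Nutch CrawlDatum class. */
    static final class Record {
        final String status;
        final int intervalSeconds;
        final float score;
        final Map<String, String> metadata;

        Record(String status, int intervalSeconds, float score, Map<String, String> metadata) {
            this.status = status;
            this.intervalSeconds = intervalSeconds;
            this.score = score;
            this.metadata = new HashMap<>(metadata); // defensive copy
        }
    }

    /**
     * Decide which record ends up in the crawl db for one URL.
     * Hypothetical helper; the real decision is made in the injector's merge step.
     */
    static Record merge(Record existing, Record injected, boolean overwrite, boolean update) {
        if (existing == null) {
            return injected;            // new URL: nothing to overwrite or update
        }
        if (injected == null) {
            return existing;            // URL not in the seed list: keep as is
        }
        if (overwrite) {
            return injected;            // replace the whole record
        }
        if (update) {
            // Assumption: update keeps the fetch status and merges injected values in.
            Map<String, String> merged = new HashMap<>(existing.metadata);
            merged.putAll(injected.metadata);
            return new Record(existing.status, injected.intervalSeconds, injected.score, merged);
        }
        return existing;                // neither switch set: existing record wins
    }

    public static void main(String[] args) {
        // Values taken from the two readdb dumps quoted in the issue.
        Record existing = new Record("db_fetched", 604800, 0.0f,
                Map.of("Content-Type", "text/html"));
        Record injected = new Record("db_unfetched", 14000, 1.0f,
                Map.of("fixedInterval", "14000.0"));

        Record result = merge(existing, injected, true, false);
        System.out.println("Status: " + result.status);
        System.out.println("Retry interval: " + result.intervalSeconds + " seconds");
        System.out.println("Score: " + result.score);
    }
}
```

With overwrite set to true and update set to false, as in the second log, the sketch returns the injected record unchanged, which matches the db_unfetched entry with the 14000 second interval shown in the issue.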