Markus Jelsma created NUTCH-1430:
------------------------------------

             Summary: Freegenerator records overwrite CrawlDB records with AdaptiveFetchSchedule
                 Key: NUTCH-1430
                 URL: https://issues.apache.org/jira/browse/NUTCH-1430
             Project: Nutch
          Issue Type: Bug
          Components: crawldb
    Affects Versions: 1.5
            Reporter: Markus Jelsma
            Priority: Critical
             Fix For: 1.6


Steps to reproduce:

Without AdaptiveFetchSchedule:

{code}
$ bin/nutch readdb crawl/crawldb/ -url http://www.openindex.io/en/home.html
URL: http://www.openindex.io/en/home.html
Version: 7
Status: 2 (db_fetched)
Fetch time: Thu Aug 16 13:58:23 CEST 2012
Modified time: Thu Jan 01 01:00:00 CET 1970
Retries since fetch: 0
Retry interval: 2592000 seconds (30 days)
Score: 0.0
Signature: c2601ca503f2fc5edcb286501d7fb271
Metadata: Content-Type: text/html_pst_: success(1), lastModified=0
{code}

With AdaptiveFetchSchedule:

{code}
$ bin/nutch readdb crawl/crawldb/ -url http://www.openindex.io/en/home.html
URL: http://www.openindex.io/en/home.html
Version: 7
Status: 2 (db_fetched)
Fetch time: Tue Jul 17 13:56:33 CEST 2012
Modified time: Tue Jul 17 13:55:33 CEST 2012
Retries since fetch: 0
Retry interval: 60 seconds (0 days)
Score: 0.0
Signature: 23567bb52ee8b905b8649c4305ed82ee
Metadata: Content-Type: text/html_pst_: success(1), lastModified=0
{code}
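
For anyone trying to reproduce this: the two runs above presumably differ only in the configured fetch schedule class. The fragment below is a sketch of how that switch is usually made in conf/nutch-site.xml, assuming the stock property names from nutch-default.xml. The configuration actually used for this report is not shown, so treat the values as assumptions.

{code:xml}
<!-- Assumed override in conf/nutch-site.xml; not taken from the report. -->
<property>
  <name>db.fetch.schedule.class</name>
  <!-- Stock default is org.apache.nutch.crawl.DefaultFetchSchedule -->
  <value>org.apache.nutch.crawl.AdaptiveFetchSchedule</value>
</property>
{code}

If the adaptive settings were left at their stock values, the 60 second retry interval in the second output would be consistent with the default of db.fetch.schedule.adaptive.min_interval, and the 2592000 second interval in the first output with db.fetch.interval.default.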


        
