[ 
https://issues.apache.org/jira/browse/NUTCH-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279842#comment-15279842
 ] 

Jurian Broertjes commented on NUTCH-2242:
-----------------------------------------

Hi Sebastian, I put this in the reduce() function because that is where the 
generic modified/not-modified check is done. I think it makes sense to call 
setModifiedTime() there, together with setSignature().
The call in DefaultFetchSchedule only sets the modified time on the 
first successful fetch.
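
To illustrate the idea, here is a minimal, self-contained sketch and not the
actual patch. Datum is a hypothetical stand-in for CrawlDatum, and merge() is
a simplified stand-in for the reduce() logic. It assumes that a modification
is detected by comparing signatures only (the protocol-level check is left
out), that the fetch time is used as the new modified time, and, for brevity,
that a first fetch is treated like any other modification.

```java
import java.util.Arrays;

public class ModifiedTimeSketch {

    /** Hypothetical stand-in for CrawlDatum; only the fields this sketch needs. */
    static class Datum {
        byte[] signature;   // content signature, null if never fetched
        long modifiedTime;  // 0 means "never set"
        long fetchTime;
    }

    /** Helper to build a freshly fetched entry (hypothetical, for illustration). */
    static Datum datum(byte[] signature, long fetchTime) {
        Datum d = new Datum();
        d.signature = signature;
        d.fetchTime = fetchTime;
        return d;
    }

    /**
     * Simplified reduce-style merge of the stored entry with a new fetch result.
     * The modified time is updated in the same place as the signature, so both
     * always change together.
     */
    static Datum merge(Datum old, Datum fetched) {
        Datum result = new Datum();
        result.fetchTime = fetched.fetchTime;

        // Assumption: "modified" means there is no old entry or the signature differs.
        boolean modified =
            old == null || !Arrays.equals(old.signature, fetched.signature);

        if (modified) {
            result.signature = fetched.signature;
            // Assumption: the fetch time serves as the modification time.
            result.modifiedTime = fetched.fetchTime;
        } else {
            // Unchanged document: keep the previous signature and modified time.
            result.signature = old.signature;
            result.modifiedTime = old.modifiedTime;
        }
        return result;
    }

    public static void main(String[] args) {
        Datum first = merge(null, datum(new byte[] {1}, 1000L));
        Datum second = merge(first, datum(new byte[] {1}, 2000L));
        Datum third = merge(second, datum(new byte[] {2}, 3000L));
        System.out.println(first.modifiedTime);   // 1000 (first fetch)
        System.out.println(second.modifiedTime);  // 1000 (same signature)
        System.out.println(third.modifiedTime);   // 3000 (signature changed)
    }
}
```

Keeping both updates in one branch means the modified time cannot lag behind
the signature, which is the second problem described in the issue.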

> lastModified not always set
> ---------------------------
>
>                 Key: NUTCH-2242
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2242
>             Project: Nutch
>          Issue Type: Bug
>          Components: crawldb
>    Affects Versions: 1.11
>            Reporter: Jurian Broertjes
>            Priority: Minor
>             Fix For: 1.12
>
>         Attachments: NUTCH-2242.patch
>
>
> I observed two issues:
> - When using the DefaultFetchSchedule, CrawlDatum's modifiedTime field is not 
> updated on the first successful fetch. 
> - When a document modification is detected (protocol- or signature-wise), the 
> modifiedTime isn't updated.
> I can provide a patch later today.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
