[
https://issues.apache.org/jira/browse/NUTCH-246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12483003
]
Michael Gillis commented on NUTCH-246:
--
I don't see any commit on this issue -- is it actually fixed in 0.9? It l
[
http://issues.apache.org/jira/browse/NUTCH-246?page=comments#action_12374279 ]
Doug Cutting commented on NUTCH-246:
Looks good, although I think I'd put the setFetchTime in the mapper, where the
CrawlDatum is constructed, rather than in the reducer.
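
A minimal sketch of what stamping the fetch time in the mapper might look like. This is not the actual Nutch patch: `Datum` is a hypothetical stand-in for CrawlDatum, a plain `Map` stands in for the job configuration, and the property name `example.injector.current.time` is invented for illustration. It assumes the submitting process stores one timestamp in the configuration so that every task uses the same value regardless of its local clock.

```java
import java.util.HashMap;
import java.util.Map;

public class InjectTimeSketch {
    // Hypothetical stand-in for Nutch's CrawlDatum; only the fetch time is modeled.
    public static final class Datum {
        private long fetchTime;
        public void setFetchTime(long t) { fetchTime = t; }
        public long getFetchTime() { return fetchTime; }
    }

    // Hypothetical property name, not taken from the Nutch source.
    public static final String CUR_TIME_KEY = "example.injector.current.time";

    // Mapper-side step: construct the datum and stamp it immediately,
    // using the time the submitting process stored in the configuration.
    // The url argument is unused here; a real mapper would key its output on it.
    public static Datum map(String url, Map<String, String> conf) {
        long curTime = Long.parseLong(conf.get(CUR_TIME_KEY));
        Datum datum = new Datum();
        datum.setFetchTime(curTime);
        return datum;
    }

    public static void main(String[] args) {
        // The submitter reads its own clock once and shares the value.
        Map<String, String> conf = new HashMap<>();
        conf.put(CUR_TIME_KEY, Long.toString(System.currentTimeMillis()));

        Datum a = map("http://example.com/a", conf);
        Datum b = map("http://example.com/b", conf);
        // Both records carry the same fetch time because neither reads a local clock.
        System.out.println(a.getFetchTime() == b.getFetchTime());
    }
}
```

With the timestamp set where the datum is created, a reducer in this sketch would only need to pass the value through.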
[
http://issues.apache.org/jira/browse/NUTCH-246?page=comments#action_12374272 ]
Doug Cutting commented on NUTCH-246:
> It seems like the Injector should be loading the current time from a job
> configuration property in the same way that the Gener
[
http://issues.apache.org/jira/browse/NUTCH-246?page=comments#action_12374253 ]
Chris Schneider commented on NUTCH-246:
---
As it turns out, this problem was due to a time synchronization problem between the
jobtracker and the tasktrackers. When the URLs were i
[
http://issues.apache.org/jira/browse/NUTCH-246?page=comments#action_12374049 ]
Chris Schneider commented on NUTCH-246:
---
A few more details:
Stefan and I were able to reproduce this problem using either an injection set
of 4500 URLs or a larger set