Hi,
The more I look at CrawlDbReducer the less I like the method it uses to
select the most recent records.
Andrzej Bialecki wrote:
This selection is primarily made in the while() loop in
CrawlDbReducer:45. My main objection is that selecting the highest
value (meaning most recent) relies on the fact that values of status
codes in CrawlDatum are ordered according to their meaning, and they are