CrawlDbReducer - selecting data for DB update

2006-04-07, Andrzej Bialecki
Hi, the more I look at CrawlDbReducer, the less I like the method it uses to select the most recent records. This selection is primarily made in the while() loop in CrawlDbReducer:45. My main objection is that selecting the highest value (meaning most recent) relies on the fact that values

Re: CrawlDbReducer - selecting data for DB update

2006-04-07, Doug Cutting
Andrzej Bialecki wrote:
> This selection is primarily made in the while() loop in CrawlDbReducer:45. My main objection is that selecting the highest value (meaning most recent) relies on the fact that values of status codes in CrawlDatum are ordered according to their meaning, and they are
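
To illustrate the objection quoted above, the sketch below contrasts picking the numerically highest status code with picking by an explicitly declared precedence. It is a minimal, hypothetical illustration and not the actual CrawlDbReducer code. The class name, the constants (UNFETCHED, FETCHED, GONE, RETRY), their numeric values, and the precedence table are all assumptions made up for this example; they are not the real CrawlDatum constants.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names or values come from Nutch itself.
final class StatusSelectionSketch {
    // Made-up status codes. RETRY was "added later" and so received the
    // highest number, although it should not outrank FETCHED or GONE.
    static final int UNFETCHED = 1;
    static final int FETCHED = 2;
    static final int GONE = 3;
    static final int RETRY = 4;

    // Implicit ordering: the largest numeric code wins.
    // Only correct while numeric order happens to match the intended meaning.
    // Assumes a non-empty list.
    static int selectByMaxCode(List<Integer> statuses) {
        int best = statuses.get(0);
        for (int s : statuses) {
            if (s > best) {
                best = s;
            }
        }
        return best;
    }

    // Explicit ordering: precedence is declared in one place,
    // independent of the numeric values of the codes.
    static int precedence(int status) {
        switch (status) {
            case UNFETCHED: return 0;
            case RETRY:     return 1;
            case FETCHED:   return 2;
            case GONE:      return 3;
            default:
                throw new IllegalArgumentException("unknown status " + status);
        }
    }

    // Picks the status with the highest declared precedence.
    // Assumes a non-empty list.
    static int selectByPrecedence(List<Integer> statuses) {
        int best = statuses.get(0);
        for (int s : statuses) {
            if (precedence(s) > precedence(best)) {
                best = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(FETCHED, RETRY);
        System.out.println(selectByMaxCode(values));    // prints 4 (RETRY)
        System.out.println(selectByPrecedence(values)); // prints 2 (FETCHED)
    }
}
```

With these made-up values, the two selectors disagree as soon as a code's numeric value stops reflecting its intended rank, which is the kind of coupling the original message objects to. Whether an explicit precedence table is the right remedy for CrawlDbReducer is not something this thread excerpt establishes.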