Julien Nioche created NUTCH-1777:
------------------------------------

             Summary: Fetcher not getting all the entries in input
                 Key: NUTCH-1777
                 URL: https://issues.apache.org/jira/browse/NUTCH-1777
             Project: Nutch
          Issue Type: Bug
          Components: fetcher
    Affects Versions: 2.2.1
            Reporter: Julien Nioche
             Fix For: 2.3

See comments in [NUTCH-1714]:

bq. The Generator marks 50K entries with GENERATE_MARK but the Fetcher shows only 49,461 as Map Input Records (and the same number as Reduce input records) => looks like we are not getting all the records we should be getting. I dumped the content of the table pre-fetching and it contains the right number of entries, i.e. 50K.

This was noticed after applying [NUTCH-1714] and [NUTCH-1674] but could also have been the case before that.

--
This message was sent by Atlassian JIRA
(v6.2#6252)