[ https://issues.apache.org/jira/browse/PIG-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886362#action_12886362 ]
Richard Ding commented on PIG-1389:
-----------------------------------

Locally ran and passed core unit tests.

> Implement Pig counter to track number of rows for each input files
> -------------------------------------------------------------------
>
>                 Key: PIG-1389
>                 URL: https://issues.apache.org/jira/browse/PIG-1389
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.7.0
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>             Fix For: 0.8.0
>
>         Attachments: PIG-1389.patch, PIG-1389.patch, PIG-1389_1.patch, PIG-1389_2.patch
>
>
> A MR job generated by Pig not only can have multiple outputs (in the case of
> multiquery) but also can have multiple inputs (in the case of join or
> cogroup). In both cases, the existing Hadoop counters (e.g.
> MAP_INPUT_RECORDS, REDUCE_OUTPUT_RECORDS) can not be used to count the number
> of records in the given input or output. PIG-1299 addressed the case of
> multiple outputs. We need to add new counters for jobs with multiple inputs.
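
To illustrate the problem described in the issue, here is a minimal sketch in plain Python. It does not use Hadoop or Pig and is not the approach taken in the attached patches. It only shows why a single aggregate counter (analogous to MAP_INPUT_RECORDS) cannot say how many records came from each input of a join or cogroup, while one counter per input can. The function name `count_records`, the counter name format, and the file names are hypothetical.

```python
from collections import Counter


def count_records(inputs):
    """Count records across several inputs of one simulated job.

    inputs: mapping of input path -> iterable of records.
    Returns (aggregate, per_input), where aggregate plays the role of a
    single job-wide counter and per_input holds one counter per input.
    """
    per_input = Counter()
    aggregate = 0
    for path, records in inputs.items():
        for _ in records:
            # A job-wide counter only ever sees the combined total.
            aggregate += 1
            # Hypothetical counter name: one counter keyed by input path.
            per_input[f"Input records from {path}"] += 1
    return aggregate, per_input


# Two inputs feeding one simulated job, as in a join or cogroup.
inputs = {
    "users.txt": ["alice", "bob", "carol"],
    "orders.txt": ["o1", "o2"],
}
total, per_input = count_records(inputs)

print("aggregate:", total)
for name, n in sorted(per_input.items()):
    print(name, n)
```

The aggregate alone cannot be split back into its two contributions; the per-input counters keep that information.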