[ https://issues.apache.org/jira/browse/PIG-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044291#comment-15044291 ]
Rohini Palaniswamy commented on PIG-4757:
-----------------------------------------

Another thing to fix: for HBase inputs, no bytes read are reported with Tez, although the value is displayed in MR.

> Job stats on successfully read/output records wrong with multiple
> inputs/outputs
> --------------------------------------------------------------------------------
>
>                 Key: PIG-4757
>                 URL: https://issues.apache.org/jira/browse/PIG-4757
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>            Reporter: Rohini Palaniswamy
>            Assignee: Daniel Dai
>             Fix For: 0.16.0
>
>
> TezVertexStats uses TaskCounter.INPUT_RECORDS_PROCESSED to display records
> read from MRInput. But in the case of a replicate join or scalar, it also
> includes the replicate join input. We need a Pig-specific counter
> (MULTI_INPUTS_RECORD_COUNTER) in POSimpleTezLoad.
> TezVertexStats uses TaskCounter.OUTPUT_RECORDS to display records stored to
> MROutput if there is a single store. If there are multiple stores, it uses
> MULTI_STORE_RECORD_COUNTER and there are no issues. If there is a single
> store with another output, then the value from OUTPUT_RECORDS is wrong. We
> need to use MULTI_STORE_RECORD_COUNTER in all cases, even when there are not
> multiple stores.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)