[ 
https://issues.apache.org/jira/browse/PIG-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044291#comment-15044291
 ] 

Rohini Palaniswamy commented on PIG-4757:
-----------------------------------------

Another thing to fix: for HBase inputs, no bytes-read value is reported with 
Tez, although it is displayed in MR.

> Job stats on successfully read/output records wrong with multiple 
> inputs/outputs
> --------------------------------------------------------------------------------
>
>                 Key: PIG-4757
>                 URL: https://issues.apache.org/jira/browse/PIG-4757
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>            Reporter: Rohini Palaniswamy
>            Assignee: Daniel Dai
>             Fix For: 0.16.0
>
>
> TezVertexStats uses TaskCounter.INPUT_RECORDS_PROCESSED to display the records 
> read from MRInput. But with a replicated join or a scalar, that counter also 
> includes the replicated join input. We need a Pig-specific counter 
> (MULTI_INPUTS_RECORD_COUNTER) in POSimpleTezLoad.
> TezVertexStats uses TaskCounter.OUTPUT_RECORDS to display the records stored 
> to MROutput if there is a single store. If there are multiple stores, it uses 
> MULTI_STORE_RECORD_COUNTER and there are no issues. If there is a single 
> store together with another output, the value from OUTPUT_RECORDS is wrong. 
> We need to use MULTI_STORE_RECORD_COUNTER in all cases, even when there is 
> only a single store.
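
To illustrate the input-side problem described above, here is a minimal, 
self-contained sketch. It does not use the Tez or Pig APIs: the class 
CounterSketch, its methods, the input names, and the counter key strings are 
all hypothetical stand-ins. It only shows why a vertex-wide record counter 
overcounts when a vertex has a load input plus a replicated join input, and 
how a per-load counter avoids that.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a vertex's counters; NOT the Tez or Pig API.
public class CounterSketch {
    // Stand-in for the vertex-wide TaskCounter.INPUT_RECORDS_PROCESSED.
    static final String INPUT_RECORDS_PROCESSED = "INPUT_RECORDS_PROCESSED";
    // Stand-in prefix for a Pig-specific per-load counter (assumed naming).
    static final String MULTI_INPUTS_RECORD_COUNTER = "MULTI_INPUTS_RECORD_COUNTER_";

    // Every input bumps the vertex-wide counter; only real loads
    // also bump their own per-load counter.
    static void recordInput(Map<String, Long> counters, String inputName,
                            boolean isLoad, long records) {
        counters.merge(INPUT_RECORDS_PROCESSED, records, Long::sum);
        if (isLoad) {
            counters.merge(MULTI_INPUTS_RECORD_COUNTER + inputName, records, Long::sum);
        }
    }

    // What the stats would show if they relied on the vertex-wide counter.
    static long vertexWideRecords(Map<String, Long> counters) {
        return counters.getOrDefault(INPUT_RECORDS_PROCESSED, 0L);
    }

    // What the stats would show with a per-load counter.
    static long recordsForLoad(Map<String, Long> counters, String inputName) {
        return counters.getOrDefault(MULTI_INPUTS_RECORD_COUNTER + inputName, 0L);
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new HashMap<>();
        recordInput(counters, "load1", true, 100L);       // records read by the load
        recordInput(counters, "replicated", false, 20L);  // replicated join input
        System.out.println(vertexWideRecords(counters));       // prints 120
        System.out.println(recordsForLoad(counters, "load1")); // prints 100
    }
}
```

The same reasoning would apply on the output side, where a single store that 
shares the vertex with another output inflates a vertex-wide output counter.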



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
