Rohini Palaniswamy created PIG-4483:
---------------------------------------

             Summary: Pig on Tez output statistics shows storing to same directory twice for union
                 Key: PIG-4483
                 URL: https://issues.apache.org/jira/browse/PIG-4483
             Project: Pig
          Issue Type: Improvement
    Affects Versions: 0.14.0
            Reporter: Rohini Palaniswamy


For the script below:

A = LOAD 'data1';
B = LOAD 'data2';
C = UNION A, B;
STORE C into 'data3';

Because the union uses a vertex group and each vertex stores to the output 
separately, the output message is shown as follows:

Successfully stored 10 records (xxx bytes) in: "data3"
Successfully stored 20 records (yyy bytes) in: "data3"

Although this is correct, it can confuse users, who have to sum the counts 
before comparing them with the Pig on MR output message. OutputStats entries 
with the same filename should be combined and shown as

Successfully stored 30 records (xxx + yyy bytes) in: "data3"
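
The combining step could look roughly like the sketch below. It is only an illustration of grouping by output location and summing the counters, not Pig's actual code: the StoreStat class, the combineByLocation method, and the byte counts (100 and 250) are hypothetical stand-ins, and the real OutputStats class may expose different fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CombineOutputStats {

    // Hypothetical stand-in for one "Successfully stored ..." entry.
    // This is NOT Pig's OutputStats class.
    static final class StoreStat {
        final String location;
        final long records;
        final long bytes;

        StoreStat(String location, long records, long bytes) {
            this.location = location;
            this.records = records;
            this.bytes = bytes;
        }
    }

    // Merge entries that share an output location by summing records and bytes.
    // LinkedHashMap keeps the order in which locations were first seen.
    static List<StoreStat> combineByLocation(List<StoreStat> stats) {
        Map<String, StoreStat> merged = new LinkedHashMap<>();
        for (StoreStat s : stats) {
            merged.merge(s.location, s, (a, b) ->
                new StoreStat(a.location, a.records + b.records, a.bytes + b.bytes));
        }
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        // Two vertices of the vertex group storing into the same directory.
        // The byte counts are made-up illustrative values.
        List<StoreStat> perVertex = Arrays.asList(
            new StoreStat("data3", 10, 100),
            new StoreStat("data3", 20, 250));

        for (StoreStat s : combineByLocation(perVertex)) {
            System.out.println("Successfully stored " + s.records + " records ("
                + s.bytes + " bytes) in: \"" + s.location + "\"");
        }
    }
}
```

With these sample values the two per-vertex entries collapse into a single line reporting 30 records, matching what Pig on MR would print for the same script.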



