[ https://issues.apache.org/jira/browse/PIG-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111798#comment-15111798 ]
liyunzhang_intel commented on PIG-4784:
---------------------------------------

In MR mode, the Hadoop API (https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/mapred/Task.Counter.html) provides the MAP_INPUT_RECORDS and REDUCE_OUTPUT_RECORDS counters. With multiple inputs or outputs, however, there is no Hadoop API that reports MAP_INPUT_RECORDS and REDUCE_OUTPUT_RECORDS for each file.

With multiple inputs in MR mode, Pig increments a counter each time it reads a record (https://github.com/apache/pig/blob/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigRecordReader.java#L148) of an input file. With multiple outputs in MR mode, Pig increments a counter each time it gets a result from POStore (https://github.com/apache/pig/blob/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POStore.java#L170). So in MR mode, "pig.disable.counter" only applies to the multiple-inputs and multiple-outputs case.

In Spark mode, there is no Spark API to count the input and output records, even for a single input and output. In PIG-4655 and PIG-4634 we implemented the counters ourselves. So in Spark mode, whether there is a single input or multiple inputs, the counters are disabled and the reported number of input and output records is always -1 when pig.disable.counter is true.

> Enable "pig.disable.counter" for spark engine
> ---------------------------------------------
>
>                 Key: PIG-4784
>                 URL: https://issues.apache.org/jira/browse/PIG-4784
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-4784.patch
>
>
> When you set pig.disable.counter to "true" in conf/pig.properties, the
> counter to calculate the number of input records and output records will be
> disabled.
> Following unit tests are designed to test it but now they fail:
> org.apache.pig.test.TestPigRunner#testDisablePigCounters
> org.apache.pig.test.TestPigRunner#testDisablePigCounters2

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
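
To illustrate the behaviour described in the comment, here is a minimal Java sketch of a record counter guarded by a configuration flag. It is not Pig's implementation: the class DisableCounterSketch and the method countRecords are hypothetical names invented for this example. Only the property name pig.disable.counter and the -1 result for a disabled counter come from the issue text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class DisableCounterSketch {
    // Property name taken from the issue text.
    static final String DISABLE_COUNTER = "pig.disable.counter";
    // Value reported when counting is disabled, as stated in the comment.
    static final long UNKNOWN = -1L;

    /**
     * Hypothetical helper (not a Pig API): counts records one by one
     * unless the property is set to "true", in which case it returns -1.
     */
    static long countRecords(Properties conf, List<String> records) {
        boolean disabled =
            Boolean.parseBoolean(conf.getProperty(DISABLE_COUNTER, "false"));
        if (disabled) {
            return UNKNOWN;
        }
        long count = 0;
        for (String ignored : records) {
            count++; // one increment per record read
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c");
        Properties conf = new Properties();

        // Counter enabled by default: every record is counted.
        System.out.println(countRecords(conf, input)); // prints 3

        // Counter disabled: the record number is reported as -1.
        conf.setProperty(DISABLE_COUNTER, "true");
        System.out.println(countRecords(conf, input)); // prints -1
    }
}
```

The two disabled-counter tests named in the issue presumably check for this kind of -1 result, but the exact assertions they make are not shown in this message.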