[ 
https://issues.apache.org/jira/browse/TEZ-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15566278#comment-15566278
 ] 

Kuhu Shukla commented on TEZ-3459:
----------------------------------

I ran the jar and think I know what the issue is.
The framework, file system, and other counters do show up in the task-level 
and DAG-level counters when running in yarn-tez mode, as shown below (values 
have been overwritten):
{code}
[INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Final Counters for 
attempt_123_0001_1_00_000000_0: Counters: 28 [[File System Counters 
FILE_BYTES_READ=123, FILE_BYTES_WRITTEN=123, FILE_READ_OPS=0, 
FILE_LARGE_READ_OPS=0, FILE_WRITE_OPS=0, HDFS_BYTES_READ=1234, 
HDFS_BYTES_WRITTEN=0, HDFS_READ_OPS=2, HDFS_LARGE_READ_OPS=0, 
HDFS_WRITE_OPS=0][org.apache.tez.common.counters.TaskCounter 
SPLIT_RAW_BYTES=123, SPILLED_RECORDS=123, GC_TIME_MILLIS=123, 
CPU_MILLISECONDS=123, PHYSICAL_MEMORY_BYTES=123456, 
VIRTUAL_MEMORY_BYTES=123456, COMMITTED_HEAP_BYTES=12345678, 
INPUT_RECORDS_PROCESSED=0, INPUT_SPLIT_LENGTH_BYTES=1234, OUTPUT_RECORDS=123, 
OUTPUT_BYTES=123, OUTPUT_BYTES_WITH_OVERHEAD=123, OUTPUT_BYTES_PHYSICAL=12, 
ADDITIONAL_SPILLS_BYTES_WRITTEN=0, ADDITIONAL_SPILLS_BYTES_READ=0, 
ADDITIONAL_SPILL_COUNT=0, 
SHUFFLE_CHUNK_COUNT=1][example.MapredColorCount$ColorCounter INPUT_RECORDS=100]]
{code}
{code}
[INFO] [TezChild] |runtime.LogicalIOProcessorRuntimeTask|: Final Counters for 
attempt_123_0001_1_01_000000_0: Counters: 44 [[File System Counters 
FILE_BYTES_READ=123, FILE_BYTES_WRITTEN=0, FILE_READ_OPS=0, 
FILE_LARGE_READ_OPS=0, FILE_WRITE_OPS=0, HDFS_BYTES_READ=0, 
HDFS_BYTES_WRITTEN=123, HDFS_READ_OPS=4, HDFS_LARGE_READ_OPS=0, 
HDFS_WRITE_OPS=2][org.apache.tez.common.counters.TaskCounter 
REDUCE_INPUT_GROUPS=12, REDUCE_INPUT_RECORDS=1234, COMBINE_INPUT_RECORDS=0, 
SPILLED_RECORDS=123, NUM_SHUFFLED_INPUTS=5, NUM_SKIPPED_INPUTS=0, 
NUM_FAILED_SHUFFLE_INPUTS=0, MERGED_MAP_OUTPUTS=5, GC_TIME_MILLIS=12, 
CPU_MILLISECONDS=1234, PHYSICAL_MEMORY_BYTES=12345678, 
VIRTUAL_MEMORY_BYTES=12345678, COMMITTED_HEAP_BYTES=12345678, OUTPUT_RECORDS=7, 
ADDITIONAL_SPILLS_BYTES_WRITTEN=0, ADDITIONAL_SPILLS_BYTES_READ=123, 
SHUFFLE_BYTES=123, SHUFFLE_BYTES_DECOMPRESSED=1234, SHUFFLE_BYTES_TO_MEM=0, 
SHUFFLE_BYTES_TO_DISK=0, SHUFFLE_BYTES_DISK_DIRECT=123, 
NUM_MEM_TO_DISK_MERGES=0, NUM_DISK_TO_DISK_MERGES=0, SHUFFLE_PHASE_TIME=12, 
MERGE_PHASE_TIME=123, FIRST_EVENT_RECEIVED=12, LAST_EVENT_RECEIVED=12][Shuffle 
Errors BAD_ID=0, CONNECTION=0, IO_ERROR=0, WRONG_LENGTH=0, WRONG_MAP=0, 
WRONG_REDUCE=0][example.MapredColorCount$ColorCounter OUTPUT_RECORDS=7]]
{code}

The issue is that when we try to retrieve those counters, the Tez 
{{YarnRunner}} returns an empty counter object:
{code}
  public Counters getJobCounters(JobID jobId)
      throws IOException, InterruptedException {
    // FIXME needs counters support from DAG
    // with a translation layer on client side
    Counters empty = new Counters();
    return empty;
  }
{code}
Hence, when we try to get the custom counter, the lookup finds nothing, so the 
counter is created as a new counter initialized to zero, as per 
{{AbstractCounterGroup}}:
{code}
  private synchronized T findCounterImpl(String counterName, boolean create) {
    T counter = counters.get(counterName);
    if (counter == null && create) {
      String localized =
          ResourceBundles.getCounterName(getName(), counterName, counterName);
      return addCounterImpl(counterName, localized, 0);
    }
    return counter;
  }
{code}
This is true even for framework counters, because the counters map above is 
empty in the Tez case.
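
To illustrate the effect, here is a minimal, self-contained sketch of the lookup-or-create behavior quoted above. It is a simplified model, not the Hadoop or Tez code: the class {{EmptyCounterLookupSketch}}, its nested {{CounterGroup}}, and the {{increment}} helper are hypothetical names, and plain {{Long}} values stand in for real counter objects.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a counter group; NOT the Hadoop AbstractCounterGroup.
public class EmptyCounterLookupSketch {

    static final class CounterGroup {
        private final Map<String, Long> counters = new HashMap<>();

        // Same shape as the quoted lookup: a missing counter is created
        // with value 0 when 'create' is true, otherwise null is returned.
        synchronized Long findCounter(String name, boolean create) {
            Long value = counters.get(name);
            if (value == null && create) {
                counters.put(name, 0L);
                return 0L;
            }
            return value;
        }

        // Hypothetical helper: adds 'delta' to the named counter.
        synchronized void increment(String name, long delta) {
            counters.merge(name, delta, Long::sum);
        }
    }

    public static void main(String[] args) {
        // Task side: the counter was really incremented.
        CounterGroup taskSide = new CounterGroup();
        taskSide.increment("INPUT_RECORDS", 100L);

        // Client side: models a group that starts out empty, as it would
        // when the job counters object handed to the client is empty.
        CounterGroup clientSide = new CounterGroup();

        System.out.println("task side:   " + taskSide.findCounter("INPUT_RECORDS", true));
        System.out.println("client side: " + clientSide.findCounter("INPUT_RECORDS", true));
    }
}
```

In this model the task-side lookup returns 100, while the client-side lookup silently creates the counter and returns 0. That matches the symptom in the report: the value exists in the task logs, but the client sees zero.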

Asking [~hitesh] whether this triage makes sense and for comments on a 
possible fix. The Mapred YarnRunner equivalent uses getCountersProto to get the 
job counters.

> Issues running M/R jobs with Tez
> --------------------------------
>
>                 Key: TEZ-3459
>                 URL: https://issues.apache.org/jira/browse/TEZ-3459
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Manuel Godbert
>         Attachments: colorCount.sh, mr-example.jar
>
>
> After applying the patch delivered in TEZ-3330, I enriched the 
> MapredColorCount example to reproduce some of the other issues I encountered 
> on the jobs I wish to see running with Tez.
> I am attaching a jar to the JIRA, including source code, and a script file 
> detailing the observed results in comments.
> It addresses 3 issues:
> - the embedded jars in /lib are ignored by Tez, but YARN uses them without 
> additional configuration
> - The use of a combiner causes a NullPointerException
> - The counters incremented in the Reporter objects stay at 0
> I am using HDP2.4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
