FILE_BYTES_READ - Represents the data read from local disk.
HDFS_BYTES_READ - Represents the data read from HDFS (does not include data read from local disk).
SHUFFLE_BYTES - Represents the data transferred over the wire during shuffle. Downloaded data goes either into memory or to disk, depending on memory availability, so SHUFFLE_BYTES_TO_MEM and SHUFFLE_BYTES_TO_DISK correlate with SHUFFLE_BYTES. It has no direct relationship with FILE_BYTES_READ. However, when spills and merges occur, FILE_BYTES_READ can be incremented correspondingly.

~Rajesh.B

On Wed, Jul 8, 2015 at 1:25 PM, Joe Zhang (SDE) <[email protected]> wrote:

> Hi Tez experts,
>
> I am using the Tez REST API to get information about running Tez tasks, but I am confused about some of the counter concepts.
>
> <1> For file system counters:
>
> counterName: FILE_BYTES_READ? Does it mean data read from local disk or from somewhere else?
>
> HDFS_BYTES_READ? Is it included in FILE_BYTES_READ?
>
> <2> For org.apache.tez.common.counters.TaskCounter:
>
> counterName: SHUFFLE_BYTES? Does it have some relationship with FILE_BYTES_READ? Which data is included in it?
>
> Best wishes,
> Joe Zhang
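
To illustrate the counter relationships described in the reply, here is a minimal Python sketch. It assumes the counters have already been flattened into a plain name-to-value mapping; that shape, the helper name shuffle_breakdown, and the sample numbers are all hypothetical and are not taken from the Tez REST API. The reply only says the shuffle counters are correlated, so the sketch reports any difference rather than assuming they add up exactly.

```python
def shuffle_breakdown(counters):
    """Summarize shuffle and file-system counters from a flat {name: value} dict.

    The flat dict layout is an assumption made for this sketch; it is not
    the response format of any Tez API.
    """
    total = counters.get("SHUFFLE_BYTES", 0)
    to_mem = counters.get("SHUFFLE_BYTES_TO_MEM", 0)
    to_disk = counters.get("SHUFFLE_BYTES_TO_DISK", 0)
    return {
        "shuffle_total": total,
        "to_mem": to_mem,
        "to_disk": to_disk,
        # Shuffled bytes not covered by the two counters above; may be non-zero.
        "unaccounted": total - to_mem - to_disk,
        # Tracked separately: local-disk reads (spills/merges can raise this).
        "local_disk_read": counters.get("FILE_BYTES_READ", 0),
        # Tracked separately: HDFS reads are not part of FILE_BYTES_READ.
        "hdfs_read": counters.get("HDFS_BYTES_READ", 0),
    }


# Made-up values, for illustration only.
sample = {
    "SHUFFLE_BYTES": 1000,
    "SHUFFLE_BYTES_TO_MEM": 700,
    "SHUFFLE_BYTES_TO_DISK": 300,
    "FILE_BYTES_READ": 450,
    "HDFS_BYTES_READ": 2000,
}

for name, value in shuffle_breakdown(sample).items():
    print(f"{name}: {value}")
```

Keeping FILE_BYTES_READ and HDFS_BYTES_READ as separate fields mirrors the point in the reply that neither counter includes the other.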
