Re: Tez Counter question

2015-07-13 Thread Rajesh Balamohan
ake > a look at? HDFS_BYTES_WRITTEN + OUTPUT_BYTES_PHYSICAL? > > > > Xiaoyong > > > > *From:* Rajesh Balamohan [mailto:rbalamo...@apache.org] > *Sent:* Monday, July 13, 2015 1:46 PM > *To:* user@tez.apache.org > *Cc:* Hitesh Shah; Yifung Lin; Zhaomin Xu; Joe

RE: Tez Counter question

2015-07-13 Thread Xiaoyong Zhu
; Zhaomin Xu; Joe Zhang (SDE) Subject: Re: Tez Counter question For skew analysis, "SHUFFLE_BYTES (fetched from previous vertex) + HDFS_BYTES_READ (read from HDFS)" can be used. Along with this, REDUCE_INPUT_GROUPS & REDUCE_INPUT_RECORDS could give details on data skew. For example, c
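
A rough sketch of how the skew check described above could be scripted, assuming the per-task counters for one vertex have already been collected into plain dictionaries. The helper names (`input_bytes`, `records_per_group`, `skew_report`) and the 2x-median threshold are illustrative assumptions, not part of Tez.

```python
from statistics import median


def input_bytes(counters):
    """Bytes a task consumed: shuffle fetches plus direct HDFS reads."""
    return counters.get("SHUFFLE_BYTES", 0) + counters.get("HDFS_BYTES_READ", 0)


def records_per_group(counters):
    """Average records per key group; None when the task saw no groups."""
    groups = counters.get("REDUCE_INPUT_GROUPS", 0)
    if groups == 0:
        return None
    return counters.get("REDUCE_INPUT_RECORDS", 0) / groups


def skew_report(tasks, threshold=2.0):
    """tasks maps a task id to its {counter name: value} dict (one vertex).

    A task is flagged when its input exceeds threshold * median input.
    The threshold is an arbitrary choice for illustration.
    """
    sizes = {task_id: input_bytes(c) for task_id, c in tasks.items()}
    typical = median(sizes.values())
    skewed = sorted(
        task_id
        for task_id, size in sizes.items()
        if typical > 0 and size > threshold * typical
    )
    return {
        "median_bytes": typical,
        "max_bytes": max(sizes.values()),
        "skewed_tasks": skewed,
    }
```

Comparing each task against the median rather than the mean keeps one very large task from hiding itself by inflating the baseline.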

Re: Tez Counter question

2015-07-12 Thread Rajesh Balamohan
e- > From: Hitesh Shah [mailto:hit...@apache.org] > Sent: Thursday, July 9, 2015 1:39 PM > To: user@tez.apache.org > Cc: Xiaoyong Zhu; Yifung Lin; Zhaomin Xu > Subject: Re: Tez Counter question > > For data skew, you may also want to consider enabling " > tez

RE: Tez Counter question

2015-07-12 Thread Joe Zhang (SDE)
or write to HDFS? Best wishes Joe Zhang From: Rajesh Balamohan [mailto:rbalamo...@apache.org] Sent: Wednesday, July 8, 2015 4:57 PM To: user@tez.apache.org Cc: Xiaoyong Zhu; Yifung Lin Subject: Re: Tez Counter question FILE_BYTES_READ - Represents the data read fr

RE: Tez Counter question

2015-07-12 Thread Xiaoyong Zhu
: Tez Counter question For data skew, you may also want to consider enabling "tez.task.generate.counters.per.io". This enables counters on a per edge basis which is more helpful for complex DAGs. - Hitesh On Jul 8, 2015, at 10:29 PM, Joe Zhang (SDE) wrote: > Hi Rajesh: > &

Re: Tez Counter question

2015-07-08 Thread Hitesh Shah
so on. So I want to know which counter value is meaningful > for analyzing data skew ? > > Best wishes > Joe zhang > > From: Rajesh Balamohan [mailto:rbalamo...@apache.org] > Sent: Wednesday, July 8, 2015 4:57 PM > To: user@tez.apache.org > Cc: Xiaoyong

RE: Tez Counter question

2015-07-08 Thread Joe Zhang (SDE)
, SHUFFLE_BYTES and so on. So I want to know which counter value is meaningful for analyzing data skew? Best wishes Joe Zhang From: Rajesh Balamohan [mailto:rbalamo...@apache.org] Sent: Wednesday, July 8, 2015 4:57 PM To: user@tez.apache.org Cc: Xiaoyong Zhu; Yifung Lin Subject: Re: Tez Counter

RE: Tez Counter question

2015-07-08 Thread Xiaoyong Zhu
(FILE_BYTES_READ + HDFS_BYTES_READ + SHUFFLE_BYTES)? Thanks! Xiaoyong From: Rajesh Balamohan [mailto:rbalamo...@apache.org] Sent: Wednesday, July 8, 2015 5:15 PM To: user@tez.apache.org Subject: Re: Tez Counter question Correct. In case processor chooses to read some additional data from HDFS (as a

Re: Tez Counter question

2015-07-08 Thread Rajesh Balamohan
pache.org] > *Sent:* Wednesday, July 8, 2015 4:57 PM > *To:* user@tez.apache.org > *Cc:* Xiaoyong Zhu; Yifung Lin > *Subject:* Re: Tez Counter question > > > > FILE_BYTES_READ - Represents the data read from local disk > > > > HDFS_BYTES_READ

RE: Tez Counter question

2015-07-08 Thread Xiaoyong Zhu
[mailto:rbalamo...@apache.org] Sent: Wednesday, July 8, 2015 4:57 PM To: user@tez.apache.org Cc: Xiaoyong Zhu; Yifung Lin Subject: Re: Tez Counter question FILE_BYTES_READ - Represents the data read from local disk HDFS_BYTES_READ - Represents data read from HDFS (does not include data read from

Re: Tez Counter question

2015-07-08 Thread Rajesh Balamohan
FILE_BYTES_READ - Represents the data read from local disk
HDFS_BYTES_READ - Represents data read from HDFS (does not include data read from disk)
SHUFFLE_BYTES - Represents the data that was transferred over the wire while doing shuffle. Downloaded data either gets into memory or disk (depending

Tez Counter question

2015-07-08 Thread Joe Zhang (SDE)
Hi Tez experts: I am using the Tez REST API to get information about running Tez tasks, but I am confused about some concepts in the counters. <1> For file system counters: counterName: FILE_BYTES_READ - does it mean read from local disk or somewhere else? HDFS_BYTES_READ - i
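
For anyone reading counters out of the REST response, here is a minimal sketch that flattens one task entity into a name-to-value dictionary. The JSON layout it assumes (`otherinfo` -> `counters` -> `counterGroups`, each group holding `counters` entries with `counterName` and `counterValue`) is an assumption about the response shape and should be checked against what your cluster actually returns; `flatten_counters` is a hypothetical helper, not a Tez API.

```python
def flatten_counters(entity):
    """Collapse a task entity's counter groups into {counterName: counterValue}.

    Assumed layout (verify against a real response):
    entity["otherinfo"]["counters"]["counterGroups"][i]["counters"][j]
    """
    groups = (
        entity.get("otherinfo", {})
        .get("counters", {})
        .get("counterGroups", [])
    )
    flat = {}
    for group in groups:
        for counter in group.get("counters", []):
            # If two groups reuse a counter name, the later group wins.
            flat[counter["counterName"]] = counter["counterValue"]
    return flat
```

An entity with no counters yields an empty dictionary instead of raising, which is convenient for tasks that have not reported yet.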