ake
> a look at? HDFS_BYTES_WRITTEN + OUTPUT_BYTES_PHYSICAL?
>
>
>
> Xiaoyong
>
>
>
> *From:* Rajesh Balamohan [mailto:rbalamo...@apache.org]
> *Sent:* Monday, July 13, 2015 1:46 PM
> *To:* user@tez.apache.org
> *Cc:* Hitesh Shah; Yifung Lin; Zhaomin Xu; Joe
; Zhaomin Xu; Joe Zhang (SDE)
Subject: Re: Tez Counter question
For skew analysis, "SHUFFLE_BYTES (fetched from previous vertex) +
HDFS_BYTES_READ (read from HDFS)" can be used. Along with this,
REDUCE_INPUT_GROUPS & REDUCE_INPUT_RECORDS could give details on data skew.
For example, c
e-
> From: Hitesh Shah [mailto:hit...@apache.org]
> Sent: Thursday, July 9, 2015 1:39 PM
> To: user@tez.apache.org
> Cc: Xiaoyong Zhu; Yifung Lin; Zhaomin Xu
> Subject: Re: Tez Counter question
>
> For data skew, you may also want to consider enabling "
> tez
or write to HDFS?
Best wishes
Joe zhang
From: Rajesh Balamohan [mailto:rbalamo...@apache.org]
Sent: Wednesday, July 8, 2015 4:57 PM
To: user@tez.apache.org<mailto:user@tez.apache.org>
Cc: Xiaoyong Zhu; Yifung Lin
Subject: Re: Tez Counter question
FILE_BYTES_READ - Represents the data read fr
: Tez Counter question
For data skew, you may also want to consider enabling
"tez.task.generate.counters.per.io". This enables counters on a per-edge basis,
which is more helpful for complex DAGs.
- Hitesh
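
The skew check suggested in this thread (per-task SHUFFLE_BYTES + HDFS_BYTES_READ) could be sketched roughly as below. This is only an illustration, not code from Tez. It assumes the counters were already pulled out of the REST API response into a plain dict per task. The function names, the threshold of 3x the median, and the sample values are all hypothetical.

```python
from statistics import median


def task_input_bytes(counters):
    """Input volume for one task: bytes shuffled from the previous vertex
    plus bytes read from HDFS. Missing counters are treated as 0."""
    return counters.get("SHUFFLE_BYTES", 0) + counters.get("HDFS_BYTES_READ", 0)


def find_skewed_tasks(tasks, ratio_threshold=3.0):
    """Return {task_id: ratio} for tasks whose input volume is more than
    ratio_threshold times the median input volume of the vertex.

    `tasks` maps a task id to a dict of counter name -> value
    (an assumed shape, not the raw REST API response)."""
    sizes = {tid: task_input_bytes(c) for tid, c in tasks.items()}
    if not sizes:
        return {}
    mid = median(sizes.values())
    if mid == 0:
        return {}  # no meaningful baseline to compare against
    return {
        tid: size / mid
        for tid, size in sizes.items()
        if size / mid > ratio_threshold
    }


# Hypothetical counter values for four tasks of one vertex.
sample_tasks = {
    "task_0": {"SHUFFLE_BYTES": 100, "HDFS_BYTES_READ": 0},
    "task_1": {"SHUFFLE_BYTES": 120, "HDFS_BYTES_READ": 0},
    "task_2": {"SHUFFLE_BYTES": 900, "HDFS_BYTES_READ": 100},
    "task_3": {"SHUFFLE_BYTES": 80, "HDFS_BYTES_READ": 20},
}

skewed = find_skewed_tasks(sample_tasks)
for task_id, ratio in skewed.items():
    print(f"{task_id}: {ratio:.1f}x the median input size")
```

The median is used as the baseline here because a single very large task would pull a mean upward and hide itself. The same comparison could be repeated with REDUCE_INPUT_GROUPS or REDUCE_INPUT_RECORDS in place of the byte counters.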
On Jul 8, 2015, at 10:29 PM, Joe Zhang (SDE) wrote:
> Hi Rajesh:
>
>
so on. So I want to know which counter value is meaningful
> for analyzing data skew?
>
> Best wishes
> Joe zhang
>
> From: Rajesh Balamohan [mailto:rbalamo...@apache.org]
> Sent: Wednesday, July 8, 2015 4:57 PM
> To: user@tez.apache.org
> Cc: Xiaoyong
, SHUFFLE_BYTES
and so on. So I want to know which counter value is meaningful for analyzing
data skew?
Best wishes
Joe zhang
From: Rajesh Balamohan [mailto:rbalamo...@apache.org]
Sent: Wednesday, July 8, 2015 4:57 PM
To: user@tez.apache.org
Cc: Xiaoyong Zhu; Yifung Lin
Subject: Re: Tez Counter
(FILE_BYTES_READ +
HDFS_BYTES_READ + SHUFFLE_BYTES)?
Thanks!
Xiaoyong
From: Rajesh Balamohan [mailto:rbalamo...@apache.org]
Sent: Wednesday, July 8, 2015 5:15 PM
To: user@tez.apache.org
Subject: Re: Tez Counter question
Correct. In case processor chooses to read some additional data from HDFS (as
a
pache.org]
> *Sent:* Wednesday, July 8, 2015 4:57 PM
> *To:* user@tez.apache.org
> *Cc:* Xiaoyong Zhu; Yifung Lin
> *Subject:* Re: Tez Counter question
>
>
>
> FILE_BYTES_READ - Represents the data read from local disk
>
>
>
> HDFS_BYTES_READ
[mailto:rbalamo...@apache.org]
Sent: Wednesday, July 8, 2015 4:57 PM
To: user@tez.apache.org
Cc: Xiaoyong Zhu; Yifung Lin
Subject: Re: Tez Counter question
FILE_BYTES_READ - Represents the data read from local disk
HDFS_BYTES_READ - Represents data read from HDFS (does not include data
read from disk)
SHUFFLE_BYTES - Represents the data that was transferred over the wire
while doing shuffle. Downloaded data either gets into memory or disk
(depending
Hi Tez experts:
I am using the Tez REST API to get information about running Tez tasks, but I am
confused about some of the counter concepts.
<1> For file system counters:
counterName: FILE_BYTES_READ - does it mean data read from local disk or from
somewhere else?
HDFS_BYTES_READ ? i