[ https://issues.apache.org/jira/browse/TEZ-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301205#comment-14301205 ]
Rajesh Balamohan edited comment on TEZ-2021 at 2/2/15 12:27 PM:
----------------------------------------------------------------

At a high level
- Parse tez task logs using tez-tfile-parser in tez-tools
- compute stats to identify slow nodes in the cluster and to understand shuffle performance
-- How much data got transferred between two nodes in the job
-- Time taken for shuffle (from source to destination machines)
-- What was the data transfer rate (min/max/avg) between 2 machines
- Save the parsed data in csv format, so that it can be imported to any tool.
- Provide util scripts to generate heat charts.

was (Author: rajesh.balamohan):
At a high level
- Parse tez task logs using tez-tfile-parser in tez-tools
- compute stats to identify slow nodes in the cluster and to understand shuffle performance
-- How much data got transferred between two nodes in the job
-- Time taken for shuffle (from source to destination machines)
-- What was the data transfer rate (min/max/avg) between 2 machines
- Save the parsed data in csv format, so that it can be imported to any tool.
- Provide util scripts to gererate heat charts.

> Tez tool to analyze shuffle performance in large clusters by mining task logs
> -----------------------------------------------------------------------------
>
>                 Key: TEZ-2021
>                 URL: https://issues.apache.org/jira/browse/TEZ-2021
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>
> Tez tool to analyze shuffle performance in large clusters by mining task
> logs. Provide an easier way to visualize (heat chart) and identify bad nodes
> in large cluster.
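
The statistics listed in the comment (data transferred between two nodes, shuffle time, min/max/avg transfer rate, CSV export) could be computed roughly as in the sketch below. This is an illustration only, not the tool proposed in TEZ-2021: the record layout (source host, destination host, bytes, milliseconds), the sample values, and the names `SAMPLE_RECORDS`, `summarize`, and `to_csv` are all assumptions. The real output of tez-tfile-parser and the Tez task log format are not shown in this message and may look quite different.

```python
import csv
import io
from collections import defaultdict

# Hypothetical, already-parsed fetch records:
# (source_host, destination_host, bytes_transferred, elapsed_millis).
# These tuples are made up for illustration; they are not real Tez log output.
SAMPLE_RECORDS = [
    ("node1", "node2", 1_000_000, 100),
    ("node1", "node2", 4_000_000, 200),
    ("node3", "node2", 2_000_000, 1000),
]

FIELDS = ["source", "destination", "total_bytes", "total_ms",
          "min_mb_per_s", "max_mb_per_s", "avg_mb_per_s"]


def summarize(records):
    """Aggregate fetches per (source, destination) pair."""
    grouped = defaultdict(list)
    for src, dst, nbytes, millis in records:
        if millis <= 0:
            continue  # no measurable duration, so no rate can be computed
        grouped[(src, dst)].append((nbytes, millis))

    rows = []
    for (src, dst), fetches in sorted(grouped.items()):
        # Rate of each individual fetch in MB/s (1 MB = 10**6 bytes here).
        rates = [(b / 1e6) / (ms / 1000.0) for b, ms in fetches]
        rows.append({
            "source": src,
            "destination": dst,
            "total_bytes": sum(b for b, _ in fetches),
            "total_ms": sum(ms for _, ms in fetches),
            "min_mb_per_s": min(rates),
            "max_mb_per_s": max(rates),
            "avg_mb_per_s": sum(rates) / len(rates),
        })
    return rows


def to_csv(rows):
    """Render the summary rows as CSV text that other tools can import."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(summarize(SAMPLE_RECORDS)))
```

A pair whose average rate is far below the others (node3 to node2 in the made-up sample) would be a candidate slow link. The same per-pair rows could feed a source-by-destination grid for a heat chart.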