[
https://issues.apache.org/jira/browse/TEZ-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361379#comment-14361379
]
Rajesh Balamohan edited comment on TEZ-2021 at 3/14/15 12:19 AM:
-
[~jeagles] This has been tested on a small cluster with 20 nodes. It would be
really helpful if you could try it out and share your comments.
* Apply this patch.
* Build tez-tfile-parser in $TEZ/tez-tools/tez-tfile-parser/
** "mvn clean package"
* Populate env.sh in $TEZ/tez-tools/perf-analyzer/shuffle/
** PIG_HOME, TEZ_HOME
** YARN_APP_LOGS_LOCATION
*** Make sure "yarn.log-aggregation-enable" is set to true in the cluster
*** Note down the "yarn.nodemanager.remote-app-log-dir" and
"yarn.nodemanager.remote-app-log-dir-suffix" parameters in your cluster and
set YARN_APP_LOGS_LOCATION in env.sh accordingly
* This requires "gnuplot" on the machine where you plan to run the tool.
* Run "sh gnuplot.sh " (to parse a job that was run by another user, set
"export APP_USER=appUserWhoRanTheJob" before running this)
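As a rough illustration of the env.sh step, here is a minimal sketch of how the two YARN parameters fit together. The paths for PIG_HOME and TEZ_HOME and the helper name aggregated_log_dir are hypothetical and not part of the patch. The directory layout shown is the Hadoop 2.x one, and the two values used are the stock defaults. Whether the scripts in the patch expect the user and suffix inside YARN_APP_LOGS_LOCATION or append them on their own is not stated here, so treat that assignment as an assumption and adjust it to what env.sh expects.

```shell
#!/bin/sh
# Hypothetical env.sh-style settings; every path below is a placeholder.
PIG_HOME=/opt/pig
TEZ_HOME=/opt/tez

# Values noted down from yarn-site.xml (these are the stock Hadoop 2.x defaults).
REMOTE_APP_LOG_DIR=/tmp/logs        # yarn.nodemanager.remote-app-log-dir
REMOTE_APP_LOG_DIR_SUFFIX=logs      # yarn.nodemanager.remote-app-log-dir-suffix

# User who ran the job; falls back to the current user when APP_USER is unset.
APP_USER=${APP_USER:-$(id -un)}

# Hypothetical helper: in Hadoop 2.x, aggregated logs for an application live under
# <remote-app-log-dir>/<user>/<suffix>/<applicationId>.
# This prints the per-user parent directory of those application directories.
aggregated_log_dir() {
    printf '%s/%s/%s\n' "$1" "$2" "$3"
}

# Assumption: YARN_APP_LOGS_LOCATION points at the per-user aggregated log directory.
YARN_APP_LOGS_LOCATION=$(aggregated_log_dir "$REMOTE_APP_LOG_DIR" "$APP_USER" "$REMOTE_APP_LOG_DIR_SUFFIX")

export PIG_HOME TEZ_HOME APP_USER YARN_APP_LOGS_LOCATION
echo "YARN_APP_LOGS_LOCATION=$YARN_APP_LOGS_LOCATION"
```

Listing that directory with "hadoop fs -ls" before running the tool is a quick way to confirm that log aggregation is enabled and that the location is right.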
> Tez tool to analyze shuffle performance in large clusters by mining task logs
> -
>
> Key: TEZ-2021
> URL: https://issues.apache.org/jira/browse/TEZ-2021
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: TEZ-2021.1.patch, TEZ-2021.2.patch,
> avg_time_Taken_after_fix.png, avg_time_taken_to_download.png,
> no_of_times_contacted.png, total_data_transferred.png
>
>
> Tez tool to analyze shuffle performance in large clusters by mining task
> logs. Provide an easier way to visualize (heat chart) and identify bad nodes
> in a large cluster.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)