[jira] [Comment Edited] (TEZ-2021) Tez tool to analyze shuffle performance in large clusters by mining task logs

2015-03-13 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361379#comment-14361379
 ] 

Rajesh Balamohan edited comment on TEZ-2021 at 3/14/15 12:19 AM:
-

[~jeagles] This has been tested on a small cluster with 20 nodes.  It would 
be really helpful if you could try it out and share your comments.

* Apply this patch.
* Build tez-tfile-parser in $TEZ/tez-tools/tez-tfile-parser/
** "mvn clean package"
* Populate env.sh in $TEZ/tez-tools/perf-analyzer/shuffle/
** PIG_HOME, TEZ_HOME
** YARN_APP_LOGS_LOCATION
*** Make sure "yarn.log-aggregation-enable" is set to true in the cluster
*** Note down the "yarn.nodemanager.remote-app-log-dir" and 
"yarn.nodemanager.remote-app-log-dir-suffix" parameters in your cluster and 
set YARN_APP_LOGS_LOCATION in env.sh accordingly
* This requires "gnuplot" on the machine where you plan to run the tool. 
* Run "sh gnuplot.sh " (to parse a job that was run by another user, set 
"export APP_USER=appUserWhoRanTheJob" before running this)
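
As a rough illustration of how the two YARN parameters above relate to a log location, here is a minimal Python sketch. It assumes the common Hadoop 2.x aggregated-log layout of remote-app-log-dir/user/suffix (defaults "/tmp/logs" and "logs"). The helper name aggregated_log_dir is hypothetical and not part of the patch, and the exact value that env.sh expects in YARN_APP_LOGS_LOCATION should be checked against the script itself.

```python
import posixpath


def aggregated_log_dir(remote_app_log_dir, app_user, suffix, app_id=None):
    """Join the pieces of an aggregated-log path (hypothetical helper).

    Assumes the layout <remote-app-log-dir>/<user>/<suffix>[/<appId>];
    verify this against your Hadoop version before relying on it.
    """
    parts = [remote_app_log_dir, app_user.strip("/"), suffix.strip("/")]
    if app_id:
        parts.append(app_id)
    return posixpath.join(*parts)


# Example with the Hadoop 2.x default values for the two parameters.
print(aggregated_log_dir("/tmp/logs", "appUserWhoRanTheJob", "logs"))
```

The printed path is only a candidate value; listing it with "hadoop fs -ls" is a quick way to confirm that aggregated logs are actually stored there.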



> Tez tool to analyze shuffle performance in large clusters by mining task logs
> -
>
> Key: TEZ-2021
> URL: https://issues.apache.org/jira/browse/TEZ-2021
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Attachments: TEZ-2021.1.patch, TEZ-2021.2.patch, 
> avg_time_Taken_after_fix.png, avg_time_taken_to_download.png, 
> no_of_times_contacted.png, total_data_transferred.png
>
>
> Tez tool to analyze shuffle performance in large clusters by mining task 
> logs. Provide an easier way to visualize (heat chart) and identify bad nodes 
> in a large cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TEZ-2021) Tez tool to analyze shuffle performance in large clusters by mining task logs

2015-02-02 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301205#comment-14301205
 ] 

Rajesh Balamohan edited comment on TEZ-2021 at 2/2/15 12:27 PM:


At a high level:
- Parse Tez task logs using tez-tfile-parser in tez-tools
- Compute stats to identify slow nodes in the cluster and to understand 
shuffle performance
-- How much data was transferred between two nodes in the job
-- Time taken for shuffle (from source to destination machines)
-- The data transfer rate (min/max/avg) between two machines
- Save the parsed data in CSV format, so that it can be imported into any tool.
- Provide utility scripts to generate heat charts.
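
The stats step above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the tool's actual implementation: the input tuples (source host, destination host, bytes, milliseconds), the column names, and the helpers summarize and to_csv are all hypothetical, and the average rate here is the simple mean of per-fetch rates.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names for the summary CSV.
FIELDS = ["src", "dst", "fetches", "total_bytes", "total_ms",
          "min_rate", "max_rate", "avg_rate"]


def summarize(fetches):
    """Aggregate (src, dst, bytes, millis) records per node pair.

    Rates are in bytes per second; zero-duration fetches are skipped
    when computing rates to avoid division by zero.
    """
    groups = defaultdict(list)
    for src, dst, nbytes, millis in fetches:
        groups[(src, dst)].append((nbytes, millis))

    rows = []
    for (src, dst), items in sorted(groups.items()):
        rates = [b / (ms / 1000.0) for b, ms in items if ms > 0]
        rows.append({
            "src": src,
            "dst": dst,
            "fetches": len(items),
            "total_bytes": sum(b for b, _ in items),
            "total_ms": sum(ms for _, ms in items),
            "min_rate": min(rates) if rates else 0.0,
            "max_rate": max(rates) if rates else 0.0,
            "avg_rate": sum(rates) / len(rates) if rates else 0.0,
        })
    return rows


def to_csv(rows):
    """Render the summary rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample records, for illustration only.
sample = [("n1", "n2", 1000, 1000), ("n1", "n2", 3000, 1000),
          ("n2", "n1", 500, 500)]
print(to_csv(summarize(sample)))
```

A per-pair table like this maps directly onto a heat chart, with source nodes on one axis, destination nodes on the other, and one of the rate columns as the cell value.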



> Tez tool to analyze shuffle performance in large clusters by mining task logs
> -
>
> Key: TEZ-2021
> URL: https://issues.apache.org/jira/browse/TEZ-2021
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
>
> Tez tool to analyze shuffle performance in large clusters by mining task 
> logs. Provide an easier way to visualize (heat chart) and identify bad nodes 
> in a large cluster.


